Strategies to automatically derive a process model from a configurable process model based on event data

Abstract: Configurable process models are frequently used to represent business workflows and other discrete event systems across different branches of large organizations: they unify the commonalities shared by all branches while, at the same time, describing their differences. The configuration of such models is usually done manually, which is challenging. On the one hand, as the number of configurable nodes in the configurable process model grows, the size of the search space increases exponentially. On the other hand, the person performing the configuration may lack the holistic perspective needed to make the right choice for all configurable nodes at once, since choices influence each other. Nowadays, information systems that support the execution of business processes create event data reflecting how processes are performed. In this article, we propose three strategies (based on exhaustive search, genetic algorithms and a greedy heuristic) that use event data to automatically derive, from a configurable process model, a process model that better represents the characteristics of the process in a specific branch. These strategies have been implemented in our proposed framework and tested both on business-like event logs recorded by a higher-education enterprise resource planning system and on a real case scenario involving a set of Dutch municipalities.


Introduction
Business process models and other discrete event systems are widely used for analysis, optimization, monitoring and even auditing, as they describe the operations of an organization [1]. Often, variants of the same process occur in large organizations as a result of legal restrictions, cultural conditions, business strategies and economic issues, among others. For example, banks commonly have branches in different locations where they use similar processes that might slightly differ in order to adapt to local conditions. Similarly, municipalities provide the same products and services, but the processes in the back-office differ significantly. The challenge for these organizations is to balance standardization and a certain level of flexibility in their business processes. A process model that describes both the commonalities shared by all process variants and their differences is called a configurable process model. Extensions of process modeling languages have been developed in order to represent configurable process models, such as Configurable Yet Another Workflow Language (C-YAWL) [2], Configurable Business Process Execution Language (C-BPEL) [3], Configurable Event-driven Process Chain (C-EPC) [4], Configurable Process Tree (C-PT) [5] and Configurable Business Process Model and Notation (C-BPMN) [6].
The importance of having a configurable process model to describe process variants has been a subject of study in the literature. In [7], twenty Business Process Management (BPM) use cases are identified. Three of those use cases involve configurable process models: design configurable model (DesCM), merge models into configurable model (MerCM) and configure configurable model (ConCM) (see Figure 1). In particular, the use case ConCM consists of deriving a process model from a configurable process model so as to represent a particular process variant. As shown in Figure 1, the original use case does not consider the usage of other sources of information beyond manual configuration. However, in the last few years, a new discipline called process mining has emerged, which studies how to extract and analyze data recorded in information systems about processes' behavior [8]. More specifically, process discovery techniques aim at creating a process model based on the historical behavior recorded in an event log. Even though some process mining techniques have been applied to support the creation and derivation of configurable process models (e.g., [9]), the approach was simplistic and not executable on real-life data.

Figure 1. Business process management (BPM) use cases [7] related to configurable process models: (a) manually design a configurable process model (DesCM); (b) merge a collection of process models to generate a configurable process model (MerCM); and (c) configure a configurable process model (ConCM) to obtain a process model. In this article, we propose (d) as a new use case: derive a process model based on a configurable process model and using event data (ConCME).
While analyzing the literature about configurable process models, we have identified two main challenges. As mentioned before, two separate approaches can be recognized: on the one hand, automated process discovery and, on the other hand, manual configuration of configurable process models. At the same time, there is a growing availability of event data in organizations, which also puts pressure on having specialized, smart and efficient techniques that include historical data when configuring a configurable process model. Therefore, the first and main challenge is to develop an automatic method, applicable to real-life data, to derive a process model from a configurable model that better represents the observed behavior of a process variant (e.g., the process execution in a target branch) as stored in a given event log. It is also important that configuration decisions can be defined in a local context. Therefore, a second challenge is to create a simple representation for the different degrees of freedom we want to allow in the configurable nodes [10].
The goal of automatically deriving a process model from a configurable process model that better represents the behavior observed in an event log, instead of discovering a process model based only on the behavior observed in the event log, is useful when an organization has already defined a configurable process model in order to specify a certain degree of standardization and flexibility among the different branches where the process is being performed. If a new branch is added, it is desirable to obtain a derived process model that is as similar as possible to the current way of executing the process at that branch, as represented by the event log. If, instead, we use a process discovery technique on the event log, we obtain a process model that describes how the process is currently executed at the branch, but that may not necessarily be compliant with the desired standardization.
This article presents an approach with a three-fold objective. First, we propose an additional use case to those presented by [7], where we combine a configurable process model (manually or automatically generated [9]) and an event log to derive a process model that better represents the observed behavior in the historical data, depicted in Figure 1 as configure configurable process model using event data (ConCME). Second, to support this use case extension, we redefine the configurable process model representation, in particular how to represent configurable process trees [10], so as to generalize and simplify the description of the configuration process. Third, we propose a derivation framework that receives as input a configurable process model and an event log; as depicted in Figure 2, both inputs are part of the extended configurable process model use case. The framework incorporates three derivation strategies: an exhaustive method, used as a reference approach, which finds an optimal configuration in a wide search space; a genetic evolutionary method, designed as a smart technique that evolves until it finds a good configuration; and a greedy method, designed as a heuristic to find a satisfying configuration in less computing time. The configuration obtained by any of these three strategies is then applied to the configurable process model in order to derive a process model. Additionally, we have tested the feasibility and applicability of the framework using two different sets of experiments: an educational process and a real-life municipality scenario. This article is organized as follows: related work is described in Section 2, and Section 3 introduces the theoretical foundation of the proposed framework. In Section 4, we present the methodology, describing three strategies (exhaustive, genetic and greedy) for finding the best configuration in order to derive, from a configurable process tree, a process tree that better represents the observed behavior in an event log. Results and discussions are presented in Section 5. Finally, conclusions and future work are presented in Section 6.

Related Work
The majority of research in the area of configurable process models has addressed the issue of describing a configurable process model, or the issue of obtaining such a configurable process model [10-15]. Manual (i.e., by the user) process model configuration has been addressed by [10,16].
In [16], for example, a questionnaire-driven approach for configuring a reference model is taken, guiding the user in defining a configuration. The work by Schunselaar et al. [10] uses a configurable tree-like representation, which is sound by construction. Applied in the Configurable Services for Local Governments (CoSeLoG) project [17], this approach merges variants of different municipalities to create a configurable process model. The same author underlines in [5] the difficulty of creating a configuration, since the user needs a high level of abstraction over the process; hence, a meta-model is used to automatically construct an abstraction that helps the end user to apply configurations. This work has also been extended to consider several qualitative process aspects, such as performance, cost and satisfaction indicators. The results are then presented to the end user, who inspects the proposed configurations and selects one to be applied.
A configurable process model allows a reference model to be adapted to different business scenarios where the process is being executed (e.g., different branches of a large organization or different companies belonging to a corporate group). A complementary approach aims at adapting a general process model to the different modeling views that are needed for the integrated modeling of business processes and information systems [18,19].
However, event data are rarely used to configure a configurable process model. One of the few approaches doing so is the Evolutionary Tree Miner (ETM) [20], which is able to discover a process model including configurations when given multiple event logs. However, the performance of the ETM is not optimal: because it is an evolutionary algorithm and the configuration aspect adds many possibilities, the combined task of discovering a process model and a configuration becomes computationally demanding.
The main limitation of existing approaches is the restricted number of configurations a configurable process model can have. The approach proposed in this paper aims to enhance the work already done on configurable process models by allowing a large number of configurations and by providing a framework that combines a configurable process model and an event log to derive a process model that better represents the observed behavior in the event log. Similar to existing approaches, we implemented our techniques in the ProM process mining framework.

Theoretical Foundation of the Framework for Deriving a Process Model from a Configurable Process Model
In this section, we introduce the theoretical foundations of the proposed framework, namely event logs, process trees and configurable process trees, as well as how to measure the quality of a derived process tree in representing the behavior observed in an event log.

Preliminaries: Set, Multiset, Sequence and Concatenation
A multiset (or bag) is a generalization of the concept of a set, where elements may appear multiple times. For a given set A, B(A) is the set of all multisets over A. For a multiset b ∈ B(A), b(a) denotes the number of times the element a ∈ A appears in b. For example, b1 = [ ], b2 = [x, x, y], b3 = [x, y, z], b4 = [x, x, y, z, z] and b5 = [x^2, y, z^2] are multisets over A = {x, y, z}. b1 is the empty multiset; b2 and b3 both consist of three elements; and b4 = b5, since the order of the elements is irrelevant; the b5 notation is preferred because it is a more compact way of writing the same elements. Note that sets are written using curly brackets, while multisets are written using square brackets.
For a given set A, A* is the set of all finite sequences over A. A finite sequence of length n, ρ = ⟨a_1, a_2, a_3, ..., a_n⟩ ∈ A*, is a mapping {1, ..., n} → A. Its length is denoted by |ρ| = n, and the element at position i (a_i) is denoted as ρ_i. Furthermore, ⟨ ⟩ is the empty sequence. Note that sequences are written using angle brackets. For two sequences ρ1 and ρ2, ρ1 · ρ2 denotes their concatenation. For example, ⟨a, b, c⟩ · ⟨m, n⟩ = ⟨a, b, c, m, n⟩.
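These preliminaries map directly onto Python's standard library; the following sketch mirrors the examples above (Counter plays the role of a multiset, tuples play the role of finite sequences):

```python
# Multisets via collections.Counter; sequences via tuples.
from collections import Counter

# b5 = [x^2, y, z^2]: the element x appears twice, y once, z twice.
b5 = Counter({"x": 2, "y": 1, "z": 2})
assert b5["x"] == 2            # b(a) = number of occurrences of a in b

# b4 = [x, x, y, z, z] lists the same elements; since order is
# irrelevant, b4 = b5 even though the written forms differ.
b4 = Counter(["x", "x", "y", "z", "z"])
assert b4 == b5

# Concatenation of sequences: <a, b, c> . <m, n> = <a, b, c, m, n>.
rho1, rho2 = ("a", "b", "c"), ("m", "n")
assert rho1 + rho2 == ("a", "b", "c", "m", "n")
assert len(rho1 + rho2) == 5   # |rho1 . rho2| = 5
```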

Event Log
Information systems record event data in the form of event logs, which register events related to the execution of processes within an organization. Each event is identified as part of a trace (a process instance) that is executed for a given process.

Definition 1 (Trace, event log). Let A be a set of activities over a universe of activities. A trace σ ∈ A* is a sequence of activities. L ∈ B(A*) is an event log, i.e., a multiset of traces.
For instance, ⟨a, b, c, e, g⟩ is a trace that belongs to the event log L1 = [⟨a, b, c, e, g⟩^3, ⟨a, c, b, e, g⟩^4, ⟨a, d, f, g⟩^2].
Definition 2 (Projection). Let A be a set and A′ ⊆ A one of its subsets. σ↾A′ denotes the projection of σ ∈ A* on A′, e.g., ⟨a, a, b, c⟩↾{a,c} = ⟨a, a, c⟩. The projection can also be applied to multisets, e.g., [x^3, y, z^2]↾{x,y} = [x^3, y].
Projection can be used to obtain a sub-log of an event log. For instance, L1↾{a,e,g} = [⟨a, e, g⟩^7, ⟨a, g⟩^2].
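A minimal sketch of trace and log projection, representing the event log as a Counter of trace tuples; the function names are ours, not the framework's:

```python
from collections import Counter

def project_trace(sigma, allowed):
    """Projection of a trace on a subset of activities."""
    return tuple(a for a in sigma if a in allowed)

def project_log(log, allowed):
    """Projection of an event log (multiset of traces) on a subset of
    activities, yielding a sub-log; multiplicities are preserved."""
    sub = Counter()
    for trace, mult in log.items():
        sub[project_trace(trace, allowed)] += mult
    return sub

# L1 = [<a,b,c,e,g>^3, <a,c,b,e,g>^4, <a,d,f,g>^2]
L1 = Counter({("a", "b", "c", "e", "g"): 3,
              ("a", "c", "b", "e", "g"): 4,
              ("a", "d", "f", "g"): 2})
# Projection on {a, e, g} gives [<a,e,g>^7, <a,g>^2].
print(project_log(L1, {"a", "e", "g"}))
```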

Process Tree
Playing an important role in organizations, a process model can be used to represent a workflow task execution in a certain process [8]. The use of Petri nets as a modeling notation is common in both the discrete event systems and process mining literature [8]. In our framework, we use the different, but still related, process tree notation to represent a process, similar to other approaches in the configurable process models literature [20]. A process tree [20,21] is a tree-structured process model, where the leaf nodes represent activities and the non-leaf nodes represent control-flow operators, e.g., sequence (→), exclusive choice (×), inclusive choice (∨), parallelism (∧) and loop (⟲). A silent activity is denoted by τ and cannot be observed; it is used to model processes where an activity can be skipped under specific circumstances. The process tree notation ensures soundness, and it is used by a wide range of process mining techniques, such as ETM [22], Inductive Miner (IM) [23] and Inductive Miner-Infrequent (IMI) [24]. Its formal definition is as follows:

Definition 3 (Process tree). Let A be a finite set of activities, with τ ∈ A representing a silent activity. ⊕ = {→, ×, ∨, ∧, ⟲} is the set of process tree operators. A process tree is recursively defined as follows [8]: (i) an activity a ∈ A is a process tree; and (ii) if Q1, ..., Qn (n ≥ 1) are process trees and ⊗ ∈ ⊕ is an operator, then ⊗(Q1, ..., Qn) is a process tree. The nodes of a process tree, both operator and activity nodes, are denoted as N(Q).
Notice that the Petri net and process tree modeling notations are closely related and that conclusions obtained in one model can be easily extrapolated to the other. Moreover, a mapping between process trees and Petri net-based workflow nets is described in [8], and it can be easily adapted for other representations such as BPMN, YAWL and EPCs, among others [8]. However, process trees also preserve properties that are interesting for analysis and verification, such as soundness by construction [8] and block-structure [8]. An example of a process tree model Q1 that contains seven activities and five operators is shown in Figure 3. This process tree contains a sequence operator (→) as a root node, i.e., its branches will be executed from left to right. Hence, the first activity to be executed will be a. Then, there is a loop operator (⟲). Its leftmost branch represents the "do" part of the loop; it will be executed at least once, and the loop execution always starts and ends with it. In this case, there is an exclusive choice operator (×) in the leftmost branch, indicating that either the activity b or the parallel (∧) activities c and d will be executed. The rightmost branch of the loop operator (⟲) represents the "redo" part of the loop, which in this case contains only the activity e. The process ends with an exclusive choice operator (×) that indicates that the final activity will be f or g.
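As an illustration, the example tree Q1 of Figure 3 can be encoded with a minimal process tree data structure; the node classes and the operator names ("seq", "xor", "and", "loop") are our own sketch, not the framework's implementation:

```python
# A minimal process tree data structure, sketching Definition 3.
from dataclasses import dataclass, field
from typing import List, Union

@dataclass
class Activity:
    label: str                     # leaf node: an activity (or "tau")

@dataclass
class Operator:
    op: str                        # one of "seq", "xor", "or", "and", "loop"
    children: List["Node"] = field(default_factory=list)

Node = Union[Activity, Operator]

def nodes(q: Node):
    """N(Q): all operator and activity nodes of the tree."""
    yield q
    if isinstance(q, Operator):
        for child in q.children:
            yield from nodes(child)

# Q1 of Figure 3: seq(a, loop(xor(b, and(c, d)), e), xor(f, g)).
Q1 = Operator("seq", [
    Activity("a"),
    Operator("loop", [
        Operator("xor", [Activity("b"),
                         Operator("and", [Activity("c"), Activity("d")])]),
        Activity("e")]),
    Operator("xor", [Activity("f"), Activity("g")])])

# Seven activities plus five operators: twelve nodes in total.
print(sum(1 for _ in nodes(Q1)))   # 12
```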

Configurable Process Tree
Representing configurable process models as process trees has been addressed by [9,10]. In our proposed framework, we have slightly redefined the configurable process trees described there, in order to allow greater flexibility and expressiveness in the configurable nodes. A configurable process tree (CPT) represents a family of process models [25]. All processes of a family share the same tree topology, while differences are handled using configurable nodes. Applying a particular configuration to these configurable nodes produces a variant of the configurable process model, a derived process model. A configurable node can be set to enable (or allow), hide or block. Enable (allow) means the node can be visited; hide means the node is to be skipped over; and block means the node cannot be reached. The foundation of configurable process trees is described in [10] and also applied in [9].

Definition 4 (Process tree configurators).
{H, B, E} is the set of process tree configurators, and {{H}, {B}, {E}, {H, B}, {H, E}, {B, E}, {H, B, E}} is the set of all non-empty subsets of the process tree configurators, where:
• H: hide a node. It makes a node unobservable, replacing it by a τ node.
• B: block a node. It makes the path leading to this node unreachable. When blocking a node, several cases might occur; details can be found in [20].
• E: enable a node. It allows a node to perform either as an operator or an activity, so that it behaves normally.
Definition 5 (Configurable process tree). A configurable process tree Q_α = (Q, α) is comprised of a process tree Q with nodes N(Q) and a partial configuration function α that assigns a non-empty set of configurators (a subset of {H, B, E}) to some nodes. N_α(Q_α) ⊆ N(Q) is the set of configurable nodes, i.e., N_α(Q_α) = domain(α). For the sake of clarity, let us assume an ordering among the configurable nodes, i.e., n_1, n_2, ..., n_{|N_α(Q_α)|}. A configuration c = (c_1, c_2, ..., c_{|N_α(Q_α)|}) assigns a configurator c_i ∈ α(n_i) to every configurable node n_i; C(Q_α) = α(n_1) × α(n_2) × ... × α(n_{|N_α(Q_α)|}) denotes the set of all possible configurations. It is possible to apply a configuration to a configurable process tree in order to obtain a process tree, known as a derived process tree, as follows:

Definition 6 (Derived process tree). Let Q_α be a configurable process tree, and let c ∈ C(Q_α) be a possible configuration. derive(Q_α, c) = Q_c is the function that generates a derived process tree Q_c from Q_α by applying the configuration c, using the rules applied in [20].
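A hedged sketch of a derive function under simplified semantics: H replaces a node by τ, B prunes the subtree from its parent (the full blocking rules, which cover further cases, are given in [20]), and E leaves the node unchanged. All names here are ours, not the cited implementation's:

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Union

@dataclass
class Activity:
    label: str

@dataclass
class Operator:
    op: str
    children: List["Node"] = field(default_factory=list)

Node = Union[Activity, Operator]

def derive(q: Node, conf: Dict[int, str]) -> Optional[Node]:
    """conf maps id(node) -> 'H' | 'B' | 'E'. Returns the derived
    (sub)tree, or None if the node is blocked. Blocking is simplified
    here to pruning the subtree."""
    choice = conf.get(id(q), "E")
    if choice == "B":
        return None                  # blocked: prune this branch
    if choice == "H":
        return Activity("tau")       # hidden: unobservable tau node
    if isinstance(q, Activity):
        return Activity(q.label)     # enabled activity, unchanged
    kids = [derive(ch, conf) for ch in q.children]
    return Operator(q.op, [k for k in kids if k is not None])

# Example: hide b, block g in xor(b, g).
b, g = Activity("b"), Activity("g")
tree = Operator("xor", [b, g])
derived = derive(tree, {id(b): "H", id(g): "B"})
print(derived)   # Operator(op='xor', children=[Activity(label='tau')])
```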
Figure 5 presents a running example of the execution of the derivation framework shown in Figure 2. A configurable process tree and an event log are the inputs, depicted in (a) and (b), respectively. The implemented derivation strategies use a common data structure to represent all feasible configurations for all configurable nodes, shown in (c). Each derivation strategy obtains a configuration, shown in (d), which is then used to derive a process tree, the output, shown in (e). Figure 5 also depicts the notation for the process tree configurators and the model activities in (f) and (g), respectively.

Quality of a Derived Process Tree
Several configurations can be applied to a configurable process tree. In order to assess how well a derived process tree represents the behavior observed in an event log, a quality metric must be defined. Quality is usually measured considering a trade-off among the following four quality criteria: fitness, precision, generalization and simplicity [8,22,26]. Conformance checking is a sub-discipline of process mining that compares the behavior allowed by a process model with the behavior recorded in an event log to find commonalities and discrepancies, and also computes metrics for each of the four quality criteria [26].

Definition 7 (Conformance).
Let Q be a process tree, and let L be an event log. Let f, p, g, s be the fitness, precision, generalization and simplicity metrics as defined in [20], each with range [0, 1] and one being the target value, and let W = (w_f, w_p, w_g, w_s) be the weights given to each metric, respectively. Conformance is defined as the weighted average of the four metrics:

conf(Q, L) = (w_f · f + w_p · p + w_g · g + w_s · s) / (w_f + w_p + w_g + w_s).    (1)

In this article, we have defined the conformance metric based on [20] and used the corresponding implementation to evaluate the quality of a given process tree. However, the proposed framework is generic and independent of the conformance metric used.
An optimal configuration is one that allows obtaining a derived process tree that best represents a given event log, i.e., has the best conformance value.

Definition 8 (Optimal configuration). Let Q_α be a configurable process tree; let L be an event log; and let W be the weights of the quality metrics. A configuration c ∈ C(Q_α) is an optimal configuration if and only if there is no other configuration c′ ∈ C(Q_α) such that conf(derive(Q_α, c′), L) > conf(derive(Q_α, c), L).

Our goal is to automatically obtain a configuration to derive a process tree from a configurable process tree that maximizes the conformance function for a given event log. To accomplish this goal, we have implemented three strategies, which are detailed in the next section.
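As an illustration of Definition 7, a small sketch of the conformance computation; the normalization by the sum of the weights is our assumption, and the four metric values would be supplied by an external conformance checker:

```python
# Weighted average of fitness, precision, generalization and
# simplicity, each assumed to lie in [0, 1] with 1 as target value.
def conformance(f, p, g, s, weights=(1.0, 1.0, 1.0, 1.0)):
    wf, wp, wg, ws = weights
    total = wf + wp + wg + ws
    return (wf * f + wp * p + wg * g + ws * s) / total

# A model that is perfect on all four criteria reaches the target 1.
print(conformance(1.0, 1.0, 1.0, 1.0))   # 1.0
# Weighting fitness more strongly shifts the score toward fitness.
print(conformance(0.9, 0.6, 0.8, 1.0, weights=(10, 1, 1, 1)))
```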

Methodology
As part of the proposed framework, we have designed three different derivation strategies that are able to find a suitable configuration in order to derive a process tree. The first strategy is based on an exhaustive approach, which guarantees finding an optimal configuration. The other two strategies are based on heuristics that find a configuration allowing one to derive a reasonably good process tree in less time. Each of these strategies is described hereafter. Notice that the approach proposed in this article is different from the one proposed in [20]. In [20], the input is a collection of event logs, and the output is a configurable process model. In our case, the input is an already existing configurable process model and an event log, and the output is a feasible configuration of the input configurable process model, such that the derived process model obtained with this configuration better represents the input event log.

Obtaining a Configuration Based on the Exhaustive Strategy
The first strategy in the proposed framework is the exhaustive strategy, whose relevance is that it ensures obtaining the best configuration among all possible ones. In general, an exhaustive strategy is a brute-force method that searches for a solution among all possible ones, e.g., those obtained from combinatorial objects such as permutations or combinations. In this case, for a configurable process tree Q_α, we analyze all possible configurations that belong to C(Q_α). Algorithm 1 presents the exhaustive strategy, which can be described as follows:
• Generate the set of all possible configurations, C(Q_α), in a systematic manner, using the Cartesian product of the configuration sets of all configurable nodes.
• Loop over all configurations. At each iteration, the function derive(Q_α, c) is used to obtain a derived model m from the CPT Q_α, given the configuration c.
• The model m is evaluated using the conformance function defined in Equation (1).
• All potential configurations are evaluated, keeping track of the best solution found.
• After all configurations have been processed, the algorithm returns the configuration with the highest conformance.
Figure 6 presents an illustrative example of the exhaustive strategy. Given the CPT Q_α1 and the log L1, shown in (a) and (b), the framework finds the best configuration and the corresponding derived process tree. The CPT Q_α1, shown in (a), has 4 configurable nodes, where the activity node b can be either H or E, the operator node ∧ can be either B or E, the activity node f can be either H or E and the activity node g can be either H or E. The set of configurations C(Q_α) contains 2^4 = 16 different configurations. The framework checks every configuration c ∈ C(Q_α), shown in (c). Among all feasible configurations, the algorithm selects c15 as the best configuration, shown in (d). Once the best configuration is obtained, the framework applies it to the CPT Q_α1 to derive the process tree Q15, shown in (e).

Figure 6. Overview of the exhaustive strategy to derive a process tree. The algorithm generates all possible configurations in order to find an optimal one, which is then used to obtain an optimal derived process tree.
Algorithm 1 Obtain a configuration based on the exhaustive strategy.
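The enumeration at the core of the exhaustive strategy can be sketched with itertools.product; here derive_and_score stands in for deriving the tree and evaluating Equation (1) against the event log, and the toy scorer at the bottom is purely illustrative:

```python
from itertools import product

def exhaustive(config_sets, derive_and_score):
    """config_sets: list of allowed configurators per configurable
    node, e.g. [['H', 'E'], ['B', 'E'], ...]. Enumerates the Cartesian
    product C(Q_alpha) and keeps the best-scoring configuration."""
    best_conf, best_score = None, float("-inf")
    for c in product(*config_sets):          # all |C(Q_alpha)| options
        score = derive_and_score(c)
        if score > best_score:
            best_conf, best_score = c, score
    return best_conf, best_score

# Toy example with 4 binary configurable nodes: 2^4 = 16 candidates.
sets = [["H", "E"], ["B", "E"], ["H", "E"], ["H", "E"]]
# Hypothetical scorer that simply rewards enabling nodes.
score = lambda c: sum(x == "E" for x in c) / len(c)
print(exhaustive(sets, score))   # (('E', 'E', 'E', 'E'), 1.0)
```

The search space grows exponentially with the number of configurable nodes, which is why the two heuristic strategies below exist.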

Obtaining a Configuration Based on the Genetic Strategy
The exhaustive strategy requires a long time to find an optimal solution. Motivated by finding a solution in less computing time, we have designed a second strategy, based on a genetic evolutionary approach, to find a reasonably good configuration. Genetic algorithms (GA) are search algorithms that imitate the process of natural selection, belonging to the class of evolutionary algorithms [27]. They have been successfully applied in the context of process mining for finding a process model that better represents the observed behavior in an event log [28-30]. In this subsection, we present a GA approach to find a suitable configuration for a CPT model given a specific event log. The elements that define a GA are: representation of individuals, initialization, selection, crossover, mutation and termination condition. Next, we present the setting of each of these elements in our configuration scenario. Figure 7 illustrates these main elements.
• Representation: In GA, a chromosome represents a potential solution, and it is formed by a chain of genes. Genes represent distinct aspects of the solution as a whole, just as human genes represent distinct aspects of individual people, such as their sex or eye color. A potential value of a gene is called an allele. In our case, a chromosome represents a configuration c ∈ C(Q_α) (see Figure 7b), where each gene c_i corresponds to a configurable node n_i, and an allele is a configurator (B, H or E) assigned to that particular configurable node, c_i ∈ α(n_i), for all i ∈ 1, ..., |N_α(Q_α)|.
• Initialization: An initial population is generated randomly, where each individual represents a randomly created configuration c using valid alleles, i.e., c_i ∈ α(n_i). Population size is a parameter that determines the number of individuals in the first generation [31].
• Selection: In each generation, the best candidates are selected to move forward to the next generation, and some of them are also selected to be recombined. Each individual is evaluated using the conformance function, which determines whether a chromosome is selected for the next generation or discarded. In GA theory, the function that evaluates the quality of a chromosome is usually called fitness. This fitness function does not correspond to the fitness metric presented in Section 3.5, but to the conformance function; therefore, and for the sake of clarity, in this article we refer to it as the conformance function. We refer to [27] for different selection strategies.
• Crossover: The crossover operation combines two parent chromosomes in order to generate two offspring chromosomes, as shown in Figure 7c. Given two chromosomes a and b and a cutting point, the two offspring are formed by exchanging the gene segments of the parents at the cutting point. Notice that the proposed chromosome representation combined with the defined crossover operation produces only valid solutions, i.e., the offspring chromosomes are always valid configurations, according to the definitions presented in Section 3.4.
• Mutation: A mutation produces a random change in one of the genes of the chromosome. In order not to produce spurious chromosomes, the mutation of a gene is restricted to the valid alleles of that gene, i.e., c_i ∈ α(n_i), where i is the mutated gene.
• Termination conditions: The most common alternatives for a GA to terminate are: an upper limit on the number of generations, an upper limit on the conformance function (1 in our case), when the likelihood of achieving significant improvements in the next generation is very low or when a given number of generations does not yield any improvement [27].
Algorithm 2 describes the basic GA strategy. The main inputs are a configurable process model and an event log. The initialization of chromosomes is done in initialPopulation(); then all chromosomes of the initial population pop are evaluated using the conformance function in bestIndividual(pop), in order to obtain the best individual. The population evolves over several generations until a termination condition is reached. In each generation, the algorithm selects qualified individuals through elitism in selectParents(pop). Later, the function crossover(parents) recombines pairs of parents to create new individuals; in this way, a new population pop' is obtained. Mutation is then applied randomly to this new population, obtaining the new generation pop''. The best individual in the population pop'' is then compared to the best configuration obtained so far. At the end, the algorithm returns the best individual (configuration) obtained using this evolutionary approach. For the sake of generality, Algorithm 2 describes the most generic GA strategy. More sophisticated techniques for each step of the algorithm are also possible (e.g., tournament selection, elitism, among others). Please refer to [20,27] for more details.
Algorithm 2 Obtain a configuration based on the genetic strategy.
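A compact, self-contained sketch of the genetic strategy; the selection, crossover and mutation operators are deliberately simple stand-ins for those of Algorithm 2, and the toy scoring function at the bottom replaces the conformance evaluation:

```python
import random

def genetic(alleles, score, pop_size=20, generations=50, seed=42):
    """alleles: list of valid configurators per configurable node,
    so every chromosome stays a valid configuration by construction."""
    rng = random.Random(seed)
    rand_chrom = lambda: [rng.choice(a) for a in alleles]
    pop = [rand_chrom() for _ in range(pop_size)]
    best = max(pop, key=score)
    for _ in range(generations):
        pop.sort(key=score, reverse=True)
        parents = pop[: pop_size // 2]               # elitist selection
        children = []
        for _ in range(pop_size - len(parents)):
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, len(alleles))     # one-point crossover
            child = a[:cut] + b[cut:]
            i = rng.randrange(len(alleles))          # mutate one gene,
            child[i] = rng.choice(alleles[i])        # valid alleles only
            children.append(child)
        pop = parents + children
        best = max([best] + pop, key=score)
    return best, score(best)

# Toy problem: 4 configurable nodes, scorer rewards enabling nodes.
alleles = [["H", "E"], ["B", "E"], ["H", "E"], ["H", "E"]]
score = lambda c: sum(x == "E" for x in c) / len(c)
print(genetic(alleles, score))
```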

Obtaining a Configuration Based on the Greedy Strategy
The evolutionary strategy described above is able to find a good configuration (potentially an optimal one [27]) in less time than the exhaustive strategy, but it is still time consuming. In order to reduce the time even further, while still finding a reasonably good configuration, we present a third strategy, based on a greedy heuristic. A greedy strategy is a heuristic search that builds a feasible solution incrementally, always making the choice that looks best at the moment of making a local choice. Sometimes, these local choices lead to a globally optimal solution [32]. Depending on the problem and the search space, this strategy may not find an optimal solution; however, for many problems it provides a close-to-optimal (sometimes even optimal) solution in reasonable computing time.
Greedy algorithms usually divide the problem into small sub-problems; each sub-problem is then solved independently and in an incremental fashion. In our case, we can take subtrees from the configurable process tree and process each of them independently, making good local choices in the hope that they result in an optimal solution when all these local configuration choices are applied to the configurable process model.
In a configurable process tree, two (or more) configurable nodes are dependent if one of them is below or above the other in a tree branch, or if they both have a common ancestor that is a loop (⟲) operator. If so, the configuration of one of these nodes might affect the configuration of the other. On the other hand, a configurable node is independent if it is not dependent on any other configurable node.
We can identify three scenarios in a configurable process tree depending on the dependencies among its configurable nodes. The configurable process tree can have only independent configurable nodes, as shown in Figure 8a; it can have only dependent configurable nodes, as shown in Figure 8b; or it can combine both independent and dependent configurable nodes, as shown in Figure 8c. Algorithm 3 describes the proposed greedy strategy, which consists of the following steps:
• The configurable process tree is traversed to obtain a sorted list of all configurable nodes. A hierarchical order is achieved by applying the following rules:
  - Dependent configurable nodes have a higher priority than independent configurable nodes.
  - Among configurable nodes of the same type (dependent or independent), a deeper configurable node has a higher priority.
  - Among configurable nodes at the same level, an operator node has a higher priority than an activity node; otherwise, they are sorted from left to right.
• Every configurable node is then processed according to its priority. For each configurable node, a subtree and a sub-log are obtained to compute the local conformance:
- To obtain a subtree for a configurable node, a new root has to be chosen. If the configurable node is an activity node, the new root is its direct parent, so that a subtree always has an operator as its root; if it is an operator node, the new root is the operator node itself. If the configurable node has an ancestor that is a loop operator, then the new root is the loop operator ancestor closest to the original root. Such a subtree might contain other pending configurable nodes; these are temporarily set to τ in order to postpone any decision about their configuration.
- To obtain a sub-log, we project the event log onto the set of activities contained in the subtree.
• For the selected configurable node, all possible configurators are evaluated, obtaining different derived process subtrees. The local conformance between each of those process subtrees and the event sub-log is computed. The best configurator is saved and then set in the best configuration for the original configurable process tree.
• At the end, the best configuration for the whole configurable process tree is returned.
Figure 9 illustrates the greedy strategy to find a configuration for the configurable process tree Q α 1 . The derivation process is shown on the right part of the figure for two logs, L 1 and L 2 , where, for every configurable node, the best configurator is selected among all feasible configurators. First, the order in which the configurable nodes will be processed is decided. n 1 and n 2 are dependent nodes because they have a common ancestor that is a loop operator. Meanwhile, n 3 and n 4 are independent nodes. Hence, n 1 and n 2 have a higher priority than n 3 and n 4 . Since n 2 is an operator node, it has a higher priority than n 1 . n 3 and n 4 are both activity nodes, so they are prioritized from left (n 3 ) to right (n 4 ). Therefore, the order is n 2 , n 1 , n 3 , n 4 , regardless of the event log that will be considered. For the event log L 1 , the algorithm starts from the deeper node n 2 . n 1 is set to τ, and all possible configurators (B and E) for n 2 are then evaluated, considering the subtree whose root is the loop operator that is an ancestor of n 2 , and the sub-log obtained by projecting the original log L 1 on the activities contained in the subtree: c, d, e. The best configurator for n 2 is E. Later, given the configuration of n 2 , n 1 is analyzed in a similar way and configured to E. Afterwards, n 4 is set to τ, while n 3 is configured to E.
Finally, once n 3 is configured, the last node n 4 is configured to H. The final configuration for L 1 is then obtained and represented as the best configuration. For the event log L 2 , the algorithm proceeds in a similar way. Notice that, since the log L 2 contains fewer activities than the configurable process tree Q α 1 , the projection of the activities contained in the subtrees creates very simple sub-logs, even an empty event log, such as the one obtained when processing the configurable node n 2 . The best configuration in this case blocks the configurable node n 2 and hides the configurable node n 4 , illustrating how the algorithm adapts to different scenarios. As a result, Q 1 is the process tree derived from Q α 1 and L 1 , and Q 2 is the process tree derived from Q α 1 and L 2 .
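The steps above can be condensed into a short sketch. This is an illustration of the greedy loop only, under our own simplified node representation; the `evaluate_local` callback stands in for the real subtree/sub-log conformance computation, and the scores below are hypothetical values chosen to reproduce the Figure 9 outcome for log L 1:

```python
def greedy_configure(conf_nodes, evaluate_local):
    """Greedy strategy sketch: visit configurable nodes in priority order
    and, for each, keep the configurator with the best local conformance
    on the node's subtree and projected sub-log."""
    order = sorted(
        conf_nodes,
        key=lambda n: (
            not n["dependent"],       # dependent nodes first
            -n["depth"],              # deeper nodes first
            n["kind"] != "operator",  # operators before activities
            n["position"],            # otherwise left to right
        ),
    )
    best = {}
    for node in order:
        best[node["id"]] = max(
            node["configurators"],
            key=lambda c: evaluate_local(node, c, best),
        )
    return best

# Hypothetical local-conformance scores mimicking the Figure 9 walkthrough.
nodes = [
    {"id": "n1", "dependent": True,  "depth": 2, "kind": "activity", "position": 1, "configurators": ["H", "E"]},
    {"id": "n2", "dependent": True,  "depth": 3, "kind": "operator", "position": 2, "configurators": ["B", "E"]},
    {"id": "n3", "dependent": False, "depth": 2, "kind": "activity", "position": 3, "configurators": ["H", "E"]},
    {"id": "n4", "dependent": False, "depth": 2, "kind": "activity", "position": 4, "configurators": ["H", "E"]},
]
scores = {("n1", "H"): 0.3, ("n1", "E"): 0.8,
          ("n2", "B"): 0.2, ("n2", "E"): 0.9,
          ("n3", "H"): 0.1, ("n3", "E"): 0.7,
          ("n4", "H"): 0.6, ("n4", "E"): 0.4}
best = greedy_configure(nodes, lambda node, c, best: scores[(node["id"], c)])
assert best == {"n1": "E", "n2": "E", "n3": "E", "n4": "H"}
```

Note how the sort key encodes the three priority rules, so n 2 is processed before n 1 , and n 3 before n 4 .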

Result and Discussion
The proposed strategies have been implemented and evaluated in two different scenarios. In this section, we first describe how the framework has been implemented (see Section 5.1). Then, we describe the two scenarios considered: a realistic scenario based on the adoption of a higher educational enterprise resource planning (ERP) system by some universities (see Section 5.2) and a scenario that represents a real-life registration process [33], which is executed on a daily basis by Dutch municipalities (see Section 5.3). Afterwards, in Section 5.4, we analyze the performance of the proposed strategies on a set of controlled experiments created to test their performance depending on both the log complexity and the model complexity.

Implementation
We have implemented the three strategies, exhaustive, genetic and greedy, as three plug-ins of the ProM process mining framework (available in the ProM nightly builds, http://www.promtools.org/prom6/nightly), within the Configurable Processes package. The genetic evolutionary strategy is implemented using the Genetic Algorithms and Genetic Programming package written in Java (JGAP); this flexible package fits our genetic evolutionary approach. All experiments have been performed on a laptop with an Intel Core i5 CPU at 2.7 GHz and 8 GB RAM, running OS X El Capitan 64 bits.
Algorithm 3: Obtaining a configuration based on the greedy strategy.

Educational Scenario
Educational institutions such as universities are spread across the cities of a country with the purpose of granting academic degrees in various subjects. When the owner of a network of universities decides to standardize the higher educational ERP system to be used by all the universities belonging to the network, the first step is to determine how suitable the software is for each university.
The ERP system provides support for different processes, and it can be configured to suit how each process is executed in the university where it will be installed. The diversity of process variants the ERP supports can be represented through a configurable process model. The decisions that are made to adapt the software to the way the process is executed in each university correspond to the configuration that allows one to generate a derived process model specific to each university. The derived process model will probably not be exactly the same as the process model that represents how the process is currently executed, but it will be the closest process variant the software is able to support.
If a university is currently running the process using other software, we can assume that an event log is recording how each process is being executed.
The framework proposed in this article takes as input the configurable process model that represents the different parameterizations provided by the ERP system for the process and the event log that represents the current execution of the process in a university. Based on them, the framework generates a derived process model that represents how the process could be run in the future using the ERP system.
Beyond their organizational structure, universities share some common processes, such as planning academic courses. The planning of academic courses in a university is a complex process due to several factors, such as government regulations, internal policies, dynamic changes of knowledge in certain domains, and economic and human resources, among others. In this process, both administrative personnel and faculty members, sometimes from different departments, are involved in each stage. This process generally comprises similar stages across universities, such as course planning, student assistant planning and thesis planning, among others. To deal with its complexity, it can be modeled at a high level. Figure 10 depicts a general process model that includes sub-processes grouping common activities in the academic planning process. We have used this process model as a reference process model for three different universities located across a country. For the sake of simplicity, we call them Northern University, Central University and Southern University, according to their location in the country. Each university executes a different process variant of the reference model according to its policies, regulations, socio-demographic characteristics and geographic needs. For example, the Northern University is a low-income university, so some courses do not have student assistants, whereas in the other two high-income universities, some courses have more than one student assistant. The Southern University does not have a proper internal information system, so selecting student assistants is a manual task.
The configurable process tree contains 70 activities, grouped in the eight sub-processes shown in Figure 10, and nine configurable nodes that have mixed dependencies. The configurable process model is considerably large, making it impractical to show it completely. To illustrate how the configuration is performed, Figure 11 shows a scheme of the configuration process. The upper boxes represent configurable process trees for three different sub-processes of the process reference model. The lower process trees, inside the dashed boxes, are derived process trees where the corresponding configurable nodes have been configured as hide. Having a reference process model, whose overall view is shown in Figure 10, and an academic planning event log per university, it was possible to derive a process model for each university. We applied the proposed exhaustive, genetic and greedy strategies to the three cases. The results are shown in Table 1, including fitness, precision and the conformance metric, calculated as 90% fitness plus 10% precision.
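The conformance metric used throughout the experiments is a simple weighted sum of fitness and precision. A one-line sketch, using the weights stated above:

```python
def conformance(fitness, precision, w_fitness=0.9, w_precision=0.1):
    """Weighted conformance metric used to rank candidate configurations:
    90% replay fitness, 10% precision (weights as stated in the text)."""
    assert abs(w_fitness + w_precision - 1.0) < 1e-9  # weights must sum to 1
    return w_fitness * fitness + w_precision * precision
```

With these weights, a model with perfect fitness but imperfect precision still scores high, which matches the article's emphasis on replay fitness over precision.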
Table 1 shows that both Northern University and Central University have a high conformance and also a high fitness. We can assert that both universities have a high percentage of commonality with the configurable process model. However, that is not the case for Southern University, which has a lower conformance and a lower fitness. One of the reasons for this is the considerable number of activities executed at Southern University that are not found in the reference model. A decision maker would probably conclude that the ERP system under consideration is not suitable for this university. Table 1 also shows that all three strategies obtained process trees with the same fitness. However, the greedy strategy obtained a process tree with a lower precision. Taking the Northern University as an example, a qualitative comparison between the process trees obtained by the exhaustive and greedy strategies suggests that the exhaustive strategy finds a configuration of the model where the activity assign classroom automatically (ACA) is always performed (as observed in the event log), while the greedy strategy finds a configuration where the activity ACA can either be performed or skipped (the latter never observed in the event log), obtaining a lower precision.
The performance of the three strategies is depicted in Figure 12. It is possible to observe that the time required by the exhaustive strategy is considerably higher than the time required by the genetic and greedy strategies. Moreover, the greedy approach was able to obtain a derived process model in seconds, whereas the other two algorithms required minutes and even hours. The time required in the case of the Southern University is higher due to the alignment technique used to obtain the conformance metric. As mentioned before, at the Southern University, many activities are being executed that are not found in the reference model; in those cases, the alignment technique takes longer [34]. In conclusion, the greedy strategy is able to obtain, in a short time, a derived process model whose conformance is quite good in comparison to the conformance obtained with the other two strategies.

Real-Life Experiments
In order to evaluate the performance of the techniques on real-life data, we apply all three strategies to a real-life dataset. The data contain events from the building permit process (WABO) of five Dutch municipalities [17,35]. There are five event logs, each describing a different process variant, which were extracted from the IT systems of the corresponding municipalities. The same data have been used to evaluate the performance of configurable process discovery with the Evolutionary Tree Miner (ETM) [20] (pp. 250-251). In this experiment, we create some CPTs based on two of the models discovered by the ETM on this dataset. We then allow each of the three strategies to configure the CPTs to come to the best solution they can achieve, and compare these results with the results obtained by the ETM on the same dataset.
We use CPTs based on the models discovered by the ETM on this dataset, which are shown in Figure 13. Figure 14 shows the two CPTs created based on the model shown in Figure 13a. The CPT in Figure 14a has two configurable nodes, while the CPT in Figure 14b has eight configurable nodes, all randomly chosen. Notice that we could not create a CPT equivalent to the model in Figure 13a because we are not considering the downgrade operator used in [20]. Figure 15 shows the three CPTs created based on the model shown in Figure 13b. The CPT in Figure 15a is equivalent to the one shown in Figure 13b. The CPT in Figure 15b has six configurable nodes, while the model in Figure 15c has twelve configurable nodes, all randomly chosen. All configurable nodes can be set to enable (or allow), hide or block.
The results of the experiments with the CPTs created based on the model discovered by ETM Approach 3 are shown in Tables 2 and 3. Tables 4-6 show the results of the experiments with the CPTs created based on the model discovered by ETM Approach 4. All tables include fitness, precision and the conformance metric, calculated as 90% fitness plus 10% precision. It can be observed that, in all cases, the different strategies obtain good results. Since the exhaustive strategy always obtains the best possible result, we can highlight that the genetic strategy always matches that best result and that the greedy strategy in general obtains very good results, in many cases matching those obtained with the exhaustive strategy.
When comparing the results with those obtained by the ETM, we can point out that in the only case where the CPT is equivalent to the one used by the ETM (Table 4, corresponding to CPT 4-A, shown in Figure 15a), the results obtained with the ETM and the results obtained with the exhaustive and genetic strategies are the same. On the other hand, the greedy strategy obtained the best results in four of the five variants, the exception being Variant 4.
When the CPTs have greater flexibility than the CPT used by the ETM (CPTs based on ETM Approach 4, CPT 4-B and CPT 4-C, shown in Figure 15b,c, respectively), the search space is larger, so there could eventually be a better configuration. This actually occurs in Variant 1 in Table 5 and in Variants 1 and 3 in Table 6. Moreover, in these highlighted cases, all three strategies are able to find better configurations. However, in general, the same results were obtained as those obtained by the ETM, as shown for the other variants in Tables 5 and 6.
When the CPTs have a different flexibility than the CPT used by the ETM (CPTs based on ETM Approach 3, CPT 3-A and CPT 3-B, shown in Figure 14a,b, respectively), the search spaces are not directly comparable. Therefore, in some cases, the strategies obtain better results; in other cases, the ETM obtains better results; and in others, the results are equivalent, as shown in Tables 2 and 3.
In this experimental setting, the CPTs are small and the event logs do not contain many variants; therefore, when the CPTs do not contain many configurable nodes, the experiments run very fast for all strategies. However, due to the exponential nature of the search space, when the number of configurable nodes grows, the exhaustive strategy may take a long time, as in CPT 4-C, for which the exhaustive strategy took between 37 min (for Variant 3) and almost three hours (for Variant 5).

Algorithms' Performance Based on an Empirical Evaluation
To validate the performance of the proposed strategies, a set of controlled experiments was created based on the original reference process model for the universities. The experiments focused on testing the performance of the strategies depending on both the log complexity and the model complexity.

Performance According to Log Complexity
The event log complexity depends mainly on the number of trace variants and on how many times each trace variant is repeated in the event log. A trace variant is a particular sequence of activities that can occur multiple times in the event log. This set of experiments evaluates how the algorithms perform under different event log settings: varying the number of trace variants and varying the number of repetitions of each trace variant in the event log. Notice that alignments are computed once for each trace variant and reused for all occurrences of that variant. Therefore, no significant impact is to be expected when varying the number of repetitions of each trace variant.
The configurable process tree contains nine configurable nodes that have mixed dependencies. There are eight configurable nodes with {H, E} as configurators and one configurable node with {H, B, E} as configurators. The set of feasible configurations allowed by the configurable process model can be generated as the Cartesian product of these sets, yielding 2^8 · 3^1 = 768 feasible configurations. Three different target models (universities A, B and C) derived from the original configurable process model were used for experimentation; the number of activities for the models corresponding to universities A, B and C is 66, 65 and 55, respectively. Based on these target models, different event logs were simulated.
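The size of this search space follows directly from the Cartesian product of the per-node configurator sets. A quick sketch of the enumeration the exhaustive strategy iterates over, with configurator labels H/B/E as in the text:

```python
from itertools import product

# Configurator sets from the experiment: eight configurable nodes with
# {hide, enable} and one configurable node with {hide, block, enable}.
options = [["H", "E"]] * 8 + [["H", "B", "E"]]

# Each element is one feasible configuration: a tuple with one
# configurator per configurable node.
configurations = list(product(*options))
assert len(configurations) == 2**8 * 3  # 768 feasible configurations
```

The exhaustive strategy evaluates the conformance of every such tuple, which is why its running time grows exponentially with the number of configurable nodes.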
Table 7 summarizes the experiments. In the first set of experiments, the number of trace variants is varied. In this case, a random number of trace repetitions between 10 and 20 is considered for each trace variant. The final number of traces is therefore different for each university. In the second set of experiments, the number of trace variants is fixed to 100, and then the number of trace repetitions is varied. A synthetic event log was created for each experiment; thus, 21 event logs were tested with each algorithm. Table 8 displays the results obtained with each algorithm when varying the log complexity.
Exhaustive strategy evaluation: As a brute-force method, this strategy searches over all possible configurations that can be applied to the configurable process model. As seen in Figure 16a, computing time increases linearly with the number of trace variants. For event logs with 500 trace variants, the algorithm takes several hours; in the worst-case scenario, university C, it takes more than 40 h to obtain the optimal derived process model. Figure 16b shows how the algorithm performs if we fix the number of trace variants to 100 and vary the number of trace repetitions. In this case, the computing time is not proportional to the log size as it is when we vary the number of variants; in fact, the computing time is almost constant. This is a feature of the conformance method applied in our implementation to compute the conformance metric. It uses a cache table: if a trace variant has already been evaluated, it is not evaluated again. Therefore, we can observe that the computing time is only proportional to the number of trace variants in the event log, regardless of the number of trace repetitions.
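This cache-table behavior can be mimicked with memoization per trace variant. A sketch, with a dummy cost standing in for the real (expensive) alignment computation:

```python
from functools import lru_cache

calls = 0  # counts how often the expensive computation actually runs

@lru_cache(maxsize=None)
def alignment_cost(trace):
    """Stand-in for an expensive alignment computation; memoized so that
    each distinct trace variant is aligned only once."""
    global calls
    calls += 1
    return len(trace)  # dummy cost, not a real alignment

# Four traces but only two distinct variants.
log = [("a", "b"), ("a", "b"), ("a", "c"), ("a", "b")]
total = sum(alignment_cost(t) for t in log)
assert calls == 2  # cost computed once per variant, regardless of repetitions
```

This is exactly why computing time grows with the number of trace variants but stays almost constant as the number of repetitions per variant increases.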

Genetic strategy evaluation:
As a heuristic method to search for a desirable configuration, the genetic method has some parameters. We set the maximum number of generations to 20 and the population size to 10. For all experiments, the genetic strategy found an optimal configuration, which allows one to obtain an optimal derived process model; in fact, it obtained the same results as the exhaustive strategy. Figure 17a shows that there is a linear dependency between the computing time of the genetic strategy and the number of trace variants. Meanwhile, Figure 17b depicts the computing time of the genetic strategy, which is nearly constant when varying the number of trace repetitions while keeping the number of trace variants constant. As previously mentioned, this is a feature of the conformance method applied to compute the conformance metric.
Greedy strategy evaluation: The greedy strategy uses a local subtree configuration heuristic to derive the best process subtree for every configurable node. This strategy found a reasonable configuration, and the corresponding derived process model, in all experiments. The solutions are similar to the optimal derived process models obtained by the exhaustive and genetic strategies. Figure 18a shows that the computing time varies linearly with the number of trace variants. It also illustrates that the greedy algorithm is very fast; it took about 7 min in the most complex scenario, corresponding to university C with 500 trace variants. Figure 18b illustrates a slight increase in computing time when increasing the number of trace repetitions while keeping the number of trace variants constant. In summary, the greedy strategy finds a reasonable derived process model, which is very close to the optimal process model, in a short computing time, whereas the exhaustive and genetic strategies find an optimal derived process model, but require more computing time. In all cases, the performance of the algorithms varies linearly with the number of trace variants, and this does
not depend on the number of trace repetitions. In addition, Table 8 shows that the genetic strategy obtains the same conformance as the exhaustive strategy in all experiments for the three universities, except in one case (100t 10rep), in which the results are quite similar. By contrast, the conformance of the greedy strategy is below that obtained by the other two strategies. The strategies applied to the event log of university C do not reach a conformance of 0.8, meaning that providing a model for this location could be reconsidered.
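A minimal genetic-strategy sketch with the parameters above (population size 10, at most 20 generations) looks as follows. This is an illustrative stand-in for the JGAP-based implementation, not the actual plug-in: a configuration is a vector with one configurator per configurable node, and the `score` callback stands in for the conformance evaluation:

```python
import random

def genetic_configure(options, score, pop_size=10, generations=20, seed=42):
    """Genetic strategy sketch: evolve configurator vectors via selection,
    one-point crossover and point mutation (parameters as in the experiments)."""
    rng = random.Random(seed)
    # Initial population: random configurator vectors.
    pop = [[rng.choice(o) for o in options] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=score, reverse=True)
        survivors = pop[: pop_size // 2]       # elitist selection
        children = []
        while len(survivors) + len(children) < pop_size:
            p1, p2 = rng.sample(survivors, 2)
            cut = rng.randrange(1, len(options))
            child = p1[:cut] + p2[cut:]        # one-point crossover
            i = rng.randrange(len(options))    # point mutation
            child[i] = rng.choice(options[i])
            children.append(child)
        pop = survivors + children
    return max(pop, key=score)

# Toy usage: four {hide, enable} nodes, score = number of enabled nodes.
best = genetic_configure([["H", "E"]] * 4, lambda c: c.count("E"))
```

Because the best individual always survives into the next generation, the fitness of the population's best configuration never decreases across generations.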

Performance According to Model Complexity
Process model derivation depends not only on the log complexity, but also on the model complexity. This set of experiments evaluates how the strategies perform under different configurable process model topologies and different configurable node dependencies. Based on the original configurable process model, three configurable process models were built with different topological characteristics: models that only include × and ∧ operator nodes, models that only include × and loop operator nodes and models that consider all operator nodes (×, ∧ and loop). In addition, for each of these three configurable process models, three different configurable node dependencies were created: independent, dependent and mixed (as described in Section 4.3). In total, there are nine different configurable process models Q α i , where i = 1, . . ., 9; they are summarized in Table 9. Three different target models derived from each of these nine configurable process models were used for experimentation. Based on these 27 target models, 27 different event logs were simulated. In Table 9, they are represented as L i_1 , L i_2 and L i_3 , for i = 1, . . ., 9.
We applied the exhaustive, genetic and greedy strategies to all nine configurable process models with their corresponding three event logs. The results are shown in Table 9, including the computing time and the conformance metric. The computing time required by all strategies is shown in Figure 19, where the average computing time over the three event logs created for each configurable process model is displayed.
Figure 19a depicts how much time the exhaustive strategy requires to evaluate all possible configurations in order to derive the best process tree in each case. The results obtained suggest that if the configurable process model has parallelism, the computing time required to get a derived process model increases, as in the case of models with topologies that contain × and ∧ operators or ×, ∧ and loop operators. This is because the conformance method requires more time to handle parallelism, as the traces to be processed are usually longer [26]; whereas models with a topology that only contains × and loop operators require less computing time to derive a process model. It can be noticed that this is independent of the configurable node dependencies.
A similar behavior can be observed in Figure 19b for the genetic strategy and in Figure 19c for the greedy strategy. In all cases, more computing time is required when the configurable process models have parallelism, regardless of the configurable node dependencies.
Figure 19 also shows that the greedy strategy requires considerably less computing time to derive a process model than the other two strategies.

Limitations
Our work is restricted to the new use case presented in Section 1, which seeks to obtain a configuration to derive a process model from a configurable process model that maximizes the conformance function for a given event log. In this context, our research presents the following limitations. First, a configurable process model that contains independent configurable nodes must be available, including a set of feasible configurations for each configurable node. Second, the event log must be representative of the executed process, i.e., it must contain traces representing all possible behaviors of the process; otherwise, the unobserved behavior could be left out of the derived process model. Notice that this is a common limitation of all process discovery techniques used in process mining. Third, each strategy has its own limitations, since the three proposed strategies seek to balance two opposing goals: obtaining optimal solutions versus obtaining solutions quickly. As observed in the analysis of Section 5.2, the exhaustive strategy allows one to obtain an optimal solution, but takes a considerable time. In contrast, the greedy strategy allows one to obtain a solution quickly, although not necessarily an optimal one. An intermediate approach is the genetic strategy, which allows one to obtain a very good, even potentially optimal, solution in a shorter time than the exhaustive strategy. Finally, the three strategies seek to find a single process model that maximizes the conformance function, which in turn weighs the four quality criteria. In contrast, there are discovery techniques that use the Pareto front [20] to obtain a set of process models that represent the best possible solutions for different weights of the four quality criteria; this work does not consider such an approach.
The proposed approach does not address privacy issues that usually emerge in cross-organizational business process settings [36]. We assume both the configurable process model and the event log belong to the same organization, which might be a corporate group that wants to have a common reference model (a configurable process model) for a given process among all the companies it owns, and that might also be interested in obtaining a derived process model for a (new) company by using the approach proposed in this article.

Conclusions
In this paper, we propose a framework to derive a process tree from a configurable process tree based on the historical behavior of a process stored in a given event log. Through three different strategies, exhaustive, genetic and greedy, we allow deriving a process model that better conforms to the event log. The exhaustive strategy searches among all possible derived process models given by the Cartesian product of all possible configurations. This strategy finds an optimal configuration, which is used to derive the best possible process model, but not within a practical time. The genetic strategy also finds a quasi-optimal configuration (sometimes even the optimal one) to derive a process model, but in less time than the exhaustive approach. Although optimality is not guaranteed in the general case, in our experiments, it was always able to find an optimal configuration. Moreover, in our experiments, the genetic strategy did not have to iterate for many generations; it converged in fewer than 20 generations. Meanwhile, the greedy strategy finds a good approximate configuration to derive a process model in a very short time compared to the other two algorithms. We have validated the applicability of our framework using both realistic and real-life processes. The proposed strategies allow finding a very good derived process model in a short time, using the greedy strategy, or an optimal derived process model, using the exhaustive or genetic strategies.
The main future works that can extend this research are the following. First of all, user-defined rules can be used to guide the configuration process [5]. These rules can constrain the search space, reducing the computing time required to derive a process model. Second, large configurable process models could be configured in successive stages through a decomposition approach [37]. When a configurable process model is large, it can be decomposed into sub-process models that can be configured independently, in order to reduce complexity and improve performance. Third, in the greedy strategy, we used a bottom-up approach; a top-down approach could also be explored. Fourth, operator downgrading is another way to configure a node. By downgrading an operator, the behavior of the operator is restricted to a subset of the initially possible behavior [20]. Finally, when the conformance obtained by the derived process model is not good enough, one might suspect that the event log contains knowledge about the process that is not considered in the configurable process model (e.g., activities that are observed in the event log, but do not exist in the configurable process model). In that case, the event log could be used to first enrich the configurable process model, before tackling the derivation task.

Figure 1 .
Figure 1. Business process management (BPM) use cases [7] related to configurable process models: (a) manually design a configurable process model (DesCM), (b) merge a collection of process models to generate a configurable process model (MerCM) and (c) configure a configurable process model (ConCM) to obtain a process model. In this article, we propose (d) as a new use case: to derive a process model based on a configurable process model and using event data (ConCME).

Figure 2 .
Figure 2. Overview of the proposed framework to derive a process model. Three alternative derivation strategies allow obtaining a configuration that is later used to derive a process model from the configurable process model.

Figure 3 .
Figure 3. Example of a process tree model.

Figure 4
Figure 4 is an example of a configurable process tree Q α 1 that contains four configurable nodes, numbered from 1 to 4. Configurable Nodes 1, 3 and 4 can be either hidden or enabled, whereas configurable Node 2 can be either blocked or enabled.

Figure 4 .
Figure 4. Example of a configurable process tree containing four configurable nodes.

Figure 5 .
Figure 5. Running example that illustrates the derivation framework. A configurable process tree and an event log, shown in (a) and (b), are the inputs. The internal representation of a configuration used by the three strategies is represented in (c). The obtained configuration is shown in (d). Finally, the output, a derived process tree, is shown in (e).

Figure 6. Overview of the exhaustive strategy to derive a process tree. The algorithm generates all possible configurations, in order to find an optimal configuration. This configuration is used to obtain an optimal derived process tree.
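The core of the exhaustive strategy can be sketched as a loop over all candidate configurations, keeping the best-scoring one. This is a minimal illustration under assumptions: `fitness` is a caller-supplied placeholder for the replay-based quality measure the framework would compute by deriving a process tree per configuration and replaying the event log on it.

```python
from itertools import product

def exhaustive_search(node_options, fitness):
    """Score every configuration and return the best one with its score."""
    nodes = sorted(node_options)
    best_config, best_score = None, float("-inf")
    for choice in product(*(node_options[n] for n in nodes)):
        config = dict(zip(nodes, choice))
        score = fitness(config)
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score

# Toy fitness for illustration only: reward enabled nodes.
options = {1: ("hidden", "enabled"), 2: ("blocked", "enabled")}
best, score = exhaustive_search(
    options, lambda c: sum(v == "enabled" for v in c.values()))
print(best, score)  # {1: 'enabled', 2: 'enabled'} 2
```

The guarantee of optimality comes at an exponential cost in the number of configurable nodes, which is what motivates the genetic and greedy alternatives.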

Figure 7. Overview of the genetic strategy to derive a process tree. The internal configuration representation allows this evolutionary algorithm to use crossover and mutation operations to generate new candidate configurations to be assessed.
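The crossover and mutation operators mentioned in the caption can be sketched on a list-of-options encoding of a configuration. The encoding and operator details below are assumptions for illustration, not the paper's exact design.

```python
import random

# Allowed options per configurable node (list index = node), as in Figure 4.
OPTIONS = [("hidden", "enabled"), ("blocked", "enabled"),
           ("hidden", "enabled"), ("hidden", "enabled")]

def crossover(parent_a, parent_b, rng):
    """One-point crossover: a prefix of one parent spliced onto the other."""
    point = rng.randrange(1, len(parent_a))
    return parent_a[:point] + parent_b[point:]

def mutate(individual, rng, rate=0.25):
    """With probability `rate`, resample a gene from its node's allowed options."""
    return [rng.choice(OPTIONS[i]) if rng.random() < rate else gene
            for i, gene in enumerate(individual)]

rng = random.Random(0)
a = ["hidden", "blocked", "hidden", "hidden"]
b = ["enabled", "enabled", "enabled", "enabled"]
child = mutate(crossover(a, b, rng), rng)
# Every gene of the child is still a valid option for its node.
assert all(child[i] in OPTIONS[i] for i in range(4))
```

Because both operators only ever pick from each node's allowed options, every offspring remains a valid configuration and can be assessed directly against the event log.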

Figure 9. Overview of the greedy strategy to derive a process tree. Two event log inputs are used to show different subtree and sub-log scenarios when deriving a process tree.
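In contrast to the exhaustive strategy, a greedy heuristic commits to the locally best option for one configurable node at a time and never revisits it. The sketch below is an assumption-laden simplification: the `sorted` order stands in for the bottom-up traversal of the process tree, and `fitness` is again a placeholder quality measure.

```python
def greedy_configure(node_options, fitness):
    """Fix each configurable node to its locally best option, one at a time."""
    config = {}
    for node in sorted(node_options):  # stand-in for bottom-up tree order
        best_opt, best_score = None, float("-inf")
        for opt in node_options[node]:
            score = fitness({**config, node: opt})
            if score > best_score:
                best_opt, best_score = opt, score
        config[node] = best_opt  # commit the local choice and move on
    return config

# Toy fitness for illustration only: reward enabled nodes.
options = {1: ("hidden", "enabled"), 2: ("blocked", "enabled")}
result = greedy_configure(
    options, lambda c: sum(v == "enabled" for v in c.values()))
print(result)  # {1: 'enabled', 2: 'enabled'}
```

This evaluates only a linear number of candidates instead of an exponential one, at the price of possibly missing the globally optimal configuration when node choices interact.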

Figure 10. General model of the academic planning process of a university. It includes parallel flows that contain sub-processes.

Figure 11. General idea of process model derivation for three different configurable sub-processes.

Figure 12. Benchmarking of the performance of the three strategies.

Figure 17. Performance of the genetic strategy when varying the log complexity. (a) Trace variants' complexity; (b) trace repetitions' complexity.

Figure 18. Performance of the greedy strategy when varying the log complexity. (a) Trace variants' complexity; (b) trace repetitions' complexity.

Figure 19. Performance of the different strategies when varying the model complexity. (a) Exhaustive; (b) genetic; (c) greedy.

Table 1. Comparison of the proposed strategies for the three universities.

Table 2. Results for CPT 3-A and the event logs corresponding to the five process variants.

Table 3. Results for CPT 3-B and the event logs corresponding to the five process variants.

Table 4. Results for CPT 4-A and the event logs corresponding to the five process variants.

Table 5. Results for CPT 4-B and the event logs corresponding to the five process variants.

Table 6. Results for CPT 4-C and the event logs corresponding to the five process variants.

Table 7. Different log settings for the three universities.

Table 8. Results obtained for all strategies considering different log complexities.

Table 9. Results obtained for all strategies considering different model complexities.