Business Process Configuration according to Data Dependency Specification

Configuration techniques have been used in several fields, such as the design of business process models. Sometimes these models depend on the data dependencies, being easier to describe what has to be done instead of how. Configuration models enable to use a declarative representation of business processes, deciding the most appropriate work-flow in each case. Unfortunately, data dependencies among the activities and how they can affect the correct execution of the process, has been overlooked in the declarative specifications and configurable systems found in the literature. In order to find the best process configuration for optimizing the execution time of processes according to data dependencies, we propose the use of Constraint Programming paradigm with the aim of obtaining an adaptable imperative model in function of the data dependencies of the activities described declarative.


Introduction
A business process, henceforth referred to as BP, consists of a set of activities that are executed in coordination within an organizational and technical environment.These activities jointly attain a business goal [1].Several languages propose an imperative representation of business processes, whose specification allows business experts to describe an explicit order of execution between activities, and to transform the process into an executable model (for example, that activities A, B and C are executed sequentially, or activities D and E are executed in parallel) [2][3][4][5][6].Sometimes the order of the activities is therefore derived from the data dependencies, since these dependencies allow different configurations.In those cases, the imperative languages fail in the consideration of the data dependencies between activities.Configurable work-flows permit to include a degree of freedom about how the activities are related according to their data dependencies.
Configurable models tend to use a declarative representation of the configurable aspects.Although the imperative process models are significantly more understandable than declarative models [7,8], it is sometimes very difficult or even impossible to describe a problem in an imperative way [9,10].One of the cases in business processes, where this difficulty can be found, is when the data relation can determine different activities order.The variation of the model is the reason several authors have proposed languages for the definition of BPs as declarative models [11][12][13][14].Unfortunately, the data dependencies and its relation with declarative description has not been faced up.This paper develops two ideas: (1) propose a declarative description of the activities that form a BP, including the data dependencies between them; (2) based on this declarative model, the declarative BP is automatically transformed into an imperative model where the work-flow is defined to minimize the execution time of the instances when the process is executed, and keeping the data dependencies relations.To obtain the imperative model, we propose the use of Constraint Programming, an Artificial Intelligence technique.
The analysis of the data interchanged during the process execution can affect to two aspects to optimize a BP: (i) attainment of a better business product, for example, by reducing the costs or satisfying customers' preferences; and, (ii) the reduction of the execution time of the instances.The problem of determining the best product in terms of the input and output data of the activities is presented in [15].In these studies only the first type of aspect is solved, specifically, the way in which the input data can influence the successful execution of a process is analyzed in order to find an optimized product.On the other hand, in the proposal presented in this paper, these previous works are extended to include the second aspect, the optimization of the execution of the instances regarding the data dependencies.How to order the activities by considering these data dependencies in an imperative representation is a hard and difficult task.It is a hard task because it requires an exhaustive analysis of the data dependencies, and it is a difficult task because the graphical representation of the process is not simple, being able to become a "spaghetti" process [16].To support business experts in the design task, we propose an automatic transformation from the declarative model into an optimal imperative model where the execution time is minimized.The main reason of transforming declarative model into an imperative model is that imperative models are easier to implement in commercial Business Process Management Systems (BPMS).
Therefore, the aim of this article is to present a framework capable of obtaining the model configuration of the imperative description from a declarative specification based on data dependencies using Constraint Programming.This configurable system must also be able to detect errors and omissions in the declarative model, and to reason and validate the models through the different stages of the transformations.To the best of our knowledge, there are no proposals of declarative specification focused on data management, that create optimal executable BP using techniques of configuration.
The rest of the paper is organized as follows: Section 2 presents the example used in the paper to set out the proposal.Section 3 describes related work.Section 4 presents the Conf-BP Framework.Each individual component of the framework is then described in detail.Section 4.1 formalizes the declarative specification and applies it to the example.Section 4.2 proposes the automatic transformation of the declarative configurable model into a imperative BP.Section 4.3 defines the configurable system in charge of the transformation from declarative to imperative modelling using Constraint Programming, and applies it to the example.Results are presented by applying the automatic transformation to the example in Section 5. Finally, conclusions are drawn and future work is proposed in Section 6.

Detailing an Example
One clear example, where the fundamental aspect is the successful execution of a BP, arises in the form of the organization of a trip.Generally, the customer searches by hand on the Internet for the cheapest combination of flight and hotel, for specific dates and cities.In addition, if the results obtained remain unsatisfactory, then other dates are sought until a convenient combination is found.Moreover, if necessary and also cheaper, a car can be rented in order to drive to another city to take a flight from another airport, thereby expanding the range of cities of departure and arrival.
To obviate the search of several combinations by hand, it is possible to combine the activities that represent the three providers (Hotel, Flight, and Car Rental Providers) into a single BP.The question becomes how to configure the activities in a business process since the input and output are related between them.Therefore, the model of the BP implies two difficulties:

•
To find the input of the activities that optimize the trip by means of minimizing the price: The process has to provide customers with various possibilities of trips with flights, hotel and car rental (if necessary), while taking into account the existing combinations between a set of possible dates and airports.In addition, there is an objective function to optimize which selects only one of all the possible combinations obtained by the process.An evaluation and a proposal of this part of the problem is analyzed in [15].

•
To obtain an imperative model that minimizes the execution time of the BP taking into account the data dependencies: Although the activities to enjoy during the trip are: "take a car to go to the airport", "catch the flight" and "arrive to the hotel", the process of booking each part of the trip does not have to follow the same sequence.If the input data of each activity were known, all the activities related to the provider could be executed in parallel.The problem arises when certain activity inputs are related to other activity outputs, or when the activities are executed or not depending on the information obtained from activities executed previously.The objective of this paper is the use of Artificial Intelligence techniques based on Constraint Programming in order to create an imperative model where the data dependencies are taken into account to minimize the execution time of each instance.For example, it is possible for a flight to arrive at its destination on a different day to when it takes off (overseas route), and therefore the check-in date in the hotel cannot be determined until this information is known.The question is: how to configure a business process formed by these activities to minimize the execution time of the instances?This implies the analysis of every combination of activities and to select the control flow gates that relate them.

Related Work
The configuration problems have been used in various industrial application fields, such as computer networks, electrical engineering, telecommunication, data centers [17], financial services, or surveillance [18].Furthermore, configuration has long been part of the field of Artificial Intelligence.Certain attempts to formalize configuration have been proposed in [19,20].Many studies relate the term "configuration" to the setting of parameters [21,22].For instance, Czarnecki et al. in [23] describes a configuration consisting of the features that are selected according to the group and the feature cardinalities defined by a feature diagram.They include a translation from the feature model into a context-free grammar.On the other hand, Gottschalk et al. in [24] include configurable elements in order to enable the modification of the behaviour of the work-flow.The configurable elements permit to activated, blocked, and hidden components of the work-flow with the aim of deriving individual work-flow variant from a more general model.The configuration to determine the actions that should be performed to obtain an individualized model is presented in [25].In this case, a set of questionnaire models are analyzed to detect circular dependencies and to ensure the consistency of domain constraints.Moon et al. in [26] describes an approach for examining the variability of a BP.The proposed tool enables to define an overall scheme and the concept of variability in a BP, and transform it into a specific variation of the BP.Reijers et al. in [27] propose an extension of event-driven process chains to describe in a single model a set of similar processes.Their proposal includes a unique aggregated process model which consists of a common part with the commonalities of the aggregated processes, and the non-common elements of the aggregated processes.Kumar and Yao [28] configure flexible processes by associating business rules with a process template.They build a process tree representation to facilitate the process configuration.Finally, Baran et al. in [29] establish the configuration of various processes through the hierarchization.The aim is to reuse the similar parts of the model by encapsulating it in different levels.However, we focus on the definition of configuration which consists of finding sets of specific objects that satisfy the properties of a given model, such as in [30][31][32].
In this paper, we propose the application of the configuration techniques based on Artificial Intelligence to business process models.These techniques apply declarative knowledge representation and reasoning methods based on Constraint Satisfaction Problems [33].Unfortunately, BPs are not typical configuration problems oriented to execute simple tasks [34].This lack of simplicity is due to the fact that different control flow patterns must be combined in an unknown way taking into account the possible instances that will be executed at runtime.
Several perspectives can be analyzed in configuration area [35].There is quite a long history of research dedicated to the development of configuration knowledge representation language [24].Configuration in Business Processes [36] deals with the problem of managing families of business processes, i.e., business processes that are similar to one another in many ways, yet differ in some other ways from one organization, project or industry to another.This problem arises for example in the context of multinational companies that need to localize their business processes to different legislation, compliance regulations, quality requirements, etc.It also manifests itself in the context of acquisition projects, where an organization needs to merge their own processes with the ones of the acquired organization [37].In the case of the data relation, implies the possibility to create business process models only modifying the data input and output relation and keeping the rest of information.On the other hand, Vanderfeesten et al. in [38,39] define a product-based workflow support to create an optimized process based on a set of recommendations.They firstly analyze the data elements to combine them into a product data model.Then, they select a strategy and configure a product-based workflow system.However, the resulting product model does not make any choice about the ordering of activities.
The transformation from declarative to imperative models with data dependencies has never been studied as a configuration problem.However, some papers in the literature can be found about the transformation between models in BPs.Kulza and Honkisz in [40] transform a Semantic of Business Vocabulary and Rules (SBVR) [41] into a BPMN model.The work is oriented to the transformation of business rules rather than to the dependencies between data.In [42], Natschläger et al. extend BPMN with Deontic Logic.The extension aims to improve the readability and the overall structural complexity, and avoid duplication.The extension proves that a transformation by applying graphs is trusted.In [43], Wiśniewski et al. develop an approach by providing a method for business process modelling.The proposal consists of collecting data coming from different participants and merging it into one declarative specification of performed tasks.Based on this semi-formal description, authors generate synthetic logs which are then used to obtain the BPMN model.Although their BPMN composition is also based on graphs, the main difference with our proposal lies in that they do not take into account data dependencies for the process modeling.
Regarding the definition of logical dependencies between different branches in a graph-based structure, several studies are presented.Borrego et al. in [44] diagnose the correctness of semantic workflow models based on graph-theory and Artificial Intelligence techniques.Vanhatalo et al. in [45] present various techniques for automatic workflow graph refactoring and completion.Finally, Hasanov in [46] analyses the OR gateways in the context of conformance checking.
Although a certain number of these papers are focused on defining the interaction between various participants in order to achieve business goals, none of them deals with the type of problems presented in this work: the creation of an imperative model to minimize the BP execution time according to data dependencies.

Conf-BP Framework: A Configurable System to Create Imperative BP
To generate an imperative model, we propose a framework, called Conf-BP (Configuration of Activities in Business Processes).Conf-BP creates automatically an imperative model from its data declarative description.Therefore, the main objective of Conf-BP is to analyze the data dependencies since a business expert knows what is required (since the specification of the problem is given), but not the specific work-flow to know how to obtain it in an optimal way.The configurable model specifies the activities involved and the data relation between them.The relation between the activities is provided through the relationships and constraints of the data dependencies.Once all these features are specified, the framework is used to analyze the relationships between the activities, and to create the work-flow for an imperative model.The standard Business Process Model and Notation (BPMN) is used to represent the imperative model, since it can be enacted in any of the existing commercial BPMSs.
As shown in Figure 1, Conf-BP Declarative Specification (ConfD-BP) represents the highest level of abstraction, where the experts specify the model.The second stage consists of a configuration system in charge of the transformation from the declarative specification into a specific model for a particular imperative language, BPMN in our case.This transformation is performed in two steps: (1) the definition of a Constraint Optimization Problem which obtains the time interval relations between the activities; and (2) the application of an algorithm that builds the work-flow model.This algorithm creates the work-flow that minimizes the execution time of the process.Finally, Conf-BP Imperative Modelling (ConfM-BP) completes the third stage, by deploying this configuration into a BPMS.A more detailed description of the Conf-BP stages is presented below.

Conf-BP Declarative Specification (ConfD-BP). A Formalization of the Language
The first stage of the framework is the specification of the configurable model in a declarative way.The declarative language and the grammar used in this step is DOOPT-DEC [15].With the aim of better understand the type of problems solved in this article, the parts of the configurable declarative model are detailed in this section.The elements are shown in Figure 2, and they are described following.As the configurable system can be combined with other elements in a more complex work-flow, we provide the way to include the configurable description into a sub-process.To introduce the formalization of the data provided and combined (see Figure 2), the different descriptions to include are divided into: (i) Sub-process description, including the descriptions of the components associated with the activities of the imperative part, detailed in Section 4.1.1;and (ii) Sub-process relationships, which is the declarative description of the relationships between the components through the data-flow (DF), addressed in Section 4.1.2.

Sub-Process Description
The sub-process description includes the components associated with the activities, gateways, and control-flow which are known and can be represented in an imperative way.Taking A as the finite set of activities {A 1 , . .., A i , . .., A n } contained in a determined BP, the following definitions are introduced.

Definition 1. ACTIVITIES_DATA_INPUT(ADI) and ACTIVITIES_DATA_OUTPUT(ADO)
represent the sets containing, respectively, all the input data and output data involved in the execution of all the activities.
Within the set ADI/ADO, containing all the input (output) variables, it is possible to identify the input (output) variables of each activity, defined as follows.
Definition 2. DATA_INPUT (A I ) and DATA_OUTPUT (A I ) describe the set of input and output data of an activity (A I ) involved in the sub-process, respectively, so that Since various activities can share the same data input (or output), then it is also possible that: The union of all the DATA_INPUT sets and all the DATA_OUTPUT sets of all the activities, constitute the sets ADI and ADO respectively, with non-repetitive elements.
Likewise, the specific variables that are inputs or outputs of the overall sub-process can be distinguished.They represent the information that flows from the customer to the sub-process and vice versa: Definition 3. PROCESS_DATA_INPUT (PDI) is the set of input variables of the sub-process, which determines the information provided and defined by the customer.PDI is composed of a subset of variables of ADI.PDI ⊂ ADI Definition 4. PROCESS_DATA_OUTPUT (PDO) is the set of output variables of the sub-process.PDO is composed of a subset of variables of ADO.

Sub-Process Relationships
There exist different relationships between the data inputs and data outputs defined in the sub-process description.Those relationships are expressed as constraints, both at activity and at process level, giving rise to the sub-process relationships, with the following definitions: Definition 5. PRE (A i ) is the set of constraints that limits the specific values of the DATA_INPUT (A i ) that must be satisfied to execute activity A i .Likewise, POST (A i ) is the set of constraints that limits the specific values of the DATA_OUTPUT (A i ) that must be satisfied after the execution of activity A i .Definition 6. INPUT_CONSTRAINT (OUTPUT_CONSTRAINT) relates the values of variables of PDI/PDO with variables of DATA_INPUT/DATA_OUTPUT can take according to the PDI/PDO values in each instance, and the possible values between the input and output of the activities.

Definition 7. OBJECTIVE_FUNCTION (OBJ_FUNC) is an optimization function defined in terms of the data output of the activities (ADO).
The objective of this optimization function is either to maximize or to minimize some of the output data that represents the business product, which constitutes the outcome of the process.
The result of this optimization problem is a set of input values that satisfies the objective function, the pre and post-conditions of the activities, and also satisfies the input and output constraints.
The set of input values that optimizes the output is found at runtime.The role of the constraints is to determine the possible values that this input and output data can take.All the constraints (INPUT_CONSTRAINT, OUTPUT_CONSTRAINT, PRE, and POST) are always defined at design time by a business expert who is familiar with the problem, although they are solved for each instance at runtime.Among every possible tuples of solutions that satisfy the constraints, the outcome of the sub-process will be the one which optimizes the variable defined in the optimization function.

Grammar
The elements used in the declarative description, describe the data relationship by means of numerical constraints.These constraints are defined by the following grammar, which is an extension of the grammar included in [15], where Variable and Constant, which represent the constant value for the variables, can be defined using Integer, Natural, Float, or String domains.On the other hand, Set can be defined as a set of Constant values of a specific Variable.The IF-THEN constraints have been included to DOOPT-DEC grammar [15], which are equivalent to the implication logic operator (→).IF-THEN constraints are used to represent that the relation of values between some data is conditioned to the value of others.The grammar of constraints is the following: {For the String domain only '=' and ' =' are allowed } {These operators are only applicable to Numerical variables} 4.1.4.Specification Applied to the Trip Planner Example In the trip planner example, the activities involved in the model share some data, being necessary to create a work-flow aware data input and output.Figure 3 shows the trip example following the formalization detailed in the previous section and the notation described in [15].Hence, there are eight PDI whose values are given by the customer: • departingFrom: city from where the customer departs.Four different activities are combined in order to perform the package trip offered to customers.This package trip is composed of flights, hotel rooms and, if necessary, the renting of a car to drive to an alternative departure airport, or from the arrival airport to the destination city.Each activity calculates the price as output of the activities for each data input.For the example, the activities (A i ) and their ADI are:

•
Flight Search Activity (A F ) returns the price of flights for a tuple of values for the data input.

DATA_INPUT(A
The pre-and post-conditions of the activity A H are: ) are employed to determine the price of renting a car.Two cars can be rented during the trip, one at the source (A CR1 ) and another at the destination (A CR2 ).Nevertheless, the price of renting both cars is represented by A CRx , where x = {1 ∨ 2}, depends on these entries: The pre-and post-conditions of the activity A CRx are: Therefore, the output of the process, PDO, are the outputs of the activities, which contain the information about the various components of the trip, as well as the total price of the trip (totalPrice).
The customer only provides data to the sub-process: the eight PDI.Then, each search activity only uses the necessary data, as detailed in its specification, for the searching.On the one hand, the sub-process can have input data that is not used by some search activity: for example, the Flight Search Activity takes the possible dates and cities given by the customer to the sub-process, meanwhile, the Hotel Search Activity takes the possible dates, the destination city and the preferences.The constraints between the input data of the sub-process and the activities are specified as the set of INPUT_CONSTRAINT at design time.In the same way, the sub-process returns different data with respect to those returned by the activities, and it is necessary to define, at design time, how these data are calculated and related with the output data of the activities.Therefore, the set of OUTPUT_CONSTRAINT define these relationships between the output data of the activities and the data returned by the sub-process to the customer.Although in the experimental evaluation all the constraints have been included, only the most representative constraints have been formulated in this section.Regarding the relationships between the data input and output that belong to the sub-process and the activities, the most representatives INPUT_CONSTRAINT and OUTPUT_CONSTRAINT are detailed below: There are several numerical constraints that relate data input and output belonging to the process and the activities.Some of the constraints are defined below: ( -If the flight arrives at the destination city (C7), then it is not necessary to rent a car at the destination city.
(C7) A F .goingTo = goingTo → A F .goingTo = A H .location -If the flight does not depart from the departure location (C8), then the rental of a car (CR 2 ) is necessary.
• OUTPUT_CONSTRAINT: -The total price is the sum of all the prices returned by the activities, as presented in constraint (C9).
In this example, the optimization involves the minimization of the total price of the trip, which is composed of the cost of buying flight tickets, staying in a hotel room, and renting cars for the departure and arrival cities.

Conf-BP Imperative Modelling (ConfM-BP)
Once the declarative model is described, it can be possible to obtain an imperative model that satisfies the data dependencies.Between the different options, we propose the most optimal, according to the execution time of the instances of the process.To model the BP work-flow in an imperative way, the standard BPMN [47] is chosen, since it is supported by several commercial BPMS.The activities will be combined in a sub-process that starts and ends with the corresponding events.The various order combination of the activities will be represented with a sequence or by means of the control flows: parallel, exclusive or inclusive execution.
Figure 4 shows how the imperative configuration obtained from the declarative description is included in a business process model.The activity "Supply Input Data Values" provides the input data values used during the process execution.As was commented before, sometimes the most appropriate input data to optimize the business product are unknown at design time, since it depends on each instance.The used example, the trip planner, has these characteristics, since the dates and city airport must be found, as detailed in [15].The sub-process "Execute sub-process" represents the imperative model that should be created according to the data dependencies described and detailed in the following sections.This sub-process is formed of the set of activities involved in the declarative model by means of the BPMN connections and gateways as detailed in Section 4.3.Several possible imperative models exist that satisfy the data dependencies, but our objective is to find the optimal model with respect to execution time of any instantiation of the BP.A simplification of the possibilities of the problem are shown in Figure 5, where only the activities "Flight Search" and "Car Rental 1 Search" are considered for the modelling configuration.Analyzing the five possibilities we can find out that: (a) and (b) describe a sequence relation between two activities that can mean that one of the activities has priority order over the another activity, for the example it will be used if 'Car Rental 1' cannot be executed until 'Flight Search' ends or vice versa, and; (c), (d) and (e) represent the execution of one (XOR gateway), more than one (or gateway), or every activities at the same time.For the example it will be used if there not exists an order dependency between the activities.Therefore, which is the best model option to satisfy our trip problem?This modelling combination and analysis increase considerably as soon as the number of activities grows.Hitherto, this configuration has been made by human experts, we propose obtain it automatically in the Conf-BP Framework.Therefore, in order to transform this declarative model into an imperative model that supports any value of input variables of the process by considering the data dependencies, we propose the use of the Constraint Programming paradigm, as explained in Section 4.3.

Automatic Transformation from Declarative to Imperative Model
The transformation from the declarative description into an imperative model is a difficult and hard task, since it implies the analysis of every possible configuration to obtain at design time the most optimal model at runtime.The difficulty lies in establishing the order of the activities and the control flow components that combine the activities, taking into account the data dependencies.To perform the optimal transformation, we propose the three steps shown in Figure 6.Solve the COP: To solve the COP created in the second step, it is necessary to analyze the possible values of the variables to find the satisfied tuples.To avoid the analysis of every possibilities of values of variables, the solvers use a combination of search and consistency techniques [49] reducing drastically the complexity and time consuming.The consistency techniques remove inconsistent values from the domains of the variables during or before the search.Several local consistency and optimization techniques have been proposed as ways of improving the efficiency of search algorithms.There are several commercial constraint problem solvers.The main difference between them is the programming language that they use, and the types of constraints that can be included.For the proof of concept to validate the configuration problem, we have used Jsolver TM [50], although any of the existing CSP solvers could be used, since the constraints that we need to include (explained in Section 4.3.1)are very common and supported by every commercial solvers.The resolution of the COP will obtain a set of time intervals, each of them associated with an activity, representing the moment when each activity can start and end.

3.
Translate the COP solution into a BPMN Model: As explained in detail in Section 4.3.2,by using the results obtained from the COP, the imperative model can be created.This result has to be interpreted as gateways that relate the activities.For example, if the COP allows that two activities A and B could start at the same instant of time, perhaps there is a constraint which states that only one activity can be executed in each instance.In that case, there can be an exclusive or inclusive gateway relating the two activities.The decision between the possible gateways between the various activities is based on: (a) the study of the domains of the data related in the declarative problem description, and; (b) the data obtained from the resolution of the COP as it is explained in Section 4.3.2.

Configuration of a BP Model: Creating a COP from the Declarative Model
The relation between the input and output of the activities determines their relational order.For the same declarative model, it is possible to find several configurations of activities that satisfy the requirements and description.To find the best configuration, we propose the use of an intelligent system to find the optimal BP model that maximizes the parallelism of the activities, with the aim of minimizing the execution time.To determine this configuration, we suggest the transformation of the declarative model into a COP, to analyze the possible instant when the activities can start and end their executions, according to data dependency.
Firstly, and before to explain how to create the COP, it is necessary as a previous step to ascertain the data relation, especially when they are included in the IF-THEN constraints of the description of the problem.The definition of Tuple of Condition-Relation in order to introduce this relation is introduced.Definition 8.A Tuple of Condition-Relation is a tuple formed of Condition, Activity 1, Activity 2 , where Condition represents under which condition Activity 2 is executed after Activity 1.Therefore, the value true in the Condition implies that Activity 2 is always executed after Activity 1, or, in the case when the Condition contains an expression, Activity 2 is executed after Activity 1 if and only if the expression is met.
These tuples are built analyzing the data relations found in the set of INPUT_CONSTRAINT and OUTPUT_CONSTRAINT that describe the declarative model.For each constraint where the output of an activity A i is related to the input of an activity A j , a tuple is created.If the input-output relation appears in an IF-THEN constraint with a c condition, the tuple will be c, A i , A j , and true, A i , A j otherwise.
With the list of intervals and the list of tuples of data relations, the following COP is created: The possible instants when each activity can be executed, are defined as the tuple of variables t ini is the instant of time in which the activity A i begins, and t A i end is the instant of time in which the activity A i finishes.For each activity A i , the COP will include a couple of variables t A i ini , t A i end .To represent the execution time of the whole process, the variable T is included, that represents the t end of the last activity executed.If every activities is executed in parallel, T will be the maximum value of t A i , but if every activities are executed sequentially, the value of T will be the summation of every t A i .
• Domain: The domain of the t A i ini and t A i end variables is related to the execution time of each activity.If we suppose that each activity spends t A i units of time, then the domain of each variable is: The necessary constraint to model the COP are: -It is mandatory that for every activity A i , t T has to be the greatest value of t end , then T ≥ t A i end for every A i .

-
For each element of the list of tuples building by using the data relations, a numerical constraint is included in the COP.Each relation between data input and output in the way end .For example, in the trip planner, the constraint {A H .checkInDate = A F .outwardArrivalDate} is translated into t A H ini ≥ t A F end .Therefore, all the activities could start at the same instant unless there exists a relationship requiring one activity to start after another.

•
Objective: The optimization function of the COP is the minimization of the total time, it implies to minimize the value of T (minimization(T)).
Table 1 gives a summary of the equivalence of the elements of the formalization and the COP built.The solving of the COP will obtain a pair of start and end instants of time for each activity that minimizes any instantiation time (T) for the model.As mentioned earlier, the minimum execution time of an activity is modelled as a t A i units of time.But it is possible to obtain the same value of (T), where the execution time is minimized, with different t A i ini and t A i end for the same problem.Suppose activities (A 1 , A 2 , A 3 and A 4 ) that last a unit of time, where the input data of A 4 depends on the output of A 1 , and the input data of A 3 depends on the output of A 4 .Figure 7 depicts two possible configurations (a) and (b) that minimize the theoretical execution time (t 3 ).However, there exist more variations, since A 2 can be executed in a parallel way with any of the other activities.
Although both options (a) or (b) are optimal according time consuming, the model of Figure 7b is better than the model of Figure 7a, since it supports an unexpected delay of A 2 executing, not increasing the total execution time (t 3 ).
To make the imperative model more robust to unexpected delays, we propose to modify the proposed COP to find the most paralleled process.To this end, the COP must find the greatest value of t A i end for each A i .It is obtained including the t A i ini variables as a goal in the COP, and leaving the domain of t A i end open during the search.Then, the t A i ini variables will be instantiated during the propagation phase, while all the possible values of the domains of t A i end variables will be obtained as solutions of the COP for each value of t A i ini .The greatest value of the t A i end domain is the most appropriate value to build an imperative robust business process model.Create a variable: Create variable: int t A i end = 0.. ∑ n i=1 t A i 6: end for 7: for each Activity(i) do 8: Create constraint: Create constraint: TEnd ≥ (t A i end + t A i ) 10: end for 11: for each ADO i = ADI j , where an output of Activity i is related to the input of Activity j do 12: Create constraint: (t A i end + t A i ) = t A j ini 13: end for 14: Define goal variables: { (t A 1 end + t A 1 ), ..., (t A n end + t A n ), TEnd } 15: Define objective function: Minimize (TEnd − TStart)

Transformation of the COP results into a BP Imperative Model
The list of time intervals, obtained from the COP, with the most appropriate values for t A i ini and t A i end for each activity, has to be translated into an imperative BP model.For the example of Figure 7b, the obtained list is: {A There exist various imperative models in business processes, but the most widely used standard language is BPMN [47].Although the model in the standard is represented by means of an XML file, to facilitate the understanding of the algorithm that obtains the imperative model, we propose the use of a BPMN-Graph.As explained in [1], BPMN is a graph-oriented language in which control and action nodes can be connected.Therefore, we propose to model the BPMN using a directed graph as defined in [51].
The problem is solved in two steps: firstly, the BPMN-Graph is built not specifying the type of gateways, only including which are splits and joins, and; secondly, the uncertain about the gateways is solved analyzing the list of tuples created in function of the IF-THEN constraints of the model, that related the activity execution on function of the value of other variables.
Step 1: Building BPMN-Graph The proposed BPMN-Graph is a direct graph composed of: (i) vertices, which represent the activities and the gateways; and (ii) edges, which represent the sequence flows that join the various vertices.Firstly, the idea of the algorithm is explained by means of a trace, after which the algorithm is detailed.Analyzing the t A i ini and t A i end of each activity, it is possible to discover the activities that have parallel or sequential relation between them.The main idea of the algorithm is to detect these relations by building sub-graphs, that will be combined in a sequential or parallel way.An example of the trace of the algorithm is shown in Figure 8.The initial Problem is formed by a set of activities A 1 , ..., A 8 , associated with a time interval t A i ini , t A i end , and a list of tuples that describe the condition about the activities order dependencies.The algorithm divides the problem in function of the sequential points and the parallel subsets.Both definitions are introduced: Definition 9. Sequential point (List of Time Intervals): For a List of Time Intervals, a sequential point is an instant, or set of instants of time at which no activity can be executed.It implies that in a sequential point, only the execution of an activity can start or end, and therefore the problem can be broken into two smaller problems that can be executed sequentially.

Definition 10. Parallel subsets (List of Time Intervals, List of Tuples of Relations):
For a list of Time Intervals associated with the activities A 1 . . .A n , the parallel subsets are a set of subset of activities ps 1 , . .., ps m , where there does not exist any activity a ∈ ps i and an activity b ∈ ps j , where ps i = ps j , such that there exists a tuple in the List of Tuples of Relations that condition, a, b .A way to obtain the parallel subsets, is creating a parallel subset for each activity (ps 1 , . .., ps n ), if there exist an activity a ∈ ps i and another activity b ∈ ps j , both subsets are merged.This process is repeated until no more sets can be merged.
With the aim of facilitating the understanding of the algorithm and our proposal in general, we use the example of Figure 8, with the list of tuples: The trace follows the next steps: 1.
Look for sequential points: Firstly, the set of activities should be separated by the sequential points (arrow 1 in Figure 8).The separation is carried out by detecting those instants of time where there is a sequential point.The intervals that are derived from these sequential points delimit the subset of activities.In the example, the set of activities is separated into two subsets of activities, since there is only one sequential point in t 3 .The first subset secp1 goes from t 0 to t 3 and includes all activities except activity A 8 .On the other hand, the second subset secp2 goes from t 3 to t 4 and only includes activity A 8 .

2.
Solving sequential sub-problems: Each subset of activities is solved as a sub-problem (arrows 2 and 3) and combined sequentially both BPMN-sub-graphs are solved.

3.
Look for parallel subsets: The secp1 is separated in three parallel subsets: subg1, subg2 and subg3.To obtain these subsets, the idea described in Definition 10 is applied: (a) Create a subset of each activity: {ps 1 : {A Tuple relations (R5) and (R6) are not used in the creation of subsets since A 8 is not involved in the parallel analysis.

4.
Solving parallel sub-problems: The next step implies to solve each sub-problem (arrows 4, 5 and 6), followed by the search of sequential points.

5.
Combining parallel sub-problems: To solve secp1 it is necessary to combine the sub-graphs obtained by solving subg1, subg2 and subg3 (return of arrow 4, 5 and 6).These BPMN-sub-graphs are combined as various branches joined by a split and join node (G2 in the example).6.
Combining sequential sub-problems: The obtained results after a sequential analysis of sub-graphs (returns of arrow 2 and 3) implies joining the results of secp1 and secp2.To this end, the result (BPMN-Graph) of secp1 and secp2 are joined by an edge between the final vertex of secp1 and the initial vertex of secp2, which are G2 and A 8 respectively.return sol 30: end procedure Algorithm 2 details the procedures for the sequential and parallel treatments explained above.Algorithm 2 takes a Problem and transforms it into a BPMN-Graph.Initially, the Problem is treated sequentially with the algorithm sequentialTreatment, which finds the sequential points following Definition 9.Each of the subproblems found, has to be analyzed to identify the parallel subsets of activities with the algorithm parallelTreatment (following Definition 10).The parallel treatment of a Problem consists of gathering the activities that are related into subsets and creating new problems from these subsets.The relationships among the activities are given by the tuples of Condition-Relation of the Problem.Each subproblem is then solved recursively.To combine the parallel subsets, two gateway vertices are inserted as split and join gateway respectively.Each BPMN-sub-graph will be a branch related by means of the gateway.To combine two sequential BPMN-sub-graphs, only and edge needs to be included between the last node of the first solution and the first node of the second solution.The algorithm draws to a halt when reaching a base case, or when the problem contains only one activity.In both treatments, the solution of a base case is a graph with a unique vertex that represents this activity.
Step 2: Defining The Gateways Once the graph is created, the correct gateways that diverge the sequence flows have to be selected.In the graph, the gateways are identified by means of studying the tuples of Condition-Relation obtained from the constraints, defined in the declarative specification.
The different types of relations described by means of the tuples of Condition-Relation, and how they affect to the type of gateway are explained below:

•
Parallel: two activities are executed in a parallel way if there are no conditions that relate the t ini of both activities (see Figure 9).• Exclusive: two activities are executed in an exclusive way if the constraints that relate the t ini of the two activities and the domain of these constraints are complete and do not overlap.For example, as shown in Figure 10, if the execution of A 2 and A 3 depends on a value of the output data A 1 .output,but there are no overlaps (A 2 is executed when A 1 .output is less than 50 and A 3 is executed when A 1 .output is greater than or equal to 50), then there is an exclusive relationship between them.There is another possibility when only one condition is described, for example only exists the condition (A 2 is executed when A 1 .output is less than 50), in that case it means that there are two branches for the XOR gateway, but one of them with no activities.• Inclusive: two activities are executed in an inclusive way if there are conditions that relate the t ini of both activities, and the domain of these conditions are complete and overlapped.For example, as shown in Figure 11, if the execution of A 2 and A 3 depends on a value of the output data A 1 .output,but there are overlaps between the domains that satisfy the constraints (A 2 is executed when A 1 .output is less than 75 and A 3 is executed when A 1 .output is greater or equal to 25, and hence both coincide when A 1 .output is greater than 25 and less than 75), then there is an inclusive relationship.Table 2 gives a summary of the resulting gateways depending on the conditions established by the constraints that relate the data of the activities.For the example explained in Figure 8, it is necessary to analyze G2 and G1 (see Figure 12).Since there are no relations between activities A 1 , A 2 , and A 6 , then G2 is a parallel gateway.On the other hand, the execution of activities A 3 and A 4 depends on the outputs of A 1 .The conditions specified in relations R1 and R2 indicate that the domain of the constraints is complete and there are no overlaps, therefore, G1 is an exclusive gateway.

Results: Transformation Applied to the Trip Planner Example
To illustrate the use of the algorithms, the application of the algorithms to the Trip Planner example is presented in this section.
The first step is to obtain the list of tuples of Condition-Relation according Definition 8. Analyzing the constraints of the example, we can find: (i) Constraints C1, C2, C3, and C4 do not relate input and output of the activities, then they are not involved in the list of tuples of Condition-Relation; (ii) Constraint C5 is used to create the relation: true, A F , A H ; (iii) Constraint C6 is used to create the tuple: Condition of C6, Start − Event, A CR1 ; (iv) Constraint C7 creates the same tuple than Constraint C5: true, A F , A H , and; (v) Constraint C8 is used to create the tuple: Condition of C8, A F , A CR2 }.
After that, the second step is to build the COP as detailed in Section 4.3.1, that for the example will have the form.
end -For the list of tuples of Condition-Relation mentioned above, the constraints included in the COP are: * The obtained values of the variables after the resolution of the COP are: The resulting BP Model is shown in Figure 13.According to the results, "Flight Search" and "Car Rental 1 Search" activities start at instant 0. However, the "Car Rental 2 Search" activity depends on the value of the output data of the "Flight Search" activity, so there is an exclusive gateway before this activity.Since there is not a complete domain with the constraint that determines the execution of the "Car Rental 2 Search" activity, then there is a branch by default which indicates that nothing happens when the condition is not met.The same situation occurs with "Hotel Search" and "Car Rental 2 Search" activities.The solution parallelizes the activities according to the data input and output relations, being possible execute it in two units of times.

Empirical Evaluation
Since the generated imperative only depends on the declarative model, it is created only once at design time.The transformation performed involves a COP and two own-developed algorithms.Therefore, the critical points of our proposal lies in the resolution of the COP and these algorithms.
Regarding the resolution of a COP, we use one of the existing commercial tools to evaluate the COP, since the COP resolution has remained a known problem studied by researchers in the area over several decades, and is supported by an important set of tools used in real problems.In general, the time to find the solution of a COP depends on: (1) the type of constraints (lineal, polynomial); (2) the number of constraints; and (3) the number and type of the variables.It is possible to find an analysis in [52] on the complexity of the NP-complete constraint problem resolution according to these characteristics.The complexity of resolution of CSPs depends on the number of possible solutions of the problem, and on whether it is neither under constraint nor unsolvable.The most complex of these problems are those that are neither under constraint nor unsolvable.For these reasons, no affirmation can be given concerning the efficiency in a generic way of our proposal, since our declarative specification enables any type and any number of constraints; therefore the evaluation time depends on the specific problem.Depending on the number of constraints associated with a COP, and the number of variables, the COP evaluation time remains variable.
On the one hand, in relation to Algorithms 1 and 2, the computational order is linear since it only depends on the number of activities.On the other hand, Algorithm 2 defines two recursive functions which are related between each other.The main idea of this algorithm is to apply a Divide-and-Conquer Method by means of reducing the problem in sub-problems, being the smallest problem, in our case, a single activity.Therefore, in both the best and the worst cases, the computational order is linear since it depends on the number of activities either they have a parallel or sequential relationships.
With the aim of performing the evaluation, both algorithms are executed over a set of 50 generated test cases (see Table 3).Each test case establishes a different number of activities with a set of PDI and constraints that relate them.On the one hand, as explained before, the number of activities determines the complexity of solving the problem.On the other hand, the number of PDI and the constraints determine the structure of the final imperative business process.Figure 14 shows the computing time needed to solve both algorithms.The test cases are measured using a PC with an Intel Core i7-2675QM CPU with a 2.2 GHz processor and a 8 GB of RAM.

Conclusions and Future Work
Declarative languages have been focused on the order of activities, and not on how the data values and dependencies can affect the optimal execution of a process.In this paper, we apply configuration problems ideas to the construction of an imperative model according to the declarative description of the data relation.Our proposal shields the business experts from unnecessary details, and to provide assistance when experts know what they want (the BP requirements by means of data dependencies) but do not know how to attain what they want (establish the order of the activities), Conf-BP Framework is proposed.Conf-BP establishes a methodology to transform a declarative model into an imperative representation using BPMN.Business experts only have to specify the BP requirements, leaving the configuration of the activities in the imperative model to the intelligent system, which uses the Constraint Programming paradigm and a set of proposed definitions and algorithms.This automatic methodology obtains an imperative model described by BPMN where the optimal model related to execution time is obtained.In addition, the framework is applied to an example: the process of organising a trip.Future work into this area involves increasing the capabilities of Conf-BP by including loops within the model, where there exists a cyclic relation between the data.
• goingTo: destination city.• departDate: the day that the customer prefers to depart.• returnDate: the day that the customer prefers to return.• airportDepartingFrom: the departure airport to catch the flight.• airportGoingTo: the arrival airport for the flight.

-
The constraints that establish the values of departure date (C1) and return date (C2) of the flights have to coincide with the input data proposed by the customer.(C1)A F .departDate = departDate (C2) A F .returnDate = returnDate -The constraints that describe the values of the departure airport (C3) and arrival airport (C4) of the flight have to coincide with the input data proposed by the customer.(C3)A F .departingFrom = airportDepartingFrom (C4) A F .goingTo = airportGoingTo-The date of check-in into the hotel should coincide with the arrival date of the outward flight (C5).(C5)A H .checkInDate = A F .outwardArrivalDate -If the flight does not depart from the departure location (C6), then the rental of a car (CR 1 ) is necessary .

Figure 4 .
Figure 4. Imperative Representation of the Declarative Model.

Figure 7 .Algorithm 1
Figure 7. Flexibility of the BP in terms of the execution time of the Activities.Algorithm 1 describes the creation of the COP from a declarative specification as explained before.

Figure 8 .
Figure 8. Trace example of the Algorithm to create a BPMN-Graph using the COP results.

Figure 12 .
Figure 12.From graph to BPMN Model Example.
Let V k ={x k 1 , x k 2 , . .., x k l } be a subset of X, and an l-tuple (x k 1 , x k 2 , . .., x k l ) from d(x k 1 ), d(x k 2 ), . .., d(x k l ) can therefore be called an instantiation of the variables in V k .An instantiation is a solution if and only if it satisfies the constraints C. If an objective function is included in the CSP, it is called a Constraint Optimization Problem (COP).It is necessary to highlight that CSP and COP specifications are very similar to the process models proposed here, since both are declarative models which define the problem but do not solve it.A solver of CSPs or COPs tries to find a possible tuple of values for the variables that satisfy the constraints defined in the domain.2.
[48]eate a Constraint Optimization Problem using the declarative Model: As explained in detail in Section 4.3.1,weusedConstraintSatisfaction Problems (CSPs)[48]to solve the configuration problem.CSPs represent a reasoning methodology consisting of the representation of a problem by means of a set of variables, domains and constraints.CSPs have also a declarative description, then very similar to the Declarative Specification of ConfD-BP.CSPs are a widely used model-based knowledge representation formalism.In a formal way, it is defined as a tuple X, D, C , where X = {x 1 , x 2 , . .., x n } is a finite set of variables, D = {d(x 1 ), d(x 2 ), . .., d(x n )} is a set of domains of the values of the variables, and C = {C 1 , C 2 , . .., C m } is a set of constraints.A constraint C i = (V i , R i ) specifies the possible values of the variables in V that simultaneously satisfy R.

Table 1 .
Declarative model and COP elements relationship.

Algorithm 2
Create a BP model from COP result.

Table 2 .
Type of gateway decision.

Table 3 .
Summary of the test cases used for evaluation.