Open Access: this article is freely available and reusable.
Processes 2017, 5(4), 54; https://doi.org/10.3390/pr5040054
Article
Dynamical Scheduling and Robust Control in Uncertain Environments with Petri Nets for DESs
GREAH Research Group, UNIHAVRE, Normandie University, 76600 Le Havre, France
Received: 3 September 2017 / Accepted: 21 September 2017 / Published: 1 October 2017
Abstract
This paper is about the incremental computation of control sequences for discrete event systems in uncertain environments where uncontrollable events may occur. Timed Petri nets are used for this purpose. The aim is to drive the marking of the net from an initial value to a reference one, in minimal or near-minimal time, while avoiding forbidden markings, deadlocks, and dead branches. The approach is similar to model predictive control with a finite set of control actions. At each step only a small area of the reachability graph is explored, which leads to a reasonable computational complexity. The robustness of the resulting trajectory is also evaluated according to a risk probability. A sufficient condition is provided to compute robust trajectories. The proposed results are applicable to a large class of discrete event systems, in particular in the domain of flexible manufacturing. However, they are also applicable to other domains such as communication, computer science, transportation, and traffic, as long as the considered systems admit Petri net (PN) models. They are suitable for dynamical deadlock-free scheduling and reconfiguration problems in uncertain environments.
Keywords:
discrete event systems; timed Petri nets; stochastic Petri nets; model predictive control; scheduling problems
1. Introduction
The design of controllers that optimize a cost function is an important objective in many control problems, in particular in scheduling problems that aim to allocate a limited number of resources among several users or servers so as to optimize a given cost function. In the domains of flexible manufacturing, communication, computer science, transportation, and traffic, the makespan is commonly used as an effective cost function because it leads directly to minimal cycle times. However, due to multilayer resource sharing and routing flexibility of the jobs, scheduling problems are often NP-hard. Many recent works in the operations research, automatic control, and computer science communities have studied such problems. In the operations research community, flow-shop and job-shop problems have been investigated for a long time [1,2] and many contributions have been proposed, based either on heuristic methods (such as the Nawaz, Enscore, and Ham or the Campbell, Dudek, and Smith heuristics) or on artificial intelligence and evolutionary theory [3,4,5]. In the automatic control community, automata, Petri nets (PNs), and max-plus algebra have been used to solve scheduling problems for discrete event systems (DESs) [6,7]. In particular, with PNs, the pioneering contributions for scheduling problems are based on the Dijkstra and A* algorithms [8,9]. Such algorithms explore the reachability graph of the net in order to generate schedules. Numerous improvements have been proposed: pruning of non-promising branches [10,11], backtracking limitation [12], determination of lower bounds for the makespan [13], best-first search with backtracking and heuristics [14], or dynamic programming [15]. By combining scheduling and supervisory control in the same approach, one can also avoid deadlocks. Some approaches have been proposed: search in the partial reachability graph [16], genetic algorithms [17], and heuristic functions based on the firing vector [13,18].
The performance of operations research approaches is generally good compared to that of automatic control approaches as long as static scheduling problems are considered. The advantage of solving scheduling problems with PNs or other tools from control theory is the use of a common formalism to describe a large class of problems and to ease the transfer of a representation from one problem to another. In particular, PNs are suitable to represent many systems in various domains such as flexible manufacturing, communication, computer science, transportation, and traffic [6,7]. This makes such approaches more suitable for dynamic and robust scheduling in uncertain environments. However, modularity and genericity usually come at the price of a large computational effort that disqualifies the approaches for numerous large systems.
This work aims to propose a modular and generic approach of low complexity. It details a method for timed PNs that incrementally computes control sequences in uncertain environments. Uncertainties are assumed to result from system failures or other unexpected events, and robustness with respect to such uncertainties is obtained thanks to a model predictive control (MPC) approach. The computed control sequences aim to reach a reference state from an initial one. Forbidden states, such as deadlocks and dead branches, are avoided. The trajectory duration approaches its minimal value. Thanks to its robustness, the proposed approach generates dynamical and reconfigurable schedules. Consequently, it can be used in a real-time context. Resource allocation and operation scheduling for manufacturing systems are considered as the main applications. The robustness of the resulting trajectory is evaluated as a risk belief or probability. For that purpose, structural and behavioral models of the uncertainties are considered. Finally, robust trajectories are computed. Compared to our previous works [19,20,21,22], the main contributions are: explicitly including uncertainties by means of uncontrollable stochastic transitions in the PN model; evaluating the risk of the computed control sequences; and proposing a sufficient condition for the existence of robust trajectories.
The paper is organized as follows. In Section 2, the preliminary notions and the proposed method are developed: timed PNs with uncontrollable transitions are presented, non-robust and robust control sequences are introduced, and the approach to compute non-robust and robust control sequences with minimal duration is developed. Section 3 illustrates the method on a simple example and then presents the performance for a case study. Section 4 is a discussion about the method and the results. Section 5 sums up the conclusions and perspectives.
2. Materials and Methods
2.1. Petri Nets
A PN structure is defined as G = <P, T, W_{PR}, W_{PO}>, where P = {P_{1}, …, P_{n}} is a set of n places and T = {T_{1}, …, T_{q}} is a set of q transitions with indices {1, ..., q}. W_{PO} ∈ (N)^{n×q} and W_{PR} ∈ (N)^{n×q} are the post- and pre-incidence matrices (N is the set of nonnegative integers), and W = W_{PO} − W_{PR} ∈ (Z)^{n×q} is the incidence matrix (Z is the set of integers). <G, M_{I}> is a PN system with initial marking M_{I}, and M ∈ (N)^{n} represents the PN marking vector. The enabling degree of transition T_{j} at marking M is given by n_{j}(M):
$$n_{j}(M)=\min\left\{\left\lfloor m_{k}/w^{PR}_{kj}\right\rfloor : P_{k}\in {}^{\circ}T_{j}\right\} \tag{1}$$
where °T_{j} stands for the preset of T_{j}, m_{k} is the marking of place P_{k}, and w^{PR}_{kj} is the entry of matrix W_{PR} in row k and column j. A transition T_{j} is enabled at marking M if and only if (iff) n_{j}(M) > 0; this is denoted as M [T_{j} >. When T_{j} fires once, the marking varies according to ∆M = M’ − M = W(:, j), where W(:, j) is column j of the incidence matrix. This is denoted by M [T_{j} > M’ or equivalently by M’ = M + W.X_{j}, where X_{j} denotes the firing count vector of transition T_{j} [7]. A firing sequence σ is defined as σ = T(j_{1})T(j_{2})…T(j_{h}), where j_{1}, ..., j_{h} are the indices of the transitions. X(σ) ∈ (N)^{q} is the firing count vector associated with σ, |σ| = ||X(σ)||_{1} = h is the length of σ (||·||_{1} stands for the 1-norm), and σ = ε stands for the empty sequence. The firing sequence σ fired at M leads to the trajectory (σ, M):
$$(\sigma, M)=M(0)\,[T(j_{1}) > M(1) \dots M(h-1)\,[T(j_{h}) > M(h) \tag{2}$$
where M(0) = M is the marking from which the trajectory is issued, M(1), ..., M(h − 1) are the intermediate markings, and M(h) is the final marking (in what follows, we write M(k) ∈ (σ, M), k = 0, …, h). A marking M is said to be reachable from the initial marking M_{I} if there exists a firing sequence σ such that M_{I} [σ > M, and σ is then said to be feasible at M_{I}. R(G, M_{I}) is the set of all markings reachable from M_{I}.
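The enabling rule of Equation (1) and the marking update M’ = M + W(:, j) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two-place net `W_PR`/`W_PO` below is hypothetical (it does not come from the paper), transitions are indexed from 0, and the function names are chosen for this example.

```python
# Hypothetical net (not from the paper): T0 moves one token of P0 into
# two tokens of P1, and T1 moves one token of P1 back into P0.
# Matrices are lists of rows: n places (rows) x q transitions (columns).
W_PR = [[1, 0],
        [0, 1]]
W_PO = [[0, 1],
        [2, 0]]

def enabling_degree(M, W_pr, j):
    """n_j(M): min over the preset places of floor(m_k / w^PR_kj)."""
    ratios = [M[k] // W_pr[k][j] for k in range(len(M)) if W_pr[k][j] > 0]
    # A source transition has an empty preset; this sketch reports None
    # for that case instead of an unbounded degree.
    return min(ratios) if ratios else None

def fire(M, W_pr, W_po, j):
    """Return M' = M + W(:, j); raise if T_j is not enabled at M."""
    if enabling_degree(M, W_pr, j) == 0:
        raise ValueError(f"transition {j} is not enabled at {M}")
    return [M[k] + W_po[k][j] - W_pr[k][j] for k in range(len(M))]

def fire_sequence(M, W_pr, W_po, sigma):
    """Return the list of markings M(0), ..., M(h) visited by (sigma, M)."""
    trajectory = [list(M)]
    for j in sigma:
        trajectory.append(fire(trajectory[-1], W_pr, W_po, j))
    return trajectory
```

For instance, `fire_sequence([1, 0], W_PR, W_PO, [0, 1, 1])` lists the markings visited when T0 fires once and T1 fires twice.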
2.2. Forbidden, Dangerous and Robust Legal Markings
For control issues, the set of transitions T is divided into two disjoint subsets T_{C}, and T_{NC} such that T = T_{C} ∪ T_{NC}. T_{C} is the subset of q_{C} controllable transitions, and T_{NC} the subset of q_{NC} uncontrollable transitions. Without loss of generality T_{C} = {T_{1}, …, T_{qC}} and T_{NC} = {T_{qC+1}, …, T_{qC+qNC}}. The firings of enabled controllable transitions are enforced or avoided by the controller, whereas the firings of uncontrollable transitions are not, and uncontrollable transitions fire spontaneously according to some unknown random processes. A set of marking specifications is also defined with the function SPEC: for any marking M ∈ R(G, M_{I}), SPEC(M) = 1 if M satisfies the marking specifications, otherwise SPEC(M) = 0. When no specification is considered, SPEC(M) = 1 for all M ∈ R(G, M_{I}). The two disjoint sets F(G, M_{I}, M_{ref}) and L(G, M_{I}, M_{ref}) of forbidden and legal markings respectively are introduced:
$$L(G, M_{I}, M_{ref})=\left\{M\in R(G, M_{I}) : \exists\,\sigma\in (T_{C})^{*} \text{ with } M\,[\sigma > M_{ref} \text{ and } SPEC(M')=1 \text{ for all } M'\in(\sigma, M)\right\} \tag{3}$$
$$F(G, M_{I}, M_{ref})=R(G, M_{I})\setminus L(G, M_{I}, M_{ref}) \tag{4}$$
In other words, a marking M ∈ R(G, M_{I}) is legal with respect to M_{ref} if a trajectory exists from M to M_{ref} that contains only controllable transitions and only markings that satisfy the specifications. In addition, a legal marking M is robust with respect to T_{C} if M° ⊆ T_{C}, where M° stands for the set of transitions enabled at M; otherwise M is dangerous (Figure 1). With this definition of robust and dangerous markings, a marking that satisfies M° ⊆ T_{C} but that has only dangerous markings as successors in R(G, M_{I}) is considered robust. Note that a finer partition of the legal markings into three classes (strongly robust, weakly robust, and dangerous) could be used for some problems. On the contrary, a forbidden marking is a marking from which no controllable trajectory exists to the reference. Examples of forbidden markings are deadlocks, markings that do not satisfy the system specifications, and markings that enable only uncontrollable transitions (Figure 1).
The previous definitions are extended to trajectories. A robust trajectory is a legal trajectory that visits only robust markings. On the contrary a dangerous trajectory is a legal trajectory that visits at least one dangerous marking.
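The classification into forbidden, dangerous, and robust markings can be illustrated with a breadth-first search restricted to controllable firings. The sketch below is a minimal illustration, assuming a bounded net (the search is capped by a hypothetical `max_nodes` parameter, so a negative answer is only conclusive when the cap is not reached). The three-place net and all function names are assumptions made for this example.

```python
from collections import deque

# Hypothetical net: T0 (P0 -> P1) and T1 (P1 -> P2) are controllable,
# T2 (P1 -> nothing) is uncontrollable and destroys the token.
W_PR = [[1, 0, 0],
        [0, 1, 1],
        [0, 0, 0]]
W_PO = [[0, 0, 0],
        [1, 0, 0],
        [0, 1, 0]]
T_C = {0, 1}
M_REF = (0, 0, 1)

def enabled(M, W_pr):
    """M°: indices of the transitions enabled at M."""
    q = len(W_pr[0])
    return {j for j in range(q)
            if all(M[k] >= W_pr[k][j] for k in range(len(M)))}

def is_legal(M, M_ref, W_pr, W_po, controllable,
             spec=lambda m: True, max_nodes=10_000):
    """True if a controllable trajectory satisfying spec leads to M_ref."""
    start, target = tuple(M), tuple(M_ref)
    if not spec(start):
        return False
    seen, queue = {start}, deque([start])
    while queue and len(seen) <= max_nodes:
        m = queue.popleft()
        if m == target:
            return True
        for j in enabled(m, W_pr) & controllable:
            nxt = tuple(m[k] + W_po[k][j] - W_pr[k][j] for k in range(len(m)))
            if nxt not in seen and spec(nxt):
                seen.add(nxt)
                queue.append(nxt)
    return False

def classify(M, M_ref, W_pr, W_po, controllable):
    """Return 'forbidden', 'robust' or 'dangerous' for marking M."""
    if not is_legal(M, M_ref, W_pr, W_po, controllable):
        return "forbidden"
    return "robust" if enabled(M, W_pr) <= controllable else "dangerous"
```

In this toy net, the marking with one token in P1 is dangerous because the uncontrollable T2 competes with T1, and the empty marking reached after T2 fires is a forbidden deadlock.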
2.3. Timed Petri Nets with Uncontrollable Transitions
Timed Petri nets are PNs whose behaviors are constrained by temporal specifications [7]. For this reason, timed PNs have been intensively used to describe DESs like production systems [6]. This paper concerns partially controlled timed PNs under an infinite-server semantics, where the firing of controllable transitions behaves according to an earliest firing preselection policy (transitions fire as early as possible in the order computed by the controller) and time specifications similar to the ones used for T-timed PNs [23]: if T_{j} ∈ T_{C}, the firing of T_{j} occurs at the earliest after a minimal delay d_{min j} from the date at which it was enabled (d_{min j} = 0 if no time specification exists for T_{j}). On the contrary, the firings of uncontrollable transitions are unpredictable: if T_{j} ∈ T_{NC}, the firings of T_{j} occur according to an unknown, arbitrary random process at any time from the date at which it was enabled. Consequently, partially controlled timed PNs (PContTPNs) are defined as <G, M_{I}, D_{min}>, where D_{min} = (d_{min j}) ∈ (R^{+})^{qC} and R^{+} is the set of nonnegative real numbers. If, in addition, the stochastic dynamics of the uncontrollable transitions are driven by exponential probability density functions (pdfs) of parameters μ = (μ_{j}) ∈ (R^{+})^{qNC}, with a race policy and a resampling memory [24], then partially controlled stochastic timed PNs (PContSPNs), defined as <G, M_{I}, D_{min}, μ>, are used instead of PContTPNs. The parameters d_{min j} are set in an arbitrary time unit (TU) and the parameters μ_{j} are set in TU^{−1}.
A timed firing sequence σ of length |σ| = h and of duration t_{h} is defined as σ = T(j_{1}, t_{1})T(j_{2}, t_{2})…T(j_{h}, t_{h}), where j_{1}, ..., j_{h} are the indices of the transitions, and t_{1}, ..., t_{h} represent the dates of the firings that satisfy 0 ≤ t_{1} ≤ t_{2} ≤ … ≤ t_{h}. The timed firing sequence σ fired at M leads to the timed trajectory (σ, M):
$$(\sigma, M)=M(0)\,[T(j_{1}, t_{1}) > M(1) \dots M(h-1)\,[T(j_{h}, t_{h}) > M(h) \tag{5}$$
with M(0) = M. Note that, under the earliest firing policy, an untimed trajectory of the form of Equation (2) that contains only controllable transitions can be transformed in a straightforward way into a timed trajectory of the form of Equation (5) of minimal duration [20,21] using Algorithm 1. This algorithm also returns DURATION(σ, M) = t_{h}.
Algorithm 1. Transformation of an untimed trajectory (σ, M) into a timed one (σ’, M).
(Inputs: σ, M, G, D_{min}; Outputs: σ’, τ)
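The body of Algorithm 1 is not reproduced here, so the following Python sketch is not the authors' algorithm. It only illustrates one plausible way to assign earliest firing dates, assuming that each token carries its arrival date, that a firing consumes the oldest tokens, that a transition is enabled when the last token it needs has arrived, and that firings respect the order of σ. The four-place net is hypothetical: its arc weights are inferred from the textual description of PContSPN1 in Example 1, since Figure 5 is not available.

```python
# Hypothetical net inspired by Example 1 (arc weights are assumptions):
# T1: P1 -> 2 P2, T2: P2 -> P1, T3: P1 -> 5 P3, T4: P3 -> P1,
# T5: P1 -> P4,   T6: P4 -> nothing. Columns are T1..T6 (indices 0..5).
W_PR = [[1, 0, 1, 0, 1, 0],
        [0, 1, 0, 0, 0, 0],
        [0, 0, 0, 1, 0, 0],
        [0, 0, 0, 0, 0, 1]]
W_PO = [[0, 1, 0, 1, 0, 0],
        [2, 0, 0, 0, 0, 0],
        [0, 0, 5, 0, 0, 0],
        [0, 0, 0, 0, 1, 0]]
D_MIN = [1, 1, 1, 1, 1, 5]
M_I = [1, 0, 0, 0]

def timed_trajectory(sigma, M, W_pr, W_po, d_min, t0=0.0):
    """Return (firing dates, duration) for the untimed sequence sigma."""
    # tokens[k] lists the arrival dates of the tokens in place P_k,
    # in nondecreasing order.
    tokens = [[t0] * M[k] for k in range(len(M))]
    dates, clock = [], t0
    for j in sigma:
        enabling_date = t0
        for k in range(len(M)):
            need = W_pr[k][j]
            if need > len(tokens[k]):
                raise ValueError(f"transition {j} is not enabled")
            if need:
                # The transition waits for the last token it consumes.
                enabling_date = max(enabling_date, tokens[k][need - 1])
                del tokens[k][:need]
        # Earliest date that respects d_min and the order of sigma.
        clock = max(enabling_date + d_min[j], clock)
        dates.append(clock)
        for k in range(len(M)):
            tokens[k].extend([clock] * W_po[k][j])
    return dates, clock

# sigma_1 = T3 then five firings of T4 (0-based indices).
SIGMA_1 = [2, 3, 3, 3, 3, 3]
# sigma_2 uses only T1 and T2.
SIGMA_2 = [0, 1, 1, 0, 0, 1, 1, 1, 1, 0, 1, 1]
```

Under these assumptions, `timed_trajectory(SIGMA_1, M_I, W_PR, W_PO, D_MIN)` gives a duration of 2 TUs and `SIGMA_2` a duration of 6 TUs, which agrees with the durations quoted for σ_{1} and σ_{2} in Examples 1 and 2.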

2.4. Belief and Probability of Trajectory Deviation
The objective of this section is to evaluate the risk that uncontrollable firings occur during the execution of the trajectory (σ, M_{I}) and make it deviate from the reference. For PContTPNs, this risk is evaluated with the belief RB(σ, M_{I}, T_{C}):
$$RB(\sigma, M_{I}, T_{C})=h_{NC}/h \tag{6}$$
where h_{NC} is the number of intermediate dangerous markings in (σ, M_{I}) and h is the number of markings visited by (σ, M_{I}). For PContSPNs, the belief RB(σ, M_{I}, T_{C}) is replaced by the probability RP(σ, M_{I}, T_{C}), which can be computed with Proposition 1.
Proposition 1.
Let <G, M_{I}, D_{min}, μ> be a PContSPN, under the earliest firing policy, with M_{I} a legal robust marking. Let M_{ref} be a reference marking and (σ, M_{I}) be a legal trajectory to M_{ref}. The probability RP(σ, M_{I}, T_{C}) that (σ, M_{I}) deviates from the reference is given by:
$$RP(\sigma, M_{I}, T_{C})=\sum_{1\le k_{1}\le h}\pi(k_{1})-\sum_{1\le k_{1}<k_{2}\le h}\pi(k_{1})\,\pi(k_{2})+\cdots+(-1)^{h-2}\sum_{1\le k_{1}<\dots<k_{h-1}\le h}\pi(k_{1})\dots\pi(k_{h-1})+(-1)^{h-1}\,\pi(1)\dots\pi(h) \tag{7}$$
with:
$$\pi(k)=\frac{\sum_{T_{j}\in T_{NC}\cap (M(k))^{\circ}}\mu_{j}}{\sum_{T_{j}\in T_{NC}\cap (M(k))^{\circ}}\mu_{j}+\left(d_{j_{k}}\right)^{-1}} \tag{8}$$
if d_{jk} ≠ 0, otherwise π(k) = 0, and d_{jk} = t_{k+1} − t_{k} is the remaining time to fire T(j_{k+1}, t_{k+1}) at date t_{k}.
Proof.
RP(σ, M_{I}, T_{C}) is the probability to fire uncontrollable transitions when dangerous markings belong to (σ, M_{I}).
Consider the trajectory of Figure 2. Under the earliest firing policy, the probability that the uncontrollable transition T_{NC1} or T_{NC2} fires before T(j_{k+1}, t_{k+1}) and that the trajectory deviates from M_{ref} at M(k) is given by:
$$\pi(k)=Prob\left(T_{NC1}\text{ or }T_{NC2}\text{ fires before }T(j_{k+1}, t_{k+1})\right)=\frac{\mu_{1}+\mu_{2}}{\mu_{1}+\mu_{2}+\left(d_{j_{k}}\right)^{-1}}$$
if d_{jk} ≠ 0, otherwise Prob(T_{NC1} or T_{NC2} fires before T(j_{k+1}, t_{k+1})) = 0. Note that if the controllable transition T(j_{k+1}, t_{k+1}) fires at the earliest after a duration d_{jk}, then the probability π(k) is computed by considering the approximation 1/d_{jk} of the mean firing rate of T(j_{k+1}, t_{k+1}). Note also that the durations of the other controllable transitions enabled at M(k) (for example, T_{C2} in Figure 2) are not considered because these transitions do not belong to (σ, M_{I}). Conversely, the probability that the trajectory continues from M(k) to M(k+1) is given by:
$$1-\pi(k)=Prob\left(T(j_{k+1}, t_{k+1})\text{ fires before }T_{NC1}\text{ and }T_{NC2}\right)=\frac{\left(d_{j_{k}}\right)^{-1}}{\mu_{1}+\mu_{2}+\left(d_{j_{k}}\right)^{-1}}$$
Thus, RP(σ, M_{I}, T_{C}) is finally given by:
$$RP(\sigma, M_{I}, T_{C})=\pi(0)+(1-\pi(0))\left(\pi(1)+(1-\pi(1))\left(\cdots+(1-\pi(h-1))\,\pi(h)\right)\right)$$
Since M_{I} is robust, π(0) = 0, and the full expansion of this expression is easily rewritten as in Equation (7).
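The risk measures above are simple to evaluate numerically. The Python sketch below is an illustration, not the authors' code: `risk_belief` implements Equation (6), `pi` implements Equation (8) for one marking, and `risk_probability` evaluates the nested expression of the proof from the last marking backward. The function names and the numerical values in the usage note are assumptions.

```python
from math import prod

def risk_belief(dangerous_flags):
    """RB = h_NC / h, with one boolean flag per visited marking."""
    h = len(dangerous_flags)
    return sum(dangerous_flags) / h if h else 0.0

def pi(mu_enabled, d):
    """pi(k) for one marking M(k).

    mu_enabled: rates of the uncontrollable transitions enabled at M(k).
    d: remaining time d_jk before the next controllable firing.
    """
    if d == 0 or not mu_enabled:
        return 0.0
    total = sum(mu_enabled)
    return total / (total + 1.0 / d)

def risk_probability(pis):
    """Evaluate pi(0) + (1 - pi(0)) (pi(1) + (1 - pi(1)) (... pi(h)))."""
    rp = 0.0
    for p in reversed(pis):
        rp = p + (1.0 - p) * rp
    return rp

def risk_probability_product(pis):
    """Equivalent closed form: 1 minus the probability of no deviation."""
    return 1.0 - prod(1.0 - p for p in pis)
```

For example, with one uncontrollable transition of rate 1 TU^{−1} enabled for 1 TU at two successive markings, `risk_probability([0.0, pi([1.0], 1.0), pi([1.0], 1.0)])` combines two values π = 0.5, and both evaluation routes give the same result.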
2.5. Model Predictive Control for PContTPNs
The determination of control sequences for untimed and timed PNs that contain only controllable transitions has been considered in our previous works [19,20] with a model predictive control (MPC) approach adapted for DESs. In this section, this approach is extended to PContTPNs (and consequently to PContSPNs). At each step, the future trajectory is predicted from the current state. A sequence of control actions is computed by minimizing a cost function, and only the first action of the sequence is applied. Then prediction starts again from the new state reached by the system [25,26]. The cost function J_{FC}(M, M_{ref}) = (D_{min})^{T}.X, based on the temporal specifications and on the evaluation X of the firing count vector that leads to the reference M_{ref} from the marking M, was introduced in our previous work [21] to estimate the time to the reference. In this section, this cost function is rewritten for PContTPNs. For this purpose, let us define G_{C} and W_{C} ∈ (Z)^{n×qC} as the restrictions of G and W to the set of controllable transitions T_{C}. The controllable firing count vector X_{C} that satisfies M_{ref} − M = W_{C}.X_{C} and minimizes J_{FC}(M, M_{ref}) = (D_{min})^{T}.X_{C} is obtained by solving an optimization problem with integer variables of reduced size q_{C} − r, where r is the rank of W_{C}. A regular matrix P_{L} ∈ (Z)^{n×n} and a regular permutation matrix P_{R} ∈ {0,1}^{qC×qC} exist such that:
$$W_{C}'=P_{L}.W_{C}.P_{R}=\left(\begin{array}{cc}W_{11} & W_{12}\\ W_{21} & W_{22}\end{array}\right) \tag{9}$$
with W_{11} ∈ (Z)^{r×r} a regular upper triangular matrix with integer entries, and W_{21} = 0_{(n−r)×r}, W_{22} = 0_{(n−r)×(qC−r)} zero matrices of appropriate dimensions. For each M ∈ R(G, M_{I}), solving Equation (10):
$$\min\left\{(D_{min})^{T}.X_{C} : X_{C}\in (N)^{q_{C}} \text{ such that } W_{C}.X_{C}=(M_{ref}-M)\right\} \tag{10}$$
is equivalent to solving Equation (11), which reduces the number of variables by r:
$$\min\left\{F_{2}.X_{C2} : X_{C2}\in (N)^{q_{C}-r} \text{ such that } (W_{11})^{-1}.W_{12}.X_{C2}\le (W_{11})^{-1}.\Delta M_{1}\right\} \tag{11}$$
with F_{2} = (D_{min})^{T}.(P_{R2} − P_{R1}.(W_{11})^{−1}.W_{12}), P_{R} = (P_{R1} | P_{R2}), P_{L} = ((P_{L1})^{T} | (P_{L2})^{T})^{T} and ∆M_{1} = P_{L1}.(M_{ref} − M). This reformulation results from the rewriting (∆M_{1}^{T} ∆M_{2}^{T})^{T} = P_{L}.(M_{ref} − M) and (X_{C1}^{T} X_{C2}^{T})^{T} = (P_{R})^{−1}.X_{C} with X_{C1} = (W_{11})^{−1}.∆M_{1} − (W_{11})^{−1}.W_{12}.X_{C2}. The linear optimization problem (Equation (11)) has a solution with integer values as long as M_{ref} ∈ R(G_{C}, M), and the cost function J_{FC}(M, M_{ref}) based on the firing count vector X_{C2} and on D_{min} is defined by Equation (12):
$$J_{FC}(M, M_{ref})=(D_{min})^{T}.\left(P_{R1}.(W_{11})^{-1}.\Delta M_{1}+P_{R2}.X_{C2}-P_{R1}.(W_{11})^{-1}.W_{12}.X_{C2}\right) \tag{12}$$
As long as X_{C2} corresponds to a feasible and legal firing sequence σ to the reference (i.e., X_{C}_{2} does not encode a spurious solution for Equation (11)), J_{FC}(M, M_{ref}) provides an upper bound of the duration of σ as proved with Proposition 2.
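To make the role of J_{FC} concrete, the sketch below solves the integer problem of Equation (10) on a small example by exhaustive enumeration. This brute-force search is a deliberately simple substitute for the triangular reduction of Equations (9) and (11), and it only scales to toy nets. The incidence matrix `W_C` is hypothetical (arc weights inferred from the description of PContSPN1 in Example 1), and `bound` is an assumed upper limit on each firing count.

```python
from itertools import product

# Hypothetical controllable incidence matrix W_C (rows P1..P4, columns
# T1..T6), with arc weights assumed from the description of Example 1.
W_C = [[-1,  1, -1,  1, -1,  0],
       [ 2, -1,  0,  0,  0,  0],
       [ 0,  0,  5, -1,  0,  0],
       [ 0,  0,  0,  0,  1, -1]]
D_MIN = [1, 1, 1, 1, 1, 5]
M_I = (1, 0, 0, 0)
M_REF = (5, 0, 0, 0)

def j_fc(M, M_ref, W_c, d_min, bound):
    """Minimize d_min . X over integer X >= 0 with W_c . X = M_ref - M.

    Every firing count is searched in 0..bound, so the result is only
    valid if an optimal solution lies within that box.
    """
    n, q = len(W_c), len(W_c[0])
    delta = [M_ref[k] - M[k] for k in range(n)]
    best_cost, best_x = None, None
    for x in product(range(bound + 1), repeat=q):
        if all(sum(W_c[k][j] * x[j] for j in range(q)) == delta[k]
               for k in range(n)):
            cost = sum(d_min[j] * x[j] for j in range(q))
            if best_cost is None or cost < best_cost:
                best_cost, best_x = cost, x
    return best_cost, best_x
```

On this hypothetical net, `j_fc(M_I, M_REF, W_C, D_MIN, bound=5)` selects one firing of T_{3} and five firings of T_{4}, with a cost of 6 TUs. This is consistent with J_{FC} being an upper bound of the 2 TU duration of σ_{1} reported in Example 1.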
Proposition 2.
Let us consider a PContTPN of parameter D_{min} (resp. a PContSPN of parameters D_{min} and μ), under the earliest firing policy. Let M_{ref} be a reference marking and (σ, M_{I}) a legal trajectory to M_{ref} with σ ∈ (T_{C})* and minimal duration DURATION(σ, M_{I}). Let X_{C}(σ) ∈ (N)^{qC} be the firing count vector of σ. Then:
$$DURATION(\sigma, M_{I})\le (D_{min})^{T}.X_{C}(\sigma) \tag{13}$$
Proof.
(σ, M_{I}) is written as in Equation (5). T(j_{1}, t_{1}) is enabled at date 0 and fires at date t_{1} = d_{min j1} to result in marking M(1). T(j_{2}, t_{2}) is enabled at date 0 or t_{1} and fires not later than t_{1} + d_{min j2}. Thus t_{2} ≤ d_{min j1} + d_{min j2}. The same reasoning is repeated h times. T(j_{h}, t_{h}) is enabled at the latest at date t_{h−1} and fires not later than t_{h−1} + d_{min jh}. Thus t_{h} ≤ d_{min j1} + … + d_{min jh}. The minimal duration of (σ, M_{I}) is t_{h}; thus, Equation (13) holds.
The basic idea is to use J_{FC}(M, M_{ref}) to iteratively drive the search for the controllable firing sequence of minimal duration that leads to the reference. At each step (i.e., for each intermediate marking), a part of the controllable reachability graph is explored and a prediction of the remaining duration to the reference is obtained with the cost function J_{FC}(M, M_{ref}) computed for each marking M of the explored graph. Then the first control action is applied (i.e., the next controllable transition fires). If an uncontrollable firing occurs, the trajectory deviates from the predicted one and the system enters an unexpected state. However, the deviation is immediately taken into account by the controller, which updates the control sequence at the next step. For this reason the proposed strategy leads to a dynamical and robust scheduling. Two algorithms already developed in our previous works [21,22] are used for that purpose.
Algorithm 2, similar to the one developed in [21,22], encodes as a tree Tree(M, H) a small part of the reachability graph rooted at M (Figure 3). The tree is limited in depth with parameter H and in duration with parameter H_{τ}.
Each node S = {m(S), σ(S), s(S), l(S), e(S)} ∈ Tree(M, H) is tagged with a marking m(S), the firing sequence σ(S) such that M [σ(S) > m(S), and the sequence of nodes s(S) in the tree from M to m(S). In addition, the flags l(S) and e(S) are introduced such that l(S) = 0 if S is forbidden, otherwise l(S) = 1, and e(S) = 1 if S is a terminal node of the tree, otherwise e(S) = 0. At each intermediate marking, Algorithm 2 returns the next transition T* to fire.
Algorithm 2. Computation of T* for PContTPNs. 
(Inputs: M, M_{ref}, G_{C}, SPEC, F, H, H_{τ} ; Outputs: F, converge, exhaustive, T*)

The complete control sequence σ* is obtained with Algorithm 3, similar to the one developed in [21,22], which adapts the parameter H in the range [1 : $\overline{H}$], where $\overline{H}$ is an input parameter (Figure 4) that limits the maximal depth of the search in steps. This algorithm starts at the initial marking M_{I}, with no forbidden marking (i.e., F = Ø) and with minimal depth (i.e., H = 1). As long as convergence is ensured, T* is added to σ* and the current marking M is updated. Finally, Algorithm 3 also evaluates the risk RP of the computed trajectory.
Algorithm 3. Control sequence computation for PContTPNs. 
(Inputs: M_{I}, M_{ref}, G, T_{C}, T_{NC}, SPEC, D_{min}, μ, $\overline{H}$,H_{τ} ; Outputs: σ*, success, RP)

Note that the complexity of Algorithm 3 is at most O(h.$q_{C}^{\overline{H}}$), where h = |σ*|.
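The bodies of Algorithms 2 and 3 are not reproduced above, so the following Python sketch should be read as a simplified illustration of the receding-horizon idea only, not as the authors' algorithms. It assumes a fixed depth H, ignores the duration limit H_{τ}, the forbidden set F, the flags converge and exhaustive, and the adaptation of H. The `estimate` argument is a hypothetical stand-in for J_{FC}, and the three-place chain net is invented for the example.

```python
def is_enabled(m, j, W_pr):
    return all(m[k] >= W_pr[k][j] for k in range(len(m)))

def fire(m, j, W_pr, W_po):
    return tuple(m[k] + W_po[k][j] - W_pr[k][j] for k in range(len(m)))

def next_transition(M, M_ref, W_pr, W_po, controllable, d_min, H, estimate):
    """Explore a tree of depth H and return the first transition of the
    branch with the smallest (accumulated delay + estimated remainder)."""
    target = tuple(M_ref)
    best_score, best_first = float("inf"), None

    def explore(m, depth, cost, first):
        nonlocal best_score, best_first
        if first is not None:
            score = cost + (0 if m == target else estimate(m))
            if score < best_score:
                best_score, best_first = score, first
        if depth == H or m == target:
            return
        for j in sorted(controllable):
            if is_enabled(m, j, W_pr):
                explore(fire(m, j, W_pr, W_po), depth + 1,
                        cost + d_min[j], j if first is None else first)

    explore(tuple(M), 0, 0, None)
    return best_first

def control_sequence(M_I, M_ref, W_pr, W_po, controllable, d_min, H,
                     estimate, max_steps=100):
    """Apply the first action of each prediction until M_ref is reached."""
    m, target, sigma = tuple(M_I), tuple(M_ref), []
    while m != target and len(sigma) < max_steps:
        j = next_transition(m, target, W_pr, W_po, controllable,
                            d_min, H, estimate)
        if j is None:  # no controllable transition is enabled
            return sigma, False
        sigma.append(j)
        m = fire(m, j, W_pr, W_po)
    return sigma, m == target

# Hypothetical chain net: T0 moves a token P0 -> P1, T1 moves it P1 -> P2.
W_PR = [[1, 0], [0, 1], [0, 0]]
W_PO = [[0, 0], [1, 0], [0, 1]]
D_MIN = [1, 1]

def chain_estimate(m):
    # Remaining firings: two per token in P0, one per token in P1.
    return 2 * m[0] + m[1]
```

A call such as `control_sequence((2, 0, 0), (0, 0, 2), W_PR, W_PO, {0, 1}, D_MIN, 2, chain_estimate)` recomputes the prediction after every firing, which is the property that lets the controller absorb an unexpected change of marking between two steps.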
Example 1.
PContSPN1 is considered with T_{C} = {T_{1}, T_{2}, T_{3}, T_{4}, T_{5}, T_{6}}, T_{NC} = {T_{7}}, D_{min} = (1, 1, 1, 1, 1, 5)^{T} and μ = μ_{7} = 1 (Figure 5). The control objective is to reach M_{ref} = (5 0 0 0)^{T} from M_{I} = (1 0 0 0)^{T}, and no additional marking constraint is considered. The cycles {P_{1}, T_{1}, P_{2}, T_{2}} and {P_{1}, T_{3}, P_{3}, T_{4}} are both token producers due to the weighted arcs: the execution of {P_{1}, T_{3}, P_{3}, T_{4}} multiplies each token by 5, whereas {P_{1}, T_{1}, P_{2}, T_{2}} multiplies it by 2 only. Thus, sequences with cycle {P_{1}, T_{3}, P_{3}, T_{4}} will reach the reference more rapidly. However, the uncontrollable transition T_{7} may fire during the execution of this cycle, which leads to an excessive production of tokens. The cycle {P_{1}, T_{5}, P_{4}, T_{6}}, which is a token consumer, is then used to correct the excessive number of tokens. Note that the execution of this last cycle is slow compared to the two other ones due (a) to the firing duration of T_{6}, which is five times larger than the duration of the other transitions; and (b) to the presence of the self-loop {T_{5}, P_{8}}, which limits the number of simultaneous firings of T_{5} to one (whereas the other transitions may fire several times simultaneously according to the infinite-server semantics).
The optimal timed sequence to reach M_{ref} is given by σ_{1} = T(3, 1)(T(4, 2))^{5} with duration DURATION(σ_{1}, M_{I}) = 2 time units (TUs). If no unexpected firing of T_{7} occurs, Algorithm 3 applied with T_{C} leads to σ_{1}. However, if unexpected firings of T_{7} occur, the trajectory is disturbed and requires more time to reach the reference. Figure 6 is an example of a trajectory including one firing of T_{7} at date 1.6 TUs. The rest of the control sequence is updated in order to compensate for the deviation so that the marking finally reaches M_{ref} in 48.6 TUs instead of 2 TUs.
Figure 6 illustrates the systematic updating of the optimization process at each step (i.e., for each new firing). Consequently the firing of an uncontrollable transition at a given step k changes the future predictions, and the control actions computed at steps k + 1, k + 2, ... compensate the deviation as long as a controllable trajectory exists from the current marking to M_{ref}.
2.6. Robust Scheduling
In order to compute robust trajectories that cannot deviate from the reference, the controller should avoid dangerous intermediate markings and consider only legal trajectories with robust markings (i.e., with zero risk belief or probability). The difficulty in this computation is that the intermediate markings are computed step by step and these markings are known in advance only within a small time window provided by the part of the reachability graph, of depth H, explored at each step. During the prediction phase of MPC, only the remaining firing count vector to the reference is determined, and this vector does not provide the risk belief or risk probability of the future trajectory. Proposition 3 provides a sufficient condition to ensure that the computed trajectory visits only robust markings. For this purpose, let us define T_{RC} = {T_{j} ∈ T_{C} such that (T_{j}°)° ⊆ T_{C}}, where (T_{j}°)° = ∪ {P_{i}° : P_{i} ∈ T_{j}°}.
Proposition 3.
Let us consider a PContTPN (or PContSPN). Let (σ, M_{I}) be a trajectory such that (M_{I})° ⊆ T_{C}. If σ ∈ (T_{RC})*, then (σ, M_{I}) is a robust legal trajectory.
Proof.
Note first that (M_{I})° ⊆ T_{C} implies that the net has no uncontrollable source transition (i.e., °T_{j} ≠ Ø for all T_{j} ∈ T_{NC}). Then, (σ, M_{I}) is written as in Equation (5): (σ, M_{I}) = M_{I} [T(j_{1}, t_{1}) > M(1) … > M(h). Assume that there exists T_{j} ∈ (M(1))° such that T_{j} ∈ T_{NC}. T_{j} is necessarily enabled by the firing of T(j_{1}, t_{1}) because T_{j} is not enabled at M_{I}. As T_{j} is not a source transition, there exists a place P_{i} ∈ °T_{j} whose marking increases by firing T(j_{1}, t_{1}), and consequently P_{i} ∈ (T(j_{1}, t_{1}))°. As T_{j} ∈ P_{i}°, T_{j} ∈ ((T(j_{1}, t_{1}))°)°. Since T(j_{1}, t_{1}) ∈ T_{RC}, this gives T_{j} ∈ T_{C}, which contradicts the assumption; hence (M(1))° ⊆ T_{C}. Repeating the same reasoning successively up to M(h), one can conclude that (M(k))° ⊆ T_{C}, k = 0, …, h, and that (σ, M_{I}) is a robust legal trajectory.
Note that robust legal trajectories are computed with Algorithms 2 and 3 by replacing W_{C} ∈ (Z) ^{n}^{×qC} with W_{RC} ∈ (Z) ^{n}^{×qRC} (i.e., the restriction of W to the set of robust controllable transitions T_{RC}) in the determination of J_{FC}(M, M_{ref}).
Note also that the set T_{RC} is easy to obtain by checking for each transition T_{j} whether the condition (X_{j})^{T}.(W_{PO})^{T}.W_{PR}.(0 | I_{qNC})^{T} = 0 is satisfied or not, with X_{j} the firing count vector of T_{j} and I_{qNC} the identity matrix of size q_{NC}:
$$T_{RC}=\left\{T_{j}\in T_{C} : (X_{j})^{T}.(W_{PO})^{T}.W_{PR}.\left(0 \mid I_{q_{NC}}\right)^{T}=0\right\} \tag{14}$$
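The structural test that defines T_{RC} can be sketched with sets instead of matrix products: a controllable transition is kept if none of the output transitions of its output places is uncontrollable. The net below is hypothetical. Its arc weights are assumed from the description of PContSPN1 in Example 1, and the arcs of the uncontrollable transition T_{7} (taken here as P_{3} → 2 P_{1}) are a guess made only for this illustration.

```python
# Hypothetical net, columns T1..T7 (indices 0..6), rows P1..P4.
# The last column (uncontrollable T7: P3 -> 2 P1) is an assumption.
W_PR = [[1, 0, 1, 0, 1, 0, 0],
        [0, 1, 0, 0, 0, 0, 0],
        [0, 0, 0, 1, 0, 0, 1],
        [0, 0, 0, 0, 0, 1, 0]]
W_PO = [[0, 1, 0, 1, 0, 0, 2],
        [2, 0, 0, 0, 0, 0, 0],
        [0, 0, 5, 0, 0, 0, 0],
        [0, 0, 0, 0, 1, 0, 0]]
T_C = {0, 1, 2, 3, 4, 5}

def robust_controllable(W_pr, W_po, controllable):
    """Return the controllable transitions T_j with (T_j°)° inside T_C."""
    n, q = len(W_pr), len(W_pr[0])
    robust = set()
    for j in controllable:
        # Output places of T_j, then output transitions of those places.
        post_places = [k for k in range(n) if W_po[k][j] > 0]
        successors = {i for k in post_places
                      for i in range(q) if W_pr[k][i] > 0}
        if successors <= controllable:
            robust.add(j)
    return robust
```

With these assumed arcs, only T_{3} is discarded, because its output place P_{3} feeds the uncontrollable T_{7}; the remaining set corresponds to the T_{RC} = {T_{1}, T_{2}, T_{4}, T_{5}, T_{6}} used in Example 2.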
Example 2.
Let us consider again PContSPN1 of Figure 5. In order to avoid any deviation, T_{RC} = {T_{1}, T_{2}, T_{4}, T_{5}, T_{6}} is considered instead of T_{C}. Algorithm 3 applied with W_{RC} ∈ (Z)^{n×qRC} leads to σ_{2} = T(1, 1)(T(2, 2))^{2}(T(1, 3))^{2}(T(2, 4))^{4}T(1, 5)(T(2, 6))^{2}, which has a duration DURATION(σ_{2}, M_{I}) = 6 TUs, larger than DURATION(σ_{1}, M_{I}).
The decision to prefer the control sequence σ_{2} over σ_{1} depends on the risk of both control strategies. Table 1 reports the values of RB and RP for both sequences σ_{1} and σ_{2} with respect to several values of μ. From Table 1, one can notice that the sequence σ_{2}, which is not optimal in time, has the advantage of being robust compared to σ_{1}: it cannot be perturbed by any unexpected firing. Note also that the risk probability of σ_{1} depends strongly on the dynamics of the random firing of the uncontrollable transition T_{7}. Note finally that computing RP instead of RB provides a better evaluation of that risk.
Table 2 reports the mean duration d of control sequences depending on μ_{7} for three scenarios. All sequences are computed with M_{I} = (1 0 0 0)^{T} and M_{ref} = (5 0 0 0)^{T} and parameters $\overline{H}$ = 1, H_{τ} = 1. In scenario 1, all transitions are assumed to be controllable. In scenarios 2 and 3, T_{C} = {T_{1}, T_{2}, T_{3}, T_{4}, T_{5}, T_{6}}. Algorithm 3 is applied with T_{C} in scenario 2, whereas it is applied with T_{RC} = {T_{1}, T_{2}, T_{4}, T_{5}, T_{6}} in scenario 3. Simulations with scenario 2 are repeated 10 times to obtain a significant average duration. One can notice the advantage of computing a robust suboptimal trajectory with scenario 3, which provides better results from μ_{7} = 0.5 onward. When μ_{7} increases, the mean duration of T_{7} firings decreases and the probability of firing T_{7} before T_{4} increases; consequently, the number of perturbations increases and the mean duration of the global trajectory also increases due to the execution of the cycle {P_{1}, T_{5}, P_{4}, T_{6}}.
3. Results
PContSPN2 (Figure 7) is the timed model of a production system that processes a single type of product according to two possible jobs [27,28]. The first job is composed of the transitions T_{1} to T_{8}, and the second one of the transitions T_{9} to T_{14}. In the first job, the transitions T_{1}, T_{3}, T_{4}, T_{6}, T_{7}, T_{8} represent the operations in successive machines and the places P_{1}, P_{2}, P_{4}, P_{6}, P_{7}, T_{8} are intermediate buffers where products are temporarily stored. The initial marking of place P_{1} represents the maximal number of products that can be simultaneously processed by Job 1. In the second job, the transitions T_{9}, T_{10}, T_{11}, T_{12}, T_{13}, T_{14} represent the operations in successive machines and the places P_{8}, P_{9}, P_{10}, P_{11}, P_{12}, T_{13} are intermediate buffers. The initial marking of place P_{8} represents the maximal number of products that can be simultaneously processed by Job 2. Job 1 could be altered by a server failure whereas Job 2 could not. The occurrence of this failure is represented by the firing of the subsequence T_{2}T_{5} instead of T_{3}T_{4}. Note that the faults under consideration do not block the system, but they delay the cycle time. Consequently, the nominal sequence T_{1} T_{3} T_{4} T_{6} T_{7} T_{8} may be altered when an unexpected firing of T_{2} occurs, which leads to the perturbed behavior T_{1} T_{2} T_{5} T_{6} T_{7} T_{8} with an excessive global duration. The six resources P_{14} to P_{19} have limited capacities: m(P_{14}) = m(P_{15}) = m(P_{16}) = m(P_{17}) = m(P_{18}) = m(P_{19}) = 1. The places P_{20} and P_{21} represent the input and output buffers, respectively, that contain the number of products to be processed either by Job 1 or Job 2. The temporal specifications are given by D_{min} = (1 1 2 20 1 1 1 3 3 3 3 3 3)^{T} for T_{C} = T \ {T_{2}} and by μ_{2} = 1.
Control sequences are computed with M_{I} = 3P_{1} + 3P_{8} + 1P_{14} + 1P_{15} + 1P_{16} + 1P_{17} + 1P_{18} + 1P_{19} + kP_{20} and M_{ref} = 3P_{1} + 3P_{8} + 1P_{14} + 1P_{15} + 1P_{16} + 1P_{17} + 1P_{18} + 1P_{19} + kP_{21} where k is a varying parameter. The results are reported in Table 3 for $\overline{H}$ = 5 and H_{τ} = 20.
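The markings M_{I} and M_{ref} above differ only in the input and output buffers. The following minimal Python sketch builds both vectors for a given k. It assumes that the net has exactly 21 places, P_{1} to P_{21} (the highest index cited in the text), and the helper names `marking` and `initial_and_reference` are hypothetical.

```python
N_PLACES = 21  # assumption: places are P1..P21, the highest index cited in the text


def marking(tokens):
    """Build a marking vector from a {place index: token count} mapping."""
    m = [0] * N_PLACES
    for place, count in tokens.items():
        m[place - 1] = count  # place indices are 1-based in the text
    return m


def initial_and_reference(k):
    """Return (M_I, M_ref) for k products waiting in the input buffer."""
    # Tokens shared by both markings: 3 in P1, 3 in P8, 1 in each resource P14..P19.
    base = {1: 3, 8: 3, **{p: 1 for p in range(14, 20)}}
    m_i = marking({**base, 20: k})    # k products in the input buffer P20
    m_ref = marking({**base, 21: k})  # k products in the output buffer P21
    return m_i, m_ref


m_i, m_ref = initial_and_reference(5)
```

With this encoding, driving the net from M_{I} to M_{ref} amounts to moving the k tokens of P_{20} to P_{21} while restoring every other place to its initial content.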
Again, three scenarios are considered. In scenario 1, all transitions, including T_{2}, are assumed to be controllable with d_{min 2} = 1. In scenario 2, T_{C} = T/{T_{2}} and Algorithm 3 is applied with T_{C}. In scenario 3, Algorithm 3 is applied with T_{RC} = T/{T_{1}, T_{2}}. Note first that, due to the numerical values of the firing parameters, the cost function prefers Job 1, which has a global duration of 7 TUs to process one product, over Job 2, which has a global duration of 18 TUs (without considering the constraints due to the limited resources). Thus scenario 1 corresponds to the iterated execution of Job 1. For scenario 2, μ_{2} = 1 and d_{min 3} = 1: consequently, the probability that an unexpected firing of T_{2} occurs is 0.5. When such a firing occurs, the long firing duration d_{min 5} = 20 of T_{5}, compared to d_{min 4} = 2, increases the global duration required to process the product. This explains why scenario 2 leads to longer sequences than scenario 1. Scenario 3 is also tested in a stochastic context with the same parameter values μ_{2} = 1 and d_{min 3} = 1. However, restricting the control actions to the set T_{RC} systematically selects Job 2, which is robust to the perturbations. Note also that the global duration for k = 15 and k = 20 is better with scenario 3 than with scenario 1. This is due to the partial exploration of the reachability graph and to the approximation of the remaining sequence duration with the cost function J_{FC}, which provide solutions with no guarantee of optimality.
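The job durations quoted above can be checked directly from D_{min}. The short Python sketch below assumes that the 13 entries of D_{min} follow the index order of the controllable transitions T_{1}, T_{3}, T_{4}, ..., T_{14} (T_{2} excluded), which is consistent with the values d_{min 3} = 1, d_{min 4} = 2 and d_{min 5} = 20 given in the text. For the perturbed sequence it uses the scenario 1 value d_{min 2} = 1 for T_{2}. The function name `sequence_duration` is hypothetical, and resource constraints are ignored.

```python
# Minimal firing durations from D_min, assumed to be listed in the index
# order of the controllable transitions T1, T3, T4, ..., T14 (T2 excluded).
D_MIN = [1, 1, 2, 20, 1, 1, 1, 3, 3, 3, 3, 3, 3]
CONTROLLABLE = [1] + list(range(3, 15))
d_min = dict(zip(CONTROLLABLE, D_MIN))


def sequence_duration(sequence, durations):
    """Sum the firing durations of a purely sequential firing sequence."""
    return sum(durations[t] for t in sequence)


JOB1_NOMINAL = [1, 3, 4, 6, 7, 8]
JOB1_PERTURBED = [1, 2, 5, 6, 7, 8]
JOB2 = [9, 10, 11, 12, 13, 14]

job1 = sequence_duration(JOB1_NOMINAL, d_min)
job2 = sequence_duration(JOB2, d_min)
# T2 is given the scenario 1 duration d_min 2 = 1 for this comparison.
perturbed = sequence_duration(JOB1_PERTURBED, {**d_min, 2: 1})

print(job1, job2, perturbed)
```

Under these assumptions the nominal Job 1 takes 7 TUs, Job 2 takes 18 TUs, and the perturbed Job 1 takes 25 TUs, so a single unexpected firing of T_{2} makes Job 1 slower than Job 2.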
4. Discussion
As mentioned in the previous section, the solutions returned by Algorithm 3 are not always optimal. The performance of the algorithm depends on its two input parameters: $\overline{H}$, which limits the exploration in depth, and H_{τ}, which limits the search in duration. If the depth H is too small, Algorithm 2 returns the flag converge = −1 or exhaustive = 0 and Algorithm 3 increases H in the range [1:$\overline{H}$]. On the contrary, if H is too large, the iterative use of Algorithm 2 certainly reaches M_{ref}, but the computational effort is needlessly high. In that case, Algorithm 3 decreases H in the range [1:$\overline{H}$]. Consequently, the aim of Algorithm 3 is to adapt the depth of the search at each step to maintain converge = 0 and exhaustive = 1 or converge = 1. Table 4 reports the performance as a function of the parameters $\overline{H}$ and H_{τ} for PcontSPN2 with M_{I} = 3P_{1} + 3P_{8} + 1P_{14} + 1P_{15} + 1P_{16} + 1P_{17} + 1P_{18} + 1P_{19} + 5P_{20}, M_{ref} = 3P_{1} + 3P_{8} + 1P_{14} + 1P_{15} + 1P_{16} + 1P_{17} + 1P_{18} + 1P_{19} + 5P_{21}, and T_{C} = T. The duration of the control sequences and the computational time required to compute the sequences with Algorithm 3 are reported for an Intel Core i7-4600U CPU at 2.1–2.7 GHz.
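The depth adaptation described above can be illustrated by a small Python sketch. It is not Algorithm 3 itself: the function name `adapt_depth`, the unit step, and the choice to decrease H whenever the flags do not signal a too-shallow search are assumptions made for illustration only. The only elements taken from the text are the flags converge and exhaustive and the range [1:$\overline{H}$].

```python
def adapt_depth(h, h_max, converge, exhaustive):
    """Return the next search depth, clamped to the range [1, h_max].

    Assumed rule (hypothetical, unit step):
    - converge == -1 or exhaustive == 0: the search was too shallow,
      so the depth is increased;
    - otherwise: the search succeeded, so a cheaper depth is tried.
    """
    if converge == -1 or exhaustive == 0:
        return min(h + 1, h_max)
    return max(h - 1, 1)


# Example with an upper bound of 5 on the depth.
print(adapt_depth(2, 5, converge=-1, exhaustive=1))  # too shallow: depth grows
print(adapt_depth(5, 5, converge=0, exhaustive=0))   # already at the upper bound
print(adapt_depth(4, 5, converge=0, exhaustive=1))   # successful step: depth shrinks
```

Clamping keeps the depth inside [1:$\overline{H}$], so $\overline{H}$ directly bounds the computational effort spent at each step, which matches the growth of the computation times with $\overline{H}$ in Table 4.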
Note that, instead of using Algorithm 3, optimal solutions can be searched for systematically by considering the extended timed reachability graph [29,30,31]. Such a graph contains not only the different markings but also the different timed sequences (a given marking can be reached by several sequences with different durations). Table 5 illustrates the rapid increase in the complexity of building such a graph for the initial marking M_{I} = 3P_{1} + 3P_{8} + 1P_{14} + 1P_{15} + 1P_{16} + 1P_{17} + 1P_{18} + 1P_{19} + kP_{20} when k increases. For each value of k, the number of nodes and the computational time required to compute the graph are reported for the usual reachability graph and for the timed reachability graph. Table 5 shows that such a method is not suitable for large systems. This motivates the proposed approach.
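The gap between the two graphs can be illustrated on a toy example. The Python sketch below uses a hypothetical three-place net (not PcontSPN2) in which two alternative transitions with different durations lead to the same marking. It also assumes strictly sequential firings, so that the elapsed time is the sum of the fired durations, which is a simplification of the earliest firing policy used in the paper. Nodes of the usual graph are markings, whereas nodes of the timed graph are (marking, elapsed time) pairs.

```python
from collections import deque

# Hypothetical toy net: two alternative transitions move a token from p0 to
# p1 with different durations, then one transition moves it from p1 to p2.
# Each transition: (input places, output places, duration in TUs).
TRANSITIONS = {
    "fast": ((0,), (1,), 1),
    "slow": ((0,), (1,), 3),
    "end": ((1,), (2,), 2),
}


def successors(marking):
    """Yield (next_marking, duration) for every enabled transition."""
    for pre, post, duration in TRANSITIONS.values():
        if all(marking[p] >= 1 for p in pre):
            nxt = list(marking)
            for p in pre:
                nxt[p] -= 1
            for p in post:
                nxt[p] += 1
            yield tuple(nxt), duration


def count_nodes(initial, timed):
    """Count nodes found by breadth-first exploration from `initial`.

    Nodes are markings, or (marking, elapsed time) pairs when `timed` is
    True (firings are assumed to be strictly sequential).
    """
    start = (initial, 0)
    seen = {start}
    queue = deque([start])
    while queue:
        marking, elapsed = queue.popleft()
        for nxt, duration in successors(marking):
            node = (nxt, elapsed + duration if timed else 0)
            if node not in seen:
                seen.add(node)
                queue.append(node)
    return len(seen)


for k in (1, 2, 3, 4):
    untimed = count_nodes((k, 0, 0), timed=False)
    timed = count_nodes((k, 0, 0), timed=True)
    print(f"k={k}: {untimed} markings, {timed} timed nodes")
```

Even on this small net, the timed graph has more nodes than the usual one because the same marking appears with several elapsed times, which is the effect reported in Table 5 at a much larger scale.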
5. Conclusions
A method has been proposed to compute control sequences for discrete event systems in uncertain environments. The method uses timed PNs under an earliest firing policy, with controllable and uncontrollable transitions, as a modeling formalism that is easy to adapt to various problems. The obtained solutions are minimal or near-minimal in duration. Moreover, for each returned solution, the risk of firing uncontrollable transitions is evaluated. Another advantage of the proposed approach is that it limits the computational complexity of the algorithm by limiting the part of the reachability graph that is expanded, even if the initial marking and the reference marking are far from each other and if deadlocks and dead branches are a priori unknown to the controller. Thanks to the risk evaluation, robust scheduling becomes computable under some additional assumptions.
In future work, the research effort will first concern the definition of the cost function, which will be improved to provide a more accurate approximation of the remaining time to the reference. The sensitivity of the performance with respect to H will also be studied. We will also include the risk evaluation in the cost function to obtain trajectories with a low risk level.
Acknowledgments
The Project MRT MADNESS 2016–2019 has been funded with support from the European Union through the European Regional Development Fund (ERDF) and from the Regional Council of Normandie.
Conflicts of Interest
The authors declare no conflict of interest.
References
1. Garey, M.R.; Johnson, D.S.; Sethi, R. The complexity of flowshop and jobshop scheduling. Math. Oper. Res. 1976, 1, 117–129.
2. Johnson, S.M. Optimal two- and three-stage production schedules with setup times included. Nav. Res. Logist. Q. 1954, 1, 61–68.
3. Baker, K.R.; Trietsch, D. Principles of Sequencing and Scheduling; John Wiley & Sons: Hoboken, NJ, USA, 2009.
4. Lopez, P.; Roubellat, F. Production Scheduling; ISTE: Arlington, VA, USA, 2008.
5. Leung, J.Y. Handbook of Scheduling: Algorithms, Models, and Performance Analysis; Chapman & Hall/CRC Computer & Information Science Series: New Delhi, India, 2004; ISBN 9781584883975.
6. Cassandras, C. Discrete Event Systems: Modeling and Performances Analysis; Aksen Ass. Inc. Pub.: Homewood, IL, USA, 1993.
7. David, R.; Alla, H. Petri Nets and Grafcet: Tools for Modelling Discrete Events Systems; Prentice Hall: London, UK, 1992.
8. Chretienne, P. Timed Petri nets: A solution to the minimum-time-reachability problem between two states of a timed-event-graph. J. Syst. Softw. 1986, 6, 95–101.
9. Lee, D.Y.; DiCesare, F. Scheduling flexible manufacturing systems using Petri nets and heuristic search. IEEE Trans. Robot. Autom. 1994, 10, 123–133.
10. Sun, T.H.; Cheng, C.W.; Fu, L.C. Petri net based approach to modeling and scheduling for an FMS and a case study. IEEE Trans. Ind. Electron. 1994, 41, 593–601.
11. Reyes-Moro, A.; Kelleher, H.H.G. Hybrid Heuristic Search for the Scheduling of Flexible Manufacturing Systems Using Petri Nets. IEEE Trans. Robot. Autom. 2002, 18, 240–245.
12. Xiong, H.H.; Zhou, M.C. Scheduling of semiconductor test facility via Petri nets and hybrid heuristic search. IEEE Trans. Semicond. Manuf. 1998, 11, 384–393.
13. Jeng, M.D.; Chen, S.C. Heuristic search approach using approximate solutions to Petri net state equations for scheduling flexible manufacturing systems. Int. J. FMS 1998, 10, 139–162.
14. Wang, Q.; Wang, Z. Hybrid Heuristic Search Based on Petri Net for FMS Scheduling. Energy Proced. 2012, 17, 506–512.
15. Zhang, W.; Freiheit, T.; Yang, H. Dynamic scheduling in flexible assembly system based on timed Petri nets model. Robot. Comput. Integr. Manuf. 2005, 21, 550–558.
16. Hu, H.; Li, Z. Local and global deadlock prevention policies for resource allocation systems using partially generated reachability graphs. Comput. Ind. Eng. 2009, 57, 1168–1181.
17. Abdallah, B.; ElMaraghy, H.A.; ElMekkawy, T. Deadlock-free scheduling in flexible manufacturing systems. Int. J. Prod. Res. 2002, 40, 2733–2756.
18. Lei, H.; Xing, K.; Han, L.; Xiong, F.; Ge, Z. Deadlock-free scheduling for flexible manufacturing systems using Petri nets and heuristic search. Comput. Ind. Eng. 2014, 72, 297–305.
19. Lefebvre, D.; Leclercq, E. Control design for trajectory tracking with untimed Petri nets. IEEE Trans. Autom. Control 2015, 60, 1921–1926.
20. Lefebvre, D. Approaching minimal time control sequences for timed Petri nets. IEEE Trans. Autom. Sci. Eng. 2016, 13, 1215–1221.
21. Lefebvre, D. Deadlock-free scheduling for Timed Petri Net models combined with MPC and backtracking. In Proceedings of the IEEE WODES 2016, Invited Session “Control, Observation, Estimation and Diagnosis with Timed PNs”, Xi’an, China, 30 May–1 June 2016; pp. 466–471.
22. Lefebvre, D. Deadlock-free scheduling for flexible manufacturing systems using untimed Petri nets and model predictive control. In Proceedings of the IFAC MIM, Invited Session “DES for Manufacturing Systems”, Troyes, France, 28–30 June 2016.
23. Ramchandani, C. Analysis of Asynchronous Concurrent Systems by Timed Petri Nets. Ph.D. Thesis, MIT, Cambridge, MA, USA, 1973.
24. Molloy, M.K. Performance analysis using stochastic Petri nets. IEEE Trans. Comput. C 1982, 31, 913–917.
25. Richalet, J.; Rault, A.; Testud, J.; Papon, J. Model predictive heuristic control: Applications to industrial processes. Automatica 1978, 14, 413–428.
26. Camacho, E.; Bordons, A. Model Predictive Control; Springer: London, UK, 2007.
27. Uzam, M. An optimal deadlock prevention policy for flexible manufacturing systems using Petri net models with resources and the theory of regions. Int. J. Adv. Manuf. Technol. 2002, 19, 192–208.
28. Chen, Y.; Li, Z.; Khalgui, M.; Mosbahi, O. Design of a Maximally Permissive Liveness-Enforcing Petri Net Supervisor for Flexible Manufacturing Systems. IEEE Trans. Autom. Sci. Eng. 2011, 8, 374–393.
29. Berthomieu, B.; Vernadat, F. State Class Constructions for Branching Analysis of Time Petri Nets. In Proceedings of the Ninth International Conference on Tools and Algorithms for the Construction and Analysis of Systems TACAS 2003, Warsaw, Poland, 7–11 April 2003; Springer: New York, NY, USA, 2003; Volume 2619, pp. 442–457.
30. Gardey, G.; Roux, O.H.; Roux, O.F. Using Zone Graph Method for Computing the State Space of a Time Petri Net. In Proceedings of the International Conference on Formal Modeling and Analysis of Timed Systems FORMATS 2003, Marseille, France, 6–7 September 2003; Springer: Berlin/Heidelberg, Germany, 2003; Volume 2791, pp. 246–259.
31. Klai, K.; Aber, N.; Petrucci, L. A New Approach to Abstract Reachability State Space of Time Petri Nets. In Proceedings of the 20th International Symposium on Temporal Representation and Reasoning, Pensacola, FL, USA, 26–28 September 2013.
Figure 1.
Examples of robust (R), dangerous (D) and forbidden (F) markings in R(G, M_{I}) depending on the controllable (T_{C}) and uncontrollable (T_{NC}) transitions.
Figure 2.
An example of dangerous trajectory: M(k) enables two controllable transitions T(k + 1) and T_{C2} and two uncontrollable ones T_{NC1} and T_{NC2}.
Figure 6.
Cost function J_{FC} for a controlled sequence disturbed by an unexpected firing of T_{7} with respect to time (TUs).
Figure 7.
PcontSPN2 model of a manufacturing system [28].
|  | RB | RP (μ_{7} = 0.1) | RP (μ_{7} = 1) | RP (μ_{7} = 10) |
|---|---|---|---|---|
| σ_{1} | 5/6 = 0.83 | 1/3 = 0.33 | 5/6 = 0.83 | 50/51 = 0.98 |
| σ_{2} | 0 | 0 | 0 | 0 |
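Table 2.
Mean duration d of control sequences depending on μ_{7} for three scenarios.
| μ_{7} | 0.1 | 0.5 | 1 | 2 | 10 |
|---|---|---|---|---|---|
| Scenario 1 | 2 | 2 | 2 | 2 | 2 |
| Scenario 2 | 2.0 | 20.3 | 57.2 | 125.4 | 228.5 |
| Scenario 3 | 6 | 6 | 6 | 6 | 6 |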
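Table 3.
Duration of control sequences for PcontSPN2 depending on k for three scenarios ($\overline{H}$ = 5, H_{τ} = 20).
| k | Scenario 1 | Scenario 2 | Scenario 3 |
|---|---|---|---|
| 5 | 45 | 103.4 | 72 |
| 10 | 142 | 213.5 | 147 |
| 15 | 236 | 321.9 | 222 |
| 20 | 325 | 427.1 | 297 |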
Table 4.
Performance of Algorithm 3 with respect to parameters $\overline{H}$ and H_{τ} for PContSPN2, sequence duration (TUs) and computational time (s).
| H_{τ} (rows) \ $\overline{H}$ (columns) | 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|---|
| 5 | 82 (0.9 s) | 86 (0.8 s) | 68 (1 s) | 68 (1.2 s) | 68 (1.3 s) | 68 (1.3 s) |
| 10 | 82 (0.9 s) | 86 (0.8 s) | 76 (1.5 s) | 76 (2.5 s) | 76 (4.7 s) | 76 (7.9 s) |
| 15 | 82 (1 s) | 86 (0.9 s) | 63 (2.1 s) | 63 (3.5 s) | 63 (9.4 s) | 63 (16.1 s) |
| 20 | 82 (1 s) | 86 (0.8 s) | 45 (2.6 s) | 45 (4.7 s) | 45 (10.7 s) | 45 (20.6 s) |
Table 5.
Complexity of the exhaustive exploration of control sequences for PContSPN2, the number of nodes and the computation time (s).
| k | Usual Reachability Graph | Extended Reachability Graph |
|---|---|---|
| 5 | 698 (1.4 s) | 2208 (106 s) |
| 10 | 1963 (11 s) | 6848 (1827 s) |
| 15 | 3268 (29 s) | … |
| 20 | … | … |
© 2017 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).