Temporal Analysis of Influence of Resource Failures on Cyber-Physical Systems Based on Discrete Timed Petri Nets

Featured Application: An analysis method, based on transformation of discrete timed Petri nets, to assess and respond to the impact of resource failures on order deadlines in Cyber-Physical Systems.
Abstract: Advances in IoT and ICT provide the infrastructure to manage, monitor and control Cyber-Physical Systems (CPS) through timely provision of real-time information from the shop floor. Although real-time information in CPS, such as resource failures, can be detected based on IoT and ICT, an improper response to resource failures may cripple CPS and degrade performance. Effective operation of CPS relies on an effective scheme to evaluate the impact of resource failures, support the decision making needed and take proper actions to respond to resource failures. This motivates us to develop a methodology to assess the impact of resource failures on operations of CPS and provide decision support as needed. The goal of this study is to propose solution algorithms to analyze robustness of CPS with respect to resource failures in terms of the impact on temporal properties. Given CPS modeled by a class of discrete timed Petri nets (DTPNs), we develop theory to analyze robustness of CPS by transforming the models to residual spatial-temporal network (RSTN) models in which capacity loss due to resource failures is reflected. We formulate an optimization problem to determine the influence of resource failures on CPS based on RSTNs and analyze the feasibility of meeting the order deadline. To study the feasibility of solving a real problem, we analyze the computational complexity of the proposed algorithms. We conduct experiments to study efficiency and verify the computational feasibility of the proposed method. We illustrate the proposed method by applying it to two application scenarios. In the first scenario, only one type of production is influenced by the failure, and the results indicate that the workflows are rerouted properly to account for the failed resource. In the second scenario, two types of products are influenced by the failures.


Introduction
The advancement of information and communications technology (ICT) and Internet of Things (IoT) [1] provides the infrastructure for Cyber-Physical Systems (CPS) [2,3] to manage, monitor and control manufacturing systems. ICT refers to technologies involving any kind of computing/communication devices, networking components and information systems that enable efficient interaction in the digital world, while IoT can be defined as the network of physical objects embedded with sensors, software and relevant ICT technologies for interacting with other objects/systems over the Internet. Although CPS provides the potential visibility and a paradigm for enterprises to respond to dynamic changing environments based on ICT and IoT, these advantages also pose challenges in modeling, design and implementation of CPS [4]. In particular, robust design of CPS has become an important issue to achieve stable and secure CPS and is attracting a lot of attention in the CPS research community. A comprehensive survey of the concept and strategies for robust design of CPS can be found in [5]. In CPS for manufacturing systems, failures of machines/resources are unavoidable. Managers often wonder and worry about whether the original due dates of orders can still be met when failures of machines/resources occur, i.e., whether the manufacturing system is robust with respect to the failures. Providing a decision support tool for managers to assess the impact of failures of machines/resources on the orders is an important issue.
In terms of the robustness design requirements of CPS, the real-time information and communication infrastructure provided by IoT and ICT alone cannot ensure effective operations and robustness of CPS. Robustness of CPS can be achieved only if the real-time information provided by IoT and ICT is used in a timely and proper manner to support decision making and respond to the changing environment. For example, real-time information from the shop floor, such as resource failures [6], can be detected in CPS. However, real-time resource failure information from IoT and ICT is useful only if an effective scheme to evaluate the impact of, and deal with, resource failures is available. In the literature, design issues of CPS include robustness, security and cost effectiveness [7], whereas the impact of resource failures on CPS is less explored. This motivates us to develop a methodology to assess the impact of resource failures on operations and temporal properties of CPS. The goal of this study is to propose a method for modeling and analysis of CPS, in order to pave the way for developing an effective strategy to handle resource failures in CPS.
To evaluate the impact of resource failures on CPS, we will study the robustness of CPS in the context of manufacturing systems in this paper. The robustness property is concerned with the influence of resource failures on the operations and performance of CPS. In make-to-order manufacturing, meeting the deadline of orders is an important issue. Resource failures often cripple the operations of manufacturing systems and may lead to delays in fulfillment of orders. Therefore, the robustness property of CPS with respect to resource failures to be studied in this paper is concerned with how the orders will be influenced due to resource failures. In classical control theory, robustness addresses the issue about whether the desired system property can be maintained in the presence of uncertainty. The problem to be studied in this paper is to investigate the effects of resource failures on operations of CPS and the fulfillment of an order with multiple product demands and deadlines. The problem setting of this paper is different from that of [8], in that multiple types of product demands are considered in this study whereas the work of [8] considers a single type of product demands. The objective of this study is to propose a solution algorithm to analyze robustness of CPS with respect to resource failures, based on a class of discrete timed Petri net (DTPN) model that extends the one in [8].
The approach to robustness analysis of the DTPN models in this study is different from the existing reachability analysis methods that rely on abstraction of the state space of timed Petri nets [9,10]. Our approach is based on the concept of model transformation in the Model Driven approach (MDA) [11], which transforms DTPNs into spatial-temporal networks (STNs) to enable compact representation and efficient search for solutions based on mathematical programming/optimization tools. As this study focuses on assessing the impact of resource failures on temporal properties based on model transformation, it is different from [12,13], which focus on robust control and scheduling based on timed, extended reachability graphs. This paper is different from [8,14] as it focuses on the robustness property of CPS. In addition, the problem studied in this paper considers order requirements with multiple types of product demands instead of a single type of product demand. This paper is different from [15] as it includes important properties, algorithms, rigorous proofs of properties of CPS, complexity analysis and verification of the complexity analysis results by experiments. In summary, the contributions of this paper include: (1) proposing a formal timed model to capture resource failures in CPS, (2) formulating a feasibility problem for nominal CPS and a robustness analysis problem to analyze the impact of resource failures on temporal properties, (3) developing algorithms for the problems based on model transformation and (4) analyzing the complexity of the proposed algorithms and verifying the computational feasibility of the proposed approach.
The structure of this paper is as follows. A literature review and the robustness analysis problems in CPS will be introduced in Sections 2 and 3, respectively. In Section 4, construction of STNs and analysis of the nominal CPS based on STNs will be presented. In Section 5, the impact of resource failures on CPS will be analyzed. In Section 6, we will illustrate the proposed method by examples and present the results. We will discuss the results in Section 7 and conclude this paper in Section 8.

Literature Review
CPS consist of two parts: the cyber space and the physical space. Control of entities in the physical space of CPS is based on the cyber space model. Analysis of CPS therefore relies on proper modeling tools to construct the cyber space model. A modeling tool must be able to capture concurrent, synchronous and asynchronous operations/events in CPS and must support construction, editing and representation of the model in an industrial standard format. A variety of modeling approaches/tools can be used to capture the operations of CPS by constructing the cyber space model of CPS, e.g., Discrete Event System [16], Extended Hybrid Automata [17] and Petri nets (PN) [18]. Among these tools, Petri nets [19] provide an easy-to-use graphical user interface that can capture concurrent, synchronous and asynchronous operations/events in CPS. Petri nets can be represented by the Petri Net Markup Language (PNML) [20], an XML-based interchange format widely supported by many software tools [21]. Therefore, we adopt Petri nets in this paper to model CPS. Although a variety of Petri net tools, such as the ones in [22,23], are available for modeling CPS, these tools do not provide a method for analyzing the influence of resource failures on performance.
As the objective of this paper is to analyze the influence of resource failures on the temporal properties of CPS, Petri nets supporting modeling of time are adopted. In the literature, some Petri net models assign time delay to transitions while others assign time delay to places in the nets [24]. A timed Petri net is called a deterministic timed Petri net if the time delay is deterministic [25-29]. A timed Petri net is called a stochastic timed Petri net if the time delay is probabilistically specified [30-33]. As this paper focuses on the influence of resource failures on the temporal properties of CPS, a class of deterministic discrete timed Petri nets is adopted. Different models have been proposed to capture resource failures. Some of the works (e.g., [34,35]) model resource failures as loss of tokens in untimed Petri net models and develop methods to assess the impact of resource failures on the system after the occurrence of failures. The works [36,37] classify the resources into reliable resources and unreliable resources based on a class of untimed Petri nets (S4PR) and develop methods to allocate resources. The works [12,13] consider uncertainties due to the interruptions of operations and unreliable resources and take performance and risk into account in the objective function to deal with uncertainties in manufacturing systems based on timed Petri nets. As the goal of this paper is to develop a method to assess the impact of resource failures on the system after the occurrence of failures, we model resource failures as loss of tokens in discrete timed Petri nets. Although the proposed discrete timed Petri nets can capture the characteristics of CPS, a theory to analyze the robustness property is still lacking. For this reason, this paper is devoted to the development of theory to facilitate the analysis of the robustness property of CPS based on the proposed discrete timed Petri nets.
To study the robustness property of CPS, a nominal solution must be generated first. Robustness analysis for CPS is done based on the nominal solution. To generate a nominal solution, models that can capture the spatial and temporal properties of workflows are constructed. In this paper, we consider order requirements with multiple product demands and a deadline. A spatial-temporal network (STN) model is constructed for each type of production process required by the orders. The STN models are constructed based on transformation of the DTPN models of CPS. The STNs constructed for CPS are capable of capturing the dynamics of workflows in space and time, but they can only capture the nominal operations of CPS. To study the influence of resource failures on CPS, we construct residual spatial-temporal network (RSTN) models in which capacity loss due to resource failures is reflected and modeled. We analyze the impact of resource failures on CPS based on RSTNs. To optimize the performance after resource failures, we formulate an optimization problem to determine the influence of resource failures on CPS and analyze the feasibility of meeting the deadline. The optimization problem can be solved by applying any integer programming solver such as the CPLEX Optimizer [38]. We assess the practicality of the proposed method by applying it to application scenarios in which resource failures occur in executing production schedules. The influence of resource failures on the schedules in each scenario can be visually and clearly represented in the residual spatial-temporal networks. To study the feasibility of solving a real problem, we analyze the computational complexity of the proposed algorithm. Our analysis indicates that the lower bound on the complexity of the proposed algorithm is polynomial with respect to the tasks (product demands) and the scale of the production processes. We verify the computational feasibility of the proposed method by conducting experiments. The results indicate that the computational time grows polynomially with the quantity of the products influenced by resource failures and the number of transitions in the task subnet.

Robustness Analysis Problem Formulation in CPS
The problem to be studied in this paper is to analyze the impact of resource failures on CPS. To formulate the problem, we summarize the notations/symbols used in this paper in Table 1. The cyber world model of a CPS is represented by $G$. It is used to capture the dynamics of CPS. The nominal cyber world model is an abstraction of the physical world of CPS. The structure of the nominal cyber world model considered in this paper is similar to the S3PR model and the S4PR model in the literature [39,40], which can capture sequential processes with shared resources, but it is extended with a time factor and a proper uncertainty model to capture failures of resources and the time intervals during which failures occur. Uncertainties such as machine failures may occur in the physical world of CPS and influence the operations of CPS; they are described by the uncertainty model. Given the nominal cyber world model $G$ and the uncertainty model, the problem is to study the influence of the uncertainty on the operations of CPS. To analyze this influence, the nominal cyber world model $G$ and the uncertainty model must be represented properly. In this paper, the nominal cyber world model $G$ will be represented by a DTPN model. The uncertainty model will be characterized based on the failed resources as well as the time intervals during which unexpected events occur. In this section, we first introduce construction of the nominal model $G$ for CPS and then present the uncertainty model to formulate the problem.
Table 1. Notation.
Initial capacity: the number of type $r$ resources in period $t$, set to $m_0(r)$.
$C_{rt}$: the residual capacity of type $r$ resources in period $t$, initially set to the initial capacity.
Uncertainty model: captures resource failures for a reachable marking $m$.
Influenced quantity: the quantity of products influenced by resource failures.
Final state places: the set of all final state places in $P$.
Time horizon: the number of periods in the time horizon.
$STN_j(V_j, A_j)$: a spatial-temporal network constructed for type $j$ tasks.
RSTN: a residual spatial-temporal network constructed based on $STN_j(V_j, A_j)$.

We consider a make-to-order manufacturing system in which an order is specified by the quantity $Q_j$ of each type of product demand and the deadline. The production process of each type of product is associated with a type of task. Therefore, the number of different types of tasks is $J$. The set of indices of all different types of tasks is denoted by $\mathbf{J} = \{1, 2, 3, \ldots, J\}$ and we refer to a type of task by $j \in \mathbf{J}$. We define a task subnet to describe the production process of a type of product. A task subnet can be described by a discrete timed Petri net (DTPN). A DTPN is a five-tuple $G = (P, T, F, m_0, \mu)$ in which $P$ is a set of places, $T$ is a set of transitions, $F$ is the flow relation, $m_0$ is an initial marking, and $\mu : T \to Z$ is a function that specifies either a lower bound of the firing time of each transition or the firing time of each transition, where $Z$ is the set of nonnegative integers.

Note that for a transition $t \in T_j$ of a task subnet, the firing time given in the task subnet only specifies a lower bound of the firing time of $t$ by using any resource. Firing a transition relies on the use of a resource. The firing time of transition $t \in T_j$ depends on the type of resource used to fire $t$. The exact firing time of $t$ is determined by the firing time of the same transition specified in the corresponding resource subnet of the resource involved. Therefore, the firing time in the task subnet can be set to zero for each transition $t \in T_j$, because zero is always a lower bound of the firing time of $t$ by using any resource. Figure 1 shows two examples of task subnets, $GJ_1$ and $GJ_2$. The firing times of the transitions in $GJ_1$ can be set to zero if the lower bound of the firing time of each transition in $T_1$ is unknown. Similarly, if the lower bound of the firing time of each transition in $T_2$ is unknown, the firing times of the transitions in $GJ_2$ can be set to zero. To facilitate modeling, a merging operator is defined as follows to merge two DTPNs through common transitions, places and arcs.

Definition 2: Given two discrete timed PNs, $G_1$ and $G_2$, the merging operator combines $G_1$ and $G_2$ into a single discrete timed PN through their common transitions, places and arcs.

An operation is performed by a resource. In this study, the number of different types of resources is denoted by $R$. We use $\mathbf{R} = \{1, 2, 3, \ldots, R\}$ to denote the set of different types of resources in the system. We use $r$ to refer to a type of resource, where $r \in \mathbf{R}$. The $k$-th operation performed by a type $r$ resource is represented by a circuit, $G_r^{jk}$, which is a discrete timed PN. Each circuit $G_r^{jk}$ has an idle place $r$, a busy place, one start transition and one end transition. For each resource type $r$ and task type $j$, we also use the set of indices of the circuits of type $r$ resources involved in type $j$ tasks. The capabilities of a resource type $r$ are described by a resource subnet defined below.

Definition 3: A resource subnet $GR_r$ for a type $r$ resource is a discrete timed Petri net obtained by merging the circuits of type $r$ resources.
Operations in manufacturing systems are performed by resources to process parts. The proposed model captures synchronization between parts and resources by applying the merging operator to a common transition in a task subnet and a resource subnet. The merging operator also merges the relevant common places and arcs involved between a task subnet and the associated resource subnet. Note that the firing time for a transition in the task subnet is just a lower bound, which is never greater than the firing time specified in the resource subnet, so the exact time for firing a transition depends on the resource involved. Therefore, after applying the merging operator, the firing time of the merged transition is defined as the maximum of the transition firing times specified in the task subnet and the resource subnet.
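The merging rule described above can be illustrated with a minimal Python sketch. The `DTPN` container, the `merge` helper and the toy place/transition names are hypothetical and are not taken from the paper; the sketch assumes that common places carry the same initial marking in both nets and that a missing firing time counts as zero.

```python
from dataclasses import dataclass


@dataclass
class DTPN:
    """Illustrative container for a discrete timed Petri net (hypothetical names)."""
    places: set
    transitions: set
    arcs: set          # directed arcs stored as (source, target) pairs
    marking: dict      # place -> number of tokens in the initial marking
    firing_time: dict  # transition -> nonnegative integer firing time


def merge(g1: DTPN, g2: DTPN) -> DTPN:
    """Merge two nets through their common places, transitions and arcs."""
    marking = dict(g2.marking)
    marking.update(g1.marking)  # assumption: shared places have equal markings
    transitions = g1.transitions | g2.transitions
    # The merged firing time is the maximum of the two specifications.
    firing_time = {
        t: max(g1.firing_time.get(t, 0), g2.firing_time.get(t, 0))
        for t in transitions
    }
    return DTPN(g1.places | g2.places, transitions, g1.arcs | g2.arcs,
                marking, firing_time)


# Task subnet: firing times are only lower bounds, so they are set to zero.
task = DTPN({"p1", "p2", "p3"}, {"t1", "t2"},
            {("p1", "t1"), ("t1", "p2"), ("p2", "t2"), ("t2", "p3")},
            {"p1": 3, "p2": 0, "p3": 0}, {"t1": 0, "t2": 0})
# Resource circuit: idle place r1, busy place b1, start t1 and end t2.
circuit = DTPN({"r1", "b1"}, {"t1", "t2"},
               {("r1", "t1"), ("t1", "b1"), ("b1", "t2"), ("t2", "r1")},
               {"r1": 1, "b1": 0}, {"t1": 1, "t2": 2})
merged = merge(task, circuit)
```

In this toy case the merged transitions inherit the firing times of the resource circuit, because the task subnet only contributes zero lower bounds.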
Consider a task subnet and a resource subnet. In the net obtained by applying the merging operator to them, the firing time of each merged transition is determined by the firing time specified in the resource subnet, as the firing time in the task subnet can be set to zero for each transition.

To capture cooperation and interaction between different types of tasks and resources in the CPS, we merge the task subnets with the resource subnets to construct the discrete timed Petri net model $G$ for the nominal CPS. In this paper, it is assumed that the timed Petri net models work under the infinite server policy. As we focus on the deterministic model and aim to determine the feasible firing sequences to meet the deadline, the firing of transitions is determined by the algorithm. Note that operations of the DTPN models are based on the firing sequences found by the proposed algorithm. Therefore, a firing policy and a memory policy are not required in the proposed method. Figure 3 shows an example of $G$, where the task subnets $GJ_1$ and $GJ_2$ are shown in Figure 1 and $GR$ is obtained by merging all the circuits in Figure 2.

In a make-to-order manufacturing system, an order is specified by the quantity of different types of product demands, $Q_j$, $j \in \mathbf{J}$, and the deadline. The requirements to meet an order with product demands are represented by a marking $m_d$. Therefore, the problem of determining whether the order can be fulfilled by the deadline can be stated as the following feasibility problem.

Feasibility Problem of Nominal CPS (FPN): Given $G$, a marking $m$, a marking $m_d$ and a deadline, determine whether there exists a firing sequence that brings $G$ from $m$ to $m_d$ by the deadline.

Operations of manufacturing systems are usually influenced by unexpected events such as resource failures. Resource failures may have an impact on the operations and orders in the system. To deal with resource failures properly, the impact due to resource failures must be evaluated.
To study the influence of resource failures on CPS, we must represent resource failures properly. We introduce the uncertainty model for a marking $m$ based on the concept of perturbation vectors and failure time intervals as follows.

Definition 5:
A $|P|$-dimensional perturbation vector is an integer vector that specifies the number of tokens lost in each place in $P$ due to resource failures. The $i$-th element of the perturbation vector is a non-negative integer that represents the number of tokens lost in the $i$-th place, $p_i$.
In this study, a discrete time horizon consisting of a finite number of periods is considered. To describe the time intervals of resource failures, we define a discrete time failure interval. Note that the $i$-th element of the perturbation vector may exceed 1, in which case it counts the failures of multiple resources in place $p_i$. As the time interval of each resource failure may not be the same, we define a failure interval as follows.

Definition 6:
For the $l$-th resource failure in place $p_i$, the corresponding discrete time failure interval describes the starting time and the end time of the resource failure.
According to Definition 6, a failure taking place outside the time horizon can be represented by an interval specified with a number $L$ that is larger than the number of periods in the time horizon. Given a nominal cyber-physical system model $G$, an uncertainty model is defined based on a marking $m$ reachable from $m_0$. As the dimension of a marking is $|P|$, the uncertainty model for $m$ is defined as a pair consisting of a $|P|$-dimensional perturbation vector, which represents the perturbation of the nominal marking, and a $|P|$-dimensional vector that describes all failure intervals associated with the perturbation.
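As an illustration of how a perturbation vector and its failure intervals translate into capacity loss per period, consider the following sketch. The function `capacity_loss`, the dictionary encoding and the place names are hypothetical; periods are assumed to be numbered from 1 and failure intervals are assumed to be inclusive at both ends, which the paper may define differently.

```python
def capacity_loss(perturbation, intervals, num_periods):
    """Return {(place, period): tokens lost} for periods inside the horizon.

    perturbation: place -> number of tokens lost (the perturbation vector)
    intervals:    place -> list of (start_period, end_period), one per lost token
    """
    loss = {}
    for place, lost_tokens in perturbation.items():
        spans = intervals.get(place, [])
        if len(spans) != lost_tokens:
            raise ValueError(f"need one failure interval per lost token in {place}")
        for start, end in spans:
            # Clip the interval to the horizon; an interval that starts after
            # the horizon therefore contributes no loss at all.
            for period in range(max(start, 1), min(end, num_periods) + 1):
                key = (place, period)
                loss[key] = loss.get(key, 0) + 1
    return loss


# One resource in idle place "r1" fails during periods 1 and 2 of a 10-period horizon.
example_loss = capacity_loss({"r1": 1, "r2": 0}, {"r1": [(1, 2)]}, num_periods=10)
```

A failure interval placed beyond the horizon, for example `(11, 11)` with ten periods, yields an empty result, which mirrors the convention of representing failures outside the time horizon with a number larger than the number of periods.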
In this paper, we will study whether the deadline can be met in the presence of uncertainty, based on the following problem formulation.

Generation of Solutions for Nominal CPS
The two problems presented in the previous section, FPN and RAP, aim to analyze the robustness of a solution of nominal CPS. In this section, we first present our approach to generate a solution to FPN. We will study RAP and analyze the robustness of the solution in the next section.
As the classical reachability analysis methods suffer from state explosion problem, we will propose a different approach to representing and finding solutions based on spatial-temporal network (STN) models, which describe the movement of flow tasks in space and time. The steps to construct STN will be presented later and are intuitively done by capturing the flows of tasks from upstream to downstream spatially and temporally based on DTPN.
The feasibility problem of nominal CPS aims to determine the feasibility of meeting an order with $J$ types of product demands by a deadline, which extends the problem for a single type of product demand addressed in [8]. The novelty of the proposed approach is to analyze the DTPN based on transformation of the model into spatial-temporal network (STN) models without relying on classical reachability analysis methods for timed Petri nets. Such transformation enables compact representation of solutions and efficient search for solutions based on mathematical programming/optimization tools. The proposed method is therefore different from the ones that rely on abstraction of the state space of timed Petri nets.
To deal with an order with multiple types of product demands by a deadline, a spatial-temporal network $STN_j(V_j, A_j)$ is constructed based on the task subnet $GJ_j$ for each type $j$ of task, where $V_j$ is the set of nodes and $A_j$ is the set of arcs. To optimize the schedule to meet the deadline, we set the cost for arcs connecting to the end node according to the periods. The cost for an arc corresponding to period $t$ after the deadline is simply set to $t$. The cost for all the other arcs is set to zero.

Note that the set of places in $P_j$ can be decomposed into two disjoint subsets, $P_j^1$ and $P_j^2$, where $P_j^1$ represents the set of idle state places for parts in type $j$ tasks and $P_j^2$ represents the set of busy state places of parts in type $j$ tasks. That is, $P_j = P_j^1 \cup P_j^2$ and $P_j^1 \cap P_j^2 = \emptyset$. Without loss of generality, it is assumed that the places in $P_j$ are numbered from upstream to downstream. Each node represents a specific point in time and space. We create one node per period for each place in $P_j^1$, together with one start node and one end node, in $STN_j(V_j, A_j)$ to represent spatial-temporal information.

In constructing $STN_j(V_j, A_j)$, the capacity of an arc involving the use of type $r$ resources is set to the residual capacity $C_{rt}$ of type $r$ resources. The initial capacity of type $r$ resources in period $t$ is the number of type $r$ resources, $m_0(r)$, for all $t$. Initially, the residual capacity $C_{rt}$ of type $r$ resources in period $t$ is set to this initial capacity. The residual capacity $C_{rt}$ will be updated iteratively in Algorithm 2 (to be introduced later), which checks the feasibility of nominal CPS, to reflect the assignment of tasks to resources.

Algorithm 1: Construct $STN_j(V_j, A_j)$
Step 0: Create a start node $s_j$ and an end node $e_j$.
Step 1: For each place in $P_j^1$ and each period, create a node.
Step 2: Add an arc $a$ from $s_j$ to the node of the first place in the first period. Set the arc capacity $c_a$ to $Q_j$ and the arc cost $w_a$ to 0.
Step 3: For each operation $n$ and each period, add an arc $a$ for the operation. Set the arc capacity $c_a$ according to the residual capacity $C_{rt}$, where $r$ is the resource type used by operation $n$. Set the arc cost $w_a$ to 0.
Step 4: Add the arcs connecting to the end node $e_j$ and set their costs according to the periods as described above.

In $STN_j(V_j, A_j)$, the flows of parts in horizontal arcs denote parts in the waiting state and the flows of parts in arcs connecting to the end node represent the finished parts. Flows of parts in all the other arcs denote the parts in the busy state (being processed by machines/resources). Figure 4 shows the structure of $STN_1(V_1, A_1)$ for type 1 tasks, obtained by applying the above algorithm to $GJ_1$ in Figure 1.

Let $A_{jrt}$ denote the set of arcs involved in the use of type $r$ resources in period $t$ in $STN_j(V_j, A_j)$. The capacity constraints (1), the flow balance equations (2), the supply constraints (3) and the demand constraints (4) must hold. Based on the spatial-temporal network $STN_j(V_j, A_j)$, the following problem is formulated to check the feasibility of meeting the product demands, $Q_j$, by the deadline.

Feasibility Problem of Spatial-Temporal Network (FPSTN) for $STN_j(V_j, A_j)$: Find flows $f_j$ that minimize the total arc cost subject to the capacity constraints (1), the flow balance equations (2), the supply constraints (3), the demand constraints (4) and the integrality constraints (5), which require the flow in each arc to take a value in $Z$, where $Z$ is the set of nonnegative integers.
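The construction and the feasibility check can be sketched for a toy case as follows. This is only an illustration under simplifying assumptions, not the authors' implementation: the task is a linear chain of operations, node `(k, t)` stands for a part that has completed `k` operations at the end of period `t`, an operation started at that point occupies its resource from period `t + 1`, and capacity is enforced per arc only. The per-arc capacity is a relaxation of constraints (1), which couple all arcs using the same resource type in the same period; the two coincide in the toy data, where processing times are one period and each resource type serves one operation. The names `build_stn` and `min_cost_flow` are hypothetical, and a plain successive-shortest-path routine stands in for an integer programming solver.

```python
def build_stn(ops, capacity, horizon, deadline, quantity):
    """Build a time-expanded network; returns (num_nodes, arcs).

    ops: list of (resource, duration); capacity: {(resource, period): units}.
    Node 0 is the start node, node 1 is the end node.
    """
    stages = len(ops) + 1

    def node(k, t):
        return 2 + k * (horizon + 1) + t

    arcs = [(0, node(0, 0), quantity, 0)]           # supply arc
    for k in range(stages):
        for t in range(horizon + 1):
            if t < horizon:                          # horizontal (waiting) arc
                arcs.append((node(k, t), node(k, t + 1), quantity, 0))
            if k < len(ops):                         # processing arc
                r, d = ops[k]
                if t + d <= horizon:
                    cap = capacity.get((r, t + 1), 0)
                    arcs.append((node(k, t), node(k + 1, t + d), cap, 0))
            else:                                    # arc to the end node
                arcs.append((node(k, t), 1, quantity, t if t > deadline else 0))
    return 2 + stages * (horizon + 1), arcs


def min_cost_flow(num_nodes, arcs, source, sink, demand):
    """Successive shortest paths with Bellman-Ford; returns (flow, cost)."""
    head, cap, cost = [], [], []
    out = [[] for _ in range(num_nodes)]
    for u, v, c, w in arcs:          # forward arc at even index, reverse at odd
        out[u].append(len(head))
        head.append(v); cap.append(c); cost.append(w)
        out[v].append(len(head))
        head.append(u); cap.append(0); cost.append(-w)
    flow = total = 0
    inf = float("inf")
    while flow < demand:
        dist = [inf] * num_nodes
        pred = [-1] * num_nodes
        dist[source] = 0
        for _ in range(num_nodes - 1):
            changed = False
            for u in range(num_nodes):
                if dist[u] == inf:
                    continue
                for e in out[u]:
                    if cap[e] > 0 and dist[u] + cost[e] < dist[head[e]]:
                        dist[head[e]] = dist[u] + cost[e]
                        pred[head[e]] = e
                        changed = True
            if not changed:
                break
        if dist[sink] == inf:
            break                    # the remaining demand cannot be routed
        push, v = demand - flow, sink
        while v != source:           # bottleneck capacity along the path
            push = min(push, cap[pred[v]])
            v = head[pred[v] ^ 1]
        v = sink
        while v != source:           # augment along the path
            e = pred[v]
            cap[e] -= push
            cap[e ^ 1] += push
            v = head[e ^ 1]
        flow += push
        total += push * dist[sink]
    return flow, total


# Toy data: two unit-time operations, one unit of each resource per period.
ops = [("r1", 1), ("r2", 1)]
capacity = {(r, t): 1 for r in ("r1", "r2") for t in range(1, 11)}
num_nodes, arcs = build_stn(ops, capacity, horizon=10, deadline=8, quantity=3)
flow, cost = min_cost_flow(num_nodes, arcs, 0, 1, 3)
meets_deadline = flow == 3 and cost == 0
```

Because only arcs that reach the end node after the deadline carry a positive cost, a zero-cost solution that routes the full demand means that every part finishes by the deadline.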
To check whether it is feasible to meet an order with product demands, $Q_j$, $j \in \mathbf{J}$, by the deadline, the following algorithm (Algorithm 2) can be applied. It solves the feasibility problem of the spatial-temporal network $STN_j(V_j, A_j)$ iteratively for the task types in $\mathbf{J}$ and updates the residual capacity $C_{rt}$ to reflect the assignment of tasks to resources. Once all task types have been processed, Step 2 checks whether the solution $f$ satisfies the condition of Property 1.

The above optimization problem is an integer programming problem that can be solved by applying any integer programming solver such as the CPLEX Optimizer [38]. Property 1 gives a condition that can be used to check whether the order deadline can be met. We analyze the complexity of Algorithm 2 below.

The complexity of constructing $STN_j(V_j, A_j)$ is determined by the number of nodes and the number of arcs constructed in the spatial-temporal network. The problem of finding the solution $f_j$ for the spatial-temporal network is a minimal cost flow problem with additional constraints, which yields a lower bound on the computational complexity of finding the solution $f_j$ for the spatial-temporal network with capacity constraint $C_{rt}$. The residual capacity $C_{rt}$ must also be updated after each solution is found. As the set of task types to be processed is initially set to $\mathbf{J}$ in Algorithm 2, a lower bound on the computational complexity of Algorithm 2 to check feasibility follows, which is of polynomial complexity. Although the above analysis only provides a lower bound on the computational complexity of Algorithm 2, the numerical results to be presented later show that the computational time grows polynomially, which is consistent with this lower bound.

Analysis of Impact of Resource Failures on CPS
Note that $f_j$ can be divided into two parts: the flows $f_j^1$ not influenced by the resource failures and the flows $f_j^2$ influenced by the resource failures. Procedure I computes the residual capacity as follows. It applies Algorithm 1 to construct the spatial-temporal network $STN_j(V_j, A_j)$, finds the flows $f_j^1$ not influenced by the resource failures in the network, removes the flows $f_j^2$ influenced by the resource failures, finds the quantity of products associated with $f_j^2$, and updates the residual capacity $C_{rt}$ by taking into account the capacity loss due to the resource failures.

In Step 2 of the algorithm to check robustness (Algorithm 3), Procedure I is applied to compute the residual capacity $C_{rt}$ for $j \in \{1, 2, \ldots, J\}$. As the number of arcs in the set $A_{jrt}$ is bounded, the number of arithmetic operations needed each time Procedure I is called to compute the residual capacity is bounded as well. Therefore, the overall computational complexity to compute the residual capacity is polynomial in the number of task types, the number of resource types and the number of periods. Note that the complexity of constructing the spatial-temporal network $STN_j(V_j, A_j)$ and finding a solution depends on the number of nodes and the number of arcs constructed in the network. As the number of task types influenced is no greater than $J$, a lower bound on the computational complexity of Algorithm 3 to check robustness follows, which is of polynomial complexity. Although the above analysis only provides a lower bound on the computational complexity of the algorithm, we will show that the numerical results indicate that the computational time grows polynomially, which is consistent with this lower bound.
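The residual capacity update can be sketched as below. This is a hedged reading of the description above rather than Procedure I itself: it assumes that the residual capacity of a resource type in a period equals the initial capacity minus the units used by the flows not influenced by the failures minus the capacity lost to the failures. The function name and the dictionary layout are hypothetical.

```python
def residual_capacity(initial, used_by_unaffected, loss):
    """Return {(resource, period): residual units}.

    initial:            {(resource, period): initial capacity}
    used_by_unaffected: {(resource, period): units used by unaffected flows}
    loss:               {(resource, period): units lost to resource failures}
    """
    residual = {}
    for key, capacity in initial.items():
        remaining = capacity - used_by_unaffected.get(key, 0) - loss.get(key, 0)
        # Clamp at zero as a safeguard; a negative value would indicate that an
        # influenced flow was not removed before the update.
        residual[key] = max(remaining, 0)
    return residual


# Two units of "r1" per period over three periods; one unit fails in periods
# 1 and 2, and an unaffected flow uses one unit in period 2.
initial = {("r1", t): 2 for t in (1, 2, 3)}
residual = residual_capacity(initial,
                             used_by_unaffected={("r1", 2): 1},
                             loss={("r1", 1): 1, ("r1", 2): 1})
```

The resulting values would then serve as arc capacities when the influenced flows are rerouted in the residual spatial-temporal network.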

Results
In this section, we illustrate the proposed method by examples and present the results to illustrate the computational feasibility of the proposed method. We first illustrate the results obtained by applying the proposed method to two examples that represent two application scenarios. In Example 1, a scenario in which resource failures influence only one type of product is considered. The patterns of resource failures in the scenario of Example 2 influence two types of products. We illustrate how the proposed method works to check the robustness of CPS with respect to resource failures. Following Example 1 and Example 2, we will present the results to study the computational feasibility of the proposed method.
Example 1: Consider a CPS that can produce two types of products. The production processes of the two types of products and the resources are modeled by the task subnets, $GJ_1$ and $GJ_2$, in Figure 1 and the resource models in Figure 2, respectively. The CPS model is shown in Figure 3. The models can be downloaded from the following link: https://drive.google.com/drive/folders/1rBiYKMoMDJOU1ktvjDGlz_nm1fFXOwdY?usp=sharing (Accessed on 15 May 2021).
Suppose an order that requires three type-1 products with a deadline of 9:20 AM has arrived. A time horizon starting from 8:00 AM is considered to handle the order. The time horizon is divided into time periods and the duration of each period is ten minutes. For this example, the order deadline of 9:20 AM corresponds to period 8 and we set the number of periods in the time horizon to 10 in our problem formulation. The data of the order described based on the above parameters is summarized in Table 2. As the order only requires type-1 products, only the operations with associated transitions in the type-1 task subnet and the associated processing time of the resources are shown in Table 3.
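The mapping from clock times to period indices used in this example can be checked with a short sketch. The helper `to_period` is hypothetical; it assumes ten-minute periods counted from 8:00 AM, so that the index returned for a clock time is the number of whole periods elapsed since the start of the horizon.

```python
from datetime import datetime


def to_period(clock, start="08:00", minutes_per_period=10):
    """Number of whole periods elapsed between `start` and `clock` (HH:MM)."""
    fmt = "%H:%M"
    elapsed = datetime.strptime(clock, fmt) - datetime.strptime(start, fmt)
    minutes = int(elapsed.total_seconds()) // 60
    if minutes < 0 or minutes % minutes_per_period:
        raise ValueError("clock time must fall on a period boundary after start")
    return minutes // minutes_per_period


deadline_period = to_period("09:20")   # 80 minutes after 8:00 AM
horizon_end = to_period("09:40")       # a 10-period horizon ends at 9:40 AM
```

With these assumptions, the 9:20 AM deadline maps to period 8, in line with the value used in the problem formulation.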
To present the solution clearly, the above solution is shown in a spatial-temporal network in Figure 5. In Figure 5, the flows influenced by the resource failure are represented by a red path. This solution meets the deadline of the order.

Figure 5. The solution shown in the spatial-temporal network $STN(V, A)$, where the flows influenced by the resource failure are represented by the red path.
In the process of producing products, a resource may fail unexpectedly. Suppose a resource fails at AM 8:00 and the failure is expected to be recovered by AM 8:20. The resource failure information is shown in Table 4. As only one type of product is involved in the nominal solution, only type-1 products will be influenced. Therefore, the set of influenced process types contains only type 1. The solution obtained for the influenced flows satisfies the condition of Property 3. It can meet the deadline as the condition of Property 1 holds. Figure 6a shows this solution in the corresponding spatial-temporal network. The solution displayed in Figure 6a is obtained by rerouting the flows in the red path of Figure 5. Figure 6b shows the overall solution obtained by combining the solution in Figure 6a (rerouted workflows) with the flows not influenced by resource failures. It indicates that the solution can meet the deadline, as the flow on every arc after the deadline is zero.
To illustrate the computational feasibility of the proposed method, we conducted experiments on a computer with an Intel Core™ i7-4720HQ CPU at 2.6 GHz and 16 GB of RAM. Two types of experiments were performed. The first type studies how the number of tasks that need to be rerouted due to resource failures influences the computation time. The second type focuses on the scalability of the proposed method with respect to the number of transitions in a task subnet.
For the first type of experiment, we increase the number of tasks influenced,

Discussion
Although CPS provides a paradigm to manage, monitor and control manufacturing resources in enterprises based on the real-time information from the IoT and ICT infrastructure, an effective scheme to deal with unexpected events must be developed to enhance robustness of CPS. Robust design of CPS becomes an important issue to achieve stable and secure CPS. It has attracted a lot of attention in the CPS research community. A comprehensive survey of the concept and strategies for robust design of CPS can be found in [5]. Although many studies have addressed the robustness issues on stability, security and systematicness in the context of CPS, robustness of CPS with respect to resource failures is less explored in the literature. This study aims to bridge this gap through development of a methodology to model and analyze robustness of CPS with respect to resource failures.
CPS relies on an effective scheme to deal with changes in processes, resources and implementation technology. The model-driven approach (MDA) provides a method to accommodate these changes [11]. MDA is an approach for developing software systems based on specifications described as models, and it is supported by the Unified Modeling Language™ (UML) [41]. Although UML is the standard for specification of software systems, it is weak in analyzing production processes with shared resources and concurrent, synchronous and asynchronous operations in CPS. Petri nets provide the capabilities to model and analyze CPS. In [8], a class of Petri nets called DTPN is proposed to model CPS. In addition, because an efficient method to analyze DTPN was lacking, an analysis method based on transformation of the DTPN model to a network model is proposed to analyze the temporal properties of the DTPN model. However, the problem in [8] considers order requirements with a single type of product demand and does not address resource failures. In terms of modeling tools and analysis methods, although the DTPN model for CPS has been proposed previously, an analysis method to model and analyze the impact of resource failures on the performance of CPS is still lacking. Although this paper builds on the preliminary results in [15] on the influence of resource failures on CPS, it differs from [15] in that it includes rigorous proofs of the robustness properties of CPS, a computational complexity analysis and verification of the complexity analysis results by experiments, none of which are covered in [15]. In addition, the problem studied in this paper considers order requirements with multiple types of product demands, whereas the problem in [8] considers order requirements with a single type of product demand.
In Petri net literature, there are two ways to describe time regarding transition firing. Timed Petri nets specify a duration to represent firing time for a transition [27]. Time Petri nets describe a time interval for firing transition [42]. From the perspective of theoretical development, the timed Petri net model considered in this study is different from [43,44] in that the DTPN model used in this study specifies transition firing time delay whereas the time Petri net model used in [43] specifies time intervals for transition firing. The transition firing time delay in DTPN is discrete, which makes it possible to analyze the robustness property of the system with RSTN. In [42], Akshay et al. defined a model to be robust if the set of discrete behaviors is preserved under arbitrarily small perturbations in the firing intervals of transitions. Akshay et al. showed that TPNs are not robust in general and that the problem to check robustness with respect to boundedness and safety properties is undecidable. In [45], robustness is defined as a property to measure the allowed variability of the timing delays in their neighborhood. The model used in [45] is also TPN which specifies time intervals for transition firing. The robustness property studied in this paper is not robustness with respect to firing time variation. Instead, we study the robustness with respect to perturbation in markings, which corresponds to changes in the number of resources in the system. Therefore, the study in this paper is different from the papers mentioned above.
In this study, we focus on the development of an algorithm to analyze the impact of resource failures on the operations and temporal properties of CPS, based on transformation of the DTPN model of CPS into relevant spatial-temporal network models. We illustrate the proposed algorithm by applying it to two application scenarios. In the first scenario, only one type of production process is influenced by the resource failure. The results indicate that the influenced workflows are rerouted properly to reflect the unavailability of the failed resource. In the second scenario, two types of production processes are influenced by the resource failure. The results again show that the influenced workflows of the two production processes are rerouted properly to reflect the unavailability of the failed resource. In addition to the two application scenarios, we perform experiments to study how the computational time grows with the number of tasks influenced by resource failures and with the number of transitions in the processes. The results of these experiments indicate that the proposed method is computationally feasible and that its observed growth is consistent with the polynomial lower bound on the complexity obtained in our analysis. This study provides a foundation for the analysis of temporal properties in CPS.
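One simple way to check whether measured computation times are consistent with polynomial growth is to fit a straight line to the timings on a log-log scale, since a relation of the form time ≈ c·n^k becomes linear with slope k. The sketch below is an illustration only and is not the procedure used in the experiments reported here. The function name `growth_exponent` is hypothetical and the timings are synthetic values generated from a quadratic law, not measurements from this study.

```python
import math
from typing import Sequence


def growth_exponent(sizes: Sequence[float], times: Sequence[float]) -> float:
    """Least-squares slope of log(time) against log(size)."""
    xs = [math.log(n) for n in sizes]
    ys = [math.log(t) for t in times]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Ordinary least-squares slope for a single predictor.
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    denominator = sum((x - mean_x) ** 2 for x in xs)
    return numerator / denominator


# Synthetic timings (assumption): exactly quadratic in the problem size.
sizes = [10, 20, 40, 80]
times = [0.002 * n ** 2 for n in sizes]
exponent = growth_exponent(sizes, times)
```

A fitted slope that stays roughly constant as larger problem sizes are added suggests polynomial growth, whereas a slope that keeps increasing would point to faster-than-polynomial behaviour.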
Complexity analysis provides a formal and widely accepted way to assess the computational feasibility of different algorithms in the literature. We follow this approach to compare our method with classical reachability analysis methods. Classical reachability analysis methods suffer from the exponential state explosion problem even for untimed Petri nets. In the literature, the reachability problem of untimed Petri nets is decidable but requires exponential space and time [46,47]. For timed Petri nets, the complexity is higher than for untimed Petri nets because the time factor must be considered, and classical reachability analysis methods for timed Petri nets also suffer from the state explosion problem. For the STN-based approach proposed in this paper, the lower bound of the computational complexity is polynomial with respect to the problem size parameters. Our experimental results are consistent with this lower bound. The proposed STN-based approach therefore has an advantage over classical reachability analysis methods in terms of computational feasibility for real problems. This advantage is due to the polynomial growth of the STN structure with respect to the problem size parameters, the compact representation of solutions and an effective way to search for solutions in the STN.
In this paper, we limit our scope to the robustness of CPS for manufacturing systems. The challenges vary with the domains in which CPS are applied. Therefore, robustness of CPS needs to be defined and studied in the context of a specific problem domain. Although this paper targets robustness of CPS for manufacturing systems, it is promising to extend the proposed method to other temporal analysis problems in CPS for manufacturing systems modelled with timed Petri nets. In addition to the analysis of the impact of resource failures addressed in this study, the widespread adoption of CPS raises other challenging research issues. Safety of CPS is another important issue that has attracted researchers' attention. For example, the study in [48] focuses on the safety property of settings in which multiple CPS collaborate. Many safety properties are relevant to temporal analysis. An interesting open issue is the feasibility of applying the proposed method to analyze the safety property of CPS.

Conclusions
With the maturity and adoption of the IoT and ICT infrastructure in CPS, availability of the real-time information from the shop floor enhances visibility of production status for manufacturers. The real-time information provided by the IoT and ICT infrastructure enables the development of an effective strategy to deal with undesirable events in CPS, such as resource failures. Resource failures may take place unexpectedly and have negative effects on the operations of CPS. Assessing the impact of resource failures on CPS is an important issue. Development of a systematic method to support analysis of the impact of resource failures on CPS is needed. This paper aims to propose a method to evaluate the effects of resource failures on operations and temporal properties of CPS by constructing residual spatial-temporal networks obtained based on transformation of a class of discrete timed Petri net models for CPS and the system state information. To evaluate the impact due to resource failures, we formulate a problem based on the residual spatial-temporal networks to check whether the order deadline is feasible after resource failures to efficiently assess the system. The proposed method is illustrated by examples. To assess practicality of the proposed method in terms of computational feasibility, we analyzed the computational complexity of the proposed algorithm and conducted experiments to study the growth of computational time required with respect to the number of tasks influenced due to resource failures and number of transitions in the processes. The analysis indicates that the lower bound on the complexity of the algorithm is polynomial with respect to problem size parameters. The numerical results of the experiments also indicate that the CPU time grows polynomially with respect to the number of tasks influenced due to resource failures and number of transitions in the processes. 
The numerical results are consistent with the complexity analysis and indicate that the proposed method can be applied to solve real problems. Formal models such as timed Petri nets have been widely accepted as tools for the specifying, modelling and simulation of systems. However, analysis methods of timed Petri nets are still limited. The method proposed in this paper attempts to analyze the temporal property of timed Petri nets based on transformation of the reachability problem into a problem relevant to the scheduling problem in the literature. Such transformation is nontrivial and provides an alternative approach to the analysis of timed Petri nets. One future research direction is to extend the proposed method to other types of processes. Although this study focuses on robustness of temporal property of CPS with respect to resource failures, this is not the only robustness research issue. There are other robustness issues of CPS due to the variety of uncertainties in the real world. These uncertainties include delay of raw materials/parts, unforeseen loss of manpower, arrival/cancellation of orders and variation of processing time of operations. Robustness of CPS can be studied for each type of uncertainty. In summary, there are a lot of promising future research directions relevant to robustness of CPS.
The algorithm proceeds to the second iteration.
Example 2: Consider a CPS that can produce two types of products. The production processes of the two types of products and the resources are modeled by Figure 1 and Figure 2, respectively. The CPS model of the type-1 product is shown in Figure 3. Suppose an order has arrived. The order requires three type-1 products and one type-2 product, with a deadline of AM 9:40. A time horizon starting from AM 8:00 is considered to handle the order. The time horizon is divided into time periods and the duration of each period is ten minutes. For this example, the order deadline AM 9:40 is represented by … = 10 (ten ten-minute periods after AM 8:00) and we set … = 10 in our problem formulation. The data of the order described by the above parameters are summarized in Table A1. The operations with associated transitions in the type-1 task subnet and the associated processing times of the resources are shown in Table 3. The operations with associated transitions in the type-2 task subnet and the associated processing times of the resources are shown in Table A2. … Figure A1b. This solution can meet the deadline of the order.
Suppose a resource fails at AM 8:00 and the failure is expected to be recovered by AM 8:20. The resource failure information is shown in Table A3. As there are 16 … As there are two types of products involved in the nominal solution, the algorithm first finds the set of process types that will be influenced by the failure. As both types of processes use the failed resource during the failure interval, both types are influenced. … The solution in Figure A2b is obtained by rerouting the flows in Figure A1b. The above solution can meet the deadline as the flow on every arc after the deadline is zero and Property 1 holds. Figure A3a and Figure A3b show the overall solution obtained by combining the solution (rerouted flows) with the uninfluenced flows for both types of products. The order deadline can be met by this solution as the flow on every arc after the deadline is zero.
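The deadline criterion used in both examples, namely that the flow on every arc after the deadline is zero, can be sketched as a simple check over a flow assignment. The representation below is an assumption made for illustration: each arc is keyed by a hypothetical triple (from node, to node, period in which the arc is active), and an arc counts as "after the deadline" when its period index is at or beyond the deadline period. The function name `meets_deadline` and the node names are hypothetical.

```python
from typing import Dict, Tuple

# (from_node, to_node, period in which the arc is active): assumed encoding
Arc = Tuple[str, str, int]


def meets_deadline(flows: Dict[Arc, int], deadline_period: int) -> bool:
    """Return True when no arc at or beyond the deadline period carries flow."""
    return all(
        flow == 0
        for (_, _, period), flow in flows.items()
        if period >= deadline_period
    )


# Hypothetical flow assignments for a deadline at period 10 (AM 9:40).
on_time = {("p1", "p2", 0): 3, ("p2", "p3", 9): 3, ("p3", "end", 10): 0}
late = {("p1", "p2", 0): 3, ("p2", "p3", 9): 2, ("p3", "end", 11): 1}

on_time_ok = meets_deadline(on_time, deadline_period=10)
late_ok = meets_deadline(late, deadline_period=10)
```

In this sketch the first assignment passes the check and the second does not, because one unit of flow is still in progress after the deadline period.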