Cooperative and Non-Cooperative Frameworks with Utility Function Design for Intermediate Deadline Assignment in Real-Time Distributed Systems

In real-time distributed systems, it is important to provide an offline guarantee of an upper-bound on each real-time task's end-to-end delay, which has been achieved by assigning proper intermediate deadlines to individual real-time tasks at each node. Although existing studies have successfully applied mathematical theories of distributed computation/control to intermediate deadline assignment, they have assumed that every task operates in a cooperative manner, which does not always hold in the real world. In addition, existing studies have not addressed how to exploit a trade-off between end-to-end delay fairness among real-time tasks and performance in terms of minimizing aggregate end-to-end delays. In this paper, we recapitulate an existing cooperative distributed framework and propose a non-cooperative distributed framework that can operate even with selfish tasks, each of which is only interested in minimizing its own end-to-end delay regardless of achieving the system goal. We then propose how to design utility functions that allow the real-time distributed system to exploit the trade-off. Finally, we demonstrate the validity of the cooperative and non-cooperative frameworks, along with the designed utility functions, via simulations.


Introduction
In real-time distributed systems, each real-time task executes on several nodes in a sequential manner subject to timing constraints. Therefore, it is important to provide predictable performance for the end-to-end delay of each real-time task, which implies that an upper-bound of the end-to-end delay should be provided before the system starts [1]. An example of such a real-time distributed system is an LCD video player [2]. In this system, multiple media processing applications compete for processing units, and each application is executed on the processing units in a sequential manner; also, the QoS (Quality of Service) of each application depends on its end-to-end delay. Since each application's QoS can be expressed as a time-sensitive utility function, the goal of the system is to maximize the collective utility of individual applications in the worst case. The system can then guarantee a QoS level at least as high as this worst-case collective utility. More examples of real-time distributed systems can be found in cloud data centers [3] and mobile crowdsourcing [4]; a delay-sensitive application is executed on a series of virtual machines in the former and on a series of mobile devices in the latter, and the QoS of the application is a function of its response time.
The end-to-end delay predictability required by a real-time distributed system has been achieved by assigning proper intermediate deadlines to individual real-time tasks executed on each node [1,5-17]. However, although existing studies have successfully applied mathematical theories of distributed computation/control (including convex optimization [18-20]) to intermediate deadline assignment, they have assumed that every task operates in a cooperative manner, which does not always hold in real-world situations. In addition, existing studies have not addressed how to exploit a trade-off between end-to-end delay fairness among real-time tasks and performance in terms of minimizing the total end-to-end delay of the entire real-time distributed system.
In this paper, we address two important issues for the intermediate deadline assignment problem for real-time distributed systems, which are related to non-cooperative tasks and utility function design. First, after recapitulating an existing cooperative distributed framework, we propose a non-cooperative distributed framework that operates even with selfish tasks, each of which is only interested in minimizing its own end-to-end delay regardless of achieving the system goal. To this end, we propose a penalty function that makes each task spontaneously restrain itself from behaving in a selfish manner, and present how to apply the penalty function to each task. Second, we propose a design principle for utility functions that enable the real-time distributed system to exploit a trade-off between end-to-end delay fairness among individual tasks and performance for minimizing total end-to-end delays of the entire system. This can be achieved by adjusting a parameter α to be detailed in Section 4. Then, our simulation results show the validity of the cooperative and non-cooperative frameworks in conjunction with the designed utility functions.
In summary, this paper makes the following contributions.
• We develop the first non-cooperative distributed framework for intermediate deadline assignment in real-time distributed systems.
• We design the first utility function that regulates a trade-off between delay fairness and performance for intermediate deadline assignment in real-time distributed systems.
• We demonstrate the validity of the existing cooperative framework and the proposed non-cooperative framework associated with the designed utility function.
The remainder of this paper is structured as follows. Section 2 presents our target problem and related work. Section 3 recapitulates an existing cooperative distributed framework and proposes a non-cooperative one with its deployment issues. Section 4 proposes a utility function design. Section 5 demonstrates the validity of the proposed techniques. Section 6 concludes the paper.

Target Problem and Related Work
In this section, we first explain our system model and then our target problem, which is the intermediate deadline assignment problem for real-time distributed systems. We next summarize existing studies related to the target problem.

Intermediate Deadline Assignment Problem
We target the intermediate deadline assignment problem for real-time distributed systems, which is defined in [14,15] as follows. We consider a real-time distributed system with a set of tasks $\tau_i \in \tau$ and a set of nodes $N_n \in N$. Let $|A|$ denote the size of a set $A$; therefore, $|\tau|$ and $|N|$ denote the number of tasks and nodes, respectively. Each task $\tau_i \in \tau$ consists of a series of $m_i$ subtasks (numbered from 1 to $m_i$), each of which is executed on exactly one node. Let $J_{(i,k,n)}$ denote the $k$-th subtask of $\tau_i$, executed on node $N_n$; we may omit $n$ in $J_{(i,k,n)}$ if $n$ is irrelevant. Also, we use $C_{(i,k)}$ to denote the worst-case execution time of $J_{(i,k)}$. Adjacent subtasks of $\tau_i$ (e.g., $J_{(i,k)}$ and $J_{(i,k+1)}$) execute in a sequential manner; that is, $J_{(i,k+1)}$ is ready to execute only after $J_{(i,k)}$ completes its execution. Each task $\tau_i$ is a sporadic task, meaning that its first subtask is released repeatedly with a minimum separation of $T_i$ time units.
Once we calculate the response time (i.e., the maximum local delay) that each subtask $J_{(i,k)}$ undergoes on its node (denoted by $d_{(i,k)}$), we can calculate the maximum end-to-end delay by $d_i = \sum_{k=1}^{m_i} d_{(i,k)}$. Each task has its own delay-sensitive utility function $U_i(d_i)$ that characterizes its QoS (Quality of Service) level, and the system utility $U_{sys}$ is defined as follows:
$$U_{sys} = \sum_{\tau_i \in \tau} U_i(d_i). \tag{1}$$
Since it is difficult to calculate the maximum end-to-end delay $d_i$ before the system starts (i.e., offline as opposed to online), existing studies assign $D_{(i,k)}$, the intermediate deadline of each task $\tau_i$ on each node $N_n$ (i.e., of $J_{(i,k,x)}$ with $x = n$). Then, as long as the following schedulability condition holds for all nodes, the local delay of each task executed on each node is guaranteed to be upper-bounded by its intermediate deadline:
$$\sum_{J_{(i,k,x)}: x = n} \frac{C_{(i,k)}}{D_{(i,k)}} \le UB_n, \tag{2}$$
where $UB_n$ denotes the utilization bound for node $N_n$. It is known that $UB_n = 1.0$, $UB_n = 0.69$ and $UB_n = 1.0 - \max_{J_{(i,k,x)}: x = n}$, if subtasks on node $N_n$ are scheduled by preemptive EDF (Earliest Deadline First) [21], preemptive DM (Deadline Monotonic) [22], and non-preemptive EDF [23], respectively.
Then, considering that $d_i \le \sum_{k=1}^{m_i} D_{(i,k)}$ holds for all $\tau_i \in \tau$, the intermediate deadline assignment problem that we target is to maximize the system utility subject to the node schedulability conditions, which is formally stated in the following primal problem [14,15]:
$$\text{Maximize:} \quad \sum_{\tau_i \in \tau} U_i\Big(\sum_{k=1}^{m_i} D_{(i,k)}\Big) \tag{3}$$
$$\text{Subject to:} \quad \sum_{J_{(i,k,x)}: x = n} \frac{C_{(i,k)}}{D_{(i,k)}} \le UB_n, \quad \forall N_n \in N. \tag{4}$$
Note that every intermediate deadline should be no smaller than the corresponding worst-case execution time (i.e., $D_{(j,l)} \ge C_{(j,l)}$ for every pair $(j, l)$) to satisfy the node schedulability conditions in Equation (4).
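The node schedulability conditions and the guaranteed system utility can be checked numerically for a candidate deadline assignment. The following Python sketch is illustrative only: the function names, the tuple layout, and the two-task example are assumptions made here, not part of [14,15]. It assumes that the node condition takes the utilization form (the sum of $C_{(i,k)}/D_{(i,k)}$ on a node must not exceed $UB_n$) and, purely for illustration, uses the linear utility $U_i(x) = -x$.

```python
from collections import defaultdict


def node_utilization(subtasks):
    """Sum C/D over the subtasks placed on each node.

    subtasks: list of (task, node, wcet, deadline) tuples (hypothetical layout).
    """
    load = defaultdict(float)
    for _task, node, wcet, deadline in subtasks:
        if deadline < wcet:
            raise ValueError("an intermediate deadline must be >= its WCET")
        load[node] += wcet / deadline
    return dict(load)


def is_schedulable(subtasks, ub):
    """True if every node satisfies sum(C/D) <= UB_n (small float tolerance)."""
    load = node_utilization(subtasks)
    return all(u <= ub[node] + 1e-12 for node, u in load.items())


def guaranteed_system_utility(subtasks, utility=lambda x: -x):
    """Sum of U_i evaluated at each task's end-to-end deadline sum."""
    end_to_end = defaultdict(float)
    for task, _node, _wcet, deadline in subtasks:
        end_to_end[task] += deadline
    return sum(utility(d) for d in end_to_end.values())


# Toy example (assumed values): two tasks, each visiting node "a" then "b".
example = [
    ("t1", "a", 10.0, 20.0), ("t1", "b", 10.0, 25.0),
    ("t2", "a", 10.0, 20.0), ("t2", "b", 15.0, 25.0),
]
ub = {"a": 1.0, "b": 1.0}  # preemptive EDF bound on both nodes

print(node_utilization(example))
print(is_schedulable(example, ub))
print(guaranteed_system_utility(example))
```

Because the utility is evaluated at the assigned deadlines rather than at the actual delays, the returned value is a worst-case guarantee: any assignment that passes `is_schedulable` yields at least this system utility at run time.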

Related Work
Several existing studies have addressed the intermediate deadline assignment problem for real-time distributed systems; for example, a straightforward deadline distribution technique [7], the critical scaling factor technique [6], and the excess of response time technique [5] have been proposed. In addition, a few studies focus on parallel subtasks; they usually aim at handling interference between subtasks [9,10]. On the other hand, some studies handle pipelined tasks using end-to-end delay analysis techniques [12,13], and the end-to-end deadline guarantee has also been applied to utilizing optional (i.e., non-mandatory) execution parts [11]. One study has improved the schedulability condition for the intermediate deadline assignment problem [16]. Recently, a study has tried to solve the intermediate deadline assignment problem using machine learning techniques [17]. While the above studies are well summarized in [1], they assumed that a centralized computing unit can control all tasks, and did not consider a distributed framework.
To solve the intermediate deadline assignment problem in a distributed manner, convex optimization has been utilized in some studies. That is, they applied a Lagrange dual function, and then found a sequence that converges to the optimal solution by utilizing decomposition techniques with descent algorithms. Such decomposition has been applied to the intermediate deadline assignment problem for real-time distributed systems [14,15]. However, all existing studies assume that all tasks are cooperative in achieving the system goal, which does not always hold in the real world. In addition, there has been no study on how to exploit a trade-off between delay fairness among individual tasks and performance in achieving the system goal.
Different from existing studies, this paper (i) proposes the first non-cooperative distributed framework and (ii) designs a utility function that exploits a trade-off between delay fairness and system performance, both for the intermediate deadline assignment problem in real-time distributed systems.

Distributed Framework for Intermediate Deadline Assignment
In this section, we present how to solve the primal problem (defined in Section 2.1) in a distributed manner, assuming that the utility function $U_i$ for each task is given. To this end, we recapitulate the existing cooperative framework in which every task spontaneously collaborates on achieving the system goal (i.e., solving the primal problem). We then propose a non-cooperative framework, which can operate even when every task tries to maximize its own utility regardless of achieving the system goal. Finally, we discuss deployment issues for the cooperative and non-cooperative frameworks.

Existing Cooperative Distributed Framework
The primal problem in Section 2.1 can be solved using various techniques (e.g., Shor's r-algorithm [24]) if a centralized computing unit is capable of controlling all tasks. However, in many real environments, centralized computing/control is impossible or requires non-negligible overhead, which necessitates a framework with distributed computing/control. For the distributed framework, existing studies [14,15] utilize Lagrange duality; the primal problem in Section 2.1 is transformed into its dual problem using Lagrange multipliers [25] as follows:
$$\min_{p \ge 0} \; \max_{D} \; L(D, p), \qquad L(D, p) = \sum_{\tau_i \in \tau} U_i\Big(\sum_{k=1}^{m_i} D_{(i,k)}\Big) - \sum_{N_n \in N} p_n \Big(\sum_{J_{(i,k,x)}: x = n} \frac{C_{(i,k)}}{D_{(i,k)}} - UB_n\Big). \tag{5}$$
In the above dual problem, a node price $p_n$ is the Lagrange multiplier for the schedulability condition of node $N_n$. If the primal problem is a convex optimization problem, strong duality holds with non-negative node prices $p_n$ [25]. This means that the solution of the primal problem is equivalent to that of the dual problem. To make the primal problem in Section 2.1 convex, the utility functions and the schedulability constraints should be concave. Since $D_{(j,l)} \ge C_{(j,l)}$ holds for every pair $(j, l)$, the schedulability constraints in (4) are concave, as follows [14,15]:
$$\frac{\partial^2}{\partial D_{(j,l)}^2} \Big(UB_n - \sum_{J_{(i,k,x)}: x = n} \frac{C_{(i,k)}}{D_{(i,k)}}\Big) = -\frac{2\,C_{(j,l)}}{D_{(j,l)}^3} < 0. \tag{6}$$
Therefore, if the utility functions are concave, we can solve the dual problem using distributed computation/control without any loss of performance. That is, by applying the gradient projection algorithm [26,27], we can find the optimal solution iteratively. Node prices $p_n$ can also be calculated in an iterative manner, as follows [14,15]:
$$p_n(t+1) = \Big[\, p_n(t) - \gamma_n \Big(UB_n - \sum_{J_{(i,k,x)}: x = n} \frac{C_{(i,k)}}{D_{(i,k)}(t)}\Big) \Big]^+, \tag{7}$$
where $[x]^+$ denotes $\max(0, x)$.
The constants $\gamma_n$ are step sizes, which determine the rate of convergence of the iteration. If they are sufficiently small with respect to the Lipschitz continuity condition [26], the iteration is guaranteed to converge. We can calculate the intermediate deadline $D_{(i,k)}(t+1)$ of subtask $J_{(i,k,n)}$ by solving the following first-order (differential) condition, evaluated at the current node prices [14,15]:
$$U_i'\Big(\sum_{l=1}^{m_i} D_{(i,l)}\Big) + p_n \frac{C_{(i,k)}}{D_{(i,k)}^2} = 0. \tag{8}$$
The iteration halts when the following inequality holds for all $D_{(i,k)}$:
$$\big| D_{(i,k)}(t+1) - D_{(i,k)}(t) \big| \le \epsilon, \tag{9}$$
where $\epsilon$ is a sufficiently small positive number that regulates a trade-off between the solution's accuracy and the rate of convergence. This halting condition has been widely used in gradient algorithms [26]. Then, as long as every task follows Equation (8), the cooperative distributed framework operates correctly.
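The price and deadline iteration described above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the implementation of [14,15]: it assumes the node condition "sum of $C/D$ on a node is at most $UB_n$", the utility family $U_i(x) = -x^{1-\alpha}/(1-\alpha)$ of Section 4, and a per-task deadline update obtained by setting the derivative of the Lagrangian to zero (which has a closed form for this family). The names `ROUTES`, `solve_dual`, `gamma`, `p0`, and `MIN_PRICE` are hypothetical, the price floor replaces the exact projection onto zero to avoid zero deadlines, and the default step size is tuned only for $\alpha = 0$. The topology and execution times are those of Section 5.

```python
import math

# Task -> list of (node, worst-case execution time), as in Section 5.
ROUTES = {
    1: [("a", 10.0), ("b", 10.0), ("c", 10.0)],
    2: [("d", 15.0), ("e", 15.0), ("f", 15.0)],
    3: [("g", 20.0), ("h", 20.0), ("i", 20.0)],
    4: [("a", 10.0), ("d", 10.0), ("g", 10.0)],
    5: [("b", 15.0), ("e", 15.0), ("h", 15.0)],
    6: [("c", 20.0), ("f", 20.0), ("i", 20.0)],
}

MIN_PRICE = 1e-9  # assumed floor; keeps every deadline strictly positive


def task_deadlines(route, price, alpha):
    """Deadlines of one task that zero the Lagrangian derivative.

    With U(x) = -x**(1-alpha)/(1-alpha), stationarity gives
    D_k = sqrt(p_n * C_k) * D_i**(alpha/2), where D_i = sum_k D_k,
    hence D_i = S**(2/(2-alpha)) with S = sum_k sqrt(p_n * C_k).
    """
    roots = [math.sqrt(price[node] * wcet) for node, wcet in route]
    total = sum(roots) ** (2.0 / (2.0 - alpha))
    scale = total ** (alpha / 2.0)
    return [r * scale for r in roots]


def evaluate(routes, price, alpha):
    """Return per-task deadlines and per-node utilization for given prices."""
    deadlines = {i: task_deadlines(r, price, alpha) for i, r in routes.items()}
    load = {n: 0.0 for n in price}
    for i, route in routes.items():
        for (node, wcet), d in zip(route, deadlines[i]):
            load[node] += wcet / d
    return deadlines, load


def solve_dual(routes, ub=1.0, alpha=0.0, gamma=20.0, p0=50.0, iters=500):
    """Projected-gradient price updates with best-response deadlines."""
    price = {node: p0 for route in routes.values() for node, _ in route}
    for _ in range(iters):
        _, load = evaluate(routes, price, alpha)
        for n in price:
            # Price rises when the node is overloaded, falls otherwise.
            price[n] = max(MIN_PRICE, price[n] - gamma * (ub - load[n]))
    deadlines, load = evaluate(routes, price, alpha)
    return deadlines, price, load


deadlines, price, load = solve_dual(ROUTES)
total = sum(sum(d) for d in deadlines.values())
print(round(total, 1), {n: round(u, 3) for n, u in sorted(load.items())})
```

With the default $\alpha = 0$, the sketch drives every node to full utilization and yields a total of assigned deadlines of roughly 534.8, in line with the value reported for Primal and Coop in Section 5. Other values of $\alpha$ change the scale of the prices, so `gamma` and `p0` would need to be retuned.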

Proposed Non-Cooperative Distributed Framework
The dual problem presented in Section 3.1 assumes that all tasks are cooperative and collaborate on achieving the system goal (i.e., solving the primal problem). However, in many distributed systems, each task may be interested only in maximizing its own utility regardless of achieving the system goal. In this case, the optimization goal of each task $\tau_i$ is the individual-level optimization problem without penalty (Equation (10)), in which $\tau_i$ maximizes only its own utility $U_i\big(\sum_{k=1}^{m_i} D_{(i,k)}\big)$. If each task pursues this goal, it no longer follows the distributed computing process explained in Section 3.1. Therefore, the node schedulability conditions in Equation (4) tend to be violated, as each task tries to monopolize the utilization (i.e., the left-hand side of the node schedulability conditions in Equation (4)) by decreasing its intermediate deadlines as much as possible. This means that the local delay of each task on each node can no longer be upper-bounded by the intermediate deadline; therefore, no task actually has any guaranteed utility. To avoid the violation of the node schedulability conditions, we impose a penalty on each task according to the utilization it contributes to the node schedulability conditions. Once we apply a penalty function to each task, the optimization problem of each task $\tau_i$ becomes the individual-level optimization problem with penalty (Equation (11)), in which $\tau_i$ maximizes its own utility reduced by the penalty imposed on it. The most important design guideline for the penalty function is that each task should meet the node schedulability conditions in Equation (4) when it simply maximizes Equation (11) without considering those conditions. Therefore, we apply the following two principles for designing the penalty function.
First, the penalty increases as the remaining budget of the node schedulability (i.e., $UB_n - \sum_{J_{(i',k',x)}: x = n} C_{(i',k')}/D_{(i',k')}$) approaches zero; in particular, as the remaining budget converges to zero, the penalty grows without bound, which prevents the violation of the node schedulability conditions. Second, the penalty increases as the budget of the node schedulability used by each task (i.e., $C_{(i,k)}/D_{(i,k)}$) gets larger. Using the two principles, we design the penalty function for each task $\tau_i$, given in Equation (12). Note that this penalty function satisfies both design principles by construction.
Then, instead of maximizing the system utility, each task solves Equation (11) with the penalty in Equation (12). The advantage of this framework is that it guarantees a certain amount of system utility even if each task is non-cooperative in achieving the system utility. However, this framework cannot achieve as much system utility as the cooperative framework in Section 3.1, which is an inevitable cost of applying the penalty functions. The next subsection will show how to impose the penalty on each task, and Section 5 will demonstrate the difference between the system utility achieved by the cooperative distributed framework explained in Section 3.1 and that achieved by the non-cooperative one proposed in Section 3.2.
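To make the two design principles concrete, the following Python sketch shows one barrier-style function that satisfies both of them. It is a hypothetical illustration, not the penalty function of Equation (12): the name `illustrative_penalty`, its arguments, and the ratio form are assumptions chosen only because they grow without bound as the remaining node budget vanishes and grow with the task's own share of the node utilization.

```python
import math


def illustrative_penalty(own_usage, node_usage, ub=1.0):
    """Hypothetical barrier-style penalty for one subtask on one node.

    own_usage:  C/D of the task's own subtask on the node.
    node_usage: sum of C/D over all subtasks on the node (own included).
    ub:         utilization bound UB_n of the node.
    """
    remaining = ub - node_usage
    if remaining <= 0.0:
        # No budget left: the penalty is unbounded (first principle).
        return math.inf
    # Larger own share -> larger penalty (second principle).
    return own_usage / remaining


# The penalty rises as the node fills up ...
print(illustrative_penalty(0.5, 0.6), illustrative_penalty(0.5, 0.9))
# ... and as the task itself consumes more of the node.
print(illustrative_penalty(0.3, 0.9), illustrative_penalty(0.5, 0.9))
```

Any function with these two monotonicity properties discourages a selfish task from shrinking its deadlines until a node is saturated; the specific shape determines how much system utility is sacrificed compared with the cooperative framework.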

Deployment of Distributed Framework
In this subsection, we first discuss an important deployment issue for the non-cooperative distributed framework, namely how to apply the penalty function to each task. We then explain which information should be exchanged, and how, for the cooperative distributed framework and then for the non-cooperative one.
As to the proposed non-cooperative distributed framework, we need to address an important deployment issue: how to apply the penalty function to each task. To this end, we appoint the node on which each task executes last as the arbitrator node of that task. The role of the arbitrator node is to impose on each task $\tau_i$ a penalty equal to the one in Equation (12). Since the utility of each task is a function of its end-to-end delay, the only way for an arbitrator node to impose a penalty is to add an artificial delay. To this end, the arbitrator node calculates the $D_i$ that satisfies the corresponding penalty condition (Equation (13)).
Once the last subtask of a task $\tau_i$ (i.e., $J_{(i,m_i,x)}$ with $x = n$) is scheduled with other subtasks on a node $N_n$, the node (as the arbitrator node for $\tau_i$) intentionally delays the execution of $J_{(i,m_i)}$ such that the total delay of $\tau_i$ is close to $D_i$ (but not larger than $D_i$). By this postponement, the arbitrator node can impose a penalty equal to that of Equation (12), which in turn makes each task voluntarily operate without monopolizing the node utilization.
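The conversion of a penalty into an artificial delay can be sketched as follows. This is an assumption about how such a conversion could work, not a restatement of Equation (13): the sketch assumes that the arbitrator targets the end-to-end delay whose utility equals the original utility minus the penalty, and that the utility has the form $U_i(x) = -x^{1-\alpha}/(1-\alpha)$ used in Section 5. The function names are hypothetical.

```python
def utility(x, alpha=0.0):
    """U(x) = -x**(1-alpha) / (1-alpha), for alpha <= 0."""
    return -(x ** (1.0 - alpha)) / (1.0 - alpha)


def delayed_deadline(deadline_sum, penalty, alpha=0.0):
    """Delay target D' such that U(D') = U(deadline_sum) - penalty.

    Inverting the utility gives
    D' = (deadline_sum**(1-alpha) + (1-alpha) * penalty) ** (1/(1-alpha)).
    """
    if penalty < 0.0:
        raise ValueError("the penalty must be non-negative")
    e = 1.0 - alpha
    return (deadline_sum ** e + e * penalty) ** (1.0 / e)


# Linear utility: a penalty of 5 utility units is 5 extra time units.
print(delayed_deadline(100.0, 5.0, alpha=0.0))
# Quadratic utility: the same mechanism needs a much smaller extra delay.
print(delayed_deadline(100.0, 50.0, alpha=-1.0))
```

Since the utility is strictly decreasing in the delay, the inverse is unique, so the arbitrator node can always translate a finite penalty into a finite postponement of the last subtask.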
We now explain the information exchange issue for the cooperative distributed framework. As shown in Equation (7), each node $N_n$ needs to know the current intermediate deadlines of all subtasks executed on the node, i.e., $D_{(i,k)}$ for all $J_{(i,k,x)}$ with $x = n$. In addition, as shown in Equation (8), each task $\tau_i$ needs to know the node price $p_n$ of every node $N_n$ on which the task is executed. The information needed by each node and each task can be piggybacked on each task itself or on its control messages [28,29].
When it comes to the non-cooperative distributed framework, the information exchange issue is similar to that of the cooperative one. That is, each arbitrator node $N_n$ needs to know all the intermediate deadlines of the tasks that execute last on the node, i.e., all $\tau_i$ satisfying $J_{(i,m_i,x)}$ with $x = n$, as shown in Equation (12). Also, each task $\tau_i$ needs to know (i) the current intermediate deadlines of all subtasks of $\tau_i$ and (ii) the utilization of all nodes on which $\tau_i$ is executed. As in the cooperative distributed framework, this information can be piggybacked.

Utility Function Design
In Section 3, we explained that the cooperative and non-cooperative distributed frameworks can successfully work if a concave utility function for each task is given. Then, what if a system designer has a chance to determine utility functions for a real-time distributed system? We now propose a guideline for determining utility functions for a distributed real-time system, in order to help the system designer exploit a trade-off between performance (i.e., the system utility) and fairness among the end-to-end delays of individual tasks. To this end, we define the following utility functions:
$$U_i(D_i) = -\frac{D_i^{\,1-\alpha}}{1-\alpha}, \tag{14}$$
where $\alpha \le 0$ and $D_i = \sum_{k=1}^{m_i} D_{(i,k)}$. Figure 1 illustrates the utility functions. With $\alpha = 0$, the system goal is to maximize the sum of the utilities $\sum_i U_i(D_i) = -\sum_i D_i$ (which is equivalent to minimizing $\sum_i D_i$). Therefore, this system goal is interpreted as minimizing the total sum of assigned deadlines, and it does not pay attention to the fairness of end-to-end delay upper-bounds among individual tasks. On the other hand, as $\alpha$ becomes smaller, the system tries to achieve a higher degree of end-to-end delay fairness. Finally, with $\alpha$ converging to $-\infty$, the system goal is equivalent to minimizing the longest end-to-end delay of a task (i.e., $\min \max_i D_i$), which achieves the highest degree of min-max fairness among the end-to-end delay upper-bounds of individual tasks.
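The effect of $\alpha$ on the fairness and performance trade-off can be seen in a small numerical example. The Python sketch below is illustrative: the two candidate assignments are made-up values with the same total delay, and the function names are assumptions. It evaluates the utility family $U_i(x) = -x^{1-\alpha}/(1-\alpha)$ for the four values of $\alpha$ considered in Section 5.

```python
def utility(x, alpha):
    """U(x) = -x**(1-alpha) / (1-alpha), defined here for alpha <= 0."""
    if alpha > 0:
        raise ValueError("this sketch only covers alpha <= 0")
    return -(x ** (1.0 - alpha)) / (1.0 - alpha)


def system_utility(end_to_end_deadlines, alpha):
    """Sum of the individual utilities."""
    return sum(utility(d, alpha) for d in end_to_end_deadlines)


# Two hypothetical assignments with the same total delay of 180.
unfair = [60.0, 120.0]
fair = [90.0, 90.0]

for alpha in (0, -1, -2, -3):
    print(alpha, system_utility(unfair, alpha), system_utility(fair, alpha))
```

With $\alpha = 0$ the two assignments are indistinguishable because only the total matters, whereas for every $\alpha < 0$ the balanced assignment has the higher system utility, and the gap widens as $\alpha$ decreases. This is the mechanism by which a smaller $\alpha$ steers the optimization toward fairer end-to-end deadlines.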

Experiments
In this section, we show how the cooperative distributed framework in Section 3.1 and the non-cooperative one in Section 3.2 operate with the utility function designed in Section 4. To this end, we target a typical real-time distributed system shown in Figure 2. The system consists of a set of 9 nodes $N = \{N_a, N_b, N_c, N_d, N_e, N_f, N_g, N_h, N_i\}$ and a set of 6 tasks $\tau = \{\tau_1, \tau_2, \tau_3, \tau_4, \tau_5, \tau_6\}$. Each task executes on three different nodes sequentially; for example, $\tau_1$ is executed on $N_a$, $N_b$ and $N_c$ in a sequential manner, and $\tau_4$ is executed on $N_a$, $N_d$ and $N_g$ in a sequential manner, as shown in Figure 2. We set $C_{(1,a)}$, $C_{(1,b)}$, $C_{(1,c)}$, $C_{(2,d)}$, $C_{(2,e)}$, $C_{(2,f)}$, $C_{(3,g)}$, $C_{(3,h)}$, $C_{(3,i)}$, $C_{(4,a)}$, $C_{(4,d)}$, $C_{(4,g)}$, $C_{(5,b)}$, $C_{(5,e)}$, $C_{(5,h)}$, $C_{(6,c)}$, $C_{(6,f)}$ and $C_{(6,i)}$ to 10, 10, 10, 15, 15, 15, 20, 20, 20, 10, 10, 10, 15, 15, 15, 20, 20 and 20, respectively. We set the period of each task to 40. Each node employs preemptive EDF (Earliest Deadline First) scheduling, meaning that $UB_n = 1$ holds for all nodes. We plug $U_i(x) = -\frac{x^{1-\alpha}}{1-\alpha}$ in Equation (14) into Equations (3), (5) and (11), for the primal problem, the dual problem, and the individual-level optimization problem with penalty, respectively, where $x$ denotes $D_i = \sum_{k=1}^{m_i} D_{(i,k)}$. We consider four options for $\alpha$: 0, −1, −2 and −3, which means that the utility functions are $U_i(x) = -x$, $-\frac{1}{2}x^2$, $-\frac{1}{3}x^3$, and $-\frac{1}{4}x^4$, respectively. We use MATLAB [30] to implement the cooperative and non-cooperative frameworks as well as to solve the primal problem directly.
Tables 1-3 report the system utility, the sum of the assigned intermediate deadlines (i.e., the total delay upper-bound) and the standard deviation among individual tasks' end-to-end delay upper-bounds, obtained by solving the primal problem directly (denoted by Primal), by the cooperative framework in Section 3.1 (denoted by Coop), and by the non-cooperative framework in Section 3.2 (denoted by Non-Coop). That is, Tables 1-3 show the results of the centralized framework, the distributed framework for cooperative tasks, and the distributed framework for non-cooperative tasks, respectively. We make three observations from the experimental results in the three tables.
First, Primal and Coop yield the same results, including the same assigned intermediate deadlines $D_{(1,a)}$, $D_{(1,b)}$, $D_{(1,c)}$, $D_{(2,d)}$, $D_{(2,e)}$, $D_{(2,f)}$, $D_{(3,g)}$, $D_{(3,h)}$, $D_{(3,i)}$, $D_{(4,a)}$, $D_{(4,d)}$, $D_{(4,g)}$, $D_{(5,b)}$, $D_{(5,e)}$, $D_{(5,h)}$, $D_{(6,c)}$, $D_{(6,f)}$, and $D_{(6,i)}$. This comes from the fact that the primal problem in Section 2.1 is convex as long as the utility functions are concave, as we mentioned in Section 3.1. On the other hand, the system utility achieved by Coop is different from that achieved by Non-Coop. For example, the system utility with $U_i(x) = -x$ under Coop and that under Non-Coop are $-5.348 \times 10^2$ and $-5.601 \times 10^2$, respectively. The difference is the cost of making the real-time distributed system operate correctly in the presence of selfish tasks.
Second, as $\alpha$ in Equation (14) decreases, the sum of delay upper-bounds increases and the standard deviation of individual tasks' delay upper-bounds decreases under Primal and Coop. For example, under Primal and Coop, the sum of delay upper-bounds is 534.8 with $U_i(x) = -x$ (i.e., $\alpha = 0$), and increases up to 543.0 with $U_i(x) = -\frac{1}{4}x^4$ (i.e., $\alpha = -3$). At the same time, the standard deviation of individual tasks' delay upper-bounds is 20.1 with $U_i(x) = -x$ (i.e., $\alpha = 0$), and decreases down to 11.6 with $U_i(x) = -\frac{1}{4}x^4$ (i.e., $\alpha = -3$). This demonstrates that the proposed utility function succeeds in exploiting a trade-off between increasing the fairness of individual tasks' end-to-end delay upper-bounds and decreasing the total end-to-end delay of all tasks.
Third, as $\alpha$ in Equation (14) decreases, the sum of delay upper-bounds increases under Non-Coop as well, while the standard deviation of individual tasks' delay upper-bounds does not decrease. For example, under Non-Coop, the sum of delay upper-bounds is 560.1 with $U_i(x) = -x$ (i.e., $\alpha = 0$), and increases up to 611.4 with $U_i(x) = -\frac{1}{4}x^4$ (i.e., $\alpha = -3$). The standard deviation of individual tasks' delay upper-bounds does not decrease as $\alpha$ decreases because of the penalty function. Based on this observation, we suggest that, for the non-cooperative framework, the system designer should not decrease $\alpha$ in the proposed utility function in an attempt to achieve fairness among the end-to-end delays of individual tasks. Instead, it is better for the system designer to reduce the sum of delay upper-bounds as much as possible by assigning $\alpha = 0$ when the target real-time distributed system operates in a non-cooperative manner.
While we have presented the experimental results in terms of performance and fairness of delay upper-bounds, one may wonder about the convergence of the distributed framework. We now present a representative case of how each assigned end-to-end deadline converges. In the case of the cooperative framework with $U_i(x) = -\frac{1}{2}x^2$ (i.e., the second case in Table 2), Figure 3 shows the assigned end-to-end deadlines of $\tau_1$, $\tau_2$ and $\tau_3$ (i.e., $D_1$, $D_2$ and $D_3$) according to the number of iterations (1 to 301 on the X-axis). Note that the results for $D_4$, $D_5$ and $D_6$ are the same as those for $D_1$, $D_2$ and $D_3$, respectively, due to the topology symmetry. The initial values of $D_1$, $D_2$ and $D_3$ are set to 120.0, and $\gamma_n$ for every $N_n \in N$ in Equation (7) is set to 1.0. As shown in the figure, the assigned end-to-end deadlines $D_1$, $D_2$ and $D_3$ rapidly converge to 71.13, 89.83 and 107.36, respectively. The difference between the final converged deadline and the assigned deadline after 64 iterations is smaller than 1.0 for every $D_i$, and that after 110 iterations is smaller than 0.1 for every $D_i$. This demonstrates that the convergence rate of the distributed framework with the proposed utility function is high in the representative case; the same behavior can be observed in other cases.
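The two summary metrics used in this section can be recomputed from the converged end-to-end deadlines quoted above. The Python sketch below is a worked arithmetic example: it takes the three reported values for $D_1$, $D_2$ and $D_3$, reuses them for $D_4$, $D_5$ and $D_6$ by symmetry, and assumes that the standard deviation is the sample standard deviation (an assumption, since the estimator is not stated in the text).

```python
import statistics

# Converged end-to-end deadlines reported for the cooperative framework
# with U_i(x) = -x**2 / 2 (alpha = -1).
converged = {1: 71.13, 2: 89.83, 3: 107.36}
# D_4, D_5, D_6 equal D_1, D_2, D_3 by topology symmetry.
converged.update({4: converged[1], 5: converged[2], 6: converged[3]})

values = list(converged.values())
total_upper_bound = sum(values)          # sum of delay upper-bounds
spread = statistics.stdev(values)        # sample standard deviation (assumed)

print(round(total_upper_bound, 2), round(spread, 2))
```

The resulting total lies between the totals quoted for $\alpha = 0$ and $\alpha = -3$, and the spread lies between the two quoted standard deviations, which is consistent with the monotone trend described in the second observation.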

Conclusions
In this paper, we focused on the intermediate deadline assignment problem in real-time distributed systems, and addressed two important issues. First, we developed a non-cooperative distributed framework that can operate with selfish tasks, each of which is only interested in maximizing its own utility. Second, we proposed a principle for designing utility functions that can exploit a trade-off between minimizing the aggregate end-to-end delay of the entire system and maximizing fairness among the end-to-end delays of individual tasks. In addition, we demonstrated the validity of the two techniques via simulations. By addressing the two important issues, we (i) enable real-time distributed systems to operate even in the presence of selfish tasks, and (ii) offer guidelines on how to design real-time distributed systems in consideration of a trade-off between performance and fairness.