# Task-Level Re-Execution Framework for Improving Fault Tolerance on Symmetry Multiprocessors


## Abstract


## 1. Introduction

- It proposes the FT policy, which improves the reliability of a target system scheduled by a given real-time scheduling algorithm without sacrificing schedulability.
- It proposes a new deadline-based schedulability analysis designed for the re-execution technique, which can be incorporated into the FT policy.
- As a case study, it incorporates the FT policy into FP and EDZL scheduling.
- The conducted experiments demonstrate that the FT policy dramatically improves performance over the existing techniques (which use a predetermined ${\lambda}_{k}$) when schedulability and reliability are considered simultaneously.

## 2. The System Model

#### 2.1. The Task Model

#### 2.2. The Fault Model

## 3. The Fault-Tolerant Scheduling Framework

#### 3.1. The Scheduling Algorithm Incorporating FT Policy

- **Q1.** How can ${\lambda}_{k}$ of ${\tau}_{k}$ be determined without compromising the schedulability of ${\tau}_{k}$?
- **Q2.** How can ${\lambda}_{k}$ of ${\tau}_{k}$ be determined without compromising the schedulability of the other tasks ${\tau}_{i}$?

**Algorithm 1.** The FT-policy-incorporated scheduling algorithm.

1. ${\lambda}_{k}$ for every ${\tau}_{k}$ is assigned by a given ${\lambda}_{k}$-assignment algorithm (Algorithm 2)
2. **for** every time instant $t$ **do**
3. **if** ${J}_{k}^{q}$ is released by ${\tau}_{k}$ **then**
4. Insert ${J}_{k}^{q}$ into ${Q}_{ready}$
5. **end if**
6. Schedule jobs in ${Q}_{ready}$ according to a given base scheduling algorithm
7. **if** ${\lambda}_{k}-1$ executions have been completed for ${J}_{k}^{q}$ and a fault is detected **then**
8. Execute ${J}_{k}^{q}$ again
9. **end if**
10. **if** ${J}_{k}^{q}$ finishes its execution **then**
11. Delete ${J}_{k}^{q}$ from ${Q}_{ready}$
12. **end if**
13. **end for**
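The re-execution rule in steps 7 to 9 of Algorithm 1 can be illustrated for a single job in isolation. The following is a minimal Python sketch under one reading of that rule, namely that ${\lambda}_{k}$ bounds the total number of executions of a job and that a job is re-executed only while a fault is detected and executions remain. The function name `run_job_with_reexecution` and the `execute` hook are hypothetical and do not come from the paper; the ready queue and the base scheduler are deliberately left out.

```python
from typing import Callable, Tuple


def run_job_with_reexecution(
    lam: int, execute: Callable[[int], bool]
) -> Tuple[int, bool]:
    """Run one job at most `lam` times (assumed meaning of lambda_k).

    `execute(attempt)` is a hypothetical hook: it runs the job once and
    returns True when a fault is detected in that execution.
    Returns (number of executions used, whether a fault-free run occurred).
    """
    attempts = 0
    while attempts < lam:
        attempts += 1
        fault_detected = execute(attempts)
        if not fault_detected:
            # A fault-free execution completes the job.
            return attempts, True
    # All permitted executions were faulty (or lam was 0).
    return attempts, False


# Example: a transient fault hits only the first execution.
print(run_job_with_reexecution(2, lambda attempt: attempt == 1))  # (2, True)
```

With `lam = 1` the sketch degenerates to plain execution without any re-execution, which matches the role of ${\lambda}_{k}=1$ in the baseline configurations of the evaluation.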

#### 3.2. Schedulability Analysis

**Lemma 1.** Suppose that a task set τ is scheduled by a global, preemptive, and work-conserving algorithm. Then, τ is schedulable if the following inequality holds for all ${\tau}_{k}\in \tau $.

**Proof.**

**Theorem 1.**

**Proof.**

#### 3.3. The ${\lambda}_{k}$-Assignment Algorithm

**Algorithm 2.** The ${\lambda}_{k}$-assignment algorithm.

1. ${\lambda}_{k}\leftarrow 0$ for all tasks ${\tau}_{k}\in \tau $
2. **for** ${\tau}_{j}$, from the first task to the last one selected by a given selection algorithm, **do**
3. **while** $\tau $ is deemed schedulable by Theorem 1 and ${\lambda}_{j}\cdot {C}_{j}\le {D}_{j}$ holds **do**
4. ${\lambda}_{j}\leftarrow {\lambda}_{j}+1$
5. **end while**
6. ${\lambda}_{j}\leftarrow {\lambda}_{j}-1$
7. **end for**
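The greedy structure of Algorithm 2 can be sketched in Python as follows. The schedulability test of Theorem 1 is not reproduced here, so the sketch takes the test as a caller-supplied function; the stand-in `utilization_test` used in the example is an assumption made only for illustration (a simple utilization bound on `m` processors that counts at least one execution per task) and is not the paper's analysis. The names `assign_lambdas`, `utilization_test`, and the `(C, D, T)` tuple layout are likewise hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

Task = Tuple[float, float, float]  # (C, D, T): execution time, deadline, period


def assign_lambdas(
    tasks: Sequence[Task],
    order: Sequence[int],
    is_schedulable: Callable[[Sequence[Task], Sequence[int]], bool],
) -> List[int]:
    """Greedy lambda assignment following the steps of Algorithm 2."""
    lam = [0] * len(tasks)                      # step 1
    for j in order:                             # step 2: given selection order
        c_j, d_j, _ = tasks[j]
        # step 3: grow lambda_j while the set passes and lambda_j * C_j <= D_j
        while is_schedulable(tasks, lam) and lam[j] * c_j <= d_j:
            lam[j] += 1                         # step 4
        lam[j] -= 1                             # step 6: undo the failing step
    return lam


def utilization_test(tasks: Sequence[Task], lam: Sequence[int], m: int = 1) -> bool:
    """Stand-in test (NOT Theorem 1): inflated utilization must not exceed m.

    Each task is assumed to run at least once, hence max(l, 1).
    """
    return sum(max(l, 1) * c / t for (c, _, t), l in zip(tasks, lam)) <= m


tasks = [(1.0, 4.0, 4.0), (2.0, 8.0, 8.0)]
print(assign_lambdas(tasks, [0, 1], utilization_test))  # [3, 1]
print(assign_lambdas(tasks, [1, 0], utilization_test))  # [1, 3]
```

The two calls differ only in the selection order, and the task selected first absorbs most of the available slack. This is why the order used by the selection algorithm matters for the resulting ${\lambda}_{k}$ values.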

## 4. Case Study

#### 4.1. Schedulability Analysis for FT-FP-A

**Theorem 2.**

**Proof.**

#### 4.2. Schedulability Analysis for FT-EDZL-A

**Theorem 3.**

**Proof.**

#### 4.3. Evaluation Environment

- $\mathsf{RM}$ (also denoted by $\mathsf{RM}$-$\mathsf{1}$): the RM scheduling algorithm (Equation (12) with every ${\lambda}_{k}=1$);
- $\mathsf{FT}$-$\mathsf{EDZL}$-$\mathsf{Any}$: the EDZL scheduling algorithm incorporating the FT policy, in which the ${\lambda}_{k}$-assignment algorithm increases ${\lambda}_{k}$ in order of task index $k$ (Equation (14) with ${\lambda}_{k}$ determined by a given ${\lambda}_{k}$-assignment algorithm);
- $\mathsf{FT}$-$\mathsf{RM}$-$\mathsf{Inverse}$: the RM scheduling algorithm incorporating the FT policy, in which the ${\lambda}_{k}$-assignment algorithm increases ${\lambda}_{k}$ in order of task priority (Equation (12) with ${\lambda}_{k}$ determined by a given ${\lambda}_{k}$-assignment algorithm);
- $\mathsf{FT}$-$\mathsf{EQDF}$-$\mathsf{Inverse}$: the EQDF scheduling algorithm incorporating the FT policy, in which the ${\lambda}_{k}$-assignment algorithm increases ${\lambda}_{k}$ in order of task priority (Equation (12) with ${\lambda}_{k}$ determined by a given ${\lambda}_{k}$-assignment algorithm);
- $\mathsf{RM}$-$\mathsf{2}$: the RM scheduling algorithm (Equation (12) with every ${\lambda}_{k}=2$);
- $\mathsf{RM}$-$\mathsf{3}$: the RM scheduling algorithm (Equation (12) with every ${\lambda}_{k}=3$).

#### 4.4. Example of a Task Set: ACSW in Satellite Systems

- tHigh retrieves a single macro command (MCMD) from an MCMD queue in every period and invokes a job corresponding to the MCMD.
- tMilbus receives MCMDs from the ground station via the MIL-STD-1553B protocol [20] and verifies the integrity of each MCMD before it is inserted into an MCMD queue.
- tOne performs internal mode transitions, such as turning relevant equipment on or off, and transmits internal telemetry via the SpaceWire protocol [21].
- tTwo carries out various operations, such as fault detection and the formatting of network packets to be transferred to the ground station.
- tSync executes a job for the operation preparation whenever there are surplus computing resources.

#### 4.5. Evaluation Results

## 5. Related Work

## 6. Conclusions

## Author Contributions

## Funding

## Conflicts of Interest

## References

1. Liu, C.; Layland, J. Scheduling Algorithms for Multi-programming in A Hard-Real-Time Environment. J. ACM **1973**, 20, 46–61.
2. Ekpo, S.; George, D. A system-based design methodology and architecture for highly adaptive small satellites. In Proceedings of the IEEE International Systems Conference, San Diego, CA, USA, 5–8 April 2010; pp. 516–519.
3. Malhotra, S.; Narkhede, P.; Shah, K.; Makaraju, S.; Shanmugasundaram, M. A review of fault tolerant scheduling in multicore systems. Int. J. Sci. Technol. Res. **2015**, 4, 132–136.
4. Yu, X.B.; Zhao, J.S.; Zheng, C.W.; Hu, X.H. A Fault-Tolerant Scheduling Algorithm using Hybrid Overloading Technology for Dynamic Grouping based Multiprocessor Systems. Int. J. Comput. Commun. Control **2012**, 7, 990–999.
5. Zhou, J.; Yin, M.; Li, Z.; Cao, K.; Yan, J.; Wei, T.; Chen, M. Fault-Tolerant Task Scheduling for Mixed-Criticality Real-Time Systems. J. Circuits Syst. Comput. **2017**, 26, 1750016.
6. Kang, S.; Yang, H.; Kim, S.; Bacivarov, I.; Ha, S.; Thiele, L. Static mapping of mixed critical applications for fault-tolerant MPSoCs. In Proceedings of the IEEE Design Automation Conference (DAC), San Francisco, CA, USA, 1–5 June 2014; pp. 1–6.
7. Aminzadeh, S.; Ejlali, A. A comparative study of system-level energy management methods for fault-tolerant hard real-time systems. IEEE Trans. Comput. **2011**, 60, 1228–1299.
8. Bertogna, M.; Cirinei, M.; Lipari, G. Schedulability Analysis of Global Scheduling Algorithms on Multiprocessor Platforms. IEEE Trans. Parallel Distrib. Syst. **2009**, 20, 553–566.
9. Baker, T.P.; Cirinei, M.; Bertogna, M. EDZL Scheduling Analysis. Real-Time Syst. **2008**, 40, 264–289.
10. Lee, J.; Easwaran, A.; Shin, I. LLF Schedulability Analysis on Multiprocessor Platforms. In Proceedings of the Real-Time Systems Symposium, San Diego, CA, USA, 30 November–3 December 2010; pp. 25–36.
11. Bertogna, M.; Cirinei, M.; Lipari, G. Improved Schedulability Analysis of EDF on Multiprocessor Platforms. In Proceedings of the Euromicro Conference on Real-Time Systems (ECRTS), Balearic Islands, Spain, 6–8 July 2005; pp. 209–218.
12. Bertogna, M.; Cirinei, M. Response-Time Analysis for globally scheduled Symmetric Multiprocessor Platforms. In Proceedings of the IEEE Real-Time Systems Symposium (RTSS), Tucson, AZ, USA, 3–6 December 2007.
13. Bini, E.; Buttazzo, G.C. The space of rate monotonic schedulability. In Proceedings of the IEEE Real-Time Systems Symposium (RTSS), Austin, TX, USA, 3–5 December 2002.
14. Back, H.; Chwa, H.S.; Shin, I. Schedulability Analysis and Priority Assignment for Global Job-level Fixed-Priority Multiprocessor Scheduling. In Proceedings of the Real Time and Embedded Technology and Applications Symposium, Beijing, China, 16–19 April 2012; pp. 297–306.
15. Baker, T.P. Comparison of Empirical Success Rates of Global vs. Partitioned Fixed-Priority EDF Scheduling for Hard Real-Time; Technical Report TR–050601; Department of Computer Science, Florida State University: Tallahassee, FL, USA, 2005.
16. Andersson, B.; Bletsas, K.; Baruah, S. Scheduling Arbitrary-Deadline Sporadic Task Systems on Multiprocessor. In Proceedings of the IEEE International Conference on Embedded and Real-Time Computing Systems and Applications, Barcelona, Spain, 30 November–3 December 2008; pp. 197–206.
17. Lee, J.; Easwaran, A.; Shin, I. Contention-Free Executions for Real-Time Multiprocessor Scheduling. ACM Trans. Embed. Comput. Syst. **2014**, 13, 1–69.
18. Baek, H.; Lee, H.; Lee, H.; Lee, J.; Kim, S. Improved Schedulability Analysis for Fault-Tolerant Space-Borne SAR System. In Proceedings of the Conference on Korea Institute of Military Science and Technology (KIIT), Daejeon, Korea, 7–8 June 2018; pp. 1231–1232.
19. RTEMS Community. RTEMS Real-Time Operating System. Available online: https://www.rtems.org (accessed on 9 May 2019).
20. Excalibur Systems. MIL-STD-1553B. Available online: https://www.mil-1553.com (accessed on 9 May 2019).
21. European Space Agency. SpaceWire. Available online: http://spacewire.esa.int (accessed on 9 May 2019).
22. Ghosh, S.; Melhem, R.; Mosse, D. Fault-tolerance through scheduling of aperiodic tasks in hard real-time multiprocessor systems. IEEE Trans. Parallel Distrib. Syst. **1997**, 8, 272–284.
23. Manimaran, G.; Murthy, C.S.R. A fault-tolerant dynamic scheduling algorithm for multiprocessor real-time systems and its analysis. IEEE Trans. Parallel Distrib. Syst. **1998**, 9, 1137–1152.
24. Al-Omari, R.; Somani, A.K.; Manimaran, G. Efficient overloading techniques for primary-backup scheduling in real-time systems. J. Parallel Distrib. Comput. **2004**, 64, 629–648.
25. Cirinei, M.; Bini, E.; Lipari, G.; Ferrari, A. A Flexible Scheme for Scheduling Fault-Tolerant Real-Time Tasks on Multiprocessors. In Proceedings of the IEEE International Parallel and Distributed Processing Symposium, Rome, Italy, 26–30 March 2007; pp. 1–8.
26. Liberato, F.; Lauzac, S.; Melhem, R.; Mosse, D. Fault tolerant real-time global scheduling on multiprocessors. In Proceedings of the Euromicro Conference on Real-Time Systems (ECRTS), York, UK, 9–11 June 1999; pp. 252–259.

**Figure 1.** Worst-case scenario in which the workload of ${\tau}_{i}$ is maximized under any work-conserving scheduling.

**Figure 2.** Worst-case scenario in which interference of ${\tau}_{i}$ to ${\tau}_{k}$ is maximized under work-conserving EDF scheduling.

**Figure 3.** Worst-case scenario in which interference of ${\tau}_{i}$ to ${\tau}_{k}$ is maximized under work-conserving EDZL scheduling.

**Figure 5.** Evaluation results regarding the average system safety of rate monotonic (RM) with different ${\lambda}_{k}$ assignments for $\gamma =0.01$.

Task | ${\mathit{T}}_{\mathit{i}}$ | ${\mathit{D}}_{\mathit{i}}$ | WCET | BCET | ACET |
---|---|---|---|---|---|
tHigh | 62.5 | 50 | 2.98 | 0.08 | 0.14 |
tMilbus | 125 | 100 | 0.54 | 0.11 | 0.21 |
tOne | 250 | 200 | 30.08 | 0.05 | 0.29 |
tTwo | 500 | 400 | 231.72 | 37.7 | 147.5 |
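As a quick worked example on the ACSW parameters tabulated above, the following Python sketch computes two simple quantities: the total WCET-based utilization of the four listed tasks, and the largest ${\lambda}_{k}$ each task could receive if only the per-task condition ${\lambda}_{j}\cdot {C}_{j}\le {D}_{j}$ from Algorithm 2 were considered. Taking ${C}_{j}$ to be the WCET column is an assumption of this sketch, and the dictionary name `acsw` is hypothetical. The cap is only an upper bound, since the schedulability test of Theorem 1 can restrict ${\lambda}_{k}$ further.

```python
import math

# (T_i, D_i, WCET) per task, copied from the table; units as in the table.
acsw = {
    "tHigh": (62.5, 50.0, 2.98),
    "tMilbus": (125.0, 100.0, 0.54),
    "tOne": (250.0, 200.0, 30.08),
    "tTwo": (500.0, 400.0, 231.72),
}

# Total utilization with a single execution per job (lambda_k = 1).
utilization = sum(wcet / period for period, _, wcet in acsw.values())

# Largest lambda_k satisfying lambda_k * C_k <= D_k, ignoring all other tasks.
lambda_cap = {
    name: math.floor(deadline / wcet) for name, (_, deadline, wcet) in acsw.items()
}

print(round(utilization, 5))  # 0.63576
print(lambda_cap)  # {'tHigh': 16, 'tMilbus': 185, 'tOne': 6, 'tTwo': 1}
```

The cap of 1 for tTwo shows that this task cannot be re-executed at all within its deadline under WCET assumptions, whereas the short tasks have ample room before their deadlines.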

© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Baek, H.; Lee, J.
Task-Level Re-Execution Framework for Improving Fault Tolerance on Symmetry Multiprocessors. *Symmetry* **2019**, *11*, 651.
https://doi.org/10.3390/sym11050651
