Optimizing Federated Scheduling for Real-Time DAG Tasks via Node-Level Parallelization
Abstract
1. Introduction
2. Related Work
3. System Model and Problem Description
- Problem Definition: For each high-density task in a task set, the problem is to choose parallelization options for the nodes on the longest path so as to minimize the number of cores allocated to that task. However, as the nodes of the DAG task are parallelized, the longest path itself may change. We therefore aim to design a strategy that assigns parallelization options to the nodes on the changing longest path, minimizing the number of cores allocated to the high-density task and thereby improving federated scheduling.
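To make the objective concrete, the following minimal sketch computes the number of dedicated cores for one high-density DAG task using the classic federated scheduling bound of Li et al. [9], which follows from Graham's bound: m = ceil((C - L) / (D - L)). The function name `cores_graham` and the numeric values are illustrative assumptions, not taken from the paper.

```python
import math

def cores_graham(C: float, L: float, D: float) -> int:
    """Cores for one high-density DAG task under the classic federated
    bound: ceil((C - L) / (D - L)), with C the volume, L the longest
    path length, and D the relative deadline (hypothetical helper)."""
    if L >= D:
        raise ValueError("the longest path must be shorter than the deadline")
    return max(1, math.ceil((C - L) / (D - L)))

# Illustrative values only: volume 20, longest path 6, deadline 10.
before = cores_graham(20, 6, 10)   # (20 - 6) / (10 - 6) = 3.5 -> 4 cores
# Shortening the longest path to 4 at the price of a larger volume (21):
after = cores_graham(21, 4, 10)    # (21 - 4) / (10 - 4) = 2.83... -> 3 cores
```

The example shows the trade-off behind the problem definition: parallelizing a node on the longest path reduces L but, with overhead, increases C, so the core count drops only if the first effect dominates.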
4. Extension of Federated Scheduling for DAG Tasks with Parallelization Freedom
4.1. Overview of Federated Scheduling Based on Graham’s Bound and Long Paths
Algorithm 1: Computing_Generalized_Path_List(G) [14]
4.2. Extension of Federated Scheduling Based on Graham’s Bound and Long Paths with Parallelization Freedom
- Case 1: Nodes on the longest path are split, and the longest path remains the same path. Because the nodes along the path are parallelized, their longest threads are shorter; hence, the total length of this path is smaller than before the parallelization.
- Case 2: Nodes on the longest path are split, and a different path becomes the longest. Because this new path is shorter than the previous longest path, the length of the longest path is still reduced.
- Case 1: After the node on the longest path is parallelized, the longest path becomes the second longest, and the second longest path becomes the longest. The third path remains unchanged. In this case, L and both decrease.
- Case 2: After the node on the longest path is parallelized, the nodes of the longest path remain unchanged, while the lengths of the other two paths are replaced by the sub-threads of the nodes parallel to the longest path. As a result, L decreases and increases.
- Case 3: After the node on the longest path is parallelized, the nodes of the longest path remain unchanged, and the nodes of the other two paths also remain unchanged. In this case, L decreases and remains unchanged.
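The cases above can be reproduced on a small example. The sketch below computes the longest path of a node-weighted DAG and then splits one node into sibling threads that share its predecessors and successors. The overhead model (total work grows to e·(1 + α(x − 1)) and is divided evenly over x threads), the helper names, the value α = 0.25, and the four-node DAG are all assumptions made for illustration; they are not the paper's definitions.

```python
from graphlib import TopologicalSorter

def longest_path(wcet, preds):
    """Return (length, path) of the heaviest path in a node-weighted DAG.
    wcet: {node: time}; preds: {node: set of predecessor nodes}."""
    graph = {v: preds.get(v, ()) for v in wcet}
    best = {}  # node -> (length of heaviest path ending here, previous node)
    for v in TopologicalSorter(graph).static_order():
        cand = [(best[p][0], p) for p in graph[v]]
        length, prev = max(cand, default=(0, None))
        best[v] = (length + wcet[v], prev)
    end = max(best, key=lambda v: best[v][0])
    length, path = best[end][0], []
    while end is not None:
        path.append(end)
        end = best[end][1]
    return length, path[::-1]

def split_node(wcet, preds, v, x, alpha):
    """Replace node v by x sibling threads (hypothetical overhead model)."""
    per_thread = wcet[v] * (1 + alpha * (x - 1)) / x
    threads = [f"{v}#{l}" for l in range(x)]
    new_wcet = {u: t for u, t in wcet.items() if u != v}
    new_wcet.update({t: per_thread for t in threads})
    new_preds = {}
    for u in wcet:
        ps = set()
        for p in preds.get(u, ()):
            ps.update(threads if p == v else [p])
        for name in (threads if u == v else [u]):
            new_preds[name] = set(ps)
    return new_wcet, new_preds

# Illustrative DAG: a -> b -> d and a -> c -> d.
wcet = {"a": 1, "b": 6, "c": 4, "d": 1}
preds = {"b": {"a"}, "c": {"a"}, "d": {"b", "c"}}
print(longest_path(wcet, preds))          # (8, ['a', 'b', 'd'])
w2, p2 = split_node(wcet, preds, "b", 2, 0.25)
print(longest_path(w2, p2))               # (6, ['a', 'c', 'd'])
print(sum(wcet.values()), sum(w2.values()))  # 12 13.5
```

In this example, splitting `b` moves the longest path from a-b-d to a-c-d, the length L drops from 8 to 6, and the volume C grows from 12 to 13.5 because of the assumed overhead.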
5. Parallelization Algorithm
5.1. Parallelization Algorithm
5.2. Node Selection Method
- Select the node that minimizes the outcome of Formula (7), as shown in Algorithm 3.
- If the current node’s option does not reach the limit, the node index remains unchanged; otherwise, select a node according to its topological order in the longest path.
- In the longest path, select the node with the longest single-thread execution time among the nodes where the parallelization option has not reached its limit.
- In the longest path, among the nodes where the parallelization option has not reached its limit, select the node with the shortest single-thread execution time.
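The last three selection rules above can be sketched as follows. The Formula (7) rule is omitted because the formula is not reproduced here. All function names, and the representation of options and limits as dictionaries, are hypothetical; the path is assumed to be given in topological order.

```python
def candidates(path, option, limit):
    """Nodes on the path whose parallelization option is below its limit."""
    return [v for v in path if option[v] < limit[v]]

def select_topological(path, option, limit, current=None):
    """Keep the current node while it has room; otherwise take the first
    node in path (topological) order that still has room."""
    if current in path and option[current] < limit[current]:
        return current
    cand = candidates(path, option, limit)
    return cand[0] if cand else None

def select_longest(path, wcet, option, limit):
    """Node with the longest single-thread execution time that has room."""
    cand = candidates(path, option, limit)
    return max(cand, key=lambda v: wcet[v]) if cand else None

def select_shortest(path, wcet, option, limit):
    """Node with the shortest single-thread execution time that has room."""
    cand = candidates(path, option, limit)
    return min(cand, key=lambda v: wcet[v]) if cand else None

# Illustrative values: node b has already reached its limit.
path = ["a", "b", "d"]
wcet = {"a": 1, "b": 6, "d": 2}
option = {"a": 1, "b": 4, "d": 1}
limit = {"a": 4, "b": 4, "d": 4}
print(select_longest(path, wcet, option, limit))       # d
print(select_shortest(path, wcet, option, limit))      # a
print(select_topological(path, option, limit, "b"))    # a
print(select_topological(path, option, limit, "d"))    # d
```

Each function returns `None` when every node on the path has reached its limit, which gives a caller a natural stopping condition.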
Algorithm 2: Parallelization_Algorithm()
Input: the high-density task set. Output: the parallelization option combination and the number of cores required for the high-density tasks.
Algorithm 3: Node_Selection
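Since the algorithm bodies are not reproduced here, the following is only a rough sketch of how a greedy parallelization loop of this kind might look; it is not the authors' Algorithm 2. It assumes the classic bound ceil((C − L)/(D − L)) for the core count, a hypothetical overhead model in which a node with single-thread time e split into x threads runs e·(1 + α(x − 1))/x per thread, and the "longest single-thread execution time" rule for node selection. All names and values are illustrative.

```python
import math
from graphlib import TopologicalSorter

def thread_time(e, x, alpha):
    """Per-thread time under the assumed overhead model."""
    return e * (1 + alpha * (x - 1)) / x

def evaluate(wcet, preds, opt, D, alpha):
    """Return (cores or None if L >= D, longest path) for options opt."""
    t = {v: thread_time(wcet[v], opt[v], alpha) for v in wcet}
    C = sum(t[v] * opt[v] for v in wcet)
    graph = {v: preds.get(v, ()) for v in wcet}
    best = {}
    for v in TopologicalSorter(graph).static_order():
        length, prev = max(((best[p][0], p) for p in graph[v]),
                           default=(0, None))
        best[v] = (length + t[v], prev)  # siblings share preds/succs
    end = max(best, key=lambda v: best[v][0])
    L, path = best[end][0], []
    while end is not None:
        path.append(end)
        end = best[end][1]
    m = max(1, math.ceil((C - L) / (D - L))) if L < D else None
    return m, path[::-1]

def greedy_parallelize(wcet, preds, D, limit, alpha):
    """Raise one option at a time on the current longest path and keep
    the option combination that needs the fewest cores."""
    opt = {v: 1 for v in wcet}
    best_m, path = evaluate(wcet, preds, opt, D, alpha)
    best_opt = dict(opt)
    while True:
        cand = [v for v in path if opt[v] < limit[v]]
        if not cand:
            break
        v = max(cand, key=lambda u: wcet[u])  # longest single-thread time
        opt[v] += 1
        m, path = evaluate(wcet, preds, opt, D, alpha)
        if m is not None and (best_m is None or m < best_m):
            best_m, best_opt = m, dict(opt)
    return best_opt, best_m

wcet = {"a": 1, "b": 6, "c": 4, "d": 1}
preds = {"b": {"a"}, "c": {"a"}, "d": {"b", "c"}}
limit = {v: 2 for v in wcet}
print(greedy_parallelize(wcet, preds, D=9, limit=limit, alpha=0.25))
# ({'a': 1, 'b': 2, 'c': 1, 'd': 1}, 3)
```

The loop terminates because each iteration raises exactly one option and every option is bounded by its limit. On this example the unparallelized task needs 4 cores, and splitting `b` once brings that down to 3; further splits only add overhead.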
5.3. Time Complexity of Algorithm
6. Experimental Results
6.1. Compared to Other Parallel Methods
- The number of nodes per DAG was uniformly sampled between 3 and 10 to reflect typical parallel task complexity.
- Node WCETs were drawn from to model heterogeneous computational loads.
- Implicit deadlines were set as , where controls task density and core allocation bounds.
- Edge connectivity probability was chosen to balance sparsity and dependency richness.
- Parallelization overhead was varied as 0.1, 0.2, and 0.4 to assess robustness under different overhead regimes.
- Ours: Parallelization according to Algorithm 2, denoted as Federate-Parallel.
- Single: Tasks are not parallelized and run in a single thread, denoted as Federate-Single.
- Max: Tasks are parallelized to the maximum extent possible within constraints, denoted as Federate-Max.
- Random: For each task node, the limit on parallelization is randomly selected, and then the parallelization options are randomly chosen within the limit, denoted as Federate-Random.
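The three baseline assignments can be expressed in a few lines. This sketch assumes a single upper bound `max_option` on the parallelization option of every node, which is a simplification; the function name and the bound of 4 in the example are hypothetical.

```python
import random

def assign_options(nodes, max_option, method, rng=None):
    """Return {node: parallelization option} for a baseline method."""
    rng = rng or random.Random(0)  # fixed seed for reproducibility
    if method == "single":         # Federate-Single: no parallelization
        return {v: 1 for v in nodes}
    if method == "max":            # Federate-Max: always the upper bound
        return {v: max_option for v in nodes}
    if method == "random":         # Federate-Random: random limit, then
        opts = {}                  # a random option within that limit
        for v in nodes:
            node_limit = rng.randint(1, max_option)
            opts[v] = rng.randint(1, node_limit)
        return opts
    raise ValueError(f"unknown method: {method}")

nodes = ["a", "b", "c", "d"]
print(assign_options(nodes, 4, "single"))  # {'a': 1, 'b': 1, 'c': 1, 'd': 1}
print(assign_options(nodes, 4, "max"))     # {'a': 4, 'b': 4, 'c': 4, 'd': 4}
print(assign_options(nodes, 4, "random"))  # values between 1 and 4
```

Note that `random.Random.randint` includes both endpoints, so the random baseline can return any option from 1 up to the drawn limit.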
6.2. Compared with Other Scheduling Algorithms
- A global DM scheduling algorithm based on the TDTA strategy, denoted as TDTA-DM [15].
- An improved virtual federated scheduling algorithm, denoted as Virtual-Federate [13].
- A federated scheduling algorithm based on decomposition, denoted as Federate-Decompose [16].
- A gang reservation scheduling algorithm based on federated scheduling, denoted as Gang-Reservation [17].
- An ordinary reservation scheduling algorithm based on federated scheduling, denoted as Ordinary-Reservation [17].
- A parallel algorithm based on the global EDF scheduling algorithm, denoted as EDF-Parallel [1].
- A fluid scheduling algorithm for implicit deadlines, denoted as Fluid-Implicit [18].
- A basic federated scheduling algorithm, denoted as Federate-Li [9].
- A federated scheduling algorithm based on the long path, denoted as Federate-Path [14].
7. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
1. Cho, Y.; Shin, D.; Park, J.; Lee, C. Conditionally Optimal Parallelization of Real-Time DAG Tasks for Global EDF. In Proceedings of the 2021 IEEE Real-Time Systems Symposium (RTSS), Virtual Event, 7–10 December 2021; pp. 188–200.
2. Stone, J.E.; Gohara, D.; Shi, G. OpenCL: A parallel programming standard for heterogeneous computing systems. Comput. Sci. Eng. 2010, 12, 66–73.
3. de Supinski, B.R.; Scogland, T.R.W.; Duran, A.; Klemm, M.; Bellido, S.M.; Olivier, S.L.; Terboven, C.; Mattson, T.G. The ongoing evolution of OpenMP. Proc. IEEE 2018, 106, 2004–2019.
4. Kwon, J.; Kim, K.; Paik, S.; Lee, J.; Lee, C. Multicore scheduling of parallel real-time tasks with multiple parallelization options. In Proceedings of the 2015 IEEE Real-Time and Embedded Technology and Applications Symposium, Seattle, WA, USA, 13–16 April 2015; pp. 232–244.
5. Kim, K.; Cho, Y.; Eo, J.; Lee, C.; Han, J. System-wide time versus density tradeoff in real-time multicore fluid scheduling. IEEE Trans. Comput. 2018, 67, 1007–1022.
6. Park, D.; Cho, Y.; Lee, C. Conditionally optimal parallelization for global FP on multi-core systems. In Proceedings of the 2020 3rd International Conference on Information and Computer Technologies (ICICT), San Jose, CA, USA, 9–12 March 2020; pp. 403–412.
7. Cho, Y.; Kim, D.H.; Park, D.; Lee, S.S.; Lee, C. Conditionally optimal task parallelization for global EDF on multi-core systems. In Proceedings of the 2019 IEEE Real-Time Systems Symposium (RTSS), Hong Kong, China, 3–6 December 2019; pp. 194–206.
8. Cho, Y.; Kim, D.H.; Park, D.; Lee, S.S.; Lee, C. Optimal parallelization of single/multi-segment real-time tasks for global EDF. IEEE Trans. Comput. 2021, 71, 1077–1091.
9. Li, J.; Chen, J.J.; Agrawal, K.; Lu, C.; Gill, C.; Saifullah, A. Analysis of federated and global scheduling for parallel real-time tasks. In Proceedings of the 2014 26th Euromicro Conference on Real-Time Systems, Madrid, Spain, 8–11 July 2014; pp. 85–96.
10. He, Q.; Guan, N.; Lv, M.; Jiang, X.; Chang, W. The shape of a DAG: Bounding the response time using long paths. Real-Time Syst. 2023, 60, 199–238.
11. Jiang, X.; Guan, N.; Long, X.; Yi, W. Semi-federated scheduling of parallel real-time tasks on multiprocessors. In Proceedings of the 2017 IEEE Real-Time Systems Symposium (RTSS), Paris, France, 5–8 December 2017; pp. 80–91.
12. Ueter, N.; Von Der Brüggen, G.; Chen, J.; Li, J.; Agrawal, K. Reservation-based federated scheduling for parallel real-time tasks. In Proceedings of the 2018 IEEE Real-Time Systems Symposium (RTSS), Nashville, TN, USA, 11–14 December 2018; pp. 482–494.
13. Jiang, X.; Liang, H.; Guan, N.; Tang, Y.; Qiao, L.; Wang, Y. Scheduling Parallel Real-Time Tasks on Virtual Processors. IEEE Trans. Parallel Distrib. Syst. 2022, 34, 33–47.
14. He, Q.; Guan, N.; Lv, M.; Jiang, X.; Chang, W. Bounding the Response Time of DAG Tasks Using Long Paths. In Proceedings of the 2022 IEEE Real-Time Systems Symposium (RTSS), Houston, TX, USA, 5–8 December 2022; pp. 474–486.
15. Wu, Y.; Zhang, W.; Guan, N.; Ma, Y. TDTA: Topology-based Real-Time DAG Task Allocation on Identical Multiprocessor Platforms. IEEE Trans. Parallel Distrib. Syst. 2023, 34, 2895–2909.
16. Jiang, X.; Guan, N.; Long, X.; Tang, Y.; He, Q. Real-time scheduling of parallel tasks with tight deadlines. J. Syst. Archit. 2020, 108, 101742.
17. Ueter, N.; Günzel, M.; von der Brüggen, G.; Chen, J. Parallel Path Progression DAG Scheduling. IEEE Trans. Comput. 2023, 72, 3002–3016.
18. Guan, F.; Qiao, J.; Han, Y. DAG-fluid: A real-time scheduling algorithm for DAGs. IEEE Trans. Comput. 2020, 70, 471–482.
19. Li, J.; Agrawal, K.; Gill, C.; Lu, C. Federated Scheduling for Stochastic Parallel Real-Time Tasks. In Proceedings of the 2014 IEEE 20th International Conference on Embedded and Real-Time Computing Systems and Applications, Chongqing, China, 20–22 August 2014; pp. 1–10.
20. Kocian, A.; Chessa, S. Iterative Probabilistic Performance Prediction for Multiple IoT Applications in Contention. IEEE Internet Things J. 2022, 9, 13416–13424.





| Notation | Description |
|---|---|
| | the WCET of vertex v |
| | the parallelization option for the node |
| | the length of path |
| | the length of the longest path of DAG G |
| | the volume of G |
| | a generalized path list |
| | the number of all generalized paths in G |
| C | the total WCET for all nodes within G |
| L | the length of the longest path of G |
| D | the relative deadline of G |
| T | the period of G |
| | the after parallelizing node into threads |
| | the lth sibling thread of |
| | the WCET of |
| m | the number of cores for high-density DAG |
| | the sum of the lengths of the first +1 paths in a DAG |
| | the number of cores calculated by Formula (4) using the first +1 paths |
| | the total WCET for all threads within the |
| | the sum of the lengths of the first +1 paths in a DAG |
| Z | |
| | the list of parallelization options for DAG nodes |
| | the temporary node’s list of parallelization options |
| | the DAG task with |
| | the DAG task with |
| y | the formula inside the ceiling function in Formula (1) or (4) |
| | the value of y after the next parallel step in a DAG |
| | the value of C after the next parallel step in a DAG |
| | the value of L after the next parallel step in a DAG |
| | the value of Z after the next parallel step in a DAG |
| | the sum of the lengths of the first +1 paths in the DAG task with |
| | the total WCET for all nodes within the DAG with the list of parallelization options |
| | the length of the longest path of DAG with the list of parallelization options |
| x | the current parallelization option for a node |
| | parallelization overhead |
| b | the execution time per thread for node |
| | the change in the longest path of a DAG |
| Y | the transformed expression for y |
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Qiao, J.; Chen, S.; Chen, T.; Feng, L. Optimizing Federated Scheduling for Real-Time DAG Tasks via Node-Level Parallelization. Computers 2025, 14, 449. https://doi.org/10.3390/computers14100449

