5.2. Experimental Setup and Communication Cost Modeling
We evaluated the performance of our proposed metascheduling algorithm in time-triggered systems, using communication cost as the primary metric. For the genetic algorithm, we used Python (version 3.10) together with the DEAP library (version 1.4.3) [31], chosen for its flexibility in implementing both value- and permutation-encoded genomes [29]. The detailed configuration of the genetic algorithm used in our experiments is summarized in Table 1.
We used the Stanford SNAP library [32] to generate Directed Acyclic Graphs (DAGs) conforming to application models with job sizes ranging from 10 to 60. These DAGs, created using SNAP's efficient graph representation capabilities and based on a random model, allowed us to simulate various workload scenarios for metascheduler evaluation. The models varied in nodes, edges, indegrees, and outdegrees to reflect a diverse set of application scenarios. The structural format of the application and platform models used in our experiments is defined using abstract JSON schemas: the application schema captures jobs, messages, and an application deadline, while the platform schema defines nodes and interconnecting links. These schemas are illustrated in Figure 6.
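The generation step can be sketched in plain Python. This is an illustrative stand-in for the SNAP-based generator, not the paper's actual code: nodes are topologically ordered up front and edges only run from lower- to higher-indexed nodes, which guarantees acyclicity; the edge probability is an assumed parameter.

```python
import random

def random_dag(num_jobs, edge_prob=0.3, seed=None):
    """Random DAG as an edge list. Acyclicity is guaranteed by only
    adding edges from lower- to higher-indexed nodes (a fixed
    topological order)."""
    rng = random.Random(seed)
    edges = []
    for src in range(num_jobs):
        for dst in range(src + 1, num_jobs):
            if rng.random() < edge_prob:
                edges.append((src, dst))
    return edges

# Job sizes from 10 to 60, as in the experiments.
dags = {n: random_dag(n, seed=n) for n in range(10, 61, 10)}
```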
The use of Directed Acyclic Graphs (DAGs) in scheduling and resource allocation research is well established and widely supported within the community. Tools such as Task Graphs for Free (TGFF) [33] and Synchronous Dataflow For Free (SDF3) [34] were specifically developed to generate and analyze DAG-based workloads, allowing researchers to simulate and evaluate scheduling strategies under controlled and repeatable conditions.
TGFF offers parameterizable task graphs with adjustable structural properties, including degree, critical path length, and communication volume. It is extensively used for benchmarking task allocation and scheduling algorithms in heterogeneous systems. Similarly, SDF3 offers advanced modeling capabilities for synchronous dataflow applications, supporting the generation of consistent, deadlock-free graphs that closely resemble digital signal processing (DSP) and multimedia workloads.
These tools, along with the underlying DAG abstraction, serve as standard baselines in the design and evaluation of schedulers for embedded, multiprocessor, and MPSoC platforms. Therefore, our use of DAGs generated through the SNAP library aligns with these best practices, enabling comparative analysis across a broad design space.
We applied the algorithm to a range of datasets, ensuring a consistent platform model across all experimental runs to isolate performance variations to application model differences. Each dataset, encompassing diverse job sizes and structures generated via the SNAP library, was processed by the algorithm, enabling an evaluation across varying complexities and scenarios.
In our study, the computation of communication cost plays a crucial role in evaluating our proposed metascheduling algorithm. This cost is fundamentally associated with the propagation of context information among end systems (ES) within the network. To accurately compute this cost, we adopted the following approach:
- (a) We categorized end systems requiring context information as 'recipients'. This categorization is based on Equations (1) and (2).
- (b) We incorporated a dedicated time slot for the exchange of context information prior to the commencement of task scheduling on the end systems. Following the occurrence of a context event, we established the current time for each affected end system. We then reserved time slots following the initiation of a broadcast of context events to all recipient end systems, assuming a specified path delay that varies with the path index of each context message and thus reflects the network's topology and data-flow paths. The overall communication cost associated with each scheduling decision is given by Equation (3), which captures the cumulative path delays resulting from the routing decisions made during schedule construction. Other sources of communication latency, such as serialization time or software-induced processing delays, are not explicitly modeled: we assume fixed-size messages and static link characteristics, where serialization is either negligible or embedded within the path delay values used in the model. Moreover, such delays remain constant across all schedules and are not influenced by the optimization variables. Since the genetic algorithm ranks individuals by relative makespan, including such constants would not change the selection process, and they can therefore be safely omitted to simplify the model.
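The cumulative-path-delay cost can be sketched as follows. The data layout (a path as a node list from source end system through switches to destination) is our assumption for illustration, not the paper's data structure:

```python
def path_delay(path_nodes):
    """Hop count of one selected path given as [src ES, switches..., dst ES]:
    links traversed = len - 1, switches traversed = len - 2."""
    return (len(path_nodes) - 1) + (len(path_nodes) - 2)

def schedule_comm_cost(selected_paths):
    """Equation (3)-style cost: sum of path delays over all context
    messages of one scheduling decision."""
    return sum(path_delay(p) for p in selected_paths)
```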
The path-delay value in Equation (3) is derived from the platform model using a set of precomputed multicast paths. These paths are generated by solving the k-shortest-path problem between the relevant end-system pairs. For each message, the genetic algorithm selects one of these precomputed paths through its path-index P genome. The associated path delay is calculated as the number of hops (i.e., the total number of links and switches traversed) along the selected path. This value is retrieved from the platform data structure during schedule reconstruction and used to compute the message's delay contribution to the makespan.
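The precomputation step can be sketched with a brute-force enumeration, which is sufficient for the small platform graphs considered here; a production implementation would use Yen's algorithm. The adjacency-list format is an assumption for illustration:

```python
def k_shortest_paths(adj, src, dst, k):
    """Enumerate all simple paths from src to dst via DFS and keep
    the k with the fewest hops."""
    stack, found = [(src, [src])], []
    while stack:
        node, path = stack.pop()
        if node == dst:
            found.append(path)
            continue
        for nxt in adj[node]:
            if nxt not in path:          # keep paths simple (no cycles)
                stack.append((nxt, path + [nxt]))
    found.sort(key=len)                  # fewest hops first
    return found[:k]
```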
The communication cost for periodic context information exchange is expressed in Equation (4). It is proportional to the ratio of the makespan M to the sampling frequency, scaled by the context information cost: the makespan M represents the total time to complete all tasks, the sampling frequency is the rate of system data collection, and the context information cost covers the resources required to process the sampled data.
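A minimal sketch of this relation, with symbol names of our choosing (the paper expresses the sampling frequency as an interval in time units, e.g., 2TU to 10TU):

```python
def periodic_comm_cost(makespan, sampling_interval, ctx_cost):
    """Equation (4)-style cost: number of periodic exchanges over the
    makespan (makespan / sampling interval) times the per-exchange
    context information cost."""
    return (makespan / sampling_interval) * ctx_cost
```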
5.3. Comparative Evaluation with Periodic Sampling Approaches
To benchmark our algorithm against existing solutions, we conducted a comparison focusing on the prevalent approach of periodic sampling for context information exchange, a standard in the current state of the art. Our methodology for this comparison involved the following steps:
We simulated the conventional approach by implementing periodic sampling across a range of application profiles. Sampling points, by which we refer to the periodic exchanges of context information, were calculated at intervals varying from 2 to 10 time units (2TU to 10TU), reflecting typical practices in existing solutions. This range is chosen to cover a broad spectrum of scenarios, from more frequent to less frequent information exchanges, providing a robust basis for comparison. The context information used in the test is described as a three-element tuple.
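The placement of sampling points for one application profile can be sketched as follows (an illustrative helper of ours, not the paper's implementation):

```python
def sampling_points(makespan, interval):
    """Periodic context-exchange instants over a schedule of the given
    makespan, at a fixed interval in time units (here 2TU..10TU)."""
    return list(range(interval, makespan + 1, interval))
```

For example, `sampling_points(20, 5)` yields the exchange instants `[5, 10, 15, 20]`.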
Each application profile was analyzed under both our proposed algorithm and the traditional periodic sampling method. The primary focus of this comparative analysis was to assess the impact on two crucial metrics: communication cost and makespan (the total time taken to complete all jobs). By evaluating these metrics, we aimed to understand not only the efficiency of our algorithm in reducing communication overhead but also its effectiveness in optimizing overall system performance.
Table 2 presents the total communication costs associated with different context exchange methods as the number of jobs increases for each MSG. Our proposed algorithm exhibits the most efficient performance, with a gradual increase in communication cost, ranging from 490 time units for 10 jobs (10J) to 147,420 time units for 60 jobs (60J). In contrast, sampling every 2 time units incurs the highest costs, especially noticeable at higher job counts (e.g., 4,862,588 time units for 60 jobs). As the sampling interval increases (from 2 TU to 10 TU), there is a consistent reduction in communication cost, demonstrating the impact of less frequent context exchanges on communication overhead. However, less frequent sampling poses the risk of missing responses to context events. Compared to the 2TU sampling method, the proposed algorithm reduces communication overhead by substantial factors across all job sizes (roughly 33x at 60 jobs, i.e., 4,862,588 vs. 147,420 time units), as shown in Table 2.
The plot in Figure 7 compares the communication costs associated with different scheduling algorithms as the number of jobs increases. Using a logarithmic scale for the y-axis (Total Communication Cost in Time Units, TU), the plot demonstrates a clear trend: as the number of jobs grows, the communication cost escalates in all cases. The data illustrate that our proposed algorithm (blue) consistently incurs lower communication costs than the other sampling intervals, particularly at higher job counts, underscoring its efficiency.
Figure 8 illustrates the observed makespan for the previous example with and without the application of our algorithm; the values in time units are given in Table 3. It can be observed in Figure 8 that the makespan with our proposed algorithm (blue plot) is slightly higher than without it (the no-context-exchange case, red plot). It is important to note that the makespan in the "No Context Exchange" case does not include the cost of sampling and exchanging context information. When the communication cost associated with different sampling frequencies is included, however, the makespan is impacted, as shown in the results in Table 4, if the same communication channel is used for exchanging context information. Even if a dedicated communication infrastructure is used for exchanging context information, similar communication costs are still incurred on the dedicated network, which additionally requires extra hardware for its realization.
Table 3 and Table 4 provide a broader comparison of makespan outcomes under various context exchange strategies. The periodic sampling methods reveal a consistent pattern: shorter intervals, such as 2TU, introduce substantial communication overhead, leading to inflated makespan values. Conversely, longer intervals, such as 10TU, lessen this overhead, resulting in comparatively lower makespans. Nonetheless, these periodic strategies still underperform when contrasted with our proposed algorithm, which seamlessly incorporates context information exchange during scheduling. For example, in the case of 30 jobs, the proposed method completes scheduling in 210,652 time units, outperforming even the most efficient periodic method (10TU) by a considerable margin. While the no-context-exchange setup yields the lowest makespan (e.g., 177,685 for 30 jobs), it entirely omits adaptivity, rendering it unsuitable for dynamic environments. These comparative results underscore the practical advantage of our algorithm in managing communication overhead while maintaining system responsiveness.
5.4. Statistical Consistency Across GA Executions
To assess the consistency of the genetic algorithm (GA) under repeated executions, we conducted a statistical stability analysis using a 10-job application. The GA was executed 100 times independently for both the proposed method and a benchmark configuration without context exchange. For each run, the resulting makespan was recorded, and key statistical metrics were computed, including the mean, standard deviation, variance, and the coefficient of variation (CoV).
The CoV, defined as the ratio of the standard deviation to the mean, serves as a normalized measure of dispersion. As shown in Table 5, the makespan CoV for the 10-job application remained well below 7% for both our proposed method and the benchmark case, indicating that the GA consistently converges to similar solutions across runs. This analysis demonstrates the statistical stability of the proposed method and supports its applicability in reliable and repeatable scheduling scenarios.
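The per-run statistics can be computed directly with the standard library; a minimal sketch over a list of recorded makespans:

```python
import statistics

def stability_metrics(makespans):
    """Mean, sample standard deviation, sample variance, and coefficient
    of variation (CoV = stdev / mean, in percent) over repeated GA runs."""
    mean = statistics.mean(makespans)
    stdev = statistics.stdev(makespans)
    return {
        "mean": mean,
        "stdev": stdev,
        "variance": statistics.variance(makespans),
        "cov_percent": 100.0 * stdev / mean,
    }
```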
5.5. Real-World Benchmark Implementation
To support the evaluation of our scheduling method, we used an application model based on the automotive benchmark described by Kramer et al. [35]. This benchmark, derived from real engine control software, provides detailed information about execution times, communication sizes, and task dependencies. We selected and adapted only the attributes that are directly relevant to our scheduler.
Each job in the application represents a schedulable software unit. The key attributes are:
execution_time: This represents the worst-case execution time (WCET) of each job, in microseconds. We generated these values using a Weibull distribution to reflect runtime variability. For each job, we selected a period group (e.g., 10 ms, 20 ms) and used the corresponding minimum, average, and maximum execution times from Table IV in [35]. The distribution was scaled to match the average, and the values were constrained to remain within the reported bounds.
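Our reading of this sampling procedure can be sketched as follows; the Weibull shape parameter is an assumption of ours, not a value from the benchmark:

```python
import math
import random

def sample_wcet(avg, lo, hi, shape=2.0, rng=None):
    """Draw one WCET: a Weibull sample scaled so its mean matches the
    period group's average, then clamped to the reported [min, max]
    bounds from Table IV."""
    rng = rng or random.Random()
    # E[weibullvariate(1, shape)] = Gamma(1 + 1/shape); divide it out
    # so the scaled sample has mean `avg`.
    unit_mean = math.gamma(1.0 + 1.0 / shape)
    raw = rng.weibullvariate(1.0, shape) * (avg / unit_mean)
    return min(max(raw, lo), hi)
```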
core_node: This field indicates which processing core each job is mapped to. In our method, this mapping is not predefined but is determined as part of the genetic algorithm. The scheduler explores different job-to-core assignments to minimize makespan and reduce contention. While the original benchmark assumes fixed task-to-core mappings for multi-core execution platforms (as discussed in [35]), we reinterpret core-level mapping as an optimization parameter targeting an integrated MPSoC architecture.
communications: These define data dependencies between jobs, with each entry corresponding to a message exchanged between dependent jobs.
The system runs on a 2 × 2 MPSoC mesh with four processing cores connected via routers. While the original benchmark targets distributed ECUs connected via a CAN bus, we adapted the workload to run on a tiled MPSoC platform. This allows us to evaluate mapping and scheduling decisions in the context of on-chip multi-core communication and resource contention.
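On such a tiled mesh, the communication distance between two cores under XY routing is simply the Manhattan distance between their router coordinates. A sketch of this hop-count model for the 2 x 2 platform (the routing assumption is ours):

```python
def mesh_hops(core_a, core_b, width=2):
    """Hop count between two cores on a width x width mesh, assuming
    XY (dimension-ordered) routing: Manhattan distance between the
    routers the cores attach to."""
    ax, ay = core_a % width, core_a // width
    bx, by = core_b % width, core_b // width
    return abs(ax - bx) + abs(ay - by)
```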
As shown in Table 6, the metascheduler employing selective context exchange (proposed algorithm) achieves the most favorable timing performance. Its makespan of 7724 µs is markedly lower than that of any periodic-sampling alternative and only marginally higher than the "No Context Exchange" reference, whose result (6853 µs) is interesting yet practically infeasible because it violates the consistency guarantees mandated by functional-safety standards. Periodic sampling with a 2 TU interval increases the makespan to 23,986 µs, a degradation of approximately 211% relative to the proposed approach. Even at the longest interval considered (10 TU), the makespan (10,279 µs) remains 33% larger than that obtained with selective exchange.
Table 7 clarifies the origin of these timing effects. The selective strategy transmits only 880 bytes of context data, whereas 2 TU periodic sampling injects 17,132 bytes, an increase by a factor of nearly 20. Although longer sampling intervals reduce traffic volumes, they never approach the communication efficiency of the proposed method. This demonstrates that the selective context-exchange mechanism effectively balances communication cost and timing performance, offering a more scalable and resource-efficient solution than fixed-interval policies.
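The relative figures reported above follow directly from the makespan and traffic values in the text; a quick arithmetic check:

```python
# Makespans (from Table 6 as quoted in the text).
proposed, tu2, tu10 = 7724, 23986, 10279

# 2 TU sampling is ~211% slower than the proposed method.
assert round((tu2 - proposed) / proposed * 100) == 211
# Even 10 TU sampling remains ~33% slower.
assert round((tu10 - proposed) / proposed * 100) == 33
# Context traffic (Table 7): nearly a 20x increase under 2 TU sampling.
assert round(17132 / 880, 1) == 19.5
```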
5.6. Benchmark Against State-of-the-Art Adaptation Techniques
We recognize the importance of benchmarking against recent adaptive metaschedulers, such as the discrete particle swarm optimization (DPSO)-based method in [36] and the genetic algorithm (GA)-based metascheduler in [37]. However, a direct quantitative comparison is currently not feasible because their source code is not publicly available. Additionally, these implementations are tailored to niche domains, namely IoT-based wireless sensor networks (IoT-WSNs) and network-on-chip (NoC)-based multi-processor systems-on-chip (MPSoCs), each with unique architectural assumptions, fault models, and evaluation metrics.
Despite this limitation, a qualitative comparison reveals several key differences. The metascheduler proposed in [36] concentrates on fault recovery in the event of single and double component failures, utilizing DPSO to precompute schedule graphs. While effective, this method treats adaptation and context synchronization as distinct layers and relies on periodic task re-allocation. Conversely, our approach seamlessly integrates timing-aware context communication into the metascheduling process. This integration allows for proactive and selective synchronization that minimizes communication overhead and preserves system-wide determinism.
Similarly, the approach in [37] constructs multi-schedule graphs using a GA and implements path reconvergence to address state-space explosion. However, it does not consider context propagation within the scheduling loop. In contrast, our algorithm jointly optimizes both schedule construction and the selective transfer of context information, creating a communication-efficient adaptation framework. Although we present a simplified "No Context Exchange" scenario to illustrate a theoretical lower bound, the results demonstrate that our approach offers structural and functional advantages over those in [36,37], thereby supporting our primary claims of efficiency and predictability.