Interference-Aware Schedulability Analysis and Task Allocation for Multicore Hard Real-Time Systems

: There has been a trend towards using multicore platforms for real-time embedded systems due to their high computing performance. In the scheduling of a multicore hard real-time system, there are interference delays due to contention of shared hardware resources. The main sources of interference are memory, cache memory, and the shared memory bus. These interferences are a great source of unpredictability and they are not always taken into account. Recent papers have proposed task models and schedulability algorithms to account for this interference delay. The aim of this paper is to provide a schedulability analysis for a task model that incorporates interference delay, for both ﬁxed and dynamic priorities. We assume an implicit deadline task model. We rely on a task model where this interference is integrated in a general way, without depending on a speciﬁc type of hardware resource. There are similar approaches, but they consider ﬁxed priorities. An allocation algorithm to minimise this interference (Imin) is also proposed and compared with existing allocators. The results show how Imin has the best rates in terms of percentages of schedulability and increased utilisation. In addition, Imin presents good results in terms of solution times.


Introduction
The use of embedded systems is nowadays spreading at an increasing speed, to all aspects of modern life as well as all phases of industrial production. The processing capability of multicore systems permits multiple embedded applications on a single shared hardware platform. Nevertheless, multicore systems add many sources of indeterminism, leading to a number of execution delays. These sources of indeterminism mainly involve shared hardware resources, such as buses, caches, and memories. It is necessary to analyse the temporal model in the context of a multicore system and not just when it is running without contenders [1]. Specifically, the interference appears when cores contend for these shared resources. In addition, in highly critical systems, the static plan must take the interference into account. If not, the system's feasibility may be jeopardized.
When this interference is taken into account, the model and the schedulability analysis is often limited to a particular type of hardware resource and even a particular type of memory.
Recently, in [2], a new task model is proposed that takes into account the interference of hardware shared resources in the context of multicore hard real-time systems. This model is general for any kind of hardware resource. However, there is no schedulability test for this new task model. This interference delay can be large and highly variable, posing a major challenge to the schedulability of real-time critical multicore systems.
Contribution: This paper proposes a schedulability test for multicore real-time systems for the model presented by Ref. [2], since in this work they did not present any schedulability analysis.

Related Works
There is a lot of research about real-time multicore systems scheduling, and one of the most relevant surveys in this area is [3]. In multicore scheduling there are two main branches depending on the criticality required: partitioned and global scheduling. Since our scope of application is hard real-time, from now on we will assume partitioned scheduling. Partitioned multicore scheduling involves two phases: task to core allocation and task scheduling of each core.
The allocation of tasks to cores can be solved with bin packing techniques. We know that this problem is NP-hard in the strong sense [4]. Some of the most popular heuristics, cited in Refs. [5,6] are Worst fit (WF), First fit (FF), and Best fit (BF). Coffman et al. [7] describes many of these algorithms in detail.
It is important to note that the previous allocators do not take into account the unpredictability produced by shared hardware resources. In Ref. [8] a comprehensive analysis of all possible sources of indeterminism is presented. These sources are classified into primary and secondary. They consider primary sources such as caches, memory, FSB, and memory controller and secondary sources such as power saving strategies, hardware-prefetching, system management interrupts, and translation look-aside buffer.
Additionally, in [1] a state-of-the-art about contention delays is presented. This topic is also analysed in [9], where the sources of timing interference in single-core, multi-core, and distributed systems are presented. As stated in this paper, memory interference can render a system infeasible. It is shown in [10] that it is possible that memory interference can cause a worst-case response time increase of almost 300%, even for tasks that spend only 10% of their time fetching memory on an eight-core system.
There are some works that reduce interference delays by using modified task models. In [11], a new model called PREM (predictable execution model) is proposed. This proposal divides one task into two phases: communication and execution. A similar technique is used in [12], which schedules the system with the goal of minimising the makespan by letting the scheduler decide when it is appropriate to avoid interference. For DAG task models, Ref. [13] proposes a scheduling method that applies the LET (Logical Execution Time) paradigm and considers the communication timing between nodes to reduce interference delay due to shared hardware resources.
Other works, such as Ref. [14], proposes a feedback control scheme where critical and non-critical tasks are separated and assigned to different partitions for ensuring the execution of the critical tasks, so a hypervisor manages the multicore hardware system in order to limit memory bus access in non-critical cores measured with Performance Monitor Counters. In Ref. [15], the authors propose an analysis of memory contention as an optimisation problem, which tries to minimize memory interference. They split tasks intro three phases and consider multiple memory transactions issued during each phase. Other approaches, such as the one presented in Ref. [16], reduce contention using synchronisation-based interference models and specific memory allocation schemes.
The survey in Ref. [17] provides an overview on timing verification techniques for multicore real-time systems until 2018. This survey considers single shared resources (memory bus, shared cache, DRAM) and also multiple resources, which is what this work focuses on. The most relevant works come from the Multicore Response Time Analysis framework in Ref. [18], which provides a general approach to timing verification for multicore systems. They omit the notion of worst case execution time (WCET) and instead directly target the calculation of task response times through execution traces. They start from a given mapping of tasks to cores and assume fixed-priority preemptive scheduling.
Other works cited in the survey, such as Ref. [19,20] consider the amount of time for shared resources accesses and the maximum number of access segments, which is out of the scope of this work.
Regarding mapping and scheduling, the survey presents several works grouped by the techniques that are used: bin-packing algorithms, genetic algorithms, ILP and MILP (Integer/Mixed Integer Linear Programming), etc. ILP and MILP techniques consider scratch-pad memories with different objectives: to minimise the initiation intervals of task graphs, to minimise the WCETs, to minimize the worst case response time (WCRT) of task graphs, etc. The presented techniques are not scalable to large numbers of cores in the system and the authors present heuristics that are scalable.
There are some works that introduce interference due to memory contention as a new parameter in the temporal model. In Ref. [21] WCRA (Worst Case number of shared Resource Accesses) is defined and added to the WCET. In the same way, Ref. [22] proposed the concept of interference-sensitive Worst-Case Execution Time (isWCET). A dynamic approach is presented in Ref. [23] so that, depending on the progress of each kernel, the dependencies of the isWCET schedules are reduced or completely eliminated. The concept of isWCET is similar to our work, but the proposals are centred around minimising the effect of interference with new scheduling methods while our work focuses on allocation algorithms and schedulability conditions. In Ref. [24] memory interference is analysed in a similar way to our work, as it is also represented as a parameter of the temporal model. However, their work is based on the interference produced only by the DRAM memory while our proposal is agnostic with respect to the shared hardware resource used. Their work also considers fixed priority scheduling while our proposal considers dynamic priorities.
The work in Ref. [25] provides the Multicore Resource Stress and Sensitivity (MRSS) task model that characterises how much stress each task places on resources and its sensitivity to such resource stress. This work considers different types of interference (limited, direct, and indirect) and fixed priority scheduling policies. In contrast to this work, the task-to-cores allocation is known a priori.
In Ref. [2], the interference due to contention is added to the temporal model. Instead of adding it to the WCET, they propose a scheduling algorithm that computes the exact value of interference and an allocator that tries to reduce this total interference. In Ref. [26], partitioned scheduling that considers interference while making partition and scheduling decisions is presented. They present a mixed integer linear programming model to obtain the optimal solution but with a high computation cost and they also propose approximation algorithms. They only consider implicit deadline models. This paper differentiates between isolated WCET and the WCET with interference and overhead. They define an Inter-Task interference matrix, in which each element of the matrix is the interference utilisation between two tasks, considering the inflated WCET when two tasks run together. This work is similar to Ref. [2], but in Ref. [2] a general model is considered, valid for any type of shared hardware resource while Ref. [26] only interference due to cache sharing is considered.
Our work continues the research of Ref. [2], so we use the same temporal model to provide a schedulability bound. Therefore, our model does not only provide a schedulability analysis for both fixed and dynamic priorities with a general model for interference, but also a new allocation algorithm that minimizes this bound.

Problem Definition and Task Model
The aim of this work is to provide a schedulability condition and an interferenceconscious allocator for the task model presented below. This task model is the same as the one presented in Ref. [2].
We suppose a multicore system with m cores (M 0 , M 1 , M 2 , . . . , M m−1 ) where a task set τ of n independent tasks should be allocated. Each task τ i is represented by the tuple: where C i is the WCET, T i is the period and I i is the interference. We assume implicit deadlines, so the deadlines are equal to periods. When we refer to M τ i , we mean the core in which τ i is allocated. The hyperperiod of the task set, H, is defined as the least common multiple (lcm) of the periods of all the periodic tasks: We define A i as the number of activations that τ i has throughout H: Since the goal of this paper is to obtain a schedulability test, we will assume a synchronous task system.
To visualize the concept of I i , we can observe Figure 1, where the computation time (C 0 ) of task τ 0 is represented. As part of this computation time, the time in which the task performs read and/or write operations in memory has been differentiated (depicted with diagonal lines), as an example of access to a shared hardware resource. It is possible that when the task wants to do one of these r/w operations, the requested resource is busy. Or the other way around, that when the task is accessing this hardware resource, other tasks are blocked from accessing this resource. In any case, this time is the time considered as interference caused to other tasks in other cores I i . In general, the term I i is the time the task takes to access shared hardware resources. Although I i is part of C i , during the time the task is accessing the shared resource, other tasks on other cores will be delayed. So this interference time is defined independently of C i , as will be used to represent the delay caused to other tasks. Although the parameter I i is defined in [2], it is worth describing in detail by means of an example. Example of task set execution with interference: We have a task set composed of three tasks and three cores. Every task is allocated to a different core. τ 0 and τ 2 request access to the same shared hardware resource, as I 0 , I 2 > 0. However, τ 1 will not be affected by the interference of the other tasks, as I 1 = 0.
The scheduling of this system is represented in Figure 2. As we can observe in Figure 2, on one hand, task τ 1 does not receive any interference from the other tasks, so its execution time is not affected by the contention. On the other hand, τ 0 and τ 2 suffer an increase in their execution time. At the beginning, both tasks (τ 0 and τ 2 ) suffer a delay of 1 unit because of the interference between them. In t = 6, τ 0 is released again and coincides with the first job of τ 2 . In that moment, each task provokes a unit of interference in the execution of the other task. We represent the interference as unit times in unified blocks, positioned in the beginning of the execution in order to facilitate the calculation but always preserving the real magnitude of the contention.
Once the model has been defined, we will now present some parameters necessary for the development of the following sections.
From Figure 2, it is deduced that, if contention is considered, the total utilisation of a task does not only depend on its computation time and period, but also on the interference received from other tasks.
We consider that the interference parameter refers to a single type of hardware resource. If there are several types of hardware resources, the model would have to extend to as many interference parameters as there are hardware resources. Considering parameter I for all interference types would result in an even more pessimistic model.
It should be taken into account that a task τ j will cause interference to other task τ i when the following conditions are met: • τ i and τ j are not allocated to the same core (M τ i = M τ j ); • τ i and τ j are active at the same time; • τ i and τ j have at least 1 unit of interference (I i > 0, I j > 0) Therefore, the real utilisation (we refer to real utilisation to the utilisation that includes the interference delay) of a task, U i , is: where U i = C i /T i and U int i are the utilisation due to the interference caused by other tasks to τ i : where I T i the total interference that a task τ i receives from other tasks in a period T i . Therefore: Hence, the utilisation of a core would be the sum of all the task utilisations: And the utilisation of all the system would be the sum of all the core utilisations:

Contention Aware Utilisation Factor
In this section, we are going to provide an upper bound to U int i , so we will be able to provide a schedulability test. This bound will be called U iub i . First, we need two definitions, which were also introduced in Ref. [2]: . A task is defined as a receiving task when it accesses shared hardware resources and it suffers an increment of its computation time due to the interference produced by other tasks allocated to other cores.

Definition 2 ([2]).
A task is defined as a broadcasting task when it accesses shared hardware resources and it provokes an increment of computation time in other tasks allocated to other cores due to contention.
If I i = 0, τ i is neither broadcasting nor receiving task. If I i > 0, τ i will be a broadcasting and receiving task if there is at least one task τ j in other core whose I j > 0.
To estimate U iub i , we will calculate the maximum total interference that will depend on the maximum number of activations of a broadcasting task that fall within a receiving task.

Definition 3. Let I Tub
j→i (in what follows, the expression j → i means that τ j is a broadcasting task and τ i is a receiving task) be the maximum total interference that τ j can cause to τ i in a period of τ i , T i . Definition 4. Let A j→i be the maximum number of activations of the broadcasting task τ j that fall within an activation of the receiving task, τ i .
From the two previous definitions, it is clear that: Note that I Tub j→i is an upper bound because the maximum number of interferences does not have to occur in all activations. Besides, I Tub j→i only takes into account the interference caused by τ j . To calculate U iub i , we need to consider all tasks allocated to other cores. Therefore: From Equation (10), it is easy to see that we have reduced the study of U iub i to the study of I Tub j→i .

Worst Case Estimation of I Tub j→i
To calculate I Tub j→i , we need to know the exact value of A j→i . To do this, we will calculate A j→i assuming the following conditions: • τ i and τ j are allocated to different cores; T j ≥ T i (the period of the broadcasting task is greater than the period of the receiving task).
We will have further considerations of the case in which T j < T i . Let us study the total interference received by τ i . First, the maximum number of activations of τ j that fall within an activation of τ i is calculated: being K = 0 If periods T i and T j are harmonics 1 Elsewhere Equation (11) expresses the relationship between the periods of broadcasting and receiving task modified by factor K and minus 1 to reflect the number of activations of the broadcasting task that fall within a period of the receiving task. With the division of periods, we obtain the whole times that a task is included in the other and we apply the ceiling function in order to obtain the greatest integer number of times. Moreover, the worst scenario happens if one of the tasks is shifted one unit of time. That is why the minus one is added to the formula. Then, two cases are possible: • Harmonic periods, with zero or small residues in the division of periods. Then, K = 0; • Non-harmonic periods. Then, K = 1.
The above definition of I Tub j→i is very pessimistic in the sense that it considers that the maximum interference occurs equally in all activations. This is not always true, as it depends on how the activations of the two tasks coincide.
Once the definition of I Tub j→i is provided, let us consider the case in which T i > T j (the period of the broadcasting task is shorter than the period of the receiving task) with the following example: let us suppose two tasks τ i and τ j allocated to a dual-core platform, with I i = 2I j . Figures 3 and 4 represent the maximum interference that may be produced when tasks τ i and τ j are executed simultaneously. Note that, for the sake of simplicity, we have not depicted the tasks computation times, but only the interference.   Figure 3 considers that τ j is the broadcasting task and τ i the receiving task. In this case, the period of the broadcasting task τ j is greater than the period of the receiving task τ i , and the total received interference is equal to I Tub j→i = 4. This scenario has been studied previously.
Let us consider now that τ i is the broadcasting task and τ j is the receiving task, as depicted in Figure 4. In this case, the period of the broadcasting task is shorter than the period of the receiving task. Could we apply Equation (11)? We are going to prove it numerically. Suppose that T i = 4 and T j = 6. Applying Equation (11), the maximum number of activations of the broadcasting task that fall within an activation of the receiving As seen in Figure 4, τ i never falls three times within an activation of τ j . So Equation (11) can not be applied when the period of the broadcasting task τ j is shorter than the period of the receiving task.
Then, we are proposing a methodology to relate the total interference between a broadcasting and a receiving task, regardless of the lengths of their periods. Theorem 1. The ratio of interference received and broadcast by τ j to τ i is: Proof. At time 0, as both tasks are released, interference is always introduced. Moreover, every time any task is released, I j units of interference may be introduced. In particular, it happens as many times as activations the tasks have in a hyperperiod minus one, because the first activation has already been considered. Then, the total number of times is: Then, the maximum number of interferences that a task τ j provokes in τ i is: Then, let us have a look at the total interference received by task j, I T i→j . If we repeat the previous process, the maximum number of interferences that the task τ i provokes in τ j is: It is easy to deduce that the only difference between both situations is the coefficient of interference of the task that provokes the interference. Therefore, we can conclude that the total interference that task j provokes to task i is related with the total interference that task i provokes to task j as follows: Then, the total number of interferences between two tasks may be calculated interchangeably, from the point of view of both tasks. Thus, from now on and for calculus purposes, this work considers that, when two tasks interfere, the task with the biggest period provokes the interference on the task with the shortest period. If it is not the case, we will apply Equation (12).
To conclude with the example in Figures 3 and 4, as we deduced that I Tub j→i = 4, from Equation (12): and this result coincides with that depicted in Figure 4.

Schedulability Analysis
Once an upper value is given for I Tub j→i , we can estimate an upper bound of the utilisation of a receiving task τ i , U ub i , taking into account interferences due to contention.
Note that, if τ i is not a receiving task, then U iub i = 0. This upper bound is always greater or equal to the real utilisation of the receiving task i, U i . This results from the fact that: Therefore, the upper bound of the utilisation of each core is defined as the sum of the upper bounds of the utilisations of the tasks that belong to that core, U ub M k = ∑ τ i ∈M k U ub i . Similarly, the upper bound of the system utilisation is defined as the sum of the upper bound of the utilisations of all cores (or all tasks), U ub τ = ∑ M k U ub M k . Then, the system will be schedulable if:

1.
The upper bound of the utilisation of each task is less or equal to n(2 1 n − 1) for fixed priorities and to one for dynamic priorities, Ref. [27]: for dynamic priorities ∀i = 0, . . . , n − 1

2.
The upper bound of the utilisation of each core M k is less or equal to one: As a consequence of (i) and (ii), the upper bound of the system utilisation is less or equal to the number of cores [28], m: As utilisations U ub i are overestimated, we can conclude that if previous conditions are accomplished, the system will be schedulable.
Note that this is a necessary but not a sufficient condition.

Example
Let us show the procedure to be followed with an example: let us consider the following task set, τ = [τ 0 , τ 1 , τ 2 ] with τ 0 = (2, 3, 0), τ 1 = (4, 8,2), and τ 2 = (5, 12, 1), allocated to a system with three cores. τ 0 is allocated to core M 0 , τ 1 , in M 1 , and τ 2 , in M 2 . Once the tasks are allocated to cores, the Earliest Deadline First (EDF) algorithm [27] schedules tasks in each core. The actual execution of the task set is shown in Figure 5. We are applying the method described in Section 4 to calculate the upper bound of the system utilisation. Then we will compare it with the actual system utilisation, shown in Figure 5.

•
Step 1: Define τ j and τ i . The task with the greatest period is the broadcasting task τ j , and the receiving task is the task with the shortest period, τ i . In this example, τ j = τ 2 and τ i = τ 1 . As I 0 = 0, τ 0 is neither receiving nor broadcasting. • Step 2: Calculate the maximum number of times that task τ 2 falls within an activation of τ 1 , A 2→1 . As depicted in Figure 6, τ 2 falls twice within the second activation of τ 1 . It may be calculated applying Equation (11): Figure 6. Upper bound of the interference received by τ 1 , I Tub 2→1 .

•
Step 3: To calculate the maximum total interference that τ 2 can cause to τ 1 . First, we calculate the number of activations of task τ 1 in a hyperperiod, A 1 = H/T 1 = 3. Following Equation (4): The maximum total interference is only received in the second activation of τ 1 , as shown in Figure 6, but we consider the worst case, in which all activations receive the maximum of interference from τ 2 . • Step 4: Calculate the total interference received by the other receiving tasks, τ 2 , as shown in Figure 7. As the period of τ 2 is greater than the period of τ 1 , we apply the Equation (12):

•
Step 5: Calculate the upper bound of the utilisation of each task.
From Figure 5, the actual utilisations of the tasks are: It is deduced that the actual utilisations of the tasks are equal or less than the upper bounds estimated in this paper. Therefore, the system is schedulable.

Task Allocation Algorithms
In the following, we first briefly discuss several existing allocation techniques. We then propose a new mapping strategy. Allocation is a problem that appears with multicore systems. It comes to answer the question: which processor will execute each task?
The main disadvantage of the partitioning approach to multicore scheduling is that the task allocation problem is analogous to the bin packing problem and is known to be NP-Hard [29].

Overview of Existing Heuristic Bin-Packing Algorithms
Refs. [5,6] detailed some of the most well-known bin-packing heuristics: One problem with the previous mentioned heuristics is that, if the items are not ordered properly, large items could not be packed efficiently. Therefore, items should be ordered effectively to avoid these problems. One way is to pack the items in order of decreasing weights or utilizations. This way, WF becomes WFDU (Worst Fit Decreasing utilisation), and FF becomes FFDU (First Fit Decreasing utilisation).

Overview of Aceituno's Method
In Ref. [2], a task model that takes into account interference delays due to contention of shared hardware resources is proposed. Moreover, three tasks-to-cores allocation methods are also presented.
In that paper, a discrepancy-based method is presented, which is defined as the difference between the maximum and minimum utilisation of a multicore system, UD τ . One of the algorithms is focused on reducing the discrepancy, UDmin, and the other on maximizing it, UDmax. On the one hand, UDmin behaves as the heuristic WFDU, as both balance the load among cores. They present excellent schedulability ratios (98.1% and 100%, respectively) but reach high rates of increment of utilisation due to the system interference (2.3%). On the other hand, UDmax behaves as FFDU, as it unbalances the load among cores. In contrast to UDmin/WFDU, it reaches a low rate of increment of utilisation but their rates of schedulability are very low (around 43%). The Wmin allocator accounts for the possible interference produced for each task and it provides a low increment of utilisation (0.266%) and high schedulability ratio (up to 89%), with respect to previous algorithms.

Proposed Allocator Considering the Interference: Imin
Previous bin-packing and discrepancy-based algorithms do not take into account the interference produced when two or more tasks allocated to different cores coincide in execution. However, Wmin does take interference into account, although not exactly, but based on a binary matrix (W) of possible interference between tasks on other cores. This allocator tries to minimise the number of 1 s in the matrix, which represent a possible interference.
In this subsection, a new allocation algorithm to minimise the interference is proposed-Imin. As it has been analysed in previous sections, U ub i is the upper bound of the utilisation taking into account the maximum possible interference. The allocator Imin obtains the task allocation to cores that minimises this upper bound as much as possible. This allocator is based on integer linear programming to obtain an optimal solution. For that, we define a set of parameters and variables shown in Table 1. Table 1. Model notation of the implicit deadline task model.

PARAMETERS
s.t: Constraint (15) assures that a task is allocated to one and only one core. Equation (16) sets the value of the extra utilisation of a task provoked by the interference in its upper bound. Equation (17) sets the value of the total utilisation taking account the interference in its upper bound, which means in its maximum value, as it has been explained and established in this paper. The total utilisation per core is calculated as the sum of the utilisations of the tasks that belong to that core (Equation (18)) and its value should be less or equal to 1 (Equation (19)). Equations (20) and (21) represent the decision variable domains.

Experimental Conditions
To validate our proposed technique, we implemented a synthetic task generator to generate up to 5400 random feasible task sets with different configurations. The task generator works by calculating the following parameters: • U τ : utilisation of the task set calculated using the UUniFast discard algorithm [30]; • T i : task periods, which are generated randomly in [20,1000]; • C i : computation times are deduced from the periods and the utilisations, as The experimental parameters of the evaluation are specified in Table 2. To ensure the reproducibility of the results, these parameters coincide with those used in [2]. The only difference is that a wide range of theoretical utilisations and interference factors are also studied.
The theoretical utilisation of the maximum load of all cores varies between 50% and 75% of the maximum load of the system. For example, the maximum load of a system with 8 cores is 8, so for evaluation purposes, the initial utilisation is set to 4 (50%) and 6 (75%).
As is well shown in Table 2, the number of broadcasting tasks is 25% of the total number of tasks, except for scenarios 1-6 (2 cores), which is 50%. This is due to the fact that, if only one task is broadcasting, no interference will be produced. Each combination of the number of cores and utilisation is evaluated in three cases, with 10%, 20%, and 30% of interference over the WCET. It means that, for example, if the percentage of interference in a task set is established to 20% and a broadcasting task of this set has 10 units of WCET, then the interference of this broadcasting task should be 2 units. With this setup, we apply the proposed Imin allocator and the already existing allocation methods (FFDU, WFDU, and Wmin) to the synthetic task sets. Once the allocation phase is complete, it must be checked if all tasks have been allocated to cores assuring that the maximum capacity per core is not exceeded, i.e., U M k ≤ 1 ∀k = 0, . . . , m − 1. If any of the allocators cannot allocate the task set, this task set will be discarded and a new one is generated.
To perform the allocators Imin and Wmin, we use Gurobi optimizer 9.0 [31], from Gurobi Optimization, Inc. (Houston, TX, USA), which is the fastest and most powerful mathematical programming solver available for ILP, MILP and other problems. It provides a Python interface.
The evaluation process is executed on an Intel Core i7 CPU with 16 GB of RAM (Santa Clara, CA, USA).
Once all allocations are validated, the task sets are scheduled following the contention aware scheduling algorithm proposed in [2] which takes into account the interference. The selected priority-based algorithm for the mentioned scheduling algorithm is EDF [27] but any fixed-priority algorithm could also be used. As an output of this phase, the real utilisation of the system, U τ , is obtained.
After the scheduling phase, the scheduling plans must be validated in order to ensure that the temporal constraints are met throughout the hyperperiod. In addition, the following performance parameters are measured to compare different methods: • Schedulability ratio. It is calculated as the relation between the number task sets with feasible scheduling plans and the number task sets with feasible allocations, expressed as a percentage; • Increased utilisation. It is calculated as the relation between the increase in utilisation with respect to the theoretical utilisation. This is calculated as • Relation between the upper bound of utilisation, U ub τ , and the actual utilisation, U τ .

Experimental Results
Depending on the experimental parameters specified in Table 2, different scenarios are defined (numbered from 1 to 18). Figures 8 and 9 depict the comparison of schedulability ratio and the increased utilisation for each scenario and each allocator. In Figure 8 it is shown that, as was expected, that generally, as the number of cores and tasks increases, the schedulability ratio decreases. Regarding the percentage of interference, it is shown that, as a general rule, with the same number of cores and tasks, the higher percentage of interference, the lower is the schedulability for all the allocators, even though there are some exceptions to this rule, such as FFDU in scenarios 7, 8, and 9 or scenarios 10, 11, and 12. After an analysis of the data with the exception cases, we realized that these exceptions are just provoked by the randomness of the tasks generated as input data. In general terms, the allocators FFDU and WFDU are affected by the increase of interference percentage more than Wmin and Imin. This property can be clearly shown, for example, in scenarios 7, 8, and 9 of Figure 8, where the percentage of interference is 10, 20, and 30%, respectively. In these cases, FFDU and WFDU have very different results depending on the amount of interference but Wmin and Imin maintain almost the same values in spite of the variations in the amount of interference.  Figure 9 shows the increased utilisation for each scenario. The results vary a lot depending on the scenario and the allocator. In the case of FFDU, the increased utilisation with 2 cores is clearly higher than 4 and 8 cores. In the case of WFDU the increased utilisation is always the highest with respect to other allocators. In the case of Wmin and Imin the increased utilisation is significantly lower with respect to WFDU.
In Figures 8 and 9, average values of all scenarios of schedulability and increased utilisation are represented for each allocator. In Figure 10 it is shown that Imin has the highest rate of schedulability from all the allocators with a 76.83%. We should be aware that Wmin also has a high index of schedulability, almost the same as Imin, 76.8%. The FFDU allocator has the worst percentage of schedulability with 35.56% and WFDU has an acceptable rate of schedulability with 76.48%.  It is important to note that the results in [2] showed that WFDU could always ensure schedulability for all the task sets (100%). In our case, this behaviour has not been reproduced. This difference lies in the interference coefficient selected in the evaluation process. In Ref. [2], I i = 1 ∀τ i ∈ τ. In this work, I i ≥ 0.1 · C i ∀τ i ∈ τ, which is always greater or equal than 1. Adding small interferences does not affect the schedulability of allocators that balance the load, as WFDU does. In fact, when more interference is considered, these results vary. With bigger interferences, the percentage of schedulability is reduced.
In Figure 11 it is shown that the WFDU algorithm has by far the highest increase in utilisation while Wmin and Imin have the best results. For WFDU and FFDU, the schedulability ratio and increased utilisation are directly proportional, which is not the desired behaviour. This is not the case of Wmin and Imin, in which a high schedulability ratio does not suppose an increment in the interference and then, in the utilisation of the system. The comparison between the upper bound (U ub τ ) and the real utilisation (U τ ) for the best allocators is depicted in Figure 12 (Imin), and Figure 13 (Wmin). They are expressed in terms of utilisation system percentage. As was expected, in both graphics, the real utilisation is always equal or less than its upper bound utilisation. Also, in both graphics, the more cores and tasks in the scenario, the more percentage of system utilisation is estimated. In Figure 14, the comparison between the utilisation upper bound of Wmin and Imin is depicted. We can see that the results are very similar as in previous figures, but Imin achieves a slight lower value for U ub i because Imin minimises the upper bound while Wmin minimises W matrix. However, the results are very similar since both represent the possible interference between tasks in other cores.  Finally, the solution times for the MILP approaches are evaluated in Figure 15. It shows that, generally, as the number of cores increases, the solution time increases. This is more obvious in the cases of 8 cores scenarios (scenarios 13 to 18). This is because when the number of cores increases (and consequently, the number of tasks and broadcasting tasks), the complexity of the search of the optimal solution also increases. Moreover, it is observed that, specially in the 8 cores scenarios, the solution time for Imin is clearly lower than the solution time needed for Wmin. This is a clear advantage of Imin allocator.

Conclusions
This paper has proposed a schedulability analysis for task models that consider the delay produced by the contention of hardware shared resources in a hard real-time multicore system. Then, an allocation algorithm that minimises this contention, Imin, has been proposed and evaluated by comparison with other existing allocators.
After the results of the experimental evaluation, we can conclude that the allocation algorithm proposed in this paper, Imin, has the best rates in terms of percentages of schedulability and increased utilisation. In addition, Imin presents good results in terms of solution times. So, it is reasonable to affirm that Imin may be an eligible option as an allocator for the implementation of a multicore system, especially in those systems where the priority is to maximize the scheduling rate or to minimise the contention produced in hardware shared resources.
We plan to further investigate how to improve Imin in order to achieve a lower utilisation rate without decreasing the schedulability. We also plan to find a less pessimistic upper bound for the utilisation and extend the model to the constrained deadlines.