Framework for Delay Guarantee in Multi-Domain Networks Based on Interleaved Regulators

Abstract: The key to the asynchronous traffic shaping (ATS) technology being standardized in the IEEE 802.1 time-sensitive networking (TSN) task group (TG) is the theorem that a minimal interleaved regulator (IR) attached to a FIFO system does not increase the delay upper bound while suppressing the burst accumulation. In this work it is observed that the FIFO system can be a network, for flows that share the same input/output ports and the same queues of the network and are treated with a scheduling scheme that guarantees the FIFO property within a queue. Based on this observation, a framework for delay bound guarantee is further proposed, in which networks with flow aggregate (FA) scheduling and minimal IRs per FA attached at the network edge are interconnected. The framework guarantees the end-to-end delay bound with reduced complexity compared to the traditional flow-based approach. Numerical analysis shows that the framework yields a smaller bound than both the flow-based frameworks, such as the integrated services (IntServ), and the class-based ATS, at least in networks with identical flows and symmetrical topology.


Introduction
The ITU-T focus group on technologies for network 2030 (FG Net-2030) has declared that the network delay guarantee is one of the most important requirements for the 6G network [1]. For example, the Tactile Internet requires a delay upper bound of 5 ms. Remote industrial management control and remote robotic surgery are example applications of the Tactile Internet. A practical solution for delay guarantee in large-scale multi-domain networks such as the Internet, however, is yet to come. Flow-based approaches such as the integrated services (IntServ) framework are known to provide the delay guarantee. However, IntServ's scheduling complexity is proportional to the number of flows, which grows to millions in core networks. This scheduling complexity prohibits IntServ from being implemented. The differentiated services (DiffServ) framework instead provides relative performance differentiation with 8 or 32 queues, one for each priority class. Flows belonging to a class are put into a queue. The queues are served with strict priority. Because of this simplicity, DiffServ has been adopted in the current Internet. However, the maximum burst of a flow can increase to the sum of the max bursts of all the flows within a queue. When there is a cycle in a network, the max burst grows to infinity, and so does the delay bound. The DiffServ framework therefore does not provide a delay bound in a general topology network.
There are efforts to suppress, with regulators, the burst accumulation caused by flow aggregation [2]. IEEE 802.1 TSN [3] is the representative international standard for network delay guarantee. The standard aims to guarantee a delay upper bound in a single-domain network, and therefore in a small-scale network only. The P802.1Qcr asynchronous traffic shaping (ATS) [4] technique presented in TSN employs a node whose output port has an interleaved regulator (IR) per input port and a strict priority class-based FIFO system, side by side. An IR is a single-queue system that examines the packet at the head of the queue and lets it leave as soon as it is qualified according to the regulation rule of the flow the packet belongs to. The rest of the packets in the queue may be delayed even when they are already qualified. Nevertheless, it is proven that a minimal IR does not increase the delay upper bound of the attached FIFO system, such as the class-based FIFO system employed in ATS [5]. The ATS framework, however, requires an IR per input port at every output port of every node. The frequent regulations significantly affect the probabilistic performance, such as the average delay. They also imply increased implementation cost.
The current frameworks, IntServ, DiffServ, and ATS, each have shortcomings that hinder their deployment in large-scale multi-domain networks. A new framework is required that is less complex than IntServ and has better probabilistic performance than ATS. The framework has to provide delay bounds in arbitrary topology networks. The key contribution of this work is the proposition of a novel framework that satisfies all the requirements mentioned above.
Section 2 provides the technical detail of the proposed framework. In Section 3 the proposed framework is analyzed with a symmetrical network topology. The delay bounds for IntServ, ATS, and the proposed framework are given. Based on the formulas, the delay bounds of the three frameworks are compared with varying parameters. In Section 4 the three frameworks are compared and discussed. Section 5 summarizes the results and suggests future work.

Proposed Framework for Delay Guarantee
We briefly describe the main result of [5]. Consider a FIFO system attached with a minimal IR, as depicted in Figure 1. A is the sequence of arrival times, that is, $A = (A_1, A_2, \ldots, A_n, \ldots)$, where $A_n$ is the arrival time of the nth packet in the sequence. Similarly, D and E are the sequences of departure times at the corresponding systems, respectively. The Π-regulator is a generalized concept of any currently existing regulator. If a flow is Π-regular, then it conforms to an arrival curve with parameters {average arrival rate, max burst size}. For a detailed description of Π-regularity, see [5].
Theorem 4 of [5] (the definition of the minimal IR): Consider a packet sequence A in which every flow f conforms to a regulation operator $\Pi^f$. The "minimal interleaved regulator" is defined as the single-queue system that transforms the input packet sequence A into the output packet sequence D defined by $D_1 = A_1$ and
$$D_n = \max\left\{A_n,\ D_{n-1},\ \Pi^{f_n}(D^{f_n})_{I(n)}\right\}, \quad n > 1,$$
where $f_n$ is the flow to which the nth packet in the sequence belongs, and I(n) is the index of the nth packet in the flow $f_n$.
Theorem 4 essentially says that if the departure of a packet occurs at its eligible time, $\Pi^{f}(D^{f})_{I(n)}$ for the flow f it belongs to, then the regulator is a minimal IR.
Theorem 5 of [5]: The delay upper bound of a FIFO system attached with a minimal IR equals the delay upper bound of the FIFO system alone, that is,
$$\sup_n (E_n - A_n) = \sup_n (D_n - A_n), \quad n > 0,$$
if the conditions listed below are met.
1) Every flow into the FIFO system conforms to an arrival curve with parameters {average arrival rate, maximum burst size}.
2) The FIFO system outputs all the packets in FIFO order.
3) The IR regulates every flow to reproduce the arrival characteristics at the ingress of the FIFO system.
4) (Minimal IR) The IR transmits immediately when the packet at the head of the queue meets the output condition. Such an IR is called a minimal regulator.
5) The IR provides zero delay, including transmission delay, for packets satisfying the output condition. For example, if a packet comes in when the queue is empty, it can be cut through if the condition is met.
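The recursion of Theorem 4 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: each regulation operator is taken to be a token bucket with a burst and a rate, packet lengths do not exceed the burst, and all times are non-negative. The function name `minimal_ir` and the packet tuple layout are hypothetical and do not come from [5].

```python
def minimal_ir(packets, regulators):
    """Departure times of a minimal interleaved regulator (sketch).

    packets:    list of (arrival_time, flow_id, length) in queue order.
    regulators: flow_id -> (burst, rate); an assumed token bucket per flow.
    Implements D_n = max(A_n, D_{n-1}, eligible time of packet n).
    """
    # Per-flow bucket state: (tokens available, time of last update).
    state = {f: (burst, 0.0) for f, (burst, _rate) in regulators.items()}
    departures = []
    prev_departure = float("-inf")
    for arrival, flow, length in packets:
        burst, rate = regulators[flow]
        tokens, t_last = state[flow]
        # Earliest time at which the flow's bucket holds `length` tokens.
        eligible = t_last + max(0.0, (length - tokens) / rate)
        # Single queue: a packet cannot leave before its predecessor.
        departure = max(arrival, prev_departure, eligible)
        # Refill the bucket up to `departure`, then spend `length` tokens.
        tokens = min(burst, tokens + (departure - t_last) * rate) - length
        state[flow] = (tokens, departure)
        departures.append(departure)
        prev_departure = departure
    return departures


if __name__ == "__main__":
    regs = {"a": (1000.0, 1000.0), "b": (500.0, 500.0)}
    pkts = [(0.0, "a", 1000.0), (0.0, "a", 1000.0), (0.0, "b", 500.0)]
    print(minimal_ir(pkts, regs))
```

In the usage example the third packet, of flow "b", is eligible at time 0 but leaves at time 1 because it waits behind the second packet of flow "a". This is the head-of-line delay of an individual packet that the theorem tolerates while keeping the overall delay upper bound unchanged.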
One thing to note is that even if all of the above conditions are satisfied and the delay upper bound is not increased, this property holds only for the delay upper bound of the entire system. That is, the delay bound of a specific flow may increase. Using this property, the ATS framework is suggested, in which a FIFO system and a minimal IR are cascaded in adjacent nodes in a network, as depicted in Figure 2. Note that, at the output module, the packets have to be redistributed to the minimal IRs according to their input ports. Observe that Theorem 5 of [5] holds even if the condition for the FIFO system is generalized as follows: The system S guarantees the FIFO property for a subset of the packets in a network, and this subset of packets is fed into a minimal IR. In other words, the FIFO system S in Figure 1 has multiple subsets of output packets, and only the packets within the same subset are FIFO. As many minimal IRs as there are subsets are attached to S.
The generalization of the FIFO condition enables a FIFO system S to be interpreted as a network, for a set of flows that share the same input/output ports and the same queues. The set of flows is treated with a scheduling scheme that guarantees FIFO within a queue.
Note that any fair-queueing based scheduler guarantees FIFO in a queue. Therefore, if the flows with the same input and output ports in a network are aggregated into an FA and served by FA-based fair-queueing schedulers, and the minimal IRs are implemented per FA at the edge of the network, then the delay upper bound is intact while the burst accumulation is suppressed.
Based on this observation, the following end-to-end delay guarantee framework is proposed.
• Flows are divided into high priority and low priority.
• Low priority flows are put in a single FIFO queue at the output port of all nodes and processed in strict priority mode with preemption.
• High priority flows are handled as follows.
  - Select an appropriately sized network.
In Figure 4, the flow aggregates, which are formed according to the input-output port pair of the network, are scheduled fairly. An FA in the network is fed to the next network, at whose ingress edge node the minimal IRs are implemented and the FA is regulated. If the minimal IRs were located in the egress edge node, then the packets scheduled by a fair scheduler would have to be redistributed into different IRs. The eligible times and the corresponding transmission times of packets from different IRs would then overlap, which violates the zero-delay condition. Exactly as in the ATS framework, we assume that zero delay may be provided by a switch module in a node, for example with infinitely large bandwidth of the switch module.
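The classification step of the framework can be sketched as follows. This is a minimal illustration, assuming that each flow record carries the ingress and egress port of the network it traverses. The record layout and the name `aggregate_flows` are hypothetical.

```python
from collections import defaultdict


def aggregate_flows(flows):
    """Split flows by priority and group high-priority flows into FAs.

    flows: iterable of (flow_id, ingress_port, egress_port, priority),
           where priority is "high" or "low" (assumed encoding).
    Returns (fas, low): fas maps (ingress_port, egress_port) to the list
    of flow ids in that flow aggregate; low lists the low-priority flows,
    which would share a single FIFO queue per output port.
    """
    fas = defaultdict(list)
    low = []
    for flow_id, ingress, egress, priority in flows:
        if priority == "high":
            # One FA, and hence one minimal IR, per port pair.
            fas[(ingress, egress)].append(flow_id)
        else:
            low.append(flow_id)
    return dict(fas), low


if __name__ == "__main__":
    demo = [
        ("f1", "in0", "out1", "high"),
        ("f2", "in0", "out1", "high"),
        ("f3", "in1", "out1", "high"),
        ("f4", "in0", "out0", "low"),
    ]
    print(aggregate_flows(demo))
```

The number of keys in the returned mapping is the number of queues a fair-queueing scheduler has to serve and the number of minimal IRs needed at the edge, which is where the complexity reduction relative to per-flow scheduling comes from.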

Numerical Analysis
We present the numerical performance analysis of the proposed framework. The symbols for the parameters frequently used in the analysis are given in Table 1. If a flow i traverses only latency-rate (LR) schedulers in its path (k LR schedulers $S_1, \ldots, S_k$ in total), then the end-to-end delay experienced by the packets in the flow is bounded by the following inequality [6].
$$D_i \le \frac{\sigma_i - L_i}{\rho_i} + \sum_{j=1}^{k} \Theta_i^{S_j} \qquad (1)$$
Here $\sigma_i$ and $\rho_i$ are the max burst size and the rate of flow i, $L_i$ is its max packet length, and $\Theta_i^{S_j}$ is the latency of flow i at scheduler $S_j$. The latency of a scheduler can be interpreted to be the maximum time a flow may have to wait, from the start of a busy period, to be served with its allocated service rate. The packetized generalized processor sharing (PGPS) is an ideal but complex LR scheduler. PGPS's latency is given as follows [6].

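$$\Theta_i^{PGPS} = \frac{L_i}{\rho_i} + \frac{L_{max}}{r} \qquad (2)$$
where $L_{max}$ is the max packet length in the link and r is the link capacity. The deficit round robin (DRR) is the representative round-robin LR scheduler with reduced complexity. The latency of a DRR scheduler, in case the quantum values may be smaller than the max packet length, is given as follows [7].
$$\Theta_i^{DRR} = \frac{1}{r}\left[(\Phi - \phi_i)\left(1 + \frac{L_i}{\phi_i}\right) + \sum_{n=1}^{N} L_n\right] \qquad (3)$$
where $\Phi$ is the sum of all quantum values ($\phi_i$) of active flows in the scheduler, and N is the number of active flows. Quantum refers to the amount of data serviced at one time, which is determined in proportion to the service rate allocated to each flow [8].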
A FIFO scheduler is also an LR scheduler with the latency given as the following.
where N is the number of active flows.
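The LR bound and the PGPS and DRR latencies can be evaluated numerically with a few helper functions. The sketch below assumes that the bound has the form (burst - packet length) / rate plus the sum of the per-hop latencies, that the PGPS latency is L_i/rho_i + L_max/r, and that the DRR latency has the form given in the comments. The function names are hypothetical, and the FIFO latency is deliberately left out.

```python
def lr_delay_bound(sigma, pkt_len, rho, latencies):
    """End-to-end bound over LR schedulers: (sigma - L_i)/rho_i + sum(Theta)."""
    return (sigma - pkt_len) / rho + sum(latencies)


def pgps_latency(pkt_len, rho, max_pkt_len, link_rate):
    """Assumed PGPS latency: L_i/rho_i + L_max/r."""
    return pkt_len / rho + max_pkt_len / link_rate


def drr_latency(quantum, pkt_len, quanta, pkt_lens, link_rate):
    """Assumed DRR latency:
    ((sum of quanta - phi_i) * (1 + L_i/phi_i) + sum of max packet lengths) / r.

    quanta and pkt_lens list the quantum and max packet length of every
    active flow in the scheduler, including flow i itself.
    """
    total_quanta = sum(quanta)
    return ((total_quanta - quantum) * (1 + pkt_len / quantum)
            + sum(pkt_lens)) / link_rate


if __name__ == "__main__":
    L, r = 1e4, 1e9                       # 10 Kbit packets, 1 Gbps link
    theta = pgps_latency(L, r / 65536, L, r)
    # 16 PGPS hops, 65536 identical flows per port, burst of one packet.
    print(lr_delay_bound(L, L, r / 65536, [theta] * 16))
```

With 65536 identical flows per port, 10 Kbit packets, and a 1 Gbps link, the usage example evaluates a 16-hop per-flow PGPS path and yields about 10.486 s, in line with the IntServ value reported later in the analysis.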

A Single Network Case
Consider a network in which all the flows have the same characteristics and have to pass the same h hops to depart, as shown in Figure 5. Every node has two input and two output ports. $np^h$ flows ingress to an input port, and among them $np^{h-1}$ flows egress to the same output port. On the second node, $np^{h-2}$ flows among them egress to the same port, and on the last node, $np^{h-h} = n$ flows of them egress to the same port. Therefore, there are n flows having the same pair of {input and output ports} in the network. Suppose this input/output pattern occurs on all the nodes.
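The fan-out pattern above can be tabulated with a one-line helper. The name `flows_sharing_path` is hypothetical. The helper simply lists how many of the flows that entered together still share the same path after each hop.

```python
def flows_sharing_path(n, p, h):
    """Flows still sharing the observed path after 0, 1, ..., h hops.

    Starts at n * p**h at the network input port and shrinks by a factor
    of p at every node, ending at n (the size of one flow aggregate).
    """
    return [n * p ** (h - k) for k in range(h + 1)]


if __name__ == "__main__":
    print(flows_sharing_path(3, 2, 3))   # [24, 12, 6, 3]
```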

PGPS Scheduler Case
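From (1) and (2) the network delay of the flow-based framework with the PGPS schedulers is bounded by
$$D^{max}_{flow\_PGPS} \le h(np^h + 1)\frac{L}{r}. \qquad (5)$$
We assume for simplicity that $L_i = L_{max} = L$ and $\rho_i = r/(np^h)$, for all i. Since the max burst of every flow is also L, the burst term of (1) vanishes and each of the h latencies equals $(np^h + 1)L/r$. Similarly, for the FA-based framework with the PGPS, there are $p^h$ FAs in an output port and the max burst of an FA is nL, therefore
$$D^{max}_{FA\_PGPS} \le \{(n - 1 + h)p^h + h\}\frac{L}{r}. \qquad (6)$$
The difference between the two delay bounds is
$$D^{max}_{flow\_PGPS} - D^{max}_{FA\_PGPS} = (n - 1)(h - 1)p^h\frac{L}{r}.$$
The difference is linearly proportional to the FA size (the number of flows in an FA) and the maximum packet size, exponentially proportional to the network size (the max number of hops in a network), and inversely proportional to the capacity of the link. Note that the difference is zero when n = 1, and positive for all n, h > 1, which means that the FA framework gives the smaller bound.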
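The two PGPS bounds can be checked numerically. The sketch below assumes the LR bound in the form (burst - L)/rho + h * latency and the PGPS latency L/rho + L/r, with identical flows of burst L. The function names are hypothetical.

```python
def flow_pgps_bound(n, p, h, L, r):
    """Flow-based PGPS bound: n*p**h flows share a port, each with burst L."""
    flows_per_port = n * p**h
    rho = r / flows_per_port              # equal share of the link per flow
    theta = L / rho + L / r               # assumed PGPS latency per hop
    return (L - L) / rho + h * theta      # burst term vanishes (sigma = L)


def fa_pgps_bound(n, p, h, L, r):
    """FA-based PGPS bound: p**h FAs share a port, each with burst n*L."""
    fas_per_port = p**h
    rho = r / fas_per_port                # equal share of the link per FA
    theta = L / rho + L / r
    return (n * L - L) / rho + h * theta


if __name__ == "__main__":
    n, p, h, L, r = 256, 2, 8, 1e4, 1e9
    flow = flow_pgps_bound(n, p, h, L, r)
    fa = fa_pgps_bound(n, p, h, L, r)
    closed_form = (n - 1) * (h - 1) * p**h * L / r
    print(flow, fa, flow - fa, closed_form)
```

The last two printed values agree, which is a quick sanity check of the closed-form difference $(n - 1)(h - 1)p^h L/r$ for one parameter set.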

DRR Scheduler Case
With DRR schedulers, from (1) and (3) we obtain the delay bound in the flow-based framework with DRR schedulers as follows.
$$D^{max}_{flow\_DRR} \le h(3np^h - 2)\frac{L}{r} \qquad (7)$$
We assume for simplicity that $\phi_i = L$ for every flow, since the quantum values are determined to be proportional to the flows' arrival rates, which are all identical for the flow-based framework. Similarly, for the FA-based framework we assume that $\phi = L$ for every FA. Note that every FA has the same aggregated arrival rate. Therefore, with $p^h$ FAs in an output port and the max burst of an FA being nL,
$$\Theta_{FA\_DRR} = (3p^h - 2)\frac{L}{r}, \quad \text{and} \quad D^{max}_{FA\_DRR} \le \{(3h + n - 1)p^h - 2h\}\frac{L}{r}. \qquad (8)$$
The difference between the two delay bounds,
$$D^{max}_{flow\_DRR} - D^{max}_{FA\_DRR} = (3h - 1)(n - 1)p^h\frac{L}{r},$$
has a form similar to the one with PGPS schedulers. It is linearly proportional to the FA size and exponentially proportional to the single network size. Note again that the difference is zero when n = 1. We have seen that the delay bound gain obtained by applying the FA framework is always positive for all n, h > 1. Even if the networks are interconnected to form a bigger internetwork, in the identical flows case the gain only becomes larger, since the delay bounds of all the flows are identical.
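The DRR comparison can be checked with exact integer arithmetic by expressing the bounds in units of L/r. The sketch assumes that, with all quanta equal to the max packet length L, the DRR latency for N queues is (3N - 2)L/r, and that the LR bound has the form (burst - L)/rho + h * latency. The function names are hypothetical.

```python
def flow_drr_units(n, p, h):
    """Flow-based DRR bound in units of L/r.

    n*p**h per-flow queues per port; the burst term vanishes (sigma = L).
    """
    return h * (3 * n * p**h - 2)


def fa_drr_units(n, p, h):
    """FA-based DRR bound in units of L/r.

    p**h FA queues per port; burst term (n*L - L) / (r / p**h).
    """
    return (n - 1) * p**h + h * (3 * p**h - 2)


def drr_gain_units(n, p, h):
    """Gain of FA-based over flow-based scheduling, in units of L/r."""
    return flow_drr_units(n, p, h) - fa_drr_units(n, p, h)


if __name__ == "__main__":
    # Compare against the closed form (3h - 1)(n - 1)p**h on a small grid.
    for n in range(1, 6):
        for p in range(2, 5):
            for h in range(1, 6):
                assert drr_gain_units(n, p, h) == (3 * h - 1) * (n - 1) * p**h
    print(drr_gain_units(256, 2, 8))
```

Working in units of L/r keeps every quantity an integer, so the check of the closed-form gain is exact rather than subject to floating-point rounding.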

ATS Framework with FIFO Scheduler Case
We investigate the case where the network in Figure 5 employs the ATS framework, which has FIFO schedulers and minimal IRs at every node. In this case (1) is applied to a single node, since the minimal IR is not an LR scheduler. We assume for simplicity that $L_i = L_{max} = L$, again. We have $np^h$ flows in an output port, whose burst sizes are all L. The burst size of the aggregated flows at a FIFO scheduler is therefore $np^h L$. By using (1) and (4), we get the per-node delay bound $(np^h - 1)L/r + \Theta^{FIFO}$, where $\Theta^{FIFO}$ is the FIFO latency of (4), since we assume $\rho = r$ for the highest priority FIFO scheduler. We have h such nodes in a network, therefore the network delay bound is given as
$$D^{max}_{ATS} \le h\left\{(np^h - 1)\frac{L}{r} + \Theta^{FIFO}\right\}. \qquad (9)$$
The difference $D^{max}_{ATS} - D^{max}_{FA\_DRR}$ is larger than 0 when n ≥ 2, which means that with more than just one flow in an FA, the proposed framework performs better than ATS.
We can see that the proposed framework with DRR schedulers performs better than both the flow-based framework and ATS when h and n are large. This is because the dominant term in $D^{max}_{flow\_DRR}$ and $D^{max}_{ATS}$ is proportional to $hnp^h$, while the dominant term in $D^{max}_{FA\_DRR}$ is proportional to $hp^h$ and $np^h$.

Internetwork of Multiple Networks Case
We investigate the delay bound of the proposed framework with multiple networks interconnected. We focus on how the delay bound changes with the choice of a single network size, given a fixed internetwork size. An example network for the analysis of the proposed framework is depicted in Figure 6, in which minimal IRs are implemented between the networks. Assume the internetwork in Figure 6 is perfectly symmetrical. A flow under observation travels d networks, with identically h hops in each network, which makes the total number of hops the flow travels E = hd. The critical design choice in this architecture is the value of h (and thus d), given E. Further, let the number of flows entering a port, F, be represented by $np^h$, as in Figure 6. We consider the end-to-end delay bound of the internetwork, with fixed values of E and F. A larger h means smaller d and n, and fewer minimal IRs. If h = E, then d = 1, the networks are merged into a single network, and there are no interim IRs. If h = E and n = 1, then there is no flow aggregation, which is identical to the IntServ framework. On the other hand, a smaller h means a smaller network size and more minimal IRs. If h = 1, then an IR resides at every node, which is similar to the ATS framework, except that ATS uses FIFO schedulers.
Consider the end-to-end delay bounds of the IntServ, ATS, and the proposed framework. With the constants p, F, and E, from $np^h = F$ and $hd = E$ we get $d = E/h$ and $n = F/p^h$. First, for IntServ, since it has the "pay burst only once" property, from (5), the end-to-end delay bound is given as follows.
$$D^{max}_{IntServ} \le E(F + 1)\frac{L}{r} \qquad (10)$$
Second, for the proposed framework, from (6),
$$D^{max}_{FA} \le d\{(n - 1 + h)p^h + h\}\frac{L}{r} = \frac{E}{h}\{F + (h - 1)p^h + h\}\frac{L}{r}. \qquad (11)$$
Similarly, for ATS, whose per-node bound is applied to all E nodes,
$$D^{max}_{ATS} \le E\left\{(F - 1)\frac{L}{r} + \Theta^{FIFO}\right\}, \qquad (12)$$
where $\Theta^{FIFO}$ is the FIFO latency of (4) with N = F. Now consider a case where p = 2, E = 16, F = 65536 = $2^{16}$, r = 1 Gbps, L = 10 Kbit. The RHS (right hand side) of Equation (11) gives the red curve in Figure 7. The blue line represents the value of the RHS of (10), IntServ, which is 10.486 s. The value of the RHS of (12), ATS, which is 20.97 s, is not shown. Note that if h = 16, then n = 1, and the case becomes identical to the IntServ framework. With h = 16, the total delay bound of IntServ is indeed identical to that of the proposed framework. Also note that h = 1 gives the same bound. All the possible choices of h and d give delay bounds smaller than or equal to that of IntServ, and they are always smaller than that of ATS. The optimal choice in this case is {h = 8, d = 2}, which gives a delay bound of 1.347 s, almost 8 times smaller than that of IntServ and almost 16 times smaller than that of ATS. This result is remarkable. By only dividing a path into two parts, putting a minimal IR in the path, and aggregating flows accordingly, we can reduce the delay bound to about 1/8 of that of IntServ. The scheduler complexity is reduced in the order of $2^8$.
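The design sweep over the network size h can be reproduced with a short script. The sketch assumes the IntServ bound E(F + 1)L/r and the per-network FA bound {(n - 1 + h)p^h + h}L/r with PGPS schedulers, applied d = E/h times. The function names are hypothetical, and the ATS bound is not included.

```python
def intserv_bound(E, F, L, r):
    """Assumed IntServ bound over E hops with F flows per port."""
    return E * (F + 1) * L / r


def proposed_bound(h, p, E, F, L, r):
    """Assumed bound of the proposed framework for network size h."""
    d = E // h                      # number of networks on the path
    n = F // p**h                   # flows per flow aggregate
    per_network = ((n - 1 + h) * p**h + h) * L / r
    return d * per_network


def sweep(p, E, F, L, r):
    """Bound for every h that divides E and keeps n = F / p**h an integer."""
    return {
        h: proposed_bound(h, p, E, F, L, r)
        for h in range(1, E + 1)
        if E % h == 0 and F % p**h == 0
    }


if __name__ == "__main__":
    p, E, F, L, r = 2, 16, 65536, 1e4, 1e9   # 10 Kbit packets, 1 Gbps links
    bounds = sweep(p, E, F, L, r)
    for h, b in bounds.items():
        print(f"h={h:2d} d={E // h:2d} bound={b:.3f} s")
    best = min(bounds, key=bounds.get)
    print("best h:", best, "IntServ:", round(intserv_bound(E, F, L, r), 3))
```

For the first parameter set the sweep selects h = 8 with a bound of about 1.347 s, and the two extremes h = 1 and h = 16 both coincide with the IntServ value of about 10.486 s, consistent with the values quoted in the text.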
Next, consider a network with p = 8, E = 4, F = $8^4$ = 4096, r = 1 Gbps, L = 10 Kbit. This set of parameters represents a smaller network whose nodes have more ports. The RHS of Equation (11) gives the red curve in Figure 8. The blue line represents the value of the RHS of (10), IntServ, which is 0.164 s. The value of the RHS of (12), the delay bound of ATS, which is 0.328 s, is not shown. This means that even in a small network with an end-to-end hop count of 4, dividing the path into two, aggregating flows accordingly, and inserting a minimal IR per FA produces a delay bound that is only half of that of IntServ. The scheduler complexity is also reduced in the order of $8^2$. Table 2 summarizes the delay bounds of the three frameworks for the two network scenarios.
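The complexity figures quoted for the two scenarios can be recomputed as queue counts per output port. The sketch assumes that scheduler complexity is measured by the number of queues the scheduler serves: one per flow in the flow-based case and one per FA in the proposed framework. The function name is hypothetical.

```python
def queues_per_port(p, h, F):
    """Return (flow_queues, fa_queues, flows_per_fa) for one output port.

    flow_queues:  queues of a per-flow scheduler (one per flow, F in total).
    fa_queues:    queues of an FA-based scheduler (one per FA, p**h in total).
    flows_per_fa: reduction factor F / p**h, i.e., the FA size n.
    """
    fa_queues = p**h
    return F, fa_queues, F // fa_queues


if __name__ == "__main__":
    print(queues_per_port(2, 8, 65536))   # first scenario, h = 8
    print(queues_per_port(8, 2, 4096))    # second scenario, h = 2
```

The reduction factors are 256 and 64, that is, $2^8$ and $8^2$, matching the orders of reduction stated for the two scenarios.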

Discussion on the Comparison of the Frameworks
The proposed framework with minimal IRs and FA scheduling can be seen as a generalized framework that embraces the IntServ and the TSN ATS frameworks as its extreme implementation cases, as Table 3 suggests. At one extreme, the network for flow aggregation encompasses only a single node, so that IRs are placed between every pair of adjacent nodes, which is similar to the TSN ATS framework except that ATS uses class-based FIFO schedulers. At the other extreme, the network for flow aggregation encompasses the whole internetwork and does not need any IR, which is similar to the IntServ framework. The difference in this case is that the proposed framework aggregates flows according to the input and output ports of a network. The major complexity of the three frameworks comes from the scheduler. In this regard, ATS has the advantage. The complexity of the proposed framework is smaller than or equal to that of IntServ. The IR also contributes to the complexity, but its contribution is negligible since it maintains a single queue, although the IR still has to keep and update the state of every flow. The drawback of the IR resides in the average delay. It is conjectured that more IRs produce a larger average delay. This is for further study, with analysis or simulations. The number of IRs required for the ATS framework is proportional to the square of the number of ports of all the nodes in a network. The number of IRs required for the proposed framework is proportional to the square of the number of ports of all the edge nodes, which is always less than that in ATS. Therefore, the proposed framework is expected to enjoy less complexity than IntServ and a smaller average delay than ATS, with a smaller delay bound than both.

Conclusion
We proposed a framework with flow aggregate (FA)-based scheduling in a network and with minimal interleaved regulators (IRs) placed between the networks. We have shown that the framework can guarantee a smaller delay upper bound than both the IntServ and the ATS frameworks.
Our contribution is two-fold. First, it is observed that a FIFO system can be a network, for flows that share the same input/output ports and the same queues of the network and are treated with a scheduling scheme that guarantees the FIFO property within a queue. Any fair-queueing based scheduler meets the FIFO requirement. Second, based on this observation, we proposed a delay guarantee framework in which the flows are aggregated in a network based on their {input, output ports} pair and the minimal IRs per FA are implemented at the edge of the network. The networks with the minimal IRs are interconnected to form an internetwork. Compared to IntServ, since the number of FAs is in the order of the square of the number of ports in the network, the queueing and scheduling complexity is lower than that of a system based on flows. Compared to the ATS framework, the FA-based framework requires IRs only at the edge of a network, while ATS requires them at every node. Therefore, the average delay is conjectured to be smaller than that of the ATS framework, which needs further study.
Numerical analysis confirmed that the performance of the proposed framework is better than that of both the IntServ and the ATS frameworks, at least in networks with identical flows and symmetrical topology. In these networks, we found that the dominant term of the delay bound in both the flow-based and the ATS frameworks is proportional to $hnp^h$, while the dominant term in the proposed framework is proportional to $hp^h$ and $np^h$, where h, n, and p are the network hop count, the number of flows in an FA, and the number of ports of a node, respectively. Further analysis shows that for a given internetwork size and number of flows, with a proper selection of the network size for flow aggregation, the delay upper bound is greatly decreased. Further study is required on the optimum choice of the network size for flow aggregation with arbitrary parameters and topologies.
Funding: This research received no external funding.

Conflicts of Interest:
The authors declare no conflict of interest.