Towards Causal Consistent Updates in Software-Defined Networks

Amine Guidara; Saúl E. Pomares Hernández; Lil María X. Rodríguez Henríquez; Hatem Hadj Kacem; Ahmed Hadj Kacem

doi:10.3390/app10062081

,

and

¹

Department of Computer Science, Instituto Nacional de Astrofísica, Óptica y Electrónica (INAOE), Tonantzintla, Puebla 72840, Mexico

²

ReDCAD Laboratory, University of Sfax, Sfax 3029, Tunisia

³

CNRS, LAAS, 7 Avenue du Colonel Roche, F-31400 Toulouse, France

⁴

Université de Toulouse, LAAS, F-31400 Toulouse, France

Appl. Sci.2020, 10(6), 2081;https://doi.org/10.3390/app10062081

This article belongs to the Section Computing and Artificial Intelligence

Version Notes

Order Reprints

Abstract

A network paradigm called the Software-Defined Network (SDN) has recently been introduced. The idea of SDN is to separate the control logic from forwarding devices to enable a centralized control platform. However, SDN is still a distributed and asynchronous system: events can be triggered by any network entity, while messages and packets are prone to arbitrary and unpredictable transmission delays. Moreover, the absence of a global temporal reference results in a broad combinatorial range space of event order. During network updates, an out-of-order execution of events may result in a deviation from desirable consistent network update properties, leading, for example, to forwarding loops and forwarding black holes, among others. In this paper, we introduce a study of the Transient Forwarding Loop (TFL) phenomenon during SDN updates; for this, we define a formal model of the TFL based on causal dependencies that capture the conditions under which it may occur. Based on this model, we introduce an algorithm that ensures the causal dependencies of the system oriented toward TFL-free SDN updating. We formally prove that it is sufficient to ensure the causal dependencies in order to guarantee TFL-free network updates. Finally, we analytically evaluate our algorithm and discuss how it outperforms the state-of-the-art in terms of updating overhead.

Keywords:

software-defined networks; OpenFlow; Transient Forwarding Loop (TFL); pattern; happened-before relation; causal order delivery; consistent network updates

1. Introduction

Currently, Software-Defined Networks (SDNs) present a revolution in the field of computer networks since they have reshaped several concepts of IP networks []. An important concept is that the network control logic is decoupled from forwarding devices. Indeed, the control logic is moved from forwarding devices, and it is encapsulated into an external entity called the controller (control plane). Through the latter, and by leveraging a logically centralized network view and standardized communication protocols (e.g., OpenFlow [] and ForCES []), forwarding devices (data plane) become programmable. Despite the concept of a logically centralized controller, SDN remains a distributed system wherein the control and data planes cooperate and communicate via an asynchronous communication (asynchronous communication implies arbitrary and unpredictable transmission delays) interface to establish networking. Furthermore, an out-of-order execution of events may occur since no global temporal reference is shared between network entities and message delays are arbitrary. This leads to the following problem: when updating a network while packet flows are taking routes to their destination, an out-of-order execution could give rise to non-deterministic behavior that temporarily deviates from network properties, which in turn may result in an inconsistent network update. Moreover and as a result, the network-wide view from the controller can transitorily be inconsistent with the current data plane state, which could affect the consistency of future network updates. To ensure consistent updates, depending on the network application, the network should align with some properties, such as no Transient Forwarding Loop (TFL) and no forwarding black hole, among others. The no TFL is one of the essential network properties desired by several network applications, including traffic engineering, virtual machine migration, and planned maintenance []. Informally, it ensures that a packet is never forwarded along a loop back during an arbitrary time interval to a forwarding device in the network where it was previously processed. To the best of our knowledge, (i) no study formally specifies under which conditions TFLs may occur in the context of SDNs. Furthermore, (ii) no solution to this problem aligns with the distributed and asynchronous nature of the SDNs. Indeed, the proposed solutions are centralized or synchronized. In fact, a centralized-based solution is associated with memory overhead: to perform updates, it makes use of the centralized controller with full knowledge of the network, i.e., the complete network forwarding graph. In addition, calculating consistent updates based on full knowledge of the network in large-scale networks becomes costly in terms of performance overhead. On the other hand, a synchronized-based solution comes with bandwidth overhead: it relies on synchronizing switch clocks to perform updating operations simultaneously. Apart from the control information overhead, such a solution presents a risk of inconsistency during the transition phases of updates since synchronization protocols do not perfectly synchronize switch clocks. We argue that achieving updates in a loop-free manner with a trade-off between ensuring consistent updates and performing efficient updates is still a challenging open problem.

In this paper, we introduce the first of a family of patterns: TFL is defined to capture asynchronously and avoid preventively it at run-time during SDN updates. The main contributions are the following:

OpenFlow-based SDN updates are modeled at the event level according to the distributed and asynchronous nature of SDNs.
A formal model of the TFL based on temporal and causal dependencies (the causal dependencies are based on the happened-before relation defined by Lamport in [] (see Section 5 for more details) that capture the conditions under which it may occur is presented. We highlight that this study is a key contribution in this work since it defines the root cause behind the triggering of TFLs and allows us to define how to achieve a TFL-free property when updating SDNs.
A causal consistent update algorithm oriented to ensure the TFL-free property is presented. This algorithm is based on the work of Prakash et al. [].
A proof of correctness that shows that the algorithm is TFL-free is provided.
The proposed algorithm is analytically evaluated, concluding that it enhances the update performance overhead.

The rest of this paper is structured as follows. Section 2 presents the preliminaries for the rest of this paper. Section 3 motivates the importance of the problem. A frame and a discussion of related work are presented in Section 4. Section 5 describes the network model. A study of the problem from a temporal perspective is described in Section 6. The proposed solution is presented in Section 7. An example scenario is described in Section 8. We evaluate and discuss the proposed approach in Section 9. Conclusions and future work are discussed in Section 10.

2. Preliminaries

2.1. Fundamental Abstractions of SDNs

A Software-Defined Network (SDN) refers to a new generation in the evolution of computer networks. SDN was launched in 2008. It was standardized by the Open Network Foundation [] and implemented by a number of original equipment manufacturers (HP, CISCO, IBM, Juniper, NEC, and Ericsson).

SDN is an emerging paradigm that relies on decoupling the tightly coupled implementation of the network control logic and network devices. In fact, the control logic (control plane) is separated from network devices (data plane) and is implemented in a logically centralized controller []. Thus, SDN mainly consists of three planes: the application plane, the control plane, and the data plane. Figure 1 depicts a simplified view of the SDN architecture. The application plane is presented as a set of network applications that implement network control logic (e.g., firewall, traffic engineering, load-balancer, etc.) leveraging a northbound interface, e.g., REST API [], which offers universal network abstraction data models and functionality to developers [,]. This set of network applications also either explicitly or directly notifies the network behavior to the control plane by means of a northbound interface. The former is presented by a logically centralized controller, which is also named the Network Operating System (NOS), and supports the translation of the network requirements and the desired behavior of the application plane to the data plane based on a southbound interface, e.g., OpenFlow [] and ForCES []. Indeed, a southbound interface formalizes the way in which the control plane and the data plane communicate. Finally, the data plane is presented by the set of network devices, e.g., switches and routers, which remain a set of simple forwarding devices [,].

Figure 1. The general architecture of SDN.

2.2. Network Traffic Handling in an OpenFlow-Based SDN

This work of research is built on an OpenFlow-based SDN []. OpenFlow proposes a flow-based traffic forwarding decision approach. This forwarding approach enables flexibility on programming network traffic as all packets that belong to the same flow receive the same forwarding decisions []. To (re)configure switches, the controller interacts with switches via an OpenFlow channel interface that connects each switch with the controller and by exchanging OpenFlow messages based on the OpenFlow switch protocol. This protocol provides a reliable message delivery and processing but does not ensure ordered message processing [] since the communication between the OpenFlow controller and the switches is asynchronous. In this paper, we are interested in a single type of message: a message instructing an OpenFlow switch in the network to update its flow table with new entries, namely the FlowMod controller-to-switch message (see the details in Section 5). In fact, each switch contains one or more flow tables that store a set of entries (forwarding rules). An entry consists of matching fields and forwarding actions. Each entry matches a flow or a set of flows and performs certain actions (dropping, forwarding, modifying, etc.) on the matched flows.

3. Problem Description: TFL as an Inconsistent Network Update Problem

Despite the centralization of the logic control, SDN remains a distributed and an asynchronous system. Indeed, the data plane is managed by a logically-centralized controller that communicates with the data plane only by OpenFlow message passing via asynchronous communication channels. In the real world, during updates, applications running on the controller may compile several entries, requiring the controller to disseminate OpenFlow messages to install entries on switches. Therefore, the switches may deliver OpenFlow messages that are injected by the controller and in-fly packets, which interleave with update messages in any order, leading to inconsistent updates.

Flow swaps: a motivating scenario. In order to analyze the TFL phenomenon in SDNs, we use a common and inevitable network update scenario known as flow swaps. Figure 2 illustrates an example of a basic flow swap scenario. In this scenario, the topology is composed of an OpenFlow controller and two switches. Initially, Switch 1 (

S_{1}

) contains an entry that forwards a specific packet flow

f_{i}

to Switch 2 (

S_{2}

) (see Figure 2a). Due to a network policy change,

f_{i}

should be forwarded from

S_{2}

to

S_{1}

. To this end, the OpenFlow controller starts by instructing, based on two controller-to-switch OpenFlow messages of type FlowMod, (1)

S_{1}

to delete the entry directing

f_{i}

from

S_{1}

to

S_{2}

(see Figure 2b) and, then, (2)

S_{2}

to install a new entry to forward

f_{i}

from

S_{2}

to

S_{1}

. While the controller sends Operation (1) before Operation (2), the second operation finishes before the first one. In this case, an in-fly packet flow

f_{j}

may interleave with the installation of the rule in

S_{2}

, forwarding network traffic to

S_{1}

, and then

f_{j}

enters into a TFL between

S_{1}

and

S_{2}

.

Figure 2. Flow swap scenario: (a) initial forwarding policy; (b) after updating the initial forwarding policy.

Based on the previous example, an SDN update may result in a broad combinatorial range space of messages and packets ordering, and thus, the controller may not assume anything about the data plane state. Therefore, the network wide-view from the controller may be temporarily or permanently inconsistent with the current data plane state. Intuitively, this affects the consistency of network updates. Indeed, three of eleven bugs reported in [] were caused by inconsistent views from the controller. Alternatively, an OpenFlow controller can explicitly request acknowledgments by sending a barrier request message (BarrierReq) after sending a message, that is it sets a synchronization point between messages. An OpenFlow switch of a destination should then respond with a barrier response message (BarrierRes) once the message and all other messages received before the message have been processed []. However, in which case and time should the controller explicitly request to install a BarrierReqmessage? What about the overhead of messages in large-scale networks? On the other hand, such a solution forces the system to stop processing any new requests until all messages sent before the BarrierReq message are completely processed. Indeed, this solution can harm performance by increasing the number of control messages, controllers, switch operations, and delays.

5. System Model

We consider a network represented through a set of nodes

N = {n_{1}, n_{2}, \dots}

(defined later as a set of processes) and a generic (random) forwarding path, leading messages and packet flows, between a node

n_{i}

and another

n_{j}

(represented later as a set of forwarding paths), which is subject to being updated.

To formalize this system, we introduce a model of executions of message- and packet-passing between OpenFlow-based SDN entities during updates. First of all, in Section 5.1, we present the model related to a typical distributed system. Then, in the next subsection, we define relevant concepts for this paper, basically the Happened-Before Relation (HBR) and the Immediate Dependency Relation (IDR) definitions. After that, in Section 5.3, we introduce our SDN model by adapting the typical distributed system model to the SDN context and extending it to cover other sets specific to SDNs. Finally, in Section 5.4, we discuss the causal order delivery, which is a fundamental property for our proposed solution.

5.1. Distributed System Model

To introduce our SDN model, we start by presenting the model of a typical distributed system as a point of departure. A typical distributed system is composed of different separated entities that communicate with each other by exchanging messages. It is assumed that no global time reference exists and the message transmission delay is arbitrary and unpredictable. Furthermore, for simplicity in this work, it is assumed that no message is lost.

At a high level of abstraction, a distributed system can be described based on the following sets: P, M, and E, which correspond, respectively, to the set of processes, the set of messages, and the set of events.

Processes: programs, instances of programs, or threads running simultaneously and communicating with other programs, instances of programs or threads. Each process belongs to the set of processes P. A process $p \in P$ can only communicate with other processes by message passing over an asynchronous communication network.
Messages: Abstractions of any type of messages, which can contain either arbitrarily simple or complex data structures. Each message in the system belongs to the set of messages M. A message $m \in M$ is sent considering an asynchronous and reliable network.
Events: An event e is an action or occurrence performed by a process $p \in P$ . Each event e in the system belongs to the set E. There are two types of events under consideration: internal events and external events. An internal event is an action that occurs at a process in a local manner. An external event is an action that occurs in a process, but it is seen by other processes and affects the global system state. The external events are the send and delivery events. A send event identifies the emission event of a message $m \in M$ executed by a process, whereas a delivery event identifies the execution performed or the consumption of an ingoing message m by a recipient process.

5.2. Time and Causal Order

Time is an important theoretical construct to understand how the transactions are executed in distributed systems. Indeed, it is difficult to determine if an event takes place before another one in the absence of a global physical time. However, the logical time introduces an agreed time between all processes based on which one can establish the execution order relation between any two events in a distributed system. This order relation was defined by Lamport [] by means of the Happened-Before Relation (HBR). The HBR is a strict partial order, and it establishes precedence dependencies between events. The HBR is also known as the relation of causal order or causality.

Definition 1.

The Happened-Before Relation (HBR) [] “→”is the smallest relation on a set of events E satisfying the following conditions:

If a and b are events that belong to the same process and a occurred before b, then $a \to b$ .
If a is the sending of a message by one process and b is the receipt of the same message by another process, then $a \to b$ .
If $a \to b$ and $b \to c$ , then $a \to c$ .

In practice, the use of HBR to maintain the causality is expensive since the relation between each pair of events should be considered, including the transitively Happened-before relationships between events (defined in the third condition of the HBR definition). To obliterate the notion that causality is expensive to set up in distributed systems, the author of [] proposed the Immediate Dependency Relation (IDR). The IDR is the minimal binary relation of the HBR that has the same transitive closure. We use the IDR to send the necessary and sufficient amount of control information sent per message to ensure causal order.

Definition 2.

The Immediate Dependency Relation (IDR) [] “↓”is the transitive reduction of the HBR, and it is defined as follows:

a ↓ b i f a \to b \land \forall c \in E, \neg (a \to c \to b)

5.3. SDN Model

We developed the SDN model by adapting the sets P, M (also extended to the set

M P

), and E, presented in the distributed system model of Section 5.1, to the SDN context and by adding the sets

M A T C H

,

M P

,

P F L O W

, and

F P A T H

, which correspond, respectively, to the set of matches, the set of messages and data packets, the set of packet flows, and the set of forwarding paths, which are specific to the SDN model.

Processes: The system under consideration is composed of a set of processes $P = {p_{1}, p_{2}, \dots}$ | $P = P_{c p}$ ∪ $P_{r p}$ where $P_{c p} =$ ${c p}$ represents the controller process $c p$ and $P_{r p} = {r p_{1}, r p_{2}, \dots}$ represents the set of routing processes of OpenFlow switches.
Matches: We consider a finite set of matches $M A T C H = {m a t c h_{1}, m a t c h_{2}, \dots}$ . A match $m a t c h \in M A T C H$ is a field value that identifies a forwarding path in the network.
It is important to mention that the match is a key attribute in establishing the proposed updating algorithm (see Section 7.2). In fact, each update is performed based on a match. This is because an update makes reference to a forwarding path $f p a t h_{μ} \in F P A T H$ , a packet flow $p f l o w_{τ} \in P F L O W$ , which is taking route to its destination according to $f p a t h_{μ}$ , and to any OpenFlow message $m \in M$ disseminated by the controller to update any routing process $r p \in f p a t h_{μ}$ that shares the same match $m a t c h_{λ} \in M A T C H$ .
Messages and data packets: The system includes a set of messages and data packets $M P = {m p_{1}, m p_{2}, \dots}$ | $M P = M \cup P K T$ where $M = {m_{1}, m_{2}, \dots}$ represents the OpenFlow messages and $P K T = {p k t_{1}, p k t_{2}, \dots}$ represents the data packets.
In addition to the set of messages M, we consider the set of OpenFlow message types $M_{t y p e}$ [] where $f_{m e s s a g e} : M \mapsto M_{t y p e}$ . Furthermore, we note that each message $m \in M$ corresponds to a $m a t c h \in M A T C H$ , denoted by $m \hat{=} m a t c h$ . Furthermore, a $m a t c h$ may also correspond to a subset of OpenFlow messages. We denote a message $m \in M$ as $m = (c p, t_{c p}, O p e n F l o w_m e s s a g e, r p_{j})$ where a controller $c p$ sends an $O p e n F l o w_m e s s a g e$ to a $r p_{j} \in P_{r p}$ at $t_{c p}$ (the logical clock of $c p$ ). Note that the tuple ( $c p$ , $t_{c p}$ ) represents the identifier of a message $m \in M$ .
In this paper, we consider controller-to-switch messages, and in particular the FlowMod message. This message allows the controller to modify the state of OpenFlow switches either by adding, modifying, or deleting entries. The FlowMod message is composed of various structures (see the details in []). In this work, we consider $m a t c h$ and $c o m m a n d$ as the relevant FlowMod message structures. In fact, the former specifies to which packet flow an entry corresponds, whereas the latter specifies the action to be performed on the matched packet flow. We consider FlowMod as a relevant message, where $c o m m a n d$ and $m a t c h$ represent the “ $O p e n F l o w_m e s s a g e$ ” structure in the specification of a message $m \in M$ .
A packet $p k t$ in the subset $P K T$ mentioned above is denoted as $p k t = (r p_{i}, t_{i}, h e a d e r, d a t a, r p_{j})$ , where an $r p_{i}$ forwards at the logical clock $t_{i}$ a data packet, composed of a $h e a d e r$ and $d a t a$ , to an $r p_{j}$ such that $r p_{i}, r p_{j} \in P_{r p}$ and $(r p_{i} \neq r p_{j})$ . Note that the tuple ( $r p_{i}$ , $t_{i}$ ) represents the identifier of a data packet $p k t_{i} \in P K T$ . The header of each data packet piggybacks a $m a t c h \in M A T C H$ that corresponds to all $m a t c h$ of packets belonging to the same packet flow. The data consist of the payload (the current intended message).
Packet flow: We consider a finite set of packet flows $P F L O W = {p f l o w_{1}, p f l o w_{2}, \dots}$ . A $p f l o w \in P F L O W$ (also $p f l o w \subset P K T$ ) is a sequence of packets between a source $r p_{i}$ and a destination $r p_{j}$ , ( $r p_{i}, r p_{j} \in P_{r p})$ . Furthermore, we consider the bijection $f_{p f l o w} : P F L O W \mapsto M A T C H$ , that is each $p f l o w$ corresponds to a $m a t c h \in M A T C H$ denoted by $p f l o w \hat{=} m a t c h$ .
Forwarding paths: A finite set of forwarding paths $F P A T H = {f p a t h_{1}, f p a t h_{2}, \dots}$ is considered. An $f p a t h \in F P A T H$ is a subset of routing processes $R p = {r p_{k}, r p_{k + 1}, r p_{k + 2}, \dots, r p_{k + n}}$ between an $r p_{s r c}$ source and an $r p_{d s t}$ destination, where $r p_{i}, r p_{j} \in P_{r p}$ and $R p \subset P_{r p}$ . Furthermore, we take into consideration the bijection $f_{f p a t h} : F P A T H \mapsto M A T C H$ , that is each $f p a t h$ corresponds to a $m a t c h \in M A T C H$ denoted by $f p a t h \hat{=} m a t c h$ .
Events: As mentioned in Section 5.1, there are two types of events: internal and external ones. We note that the internal events are not relevant to the rest of the paper. However, we define them for the completeness of the formal specification. The set of finite internal events $E_{i n t e r n a l}$ is the following:
-
$P r M (r p_{i}, t_{i}, m, c p)$ denotes that at $t_{i}$ , an $r p_{i} \in P_{r p}$ processes a message $m \in M$ sent by the $c p$ .
-
$P r P (r p_{i}, t_{i}, p k t, r p_{j})$ denotes that at $t_{i}$ , an $r p_{i} \in P_{r p}$ processes a data packet $p k t \in P K T$ forwarded by an $r p_{j} \in P_{r p}$ ( $r p_{i} \neq r p_{j}$ ).
The external events considered are the send, the receive, and the delivery events. The set of external events is represented as a finite set $E_{e x t e r n a l} = E_{s e n d} \cup E_{r e c e i v e} \cup E_{d e l i v e r y}$ . The set of send events $E_{s e n d}$ is the following:
-
$S d M (m)$ denotes that the $c p$ sends a message $m \in M$ to an $r p_{j} \in P_{r p}$ .
-
$F w d P (p k t)$ : denotes that an $r p_{i} \in P_{r p}$ forwards a data packet $p k t \in P K T$ to an $r p_{j} \in P_{r p}$ ( $r p_{i} \neq r p_{j}$ ).
$E_{r e c e i v e}$ is composed of one event:
-
$R e c M P (m p)$ : denotes that an $r p_{j} \in P_{r p}$ receives a message or a data packet $m p \in M P$ . Such an event only notifies the reception of an $m p$ by $r p_{j}$ .
Furthermore, $E_{d e l i v e r y}$ is composed of a unique event:
-
$D l v M P (m p)$ : denotes that an $r p_{j} \in P_{r p}$ delivered a message or a data packet $m p \in M P$ . A delivery event identifies the execution performed or the consumption of an ingoing $m p$ by $r p_{j}$ .

The set of events associated with

M P

is the following:

E (M P) = {S d M (m), F w d P (p k t)} \cup {R e c M P (m p)} \cup {D l v M P (m p)}

(1)

The whole set of events in the system is the finite set:

E = E_{i n t e r n a l} \cup E (M P)

(2)

The order of occurrence of events can be collected based on the causal dependencies between them. The representation

\hat{E}

expresses the causality between events E in P, using the happened-before relation

(\to)

(see Definition 1) where:

\hat{E} = {E, \to} .

(3)

5.4. Causal Order Delivery

Causal order delivery is a fundamental property in the field of distributed systems, and it is required for various distributed applications (e.g., distributed simulation). In fact, it is useful for synchronizing distributed protocols by inducing the causal order of events. Indeed, this property means that if two send messages are causally related and are sent to the same process

p_{i}

, then their delivery should be held according to the causal order, that is by respecting the send order.

The author of [] showed that in order to ensure a causal order delivery for a multicast communication, it suffices to ensure the causal delivery of immediately related send events. Therefore, to ensure message-data packet causal order delivery in SDNs, the set of events is determined to be

R = {S d M (m), F w d P (p k t) | m, p k t \in M P}

. For simplicity, we represent the two send events

S d M (m), F w d P (p k t)

as

s e n d (m p)

. Formally, the message causal delivery based on the IDR (see Definition 2) is defined for the SDN context as follows:

Definition 3.

Causal order delivery in the SDN context:

\forall ((s e n d (m p), s e n d (m p^{'}) | m p, m p^{'} \hat{=} m a t c h) \in R, s e n d (m p) ↓ s e n d (m p^{'}) w h e r e D e s t (m p) = D e s t (m p^{'}) \Rightarrow d e l i v e r y (m p) \to d e l i v e r y (m p^{'})

.

Therefore, causal order delivery establishes that if the diffusion of a message-data packet

m p

causally precedes the diffusion of message

m p^{'}

, then the delivery of

m p

should precede the delivery of

m p^{'}

at a common destination process.

6. Modeling the TFL Pattern from a Temporal Perspective

In this section, we study the use of physical time as a tool to ensure consistent network updates, and we analyze if physical time will be sufficient to deal with the TFL inconsistency problem.

We return to the flow swap update scenario of Figure 2. In this scenario, the topology is composed of an OpenFlow controller and two switches (

S_{1}

and

S_{2}

). According to our network model (see Section 5), the set of processes for this scenario is

P = {c p, r p_{1}, r p_{2}}

, where

c p

is the OpenFlow controller and

r p_{1}

,

r p_{2}

represent the switches

S_{1}

and

S_{2}

. During the update, the OpenFlow controller

c p

sends to

r p_{1}

a FlowMod message

m_{1} = (c p, 1, (m a t c h, d e l e t e), r p_{1})

where a

p f l o w \hat{=} m a t c h

, and then, it sends to

r p_{2}

a FlowMod message

m_{2} = (c p, 2, (m a t c h, a d d), r p_{2})

where also

p f l o w \hat{=} m a t c h

. The routing process

r p_{2}

receives

m_{2}

and installs the entry directing all

p k t_{i} \in p f l o w

to

r p_{1}

. Accordingly,

r p_{2}

directs

p k t_{1} \in p f l o w

to

r p_{1}

(see Figure 3). Subsequently,

p k t_{1}

enters

r p_{1}

before

m_{1}

is delivered to

r p_{1}

due to a delay of the reception. Finally, as

p k t_{1}

matches the entry already installed in

r p_{1}

, the former ends by redirecting

p k t_{1}

to

r p_{2}

, generating a TFL between

r p_{1}

and

r p_{2}

.

Figure 3. The communication diagram corresponding to Figure 2.

Upon analyzing the communication diagram corresponding to the execution diagram of Figure 3, we can observe that the TFL between

r p_{1}

and

r p_{2}

is created when the transmission time interval of

m_{1}

is greater than the transmission time interval of

m_{2}

plus the packet forwarding time of

p k t_{1} \in p f l o w

to

r p_{1}

. We formally define the TFL pattern from a temporal perspective.

Definition 4.

We have a

m a t c h_{λ} \in M A T C H

and two OpenFlow messages m and

m^{'}

, where

m = (c p, t_{c p}, (m a t c h_{λ}, d e l e t e), r p_{i})

and

m^{'} = (c p, t_{c p}^{'}, (m a t c h_{λ}, a d d), r p_{j})

, such that (i)

t i m e (m) < t i m e (m^{'})

, (ii)

(r p_{i} \neq r p_{j})

and (iii) ∃ a forwarding path

f p a t h_{μ} = {r p_{i}, \dots, r p_{j}}

from

r p_{i}

to

r p_{j}

, such that intermediates

r p_{r}

may exist where

f p a t h_{μ} \in F P A T H | f p a t h_{μ} \hat{=} m a t c h_{λ}

. A TFL pattern from

r p_{i}

to

r p_{j}

exists iff there is a data packet flow

p f l o w_{τ} = p k t_{1}, p k t_{2}, \dots, p k t_{n}

(n \geq 1)

, where

p f l o w_{τ} \in P F L O W | p f l o w_{τ} \hat{=} m a t c h_{λ}

, such that:

T (m) > T (m^{'}) + \sum_{k = 1}^{n} T (p k t_{k})

(4)

where

T (m p)

is a function that returns the transmission time of an OpenFlow message or a data packet

m p \in M P

from a

p_{i}

to a

p_{j}

with

p_{i}, p_{j} \in P

, and

t i m e (m)

gives the local physical time at the moment a message

m \in M

is sent.

Based on Definition 4, a possible solution to avoid TFLs in the data plane is to establish temporal references and perform timed execution of updating operations. However, and as was discussed in Section 4, such a solution is not effective since it is quite difficult to perfectly synchronize clocks across network entities. Indeed, any clock synchronization mechanism (e.g., Network Time Protocol (NTP) []) presents a clock synchronization accuracy. In this case, we will not be able to find out or to force the execution order of any pair of events, which may harm the update consistency.

7. The TFL-Free Property

As studied in Section 6, a TFL occurs during an update due to the no ordered execution of events. This is because networks are not able to reason about the global real-time ordering of events across entities. In this section, we explore the use of the causality (more detail in Section 5) as a theoretical construction, firstly, to define the TFL pattern (see Section 7.1), and secondly, to perform coordinated network updates, ensuring the TFL-free property (see Section 7.2).

7.1. Modeling the TFL Pattern from a Causal Perspective

In this subsection, we analyze the TFL phenomenon during OpenFlow-based SDN updates by using the scheme of the happened-before relation (See Definition 1).

Figure 4 illustrates the generic scenario in which an out-of-order execution of messages/packets leads to a TFL during updates. In this scenario, the topology is composed of an OpenFlow controller and n OpenFlow switches. According to our network model (see Section 5), the set of processes for this scenario is

P = {c p, r p_{s r c}, r p_{1}, r p_{2}, \dots, r p_{n}, r p_{d s t}}

where

c p

is the OpenFlow controller,

r p_{s r c}

and

r p_{d s t}

represent, respectively, the

S o u r c e

and the

D e s t i n a t i o n

switches, and

r p_{1}

,

r p_{2}

,

\dots, r p_{n}

represent the intermediateswitches

S_{1}, S_{2}, \dots, S_{n}

. Initially, each intermediate routing process

r p_{i}

, except

r p_{n}

, contains an entry r directing a

p f l o w \in P F L O W

to its

r p_{i + 1}

(see the solid lines in Figure 4a). Due to a network policy change,

p f l o w

should be forwarded from

r p_{n}

to

r p_{1}

(see the dashed lines in Figure 4a). To this end, the

c p

starts by instructing each

r p_{i}

, except

r p_{n}

, to delete the entry directing

p f l o w

to its

r p_{i + 1}

and then instructs again each

r p_{i}

, except

r p_{1}

, to install an entry directing

p f l o w_{τ}

to its

r p_{n - 1}

. In an OpenFlow-based SDN, the

c p

has to send

2 \times (n - 1)

controller-to-switch OpenFlow messages of type FlowMod. For brevity, and as it is sufficient to express the generality of the phenomenon, Figure 4b depicts only one message (Message (1) depicted in the dashed line) for deleting the rule in

r p_{1}

and all the other messages (from (2) to (n)) for adding the new forwarding path from

r p_{n}

to

r p_{1}

.

Figure 4. The generic Transient Forwarding Loop (TFL) scenario: (a) initial and final forwarding paths; (b) a sequence of messages to redirect a

p f l o w

from

S_{n}

to

S_{1}

ends with a TFL.

The communication diagram of Figure 5, which corresponds to the generic scenario of Figure 4b, is used to characterize the phenomenon. As shown, the

c p

starts by sending to

r p_{1}

a FlowMod message

m_{1} = (c p, 1, (m a t c h, d e l e t e), r p_{1})

of command type

d e l e t e

and

m a t c h

as a match, where

p f l o w \hat{=} m a t c h

, and then, it sends to

r p_{2}

a FlowMod message

m_{2} = (c p, 2, (m a t c h, a d d), r p_{2})

of command type

a d d

and also to

m_{1}

,

m a t c h

as a match, as it is about forwarding the same packet flow

p f l o w

. The rest of the OpenFlow messages from

m_{i}

to

m_{j}

(represented in Figure 4b by Messages (3), …, (n)) are sent to their corresponding

r p

to add the entries directing

p f l o w

from

r p_{n}

to

r p_{1}

. Upon the reception of the messages, and due to the asynchronous communication between the controller and all the underlying switches,

r p_{n}

,

r p_{n - 1}

, …and

r p_{2}

receive their messages and install the entries while packets

p k t_{i} \in p f l o w

hit

r p_{n}

, directing them to

r p_{1}

(see Figure 5). Hence,

p k t_{n}

enters

r p_{1}

before

m_{1}

is delivered to

r p_{1}

(see Figure 5). Consequently,

p k t_{n}

matches the entry r already installed in

r p_{1}

directing all

p k t_{i} \in p f l o w

to

r p_{2}

. Finally,

r p_{1}

ends by redirecting

p k t_{n}

to

r p_{2}

(see the solid line between

S_{1}

and

S_{2}

in Figure 4b), which generates a TFL. Indeed, the fact that

p k t_{n}

is delivered to

r p_{1}

and went back through

r p_{1}

means that

p k t_{n}

has already entered into a TFL.

Figure 5. The communication diagram corresponding to Figure 4.

We define below an abstraction of the TFL pattern in SDN, as a specification of Lamport’s happened-before relation to express the phenomenon from a causal perspective.

Definition 5.

We have a

m a t c h_{λ} \in M A T C H

and two FlowMod messages

m, m^{'} \in M

where

m = (c p, t_{c p}, (m a t c h_{λ}, d e l e t e), r p_{i})

and

m^{'} = (c p, t_{c p}^{'}, (m a t c h_{λ}, a d d), r p_{j})

, such that (i)

m \to m^{'}

, (ii)

(r p_{i} \neq r p_{j})

, and (iii) ∃ a forwarding path

f p a t h_{μ} = {r p_{i}, \dots, r p_{j}}

from

r p_{i}

to

r p_{j}

, such that intermediates

r p_{r}

may exist where

f p a t h_{μ} \in F P A T H | f p a t h_{μ} \hat{=} m a t c h_{λ}

. A TFL from

r p_{i}

to

r p_{j}

exists iff there is a data packet flow

p f l o w_{τ} =

p k t_{1}, p k t_{2}, \dots, p k t_{n}

(n \geq 1)

, where

p f l o w_{τ} \in P F L O W | p f l o w_{τ} \hat{=} m a t c h_{λ}

, such that:

$p k t_{1}$ is sent by $r p_{j}$ after the delivery of $m^{'}$ ,
if $p k t_{k}$ $(1 \leq k \leq n)$ is delivered by $r p_{r}$ $(r p_{r} \neq r p_{i})$ , then $p k t_{k + 1}$ is the next data packet sent by $r p_{r}$ , and
$p k t_{n}$ is delivered by $r p_{i}$ before the delivery of m.

It can be interpreted that a TFL occurs due to the violation of the causal order delivery of the OpenFlow update message(s) of type deleteand matched packets/packet flows (see the three conditions in Definition 5). Based on the abstraction defined in Definition 5, we present how to capture and avoid the occurrence of the defined pattern during updates, ensuring the TFL-free property.

7.2. Algorithm for TFL-Free SDN Updating

An update algorithm based on the algorithm of [] is presented in this subsection. Based on the study provided in Section 7.1, we show how the presented algorithm allows capturing asynchronously and avoiding preventively the occurrence of TFLs during updates, referred to as TFL-free updates.

7.2.1. Algorithm Overview

Throughout an update, an OpenFlow-based SDN expresses message/packet-passing between processes P, specifically, between the controller process

P_{c p}

and the set of routing processes

P_{r p}

, or between

P_{r p}

themselves. Therefore, we distributed the algorithm over the

P_{c p}

and the set of

P_{r p}

. It can be summarized as follows:

Input: The algorithm takes the list of all OpenFlow update messages and ingoing data packets as input. As described in Section 5, updates are match-based performed, i.e., update messages and in-fly data packets are grouped and processed by match. This allows updates of different matches to be executed concurrently without the risk of harming the TFL-free property.
Condition: Let $m a t c h_{λ} \in M A T C H$ be a match. All matched OpenFlow messages of type delete should be disseminated from the $P_{c p}$ to the $P_{r p}$ before all OpenFlow messages of type add.
Execution model: All OpenFlow update messages are asynchronously disseminated from $P_{c p}$ to $P_{r p}$ . No upper bound on messages/data packets transmission delay is required. $P_{c p}$ never waits for an acknowledgment message from $P_{r p}$ once a message is delivered, and clocks of all entities are not in synchronization with each other, that is the execution model is fully asynchronous.
Data structures: At a process-level, each process $p_{i} \in P$ maintains a vector of control information $C I_{i}$ to store direct dependency information between the set of messages/packets $M P$ with respect to the corresponding match. Furthermore, each $r p_{i} \in P_{r p}$ maintains a matrix $D e l i v e r y_{i}$ to track dependency information (see the data structures’ subsection below for more details) when an $r p$ receives an $m p \in M P$ . We will also make reference to the state of the delivery matrix structure when a process $p_{i}$ sends an $m p$ , denoted as $F o r D e l i v e r y_{i}$ .
Functionally: The algorithm is implemented at the time of sending an OpenFlow message $m \in M$ from the $P_{c p}$ , and/or at the time of forwarding a data packet $p k t \in P K T$ from an $r p \in P_{r p}$ , and/or at the time of receiving an $m p \in M P$ by a routing process $r p \in P_{r p}$ . At the sending of an OpenFlow message from the $P_{c p}$ , and besides the message m, the algorithm encapsulates into the message a vector $C I_{c p}$ containing control information on the send message events that directly depend on it (see Algorithm 1). Similarly, an outgoing packet $p k t$ in the data plane piggybacks a vector $C I_{r p}$ that carries control information of the OpenFlow message/packet send events that directly depend on it (see Algorithm 2). At the reception of an $m p$ , and based on the control information encapsulated into $C I_{m p}$ , the receiver $r p_{j}$ decides if it can deliver $m p$ or if it should wait for the reception of another/other $m p^{'}$ (s) to then be able to deliver $m p$ (see Algorithm 3).
Ensured properties: Based on the algorithm functionality, neither an OpenFlow message nor a packet, which share the same match, will be delivered out of causal order, ensuring the per-match causal order delivery property (see Definition 3). In Section 7.3, we prove that the presented algorithm ensures the TFL-free property.

Algorithm 1: Controller-to-switch message sending.

Input: The set of OpenFlow update messages

Condition:

\forall m \in M : (m = (c p, t_{c p}, (m a t c h_{λ}, d e l e t e), r {p_{d}}^{1}) \to m^{'} = (c p, t_{c p}^{'}, (m a t c h_{λ}, a d d), r p_{d}))

1:

t_{c p} : =

0,

C I_{c p} [j] = {}

\forall j : 1 \dots N

2: for all OpenFlow messages

\hat{=} m a t c h_{λ}

do

3:

t_{c p} : =

t_{c p} + 1

4:

m = (c p, t_{c p}, O p e n F l o w_m e s s a g e, r p_{i})

5:

S d M (m, C I_{c p})

6:

C I_{c p} [i] : =

(c p, t_{c p})

Algorithm 2: Switch-to-switch packet forwarding.

Input: Ingoing data packets

1:

t_{r p_{i}} : =

0,

C I_{r p_{i}} [j] = {}

\forall j : 1 \dots N

2:

t_{r p_{i}} : =

t_{r p_{i}} + 1

3:

p k t = (r p_{i}, t_{r p i}, h e a d e r, d a t a, r p_{j})

4:

F w d P (p k t, C I_{r p_{i}})

5:

C I_{r p_{i}} [j] : =

(r p_{i}, t_{r p_{i}})

Algorithm 3: Switch message/packet reception.

Input: OpenFlow messages sent from the controller and in-fly data packets

1:

D e l i v e r y_{r p_{j}} [i, k] = 0

\forall i, k : 1 \dots N

,

C I_{r p_{j}} [i] = {}

\forall i : 1 \dots N

2:

R e c M P (m p = (p_{i}, t_{i}, c o n t e n t, p_{j}, C I_{m p}))

(the content may be an OpenFlow message or a packet)

3: wait (

\forall k

(k, x) \in C I_{m p} [j]

|

D e l i v e r y_{j} [k, j] \geq x

)

4:

D l v M P (m p)

5:

D e l i v e r y_{j} [i, j] : =

t_{m p}

6:

\forall k

|

(k, y) \in C I_{m p} [i]

:

D e l i v e r y_{j} [k, i] : = m a x (D e l i v e r y_{j} [k, i], y)

7:

C I_{j} [j] : = (C I_{j} [j] \cup_{m a x} {(i, t_{m p})})

-_{m a x}

C I_{m p} [j]

8:

\forall p_{k} \in P

|

k \neq i, j

:

C I_{j} [k] : = C I_{j} [k] \cup_{m a x} C I_{m p} [k]

9:

\forall p_{k} \in P

|

k \neq j

:

10:

\forall (l, x) \in C I_{j} [k]

11: if

D e l i v e r y_{j} [l, k] \geq x

then

12: delete

(l, x)

from

C I_{j} [k]

7.2.2. Data Structures

In the algorithm, each process

p_{i} \in P

maintains a vector of control information

C I_{i}

of length N to store direct dependency information (N is the number of processes). Each element of

C I_{i}

is a set of tuples of the form

(p r o c e s s_{i d}, l o g i c a l_c l o c k)

. For instance, let

C I_{i}

be the vector of a process

p_{i}

such that

(k, t) \in C I_{i} [j]

(i \neq j)

. This implies that any message sent by a process

p_{i}

should be delivered to

p_{j}

after the message

m p

of sequence number t sent by

p_{k}

has been delivered to

p_{j}

. Furthermore, each process

r p_{i}

maintains an

N \times N

integer matrix

D e l i v e r y_{i}

to track dependency information. Each matrix

D e l i v e r y_{i}

records the messages of the last sequence number delivered to other processes. For instance,

D e l i v e r y_{i}

[j, k] = t

denoted that

p_{i}

knows that messages sent by process

p_{j}

to

p_{k}

, whose sequence numbers are less than or equal to t, have been delivered to

p_{k}

.

7.2.3. Algorithm Details

Controller-to-switch message sending: This algorithm (see Algorithm 1) takes all update messages calculated by the controller that should be communicated to the routing processes as input. As a condition, all messages of type delete should be disseminated by the controller before all messages of type add. Before sending the messages, they should be grouped by match (see Line 2). In each message send event, the logical clock of

c p

(denoted by

t_{c p}

) is incremented (see Line 3) to associate a timestamp with m (see Line 4). Upon sending m,

C I_{c p}

is encapsulated into the send event (see Line 5).

C I_{c p} [k]

contains information about the direct predecessors of m with respect to messages sent to

p_{k}

. After sending m,

C I_{c p} [i]

is updated by adding

(c p, t_{c p})

as a potential direct predecessor of future messages sent to

r p_{i}

after m (see Line 6). In general,

C I_{c p} [i]

contains delivery constraints of messages sent to

r p_{i}

by

c p

.

Switch-to-switch packet forwarding: This algorithm (see Algorithm 2) takes ingoing data packets as input. Before forwarding a packet

p k t

to an

r p_{j}

, the logical clock of

r p_{i}

(denoted by

t_{r p i}

) is incremented (see Line 2) to associate a timestamp with

p k t

(see Line 3). Upon forwarding a

p k t

to its next hop,

C I_{r p_{i}}

is encapsulated into the forwarding event (see Line 4).

C I_{r p_{i}} [k]

contains information about the direct predecessors of

p k t

with respect to OpenFlow messages/packets sent to

p_{k}

. After forwarding

p k t

,

C I_{r p_{i}} [j]

is updated by adding

(r p_{i}, t_{r p_{i}})

as a potential direct predecessor of future packets sent to

r p_{j}

after

p k t

(see Line 5). The instructions of this algorithm are similar to the previous one. The difference is at the level of the communicating processes and of the type of data for sending.

Switch message/packet reception: From the point of view of the receiver

r p_{j}

(see Algorithm 3), an

m p

piggybacks delivery constraints encapsulated into

C I_{m p}

, that is messages that must be delivered to

p_{j}

before

m p

. Note that there is a distinction between the reception of a message

m p

(see Line 2) and its delivery (see Line 4). The delivery of

m p

to a process

p_{j}

implies that the message in question was received and all previous delivery constraints on

p_{j}

were satisfied (see Line 3). Thus, once the delivery constraints are satisfied,

m p

is delivered to

p_{j}

(see Line 4), and what follows is updating the control information of the receiver process

p_{j}

. Then,

p_{j}

updates its

D e l i v e r y_{j}

matrix, indicating that the message sent from

p_{i}

, whose sequence number is equal to

t_{m p}

, was already delivered to

p_{j}

(see Line 5). The

D e l i v e r y_{j}

matrix is also updated with respect to the messages

m p s

delivered to process

p_{i}

(see Line 6). After the delivery of

m p

,

p_{j}

updates a new delivery constraint for future messages sent from

p_{j}

. Therefore,

C I_{j} [j]

is updated by adding

(i, t_{m p})

, using the

\cup_{m a x}

operator as a new potential direct dependency on subsequent messages sent from

p_{j}

and by deleting older direct dependencies (transitive dependencies), using the

-_{m a x}

operator (see Line 7), which were already satisfied. The operator

\cup_{m a x}

(see Algorithm 4) ensures that if there are multiple constraints corresponding to a sender process, the most recent constraint is selected []. The

-_{m a x}

operator (see Algorithm 5) deletes the delivery constraints already known to be satisfied (T2) from the current set of message delivery constraints (T1) []. Furthermore, and to maintain the causality property, a message sent by a process

p_{k}

to

p_{j}

, whose send event is causally dependent on messages sent by

p_{i}

to

p_{j}

, should be delivered to

p_{j}

only after the

n^{th}

message sent by

p_{i}

to

p_{j}

. Thus, for all processes

p_{k}

, except the ones for the sender and receiver,

C I_{i} [k]

is updated by adding the delivery constraints piggybacked by

C I_{m p}

(see Line 8). As we have mentioned, the

D e l i v e r y

matrix is used for garbage collection to reduce the communication overhead when ensuring the causal ordering of messages. Therefore, based on the

D e l i v e r y_{j}

matrix of process

p_{j}

,

C I_{j}

is updated in such a way that it contains only the recent delivery constraint needed for the delivery of future messages (see Lines 9 to 12).

Algorithm 4: The operator

\cup_{m a x}

: (

T 1 \cup_{m a x} T 2

).

1: Boolean change, set of tuples

T 1

, set of tuples

T 2

, set of tuples T

2: change:= true

3:

T : = T 1 \cup T 2

(

T 1

and

T 2

contain the delivery constraints)

4: while (change) {

5: change:= false

6: if

(i, x) \in T

and

(i, y) \in T

and

(x < y)

7: { T:=

T - {(i, x)}

8: change:= true } }

9: return

(T)

Algorithm 5: The operator

-_{m a x}

: (

T 1 -_{m a x} T 2

).

1: Boolean change, set of tuples

T 1

, set of tuples

T 2

, set of tuples T

2: change:= true

3:

T : = T 1

4: while (change) {

5: change:= false

6: if

(i, x) \in T

and

(i, y) \in T 2

and

(x \leq y)

7: {T:=

T - {(i, x)}

8: change:= true } }

9: return

(T)

7.3. Proof of Correctness

In the previous subsection, we presented an adaptation of a causal ordering algorithm as a solution for the TFL problem. We now demonstrate that the proposed algorithm is TFL-free.

When analyzing Definition 5, we can interpret that a TFL occurs in the data plane due to the violation of the causal order delivery of the relevant OpenFlow message FlowMod, and consequently, a packet or a flow of packets starts to loop between the underlying switches. Thus, and in order to prove that the algorithm is TFL-free, it is sufficient to demonstrate that there is no data packet flow that holds with the conditions specified in the TFL pattern of Definition 5. To accomplish the proof, we focus on the packet

p k t

that triggers the TFL pattern. The following theorem proves this observation.

Theorem 1.

The algorithm guarantees that ∄ a data packet

p k t_{k} \in p f l o w_{τ} | p f l o w_{τ} \hat{=} m a t c h_{λ}

, such that (i)

S d M (m^{'}) \to F w d P (p k t_{k})

and (ii)

D l v M P (p k t_{k}, t_{i}) \to D l v M P (m, t_{i}^{'})

where

m, m^{'} \hat{=} m a t c h_{λ}

and

S d M (m) \to S d M (m^{'})

.

Let us assume that the Algorithm 1 and the Algorithm 2 store knowledge of the latest message/packet

m p \in M P

sent from the same process

p_{i} \in P

to another process, through a local matrix named

F o r D e l i v e r e d_{i}

that has the same structure as the delivery matrix (for more details, see the structure description subsection). Therefore, for this proof, we consider this additional instruction in both mentioned algorithms: (6)

F o r D e l i v e r e d_{i} [i, j] = t

. This hypothetical instruction allows building a matrix that considers the message(s)/packet(s) that belong(s) to the causal history of a message/packet

m p

and that has (have to be delivered before

m p

to a process

p_{j}

. Thus, if

F o r D e l i v e r y_{i} [i, j]

is less than or equal to

D e l i v e r y_{j} [i, j]

, this means that all message(s)/packet(s) that causally depend(s) on

m p

has (have) already been delivered to

p_{j}

. On the other hand, if

F o r D e l i v e r y_{i} [i, j]

is greater than

D e l i v e r y_{j} [i, j]

, this means that some of the message(s)/packet(s) that causally depend(s) on

m p

has (have) not been delivered to

p_{j}

.

Proof.

This is proven by contradiction. Let us assume that there is a packet

p k t_{k} \in p f l o w_{τ}

, such that:

$S d M (m) \to F w d P (p k t_{k})$
According to Definition 5, this is by transitivity since $S d M (m) \to S d M (m^{'})$ and $S d M (m^{'}) \to F w d P (p k t_{k})$ , such that m and $p k t_{k}$ have $r p_{i}$ as a common destination.
$D l v M P (p k t_{k}, t_{i}) \to D l v M P (m, t_{i}^{'})$

The existence of a

p k t_{k}

under the mentioned Conditions 1 and 2 implies that

F o r D e l i v e r y_{c p} [c p, r p_{i}] > D e l i v e r y_{r p_{i}} [c p, r p_{i}]

when

r p_{i}

receives

p k t_{k}

. However, and based on the proof of [], this cannot occur as the algorithm allows the delivery of

p k t_{k}

to

r p_{i}

only when

F o r D e l i v e r y_{c p} [c p, r p_{i}] < = D e l i v e r y_{r p_{i}} [c p, r p_{i}]

, which contradicts the initial assumption. □

8. Scenario Description

We present the example of Figure 6 to illustrate how our algorithm works. This example shows two independent updating processes within the same topology. Note that updating processes in OpenFlow-based SDNs are independent as forwarding decisions are flow-based: each packet flow is determined by match fields, as well as by update messages disseminated by the controller to update the forwarding tables of switches. The two updating processes shown in Figure 6 are distinguished by means of two different colors, corresponding to a

m a t c h_{1}

in purple and a

m a t c h_{2}

in blue. In both examples, data packets that interleave with the update messages sent from the controller (

c p

) (the

c p

is not depicted in Figure 6) and installed into the switches

s o u r c e, S_{1}, S_{2}, \dots, S_{n}

(

r p_{s r c}, r p_{1}, r p_{2}, \dots, r p_{n}

) may enter into TFLs between all intermediate switches

r p_{1}, r p_{2}, \dots, r p_{n}

. For brevity, only the update forwarding process corresponding to

m a t c h_{1}

is explored. However, and for completeness, Table 1 and Table A1 and Table A2 in Appendix A include the information related to both matches. Update messages are illustrated in Table 1. Table A1 shows piggybacked control information of update messages, and Table A2 depicts control information for future update messages.

Figure 6. Two forwarding path update scenarios from the same source to the same destination.

Table 1. Update messages calculated for updating

P_{r p}

that correspond to

m a t c h_{1}

and

m a t c h_{2}

.

TFLs that occur when updating forwarding path related to

m a t c h_{1}

: We will follow a packet

p k t_{k}

from the origin to its destination taking into consideration the routing tables of switches corresponding to the initial policy of

m a t c h_{1}

(colored with purple). Assuming that

r p_{s r c}

and

r p_{n}

update their flow tables, then

r p_{s r c}

forwards an ingoing matched data packet

p k t_{k}

to

r p_{n}

(see the dashed purple arrow between

r p_{s r c}

and

r p_{n}

). After delivering

p k t_{k}

,

r p_{n}

forwards it to

r p_{n - 1}

, and the latter delivers

p k t_{k}

before updating its flow table. As a result,

p k t_{k}

ends in a TFL between

r p_{n - 1}

and

r p_{n}

(see the dashed purple arrow between

r p_{n}

and

r p_{n - 1}

and the solid purple arrow between

r p_{n - 1}

and

r p_{n}

). If the last assumption is not the case, i.e.,

r p_{n - 1}

delivers

p k t_{k}

after having updated its flow table, then

r p_{n - 1}

should forward

p k t_{k}

to

r p_{n - 2}

. The same risk that

p k t_{k}

enters into a TFL between

r p_{n - 2}

and

r p_{n - 1}

exists if

r p_{n - 2}

delivers

p k t_{k}

before updating its flow tables.

p k t_{k}

may indeed enter into TFLs between all the subsequent

r p

pairs, i.e., until

p k t_{k}

reaches

r p_{2}

, and then, the risk may arise between

r p_{1}

and

r p_{2}

.

How the update algorithm proceeds to capture and avoid TFLs: As already mentioned, in each hop, there is a risk of falling into a TFL during the update of the forwarding path. We will review how the initial condition and the algorithm prevent this from happening. Table 2 illustrates the behavior of the data packet

p k t_{k}

, which interleaves with the disseminated update messages corresponding to

m a t c h_{1}

. Note that in each hop,

p k t_{k}

gets a different identifier. In the first row, for example,

p k t_{k}

must travel from the

S o u r c e

switch to the switch

S_{n}

. Later, in the following hop,

p k t_{k}

is forwarded from

S_{n}

to

S_{n - 1}

(second row). This continues until

p k t_{k}

reaches the

D e s t i n a t i o n

switch.

Table 2. Packets interleaving with update messages that correspond to

m a t c h_{1}

.

As mentioned above, before performing the update, the algorithm (see Section 7.2) begins with the initial condition by sorting update messages by their match fields and arranging the dissemination of messages as follows: all delete messages are disseminated before all add messages. We describe now the algorithmic part. Each disseminated message (see Table 1) piggybacks direct dependency information related to message(s), which directly depend(s) on it and correspond(s) only to their match fields. Table A1 shows the control information (

C I_{m}

) encapsulated into each message when sending it from the controller, and Table A2 illustrates the control information needed to be piggybacked for future messages. In Table A1, the vectors

C I_{m}

of tuples encapsulated into update messages are presented and referred to the messages that causally depend on the message, i.e., messages that should be delivered to a switch before the message in question, while Table A2 presents the vector of the controller

C I_{c p}

and the tuples stored just after sending each message and that should be encapsulated with the future messages. On the other hand, the control information piggybacked by

p k t_{k}

is built after delivering it in each hop.

Upon receiving messages/packets (see Table 3), switches also receive the control information

C I_{m p}

related to messages and packets. After each delivery, new potential direct dependencies on subsequent messages/packets are added to a receipt switch

C I_{r p}

. In this scenario, all add update messages sent to the intermediate routing processes

r p_{1}, r p_{2}, \dots, r p_{n}

are delivered without any restriction as no message or packet sent before them with a common destination exists. However, for example, the delivery of

p k t_{k}

with identifier

p k t_{2}

is left on standby (wait) by

r p_{n - 1}

as

p k t_{2}

piggybacked the tuple

(c p, n - 1)

, which means that the message disseminated from the controller

c p

with sequence number

n - 1

should be delivered before

p k t_{2}

to

r p_{n - 1}

(see Table 3). Once the message sent by

c p

with sequence number

n - 1

is delivered,

p k t_{2}

can be delivered to

r p_{n - 1}

. The algorithm treats all

p k t_{k}

in the same way in the subsequent hops, i.e.,

r p_{n - 2}, . . ., r p_{1}

, if they reach them before the delivery of the delete messages. Therefore, any forwarding of

p k t_{k}

generated by an installation of an add update message will not be delivered by the next hop switch before the delivery of the corresponding delete message (if it exists), preventing any

p k t_{k}

from going back to a switch from which it was forwarded, avoiding having them enter TFLs during the update.

Table 3. Delivery of update messages and data packets

m p \in M P

corresponding to

m a t c h_{1}

.

9. Discussion

The problem of TFL during updates was intensively attacked by using different techniques. The proposed solutions can be classified into four approaches: ordered update, n-phase commit update, timed update, and the causal update (see Section 4 for more details). Although previous works have reached loop-free updates, they have not achieved effective and efficient solutions. We refer to the trade-off between ensuring TFL-free updates and performing updates with tolerable computational cost. To achieve a better trade-off, an important consideration is how to reason about the modeling of TFL-free updates. In this context, while all previous works set up synchronous execution models to perform updates, our solution was based on an asynchronous model. This allowed getting rid of the strong assumptions and certain constraints that come with a synchronous execution model. Indeed, solutions based on strong assumptions tend to be less applicable.

In regards to the ordered update approaches [,,], such as the loop-free routing update algorithms proposed by [], it has to be noticed that algorithms are centralized into the controller side to calculate updates. During each routing policy update, they perform updates in a free-loop manner by calculating the possible edges that can be updated without generating loops. To do so, the algorithms take forwarding graph properties (set of nodes, set of edges to be added, set of edges to be removed, etc.) as input. In the worst case, the algorithms require |number of new edge to be added| loop-free update steps to perform one routing policy update [], which is very expensive in terms of update time and bandwidth. The fact that the controller should stop disseminating update messages at each k-step of an update until receiving acknowledgment messages from all involved switches means that the update requires extra time to be completed. Furthermore, acknowledgment messages result in bandwidth overhead. In contrast, our algorithm proposed to disseminate update messages in one-shot. As shown in Section 7.2, the only required treatment is to sort update messages by their match fields and arranging the dissemination of messages following a predefined order (see Algorithm 1). In addition, switches are not required to send acknowledgment messages to the controller as the proposed algorithm works with one-step update, and unlike [], it does not need acknowledgment messages that confirm the reception of the disseminated messages in order to start another update step.

Regarding the solutions based on the two-phase commit update [,], they require that each switch maintains the entries of both policies (the initial and the final ones) during the transition synchronous phase. This may end by congesting the memories of switches. Even worse, this approach may not be feasible when the number of entries bypasses the memory limit size of the switches. Our proposed solution avoids such a problem since the update algorithm is based on the replacement of entries. Firstly, it consists of computing an order in which the switches should consume the delete entry update message prior to the add entry update message, instructed by the controller. Secondly, by ensuring the computed order of message execution through causal order delivery of messages/packets, no switch will have entries of both policies at the same time.

As far as the one-phase commit update approach ez-Segway [] is concerned, updating tasks are delegated to the switches, qualifying it as a decentralized approach. However, consistent updates are ensured through centralized calculus performed by the controller. This is based on a forwarding graph that contains the global network state at each moment. Therefore, even when the updating process is decentralized among the switches, the process to make decisions is centralized in the controller side and carried out in a synchronous manner. Furthermore, no switch can participate in a new update until the previous updating process has finished. As already mentioned, our proposal avoids the use of a global graph since it is based on the causal order delivery of messages/packets.

The timed update approach [,] ensures consistent updates. However, the consistency during the intervals of time of updates is not guaranteed. This is due to the adaptation of a synchronous execution model to tackle an asynchronous problem, i.e., the inconsistent update problem. In fact, updates on involved switches are synchronized to be processed simultaneously, assuming an upper bound on message delivery. Actually, each switch receives messages/packet flows at an interval time

[T - ε, T + ε]

where

ε

represents the clock synchronization error. In this case, two periods of events may overlap where the execution order is not trivial in order to ensure consistency. Indeed, as analyzed in Section 6, the reception of in-fly packets that interleave with the reception of update messages can be a question of inconsistency during the interval time when the delivery of messages may take place, leading to TFLs. Hence, one cannot find out or force the execution order of events by using mechanisms for synchronizing physical time.

The consistency update approach based on Suffix Causal Consistency (SCC) [] introduced an update algorithm based on the Lamport timestamp, tackling the forwarding loop problem. We should highlight that this approach and ours are quite different. In fact, SCC works based on timestamps tagging packets to reflect the rules that correspond to each switch. On the other hand, our work is based on establishing a causal order between update messages and in-fly packets. The idea behind SCC is to ensure that an in-fly packet is routed based on its recently installed forwarding path, ensuring bounded looping. As mentioned in Section 4, the fourth step of the proposed algorithm requires the controller to calculate and install the extra temporarily forwarding rule, redirecting an in-fly packet to the recent forwarding path to reach its destination. The question here is how many extra rules would be required per policy update? Indeed, this generates extra overhead related to controller and switch memory, as well bandwidth as this requires message exchange between the controller and switches. On the other hand, our work proposes an OpenFlow-based SDN model where the relevant update events and an abstraction of the TFL phenomenon are defined. Based on this model, we demonstrated that ensuring causal order delivery of the defined relevant update events, based on our SDN model, was sufficient to ensure the TFL-free property. Contrary to the SCC, ensuring causal order delivery does not need any extra rule installation. Furthermore, packets should not only traverse the most recent path to avoid the occurrence of such phenomenon. As shown in the scenario of Section 8, a packet may start flying from the old path and then be forwarded to the destination based on the new path without the need for an extra rule to direct it to the new path. Thus, only the original calculated rules are required to perform updates.

Instead, our proposal established a message/packet causal order delivery to detect and avoid any TFL (refer to the correctness proof in Section 7.3). Obviously, this resulted in memory and bandwidth overhead. However, the causal order message/packet algorithm was designed based on the IDR (see Definition 2). In fact, only direct dependency information between messages/packets with respect to the destination process(es) was needed to be piggybacked with them. In the worst case, each component of a

C I_{c p} [j]

(vector of the control information of the controller process) could have at most one tuple for each process. This was because there was no concurrent message sent from the controller to a switch, i.e., each update message was sent to a specific switch and not to others. Furthermore, each message only piggybacked control information corresponding to their match fields. Let us go back to the scenario example of Section 8, in Table A1, of control information

C I_{c p}

presented in Appendix A. Messages of

m a t c h_{2}

do not piggyback control information of messages of

m a t c h_{1}

. On the other hand, a component of a

C I_{r p_{i}} [j]

(vector of the control information of a routing process) can have at most N tuples for each process, i.e., for each vector component. This occurs when a routing process should, for example, flood a data packet to all outgoing links. Therefore,

O (N^{2})

control information is required to be piggybacked with each message/packet. Concerning memory overhead, a routing process

r p_{i}

needs to store a vector

C I_{r p_{i}} [j]

and a matrix

D e l i v e r y_{i}

of N X N, which then requires

O (N^{2})

integers to be stored.

In this work, we introduced a new approach to tackle the TFL SDN update problem based on distributed system principles. By means of this approach, we outperformed the-state-of-the-art by proposing a solution that was more suitable for the characteristics of the SDNs. Indeed, the solution was totally distributed, where the execution model was asynchronous, no global references were required, and it did not assume a message upper bound delivery. The aspects mentioned above favored the following points: (1) the system configuration with the proposed update mechanism aligned with the pure features of a distributed and asynchronous system; (2) the proposed update mechanism minimized the interaction of the controller with the switches during an update (update messages were sent in one shot per updating flow); (3) the controller was not the only network entity that calculated the update operations as the switches contributed in the calculation (based on the exchanged control information and the messages/packets delivery constraints ensured by the update mechanism).

10. Conclusions

A model of the TFL pattern based on causal dependencies in SDNs was presented. We characterized the TFL phenomenon in an OpenFlow-based SDN, by identifying the relevant events and conditions under which it could occur during updates. In Definition 6, we specified that a TFL occurred due to a violation of the causal order delivery of a data packet during the process to update flow tables. This updating process was performed through FlowMod controller-to-switch messages of type add and delete. Based on this abstraction, we proposed a causal consistent update algorithm oriented to ensure the TFL-free property. To demonstrate this, a proof of correctness of the algorithm was provided. Based on these results, we propose as a future work to extend our solution to manage temporal constraints by including the principle of Δ-causal proposed by Pomares et al. in []. Furthermore, we would like to extend the study to other network invariant violation patterns. This study will allow us to answer whether ensuring causal dependencies will be sufficient to cover consistent network updates in SDNs.

Author Contributions

Formal analysis, A.G., S.E.P.H., L.M.X.R.H., H.H.K. and A.H.K.; Investigation, A.G., S.E.P.H., L.M.X.R.H., H.H.K. and A.H.K.; Methodology, A.G., S.E.P.H., L.M.X.R.H., H.H.K. and A.H.K. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Acknowledgments

This work is supported by the Mexican Agency for International Development Cooperation (AMEXCID).

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A. Tables of Control Information Related to the Scenario Description of Section 8

Table A1. Piggybacked control information corresponding to update messages of

m a t c h_{1}

and

m a t c h_{2}

.

Table A1. Piggybacked control information corresponding to update messages of

m a t c h_{1}

and

m a t c h_{2}

.

Message	${C I}_{m}$
	Piggybacked control information of delete messages that correspond to $m a t c h_{1}$
$m_{1}$	$[n u l l]$
$m_{2}$	$[(c p, 1)]$
$m_{3}$	$[(c p, 1), (c p, 2)]$
...	...
$m_{n - 2}$	$[(c p, 1), (c p, 2), (c p, 3), \dots]$
$m_{n - 1}$	$[(c p, 1), (c p, 2), (c p, 3), \dots, (c p, n - 2)]$
$m_{n}$	$[(c p, 1), (c p, 2), (c p, 3), \dots, (c p, n - 2), (c p, n - 1)]$
	Piggybacked control information of delete messages that correspond to $m a t c h_{2}$
$m_{n + 1}$	$[n u l l]$
$m_{n + 2}$	$[(c p, n + 1)]$
$m_{n + 3}$	$[(c p, n + 1), (c p, n + 2)]$
$m_{n + 4}$	$[(c p, n + 1), (c p, n + 2), (c p, n + 3)]$
...	...
$m_{n + m}$	$[(c p, n + 1), (c p, n + 2), (c p, n + 3), (c p, n + 4), \dots]$
$m_{n + m + 1}$	$[(c p, n + 1), (c p, n + 2), (c p, n + 3), (c p, n + 4), \dots, (c p, n + m)]$
	Piggybacked control information of add messages that correspond to $m a t c h_{1}$

$m_{n + m + 2}$	$[C I_{c p}$ of $m_{n}$ + $(c p, n)]$
$m_{n + m + 3}$	$[C I_{c p}$ of $m_{n}$ + $(c p, n + m + 2)]$
$m_{n + m + 4}$	$[C I_{c p}$ of $m_{n}$ + $(c p, n + m + 2), (c p, n + m + 3)]$
$m_{n + m + 5}$	$[C I_{c p}$ of $m_{n}$ + $(c p, n + m + 2), (c p, n + m + 3)]$
...	...
$m_{n + m + k}$	$[C I_{c p}$ of $m_{n}$ + $(c p, n + m + 2), (c p, n + m + 3), (c p, n + m + 5), \dots]$
$m_{n + m + k + 1}$	$[C I_{c p}$ of $m_{n}$ + $(c p, n + m + 2), (c p, n + m + 3), (c p, n + m + 5), \dots, (c p, n + m + k)]$

	Piggybacked control information of add messages that correspond to $m a t c h_{2}$

$m_{n + m + k + 2}$	$[C I_{c p}$ of $m_{n + m + 1}$ + $(c p, n + m + 1)]$
$m_{n + m + k + 3}$	$[C I_{c p}$ of $m_{n + m + 1}$ + $(c p, n + m + k + 2)]$
$m_{n + m + k + 4}$	$[C I_{c p}$ of $m_{n + m + 1}$ + $(c p, n + m + k + 2), (c p, n + m + k + 3)]$
...	...
$m_{n + m + k + j}$	$[C I_{c p}$ of $m_{n + m + 1}$ + $(c p, n + m + k + 2), (c p, n + m + k + 3), (c p, n + m + k + 4), \dots]$
$m_{n + m + k + j + 1}$	$[C I_{c p}$ of $m_{n + m + k + j}]$
$m_{n + m + k + j + 2}$	$[C I_{c p}$ of $m_{n + m + k + j}$ + $(c p, n + m + k + j + 1)]$

Table A2. Control information of update messages generated on the

c p

for future message corresponding to

m a t c h_{1}

and

m a t c h_{2}

.

Table A2. Control information of update messages generated on the

c p

for future message corresponding to

m a t c h_{1}

and

m a t c h_{2}

.

Message	${C I}_{cp}$
	Control information of delete messages that correspond to $m a t c h_{1}$
$m_{1}$	$[(c p, 1)]$
$m_{2}$	$[(c p, 1), (c p, 2)]$
$m_{3}$	$[(c p, 1), (c p, 2), (c p, 3)]$
...	...
$m_{n - 2}$	$[(c p, 1), (c p, 2), (c p, 3), \dots, (c p, n - 2)]$
$m_{n - 1}$	$[(c p, 1), (c p, 2), (c p, 3), \dots, (c p, n - 2), (c p, n - 1)]$
$m_{n}$	$[(c p, 1), (c p, 2), (c p, 3), \dots, (c p, n - 2), (c p, n - 1), (c p, n)]$
	Control information of delete messages that correspond to $m a t c h_{2}$
$m_{n + 1}$	$[(c p, n + 1)]$
$m_{n + 2}$	$[(c p, n + 1), (c p, n + 2)]$
$m_{n + 3}$	$[(c p, n + 1), (c p, n + 2), (c p, n + 3)]$
...	...
$m_{n + m}$	$[(c p, n + 1), (c p, n + 2), (c p, n + 3), (c p, n + 4), \dots, (c p, n + m)]$
$m_{n + m + 1}$	$[(c p, n + 1), (c p, n + 2), (c p, n + 3), (c p, n + 4), \dots, (c p, n + m), (c p, n + m + 1)]$
	Control information of add messages that correspond to $m a t c h_{1}$
$m_{n + m + 2}$	$[C I_{c p}$ of $m_{n}$ + $(c p, n + m + 2)]$
$m_{n + m + 3}$	$[C I_{c p}$ of $m_{n}$ + $(c p, n + m + 2), (c p, n + m + 3)]$
$m_{n + m + 4}$	$[C I_{c p}$ of $m_{n}$ + $(c p, n + m + 2), (c p, n + m + 3), (c p, n + m + 4)]$
...	...
$m_{n + m + k}$	$[C I_{c p}$ of $m_{n}$ + $(c p, n + m + 2), (c p, n + m + 3), (c p, n + m + 5), \dots, (c p, n + m + k)]$
$m_{n + m + k + 1}$	$[C I_{c p}$ of $m_{n}$ + $(c p, n + m + 2), (c p, n + m + 3), (c p, n + m + 5), \dots, (c p, n + m + k), (c p, n + m + k + 1)]$
	Control information of add messages that correspond to $m a t c h_{2}$
$m_{n + m + k + 2}$	$[C I_{c p}$ of $m_{n + m + 1}$ + $(c p, n + m + k + 2)]$
$m_{n + m + k + 3}$	$[C I_{c p}$ of $m_{n + m + 1}$ + $(c p, n + m + k + 2), (c p, n + m + k + 3)]$
$m_{n + m + k + 4}$	$[C I_{c p}$ of $m_{n + m + 1}$ + $(c p, n + m + k + 2), (c p, n + m + k + 3), (c p, n + m + k + 4)]$
...	...
$m_{n + m + k + j}$	$[C I_{c p}$ of $m_{n + m + 1}$ + $(c p, n + m + k + 2), (c p, n + m + k + 3), (c p, n + m + k + 4), \dots, (c p, n + m + k + j)]$
$m_{n + m + k + j + 1}$	$[C I_{c p}$ of $m_{n + m + k + j}$ + $(c p, n + m + k + j + 1)]$
$m_{n + m + k + j + 2}$	$[C I_{c p}$ of $m_{n + m + k + j}$ + $(c p, n + m + k + j + 1), (c p, n + m + k + j + 2)]$

References

Kreutz, D.; Ramos, F.M.V.; Verissimo, P.; Rothenberg, C.E.; Azodolmolky, S.; Uhlig, S. Software-defined networking: A comprehensive survey. Proc. IEEE 2015, 103, 14–76. [Google Scholar] [CrossRef]
Openflow Switch Specification Version 1.5.1. Open Networking Foundation Tech. Rep. 2015. Available online: https://www.opennetworking.org/sdn-resources/technical-library (accessed on 20 February 2019).
Doria, A.; Salim, J.H.; Haas, R.; Khosravi, H.; Wang, W.; Dong, L.; Gopal, R.; Halpern, J. Forwarding and control element separation (forces) protocol specification. Internet Eng. Task Force (IETF) 2010, 5810, 1–124. [Google Scholar]
Reitblatt, M.; Foster, N.; Rexford, J.; Schlesinger, C.; Walker, D. Abstractions for Network Update. ACM SIGCOMM Comput. Commun. Rev. 2012, 42, 323–334. [Google Scholar] [CrossRef]
Lamport, L. Time, Clocks, and the Ordering of Events in a Distributed System. Commun. ACM 1978, 21, 558–565. [Google Scholar] [CrossRef]
Prakash, R.; Raynal, M.; Singhal, M. An Adaptive Causal Ordering Algorithm Suited to Mobile Computing Environments. J. Parallel Distrib. Comput. 1997, 41, 190–204. [Google Scholar] [CrossRef]
Open Networking Fundation. Software-Defined Networking: The New Norm for Networks. 2012. Available online: https://www.opennetworking.org/images/stories/downloads/sdn-resources/white-papers/wpsdn-newnorm.pdf (accessed on 20 February 2019).
Jarraya, Y.; Madi, T.; Debbabi, M. A Survey and a Layered Taxonomy of Software-Defined Networking. IEEE Commun. Surv. Tutor. 2014, 16, 1955–1980. [Google Scholar] [CrossRef]
Zhou, W.; Li, L.; Luo, M.; Chou, W. REST API Design Patterns for SDN Northbound API. In Proceedings of the 2014 28th International Conference on Advanced Information Networking and Applications Workshops, Victoria, BC, Canada, 13–16 May 2014; pp. 358–365. [Google Scholar]
Canini, M.; Venzano, D.; Peresini, P.; Kostic, D.; Rexford, J. A NICE way to test OpenFlow applications. NSDI 2012, 12, 127–140. [Google Scholar]
Föerster, K.-T.; Schmid, S.; Vissicchio, S. Survey of Consistent Software-Defined Network Updates. IEEE Commun. Surv. Tutor. 2018, 21, 1435–1461. [Google Scholar] [CrossRef]
Ludwig, A.; Rost, M.; Foucard, D.; Schmid, S. Good Network Updates for Bad Packets: Waypoint Enforcement Beyond Destination-Based Routing Policies. In Proceedings of the 13th ACM Workshop on Hot Topics in Networks, Los Angeles, CA, USA, 27–28 October 2014; pp. 15:1–15:7. [Google Scholar]
Liu, H.H.; Wu, X.; Zhang, M.; Yuan, L.; Wattenhofer, R.; Maltz, D. zUpdate: Updating Data Center Networks with Zero Loss. SIGCOMM Comput. Commun. Rev. 2013, 43, 411–422. [Google Scholar] [CrossRef]
Förster, K.-T.; Mahajan, R.; Wattenhofer, W. Consistent updates in software defined networks: On dependencies, loop freedom, and black holes. In Proceedings of the 2016 IFIP Networking Conference (IFIP Networking) and Workshops, Vienna, Austria, 17–19 May 2016. [Google Scholar]
Vissicchio, S.; Cittadini, L. FLIP the (Flow) Table: Fast LIghtweight Policy-preserving SDN Updates. In Proceedings of the IEEE INFOCOM 2016—The 35th Annual IEEE International Conference on Computer Communications, San Francisco, CA, USA, 10–14 April 2016; pp. 1–9. [Google Scholar]
Katta, N.P.; Rexford, J.; Walker, D. Incremental Consistent Updates. In Proceedings of the Second ACM SIGCOMM Workshop on Hot Topics in Software Defined Networking, Hong Kong, China, 16 August 2013; pp. 49–54. [Google Scholar]
Nguyen, T.D.; Chiesa, M.; Canini, M. Decentralized Consistent Updates in SDN. In Proceedings of the Symposium on SDN Research, Santa Clara, CA, USA, 3–4 April 2017; pp. 21–33. [Google Scholar]
Fayazbakhsh, S.K.; Chiang, L.; Sekar, V.; Yu, M.; Mogul, J.C. Enforcing Network-wide Policies in the Presence of Dynamic Middlebox Actions Using Flowtags. In Proceedings of the 11th USENIX Conference on Networked Systems Design and Implementation, Seattle, WA, USA, 2–4 April 2014. [Google Scholar]
Mizrahi, T.; Moses, Y. Software defined networks: It’s about time. In Proceedings of the 35th Annual IEEE International Conference on Computer Communications, San Francisco, CA, USA, 10–14 April 2016; pp. 1–9. [Google Scholar]
Mizrahi, E.S.T.; Moses, Y. Timed consistent network updates in software-defined networks. IEEE/ACM Trans. Netw. 2016, 24, 3412–3425. [Google Scholar] [CrossRef]
Liu, S.; Benson, T.A.; Reiter, M.K. Efficient and Safe Network Updates with Suffix Causal Consistency. In Proceedings of the Fourteenth EuroSys Conference 2019, Dresden, Germany, 25–28 March 2019; pp. 1–15. [Google Scholar]
Mahajan, R.; Wattenhofer, R. On Consistent Updates in Software Defined Networks. In Proceedings of the 12th ACM Workshop on Hot Topics in Networks, New York, NY, USA, 21–22 November 2013; pp. 1–7. [Google Scholar]
Amiri, S.A.; Ludwig, A.; Marcinkowski, J.; Schmid, S. Transiently Consistent SDN Updates: Being Greedy is Hard. In International Colloquium on Structural Information and Communication Complexity; Springer: Cham, Switzerland, 2016; pp. 391–406. [Google Scholar]
Förster, K.-T.; Wattenhofer, W. The power of two in consistent network updates: Hard loop freedom, easy flow migration. In Proceedings of the 2016 25th International Conference on Computer Communication and Networks (ICCCN), Waikoloa, HI, USA, 1–4 August 2016. [Google Scholar]
Ludwig, A.; Marcinkowski, J.; Schmid, S. Scheduling Loop-free Network Updates: It’s Good to Relax! In Proceedings of the 2015 ACM Symposium on Principles of Distributed Computing, New York, NY, USA, 21–23 July 2015; pp. 13–22. [Google Scholar]
Föerster, K.-T.; Ludwig, A.; Marcinkowski, J.; Schmid, S. Loop-Free Route Updates for Software-Defined Networks. IEEE/ACM Trans. Netw. 2018, 26, 328–341. [Google Scholar] [CrossRef]
Föerster, K.-T.; Luedi, T.; Seidel, J.; Wattenhofer, R. Local checkability, no strings attached: (a)cyclicity, reachability, loop free updates in sdns. Theor. Comput. Sci. 2018, 709, 48–63. [Google Scholar] [CrossRef]
Naor, M.; Stockmeyer, L.J. What can be computed locally? In Proceedings of the Twenty-Fifth Annual ACM Symposium on Theory of Computing, San Diego, CA, USA, 16–18 May 1993; pp. 184–193. [Google Scholar]
Pomares Hernández, S.E. The Minimal Dependency Relation for Causal Event Ordering in Distributed Computing. Appl. Math. Inf. Sci. 2015, 9, 57–61. [Google Scholar] [CrossRef]
Mills, D.L. Internet Time Synchronization: The Network Time Protocol. IEEE/ACM Trans. Netw. 1991, 39, 1482–1493. [Google Scholar] [CrossRef]
Pomares Hernandez, S.E.; Lopez Dominguez, E.; Rodriguez Gomez, G.; Fanchon, J. An Efficient Δ-Causal Algorithm for Real-Time Distributed Systems. J. Appl. Sci. 2009, 9, 1711–1718. [Google Scholar] [CrossRef]

Figure 1. The general architecture of SDN.

Figure 2. Flow swap scenario: (a) initial forwarding policy; (b) after updating the initial forwarding policy.

Figure 3. The communication diagram corresponding to Figure 2.

Figure 4. The generic Transient Forwarding Loop (TFL) scenario: (a) initial and final forwarding paths; (b) a sequence of messages to redirect a

p f l o w

from

S_{n}

to

S_{1}

ends with a TFL.

Figure 5. The communication diagram corresponding to Figure 4.

Figure 6. Two forwarding path update scenarios from the same source to the same destination.

Table 1. Update messages calculated for updating

P_{r p}

that correspond to

m a t c h_{1}

and

m a t c h_{2}

.

Table 1. Update messages calculated for updating

P_{r p}

that correspond to

m a t c h_{1}

and

m a t c h_{2}

.

Message	Message Content
	Delete update messages that correspond to $m a t c h_{1}$
$m_{1}$	$(c p, t_{c p} = 1, (m a t c h_{1}, d e l e t e), r p_{s r c})$
$m_{2}$	$(c p, t_{c p} = 2, (m a t c h_{1}, d e l e t e), r p_{1})$
$m_{3}$	$(c p, t_{c p} = 3, (m a t c h_{1}, d e l e t e), r p_{2})$
...	...
$m_{n - 2}$	$(c p, t_{c p} = n - 2, (m a t c h_{1}, d e l e t e), r p_{n - 2})$
$m_{n - 1}$	$(c p, t_{c p} = n - 1, (m a t c h_{1}, d e l e t e), r p_{n - 1})$
$m_{n}$	$(c p, t_{c p} = n, (m a t c h_{1}, d e l e t e), r p_{n})$
	Delete update messages that correspond to $m a t c h_{2}$
$m_{n + 1}$	$(c p, t_{c p} = n + 1, (m a t c h_{2}, d e l e t e), r p_{s r c})$
$m_{n + 2}$	$(c p, t_{c p} = n + 2, (m a t c h_{2}, d e l e t e), r p_{n})$
$m_{n + 3}$	$(c p, t_{c p} = n + 3, (m a t c h_{2}, d e l e t e), r p_{n - 1})$
...	...
$m_{n + m}$	$(c p, t_{c p} = n + m, (m a t c h_{2}, d e l e t e), r p_{2})$
$m_{n + m + 1}$	$(c p, t_{c p} = n + m + 1, (m a t c h_{2}, d e l e t e), r p_{1})$
	Add update messages that correspond to $m a t c h_{1}$
$m_{n + m + 2}$	$(c p, t_{c p} = n + m + 2, (m a t c h_{1}, a d d), r p_{s r c})$
$m_{n + m + 3}$	$(c p, t_{c p} = n + m + 3, (m a t c h_{1}, a d d), r p_{n})$
$m_{n + m + 4}$	$(c p, t_{c p} = n + m + 4, (m a t c h_{1}, a d d), r p_{n - 1})$
...	...
$m_{n + m + k}$	$(c p, t_{c p} = n + m + k, (m a t c h_{1}, a d d), r p_{2})$
$m_{n + m + k + 1}$	$(c p, t_{c p} = n + m + k + 1, (m a t c h_{1}, a d d), r p_{1})$
	Add update messages that correspond to $m a t c h_{2}$
$m_{n + m + k + 2}$	$(c p, t_{c p} = n + m + k + 2, (m a t c h_{2}, a d d), r p_{s r c})$
$m_{n + m + k + 3}$	$(c p, t_{c p} = n + m + k + 3, (m a t c h_{2}, a d d), r p_{1})$
$m_{n + m + k + 4}$	$(c p, t_{c p} = n + m + k + 4, (m a t c h_{2}, a d d), r p_{2})$
...	...
$m_{n + m + k + j}$	$(c p, t_{c p} = n + m + k + j, (m a t c h_{2}, a d d), r p_{n - 2})$
$m_{n + m + k + j + 1}$	$(c p, t_{c p} = n + m + k + j + 1, (m a t c h_{2}, a d d), r p_{n - 1})$
$m_{n + m + k + j + 2}$	$(c p, t_{c p} = n + m + k + j + 2, (m a t c h_{2}, a d d), r p_{n})$

Table 2. Packets interleaving with update messages that correspond to

m a t c h_{1}

.

Table 2. Packets interleaving with update messages that correspond to

m a t c h_{1}

.

Packet ${pkt}_{k}$	Packet Content
$p k t_{1}$	$(r p_{s r c}, t_{r p_{s r c}} = 1, (m a t c h_{1}, d a t a), r p_{n})$
$p k t_{2}$	$(r p_{n}, t_{r p_{n}} = 1, (m a t c h_{1}, d a t a), r p_{n - 1})$
$p k t_{3}$	$(r p_{n - 1}, t_{r p_{n - 1}} = 1, (m a t c h_{1}, d a t a), r p_{n - 2})$
...	...
$p k t_{n}$	$(r p_{2}, t_{r p_{2}} = 1, (m a t c h_{1}, d a t a), r p_{1})$
$p k t_{n + 1}$	$(r p_{1}, t_{r p_{1}} = 1, (m a t c h_{1}, d a t a), r p_{d s t})$

Table 3. Delivery of update messages and data packets

m p \in M P

corresponding to

m a t c h_{1}

.

Table 3. Delivery of update messages and data packets

m p \in M P

corresponding to

m a t c h_{1}

.

Message/Packet	${CI}_{mp}$	Delivery Condition	${CI}_{rp}$
Reception of the delete update message by $r p_{s r c}$
$m_{1}$	$C I_{m_{1}}$	Delivered (✔)	$C I_{r p_{s r c}} = \cup_{m a x}$ $C I_{m_{1}} + (c p, 1)$
Reception of the add update message by $r p_{s r c}$
$m_{m + n + 2}$	$C I_{m_{n + m + 2}}$	✔	$C I_{r p_{s r c}} = \cup_{m a x} C I_{m_{n + m + 2}} + (c p, n + m + 2)$
Reception of the add update message by $r p_{n}$
$m_{m + n + 3}$	$C I_{m_{n + m + 3}}$	✔	$C I_{r p_{n}} = \cup_{m a x} C I_{m_{n + m + 3}} + (c p, n + m + 3)$
Reception of the add update message by $r p_{n - 1}$
$m_{m + n + 4}$	$C I_{m_{n + m + 4}}$	✔	$C I_{r p_{n - 1}} = \cup_{m a x} C I_{m_{n + m + 4}} + (c p, n + m + 4)$
Reception of the add update message by $r p_{2}$
$m_{m + n + k}$	$C I_{m_{n + m + k}}$	✔	$C I_{r p_{2}} = \cup_{m a x} C I_{m_{n + m + k}} + (c p, n + m + k)$
Reception of the add update message by $r p_{1}$
$m_{m + n + k + 1}$	$C I_{m_{n + m + k + 1}}$	✔	$C I_{r p_{1}} = \cup_{m a x} C I_{m_{n + m + k + 1}} + (c p, n + m + k + 1)$
Reception of $p k t_{1}$ by $r p_{n}$
$p k t_{1}$	$C I_{r p_{s r c}}$	✔	$C I_{r p_{n}} = \cup_{m a x} C I_{r p_{s r c}} + (r p_{s r c}, 1)$
Reception of $p k t_{2}$ by $r p_{n - 1}$
$p k t_{2}$	$C I_{r p_{n}} = [\dots, (c p, n - 1), \dots]$	Wait ()	–
Reception of the delete update message by $r p_{n - 1}$
$m_{n - 1}$	$C I_{m_{n - 1}}$	✔	$C I_{r p_{n - 1}} = \cup_{m a x} C I_{m_{n - 1}} + (c p, n - 1)$
Reception of the $p k t_{2}$ by $r p_{n - 1}$ after delivering $m_{n - 1}$
$p k t_{2}$	$C I_{r p_{n}} = [\dots, (c p, n - 1), \dots]$	✔	$C I_{r p_{n - 1}} = \cup_{m a x} C I_{r p_{n}} + (r p_{n}, 1)$
Reception of $p k t_{3}$ by $r p_{n - 2}$
$p k t_{3}$	$C I_{r p_{n - 1}} = [\dots, (c p, n - 2), \dots]$		–
Reception of the delete update message by $r p_{n - 2}$
$m_{n - 2}$	$C I_{m_{n - 2}}$	✔	$C I_{r p_{n - 2}} = \cup_{m a x} C I_{m_{n - 2}} + (c p, n - 2)$
Reception of the $p k t_{3}$ by $r p_{n - 2}$ after delivering $m_{n - 2}$
$p k t_{3}$	$C I_{r p_{n - 1}} = [\dots, (c p, n - 2), \dots]$	✔	$C I_{r p_{n - 2}} = \cup_{m a x} C I_{r p_{n - 1}} + (r p_{n - 1}, 1)$
The same for the other intermediate routing processes
...	...	...	...
Reception of $p k t_{n}$ by $r p_{1}$
$p k t_{n}$	$C I_{r p_{2}} = [\dots, (c p, 2), \dots]$		–
Reception of the delete update message by $r p_{1}$
$m_{2}$	$C I_{m_{2}}$	✔	$C I_{r p_{2}} = \cup_{m a x} C I_{m_{2}} + (c p, 2)$
Reception of $p k t_{n}$ by $r p_{1}$ after delivering $m_{2}$
$p k t_{n}$	$C I_{r p_{2}} = [\dots, (c p, 2), \dots]$	✔	$C I_{r p_{1}} = \cup_{m a x} C I_{r p_{2}} + (r p_{2}, 1)$
Reception of $p k t_{n} + 1$ by $r p_{d s t}$
$p k t_{n} + 1$	$C I_{r p_{1}}$	✔	$C I_{r p_{d s t}} = \cup_{m a x} C I_{r p_{1}} + (r p_{1}, 1)$

© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

Towards Causal Consistent Updates in Software-Defined Networks

Abstract

1. Introduction

2. Preliminaries

2.1. Fundamental Abstractions of SDNs

2.2. Network Traffic Handling in an OpenFlow-Based SDN

3. Problem Description: TFL as an Inconsistent Network Update Problem

5. System Model

5.1. Distributed System Model

5.2. Time and Causal Order

5.3. SDN Model

5.4. Causal Order Delivery

6. Modeling the TFL Pattern from a Temporal Perspective

7. The TFL-Free Property

7.1. Modeling the TFL Pattern from a Causal Perspective

7.2. Algorithm for TFL-Free SDN Updating

7.2.1. Algorithm Overview

7.2.2. Data Structures

7.2.3. Algorithm Details

7.3. Proof of Correctness

8. Scenario Description

9. Discussion

10. Conclusions

Author Contributions

Funding

Acknowledgments

Conflicts of Interest

Appendix A. Tables of Control Information Related to the Scenario Description of Section 8

References

Article Metrics

Citations

Article Access Statistics

Towards Causal Consistent Updates in Software-Defined Networks

Abstract

1. Introduction

2. Preliminaries

2.1. Fundamental Abstractions of SDNs

2.2. Network Traffic Handling in an OpenFlow-Based SDN

3. Problem Description: TFL as an Inconsistent Network Update Problem

4. Related Work

5. System Model

5.1. Distributed System Model

5.2. Time and Causal Order

5.3. SDN Model

5.4. Causal Order Delivery

6. Modeling the TFL Pattern from a Temporal Perspective

7. The TFL-Free Property

7.1. Modeling the TFL Pattern from a Causal Perspective

7.2. Algorithm for TFL-Free SDN Updating

7.2.1. Algorithm Overview

7.2.2. Data Structures

7.2.3. Algorithm Details

7.3. Proof of Correctness

8. Scenario Description

9. Discussion

10. Conclusions

Author Contributions

Funding

Acknowledgments

Conflicts of Interest

Appendix A. Tables of Control Information Related to the Scenario Description of Section 8

References

Article Metrics

Citations

Article Access Statistics