Reducing Flow Table Update Costs in Software-Defined Networking

In software-defined networking (SDN), the traffic forwarding delay highly depends on the latency associated with updating the forwarding rules in flow tables. With the increase in fine-grained flow control requirements, enabled by the flexible control capabilities of SDN, more rules are being inserted into and removed from flow tables. Moreover, the matching fields of these rules might overlap, since multiple control domains might generate different rules for similar flows. This overlap implies dependency relationships among the rules, imposing various restrictions on forwarding entries during updates, e.g., following update orders or storing entries at specified locations, especially in flow tables implemented with ternary content addressable memory (TCAM); otherwise, mismatching or packet dropping will occur. Resolving and maintaining dependencies during updates usually takes a while, which hinders high forwarding efficiency. To reduce the delay associated with updating dependent rules, in this paper, we propose an updating algorithm for TCAM-based flow tables. We formulate the TCAM maintenance process as an NP-hard problem and analyze the inefficiency of existing moving approaches. To solve the problem, we propose an optimal moving chain for single rule updates and provide a theoretical proof of its minimum moving steps. For multiple rules arriving at a switch simultaneously, we design a dynamic approach to update concurrent entries; it is able to update multiple rules heuristically within a restricted TCAM region. As the update efficiency depends on the dependencies among rules, we evaluate our flow table updating algorithms under different dependency complexities. The results show that our approach achieves about 6% fewer moving steps than existing approaches. The advantage is more pronounced when the flow table is heavily utilized and rules have longer dependency chains.


Introduction
The cloud-edge computing paradigm presents great challenges for networking technology tasked with serving millions of endpoints, each with various communication requirements, e.g., low delay, high bandwidth, security, etc. First, computing services are distributed into the cloud or multiple edges, creating more communication demands between the core cloud and edge clouds. Thus, switches have to maintain increasing numbers of forwarding entries in flow tables to forward traffic to the correct service. Second, due to the mobility characteristics of endpoints (e.g., sensors, unmanned vehicles), the communication quality between the endpoints and edge clouds varies, depending on the changing environment. Forwarding entries are added to or deleted from flow tables as connections go up or down. In particular, in highly dynamic environments, these changes may occur at a rate of hundreds to thousands of entries per second [1,2]. Thus, flow table update efficiency is quite critical to ensure forwarding quality.
In the context of software-defined networking (SDN), new entries are inserted into the corresponding flow tables upon the arrival of a new flow, so that switches are able to look up the flow tables to match the flow and find the forwarding actions. A flow entry update usually takes on the order of milliseconds in a flow table [3], while the packet lookup rate is on the order of millions of packets per second. Slow flow entry updates will lead to packets being buffered in switches and will result in dropped packets in cases of buffer overflow. As flow table updates are slow relative to packet lookups, it is paramount to speed up flow table updates to avoid a packet-forwarding bottleneck.
In recent years, ternary content addressable memory (TCAM) has been widely used for SDN flow tables, largely because its ternary representation (0, 1, *) and fast lookup property are quite suitable for network address matching. It is able to perform a full flow table lookup in one clock cycle regardless of the flow table size, which is at least six times faster than SRAM-based solutions. A TCAM-based flow table can be seen as an ordered array with parallel lookup ability [4]. Entries are stored in the TCAM from high to low physical addresses in a carefully designed order; entries at higher addresses have higher matching priorities than those at lower addresses. As the matching fields of flow entries may overlap, an incoming packet may match multiple entries. In this case, only the entry at the highest physical address is chosen [5]. Thus, switches have to maintain the desired ordering of entries during flow table updates, which is important to ensure correct packet matching.
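To make this matching semantics concrete, here is a minimal Python sketch (our illustration, not part of any switch implementation) of the highest-address-wins lookup over ternary entries:

```python
# Illustrative sketch: a TCAM-like table modeled as a list indexed by physical
# address; among all matching entries, the one at the highest address wins.

def ternary_match(pattern, key):
    """Check a ternary pattern (string of '0', '1', '*') against a bit string."""
    return all(p == '*' or p == k for p, k in zip(pattern, key))

def tcam_lookup(table, key):
    """table[i] is the entry at physical address i (None = empty slot).
    Return the action of the matching entry at the highest address."""
    for addr in range(len(table) - 1, -1, -1):  # scan from high to low addresses
        entry = table[addr]
        if entry is not None and ternary_match(entry["match"], key):
            return entry["action"]
    return None  # table miss

# Two overlapping entries: the more specific rule sits at the higher address.
table = [None] * 4
table[3] = {"match": "01*", "action": "output:1"}  # higher address, higher priority
table[1] = {"match": "0**", "action": "output:2"}
```

A packet matching both entries (e.g., key `"010"`) is forwarded by the entry at address 3, which is why the ordering of overlapping entries must be preserved during updates.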
The order of entries in flow tables is determined by the relationships among forwarding entries: overlapping matching fields produce dependency relationships. For instance, flow entry A: dst = 192.168.1.10/32, priority = 2, action = output:1 has an intersecting destination matching field with flow entry B: dst = 192.168.1.0/24, priority = 1, action = output:2, but the priority of A is higher than that of B. Thus, packets matching entry A will ignore B; in other words, entry B depends on entry A. To effectively maintain dependencies among entries, there are two prevalent approaches: integer priorities and dependency graphs. An integer priority value provides poor clues about rule dependencies; it brings in plenty of non-existent dependencies and oftentimes makes avoidable priority updates necessary [6], leading to massive redundant TCAM moves [7]. The dependency graph approach constructs a directed acyclic graph (DAG) from the dependency relationships among flow entries, where nodes are the entries and directed edges represent the dependencies between adjacent nodes. With the dependency graph, we can clearly identify the dependencies among entries and focus on these actual dependencies during flow table updates. A topological sort of the DAG is able to maintain the correct ordering of entries, and the actual dependencies can be further utilized to help avoid unnecessary TCAM update overhead.
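The dependency discovery described above can be sketched as follows; the short bit patterns and the rule set are hypothetical examples standing in for prefix matches like 192.168.1.10/32 vs. 192.168.1.0/24:

```python
# Hedged sketch of building a dependency DAG from ternary matching fields:
# edge (i, j) means rule i depends on rule j (overlapping fields, lower
# priority), following the r_i -> r_j convention used in the text.

def overlaps(p1, p2):
    """Two ternary patterns overlap iff no bit position pins them to different values."""
    return all(a == '*' or b == '*' or a == b for a, b in zip(p1, p2))

def build_dependency_dag(rules):
    """rules: dict name -> (match, priority). Returns {(i, j): i depends on j}."""
    edges = set()
    for i, (mi, pi) in rules.items():
        for j, (mj, pj) in rules.items():
            if i != j and overlaps(mi, mj) and pi < pj:
                edges.add((i, j))
    return edges

# "A" plays the /32-style specific rule, "B" the covering /24-style rule.
rules = {"A": ("0100", 2), "B": ("01**", 1)}
```

Here `build_dependency_dag(rules)` yields the single edge ("B", "A"): the broad, low-priority rule B depends on the specific, high-priority rule A, matching the example in the text.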
Hardware flow tables usually contain fewer than 5000 flow entries [8], so limited table sizes cause frequent changes, and the time cost of a TCAM update is a non-linear function of the number of rules. It takes about 6 s to install 500 rules, while 1500 rules take almost 2 min [9]. An entry update takes longer when it removes an existing entry and installs a new one; one such update can take up to 50 ms in a 1k-sized flow table [7]. To efficiently insert new entries, existing solutions usually move existing entries to other locations to make space for the new entries, considering the dependency relationships. A naive solution shifts the downstream or upstream entries next to the desired insertion location (downward or upward, each by one location), creating an empty space to store the new entry without violating the dependency relationships; the worst-case time complexity is O(N) [10]. To avoid the overhead of naive moving, an entry can be moved to a downward or upward location more freely as long as the dependencies are maintained. RuleTris [7] considers the dependencies among entries and moves either an upstream moving chain upward or a downstream moving chain downward. However, without fully leveraging the dependency relationships, this is not the minimum moving scheme. We further reduce the moving steps in this paper.
In this paper, we first show that a unidirectional (either upstream or downstream) moving approach can overlook chances to move downstream entries up or upstream entries down, failing to find a shorter moving chain. Moreover, inserting a new entry may introduce extra dependency relationships among existing entries, such that existing entries in the flow table may violate the new dependencies even if they comply with the previous ones. We propose an algorithm for single-entry updating that utilizes bidirectional moving, and we prove its efficiency in minimizing update costs. Further, we design a heuristic algorithm to update the TCAM-based flow table dynamically, which supports updating multiple entries concurrently. The evaluation results show that our updating algorithms consistently outperform existing updating approaches, reducing the moving costs of TCAM entries by about 6% compared to unidirectional approaches.
The rest of this paper is organized as follows: Section 2 reviews related work. We formulate the TCAM-based flow table update problem in Section 3. In Section 4, we describe the problem of updating with a moving chain. We introduce algorithms to minimize the TCAM update overhead of inserting a single entry in Section 5 and to update multiple entries dynamically in Section 6. In Section 7, we evaluate the proposed algorithms and analyze the evaluation results. Finally, we present our conclusions in Section 8.

Related Work
Forwarding rule dependency. In recent years, many research studies have focused on resolving the dependencies among forwarding rules, e.g., [3,11-13]. FLIP [14] proposes an algorithm for SDN network updates that preserves forwarding policies; it keeps track of alternative constraints, preventing repeated policy violations. However, priorities do not directly express dependency relationships, which introduces many spurious dependencies. CoVisor [15] assigns priority values when composing the policies of multiple controllers without changing the priorities of existing rules, minimizing updates on existing entries. However, it is unable to resolve dependencies among rules locally on switches when switches update autonomously or when people manually configure the flow tables.
Flow entry installation. Flow entry installation can be divided into reactive and proactive approaches, depending on whether the first packet of a new flow arrives before or after the corresponding entry installation. As it is not possible to install all rules in advance, reactive approaches are used to handle fine-grained traffic control in most cases, e.g., [16,17]. Once a matching failure occurs, it takes longer to establish new entries along the forwarding path, which can lead to further congestion and packet loss. Proactive approaches, e.g., [18-20], help reduce packet-forwarding latency with traffic prediction and estimation. However, this limits the controller's ability to dynamically react to changes in the network traffic [21], and it does not alleviate issues related to new entry establishment latency.

Flow table caching.
Many research studies aim to use less TCAM space to forward the vast majority of traffic while respecting the dependencies among rules, e.g., [22,23]. DevoFlow [2] proposes sampling and trigger techniques to identify elephant flows, deploying flow entries only for these elephant flows. CRAFT [24] reduces the occurrence of multiple overlapping rules in the cache. Ref. [25] proposes a cover-set approach to solve the rule dependency issue, prioritizing important rules for TCAM caching. CacheFlow [26] caches small groups of rules by breaking long dependency chains while preserving the dependencies among policies. Ref. [27] designs a traffic-aware architecture that monitors the temporal behavior of rules to update flow table caching in a timely manner.
Verify and synthesize flow tables. Automated tools that verify and synthesize consistent flow tables are also quite popular in network updating. Snowcap [28] leverages counter-examples to synthesize network configuration updates. NetStack [29] utilizes a Stackelberg game to ensure fundamental forwarding properties, such as reachability, loop freedom, and waypoint enforcement. Kaki [30] presents a Petri game-based synthesis approach. However, guaranteeing real-time update efficiency with complicated dependencies remains a challenge for automatic verification and synthesis approaches.

TCAM Maintenance Problem
Before updating the TCAM-based flow table, we need to sort the overlapping entries so that those with longer prefixes are always situated at the top positions with higher priority. As flow table lookups are top-down, entries with higher priority are placed closer to the top of the flow table. The key notations are listed in Table 1, where <r_i, r_j> denotes the edge r_i → r_j in the dependency graph (r_i depends on r_j).

Definition 1. We use r_i → r_j to denote the dependency between rules r_i and r_j if their matching fields overlap, r_i.match ∩ r_j.match ≠ ∅, and the priority of r_i is lower than that of r_j, r_i.priority < r_j.priority.
When we update a flow table, we first have to discover the dependencies among entries. We can organize the dependency relationships among the rules into a directed acyclic graph G = {R, E}, R = {r_1, r_2, ..., r_m}. A function τ maps the rules to TCAM slots, complying with the dependency relationships.

Definition 2. A TCAM-based flow table mapping is a bijective function: τ: rule → slot maps a rule to a flow table slot, and its inverse r: slot → rule obtains an entry from its position.
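A minimal sketch of deriving such a mapping τ from a topological order of the dependency DAG (Kahn's algorithm; the function and variable names are ours): every rule receives a lower slot index than every rule it depends on, so higher-priority rules end up at higher addresses.

```python
from collections import defaultdict, deque

def topo_mapping(nodes, edges):
    """edges: set of (i, j) meaning i depends on j, so we require tau[i] < tau[j].
    Returns tau: rule -> slot index (larger index = higher physical address)."""
    succ = defaultdict(list)
    indeg = {n: 0 for n in nodes}
    for i, j in edges:
        succ[i].append(j)       # i must be placed before (below) j
        indeg[j] += 1
    q = deque(sorted(n for n in nodes if indeg[n] == 0))
    tau, slot = {}, 0
    while q:                    # Kahn's algorithm: peel off dependency-free rules
        n = q.popleft()
        tau[n] = slot
        slot += 1
        for m in succ[n]:
            indeg[m] -= 1
            if indeg[m] == 0:
                q.append(m)
    if slot != len(nodes):
        raise ValueError("dependency graph has a cycle")
    return tau
```

As the text notes, many topological orders exist; this sketch returns just one of them, while the rest of the paper is about choosing placements that minimize moving costs.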
Mapping rules into the TCAM-based flow table could be viewed as searching for a topological order of a dependency graph. However, there are usually many topological ordering sequences complying with the dependency relationships, resulting in different update overheads.
During the update process, the flow table may move some existing entries away to insert new rules due to the dependency relationships. To record the changes, we add the new rules and their dependency relationships into the directed graph, evolving from G to G′. For example, in Figure 1a, when we insert entry R′ with dependency relationships R′ → R6 and R7 → R′, the position of R′ in the table should be between R6 and R7, although there is no empty slot to admit R′ in the current mapping τ. Thus, we have to move some rules away to vacate an available slot for R′. This means we need another mapping function τ′, which allows R′ to be inserted into the flow table while complying with the new dependency graph G′ as well as the existing G.
To minimize the potential update overhead while maintaining the dependencies in G and G′, we formulate the TCAM-based flow table update problem as follows. In the formulation, we attempt to minimize the number of moved rules during the updates; c(r) = 1 means that the position of rule r changes after updating, i.e., τ′(r) ≠ τ(r), while both mappings τ and τ′ comply with the dependencies. The model is an integer programming (IP) problem, which is known to be NP-hard.
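Based on the description above, the integer program can be written as follows (our reconstruction from the surrounding text; the exact constraint set of the original formulation may differ):

```latex
\begin{aligned}
\min \quad & \sum_{r \in R} c(r) \\
\text{s.t.} \quad & \tau'(r_i) < \tau'(r_j), && \forall \langle r_i, r_j \rangle \in E' \\
& \tau' \text{ is a bijection from rules to occupied slots}, \\
& c(r) = \begin{cases} 1, & \tau'(r) \neq \tau(r) \\ 0, & \text{otherwise} \end{cases} && \forall r \in R
\end{aligned}
```

The first constraint encodes the dependency ordering of Definition 1 (a rule sits at a lower address than every rule it depends on, under both G and G′), and the objective counts the entries whose physical position changes.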

Update Flow Table with a Moving Chain
Intuitively, inserting a new entry involves exchanging existing entries with an empty slot to make space for the inserted entry. In Figure 1a, a new rule R′ is to be inserted with dependencies R7 → R′ and R′ → R6. To avoid unexpected moves, RuleTris [7] proposes an algorithm to find a moving chain that consists of dependent rules and unrelated rules. A moving chain is a moving sequence {r_0, r_1, ..., r_k}, where entry r_i can be moved to replace r_{i+1}'s slot (i = 0, 1, ..., k − 1), and the moving of entries along the chain complies with the dependency relationships. The length of the moving chain is the number of moved rules.
As Figure 1b shows, the shortest moving chain with RuleTris would be R7 → R9 → R10 → R12 → R13 or R6 → R5 → R4 → R3 → R1, both with five moves. However, RuleTris only considers moving either the upstream entries upward or the downstream entries downward. Some entries at higher positions have no relationship with the inserted rule and could be moved to lower positions; similarly, independent entries at lower positions may be moved to higher positions to make space for the insertion. With this in mind, in Figure 1c, we find a shorter moving chain R7 → R8 → R2 → R1 with only four moves, which means moving unrelated rules upward or downward may lead to fewer moving steps.
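Executing a moving chain can be sketched as follows (a hedged illustration, not the paper's code): applying the moves from the empty slot backward keeps every intermediate table state consistent, because each entry is copied into a slot that is already vacant before its old slot is cleared.

```python
# Sketch: execute a moving chain on a slot array. chain_slots holds the slots
# of r_0 .. r_{k-1}; r_{k-1} moves into empty_slot, every earlier rule moves
# into the next rule's slot, and new_rule finally takes r_0's slot.

def apply_moving_chain(table, chain_slots, empty_slot, new_rule):
    assert table[empty_slot] is None
    slots = chain_slots + [empty_slot]
    # Walk the chain from the tail back to the head so each destination is free.
    for src, dst in reversed(list(zip(slots, slots[1:]))):
        table[dst] = table[src]
        table[src] = None
    table[slots[0]] = new_rule
    return len(chain_slots)  # number of moved entries = chain length
```

For a table `["R3", "R2", "R1", None]`, applying the chain over slots `[0, 1, 2]` with empty slot 3 shifts each entry one slot up and installs the new rule at slot 0, at a cost of three moves.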
Moreover, it may be infeasible to find a moving chain with RuleTris [7]. In Figure 2a, we insert a new entry R′ with matching field 0*0 and priority 3. The matching field of R′ intersects with R1 (01*), R3 (*00), and R4 (*0*). Considering the priorities, R′ depends on R3 (i.e., R′ → R3), while R1 and R4 depend on R′ (i.e., R1 → R′, R4 → R′). The dependencies of R′ with the existing entries are shown in Figure 2b. We would insert R′ into the table above all the predecessors (R1, R4) and below all the successors (R3); however, there is no intersecting area. As R1 and R3 have no dependency relationship, R1 will not be swapped to a location below R3 and R4 by the moving chain constructed by RuleTris [7]. Thus, we have to rearrange the disordered area [R1, R2, R3] into [R2, R3, R1], as in Figure 2c, before searching for a moving chain. We design Algorithm 1 to reschedule the disordered area so that predecessors are always below successors. Lines 5-10 either move predecessors (that were previously placed above successors) downward or shift successors (that were formerly below their predecessors) upward. Line 7 heuristically moves the rule with the minimum moving chain until the ordering requirement of predecessors and successors is satisfied. We then need an algorithm to search for a moving chain, which is described in Section 5.
Algorithm 1 Reschedule existing rules for reversed dependency.

Minimizing the TCAM Update Overhead of a Single Entry Insertion
We propose an algorithm to calculate a minimum moving chain for inserting a single entry, which also helps reschedule disordered rules in Algorithm 1. First, we define the feasible moving range for each entry in Lemma 1, within which an entry can be moved to any slot in a manner that respects the dependencies. Then, we design Algorithm 2 and prove the minimality of its moving steps with Theorem 1.

Lemma 1. Let succ(r_insert) = argmin_{<r_insert, r> ∈ E} τ(r) denote the entry that r_insert depends on at the lowest position, and pre(r_insert) = argmax_{<r, r_insert> ∈ E} τ(r) denote the entry that depends on r_insert at the highest position. Let τ_succ be the closest empty slot above τ(succ(r_insert)), and τ_pre the closest empty slot below τ(pre(r_insert)). Then, for all entries r with τ(r) < τ_pre or τ(r) > τ_succ, there exists a mapping τ′ satisfying τ′(r) = τ(r) after inserting r_insert.
Proof. If there is an empty slot τ_empty with τ(pre(r_insert)) < τ_empty < τ(succ(r_insert)), we can insert r_insert into τ_empty without moving any other entries. Otherwise, τ_pre < τ(pre(r_insert)) < τ(succ(r_insert)) < τ_succ. Consider r_insert and the n entries between τ_pre and τ_succ. As G′ is acyclic, the subgraph consisting of these n + 1 entries (r_insert and every r with τ_pre < τ(r) < τ_succ) is a directed acyclic graph. We can always find a topological order for these n + 1 entries and map them into the n + 2 slots between τ_pre and τ_succ (inclusive). In the sub-table from τ_pre downward, the subgraph consisting of the entries with τ(r) < τ_pre does not change after the insertion, so τ′(r) = τ(r). Analogously, for every r with τ(r) > τ_succ, we also obtain τ′(r) = τ(r).
Lemma 1 indicates that we can make space for the new entry using the nearest upstream and downstream empty slots. Hence, we restrict the moving range of a rule r to [τ_pre, τ_succ]. By the definitions of succ(·) and pre(·), each entry r has no dependency with the rules in slots [τ(pre(r)) + 1, τ(succ(r)) − 1]. Thus, r can be moved to any slot in [τ(pre(r)) + 1, τ(succ(r)) − 1] by filling an empty slot or replacing an existing entry, while the dependencies among entries are still maintained.
We design Algorithm 2 to find a valid update plan for inserting a single entry by moving entries within [τ_pre, τ_succ], and we further prove that the moving plan is minimal. In Lines 2 and 3, if there is an empty slot between r_insert's predecessor pre(r_insert) at the highest position and its successor succ(r_insert) at the lowest position, we can insert r_insert directly into the empty slot with the chain [r_insert, t]. Otherwise, we have to search for the moving chain between the nearest upstream and downstream empty slots [τ_pre, τ_succ] with Lines 5-32. We consider the potential moving steps of each entry in this region: we start searching from the middle of the region (Line 12) and then check slots one by one until reaching τ_pre and τ_succ (Line 15). For the rule r(t) in slot t, we check the moving steps of its dependent rules in slot t′, and we update t′.moves and t′.prev if the recorded moving step is larger than that of moving r(t) to t′ (Lines 20-22). Gradually, we obtain the minimum moving step for each slot in the region. Finally, we obtain the moving chain by tracking .prev from the empty slot (Lines 28-32).
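The search idea can be sketched as a multi-source BFS with unit move costs (equivalent to Dijkstra's algorithm here). This is our simplified reconstruction, not the paper's exact pseudocode: `ranges` abstracts each occupied slot's dependency-feasible range, and the returned chain lists the slots whose entries are displaced, ending at an empty slot.

```python
from collections import deque

def shortest_moving_chain(table, ranges, insert_lo, insert_hi):
    """table[i]: rule name or None. ranges[i]: (lo, hi) feasible slot range of
    the rule currently at slot i. The new rule may occupy any slot in
    [insert_lo, insert_hi]. Returns a minimum chain of slots, or None."""
    prev = {}
    q = deque()
    for s in range(insert_lo, insert_hi + 1):
        prev[s] = None
        if table[s] is None:
            return [s]             # free slot inside the insert range: no moves
        q.append(s)
    while q:
        t = q.popleft()
        lo, hi = ranges[t]         # rule at t may be displaced anywhere in its range
        for t2 in range(lo, hi + 1):
            if t2 in prev:
                continue
            prev[t2] = t
            if table[t2] is None:  # reached an empty slot: rebuild the chain
                chain = [t2]
                while prev[chain[-1]] is not None:
                    chain.append(prev[chain[-1]])
                return chain[::-1]
            q.append(t2)
    return None
```

The result plugs directly into a chain executor: for a chain `[s0, ..., sk]`, the entry at `s0` moves to `s1`, and so on, with the new rule taking `s0`; the number of moves is the chain length minus one.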
Theorem 1. The moving chain found by Algorithm 2 within [τ_pre, τ_succ] is a minimum moving chain for an entry insertion.
Proof. We construct a directed graph DG = {(r, r(k)) | k ∈ [τ(pre(r)), τ(succ(r))]}, which consists of the existing entries in G. An edge in the graph indicates that the entry at its head node can move to the slot of its tail node while maintaining the dependencies. Algorithm 2 searches for the chain, which is the shortest path from r_insert to the empty slot τ_succ or τ_pre, with Dijkstra's algorithm, which was proven to find the shortest path in [31].
If the chain found by Algorithm 2 were not a minimum moving chain, there would exist another chain shorter than it. Recall that Line 14 restricts the moving range of each rule r to the nearest upstream and downstream empty slots, τ_pre and τ_succ. In the constructed directed graph DG, the shortest path from r_insert to r calculated by Dijkstra's algorithm is d(r_insert, r), and the shortest path from r to an empty slot is d(r), so the shortest path from r_insert to an empty slot is d(r_insert) = min_r {d(r_insert, r) + d(r)}. For a rule r whose feasible range reaches beyond τ_succ or below τ_pre, if r is moved directly to τ_succ or τ_pre, the shortest path from r to an empty slot is d_1(r) = 1; if r is instead moved to the slot of some rule r′ outside [τ_pre, τ_succ], the shortest path from r to an empty slot becomes d_2(r) = d(r′) + 1. As d(r′) ≥ 0, d_1(r) ≤ d_2(r), so r will be moved within the range [τ_pre, τ_succ]. Iteratively, every entry r to be moved stays within [τ_pre, τ_succ] in each moving step. Thus, the moving chain we find within [τ_pre, τ_succ] is globally the shortest.

Complexity analysis.
The moving overhead of an entry update depends on the length of the shortest moving chain. If c is the diameter of the dependency graph, i.e., the longest shortest path between any two nodes in the graph, the moving complexity of a single entry update is at most O(c), while the computation complexity is O(N · c).

Dynamic TCAM Update for Concurrent Rules
Multiple flows may arrive at a switch at almost the same time, or a switch may receive a batch of flow table configurations; both require updating multiple entries in one update operation. However, for the concurrent updating of multiple entries, different insertion plans may lead to diverse update overheads. For example, we insert R5 and R6 simultaneously in Figure 3. The two shortest moving chains for R5, i.e., [R1, R2] and [R4, R6, R3], both move two existing entries. The insertion plans in Figure 3b,c both use the shortest moving chains for R5 and R6, and the flow tables after updating both comply with the dependency graph. However, the overheads of the two moving plans differ: three existing entries are relocated in Figure 3b, while only two existing entries are moved in Figure 3c. The shortest moving chains of multiple inserted entries may interact with each other by containing common existing entries (e.g., R4 in the shortest moving chains of R5 and R6) or another inserted entry (e.g., R6 in R5's shortest moving chain [R4, R6, R3]). In Figure 3c, we move each entry in the moving chain [R4, R3] one slot downward to insert R5, while R6 further requires moving R4 downward to obtain another empty slot.
To find appropriate empty slots for rule insertion, we design Algorithm 3 to search for update plans for multiple entry updates. We adopt a divide-and-conquer approach to speed up the updating process. With Theorem 1, we are able to divide a flow table into multiple independent regions, and each region can be updated freely without interfering with the others.
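The region-dividing phase can be sketched as interval merging plus expansion (our simplified reading of Algorithm 3; the final re-merge of regions that overlap after widening is omitted here for brevity):

```python
# Hedged sketch: merge overlapping moving ranges [tau_pre(r), tau_succ(r)] of
# the concurrent insert rules into independent regions, then widen any region
# that lacks enough empty slots for the rules assigned to it.

def divide_regions(move_ranges, empty_slots, table_size):
    """move_ranges: list of (lo, hi), one per insert rule. empty_slots: set of
    empty slot indices. Returns a list of (lo, hi, n_rules) regions."""
    regions = []
    for lo, hi in sorted(move_ranges):
        if regions and lo <= regions[-1][1]:      # overlap: merge into last region
            plo, phi, n = regions.pop()
            regions.append((plo, max(phi, hi), n + 1))
        else:
            regions.append((lo, hi, 1))
    widened = []
    for lo, hi, n in regions:
        # Expand the region until it contains at least n empty slots.
        while sum(1 for s in empty_slots if lo <= s <= hi) < n:
            if lo > 0:
                lo -= 1
            elif hi < table_size - 1:
                hi += 1
            else:
                raise ValueError("flow table full")
        widened.append((lo, hi, n))
    return widened
```

Each resulting region then reserves enough empty slots for its own insert rules, so the per-region updates (computed with the single-entry chain search) cannot interfere with one another.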

[Algorithm 3 residue: Line 8 sets rgn.end = min(rgn.end, τ_succ(r)); Line 9 expands rgn until it contains enough empty slots to accommodate the rules in rgn.insert.] The computation cost grows with the number of concurrent entries k. As the time cost and power consumption of TCAM entry updates are much higher than the computation cost, we prefer to exchange computational cost for enhanced hardware efficiency.

Evaluation
To evaluate the effectiveness of our proposed algorithms, we conduct various experiments with different dependency complexities. As the dependencies among rules are constructed as directed acyclic graphs, we generate dependency graphs with different connectivities and shapes. We evaluate two kinds of dependency complexities by varying: (1) Dependency graph density: forwarding rules with more complex dependencies form a dense graph, while fewer dependencies result in a sparse graph. We adjust the densities of dependency graphs by setting the node degree d. By changing d, we can generate three kinds of graph densities: a sparse graph with a small d value (Sparse), a medium complex graph with a moderate d value (Medcplx), and a dense graph with a big d value (Dense). (2) Dependency graph shape: rules with longer dependency chains form a slim but long dependency graph, while rules with shorter dependency chains form a fat but short dependency graph. We adjust the shape of the dependency graph by controlling the ratio of the width to the length. During the graph construction, we put all the nodes with k predecessors in the longest dependency chain into the k-th layer, so that we obtain a dependency graph with l layers. We treat l as the length of the graph, and the maximum number of nodes w in each layer as the width of the graph. By varying l and w, we are able to generate three kinds of graph shapes: slim and long graphs with small w/l (<0.5) ratios (SimLong), medium-shaped graphs with middle w/l (0.5-1.5) ratios (Medium), and fat and short graphs with big w/l (>1.5) ratios (FatShort). We simulate a TCAM-based flow table with a capacity of 10,000 entries, and we generate different dependency graphs by adjusting the graph's density and shape.
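The layered graph generation described above might be sketched as follows; the sampling details and parameter names are our assumptions, not the paper's exact generator:

```python
import random

def layered_dag(l, w, d, seed=0):
    """Build a DAG with l layers of up to w nodes; each node in layer k >= 1
    depends on about d nodes drawn from earlier layers, so l controls chain
    length and w controls width (hence the w/l shape ratio)."""
    rng = random.Random(seed)
    layers = [[f"r{k}_{i}" for i in range(rng.randint(1, w))] for k in range(l)]
    edges = set()
    for k in range(1, l):
        for node in layers[k]:
            above = [n for layer in layers[:k] for n in layer]
            for target in rng.sample(above, min(d, len(above))):
                edges.add((node, target))  # node depends on an earlier-layer rule
    return layers, edges
```

Because every edge points from a later layer to an earlier one, the result is acyclic by construction; raising d densifies the graph, while the l-to-w ratio moves it between the SimLong and FatShort regimes.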
We first check the possibility of rescheduling existing rules under different flow table utilizations (low, med, high). Figure 4 shows that the rescheduling possibility increases with the flow table utilization, as more rules in the flow table may potentially introduce more reversed dependencies during the updates. For dependency graphs with different complexities, we find that sparse graphs have higher rescheduling possibilities than dense graphs, because dense graphs impose more, and more complex, restrictions on the dependency relationships among rules compared to sparse graphs, which are not easy to break during updates. In sparse dependency graphs, 20-30% of updates need to reschedule local rules, while the percentage for dense graphs is less than 10%. For dependency graphs with different shapes, the rescheduling possibility increases slightly as the shape of the dependency graph becomes fatter and shorter, from Figure 4a,b to Figure 4c. Entries in fat and short graphs have shorter dependency chains compared with slim and long graphs, where entries at high locations may have no relationships with those at lower locations, making entry insertions in fat and short graphs more likely to lead to reversed dependencies.

We compare our single-entry updating Algorithm 2 (Bidirect) with the existing single-direction moving approach (Sidirect), e.g., RuleTris [7] and FastRule [5], as shown in Section 7.1. Then, we evaluate our concurrent updating Algorithm 3 (denoted by H) and compare it with updating entries in random order (denoted by O), which is a widely used mechanism for multiple updates, as shown in Section 7.2.

A Single Entry Update Evaluation
For the single entry update in the flow table, we check the moving steps under different flow table utilizations, dependency graph shapes, and densities, respectively. Fewer moving steps indicate less update overhead, which is expected to reduce the TCAM maintenance cost and improve traffic forwarding efficiency.
Figure 5 shows the variation in moving steps as the flow table utilization changes under different dependency graph shapes. When the flow table utilization rises, the moving steps rapidly increase, as higher flow table utilization leaves less available storage space, so the flow table needs to move more existing rules to make space for new entries. The evaluation results show that our moving approach takes fewer moving steps compared with Sidirect for any dependency graph shape. In particular, under high flow table utilization, which means the flow table is almost full, our moving cost is 0.6-5.2% lower than Sidirect; this is the most common scenario in practice. Fat and short graphs tend to take fewer moving steps with shorter dependency chains, as Figure 5c shows, than slim and long graphs with longer dependency chains, as shown in Figure 5a,b.

Figure 6 shows how the moving steps vary with the dependency graph shape under different flow table utilizations. When the ratio of the width to the length increases, the moving steps decrease under any flow table utilization. The increasing width/length ratio indicates that the dependency graph is becoming fatter and shorter and contains more short dependency chains, and shorter chains take fewer moving steps to accommodate new entries. The moving steps increase as the flow table fills up, as seen in Figure 6a-c, in alignment with the results in Figure 5. Our moving approach consistently incurs no more cost than Sidirect, regardless of the graph shape; in particular, under high flow table utilization, our average cost is 0.7-4.9% lower than Sidirect.

Figure 7 shows how the moving steps change with the dependency complexity under different flow table utilizations. We find that the moving steps decrease when the dependency graph density rises. Dense graphs contain more dependency relationships among entries compared with sparse graphs, which means more moving opportunities. Thus, with dense graphs, there is a higher likelihood of selecting a shorter moving chain for an entry insertion. This explains why sparse graphs consume more moving steps than dense graphs in Figures 5 and 6. Similar to the results with different graph shapes, updates under high flow table utilization in Figure 7c take more moving steps than those under lower table utilization in Figure 7a,b. The advantage of our moving approach is more significant under high flow table utilization, where our average moving cost is 1-4.9% lower than Sidirect.

Dynamic Updating Evaluation
To examine the dynamic update effectiveness of our moving approach, we update 1-10 entries at once in the flow table. We compare our heuristic updating Algorithm 3 (denoted as H) with updating entries one by one in a random sequence (O), each combined with our minimum moving Algorithm 2 (B) and Sidirect (S), respectively.
We first check the updating cost with different numbers of concurrent rules. The moving steps increase rapidly when we insert more rules simultaneously in Figure 8, as more rules usually indicate more dependencies to resolve, which results in more moving steps. We then compare the performance of the different approaches when inserting 1-5 concurrent rules into the flow table. Figure 9 shows the moving steps under high flow table utilization with varying graph densities. The moving steps decrease as the width/length ratio increases in all three scenarios, consistent with the single-entry updating in Figure 6. The heuristic updating method with minimum moving (HB) consistently requires fewer steps than the other approaches; this is more evident with sparse dependency graphs. The average cost of our heuristic updating method can be 2.1-6.4% lower than updating one by one. Figure 10 shows the moving steps under high flow table utilization with different graph shapes. Similar to the single-entry updating in Figure 7, the moving steps decrease when the dependency density increases. Our heuristic updating method with minimum moving (HB) also achieves the lowest moving cost compared with the other updating approaches, about 1.8-6.3% lower than updating one by one. The updates on slim and long graphs, as seen in Figure 10a, tend to take more steps with their longer dependency chains than those on fatter and shorter graphs, as seen in Figure 10b,c.

Conclusions
Considering the high cost and limited storage space of TCAM, in this paper, we focus on reducing the updating cost of TCAM-based flow tables. We first formulate TCAM-based flow table maintenance as an NP-hard problem. To solve the problem, we propose a minimum moving algorithm for inserting a single entry and prove its correctness. We then extend it to a heuristic algorithm that dynamically updates concurrent entries. The evaluation results with various dependency graphs show that our updating algorithm achieves an approximately 6% cost reduction compared to existing approaches for each entry update, which means faster packet forwarding and lower energy consumption. As SDN introduces more innovations in network virtualization, such as network isolation and fine-grained flow control, more forwarding rules are generated within the data plane. The proposed approach helps handle a large number of forwarding rule updates with enhanced efficiency. Meanwhile, real-time networks have stricter low-latency requirements, especially in the automotive, medical, and industrial domains. The reduced update cost of TCAM-based switches is therefore all the more beneficial for speeding up real-time communication.

Figure 1.
Insert a new flow entry into the existing TCAM-based flow table with (a) R7 depending on R and R depending on R6. (b) Moves the upstream chain R6

Figure 2.
Reorder existing entries for an insert rule: (a) Insert R with priority 3 into the TCAM table.

Figure 4.
Reschedule existing rules with different dependency graph shapes: (a) Slim and long graph. (b) Medium graph. (c) Fat and short graph.

Figure 5.
Moving steps under different flow table utilizations: (a) Slim and long graph. (b) Medium graph. (c) Fat and short graph.

Figure 8.
Dynamic moving with different numbers of insert rules.

Figure 10.
Dynamic moving steps with different dependency densities: (a) Slim and long graph. (b) Medium graph. (c) Fat and short graph.
Algorithm 3 updates multiple concurrent rules in two phases.
1. Divide the flow table into independent regions. The flow table can be divided into several independent insertion regions for multiple entry updates, and each insertion region can update entries without interfering with the others. An insertion region is a series of slots [ta, tb] in the TCAM-based flow table. The entries to be inserted in this region are R([ta, tb]) = {r | ta ≤ τ(rpre) ≤ τ(r) ≤ τ(rsucc) ≤ tb}. Lines 1-20 divide the TCAM table into independent regions by checking the moving range [τpre(r), τsucc(r)] of each insert rule r. If the moving range of r overlaps with that of any other insert rule, they are merged into one region (Lines 4-8). Merging moving ranges may cause the number of available empty slots to be less than the number of insert rules; in this case, we expand the size of the region by extending it to an empty slot upward and downward, respectively (Lines 10-12). Finally, we merge the regions to ensure that there is no overlap among independent regions (Line 20).
2. Update entries within each region. For each region R([ta, tb]), as we have reserved enough empty slots during the region-dividing phase, rules can be inserted into the region without interfering with other regions. We use Algorithm 2 to calculate the shortest moving chain for each rule in R([ta, tb]) and update rules heuristically in increasing order of moving chain length (Line 22).
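To make the region-dividing phase concrete, here is a minimal Python sketch under simplifying assumptions (the function name `divide_regions`, the interval representation, and the alternating up/down expansion policy are illustrative, not the paper's exact Algorithm 3):

```python
def divide_regions(ranges, empty_slots, table_size):
    """Divide the flow table into independent insertion regions.
    `ranges` is a list of (lo, hi) moving ranges, one per insert rule;
    overlapping ranges are merged into one region, and each region is
    then expanded until it covers at least as many empty slots as it
    has rules. `empty_slots` is a sorted list of empty slot addresses.
    Returns a list of (lo, hi, n_rules) regions."""
    # Phase 1: merge overlapping moving ranges into regions.
    regions = []
    for lo, hi in sorted(ranges):
        if regions and lo <= regions[-1][1]:
            plo, phi, n = regions.pop()
            regions.append((plo, max(phi, hi), n + 1))
        else:
            regions.append((lo, hi, 1))

    def empties(a, b):
        return sum(1 for e in empty_slots if a <= e <= b)

    # Phase 2: expand regions lacking empty slots, alternating
    # between growing upward (lower addresses) and downward.
    fixed = []
    for lo, hi, n in regions:
        grow_up = True
        while empties(lo, hi) < n and (lo > 0 or hi < table_size - 1):
            if grow_up and lo > 0:
                lo -= 1
            elif hi < table_size - 1:
                hi += 1
            grow_up = not grow_up
        fixed.append((lo, hi, n))
    return fixed
```

A full implementation would add a final merge pass over the expanded regions (the counterpart of Line 20 of Algorithm 3), since expansion can reintroduce overlap between neighboring regions.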
Figure 7.
Moving steps with different dependency graph shapes: (a) Under low flow table utilization. (b) Under medium flow table utilization. (c) Under high flow table utilization.
Figure 6.
Moving steps with different dependency densities: (a) Low flow table utilization. (b) Medium flow table utilization. (c) High flow table utilization.