Entry Aggregation and Early Match Using Hidden Markov Model of Flow Table in SDN

The use of multiple flow tables (MFT) has significantly extended the flexibility and applicability of software-defined networking (SDN). However, the size of MFT is usually limited due to the use of expensive ternary content addressable memory (TCAM). Moreover, the pipeline mechanism of MFT causes long flow processing time. In this paper a novel approach called Agg-ExTable is proposed to efficiently manage the MFT. Here the flow entries in MFT are periodically aggregated by applying pruning and the Quine-McCluskey algorithm. Utilizing the memory space saved by the aggregation, a front-end ExTable is constructed, keeping popular flow entries for early match. Popular entries are decided by a Hidden Markov model based on the match frequency and match probability. Computer simulation reveals that the proposed scheme is able to save about 45% of the space of MFT, and efficiently decreases the flow processing time compared to the existing schemes.

of the flow. In SDN, the physical characteristics of the networking devices are concealed so that the control plane can provide unified management and services for high-level applications [7]. At the same time, the data forwarding devices are abstracted from the control layer to ensure mobility and reduce future investment in the devices.
Among the main issues with SDN such as switch designs [8,9], distributed controller platforms [10,11], resilient communication [12,13], and security [14,15], flow table management is one of the primary tasks directly influencing the performance. Since OpenFlow 1.3 [16,17], there have been several improvements including multiple flow tables (MFT). In SDN, the OpenFlow switches simply forward the flows, which are composed of data packets, through the MFT, and the operation is managed by OpenFlow controllers. Even though MFT plays an important role in SDN, two challenging issues exist: memory overhead and flow forwarding delay. The memory overhead is high due to the need for ternary content addressable memory (TCAM) and fast forwarding [18]. If the number of entries in the table is limited too much in order to reduce the overhead, the flow match rate will be very low. Meanwhile, the pipeline mechanism employed with MFT causes forwarding delay, as each flow has to go through the flow tables one after another, and the action corresponding to the matched entry is executed at the end of the pipeline. Various studies deal with these issues of MFT. For example, the Flow Table Reduction Scheme (FTRS) [18] aggregates the flow entries having the same action and destination address. In [19], the Multi-Stage OF (MsOF) switch model is proposed to save memory space and reduce forwarding time by deploying tables that each require less memory space. However, FTRS works well only in a local area network (LAN) environment, and the implementation of MsOF is relatively complicated.
In this paper we advocate a novel approach called Agg-ExTable which efficiently reduces the memory overhead and flow forwarding time of MFT. Here pruning and the Quine-McCluskey (QM) algorithm are periodically applied to aggregate the flow entries in each flow table, and a table called ExTable, which contains the flow entries of high expected match probability, is put in front of the pipeline. Its flow entries are dynamically determined using a Hidden Markov model constructed based on the number of transitions and match probability of the entries. The QM algorithm allows an efficient usage of TCAM by reducing the size and power consumption, while ExTable allows quick match and execution of the incoming flows. Computer simulation reveals that the proposed scheme substantially reduces the memory size and flow processing time compared to the TCAM-based size reduction scheme [20] and the scheme merging and migrating the flow entries based on directed acyclic graph (DAG) [21]. The improvement becomes more significant as the flow arrival rate rises. The main contributions of the paper are summarized as follows:
• The flow entry aggregation problem is simplified by transforming it to a logic minimization problem, which is effectively solved by the QM algorithm.
• The ExTable scheme substantially reduces the flow processing time by placing a table containing the entries of high match probability up front.
• The match frequency and match probability of the entries are handled with the Hidden Markov model to decide the likelihood of the match.
The remainder of the paper is organized as follows. Section 2 provides an overview of TCAM used for the implementation of MFT, Hidden Markov model, and MFT-based switching. In Section 3 the proposed Agg-ExTable approach is presented, along with the analytical model of the flow processing time based on queueing theory. Section 4 is for the performance evaluation of the proposed schemes. Finally, Section 5 concludes the paper and outlines the future research direction.

TCAM
TCAM [20,22] is usually employed to speed up the table lookup operation. However, its size is always limited due to its high cost. TCAM allows a third matching state of 'X' or 'don't care' for one or more bits in the stored entry, which adds flexibility to the search. For example, a TCAM may have an entry of '101XX', which matches '10100', '10101', '10110', or '10111'. The additional state is typically implemented by adding a mask bit to the corresponding memory cell. If a bit of the mask is '0', the corresponding bit of the entry is 'don't care'. In Table 1, for example, if the first five bits of a flow are '10011' or '10111', it matches E1.
Sensors 2019, 19, 2341

Table 1 (columns: No., Entry/Mask, Action).
Another important feature of TCAM is that the '0's and '1's in the mask are not required to be contiguous. For E1 in Table 1, the third bit of the entry is 'don't care', and thus a flow whose first three bits are '100' or '101' matches this entry. The proposed aggregation method takes advantage of this feature of TCAM.
In reference [20], two techniques, pruning and ESPRESSO-II based mask extension, are proposed to compact traditional routing table stored in TCAM. They allow a smaller TCAM reducing the size and power consumption.
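The ternary match described above can be sketched in a few lines of Python. This is only an illustration: the function name `tcam_match` is hypothetical, and the entry/mask pair used for E1 is inferred from the text (third bit 'don't care', matching '10011' and '10111'), since the contents of Table 1 are not reproduced here.

```python
def tcam_match(key: str, entry: str, mask: str) -> bool:
    """Return True if `key` matches `entry` under `mask`.

    A mask bit of '0' marks the corresponding entry bit as "don't care";
    a mask bit of '1' requires the key bit to equal the entry bit.
    """
    if not (len(key) == len(entry) == len(mask)):
        raise ValueError("key, entry and mask must have the same length")
    return all(m == "0" or k == e for k, e, m in zip(key, entry, mask))


# Assumed encoding of E1: the mask has a '0' in the middle, so the
# '1's of the mask are not contiguous.
E1_ENTRY, E1_MASK = "10011", "11011"

for key in ("10011", "10111", "10001"):
    print(key, tcam_match(key, E1_ENTRY, E1_MASK))
```

Because the mask is checked bit by bit, nothing in the sketch requires the 'don't care' positions to form a suffix, which is the property the aggregation method relies on.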

Quine-McCluskey (QM) Algorithm
The Quine-McCluskey (QM) algorithm [23][24][25] was developed for simplifying logical functions. The QM algorithm is well suited to implementation because of its tabular form, and it also provides a deterministic method for checking whether a logical function is minimal. Four definitions are mainly used with the QM algorithm:
• Minterm: an expression in which every variable of the logical function appears once.
• Implicant: an aggregation of minterms in the logical function.
• Prime implicant: an implicant that cannot be covered by a more simplified implicant.
• Essential prime implicant: a prime implicant that covers an output of the logical function which no combination of other prime implicants is able to cover.
The procedure of the QM algorithm consists of two steps:
• Step 1: Find all the prime implicants of the logical function.
• Step 2: Find the essential prime implicants and other necessary prime implicants to cover the logical function.
In the proposed scheme the QM algorithm is employed for mask extension which is transformed to be a logic minimization problem.
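As an illustration of Step 1, the following minimal sketch repeatedly combines terms that differ in exactly one bit, replacing that bit with a dash, and collects the terms that can no longer be combined. The function names and the sample minterms are hypothetical and are not taken from the paper.

```python
from itertools import combinations


def merge(a: str, b: str):
    """Merge two terms differing in exactly one fixed bit, else return None."""
    diff = [i for i, (x, y) in enumerate(zip(a, b)) if x != y]
    if len(diff) == 1 and "-" not in (a[diff[0]], b[diff[0]]):
        i = diff[0]
        return a[:i] + "-" + a[i + 1:]
    return None


def prime_implicants(minterms: set) -> set:
    """Step 1 of QM: return all prime implicants of the given minterms."""
    terms, primes = set(minterms), set()
    while terms:
        merged, used = set(), set()
        for a, b in combinations(sorted(terms), 2):
            m = merge(a, b)
            if m is not None:
                merged.add(m)
                used.update((a, b))
        primes |= terms - used   # terms that could not be combined are prime
        terms = merged           # continue with the next stage
    return primes


print(sorted(prime_implicants({"000", "001", "011", "111"})))
```

Each pass of the loop corresponds to one stage of the minterm table; a dash plays the role of a 'don't care' bit, so a prime implicant maps directly to a TCAM entry with a '0' mask bit at each dash position.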

Hidden Markov Model (HMM)
HMM [26][27][28][29] is used in various fields such as language recognition, reinforcement learning, and bioinformatics. An HMM is a finite discrete-time Markov model in which the system is assumed to be a Markov process with hidden states. It can be defined as a triple $\lambda = (\pi, A, B)$, where $\pi$ is the initial probability distribution, $A$ is the transition probability matrix, and $B$ is a sequence of observation likelihoods (emission probabilities). Specifically, an HMM is defined by the following components [30]:
• $Q = \{q_1, q_2, \ldots, q_N\}$ is a set of states, where $N$ is the number of hidden states and $q_t$ is the hidden state at time $t$.
• $A = \{a_{11}, a_{12}, \ldots, a_{N1}, \ldots, a_{NN}\}$ is an $N \times N$ transition probability matrix. $a_{ij}$ ($1 \le i, j \le N$) represents the probability of changing from state_i to state_j. Each row of $A$ sums to one:
$$\sum_{j=1}^{N} a_{ij} = 1, \quad \forall i. \tag{3}$$
• $B$ is the sequence of observation likelihoods, each representing the probability of an observation $o_t$ being generated from state_j.
• $\pi = \{\pi_1, \pi_2, \ldots, \pi_N\}$ is the initial probability distribution over hidden states. $\pi_i$ is the probability that the Markov chain will start in state_i.
A generic HMM is illustrated in Figure 1, where $H_i$ ($i = 1, 2, 3, \ldots, T$) is the set of hidden states and $O_i$ ($i = 1, 2, 3, \ldots, T$) is the set of observations. The Markov process, located above the observable states, is determined by the current state and $A$. Only $O_i$ can be observed, and it is determined by the hidden states of the Markov process and $B$.

MFT
The pipeline processing of OpenFlow [16,17] is depicted in Figure 2. An OpenFlow switch is required to have one or more flow tables; a single flow table is used only for relatively simple networks. The flow tables of an OpenFlow switch are sequentially numbered, starting from 0. The pipeline processing always starts from the first flow table, where the packet is matched against its flow entries. Other flow tables may be used depending on the outcome of the match. The pipeline processing occurs only in the forward direction. If a flow entry matches, the instruction set included in that entry is executed at the Action Execution Unit (AEU) located at the end of the pipeline.
The existing flow entry aggregation techniques can be classified into two types according to where the aggregation takes place: in the packet classifier or in the flow table of the OpenFlow switch. TCAM Razor [31] is a systematic flow aggregation algorithm which uses a decision diagram to minimize the TCAM rules required for packet classification. However, its execution time grows rapidly as the number of flow entries increases. The Fast Flow Table Aggregation (FFTA) [32] and FTRS also support entry aggregation in SDN. FFTA is an offline aggregation scheme based on bit weaving, which applies ORTC (Optimal Routing Table Constructor) after cutting the entries using a binary search tree. FTRS aggregates the flow entries according to the destination IP address, instead of the match field. It achieves a good compression ratio with different topologies. However, FFTA causes coarse traffic statistics due to the mixture of all the entries, while FTRS has a high probability of flow table overflow.
The schemes of [19,33] aim to implement effective forwarding with MFT. In reference [21], an algorithm called migrating flow rules (MILE) was proposed which merges and migrates the flow entries to reduce the number of flow table lookup operations by employing directed acyclic graph (DAG). The dependencies of the flow entries are handled using DAG, where interdependent entries are grouped and migrated as a whole. Using the 2Q LRU replacement algorithm, the recently accessed entries are replaced at Flow Table_0 to be matched early. With the Multi-Stage OF (MsOF) [19] switch model, more tables each requiring less memory space are deployed. Here a processor is implemented in each flow table, and multiple pipeline operations occur in parallel so that several flows can be matched at the same time. The spatial and temporal complexity were examined using queuing theory. The implementation of MsOF is relatively complicated and needs large networking resources.
In order to improve the efficiency of the MFT, this paper focuses on both the TCAM implementation and the flow match probability. The proposed scheme aggregates the entries using pruning and the QM algorithm to minimize the TCAM space, while the saved memory space is used to store popular entries. With the HMM constructed based on the match frequency and match probability of the entries, the presumably popular flow entries are selected and put in the ExTable located in front of the flow tables. Pruning and the QM algorithm aim to maximize the efficiency of the TCAM by reducing its size and power consumption. Maintaining a high match rate with ExTable substantially decreases the overall flow processing time, via early match and execution of the incoming flows. The proposed scheme is presented in the next section.


The Structure
The proposed Agg-ExTable scheme allows entry aggregation and fast pipeline operation using an ExTable holding popular flows. It works in two phases. In Phase 1, the proposed entry aggregation algorithm is executed periodically to reduce the size of TCAM. Then in Phase 2, the saved memory space is utilized to set up the ExTable in front of the pipeline, keeping popular entries. Here the HMM is used to decide the popularity of the flow entry. The structure of the proposed MFT pipeline of an OF switch is illustrated in Figure 3. Observe that there exist two paths: an express path for popular flows and a regular path for nonpopular flows. When a flow arrives at a switch, it is first parsed for matching against the ExTable. Upon a match, the actions of the flow entry are sent to the AEU. Otherwise, the match operation continues with the other flow tables.
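The two paths can be sketched as follows. This is a deliberately simplified model rather than the switch implementation: tables are assumed to be exact-match dictionaries (real entries use ternary masks), the first matching table decides the action, and the name `process_flow` is hypothetical.

```python
from typing import Optional


def process_flow(flow_key: str,
                 ex_table: dict,
                 flow_tables: list) -> tuple:
    """Return (action, number_of_table_lookups) for one flow.

    The ExTable is consulted first (express path). On a miss, the regular
    flow tables are visited in order, forward only (regular path).
    """
    lookups = 1
    if flow_key in ex_table:
        return ex_table[flow_key], lookups       # early match, go to AEU
    for table in flow_tables:
        lookups += 1
        if flow_key in table:
            return table[flow_key], lookups
    action: Optional[str] = None                 # table miss
    return action, lookups


ex_table = {"flowA": "output:1"}
flow_tables = [{"flowB": "output:2"}, {"flowC": "output:3"}]
print(process_flow("flowA", ex_table, flow_tables))
print(process_flow("flowC", ex_table, flow_tables))
```

The lookup count makes the trade-off visible: a popular flow is resolved after one lookup, whereas a flow that misses the ExTable pays one extra lookup on top of the regular pipeline, which is why the ExTable match rate matters.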

Aggregation of Entry
The number of possible flow paths in a flow table is typically small because only a limited number of interface cards can fit into the switch chassis. In contrast, the number of forwarding entries is quite large, in the range of several thousands. Considering this disparity, a scheme reducing the size of flow table is developed which involves two techniques presented below.

Pruning of Redundant Entries
Pruning is a technique eliminating some redundant entries [20]. To facilitate the discussion, some terms are defined as follows. Notice that the match fields of a flow entry may have different lengths.

• Assume that entry_P is the parent of entry_Q, Lp is the length of entry_P, and P(i) is the ith bit of entry_P. Then the following three conditions hold:
• Entry_P is identical to entry_Q if the same actions are executed for the matched packet.
If P is identical to Q, Q is a redundant flow entry. Assume that Q matches a flow. Then the flow will match P as well by the definition. If Q is removed from the flow table, P becomes the matched entry. As P and Q have the same actions, removing Q makes no difference. Note that the technique is general enough that it can be used with any entry lookup algorithm regardless of the type of the flow table.
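A minimal sketch of pruning is given below. It assumes, for illustration only, that entries are binary prefix strings and that the parent of an entry is its longest proper prefix present in the table; the three conditions defining the parent are not reproduced in the text above, so this definition and the name `prune` are assumptions.

```python
def prune(entries: dict) -> dict:
    """Remove every entry whose parent carries the same action.

    `entries` maps a binary prefix string to an action. The parent of an
    entry is assumed to be its longest proper prefix found in `entries`.
    """
    def parent(q: str):
        for length in range(len(q) - 1, -1, -1):
            if q[:length] in entries:
                return q[:length]
        return None

    kept = {}
    for q, action in entries.items():
        p = parent(q)
        # Q is redundant only if a parent exists and has the same action.
        if p is None or entries[p] != action:
            kept[q] = action
    return kept


table = {"10": "A", "101": "A", "1011": "B", "10110": "B", "11": "A"}
print(prune(table))
```

Parents are looked up in the original table, not in the pruned one. This is safe because a removed parent always shares its action with its own parent, so the child would still be redundant after the removal.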

QM-Based Mask Extension
The second technique exploits the flexibility offered by the TCAM hardware. TCAM allows an arbitrary mask; in other words, the 1s and 0s in the mask do not need to be contiguous. Table 2 shows an example of mask extension. E1 and E2 both have the same action of 'Forward to 1'. It is possible to combine E1 and E2 into a single entry with the prefix of 1100 and mask of 1101. The 0 at bit 3 in the mask allows E1 and E2 to be combined into the same entry. The aggregated version of the original flow table with the mask extension technique is shown on the right-hand side. The table size has been reduced from 5 to 3.
Note that the mask extension is equivalent to the logic minimization problem [20]. The problem is: 'given a set of entries with the same action, find a set of minimal covers.' Such a logic minimization problem [33] is a non-deterministic polynomial (NP) complete problem, and there exist mainly three kinds of methods used for its solution.

• Karnaugh mapping [34]: It is simple, but it becomes very complex when the number of variables is larger than six.
• Quine-McCluskey (QM) algorithm [23][24][25]: It is functionally identical to Karnaugh mapping, but the tabular form makes it more efficient to use in a computer algorithm, supporting any number of variables. It also provides a deterministic method for checking whether the logical function is minimal.
• Espresso logic minimizer [35,36]: It can produce a solution fast but cannot guarantee an optimal result.
Here the QM algorithm is employed for mask extension. Algorithm 1 shows the proposed entry aggregation scheme with the QM algorithm-based mask extension. Here, E(l,a) is the set of original entries having the same length of l and action of a, and A(l,a) is the result of the QM algorithm. An example of the proposed mask extension scheme is shown below. In Table 3, there are 11 entries with different actions. After selecting the entries having the same action, the entry aggregation problem is simplified into the following minimization function: 3,4,5,6,7,8,9).
Step 1: find prime implicants (Pi). Here all minterms are placed in the minterm table as shown in Table 4, and Stage I is to combine the minterms. If two terms vary by only a single bit, that bit is replaced with a dash (-). Stage II is the result of Stage I, and Stage III is the result of combining the minterms in Stage II. '/' indicates if the entry is combined in the next stage. According to Table 4, all the prime implicants are shown as follows:
Step 2: find essential prime implicants (Pe). From Table 4, none of the minterms can be combined any further. At this point, the table of essential prime implicants is constructed as in Table 5. In order to find the essential prime implicants, each column is checked to see whether it contains only one '*'. If a column has only one '*', the minterm can be covered by only one prime implicant, and this prime implicant is therefore essential. According to Table 5, P1 is the only essential prime implicant.
As P2 can be covered by P1 and P3, and the same holds for P3, P4, P5, P6, and P7, they are not essential. In this example, the essential prime implicant cannot cover all the minterms (only E0, E3, E5, and E9 are covered). Therefore, other prime implicants are combined with P1 to get the final result as: At last the final Table 6 is obtained as follows: The number of entries is reduced from 11 to 6 by the aggregation, and the compression ratio is 6/11 = 0.545.
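Step 2 can be sketched in the same spirit. The code below finds the essential prime implicants from the coverage chart and then adds further prime implicants with a greedy heuristic until every minterm is covered. The greedy completion is an assumption made for brevity and does not guarantee a minimal cover; the function names and the sample data are hypothetical and unrelated to Tables 3 to 6.

```python
def covers(implicant: str, minterm: str) -> bool:
    """True if the implicant (with '-' as don't care) covers the minterm."""
    return all(i == "-" or i == m for i, m in zip(implicant, minterm))


def select_cover(primes, minterms):
    """Return (essential, chosen) sets of prime implicants.

    Assumes every minterm is covered by at least one prime implicant.
    """
    primes = set(primes)
    chart = {m: sorted(p for p in primes if covers(p, m)) for m in minterms}
    # A column with a single mark identifies an essential prime implicant.
    essential = {ps[0] for ps in chart.values() if len(ps) == 1}
    chosen = set(essential)
    uncovered = {m for m in minterms if not any(covers(p, m) for p in chosen)}
    while uncovered:
        # Greedy step: take the implicant covering most remaining minterms.
        best = max(sorted(primes - chosen),
                   key=lambda p: sum(covers(p, m) for m in uncovered))
        chosen.add(best)
        uncovered = {m for m in uncovered if not covers(best, m)}
    return essential, chosen


def to_entry_mask(implicant: str) -> tuple:
    """Convert an implicant into a TCAM (entry, mask) pair."""
    entry = implicant.replace("-", "0")
    mask = "".join("0" if c == "-" else "1" for c in implicant)
    return entry, mask


minterms = {"000", "001", "011", "111"}
essential, chosen = select_cover({"00-", "0-1", "-11"}, minterms)
print(sorted(essential), sorted(chosen))
print(to_entry_mask("11-0"))
```

The last call reproduces the mask extension example from the text: the implicant 11-0 corresponds to the prefix 1100 with mask 1101. The compression ratio is then simply the number of chosen implicants divided by the number of original entries.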

Hidden Markov Model-Based Prediction
The goal of the proposed scheme based on HMM is to dynamically predict the popularity of the flow entries as accurately as possible and update the ExTable accordingly. After the flow entries are aggregated, the popularity of the flow entries is estimated periodically, and the entries deemed to be popular are moved to the ExTable. Note that the size of the ExTable, $n_p$, is smaller than $(1 - C) \cdot N_{MFT}$, where $C$ and $N_{MFT}$ are the TCAM compression ratio and the size of the entire MFT, respectively.
The match frequency of an entry indicates its popularity. For this, a counter, $M$, is associated with each flow entry, and it is activated when the entry is installed in the flow table. $M$ is the number of matches before the prediction occurs. Note that $M$ may not be the only indicator of the popularity, and thus HMM is employed to estimate the probability of the flow entries being matched in the near future.
The interarrival time of the flows is assumed to follow an exponential distribution, and therefore the number of arrivals can be modeled using the Poisson distribution. The probability of a flow arriving in a given time interval of $\Delta t$ is predicted as follows. Assume that a flow arrival can occur at any time. The probability of $k$ arrivals in $\Delta t$ is given by:
$$P(k, \Delta t) = \frac{(\lambda \Delta t)^{k}}{k!} e^{-\lambda \Delta t},$$
where $\lambda$ in this context is the flow arrival rate. The probability of at least one arrival in the interval $\Delta t$ is given as:
$$P(k \ge 1, \Delta t) = 1 - e^{-\lambda \Delta t}.$$
The probability of a flow arriving in the next interval is computed as the mean value of the entire period from the beginning. It is:
HMM is effective for predicting the probability of an observed sequence with the given triple $\lambda = (\pi, A, B)$. Let $H = \{H_0, H_1, \ldots, H_n\}$ be a set of hidden states, where $H_i$ is defined as the number of time segments ($\Delta t$ seconds per segment) for which a flow entry has not been matched since the initial state. For example, if there was no match for the last $3\Delta t$ seconds, $H_3 = 3\Delta t$. If there was a match, $H_3 = 0$. Note that the hard timeout period ($T_H$) is preset, and an entry is forced to be evicted if no match occurs during $T_H$. Therefore, there will be $n$ ($= T_H/\Delta t$) segments before an entry is finally evicted, and $H_n$ is the last state. $O = \{O_0, O_1, \ldots, O_n\}$ is a set of observable states of any entry, where $O_i$ indicates whether the entry is matched in the $i$th segment. It has two values: '1' for a successful match and '0' for no match. Figure 4 shows the structure of the HMM of the proposed scheme.
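The relation between the per-segment match history, the hidden states, and the observations can be illustrated with a short sketch. It assumes that the hidden state is counted in whole segments (so a value of 3 stands for $3\Delta t$) and that an entry is evicted as soon as the idle count reaches $n$; the function name `encode_states` is hypothetical.

```python
def encode_states(matched: list, n: int) -> tuple:
    """Derive hidden and observable states from a per-segment match history.

    matched[i] is True if the entry was matched during segment i.
    n is the number of segments in the hard timeout (T_H / delta_t).
    Returns (hidden, observed, evicted_at), where evicted_at is the index
    of the segment at which the entry is evicted, or None.
    """
    hidden, observed = [], []
    idle = 0
    for i, hit in enumerate(matched):
        idle = 0 if hit else idle + 1   # a match resets the idle count
        hidden.append(idle)
        observed.append(1 if hit else 0)
        if idle >= n:                   # no match during the whole timeout
            return hidden, observed, i
    return hidden, observed, None


print(encode_states([True, False, False, True, False], n=3))
print(encode_states([False, False, False, True], n=3))
```

In the first call the entry survives because a match arrives before the idle count reaches 3; in the second call it is evicted at the third segment, so the later match is never seen.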
With the HMM, the probability of an observed sequence is found with the given parameters, $A$, $B$, and $\pi$. For a flow entry, there exist $N$ ($= n + 1$) hidden states in its lifetime. Assume that there exist $m$ time segments before the prediction occurs. Then the $(n - m) \times (n - m)$ hidden state transition probability matrix, $A$, and the $(n - m) \times 2$ emission probability matrix, $B$, are obtained as follows: In order to compute the likelihood, $P(O|\lambda)$, of $O = \{O_0, O_1, \ldots, O_{(n-m)}\}$, the forward algorithm is adopted. Note that the probability of the observation sequence is obtained in which the value of $O_i$ is 1 and the predicted last observable state is $O_{(n-m)}$. Then $P(O|\lambda)$ is calculated as follows.


1. Initialization. Each cell of the forward algorithm, $\alpha_t(j)$, represents the probability of being in hidden state $H_j$ after checking the first $t$ observations with the given $\lambda$. It expresses the probability as:
$$\alpha_t(j) = P(O_0, O_1, \cdots, O_t, H_t = q_j \mid \lambda).$$
Here $H_t = q_j$ denotes that the $t$th hidden state in the sequence is $q_j$. Then the initial probability is calculated as follows: If one or more matches occur during $m$ segments, the following is obtained according to Equation (12) and the sum of $\pi$ is 1. If no match occurs, then:
2. Recursion. For the hidden state sequence, $H$, and the observation sequence, $O$, the likelihood of $O$ is estimated as: While introducing $\pi$ and $A$, it follows: where $q_0$ is the initial state. Then the joint probability of $H$ and $O$ is: Therefore, the total probability of the observations can be calculated by summing up all possible hidden state sequences: For each given state $q_j$ at time $t$, the probability $\alpha_t(j)$ is estimated as: where $0 \le j \le (n - m)$ and $1 \le t \le n$.

3.
Termination. According to Equations (23) and (24), the probability of O is estimated as: Through the forward algorithm, the P(O|λ) of each observed state can be computed. In order to decide the popularity of each entry, the value of O i is set to 1 to record the probability of successful match. Large P(O|λ) means that the entry has high match probability. The number of popular flow entries selected in each flow table is denoted as k. Periodic ExTable update occurs in every ∆T. Here the popularity, ω, is decided based on the match frequency, M, and match probability, P(O|λ), as follows: After calculating the popularity, the flow entries of k largest popularity are moved to the ExTable. The number of flow entries in ExTable is n t ·k if there exist n t flow tables (n t ·k ≤ n p ). If there exist several flow entries of the same value, the one of the longest remaining life time is selected. The proposed periodic entry selection scheme is depicted in Algorithm 2. for j from 0 to n f −1 10: input O, λ = (π,A,B) 11: initial α 0 (j) by Equation (17), π by Equations (18) and (19)
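The three forward-algorithm steps and the entry selection described above can be sketched as follows. This is only an illustrative sketch: the recursion is written in the standard textbook form, whose indexing may differ from Equations (17) to (24), and the names `forward_likelihood`, `select_popular`, and `Entry` are hypothetical. Since the popularity formula is not reproduced here, the code assumes ω = M · P(O|λ) purely for illustration; the actual combination of M and P(O|λ) may differ.

```python
from typing import List, Sequence, Tuple


def forward_likelihood(pi: Sequence[float],
                       A: Sequence[Sequence[float]],
                       B: Sequence[Sequence[float]],
                       obs: Sequence[int]) -> float:
    """Return P(O | lambda) for lambda = (pi, A, B) with the forward algorithm.

    B[j][o] is the probability of emitting observation o (0 = no match,
    1 = match) in hidden state j, so B has two columns.
    """
    n_states = len(pi)
    # Initialization: alpha_0(j) = pi_j * b_j(O_0)
    alpha = [pi[j] * B[j][obs[0]] for j in range(n_states)]
    # Recursion: alpha_t(j) = sum_i alpha_{t-1}(i) * a_ij * b_j(O_t)
    for o in obs[1:]:
        alpha = [
            sum(alpha[i] * A[i][j] for i in range(n_states)) * B[j][o]
            for j in range(n_states)
        ]
    # Termination: P(O | lambda) is the sum of the last column of alphas
    return sum(alpha)


# Hypothetical record: (entry_id, match count M, P(O|lambda), remaining lifetime)
Entry = Tuple[str, int, float, float]


def select_popular(entries: List[Entry], k: int) -> List[str]:
    """Pick the k most popular entries of one flow table.

    ASSUMPTION: popularity is taken as omega = M * P(O|lambda). Ties are
    broken by the longest remaining lifetime, as described in the text.
    """
    ranked = sorted(entries, key=lambda e: (e[1] * e[2], e[3]), reverse=True)
    return [e[0] for e in ranked[:k]]
```

Calling `select_popular` once per flow table with the same k yields the n_t·k entries that would populate the ExTable in one update period.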

Flow Processing Time
The queueing model [19,37] of the proposed ExTable scheme is shown in Figure 5, where each of the nodes is considered as an M/M/1 queue. Table 7 lists the variables used in the model.
According to Little's Law [37], the flow processing time in the system can be calculated as T = N/λ, where N is the average number of flows in the system and λ is the arrival rate of the flows. To obtain the flow processing time of the proposed scheme, T_F, the average number of flows in the system, N_F, first needs to be obtained. In the following formulas, R_m is the match rate of the ExTable, and ρ_p, ρ_f, and ρ_e are the utilizations of the ExTable, flow table, and AEU, respectively. Note that there exists a direct path from each flow table to the AEU. The rates of sending packets directly to the AEU from the flow tables are {Rd_0, Rd_1, ..., Rd_(n_t−2)}, where n_t ≥ 2.
In order to calculate N_F, the average numbers of flows in the ExTable, flow table 0, flow table 1, flow table (n_t − 1), and the AEU, denoted N_p, N_f0, N_f1, N_f(n_t−1), and N_e, should be estimated. They are calculated as follows: The average number of flows in the system, N_F, is as follows: Note that there exist one ExTable, n_t flow tables, and one AEU. As a result, the flow processing time, T_F, is obtained as follows: The queueing model of the flow processing time for the ExTable is used later in the computer simulation. The proposed scheme is simulated and compared with the existing schemes in the following section.
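The way Little's Law combines the per-node M/M/1 results can be illustrated with a small sketch. It uses the standard M/M/1 result N = ρ/(1 − ρ) for each node and T = N/λ for the whole system. The routing is an assumption made for this sketch rather than the paper's exact expressions: every flow visits the ExTable, the unmatched fraction (1 − R_m) enters flow table 0, each flow table i forwards a fraction (1 − Rd_i) to the next table, and every flow finally reaches the AEU. The function names are hypothetical.

```python
from typing import Sequence


def mm1_mean_in_system(arrival_rate: float, service_rate: float) -> float:
    """Mean number of flows in an M/M/1 node: N = rho / (1 - rho)."""
    rho = arrival_rate / service_rate
    if not 0.0 <= rho < 1.0:
        raise ValueError("unstable queue: utilization must be in [0, 1)")
    return rho / (1.0 - rho)


def flow_processing_time(lam: float,
                         match_rate: float,
                         direct_rates: Sequence[float],
                         mu_p: float,
                         mu_f: float,
                         mu_e: float) -> float:
    """Estimate T_F = N_F / lam under the routing assumed in this sketch.

    direct_rates holds Rd_0 .. Rd_(n_t-2), so there are
    len(direct_rates) + 1 flow tables.
    """
    # ExTable: every incoming flow is looked up here first.
    n_total = mm1_mean_in_system(lam, mu_p)
    # Flows that miss the ExTable enter the pipeline at flow table 0.
    table_rate = lam * (1.0 - match_rate)
    n_tables = len(direct_rates) + 1
    for i in range(n_tables):
        n_total += mm1_mean_in_system(table_rate, mu_f)
        if i < n_tables - 1:
            # A fraction Rd_i leaves the pipeline for the AEU directly.
            table_rate *= 1.0 - direct_rates[i]
    # AEU: assumed to serve every flow, whichever path it took.
    n_total += mm1_mean_in_system(lam, mu_e)
    return n_total / lam
```

Under these assumptions a higher ExTable match rate lowers the load on every flow table, which is the mechanism behind the shorter processing time reported for the proposed scheme.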

Performance Evaluation
In this section computer simulation is conducted to evaluate the TCAM compression ratio, prediction accuracy, and match rate of the proposed approach.

Simulation Environment
The simulation is conducted on a 3.2 GHz Intel Core i5 PC with 8 GB RAM, using Matlab R2014a. The flows used in the simulation are generated following an exponential distribution with λ = 1. A virtual SDN environment is built with a Floodlight controller, Open vSwitch, and end nodes emulated by Mininet. Here the Floodlight controller is linked to an Open vSwitch, and two end nodes are connected to the switch. The performance of the proposed scheme is compared with the original MFT approach and existing schemes [20,21].
For testing the proposed entry aggregation scheme, eight sets of randomly generated flow entries are used, with the number of entries varied from 100 to 800. For obtaining the flow processing time for MFT and ExTable, their queueing models are used in the simulation. The number of flow entries in a flow table is set to 20, while the size of the ExTable is n_t·k. The service rates of the ExTable, flow table, and AEU are set to 1.67, 1.67, and 10 [19,21], respectively. Tables 8 and 9 list the parameter values and the factors used in the simulation, respectively. The match rate and prediction accuracy of the proposed scheme are examined with various values of the operation parameters, and the flow processing time is compared with the existing schemes. Note that a smaller processing time implies a higher match rate and prediction accuracy.
The simulation is run 1000 times to achieve dependable results. The accuracy of the proposed HMM-based prediction is then calculated as:

$$\text{Accuracy} = \frac{N_{Success}}{N_{Experiment}}$$

Here, N_Success indicates the number of selected flow entries actually having the largest number of matches, and N_Experiment is the total number of simulations. The match rate of the proposed ExTable scheme is obtained as follows:

$$\text{Match rate} = \frac{N_{Match}}{N_{Flows}}$$

Here, N_Match is the number of incoming flows matching the ExTable, while N_Flows is the total number of incoming flows.

Simulation Results
Figure 6 shows the average compression ratios with eight different numbers of flow entries. In order to achieve dependable results, the simulation is run 1000 times for each number of randomly generated flow entries. Here the compression ratio is the ratio of the number of entries after reduction to that of the original entries [8,32]. Therefore, a lower compression ratio means that more entries are aggregated. The pruning alone reduces the table size by almost 25%. The proposed scheme (pruning + QM) shows the best compression ratio among the four schemes compared.
Figure 6. The compression ratios of four different schemes.
(20,70). Notice from the figure that the proposed scheme shows consistently higher accuracy, while the accuracy of both schemes increases as the number of flow entries grows. Note that high match frequency may not be the only indicator of the popularity of an entry. Even though an entry has low match frequency, it could be a popular one if its match probability is high.

In Figure 9, the match rate of popular flow entries is obtained with different ∆T of 20, 40, and 60. According to the result of Figure 6, the space saving due to the compression is about 45%. Since the ExTable has to fit in the saved space, n_t·k ≤ 0.45·20·n_t, which gives k ≤ 9; n_t and k are thus set to 5 and 9, respectively. Observe from the figure that the match rate decreases as ∆T increases, which demonstrates that the ExTable needs to be updated as frequently as possible. The proposed scheme achieves a match rate of nearly 68% when ∆T is 20. As ∆T becomes large, some entries may not be popular any more. There exists a trade-off between the match rate and update cost.
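The sizing rule n_t·k ≤ 0.45·20·n_t used above can be checked with a few lines of arithmetic. The sketch below is illustrative only; the helper names are hypothetical, and the saving is passed as an integer percentage so that the bound is computed without floating-point rounding.

```python
def max_popular_per_table(saved_percent: int, entries_per_table: int) -> int:
    """Largest k satisfying n_t * k <= (saved_percent / 100) * entries_per_table * n_t.

    n_t cancels on both sides, so the bound depends only on the per-table saving.
    """
    return (saved_percent * entries_per_table) // 100


def extable_size(n_tables: int, k: int) -> int:
    """Number of ExTable entries when k entries are taken from each flow table."""
    return n_tables * k


k_max = max_popular_per_table(45, 20)  # 45% saving, 20 entries per flow table
size = extable_size(5, k_max)          # n_t = 5, as in the Figure 9 setting
```

With a 45% saving and 20 entries per flow table, the bound is k ≤ 9, so the setting n_t = 5, k = 9 uses the saved space completely.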
Note that k can be varied from 1 to 9, each allowing a different compression ratio. Here n_t is set to 5 and ∆T to 40, while k is varied from 3 to 9. Notice from Figure 10 that the match rate increases as k grows, as expected. A proper number of popular entries needs to be selected which allows a high match rate while requiring reasonable operation cost.
In Figure 11, the match rate is obtained with different numbers of flow tables, n_t, of 3, 5, and 7. Here ∆T is 40 and k is 7. It is clear that the match rate of the ExTable decreases as n_t grows since the total number of flow entries becomes larger. Therefore, k needs to be set properly considering various design parameters.
Figure 12 compares the flow processing time of the three schemes: the original MFT, MILE, and the proposed Agg-ExTable. Here (∆T, R_m, n_p, n_f, k, n_t) of the proposed scheme are set to (20, 0.68, 35, 20, 7, 5) using the data obtained from the simulation. The rate of using the direct path, Rd_n, is decided randomly between 0 and 0.1. Observe that the proposed Agg-ExTable scheme achieves much smaller flow processing time than the other schemes, as the ExTable provides an express forwarding path for a large portion of incoming flows. The difference gets more significant as the flow arrival rate increases. Note that a high arrival rate means high network load, and the flow processing time of the schemes grows as the network load increases. Another important merit of the proposed scheme is that, unlike the other schemes, its processing time remains almost constant regardless of the load. Figure 13 shows the flow processing times with a different setting of (40, 0.6, 35, 20, 5, 7). The proposed scheme consistently outperforms the other schemes.

Conclusions
In this paper, a novel flow management scheme has been proposed for the MFT of SDN switches. In the proposed Agg-ExTable scheme, the flow entries in the MFT are periodically aggregated by applying pruning and the Quine–McCluskey algorithm. Utilizing the memory space saved by the aggregation, the ExTable is constructed, keeping popular flow entries and allowing early match with the incoming flows. The proposed scheme is able to save about 45% of the TCAM space of the MFT and efficiently decrease the flow processing time through the express forwarding path provided by the front-end ExTable. Popular flow entries are selected from the flow tables using the HMM, where popularity is decided based on the match frequency and match probability. Computer simulation revealed that the proposed scheme significantly outperforms the existing schemes in terms of flow processing time.
In the future, we plan to investigate approaches that further reduce the memory space used by the ExTable. Various parameters are involved in the proposed scheme, and a formal model will be developed with which proper parameter values can be decided for a given condition. In addition, the match rate of the ExTable will be further improved by applying more sophisticated techniques, such as machine learning and fuzzy theory, to the selection of popular entries. The proposed approach will also be tested on a real test bed under various operational conditions and SDN environments.
