# RMHIL: A Rule Matching Algorithm Based on Heterogeneous Integrated Learning in Software Defined Network

## 1. Introduction

## 2. Problem Description and Transformation

#### 2.1. Problem Description

#### 2.2. Problem Transformation

## 3. Algorithm Design and Implementation

#### 3.1. Algorithm Design

**Algorithm 1:** Decision Tree Construction Algorithm (DTC)

1: Input: initial root node ${\mathrm{x}}_{0}$.

2: Initialize parameters ($\mathsf{\theta},\mathsf{\omega},\mathsf{\lambda}$), $\mathsf{\lambda}\in \left[0,1\right]$;

3: For ($\mathrm{n}=0;\mathrm{n}<\mathrm{N};\mathrm{n}++$) // N is the maximum number of iterations.

4: get $\mathsf{\pi}(\mathrm{x}|\mathrm{a},\mathsf{\theta})$;

5: $\mathrm{x}={\mathrm{x}}_{0}$;

6: while $\mathrm{f}\left(\mathrm{x}\right)\ne 0$ // f(x) = 0 if and only if x is a leaf node.

7: $\mathrm{a}=\mathsf{\pi}(\mathrm{x}|\mathrm{a},\mathsf{\theta})$;

8: $\mathrm{x}=\mathrm{NextNode}\left(\mathrm{x},\mathrm{a}\right)$; // NextNode(x, a) selects the next non-leaf node in depth-first traversal order.

9: $\mathrm{NandA}\left({\mathrm{x}}_{0}\right)\leftarrow \left(\mathrm{x},\mathrm{a}\right)$; // NandA(${\mathrm{x}}_{0}$) records all pairs $\left(\mathrm{x},\mathrm{a}\right)$.

10: End while

11: $\mathrm{d}\mathsf{\theta}=0,\mathrm{d}\mathsf{\omega}=0$;

12: get ${\mathrm{V}}_{\mathrm{n}}\left(\mathrm{x},\mathsf{\omega}\right)$ according to formula (3);

13: For each $\left(\mathrm{x},\mathrm{a}\right)\in \mathrm{NandA}\left({\mathrm{x}}_{0}\right)$

14: get R from environment;

15: $\mathrm{d}\mathsf{\omega}=\mathrm{d}\mathsf{\omega}+\partial {\left({\mathrm{V}}_{\mathrm{n}}\left(\mathrm{x},\mathsf{\omega}\right)-\mathrm{R}\right)}^{2}/\partial \mathsf{\omega}$;

16: $\mathrm{d}\mathsf{\theta}=\mathrm{d}\mathsf{\theta}+\left(\partial \mathsf{\pi}(\mathrm{x}|\mathrm{a},\mathsf{\theta})/\partial \mathsf{\theta}\right)\cdot \left(\partial \left({\mathrm{V}}_{\mathrm{n}}\left(\mathrm{x},\mathsf{\omega}\right)-\mathrm{R}\right)/\partial \mathsf{\pi}(\mathrm{x}|\mathrm{a},\mathsf{\theta})\right)$;

17: End for

18: $\mathsf{\theta}=\mathsf{\theta}+\mathrm{d}\mathsf{\theta},\mathsf{\omega}=\mathsf{\omega}+\mathrm{d}\mathsf{\omega}$;

19: End for

20: Output: strategy $\mathsf{\pi}(\mathrm{x}|\mathrm{a},\mathsf{\theta})$, value function ${\mathrm{V}}_{\mathrm{n}}\left(\mathrm{x},\mathsf{\omega}\right)$
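As a rough illustration of the update loop in Algorithm 1, the following Python sketch trains a policy on a toy rule set. Everything specific here is an assumption made for illustration and is not taken from the paper: the toy `RULES` and `ROOT_BOX`, a state-independent softmax policy over two cut fields, a scalar baseline `omega` standing in for ${\mathrm{V}}_{\mathrm{n}}\left(\mathrm{x},\mathsf{\omega}\right)$, a reward equal to the negative replication factor of each cut, the learning rates, and the standard policy-gradient-with-baseline update that is swapped in for the exact expressions of steps 15, 16, and 18.

```python
import math
import random

# Toy rule set (hypothetical): 8 rules, each covering 2 values of field 0
# and the full range of field 1.
RULES = [((2 * i, 2 * i + 1), (0, 255)) for i in range(8)]
ROOT_BOX = ((0, 15), (0, 255))
LEAF_SIZE, MAX_DEPTH = 1, 5


def softmax(theta):
    m = max(theta)
    e = [math.exp(t - m) for t in theta]
    s = sum(e)
    return [v / s for v in e]


def overlaps(rule, box):
    return all(r[0] <= b[1] and b[0] <= r[1] for r, b in zip(rule, box))


def cut(box, rules, dim):
    """Halve `box` along `dim`; return the two child (box, rules) pairs.

    Assumes the range of `dim` holds at least two values, which is
    always true for this toy setup within MAX_DEPTH.
    """
    lo, hi = box[dim]
    mid = (lo + hi) // 2
    children = []
    for part in ((lo, mid), (mid + 1, hi)):
        child_box = box[:dim] + (part,) + box[dim + 1:]
        children.append((child_box, [r for r in rules if overlaps(r, child_box)]))
    return children


def rollout(theta, rng):
    """Build one tree depth-first; return one (action, reward) pair per non-leaf node."""
    pi = softmax(theta)
    stack, pairs = [(ROOT_BOX, RULES, 0)], []
    while stack:
        box, rules, depth = stack.pop()
        if len(rules) <= LEAF_SIZE or depth >= MAX_DEPTH:
            continue  # leaf node, i.e. f(x) = 0
        a = rng.choices(range(len(pi)), weights=pi)[0]
        children = cut(box, rules, a)
        # Hypothetical reward: negative replication factor of this cut.
        r = -sum(len(rs) for _, rs in children) / len(rules)
        pairs.append((a, r))
        stack.extend((b, rs, depth + 1) for b, rs in children)
    return pairs


def train(iterations=200, lr_theta=0.05, lr_omega=0.01, seed=0):
    rng = random.Random(seed)
    theta, omega = [0.0, 0.0], -1.5  # omega is a scalar baseline (assumed form of V)
    for _ in range(iterations):
        pairs = rollout(theta, rng)
        pi = softmax(theta)
        d_theta, d_omega = [0.0, 0.0], 0.0
        for a, r in pairs:
            advantage = r - omega
            d_omega += 2.0 * (omega - r)  # derivative of (V - R)^2 w.r.t. omega
            for k in range(len(theta)):
                # gradient of log softmax: 1[k == a] - pi[k]
                d_theta[k] += advantage * ((1.0 if k == a else 0.0) - pi[k])
        theta = [t + lr_theta * g for t, g in zip(theta, d_theta)]  # ascent on return
        omega -= lr_omega * d_omega                                 # descent on squared error
    return softmax(theta), omega
```

In this toy setting, cutting field 0 never replicates a rule, whereas cutting field 1 copies every rule into both children, so `train()` should return a policy that favours field 0. A state-dependent policy and value network, as the paper's notation suggests, would replace the two-element `theta` and the scalar `omega`.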

**Algorithm 2:** Rule Matching Algorithm based on Heterogeneous Integrated Learning (RMHIL)

1: Input: a flow packet and a set of rules;

2: Call Algorithm 1 (DTC) to get strategy $\mathsf{\pi}(\mathrm{x}|\mathrm{a},\mathsf{\theta})$;

3: Construct a decision tree of the rule set according to the optimal strategy;

4: Traverse the decision tree to select the highest-priority rule;

5: Output: the rule matching the flow packet
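Steps 3 and 4 of Algorithm 2 can be sketched in Python as follows. This is only a minimal illustration under stated assumptions: rules are modelled as inclusive integer ranges per field, a larger `priority` value is taken to mean higher priority, and the hypothetical `widest_dim` heuristic stands in for the strategy learned by Algorithm 1. The names `Rule`, `Node`, `build`, and `lookup` are illustrative and do not come from the paper.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Range = Tuple[int, int]  # inclusive [lo, hi]


@dataclass
class Rule:
    ranges: List[Range]  # one inclusive range per header field
    priority: int        # assumption: larger value = higher priority
    name: str


@dataclass
class Node:
    box: List[Range]               # region of header space covered by this node
    rules: List[Rule]              # rules overlapping that region
    dim: Optional[int] = None      # cut field (None for a leaf)
    mid: Optional[int] = None      # cut point: left child covers values <= mid
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def overlaps(rule: Rule, box: List[Range]) -> bool:
    return all(r[0] <= b[1] and b[0] <= r[1] for r, b in zip(rule.ranges, box))


def widest_dim(node: Node) -> int:
    """Hypothetical stand-in for the learned strategy: cut the widest field."""
    return max(range(len(node.box)), key=lambda d: node.box[d][1] - node.box[d][0])


def build(box, rules, choose_dim=widest_dim, leaf_size=2) -> Node:
    """Step 3: recursively cut the box in half along the chosen field."""
    node = Node(box, rules)
    if len(rules) <= leaf_size:
        return node
    dim = choose_dim(node)
    lo, hi = box[dim]
    if lo == hi:  # chosen field cannot be cut further, keep as leaf
        return node
    mid = (lo + hi) // 2
    left_box = box[:dim] + [(lo, mid)] + box[dim + 1:]
    right_box = box[:dim] + [(mid + 1, hi)] + box[dim + 1:]
    node.dim, node.mid = dim, mid
    node.left = build(left_box, [r for r in rules if overlaps(r, left_box)],
                      choose_dim, leaf_size)
    node.right = build(right_box, [r for r in rules if overlaps(r, right_box)],
                       choose_dim, leaf_size)
    return node


def lookup(node: Node, pkt: Tuple[int, ...]) -> Optional[Rule]:
    """Step 4: descend to a leaf, then pick the highest-priority matching rule."""
    while node.dim is not None:
        node = node.left if pkt[node.dim] <= node.mid else node.right
    hits = [r for r in node.rules
            if all(lo <= v <= hi for v, (lo, hi) in zip(pkt, r.ranges))]
    return max(hits, key=lambda r: r.priority) if hits else None


# Example rule set over two 4-bit fields (hypothetical values).
EXAMPLE_RULES = [
    Rule([(0, 3), (0, 15)], 3, "A"),
    Rule([(0, 15), (4, 7)], 2, "B"),
    Rule([(8, 15), (8, 15)], 1, "C"),
    Rule([(0, 15), (0, 15)], 0, "D"),
]
ROOT = build([(0, 15), (0, 15)], EXAMPLE_RULES)
```

Because every leaf stores all rules that overlap its region, the final scan inside the leaf returns the same rule as a linear scan over the full rule set. The tree only narrows the candidates that must be checked.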

#### 3.2. Algorithm Implementation

## 4. Comparative Experimental and Performance Verification

#### 4.1. Matching Time

#### 4.2. Memory Overhead

## 5. Conclusions

- (1) As networks keep growing and the functional requirements of SDN keep increasing, the commonly used 5-field matching method will struggle to meet demand, so 40-field rule matching must be considered. Although constructing the samples is simple and the software is ready, generating the strategy and computing the value function for a large number of real 40-field samples takes a long time. We plan to complete this in the next step and then test the algorithm's performance on those samples.
- (2) The algorithm proposed in this paper has no obvious advantage in memory usage. We are considering either incorporating the approach that the CutTSS algorithm uses to reduce rule overlap (first dividing the rule set and then constructing the decision trees in parallel) or designing an efficient and feasible solution tailored to the characteristics of the algorithm.
- (3) When measuring the matching time, we do not time it in the traditional sense.
- (4) A custom standard is used to facilitate comparative experiments. To further reduce the matching time of the proposed algorithm, we are considering the GPU version of TensorFlow, the AWS cloud computing platform, and other acceleration methods.

## References

**Figure 3.** A recurrent neural network model, where $\mathrm{a}$ is the parameter passed between the recurrent units.

Source IP/IPv6 Address | Destination IP/IPv6 Address | TCP/UDP Source Port Number | TCP/UDP Destination Port Number | Protocol | Instruction Set | Counters | Priority | Timeout | Cookie |
---|---|---|---|---|---|---|---|---|---|
1.0.0.0/32 | 1.0.1.0/32 | * | * | TCP | drop | 3 | 0 | 00:50 | A |
* | * | 0 | 20 | TCP | add | 5 | 1 | 00:39 | B |
* | * | * | * | TCP | change | 4 | 2 | 00:27 | C |
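To make the matching semantics of such flow-table entries concrete, the following Python sketch performs a naive linear scan over the three example rows. It is a baseline for illustration, not the RMHIL method. The assumptions are: the wildcard `*` is modelled as `None`, a larger Priority value is taken to mean higher priority, and the packet is a plain dictionary whose keys (`src`, `dst`, `sport`, `dport`, `proto`) are hypothetical names chosen here.

```python
import ipaddress
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Rule:
    src: Optional[str]    # CIDR prefix, or None for the wildcard "*"
    dst: Optional[str]
    sport: Optional[int]
    dport: Optional[int]
    proto: Optional[str]
    action: str
    priority: int         # assumption: larger value = higher priority
    cookie: str


def _ip_ok(prefix: Optional[str], addr: str) -> bool:
    return prefix is None or ipaddress.ip_address(addr) in ipaddress.ip_network(prefix)


def _eq_ok(want, got) -> bool:
    return want is None or want == got


def matches(rule: Rule, pkt: dict) -> bool:
    """True if every non-wildcard field of the rule agrees with the packet."""
    return (_ip_ok(rule.src, pkt["src"]) and _ip_ok(rule.dst, pkt["dst"])
            and _eq_ok(rule.sport, pkt["sport"]) and _eq_ok(rule.dport, pkt["dport"])
            and _eq_ok(rule.proto, pkt["proto"]))


def linear_match(rules: List[Rule], pkt: dict) -> Optional[Rule]:
    """Scan all rules and return the highest-priority match, or None."""
    hits = [r for r in rules if matches(r, pkt)]
    return max(hits, key=lambda r: r.priority) if hits else None


# The three example rows of the flow table.
RULES = [
    Rule("1.0.0.0/32", "1.0.1.0/32", None, None, "TCP", "drop", 0, "A"),
    Rule(None, None, 0, 20, "TCP", "add", 1, "B"),
    Rule(None, None, None, None, "TCP", "change", 2, "C"),
]
```

A linear scan costs time proportional to the number of rules for every packet, which is the cost that decision-tree methods aim to reduce.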

Data Set | Number of Rules | Magnitude | Data Set | Number of Rules | Magnitude | Data Set | Number of Rules | Magnitude |
---|---|---|---|---|---|---|---|---|
1 | 799 | 1000 | 11 | 7298 | 10,000 | 21 | 75,100 | 100,000 |
2 | 846 | 1000 | 12 | 8772 | 10,000 | 22 | 83,752 | 100,000 |
3 | 857 | 1000 | 13 | 8833 | 10,000 | 23 | 83,797 | 100,000 |
4 | 863 | 1000 | 14 | 9039 | 10,000 | 24 | 83,966 | 100,000 |
5 | 934 | 1000 | 15 | 9377 | 10,000 | 25 | 88,081 | 100,000 |
6 | 942 | 1000 | 16 | 9418 | 10,000 | 26 | 96,078 | 100,000 |
7 | 961 | 1000 | 17 | 9476 | 10,000 | 27 | 98,180 | 100,000 |
8 | 970 | 1000 | 18 | 9591 | 10,000 | 28 | 99,052 | 100,000 |
9 | 990 | 1000 | 19 | 9655 | 10,000 | 29 | 99,318 | 100,000 |
10 | 990 | 1000 | 20 | 9774 | 10,000 | 30 | 99,480 | 100,000 |

