EasyLB: Adaptive Load Balancing Based on Flowlet Switching for Wireless Sensor Networks

Load balancing is effective in reducing network congestion and improving throughput in wireless sensor networks (WSNs). Due to the fluctuation of wireless channels, traditional load balancing schemes for WSNs need to maintain global or local congestion information, which turns out to be complicated to implement. In this paper, we design a flowlet switching based load balancing scheme, called EasyLB, by extending the OpenFlow protocol. Flowlet switching is an efficient way to achieve adaptive load balancing in WSNs. Nevertheless, one tricky problem lies in determining the flowlet timeout value, δ: setting it too small risks packet reordering, while setting it too large reduces flowlet opportunities. By formulating the timeout setting problem with the stationary distribution of a Markov chain, we give a theoretical reference for setting an appropriate timeout value in a flowlet switching based load balancing scheme. Moreover, non-equal probability path selection and multiple parallel load balancing paths are considered in the timeout setting problem. Experimental results show that, by setting the timeout value following this theoretical reference, EasyLB adapts to wireless channel condition changes and achieves fast convergence of load balancing after link failures.


Introduction
In wireless sensor networks (WSNs), many sensor nodes are deployed to collect various types of data from the environment, e.g., temperature readings, images and video. The collected data are forwarded to the sink nodes through wireless channels, with or without the help of intermediate nodes. At the sink nodes, the data are further processed to perform specific tasks, such as fire detection, water quality monitoring and natural disaster prevention. To improve reliability and throughput, multi-path routing algorithms are widely used in WSNs [1][2][3][4][5][6], where load balancing plays a key role in reducing transmission latency and extending the lifetime of WSNs. As the wireless channels fluctuate frequently due to the inherent characteristics of wireless signals, the channel capacities of the paths change rapidly [7,8]; in the worst case, some links may even fail when deep fading happens [9]. In this scenario, how to achieve adaptive load balancing in WSNs is a crucial problem.
Many efforts focus on achieving load balancing in various scenarios. These load balancing schemes differ in operation granularity and in their ability to handle asymmetry. According to granularity, load balancing schemes can be broadly divided into three categories: packet level, flow level and sub-flow level. MPTCP [10], DRB [11], Fastpass [12] and DeTail [13] typically work at the packet level. In general, packet level schemes are able to achieve accurate control of the load ratio among multiple paths, but often lead to packet reordering. ECMP [14], WCMP [15] and Hedera [16] operate at the flow level. At the sub-flow level, a flow is split into bursts of packets called flowlets, and the size of a flowlet changes with the real-time network congestion; more specifically, the less congested the network is, the larger the flowlets become. Vanini et al. [21] found that this property of flowlets can achieve adaptive and resilient load balancing in the presence of topological asymmetry, and implemented an extremely simple flowlet switching based load balancing scheme called LetFlow [21]. However, a key problem lies in determining a proper timeout value: setting it too small leads to heavy packet reordering, and setting it too large reduces the opportunities for flowlet generation.
However, to achieve adaptive load balancing, choosing a proper timeout value is quite difficult because of dynamic TCP traffic patterns, different capacities of multiple parallel paths, etc. Currently, the timeout is usually set empirically based on sufficient simulations and statistics. Besides, a pre-designed timeout in one network cannot always reach good performance in other scenarios.
To solve this problem, in this paper, we investigate a theoretical reference for setting an appropriate timeout value in a flowlet switching based load balancing scheme, and verify the theoretical results in an OpenFlow enabled network. The contributions of this paper are summarized as follows:
• First, we design a flowlet switching based load balancing scheme for WSNs, called EasyLB, by adding one selection method to the group table defined in OpenFlow [26]. We show that, by setting the timeout according to our theorem, EasyLB achieves approximately optimal load balancing in WSNs and re-balances traffic quickly after link failures.
• Second, the flowlet switching process is modeled by a stationary Markov chain, under the assumption that all flows follow a Poisson process. Based on this model, we derive a theorem that specifies a sufficient condition on the timeout ensuring that the system converges to the ideal load balancing effect.
• Third, we further study the timeout setting problem for non-equal probability path selection and multiple parallel paths in flowlet switching. We derive a more general result when non-equal probability path selection is adopted, and give an algorithm for solving the timeout setting problem with multiple parallel load balancing paths.
The paper is organized as follows. Section 2 presents the design and implementation of EasyLB. In Section 3, we describe the flow number changes on parallel paths with a Markov chain model, and then reveal the relationship between δ and the load balancing effect via the stationary distribution of the Markov chain; we further study the timeout setting problem from the perspectives of non-equal probability path selection and multiple parallel paths. We evaluate EasyLB in different scenarios in Section 4. Section 5 concludes the paper and presents future work.

EasyLB Design and Implementation
In this section, we introduce the architecture of EasyLB and briefly explain the modified group table selection algorithm. EasyLB can be deployed at any merge node in a WSN.
The architecture of EasyLB and an example deployment policy in a WSN are shown in Figure 3. In the WSN, the sensor nodes are reconfigurable, which is enabled by software defined networking; more specifically, the sensor nodes communicate with a common controller via OpenFlow [26]. The flowlet detection module of EasyLB is implemented in the sensor nodes, while the load balancing decision module resides in the controller. The controller obtains the whole network topology through the topology discovery module and periodically collects the channel state, link delay and flow information through the network monitoring module. The controller pushes the source-destination multi-path decisions into the corresponding sensor nodes as group table entries.

Figure 3. The architecture of EasyLB and an example of the deployment policy in a WSN (controller modules: topology discovery, network monitoring and load balancing decision; hash-based flowlet detection runs on the sensor nodes).

As one representative southbound protocol in software defined networking [27], OpenFlow [26] standardizes the control signalling between the control plane and the data plane. The group table defined in the OpenFlow protocol consists of multiple group entries and provides advanced packet forwarding features to OpenFlow enabled switches. Each group entry contains multiple action buckets. In a group entry of type "select", only one action bucket, chosen by a selection method such as hashing, is executed. The "select" type is therefore well suited for implementing multi-path forwarding. By extending the selection method, we implement EasyLB by modifying the source code of Open vSwitch [28].
The group table selection algorithm is described in Algorithm 1. The hash tables last_arrival_time and last_output_channel record the last packet arrival time and the last egress channel for every flow, respectively. When the time interval between two packets belonging to one flow is greater than δ, a new flowlet is created, and the flowlet switching randomly chooses one of the available multi-path channels as its egress channel. If the interval is smaller than δ, the packet is forwarded on the same egress channel as the previous packet of the same flow. The group_alive_buckets function selects a channel in normal state from the available output channels. When a channel is down, the corresponding action bucket is neither selected nor executed; meanwhile, a port-down message is sent to the controller, which then makes a new multi-path decision. These mechanisms guarantee fast convergence of load balancing after link failures.

Algorithm 1
The group table selection algorithm.
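The selection logic of Algorithm 1 can be sketched as follows. This is an illustrative Python re-implementation, not the actual Open vSwitch C code; the flow-key and channel representations are assumptions of the sketch.

```python
import random
import time

class FlowletSelector:
    """Minimal sketch of the flowlet-based bucket selection in Algorithm 1."""

    def __init__(self, delta):
        self.delta = delta                  # flowlet timeout (seconds)
        self.last_arrival_time = {}         # flow_key -> last packet time
        self.last_output_channel = {}       # flow_key -> last egress channel

    def group_alive_buckets(self, channels):
        # Keep only channels whose link is up; a down channel is never chosen.
        return [c for c in channels if c.get("alive", True)]

    def select(self, flow_key, channels, now=None):
        now = time.monotonic() if now is None else now
        alive = self.group_alive_buckets(channels)
        last = self.last_arrival_time.get(flow_key)
        prev = self.last_output_channel.get(flow_key)
        # A gap larger than delta starts a new flowlet: pick a random alive
        # channel. Otherwise stick to the previous channel while it is alive.
        if last is None or now - last > self.delta or prev not in alive:
            prev = random.choice(alive)
        self.last_arrival_time[flow_key] = now
        self.last_output_channel[flow_key] = prev
        return prev
```

Within the timeout, packets of a flow stay on one channel (avoiding reordering); after a gap longer than δ, or on link failure, the flowlet is free to move.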

Timeout Setting in EasyLB
A wrongly chosen timeout value may fail to mitigate congestion, and may even reduce the throughput of the whole network, in a flowlet switching based load balancing scheme. Generally, the value of the timeout is obtained in advance through extensive simulations and statistics. In this section, we introduce a Markov chain model to describe the flowlet switching process and to investigate the timeout setting problem in EasyLB. We first start with the simplest case of two parallel paths with equal-probability path selection, then extend it to the non-equal-probability selection scenario, and finally solve the timeout setting problem in the general case of multiple parallel paths.

Markov Chain Model
Without loss of generality, consider N flows transferring over two parallel paths P1 and P2 with bottleneck capacities C1 and C2, respectively. Assume that the packet arrivals of each flow follow a Poisson process (although not all flows are subject to a Poisson process in different network environments, this assumption helps us understand the burstiness of TCP and carry out the theoretical study, as shown in [29][30][31][32]), and that all competing flows on the same path fairly share the path's capacity. Let F1 and F2 denote the sets of flows on P1 and P2, and let n1 = |F1| and n2 = |F2| denote the numbers of flows on P1 and P2, respectively. With γ denoting the average packet size, the arrival rate λi of flow i can be calculated as

λi = C1/(n1 γ) if i ∈ F1, and λi = C2/(n2 γ) if i ∈ F2.

The aggregate arrival rate λa on P1 and P2 is

λa = n1 · C1/(n1 γ) + n2 · C2/(n2 γ) = (C1 + C2)/γ.

The flowlet switching process can then be modeled by a Markov chain whose state is the number of flows on P1 and P2. From state (n1, n2), the next state is (n1, n2), (n1 − 1, n2 + 1) or (n1 + 1, n2 − 1), depending on the random path selection of the newly triggered flowlet. As Figure 4 depicts, if a flow i on path P1 triggers a new flowlet and the randomly selected path is P2, (n1, n2) transfers to (n1 − 1, n2 + 1). Similarly, if a flow i on path P2 triggers a new flowlet and the randomly selected path is P1, (n1, n2) transfers to (n1 + 1, n2 − 1). Otherwise, the state remains unchanged.
Figure 4. The model of the Markov chain. In state (n1, n2), there are n1 and n2 flows on P1 and P2, respectively. The green line represents a flow i on P1 triggering a new flowlet and selecting P2 as its next path, while the blue line represents a flow i on P2 triggering a new flowlet and selecting P1 as its next path. The red line represents the other cases.
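To make the model concrete, the following sketch computes the stationary distribution of the two-path chain numerically. The transition expressions used here — a flow on path k triggers a new flowlet with probability exp(−δ·Ck/(nk·γ)), i.e., the Poisson probability of a packet gap longer than δ — are our assumption in the LetFlow spirit, standing in for the closed-form expressions of the paper.

```python
import math

def stationary_distribution(N, C1, C2, gamma, delta, p=0.5, q=0.5):
    """Stationary distribution over m = number of flows on P1 (N - m on P2).

    Assumed birth-death transitions:
      up(m):   a P2 flow starts a new flowlet and re-routes to P1 (prob p)
      down(m): a P1 flow starts a new flowlet and re-routes to P2 (prob q)
    """
    def up(m):
        n2 = N - m
        return p * n2 * math.exp(-delta * C2 / (n2 * gamma)) if n2 else 0.0

    def down(m):
        return q * m * math.exp(-delta * C1 / (m * gamma)) if m else 0.0

    # Birth-death chain: pi[m+1]/pi[m] = up(m)/down(m+1) (detailed balance).
    pi = [1.0]
    for m in range(N):
        pi.append(pi[-1] * up(m) / down(m + 1))
    total = sum(pi)
    return [x / total for x in pi]
```

For example, with C1 = 20 Kbps (2500 B/s), C2 = 10 Kbps (1250 B/s), γ = 80 B, N = 10 and δ = 1 s, the distribution peaks near m ≈ 6–7, matching the ideal share µ = C1/(C1 + C2) × N ≈ 6.7.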

Formalization of Timeout Setting Problem
In this subsection, we derive a sufficient condition on the timeout that ensures the flowlet switching based load balancing scheme achieves the ideal load balancing effect.

Definition 1.
In the flowlet switching based load balancing scheme considered in the previous section, if n1/n2 = C1/C2, the ideal load balancing effect is achieved.
We define µ as the number of flows mapped to P1 when the ideal load balancing effect is achieved. According to Definition 1, µ = C1/(C1 + C2) × N. In this case, the corresponding state of the Markov chain is (µ, N − µ), which we call the ideal state.

Theorem 1. In a flowlet switching based load balancing scheme, if N flows transmit over two parallel paths with capacities C1 and C2, respectively, the sufficient condition on the timeout δ to achieve the ideal load balancing effect is given in Equation (1).

Proof of Theorem 1. The number of states in the preceding Markov chain is finite and all states are mutually accessible, so the Markov chain is irreducible and all states are recurrent. Additionally, any state can return to itself in one step, so this Markov chain has a stationary distribution. Let π = [π0, π1, ..., πm, ..., πN] denote the stationary distribution, where πm is the stationary probability of state (m, N − m). We obtain the stationary distribution by solving the balance equations πP = π, where ∑ from m=0 to N of πm = 1. After some mathematical manipulation, we arrive at Equation (2). The relationship between the stationary probability of the ideal state and that of any other state can be deduced from Equation (2), as given in Equations (3) and (4). From the perspective of the stationary distribution, the ideal load balancing effect is reached when the stationary probability of the ideal state is greater than that of any other state, i.e., Equations (5) and (6). Solving Equations (5) and (6) yields the condition specified in Equation (1). For a more detailed derivation, please refer to Appendix A.
When the capacity of P1 is greater than that of P2, Equation (3) is always larger than 1; that is, when P1 carries more flows than in the ideal state, flows tend to transfer to P2. Conversely, when the capacity of P1 is smaller than that of P2, Equation (4) is always smaller than 1, which indicates that, when P1 carries fewer flows than in the ideal state, flows tend to transfer to P1. Note that, when the paths are symmetric, the timeout has no effect on the final load balancing state.
Assuming that the delay difference between P1 and P2 is d, δ not only has to satisfy the constraint in Theorem 1, but also has to be greater than d to avoid packet reordering. However, setting δ too large lowers the possibility of flowlet generation; in the extreme case of δ = ∞, the scheme degenerates into flow-level load balancing. An upper bound on δ is left for future study.

Timeout Setting in Non-Equal Probability Path Selection
It is shown in [21] that, with equal probability path selection, a flowlet switching based load balancing scheme can achieve adaptive load balancing provided that δ is chosen appropriately, which is the magic of this load balancing technique. Considering the more general case where the paths are selected with non-equal probabilities, it is worth investigating whether the flowlet switching based load balancing scheme still maintains its adaptive capability and, if so, how δ should be set. In practice, paths are often selected with non-equal probabilities; consider the following two scenarios. In the first scenario, the switch queue on path P1 is shorter than that on path P2; to reduce the packet loss rate and improve the overall performance of the network, we should choose the less congested path P1 as the routing path for newly triggered flowlets with a higher probability. In the second scenario, the bandwidth of path P1 is greater than that of P2; to speed up the convergence of load balancing, we choose P1 as the routing path for newly triggered flowlets with a higher probability. In this section, we study the timeout setting problem when non-equal probability path selection is applied to the flowlet switching based load balancing scheme.
Denote the probabilities that P1 and P2 are selected as the routing path for a newly triggered flowlet as p and q, respectively, with p + q = 1. To accelerate the convergence of load balancing, p and q are usually set proportional to the bandwidths of the corresponding paths, i.e., p = C1/(C1 + C2) and q = C2/(C1 + C2). It is easy to obtain the transition probability P^1_(n1,n2) for a transition triggered on P1, and similarly the transition probability P^2_(n1,n2) for a transition triggered on P2. According to Equation (2), we can obtain the relationship between the stationary probability of the ideal state and that of any other state. Following Equations (5) and (6), we obtain the lower bound δmin given in Equation (7), which holds if qC1 < pC2.
As long as δ > δmin, adaptive load balancing can be achieved even with non-equal probability path selection; the derivation is similar to that in Appendix A. It can be seen that Theorem 1 is actually a special case with p = q = 0.5.
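The effect of non-equal selection probabilities on δmin can be checked numerically. The sketch below sweeps δ and reports the smallest value whose stationary mode coincides with the ideal state. It assumes a LetFlow-style transition model (a flow on path k triggers a new flowlet with probability exp(−δ·Ck/(nk·γ)) and is re-routed with probability p or q); it approximates, but does not reproduce, the paper's closed form in Equation (7).

```python
import math

def chain_mode(N, C1, C2, gamma, delta, p, q):
    """Mode of the assumed two-path birth-death chain over m = flows on P1."""
    pi = [1.0]
    for m in range(N):
        n2 = N - m
        up = p * n2 * math.exp(-delta * C2 / (n2 * gamma))        # P2 -> P1
        down = q * (m + 1) * math.exp(-delta * C1 / ((m + 1) * gamma))  # P1 -> P2
        pi.append(pi[-1] * up / down)
    return pi.index(max(pi))

def delta_min_numeric(N, C1, C2, gamma, p, q, step=0.01, limit=10.0):
    """Smallest delta (on a grid) whose stationary mode is the ideal state."""
    mu = round(C1 / (C1 + C2) * N)  # ideal number of flows on P1 (rounded)
    d = step
    while d < limit:
        if chain_mode(N, C1, C2, gamma, d, p, q) == mu:
            return d
        d += step
    return None
```

Under this model, selecting the higher-capacity path with a lower probability (as in the evaluation of Section 4.3, where P1 has twice the bandwidth but only probability 1/3) requires a larger timeout than equal-probability selection, consistent with the qC1 < pC2 condition above.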

Timeout Setting in Multiple Parallel Load Balancing Paths
In this subsection, we study how to set δ to achieve adaptive load balancing when there are multiple parallel load balancing paths. In that case, each newly triggered flowlet may choose any of the paths as its new routing path. As the number of parallel paths increases, the Markov chain model encounters the state explosion problem, and computing the transition probability matrix and the stationary distribution becomes intractable. Alternatively, we leverage the result in Equation (7) to derive the threshold on δ.
Assume there are M parallel paths P1, P2, ..., PM with capacities C1, C2, ..., CM, respectively. We divide all paths into two logical paths P̄1 and P̄2, where P̄1 contains P1 only and P̄2 contains all the other paths. The bandwidths of P̄1 and P̄2 are C̄1 = C1 and C̄2 = ∑ from k=2 to M of Ck, respectively. With random path selection among the M paths, the probabilities that P̄1 and P̄2 are chosen as the routing path for a newly triggered flowlet are p = 1/M and q = (M − 1)/M, respectively. The number of flows mapped to P̄1 when the ideal load balancing effect is achieved is µ = C̄1/(C̄1 + C̄2) × N. According to Equation (7), we obtain δ^1_min, the lower bound on δ for achieving the ideal load balancing effect between P̄1 and P̄2. After obtaining δ^1_min, we remove P1 from the physical path set, which leaves a new problem: how to achieve load balance among the remaining M − 1 paths P2, P3, ..., PM. Similarly, we obtain δ^2_min by regarding P2 as one logical path and the other M − 2 paths as the other logical path. This procedure is applied recursively until only two paths PM−1 and PM are left, which yields δ^(M−1)_min. To achieve the ideal load balancing effect over all M parallel paths, δ must be set as δmin = max{δ^1_min, ..., δ^(M−1)_min}. The detailed procedure is described in Algorithm 2; its time complexity is O(M), where M is the number of parallel load balancing paths.

Algorithm 2
An iterative algorithm for solving the timeout setting problem with multiple parallel paths.

Input: the physical path set P = {P1, P2, ..., PM} with bandwidths C1, C2, ..., CM; the total flow number N.
Output: δmin, the maximum of the per-step lower bounds δm (1 ≤ m ≤ M − 1).
1: for m from 1 to M − 1 do
2:   Pp ← the first physical path in P;
3:   P̄1 ← {Pp} and P̄2 ← P − P̄1;
4:   C̄1 ← bandwidth of Pp; C̄2 ← sum of the bandwidths of the paths in P̄2;
5:   p ← 1/|P| and q ← (|P| − 1)/|P|;
6:   µ ← C̄1/(C̄1 + C̄2) × N;
7:   compute δm according to Equation (7);
8:   N ← N − µ;
9:   remove Pp from the physical path set P;
10: end for
11: return δmin = max of δm over 1 ≤ m ≤ M − 1;
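The iterative reduction can be sketched as follows. Since the closed form of Equation (7) is not reproduced in this text, the two-path bound is passed in as a callable `two_path_delta_min(C1, C2, N, p, q)`, which is an assumption of this sketch; everything else follows the reduction described above.

```python
def multi_path_delta_min(capacities, N, two_path_delta_min):
    """Reduce an M-path timeout problem to M-1 two-path problems.

    capacities: bandwidths of the M physical paths.
    N: total number of flows.
    two_path_delta_min: callable giving the two-path bound of Equation (7).
    """
    caps = list(capacities)
    deltas = []
    while len(caps) > 1:
        M = len(caps)
        C1 = caps[0]                 # logical path P1 = first physical path
        C2 = sum(caps[1:])           # logical path P2 = all remaining paths
        p, q = 1.0 / M, (M - 1.0) / M  # uniform selection among M paths
        deltas.append(two_path_delta_min(C1, C2, N, p, q))
        mu = C1 / (C1 + C2) * N      # flows mapped to P1 in the ideal state
        N -= mu                      # remaining flows for the sub-problem
        caps.pop(0)                  # remove P1 and recurse on the rest
    return max(deltas)               # delta_min = max over all iterations
```

Each loop iteration peels off one physical path, so the procedure runs M − 1 times, giving the O(M) complexity stated above.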

Performance Evaluation
In this section, we evaluate the performance of EasyLB both in symmetric topology and asymmetric topology with random path selection. Further, we evaluate EasyLB under non-equal path selection and multiple parallel paths in flowlet switching. Experimental results show that EasyLB achieves relatively ideal load balancing effect. Meanwhile, without maintaining explicit congestion information, EasyLB has the ability to handle asymmetry in the topology and achieves fast convergence of load balancing after link failures.

Asymmetric Topology with Random Path Selection
The testbed topology of the wireless sensor network is shown in Figure 5. The testbed is based on SDN-WISE [33], a prototype system for SD-WSNs, and uses the OMNeT++ [34] simulator. We disable P3 in the simulations of both the asymmetric and the symmetric topology. The capacities of P1 and P2 are set to 20 and 10 Kbps, respectively, and the delay difference between P1 and P2 is 0.01 s. The forwarding queues of the sensor nodes run in a round-robin [35] fashion in our simulations to enhance the fairness of links. Ten long-lived TCP flows are generated from node C1 to C2, and the randomly generated requests are distributed to P1 and P2. The average packet size of each flow is 80 Bytes. We run a basic flowlet switching process with random path selection, as described in Algorithm 1.

We vary the timeout value from 0.2 s to 8 s and obtain the convergent traffic load ratio of P1 to P2 shown in Figure 6a. According to Equation (1), δmin equals 0.64 s. When δ > δmin, e.g., 1 s or 1.2 s, the system achieves the ideal load balancing effect. However, if the timeout is set too large, e.g., larger than 2 s, the granularity of the flowlet switching based load balancing scheme approaches the flow level, and the traffic load ratio of P1 to P2 becomes much more random. Similarly, if the timeout is set too small, e.g., 0.2 s, the scheme degenerates into packet-level load balancing and the traffic distributed onto the two paths is roughly the same.

In Figure 7, we set the timeout to different values to show the real-time traffic load ratio of P1 to P2. As shown in Figure 7a,b, the traffic load ratio fluctuates in both the symmetric and the asymmetric cases. As more flowlets are assigned to one path, congestion is more likely to occur on that path, which reduces the flowlet size. Flows then transfer from the more congested path to the less congested one, and finally the flowlet sizes on the two paths are balanced.

Symmetric Topology with Random Path Selection
We also evaluate the performance of EasyLB in a symmetric topology, where the capacities of P1 and P2 are both set to 10 Kbps. The convergent traffic load ratio of P1 to P2 is shown in Figure 6b. According to Equation (1), δmin equals 0 s. As depicted in Figure 6b, as long as δ is not too large, the traffic load ratio is approximately 1, which is the ideal load balancing effect in the symmetric topology.
The load ratio change of P1 to P2 in the symmetric topology is shown in Figure 7b, from which we can see that EasyLB converges to the ideal load balancing effect. When the number of flows on one path increases, that path becomes more congested and more packets are likely to be dropped. Packet loss in TCP brings about TCP timeouts, so new flowlets are generated, which forces flows to switch to the other path. When the system eventually reaches the steady state, the congestion degree on the two paths is almost the same.
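The convergence behavior described in the last two subsections can be reproduced with a toy Monte Carlo of the flow-count dynamics. This is an idealized model (not the SDN-WISE testbed): at each step a random flow may start a new flowlet with probability exp(−δ·Ck/(nk·γ)), the assumed Poisson gap probability, and is then re-routed uniformly at random between the two paths.

```python
import math
import random

def simulate_mean_n1(N, C1, C2, gamma, delta, steps=200_000, seed=1):
    """Long-run mean number of flows on P1 under the toy flowlet dynamics."""
    rng = random.Random(seed)
    n1 = N // 2
    total = 0
    for _ in range(steps):
        on_p1 = rng.random() < n1 / N          # pick a random flow
        n, C = (n1, C1) if on_p1 else (N - n1, C2)
        # The flow starts a new flowlet if its packet gap exceeds delta.
        if n > 0 and rng.random() < math.exp(-delta * C / (n * gamma)):
            goes_to_p1 = rng.random() < 0.5    # uniform re-routing
            if goes_to_p1 and not on_p1:
                n1 += 1
            elif not goes_to_p1 and on_p1:
                n1 -= 1
        total += n1
    return total / steps
```

With the asymmetric parameters above (C1 = 2500 B/s, C2 = 1250 B/s, γ = 80 B, N = 10, δ = 1 s), the long-run mean of n1 settles near the ideal share C1/(C1 + C2) × N ≈ 6.7, even though no congestion state is ever exchanged.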

Non-Equal Probability Path Selection
To evaluate EasyLB with non-equal probability path selection, we set the path selection probabilities of P1 and P2 to 1/3 and 2/3, respectively, and the bandwidths of P1 and P2 to 20 and 10 Kbps, respectively. According to Equation (7), δmin equals 1.28 s. As shown in Figure 8, as long as δ is larger than δmin, the traffic load ratio of P1 to P2 approaches 2, which is the ideal load balancing effect. Note that, when δ is too large, the scheme acts more like flow-level load balancing, and hence the traffic load ratio of P1 to P2 has difficulty converging to 2, as shown in Figure 8. An interesting finding is that, although the path selection probabilities of P1 and P2 are inversely proportional to their bandwidths, EasyLB maintains its adaptive load balancing capability as long as the timeout is set appropriately.

Multiple Parallel Paths
In this subsection, we evaluate the performance of EasyLB in the presence of multiple paths. As shown in Figure 5, the client applications on sensor node C1 randomly generate requests to the server applications on C2. The capacities of the three paths P1, P2 and P3 are set to 6, 6 and 18 Kbps, respectively. We randomly assign newly triggered flowlets to the three paths, so the path selection probability of each path is 1/3. According to Algorithm 2, δmin equals 0.88 s. As shown in Figure 9, when δ is set smaller than δmin, EasyLB acts more like a packet-level load balancing scheme, and the traffic loads of the three paths are almost the same. When δ is set larger than δmin, EasyLB achieves the ideal load balancing effect, in which the traffic distributed to each path is proportional to its bandwidth. However, as noted above, when δ is too large, EasyLB evolves into a flow-level load balancing scheme, and the final traffic load ratio of the three paths becomes much more random.

React to Link Failure
The capacities of the three paths P1, P2 and P3 are set to 6, 6 and 18 Kbps, respectively, and we set δ to 1.2 s and 1.4 s. At first, the traffic is distributed over the three paths, and around 20% of the traffic is allocated to P2. At t = 30 s, we manually break the link (N1, S1). As shown in Figure 10, EasyLB reacts quickly to this link failure event, and around 25% of the traffic is allocated to P2 after the re-balance. This is because, once a link failure happens in the multi-path case, the SDN controller perceives it quickly and pushes the new multi-path decision into the switches as a new group entry.

Conclusions and Future Work
We have studied the relationship between the timeout value in flowlet switching and the final state of load balancing using a stationary Markov chain, and derived a theorem that specifies a sufficient condition on the timeout to achieve the ideal load balancing effect, which gives a practical reference for setting the timeout in different network environments. We also implemented a flowlet switching based load balancing scheme, called EasyLB, in software defined networking by extending the OpenFlow protocol. Experimental results show that, by setting the timeout following the theorem, EasyLB has adaptive load balancing ability in both symmetric and asymmetric topologies. In this work, the number of active flows was assumed to be static. However, in highly dynamic networks [36,37], such as vehicle-to-vehicle (V2V) networks, the number of flows changes rapidly; moreover, flow priorities may affect the timeout setting. How to deal with the dynamic arrival and departure of flows, and how to take flow priority into account in the timeout setting, are left as future work.

Conflicts of Interest:
The authors declare no conflict of interest.