CoFi: Coding-Assisted File Distribution over a Wireless LAN

The wireless channel is volatile in nature, due to various signal attenuation factors including path-loss, shadowing, and multipath fading. Existing media access control (MAC) protocols, such as the widely adopted 802.11 wireless fidelity (Wi-Fi) family, advocate masking harsh channel conditions with persistent retransmission and backoff, in order to provide a packet-level best-effort service. However, this approach does not fully account for the spatial asymmetry of the channel conditions seen by different client nodes, and it therefore degrades the transmission efficiency of the clients with good links. In this paper, we propose CoFi, a coding-assisted file distribution protocol for 802.11 based wireless local area networks (LANs). CoFi groups data into batches and transmits random linear combinations of the packets within each batch, thereby reducing redundant packet and acknowledgement (ACK) retransmissions when the channel is lossy. In addition, CoFi adopts a MAC layer caching scheme that allows clients to store overheard coded packets and use these cached packets to assist nearby peers. This further improves the effective throughput and shortens the buffering delay for applications such as bulk data transmission and video streaming. Our trace based simulation demonstrates that CoFi maintains a level of packet delay similar to that of 802.11, but increases the throughput by a significant margin in a lossy wireless LAN. Furthermore, we reverse-engineer CoFi and 802.11 using a simple analytical framework, proving that they asymptotically approach different fairness measures, which results in their disparate performance.


Introduction
The IEEE 802.11 (Wi-Fi) based wireless media access control (MAC) protocol has become a dominant technology that provides convenient access to the internet in a wireless LAN, in which a single access point (AP) can serve the file requests from multiple clients [1]. However, the effective performance of Wi-Fi, in terms of throughput, delay, and stability, still falls short of its wireline counterpart. This is mainly because of its sensitivity to ambient interference and the volatile wireless link condition [2]. In particular, the persistent-retransmission based loss protection scheme, and a backoff scheme that does not differentiate losses from congestion, have been identified as the main reasons for the inefficiency of 802.11 [3,4].
Alternative protocols have been proposed. Reference [5] proposed a new MAC algorithm, called multiple access with salvation army (MASA), which adopts a less sensitive carrier sensing to promote more spatial reuse of the channel. MASA alleviates the problem of a high collision probability by adaptively adjusting the communication distance via "packet salvaging" at the MAC layer, and significantly improves the throughput and packet delivery rate. In Reference [6], the authors presented the design, implementation, and evaluation of a system that opportunistically caches overheard data to improve subsequent transfer throughput in wireless mesh networks. Karaca et al. [7] proposed a new backoff scheme that performs well in dense networks and results in a low collision probability. The scheme opportunistically gives higher priority to users with a high traffic load and better channel conditions, and thus reduces unnecessary contention. Shetty et al. [8] proposed a simple window adaptation scheme for backoff in the 802.11 MAC protocol. The scheme uses constant step size stochastic approximation to drive collision probabilities to set values, and relies on an approximate analytic relationship between this probability and the backoff window to ensure high throughput and fairness. The protocols in all the above references replace the two mechanisms identified above (persistent retransmission and loss-agnostic backoff) with other schemes, while still providing a best-effort packet level service.
In this paper, we propose CoFi, a MAC level file distribution protocol that aims at improving the WLAN throughput and delay performance under harsh channel conditions. CoFi masks the wireless channel variations using coding based batch transmission. Instead of striving to improve per-packet reliability via retransmission, the AP in CoFi continuously transmits random linear combinations of original packets, which are grouped into batches. Once the receiver accumulates a sufficient number of linearly independent packets, it sends a single ACK to acknowledge the successful reception of the entire batch of data. The rationale of random linear network coding lies in removing packet ordering by mixing the information inside each batch [9][10][11]. With this technique, the scheduling problem within each batch can be avoided, since the receiver does not have to be halted because of the failure of a single transmission and the subsequent retransmission attempts. As long as the receiver can collect a sufficient number of coded blocks, it will be able to recover the original file. This advantage provides a natural solution to the notorious fairness problem in 802.11, which degrades the throughput of all clients once one of them experiences severe loss [12].
A further optimization that CoFi uses to improve performance is coding based MAC layer caching. The basic idea of MAC layer caching is to take advantage of the broadcast nature of the wireless medium [13]. Each client can overhear the coded packets that are intended for another client. When one of the (AP → client) links suffers from a harsh channel condition, CoFi allows another client that has overheard packets intended for the weak client to serve as a relay. The relay takes over the role of the AP and transmits random linear combinations of packets to the weak client until it can successfully decode. With the coding based batching and caching schemes, CoFi is able to significantly reduce the control overhead and redundant transmissions of traditional 802.11-like protocols, thus improving the throughput performance while ensuring full MAC layer reliability.
In designing the coding based batch transmission and MAC caching protocols, we have tried to improve not just the throughput and reliability, but also the per-packet delay. Note that the throughput-delay relation with network coding is not straightforward, since for each coded chunk of data, the receiver must wait until the entire batch can be decoded. In general, the delay performance of network coding depends on both the end-host computation time and the batch transmission time [14]. The end-host delay is usually negligible on a powerful mobile device, especially when using fast coding implementations [15]. Therefore, the per-packet delay is usually constrained by the batch transmission time, which depends on the channel condition as well as the batch size. With extensive trace based simulation, we demonstrate that the per-packet delay of CoFi can be close to that of 802.11, especially in a lossy wireless network, while the effective throughput can be more than twice as high. Such a performance advantage renders it particularly useful for applications such as bulk file transfer and streaming, which can tolerate an initial buffering delay.
The surprisingly large performance gap between CoFi and 802.11 may seem counter-intuitive, given that network coding does not fundamentally improve the network capacity [16]. Towards a more rigorous justification, we reverse-engineer both protocols using theoretical analysis. With a simple analytical model, we prove that CoFi and 802.11 essentially approach different fairness measures in an asymptotic manner, thus leading to disparate throughput efficiency.
The remainder of this paper is organized as follows. In Section 2, we review existing work on network coding and MAC caching protocols that improve the performance of wireless networks. We then discuss the major design issues of CoFi in Section 3, and describe the implementation of CoFi in Section 4. Section 5 evaluates the performance of CoFi with respect to throughput and delay, as well as the computation cost of network coding. Section 6 presents an in-depth analysis of the asymptotic properties of CoFi, aiming at a theoretical justification of its performance. Finally, Section 7 summarizes the paper and presents our future work.

Related Work
Since the pioneering work by Ho et al. [17], randomized network coding has shifted from the information theory domain to practical wireless networking systems. This trend has been marked by the COPE protocol [18], which employs the simplest form of network coding (i.e., XOR coding) for multi-session unicast in wireless mesh networks. COPE allows intermediate forwarders to opportunistically XOR incoming packets heading towards different next-hops, based on prior knowledge of the decodability at the intended downstream nodes. The encoding nodes broadcast the coded packets to all downstream nodes, thereby reducing the number of transmissions compared with traditional routing. The MAC-independent opportunistic routing & encoding (MORE) protocol [19] goes one step further by combining random linear network coding with opportunistic multipath routing. Owing to the resilience of random linear network coding to packet losses, MORE achieves 70% higher throughput on average than a traditional best-path routing protocol. Lin et al. [20] addressed the throughput degradation caused by the stop-and-wait ACK policy in MORE by allowing different data segments to coexist. In the proposed method, called code opportunistic routing (CodeOR), the source node transmits W (window size) concurrent segments. When the source node receives end-to-end feedback from the destination node, it adds a new segment to the current window. Kim et al. [21] proposed a scattered random network coding (S-RNC) scheme, which takes advantage of error position diversity to improve the throughput performance in multi-hop wireless networks. The RNC blocks can be classified into different groups according to different bit positions (i.e., error probability), and the lower ones are protected. The sender and relays scatter the bits of these protected coded blocks on "good" bit positions and the rest on "bad" bit positions. Amerimehr et al. [22] investigated the throughput gain of inter-flow network coding over a non-coding scheme on multicast sessions in multi-hop wireless networks. They also defined a new metric, the network unbalance ratio, which quantifies the imbalance in stability among nodes.
Despite the wide interest in the saturation throughput of network coding based protocols, little attention has been paid to their delay performance. In addition, randomized network coding has mostly been applied to improving the performance of a single network flow. When multiple unicast sessions are running concurrently, it is unknown whether network coding can still provide benefits, given its aggressive transmission strategy. In Reference [23], the delay performance gains of network coding are quantified using theoretical analysis, assuming an ideal time division multiple access (TDMA) based cellular network. In designing the CoFi protocol, we focus on the performance of network coding in realistic carrier sense multiple access (CSMA) based wireless LANs. Another deficiency of existing randomized network coding protocols is that they are mostly applied to routing in static mesh networks. For example, MORE [19] requires extensive measurement of all link conditions in a mesh network, and then runs a centralized algorithm to determine the number of packets sent over each link based on its average quality. In CoFi, we use signal strength as the metric for the selection of relays, which can be realized without any measurement overhead, and which is more adaptable to a mobile scenario.
The idea of opportunistically caching packets is not new. It has been applied in wireless mesh networks in order to reduce the routing overhead [5] and to save redundant transmissions [6]. However, these schemes strongly depend on the underlying scheduling algorithm that determines "which packets to cache" and "which cached packets to transmit". In CoFi, we take advantage of the inherent randomized nature of network coding, eliminate the complex scheduling problem, and thus further reduce the file completion time.
Network coding has been applied to 802.16 worldwide interoperability for microwave access (WiMax) cellular networks as well [24,25], in order to improve their throughput and stability. The 802.11 based WLAN differs from the setting in Reference [24] in that all subscribers in the WLAN contend for the same channel. In this case, the delay experienced by each client depends not only on its own channel condition, but also on the link quality experienced by others. Furthermore, in a multichannel system such as WiMax, the MAC caching scheme cannot be applied, since nearby links are usually assigned orthogonal sets of sub-channels.

Coding Based Batching and Caching in CoFi
In this section, we introduce the major design issues in CoFi, i.e., coding based batch transmission and MAC layer caching. We begin with the basic idea of randomized network coding and how it can be applied to simplify both mechanisms.

Background: Randomized Network Coding
Existing coding based wireless unicast protocols have mostly adopted the following batch based scheme, which was first introduced by Chou et al. [26]. Before transmission, in order to facilitate encoding, the original data file is grouped into batches (Figure 1), each containing n blocks of size k bytes. n and k are termed the batch size and the block size, respectively. A batch can be represented as an $n \times k$ matrix B, with rows being the n blocks, and columns the bytes (integers from 0 to 255) of each block. The coding operation is performed within each batch. The encoding operation produces linear combinations of the original blocks in this batch by $X = R \cdot B$ (Figure 2), where R is an $n \times n$ matrix composed of random coefficients in the Galois field GF($2^8$) [27]. The coding coefficients (rows in R), along with the coded blocks (rows in X), are packetized and transmitted to the receiver.
The decoding operation at the receiver side is the matrix inversion $B = R^{-1} \cdot X$, where each row of X represents a coded block and each row of R represents the coding coefficients accompanying it [27]. It is required that the matrix R is of full rank, i.e., the receiver must collect n linearly independent coded blocks for this batch to successfully recover the original batch B. Afterwards, it returns a single ACK packet to the sender, which confirms the successful decoding of the entire batch.
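The encoding and decoding steps above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the reduction polynomial 0x11B is an assumption (the text only specifies GF($2^8$)), and the function names `gf_mul`, `gf_inv`, `encode`, and `decode` are hypothetical.

```python
import random

POLY = 0x11B  # ASSUMPTION: reduction polynomial x^8 + x^4 + x^3 + x + 1


def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) by shift-and-add with reduction."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= POLY
        b >>= 1
    return r


def gf_inv(a):
    """Multiplicative inverse of a nonzero byte, computed as a^254."""
    r = 1
    for _ in range(254):
        r = gf_mul(r, a)
    return r


def encode(R, B):
    """Return X = R . B, where rows of B are blocks and rows of R are coefficients."""
    k = len(B[0])
    X = []
    for row in R:
        out = [0] * k
        for coef, block in zip(row, B):
            if coef:
                for j in range(k):
                    out[j] ^= gf_mul(coef, block[j])
        X.append(out)
    return X


def decode(R, X):
    """Recover B from (R, X) by Gauss-Jordan elimination; None if R is rank deficient."""
    n = len(R)
    M = [list(r) + list(x) for r, x in zip(R, X)]  # augmented matrix [R | X]
    for col in range(n):
        pivot = next((i for i in range(col, n) if M[i][col]), None)
        if pivot is None:
            return None  # fewer than n independent coded blocks
        M[col], M[pivot] = M[pivot], M[col]
        inv = gf_inv(M[col][col])
        M[col] = [gf_mul(inv, v) for v in M[col]]
        for i in range(n):
            if i != col and M[i][col]:
                f = M[i][col]
                M[i] = [v ^ gf_mul(f, w) for v, w in zip(M[i], M[col])]
    return [row[n:] for row in M]


def random_coefficients(n, rng=random):
    """Draw an n x n matrix of random GF(2^8) coefficients."""
    return [[rng.randrange(256) for _ in range(n)] for _ in range(n)]
```

A receiver would call `decode` once it holds n coefficient rows; a `None` result corresponds to the case where the collected blocks are not yet linearly independent and more coded packets are needed.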
Random linear network coding allows a re-encoding operation at the helper node. Specifically, the helper produces a new block by re-encoding the existing blocks it has overheard in this batch. The re-encoding operation replaces the coding coefficients accompanying the original coded packets with another set of random coefficients. For instance, if we consider the existing coded packets to be rows in the matrix Y, which from the viewpoint of the AP was obtained using $Y = R_y \cdot B$ (B is the original un-coded packets and $R_y$ is the random coefficients), then the helper node may produce new coded blocks by re-encoding the existing packets as $\tilde{Y} = R' \cdot Y = (R' \cdot R_y) \cdot B = \tilde{R}_y \cdot B$, where $R'$ is a fresh set of random coefficients drawn by the helper. As a result, the original coefficients $R_y$ are replaced by $\tilde{R}_y = R' \cdot R_y$. The re-encoding operation circumvents the packet level scheduling problem in traditional MAC caching protocols [5,6], because by randomly mixing information from all existing blocks, a newly generated coded block is innovative to the receiver with a probability close to 1 [17].
Note that the packet overhead of network coding is negligible, since the size of the coding coefficient vector (n) is usually much smaller than the packet size (k). This issue has been well recognized in existing work [19].

Coding Based Batch Transmission
The coding based batch transmission scheme in CoFi resembles the randomized network coding described above. In designing such a batch transmission scheme, a key factor is the parameter settings, including the batch size and the block size. A large batch size provides better protection against packet losses, but also induces higher computation overhead. On the other hand, a smaller block size results in lower computation overhead, but also implies a higher coefficient overhead and a lower level of loss resilience. In this paper, we use a small batch size, so that CoFi is never computation bound. In a parallel project, we look into an adaptive approach that opportunistically adjusts the batch size and block size according to the variation in the channel condition and the CPU load of the wireless node.
An important optimization that CoFi uses to reduce the computation and coefficient overhead is hybrid coding. Specifically, for each batch, a transmitter sends the first n packets without network coding, where n equals the batch size. Afterwards, it performs random linear encoding upon each outgoing packet. In this way, it still ensures that each outgoing packet is fresh for the receiver with high probability. However, since the first n packets are sent without coding, the computation cost of network coding is reduced. In addition, since the first n packets are sent as they are, CoFi no longer needs to wait until a full batch of data is available in the transmitter buffer. Consequently, the per-packet delay of CoFi is reduced, and it can gracefully degrade to a non-coding based protocol, which guarantees a per-packet quality of service (w.r.t. delay), when the channel condition is satisfactory. Note that such a hybrid coding scheme is not applicable to the existing coding based multipath multihop routing protocols (such as MORE [19]), since there is no direct link between the source and destination in those scenarios.
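The hybrid schedule can be illustrated by the sequence of coefficient vectors a transmitter attaches to its outgoing packets. The sketch below is an assumption-laden illustration, not the CoFi code: the generator name `hybrid_coefficients` is hypothetical, an uncoded packet is modelled as a unit coefficient vector, and the payload mixing itself is omitted.

```python
import random
from itertools import islice


def hybrid_coefficients(n, rng=random):
    """Yield coefficient vectors for one batch of n blocks.

    The first n vectors are unit vectors, i.e. block i is sent uncoded
    (multiplying by 1 in GF(2^8) leaves the block unchanged). Every later
    vector is drawn at random, i.e. a random linear combination is sent.
    """
    for i in range(n):
        yield [1 if j == i else 0 for j in range(n)]
    while True:
        yield [rng.randrange(256) for _ in range(n)]


if __name__ == "__main__":
    # Show the first six packets of a batch of four blocks.
    for vec in islice(hybrid_coefficients(4), 6):
        print(vec)
```

Under this model a receiver on a loss-free link decodes after exactly n packets without any matrix inversion, which is one way to read the "graceful degradation" to a non-coding protocol described above.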
Note that CoFi performs no exponential backoff after losses. The 802.11 MAC, in contrast, backs off after each loss because it treats all losses as congestion signals, which results in low efficiency. However, CoFi still preserves the exponential backoff scheme after sensing a busy channel. This is necessary to avoid collisions when both the clients and the AP need to reserve transmission time.

Coding Based MAC Layer Caching
A major component of CoFi is the coding based MAC layer caching scheme, as illustrated in Figure 3. The intuition behind MAC caching is to allow each client to opportunistically overhear packets intended for a co-located peer, and pass them to it later. This further reduces the delay due to bursty losses, which are quite typical in the wireless environment. With random linear network coding, the "packet selection" problem, i.e., choosing which packets in the cache to send, is circumvented. However, two critical issues still need to be addressed in the coding based MAC caching scheme: who should relay and when should the relay help, which correspond to a relay selection problem and a relay scheduling problem, respectively.

The Relay Selection Problem
When multiple potential relays are available for help, a relay selection algorithm must be applied, so as to avoid redundant transmissions. In the MORE multipath network coding protocol [19], a credit based algorithm is used to determine the number of coded packets each intermediate forwarder should transmit. The credit is computed at the source node in a centralized manner, based on the average quality of each link. The algorithm is applicable to mesh networks with static nodes and stable links, whose quality does not vary over a long period of time. In contrast, CoFi targets wireless LANs with possibly mobile clients and varying link conditions. In such scenarios, measuring the average link quality (time-averaged packet reception rate) of all links online is infeasible due to its large traffic overhead.
More specifically, we advocate a signal-strength based single-relay selection algorithm. For each client, at most one relay is allowed to help by transmitting cached packets when the client is experiencing severe losses. When multiple relays are available, the best is selected according to its potential. The potential of a client R as a relay for client C, P(R → C), is defined in terms of average signal strengths, where S(AP → R) denotes the average signal strength from the AP to node R. The potential of each client C for itself, i.e., P(C → C), equals the average signal strength from the AP to C. The selected best relay R, which has the highest potential, will be allowed to offer help for client C if and only if the following two conditions are satisfied:

• The potential P(R → C) exceeds P(C → C) by a margin set by TS, where TS is a threshold that is used to avoid oscillation of relay selections due to instantaneous variation of the channel condition. Currently, we set TS to 10 dB, which is the typical capture threshold used in wireless LANs to indicate that one signal significantly overwhelms another. Our experience with the USRP programmable wireless transceiver [28] indicates that the typical variation of signal strength for a static link also lies within 10 dB.
• The packet loss rate of link (AP → C), measured at the PHY level, is larger than a threshold TL. This threshold should be set according to the typical QoS requirement of clients. In our experiments, TL is equal to the inverse of the maximum retry limit specified by the 802.11 MAC, i.e., when the PHY loss rate is larger than TL, the 802.11 protocol will suffer from a low MAC level packet delivery ratio.

Note that both thresholds are hardcoded to typical values, requiring no runtime intervention.
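The selection rule can be sketched as follows. Several details here are assumptions made only for illustration and are not taken from the text: the potential is modelled as the weaker of the two hops, min(S(AP → R), S(R → C)); the first condition is read as P(R → C) − P(C → C) > TS; and the retry limit of 7 is a placeholder for whatever limit the 802.11 MAC in use specifies. The function and parameter names are hypothetical.

```python
def select_relay(client, candidates, s_ap, s_peer, loss_ap,
                 ts_db=10.0, retry_limit=7):
    """Pick at most one relay for `client`, or return None.

    s_ap[x]        average signal strength AP -> x, in dBm
    s_peer[(r, c)] average signal strength r -> c, in dBm
    loss_ap[c]     PHY level packet loss rate on link AP -> c
    ts_db          threshold TS (10 dB in the text)
    retry_limit    ASSUMED maximum retry limit; TL = 1 / retry_limit
    """
    tl = 1.0 / retry_limit
    # Condition 2: the direct link must be lossy enough to need help.
    if not candidates or loss_ap[client] <= tl:
        return None

    def potential(relay):
        # ASSUMPTION: a two-hop path is only as good as its weaker hop.
        return min(s_ap[relay], s_peer[(relay, client)])

    best = max(candidates, key=potential)
    # Condition 1 (assumed form): the relay must beat the client's own
    # potential P(C -> C) = S(AP -> C) by more than TS.
    if potential(best) - s_ap[client] > ts_db:
        return best
    return None
```

For example, with S(AP → C) = −85 dBm and a candidate whose weaker hop is −60 dBm, the 25 dB margin exceeds TS, so the candidate is selected provided the loss rate on (AP → C) is above TL.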
The advantage of the signal-strength based approach lies in its simplicity and efficiency. Our measurement of wireless links (Section 5) indicates that most links experience a stable signal strength with small variance unless the clients move. Therefore, a few samples are enough to determine the average quality of a link. To obtain the signal strength from R to C, for instance, C only needs to overhear a number of ACKs that are sent from R and are intended for the AP. In contrast, measuring the MAC level packet loss ratio typically requires a large number of samples over a relatively long period of time.
Existing work has mostly discarded signal strength as an indicator of the link quality [29,30]. However, those measurements adopted the 802.11 MAC layer throughput as the link quality metric. The 802.11 throughput is not closely correlated with the signal strength because of its persistent retransmission, per-packet ACK, and exponential backoff schemes. In CoFi, single-direction transmission without backoff dominates the traffic, and the link rate is directly related to the signal strength.

The Relay Scheduling Problem
The scheduling scheme in CoFi triggers the relay to transmit cached packets to the intended client in order to save the channel time wasted on the lossy link between the AP and that client. Once the best relay can decode an entire overheard batch, it takes over the role of the AP, serving the intended client by sending random linear combinations of the packets in the batch. When the AP overhears a coded packet from the relay, it stops transmitting to the weak client, since it knows the relay has a full batch of data that can rescue the target client.
Besides triggering the relay, the scheduling algorithm must ensure liveness, i.e., the AP can continue to the next batch after the client has decoded the current one. Towards this end, we mark the relay or the client as "FULL" whenever it can decode the current batch. A marked node will broadcast an ACK whenever it receives another packet with the current batch sequence number, which implies that the sender is still unaware of the decodability at this node. Such a persistent feedback scheme essentially allows the relay to forward the ACK to the AP, so that it can proceed to a new batch of data.
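The "FULL" marking and persistent ACK behaviour can be captured by a small per-node state machine. The sketch below is illustrative only: the class name `BatchReceiver`, the `innovative` flag (standing in for a real linear-independence check), and the handling of stale or newer batch sequence numbers are assumptions not spelled out in the text.

```python
class BatchReceiver:
    """Tracks decodability of the current batch at a client or relay."""

    def __init__(self, batch_size):
        self.n = batch_size   # blocks needed to decode one batch
        self.batch = 0        # current batch sequence number
        self.rank = 0         # independent coded blocks collected so far
        self.full = False     # the "FULL" mark

    def on_packet(self, batch_seq, innovative=True):
        """Process one received or overheard packet.

        Returns "ACK" if this node should broadcast a batch ACK, else None.
        """
        if batch_seq < self.batch:
            return None  # ASSUMPTION: packets of older batches are ignored
        if batch_seq > self.batch:
            # ASSUMPTION: a newer sequence number means the sender moved on.
            self.batch, self.rank, self.full = batch_seq, 0, False
        if self.full:
            # The sender is evidently unaware that we can decode: ACK again.
            return "ACK"
        if innovative:
            self.rank += 1
        if self.rank >= self.n:
            self.full = True
            return "ACK"
        return None
```

Because a FULL node answers every further packet of the same batch with an ACK, a lost ACK is eventually repaired without a separate ACK retransmission timer, which is the liveness property described above.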

Implementation of CoFi
We have implemented the CoFi protocol based on the 802.11 MAC/PHY layer in ns-2 [31], a widely used packet level discrete event simulator. As mentioned above, we suppress the original per-packet ACK in 802.11, and use a per-batch ACK instead. The 802.11 style per-packet retransmission and exponential backoff schemes are discarded in CoFi. This modified MAC layer interfaces with the higher layers through the CoFi encoding/decoding queues (Figure 4).
Existing work has mostly discarded signal strength as an indicator of the link quality [29,30].However, such measurements adopted 802.11MAC layer throughput as the link quality metric.The 802.11 throughput is not in close correlation with the signal strength because of its persistent retransmission, per-packet ACK, and exponential backoff scheme.In CoFi, a single-direction transmission without backoff dominates the traffic, and the link rate is directly related with the signal strength.

The Relay Scheduling Problem
The scheduling scheme in CoFi triggers the relay to transmit cached packets to the intended client, in order to save the channel time wasted on the lossy link between the AP and that client. Once the best relay can decode an entire overheard batch, it takes over the role of the AP and serves the intended client by sending random linear combinations of the packets in the batch. When the AP overhears a coded packet from the relay, it stops transmitting to the weak client, since it knows that the relay holds a full batch of data and can rescue the target client.
Besides triggering the relay, the scheduling algorithm must ensure liveness, i.e., the AP must be able to continue to the next batch once the client can decode the current one. To this end, we mark the relay or the client as "FULL" whenever it can decode the current batch. A marked node broadcasts an ACK whenever it receives another packet with the current batch sequence, since such a packet implies that the sender is still unaware of the decodability at this node. This persistent feedback scheme essentially allows the relay to forward the ACK to the AP, so that the AP can proceed to a new batch of data.
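The "FULL" marking and persistent ACK rule can be sketched as a small per-node state update. This is only an illustration under stated assumptions: the names `NodeState` and `on_coded_packet` are hypothetical and are not taken from the ns-2 implementation, the node is assumed to send its first ACK at the moment it becomes decodable, and packets from older batches are assumed to be ignored.

```cpp
#include <cstdint>

// Hypothetical per-node state (relay or client); names are illustrative only.
struct NodeState {
    std::uint32_t current_batch = 0;  // batch sequence currently being collected
    int rank = 0;                     // linearly independent coded packets held
    int batch_size = 0;               // n, the number of packets per batch
    bool full = false;                // "FULL": the current batch is decodable
};

// Processes one received coded packet and returns true when the node should
// broadcast an ACK for the current batch.
// `innovative` says whether the packet is linearly independent of the
// packets already held (see the coding queue).
bool on_coded_packet(NodeState& s, std::uint32_t batch_seq, bool innovative) {
    if (batch_seq > s.current_batch) {
        // The sender has moved on: start collecting the new batch.
        s.current_batch = batch_seq;
        s.rank = 0;
        s.full = false;
    }
    if (batch_seq < s.current_batch) return false;  // assumption: ignore stale packets
    if (s.full) return true;  // sender is still unaware of decodability: repeat the ACK
    if (innovative) ++s.rank;
    if (s.rank >= s.batch_size) {
        s.full = true;        // batch just became decodable
        return true;          // assumption: ACK immediately
    }
    return false;
}
```

Because a FULL node answers every further packet of the same batch with an ACK, a lost ACK is repaired by the next data packet rather than by a timer, which is what keeps the AP from stalling on a finished batch.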

Implementation of CoFi
We have implemented the CoFi protocol on top of the 802.11 MAC/PHY layer in ns-2 [31], a widely used packet-level discrete event simulator. As mentioned above, we suppressed the original per-packet ACK in 802.11 and used a per-batch ACK instead. The 802.11 style per-packet retransmission and exponential backoff schemes are discarded in CoFi. This modified MAC layer interfaces with the higher layers through the CoFi encoding/decoding queues (Figure 4).

Network Encoding and Decoding
Although we opted to use simulation to evaluate the performance of CoFi, real network coding is still needed because it serves as a checker for packet dependence. The implementation is written in C++ and consists of three parts: the coding queue, the encoding/decoding operation, and the elementary matrix operations in the Galois finite field GF(2^8).
A coding queue is simply a finite buffer whose size equals the batch size. The AP maintains a separate coding queue for each client, while each client maintains one coding queue for its own packets and one for the overheard packets intended for other clients. A newly arriving packet is put into the coding queue only if it is linearly independent of all existing packets in that queue. Independence checking and decoding are performed concurrently, using Gauss-Jordan elimination [27]. The encoding operation is simply a matrix multiplication over the finite field GF(2^8). Elementary arithmetic operations in GF(2^8) are implemented using a widely adopted lookup-table method [32,33].
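A minimal sketch of these two pieces follows: lookup-table arithmetic in GF(2^8), and a coding queue that performs the independence check and the Gauss-Jordan decoding in the same pass. It is not the authors' code. The reduction polynomial 0x11D with generator 2 is an assumption (the text does not name one), and `GF256` and `CodingQueue` are hypothetical names. Each row is assumed to hold the n coding coefficients followed by the coded payload, and all rows are assumed to have equal length.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

using u8 = std::uint8_t;

// GF(2^8) arithmetic via log/exp lookup tables.
// Assumption: reduction polynomial 0x11D, generator 2.
struct GF256 {
    std::array<u8, 512> exp_tab{};
    std::array<int, 256> log_tab{};
    GF256() {
        int x = 1;
        for (int i = 0; i < 255; ++i) {
            exp_tab[i] = static_cast<u8>(x);
            log_tab[x] = i;
            x <<= 1;
            if (x & 0x100) x ^= 0x11D;
        }
        // Duplicate the table so mul() needs no modulo on the exponent sum.
        for (int i = 255; i < 512; ++i) exp_tab[i] = exp_tab[i - 255];
    }
    u8 mul(u8 a, u8 b) const {
        return (a && b) ? exp_tab[log_tab[a] + log_tab[b]] : 0;
    }
    u8 inv(u8 a) const { return exp_tab[255 - log_tab[a]]; }  // requires a != 0
};

// Coding queue kept in reduced row echelon form, one slot per pivot column.
class CodingQueue {
public:
    explicit CodingQueue(std::size_t n) : n_(n), rows_(n) {}

    // Returns false (and drops the row) if it is linearly dependent.
    bool add(std::vector<u8> row) {
        for (std::size_t c = 0; c < n_; ++c)            // reduce against stored pivots
            if (row[c] && !rows_[c].empty()) axpy(row, rows_[c], row[c]);
        std::size_t p = 0;
        while (p < n_ && row[p] == 0) ++p;              // first remaining coefficient
        if (p == n_) return false;                      // nothing new: dependent
        scale(row, gf_.inv(row[p]));                    // normalize the pivot to 1
        for (auto& r : rows_)                           // clear column p elsewhere
            if (!r.empty() && r[p]) axpy(r, row, r[p]);
        rows_[p] = std::move(row);
        ++rank_;
        return true;
    }
    bool decodable() const { return rank_ == n_; }
    // Once decodable, row(i) is unit vector e_i followed by original packet i.
    const std::vector<u8>& row(std::size_t i) const { return rows_[i]; }

private:
    // dst += k * src; addition in GF(2^8) is XOR.
    void axpy(std::vector<u8>& dst, const std::vector<u8>& src, u8 k) const {
        for (std::size_t i = 0; i < dst.size(); ++i) dst[i] ^= gf_.mul(k, src[i]);
    }
    void scale(std::vector<u8>& v, u8 k) const {
        for (auto& b : v) b = gf_.mul(k, b);
    }
    GF256 gf_;
    std::size_t n_;
    std::size_t rank_ = 0;
    std::vector<std::vector<u8>> rows_;
};
```

Keeping the queue in reduced form means that rejecting a dependent packet and decoding the batch cost the same row operations, so no separate decoding pass is needed once the rank reaches the batch size.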

Packet Management
The format of a data packet is illustrated in Figure 5. Besides the legacy 802.11 header, each packet carries a piggybacked batch sequence, which is used for triggering the persistent ACK and the transmission of a new batch. In addition, before the original data is encoded, an offset sequence must be placed into the data field. This is necessary because the decoding algorithm (Gauss-Jordan elimination) may break the original ordering of packets through matrix row swapping. By encoding the offset sequence together with the data, we can recover the transport layer sequence of each packet from its batch sequence and MAC layer offset.
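As a small illustration of the last step, the sketch below recovers a transport layer sequence number from the two fields. The mapping is an assumption, not something the text specifies: batches are taken to be numbered from 0, to have a fixed size n, and to cover consecutive transport packets. The names `DecodedPacket` and `transport_seq` and the field widths are hypothetical.

```cpp
#include <cstdint>

// Hypothetical view of a decoded packet; field widths are illustrative.
struct DecodedPacket {
    std::uint32_t batch_seq;   // carried in clear next to the 802.11 header
    std::uint16_t mac_offset;  // encoded together with the data, recovered by decoding
};

// Assumed mapping: batch b of size n covers transport packets b*n .. b*n + n - 1.
std::uint64_t transport_seq(const DecodedPacket& p, std::uint32_t batch_size) {
    return static_cast<std::uint64_t>(p.batch_seq) * batch_size + p.mac_offset;
}
```

For example, with a batch size of 10, the packet with offset 2 in batch 3 maps to transport sequence 32 under this assumed numbering.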

Figure 5. Format of a CoFi data packet. Fields: Header, Batch seq, MAC offset, Original data (the MAC offset and the original data are encoded together).

The ACK packet is similar to that of the 802.11 specification. Notably, the ACK sent from a relay piggybacks the address of the client it is helping, because the AP only needs to know which client has decoded, rather than which relay is helping that client.

Decentralized MAC Caching
The relay scheduling algorithm can be realized without any notification message among the client, the relay, and the AP. However, the current version of CoFi still relies on message exchanges to carry out the relay selection procedure. More specifically, when a client experiences low throughput due to weak signal strength, it broadcasts a short "HELP" message to its neighbors in the WLAN. Each potential relay that overhears this message replies with the signal strength it observes from the AP. The AP collects all reply messages within a timeout period, performs the relay selection, and subsequently broadcasts a message containing the identity of the selected relay.
A client needs to re-initiate the relay selection protocol whenever the average signal strength (a moving average) from the relay to it drops by more than TS. This may happen when either of them moves. When the AP→Relay signal strength changes by more than TS, the relay also has to notify the client it is helping, so that a better relay can be selected. Such a relay selection algorithm is feasible in a typical dynamic WLAN, since the movement speed of nodes is usually low.
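The client-side trigger can be sketched as follows. The text only says "a moving average", so the exponentially weighted form, the smoothing factor `alpha`, the dB units, and the name `RelayLinkMonitor` are all assumptions made for illustration.

```cpp
// Hypothetical client-side monitor of the relay-to-client signal strength.
class RelayLinkMonitor {
public:
    // ts_db: the threshold TS; alpha: weight of the newest sample, in (0, 1].
    RelayLinkMonitor(double ts_db, double alpha) : ts_(ts_db), alpha_(alpha) {}

    // Call when the AP announces the selected relay; records the reference level.
    void on_relay_selected(double rssi_dbm) { baseline_ = avg_ = rssi_dbm; }

    // Feed one overheard sample (e.g. from an ACK sent by the relay).
    // Returns true when the average has dropped by more than TS since the
    // selection, i.e. when a new HELP round should be started.
    bool on_sample(double rssi_dbm) {
        avg_ = alpha_ * rssi_dbm + (1.0 - alpha_) * avg_;
        return baseline_ - avg_ > ts_;
    }

    double average() const { return avg_; }

private:
    double ts_;
    double alpha_;
    double baseline_ = 0.0;
    double avg_ = 0.0;
};
```

Averaging before comparing against TS keeps a single weak sample from triggering a costly re-selection, which matches the observation that link signal strength is stable unless a node moves.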

Performance and Evaluation
In this section, we evaluate the throughput and delay performance of CoFi in comparison with the 802.11 protocol. Before delving into the detailed evaluation, we first check the practicality of network coding in terms of computation cost.

The Computation Cost of Network Coding
Network coding is essentially a mechanism that trades computation time for network throughput. As long as the time spent on encoding or decoding one packet is shorter than the time needed to transmit one packet (in other words, the coding bandwidth is larger than the effective link bandwidth), the coding operations will not hurt the link throughput, because coding and transmission can be pipelined and performed in parallel.
To quantify the computation cost of network coding, we test the average bandwidth of GF(2^8) random linear encoding on a 3.5 GHz desktop PC with 16 GB of memory. Figure 6 plots the encoding bandwidth as a function of batch size and block size. Since encoding always operates over a dense matrix, the encoding time tends to dominate the decoding time. Therefore, we focus on the encoding bandwidth only. When the batch size is small, the coding bandwidth can be larger than the effective bandwidth of the 802.11 links [34]. This result is consistent with the existing literature [19]. However, the actual coding bandwidth in our experiment is lower than that reported in [19], mainly because our implementation is based on less optimized code. In fact, existing and ongoing work has already explored fast network encoding/decoding methods, such as hardware-accelerated approaches [15].
Considering the high computing power of existing clients, such as handheld mobile devices and laptops, the computation load of encoding and decoding at the clients usually does not affect the bandwidth of MAC protocols. In our experimental evaluation of CoFi, we therefore assumed that the cost of network coding could be ignored.
However, for low-end access points, such as sensors, computation resources may be limited, and the complexity of network coding may make a considerable difference. In our ongoing work, we implement and test network coding over a pair of programmable wireless nodes (the USRP testbed [28]), and dynamically adjust the batch size to satisfy the real-time QoS constraint of the link. We found that when the CPU is heavily loaded with other concurrent tasks, the computation cost of network coding is considerable even with a batch size of 10 and a packet size of 1.5 KB. This work will be complementary to our CoFi protocol.

The Trace Based Simulation
The variation of wireless channels with space and time is attributed to both large-scale effects (path-loss and shadowing) and small-scale effects (multipath fading and Doppler fading), and only the former are accounted for in the ns-2 channel model. To obtain more realistic results, we adopted a trace-based approach to compensate for the inaccuracy of the ns-2 simulation. Specifically, we obtained physical layer packet loss traces using a pair of universal software radio peripheral (USRP) programmable transceivers [28]. Compared with existing 802.11 traces, the advantage of USRP is that it has no MAC protocol, and therefore it can isolate the channel variation from the MAC layer protocol overhead. We modified the ns-2 physical layer module to intercept all packet transmissions, and then fed the traces into the module, allowing it to determine whether a packet could be successfully received (Figure 7). To improve the granularity of the traces, each USRP packet had only 32 bytes, which was shorter than both the data packet (1.5 KB) and the ACK packet (46 bytes) in our simulation. Whenever the airtime (duration of transmission) of a simulated packet overlapped an erroneous packet in the trace, the simulated packet was declared lost.
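The overlap rule can be sketched as below. The trace layout is an assumption made for illustration: the trace is treated as a sequence of back-to-back slots of equal duration, one per USRP packet, each flagged as erroneous or not, and the trace is replayed cyclically if the simulation outlasts it. The function name `packet_lost` is hypothetical and does not come from the modified ns-2 module.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// slot_error[i] is true if the i-th trace packet was received in error.
// Slot i is assumed to cover [i * slot_us, (i + 1) * slot_us) microseconds.
// Returns true if the simulated packet occupying
// [start_us, start_us + airtime_us) overlaps at least one erroneous slot.
bool packet_lost(const std::vector<bool>& slot_error, std::int64_t slot_us,
                 std::int64_t start_us, std::int64_t airtime_us) {
    if (slot_error.empty() || slot_us <= 0 || airtime_us <= 0 || start_us < 0) {
        return false;  // nothing to check
    }
    const std::int64_t first = start_us / slot_us;
    const std::int64_t last = (start_us + airtime_us - 1) / slot_us;  // last slot touched
    const std::int64_t n = static_cast<std::int64_t>(slot_error.size());
    for (std::int64_t s = first; s <= last; ++s) {
        // Assumption: wrap around and replay the trace from the start.
        if (slot_error[static_cast<std::size_t>(s % n)]) return true;
    }
    return false;
}
```

If one ignores preamble overhead, a 32-byte trace packet at 1 Mbps lasts 256 microseconds, so a simulated data packet spans many slots and is lost as soon as any one of them is erroneous; this is why short trace packets give finer granularity.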

We collected the packet loss traces for all links in the 3-node (AP, A, B) topology in Figure 3, and plot the results in Figure 8. We observed that most links tended to have a stable loss rate. However, some links experienced a large loss variation and asymmetric loss characteristics. For example, the (A→B) and (B→A) links demonstrate very different loss rates and variations. Such properties were not modeled in the ns-2 simulator, either.

CoFi Performance in an Elementary Topology
Using the above trace-based simulation, we evaluated the performance of CoFi within the 3-node (AP, A, B) topology in Figure 3. Unless otherwise noted, all MAC/PHY layer parameter settings follow Table 1. Since the USRP link bandwidth is 1 Mbps, we set the maximum link bandwidth to 1 Mbps in the ns-2 simulation, so that the simulation evolved on a time scale similar to that of the traces. Our first performance metric is throughput, which is the amount of decodable data received by each client per unit time. As demonstrated in Figure 9, the clients running CoFi experience much higher throughput than those using 802.11. CoFi provides more than a threefold improvement in time-averaged throughput for each client and guarantees full link-level reliability for each client. In contrast, the 802.11 MAC discards a packet if it cannot be acknowledged after a number of retransmission attempts (equal to Max Retry Limit). In addition, 802.11 suffers from link asymmetry. If the lossy link is in the forward direction, it affects 802.11 more than CoFi, because of the persistent retransmission and exponential backoff mechanisms. On the other hand, since CoFi uses batch transmission, which requires fewer ACKs, it suffers less from ACK losses in the reverse direction.
Figure 10 evaluates the time-averaged per-batch delay of CoFi and 802.11. For CoFi, the per-batch delay was measured without considering packet reliability, i.e., it equaled the time the receiver took to collect n different packets (n is the batch size). For 802.11, client A needed to receive the packets strictly one by one, in the order in which they were sent. At the same time, the transmitter needed to collect the full set of ACKs (one per packet). This increased the probability that the data transmission was disrupted by changes in the channel environment. Since the per-batch delay was inversely proportional to the throughput, the large performance gap between the two protocols was consistent with the throughput comparison in Figure 9.
We proceed to evaluate the per-packet delay of both protocols. In CoFi, an encoded packet is not visible at the receiver side until the entire batch it belongs to is decodable. Therefore, the per-packet delay of CoFi equals its per-batch delay if a packet reception fails during the hybrid coding stage. From Figure 11, we see that the per-packet delay of client A (which experiences a low loss rate) when using CoFi is similar to its delay when running 802.11. This is partly because of the hybrid coding scheme, which allows CoFi to degrade gracefully to a non-coding protocol. In addition, since CoFi discards the persistent retransmission and exponential backoff scheme, client B, which is under a harsh channel condition, does not affect the throughput of client A, which experiences far fewer packet losses. For client B, the per-packet delay of CoFi is higher than that of 802.11, since the successful reception of an entire batch takes longer than receiving a single packet in 802.11. We note that using a smaller batch size does not necessarily result in a shorter delay, especially for lossy links. This is because a smaller batch size also entails more interaction overhead, such as the ACK feedback and the switch between the AP and the relay. In the extreme case when the batch size equals 1, CoFi essentially degrades into a persistent retransmission protocol, and the benefit of MAC caching also diminishes.
Another problem of interest is: to what extent does the MAC caching scheme help? Figures 12 and 13 compare the two versions of CoFi, with and without MAC caching, in terms of throughput and per-packet delay.
Since the MAC caching scheme ensures that the throughput of high-quality links is not affected when they offer help to other links, client A achieves the same level of throughput and delay regardless of whether MAC caching is used. However, the throughput of client B is nearly doubled with MAC caching. The reason is that MAC caching replaces the low-quality link (AP→B) with the high-quality link (A→B) whenever relay A can offer help. The packets received by client A are recoded and sent to client B to improve the transmission efficiency. From Figure 13, we also observe that MAC caching is especially helpful during the period when the (AP→B) link experiences a harsh channel condition, where the bursty losses can be smoothed out. Therefore, caching is particularly useful for jitter-sensitive applications, such as video streaming over the wireless LAN.

CoFi in a Dynamic Topology Setting
To further validate the design of CoFi, we tested it in a simulated dynamic topology with one AP and five mobile clients. Each client moved according to the random waypoint model with a speed of 2 m/s and a sojourn time uniformly distributed between 10 and 30 s. The terrain was a circular region with a radius of 50 m, and the average reception probability at the edge of the region was 0.5. The maximum data rate of each link was set to 2 Mb/s, i.e., the basic data rate of 802.11.
Figures 14 and 15 plot the total network throughput and the per-packet delay, respectively. Owing to its batch transmission, CoFi improved on the throughput of 802.11 by a significant margin both with and without MAC caching, and the improvement was slightly larger than in the elementary topology. Furthermore, with the MAC caching scheme, the throughput was opportunistically improved, depending on whether a relay was available nearby. It can be expected that in a dense network with many clients, CoFi will demonstrate even higher throughput gains because more relays are available. Another observation was that the per-packet delay of CoFi could be lower than that of the 802.11 protocol (Figure 15). This is because when more mobile clients are subscribed to the same AP, the probability that one of them experiences low channel quality becomes higher. Clients with low channel quality can encumber the ones with high channel quality, which increases the transmission time in the 802.11 protocol. Therefore, the case where a low-quality link brings down the performance of the entire network becomes the common case. In addition, compared with the elementary topology, we observed that the per-packet delay increased and fluctuated considerably. This indicates that a dynamic topology requires higher-level protocol support and is much more complex to handle. The results show that the delay fluctuation when running CoFi was much smaller than that of the 802.11 protocol, which suggests that CoFi is better suited to practical scenarios.

Potential Application
The above evaluation demonstrated that CoFi could provide a much higher network throughput than the widely adopted 802.11 MAC protocol, especially when one or more clients in the WLAN are experiencing harsh channel conditions. However, CoFi does not guarantee a low per-packet delay in all cases. Therefore, it is best suited to bulk file transmissions and streaming protocols, which adopt buffers to tolerate an initial setup delay. In addition, once custom VLSI implementations of random linear network coding (RLNC) become available and CPU computing power improves, it will be possible to apply random linear coding on handheld mobile devices and standalone wireless nodes [35]. Clients can take a wide variety of forms, such as sensor nodes in the Internet of Things (IoT), which greatly expands CoFi's range of application.

Reverse Engineering of CoFi and 802.11
In this section, we develop a simple model for CoFi and the 802.11 MAC, in order to further understand the root of their performance disparity. We first model the asymptotic throughput of CoFi and 802.11, and then prove that they essentially solve optimization problems with the objective of maximizing throughput, but with different fairness constraints.

Asymptotic Throughput of CoFi and 802.11
We consider a wireless LAN topology with one AP and $K$ clients, denoted as $C_1, C_2, \ldots, C_K$. The average reception rate of each link (AP→$C_i$) equals $p_i$. The maximum link bandwidth supported by the underlying PHY is $R$. We assume each client requests a large file from the AP and the AP serves all clients in a round-robin manner.
For the 802.11 MAC, we assume the AP adopts persistent retransmission without backoff, thus analyzing an asymptotic upper bound of its throughput. For each client $C_i$, the average number of transmissions needed to deliver a unit packet is $x_i = 1/p_i$. Since the airtime of each packet is constant, the round-robin interval in which the AP serves $C_i$, measured in packet airtimes, is:
$$\sum_{j=1}^{K} x_j$$
which is independent of the client's identity. Within this interval, the total units of packets that are successfully delivered is $K$. Therefore, the total network throughput, as a fraction of $R$, is:
$$T = \frac{K}{\sum_{j=1}^{K} x_j}$$
Since each client only receives one packet intended for it during each interval, its effective throughput is:
$$T_i = R \cdot \frac{T}{K} = \frac{R}{\sum_{j=1}^{K} x_j}$$
which depends on the link quality of all other clients. For CoFi without caching, in each round a client is served only once. Therefore, its effective throughput only depends on its own link quality. Specifically, the throughput of a client $C_i$ is:
$$F_i = \frac{R}{K} \cdot p_i \quad (5)$$
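To make the two asymptotic models concrete, the following minimal sketch evaluates the closed forms used in this section's analysis, $R/\sum_{j} x_j$ for 802.11 with persistent retransmission and $(R/K) \cdot p_i$ for CoFi without caching. The function names and the sample values are illustrative assumptions, not part of the CoFi implementation.

```python
from typing import List


def _check_rates(p: List[float]) -> None:
    """Reception rates must be probabilities in (0, 1]."""
    if not p or any(not (0.0 < pi <= 1.0) for pi in p):
        raise ValueError("each reception rate p_i must lie in (0, 1]")


def throughput_80211(p: List[float], R: float) -> List[float]:
    """Per-client asymptotic throughput with persistent retransmission.

    Every client gets R / sum_j x_j, where x_j = 1 / p_j is the average
    number of transmissions needed to deliver one packet to client j.
    """
    _check_rates(p)
    total_transmissions = sum(1.0 / pj for pj in p)
    return [R / total_transmissions for _ in p]


def throughput_cofi(p: List[float], R: float) -> List[float]:
    """Per-client asymptotic throughput of CoFi without MAC caching.

    Each client is served once per round, so client i gets (R / K) * p_i.
    """
    _check_rates(p)
    K = len(p)
    return [R / K * pi for pi in p]


if __name__ == "__main__":
    rates = [1.0, 0.25]   # hypothetical reception rates of two clients
    bandwidth = 10.0      # hypothetical PHY bandwidth R
    print("802.11:", throughput_80211(rates, bandwidth))
    print("CoFi  :", throughput_cofi(rates, bandwidth))
```

With the sample values above, the 802.11 model gives 2.0 to each client, while the CoFi model gives 5.0 to the strong client and 1.25 to the weak one. In general the CoFi total is never below the 802.11 total under these formulas, because the arithmetic mean of the $p_i$ is at least their harmonic mean.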

Performance from a Fairness Point of View
In the literature of network resource allocation [36], two notions of fairness are widely adopted: max-min fairness and proportional fairness. If we consider the MAC as a bandwidth (or channel time) resource allocation protocol, the former essentially strives to allocate more channel time to the weak link [37,38], while the latter allocates the link bandwidth according to the quality of each link. More specifically, with the max-min fairness measure, a MAC protocol with persistent retransmission solves the following optimization problem:
$$\max \; t \quad (6)$$
$$\text{subject to: } t \le T_i, \quad \forall\, 1 \le i \le K$$
where $t$ is the effective throughput of the client with the weakest link condition. It can be easily proved that $t \le R \cdot (T/K) = R / \sum_{j=1}^{K} x_j$, which is achievable when $\forall i,\ T_i = R \cdot (T/K) = R / \sum_{j=1}^{K} x_j$. This is exactly the asymptotic throughput provisioned by the 802.11 protocol. Therefore, we have the following observation:
Theorem 1. The 802.11 MAC essentially solves an optimization problem that achieves max-min throughput fairness.
It is well known that max-min fairness strives to allocate the same level of resources to all clients, thus resulting in low efficiency. This reveals the fundamental reason behind the low performance of 802.11 under unsatisfactory channel conditions.
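One way to see the bound behind Theorem 1 is sketched below. It relies on an airtime budget that is an assumption of this sketch rather than something stated in the text: delivering throughput $T_i$ to $C_i$ costs a fraction $T_i x_i / R$ of the channel time, and these fractions sum to at most 1.

```latex
% Assumed airtime budget (illustrative, not taken from the paper):
\sum_{i=1}^{K} \frac{T_i \, x_i}{R} \le 1 .
% Every client receives at least t, so
t \sum_{i=1}^{K} x_i \;\le\; \sum_{i=1}^{K} T_i \, x_i \;\le\; R
\quad\Longrightarrow\quad
t \;\le\; \frac{R}{\sum_{j=1}^{K} x_j} ,
% with equality when T_i = R / \sum_{j=1}^{K} x_j for every i.
```

Under this assumption the maximum is reached only when all clients receive the same throughput, which is why a single weak link (large $x_j$) lowers the rate of every client.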
In contrast, with the proportional fairness measure, a MAC protocol without persistent retransmission solves the following optimization problem:
$$\max \sum_{j=1}^{K} \ln F_j \quad (9)$$
subject to a feasibility constraint on each $F_i$, where $F_i$ is the effective throughput that client $C_i$ can actually obtain. Similar to the 802.11 case, we can solve this problem by upper-bounding the objective. The solution $(R/K) \cdot p_i$ is exactly the asymptotic throughput of CoFi without MAC caching. Therefore, we have:
Theorem 2. CoFi without MAC caching essentially solves an optimization problem that achieves proportional throughput fairness.
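The upper-bounding step can be sketched as follows. The airtime shares $a_i$ are a hypothetical symbol introduced here for illustration: the sketch assumes client $C_i$ is given a share $a_i$ of the channel time, with $\sum_i a_i = 1$, so that $F_i = R \, a_i \, p_i$.

```latex
% Assumption: F_i = R a_i p_i with a_i >= 0 and \sum_i a_i = 1.
\sum_{i=1}^{K} \ln F_i
  = \sum_{i=1}^{K} \ln (R \, p_i) + \sum_{i=1}^{K} \ln a_i
  \;\le\; \sum_{i=1}^{K} \ln (R \, p_i) + K \ln \frac{1}{K} ,
% by the AM-GM inequality:
%   \Bigl(\prod_{i} a_i\Bigr)^{1/K} \le \frac{1}{K} \sum_{i} a_i = \frac{1}{K} .
% Equality holds iff a_i = 1/K for all i, which gives
F_i = \frac{R}{K} \cdot p_i .
```

Equal airtime shares therefore maximize the logarithmic objective under this assumption, and each client's throughput scales with its own reception rate only.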
In other words, high-quality links will enjoy a high throughput, while low-quality links are not starved. Such a proportional throughput measure may result in an imbalanced throughput allocation when the link qualities are different. The CoFi protocol with MAC caching offers a way to balance the throughput by replacing the low-quality (AP→client) link with a high-quality (relay→client) link. Therefore, CoFi essentially strikes a balance between the max-min and proportional fairness measures.
Through the above analysis, it can be seen that 802.11 and CoFi are essentially distributed approximate solutions to certain optimization problems, but with different optimization objectives. The network resource allocation problem is also equivalent to a discrete combinatorial optimization problem under certain constraints, which could be solved by stochastic optimization algorithms [39-41].

Conclusions
In this paper, we introduced CoFi, a MAC protocol that improves Wi-Fi performance under harsh channel conditions. To mask packet losses while ensuring reliable delivery, CoFi groups packets into batches and continuously transmits random linear combinations of all data in each batch until the receiver can decode them. In addition, CoFi allows clients to cache overheard packets, and schedules the best relay to help a client whose link to the access point is weak. Using trace-based simulation, we find that CoFi can significantly improve throughput in a lossy network, and gracefully degrade to a non-coding protocol when the channel condition is good. The per-packet delay of CoFi is comparable to 802.11 in a lossy network, and even better in a dynamic topology with mobile clients. We also developed a simple analytical model that justifies the performance of CoFi from a fairness point of view. Our ongoing work tries to improve CoFi in the following aspects:

• Implementation of CoFi on real wireless nodes. We plan to implement CoFi on the USRP testbed, which will provide more accurate experimental results. We are also considering making CoFi more compatible with existing 802.11 features, such as rate adaptation and service differentiation.

• CoFi for elastic applications. In this paper, we assumed stable application data when evaluating CoFi and 802.11. Elastic traffic, such as VBR, has more stringent requirements on delay performance. To adapt to such applications, we have to allow a dynamic batch size, so that each packet can be delivered on time. While adjusting the batch size, we also need to consider the end-host delay of network coding, especially when the computation load of coding operations is non-negligible.

Figure 1. Batch files for random linear encoding.

Figure 2. The encoding operation is equivalent to a matrix multiplication $X = R \cdot B$ (within the finite field GF($2^8$)), where $R$ is the random coefficient matrix and $B$ is the original data blocks in one batch. The decoding operation is the matrix inversion: $B = R^{-1} \cdot X$.
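The encoding and decoding operations described in the Figure 2 caption can be illustrated with a minimal sketch, given below. It is not the CoFi implementation. The reduction polynomial 0x11B (the one used by AES) is an assumption, since the text does not say which GF($2^8$) polynomial is used, and the function names `gf_mul`, `gf_inv`, `encode`, and `decode` are hypothetical. The decoder assumes it holds exactly as many coded packets as there are blocks in the batch.

```python
import random


def gf_mul(a: int, b: int) -> int:
    """Multiply two GF(2^8) elements (assumed reduction polynomial 0x11B)."""
    result = 0
    while b:
        if b & 1:
            result ^= a          # addition in GF(2^8) is XOR
        a <<= 1
        if a & 0x100:
            a ^= 0x11B           # reduce back to 8 bits
        b >>= 1
    return result


def gf_inv(a: int) -> int:
    """Multiplicative inverse computed as a^254, since a^255 = 1 for a != 0."""
    if a == 0:
        raise ZeroDivisionError("0 has no inverse in GF(2^8)")
    result = 1
    for _ in range(254):
        result = gf_mul(result, a)
    return result


def encode(R, B):
    """Compute X = R . B; each row of B is one original block of the batch."""
    width = len(B[0])
    X = []
    for coeffs in R:
        coded = [0] * width
        for c, block in zip(coeffs, B):
            for j in range(width):
                coded[j] ^= gf_mul(c, block[j])
        X.append(coded)
    return X


def decode(R, X):
    """Recover B = R^{-1} . X by Gauss-Jordan elimination on [R | X].

    Raises ValueError if the square coefficient matrix R is singular.
    """
    n = len(R)
    A = [list(r) + list(x) for r, x in zip(R, X)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if A[r][col] != 0), None)
        if pivot is None:
            raise ValueError("coefficient matrix is singular")
        A[col], A[pivot] = A[pivot], A[col]
        inv = gf_inv(A[col][col])
        A[col] = [gf_mul(inv, v) for v in A[col]]   # normalize pivot row
        for r in range(n):
            if r != col and A[r][col] != 0:
                f = A[r][col]
                # subtraction equals XOR in a field of characteristic 2
                A[r] = [v ^ gf_mul(f, p) for v, p in zip(A[r], A[col])]
    return [row[n:] for row in A]


if __name__ == "__main__":
    rng = random.Random(0)
    batch = [[rng.randrange(256) for _ in range(8)] for _ in range(4)]
    coeffs = [[rng.randrange(256) for _ in range(4)] for _ in range(4)]
    try:
        print(decode(coeffs, encode(coeffs, batch)) == batch)
    except ValueError:
        print("random coefficients were singular; draw a new matrix")
```

A random coefficient matrix is occasionally singular, in which case the receiver in this sketch would simply need a further coded packet with fresh coefficients before decoding.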

Figure 3. The batch transmission and MAC caching scheme in CoFi.

Figure 4. Implementation of CoFi in the ns-2 simulator.

Figure 6. The computation bandwidth of random linear encoding.

Figure 7. Trace simulation: Feeding USRP packet loss traces into the ns-2 physical module (checked ns-2 packets are declared as losses).

Figure 8. The packet loss rate traces for the 3-node (AP, A, B) topology in Figure 3. Each point represents the average packet loss rate during the past 5 s.

Figure 9. The throughput (averaged over 10-s periods) of each client running 802.11 protocol and CoFi protocol in the elementary topology.

Figure 10. The per-batch delay (averaged over 10-s periods) of each client running 802.11 protocol and CoFi protocol in the elementary topology.

Figure 11. The per-packet delay (averaged over 10-s periods) of each client running 802.11 protocol and CoFi protocol in the elementary topology.

Figure 12. The throughput of each client running CoFi with/without MAC caching in the elementary topology.

Figure 13. The per-packet delay of each client running CoFi with/without MAC caching in the elementary topology.

Figure 14. The total throughput of the network running 802.11 protocol and CoFi protocol with/without MAC caching in a dynamic topology.

Figure 15. The per-packet delay of the network running 802.11 protocol and CoFi protocol with/without MAC caching in a dynamic topology.

Table 1. Parameter settings for the performance test.
