Adaptive Packet Coding for Reliable Underwater Acoustic Communications

Abstract: This work investigates adaptive random linear packet coding (RLPC) for reliable underwater acoustic (UWA) communications. Our goal is to minimize the total transmission time of data blocks by adjusting the packet coding rate. We first consider the application of RLPC with the conventional automatic repeat request (ARQ) scheme. We dynamically adjust the coding rate to fit the time variations of UWA channels by choosing the optimal number of packets in each transmission. This number is obtained with a dynamic programming (DP) algorithm from the feedback messages, which contain the number of successfully transmitted packets in the last transmission and the channel state information. Furthermore, considering the long propagation delay of UWA communications, we propose a modified juggling-like ARQ (J-ARQ) for the RLPC scheme, in which the duration of each transmission can be adjusted based on the characteristics of RLPC. A two-step DP algorithm is proposed to find the optimal solutions for this case. Simulation results show that the proposed schemes can improve the throughput efficiency and reduce the outage probability.


Introduction
Underwater communications have attracted much interest in recent years due to the rapid development of ocean exploration. Reliable data transmission among nodes is the basis of many underwater applications, such as tsunami warning, ecological monitoring, and off-shore oil exploration [1]. Compared to other media, acoustic waves are the only means of achieving reliable underwater transmission over a long distance. However, underwater acoustic (UWA) communications also face some difficulties compared to their terrestrial counterparts. The characteristics of UWA channels include large delay and Doppler spread, high path attenuation, limited bandwidth, and long propagation delay, which severely reduce the reliability of underwater transmissions.
Generally speaking, redundancy and retransmission are the two main approaches to enable a robust communication link. Retransmission schemes such as automatic repeat request (ARQ) are widely used in terrestrial wireless communications. However, the long propagation delay caused by the slow speed of acoustic waves (approximately 1500 m/s) in water makes the standard ARQ scheme very inefficient in UWA communications. Thus, many modified ARQ schemes have been proposed for UWA communications, such as the go-back-N scheme, the stop-and-wait (S&W) scheme and the selective repeat (SR) scheme [2,3]. These modified ARQ schemes improve the efficiency compared to standard ARQ, but not sufficiently, because the sender always remains idle while it is waiting for the receiver's acknowledgement (ACK) messages. As a result, they are still quite inefficient when the propagation time is long. In [4,5], a juggling-like ARQ scheme (J-ARQ) is proposed to further improve the channel utilization, where the sender reserves a fixed gap for the ACK reception after each transmission. In this way, the sender can still transmit data while it is awaiting the ACK. However, fixed transmission and reception durations are used in the J-ARQ scheme to avoid conflict, which leads to a decrease in transmission efficiency when the transmitter does not have an infinite amount of data waiting to be transmitted [5].
Compared to retransmission, the redundancy strategy instead seeks to improve the quality of each transmission. A common redundancy technique is forward error control (FEC), which can lower the bit error rate (BER) at the cost of reducing the data rate. For example, bit-level channel coding schemes, such as low-density parity-check codes [6] and convolutional codes [7], have been widely used in the physical layer of UWA communications. In contrast, in this paper we investigate another FEC technique, which is performed on the packet level and designed for the link layer [8][9][10][11][12][13][14][15][16][17]. Its mechanism is that the sender sends multiple coded packets which are encoded from the original packets. The receiver can recover the original packets correctly if enough coded packets are successfully received. Rateless codes [18][19][20], as a good packet-coding technology, have been used in UWA communications. These codes are so named since the sender can generate unlimited coded packets to ensure that the original packets can be fully recovered. In [12], a kind of rateless code, the Raptor code, is used in UWA communications, and the coding rate is optimized to maximize the throughput over UWA channels. In [13], a cross-layer FEC scheme which combines both the physical-layer FEC and the packet-layer FEC is developed as an extension of [12].
So far, many works have combined the redundancy strategy and the retransmission strategy based on rateless codes. In [5], a rateless code is used in the J-ARQ scheme. In [10], a hybrid ARQ (HARQ) scheme named segmented data reliable transport (SDRT) is proposed to achieve reliable data transmission in UWA sensor networks by employing Tornado codes. In this scheme, the sender keeps sending packets until it receives an ACK. In order to reduce the energy consumed by unnecessary transmissions, a window control mechanism is further proposed to estimate the expected number of packets actually needed. With this information, the sender transmits only a pre-estimated number of encoded packets (i.e., the window size), and then slows down the transmission to wait for an ACK. In [11], an underwater hybrid ARQ (UW-HARQ) is proposed which uses a NACK to feed back the receive state. Random linear packet coding (RLPC) is used as a rateless code in [14][15][16]. In [14], the optimal number of coded packets is investigated for a half-duplex link to minimize the time (or energy) required for the transmission of a group of packets. In [15], joint power and rate control for an acoustic link employing random linear packet coding is considered to achieve a prespecified outage/reliability criterion. Ref. [16] extends the work in [15] by grouped packet coding and increases its throughput efficiency based on the S&W ARQ scheme. The sender transmits a super-group of packets and waits for an ACK, which improves efficiency by packing multiple packets together.
In this paper, we propose two adaptive packet-coding schemes to achieve reliable UWA communications. We consider the time variations of UWA channels, which are ignored in most works. Since we focus on the link-level performance, we model the channel state by a finite-state Markov chain (FSMC) [21]. Several works use FSMCs to model UWA channels [22][23][24][25][26][27][28]. In [22,23], Preisig used a two-state FSMC channel model with known transition probabilities to evaluate energy-efficient schedulers for underwater acoustic point-to-point links based on experimental data. Compared to existing works, our main contributions are summarized as follows:
(1) We first propose an S&W ARQ scheme with RLPC (RLPC-ARQ). In the RLPC-ARQ scheme, the number of successfully transmitted packets in each transmission and the channel state information (CSI) are fed back to the sender in the ACK messages. The sender dynamically adjusts the coding rate based on the ACK messages to fit the time variations of UWA channels by choosing the optimal number of packets in each transmission. Meanwhile, considering the long propagation delay and the error-prone characteristic of UWA communications, we set a maximum number of retransmissions to avoid infinite retransmission. The problem is formulated as a finite-horizon optimization problem. A dynamic programming (DP) algorithm is proposed to obtain the optimal number of packets in each transmission.
(2) We also propose a modified juggling-like ARQ scheme for the RLPC system (RLPC-J-ARQ) to deal with the long propagation delay of UWA channels. In the RLPC-J-ARQ scheme, two data blocks are alternately transmitted at the sender, so there is no need to stop and wait for the ACK. Different from the J-ARQ scheme in [5], for which the duration of each transmission is fixed, the RLPC-J-ARQ scheme adopts an adjustable transmission duration by exploiting the characteristics of the rateless code. The rateless code is only used when there is not enough data to transmit, in order to avoid idleness. In addition, we also consider the effect of channel variation. In this case, the standard DP algorithm does not work. Thus, we also propose a two-step DP algorithm to find the optimal number of packets in each transmission.
The rest of this paper is organized as follows. Section 2 presents the basic system model. We formulate the optimization problem in Section 3. We propose the optimization solution based on the principle of DP in Section 4. Section 5 proposes the RLPC-J-ARQ scheme. Section 6 proposes a two-step DP approach to find the optimal solution of the RLPC-J-ARQ scheme. Section 7 provides numerical results for the proposed solutions and Section 8 concludes the paper.

System Model
We consider a point-to-point system operating in a half-duplex manner without interference from other nodes. The sender collects $W_s$ information packets. Each packet has $N_b$ symbols, and $N_b$ is constant during the transmission. These information packets are first divided into multiple blocks, each containing $W$ packets. In this paper, we only consider the FEC technique at the packet level. The $W$ original packets in each block are then encoded based on the RLPC [14][15][16] to generate sufficient coded packets. We assume that the data is delay constrained, which means the receiver should receive the data before the deadline. Otherwise, the data becomes useless.
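As a rough illustration of the packet-coding step, the sketch below implements random linear coding over GF(2): each coded packet is the XOR of a random non-empty subset of the $W$ originals, and decoding is Gaussian elimination. The field size, the integer packet representation, and the names `rlpc_encode` and `rlpc_decode` are assumptions made for this example; the RLPC schemes in [14][15][16] may use a different field and packet format.

```python
import random


def rlpc_encode(packets, num_coded, rng):
    """Return (mask, payload) pairs; bit i of mask says whether packet i is XORed in."""
    w = len(packets)
    coded = []
    for _ in range(num_coded):
        mask = rng.randrange(1, 1 << w)  # nonzero random GF(2) coefficient vector
        payload = 0
        for i, packet in enumerate(packets):
            if (mask >> i) & 1:
                payload ^= packet
        coded.append((mask, payload))
    return coded


def rlpc_decode(coded, w):
    """Gauss-Jordan elimination over GF(2); returns the w originals or None if rank < w."""
    basis = {}  # pivot bit -> (mask, payload), kept in reduced form
    for mask, payload in coded:
        # Clear every existing pivot bit from the incoming row.
        for bit, (b_mask, b_payload) in basis.items():
            if (mask >> bit) & 1:
                mask ^= b_mask
                payload ^= b_payload
        if mask == 0:
            continue  # linearly dependent packet, carries no new information
        pivot = mask.bit_length() - 1
        # Remove the new pivot from the rows already in the basis.
        for bit in list(basis):
            b_mask, b_payload = basis[bit]
            if (b_mask >> pivot) & 1:
                basis[bit] = (b_mask ^ mask, b_payload ^ payload)
        basis[pivot] = (mask, payload)
    if len(basis) < w:
        return None
    return [basis[i][1] for i in range(w)]


if __name__ == "__main__":
    originals = [0x11, 0x22, 0x33, 0x44]  # illustrative payloads
    stream = rlpc_encode(originals, 16, random.Random(7))
    print(rlpc_decode(stream, len(originals)))
```

Over GF(2), random combinations can be linearly dependent, so the decoder may need slightly more than $W$ coded packets; larger fields make this overhead smaller.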
We first consider an S&W ARQ scheme with RLPC, which is illustrated in Figure 1.
The coded packets from different blocks are transmitted in sequence. Let $N_i^{(k)}$ denote the number of transmitted packets for the $k$-th block in the $i$-th transmission. Let $\rho_{k,i}^{(1)}$ and $\rho_{k,i}^{(2)}$ denote the beginning and ending time instants of the $k$-th block in the $i$-th transmission. Hence, the relationship between them is

$$\rho_{k,i}^{(2)} = \rho_{k,i}^{(1)} + N_i^{(k)} T_p, \tag{1}$$

where $T_p = N_b T_s$ is the time duration of a coded packet and $T_s$ is the symbol duration.
Figure 1. The S&W ARQ scheme with RLPC. Legend: node in transmission mode; node in reception mode; guard interval; ACK message.
Let $T_d$ denote the propagation delay. The transmitted signal is received after a delay of $T_d = d/c$, where $d$ is the distance between the transceiver pair and $c$ is the sound speed in water. Let $\sigma_{k,i}^{(1)}$ and $\sigma_{k,i}^{(2)}$ denote the beginning and ending time instants of the received signal at the receiver. Then, we have

$$\sigma_{k,i}^{(j)} = \rho_{k,i}^{(j)} + T_d, \quad j = 1, 2. \tag{2}$$

When the receiver completes the receiving process, there is a guard interval, $T_g$, for the receiver node to process the data and change the mode. Next, the receiver responds with an ACK. According to the characteristic of RLPC, the receiver can recover the original information if enough coded packets are successfully received. Thus, we do not need to retransmit the unsuccessful packets in the following transmission. Instead, the sender sends enough new coded packets, which are also generated from the same information packets. This is the main difference from the traditional ARQ scheme. Hence, the ACK contains the number of successfully received packets $s_i^{(k)}$ in the current transmission and some pilot symbols used for channel estimation at the sender.
The BER $e$ depends on the channel coding rate, the modulation order, the channel state, and the codeword length. In this paper, these parameters are fixed except the channel state. Thus, the packet error probability (PEP) $P_c$ depends only on the channel state.
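As mentioned above, we focus on the link-level performance, so we model the channel state by a finite-state Markov chain (FSMC) [22,24,25]. Assume the channel state is constant during each transmission. Let $h_i^{(k)}$ denote the CSI when the $k$-th block is transmitted at the $i$-th transmission, which is quantized into $L$ levels. Let $P(s_i^{(k)} \mid N_i^{(k)}, h_i^{(k)})$ denote the probability that $s_i^{(k)}$ coded packets have been successfully received when $N_i^{(k)}$ coded packets are transmitted under the channel state $h_i^{(k)}$. Then,

$$P\big(s_i^{(k)} \mid N_i^{(k)}, h_i^{(k)}\big) = \binom{N_i^{(k)}}{s_i^{(k)}} \big(1 - P_c\big)^{s_i^{(k)}} P_c^{\,N_i^{(k)} - s_i^{(k)}}, \tag{3}$$

where $P_c$ is the PEP under the channel state $h_i^{(k)}$.

Let $\sigma_{k,i}^{(3)}$ and $\sigma_{k,i}^{(4)}$ denote the beginning and ending time points for the ACK at the receiver. Then, the relationship is given as

$$\sigma_{k,i}^{(3)} = \sigma_{k,i}^{(2)} + T_g, \qquad \sigma_{k,i}^{(4)} = \sigma_{k,i}^{(3)} + T_{ACK}, \tag{4}$$

where $T_{ACK}$ is the time duration of the ACK.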
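The channel and packet-success model can be sketched numerically as follows. The sketch assumes independent packet errors within a transmission, so that the number of successes is binomial, and a two-state FSMC. The function names and all numeric values are illustrative and are not taken from the paper.

```python
from math import comb


def success_pmf(n_sent, pep):
    """P(s packets succeed | n_sent packets sent), assuming independent errors.

    pep is the packet error probability for the current channel state.
    Returns a list indexed by s = 0, ..., n_sent.
    """
    return [
        comb(n_sent, s) * (1.0 - pep) ** s * pep ** (n_sent - s)
        for s in range(n_sent + 1)
    ]


def two_state_stationary(p01, p10):
    """Stationary distribution (pi_0, pi_1) of a two-state Markov chain.

    p01 is the probability of moving from state 0 to state 1,
    p10 the probability of moving from state 1 to state 0.
    """
    total = p01 + p10
    return (p10 / total, p01 / total)


if __name__ == "__main__":
    # Hypothetical PEPs: state 0 is a bad channel, state 1 a good one.
    for state, pep in enumerate([0.4, 0.05]):
        print(state, [round(p, 4) for p in success_pmf(4, pep)])
    print(two_state_stationary(0.1, 0.3))
```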
The sender will receive the corresponding ACK after a delay of $T_d$. Let $\rho_{k,i}^{(3)}$ and $\rho_{k,i}^{(4)}$ denote the beginning and ending time points for the received ACK at the sender. Then,

$$\rho_{k,i}^{(j)} = \sigma_{k,i}^{(j)} + T_d, \quad j = 3, 4. \tag{5}$$

The sender can estimate the CSI based on the pilot symbols in the ACK and decode the feedback message $s_i^{(k)}$. We use a low-rate channel code for ACK messages to improve the reliability of feedback messages. To reduce the complexity of the problem, we assume the feedback messages are error-free.
There is also a guard interval for the sender to process the ACK and change the mode. In this way, the $i$-th transmission for the $k$-th block is done. Hence, the total time duration for the $i$-th transmission is given as

$$\rho_{k,i}^{(4)} + T_g - \rho_{k,i}^{(1)} = N_i^{(k)} T_p + 2 T_d + 2 T_g + T_{ACK}. \tag{6}$$

If the receiver has received enough coded packets to recover the original information from the $k$-th block, the sender will send the coded packets from another block in the next transmission. Otherwise, the sender determines the number of transmitted coded packets in the $(i+1)$-th transmission based on the feedback ACK. Thus, the data rate of packet coding is not constant during the transmission. The sender dynamically adjusts the rate based on the current transmission result to adapt to the time variation of UWA channels.
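The timing of one S&W round described in this section can be traced with a short sketch. It follows the sequence in the text (data burst, propagation, guard interval, ACK, propagation, guard interval); the function name `sw_timeline`, the dictionary keys, and the numeric values are hypothetical.

```python
def sw_timeline(start, n_packets, t_p, t_d, t_g, t_ack):
    """Time instants of one stop-and-wait round, measured in seconds.

    rho* are instants at the sender, sigma* at the receiver.
    """
    rho1 = start                      # sender starts the data burst
    rho2 = rho1 + n_packets * t_p     # sender finishes the data burst
    sigma1 = rho1 + t_d               # receiver starts receiving
    sigma2 = rho2 + t_d               # receiver finishes receiving
    sigma3 = sigma2 + t_g             # receiver starts the ACK after the guard interval
    sigma4 = sigma3 + t_ack           # receiver finishes the ACK
    rho3 = sigma3 + t_d               # sender starts receiving the ACK
    rho4 = sigma4 + t_d               # sender finishes receiving the ACK
    end = rho4 + t_g                  # sender guard interval closes the round
    return {
        "rho": (rho1, rho2, rho3, rho4),
        "sigma": (sigma1, sigma2, sigma3, sigma4),
        "duration": end - start,
    }


if __name__ == "__main__":
    # Illustrative numbers: 1500 m link at 1500 m/s gives a 1 s propagation delay.
    round_info = sw_timeline(0.0, n_packets=10, t_p=1.0, t_d=1.0, t_g=0.1, t_ack=0.2)
    print(round_info["duration"])
```

The duration equals the data time plus a fixed overhead of two propagation delays, two guard intervals, and one ACK, so the overhead is amortized better when more packets are sent per round.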

Problem Formulation
Assume that the receiver has successfully demodulated $q_{i-1}^{(k)}$ packets from the $k$-th block in the previous $i-1$ transmissions according to the feedback messages. Then,

$$q_i^{(k)} = q_{i-1}^{(k)} + s_i^{(k)}. \tag{7}$$

According to the characteristics of RLPC, the information packets can be recovered correctly if [15,16]

$$q_i^{(k)} \ge W. \tag{8}$$

Then, according to (8), the $i$-th transmission must deliver at least $W - q_{i-1}^{(k)}$ packets, which is the minimum number of packets needed to guarantee correct decoding. Thus, the probability that the $W$ information packets are received correctly after the $i$-th transmission is given by

$$\sum_{s = W - q_{i-1}^{(k)}}^{N_i^{(k)}} P\big(s \mid N_i^{(k)}, h_i^{(k)}\big),$$

where $P(s \mid N, h)$ is the probability that $s$ of the $N$ transmitted packets are received successfully under the channel state $h$. It is obvious that the number of transmitted packets during the $i$-th transmission, $N_i^{(k)}$, cannot exceed a maximal level.

Our objective is to minimize the total transmission time of data blocks for delay-constrained applications by adjusting the packet-coding rate. Since packet error is a stochastic process, there are certainly cases where the information cannot be received correctly after many transmissions, especially for error-prone UWA channels. However, considering the significant propagation delay of UWA channels, too many retransmissions are unacceptable in a practical communication system. Thus, we set a maximal number of transmissions, $M$.
To make sure that the $W$ information packets can be received correctly after $M$ transmissions, we add a penalty term for the $M$-th transmission. In this way, the transmission time of the $M$-th transmission is given by where $C$ is a large value. This setting implies that if the information cannot be recovered correctly after $M$ transmissions, it will incur a large cost.
Then, the objective function is given by where $\gamma$ is the discount factor. Considering the cost of storage space and number overhead, we set $\gamma > 1$.
Naturally, there is a trade-off between the coding rate and the transmission time. Generally speaking, sending more coded packets improves the reliability of each transmission. Moreover, it reduces the probability of retransmission and hence the total transmission time. However, if we send too many packets in each transmission, the sending time also increases, which decreases efficiency. Thus, there is an optimal coding rate for each transmission that minimizes the total transmission time.
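The trade-off can be made concrete with a small numeric sketch. It assumes independent packet errors with a fixed PEP and treats any $W$ successfully received coded packets as sufficient for decoding. The name `decode_probability` and the numeric values are hypothetical.

```python
from math import comb


def decode_probability(w, q_prev, n_sent, pep):
    """Probability that a block of w packets is decodable after this transmission.

    q_prev packets were already received; n_sent coded packets are sent now,
    each lost independently with probability pep (an assumption of this sketch).
    """
    need = max(w - q_prev, 0)
    if need > n_sent:
        return 0.0
    return sum(
        comb(n_sent, s) * (1.0 - pep) ** s * pep ** (n_sent - s)
        for s in range(need, n_sent + 1)
    )


if __name__ == "__main__":
    w, pep, t_p = 10, 0.2, 1.0  # illustrative block size, PEP, packet duration
    for n in range(w, w + 7):
        # More packets raise the decoding probability but lengthen the burst.
        print(n, n * t_p, round(decode_probability(w, 0, n, pep), 4))
```

Each extra coded packet adds a fixed amount of sending time while the gain in decoding probability shrinks, which is why an interior optimum exists.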

Optimal Solution
In our problem, the state of the $i$-th transmission depends on the result of the $(i-1)$-th transmission. Meanwhile, since packet error is a stochastic process, this problem can be seen as a sequential decision-making problem, which can be solved by the finite-horizon DP approach [24,25,29]. Let $c_i^{(k)} = (q_{i-1}^{(k)}, h_{i-1}^{(k)})$ denote the state of the system at the $i$-th transmission for the $k$-th block, where $q_{i-1}^{(k)}$ denotes the number of coded packets successfully received in all $(i-1)$ previous transmissions and $h_{i-1}^{(k)}$ is the feedback channel state of the $(i-1)$-th transmission for the $k$-th block. Let $U(c_i^{(k)})$ denote the feasible set of the action $N_i^{(k)}$. In this case, the upper limit of The probability that the system state $c_i^{(k)}$ will transfer to state $c_{i+1}^{(k)}$ According to (8), The expected cost incurred from the $M$-th transmission is given by M , N The first term of (19) is the cost of the $M$-th transmission, and the second term is the expectation of the penalty term due to the incomplete transfer after $M$ transmissions. It depends on the result of the $M$-th transmission according to (14).
Based on the Bellman equation [29], the optimal solution can be obtained by recursively ). The expected cost incurred from the $i$-th transmission to termination is given by The first term in (21) represents the expected cost of the current transmission. The second term in (21) is the expected future cost accumulated from the $(i+1)$-th transmission to the $M$-th transmission.
The DP approach includes two steps. In step 1, the sender evaluates (21) to find the optimal $N_i^{(k)*}$ for each system state $c_i^{(k)}$ and records it in a lookup table. In step 2, the sender chooses the optimal $N_i^{(k)*}$ for the current state from the lookup table.
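The two steps can be sketched as a generic finite-horizon backward induction. This is a simplified illustration under stated assumptions, not the paper's exact formulation: packet errors are independent with a state-dependent PEP, the channel moves to a new state drawn from the FSMC before each transmission, each round costs $N T_p$ plus a fixed overhead `t_w`, the discount factor multiplies the expected future cost, and a flat `penalty` is charged if the block is undecoded after $M$ transmissions. The name `solve_rlpc_arq_dp` and all parameter names are hypothetical.

```python
from math import comb, inf


def solve_rlpc_arq_dp(w, m, n_max, pep, trans, t_p, t_w, gamma, penalty):
    """Backward induction over transmissions i = m, ..., 1.

    State (q, h): packets received so far and the last reported channel state.
    Returns (value, policy): value[i][q][h] is the expected cost-to-go and
    policy[(i, q, h)] is the minimizing number of coded packets.
    """
    n_states = len(pep)
    # value[m + 1] is the terminal cost: the penalty if still undecoded (q < w),
    # zero once q has reached w.
    value = [[[0.0] * n_states for _ in range(w + 1)] for _ in range(m + 2)]
    for q in range(w):
        for h in range(n_states):
            value[m + 1][q][h] = penalty
    policy = {}
    for i in range(m, 0, -1):
        for q in range(w):
            for h in range(n_states):
                best_cost, best_n = inf, None
                for n in range(1, n_max + 1):
                    future = 0.0
                    for h_next in range(n_states):
                        for s in range(n + 1):
                            prob = (trans[h][h_next] * comb(n, s)
                                    * (1.0 - pep[h_next]) ** s
                                    * pep[h_next] ** (n - s))
                            future += prob * value[i + 1][min(q + s, w)][h_next]
                    cost = n * t_p + t_w + gamma * future
                    if cost < best_cost:
                        best_cost, best_n = cost, n
                value[i][q][h] = best_cost
                policy[(i, q, h)] = best_n  # step 1: fill the lookup table
    return value, policy


if __name__ == "__main__":
    # Illustrative two-state channel: state 0 bad, state 1 good.
    value, policy = solve_rlpc_arq_dp(
        w=5, m=3, n_max=12, pep=[0.4, 0.05],
        trans=[[0.7, 0.3], [0.2, 0.8]],
        t_p=1.0, t_w=4.0, gamma=1.05, penalty=200.0,
    )
    # Step 2: look up the action for the current state (first round, nothing
    # received yet, last reported state bad).
    print(policy[(1, 0, 0)], round(value[1][0][0], 3))
```

The table has on the order of $M \cdot W \cdot L$ entries, and each entry scans every candidate $N$ and every outcome, which is where the computational cost of the DP approach comes from.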

RLPC-J-ARQ Scheme
In the RLPC-ARQ scheme, the sender idles for a round-trip time while waiting for the ACK. This is quite inefficient for UWA communications, which have a long propagation delay. Ref. [5] proposed the J-ARQ to solve this problem. However, the block size of J-ARQ is fixed, so the analysis in [5] cannot be directly applied to this problem. Essentially, the J-ARQ scheme uses the time during which the feedback information travels through the channel to send the following data blocks. Following this idea, we propose an RLPC-J-ARQ scheme for our problem.
Figure 2 illustrates the transmission style of the RLPC-J-ARQ scheme. We only consider the case in which two blocks are transmitted simultaneously. The sender leaves a gap after sending $N_i^{(k)}$ coded packets for receiving the ACK messages. Similarly, a guard interval is reserved for the node to change the mode. Next, the sender sends $N_i^{(k+1)}$ data packets for the $(k+1)$-th block. In this way, the transmitter can send more data blocks than in the RLPC-ARQ scheme.
In Figure 2, $\rho_{k+1,i}^{(j)}$ and $\sigma_{k+1,i}^{(j)}$ represent the time points for the sender and receiver with the $(k+1)$-th block, respectively. The relationships among the time instants $\rho_{k+1,i}^{(j)}$ and $\sigma_{k+1,i}^{(j)}$ are similar to those of the $k$-th block. According to Figure 2, there are some constraints on the time instants to avoid collisions. At the sender, we have At the receiver, we have k,i+2 − T g (23) According to (1), (2), (4) and (5), the relationship between $\rho_{k,i}^{(j)}$ and ρ At the receiver, the same relationship holds between $\sigma_{k,i}^{(j)}$ and $\sigma_{k,i+1}^{(j)}$. Similarly, the same relationship holds for the $(k+1)$-th block.
Thus, according to the relationship in (24), (22) can be rewritten as After further straightforward deduction, we have Similar results can be obtained for the receiver, where (23) can be rewritten as For the first transmission, $i = 1$, the constraints at the sender are given by k,1 = 0; according to (28), the constraints on the initial time instant of the $(k+1)$-th block are given as Namely, the sender sends the packets from the $(k+1)$-th block as soon as it is allowed to send. Meanwhile, the number of packets from the $(k+1)$-th block is also restricted, To sum up, the constraints in our problem are as follows According to (31), the number of transmitted packets for the $(k+1)$-th block at the $i$-th transmission does not exceed that for the $k$-th block at the $i$-th transmission (that is, $N_i^{(k+1)} \le N_i^{(k)}$). The number of transmitted packets for the $k$-th block at the $(i+1)$-th transmission does not exceed that for the $(k+1)$-th block at the $i$-th transmission (that is, $N_{i+1}^{(k)} \le N_i^{(k+1)}$). If N i+1 . Thus, the maximum number of transmitted packets in the $(i+1)$-th transmission is given by This is a difference from the RLPC-ARQ scheme. In the RLPC-ARQ scheme, the sender can adjust the number of transmitted packets freely based on the feedback ACK. In this case, however, the number of transmitted packets in each transmission is monotonically non-increasing.
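The ordering constraint stated in words above can be checked with a few lines of code. The sketch only covers the verbal form of the constraint (each burst is no longer than the burst just before it in the alternating schedule) and does not model the case in which one block has already finished and sends zero packets. The names `interleave` and `is_feasible_schedule` are hypothetical.

```python
def interleave(n_block_k, n_block_k1):
    """Alternate the per-transmission packet counts of block k and block k+1."""
    schedule = []
    for count_k, count_k1 in zip(n_block_k, n_block_k1):
        schedule.extend([count_k, count_k1])
    return schedule


def is_feasible_schedule(n_block_k, n_block_k1):
    """True if no burst is longer than the burst sent immediately before it."""
    schedule = interleave(n_block_k, n_block_k1)
    return all(later <= earlier for earlier, later in zip(schedule, schedule[1:]))


if __name__ == "__main__":
    # Bursts 8, 6, 5, 4 shrink over time, so the schedule is feasible.
    print(is_feasible_schedule([8, 5], [6, 4]))
    # The second burst of block k (7) exceeds the preceding burst of block k+1 (6).
    print(is_feasible_schedule([8, 7], [6, 4]))
```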
The transmission time duration for the $i$-th transmission is also different. For the first transmission, the transmission time duration is given by For the $i$-th transmission with $i > 1$, according to Figure 2, the transmission time duration is given by In this case, it seems that the transmission time duration for the $i$-th transmission has nothing to do with the number of transmitted packets from the $k$-th block. However, $N_i^{(k)}$ still affects the transmission time duration through the constraints in (31).
If $N_i^{(k)} > 0$ and $N_i^{(k+1)} = 0$, the transmission time duration should be However, it is difficult to calculate $\rho_{k+1,i}$ since it depends on the number of transmitted packets in the first $(i-1)$ transmissions. Thus, we let ρ In the same way, for the $M$-th transmission, we also set the penalty term, π(q (14). The objective function is also given by This problem is different from the problem in (15). There are two parameters that need to be optimized for each transmission. At the beginning of the $i$-th transmission for the $k$-th block, the sender has not yet received the feedback message of the $(i-1)$-th transmission for the $(k+1)$-th block. Namely, the sender only knows part of the result of the $(i-1)$-th transmission. Clearly, the standard DP approach is not suitable for this case. We propose a two-step DP approach to solve this problem.

Proposed Two-Step DP Approach
In each transmission, we separately define the system state for the $k$-th block and the $(k+1)$-th block. For the $k$-th block, the system state is c i is the maximum number of transmitted packets for the $k$-th block. For the $i$-th transmission of the $(k+1)$-th block, the sender already knows the transmission result of the $(i-1)$-th transmission. Thus, the system state is c ).
As analyzed above, the problem for the RLPC-J-ARQ scheme is also a sequential decision-making problem. Thus, it can also be solved based on the DP approach. The main difference is that the B i , as given in (32) and (33), respectively. In other words, the state of the $i$-th transmission is not only related In summary, the cost of the $M$-th transmission is given as M−2 < W (50)

The Cost of the i-th Transmission
For the $i$-th transmission, the system state for the $(k+1)$-th block is c ). The expected cost incurred from the $i$-th transmission of the $(k+1)$-th block to termination is given as Equation (51).
The conditional expectation in (51) is given as It is interesting that (52) does not seem to have much to do with N For the $k$-th block, the system state is c i ). The expected cost incurred from the $i$-th transmission to termination is given by (53). Similarly, we have also has no influence on the transfer probability and affects $J_i(c_i^{(k+1)})$ by limiting the range of $N_i^{(k+1)}$. and $N_i^{(k+1)}$ according to the system state. The details of the proposed optimal two-step DP policy are summarized in Algorithm 1.

Algorithm 1. The proposed two-step DP algorithm. The sender computes the costs based on (39) and (50) and saves them in the table, computes the costs based on (51) and saves them in the table, and then finds the optimal numbers of transmitted packets.

As shown in [5], the performance of the J-ARQ scheme depends on the number of source data packets. Thus, we first study the impact of this number. Figure 3 illustrates the throughput efficiency as a function of the number of source data packets $W_s$. From Figure 3, we find that the SR-ARQ scheme has the lowest throughput efficiency among all schemes. The throughput efficiency of conventional J-ARQ increases with the length of the source data. If the sender has enough source data, the J-ARQ scheme has the best throughput efficiency. The performance of the RLPC-ARQ and RLPC-J-ARQ schemes is independent of the data length. The throughput efficiency of RLPC-J-ARQ is better than that of RLPC-ARQ since it takes full advantage of the transmission time. When the data length is short, the performance of J-ARQ is worse than that of the RLPC-J-ARQ scheme. In the RLPC-J-ARQ scheme, the transmission duration is adjustable. Thus, it performs better when $W_s$ is not large enough.

Figure 4 illustrates the throughput efficiency as a function of the block size $W$. In this simulation, $W_s = 400$. The throughput efficiency of all schemes increases with $W$. The proposed RLPC schemes are still better than the corresponding ARQ schemes. This makes sense since a bigger $W$ means the transmitter can send more packets during a transmission, which reduces the number of ACKs. There is a turning point for the J-ARQ scheme. This is because $N_g$ decreases as the group size $W$ increases: $N_g = 4$ when $W = 10$, and $N_g = 3$ when $W = 12$. The throughput efficiency of the J-ARQ scheme decreases as $N_g$ becomes smaller since this leads to more idle time.
Figure 5 shows the impact of transmission distance on throughput efficiency. Naturally, the throughput efficiency decreases as $d$ increases, since this leads to an increase in the propagation delay $T_d$. The performance of the RLPC-ARQ scheme is always better than that of the SR-ARQ scheme. For the J-ARQ scheme, the throughput efficiency becomes larger at $d = 5000$ m because $N_g$ becomes larger. The transmitter sends two groups of packets during a round-trip time with $d = 4500$ m, while it sends three groups with $d = 5000$ m. The same situation also occurs at $d = 7000$ m. For the RLPC-J-ARQ scheme, the performance decreases continuously since we only consider the case in which two blocks are transmitted simultaneously. However, its throughput efficiency is still better than that of the J-ARQ scheme when $d$ is less than 7000 m.

Finally, we analyze the impact of the steady-state probability $\pi_0$, which reflects the effect of UWA channels. Figure 6 illustrates the throughput efficiency as a function of $\pi_0$. From Figure 6, we find that the performance of all schemes deteriorates as $\pi_0$ increases. A larger $\pi_0$ means the channel is more likely to be bad. The performance gain of the schemes with RLPC over those without RLPC also increases as $\pi_0$ increases. This is because, under a bad channel, the RLPC schemes can improve the probability of successful decoding by increasing the number of transmitted packets, which reduces retransmissions, whereas the conventional ARQ schemes need to retransmit many times. Thus, the superiority of the RLPC schemes is more evident for poor channels. After all, the conventional ARQ schemes also need no retransmissions when the channel is good.

This paper proposed two ARQ schemes with RLPC for UWA communications. The limitations of the proposed method are as follows. First, the sender is assumed to know the transition matrix of the channel state in advance, which requires extensive experiments to obtain. Second, the proposed methods are based on the DP algorithm, whose computational complexity is very high when the system state space is large. Thus, a deep learning method may be better suited to solving the problem.

Conclusions
This work investigates adaptive packet coding for reliable UWA communications and proposes two schemes based on two different ARQs. The first scheme is based on the conventional S&W ARQ scheme. The sender chooses the optimal number of packets in each transmission according to the feedback on the result of the last transmission. We also set a maximum number of retransmissions to avoid infinite retransmission, considering the error-prone characteristic of UWA channels. This problem is formulated as a finite-horizon optimization and solved with the DP algorithm. To overcome the impact of the long propagation delay, we propose a modified juggling-like ARQ scheme for RLPC. Compared with the standard J-ARQ scheme, the transmission duration of the proposed scheme is variable and exploits the rateless characteristics of random linear packet coding. A two-step DP algorithm is proposed to find the optimal solution. Simulation results demonstrate the performance gain of the proposed schemes as well as the impact of various practical factors such as data length, group size, channel steady-state distribution and transmission distance.


Figure 3. Throughput efficiency as a function of the number of source data packets $W_s$.

Figure 4. Throughput efficiency as a function of the number of group packets $W$.

Figure 6. Throughput efficiency as a function of the steady-state probability $\pi_0$.