## 1. Introduction

With the development of space exploration and the evolution of future space information networks (SIN), erasure-correcting codes have attracted considerable research interest as a means of enhancing the information transmission capacity under extremely challenging space communication environments [1], which are characterized by frequent and lengthy link disruptions, high data loss, and long link delays [2]. In 2014, the Consultative Committee for Space Data Systems (CCSDS) released an experimental specification of long erasure correcting (LEC) codes for near-Earth and deep-space communications [3], in which near-optimum fixed-code-rate Irregular-Repeat-Accumulate (IRA) codes are proposed. In [4], a joint design of the CCSDS file delivery protocol (CFDP) and IRA codes is discussed. Such erasure codes can be decoded efficiently with the Maximum-Likelihood algorithm [5], and the code rate can be selected from several values [6,7].

Rateless codes (RC), also termed fountain codes, are capacity-achieving loss-resilient codes for erasure channels. Luby-Transform (LT) codes with a well-designed robust Soliton degree distribution (RSD) are the first practical realization of fountain codes [8]. They can recover the original $k$ information (input) symbols from any $N=k+O(\sqrt{k}\,{\ln}^{2}(k/\theta ))$ received coded (output) symbols with probability $1-\theta $ at a decoding cost of $O(k\ln (k/\theta ))$ operations, where $\theta$ is the allowable probability of failing to recover the original message after $N$ coded symbols have been received. To further reduce the decoding complexity and to address the high error floor of LT codes, Shokrollahi proposed Raptor codes, which concatenate a high-rate pre-code with an LT code that uses a weakened robust Soliton degree distribution (WRSD) [9]. In [10,11], LT codes are incorporated into the CFDP implementation; both the Packets Interleaving CCSDS File Delivery Protocol and the Loss-Tolerant File Delivery Protocol can resist channel erasures and further reduce the overhead in CFDP.
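To make the RSD mentioned above concrete, the following Python sketch builds the distribution in its commonly stated form: an ideal Soliton part plus an extra term with a spike near degree $k/R$. The function name `robust_soliton`, the tuning constant `c`, and the default parameter values are assumptions chosen for illustration; they are not taken from [8] or from this paper. The argument `theta` plays the role of the allowable failure probability $\theta$.

```python
import math

def robust_soliton(k, c=0.1, theta=0.5):
    """Return [mu(1), ..., mu(k)], a robust Soliton distribution for k input symbols.

    c is a free tuning constant (assumed value); theta is the allowable
    decoding failure probability.
    """
    R = c * math.log(k / theta) * math.sqrt(k)
    spike = int(round(k / R))  # degree at which the extra spike is placed

    # Ideal Soliton part: rho(1) = 1/k, rho(d) = 1/(d(d-1)) for d >= 2.
    rho = [0.0] * (k + 1)
    rho[1] = 1.0 / k
    for d in range(2, k + 1):
        rho[d] = 1.0 / (d * (d - 1))

    # Extra part: tau(d) = R/(d k) below the spike, one spike term, zero above.
    tau = [0.0] * (k + 1)
    for d in range(1, min(spike, k + 1)):
        tau[d] = R / (d * k)
    if 1 <= spike <= k:
        tau[spike] = R * math.log(R / theta) / k

    beta = sum(rho) + sum(tau)  # normalization constant
    return [(rho[d] + tau[d]) / beta for d in range(1, k + 1)]

mu = robust_soliton(1000)
mean_degree = sum((d + 1) * p for d, p in enumerate(mu))
print(f"P(d=1)={mu[0]:.4f}  P(d=2)={mu[1]:.4f}  mean degree={mean_degree:.2f}")
```

Most of the probability mass sits on degree 2, while the small degree-1 mass and the spike keep the decoding process supplied with low-degree symbols; the mean degree grows only logarithmically with $k$, in line with the $O(k\ln (k/\theta ))$ decoding cost quoted above.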

Moreover, a line-of-sight (LOS) link is often unavailable for space exploration rovers to communicate with the base station directly [12]. The space exploration rovers and relay satellites can form a typical multi-access relaying SIN, whose multi-hop links vary dynamically over time [13]. For example, the data from disjoint rovers/explorers need to be collected at the base station through a periodic relaying satellite [14]. Therefore, rateless codes have been considered in increasingly complicated SINs to provide an efficient distributed transmission scheme [15].

The first distributed LT (DLT) codes were proposed in [16]. The degree distribution for the distributed sources is obtained by decomposing the standard RSD, which suits a pre-fixed number of source nodes communicating with a single destination via a relay. In [17], the selective distributed LT (SDLT) code is proposed by applying And-Or tree analysis and linear programming, which can find an optimal combination at the relay node for an arbitrary number of sources. In [18], a soliton-like rateless coding (SLRC) scheme is designed for a Y-network; it provides degree distributions that generate LT-like output symbols at a relay with a simple network coding protocol. Monte Carlo simulations showed that the SLRC outperforms the DLT and SDLT. In [19], an improved SLRC approach is proposed for the relay buffer-limited situation to ensure more effective decoding. In [20], the SIN is considered in the scenario where direct links and relay links both exist, and a degree distribution optimization scheme based on And-Or tree analysis is proposed. Building on [20], this paper presents a preliminary investigation of rateless network coding (NC) for a dynamic relay topology, with the aim of increasing the system throughput.

Furthermore, there are several scenarios in which conventional rateless codes cannot perform optimally because they lack unequal error protection (UEP). For example, when transmitting the data blocks of discrete wavelet-transform encoded images in SIN, the lower-frequency parts of the data blocks are more important than the higher-frequency parts. Thus, it is desirable to use rateless codes with UEP to protect the important parts. The first rateless coding scheme with the UEP property is proposed in [21], where message symbols are assigned two different weights according to their importance levels. In [22], expanding window fountain codes are proposed, which generate output symbols only from message symbols within a certain window. Two overlapping and expanding windows are pre-designed such that the smaller window contains the important message symbols and the larger window contains all the symbols. A distributed rateless code with the UEP property is proposed in [23] for a Y-network; it can provide different data importance levels with different error probabilities for two sources. The generalized UEP rateless code (GURC) for distributed relay networks is proposed in [24], and the relationship between the UEP property and the decoding error rate (DER) of LT codes is obtained by And-Or tree analysis [25]. However, the UEP property for multiple source nodes and original data in a dynamic SIN topology has not yet been sufficiently studied.

A relay has its own orbit around the mission planet, which makes the links between the landed rovers and the relay only periodically available. Since the space explorers have limited energy, broadcasting is prohibited; thus, the rovers cannot communicate with the relay and the destination simultaneously. In this paper, to improve the throughput of multimedia services in future SIN communications, we propose a novel distributed UEP rateless coding (DURC) scheme for multimedia data transmission in a multi-access relaying SIN, which can achieve a lower DER under pre-selected UEP parameters. Specifically, the RC degree distributions and network coding rules are designed to match the durations of the link access conditions.

The rest of the paper is organized as follows. In Section 2, we present the system model and our DURC scheme. In Section 3, we derive the asymptotic performance based on And-Or tree analysis and optimize the degree distributions and network coding rules by using multi-objective programming. In Section 4, we employ NSGA-II to design DURC codes and evaluate the performance of the DURC under different channel conditions. Finally, we conclude the paper in Section 5.

## 2. System Model

We consider a communication scenario in SIN as shown in Figure 1: two disjoint exploration rovers with sources ${s}_{1}$ and ${s}_{2}$, which hold data blocks of ${k}_{1}$ and ${k}_{2}$ input symbols, respectively. Let ${S}_{1}$ and ${S}_{2}$ denote the sets of input symbols of ${s}_{1}$ and ${s}_{2}$, respectively. ${S}_{1}$ and ${S}_{2}$ transmit to a base station $D$ with the assistance of a periodically moving relay satellite/orbiter $R$. Due to its periodic motion, $R$ has limited access time, and the source nodes know the access period in the SIN. Without loss of generality, the output symbols in Figure 1 are transmitted over binary erasure channels (BEC), and the erasure probabilities between the four nodes are denoted by ${\epsilon}_{ij}$, where $i\in \{1,2,R\}$ and $j\in \{R,D\}$. Note that the direct links are of much worse quality than the relay links in SIN, i.e., $\{{\epsilon}_{1D},{\epsilon}_{2D}\}\gg \{{\epsilon}_{1R},{\epsilon}_{2R},{\epsilon}_{RD}\}$. We define the relay links ${S}_{1}-R$, ${S}_{2}-R$, and $R-D$ (the Y-network in Figure 1) as primary links, and the direct links from the sources to the destination, ${S}_{1}-D$ and ${S}_{2}-D$, as secondary links.

We define each period of the relay in the SIN as a transmission session, and each transmission session period is divided into two Phases.

In Phase 1, $R$ is invisible to ${S}_{1}$, ${S}_{2}$, and $D$; ${S}_{1}$ and ${S}_{2}$ then perform LT coding over their information sets ${s}_{1}$ and ${s}_{2}$ with degree distributions ${\mathrm{\Omega}}_{1}(x)$ and ${\mathrm{\Omega}}_{2}(x)$, respectively, and transmit the coded symbols on the secondary links ${S}_{1}-D$ and ${S}_{2}-D$ to the destination $D$. In Phase 2, once the primary links ${S}_{1}-R$ and ${S}_{2}-R$ are available, the secondary links are closed immediately to save energy. ${S}_{1}$ and ${S}_{2}$ perform LT coding over their information sets ${s}_{1}$ and ${s}_{2}$ with degree distributions ${\mathrm{\Psi}}_{1}(x)$ and ${\mathrm{\Psi}}_{2}(x)$, respectively, and transmit the coded symbols to the relay $R$; rateless NC is performed at $R$, and the resulting symbols are transmitted to $D$ on $R-D$. The connections and durations are illustrated in Figure 2. The numbers of coded symbols transmitted on each link in Phase 1 and Phase 2 are denoted by ${N}_{1}$ and ${N}_{2}$, respectively. For this dynamic SIN model, the details of our DURC scheme are as follows:

Initialization: Suppose that the information symbol lengths of ${s}_{1}$ and ${s}_{2}$ are ${k}_{1}$ and ${k}_{2}$, respectively. The ${k}_{m}$ ($m\in \{1,2\}$) symbols are divided into $n$ subsets according to their importance levels, expressed as ${I}_{m1},{I}_{m2},\ldots ,{I}_{mn}$, with ${k}_{m}={\sum}_{i=1}^{n}{I}_{mi}$. The fraction of the ${I}_{mi}$ information symbols in ${k}_{m}$ is ${\pi}_{mi}$, with ${\sum}_{i=1}^{n}{\pi}_{mi}=1$. ${S}_{m}$ selects the $i$-th importance level, i.e., subset ${I}_{mi}$, with probability ${w}_{mi}$, which is called the symbol-selection weight [21]. ${S}_{m}$ employs an LT-coding degree distribution ${\mathrm{\Omega}}_{m}(x)={\sum}_{{d}_{m}=1}^{{D}_{m}}{\mathrm{\Omega}}_{m,{d}_{m}}{x}^{{d}_{m}}$ in Phase 1 and ${\mathrm{\Psi}}_{m}(x)={\sum}_{{d}_{m}=1}^{{D}_{m}}{\mathrm{\Psi}}_{m,{d}_{m}}{x}^{{d}_{m}}$ in Phase 2, where ${D}_{m}$ denotes a pre-selected maximum value of ${d}_{m}$.

Phase 1: $R$ is invisible, and ${S}_{1}$ and ${S}_{2}$ generate distributed rateless coded symbols from their ${k}_{m}$ information symbols using the LT-coding degree distribution ${\mathrm{\Omega}}_{m}(x)$ and transmit ${N}_{1}$ coded symbols to $D$ over the secondary links. In the encoding process at ${S}_{m}$, a degree ${d}_{m}$ is randomly selected with probability ${\mathrm{\Omega}}_{m,{d}_{m}}$ according to the degree distribution ${\mathrm{\Omega}}_{m}(x)={\sum}_{{d}_{m}=1}^{{D}_{m}}{\mathrm{\Omega}}_{m,{d}_{m}}{x}^{{d}_{m}}$; then ${d}_{m}$ information symbols are selected uniformly at random and bitwise XORed to form the coded symbol.

Phase 2.1: $R$ is visible, and ${S}_{1}$ and ${S}_{2}$ generate distributed UEP rateless coded symbols from their information symbols using the LT-coding degree distribution ${\mathrm{\Psi}}_{m}(x)$ and transmit ${N}_{2}$ coded symbols to $R$ over the primary links. In the encoding process at ${S}_{m}$, a degree ${d}_{m}$ is randomly selected with probability ${\mathrm{\Psi}}_{m,{d}_{m}}$ according to the degree distribution ${\mathrm{\Psi}}_{m}(x)={\sum}_{{d}_{m}=1}^{{D}_{m}}{\mathrm{\Psi}}_{m,{d}_{m}}{x}^{{d}_{m}}$; then ${d}_{m}$ information symbols are selected, each symbol in ${I}_{mi}$ being chosen with probability $\frac{{w}_{mi}}{{\pi}_{mi}{k}_{m}}$, and bitwise XORed to form the coded symbol.

Phase 2.2: The coded symbols are transmitted from ${S}_{1}$ and ${S}_{2}$ to the relay $R$. Based on the network coding rule $P=\{{p}_{1},{p}_{2},{p}_{3}\}$ with ${\sum}_{i=1}^{3}{p}_{i}=1$, $R$ generates three types of network coded symbols and forwards them to the destination $D$: $R$ forwards ${S}_{1}$'s output symbol directly with probability ${p}_{1}$, forwards ${S}_{2}$'s output symbol directly with probability ${p}_{2}$, and XORs the two incoming symbols with probability ${p}_{3}$ before forwarding the result to $D$.

Decoding: After receiving enough coded symbols from ${S}_{1}$, ${S}_{2}$, and $R$, joint decoding is performed at $D$ to recover the information symbols of ${S}_{1}$ and ${S}_{2}$.
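The encoding and relaying steps above can be sketched in Python as follows. This is a minimal illustration rather than the implementation used for the results in this paper. All function names are hypothetical, symbols are modelled as 8-bit integers, the symbol-selection weights `weights = [0.7, 0.3]` are made-up values, and redrawing repeated picks so that the ${d}_{m}$ neighbours are distinct is an assumption. The degree distributions and the rule $P$ are the eep values reported in Section 4, reused here only as placeholders.

```python
import random

def sample_degree(dist, rng):
    """Draw a degree d with probability dist[d] (dist maps degree -> probability)."""
    degrees = list(dist)
    return rng.choices(degrees, weights=[dist[d] for d in degrees], k=1)[0]

def xor_all(data, indices):
    """Bitwise XOR of the selected information symbols."""
    value = 0
    for i in indices:
        value ^= data[i]
    return value

def encode_uniform(data, dist, rng):
    """Phase 1: pick d symbols uniformly at random and XOR them."""
    d = min(sample_degree(dist, rng), len(data))
    idx = sorted(rng.sample(range(len(data)), d))
    return idx, xor_all(data, idx)

def encode_uep(data, classes, weights, dist, rng):
    """Phase 2.1: pick class i with probability w_i, then a symbol uniformly
    inside it, so one symbol of class i has probability w_i / (pi_i * k).
    Assumption: repeated picks are redrawn so the d neighbours are distinct."""
    reachable = sum(len(c) for c, w in zip(classes, weights) if w > 0)
    d = min(sample_degree(dist, rng), reachable)
    chosen = set()
    while len(chosen) < d:
        cls = rng.choices(classes, weights=weights, k=1)[0]
        chosen.add(rng.choice(cls))
    idx = sorted(chosen)
    return idx, xor_all(data, idx)

def relay_combine(sym1, sym2, rule, rng):
    """Phase 2.2: forward S1's symbol (p1), S2's symbol (p2), or their XOR (p3).
    Returns (neighbours in S1, neighbours in S2, value)."""
    (idx1, val1), (idx2, val2) = sym1, sym2
    kind = rng.choices((1, 2, 3), weights=rule, k=1)[0]
    if kind == 1:
        return idx1, [], val1
    if kind == 2:
        return [], idx2, val2
    return idx1, idx2, val1 ^ val2

rng = random.Random(0)
k = 1000
data1 = [rng.getrandbits(8) for _ in range(k)]
data2 = [rng.getrandbits(8) for _ in range(k)]
classes = [list(range(0, 500)), list(range(500, k))]  # pi_m1 = pi_m2 = 0.5
weights = [0.7, 0.3]  # hypothetical symbol-selection weights w_m1, w_m2
omega = {1: 0.054, 2: 0.946}  # Phase 1 distribution from Section 4
psi = {1: 0.0111, 2: 0.4944, 3: 0.1787, 5: 0.1653,
       6: 0.0053, 12: 0.0978, 50: 0.0474}  # Phase 2 distribution from Section 4
rule = (0.045, 0.045, 0.91)  # relay rule P = {p1, p2, p3} from Section 4

direct_symbol = encode_uniform(data1, omega, rng)    # sent on S1-D in Phase 1
c1 = encode_uep(data1, classes, weights, psi, rng)   # sent on S1-R in Phase 2
c2 = encode_uep(data2, classes, weights, psi, rng)   # sent on S2-R in Phase 2
relayed = relay_combine(c1, c2, rule, rng)           # sent on R-D in Phase 2
print(len(direct_symbol[0]), len(c1[0]), len(c2[0]), relayed[2])
```

Keeping the neighbour indices of both sources in the relayed symbol is what allows the destination to decode the two sources jointly: an XORed symbol connects input symbols of ${S}_{1}$ and ${S}_{2}$ in the same decoding graph.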

Considering the erasure probabilities of the links, the expected numbers of coded symbols successfully received at $D$ in Phase 1 and Phase 2 can be expressed as ${n}_{1}={N}_{1}(1-{\epsilon}_{1D})$, ${n}_{2}={N}_{1}(1-{\epsilon}_{2D})$, and ${n}_{NC}={N}_{2}(1-{\epsilon}_{RD})$, where ${n}_{1}$ and ${n}_{2}$ are the numbers of coded symbols received on the ${S}_{1}-D$ and ${S}_{2}-D$ links in Phase 1, respectively, and ${n}_{NC}$ is the number of coded symbols received on the link $R-D$ in Phase 2.
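As a quick numerical check of these expressions, the sketch below evaluates them for the symmetric setting used in Section 4 (${N}_{1}={N}_{2}=1200$, ${\epsilon}_{1D}={\epsilon}_{2D}=0.5$, ${\epsilon}_{RD}=0.1$) and compares ${n}_{NC}$ with one simulated BEC realization. The helper names and the random seed are assumptions made for illustration.

```python
import random

def expected_received(n_sent, erasure_prob):
    """Expected number of symbols that survive a BEC: N * (1 - epsilon)."""
    return n_sent * (1.0 - erasure_prob)

def simulate_bec(n_sent, erasure_prob, rng):
    """Count survivors when each symbol is erased independently."""
    return sum(1 for _ in range(n_sent) if rng.random() >= erasure_prob)

N1, N2 = 1200, 1200
eps_1D, eps_2D, eps_RD = 0.5, 0.5, 0.1

n1 = expected_received(N1, eps_1D)     # S1-D link, Phase 1
n2 = expected_received(N1, eps_2D)     # S2-D link, Phase 1
n_nc = expected_received(N2, eps_RD)   # R-D link, Phase 2

rng = random.Random(1)
sim_nc = simulate_bec(N2, eps_RD, rng)  # one random realization on R-D
print(n1, n2, n_nc, sim_nc)
```

With these parameters, the destination expects about 600 symbols from each direct link and about 1080 network coded symbols from the relay, so most of the decoding information arrives over the primary links.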

## 4. Simulation and Comparison

Let us investigate the DURC parameters under a totally symmetric network model, where the block lengths are ${N}_{1}={N}_{2}=1200$, the information symbol lengths are ${k}_{1}={k}_{2}=1000$, and the erasure probabilities of the channel links are ${\epsilon}_{1D}={\epsilon}_{2D}=0.5$, ${\epsilon}_{1R}={\epsilon}_{2R}=0$, and ${\epsilon}_{RD}=0.1$. We set the maximum degree to ${D}_{1}={D}_{2}=50$, the desired total overhead to ${\gamma}^{*}=1.1$, the proportion of MIB in every source to ${\pi}_{m1}=0.5$, and the decoding performance relationship between MIB and LIB to ${I}_{m,1}\ge 10{I}_{m,2}$. To obtain the optimized degree distributions ${\mathrm{\Psi}}_{1}(x)$ and ${\mathrm{\Psi}}_{2}(x)$, we solve MOP1 and obtain a Pareto front of the optimized ${y}_{l,1,2}$ and ${y}_{l,2,2}$. We plot the Pareto fronts obtained from our optimizations in Figure 4, where $\eta ={y}_{l,2,2}/{y}_{l,1,2}$. Clearly, the protection of ${S}_{1}$ increases as $\eta$ increases.

The partial optimization results are shown in Table 1. Furthermore, we select an equal error protection (eep) degree distribution from the set of our optimized DURC schemes, i.e., $\eta =1$ and $n=1$. The optimized ${\mathrm{\Psi}}_{1}(x)$ and ${\mathrm{\Psi}}_{2}(x)$ at the sources are then identical: ${\mathrm{\Psi}}_{1}(x)={\mathrm{\Psi}}_{2}(x)=0.0111x+0.4944{x}^{2}+0.1787{x}^{3}+0.1653{x}^{5}+0.0053{x}^{6}+0.0978{x}^{12}+0.0474{x}^{50}$. We then substitute ${\mathrm{\Psi}}_{m}(x)$ into the MOP problem (12) with a desired total overhead ${\gamma}^{*}=1.1$ to solve for the RC degree distributions on the secondary links, ${\mathrm{\Omega}}_{1}(x)={\mathrm{\Omega}}_{2}(x)=0.054x+0.946{x}^{2}$, and the NC relaying probabilities, ${p}_{1}=0.045$, ${p}_{2}=0.045$, and ${p}_{3}=0.91$. We use this eep-DURC scheme for comparison with three existing distributed rateless coding schemes in the same network model as described above. In the store-and-forward (SF) scheme, the RC degree distributions on the secondary and primary links are both set to the classical degree distribution used for Raptor codes in [9], and the relay node $R$ randomly forwards the received coded symbols from ${S}_{1}$ and ${S}_{2}$ with equal probability. In the simple network coding (XOR) scheme, $R$ always sends to $D$ a new coded symbol generated by XORing the two coded symbols from ${S}_{1}$ and ${S}_{2}$. The SLRC scheme uses the network coding relay protocol of [18]: the relay node $R$ only forwards coded symbols of degree 1 and degree 2 with a threshold probability, and the RC degree distributions on the secondary and primary links are again the Raptor degree distributions.
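The reported eep-DURC parameters can be sanity-checked with a few lines of Python. The sketch below only verifies that the coefficients of ${\mathrm{\Psi}}_{m}(x)$ and ${\mathrm{\Omega}}_{m}(x)$ and the relaying probabilities each sum to one, and computes the average degrees ${\mathrm{\Psi}}_{m}^{\prime}(1)$ and ${\mathrm{\Omega}}_{m}^{\prime}(1)$; the helper names are hypothetical.

```python
# Degree distributions as {degree: probability}, copied from the text above.
psi = {1: 0.0111, 2: 0.4944, 3: 0.1787, 5: 0.1653,
       6: 0.0053, 12: 0.0978, 50: 0.0474}
omega = {1: 0.054, 2: 0.946}
relay_rule = (0.045, 0.045, 0.91)  # (p1, p2, p3)

def total_mass(dist):
    """Sum of the coefficients, i.e., the polynomial evaluated at x = 1."""
    return sum(dist.values())

def mean_degree(dist):
    """Average degree, i.e., the derivative of the polynomial at x = 1."""
    return sum(d * p for d, p in dist.items())

print(f"sum(Psi)={total_mass(psi):.4f}  mean degree={mean_degree(psi):.4f}")
print(f"sum(Omega)={total_mass(omega):.4f}  mean degree={mean_degree(omega):.4f}")
print(f"sum(P)={sum(relay_rule):.4f}")
```

All three sets of probabilities are properly normalized. The average degree is roughly 5.94 for ${\mathrm{\Psi}}_{m}(x)$ and roughly 1.95 for ${\mathrm{\Omega}}_{m}(x)$, so the symbols sent over the lossy secondary links are deliberately of low degree.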

Figure 5a shows the DER versus the total overhead for various distributed rateless coding (DRC) schemes, obtained from the asymptotic performance given by And-Or tree analysis. The XOR scheme has the worst decoding performance because $D$ receives too few low-degree symbols (degrees 1 and 2). The SLRC scheme performs better than the SF scheme when the overhead is larger than 1.15. The DER of the eep-DURC scheme, which is the basis of our proposed UEP scheme, is clearly the lowest.
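As background on how such asymptotic curves are computed, the sketch below iterates the standard And-Or tree recursion for a single LT code received over one hop, ${y}_{l}=\exp (-\gamma {\mathrm{\Omega}}^{\prime}(1-{y}_{l-1}))$ with ${y}_{0}=1$, where $\gamma$ is the ratio of received coded symbols to input symbols. This is a simplified textbook form and an assumption on our part for illustration; it is not the DURC analysis of Section 3, whose expressions are not reproduced here. The function names are hypothetical, and the eep ${\mathrm{\Psi}}_{m}(x)$ from this section is used only as example input.

```python
import math

def degree_poly_derivative(dist, x):
    """Omega'(x) for the degree polynomial sum_d dist[d] * x**d."""
    return sum(d * p * x ** (d - 1) for d, p in dist.items())

def and_or_error(dist, overhead, iterations=200):
    """Iterate y_l = exp(-overhead * Omega'(1 - y_{l-1})) starting from y_0 = 1.

    y_l approximates the probability that an input symbol is still
    unrecovered after l decoding iterations (single-hop LT code only).
    """
    y = 1.0
    for _ in range(iterations):
        y = math.exp(-overhead * degree_poly_derivative(dist, 1.0 - y))
    return y

psi = {1: 0.0111, 2: 0.4944, 3: 0.1787, 5: 0.1653,
       6: 0.0053, 12: 0.0978, 50: 0.0474}

for gamma in (0.9, 1.0, 1.1, 1.2, 1.5):
    print(f"overhead={gamma:.1f}  asymptotic error={and_or_error(psi, gamma):.3e}")
```

Because ${\mathrm{\Omega}}^{\prime}(x)$ is non-decreasing on $[0,1]$, the error returned by this recursion never increases with the overhead, which matches the general shape of DER-versus-overhead curves.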

Figure 5b shows the DER versus the total overhead for various UEP schemes. The DURC schemes achieve the lowest DER for both MIB and LIB, which benefits information transmission in SIN.

Figure 6a shows the DER versus the total overhead for the information within one source node under the DURC and eep-DURC schemes, obtained from the And-Or tree performance evaluation. The DURC scheme achieves excellent MIB decoding performance, about two orders of magnitude better than that of the LIB, whose performance decreases; the MIB therefore receives greater protection than under the eep-DURC scheme.

Figure 6b shows the LIB performance of the different sources for $\eta =10,{10}^{2},{10}^{3},{10}^{4}$. The results show that when the decoding performance of ${S}_{1}$ increases, the performance of the related ${S}_{2}$ decreases; that is, a performance gain for one source comes at the price of a performance loss for the other. Therefore, only the desired parameters need to be set for the proposed scheme to satisfy different needs.

We also estimate the throughput performance of the DURC scheme. Substituting the parameter settings of this section into (10), we can derive the system throughput upper bound as

Figure 7 shows the system throughput versus the erasure probability of channel $R-D$ for the information in one source and for eep-DURC. To match the system setting of the primary and secondary links, we set $\epsilon \le 0.5$.

The results in Figure 7 show the same trend as Figure 6a: the throughput of MIB is much higher than that of eep-DURC, whereas the throughput of LIB is lower. However, for systems that need higher protection of MIB, this tradeoff is worthwhile.