1. Introduction
Conventional scalar linear network coding (LNC) models the data symbols transmitted along a network over the finite field GF($2^L$) and linearly combines them with coding coefficients selected from GF($2^L$). Vector LNC [1,2,3] is an extension of conventional scalar LNC. It models the data symbols over the L-dimensional binary vector space GF(2)$^L$ and linearly combines data packets with coding coefficients selected from $L \times L$ matrices over GF(2). Compared to scalar LNC, which provides $2^L$ field elements as candidates for coding coefficient selection, a key advantage of vector LNC is that it expands the number of candidates to $2^{L^2}$. This expansion enhances coding flexibility. In particular, vector LNC subsumes scalar LNC in the sense that, in a single-source multicast network, every scalar LNC scheme over GF($2^L$) can be transformed into a vector LNC scheme over GF(2)$^L$, so that the scalar scheme achieves the network’s multicast capacity if and only if the counterpart vector scheme does (see, e.g., [2,3]). In addition, Ref. [1] constructed a classical non-multicast network whose capacity can be achieved by a simple vector LNC scheme but cannot be achieved by scalar LNC schemes over any finite field. Various single-source multicast networks have also been constructed in [3,4,5,6] to demonstrate that the size of data symbols required by vector LNC to achieve the multicast capacity can be much smaller than that required by scalar LNC. Under the framework of vector LNC, a class of LNC schemes called circular-shift LNC was introduced in [7,8]. It models coding coefficients based on circulant matrices so as to attain lower coding complexity than scalar LNC.
Random LNC (RLNC) [9] is an important type of LNC scheme in which the coding coefficients are selected in a randomized manner. It has demonstrated significant potential to improve transmission efficiency and throughput in wireless broadcasts [8,10,11,12,13,14,15], a popular transmission scenario with various applications (e.g., vehicle-to-everything communication [16]). Despite the advantages of vector LNC reviewed in the previous paragraph, most studies of RLNC in wireless broadcasts focus on scalar RLNC. In particular, it is well known that, with increasing L, primitive scalar RLNC over GF($2^L$), where “primitive” means employing independently and uniformly distributed coding coefficients from GF($2^L$), can achieve the optimal throughput performance. To the best of our knowledge, in the scenario of wireless broadcasts, existing research on vector RLNC has primarily focused on designing concrete vector RLNC schemes with low coding complexity [8,17]. However, for the most fundamental vector RLNC scheme, i.e., primitive vector RLNC over GF(2)$^L$, a theoretical characterization of its throughput performance remains unexplored. This work aims to fill this gap. Similar to [18,19,20,21,22,23,24,25], we consider systematic RLNC schemes, that is, the sender first broadcasts all original packets and then the coded packets. Following the approach in [8,19,22], we use completion delay as the core metric for evaluating throughput. This metric quantifies the total number of coded packets the sender must transmit until every receiver has successfully decoded all original packets.
The main contributions of this paper are summarized as follows.
For primitive vector RLNC over GF(2)$^L$, we derive closed-form characterizations of the probability distribution and the expected value of both the completion delay at a single receiver and the system completion delay. Numerical comparison validates our theoretical characterization, showing close agreement between theoretical and simulation results, particularly for sufficiently large L.
Unlike primitive scalar RLNC over GF($2^L$), which asymptotically attains the optimal completion delay as L increases, we reveal that, even for large enough L, primitive vector RLNC over GF(2)$^L$ fails to reach the optimal completion delay. Nevertheless, the gap between the expected completion delay at a receiver and the optimal one is shown to be bounded by a constant.
We reveal that, for primitive vector RLNC over GF(2)$^L$, the normalized expected completion delay per original packet asymptotically converges to its optimal value as P grows.
Our theoretical characterization of the completion delay performance of primitive vector RLNC provides a theoretical benchmark for the future design of vector RLNC schemes with different design goals.
The rest of this paper is structured as follows. Section 2 establishes the system model and reviews known results on perfect RLNC, a class of RLNC schemes that attain the optimal completion delay performance. Section 3 theoretically characterizes the probability distribution and the expected value of the completion delay of primitive vector RLNC over GF(2)$^L$. The theoretical characterization is numerically compared with simulation results in Section 4. The paper is concluded in Section 5.
2. Preliminaries
As illustrated in Figure 1, we consider a single-hop wireless broadcast network without feedback, where a single sender aims to deliver a total of P original packets to R receivers. Each packet contains M bits. During each timeslot, the sender transmits one packet, which is broadcast to all receivers. The communication link between the sender and every receiver is modeled as an independent memoryless erasure channel, where receiver r experiences a packet erasure probability of $p_r$. The goal for every receiver is to successfully recover all P original packets.
In this paper, all RLNC transmission strategies adopt a systematic structure, which has also been considered in [18,22]. Specifically, during the initial transmission phase, the sender sequentially broadcasts all P original packets. Following this, in the second phase, the sender transmits coded packets, each formed as a linear combination of the P original packets. This process continues until all receivers successfully recover the entire set of P original packets. The term completion delay refers to the number of coded packets sent during this second phase. As emphasized in prior studies on RLNC performance in wireless broadcast settings [8,19,22], completion delay serves as a fundamental indicator of transmission efficiency. Non-systematic codes usually incur higher completion delay than systematic codes [19]. This is because the first P packets transmitted by a non-systematic code are randomly coded packets rather than original packets, and are therefore not necessarily linearly independent.
The conventional scalar RLNC scheme operates over the finite field GF(
), where every packet is treated as a row vector composed of
symbols belonging to GF(
) (To simplify the exposition, we assume
L divides
M. In practical systems, since
, padding with dummy bits ensures this condition is met). In this scheme, every packet transmitted by the sender is constructed as a GF(
)-linear combination of the
P original packets
. In particular, during the second transmission phase, each coded packet
(for
) generated by the sender can be represented as
where each coding coefficient
is uniformly and independently drawn from GF(
).
To enable receivers to interpret how is generated from the original packets, a global encoding kernel is appended to the packet as a header. For the P original packets, their corresponding global encoding kernels form the identity matrix, i.e., . Once a receiver r successfully receives P packets with linearly independent global encoding kernels, it can recover the entire P original packets through decoding.
The RLNC schemes we consider in this paper are
vector RLNC schemes, which are defined over GF
. Each packet
of
M bits is interpreted as a row vector
comprising
symbols, where each symbol
is treated as an
L-bit row vector defined over GF(2). For an
matrix
over GF(2), the linear operation
is defined symbol-wise as
Hence, for every
, the coded packet
that is randomly produced by the sender during the second phase can be represented as
where the coding coefficients
are randomly and independently selected from
matrix over GF(2). We define the
global encoding kernel of a packet as a
matrix over GF(2). Every global encoding kernel can be viewed as a
block matrix, where each block corresponds to a coding coefficient represented by an
matrix over GF(2). For an original packet
,
, its global encoding kernel
contains the identity matrix
in the
block entry and
zero matrices in all other positions. Consequently,
. For a coded packet
, its global encoding kernel
is defined as
. For vector codes, a receiver can decode the
P original packets upon receiving any
P packets whose global encoding kernels are linearly independent (i.e., their concatenation has full rank
).
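As a minimal sketch of the vector encoding rule described above, the following Python code applies $L \times L$ binary coding coefficients to packets symbol by symbol and sums the results over GF(2). The function names, the list-of-bits data layout, and the convention that a symbol is a row vector multiplied on the right by the coefficient matrix are assumptions made for illustration, not notation taken from this paper.

```python
import random


def random_matrix(L, rng):
    """Uniformly random L x L binary matrix, stored as a list of rows."""
    return [[rng.randint(0, 1) for _ in range(L)] for _ in range(L)]


def symbol_times_matrix(symbol, K):
    """Row vector of L bits times an L x L matrix, with arithmetic over GF(2)."""
    L = len(symbol)
    return [sum(symbol[i] & K[i][j] for i in range(L)) % 2 for j in range(L)]


def packet_times_matrix(packet, K):
    """Apply K to every L-bit symbol of a packet (symbol-wise operation)."""
    return [symbol_times_matrix(s, K) for s in packet]


def xor_packets(a, b):
    """Bitwise sum over GF(2) of two packets with the same shape."""
    return [[x ^ y for x, y in zip(sa, sb)] for sa, sb in zip(a, b)]


def encode(originals, coeffs):
    """Coded packet: GF(2) sum over j of originals[j] times coeffs[j]."""
    coded = [[0] * len(s) for s in originals[0]]
    for packet, K in zip(originals, coeffs):
        coded = xor_packets(coded, packet_times_matrix(packet, K))
    return coded


if __name__ == "__main__":
    rng = random.Random(1)
    L, P, symbols_per_packet = 4, 3, 2   # hypothetical sizes, M = 8 bits
    originals = [[[rng.randint(0, 1) for _ in range(L)]
                  for _ in range(symbols_per_packet)] for _ in range(P)]
    coeffs = [random_matrix(L, rng) for _ in range(P)]
    print(encode(originals, coeffs))
```

With the identity matrix as the first coefficient and zero matrices elsewhere, `encode` returns the first original packet, which mirrors the structure of the global encoding kernel of an original packet.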
Perfect RLNC [8,22] is a class of RLNC schemes in which the encoded packets generated by the source node exhibit the strongest possible linear independence. In particular, a receiver is able to recover all
P original packets upon successfully receiving arbitrary
P perfect RLNC packets. Therefore, perfect RLNC is optimal in terms of completion delay, so it serves as an important benchmark scheme in the literature of RLNC in wireless broadcasts. For perfect RLNC, let
and
respectively denote the completion delay at single receiver
r and the system completion delay. It is known that
follows the negative binomial distribution with parameter
P and
[
8]. Consequently, the distribution of
follows
where
is the regularized incomplete beta function and is expressed as
Based on Equation (
4), we further have
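The negative binomial behaviour of perfect RLNC can be illustrated numerically. The sketch below works with the total number of transmissions T until a receiver has collected P packets, so the number of second-phase transmissions is T minus P. The function names and the variable `p` (erasure probability of one receiver) are illustrative assumptions, and the product form of the system distribution relies only on the independence of the erasure channels stated in the system model.

```python
from math import comb


def pmf_total_transmissions(t, P, p):
    """Pr(the P-th successful reception happens exactly at transmission t)."""
    if t < P:
        return 0.0
    return comb(t - 1, P - 1) * (1 - p) ** P * p ** (t - P)


def cdf_total_transmissions(t, P, p):
    """Pr(the receiver holds P packets after at most t transmissions)."""
    return sum(pmf_total_transmissions(k, P, p) for k in range(P, t + 1))


def expected_total_transmissions(P, p):
    """Mean of the negative binomial count of transmissions: P / (1 - p)."""
    return P / (1 - p)


def system_cdf(t, P, erasure_probs):
    """Pr(every receiver holds P packets after at most t transmissions)."""
    prob = 1.0
    for p in erasure_probs:
        prob *= cdf_total_transmissions(t, P, p)
    return prob


if __name__ == "__main__":
    P, p = 4, 0.2   # hypothetical parameters
    print(expected_total_transmissions(P, p))
    print(system_cdf(10, P, [0.2, 0.3]))
```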
3. Theoretical Analyses of Vector RLNC
Analogous to conventional scalar RLNC over GF(), in which coding coefficients are independently and uniformly selected from GF(), the most fundamental vector RLNC scheme selects coding coefficients independently and uniformly from all matrices over GF(2). Unless otherwise specified, such an RLNC scheme is referred to as primitive vector RLNC over . To the best of our knowledge, the completion delay performance of primitive vector RLNC has not been theoretically analyzed before. This section aims to address and fill in this blank.
For primitive vector RLNC over , let and respectively denote the completion delay at single receiver r and the system completion delay. The analysis of requires the following lemma. For , let be a randomly generated matrix over GF(2), in which every entry is independently and uniformly distributed over . Let represent the full rank probability of .
Lemma 1. The full rank probability of is given by In particular, when ,
Proof. Let
denote the
row of the binary matrix
. Assume that we build the random matrix
row by row. The probability for
to be full-rank is
. Under the assumption that the first l rows
are linearly independent, as there are
different GF(2)-linear combinations of
, the probability of the
row
being linearly independent of all previous l rows is
. Hence,
When
, the size of matrix
becomes
, and
When increases, decreases. Thus, when , is maximized and equal to . It can also be readily checked that as increases, converges to . Thus, . □
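The row-by-row argument in the proof can be checked numerically. The sketch below evaluates the standard product formula for the probability that a uniformly random binary matrix with `rows` rows and `cols` columns (`rows` at most `cols`) has full row rank. The function name and the parameter names are assumptions for illustration and do not reproduce the notation of Lemma 1.

```python
def full_rank_probability(rows, cols):
    """Probability that a uniform random rows x cols matrix over GF(2) has rank `rows`.

    Built row by row: given l independent rows, the next row must avoid
    the 2**l vectors in their span out of 2**cols possible vectors.
    """
    if rows > cols:
        raise ValueError("full row rank requires rows <= cols")
    prob = 1.0
    for l in range(rows):
        prob *= 1.0 - 2.0 ** (l - cols)
    return prob


if __name__ == "__main__":
    L = 8   # hypothetical symbol length
    for n_blocks in (1, 2, 4, 8):
        square = full_rank_probability(n_blocks * L, n_blocks * L)
        one_extra = full_rank_probability(n_blocks * L, (n_blocks + 1) * L)
        print(n_blocks, square, one_extra)
```

For square matrices the probability decreases toward roughly 0.2888 as the size grows, whereas a single extra block of L columns already pushes it close to 1 when L is moderately large.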
For scalar RLNC defined over GF(
), it has been widely recognized (see, e.g., [
8]) that, as
L grows, the expected completion delay asymptotically approaches the optimal value
, i.e., under the assumption of perfect RLNC.
On the contrary, as a consequence of Lemma 1, it turns out that a similar conclusion cannot be drawn for primitive vector RLNC over regardless of the choice of L.
Proposition 1. For primitive vector RLNC over , for any choice of L.
Proof. Perfect RLNC assumes that receiver r is able to recover the original P packets upon successfully receiving any P packets. If this condition holds for primitive RLNC over , then, for any P packets successfully received by r, the corresponding P global encoding kernels each of size , when concatenated column-wise, can form a full-rank matrix over GF(2), say . Among the P received packets, it is assumed that N are coded packets. As the global encoding kernels for original packets consist of unit column vectors, the full-rank matrix can be reduced to an full-rank matrix , in which every entry is randomly selected from GF(2). However, according to Lemma 1, for any L there is a nonzero probability that is not full rank, so that a contradiction is drawn. Hence, the perfect RLNC assumption does not hold for primitive vector RLNC for any L. □
Even though primitive vector RLNC over cannot asymptotically achieve the optimal expected completion delay with increasing L, based on the following characterization of the distribution of completion delay, we assert that primitive vector RLNC over asymptotically achieves the optimal expected completion delay with increasing P.
Throughout our theoretical analysis, we assume that L is large enough. This assumption is motivated by two considerations. First, it guarantees that the full-rank probability
is equal to 1 for all values of
P, as established in Equation (
8), thereby significantly simplifying our derivation of the completion delay distribution. Second, it ensures that the probabilities
and
remain effectively constant across different
P, which is essential for maintaining the accuracy of our analytical results. For instance, when
,
for
and
for
, and
for all
.
Theorem 1. For primitive vector RLNC over GF(2)$^L$, the distribution of the completion delay at a single receiver r is characterized as , and for ,
Theorem 2. For primitive vector RLNC over GF(2)$^L$, the expected completion delay at a single receiver is given by
Proof. The technical proof of Equation (12) based on Equation (11) is given in Appendix B. □
In addition to the technical derivation of
in
Appendix B, based on the distribution of
, we can also analytically characterize
based on the concept of negative binomial distribution and the full rank probability of
,
,
as follows.
First, it takes an average of
transmissions until receiver
r successfully receives
P packets. Assume among these
P successfully received packets,
are original packets received in the first transmission phase and
are coded packets received at the second transmission phase. Assume the case
, which happens with probability
. In this case, receiver
r is able to utilize
received coded packets to recover
missing original packets with probability
. Thus, with probability
, receiver
r needs to receive at least one more coded packet, after an average of
transmissions. Upon receiving
coded packets, the probability of receiver
r to recover
missing original packets is
. Consequently, with probability
, receiver
r needs to receive an extra coded packet, after an average of
transmissions. Upon receiving
coded packets, the probability of receiver
r to recover
missing original packets is
, which is equal to 1 under the assumption that
L is large enough. To sum up, the expected completion delay at receiver
r can be characterized as
It can be readily checked that Equation (
13) and Equation (
12) are mathematically equivalent representations of the expected completion delay
.
Remark 1. Under the assumption that L is large enough, our theoretical characterization of the expected completion delay for primitive vector RLNC over is invariant to L. This invariance results in a fixed gap in completion delay between primitive vector RLNC and perfect RLNC, even when L increases to infinity. Despite this, the next corollary asserts that the expected completion delay normalized by P asymptotically approaches the optimal value with increasing P.
Corollary 1. For primitive vector RLNC over (for large enough L), is upper bounded by Consequently, with increasing P, Proof. According to Equation (
12),
is upper bounded by
. Since we assume L is large enough,
. We thus have
, which is the upper bound of
for primitive vector RLNC over
. Equation (
15) is a direct consequence of (
14). □
Remark 2. The corollary above asserts that, for primitive vector RLNC over GF(2)$^L$ with large enough L, regardless of the packet erasure probability , a receiver only needs to successfully receive packets on average to recover all P original packets, where is derived from . In comparison, it has been proven that, for RLNC over GF(2) (which is equivalent to primitive vector RLNC over GF(2)), a receiver needs to successfully receive at most packets on average to recover all P original packets [26]. This signifies that the assumption of larger L allows us to deduce a tighter upper bound on .
Proposition 2. For vector RLNC over GF(2)$^L$, the distribution of system completion delay D is given by where is explicitly characterized in Theorem 1. Based on the distribution of D, the expected system completion delay D can be characterized as
This section theoretically analyzed the distribution as well as the expectation of completion delay for primitive vector RLNC over GF(2)$^L$, which serves as a benchmark scheme for the future study of vector RLNC.
4. Numerical Validation
In this section, we present the numerical results of primitive vector RLNC schemes over , with L respectively set as , and compare the simulation results to the theoretical derivations obtained in the previous section. The erasure probability at each receiver is fixed at . The completion delay performance depicted in this section is normalized by the packet number P. In the figure legend, theoretical results are labeled as “theo” and simulation results are labeled as “simu”.
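A simulation of this kind can be sketched as follows for a single receiver. The sketch relies on the reduction used in the proof of Proposition 1: if the receiver missed N original packets, every received coded packet contributes L uniformly random binary vectors of length NL, and decoding succeeds once these vectors have rank NL. The function names, the bitmask representation, and the parameter values in the usage lines are assumptions for illustration and are not the settings used for the figures in this paper.

```python
import random


def add_to_basis(basis, v):
    """Reduce bit-vector v (an int) against a GF(2) basis keyed by leading bit.

    Returns True if v was independent of the basis and has been added.
    """
    while v:
        h = v.bit_length() - 1
        if h not in basis:
            basis[h] = v
            return True
        v ^= basis[h]
    return False


def simulate_delay(P, L, p, rng):
    """Second-phase transmissions until one receiver can decode all P packets."""
    missing = sum(rng.random() < p for _ in range(P))  # originals erased in phase 1
    need = missing * L          # rank required to recover the missing originals
    basis = {}
    sent = 0
    while len(basis) < need:
        sent += 1
        if rng.random() < p:
            continue            # this coded packet is erased
        for _ in range(L):      # L random vectors per received coded packet
            add_to_basis(basis, rng.getrandbits(need))
    return sent


def average_delay(P, L, p, trials, seed=0):
    """Monte Carlo estimate of the mean second-phase completion delay."""
    rng = random.Random(seed)
    return sum(simulate_delay(P, L, p, rng) for _ in range(trials)) / trials


if __name__ == "__main__":
    P, p = 8, 0.3               # hypothetical parameters
    for L in (1, 2, 4, 8):
        print(L, average_delay(P, L, p, trials=2000) / P)
```

Because the delay here counts only second-phase transmissions, the corresponding baseline for perfect RLNC under the same convention is P·p/(1−p) on average.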
Recall that, in the previous section, the theoretical analysis is based on the assumption that
L is large enough so that
, defined in Equation (
8), is always equal to 1 and the value of
and of
stays approximately the same, regardless of the choice of
P. In this section, when
, we directly adopt Equation (
11) in Theorem 1 to calculate the distribution of
. When
,
cannot be assumed to be 1 anymore; as a direct extension of (
11), we adopt the following equation to calculate the distribution of
to make the theoretical characterization more accurate:
and
and
are respectively calculated based on
and Equation (
17).
Figure 2 plots the average completion delay per packet at receiver
r based on the theoretical and simulation results of primitive vector RLNC schemes over different
. The expected completion delay
at receiver
r of perfect RLNC is also depicted for comparison. From this figure, the following observations can be made. First, the average completion delay at receiver
r of every primitive vector RLNC scheme decreases with increasing
P. Second, when
, there is no visually distinguishable difference between the theoretical values of
characterized in Equation (
12) and the numerical results. This validates the correctness of our theoretical derivation in Theorem 1 based on the assumption of large enough
L. In contrast, when
and
, the theoretical value of
in Equation (
12) is higher than the actual simulation value. This is because, without the assumption that
L is large enough, the theoretical value
we adopted in Theorem 1 is smaller than the actual probability of a random
matrix being full-rank, which makes the distribution
characterized in Equation (
18) smaller than the actual one. However, with increasing
P, the theoretical value coincides with the simulation one. Furthermore,
Figure 2 reveals that, when
, the increase of
L will no longer lower the expected completion delay
of primitive vector RLNC over
, while there is a noticeable gap between
and
of perfect RLNC. This observation is in line with the discussion in Remark 1. According to Equation (
12), when
P continues to increase,
converges asymptotically to
, and thus has the asymptotical optimal completion delay performance. This finding aligns with the conclusion of (
15) in Corollary 1. Last,
Table 1 illustrates that, when
, the expected completion delay per packet at single receiver
r is lower than the theoretical upper bound computed via Equation (
14), thereby validating the accuracy of the upper bound in Equation (
14).
Theoretical and simulation results for the average system completion delay of primitive vector RLNC schemes over
are compared in
Figure 3 under different parameter settings. The number of receivers is set to be 20. As observed from
Figure 3, the performance gap in completion delay among primitive vector RLNC schemes with different
L diminishes as
P increases. Moreover, when
, the increase of
L will not lower the expected completion delay
of primitive vector RLNC over
any more. Furthermore, when
P continues growing, all primitive vector RLNC schemes asymptotically approach the performance of perfect RLNC. Consistent with the observation from Figure 2, Figure 3 demonstrates that, when
, there is no visually distinguishable difference between the theoretical characterization of
in Proposition 2 and the simulation results. For the case
, the theoretical characterization of
becomes more accurate as
P increases, because
is deduced based on the distribution of
in Equation (
18), which is smaller than the actual value but converges to the actual one with increasing
P.