Three Efficient All-Erasure Decoding Methods for Blaum–Roth Codes

Weijie Zhou; Hanxu Hou

doi:10.3390/e24101499

¹ School of Computer Science and Technology, Dongguan University of Technology, Dongguan 523820, China

² School of Electrical Engineering and Intelligentization, Dongguan University of Technology, Dongguan 523820, China

* Author to whom correspondence should be addressed.

† These authors contributed equally to this work.

Entropy 2022, 24(10), 1499; https://doi.org/10.3390/e24101499

This article belongs to the Special Issue Information Theory and Network Coding II


Abstract

Blaum–Roth codes are binary maximum distance separable (MDS) array codes over the binary quotient ring

F_{2} [x] / (M_{p} (x))

, where

M_{p} (x) = 1 + x + \dots + x^{p - 1}

, and p is a prime number. Two existing all-erasure decoding methods for Blaum–Roth codes are the syndrome-based decoding method and the interpolation-based decoding method. In this paper, we propose a modified syndrome-based decoding method and a modified interpolation-based decoding method that have lower decoding complexity than the syndrome-based decoding method and the interpolation-based decoding method, respectively. Moreover, we present a fast decoding method for Blaum–Roth codes based on the LU decomposition of the Vandermonde matrix that has a lower decoding complexity than the two modified decoding methods for most of the parameters.

Keywords:

distributed storage; Blaum–Roth codes; all-erasure decoding; decoding complexity

1. Introduction

Redundancy is necessary in storage systems in order to provide high data reliability in case of disk failures [1]. Replication and erasure codes are the two main ways of introducing redundancy. In replication, the data on one disk are copied to multiple disks, and the storage system replaces erased disks with their copies. Replication repairs erased disks quickly but requires a lot of storage space. In contrast, erasure codes provide higher data reliability at a small storage cost.

Maximum distance separable (MDS) codes [2] are typical erasure codes that achieve the optimal tradeoff between storage cost and data reliability, i.e., they achieve the minimum storage cost for a given level of data reliability. Binary MDS codes are special MDS codes that have lower computational complexity in the encoding/decoding procedures, since only XORs and cyclic-shift operations are involved. Some existing constructions of binary MDS codes are EVENODD codes [3,4], RDP codes [5], and X-codes [6,7], which can correct any two-column (we use "column" and "disk" interchangeably in this paper) erasures. RTP codes [8], Star codes [9,10], and extended EVENODD codes [11,12,13,14] can correct any three-column erasures. With the rapid increase in the data scale in storage systems [15], we need to design binary MDS codes that can correct any number of erasures, together with efficient encoding/decoding methods. Graftage codes [16] can achieve various tradeoffs between storage and repair bandwidth, while we focus on efficient decoding methods for binary MDS codes. Blaum–Roth codes [17] are codes of this type and are designed over the ring

R_{p} = F_{2} [x] / (M_{p} (x))

, where

M_{p} (x) = 1 + x + \dots + x^{p - 1}

, and p is a prime number.

To recover erased columns, the syndrome-based decoding method [17] and the interpolation-based decoding method [18] have been proposed. These decoding methods [17,18] use three basic operations over the ring

R_{p}

: (i) addition, (ii) multiplication of a power of x and a polynomial, and (iii) division by a factor

1 + x^{b}

with

1 \leq b \leq p - 1

. It is shown in [17,18] that we can first perform operations (i) and (ii) modulo

1 + x^{p}

and then reduce the results modulo

M_{p} (x)

, while operation (iii) in [17,18] is performed directly modulo

M_{p} (x)

.

In this paper, we show that we can also compute operation (iii) modulo

1 + x^{p}

, which has lower computational complexity than computing it modulo

M_{p} (x)

. We propose modified versions of the two existing decoding methods [17,18] that have lower decoding complexity than the original methods by computing operation (iii) modulo

1 + x^{p}

instead of modulo

M_{p} (x)

. The reason our modified decoding methods have much lower decoding complexity than the decoding methods [17,18] is twofold. First, all the operations in our decoding methods are performed modulo

1 + x^{p}

, while the existing decoding methods execute the divisions modulo

M_{p} (x)

. Second, we propose new algorithms in the decoding procedure to reduce the number of operations. Please refer to Section 3 for our two modified decoding methods. Moreover, the efficient LU decoding method [19] proposed for decoding extended EVENODD codes can also be employed to recover the erased columns of Blaum–Roth codes. We show that the LU decoding method has lower decoding complexity than the two modified decoding methods for most of the parameters. We define the decoding complexity as the total number of XORs required to recover the erased columns.

2. Blaum–Roth Codes

In this section, we first review the construction of Blaum–Roth codes [17] and then show the efficient operations over the ring

F_{2} [x] / (1 + x^{p})

. Finally, we present an algorithm to compute the product of multiple polynomials, each with two nonzero terms, over

F_{2} [x] / (1 + x^{p})

with lower complexity.

2.1. Construction of Blaum–Roth Codes [17]

The codeword of Blaum–Roth codes [17] is a

(p - 1) \times n

array

{[c_{i, j}]}_{i = 0, j = 0}^{p - 2, n - 1}

that is encoded from the

(p - 1) k

information bits, where

c_{i, j} \in F_{2}

and

n \leq p

. We can view any k columns of the

(p - 1) \times n

array as information columns that store the

(p - 1) k

information bits and the other

r = n - k

columns as parity columns that store the

(p - 1) r

parity bits. For

j = 0, 1, \dots, n - 1

, we represent the

p - 1

bits in column j by a polynomial

c_{j} (x) = \sum_{i = 0}^{p - 2} c_{i, j} x^{i}

. The

(p - 1) \times n

array of Blaum–Roth codes is defined by

(\begin{matrix} c_{0} (x) & c_{1} (x) & \dots & c_{n - 1} (x) \end{matrix}) \cdot H_{r \times n}^{T} \equiv 0 (\mod M_{p} (x)),

where

H_{r \times n}

is the

r \times n

parity-check matrix

H_{r \times n} = [\begin{matrix} 1 & 1 & 1 & \dots & 1 \\ 1 & x & x^{2} & \dots & x^{n - 1} \\ ⋮ & ⋮ & ⋮ & ⋱ & ⋮ \\ 1 & x^{(r - 1)} & x^{(r - 1) 2} & \dots & x^{(r - 1) (n - 1)} \end{matrix}],

and

0

is the all-zero row of length r. We denote the Blaum–Roth codes defined above as

C (p, n, r)

. When

p \geq n

and p is a prime number, we can always retrieve all the information bits from any k out of the n polynomials [17], i.e.,

C (p, n, r)

are MDS codes.

If we let

c_{p - 1, j} = 0

for all

j = 0, 1, \dots, n - 1

, then

C (p, n, r)

can be equivalently defined as the following

p \cdot r

linear constraints. (The subscripts are taken as modulo p unless otherwise specified.)

\sum_{j = 0}^{n - 1} c_{{⟨ m - ℓ \cdot j ⟩}_{p}, j} = 0,

where

0 \leq m \leq p - 1

and

0 \leq ℓ \leq r - 1

.

Suppose that the

λ

columns

{e_{i}}_{i = 0}^{λ - 1}

are erased, where

λ \geq 2

and

0 \leq e_{0} < \dots < e_{λ - 1} < n

. Let the

δ = n - λ

surviving columns be

{h_{j}}_{j = 0}^{δ - 1}

, where

0 \leq h_{0} < \dots < h_{δ - 1} < n

and

{e_{i}}_{i = 0}^{λ - 1} \cup {h_{j}}_{j = 0}^{δ - 1} = {0, 1, \dots, n - 1}

. We have

(\begin{matrix} c_{e_{0}} (x) & c_{e_{1}} (x) & \dots & c_{e_{λ - 1}} (x) \end{matrix}) \cdot V_{λ \times λ}^{T} = S,

(1)

over the ring

R_{p}

, where

V_{λ \times λ}

is the

λ \times λ

square Vandermonde matrix

V_{λ \times λ} = [\begin{matrix} 1 & 1 & 1 & \dots & 1 \\ x^{e_{0}} & x^{e_{1}} & x^{e_{2}} & \dots & x^{e_{λ - 1}} \\ ⋮ & ⋮ & ⋮ & ⋱ & ⋮ \\ x^{(λ - 1) e_{0}} & x^{(λ - 1) e_{1}} & x^{(λ - 1) e_{2}} & \dots & x^{(λ - 1) e_{λ - 1}} \end{matrix}],

and

S = (\begin{matrix} S_{0} (x) & S_{1} (x) & \dots & S_{λ - 1} (x) \end{matrix})

, where the

λ

syndrome polynomials are

S_{ℓ} (x) = \sum_{j = 0}^{δ - 1} x^{ℓ \cdot h_{j}} c_{h_{j}} (x) for 0 \leq ℓ \leq λ - 1 .

(2)

In this paper, we present three efficient decoding methods to solve the linear systems in Equation (1) over the ring

F_{2} [x] / (1 + x^{p})

.
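As a minimal sketch of how the syndromes in Equation (2) can be evaluated over the ring with modulus 1 + x^p, the following Python fragment uses only cyclic shifts and XORs. The representation (a polynomial stored as an integer whose bit i is the coefficient of x^i), the helper names `cyclic_shift` and `syndromes`, and the toy column values are assumptions made for illustration only and are not taken from the paper.

```python
def cyclic_shift(a, s, p):
    """Multiply a(x) by x^s modulo 1 + x^p (a cyclic shift of p bits, no XORs)."""
    s %= p
    return ((a << s) | (a >> (p - s))) & ((1 << p) - 1)


def syndromes(columns, survivors, num, p):
    """Return [S_0, ..., S_{num-1}] of Equation (2), computed modulo 1 + x^p.

    columns maps a surviving column index h_j to its polynomial, stored as an
    integer with bit i = c_{i, h_j} (assumed representation, c_{p-1, j} = 0).
    """
    result = []
    for ell in range(num):
        acc = 0
        for h in survivors:
            acc ^= cyclic_shift(columns[h], ell * h, p)  # x^(ell * h_j) * c_{h_j}(x)
        result.append(acc)
    return result


# Toy data (hypothetical): p = 5, columns 1 and 3 erased, columns 0, 2, 4 survive.
p = 5
columns = {0: 0b00011, 2: 0b00101, 4: 0b01000}
S = syndromes(columns, [0, 2, 4], 2, p)
```

Because a cyclic shift does not change the number of nonzero coefficients, all syndromes produced this way have the same parity of nonzero terms.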

2.2. Efficient Operations over $F_{2} [x] / (1 + x^{p})$

It is more efficient to compute the multiplication of a power of x and division of the factor

1 + x^{b}

over the ring

F_{2} [x] / (1 + x^{p})

rather than over the ring

R_{p}

: (i) Let

a (x) \in R_{p}

, and the multiplication

x^{i} \cdot a (x)

over the ring

R_{p}

in [17] (Equation (19)) takes

p - 1

XORs, while the multiplication

x^{i} \cdot a (x)

over the ring

F_{2} [x] / (1 + x^{p})

takes no XORs [20]. (ii) Let

g (x), f (x) \in F_{2} [x] / (1 + x^{p})

, and let d be a positive integer that is coprime with p. Consider the equation

(1 + x^{d}) g (x) \equiv f (x) (\mod 1 + x^{p}),

(3)

where

f (x)

has an even number of nonzero terms. Given such

f (x)

and d, we can compute

g (x)

by Lemma 1.

Lemma 1.

(Lemma 8 in [21]) The coefficients of

g (x)

in Equation (3) are given by

\begin{matrix} g_{p - 1} & = 0, g_{p - d - 1} = f_{p - 1}, g_{d - 1} = f_{d - 1}, \\ g_{p - (ℓ + 1) d - 1} & = g_{p - ℓ d - 1} + f_{p - ℓ d - 1} for ℓ = 1, 2, \dots, p - 3 . \end{matrix}

By Lemma 1, computing the division

\frac{f (x)}{1 + x^{d}}

takes

p - 3

XORs, but we are not sure whether

g (x)

has an even number of nonzero terms or not. If we want to guarantee that

g (x)

has an even number of nonzero terms, we should use Lemma 2 to compute the division

\frac{f (x)}{1 + x^{d}}

.
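The recursion of Lemma 1 can be sketched in Python as follows. Polynomials are stored as lists of p bits (index i holds the coefficient of x^i); this representation and the names `divide_lemma1` and `times_one_plus_xd` are illustrative assumptions, not part of the original text. The sketch assumes 1 ≤ d ≤ p − 1 and that f(x) has an even number of nonzero terms.

```python
def divide_lemma1(f, d, p):
    """Solve (1 + x^d) g(x) = f(x) mod 1 + x^p with g_{p-1} = 0 (Lemma 1).

    f is a list of p bits with even weight; the loop performs p - 3 XORs.
    """
    g = [0] * p                      # g_{p-1} = 0
    g[(p - d - 1) % p] = f[p - 1]    # g_{p-d-1} = f_{p-1}
    g[d - 1] = f[d - 1]              # g_{d-1} = f_{d-1}
    for ell in range(1, p - 2):      # ell = 1, 2, ..., p - 3
        src = (p - ell * d - 1) % p
        dst = (p - (ell + 1) * d - 1) % p
        g[dst] = g[src] ^ f[src]
    return g


def times_one_plus_xd(g, d, p):
    """Multiply g(x) by 1 + x^d modulo 1 + x^p (used here only as a check)."""
    return [g[i] ^ g[(i - d) % p] for i in range(p)]


# Example: p = 5, d = 2, f(x) = x + x^4.
g = divide_lemma1([0, 1, 0, 0, 1], 2, 5)
```

Multiplying the result back with `times_one_plus_xd` recovers f(x), which is a convenient way to check the recursion for small p.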

Lemma 2.

(Lemma 13 in [20]) The coefficients of

g (x)

in Equation (3) are given by

\begin{matrix} g_{0} & = f_{2 d} + f_{4 d} + \dots + f_{(p - 1) d}, \\ g_{ℓ d} & = g_{(ℓ - 1) d} + f_{ℓ d} for ℓ = 1, 2, \dots, p - 1 . \end{matrix}

By Lemma 2, the division

\frac{f (x)}{1 + x^{d}}

takes

\frac{3 p - 5}{2}

XORs, and

g (x)

has an even number of nonzero terms. However, computing the division

\frac{f (x)}{1 + x^{d}}

by Corollary 2 in [17] takes

2 (p - 1)

XORs over the ring

R_{p}

, which is strictly larger than the number of XORs required by Lemmas 1 and 2. It is shown in Theorem 5 in [19] that we can always solve the linear system in Equation (1) over the ring

F_{2} [x] / (1 + x^{p})

, and that all the solutions are congruent to each other modulo

M_{p} (x)

. Therefore, we can first solve the linear system in Equation (1) over the ring

F_{2} [x] / (1 + x^{p})

and then obtain the unique solution by reducing modulo

M_{p} (x)

to reduce the computational complexity.
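A possible Python sketch of the division in Lemma 2, followed by the final reduction modulo M_p(x), is given below. Polynomials are stored as lists of p bits (index i holds the coefficient of x^i); this representation and the names `divide_lemma2` and `reduce_mod_Mp` are assumptions made for illustration. The sketch assumes 1 ≤ d ≤ p − 1 and an input f(x) with an even number of nonzero terms.

```python
def divide_lemma2(f, d, p):
    """Solve (1 + x^d) g(x) = f(x) mod 1 + x^p so that g has even weight (Lemma 2)."""
    g = [0] * p
    for ell in range(2, p, 2):           # g_0 = f_{2d} + f_{4d} + ... + f_{(p-1)d}
        g[0] ^= f[(ell * d) % p]
    for ell in range(1, p):              # g_{ell d} = g_{(ell-1) d} + f_{ell d}
        idx = (ell * d) % p
        prev = ((ell - 1) * d) % p
        g[idx] = g[prev] ^ f[idx]
    return g


def reduce_mod_Mp(g, p):
    """Reduce g(x) modulo M_p(x) = 1 + x + ... + x^(p-1).

    If the coefficient of x^(p-1) is 1, adding M_p(x) flips every coefficient;
    the result has p - 1 coefficients.
    """
    top = g[p - 1]
    return [bit ^ top for bit in g[:p - 1]]


# Example: p = 5, d = 2, f(x) = x + x^4.
g = divide_lemma2([0, 1, 0, 0, 1], 2, 5)
column = reduce_mod_Mp(g, 5)
```

In a decoder, the reduction would only be applied once at the very end, after all shifts, additions, and divisions have been carried out modulo 1 + x^p.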

2.3. Multiple Multiplications over $F_{2} [x] / (1 + x^{p})$

Note that in our modified syndrome-based decoding method and the modified interpolation-based decoding method, we need to compute products of multiple polynomials, each of which has two nonzero terms. Suppose that we want to compute the following product of m factors

L (x^{τ}) = \prod_{i = 0}^{m - 1} (x^{τ} - x^{ξ_{i}}) (\mod 1 + x^{p}),

(4)

where m is a positive integer,

0 \leq τ \leq p - 1

such that

τ \notin {ξ_{0}, ξ_{1}, \dots, ξ_{m - 1}}

, and

0 \leq ξ_{0} < \dots < ξ_{m - 1} < n

.

We can derive from Equation (4) that

L (x^{τ}) = x^{π} \cdot \prod_{i = 0}^{m - 1} (1 + x^{d_{i}}) (\mod 1 + x^{p}),

(5)

where

π = \sum_{i = 0}^{m - 1} min (τ, ξ_{i})

modulo p and

d_{i} = | τ - ξ_{i} |

for

i = 0, 1, \dots, m - 1

.

Algorithm 1 presents a method to simplify the multiplications in Equation (4). In Algorithm 1, we use

Γ_{ℓ}

to denote the number of occurrences of the factor

1 + x^{ℓ}

in the product

L (x^{τ})

. Note that we only need to count the number of

1 + x^{ℓ}

for

1 \leq ℓ \leq \frac{p - 1}{2}

, because the equation

1 + x^{ℓ} \equiv x^{ℓ} \cdot (1 + x^{p - ℓ})

modulo

1 + x^{p}

holds for

\frac{p - 1}{2} < ℓ < n

. If

Γ_{ℓ} > 1

, then we have

{(1 + x^{ℓ})}^{Γ_{ℓ}} = {(1 + x^{ℓ})}^{Γ_{ℓ} - 2 ⌊ \frac{Γ_{ℓ}}{2} ⌋} \cdot {(1 + x^{2 ℓ})}^{⌊ \frac{Γ_{ℓ}}{2} ⌋}

. Therefore, we can always merge

Γ_{ℓ}

multiplications

{(1 + x^{ℓ})}^{Γ_{ℓ}}

into

Γ_{ℓ} - ⌊ \frac{Γ_{ℓ}}{2} ⌋

multiplications, and the computational complexity can be reduced with Algorithm 1. After Algorithm 1 has been executed, all elements of the count array

Γ

are zero or one, and the number

η

of factors in the final

L (x^{τ})

is between 1 and m.

Algorithm 1: Simplify the multiple multiplications.
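The merging idea described above can be sketched in Python as follows. This is one possible reading of the text (folding exponents above (p − 1)/2 and merging pairs of equal factors), not the authors' listing of Algorithm 1. Polynomials are stored as integers whose bit i is the coefficient of x^i, and the names `simplify_product`, `expand`, and `direct_product` are hypothetical helpers introduced for illustration.

```python
from collections import Counter


def cyc(a, s, p):
    """Multiply a(x) by x^s modulo 1 + x^p (cyclic shift of p bits)."""
    s %= p
    return ((a << s) | (a >> (p - s))) & ((1 << p) - 1)


def simplify_product(tau, xis, p):
    """Rewrite prod_i (x^tau + x^xi_i) mod 1 + x^p as x^shift * prod (1 + x^ell).

    Returns (shift, exponents) with distinct exponents 1 <= ell <= (p - 1) / 2.
    """
    half = (p - 1) // 2
    shift = 0
    gamma = Counter()                     # gamma[ell] plays the role of Gamma_ell
    for xi in xis:
        shift += min(tau, xi)             # factor out the lower power, Equation (5)
        d = abs(tau - xi)
        if d > half:                      # 1 + x^d = x^d (1 + x^(p-d)) mod 1 + x^p
            shift += d
            d = p - d
        gamma[d] += 1
    pending = [ell for ell, cnt in gamma.items() if cnt > 1]
    while pending:
        ell = pending.pop()
        pairs, gamma[ell] = divmod(gamma[ell], 2)   # (1 + x^ell)^2 = 1 + x^(2 ell)
        if pairs == 0:
            continue
        d = 2 * ell
        if d > half:
            shift += pairs * d
            d = p - d
        gamma[d] += pairs
        if gamma[d] > 1:
            pending.append(d)
    return shift % p, sorted(ell for ell, cnt in gamma.items() if cnt == 1)


def expand(shift, exponents, p):
    """Multiply out x^shift * prod (1 + x^ell) modulo 1 + x^p."""
    poly = 1 << (shift % p)
    for ell in exponents:
        poly ^= cyc(poly, ell, p)
    return poly


def direct_product(tau, xis, p):
    """Evaluate Equation (4) factor by factor, without any merging."""
    poly = 1
    for xi in xis:
        poly = cyc(poly, tau, p) ^ cyc(poly, xi, p)
    return poly


shift, exponents = simplify_product(0, [1, 2, 3, 4], 7)
```

For p = 7, τ = 0, and ξ = (1, 2, 3, 4), the four factors collapse to the single factor 1 + x^3, so a subsequent division needs only one application of Lemma 1 or Lemma 2 instead of four.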

3. Decoding Algorithm

In this section, we present two decoding methods over the ring

F_{2} [x] / (1 + x^{p})

by modifying two existing decoding methods [17,18] that can reduce the decoding complexity.

Recall that the

λ

erased columns are

λ

columns

{e_{i}}_{i = 0}^{λ - 1}

, and the

δ = n - λ

surviving columns are

δ

columns

{h_{j}}_{j = 0}^{δ - 1}

.

3.1. Modified Syndrome-Based Method

We define the function of the indeterminate z

G_{i} (z) = \prod_{s = 0, \neq i}^{λ - 1} (1 - x^{e_{s}} z) = \sum_{ℓ = 0}^{λ - 1} G_{i, ℓ} (x) z^{ℓ},

and the syndrome function

S (z) = \sum_{ℓ = 0}^{r - 1} S_{ℓ} (x) z^{ℓ}

, where

0 \leq i \leq λ - 1

and

S_{ℓ} (x)

is given in Equation (2). From Equation (18) in [17], we obtain

\begin{matrix} \prod_{s = 0, \neq i}^{λ - 1} (x^{e_{i}} - x^{e_{s}}) c_{e_{i}} (x) & \equiv \sum_{ℓ = 0}^{λ - 1} G_{i, λ - 1 - ℓ} (x) S_{ℓ} (x) \\ \equiv σ_{i} (x) (\mod M_{p} (x)) . \end{matrix}

Therefore, the

σ_{i} (x)

can be regarded as the coefficient of

z^{λ - 1}

of the polynomial

G_{i} (z) S (z)

. Then, the erased column

c_{e_{i}} (x)

is given by

\frac{σ_{i} (x)}{\prod_{s = 0, \neq i}^{λ - 1} (x^{e_{i}} - x^{e_{s}})}

, where

0 \leq i \leq λ - 1

.

Note that the terms of set

{S_{ℓ} (x) z^{ℓ}}_{ℓ = λ}^{r - 1}

are not involved in computing the coefficient of

z^{λ - 1}

of the polynomial

G_{i} (z) S (z)

. Thus, we can just consider the first

λ

terms (the

λ

coefficients of degrees less than

λ

) of

S (z)

when computing these coefficients, but all the r terms of

S (z)

are calculated in Step 1 of [17]. This is one reason why our modified syndrome-based decoding method has a lower decoding complexity than the original method in [17].

Moreover, the syndrome polynomials

S_{ℓ} (x)

satisfy

S_{0} (1) = S_{1} (1) = \dots = S_{λ - 1} (1),

(6)

i.e., the

λ

syndrome polynomials

S_{ℓ} (x)

either all have an even number of nonzero terms, or they all have an odd number of nonzero terms, from the definition of Equation (2).

Let

G (z) = (1 - x^{e_{i}} z) G_{i} (z)

and

Q (z) = G (z) S (z)

. Then, we have

\begin{matrix} Q (z) & = (1 - x^{e_{i}} z) \prod_{s = 0, \neq i}^{λ - 1} (1 - x^{e_{s}} z) S (z) \\ = \prod_{s = 0}^{λ - 1} (1 - x^{e_{s}} z) S (z) = \sum_{ℓ = 0}^{r + λ - 1} Q_{ℓ} (x) z^{ℓ} . \end{matrix}

(7)

Thus,

Q (z)

is independent of the erasure index i, and we only need to compute

Q (z)

once in the decoding procedure. Recall that

σ_{i} (x)

is the coefficient of

z^{λ - 1}

of the polynomial

G_{i} (z) S (z)

; then, the

σ_{i} (x)

is also the coefficient of

z^{λ - 1}

of the polynomial

\frac{Q (z)}{(1 - x^{e_{i}} z)} = \frac{(1 - x^{e_{i}} z) G_{i} (z) S (z)}{(1 - x^{e_{i}} z)}

for all

0 \leq i \leq λ - 1

. Suppose that

\frac{Q (z)}{(1 - x^{e_{i}} z)} = f_{0}^{i} (x) + f_{1}^{i} (x) z + \dots + f_{λ - 1}^{i} (x) z^{λ - 1} + \dots,

we can derive the recurrence formula

f_{ℓ}^{i} (x) = \{\begin{matrix} Q_{0} (x), & ℓ = 0; \\ x^{e_{i}} \cdot f_{ℓ - 1}^{i} (x) + Q_{ℓ} (x), & ℓ > 0; \end{matrix}

(8)

where

0 \leq i \leq λ - 1

. Notice that

σ_{i} (x) = f_{λ - 1}^{i} (x)

holds. Similar to

S (z)

, we only compute the first

λ

terms (the

λ

coefficients of degrees less than

λ

) of

Q (z)

, since the other coefficients of

Q (z)

are not needed, but all the

r + λ

terms of

Q (z)

are calculated in Step 2 of [17]. This is another reason why our modified syndrome-based decoding method has a lower decoding complexity than the original method in [17]. Algorithm 2 shows our modified syndrome-based decoding method over the ring

F_{2} [x] / (1 + x^{p})

.
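The truncated product in Equation (7) and the recurrence in Equation (8) can be illustrated with a short Python sketch. It assumes that polynomials in x are stored as integers (bit i is the coefficient of x^i) and that the syndromes are supplied as a list; the function names `truncated_Q` and `sigma` and the sample values are hypothetical and serve only as an illustration, not as the authors' Algorithm 2.

```python
def cyc(a, s, p):
    """Multiply a(x) by x^s modulo 1 + x^p (cyclic shift of p bits)."""
    s %= p
    return ((a << s) | (a >> (p - s))) & ((1 << p) - 1)


def truncated_Q(S, erased, p):
    """First len(S) coefficients of Q(z) = prod_s (1 + x^{e_s} z) * S(z), Equation (7)."""
    Q = list(S)
    for e in erased:
        # Multiply by (1 + x^e z): Q_l <- Q_l + x^e * Q_{l-1}, from high to low degree
        # so that each Q_{l-1} is still the value from before this factor.
        for ell in range(len(Q) - 1, 0, -1):
            Q[ell] ^= cyc(Q[ell - 1], e, p)
    return Q


def sigma(Q, e_i, p):
    """sigma_i(x) = f_{lambda-1}^i(x), computed with the recurrence in Equation (8)."""
    f = Q[0]
    for ell in range(1, len(Q)):
        f = cyc(f, e_i, p) ^ Q[ell]
    return f


# Toy data (hypothetical): p = 5, erased columns 1 and 3, two syndromes S_0, S_1.
p = 5
S = [0b01110, 0b10011]
Q = truncated_Q(S, [1, 3], p)
sigmas = [sigma(Q, e, p) for e in (1, 3)]
```

Only the first λ coefficients of Q(z) are touched, and Q(z) is computed once and reused for every erased column, which mirrors the two savings described above.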

The following lemma shows that we can always compute the divisions in steps 11–12 of Algorithm 2 by Lemmas 1 and 2 when

λ \geq 2

.

Lemma 3.

In steps 11–12 of Algorithm 2, the

σ_{i} (x)

has an even number of nonzero terms for all

0 \leq i \leq λ - 1

, and we can employ Lemmas 1 and 2 to compute the divisions.

Proof.

From Equation (8) and steps 7–10 of Algorithm 2, we obtain

σ_{i} (x) = x^{(λ - 1) e_{i}} Q_{0} (x) + x^{(λ - 2) e_{i}} Q_{1} (x) + \dots + Q_{λ - 1} (x),

where

0 \leq i \leq λ - 1

. If the number of polynomials in the set

{Q_{j} (x)}_{j = 0}^{λ - 1}

that have an odd number of nonzero terms is even, then

σ_{i} (x)

has an even number of nonzero terms for

0 \leq i \leq λ - 1

. In the following, we will show this is true. According to Equation (6) and step 3 of Algorithm 2,

Q_{0} (1) = \dots = Q_{λ - 1} (1)

holds.

Firstly, we consider

Q_{0} (1) = \dots = Q_{λ - 1} (1) = 1

. We denote the

λ

polynomials

{Q_{j} (x)}_{j = 0}^{λ - 1}

obtained after ε iterations of steps 4–6 of Algorithm 2, for

ε = 0, 1, \dots, λ

, as

{Q_{j}^{ε} (x)}_{j = 0}^{λ - 1}

. Let

Q_{j}^{0} (x)

be the initial

Q_{j} (x)

for

0 \leq j \leq λ - 1

.

Proving that the number of polynomials with an odd number of nonzero terms in the set

{Q_{j}^{ε} (x)}_{j = 0}^{λ - 1}

is even is equivalent to proving that

\sum_{j = 0}^{λ - 1} Q_{j}^{ε} (1) = 0

.

Algorithm 2: Modified syndrome-based decoding method.

According to Equation (7) and steps 4–6 of Algorithm 2, we have

Q_{j}^{ε} (1) = \{\begin{matrix} Q_{j}^{ε - 1} (1), & j = 0; \\ Q_{j - 1}^{ε - 1} (1) + Q_{j}^{ε - 1} (1), & 1 \leq j \leq λ - 1; \end{matrix}

(9)

where

ε = 1, 2, \dots, λ

. The

Q_{j}^{1} (1) = 0

holds for all

j \geq 1

. We can obtain by induction

Q_{j}^{ε} (1) = Q_{j - 1}^{ε - 1} (1) + Q_{j}^{ε - 1} (1) = 0 for all j \geq ε \geq 1 .

(10)

Note that

\sum_{j = 0}^{λ - 1} Q_{j}^{2} (1) = 0

. As the induction hypothesis, suppose that an even number of polynomials in the set

{Q_{j}^{ε} (x)}_{j = 0}^{λ - 1}

have an odd number of nonzero terms when

ε = y \geq 2

, i.e.,

\sum_{j = 0}^{λ - 1} Q_{j}^{y} (1) = 0

. Then, for

ε = y + 1

, we have

\begin{matrix} \sum_{j = 0}^{λ - 1} Q_{j}^{y + 1} (1) & = Q_{0}^{y} (1) + \sum_{j = 1}^{λ - 1} (Q_{j - 1}^{y} (1) + Q_{j}^{y} (1)) \\ = \sum_{j = 0}^{λ - 1} Q_{j}^{y} (1) + \sum_{j = 0}^{λ - 2} Q_{j}^{y} (1) \\ = Q_{λ - 1}^{y} (1) = 0 . \end{matrix}

(11)

The last equality in Equation (11) follows from Equation (10) with

j = λ - 1

. Therefore, there are an even number of polynomials in the set

{Q_{j}^{y + 1} (x)}_{j = 0}^{λ - 1}

that have an odd number of nonzero terms.

Secondly, when

Q_{0} (1) = \dots = Q_{λ - 1} (1) = 0

, the argument is similar. This completes the proof. □

According to Lemma 3, we can use Lemmas 1 and 2 to compute the divisions in step 12. The number of divisions required in step 12 is recorded as

L_{i}

, which ranges from 1 to

λ - 1

for

i = 0, 1, \dots, λ - 1

. So, we can obtain

c_{e_{i}} (x)

in step 12 by recursively computing the division

L_{i}

times, where the polynomial resulting from each of the first

L_{i} - 1

divisions must have an even number of nonzero terms so that the next division can be applied. Therefore, we execute these divisions by Lemma 2 and the last division by Lemma 1. The computational complexity

T_{D}

in steps 11–12 of Algorithm 2 is

T_{D} = \sum_{i = 0}^{λ - 1} ((L_{i} - 1) \frac{3 p - 5}{2} + p - 3),

(12)

where

λ (p - 3) \leq T_{D} \leq λ (λ - 2) \frac{3 p - 5}{2} + λ (p - 3)

.

In steps 11–12 of Algorithm 2, we perform

λ (λ - 1)

divisions without Algorithm 1, of which

λ

divisions are executed by Lemma 1 and

λ (λ - 2)

divisions are executed by Lemma 2; however, the number of divisions can be reduced with Algorithm 1. In Table 1, we show the average number of divisions in steps 11–12 of Algorithm 2 executed by Lemma 1 and Lemma 2 with Algorithm 1 for

(p, n) \in {(5, 5), (7, 7)}

.

Table 1. The average number of XORs involved in steps 11–12 of Algorithm 2.

We specify the computational complexity of Algorithm 2 as follows:

Steps 1–2 take $λ (δ - 1) p = λ (n - λ - 1) p$ XORs.
Steps 3–6 take $λ (λ - 1) p$ XORs.
Steps 7–10 take $λ (λ - 1) p$ XORs.
Steps 11–12 take $T_{D}$ XORs by Equation (12).

Then, the computational complexity

T_{A l g 2}

of Algorithm 2 is

T_{A l g 2} = λ (n + λ - 3) p + T_{D},

(13)

where

p λ^{2} + ((n - 2) p - 3) λ \leq T_{A l g 2} \leq \frac{5 (p - 1)}{2} λ^{2} + ((n - 5) p + 2) λ .

Recall that the computational complexity of the decoding method in [17] is

\frac{7 p - 4}{2} λ^{2} - \frac{7 p - 2}{2} λ + r (n - 1) p ,

which is strictly larger than

T_{A l g 2}

.
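As a quick numerical illustration of Equations (12) and (13), the following Python sketch evaluates the lower and upper bounds on the complexity of Algorithm 2 and the XOR count quoted above for the method in [17]. The function names are hypothetical, and the formulas are transcribed directly from the text.

```python
from fractions import Fraction


def t_alg2_bounds(p, n, lam):
    """Lower and upper bounds on T_Alg2 from Equations (12) and (13)."""
    base = lam * (n + lam - 3) * p                       # steps 1-10
    t_d_min = lam * (p - 3)                              # one Lemma 1 division per column
    t_d_max = lam * (lam - 2) * Fraction(3 * p - 5, 2) + lam * (p - 3)
    return base + t_d_min, base + t_d_max


def t_syndrome_original(p, n, r, lam):
    """XOR count of the decoding method in [17], as quoted in the text."""
    return (Fraction(7 * p - 4, 2) * lam ** 2
            - Fraction(7 * p - 2, 2) * lam
            + r * (n - 1) * p)


low, high = t_alg2_bounds(7, 7, 3)
original = t_syndrome_original(7, 7, 4, 3)
```

For (p, n, r) = (7, 7, 4) and λ = 3, the bounds are 159 and 183 XORs against 300 XORs for the original method; the 40.60% reduction reported for this setting corresponds to roughly 178 XORs, which lies between the two bounds.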

Table 2 evaluates the computational complexity of the decoding method in [17] and Algorithm 2 for some parameters. The results in Table 2 demonstrate that Algorithm 2 has much lower decoding complexity, compared with the original decoding method in [17]. For example, Algorithm 2 has 40.60% less decoding complexity than the decoding method in [17] when

(p, n, r) = (7, 7, 4), λ = 3

.

Table 2. Decoding complexity of method in [17] and Algorithm 2.

The reason why Algorithm 2 has lower decoding complexity than the decoding method in [17] can be summarized as the following three points.

Firstly, we only consider the first

λ

terms (the

λ

coefficients of degrees less than

λ

) for both

S (z)

and

Q (z)

in computing the coefficients of

z^{λ - 1}

, while all r terms of

S (z)

and all

r + λ

terms of

Q (z)

are calculated in the decoding method in [17], where

r \geq λ

.

Secondly, all the divisions in Algorithm 2 are executed over the ring

F_{2} [x] / (1 + x^{p})

by Lemmas 1 and 2, which takes

p - 3

XORs and

\frac{3 p - 5}{2}

XORs for each division, respectively. In contrast, the division in [17] is executed over the ring

R_{p}

, which takes

2 (p - 1)

XORs [17] (Corollary 2).

Thirdly, we apply Algorithm 1 to steps 11–12 of Algorithm 2, which can significantly reduce the number of divisions, thus reducing the number of XORs required.

3.2. Modified Interpolation-Based Decoding Method

According to the decoding method in [18], we can recover the erased column

c_{e_{i}} (x)

with

0 \leq i \leq λ - 1

by

c_{e_{i}} (x) = \sum_{j = 0}^{δ - 1} c_{h_{j}} (x) \frac{f_{i} (x^{h_{j}})}{f_{i} (x^{e_{i}})} (\mod M_{p} (x)),

(14)

where

f_{i} (y) = \prod_{s = 0, \neq i}^{λ - 1} (y - x^{e_{s}})

and

f (y) = \prod_{s = 0}^{λ - 1} (y - x^{e_{s}})

. Let

a_{j} (x) = c_{h_{j}} (x) \cdot f (x^{h_{j}}) = \prod_{s = 0}^{λ - 1} (x^{h_{j}} - x^{e_{s}}) \cdot c_{h_{j}} (x) (\mod M_{p} (x)),

(15)

where

0 \leq j \leq δ - 1

. Then,

a_{j} (x)

has an even number of nonzero terms, and we only need to compute

a_{j} (x)

once in the decoding procedure, since

a_{j} (x)

is independent of the erasure index i. Let

\begin{matrix} b_{i} (x) & = \sum_{j = 0}^{δ - 1} \frac{a_{j} (x)}{x^{h_{j}} - x^{e_{i}}} (\mod M_{p} (x)), \end{matrix}

(16)

\begin{matrix} c_{e_{i}} (x) & = \frac{b_{i} (x)}{f_{i} (x^{e_{i}})} = \frac{b_{i} (x)}{\prod_{s = 0, \neq i}^{λ - 1} (x^{e_{i}} - x^{e_{s}})} (\mod M_{p} (x)), \end{matrix}

(17)

where

0 \leq i \leq λ - 1

, and

M_{p} (x) = 1 + x + \dots + x^{p - 1}

. Algorithm 3 shows our modified interpolation-based method over the ring

F_{2} [x] / (1 + x^{p})

.
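Equations (15)–(17) can be turned into a compact Python sketch. It is an illustration under stated assumptions rather than the authors' Algorithm 3: polynomials are stored as integers (bit i is the coefficient of x^i), every division uses Lemma 2, the factor merging of Algorithm 1 is omitted, and the names `div_1_plus_xd`, `div_binomial`, and `decode_interpolation` are hypothetical.

```python
def cyc(a, s, p):
    """Multiply a(x) by x^s modulo 1 + x^p (cyclic shift; s may be negative)."""
    s %= p
    return ((a << s) | (a >> (p - s))) & ((1 << p) - 1)


def div_1_plus_xd(f, d, p):
    """Lemma 2: solve (1 + x^d) g = f mod 1 + x^p with g of even weight."""
    g0 = 0
    for ell in range(2, p, 2):
        g0 ^= (f >> ((ell * d) % p)) & 1
    g, cur = g0, g0
    for ell in range(1, p):
        idx = (ell * d) % p
        cur ^= (f >> idx) & 1          # g_{ell d} = g_{(ell-1) d} + f_{ell d}
        g |= cur << idx
    return g


def div_binomial(f, a, b, p):
    """Divide f(x) by x^a + x^b (a != b) modulo 1 + x^p."""
    lo, d = min(a, b), abs(a - b)
    return div_1_plus_xd(cyc(f, -lo, p), d, p)


def decode_interpolation(cols, erased, n, p):
    """Recover the erased columns; cols maps each surviving index to its polynomial."""
    survivors = [j for j in range(n) if j not in erased]
    a = {}
    for h in survivors:                 # Equation (15): a_j = c_h * prod_s (x^h + x^{e_s})
        t = cols[h]
        for e in erased:
            t = cyc(t, h, p) ^ cyc(t, e, p)
        a[h] = t
    recovered = {}
    for e_i in erased:
        b = 0
        for h in survivors:             # Equation (16)
            b ^= div_binomial(a[h], h, e_i, p)
        for e_s in erased:              # Equation (17)
            if e_s != e_i:
                b = div_binomial(b, e_i, e_s, p)
        if (b >> (p - 1)) & 1:          # final reduction modulo M_p(x)
            b ^= (1 << p) - 1
        recovered[e_i] = b
    return recovered


# Toy example (hypothetical data): p = n = 5, r = 2; columns 3 and 4 are computed
# from the information columns 0, 1, 2 by treating them as erased.
info = {0: 0b0110, 1: 0b1011, 2: 0b0001}
parity = decode_interpolation(info, [3, 4], 5, 5)
```

Since encoding is a special case of erasure decoding, the same routine produces the parity columns here; erasing two other columns of the resulting array and decoding again should return the original values.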

After using Algorithm 1, the number of polynomial multiplications in step 2 ranges from 1 to

λ

. Thus, the computational complexity

T_{M}

in steps 1–2 of Algorithm 3 is

(n - λ) p \leq T_{M} \leq (n - λ) λ p .

(18)

Algorithm 3: Modified interpolation-based method.

In steps 1–2, we need to perform

λ

multiplications without Algorithm 1, which take

(n - λ) λ p

XORs; however, with Algorithm 1, the number of multiplications involved in steps 1–2 can be reduced. In Table 3, we show the average number of XORs involved in steps 1–2 of Algorithm 3 with Algorithm 1 for

(p, n) \in {(5, 5), (7, 7)}

. The results in Table 3 show that we can reduce the number of XORs with Algorithm 1, especially for a large value of

λ

.

Table 3. The average number of XORs involved in steps 1–2 of Algorithm 3.

Only steps 4 and 6 of Algorithm 3 involve divisions. We employ Lemma 2 to execute the divisions in steps 3–4 of Algorithm 3, since

b_{i} (x)

in step 6 of Algorithm 3 should have an even number of nonzero terms. Notice that steps 5–6 of Algorithm 3 are exactly the same as steps 11–12 of Algorithm 2.

We specify the computational complexity of Algorithm 3 as follows:

Steps 1–2 require $T_{M}$ XORs by Equation (18).
Steps 3–4 need $λ (δ - 1)$ additions and $λ δ$ divisions by Lemma 2, which require $λ (n - λ - 1) p + λ (n - λ) \frac{3 p - 5}{2}$ XORs in total.
Steps 5–6 require $T_{D}$ XORs by Equation (12).

Then, the computational complexity of Algorithm 3 is

T_{A l g 3} = T_{M} + λ (n - λ - 1) p + λ (n - λ) \frac{3 p - 5}{2} + T_{D},

(19)

where

- \frac{5 (p - 1)}{2} λ^{2} + (\frac{5 n - 2}{2} p - \frac{5}{2} n - 3) λ + n p \leq T_{A l g 3} \leq - 2 p λ^{2} + (\frac{7 n - 6}{2} p - \frac{5}{2} n + 2) λ .

Recall that the computational complexity of the decoding method in [18] is

(- 2 p + 1) λ^{2} + (4 (n - 1) p - 3 n + 4) λ + n (p - 1),

which is larger than that of our Algorithm 3.

Table 4 evaluates the computational complexity of the decoding method in [18] and Algorithm 3 for some parameters. The results in Table 4 demonstrate that our Algorithm 3 has much lower decoding complexity than the original decoding method in [18]. For example, Algorithm 3 has a 34.13% lower decoding complexity than the decoding method in [18] when

(p, n, r) = (7, 7, 4), λ = 3

.

Table 4. Decoding complexities of the decoding method in [18] and our Algorithm 3.

The reason why Algorithm 3 has a lower decoding complexity than that of the decoding method in [18] is summarized as follows.

Firstly, all the divisions in Algorithm 3 are executed over the ring

F_{2} [x] / (1 + x^{p})

by Lemmas 1 and 2, which take

p - 3

XORs and

(3 p - 5) / 2

XORs for each division, respectively. The division in the decoding method in [18] is executed over the ring

R_{p}

, which takes

2 (p - 1)

XORs.

Secondly, we apply our Algorithm 1 to steps 1–2 and steps 5–6, which significantly reduces the number of multiplications and divisions, thus reducing the number of XORs required.

4. LU Decomposition-Based Method

The LU factorization of a matrix [22] expresses the matrix as a product of a lower triangular matrix

L

and an upper triangular matrix

U

. According to the LU factorization of the Vandermonde matrix [23], we can express a Vandermonde matrix as a product of several lower triangular matrices and several upper triangular matrices. Therefore, we can solve a Vandermonde linear system by first solving the linear systems whose coefficient matrices are the upper triangular matrices and then solving the linear systems whose coefficient matrices are the lower triangular matrices.

Suppose that the

λ

erased columns are

λ

columns

{e_{i}}_{i = 0}^{λ - 1}

and the

δ = n - λ

surviving columns are

{h_{j}}_{j = 0}^{δ - 1}

. Algorithm 4 shows our LU decomposition-based method over the ring

F_{2} [x] / (1 + x^{p})

.

According to Theorem 8 in [19], Equation (1) can be factorized into

(\begin{matrix} c_{e_{0}} (x) & c_{e_{1}} (x) & \dots & c_{e_{λ - 1}} (x) \end{matrix}) \cdot (L_{λ}^{(1)} L_{λ}^{(2)} \dots L_{λ}^{(λ - 1)}) \cdot (U_{λ}^{(λ - 1)} U_{λ}^{(λ - 2)} \dots U_{λ}^{(1)}) = S,

(20)

over the ring

R_{p}

, where

U_{λ}^{(θ)}

is the upper triangular matrix

U_{λ}^{(θ)} = [\begin{matrix} I_{λ - θ - 1} & 0 \\ 0 & \begin{matrix} 1 & x^{e_{0}} & 0 & \dots & 0 & 0 \\ 0 & 1 & x^{e_{1}} & \dots & 0 & 0 \\ ⋮ & ⋮ & ⋮ & ⋱ & ⋮ & ⋮ \\ 0 & 0 & 0 & \dots & 1 & x^{e_{θ - 1}} \\ 0 & 0 & 0 & \dots & 0 & 1 \end{matrix} \end{matrix}],

(21)

and

L_{λ}^{(θ)}

is the lower triangular matrix

L_{λ}^{(θ)} = [\begin{matrix} I_{λ - θ - 1} & 0 \\ 0 & \begin{matrix} 1 & 0 & \dots & 0 & 0 \\ 1 & x^{e_{λ - θ}} + x^{e_{λ - θ - 1}} & \dots & 0 & 0 \\ ⋮ & ⋮ & ⋱ & ⋮ & ⋮ \\ 0 & 0 & \dots & x^{e_{λ - 2}} + x^{e_{λ - θ - 1}} & 0 \\ 0 & 0 & \dots & 1 & x^{e_{λ - 1}} + x^{e_{λ - θ - 1}} \end{matrix} \end{matrix}],

(22)

for

θ = 1, 2, \dots, λ - 1

.

Algorithm 4: LU decomposition-based method.

We specify the computational complexity of Algorithm 4 as follows:

Steps 1–2 require $λ (δ - 1) p = λ (n - λ - 1) p$ XORs.
Steps 3–11 require at most $λ (λ - 1) p + (λ - 1) (p - 3) + (λ - 1) (λ - 2) (3 p - 5) / 4$ XORs, according to Theorem 10 in [19].

Then, the computational complexity of Algorithm 4 is

T_{A l g 4} = \frac{3 p - 5}{4} λ^{2} + \frac{(4 n - 13) p + 3}{4} λ + \frac{p + 1}{2} .

(23)
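The closed form in Equation (23) can be cross-checked against the two step counts listed above with a few lines of Python. The function names are hypothetical, and the expressions are transcribed from the text; exact rational arithmetic is used so that the quarter and half coefficients are not rounded.

```python
from fractions import Fraction


def t_alg4_from_steps(p, n, lam):
    """Sum of the step costs listed for Algorithm 4 (steps 1-2 plus steps 3-11)."""
    syndromes = lam * (n - lam - 1) * p
    lu_part = (lam * (lam - 1) * p
               + (lam - 1) * (p - 3)
               + Fraction((lam - 1) * (lam - 2) * (3 * p - 5), 4))
    return syndromes + lu_part


def t_alg4_closed_form(p, n, lam):
    """Equation (23)."""
    return (Fraction(3 * p - 5, 4) * lam ** 2
            + Fraction((4 * n - 13) * p + 3, 4) * lam
            + Fraction(p + 1, 2))


xors = t_alg4_closed_form(5, 5, 3)
```

For (p, n) = (5, 5) and λ = 3, both expressions give 54 XORs.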

5. Comparison and Conclusions

Table 5 evaluates the decoding complexity of Algorithms 2–4 for some parameters. The results in Table 5 demonstrate that Algorithm 2 performs better than Algorithm 3 if

λ \leq \frac{n}{2}

; otherwise, if

λ > \frac{n}{2}

, then Algorithm 3 has less decoding complexity. Algorithm 4 has less decoding complexity than both Algorithms 2 and 3, when

λ

is small. However, when

λ

is large, Algorithm 3 is more efficient than Algorithm 4. For example, compared with Algorithm 2, Algorithms 3 and 4 have

21.98 %

and

40.66 %

less decoding complexity, respectively, when

(p, n, r) = (5, 5, 4), λ = 3

.

Table 5. Decoding complexities of the proposed three decoding methods.

In this paper, we presented three efficient decoding methods for the erasures of Blaum–Roth codes that all have lower decoding complexity than the existing decoding methods. The efficient implementation of the proposed decoding methods in practical storage systems is one of our future works.

Author Contributions

Funding acquisition, H.H.; methodology, H.H.; writing—original draft preparation, W.Z.; writing—review and editing, H.H. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by National Natural Science Foundation of China grant number 62071121 and National Key R&D Program of China grant number 2020YFA0712300.

Institutional Review Board Statement

Not applicable.

Data Availability Statement

All data generated or analysed during this study are included in this published article.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Peng, P. Redundancy Allocation in Distributed Systems. Ph.D. Thesis, Rutgers The State University of New Jersey, School of Graduate Studies, New Brunswick, NJ, USA, 2022.
2. MacWilliams, F.J.; Sloane, N.J.A. The Theory of Error Correcting Codes; Elsevier: Amsterdam, The Netherlands, 1977; Volume 16.
3. Blaum, M.; Brady, J.; Bruck, J.; Menon, J. EVENODD: An Efficient Scheme for Tolerating Double Disk Failures in RAID Architectures. IEEE Trans. Comput. 1995, 44, 192–202.
4. Hou, H.; Lee, P.P.C. A New Construction of EVENODD Codes With Lower Computational Complexity. IEEE Commun. Lett. 2018, 22, 1120–1123.
5. Corbett, P.; English, B.; Goel, A.; Grcanac, T.; Kleiman, S.; Leong, J.; Sankar, S. Row-diagonal Parity for Double Disk Failure Correction. In Proceedings of the 3rd USENIX Conference on File and Storage Technologies, San Francisco, CA, USA, 31 March–4 April 2004; pp. 1–14.
6. Xu, L.; Bruck, J. X-code: MDS Array Codes with Optimal Encoding. IEEE Trans. Inf. Theory 1999, 45, 272–276.
7. Tsunoda, Y.; Fujiwara, Y.; Ando, H.; Vandendriessche, P. Bounds on separating redundancy of linear codes and rates of X-codes. IEEE Trans. Inf. Theory 2018, 64, 7577–7593.
8. Goel, A.; Corbett, P. RAID Triple Parity. ACM SIGOPS Oper. Syst. Rev. 2012, 46, 41–49.
9. Huang, C.; Xu, L. STAR: An Efficient Coding Scheme for Correcting Triple Storage Node Failures. IEEE Trans. Comput. 2008, 57, 889–901.
10. Hou, H.; Lee, P.P.C. STAR+ Codes: Triple-Fault-Tolerant Codes with Asymptotically Optimal Updates and Efficient Encoding/Decoding. In Proceedings of the 2021 IEEE Information Theory Workshop (ITW 2021), Kanazawa, Japan, 17–21 October 2021.
11. Blaum, M.; Brady, J.; Bruck, J.; Menon, J.; Vardy, A. The EVENODD Code and its Generalization: An Efficient Scheme for Tolerating Multiple Disk Failures in RAID Architectures. In High Performance Mass Storage and Parallel I/O; Wiley-IEEE Press: Hoboken, NJ, USA, 2002; Chapter 8; pp. 187–208.
12. Blaum, M.; Bruck, J.; Vardy, A. MDS Array Codes With Independent Parity Symbols. IEEE Trans. Inf. Theory 1996, 42, 529–542.
13. Hou, H.; Shum, K.W.; Chen, M.; Li, H. New MDS Array Code Correcting Multiple Disk Failures. In Proceedings of the 2014 IEEE Global Communications Conference, Austin, TX, USA, 8–12 December 2014; IEEE: Piscataway, NJ, USA, 2014; pp. 2369–2374.
14. Fu, H.; Hou, H.; Zhang, L. Extended EVENODD+ Codes with Asymptotically Optimal Updates and Efficient Encoding/Decoding. In Proceedings of the 2021 XVII International Symposium “Problems of Redundancy in Information and Control Systems” (REDUNDANCY), Moscow, Russia, 25–29 October 2021; IEEE: Piscataway, NJ, USA, 2021; pp. 1–6.
15. Chiniah, A.; Mungur, A. On the Adoption of Erasure Code for Cloud Storage by Major Distributed Storage Systems. EAI Endorsed Trans. Cloud Syst. 2022, 7, e1.
16. Rui, J.; Huang, Q.; Wang, Z. Graftage Coding for Distributed Storage Systems. IEEE Trans. Inf. Theory 2021, 67, 2192–2205.
17. Blaum, M.; Roth, R.M. New Array Codes for Multiple Phased Burst Correction. IEEE Trans. Inf. Theory 1993, 39, 66–77.
18. Guo, Q.; Kan, H. On Systematic Encoding for Blaum-Roth Codes. In Proceedings of the 2011 IEEE International Symposium on Information Theory Proceedings, St. Petersburg, Russia, 31 July–5 August 2011; IEEE: Piscataway, NJ, USA, 2011; pp. 2353–2357.
19. Hou, H.; Han, Y.S.; Shum, K.W.; Li, H. A Unified Form of EVENODD and RDP Codes and Their Efficient Decoding. IEEE Trans. Commun. 2018, 66, 5053–5066.
20. Hou, H.; Shum, K.W.; Chen, M.; Li, H. BASIC Codes: Low-complexity Regenerating Codes for Distributed Storage Systems. IEEE Trans. Inf. Theory 2016, 62, 3053–3069.
21. Hou, H.; Han, Y.S. A New Construction and An Efficient Decoding Method for Rabin-like Codes. IEEE Trans. Commun. 2017, 66, 521–533.
22. Strang, G. Introduction to Linear Algebra; Wellesley-Cambridge Press: Wellesley, MA, USA, 1993; Volume 3.
23. Yang, S.l. On the LU factorization of the Vandermonde matrix. Discret. Appl. Math. 2005, 146, 102–105.

Table 1. The average number of XORs involved in steps 11–12 of Algorithm 2.

p, n	λ	Lemma 2 (without Algorithm 1)	Lemma 1 (without Algorithm 1)	XORs (without Algorithm 1)	Lemma 2 (with Algorithm 1)	Lemma 1 (with Algorithm 1)	XORs (with Algorithm 1)	Improvement (%)
(5, 5)	2	0	2	4	0	2	4	0%
	3	3	3	21	2	3	16	23.81%
	4	8	4	48	0	4	8	83.33%
(7, 7)	2	0	2	8	0	2	8	0%
	3	3	3	36	2.4	3	31.2	13.33%
	4	8	4	80	4.4	4	51.2	36%
	5	15	5	140	1	5	28	80%
	6	24	6	216	6	6	72	66.67%

Table 2. Decoding complexities of the decoding method in [17] and our Algorithm 2.

p, n, r	λ	XORs in [17]	XORs of $T_{Alg 2}$	Improvement (%)
(5, 5, 3)	2	89	44	50.56%
(5, 5, 3)	3	150	91	39.33%
(7, 7, 4)	2	211	92	56.40%
	3	300	178.2	40.60%
	4	434	275.2	36.59%

Table 3. The average number of XORs involved in steps 1–2 of Algorithm 3.

p, n	λ	Multiplications (without Algorithm 1)	XORs (without Algorithm 1)	Multiplications (with Algorithm 1)	XORs (with Algorithm 1)	Improvement (%)
(5, 5)	2	6	30	5	25	16.67%
	3	6	30	2	10	66.67%
	4	4	20	2	10	50%
(7, 7)	2	10	70	9	63	10%
	3	12	84	8.4	58.8	30%
	4	12	84	3.6	25.2	70%
	5	10	70	4	28	60%
	6	6	42	3	21	50%

Table 4. Decoding complexities of the decoding method in [18] and our Algorithm 3.

p, n, r	λ	XORs in [18]	XORs of $T_{Alg 3}$	Improvement (%)
(5, 5, 3)	2	122	79	35.25%
(5, 5, 3)	3	146	71	51.37%
(7, 7, 4)	2	292	207	29.11%
	3	378	249	34.13%
	4	438	228.4	47.85%

Table 5. Decoding complexities of the proposed three decoding methods.

p, n, r	λ	$T_{Alg 2}$ (total XORs)	$T_{Alg 3}$ (total XORs)	$T_{Alg 4}$ (total XORs)	$\frac{T_{Alg 2} - T_{Alg 3}}{T_{Alg 2}}$	$\frac{T_{Alg 2} - T_{Alg 4}}{T_{Alg 2}}$
(5, 5, 4)	2	44	79	32	−79.55%	27.27%
	3	91	71	54	21.98%	40.66%
	4	128	38	81	70.31%	36.72%
(7, 7, 6)	2	92	207	74	−125%	19.57%
	3	178.2	249	121	−39.73%	32.10%
	4	275.2	228.4	176	17.01%	36.05%
	5	343	171	239	50.15%	30.32%
	6	492	141	310	71.34%	36.99%
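
The two ratio columns of Table 5 follow directly from the three XOR counts. As a minimal sketch (the function name `relative_saving` and the dictionary `TABLE_5` are hypothetical names introduced here for illustration, not part of the paper), the ratios can be recomputed from the tabulated totals as follows. A negative value means the compared method needs more XORs than Algorithm 2 for that parameter set.

```python
def relative_saving(baseline: float, candidate: float) -> float:
    """Relative XOR saving of `candidate` over `baseline`, in percent.

    Implements (T_baseline - T_candidate) / T_baseline from the Table 5 header.
    """
    return 100.0 * (baseline - candidate) / baseline


# (p, n, r, lambda) -> (T_Alg2, T_Alg3, T_Alg4), copied from Table 5.
TABLE_5 = {
    (5, 5, 4, 2): (44, 79, 32),
    (5, 5, 4, 3): (91, 71, 54),
    (5, 5, 4, 4): (128, 38, 81),
    (7, 7, 6, 2): (92, 207, 74),
    (7, 7, 6, 3): (178.2, 249, 121),
    (7, 7, 6, 4): (275.2, 228.4, 176),
    (7, 7, 6, 5): (343, 171, 239),
    (7, 7, 6, 6): (492, 141, 310),
}

if __name__ == "__main__":
    for params, (t2, t3, t4) in TABLE_5.items():
        # Savings of Algorithm 3 and Algorithm 4 relative to Algorithm 2.
        print(params,
              f"{relative_saving(t2, t3):.2f}%",
              f"{relative_saving(t2, t4):.2f}%")
```

Reading the recomputed values row by row, Algorithm 4 saves XORs over Algorithm 2 for every tabulated parameter set, whereas Algorithm 3 only does so for the larger values of λ.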

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
