Abstract
Blaum–Roth codes are binary maximum distance separable (MDS) array codes over the binary quotient ring $\mathbb{F}_2[x]/(M_p(x))$, where $M_p(x) = 1 + x + \cdots + x^{p-1}$ and $p$ is a prime number. Two existing all-erasure decoding methods for Blaum–Roth codes are the syndrome-based decoding method and the interpolation-based decoding method. In this paper, we propose a modified syndrome-based decoding method and a modified interpolation-based decoding method, each of which has lower decoding complexity than its original counterpart. Moreover, we present a fast decoding method for Blaum–Roth codes based on the LU decomposition of the Vandermonde matrix that has lower decoding complexity than the two modified decoding methods for most parameters.
1. Introduction
Redundancy is necessary in storage systems in order to provide high data reliability in case of disk failures [1]. Replication and erasure codes are the two main ways of including redundancy. In replication, the data in one disk are copied to multiple disks, and the storage system replaces erased disks with their copies. Repairing the erased disks is fast, but replication requires a lot of storage space. In contrast, erasure codes provide higher data reliability with a small storage cost.
Maximum distance separable (MDS) codes [2] are typical erasure codes that have an optimal tradeoff between storage cost and data reliability, i.e., they achieve the minimum storage cost for a given level of data reliability. Binary MDS codes are special MDS codes that have lower computational complexity in the encoding/decoding procedures, since only XORs and cyclic-shift operations are involved. Some existing constructions of binary MDS codes are EVENODD codes [3,4], RDP codes [5], and X-codes [6,7], which can correct any two-column erasures (we use "column" and "disk" interchangeably in this paper). RTP codes [8], Star codes [9,10], and extended EVENODD codes [11,12,13,14] can correct any three-column erasures. With the rapid increase in the data scale in storage systems [15], we need binary MDS codes that can correct any number of erasures, together with efficient encoding/decoding methods. Graftage codes [16] can achieve various tradeoffs between storage and repair bandwidth, while we focus on efficient decoding methods for binary MDS codes. Blaum–Roth codes [17] are codes of this type; they are designed over the ring $\mathbb{F}_2[x]/(M_p(x))$, where $M_p(x) = 1 + x + \cdots + x^{p-1}$ and $p$ is a prime number.
When some columns are erased, the syndrome-based decoding method [17] and the interpolation-based decoding method [18] can be used to recover the erased columns. These decoding methods [17,18] use three basic operations over the ring $\mathbb{F}_2[x]/(M_p(x))$: (i) addition, (ii) multiplication of a power of $x$ and a polynomial, and (iii) division by a factor with two nonzero terms, such as $1 + x^d$. It is shown in [17,18] that we can first take operations (i) and (ii) modulo $1 + x^p$ and then reduce the results modulo $M_p(x)$, while operation (iii) in [17,18] is taken directly modulo $M_p(x)$.
In this paper, we show that operation (iii) can also be computed modulo $1 + x^p$, which has lower computational complexity than computing it modulo $M_p(x)$. We propose modified versions of the two existing decoding methods [17,18] that have a much lower decoding complexity than the originals, for two reasons. First, all the operations in our decoding methods are taken modulo $1 + x^p$, while the existing decoding methods execute the divisions modulo $M_p(x)$. Second, we propose new algorithms in the decoding procedure to reduce the number of operations. Please refer to Section 3 for our two modified decoding methods. Moreover, the efficient LU decoding method [19] proposed for decoding extended EVENODD codes can also be employed to recover the erased columns of Blaum–Roth codes. We show that the LU decoding method has lower decoding complexity than the two modified decoding methods for most parameters. We define the decoding complexity as the total number of XORs required to recover the erased columns.
2. Blaum–Roth Codes
In this section, we first review the construction of Blaum–Roth codes [17] and then show the efficient operations over the ring $\mathbb{F}_2[x]/(1 + x^p)$. Finally, we present an algorithm that computes multiple multiplications by polynomials with two nonzero terms over $\mathbb{F}_2[x]/(1 + x^p)$ with lower complexity.
2.1. Construction of Blaum–Roth Codes [17]
The codeword of Blaum–Roth codes [17] is a array that is encoded from the information bits, where and . We can view any k columns of the array as information columns that store the information bits and the other columns as parity columns that store the parity bits. For , we represent the bits in column j by a polynomial . The array of Blaum-Roth codes is defined as
where is the parity-check matrix
and is the all-zero row of length r. We denote the Blaum–Roth codes defined above as . When and p is a prime number, we can always retrieve all the information bits from any k out of the n polynomials [17], i.e., are MDS codes.
If we let for all , then can be equivalently defined as the following linear constraints. (The subscripts are taken as modulo p unless otherwise specified.)
where and .
Suppose that the columns are erased, where and . Let the surviving columns be , where and . We have
over the ring , where is the square
and , where the syndrome polynomials are
In this paper, we present three efficient decoding methods to solve the linear systems in Equation (1) over the ring .
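The syndrome polynomials on the right-hand side of Equation (1) can be illustrated with a minimal sketch. It assumes the usual Blaum–Roth convention that the parity-check matrix has entry $x^{\ell j}$ in row $\ell$ and column $j$, so that each syndrome is an XOR of cyclically shifted surviving columns. The bit-list representation and the names `shift` and `syndromes` are illustrative assumptions, not notation from the paper.

```python
from typing import Dict, List


def shift(f: List[int], s: int) -> List[int]:
    """Multiply f(x) by x**s modulo 1 + x**p, i.e., cyclically shift f."""
    p = len(f)
    return [f[(i - s) % p] for i in range(p)]


def syndromes(columns: Dict[int, List[int]], p: int, r: int) -> List[List[int]]:
    """Assumed form: S_l(x) = sum_j x**(l*j) * a_j(x) mod 1 + x**p, l = 0..r-1.

    `columns` maps the index j of each surviving column to its p-1 bits.
    """
    result = []
    for l in range(r):
        s = [0] * p
        for j, bits in columns.items():
            padded = bits + [0] * (p - len(bits))  # coefficient of x**(p-1) is 0
            term = shift(padded, (l * j) % p)
            s = [u ^ v for u, v in zip(s, term)]
        result.append(s)
    return result


# Toy case with p = 5, n = 3, r = 1: the single parity column is the XOR
# of the two information columns, and column 2 is treated as erased.
p = 5
a0 = [1, 0, 1, 1]
a1 = [0, 1, 1, 0]
a2 = [u ^ v for u, v in zip(a0, a1)]
S = syndromes({0: a0, 1: a1}, p, r=1)
```

In this toy case the only syndrome equals the erased column itself, which is the one-erasure instance of the linear system; larger numbers of erasures lead to the Vandermonde system treated in the rest of the paper.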
2.2. Efficient Operations over $\mathbb{F}_2[x]/(1 + x^p)$
It is more efficient to compute the multiplication of a power of x and division of the factor over the ring rather than over the ring : (i) Let , and the multiplication over the ring in [17] (Equation (19)) takes XORs, while the multiplication over the ring takes no XORs [20]. (ii) Let , where d is a positive integer, which is coprime with p. Consider the equation
where has an even number of nonzero terms. Given such and d, we can compute by Lemma 1.
Lemma 1.
[Lemma 8] in [21] The coefficients of in Equation (3) are given by
By Lemma 1, computing the division takes XORs, but we are not sure whether has an even number of nonzero terms or not. If we want to guarantee that has an even number of nonzero terms, we should use Lemma 2 to compute the division .
Lemma 2.
[Lemma 13] in [20] The coefficients of in Equation (3) are given by
By Lemma 2, the division takes XORs, and has an even number of nonzero terms. However, computing the division in [Corollary 2] in [17] takes XORs over the ring $\mathbb{F}_2[x]/(M_p(x))$, which is strictly more than the divisions in Lemmas 1 and 2 require. It is shown in [Theorem 5] in [19] that we can always solve the equations in Equation (1) over the ring $\mathbb{F}_2[x]/(1 + x^p)$, and that all the solutions are congruent to each other modulo $M_p(x)$. Therefore, to reduce the computational complexity, we can first solve the equations in Equation (1) over the ring $\mathbb{F}_2[x]/(1 + x^p)$ and then obtain the unique solution by reducing modulo $M_p(x)$.
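The following sketch illustrates the three ingredients discussed in this subsection under stated assumptions: polynomials are length-$p$ bit lists, multiplication by a power of $x$ modulo $1 + x^p$ is a cyclic shift, and division by $1 + x^d$ is done by a simple chain of XORs that fixes the coefficient of $x^{p-1}$ to zero. The recurrence is one straightforward way to solve the equation and is not claimed to be the exact coefficient formula or XOR count of Lemma 1 or Lemma 2; the function names are hypothetical.

```python
from typing import List


def shift(f: List[int], s: int) -> List[int]:
    """Multiply f(x) by x**s modulo 1 + x**p (cyclic shift, no XORs)."""
    p = len(f)
    return [f[(i - s) % p] for i in range(p)]


def reduce_mod_Mp(f: List[int]) -> List[int]:
    """Reduce a length-p vector modulo M_p(x) = 1 + x + ... + x**(p-1).

    Uses x**(p-1) = 1 + x + ... + x**(p-2) modulo M_p(x).
    """
    top = f[-1]
    return [c ^ top for c in f[:-1]]


def divide_by_binomial(f: List[int], d: int) -> List[int]:
    """Return g with (1 + x**d) * g(x) = f(x) mod 1 + x**p and g_{p-1} = 0.

    Assumes p = len(f) is an odd prime, d is not a multiple of p, and f has
    an even number of nonzero terms (otherwise no solution exists).
    """
    p = len(f)
    if d % p == 0:
        raise ValueError("d must not be a multiple of p")
    if sum(f) % 2:
        raise ValueError("f must have an even number of nonzero terms")
    g = [0] * p
    prev = p - 1                      # g[p-1] is fixed to 0
    for _ in range(p - 1):
        cur = (prev + d) % p          # walk through all other indices
        g[cur] = g[prev] ^ f[cur]     # from g_cur + g_{cur-d} = f_cur
        prev = cur
    return g


f = [1, 0, 1, 1, 0, 1, 0]             # p = 7, four nonzero terms
g = divide_by_binomial(f, 3)
g_other = [c ^ 1 for c in g]          # the second solution modulo 1 + x**7
```

Both `g` and `g_other` satisfy the equation modulo $1 + x^p$, and `reduce_mod_Mp` maps them to the same polynomial, which mirrors the statement that all solutions are congruent modulo $M_p(x)$.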
2.3. Multiple Multiplications over $\mathbb{F}_2[x]/(1 + x^p)$
Note that in our modified syndrome-based decoding method and the modified interpolation-based decoding method, we need to compute multiple polynomial multiplications, where each polynomial has two nonzero terms. Suppose that we want to compute the following m multiplications
where m is a positive integer, such that , and .
Algorithm 1 presents a method to simplify the multiplications in Equation (4). In Algorithm 1, we use to denote the number of the polynomial in the multiplication . Note that we only need to count the number of for , because the equation modulo holds for . If , then we have . Therefore, we can always merge multiplications into multiplications and the computational complexity can be reduced with Algorithm 1. When Algorithm 1 is executed, all elements of count-array should be zero or one, and the length of the final is between 1 and m.
Algorithm 1: Simplify the multiple multiplications.
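Since the listing of Algorithm 1 is not reproduced here, the following is only a sketch of the merging idea described above, not the authors' algorithm. It assumes every factor has the form $1 + x^b$ and relies on two identities that hold modulo $1 + x^p$ over $\mathbb{F}_2$: $(1 + x^b)^2 = 1 + x^{2b}$ and $1 + x^{p-b} = x^{p-b}(1 + x^b)$. The names `simplify_factors` and `product` are hypothetical.

```python
from typing import List, Tuple


def simplify_factors(exponents: List[int], p: int) -> Tuple[int, List[int]]:
    """Rewrite prod_i (1 + x**b_i) mod 1 + x**p as x**s * prod_j (1 + x**c_j).

    Returns (s, [c_j]) with distinct c_j in 1..(p-1)//2, so every entry of
    the count array ends up as zero or one.
    """
    half = (p - 1) // 2
    count = [0] * (half + 1)
    s = 0
    pending = list(exponents)
    while pending:
        b = pending.pop() % p
        if b == 0:
            raise ValueError("1 + x**0 is the zero polynomial over F_2")
        if b > half:                  # 1 + x**b = x**b * (1 + x**(p-b))
            s = (s + b) % p
            b = p - b
        count[b] += 1
        if count[b] == 2:             # (1 + x**b)**2 = 1 + x**(2b)
            count[b] = 0
            pending.append(2 * b)
    return s, [b for b in range(1, half + 1) if count[b]]


def product(exponents: List[int], p: int, s: int = 0) -> List[int]:
    """Multiply out x**s * prod_i (1 + x**b_i) modulo 1 + x**p."""
    f = [0] * p
    f[s % p] = 1
    for b in exponents:
        f = [f[i] ^ f[(i - b) % p] for i in range(p)]
    return f


factors = [1, 1, 6, 3, 3, 3]          # six two-term factors, p = 7
s, merged = simplify_factors(factors, 7)
```

Each merge replaces two multiplications by one, and multiplying by $x^s$ is a cyclic shift that costs no XORs, so the merged list never contains more factors than the input.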
3. Decoding Algorithm
In this section, we present two decoding methods over the ring $\mathbb{F}_2[x]/(1 + x^p)$ that modify the two existing decoding methods [17,18] in order to reduce the decoding complexity.
Recall that the erased columns are columns , and the surviving columns are columns .
3.1. Modified Syndrome-Based Method
We define the function of the indeterminate z
and the syndrome function , where and is given in Equation (2). We can obtain in [Equation (18)] in [17] that
Therefore, the can be regarded as the coefficient of of the polynomial . Then, the erased column is given by , where .
Note that the terms of set are not involved in computing the coefficient of of the polynomial . Thus, we can just consider the first terms (the coefficients of degrees less than ) of when computing these coefficients, but all the r terms of are calculated in [Step 1] in [17]. This is one essential way our modified syndrome-based decoding method obtains a lower decoding complexity than the original method in [17].
Moreover, the syndrome polynomials satisfy
i.e., the syndrome polynomials either all have an even number of nonzero terms, or they all have an odd number of nonzero terms, from the definition of Equation (2).
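One way to see this parity property is sketched below. It assumes that each syndrome is a sum of cyclically shifted surviving columns, $S_\ell(x) = \sum_j x^{\ell j} a_j(x) \bmod (1 + x^p)$; the symbols $S_\ell(x)$ and $a_j(x)$ are assumed notation for the syndromes and the surviving columns.

```latex
% Assumed notation: a_j(x) is surviving column j, S_\ell(x) is the \ell-th syndrome.
% Over F_2, f(1) equals the parity of the number of nonzero terms of f(x),
% and reduction modulo 1 + x^p preserves f(1) because 1 + x^p vanishes at x = 1.
\begin{align*}
S_\ell(1) \;=\; \sum_{j} 1^{\ell j}\, a_j(1) \;=\; \sum_{j} a_j(1) \pmod 2 .
\end{align*}
% The right-hand side does not depend on \ell, so all syndromes share the same parity.
```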
Let and . Then, we have
Thus, is independent of the erasure index i, and we only need to compute once in the decoding procedure. Recall that is the coefficient of of the polynomial ; then, the is also the coefficient of of the polynomial for all . Suppose that
we can derive the recurrence formula
where . Notice that holds. Similar to , we only compute the first terms (the coefficients of degrees less than ) of , since the other coefficients of are not needed, but all the terms of are calculated in [Step 2] in [17]. This is another way our modified syndrome-based decoding method obtains a lower decoding complexity than the original method in [17]. Algorithm 2 shows our modified syndrome-based decoding method over the ring $\mathbb{F}_2[x]/(1 + x^p)$.
The following Lemma shows that we can always compute the divisions in steps 11–12 of Algorithm 2 by Lemmas 1 and 2 when .
Lemma 3.
In steps 11–12 of Algorithm 2, the has an even number of nonzero terms for all , and we can employ Lemmas 1 and 2 to compute the divisions.
Proof.
From Equation (8) and steps 7–10 of Algorithm 2, we obtain
where . If the number of polynomials in the set , which has an odd number of nonzero terms, is an even number, then the has an even number of nonzero terms for . In the following, we will show this is true. According to Equation (6) and step 3 of Algorithm 2, holds.
Firstly, we consider . We denote the polynomials with as . Let be the initial for .
To prove that the number of polynomials with an odd number of nonzero terms in the set is even, it is equivalent to prove that .
Algorithm 2: Modified syndrome-based decoding method.
According to Equation (7) and steps 4–6 of Algorithm 2, we have
where . The holds for all . We can obtain by induction
Note that ; we can suppose that there are an even number of polynomials in the set , which has an odd number of nonzero terms, when , i.e., first. We have ; so,
Equation (11) comes from Equation (10) with . Therefore, there are an even number of polynomials in the set , which has an odd number of nonzero terms.
Secondly, when , the argument is similar. This completes the proof. □
According to Lemma 3, we can use Lemmas 1 and 2 to compute the divisions in step 12. The number of divisions required in step 12 is recorded as , which ranges from 1 to for . So, we can obtain in step 12 by recursively computing the division times, while the number of nonzero terms of the polynomial resulting from the first divisions is even. Therefore, we can execute these divisions by Lemma 2 and execute the last division by Lemma 1. The computational complexity in steps 11–12 of Algorithm 2 is
where .
In steps 11–12 of Algorithm 2, we take the division without Algorithm 1, in which divisions are executed by Lemma 1 and divisions are executed by Lemma 2; however, the number of the divisions can be reduced with Algorithm 1. In Table 1, we show the average number of divisions in steps 11–12 of Algorithm 2 executed by Lemma 1 and Lemma 2 with Algorithm 1 for .
Table 1. The average number of XORs involved in steps 11–12 of Algorithm 2.
We specify the computational complexity of Algorithm 2 as follows:
- Steps 1–2 take XORs.
- Steps 3–6 take XORs.
- Steps 7–10 take XORs.
- Steps 11–12 take XORs by Equation (12).
Then, the computational complexity of Algorithm 2 is
where
Recall that the computational complexity of the decoding method in [17] is
which is strictly larger than .
Table 2 evaluates the computational complexity of the decoding method in [17] and Algorithm 2 for some parameters. The results in Table 2 demonstrate that Algorithm 2 has much lower decoding complexity, compared with the original decoding method in [17]. For example, Algorithm 2 has 40.60% less decoding complexity than the decoding method in [17] when .
Table 2. Decoding complexity of the method in [17] and Algorithm 2.
The reason why Algorithm 2 has lower decoding complexity than the decoding method in [17] can be summarized as the following three points.
Firstly, we only consider the first terms (the coefficients of degrees less than ) for both and in computing the coefficients of , while all r terms of and all terms of are calculated in the decoding method in [17], where .
Secondly, all the divisions in Algorithm 2 are executed over the ring $\mathbb{F}_2[x]/(1 + x^p)$ by Lemmas 1 and 2, which takes XORs and XORs for each division, respectively. In contrast, the division in [17] is executed over the ring $\mathbb{F}_2[x]/(M_p(x))$, which takes XORs [17] (Corollary 2).
Thirdly, we apply Algorithm 1 to steps 11–12 of Algorithm 2, which can significantly reduce the number of divisions, thus reducing the number of XORs required.
3.2. Modified Interpolation-Based Decoding Method
According to the decoding method in [18], we can recover the erased column with by
where and . Let
where . Then, has an even number of nonzero terms, and we only need to compute once for in the decoding procedure, since is independent of the erasure index i. Let
where , and . Algorithm 3 shows our modified interpolation-based method over the ring .
After using Algorithm 1, the number of polynomial multiplications in step 2 ranges from 1 to . Thus, the computational complexity in steps 1–2 of Algorithm 3 is
Algorithm 3: Modified interpolation-based method.
In steps 1–2, we need to take multiplications without Algorithm 1, which takes XORs; however, with Algorithm 1, the number of multiplications involved in steps 1–2 can be reduced. In Table 3, we show the average number of XORs involved in steps 1–2 of Algorithm 3 with Algorithm 1 for . The results in Table 3 show that we can reduce the number of XORs with Algorithm 1, especially for a large value of .
Table 3. The average number of XORs involved in steps 1–2 of Algorithm 3.
Only steps 4 and 6 of Algorithm 3 involve divisions. We should employ Lemma 2 to execute the divisions in steps 3–4 of Algorithm 3, since in step 6 of Algorithm 3 should have an even number of nonzero terms. Notice that steps 5–6 of Algorithm 3 are exactly the same as steps 11–12 of Algorithm 2.
We specify the computational complexity of Algorithm 3 as follows:
Then, the computational complexity of Algorithm 3 is
where
Recall that the computational complexity of the decoding method in [18] is
which is larger than that of our Algorithm 3.
Table 4 evaluates the computational complexity of the decoding method in [18] and Algorithm 3 for some parameters. The results in Table 4 demonstrate that our Algorithm 3 has much lower decoding complexity than the original decoding method in [18]. For example, Algorithm 3 has a 34.13% lower decoding complexity than the decoding method in [18] when .
Table 4. Decoding complexities of the decoding method in [18] and our Algorithm 3.
The reason why Algorithm 3 has a lower decoding complexity than that of the decoding method in [18] is summarized as follows.
Firstly, all the divisions in Algorithm 3 are executed over the ring $\mathbb{F}_2[x]/(1 + x^p)$ by Lemmas 1 and 2, which take XORs and XORs for each division, respectively. The division in the decoding method in [18] is executed over the ring $\mathbb{F}_2[x]/(M_p(x))$, which takes XORs.
Secondly, we apply our Algorithm 1 to steps 1–2 and steps 5–6, which significantly reduces the number of multiplications and thus the number of XORs required.
4. LU Decomposition-Based Method
The LU factorization of a matrix [22] expresses the matrix as a product of a lower triangular matrix $L$ and an upper triangular matrix $U$. According to the LU factorization of the Vandermonde matrix [23], we can express a Vandermonde matrix as a product of several lower triangular matrices and several upper triangular matrices. Therefore, we can solve the Vandermonde linear equations by first solving the linear systems whose coefficient matrices are the upper triangular matrices and then solving the linear systems whose coefficient matrices are the lower triangular matrices.
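The idea of replacing one Vandermonde solve by a sequence of simple triangular solves can be illustrated with the classical Björck–Pereyra elimination. This is a swapped-in textbook technique over the rational numbers, chosen for readability; it is not Algorithm 4, which works over a polynomial ring where subtraction becomes XOR, multiplication by a node becomes a cyclic shift, and each division becomes a division by a two-term factor. The function name and the matrix orientation `V[i][j] = x[j]**i` are assumptions of this sketch.

```python
from fractions import Fraction
from typing import List, Sequence


def solve_vandermonde(x: Sequence[int], b: Sequence[int]) -> List[Fraction]:
    """Solve V z = b with V[i][j] = x[j]**i for distinct nodes x[j].

    Each sweep applies a sequence of bidiagonal triangular operators, so no
    dense matrix is ever formed or inverted.
    """
    n = len(x)
    z = [Fraction(v) for v in b]
    # First sweep: each step combines an entry with the one just above it.
    for k in range(n - 1):
        for i in range(n - 1, k, -1):
            z[i] -= x[k] * z[i - 1]
    # Second sweep: scale by node differences, then combine each entry
    # with the one just below it.
    for k in range(n - 2, -1, -1):
        for i in range(k + 1, n):
            z[i] /= x[i] - x[i - k - 1]
        for i in range(k, n - 1):
            z[i] -= z[i + 1]
    return z


nodes = [1, 2, 3]
rhs = [3, 7, 19]                      # equals V applied to the vector (1, 0, 2)
solution = solve_vandermonde(nodes, rhs)
```

The only divisions are by differences of two nodes, which is the counterpart of the two-term divisions handled by Lemmas 1 and 2 in the ring setting.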
Suppose that the erased columns are columns and the surviving columns are . Algorithm 4 shows our LU decomposition-based method over the ring .
According to [Theorem 8] in [19], Equation (1) can be factorized into
over the ring , where is the upper triangular matrix
and is the lower triangular matrix
for .
Algorithm 4: LU decomposition-based method.
We specify the computational complexity of Algorithm 4 as follows:
- Steps 1–2 require XORs.
- Steps 3–11 require XORs at most, according to [Theorem 10] in [19].
Then, the computational complexity of Algorithm 4 is
5. Comparison and Conclusions
Table 5 evaluates the decoding complexity of Algorithms 2–4 for some parameters. The results in Table 5 demonstrate that Algorithm 2 performs better than Algorithm 3 if ; otherwise, if , then Algorithm 3 has less decoding complexity. Algorithm 4 has less decoding complexity than both Algorithms 2 and 3 when is small. However, when is large, Algorithm 3 is more efficient than Algorithm 4. For example, compared with , Algorithms 2–4 have and less decoding complexity, respectively, when .
Table 5. Decoding complexities of the proposed three decoding methods.
In this paper, we presented three efficient decoding methods for the erasures of Blaum–Roth codes, all of which have lower decoding complexity than the existing decoding methods. The efficient implementation of the proposed decoding methods in practical storage systems is left for future work.
Author Contributions
Funding acquisition, H.H.; methodology, H.H.; writing (original draft preparation), W.Z.; writing (review and editing), H.H. All authors have read and agreed to the published version of the manuscript.
Funding
This research was funded by the National Natural Science Foundation of China, grant number 62071121, and the National Key R&D Program of China, grant number 2020YFA0712300.
Institutional Review Board Statement
Not applicable.
Data Availability Statement
All data generated or analysed during this study are included in this published article.
Conflicts of Interest
The authors declare no conflict of interest.
References
- Peng, P. Redundancy Allocation in Distributed Systems. Ph.D. Thesis, Rutgers The State University of New Jersey, School of Graduate Studies, New Brunswick, NJ, USA, 2022.
- MacWilliams, F.J.; Sloane, N.J.A. The Theory of Error Correcting Codes; Elsevier: Amsterdam, The Netherlands, 1977; Volume 16.
- Blaum, M.; Brady, J.; Bruck, J.; Menon, J. EVENODD: An Efficient Scheme for Tolerating Double Disk Failures in RAID Architectures. IEEE Trans. Comput. 1995, 44, 192–202.
- Hou, H.; Lee, P.P.C. A New Construction of EVENODD Codes With Lower Computational Complexity. IEEE Commun. Lett. 2018, 22, 1120–1123.
- Corbett, P.; English, B.; Goel, A.; Grcanac, T.; Kleiman, S.; Leong, J.; Sankar, S. Row-diagonal Parity for Double Disk Failure Correction. In Proceedings of the 3rd USENIX Conference on File and Storage Technologies, San Francisco, CA, USA, 31 March–4 April 2004; pp. 1–14.
- Xu, L.; Bruck, J. X-code: MDS Array Codes with Optimal Encoding. IEEE Trans. Inf. Theory 1999, 45, 272–276.
- Tsunoda, Y.; Fujiwara, Y.; Ando, H.; Vandendriessche, P. Bounds on separating redundancy of linear codes and rates of X-codes. IEEE Trans. Inf. Theory 2018, 64, 7577–7593.
- Goel, A.; Corbett, P. RAID Triple Parity. ACM SIGOPS Oper. Syst. Rev. 2012, 46, 41–49.
- Huang, C.; Xu, L. STAR: An Efficient Coding Scheme for Correcting Triple Storage Node Failures. IEEE Trans. Comput. 2008, 57, 889–901.
- Hou, H.; Lee, P.P.C. STAR+ Codes: Triple-Fault-Tolerant Codes with Asymptotically Optimal Updates and Efficient Encoding/Decoding. In Proceedings of the 2021 IEEE Information Theory Workshop (ITW 2021), Kanazawa, Japan, 17–21 October 2021.
- Blaum, M.; Brady, J.; Bruck, J.; Menon, J.; Vardy, A. The EVENODD Code and its Generalization: An Efficient Scheme for Tolerating Multiple Disk Failures in RAID Architectures. In High Performance Mass Storage and Parallel I/O; Wiley-IEEE Press: Hoboken, NJ, USA, 2002; Chapter 8; pp. 187–208.
- Blaum, M.; Bruck, J.; Vardy, A. MDS Array Codes With Independent Parity Symbols. IEEE Trans. Inf. Theory 1996, 42, 529–542.
- Hou, H.; Shum, K.W.; Chen, M.; Li, H. New MDS Array Code Correcting Multiple Disk Failures. In Proceedings of the 2014 IEEE Global Communications Conference, Austin, TX, USA, 8–12 December 2014; IEEE: Piscataway, NJ, USA, 2014; pp. 2369–2374.
- Fu, H.; Hou, H.; Zhang, L. Extended EVENODD+ Codes with Asymptotically Optimal Updates and Efficient Encoding/Decoding. In Proceedings of the 2021 XVII International Symposium "Problems of Redundancy in Information and Control Systems" (REDUNDANCY), Moscow, Russia, 25–29 October 2021; IEEE: Piscataway, NJ, USA, 2021; pp. 1–6.
- Chiniah, A.; Mungur, A. On the Adoption of Erasure Code for Cloud Storage by Major Distributed Storage Systems. EAI Endorsed Trans. Cloud Syst. 2022, 7, e1.
- Rui, J.; Huang, Q.; Wang, Z. Graftage Coding for Distributed Storage Systems. IEEE Trans. Inf. Theory 2021, 67, 2192–2205.
- Blaum, M.; Roth, R.M. New Array Codes for Multiple Phased Burst Correction. IEEE Trans. Inf. Theory 1993, 39, 66–77.
- Guo, Q.; Kan, H. On Systematic Encoding for Blaum-Roth Codes. In Proceedings of the 2011 IEEE International Symposium on Information Theory Proceedings, St. Petersburg, Russia, 31 July–5 August 2011; IEEE: Piscataway, NJ, USA, 2011; pp. 2353–2357.
- Hou, H.; Han, Y.S.; Shum, K.W.; Li, H. A Unified Form of EVENODD and RDP Codes and Their Efficient Decoding. IEEE Trans. Commun. 2018, 66, 5053–5066.
- Hou, H.; Shum, K.W.; Chen, M.; Li, H. BASIC Codes: Low-complexity Regenerating Codes for Distributed Storage Systems. IEEE Trans. Inf. Theory 2016, 62, 3053–3069.
- Hou, H.; Han, Y.S. A New Construction and An Efficient Decoding Method for Rabin-like Codes. IEEE Trans. Commun. 2017, 66, 521–533.
- Strang, G. Introduction to Linear Algebra; Wellesley-Cambridge Press: Wellesley, MA, USA, 1993; Volume 3.
- Yang, S.-L. On the LU factorization of the Vandermonde matrix. Discret. Appl. Math. 2005, 146, 102–105.
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).



