En-AR-PRNS: Entropy-Based Reliability for Configurable and Scalable Distributed Storage Systems



Introduction
Reliable storage is key for computer systems. Various factors can decrease its reliability and safety: hardware and software malfunctions, integrity violations, unexpected and unauthorized data modifications, data loss, malicious intrusions, falsifications, denial of access, etc.
The advantage of a redundant residue number system (RRNS) is that it has the properties of error correction codes, homomorphic encryption (HE), and secret sharing schemes (SSS). It is a non-positional system with no dependence between its shares (i.e., residues). Thus, an error in one share is not propagated to other shares. Therefore, the isolation of faulty residues allows for fault tolerance and facilitates error detection and correction.
Chervyakov et al. (2019) [2] used the Asmuth-Bloom algorithm, making the SSS asymptotically ideal. It is effective but inefficient for distributed storage, since it requires redundancy as large as the Shamir scheme. Tchernykh et al. (2018) [3] estimated the risks of cloud conspiracy, DDoS attacks, and errors of cloud providers to find the appropriate moduli and the number of control and working moduli.
Here, we adapted this approach for a polynomial residue number system (PRNS) to control the computational security of distributed storage. By selecting the PRNS moduli, we could also minimize the computational costs associated with the coding-decoding operations. Since these operations are modular, we had to consider moduli that were irreducible polynomials over the binary field. Parallel execution could be performed as a series of smaller calculations broken down from a single, large overall calculation, with horizontal and vertical scalability.
To reduce the decoding complexity from RRNS to a binary representation, Chervyakov et al. (2019) [2] introduced the concept of an approximation of the rank of a number for RRNS (AR-RRNS). This reduces the number of calculated projections and replaces computationally complex long integer division by taking the least significant bits. An approximation of the rank for PRNS (AR-PRNS) was studied by Tchernykh et al. (2019) [15] to avoid the most resource-consuming operation of finding the residue from division by a large polynomial. The authors compared the data coding and decoding speeds of PRNS and RRNS. They also provided a theoretical analysis of data redundancy and mechanisms to configure parameters to cope with different objective preferences.
The contribution of this paper is multifold:

• We propose a novel entropy-based mechanism for improving the reliability of distributed data storage;
• An En-AR-PRNS scheme is proposed that combines the SSS and error correction codes with multiple failure detection/recovery mechanisms. It can detect and correct more errors than the state-of-the-art threshold-based PRNS;
• A theoretical analysis is presented, providing the basis for the dynamic storage configuration to deal with varied user preferences and storage properties to ensure high-quality solutions in a non-stationary environment;

• The concept of an AR of the polynomial to improve decoding speed is applied;
• Approaches to efficiently exploit parallel processing to provide fast, high-quality solutions for the security, reliability, and performance optimization needed in the non-stationary cloud environment are classified.
Data replication provides for the storage of the same data in multiple locations to improve data availability, accessibility, system resilience, and reliability. This approach becomes expensive with the growing amount of stored data (Ghemawat et al., 2003) [19]. One typical use of data replication is disaster recovery, ensuring that an accurate backup exists in the case of a catastrophe, a hardware failure, or a system breach where data are compromised.
Secret sharing schemes (SSS) refer to methods for dividing data and distributing shares of the secret between participants in such a way that it can be reconstructed only when a sufficient number of shares are combined (Shamir, Blakley, etc.). They are suitable for securely distributed storage systems (Gomathisankaran et al., 2011) [19]. Asmuth and Bloom (1983) [24] and Mignotte (1982) [25] proposed asymptotically perfect SSS based on RNS.
In RNS, a number is represented by residues: remainders under Euclidean division of the number by several pairwise coprime integers called the moduli. Because the original number is divided into smaller numbers, operations are carried out independently and concurrently, providing fast computations.
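The residue representation above can be sketched in a few lines (the moduli and the brute-force CRT search below are purely illustrative):

```python
# Minimal integer-RNS sketch: residues under pairwise coprime moduli are
# computed independently; reconstruction here uses an exhaustive CRT search
# over the dynamic range [0, 3*5*7) for illustration only.
moduli = (3, 5, 7)
A = 52
residues = tuple(A % m for m in moduli)

# By the CRT, the residues determine A uniquely within the dynamic range.
A_rec = next(x for x in range(3 * 5 * 7) if tuple(x % m for m in moduli) == residues)
print(residues, A_rec)  # (1, 2, 3) 52
```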
An RRNS is obtained from an RNS by adding redundant residues, which bring multiple-error detection and correction capability (Celesti et al., 2016) [14]. With r redundant moduli, it can detect r and correct ⌊r/2⌋ errors using projection methods. However, the number of necessary projections grows exponentially with increasing r. Significant optimization is required for practical efficiency.
An erasure code (EC) converts length-p data into length-s data (s > p) so that the original message can be recovered from a subset of the s characters. An efficient implementation of an EC has O(p·log₂ p) complexity (Lin et al., 2014) [26].
Regenerating codes (RC) are designed to provide reliability and the efficient repair (reconstruction) of lost coded fragments in distributed storage systems. They overcome the limitations of the traditional codes mentioned above, significantly reducing the traffic (repair bandwidth) required for repairing. Furthermore, they offer reasonable tradeoffs between storage and repair bandwidth (Dimakis et al., 2010) [22]. A special pseudorandom number generator is required to implement an RC effectively (Liu et al., 2015) [27].

Basic Definitions
The original data, A, can be represented by the polynomial A_P(x) over GF(2^m) of the form A_P(x) = a_{m-1}x^{m-1} + ... + a_1x + a_0, with a_i ∈ {0, 1}, where the highest exponent of the variable x with a nonzero coefficient is called the degree of the polynomial. For example, A = 243 = 11110011₂. Hence, A_P(x) = x^7 + x^6 + x^5 + x^4 + x + 1, with a degree of seven.
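As a sketch of this representation, an integer's bits can serve directly as the GF(2) coefficients (the function names are ours, not the paper's):

```python
def poly_degree(a: int) -> int:
    """Degree of the GF(2) polynomial whose coefficients are the bits of a
    (-1 for the zero polynomial)."""
    return a.bit_length() - 1

def poly_str(a: int) -> str:
    """Render the polynomial, highest exponent first."""
    if a == 0:
        return "0"
    terms = []
    for i in range(poly_degree(a), -1, -1):
        if (a >> i) & 1:
            terms.append("1" if i == 0 else ("x" if i == 1 else f"x^{i}"))
    return " + ".join(terms)

A = 243  # 11110011 in binary
print(poly_degree(A))  # 7
print(poly_str(A))     # x^7 + x^6 + x^5 + x^4 + x + 1
```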
The PRNS moduli form an n-tuple of irreducible polynomials over the binary field. We denote this n-tuple: m_1(x), m_2(x), ..., m_n(x), where n is the number of moduli, r is the number of redundant (control) moduli, and k = n - r is the number of working moduli. d_i is the degree of the polynomial m_i(x).
For a unique residue representation of an arbitrary element from GF(2^m), D should be greater than or equal to m and degA_P(x) < D. In PRNS, A_P(x) is represented by an n-tuple of residues, A_PRNS = (a_1(x), a_2(x), ..., a_n(x)), where a_i(x) = |A_P(x)|_{m_i(x)} is the residue of the division of the polynomial A_P(x) by m_i(x), for i = 1, 2, ..., n.
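Computing a residue a_i(x) over GF(2)[x] amounts to carry-less long division, which can be sketched on integers whose bits encode the coefficients (the two degree-three moduli below are illustrative, not the paper's parameters):

```python
def gf2_mod(a: int, m: int) -> int:
    """Remainder of the GF(2)[x] division of a by m; polynomials are packed
    into ints, bit i holding the coefficient of x^i (carry-less long division)."""
    dm = m.bit_length() - 1
    while a.bit_length() - 1 >= dm:
        a ^= m << (a.bit_length() - 1 - dm)
    return a

# Illustrative irreducible moduli over GF(2): x^3 + x + 1 and x^3 + x^2 + 1
moduli = [0b1011, 0b1101]
A = 0b11110011  # A_P(x) = x^7 + x^6 + x^5 + x^4 + x + 1
residues = [gf2_mod(A, m) for m in moduli]
print([bin(r) for r in residues])  # ['0b110', '0b0']
```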
To convert the residual representation to a polynomial one, the following CRT extension is used:
A_P(x) = |∑_{i=1}^{n} M_i(x)·O_i(x)·a_i(x)|_{M(x)}, (1)
where M(x) = ∏_{i=1}^{n} m_i(x) and M_i(x) = M(x)/m_i(x). The quantities O_i(x) are known as the multiplicative inverses of M_i(x) modulo m_i(x), such that |M_i(x)·O_i(x)|_{m_i(x)} = 1. Throughout the paper, we introduce notations related to the main topics, such as PRNS, SSS, cloud storage, coding-decoding, error detection and correction, entropy, and reliability. To facilitate understanding, we summarize the main notations in Table 1.
Table 1. Main notations (fragment):
A_PRNS - tuple of residues of A_P(x);
E_PRNS - tuple of errors;
E_P(x) - polynomial representation of an error;
Ã_PRNS = A_PRNS + E_PRNS - tuple of residues of A_P(x) with errors;
a_i(x) - residue of the division of the polynomial A_P(x) by m_i(x).
The proof is as follows: Let A_P(x) be represented in the redundant PRNS as A_PRNS = (a_1(x), a_2(x), ..., a_n(x)), with the legitimate range of the degree being no higher than D.
Since A_P(x) is an element over GF(2^m), its degree is smaller than or equal to D - 1; thus, A_P(x) lies within the legitimate range. Now, let us convert E_PRNS from the PRNS back to the polynomial representation using (1) with the full range:
E_P(x) = |∑_{i=1}^{n} M̂_i(x)·Ô_i(x)·e_i(x)|_{M(x)},
where M̂_i(x) = M(x)/m_i(x) and Ô_i(x) = |M̂_i^{-1}(x)|_{m_i(x)}.
The degree of M̂_i(x) is D̂ - d_i, where D̂ = degM(x) is the degree of the full range, and the degree of |e_i(x)·Ô_i(x)|_{m_i(x)} can be any number from 0 to d_i - 1. Hence, the degree of E_P(x), which we denote as D_E = degE_P(x), ranges from D̂ - d_i to D̂ - 1.
Since D̂ - d_i ≥ D, the following condition is true: if degÃ_P(x) < D, then Ã_P(x) is correct; otherwise, Ã_P(x) is not correct.
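A runnable sketch of this degree-based detection (our own minimal implementation, not the paper's code; the moduli — two working of degree three plus one degree-four control — are illustrative choices satisfying the detection condition above):

```python
# Encode A_P(x) in a redundant PRNS, corrupt one residue, reconstruct over
# the FULL range (all n moduli) via the CRT, and flag an error when the
# reconstructed degree reaches the legitimate bound D.

def gf2_mul(a, b):
    """Carry-less multiplication in GF(2)[x]."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        b >>= 1
    return r

def gf2_divmod(a, m):
    """Quotient and remainder of GF(2)[x] division (polynomials as bit masks)."""
    q, dm = 0, m.bit_length() - 1
    while a.bit_length() - 1 >= dm:
        s = a.bit_length() - 1 - dm
        q ^= 1 << s
        a ^= m << s
    return q, a

def gf2_inv_mod(a, m):
    """Multiplicative inverse of a modulo m via the extended Euclidean algorithm."""
    r0, r1, t0, t1 = m, a, 0, 1
    while r1:
        q, r = gf2_divmod(r0, r1)
        r0, r1, t0, t1 = r1, r, t1, t0 ^ gf2_mul(q, t1)
    return t0  # assumes gcd(a, m) = 1

def crt(residues, moduli):
    """Equation (1): A_P(x) = | sum_i M_i(x) * O_i(x) * a_i(x) |_M(x)."""
    M = 1
    for m in moduli:
        M = gf2_mul(M, m)
    acc = 0
    for a_i, m_i in zip(residues, moduli):
        M_i = gf2_divmod(M, m_i)[0]
        O_i = gf2_inv_mod(gf2_divmod(M_i, m_i)[1], m_i)
        acc ^= gf2_mul(gf2_mul(M_i, O_i), a_i)
    return gf2_divmod(acc, M)[1]

working = [0b1011, 0b1101]  # x^3+x+1, x^3+x^2+1; legitimate degree bound D = 6
control = [0b10011]         # x^4+x+1 (control modulus)
moduli = working + control
D = sum(m.bit_length() - 1 for m in working)

A = 0b101101                # deg 5 < D: inside the legitimate range
res = [gf2_divmod(A, m)[1] for m in moduli]

ok = crt(res, moduli)       # clean shares decode back to A
res[0] ^= 1                 # inject a single-bit error into one share
bad = crt(res, moduli)      # corrupted decode leaves the legitimate range

print(ok == A)                    # True
print(bad.bit_length() - 1 >= D)  # True: error detected by the degree test
```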

Coding Speed
For the performance analysis, we used a VM (Intel Xeon® E5-2673 v.3, 2 GB of RAM, 16 GB SSD hard drive) provided by Microsoft Azure [15]. It performed 2^30 bit operations per second, on average, according to the Geekbench Browser site [34].
The PRNS coding complexity was O(k·d) (Chervyakov et al., 2016) [35]. Hence, the calculation of the remainder of the division required roughly k·d bit operations.
To obtain a polynomial of degree D - 1, we found the remainder of the division by the PRNS moduli n times. Hence, the total number of bit operations was n·k·d.
The size of the original data was represented by D = k·d bits. The number of D-bit blocks in 1 Mb was 2^23/D = 2^23/(k·d). Therefore, to code 1 Mb of data, (2^23/(k·d))·n·k·d = 2^23·n bit operations were required.
The coding speed, V_C, in Mb/s can be calculated by the formula V_C = 2^30/(2^23·n). Figure 1 shows that the coding speed, V_C, versus the (k, n) parameters had a sawtooth shape. Peaks were achieved for the (n, n) schemes, and minimums were achieved for (2, n). The user can select the required parameters to obtain the required speed.
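Under the bit-operation counts above and the 2^30 ops/s machine, the coding speed reduces to a one-liner (a sketch of our reading of the formula; the function name is ours):

```python
def coding_speed_mb_s(n: int, ops_per_second: int = 2**30) -> float:
    """V_C = ops_per_second / (2^23 * n) Mb/s, per the bit-operation count
    of 2^23 * n operations to code 1 Mb."""
    return ops_per_second / (2**23 * n)

print(coding_speed_mb_s(4))  # 32.0 Mb/s
print(coding_speed_mb_s(8))  # 16.0 Mb/s
```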
Figure 2 shows the coding speeds, V_C (Mb/s), of the PRNS and RRNS. We can see that the coding speed of the PRNS was higher than that of the RRNS for all parameters. The PRNS operations on GF(2^m) did not require the transfer of values from the lowest to the highest term. Consequently, the time complexity of the division remainder decreased compared to the arithmetic operations of the RRNS performed over GF(2^m).

PRNS Decoding Speed with Data Errors
To correct the ⌊r/2⌋ errors with r extra moduli, we used the projection method. To compute a projection, the CRT with algorithmic complexity O(D^2) required roughly N_P = D^2 bit operations. Since the number of projections was C(n, ⌊r/2⌋), to detect and localize an error, we needed C(n, ⌊r/2⌋)·N_P bit operations. Hence, (2^23/(k·d))·C(n, ⌊r/2⌋)·N_P bit operations were required to code 1 Mb, with the worst-case decoding speed, V_D.
Figure 3 shows the data decoding speed, V_D, varying the PRNS parameters for d = {8, 16, 32}.

We can see that the computational complexity of the PRNS, when an error was localized, was higher than that of the RRNS. Hence, the PRNS decoding speed was lower than that of the RRNS.
Figure 4 shows the PRNS and RRNS decoding speeds, V_D (Mb/s), in the worst-case scenario with the maximum number of errors. The approximation of the rank of the PRNS number was used to increase this speed (Section 5).



Data Redundancy
To prevent the disruption of system operation in the case of technical failures, disasters, and cyber-attacks, maintaining service continuity through data redundancy is very important.
The input polynomial is A_P(x), where degA_P(x) < D. Hence, the total input volume approximately equals D bits. The number of bits needed for the residues is equal to D in the worst case. The data redundancy degradation is the ratio of the coded data size to the original size minus 1. Let us select m_i(x) so that a_i(x) uses at most d_1 = d_2 = ... = d_n = d bits, or one word. Hence, the redundancy is roughly n·d/(k·d) - 1 = n/k - 1.
Figure 5 shows the redundancy versus the PRNS parameters. We can see that the minimum values are for (n, n), where n = 4, 5, ..., 9. These values are less than those of the Bigtable system (⌈(n + 1)/3⌉, n). The PRNS reduces data redundancy compared to a numerical RNS, because the degree of the polynomial (residue) is strictly less than the degree of the divisor. We demonstrate this in the following example, where the redundancy degradation is 4/4 - 1 = 0.

Example 1. Let us consider a (4, 4) scheme with the PRNS, where the moduli m_1(x), ..., m_4(x) are irreducible polynomials of degree eight. In this case, the dynamic range has 32 bits, and in the representation A_P(x) → (a_1(x), a_2(x), a_3(x), a_4(x)), the residues a_1(x), ..., a_4(x) are seventh-degree polynomials of 8 bits.
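The redundancy figure of merit above is simple enough to sketch (the function name is ours):

```python
def redundancy(n: int, k: int) -> float:
    """Data redundancy degradation: coded size / original size - 1,
    which is approximately n/k - 1 when all moduli have equal degree d."""
    return n / k - 1

print(redundancy(4, 4))  # 0.0 for the (4, 4) scheme of Example 1
print(redundancy(8, 5))  # 0.6 for a (5, 8) scheme
```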

Rank
Reducing the computational complexity of the decoding algorithm is of the utmost interest. One possible approach is to approximate the value of the AR [15].
In the CRT, the rank r_A(x) of A_PRNS is determined by Equation (2). It is used to restore A_P(x) from the residues.
Its calculation includes the expensive operation of Euclidean division. Instead of computing the rank r_A(x), we calculate an approximation of the rank, R_A(x), based on the approximate method and a modular adder, which decreases the complexity. This method reduces the number of projections and replaces the computationally complex division of polynomials with a remainder by the division of a polynomial by the monomial x^N. The complexity is reduced from O(D^2) to O(D·log D).
Theorem 1 shows that R_A(x) and r_A(x) are equal. It provides the theoretical basis for our approach.
The proof is described in Appendix A. The algorithmic complexity of computing the rank r_A(x) based on Theorem 1 is O(d·log k). Since the coefficients b_i(x) are of degree d, we can compute A_P(x) efficiently and reduce the computational complexity of the decoding from O(D^2) down to O(d·log k). The computation of ∑ M_i(x)·O_i(x)·a_i(x) can be performed in parallel with the computation of R_A(x).

AR-PRNS Decoding Speed with Data Errors
In the previous section, we described finding the error using AR to increase the decoding speed.
To detect and localize ⌊r/2⌋ errors using the syndrome method, it is necessary to precompute a table consisting of ⌊r/2⌋·d possible syndromes (rows). The maximum number of bit errors is ⌊r/2⌋·d. Each syndrome indicates the localization of the errors. To find the syndrome in the table by binary search, we need log₂(⌊r/2⌋·d)·r·d bit operations. Scaling this to 1 Mb gives the number of bit operations required and, hence, the worst-case AR-PRNS decoding speed. To detect and correct at least one error, k has to satisfy the inequality k ≤ n - 2. Figure 6 shows the decoding speeds of the PRNS, AR-RRNS, and AR-PRNS depending on the parameters that satisfy this condition. We can see that the AR-PRNS outperformed the others by almost three times.



Entropy Polynomial Error Correction Code
We propose a novel En-AR-PRNS method for data decoding that uses the AR to improve the decoding speed and entropy to improve the reliability. We show that the entropy concept in the PRNS can correct more errors than a traditional threshold-based PRNS.
Since k moduli are included in the dynamic range and r are redundant (control) moduli, where k + r = n, degA_P(x) < ∑_{i=1}^{k} d_i. An error is of the form E(x) = β(x)·M_I(x), where M_I(x) = ∏_{i∈I} m_i(x), β(x) is a nonzero polynomial, and I is the tuple of residues without errors.
If I is not an empty tuple and degM_I(x) ≥ ∑_{i=1}^{k} d_i, then an error can be detected, since degE_P(x) ≥ ∑_{i=1}^{k} d_i.

Entropy in PRNS
According to the CRT, A_P(x) is a polynomial over the binary field of degree less than D = ∑_{i=1}^{k} d_i. Therefore, A_P(x) can have 2^D different values. Following Kolmogorov (1965) [36] and Ivanov et al. (2019) [37], we can state that the entropy of A_P(x) is equal to:
H(A_P(x)) = D. (5)
If i ∈ [1, n] and a_i(x) = |A_P(x)|_{m_i(x)}, then the entropy of a_i(x) is equal to d_i:
H(a_i(x)) = d_i. (6)
Hence, the residue a_i(x) carries some information about A. If the entropy d_i = 0, the residue does not carry information about A_P(x). At the other extreme, if d_i = D, the residue equals A_P(x). From Equations (5) and (6), it follows that:
∑_{i∈I} d_i ≥ D. (7)
If Equation (7) is satisfied, the amount of known information is greater than or equal to the initial information. Hence, we can restore A_P(x), where I is a tuple of residues without errors.
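The recoverability check of Equation (7) is a sum of residue entropies; a minimal sketch (the function name and the degrees are illustrative):

```python
def recoverable(I, d, D):
    """Equation (7): A_P(x) can be restored from the error-free residues at
    positions I (0-based) when the sum of their entropies d_i reaches the
    information content D of the original polynomial."""
    return sum(d[i] for i in I) >= D

# illustrative degrees: k = 3 working moduli of degree 8, so D = 24
d = [8, 8, 8, 16, 16]
print(recoverable([0, 1, 2], d, 24))  # True: 24 bits of entropy >= D
print(recoverable([0, 1], d, 24))     # False: 16 < 24
```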
From the information theory point of view, the entropy of the residue a_i(x) can be viewed as a measure of how close it is to the minimal entropy case. Hence, it measures the amount of information that a_i(x) carries from A_P(x). The maximal entropy corresponds to a non-coding, non-secure case.
Using the entropy-based approach, we can verify the obtained result for correctness using the following theorem.
Theorem 2. Let m_1(x), m_2(x), ..., m_n(x) be the PRNS moduli n-tuple, A_P(x) →_PRNS A_PRNS = (a_1(x), a_2(x), ..., a_n(x)), with r control moduli, k = n - r, and let I be the tuple of residues with errors. If
∑_{i∈I} d_i ≤ ∑_{i=1}^{r} d_{k+i}, (8)
an error can be detected.
Proof. Let us consider two cases: when there is an error and when there is no error. Case 1. If I is an empty tuple, then there are no errors, and A_P(x) can be calculated using (1).
Case 2. If I is not an empty tuple and it satisfies condition (8), then we show that degÃ_P(x) > degA_P(x), where Ã_P(x) = A_P(x) + E_P(x). Due to the fact that E_P(x) = β(x)·∏_{i∈Ī} m_i(x), where β(x) ≠ 0 and Ī is the complement of I (the residues without errors), we have:
degE_P(x) = degβ(x) + ∑_{i∈Ī} d_i. (9)
Given that ∑_{i∈Ī} d_i = ∑_{i=1}^{n} d_i - ∑_{i∈I} d_i, Equation (9) takes the following form:
degE_P(x) = degβ(x) + ∑_{i=1}^{n} d_i - ∑_{i∈I} d_i. (10)
Substituting the condition of the theorem, ∑_{i∈I} d_i ≤ ∑_{i=1}^{r} d_{k+i}, into Equation (10), we obtain that degE_P(x) satisfies the following inequality:
degE_P(x) ≥ degβ(x) + ∑_{i=1}^{n} d_i - ∑_{i=1}^{r} d_{k+i}. (11)
Because ∑_{i=1}^{n} d_i - ∑_{i=1}^{r} d_{k+i} = ∑_{i=1}^{k} d_i, Equation (11) takes the following form:
degE_P(x) ≥ degβ(x) + ∑_{i=1}^{k} d_i. (12)
Given that degβ(x) ≥ 0, it follows from inequality (12) that degE_P(x) ≥ ∑_{i=1}^{k} d_i. Because degA_P(x) < ∑_{i=1}^{k} d_i and degÃ_P(x) = degE_P(x) ≥ ∑_{i=1}^{k} d_i, an error is detected. The theorem is proved.
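Theorem 2's detectability condition can likewise be sketched in one line (the function name and the degrees are illustrative):

```python
def error_detectable(I, d, k):
    """Theorem 2, Equation (8): errors at residue positions I (0-based) are
    detectable when sum_{i in I} d_i <= sum of the r control-moduli degrees."""
    return sum(d[i] for i in I) <= sum(d[k:])

d = [8, 8, 8, 16, 16]  # k = 3 working and r = 2 control moduli
print(error_detectable([0, 1], d, 3))        # True: 16 <= 32
print(error_detectable([0, 1, 2, 3], d, 3))  # False: 40 > 32
```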
Example 2 (see Appendix B) considers the case where the degrees of the control moduli are less than the degrees of the working moduli, so that we can find an error in the remainder of the division by the working moduli.
Example 3 considers the case where one control modulus is used. We can detect two errors, unlike the single error in the traditional PRNS.

Example 3. Let the PRNS moduli tuple be m_1(x), ..., m_4(x), with one control modulus. Let A_P(x) = x^7 →_PRNS (x, 1, 1, x^2 + x). We consider three cases in which errors occurred in two shares.
Case A. If errors occurred in a_1(x) and a_2(x), then the error vector had the following form: E_PRNS = (1, 1, 0, 0). In this case, the condition of Theorem 2 (see Equation (8)) is satisfied, and the errors can be detected.
Case B. If errors occurred in a_1(x) and a_3(x), then the error vector is E_PRNS = (1, 0, 1, 0); hence, the condition of Theorem 2 (see Equation (8)) is satisfied, and the errors can be detected.
Let us show how an error is detected. First, we calculated the PRNS constants M̂_i(x) = M(x)/m_i(x).
For Case A, the reconstructed polynomial is Ã_P(x) = x^13 + x^12 + x^3 + 1. Finally, we determine whether the reconstructed polynomials of the three cases contain errors. Case A. Because degÃ_P(x) = 13 > degM(x) = 8, Ã_P(x) contains an error. To detect an error, we use Theorem 2. To localize and correct the error, we modify the maximum likelihood decoding (MLD) method from Goh and Siddiqi (2008) [38].

MLD Modification
To correct errors, we need a tuple I of residues where ∑_{i∈I} d_i > D. One way to select them from all the correct residues is the MLD method.
In the process of the localization and correction of errors, we calculate the tuple V of possible candidates Ã_P(x) satisfying the condition degÃ_P(x) < D. Each of the possible candidates is denoted by V_P^l(x), that is, V = {V_P^1(x), ..., V_P^λ(x)}, where λ is the total number of candidates that fall within the legitimate range, from which A_P(x) is selected.
If the entropies are equal, d_1 = d_2 = ... = d_n, then we can use the Hamming distance of the residues, H_T^l (i.e., the Hamming distance between V_PRNS^l and the vector Ã_PRNS). H_T^l is defined as the number of elements in which the two vectors, V_PRNS^l and Ã_PRNS, differ. However, if we consider weighted error correction codes in which d_1 ≠ d_2 ≠ ... ≠ d_n, then the Hamming distance does not provide a correct measurement, since the residues carry different amounts of information about A_P(x).
Let m_1(x), m_2(x), ..., m_n(x) be the PRNS moduli tuple. Let us calculate H_W^l as the entropy-weighted Hamming measure of V_PRNS^l and Ã_PRNS using the following Algorithm 1.
Algorithm 1. Calculation of H_W^l.
Input: Ã_PRNS = (a_1(x), a_2(x), ..., a_n(x)); V_PRNS^l.
According to Algorithm 1, three steps are used to calculate H_W^l. First, similar to the MLD, calculate the Hamming vector, h = (h_1, h_2, ..., h_n), where h_i ∈ {0, 1}. Second, calculate the inverse of the vector h, h̄ = (h̄_1, h̄_2, ..., h̄_n), where h̄_i is equal to one if the remainder a_i(x) does not contain an error; otherwise, h̄_i is equal to zero. Third, calculate the amount of entropy, H_W^l, as the dot product of the vector h̄ = (h̄_1, h̄_2, ..., h̄_n) and the vector consisting of the entropies of the remainders of the division, a_i(x).
The idea is to calculate the amount of entropy, H_W^l = H(V_PRNS^l), rather than the number of errors, H_T^l, as in the MLD. If the volume of correct data is greater than D, then, according to Theorem 2, we can restore the value of A_P(x). Now, we can select A_P(x) from all the corrected candidates V_P^l(x). To this end, the traditional MLD picks the candidate with the minimal Hamming distance. In our approach, the best candidate is the V_P^l(x) with the maximal entropy, H_W^l.
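Algorithm 1 and the candidate selection can be sketched as follows (the function names, the stand-in residue values, and the degrees are ours, chosen for illustration):

```python
def entropy_weight(candidate, received, d):
    """Algorithm 1: H_W = dot product of the inverse Hamming vector h-bar
    (1 where candidate and received residues agree) with the entropies d_i."""
    h_bar = [1 if c == r else 0 for c, r in zip(candidate, received)]
    return sum(hi * di for hi, di in zip(h_bar, d))

def select_candidate(candidates, received, d):
    """En-AR-PRNS MLD modification: pick the candidate with maximal entropy H_W
    (traditional MLD would pick the minimal Hamming distance instead)."""
    return max(candidates, key=lambda c: entropy_weight(c, received, d))

received = (1, 2, 3, 4)                    # residues with an unknown error
candidates = [(9, 2, 3, 4), (1, 2, 3, 9)]  # both at Hamming distance 1
d = [2, 4, 4, 6]                           # unequal residue entropies
best = select_candidate(candidates, received, d)
print(best)  # (9, 2, 3, 4): its agreeing shares carry 14 bits vs. 10
```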
Let the dynamic range be M(x) and the full range of the PRNS be M̂(x). If A_P(x) = x^7 →_PRNS A_PRNS = (x, 1, 1, x^2 + x) and E_PRNS = (1, 0, 0, 0), then Ã_PRNS = A_PRNS + E_PRNS = (x + 1, 1, 1, x^2 + x). There are three possible values of V_P^l that satisfy the condition degV_P^l(x) < D. We present the calculation results in Table 2.
Table 2. The tuple V with three candidates, V_P^l(x).
If we use the MLD from Goh and Siddiqi (2008) [38], then the tuple of possible recovery candidates for A_P(x) consists of two values, V_P^1(x) = x^7 and V_P^3(x) = ∑_{i=0}^{7} x^i, with the minimal Hamming distance. The proposed MLD modification for the error correction codes selects for maximal entropy. As we showed above, it allows us to determine the true result, x^7, unambiguously and correct the error. In the classical PRNS, an error can be detected but not corrected.

Reliability
The detection of errors and failures in distributed storage and communication is mainly based on error correction codes: ECs, RCs, and their modifications. In contrast to the existing methods, the PRNS method is simple and fast. However, it has one significant drawback: the limited detection and localization of errors. To overcome this limitation and increase the reliability, we propose an entropy-based approach to correct errors (see Section 6.1).
Let Pr_i be the probability of an error in the data stored in the i-th cloud. It can be seen as the probability of unexpected and unauthorized modifications, falsifications, hardware and software failures, disk errors, integrity violations, denial of access, data loss, etc.
To correct errors, we have to take enough residues without errors. The data access structure is defined as the set of residue combinations without errors that allow the errors to be corrected. Hence, the probability of information loss of an En-AR-PRNS is determined by (13).
To estimate the probability of an i-th storage error (i.e., failure), Pr_i, we used data from an analysis of the downtime of public cloud providers presented by CloudHarmony [12]. It monitors the health status of service providers by spinning up workload instances in public clouds and constantly pinging them. It does not provide complete insight into the types of failures due to the limited information obtained by monitoring; however, it serves as a valuable starting point for reliability analysis.
Table 3 describes the downtime, T_D (in minutes per year), of eight cloud storage providers [39]. We calculated Pr_i by the geometric probability definition (a ratio of measures): Pr_i = T_D/525,600, where 525,600 = 365 × 24 × 60 is the number of minutes in a year. It included the upload and download times (Tchernykh et al., 2018) [18]. Based on these probabilities, we can roughly estimate the reliability by calculating the probability of information loss. In the threshold PRNS, we lose data when at least r + 1 storages have errors, since then we are not able to correct the errors. In the En-AR-PRNS, as we showed in Section 6, the entropy-based approach allows for correcting more errors.
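The downtime-to-probability conversion is straightforward; a sketch (the 525.6 min/year figure below is illustrative, not from Table 3):

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600

def error_probability(downtime_minutes_per_year: float) -> float:
    """Geometric probability definition: Pr_i = T_D / 525,600."""
    return downtime_minutes_per_year / MINUTES_PER_YEAR

print(error_probability(525.6))  # ~0.001 for roughly 8.76 h of downtime a year
```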
Let us consider both approaches. Table 4 shows an example of data sharing among eight storages. The m_i(x) column shows the eight moduli we used for coding, H(a_i(x)) is the entropy of the obtained residues, and the Cloud column is the storage used. Because the entropies and polynomial degrees were equal, the residues could be allocated based on the access speed, trustworthiness, etc., of the storages. In this case, the reliabilities of the En-AR-PRNS and PRNS methods did not differ.
Figure 7 shows the probabilities of information loss. Figure 7a shows this probability varying the range from 24 (four storages) to 64 (eight storages) versus the threshold (dynamic range). Figure 7b shows this probability varying the threshold from 24 to 64 versus the range.
We defined the degrees of the residues as follows: d_1 = d_2 = ... = d_k = 8 and d_{k+1} = ... = d_n = 16. The probabilities of information loss of the threshold PRNS and the entropy-based PRNS are presented on a logarithmic scale in Figure 8. We observed that:
• By reducing the threshold (i.e., increasing the data redundancy), we reduced the probability of information loss (Figure 7a);

• By increasing the number of clouds (i.e., increasing the range), we reduced the probability of information loss (Figure 7a);
• The reduction in the information loss probability achieved by increasing the range was approximately equal for all thresholds (see Figure 7b).
Example 5. Let five cloud providers (i.e., Joyent, Azure, Google, Rackspace, and CenturyLink) be used for storage. The probabilities of errors and information loss were calculated using the data presented in [39] (see Table 5). Table 5 shows an example of the data sharing in these storage systems. The m_i(x) column shows the five moduli used, the H(a_i(x)) column shows the entropy of the residues, and the Cloud column shows the storage provider; for example, one row of Table 5 reads m_i(x) = x^6 + x + 1, H(a_i(x)) = 6, Pr_i = 0.000065, Cloud = Joyent.
Since the entropies were different, we allocated the residue with the highest entropy, H(a_i(x)) = 6, on the storage with the lowest error probability, i = 5. Then, we allocated the three shares with entropy four on the storages with the next smallest probabilities, i = 4, 3, 2. Finally, we allocated the last share, with entropy two, on storage i = 1.
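This allocation rule can be sketched as follows. The entropies follow the text (one six, three fours, and one two); the per-provider error probabilities are illustrative placeholders, except the 0.000065 quoted in Table 5 for the most reliable provider:

```python
# Entropies H(a_i(x)) per share i, as in the text of Example 5.
entropies = {1: 2, 2: 4, 3: 4, 4: 4, 5: 6}
# Per-provider error probabilities: i = 5 uses the value quoted in
# Table 5; the others are assumed for illustration only.
p_err = {1: 0.0010, 2: 0.0007, 3: 0.0005, 4: 0.0003, 5: 0.000065}

# Allocate the highest-entropy residue to the most reliable storage,
# the next-highest to the next most reliable, and so on.
shares_by_entropy = sorted(entropies, key=entropies.get, reverse=True)
clouds_by_reliability = sorted(p_err, key=p_err.get)
allocation = dict(zip(shares_by_entropy, clouds_by_reliability))

assert allocation[5] == 5  # H = 6 goes to the lowest-error storage
assert allocation[1] == 1  # H = 2 goes to the highest-error storage
```

With these inputs the pairing reproduces the allocation described in the text: share 5 on storage 5, shares 2–4 on storages 4, 3, 2, and share 1 on storage 1.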
For such an allocation, we used a PRNS with k = 3 and r = 2. The access structures Γ_En-AR-PRNS and Γ_PRNS included the combinations of residues (recovery cases) that could be used to recover the original data.
Γ_En-AR-PRNS = {{2, 5}, {3, 5}, {4, 5}} ∪ Γ_PRNS. Γ_En-AR-PRNS had nineteen recovery cases and Γ_PRNS had sixteen. As we showed before, to restore data in a PRNS, we can have errors in at most two residues. In an En-AR-PRNS, we showed that, in some cases, we could restore data even when three residues had errors.
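The two access structures can be enumerated directly; the following short sketch confirms the counts of sixteen and nineteen recovery cases:

```python
from itertools import combinations

n, k = 5, 3  # Example 5: five providers, threshold k = 3

# Threshold PRNS: any k or more of the n shares recover the data.
gamma_prns = [set(c) for r in range(k, n + 1)
              for c in combinations(range(1, n + 1), r)]

# En-AR-PRNS additionally recovers from the sub-threshold pairs listed
# in the text, each containing the high-entropy share 5:
gamma_en_ar = gamma_prns + [{2, 5}, {3, 5}, {4, 5}]

assert len(gamma_prns) == 16   # C(5,3) + C(5,4) + C(5,5) = 10 + 5 + 1
assert len(gamma_en_ar) == 19
```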
The probability of information loss calculated by (13) was Pr_PRNS = 2.3 × 10⁻⁸ and Pr_En-AR-PRNS = 3.7 × 10⁻⁹. Therefore, the En-AR-PRNS improved the reliability of the storage by about (2.3 × 10⁻⁸)/(3.7 × 10⁻⁹) ≈ 6.2 times using the same number of clouds with the same probability of errors for each provider.
To show the advantage of the entropy-based approach, Figure 9 presents the reliability improvement of the En-AR-PRNS over the PRNS. We can see that the largest improvement was approximately eighty times, for the (5, 8) settings. The top improvements were for (3, 4), (3, 5), (3, 6), (4, 7), and (5, 8). We can conclude that, for the same number of cloud providers, the probability of information loss for the En-AR-PRNS was much lower than for the PRNS. Now, we show how many recovery cases can be detected and corrected by both approaches.
Figure 10 shows the number of recovery cases found by the En-AR-PRNS and the PRNS. For instance, for the (3, 4) settings, the PRNS could detect only one error in shares i = 1, 2, 3, 4. The En-AR-PRNS could detect the same errors and three more cases when detecting errors in two shares, for a total of seven cases.
Figure 11 shows how many times the En-AR-PRNS outperformed the PRNS in error detection. We can see that, for the (7, 8) settings, the En-AR-PRNS outperformed the PRNS by 3.6 times.
The En-AR-PRNS detects and corrects more errors than the classical methods in a PRNS. Table 6 shows the comparative analysis of the methods. Figure 12 shows the number of recovery cases corrected by the En-AR-PRNS and the PRNS. We can see that, for the (3, 8) settings, the En-AR-PRNS corrected errors in 227 cases and the PRNS in 36 cases. Example 6 explains the results shown in Figure 12.
Example 6. Let the PRNS moduli 4-tuple be m, where k = 3, r = 1, and n = k + r = 4. The number of recovery cases found by the PRNS was zero, and by the En-AR-PRNS it was three: a_1(x), a_2(x), and a_3(x).

Discussion
The complexity of the presented algorithm motivates the usage of parallel computing. The inherent properties of PRNS attract much attention from both the scientific community and industry. Special interest is paid to the parallelization of the following algorithms:
Coding. Since the computation of the residue for each modulus is independent, polynomial-to-PRNS conversion can be efficiently parallelized: independent calculation of the residues a_i(x); parallel calculation of each residue a_i(x) based on a neural-like network approach to finite ring computations.
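The independence of the residue computations can be illustrated with a minimal GF(2) polynomial arithmetic sketch. The moduli and data below are hypothetical examples (small irreducible polynomials over GF(2)), not the ones used in the paper:

```python
def gf2_mod(a: int, m: int) -> int:
    """Remainder of polynomial a modulo m over GF(2).  Polynomials are
    encoded as integers with bit i holding the coefficient of x^i."""
    dm = m.bit_length() - 1  # deg m
    while a.bit_length() - 1 >= dm:
        # Subtract (XOR) m shifted up to the leading term of a.
        a ^= m << (a.bit_length() - 1 - dm)
    return a

# Hypothetical pairwise-coprime irreducible moduli over GF(2):
moduli = [0b111, 0b1011, 0b10011]  # x^2+x+1, x^3+x+1, x^4+x+1

data = 0b110101101  # the polynomial to be shared (illustrative)
# Each residue depends only on (data, m_i), so the loop below can be
# executed in parallel, one worker per modulus.
shares = [gf2_mod(data, m) for m in moduli]
assert all(s < m for s, m in zip(shares, moduli))
```

Because each share is a pure function of the input and one modulus, distributing the loop over threads, processes, or hardware channels requires no coordination between workers.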
Decoding. Despite the En-AR-PRNS performance improvement for PRNS-residues-to-polynomial conversion presented in this paper, this conversion still significantly limits the performance of the whole system. It is one of the most difficult PRNS operations, and parallel computing can significantly improve it.
The following parallelization options are exploited:
• Parallel syndrome table search;
• Independent generation of the candidates, V_P^l(x), with data recovery from all possible tuples of residues;
• Parallel calculation of ∑_{i=1}^{k} M_i(x)·O_i(x)·a_i(x) and R_A(x)·M(x) to generate the candidate V_P^l(x);
• Independent calculation of the entropy, H_W^l, for each candidate.
PRNS offers an important source of parallelism for addition, subtraction, and multiplication operations:
• Data are converted to many smaller polynomials (residues). Operations on the initial data are substituted by faster operations on the residues executed in parallel, reducing the complexity of the operations;
• These features are used by the cryptography (RSA) community (Yen et al., 2003) [40], (Bajard and Imbert, 2004) [41], homomorphic encryption (Cheon et al., 2019) [42], and Microsoft SEAL (Laine, 2017) [43];
• Errors in a faulty computational logic element are localized in the corresponding residue without impacting other residues. This property is used in the algorithm for detecting errors in the AES cipher (Chu and Benaissa, 2013) [29] and to control calculation results with encrypted data (Chervyakov et al., 2019) [2];
• Operations are based on carry-free representations. Regardless of word sizes, they can be performed on bits without carrying in a single clock period. This improves computational efficiency, which has proven especially useful in digital signal processing (Chang et al., 2015) [32]; cryptography: RSA (Bajard and Imbert, 2004) [41] and elliptic curve cryptography (Schinianakis et al., 2009) [44], (Guillermin, 2010) [45]; homomorphic encryption (Cheon et al., 2019) [42], (Laine, 2017) [43]; etc. This property is a way to approach a famous bound on the speed at which addition and multiplication can be performed (Flynn, 2003) [46]. This bound, called Winograd's bound, determines a minimum time for arithmetic operations and is an important basis for determining the comparative value of the various implementations of algorithms.
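A minimal illustration of the carry-free property: in GF(2)[x], addition (and subtraction) is bitwise XOR, so every output bit depends on a single pair of input bits and no carry chain limits the clock period. The operands below are arbitrary examples:

```python
# In GF(2)[x], addition and subtraction are both bitwise XOR: each bit
# is computed independently, so there is no carry to propagate.
a = 0b101101  # x^5 + x^3 + x^2 + 1
b = 0b011011  # x^4 + x^3 + x + 1
s = a ^ b     # carry-free "addition": x^5 + x^4 + x^2 + x
assert s == 0b110110
assert s ^ b == a  # subtraction is the very same XOR operation
```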
Other optimization criteria can also be considered, for instance, power consumption. Using small arithmetic units for the PRNS processor reduces the switching activity in each channel and the dynamic power (Wang et al., 2000) [47]. The enhanced speed and low power consumption make PRNS very attractive for applications with intensive operations.

Conclusions
In this paper, we studied data reliability based on a polynomial residue number system and proposed a configurable, reliable, and secure distributed storage scheme named En-AR-PRNS. We provided a theoretical analysis of dynamic storage configurations.
Our main contributions were multi-fold:
• We proposed a novel entropy-based decoding technique to increase reliability. We showed that it can detect and correct more errors than the classical threshold-based PRNS;
• We provided a theoretical analysis of the reliability, redundancy, and coding/decoding speed as functions of configurable parameters, allowing the current configuration to adapt dynamically to changes in the storage that are difficult to predict in advance;
• We reduced the computational complexity of the decoding from O(D²) down to O(D·log D) using the concept of an approximation of the rank of a polynomial.
The main idea of adaptive optimization is to set the (k, n) PRNS parameters, moduli, and storages and to dynamically change them to cope with different objective preferences and current properties. To this end, past characteristics can be analyzed over a certain time interval to determine appropriate parameters. This interval should be set according to the dynamism of the characteristics and storage configurations.
This reactive approach deals with the uncertainty and non-stationarity associated with unauthorized data modifications, hardware and software malfunctions, disk errors, loss of data, malicious intrusions, denial of access for long periods, data transmission failures, etc. To detect changes, to estimate and classify the security and reliability levels, and to identify violations of confidentiality, integrity, and availability, multi-objective techniques using computational and artificial intelligence can be applied. Future work will center on a comprehensive experimental study to assess the proposed mechanism's efficiency and effectiveness in real dynamic systems under different types of errors and malicious attacks.

Figure 2. The V_C of a PRNS and an RRNS versus (k, n).

Figure 4. V_D of a PRNS and an RRNS with the maximum number of errors versus (k, n) for d = 8.

Figure 6. The decoding speed (Mb/s) of a PRNS, an AR-PRNS, and an AR-RRNS versus the settings (k, n) in the worst-case scenario with the maximum number of errors for d = 8.

Figure 7. Probability of information loss of an En-AR-PRNS (a) and a PRNS (b).

Figure 8. The probability of information loss of a threshold PRNS and an entropy-based PRNS.

Figure 10. Number of recovery cases detected by the En-AR-PRNS and the PRNS.

Figure 11. Improvement of the number of recovery cases detected by the En-AR-PRNS over the PRNS.

Figure 12. Number of recovery cases corrected by the En-AR-PRNS and the PRNS.

• Coding: data conversion from conventional representation to PRNS;
• Decoding: data conversion from PRNS to conventional representation;
• Homomorphic computing: processing encrypted information without decryption;
• PRNS-based communication: transmitting shares;
• Computational intelligence: estimating and classifying the security and reliability levels and finding adequate settings.

Table 3. Characteristics of the clouds.

Table 4. Data allocation based on the PRNS.

Table 5. Data allocation based on the En-AR-PRNS.

Table 6. Comparison of multiple residue digit error detection and correction algorithms.