An Efficient Parallel Reverse Conversion of Residue Code to Mixed-Radix Representation Based on the Chinese Remainder Theorem

Selianinau, Mikhail; Povstenko, Yuriy

doi:10.3390/e24020242



by

Mikhail Selianinau

and

Yuriy Povstenko

^*

Department of Mathematics and Computer Sciences, Faculty of Science and Technology, Jan Dlugosz University in Czestochowa, al. Armii Krajowej 13/15, 42-200 Czestochowa, Poland

^*

Author to whom correspondence should be addressed.

Entropy 2022, 24(2), 242; https://doi.org/10.3390/e24020242

Submission received: 8 January 2022 / Revised: 30 January 2022 / Accepted: 3 February 2022 / Published: 5 February 2022

(This article belongs to the Special Issue Theory and Applications of Information Processing Algorithms)


Abstract

In this paper, we deal with the critical problems in residue arithmetic. The reverse conversion from a Residue Number System (RNS) to positional notation is a main non-modular operation, and it constitutes a basis of other non-modular procedures used to implement various computational algorithms. We present a novel approach to the parallel reverse conversion from the residue code into a weighted number representation in the Mixed-Radix System (MRS). In our proposed method, the calculation of mixed-radix digits reduces to a parallel summation of the small word-length residues in the independent modular channels corresponding to the primary RNS moduli. The computational complexity of the developed method concerning both required modular addition operations and one-input lookup tables is estimated as

O (k^{2} / 2)

, where k equals the number of used moduli. The time complexity is

O (⌈\log_{2} k⌉)

modular clock cycles. In pipeline mode, the throughput rate of the proposed algorithm is one reverse conversion in one modular clock cycle.

Keywords:

Residue Number System; modular arithmetic; residue-to-binary conversion; Chinese Remainder Theorem; mixed-radix representation

1. Introduction

Along with the improvement of computer technology, the development and implementation of new effective approaches to the organization and realization of computational tasks are some of the main ways to increase the data processing speed. At present, high-performance computing is developing extremely rapidly. These reasons lead to qualitatively new requirements imposed on number-theoretic methods and computational algorithms. Practically, all well-known approaches to high-performance computing use certain parallel forms of data representation and processing. In recent decades, special consideration has been given to the so-called modular computational structures. Their arithmetic foundation is the Residue Number System (RNS), whose ideological roots go back to the classic topics of number theory and abstract algebra. The RNS is a non-positional number system with inherent parallelism and occupies a place of particular importance due to its carry-free properties, which provide a high potential for accelerating arithmetic operations.

As is well known, the RNS has some advantages over a conventional Weighted Number System (WNS) in the design and implementation of high-performance computing applications, devices, and systems. From its appearance in the mid-1950s to the present, RNS arithmetic has attracted the constant attention of researchers in computer technology [1,2], number-theoretic methods [3,4,5], digital signal and image processing [2,5,6,7,8], communications systems [5,9], cryptography [2,8,10,11], and other fields [10].

The main advantage of RNS is its unique ability to decompose the large word-length numbers into a set of smaller word-length residues, which are processed in parallel in the independent modular channels. The inherent parallelism of RNS enables avoiding the carry-overs obtained in addition, subtraction, and multiplication, which are usually time-consuming in the WNS. In this regard, the modularity and carry-free properties make computation fast and efficient. Therefore, the RNS presents one of the most efficient means for increasing data processing speed.

Due to its carry-free property, the residue arithmetic is exceptionally suitable for a broad class of applications in which addition and multiplication are the dominant arithmetic operations. In any case, it has excellent potential for many substantial applications in such areas as digital signal processing, cryptography, distributed information and communication systems, information security systems, fault tolerance, cloud computing, and others. Moreover, these RNS applications may be effectively embedded in processor platforms functioning according to the conventional information-processing approach [2,5,8]. For the reasons mentioned above, residue arithmetic represents an efficient mathematical tool for the high-speed implementation of various computational tasks.

The reverse conversion and base extension are the most critical topics in residue arithmetic. In contrast to a conventional WNS, these operations, like other central non-modular procedures such as magnitude comparison, sign determination, overflow detection, general division, and scaling, are relatively hard to implement. They are time consuming and costly because their structure is more complicated than that of modular operations.

As is known, to perform non-modular operations, it is necessary to carry out the binary reconstruction of the integer by its residue code, which in general is hampered by the non-weighted nature of the RNS. This circumstance negates to a substantial extent the main advantages of residue arithmetic.

Therefore, the development of novel approaches and methods for fast reconstruction of a number from its residue code is of significant importance for high-performance computing based on the parallel algorithmic structures of the RNS, especially for the high-speed implementation of digital signal processing applications and public-key cryptosystems. This should enable the extensive use of residue arithmetic in many priority areas of science and technology.

In this paper, we present a novel approach to the parallel reverse conversion from the residue code into the mixed-radix representation. In the proposed method, the calculation of mixed-radix digits reduces to a parallel summation of the small word-length residues in the independent modular channels corresponding to the primary RNS moduli.

The paper is structured as follows. Section 2 and Section 3 discuss the basic theoretical concepts of the research. Section 4 describes the mathematical background of the proposed reverse conversion method. Section 5 and Section 6 present a numerical example and an analysis of the computational cost, respectively. Section 7 provides discussion, and Section 8 concludes the paper.

2. The Basic Concepts of the Residue Arithmetic

The abstract algebra and number theory create the theoretical basis of the residue arithmetic [12,13].

An RNS is defined by an ordered set

\{m_{1}, m_{2}, \dots, m_{k}\}

of k pairwise relatively prime moduli, where each modulus

m_{i} \geq 2

(i = 1, 2, \dots, k)

, and the greatest common divisor of

m_{i}

and

m_{j}

equals 1, i.e.,

\gcd (m_{i}, m_{j}) = 1

for

i \neq j

. For convenience, we assume that the default order of moduli is ascending, i.e.,

m_{1} < m_{2} < \dots < m_{k}

.

In the given RNS, it is possible to represent

M_{k}

integer numbers, where

M_{k}

is the product of all moduli,

M_{k} = \prod_{i = 1}^{k} m_{i}

. Therefore, the set

Z_{M_{k}} = \{0, 1, \dots, M_{k} - 1\}

is usually used as an RNS dynamic range.

Every number

X \in Z_{M_{k}}

has a unique representation in the form of a k-tuple of small integers

(χ_{1}, χ_{2}, \dots, χ_{k})

, which is called a residue code, where

χ_{i}

is a least non-negative remainder of a division of X by

m_{i}

(i = 1, 2, \dots, k)

. We can notationally write this relation as

χ_{i} = {|X|}_{m_{i}}

, where

χ_{i} \in Z_{m_{i}} = \{0, 1, \dots, m_{i} - 1\}

.

The main advantage of residue arithmetic over conventional binary arithmetic is that addition, subtraction, and multiplication are carried out in parallel at the level of small word-length residues. The modular operations

\circ \in \{+, -, \times\}

on integers

A = (α_{1}, α_{2}, \dots, α_{k})

and

B = (β_{1}, β_{2}, \dots, β_{k})

are performed independently in each modular channel in compliance with the computational rule:

\begin{array}{l} A \circ B = (α_{1}, α_{2}, \dots, α_{k}) \circ (β_{1}, β_{2}, \dots, β_{k}) = \\ = ({|α_{1} \circ β_{1}|}_{m_{1}}, {|α_{2} \circ β_{2}|}_{m_{2}}, \dots, {|α_{k} \circ β_{k}|}_{m_{k}}), \end{array}

(1)

where

α_{i} = {|A|}_{m_{i}}

and

β_{i} = {|B|}_{m_{i}}, i = 1, 2, \dots, k

.

In other words, the arithmetic operations on long-word operands are decomposed into modular channels with operands that are no larger than the corresponding modulus. Moreover, all the modular channels are entirely independent of each other. The carry-free nature of modular operations (1) is one of the most attractive features of residue arithmetic [1,3,8].
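Rule (1) can be illustrated with a minimal Python sketch. The helper names `to_rns` and `rns_apply` and the operands 123 and 27 are illustrative assumptions, not part of the paper. The moduli set {5, 7, 9, 11} is the one used later in Example 1.

```python
import operator
from math import gcd, prod

def to_rns(x, moduli):
    """Residue code of x: the least non-negative remainders x mod m_i."""
    return tuple(x % m for m in moduli)

def rns_apply(op, a, b, moduli):
    """Rule (1): apply op independently in every modular channel."""
    return tuple(op(ai, bi) % m for ai, bi, m in zip(a, b, moduli))

moduli = (5, 7, 9, 11)  # pairwise relatively prime, ascending order
assert all(gcd(p, q) == 1 for n, p in enumerate(moduli) for q in moduli[n + 1:])
M = prod(moduli)        # size of the dynamic range, here 3465

A, B = 123, 27          # hypothetical operands chosen for illustration
a, b = to_rns(A, moduli), to_rns(B, moduli)
for op in (operator.add, operator.sub, operator.mul):
    # The channel-wise result is the residue code of (A op B) reduced modulo M.
    assert rns_apply(op, a, b, moduli) == to_rns(op(A, B) % M, moduli)
```

No information passes between channels in `rns_apply`, which is the carry-free property in executable form.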

Therefore, compared with the conventional WNS, the RNS simplifies and speeds up the addition and multiplication operations. This fundamental advantage of the residue arithmetic strongly appears in the case of implementing computational procedures, which mainly contain long segments consisting of only sequences of modular arithmetic operations. In this case, the primary moduli set is chosen so that the final results of the computational procedure always belong to the used dynamic range for any allowed values of input operands. At the same time, the intermediate results can even exceed the boundaries of the dynamic range.

Along with the carry-free modular operations, there are also the so-called non-modular operations such as residue-to-binary conversion, base extension, magnitude comparison, sign determination, overflow detection, general division, scaling, etc. These operations are complicated and quite time consuming, and their significant computational complexity limits the applications of the residue arithmetic and restricts its widespread usage for high-speed computing.

To perform the non-modular operations, it is required to consider all residues in the k-tuple

(χ_{1}, χ_{2}, \dots, χ_{k})

. Furthermore, it is necessary to determine the integer value of the number by its residue code, which in general is hampered by the non-positional nature of the RNS. The crucial problem of efficient implementation of non-modular operations is constantly receiving considerable attention by modern researchers [2,5,8].

The applicability of residue arithmetic is mainly determined by the computational complexity and feasibility of non-modular operations, which are used as a basis for implementing more complex computational algorithms in RNS. At the same time, the fundamental problem of residue arithmetic, which unfortunately remains unresolved, consists of reducing the computational complexity of non-modular operations. Due to a lack of efficient methods and algorithms for implementing non-modular operations, residue arithmetic is mainly suitable when modular additions and multiplications make up the bulk of the required computations. In this case, the number of non-modular operations used is relatively small. This circumstance restricts the use of the RNS to a narrow class of specific tasks.

3. Reverse Conversion of the Residue Code to Conventional Representation

The root problem of residue arithmetic is that the weighted value of the integer X depends on all the residues

χ_{1}, χ_{2}, \dots, χ_{k}

. The reconstruction of an integer by its residue code, i.e., the reverse conversion, is one of the most difficult non-modular operations in residue arithmetic. Moreover, this operation underlies all the other non-modular procedures.

Despite the extensive current studies on residue arithmetic and its applications, there is a need to develop novel efficient approaches and methods for reconstructing an integer from its residue code. This should enable the extensive use of residue arithmetic for high-speed computing in many priority fields, first of all in various digital signal processing and cryptographic applications.

There are two principal techniques of reverse conversion: the canonical method based on the Chinese Remainder Theorem (CRT) and the residue code conversion to a weighted representation in the Mixed-Radix System (MRS) [1,2,5,8,14,15,16,17,18]. In general, all other conversion methods are variants of these two.

Below, we describe the mathematical background of these methods.

3.1. CRT-Base Conversion Method

When the moduli

m_{1}, m_{2}, \dots, m_{k}

are pairwise relatively prime, the integer number X and its residue code

(χ_{1}, χ_{2}, \dots, χ_{k})

are related by the equation:

X = {|\sum_{i = 1}^{k} M_{i, k} χ_{i, k}|}_{M_{k}},

(2)

where

M_{i, k} = M_{k} / m_{i}

,

χ_{i, k} = {|M_{i, k}^{- 1} χ_{i}|}_{m_{i}}

is a normalized residue modulo

m_{i}

(i = 1, 2, \dots, k)

,

{|Y^{- 1}|}_{m}

denotes the multiplicative inverse of an integer Y modulo m.

In essence, Equation (2) represents the CRT [10,19,20].
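As a minimal sketch of Equation (2), the following Python function reconstructs X from its residue code. The function name `crt_reconstruct` is an assumption made for illustration. The three-argument `pow` with exponent -1 (Python 3.8 or later) supplies the multiplicative inverse.

```python
from math import prod

def crt_reconstruct(residues, moduli):
    """Equation (2): X = | sum_i M_{i,k} * chi_{i,k} | modulo M_k."""
    M_k = prod(moduli)
    total = 0
    for chi, m in zip(residues, moduli):
        M_ik = M_k // m                           # M_{i,k} = M_k / m_i
        chi_norm = (pow(M_ik, -1, m) * chi) % m   # normalized residue chi_{i,k}
        total += M_ik * chi_norm
    return total % M_k                            # large-modulus reduction

print(crt_reconstruct((3, 6, 4, 2), (5, 7, 9, 11)))  # prints 13
```

The final reduction modulo M_k operates on full-length numbers, which is the step the methods discussed in this section try to avoid.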

In recent decades, considerable efforts have been directed at reducing the complexity of the CRT implementation and at enabling its application in high-speed computing [2,5,8,21,22,23]. The main idea of these methods is to replace the inner multiplications and additions modulo

M_{k}

with simpler operations (see (2)).

Consider the CRT-number

X_{k} = \sum_{i = 1}^{k} M_{i, k} χ_{i, k} .

(3)

As follows from (2), the difference

X_{k} - X

is a multiple of

M_{k}

. Therefore, the following exact integer equality holds

X = X_{k} - ρ_{k} (X) M_{k} .

(4)

The unique integer number

ρ_{k} (X)

is a normalized rank (or, briefly, rank) of the number X [3,4,7].

Equation (4) is called a rank form of the integer X. In essence, the rank

ρ_{k} (X)

is a reconstruction coefficient that indicates how many times the dynamic range

M_{k}

is exceeded when converting the residue code

(χ_{1}, χ_{2}, \dots, χ_{k})

to the integer X.

In contrast to (2), Equation (4) does not contain a very time-consuming reduction modulo

M_{k}

. Therefore, when we have the efficient method for the rank

ρ_{k} (X)

computation, the reverse conversion algorithm constructed on the basis of (4) has a substantial lead over the canonical CRT implementation (2).
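The rank form (4) can be checked numerically with the short sketch below. The function name `crt_number_and_rank` is hypothetical, and the rank is obtained here by a plain integer division, which serves only as a stand-in for an efficient rank computation that the text presupposes.

```python
from math import prod

def crt_number_and_rank(residues, moduli):
    """Return (X_k, rho_k(X), X) with X = X_k - rho_k(X) * M_k, as in (3)-(4)."""
    M_k = prod(moduli)
    X_k = 0
    for chi, m in zip(residues, moduli):
        M_ik = M_k // m
        X_k += M_ik * ((pow(M_ik, -1, m) * chi) % m)   # term M_{i,k} * chi_{i,k}
    rho, X = divmod(X_k, M_k)   # illustrative only: a real design avoids this division
    return X_k, rho, X

X_k, rho, X = crt_number_and_rank((3, 6, 4, 2), (5, 7, 9, 11))
assert X == X_k - rho * 3465
```

For the residue code (3, 6, 4, 2) in the moduli set {5, 7, 9, 11}, the CRT-number is 6943 = 13 + 2 · 3465, so the dynamic range is exceeded twice.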

3.2. MRS-Base Conversion Method

In the MRS defined by a set

\{m_{1}, m_{2}, \dots, m_{k}\}

of pairwise relatively prime moduli, the integer

X \in Z_{M_{k}}

is represented by the k-tuple

(x_{k}, x_{k - 1}, \dots, x_{1})

of mixed-radix digits, resulting in

X = x_{1} + x_{2} M_{1} + x_{3} M_{2} + \dots + x_{k} M_{k - 1} = \sum_{i = 1}^{k} x_{i} M_{i - 1},

(5)

where

x_{i} \in Z_{m_{i}}

(i = 1, 2, \dots, k)

[1,2,8].

It is well known that the MRS surpasses the RNS when performing non-modular operations such as magnitude comparison, sign determination, and overflow detection. Therefore, the mixed-radix representation has found the widest application in the implementation of non-modular procedures, along with the other generally accepted integral characteristics of the residue code such as the rank of a number, core function, interval index, parity function, diagonal, and quotient functions [3,4,7,24,25,26,27,28,29,30,31,32,33].

The RNS-to-MRS reverse conversion establishes an association between the residue code

(χ_{1}, χ_{2}, \dots, χ_{k})

of the number X and its mixed-radix representation

(x_{k}, x_{k - 1}, \dots, x_{1})

. The mixed-radix digits

x_{i}

(i = 1, 2, \dots, k)

in (5) are computed according to the following calculation relations [1]:

\begin{matrix} x_{1} = χ_{1}, \end{matrix}

\begin{matrix} x_{2} = {|(χ_{2} - x_{1}) {|m_{1}^{- 1}|}_{m_{2}}|}_{m_{2}}, \end{matrix}

\begin{matrix} x_{3} = {|((χ_{3} - x_{1}) {|m_{1}^{- 1}|}_{m_{3}} - x_{2}) {|m_{2}^{- 1}|}_{m_{3}}|}_{m_{3}}, \end{matrix}

\begin{matrix} \dots \end{matrix}

\begin{matrix} x_{k} = {|(\dots ((χ_{k} - x_{1}) {|m_{1}^{- 1}|}_{m_{k}} - x_{2}) {|m_{2}^{- 1}|}_{m_{k}} - \dots - x_{k - 1}) {|m_{k - 1}^{- 1}|}_{m_{k}}|}_{m_{k}} . \end{matrix}

This sequential calculation procedure called a chained algorithm can be written in the general form

x_{i} = {|X^{(i)}|}_{m_{i}},

(6)

where

X^{(i)} = \begin{cases} X, & \text{if } i = 1, \\ (X^{(i - 1)} - x_{i - 1}) m_{i - 1}^{- 1}, & \text{if } i = 2, 3, \dots, k . \end{cases}

(7)

From (6) and (7), it follows that the considered computational process requires two modular operations: subtraction and multiplication by the multiplicative inverse. Thus, the most crucial advantage of this algorithm is its high modularity. However, its strictly sequential nature prevents general use for the construction of appropriate high-performance parallel computing procedures.
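The chained algorithm (6) and (7) can be sketched in Python as follows. The function names are illustrative assumptions. Digits are returned least significant first, i.e., in the order x_1, x_2, ..., x_k, which is the reverse of the tuple notation used in the text.

```python
from math import prod

def mixed_radix_chained(residues, moduli):
    """Sequential RNS-to-MRS conversion; returns [x_1, ..., x_k]."""
    work = list(residues)            # work[j] holds X^{(i)} modulo m_{j+1}
    digits = []
    for i, m_i in enumerate(moduli):
        x_i = work[i]                # x_i = |X^{(i)}| modulo m_i, Equation (6)
        digits.append(x_i)
        for j in range(i + 1, len(moduli)):
            m_j = moduli[j]
            # Equation (7): subtract the digit, then multiply by |m_i^{-1}| modulo m_j.
            work[j] = ((work[j] - x_i) * pow(m_i, -1, m_j)) % m_j
    return digits

def mixed_radix_value(digits, moduli):
    """Equation (5): X = sum_i x_i * M_{i-1}, with M_0 = 1."""
    return sum(x * prod(moduli[:i]) for i, x in enumerate(digits))

digits = mixed_radix_chained((3, 6, 4, 2), (5, 7, 9, 11))
print(digits, mixed_radix_value(digits, (5, 7, 9, 11)))  # prints [3, 2, 0, 0] 13
```

The outer loop cannot be parallelized because digit x_i is needed before step i + 1 can start, which is the sequential bottleneck noted above.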

4. A Novel CRT-Base RNS-to-MRS Reverse Conversion Method

Now, we describe a proposed new method for calculating mixed-radix digits

x_{1}, x_{2}, \dots, x_{k}

of the number X by its residue code

(χ_{1}, χ_{2}, \dots, χ_{k})

.

Consider the CRT-number

X_{k}

. According to (3), we have

X_{k} = \sum_{i = 1}^{k - 1} M_{i, k - 1} m_{k} χ_{i, k} + M_{k - 1} χ_{k, k} .

(8)

By Euclid’s Division Lemma, the integer

m_{k} χ_{i, k}

can be written as

m_{k} χ_{i, k} = χ_{i, k - 1} + ⌊\frac{m_{k} χ_{i, k}}{m_{i}}⌋ m_{i},

(9)

where

\begin{matrix} χ_{i, k - 1} = {|m_{k} χ_{i, k}|}_{m_{i}} = {|m_{k} {|M_{i, k}^{- 1} χ_{i}|}_{m_{i}}|}_{m_{i}} = {|m_{k} M_{i, k}^{- 1} χ_{i}|}_{m_{i}} = {|M_{i, k - 1}^{- 1} χ_{i}|}_{m_{i}}, \end{matrix}

⌊x⌋

denotes the largest integer less than or equal to x.

Substituting (9) into (8), we obtain

X_{k} = X_{k - 1} + M_{k - 1} S_{k} (X),

(10)

where

X_{k - 1} = \sum_{i = 1}^{k - 1} M_{i, k - 1} χ_{i, k - 1},

(11)

S_{k} (X) = \sum_{i = 1}^{k} R_{i, k} (χ_{i}),

(12)

R_{i, k} (χ_{i}) = ⌊\frac{m_{k} χ_{i, k}}{m_{i}}⌋ (i = 1, 2, \dots, k) .

(13)

Taking into account (9), we have

\begin{matrix} R_{i, k} (χ_{i}) = \frac{m_{k} χ_{i, k} - χ_{i, k - 1}}{m_{i}} . \end{matrix}

Since

R_{i, k} (χ_{i}) \in Z_{m_{k}}

, we can reduce the right side of equality modulo

m_{k}

.

Hence, the residue

R_{i, k} (χ_{i})

can be calculated as

R_{i, k} (χ_{i}) = {|- \frac{χ_{i, k - 1}}{m_{i}}|}_{m_{k}} = {|- \frac{{|M_{i, k - 1}^{- 1} χ_{i}|}_{m_{i}}}{m_{i}}|}_{m_{k}} (i = 1, 2, \dots, k - 1) .

(14)

At the same time, from (13) it follows that

R_{k, k} (χ_{k}) = χ_{k, k} = {|M_{k, k}^{- 1} χ_{k}|}_{m_{k}} = {|M_{k - 1}^{- 1} χ_{k}|}_{m_{k}} .

(15)
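The equivalence of the integer form (13) and the modular form (14) can be checked numerically. The sketch below uses the hypothetical helper names `R_floor` and `R_modular`, with 1-based channel indices i < k, and the moduli set {5, 7, 9, 11} as an assumed test case.

```python
from math import prod

def R_floor(i, k, chi, moduli):
    """Equation (13): floor(m_k * chi_{i,k} / m_i)."""
    m_i, m_k = moduli[i - 1], moduli[k - 1]
    M_ik = prod(moduli[:k]) // m_i                  # M_{i,k}
    chi_ik = (pow(M_ik, -1, m_i) * chi) % m_i       # chi_{i,k}
    return (m_k * chi_ik) // m_i

def R_modular(i, k, chi, moduli):
    """Equation (14): | -chi_{i,k-1} / m_i | modulo m_k, valid for i < k."""
    m_i, m_k = moduli[i - 1], moduli[k - 1]
    M_ik1 = prod(moduli[:k - 1]) // m_i             # M_{i,k-1}
    chi_ik1 = (pow(M_ik1, -1, m_i) * chi) % m_i     # chi_{i,k-1}
    return (-chi_ik1 * pow(m_i, -1, m_k)) % m_k     # division by m_i as a modular inverse

moduli = (5, 7, 9, 11)
for k in range(2, len(moduli) + 1):
    for i in range(1, k):
        for chi in range(moduli[i - 1]):
            assert R_floor(i, k, chi, moduli) == R_modular(i, k, chi, moduli)
```

The modular form needs only operations on residues modulo m_i and m_k, so each value can be stored in a small table indexed by the residue.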

Similarly, taking into account Equations (10)–(13), the numbers

X_{i}

(i = k - 1, k - 2, \dots, 1)

can be written in turn as

\begin{matrix} X_{k - 1} = X_{k - 2} + M_{k - 2} S_{k - 1} (X), \end{matrix}

\begin{matrix} X_{k - 2} = X_{k - 3} + M_{k - 3} S_{k - 2} (X), \end{matrix}

\begin{matrix} \dots \end{matrix}

\begin{matrix} X_{2} = X_{1} + M_{1} S_{2} (X), \end{matrix}

\begin{matrix} X_{1} = M_{0} S_{1} (X), \end{matrix}

where

M_{0} = 1

,

S_{1} (X) = χ_{1}

, the integers

S_{l} (X)

(l = 2, 3, \dots, k)

are calculated according to (12)–(15) in the case when the index k is replaced by l.

Finally, substituting the above equations for

X_{l}

(l = k - 1, k - 2, \dots, 1)

in turn into (10), we obtain

X_{k} = \sum_{l = 1}^{k} M_{l - 1} S_{l} (X) .

(16)
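Decomposition (16) can be verified with a short sketch that evaluates S_l(X) directly from (12) and (13), with the index k replaced by l. The function name `S` and the test residue code (3, 6, 4, 2) for the moduli {5, 7, 9, 11} are assumptions made for illustration.

```python
from math import prod

def S(l, residues, moduli):
    """S_l(X) from (12)-(13) with k replaced by l (l is 1-based)."""
    M_l = prod(moduli[:l])
    m_l = moduli[l - 1]
    total = 0
    for i in range(1, l + 1):
        m_i = moduli[i - 1]
        M_il = M_l // m_i                                   # M_{i,l}
        chi_il = (pow(M_il, -1, m_i) * residues[i - 1]) % m_i
        total += (m_l * chi_il) // m_i                      # R_{i,l}(chi_i)
    return total

moduli, residues = (5, 7, 9, 11), (3, 6, 4, 2)
sums = [S(l, residues, moduli) for l in range(1, 5)]
X_k = sum(prod(moduli[:l - 1]) * s for l, s in enumerate(sums, start=1))
print(sums, X_k)  # prints [3, 9, 8, 21] 6943
```

The value 6943 agrees with the CRT-number (3) for this residue code, since 6943 = 13 + 2 · 3465.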

At the same time, according to Euclid’s Division Lemma, we have

S_{l} (X) = R_{l} (X) + m_{l} Q_{l} (X),

(17)

where

R_{l} (X) = {|S_{l} (X)|}_{m_{l}}

and

Q_{l} (X) = ⌊S_{l} (X) / m_{l}⌋

are the remainder and quotient of the division

S_{l} (X)

by the modulus

m_{l}

, respectively.

Therefore, taking into account (12), when the index k is replaced by l, the integers

R_{l} (X)

and

Q_{l} (X)

can be computed as

R_{l} (X) = {|\sum_{i = 1}^{l} R_{i, l} (χ_{i})|}_{m_{l}},

(18)

Q_{l} (X) = ⌊\frac{1}{m_{l}} \sum_{i = 1}^{l} R_{i, l} (χ_{i})⌋ .

(19)

From (19), it follows that

Q_{l} (X)

equals the number of overflows that occur when calculating the sum

R_{l} (X)

of residues

R_{1, l} (χ_{1}), R_{2, l} (χ_{2}), \dots, R_{l, l} (χ_{l})

modulo

m_{l}

(l = 2, 3, \dots, k)

.

Note that

R_{1} (X) = χ_{1}

and

Q_{1} (X) = 0

since

S_{1} (X) = χ_{1}

.

Substituting (17) into (16), we obtain

X_{k} = X_{k}^{(R)} + X_{k - 1}^{(Q)} + M_{k} Q_{k} (X),

(20)

where

X_{k}^{(R)} = \sum_{l = 1}^{k} M_{l - 1} R_{l} (X),

(21)

X_{k - 1}^{(Q)} = \sum_{l = 1}^{k - 1} M_{l} Q_{l} (X) .

(22)

Let us draw attention to Equations (21) and (22). It is evident that the number

X_{k}^{(R)}

is represented by the k-tuple

(x_{k}^{(R)}, x_{k - 1}^{(R)}, \dots, x_{1}^{(R)})

of mixed-radix digits, where

x_{l}^{(R)} = R_{l} (X)

,

l = 1, 2, \dots, k

(see Equation (5)). At the same time,

x_{l}^{(R)} \in Z_{m_{l}}

and

X_{k}^{(R)} \leq M_{k} - 1

.

Bearing in mind that

Q_{1} (X) = 0

, the number

X_{k - 1}^{(Q)}

can be written as

X_{k - 1}^{(Q)} = \sum_{l = 1}^{k} M_{l - 1} Q_{l}^{^{'}} (X),

(23)

where

Q_{1}^{^{'}} (X) = 0

,

Q_{2}^{^{'}} (X) = Q_{1} (X) = 0

, and

Q_{l}^{^{'}} (X) = Q_{l - 1} (X)

for

l \geq 3

. Therefore, taking into account (19), the integer

Q_{l}^{^{'}} (X)

can be calculated as

Q_{l}^{^{'}} (X) = ⌊\frac{1}{m_{l - 1}} \sum_{i = 1}^{l - 1} R_{i, l - 1} (χ_{i})⌋ (l = 3, 4, \dots, k) .

(24)

Hence,

Q_{l}^{^{'}} (X) < l - 1

since

R_{i, l - 1} (χ_{i}) \leq m_{l - 1} - 1

.

Thus, the integer

X_{k - 1}^{(Q)}

(see Equations (23) and (5)) can be represented by a k-tuple

(x_{k}^{(Q)}, x_{k - 1}^{(Q)}, \dots, x_{1}^{(Q)})

of mixed-radix digits under the condition that

x_{l}^{(Q)} \in Z_{m_{l}} (l = 1, 2, \dots, k)

, where

x_{1}^{(Q)} = x_{2}^{(Q)} = 0, x_{l}^{(Q)} = Q_{l}^{^{'}} (X)

for

l > 2

. Consequently, that entails the fulfillment of the condition

Z_{l - 1} \subset Z_{m_{l}}

, which leads to inequality

m_{l} \geq l - 1 (l = 1, 2, \dots, k) .

(25)

Thus, when the moduli set

\{m_{1}, m_{2}, \dots, m_{k}\}

meets the conditions (25), we have that

X_{k - 1}^{(Q)} < M_{k}

.

Note that the integer

X_{k - 1}^{(Q)}

is a multiple of the number

M_{2} = m_{1} m_{2}

because of

x_{1}^{(Q)} = x_{2}^{(Q)} = 0

(see Equation (5)).

Now, let us return to Equation (20). According to Euclid’s Division Lemma, the sum of two mixed-radix numbers

X_{k}^{(R)}

and

X_{k - 1}^{(Q)}

results in

X_{k}^{(R)} + X_{k - 1}^{(Q)} = {|X_{k}^{(R)} + X_{k - 1}^{(Q)}|}_{M_{k}} + M_{k} ⌊\frac{X_{k}^{(R)} + X_{k - 1}^{(Q)}}{M_{k}}⌋ .

(26)

Hence, substituting (26) into (20), we obtain

X_{k} = {|X_{k}^{(R)} + X_{k - 1}^{(Q)}|}_{M_{k}} + M_{k} (Q_{k} (X) + ⌊\frac{X_{k}^{(R)} + X_{k - 1}^{(Q)}}{M_{k}}⌋) .

(27)

Taking into account the rank form of the number X (4), from (27) we have

X = {|X_{k}^{(R)} + X_{k - 1}^{(Q)}|}_{M_{k}} .

(28)

From (28), it follows that the mixed-radix representation of the number X, i.e., k-tuple

(x_{k}, x_{k - 1}, \dots, x_{1})

, can be calculated as a result of the addition of two mixed-radix numbers

X_{k}^{(R)} = (x_{k}^{(R)}, x_{k - 1}^{(R)}, \dots, x_{1}^{(R)})

and

X_{k - 1}^{(Q)} = (x_{k}^{(Q)}, x_{k - 1}^{(Q)}, \dots, x_{1}^{(Q)})

(see (21) and (23)) in the basis

\{m_{1}, m_{2}, \dots, m_{k}\}

. Note that

x_{1}^{(R)} = χ_{1}

,

x_{1}^{(Q)} = x_{2}^{(Q)} = 0

. At the same time, the digits

x_{2}^{(R)}, x_{3}^{(R)}, \dots, x_{k}^{(R)}

and

x_{3}^{(Q)}, x_{4}^{(Q)}, \dots, x_{k}^{(Q)}

are calculated as the sum of the residues

R_{1, l} (χ_{1}), R_{2, l} (χ_{2}), \dots, R_{l, l} (χ_{l})

modulo

m_{l}

along with counting the overflows that occur according to (18) and (24)

(l = 2, 3, \dots, k)

.

Therefore, the mixed-radix digits

x_{l}^{(R)}

and

x_{l}^{(Q)}

are computed as

x_{1}^{(R)} = χ_{1}, x_{l}^{(R)} = {|\sum_{i = 1}^{l} R_{i, l} (χ_{i})|}_{m_{l}} (l = 2, 3, \dots, k),

(29)

x_{1}^{(Q)} = x_{2}^{(Q)} = 0, x_{l}^{(Q)} = ⌊\frac{1}{m_{l - 1}} \sum_{i = 1}^{l - 1} R_{i, l - 1} (χ_{i})⌋ (l = 3, 4, \dots, k),

(30)

where

R_{i, l} (χ_{i}) = {|- \frac{{|M_{i, l - 1}^{- 1} χ_{i}|}_{m_{i}}}{m_{i}}|}_{m_{l}} (i \neq l),

(31)

R_{l, l} (χ_{l}) = {|M_{l - 1}^{- 1} χ_{l}|}_{m_{l}} (l = 2, 3, \dots, k) .

(32)

Furthermore, in the MRS with the bases

m_{1}, m_{2}, \dots, m_{k}

, we calculate the sum of two numbers

X_{k}^{(R)}

and

X_{k - 1}^{(Q)}

. As a result, we obtain the mixed-radix representation

(x_{k}, x_{k - 1}, \dots, x_{1})

of the number X.

Table 1 given below presents the pre-calculation components (see Equations (31) and (32)). It should be recalled that

〈R_{1, 1} (χ_{1})〉 = χ_{1}

. The abbreviation LUT means lookup table. The bit-length of residues is

b_{l} = ⌈\log_{2} m_{l}⌉

(l = 1, 2, \dots, k)

. Here, and further,

⌈x⌉

denotes the smallest integer greater than or equal to x.

Table 2 presents the results of calculations in the modular channels according to Equations (29) and (30). Recall that in the first modular channel corresponding to the modulus

m_{1}

, the calculations are not carried out, so

x_{1}^{(R)} = χ_{1}

and

x_{2}^{(Q)} = 0

.

The above allows us to formulate the following theorem.

Theorem 1.

(About RNS-to-MRS reverse conversion).

Let an arbitrary RNS be defined by an ascending-ordered set of k pairwise relatively prime moduli

m_{1}, m_{2}, \dots, m_{k}

(

m_{l} \geq l - 1

,

l = 1, 2, \dots, k

,

k \geq 2

), and let the residue code

(χ_{1}, χ_{2}, \dots, χ_{k})

of the number

X \in Z_{M_{k}}

be given. Then, the mixed-radix representation

(x_{k}, x_{k - 1}, \dots, x_{1})

of the number X can be computed as a result of the summation of two mixed-radix numbers, namely, the appropriate number

X_{k}^{(R)} = (x_{k}^{(R)}, x_{k - 1}^{(R)}, \dots, x_{1}^{(R)})

and the correction number

X_{k - 1}^{(Q)} = (x_{k}^{(Q)}, x_{k - 1}^{(Q)}, \dots, x_{1}^{(Q)})

, where the digits

x_{l}^{(R)}

and

x_{l}^{(Q)}

(l = 1, 2, \dots, k)

are calculated according to (29) and (30), respectively, taking into account (31) and (32).
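The procedure stated in Theorem 1 can be sketched in Python as follows. This is a sequential software illustration under stated assumptions, not the authors' hardware design: the function names `R` and `rns_to_mrs` are hypothetical, the values R_{i,l}(χ_i) are computed on the fly instead of being read from lookup tables, and the channels are processed in a loop rather than in parallel. Digits are returned least significant first.

```python
from math import prod

def R(i, l, chi, moduli):
    """R_{i,l}(chi) from (31) for i < l and from (32) for i = l (1-based indices)."""
    m_i, m_l = moduli[i - 1], moduli[l - 1]
    M_prev = prod(moduli[:l - 1])                       # M_{l-1}, with M_0 = 1
    if i == l:
        return (pow(M_prev, -1, m_l) * chi) % m_l       # Equation (32)
    chi_norm = (pow(M_prev // m_i, -1, m_i) * chi) % m_i
    return (-chi_norm * pow(m_i, -1, m_l)) % m_l        # Equation (31)

def rns_to_mrs(residues, moduli):
    """Return the mixed-radix digits [x_1, ..., x_k] of X."""
    k = len(moduli)
    # Channel sums S_l(X); in hardware each channel works independently.
    sums = [sum(R(i, l, residues[i - 1], moduli) for i in range(1, l + 1))
            for l in range(1, k + 1)]
    x_R = [s % m for s, m in zip(sums, moduli)]         # Equation (29)
    x_Q = [0] + [s // m for s, m in zip(sums, moduli)][:-1]  # Equation (30): x_l^(Q) = Q_{l-1}
    digits, carry = [], 0
    for r, q, m in zip(x_R, x_Q, moduli):               # mixed-radix addition, Equation (28)
        carry, d = divmod(r + q + carry, m)
        digits.append(d)
    return digits                                       # the final carry is dropped (reduction modulo M_k)

print(rns_to_mrs((3, 6, 4, 2), (5, 7, 9, 11)))  # prints [3, 2, 0, 0]
```

Dropping the carry out of the most significant digit corresponds to the reduction modulo M_k in (28).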

5. A Numerical Example of the Proposed Conversion Method

The main idea of the proposed approach to reverse conversion is illustrated below by a simple numerical example. For convenience, we consider a four-moduli RNS.

Example 1.

Let the RNS moduli-set be

\{m_{1}, m_{2}, m_{3}, m_{4}\}

=

\{5, 7, 9, 11\}

. Suppose that we wish to calculate the digits of the mixed-radix representation

(x_{4}, x_{3}, x_{2}, x_{1})

of the given number X by its residue code

(χ_{1}, χ_{2}, χ_{3}, χ_{4}) = (3, 6, 4, 2)

.

Step 1. The calculation of the primitive constants in a given RNS.

M_{4} = 3465, M_{3} = 315, M_{2} = 35, M_{1} = 5, M_{0} = 1,

M_{1, 4} = 693, M_{2, 4} = 495, M_{3, 4} = 385, M_{4, 4} = 315,

{|M_{1, 4}^{- 1}|}_{m_{1}} = 2, {|M_{2, 4}^{- 1}|}_{m_{2}} = 3, {|M_{3, 4}^{- 1}|}_{m_{3}} = 4, {|M_{4, 4}^{- 1}|}_{m_{4}} = 8,

{|m_{1}^{- 1}|}_{m_{4}} = 9, {|m_{2}^{- 1}|}_{m_{4}} = 8, {|m_{3}^{- 1}|}_{m_{4}} = 5, {|M_{3}^{- 1}|}_{m_{4}} = 8,

M_{1, 3} = 63, M_{2, 3} = 45, M_{3, 3} = 35,

{|M_{1, 3}^{- 1}|}_{m_{1}} = 2, {|M_{2, 3}^{- 1}|}_{m_{2}} = 5, {|M_{3, 3}^{- 1}|}_{m_{3}} = 8,

{|m_{1}^{- 1}|}_{m_{3}} = 2, {|m_{2}^{- 1}|}_{m_{3}} = 4, {|M_{2}^{- 1}|}_{m_{3}} = 8,

M_{1, 2} = 7, M_{2, 2} = 5,

{|M_{1, 2}^{- 1}|}_{m_{1}} = 3, {|M_{2, 2}^{- 1}|}_{m_{2}} = 3,

{|m_{1}^{- 1}|}_{m_{2}} = 3, {|M_{1}^{- 1}|}_{m_{2}} = 3 .

Step 2. The calculation of the residue sets

〈R_{1, l} (χ_{1}), R_{2, l} (χ_{2}), \dots, R_{l, l} (χ_{l})〉

according to (31) and (32)

(l = 1, 2, 3, 4)

.

We obtain

R_{1, 1} (χ_{1}) = χ_{1} = 3,

R_{1, 2} (χ_{1}) = {|- {|1 \cdot 3|}_{5} \cdot 3|}_{7} = 5,

R_{2, 2} (χ_{2}) = {|3 \cdot 6|}_{7} = 4,

R_{1, 3} (χ_{1}) = {|- {|3 \cdot 3|}_{5} \cdot 2|}_{9} = 1,

R_{2, 3} (χ_{2}) = {|- {|3 \cdot 6|}_{7} \cdot 4|}_{9} = 2,

R_{3, 3} (χ_{3}) = {|8 \cdot 4|}_{9} = 5,

R_{1, 4} (χ_{1}) = {|- {|2 \cdot 3|}_{5} \cdot 9|}_{11} = 2,

R_{2, 4} (χ_{2}) = {|- {|5 \cdot 6|}_{7} \cdot 8|}_{11} = 6,

R_{3, 4} (χ_{3}) = {|- {|8 \cdot 4|}_{9} \cdot 5|}_{11} = 8,

R_{4, 4} (χ_{4}) = {|8 \cdot 2|}_{11} = 5 .

As a result, the following sets of residues occur

〈R_{1, 1} (χ_{1})〉 = 〈3〉,

〈R_{1, 2} (χ_{1}), R_{2, 2} (χ_{2})〉 = 〈5, 4〉,

〈R_{1, 3} (χ_{1}), R_{2, 3} (χ_{2}), R_{3, 3} (χ_{3})〉 = 〈1, 2, 5〉,

〈R_{1, 4} (χ_{1}), R_{2, 4} (χ_{2}), R_{3, 4} (χ_{3}), R_{4, 4} (χ_{4})〉 = 〈2, 6, 8, 5〉 .
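The residue sets of Step 2 can be reproduced by precomputing one-input lookup tables, as in the sketch below. The names `R` and `build_luts` and the dictionary layout keyed by the pair (i, l) are illustrative assumptions about how such tables might be organized in software.

```python
from math import prod

def R(i, l, chi, moduli):
    """R_{i,l}(chi) from (31) for i < l and from (32) for i = l (1-based indices)."""
    m_i, m_l = moduli[i - 1], moduli[l - 1]
    M_prev = prod(moduli[:l - 1])                       # M_{l-1}
    if i == l:
        return (pow(M_prev, -1, m_l) * chi) % m_l
    chi_norm = (pow(M_prev // m_i, -1, m_i) * chi) % m_i
    return (-chi_norm * pow(m_i, -1, m_l)) % m_l

def build_luts(moduli):
    """One table per pair (i, l), 2 <= l <= k, 1 <= i <= l, indexed by the residue chi_i."""
    k = len(moduli)
    return {(i, l): [R(i, l, chi, moduli) for chi in range(moduli[i - 1])]
            for l in range(2, k + 1) for i in range(1, l + 1)}

moduli, residues = (5, 7, 9, 11), (3, 6, 4, 2)
luts = build_luts(moduli)
rows = [[luts[(i, l)][residues[i - 1]] for i in range(1, l + 1)]
        for l in range(2, len(moduli) + 1)]
print(rows)  # prints [[5, 4], [1, 2, 5], [2, 6, 8, 5]]
```

The first channel needs no table because its only entry is the residue χ_1 itself.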

Step 3. The summation of the residues

R_{1, l} (χ_{1}), R_{2, l} (χ_{2}), \dots, R_{l, l} (χ_{l})

modulo

m_{l}

along with the counting of occurring overflows according to (18) and (19), respectively

(l = 2, 3, 4)

.

Recall that

R_{1} (X) = R_{1, 1} (χ_{1}) = 3

, and

Q_{1} (X) = 0

. We have

R_{2} (X) = {|5 + 4|}_{7} = {|9|}_{7} = 2,

R_{3} (X) = {|1 + 2 + 5|}_{9} = {|8|}_{9} = 8,

R_{4} (X) = {|2 + 6 + 8 + 5|}_{11} = {|21|}_{11} = 10,

Q_{2} (X) = ⌊(5 + 4) / 7⌋ = ⌊9 / 7⌋ = 1,

Q_{3} (X) = ⌊(1 + 2 + 5) / 9⌋ = ⌊8 / 9⌋ = 0,

Q_{4} (X) = ⌊(2 + 6 + 8 + 5) / 11⌋ = ⌊21 / 11⌋ = 1 .

Therefore, the mixed-radix representations of the numbers

X_{4}^{(R)}

and

X_{3}^{(Q)}

(see (21) and (23)) are computed:

(x_{4}^{(R)}, x_{3}^{(R)}, x_{2}^{(R)}, x_{1}^{(R)}) = (R_{4} (X), R_{3} (X), R_{2} (X), R_{1} (X)) = (10, 8, 2, 3),

(x_{4}^{(Q)}, x_{3}^{(Q)}, x_{2}^{(Q)}, x_{1}^{(Q)}) = (Q_{3} (X), Q_{2} (X), 0, 0) = (0, 1, 0, 0)

.

Step 4. The calculation of the mixed-radix digits

(x_{4}, x_{3}, x_{2}, x_{1})

.

The addition of two numbers

X_{4}^{(R)} = (10, 8, 2, 3)

and

X_{3}^{(Q)} = (0, 1, 0, 0)

according to (28) gives the mixed-radix representation

(0, 0, 2, 3)

of the number X.

Let us now verify the obtained result. According to (5), we have

X = (0, 0, 2, 3) = 0 \cdot 315 + 0 \cdot 35 + 2 \cdot 5 + 3 = 13 .

This result holds because the residue code of the integer number

X = 13

is

3, 6, 4, 2

, since

{|13|}_{5} = 3

,

{|13|}_{7} = 6

,

{|13|}_{9} = 4

,

{|13|}_{11} = 2

. Thus, this result coincides with the condition of the example.
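The mixed-radix addition of Step 4 and the verification by (5) can be sketched as follows. The function names `mrs_add` and `mrs_value` are assumptions, and digits are listed least significant first, so the tuples (10, 8, 2, 3) and (0, 1, 0, 0) from the text appear reversed.

```python
from math import prod

def mrs_add(a, b, moduli):
    """Digit-wise mixed-radix addition with inter-digit carries; returns (digits, carry_out)."""
    digits, carry = [], 0
    for x, y, m in zip(a, b, moduli):
        carry, d = divmod(x + y + carry, m)
        digits.append(d)
    return digits, carry

def mrs_value(digits, moduli):
    """Equation (5): weighted value of a mixed-radix number."""
    return sum(x * prod(moduli[:i]) for i, x in enumerate(digits))

moduli = (5, 7, 9, 11)
x_R = [3, 2, 8, 10]   # X_4^(R), least significant digit first
x_Q = [0, 0, 1, 0]    # X_3^(Q), least significant digit first
digits, carry_out = mrs_add(x_R, x_Q, moduli)
print(digits, carry_out, mrs_value(digits, moduli))  # prints [3, 2, 0, 0] 1 13
```

The discarded carry out of the last digit, together with Q_4(X) = 1, accounts for the rank in (27): the CRT-number 6943 exceeds the dynamic range 3465 twice.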

6. The Computational Cost of the Reverse Conversion Method

As follows from the results above, the calculation of the mixed-radix digits

x_{1}

,

x_{2}

, ⋯,

x_{k}

reduces to the independent and parallel summation of small residues

R_{1, l} (χ_{1})

,

R_{2, l} (χ_{2})

, ⋯,

R_{l, l} (χ_{l})

modulo

m_{l}

in lth modular channel

(l = 1, 2, \dots, k)

, taking into account the number of overflows occurring during the modular addition operations (see (29)–(32)).

Let us evaluate the time required to perform the parallel reverse conversion.

First, we consider the calculation of mixed-radix digits of the numbers

X_{k}^{(R)} = (x_{k}^{(R)}, x_{k - 1}^{(R)}, \dots, x_{1}^{(R)})

and

X_{k - 1}^{(Q)} = (x_{k}^{(Q)}, x_{k - 1}^{(Q)}, \dots, x_{1}^{(Q)})

(see (29) and (30)). As can be seen, there are no modular addition operations in the first modular channel corresponding to the modulus

m_{1}

. In the second channel, we have only one addition operation modulo

m_{2}

. Furthermore, two additions modulo

m_{3}

are performed in the third channel and so on. Thus, in the lth modular channel, we have

l - 1

additions modulo

m_{l}

(l = 2, 3, \dots, k)

. These calculations are easily parallelized and pipelined. Therefore, the required computation time for calculating digits

x_{l}^{(R)}

and

x_{l}^{(Q)}

is

T_{l} = ⌈\log_{2} l⌉

modular clock cycles.

Thus, the time for obtaining the mixed-radix representations of the numbers

X_{k}^{(R)}

and

X_{k - 1}^{(Q)}

is determined by the time in the kth modular channel and equals

T_{k} = ⌈\log_{2} k⌉

modular clock cycles.

The summation of

X_{k}^{(R)}

and

X_{k - 1}^{(Q)}

on the bases

\{m_{1}, m_{2}, \dots, m_{k}\}

involves two additional modular clock cycles taking into account the inter-digit carries. Therefore, the execution time of the reverse conversion equals

T_{c o n v} = T_{k} + 2

modular clock cycles. Thus, the overall time is

t_{c o n v} = T_{c o n v} t_{m o d}

, where

t_{m o d}

denotes the modular clock cycle time. At the same time, when pipelined, the throughput rate of the proposed conversion method is one conversion in one modular clock cycle.
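The latency and throughput figures above can be evaluated for a concrete number of moduli. The sketch below is only an illustration: the function names are hypothetical, and the total for a stream of conversions assumes an ideal pipeline without stalls, which the text does not state explicitly.

```python
def conversion_cycles(k):
    """Latency T_conv = ceil(log2 k) + 2 in modular clock cycles."""
    if k < 2:
        raise ValueError("at least two moduli are assumed")
    # (k - 1).bit_length() equals ceil(log2 k) for every integer k >= 1.
    return (k - 1).bit_length() + 2


def pipelined_total_cycles(k, n_conversions):
    """Cycles to finish n conversions, assuming an ideal pipeline that
    accepts one new residue code per modular clock cycle."""
    return conversion_cycles(k) + n_conversions - 1


latency = conversion_cycles(8)            # 3 tree levels + 2 cycles
total = pipelined_total_cycles(8, 100)    # latency + 99 further cycles
```

Multiplying the returned cycle count by t_mod gives the time t_conv in the notation of this section.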

Consider now the evaluation of the required computational cost. Due to the small word-length of residues in the k-tuple

(χ_{1}, χ_{2}, \dots, χ_{k})

, the pre-computation and lookup table techniques are suitable for reverse conversion implementation. So, we can use one-input lookup tables depending on the residues word-length in each modular channel.

At the beginning stage of the reverse conversion, in the lth channel corresponding to the modulus

m_{l}

, the number of lookup tables required to store the residue set

〈R_{1, l} (χ_{1}), R_{2, l} (χ_{2}), \dots, R_{l, l} (χ_{l})〉

equals

N_{l u t} (l) = l

. At the same time, the word length of recorded residues is

b_{l} = ⌈\log_{2} m_{l}⌉

bits

(l = 2, 3, \dots, k)

. In the first modular channel,

N_{l u t} (1) = 0

since

S_{1} (X) = χ_{1}

.

Then, the overall number of one-input lookup tables in all modular channels is equal to

\begin{matrix} N_{l u t} = \sum_{l = 2}^{k} N_{l u t} (l) = \frac{k^{2} + k - 2}{2} . \end{matrix}

The summation of the residues

R_{1, l} (χ_{1}), R_{2, l} (χ_{2}), \dots, R_{l, l} (χ_{l})

modulo

m_{l}

requires

N_{a d d} (l) = l - 1

modular addition operations

(l = 2, 3, \dots, k)

. At the same time, all independent calculations are realized in parallel in corresponding modular channels.

Taking into account that

x_{1}^{(Q)} = x_{2}^{(Q)} = 0

, the summation of two numbers

X_{k}^{(R)} = (x_{k}^{(R)}, x_{k - 1}^{(R)}, \dots, x_{1}^{(R)})

and

X_{k - 1}^{(Q)} = (x_{k}^{(Q)}, x_{k - 1}^{(Q)}, \dots, x_{1}^{(Q)})

on the final stage of the reverse conversion requires

2 (k - 2)

modular addition operations.

Hence, the overall number of modular addition operations in all modular channels is equal to

\begin{matrix} N_{a d d} = \sum_{l = 2}^{k} N_{a d d} (l) + 2 (k - 2) = \frac{k^{2} + 3 k - 8}{2} . \end{matrix}
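As a quick check of the two closed forms, the per-channel counts N_lut(l) = l and N_add(l) = l - 1 can be summed directly and compared with the formulas. This is a small Python sketch; the function names are illustrative only.

```python
def lut_count(k):
    """Total one-input lookup tables: channels l = 2..k hold l tables each."""
    return sum(l for l in range(2, k + 1))


def add_count(k):
    """Total modular additions: l - 1 per channel l = 2..k,
    plus 2(k - 2) for the final summation of the two MRS numbers."""
    return sum(l - 1 for l in range(2, k + 1)) + 2 * (k - 2)


for k in range(2, 65):
    assert lut_count(k) == (k * k + k - 2) // 2
    assert add_count(k) == (k * k + 3 * k - 8) // 2
```

For example, k = 4 gives 9 lookup tables and 10 modular additions.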

When pipelined, the throughput rate of the proposed method is one reverse conversion in one modular clock cycle.

7. Discussion

As follows from [1], the calculation of the mixed-radix digits

x_{1}, x_{2}, \dots, x_{k}

(see (6) and (7)) requires

k - 1

additions and as many multiplications; in this case, the overall conversion time is

k (k - 1) / 2 \cdot (t_{a d d} + t_{m u l})

, where

t_{a d d}

and

t_{m u l}

denote the execution times of addition/subtraction and multiplication, respectively. The computational cost of the pipelined implementation of this algorithm is

k (k - 1) / 2

multiplication operations and as many addition operations, while the conversion time is

(k - 1) (t_{a d d} + t_{m u l})

. The main drawback of this method is its strictly sequential nature.
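The sequential character of this scheme is easy to see in the textbook mixed-radix recurrence, where each digit depends on all previously computed digits. The Python sketch below assumes that this classical recurrence is the one meant by (6) and (7), which are not reproduced here; the function names are hypothetical, and `pow(a, -1, m)` requires Python 3.8 or later.

```python
def mixed_radix_digits(residues, moduli):
    """Classical sequential mixed-radix conversion.

    residues[i] is X mod moduli[i]; the moduli are assumed pairwise coprime.
    Returns the digits [x_1, ..., x_k], least significant first.
    """
    digits = []
    for i, m_i in enumerate(moduli):
        t = residues[i] % m_i
        # Peel off the already known digits x_1, ..., x_{i} one by one.
        for j in range(i):
            inv = pow(moduli[j], -1, m_i)   # modular inverse of m_j mod m_i
            t = ((t - digits[j]) * inv) % m_i
        digits.append(t)
    return digits


def from_mixed_radix(digits, moduli):
    """Rebuild X = x_1 + x_2 m_1 + x_3 m_1 m_2 + ... from its digits."""
    value, weight = 0, 1
    for x, m in zip(digits, moduli):
        value += x * weight
        weight *= m
    return value


moduli = (3, 5, 7)
x = 52
digits = mixed_radix_digits([x % m for m in moduli], moduli)
```

The inner loop runs k (k - 1) / 2 times in total, each pass being one subtraction and one multiplication, which is consistent with the k (k - 1) / 2 factor in the conversion time quoted above.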

The parallel conversion method described in [16] uses additional lookup tables. In total,

k (k + 1) / 2

lookup tables and

k (k + 1) / 2

adders are required. The conversion time is

t_{l u t} + (k - 1) t_{a d d}

due to the need to generate the inter-digit carries when performing addition operations. As noted in [34], the method proposed in [16] does not allow obtaining the claimed depth of

O (\log_{2} k)

in terms of RNS processing elements. In this regard, an improved method was proposed by adding extra

k (k + 1) / 2

multipliers to hardware resources used in [16]. The implementation time is

t_{l u t} + t_{m u l} + (2 \log_{2} k + 1) t_{a d d}

. Hence, the time complexity of this conversion algorithm is

O (\log_{2} k)

.

In [15], the mixed-radix conversion is realized by the cascaded scheme of lookup tables and adders. The computational cost for the sequential implementation is

k (k - 2) / 4

double-size lookup tables and

k (k - 2) / 4

adders, while the conversion time equals

(k / 2) \cdot (t_{l u t} + t_{a d d})

. When pipelined, the throughput rate is determined by the time

t_{l u t} + t_{a d d}

. This method works well when the used moduli do not have a very large word-length, since the size of lookup tables increases significantly with a word-length growth.

The paper [17] presents the parallel reverse conversion method, which uses the lookup table technique and requires no arithmetic or logical units. As reported, this algorithm is better than the ones presented in [15,16]. It is based on solving

k (k - 1) / 2

linear Diophantine equations and requires

k (k - 1) / 2

lookup tables of size

m_{i} \times m_{j}

, while the conversion time is

(k - 1) t_{l u t}

. When pipelined, its effective conversion rate is one conversion per

t_{l u t}

. So, this method is attractive for DSP implementation. However, it is not suitable for implementing cryptographic applications because of the enormous size of the required lookup tables, especially when processing large numbers.

In the paper [9], the reverse conversion method is based on modular reduction by a modified canonic CRT algorithm. This enables minimizing the bit-width of intermediate data processing. The lookup tables translate the

b_{i}

-bit input residues

(i = 1, 2, \dots, k)

into

b_{o u t}

-bit output integers, where

b_{i} = ⌈\log_{2} m_{i}⌉

,

b_{o u t} = ⌈\frac{1}{2} \log_{2} (\sum_{i = 1}^{k} b_{i})⌉

, and k is the number of RNS moduli. As a result, the modular reduction of the modified k-tuple of

b_{o u t}

-bit integers is carried out over a ring of size

2^{b_{o u t}}

such that only the

b_{o u t}

least significant bits of the binary representation are maintained. In this case, all the

b_{o u t}

-bit outputs in the modified k-tuple are added together by an adder tree without regard to overflow, propagating the

b_{o u t}

least significant bits to the output. The reverse conversion requires k lookup tables and

k - 1

adders. The scope of used lookup tables is

2^{b} \times 2^{b_{o u t}}

,

b \in \{b_{1}, b_{2}, \dots, b_{k}\}

. The overall conversion time is

t_{l u t} + ⌈\log_{2} k⌉ t_{a d d}

.

Some reverse conversion methods use the special moduli sets with a limited number of moduli, such as

m = 2^{n} + d

(d \in \{- 1, 0, 1\})

[2,8,35,36,37,38,39,40]. Their main drawback is the small number of moduli in the set, typically from three to five. These moduli sets are suitable for efficient implementations of DSP algorithms but are not applicable to the processing of large numbers, which is widely used in cryptography. For example, to represent 1024-bit cryptographic numbers using four RNS moduli, each modular channel must handle residues of 256-bit length, which is not suitable for high-performance computing.
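The arithmetic behind the 1024-bit example can be stated in one line. The Python sketch below assumes moduli of equal word length, so that the dynamic range is roughly k times the residue width; the function name is hypothetical.

```python
def bits_per_channel(total_bits, k):
    """Residue word length needed so that k equal-width moduli
    cover a dynamic range of total_bits bits (ceiling division)."""
    return -(-total_bits // k)


four_moduli = bits_per_channel(1024, 4)     # 256-bit residues
many_moduli = bits_per_channel(1024, 64)    # 16-bit residues
```

With 64 moduli instead of four, the same range is covered by 16-bit residues, which illustrates why methods restricted to a few special moduli scale poorly to large numbers.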

Table 3 compares the reverse conversion techniques discussed above. Here, we use the following abbreviations: LUT (lookup table), ADD (adder), MUL (multiplier). The bit length

b \in \{b_{1}, b_{2}, \dots, b_{k}\}

,

b_{l} = ⌈\log_{2} m_{l}⌉

(l = 1, 2, \dots, k)

.

As seen from above, the proposed parallel reverse conversion method has time complexity of the order

O (⌈\log_{2} k⌉)

. In pipelined mode, it achieves a high throughput rate of one reverse conversion per modular clock cycle. At the same time, the computational complexity is of the order of

O (k^{2} / 2)

in terms of the number of both required arithmetic operations and one-input lookup tables.
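To compare the conversion-time expressions of Table 3 for a particular k, they can simply be evaluated side by side. The Python sketch below does this; the function name and the default of unit delays for t_lut, t_add, t_mul, and t_mod are assumptions made purely for illustration, since real delays depend on the word length and the technology.

```python
import math


def conversion_times(k, t_lut=1.0, t_add=1.0, t_mul=1.0, t_mod=1.0):
    """Evaluate the conversion-time formulas listed in Table 3.

    All delays are in the same arbitrary unit; the unit defaults are
    an illustrative assumption, not measured values.
    """
    log_k = math.log2(k)
    ceil_log_k = (k - 1).bit_length()   # ceil(log2 k) for integer k >= 1
    return {
        "[1], sequential": k * (k - 1) / 2 * (t_mul + t_add),
        "[16], parallel": t_lut + (k - 1) * t_add,
        "[34], parallel": t_lut + t_mul + (2 * log_k + 1) * t_add,
        "[15], sequential": k / 2 * (t_lut + t_add),
        "[17], parallel": (k - 1) * t_lut,
        "[9]": t_lut + ceil_log_k * t_add,
        "proposed, parallel": (ceil_log_k + 2) * t_mod,
    }


times = conversion_times(8)
for method, t in sorted(times.items(), key=lambda item: item[1]):
    print(f"{method:20s} {t:6.1f}")
```

Passing measured or estimated delays in place of the defaults gives a comparison for a specific hardware setting; the ranking under unit delays should not be read as a performance claim.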

8. Conclusions

In this paper, a novel approach to parallel reverse conversion of the residue code

(χ_{1}, χ_{2}, \dots, χ_{k})

of the number X to mixed-radix representation

(x_{k}, x_{k - 1}, \dots, x_{1})

is described.

The calculation of the mixed-radix digits

(x_{k}, x_{k - 1}, \dots, x_{1})

is reduced to a parallel summation of the small word-length residues

R_{1, l} (χ_{1})

,

R_{2, l} (χ_{2})

, ⋯,

R_{l, l} (χ_{l})

modulo

m_{l}

in the lth modular channel

(l = 1, 2, \dots, k)

, taking into account the number of overflows occurring during the modular addition operations. These modular operations are performed quickly and independently in each modular channel and are easily pipelined.

The computational cost of the proposed reverse conversion method is presented. In all modular channels, the total number of modular addition operations is equal to

N_{a d d} = (k^{2} + 3 k - 8) / 2

. At the same time, the total number of required one-input lookup tables is

N_{l u t} = (k^{2} + k - 2) / 2

.

The execution time of the reverse conversion equals

T_{c o n v} = ⌈\log_{2} k⌉ + 2

modular clock cycles. At the same time, when pipelined, the throughput rate of the proposed conversion method is one conversion in one modular clock cycle.

The proposed parallel reverse conversion method is in line with the direction in which modern high-performance computing based on residue arithmetic is developing. It can be applied to a broad class of tasks in various areas of science and technology, first of all in digital signal processing and cryptography.

Author Contributions

Conceptualization, M.S.; investigation, Y.P.; methodology, M.S.; writing—original draft preparation, M.S.; writing—review and editing, Y.P. All authors have read and improved the final version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Szabo, N.S.; Tanaka, R.I. Residue Arithmetic and Its Application to Computer Technology; McGraw-Hill: New York, NY, USA, 1967.
2. Molahosseini, A.S.; de Sousa, L.S.; Chang, C.H. (Eds.) Embedded Systems Design with Special Arithmetic and Number Systems; Springer: Cham, Switzerland, 2017.
3. Akushskii, I.Y.; Juditskii, D.I. Machine Arithmetic in Residue Classes; Soviet Radio: Moscow, Russia, 1968. (In Russian)
4. Amerbayev, V.M. Theoretical Foundations of Machine Arithmetic; Nauka: Alma-Ata, Kazakhstan, 1976. (In Russian)
5. Omondi, A.R.; Premkumar, B. Residue Number Systems: Theory and Implementation; Imperial College Press: London, UK, 2007.
6. Soderstrand, M.A.; Jenkins, W.K.; Jullien, G.A.; Taylor, F.J. (Eds.) Residue Number System Arithmetic: Modern Applications in Digital Signal Processing; IEEE Press: New York, NY, USA, 1986.
7. Chernyavsky, A.F.; Danilevich, V.V.; Kolyada, A.A.; Selyaninov, M.Y. High-Speed Methods, and Systems of Digital Information Processing; Belarusian State University: Minsk, Belarus, 1996. (In Russian)
8. Ananda Mohan, P.V. Residue Number Systems. Theory and Applications; Springer: Cham, Switzerland, 2016.
9. Michaels, A.J. A maximal entropy digital chaotic circuit. In Proceedings of the 2011 IEEE International Symposium of Circuits and Systems (ISCAS), Rio de Janeiro, Brazil, 15–18 May 2011; pp. 717–720.
10. Ding, C.; Pei, D.; Salomaa, A. Chinese Remainder Theorem: Applications in Computing, Coding, Cryptography; World Scientific: Singapore, 1996.
11. Omondi, A.R. Cryptography Arithmetic: Algorithms and Hardware Architectures; Springer: Cham, Switzerland, 2020.
12. Burton, D.M. Elementary Number Theory, 7th ed.; McGraw-Hill: New York, NY, USA, 2011.
13. Hardy, G.H.; Wright, E.M. An Introduction to the Theory of Numbers, 6th ed.; Oxford University Press: London, UK, 2008.
14. Akkal, M.; Siy, P. A new mixed radix conversion algorithm MRC-II. J. Syst. Archit. 2007, 53, 577–586.
15. Chakraborti, N.B.; Soundararajan, J.S.; Reddy, A.L.N. An implementation of mixed-radix conversion for residue number applications. IEEE Trans. Comput. 1986, 35, 762–764.
16. Huang, C.H. Fully parallel mixed-radix conversion algorithm for residue number applications. IEEE Trans. Comput. 1983, 32, 398–402.
17. Miller, D.F.; McCormick, W.S. An arithmetic free parallel mixed-radix conversion algorithm. IEEE Trans. Circuits Syst. II 1998, 45, 158–162.
18. Yassine, H.M.; Moore, W.R. Improved mixed-radix conversion for residue number architectures. IEE Proc. G - Circuits Devices Syst. 1991, 138, 120–124.
19. Knuth, D.E. The Art of Computer Programming, Volume 2: Seminumerical Algorithms, 3rd ed.; Addison-Wesley: Boston, MA, USA, 1998.
20. Shoup, V. A Computational Introduction to Number Theory and Algebra, 2nd ed.; Cambridge University Press: Cambridge, UK, 2005.
21. Phatak, D.S.; Houston, S.D. New distributed algorithms for fast sign detection in residue number systems (RNS). J. Parallel Distrib. Comput. 2016, 97, 78–95.
22. Shenoy, M.A.P.; Kumaresan, R. A fast and accurate RNS scaling technique for high speed signal processing. IEEE Trans. Acoust. Speech Signal Process. 1989, 37, 929–937.
23. Vu, T.V. Efficient implementations of the Chinese Remainder Theorem for sign detection and residue decoding. IEEE Trans. Comput. 1985, 34, 646–651.
24. Miller, D.D.; Altschul, R.E.; King, J.R.; Polky, J.N. Analysis of the residue class core function of Akushskii, Burcev, and Pak. In Residue Number System Arithmetic: Modern Applications in Digital Signal Processing; IEEE Press: Piscataway, NJ, USA, 1986; pp. 390–401.
25. Gonnella, J. The application of core functions to residue number system. IEEE Trans. Signal Process. 1991, 39, 69–75.
26. Abtahi, M. Core function of an RNS number with no ambiguity. Comput. Math. Appl. 2005, 50, 459–470.
27. Kong, Y.; Asif, S.; Khan, M.A.U. Modular multiplication using the core function in the residue number system. Appl. Algebra Eng. Commun. Comput. 2016, 27, 1–16.
28. Kolyada, A.A.; Selyaninov, M.Y. Generation of integral characteristics of symmetric-range residue codes. Cybern. Syst. Anal. 1986, 22, 431–437.
29. Selianinau, M. An efficient implementation of the CRT algorithm based on an interval-index characteristic and minimum-redundancy residue code. Int. J. Comput. Meth. 2020, 17, 2050004.
30. Lu, M.; Chiang, J.-S. A novel division algorithm for the residue number system. IEEE Trans. Comput. 1992, 41, 1026–1032.
31. Dimauro, G.; Impedovo, S.; Modugno, R.; Pirlo, G.; Stefanelli, R. Residue-to-binary conversion by the “quotient function”. IEEE Trans. Circuits Syst. II Analog Digital Signal Process. 2003, 50, 488–493.
32. Dimauro, G.; Impedovo, S.; Pirlo, G.; Salzo, A. RNS architectures for the implementation of the “diagonal function”. Inf. Process. Lett. 2000, 73, 189–198.
33. Pirlo, G.; Impedovo, D. A new class of monotone functions of the residue number system. Int. J. Math. Models Meth. Appl. Sci. 2013, 7, 802–809.
34. Hitz, M.A.; Kaltofen, E. Integer division in residue number systems. IEEE Trans. Comput. 1995, 44, 983–989.
35. Bergerman, M.V.; Lyakhov, P.A.; Voznesensky, A.S.; Bogaevskiy, D.V.; Kaplun, D.I. Designing reverse converter for data transmission systems from two-level RNS to BNS. J. Phys. Conf. Ser. 2020, 1658, 012005.
36. Daphni, S.; Vijula Grace, K.S. A review analysis of reverse converter based on RNS in signal processing. Int. J. Sci. Technol. Res. 2020, 9, 1686–1689.
37. Sousa, L.; Paludo, R.; Martins, P.; Pettenghi, H. Towards the integration of reverse converters into the RNS channels. IEEE Trans. Comput. 2020, 69, 342–348.
38. Mojahed, M.; Molahosseini, A.S.; Zarandi, A.A.E. A multifunctional unit for reverse conversion and sign detection based on the 5-moduli set. Comp. Sci. 2021, 22, 101–121.
39. Salifu, A. New reverse conversion for four-moduli set and five-moduli set. J. Comp. Commun. 2021, 9, 57–66.
40. Taghizadeghankalantari, M.; TaghipourEivazi, S. Design of efficient reverse converters for Residue Number System. J. Circuits Syst. Comp. 2021, 30, 2150141.

Table 1. The pre-calculation components.

Input Residue	Number and Scope of LUTs	Output Residue Set
$χ_{1}$	$k - 1$ , $2^{b_{1}} \times b_{l}$ $(l = 2, 3, \dots, k)$	$〈R_{1, 2} (χ_{1}), R_{1, 3} (χ_{1}), \dots, R_{1, k} (χ_{1})〉$
$χ_{2}$	$k - 1$ , $2^{b_{2}} \times b_{l}$ $(l = 2, 3, \dots, k)$	$〈R_{2, 2} (χ_{2}), R_{2, 3} (χ_{2}), \dots, R_{2, k} (χ_{2})〉$
⋯	⋯	⋯
$χ_{k - 1}$	2, $2^{b_{k - 1}} \times b_{l}$ $(l = k - 1, k)$	$〈R_{k - 1, k - 1} (χ_{k - 1}), R_{k - 1, k} (χ_{k - 1})〉$
$χ_{k}$	1, $2^{b_{k}} \times b_{k}$	$〈R_{k, k} (χ_{k})〉$

Table 2. The results of calculations in the modular channels.

Modular Channel	Input Data	Output Data
$m_{2}$	$〈R_{1, 2} (χ_{1}), R_{2, 2} (χ_{2})〉$	$x_{2}^{(R)}$ , $x_{3}^{(Q)}$
$m_{3}$	$〈R_{1, 3} (χ_{1}), R_{2, 3} (χ_{2}), R_{3, 3} (χ_{3})〉$	$x_{3}^{(R)}$ , $x_{4}^{(Q)}$
⋯	⋯	⋯
$m_{k - 1}$	$〈R_{1, k - 1} (χ_{1}), R_{2, k - 1} (χ_{2}), \dots, R_{k - 1, k - 1} (χ_{k - 1})〉$	$x_{k - 1}^{(R)}$ , $x_{k}^{(Q)}$
$m_{k}$	$〈R_{1, k} (χ_{1}), R_{2, k} (χ_{2}), \dots, R_{k, k} (χ_{k})〉$	$x_{k}^{(R)}$

Table 3. RNS-to-MRS reverse conversion methods.

Method	Number and Scope of LUTs	ADD	MUL	Conversion Time
[1], sequential	–	$k - 1$	$k - 1$	$\frac{k (k - 1)}{2} (t_{m u l} + t_{a d d})$
[1], sequential, pipelined	$\frac{k (k - 1)}{2}$ ; $2^{(b + 1)} \times b$	$\frac{k (k - 1)}{2}$	–	$(k - 1) (t_{l u t} + t_{a d d})$
[16], parallel	$\frac{k (k + 1)}{2}$ ; $2^{b} \times b$	$\frac{k (k + 1)}{2}$	–	$t_{l u t} + (k - 1) t_{a d d}$
[34], parallel	$\frac{k (k + 1)}{2}$ ; $2^{b} \times b$	$\frac{k (k + 1)}{2}$	$\frac{k (k + 1)}{2}$	$t_{l u t} + t_{m u l} + (2 \log_{2} k + 1) t_{a d d}$
[15], sequential	$\frac{k (k - 2)}{4}$ ; $2^{2 b} \times 2 b$	$\frac{k (k - 1)}{4}$	–	$\frac{k}{2} (t_{l u t} + t_{a d d})$
[15], parallel	$\frac{k (k - 2)}{4} + k - 1$ ; $2^{2 b} \times 2 b$	$\frac{k (k + 2)}{4} - 3$	–	$t_{l u t} + \frac{k}{2} t_{a d d}$
[17], parallel	$\frac{k (k - 1)}{2}$ ; $2^{2 b} \times b$	–	–	$(k - 1) t_{l u t}$
[9]	$k$ ; $2^{b} \times 2^{⌈\frac{1}{2} \log_{2} (k b)⌉}$	$k - 1$	–	$t_{l u t} + (⌈\log_{2} k⌉) t_{a d d}$
Our method, parallel	$\frac{k^{2} + k - 2}{2}$ ; $2^{b} \times b$	$\frac{k^{2} + 3 k - 8}{2}$	–	$(⌈\log_{2} k⌉ + 2) t_{m o d}$

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

