Performance Analysis of Hardware Implementations of Reverse Conversion from the Residue Number System

Kuchukov, Viktor; Telpukhov, Dmitry; Babenko, Mikhail; Mkrtchan, Ilya; Stempkovsky, Alexander; Kucherov, Nikolay; Ermakova, Tatiana; Grigoryan, Marine

doi:10.3390/app122312355

by Viktor Kuchukov ^1,*, Dmitry Telpukhov ^2,*, Mikhail Babenko ^3,4, Ilya Mkrtchan ^2, Alexander Stempkovsky ^2, Nikolay Kucherov ^3, Tatiana Ermakova ^5 and Marine Grigoryan ^6

^1 North-Caucasus Center for Mathematical Research, North-Caucasus Federal University, 355017 Stavropol, Russia

^2 Institute for Design Problems in Microelectronics, 124365 Moscow, Russia

^3 Department of Applied Mathematics and Mathematical Modeling, North-Caucasus Federal University, 355017 Stavropol, Russia

^4 Institute for System Programming of the Russian Academy of Sciences, 124681 Moscow, Russia

^5 School of Computing, Communication and Business, Hochschule für Technik und Wirtschaft (University of Applied Sciences for Engineering and Economics), 10318 Berlin, Germany

^6 Department of Higher Mathematics and Physics, National University of Architecture and Construction of Armenia, Yerevan 0009, Armenia

^* Authors to whom correspondence should be addressed.

Appl. Sci. 2022, 12(23), 12355; https://doi.org/10.3390/app122312355

Submission received: 17 November 2022 / Revised: 28 November 2022 / Accepted: 29 November 2022 / Published: 2 December 2022

(This article belongs to the Special Issue Emerging Residue Number System Technologies and Applications)

Abstract: The Residue Number System (RNS) is a non-positional number system that allows parallel computations without carry transfers between digits. However, some operations in RNS require knowledge of the positional characteristic of a number. Among these operations is the conversion from RNS to the positional number system. Methods of reverse conversion for general-form moduli, based on the Chinese remainder theorem and the mixed-radix conversion, are considered, as well as optimized methods for special-form moduli. In this paper, a method is proposed that develops the authors' ideas based on the modified mixed-radix conversion and reference points. The modified method based on the mixed-radix conversion makes it possible to replace the computation of a residue modulo a large number with a sequential calculation of the residue. The method of reference points reduces the amount of stored information compared with using ROM to store all the residues of RNS. Combining the two approaches makes it possible to balance calculation speed against hardware cost by varying the number of moduli handled by each method.

Keywords: residue number system; Chinese remainder theorem; mixed-radix conversion; reference points

1. Introduction

Positional arithmetic on binary and decimal numbers requires carry transfers between digits, which degrades performance or increases hardware overhead when operating on large numbers. In non-positional number systems, such as the Residue Number System (RNS), transfers between digits are avoided. Interest in RNS as a basis for computing system development emerged as early as the 1950s [1,2]. In particular, in the USSR, long-range detection radar stations operating on the basis of modular arithmetic and built under the leadership of I.Ya. Akushsky and D.I. Yuditsky were deployed in the 1960s. Currently, various aspects of the residue number system are applied in patents of companies from the USA (Olsen IP Reserve, LLC; The Athena Group, Inc.; Cisco Technology, Inc.), China (Tianjin University; No 30 Inst China Electronic Technology Group Corp.; Univ Xidian), and South Korea (Samsung Electronics Co., Ltd.).

The architecture of computing systems operating in RNS generally consists of forward conversion of positional numbers into RNS [3]; computation of the main problem using modular addition [4], multiplication [5], number comparison, and sign determination [6]; and reverse conversion from RNS to the positional number system [7,8,9,10,11,12,13,14]. While addition, subtraction, and multiplication in RNS are performed in parallel on small operands, sign determination, comparison, and conversion from RNS to the positional number system require the calculation of the positional characteristic of the number, which is computationally complex.

The work is structured as follows. In Section 2, reverse conversion methods for general form moduli, such as the Chinese Remainder Theorem (CRT), Mixed-Radix Conversion (MRC), CRT with fractional numbers, New CRT I, and New CRT II, are discussed. In Section 3, we consider the application of RNS moduli of the special kind $2^{n} \pm 1$, $2^{n}$. Section 4 considers the MRC-based method proposed by the authors. In Section 5, the authors' proposed method based on reference points is considered. The main result of the work is the method considered in Section 6, which combines the modified MRC and the method of reference points. The results of the physical synthesis of the reverse conversion methods are presented in Section 7. Section 8 analyzes the results of the physical synthesis and draws conclusions about the applicability of the proposed approaches.

2. Reverse Conversion for General View Moduli

Most methods of reverse conversion are based on the use of CRT and MRC.

Reconstruction of a number X from its modular form $(x_{1}, x_{2}, \dots, x_{n})$ using the classical form of the Chinese Remainder Theorem (CRTc) [7] is possible by the formula:

X = {|\sum_{i = 1}^{n} {|P_{i}^{- 1}|}_{p_{i}} \cdot P_{i} \cdot x_{i}|}_{P},

(1)

where $p_{1}, p_{2}, \dots, p_{n}$ are the RNS moduli, $P = \prod_{i = 1}^{n} p_{i}$ is the RNS dynamic range, $P_{i} = \frac{P}{p_{i}}$, and ${|P_{i}^{- 1}|}_{p_{i}}$ is the multiplicative inverse of $P_{i}$ modulo $p_{i}$.

To illustrate the considered methods, let us consider examples of RNS covering the 8-bit range, in particular $\{7, 8, 9\}$ with range $P = 7 \cdot 8 \cdot 9 = 504$.

For the given RNS the auxiliary coefficients are:

\begin{matrix} P_{1} = \frac{504}{7} = 72, & P_{2} = 63, & P_{3} = 56, \\ {|P_{1}^{- 1}|}_{p_{1}} = 4, & {|P_{2}^{- 1}|}_{p_{2}} = 7, & {|P_{3}^{- 1}|}_{p_{3}} = 5 . \end{matrix}

Consider the reconstruction of the number $X = (3, 2, 1) = 10$. Using Formula (1), we obtain:

X = {|4 \cdot 72 \cdot 3 + 7 \cdot 63 \cdot 2 + 5 \cdot 56 \cdot 1|}_{504} = {|2026|}_{504} = 10 .
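The calculation above can be reproduced with a minimal Python sketch of Formula (1). The function name `crt_reverse` is illustrative and does not come from the paper; the sketch assumes pairwise coprime moduli and Python 3.8+ (for `pow` with a negative exponent).

```python
from math import prod

def crt_reverse(residues, moduli):
    """Classical CRT reconstruction following Formula (1)."""
    P = prod(moduli)                      # dynamic range P
    total = 0
    for x_i, p_i in zip(residues, moduli):
        P_i = P // p_i                    # P_i = P / p_i
        inv = pow(P_i, -1, p_i)           # |P_i^{-1}|_{p_i}
        total += inv * P_i * x_i
    return total % P                      # the costly reduction modulo P

print(crt_reverse([3, 2, 1], [7, 8, 9]))  # 10
```

The final `% P` is the step that is expensive in hardware; the remaining methods in this section aim to avoid or shrink it.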

In a hardware implementation, the products are calculated in parallel and the summation can be performed pairwise in a tree structure, which speeds up the calculations; however, the remainder of the division by the large number P still has to be found.

Although in some cases the circuit can be simplified by reducing each summand ${|{|P_{i}^{- 1}|}_{p_{i}} \cdot P_{i} \cdot x_{i}|}_{P}$ of Formula (1) modulo P separately, the hardware implementation of the classical CRT leads to large area and delay.

Another approach to reverse conversion uses the mixed-radix conversion [7], according to which the number $(x_{1}, x_{2}, \dots, x_{n})$ can be transformed into a positional number system by the formula:

X = d_{1} + d_{2} \cdot p_{1} + d_{3} \cdot p_{1} \cdot p_{2} + \dots + d_{n} \cdot p_{1} \cdot \dots \cdot p_{n - 1},

(2)

In this case, the MRC digits $d_{i}$ can be calculated with the formulas:

\begin{matrix} d_{1} = x_{1}, \end{matrix}

\begin{matrix} d_{2} = (x_{2} - d_{1}) \cdot {|p_{1}^{- 1}|}_{p_{2}} mod p_{2}, \end{matrix}

\begin{matrix} d_{3} = ((x_{3} - d_{1}) \cdot {|p_{1}^{- 1}|}_{p_{3}} - d_{2}) \cdot {|p_{2}^{- 1}|}_{p_{3}} mod p_{3} \end{matrix}

\begin{matrix} \dots \end{matrix}

\begin{matrix} d_{n} = (\dots ((x_{n} - d_{1}) \cdot {|p_{1}^{- 1}|}_{p_{n}} - d_{2}) \cdot {|p_{2}^{- 1}|}_{p_{n}} - \dots - d_{n - 1}) \cdot {|p_{n - 1}^{- 1}|}_{p_{n}} mod p_{n} . \end{matrix}

For the number $X = (3, 2, 1) = 10$, the reconstruction proceeds as follows:

\begin{matrix} d_{1} = 3, \end{matrix}

\begin{matrix} d_{2} = (2 - 3) \cdot {|7^{- 1}|}_{8} mod 8 = (- 1 \cdot 7) mod 8 = 1, \end{matrix}

\begin{matrix} d_{3} = ((1 - 3) \cdot {|7^{- 1}|}_{9} - 1) \cdot {|8^{- 1}|}_{9} mod 9 = (- 2 \cdot 4 - 1) \cdot 8 mod 9 = 0 . \end{matrix}

Then, by the Formula (2), we obtain:

X = 3 + 1 \cdot 7 + 0 \cdot 7 \cdot 8 = 10 .
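A short Python sketch of the MRC digit recurrence and Formula (2) follows. The names `mrc_digits` and `mrc_reverse` are illustrative, and the inner loop mirrors the nested structure of the formulas for $d_{i}$ rather than any particular hardware architecture.

```python
def mrc_digits(residues, moduli):
    """Mixed-radix digits d_1..d_n computed by the nested recurrence."""
    digits = []
    for i, (x_i, p_i) in enumerate(zip(residues, moduli)):
        t = x_i
        for j in range(i):
            # subtract the already known digit, then divide by p_j modulo p_i
            t = ((t - digits[j]) * pow(moduli[j], -1, p_i)) % p_i
        digits.append(t % p_i)
    return digits

def mrc_reverse(residues, moduli):
    """Formula (2): weighted sum of the mixed-radix digits."""
    X, weight = 0, 1
    for d, p in zip(mrc_digits(residues, moduli), moduli):
        X += d * weight
        weight *= p
    return X

print(mrc_digits([3, 2, 1], [7, 8, 9]))   # [3, 1, 0]
print(mrc_reverse([3, 2, 1], [7, 8, 9]))  # 10
```

Every intermediate value stays below the largest modulus, but digit $d_{i}$ depends on all earlier digits, which is the source of the sequential delay.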

The main advantage of MRC is that calculations can be performed with small numbers, without having to find a remainder modulo a large number. However, the sequential architecture of MRC leads to an increase in delay as the number of moduli grows.

To avoid finding the remainder of the division by the dynamic range in the classical Chinese remainder theorem, a modification with fractional numbers is used, in which both sides of expression (1) are divided by the dynamic range P [15]:

\bar{X} = \frac{X}{P} = {|\sum_{i = 1}^{n} \frac{{|P_{i}^{- 1}|}_{p_{i}}}{p_{i}} \cdot x_{i}|}_{1} = {|\sum_{i = 1}^{n} k_{i} \cdot x_{i}|}_{1},

(3)

where the symbol ${|\circ|}_{1}$ denotes the fractional part of the number, and

k_{i} = \frac{{|P_{i}^{- 1}|}_{p_{i}}}{p_{i}}, i = 1, 2, \dots, n .

(4)

In this case, the sum value in (3) is a positional characteristic that lies in the range $[0, 1)$ and enables number comparison and sign detection. The number X we are looking for is obtained by multiplying $\bar{X}$ by the dynamic range P, i.e., $X = \bar{X} \cdot P$.

For the RNS with moduli $\{7, 8, 9\}$, the coefficients calculated by Formula (4) are:

k_{1} = \frac{{|P_{1}^{- 1}|}_{p_{1}}}{p_{1}} = \frac{4}{7}, k_{2} = \frac{7}{8}, k_{3} = \frac{5}{9} .

The number $X = (3, 2, 1) = 10$ is reconstructed with Formula (3):

\bar{X} = {|\frac{4 \cdot 3}{7} + \frac{7 \cdot 2}{8} + \frac{5 \cdot 1}{9}|}_{1} = {|\frac{1013}{252}|}_{1} = \frac{5}{252},

then $X = \frac{5}{252} \cdot 504 = 10$.

However, the coefficients $k_{i}$ can rarely be represented as finite binary fractions, so the accuracy to which the coefficients are rounded becomes a critical issue.

The article [7] gives an approximate modification of the method based on scaling the CRT by $2^{N}$, where $N = ⌈{log}_{2} P μ⌉ - 1$ and $μ = \sum_{i = 1}^{n} (p_{i} - 1)$. Then Formulas (3) and (4) can be written as:

X^{'} = {|\sum_{i = 1}^{n} k_{i}^{'} \cdot x_{i}|}_{2^{N}},

(5)

where the coefficients $k_{i}^{'}$ are rounded upwards:

k_{i}^{'} = ⌈\frac{{|P_{i}^{- 1}|}_{p_{i}}}{p_{i}} \cdot 2^{N}⌉, i = 1, 2, \dots, n .

Then, the recovered number can be obtained by the formula with rounding down:

X = ⌊\frac{X^{'} \cdot P}{2^{N}}⌋ .

(6)

For RNS $\{7, 8, 9\}$ the required accuracy is $N = 13$, so the calculation is performed modulo $2^{N} = 8192$. Then the coefficients $k_{i}^{'}$ are:

k_{1}^{'} = ⌈\frac{{|P_{1}^{- 1}|}_{p_{1}}}{p_{1}} \cdot 2^{N}⌉ = ⌈\frac{4}{7} \cdot 2^{13}⌉ = 4682, k_{2}^{'} = 7168, k_{3}^{'} = 4552 .

From Formula (5), for the number $X = (3, 2, 1) = 10$ we obtain:

X^{'} = {|4682 \cdot 3 + 7168 \cdot 2 + 4552 \cdot 1|}_{2^{13}} = {|32934|}_{2^{13}} = 166 .

Then X will be equal to:

X = ⌊\frac{166 \cdot 504}{8192}⌋ = 10 .
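Both the exact fractional form (3) and the scaled integer form (5), (6) can be sketched in Python. The function names are illustrative; `crt_fractional` uses exact rational arithmetic from the standard `fractions` module, which a hardware design would not do, and `crt_approx` takes N and the rounding rule as stated in the text for [7].

```python
from fractions import Fraction
from math import prod

def crt_fractional(residues, moduli):
    """Formula (3) with exact rationals: X = frac(sum k_i * x_i) * P."""
    P = prod(moduli)
    s = sum(Fraction(pow(P // p, -1, p), p) * x for x, p in zip(residues, moduli))
    frac = s - (s.numerator // s.denominator)        # fractional part |.|_1
    return int(frac * P)

def crt_approx(residues, moduli):
    """Formulas (5) and (6): coefficients scaled by 2^N and rounded up."""
    P = prod(moduli)
    mu = sum(p - 1 for p in moduli)
    N = (P * mu - 1).bit_length() - 1                # ceil(log2(P * mu)) - 1
    ks = [-(-(pow(P // p, -1, p) << N) // p) for p in moduli]   # ceiling division
    X1 = sum(k * x for k, x in zip(ks, residues)) & ((1 << N) - 1)  # mod 2^N
    return (X1 * P) >> N                             # floor(X' * P / 2^N)

print(crt_fractional([3, 2, 1], [7, 8, 9]))  # 10
print(crt_approx([3, 2, 1], [7, 8, 9]))      # 10
```

In the scaled form, the reduction modulo $2^{N}$ is a simple truncation of high bits, which is the practical benefit over the reduction modulo P.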

Another method based on the CRT is a method called New CRT [8], which is given by the formula:

X = {|x_{1} + k_{1} (x_{2} - x_{1}) p_{1} + k_{2} (x_{3} - x_{2}) p_{1} p_{2} + \dots + k_{n - 1} (x_{n} - x_{n - 1}) p_{1} \dots p_{n - 1}|}_{P},

(7)

where

k_{i} = {|\frac{1}{\prod_{j = 1}^{i} p_{j}}|}_{\prod_{j = i + 1}^{n} p_{j}}, 1 \leq i \leq n - 1 .

(8)

For the RNS $\{7, 8, 9\}$ considered above, the coefficients $k_{i}$ of the New CRT are:

k_{1} = {|\frac{1}{p_{1}}|}_{p_{2} p_{3}} = 31, k_{2} = {|\frac{1}{p_{1} p_{2}}|}_{p_{3}} = 5 .

Then, using Formula (7) for the number $X = (3, 2, 1) = 10$, we obtain:

X = {|3 + 31 \cdot (2 - 3) \cdot 7 + 5 \cdot (1 - 2) \cdot 7 \cdot 8|}_{504} = {|- 494|}_{504} = 10 .
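The New CRT of Formulas (7) and (8) can be sketched as follows. The function name `new_crt` is illustrative, and the coefficients $k_{i}$ are recomputed on every call for brevity, whereas in practice they would be precomputed constants.

```python
from math import prod

def new_crt(residues, moduli):
    """New CRT, Formula (7), with coefficients k_i from Formula (8)."""
    n = len(moduli)
    total = residues[0]
    for i in range(1, n):
        head = prod(moduli[:i])               # p_1 * ... * p_i
        tail = prod(moduli[i:])               # p_{i+1} * ... * p_n
        k_i = pow(head, -1, tail)             # |1 / (p_1...p_i)| mod (p_{i+1}...p_n)
        total += k_i * (residues[i] - residues[i - 1]) * head
    return total % prod(moduli)

print(new_crt([3, 2, 1], [7, 8, 9]))  # 10
```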

The article [16] proposes two modifications of the New CRT, which are denoted as New CRT I and New CRT II.

Let us consider each of these methods separately. In New CRT I, the constants $k_{i}$ are calculated by Formula (8) as before and are used to find the coefficients $a_{i}$, $i = 0, \dots, n - 1$:

\begin{matrix} a_{0} = {|1 - k_{1} p_{1}|}_{p_{1} p_{2} \dots p_{n - 1} p_{n}} \end{matrix}

\begin{matrix} a_{1} = {|k_{1} - k_{2} p_{2}|}_{p_{2} \dots p_{n - 1} p_{n}} \end{matrix}

\begin{matrix} \dots \end{matrix}

\begin{matrix} a_{n - 2} = {|k_{n - 2} - k_{n - 1} p_{n - 1}|}_{p_{n - 1} p_{n}} \end{matrix}

\begin{matrix} a_{n - 1} = {|k_{n - 1}|}_{p_{n}} \end{matrix}

Then, for these coefficients we build a matrix of characteristics:

A = (\begin{matrix} a_{0, 0} & 0 & \dots & 0 & 0 \\ a_{0, 1} & a_{1, 1} & \dots & 0 & 0 \\ \dots & \dots & \dots & \dots & \dots \\ a_{0, n - 2} & a_{1, n - 2} & \dots & a_{n - 2, n - 2} & 0 \\ a_{0, n - 1} & a_{1, n - 1} & \dots & a_{n - 2, n - 1} & a_{n - 1, n - 1} \end{matrix})

whose entries are obtained from the mixed-radix representation of $a_{i}$:

\begin{matrix} a_{0} = a_{0, 0} + a_{0, 1} p_{1} + \dots + a_{0, n - 1} p_{1} p_{2} \dots p_{n - 1} \end{matrix}

\begin{matrix} a_{1} = a_{1, 1} + a_{1, 2} p_{2} + \dots + a_{1, n - 1} p_{2} \dots p_{n - 1} \end{matrix}

\begin{matrix} \dots \end{matrix}

\begin{matrix} a_{n - 2} = a_{n - 2, n - 2} + a_{n - 2, n - 1} p_{n - 1} \end{matrix}

\begin{matrix} a_{n - 1} = a_{n - 1, n - 1} \end{matrix}

Then, from the matrix A and the value $X = (x_{1}, x_{2}, \dots, x_{n})$, a vector $B = A \times X$ can be obtained, i.e.:

B = (\begin{matrix} B_{0} \\ B_{1} \\ \dots \\ B_{n - 2} \\ B_{n - 1} \end{matrix}) = (\begin{matrix} a_{0, 0} & 0 & \dots & 0 & 0 \\ a_{0, 1} & a_{1, 1} & \dots & 0 & 0 \\ \dots & \dots & \dots & \dots & \dots \\ a_{0, n - 2} & a_{1, n - 2} & \dots & a_{n - 2, n - 2} & 0 \\ a_{0, n - 1} & a_{1, n - 1} & \dots & a_{n - 2, n - 1} & a_{n - 1, n - 1} \end{matrix}) \times (\begin{matrix} x_{1} \\ x_{2} \\ \dots \\ x_{n - 1} \\ x_{n} \end{matrix})

Then, the number X can be reconstructed by the formula:

X = {|B_{0} + B_{1} p_{1} + B_{2} p_{1} p_{2} + \dots + B_{n - 1} p_{1} p_{2} \dots p_{n - 1}|}_{p_{1} p_{2} \dots p_{n}}

Consider the same example, for which the coefficients $a_{i}$ are:

\begin{matrix} a_{0} = {|1 - 31 \cdot 7|}_{504} = 288 = 1 + 1 p_{1} + 5 p_{1} p_{2}, \end{matrix}

\begin{matrix} a_{1} = {|31 - 5 \cdot 8|}_{72} = 63 = 7 + 7 p_{2}, \end{matrix}

\begin{matrix} a_{2} = {|5|}_{9} = 5 . \end{matrix}

For the number $X = (3, 2, 1) = 10$ we obtain the vector B:

B = (\begin{matrix} 1 & 0 & 0 \\ 1 & 7 & 0 \\ 5 & 7 & 5 \end{matrix}) \times (\begin{matrix} 3 \\ 2 \\ 1 \end{matrix}) = (\begin{matrix} 3 \\ 17 \\ 34 \end{matrix}) .

From which we obtain:

X = {|3 + 17 \cdot 7 + 34 \cdot 7 \cdot 8|}_{504} = 10 .

For an ordered RNS whose moduli satisfy $p_{i} > p_{1} + \dots + p_{i - 1}$ or, more strictly, $p_{i} > 2 p_{i - 1}$, the reverse conversion can be obtained without a reduction modulo P.

In this case, the value $Y < 2 P$ is given by

Y = B_{0} + B_{1} p_{1} + B_{2} p_{1} p_{2} + \dots + {|B_{n - 1}|}_{p_{n}} p_{1} p_{2} \dots p_{n - 1}

and the required value of X is

X = \{\begin{matrix} Y & if Y < P, \\ Y - P & if Y \geq P . \end{matrix}

(9)

This restriction on the moduli leads to an unbalanced RNS. For the 8-bit range, we can take the moduli set $\{3, 7, 13\}$, where each modulus is one bit wider than the previous one. In this case:

\begin{matrix} k_{1} = 61, k_{2} = 5 \end{matrix}

\begin{matrix} a_{0} = 91, a_{1} = 26, a_{2} = 5 . \end{matrix}

Then for the number $X = (1, 3, 10) = 10$ we obtain:

B = (\begin{matrix} 1 & 0 & 0 \\ 2 & 5 & 0 \\ 4 & 3 & 5 \end{matrix}) \times (\begin{matrix} 1 \\ 3 \\ 10 \end{matrix}) = (\begin{matrix} 1 \\ 17 \\ 63 \end{matrix})

The desired value of X can be obtained from (9) and the vector $B = (B_{0}, B_{1}, {|B_{2}|}_{p_{3}})$: $Y = 1 + 17 \cdot 3 + 11 \cdot 3 \cdot 7 = 283$. Since $Y \geq P = 273$, we have $X = Y - P = 10$.

This method requires matrix multiplication with small-digit numbers, finding the remainder modulo RNS, and multiplication with accumulation for n summands.
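The two New CRT I examples can be checked with the following Python sketch. All function names are illustrative. `new_crt1_matrix` builds the characteristic matrix A from the coefficients $a_{i}$, `new_crt1` performs the general reconstruction with a reduction modulo P, and `new_crt1_ordered` applies Formula (9) and is only meant for ordered moduli sets satisfying the restriction above.

```python
from math import prod

def new_crt1_matrix(moduli):
    """Matrix A: row j holds a_{0,j}, ..., a_{j,j} (zeros elsewhere)."""
    n = len(moduli)
    # k[1..n-1] follow Formula (8); k[0] = 1 and k[n] = 0 are padding values
    k = [1] + [pow(prod(moduli[:i]), -1, prod(moduli[i:])) for i in range(1, n)] + [0]
    A = [[0] * n for _ in range(n)]
    for i in range(n):
        a_i = (k[i] - k[i + 1] * moduli[i]) % prod(moduli[i:])
        for j in range(i, n):                 # mixed-radix digits of a_i
            A[j][i] = a_i % moduli[j]
            a_i //= moduli[j]
    return A

def vector_b(residues, moduli):
    A = new_crt1_matrix(moduli)
    return [sum(a * x for a, x in zip(row, residues)) for row in A]

def new_crt1(residues, moduli):
    """General case: weighted sum of B_j reduced modulo P."""
    X, weight = 0, 1
    for B_j, p in zip(vector_b(residues, moduli), moduli):
        X += B_j * weight
        weight *= p
    return X % weight                         # weight == P here

def new_crt1_ordered(residues, moduli):
    """Formula (9): only B_{n-1} is reduced, then at most one subtraction of P."""
    B = vector_b(residues, moduli)
    B[-1] %= moduli[-1]
    Y, weight = 0, 1
    for B_j, p in zip(B, moduli):
        Y += B_j * weight
        weight *= p
    return Y if Y < weight else Y - weight

print(new_crt1_matrix([7, 8, 9]))                # [[1, 0, 0], [1, 7, 0], [5, 7, 5]]
print(new_crt1([3, 2, 1], [7, 8, 9]))            # 10
print(new_crt1_ordered([1, 3, 10], [3, 7, 13]))  # 10
```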

Another method proposed in [16] is called New CRT II.

For a two-moduli RNS with moduli $\{p_{1}, p_{2}\}$, Algorithm 1, denoted by findno $(x_{1}, x_{2}, p_{1}, p_{2}, X)$, is used.

Algorithm 1 Algorithm findno for number recovery in a two-moduli RNS

Require: $(x_{1}, x_{2})$, $\{p_{1}, p_{2}\}$

Ensure: $X < p_{1} \cdot p_{2}$

1: $k_{0} = {|p_{2}^{- 1}|}_{p_{1}}$
2: return $X = x_{2} + {|k_{0} (x_{1} - x_{2})|}_{p_{1}} \cdot p_{2}$

For an ordered RNS with moduli $p_{1} < p_{2} < \dots < p_{n}$, the recursive Algorithm 2, translate $((x_{1}, x_{2}, \dots, x_{n}), X)$, is used.

Consider an example of applying Algorithm 2 to RNS $\{7, 8, 9\}$ and the number $X = (3, 2, 1) = 10$.

Since $n = 3$, according to step 5 of Algorithm 2 we obtain two new calls of this algorithm: translate $((x_{1}), N_{1})$ and translate $((x_{2}, x_{3}), N_{2})$.

Calling translate $((x_{1}), N_{1})$ leads to step 11 of Algorithm 2, for which we obtain $N_{1} = x_{1} mod p_{1}$, $P_{1} = p_{1}$, i.e., $N_{1} = 3, P_{1} = 7$.

Calling translate $((x_{2}, x_{3}), N_{2})$ leads to step 9 of Algorithm 2, for which from Algorithm 1 (findno) we obtain $N_{2} = x_{3} + {|{|p_{3}^{- 1}|}_{p_{2}} \cdot (x_{2} - x_{3})|}_{p_{2}} \cdot p_{3}$, $P_{2} = p_{2} \cdot p_{3}$. That is, $N_{2} = 1 + {|1 \cdot (2 - 1)|}_{8} \cdot 9 = 10, P_{2} = 8 \cdot 9 = 72$.

Algorithm 2 Algorithm translate for number reconstruction in RNS

Require: $(x_{1}, x_{2}, \dots, x_{n})$, $\{p_{1}, p_{2}, \dots, p_{n}\}$

1: if $n = 2 t > 2$ (n is even and greater than 2) then
2:    translate $((x_{1}, \dots, x_{t}), N_{1})$ , $P_{1} = \prod_{i = 1}^{t} p_{i}$
3:    translate $((x_{t + 1}, \dots, x_{n}), N_{2})$ , $P_{2} = \prod_{i = t + 1}^{n} p_{i}$
4:    findno $(N_{1}, N_{2}, P_{1}, P_{2}, X)$
5: else if $n = 2 t + 1 > 2$ (n is odd and greater than 2) then
6:    translate $((x_{1}, \dots, x_{t}), N_{1})$ , $P_{1} = \prod_{i = 1}^{t} p_{i}$
7:    translate $((x_{t + 1}, \dots, x_{n}), N_{2})$ , $P_{2} = \prod_{i = t + 1}^{n} p_{i}$
8:    findno $(N_{1}, N_{2}, P_{1}, P_{2}, X)$
9: else if $n = 2$ then
10:    findno $(x_{1}, x_{2}, p_{1}, p_{2}, X)$
11: else if $n = 1$ then
12:    $X = x_{1} mod p_{1}$
13: end if

Then, according to step 8 of Algorithm 2, findno $(N_{1}, N_{2}, P_{1}, P_{2}, X)$ is called, that is, the following value is calculated:

X = N_{2} + {|{|P_{2}^{- 1}|}_{P_{1}} \cdot (N_{1} - N_{2})|}_{P_{1}} \cdot P_{2} = 10 + {|{|72^{- 1}|}_{7} \cdot (3 - 10)|}_{7} \cdot 72 = 10 .

Thus, the New CRT II method is based on the recursive use of MRC with two moduli. The multiplicative inverses required in the last steps of the algorithm are taken modulo large numbers and are difficult to calculate, but in a hardware implementation these values can be precomputed and stored in memory.
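Algorithms 1 and 2 can be sketched in Python as below. The function names follow the algorithm names in the text, but the signatures are an assumption: results are returned rather than passed through an output parameter, and the even and odd branches are merged because both split the residues at $t = ⌊n / 2⌋$.

```python
from math import prod

def findno(x1, x2, p1, p2):
    """Algorithm 1: two-moduli recovery, result is below p1 * p2."""
    k0 = pow(p2, -1, p1)                          # |p_2^{-1}|_{p_1}
    return x2 + ((k0 * (x1 - x2)) % p1) * p2

def translate(residues, moduli):
    """Algorithm 2: recursive split into two halves joined by findno."""
    n = len(moduli)
    if n == 1:
        return residues[0] % moduli[0]
    if n == 2:
        return findno(residues[0], residues[1], moduli[0], moduli[1])
    t = n // 2                                    # n = 2t or n = 2t + 1
    N1 = translate(residues[:t], moduli[:t])
    N2 = translate(residues[t:], moduli[t:])
    return findno(N1, N2, prod(moduli[:t]), prod(moduli[t:]))

print(translate([3, 2, 1], [7, 8, 9]))  # 10
```

The two recursive calls are independent, so in hardware they can run in parallel, giving a tree of depth about $log_{2} n$.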

In [11], a set of odd moduli $\{p_{1}, p_{2}, \dots, p_{n - 1}\}$ with range $P_{h} = \prod_{i = 1}^{n - 1} p_{i}$ is considered, for which Formula (1), with $P_{i} = \frac{P_{h}}{p_{i}}$, gives

X_{h} = {|\sum_{i = 1}^{n - 1} P_{i} {|\frac{x_{i}}{P_{i}}|}_{p_{i}}|}_{P_{h}} .

Adding the remainder $x_{n}$ modulo $p_{n} = 2^{k}$ leads to the expression:

X = {|2^{k} {|\frac{X_{h}}{2^{k}}|}_{P_{h}} + P_{h} {|\frac{x_{n}}{P_{h}}|}_{2^{k}}|}_{2^{k} P_{h}} = {|2^{k} \sum_{i = 1}^{n - 1} P_{i} {|\frac{x_{i}}{2^{k} P_{i}}|}_{p_{i}} + P_{h} {|\frac{x_{n}}{P_{h}}|}_{2^{k}}|}_{2^{k} P_{h}}

(10)

For $d \geq ⌈{log}_{2} (n - 1)⌉$, denote $x_{n}^{'} = {|\frac{x_{n}}{P_{h}}|}_{2^{k}}$, which splits into blocks $x_{q}$ and $x_{p}$ of $k - d$ and d bits, respectively, that is, $x_{n}^{'} = 2^{k - d} x_{p} + x_{q}$.

Next, the following variables are introduced:

x_{i}^{'} = {|\frac{x_{i}}{2^{k - d} P_{i}}|}_{p_{i}} and x_{p}^{'} = {|x_{p} - \sum_{i = 1}^{n - 1} \frac{x_{i}^{'}}{p_{i}}|}_{2^{d}}

(the division by $p_{i}$ in $x_{p}^{'}$ is a modular division, i.e., multiplication by ${|p_{i}^{- 1}|}_{2^{d}}$), and $x_{n}^{″} = 2^{k - d} x_{p}^{'} + x_{q} = (x_{p}^{'} | | x_{q})$, where $| |$ denotes concatenation of binary values.

Then the Formula (10) can be rewritten as:

X = {|2^{k - d} \sum_{i = 1}^{n - 1} P_{i} x_{i}^{'} + P_{h} x_{n}^{″}|}_{2^{k} P_{h}},

(11)

In this case, as in (9), the sum under the modulus sign is less than $2 \cdot 2^{k} P_{h}$, i.e., it does not exceed twice the range, so X can be obtained with at most one subtraction instead of a reduction by a large modulus.

Consider a similar example with moduli $\{7, 9, 8\}$. Then

n = 3, k = {log}_{2} 8 = 3 .

P_{h} = p_{1} \cdot p_{2} = 63, P_{1} = \frac{P_{h}}{p_{1}} = p_{2} = 9, P_{2} = p_{1} = 7 .

d = ⌈{log}_{2} (3 - 1)⌉ = 1 .

For the number $X = (3, 1, 2) = 10$, calculate $x_{i}^{'}$, $x_{q}$, $x_{p}$, $x_{p}^{'}$, and $x_{n}^{″}$:

\begin{matrix} x_{1}^{'} = {|\frac{x_{1}}{2^{3 - 1} P_{1}}|}_{p_{1}} = {|3 \cdot {|{(2^{2} \cdot 9)}^{- 1}|}_{7}|}_{7} = 3, x_{2}^{'} = 1, x_{p} = 1, x_{q} = 2 . \end{matrix}

\begin{matrix} x_{p}^{'} = {|x_{p} - \sum_{i = 1}^{n - 1} \frac{x_{i}^{'}}{p_{i}}|}_{2^{d}} = 1, x_{n}^{″} = 2^{k - d} x_{p}^{'} + x_{q} = 6 . \end{matrix}

Then, from (11) we obtain $X = {|514|}_{504} = 10$.
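The worked example for the method of [11] can be reproduced with the Python sketch below. The function name and signature are illustrative. The sketch assumes that the division by $p_{i}$ inside $x_{p}^{'}$ is a modular division modulo $2^{d}$, takes d as the smallest admissible value (but at least 1), and requires $d \leq k$.

```python
from math import prod

def reverse_odd_plus_pow2(odd_residues, odd_moduli, x_n, k):
    """Formula (11) for odd moduli p_1..p_{n-1} plus the modulus 2^k."""
    n = len(odd_moduli) + 1
    d = max(1, (n - 2).bit_length())              # ceil(log2(n - 1)), at least 1
    assert d <= k
    P_h = prod(odd_moduli)
    low = k - d
    xn1 = (x_n * pow(P_h, -1, 1 << k)) % (1 << k)     # x'_n = |x_n / P_h| mod 2^k
    x_p, x_q = xn1 >> low, xn1 & ((1 << low) - 1)     # high d bits, low k-d bits
    xs = [(x * pow((1 << low) * (P_h // p), -1, p)) % p   # x'_i
          for x, p in zip(odd_residues, odd_moduli)]
    corr = sum(x * pow(p, -1, 1 << d) for x, p in zip(xs, odd_moduli))
    x_p1 = (x_p - corr) % (1 << d)                    # x'_p
    xn2 = (x_p1 << low) | x_q                         # x''_n = (x'_p || x_q)
    S = (1 << low) * sum((P_h // p) * x for x, p in zip(xs, odd_moduli)) + P_h * xn2
    P = P_h << k
    return S if S < P else S - P                      # S < 2P: one subtraction at most

print(reverse_odd_plus_pow2([3, 1], [7, 9], 2, 3))    # 10
```

For the example, the intermediate sum S equals 514, matching the value under the modulus sign in the text.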

Thus, this method finds the reconstructed number without a general reduction by a large modulus, because the sum in (11) is less than $2 \cdot 2^{k} P_{h}$.

3. Use of Special Type Moduli

To improve the performance of RNS, many researchers consider moduli of a special kind: $\{2^{n} - 1, 2^{n}, 2^{n} + 1\}$ [8], $\{2^{2 n}, 2^{2 n - 1} - 1, 2^{2 n} - 1, 2^{2 n + 1} - 1\}$ [9], $\{2^{k} - 3, 2^{k} - 2, 2^{k} - 1\}$ and $\{2^{k} + 1, 2^{k} + 2, 2^{k} + 3\}$ [10], $\{2^{2 n - 5} - 1, 2^{2 n - 3} - 1, 2^{2 n - 2} + 1, 2^{2 n - 1} - 1, 2^{2 n - 1} + 1, 2^{2 n}, 2^{2 n} + 1\}$ [12], and $\{2^{2 n}, 2^{n} + 1, 2^{n} - 1, 2^{n} + 3, 2^{n} - 3\}$ [13].

Among the reverse conversion methods, the most interesting is the optimization for the moduli set $\{2^{n} - 1, 2^{n}, 2^{n} + 1\}$ proposed in the article [8].

The most efficient implementations are based on the modulus $2^{n} - 1$, for which the following properties hold:

The sign change (additive inversion) is achieved by bitwise negation, i.e., ${|- z|}_{2^{n} - 1} = {|\bar{z}|}_{2^{n} - 1} = {|({\bar{z}}_{n - 1} \dots {\bar{z}}_{0})|}_{2^{n} - 1}$ , e.g., ${|- 5|}_{7} = {|- 101_{2}|}_{7} = {|{\bar{101}}_{2}|}_{7} = {|(010_{2})|}_{7} = 2 .$
The multiplication by $2^{d}$ can be performed by a cyclic shift to the left by d digits, i.e., ${|2^{d} z|}_{2^{n} - 1} = {|z_{n - d - 1} \dots z_{0} z_{n - 1} \dots z_{n - d}|}_{2^{n} - 1}$ , e.g., ${|2 \cdot 5|}_{7} = {|2 \cdot 101_{2}|}_{7} = {|011_{2}|}_{7} = 3 .$
Multiplication by the multiplicative inverse ${|2^{- d}|}_{2^{n} - 1}$ can be performed by a cyclic shift to the right by d positions or to the left by $n - d$ positions, since ${|2^{- d} z|}_{2^{n} - 1} = {|2^{n - d} z|}_{2^{n} - 1}$ , i.e., ${|2^{- d} z|}_{2^{n} - 1} = {|z_{d - 1} \dots z_{0} z_{n - 1} \dots z_{d}|}_{2^{n} - 1}$ , e.g., ${|2^{- 1} \cdot 5|}_{7} = {|2^{- 1} \cdot 101_{2}|}_{7} = {|110_{2}|}_{7} = 6 .$
For any non-negative integer i, ${|2^{i \cdot n}|}_{2^{n} - 1} = 1$ , so an $(s n)$-bit number can be reduced by summing its n-bit blocks:

${|\sum_{i = 0}^{s - 1} 2^{n i} (y_{n (i + 1) - 1} \dots y_{n i})|}_{2^{n} - 1} = {|\sum_{i = 0}^{s - 1} (y_{n (i + 1) - 1} \dots y_{n i})|}_{2^{n} - 1}$
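The four properties can be illustrated with a few lines of Python. The helper names are illustrative. Note that the bit-level results may come out as the all-ones word $2^{n} - 1$, which represents zero modulo $2^{n} - 1$ (the usual double representation of zero in this arithmetic).

```python
def neg_mod(z, n):
    """|-z| modulo 2^n - 1 by bitwise negation of an n-bit word."""
    return ~z & ((1 << n) - 1)

def rotl(z, d, n):
    """|2^d * z| modulo 2^n - 1 by a cyclic left shift of d positions."""
    d %= n
    return ((z << d) | (z >> (n - d))) & ((1 << n) - 1)

def rotr(z, d, n):
    """|2^(-d) * z| modulo 2^n - 1 by a cyclic right shift of d positions."""
    return rotl(z, (n - d) % n, n)

def fold(y, n):
    """Reduce a wide number by repeatedly adding its n-bit blocks."""
    mask = (1 << n) - 1
    while y > mask:
        y = (y & mask) + (y >> n)
    return y

print(neg_mod(5, 3))   # 2
print(rotl(5, 1, 3))   # 3
print(rotr(5, 1, 3))   # 6
print(fold(45, 3))     # 3
```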

The article [8] is based on the idea that an expression for the reverse conversion based on the CRT (1) can be represented in a form, where the constant and the variables can be separated:

X = {|(\sum_{i = 1}^{n} b_{i} \cdot x_{i}) + C_{k}|}_{P}

First, two moduli $\{p_{2}, p_{3}\} = \{2^{n} - 1, 2^{n} + 1\}$ are taken from the set $\{2^{n}, 2^{n} - 1, 2^{n} + 1\}$, for which the number $X_{23} = (x_{2}, x_{3})$ is found from the CRT:

\begin{matrix} X_{23} & = {|p_{3} \cdot x_{2} \cdot {|p_{3}^{- 1}|}_{p_{2}} + p_{2} \cdot x_{3} \cdot {|p_{2}^{- 1}|}_{p_{3}}|}_{p_{2} p_{3}} = \\ & = {|(2^{n} + 1) \cdot 2^{n - 1} \cdot x_{2} + (2^{n} - 1) \cdot 2^{n - 1} \cdot x_{3}|}_{2^{2 n} - 1} = \\ & = {|2^{n - 1} \cdot ((2^{n} + 1) \cdot x_{2} + (2^{n} - 1) x_{3})|}_{2^{2 n} - 1} \end{matrix}

Based on the MRC for the set of moduli $\{p_{1}, p_{2} p_{3}\}$, the number $X = (x_{1}, X_{23})$ is found:

X = x_{1} + p_{1} {|(X_{23} - x_{1}) \frac{1}{p_{1}}|}_{p_{2} p_{3}} = x_{1} + 2^{n} X_{h},

(12)

where

\begin{matrix} X_{h} & = {|(X_{23} - x_{1}) 2^{- n}|}_{2^{2 n} - 1} = \\ & = {|(2^{n - 1} ((2^{n} + 1) x_{2} + (2^{n} - 1) x_{3}) - x_{1}) 2^{- n}|}_{2^{2 n} - 1} = \\ & = {|- x_{1} 2^{- n} + x_{2} (2^{n} + 1) 2^{- 1} + x_{3} (2^{n} - 1) 2^{- 1}|}_{2^{2 n} - 1} \end{matrix}

(13)
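Before turning to the bit-level optimization, the arithmetic content of (12) and (13) can be checked with a small Python sketch. The function name is illustrative, residues are ordered as $(x_{1}, x_{2}, x_{3})$ for the moduli $(2^{n}, 2^{n} - 1, 2^{n} + 1)$, and the inverses are obtained with `pow` instead of cyclic shifts.

```python
def reverse_three_moduli(x1, x2, x3, n):
    """X = x1 + 2^n * X_h with X_h from the last line of Formula (13)."""
    M = (1 << (2 * n)) - 1                 # p2 * p3 = 2^(2n) - 1
    inv2 = pow(2, -1, M)                   # |2^{-1}| modulo 2^(2n) - 1
    inv2n = pow(2, -n, M)                  # |2^{-n}| modulo 2^(2n) - 1
    X_h = (-x1 * inv2n
           + x2 * ((1 << n) + 1) * inv2
           + x3 * ((1 << n) - 1) * inv2) % M
    return x1 + (X_h << n)

print(reverse_three_moduli(2, 3, 1, 3))    # 10, i.e. X = (2, 3, 1) in {8, 7, 9}
```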

The optimization of the expression (13) is possible by bitwise operations. The first term can be represented as follows:

{|- x_{1} \cdot 2^{- n}|}_{2^{2 n} - 1} = {|2^{- 1} (({\bar{x}}_{1, n - 2} \dots {\bar{x}}_{1, 0}) | | \underset{n}{\underset{⏟}{(0 \dots 0)}} | | {\bar{x}}_{1, n - 1}) + (2^{- n} - 1)|}_{2^{2 n} - 1} .

The second term appears as:

{|x_{2} (2^{n} + 1) 2^{- 1}|}_{2^{2 n} - 1} = {|2^{- 1} (x_{2} | | x_{2})|}_{2^{2 n} - 1} .

The third term can be represented as:

\begin{matrix} {|x_{3} (2^{n} - 1) 2^{- 1}|}_{2^{2 n} - 1} = \\ = {|2^{- 1} (\begin{matrix} ((x_{3, n - 1} \lor x_{3, n}) \dots (x_{3, 0} \lor x_{3, n}) | | {\bar{x}}_{3, n - 1} \dots {\bar{x}}_{3, 0} + 2^{n}) + \\ + (1 - 2^{n + 1}) \end{matrix})|}_{2^{2 n} - 1} \end{matrix}

(14)

or

{|x_{3} (2^{n} - 1) 2^{- 1}|}_{2^{2 n} - 1} = {|2^{- 1} (\begin{matrix} (x_{3, n - 1} \dots x_{3, 0} | | {\bar{x}}_{3, n - 1} \dots {\bar{x}}_{3, 1} | | (\bar{x_{3, 0} \lor x_{3, n}})) + \\ + (\underset{n - 1}{\underset{⏟}{0 \dots 0}} | | {\bar{x}}_{3, n} | | \underset{n - 1}{\underset{⏟}{0 \dots 0}} | | x_{3, n} | | 0) + \\ + (1 - 2^{n + 1}) \end{matrix})|}_{2^{2 n} - 1}

(15)

Depending on the chosen representation of the third term (14) or (15), the article [8] forms two variants of the reverse conversion constructed using the formula:

X_{h} = {|w_{1} + w_{2} + w_{3} + C_{k}|}_{2^{2 n} - 1}

(16)

In the first case, based on the Formula (14) we have:

\begin{matrix} w_{1} = ({\bar{x}}_{1, n - 1} \dots {\bar{x}}_{1, 0} | | \underset{n}{\underset{⏟}{(0 \dots 0)}}) \\ w_{2} = (x_{2, 0} | | x_{2, n - 1} \dots x_{2, 0} | | x_{2, n - 1} \dots x_{2, 1}) \\ w_{3} = ({\bar{x}}_{3, 0} | | (x_{3, n - 1} \lor x_{3, n}) \dots (x_{3, 0} \lor x_{3, n}) | | {\bar{x}}_{3, n - 1} \dots {\bar{x}}_{3, 1}) \\ C_{k} = {|(2^{- n} - 1) + 2^{- 1} (1 - 2^{n + 1}) + 2^{n - 1}|}_{2^{2 n} - 1} = 2^{2 n - 1} + 2^{n - 1} - 1 = (1 \underset{n}{\underset{⏟}{0 \dots 0}} \underset{n - 1}{\underset{⏟}{1 \dots 1}}) \end{matrix}

In the article [8], there is a typo stating that $C_{k}$ has $n + 2$ zeros. However, $2^{2 n - 1}$ is a one followed by $2 n - 1$ zeros, and the number $2^{n - 1} - 1$ consists of $n - 1$ ones. Adding these numbers sets the lowest $n - 1$ bits to one, while the remaining $2 n - 1 - (n - 1) = n$ zeros stay zero. For example, for $n = 2$: $2^{3} + 2 - 1 = 9 = 1001_{2}$.

For the second case, Formula (15) gives:

\begin{matrix} w_{1} = ({\bar{x}}_{1, n - 1} \dots {\bar{x}}_{1, 0} | | {\bar{x}}_{3, n} | | \underset{n - 2}{\underset{⏟}{(0 \dots 0)}} | | x_{3, n}) \\ w_{2} = (x_{2, 0} | | x_{2, n - 1} \dots x_{2, 0} | | x_{2, n - 1} \dots x_{2, 1}) \\ w_{3} = (\bar{(x_{3, 0} \lor x_{3, n})} | | x_{3, n - 1} \dots x_{3, 0} | | {\bar{x}}_{3, n - 1} \dots {\bar{x}}_{3, 1}) \\ C_{k} = {|(2^{- n} - 1) + 2^{- 1} (1 - 2^{n + 1})|}_{2^{2 n} - 1} = 2^{2 n - 1} - 1 = (0 \underset{2 n - 1}{\underset{⏟}{1 \dots 1}}) \end{matrix}

The addition of $C_{k}$ in the calculation of $X_{h}$ by Formula (16) is equivalent to the addition of $2^{n} C_{k}$ to $X_{23}$. Moreover, the latter can also be obtained by simultaneously adding $c_{2} = {|2^{n} C_{k}|}_{2^{n} - 1} = {|C_{k}|}_{2^{n} - 1}$ to $x_{2}$ modulo $2^{n} - 1$ and $c_{3} = {|2^{n} C_{k}|}_{2^{n} + 1} = {|- C_{k}|}_{2^{n} + 1}$ to $x_{3}$ modulo $2^{n} + 1$ (leaving the residue $x_{1}$ of the even modulus unchanged, since ${|2^{n} C_{k}|}_{2^{n}} = 0$).

In the first case we obtain:

\begin{matrix} c_{2} = {|2^{n} (2^{2 n - 1} + 2^{n - 1} - 1)|}_{2^{n} - 1} = 0, \end{matrix}

\begin{matrix} c_{3} = {|2^{n} (2^{2 n - 1} + 2^{n - 1} - 1)|}_{2^{n} + 1} = 1 . \end{matrix}

In the second case:

\begin{matrix} c_{2} = {|2^{n} (2^{2 n - 1} - 1)|}_{2^{n} - 1} = 2^{n - 1} - 1 = 0 \underset{n - 1}{\underset{⏟}{1 \dots 1}}, \end{matrix}

\begin{matrix} c_{3} = {|2^{n} (2^{2 n - 1} - 1)|}_{2^{n} + 1} = 2^{n - 1} + 1 = 01 \underset{n - 2}{\underset{⏟}{0 \dots 0}} 1 . \end{matrix}

Then the Formula (16) will be reduced to:

X_{h} = {|w_{1} + w_{2} + w_{3}|}_{2^{2 n} - 1}

with further use of the Formula (12).

Consider a similar example with RNS $\{8, 7, 9\}$, $n = 3$, for the number $X = 10 = (2, 3, 1)$.

In the first case, the coefficients $c_{2} = 0$ and $c_{3} = 1$ do not depend on the value of n. Then the offset value is $X^{'} = (2, 3, {|1 + 1|}_{9})$. From this it follows that

\begin{matrix} x_{1}^{'} = 2 = 010_{2}, & w_{1} = 101 | | 000_{2} = 40, \end{matrix}

\begin{matrix} x_{2}^{'} = 3 = 011_{2}, & w_{2} = 1 | | 011 | | 01_{2} = 45, \end{matrix}

\begin{matrix} x_{3}^{'} = 2 = 0010_{2}, & w_{3} = 1 | | 010 | | 10_{2} = 42 . \end{matrix}

Then

X_{h} = {|40 + 45 + 42|}_{63} = 1,

and $X = 2 + 2^{3} \cdot 1 = 10$.

For the second case with $n = 3$, $c_{2} = 2^{2} - 1 = 3$ and $c_{3} = 2^{2} + 1 = 5$. Then the offset value is $X^{'} = (2, {|3 + 3|}_{7}, {|1 + 5|}_{9})$. From here

\begin{matrix} x_{1}^{'} = 2 = 010_{2}, & w_{1} = 101 | | 1 | | 0 | | 0_{2} = 44, \end{matrix}

\begin{matrix} x_{2}^{'} = 6 = 110_{2}, & w_{2} = 0 | | 110 | | 11_{2} = 27, \end{matrix}

\begin{matrix} x_{3}^{'} = 6 = 0110_{2}, & w_{3} = 1 | | 110 | | 00_{2} = 56 . \end{matrix}

Then

X_{h} = {|44 + 27 + 56|}_{63} = 1,

and

X = 2 + 2^{3} \cdot 1 = 10 .

In the first case, only the addition of $c_{3}$ is required, but n logical OR gates are needed to calculate $w_{3}$; in the second case, a single OR gate suffices, but both $c_{2}$ and $c_{3}$ must be added.
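The first variant (offset $c_{2} = 0$, $c_{3} = 1$ and the operands $w_{1}, w_{2}, w_{3}$ derived from Formula (14)) can be sketched at the bit level in Python. The function and helper names are illustrative, and the final reduction modulo $2^{2 n} - 1$ is written with `%` for brevity, whereas a hardware design would use an end-around-carry adder.

```python
def rotr1(v, bits):
    """Cyclic right shift by one position within a word of `bits` bits."""
    return (v >> 1) | ((v & 1) << (bits - 1))

def reverse_variant1(x1, x2, x3, n):
    """Reverse conversion for {2^n, 2^n - 1, 2^n + 1}, first variant."""
    mask = (1 << n) - 1
    M = (1 << (2 * n)) - 1
    x3 = (x3 + 1) % ((1 << n) + 1)            # offset c3 = 1; c2 = 0 leaves x2 as is
    top, low = x3 >> n, x3 & mask             # bit x_{3,n} and the n low bits
    w1 = (~x1 & mask) << n                    # (not x1 || 0...0)
    w2 = rotr1((x2 << n) | x2, 2 * n)         # 2^{-1} * (x2 || x2)
    upper = low | (mask if top else 0)        # n OR gates with x_{3,n}
    lower = ~low & mask                       # inverted low bits of x3
    w3 = rotr1((upper << n) | lower, 2 * n)   # 2^{-1} * (upper || lower)
    X_h = (w1 + w2 + w3) % M
    return x1 + (X_h << n)                    # Formula (12)

print(reverse_variant1(2, 3, 1, 3))           # 10
```

For the example, the operands evaluate to $w_{1} = 40$, $w_{2} = 45$, and $w_{3} = 42$, as in the text.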

4. Modified Method Based on MRC

The patent [17] presents a new Algorithm 3 for conversion from RNS to the positional number system and base extension based on the mixed-radix conversion, which we will hereafter call “the modified mixed-radix conversion method”. This algorithm works with moduli of general form.

Algorithm 3 Reconstructing numbers and expanding bases

Require: $(x_{1}, \dots, x_{n})$, $p_{n + 1}$

Ensure: $X$, ${|X|}_{p_{n + 1}}$

1: $P_{i} = \frac{P}{p_{i}};$ ${|P_{i}^{- 1}|}_{p_{i}};$ $B_{i} = P_{i} \cdot {|P_{i}^{- 1}|}_{p_{i}}$
2: ${\hat{w}}_{1} = 1;$ ${\hat{w}}_{j} = \prod_{i = 1}^{j - 1} p_{i},$ $j = \bar{2, n}$
3: $B_{i} \overset{MRC}{\to} [{\hat{b}}_{i, j}],$ $i, j = \bar{1, n},$ $B_{i} = \sum_{j = 1}^{n} {\hat{b}}_{i, j} \cdot {\hat{w}}_{j}$
4: $U_{i} = \sum_{j = 1}^{i} x_{j} \cdot {\hat{b}}_{j, i}$ , $i = \bar{1, n}$
5: $σ_{0} = 0$ , ${\hat{x}}_{i} = {|σ_{i - 1} + U_{i}|}_{p_{i}},$ $σ_{i} = ⌊\frac{σ_{i - 1} + U_{i}}{p_{i}}⌋$
6: $X = \sum_{i = 1}^{n} {\hat{x}}_{i} \cdot {\hat{w}}_{i},$
7: ${|X|}_{p_{n + 1}} = {|\sum_{i = 1}^{n} {\hat{x}}_{i} \cdot {\hat{w}}_{i}|}_{p_{n + 1}}$
8: return $X,$ ${|X|}_{p_{n + 1}}$

In some cases it is necessary to add one or more additional moduli to detect range overflows or to detect and correct errors. If the modulus $p_{n + 1}$ is added, then the new range allows the representation of any number $X < P \cdot p_{n + 1}$.

Consider the Algorithm 3 in detail. Set of a residue number system with moduli

\{p_{1}, p_{2}, \dots, p_{n}\}

for which the orthogonal bases of RNS

B_{i} = P_{i} \cdot {|P_{i}^{- 1}|}_{p_{i}}

,

i = \bar{1, n}

. Compute the bases of the MRC

{\hat{w}}_{j} = \prod_{i = 1}^{j - 1} p_{i}

,

j = \bar{2, n}

, with

w_{1} = 1

. The orthogonal RNS bases are represented in MRC as a vector of values

{\hat{b}}_{i, j}

, for which

B_{i} = \sum_{j = 1}^{n} {\hat{b}}_{i, j} \cdot {\hat{w}}_{j}

. The values of

{\hat{b}}_{i, j}

form a triangular matrix.
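The triangular shape follows because each B_i is divisible by p_1, ..., p_{i-1}, so its lowest mixed-radix digits are zero. The following minimal Python sketch illustrates this for the moduli {7, 8, 9}. The helper names `orthogonal_bases` and `mrc_digits` are illustrative assumptions and are not taken from the patent; Python 3.8 or later is assumed for `math.prod` and the modular inverse via `pow`.

```python
from math import prod


def orthogonal_bases(moduli):
    """Return B_i = P_i * |P_i^{-1}|_{p_i} for every modulus p_i."""
    P = prod(moduli)
    bases = []
    for p in moduli:
        P_i = P // p
        bases.append(P_i * pow(P_i, -1, p))  # pow(a, -1, m) is the modular inverse
    return bases


def mrc_digits(value, moduli):
    """Mixed-radix digits of value for the weights 1, p_1, p_1*p_2, ..."""
    digits = []
    for p in moduli:
        value, digit = divmod(value, p)
        digits.append(digit)
    return digits


moduli = [7, 8, 9]
b_hat = [mrc_digits(B, moduli) for B in orthogonal_bases(moduli)]
for row in b_hat:
    print(row)
# [1, 1, 5]
# [0, 7, 7]
# [0, 0, 5]
```

Row i of the printed matrix holds the digits of B_i, and every entry to the left of the diagonal is zero.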

Given

X = (x_{1}, x_{2}, \dots, x_{n})

, we need to recover the number X in the positional number system and find the remainder of the division of X by an additional modulus

p_{n + 1}

, which we denote by

{|X|}_{p_{n + 1}}

.

To do this,

(x_{1}, x_{2}, \dots, x_{n})

is multiplied by the triangular matrix

{\hat{b}}_{i, j}

, so

U_{i} = \sum_{j = 1}^{i} x_{j} \cdot {\hat{b}}_{j, i}

,

i = \bar{1, n}

.

Then

{\hat{x}}_{i} = {|σ_{i - 1} + U_{i}|}_{p_{i}}

and

σ_{i} = ⌊\frac{σ_{i - 1} + U_{i}}{p_{i}}⌋

are calculated, assuming that

σ_{0} = 0

.

Note that

{\hat{x}}_{i}

and

σ_{i}

are, respectively, the remainder of the division of the sum

(σ_{i - 1} + U_{i})

by the modulus

p_{i}

and the rank of the sum, i.e., the carry that counts how many whole times the modulus fits into the sum.

Then find the value of the expression

X = \sum_{i = 1}^{n} {\hat{x}}_{i} \cdot {\hat{w}}_{i}

to convert a number from RNS to a positional number system, and to expand the bases, find

{|X|}_{p_{n + 1}} = {|\sum_{i = 1}^{n} {\hat{x}}_{i} \cdot {\hat{w}}_{i}|}_{p_{n + 1}}

.
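The steps of Algorithm 3 can be sketched in Python as follows. This is a minimal software model under stated assumptions, not the hardware unit from the patent: the function names `precompute` and `modified_mrc` and the optional parameter `p_ext` (standing for p_{n+1}) are illustrative, and Python 3.8 or later is assumed.

```python
from math import prod


def precompute(moduli):
    """Steps 1-3: MRC weights w_hat and the digit matrix b_hat of the bases B_i."""
    P = prod(moduli)
    # w_hat[j] is the product of the first j moduli: 1, p_1, p_1*p_2, ...
    w_hat = [prod(moduli[:j]) for j in range(len(moduli))]
    b_hat = []
    for p in moduli:
        P_i = P // p
        B_i = P_i * pow(P_i, -1, p)      # orthogonal basis B_i
        row = []
        for q in moduli:                 # mixed-radix digits of B_i
            B_i, digit = divmod(B_i, q)
            row.append(digit)
        b_hat.append(row)
    return w_hat, b_hat


def modified_mrc(residues, moduli, p_ext=None):
    """Steps 4-7: return X, or (X, X mod p_ext) when an extra modulus is given."""
    w_hat, b_hat = precompute(moduli)
    sigma, x_hat = 0, []
    for i, p in enumerate(moduli):
        U_i = sum(residues[j] * b_hat[j][i] for j in range(i + 1))
        sigma, digit = divmod(sigma + U_i, p)   # carry sigma_i and digit x_hat_i
        x_hat.append(digit)
    X = sum(d * w for d, w in zip(x_hat, w_hat))
    return X if p_ext is None else (X, X % p_ext)


print(modified_mrc((3, 2, 1), (7, 8, 9)))        # 10
print(modified_mrc((3, 2, 1), (7, 8, 9), 11))    # (10, 10)
```

Only divisions by the small moduli p_i appear in the loop; the last carry is discarded, which plays the role of the reduction modulo P.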

Let us consider an example to illustrate how the algorithm works. Take a residue number system with three moduli

\{7, 8, 9\}

, so that

P = \prod_{i = 1}^{3} p_{i} = 504

.

Let us calculate

P_{i} = \frac{P}{p_{i}},

{|P_{i}^{- 1}|}_{p_{i}}

and

B_{i} = P_{i} \cdot {|P_{i}^{- 1}|}_{p_{i}}

:

\begin{matrix} P_{1} = 8 \cdot 9 = 72, & {|P_{1}^{- 1}|}_{p_{1}} = {|72^{- 1}|}_{7} = 4, & B_{1} = 288, \\ P_{2} = 7 \cdot 9 = 63, & {|P_{2}^{- 1}|}_{p_{2}} = {|63^{- 1}|}_{8} = 7, & B_{2} = 441, \\ P_{3} = 7 \cdot 8 = 56, & {|P_{3}^{- 1}|}_{p_{3}} = {|56^{- 1}|}_{9} = 5, & B_{3} = 280 . \end{matrix}

Let us calculate

{\hat{w}}_{j} = \prod_{i = 1}^{j - 1} p_{i}

:

{\hat{w}}_{1} = 1, {\hat{w}}_{2} = 7, {\hat{w}}_{3} = 7 \cdot 8 = 56 .

Let us calculate the representation of

B_{i}

in MRC:

B_{1} \overset{MRC}{\to} {\hat{B}}_{1} = [{\hat{b}}_{1, 1}, {\hat{b}}_{1, 2}, {\hat{b}}_{1, 3}] = [1, 1, 5]

, checking

1 + 1 \cdot 7 + 5 \cdot 56 = 288

,

B_{2} \overset{MRC}{\to} {\hat{B}}_{2} = [{\hat{b}}_{2, 1}, {\hat{b}}_{2, 2}, {\hat{b}}_{2, 3}] = [0, 7, 7]

, checking

0 + 7 \cdot 7 + 7 \cdot 56 = 441

,

B_{3} \overset{MRC}{\to} {\hat{B}}_{3} = [{\hat{b}}_{3, 1}, {\hat{b}}_{3, 2}, {\hat{b}}_{3, 3}] = [0, 0, 5]

, checking

0 + 0 \cdot 7 + 5 \cdot 56 = 280

.

Given

X = 10 = (3, 2, 1)

, we can compute

U_{i} = \sum_{j = 1}^{i} x_{j} \cdot {\hat{b}}_{j, i}

:

\begin{matrix} U_{1} = x_{1} \cdot {\hat{b}}_{1, 1} = 3 \cdot 1 = 3, \\ U_{2} = x_{1} \cdot {\hat{b}}_{1, 2} + x_{2} \cdot {\hat{b}}_{2, 2} = 3 \cdot 1 + 2 \cdot 7 = 17, \\ U_{3} = x_{1} \cdot {\hat{b}}_{1, 3} + x_{2} \cdot {\hat{b}}_{2, 3} + x_{3} \cdot {\hat{b}}_{3, 3} = 3 \cdot 5 + 2 \cdot 7 + 1 \cdot 5 = 34 . \end{matrix}

Let us calculate

{\hat{x}}_{i} = {|σ_{i - 1} + U_{i}|}_{p_{i}}

and

σ_{i} = ⌊\frac{σ_{i - 1} + U_{i}}{p_{i}}⌋

:

\begin{matrix} {\hat{x}}_{1} = {|U_{1}|}_{p_{1}} = {|3|}_{7} = 3, & σ_{1} = ⌊\frac{U_{1}}{p_{1}}⌋ = ⌊\frac{3}{7}⌋ = 0, \\ {\hat{x}}_{2} = {|σ_{1} + U_{2}|}_{p_{2}} = {|0 + 17|}_{8} = 1, & σ_{2} = ⌊\frac{σ_{1} + U_{2}}{p_{2}}⌋ = ⌊\frac{17}{8}⌋ = 2, \\ {\hat{x}}_{3} = {|σ_{2} + U_{3}|}_{p_{3}} = {|2 + 34|}_{9} = 0 . \end{matrix}

From this we obtain:

X = \sum_{i = 1}^{n} {\hat{x}}_{i} \cdot {\hat{w}}_{i} = 3 \cdot 1 + 1 \cdot 7 + 0 \cdot 56 = 10 .

Consider the block diagram of the unit for converting numbers from RNS and for base extension based on Algorithm 3, shown in Figure 1.

The number

(x_{1}, x_{2}, \dots, x_{n})

represented in RNS is fed to the n inputs of the residues, where n is the number of moduli of the residue number system.

The residues from the corresponding residue inputs are written to n residue storage units. From the outputs of the residue storage units, the residues modulo

\{p_{1}, p_{2}, \dots, p_{n}\}

are fed to multipliers arranged as a triangular matrix, which compute the products of the residues and the coefficients of the bases in the MRC

x_{j} \cdot {\hat{b}}_{j, i}

, where

B_{i} = P_{i} \cdot {|P_{i}^{- 1}|}_{p_{i}} = \sum_{j = 1}^{n} {\hat{b}}_{i, j} \cdot {\hat{w}}_{j}

are the orthogonal bases of RNS and

{\hat{w}}_{j} = \prod_{i = 1}^{j - 1} p_{i}, j = \bar{2, n}

are the bases of MRC, with

{\hat{w}}_{1} = 1

.

The output of the first residue storage unit modulo

p_{1}

is connected to the inputs of n multipliers by

{\hat{b}}_{1, 1}

, …,

{\hat{b}}_{1, n}

, which compute the values

x_{1} \cdot {\hat{b}}_{1, i}

,

i = \bar{1, n}

.

The output of the second residue storage unit modulo

p_{2}

is connected to the inputs of

(n - 1)

multipliers by

{\hat{b}}_{2, 2}

, …,

{\hat{b}}_{2, n}

, which compute the values

x_{2} \cdot {\hat{b}}_{2, i}

,

i = \bar{2, n}

, and so on; the output of the n-th residue storage unit modulo

p_{n}

is connected to the input of the multiplier by

{\hat{b}}_{n, n}

, which calculates the value

x_{n} \cdot {\hat{b}}_{n, n}

.

The outputs of the multipliers are connected to the inputs of n modular adders modulo

p_{i}

,

i = \bar{1, n}

, and the output of the multiplier by

{\hat{b}}_{1, 1}

is connected to the input of the first modular adder modulo

p_{1}

and performs calculations

{\hat{x}}_{1} = {|U_{1}|}_{p_{1}} = {|x_{1} \cdot {\hat{b}}_{1, 1}|}_{p_{1}}

,

σ_{1} = ⌊\frac{x_{1} \cdot {\hat{b}}_{1, 1}}{p_{1}}⌋

, the value

{\hat{x}}_{1}

goes to the output of the first modular adder modulo

p_{1}

, the value

σ_{1}

goes to the carry output of the first modular adder modulo

p_{1}

.

The outputs of the multipliers by

{\hat{b}}_{1, 2}

and by

{\hat{b}}_{2, 2}

are connected to the inputs of the second modular adder modulo

p_{2}

, whose carry input is connected to the carry output of the first modular adder modulo

p_{1}

, and calculates

{\hat{x}}_{2} = {|σ_{1} + U_{2}|}_{p_{2}} = {|σ_{1} + x_{1} \cdot {\hat{b}}_{1, 2} + x_{2} \cdot {\hat{b}}_{2, 2}|}_{p_{2}}

,

σ_{2} = ⌊\frac{σ_{1} + x_{1} \cdot {\hat{b}}_{1, 2} + x_{2} \cdot {\hat{b}}_{2, 2}}{p_{2}}⌋

, the value of

{\hat{x}}_{2}

goes to the output of the second modular adder modulo

p_{2}

, the value of

σ_{2}

goes to the carry output of the second modular adder modulo

p_{2}

.

Similarly, the outputs of the multipliers by

{\hat{b}}_{i, n}

,

i = \bar{1, n}

, are connected to the inputs of the

n

-th modular adder modulo

p_{n}

, whose carry input is connected to the carry output of the

(n - 1)

-th modular adder modulo

p_{n - 1}

, and calculates

{\hat{x}}_{n} = {|σ_{n - 1} + U_{n}|}_{p_{n}} = {|σ_{n - 1} + \sum_{i = 1}^{n} x_{i} {\hat{b}}_{i, n}|}_{p_{n}}

,

σ_{n} = ⌊\frac{σ_{n - 1} + \sum_{i = 1}^{n} x_{i} {\hat{b}}_{i, n}}{p_{n}}⌋

, the value

{\hat{x}}_{n}

arrives at the output of the n-th modular adder modulo

p_{n}

, the value of

σ_{n}

arrives at the carry output of the

n

-th modular adder modulo

p_{n}

, which is not connected to other elements.

The values of

{\hat{x}}_{i}

from the outputs of i-th modular adders modulo

p_{i}

are fed to the inputs of i-th multipliers by the bases of MRC

{\hat{w}}_{i}

,

i = \bar{1, n}

, where multiplication of

{\hat{x}}_{i}

by

{\hat{w}}_{i} = \prod_{j = 1}^{i - 1} p_{j}

takes place.

The output of the i-th multiplier by the bases of the MRC

{\hat{w}}_{i}

is connected to the i-th inputs of the adder and the modular adder,

i = \bar{1, n}

. The extended-base input of the modular adder receives the value of the modulus

p_{n + 1}

, the output of the adder delivers the reconstructed number X in the positional number system, and the output of the modular adder delivers the residue of X for the extended base, i.e.,

{|X|}_{p_{n + 1}}

.
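When sizing the adders of Figure 1, it may help to know how large the sums σ_{i-1} + U_i and the carries σ_i can become. Since all coefficients b̂_{j,i} are non-negative, every U_i is largest when each residue takes its maximum value p_j - 1, so this single input gives the worst case. The following Python sketch is an illustrative software estimate under that observation and is not part of the patent; the function names `digit_matrix` and `worst_case` are assumptions.

```python
from math import prod


def digit_matrix(moduli):
    """b_hat[i][j]: mixed-radix digits of the orthogonal bases B_i."""
    P = prod(moduli)
    rows = []
    for p in moduli:
        P_i = P // p
        B_i = P_i * pow(P_i, -1, p)
        row = []
        for q in moduli:
            B_i, digit = divmod(B_i, q)
            row.append(digit)
        rows.append(row)
    return rows


def worst_case(moduli):
    """Largest sum sigma_{i-1} + U_i and largest carry sigma_i of each adder."""
    b_hat = digit_matrix(moduli)
    x_max = [p - 1 for p in moduli]          # all residues at their maximum
    sigma, sums, carries = 0, [], []
    for i, p in enumerate(moduli):
        U_i = sum(x_max[j] * b_hat[j][i] for j in range(i + 1))
        total = sigma + U_i
        sigma = total // p
        sums.append(total)
        carries.append(sigma)
    return sums, carries


sums, carries = worst_case([7, 8, 9])
print(sums, carries)                       # [6, 55, 125] [0, 6, 13]
print([s.bit_length() for s in sums])      # [3, 6, 7]
```

For the moduli {7, 8, 9}, the third adder therefore has to handle sums of up to 7 bits before reduction modulo 9.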

This method has a matrix structure similar to that of New CRT but, as in MRC, uses a sequential calculation that avoids taking the remainder modulo a large number.

5. Method Based on Reference Points

In [18], a method of reference points based on lookup tables (LUTs) is proposed for conversion from a residue number system.

For a given set of moduli

\{p_{1}, p_{2}, \dots, p_{n}\}

, one modulus, say

p_{n}

, is chosen as the reference point modulus. To reduce the number of values stored in memory, it is most efficient to choose the largest modulus of the RNS.

Then the contents of the LUT will consist of

P_{h} = \prod_{i = 1}^{n - 1} p_{i}

reference points, which correspond to numbers

X = (x_{1}, x_{2}, \dots, x_{n - 1}, 0)

, i.e., to numbers that are divisible by the modulus

p_{n}

.

To turn an arbitrary number

Y = (y_{1}, y_{2}, \dots, y_{n})

into a number that is divisible by

p_{n}

, subtract

y_{n}

from Y, that is, calculate

x_{i} = {|y_{i} - y_{n}|}_{p_{i}}, i = 1, \dots, n

.

For example, for a RNS

\{7, 8, 9\}

and an initial number

Y = 10 = (3, 2, 1)

, the reference point is

X = ({|3 - 1|}_{7}, {|2 - 1|}_{8}, {|1 - 1|}_{9}) = (2, 1, 0)

.
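The residue-wise subtraction can be sketched in Python as follows. The function name `to_reference_point` is an illustrative assumption; the last entry of `moduli` is taken as the reference point modulus p_n.

```python
def to_reference_point(residues, moduli):
    """Subtract y_n in every channel: x_i = |y_i - y_n| mod p_i."""
    y_n = residues[-1]
    # Python's % returns a non-negative result for a positive modulus.
    return tuple((y - y_n) % p for y, p in zip(residues, moduli))


print(to_reference_point((3, 2, 1), (7, 8, 9)))   # (2, 1, 0)
```

The last component is always zero, which is why only the first n - 1 components are needed to address the LUT.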

Thus, by analogy with New CRT II and the MRC for two moduli, the reference point will be

X = {|X_{h} \cdot {|p_{n}^{- 1}|}_{P_{h}}|}_{P_{h}} \cdot p_{n}

, where

X_{h}

is the positional number obtained from

(x_{1}, x_{2}, \dots, x_{n - 1})

for the set of moduli

\{p_{1}, p_{2}, \dots, p_{n - 1}\}

.

For example, for the number

Y = 228 = (4, 4, 3)

, the reference point would be the value

X = ({|4 - 3|}_{7}, {|4 - 3|}_{8}, {|3 - 3|}_{9}) = (1, 1, 0)

. Since

(1, 1)

for the set of moduli

\{7, 8\}

is 1, and

{|9^{- 1}|}_{7 \cdot 8} = 25

, the reference point is

X = {|1 \cdot 25|}_{56} \cdot 9 = 225

, which corresponds to the subtraction from

Y = 228

of residue

y_{n} = 3

.

It can be seen that the value of the reference point depends on the number

(x_{1}, x_{2}, \dots, x_{n - 1})

with moduli

\{p_{1}, p_{2}, \dots, p_{n - 1}\}

and range

P_{h} = \prod_{i = 1}^{n - 1} p_{i}

. The computation can therefore be replaced by a memory access that, given the value

X_{h} = (x_{1}, x_{2}, \dots, x_{n - 1})

, returns the value

{|X_{h} \cdot {|p_{n}^{- 1}|}_{P_{h}}|}_{P_{h}} \cdot p_{n}

.
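A single-layer converter of this kind can be modeled in Python as below. This is a minimal sketch under stated assumptions: the LUT is represented by a dictionary keyed by the residue tuple (x_1, ..., x_{n-1}), the names `build_lut` and `convert` are illustrative, and the last modulus in `moduli` is the reference point modulus.

```python
from math import prod


def build_lut(moduli):
    """Map (x_1, ..., x_{n-1}) to the reference point |X_h * |p_n^{-1}|_{P_h}|_{P_h} * p_n."""
    *head, p_n = moduli
    P_h = prod(head)
    inv = pow(p_n, -1, P_h)                  # |p_n^{-1}| modulo P_h
    lut = {}
    for X_h in range(P_h):
        key = tuple(X_h % p for p in head)
        lut[key] = (X_h * inv) % P_h * p_n
    return lut


def convert(residues, moduli, lut):
    """Reverse conversion: reference point from the LUT plus the residue y_n."""
    y_n = residues[-1]
    key = tuple((y - y_n) % p for y, p in zip(residues[:-1], moduli[:-1]))
    return lut[key] + y_n


moduli = (7, 8, 9)
lut = build_lut(moduli)
print(len(lut))                              # 56
print(lut[(1, 1)])                           # 225
print(convert((4, 4, 3), moduli, lut))       # 228
```

The table holds P_h = 56 entries for this set, which shows why choosing the largest modulus as p_n keeps the memory small.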

The reverse converter based on reference points can have a hierarchical structure, for example, as shown in Figure 2. First, the differences of the residues

{|x_{1} - x_{3}|}_{p_{1}}

and

{|x_{2} - x_{3}|}_{p_{2}}

modulo

p_{1}

and

p_{2}

, respectively, address a LUT with the values of the reference points, which are

{|X_{12} \cdot {|p_{3}^{- 1}|}_{p_{1} p_{2}}|}_{p_{1} p_{2}} \cdot p_{3}

, where

X_{12}

is the value restored from these differences over the moduli

p_{1}, p_{2}

. Adding the residue

x_{3}

modulo

p_{3}

to the LUT output gives the restored number

X_{123}

over the moduli

p_{1}, p_{2}, p_{3}

.

Then, similarly, the differences between the residues modulo

p_{4}

and

p_{5}

and the reconstructed number

X_{123}

are computed. Since the value of

X_{123}

can greatly exceed the moduli

p_{4}

and

p_{5}

, forward conversion blocks are used, which can be implemented either arithmetically or with memory. The LUT of reference points for this layer stores the pre-calculated values

{|X_{45} \cdot {|{(P_{123})}^{- 1}|}_{P_{45}}|}_{P_{45}} \cdot P_{123}

for

P_{123} = p_{1} \cdot p_{2} \cdot p_{3}

and

P_{45} = p_{4} \cdot p_{5}

.
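The two-layer structure can be sketched in Python as follows. The moduli set {3, 5, 7, 11, 13} is a hypothetical example chosen only for illustration, the names `build_layer_lut` and `apply_layer` are assumptions, and the LUTs are modeled as dictionaries. The forward conversion of X_123 modulo p_4 and p_5 is modeled by the `%` operator.

```python
from math import prod


def build_layer_lut(P_low, high_moduli):
    """Reference points |v * |P_low^{-1}|_{P_high}|_{P_high} * P_low, keyed by residues of v."""
    P_high = prod(high_moduli)
    inv = pow(P_low, -1, P_high)
    return {
        tuple(v % p for p in high_moduli): (v * inv) % P_high * P_low
        for v in range(P_high)
    }


def apply_layer(X_low, residues_high, high_moduli, lut):
    """Extend the value X_low by the channels in high_moduli."""
    key = tuple((x - X_low) % p for x, p in zip(residues_high, high_moduli))
    return lut[key] + X_low


p1, p2, p3, p4, p5 = 3, 5, 7, 11, 13          # hypothetical moduli set
lut_1 = build_layer_lut(p3, [p1, p2])          # first layer: 15 entries
lut_2 = build_layer_lut(p1 * p2 * p3, [p4, p5])  # second layer: 143 entries


def convert(x):
    X_123 = apply_layer(x[2], x[:2], [p1, p2], lut_1)
    return apply_layer(X_123, x[3:], [p4, p5], lut_2)


X = 12345
print(convert([X % p for p in (p1, p2, p3, p4, p5)]))   # 12345
```

Together the two tables hold 15 + 143 entries, whereas a single LUT addressed by the four residues modulo 3, 5, 11 and 13 would need 2145.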

Thus, by varying the number of LUT inputs from 1 to

n - 1

and the number of reverse conversion layers, a solution with better time or area can be obtained.

6. Modification of the Reference Points Method Based on the MRC

As the number of moduli and their dimensionality increase, the method of reference points requires a significant amount of memory.

We therefore introduce a method that combines the reference points method with the modified MRC. Let us take an RNS

\{p_{1}, \dots, p_{n - 1}, p_{n}\}

, where

p_{n} = 2^{k}

, i.e., a power of two. Then, using Algorithm 3, we find

X_{h}

by the moduli

\{p_{1}, \dots, p_{n - 1}\}

, and using the method of reference point we compute

{|x_{n} - X_{h}|}_{p_{n}}

. Since

p_{n} = 2^{k}

, this expression can be reduced to taking the k lowest bits of the sum

x_{n} + {\bar{X}}_{h} + 1

, where

{\bar{X}}_{h}

is the bitwise inversion.
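This is the usual two's complement subtraction restricted to k bits. A minimal Python sketch (the function name `diff_mod_pow2` is an illustrative assumption) shows that the inversion, the increment, and the truncation reproduce the modular difference:

```python
def diff_mod_pow2(x_n, X_h, k):
    """k lowest bits of x_n + ~X_h + 1, equal to (x_n - X_h) mod 2**k."""
    mask = (1 << k) - 1
    inverted = ~X_h & mask        # k lowest bits of the bitwise inversion of X_h
    return (x_n + inverted + 1) & mask


print(diff_mod_pow2(3, 10, 3))    # 1
```

No divider or modular subtractor is involved, only an inverter, an adder, and a truncation to k bits.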

Then, based on this value, a precomputed value equal to

{|{|x_{n} + {\bar{X}}_{h} + 1|}_{2^{k}} \cdot {|P_{h}^{- 1}|}_{2^{k}}|}_{2^{k}} \cdot P_{h}

is selected from a LUT of size

2^{k}

, where

P_{h} = \prod_{i = 1}^{n - 1} p_{i}

. The final value is obtained by adding

X_{h}

to the value read from the LUT. In this case, the reverse converter can have a structure, for example, as shown in Figure 3.

Consider an example for RNS

\{3, 5, 7, 8\}

and the number

X = 115 = (1, 0, 3, 3)

. To apply Algorithm 3, find the parameters of the RNS

\{3, 5, 7\}

:

\begin{matrix} P_{h} = 3 \cdot 5 \cdot 7 = 105, \\ P_{1} = 5 \cdot 7 = 35, & {|P_{1}^{- 1}|}_{p_{1}} = {|35^{- 1}|}_{3} = 2, & B_{1} = 70, \\ P_{2} = 3 \cdot 7 = 21, & {|P_{2}^{- 1}|}_{p_{2}} = {|21^{- 1}|}_{5} = 1, & B_{2} = 21, \\ P_{3} = 3 \cdot 5 = 15, & {|P_{3}^{- 1}|}_{p_{3}} = {|15^{- 1}|}_{7} = 1, & B_{3} = 15 . \end{matrix}

Let us calculate

{\hat{w}}_{j} = \prod_{i = 1}^{j - 1} p_{i}

:

{\hat{w}}_{1} = 1, {\hat{w}}_{2} = 3, {\hat{w}}_{3} = 3 \cdot 5 = 15 .

Let us calculate the representation of

B_{i}

in the MRC:

\begin{matrix} B_{1} \overset{MRC}{\to} {\hat{B}}_{1} = [{\hat{b}}_{1, 1}, {\hat{b}}_{1, 2}, {\hat{b}}_{1, 3}] = [1, 3, 4] \end{matrix}

\begin{matrix} B_{2} \overset{MRC}{\to} {\hat{B}}_{2} = [{\hat{b}}_{2, 1}, {\hat{b}}_{2, 2}, {\hat{b}}_{2, 3}] = [0, 2, 1] \end{matrix}

\begin{matrix} B_{3} \overset{MRC}{\to} {\hat{B}}_{3} = [{\hat{b}}_{3, 1}, {\hat{b}}_{3, 2}, {\hat{b}}_{3, 3}] = [0, 0, 1] . \end{matrix}

To calculate

X_{h} = (1, 0, 3)

we then compute

U_{i} = \sum_{j = 1}^{i} x_{j} \cdot {\hat{b}}_{j, i}

:

\begin{matrix} U_{1} = x_{1} \cdot {\hat{b}}_{1, 1} = 1 \cdot 1 = 1, \end{matrix}

\begin{matrix} U_{2} = x_{1} \cdot {\hat{b}}_{1, 2} + x_{2} \cdot {\hat{b}}_{2, 2} = 1 \cdot 3 + 0 \cdot 2 = 3, \end{matrix}

\begin{matrix} U_{3} = x_{1} \cdot {\hat{b}}_{1, 3} + x_{2} \cdot {\hat{b}}_{2, 3} + x_{3} \cdot {\hat{b}}_{3, 3} = 1 \cdot 4 + 0 \cdot 1 + 3 \cdot 1 = 7 . \end{matrix}

Let us calculate

{\hat{x}}_{i} = {|σ_{i - 1} + U_{i}|}_{p_{i}}

and

σ_{i} = ⌊\frac{σ_{i - 1} + U_{i}}{p_{i}}⌋

:

\begin{matrix} {\hat{x}}_{1} = {|U_{1}|}_{p_{1}} = {|1|}_{3} = 1, & σ_{1} = ⌊\frac{U_{1}}{p_{1}}⌋ = ⌊\frac{1}{3}⌋ = 0, \end{matrix}

\begin{matrix} {\hat{x}}_{2} = {|σ_{1} + U_{2}|}_{p_{2}} = {|0 + 3|}_{5} = 3, & σ_{2} = ⌊\frac{σ_{1} + U_{2}}{p_{2}}⌋ = ⌊\frac{3}{5}⌋ = 0, \end{matrix}

\begin{matrix} {\hat{x}}_{3} = {|σ_{2} + U_{3}|}_{p_{3}} = {|0 + 7|}_{7} = 0 . \end{matrix}

From this we obtain:

X_{h} = \sum_{i = 1}^{n} {\hat{x}}_{i} \cdot {\hat{w}}_{i} = 1 \cdot 1 + 3 \cdot 3 + 0 \cdot 15 = 10 .

According to the method of reference points, we find the difference

X_{4} = {|x_{4} - X_{h}|}_{p_{4}} = {|3 - 10|}_{8} = 1 .

In a hardware implementation,

X_{4}

is obtained by adding

x_{4}

, one, and the

k = 3

lowest bits of the inverted value of

X_{h}

, i.e.,

X_{4} = {|11 + \bar{1010} + 1|}_{8} = {|11 + 101 + 1|}_{8} = {|1001|}_{8} = 1 .

The LUT of reference points stores the values

{|X_{4} \cdot {|P_{h}^{- 1}|}_{2^{k}}|}_{2^{k}} \cdot P_{h}

; the values for this case are shown in Table 1.

Since

X_{4} = 1

, from Table 1 we obtain the value 105 and then the initial value

X = 105 + X_{h} = 115

.
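The whole example can be reproduced with the following Python sketch. It is a software model under stated assumptions, not the synthesized circuit: `modified_mrc` is a compact version of Algorithm 3 without base extension, and the names `build_lut` and `convert` are illustrative.

```python
from math import prod


def modified_mrc(residues, moduli):
    """Compact Algorithm 3 (no base extension) for the odd moduli."""
    P = prod(moduli)
    b_hat = []
    for p in moduli:                      # mixed-radix digits of every basis B_j
        B = (P // p) * pow(P // p, -1, p)
        row = []
        for q in moduli:
            B, d = divmod(B, q)
            row.append(d)
        b_hat.append(row)
    X, sigma, weight = 0, 0, 1
    for i, p in enumerate(moduli):
        U_i = sum(residues[j] * b_hat[j][i] for j in range(i + 1))
        sigma, digit = divmod(sigma + U_i, p)
        X += digit * weight
        weight *= p
    return X


def build_lut(P_h, k):
    """Reference points |X_4 * |P_h^{-1}|_{2^k}|_{2^k} * P_h for X_4 = 0 .. 2^k - 1."""
    size = 1 << k
    inv = pow(P_h, -1, size)              # requires an odd P_h
    return [(x4 * inv) % size * P_h for x4 in range(size)]


def convert(residues, odd_moduli, k, lut):
    X_h = modified_mrc(residues[:-1], odd_moduli)
    mask = (1 << k) - 1
    X_4 = (residues[-1] + (~X_h & mask) + 1) & mask   # (x_n - X_h) mod 2^k
    return lut[X_4] + X_h


odd_moduli, k = (3, 5, 7), 3
lut = build_lut(prod(odd_moduli), k)
print(lut)                                        # [0, 105, 210, 315, 420, 525, 630, 735]
print(convert((1, 0, 3, 3), odd_moduli, k, lut))  # 115
```

The printed table matches Table 1, and the LUT has only 2^k = 8 entries regardless of how many odd moduli are handled by the modified MRC.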

Changing the ratio between the number of moduli handled by the modified MRC and by the method of reference points, as well as the number of layers in the method of reference points, makes it possible to tune the converter flexibly to the constraints on area and computation time.

7. Physical Synthesis of Reverse Conversion Methods

Physical synthesis of the methods for conversion from a residue number system to a positional number system was performed for an ASIC target in the Cadence Genus Synthesis Solution environment for RTL and physical synthesis, using the NangateOpenCellLibrary and optimization for computation time. The measured parameters were the signal propagation time across the circuit (picoseconds, ps) and the area used (square micrometers, µm²).

The following methods were synthesized: the Chinese remainder theorem-based method (Formula (1)), the approximated CRT-based method (Formulas (5) and (6)), the mixed-radix conversion-based method (Formula (2)), a method based on special moduli

\{2^{n} - 1, 2^{n}, 2^{n} + 1\}

[8], the reference points method proposed by the authors and discussed in Section 5, the modified MRC-based method (Algorithm 3), and the introduced method based on reference points and modified MRC (Section 6).

Simulation results for the reference points method, CRT, approximate CRT, modified MRC, MRC, special moduli

\{2^{n} - 1, 2^{n}, 2^{n} + 1\}

and the proposed method are shown in Figures 4, 5 and 6 for 8, 16 and 32 bits, respectively.

In addition, Tables 2, 3 and 4 compare the proposed modification with the previously developed methods. In these tables, the reference points based method is denoted as I, the modified mixed-radix conversion based method as II, and the proposed modification as III. The configuration of the layers for the reference points method is given in the second column.

Figure 4 shows that the reference points method has the smallest area in the 8-bit domain and is only slightly inferior to the special type moduli in terms of time. The MRC and approximate CRT-based methods have complex computational algorithms and perform poorly in terms of both area and time.

In the 16-bit domain (Figure 5), the method using special moduli has a clear advantage, and the method based on reference points has a low execution time but significant hardware costs. The proposed method reduces the area used but slightly increases the computation time.

In the 32-bit domain (Figure 6), the reference points method shows both some of the best and some of the worst results in terms of time and area, which indicates a strong dependence on the set of moduli and on the configuration of layers over which the residues are grouped for the LUT. The MRC-based method performs worse than the CRT and the approximate CRT in most cases, but has less area as the dimensionality increases.

Figure 7 and Figure 8 show the best time and area, respectively, for the methods considered.

8. Conclusions

In this paper, an overview of the methods of conversion from the residue number system to the positional number system is given. Reverse conversion methods for general form moduli, such as the Chinese remainder theorem, mixed-radix conversion, the Chinese remainder theorem with fractional numbers, New CRT I, and New CRT II, were considered. The application of RNS moduli of the special form 2ⁿ ± 1, 2ⁿ was also considered. Based on the modified mixed-radix conversion presented by Algorithm 3 and the method of reference points (Section 5), a method of reverse conversion of a number from a residue number system to a positional number system is proposed (Section 6). The results of physical synthesis have shown that the use of moduli of a special type makes it possible to significantly reduce the computation time and the occupied area; however, the task of working with moduli of a general type often arises.

The reference points method proposed by the authors in [18] makes it possible to obtain computationally efficient implementations, but the area used can differ by a factor of up to 30 within one dimension (for the sets {7, 17, 31, 65, 127, 256} and {1023, 1025, 8192}), which leads to the need to check a large number of sets of a given dimension in the search for the optimal set.

The method proposed in this work reduces the required area (up to five times for the set {7, 17, 31, 65, 127, 256}) for RNS with a large number of moduli compared to the reference points method and reduces the area for RNS with three and four moduli compared to the modified MRC. It is possible to combine the reference points method with different configurations of layers and methods based on CRT and MRC to obtain efficient implementations in terms of computational speed and required area.

Author Contributions

Conceptualization, V.K., D.T., M.B., I.M., A.S., N.K., T.E. and M.G.; methodology, V.K., D.T., M.B., I.M., A.S., N.K., T.E. and M.G.; software, V.K. and I.M.; validation, V.K., D.T., M.B., I.M., A.S., N.K., T.E. and M.G.; formal analysis, V.K., D.T., M.B., I.M., A.S., N.K., T.E. and M.G.; investigation, V.K., D.T., M.B., I.M., A.S., N.K., T.E. and M.G.; writing—original draft preparation, V.K., D.T., M.B. and I.M.; writing—review and editing, V.K., D.T., M.B., I.M., A.S., N.K., T.E. and M.G.; supervision, V.K., D.T., M.B. and A.S. All authors have read and agreed to the published version of the manuscript.

Funding

This work was carried out at the North Caucasus Center for Mathematical Research within agreement no. 075-02-2022-892 with the Ministry of Science and Higher Education of the Russian Federation. The reported study was funded by Russian Federation President Grant MD-1414.2021.4, MK-1203.2022.1.6 and SP-3186.2022.5.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Garner, H.L. The residue number system. In Proceedings of the Western Joint Computer Conference, San Francisco, CA, USA, 3–5 March 1959.
2. Malashevich, B.M. Brief Basis and History of Domestic Modular Computers. Origins of modular arithmetic. Proc. SoRuCom-2017 2017, 10, 193–207. (In Russian)
3. Low, J.Y.S.; Chang, C.H. A new approach to the design of efficient residue generators for arbitrary moduli. IEEE Trans. Circuits Syst. I Regul. Pap. 2013, 60, 2366–2374.
4. Patel, R.A.; Benaissa, M.; Powell, N.; Boussakta, S. Novel power-delay-area-efficient approach to generic modular addition. IEEE Trans. Circuits Syst. I Regul. Pap. 2007, 54, 1279–1292.
5. Zimmermann, R. Efficient VLSI implementation of modulo (2ⁿ ± 1) addition and multiplication. In Proceedings of the 14th IEEE Symposium on Computer Arithmetic, Adelaide, Australia, 14–16 April 1999; pp. 158–167.
6. Tchernykh, A.; Babenko, M.; Shiriaev, E.; Pulido-Gaytan, B.; Cortés-Mendoza, J.M.; Avetisyan, A.; Drozdov, A.Y.; Kuchukov, V. An Efficient Method for Comparing Numbers and Determining the Sign of a Number in RNS for Even Ranges. Computation 2022, 10, 17.
7. Chervyakov, N.I.; Molahosseini, A.S.; Lyakhov, P.A.; Babenko, M.G.; Deryabin, M.A. Residue-to-binary conversion for general moduli sets based on approximate Chinese remainder theorem. Int. J. Comput. Math. 2017, 94, 1833–1849.
8. Patronik, P.; Piestrak, S.J. Design of RNS reverse converters with constant shifting to residue datapath channels. J. Signal Process. Syst. 2018, 90, 323–339.
9. Ananda Mohan, P.V. Reverse Converters for the Moduli Set {2ⁿ, 2ⁿ⁻¹ − 1, 2ⁿ − 1, 2ⁿ⁺¹ − 1} (n Even). Circuits Syst. Signal Process. 2018, 37, 3605–3634.
10. Phalguna, P.S.; Kamat, D.V.; Ananda Mohan, P.V. RNS-to-Binary Converters for New Three-Moduli Sets {2^k − 3, 2^k − 2, 2^k − 1} and {2^k + 1, 2^k + 2, 2^k + 3}. J. Circuits Syst. Comput. 2018, 27, 1850224.
11. Patronik, P. On reverse converters for arbitrary multi-moduli RNS. Integration 2020, 75, 158–167.
12. Madhavi Latha, M.V.N.; Rachh, R.R.; Ananda Mohan, P.V. An improved RNS-to-binary converter for 7-modulus set {2ⁿ⁻⁵ − 1, 2ⁿ⁻³ − 1, 2ⁿ⁻² + 1, 2ⁿ⁻¹ − 1, 2ⁿ⁻¹ + 1, 2ⁿ, 2ⁿ + 1} for n even. Sadhana 2020, 45, 1–4.
13. Mojahed, M.; Molahosseini, A.S.; Zarandi, A.A.E. Multifunctional unit for reverse conversion and sign detection based on five-moduli set {2²ⁿ, 2ⁿ + 1, 2ⁿ − 1, 2ⁿ + 3, 2ⁿ − 3}. Comput. Sci. 2021, 22, 101–121.
14. Majd, K.M.; Molahosseini, A.S. Energy-Efficient Residue-to-Binary Conversion Based on a Modulo-Adder-Free Architecture. In Proceedings of the 2022 30th International Conference on Electrical Engineering (ICEE), Tehran, Iran, 17–19 May 2022; Volume 10, pp. 676–680.
15. Hung, C.Y.; Parhami, B. An approximate sign detection method for residue numbers and its application to RNS division. Comput. Math. Appl. 1994, 27, 23–35.
16. Wang, Y. Residue-to-binary converters based on new Chinese remainder theorems. IEEE Trans. Circuits Syst. II Analog. Digit. Signal Process. 2000, 47, 197–205.
17. Patent for Invention № 2744815 Russian Federation, Int. Cl. G06F 7/72. Device for Transferring Numbers from Residue Number System and Base-Radix Extensions: № 2020120649; Priority. 22.06.2020; Date of Publication. 16.03.2021/Babenko M.G., Kuchukov V.A., Chernykh A.N., Kucherov N.N.; Proprietor: Federalnoe Gosudarstvennoe Avtonomnoe Obrazovatelnoe Uchrezhdenie Vysshego Obrazovaniia “Severo-Kavkazskii Federalnyi Universitet” (RU). 13 p. Available online: https://fips.ru/ofpstorage/Doc/IZPM/RUNWC1/000/000/002/744/815/%D0%98%D0%97-02744815-00001/document.pdf (accessed on 16 November 2022).
18. Stempkovsky, A.; Telpukhov, D.; Mkrtchan, I.; Zhigulin, A. Reference Points Based RNS Reverse Conversion for General Moduli Sets. In Proceedings of the International Conference on Mathematics and Its Applications in New Computer Systems, Stavropol, Russia, 13–15 December 2021; pp. 253–262.

Figure 1. Computational unit for translating numbers from RNS and expanding bases.

Figure 2. Reverse converter based on reference point.

Figure 3. Reverse converter based on reference point and modified MRC.

Figure 4. Reverse conversion for 8 bits.

Figure 5. Reverse conversion for 16 bits.

Figure 6. Reverse conversion for 32 bits.

Figure 7. Comparison of the time required for the methods of translation from RNS to positional number system.

Figure 8. Comparison of the area used for methods of translation from RNS to positional number system.

Table 1. LUT values for the example at hand.

$X_{4}$	Reference Point Value	$X_{4}$	Reference Point Value
0	0	4	420
1	105	5	525
2	210	6	630
3	315	7	735

Table 2. Time and area comparison for 8 bits.

RNS Moduli	Layer	Area I, µm²	Area II, µm²	Area III, µm²	Time I, ps	Time II, ps	Time III, ps
{3, 5, 7, 8}	3-5-7/8	553	442	408	654	987	987
	3-7/5-8	425			729
{3, 5, 32}	3-5/32	236	483	279	521	787	787
	32/3-5	266			494
{3, 7, 16}	3-7/16	217	350	315	523	691	1016
{5, 7, 8}	5-7/8	293	288	277	277	845	1051
{5, 9, 16}	5-9/16	409	444	414	611	984	1277
	16/5-9	371			508
{7, 9, 16}	7-9/16	406	426	342	568	979	970
{7, 15, 16}	7-15/16	604	579	543	713	1197	1486
	15-16/7	392			903

Table 3. Time and area comparison for 16 bits.

RNS Moduli	Layer	Area I, µm²	Area II, µm²	Area III, µm²	Time I, ps	Time II, ps	Time III, ps
{3, 5, 7, 17, 64}	3-5-7/64/17	1582	1538	1498	2322	1752	2012
{5, 7, 9, 17, 32}	5-7-9/17/32	2181	1677	1477	1518	2115	2288
{5, 7, 9, 17, 32}	5-32/7-17/9	1868	1677	1477	2896	2115	2288
{7, 15, 17, 64}	7-17/15/64	1329	1627	1553	1805	2025	2231
{7, 15, 31, 32}	7-31/15-32	2552	1566	1381	1538	2018	2266
{15, 17, 31, 32}	31/32/15-17	2001	1464	1353	1915	2014	2274
{31, 33, 128}	31/33/128	945	1302	759	1071	1625	1628
{31, 33, 128}	128/31-33	2219	1302	759	798	1625	1628
{31, 63, 64}	31/63/64	1015	1457	1391	1372	1964	2320
{31, 63, 64}	31-63/64	2643	1457	1391	1048	1964	2320

Table 4. Time and area comparison for 32 bits.

RNS Moduli	Layer	Area I, µm²	Area II, µm²	Area III, µm²	Time I, ps	Time II, ps	Time III, ps
{3, 7, 17, 31, 65, 127, 128}	3-17-31/7-127/65/128	12,153	5658	5982	5077	4731	4837
{3, 7, 17, 31, 65, 127, 128}	3-128/7-127/17-65/31	10,730	5658	5982	5461	4731	4837
{7, 17, 31, 65, 127, 256}	7-65/17-256/31-127	30,741	5421	6168	3321	4212	4545
{9, 17, 31, 65, 127, 128}	127/17-65/128/9-31	9601	5339	5314	4190	4190	4675
{9, 17, 31, 65, 127, 128}	128/127/65/9-17-31	25,870	5339	5314	4383	4190	4675
{31, 33, 65, 127, 512}	31-33/65/127/512	6996	5391	5397	3746	4270	4392
{31, 33, 65, 127, 512}	31-127/65-33/512	19,691	5391	5397	3356	4270	4392
{31, 65, 127, 129, 256}	31-65/127/129/256	10,903	5297	6333	4763	4510	4727
{127, 129, 257, 1024}	127/129/1024/257	3509	5226	3906	3707	3589	3917
{127, 129, 257, 1024}	127-129/1024/257	11,075	5226	3906	3407	3589	3917
{127, 255, 257, 1024}	127/255/257/1024	4318	6025	4535	3589	4075	4284
{127, 255, 511, 512}	127/255/511/512	4837	5945	5726	3108	4221	4377
{1023, 1025, 8192}	1023/1025/8192	1074	3652	2482	1366	2821	2984
{1023, 1025, 8192}	8192/1023/1025	2659	3652	2482	3081	2821	2984
{1023, 2047, 4096}	4096/2047/1023	1535	5031	3966	2179	3323	3638

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
