The Study of Monotonic Core Functions and Their Use to Build RNS Number Comparators

Abstract: A non-positional residue number system (RNS) enjoys particularly efficient implementation of addition and multiplication, but non-modular arithmetic operations in RNS, like number comparison, are known to be difficult. In this paper, a new technique for designing comparators of RNS numbers represented in an arbitrary moduli set is presented. It is based on the core function, for which it is shown that it must be monotonic to allow for RNS number comparison. The conditions of the monotonicity of the core function are formulated, which also ensure the minimal range of the core function (essential to obtain the best characteristics of the comparator). The best choice is a core function in which only the coefficient corresponding to the largest modulus is set to 1, whereas all other coefficients are set to 0. It is also shown that the already known diagonal function is nothing else but a special case of the core function with all coefficients set to 1. Performance evaluation suggests that the new comparator uses less hardware and in some cases also introduces smaller delay than its counterparts based on the diagonal function. The potential applications of the new comparator include some recently developed homomorphic encryption algorithms implemented using RNS.


Introduction
Residue number system (RNS) is a non-positional representation of integers whose main advantage over its traditional positional 2's complement counterpart is particularly efficient implementation of the basic arithmetic operations like addition and multiplication, which are executed on shorter operands by parallel independent circuits [1,2]. Unfortunately, non-modular arithmetic operations in RNS like number comparison, sign detection, and overflow detection are known to be difficult, because they require the involvement of all residues. Akushskii et al. [3] were the first to show that executing these and some other difficult operations does not have to resort to restoring RNS numbers to their positional notation, which involves the cumbersome operation of finding the remainder of the division by a large and awkward number. They introduced the core function, whose major advantage is that it offers the possibility to reduce the range within which the remainder of the division is calculated and which contains some positional information about an RNS-encoded number. Nevertheless, the main disadvantage of the core function remains that most non-modular operations are hard to implement directly [4].
The simplest approach to RNS number comparison relies on converting the numbers to their positional representations, which are then handled using an ordinary number comparator [1]. However, using a reverse converter for RNS number comparison involves computations modulo a very large modulus M, which is both time- and power-consuming. Nevertheless, as for extra hardware, the latter has the advantage that only an ordinary a-bit number comparator (a = ⌈log₂ M⌉) is needed, because any RNS-based processor must use the reverse converter anyway. Since [3], several attempts to design stand-alone comparators (for arbitrary RNS moduli sets) using more sophisticated approaches have been proposed [5][6][7][8][9][10][11]. In [5], an algorithm for comparison of signed RNS numbers, based on using the core function from [3], was proposed. Unfortunately, it requires using a redundant modulus, which must be larger than the range of the core function used. Such a solution seems impractical due to its cost, because one extra residue datapath channel must be added just to allow for number comparison (although it can serve to facilitate execution of some other difficult RNS operations as well). Faster general RNS number comparators were based on using the diagonal function [6,7] and some monotone functions proposed in [8], although the latter requires including a modulus of the form 2^k. Some limitations of the comparators of [6,7] were pointed out in [9], and they also apply to those of [8]. The comparison algorithm suggested in [10] allows one to reduce the maximum size of modulo addition from M to approximately √M, but it suffers from excessive delay compared to other methods. Finally, a new approach based on the Modified Diagonal Function (MDF) was proposed recently in [11]. It allows replacing computations modulo a large and awkward a-bit number M with significantly simpler computations involving only a power-of-2 modulus 2^N, although N is always larger than a.
The MDF is a kind of extension of Vu's approach to sign detection and reverse conversion [12], which also can be reduced to computations modulo 2^N [13]. The comparator of [11] was shown superior w.r.t. area, speed, and power consumption compared to its existing counterparts.
The importance of the availability of cost-efficient and fast RNS comparison algorithms stems from the following observations. Because comparison in RNS has been considered a complex operation, the most widespread applications of RNS are usually comparison-free. Improving the efficiency of RNS comparison techniques can thus have a significant impact on novel applications of RNS wherein comparison cannot be avoided. These include image processing [14], RNS-based convolutional neural networks [15], and RNS-based error correction codes [16]. However, cryptography and data security is the most promising emergent and dynamically developing area using RNS to improve the performance of computations involving very large numbers, whose lengths are counted in thousands of bits. These include integrity verification in RNS-based verifiable secret sharing schemes [17], RNS-based algorithms in cloud computing and in edge and fog devices [16,18], and modern post-quantum homomorphic cryptography algorithms based on algebraic lattices and the Ring Learning With Errors (RLWE) assumption, whose execution can be accelerated using RNS [18][19][20][21][22][23][24]. Magnitude comparison is required for integrity control in [20,[22][23][24]. Because all these cryptographic schemes require computations involving polynomials of very large degree and with very large coefficients, RNS representation of coefficients and operands could significantly increase processing performance for such schemes. Nevertheless, in this context, using even the modulus 2^N with N > a might have some drawbacks. Although computations modulo 2^N are more efficient than modulo any other modulus, some cryptographic applications like homomorphic encryption algorithms based on RLWE (requiring comparison of encrypted numbers in RNS) are very sensitive to memory consumption, which can put executing computations on larger N-bit rather than a-bit operands at a disadvantage.
Because such an approach requires a large amount of memory to represent ciphertexts, computations modulo 2^N are not necessarily advantageous (nevertheless, the approach based on the MDF from [11] preserves its advantages if applied to implement RNS algorithms involving a smaller dynamic range size). All computations in RLWE-based cryptosystems (both in hardware and software) are based on some Number-Theoretic Transform (NTT) [18,25], since all ciphertexts are represented as polynomials in the cyclotomic ring. However, the NTT requires representing numbers in RNS moduli sets composed only of prime numbers, so computations modulo 2^N cannot be supported in general.
In this paper, we will study monotonic core functions and, in particular, their properties which make them suitable for efficient RNS number comparison. These newly discovered properties will pave the way for the construction of an RNS comparison algorithm based on the core function with the smallest possible range. The general context is that all computations of the new function can be viewed as computations in a new RNS in which one of the moduli of the original RNS is excluded. This could serve as a theoretical basis for NTT-based cryptographic algorithms requiring the use of prime moduli only, aiming at accelerating such algorithms as homomorphic comparison of numbers in encrypted form.
The main contributions of this paper are twofold. One is a new systematic design approach to number comparison in RNS, which is based on the newly defined minimum-range monotonic core function and which is applicable to an arbitrary general RNS moduli set. Its major advantage is that its hardware implementation is less complex and in some cases could also be faster than any previous similar design. We will formulate the conditions of the monotonicity of the core function (necessary to execute comparison), which will also ensure its minimal range (essential to obtain the best characteristics of the comparator). The second is our finding that the diagonal function, previously used for number comparison in RNS and reverse conversion, is actually nothing else but a special case of the core function with all coefficients set to 1.
This paper is organized as follows. Sections 2 and 3 present the basic properties of RNS and the core functions, respectively. Section 4 details the theoretical background of the core functions, allowing for number comparison in RNS. Performance evaluation and comparison against existing circuits are provided in Section 5. Finally, some conclusions and suggestions for future research are given in Section 6.

Properties of RNS
The RNS is defined by the set of n pairwise prime moduli {m_1, m_2, ..., m_n}, which are here arranged in increasing order (i.e., m_1 < m_2 < ... < m_n). The dynamic range of this RNS is M = ∏_{i=1}^{n} m_i, i.e., any a-bit integer X (a = ⌈log₂ M⌉) such that 0 ≤ X < M can be uniquely represented in RNS as X →_RNS {x_1, x_2, ..., x_n}, where x_i = |X|_{m_i} (also written x_i = X mod m_i) is the a_i-bit remainder of the integer division of X by m_i (a_i = ⌈log₂ m_i⌉).
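As an illustration, encoding an integer into RNS reduces to n independent modulo reductions. A minimal sketch in Python, assuming the illustrative 3-moduli set {5, 7, 8} (chosen here for brevity, not taken from the paper's tables):

```python
# A minimal sketch of RNS encoding for the illustrative moduli set {5, 7, 8}
# (pairwise prime, so the dynamic range is M = 5 * 7 * 8 = 280).
from math import prod

def to_rns(x, moduli):
    """Encode an integer 0 <= x < M as its residues x_i = |x|_{m_i}."""
    return [x % m for m in moduli]

moduli = [5, 7, 8]          # arranged in increasing order, as in the text
M = prod(moduli)            # dynamic range M = 280
print(to_rns(100, moduli))  # -> [0, 2, 4]
```

Each residue channel can be computed by an independent circuit, which is the source of the parallelism mentioned in the Introduction.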
To obtain the number X back from RNS in the positional form, the Chinese remainder theorem (CRT) can be used [1]: X = |∑_{i=1}^{n} B_i x_i|_M (1), where the set of n CRT constants defined by B_i = M_i · |M_i^{-1}|_{m_i}, with M_i = M/m_i (2), is called the orthogonal basis [3].
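The CRT reconstruction above can be sketched as follows; the orthogonal basis B_i = M_i · |M_i^{-1}|_{m_i} is computed with Python's modular inverse (`pow(x, -1, m)`, available since Python 3.8), again assuming the illustrative moduli set {5, 7, 8}:

```python
# A sketch of CRT reconstruction via the orthogonal basis [3],
# assuming pairwise prime moduli (illustrative set {5, 7, 8}).
from math import prod

def orthogonal_basis(moduli):
    """B_i = M_i * |M_i^{-1}|_{m_i}, where M_i = M/m_i."""
    M = prod(moduli)
    return [(M // m) * pow(M // m, -1, m) for m in moduli]

def from_rns(residues, moduli):
    """X = |sum_i B_i * x_i|_M (Equation (1))."""
    M = prod(moduli)
    B = orthogonal_basis(moduli)
    return sum(b * x for b, x in zip(B, residues)) % M

moduli = [5, 7, 8]
print(from_rns([0, 2, 4], moduli))  # -> 100
```

Note that each B_i is congruent to 1 modulo m_i and to 0 modulo every other modulus, which is exactly the property exploited below when the core function is evaluated from residues.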

Properties of the Core Function
The core function was defined [3] as C(X) = ∑_{i=1}^{n} ω_i ⌊X/m_i⌋ (3), or equivalently C(X) = ∑_{i=1}^{n} ω_i (X − x_i)/m_i (4), where ω_i, 1 ≤ i ≤ n, are integer constants which can be selected arbitrarily. For a given set of moduli, the core function can be characterized by: • the value C_M = C(M), which can be selected arbitrarily and usually such that C_M ≪ M; and • its range G = C_max − C_min, where C_max and C_min are respectively its maximum and minimum, which occur for some X_max and X_min [26].
The main attraction of the core function is that its range can vary and, similarly to C_M, it can be significantly smaller than M. Replacing X by M in Equation (4) yields C(M) = ∑_{i=1}^{n} ω_i M_i (5). Because |M_j|_{m_i} = 0 for i ≠ j, the constant coefficients ω_i can be determined by the equation ω_i ≡ |C_M · M_i^{-1}|_{m_i} (mod m_i) (6). Note that Equation (6), which also defines a residue class for each i, 1 ≤ i ≤ n, allows the coefficients ω_i to assume both positive and negative values. Now we will show how to obtain a practically useful formula to compute C(X) for any X. As M →_RNS {0, 0, ..., 0}, setting X = M in Equation (4) yields C_M = C(M) = ∑_{i=1}^{n} ω_i M_i (7). Because Equation (3) is not practical, the value of C(X) can be calculated by using the remainders of X in the CRT according to Equation (1): X = ∑_{i=1}^{n} B_i x_i − α·M (8), where α = ⌊∑_{i=1}^{n} B_i x_i / M⌋. Substituting this expression in Equation (4) and using Equation (7) leads to C(X) = ∑_{i=1}^{n} x_i C(B_i) − α·C_M (9). Now the most convenient formula for calculating C(X) is obtained by substituting Equation (9) in Equation (8), which yields C(X) = |∑_{i=1}^{n} x_i C(B_i)|_{C_M} (10).
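The residue-based formula of Equation (10) can be checked numerically against the definition of Equation (3). The sketch below, assuming the illustrative moduli set {5, 7, 8} and non-negative coefficients (for which 0 ≤ C(X) < C_M holds), verifies that the two agree for every X in the dynamic range:

```python
# A numeric check that C(X) = |sum_i x_i * C(B_i)|_{C_M} (Equation (10))
# matches C(X) = sum_i w_i * floor(X/m_i) (Equation (3)),
# for the illustrative set {5, 7, 8} and non-negative coefficients w_i.
from math import prod

def core_direct(x, moduli, w):
    """Equation (3): C(X) = sum_i w_i * floor(X/m_i)."""
    return sum(wi * (x // m) for wi, m in zip(w, moduli))

def core_from_residues(residues, moduli, w):
    """Equation (10), using C_M = sum_i w_i * M_i and the orthogonal basis B_i."""
    M = prod(moduli)
    C_M = sum(wi * (M // m) for wi, m in zip(w, moduli))
    B = [(M // m) * pow(M // m, -1, m) for m in moduli]
    return sum(x * core_direct(b, moduli, w) for x, b in zip(residues, B)) % C_M

moduli, w = [5, 7, 8], [1, 1, 1]   # all-ones coefficients: the diagonal function
for X in range(prod(moduli)):
    residues = [X % m for m in moduli]
    assert core_from_residues(residues, moduli, w) == core_direct(X, moduli, w)
```

The same check passes for the minimum-range choice {ω_i} = {0, ..., 0, 1} discussed in Section 4.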

Monotonic Properties of the Core Function
It has been shown [5,27] that a core function, generally, is not monotonic. Now we will determine the necessary conditions for its monotonicity, which would make it useful for RNS number comparison. First, let us express C(X − 1) as a function of C(X) for 0 < X < M. According to Equations (4) and (7), C(X − 1) = ∑_{i=1}^{n} ω_i (X − 1 − |X − 1|_{m_i})/m_i (11). Because for any x_i, 1 ≤ i ≤ n, the following condition is met: |X − 1|_{m_i} = x_i − 1 if x_i > 0, and |X − 1|_{m_i} = m_i − 1 if x_i = 0 (12), applying Equations (7) and (12) to Equation (11) yields C(X − 1) = C(X) − ∑_{i: x_i = 0} ω_i (13). In other words, the value of the core function for the preceding value of X (i.e., X − 1) is equal to the value of the core function for X decreased by the sum of all those coefficients ω_i for which x_i = |X|_{m_i} = 0. The latter observation immediately leads to the following property.

Property 1.
The core function is monotonic if and only if all its coefficients ω i are non-negative, 1 ≤ i ≤ n.
Thus, Property 1 decisively limits the design space exploration to only those core functions which could be suitable for RNS number comparison.
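Property 1 and the recurrence for C(X − 1) above can be verified exhaustively for a small moduli set; the sketch below assumes the illustrative set {5, 7, 8}:

```python
# Exhaustive check of the recurrence C(X-1) = C(X) - sum of those w_i for
# which x_i = |X|_{m_i} = 0, and of Property 1, for the set {5, 7, 8}.
from math import prod

def core(x, moduli, w):
    return sum(wi * (x // m) for wi, m in zip(w, moduli))

moduli = [5, 7, 8]
M = prod(moduli)

w = [1, 1, 1]                       # non-negative coefficients: monotonic
assert all(core(X, moduli, w) >= core(X - 1, moduli, w) for X in range(1, M))
for X in range(1, M):
    drop = sum(wi for wi, m in zip(w, moduli) if X % m == 0)
    assert core(X - 1, moduli, w) == core(X, moduli, w) - drop

w = [-1, 1, 1]                      # a negative coefficient breaks monotonicity
assert any(core(X, moduli, w) < core(X - 1, moduli, w) for X in range(1, M))
print("Property 1 confirmed on", moduli)
```

For instance, with w = [-1, 1, 1] already X = 5 gives C(5) = -1 < C(4) = 0, illustrating the "only if" direction.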
For {ω_1, ..., ω_{n−1}, ω_n} = {1, ..., 1, 1}, the range of the core function C(X) equals C_M = ∑_{i=1}^{n} M_i; it is nothing else but the sum of quotients SQ introduced in 1993 [6], also for number comparison. Consequently, the diagonal function D(X) of [6] is nothing else but the special case of the core function C(X) with C_M = SQ: a fact which has remained unnoticed for several years until now.
For {ω_1, ..., ω_{n−1}, ω_n} = {0, ..., 0, 1}, we obtain the monotonic core function with the minimum range G equal to C_M = M_n, which is obtained by setting ω_i = 0 for 1 ≤ i ≤ n − 1 and ω_n = 1 in Equation (3): C_{m_n}(X) = ⌊X/m_n⌋ (14), i.e., it is nothing else but the quotient obtained by dividing X by the largest modulus m_n, and such that its C_M is the smallest compared to any other C_M = M_i, i < n. Henceforth, the above function will be called the Minimum-Range Monotonic Core Function (MMCF). Note that:
• the core function cannot be strictly monotonic because C_M < M; hence some other sufficiently large parameter must be used for comparison to resolve the case of C(X) = C(Y) for some compared numbers X and Y; and
• for any C_M < M_n the number comparison becomes impossible, because M_n is the minimal possible range for core functions with ω_i ≥ 0 according to Equation (5), so that the number of combinations available to differentiate numbers is only m_n · C_M < M.
Finally, we compare our results obtained in this section against those of [8], where a new class of monotonic functions was proposed for number comparison and residue-to-binary conversion. A closer look reveals that the function F_I(X) proposed in that paper (where I is a non-empty subset of the indices 1 ≤ i ≤ n) is nothing else but the core function with all coefficients ω_i set to 1 for i ∈ I. Besides, the theory of the functions F_I(X) presented in [8] has the following limitations.
(1) No proof is given that the function F_I is indeed monotonic. We have formally proven (Property 1) that all coefficients ω_i must be non-negative to guarantee that any core function C(X) is monotonic.
(2) It is assumed that one modulus must be an even one of the form 2^k, although no justification for this assumption is given. For the class of core functions considered here, the set of RNS moduli can be arbitrary (i.e., all moduli can be odd as well).
(3) No suggestions are given on how to choose the function F_I to obtain the most efficient comparator. We have shown how to construct the MMCF.
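The minimal-range property of the MMCF discussed above can be illustrated numerically: for C_{m_n}(X) = ⌊X/m_n⌋, the pair (C_{m_n}(X), x_n) determines X uniquely, giving exactly m_n · M_n = M distinguishable combinations — which is precisely why no C_M < M_n can work. A sketch assuming the illustrative set {5, 7, 8}:

```python
# Why C_M = M_n is the minimal usable range: for the MMCF, the pair
# (floor(X/m_n), x_n) is a bijection on [0, M), so m_n * C_M >= M is needed.
# Illustrative moduli set {5, 7, 8}.
from math import prod

moduli = [5, 7, 8]
M = prod(moduli)            # 280
m_n = moduli[-1]            # largest modulus, 8
M_n = M // m_n              # range of the MMCF, 35

pairs = {(X // m_n, X % m_n) for X in range(M)}
assert len(pairs) == M      # all M pairs are distinct: comparison is possible
print(m_n * M_n == M)       # -> True: exactly M combinations, no slack
```

Any smaller range C would yield at most m_n · C < M pairs, so at least two numbers in the dynamic range would become indistinguishable.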

New Comparison Algorithm and Its Hardware Implementation
Because C_{m_n}(X) of Equation (14) can be computed using Equation (10), the comparison of RNS integers X and Y can be formally summarized as the following algorithm. Figure 1 shows the hardware implementation of Algorithm 1. The n-operand modular adder (MOMA) can be implemented, e.g., using the methods of [28,29]. Two ordinary a_{M_n}- and a_n-bit number comparators (a_{M_n} = ⌈log₂ M_n⌉), which work in parallel as shown, can be designed, e.g., according to [30] (pp. 45-47). Obviously, the basic principle of Algorithm 1 and its hardware implementation are the same as for the RNS comparators proposed in [9,11]: they only differ in the modulus used by the n-operand MOMA, which generates the equivalent representation of the compared numbers, sufficient to perform comparison. The modulus M_n proposed here is the smallest amongst them, and this is the main contribution here.
Example 1. For the compared numbers X = 5 and Y = 6:
C_{m_n}(5) = |0 · 7 + 5 · 15 + 5 · 13|_35 = |140|_35 = 0
C_{m_n}(6) = |1 · 7 + 6 · 15 + 6 · 13|_35 = |175|_35 = 0
Obviously, because C_{m_n}(5) = C_{m_n}(6) = 0, the result of the comparison x_3 = 5 < y_3 = 6 is needed to conclude the comparison.

Algorithm 1:
Comparison of RNS numbers using the core function with the minimal range.
Step 1. Compute C m n (X) and C m n (Y).
Step 2. Compare the values of C_{m_n}(X) and C_{m_n}(Y): (i) if C_{m_n}(X) ≠ C_{m_n}(Y), the result of the a_{M_n}-bit comparison determines whether X > Y or X < Y; (ii) otherwise (C_{m_n}(X) = C_{m_n}(Y)), the relation between X and Y is determined by the a_n-bit comparison of x_n and y_n.

General Analysis
Here, we will compare the new comparators against their most efficient known counterparts: those based on the diagonal function [7], on the modified diagonal function [11], and on the CRT [1]. (Because it was shown in [9] that the CRT-based version of Case (a) of Figure 3.1 in [7] was actually the fastest, we will consider only the latter version for comparison.) The diagonal function- and CRT-based comparators have structures similar to the one proposed here in Figure 1. The only differences are the following: (i) for the diagonal function and for the CRT, the n-operand MOMA mod SQ and mod M is used, respectively; and (ii) for the CRT, the positional comparison consists only of a simple a-bit comparator. Because in all three cases an n-operand MOMA with a varying modulus is the main building block, the impact of the size of the modulus on the hardware complexity will be analyzed, using the characteristics summarized in Table 1.

Comparator Type         Modulus   Operand Size [bits]
CRT-based [1]           M         a_M
Diagonal function [7]   SQ        a_SQ
MDF [11]                2^N       N
MMCF (proposed)         M_n       a_{M_n}

First, we compare the sizes of the operands handled by RNS number comparators built using the standard CRT-based implementation and the MMCF. By setting M_n = M/m_n and taking the logarithms of both sides, we obtain log₂ M_n = log₂ M − log₂ m_n, which leads to the inequality a_M − a_n ≤ a_{M_n} ≤ a_M − ⌊log₂ m_n⌋.
In particular, if m_n = 2^k, then a_{M_n} = a_M − k. Clearly, the bigger the largest modulus m_n, the relatively shorter the operands of the MOMA (a_{M_n} vs. a_M), and the more hardware savings ((n − 2) · (a_M − a_{M_n}) fewer FAs in the CSA tree alone of the MOMA) are observed compared to the CRT-based implementation. No savings are observed in the positional comparison, because the size of the a-bit comparator for the CRT and the total size of the two comparators (a_{M_n}- and a_n-bit) for the MMCF are similar, although the latter requires a few extra final gates.
To compare the sizes of the operands handled by comparators built using the diagonal function and the MMCF, notice that SQ/M_n = ∑_{i=1}^{n} m_n/m_i, from which the lower and upper bounds on the number of bits saved in our design, a_red = a_SQ − a_{M_n}, can be obtained (Equation (18)). In summary, because n ≥ 2, obviously SQ = ∑_{i=1}^{n} M_i > M_n. The resulting inequality a_{M_n} < a_SQ implies the following general observations.
• The MOMA mod M_n operates on shorter operands than the MOMA mod SQ, so that both the internal carry-save adders (CSAs) and the final CPAs of the former are shorter by a_SQ − a_{M_n} bits; therefore, the hardware savings in adders are about n(a_SQ − a_{M_n}) FAs, i.e., they grow with both the number of moduli n and the size of the largest modulus m_n (see (18)).
• Up to a_SQ − a_{M_n} fewer outputs from each of the n input look-up tables (LUTs) (usually implemented using ROMs) imply less area due to connections.
• The selection of the largest modulus m_n for the extra comparison resolving the ambiguity, which occurs if C_{m_n}(X) = C_{m_n}(Y), does not affect the delay of the whole RNS comparator, because it can be done in parallel with the comparison of C_{m_n}(X) and C_{m_n}(Y) (cf. Figure 1).
• Some delay savings can be observed for any moduli set for which ⌈log₂ a_{M_n}⌉ < ⌈log₂ a_SQ⌉, because all fast carry-propagate adders (CPAs) used by the MOMA mod M_n have a few gate levels fewer than their counterparts used by the MOMA mod SQ. Examples of such RNS moduli sets will be given below.
As for the modified diagonal function, we will see that N is always significantly larger than a_{M_n}, which makes the new RNS comparators of interest for some cryptographic applications mentioned in the Introduction.

Complexity Analysis for Sample RNS Moduli Sets
To reveal the differences between the sizes of the MOMAs used in various RNS number comparators (hence, the hardware savings) depending on the number of moduli n and the dynamic range, the parameters of several sample RNS moduli sets S_{n,i} are listed in Table 2 (where n is the number of moduli and i is the number of a particular n-moduli set). Note that amongst them, the sets S_{6,1} and S_{11,1} are the maximal sets of the largest relatively prime moduli of size a_i ≤ 4 and a_i ≤ 5, respectively. The comparison of the comparators based on the diagonal function against their CRT-based counterparts reveals the following: (i) Any significant advantages of the diagonal function (a_M − a_SQ ≥ 4) are observed only for a few moduli sets which, additionally, are only the smallest moduli sets composed of n = 3 or 4 moduli: S_{3,1}, S_{3,2}, S_{4,3}, and S_{4,4}. (ii) For n ≥ 6, the difference (if any) between a_M and a_SQ becomes insignificant, which implies that the diagonal function actually does not offer any meaningful advantages over the standard CRT-based implementation of the RNS number comparator.
The comparison of the MMCF-based comparators proposed here against their counterparts based on the diagonal function reveals the following: (i) Should the even modulus m_n = 2^k be used, for n ≥ 7, at least (k − 1)n FAs are saved in our design compared to its counterpart based on the diagonal function. The inspection of the last column SQ/M_n of Table 2 reveals that the upper bound on the MOMA operand reduction (cf. Equation (18)) is attained for most of the sample RNS moduli sets listed; the cases of lower bounds are distinguished by italics. (ii) For all moduli sets for which ⌈log₂ a_{M_n}⌉ < ⌈log₂ a_SQ⌉ (marked in bold in the column of a_{M_n}), the new comparator is also faster, because it requires one stage less of CPA circuitry. For instance, for S_{6,1} the comparator based on the diagonal function uses the adder mod 493,189 operating on 19 bits, so that the delay of the CPA used is 12 gate delays; on the other hand, the MMCF-based comparator uses the adder mod 45,045 operating on 16 bits, so that the delay of the CPA used is 8 gate delays. (iii) Finally, notice that the data of Table 2 show why we failed to find any closed formula to evaluate the upper bound of a_SQ − a_{M_n} which would be simpler than (18). Should it depend, e.g., on log₂ m_n alone, notice that although for S_{6,4} = {7, 9, 11, 13, 31, 32} we have log₂ m_6 = log₂ 32 = 5 and a_SQ − a_{M_n} = 4 (which is quite close), for S_{3,1} = {63, 65, 256} a significant difference occurs: log₂ m_3 = log₂ 256 = 8 and a_red = a_SQ − a_{M_n} = 4.
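The operand sizes quoted for S_{6,1} can be reproduced directly; the short check below recomputes SQ and M_n and their bit widths, matching the adders mod 493,189 (19 bits) and mod 45,045 (16 bits) mentioned above:

```python
# Recomputing the MOMA operand sizes for S_6,1 = {5, 7, 9, 11, 13, 16}:
# diagonal function adds mod SQ, the MMCF adds mod M_n = M/m_n.
from math import prod, ceil, log2

S61 = [5, 7, 9, 11, 13, 16]
M = prod(S61)                   # dynamic range, 720720 > 2^20
SQ = sum(M // m for m in S61)   # sum of quotients used by the diagonal function
M_n = M // S61[-1]              # range of the MMCF

print(SQ, M_n)                  # -> 493189 45045
print(ceil(log2(SQ)), ceil(log2(M_n)))  # -> 19 16
```

The 3-bit operand reduction is what shortens every CSA and CPA of the MOMA, and ⌈log₂ 16⌉ < ⌈log₂ 19⌉ is the condition that also makes the MMCF version's CPAs one stage faster.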
To give the reader some idea of the size difference between the N of the counterparts based on the modified diagonal function [11] and the a_{M_n} of the MMCF-based comparators proposed here, we have included the last column N in Table 2. The size of the even modulus 2^N used by the modified diagonal function of [11] is, for all cases considered, larger than a_{M_n} by from 7 up to 15 bits (for S_{6,5}).

Detailed Complexity Evaluation
Every RNS comparator considered here has the same general structure as shown in Figure 1, whose basic blocks are:
• L(l, a)—a look-up table of 2^l locations with a-bit output word length (with a time delay denoted t_{L(l,a)});
• MOMA(n, a)—a multi-operand modular adder (MOMA) for n operands with a-bit word length (with a time delay denoted t_{MOMA(n,a)}); and
• C(a)—a binary comparator of a-bit integers (with a time delay denoted t_{C(a)}).
We have made the following assumptions regarding the basic building blocks, and we will follow the same notation as previously used in [9]. Similarly as in [7,9], the complexity of various implementations is evaluated in terms of the number of bits of look-up tables (L), the number of full adders (FA), and the time delay (TD). To evaluate the time delay, the delay ∆ of a NAND gate is used as a unit, and it is assumed that t_FA = t_MUX = 2∆ and t_XOR = 1∆.
However, unlike in [9], the same MOMA from [32] will be used here in all complexity evaluations; it is actually faster than the MOMA from [28] used in [9], although at a small extra hardware cost. For the readers' convenience, its block scheme is detailed with delay evaluations in Figure 2. It is seen that the n-operand CSA tree of this MOMA produces a pair of vectors S and C, which are partitioned into two pairs of the most significant bits (MSBs) and the least significant bits (LSBs), S = {S_H, S_L} and C = {C_H, C_L}, such that max{S_L + C_L} < M. The actual exact total numbers of the bits in S and C, as well as the upper bound on the number of MSBs which could make inputs to the MSB converter (max{h_s + h_c}), can be found in Table 3. The MSB converter is nothing else but an L(h_s + h_c, 2a) look-up table, which generates |S_H + C_H|_M. The delay of the whole MOMA, in which CLAs are used to implement CPAs, equals t_{MOMA(n,a)} (Equation (20)), where θ(n) denotes the minimal number of stages of a CSA tree that processes n input operands; some sample values of θ(n) are listed in Table 4.
Example 2. Consider the 6-moduli set S_{6,1} = {5, 7, 9, 11, 13, 16}, which is the maximal set of the largest relatively prime 4-bit moduli and all of whose basic parameters can be found in Table 2. Its dynamic range M > 2^20 is sufficient for many DSP applications. We will evaluate the performance of two different comparator versions. Table 5 details the characteristics of all the basic blocks used to build these comparators as well as the delays of the whole comparators. Note that the delay of both the input look-up tables (LUTs) and the MOMA is counted twice, because we assume that for each pair of compared numbers their positional values or their core functions are computed serially by the same circuitry.
The delays of the MOMAs used to build the comparators based on the diagonal function and the MMCF (calculated according to Equation (20)) are t_{MOMA-DF} = 28∆ and t_{MOMA-MMCF} = 24∆, respectively, so that the total delays of the two comparators are 2(t_{L(4,19)} + t_{MOMA-DF}) + 10 = 2(5 + 28) + 10 = 76∆ and 2(t_{L(4,16)} + t_{MOMA-MMCF}) + 10 = 2(5 + 24) + 10 = 68∆, respectively. Clearly, the data of Table 5 show that the new comparator is faster, as it introduces a delay smaller by 8∆. It is also less complex, as it uses fewer FAs. In general, smaller delay can be observed for any moduli set for which at least one of the conditions below holds.
(i) Each of the pair of a-bit CPAs of a MOMA is faster, which occurs if ⌈log₂ a_{M_n}⌉ < ⌈log₂ a_SQ⌉. Besides the moduli set S_{6,1} considered above, the inspection of the columns a_SQ and a_{M_n} of Table 2 reveals that several other moduli sets meet this condition. (ii) A relatively rare case, when the final a-bit comparator is faster, occurs for sizes of practical interest, e.g., if a_{M_n} ≤ 24 and a_SQ > 24, when the delay is reduced by 4∆.
In Table 2, only the set S_{8,1} meets this condition.

Final Remarks
The complexity evaluation and comparison of the new comparators against their most efficient known counterparts presented in this section allow us to formulate the following conclusions. To allow number comparison, the new comparators based on the MMCF use the smallest modulus of all the circuits considered. As a result, the operands added by the MOMA are also the shortest. Because the MOMA is the principal contributor to the complexity of any comparator, the presented complexity analysis shows that the new comparators are the least complex. Some cases were also indicated in which the new comparators are also faster than their counterparts. The data presented in Table 2 reveal the significant impact of selecting the largest possible modulus m_n on the improved performance of the new comparators.
To benefit maximally from handling RNS data by a set of independent residue datapath channels mod m_i (1 ≤ i ≤ n), it is desirable that the latter are balanced as much as possible, i.e., that they introduce similar delays and consume similar amounts of hardware resources. Particularly advantageous are moduli sets in which the largest modulus m_n is an even one of the type 2^p. This is because, even though the modulus 2^p is larger by a few bits than all the remaining odd moduli, the delay and hardware complexity of the residue datapath channel mod 2^p could still be comparable to those for the largest odd moduli. The latter has already been observed for the special moduli sets composed only of low-cost moduli of the forms 2^k ± 1 and 2^p [33][34][35]. Indeed, here we have shown that selecting an even modulus 2^p as the largest one is also more advantageous to build the efficient comparators proposed here, and for arbitrary RNS moduli sets, including those containing odd moduli other than those of the form 2^k ± 1 (at least p · (n + 1) FAs are saved).

Conclusions
In this paper, a new general method for the comparison of numbers represented using the residue number system (RNS) was proposed. The method is based on using the core function, for which it was shown that it must be monotonic and use only non-negative coefficients to be suitable for RNS number comparison. The conditions of the monotonicity of the core function were formulated, which also ensure the minimal range of the core function (essential to obtain the best characteristics of the comparator). It was found that the Minimum-Range Monotonic Core Function (MMCF) has only one coefficient set to 1 (corresponding to the largest modulus), whereas all other coefficients are set to 0. It was also shown that the already known diagonal function, previously suggested to implement RNS number comparison and other RNS non-modular operations, is nothing else but the special case of the core function with all coefficients set to 1. Performance evaluation suggests that the new comparator uses less hardware and in some cases also introduces smaller delay than its counterparts based on the diagonal function. It is likely that the hardware savings could result in smaller power consumption as well. Some new, previously undisclosed limitations of the diagonal function were also revealed. The new comparator could be of interest in all applications in which the use of the even modulus 2^N must be excluded to implement comparison, as in some recent cryptographic applications. We believe that the presented study of monotonic core functions will deepen the understanding of their properties and hence will allow applying the presented theory to improve implementations of other non-modular RNS operations, thus contributing to extending the applicability of RNS in different fields.