On Linear Coding over Finite Rings and Applications to Computing

This paper presents a coding theorem for linear coding over finite rings, in the setting of the Slepian–Wolf source coding problem. This theorem covers the corresponding achievability theorems of Elias (IRE Conv. Rec. 1955, 3, 37–46) and Csiszár (IEEE Trans. Inf. Theory 1982, 28, 585–592) for linear coding over finite fields as special cases. In addition, it is shown that, for any finite set of correlated discrete memoryless sources, there always exists a sequence of linear encoders over some finite non-field rings which achieves the data compression limit, the Slepian–Wolf region. Hence, the optimality problem regarding linear coding over finite non-field rings for data compression is settled in the affirmative with respect to existence. As an application, we address the problem of source coding for computing, where the decoder is interested in recovering a discrete function of the data generated and independently encoded by several correlated i.i.d. random sources. We propose linear coding over finite rings as an alternative solution to this problem. Results of Körner–Marton (IEEE Trans. Inf. Theory 1979, 25, 219–221) and Ahlswede–Han (IEEE Trans. Inf. Theory 1983, 29, 396–411, Theorem 10) are generalized to the case of encoding (pseudo) nomographic functions (over rings). Since a discrete function with a finite domain always admits a nomographic presentation, we conclude that both generalizations apply universally to encoding all discrete functions with finite domains. Based on these results, we demonstrate that linear coding over finite rings strictly outperforms its field counterpart in terms of achieving better coding rates and reducing the required alphabet sizes of the encoders for infinitely many discrete functions.


I. INTRODUCTION
The problem of source coding for computing considers the following scenario.
The region R[g] is called the achievable coding rate region for computing g. A rate tuple R ∈ ℝ^s is said to be achievable for computing g (or simply achievable) if and only if R ∈ R[g]. A region R ⊆ ℝ^s is said to be achievable for computing g (or simply achievable) if and only if R ⊆ R[g].
If g is the identity function, it is obvious that the described computing problem is equivalent to the Slepian–Wolf (SW) source coding problem. Hence, R[g] is the SW region [4], namely
{ R ∈ ℝ^s : Σ_{i∈T} R_i ≥ H(X_T | X_{T^c}), ∀ ∅ ≠ T ⊆ S },
where T^c is the complement of T in S and X_T (X_{T^c}) is the random variable array [X_i]_{i∈T} ([X_i]_{i∈T^c}). In the original SW source coding scenario, the structure of the encoders is left unspecified; the corresponding mappings are chosen randomly among all feasible mappings. In the single source scenario, Elias [1] showed that linear coding over a finite field (LCoF), where the X_i's and Ω are embedded as subsets of this field and the φ_i's are linear mappings, is sufficient to achieve the best coding rate. This idea was later generalized to the multiple sources scenario by Csiszár [2]. As a consequence of [2], any rate tuple in the SW region is achievable with LCoF.
Generally speaking, R[X_1, X_2, · · · , X_s] ⊆ R[g] for an arbitrary discrete function g. Making use of Elias' theorem on binary linear codes [1], Körner-Marton [3] shows that R[⊕_2] ("⊕_2" is the modulo-two sum) contains the region
R_{⊕_2} = { (R_1, R_2) : R_i ≥ H(X_1 ⊕_2 X_2), i = 1, 2 }.
This region is not contained in the SW region for certain distributions. In other words, R[⊕_2] ⊋ R[X_1, X_2].
Combining the standard random coding technique and Elias' result, [5] shows that R[⊕_2] can be strictly larger than the convex hull of the union R[X_1, X_2] ∪ R_{⊕_2}. However, the functions considered in these works are relatively simple.
Taking a polynomial approach, [6], [7] generalize the result of Ahlswede-Han [5, Theorem 10] to the most general scenario. Making use of the fact that a discrete function is essentially a polynomial function [8, p. 93] over some finite field, an achievable region is given for computing an arbitrary discrete function. Such a region contains, and can be strictly larger than (depending on the precise function and distribution under consideration), the SW region. Conditions under which R[g] is strictly larger than the SW region are presented in [9] and [6] from different perspectives.
We observe that the linear coding (LC) technique over fields by Elias and Csiszár is a key element in the results accounted for above. This observation inspires our study of searching for alternative encoding methods (coding techniques). This paper focuses on linear coding over finite rings (LCoR), which serves as an alternative technique for the computing problem. We will show that this approach is better in terms of achievable coding rates and of reducing the alphabet sizes of the encoders, compared to its field counterpart. In Section III, we present a region (4) achieved with LC over several finite rings, namely all encoders are linear mappings over rings (see Definition II.9). It is proved that this region specializes to the SW region if all of these rings are fields. Thus, the results of [1], [2] become special cases of ours in this sense. In addition, Section V shows that LCoR can achieve the best coding rate in the single source scenario with certain rings (e.g., Z_4) that are not necessarily fields. To illustrate applications to computing, Problem 1 is considered in Section VI, where a generalized theorem of Körner-Marton [3] is given. To conclude Section VI, Example VI.5 is constructed, exhibiting the advantages of this LC technique over finite rings. In this example, LCoR achieves a strictly larger region compared to the one obtained with LCoF in the sense of Körner-Marton [3]. Additionally, the encoders using LCoR require strictly smaller alphabet sizes than when using LCoF.
In addition to the fact that LCoR outperforms LCoF, Example VI.5 also points to another issue. The results of Körner-Marton [3] and Ahlswede-Han [5, Theorem 10] depend on LCoF and apply to linear functions over fields only. Therefore, their methods do not apply directly to discrete functions that are neither linear nor can be linearized over any finite field (e.g., g from Example VI.5). Although the polynomial approach [6], [7], proposed by the authors of the present work, works universally for any discrete function, it will possibly require larger alphabet sizes. More importantly, this approach requires that the polynomial presentation of the function is available in accessible form. Unfortunately, this can turn out to be a very strict requirement, even though such a presentation is proved to always exist. On the contrary, such a discrete function (e.g., g from Example VI.5) could simply be a linear function over a certain finite ring. Therefore, LCoR offers an alternative solution. As a matter of fact, this alternative solution is rather promising for, at least, functions like g from Example VI.5.
Conceptually speaking, LCoR is in fact a generalization of the LC technique proposed by Elias and Csiszár (LCoF), since a field must be a ring. However, as seen in Section IV, the analysis of the decoding error for the ring version is in general substantially more challenging than in the case of the field version. Our analysis crucially relies on the concept of an ideal of a ring. A field contains no non-trivial ideal other than itself. Because of this special property of fields, our general argument for finite rings, deployed in later sections, reduces to a simple one if only finite fields are considered.
Even though our analysis for the ring scenario is more complicated than the one for fields, linear encoders working over finite rings are in general considerably easier to implement in practice than those working over finite fields. This is because the implementation of finite field arithmetic can be quite demanding. Normally, a finite field is given by its polynomial representation, and operations are carried out as polynomial operations followed by reduction via the polynomial long division algorithm. On the contrary, implementing the arithmetic of many finite rings is a straightforward task. For instance, the arithmetic of the ring Z_q of integers modulo q, for any positive integer q, is simply integer arithmetic modulo q.
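To make this contrast concrete, the following sketch (ours, for illustration only) implements the arithmetic of Z_q next to that of F_4; the field case needs a polynomial representation and reduction modulo an irreducible polynomial, while the ring case is plain modular arithmetic.

```python
# Arithmetic in Z_q is plain modular arithmetic.
def zq_add(a, b, q):
    return (a + b) % q

def zq_mul(a, b, q):
    return (a * b) % q

# By contrast, F_4 arithmetic needs a polynomial representation: elements
# are bit masks of polynomials over Z_2, multiplied carry-lessly and then
# reduced modulo the irreducible polynomial x^2 + x + 1.
F4_MODULUS = 0b111  # x^2 + x + 1

def f4_add(a, b):
    return a ^ b  # coefficient-wise addition mod 2

def f4_mul(a, b):
    prod = 0
    while b:  # carry-less (polynomial) multiplication
        if b & 1:
            prod ^= a
        a <<= 1
        b >>= 1
    for deg in range(prod.bit_length() - 1, 1, -1):
        if prod & (1 << deg):  # reduce modulo x^2 + x + 1
            prod ^= F4_MODULUS << (deg - 2)
    return prod

# In Z_4, the element 2 has no inverse (2*2 = 0); in F_4, every non-zero
# element is invertible, as befits a field.
assert all(zq_mul(2, b, 4) != 1 for b in range(4))
assert all(any(f4_mul(a, b) == 1 for b in range(1, 4)) for a in range(1, 4))
```

The assertions at the end illustrate the structural difference exploited throughout the paper: Z_4 is a ring that is not a field, whereas F_4 is the field of order 4.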

II. RINGS, IDEALS AND LINEAR MAPPINGS
In this section we introduce some fundamental concepts from abstract algebra. Readers who are already familiar with this material may still wish to skim it to identify our notation.
Definition II.1. The tuple [R, +, ·] is called a ring if the following criteria are met:
1) [R, +] is an Abelian group, whose identity is denoted by 0 (the zero);
2) "·" is associative, and there exists a multiplicative identity 1 ∈ R, i.e., 1 · a = a · 1 = a for all a ∈ R;
3) the distributive laws hold: a · (b + c) = a · b + a · c and (a + b) · c = a · c + b · c for all a, b, c ∈ R.
We often write R for [R, +, ·] when the operations (operators) considered are known from the context. The operator "·" is usually indicated by juxtaposition, ab for a · b, for all a, b ∈ R. Proposition II.2. Given s rings R_1, R_2, · · · , R_s, for any non-empty set T ⊆ {1, 2, · · · , s}, the Cartesian product R_T = ∏_{i∈T} R_i forms a ring with respect to the component-wise addition and multiplication. Remark 1. In Proposition II.2, it can easily be seen that [0, 0, · · · , 0] and [1, 1, · · · , 1] are the zero and the multiplicative identity of R_T, respectively.
Definition II.3. A non-zero element a of a ring R is said to be invertible if and only if there exists b ∈ R such that ab = ba = 1. The element b is called the inverse of a, denoted by a^{−1}. An invertible element of a ring is called a unit.
Remark 2. It can be proved that the inverse of a unit is unique. By definition, the multiplicative identity is the inverse of itself.
Let R* = R \ {0}. The ring [R, +, ·] is a field if and only if R* is an Abelian group with respect to the multiplicative operation "·"; in other words, all non-zero elements of R are invertible. All fields are commutative rings.¹ Z_q is a field if and only if q is a prime. Up to isomorphism, the finite field of any given order is unique [10, p. 549]. We use F_q to denote this "unique" field of order q. Necessarily, q is a power of a prime.
¹ Sometimes a ring without a multiplicative identity is considered. Such a structure has been called a rng. We consider rings with multiplicative identities in this paper. However, similar results remain valid when considering rngs instead. Although we will occasionally comment on such results, they are not fully considered in the present work.
Remark 3. Wedderburn's little theorem guarantees commutativity for a finite ring if all of its non-zero elements are invertible. Hence, a finite ring is either a field or at least one of its elements has no inverse. However, a finite commutative ring is not necessarily a field; e.g., Z_q is not a field if q is not a prime.
A non-empty subset I of a ring R is called a left ideal of R, denoted by I ≤_l R, if:
1) [I, +] is a subgroup of [R, +];
2) ∀ x ∈ I and ∀ r ∈ R, r · x ∈ I.

If condition 2) is replaced by
3) ∀ x ∈ I and ∀ r ∈ R, x · r ∈ I, then I is called a right ideal of R, denoted by I ≤_r R. {0} is a trivial left (right) ideal, usually denoted by 0.
The cardinality |I| is called the order of a finite left (right) ideal I.
Remark 5. Let {a_1, a_2, · · · , a_n} be a non-empty set of elements of some ring R. It is easy to verify that ⟨a_1, a_2, · · · , a_n⟩_r = { Σ_{i=1}^n a_i r_i : r_i ∈ R, ∀ 1 ≤ i ≤ n } is a right ideal. Furthermore, ⟨a_1, a_2, · · · , a_n⟩_r = R if a_i is a unit for some 1 ≤ i ≤ n.
It is well-known that if I ≤ l R or I ≤ r R, then R is divided into disjoint cosets which are of equal size (cardinality). For any coset J, J = x + I = {x + y|y ∈ I}, ∀ x ∈ J. The set of all cosets forms a quotient group, denoted by R/I. See [8, Ch. 1.6 and Ch. 2.9] for more details.
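These notions can be made concrete by brute force for a small ring (a sketch of ours; since Z_6 is commutative, left and right ideals coincide). The code below enumerates all ideals of Z_6 and the cosets of one of them, illustrating the coset partition just described.

```python
def ideals_of_zn(n):
    """Brute-force enumeration of the ideals of Z_n: subsets containing 0,
    closed under addition and under multiplication by any ring element."""
    ring = range(n)
    found = []
    for bits in range(1 << n):
        I = {x for x in ring if bits >> x & 1}
        if 0 not in I:
            continue
        if all((x + y) % n in I for x in I for y in I) and \
           all((r * x) % n in I for x in I for r in ring):
            found.append(frozenset(I))
    return found

ideals = ideals_of_zn(6)
# one ideal per divisor of 6: {0}, {0,3}, {0,2,4} and Z_6 itself

# the cosets of I = {0,2,4} partition Z_6 into classes of equal size,
# forming the quotient group Z_6/I (of order 2 here)
I = {0, 2, 4}
cosets = {frozenset((x + y) % 6 for y in I) for x in range(6)}
```

Note that {0, 2, 4} is exactly the ideal ⟨2, 4⟩ generated as in Remark 5, and that every coset has the same cardinality as I.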
Definition II.9. A mapping f : R^n → R^m given as
f : [x_1, x_2, · · · , x_n]^t ↦ [ Σ_{j=1}^n a_{1,j} x_j, Σ_{j=1}^n a_{2,j} x_j, · · · , Σ_{j=1}^n a_{m,j} x_j ]^t
(respectively, with x_j a_{i,j} in place of a_{i,j} x_j), where a_{i,j} ∈ R for all feasible i and j, is called a left (right) linear mapping over the ring R. If m = 1, then f is called a left (right) linear function over R.
From now on, left linear mapping (function) or right linear mapping (function) are simply called linear mapping (function). This will not lead to any confusion since the intended use can usually be clearly distinguished from the context.
In matrix form, a left linear mapping can be written as f(x) = Ax, where A is an m × n matrix over R and [A]_{i,j} = a_{i,j} for all feasible i and j. A is called the coefficient matrix. It is easy to prove that a linear mapping is uniquely determined by its coefficient matrix, and vice versa. The linear mapping f is said to be trivial, denoted by 0, if A is the zero matrix, i.e., [A]_{i,j} = 0 for all feasible i and j.
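A small numerical sketch (ours, with an arbitrarily chosen matrix) of Definition II.9 over Z_4: the mapping is evaluated through its coefficient matrix, and the matrix is recovered by evaluating the mapping on the standard basis, illustrating that the two determine each other uniquely.

```python
def linmap(A, x, q):
    """Left linear mapping f(x) = Ax over Z_q."""
    return [sum(a * xi for a, xi in zip(row, x)) % q for row in A]

A = [[1, 2, 3],
     [0, 1, 2]]  # a 2 x 3 coefficient matrix over Z_4
x = [3, 1, 2]
y = linmap(A, x, 4)  # [3, 1]

# evaluating f on the standard basis recovers the columns of A,
# so the mapping and its coefficient matrix determine each other
basis = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
columns = [linmap(A, e, 4) for e in basis]
```

Here `columns` equals the column list of A, read off one basis vector at a time.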
Let A be an m × n matrix over a ring R and f(x) = Ax, ∀ x ∈ R^n. For the system of linear equations f(x) = 0, let S(f) = { x ∈ R^n : f(x) = 0 } denote its solution set. We conclude this section with a lemma regarding the cardinalities of R^n and S(f).
Lemma II.10. For a finite ring R and a linear function f : x ↦ Σ_{i=1}^n a_i x_i on R^n, |S(f)| = |R|^n / |I|, where I = ⟨a_1, a_2, · · · , a_n⟩_r. Moreover, if a_i is a unit for some i, then I = R; thus, |S(f)| = |R|^n / |R| = |R|^{n−1}.

Remark 9. For a linear function f : x ↦ Σ_{i=1}^n x_i a_i, Lemma II.10 holds true with I = ⟨a_1, a_2, · · · , a_n⟩_l in place of ⟨a_1, a_2, · · · , a_n⟩_r.
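Lemma II.10 can be checked by exhaustive search for a small commutative ring, where left and right ideals coincide. The following sketch (ours) verifies |S(f)| = |R|^n / |I| over R = Z_6 for a few coefficient vectors.

```python
from itertools import product

def solution_count(a, q):
    """|S(f)| for f(x) = sum_i a_i * x_i over Z_q, by exhaustive search."""
    n = len(a)
    return sum(1 for x in product(range(q), repeat=n)
               if sum(ai * xi for ai, xi in zip(a, x)) % q == 0)

def ideal_size(a, q):
    """|<a_1,...,a_n>|: the set of all values sum_i a_i * r_i, r_i in Z_q."""
    n = len(a)
    return len({sum(ai * ri for ai, ri in zip(a, r)) % q
                for r in product(range(q), repeat=n)})

q = 6
for a in [(2, 4), (2, 3), (1, 5)]:
    assert solution_count(a, q) == q ** len(a) // ideal_size(a, q)
# e.g. a = (2, 4): I = {0, 2, 4}, so |S(f)| = 36 / 3 = 12
```

The last case, a = (1, 5), contains a unit, so I = Z_6 and |S(f)| = 6^{n−1}, as the "moreover" part of the lemma asserts.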

III. LINEAR CODING OVER FINITE RINGS
In this section, we will present a coding rate region achieved with LCoR for the SW source coding problem, i.e., g is the identity function in Problem 1. This region is exactly the SW region if all the rings considered are fields.
However, being a field is not necessary, as seen in Section V.
Before proceeding, a subtlety needs to be cleared up. It is assumed that the ith source generates data taking values from a finite sample space X_i, while X_i does not necessarily admit any algebraic structure. We must either assume that X_i carries a certain algebraic structure, for instance that X_i is a ring, or injectively map the elements of X_i into some algebraic structure. In our subsequent discussions, we assume that X_i is mapped into a finite ring R_i of order at least |X_i| by some injection Φ_i. Hence, X_i can simply be treated as a subset Φ_i(X_i) ⊆ R_i for a fixed Φ_i. When required, Φ_i can also be selected to obtain desired outcomes (see Remark 11).
To simplify our discussion, the following notation is used.
where r(T, I_T) denotes the rate bound appearing in the definition (3) of R_Φ.
Theorem III.1. Any rate in R_Φ is achievable with linear coding over finite rings R_1, R_2, · · · , R_s.
A concrete example can be helpful in the interpretation of this theorem.
Example III.2. Consider the single source scenario, where X_1 ∼ p and X_1 = Z_6, satisfying the following.
By Theorem III.1, the rate given above is achievable with LCoR.
Remark 11. From Theorem V.1, one will see that (3) and (4) are the same when all the rings are fields; actually, both are identical to the SW region. However, (4) can be strictly larger than (3) (see Theorem V.2) when not all the rings are fields. This implies that, in order to achieve a desired rate, a suitable injection is required. However, note that taking the convex hull in (4) is not always needed for optimality, as shown in Example III.2. A much more sophisticated analysis of this issue is found in Section V.
The rest of this section provides key supporting lemmata and concepts used to prove Theorem III.1. The final proof is given in Section IV.
Lemma III.3. Given a finite ring R and two distinct sequences x, y ∈ R^n, let y − x = [a_1, a_2, · · · , a_n]^t, and let f : R^n → R^k be a linear mapping chosen uniformly at random, i.e., generate the k × n coefficient matrix A of f by independently choosing each entry of A uniformly at random. Then
Pr { f(x) = f(y) } = 1 / |I|^k,
where I = ⟨a_1, a_2, · · · , a_n⟩_l.
Proof: Pr { f(x) = f(y) } = Pr { f(y − x) = 0 } = ∏_{i=1}^k Pr { f_i(y − x) = 0 }, since the component functions f_i are independent of each other. The statement follows from Lemma II.10 and Remark 9, which assure that Pr { f_i(y − x) = 0 } = 1 / |I|.
Remark 12. In Lemma III.3, if R is a field and x ≠ y, then I = R, because every non-zero a_i is a unit. Thus, Pr { f(x) = f(y) } = 1 / |R|^k.
Definition III.4 (cf. [11]). Let X ∼ p_X be a discrete random variable with sample space X. The set T_ǫ(n, X) of strongly ǫ-typical sequences of length n with respect to X is defined to be
T_ǫ(n, X) = { x ∈ X^n : | N(x; x)/n − p_X(x) | < ǫ, ∀ x ∈ X },
where N(x; x) is the number of occurrences of x in the sequence x.
T ǫ (n, X) is sometimes replaced by T ǫ when the length n and the random variable X referred to are clear from the context.
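The collision probability in Lemma III.3 can be verified exactly for small parameters. The sketch below (ours) counts, over Z_4, the fraction of uniformly chosen coefficient rows annihilating a = y − x, and compares the resulting probability with 1/|I|^k; the rows of A are independent, so the per-row probability is simply raised to the kth power.

```python
from fractions import Fraction
from itertools import product

def collision_prob(a, q, k):
    """Exact Pr{f(x) = f(y)} = Pr{f(a) = 0} for a = y - x, when the k x n
    coefficient matrix of f has i.i.d. uniform entries in Z_q."""
    n = len(a)
    rows = list(product(range(q), repeat=n))
    zero_rows = sum(1 for r in rows
                    if sum(ri * ai for ri, ai in zip(r, a)) % q == 0)
    return Fraction(zero_rows, len(rows)) ** k  # rows are independent

def ideal_size(a, q):
    """|<a_1,...,a_n>_l| in Z_q (commutative, so left = right)."""
    n = len(a)
    return len({sum(ri * ai for ri, ai in zip(r, a)) % q
                for r in product(range(q), repeat=n)})

q, k = 4, 2
for a in [(2, 2), (1, 2), (2, 0)]:
    assert collision_prob(a, q, k) == Fraction(1, ideal_size(a, q) ** k)
```

For a = (2, 2) the generated ideal is {0, 2}, so the collision probability is 1/4 rather than the field-style 1/16; this is precisely the complication that ideals introduce in the ring setting.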
Let X ∼ p_X be a discrete random variable with finite sample space X, and write H(p_X) = H(X). It is well-known that H is a concave function, i.e., cH(p_1) + (1 − c)H(p_2) ≤ H(cp_1 + (1 − c)p_2) for all distributions p_1, p_2 and all c ∈ [0, 1]; for c ∈ (0, 1), equality holds if and only if p_1 = p_2. For convenience, define emp(x) to be the empirical distribution of x and H(x) = H(emp(x)).
Lemma III.5. Let (X_1, X_2) ∼ p be a pair of jointly distributed random variables whose sample space is a finite ring R = R_1 × R_2.

IV. PROOF OF THEOREM III.1
As mentioned, X i can be seen as a subset of R i for a fixed Φ = [Φ 1 , · · · , Φ s ]. In this section, we assume that X i has sample space R i , which makes sense since Φ i is injective.
where n is the length of the data sequences. Suppose that, for some small constant η > 0 and all large enough n, the stated condition holds for all ∅ ≠ T ⊆ S and all 0 ≠ I_i ≤_l R_i. We claim that R is achievable.

A. Encoding:
For every i ∈ S, randomly generate a k_i × n matrix A_i based on a uniform distribution, i.e., independently choose each entry of A_i uniformly at random. Define a linear encoder φ_i : R_i^n → R_i^{k_i} such that φ_i(x) = A_i x for all x ∈ R_i^n. Obviously, the coding rate of this encoder is (k_i log |R_i|) / n.

B. Decoding:
Upon observing y_i ∈ R_i^{k_i} (i ∈ S) from the ith encoder, the decoder claims that [x̂_1, x̂_2, · · · , x̂_s], with x̂_i ∈ R_i^n, is the array of the encoded data sequences, if and only if:

C. Error:
Assume that X_i ∈ R_i^n (i ∈ S) is the original data sequence generated by the ith source. It is readily seen that an error occurs if and only if:
Additionally, for ∅ ≠ T ⊆ S, let X_T = [X_i]_{i∈T} and X_{T^c} = [X_i]_{i∈T^c}. Summing over all possible non-trivial left ideals I bounds the probability of error. Consequently, one obtains the SW region R[X_1, X_2, · · · , X_s]. Therefore, region (4) is also the SW region.
If R_i is a field, then obviously it has no proper non-trivial left (right) ideal. Conversely, if R_i has no proper non-trivial left (right) ideal, then, ∀ 0 ≠ a ∈ R_i, the left ideal ⟨a⟩_l equals R_i; hence a is invertible and R_i is a field (cf. Remark 3). Remark 14. Theorem V.1 states that the LCoF results of Elias [1] and Csiszár [2] are special cases of Theorem III.1 in the sense of achieving the optimal coding rate region, the SW region. However, R_1, R_2, · · · , R_s being fields is not necessary.
As shown in Theorem V.2, the SW region can be achieved using rings such as Z_4 and the matrix ring M_L (a ring with respect to matrix addition and multiplication). Clearly, neither Z_4 nor M_L is a field.

A. Single Source
In the single source scenario, showing that the convex hull (4) is the SW region is equivalent to proving that (17) holds for some Φ ∈ M(X_S, R_S). As mentioned before, Y_{R_1/I_1} depends on Φ_1 (note that Φ = [Φ_1] for a single source).
Hence, an appropriate Φ_1 is crucial. Proof: R_1 has only two non-trivial left ideals, R_1 itself and the unique proper non-trivial left ideal J. Thus, (17) is valid if and only if (18) holds for some injection Φ_1. Without loss of generality, fix such a Φ_1 (note that, after fixing Φ_1, X_1 is seen as a subset of R_1). Consequently, (18) is equivalent to the inequality established by Lemma A.1. Therefore, (18) holds, and so does (17). Furthermore, that R_1 is isomorphic to either Z_4 or M_L is a well-known fact [12]. The theorem is proved.
Remark 15. Notice that Theorem V.2 is valid for any X ∼ p. For other rings (in particular, those containing only one left (right) ideal besides the trivial ideal and the ring itself, for instance Z_{q^2} with q prime), no conclusive result is so far available. However, optimality, i.e., the validity of (17), can still be shown for certain distributions p.
Remark 16. Up to isomorphism, there are 4 rings (and 11 rngs) of order 4 [12]. On the contrary, there is one and only one finite field of any given (prime power) order. Theorem V.2 suggests that LC over rings, e.g., F_4, Z_4 and M_L, can be as good as LC over fields in the single source SW source coding problem. Actually, due to its flexible structure (a ring does not have to be commutative, can have a non-prime characteristic, etc.), LCoR offers certain advantages.
One example is shown in Section VI demonstrating this.

B. Multiple Sources
Based on Theorem V.2, the following corollary can be verified immediately.
Corollary V.3. Region (4) contains the region given below. For the multiple sources scenario, we do not have a conclusive result like Theorem V.2 that is valid for any distribution. However, the following theorem makes it more plausible that there exists some set of non-field rings over which LC is optimal.
Theorem V.4. Regardless of which set of rings R_1, R_2, · · · , R_s is chosen, as long as the joint distribution p is uniform, region (3) is the SW region. Proof: If p is uniform, then, for any ∅ ≠ T ⊆ S and 0 ≠ I_T ≤_l R_T, Y_{R_T/I_T} is uniformly distributed on R_T/I_T. Moreover, X_T and X_{T^c} are independent, and so are Y_{R_T/I_T} and X_{T^c}. Therefore, region (3) is the SW region.
Remark 17. When p is uniform, it is obvious that the uncoded strategy (all encoders are one-to-one mappings) is optimal in the SW source coding problem. However, the optimality stated in Theorem V.4 does not come from deliberately fixing the encoding mappings, but from generating them randomly.

VI. APPLICATION: SOURCE CODING FOR COMPUTING
Some advantages of LCoR are demonstrated in this section. In Example VI.5 below, we show that LCoR achieves better coding rates compared to LCoF in the sense of Körner-Marton [3] for some function g in Problem 1. At the same time, the encoders using LCoR require strictly smaller alphabet sizes than those using LCoF. We first present the following theorem, which can be seen as a generalization of Körner-Marton [3].
Theorem VI.1. In Problem 1, if ĝ is a polynomial function in R[s] admitting a presentation
ĝ(x_1, x_2, · · · , x_s) = h(k_1(x_1) + k_2(x_2) + · · · + k_s(x_s)),    (19)
where h, k_i ∈ R[1] for all feasible i, then the region Rĝ given by (20) is achievable, where X = k(X_1, X_2, · · · , X_s) and Y_{R/I} is a random variable with sample space R/I whose distribution is induced by X. Proof: By Theorem III.1, ∀ ǫ > 0, there exist a large enough n, an m × n matrix A ∈ R^{m×n} and a decoder ψ, such that Pr { ψ(AX^n) ≠ X^n } < ǫ. Let φ_i : x ↦ A k_i(x), with k_i applied coordinate-wise, be the encoder of the ith source. Upon receiving φ_i(X_i^n) from the ith source, the decoder claims that h(X̂^n), where X̂^n = ψ( Σ_{i=1}^s φ_i(X_i^n) ) and h is applied coordinate-wise, is the function, namely ĝ, subject to computation. The probability of decoding error is at most ǫ, since Σ_{i=1}^s φ_i(X_i^n) = A Σ_{i=1}^s k_i(X_i^n) = AX^n. Therefore, all (r, r, · · · , r) ∈ ℝ^s, where r = (m log |R|)/n exceeds the maximum in (20), are achievable; in other words,

Rĝ ⊆ R[ĝ].
Corollary VI.2. In Theorem VI.1, let X = k(X_1, X_2, · · · , X_s) ∼ p_X. We have equality if either of the following conditions holds: 1) R is isomorphic to some finite field; 2) R is isomorphic to Z_4 with p_X(0) = p_1, p_X(1) = p_2, p_X(2) = p_3 and p_X(3) = p_4. Remark 18. If R is isomorphic to Z_2 and ĝ is the modulo-two sum, then Corollary VI.2 recovers the theorem of Körner-Marton [3]. If R is (isomorphic to) a field, it becomes a special case of [7, Theorem III.1]. Actually, almost all the results in [6] and [7] can be re-proved in the setting of rings in a parallel fashion.
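The algebraic identity driving Körner-Marton-style schemes is that identical linear encoders commute with the function being computed: with a common coefficient matrix A over Z_2, the sum of the two syndromes equals the syndrome of the modulo-two sum. A minimal sketch of this identity follows (ours; the matrix, seed and dimensions are arbitrary, and the typicality decoder that recovers the sum from its syndrome is not implemented here).

```python
import random

def matvec(A, x):
    """Matrix-vector product over Z_2."""
    return [sum(a * xi for a, xi in zip(row, x)) % 2 for row in A]

def xor(u, v):
    """Component-wise modulo-two sum."""
    return [(a + b) % 2 for a, b in zip(u, v)]

random.seed(1)
n, k = 8, 5
A = [[random.randrange(2) for _ in range(n)] for _ in range(k)]
x = [random.randrange(2) for _ in range(n)]
y = [random.randrange(2) for _ in range(n)]

# each encoder sends its own syndrome; adding the syndromes yields the
# syndrome of the modulo-two sum, because A is linear
s = xor(matvec(A, x), matvec(A, y))
assert s == matvec(A, xor(x, y))
```

The decoder therefore only needs to recover x ⊕ y from its syndrome, which is possible at rates above H(X_1 ⊕ X_2) by the single source result; this is exactly what makes the common-matrix construction work.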
Definition VI.4. Given a function g : D → Ω and ∅ ≠ S ⊆ D, the restriction of g to S is defined to be the function g|_S : S → Ω such that g|_S : x ↦ g(x) for all x ∈ S. Remark 19. Up to equivalence, a function can be presented in many different formats. For example, the function min{x, y} defined on {0, 1} × {0, 1} can either be seen as F_1(x, y) = xy on Z_2^2 or be treated as the restriction of some presentation defined on a larger domain. We refer to each presented format of a function as a presentation of this function.
Assume that g has presentation ĝ ∈ R[s] for some finite ring R; we say that the region Rĝ given by (20) is achievable for computing g in the sense of Körner-Marton [3]. From [7], we know that Rĝ might not be the largest achievable region one can obtain for computing g. However, Rĝ still captures the ability of LC over R when used for computing g. More precisely, Rĝ is the region purely achieved with LC over R for computing g. On the other hand, the regions from [7] are achieved by combining the LC and the standard random coding techniques. Therefore, it is reasonable to compare LCoR with LCoF in the sense of Körner-Marton.
Making use of Theorem VI.1 and Corollary VI.2, we show that LCoR strictly outperforms its field counterpart in the following example.
Proposition VI.6. There exists no polynomial function ĝ ∈ F_4[3] of format (19) such that a restriction of ĝ is equivalent to the function g defined by (21).
Proof: Suppose, on the contrary, that such a ĝ = h(k_1(x) + k_2(y) + k_3(z)) exists, with h, k_i ∈ F_4[1] for all feasible i. We claim that ĝ and h are both surjective, i.e., g admits a presentation k_1(x) + k_2(y) + k_3(z) ∈ F_4[3], a contradiction to Lemma A.2. As a consequence of Proposition VI.6, in order to use LCoF in the sense of Körner-Marton to compute the function g, the alphabet sizes of the three encoders need to be at least 5. However, LCoR offers a solution in which the alphabet sizes are 4, strictly smaller than with LCoF. In addition, in the sense of Körner-Marton, the region achieved with LC over a finite field F_q is always a subset of the one achieved with LC over Z_4. This is proved in the following proposition.

Remark 20.
With a proof similar to that of Proposition VI.7, one can show that LCoR also outperforms LCoF in computing the g given by (21) in the sense of [7], namely when combining the standard random coding and LC techniques.

VII. CONCLUSION
Careful readers might have noticed that the encoders used in Theorem III.1 are left linear mappings. An almost identical theorem (Theorem VII.1) can be easily proved when using right linear mappings as encoders.
Theorem VII.1. Given Φ ∈ M(X_S, R_S), let R′_Φ be defined as in (27), the counterpart of (3) with right ideals in place of left ideals and r(T, I_T) defined analogously. Any rate in R′_Φ is achievable with linear coding over finite rings R_1, R_2, · · · , R_s.
By time sharing, the convex hull of R_Φ ∪ R′_Φ, where R_Φ and R′_Φ are given by (3) and (27), respectively, is achievable with (left and right) LCoR. As mentioned before, [6], [7] consider the computing problem, Problem 1, by treating a discrete function as a polynomial function over some finite field. Naturally, we believe that similar results can be obtained for polynomial functions over finite rings. In addition, both [9] and [6] considered the question of under what circumstances R[g] ⊋ R[X_1, X_2, · · · , X_s]. However, a conclusive result is proved only for the case s = 2. It will be interesting to know whether the ring approach can provide further insight into this problem.
In order to focus on the basics of the new ideas, we did not consider the computing problem in the context of noisy channels (e.g., [14]) or network coding (e.g., [15]). However, it is clear that the polynomial (over fields or rings) approach can be applied to such scenarios as well. This will be considered in our further work.
As another suggestion for further work, it would be very interesting to consider linear coding over rngs instead. It would be even more intriguing should it turn out that the rng version outperforms the ring version in the computing problem, in the same manner that the ring version outperforms its field counterpart. It would also be interesting to see whether the idea of using rngs provides more understanding of the problems from [9] and [6].
Regarding coding over algebraic structures, some authors [16], [17], [18] propose to implement coding over a simpler structure, the group. Seemingly, this is a more universal approach, since a field or a ring must be a group. However, one subtle issue is often overlooked in this context. Strictly speaking, the set of rings (or rngs) is not a subset of the set of groups: several non-isomorphic rings (or rngs) can be defined on one and the same group. For instance, given two distinct primes p and q, up to isomorphism: 1) there are 2 finite rngs of order p, while there is only one group of order p; 2) there are 4 finite rngs of order pq; 3) there are 11 finite rngs of order p^2 (if p = 2, 4 of them are rings, namely F_4, Z_4, Z_2 × Z_2 and M_L [12]); 4) there are 22 finite rngs of order p^2 q; 5) there are 52 finite rngs of order 8; 6) there are 3p + 50 finite rngs of order p^3 (p > 2). (More can be found in [19].) Therefore, there is no one-to-one correspondence from rings (or rngs) to groups, in either direction. Furthermore, from the point of view of formulating a multivariate function, the group, associated with a single operator, is in a subordinate position compared to the ring (rng or field). On the contrary, it is well-known that every discrete function is essentially a restriction of some polynomial function over some finite ring (rng or field). Although non-Abelian structures (non-Abelian groups) possess the potential to offer prominent results [20], [21], they are very difficult to handle theoretically and in practice, and the performance of non-Abelian group block codes is sometimes bad [22].

APPENDIX

Since H_2 is a concave function and Σ_{j=1}^4 p_j = 1, we have A ≤ H_2(p_1 + p_2).
Lemma A.2. No matter which finite field F_q is chosen, the g given by (21) admits no presentation k_1(x) + k_2(y) + k_3(z), where k_i ∈ F_q[1] for all feasible i.
This contradicts that ν is injective.
Remark 21. As a special case, this lemma implies that, no matter which finite field F_q is chosen, the g defined by (21) has no presentation that is linear over F_q. In contrast, g is equivalent to the linear function x + 2y + 3z ∈ Z_4[3].
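Since Remark 21 identifies g with the Z_4-linear function x + 2y + 3z, three identical linear encoders over Z_4 allow the decoder to combine the received syndromes with the same coefficients 1, 2, 3 and obtain A g(x, y, z) directly. A minimal sketch of this identity follows (ours; matrix, seed and dimensions are arbitrary, and the decoder that recovers g from its syndrome is omitted).

```python
import random

def matvec4(A, x):
    """Matrix-vector product over Z_4."""
    return [sum(a * xi for a, xi in zip(row, x)) % 4 for row in A]

random.seed(7)
n, m = 6, 4
A = [[random.randrange(4) for _ in range(n)] for _ in range(m)]
x = [random.randrange(4) for _ in range(n)]
y = [random.randrange(4) for _ in range(n)]
z = [random.randrange(4) for _ in range(n)]

# component-wise evaluation of g(x, y, z) = x + 2y + 3z over Z_4
g = [(xi + 2 * yi + 3 * zi) % 4 for xi, yi, zi in zip(x, y, z)]

# the decoder combines the three syndromes with the same coefficients 1, 2, 3;
# by linearity over Z_4 this equals the syndrome of g itself
sx, sy, sz = matvec4(A, x), matvec4(A, y), matvec4(A, z)
combined = [(a + 2 * b + 3 * c) % 4 for a, b, c in zip(sx, sy, sz)]
assert combined == matvec4(A, g)
```

By Lemma A.2, no analogous identity is available over any finite field for this g, which is why the ring Z_4 yields smaller encoder alphabets here than any field-based scheme.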