Error Correction for Check Digit Systems over p-Groups and Applications to DNA Sequences

Beaugris, Louis

doi:10.3390/math13020211


by

Louis Beaugris

Department of Mathematical Sciences, Kean University, 1000 Morris Avenue, Union, NJ 07083, USA

Mathematics 2025, 13(2), 211; https://doi.org/10.3390/math13020211

Submission received: 30 November 2024 / Revised: 1 January 2025 / Accepted: 7 January 2025 / Published: 10 January 2025


Abstract

Statistical analysis shows that the most common errors in the transmission of information consist of single errors and transposition errors. Error detection and correction methods are often desired, particularly when the accuracy of information is of crucial importance. Inspired by a check digit system that is constructed from the companion matrix of a primitive polynomial over the integers $Z_{p}$ and that focused on error detection, this work develops error-correction formulas for single errors and transposition errors for that check digit scheme. We also propose an application to DNA sequences.

Keywords:

error correction; check digit; companion matrix; Galois field

MSC:

11Z05; 11T71; 12E20; 20D15; 15B36

1. Introduction

As shown in [1,2], single errors and various types of transposition errors are the most frequently occurring in the transcription or transmission of information. As described in [3], these errors consist of (1) single errors: a→b; (2) adjacent transposition: ab→ba; (3) twin errors: aa→bb; (4) jump transposition: abc→cba; and (5) jump twin errors: aca→bcb.

Numerous check digit systems have been used for the detection of such errors. Some examples of widely used check digit schemes include the International Standard Book Number (ISBN) and the Universal Product Code (UPC). A more comprehensive list of schemes can be found in Gallian’s paper [4]. Errors are typically detected by affixing a check digit $a_{n + 1}$ to information digits $a_{0} a_{1} a_{2} \dots a_{n}$ to form the string $a_{0} a_{1} a_{2} \dots a_{n} a_{n + 1}$ satisfying a check equation.

Niemenmaa [3] designed a check digit system for hexadecimal numbers using an automorphism on a group G. That check digit system can detect all of the aforementioned five types of errors. In a series of papers [5,6,7], the authors provided a matrix algebra interpretation of [3] and found a check digit system that, in addition to the five types of errors listed above, is capable of detecting t-jump transposition errors of the form $a b_{1} \dots b_{t} c \to c b_{1} \dots b_{t} a$ and t-jump twin errors of the form $a b_{1} \dots b_{t} a \to c b_{1} \dots b_{t} c$.

The modified system in [5,6,7] uses a check equation over a set of $p^{k}$ numbers, $0, 1, 2, \dots, p^{k} - 1$, where $p$ is a prime and $k$ is a positive integer. The check equation is written as follows:

$$a_{1} P + a_{2} P^{2} + \dots + a_{n} P^{n} + a_{n + 1} P^{n + 1} = 0 .$$

Each of the $a_{i}$’s is mapped to a $k$-tuple of the set $G = \underbrace{Z_{p} \oplus Z_{p} \oplus \dots \oplus Z_{p}}_{k}$, and the $k$-tuple is taken as the base-$p$ representation of the coefficient $a_{i}$ [5,6].
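The base-$p$ representation described above can be sketched in a few lines of Python. This is only an illustration: the function names `to_tuple` and `from_tuple` are hypothetical, and the choice of writing the most significant digit first is an assumption that matches the hexadecimal listing used later in the paper (for example, D = 13 = (1, 1, 0, 1)).

```python
def to_tuple(value, p, k):
    """Return the k base-p digits of `value`, most significant digit first."""
    if not 0 <= value < p ** k:
        raise ValueError("value must lie in 0 .. p**k - 1")
    digits = []
    for _ in range(k):
        value, remainder = divmod(value, p)
        digits.append(remainder)
    return tuple(reversed(digits))


def from_tuple(digits, p):
    """Inverse of to_tuple: rebuild the integer from its base-p digits."""
    value = 0
    for d in digits:
        value = value * p + d
    return value
```

For instance, `to_tuple(13, 2, 4)` gives the quadruple for the hexadecimal digit D, and `from_tuple` recovers 13 from it.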

In the check equation, $P$ is a $k \times k$ companion matrix whose characteristic polynomial is a primitive polynomial over $Z_{p}$. We recall that the companion matrix of a monic polynomial $h(x) = x^{n} + a_{n - 1} x^{n - 1} + a_{n - 2} x^{n - 2} + \dots + a_{1} x + a_{0}$ of degree $n$ over a field is the $n \times n$ matrix

$$M = \begin{bmatrix} 0 & 0 & 0 & \dots & 0 & - a_{0} \\ 1 & 0 & 0 & \dots & 0 & - a_{1} \\ 0 & 1 & 0 & \dots & 0 & - a_{2} \\ 0 & 0 & 1 & \dots & 0 & - a_{3} \\ \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & 0 & \dots & 1 & - a_{n - 1} \end{bmatrix} .$$

We also recall from matrix theory that $h(M) = 0$ [8].
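As a small sanity check of the definition above, the following Python sketch builds the companion matrix of a monic polynomial over $Z_{p}$ and evaluates $h(M)$ modulo $p$. The helper names (`companion`, `mat_mul`, `poly_at_matrix`) are hypothetical, and coefficients are passed in the order $a_{0}, a_{1}, \dots, a_{n-1}$.

```python
def companion(coeffs, p):
    """Companion matrix of x^n + a_{n-1} x^{n-1} + ... + a_0 over Z_p.

    coeffs = [a_0, a_1, ..., a_{n-1}] (leading coefficient 1 is implied).
    """
    n = len(coeffs)
    M = [[0] * n for _ in range(n)]
    for r in range(1, n):
        M[r][r - 1] = 1                     # ones on the subdiagonal
    for r in range(n):
        M[r][n - 1] = (-coeffs[r]) % p      # last column holds -a_r
    return M


def mat_mul(A, B, p):
    """Product of two square matrices, entries reduced modulo p."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) % p
             for j in range(n)] for i in range(n)]


def poly_at_matrix(coeffs, M, p):
    """Evaluate h(M) = a_0 I + a_1 M + ... + a_{n-1} M^{n-1} + M^n modulo p."""
    n = len(M)
    result = [[0] * n for _ in range(n)]
    power = [[int(i == j) for j in range(n)] for i in range(n)]  # M^0 = I
    for a in list(coeffs) + [1]:
        result = [[(result[i][j] + a * power[i][j]) % p for j in range(n)]
                  for i in range(n)]
        power = mat_mul(power, M, p)
    return result
```

With `coeffs = [1, 0, 0, 1]` and `p = 2`, that is $x^{4} + x^{3} + 1$, `companion` returns the $4 \times 4$ matrix used in the hexadecimal example of Section 2, and `poly_at_matrix` returns the zero matrix.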

In what follows, we will build error-correction formulas for single errors and transposition errors for the error-detection scheme started in [3] and redeveloped throughout [5,6,7]. We end the paper by proposing an application to DNA sequences.

2. Correcting Single Errors and Adjacent Transpositions

We first recall a theorem on the detection of single errors with a check digit $a_{n + 1}$ [5,6].

Theorem 1.

Let $a_{1} a_{2} a_{3} \dots a_{n} a_{n + 1}$ be an identification number satisfying the check equation $a_{1} P + a_{2} P^{2} + \dots + a_{n} P^{n} + a_{n + 1} P^{n + 1} = 0$. Then, a single error $\dots a \dots \to \dots b \dots$ is detected if and only if $a P \neq b P$ for all distinct $a$ and $b$ in $G = \underbrace{Z_{p} \oplus Z_{p} \oplus \dots \oplus Z_{p}}_{k}$.

We propose the following theorem for identification numbers with a single error. The approach in affixing a second check digit is similar to the one in [4].

Theorem 2.

Suppose a single error in the information digits $a_{1} a_{2} a_{3} \dots a_{n - 1}$ is detected through the check digit equations with check digits $a_{n}$ and $a_{n + 1}$:

$$a_{1} + a_{2} + a_{3} + \dots + a_{n} + a_{n + 1} = 0 \qquad (1)$$

$$a_{1} P + a_{2} P^{2} + \dots + a_{n} P^{n} + a_{n + 1} P^{n + 1} = 0 . \qquad (2)$$

Suppose the error occurred in the $i$th position. Then, $e P^{i} = f$, where $e$ is the error in Equation (1) and $f$ the error in Equation (2).

Proof.

As a single error was detected in $a_{1} a_{2} a_{3} \dots a_{n - 1}$, we obtain in (1) $a_{1} + a_{2} + a_{3} + \dots + a_{n} + a_{n + 1} = e$. Thus, one of the digits in the string $a_{1} a_{2} a_{3} \dots a_{n - 1}$ is $e$ too big; i.e., $a_{k} = c + e$ for some $1 \leq k \leq n - 1$, where $c$ is the correct digit.

Now, the single error applied to (2) yields $a_{1} P + a_{2} P^{2} + \dots + a_{n} P^{n} + a_{n + 1} P^{n + 1} = f$. Suppose the error occurred in the $i$th position. Then,

$$a_{1} P + a_{2} P^{2} + \dots + (c + e) P^{i} + \dots + a_{n} P^{n} + a_{n + 1} P^{n + 1} = f .$$

Thus,

$$a_{1} P + a_{2} P^{2} + \dots + c P^{i} + e P^{i} + \dots + a_{n} P^{n} + a_{n + 1} P^{n + 1} = f ,$$

i.e.,

$$(a_{1} P + a_{2} P^{2} + \dots + c P^{i} + \dots + a_{n} P^{n} + a_{n + 1} P^{n + 1}) + e P^{i} = f .$$

The expression in parentheses is the check sum of the correct identification number, which is 0 by Equation (2). We obtain $0 + e P^{i} = f$. Therefore, $e P^{i} = f$. □

We now propose a theorem for the correction of an adjacent transposition error.

Theorem 3.

Suppose an adjacent transposition error in the information digits $a_{1} a_{2} a_{3} \dots a_{n - 1}$ is detected through the check Equations (3) and (4), with check digits $a_{n}$ and $a_{n + 1}$, where $n$ is odd:

$$a_{1} + a_{2} P^{2} + a_{3} + a_{4} P^{2} + a_{5} + \dots + a_{n - 1} P^{2} + a_{n} + a_{n + 1} P^{2} = 0 \qquad (3)$$

$$a_{1} P + a_{2} P^{2} + a_{3} P^{3} + \dots + a_{n} P^{n} + a_{n + 1} P^{n + 1} = 0 . \qquad (4)$$

Suppose the error occurred in the $i$th and $(i + 1)$th positions. Then, $e (I + P^{2})^{- 1} P^{i} = f (I + P)^{- 1}$, where $e$ is the error in Equation (3) and $f$ the error in Equation (4).

Proof.

Suppose the transposition error occurred at the $i$th and $(i + 1)$th positions, i.e., the digits in these two positions were switched. Write the received digit in position $i$ as $a_{i + 1} = a_{i} + e_{i}$ and the received digit in position $i + 1$ as $a_{i} = a_{i + 1} + e_{i + 1}$. Then, in Equation (3), for an odd number $i$, we obtain

$$a_{1} + a_{2} P^{2} + a_{3} + a_{4} P^{2} + a_{5} + \dots + a_{i + 1} + a_{i} P^{2} + \dots + a_{n - 1} P^{2} + a_{n} + a_{n + 1} P^{2} = e ,$$

$$a_{1} + a_{2} P^{2} + \dots + (a_{i} + e_{i}) + (a_{i + 1} + e_{i + 1}) P^{2} + \dots + a_{n - 1} P^{2} + a_{n} + a_{n + 1} P^{2} = e ,$$

$$a_{1} + a_{2} P^{2} + \dots + (a_{i} + e_{i}) + a_{i + 1} P^{2} + e_{i + 1} P^{2} + \dots + a_{n - 1} P^{2} + a_{n} + a_{n + 1} P^{2} = e ,$$

$$(a_{1} + a_{2} P^{2} + \dots + a_{i} + a_{i + 1} P^{2} + \dots + a_{n - 1} P^{2} + a_{n} + a_{n + 1} P^{2}) + e_{i} + e_{i + 1} P^{2} = e ,$$

$$0 + e_{i} + e_{i + 1} P^{2} = e .$$

Note that $e_{i} = e_{i + 1}$ when $p = 2$, the case used in the examples of this paper, since $e_{i} = a_{i + 1} - a_{i}$ and $e_{i + 1} = a_{i} - a_{i + 1}$. So we get $e_{i} + e_{i} P^{2} = e$. For an even number $i$, the coefficients $I$ and $P^{2}$ of the two positions are exchanged and the same equation follows.

Hence,

$$e_{i} (I + P^{2}) = e, \text{ and therefore, } e_{i} = e (I + P^{2})^{- 1} . \qquad (5)$$

Now, in Equation (4), the correct identification number satisfies $a_{1} P + a_{2} P^{2} + a_{3} P^{3} + \dots + a_{n} P^{n} + a_{n + 1} P^{n + 1} = 0$, while the received number gives

$$a_{1} P + a_{2} P^{2} + a_{3} P^{3} + \dots + a_{i + 1} P^{i} + a_{i} P^{i + 1} + \dots + a_{n} P^{n} + a_{n + 1} P^{n + 1} = f ,$$

$$a_{1} P + a_{2} P^{2} + \dots + (a_{i} + e_{i}) P^{i} + (a_{i + 1} + e_{i + 1}) P^{i + 1} + \dots + a_{n} P^{n} + a_{n + 1} P^{n + 1} = f ,$$

$$a_{1} P + a_{2} P^{2} + \dots + a_{i} P^{i} + e_{i} P^{i} + a_{i + 1} P^{i + 1} + e_{i + 1} P^{i + 1} + \dots + a_{n} P^{n} + a_{n + 1} P^{n + 1} = f ,$$

$$(a_{1} P + a_{2} P^{2} + \dots + a_{i} P^{i} + a_{i + 1} P^{i + 1} + \dots + a_{n} P^{n} + a_{n + 1} P^{n + 1}) + e_{i} P^{i} + e_{i + 1} P^{i + 1} = f ,$$

$$0 + e_{i} P^{i} + e_{i + 1} P^{i + 1} = f .$$

As $e_{i} = e_{i + 1}$, we obtain $e_{i} P^{i} (I + P) = f$ and thus $e_{i} P^{i} = f (I + P)^{- 1}$. Substituting $e_{i}$ by its value from (5), we obtain $e (I + P^{2})^{- 1} P^{i} = f (I + P)^{- 1}$. □

We provide an example for hexadecimal numbers, similar to the systems in [3,5,6]. In this case, the symbols $a_{i}$ in the check equation are the elements of $GF(16)$ represented as hexadecimal numbers mapped to quadruples over $Z_{2}$. In this system, the hexadecimal numbers {0,1,2,3,4,5,6,7,8,9,A,B,C,D,E,F} are represented as $0 = (0, 0, 0, 0)$, $1 = (0, 0, 0, 1)$, $2 = (0, 0, 1, 0)$, $3 = (0, 0, 1, 1)$, $4 = (0, 1, 0, 0)$, $5 = (0, 1, 0, 1)$, $6 = (0, 1, 1, 0)$, $7 = (0, 1, 1, 1)$, $8 = (1, 0, 0, 0)$, $9 = (1, 0, 0, 1)$, $A = (1, 0, 1, 0)$, $B = (1, 0, 1, 1)$, $C = (1, 1, 0, 0)$, $D = (1, 1, 0, 1)$, $E = (1, 1, 1, 0)$, $F = (1, 1, 1, 1)$.

As an example, let 6D2EC7A be information digits for a hexadecimal check digit system. In this scheme, we use the companion matrix $P$ used in [7,9],

$$P = \begin{bmatrix} 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 1 \end{bmatrix} ,$$

generated by the irreducible polynomial $x^{4} + x^{3} + 1$. Solving Equations (1) and (2) of Theorem 2 shows that the two check digits are $a_{8} = 5$ and $a_{9} = 3$. The identification number is thus 6D2EC7A53.
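The check digits in this example can be reproduced with a short Python sketch. It rests on two assumptions that are not stated explicitly in the text but that reproduce its numbers: each digit is treated as a row vector multiplied on the right by $P$, and all arithmetic is modulo 2. The helper names are hypothetical, and the search simply tries all 256 pairs of check digits.

```python
HEX = "0123456789ABCDEF"
P = [[0, 0, 0, 1],
     [1, 0, 0, 0],
     [0, 1, 0, 0],
     [0, 0, 1, 1]]          # companion matrix of x^4 + x^3 + 1 over Z_2
ZERO = (0, 0, 0, 0)


def to_vec(ch):
    """Hexadecimal digit -> quadruple over Z_2, most significant bit first."""
    value = HEX.index(ch)
    return tuple((value >> shift) & 1 for shift in (3, 2, 1, 0))


def add(u, v):
    """Componentwise addition modulo 2."""
    return tuple((x + y) % 2 for x, y in zip(u, v))


def times_P(v, power=1):
    """Row vector v multiplied by P^power, modulo 2."""
    for _ in range(power):
        v = tuple(sum(v[r] * P[r][c] for r in range(4)) % 2 for c in range(4))
    return v


def check_sums(word):
    """Left-hand sides of Equations (1) and (2) for a string of hex digits."""
    s1 = s2 = ZERO
    for i, ch in enumerate(word, start=1):
        a = to_vec(ch)
        s1 = add(s1, a)
        s2 = add(s2, times_P(a, i))
    return s1, s2


def check_digits(info):
    """All pairs of check digits that make both check sums vanish."""
    return [x + y for x in HEX for y in HEX
            if check_sums(info + x + y) == (ZERO, ZERO)]
```

Calling `check_digits("6D2EC7A")` returns the single pair `"53"`, in agreement with the example.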

As an illustration of locating a single error, assume that the number 6D5EC7A53 was transcribed instead of the identification number in the previous paragraph. We see that this produces an error $e = (0, 1, 1, 1)$ in Equation (1) and an error $f = (1, 1, 0, 1)$ in Equation (2). We find that the error location formula $e P^{i} = f$ in Theorem 2 yields $(0, 1, 1, 1) P^{i} = (1, 1, 0, 1)$. This is true for $i = 3$, confirming that the error occurred in the third position. Since the received digit is $c + e$, subtracting $e$ from the received digit $5 = (0, 1, 0, 1)$ recovers the correct digit $2 = (0, 0, 1, 0)$.
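The location step of Theorem 2 can be sketched in Python as a search over positions $i$ with $e P^{i} = f$. As before, the row-vector convention and arithmetic modulo 2 are assumptions that reproduce the numbers in the example, and the function names are hypothetical. The correction step subtracts $e$ from the received digit, following the relation $a_{k} = c + e$ in the proof.

```python
HEX = "0123456789ABCDEF"
P = [[0, 0, 0, 1],
     [1, 0, 0, 0],
     [0, 1, 0, 0],
     [0, 0, 1, 1]]          # companion matrix of x^4 + x^3 + 1 over Z_2
ZERO = (0, 0, 0, 0)


def to_vec(ch):
    value = HEX.index(ch)
    return tuple((value >> shift) & 1 for shift in (3, 2, 1, 0))


def to_hex(vec):
    return HEX[int("".join(map(str, vec)), 2)]


def add(u, v):
    return tuple((x + y) % 2 for x, y in zip(u, v))


def times_P(v, power=1):
    for _ in range(power):
        v = tuple(sum(v[r] * P[r][c] for r in range(4)) % 2 for c in range(4))
    return v


def check_sums(word):
    """Errors e (Equation (1)) and f (Equation (2)) of a received word."""
    e = f = ZERO
    for i, ch in enumerate(word, start=1):
        a = to_vec(ch)
        e = add(e, a)
        f = add(f, times_P(a, i))
    return e, f


def locate_single_error(word):
    """Positions i (1-based) satisfying e P^i = f."""
    e, f = check_sums(word)
    if e == ZERO:
        return []            # Equation (1) sees no single error
    return [i for i in range(1, len(word) + 1) if times_P(e, i) == f]


def correct_single_error(word):
    """Return the word with the located digit replaced by (digit - e)."""
    e, f = check_sums(word)
    if (e, f) == (ZERO, ZERO):
        return word
    positions = locate_single_error(word)
    if len(positions) != 1:
        raise ValueError("not a correctable single error")
    i = positions[0]
    fixed = add(to_vec(word[i - 1]), e)   # subtraction equals addition mod 2
    return word[:i - 1] + to_hex(fixed) + word[i:]
```

For the received word `"6D5EC7A53"`, `locate_single_error` returns `[3]` and `correct_single_error` returns `"6D2EC7A53"`.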

To illustrate the correction of an adjacent transposition error, we use an identification number of length 10 with the information digits 6D2CE7A4. First, note that, with the check digits $a_{9} = E$ and $a_{10} = 5$, the identification number satisfies both check equations in Theorem 3, verifying that 6D2CE7A4E5 is the correct number.

We now locate an adjacent transposition to illustrate the formula in Theorem 3. Suppose the identification number 6D2EC7A4E5 was typed instead of the correct one displayed above. We show that positions 4 and 5 were transposed.

Verifying 6D2EC7A4E5 in Equation (3) of Theorem 3, we obtain an error $e = (1, 0, 1, 0)$. Similarly, verifying 6D2EC7A4E5 in Equation (4) of the same theorem yields an error $f = (0, 1, 0, 0)$. Substituting in the formula, we obtain $(1, 0, 1, 0) (I + P^{2})^{- 1} P^{i} = (0, 1, 0, 0) (I + P)^{- 1}$; i.e., $(0, 0, 1, 0) P^{i} = (0, 0, 1, 1)$. We see that this holds for $i = 4$, and conclude that the adjacent positions 4 and 5 were transposed.
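The transposition search can be sketched in Python as follows. One design choice here is not from the paper: because $I + P$ and $I + P^{2}$ are polynomials in $P$ and therefore commute with $P^{i}$, the formula of Theorem 3 is tested in the equivalent inverse-free form $e P^{i} (I + P) = f (I + P^{2})$. The row-vector convention, arithmetic modulo 2, and all function names are assumptions made for this illustration.

```python
HEX = "0123456789ABCDEF"
P = [[0, 0, 0, 1],
     [1, 0, 0, 0],
     [0, 1, 0, 0],
     [0, 0, 1, 1]]          # companion matrix of x^4 + x^3 + 1 over Z_2
ZERO = (0, 0, 0, 0)


def to_vec(ch):
    value = HEX.index(ch)
    return tuple((value >> shift) & 1 for shift in (3, 2, 1, 0))


def add(u, v):
    return tuple((x + y) % 2 for x, y in zip(u, v))


def times_P(v, power=1):
    for _ in range(power):
        v = tuple(sum(v[r] * P[r][c] for r in range(4)) % 2 for c in range(4))
    return v


def syndromes(word):
    """e from Equation (3) (weight I on odd, P^2 on even positions), f from (4)."""
    e = f = ZERO
    for i, ch in enumerate(word, start=1):
        a = to_vec(ch)
        e = add(e, a if i % 2 == 1 else times_P(a, 2))
        f = add(f, times_P(a, i))
    return e, f


def locate_transposition(word):
    """Positions i (1-based) with e P^i (I + P) = f (I + P^2)."""
    e, f = syndromes(word)
    if (e, f) == (ZERO, ZERO):
        return []                               # both checks pass
    rhs = add(f, times_P(f, 2))                 # f (I + P^2)
    return [i for i in range(1, len(word))
            if add(times_P(e, i), times_P(e, i + 1)) == rhs]


def correct_transposition(word):
    """Swap the located pair of adjacent digits back."""
    positions = locate_transposition(word)
    if len(positions) != 1:
        raise ValueError("not a correctable adjacent transposition")
    i = positions[0]
    return word[:i - 1] + word[i] + word[i - 1] + word[i + 1:]
```

For the received word `"6D2EC7A4E5"`, `locate_transposition` returns `[4]` and `correct_transposition` returns `"6D2CE7A4E5"`.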

The reader can use any computing means at their disposal to perform the matrix operations in the examples above, particularly matrix powers of the form $A^{n}$ for an integer $n$, with all arithmetic carried out modulo $p$.
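One possible way to carry out these matrix operations is sketched below in plain Python: matrix powers by repeated squaring and matrix inversion by Gauss-Jordan elimination, both modulo a prime $p$. The function names are hypothetical, and the three-argument `pow(x, -1, p)` used for modular inverses of scalars requires Python 3.8 or later.

```python
def identity(n):
    return [[int(i == j) for j in range(n)] for i in range(n)]


def mat_add(A, B, p):
    return [[(x + y) % p for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]


def mat_mul(A, B, p):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) % p
             for j in range(n)] for i in range(n)]


def vec_mat(v, M, p):
    """Row vector times matrix, modulo p."""
    return tuple(sum(v[r] * M[r][c] for r in range(len(v))) % p
                 for c in range(len(M[0])))


def mat_inv(M, p):
    """Inverse of M modulo the prime p (Gauss-Jordan on [M | I])."""
    n = len(M)
    aug = [[M[r][c] % p for c in range(n)] + [int(r == c) for c in range(n)]
           for r in range(n)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            raise ValueError("matrix is singular modulo p")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = pow(aug[col][col], -1, p)          # modular inverse of pivot
        aug[col] = [(x * scale) % p for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [(x - factor * y) % p
                          for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def mat_pow(M, n, p):
    """M^n modulo p by repeated squaring; negative n uses the inverse."""
    base = [[x % p for x in row] for row in M]
    if n < 0:
        base, n = mat_inv(base, p), -n
    result = identity(len(M))
    while n:
        if n & 1:
            result = mat_mul(result, base, p)
        base = mat_mul(base, base, p)
        n >>= 1
    return result
```

With the $4 \times 4$ matrix $P$ of the hexadecimal example, `vec_mat((1, 0, 1, 0), mat_inv(mat_add(identity(4), mat_pow(P, 2, 2), 2), 2), 2)` reproduces the vector $(0, 0, 1, 0)$ that appears in the transposition example.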

3. Application to DNA Sequences

We apply our previously developed companion matrix error-detection system and corresponding error-correction formulas to DNA sequences. As explained in [10], a DNA (deoxyribonucleic acid) sequence is a string of four letters representing bases called adenine (A), thymine (T), guanine (G), and cytosine (C). In genetic coding, these bases are grouped into codons, which are sequences of three consecutive nucleotides [11]. Given that each codon consists of three bases, there are $4^{3} = 64$ possible codons. These codons specify amino acids, which in turn form proteins. We note that we use T in place of U in this paper; U (uracil) replaces T in the RNA produced by DNA transcription [12].

For the purpose of encoding a DNA sequence, we use the check digit system with entries represented as elements of the Abelian group $G = \underbrace{Z_{p} \oplus Z_{p} \oplus \dots \oplus Z_{p}}_{k}$, where the $k$-tuple is taken as the base-$p$ representation of the coefficient $a_{i}$ [5,6,7]. We take $p = 2$ and $k = 6$, represent the elements of $GF(64)$ as base-64 characters, and map them to the 6-tuples of the group $G$. We denote them as shown below. The full list is in Table 1.

$0 = (0, 0, 0, 0, 0, 0)$

$1 = (0, 0, 0, 0, 0, 1)$

$\vdots$

$10 = A = (0, 0, 1, 0, 1, 0)$

$11 = B = (0, 0, 1, 0, 1, 1)$

$\vdots$

$35 = Z = (1, 0, 0, 0, 1, 1)$

$36 = a = (1, 0, 0, 1, 0, 0)$

$\vdots$

$61 = z = (1, 1, 1, 1, 0, 1)$

$62 = @ = (1, 1, 1, 1, 1, 0)$

$63 = \& = (1, 1, 1, 1, 1, 1)$

We map the nucleotides as follows: $A \to 00$, $C \to 01$, $G \to 10$, and $T \to 11$. Similar mappings are used in [13,14]. We then represent binary strings $x_{1} x_{2} x_{3} x_{4} x_{5} x_{6}$ of length 6 by points $(x_{1}, x_{2}, x_{3}, x_{4}, x_{5}, x_{6})$ with six coordinates. We see, for example, that:

$AAA \to (0, 0, 0, 0, 0, 0)$

$AAT \to (0, 0, 0, 0, 1, 1)$

$\vdots$

$CCC \to (0, 1, 0, 1, 0, 1)$

$ATC \to (0, 0, 1, 1, 0, 1)$

$\vdots$

$GGG \to (1, 0, 1, 0, 1, 0)$

We encode a DNA sequence $a_{1} a_{2} a_{3} \dots a_{n}$ of length $n$ via the equation

$$a_{1} P + a_{2} P^{2} + a_{3} P^{3} + \dots + a_{n} P^{n} + a_{n + 1} P^{n + 1} = 0 , \qquad (6)$$

where $P$ is the companion matrix of a primitive polynomial of the Galois field $GF(64)$, the symbols $a_{1}$ through $a_{n}$ are called the information digits, and $a_{n + 1}$ is the check digit to be affixed to the information digits. Here, each $a_{i}$ corresponds to a codon, which is then mapped to a binary 6-tuple.

As an example, we will encode the DNA subsequence aggatctagc agcagcagaa gcggagcttt obtained from [15]. Taken one codon at a time successively, the above DNA sequence corresponds to the binary representation

(0, 0, 1, 0, 1, 0), (0, 0, 1, 1, 0, 1), (1, 1, 0, 0, 1, 0), (0, 1, 0, 0, 1, 0), (0, 1, 0, 0, 1, 0),

(0, 1, 0, 0, 1, 0), (0, 0, 0, 0, 1, 0), (0, 1, 1, 0, 1, 0), (0, 0, 1, 0, 0, 1), (1, 1, 1, 1, 1, 1).

Using the alphanumeric base-64 representation of these binary sequences, the DNA sequence is encoded as ADoIII2Q9&. The check digits are found with the use of the check equation $a_{1} P + a_{2} P^{2} + a_{3} P^{3} + \dots + a_{n} P^{n} + a_{n + 1} P^{n + 1} = 0$, where

$$P = \begin{bmatrix} 0 & 0 & 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 & 0 & 1 \\ 0 & 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & 0 \end{bmatrix}$$

is the companion matrix of the primitive polynomial $x^{6} + x + 1$ of $GF(64)$.
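The codon-to-character encoding described above can be sketched in Python. The 64-character alphabet follows Table 1 (digits, upper-case letters, lower-case letters, then @ and &); the function names are hypothetical and the sketch ignores whitespace in the input sequence.

```python
ALPHABET = ("0123456789"
            "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
            "abcdefghijklmnopqrstuvwxyz"
            "@&")
BITS = {"a": (0, 0), "c": (0, 1), "g": (1, 0), "t": (1, 1)}
LETTERS = "acgt"   # index of each base equals its 2-bit value


def codon_to_tuple(codon):
    """Three nucleotides -> binary 6-tuple (A=00, C=01, G=10, T=11)."""
    return tuple(bit for base in codon.lower() for bit in BITS[base])


def tuple_to_char(bits):
    """Binary 6-tuple -> base-64 character of Table 1."""
    return ALPHABET[int("".join(map(str, bits)), 2)]


def encode_dna(seq):
    """DNA string (length divisible by 3) -> string of base-64 characters."""
    seq = "".join(seq.split()).lower()
    if len(seq) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(tuple_to_char(codon_to_tuple(seq[i:i + 3]))
                   for i in range(0, len(seq), 3))


def decode_dna(text):
    """Base-64 characters -> DNA string, one codon per character."""
    codons = []
    for ch in text:
        value = ALPHABET.index(ch)
        codons.append(LETTERS[value >> 4]
                      + LETTERS[(value >> 2) & 3]
                      + LETTERS[value & 3])
    return "".join(codons)
```

Applied to the subsequence above, `encode_dna("aggatctagc agcagcagaa gcggagcttt")` returns `"ADoIII2Q9&"`, and `decode_dna` maps that string back to the 30 nucleotides.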

The check digit scheme covered in this work has real-life applications. As pointed out in [3], examples of check digit schemes based on hexadecimal numbers include the International Standard Audiovisual Number (ISAN) and Mobile Equipment Identifier (MEID). It is also pointed out in [7,9] that this design over groups of prime-power orders outperforms ISAN and MEID, as the system based on p-groups is able to detect errors of types 1–5 and t-jump transpositions and t-jump twin errors with a detection radius of 14 and 16. A novelty in the design built in this work is that it has two check digits and can correct single errors and adjacent transposition errors. Another novelty is its application to DNA sequences by affixing an extra codon to the sequence as a check digit. It is important to note that the extra codon is only used for error detection and correction purposes. Once the identification number is decoded back into a DNA sequence with the A, C, G, T alphabet, one has to delete the extra codon.

To end this section, we give an example for correcting a single error and a transposition error involving the aforementioned DNA sequence aggatctagc agcagcagaa gcggagcttt. First, recall that, using the mapping in Table 1, the sequence can be written as ADoIII2Q9&.

Before transmitting the sequence, we first use Equations (1) and (2) of Theorem 2 to find the check digits $a_{11} = \text{P}$ and $a_{12} = \text{G}$ (the base-64 characters P and G of Table 1, not the matrix $P$). We thus obtain the identification number ADoIII2Q9&PG.

We now locate a single error in the DNA sequence. Suppose the transmitted sequence ADoIII2Q9&PG was received as ADOIII2Q9&PG. This would yield an error $e = (1, 0, 1, 0, 1, 0)$ through Equation (1) and an error $f = (0, 1, 0, 1, 1, 1)$ in Equation (2) of Theorem 2. The error locator equation $e P^{i} = f$ is true for $i = 3$, verifying that the error occurred in position 3 of the identification number, with a capital letter O received or stored instead of the lower-case letter o.
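The DNA single-error example can be checked with the following Python sketch. It assumes $p = 2$, treats each character as a row vector multiplied on the right by the $6 \times 6$ matrix $P$ of this section, and uses hypothetical helper names. For the check digits, Equation (1) modulo 2 forces $a_{n + 1} = s_{1} + a_{n}$, where $s_{1}$ is the sum of the information digits, so only the 64 candidates for $a_{n}$ need to be tried.

```python
ALPHABET = ("0123456789" "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
            "abcdefghijklmnopqrstuvwxyz" "@&")
P = [[0, 0, 0, 0, 0, 1],
     [1, 0, 0, 0, 0, 1],
     [0, 1, 0, 0, 0, 0],
     [0, 0, 1, 0, 0, 0],
     [0, 0, 0, 1, 0, 0],
     [0, 0, 0, 0, 1, 0]]     # companion matrix of x^6 + x + 1 over Z_2
ZERO = (0,) * 6


def to_vec(ch):
    value = ALPHABET.index(ch)
    return tuple((value >> shift) & 1 for shift in range(5, -1, -1))


def to_char(vec):
    return ALPHABET[int("".join(map(str, vec)), 2)]


def add(u, v):
    return tuple((x + y) % 2 for x, y in zip(u, v))


def times_P(v, power=1):
    for _ in range(power):
        v = tuple(sum(v[r] * P[r][c] for r in range(6)) % 2 for c in range(6))
    return v


def check_sums(word):
    """Left-hand sides of Equations (1) and (2)."""
    s1 = s2 = ZERO
    for i, ch in enumerate(word, start=1):
        a = to_vec(ch)
        s1 = add(s1, a)
        s2 = add(s2, times_P(a, i))
    return s1, s2


def check_digits(info):
    """Pairs (a_n, a_{n+1}) satisfying Equations (1) and (2) modulo 2."""
    n = len(info) + 1
    s1, s2 = check_sums(info)
    found = []
    for x in ALPHABET:
        vx = to_vec(x)
        vy = add(s1, vx)                      # Equation (1) fixes a_{n+1}
        if add(s2, add(times_P(vx, n), times_P(vy, n + 1))) == ZERO:
            found.append(x + to_char(vy))
    return found


def correct_single_error(word):
    """Locate i with e P^i = f and subtract e from the digit in position i."""
    e, f = check_sums(word)
    hits = [i for i in range(1, len(word) + 1) if times_P(e, i) == f]
    if e == ZERO or len(hits) != 1:
        raise ValueError("not a correctable single error")
    i = hits[0]
    return i, word[:i - 1] + to_char(add(to_vec(word[i - 1]), e)) + word[i:]
```

Here `check_digits("ADoIII2Q9&")` returns `["PG"]`, and `correct_single_error("ADOIII2Q9&PG")` returns position 3 together with the corrected string `"ADoIII2Q9&PG"`.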

To demonstrate the location of a transposition error, suppose the sequence ADoIII2Q9& is to be transmitted. First, we encode it with Equations (3) and (4) of Theorem 3 by finding $a_{11}$ and $a_{12}$. Computations with the same theorem yield $a_{11} = (0, 1, 0, 0, 0, 1)$ and $a_{12} = (0, 1, 0, 1, 0, 0)$, which we represent by H and K, respectively. Thus, we obtain ADoIII2Q9&HK as the identification number to be used.

Suppose the digits ADoIIIQ29&HK were received. First, note that Q and 2 were transposed. Verifying this identification number in Equation (3), we obtain the error $e = (1, 1, 1, 0, 1, 0)$. From Equation (4) of Theorem 3, we obtain the error $f = (1, 1, 0, 1, 0, 0)$. We also see that $i = 7$ satisfies the equation $e (I + P^{2})^{- 1} P^{i} = f (I + P)^{- 1}$. This confirms that 2 and Q (positions 7 and 8) were transposed. As a side note, the inverses in this formula are taken modulo $p$; when software returns a real-valued inverse, it may be necessary to use scalar multiplication (for example, by the determinant) and then reduce modulo $p$ to avoid decimal answers.
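The DNA transposition example can be reproduced with the Python sketch below. As in the earlier sketches, $p = 2$, the row-vector convention, and all function names are assumptions. The locator is tested in the inverse-free form $e P^{i} (I + P) = f (I + P^{2})$, which is equivalent to the formula of Theorem 3 because all matrices involved are polynomials in $P$ and commute; this sidesteps the matrix inverses mentioned in the side note.

```python
ALPHABET = ("0123456789" "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
            "abcdefghijklmnopqrstuvwxyz" "@&")
P = [[0, 0, 0, 0, 0, 1],
     [1, 0, 0, 0, 0, 1],
     [0, 1, 0, 0, 0, 0],
     [0, 0, 1, 0, 0, 0],
     [0, 0, 0, 1, 0, 0],
     [0, 0, 0, 0, 1, 0]]     # companion matrix of x^6 + x + 1 over Z_2
ZERO = (0,) * 6


def to_vec(ch):
    value = ALPHABET.index(ch)
    return tuple((value >> shift) & 1 for shift in range(5, -1, -1))


def to_char(vec):
    return ALPHABET[int("".join(map(str, vec)), 2)]


def add(u, v):
    return tuple((x + y) % 2 for x, y in zip(u, v))


def times_P(v, power=1):
    for _ in range(power):
        v = tuple(sum(v[r] * P[r][c] for r in range(6)) % 2 for c in range(6))
    return v


def syndromes(word):
    """e from Equation (3) (I on odd, P^2 on even positions), f from (4)."""
    e = f = ZERO
    for i, ch in enumerate(word, start=1):
        a = to_vec(ch)
        e = add(e, a if i % 2 == 1 else times_P(a, 2))
        f = add(f, times_P(a, i))
    return e, f


def check_digits(info):
    """Pairs (a_n, a_{n+1}) satisfying Equations (3) and (4), n odd."""
    n = len(info) + 1
    if n % 2 == 0:
        raise ValueError("Theorem 3 assumes that n is odd")
    s3, s4 = syndromes(info)
    found = []
    for y in ALPHABET:
        vy = to_vec(y)
        vx = add(s3, times_P(vy, 2))          # Equation (3) fixes a_n
        if add(s4, add(times_P(vx, n), times_P(vy, n + 1))) == ZERO:
            found.append(to_char(vx) + y)
    return found


def correct_transposition(word):
    """Find i with e P^i (I + P) = f (I + P^2) and swap positions i, i+1."""
    e, f = syndromes(word)
    rhs = add(f, times_P(f, 2))
    hits = [i for i in range(1, len(word))
            if add(times_P(e, i), times_P(e, i + 1)) == rhs]
    if (e, f) == (ZERO, ZERO) or len(hits) != 1:
        raise ValueError("not a correctable adjacent transposition")
    i = hits[0]
    return i, word[:i - 1] + word[i] + word[i - 1] + word[i + 1:]
```

Here `check_digits("ADoIII2Q9&")` returns `["HK"]`, and `correct_transposition("ADoIIIQ29&HK")` returns position 7 together with the corrected string `"ADoIII2Q9&HK"`.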

4. Conclusions

In this article, we discussed a check digit system based on hexadecimal numbers, then extended it to groups of order $p^{k}$, and summarized its error detection capability. The main accomplishment in this paper is the development of theorems for the correction of single errors and of adjacent transposition errors. The identification numbers we designed have two check digits. It is important to note that Equation (1) of Theorem 2 does not detect transposition errors, but Equation (3) in Theorem 3 detects both single errors and adjacent transposition errors. We also proposed an application to encoding DNA sequences and the construction of check digits for these sequences. Another novelty in this work is the affixing of a check codon for the detection and correction of single errors and adjacent transposition errors in a DNA sequence of length $3m$ compressed to a sequence of length $m$. Future work will include the correction of other types of errors, including jump transposition and twin errors.

Funding

This research received no external funding.

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Conflicts of Interest

The author declares no conflicts of interest.

References

1. Beckley, D.F. An optimal system with modulo 11. Comput. Bull. 1967, 11, 213–215.
2. Verhoeff, J. Error detecting decimal codes. In Mathematical Centre Tracts; Mathematisch Centrum: Amsterdam, The Netherlands, 1969; Volume 29.
3. Niemenmaa, M. A check digit system for hexadecimal numbers. Appl. Algebra Eng. Commun. Comput. 2011, 22, 109–112.
4. Gallian, J. The Mathematics of Identification Numbers. Coll. Math. J. 1991, 22, 194–202.
5. Chen, Y.; Niemenmaa, M.; Vinck, A.J.H.; Gligoroski, D. On some properties of a check digit system. In Proceedings of the 2012 IEEE International Symposium on Information Theory Proceedings, Cambridge, MA, USA, 1–6 July 2012; pp. 1563–1567.
6. Chen, Y.; Niemenmaa, M.; Vinck, A.J.H. A check digit system over a group of arbitrary order. In Proceedings of the 2013 8th International Conference on Communications and Networking in China (CHINACOM), Guilin, China, 14–16 August 2013; pp. 897–902.
7. Chen, Y.; Niemenmaa, M.; Vinck, A.J.H. A general check digit system based on finite groups. Des. Codes Cryptogr. 2016, 80, 149–163.
8. Lidl, R.; Niederreiter, H. Introduction to Finite Fields and Their Applications, Revised ed.; Cambridge University Press: Cambridge, UK, 1994.
9. Chen, Y.; Niemenmaa, M.; Vinck, A.J.H.; Gligoroski, D. On the Error Detection Capability of One Check Digit. IEEE Trans. Inf. Theory 2014, 60, 261–270.
10. Koonin, E.V.; Novozhilov, A.S. Origin and evolution of the universal genetic code. Annu. Rev. Genet. 2017, 51, 45–62.
11. Crick, F.H.C.; Barnett, L.; Brenner, S.; Watts-Tobin, R.J. General nature of the genetic code for proteins. Nature 1961, 192, 1227–1232.
12. Alberts, B.; Heald, R.; Johnson, A.; Morgan, D.; Raff, M. Molecular Biology of the Cell, 7th ed.; W. W. Norton & Company: New York, NY, USA, 2022.
13. Milenkovic, O.; Kashyap, N. On the design of codes for DNA computing. In International Workshop on Coding and Cryptography; Springer: Berlin/Heidelberg, Germany, 2017; pp. 100–119.
14. Liu, J.; Liu, H. DNA Codes Over the Ring $F_{4}[U]/\langle U^{3} \rangle$. IEEE Access 2020, 8, 77528–77534.
15. Influenza Virus B. Available online: https://www.ncbi.nlm.nih.gov/nuccore/D00004.1 (accessed on 22 September 2024).

Table 1. Codon correspondence.

Codon	$G = Z_{2} \oplus Z_{2} \oplus Z_{2} \oplus Z_{2} \oplus Z_{2} \oplus Z_{2}$	GF(64) alphanumeric
aaa	$(0, 0, 0, 0, 0, 0)$	0
aac	$(0, 0, 0, 0, 0, 1)$	1
aag	$(0, 0, 0, 0, 1, 0)$	2
aat	$(0, 0, 0, 0, 1, 1)$	3
aca	$(0, 0, 0, 1, 0, 0)$	4
acc	$(0, 0, 0, 1, 0, 1)$	5
acg	$(0, 0, 0, 1, 1, 0)$	6
act	$(0, 0, 0, 1, 1, 1)$	7
aga	$(0, 0, 1, 0, 0, 0)$	8
agc	$(0, 0, 1, 0, 0, 1)$	9
agg	$(0, 0, 1, 0, 1, 0)$	A
agt	$(0, 0, 1, 0, 1, 1)$	B
ata	$(0, 0, 1, 1, 0, 0)$	C
atc	$(0, 0, 1, 1, 0, 1)$	D
atg	$(0, 0, 1, 1, 1, 0)$	E
att	$(0, 0, 1, 1, 1, 1)$	F
caa	$(0, 1, 0, 0, 0, 0)$	G
cac	$(0, 1, 0, 0, 0, 1)$	H
cag	$(0, 1, 0, 0, 1, 0)$	I
cat	$(0, 1, 0, 0, 1, 1)$	J
cca	$(0, 1, 0, 1, 0, 0)$	K
ccc	$(0, 1, 0, 1, 0, 1)$	L
ccg	$(0, 1, 0, 1, 1, 0)$	M
cct	$(0, 1, 0, 1, 1, 1)$	N
cga	$(0, 1, 1, 0, 0, 0)$	O
cgc	$(0, 1, 1, 0, 0, 1)$	P
cgg	$(0, 1, 1, 0, 1, 0)$	Q
cgt	$(0, 1, 1, 0, 1, 1)$	R
cta	$(0, 1, 1, 1, 0, 0)$	S
ctc	$(0, 1, 1, 1, 0, 1)$	T
ctg	$(0, 1, 1, 1, 1, 0)$	U
ctt	$(0, 1, 1, 1, 1, 1)$	V
gaa	$(1, 0, 0, 0, 0, 0)$	W
gac	$(1, 0, 0, 0, 0, 1)$	X
gag	$(1, 0, 0, 0, 1, 0)$	Y
gat	$(1, 0, 0, 0, 1, 1)$	Z
gca	$(1, 0, 0, 1, 0, 0)$	a
gcc	$(1, 0, 0, 1, 0, 1)$	b
gcg	$(1, 0, 0, 1, 1, 0)$	c
gct	$(1, 0, 0, 1, 1, 1)$	d
gga	$(1, 0, 1, 0, 0, 0)$	e
ggc	$(1, 0, 1, 0, 0, 1)$	f
ggg	$(1, 0, 1, 0, 1, 0)$	g
ggt	$(1, 0, 1, 0, 1, 1)$	h
gta	$(1, 0, 1, 1, 0, 0)$	i
gtc	$(1, 0, 1, 1, 0, 1)$	j
gtg	$(1, 0, 1, 1, 1, 0)$	k
gtt	$(1, 0, 1, 1, 1, 1)$	l
taa	$(1, 1, 0, 0, 0, 0)$	m
tac	$(1, 1, 0, 0, 0, 1)$	n
tag	$(1, 1, 0, 0, 1, 0)$	o
tat	$(1, 1, 0, 0, 1, 1)$	p
tca	$(1, 1, 0, 1, 0, 0)$	q
tcc	$(1, 1, 0, 1, 0, 1)$	r
tcg	$(1, 1, 0, 1, 1, 0)$	s
tct	$(1, 1, 0, 1, 1, 1)$	t
tga	$(1, 1, 1, 0, 0, 0)$	u
tgc	$(1, 1, 1, 0, 0, 1)$	v
tgg	$(1, 1, 1, 0, 1, 0)$	w
tgt	$(1, 1, 1, 0, 1, 1)$	x
tta	$(1, 1, 1, 1, 0, 0)$	y
ttc	$(1, 1, 1, 1, 0, 1)$	z
ttg	$(1, 1, 1, 1, 1, 0)$	@
ttt	$(1, 1, 1, 1, 1, 1)$	&


© 2025 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
