Algebraic Analysis of a Simplified Encryption Algorithm GOST R 34.12-2015

In January 2016, a new standard for symmetric block encryption was established in the Russian Federation. The standard contains two encryption algorithms: Magma and Kuznyechik. In this paper, we consider the possibility of applying the algebraic analysis method to these ciphers. To do this, we use the simplified algorithms Magma ⊕ and S-KN2. To solve sets of nonlinear Boolean equations, we choose two different approaches: reduction to the Boolean satisfiability problem, which is then solved with the CryptoMiniSat solver, and the extended linearization method (XL). In our research, we suggest a security assessment approach that identifies the resistance of block ciphers to algebraic cryptanalysis. The algebraic analysis of an eight-round Magma (68 key bits were fixed) with the CryptoMiniSat solver required four known text pairs and took 3029.56 s to complete (the search took 416.31 s). The algebraic analysis of a five-round Magma cipher with weakened S-boxes required seven known text pairs and took 1135.61 s (the search took 3.36 s). The algebraic analysis of a five-round Magma cipher with disabled S-boxes (equivalent value substitution) yielded a single solution for five known text pairs in 501.18 s (the search took 4.92 s). The XL algebraic analysis of a four-round S-KN2 cipher with three text pairs took 236.33 s and used 1.191 GB of RAM.


Introduction
The discussion of the potential vulnerability of ciphers to algebraic analysis has drawn the attention of researchers all over the world to algebraic methods of attack on information protection systems. The advantage of this method over other methods of cryptanalysis is that the correct encryption key can be obtained from a small number of known plaintext/ciphertext pairs. Algebraic analysis mostly focuses on the nonlinear primitives of encryption algorithms and is based on describing the encryption algorithm as a system of nonlinear equations connecting the secret key and the known data. In [1], the author states that an analysis of the full-round GOST 28147-89 algorithm was performed. However, that work does not disclose the approaches used in the analysis; it gives only a general description of the work done and its approximate complexity.
Referring to the work of C. Shannon [2], we can say that assessing the reliability of information protection with encryption algorithms amounts to doing "as much work as to solve a system of equations with a large number of unknowns". For a long time, the analysis of block encryption algorithms concentrated on statistical methods, while algebraic methods, which describe general approaches to analyzing the reliability of encryption algorithms, were not sufficiently taken into consideration.
There are various ways to find solutions of systems of nonlinear Boolean equations. During the analysis of scientific research databases, three main areas were identified in the field of solving such systems as used in information security assessment (reliability of encryption algorithms) [3]. The above algebraic analysis methods are being actively developed and improved as applied to cryptographic algorithms (especially lightweight ones, because of their simplified mathematical structure). Several recent results have been published on the topic of algebraic analysis (namely linearization methods and SAT solving). Taking into account the importance of S-boxes for algebraic analysis, we note research [4], which discussed the effects of S-box representation on the efficiency of the algebraic analysis of block ciphers. Additionally, the representation of addition modulo $2^n$ operations in algebraic analysis and its influence on efficiency are discussed in [5].
We should note that paper [6] aimed at analyzing the practical effectiveness of algebraic attacks using experiments on reduced block ciphers (an algebraic attack was implemented on a reduced LowMC cipher). As the basis of the attack, a dynamical elimination algorithm was developed. The research [7] is devoted to the development and analysis of quantum algorithms for solving a system of quadratic equations. The complexity of the algorithms XL, FXL, ReversibleXL and GroverXL is calculated for random systems. Research [8] is focused on the application of the XL algorithm to generic systems with 32 variables and 64 equations over GF(16). Experiments were carried out on two computer systems, a 64-core NUMA system and an 8-node InfiniBand cluster, and the investigated implementation of the XL algorithm was compared with PWXL (a parallel implementation of XL) and Faugère's F4 algorithm. Algebraic attacks (by a linearization method) on two/three/four rounds of Keccak-384 and two/three rounds of Keccak-512 were investigated in paper [9]. Paper [10] describes the application of algebraic cryptanalysis to the 12-round LBlock, six-round MIBS, seven-round PRESENT and nine-round SKINNY lightweight block ciphers. A new approach to simplifying the equation system was presented (by using additional polynomial relations, namely linear relations between intermediate state bits). Research [11] demonstrated that the block ciphers Jarvis and Friday (members of the MARVELlous family of cryptographic primitives) are vulnerable to Gröbner basis attacks. Algebraic attacks on reduced- and full-round DESL (a lightweight version of DES) are presented in [12].
Paper [13] describes a SAT-based algorithm to determine the multiplicative complexity of a Boolean function. A SAT-based cryptanalysis of the Grain v1 stream cipher is presented in [14]. Practical SAT-based guess-and-determine attacks on several stream ciphers are developed in [15]. The dissertation [16] explores the potential synergy between Gröbner-like and DPLL-like solving. The author presented new types of solving algorithms (SRES) and carried out experiments with algebraic fault attacks on the symmetric cipher LED and derivatives of the block cipher AES.
In this paper, we consider the possibility of using linearization methods and SAT solvers to analyze information security properties for Russian symmetric block encryption standards and their modified versions.
We also consider approaches to the algebraic cryptanalysis of the simplified algorithms Magma and Kuznyechik (S-KN2, which is presented in [17]). The paper is organized as follows: Section 1 contains a brief description of the Russian symmetric block encryption standard GOST R 34.12-2015 and of its simplified versions. Section 2 is devoted to the investigation of the basic algebraic analysis methods: extended linearization and SAT solving. Section 3 includes the proposed security assessment approaches and their input and output parameters. In Section 4, we consider algorithms for describing encryption transformations as systems of linearly independent equations (we fixed two basic nonlinear elements: the S-box and addition modulo $2^n$). Section 5 presents the experimental results of applying algebraic analysis methods to the Magma cipher (some versions) and the S-KN2 cipher.

GOST R 34.12-2015
GOST R 34.12-2015 was introduced as a new symmetric block cipher standard in Russia in 2016 [18]. The standard contains the descriptions of two encryption algorithms: the Magma cipher (GOST R 34.12-2015 n = 64) and Kuznyechik cipher (GOST R 34.12-2015 n = 128). We describe both of them below, as well as the simplified versions used (Magma ⊕ and S-KN2). The Magma encryption algorithm is part of the symmetric encryption standard in the Russian Federation [18]. Previously, this encryption algorithm was called GOST 28147-89 and was slightly different from the current version. In the earlier version of the cipher, unfixed S-boxes were used. The Magma cipher is a symmetric block cipher designed according to the Feistel scheme. In Electronic Codebook (ECB) mode, 64 bits of a data block (T) are converted to 64 bits of ciphertext (C) under the influence of a 256-bit secret key. According to Feistel's scheme, a data block is divided into two parts, each part containing 32 bits. The right part of the data is processed by the F-function in each round. The F-function consists of three operations:

• mixing data with secret key bits using addition modulo $2^{32}$;
• S-box bit substitution;
• 11-position cyclic shift to the left.
The F-function output is mixed with the left part of the data block by addition modulo two. After that, the left and right parts of the text are swapped. The scheme of the Magma encryption algorithm is shown in Figure 1. The S-boxes recommended by the GOST R 34.12-2015 standard for use in the Magma cipher are presented in Table 1.
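The F-function and a single Feistel round described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the reference implementation of the standard: the function names are ours, the S-boxes are passed in as a parameter (Table 1 is not reproduced here), and we assume that S-box number i is applied to the i-th nibble counted from the least significant one.

```python
MASK32 = 0xFFFFFFFF


def rotl32(value, shift):
    """Cyclically shift a 32-bit word to the left by `shift` positions."""
    value &= MASK32
    return ((value << shift) | (value >> (32 - shift))) & MASK32


def f_function(right, round_key, sboxes):
    """F-function: key addition modulo 2^32, nibble substitution, rotation by 11."""
    mixed = (right + round_key) & MASK32
    substituted = 0
    for i in range(8):
        # Assumption: sboxes[i] handles the i-th nibble from the least significant end.
        nibble = (mixed >> (4 * i)) & 0xF
        substituted |= sboxes[i][nibble] << (4 * i)
    return rotl32(substituted, 11)


def feistel_round(left, right, round_key, sboxes):
    """One round: XOR F(right) into the left half, then swap the halves."""
    return right, left ^ f_function(right, round_key, sboxes)


# Identity S-boxes model the variant with disabled substitution.
IDENTITY_SBOXES = [list(range(16)) for _ in range(8)]
```

With `IDENTITY_SBOXES`, the round function reduces to modular addition followed by the rotation, which is convenient for checking the data flow before real S-boxes are plugged in.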

The round keys used in each round are derived from the original 256-bit key. The original secret key is divided into eight 32-bit parts: K1, K2, K3, K4, K5, K6, K7, and K8. The round keys are used in direct order from K1 to K8 in rounds 1 to 24 (three times) and in reverse order from K8 to K1 in rounds 25 to 32.
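The order in which the Magma round keys K1 to K8 are used over 32 rounds can be illustrated with a short Python sketch. The function name is ours, and the big-endian interpretation of each 4-byte key part is an assumption made for the illustration.

```python
def magma_round_keys(key):
    """Split a 256-bit key into K1..K8 and return the sequence of 32 round keys."""
    if len(key) != 32:
        raise ValueError("expected a 256-bit key (32 bytes)")
    # Assumption: each 32-bit part is read as a big-endian integer.
    parts = [int.from_bytes(key[4 * i:4 * i + 4], "big") for i in range(8)]
    # Rounds 1-24: K1..K8 three times; rounds 25-32: K8..K1.
    return parts * 3 + parts[::-1]
```

For example, `magma_round_keys(bytes(range(32)))` returns a list of 32 words whose first and last elements are both K1.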

Kuznyechik Cipher (GOST R 34.12-2015 n = 128) and S-KN2 Cipher
The Kuznyechik cipher has been part of the government standard for symmetric data encryption in the Russian Federation since 2016. A full description of the cipher can be found in [18]. The cipher is based on the substitution-permutation network principle. The cipher takes a 128-bit data block as input and produces a 128-bit ciphertext at the output. A 256-bit secret key is used for the conversion. The cipher starts by mixing the data with the first round key (addition modulo 2). After that, nine rounds of encryption are performed. Each round of encryption consists of three operations:
• byte substitution with the S-box. The data block is divided into 16 bytes. Each byte is replaced by a new value according to the table defined in the standard;
• linear mixing of bits L. The operation is performed by using the multiplication of polynomials in the given field. Multiplication is performed 16 times until all bytes are changed;
• mixing data with secret key bits using addition modulo 2.
The output of the ninth round of encryption forms the ciphertext. To decrypt the data, the operations must be applied in reverse order, each operation must be replaced by its inverse, and the round keys must also be used in reverse order.
The secret key is divided into two parts, K1 and K2, which form the first two round keys. These keys are used as inputs into a special Feistel scheme to form the remaining round keys.
The Kuznyechik cipher is a new algorithm and has not been sufficiently researched. The investigation of its properties is important. Applying simplified models for finding cryptographic properties is a common approach in cryptography.
For example, the S-DES algorithm, proposed by E. Schaefer and W. Stallings, is widely used for educational purposes [19]. A few simplified versions of AES were proposed by various authors: Raphael Chung-Wei Phan [20], Mohammad Musa, Edward Schaefer, and Stephen Wedig [21], Henri Gilbert [22], and others. These ciphers are used not only in education, but also to model various types of cryptanalytic attacks. Linear cryptanalysis of S-DES was proposed in [23]. The possibility of using differential cryptanalysis in addition to linear cryptanalysis is presented in [24]. The authors of [25] investigate the use of heuristic cryptanalysis for S-DES analysis. The authors of [26] present an approach to the cryptanalysis of S-DES using a genetic algorithm. The attack is ciphertext-only, and many optimal keys are created on the basis of a fitness function. The authors of [27] present a new cryptanalytic attack aimed at the ciphertext generated by S-DES. The attack was carried out using a modified version of the BPSO (binary particle swarm optimization) algorithm. It is clear from the publication dates that simplified versions of DES are still of interest to cryptographers. The number of publications on S-AES is no smaller. The authors of [28] present an approach to S-AES analysis using the impossible differential method. B. Hitapuru and S. Indarjani consider the possibility of applying the square attack to mini-AES [29]. The linear cryptanalysis of S-AES is presented by S. Indarjani, D. Mansouri, and H. K. Bizaki [30]. This investigation was continued by S. Campbell et al. [31]. The authors characterize a class of strongly nonlinear S-boxes for which their algorithm is always successful. They also show how to construct S-boxes to make the algorithm more resistant to linear cryptanalysis. S. Simmonds reviewed different approaches to S-AES analysis [32].
Two simplified versions of the Kuznyechik cipher were presented in [17]. The S-KN2 cipher was developed to simulate various cryptographic attack scenarios. The S-KN2 cipher is modeled closely on the original Kuznyechik cipher. It converts a 16-bit data block using an SP network for three rounds. The secret key contains 32 bits. Round keys are generated from the original secret key using four rounds of the Feistel scheme. Each round consists of substitution S, linear transformation L, and addition with the round subkey modulo two.
The original description of the S-KN2 cipher uses four rounds [17]. However, this number can easily be increased. If the experiment calls for more rounds, additional round keys must be generated according to the scheme. The S-KN2 cipher is shown in Figure 2. The first transformation is the mixing of data with the first round key. Hereinafter, data mixing with a round key is performed using addition modulo 2 (XOR operation). Further, each round contains three operations: substitution S, mixing L, and addition with a round key. Consider each transformation in more detail.
In operation S, the data block is divided into four nibbles. For each nibble, a substitution is performed using Table 2. Table 2 is to be interpreted as follows: the upper row indicates the S-box input, while the lower one indicates the corresponding output. The inverse table then takes the form shown in Table 3. The data in Tables 2 and 3 are presented in hexadecimal form.
The L transform contains four iterations, each changing one nibble and right shifting another one. For these transformations, the nibbles are represented as polynomials, and multiplication is performed modulo $\Psi(x) = x^4 \oplus x \oplus 1$. In [17], we propose an input-output table for polynomial multiplication by three. This table is easy to build and easy to use. A similar table can be constructed for multiplication by four.
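The multiplication tables mentioned above can be generated mechanically. The following Python sketch builds the tables for multiplication by three and by four in GF(2^4) modulo $x^4 \oplus x \oplus 1$; the names `gf16_mul`, `MUL3`, and `MUL4` are ours and are used only for illustration.

```python
MODULUS = 0b10011  # x^4 + x + 1


def gf16_mul(a, b):
    """Multiply two 4-bit polynomials over GF(2) modulo x^4 + x + 1."""
    result = 0
    for _ in range(4):
        if b & 1:
            result ^= a          # add the current multiple of a
        b >>= 1
        a <<= 1                  # multiply a by x
        if a & 0x10:
            a ^= MODULUS         # reduce when the degree reaches 4
    return result


# Input-output tables: index is the nibble, value is the product.
MUL3 = [gf16_mul(3, value) for value in range(16)]
MUL4 = [gf16_mul(4, value) for value in range(16)]
```

Since multiplication by a nonzero element is invertible in a field, each table is a permutation of the 16 nibble values, which gives a quick sanity check on the construction.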
For decryption, it is necessary to use the inverse operation $L^{-1}$ (Figure 2). The following set of equations is used for that: $a_3 = 4 \cdot a_2 \oplus a_1 \oplus 3 \cdot a_0 \oplus a_3$.
To generate round keys, the Feistel scheme shown in Figure 3 is used. The first two round keys (K1 and K2, which are obtained from the original secret key) are considered as input. The right part (key K1) is input into the function F. The function F is also shown in Figure 3 and consists of three operations: addition modulo two of the data block and the constant $C_i$, substitution with the S-box, and linear mixing L. The constants $C_i$ are obtained by applying the linear transform L to i.
Data decryption is implemented in the reverse direction from bottom to top. The inverse operations are applied instead of their counterparts. A sample encryption and decryption by S-KN2 is given in [17].

Extended Linearization Method
The extended linearization method (XL) was proposed by N. Courtois, A. Klimov, J. Patarin, and A. Shamir in [33].
Let a field K and a system of m quadratic equations be given, where m is the number of equations of the system. Each equation of the system is a polynomial of the form $f_i(x_1, \ldots, x_n) - b_i$, where i is the number of the equation of the system, $x_1, \ldots, x_n$ are the unknowns, and $b_i$ is the free term of the equation. The goal of the extended linearization method is to obtain at least one solution $x = (x_1, \ldots, x_n) \in K^n$ for a given value of $b = (b_1, \ldots, b_m) \in K^m$. All possible products of the unknowns x of degree k are denoted as the set W. Let $D \in \mathbb{N}$, where $\mathbb{N}$ is the set of natural numbers; then $I_D$ is the ideal generated by the products of the equations of the system with elements of W, of total degree at most D.
It is obvious that the total number of linearly independent equations cannot exceed the number of monomials T. If the system has a unique solution, then there is a value of D for which the inequality $R \ge T$ holds. Moreover, the number of linearly independent equations among the R equations will be close to the value of T. If the difference between the number of monomials and the number of linearly independent equations is not large, then the system will be solvable, and the smaller this difference, the easier the system is to solve.
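The condition $R \ge T$ can be explored numerically. The sketch below is an illustration under our own assumptions, not the procedure used in the experiments: it considers a quadratic Boolean system (so that $x_i^2 = x_i$ and only multilinear monomials are counted), takes R as the number of equations generated by multiplying each of the m equations by every monomial of degree at most D − 2, and takes T as the number of monomials of degree at most D. R counts generated equations, not linearly independent ones, so the result is only a lower estimate of the required D. The function names are hypothetical.

```python
from math import comb


def xl_counts(n, m, degree):
    """Return (R, T) for n unknowns, m quadratic equations and XL parameter D."""
    # T: multilinear monomials of degree 0..D in n Boolean unknowns.
    monomials = sum(comb(n, k) for k in range(degree + 1))
    # R: each equation is multiplied by every monomial of degree 0..D-2.
    equations = m * sum(comb(n, k) for k in range(degree - 1))
    return equations, monomials


def smallest_degree(n, m):
    """Smallest D >= 2 with R >= T, or None if no such D <= n exists."""
    for degree in range(2, n + 1):
        equations, monomials = xl_counts(n, m, degree)
        if equations >= monomials:
            return degree
    return None
```

For instance, with n = 4 unknowns and m = 4 equations, D = 2 gives R = 4 against T = 11, while D = 3 gives R = 20 against T = 15, so `smallest_degree(4, 4)` returns 3.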
It is expected that the value of D, at which the extended linearization method is applicable, will be equal to or close to the theoretical value of the parameter D. In this case, the algorithm of the extended linearization method will be effective, provided that: From the Formula (1) we obtain that:

SAT Solvers for Solving a System of Boolean Algebraic Equations
Any SAT problem involves two key stages: checking the satisfiability of an arbitrary Boolean function represented in conjunctive normal form (CNF) and finding a set of values for which such a CNF is satisfied. Many SAT solvers are based on the DPLL (Davis, Putnam, Logemann, Loveland) algorithm, which was developed in 1962 precisely to determine the satisfiability of Boolean formulas in CNF. For more than half a century, the DPLL algorithm has been the basis for most effective solvers for SAT problems. The main idea of the DPLL algorithm is to traverse the search tree depth-first and apply the unit clause rule [34]. The DPLL algorithm splits the set of CNF variables into two subsets, A and B, where variables with the value "true" are included in the set A, and variables with the value "false" in the set B. At each step, an arbitrary variable of the CNF is selected and the value "true" is assigned to it (the variable is added to subset A). Then the initial formula is simplified, and the simplified problem is solved. If the CNF obtained after simplification is satisfiable, then the value of the variable was chosen correctly; otherwise, the selected variable is assigned the value "false" and transferred to the subset B, and the problem is solved again for this value. Thus, the algorithm either finds the correct value of the variable ("true" or "false") or proves that the original formula is unsatisfiable.
Each time a variable is checked, the original CNF is simplified according to the following two rules:
1. Unit propagation. If there is only one variable left in a clause, it is assigned such a value that the clause becomes true (the variable is put in the subset A if it occurs in the clause without negation, or in the subset B if it occurs with negation).
2. The elimination of "pure variables". If a variable occurs in the formula only with negation or only without negation, then it is called "pure" and can be assigned such a value that every clause containing it becomes "true" (in this way we reduce the number of free variables).
If, after simplification, an empty clause is obtained (i.e., all literals of the clause are false), the formula is unsatisfiable under the current assignment and we return to the previous step. If no free variables remain and no clause is falsified, then the formula is satisfiable, and the algorithm can be stopped. If no clauses remain (free variables that are still unassigned can be set arbitrarily), then the satisfiability check of the CNF also ends.
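The procedure described above can be sketched compactly in Python. This is a textbook-style illustration with our own function names, not the algorithm as implemented in any particular solver: clauses are collections of nonzero integers in the DIMACS convention (a positive integer is a variable, a negative one its negation), and the branching variable is simply taken from the first remaining clause.

```python
def simplify(clauses, literal):
    """Make `literal` true: drop satisfied clauses, delete the opposite literal."""
    result = []
    for clause in clauses:
        if literal in clause:
            continue                      # clause is satisfied, remove it
        result.append(clause - {-literal})
    return result


def dpll(clauses, assignment=None):
    """Return a satisfying assignment {variable: bool}, or None if unsatisfiable."""
    assignment = dict(assignment or {})
    clauses = [frozenset(clause) for clause in clauses]
    while True:
        if any(len(clause) == 0 for clause in clauses):
            return None                   # empty clause: conflict
        if not clauses:
            return assignment             # nothing left to satisfy
        # Rule 1: unit propagation.
        forced = next((next(iter(c)) for c in clauses if len(c) == 1), None)
        if forced is None:
            # Rule 2: pure literal elimination.
            literals = {lit for clause in clauses for lit in clause}
            forced = next((lit for lit in literals if -lit not in literals), None)
        if forced is None:
            break                         # no rule applies, we must branch
        assignment[abs(forced)] = forced > 0
        clauses = simplify(clauses, forced)
    variable = abs(next(iter(clauses[0])))
    for literal in (variable, -variable):  # try "true" first, then "false"
        branch = dict(assignment)
        branch[variable] = literal > 0
        result = dpll(simplify(clauses, literal), branch)
        if result is not None:
            return result
    return None
```

Variables that do not appear in the returned dictionary were never constrained and may take either value. Modern solvers such as CryptoMiniSat add clause learning, restarts, and specialized data structures on top of this basic scheme.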
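Consider the representation of addition modulo two in CNF. Let the formula $x \oplus y \oplus z$ be given, which is true when an odd number of its variables are true; it should then be presented in the form of the clauses $(x \vee y \vee z) \wedge (x \vee \bar{y} \vee \bar{z}) \wedge (\bar{x} \vee y \vee \bar{z}) \wedge (\bar{x} \vee \bar{y} \vee z)$.
In general, $2^{n-1}$ clauses are needed to describe an addition modulo two of length n.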
In order to reduce the number of clauses in the SAT representation, the fragmentation of the modulo two addition operation is used.
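Both the direct encoding and the fragmentation of an addition modulo two can be illustrated in Python. The sketch below uses our own function names and the DIMACS convention for literals; the fresh variables introduced by `cut_xor` are an assumption of this illustration, chosen so that each fragment equals the running sum of the variables consumed so far.

```python
from itertools import product


def xor_to_cnf(variables, rhs=True):
    """Encode x1 xor ... xor xn = rhs directly, using 2^(n-1) clauses."""
    clauses = []
    for values in product((False, True), repeat=len(variables)):
        # Forbid every assignment whose parity differs from rhs.
        if (sum(values) % 2 == 1) != rhs:
            clauses.append([-v if bit else v for v, bit in zip(variables, values)])
    return clauses


def cut_xor(variables, rhs, next_free, max_len=4):
    """Split a long XOR into chained XORs of at most max_len (>= 3) variables.

    Returns the list of (variables, rhs) fragments and the next unused variable.
    """
    variables = list(variables)
    pieces = []
    while len(variables) > max_len:
        fresh = next_free
        next_free += 1
        # fresh = xor of the first max_len - 1 variables.
        pieces.append((variables[:max_len - 1] + [fresh], False))
        variables = [fresh] + variables[max_len - 1:]
    pieces.append((variables, rhs))
    return pieces, next_free
```

For a sum of eight unknowns, the direct encoding needs 2^7 = 128 clauses, whereas `cut_xor(range(1, 9), True, 9)` produces three fragments of four variables each, that is, 3 · 8 = 24 clauses at the cost of two additional variables.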
The formula of the form can be represented in the SAT solver, as: To simplify the search for solution sets using SAT solvers, a search on variables is also used instead of a search on literals.
For example, to represent a formula that contains products of unknowns, each product of the unknowns should be represented as a separate clause, i.e., replaced with additional variables $i_1, \ldots, i_3$. We use the CryptoMiniSat package in the SageMath environment to solve a system of Boolean equations as a SAT problem. CryptoMiniSat is a DPLL-based SAT solver built on MiniSat. The fundamental differences between the CryptoMiniSat and MiniSat solvers are as follows:

1. Clauses of addition modulo two are distinguished at the beginning of the search for solutions. They have their own separate search list, a separate extension mechanism, and a categorization algorithm. The use of these features speeds up the search for solutions of systems of Boolean equations.
2. Clauses of addition modulo two (binary) are processed by special methods. First, the search is usually performed using special heuristics. Second, a tree structure is constructed from them, reflecting which of the variables are equivalent or anti-equivalent. The upper level of the constructed trees is usually replaced by lower values in the tree, thereby reducing the number of clauses and variables in the analyzed problem. This usually makes it necessary to reassign variables.
3. Technical and cryptographic SAT problems are very different, so CryptoMiniSat allows the restart settings and the type of learning heuristics to be changed, using the Glucose or MiniSat training methods.
4. Clauses are removed from the CNF as soon as at least one of the literals included in the clause takes the value "true". Unlike in MiniSat, literals equal to "false" are also deleted from the clauses, thereby allowing the clause to be shortened.
5. The removal of dependent variables is carried out among the associated clauses of addition modulo two. Dependent variables are variables that are found in only one clause of addition modulo two. This simplification allows the variable to be removed from the problem. It should be remembered that such a variable cannot be removed by the elimination of "pure" literals.
6. Variables take the values "false" and "true" at fixed intervals. If one of the search branches leads to an error (returning an unsatisfiable formula), then the second branch is checked. Moreover, the results of checking both branches of the search are saved for subsequent comparison.
The proposed algorithm for representing substitution transformations allows us to form a system of Boolean equations describing the transformations in an arbitrary S-box. The use of SAT solvers for algebraic analysis can be described in the following steps:
• representation of the cryptographic transformation as a system of Boolean equations in algebraic normal form (ANF);
• conversion of the equation system from ANF to CNF;
• solution of the SAT problem to find a set of solutions using a SAT solver.
After the generation of the system of nonlinear Boolean equations, we substitute the input and output vectors of the S-box through known text pairs (plaintexts and ciphertexts) using knowledge of the encryption algorithm structure. At this stage, a system of Boolean equations in algebraic normal form (ANF) is obtained.
To use existing SAT solvers, we should convert the resulting system of Boolean equations from ANF to CNF. First, we should simplify the representation of the equations generated for block ciphers in ANF.
In cryptographic tasks, the application of the following algorithm for converting from ANF to CNF turned out to be effective [35]:

1. Replacing the constant 1 by a new unknown, since CNF should not contain constants.
2. Replacing all products of unknowns by new variables (applying the linearization method to the original nonlinear system).
3. Splitting long chains of unknowns added modulo two into shorter chains (for example, of only four unknowns).
4. Representing the transformed system in CNF.
In general, $2^{p-1}$ clauses are required to represent a sum of unknowns of length p, which is why long sum chains (of length l) are split into shorter fragments. To represent the resulting system in CNF, the anf2cnf conversion library can be used [36][37][38]. Then the system of equations in CNF is passed to the SAT solver. We chose CryptoMiniSat 2.5 as one of the most efficient SAT solvers for cryptographic problems. We carried out the experiment in the SageMath Cloud environment [38].
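Step 2 of the conversion, the replacement of products by new variables, can be illustrated as follows. This Python sketch uses our own function names and is not the anf2cnf implementation: each monomial of degree two or higher receives a fresh variable, together with the clauses that force this variable to equal the product (logical AND) of its factors.

```python
def product_to_cnf(factors, new_var):
    """Clauses stating that new_var equals the product (AND) of `factors`."""
    # new_var = 1 implies every factor is 1.
    clauses = [[-new_var, factor] for factor in factors]
    # All factors equal to 1 imply new_var = 1.
    clauses.append([new_var] + [-factor for factor in factors])
    return clauses


def linearize(monomials, next_free):
    """Replace each nonlinear monomial (a tuple of variables) by a fresh variable.

    Returns the variables of the resulting linear sum, the linking clauses,
    and the next unused variable number.
    """
    linear_terms, clauses, known = [], [], {}
    for monomial in monomials:
        key = tuple(sorted(monomial))
        if len(key) == 1:
            linear_terms.append(key[0])
            continue
        if key not in known:              # reuse the variable for repeated products
            known[key] = next_free
            clauses.extend(product_to_cnf(key, next_free))
            next_free += 1
        linear_terms.append(known[key])
    return linear_terms, clauses, next_free
```

For the polynomial $x_1 x_2 \oplus x_3$, the call `linearize([(1, 2), (3,)], 4)` introduces variable 4 for the product and returns the linear sum of variables 4 and 3, which can then be encoded as an addition modulo two.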

Generation of a System of Boolean Equations Describing an Encryption Algorithm
The first step of algebraic analysis is the generation of a system of equations linking known data (plaintexts and ciphertexts) and an encryption key. For most encryption algorithms, a system of equations is constructed for substitution boxes because they are often the only non-linear encryption transformation.
Denote by $x_i$, $y_i$ the bits of the input and output vectors of the substitution box over the field $GF(2^s)$, where $i \in \mathbb{N}$, $0 \le i \le s - 1$. We need to present the substitution operation in S-boxes in the form of a subsystem of equations valid with probability 1 for all possible input and corresponding output values of the observed S-box. The common form of the equation describing the transformations in the S-box can be given by the formula:
$$\sum \alpha_{ij} x_i x_j \oplus \sum \beta_{ij} y_i y_j \oplus \sum \gamma_{ij} x_i y_j \oplus \sum \delta_i x_i \oplus \sum \varepsilon_i y_i \oplus \eta = 0,$$
where $x_i x_j$ is the multiplication of the input bits of the S-box, $y_i y_j$ is the multiplication of the output bits of the S-box, $x_i y_j$ is the multiplication of the input and output bits, $x_i$ and $y_i$ are the input and output bits of the S-box, respectively, and $\alpha$, $\beta$, $\gamma$, $\delta$, $\varepsilon$, $\eta$ are coefficients taking values of 0 or 1.
As part of the research, it is enough to consider the multiplication of two variables. However, for some algorithms (for which the algebraic immunity of the substitution box is three), it may be necessary to increase the number of monomials used by including products of three variables ($x_i x_j y_k$, $x_i y_j y_k$) in the equations.
In this case, the corresponding cubic terms are added to the equation. For a substitution box with an input size of s bits, we get $2^t$ possible equations, where t is the number of monomials in the system of equations. For quadratic equations, the parameter t is calculated by the formula $t = \binom{2s}{2} + 2s + 1$. When the block size is four bits, the number of monomials in the system is t = 37. Therefore, it is possible to make no more than $2^{37}$ = 137,438,953,472 quadratic equations. Then, to select from the total number of possible equations, a truth table is formed using only the transformations valid for the substitution box used. A general view of the truth table is shown in Table 4.
Some of the equations found to be valid for the S-box substitution table turned out to be linearly dependent and were not suitable for further use in algebraic analysis. It was necessary to choose only linearly independent equations for inclusion in the resulting system describing the transformations in the substitution box. When choosing linearly independent equations, we use the following condition [39]: for any substitution box $S(x_1, \ldots, x_s) \rightarrow (y_1, \ldots, y_h)$, if the condition $t > 2^s$ is satisfied, then there are at least $t - 2^s$ linearly independent equations valid for all input values of the substitution box.
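The bound $t - 2^s$ can be checked numerically. The following Python sketch, with our own function names, evaluates all monomials of degree at most two on every input-output pair of an S-box, which gives a $2^s \times t$ truth table over GF(2); the number of linearly independent quadratic equations equals t minus the rank of this table. The 4-bit permutation used in the usage note is an arbitrary example chosen for illustration and is not taken from the standard.

```python
from itertools import combinations


def monomial_row(x, y, s):
    """Evaluate all monomials of degree <= 2 in the bits of x and y.

    Returns the values packed into an integer bitmask and the number t of monomials.
    """
    bits = [(x >> i) & 1 for i in range(s)] + [(y >> i) & 1 for i in range(s)]
    values = [1] + bits + [a & b for a, b in combinations(bits, 2)]
    row = 0
    for position, value in enumerate(values):
        row |= value << position
    return row, len(values)


def gf2_rank(rows):
    """Rank over GF(2) of a matrix whose rows are integer bitmasks."""
    rows = list(rows)
    rank = 0
    while rows:
        pivot = rows.pop()
        if pivot == 0:
            continue
        rank += 1
        lowest = pivot & -pivot           # lowest set bit of the pivot row
        rows = [row ^ pivot if row & lowest else row for row in rows]
    return rank


def count_quadratic_equations(sbox, s=4):
    """Return (t, number of linearly independent quadratic equations)."""
    rows = []
    t = 0
    for x in range(2 ** s):
        row, t = monomial_row(x, sbox[x], s)
        rows.append(row)
    return t, t - gf2_rank(rows)
```

For example, for the permutation `[0xC, 5, 6, 0xB, 9, 0, 0xA, 0xD, 3, 0xE, 0xF, 8, 4, 7, 1, 2]` the function reports t = 37 monomials, and the number of independent equations cannot be smaller than 37 − 16 = 21 because the truth table has only 16 rows.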

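Table 4. General view of the truth table. Rows: all possible S-block inputs (from 0 to $2^s - 1$). Columns: all compositions of S-block inputs and outputs, namely $x_s, \ldots, x_1$; $y_s, \ldots, y_1$; $x_s x_{s-1}, \ldots, x_2 x_1$; $y_s y_{s-1}, \ldots, y_2 y_1$; $x_s y_s, \ldots$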
For the algebraic analysis of the GOST R 34.12-2015 (n = 64) cipher, the system of equations must be extended with the bitwise dependence between the substitution input bits, the plaintext bits (or the round input vector), and the secret round key, i.e., with equations describing addition modulo $2^{32}$. Consider three $n$-bit vectors $X = (x_0, \ldots, x_{n-1})$, $Y = (y_0, \ldots, y_{n-1})$, $Z = (z_0, \ldots, z_{n-1})$, for which addition modulo $2^n$ is performed:

$$Z = X + Y \pmod{2^n}.$$

In addition modulo $2^n$, each result bit $z_i$ depends on the less significant bits $x_{n-1}, \ldots, x_i$, $y_{n-1}, \ldots, y_i$. This transformation can be described through two subsystems [40]:

$$z_{n-1} = x_{n-1} \oplus y_{n-1}, \quad z_{n-2} = x_{n-2} \oplus y_{n-2} \oplus c_{n-2}, \quad \ldots, \quad z_0 = x_0 \oplus y_0 \oplus c_0,$$

$$c_{n-2} = x_{n-1} y_{n-1}, \quad c_{n-3} = x_{n-2} y_{n-2} \oplus (x_{n-2} \oplus y_{n-2}) c_{n-2}, \quad \ldots, \quad c_0 = x_1 y_1 \oplus (x_1 \oplus y_1) c_1,$$
where $c_{n-2}, \ldots, c_0$ are the carry bits between digit positions. Each carry $c_i$ can be expressed directly through the remaining unknowns, since $c_i = z_i \oplus x_i \oplus y_i$, and eliminated. Addition modulo $2^n$ is then described by the following system:

$$z_{n-1} = x_{n-1} \oplus y_{n-1},$$

$$z_i \oplus x_i \oplus y_i = x_{i+1} y_{i+1} \oplus (x_{i+1} \oplus y_{i+1})(z_{i+1} \oplus x_{i+1} \oplus y_{i+1}), \quad 0 \le i \le n - 2.$$

For any $n$, this gives one linear and $n - 1$ quadratic equations. Thus, for 32-bit round-key vectors, the system of equations gains 32 extra equations for each round key, and each round key introduces a new 32-bit variable (32 additional unknowns).
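The bitwise description of addition modulo $2^n$ can be checked exhaustively for a small $n$. The following Python sketch is only an illustration: the function names are hypothetical, and it assumes the indexing used in the text, where index $n - 1$ is the least significant bit. It compares the carry recurrence with ordinary integer addition and confirms that the carry-free system (one linear and $n - 1$ quadratic equations) accepts exactly the correct sum.

```python
def to_bits(value, n):
    """Bit vector with index 0 most significant and index n-1 least significant."""
    return [(value >> (n - 1 - i)) & 1 for i in range(n)]


def add_with_carries(x, y):
    """Compute z = x + y mod 2^n bit by bit using the carry recurrence."""
    n = len(x)
    z = [0] * n
    carry = 0  # carry into the current position; none enters position n-1
    for i in range(n - 1, -1, -1):
        z[i] = x[i] ^ y[i] ^ carry
        # carry into position i-1: x_i*y_i XOR (x_i XOR y_i)*c_i
        carry = (x[i] & y[i]) ^ ((x[i] ^ y[i]) & carry)
    return z


def satisfies_reduced_system(x, y, z):
    """Check the carry-free system: one linear and n-1 quadratic equations."""
    n = len(x)
    if z[n - 1] != x[n - 1] ^ y[n - 1]:
        return False
    for i in range(n - 2, -1, -1):
        c_next = z[i + 1] ^ x[i + 1] ^ y[i + 1]  # eliminated carry c_{i+1}
        rhs = (x[i + 1] & y[i + 1]) ^ ((x[i + 1] ^ y[i + 1]) & c_next)
        if z[i] ^ x[i] ^ y[i] != rhs:
            return False
    return True


def check_all(n):
    """Exhaustively compare both descriptions with integer addition."""
    for a in range(2 ** n):
        for b in range(2 ** n):
            x, y = to_bits(a, n), to_bits(b, n)
            expected = to_bits((a + b) % 2 ** n, n)
            if add_with_carries(x, y) != expected:
                return False
            accepted = [c for c in range(2 ** n)
                        if satisfies_reduced_system(x, y, to_bits(c, n))]
            if accepted != [(a + b) % 2 ** n]:
                return False
    return True


print(check_all(4))
```

For the 32-bit additions in Magma the same equations apply with $n = 32$; the exhaustive check is run on $n = 4$ only to keep the search space small.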

Assessment Approaches by Algebraic Analysis Methods
In the course of this research, a methodology was proposed for conducting algebraic analysis based on the XL method and SAT solvers [41]. Two main encryption operations were considered: substitution primitives and addition modulo $2^n$.
The initial data for the approach are:
• the mathematical structure of the encryption algorithm;
• the structure of the substitution operations (as they are defined in the algorithm);
• the available known data (the number of plaintext-ciphertext pairs).
The resulting characteristics of the approach are:

SageMath [42] was chosen as the software development environment. The algebraic cryptanalysis was implemented using the functions of sage.sat.converters.polybori to transform the equation set from ANF into CNF (CNFEncoder). We used the sage.sat.boolean_polynomials library [43,44] to access the functionality of the SAT solver CryptoMiniSat. An Intel Core i5 2.8 GHz PC with 8 GB of RAM was used as the test bench, on which we obtained numerical experimental results for the algebraic cryptanalysis of the Magma cipher (Table 5).

The algebraic analysis of an eight-round Magma with the CryptoMiniSat solver required four known text pairs and took 3029.56 s to complete (the search took 416.31 s). In this analysis, 68 key bits were fixed: 0-15, 51-55, 64-66, 128-130, 179-183, 192-207, 224-231, and 244-255. The algebraic analysis of a five-round Magma cipher with weakened S-boxes required seven known text pairs and took 1135.61 s (the search took 3.36 s). The algebraic analysis of a five-round Magma cipher with disabled S-blocks (equivalent value substitution) yielded a single solution for five known text pairs in 501.18 s (the search took 4.92 s).
As the experimental results show, with disabled or weakened S-boxes more known data (text pairs) must be supplied to the SAT solver in order to narrow the result down to the single existing solution.
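The reduction of a Boolean equation system from ANF to CNF, mentioned above, can be illustrated with a small pure-Python sketch. It is not the `CNFEncoder` algorithm from SageMath and not the authors' code; it only shows one common way to do the conversion, under the following assumptions: each monomial of degree two or more gets a fresh auxiliary variable, each XOR is encoded by forbidding every assignment of the wrong parity, and the function names and the toy system are hypothetical.

```python
from itertools import product


def anf_to_cnf(polys, nvars):
    """Convert equations 'XOR of monomials = const' into CNF clauses.

    polys: list of (monomials, const); a monomial is a tuple of 1-based
    variable indices. Returns (clauses, total_vars) with signed-integer
    literals in the DIMACS style.
    """
    clauses = []
    aux = {}  # monomial -> auxiliary variable standing for the product
    next_var = nvars
    for monos, const in polys:
        lits = []
        for m in monos:
            m = tuple(sorted(set(m)))
            if len(m) == 1:
                lits.append(m[0])
                continue
            if m not in aux:
                next_var += 1
                aux[m] = next_var
                # aux variable <-> AND of the variables in the monomial
                for v in m:
                    clauses.append([-next_var, v])
                clauses.append([next_var] + [-v for v in m])
            lits.append(aux[m])
        # Forbid every assignment of lits whose parity differs from const.
        for vals in product([0, 1], repeat=len(lits)):
            if sum(vals) % 2 != const:
                clauses.append([-l if b else l for l, b in zip(lits, vals)])
    return clauses, next_var


def cnf_satisfied(clauses, assignment):
    """assignment[i] is the 0/1 value of variable i + 1."""
    return all(any((assignment[abs(l) - 1] == 1) == (l > 0) for l in c)
               for c in clauses)


# Toy system over GF(2): x1*x2 + x3 + 1 = 0 and x1 + x2*x3 = 0.
polys = [([(1, 2), (3,)], 1), ([(1,), (2, 3)], 0)]
clauses, total_vars = anf_to_cnf(polys, nvars=3)
# Brute force stands in for the SAT solver on this tiny instance.
models = {a[:3] for a in product([0, 1], repeat=total_vars)
          if cnf_satisfied(clauses, a)}
print(len(clauses), total_vars, models)
```

The direct XOR encoding needs $2^{k-1}$ clauses for $k$ literals, so practical converters split long XORs into shorter ones with extra variables; the sketch omits that step for brevity.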
The experimental results for the time and memory complexity of the algebraic analysis of the Magma cipher are presented in Figure 5. The time dependence obtained for the search for all sets of solutions in the algebraic analysis of Magma ⊕ encryption is shown in Figure 6. The experimental dependence between the total time complexity of the algebraic analysis and the number of known plaintexts for the Magma ⊕ algorithm is presented in Figure 7. Computation time is given in seconds.
We evaluated the complexity of the algebraic analysis (by the XL method) for a simplified version of the Kuznyechik algorithm, namely S-KN2, without a round subkey generation scheme. For the substitution box (Table 2) we generated the following quadratic system of equations (including 21 linearly independent equations):
The estimates of the complexity of solving the system by the XL method and of the maximum required memory are given in Table 6.

Table 6. Evaluated complexity of the algebraic analysis of the S-KN2 cipher by the XL method.
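For readers unfamiliar with extended linearization, the core steps of the XL method can be sketched on a toy system. The Python code below is a minimal illustration under stated assumptions, not the implementation used for the S-KN2 experiments: the four quadratic equations in three variables are invented for the example (they share the solution (1, 0, 1)), all function names are hypothetical, and polynomials are stored as sets of monomials with $x^2 = x$. Each equation is multiplied by all monomials of degree at most $D - 2$, every monomial is then treated as an independent unknown, and Gaussian elimination removes the highest-degree monomials first.

```python
from itertools import combinations


def mul_by_monomial(poly, mono):
    """Multiply a polynomial by a monomial over GF(2), using x*x = x."""
    out = set()
    for term in poly:
        out ^= {term | mono}  # equal terms cancel in pairs
    return frozenset(out)


def xl_expand(polys, nvars, degree):
    """Multiply every quadratic polynomial by all monomials of degree <= degree - 2."""
    multipliers = [frozenset(c)
                   for d in range(degree - 1)
                   for c in combinations(range(nvars), d)]
    return [mul_by_monomial(p, m) for p in polys for m in multipliers]


def linearize_and_reduce(polys):
    """Gaussian elimination with every monomial treated as a separate unknown."""
    monos = sorted({t for p in polys for t in p},
                   key=lambda t: (len(t), sorted(t)))
    bit = {t: i for i, t in enumerate(monos)}  # higher degree -> higher bit
    basis = {}  # leading bit -> reduced row
    for p in polys:
        row = sum(1 << bit[t] for t in p)
        while row:
            lead = row.bit_length() - 1
            if lead not in basis:
                basis[lead] = row
                break
            row ^= basis[lead]
    return [frozenset(t for t in monos if (r >> bit[t]) & 1)
            for r in basis.values()]


def evaluate(poly, point):
    """Value of the polynomial at a 0/1 point; the empty monomial is the constant 1."""
    return sum(all(point[v] for v in term) for term in poly) % 2


X0, X1, X2, ONE = frozenset([0]), frozenset([1]), frozenset([2]), frozenset()
system = [
    frozenset([X0 | X1, X2, ONE]),  # x0*x1 + x2 + 1 = 0
    frozenset([X0 | X2, X1, ONE]),  # x0*x2 + x1 + 1 = 0
    frozenset([X1 | X2, X0, X2]),   # x1*x2 + x0 + x2 = 0
    frozenset([X0 | X1, X1]),       # x0*x1 + x1 = 0
]
expanded = xl_expand(system, nvars=3, degree=3)
reduced = linearize_and_reduce(expanded)
# Rows whose leading monomial has degree <= 1 are linear equations in the unknowns.
linear = [p for p in reduced if all(len(t) <= 1 for t in p)]
print(len(expanded), len(reduced), len(linear))
```

Any linear equations that survive the elimination can be solved directly or substituted back into the system. In a real attack the degree $D$ and the resulting matrix size determine the time and memory cost, which is what the complexity estimates for S-KN2 measure.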

Conclusions
In this article, we described the main steps of the algebraic analysis of cipher reliability and examined the Russian block encryption standard GOST R 34.12-2015 (n = 64, Magma, and n = 128, Kuznyechik). We presented algorithms that can be used to teach the principles of the Kuznyechik block cipher (GOST R 34.12-2015). The S-KN2 algorithm can be used to illustrate popular cryptanalysis attacks against Kuznyechik and other similar block ciphers.
We reviewed approaches to applying algebraic analysis to symmetric block ciphers: linearization, extended linearization, extended sparse linearization, and SAT solving. We presented experimental results on recovering encryption keys with SAT solvers and extended linearization for Magma block ciphers. As examples for the experiment, we chose three fillings of the Magma cipher substitution boxes (the one from the standard, a substitution with equivalent values S(X) = X, and a weak one) and the simplified version Magma ⊕. We described the reduction of the encryption transformation to a SAT problem and determined the number of literals and clauses that arise for different numbers of known text pairs. We also computed the estimated complexity of the algebraic analysis of the S-KN2 cipher by the XL method for several numbers of known plaintexts.
The proposed approaches and algorithms can be further used for the security assessment of arbitrary ciphers based on substitutions and addition modulo $2^n$ in terms of their resistance to algebraic cryptanalysis.
Author Contributions: Methodology, E.M.; software, P.P.; validation, E.I., E.M. and P.P. All authors have read and agreed to the published version of the manuscript.
Funding: This research was funded by Southern Federal University.