1. Introduction
A substitution box (S-box) is a principal component of a large class of symmetric ciphers. The main task of the S-box is to provide nonlinearity to the cipher design, creating a property called confusion [1]. From the mathematical point of view, a cryptographic $n \times m$ S-box is a vectorial Boolean function $S: \mathbb{F}_2^n \to \mathbb{F}_2^m$ that fulfills specific criteria. We refer the reader to [2] for a more thorough study of Boolean functions used in cryptography.
We can evaluate a vectorial Boolean function with respect to some S-box criteria and assign a quality to an S-box (with respect to these criteria). An S-box that fulfills the prescribed criteria is called a strong S-box, while an S-box that fails some criterion is called a weak S-box. The S-box criteria include:
Nonlinearity, which measures the resistance against linear cryptanalysis [3]. Nonlinearity is computed as the minimum Hamming distance to the set of all affine functions, which can be efficiently implemented with a fast Walsh–Hadamard transform. Cryptographic applications require S-boxes with nonlinearity that is as high as possible.
Differential profile, which measures the resistance against differential cryptanalysis [4]. The differential profile measures the probability of difference propagation and should be as flat as possible. In this article, we focus exclusively on the differential profile. We provide more details in the following text.
Balancedness [5], which is required to achieve a uniform distribution of output bits. A balanced Boolean function has the same number of zeroes and ones in its vector of values. Note that it is easy to show that a Boolean permutation (a bijective vectorial Boolean function) is always balanced.
Strict Avalanche Criterion (SAC) and Output Bit Independence Criterion (BIC) [6,7], which measure the diffusion properties of the S-box.
Algebraic immunity [8], which measures the resistance against algebraic attacks on symmetric ciphers.
Multiplicative complexity [9], which measures the complexity of the S-box implementation in terms of the number of AND gates required to implement the S-box. High multiplicative complexity means that the S-box implementation in hardware is more costly. On the other hand, S-boxes with low multiplicative complexity can be weak with respect to other criteria.
Other criteria, such as the differential profile with respect to addition modulo $2^n$ [10]. This can cover special cases required by non-standard cipher designs.
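The nonlinearity criterion from the list above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the authors' implementation: the names `walsh_hadamard`, `parity`, and `nonlinearity` are hypothetical, and the S-box is assumed to be a list of integers of length $2^n$ with $m$-bit values.

```python
def walsh_hadamard(values):
    """Fast Walsh-Hadamard transform of a list whose length is a power of two."""
    w = list(values)
    h = 1
    while h < len(w):
        for i in range(0, len(w), 2 * h):
            for j in range(i, i + h):
                # Standard butterfly step.
                w[j], w[j + h] = w[j] + w[j + h], w[j] - w[j + h]
        h *= 2
    return w


def parity(v):
    """Parity of the bits of v, i.e., the XOR of all bits."""
    return bin(v).count("1") & 1


def nonlinearity(sbox, n, m):
    """Minimum nonlinearity over all nonzero component functions b.S(x)."""
    best = None
    for b in range(1, 1 << m):
        # Component function in +1/-1 encoding.
        signs = [1 - 2 * parity(b & sbox[x]) for x in range(1 << n)]
        max_abs = max(abs(c) for c in walsh_hadamard(signs))
        nl = (1 << (n - 1)) - max_abs // 2
        best = nl if best is None else min(best, nl)
    return best
```

A linear mapping such as the identity has nonlinearity 0 under this definition, which makes it a convenient sanity check.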
An important task in cipher design is to generate strong S-boxes, as using a weak S-box can weaken the cipher. Even if the overall cipher design fixes the weakness introduced by a weak S-box, it would be less efficient than a design that uses strong S-boxes. There are multiple methods for generating strong S-boxes, which we review in Section 2. While efficient algebraic constructions of S-boxes with good cryptographic properties are known, we prefer to generate S-boxes with stochastic methods that can choose S-boxes from a large set of possible candidates and are not restricted to specific algebraic classes. Our main reason is that some cipher designs can be vulnerable to algebraic attacks (see, e.g., [11]), and these types of attacks can be specific to only certain classes of S-boxes.
Thus, our goal is to generate S-boxes with better properties than can be obtained from a randomly generated vectorial Boolean function while keeping the S-box space as large (and diverse) as possible. For this purpose, we have introduced a new algorithm for generating S-boxes with prescribed differential properties. Unlike other known stochastic methods, our algorithm builds the S-box in incremental steps. After assigning each new value, we check the partial differential table. If the partial S-box does not fulfill the criteria, we use backtracking to explore a different branch of the search space. The algorithm is formally presented in Section 3.
Note that in our search algorithm we focus on a single S-box criterion, the maximum value of the so-called Difference Distribution Table (DDT). We define the Difference Distribution Table (sometimes called the XOR table) of an S-box $S$ as a matrix $DDT_S$ over $\mathbb{Z}$ with dimension $2^n \times 2^m$. When $S$ is known from the context, we simply write $DDT$. Rows of $DDT$ are indexed by vectors $a \in \mathbb{F}_2^n$, while columns are indexed by vectors $b \in \mathbb{F}_2^m$. An element at position $(a, b)$ has the value
$$DDT(a, b) = \left|\{x \in \mathbb{F}_2^n : S(x) \oplus S(x \oplus a) = b\}\right|.$$
The differential profile of an S-box $S$ is the multiset of values $DDT(a, b)$ for $a \neq 0$. An S-box $S$ is differentially $\delta$-uniform if $\delta = \max_{a \neq 0,\, b} DDT(a, b)$, that is, $\delta$ is the maximum value in the whole DDT excluding the case $a = 0$. Our goal is to construct S-boxes with low differential uniformity, which are important in constructing ciphers resistant to differential cryptanalysis.
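As a concrete illustration of the Difference Distribution Table and of differential uniformity, the following minimal Python sketch computes both directly from the definition. The function names `ddt` and `differential_uniformity` are hypothetical, and the S-box is assumed to be a list of $2^n$ integers with $m$-bit values.

```python
def ddt(sbox, n, m):
    """Difference distribution table: table[a][b] counts x with S(x) ^ S(x ^ a) == b."""
    table = [[0] * (1 << m) for _ in range(1 << n)]
    for a in range(1 << n):
        for x in range(1 << n):
            table[a][sbox[x] ^ sbox[x ^ a]] += 1
    return table


def differential_uniformity(sbox, n, m):
    """Maximum DDT entry over all nonzero input differences a."""
    table = ddt(sbox, n, m)
    return max(max(row) for row in table[1:])
```

Every row of the table sums to $2^n$, and all entries are even, because each solution $x$ is paired with the solution $x \oplus a$.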
It might be possible to generalize our algorithm for multiple criteria; however, we wanted to keep this research focused. We suspect that it would be too difficult to understand and analyze the method properly if we wanted to include multiple criteria. By investigating a single criterion, we can evaluate the complexity of the algorithm analytically (Section 4). We provide experimental results for small S-boxes in Section 5. Finally, we discuss our results and open research questions in Section 6.
2. Methods for Generating Cryptographic S-Boxes
There are a large number of different methods for generating cryptographic S-boxes. In this section, we provide a brief non-exhaustive overview of existing methods. The S-box generation methods can be categorized into the following main categories:
(Pseudo-)random generation
Stochastic search
Mathematical construction
Construction from smaller components.
2.1. Random S-Boxes
The easiest way to generate an S-box is to choose a candidate randomly. Such a randomly generated S-box is unlikely to have the strongest cryptographic properties. On the other hand, a randomly generated Boolean function will not be too weak, either. Pseudo-random S-box generation is based on a known pseudo-random generator initialized with a known seed. If the seed is not controlled by the cipher designers, the randomness of the S-box generation can be verified by third parties.
A (pseudo-)random S-box generation can be improved by examining a larger set of S-boxes and taking either the best candidate or the first candidate that fulfills the required design restrictions. An example is the basis of the algorithm used to generate S-boxes for the AES candidate MARS [
12]. The authors tested roughly
candidates on five differential and four linear criteria. Note that this algorithm combines random generation with an extra “early abort”, that is, when a special criterion (with a low probability of occurrence) is encountered, a candidate is modified in such a way that the criterion is satisfied.
2.2. Stochastic Search
Random S-box generation is a complex computational problem, as it takes a very long time to enumerate and examine all possible candidates until a suitable candidate with the required properties is found. With increasing size of the S-boxes, the space of potential candidates grows super-exponentially. For a fixed S-box size, strengthening the criteria increases the search time exponentially. One of the methods for decreasing the search time and increasing the quality of S-box candidates is to apply various stochastic search heuristics from the areas of artificial intelligence and evolutionary computation.
Artificial intelligence and evolutionary computation play an important role in both cryptanalysis [13] and cipher design [14]. The seminal works in this area include the use of Simulated Annealing [15], Hill-Climbing [16,17], and their combinations and extensions [17] for generating cryptographic S-boxes.
Modern stochastic search methods provide very strong experimental results, especially when considering multi-objective optimization [18]. In [19], the authors presented an optimization of the simulated annealing (SA) algorithm. In [20], a parallel application of tabu search and simulated annealing was employed. A method based on a memorable simulated annealing algorithm was successfully applied in [21] to generate chaotic S-boxes. A reversed genetic algorithm for generating S-boxes was proposed in [22]. The search starts from a mathematically constructed set of candidates (see the next section) and improves the diversity and quality of candidates by stochastic search based on genetic algorithms. Genetic algorithms can be used to find cellular automata-based S-boxes [23,24]. Stochastic search methods can be improved in terms of both the quality of results and the complexity of algorithms by investigating new implementation techniques and cost functions [25].
Our proposed algorithm is based on an exhaustive search with early rejection. We believe that it might be possible to improve the proposed method by adopting a combination of the proposed search method and evolutionary search methods. Inspiration can be found in [26], for instance, where a combination of a special genetic algorithm and total tree searching produced S-boxes with high nonlinearity.
2.3. Mathematical Construction
A different approach to the construction of S-boxes is to use mathematical (or algebraic) construction of S-boxes. Mathematically constructed S-boxes are based on using a family of Boolean functions with known security properties [2], such as the Gold functions [27], the Kasami functions [28], the Bracken–Leander functions [29], Dillon’s permutation [30], and others. New functions with good properties can be derived mathematically from these known constructions [31].
One of the most commonly used families of functions is based on the inverse mapping. As an example, the function $F(x) = x^{-1}$ (with $F(0) = 0$) in $\mathbb{F}_{2^n}$ is known to be differentially 4-uniform [32] and to have high nonlinearity and algebraic degree. The S-box of the current Advanced Encryption Standard (AES) was constructed by applying an affine transformation to this function. Another way of mathematically constructing S-boxes is to use a cubic polynomial mapping [33].
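The differential uniformity of the inverse mapping can be checked numerically for the AES field. The sketch below is an illustration under stated assumptions: it fixes $n = 8$, uses the AES reduction polynomial `0x11B`, maps 0 to 0, and omits the AES affine transformation (which does not change the maximum DDT value). The helper names are hypothetical.

```python
def gf256_mul(a, b, poly=0x11B):
    """Multiply two elements of GF(2^8) modulo the AES polynomial (assumed here)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= poly  # Reduce back to 8 bits.
        b >>= 1
    return result


def gf256_inv(a):
    """Field inverse computed as a^254, with the convention that 0 maps to 0."""
    result, power, exponent = 1, a, 254
    while exponent:
        if exponent & 1:
            result = gf256_mul(result, power)
        power = gf256_mul(power, power)
        exponent >>= 1
    return result if a else 0


def differential_uniformity(sbox):
    """Maximum DDT entry over nonzero input differences for a square S-box."""
    size = len(sbox)
    best = 0
    for a in range(1, size):
        row = [0] * size
        for x in range(size):
            row[sbox[x] ^ sbox[x ^ a]] += 1
        best = max(best, max(row))
    return best


INVERSE_SBOX = [gf256_inv(x) for x in range(256)]
```

Calling `differential_uniformity(INVERSE_SBOX)` evaluates all 255 nonzero input differences; for this mapping the maximum equals 4.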
In [34,35,36,37], the authors applied various optimizations to AES S-box generation. A key-dependent mechanism to generate S-boxes with good cryptographic properties was used in [38,39]. A similar approach combined with other proposed optimizations was applied by the authors of [40,41,42,43].
A different approach was used in [44,45]. The authors replaced the binary representation of S-boxes with a domain quasigroup G, making it possible to find functions that are at the same time both balanced and perfectly nonlinear. Such functions have a completely flat difference distribution table.
2.4. Construction from Smaller Components
A Boolean function can be implemented in (AND, XOR) algebra (or any other Boolean algebra). The minimum number of AND-gates in the (AND, XOR) representation of a Boolean function is called the multiplicative complexity (MC). The MC is an important property for various problems connected to S-boxes, such as logic circuit minimization, algebraic cryptanalysis, and optimal masking against higher-order power analysis attacks. However, low multiplicative complexity can conflict with other S-box criteria, such as nonlinearity and differential uniformity.
In [9], we provided an analysis of multiplicative complexity for all $4 \times 4$ bijective S-boxes and showed that the MC of any $4 \times 4$ S-box is at most 5. It is, however, difficult to determine the MC of an existing (larger) S-box. Thus, in [
46], we described the process of generic construction, which can be applied in the construction of strong
S-boxes with low multiplicative complexity. Instead of analyzing an existing complex Boolean function, we constructed an S-box from smaller primitives with known multiplicative complexity.
3. New Algorithm for Generating S-Boxes with Prescribed Differential Properties
The goal of an S-box generation algorithm is to produce an S-box, which is a vectorial Boolean function $S: \mathbb{F}_2^n \to \mathbb{F}_2^m$ that fulfills specific criteria. For a given $n, m$ there are $2^{m 2^n}$ possible functions $S$. In general, for each potential $S$, we define some evaluation function that returns a score with respect to selected criteria. If the score is below some limit, we reject the Boolean function (a weak S-box); otherwise, we produce $S$ as a result of the S-box generation. We aim to obtain an S-box with as high a score as possible, but are limited by computational resources, as even for small $n, m$ it might be impossible to examine the whole search space.
Our new algorithm is based on a randomized search with early rejection. Our search criteria are reduced to a single dimension; we evaluate the Difference Distribution Table (DDT) of the S-box and reject all S-boxes with DDTs containing values above some pre-selected threshold $\delta$. Threshold $\delta$ should be selected according to the security requirements. Our algorithm either produces a $\delta$-uniform S-box or proves that no such S-box exists. Note that if we set the threshold $\delta$ too low, the algorithm might take too long to find a suitable S-box or exhaust the search space. The complexity analysis provided in Section 4 can help to select a suitable threshold that balances security requirements and the running time of the algorithm.
The main advantage of our algorithm in comparison with a simple search is that we can evaluate the DDT based on a piece of partial information about the function S. This allows us to reduce the search tree size by cutting whole unproductive branches, essentially performing an early rejection sampling in the search space. Thus, it can produce good S-boxes with less work than a simple exhaustive or random search.
3.1. Partial DDT
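Let $S: \mathbb{F}_2^n \to \mathbb{F}_2^m$ be a vectorial Boolean function. We say that $S$ is partially determined on $P$ if we have a set of points $P = \{x_1, \dots, x_l\}$ with $x_i \in \mathbb{F}_2^n$, $y_i \in \mathbb{F}_2^m$, and $S(x_i) = y_i$ for each $x_i \in P$. The Partial Difference Distribution Table (PDDT) of a partially determined $S$ is a $2^n \times 2^m$ matrix with elements
$$PDDT(a, b) = \left|\{x \in P : x \oplus a \in P,\ S(x) \oplus S(x \oplus a) = b\}\right|.$$
We can compute the partial DDT of an S-box given lists $X, Y$ with Algorithm 1. We suppose that lists $X, Y$ encode the pairs $(x_i, y_i)$ in a given order of elements. Note that if lists $X, Y$ are extended by one element, the next PDDT can be computed efficiently by simply adding extra values to the previous PDDT by computing differences of the last added pair $(x_{l+1}, y_{l+1})$ with previous pairs in the partial S-box.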
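Algorithm 1 Partial DDT construction algorithm.
Require: $X, Y$ {Lists defining a partial S-box of the same length $l$.}
Require: $n, m$ {S-box dimensions.}
$PDDT \leftarrow$ zero matrix of dimension $2^n \times 2^m$
for all $i \in \{1, \dots, l\}$ do
    for all $j \in \{1, \dots, l\}$ do
        $a \leftarrow X[i] \oplus X[j]$
        $b \leftarrow Y[i] \oplus Y[j]$
        $PDDT[a, b] \leftarrow PDDT[a, b] + 1$
    end for
end for
return $PDDT$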
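Because $P \subseteq \mathbb{F}_2^n$, each $PDDT(a, b) \le DDT(a, b)$. Thus, given a partially evaluated $S$, we can check whether the PDDT satisfies our pre-selected threshold $\delta$. If $PDDT(a, b) > \delta$ for some $a \neq 0$ and $b$, then $DDT(a, b) > \delta$ as well. Thus, we can quickly reject a partial $S$ without evaluating the other points from $\mathbb{F}_2^n \setminus X$ and use backtracking to check other, more suitable branches of the search tree.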
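The incremental update of the partial DDT and the early-rejection test can be sketched in Python as follows. This is a minimal sketch under assumptions rather than the authors' code: the names `new_pddt`, `extend_pddt`, and `retract_pddt` are hypothetical, the row for the zero input difference is ignored, and each unordered pair of points is counted twice because both orderings contribute to the same table cell.

```python
def new_pddt(n, m):
    """Empty partial DDT with 2^n rows and 2^m columns."""
    return [[0] * (1 << m) for _ in range(1 << n)]


def extend_pddt(pddt, xs, ys, x_new, y_new, delta):
    """Add the assignment S(x_new) = y_new and update the table in place.

    Returns True if no touched cell exceeds the threshold delta.
    Assumes x_new is not yet contained in xs.
    """
    within_threshold = True
    for x, y in zip(xs, ys):
        a, b = x ^ x_new, y ^ y_new
        pddt[a][b] += 2  # Pairs (x, x_new) and (x_new, x) share one cell.
        if pddt[a][b] > delta:
            within_threshold = False
    xs.append(x_new)
    ys.append(y_new)
    return within_threshold


def retract_pddt(pddt, xs, ys):
    """Undo the most recent assignment, as needed when backtracking."""
    x_last, y_last = xs.pop(), ys.pop()
    for x, y in zip(xs, ys):
        pddt[x ^ x_last][y ^ y_last] -= 2
```

Each extension touches at most one cell per previously assigned point, so the cost of a step grows linearly with the number of assigned points instead of requiring a full table rebuild.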
Note that we can place other, stricter criteria on the DDT. However, we require that these criteria are preserved when using the partial DDT; e.g., we can place stricter restrictions on specific rows or columns of the DDT. This can be useful for cipher designs based on simple substitution-permutation networks, e.g., a PRESENT-like cipher [47] with a custom S-box, which requires stronger criteria on rows and columns of the DDT indexed by indices with low Hamming weight. In general, we can define a predicate whose input is a partial DDT and which returns true if the partial DDT satisfies the S-box criteria and false otherwise. In our analyses, we use a simple function that only checks whether $PDDT(a, b) \le \delta$ for each $a \neq 0$ and each $b$.
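A criteria function of this kind can be sketched as a Python closure. The sketch is hypothetical: the name `make_check` and the parameter `delta_low` are assumptions introduced for illustration, and the stricter rule shown (a lower limit for cells whose row and column indices both have Hamming weight 1) is only one possible choice. Both limits are upper bounds on table cells, so a violation in a partial table can never disappear when more points are assigned.

```python
def make_check(delta, delta_low=None):
    """Build a predicate over a (partial) DDT given as a list of rows.

    delta     -- upper bound for every cell with nonzero input difference
    delta_low -- optional stricter bound for cells where both the input
                 and the output difference have Hamming weight 1
    """
    def weight(v):
        return bin(v).count("1")

    def check(pddt):
        for a in range(1, len(pddt)):
            for b, value in enumerate(pddt[a]):
                limit = delta
                if delta_low is not None and weight(a) == 1 and weight(b) == 1:
                    limit = delta_low
                if value > limit:
                    return False
        return True

    return check
```

For example, `make_check(4, 0)` accepts only tables in which single-bit input differences never lead to single-bit output differences.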
3.2. General Idea of the Algorithm
The general idea of our randomized search algorithm is presented as Algorithm 2. We store a partially determined S-box using lists $X, Y$. In each step, we choose a random new point from the remaining unassigned points of $\mathbb{F}_2^n$ and assign it a random value from $\mathbb{F}_2^m$. Then, we check the partial DDT; if it satisfies the conditions, we extend lists $X, Y$ and continue with the algorithm. When we have assigned a value to each point from $\mathbb{F}_2^n$, we return a finished S-box with a DDT that satisfies the criteria.
In general, Algorithm 2 is not guaranteed to find an S-box; a partial S-box given by lists $X, Y$ may be constructed with a correct partial DDT in such a way that no successor fulfills the DDT criteria. Thus, we need to introduce an additional parameter, limit, that can be used to restrict the search to a fixed maximum number of tries.
Algorithm 2 Randomized algorithm to construct S-boxes with prescribed differential table. |
Require: , S-box input size. | |
Require: , S-box output size. | |
Require: , a function that returns true if partial DDT satisfies criteria. | |
Require: , the maximum number of tries. | |
| |
| |
while
do | |
if then | |
return ∅ | |
end if | |
| |
| {for bijective S-box use: } |
| |
if then | |
| |
| |
end if | |
| |
end while | |
return
| |
Note that the presented methods can be used to generate both general S-boxes (any vectorial Boolean function) and bijective S-boxes. When generating bijective S-boxes, it is necessary to set the same S-box input and output size ($n = m$) and restrict the selection of the y values to ensure that no y value is repeated. The appropriate modification is marked in the comments in the formal descriptions of Algorithms 2 and 3.
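The randomized procedure without backtracking can be sketched in Python as follows. This is one reading of the prose description, not a transcription of Algorithm 2: the function name `random_sbox_with_limit`, the argument names, and the choice to count every attempted assignment as one try are assumptions, and the criterion is fixed to a simple threshold `delta` on all cells with nonzero input difference.

```python
import random


def random_sbox_with_limit(n, m, delta, limit, bijective=False, rng=None):
    """Try to build an S-box point by point; return None after `limit` tries."""
    rng = rng or random.Random()
    size_in, size_out = 1 << n, 1 << m
    pddt = [[0] * size_out for _ in range(size_in)]
    xs, ys = [], []
    free_x = list(range(size_in))
    tries = 0
    while free_x:
        if tries >= limit:
            return None  # Give up: the partial S-box may have no valid successor.
        tries += 1
        x_new = rng.choice(free_x)
        if bijective:
            candidates = [y for y in range(size_out) if y not in ys]
        else:
            candidates = list(range(size_out))
        y_new = rng.choice(candidates)
        # Tentatively add the new differences (each pair counted twice).
        touched = [(x ^ x_new, y ^ y_new) for x, y in zip(xs, ys)]
        for a, b in touched:
            pddt[a][b] += 2
        if all(pddt[a][b] <= delta for a, b in touched):
            xs.append(x_new)
            ys.append(y_new)
            free_x.remove(x_new)
        else:
            for a, b in touched:
                pddt[a][b] -= 2  # Roll back the rejected assignment.
    sbox = [0] * size_in
    for x, y in zip(xs, ys):
        sbox[x] = y
    return sbox
```

Because earlier assignments are never revised, a return value of `None` here only means that the try budget ran out, not that no suitable S-box exists.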
3.3. Main Algorithm
The disadvantages of Algorithm 2 lead us to a final version of the search algorithm that is based on a depth-first search with backtracking. This algorithm is formalized as Algorithm 3.
In Algorithm 3, we initialize, for each input x, an array of candidate output values from $\mathbb{F}_2^m$ in random order. Then, we try to fill in the partial S-box in a fixed order of x, with the assigned values $S(x)$ stored in the growing list Y. For each new x, we take the next candidate y from the array for x (representing $S(x) = y$) and check whether the conditions on the partial DDT of the potential S-box hold. If the PDDT is correct, we continue with the next x; otherwise, we remove y from the array and try the next one.
If all y on a level have been removed, it is necessary to backtrack. We remove the last stored assignment of $S(x)$ by decreasing the active x and removing y from Y and from the corresponding candidate array. After backtracking, we continue to investigate the remaining options on that level, or backtrack again if all options are exhausted.
In practice, it is possible to speed up the search by exploiting the affine equivalence of S-boxes (see, e.g., [9]). This means that when
or
for some
, we only use
. In this case, the algorithm generates a normalized representative of an affine class of S-boxes with the required DDT. This method cannot be applied when looking for additional properties of the DDT, such as special distribution of values in rows and columns of the DDT.
Algorithm 3 Depth-first search algorithm to find S-boxes with prescribed differential table. |
Require: , S-box input size. | |
Require: , S-box output size. | |
Require: , a function that returns true if partial DDT satisfies criteria. | |
| |
for all do | |
| {Store elements in random order.} |
end for | |
x ← 0 | |
while
do | |
| {If bijective S-boxes are required, use instead.} |
for all do | |
| |
if then | |
| {Increase depth.} |
| |
else | |
| {Dead end, try other branches.} |
end if | |
end for | |
| {Search failed?} |
if and then | |
return ∅ | |
end if | |
| {Backtracking needed?} |
if then | |
| {Reset options on this level.} |
| {Decrease search level.} |
| {Remove last element of Y.} |
| {Explored, no suitable successors.} |
end if | |
end while | |
return | {Whole S-box is determined.} |
Algorithm 3 is a typical example of an exhaustive search algorithm. Thus, it always stops and produces the required result if any S-box that satisfies the conditions exists. In the worst-case scenario, the algorithm needs to examine the whole search space, with potential complexity as high as $2^{m 2^n}$ DDT evaluations. However, because unproductive branches are cut off, the proposed algorithm terminates sooner than a simple exhaustive search that only checks the DDT of the whole S-box. A more detailed complexity analysis is provided in Section 4.
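The depth-first search with backtracking can be sketched in Python as follows. The sketch follows the prose description rather than the formal listing: the function name `dfs_sbox` and the per-level `options` lists are assumptions, the criterion is a plain threshold `delta`, and the affine-equivalence speed-up is not included.

```python
import random


def dfs_sbox(n, m, delta, bijective=False, rng=None):
    """Return a list ys with ys[x] = S(x), or None if no suitable S-box exists."""
    rng = rng or random.Random()
    size_in, size_out = 1 << n, 1 << m
    pddt = [[0] * size_out for _ in range(size_in)]
    ys = []  # Values assigned so far, for inputs 0 .. len(ys) - 1.

    def fresh_options():
        options = list(range(size_out))
        rng.shuffle(options)  # Candidate values in random order.
        return options

    options = [fresh_options() for _ in range(size_in)]
    x = 0
    while x < size_in:
        advanced = False
        while options[x]:
            y = options[x].pop()
            if bijective and y in ys:
                continue
            touched = [(x ^ xp, y ^ yp) for xp, yp in enumerate(ys)]
            for a, b in touched:
                pddt[a][b] += 2
            if all(pddt[a][b] <= delta for a, b in touched):
                ys.append(y)  # Increase depth.
                x += 1
                advanced = True
                break
            for a, b in touched:
                pddt[a][b] -= 2  # Dead end, try the next candidate.
        if advanced:
            continue
        if x == 0:
            return None  # Whole search space exhausted.
        options[x] = fresh_options()  # Reset options on this level.
        x -= 1  # Decrease search level.
        y_last = ys.pop()
        for xp, yp in enumerate(ys):
            pddt[x ^ xp][y_last ^ yp] -= 2
    return ys
```

For very small parameters the search also serves as a non-existence proof: with `n = m = 2`, `delta = 2`, and `bijective=True` it returns `None`, since every permutation of $\mathbb{F}_2^2$ is affine.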
3.4. Example Run of the Algorithm
In
Table 1, we provide a small example run of Algorithm 3. We use parameters
and try to restrict DDT in such a way that each
. Additionally, we use affine equivalence and search only for bijective S-boxes to further restrict the search space for the sake of demonstration. Note that these choices only influence the selection of sets
in Algorithm 3.
4. Complexity Analysis
Recalling our basic notation, we want to generate a suitable vectorial Boolean function $F: \mathbb{F}_2^n \to \mathbb{F}_2^m$ with a differential spectrum bounded by $\delta$. Thus, the difference distribution table of F should contain values d upper-bounded by $\delta$.
To estimate the complexity, we abstract our algorithm using a “balls into bins” problem. We start with an matrix of empty bins. These correspond to possible DDT positions; thus, .
During the algorithm, we fill in the vector of function values of F of length N. After k steps of the algorithm, k positions out of N are filled. These fixed values determine existing differences in the partial DDT, distributed in at most rows.
In step , we add a new assignment . The k differences for address at most k rows of the DDT. For each i, we increase the DDT value in column . Due to using the ⊕ operation, all of the differences are paired; thus, we do not need to check the differences of , only to increase the DDT value directly by 2.
We abstract this in the “balls into bins” problem as follows. Each difference pair is a “ball” we throw into DDT “bins”; in iteration k, we independently and randomly choose k rows of bins, then throw a ball into bins in each of the chosen rows. The random variable represents the number of balls in a bin in row i and column j after the k-th iteration. In our algorithm the choices of DDT positions are not independent; however, we use as an estimate of the probability distribution of the generated partial DDTs after k steps of the algorithm (up to a scaling factor of 2).
4.1. Random Generation of S-Boxes
If there is no restriction on DDT, we can estimate the probability distribution of the final DDTs by computing
. Alternatively, this should correspond to a simpler experiment in which we simply throw
balls into
bins. The generated function
F is expected to be
-differentially uniform with a probability of event
The distribution of balls in bins follows a multinomial distribution in which the bins are mutually exclusive events with equal probabilities and the number of trials equals the number of balls thrown. Thus, the expected average number of balls per bin is the ratio of the number of balls to the number of bins. The most important case for S-box generation is $n = m$, which provides us with an expected average number of balls per bin near 1/2. Considering such samples, we now ask what is the probability that all samples stay within the threshold, so that we obtain a differentially $\delta$-uniform S-box.
To simplify this calculation, we can replace the individual binomial distributions with their Poisson approximation $X \sim \mathrm{Po}(\lambda)$, which has $\lambda = 1/2$. The probability that the sample size reaches the threshold $t$ is
$$P(X \ge t) = \sum_{k=t}^{\infty} e^{-\lambda} \frac{\lambda^k}{k!}.$$
The terms in the expansion converge rapidly to 0; thus, we can take the first term as the expected probability of a single bin containing at least
t balls. The probability that none of the bins contains
t or more balls is then estimated by
The estimated probabilities for
and
are summarized numerically in
Table 2.
Using the approximation
and Stirling’s approximation [
48], we obtain
This means that when we decrease $\delta$ for a fixed n, our probability of generating a $\delta$-uniform S-box decreases exponentially. Thus, we need to examine exponentially more randomly generated S-boxes to obtain one of suitable quality.
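The balls-into-bins estimate can be evaluated numerically with a few lines of Python. The sketch below is one possible instantiation and makes its own counting assumptions, which may differ from the constants behind the tabulated values: a ball is an unordered pair of inputs, bins are the DDT cells with nonzero input difference, a DDT value above `delta` corresponds to at least `delta // 2 + 1` balls in one bin, and only the first term of the Poisson tail is used. The function name `prob_delta_uniform` is hypothetical.

```python
import math


def prob_delta_uniform(n, m, delta):
    """Heuristic probability that a random n-to-m S-box is delta-uniform."""
    bins = ((1 << n) - 1) * (1 << m)       # DDT cells with nonzero input difference.
    balls = (1 << n) * ((1 << n) - 1) // 2  # Unordered pairs of distinct inputs.
    lam = balls / bins                      # Average balls per bin (1/2 when n == m).
    t = delta // 2 + 1                      # Smallest ball count that violates delta.
    # First term of the Poisson tail as the per-bin violation probability.
    p_bin = math.exp(-lam) * lam ** t / math.factorial(t)
    # Treat bins as independent; log1p keeps the product numerically stable.
    return math.exp(bins * math.log1p(-p_bin))
```

Evaluating the function for a fixed size and decreasing `delta` shows the rapid drop in the acceptance probability described above.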
4.2. Analysis of the Algorithm 3
From the complexity point of view, Algorithm 3 has the advantage of early rejection compared to the general rejection sampling employed by a randomized depth-first search for S-boxes. A classical approach is to generate the whole S-box, compute the DDT, and then reject or accept based on the threshold $\delta$. Our new algorithm enables partial DDT sampling. If the partial sample contains DDT values above the threshold $\delta$, further iterations of the algorithm cannot improve this value, meaning that it can be immediately rejected. Moreover, using a randomized depth-first search, we can exclude the whole set of samples that are above the threshold from the search.
Similarly to
Section 4.1, we can estimate the probability of distribution of PDDT cells after
s steps of the algorithm as follows:
where
is an expected average value of “balls per bin” after
s steps of the algorithm. The number of bins, value
, does not change throughout the algorithm, however, the expected average occupancy of the bins
grows with
s.
If we carry out rejection sampling after exactly
s steps of the algorithm, we can estimate the probability of not rejecting the PDDT with threshold
again as
The value of
increases with
s as follows. After
s steps, we throw
“balls” (pairs added to PDDT) into
bins; thus, the average number of “balls per bin” is
For S-boxes
, we have
and
When
s is small in comparison to
, we have a low probability of rejecting a PDDT; thus, the search space quickly branches out. With growing
s, we increase the number of potential branches (up to
), as well as the probability of rejecting the PDDTs. Thus, the estimated width of the search tree on level
s is
The number of potential endpoints of the search tree we are looking for is
. Each of these points is connected to the root of the search tree, with the paths going through the level with the maximum width of the tree, denoted by
. Thus, we need to go through the search tree on average
-times, which is
As
for each
s, it is apparent that Algorithm 3 can always find the desired S-box (if it exists) faster than random search, which has an expected complexity of
.
5. Experimental Results
In this section, we present experimental results to support our theoretical analysis of the algorithm. To obtain the presented results, we used custom software called “SBox Tool” [49], developed in Python. This software can generate a selected number of S-boxes of a given size using different methods, analyze them, and export the results for further statistical processing.
In the first experiment, we analyzed a large number of random bijective S-boxes generated using Python’s random permutation method. Table 3 contains a cumulative distribution of $\delta$-differentially uniform S-boxes. The values are based on a dataset of 10,000 randomly generated S-boxes. The experimental values are consistent with the estimates obtained by the “balls into bins” method presented in Table 2.
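An experiment of this kind can be reproduced in outline with the standard library alone. The sketch below is not the SBox Tool code: it assumes that a random bijective S-box is obtained by shuffling the identity permutation with `random.shuffle`, and the function names and the fixed `seed` are hypothetical.

```python
import random
from collections import Counter


def differential_uniformity(sbox):
    """Maximum DDT entry over nonzero input differences for a square S-box."""
    size = len(sbox)
    best = 0
    for a in range(1, size):
        row = [0] * size
        for x in range(size):
            row[sbox[x] ^ sbox[x ^ a]] += 1
        best = max(best, max(row))
    return best


def sample_uniformity(n, samples, seed=0):
    """Histogram of differential uniformity over random n-bit permutations."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(samples):
        sbox = list(range(1 << n))
        rng.shuffle(sbox)
        counts[differential_uniformity(sbox)] += 1
    return counts


def cumulative(counts):
    """Cumulative share of S-boxes with uniformity at most each observed value."""
    total = sum(counts.values())
    running, result = 0, {}
    for value in sorted(counts):
        running += counts[value]
        result[value] = running / total
    return result
```

For instance, `cumulative(sample_uniformity(4, 1000))` gives an empirical cumulative distribution for 4-bit permutations that can be compared against an analytical estimate.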
We implemented the proposed method based on depth-first search (Algorithm 3) in the “SBox Tool” software. We tested the method by generating datasets of 100 bijective S-boxes with different parameter settings. As the generated S-boxes were bijective, all of them are balanced. Although our method does not focus on other S-box properties, we have included the distribution of nonlinearity of the generated S-boxes, following the suggestion of an anonymous reviewer, to allow comparison with other methods.
The summary of the results is presented in Table 4, Table 5, Table 6, Table 7 and Table 8. We show two different methods of S-box generation: “Random” denotes simple random generation, while our new method is denoted by the label “P_DDT”. The parameter $\delta$ denotes the limit on partial DDT items allowed during the search. The “time” column is the average time required to generate an S-box with the selected method, while the columns in “Nr. of $\delta$-uniform S-boxes” show the distribution of the maximum value in the final DDT among the generated S-boxes (out of 100 generated S-boxes). Finally, the columns in “Nonlinearity” show the distribution of the nonlinearity among the generated S-boxes (out of 100 generated S-boxes).
The results of S-box generation for
and
are available in graphical form in
Figure 1 and
Figure 2, respectively. Note that we were unable to generate any S-boxes of size
with the prescribed maximal DDT value of 4, as the program took too much time (more than 10 h) and did not produce any solution. Similar issues were encountered for S-boxes with
and
and with
and
.
These results show that with the correct setting it is possible to generate S-boxes within a reasonable time that have better quality than those generated by a random search. The S-boxes generated with our method have similar nonlinearity to random S-boxes. The distribution of nonlinearity is slightly improved when setting a smaller $\delta$. This can be useful for future methods that might combine our algorithm with further post-processing focused on nonlinearity.
6. Discussion
In this article, we have introduced a new type of algorithm for constructing cryptographically strong S-boxes. The main idea of the proposed method is to determine S-box function values step by step and examine the partial Difference Distribution Table during the process. Setting and enforcing quality criteria during the S-box construction leads to a stochastic search algorithm with early rejection sampling. Our analysis of the algorithm, as confirmed by experimental results, shows that it outperforms purely random searches.
There are many open questions and open research topics related to our proposal. In our proposal, we focus solely on the differential properties of the S-box. To generate a good S-box that satisfies multiple criteria, these criteria can be evaluated after the algorithm reaches an S-box that fulfills the prescribed differential properties. If the S-box does not meet other criteria, it is possible to resume the original search after backtracking.
An open question is whether other criteria can be incorporated directly into the search based on the partial S-box. For certain criteria, this might be challenging. If we consider nonlinearity as an example, the nonlinearity is the minimum distance of any component of our function to any affine Boolean function. Adding an additional value to a partial S-box can increase certain distances while decreasing others.
Another open question is whether it is possible to improve our method using stochastic search heuristics such as hill climbing, simulated annealing, or evolutionary algorithms. One direction to consider is changing the definition of the algorithm step from permuting two function values to filling in a partial S-box. It is, however, unclear what the fitness landscape would look like, as well as whether the optimization heuristics would provide an additional increase in performance.