1. Introduction
Cryptography is an ancient science originally developed to protect the confidentiality of information, particularly in military and political contexts. In the contemporary era, however, its applications have expanded significantly beyond these traditional domains to encompass any area where safeguarding information is of paramount importance. This includes preserving database confidentiality and securing data transmitted over insecure channels, such as computer networks (both private and public). The Internet is a primary example of such channels, along with telephone, radio, and television communications. Block ciphers are among the most widely used cryptographic primitives in symmetric cryptography today, and their robustness enables their deployment in a variety of cryptographic protocols.
Cryptography and cryptanalysis, the former being the art of creating ciphers and the latter the science of breaking their security, have emerged as closely intertwined fields of study. Numerous cryptanalysis methods (also referred to as “attacks”) have been developed, including Differential Cryptanalysis, Algebraic Cryptanalysis, and Linear Cryptanalysis. However, the design of new block ciphers has made them increasingly resistant to these known attacks, making successful outcomes progressively harder to achieve. These attacks typically require traversing key spaces or other large structures containing vast numbers of elements to identify a solution. In this context, heuristic techniques, that is, optimization methods that search for solutions within the space of possible candidates, have emerged as a promising alternative. Such methods have been applied in the cryptanalysis of block ciphers to search for potential keys without exhaustively exploring the entire key space.
The enduring resilience of modern symmetric cryptographic primitives forms the technological foundation of contemporary information security systems. Central to this architecture are
block ciphers, which mathematically transform fixed-size plaintext blocks into unintelligible ciphertext using a secret key
K. Following Claude Shannon’s seminal work, the security of these algorithms relies critically on the design principles of
confusion and
diffusion. Confusion is achieved through nonlinear elements, such as Substitution Boxes (S-boxes), which are primary targets of structured cryptanalytic efforts. The objective of cryptanalysis is to compromise the cipher’s confidentiality, typically seeking a
total break by recovering the secret key
K. For ciphers employing key sizes of 128 bits or more, the sheer magnitude of the key space (2^128 or more candidate keys) renders brute-force attacks intractable, compelling cryptographers to focus on identifying intelligent search strategies [
1]. Consequently, research has focused on specialized mathematical attacks, such as Differential Cryptanalysis (DC) and Linear Cryptanalysis, which exploit statistical biases inherent in cryptographic structures [
2]. A fundamental limitation of these classical methodologies arises when analyzing modern, high-round ciphers: the probability of a successful differential characteristic decreases exponentially with the number of rounds [
3]. This deterioration means that high-round ciphers effectively exceed the practical reach of classical statistical techniques, leaving only weak or limited attacks. Similarly, although algebraic attacks transform the cipher structure into systems of multivariate polynomial equations, these systems are often too large or complex to be solved efficiently by conventional methods. This intractability necessitates a paradigm shift, framing cryptanalysis not merely as a statistical analysis problem but as a complex combinatorial optimization challenge [
3].
In modern cryptanalysis, certain methodologies, particularly those related to key recovery attacks on robust block ciphers, are regarded as constituting an exhaustive search task within vast combinatorial domains. This characteristic has led to their classification as NP-hard problems in contemporary cryptography. Ciphers are deliberately designed with high nonlinearity and low auto-correlation [
4,
5,
6], properties that significantly limit the effectiveness of traditional techniques, including brute-force and classical statistical methods [
7,
8,
9]. This inherent complexity, characterized by high dimensionality and multimodal search landscapes, necessitates the use of optimization tools known as metaheuristics [
10]. These algorithms provide robust means to approximate optimal or near-optimal solutions in large-scale, nonlinear environments, offering a crucial technical response to the increasing strength of cryptographic systems. Thus, the adoption of metaheuristics is a methodological imperative driven by the intrinsic complexity of modern ciphers.
For a metaheuristic to succeed in key cryptanalysis, it is essential to formally define the search problem as an optimization one. This involves delineating the solution space, denoted by
S (the set of all possible keys,
K), and establishing an objective or fitness function F defined on S. The objective of the study transitions from systematically enumerating all possible values of K to minimizing (or maximizing) F(K). This approach aims to identify an optimal key, denoted by K*. The design of the fitness function is critical to the success of any metaheuristic attack, including Tabu Search (TS). This function quantifies the “goodness” of a candidate key by measuring its proximity to a known plaintext (in a Known Plaintext Attack) or by evaluating statistical properties of the decrypted output. Poorly designed fitness functions can create flat or deceptive search landscapes, causing local search algorithms to become prematurely trapped in suboptimal solutions. This issue has been observed in Genetic Algorithms (see [
11,
12]), and for TS (see [
13]). For the metaheuristic-based exploration of block cipher key spaces, particularly in the context of complex state-space modeling and search strategy design, the works in [
14,
15] systematically demonstrate, from the perspective of discrete memristive systems and chaotic dynamics, the construction and application of complex search spaces and highly nonlinear structures in information security. These findings are intrinsically consistent with the approach presented in this paper.
Tabu Search (TS), a metaheuristic formalized by Fred W. Glover, is a memory-guided local search method tailored for combinatorial optimization problems. Its application in cryptanalysis is justified because key recovery is precisely such a high-dimensional combinatorial problem [
16]. This article explores a Tabu Search (TS) attack on block ciphers. A black-box attack is a type of cryptographic attack in which the attacker does not require any knowledge of the block cipher’s structure. The adversary only needs access to the block cipher’s encryption and decryption mechanisms, in addition to at least one plaintext-ciphertext pair. This scenario constitutes a known plaintext attack. It is well established that the key space of a real block cipher is sufficiently large to prevent brute-force attacks. Therefore, TS must address the challenge of efficiently searching this space. Although the search is heuristic, the size of the key space remains a critical factor. In [
17], the key space is partitioned according to a specific arithmetic congruence, enabling the attack to focus on selected subsets of the partition and thereby reducing the set of keys to be examined. The use of arithmetic congruence in cryptanalysis is noteworthy because it allows complex problems to be solved with relatively simple tools. This technique is also employed in other works; for example, in [
18], the author demonstrates that a certain public key exchange protocol can be broken with a man-in-the-middle attack, which fundamentally relies on arithmetic congruence.
This article addresses two key points discussed above: the size of the key space and the need to combine strategies and approaches. We analyze the design of a cryptanalysis methodology for the AES(t) block cipher family, which uses TS combined with a key space partitioning strategy. In
Section 2.3, the different key space partitioning strategies are explained, while
Section 3 and
Section 4 present the methodologies developed for block cipher cryptanalysis and the experimental results, respectively. However, the main contribution of this work lies in the organization of search strategies and methodological frameworks, rather than in a structural cryptanalytic attack on AES itself or other practically used ciphers. In
Section 3.2, we demonstrate how the standard variant of the Deep First Tabu Search methodology provides a more efficient way of performing Simple Tabu Search. Additionally,
Section 4 outlines the general design of the Subregion Path Attack methodology, which is based on partitioning the key space into subregions and describes how to move between these subregions. In this approach, one methodology is required to move between subregions and another to evaluate them. In
Section 4.1, we present the specialization of this technique combined with Tabu Search, along with the experimental results.
2. Preliminaries
Let E be a block cipher, let T be the plaintext, K the key, and C = E_K(T) the corresponding ciphertext. A key K' is said to be consistent with T and C if E_{K'}(T) = C. Let Cons(T, C) denote the set of keys consistent with T and C. The problem investigated in this work is as follows: given T and C, compute a key in Cons(T, C). Therefore, this is a known plaintext attack. Individuals in the population are keys that are candidates to be consistent keys. Below, we provide our specification of the fitness function.
Let f be the fitness function, where f(X, Y) measures the number of equal components between the blocks X and Y. In other words, f(X, Y) = (n - d(X, Y))/n, where n is the block length and d is the Hamming distance. Therefore, a key K1 is better suited than K2 if f(C, C1) > f(C, C2), where C1 and C2 are the ciphertexts produced by K1 and K2. This function takes into account the correlation between the known ciphertext and the ciphertext generated by the random keys. This fitness function is used in [19].
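To make this definition concrete, the following Python sketch (our illustration, not the authors' code) computes the normalized Hamming-distance fitness for bit blocks:

```python
def fitness(c_known, c_candidate):
    """Normalized count of equal components between two equal-length blocks.

    Returns 1.0 exactly when the candidate key reproduces the known
    ciphertext (Hamming distance 0), which matches the stopping
    criterion "fitness reaches 1" used later in the paper.
    """
    assert len(c_known) == len(c_candidate)
    hamming = sum(a != b for a, b in zip(c_known, c_candidate))
    return (len(c_known) - hamming) / len(c_known)
```

A higher value indicates a candidate key whose ciphertext agrees with the known ciphertext in more positions.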
2.1. The Block Cipher Family AES(t)
In this section we present a cipher that we have called AES(t), because it is a family of block ciphers quite similar to AES (Advanced Encryption Standard) that depends on a parameter t, in a sense that will become clear later. This family was introduced in [20]. We now give a description of it.
Let GF(2^t) be the Galois field with 2^t elements, given by the polynomials in the following table.
In Table 1, the block and key lengths of the members of AES(t) are also given for each value of t, where t is the corresponding word length. In particular, AES is AES(8). All the polynomials are irreducible over GF(2); as the author states in [20], these polynomials were chosen at random. The S-boxes for each t are listed in the appendix of that paper. The AES(t) round operations AddRoundKey, SubBytes, ShiftRows, and MixColumns are in essence the same as in AES, but on a reduced scale. In particular, for MixColumns, the coefficients of the MDS matrices are the t least significant bits of the corresponding coefficients of the original AES matrix. The same approach holds for the constants used in the key schedule. The number of rounds is the same as in AES, even for different key sizes.
2.2. The Tabu Search Heuristic Method
TS is a mathematical optimization method that falls into the category of local search techniques. TS enhances the performance of local search by leveraging memory structures. Specifically, once a potential solution is identified, it is marked as tabu to prevent the algorithm from revisiting that solution. A TS algorithm is a metaheuristic designed to circumvent the potential entrapment in local optima by facilitating flexible movements and temporarily prohibiting the evaluation of previously visited solutions through the implementation of a tabu list.
The technique is based on two fundamental concepts: neighborhood and forbidden move, hereafter referred to as a tabu move. In the first concept, it is assumed that a neighborhood N(x) of points can be constructed, where each point corresponds to a generally distinct feasible solution. Therefore, N(x) represents the set of all feasible solutions that can be obtained from x, where x is any point in the domain of a function F to be maximized. On the other hand, we define the tabu status as the prohibition against using a specific point in the function's domain to find a feasible solution. This status is assigned for a predetermined number of iterations. The forbidden moves are managed by an array T, called the tabu list. It is important to note that this status persists even when the feasible solution obtained from that point is better than others already found. The construction of the neighborhood depends on the specific problem being solved; however, it is recommended that it be generated randomly. At each step of the iterative process, we move, attempting to find the best solution x' in N(x), regardless of whether F(x') is better than F(x) (when maximizing or minimizing the function). If the neighborhood is very large or the evaluation of the objective function is computationally expensive, only a subset of it, for example N'(x) contained in N(x), may be analyzed. The size of this sub-neighborhood depends on the specific problem. The process involves simple movements between points in the neighborhood, thereby generating different feasible solutions.
2.3. Methodology for Key Space Partitioning
The key space is too large for an exhaustive search, and it may be challenging even for a heuristic search. Therefore, it can be beneficial to divide the key space into subsets so that the method is applied to only one subset at a time. To this end, the following methodology is proposed in [17]. It is well known that if the key length is n, then the key space K has cardinality 2^n and there is a one-to-one correspondence between K and the integers in the range [0, 2^n - 1]. If an integer m < n is set, then every key can be represented by a number k = q·2^m + r, where 0 ≤ q < 2^(n-m) and 0 ≤ r < 2^m. In this way, the key space is divided into 2^(n-m) blocks (determined by the quotients q of the division by 2^m), and within each block, the corresponding key is determined by its position in the block, which is given by the residual r. The parameter m is called the key length for movement, or the group key length. The key point is then to move with the cryptanalysis method only within one of the partitions (a fixed q), which is called the block of the partition, but to calculate the corresponding fitness in the entire real space; that is, to move around r but to calculate the fitness of the keys given by k = q·2^m + r.
The following functions are thus defined: db and bd, which convert decimal to binary and vice versa (we use big-endian notation, i.e., the most significant bit appears on the left and the bits proceed towards the least significant bit on the right). It is also possible to generate an arbitrary subinterval of length 2^m; we will also refer to this subset as the partition. To this end, it is sufficient to generate a random integer a with 0 ≤ a ≤ 2^n - 2^m that represents the left end of the interval. Then, the working partition is given by the interval [a, a + 2^m - 1]. In [21] it is shown how to generate a partition by fixing n - m components of the key and working with the subset of the remaining m unknown components. Thus, any element z in the subregion represents an element of the key space of length n.
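The representation and conversion steps above can be sketched in Python (our illustration; the function names db, bd follow the text, while full_key is a hypothetical helper for the mapping k = q·2^m + r):

```python
def bd(bits):
    """Big-endian bit list to decimal (most significant bit first)."""
    value = 0
    for b in bits:
        value = (value << 1) | b
    return value

def db(value, length):
    """Decimal to big-endian bit list of a fixed length."""
    return [(value >> i) & 1 for i in range(length - 1, -1, -1)]

def full_key(q, r, m):
    """Key-space index of the key with quotient q and residual r: q*2^m + r."""
    return q * (1 << m) + r
```

With these helpers, a search can move only over the residual r inside a fixed block q, while the fitness is computed on the full key full_key(q, r, m).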
3. Tabu Search for Block Cipher Cryptanalysis
In this case, the population of possible individuals (solutions) corresponds to the set of keys. An iteration is one step of the Algorithm TS, and the set of neighbors in any step corresponds to the neighbors of the selected individual
(see Algorithm 1). When working with a partition of that space, the population is restricted to a subset within that partition. The methodology aims to employ key space partitioning strategies and transition between partitions, rather than working with excessively large population sets. This is why population sizes are chosen such that it is possible to label all individuals within the space, allowing for the identification of all individuals included in the Tabu list and the avoidance of their repetition. Furthermore, in the context of block cipher cryptanalysis, the neighbors of an individual
x consist of the
m individuals obtained by changing one component of
x while keeping the others unchanged, where
m is the cipher key length in the partition or a subset of the partition. Since this number of neighbors is relatively small, it is feasible to generate and evaluate all of them in the step in which an individual is selected as the current solution. In this way, once the current individual has been analyzed by generating all its neighbors, the algorithm will not need to revisit this point. For example, the neighbors of 000 are 100, 010, and 001.
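This neighborhood construction (flip exactly one component, keep the rest) can be sketched as follows (our Python illustration):

```python
def neighbors(x):
    """All m individuals at Hamming distance 1 from the m-bit tuple x."""
    result = []
    for i in range(len(x)):
        flipped = list(x)
        flipped[i] ^= 1  # change exactly one component
        result.append(tuple(flipped))
    return result
```

An individual of length m thus has exactly m neighbors, which is why labeling and exhaustively evaluating the whole neighborhood at each step is feasible.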
| Algorithm 1: Tabu Search |
| (Pseudocode figure: Cryptography 10 00008 i001) |
In this work, we do not focus on studying various fitness functions; instead, we concentrate solely on the fitness function based on the Hamming distance. The issue of a “flat fitness landscape” in high-round ciphers is discussed in [
22,
23]. The primary objective of the different strategies with the TS is to generate new population elements by creating new neighbors. Fitness functions play a crucial role in this process by guiding the selection of different paths and identifying individuals with a fitness value equal to 1.
3.1. Simple TS (STS) Methodology
A random initial individual is chosen as a potential key. Its neighbors are then examined, and the fitness values of these neighbors are calculated. The neighbor with the best fitness that is not on the Tabu list is selected, and this process is repeated until a stopping condition is met or the fitness reaches 1. A common stopping condition is reaching a predetermined maximum number of iterations.
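The STS loop just described can be sketched as follows (our Python sketch under simplifying assumptions: an unbounded tabu list and generic fitness and neighborhood functions, not the authors' implementation):

```python
def simple_tabu_search(seed, fitness, neighbors, max_iters=1000):
    """Simple Tabu Search: repeatedly move to the best non-tabu neighbor.

    Stops when a candidate with fitness 1 is found or after max_iters
    iterations (the stopping condition used in the experiments).
    """
    current = seed
    best = seed
    tabu = {seed}
    for _ in range(max_iters):
        if fitness(current) == 1.0:
            return current
        options = [n for n in neighbors(current) if n not in tabu]
        if not options:  # neighborhood exhausted
            break
        current = max(options, key=fitness)  # best non-tabu neighbor
        tabu.add(current)
        if fitness(current) > fitness(best):
            best = current
    return best
```

Note that the move is accepted even when it worsens the fitness, which is what lets TS escape local optima that would trap a pure hill climber.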
3.1.1. Description of the Experiments and Analysis
We work with a member of the AES(t) block cipher family described in Section 2.1. The same fitness function as in [17], described in Section 2, was used. We also set the group key length m = 16, so the number of neighbors obtained in each TS iteration is 16. The key space, restricted to the partition, has a cardinality of 2^16 = 65,536. A maximum of 1000 iterations is performed (stopping condition). Therefore, at most 16,000 individuals are sampled, which represents approximately 24.4% of the total number of elements. Thus, 50 experiments were performed, each created by randomly generating a key K in the partition and a plaintext T (both regenerated for each experiment), from which the ciphertext was computed. The experiments were labeled by numbers from 1 to 50 and represented by the corresponding key K. The objective was to determine whether the method can find the key by performing the known plaintext attack procedure described in Section 3 and Section 3.1.
The selection of the group key length m is a crucial factor influenced by the available computing architecture, the specific characteristics of the block cipher, and the fact that, for this key size, the method's results on a subset of the population of size 2^m are satisfactory. Specifically, in a high percentage of experimental trials, the method successfully recovers the key. However, a smaller m does not always guarantee proportionally better results. Acceptable behavior of the GA with this group key length has been studied in [17,24]. In this work, it has been confirmed that for the proposed methodologies with TS, this group key size is also appropriate, and the results are even better than those obtained with Genetic Algorithms.
To generate an initial individual (the seed), an initial population of 100 individuals is randomly generated in the partition, and the one with the highest fitness is selected. For the experiments in this work, we use a machine with the following characteristics: Intel(R) Core(TM) i9-13900H @ 2.60 GHz.
3.1.2. Discussion of the Results of the Experiments
The obtained results can be summarized as follows: of the 50 experiments performed, the key was found in 40, representing an 80% success rate. The longest execution time was 373.7 s, with an average time of 188.348 s (3.139 min). As demonstrated in [
24] for the Genetic Algorithm, it is possible to estimate execution time using this methodology. On the computer, an individual is generated, and the time required to perform 100 iterations is measured. This allows for estimating the time the methodology would take on that computer to complete 100 iterations, providing a fairly accurate procedure for predicting the actual execution time. This approach is very useful for planning and estimating the necessary resources and appropriate parameters for cryptanalysis on a specific architecture. In the experiments conducted, the average estimated time was 184.947 s, with a difference of 3.4 s; the estimate for the experiment with the longest execution time was 348.6 s, with a difference of 25.1 s. In terms of both effectiveness and execution time, the results surpass those obtained in [
17,
24] using the Genetic Algorithm methodologies.
Figure 1 presents the sequence of iterations, execution time, and estimated time for the 50 experiments. The failed experiments were those labeled 5, 6, 7, 8, 9, 15, 16, 31, 32, and 33. For these 10 cases, the seed was discarded, and the next highest fitness individual was selected from a randomly generated population of 100 individuals. However, none of these 10 experiments succeeded. Additionally, for each of these failed experiments, 10 new random seeds were generated. The key was found only in experiments 6 (on four occasions) and 31 (on two occasions), indicating that in these failed cases, the outcome depends more heavily on the key and plaintext than on the initial seed when using the STS method.
A better result was obtained by increasing the number of STS iterations (see
Table 2). It should be noted that, in this case, since the length of the subregion is 16, each individual has 16 neighbors; therefore, 4096 TS iterations correspond to analyzing a number of individuals equal to the size of the partition. Experiments 6, 7, 9, and 32 ended before reaching 2000 STS iterations, while experiments 8, 15, and 16 concluded between 2289 and 2805 iterations. This means that 7 of the 10 experiments that failed with up to 1000 iterations succeeded before 4096 iterations (in fact, before 3000 in this case).
3.2. Deep First TS Methodology
The Deep First TS methodology (DFTS), also known as the New Neighbors Control TS methodology, represents a family of TS algorithms distinguished by their novel approach to controlling the generation of neighbors. Since neighbors determine the possible paths from the beginning to the end of the procedure, this control is crucial. At various steps of the TS process, levels are generated with respect to the initial node; within these levels, individuals may be repeated. Consequently, a neighbor of the current individual at step k (level k) may either be a new individual that the algorithm has not previously generated or an individual that was generated as a neighbor in an earlier step but has not yet been chosen.
There are two primary motivations for formulating the Deep First strategy. First, to generate a path distinct from that produced by the STS; second, to prioritize the selection of new individuals at each step of the algorithm. This approach is based on the idea that once neighbors are generated, their fitness is known, regardless of whether they have been added to the Tabu list. While this may seem to restrict access to paths involving individuals that are not new, the neighbor generation process is repetitive, allowing elements to be generated again at different levels as neighbors of other individuals.
Depending on the size of the partition where the method is applied and the available computational resources, we propose two variants. In the first variant, it is possible to label all individuals, allowing control over when they are generated as neighbors and enabling the determination of whether they are new neighbors at a given level. The second variant performs this neighbor-generation control only up to a predetermined number l of levels; in this case, it is necessary to store, for each individual, the ancestral path by which it was generated, represented as a list of length l associated with each generated individual. In both variants, when an individual is generated as a neighbor by the TS, it is checked to determine whether it is new. If it is new, its fitness value is calculated and stored; if it has been generated previously, the stored fitness value is retrieved and recalculation is avoided. In this work, we use the first variant for DFTS.
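The neighbor-generation control shared by both variants can be sketched as a fitness cache (our Python illustration, not the authors' data structure; the attribute name `seen` is hypothetical):

```python
def make_cached_fitness(fitness):
    """Wrap a fitness function so each individual is evaluated only once.

    Membership in `cached.seen` tells whether an individual has already
    been generated as a neighbor, which is the information DFTS uses to
    distinguish new neighbors and to avoid recomputing fitness values.
    """
    seen = {}
    def cached(x):
        if x not in seen:
            seen[x] = fitness(x)  # first time: evaluate and store
        return seen[x]            # otherwise: reuse the stored value
    cached.seen = seen
    return cached
```

This is why the DFTS variants can run faster than plain STS: repeated neighbors cost a dictionary lookup instead of a full encryption and comparison.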
In this algorithm, a horizontal search refers to applying the STS criterion without differentiating between neighbors based on whether they have been previously generated. In contrast, a depth-first search prioritizes, at the current step, neighbors that have not been previously obtained and applies the STS criterion exclusively to this subset. Two parameters govern the process: one controls the extent of the horizontal search (p_h) and the other prioritizes the depth-first search (p_d). Consequently, depth-first search continues as long as condition (1) is satisfied, where the associated counters are initialized to 0, increment according to the number of horizontal or depth-first steps executed by the algorithm, and are reset when they reach their maximum values (p_h, p_d).
In this way, DFTS becomes a family of TS algorithms, depending on the combination of parameters (p_h, p_d). Within this family, two distinctive cases of TS algorithms can be identified. Deep First DFTS (DFTS(0,1)): with p_h = 0, p_d = 1, this variant prioritizes depth-first search at every step of the TS, that is, it selects the next element exclusively from the subset of new neighbors at the current level. Standard TS with DFTS (DFTS(N,0)): here N represents an upper bound on the number of individuals in the population; the key point is that p_d = 0 means that condition (1) for prioritizing depth-first search is never satisfied. This causes this variant of DFTS to behave identically to the STS, with the difference that the generation of new neighbors is controlled.
Experiments with DFTS Methodology
For the 10 failed STS experiments, the DFTS methodology was applied, and the results are presented in
Table 3. The columns labeled “Vec” represent the number of individuals generated as neighbors by the TS algorithm.
Some comments on the experimental results are in order. DFTS(0,1) found the key for experiments 6 and 15 in fewer than 1000 iterations, demonstrating how DFTS provides an alternative approach for improving the method's performance. The standard DFTS allows for evaluating the behavior of STS in terms of generating new neighbors and comparing it with other parameter variants. It can be observed that in all cases where DFTS failed, the number of neighbors generated by DFTS(0,1) exceeded the number generated by the standard variant. Note that the more neighbors generated, the greater the likelihood of finding the key among them. The data structure used in DFTS to track new neighbors and avoid reevaluating individuals' fitness ensures that the DFTS variants can run faster than STS while also benefiting from the additional information obtained. In particular, the standard DFTS is faster than STS.
Table 4 presents the results of executing DFTS(0,1) and standard DFTS over 1000 iterations for the 10 failed experiments, except 6 and 15, although we will include them for comparison purposes. We want to point out that the standard DFTS completes one iteration earlier than the STS because, by controlling the generation of new neighbors, it detects the individual with fitness 1 at the moment it is generated. This eliminates the need to wait for the next iteration (as in the STS) to select the next element of the Tabu list from the set of neighbors of the previous individual.
After an integrated evaluation of the results in
Table 3 and
Table 4, we conclude that the key can be found before 1000 iterations in two more experiments (6 and 15). Only two of the ten failed experiments in
Section 3.1 do not obtain the key before 4096 iterations, neither by DFTS(0,1) nor by STS (experiments 5 and 33). While DFTS(0,1) generates a larger number of distinct individuals up to a given level, it cannot be claimed that it generally produces better results than STS. What is true is that it provides a different search path in cases where STS fails. Additionally, it should be emphasized that the standard DFTS is a more efficient way to execute the STS method, provided that the size of the subregion and the infrastructure used support the implementation of the data structures and procedures required by DFTS to control the generation of the set of neighbors.
4. The Subregion Path Attack Methodology
The SPA (Subregion Path Attack) methodology is composed of two integrated approaches, which we refer to as SPA1 (external methodology) and SPA2 (internal methodology). The domain D of SPA is a subset of length m of the key space of length n, determined by a specific partitioning criterion. The division-with-remainder algorithm, applied to the decimal representation of the individuals of the key space with divisor 2^m, offers two variants for defining such domains D through the possible quotients and residues of the division (see [17,24]). Additionally, in [21,25], the construction of this domain is done by fixing n - m components of the key.
Then SPA1 operates on D as if it were the entire space, and D is partitioned by a specific procedure into subregions of length s, so that D is split into a set S consisting of 2^(m-s) subregions, each of size 2^s. Let c denote an individual in the key space, while r and z represent the same individual as an element of D and of the corresponding subregion j, respectively. On the other hand, the SPA2 methodology operates within the subregions. In SPA1, individuals are associated with different subregions, whereas in SPA2, individuals are the elements within a subregion. For each methodology, the corresponding operations must be internal to its domain; the only external operation is the fitness calculation, which requires obtaining the representation of the individual in the key space. We denote these conversions by the maps taking z to r and r to c (see
Table 5). In this way, any individual is connected to its different representations according to the working domain.
Since SPA1 works with subregions, a way to measure the fitness of a subregion is needed. If we denote the fitness function of SPA2 by F2, then we define the fitness of a subregion under SPA1 as the best F2 value found by SPA2 within that subregion; this is Equation (2). The SPA methodology involves traversing the subregions using the SPA1 approach, where each subregion corresponds to one and only one index j with 0 ≤ j < 2^(m-s). The SPA1 method, in turn, employs SPA2 to evaluate each subregion based on Equation (2). Therefore, it is crucial to assess SPA2 during the design tests to maximize its effectiveness. This evaluation should consider the characteristics of the block cipher, the available computational resources, and the relationship among the parameters n, m, and s. Note that, in particular, SPA2 could be implemented as an efficient exhaustive search within each subregion, guaranteeing 100% effectiveness in finding the key if it lies within that subregion.
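The two-level scheme can be sketched as follows (our Python illustration under simplifying assumptions: subregions indexed in order and SPA2 implemented as the exhaustive variant mentioned above; all names are hypothetical):

```python
def subregion_fitness(j, s, key_fitness):
    """SPA1 fitness of subregion j: the best key-space fitness found by
    SPA2 inside it. Here SPA2 is an exhaustive scan over the 2**s
    elements, so the key is guaranteed to be detected if it lies there."""
    base = j * (1 << s)  # first key-space index of subregion j
    return max(key_fitness(base + z) for z in range(1 << s))

def spa(num_subregions, s, key_fitness):
    """Minimal SPA1 loop: visit subregions and stop when one reaches 1."""
    for j in range(num_subregions):
        if subregion_fitness(j, s, key_fitness) == 1.0:
            return j  # index of the subregion containing the key
    return None
```

In the actual methodology, both the outer traversal and the inner evaluation can instead be driven by a metaheuristic such as TS; the exhaustive inner scan shown here is the limiting case with 100% effectiveness per subregion.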
4.1. SPA Experiments Using STS
We will use the SPA methodology, integrating STS for SPA1 and SPA2. We will work with the 10 failed experiments from
Section 3. Below, we show how the method is adapted for this case, see
Table 5 to see the representation of the individuals in the three possible domains. In this article, the domain of individuals corresponding to the experiments consists of intervals
(see
Section 2.3).
Table 5.
SPA with STS for the experiments: individual representation.
Table 5.
SPA with STS for the experiments: individual representation.
| The Key Space | The Domain | The Subregion j |
|---|---|---|
| c | r | |
| | | |
With respect to the computation of the fitness function, let F be the fitness function defined on the key space. If the optimal solution found by the STS in subregion j is x*_j, then the fitness assigned to subregion j is F(x*_j).
For each experiment, the SPA methodology is applied: SPA1 works with the STS in a space of 16 individuals, each individual corresponding to a subregion of length 2^12, and within these subregions of length 2^12 the STS is applied as SPA2. For each experiment,
Table 6 shows the representation of each individual according to this scheme.
In the STS for SPA2, up to 300 iterations are performed; in this case, traversing a number of elements equivalent to an exhaustive search would require 341 iterations. The iteration budget is large relative to the total number of elements because the SPA2 method is intended to be as effective as possible. The results obtained in the experiments are shown in
Table 7, which indicates the number of subregions traversed to find the key, the number of iterations after which the key is found in the corresponding subregion, the computer on which each experiment was executed, and the execution time.
In 7 of the 10 experiments, the SPA successfully identifies the key. In the implemented SPA scheme, traversing up to 10 subregions is equivalent to examining fewer than
36,000 individuals, which corresponds to performing up to 2000 iterations of the STS or DFTS algorithms. The superior effectiveness of this methodology in the 10 experiments, compared to both previous methods, is therefore apparent. However, these are not absolute criteria for all cases; the important point is to have multiple options available. Note that experiment 33 is the only case in which the key was not found using any of the techniques presented in this work so far. Let us therefore refine SPA2, using exhaustive search (ES) instead of STS, and examine the results for experiments 7, 32, and 33. For these three experiments,
Table 8 shows the seed that serves as the starting point for each methodology used. Each seed is then compared with the corresponding key in relation to the parameters of the SPA methodology.
The last two columns of Table 8 show, respectively, the maximum number of subregions that SPA1 would traverse before reaching the subregion containing the key, and the maximum number of individuals that SPA would traverse to find the key. Note that in experiment 7, the seed and the key are in the same subregion (the seed node for SPA1), whereas in experiments 32 and 33, the key's subregion is a neighbor of the seed node's. This explains why at most five subregions would be traversed from the seed node to the subregion containing the key. As shown in
Table 8, the SPA method with ES as SPA2 finds the key in experiments 5, 32, and 33 by traversing at most 20,480 individuals (31.25% of the population), which is equivalent to 1,280 iterations of STS or DFTS. This represents a better result for experiments 5 and 33; however, in experiment 32, STS finds the key in 1,041 iterations.
We will explain experiment 32 in greater detail, as presented in
Table 8. The individual corresponding to experiment 32 is the key generated for it, which, in the partition, is represented as 20012 in decimal form. Considering the four most significant bits of its binary representation, the subregion to which it belongs is determined among the 16 possible subregions; in this case, the subregion is 4. With the SPA, the same seed is taken as for the previous experiments, which is located in subregion 5. Subregion 4 is a neighbor of subregion 5, so the subregion of the key is obtained as one of the four possible neighbors of the seed's subregion. Therefore, the SPA algorithm would traverse (with SPA1) a maximum of five subregions. Furthermore, each of these subregions would be analyzed with SPA2, which, if the entire population is traversed at each node, would evaluate up to 2^12 = 4096 elements. Therefore, the maximum number of individuals analyzed when traversing 5 subregions would be at most 5 × 4096 = 20,480.
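The counting in this paragraph can be checked directly (using the 16-bit key, 16-subregion parameters of the experiments):

```python
key = 20012                       # key of experiment 32, in decimal
bits = format(key, "016b")        # its 16-bit binary representation
key_subregion = int(bits[:4], 2)  # first four most significant bits -> subregion 4
subregion_size = 2 ** 12          # 4096 individuals per subregion
max_subregions = 5                # the seed's subregion plus its four neighbors
max_individuals = max_subregions * subregion_size
print(bits, key_subregion, max_individuals)  # 0100111000101100 4 20480
```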
About the Different Methods with TS
In the STS experiments, up to 24% of a partition of size 2^16 is explored: the maximum number of iterations of the STS is 1000 and the number of neighbors of an individual is 16, so at most 16,000 individuals are evaluated. The STS with these characteristics has demonstrated acceptable behavior, given that in 40 of the 50 experiments carried out, the randomly generated key was found within the partition. These results compare favorably with a similar context using the genetic algorithm, as reported in [17,24].
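The 24% figure follows directly from the iteration budget, assuming a partition of size 2^16 (consistent with the other figures in this section):

```python
max_iterations = 1000            # STS iteration cap used in the experiments
neighbors_per_individual = 16    # neighbors evaluated per iteration
partition_size = 2 ** 16         # assumed partition size
explored = max_iterations * neighbors_per_individual
fraction = explored / partition_size
print(explored, round(100 * fraction, 1))  # 16000 24.4
```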
For each cipher, it is important to find a suitable partition size. Although certain architectures may allow it, we do not recommend increasing this size excessively. Instead, if the experiments are unsuccessful, consider performing repetitions with other variants such as DFTS or SPA, or increasing the maximum number of iterations of the STS. A moderate partition size also allows better tracking of the Tabu list, helping to avoid repetitions of individuals.
The design of DFTS enables the implementation of a more efficient variant of STS, allowing the algorithm to count the number of distinct individuals it labels through the DFTS neighbor generation control mechanism. Additionally, it avoids evaluating the fitness function multiple times for the same individual once it has been previously calculated. This is particularly important because computing the fitness function is the most computationally expensive step of the algorithm. DFTS represents a family of options that, beyond the advantages described above, also guarantees exploration of alternative search paths through TS.
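The caching idea behind DFTS can be sketched as a memoized wrapper around the fitness function. This is a generic illustration of the technique, not the paper's implementation:

```python
from functools import lru_cache

def make_cached_fitness(fitness):
    """Wrap a fitness function so each distinct individual is evaluated
    at most once. Useful when, as in DFTS, the same individual may be
    revisited and fitness evaluation is the most expensive step."""
    calls = {"raw": 0}   # counts evaluations of the underlying function

    @lru_cache(maxsize=None)
    def cached(x):
        calls["raw"] += 1
        return fitness(x)

    return cached, calls

cached_F, calls = make_cached_fitness(lambda x: -abs(x - 42))
for x in [40, 41, 40, 42, 41]:   # a walk that revisits individuals
    cached_F(x)
print(calls["raw"])  # 3: only the distinct individuals were evaluated
```

In DFTS the same effect is obtained through the neighbor generation control mechanism, which additionally lets the algorithm count the distinct individuals it has labeled.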
SPA enables the combination of a cryptanalysis method within a moderately sized partition, offering an improved efficiency–time trade-off during the process. Subsequently, another method can be applied to explore different partitions, treating each as an individual entity. This work presents the experimentation and results obtained by combining SPA with STS; however, SPA can be integrated with other heuristic algorithms. The key is to adapt these algorithms to operate effectively within closed partitions. Therefore, this work does not claim that the proposed methods are superior to other heuristic approaches. Instead, it develops a methodology for applying TS in various ways for cryptanalysis, while also considering potential actions to address failed experiments.