Applied Sciences
  • Article
  • Open Access

22 July 2025

Parallelization of Rainbow Tables Generation Using Message Passing Interface: A Study on NTLMv2, MD5, SHA-256 and SHA-512 Cryptographic Hash Functions

1 Department of Graphical Systems, Faculty of Fundamental Sciences, Vilnius Gediminas Technical University, LT-10223 Vilnius, Lithuania
2 Department of Information Systems, Faculty of Fundamental Sciences, Vilnius Gediminas Technical University, LT-10223 Vilnius, Lithuania
* Author to whom correspondence should be addressed.

Abstract

Rainbow table attacks utilize a time-memory trade-off to efficiently crack passwords by employing precomputed tables containing chains of passwords and hash values. Generating these tables is computationally intensive, and several researchers have proposed utilizing parallel computing to speed up the generation process. This paper introduces a modification to the traditional master-slave parallelization model using the MPI framework, where, unlike previous approaches, the generation of starting points is decentralized, allowing each process to generate its own tasks independently. Because each process generates its chains independently, this design reduces the number of inter-process communications and the associated overhead, improving the efficiency of rainbow table generation. We conducted three experiments to evaluate the performance of the parallel rainbow tables generation algorithm for four cryptographic hash functions: NTLMv2, MD5, SHA-256 and SHA-512. The first experiment assessed parallel performance, showing near-linear speedup and 95–99% efficiency across varying numbers of nodes. The second experiment evaluated scalability by increasing the number of processed chains from 100 to 100,000, revealing that higher workloads significantly impacted execution time, with SHA-512 being the most computationally intensive. The third experiment evaluated the effect of chain length on execution time, confirming that longer chains increase computational cost, with SHA-512 consistently requiring the most resources. The proposed approach offers an efficient and practical solution to the computational challenges of rainbow tables generation. The findings of this research can benefit key stakeholders, including cybersecurity professionals, ethical hackers, digital forensics experts and researchers in cryptography, by providing an efficient method for generating rainbow tables to analyze password security.

1. Introduction

In the field of information security, passwords play a key role in protecting information systems, resources and services [1]. Due to the sensitive nature of passwords, they are not stored in plain text but rather in an unreadable format, known as a hash value or digest, which is generated as an output from a cryptographic hash function [2]. This function converts a password of an arbitrary length to a fixed-length value. This process adds a layer of security by making it more difficult for hackers to decipher the original password from the stored hash value. Another security mechanism that can be used to enhance password security is password salting. This method involves adding a unique, random string (called a salt) to each password before hashing it. This makes the rainbow tables technique, which was, historically, highly effective against password hashes, computationally infeasible [3]. However, the topic of salted passwords is out of the scope of this paper due to our focus on optimizing rainbow tables generation for modern cryptographic hash functions.
Cryptographic hash functions are designed to be one-way functions, meaning that it is computationally infeasible to invert the function and recover the original password from the hash value. Instead, malicious actors obtain the hash value via compromise or exfiltration and use indirect methods to find out what the password is. The most common and well-known approach is the brute force attack [4], which tries every possible combination of characters until the correct password is found. Because it is exhaustive, it is the most effective approach, but it is also highly time-consuming and resource-intensive, especially for long and complex passwords. Another approach is the dictionary attack [4,5], which involves testing passwords from a pre-arranged list of words and known passwords from past data breaches. A dictionary attack is effective only in some cases, specifically for passwords that are simple, short, common, and easy to guess. If the passwords are long and complicated, they are unlikely to appear in any known dictionary.
In 2003, Oechslin [6] introduced a new method for password cracking called rainbow tables attack, which is based on the earlier work from Hellman [7], who was the first to propose using a time-memory trade-off for cryptanalysis. It is important to understand this evolution, as the current research builds directly on these principles to enhance the efficiency of rainbow tables generation using different parallel computing methods.
A time-memory trade-off occurs when an algorithm or program exchanges increased memory usage for reduced execution time. In this context, memory refers to the data storage resources used during task execution, such as random access memory (RAM) or hard disk space, while time refers to the time required to complete the task.
In the context of a rainbow table attack, a large, precomputed table containing passwords and hashes is generated before the password cracking process to reduce the time taken by the attack itself. Generating such tables, especially large ones, requires a significant amount of time and computational power. Parallel computing is a natural way to increase computational capabilities. However, the selection of an efficient parallel algorithm is highly dependent on the research field, the considered problem and the method used [8,9,10].
While the effectiveness of rainbow table attacks has declined in modern systems due to widespread adoption of password salting [11] and GPU capabilities [12], these techniques remain relevant in scenarios involving legacy systems, poorly configured authentication mechanisms or digital forensic investigations. In such contexts, the ability to rapidly generate large and diverse rainbow tables remains of practical importance. Prior work has explored various optimization strategies for rainbow table generation, including GPU-accelerated, FPGA-based, and MPI-based methods, each with different trade-offs in performance, scalability, and implementation complexity.
The contribution of this paper is a modified and improved implementation for the parallelization of rainbow tables generation using the MPI standard, based on the structure of the master-slave approach. The approach differs from the traditional master-slave model in how tasks are distributed: instead of the master distributing starting points to each slave process, each process generates them independently. This design eliminates the need for the master to coordinate task distribution, which is a well-known performance bottleneck in traditional MPI-based master-slave parallelization models [13,14]. As a result, our method significantly reduces inter-process communication, improves scalability and maintains high parallel efficiency even with limited computational resources.
MPI relies on communication between processes for parallelization, and most MPI-based algorithms require efficient communication to minimize latency and maximize computational performance. However, MPI communications require time due to factors such as latency, bandwidth limitations, network topology and communication overhead. Communication overhead in particular requires thoughtful consideration, and reducing it is the key to improving the performance of MPI-based algorithms. In the master-slave parallelization model, the communication overhead can be significant due to the way tasks and data are distributed between the master (which controls task assignment) and the slave processes (which perform computations). The overhead in this model comes from several key factors; the most relevant is the sending of messages and the use of message buffers. In this paper, we modified the master-slave model to reduce the number of point-to-point communications and improve the overall performance of the parallel algorithm.
The paper consists of five additional sections. Section 2 provides a comprehensive literature review on cryptographic hash functions and cryptanalytic time-memory trade-offs leading to parallel computing methods for rainbow tables generation. Section 3 presents the proposed method for parallel rainbow tables generation using MPI. The experimental results are presented in Section 4. Section 5 addresses the limitations of the research and suggests areas for future work. Finally, Section 6 concludes the paper.

3. Proposed Method

The implementation proposed in this paper was written using the C++ programming language and the MPI standard, specifically the OpenMPI implementation. MPI is a standard that allows multiple processors with distributed memory (each with its own separate memory) to work together on a task by exchanging information through messages. By breaking down a large task across processors, MPI enables faster and more efficient processing, making it well suited to scientific computing and simulations.
Our implementation leverages key MPI functions such as MPI_Send and MPI_Recv to exchange data and synchronize operations among processes.
The approach for the parallel rainbow tables generation is based on the master-slave parallelization model with a modification. The classic model was used by Al-Khazraji [10] and it seems to be a reasonable choice, as it makes resolving merges in rainbow tables a lot easier in a distributed memory environment.
The implementation from Al-Khazraji [10] follows the standard master-slave communication pattern, where the master generates the data and distributes it to all the slave processes, and they, in return, send the results back to the master process, which finalizes the task. The traditional master-slave model introduces a bottleneck because the master distributes the tasks. Our approach eliminates this step and improves efficiency. We decided to let each process (slaves and master) generate starting points independently and send back the generated chains to the master process, which decides whether to accept a chain or reject it depending on whether a chain with the same endpoint already exists in the table. Duplicates are detected with a set data structure, which yields a clean rainbow table, that is, a table without merged chains. It is important to note that the generated table is not a perfect table; the imperfection comes from the lack of a chain regeneration mechanism for rejected chains. This design choice helps not only to simplify the handling of merged chains but also to reduce the number of point-to-point communications, which can have a significant influence on the execution time, speedup and efficiency of the parallel implementation.
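The acceptance rule described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the names Chain and CleanTable are hypothetical, a chain is assumed to be stored only as its starting point and endpoint, and std::unordered_set stands in for whichever set structure the implementation actually uses.

```cpp
#include <cstddef>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical chain record: only the two ends of the chain are kept.
struct Chain {
    std::string start;
    std::string end;
};

// Hypothetical master-side table that stays "clean": a chain is accepted
// only if no previously accepted chain has the same endpoint.
class CleanTable {
public:
    // Returns true if the chain was accepted, false if it was rejected.
    bool offer(const Chain& chain) {
        // insert().second is false when the endpoint is already in the set.
        if (!endpoints_.insert(chain.end).second) {
            return false;  // merged chain: rejected, and not regenerated
        }
        chains_.push_back(chain);
        return true;
    }

    std::size_t size() const { return chains_.size(); }

private:
    std::unordered_set<std::string> endpoints_;  // endpoints seen so far
    std::vector<Chain> chains_;                  // accepted chains
};
```

Offering three chains of which two share an endpoint leaves two chains in the table, which mirrors why the final table can hold fewer chains than were requested.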
By allowing each process to generate starting points independently, the workload is naturally distributed among all nodes, ensuring even computational distribution. This approach eliminates the risk of any single process becoming a bottleneck, leading to more efficient scaling as additional nodes are introduced. We emphasize that our implementation is designed to reduce the impact of network latency by decreasing the number of small messages being sent and eliminating the unnecessary communications.
Figure 1 depicts a visual pseudocode of the proposed implementation. The light blue frame highlights the portion of the code that is executed by the master process, and the purple frame highlights the portion of the code that is executed by the slave processes.
Figure 1. Visual pseudocode for the implementation of the parallel rainbow table generation.
Figure 2 illustrates the parallel computing workflow using MPI for the proposed modified master-slave architecture. The sequence diagram depicts the interactions between a master process (rank 0) and multiple slave processes (rank 1 to rank N−1). Each lifeline in the sequence diagram represents an MPI process that generates starting points and endpoints. The program begins by initializing the MPI environment with MPI_Init(), after which each process retrieves its rank and the total number of processes. Each process independently generates a subset of starting points (SP) by converting indices into string representations and computes hash chains locally. For each starting point, the chain is computed by repeatedly applying a hash function followed by a reduction function, with the final reduced value stored as the endpoint (EP). The master process (rank 0) aggregates all chains by inserting its locally computed chains into a collection and receiving chains from all other processes via MPI_Recv, while slave processes (rank > 0) send their locally computed chains to the master using MPI_Send. This parallelization ensures that the workload is distributed evenly across processes, significantly reducing the time required to generate rainbow tables for modern hash functions. The program terminates by finalizing the MPI environment with MPI_Finalize().
Figure 2. UML sequence diagram for the MPI-based proposed implementation.
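The per-process work in this workflow can be sketched without MPI as shown below. Everything here is an assumption made for illustration: the round-robin split of indices by rank, the index-to-string mapping, the step-dependent reduction and all function names are hypothetical, and a 64-bit FNV-1a hash stands in for the real cryptographic hash functions so that the sketch compiles without Crypto++ or an MPI library. In an MPI program, rank and size would come from MPI_Comm_rank and MPI_Comm_size.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

const std::string kCharset = "abcdefghijklmnopqrstuvwxyz0123456789";
constexpr int kMaxLen = 5;  // passwords of 1..5 characters

// Number of passwords of length 1..kMaxLen over kCharset.
std::uint64_t search_space_size() {
    std::uint64_t total = 0, power = 1;
    for (int len = 1; len <= kMaxLen; ++len) {
        power *= kCharset.size();
        total += power;
    }
    return total;
}

// Maps an index in [0, search_space_size()) to a distinct password:
// 0 -> "a", 35 -> "9", 36 -> "aa", and so on.
std::string index_to_password(std::uint64_t index) {
    const std::uint64_t base = kCharset.size();
    std::uint64_t block = base;  // how many passwords have the current length
    int len = 1;
    while (len < kMaxLen && index >= block) {
        index -= block;
        block *= base;
        ++len;
    }
    std::string out(len, kCharset[0]);
    for (int pos = len - 1; pos >= 0; --pos) {
        out[pos] = kCharset[index % base];
        index /= base;
    }
    return out;
}

// Stand-in hash (64-bit FNV-1a), NOT a cryptographic hash function.
std::uint64_t toy_hash(const std::string& text) {
    std::uint64_t h = 14695981039346656037ULL;
    for (unsigned char c : text) {
        h ^= c;
        h *= 1099511628211ULL;
    }
    return h;
}

// Step-dependent reduction: maps a digest back into the password space.
std::string reduce(std::uint64_t digest, std::uint64_t step) {
    return index_to_password((digest + step) % search_space_size());
}

// Applies hash then reduction `length` times and returns the endpoint (EP).
std::string chain_endpoint(const std::string& start, std::uint64_t length) {
    std::string current = start;
    for (std::uint64_t step = 0; step < length; ++step) {
        current = reduce(toy_hash(current), step);
    }
    return current;
}

// Indices of the starting points that one process generates for itself.
std::vector<std::uint64_t> local_start_indices(int rank, int size,
                                               std::uint64_t total_chains) {
    assert(size > 0 && rank >= 0 && rank < size);
    std::vector<std::uint64_t> indices;
    for (std::uint64_t i = static_cast<std::uint64_t>(rank); i < total_chains;
         i += static_cast<std::uint64_t>(size)) {
        indices.push_back(i);
    }
    return indices;
}
```

Because every process derives its own indices from its rank, no starting point has to be sent over the network; only the resulting (SP, EP) pairs travel to rank 0.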

4. Results and Discussion

In this section, we present the results of a series of experiments to evaluate the performance of the proposed implementation and compare them to the results from three other studies [10,37,38]. The experiments were executed on a dedicated cluster located at Vilnius Gediminas Technical University consisting of 15 nodes, each with a 12th Gen Intel(R) Core(TM) i7-12700 processor (12 cores) and 16 GB RAM, with Crypto++ version 8.9.0 installed to support cryptographic hash functions. We used three metrics for the performance evaluation: the first is the execution time, the second is the speedup, calculated as the ratio between the execution time on a single processor and the execution time on p processors, and the third is the efficiency, calculated as the ratio of the speedup to the number of computing nodes. The formulas for speedup and efficiency are presented in Equation (1) and Equation (2), respectively. Both speedup and efficiency are unitless quantities; the speedup indicates how fast the program runs on p processes compared to a single process, while the efficiency represents how effectively multiple processors are utilized. The efficiency is typically in the range from 0 to 1 and is presented in percentages in this paper.
$$S_p = \frac{T_1}{T_p}, \tag{1}$$
$$E_p = \frac{S_p}{p}, \tag{2}$$
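The two metrics in Equations (1) and (2) can be computed with a pair of small helpers. This is a minimal sketch; the function names are hypothetical and times are assumed to be given in the same unit.

```cpp
#include <cassert>

// Speedup S_p = T_1 / T_p, where t1 is the single-process time and
// tp is the time on p processes (both in the same unit).
double speedup(double t1, double tp) {
    assert(t1 > 0.0 && tp > 0.0);
    return t1 / tp;
}

// Efficiency E_p = S_p / p, a value that is typically between 0 and 1.
double efficiency(double t1, double tp, int p) {
    assert(p > 0);
    return speedup(t1, tp) / p;
}
```

As a purely hypothetical usage example, a single-node time of 1119 s combined with an assumed 15-node time of 78 s would give a speedup of about 14.3 and an efficiency of about 96%.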

4.1. The Influence of Hash Functions on the Parallel Performance

In the first experiment, rainbow tables were generated for four different cryptographic hash functions. We chose to use SHA-256, SHA-512, MD5 and NTLMv2 for this experiment. MD5 was once widely used as a cryptographic hash function. However, it has been discovered to have many vulnerabilities, and similarly, NTLM, which is based on outdated cryptographic schemes, is considered weak as well, and even Microsoft recommends transitioning to more modern authentication schemes like Kerberos or Negotiation authentication. Despite their obsolescence, MD5 and NTLMv2 are still found in legacy systems and are occasionally encountered in real-world scenarios, particularly during forensic analysis or penetration testing, as well as for backward compatibility with older systems and servers [41]. In addition, the majority of the business systems in recent years are legacy applications. Recent statistics show that more than 60% of the budget in IT organizations is spent on maintaining these legacy systems [42], and those systems still use those outdated cryptographic hash functions. Therefore, these algorithms are included in this experiment to provide a benchmark for evaluating the effectiveness and performance of attacks against known weak hash functions.
We generated a rainbow table for each of the above-mentioned hash functions in two sizes: one with 10,000 processed chains, each containing 20,000 entries, and another with 50,000 chains, each containing 40,000 entries. The generated tables were designed for passwords of one to five characters. The input set is small because of the limited resources available. The charset comprises lowercase letters (26) and digits (10), forming a set of 36 characters. Combined with the range of password lengths, this results in a search space of 62,193,780 combinations. The number of combinations for a single password length was calculated with the permutations with repetition formula, shown in Equation (3).
$$P(r, n) = r^n \tag{3}$$
where r is the number of options for each character in the password (36 in this case), and n is the length of the password (from one to five characters). To get the total number of combinations, we need to add up all the values for each length, as shown in Equation (4).
$$\sum_{n=1}^{5} 36^n = 62{,}193{,}780 \tag{4}$$
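The sum can be checked with a few lines of code. The sketch below is illustrative only; the function name and its parameters are hypothetical, and the result is assumed to fit in 64 bits, which holds comfortably for the values used here.

```cpp
#include <cstdint>

// Total number of passwords with lengths min_len..max_len over an alphabet
// of r symbols, i.e. the sum of r^n for n in [min_len, max_len].
std::uint64_t search_space(std::uint64_t r, int min_len, int max_len) {
    std::uint64_t total = 0;
    std::uint64_t power = 1;
    for (int n = 1; n <= max_len; ++n) {
        power *= r;  // power now equals r^n
        if (n >= min_len) {
            total += power;
        }
    }
    return total;
}
```

Calling search_space(36, 1, 5) reproduces the total of 62,193,780, of which the five-character passwords alone account for 36^5 = 60,466,176.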
The specific dimensions for the table were selected to evaluate the performance and scalability of the MPI-based implementation under varying workloads considering the imposed resources limitations. The two configurations provide a basis for analyzing how table size and chain length affect parallel execution efficiency and resource utilization.
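As a rough worked estimate of how the two configurations compare, assume that every chain entry costs about one hash evaluation plus one reduction. The symbol W for this operation count is introduced here for illustration only and does not appear in the original analysis.

```latex
\begin{align*}
W_{\text{medium}} &= 10{,}000 \times 20{,}000 = 2 \times 10^{8},\\
W_{\text{large}}  &= 50{,}000 \times 40{,}000 = 2 \times 10^{9},\\
\frac{W_{\text{large}}}{W_{\text{medium}}} &= 10.
\end{align*}
```

Under this assumption the large table requires about ten times the work of the medium one, which is in line with the single-node SHA-512 times reported for the two tables (1119 s and 11,229 s).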
The results of this experiment are divided in two parts. The results of the first part of the experiment are presented in Figure 3 and Figure 4, which illustrate the execution times for generating rainbow tables using four cryptographic hash functions, SHA-256, SHA-512, MD5 and NTLMv2, across an increasing number of processing nodes in an MPI environment, starting from one node and incrementing by one until 15 nodes are reached, which is the maximum number of computing nodes in our experimental environment. The x-axis represents the number of nodes (ranging from 1 to 15), while the y-axis shows the execution time in seconds.
Figure 3. Execution time for parallel medium-size rainbow tables generation for SHA-256, SHA-512, MD5 and NTLM (NTLMv2) cryptographic hash functions across an increasing number of computing nodes. The lines for SHA-256, MD5 and NTLM overlap due to the relatively small table dimensions.
Figure 4. Execution time for parallel large-size rainbow tables generation for SHA-256, SHA-512, MD5 and NTLM (NTLMv2) cryptographic hash functions across an increasing number of computing nodes. The lines for MD5 and NTLM overlap due to the similarities between the internal characteristics of both hash functions.
Figure 3 presents the results for a 10,000-by-20,000 table, while Figure 4 presents the results for a 50,000-by-40,000 table. As depicted in both figures, the execution time decreases significantly with an increasing number of nodes, highlighting the benefit of parallel computing for this workload. With a single node, SHA-512 has the highest execution time (1119 s for the medium table and 11,229 s for the large table), followed by SHA-256 (730 s and 8888 s), NTLMv2 (722 s and 7148 s) and MD5 (704 s and 7134 s). However, as the number of nodes increases, the absolute difference in execution time between the hash functions decreases, with all four functions converging to similar execution times beyond 10 nodes.
SHA-256, MD5 and NTLMv2 have similar execution times in Figure 3, and no significant difference appears as the number of nodes increases. As seen in Figure 4, SHA-256 separates from the other two as the table dimensions increase, but there is still no significant difference between MD5 and NTLMv2 because NTLMv2 uses MD4 internally, whose structure is similar to that of MD5. As a result, the lines for NTLMv2 and MD5 overlap in the figures.
The second part of the experiment focuses on the speedup gained from the proposed implementation. The findings from this part of the experiment are shown in Table 1 and Table 2, which illustrate the speedup gains achieved by parallelizing rainbow table generation for the four previously mentioned cryptographic hash functions.
Table 1. Speedup for parallel medium-size rainbow tables generation for SHA-256, SHA-512, MD5 and NTLMv2 cryptographic hash functions across an increasing number of computing nodes.
Table 2. Speedup for parallel large-size rainbow tables generation for SHA-256, SHA-512, MD5 and NTLMv2 cryptographic hash functions across an increasing number of computing nodes.
In an ideal case, the speedup would follow a straight line, meaning that p processes make the generation process p times faster. We did not achieve a perfectly linear speedup, but the results are remarkably close to the ideal, a scenario known as near-linear speedup. The tables demonstrate that the speedup achieved for all hash functions closely follows the ideal linear trend, indicating efficient utilization of computational resources in the parallel environment. This consistent performance across hash functions highlights the scalability and effectiveness of the parallel implementation, even for computationally intensive functions like SHA-512. The efficiency likewise remains high, within the range of 95–99%.
These results indicate that the parallel computing approach using MPI achieves near-linear speedup, regardless of the computational complexity of the hash function and the table dimensions.

4.2. The Influence of Chain Count on the Execution Time

In the second experiment, we tested the scalability of our parallel implementation as the number of processed chains increases. This experiment was executed for the same four cryptographic hash functions as in Section 4.1: SHA-256, SHA-512, MD5 and NTLMv2. The implementation was executed on 15 computing nodes, and the execution times were measured for processing 100, 1000, 10,000 and 100,000 chains while keeping the chain length constant at 10,000. Those values were chosen to evaluate the scalability of the parallel implementation against different table sizes.
The results of this experiment are presented in Figure 5, which illustrates the relative execution times for generating rainbow tables using four cryptographic hash functions, SHA-256, SHA-512, MD5 and NTLMv2, with varying counts of chains. The values shown in the figure are relative to the fastest algorithm in each group. The x-axis represents the number of chains, while the y-axis shows the relative execution time. As shown in the figure, SHA-512 consistently exhibits the highest relative execution time, reflecting its greater computational complexity, while MD5 and NTLMv2 perform much faster, as expected given their simpler internal structure. SHA-256 occupies the middle ground.
Figure 5. Relative execution time for parallel rainbow tables generation with varying chain counts and 15 computing nodes for SHA-256, SHA-512, MD5 and NTLM (NTLMv2) cryptographic hash functions (the values are relative to the fastest algorithm in each group).

4.3. The Influence of Chain Length on the Execution Time

In the third experiment, we evaluated the performance of our implementation when increasing the length of the generated chains. This experiment was executed for the same four cryptographic hash functions as mentioned in the previous subsections. The implementation was executed on 15 computing nodes and with 10,000 processed chains.
The execution times were measured for chain lengths from 10 to 100, increasing the length by a factor of 10 for each step. Those values were chosen to evaluate how the parallel implementation scales with chain length.
Figure 6 presents the results of this experiment, where each line represents a different hash function. The x-axis represents the chain length, and the y-axis represents the execution time in milliseconds. Two observations can be made about the relationship between chain length and execution time.
Figure 6. Execution time for parallel rainbow tables generation with varying chain lengths and 15 computing nodes for SHA-256, SHA-512, MD5 and NTLM (NTLMv2) cryptographic hash functions.
First, we can notice that as the chain length increases, the execution time for all four hash functions also increases. However, the rate of increase differs from one hash function to another. SHA-512 exhibits the steepest slope, indicating a higher computational cost compared to the other hash functions. MD5, on the other hand, has the shallowest slope, suggesting the lowest computational overhead. The execution times for SHA-256 and NTLMv2 fall between those of SHA-512 and MD5, with NTLMv2 consistently below SHA-256 across all chain lengths.
The second observation is related to the similarity between MD5 and NTLMv2, which indicates that their internal implementation is similar. In fact, NTLMv2 uses the MD4 hash function, which has similar structure, operates on the same size blocks, and produces a digest of the same size as MD5. This similarity is significant because it tells us that NTLMv2 inherits structural weaknesses from MD4 and MD5, both of which have known cryptographic vulnerabilities.

4.4. The Impact of the Internal Characteristics of the Cryptographic Hash Functions on the Performance of Parallel Rainbow Tables Generation

Cryptographic hash functions are designed to transform a message into a fixed-length digest using a series of mathematical operations. Their internal characteristics include digest size, block size and number of rounds. These characteristics determine their computational efficiency, security and suitability for different applications. Each hash function differs by the block size it operates on and the number of rounds it performs.
The digest size determines the length of the hash output, affecting security against collision attacks. Larger digest sizes, such as the 512-bit output of SHA-512, provide enhanced resistance, while smaller digest sizes, like the 128-bit outputs of MD5 and NTLMv2, are more vulnerable to cryptographic attacks.
The block size defines the amount of data processed by each call of the compression function. SHA-512 operates on 1024-bit blocks, whereas SHA-256 and MD5 use 512-bit blocks. Larger block sizes demand more computational resources but offer increased resistance to attacks. MD4, from which NTLMv2 is derived, also operates on 512-bit blocks but applies a simpler sequence of operations to each block, further reducing its security.
The number of rounds impacts both security and execution time. SHA-512 performs 80 rounds of computations, more than the 64 rounds of SHA-256, increasing its computational workload. MD5 also performs 64 steps, organized as four rounds of 16 operations, whereas NTLMv2, whose underlying MD4 function has only three rounds, is computationally lightweight but highly insecure.
The impact of these characteristics is evident in the results obtained from the experiments. Table 3 summarizes the internal characteristics of the cryptographic hash functions examined in this study. During our experiments, SHA-512 demonstrated the highest execution time due to its increased number of rounds and larger block size, making it computationally intensive. In contrast, MD5 and NTLMv2 exhibited the lowest execution times, due to their smaller block sizes, fewer rounds and simpler computational structures. These factors directly affect execution time, as hash functions with greater complexity require more processing cycles. The increased number of computational steps, such as additional rounds and larger block sizes, results in higher memory and CPU utilization, leading to longer processing times.
Table 3. Summary of cryptographic hash functions internal characteristics.
The trade-off between security and performance is evident: SHA-512 offers superior cryptographic strength at the cost of higher computational demands, while MD5 and NTLMv2 provide faster processing speeds but lack security resilience.

4.5. Comparative Analysis

In our proposed parallel implementation, we chose to utilize the master-slave approach with our proposed modification using the Message Passing Interface standard to improve the performance of rainbow tables generation. The classic master-slave approach relies on the master process to generate tasks and distribute them among the slave processes. As the distribution of tasks requires communications between processes, which contributes to the execution time, we gave each process, including the master process, the responsibility of generating the tasks (generate starting points for rainbow chains) for themselves, which reduces the amount of communications between processes and, as a result, improves the performance of the implementation.
In this section, we aim to compare our results to three other MPI-based implementations from Sykes et al. [37], Al-Khazraji [10] and Avoine et al. [38], who used MPI to speed up the rainbow tables generation. The first two implementations are quite old, but they still present questions relevant to modern computing.
Each implementation, including the approach in this paper, addresses a different problem. Sykes et al. [37] focused on improving the RainbowCrack tool for rainbow tables attack, essentially targeting the long time required to generate rainbow tables as well as to search them. Al-Khazraji [10] focused solely on the long time required to generate rainbow tables, and Avoine et al. [38] aimed at solving the problem of inefficiency related to discarding merged chains during the generation process.
In terms of the parallelization model, only Al-Khazraji [10] explicitly mentioned the parallelization model (classic master-slave). Our method improves on this model by decentralizing the chains generation to each process (modified master-slave).
Looking at the number of computing nodes (or CPU cores), we can see significant differences. Al-Khazraji [10] utilized the largest number of computing nodes (exactly 201), while Avoine et al. [38] utilized the fewest, five CPU cores. Our experimental environment consists of 15 nodes. Although more resources are generally better, a large cluster is not always available.
In terms of performance, Al-Khazraji [10] achieved the largest performance gain due to the size of the cluster used in his experiments. Using 201 computing nodes, one for the master process and 200 for the slave processes, he reduced the generation time for a rainbow table of 10,000,000 chains from 816.62 min in the sequential implementation to 3.38 min in the parallel implementation, and for a table of 100,000,000 chains from 7.14 days to 46.45 min. In our case, we are limited by the number of computing nodes; we utilized exactly 15 nodes in our experimental environment. Generating a rainbow table with dimensions close to 10,000,000 chains takes 41.73 min, which is less than the 46.45 min that the parallel implementation in [10] needed to generate 100,000,000 chains.
Sykes et al. [37] also achieved some performance gain reducing the execution time from an estimated six years to just a few days. This result demonstrates a significant increase in efficiency, indicating the potential of their optimizations or architectural changes in resolving the scalability issues of the original solution. Despite this improvement, the execution time still has practical limitations. In real-world scenarios, particularly those involving iterative development, frequent updates, or time-sensitive analysis, a several-day wait time may be considered excessive.
In contrast, Avoine et al. [38] achieved a 2.56 speedup factor when employing a multicore processing configuration with five CPU cores on a standard personal computer. While this result implies an acceptable level of parallel efficiency, it also highlights limitations in scalability and parallel job distribution. This result shows that the algorithm’s intrinsic serial components limit its capacity to scale efficiently over several cores. Furthermore, the utilization of only five cores, while feasible for a consumer-grade setup, raises the question of how well the approach would function on systems with more parallel capacity.
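One way to quantify the remark about serial components is to apply Amdahl's law, S = 1 / (f + (1 - f) / p), and solve it for the serial fraction f. This is our own back-of-the-envelope reading, not an analysis reported in [38]: it assumes Amdahl's model holds and lumps communication overhead and load imbalance into f. The function names are illustrative.

```python
def parallel_efficiency(speedup: float, cores: int) -> float:
    """Speedup per core."""
    return speedup / cores


def amdahl_serial_fraction(speedup: float, cores: int) -> float:
    """Serial fraction f that solves S = 1 / (f + (1 - f) / p) for f."""
    if cores < 2:
        raise ValueError("need at least two cores to estimate a serial fraction")
    return (cores / speedup - 1.0) / (cores - 1.0)


def amdahl_speedup(serial_fraction: float, cores: int) -> float:
    """Speedup predicted by Amdahl's law for a given serial fraction."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / cores)


f = amdahl_serial_fraction(2.56, 5)
print(f"efficiency on 5 cores: {parallel_efficiency(2.56, 5):.3f}")
print(f"implied serial fraction: {f:.3f}")
print(f"upper bound on speedup as cores grow: {1.0 / f:.2f}")
```

Under this model, a speedup of 2.56 on five cores corresponds to an efficiency of 0.512 and a serial fraction of about 0.24, which would cap the achievable speedup at roughly 4.2 regardless of how many cores are added.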
A key factor influencing the performance of MPI-based implementations is the number of communications required. In our implementation, this number depends directly on the number of processed chains, because each process generates its chains independently. The implementation in [10], by contrast, requires additional communications for task distribution. Our approach eliminates this overhead by avoiding explicit task distribution and, as a result, reduces the bottleneck in the master process.
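The difference in communication volume can be illustrated with a toy message-count model. The model is an assumption made for illustration, not a measurement: it supposes that the classic scheme needs one message to hand out each task and one to return its result, while the modified scheme needs only the result message. Real implementations may batch several chains per message, which the hypothetical `batch_size` parameter approximates.

```python
def classic_master_slave_messages(chains: int, batch_size: int = 1) -> int:
    """Master sends each task out; the slave sends the finished chains back."""
    batches = -(-chains // batch_size)  # ceiling division
    task_messages = batches             # master -> slave: starting point(s)
    result_messages = batches           # slave -> master: (start, end) pairs
    return task_messages + result_messages


def modified_master_slave_messages(chains: int, batch_size: int = 1) -> int:
    """Each process derives its own starting points; only results travel."""
    batches = -(-chains // batch_size)  # ceiling division
    return batches                      # process -> master: (start, end) pairs


for n in (100, 1_000, 10_000, 100_000):
    c = classic_master_slave_messages(n)
    m = modified_master_slave_messages(n)
    print(f"{n:>7} chains: classic {c:>7} messages, modified {m:>7} messages")
```

In this simplified model the message count still grows linearly with the number of chains in both schemes, but the modified scheme halves it and removes the master from the task-distribution path.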
This comparison highlights the importance of efficient process management and communication strategies in parallel computing. Our results confirm that a carefully designed parallel implementation can deliver excellent performance, even in environments with limited computational resources. In Table 4, we present the summary of the comparative analysis, covering six aspects of all implementations. Those aspects include the problem addressed in each paper, the parallelization model, number of computing nodes (or CPU cores) used for execution of experiments, performance improvement as reported in the respective papers, execution time and the strategy to handle chain merges.
Table 4. Summary of comparative analysis.

5. Limitations and Future Work

A few limitations were identified during the course of this research. One key constraint was the limited number of computing nodes available in our experimental environment, which restricted the scale of our implementation compared to other studies; the implementation still needs to be tested on larger clusters. Another limitation is the lack of a chain regeneration mechanism, which is difficult to implement in a distributed-memory environment. Without it, fewer chains are generated than requested, which makes the table imperfect. Future work could address this issue and improve the rainbow table generation process. Another area for future work is to experiment with other parallelization techniques for rainbow table generation, for example, using GPUs to accelerate the generation process.
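The effect of the missing regeneration step can be reproduced in a small single-process simulation. This is a minimal sketch under several assumptions, not the implementation used in the experiments: MPI ranks are replaced by a plain loop, the alphabet, password length, reduction function and rank-strided choice of starting points are all hypothetical, and MD5 is used only because it is one of the four studied hash functions and ships with Python's `hashlib`.

```python
import hashlib

CHARSET = "abcdefghijklmnopqrstuvwxyz"  # hypothetical password alphabet
PASSWORD_LEN = 3                         # tiny space so that merges actually occur
SPACE = len(CHARSET) ** PASSWORD_LEN


def index_to_password(index: int) -> str:
    """Map an integer in [0, SPACE) to a fixed-length password."""
    chars = []
    for _ in range(PASSWORD_LEN):
        index, digit = divmod(index, len(CHARSET))
        chars.append(CHARSET[digit])
    return "".join(chars)


def reduce_hash(digest: bytes, column: int) -> str:
    """Column-dependent reduction from hash bytes to a password (illustrative)."""
    value = int.from_bytes(digest[:8], "big")
    return index_to_password((value + column) % SPACE)


def build_chain(start: str, chain_length: int) -> str:
    """Alternate hash and reduce chain_length times and return the endpoint."""
    password = start
    for column in range(chain_length):
        digest = hashlib.md5(password.encode()).digest()
        password = reduce_hash(digest, column)
    return password


def generate_table(requested: int, workers: int, chain_length: int) -> dict:
    """Simulate the workers in a loop; keep one chain per distinct endpoint."""
    table = {}  # endpoint -> starting point
    for rank in range(workers):
        # Each simulated process picks its own starting points:
        # rank, rank + workers, rank + 2 * workers, ...
        for index in range(rank, requested, workers):
            start = index_to_password(index)
            endpoint = build_chain(start, chain_length)
            # A chain that merges with an earlier one is dropped, not regenerated.
            table.setdefault(endpoint, start)
    return table


requested = 2_000
table = generate_table(requested, workers=4, chain_length=50)
print(f"requested {requested} chains, kept {len(table)} after removing merges")
```

A regeneration mechanism would replace each dropped chain with a chain built from a fresh starting point until the requested count is reached; in a distributed setting this requires the master to report the shortfall back to the workers, which is the coordination step the text identifies as hard to implement.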

6. Conclusions

This research presents an approach to the parallel generation of rainbow tables using the Message Passing Interface standard. The proposed approach is based on a modified master-slave model, where each process independently generates starting points, and the master process resolves chain conflicts. Our implementation achieves significant improvements in performance and efficiency. The reduction in inter-process communication compared to traditional approaches enhances scalability and execution speed, as demonstrated through experimental evaluations. The experimental evaluation is composed of three experiments. In the first experiment, we evaluated the parallel performance for four cryptographic hash functions (SHA-256, SHA-512, MD5 and NTLMv2), achieving near-linear speedup and maintaining efficiency between 95% and 99% across different numbers of computing nodes. This demonstrates the scalability and robustness of the implementation, even for computationally demanding functions like SHA-512.
The second experiment highlighted the scalability of the implementation by varying the number of processed chains from 100 to 100,000. The results revealed that the execution time increases with larger workloads, with SHA-512 exhibiting the highest computational cost due to its complexity. In the third experiment, we analyzed the influence of chain length on execution time. The findings showed that longer chains significantly increased execution times, with SHA-512 once again showing the steepest growth, while MD5 and NTLMv2 had relatively lower computational overhead.
The study also acknowledges certain limitations, including the absence of a chain regeneration mechanism and constraints imposed by the available computing infrastructure. Our implementation could be extended to longer and more complex passwords by adjusting the reduction function and the search space size. This would require additional computational resources, because the search space grows exponentially with password length. Overall, the proposed approach contributes a practical and efficient solution to the computational demands of rainbow table generation, advancing the field of cryptanalysis and password recovery.
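The growth of the search space mentioned above can be made concrete: for an alphabet of c characters and password lengths from l_min to l_max, the number of candidates is the sum of c^k over all lengths k in that range. The alphabet sizes and length limits below are hypothetical examples chosen for illustration, not the parameters used in the experiments.

```python
def search_space_size(charset_size: int, min_len: int, max_len: int) -> int:
    """Number of candidate passwords with length in [min_len, max_len]."""
    return sum(charset_size ** length for length in range(min_len, max_len + 1))


# Hypothetical alphabets: 26 lowercase letters vs. 94 printable ASCII characters.
for charset_size in (26, 94):
    for max_len in (6, 8, 10):
        n = search_space_size(charset_size, 1, max_len)
        print(f"|charset| = {charset_size:>2}, length <= {max_len:>2}: {n:.3e} candidates")
```

Each additional character multiplies the size of the space by roughly the alphabet size, so the number of chains or the chain length must grow accordingly to keep the same coverage.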
While the use of password salting has significantly reduced the effectiveness of rainbow tables in modern authentication systems, our work remains relevant in scenarios where salting is absent or improperly implemented or when legacy systems are targeted. In addition, the use of rainbow tables can be useful for penetration testing, cyber forensics investigations and password security evaluations.

Author Contributions

Conceptualization, M.V., N.G. and A.K.; methodology, M.V., N.G. and A.K.; software, M.V.; formal analysis, M.V.; investigation, M.V. and A.K.; supervision, A.K. and N.G.; writing—original draft preparation, M.V.; writing—review and editing, M.V., N.G. and A.K. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

Data are available in a publicly accessible repository. The original data presented in the study are openly available on GitHub at https://github.com/markvnr/MPI-RainbowTables-Experiment-Data/ (accessed on 18 February 2025).

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Bonneau, J.; Herley, C.; Van Oorschot, P.C.; Stajano, F. The Quest to Replace Passwords: A Framework for Comparative Evaluation of Web Authentication Schemes. In Proceedings of the IEEE Symposium on Security and Privacy, San Francisco, CA, USA, 20–23 May 2012; Institute of Electrical and Electronics Engineers Inc.: New York, NY, USA, 2012; pp. 553–567.
  2. Menezes, A.J.; Vanstone, S.A.; Van Oorschot, P.C. Handbook of Applied Cryptography, 1st ed.; CRC Press, Inc.: Boca Raton, FL, USA, 1996; ISBN 0849385237.
  3. Horálek, J.; Holík, F.; Horák, O.; Petr, L.; Sobeslav, V. Analysis of the Use of Rainbow Tables to Break Hash. J. Intell. Fuzzy Syst. 2016, 32, 1523–1534.
  4. Bosnjak, L.; Sres, J.; Brumen, B. Brute-Force and Dictionary Attack on Hashed Real-World Passwords. In Proceedings of the 2018 41st International Convention on Information and Communication Technology, Electronics and Microelectronics, MIPRO 2018—Proceedings, Opatija, Croatia, 21–25 May 2018; Institute of Electrical and Electronics Engineers Inc.: New York, NY, USA, 2018; pp. 1161–1166.
  5. Delaune, S.; Jacquemard, F. A Theory of Dictionary Attacks and Its Complexity. In Proceedings of the Computer Security Foundations Workshop, Pacific Grove, CA, USA, 30 June 2004; Institute of Electrical and Electronics Engineers Inc.: New York, NY, USA, 2004; Volume 17, pp. 2–15.
  6. Oechslin, P. Making a Faster Cryptanalytic Time-Memory Trade-Off. In Advances in Cryptology—CRYPTO 2003; LNCS; Boneh, D., Ed.; Springer: Berlin/Heidelberg, Germany, 2003; Volume 2729, pp. 617–630.
  7. Hellman, M. A Cryptanalytic Time-Memory Trade-Off. IEEE Trans. Inf. Theory 1980, 26, 401–406.
  8. Kačeniauskas, A.; Rutschmann, P. Parallel FEM Software for CFD Problems. Informatica 2004, 15, 363–378.
  9. Kačeniauskas, A.; Kačianauskas, R.; Maknickas, A.; Markauskas, D. Computation and Visualization of Discrete Particle Systems on GLite-Based Grid. Adv. Eng. Softw. 2011, 42, 237–246.
  10. Al-Khazraji, S.H.A.A. Using Parallel Computing to Implement Security Attack. Int. J. Comput. Sci. Inf. Secur. 2015, 13, 35–38.
  11. Meganathan, N. What Is the Effectiveness of Salt and Pepper in Preventing Rainbow Table Attacks in Modern Password Hashing Algorithms? Int. J. Innov. Sci. Res. Technol. 2024, 9, 242–248.
  12. Fosaaen, K. LM Hash Cracking—Rainbow Tables vs GPU Brute Force. Available online: https://www.netspi.com/blog/technical-blog/network-pentesting/lm-hash-cracking-rainbow-tables-vs-gpu-brute-force/ (accessed on 24 June 2025).
  13. Pabico, J.P. A Framework for a Multiagent-Based Scheduling of Parallel Jobs. arXiv 2015.
  14. Borisenko, A.B.; Gorlatch, S. Parallel MPI-Implementation of the Branch-and-Bound Algorithm for Optimal Selection of Production Equipment. Bull. Tambov. State Tech. Univ. 2016, 22, 350–357.
  15. Stevens, H. Hans Peter Luhn and the Birth of the Hashing Algorithm. Available online: https://spectrum.ieee.org/hans-peter-luhn-and-the-birth-of-the-hashing-algorithm (accessed on 28 January 2025).
  16. Pacevič, R.; Kačeniauskas, A. Hash Functions and GPU Algorithm of Infinite Grid Method for Contact Search. Inf. Technol. Control 2022, 51, 48–58.
  17. Tang, M.; Liu, Z.; Tong, R.; Manocha, D. PSCC: Parallel Self-Collision Culling with Spatial Hashing on GPUs. Proc. ACM Comput. Graph. Interact. Tech. 2018, 1, 1–18.
  18. Wang, X.; Feng, D.; Lai, X.; Yu, H. Collisions for Hash Functions MD4, MD5, HAVAL-128 and RIPEMD. Cryptol. ePrint Arch. 2004, 2004/199.
  19. Stevens, M.; Sotirov, A.; Appelbaum, J.; Lenstra, A.; Molnar, D.; Osvik, D.A.; de Weger, B. Short Chosen-Prefix Collisions for MD5 and the Creation of a Rogue CA Certificate. In Advances in Cryptology—CRYPTO 2009; LNCS; Springer: Berlin/Heidelberg, Germany, 2009; Volume 5677, pp. 55–69.
  20. Xie, T.; Liu, F.; Feng, D. Fast Collision Attack on MD5. Cryptol. ePrint Arch. 2013, 2013/170.
  21. Nkouankou, A.; Clarice, F.; Abel, W.; Ndoundam, R. Pre-Image Attack of the MD5 Hash Function by Proportional Logic. Int. J. Res. Innov. Appl. Sci. 2022, 7, 2454–6194.
  22. Zhong, J.; Lai, X. Preimage Attacks on Reduced DHA-256. Cryptol. ePrint Arch. 2009, 2009/552.
  23. Dobbertin, H. The First Two Rounds of MD4 Are Not One-Way. In Fast Software Encryption; LNCS; Springer: Berlin/Heidelberg, Germany, 1998; Volume 1372, pp. 284–292.
  24. Kelsey, J.; Schneier, B. Second Preimages on N-Bit Hash Functions for Much Less than 2^n Work. In Advances in Cryptology—EUROCRYPT 2005; LNCS; Springer: Berlin/Heidelberg, Germany, 2005; Volume 3494, pp. 474–490.
  25. Sulak, F.; Koçak, O.; Saygı, E.; Öğünç, M.; Bozdemır, B. A Second Pre-Image Attack and a Collision Attack to Cryptographic Hash Function Lux. Commun. Fac. Sci. Univ. Ank. Ser. A1 Math. Stat. 2017, 66, 254–266.
  26. Andreeva, E.; Bouillaguet, C.; Dunkelman, O.; Fouque, P.-A.; Hoch, J.; Kelsey, J.; Shamir, A.; Zimmer, S. New Second-Preimage Attacks on Hash Functions. J. Cryptol. 2016, 29, 657–696.
  27. Denning, D. Cryptography and Data Security; Addison-Wesley Longman Publishing Co., Inc.: Boston, MA, USA, 1982; ISBN 0201101505.
  28. Avoine, G.; Junod, P.; Oechslin, P. Characterization and Improvement of Time-Memory Trade-Off Based on Perfect Tables. ACM Trans. Inf. Syst. Secur. 2008, 11, 1–22.
  29. Avoine, G.; Carpent, X. Optimal Storage for Rainbow Tables. In Proceedings of the Information Security and Cryptology—ICISC 2013, LNCS, Seoul, Korea, 27–29 November 2013; Lee, H.-S., Dong-Guk, H., Eds.; Springer: Berlin/Heidelberg, Germany, 2014; Volume 8565, pp. 144–157.
  30. Quan, L.J.; Ye, T.J.; Ling, G.G.; Balachandran, V. QIris: Quantum Implementation of Rainbow Table Attacks. arXiv 2024, arXiv:2408.07032.
  31. Dat, T.N.; Iwai, K.; Matsubara, T.; Kurokawa, T. Implementation of High Speed Rainbow Table Generation Using Keccak Hashing Algorithm on GPU. In Proceedings of the 2019 6th NAFOSTED Conference on Information and Computer Science (NICS), Hanoi, Vietnam, 12–13 December 2019; IEEE: New York, NY, USA, 2019; pp. 166–171.
  32. Kim, J.W.; Seo, J.; Hong, J.; Park, K.; Kim, S. High-speed Parallel Implementations of the Rainbow Method Based on Perfect Tables in a Heterogeneous System. Softw. Pract. Exp. 2015, 45, 837–855.
  33. Li, P.; Zhu, W.; Chen, J.; Yao, S.; Hsu, C.F.; Xiong, G. High-Speed Implementation of Rainbow Table Method on Heterogeneous Multi-Device Architecture. Future Gener. Comput. Syst. 2023, 143, 293–304.
  34. Kalenderi, M.; Pnevmatikatos, D.; Papaefstathiou, I.; Manifavas, C. Breaking the GSM A5/1 Cryptography Algorithm with Rainbow Tables and High-End FPGAS. In Proceedings of the 22nd International Conference on Field Programmable Logic and Applications (FPL), Oslo, Norway, 29–31 August 2012; IEEE: New York, NY, USA, 2012; pp. 747–753.
  35. Papantonakis, P.; Pnevmatikatos, D.; Papaefstathiou, I.; Manifavas, C. Fast, FPGA-Based Rainbow Table Creation for Attacking Encrypted Mobile Communications. In Proceedings of the 2013 23rd International Conference on Field programmable Logic and Applications, Porto, Portugal, 2–4 September 2013; IEEE: New York, NY, USA, 2013; pp. 1–6.
  36. Theocharoulis, K.; Papaefstathiou, I.; Manifavas, C. Implementing Rainbow Tables in High-End FPGAs for Super-Fast Password Cracking. In Proceedings of the 2010 International Conference on Field Programmable Logic and Applications, Milan, Italy, 31 August–2 September 2010; IEEE: New York, NY, USA, 2010; pp. 145–150.
  37. Sykes, E.R.; Skoczen, W. An Improved Parallel Implementation of RainbowCrack Using MPI. J. Comput. Sci. 2014, 5, 536–541.
  38. Avoine, G.; Carpent, X.; Leblanc-Albarel, D. Stairway to Rainbow. In Proceedings of the ACM Asia Conference on Computer and Communications Security, Melbourne, VIC, Australia, 10–14 July 2023; Association for Computing Machinery: New York, NY, USA, 2023; pp. 286–299.
  39. Avoine, G.; Carpent, X.; Leblanc-Albarel, D. Precomputation for Rainbow Tables Has Never Been so Fast. In Proceedings of the 26th European Symposium on Research in Computer Security, Darmstadt, Germany, 4–8 October 2021; pp. 215–234.
  40. Westergaard Jørgensen, M. Free Rainbow Tables: Distributed Rainbow Table Project. Available online: https://freerainbowtables.com/ (accessed on 2 November 2024).
  41. Vaideeswaran, N. NTLM Explained. Available online: https://www.crowdstrike.com/en-us/cybersecurity-101/identity-protection/windows-ntlm/ (accessed on 26 June 2025).
  42. Adusumilli, S. Testing a Legacy Application with Zero Documentation. Available online: https://www.cigniti.com/blog/testing-a-legacy-application-with-zero-documentation/ (accessed on 26 June 2025).
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
