Proceeding Paper

Intelligent Password Guessing Using Feature-Guided Diffusion †

Department of Information Engineering and Computer Science, Feng Chia University, Taichung 407102, Taiwan
*
Author to whom correspondence should be addressed.
Presented at 8th International Conference on Knowledge Innovation and Invention 2025 (ICKII 2025), Fukuoka, Japan, 22–24 August 2025.
Eng. Proc. 2025, 120(1), 51; https://doi.org/10.3390/engproc2025120051
Published: 5 February 2026
(This article belongs to the Proceedings of 8th International Conference on Knowledge Innovation and Invention)

Abstract

In modern cybersecurity and deep learning, conditional password guessing plays a critical role in improving password-cracking efficiency by leveraging known patterns and constraints. In contrast with traditional brute-force or dictionary-based attacks, we developed an approach that adopts a latent diffusion model to simulate human password selection behavior, generating more realistic password candidates. We incorporated masked character inputs as conditions and applied advanced feature extraction to capture common patterns such as character substitutions and typing habits. Furthermore, we employed visualization techniques, including autoencoders and principal component analysis, to analyze password distributions, enhancing model interpretability and aiding both offensive and defensive security strategies.

1. Introduction

Passwords are the most commonly adopted authentication mechanism due to their ease of use and lack of hardware requirements, as demonstrated in various authentication frameworks [1,2]. Despite their convenience, traditional password-based authentication systems are increasingly vulnerable to a variety of attacks as computational capabilities and adversarial techniques continue to evolve. Common threats include dictionary attacks, brute-force methods, and attacks that integrate machine learning models to significantly improve the efficiency of password guessing. This risk is further exacerbated by users’ tendencies to select weak or reused passwords. Consequently, investigating advanced attack strategies and robust defense mechanisms has become essential to strengthening the security of modern authentication systems.
In recent years, deep-learning-based password-guessing techniques have garnered increasing attention, particularly with the emergence of generative models. Among these, the latent diffusion model (LDM) [3] has demonstrated remarkable generative capabilities across domains such as image synthesis and natural language processing [4]. Owing to its ability to produce high-quality synthetic data, the LDM offers a compelling approach to password-guessing tasks. Unlike existing models such as the password-generative adversarial network (PassGAN) [5] and password bidirectional encoder representations from transformers (PassBERT) [6], an LDM leverages a stepwise diffusion and denoising process within a compressed latent space, enabling it to generate more diverse and realistic password candidates with improved efficiency and representational fidelity.
We applied an LDM for conditional password guessing to enhance the accuracy and efficiency of predictions when partial prior knowledge is available—such as known password substrings, historical password usage patterns, or user-specific behavioral features. The RockYou dataset was employed as the primary training corpus to ensure realistic modeling. Furthermore, the proposed model was evaluated in the context of password attack scenarios and benchmarked against existing generative techniques, including GANs and BERT. Through this comparison, this study assesses the strengths and limitations of the LDM in this domain. The findings highlight the potential of latent diffusion models in password security research and offer new perspectives for future advancements in password strength evaluation and defensive strategies.

2. Related Work

Traditional password-guessing techniques primarily relied on statistical models such as Markov chains [7] and probabilistic context-free grammars (PCFGs) [8], which attempt to model user behavior by analyzing the frequency and structure of previously leaked passwords. While these methods were once effective, their performance has diminished over time due to the increasing complexity and diversity of modern passwords.
In response, deep-learning-based approaches have been developed to capture more intricate patterns in password structures and improve guessing accuracy. With the widespread use of advanced generative models such as generative pre-trained transformers (GPT) [9], BERT [10], GANs [11], flow-based models [12], and diffusion models [13], researchers have proposed various password-guessing frameworks that significantly outperform traditional methods. For example, PassGAN applies adversarial training to generate high-probability password candidates, while PassFlow [14] utilizes invertible transformations to model password distributions. PassGPT [15], PassBERT, and PassDiff [16] leverage attention mechanisms and denoising processes to generate context-aware and structurally valid password guesses. These models benefit from the ability to learn from large-scale datasets and adapt to evolving password usage patterns, thereby improving their effectiveness in real-world attack scenarios.

3. Methodology

3.1. Diffusion Process

The diffusion process, shown in Figure 1, can be divided into a forward process and a reverse process.
In the forward diffusion process, latent representations derived from input text are gradually perturbed by the addition of Gaussian noise over a series of timesteps. Let $z_0$ denote the original latent vector produced by the encoder in the first stage of the autoencoder framework. The forward process defines a Markov chain that successively transforms $z_{t-1}$ into $z_t$, where $t = 1, 2, \ldots, T$, with increasing levels of noise injected at each step. Specifically, the transition from $z_{t-1}$ to $z_t$ follows the conditional distribution:
$$q(z_t \mid z_{t-1}) = \mathcal{N}\left(z_t;\ \sqrt{1-\beta_t}\, z_{t-1},\ \beta_t I\right) \tag{1}$$
Here, $\beta_t \in (0, 1)$ is a predefined variance schedule that controls the amount of noise added at each timestep $t$. The mean term $\sqrt{1-\beta_t}\, z_{t-1}$ ensures that the new latent point remains correlated with the previous one, while the variance term $\beta_t I$ introduces isotropic Gaussian noise. As $t$ increases, the cumulative noise causes $z_t$ to gradually lose its original semantic structure and eventually approximate a standard Gaussian distribution. This progressive corruption process is essential for training the reverse denoising model, as it creates a trajectory that the model will learn to reverse during generation. The transition is given in Equation (1).
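As an illustration of Equation (1), the following minimal sketch applies the forward transition repeatedly to a batch of stand-in latent vectors. The number of timesteps, the linear variance schedule, and the constant-valued latents are assumptions chosen for the example; they are not settings reported in this paper.

```python
import numpy as np


def forward_step(z_prev, beta_t, rng):
    """Sample z_t ~ N(sqrt(1 - beta_t) * z_prev, beta_t * I), as in Equation (1)."""
    noise = rng.standard_normal(z_prev.shape)
    return np.sqrt(1.0 - beta_t) * z_prev + np.sqrt(beta_t) * noise


rng = np.random.default_rng(0)
T = 1000                              # assumed number of timesteps
betas = np.linspace(1e-4, 0.02, T)    # assumed linear variance schedule
z0 = np.full((500, 64), 3.0)          # stand-in 64-d latents, deliberately far from zero

z = z0
for beta_t in betas:
    z = forward_step(z, beta_t, rng)

# Fraction of the original signal that survives T steps: sqrt(prod(1 - beta_t)).
signal_scale = np.sqrt(np.prod(1.0 - betas))
print(f"remaining signal scale: {signal_scale:.4f}")
print(f"z_T mean: {z.mean():.3f}, z_T std: {z.std():.3f}")
```

Under these assumed settings, the surviving signal scale is close to zero and the final samples have roughly zero mean and unit standard deviation, which is the behavior described above.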
During the reverse diffusion process, the generative model attempts to reconstruct the original latent representation from its noisy counterpart by iteratively denoising over a series of timesteps. Initially, a clean latent vector $z_0$, obtained from an input sequence via the encoder, is gradually corrupted through the forward diffusion process into a noisy latent variable $z_t$ at a chosen timestep $t$. The goal of the reverse process is to iteratively recover $z_{t-1}, z_{t-2}, \ldots, z_0$ from $z_t$, thereby reconstructing a latent representation that closely approximates the original.
To guide this reverse generation, the model utilizes conditional information $c$, which, in the context of this study, corresponds to the masked version of the password or partial character clues. This conditional vector $c$ is provided alongside the noisy latent vector $z_t$ as input to a learnable denoising network, often implemented as a U-Net or Transformer-based architecture. The network is trained to estimate the parameters of the Gaussian distribution that defines the reverse transition at each timestep.
Formally, the reverse step is modeled as a conditional probability distribution:
$$p_\theta(z_{t-1} \mid z_t, c) = \mathcal{N}\left(z_{t-1};\ \mu_\theta(z_t, t, c),\ \Sigma_\theta(z_t, t, c)\right) \tag{2}$$
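To make the reverse transition concrete, the sketch below implements one sampling step in the standard DDPM parameterization [13], in which the mean is computed from a predicted noise term and the covariance is fixed to $\beta_t I$. Both choices are assumptions for illustration, since the exact form of $\mu_\theta$ and $\Sigma_\theta$ is not specified here. The function `toy_eps_model` is a hypothetical oracle that treats the condition `c` as the clean latent itself; it stands in for a trained denoising network only so that the loop can be run end to end.

```python
import numpy as np

T = 200                               # assumed number of timesteps
betas = np.linspace(1e-4, 0.02, T)    # assumed linear variance schedule
alpha_bars = np.cumprod(1.0 - betas)  # cumulative products of (1 - beta_t)


def toy_eps_model(z_t, t, c):
    """Hypothetical oracle: returns the exact noise, pretending c is the clean latent."""
    return (z_t - np.sqrt(alpha_bars[t]) * c) / np.sqrt(1.0 - alpha_bars[t])


def reverse_step(z_t, t, c, eps_model, rng):
    """One step z_t -> z_{t-1}: Gaussian with DDPM mean and assumed variance beta_t * I."""
    beta_t = betas[t]
    eps_hat = eps_model(z_t, t, c)
    mean = (z_t - beta_t / np.sqrt(1.0 - alpha_bars[t]) * eps_hat) / np.sqrt(1.0 - beta_t)
    if t == 0:
        return mean                   # no noise is added at the final step
    return mean + np.sqrt(beta_t) * rng.standard_normal(z_t.shape)


rng = np.random.default_rng(0)
c = rng.standard_normal(64)           # stand-in condition (here: the target latent)
z = rng.standard_normal(64)           # start from pure Gaussian noise
for t in reversed(range(T)):
    z = reverse_step(z, t, c, toy_eps_model, rng)

print("max abs deviation from target latent:", float(np.max(np.abs(z - c))))
```

With a perfect noise estimate, the final step returns the clean latent up to floating-point error. A trained network only approximates this estimate, so real samples vary around plausible latents rather than collapsing to a single point.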

3.2. Autoencoder

In the first stage of the training process for the autoencoder (Figure 2), the encoder processes discrete textual data. To facilitate input into the encoder model, an embedding layer is used to convert discrete data $w$ into a continuous representation. This representation then passes through a multi-head attention (MHA) layer, which enhances textual-feature representation. Next, a linear layer compresses the data into a 64-dimensional latent vector representation $z$. Finally, the decoder restores this 64-dimensional latent vector into a word probability distribution $\hat{w}$ that closely approximates or matches the original textual data.
Cross-entropy is primarily used as the training loss function for the first stage of the autoencoder, as shown in Equations (3) and (4).
$$\mathcal{L}_{\mathrm{Autoencoder}} = -\frac{1}{N} \sum_{i=1}^{N} P(\hat{w}_i \mid w_i) \log P(\hat{w}_i \mid w_i) \tag{3}$$
$$\hat{w}_i = D_\phi\left(E_\theta(w_i)\right) \tag{4}$$
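The encode-compress-decode path can be sketched as follows. This is a simplified, untrained illustration rather than the architecture in Figure 2: the MHA layer is omitted, the vocabulary, maximum length, and embedding width are assumed values, and the loss is the standard token-level cross-entropy (mean negative log-probability of the correct character). Only the 64-dimensional latent size is taken from the text.

```python
import numpy as np

VOCAB = list("abcdefghijklmnopqrstuvwxyz0123456789") + ["<pad>"]  # assumed vocabulary
CHAR_TO_ID = {ch: i for i, ch in enumerate(VOCAB)}
MAX_LEN, EMB_DIM, LATENT_DIM = 15, 32, 64   # MAX_LEN and EMB_DIM are assumptions

rng = np.random.default_rng(0)
params = {
    "emb": rng.normal(0.0, 0.1, (len(VOCAB), EMB_DIM)),              # embedding table
    "enc": rng.normal(0.0, 0.1, (MAX_LEN * EMB_DIM, LATENT_DIM)),    # linear compression
    "dec": rng.normal(0.0, 0.1, (LATENT_DIM, MAX_LEN * len(VOCAB))), # linear decoder
}


def tokenize(password):
    """Map characters to ids and right-pad to MAX_LEN."""
    ids = [CHAR_TO_ID[ch] for ch in password[:MAX_LEN]]
    return np.array(ids + [CHAR_TO_ID["<pad>"]] * (MAX_LEN - len(ids)))


def encode(ids):
    """Embed each character, flatten, and compress to a 64-d latent z."""
    return params["emb"][ids].reshape(-1) @ params["enc"]


def decode(z):
    """Map z to a per-position probability distribution over the vocabulary."""
    logits = (z @ params["dec"]).reshape(MAX_LEN, len(VOCAB))
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    probs = np.exp(logits)
    return probs / probs.sum(axis=1, keepdims=True)


def cross_entropy(probs, ids):
    """Mean negative log-probability assigned to the true characters."""
    return -np.mean(np.log(probs[np.arange(len(ids)), ids]))


ids = tokenize("password123")
z = encode(ids)
w_hat = decode(z)
loss = cross_entropy(w_hat, ids)
print(z.shape, w_hat.shape, float(loss))
```

Training would adjust the parameters to reduce this loss, so that decoding a latent vector reproduces the original password characters.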

3.3. PassLDiffusion-Mask

In the second stage of the training process, PassLDiffusion-Mask (Figure 3), the key difference from a standard LDM is the use of masked password characters as a conditional input. Additionally, the denoising model (noise predictor) is built from five ResLinear layers, which improves training and inference speed and accelerates the convergence of the denoising model.
The DDPM loss function, based on mean squared error, is used as the training loss function for the second stage of LDM, as described in Equations (5) and (6). Here, $\epsilon$ follows a Gaussian probability distribution, while $\epsilon_\theta(z_t, t, c)$ is the noise predictor. The objective is to ensure that the predicted $\epsilon_\theta(z_t, t, c)$ closely approximates $\epsilon$.
$$\mathcal{L}_{\mathrm{LDM}} = \mathbb{E}\left[\left\lVert \epsilon - \epsilon_\theta(z_t, t, c) \right\rVert^2\right] \tag{5}$$
$$\epsilon \sim \mathcal{N}(0, I) \tag{6}$$
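The training objective can be sketched as a Monte Carlo estimate over a batch. The sketch makes several assumptions that are not stated in the text: a linear variance schedule with 1000 steps, the closed-form DDPM identity $z_t = \sqrt{\bar{\alpha}_t}\, z_0 + \sqrt{1-\bar{\alpha}_t}\, \epsilon$ for drawing $z_t$ directly from $z_0$ [13], and a hypothetical reading of "ResLinear" as a linear layer with a ReLU and a residual connection. The weights are random and untrained, and the latents and conditions are random stand-ins.

```python
import numpy as np

T = 1000                              # assumed number of timesteps
betas = np.linspace(1e-4, 0.02, T)    # assumed linear variance schedule
alpha_bars = np.cumprod(1.0 - betas)


def q_sample(z0, t, eps):
    """Draw z_t directly from z_0 using the closed-form DDPM identity."""
    a = alpha_bars[t][:, None]
    return np.sqrt(a) * z0 + np.sqrt(1.0 - a) * eps


def make_noise_predictor(rng, latent_dim=64, cond_dim=64, hidden=128, n_blocks=5):
    """Hypothetical noise predictor: input projection, five residual linear blocks, output."""
    w_in = rng.normal(0.0, 0.05, (latent_dim + cond_dim + 1, hidden))
    blocks = [rng.normal(0.0, 0.05, (hidden, hidden)) for _ in range(n_blocks)]
    w_out = rng.normal(0.0, 0.05, (hidden, latent_dim))

    def eps_model(z_t, t, c):
        t_feat = (t / T)[:, None]                     # scalar timestep feature
        h = np.concatenate([z_t, c, t_feat], axis=1) @ w_in
        for w in blocks:
            h = h + np.maximum(h @ w, 0.0)            # assumed form of a ResLinear block
        return h @ w_out

    return eps_model


def ldm_loss(eps_model, z0, c, rng):
    """Batch estimate of E[||eps - eps_theta(z_t, t, c)||^2] from Equation (5)."""
    t = rng.integers(0, T, size=z0.shape[0])          # random timestep per sample
    eps = rng.standard_normal(z0.shape)               # eps ~ N(0, I), Equation (6)
    z_t = q_sample(z0, t, eps)
    eps_hat = eps_model(z_t, t, c)
    return float(np.mean(np.sum((eps - eps_hat) ** 2, axis=1)))


rng = np.random.default_rng(0)
z0 = rng.standard_normal((256, 64))   # stand-in encoder latents
c = rng.standard_normal((256, 64))    # stand-in encodings of masked passwords
eps_model = make_noise_predictor(rng)
loss = ldm_loss(eps_model, z0, c, rng)
print("untrained loss:", loss)
```

A gradient-based optimizer would minimize this quantity with respect to the predictor weights; that step is left out to keep the example dependency-free.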

4. Results and Discussion

4.1. Dataset

The three datasets used in this study (RockYou, NordVPN, and MySpace) consist of publicly leaked passwords from real-world platforms and reflect human-generated patterns, combinations, and distributions. The password length distributions and the corresponding percentage stacked bar chart, covering passwords of 4 to 15 characters, are shown in Figure 4 and Figure 5.
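A length distribution of this kind can be computed with a few lines of code. The sketch below restricts passwords to the 4 to 15 character range described above and returns the share of each length; the sample list is made up for illustration and is not drawn from the datasets.

```python
from collections import Counter


def length_distribution(passwords, min_len=4, max_len=15):
    """Return the fraction of passwords at each length in [min_len, max_len]."""
    kept = [p for p in passwords if min_len <= len(p) <= max_len]
    counts = Counter(len(p) for p in kept)
    total = len(kept)
    return {n: (counts[n] / total if total else 0.0)
            for n in range(min_len, max_len + 1)}


# Illustrative strings only; "abc" and the 20-character entry fall outside the range.
sample = ["123456", "password", "iloveyou", "abc", "qwerty", "letmein2024", "a" * 20]
dist = length_distribution(sample)
for length, share in dist.items():
    if share > 0:
        print(length, f"{share:.0%}")
```

Applying the same function to each dataset and plotting the resulting shares side by side yields a stacked percentage view comparable to Figure 5.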

4.2. Comparison of the Models

In the comparative analysis, two existing models, one based on a GAN and one based on BERT, were used as baselines.
  • PassGAN-Mask [5]: This model employs a generative adversarial network (GAN) that uses partially masked password data as conditional input, aiming to learn the conditional probability distribution of real-world passwords. The architecture consists of two components: a generator, which produces candidate passwords, and a discriminator, which determines whether a given sample is real or generated.
  • PassBERT [6]: This model is based on the BERT architecture and is designed for password-guessing tasks. It leverages BERT’s strength in capturing contextual relationships within sequences. During fine-tuning, it is adapted for other tasks such as training additional guessing models or analyzing password strength.
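The masked conditional input shared by these models can be illustrated with a small helper. The sketch assumes that an n-mask setting hides n character positions and that `*` is the mask symbol; both are assumptions for illustration, and the function names are hypothetical.

```python
import random

MASK = "*"  # assumed mask symbol; passwords that contain "*" would need another marker


def mask_password(password, n_masks, rng):
    """Hide n_masks randomly chosen character positions."""
    positions = rng.sample(range(len(password)), n_masks)
    chars = list(password)
    for i in positions:
        chars[i] = MASK
    return "".join(chars)


def matches_template(candidate, template):
    """Check that a candidate agrees with every unmasked character of the template."""
    return len(candidate) == len(template) and all(
        t == MASK or t == c for c, t in zip(candidate, template)
    )


rng = random.Random(0)
template = mask_password("dragon2024", 3, rng)
print(template)
print(matches_template("dragon2024", template))   # the true password always matches
print(matches_template("dragon202", template))    # wrong length never matches
```

A conditional guessing model receives the template as its condition, and only candidates for which `matches_template` holds are consistent with the known characters.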

4.3. Model Validation and Guessing Success Rate

In model validation, we sampled 100 unique passwords from three distinct datasets to serve as ground truth for comparison. From this sample, we calculated the guessing success rate as the primary evaluation metric. The encoder extracted a 64-dimensional feature vector from each password, and these vectors were reduced to two-dimensional coordinates using principal component analysis for visualization. Based on the experimental results, we examined the spatial positioning of, and the distances between, the target passwords and the samples generated by the proposed model and the two existing models within the original datasets. We then analyzed how these spatial relationships correlate with guessing outcomes.
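The two evaluation steps described above can be sketched as follows. The success rate is taken here to be the fraction of target passwords that appear among the generated guesses, which is an assumed definition, and the PCA projection is computed with a singular value decomposition. The feature matrix is random stand-in data rather than real encoder output.

```python
import numpy as np


def guessing_success_rate(targets, guesses):
    """Fraction of target passwords that appear in the set of generated guesses."""
    guessed = set(guesses)
    return sum(t in guessed for t in targets) / len(targets)


def pca_2d(features):
    """Project row vectors onto their first two principal components."""
    centered = features - features.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:2].T


# Made-up targets and guesses for illustration.
rate = guessing_success_rate(["a1b2", "qwerty", "dragon", "hello1"],
                             ["qwerty", "zzzz", "hello1"])
print("success rate:", rate)

rng = np.random.default_rng(0)
features = rng.standard_normal((100, 64))   # stand-in for 64-d encoder features
coords = pca_2d(features)
print("projected shape:", coords.shape)
```

Plotting `coords` for target passwords and for generated samples in the same figure shows how closely each model's guesses cluster around the targets in latent space.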
Figure 6 presents the guessing success rate over time with respect to the RockYou dataset. Among the evaluated models, the proposed PassLDiffusion-Mask method exhibited the most consistent and progressively improving performance, particularly under conditions involving higher levels of character masking. Its ability to retain and refine latent representations enables superior adaptability across varying levels of data obscurity. PassGAN-Mask showed strong performance in scenarios with moderate masking, but its effectiveness diminished as the masking ratio increased. This suggests that while GAN-based approaches can model frequent password structures effectively, they may struggle with generalization when limited contextual information is available.
In contrast, PassBERT achieved reasonable results under light masking conditions, benefiting from its bidirectional contextual embeddings. However, its improvement rate was notably slower as the masking level increased, indicating challenges in complex reasoning tasks that require iterative inference. Overall, diffusion-based models—such as PassLDiffusion-Mask—demonstrated advantages in robustness and adaptability, particularly in scenarios where password information was partially known or highly obscured. Their iterative denoising process allows for progressive refinement, making them well-suited for conditional password-guessing tasks.
Figure 7 illustrates the guessing success rate over time with respect to the NordVPN dataset under varying levels of masking. The proposed PassLDiffusion-Mask method consistently outperformed the other models across all masking levels and time intervals, achieving success rates approaching 90% in both two-mask and three-mask scenarios. This demonstrates the model’s robustness and high adaptability in challenging conditions with limited input information. PassBERT showed moderate effectiveness under light masking conditions; however, its performance diminished as the complexity of the input increased, indicating limited capacity for deep inference in highly masked situations. In contrast, PassGAN-Mask consistently delivered the lowest success rates across all settings, rarely exceeding 30%, highlighting its relative weakness in generalizing to more ambiguous or incomplete password structures.
Figure 8 presents the time-based guessing success rate for the MySpace dataset under varying masking conditions. The proposed PassLDiffusion-Mask method consistently outperformed all baseline models, achieving near-perfect success rates under two-mask and three-mask scenarios within just 60 s. This demonstrates not only the model’s high inference accuracy but also its exceptional computational efficiency in real-time guessing tasks. PassBERT delivered moderate results under light masking; however, its performance degraded rapidly as masking complexity increased, reflecting limitations in handling heavily obscured inputs. PassGAN-Mask showed the weakest overall performance, failing to exceed a 40% success rate across all masking levels, further indicating its struggle with more ambiguous or incomplete password structures.
In summary, experimental results obtained across the RockYou, NordVPN, and MySpace datasets consistently demonstrated the strong performance of the proposed PassLDiffusion-Mask model. Across the two- and three-mask settings and the evaluated time budgets, the model achieved the highest guessing success rates, although its advantage narrowed under the heaviest masking (e.g., four-mask settings), where it did not always surpass the baselines. Its iterative denoising mechanism enables robust handling of partial inputs and supports progressive refinement, contributing to both accuracy and stability over time. In contrast, PassBERT and PassGAN-Mask exhibited varying degrees of degradation in performance as input uncertainty increased, further emphasizing the advantage of diffusion-based approaches in adaptive and complex password-guessing scenarios.

5. Conclusions

We proposed PassLDiffusion-Mask, a diffusion-based password-guessing framework that operates in latent space and is conditioned on partially masked inputs. Through iterative denoising, the model generates high-quality password candidates that align closely with the target. The experimental results obtained using the RockYou, NordVPN, and MySpace datasets demonstrate strong performance, especially under two- and three-mask conditions, wherein the model consistently outperformed existing approaches in both accuracy and stability. However, under heavier masking (e.g., in four-mask settings), the model’s accuracy decreases and does not always surpass baseline methods. This highlights a limitation in handling highly uncertain inputs. In future work, we aim to improve performance in these challenging scenarios by enhancing the model’s conditional encoding, incorporating structure-aware priors, and exploring more expressive denoising networks.

Author Contributions

Methodology, Y.-C.H. and J.-W.L.; software, Y.-C.H.; validation, Y.-C.H. and J.-W.L.; formal analysis, Y.-C.H. and J.-W.L.; writing—original draft preparation, J.-W.L.; writing—review and editing, J.-W.L.; funding acquisition, J.-W.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by the National Science and Technology Council, Taiwan, under grant 113-2221-E-035-059-.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data will be made available on reasonable request.

Acknowledgments

The authors would like to acknowledge the Department of Computer Science and Engineering at Feng Chia University for providing laboratory resources and computing facilities essential to this research.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Haller, N.; Metz, C. A One-Time Password System. Internet Engineering Task Force. RFC 1938, May 1996. Web Document. Available online: http://www.ietf.org/rfc/rfc1938.txt (accessed on 19 July 2025).
  2. Dhanalakshmi, R.; Vijayaraghavan, N.; Narasimhan, S.; Basha, S. Password Manager with Multi-Factor Authentication. In Proceedings of the 2023 International Conference on Networking and Communications (ICNWC), Chennai, India, 5–6 April 2023; pp. 1–5.
  3. Rombach, R.; Blattmann, A.; Lorenz, D.; Esser, P.; Ommer, B. High-Resolution Image Synthesis with Latent Diffusion Models. In Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA, 18–24 June 2022.
  4. Lovelace, J.; Kishore, V.; Wan, C.; Shekhtman, E.; Weinberger, K.Q. Latent Diffusion for Language Generation. In Proceedings of the 37th International Conference on Neural Information Processing Systems, New Orleans, LA, USA, 10–16 December 2023.
  5. Hitaj, B.; Gasti, P.; Ateniese, G.; Perez-Cruz, F. PassGAN: A Deep Learning Approach for Password Guessing. In Applied Cryptography and Network Security. ACNS 2019; Springer: Cham, Switzerland, 2019.
  6. Xu, M.; Yu, J.; Zhang, S.; Wu, H.; Han, W. Improving Real-World Password Guessing Attacks via Bi-Directional Transformers. In Proceedings of the 32nd USENIX Conference on Security Symposium, Anaheim, CA, USA, 9–11 August 2023.
  7. Vaithyasubramanian, S.; Christy, A. A Scheme to Create Secured Random Password Using Markov Chain. In Proceedings of the Artificial Intelligence and Evolutionary Algorithms in Engineering Systems; Suresh, L.P., Dash, S.S., Panigrahi, B.K., Eds.; Springer India: New Delhi, India, 2015; pp. 809–814.
  8. Weir, M.; Aggarwal, S.; de Medeiros, B.; Glodek, B. Password Cracking Using Probabilistic Context-Free Grammars. In Proceedings of the 2009 30th IEEE Symposium on Security and Privacy, Oakland, CA, USA, 17–20 May 2009; pp. 391–405.
  9. Radford, A.; Narasimhan, K.; Salimans, T.; Sutskever, I. Improving Language Understanding by Generative Pre-Training. Available online: https://cdn.openai.com/research-covers/language-unsupervised/language_understanding_paper.pdf (accessed on 19 July 2025).
  10. Devlin, J.; Chang, M.-W.; Lee, K.; Toutanova, K. BERT: Pre-Training of Deep Bidirectional Transformers for Language Understanding. arXiv 2019, arXiv:1810.04805.
  11. Goodfellow, I.J.; Pouget-Abadie, J.; Mirza, M.; Xu, B.; Warde-Farley, D.; Ozair, S.; Courville, A.; Bengio, Y. Generative Adversarial Networks. Commun. ACM 2014, 63, 139–144.
  12. Papamakarios, G.; Nalisnick, E.; Rezende, D.J.; Mohamed, S.; Lakshminarayanan, B. Normalizing Flows for Probabilistic Modeling and Inference. J. Mach. Learn. Res. 2021, 22, 2617–2680.
  13. Ho, J.; Jain, A.; Abbeel, P. Denoising Diffusion Probabilistic Models. In Proceedings of the 34th International Conference on Neural Information Processing Systems, Vancouver, BC, Canada, 6–12 December 2020.
  14. Pagnotta, G.; Hitaj, D.; De Gaspari, F.; Mancini, L.V. PassFlow: Guessing Passwords with Generative Flows. arXiv 2021, arXiv:2105.06165.
  15. Rando, J.; Perez-Cruz, F.; Hitaj, B. PassGPT: Password Modeling and (Guided) Generation with Large Language Models. In Computer Security—ESORICS 2023; Springer: Cham, Switzerland, 2023.
  16. Guo, S.; Duan, M.; Du, Y.; Wang, W.; Guo, L. PassDiff: A New Approach for Password Guessing Using Diffusion Model. In Proceedings of the 13th International Conference on Computer Engineering and Networks; Zhang, Y., Qi, L., Liu, Q., Yin, G., Liu, X., Eds.; Lecture Notes in Electrical Engineering; Springer Nature: Singapore, 2024; Volume 1125, pp. 29–40. ISBN 978-981-99-9238-6.
Figure 1. Diffusion process.
Figure 2. Autoencoder architecture.
Figure 3. PassLDiffusion-Mask architecture.
Figure 4. Length distributions of password dataset.
Figure 5. Stacked percentages of password dataset.
Figure 6. Time-guessing success rate with respect to the RockYou dataset.
Figure 7. Time-guessing success rate with respect to the NordVPN dataset.
Figure 8. Time-guessing success rate with respect to the MySpace dataset.

Huang, Y.-C.; Lin, J.-W. Intelligent Password Guessing Using Feature-Guided Diffusion. Eng. Proc. 2025, 120, 51. https://doi.org/10.3390/engproc2025120051
