Abstract
The inherent semantic symmetry and cross-modal alignment between textual prompts and generated images have fueled the success of text-to-image (T2I) generation. However, this strong correlation also introduces security vulnerabilities, specifically prompt stealing attacks, in which valuable prompts are reverse-engineered from generated images. In this paper, we address the challenge of information asymmetry in black-box attack scenarios and propose PromptTrace, a fine-grained prompt stealing framework based on Contrastive Language-Image Pre-training (CLIP)-guided beam search. Unlike existing methods that rely on single-stage generation, PromptTrace structurally decomposes prompt reconstruction into subject generation, modifier extraction, and iterative search optimization to effectively restore the visual-textual correspondence. By leveraging a CLIP-guided beam search strategy, our method progressively refines candidate prompts using image-text similarity feedback, ensuring that the stolen prompt achieves high fidelity in both semantic intent and stylistic representation. Extensive evaluations across multiple datasets and T2I models demonstrate that PromptTrace outperforms existing methods, highlighting the feasibility of exploiting cross-modal symmetry for attacks and underscoring the urgent need for defense mechanisms in the T2I ecosystem.
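The beam-search component described above can be illustrated with a minimal sketch. The abstract does not specify the actual algorithm details, so everything here is an assumption: prompts are grown by appending candidate modifiers to a subject phrase, and a scoring callback (standing in for CLIP image-text similarity against the target image) ranks the beams at each step.

```python
from typing import Callable, List, Tuple


def beam_search(
    seeds: List[str],                # candidate subject phrases (assumed input)
    modifiers: List[str],            # candidate modifiers to append (assumed input)
    score: Callable[[str], float],   # stand-in for CLIP image-text similarity
    beam_width: int = 3,
    max_steps: int = 4,
) -> str:
    """Iteratively extend prompts with modifiers, keeping the top-scoring beams."""
    # Initialize beams from the subject candidates.
    beams: List[Tuple[float, str]] = sorted(
        ((score(s), s) for s in seeds), reverse=True
    )[:beam_width]
    for _ in range(max_steps):
        expanded: List[Tuple[float, str]] = []
        for _, prompt in beams:
            for mod in modifiers:
                if mod in prompt:          # skip modifiers already present
                    continue
                candidate = f"{prompt}, {mod}"
                expanded.append((score(candidate), candidate))
        if not expanded:                   # all beams fully extended
            break
        expanded.sort(reverse=True)
        beams = expanded[:beam_width]      # prune to the beam width
    return max(beams)[1]                   # best-scoring prompt found
```

In a real attack the `score` callback would embed the target image and each candidate prompt with a CLIP model and return their cosine similarity; the toy scorer used for testing below simply counts keyword overlap.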