Symmetry
  • This is an early access version; the complete PDF, HTML, and XML versions will be available soon.
  • Article
  • Open Access

15 January 2026

PromptTrace: A Fine-Grained Prompt Stealing Attack via CLIP-Guided Beam Search for Text-to-Image Models

1 School of Electrical Engineering, Northeast Electric Power University, Jilin 132000, China
2 Hubei Key Laboratory of Internet of Intelligence, School of Electronic Information and Communications, Huazhong University of Science and Technology, Wuhan 430074, China
3 People’s Bank of China Qinghai Branch, Xining 810001, China
4 School of Computer Science, Northeast Electric Power University, Jilin 132000, China
This article belongs to the Section Computer

Abstract

The inherent semantic symmetry and cross-modal alignment between textual prompts and generated images have fueled the success of text-to-image (T2I) generation. However, this strong correlation also introduces security vulnerabilities, specifically prompt stealing attacks, in which valuable prompts are reverse-engineered from images. In this paper, we address the challenge of information asymmetry in black-box attack scenarios and propose PromptTrace, a fine-grained prompt stealing framework based on Contrastive Language-Image Pre-training (CLIP)-guided beam search. Unlike existing methods that rely on single-stage generation, PromptTrace structurally decomposes prompt reconstruction into subject generation, modifier extraction, and iterative search optimization to effectively restore the visual–textual correspondence. By leveraging a CLIP-guided beam search strategy, our method progressively optimizes candidate prompts based on image–text similarity feedback, so that the stolen prompt achieves high fidelity in both semantic intent and stylistic representation. Extensive evaluations across multiple datasets and T2I models demonstrate that PromptTrace outperforms existing methods, highlighting the feasibility of exploiting cross-modal symmetry for attacks and underscoring the urgent need for defense mechanisms in the T2I ecosystem.
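As a rough illustration of the iterative search stage described above, the following Python sketch runs a beam search that appends candidate modifiers to a subject and keeps the highest-scoring prompts at each step. It is a minimal sketch under stated assumptions, not the authors' implementation: the names `beam_search_prompt` and `toy_score`, the beam width, and the comma-joined prompt format are all hypothetical. In particular, `toy_score` is only a stand-in for CLIP image–text similarity against the target image, which a real attack would compute with an actual CLIP model.

```python
from typing import Callable, List, Sequence, Tuple


def beam_search_prompt(
    subject: str,
    modifiers: Sequence[str],
    score: Callable[[str], float],
    beam_width: int = 3,
    max_modifiers: int = 3,
) -> Tuple[str, float]:
    """Append modifiers to a subject, keeping the top-scoring candidates per step."""
    # Each beam entry is (score, modifiers chosen so far).
    beams: List[Tuple[float, Tuple[str, ...]]] = [(score(subject), ())]
    best = beams[0]
    for _ in range(max_modifiers):
        candidates: List[Tuple[float, Tuple[str, ...]]] = []
        for _, chosen in beams:
            for modifier in modifiers:
                if modifier in chosen:
                    continue  # do not repeat a modifier within one prompt
                extended = chosen + (modifier,)
                prompt = ", ".join((subject,) + extended)
                candidates.append((score(prompt), extended))
        if not candidates:
            break
        # Keep only the beam_width best candidates (stable sort, highest first).
        candidates.sort(key=lambda c: c[0], reverse=True)
        beams = candidates[:beam_width]
        if beams[0][0] > best[0]:
            best = beams[0]
    best_score, chosen = best
    return ", ".join((subject,) + chosen), best_score


def toy_score(prompt: str) -> float:
    """Hypothetical stand-in for CLIP similarity: rewards parts of a hidden
    target prompt and penalizes unrelated parts. Not a real CLIP call."""
    target = {"a castle on a hill", "oil painting", "golden hour"}
    parts = set(prompt.split(", "))
    return len(parts & target) - 0.5 * len(parts - target)


if __name__ == "__main__":
    pool = ["oil painting", "golden hour", "pixel art", "neon"]
    print(beam_search_prompt("a castle on a hill", pool, toy_score))
```

Tracking the best prompt across all iterations, rather than returning the final beam, means a step that only adds unhelpful modifiers cannot lower the result. Swapping `toy_score` for a function that embeds the candidate prompt and the target image and returns their cosine similarity would give the similarity feedback the abstract refers to.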
