Article

Research on Making Two Models Based on the Generative Linguistic Steganography for Securing Linguistic Steganographic Texts from Active Attacks

1 School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing 210094, China
2 School of Cybersecurity, Tarim University, Alar City 843300, China
3 School of Cyber Science and Engineering, Nanjing University of Science and Technology, Wuxi 214443, China
* Author to whom correspondence should be addressed.
Symmetry 2025, 17(9), 1416; https://doi.org/10.3390/sym17091416
Submission received: 22 July 2025 / Revised: 21 August 2025 / Accepted: 25 August 2025 / Published: 1 September 2025
(This article belongs to the Section Computer)

Abstract

Generative steganographic text covertly transmits hidden information through readable text that is unrelated to the message. Existing AI-based linguistic steganography primarily focuses on improving text quality to evade detection and therefore only addresses passive attacks. Active attacks, such as text tampering, can disrupt the symmetry between encoding and decoding, which in turn prevents accurate extraction of hidden information. To investigate these threats, we construct two attack models: the in-domain synonym substitution attack (ISSA) and the out-of-domain random tampering attack (ODRTA), with ODRTA further divided into continuous (CODRTA) and discontinuous (DODRTA) types. To enhance robustness, we propose a proactive adaptive-clustering defense against ISSA, and, for CODRTA and DODRTA, a post-hoc repair mechanism based on context-oriented search and the determinism of text generation. Experimental results demonstrate that these mechanisms effectively counter all attack types and significantly improve the integrity and usability of hidden information. The main limitation of our approach is the relatively high computational cost of defending against ISSA. Future work will focus on improving efficiency and expanding practical applicability.

1. Introduction

Steganography [1] enables the covert transmission of information by concealing the existence of a message. Traditionally, hidden information has been embedded within common covers such as text, images, and videos, thereby reducing perceptibility during transmission. Among these media, text has attracted significant attention due to its widespread use and ease of distribution. However, conventional text-based steganographic methods suffer from limited embedding capacity and insufficient imperceptibility [2], which limits their practical applicability.
With the rapid advancement of artificial intelligence (AI), neural text generation models can produce highly readable text by leveraging intrinsic associations in large-scale datasets. This capability has given rise to generative linguistic steganography, which encodes hidden information into fluent text that is unrelated to the original message. Compared with traditional approaches, generative methods can embed more information while maintaining better imperceptibility.
Despite these advantages, existing generative linguistic steganography primarily focuses on resisting passive attacks, such as manual audits and statistical steganalysis. Active attacks, particularly text tampering, pose a significant threat by modifying steganographic text and preventing accurate extraction of hidden information. Text tampering can take the form of synonym substitution, which replaces words with semantically similar alternatives, or random tampering, which introduces arbitrary changes without semantic constraints. Conventional tampering models, however, are often ineffective against carefully crafted generative steganographic text, so the recipient can frequently still recover the hidden information; stronger, steganography-aware attack models are therefore needed to assess the real threat.
This research addresses the vulnerability of generative linguistic steganography to active text tampering attacks. We redefine traditional tampering attacks to enhance their adversarial capability, resulting in the in-domain synonym substitution attack (ISSA) and the out-of-domain random tampering attack (ODRTA), with the latter further divided into continuous and discontinuous types. To defend against these attacks, we propose a proactive adaptive-clustering mechanism for ISSA and a post-hoc repair mechanism for ODRTA, leveraging context-oriented search and the determinism of text generation. These mechanisms aim to improve the integrity and usability of hidden information under active attacks.
The main contributions of this research are as follows:
  • We remodel common text tampering attacks specifically for generative linguistic steganography.
  • We classify tampering attacks into ISSA and ODRTA based on whether the tampered words exist within the candidate pool of the text generation model.
  • We further categorize ODRTA into continuous and discontinuous types according to the continuity of the attack.
  • We propose a proactive adaptive-clustering defense for ISSA and a post-hoc repair mechanism for ODRTA.
  • We experimentally validate that the proposed mechanisms effectively enhance the integrity and usability of hidden information.
Section 2 reviews related work on generative linguistic steganography. Section 3 introduces the two types of attack models. Section 4 details the proposed steganographic mechanisms. Section 5 presents experimental results and provides a comprehensive discussion. Section 6 illustrates examples of the proposed attack models and defense mechanisms in practice. Finally, Section 7 concludes the research.

2. Related Work

In the early days of generative linguistic steganography, limited computational power often necessitated reliance on statistical models or Markov chains for text generation. Specifically, these mechanisms used Markov chains to embed hidden information through state transitions and transition probabilities. However, Markov chain-based steganographic texts [3] struggled to embed large amounts of information and generally exhibited low quality, with poor perceptual and statistical imperceptibility. Huang et al. [4] attempted to improve text quality by using Song Ci as a textual template, but the template format constrained embedding capacity, limiting the practical utility of early mechanisms.
With advances in computational capabilities and artificial intelligence (AI), neural network-based text generation models have significantly improved the handling of long sequences. Consequently, generative linguistic steganography regained research focus, emphasizing higher embedding capacity and better quality, including both perceptual and statistical imperceptibility. Perceptual imperceptibility refers to evading human scrutiny, while statistical imperceptibility assesses alignment with the statistical distribution of large datasets.
Despite improvements, early neural models faced challenges in generating high-quality text, and embedding hidden information could further reduce quality. Huang et al. [5] addressed this by combining intra-word mutual information with recurrent neural networks to generate Chinese poetry containing hidden information. Building on this, Tong [6] used long short-term memory networks to generate Chinese lyrics with higher embedding capacity. Although quality improved, advancing steganalysis techniques necessitated further optimization to enhance resistance against detection.
Recent developments in natural language processing (NLP) and text generation allow new models [7,8,9,10,11] to generate texts with arbitrary semantic content. Compared with traditional neural network models, these new models improve steganographic text quality and are no longer constrained by fixed templates, enhancing practicality. Zachary et al. [12] introduced a semantically secure steganographic mechanism using state-of-the-art models and arithmetic coding, achieving near-optimal statistical security. Huang et al. [13] developed a long short-term memory-based model implementing two embedding algorithms: fixed-length binary tree mapping and variable-length Huffman coding. While Huffman coding offers slightly lower embedding capacity, it favors high-probability candidate words, improving imperceptibility. Dai et al. [14] further enhanced imperceptibility using patient Huffman coding based on statistical regularities of fluent texts [13,15].
Huang et al. [2] observed a tradeoff between perceptual and statistical imperceptibility and proposed a variational autoencoder (VAE) combined with Huffman coding to improve both. Zhou et al. [16] leveraged adaptive probability distributions and generative adversarial networks (GANs) [17] to strengthen imperceptibility and security. Reliable transmission also requires shared knowledge between communicators. Yi et al. [18] proposed a BERT-based mechanism with Gibbs sampling, but limited long-sequence performance restricted embedding capacity. Cao et al. [19] enhanced text generation using plug-and-play models with an embeddable candidate pool, while Yan et al. [20] introduced lightweight disambiguation to improve statistical imperceptibility. Ding et al. [21] combined neural machine translation with algorithmic coding to generate high-quality semantic steganographic text.
Further improvements include those of Li et al. [22], who leveraged prompt engineering and knowledge graphs with large language models for content consistency, and Yang et al. [23], who integrated modification-based and generation-based mechanisms to enhance embedding capacity and steganalysis resistance. Zhang et al. [24] proposed controllable steganography using summary generation, improving semantic consistency. Bai et al. [25] mapped hidden information into semantic spaces using ontology trees, enhancing robustness. Huang et al. [26] applied adaptive dynamic grouping to minimize statistical differences between steganographic and natural text. Christian et al. [27] demonstrated that maximal information throughput requires minimal entropy coupling in perfectly secure systems. Huang et al. [28] used rejection sampling for approximate perfect steganography, while Pang et al. [29] applied frequency-reform-based softmax to reduce statistical discrepancies. Ding et al. [30] utilized distribution replicas to achieve near-theoretical absolute security. Wu et al. [31] developed LLM Stega, embedding encrypted messages via large language model interfaces while maintaining semantic quality. Lin et al. [32] proposed a zero-shot contextual learning method to enhance concealability, and Sun et al. [33] introduced theme-controlled graph-to-text steganography for improved quality and concealability.
To increase embedding capacity, Xiang et al. [34] refined generation units from words to letters, while Omer et al. [35] developed a word-level Arabic poetry mechanism using LSTM networks and Baudot coding. These studies collectively improved embedding capacity.
Despite these advances, generative linguistic steganography mechanisms still show vulnerabilities against active attacks. Some defensive measures exist, primarily for cover-based steganography [36,37]. While strategies for generative steganography have been proposed [38], they often overlook the adversary’s perspective and oversimplify synonym substitution, weakening defense effectiveness against complex manipulations.
Table 1 summarizes the representative works and their key features.

3. Attack Model

Currently, generative linguistic steganography mechanisms follow a specific process. First, based on historical information (prompt), a candidate pool (CP) for the next position is constructed by a text generation model. Construction of the CP mainly relies on top-$k$ or top-$p$ sampling. Subsequently, an embedding algorithm, such as Huffman coding [2,13], arithmetic coding [21], or distributed copies [30], is employed to map hidden information to elements within the CP, thereby selecting the current output word. It is noteworthy that the CP's dimensionality is typically much smaller than the text generation model's vocabulary $V$, which helps maintain good readability of the generated steganographic text.
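For concreteness, this pipeline can be sketched in a few lines of Python. The fixed-length binary mapping below is a simplified stand-in for the Huffman or arithmetic coding used in the cited schemes, and all names and the toy distribution are illustrative, not taken from any particular implementation.

```python
import math

def top_k_candidate_pool(probs, k):
    """Select the k highest-probability words from the model's
    next-word distribution (top-k sampling)."""
    return sorted(probs, key=probs.get, reverse=True)[:k]

def embed_bits(bits, candidate_pool):
    """Map the next log2(k) hidden bits to a word in the pool.

    A fixed-length binary mapping stands in here for the Huffman or
    arithmetic coding used by real schemes."""
    width = int(math.log2(len(candidate_pool)))
    index = int(bits[:width], 2)
    return candidate_pool[index], bits[width:]

# Toy vocabulary distribution at the current position.
probs = {"cat": 0.4, "dog": 0.3, "bird": 0.2, "fish": 0.1}
cp = top_k_candidate_pool(probs, k=4)
# The first two hidden bits "10" select index 2 of the pool.
word, rest = embed_bits("1001", cp)
```

Because the receiver rebuilds the same candidate pool from the same model and prompt, inverting `embed_bits` recovers the hidden bits from the observed word.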
Text tampering attacks involve replacing certain words in a text with other words. These attacks can be further categorized based on the similarity of the replacement words to the original words into two types: synonym substitution and non-synonym substitution. Among the latter, random tampering attacks can be considered a special case.
Based on the generative linguistic steganography process, we have redefined synonym substitution and random tampering attacks as in-domain synonym substitution attack (ISSA) and out-of-domain random tampering attack (ODRTA). Instead of depending on the similarity between tampered and original words, the distinction now relies on whether the tampered words are present in the current CP. When a tampered word exists in the CP, the attack is classified as ISSA and is considered stealthy. The other type is termed ODRTA, which will be further classified based on attack continuity in the following sections.

3.1. ISSA

To enhance the adversary’s ability beyond conventional synonym substitution attacks, we introduce the in-domain synonym substitution attack (ISSA).
We consider an active adversary who intercepts communication channels suspected of carrying steganographic texts. The adversary has full knowledge of the text generation model and the candidate pool construction strategy (e.g., top-$k$ or top-$p$ sampling) but cannot manipulate the generation system itself. Thus, the adversary is treated as a model-aware outsider rather than an insider. Since the adversary can access the same model and decoding strategy, the candidate pool at each step is reproducible.
Before launching an attack, the adversary may attempt to detect whether a message contains hidden information using human judgment or automated steganalysis (e.g., statistical tests or classifier-based methods). Once a text is identified as suspicious, the adversary selectively decides where to apply ISSA.
The adversary’s tampering strategy is not arbitrary. At each step $t$, the generation model constructs a candidate pool $CP_t \subseteq V$ according to its decoding strategy, and the adversary inspects $CP_t$. If no semantically related alternatives are found, the position is left unchanged. Otherwise, when a candidate $c_i^t \in CP_t$ is identified as a synonym of the original word $w_t$, the attacker replaces $w_t$ with $c_i^t$. This selective, condition-triggered attack ensures that tampering is both in-domain (restricted to $CP_t$) and semantically consistent (maintaining synonymy). Unlike ordinary synonym replacement methods, which indiscriminately substitute words regardless of context or model constraints, this design significantly improves stealthiness and reduces the likelihood of immediate detection.
Based on the above description, ISSA can be expressed as follows:
$$c_i^t \in \{\, c \in CP_t \mid c \neq w_t,\ \|\phi(w_t) - \phi(c)\|_2 < \delta \,\},$$
where $w_t$ denotes the original word at step $t$, $CP_t$ is the candidate pool generated by the model, $\phi(\cdot) \in \mathbb{R}^d$ represents the word embedding function, $\|\cdot\|_2$ is the $L_2$ norm, and $\delta$ is a similarity threshold. Note that $CP_t$ is a set of candidate words rather than a vector.
The parameter δ controls the trade-off between semantic fidelity and the number of substitution options. A small δ ensures high semantic consistency but yields fewer candidates, whereas a larger δ enlarges the tampering space at the risk of semantic drift. In practice, δ can be tuned on validation data to balance stealthiness and naturalness.
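The selection rule above can be sketched directly. The embedding table, the distance function, and the value of $\delta$ below are illustrative toy choices, not the paper's implementation; any pretrained word-embedding model could play the role of `embed`.

```python
import math

def issa_candidates(original, candidate_pool, embed, delta):
    """Return in-domain substitution candidates for ISSA: words in the
    candidate pool whose embedding lies within delta of the original."""
    v0 = embed(original)
    def dist(u, v):
        # Euclidean (L2) distance between two embedding vectors.
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))
    return [c for c in candidate_pool
            if c != original and dist(v0, embed(c)) < delta]

# Toy 2-d embeddings: near-synonyms lie close together.
vectors = {"big": (1.0, 0.0), "large": (1.1, 0.1), "red": (0.0, 5.0)}
subs = issa_candidates("big", ["big", "large", "red"], vectors.get, delta=0.5)
```

With a small $\delta$, only "large" qualifies as a substitution for "big"; enlarging $\delta$ would eventually admit semantically distant words such as "red", illustrating the fidelity/stealthiness trade-off described above.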
A key feature of ISSA is the position-shift effect, which significantly enhances stealthiness. When the adversary substitutes $w_t$ with $c_i^t \in CP_t$, the receiver may not detect the change at step $t$. Instead, the subsequent word $w_{t+1}$ may fall outside its corresponding candidate pool $CP_{t+1}$, leading the receiver to misattribute the tampering.
For example, suppose the original sequence is $\langle A, B, C \rangle$. If the adversary replaces $B$ with a synonym $B'$ from $CP_2$, the original third word $C$ may no longer belong to $CP_3$. Consequently, the receiver detects inconsistency at position 3 rather than at position 2. This misalignment complicates both detection and recovery, making ISSA substantially more stealthy than conventional synonym substitution attacks. Figure 1 illustrates the adversary intercepting steganographic text $S$ and performing ISSA at position $t$.

3.2. ODRTA

From the description of ISSA, it is clear that the adversary must possess substantial knowledge. This includes an understanding of the text generation model and the candidate pool construction strategy. In ISSA, the adversary has significant attack capabilities but cannot manipulate the generation system itself. Therefore, an adversary employing ISSA can be best characterized as a model-aware outsider.
In contrast, attacks may also originate from ordinary external adversaries who have limited or no access to the underlying mechanisms and parameters. In such text-tampering attacks, these adversaries primarily target the availability and timeliness of hidden information. As a result, they are less constrained by detection concerns and may freely replace original words.
Traditional random tampering attacks (RTA) can be modeled as follows:
$$W_{RTA}^t \subseteq V, \qquad w_t^o \in V \setminus W_{RTA}^t,$$
where $W_{RTA}^t$ is the available set of tampering words at step $t$, $V$ is the entire vocabulary, and $w_t^o$ is the original steganographic word at step $t$.
Due to the requirement for readability in generative linguistic steganography tasks, candidate pools are generally small, e.g., containing 2, 8, or 16 words. Consequently, the probability of an adversary randomly selecting these words is significantly lower than the probability of choosing a word from the remaining vocabulary $V^* = V \setminus W_{CP}^t$. Therefore, we define the out-of-domain random tampering attack (ODRTA), where “random” refers to the selection of words from $V^*$. The ODRTA can be represented as follows:
$$W_{ODRTA}^t \cap W_{CP}^t = \varnothing, \qquad W_{ODRTA}^t \cup W_{CP}^t = V, \qquad |W_{CP}^t| \ll |W_{ODRTA}^t| \le |V|, \qquad P\left(w_{ODRTA}^t \in CP\right) = 0,$$
where $|\cdot|$ denotes the number of elements in the set.
ODRTA can be categorized based on the continuity of the attacks into continuous ODRTA and discontinuous ODRTA. Let $S = \{w_1, w_2, w_3, w_4, \dots, w_T, \dots, w_{T+k}, \dots, w_n\}$ denote the intercepted steganographic text, and let $T$ and $T + k$ be the indices of consecutive attack points; then, we obtain the following:
$$\begin{cases} \text{CODRTA}, & (T + k) - T = 1 \\ \text{DODRTA}, & (T + k) - T > 1 \end{cases}$$
If $k = 1$, the attack is continuous, meaning no unmodified words exist between attack points. If $k > 1$, the attack is discontinuous.
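The continuity criterion can be expressed compactly in code. The function below labels each pair of consecutive attack indices, assuming the indices are given in increasing order; it is an illustrative sketch, not part of the paper's implementation.

```python
def classify_odrta(attack_indices):
    """Label each pair of consecutive attack points: CODRTA when the
    indices are adjacent (k = 1, no untampered word between them),
    DODRTA otherwise (k > 1)."""
    labels = []
    for a, b in zip(attack_indices, attack_indices[1:]):
        labels.append("CODRTA" if b - a == 1 else "DODRTA")
    return labels

# Positions 5 and 6 are adjacent (continuous); 6 and 9 are not.
labels = classify_odrta([5, 6, 9])
```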
Figure 2 illustrates a schematic representation of ODRTA. The upper half of Figure 2 represents a single instance of DODRTA, while the lower half showcases a CODRTA.

4. Defense Mechanism

To defend against the two types of tampering attacks described above and to enhance the integrity and availability of hidden information at the receiving end, we propose two mechanisms: a proactive defense mechanism based on adaptive clustering for ISSA, and a post-hoc repair mechanism based on a context-oriented search strategy and the determinism of the text generation model for ODRTA. In addition to targeting different attacks, these two mechanisms also differ in their deployment locations. The proactive defense mechanism must be deployed at both the sending and receiving ends of the communication, whereas the post-hoc repair mechanism only needs to be deployed at the receiving end.

4.1. ISSA-Defense (ISSA-D)

According to the ISSA model, the construction mechanism and specific parameters of a steganographic text are known to the adversary. To avoid detection by both parties in the communication and to increase the difficulty of extracting the original hidden information, the adversary selects synonyms of the source words from the candidate pool generated by the text generation model at the current step as tampering words.
To enhance the defensive capability of generative linguistic steganography against ISSA, we propose a proactive defense mechanism based on an adaptive clustering algorithm. In this mechanism, the DBSCAN algorithm is chosen as the clustering algorithm.
First, the sending end inputs a prompt into the text generation model to initiate the steganographic task. Most existing text generation models calculate the probability of each word in the vocabulary $V$ appearing at the current position based on conditional probabilities. Subsequently, a candidate pool (CP) is constructed by selecting the top-$k$ words from the computed results using the top-$k$ sampling strategy. The prediction rule of the text generation model and the top-$k$ strategy are as follows:
$$P(S) = P(w_t \mid w_{t-1}, w_{t-2}, \dots, w_1, prompt), \qquad CP = \text{top-}k(V) = \{x_1, x_2, \dots, x_k\},$$
where $x_i \in V$ and $p(x_i) \ge p(x_{i+1})$.
Currently, generative linguistic steganography typically begins the embedding process immediately after obtaining the CP. To ensure that steganographic text can withstand ISSA, we perform adaptive clustering on the CP, constructing a new CP to be transmitted to the embedding unit. Let $CP_t$ denote the CP constructed by the text generation model at step $t$, and $CP_c^t$ refer to the CP transmitted to the embedding unit.
The adaptive clustering algorithm based on the DBSCAN algorithm is as follows:
$$N_\epsilon(x_i) = \{\, x_j \in CP_t \mid \|x_i - x_j\|_2 \le \epsilon \,\},$$
where $N_\epsilon(\cdot)$ and $\epsilon$ are the crucial hyperparameters of DBSCAN. Here, $|N_\epsilon(\cdot)|$ represents the minimum number of points required to form a dense region (i.e., MinPts), while $\epsilon$ denotes the maximum distance between two points to be considered in the same cluster.
To ensure that every element in $CP_c^t$ can withstand ISSA, the minimum value of $|N_\epsilon(\cdot)|$ is set to 2:
$$|N_\epsilon(x_i)| \ge 2.$$
As the elements of C P t differ at each step, a static ϵ cannot guarantee effective clustering results. Therefore, we introduce the K-NN algorithm to dynamically adjust ϵ , as follows:
$$D(x_i) = \{\, d(x_i, x_j) \mid x_j \in \text{K-NN}(CP_t, x_i) \,\},$$
where $D(x_i)$ collects the distances from data point $x_i$ to its nearest neighbors in $CP_t$, and $d(x_i, x_j)$ denotes the distance from data point $x_j$ to data point $x_i$.
$$\epsilon = \frac{1}{k} \sum_{i=1}^{k} D_i^T,$$
where $k$ is the total number of words in $CP_t$ and $D_i^T$ denotes the average distance from the $i$-th word to its $T$ nearest words.
After these operations, $CP_t$ is partitioned into different clusters, each containing at least two elements:
$$CP_t = \{C_1, \dots, C_m\}.$$
A filtering function selects the element with the highest probability from each cluster $C_i$ to be included in $CP_c^t$:
$$x_i^j = \arg\max_{x \in C_i} P(x),$$
where $x_i^j$ denotes the word with the highest probability in cluster $C_i$. This ensures perceptual imperceptibility of the steganographic text.
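The adaptive-clustering step (a K-NN-derived $\epsilon$, density clustering, and per-cluster filtering) can be sketched as below. For compactness, single-linkage grouping with a minimum cluster size of 2 stands in for full DBSCAN, and the toy embeddings and probabilities are illustrative.

```python
import math

def dist(u, v):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def adaptive_epsilon(points, k=2):
    """Set the radius from the mean distance of each point to its k
    nearest neighbours (the K-NN heuristic for epsilon)."""
    total = 0.0
    for i, p in enumerate(points):
        neighbours = sorted(dist(p, q) for j, q in enumerate(points) if j != i)
        total += sum(neighbours[:k]) / k
    return total / len(points)

def cluster(points, eps):
    """Group points whose pairwise distance is within eps.

    Single-linkage grouping via union-find stands in for DBSCAN with
    MinPts = 2; singleton groups are discarded as noise."""
    n = len(points)
    parent = list(range(n))
    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i
    for i in range(n):
        for j in range(i + 1, n):
            if dist(points[i], points[j]) <= eps:
                parent[find(i)] = find(j)
    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return [g for g in groups.values() if len(g) >= 2]

def filter_max_prob(clusters, probs):
    """Keep the highest-probability member of each cluster."""
    return [max(c, key=lambda i: probs[i]) for c in clusters]

# Two tight pairs of candidate-word embeddings plus one outlier.
pts = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0), (9.0, 0.0)]
eps = adaptive_epsilon(pts)
clusters = cluster(pts, eps)
reps = filter_max_prob(clusters, [0.4, 0.2, 0.25, 0.1, 0.05])
```

The surviving representatives form the new pool $CP_c^t$ handed to the embedding unit; the outlier is dropped as noise because no second point lies within $\epsilon$ of it.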
At this point, $CP_c^t$ serves as the latest candidate pool, passed to the embedding unit for information hiding. A complete algorithmic flow for resisting ISSA is illustrated in Algorithm 1, where line 11 produces the final $CP_c^t$ to be transmitted to the next processing unit. In line 9, the DBSCAN clustering algorithm is applied with parameters $(\epsilon, MinPts) = (\epsilon, 2)$. Here, $MinPts$ is set to 2 to ensure that each cluster contains at least two candidate words, which establishes the lower bound necessary for the mechanism to resist ISSA attacks. The hyperparameters were chosen based on empirical experience and further tuned on the validation set to achieve meaningful clustering results while balancing cluster density and noise points.
Algorithm 1. Steganographic text generation against ISSA
01: INPUT: hidden information HI, historical information HIS
02: OUTPUT: steganographic text ST
03: HIS → text generation model, TGM
04: i = 1, OHI = HIS
05: while HI is not empty do
06:  candidate pool CP_t ← TGM(HIS)
07:  distances ← K-NN(K)
08:  ε ← mean(distances)
09:  clusters C ← DBSCAN(ε, MinPts = 2)
10:  candidate pool CP_c^t = []
11:  for cluster in C do
12:    word ← cluster(max_prob)
13:    CP_c^t.add(word)
14:  end for
15:  w_i ← Embedding(HI, CP_c^t)
16:  HI ← update(HI)
17:  HIS ← HIS + w_i
18:  i++
19: end while
20: ST ← HIS.minus(OHI)
21: Return ST
Now, we discuss how the candidate pool $CP_c^t$ helps both parties resist ISSA. Suppose the adversary performs an ISSA at the $i$-th word of the intercepted $S$. According to the ISSA description, the original word $w_i$ is altered to a synonym $x_i^a$. When $x_i^a$ exists within the cluster $C_i$ containing the original word $w_i$, the receiving end can query the original word $w_i$ using Equation (11). The number of allowed alterations for each cluster $C_i$ is limited to its cardinality minus one. Additionally, if the clustering operation does not yield multiple clusters, the word with the highest probability from $CP_t$ will be returned. This approach is aimed at enhancing the usability of the steganographic text while maintaining perceptual imperceptibility. The formal expression is as follows:
$$w_i \in C_i \subseteq CP_t,\ x_i^a \in C_i \ \Rightarrow\ \exists\, Q:\ Q(x_i^a) = w_i,$$
where $w_i$ represents the original word, $x_i^a$ represents the tampering word, and $Q$ denotes the query function related to Equation (11).
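On the receiving side, the query function $Q$ reduces to a cluster lookup: since both ends derive identical clusters from the shared model, mapping any cluster member back to the cluster's highest-probability representative recovers the transmitted word. A minimal sketch, with illustrative names and toy data:

```python
def recover_original(received, clusters, probs):
    """Map a (possibly tampered) word back to its cluster's
    representative, i.e. the highest-probability member of the cluster
    that contains the received word."""
    for cluster in clusters:
        if received in cluster:
            return max(cluster, key=probs.get)
    # No cluster matched: return the received word unchanged.
    return received

clusters = [["big", "large", "huge"], ["red", "crimson"]]
probs = {"big": 0.5, "large": 0.3, "huge": 0.1, "red": 0.07, "crimson": 0.03}
# The sender transmitted "big"; the adversary substituted "large".
recovered = recover_original("large", clusters, probs)
```

Because the sender only ever emits each cluster's representative, any ISSA substitution within a cluster is undone by this lookup, at the cost of up to $|C_i| - 1$ tolerated alterations per cluster.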

4.2. ODRTA-Defense (ODRTA-D)

When an adversary aims to compromise the usability and timeliness of hidden information without considering detection, they can implement an out-of-domain random tampering attack (ODRTA) on the intercepted steganographic text $S$.
To recover such texts and enhance the integrity and usability of hidden information, we propose a post-hoc repair mechanism based on a context-oriented search strategy and the determinism of the text generation model.
Generative linguistic steganography effectively constitutes a deterministic system under conditions determined by the text generation model, CP construction strategy, hidden information, and embedding algorithm. Specifically, this mechanism ensures that the same input always produces the same output. The representations of steganographic text generation and hidden information extraction are as follows:
$$E(G(prompt), M, \theta) \to S, \qquad E^{-1}(G(prompt), S, \theta) \to M, \qquad \text{where } G(\cdot) \to CP_t,\ S = \{w_1, w_2, \dots, w_t, \dots, w_n\},$$
where $E$ denotes the embedding algorithm, $E^{-1}$ denotes its inverse, $G(\cdot)$ denotes the text generation process used to construct $CP_t$, $\theta$ is a hyperparameter, and $M$ is the hidden information.
As $G(\cdot)$ is a deterministic system and both $E$ and $E^{-1}$ are bijective, generative linguistic steganography can also be regarded as a deterministic system. At step $t$, if the steganographic text $S^*$ is tampered such that the original word $w_t$ is altered to $w_t + \alpha$, given that the information before $t$ is correct, the generation system $G(\cdot)$ outputs the correct $CP_t$ at step $t$. However, as $w_t + \alpha \notin CP_t$, the inverse mapping function $E^{-1}$ cannot proceed, interrupting the extraction process:
$$S^* = \{prompt, w_1, w_2, \dots, w_t + \alpha, w_{t+1}, \dots, w_{t+m}, \dots, w_n\}.$$
Without auxiliary information, the receiving end can only repair the steganographic text via enumeration. Introducing auxiliary information to assist in the repair process is therefore a natural approach. The receiving end can reasonably assume that the steganographic text following the tampering point is correct and use it as auxiliary information.
Under this assumption, the repair mechanism uses $\{prompt, w_1, \dots, w_{t-1}\}$ and employs a text generation model to predict $m + 1$ steps beyond the tampering point, enumerating all possible outcomes. The last $m$ words are then used as search tokens to query text samples matching these words. These $m$ words correspond to the nearest $m$ words following the tampered word $w_t + \alpha$ in $S^*$. After the repair process, the first word following $\{prompt, w_1, \dots, w_{t-1}\}$ is restored to the original word $w_t$.
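A minimal sketch of this repair idea follows, assuming a deterministic `generate(history, m)` interface onto the generation system; the toy transition table and all names are illustrative stand-ins for the real stego generator.

```python
def repair_position(history, observed_suffix, candidate_pool, generate):
    """Recover the original word at a tampered position.

    For each candidate at step t, deterministically regenerate the next
    m words and keep the candidate whose continuation matches the m
    untampered words observed after the attack point."""
    m = len(observed_suffix)
    for cand in candidate_pool:
        if generate(history + [cand], m) == observed_suffix:
            return cand
    return None  # no candidate is consistent with the suffix

# Toy deterministic "generator": the next word is fixed by the last word.
NEXT = {"the": "cat", "cat": "sat", "sat": "down", "a": "dog", "dog": "ran"}
def generate(history, m):
    out, last = [], history[-1]
    for _ in range(m):
        last = NEXT[last]
        out.append(last)
    return out

# "cat" was tampered with; the two following words "sat", "down" survive.
orig = repair_position(["the"], ["sat", "down"], ["cat", "dog"], generate)
```

Determinism is what makes this search sound: only the true original word reproduces the untampered suffix when the generator is rerun from the shared history.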
Algorithm 2 describes the workflow for repairing steganographic text affected by a DODRTA. In line 14, based on historical information (HIS), the algorithm predicts $m + 1$ steps beyond the tampering point, where the first step represents the possible original word at the tampering point, and subsequent steps provide the auxiliary information necessary for the repair. Once predictions are completed, the algorithm combines the historical information with the auxiliary information predicted in the first step. Leveraging the determinism of the steganographic mechanism and a context-oriented search strategy, the algorithm filters out the original word from the possible candidates at the tampering point (lines 15–19).
Algorithm 2. Extracting from DODRTA-affected ST
01: INPUT: steganographic text ST, historical information HIS
02: OUTPUT: hidden information HI
03: ST ← ST.minus(HIS)
04: HI = []
05: HIS → text generation model, TGM
06: for t in range(0, len(ST)) do
07:  candidate pool CP_t ← TGM(HIS)
08:  if ST[t] in CP_t do
09:    temp ← Extraction(ST[t])
10:    HI.add(temp)
11:    HIS ← HIS + ST[t]
12:  else do
13:    // Predict m + 1 positions and query the path of ST[t+1 : t+m]
14:    candidate pools CP_{t : t+m} ← TGM(HIS, m + 1)
15:    if ST[t+1 : t+m] in CP_{t+1 : t+m} do
16:      original word ow ∈ CP_t
17:      temp ← Extraction(ow + ST[t+1 : t+m])
18:      HI.add(temp)
19:      HIS ← HIS + ow + ST[t+1 : t+m]
20:    end if
21:  end if
22: end for
23: Return HI
Repairing steganographic text $S^{**}$ affected by a CODRTA requires more auxiliary knowledge:
$$S^{**} = \{prompt, w_1, w_2, \dots, w_t + \alpha_0, \dots, w_{t+a} + \alpha_a, \dots, w_n\},$$
Let $a$ represent the continuity of tampered words and $z$ denote the amount of required auxiliary information. Based on $\{prompt, w_1, \dots, w_{t-1}\}$, the text generation model predicts $a + z$ steps beyond the tampering point. The repair process is expressed as follows:
$$Y_{a+z}^{t-1} = Seq\left(G\left(S^{**}[prompt : t-1]\right), CP_{t : t+a+z}, a + z\right), \qquad W_{t : t+a} = Sea\left(Y_{a+z}^{t-1}, S^{**}[t+a+1 : t+a+z]\right),$$
where $Y_{a+z}^{t-1}$ denotes the data predicted $a + z$ steps ahead, $Seq(\cdot)$ combines these data samples, $CP_{t : t+a+z}$ collects all candidate pools from step $t$ to $t + a + z$, $Sea(\cdot)$ is the retrieval function for matching the retrieval tokens $S^{**}[t+a+1 : t+a+z]$, and $W_{t : t+a}$ is the set of original words to be restored. Algorithm 3 describes the workflow for repairing steganographic text affected by a CODRTA.
Algorithm 3. Extracting from CODRTA-affected ST
01: INPUT: steganographic text ST, historical information HIS
02: OUTPUT: hidden information HI
03: ST ← ST.minus(HIS)
04: HI = []
05: HIS → text generation model, TGM
06: while ST is not empty do
07:  candidate pool CP_t ← TGM(HIS)
08:  if ST[0] in CP_t do
09:    temp ← Extraction(ST[0])
10:    HI.add(temp)
11:    HIS ← HIS + ST[0]
12:    ST ← ST.minus(ST[0])
13:  elif ST[0] not in CP_t do
14:    candidate pools CP_{t : t+m} ← TGM(HIS, m + 1)
15:    if ST[1 : m−1] in CP_{t+1 : t+m−1} do
16:      go to DODRTA
17:    else do
18:      candidate pools CP_{t : t+a+z} ← TGM(HIS, a + z)
19:      if ST[t+a+1 : t+a+z] in CP_{t+a+1 : t+a+z} do
20:        original words ow_{t : t+a} ∈ CP_{t : t+a}
21:        temp ← Extraction(ow_{t : t+a} + ST[t : t+a+z])
22:        HI.add(temp)
23:        HIS ← HIS + ow_{t : t+a} + ST[t+a+1 : t+a+z]
24:        ST ← ST.minus(ST[0 : a+z−1])
25:      end if
26:    end if
27:  end if
28: end while
29: Return HI
29: Return HI

5. Experiments

This section presents the evaluation metrics commonly used in generative linguistic steganography, analyzes the feasibility of the proposed mechanisms, and compares them with several existing steganographic approaches.

5.1. Evaluation Metrics

The evaluation of generative linguistic steganography typically relies on two key dimensions: perceptual imperceptibility and statistical imperceptibility.
Perceptual imperceptibility refers to the readability of the generated steganographic text, including semantic clarity and logical coherence. For fairness, we employ standard metrics commonly used in natural language processing [39] to assess perceptual imperceptibility. The primary evaluation metrics are perplexity (ppl) and Kullback–Leibler divergence (KL divergence):
ppl = 2^{-\frac{1}{n} \log P(S)},
KL(P_t \| Q_g) = \sum_{x} P_t(x) \log \frac{P_t(x)}{Q_g(x)},
where n denotes the number of words in the steganographic text S, P_t represents the true data distribution obtained from training, and Q_g is the statistical distribution of the generated text.
In addition, we adopt diversity (Div), grammatical scores (GS(·)), and descriptiveness (Des) as auxiliary metrics [40] for assessing perceptual imperceptibility:
Div = \frac{|\mathrm{set}(words)|}{|S|},
GS(S) = \frac{1}{1 + e^{-S}},
Des = \frac{|v|}{|S|}, \quad v \in \{N, A, R\},
where |·| denotes the number of words in the steganographic text and set(·) ensures that only unique words are counted. Specifically, N, A, and R represent nouns, adjectives, and adverbs, respectively.
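The auxiliary metrics above are straightforward to compute. The sketch below is a minimal illustration under stated assumptions: the text is whitespace-tokenized, coarse POS tags (N, A, R) are supplied by an external tagger, and GS(·) is treated as a sigmoid squashing of a raw grammar score from an external checker.

```python
import math

def diversity(tokens):
    """Div: proportion of unique words in the steganographic text."""
    return len(set(tokens)) / len(tokens)

def descriptiveness(pos_tags):
    """Des: proportion of descriptive tokens (nouns N, adjectives A,
    adverbs R) among all tokens, given one coarse tag per token."""
    return sum(tag in {"N", "A", "R"} for tag in pos_tags) / len(pos_tags)

def grammatical_score(raw_score):
    """GS: squash a raw grammar score into (0, 1) with a sigmoid."""
    return 1.0 / (1.0 + math.exp(-raw_score))
```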
Statistical imperceptibility refers to the ability of the generated steganographic text to evade detection by steganalysis tools. We evaluate statistical imperceptibility using three widely adopted steganalysis models: FCN [41], TS-RNN [42], and HiduNet [43]. The corresponding evaluation metrics are accuracy (Acc), precision (Pre), recall (Rec), and F1 score (F1):
Acc = \frac{TP + TN}{TP + TN + FP + FN},
Pre = \frac{TP}{TP + FP},
Rec = \frac{TP}{TP + FN},
F_1 = \frac{2 \times Pre \times Rec}{Pre + Rec},
where TP, TN, FP, and FN denote the numbers of true positives, true negatives, false positives, and false negatives, respectively.
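The four detection metrics follow directly from the confusion-matrix counts; a minimal sketch:

```python
def steganalysis_metrics(tp, tn, fp, fn):
    """Acc, Pre, Rec, and F1 from confusion-matrix counts. From the
    steganographer's perspective, lower values indicate better
    statistical imperceptibility (the detector succeeds less often)."""
    acc = (tp + tn) / (tp + tn + fp + fn)
    pre = tp / (tp + fp)
    rec = tp / (tp + fn)
    f1 = 2 * pre * rec / (pre + rec)
    return acc, pre, rec, f1
```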
To assess the resilience of generative linguistic steganography against active tampering attacks, we adopt three qualitative binary metrics: attack detection capability (ADC), which measures the ability to detect actual manipulation points; attack identification capability (AIC), which evaluates the effectiveness in recognizing the type of attack; and repair capability (RC), which quantifies the ability to restore the steganographic text and correctly extract hidden information.
Furthermore, we introduce the fault tolerance rate (FTR) to quantify robustness against active attacks. FTR is defined as the maximum proportion of manipulation points allowed in the steganographic text, beyond which hidden information can no longer be correctly extracted:
FTR = \frac{Points_{attacked}}{|S|} \times 100\%,
where Points_{attacked} denotes the number of attacked points in the steganographic text.
Finally, embedding capacity is also an important evaluation criterion. It typically refers to the maximum amount of hidden information that can be embedded into a given carrier. As most generative steganographic methods convert hidden information into bit streams, bits per word (bpw) is used as the metric:
bpw = \frac{bits(M)}{|S|},
where M denotes the hidden information and bits(M) its length in bits.
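Both the robustness and capacity metrics are simple ratios over the length of the steganographic text; a minimal sketch:

```python
def fault_tolerance_rate(attacked_points, text_len):
    """FTR: percentage of manipulation points in the text beyond which
    hidden information can no longer be extracted."""
    return attacked_points / text_len * 100

def bits_per_word(hidden_bits, text_len):
    """bpw: embedding capacity as hidden bits per generated word."""
    return hidden_bits / text_len
```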

5.2. Feasibility Analysis

The feasibility analysis aims to evaluate whether the proposed defense mechanisms effectively enhance the integrity and availability of hidden information at the receiving end. We further analyze the performance of different steganographic mechanisms under the two proposed types of attacks.
To rapidly validate the feasibility and usability of the proposed mechanisms, we implemented an ISSA-D steganographic mechanism using the adaptive DBSCAN algorithm and generated 6000 steganographic texts. In constructing the candidate pool, we adopted the top- k strategy with k = 8 , 16 , and 32. For ODRTA evaluation, we constructed 4000 data points with a candidate pool dimension of 2, among which 2000 were used for measuring DODRTA-D and the remainder for measuring CODRTA-D-2. Here, the subscript “2” indicates that the maximum number of consecutive manipulations is two. To ensure generalizability, GPT-2 [44] was selected as the underlying text generation model.
For ISSA-D and ODRTA-D, we evaluated ADC, AIC, and RC. The results are presented in Table 2. The findings demonstrate that ISSA-D is capable of successfully extracting hidden information from steganographic texts affected by ISSA, whereas other mechanisms fail. Similarly, only ODRTA-D achieves successful extraction from texts affected by ODRTA. This difference arises because ODRTA introduces tampered words from outside the current candidate pool, allowing other methods to detect tampering points based on the text generation model and candidate pool strategy. In contrast, ISSA operates more covertly by concealing tampering points within synonyms, preventing other steganographic mechanisms from identifying them.
The complexity analyses and correctness/security proofs of the proposed ISSA-D and ODRTA-D mechanisms are detailed in Appendix A (complexity analysis) and Appendix B (correctness and security analysis).
We further used the fault tolerance rate (FTR) to quantify the performance of ISSA-D, DODRTA-D, and CODRTA-D-2, with the results reported in Table 3. In the table, CP denotes the candidate pool dimension under the top-k strategy, while AC indicates the continuity of the attack. As ISSA-D does not require auxiliary information, its FTR reaches up to 100%, meaning ISSA can occur at any position and affect any number of words in the steganographic text. By contrast, the repair processes of DODRTA-D and CODRTA-D-2 rely on subsequent auxiliary information, thereby limiting their tolerance to extensive manipulations.
In theory, as long as at least one untampered word remains, tampering points can be repaired. We initially set m = 1 . However, in practice, when tampering occurs near the end of the steganographic text, accurate recovery may not be possible. This is due to the iterative nature of the repair process: as the computational sequence grows longer, the computational load increases exponentially, reducing precision and hindering retrieval of the original candidate pool. To mitigate this, we set m = 2 . With m = 2 , DODRTA-affected steganographic texts can be fully and accurately repaired. For CODRTA-2, we further increased the value of z ; when z = 4 , the receiver was also able to extract the hidden information completely and correctly.
Therefore, we conclude that, to ensure full and accurate recovery of hidden information, the amount of auxiliary information following tampered points must be at least twice the number of tampered words.

5.3. Perceptual Imperceptibility Analysis

To validate the perceptual imperceptibility of the steganographic text generated by the proposed method, particularly that produced by ISSA-D, a quantitative assessment was conducted. The dimensions of the candidate pool were set to 8, 16, and 32. A comparison was made among RNN-stega [13], VAE-stega [2], and Discop [30], with RNN-stega [13] serving as the baseline. Discop [30], a theoretically secure steganographic mechanism, demonstrated superior perceptual imperceptibility. Similarly, under different dimensions of the CP, we generated 2000 data samples for each comparative subject under identical conditions of hidden information and prompt.
In evaluating the generative quality of steganographic texts, perceptual imperceptibility was one of the key metrics. This metric primarily referred to the extent to which the generated steganographic text aligned with human writing habits while exhibiting logical coherence and semantic clarity. To objectively assess the perceptual imperceptibility of steganographic texts produced by the four mechanisms, we employed the BERT [10] model as an evaluation tool. The comparison results are presented in Table 4.
From Table 4, it can be observed that the steganographic text generated by ISSA-D exhibited superior generation quality compared with the baseline [13] in terms of the KL and ppl metrics across different dimensions, although it remained lower than that of VAE-stega [2] and Discop [30]. In terms of the Div metric, VAE-stega [2] performed best, followed by ISSA-D; conversely, ISSA-D excelled in the Des metric, with Discop [30] coming in second. To quickly validate the feasibility of ISSA-D, we employed a perfect binary tree for the embedding algorithm, while Discop [30] utilized distributed copies, and both VAE-stega [2] and RNN-stega [13] implemented Huffman coding. In contrast, the latter two embedding algorithms were more effective in enhancing the perceptual imperceptibility of steganographic text.
Additionally, regarding the bpw metric, the embedding capacity of ISSA-D was significantly lower than that of the other three mechanisms. This was because, to ensure the steganographic text could withstand ISSA, certain elements in the candidate pool had to be sacrificed. Furthermore, the candidate pools in Table 4 were obtained through top-k or top-p filtering, with ISSA-D performing additional filtering on the resulting candidate pool. Notably, ISSA-D, RNN-stega [13], and VAE-stega [2] all used a top-k strategy, which allowed for static candidate pool dimensions, whereas Discop [30] adopted a top-p strategy, representing a dynamic candidate pool construction method.
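The perfect-binary-tree embedding used for the quick ISSA-D validation can be viewed as fixed-length block coding over a candidate pool whose size is a power of two: each word then carries exactly log2(k) bits, unlike Huffman coding, which assigns variable-length codes. The sketch below is an illustrative reconstruction under that reading, not the authors' exact implementation.

```python
import math

def tree_embed(bits, pool):
    """Perfect-binary-tree embedding: interpret the next log2(len(pool))
    hidden bits as an index into the candidate pool."""
    b = int(math.log2(len(pool)))
    index = int("".join(map(str, bits[:b])), 2)
    return pool[index], bits[b:]      # (chosen word, remaining bits)

def tree_extract(word, pool):
    """Inverse mapping: recover the fixed-length bit block from the
    position of the chosen word in the pool."""
    b = int(math.log2(len(pool)))
    return [int(c) for c in format(pool.index(word), f"0{b}b")]
```

With a pool of 4 candidates, every generated word encodes exactly 2 bits, which is why the bpw of such schemes is tied directly to the (post-filtering) pool dimension.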

5.4. Statistical Imperceptibility Analysis

For a fair comparison, statistical imperceptibility should be assessed under identical bpw conditions. Given that the bpw of RNN-stega [13], VAE-stega [2], and Discop [30] was significantly higher than that of ISSA-D, we increased the dimensions of the initial candidate pool to raise the bpw of ISSA-D. After adjustment, the bpw values of the newly generated ISSA-D data were 1.1, 1.9, 2.4, and 2.7, respectively.
In evaluating the generative quality of steganographic texts, statistical imperceptibility was another important metric, primarily measuring whether the generated steganographic texts could effectively evade detection by steganalysis tools. To verify the statistical imperceptibility of the steganographic text, we generated 2000 steganographic texts for each of the four mechanisms under each of the four bpw conditions, resulting in 32,000 texts in total.
Figure 3 demonstrates the resilience of steganographic texts against steganalysis using the FCN [41] model. The results indicate that VAE-stega [2] performed best in accuracy (Acc), precision (Pre), and F1 score (F1), averaging 0.75%, 0.25%, and 0.5% better than the second-best method (ISSA-D), respectively. In contrast, Discop [30] excelled in recall (Rec), outperforming the proposed method by an average of 11.75%.
Figure 4 illustrates the resilience of steganographic texts against steganalysis using the TS-RNN [42] model. All four mechanisms performed relatively poorly on Acc, with detection accuracy exceeding 90% in every case. In Pre, the proposed method improved with increasing bpw, with gains of up to 5%. In Rec, Discop [30] performed best, followed by the proposed method, which trailed Discop [30] by an average of 5.75%. For the F1 score, VAE-stega [2] led, followed by Discop [30], the proposed method, and RNN-stega [13].
Figure 5 demonstrates the resilience of steganographic texts against steganalysis using the HiduNet [43] model. The results show that VAE-stega [2] outperforms the other three methods across all four metrics, with Discop [30] following closely. The proposed method averages 11.75, 11, 12.5, and 11.5 percentage points higher than VAE-stega [2] in the respective metrics. It is important to note that, for steganographic text creators, lower values of these four metrics indicate better statistical imperceptibility of the generated steganographic text.
In summary, compared with the baseline model RNN-stega [13], the steganographic texts generated by ISSA-D demonstrated superior capabilities in resisting statistical analysis. However, when compared with the more advanced models Discop [30] and VAE-stega [2], ISSA-D still shows potential for further improvement.
Table 5 presents the performance of steganographic texts generated by GPT-2 [44] under the condition of 1 bpw across four steganalysis tools. These steganographic texts were primarily used to evaluate the statistical imperceptibility of ODRTA-D.

6. Case Studies

The internet of things (IoT) [45] faces numerous shortcomings, including low data security and inadequate privacy protection. These issues render IoT systems vulnerable to attacks, exposing them to both passive and active threats, particularly man-in-the-middle (MITM) attacks and zero-day attacks that compromise the availability of transmitted data [46]. Transmitting confidential information [47] in such insecure communication environments poses significant risks, potentially leading to the disclosure of users’ personal information [48]. Thus, the introduction of protective measures is essential. Compared with conventional encryption systems, covert communication systems can conceal hidden information during transmission, making them more suitable for resource-constrained communication devices. Consequently, implementing generative linguistic steganography mechanisms in IoT environments is crucial. However, existing linguistic steganography mechanisms remain inadequate against MITM attacks. Therefore, we conducted a study to enhance the integrity and availability of hidden information at the receiving end. Our proposed method was tested for usability and effectiveness across various IoT environments.

6.1. Wi-Fi or Bluetooth

Consider a scenario in which Alice and Bob are employees of a company. Alice intends to transmit confidential information to Bob via the company's Wi-Fi [49,50] or Bluetooth [51,52]. However, owing to the inherent vulnerabilities of Wi-Fi and Bluetooth transmission, another employee, Eve, can intercept the data packets. To obscure the confidential information, Alice uses generative steganography and transmits a readable steganographic text S to Bob.
Assuming the steganographic mechanism is shared within the company, Eve can exploit it to decipher the confidential information transmitted by Alice. At this juncture, Eve not only seeks to acquire the confidential information but also aims to prevent Bob from interpreting it. In this context, Eve executes an ISSA on the steganographic text.
Figure 6 illustrates information transmission in wireless communication networks (e.g., Wi-Fi and Bluetooth) and an MITM attack via the in-domain synonym substitution attack (ISSA). In the left half, Alice constructs a steganographic text using the proposed ISSA-D method, with the currently selected word highlighted in red and a marker indicating the i-th attack conducted by Eve. During testing, Eve can tamper with any position, though the example shows only a specific instance for illustration. Ultimately, Alice transmits the steganographic text S to Bob, while Eve intercepts and modifies it according to ISSA, producing the altered text S′, as shown in the right half of the figure. If Bob lacks defensive mechanisms, he will be unable to extract the hidden information from S′. The initial candidate pool displayed in Figure 6 is also assumed to be known to Eve.

6.2. Internet of Medical Things (IoMT)

In the internet of medical things (IoMT) [53], data security issues have also arisen [54,55]. For instance, when hospitals use steganography to conceal confidential information within local databases through text covers such as case reports or patient feedback, a zero-day attack on the database could compromise its availability. In such cases, a steganographic mechanism without recovery measures would be unable to extract hidden confidential information from the compromised text.
In Figure 7a, S_1 represents the steganographic text stored in the local database, with the assumed tampering points highlighted in red, while the blue text indicates the auxiliary information required for tampering recovery. The figure illustrates both the construction of the steganographic text and the DODRTA process, resulting in the transformation of S_1 into S_1^*. It is important to note that, as the adversary aims solely to corrupt the data, the candidate pool remains unknown to them.

6.3. Industrial Internet of Things (IIoT)

In industrial control environments [56,57], long-distance communication from the data acquisition end to the data processing end is particularly susceptible to tampering. Transmitting collected data in plaintext is highly insecure. Conversely, ciphertext transmission may attract the attention of adversaries, increasing the likelihood of interception. Therefore, introducing steganography into industrial control environments is essential.
Figure 7b shows a different scenario, where linguistic steganography is applied in the industrial internet of things and a continuous out-of-domain random tampering attack (CODRTA) is launched on all data. If the data processing end lacks a configured recovery mechanism, the chances of extracting hidden information are significantly reduced, leading to a loss of data availability. Here too, the candidate pool remains unknown to the adversary.
These experiments simulate real-world IoT and IoMT scenarios where active adversaries can tamper with steganographic texts without access to the candidate pool. The key observation is that, without recovery or defense mechanisms, both DODRTA and CODRTA can severely compromise the integrity and availability of hidden information, emphasizing the necessity of the proposed tampering-recovery strategies.

7. Conclusions

Current AI-based linguistic steganography mechanisms often fail to withstand active attacks. To address this, we proposed two novel text tampering models, ISSA and ODRTA, to distinguish whether tampered words exist in the candidate pool, with ODRTA further divided into discontinuous and continuous variants. Correspondingly, we designed targeted defense mechanisms, a proactive adaptive clustering for ISSA and a post-hoc context-oriented repair for ODRTA. These approaches effectively counter ISSA, CODRTA, and DODRTA, significantly improving the integrity and usability of hidden information at the receiving end.
Compared with existing methods such as VAE-stega and Discop, our framework introduces innovative attack models and effective defenses, explicitly addressing candidate-pool and out-of-domain tampering while maintaining reliable steganographic communication under adversarial conditions.
Despite these improvements, the generated steganographic texts still exhibit limited perceptual imperceptibility, especially at higher bits per word. Future work will focus on enhancing imperceptibility while preserving resilience, and extending the approach to other data covers such as images and videos to improve practical usability in IoT environments.

Author Contributions

Conceptualization, Y.C.; methodology, Y.C.; software, Y.C., X.W. and Z.Y.; validation, X.W. and Z.Y.; formal analysis, Y.C. and X.W.; investigation, Y.C. and Z.Y.; writing—original draft preparation, Y.C., Q.L., X.W. and Z.Y.; writing—review and editing, Y.C., Q.L., X.W. and Z.Y.; visualization, Q.L.; supervision, Q.L.; project administration, Q.L.; funding acquisition, Q.L. All authors have read and agreed to the published version of the manuscript.

Funding

This paper is funded by the 2024 Jiangsu Province Frontier Technology R&D Project “Research on Cross-Domain Multi-Dimensional Security Technology of Intelligent Systems for AI Computing Networks” (No: BF2024071).

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
Acc: Accuracy
ADC: Attack detection capability
AI: Artificial intelligence
AIC: Attack identification capability
bpw: Bits per word
CODRTA: Continuous ODRTA
CP: Candidate pool
Des: Descriptiveness
Div: Diversity
DODRTA: Discontinuous ODRTA
ISSA-D: ISSA-Defense
F1: F1 score
FTR: Fault tolerance rate
GAN: Generative adversarial network
GS(·): Grammatical scores
HIS: Historical information
IIoT: Industrial internet of things
IoMT: Internet of medical things
IoT: Internet of things
ISSA: In-domain synonym substitution attack
KL: Kullback–Leibler divergence
MITM: Man-in-the-middle
NLP: Natural language processing
ODRTA: Out-of-domain random tampering attack
ODRTA-D: ODRTA-Defense
ppl: Perplexity
Pre: Precision
RC: Repair capability
RTA: Random tampering attack
VAE: Variational autoencoder

Appendix A

For the ISSA-D mechanism, during the generation of a single token, the time complexity can be decomposed into several main components: the forward pass of the generative model, O(M_g(L^2 d_g + L d_g^2)), where M_g is the number of layers in the generative model, L is the current sequence length, and d_g is the hidden dimension; top-k candidate selection implemented via a heap, O(|V| \log k); candidate filtering, O(k); candidate word clustering, including BERT encoding, O(k M_c d_c^2), and DBSCAN clustering, O(k^2 d_c), where M_c and d_c denote the number of layers and hidden dimension of the BERT model used for clustering; and finally, result assembly and writing, O(k). Therefore, the overall time complexity for generating a single token is:
O\left( M_g \left( L^2 d_g + L d_g^2 \right) + |V| \log k + k M_c d_c^2 + k^2 d_c \right),
The space complexity mainly consists of the activations and parameters of the generative model, O(M_g L d_g + \theta_g), where \theta_g represents the total number of parameters, as well as the BERT embeddings and clustering data required for candidate word clustering, O(k d_c + k^2). Other overheads are negligible. Thus, the overall space complexity is:
O\left( M_g L d_g + \theta_g + k d_c + k^2 \right),
where the generative model constitutes the dominant cost and the overhead from candidate word clustering is relatively small.
For the ODRTA-D mechanism, during the recovery of a single token, the time complexity includes the forward pass of the generative model, top-k candidate selection via a heap, candidate filtering, and stego verification with partial sequence enumeration. In the worst case, enumeration introduces additional cost proportional to the number of enumerated sequences. Result assembly and token appending also contribute linearly. Therefore, the overall time complexity is:
O\left( M_g \left( L^2 d_g + L d_g^2 \right) + |V| \log k + k + L \right),
or, in the worst case with enumeration:
O\left( E \cdot \left( M_g \left( L^2 d_g + L d_g^2 \right) + |V| \log k + k + L \right) \right),
where E is the fixed enumeration length.
The space complexity mainly consists of the generative model activations and parameters, as well as storage for candidate lists and enumerated sequences. Other overheads are negligible, yielding:
O\left( M_g L d_g + \theta_g + k + E \cdot L \right),
where the generative model remains the dominant cost and enumeration overhead is relatively small.

Appendix B

Appendix B.1. Correctness and Security Analysis for ISSA-D

To verify the correctness of the proposed defense algorithm, we analyze the effectiveness of the candidate pool construction and clustering mechanism. At each step t, the original candidate pool CP_t is clustered to produce CP_t^c, with each cluster containing at least two words. This ensures that, during embedding, each candidate can withstand at least one synonym substitution. Let w_i denote the original word; if an adversary performs ISSA and substitutes w_i with a synonym x_i^a within the same cluster, the receiver can recover w_i using the cluster-based query function Q:
x_i^a \in C_i \;\Rightarrow\; Q(x_i^a) = w_i,
This mechanism guarantees the integrity and recoverability of embedded information while maintaining semantic consistency. The dynamically adjusted clustering radius ϵ adapts to variations in candidate pool density, ensuring reasonable cluster formation and high semantic similarity.
By selecting the word with the highest probability within each cluster for embedding, the algorithm preserves text naturalness and imperceptibility, while intra-cluster semantic similarity minimizes semantic drift. The dynamic ϵ ensures robustness under varying candidate pool sizes and densities, and the requirement of at least two words per cluster further strengthens resistance to ISSA. Overall, the algorithm effectively defends against ISSA while ensuring the correctness and recoverability of the embedded information without compromising text quality.
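The cluster-and-query defense can be sketched as follows. For brevity this substitutes a greedy distance-threshold clustering for the paper's adaptive DBSCAN, and assumes word embeddings (e.g., from BERT) and next-token probabilities are supplied externally; clusters with fewer than two words are discarded, as ISSA-D requires.

```python
def _dist(a, b):
    """Euclidean distance between two embedding vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def cluster_pool(words, embeddings, eps):
    """Greedy single-link clustering of a candidate pool; a pure-Python
    stand-in for adaptive DBSCAN. Only clusters with at least two words
    survive, so every embedded word tolerates a substitution."""
    clusters = []
    for w in words:
        for c in clusters:
            if any(_dist(embeddings[w], embeddings[v]) <= eps for v in c):
                c.append(w)
                break
        else:
            clusters.append([w])
    return [c for c in clusters if len(c) >= 2]

def query(clusters, probs, x):
    """Q(x): map a possibly substituted word back to the embedded word,
    i.e., the highest-probability member of x's cluster."""
    for c in clusters:
        if x in c:
            return max(c, key=lambda w: probs[w])
    return None
```

Because the embedded word is always the highest-probability member of its cluster, any in-cluster substitution maps back to it, which is the recoverability property stated above.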

Appendix B.2. Correctness and Security Analysis for ODRTA-D

To verify the correctness of the proposed ODRTA Defense (ODRTA-D) mechanism, we analyze the effectiveness of its text repair and recovery strategy. ODRTA-D leverages two key properties: the determinism of the generative linguistic steganography system and a context-oriented search strategy. Under fixed conditions, including the text generation model, candidate pool construction strategy, embedding algorithm, and hidden information, the generative steganography system is deterministic, ensuring that the same input consistently produces the same output. This property allows the receiving end to accurately reconstruct candidate pools even when certain words have been tampered with.
When a steganographic text is subject to an Out-of-Domain Random Tampering Attack (ODRTA), whether discontinuous (DODRTA) or continuous (CODRTA), the tampered words may fall outside the original candidate pools. The repair mechanism uses previously verified words as historical context and employs the text generation model to predict and enumerate possible original words at the tampering points. By exploiting the deterministic relationship between candidate pools and embedded information, ODRTA-D successfully restores the integrity and recoverability of the hidden information.
From a security perspective, ODRTA-D effectively mitigates out-of-domain random tampering attacks. Its reliance on historical context, deterministic candidate pools, and context-oriented search ensures that even if multiple words are replaced, the hidden information can still be recovered correctly. The mechanism is robust to both discontinuous and continuous attacks, as the repair process adaptively extends the search based on the continuity of tampering and available auxiliary information. Consequently, ODRTA-D guarantees that hidden information remains correct and recoverable while preserving the naturalness and semantic coherence of the steganographic text.

References

  1. Cachin, C. An information-theoretic model for steganography. In Proceedings of the International Workshop on Information Hiding, Portland, OR, USA, 14–17 April 1998; pp. 306–318. [Google Scholar]
  2. Yang, Z.-L.; Zhang, S.-Y.; Hu, Y.-T.; Hu, Z.-W.; Huang, Y.-F. Vae-stega: Linguistic steganography based on variational auto-encoder. IEEE Trans. Inf. Forensics Secur. 2020, 16, 880–895. [Google Scholar] [CrossRef]
  3. Wu, N.; Shang, P.; Fan, J.; Yang, Z.; Ma, W.; Liu, Z. Research on coverless text steganography based on single bit rules. J. Phys. Conf. Ser. 2019, 1237, 022077. [Google Scholar] [CrossRef]
  4. Luo, Y.; Huang, Y.; Li, F.; Chang, C. Text steganography based on ci-poetry generation using markov chain model. KSII Trans. Internet Inf. Syst. (TIIS) 2016, 10, 4568–4584. [Google Scholar]
  5. Luo, Y.; Huang, Y. Text steganography with high embedding rate: Using recurrent neural networks to generate chinese classic poetry. In Proceedings of the 5th ACM Workshop on Information Hiding and Multimedia Security, Philadelphia, PA, USA, 20–22 June 2017; pp. 99–104. [Google Scholar]
  6. Tong, Y.; Liu, Y.; Wang, J.; Xin, G. Text steganography on rnn-generated lyrics. Math. Biosci. Eng. 2019, 16, 5451–5463. [Google Scholar] [CrossRef]
  7. Kingma, D.P.; Welling, M. Auto-encoding variational bayes. arXiv 2013, arXiv:1312.6114. [Google Scholar]
  8. Creswell, A.; White, T.; Dumoulin, V.; Arulkumaran, K.; Sengupta, B.; Bharath, A.A. Generative adversarial networks: An overview. IEEE Signal Process. Mag. 2018, 35, 53–65. [Google Scholar] [CrossRef]
  9. Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is all you need. In Proceedings of the International Conference on Neural Information Processing Systems, Long Beach, CA, USA, 4–9 December 2017; Volume 30. [Google Scholar]
  10. Devlin, J.; Chang, M.-W.; Lee, K.; Toutanova, K. Bert: Pre-training of deep bidirectional transformers for language understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Minneapolis, MN, USA, 2–7 June 2019; Volume 1 (Long and Short Papers), pp. 4171–4186. [Google Scholar]
  11. Radford, A.; Narasimhan, K.; Salimans, T.; Sutskever, I. Improving language understanding by generative pre-training. OpenAI, 2018; preprint. [Google Scholar]
  12. Ziegler, Z.M.; Deng, Y.; Rush, A.M. Neural linguistic steganography. arXiv 2019, arXiv:1909.01496. [Google Scholar] [CrossRef]
  13. Yang, Z.-L.; Guo, X.-Q.; Chen, Z.-M.; Huang, Y.-F.; Zhang, Y.-J. Rnn-stega: Linguistic steganography based on recurrent neural networks. IEEE Trans. Inf. Forensics Secur. 2018, 14, 1280–1295. [Google Scholar] [CrossRef]
  14. Dai, F.Z.; Cai, Z. Towards near-imperceptible steganographic text. arXiv 2019, arXiv:1907.06679. [Google Scholar] [CrossRef]
  15. Fang, T.; Jaggi, M.; Argyraki, K. Generating steganographic text with lstms. arXiv 2017, arXiv:1705.10742. [Google Scholar] [CrossRef]
  16. Zhou, X.; Peng, W.; Yang, B.; Wen, J.; Xue, Y.; Zhong, P. Linguistic steganography based on adaptive probability distribution. IEEE Trans. Dependable Secur. Comput. 2021, 19, 2982–2997. [Google Scholar] [CrossRef]
17. Goodfellow, I.J.; Pouget-Abadie, J.; Mirza, M.; Xu, B.; Warde-Farley, D.; Ozair, S.; Courville, A.; Bengio, Y. Generative adversarial nets. In Proceedings of the International Conference on Neural Information Processing Systems, Montreal, QC, Canada, 8–13 December 2014; Volume 27.
18. Yi, B.; Wu, H.; Feng, G.; Zhang, X. ALiSa: Acrostic linguistic steganography based on BERT and Gibbs sampling. IEEE Signal Process. Lett. 2022, 29, 687–691.
19. Cao, Y.; Zhou, Z.; Chakraborty, C.; Wang, M.; Wu, Q.J.; Sun, X.; Yu, K. Generative steganography based on long readable text generation. IEEE Trans. Comput. Soc. Syst. 2022, 11, 4584–4594.
20. Yan, R.; Yang, Y.; Song, T. A secure and disambiguating approach for generative linguistic steganography. IEEE Signal Process. Lett. 2023, 30, 1047–1051.
21. Ding, C.; Fu, Z.; Yang, Z.; Yu, Q.; Li, D.; Huang, Y. Context-aware linguistic steganography model based on neural machine translation. IEEE/ACM Trans. Audio Speech Lang. Process. 2023, 32, 868–878.
22. Li, Y.; Zhang, R.; Liu, J.; Lei, Q. A semantic controllable long text steganography framework based on LLM prompt engineering and knowledge graph. IEEE Signal Process. Lett. 2024, 31, 2610–2614.
23. Yang, T.; Wu, H.; Yi, B.; Feng, G.; Zhang, X. Semantic-preserving linguistic steganography by pivot translation and semantic-aware bins coding. IEEE Trans. Dependable Secur. Comput. 2023, 21, 139–152.
24. Zhang, R.; Liu, J.; Zhang, R. Controllable semantic linguistic steganography via summarization generation. In Proceedings of the ICASSP 2024—2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Seoul, Republic of Korea, 14–19 April 2024; pp. 4560–4564.
25. Bai, M.; Yang, J.; Pang, K.; Huang, Y.; Gao, Y. Semantic steganography: A framework for robust and high-capacity information hiding using large language models. arXiv 2024, arXiv:2412.11043.
26. Zhang, S.; Yang, Z.; Yang, J.; Huang, Y. Provably secure generative linguistic steganography. arXiv 2021, arXiv:2106.02011.
27. de Witt, C.S.; Sokota, S.; Kolter, J.Z.; Foerster, J.; Strohmeier, M. Perfectly secure steganography using minimum entropy coupling. arXiv 2022, arXiv:2210.14889.
28. Zhang, S.; Yang, Z.; Yang, J.; Huang, Y. Linguistic steganography: From symbolic space to semantic space. IEEE Signal Process. Lett. 2020, 28, 11–15.
29. Pang, K.; Bai, M.; Yang, J.; Wang, H.; Jiang, M.; Huang, Y. Fremax: A simple method towards truly secure generative linguistic steganography. In Proceedings of the ICASSP 2024—2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Seoul, Republic of Korea, 14–19 April 2024; pp. 4755–4759.
30. Ding, J.; Chen, K.; Wang, Y.; Zhao, N.; Zhang, W.; Yu, N. Discop: Provably secure steganography in practice based on “distribution copies”. In Proceedings of the 2023 IEEE Symposium on Security and Privacy (SP), San Francisco, CA, USA, 21–25 May 2023; pp. 2238–2255.
31. Wu, J.; Wu, Z.; Xue, Y.; Wen, J.; Peng, W. Generative text steganography with large language model. In Proceedings of the 32nd ACM International Conference on Multimedia, Melbourne, VIC, Australia, 28 October–1 November 2024; pp. 10345–10353.
32. Lin, K.; Luo, Y.; Zhang, Z.; Luo, P. Zero-shot generative linguistic steganography. arXiv 2024, arXiv:2403.10856.
33. Sun, B.; Li, Y.; Zhang, J.; Xu, H.; Ma, X.; Xia, P. Topic controlled steganography via graph-to-text generation. CMES-Comput. Model. Eng. Sci. 2023, 136, 157–176.
34. Xiang, L.; Yang, S.; Liu, Y.; Li, Q.; Zhu, C. Novel linguistic steganography based on character-level text generation. Mathematics 2020, 8, 1558.
35. Adeeb, O.F.A.; Kabudian, S.J. Arabic text steganography based on deep learning methods. IEEE Access 2022, 10, 94403–94416.
36. Von Ahn, L.; Hopper, N.J. Public-key steganography. In Proceedings of the International Conference on the Theory and Applications of Cryptographic Techniques, Interlaken, Switzerland, 2–6 May 2004; pp. 323–341.
37. Backes, M.; Cachin, C. Public-key steganography with active attacks. In Proceedings of the Theory of Cryptography Conference, Cambridge, MA, USA, 10–12 February 2005; pp. 210–226.
38. Lu, T.; Liu, G.; Zhang, R.; Ju, T. Neural linguistic steganography with controllable security. In Proceedings of the 2023 International Joint Conference on Neural Networks (IJCNN), Gold Coast, Australia, 18–23 June 2023; pp. 1–8.
39. Celikyilmaz, A.; Clark, E.; Gao, J. Evaluation of text generation: A survey. arXiv 2020, arXiv:2006.14799.
40. Becker, J.; Wahle, J.P.; Gipp, B.; Ruas, T. Text generation: A systematic literature review of tasks, evaluation, and challenges. arXiv 2024, arXiv:2405.15604.
41. Yang, Z.; Huang, Y.; Zhang, Y.-J. A fast and efficient text steganalysis method. IEEE Signal Process. Lett. 2019, 26, 627–631.
42. Yang, Z.; Wang, K.; Li, J.; Huang, Y.; Zhang, Y.-J. TS-RNN: Text steganalysis based on recurrent neural networks. IEEE Signal Process. Lett. 2019, 26, 1743–1747.
43. Peng, W.; Li, S.; Qian, Z.; Zhang, X. Text steganalysis based on hierarchical supervised learning and dual attention mechanism. IEEE/ACM Trans. Audio Speech Lang. Process. 2023, 31, 3513–3526.
44. Radford, A.; Wu, J.; Child, R.; Luan, D.; Amodei, D.; Sutskever, I. Language models are unsupervised multitask learners. OpenAI Blog 2019, 1, 9.
45. Bhattacharjya, A.; Zhong, X.; Wang, J.; Li, X. Present scenarios of IoT projects with security aspects focused. In Digital Twin Technologies and Smart Cities; Springer International Publishing: Cham, Switzerland, 2020; pp. 95–122.
46. Bhattacharjya, A. A holistic study on the use of blockchain technology in CPS and IoT architectures maintaining the CIA triad in data communication. Int. J. Appl. Math. Comput. Sci. 2022, 32, 403–413.
47. Bahashwan, A.A.; Anbar, M.; Abdullah, N.; Al-Hadhrami, T.; Hanshi, S.M. Review on common IoT communication technologies for both long-range network (LPWAN) and short-range network. In Advances on Smart and Soft Computing, Proceedings of the ICACIn 2020, Casablanca, Morocco, 12–13 April 2020; Springer: Singapore, 2021; pp. 341–353.
48. Bhattacharjya, A.; Zhong, X.; Wang, J.; Li, X. Security challenges and concerns of Internet of Things (IoT). In Cyber-Physical Systems: Architecture, Security and Application; Springer International Publishing: Cham, Switzerland, 2019; pp. 153–185.
49. Ramanna, V.K.; Sheth, J.; Liu, S.; Dezfouli, B. Towards understanding and enhancing association and long sleep in low-power WiFi IoT systems. IEEE Trans. Green Commun. Netw. 2021, 5, 1833–1845.
50. Lee, I.-G.; Kim, D.B.; Choi, J.; Park, H.; Lee, S.-K.; Cho, J.; Yu, H. WiFi HaLow for long-range and low-power Internet of Things: System on chip development and performance evaluation. IEEE Commun. Mag. 2021, 59, 101–107.
51. Kalanandhini, G.; Aravind, A.; Vijayalakshmi, G.; Gayathri, J.; Senthilkumar, K. Bluetooth technology on IoT using the architecture of piconet and scatternet. AIP Conf. Proc. 2022, 2393, 020121.
52. Mohamed, K.S. Bluetooth 5.0 Modem Design for IoT Devices; Springer: Cham, Switzerland, 2022.
53. Bhattacharjya, A.; Kozdrój, K.; Bazydło, G.; Wisniewski, R. Trusted and secure blockchain-based architecture for Internet-of-Medical-Things. Electronics 2022, 11, 2560.
54. Siam, A.I.; Almaiah, M.A.; Al-Zahrani, A.; Elazm, A.A.; El Banby, G.M.; El-Shafai, W.; El-Samie, F.E.A.; El-Bahnasawy, N.A. Secure health monitoring communication systems based on IoT and cloud computing for medical emergency applications. Comput. Intell. Neurosci. 2021, 2021, 8016525.
55. El Zouka, H.A.; Hosni, M.M. Secure IoT communications for smart healthcare monitoring system. Internet Things 2021, 13, 100036.
56. Shu, J.; Lu, J. Two-stage botnet detection method based on feature selection for industrial Internet of Things. IET Inf. Secur. 2025, 2025, 9984635.
57. Wu, B.; Shi, C.; Jiang, W.; Zhan, T.; Qian, K. Enterprise digital intelligent remote control system based on industrial Internet of Things. World J. Innov. Mod. Technol. 2024, 7, 67–74.
Figure 1. Diagram of an adversary executing ISSA at position t.
Figure 2. ODRTA operation per unit step, where w_i denotes the original word and aw_i denotes the word that replaces w_i at position i.
Figure 3. Performance comparison of FCN [41] under different bpw settings.
Figure 4. Performance comparison of TS-RNN [42] under different bpw settings.
Figure 5. Performance comparison of HiDuNet [43] under different bpw settings.
Figure 6. MITM attack via in-domain synonym substitution in a wireless network (e.g., Wi-Fi, Bluetooth). Superscripts indicate positions, and red font marks the altered source words and their replacements.
Figure 7. Schematic diagram of the out-of-domain random tampering attack. (a) Linguistic steganography in the Internet of Medical Things (IoMT) under a zero-day attack on the database (text tampering, DODRTA). (b) Linguistic steganography in the Industrial Internet of Things under an indiscriminate tampering attack (text tampering, CODRTA). Superscripts indicate positions, red font marks the tampered words, and blue font denotes the auxiliary information used for repair.
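Both attack families break extraction because decoding relies on each stego word matching its candidate pool exactly. The following toy sketch (hypothetical two-bit candidate pools chosen for illustration only; in generative steganography the pools come from a language model's ranked next-word predictions) shows why an in-pool synonym swap (ISSA-style) decodes silently but wrongly, while an out-of-domain replacement (ODRTA-style) breaks extraction outright:

```python
# Toy candidate pools: at each step the encoder picks one of 2**BPW
# ranked words, so each word carries BPW hidden bits.
POOLS = [
    ["quick", "swift", "rapid", "fast"],
    ["fox", "dog", "cat", "hare"],
    ["jumps", "leaps", "hops", "vaults"],
]
BPW = 2  # bits per word = log2(pool size)

def embed(bits):
    """Map each BPW-bit group to the word at that index in the step's pool."""
    return [pool[int(bits[i * BPW:(i + 1) * BPW], 2)]
            for i, pool in enumerate(POOLS)]

def extract(words):
    """Invert embed(); raises ValueError if a word is no longer in its pool."""
    return "".join(format(POOLS[i].index(w), "02b")
                   for i, w in enumerate(words))

stego = embed("011001")              # -> ["swift", "cat", "leaps"]
assert extract(stego) == "011001"    # symmetric decode succeeds

# ISSA-style tamper: an in-pool synonym swap decodes silently but wrongly.
assert extract(["swift", "dog", "leaps"]) == "010101"

# ODRTA-style tamper: an out-of-domain word breaks extraction outright.
try:
    extract(["swift", "rabbit", "leaps"])
except ValueError:
    print("extraction failed: tampered word not in candidate pool")
```

The in-pool swap is the more dangerous case for the receiver, since extraction completes without any error signal; this is the asymmetry the paper's proactive ISSA defense targets, while the ODRTA repair mechanism handles the detectable failure case.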
Table 1. Comparative summary of generative linguistic steganography approaches.

Development Stage | Method Category | Embedding Strategy | Main Focus/Feature | Advantages | Limitations | Reference
Early (2000–2005) | Statistical | State transitions | Basic generative steganography | Simple | Low capacity and quality | [3,4]
Neural networks (2010–2015) | Neural networks | Word level | Improve embedding | Higher capacity | Vulnerable | [5,6]
Advanced neural models (2020–2022) | LSTM/SOTA | Huffman | Enhance imperceptibility | High quality | Model complexity | [12,13,14,15]
Large language models (2022–2024) | Transformer/GAN/VAE | Variable embedding | Improve text quality | High quality | High cost | [2,16,17,18,19,20,21,22,23,24]
Public key and security focus | Public key/controllable | Cover text | Resist active attacks | Secure | Complex | [36,37,38]
Table 2. Comparison of defensive capabilities of steganographic mechanisms. ☑ and ☒ denote the presence and absence of capability, respectively.

Method | ISSA (ADC / AIC / RC) | ODRTA (ADC / AIC / RC)
ISSA-D (ours)
ODRTA-D (ours)
RNN-stega [13]
PH-stega [14]
VAE-stega [2]
GAN-stega [16]
BERT-GS-stega [18]
PAP-stega [19]
LD-stega [20]
NT-stega [21]
PE&KE-stega [22]
SS2SS-stega [28]
Discop [30]
LLM-stega [31]
ZS-stega [32]
TC-stega [33]
CS-stega [38]
Table 3. Defensive performance of the proposed method. ☑ and ☒ denote the presence and absence of capability, respectively.

Method | |CP| | FTRAC
ISSA-D | 8 | 100%
ISSA-D | 16 | 100%
ISSA-D | 32 | 100%
DODRTA-D | 2 | 33.3%
CODRTA-D-2 | 2 | 33.3%
Table 4. Perceptual imperceptibility comparison.

Method | |CP| | KL | ppl | Div | GS | Des | bpw
ISSA-D | 8 | 4.24 | 19.96 | 0.36 | 0.5 | 0.34 | 0.16
ISSA-D | 16 | 5.43 | 46.49 | 0.62 | 0.5 | 0.39 | 0.33
ISSA-D | 32 | 7.23 | 63.99 | 0.73 | 0.5 | 0.41 | 0.41
DODRTA-D | 2 | 4.70 | 29.0 | 0.84 | 0.5 | 0.34 | 0.96
Discop [30] | Dynamic | 1.86 | 2.00 | 0.5 | 0.5 | 0.37 | 1.38
Discop [30] | Dynamic | 3.19 | 3.70 | 0.6 | 0.5 | 0.40 | 1.78
Discop [30] | Dynamic | 4.20 | 9.60 | 0.51 | 0.5 | 0.36 | 2.70
VAE-stega [2] | 8 | 3.74 | 17.13 | 0.72 | 0.5 | 0.39 | 2.54
VAE-stega [2] | 16 | 4.17 | 22.67 | 0.75 | 0.5 | 0.38 | 3.19
VAE-stega [2] | 32 | 4.63 | 30.57 | 0.78 | 0.5 | 0.38 | 3.76
RNN-stega [13] | 8 | 7.53 | 20.93 | 0.35 | 0.5 | 0.30 | 2.64
RNN-stega [13] | 16 | 8.20 | 33.16 | 0.45 | 0.5 | 0.30 | 3.52
RNN-stega [13] | 32 | 8.83 | 52.55 | 0.56 | 0.5 | 0.29 | 4.40
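For readers who want to reproduce imperceptibility figures like those in Table 4, a minimal sketch follows. The exact definitions the authors use for KL, Div, and bpw are not restated here, so this assumes the standard choices: smoothed unigram KL divergence between stego and cover text, distinct-token ratio for diversity, and embedded bits divided by generated words for bpw.

```python
import math
from collections import Counter

def kl_divergence(stego_tokens, cover_tokens, eps=1e-9):
    """Smoothed KL(P_stego || P_cover) between unigram distributions, in
    bits; lower values mean the stego text is statistically closer to
    natural cover text."""
    vocab = set(stego_tokens) | set(cover_tokens)
    p_cnt, q_cnt = Counter(stego_tokens), Counter(cover_tokens)
    kl = 0.0
    for w in vocab:
        p = (p_cnt[w] + eps) / (len(stego_tokens) + eps * len(vocab))
        q = (q_cnt[w] + eps) / (len(cover_tokens) + eps * len(vocab))
        kl += p * math.log2(p / q)
    return kl

def bits_per_word(embedded_bits, generated_words):
    """Embedding rate (bpw): hidden bits carried per generated word."""
    return embedded_bits / generated_words

def distinct_ratio(tokens):
    """Div-style diversity proxy: fraction of distinct tokens."""
    return len(set(tokens)) / len(tokens)
```

For example, `bits_per_word(16, 100)` gives 0.16, matching the order of magnitude of the ISSA-D |CP| = 8 row; the low bpw reflects the capacity cost of the ISSA defense.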
Table 5. Steganalysis resistance for ODRTA-D steganographic text.

Metric | [41] | [42] | [43]
Acc | 64.00% | 88.79% | 92.62%
Pre | 58.22% | 90.96% | 91.01%
F1 | 100% | 85.74% | 94.82%
Rec | 73.29% | 87.10% | 92.61%
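The steganalysis metrics in Table 5 are the standard binary-detection measures, computed from a detector's confusion counts (here "positive" means the detector flags a text as steganographic, so lower detector scores indicate better resistance). A self-contained sketch with hypothetical counts:

```python
def steganalysis_metrics(tp, fp, tn, fn):
    """Accuracy, precision, recall, and F1 from detector confusion counts."""
    acc = (tp + tn) / (tp + fp + tn + fn)
    pre = tp / (tp + fp) if tp + fp else 0.0
    rec = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * pre * rec / (pre + rec) if pre + rec else 0.0
    return {"Acc": acc, "Pre": pre, "Rec": rec, "F1": f1}

# Hypothetical confusion counts, for illustration only.
m = steganalysis_metrics(tp=80, fp=20, tn=70, fn=30)
print(f"Acc={m['Acc']:.2%} Pre={m['Pre']:.2%} Rec={m['Rec']:.2%} F1={m['F1']:.2%}")
```

Note that F1 is the harmonic mean of precision and recall, so it is bounded by both; an F1 above both Pre and Rec in a reported row usually signals a transcription issue rather than a property of the detector.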
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Chen, Y.; Li, Q.; Wu, X.; Ying, Z. Research on Making Two Models Based on the Generative Linguistic Steganography for Securing Linguistic Steganographic Texts from Active Attacks. Symmetry 2025, 17, 1416. https://doi.org/10.3390/sym17091416
