1. Introduction
Steganography [1] enables the covert transmission of information by concealing the existence of a message. Traditionally, hidden information has been embedded within common covers such as text, images, and videos, thereby reducing perceptibility during transmission. Among these media, text has attracted significant attention due to its widespread use and ease of distribution. However, conventional text-based steganographic methods suffer from limited embedding capacity and insufficient imperceptibility [2], which limits their practical applicability.
With the rapid advancement of artificial intelligence (AI), neural text generation models can produce highly readable text by leveraging intrinsic associations in large-scale datasets. This capability has given rise to generative linguistic steganography, which encodes hidden information into fluent text that is unrelated to the original message. Compared with traditional approaches, generative methods can embed more information while maintaining better imperceptibility.
Despite these advantages, existing generative linguistic steganography primarily focuses on resisting passive attacks, such as manual audits and statistical steganalysis. Active attacks, particularly text tampering, pose a significant threat by modifying steganographic text and preventing accurate extraction of hidden information. Text tampering can take the form of synonym substitution, which replaces words with semantically similar alternatives, or random tampering, which introduces arbitrary changes without semantic constraints. However, conventional attack models are often ineffective against carefully crafted generative steganographic text, allowing the recipient to recover the hidden information.
This research addresses the vulnerability of generative linguistic steganography to active text tampering attacks. We redefine traditional tampering attacks to enhance their adversarial capability, resulting in the in-domain synonym substitution attack (ISSA) and the out-of-domain random tampering attack (ODRTA), with the latter further divided into continuous and discontinuous types. To defend against these attacks, we propose a proactive adaptive-clustering mechanism for ISSA and a post-hoc repair mechanism for ODRTA, leveraging context-oriented search and the determinism of text generation. These mechanisms aim to improve the integrity and usability of hidden information under active attacks.
The main contributions of this research are as follows:
We remodel common text tampering attacks specifically for generative linguistic steganography.
We classify tampering attacks into ISSA and ODRTA based on whether the tampered words exist within the candidate pool of the text generation model.
We further categorize ODRTA into continuous and discontinuous types according to the continuity of the attack.
We propose a proactive adaptive-clustering defense for ISSA and a post-hoc repair mechanism for ODRTA.
Experiments validate that the proposed mechanisms effectively enhance the integrity and accessibility of hidden information.
Section 2 reviews related work on generative linguistic steganography.
Section 3 introduces the two types of attack models.
Section 4 details the proposed steganographic mechanisms.
Section 5 presents experimental results and provides a comprehensive discussion.
Section 6 illustrates examples of the proposed attack models and defense mechanisms in practice. Finally, Section 7 concludes the research.
2. Related Work
In the early days of generative linguistic steganography, limited computational power often necessitated reliance on statistical models or Markov chains for text generation. Specifically, these mechanisms used Markov chains to embed hidden information through state transitions and transition probabilities. However, Markov chain-based steganographic texts [3] struggled to embed large amounts of information and generally exhibited low quality, with poor perceptual and statistical imperceptibility. Huang et al. [4] attempted to improve text quality by using Song Ci as a textual template, but the template format constrained embedding capacity, limiting the practical utility of early mechanisms.
With advances in computational capabilities and artificial intelligence (AI), neural network-based text generation models have significantly improved the handling of long sequences. Consequently, generative linguistic steganography regained research focus, emphasizing higher embedding capacity and better quality, including both perceptual and statistical imperceptibility. Perceptual imperceptibility refers to evading human scrutiny, while statistical imperceptibility assesses alignment with the statistical distribution of large datasets.
Despite improvements, early neural models faced challenges in generating high-quality text, and embedding hidden information could further reduce quality. Huang et al. [5] addressed this by combining intra-word mutual information with recurrent neural networks to generate Chinese poetry containing hidden information. Building on this, Tong [6] used long short-term memory networks to generate Chinese lyrics with higher embedding capacity. Although quality improved, advancing steganalysis techniques necessitated further optimization to enhance resistance against detection.
Recent developments in natural language processing (NLP) and text generation allow new models [7,8,9,10,11] to generate texts with arbitrary semantic content. Compared with traditional neural network models, these new models improve steganographic text quality and are no longer constrained by fixed templates, enhancing practicality. Zachary et al. [12] introduced a semantically secure steganographic mechanism using state-of-the-art models and arithmetic coding, achieving near-optimal statistical security. Huang et al. [13] developed a long short-term memory-based model implementing two embedding algorithms: fixed-length binary tree mapping and variable-length Huffman coding. While Huffman coding offers slightly lower embedding capacity, it favors high-probability candidate words, improving imperceptibility. Dai et al. [14] further enhanced imperceptibility using patient Huffman coding based on statistical regularities of fluent texts [13,15].
Huang et al. [2] observed a tradeoff between perceptual and statistical imperceptibility and proposed a variational autoencoder (VAE) combined with Huffman coding to improve both. Zhou et al. [16] leveraged adaptive probability distributions and generative adversarial networks (GANs) [17] to strengthen imperceptibility and security. Reliable transmission also requires shared knowledge between communicators. Yi et al. [18] proposed a BERT-based mechanism with Gibbs sampling, but limited long-sequence performance restricted embedding capacity. Cao et al. [19] enhanced text generation using plug-and-play models with an embeddable candidate pool, while Yan et al. [20] introduced lightweight disambiguation to improve statistical imperceptibility. Ding et al. [21] combined neural machine translation with algorithmic coding to generate high-quality semantic steganographic text.
Further improvements include those of Li et al. [22], who leveraged prompt engineering and knowledge graphs with large language models for content consistency, and Yang et al. [23], who integrated modification-based and generation-based mechanisms to enhance embedding capacity and steganalysis resistance. Zhang et al. [24] proposed controllable steganography using summary generation, improving semantic consistency. Bai et al. [25] mapped hidden information into semantic spaces using ontology trees, enhancing robustness. Huang et al. [26] applied adaptive dynamic grouping to minimize statistical differences between steganographic and natural text. Christian et al. [27] demonstrated that maximal information throughput requires minimal entropy coupling in perfectly secure systems. Huang et al. [28] used rejection sampling for approximate perfect steganography, while Pang et al. [29] applied frequency-reform-based softmax to reduce statistical discrepancies. Ding et al. [30] utilized distribution replicas to achieve near-theoretical absolute security. Wu et al. [31] developed LLM Stega, embedding encrypted messages via large language model interfaces while maintaining semantic quality. Lin et al. [32] proposed a zero-shot contextual learning method to enhance concealability, and Sun et al. [33] introduced theme-controlled graph-to-text steganography for improved quality and concealability.
To increase embedding capacity, Xiang et al. [34] refined generation units from words to letters, while Omer et al. [35] developed a word-level Arabic poetry mechanism using LSTM networks and Baudot coding. These studies collectively improved embedding capacity.
Despite these advances, generative linguistic steganography mechanisms still show vulnerabilities against active attacks. Some defensive measures exist, primarily for cover-based steganography [36,37]. While strategies for generative steganography have been proposed [38], they often overlook the adversary's perspective and oversimplify synonym substitution, weakening defense effectiveness against complex manipulations.
Table 1 summarizes the representative works and their key features.
3. Attack Model
Currently, generative linguistic steganography mechanisms follow a specific process. First, based on historical information (prompt), a candidate pool (CP) for the next position is constructed by a text generation model. Construction of the CP mainly relies on top-$k$ or top-$p$ sampling. Subsequently, an embedding algorithm, such as Huffman coding [2,13], arithmetic coding [21], or distributed copies [30], is employed to map hidden information to elements within the CP, thereby selecting the current output word. It is noteworthy that the CP's dimensionality is typically much smaller than the text generation model's vocabulary $V$, which helps maintain good readability of the generated steganographic text.
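To make the candidate pool construction concrete, the following Python sketch builds a top-$k$ CP with a Hugging Face GPT-2 model; the model choice, the value of $k$, and the helper name `candidate_pool` are illustrative assumptions rather than the exact configuration used in this work.

```python
# Illustrative sketch (not the authors' exact implementation): building a
# top-k candidate pool CP_i for the next position with GPT-2.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")   # assumed model choice
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def candidate_pool(history: str, k: int = 16):
    """Return the top-k next-token candidates and their probabilities."""
    input_ids = tokenizer(history, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(input_ids).logits[0, -1, :]   # scores over the whole vocabulary V
    probs = torch.softmax(logits, dim=-1)
    top_probs, top_ids = torch.topk(probs, k)        # CP_i is far smaller than |V|
    return [(tokenizer.decode([int(i)]), float(p)) for i, p in zip(top_ids, top_probs)]

# Example: inspect the candidate pool for a short prompt.
print(candidate_pool("The weather today is", k=8))
```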
Text tampering attacks involve replacing certain words in a text with other words. These attacks can be further categorized based on the similarity of the replacement words to the original words into two types: synonym substitution and non-synonym substitution. Among the latter, random tampering attacks can be considered a special case.
Based on the generative linguistic steganography process, we have redefined synonym substitution and random tampering attacks as in-domain synonym substitution attack (ISSA) and out-of-domain random tampering attack (ODRTA). Instead of depending on the similarity between tampered and original words, the distinction now relies on whether the tampered words are present in the current CP. When a tampered word exists in the CP, the attack is classified as ISSA and is considered stealthy. The other type is termed ODRTA, which will be further classified based on attack continuity in the following sections.
3.1. ISSA
To enhance the adversary’s ability beyond conventional synonym substitution attacks, we introduce the in-domain synonym substitution attack (ISSA).
We consider an active adversary who intercepts communication channels suspected of carrying steganographic texts. The adversary has full knowledge of the text generation model and the candidate pool construction strategy (e.g., top-$k$ or top-$p$ sampling) but cannot manipulate the generation system itself. Thus, the adversary is treated as a model-aware outsider rather than an insider. Since the adversary can access the same model and decoding strategy, the candidate pool at each step is reproducible.
Before launching an attack, the adversary may attempt to detect whether a message contains hidden information using human judgment or automated steganalysis (e.g., statistical tests or classifier-based methods). Once a text is identified as suspicious, the adversary selectively decides where to apply ISSA.
The adversary's tampering strategy is not arbitrary. At each step $i$, the adversary inspects the candidate pool $CP_i$ constructed by the generation model according to its decoding strategy. If $CP_i$ contains no semantically related alternatives, the position is left unchanged. Otherwise, when a candidate $w'_i \in CP_i$ is identified as a synonym of the original word $w_i$, the attacker replaces $w_i$ with $w'_i$. This selective, condition-triggered attack ensures that tampering is both in-domain (restricted to $CP_i$) and semantically consistent (maintaining synonymy). Unlike ordinary synonym replacement methods, which indiscriminately substitute words regardless of context or model constraints, this design significantly improves stealthiness and reduces the likelihood of immediate detection.
Based on the above description, ISSA can be expressed as follows:
$$S_i = \big\{\, w \in CP_i \setminus \{w_i\} \;:\; \| E(w) - E(w_i) \|_2 \le \epsilon \,\big\}, \qquad w'_i \in S_i,$$
where $w_i$ denotes the original word at step $i$, $CP_i$ is the candidate pool generated by the model, $E(\cdot)$ represents the word embedding function, $\|\cdot\|_2$ is the $L_2$ norm, and $\epsilon$ is a similarity threshold. Note that $S_i$ is a set of candidate words rather than a vector.
The parameter $\epsilon$ controls the trade-off between semantic fidelity and the number of substitution options. A small $\epsilon$ ensures high semantic consistency but yields fewer candidates, whereas a larger $\epsilon$ enlarges the tampering space at the risk of semantic drift. In practice, $\epsilon$ can be tuned on validation data to balance stealthiness and naturalness.
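The selection rule above can be illustrated with a short sketch. The embedding lookup `embed` and the threshold value are hypothetical placeholders; any word-embedding table shared by the attacker could be substituted.

```python
# Illustrative ISSA sketch: collect in-domain synonyms of w_i from CP_i using an
# L2 distance threshold epsilon over word embeddings (embed() is a placeholder).
import numpy as np

def issa_candidates(original_word, candidate_pool, embed, epsilon=0.5):
    """Return the set S_i of in-domain substitutes for original_word."""
    e_orig = embed(original_word)
    synonyms = []
    for w in candidate_pool:
        if w == original_word:
            continue
        if np.linalg.norm(embed(w) - e_orig) <= epsilon:  # semantic-closeness test
            synonyms.append(w)
    return synonyms

def issa_attack(original_word, candidate_pool, embed, epsilon=0.5):
    """Replace w_i with a random in-domain synonym, or leave it unchanged."""
    synonyms = issa_candidates(original_word, candidate_pool, embed, epsilon)
    return np.random.choice(synonyms) if synonyms else original_word
```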
A key feature of ISSA is the position-shift effect, which significantly enhances stealthiness. When the adversary substitutes $w_i$ with $w'_i$, the receiver may not detect the change at step $i$. Instead, the subsequent word $w_{i+1}$ may fall outside its corresponding candidate pool $CP_{i+1}$ (now conditioned on $w'_i$), leading the receiver to misattribute the tampering.
For example, suppose the original sequence is $(w_1, w_2, w_3)$. If the adversary replaces $w_2$ with a synonym $w'_2$ from $CP_2$, the original third word $w_3$ may no longer belong to the candidate pool $CP_3$ regenerated under $w'_2$. Consequently, the receiver detects inconsistency at position 3 rather than at position 2. This misalignment complicates both detection and recovery, making ISSA substantially more stealthy than conventional synonym substitution attacks.
Figure 1 illustrates the adversary intercepting the steganographic text and performing ISSA at position $i$.
3.2. ODRTA
From the description of ISSA, it is clear that the adversary must possess substantial knowledge. This includes an understanding of the text generation model and the candidate pool construction strategy. In ISSA, the adversary has significant attack capabilities but cannot manipulate the generation system itself. Therefore, an adversary employing ISSA can be best characterized as a model-aware outsider.
In contrast, attacks may also originate from ordinary external adversaries who have limited or no access to the underlying mechanisms and parameters. In such text-tampering attacks, these adversaries primarily target the availability and timeliness of hidden information. As a result, they are less constrained by detection concerns and may freely replace original words.
Traditional random tampering attacks (RTA) can be modeled as follows:
$$A_i = V \setminus \{w_i\},$$
where $A_i$ is the available set of tampering words at step $i$, $V$ is the entire vocabulary, and $w_i$ is the original steganographic word at step $i$.
Due to the requirement for readability in generative linguistic steganography tasks, candidate pools are generally small, e.g., containing 2, 8, or 16 words. Consequently, the probability of an adversary randomly selecting these words is significantly lower than the probability of choosing a word from the remaining vocabulary $V \setminus CP_i$. Therefore, we define the out-of-domain random tampering attack (ODRTA), where "random" refers to the selection of words from $V \setminus CP_i$. The ODRTA can be represented as follows:
$$A_i = V \setminus CP_i, \qquad |A_i| = |V| - |CP_i|,$$
where $|\cdot|$ denotes the number of elements in the set.
ODRTA can be categorized based on the continuity of the attacks into continuous ODRTA (CODRTA) and discontinuous ODRTA (DODRTA). Let $ST = (w_1, w_2, \ldots, w_n)$ denote the intercepted steganographic text, and let $p_j$ and $p_{j+1}$ be the indices of consecutive attack points. If $p_{j+1} - p_j = 1$, the attack is continuous, meaning no unmodified words exist between attack points. If $p_{j+1} - p_j > 1$, the attack is discontinuous.
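A minimal sketch of ODRTA and the continuity test follows; the vocabulary, candidate pool, and attack positions are illustrative placeholders.

```python
# Illustrative ODRTA sketch: the adversary replaces w_i with a word drawn from
# A_i = V \ CP_i, and consecutive attack points are classified by continuity.
import random

def odrta_attack(candidate_pool, vocabulary, rng=random):
    """Pick a tampering word uniformly from the out-of-domain set V \\ CP_i."""
    out_of_domain = [w for w in vocabulary if w not in set(candidate_pool)]
    return rng.choice(out_of_domain)

def classify_continuity(attack_indices):
    """Label each pair of consecutive attack points as continuous or not."""
    labels = []
    for prev, cur in zip(attack_indices, attack_indices[1:]):
        labels.append("continuous" if cur - prev == 1 else "discontinuous")
    return labels

# Example: positions 4 and 5 form a continuous (CODRTA) segment,
# while positions 5 and 9 form a discontinuous (DODRTA) pair.
print(classify_continuity([4, 5, 9]))   # ['continuous', 'discontinuous']
```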
Figure 2 illustrates a schematic representation of ODRTA. The upper half of Figure 2 represents a single instance of a DODRTA, while the lower half showcases a CODRTA.
4. Defense Mechanism
To defend against the two types of tampering attacks described above and to enhance the integrity and availability of hidden information at the receiving end, we propose two mechanisms: a proactive defense mechanism based on adaptive clustering for ISSA, and a post-hoc repair mechanism based on a context-oriented search strategy and the determinism of the text generation model for ODRTA. In addition to targeting different attacks, these two mechanisms also differ in their deployment locations. The proactive defense mechanism must be deployed at both the sending and receiving ends of the communication, whereas the post-hoc repair mechanism only needs to be deployed at the receiving end.
4.1. ISSA-Defense (ISSA-D)
According to the ISSA model, the construction mechanism and specific parameters of a steganographic text are known to the adversary. To avoid detection by both parties in the communication and to increase the difficulty of extracting the original hidden information, the adversary selects synonyms of the source words from the candidate pool generated by the text generation model at the current step as tampering words.
To enhance the defensive capability of generative linguistic steganography against ISSA, we propose a proactive defense mechanism based on an adaptive clustering algorithm. In this mechanism, the DBSCAN algorithm is chosen as the clustering algorithm.
First, the sending end inputs a prompt into the text generation model to initiate the steganographic task. Most existing text generation models calculate the probability of each word in the vocabulary $V$ appearing at the current position based on conditional probabilities. Subsequently, a candidate pool (CP) is constructed by selecting the top-$k$ words from the computed results using the top-$k$ sampling strategy. The prediction rules of the text generation model and the top-$k$ strategy are as follows:
$$p(w_i \mid w_1, \ldots, w_{i-1}) = \mathrm{softmax}\big(f_\theta(w_1, \ldots, w_{i-1})\big), \qquad CP_i = \operatorname*{arg\,top\text{-}k}_{w \in V}\; p(w \mid w_1, \ldots, w_{i-1}),$$
where $w \in V$ and $|CP_i| = k$.
Currently, generative linguistic steganography typically begins the embedding process immediately after obtaining the CP. To ensure that steganographic text can withstand ISSA, we perform adaptive clustering on the CP, constructing a new CP to be transmitted to the embedding unit. Let $CP_i$ denote the CP constructed by the text generation model at step $i$, and let $CP'_i$ refer to the CP transmitted to the embedding unit.
The adaptive clustering algorithm based on the DBSCAN algorithm is as follows:
$$\{C_1, C_2, \ldots, C_m\} = \mathrm{DBSCAN}\big(E(CP_i);\; \mathit{MinPts},\; \varepsilon\big),$$
where $\mathit{MinPts}$ and $\varepsilon$ are the crucial hyperparameters of DBSCAN. Here, $\mathit{MinPts}$ represents the minimum number of points required to form a dense region, while $\varepsilon$ denotes the maximum distance between two points to be considered in the same cluster.
To ensure that every element in $CP'_i$ can withstand ISSA, the minimum value of $\mathit{MinPts}$ is set to 2.
As the elements of $CP_i$ differ at each step, a static $\varepsilon$ cannot guarantee effective clustering results. Therefore, we introduce the K-NN algorithm to dynamically adjust $\varepsilon$, as follows:
$$D_j = \big\{\, d(x_j, x_l) : x_l \in E(CP_i),\; l \neq j \,\big\},$$
where $D_j$ represents the distances from all elements in $CP_i$ to data point $x_j$, while $d(x_j, x_l)$ denotes the distance from data point $x_j$ to data point $x_l$.
$$\varepsilon = \frac{1}{n}\sum_{j=1}^{n} \frac{1}{K}\sum_{x_l \in \mathrm{KNN}_K(x_j)} d(x_j, x_l),$$
where $n$ is the total number of words in $CP_i$ and $\mathrm{KNN}_K(x_j)$ represents the $K$ nearest words to the $j$-th word based on their distances.
After these operations, $CP_i$ is partitioned into different clusters, each containing at least two elements:
$$CP_i = C_1 \cup C_2 \cup \cdots \cup C_m.$$
A filtering function selects the element with the highest probability from each cluster $C_j$ to be included in $CP'_i$:
$$CP'_i = \big\{\, \hat{w}_j = \arg\max_{w \in C_j} p(w) \;:\; j = 1, \ldots, m \,\big\},$$
where $\hat{w}_j$ denotes the word with the highest probability in cluster $C_j$. This ensures perceptual imperceptibility of the steganographic text.
At this point, $CP'_i$ serves as the latest candidate pool, passed to the embedding unit for information hiding. A complete algorithmic flow for resisting ISSA is illustrated in Algorithm 1, where line 11 produces the final $CP'_i$ to be transmitted to the next processing unit. In line 9, the DBSCAN clustering algorithm is applied with parameters $(\varepsilon, \mathit{MinPts})$. Here, $\mathit{MinPts}$ is set to 2 to ensure that each cluster contains at least two candidate words, which establishes the lower bound necessary for the mechanism to resist ISSA attacks. The hyperparameters were chosen based on empirical experience and further tuned on the validation set to achieve meaningful clustering results while balancing cluster density and noise points.
Algorithm 1. Steganographic text generation against ISSA
01: INPUT: hidden information HI, historical information HIS
02: OUTPUT: steganographic text ST
03: HIS → text generation model, TGM
04: i = 1, OHI = HIS
05: while HI is not empty do
06:   candidate pool, CP_i ← TGM(HIS)
07:   distances ← K-NN(CP_i)
08:   ε ← mean(distances)
09:   clusters, C ← DBSCAN(CP_i, ε, MinPts = 2)
10:   candidate pool CP'_i = []
11:   for cluster in C do
12:     word ← cluster(max_prob)
13:     CP'_i.add(word)
14:   end for
15:   w_i ← Embedding(HI, CP'_i)
16:   HI ← update(HI)
17:   HIS ← HIS + w_i
18:   i ← i + 1
19: end while
20: ST ← HIS.minus(OHI)
21: Return ST
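For readers who prefer an executable form, the following Python sketch mirrors the clustering and filtering steps of Algorithm 1 (lines 07–14) using scikit-learn; the embedding function and the treatment of DBSCAN noise points are assumptions of this sketch, not details fixed by the paper.

```python
# Illustrative sketch of the ISSA-D filtering step (Algorithm 1, lines 07-14):
# adaptive eps from K-NN distances, DBSCAN with MinPts = 2, and selection of the
# highest-probability word per cluster. embed() is a placeholder; dropping DBSCAN
# noise points is a design choice of this sketch.
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.neighbors import NearestNeighbors

def filter_candidate_pool(cp_words, cp_probs, embed, knn_k=2):
    """Return CP'_i: one representative (max-probability) word per cluster."""
    if len(cp_words) < 2:
        return list(cp_words)
    X = np.stack([embed(w) for w in cp_words])

    # Adaptive eps: mean distance to the K nearest neighbors over the pool.
    nn = NearestNeighbors(n_neighbors=min(knn_k + 1, len(cp_words))).fit(X)
    distances, _ = nn.kneighbors(X)                 # column 0 is the zero self-distance
    eps = max(float(distances[:, 1:].mean()), 1e-8)

    labels = DBSCAN(eps=eps, min_samples=2).fit_predict(X)

    filtered = []
    for label in set(labels):
        if label == -1:                             # noise points are not kept
            continue
        members = [i for i, l in enumerate(labels) if l == label]
        filtered.append(cp_words[max(members, key=lambda i: cp_probs[i])])

    # If no usable cluster emerges, fall back to the most probable word.
    return filtered or [cp_words[int(np.argmax(cp_probs))]]
```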
Now, we discuss how the candidate pool $CP'_i$ helps both parties resist ISSA. Suppose the adversary performs an ISSA at the $i$-th word of the intercepted $ST$. According to the ISSA description, the word $w_i$ is altered to a synonym $w'_i$ of the original word. When $w'_i$ exists within the cluster $C_j$ containing the original word $w_i$, the receiving end can query the original word $w_i$ using Equation (11). The number of allowed alterations for each cluster $C_j$ is limited to its dimension minus one. Additionally, if the clustering operation does not yield multiple clusters, the word with the highest probability from $CP_i$ will be returned. This approach is aimed at enhancing the usability of the steganographic text while maintaining perceptual imperceptibility. The formulation is as follows:
$$w_i = Q(w'_i) = \arg\max_{w \in C_j} p(w), \qquad w'_i \in C_j,$$
where $w_i$ represents the original word, $w'_i$ represents the tampering word, and $Q(\cdot)$ denotes the query function related to Equation (11).
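A small sketch of the receiver-side query $Q(\cdot)$ is given below, under the assumption that the receiver has re-derived the same clusters and probabilities as the sender; the data layout (index lists per cluster) is an illustrative choice.

```python
# Illustrative receiver-side query Q (cf. Equation (11)): map a received word to
# the highest-probability representative of the cluster containing it. The
# receiver can rebuild identical clusters because it shares the model, the
# clustering setup, and the history with the sender.
def query_original(received_word, clusters, cp_words, cp_probs):
    """clusters: lists of indices into cp_words, e.g., as produced by DBSCAN."""
    for members in clusters:
        if received_word in (cp_words[i] for i in members):
            best = max(members, key=lambda i: cp_probs[i])
            return cp_words[best]                  # the originally transmitted word
    # No multi-element cluster contains the word: return the most probable word.
    best = max(range(len(cp_words)), key=lambda i: cp_probs[i])
    return cp_words[best]
```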
4.2. ODRTA-Defense (ODRTA-D)
When an adversary aims to compromise the usability and timeliness of hidden information without considering detection, they can implement an out-of-domain random tampering attack (ODRTA) on the steganographic text $ST$.
To recover such texts and enhance the integrity and usability of hidden information, we propose a post-hoc repair mechanism based on a context-oriented search strategy and the determinism of the text generation model.
Generative linguistic steganography effectively constitutes a deterministic system under conditions determined by the text generation model, the CP construction strategy, the hidden information, and the embedding algorithm. Specifically, this mechanism ensures that the same input always produces the same output. The representations of steganographic text generation and hidden information extraction are as follows:
$$ST = f_{emb}\big(G(\cdot;\, k),\; HI\big), \qquad HI = f_{emb}^{-1}\big(G(\cdot;\, k),\; ST\big),$$
where $f_{emb}$ denotes the embedding algorithm, $f_{emb}^{-1}$ denotes its inverse, $G$ denotes the text generation process used to construct the CP, $k$ is a hyperparameter, and $HI$ is the hidden information.
As $G$ is a deterministic system and both $f_{emb}$ and $f_{emb}^{-1}$ are bijective, generative linguistic steganography can also be regarded as a deterministic system. At step $i$, if the steganographic text $ST$ is tampered such that the original word $w_i$ is altered to $w'_i$, then, provided the information before step $i$ is correct, the generation system $G$ outputs the correct $CP_i$ at step $i$. However, as $w'_i \notin CP_i$, the inverse mapping function $f_{emb}^{-1}$ cannot proceed, interrupting the extraction process:
$$f_{emb}^{-1}\big(CP_i,\, w'_i\big) = \varnothing, \qquad w'_i \notin CP_i.$$
Without auxiliary information, the receiving end can only repair the steganographic text via enumeration. Introducing auxiliary information to assist in the repair process is therefore a natural approach. The receiving end can reasonably assume that the steganographic text following the tampering point is correct and use it as auxiliary information.
Under this assumption, the repair mechanism uses the untampered words following the tampering point and employs the text generation model to predict $1 + \beta$ steps backward from the last correct position, where $\beta$ denotes the number of auxiliary words, enumerating all possible outcomes. The last $\beta$ predicted words are then used as search tokens to query the text samples matching these words; these tokens correspond to the $\beta$ nearest words following the tampered word in $ST$. After the repair process, the first word following the last correct position is restored to the original word $w_i$.
Algorithm 2 describes the workflow for repairing steganographic text affected by a DODRTA. In line 15, based on the historical information (HIS), the algorithm predicts $1 + \beta$ steps backward, where the first step represents the possible original word at the tampering point and the subsequent steps provide the auxiliary information necessary for the repair. Once the predictions are completed, the algorithm combines the historical information with the auxiliary information predicted in the first step. Leveraging the determinism of the steganographic mechanism and a context-oriented search strategy, the algorithm filters out the original word from the possible candidates at the tampering point (lines 16–18).
Algorithm 2. Extracting from DODRTA-affected ST
01: INPUT: steganographic text ST, historical information HIS
02: OUTPUT: hidden information HI
03: ST ← ST.minus(HIS)
04: HI = []
05: HIS → text generation model, TGM
06: for i in range(0, len(ST)) do
07:   candidate pool, CP_i ← TGM(HIS)
08:   if ST[i] in CP_i do
09:     temp ← Extraction(ST[i])
10:     HI.add(temp)
11:     HIS ← HIS + ST[i]
12:   else do
13:     // Predict 1 + β positions backward and query the path of ST[i + 1 : i + 1 + β]
14:     candidate pools, CPs ← TGM(HIS, 1 + β)
15:     if ST[i + 1 : i + 1 + β] in CPs do
16:       original word, ow ← matched word in CP_i
17:       temp ← Extraction(ow + ST[i + 1 : i + 1 + β])
18:       HI.add(temp)
19:       HIS ← HIS + ow + ST[i + 1 : i + 1 + β]
20:     end if
21:   end if
22: end for
23: Return HI
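The tampered-word branch of Algorithm 2 can be sketched as follows. The helper `candidate_pool(history, k)` is assumed to be the deterministic top-$k$ routine sketched earlier, and the argument names and defaults are illustrative.

```python
# Illustrative sketch of the tampered-word branch of Algorithm 2: enumerate the
# candidate pools 1 + beta steps ahead and keep the candidate whose path
# reproduces the beta untampered words that follow the tampering point.
def repair_dodrta(history_words, st_suffix, candidate_pool, k=2, beta=2):
    """history_words: correct words before the tampering point.
    st_suffix: the beta received words right after the tampered position.
    Returns the recovered original word, or None if no path matches."""
    pool0 = [w for w, _ in candidate_pool(" ".join(history_words), k)]
    for w0 in pool0:                                # possible original word
        context = list(history_words) + [w0]
        path_ok = True
        for expected in st_suffix[:beta]:           # auxiliary words must be reproducible
            pool = [w for w, _ in candidate_pool(" ".join(context), k)]
            if expected not in pool:
                path_ok = False
                break
            context.append(expected)
        if path_ok:
            return w0                               # determinism pins down w0
    return None
```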
Repairing steganographic text $ST$ affected by a CODRTA requires more auxiliary knowledge. Let $c$ represent the continuity of the tampered words and $\beta$ denote the amount of required auxiliary information. Based on $c$ and $\beta$, the text generation model predicts $c + \beta$ steps backward. The repair process is expressed as follows:
$$\{CP_i, CP_{i+1}, \ldots, CP_{i+c+\beta-1}\} = G(HIS,\; c + \beta), \qquad W = R\big(\mathrm{Concat}(CP_i, \ldots, CP_{i+c+\beta-1}),\; T\big),$$
where $G(HIS, c + \beta)$ denotes the data predicted backward by $c + \beta$ steps, $\mathrm{Concat}(\cdot)$ combines these data samples, $\{CP_i, \ldots, CP_{i+c+\beta-1}\}$ collects all candidate pools from step $i$ to step $i + c + \beta - 1$, $R(\cdot)$ is the retrieval function for matching the retrieval tokens $T$, and $W$ is the set of original words to be restored. Algorithm 3 describes the workflow for repairing steganographic text affected by a CODRTA.
Algorithm 3. Extracting from CODRTA-affected ST
01: INPUT: steganographic text ST, historical information HIS
02: OUTPUT: hidden information HI
03: ST ← ST.minus(HIS)
04: HI = []
05: HIS → text generation model, TGM
06: while ST is not empty do
07:   candidate pool, CP ← TGM(HIS)
08:   if ST[0] in CP do
09:     temp ← Extraction(ST[0])
10:     HI.add(temp)
11:     HIS ← HIS + ST[0]
12:     ST ← ST.minus(ST[0])
13:   elif ST[0] not in CP do
14:     candidate pools, CPs ← TGM(HIS, 1 + β)
15:     if ST[1 : 1 + β] in CPs do
16:       go to DODRTA
17:     else do
18:       candidate pools, CPs ← TGM(HIS, c + β)
19:       if ST[c : c + β] in CPs do
20:         original words, W ← matched path in CPs
21:         temp ← Extraction(W + ST[c : c + β])
22:         HI.add(temp)
23:         HIS ← HIS + W + ST[c : c + β]
24:         ST ← ST.minus(ST[0 : c + β])
25:       end if
26:     end if
27:   end if
28: end while
29: Return HI
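A corresponding sketch for the CODRTA case enumerates every length-$c$ path through the predicted candidate pools and validates it against the $\beta$ untampered words that follow; as before, `candidate_pool` and the parameter defaults are assumptions of this sketch.

```python
# Illustrative CODRTA repair sketch: enumerate every length-c word path allowed
# by the candidate pools after the last correct position, then validate it with
# the beta untampered words that follow, mirroring the search in Algorithm 3.
def repair_codrta(history_words, st_suffix, candidate_pool, k=2, c=2, beta=4):
    """Recover the c consecutive original words; st_suffix holds the beta words
    received after the tampered segment. Returns a word list or None."""
    def paths(context, depth):
        # Yield every word sequence of the given depth permitted by the pools.
        if depth == 0:
            yield []
            return
        for w, _ in candidate_pool(" ".join(context), k):
            for rest in paths(context + [w], depth - 1):
                yield [w] + rest

    for guess in paths(list(history_words), c):     # candidate original segment
        context = list(history_words) + guess
        ok = True
        for expected in st_suffix[:beta]:           # auxiliary suffix must match
            pool = [w for w, _ in candidate_pool(" ".join(context), k)]
            if expected not in pool:
                ok = False
                break
            context.append(expected)
        if ok:
            return guess
    return None
```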
5. Experiments
This section presents the evaluation metrics commonly used in generative linguistic steganography, analyzes the feasibility of the proposed mechanisms, and compares them with several existing steganographic approaches.
5.1. Evaluation Metrics
The evaluation of generative linguistic steganography typically relies on two key dimensions: perceptual imperceptibility and statistical imperceptibility.
Perceptual imperceptibility refers to the readability of the generated steganographic text, including semantic clarity and logical coherence. For fairness, we employ standard metrics commonly used in natural language processing [39] to assess perceptual imperceptibility. The primary evaluation metrics are perplexity ($PPL$) and Kullback–Leibler divergence ($KL$ divergence):
$$PPL = \exp\Big(-\frac{1}{n}\sum_{i=1}^{n} \log p(w_i \mid w_1, \ldots, w_{i-1})\Big), \qquad KL(P_t \,\|\, P_g) = \sum_{x} P_t(x) \log \frac{P_t(x)}{P_g(x)},$$
where $n$ denotes the number of words in the steganographic text $ST$, $P_t$ represents the true data distribution obtained from training, and $P_g$ is the statistical distribution of the generated text.
In addition, we also adopt diversity ($Div$), grammatical score ($Gram$), and descriptiveness ($Des$) as auxiliary metrics [40] for assessing perceptual imperceptibility:
$$Div = \frac{|\mathrm{set}(ST)|}{n}, \qquad Des = \frac{|N| + |Adj| + |Adv|}{n},$$
where $n$ denotes the number of words in the steganographic text and $\mathrm{set}(\cdot)$ ensures that only unique words are counted. Specifically, $N$, $Adj$, and $Adv$ represent nouns, adjectives, and adverbs, respectively.
Statistical imperceptibility refers to the ability of the generated steganographic text to evade detection by steganalysis tools. We evaluate statistical imperceptibility using three widely adopted steganalysis models: FCN [41], TS-RNN [42], and HiduNet [43]. The corresponding evaluation metrics are accuracy ($A$), precision ($P$), $F_1$ score ($F_1$), and recall ($R$):
$$A = \frac{TP + TN}{TP + TN + FP + FN}, \quad P = \frac{TP}{TP + FP}, \quad R = \frac{TP}{TP + FN}, \quad F_1 = \frac{2PR}{P + R},$$
where $TP$, $TN$, $FP$, and $FN$ denote the numbers of true positives, true negatives, false positives, and false negatives, respectively.
To assess the resilience of generative linguistic steganography against active tampering attacks, we adopt three qualitative binary metrics: attack detection capability, which measures the ability to detect actual manipulation points; attack identification capability, which evaluates the effectiveness in recognizing the type of attack; and repair capability, which quantifies the ability to restore the steganographic text and correctly extract hidden information.
Furthermore, we introduce the fault tolerance rate ($FTR$) to quantify robustness against active attacks. $FTR$ is defined as the maximum proportion of manipulation points allowed in the steganographic text, beyond which hidden information can no longer be correctly extracted:
$$FTR = \max \frac{1}{n} \sum_{i=1}^{n} \mathbb{1}(w_i \rightarrow w'_i) \quad \text{s.t. } HI \text{ is correctly extracted},$$
where $\mathbb{1}(w_i \rightarrow w'_i)$ denotes that the point is under attack.
Finally, embedding capacity is also an important evaluation criterion. It typically refers to the maximum amount of hidden information that can be embedded into a given carrier. As most generative steganographic methods convert hidden information into bit streams, bits per word ($BPW$) is used as the metric:
$$BPW = \frac{|HI|}{n},$$
where $HI$ denotes the hidden information (in bits) and $n$ is the number of words in the steganographic text.
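As a worked illustration of the simpler metrics, the snippet below computes diversity, descriptiveness (with part-of-speech tags supplied externally), and bits per word; the exact formulas used in the paper may differ in detail.

```python
# Illustrative metric sketch: diversity (unique-word ratio), descriptiveness
# (share of nouns, adjectives, and adverbs, with POS tags supplied externally),
# and bits per word.
def diversity(words):
    return len(set(words)) / len(words)

def descriptiveness(pos_tags):
    """pos_tags: one coarse tag per word, e.g. 'NOUN', 'ADJ', 'ADV', 'VERB', ..."""
    descriptive = sum(tag in {"NOUN", "ADJ", "ADV"} for tag in pos_tags)
    return descriptive / len(pos_tags)

def bits_per_word(hidden_bits, words):
    return len(hidden_bits) / len(words)

stego = "the quick brown fox jumps over the lazy dog".split()
tags = ["DET", "ADJ", "ADJ", "NOUN", "VERB", "ADP", "DET", "ADJ", "NOUN"]
print(diversity(stego), descriptiveness(tags), bits_per_word("101101", stego))
```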
5.2. Feasibility Analysis
The feasibility analysis aims to evaluate whether the proposed defense mechanisms effectively enhance the integrity and availability of hidden information at the receiving end. We further analyze the performance of different steganographic mechanisms under the two proposed types of attacks.
To rapidly validate the feasibility and usability of the proposed mechanisms, we implemented an ISSA-D steganographic mechanism using the adaptive DBSCAN algorithm and generated 6000 steganographic texts. In constructing the candidate pool, we adopted the top-$k$ strategy with $k$ = 8, 16, and 32. For ODRTA evaluation, we constructed 4000 data points with a candidate pool dimension of 2, among which 2000 were used for measuring DODRTA-D and the remainder for measuring CODRTA-D-2. Here, the subscript "2" indicates that the maximum number of consecutive manipulations is two. To ensure generalizability, GPT-2 [44] was selected as the underlying text generation model.
For ISSA-D and ODRTA-D, we evaluated attack detection capability, attack identification capability, and repair capability. The results are presented in Table 2. The findings demonstrate that ISSA-D is capable of successfully extracting hidden information from steganographic texts affected by ISSA, whereas other mechanisms fail. Similarly, only ODRTA-D achieves successful extraction from texts affected by ODRTA. This difference arises because ODRTA introduces tampered words from outside the current candidate pool, allowing other methods to detect tampering points based on the text generation model and candidate pool strategy. In contrast, ISSA operates more covertly, concealing tampering points within synonyms and preventing other steganographic mechanisms from identifying them.
The complexity analyses and correctness/security proofs of the proposed ISSA-D and ODRTA-D mechanisms are detailed in Appendix A (complexity analysis) and Appendix B (correctness and security analysis).
We further used the fault tolerance rate ($FTR$) to quantify the performance of ISSA-D, DODRTA-D, and CODRTA-D-2, with the results reported in Table 3. In the table, $k$ denotes the candidate pool dimension under the top-$k$ strategy, while $c$ indicates the continuity of the attack. As ISSA-D does not require auxiliary information, its $FTR$ reaches up to 100%, meaning ISSA can occur at any position and affect any number of words in the steganographic text. By contrast, the repair processes of DODRTA-D and CODRTA-D-2 rely on subsequent auxiliary information, thereby limiting their tolerance to extensive manipulations.
In theory, as long as at least one untampered word remains, tampering points can be repaired, so we initially set $\beta = 1$. However, in practice, when tampering occurs near the end of the steganographic text, accurate recovery may not be possible. This is due to the iterative nature of the repair process: as the computational sequence grows longer, the computational load increases exponentially, reducing precision and hindering retrieval of the original candidate pool. To mitigate this, we set $\beta = 2$. With $\beta = 2$, DODRTA-affected steganographic texts can be fully and accurately repaired. For CODRTA-2, we further increased the value of $\beta$; when $\beta = 4$, the receiver was also able to extract the hidden information completely and correctly.
Therefore, we conclude that, to ensure full and accurate recovery of hidden information, the amount of auxiliary information following tampered points must be at least twice the number of tampered words.
5.3. Perceptual Imperceptibility Analysis
To validate the perceptual imperceptibility of the steganographic text generated by the proposed method, particularly that produced by ISSA-D, a quantitative assessment was conducted. The dimensions of the candidate pool were set to 8, 16, and 32. A comparison was made among RNN-stega [13], VAE-stega [2], and Discop [30], with RNN-stega [13] serving as the baseline. Discop [30], a theoretically secure steganographic mechanism, demonstrated superior perceptual imperceptibility. Under each dimension of the CP, we generated 2000 data samples for each comparative subject under identical conditions of hidden information and prompt.
In evaluating the generative quality of steganographic texts, perceptual imperceptibility was one of the key metrics. This metric primarily referred to the extent to which the generated steganographic text aligned with human writing habits while exhibiting logical coherence and semantic clarity. To objectively assess the perceptual imperceptibility of steganographic texts produced by the four mechanisms, we employed the BERT [10] model as an evaluation tool. The comparison results are presented in Table 4.
From Table 4, it can be observed that the steganographic text generated by ISSA-D exhibited superior generation quality compared with the baseline [13] on the primary quality metrics across different dimensions, although it remained lower than that of VAE-stega [2] and Discop [30]. Among the auxiliary metrics, VAE-stega [2] performed best on one, followed by ISSA-D, whereas ISSA-D led on another, with Discop [30] coming in second. To quickly validate the feasibility of ISSA-D, we employed a perfect binary tree for the embedding algorithm, while Discop [30] utilized distributed copies, and both VAE-stega [2] and RNN-stega [13] implemented Huffman coding. In contrast, the latter two embedding algorithms were more effective in enhancing the perceptual imperceptibility of steganographic text.
Additionally, regarding the $BPW$ metric, the embedding capacity of ISSA-D was significantly lower than that of the other three mechanisms. This was because, to ensure the steganographic text could withstand ISSA, certain elements in the candidate pool had to be sacrificed. Furthermore, the candidate pools in Table 4 were obtained through top-$k$ or top-$p$ filtering, with ISSA-D performing additional filtering on the resulting candidate pool. Notably, ISSA-D, RNN-stega [13], and VAE-stega [2] all used a top-$k$ strategy, which allowed for static candidate pool dimensions, whereas Discop [30] adopted a top-$p$ strategy, representing a dynamic candidate pool construction method.
5.4. Statistical Imperceptibility Analysis
A fair assessment of statistical imperceptibility should be conducted under identical $BPW$ conditions. Given that the $BPW$ of RNN-stega [13], VAE-stega [2], and Discop [30] was significantly higher than that of ISSA-D, we increased the dimensions of the initial candidate pool to enhance the $BPW$ of ISSA-D. After adjustments, the $BPW$ values of the newly generated data for ISSA-D were 1.1, 1.9, 2.4, and 2.7, respectively.
In evaluating the generative quality of steganographic texts, statistical imperceptibility was another important metric, primarily measuring whether the generated steganographic texts could effectively evade detection attacks from steganalysis tools. To verify the statistical imperceptibility of the steganographic text, we generated 2000 steganographic texts for each of the four mechanisms under each of the four conditions, resulting in a total of 32,000 texts being analyzed.
Figure 3 demonstrates the resilience of steganographic texts against steganalysis using the FCN [41] model. The results indicate that VAE-stega [2] performed best in accuracy ($A$), precision ($P$), and $F_1$ score ($F_1$), averaging 0.75%, 0.25%, and 0.5% better than the second-best method (ISSA-D), respectively. In contrast, Discop [30] excelled in recall ($R$), outperforming the proposed method by an average of 11.75%.
Figure 4 illustrates the resilience of steganographic texts against steganalysis using the TS-RNN [42] model. The performance of all four mechanisms in accuracy was relatively poor, yet all values exceeded 90%. In precision, the proposed method improved with increasing $BPW$, with gains of up to 5%. In recall, Discop [30] performed the best, followed by the proposed method, which averaged 5.75% higher than Discop [30]. For the $F_1$ score, VAE-stega [2] led, followed by Discop [30], the proposed method, and RNN-stega [13].
Figure 5 demonstrates the resilience of steganographic texts against steganalysis using the HiduNet [43] model. The results show that VAE-stega [2] outperforms the other three methods across all four metrics, with Discop [30] following closely. The proposed method averages 11.75, 11, 12.5, and 11.5 percentage points higher than VAE-stega [2] in the respective metrics. It is important to note that, for steganographic text creators, lower values of these four metrics indicate better statistical imperceptibility of the generated steganographic text.
In summary, compared with the baseline model RNN-stega [13], the steganographic texts generated by ISSA-D demonstrated superior capabilities in resisting statistical analysis. However, when compared with the more advanced models Discop [30] and VAE-stega [2], ISSA-D still shows potential for further improvement.
Table 5 presents the performance of steganographic texts generated by GPT-2 [44] under the condition of 1 $BPW$ across four steganalysis tools. These steganographic texts were primarily used to evaluate the statistical imperceptibility of ODRTA-D.
6. Case Studies
The internet of things (IoT) [45] faces numerous shortcomings, including low data security and inadequate privacy protection. These issues render IoT systems vulnerable to attacks, exposing them to both passive and active threats, particularly man-in-the-middle (MITM) attacks and zero-day attacks that compromise the availability of transmitted data [46]. Transmitting confidential information [47] in such insecure communication environments poses significant risks, potentially leading to the disclosure of users' personal information [48]. Thus, the introduction of protective measures is essential. Compared with conventional encryption systems, covert communication systems can conceal hidden information during transmission, making them more suitable for resource-constrained communication devices. Consequently, implementing generative linguistic steganography mechanisms in IoT environments is crucial. However, existing linguistic steganography mechanisms remain inadequate against MITM attacks. Therefore, we conducted a study to enhance the integrity and availability of hidden information at the receiving end. Our proposed method was tested for usability and effectiveness across various IoT environments.
6.1. Wi-Fi or Bluetooth
Consider a scenario in which Alice and Bob are employees of a company. Alice intended to transmit confidential information to Bob via the company's Wi-Fi [49,50] or Bluetooth [51,52]. However, due to the inherent vulnerabilities of Wi-Fi and Bluetooth transmission, another employee, Eve, could intercept the data packets. To obscure the visibility of the confidential information, Alice used a generative steganographic mechanism and transmitted a readable steganographic message to Bob.
Assuming that the steganographic mechanism was shared within the company, Eve could exploit this mechanism to decipher the confidential information transmitted by Alice. At this juncture, Eve not only sought to acquire the confidential information but also aimed to prevent Bob from interpreting it. In this context, Eve executed an ISSA attack on the steganographic text.
Figure 6 illustrates information transmission in wireless communication networks (e.g., Wi-Fi and Bluetooth) and an MITM attack via the in-domain synonym substitution attack (ISSA). In the left half, Alice constructs a steganographic text using the proposed ISSA-D method, with the currently selected word highlighted in red and a marker indicating the i-th attack conducted by Eve. During testing, Eve can tamper with any position, though the example shows only a specific instance for illustration. Ultimately, Alice transmits the steganographic text to Bob, while Eve intercepts and modifies it according to ISSA, producing an altered text, as shown in the right half of the figure. If Bob lacks defensive mechanisms, he will be unable to extract the hidden information from the altered text. The initial candidate pool displayed in Figure 6 is also assumed to be known to Eve.
6.2. Internet of Medical Things (IoMT)
In the internet of medical things (IoMT) [
53], data security issues have also arisen [
54,
55]. For instance, when hospitals use steganography to conceal confidential information within local databases through text covers such as case reports or patient feedback, a zero-day attack on the database could compromise its availability. In such cases, a steganographic mechanism without recovery measures would be unable to extract hidden confidential information from the compromised text.
In Figure 7a, S1 represents the steganographic text stored in the local database, with the assumed tampering points highlighted in red, while the blue text indicates the auxiliary information required for tampering recovery. The figure illustrates both the construction of the steganographic text and the DODRTA process, resulting in the transformation of S1 into a tampered version. It is important to note that, as the adversary aims solely to corrupt the data, the candidate pool remains unknown to them.
6.3. Industrial Internet of Things (IIoT)
In industrial control environments [56,57], long-distance communication from the data acquisition end to the data processing end is particularly susceptible to tampering. Transmitting collected data in plaintext is highly insecure. Conversely, ciphertext transmission may attract the attention of adversaries, increasing the likelihood of interception. Therefore, introducing steganography into industrial control environments is essential.
Figure 7b shows a different scenario, where linguistic steganography is applied in the industrial internet of things and a continuous out-of-domain random tampering attack (CODRTA) is launched on all data. If the data processing end lacks a configured recovery mechanism, the chances of extracting hidden information are significantly reduced, leading to a loss of data availability. Here too, the candidate pool remains unknown to the adversary.
These experiments simulate real-world IoT and IoMT scenarios where active adversaries can tamper with steganographic texts without access to the candidate pool. The key observation is that, without recovery or defense mechanisms, both DODRTA and CODRTA can severely compromise the integrity and availability of hidden information, emphasizing the necessity of the proposed tampering-recovery strategies.
7. Conclusions
Current AI-based linguistic steganography mechanisms often fail to withstand active attacks. To address this, we proposed two novel text tampering models, ISSA and ODRTA, distinguished by whether the tampered words exist in the candidate pool, with ODRTA further divided into discontinuous and continuous variants. Correspondingly, we designed targeted defense mechanisms: proactive adaptive clustering for ISSA and post-hoc context-oriented repair for ODRTA. These approaches effectively counter ISSA, CODRTA, and DODRTA, significantly improving the integrity and usability of hidden information at the receiving end.
Compared with existing methods such as VAE-stega and Discop, our framework introduces innovative attack models and effective defenses, explicitly addressing candidate-pool and out-of-domain tampering while maintaining reliable steganographic communication under adversarial conditions.
Despite these improvements, the generated steganographic texts still exhibit limited perceptual imperceptibility, especially at higher bits per word. Future work will focus on enhancing imperceptibility while preserving resilience, and extending the approach to other data covers such as images and videos to improve practical usability in IoT environments.