EGAN: Encrypting GAN Models Based on Self-Adversarial

Zhu, Yujie; Li, Wei; Jiang, Yuhang; Huang, Yanrong; Fang, Faming

doi:10.3390/math13244008

Open AccessArticle

EGAN: Encrypting GAN Models Based on Self-Adversarial

by

Yujie Zhu

¹,

Wei Li

²

,

Yuhang Jiang

²,

Yanrong Huang

³

and

Faming Fang

^1,*

¹

School of Computer Science and Technology, East China Normal University, Shanghai 200062, China

²

School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi 214122, China

³

Department of Computer Science, University of Jaen, 23071 Jaen, Spain

^*

Author to whom correspondence should be addressed.

Mathematics 2025, 13(24), 4008; https://doi.org/10.3390/math13244008

Submission received: 2 December 2025 / Revised: 11 December 2025 / Accepted: 11 December 2025 / Published: 16 December 2025

(This article belongs to the Special Issue Information Security and Image Processing)

Download

Browse Figures

Versions Notes

Abstract

The increasing prevalence of deep learning models in industry has highlighted the critical need to protect the intellectual property (IP) of these models, especially generative adversarial networks (GANs) capable of synthesizing realistic data. Traditional IP protection methods, such as watermarking model parameters (white-box) or verifying outputs (black-box), are insufficient against non-public misappropriation. To address these limitations, we introduce EGAN (Encrypted GANs), which secures GAN models by embedding a novel self-adversarial mechanism. This mechanism is trained to actively maximize the feature divergence between authorized and unauthorized inputs, thereby intentionally corrupting the outputs from non-key inputs and preventing unauthorized operation. Our methodology utilizes key-based transformations applied to GAN inputs and incorporates a generator loss regularization term to enforce model protection without compromising performance. This technique is compatible with existing watermark-based verification methods. Extensive experimental evaluations reveal that EGAN maintains the generative capabilities of original GAN architectures, including DCGAN, SRGAN, and CycleGAN, while exhibiting robust resistance to common attack strategies such as fine-tuning. Compared with prior work, EGAN provides comprehensive IP protection by ensuring unauthorized users cannot achieve desired outcomes, thus safeguarding both the models and their generated data.

Keywords:

GAN; encryption; intellectual property; model protection; data security

MSC:

68M25

1. Introduction

Intellectual property (IP) is a concept used to protect innovation within a legal framework and possesses both moral and commercial value. In industry, acquiring real-world data for many practical tasks can be challenging. Deep learning models, particularly Generative Adversarial Networks (GANs) [1], serve to address the issue of insufficient data by generating new instances, and they have been widely applied in many fields such as industry [2,3] and medicine [4,5]. The substantial costs associated with collecting training data for GANs (e.g., tracking objects over time or purchasing expensive sensors), coupled with the significant computational resources required for model training (e.g., 512 core Google TPUv3 for training BigGAN [6]), contribute to the high value of the trained models.

Many researchers have attempted to claim IP rights by embedding specific watermarks in the parameters or output data of the models. For example, recent studies demonstrate various watermarking techniques: a fragile watermark can authenticate GAN integrity with

100 %

detection on ProGAN/DCGAN after only 1-epoch fine-tuning [7]; box-free ownership verification using the discriminator’s feature hypersphere already achieves ≥ 75% AUC against

40 %

pruning or heavy JPEG/noise [8]; and a joint black-/white-box scheme embeds visual logos (SSIM ≥ 0.9) and 56-byte signatures (BER = 0) into DCGAN/SRGAN/CycleGAN while surviving 30-epoch fine-tuning and watermark-overwriting attacks [9]. Despite such progress, the dual capability of integrity certification and box-free, trigger-free verification remains unexplored in current literature, a void that this study intends to fill.

Existing methods are generally divided into black-box [10,11] and white-box approaches [12,13]. Black-box strategies refer to embedding digital watermarks into the input, which triggers the model to produce specific watermarked outputs. However, these methods require the copyright owner to interact with the suspect model via input-output queries. Consequently, misappropriation of non-public models cannot be verified, as external access is restricted. White-box methods embed the digital watermark into the parameters of the model. Ownership is verified by accessing these parameters and identifying specific characteristics (e.g., positive and negative signs that can be encoded as specified information). Consequently, white-box verification necessitates full access to the suspect model and its parameters. In practical scenarios, obtaining such comprehensive access is often infeasible.

Recently, the study of GAN-IPR [9] model combines the above two methods to protect IP of GAN. The complete method endows the generator with dual capabilities for both black-box and white-box verification. Specifically, in the black-box setting, the model produces watermarked outputs upon receiving a specific trigger set. Simultaneously, for white-box verification, ownership information is encoded via the signs(positive/negative) of specific model parameters. Despite these advances, GAN-IPR still faces significant practical challenges in robustly protecting the IP of GAN. First, it is difficult to detect infringements every time. For example, infringers obtain profits by providing MLaaS services through online APIs, while concealing their activities by frequently changing virtual proxy IP addresses and domain names. It is impossible to monitor all the websites on the Internet daily. Second, even for suspected infringements that have been discovered, the process of collecting evidence and prosecuting for legal proceedings would be time-consuming and economically costly. Third, once some extremely important generated data or models are stolen, it would result in serious consequences. As mentioned in [14], model theft may implicate national security issues, where the reactive nature of ‘law enforcement after discovery’ often leads to irreversible consequences due to the inherent temporal lag. Furthermore, under the influence of objective force majeure factors such as cross-border and COVID-19 epidemics, it is difficult to safeguard IP through laws. Furthermore, the conditions for some models to reach Nash equilibrium through training are very harsh, such as GANs and their variants. The infringer may steal a well-trained model solely for offline, unauthorized research without exposing any public interface, rendering detection impossible. The problems are particularly acute in GANs, because once the generator is stolen, the data can be generated indefinitely.

In order to solve these problems, we propose a novel end-to-end encrypted training method based on a self-adversarial strategy. Here, ‘self-adversarial’ implies that the model is optimized to act as an adversary against itself when the input is unauthorized, effectively destroying the generation quality by maximizing feature distance loss. This mechanism ensures that unauthorized inference yields degraded outputs, thereby rendering stolen model files practically useless. At the same time, this security measure incurs no performance penalty for authorized owners or licensees. Our method is also compatible with the methods proposed in [9], preserving the capability for black-box and white-box watermarking as a complementary forensic measure for legal recourse.

We experimentally validate our proposed method. Figure 1 illustrates the effect of generating images for authorized, watermark-triggered, and unauthorized input. Notably, images generated from unauthorized inputs exhibit significant feature corruption. As shown in Table 1, our method renders the DCGAN incapable of producing meaningful data under unauthorized conditions, resulting in drastically reduced classification accuracy. Conversely, authorized inputs yield classification performance comparable to the standard DCGAN baseline. In essence, our encryption training method ensures that any unauthorized use of the stolen generator produces severely degraded samples, causing general-purpose classifiers to exhibit high error rates. This visual effect is further exemplified in Figure 2. Table 2 shows the classification error rate of image data generated by unauthorized persons using the generator with and without encryption training. This serves as a concrete implementation of model IP protection. In addition, we verify the method’s robustness against common adversarial threats, such as fine-tuning and overwriting attacks. Our method can coexist with existing property rights claim methods, which can comprehensively protect the intellectual property rights of neural network models.

Our contributions are shown as follows.

Unlike existing IP protection methods that rely on passive post hoc verification—which permits stolen models to function unimpeded—EGAN (Encrypted GANs) introduces an active protection paradigm. Our method ensures that unauthorized inference yields structurally corrupted outputs, thereby rendering the stolen model functionally unserviceable without the correct key.
We propose a novel self-adversarial feature-separation mechanism. This compels the generator to learn disjoint distributions for authorized and unauthorized inputs, intrinsically embedding the protection into the model’s weights rather than relying on external wrappers or easily removable tags.
Whereas conventional watermarking techniques are often susceptible to ambiguity attacks or can be erased via fine-tuning, EGAN demonstrates superior robustness. Our experiments show that the corruption mechanism persists even after significant parameter modifications (e.g., fine-tuning), providing a level of security that outperforms contemporary model watermarking schemes.

2. Related Work

Currently, there are two main methods for verifying the property rights of CNN models: white-box and black-box verification methods. The former requires access to the complete model parameters to verify the property rights, while the latter relies solely on specific input values and outputs to complete the verification. Uchida et al. [14] first proposed the embedding of digital watermarks into CNN models as a white-box verification method. E. Le Merrer et al. [15] proposed a black-box verification method that maps specific user inputs to designated output categories to authenticate ownership. Y. Adi et al. [16] presented a theoretical analysis of this black-box verification as a designed back-dooring. Moreover, studies such as [2,12,13,17] have demonstrated ways to improve the robustness of embedded watermarks, such as resistance to ambiguity attacks, fine-tuning, and pruning.

For GANs and their variants, Ding et al. [9] proposed a set of IP verification methods, including both white-box and black-box verification. The black-box approach induces the generator to synthesize watermarked images upon receiving specific trigger inputs, while the white-box approach specifically encodes the symbols of the BatchNorm layers in the model. Yuan et al. [7] introduce the first fragile watermark for GANs: a two-stage fine-tuning forces the generator to over-fit a secret label to a pre-defined watermark image, so any later parameter perturbation drops SSIM and raises MSE dramatically. Huang et al. [8] abandon the “choose-inputs, check-outputs” black-box paradigm and instead train a hypersphere on the discriminator’s features to enclose the source generator’s distribution. Lastly, ref. [18] used a variant of GAN to facilitate data augmentation on fetal heart rate (FHR) signals, aiming to preserve individual privacy through synthetic data. However, since the generator derives its distribution from real-world subjects, this method still poses a hidden danger of data privacy after the model is stolen.

However, it is crucial to note that the aforementioned studies focus primarily on claiming or verifying ownership. As noted in Section 1, merely asserting or verifying intellectual property is insufficient to prevent potential infringement. We need to use brute force means to make the model unusable after being stolen by an unauthorized person. In other words, the model should deliver severely degraded outputs to unauthorized users, preventing it from performing its intended task, such as high classification error rates or extremely poor generated image quality, etc.

Recent years have witnessed the successful implementation of numerous techniques for generating adversarial examples against Deep Neural Networks (DNNs). Broadly, these methodologies fall into two categories: iterative type algorithms [19,20,21,22] and generative type algorithms [23,24,25]. Iterative type algorithms operate by optimizing each individual sample to generate an optimal perturbation for adversarial attacks. These methods often yield a very strong attack effect. For example, ref. [22] can cause a severe misclassification in a DNN classifier by altering just a single pixel. However, the need for per-sample iterations makes this method computationally expensive for large-scale adversarial sample generation.

Generative algorithms, on the other hand, train a specialized DNN to generate adversarial perturbations. This allows for rapid attack generation via a single forward pass, offering a substantial speed advantage over iterative counterparts. Notably, the method in [25] is more general than prior works. Unlike task-specific methods (e.g., classification), its generated perturbations can simultaneously attack DNNs across a broad spectrum of downstream tasks. However, these generators are designed exclusively to synthesize perturbations and cannot generate high-fidelity, realistic data.

Watermarking techniques [11,14,16] embed signatures into parameters or outputs to enable post hoc ownership verification, but the stolen model remains fully functional. Similarly, Model-locking/passport [17,26] schemes enforce key-conditioned inference, yet they do not prevent a generative model from emitting usable samples once it is executed without the key. Addressing these limitations, EGAN applied a generative protection mechanism that actively corrupts the internal feature maps for any unauthorized input, producing low-quality data that are unusable for downstream tasks. Simultaneously, it preserves a recoverable watermark, ensuring that post hoc ownership verification remains a viable legal recourse.

Recently, strategies focusing on model weight modulation and resistance to fine-tuning have gained attention. Fei et al. [27] proposed OmniMark, which embeds fingerprints by modulating the convolutional kernels across multiple dimensions to achieve scalability. Similarly, Cui et al. [28] introduced FT-Shield, a method specifically designed to survive unauthorized fine-tuning by integrating fine-tuning loss into the watermark generation process. However, these state-of-the-art methods typically aim to maintain attribution capabilities (traceability) after attacks. In contrast, EGAN focuses on access control, ensuring the model becomes functionally unusable without the correct key, rather than just verifiable.

Different from the aforementioned iterative or generative adversarial attacks which aim to fool a frozen model with perturbed inputs, our ’self-adversarial’ approach is a defensive training strategy. It embeds the adversarial objective into the generator’s loss function, ensuring the model internally corrupts its own features absent the correct encryption key, as illustrated in the CycleGAN example in Figure 3.

3. EGAN

3.1. Threat Model

In our threat model, we assume that the attacker is capable of stealing the trained generative model, thereby gaining full access to the model files. This access enables the attacker to employ various exploitation techniques, including fine-tuning the model to alter its core functionality or adapt it for different tasks. Such capabilities reflect common vulnerabilities faced by model owners in computational environments and underscore the importance of safeguarding proprietary model architectures and data.

Although an attacker may acquire the model, they lack the secret key essential for its proper operation. This key, analogous to a cryptographic key, transforms input data to enable meaningful generation results. In the absence of this key, the attacker is restricted to providing raw, untransformed inputs to the encrypted generator. This renders the stolen model effectively useless for the attacker, as the generated data is degraded and unsuitable for any practical application.

In the hypothetical scenario where the attacker also obtains the secret key, the threat model incorporates additional defenses. The generative model embeds recoverable watermarks in both black-box and white-box formats, serving as post hoc verification mechanisms. These watermarks allow the rightful model owner to assert IP rights and pursue legal recourse, thus preserving avenues for ownership claims and enforcing intellectual property protections even in the most challenging circumstances.

By encompassing these threat dimensions, our approach not only fortifies the model against unauthorized use but also ensures a comprehensive framework for asserting ownership and maintaining privacy.

3.2. Overview of the Process of Model Protection

The basic components of GANs and their variants are the generator network and the discriminator network. The objective of the trained generator is to synthesize high-fidelity data, such as realistic images. Consequently, our proposed method specifically targets the encryption of this trained generator model.

Under our methodology, the process of protecting GANs is illustrated in Figure 4. The model owner selects a transformation as a key, which is then integrated into the cryptographic training of the GANs. Consequently, all inputs are transformed with the key before being processed by the generator. This ensures that the generator functions correctly only when presented with encrypted inputs that have been transformed with the corresponding key. The owner can distribute the key to the authorized party through a specific protocol or encrypted communication. If unauthorized persons steal the generator model, they cannot obtain expected outputs using inputs for normal GANs. Even if the infringer successfully acquires both the model and the key to generate valid outputs, the model owner can still claim IP rights and use legal weapons through black-box and white-box verification methods [9].

In essence, our approach involves applying a key-based transformation to all inputs during both model training and validation prior to their entry into the model. The remaining workflow follows conventional training procedures, with the exception of a feature distance regularization term incorporated into the loss function, which will be analyzed in Section 4. The transformation must strike a balance: it should be sufficiently strong to significantly degrade the output quality for unauthorized users employing unencrypted inputs, yet not so extreme as to undermine the model’s original performance. Additionally, the transformation should offer a sufficiently large key space to prevent easy deciphering. In this section, we demonstrate feasible encryption transformations and their effects on several widely-used GAN architectures, including DCGAN, CycleGAN, and SRGAN.

Our encryption transform functions as an asymmetric key-lock mechanism: the generator synthesizes high-fidelity outputs only if the input is transformed by a secret key; otherwise, a feature-level corruption regularizer is employed to render the outputs unusable. In this paper, we refer to this mechanism as “self-adversarial”. Specifically, this strategy guides the model to maximize the feature-level discrepancy between the feature maps produced by authorized inputs and those produced by keyless (unauthorized) inputs. To implement this mechanism, we introduce a new loss term that maximizes the Frobenius norm of the feature maps at the intermediate layers of the generator. Denote

x

as the original input to our generator G, and denote f as our key transformation, then the keyed input,

f (x)

will be denoted as

x^{*}

. Extracting G at layer ℓ we obtain the feature map of keyless input

G_{l} (x) \in R^{N_{l} \times D_{l}}

and that of keyed input

G_{l} (x^{*}) \in R^{N_{l} \times D_{l}}

, which we assume to be reshaped in matrix form, such that

N_{l} = H_{l} \times W_{l}

, with

H_{l}

and

W_{l}

the spatial dimensions of the feature map, and

D_{l}

its number of channels. From this, our Feature Separation layer loss is defined as

D_{f e a t} (x, x^{*}) = {∥G_{l} (x) - G_{l} (x^{*})∥}_{F}^{2}

(1)

where

{∥ \cdot ∥}_{F}

denotes the Frobenius norm.

Our objective is to maximize

D_{f e a t}

to induce structural corruption in the outputs of keyless inputs. However, to prevent the magnitude of

D_{f e a t}

from overwhelming the primary convergence of the model, we normalize

D_{f e a t}

to map it to

[0, 1]

. Consequently, our maximizing feature separation loss term is defined as

L_{f e a t} = 1 - D_{f e a t} .

(2)

We detail the inference protocol in Algorithm 1 to clarify the deployment and usage of the trained model.

Algorithm 1 Inference Protocol of EGAN

Input:

x_{r a w}

: input data provided by user; G: Trained Encrypted Generator;

K_{u s e r}

: Key provided by user (if any).
Output: Generated data y.

1:: function Inference( $x_{r a w}, K_{u s e r}$ )
2:: $f \leftarrow ConstructTransform (K_{u s e r})$ ▹ based on Equation (3)
3:: $x_{i n p u t} \leftarrow f (x_{r a w})$ ▹ Apply key transformation
4:: $y \leftarrow G (x_{i n p u t})$ ▹ Generator forward pass
5:: return y
6:: end function

3.3. Protection Process for DCGAN

To verify that our method preserves the model’s ability to claim property rights through watermarking, we adopt the same architecture mentioned in [9], SN-GAN [29], which is a variant of DCGAN. The input to DCGAN [30] is a latent vector randomly sampled from a standard normal distribution,

z \sim N (0, 1)

, which is first processed by a fully connected (FC) layer to upsample its dimensions prior to convolutional processing. We designate the output of the FC layer as the input of the model because it has enough dimensions to provide a sufficient space. In our experiments, the input dimension is 8192 (

512 \times 4 \times 4

) when the output image size is

32 \times 32

, and 32,768 (

512 \times 8 \times 8

) when the output image size is

64 \times 64

. The key transformation is implemented via a masking strategy that perturbs 256 randomly selected dimensions by adding or subtracting a fixed constant. This transformation serves as the cryptographic key, enforcing a dependency where the trained model produces valid mappings only when conditioned on this specific perturbation. Formally, we define the input transformation function, f, which maps the output vector of the FC layer to a keyed vector (

f : x_{F C} \mapsto x_{F C}^{*}

) as follows (shown in Figure 5):

f (x) = x + c (1 - b) and b \in {0, 1}^{d (x)} .

(3)

The bitmask

b

in the above formula controls which dimensions remain unchanged and to which dimensions a fixed constant c is added. The constant c can be chosen to be a positive or negative number that deviates significantly from 0 because our intention is to change the input distribution. We uniformly choose

c = 10.0

in our experiments. We set 256 dimensions in the bitmask

b

to have values of 0, and the remaining dimensions are set to 1. Here,

d (x)

represents the number of dimensions in

x

. Therefore, the size of our change selection space is

C_{8192}^{256} \approx 10^{494}

or

C_{32768}^{256} \approx 10^{634}

, while the selection space of a common 6-digit password is

10^{6}

.

To ensure consistency with the baseline, we adopt the Spectral Normalization GAN (SN-GAN) [29], which is a variant of DCGAN. The original objective function of the generator is defined as follows:

L_{DC} = - E_{z \sim p_{z} (z)} [D (G (z))] .

(4)

The loss term for the watermark is defined as [9]:

L_{w} (x_{w}, y_{w}) = 1 - SSIM (G (x_{w}), y_{w}) .

(5)

We incorporate the regularization term for feature separation (Equation (2)), and the objective function of encrypted DCGAN is defined as:

L_{{DC}_{e}} = L_{DC} + α L_{w} + β L_{f e a t}

(6)

where the hyperparameter

α

controls the watermark quality when triggered, and

β

controls the severity of degradation for the non-key output. The complete training procedure is summarized in Algorithm 2. Figure 6 illustrates this degradation effect on the CTU-UHB dataset.

Algorithm 2 Training Pseudocode of EGAN

Input:

x_{i}

: original input; G: Generator; D: Discriminator; f: mask for encryption (Equation (3));

f_{w}

: mask for the watermark trigger.
Output: Trained Generator G

1:: repeat
2:: Compute trigger input: $x_{i w} \leftarrow f_{w} (x_{i})$
3:: Compute encrypted input: $x_{i}^{*} \leftarrow f (x_{i})$
4:: Obtain feature maps at layer ℓ: $G_{l} (x_{i})$ and $G_{l} (x_{i}^{*})$
5:: Pass $G (x_{i}^{*})$ to D to compute GAN loss $L_{G A N}$ using Equations (4) and (7)
6:: Compute feature separation loss $L_{feat}$ using Equations (1) and (2)
7:: Compute watermark loss $L_{w}$ using Equation (5)
8:: Compute gradients and update weights of G and D
9:: until convergence
10:: return G

3.4. Protection Process for CycleGAN

To clarify, in CycleGAN, there are two pairs of generators

G

and discriminators

D

. Each

G

takes in an image as input and produces an image as output. We choose one of these generators as our protection target. To encrypt the input image, we first randomly select a set of positions, and then apply a predetermined transformation to the pixel values at these positions (for example, adding or subtracting a constant). The transformation method and position set together form the encryption key. This results in the encrypted input image, denoted as

x^{*}

. Similar to the previous section, we can represent this process using Equation (3), where

x

now represents an image matrix instead of a one-dimensional vector.

The objective function of cyclegan is as follows.

\begin{matrix} L_{GAN} = & E_{y \sim p_{data} (y)} [log D_{Y} (y)] + \\ E_{x \sim p_{data} (x)} [log (1 - D_{Y} (x))] \end{matrix}

(7)

L_{Cyc} = E_{x \sim p_{data} (x)} [{∥ F (G (x)) - x ∥}_{1}]

(8)

L_{C} = L_{GAN} + L_{Cyc}

(9)

We introduce a regularization term for feature separation and retain the regularization term used to embed the watermark, so the objective function of CycleGAN is defined as:

L_{C_{e}} = L_{C} + α L_{w} + β L_{f e a t} .

(10)

3.5. Protection Process for SRGAN

SRGAN is widely recognized as a pioneering framework in the field of image super-resolution. Its generator

G

, is specifically designed to reconstruct high-fidelity high-resolution images from low-resolution inputs. When using the trained

G

for super-resolution tasks, the input dimensions are often variable, while the super-resolution scale factor remains fixed. For instance, if we input a

32 \times 32

image, we can get a

128 \times 128

super-resolution image, and if we input a

128 \times 96

image, we can get a

512 \times 384

super-resolution image. Consequently, the fixed-position spatial transformation typically employed in architectures like CycleGAN is ill-suited for this scenario, as it relies on rigid spatial alignment that does not exist here. Instead, we select one or several channels of the input image and transform the entire channel to avoid the influence of different input sizes. The visual comparison is shown in Figure 7. The objective function of protected SRGAN is denoted as:

L_{{SR}_{e}} = L_{SR} + α L_{w} + β L_{f e a t} .

(11)

4. Experimental Results

In order to distinguish the original baseline models, the watermark baseline models and our proposed encryption models, we denote the watermark baseline models with subscript w (i.e.,

{DCGAN}_{w}

,

{CycleGAN}_{w}

,

{SRGAN}_{w}

), and our proposed encrypted models with a subscript

w e

(i.e.,

{DCGAN}_{w e}

,

{CycleGAN}_{w e}

,

{SRGAN}_{w e}

). It should be noted that the watermark here contains both the black-box and white-box settings proposed in [9]. The original text is represented by subscripts w and s respectively, but we use the subscript w here to unify them. In addition, for results generated from valid inputs processed with keys, we add * in the subscript to denote them.

4.1. Hyperparameters and Benchmark

We demonstrate our proposed encryption method on three GAN models: DCGAN, CycleGAN, and SRGAN. To ensure the generalizability and reproducibility of our results, we chose classic publicly available datasets, avoiding the potential extreme features of domain-specific datasets that may affect model performance.

For fair comparison, all hyperparameters and network architectures follow the original work of each GAN model. In all experiments, the models were trained for 100 epochs with the Adam optimizer (

β_{1}

= 0.5,

β_{2}

= 0.999) and a constant learning rate of

2 \times 10^{- 4}

. The batch size was 64 for DCGAN/CycleGAN and 16 for SRGAN. The feature separation loss is computed on the generator’s penultimate convolution block. We observed that applying the loss to shallower layers hindered model convergence, whereas applying it solely to the final pixel layer failed to capture high-level semantic discrepancies. Therefore, the penultimate layers were selected to maximize perceptual degradation in unauthorized outputs while maintaining training stability.

For DCGAN, we use CIFAR10 (

32 \times 32

), CTU-UHB (

32 \times 32

), CUB200 (

64 \times 64

), and ChestXray2017 (

64 \times 64

) as benchmark datasets (sample results are shown in Figure 8). For the CTU-UHB dataset, we transform the 1D time series data into 3 channels, similar to color images, and apply the same training procedure as for CIFAR10.

For CycleGAN, we follow [9] and train the model on the Cityscapes dataset [31], but only protect one of the generators (labels → photos). We introduce an input transformation function and a feature separation regularization term, while maintaining all other original hyperparameters and configurations as described in [32].

For SRGAN, we adhere to the experimental setup in [9] by randomly sampling 350k images from Imagenet [33] for training. The other hyperparameters remain the same as the baseline SRGAN [34].

4.2. Evaluation Metrics

To ensure that our method does not compromise the original performance of the GANs, we utilized several metrics to evaluate the quality of the generated images. For the image generation task using DCGAN, we calculated the Frechet Inception Distance (FID) [35] between the generated and real images. For CycleGAN, we measured the FCN-scores as presented in [32] on the Cityscapes label → photo dataset, which includes per-pixel accuracy, per-class accuracy, and class intersection-over-union. For image super-resolution with SRGAN, we used peak signal-to-noise ratio (PSNR) and structural similarity index (SSIM) as our metrics. These metrics were comprehensively evaluated on standard benchmark datasets, including Set5, Set14, and BSD100.

To verify that our method preserves the capability to embed watermarks and claim property rights, we used the same watermark evaluation method as [9]. We measured the quality of image watermarks using SSIM between the generated watermark and the ground truth watermark, denoted as

Q_{w m}

. Additionally, the integrity of the weighted sign signatures is assessed via the Bit Error Rate (BER).

As shown in Table 3, Table 4 and Table 5, the performance of the protected DCGAN, CycleGAN and SRGAN (i.e.,

{DCGAN}_{w e^{*}}

/

{CycleGAN}_{w e^{*}}

/

{SRGAN}_{w e^{*}}

) is comparable to the baselines on several datasets. Moreover, our method guarantees that these models produce significantly degraded results when subjected to unencrypted inputs.

4.3. Necessity of Regularization Term

In this section, we demonstrate that while training models exclusively on encrypted data inherently restricts their functionality to encrypted inputs, the incorporation of the feature distance regularization term (Equation (2)) remains critical. This regularization term ensures more robust destruction of generated data quality, preventing unauthorized individuals from using it for various downstream tasks. Furthermore, the term enhances the model’s resistance to fine-tuning attacks.

Experiments demonstrate that without the feature distance regularization term during training, unauthorized individuals can fine-tune the model to significantly reduce its encrypted properties, thereby allowing them to use unencrypted inputs to obtain results comparable to the expected outputs. In contrast, incorporating the feature distance regularization term effectively prevents unauthorized fine-tuning from yielding usable outputs. Table 6, Table 7 and Table 8 show that the absence of this term allows adversaries to obtain results comparable to those from encrypted inputs. With the term applied, however, the quality of illicitly obtained results severely degrades, confirming the failure to achieve expected generation quality. Moreover, our method maintains the robustness of the model’s watermarking capabilities (black box and white box), with high image watermark quality

Q_{w m}

and zero model symbol watermark bit error rate BER for the fine-tuned models.

We observe that the BER consistently remains 0 across our experiments (Table 6, Table 7 and Table 8). This is because our work follows [9], which uses the scaling factors

γ

from the batch normalization layer as signatures, and converts

γ

into ASCII characters to serve as a model watermark for protection. This approach is taken because, if someone attempts to steal the model parameters for an attack, the

γ

values are unlikely to undergo positive or negative shifts due to the attack, thus keeping the ASCII characters used as the watermark unchanged. Since the BER (bit-error rate) measures the difference between the original watermark and the post-attack watermark, and given that the watermark remains unchanged in adversarial scenarios (as detailed in [9]), its value is 0.

4.4. Key Transform

As mentioned in Section 3.2, the key transformation should strike a balance: it should be neither too aggressive nor too simplistic. This section aims to verify this by showcasing the importance of slightly targeted designs for the key transformations of GANs. The required input data types may vary depending on the GAN, such as Gaussian noise for DCGAN or image data for CycleGAN and SRGAN. Nonetheless, the key transformations of each GAN can be seen as a mask, with Equation (3) expressing the mathematical formulation. Notably, the bits with a value of 1 in the

b

vector do not alter the original input

x

, only the bits with a value of 0 affect the corresponding bits of

x

. We designate the bits with a value of 0 in

b

as the significant bits of the mask. Therefore, when the constant c remains constant, a larger number of significant bits in the vector

b

implies a greater deviation in the input distribution. Table 9 presents the FID scores of EN-DCGAN on the CUB200 dataset with different numbers of significant mask bits. Results indicate that EN-DCGAN achieves generation quality comparable to or exceeding the baseline when the number of significant bits ranges from 256 to 512. However, an insufficient or excessive number of effective bits results in an increased FID score and poor image quality.

4.5. Comparison

Compared with prior GAN-protection baselines, EGAN is the only method that actively disables an unauthorized copy instead of merely leaving an audit trail. GAN-IPR [9], Box-free [8] and Fragile [7] all allow a pirate to keep generating high-quality images; their protection is reduced to post hoc verification. EGAN, by contrast, forces the model to emit unusable noise when the secret key is absent, and this “fail-closed” behavior survives fine-tuning, pruning and even white-box parameter overwriting. Consequently, EGAN shifts the risk–reward calculus: a stolen model has zero commercial value, so the incentive for theft is removed at source rather than punished afterwards.

5. Discussion

EGAN efficiently processes trigger inputs through a key-lock transformation mechanism, ensuring that meaningful outputs are exclusively reserved for authorized users. This system applies specific key-based transformations to inputs before they reach the generator, optimizing security with a regularization term that maximizes feature separation loss. Consequently, unauthorized outputs are structurally corrupted. For validation against stolen GANs, EGAN integrates the proactive corruption of unauthorized outputs with reactive verification capabilities—facilitated by compatibility with black-box and white-box watermarking techniques, thereby significantly bolstering intellectual property protection. By combining these strategies, EGAN not only maintains the operational integrity for legitimate users but also robustly guards against misuse, securing both the models and their generated data.

Currently, our method has been verified primarily on public datasets (e.g., CIFAR10, CTU-UHB, Cityscapes). In the future, we aim to explore the scalability and robustness of our approach in more specialized and practical domains. Given that data in fields such as medical imaging and industrial inspection are often difficult to acquire and possess high commercial or privacy value, we plan to conduct experiments on these high-stakes datasets. This will allow us to further validate the real-world applicability of our self-adversarial protection mechanism in safeguarding sensitive and high-value intellectual property.

6. Conclusions

In this paper, we presented EGAN, a robust framework designed to safeguard the intellectual property of deep generative models against unauthorized usage and theft. By integrating a secret-key-dependent encryption mechanism into the model architecture, EGAN ensures that high-fidelity generation is exclusive to authorized users, while unauthorized access results in degraded, noise-like outputs. The contributions of our EGAN are as follows: (1) Distinct from passive verification, we introduce an active protection paradigm where unauthorized inference results in structurally corrupted outputs. (2) We propose a novel self-adversarial feature-separation mechanism that physically embeds protection into the model weights rather than relying on external wrappers. (3) We demonstrate that EGAN offers superior robustness against fine-tuning compared to existing watermarking schemes while it maintains the original generative performance for authorized users.

Regarding ethical deployment, we recommend (i) embedding keys within authenticated software, (ii) maintaining comprehensive logs of all decryption activities, and (iii) restricting model licenses to non-safety-critical research domains with third-party escrow, thereby balancing strict IP protection with safe usage.

Author Contributions

Conceptualization, Y.Z., W.L. and Y.J.; Formal analysis, Y.Z., Y.J. and Y.H.; Investigation, Y.Z. and Y.J.; Methodology, W.L. and F.F.; Resources, Y.H.; Supervision, W.L. and F.F.; Validation, Y.Z., Y.J. and Y.H.; Writing—original draft, Y.Z. and Y.J.; Writing—review and editing, Y.Z., W.L., Y.J. and F.F. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the National Key R&D Program of China (2022ZD0161800), and AI-Empowered Research Paradigm Reform and Discipline Leap Plan under Grant 2024AI01012.

Data Availability Statement

The original data presented in the study are openly available in [Set5, Set14, BSD100] at [https://figshare.com/articles/dataset/BSD100_Set5_Set14_Urban100/21586188 (accessed on 10 December 2025)], [CIFAR10] at [https://www.cs.toronto.edu/~kriz/cifar.html (accessed on 10 December 2025)], [CUB200] at [https://www.kaggle.com/datasets/wenewone/cub2002011 (accessed on 10 December 2025)], [CTU-UHB] at [https://physionet.org/content/ctu-uhb-ctgdb/1.0.0/ (accessed on 10 December 2025)], and [ChestXRay] at [https://www.kaggle.com/datasets/nih-chest-xrays/data (accessed on 10 December 2025)].

Conflicts of Interest

The authors declare no conflicts of interest.

References

Goodfellow, I.; Pouget-Abadie, J.; Mirza, M.; Xu, B.; Warde-Farley, D.; Ozair, S.; Courville, A.; Bengio, Y. Generative adversarial nets. Adv. Neural Inf. Process. Syst. 2014, 27, 2672–2680. [Google Scholar]
Guo, Q.; Li, Y.; Song, Y.; Wang, D.; Chen, W. Intelligent fault diagnosis method based on full 1-D convolutional generative adversarial network. IEEE Trans. Ind. Inform. 2019, 16, 2044–2053. [Google Scholar] [CrossRef]
Gao, Y.; Liu, X.; Xiang, J. FEM simulation-based generative adversarial networks to detect bearing faults. IEEE Trans. Ind. Inform. 2020, 16, 4961–4971. [Google Scholar] [CrossRef]
Cheema, M.N.; Nazir, A.; Yang, P.; Sheng, B.; Li, P.; Li, H.; Wei, X.; Qin, J.; Kim, J.; Feng, D.D. Modified GAN-cAED to minimize risk of unintentional liver major vessels cutting by controlled segmentation using CTA/SPET-CT. IEEE Trans. Ind. Inform. 2021, 17, 7991–8002. [Google Scholar] [CrossRef]
Chen, C.; Zhou, K.; Zha, M.; Qu, X.; Guo, X.; Chen, H.; Wang, Z.; Xiao, R. An effective deep neural network for lung lesions segmentation from COVID-19 CT images. IEEE Trans. Ind. Inform. 2021, 17, 6528–6538. [Google Scholar] [CrossRef]
Brock, A.; Donahue, J.; Simonyan, K. Large scale GAN training for high fidelity natural image synthesis. arXiv 2018, arXiv:1809.11096. [Google Scholar]
Yuan, Z.; Li, L.; Wang, Z.; Zhang, X. Integrity Protection of Generative Adversarial Networks Using Fragile Watermarking. ACM Trans. Multimed. Comput. Commun. Appl. 2025, 21, 1–21. [Google Scholar] [CrossRef]
Huang, Z.; Li, B.; Cai, Y.; Wang, R.; Guo, S.; Fang, L.; Chen, J.; Wang, L. What can discriminator do? Towards box-free ownership verification of generative adversarial networks. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Paris, France, 1–6 October 2023; pp. 5009–5019. [Google Scholar] [CrossRef]
Ong, D.S.; Chan, C.S.; Ng, K.W.; Fan, L.; Yang, Q. Protecting Intellectual Property of Generative Adversarial Networks from Ambiguity Attack. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Virtual, 19–25 June 2021. [Google Scholar] [CrossRef]
Li, X.; Wang, Y.; Wang, Q.H.; Kim, S.T.; Zhou, X. Copyright protection for holographic video using spatiotemporal consistent embedding strategy. IEEE Trans. Ind. Inform. 2019, 15, 6187–6197. [Google Scholar] [CrossRef]
Zhang, J.; Gu, Z.; Jang, J.; Wu, H.; Stoecklin, M.P.; Huang, H.; Molloy, I. Protecting intellectual property of deep neural networks with watermarking. In Proceedings of the 2018 on Asia Conference on Computer and Communications Security, Incheon, Republic of Korea, 4–8 June 2018; pp. 159–172. [Google Scholar] [CrossRef]
Chen, H.; Rohani, B.D.; Koushanfar, F. Deepmarks: A digital fingerprinting framework for deep neural networks. arXiv 2018, arXiv:1804.03648. [Google Scholar] [CrossRef]
Rouhani, B.D.; Chen, H.; Koushanfar, F. Deepsigns: A generic watermarking framework for ip protection of deep learning models. arXiv 2018, arXiv:1804.00750. [Google Scholar] [CrossRef]
Uchida, Y.; Nagai, Y.; Sakazawa, S.; Satoh, S. Embedding watermarks into deep neural networks. In Proceedings of the 2017 ACM on International Conference on Multimedia Retrieval, Bucharest, Romania, 6–9 June 2017; pp. 269–277. [Google Scholar] [CrossRef]
Le Merrer, E.; Perez, P.; Trédan, G. Adversarial frontier stitching for remote neural network watermarking. Neural Comput. Appl. 2020, 32, 9233–9244. [Google Scholar] [CrossRef]
Adi, Y.; Baum, C.; Cisse, M.; Pinkas, B.; Keshet, J. Turning your weakness into a strength: Watermarking deep neural networks by backdooring. In Proceedings of the 27th USENIX Security Symposium (USENIX Security 18), Baltimore, MD, USA, 15–17 August 2018; pp. 1615–1631. [Google Scholar]
Zhang, J.; Chen, D.; Liao, J.; Zhang, W.; Hua, G.; Yu, N. Passport-aware normalization for deep model protection. Adv. Neural Inf. Process. Syst. 2020, 33, 22619–22628. [Google Scholar]
Zhang, Y.; Zhao, Z.; Deng, Y.; Zhang, X. FHRGAN: Generative adversarial networks for synthetic fetal heart rate signal generation in low-resource settings. Inf. Sci. 2022, 594, 136–150. [Google Scholar] [CrossRef]
Szegedy, C.; Zaremba, W.; Sutskever, I.; Bruna, J.; Erhan, D.; Goodfellow, I.; Fergus, R. Intriguing properties of neural networks. arXiv 2013, arXiv:1312.6199. [Google Scholar]
Goodfellow, I.J.; Shlens, J.; Szegedy, C. Explaining and Harnessing Adversarial Examples. arXiv 2014, arXiv:1412.6572. [Google Scholar]
Madry, A.; Makelov, A.; Schmidt, L.; Tsipras, D.; Vladu, A. Towards deep learning models resistant to adversarial attacks. arXiv 2017, arXiv:1706.06083. [Google Scholar]
Su, J.; Vargas, D.V.; Sakurai, K. One pixel attack for fooling deep neural networks. IEEE Trans. Evol. Comput. 2019, 23, 828–841. [Google Scholar] [CrossRef]
Li, Y.; Bai, S.; Xie, C.; Liao, Z.; Shen, X.; Yuille, A. Regional homogeneity: Towards learning transferable universal adversarial perturbations against defenses. In Proceedings of the European Conference on Computer Vision, Glasgow, UK, 23–28 August 2020; Springer: Berlin/Heidelberg, Germany, 2020; pp. 795–813. [Google Scholar] [CrossRef]
Poursaeed, O.; Katsman, I.; Gao, B.; Belongie, S. Generative adversarial perturbations. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–22 June 2018; pp. 4422–4431. [Google Scholar] [CrossRef]
Salzmann, M.; Nakka, K.K. Learning Transferable Adversarial Perturbations. Adv. Neural Inf. Process. Syst. 2021, 34, 13950–13962. [Google Scholar]
Fan, L.; Ng, K.W.; Chan, C.S. Rethinking deep neural network ownership verification: Embedding passports to defeat ambiguity attacks. Adv. Neural Inf. Process. Syst. 2019, 32, 4714–4723. [Google Scholar]
Fei, J.; Dai, Y.; Xia, Z.; Huang, F.; Zhou, J. OmniMark: Efficient and Scalable Latent Diffusion Model Fingerprinting. In Proceedings of the AAAI Conference on Artificial Intelligence, Philadelphia, PA, USA, 25 February–4 March 2025; Volume 39, pp. 16550–16558. [Google Scholar]
Cui, Y.; Ren, J.; Lin, Y.; Xu, H.; He, P.; Xing, Y.; Lyu, L.; Fan, W.; Liu, H.; Tang, J. Ft-shield: A watermark against unauthorized fine-tuning in text-to-image diffusion models. ACM SIGKDD Explor. Newsl. 2025, 26, 76–88. [Google Scholar] [CrossRef]
Miyato, T.; Kataoka, T.; Koyama, M.; Yoshida, Y. Spectral normalization for generative adversarial networks. arXiv 2018, arXiv:1802.05957. [Google Scholar] [CrossRef]
Radford, A.; Metz, L.; Chintala, S. Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks. arXiv 2015, arXiv:1511.06434. [Google Scholar]
Cordts, M.; Omran, M.; Ramos, S.; Rehfeld, T.; Enzweiler, M.; Benenson, R.; Franke, U.; Roth, S.; Schiele, B. The cityscapes dataset for semantic urban scene understanding. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 3213–3223. [Google Scholar] [CrossRef]
Zhu, J.Y.; Park, T.; Isola, P.; Efros, A.A. Unpaired image-to-image translation using cycle-consistent adversarial networks. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 2223–2232. [Google Scholar] [CrossRef]
Deng, J.; Dong, W.; Socher, R.; Li, L.J.; Li, K.; Fei-Fei, L. Imagenet: A large-scale hierarchical image database. In Proceedings of the 2009 IEEE Conference on Computer Vision and Pattern Recognition, Miami, FL, USA, 20–25 June 2009; pp. 248–255. [Google Scholar]
Ledig, C.; Theis, L.; Huszár, F.; Caballero, J.; Cunningham, A.; Acosta, A.; Aitken, A.; Tejani, A.; Totz, J.; Wang, Z.; et al. Photo-realistic single image super-resolution using a generative adversarial network. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 4681–4690. [Google Scholar] [CrossRef]
Heusel, M.; Ramsauer, H.; Unterthiner, T.; Nessler, B.; Hochreiter, S. Gans trained by a two time-scale update rule converge to a local nash equilibrium. Adv. Neural Inf. Process. Syst. 2017, 30, 6629–6640. [Google Scholar]

Figure 1. The images generated by DCGAN for authorized input, watermark-triggered input, and unauthorized input are presented from left to right. The classification results given by the classifier are shown below each image.

Figure 2. The top row displays superior generation results achieved with keyed input by the model owner and authorized person, while the second row shows results with keyless input by the unauthorized person. The third row presents outcomes obtained using the watermark trigger set.

Figure 3. The first column represents the input for one of the generators of CycleGAN, where the content of the input images is the segmented annotation domain of the Cityscapes dataset. The rows show the results of the encrypted, unencrypted, and watermark-trigger inputs, respectively. It should be noted that we edited 256 pixels from the original 1024 × 1024 input. As the encrypted input, the change is small enough that the difference between the first and second input is not obvious. The other columns display the generated images.

Figure 4. An overview of the model protection process. The expected high-quality output can only be obtained by transforming the input data according to a specific key and then inputting it to the generator. The model owner and the authorizer share the key through a predetermined protocol or encrypted communication. An unauthorized person can intercept the stolen model, but without the key, they cannot generate high-quality data using the generator model. Even if the key is deciphered in extreme cases, the model owner can still claim ownership and enforce legal rights through black-box and white-box verification.

Figure 5. The input processing of DCGAN. The diagram displays the latent vector z sampled from a Gaussian distribution, followed by the output vector of the FC layer

x_{F C}

, and finally the keyed vector

x_{F C}^{*}

.

Figure 5. The input processing of DCGAN. The diagram displays the latent vector z sampled from a Gaussian distribution, followed by the output vector of the FC layer

x_{F C}

, and finally the keyed vector

x_{F C}^{*}

.

Figure 6. The first row displays an image plotted from the real FHR data. The second row presents the generator output for authorized input, while the third row shows the generated result for unauthorized input. It is evident that the result exhibits significant feature-level differences when compared directly to the real data image.

Figure 7. The first image is a low-res image input

x

. The next three images are respectively the result of the model with encrypted input, watermark-trigger input, and unencrypted input.

Figure 7. The first image is a low-res image input

x

. The next three images are respectively the result of the model with encrypted input, watermark-trigger input, and unencrypted input.

Figure 8. For the chestXRay dataset example, in first row, the first two columns show samples from the original dataset, while the 3rd–5th columns display the generation results using the trigger set. In second row, 1st–3rd columns show the generation results with encrypted input, and the 4th–6th columns show the results with unencrypted input.

Table 1. Classification recognition accuracy of generated images using different inputs in DCGAN.

Input	Accuracy
baseline z	$86.79$
authorized input $z^{*}$	$86.25$
unauthorized input z	$26.38$
watermark trigger input $z_{w}$	$79.36$

Table 2. Misclassification rates for generated images using unauthorized inputs.

Trained Model	Misjudgment Rate
unencrypted training	$15.73$
encrypted training	$74.14$

Table 3. Fidelity in DCGAN: Scores are in FID (lower is better).

	CIFAR10	CUB200	CTU-UHB	ChestXRay
DCGAN	26.54 ± 1.04	58.34 ± 1.50	74.12 ± 0.76	84.95 ± 0.84
${DCGAN}_{w}$	26.27 ± 0.54	56.64 ± 2.74	73.65 ± 0.98	84.16 ± 0.74
${DCGAN}_{w e^{*}}$	26.17 ± 0.07	47.68 ± 0.06	74.56 ± 0.65	85.32 ± 0.97
${DCGAN}_{w e}$	113.67 ± 0.55	96.26 ± 0.78	128.32 ± 2.54	163.25 ± 2.89

Table 4. Fidelity in CycleGAN (higher is better).

	Per-Pixel Acc.	Per-Class Acc.	Class IoU
CycleGAN	0.55	0.18	0.13
${CycleGAN}_{w}$	0.58	0.19	0.14
${CycleGAN}_{w e^{*}}$	0.60	0.18	0.12
${CycleGAN}_{w e}$	0.47	0.09	0.06

Table 5. Fidelity in SRGAN: Scores are in PSNR/SSIM (higher is better).

	Set5	Set14	BSD
SRGAN	$29.38 / 0.85$	$25.92 / 0.71$	$25.08 / 0.67$
${SRGAN}_{w}$	$29.14 / 0.85$	$26.00 / 0.72$	$25.35 / 0.67$
${SRGAN}_{w e^{*}}$	$29.08 / 0.87$	$26.16 / 0.72$	$25.57 / 0.67$
${SRGAN}_{w e}$	$25.14 / 0.79$	$24.54 / 0.67$	$23.58 / 0.62$

Table 6. First row is the FID (lower is better) scores, watermark quality,

Q_{w m}

and BER for

{DCGAN}_{w e^{*}}

. Second row shows the scores of the model without feature distance regular term after fine-tuning. Third row shows the scores of the model with feature distance regular term after fine-tuning.

Table 6. First row is the FID (lower is better) scores, watermark quality,

Q_{w m}

and BER for

{DCGAN}_{w e^{*}}

. Second row shows the scores of the model without feature distance regular term after fine-tuning. Third row shows the scores of the model with feature distance regular term after fine-tuning.

	FID	$Q_{wm}$
${DCGAN}_{w e^{*}}$	47.68 ± 0.06	0.97
${Fine-tune}_{w / o f l}$	53.54 ± 0.17	0.96
${Fine-tune}_{w f l}$	271.87 ± 3.28	0.96

Table 7. First row is the FCN-scores (higher is better), watermark quality,

Q_{w m}

and BER for

{CycleGAN}_{w e^{*}}

. Second row shows the scores of the model without feature distance regular term after fine-tuning. Third row shows the scores of the model with feature distance regular term after fine-tuning.

Table 7. First row is the FCN-scores (higher is better), watermark quality,

Q_{w m}

and BER for

{CycleGAN}_{w e^{*}}

. Second row shows the scores of the model without feature distance regular term after fine-tuning. Third row shows the scores of the model with feature distance regular term after fine-tuning.

	Per-Pixel Acc.	Per-Class Acc.	Class IoU	$Q_{wm}$
${CycleGAN}_{w e^{*}}$	0.60	0.18	0.12	0.90
${Fine-tune}_{w / o f l}$	0.59	0.17	0.11	0.85
${Fine-tune}_{w f l}$	0.51	0.12	0.08	0.85

Table 8. First row is the PSNR/SSIM (higher is better) scores, watermark quality,

Q_{w m}

and BER for

{SRGAN}_{w e^{*}}

. Second row shows the scores of the model without feature distance regular term after fine-tuning. Third row shows the scores of the model with our proposed feature distance regular term after fine-tuning.

Table 8. First row is the PSNR/SSIM (higher is better) scores, watermark quality,

Q_{w m}

and BER for

{SRGAN}_{w e^{*}}

. Second row shows the scores of the model without feature distance regular term after fine-tuning. Third row shows the scores of the model with our proposed feature distance regular term after fine-tuning.

	Set5	Set14	BSD	$Q_{wm}$
${SRGAN}_{w e^{*}}$	$29.08 / 0.87$	$26.16 / 0.72$	$25.57 / 0.67$	$0.93$
${Fine-tune}_{w / o f l}$	$29.10 / 0.85$	$26.01 / 0.70$	$25.44 / 0.66$	$0.83$
${Fine-tune}_{w f l}$	$19.42 / 0.53$	$20.18 / 0.56$	$20.60 / 0.60$	$0.83$

Table 9. FID scores of images generated by EN-DCGAN under different effective dimensions of masks. Where

G (z^{*})

represents the outputs obtained from authorized inputs, and

G (z)

represents the outputs obtained from unauthorized inputs.

Table 9. FID scores of images generated by EN-DCGAN under different effective dimensions of masks. Where

G (z^{*})

represents the outputs obtained from authorized inputs, and

G (z)

represents the outputs obtained from unauthorized inputs.

mask_dim	FC_dim	FID of G(z*)	FID of G(z)
1	32768	266.13	51.71
64	32768	275.73	79.40
256	32768	47.89	97.03
384	32768	55.61	98.30
512	32768	74.33	116.76
4096	32768	230.92	196.70

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Zhu, Y.; Li, W.; Jiang, Y.; Huang, Y.; Fang, F. EGAN: Encrypting GAN Models Based on Self-Adversarial. Mathematics 2025, 13, 4008. https://doi.org/10.3390/math13244008

AMA Style

Zhu Y, Li W, Jiang Y, Huang Y, Fang F. EGAN: Encrypting GAN Models Based on Self-Adversarial. Mathematics. 2025; 13(24):4008. https://doi.org/10.3390/math13244008

Chicago/Turabian Style

Zhu, Yujie, Wei Li, Yuhang Jiang, Yanrong Huang, and Faming Fang. 2025. "EGAN: Encrypting GAN Models Based on Self-Adversarial" Mathematics 13, no. 24: 4008. https://doi.org/10.3390/math13244008

APA Style

Zhu, Y., Li, W., Jiang, Y., Huang, Y., & Fang, F. (2025). EGAN: Encrypting GAN Models Based on Self-Adversarial. Mathematics, 13(24), 4008. https://doi.org/10.3390/math13244008

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Menu

EGAN: Encrypting GAN Models Based on Self-Adversarial

Abstract

1. Introduction

2. Related Work

3. EGAN

3.1. Threat Model

3.2. Overview of the Process of Model Protection

3.3. Protection Process for DCGAN

3.4. Protection Process for CycleGAN

3.5. Protection Process for SRGAN

4. Experimental Results

4.1. Hyperparameters and Benchmark

4.2. Evaluation Metrics

4.3. Necessity of Regularization Term

4.4. Key Transform

4.5. Comparison

5. Discussion

6. Conclusions

Author Contributions

Funding

Data Availability Statement

Conflicts of Interest

References

Share and Cite

Article Metrics

Article Access Statistics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI