A Symmetric Projection Space and Adversarial Training Framework for Privacy-Preserving Machine Learning with Improved Computational Efficiency

Li, Qianqian; Zhou, Shutian; Zeng, Xiangrong; Shi, Jiaqi; Lin, Qianye; Huang, Chenjia; Yue, Yuchen; Jiang, Yuyao; Lv, Chunli

doi:10.3390/app15063275

by Qianqian Li ¹,†, Shutian Zhou ¹,†, Xiangrong Zeng ¹,†, Jiaqi Shi ¹,², Qianye Lin ¹,³, Chenjia Huang ¹,⁴, Yuchen Yue ¹,⁵, Yuyao Jiang ¹ and Chunli Lv ¹,*

¹ China Agricultural University, Beijing 100083, China
² Beijing Foreign Studies University, Beijing 100089, China
³ University of International Business and Economics, Beijing 100029, China
⁴ China University of Political Science and Law, Beijing 102249, China
⁵ Peking University, Beijing 100871, China
* Author to whom correspondence should be addressed.
† These authors contributed equally to this work.

Appl. Sci. 2025, 15(6), 3275; https://doi.org/10.3390/app15063275

Submission received: 11 December 2024 / Revised: 9 March 2025 / Accepted: 11 March 2025 / Published: 17 March 2025

(This article belongs to the Special Issue Cloud Computing: Privacy Protection and Data Security)

Abstract

This paper proposes a data security training framework based on a symmetric projection space and adversarial training, aimed at the privacy leakage and computational inefficiency that current privacy protection technologies encounter when processing sensitive data. By designing a new projection loss function and combining autoencoders with adversarial training, the proposed method balances privacy protection and model utility. Experimental results show that, for financial time-series data tasks, the model using the projection loss achieves a precision of 0.95, recall of 0.91, and accuracy of 0.93, significantly outperforming the traditional cross-entropy loss. In image data tasks, the projection loss yields a precision of 0.93, recall of 0.90, accuracy of 0.91, and mAP@50 and mAP@75 of 0.91 and 0.90, respectively, demonstrating its advantage in complex tasks. Furthermore, experiments on different hardware platforms (Raspberry Pi, Jetson, and NVIDIA 3080 GPU) show that the proposed method performs well on low-power devices and gains further computational efficiency on high-performance GPUs, indicating good scalability. These results support the superiority of the proposed method in terms of data privacy protection and computational efficiency.

Keywords: privacy protection; data security; adversarial training; computational efficiency; high-dimensional data compression

1. Introduction

The importance of data security has become increasingly prominent in the modern information society. With the widespread adoption of artificial intelligence technologies and big data applications, the dependence on data analysis across various industries has significantly increased [1,2]. However, this trend is accompanied by potential risks of data privacy breaches. Once sensitive data in fields such as healthcare, finance, and education are exposed, irreversible social and economic consequences may ensue [3,4,5]. Therefore, making effective use of data while ensuring privacy protection has become a focal issue for data-driven decision-making in both academia and industry.

Traditional privacy-preserving techniques primarily include methods such as differential privacy (DP), homomorphic encryption (HE), and federated learning (FL). These techniques each have advantages in data protection. For example, DP mitigates the risk of data leakage by adding random noise to obscure individual information [6]; HE allows data to be processed in an encrypted state, ensuring privacy during computation [7]; and FL avoids direct centralization of data by employing distributed training [8]. However, these methods still present significant limitations in practical applications. HE has a high computational complexity, making it difficult to meet real-time requirements. Although FL avoids data sharing, challenges such as communication overhead and inconsistencies among participating parties persist. Furthermore, these methods generally focus on data encryption or noise addition and lack dedicated designs for abstracting data features and efficiently utilizing nonsensitive information, which may hinder achieving optimal results in certain scenarios.

Existing privacy-preserving methods mainly concentrate on data concealment and security, with relatively less exploration on how to balance privacy and utility [9,10,11]. Although DP is widely used, the design of noise mechanisms often fails to meet the specific needs of different scenarios; Yan et al. [12] proposed a method to address these challenges and enhance privacy protection by applying DP with stochastic gradient descent to train surrogate models that protect original data. A differential evolution operator was used to generate personalized new samples for multiple clients based on promising auxiliary samples, preventing the exposure of newly generated data. Additionally, a similarity-based aggregation algorithm was integrated to effectively build a global surrogate model. Experimental results showed that their method demonstrated significant optimization performance on a set of synthetic problems in a joint setup while maintaining data privacy.

Methods like HE and secure multi-party computation, while enhancing privacy protection, face issues related to high computational cost and system complexity. Wang et al. [13] proposed a blockchain-based transaction privacy protection method using lightweight HE, focusing on securing transaction data and safeguarding user privacy while enhancing transaction efficiency. The experimental results demonstrated that the method achieved strong privacy and security protection. The data leakage probability was as low as 2.8%, successfully preventing replay attacks and forged transaction attacks. Pan et al. [14] sought to protect gradients in FL systems while maintaining practical performance by analyzing the impact of security parameters, such as polynomial modulus and coefficient modulus, on homomorphic operations. Based on the evaluation results, they provided a method for selecting the optimal multiplication depth while satisfying operational requirements. An adaptive segmented encryption method tailored for CKKS was introduced to circumvent its encryption length limitations, enhancing the processing capabilities of encrypted neural network models. Finally, they proposed FedSHE, and evaluation results indicated that FedSHE outperformed existing FL research based on HE in terms of model accuracy, computational efficiency, communication costs, and security levels. Moreover, although FL solutions have made progress in distributed scenarios, they still rely heavily on multi-party data consistency and trust mechanisms. In response, Xie et al. [15] provided a detailed overview of privacy-preserving federated learning (PPFL) based on HE, discussing the background of FL, privacy protection technologies (such as DP and MPC), and threat models.

To address these issues, as shown in Table 1, a data security training framework based on symmetric projection spaces and adversarial training is proposed, aiming to provide a general and efficient solution for balancing privacy protection and data utility. Specifically, the proposed method utilizes symmetric projection techniques, obfuscating the data while retaining the key information required for training and inference. Additionally, an autoencoder-based generative module is introduced, which generates abstract data representations through encoding and decoding processes. Unlike traditional noise-adding methods, the autoencoder module can accurately separate sensitive and nonsensitive information through the optimization of network weights, further improving the precision of data privacy protection and the utility of model training. In adversarial training, a generator-discriminator mechanism is designed, ensuring that the generated data not only maintain high privacy but also remain consistent with the original data distribution.

The main contributions of this work are as follows:

Low-dimensional data abstraction based on symmetric projection: This method uses symmetry design and nonlinear transformations to project high-dimensional sensitive data into a lower-dimensional abstract space, achieving obfuscated data representation while effectively retaining key features required for training and inference.
Privacy-preserving module combining adversarial training and autoencoders: Within the adversarial training framework, the generator is responsible for generating obfuscated data, while the discriminator optimizes its discriminative ability, forming a dynamic adversarial mechanism. Meanwhile, the introduction of the autoencoder enables the efficient separation and reconstruction of sensitive and nonsensitive information, ensuring both data privacy and enhanced model utility.
Wide adaptability and performance validation for multi-task data: The proposed framework has been validated on multiple datasets, including time-series and image data, demonstrating its generality and performance advantages. The method shows outstanding results in terms of privacy protection and computational efficiency.

In summary, by combining symmetric projection, autoencoders, and adversarial training, this work provides a general solution that balances security and utility in the field of privacy protection.

2. Related Work

2.1. Latent Space

Latent space is a fundamental concept in data representation learning, where high-dimensional data are mapped to a lower-dimensional space. This transformation allows for the capture of key features within the data while reducing redundant information [16,17]. Essentially, it serves as a process of feature abstraction, aiming to preserve the intrinsic structure of the data while enhancing computational efficiency and robustness for subsequent processing. In traditional machine learning methods, data are typically handled in their original high-dimensional form [18,19,20]. For instance, in image recognition tasks, each image is represented as a high-dimensional vector consisting of thousands of pixel values. Processing such high-dimensional data is computationally expensive and sensitive to noise and redundant information, which may lead to inefficient learning and even overfitting [21]. To address these challenges, the introduction of latent space has proven to be an effective solution.

The core idea of latent space is to map data from their original high-dimensional space to a lower-dimensional latent space. In this lower-dimensional space, the key features of the data are effectively preserved, while redundant or irrelevant information is discarded. This mapping is typically achieved through models such as deep neural networks or autoencoders [22,23]. Within the autoencoder framework, input data are processed through the encoder and ultimately mapped to the latent space. Each point in the latent space represents a compressed version of the data, typically containing the essential information from the original input. The mapping process to latent space can be expressed as follows:

$$ z = f(x), \tag{1} $$

where $x$ denotes the input data and $f$ is the mapping function that transforms the high-dimensional input $x$ into the lower-dimensional representation $z$ in the latent space. The mapping function $f$ is typically learned via neural networks. In the autoencoder framework, the task of the encoder network is to learn the mapping function that converts the input data $x$ into the low-dimensional representation $z$ in the latent space. The decoder network then reconstructs the original data $x$ based on $z$. Through this process, the latent space representation retains the key information of the input data. To better understand the role of latent space, the relationship between the latent space and the input data can be further analyzed. Suppose the input data $x$ belongs to a high-dimensional space with dimension $d$, while the latent space has dimension $k$, typically with $k < d$. In this case, the objective of learning latent space is to compress the data from the $d$-dimensional space to the $k$-dimensional space while preserving the key features of the original data as much as possible. The model is trained by optimizing a loss function such that the resulting latent representation effectively retains the structural information of the original data [24].

The concept of latent space is not only applied in autoencoders and variational autoencoders but is also widely used in other deep learning models. For example, Generative Adversarial Networks (GANs) also utilize latent space in generating samples. In GANs, the generator maps a random noise vector to the latent space, and then generates realistic data samples based on the latent vectors [25]. In this context, latent space serves to map random noise to a meaningful space, enabling the generated samples to possess structure and regularity. The application of latent space is not limited to generative models; it also extends to tasks such as classification and regression [26,27]. For instance, in classification tasks, the latent space transforms input data into a more compact representation, enabling the classifier to distinguish between different data classes more effectively. In regression tasks, the latent space captures underlying patterns in a lower-dimensional representation, thereby enhancing the model’s regression performance.

2.2. Adversarial Learning

Adversarial learning has emerged as a key research direction in machine learning in recent years, with its core idea rooted in game theory. It involves two competing models (the generator and the discriminator) that jointly achieve data distribution fitting and privacy protection [28,29,30]. The most representative method in adversarial learning is GANs, where the generator creates samples that closely resemble the real data distribution, while the discriminator distinguishes between generated samples and real samples, ultimately leading to a mutual improvement of both the generator and the discriminator [31]. The principles of adversarial learning can be expressed in the following mathematical formulations. Let the real data distribution be denoted as $p_{\mathrm{real}}(X)$. The objective of the generator $G$ is to sample from the noise distribution $p_z(z)$ and generate samples $X'$ that approximate the real data distribution as closely as possible:

$$ X' = G(z; \theta_G), \tag{2} $$

where $z$ is the random noise input to the generator and $\theta_G$ represents the parameters of the generator. Simultaneously, the goal of the discriminator $D$ is to distinguish between real and generated data. Its input is a real data sample $X$ or a generated sample $X'$, and its output is the classification probability, defined as:

$$ D(X; \theta_D) \in [0, 1], \tag{3} $$

where $\theta_D$ represents the parameters of the discriminator. A value of $D(X)$ close to 1 indicates that the data originate from the real distribution, while a value close to 0 indicates that the data come from the generator. Because the logarithms in the objective are undefined when $D(X) = 0$ or $D(X) = 1$, we introduce a small positive constant $\epsilon$ (e.g., $\epsilon = 10^{-8}$) to ensure numerical stability. Specifically, the range of $D(X)$ is constrained to $[\epsilon, 1 - \epsilon]$ by applying the clamping transformation $D(X) \leftarrow \max(\min(D(X), 1 - \epsilon), \epsilon)$. The revised adversarial objective function is expressed as:

$$ \min_{G} \max_{D} L_{\mathrm{GAN}} = \mathbb{E}_{X \sim p_{\mathrm{real}}}[\log D(X)] + \mathbb{E}_{z \sim p_z}[\log(1 - D(G(z)) + \epsilon)], \tag{4} $$

where $L_{\mathrm{GAN}}$ represents the adversarial loss function for GANs. This adjustment ensures that logarithmic operations remain defined under all circumstances and improves the stability of the optimization process. Through this optimization, the discriminator aims to maximize its ability to differentiate between real and generated data, while the generator seeks to minimize the discriminator’s ability to distinguish between the generated data and real data. In privacy-preserving scenarios, adversarial learning has been extended to generate obfuscated low-dimensional representations that make the generated data difficult to reverse into the original sensitive information while retaining the key information necessary for model training [32,33,34]. The incorporation of numerical stability measures in the discriminator and generator loss functions further enhances the robustness of adversarial learning frameworks in privacy-sensitive applications.
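
As a rough illustration of the clamping step and the objective in Equation (4), the following NumPy sketch estimates the value of the objective from batches of discriminator outputs. The function names `clamp_probability` and `gan_value` are hypothetical, the expectations are approximated by batch means, and no generator or discriminator network is modeled here.

```python
import numpy as np

EPS = 1e-8  # the small constant epsilon suggested in the text


def clamp_probability(d_out, eps=EPS):
    """Clamp discriminator outputs to [eps, 1 - eps]: max(min(D, 1 - eps), eps)."""
    return np.maximum(np.minimum(d_out, 1.0 - eps), eps)


def gan_value(d_real, d_fake, eps=EPS):
    """Batch estimate of Eq. (4).

    d_real: discriminator outputs D(X) on real samples.
    d_fake: discriminator outputs D(G(z)) on generated samples.
    Expectations are replaced by batch means (an assumption of this sketch).
    """
    d_real = clamp_probability(np.asarray(d_real, dtype=float), eps)
    d_fake = clamp_probability(np.asarray(d_fake, dtype=float), eps)
    real_term = np.mean(np.log(d_real))               # E[log D(X)]
    fake_term = np.mean(np.log(1.0 - d_fake + eps))   # E[log(1 - D(G(z)) + eps)]
    return real_term + fake_term


# Saturated outputs (exactly 0 or 1) no longer produce -inf or NaN.
value = gan_value([1.0, 0.0], [1.0, 0.0])
```

In this reading, the discriminator would ascend this value while the generator descends it; how the two updates are scheduled is not specified in this section.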

2.3. Autoencoders

Autoencoders are unsupervised learning models widely used for dimensionality reduction, feature learning, and data reconstruction [35,36,37]. The primary idea behind autoencoders is to map the input data through an encoder to a lower-dimensional latent space and then reconstruct the data back to their original form through a decoder [38,39]. The core of the autoencoder lies in the construction of the latent space, which typically has a lower dimension than the input data. This means that the autoencoder extracts key features from the input data through compression, thereby providing an efficient representation of the data. The architecture of the autoencoder makes it highly effective in noise removal and anomaly detection [40,41].

The encoder maps the input data $x$ to a low-dimensional representation $h$ in the latent space through a function $h = f(x)$, where $h$ is a vector in the latent space. The encoder typically uses a series of neural network layers for feature extraction and mapping, continuously optimizing parameters so that the resulting latent representation retains the core information of the input data. The decoder is responsible for mapping the low-dimensional representation $h$ back to a reconstructed version $\hat{x}$ that closely matches the original input data $x$, i.e., $g(h) = \hat{x}$. The encoder and decoder are trained jointly to minimize the reconstruction error between the input data $x$ and the reconstructed data $\hat{x}$. The mean squared error (MSE) is commonly used as the loss function, which can be expressed as:

$$ L = \| x - \hat{x} \|^{2} = \sum_{i=1}^{n} (x_i - \hat{x}_i)^{2}, \tag{5} $$

where $x_i$ and $\hat{x}_i$ represent the $i$th component of the input data $x$ and the reconstructed data $\hat{x}$, respectively, and $n$ is the dimensionality of the data. By minimizing the loss function, the autoencoder adjusts the parameters of both the encoder and decoder, ensuring that the input data $x$ are effectively represented in the latent space and that the output $\hat{x}$ reconstructed by the decoder closely resembles the original input. In the autoencoder structure, the latent space is typically designed with a dimension smaller than the input data dimension. This implies that, during the mapping process, the encoder discards some redundant information and retains only the key features that effectively represent the data. This compression process improves the model’s generalization ability, enabling the autoencoder to learn the underlying patterns of the input data. However, the choice of latent space dimensionality needs to be carefully considered. A dimensionality that is too low may lead to information loss, negatively affecting reconstruction quality, while one that is too high may result in overfitting.

To better understand the role of latent space, the mathematical model of the autoencoder can be used. Suppose the input data $x$ is a $d$-dimensional vector. The encoder maps the input data to a low-dimensional representation $h$ in the latent space using a mapping function $h = f(x; \theta)$, where $\theta$ denotes the parameters of the encoder. Then, the decoder reconstructs the data $\hat{x}$ from the latent representation $h$ using a mapping function $\hat{x} = g(h; \phi)$, where $\phi$ represents the parameters of the decoder. The entire autoencoder process can be expressed as:

$$ \hat{x} = g(f(x; \theta); \phi), \tag{6} $$

where $f$ and $g$ represent the encoder and decoder functions, respectively, parameterized by neural networks. The training goal of the autoencoder is to optimize the parameters $\theta$ and $\phi$ by minimizing the loss function $L$, i.e., minimizing the difference between the input data $x$ and the reconstructed data $\hat{x}$. The construction of latent space is crucial to the performance of the autoencoder. Typically, each dimension in the latent space represents an important feature. The autoencoder can effectively extract key features, which are then used to reconstruct the original data during decoding. This property has led to the successful application of autoencoders in many practical domains, such as image compression and denoising. In image compression and denoising tasks, autoencoders map image data to latent space, remove redundant parts, and then reconstruct clear images through the decoder [42,43]. In anomaly detection tasks, autoencoders learn the normal patterns of data and can detect abnormal data that deviate from the normal patterns [44,45]. Additionally, denoising autoencoders (DAEs) introduce noise interference into the autoencoder framework and restore the original data by minimizing reconstruction errors [46]. This method enhances the model’s robustness, allowing effective learning and feature extraction even in the presence of noisy data.
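
To make Equations (5) and (6) concrete, the sketch below runs one forward pass of a tiny single-layer autoencoder in NumPy. The dimensions, the tanh activation, and the random untrained weights are illustrative assumptions and do not reflect the architecture used in this paper.

```python
import numpy as np

rng = np.random.default_rng(0)
d, k = 8, 3  # input dimension d and latent dimension k < d (illustrative values)

# Hypothetical parameters: theta = (W_enc, b_enc), phi = (W_dec, b_dec).
W_enc = rng.normal(scale=0.1, size=(d, k))
b_enc = np.zeros(k)
W_dec = rng.normal(scale=0.1, size=(k, d))
b_dec = np.zeros(d)


def encode(x):
    """h = f(x; theta): one dense layer with a tanh nonlinearity (assumed)."""
    return np.tanh(x @ W_enc + b_enc)


def decode(h):
    """x_hat = g(h; phi): one linear dense layer (assumed)."""
    return h @ W_dec + b_dec


def reconstruction_loss(x, x_hat):
    """Eq. (5): squared L2 distance summed over the n components."""
    return float(np.sum((x - x_hat) ** 2))


x = rng.normal(size=d)
h = encode(x)          # latent representation, shape (k,)
x_hat = decode(h)      # Eq. (6): x_hat = g(f(x; theta); phi), shape (d,)
loss = reconstruction_loss(x, x_hat)
```

Training would adjust the four parameter arrays by gradient descent on `loss`; that step is omitted here to keep the example short.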

3. Materials and Methods

3.1. Dataset Collection

To validate the effectiveness and applicability of the proposed method for data security and privacy protection, multiple representative datasets were selected for experimentation. To demonstrate the versatility of the method, the experiments cover a range of domains, including image classification, object detection, and time-series data analysis. The datasets chosen for each domain are representative in terms of data size and task complexity. The following describes the datasets used in the experiments and their respective application scenarios. In the image classification domain, the MNIST-USPS, BreakHis, and CelebA datasets were selected, as shown in Figure 1. The MNIST-USPS dataset is a classic dataset in the field of handwritten digit classification, where the MNIST dataset contains approximately 60,000 training images and 10,000 test images, while the USPS dataset contains 9298 images. These two datasets are widely used in low-resolution handwriting recognition tasks and provide an ideal scenario for testing the basic classification ability of the algorithm. Meanwhile, the CelebA dataset is used for complex facial image tasks, including gender classification, age prediction, and multi-attribute label generation. It contains more than 200,000 labeled facial images, each with 40 attributes. Its high data complexity and diverse scenarios make it a good reflection of the algorithm’s performance in high-dimensional feature spaces. Additionally, in the field of medical image analysis, the BreakHis dataset was chosen. This dataset contains 7909 microscopic tissue slice images of breast tumors, covering benign and malignant samples at magnification levels of 4×, 10×, 20×, and 40×. The task is to accurately identify tumor malignancy through image classification. This dataset is widely used in medical diagnostic research, with high image quality and precise labels, providing a feasible validation for the method when processing medical image data.

For time-series data analysis, the study utilized the credit card fraud detection dataset and the IMDb movie review dataset. The credit card fraud detection dataset contains 284,807 credit card transaction records, with 492 labeled fraudulent samples. The task objective is to achieve precise classification of fraudulent transactions through pattern recognition. This dataset is highly imbalanced, which challenges the model’s robustness and sensitivity. The IMDb movie review dataset, on the other hand, is used for sentiment analysis and contains 50,000 labeled text reviews categorized into positive and negative sentiments. It serves as an important basis for testing the algorithm’s text comprehension and sentiment classification accuracy. The specific number of samples in each dataset is summarized in Table 2.

3.2. Data Augmentation

3.2.1. Time-Series Data Cleaning and Imputation

Time-series data, commonly utilized in domains such as finance, healthcare, and IoT, are characterized by sequential recordings of events or measurements over time. However, raw time-series data often suffer from issues such as noise, missing values, and outliers. If left unaddressed, these issues can severely compromise data quality and model performance. To mitigate these challenges, a tailored cleaning and imputation strategy was proposed to enhance data completeness and accuracy, thereby providing high-quality input for subsequent model training.

The primary objective of time-series data cleaning is to detect and handle noise and outliers. Assuming a time-series dataset $X = \{x_1, x_2, \dots, x_T\}$, where $T$ represents the temporal length and $x_t$ denotes the observation at time $t$, outlier detection was conducted using a statistical standard deviation-based method, which is a straightforward yet effective approach. The detection rule for outliers is expressed as:

$$ x_t \text{ is identified as an outlier} \iff |x_t - \mu| > \alpha \sigma, \tag{7} $$

where $\mu$ and $\sigma$ denote the mean and standard deviation of the time-series data and $\alpha$ is a hyperparameter typically set to values between 2 and 3 to control sensitivity to outliers. This approach quantifies the deviation of each data point from the mean to identify potential outliers. Detected outliers are subsequently replaced to ensure continuity and consistency within the data. Interpolation methods were utilized for outlier replacement, balancing the preservation of data trends and minimizing model distortion. Among these, linear interpolation is widely used for data with simple trends. The linear interpolation formula is given as:

$$ x_t = \frac{x_{t-1} + x_{t+1}}{2}, \tag{8} $$

where $x_{t-1}$ and $x_{t+1}$ represent the observations immediately preceding and following the outlier, respectively. This replacement strategy reduces disruption to the overall data distribution. For more complex data trends, spline interpolation was employed to better fit the underlying patterns and enhance precision. Missing values are another prevalent issue in time-series data. To address this, multiple imputation strategies were designed, considering the stationarity and periodicity of the data. Forward fill is a straightforward method that fills missing values with the last observed value, defined as:

$$ x_t = x_{t-1}, \quad \text{if } x_t \text{ is missing}. \tag{9} $$

This method assumes short-term stability in the data, making it suitable for datasets with short-term trends. For data with strong periodicity, such as daily or seasonal patterns, a periodic mean imputation strategy was introduced to leverage historical cyclic patterns. The formula for periodic mean imputation is as follows:

$$ x_t = \frac{1}{N} \sum_{i=1}^{N} x_{t - i \cdot P}, \tag{10} $$

where P represents the period length and N is the number of reference cycles. This method assumes stability in periodic characteristics, preserving periodic trends when filling missing values. For scenarios involving significant consecutive missing values, such as sensor failures, multi-variate interpolation and machine learning models were employed to generate more accurate imputed values by exploring latent relationships within the data.
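
The outlier rule in Equation (7) and the neighbor-average replacement in Equation (8) can be sketched as follows. The function name `replace_outliers` is hypothetical, the population standard deviation is an assumption, and the sketch assumes that the neighbors of an outlier are themselves valid observations; endpoints are left unchanged because they lack one neighbor.

```python
import numpy as np


def replace_outliers(x, alpha=3.0):
    """Flag points with |x_t - mu| > alpha * sigma (Eq. 7) and replace
    interior outliers with the mean of their two neighbors (Eq. 8)."""
    x = np.asarray(x, dtype=float)
    mu, sigma = x.mean(), x.std()          # population std (assumption)
    is_outlier = np.abs(x - mu) > alpha * sigma
    cleaned = x.copy()
    for t in np.flatnonzero(is_outlier):
        if 0 < t < len(x) - 1:             # endpoints have only one neighbor
            cleaned[t] = 0.5 * (x[t - 1] + x[t + 1])
    return cleaned, is_outlier


series = np.ones(12)
series[5] = 100.0                          # a single injected spike
cleaned, mask = replace_outliers(series, alpha=3.0)
```

Because the mean and standard deviation are computed over the whole series, a large spike inflates both; rolling-window statistics are a common alternative that is not described in this section.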

In summary, the proposed time-series data cleaning and imputation strategy encompassed noise detection, outlier handling, and missing value imputation. By integrating statistical detection, interpolation, and diverse imputation strategies, significant improvements in data quality were achieved.
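
For the two imputation rules in Equations (9) and (10), a minimal NumPy sketch is given below, with missing values encoded as NaN. The function names are hypothetical, and averaging only over the reference cycles that exist and are observed is an assumption added here so that the example handles the start of the series.

```python
import numpy as np


def forward_fill(x):
    """Eq. (9): replace a missing x_t with x_{t-1}. A leading NaN stays NaN."""
    x = np.asarray(x, dtype=float).copy()
    for t in range(1, len(x)):
        if np.isnan(x[t]):
            x[t] = x[t - 1]
    return x


def periodic_mean_fill(x, period, n_cycles):
    """Eq. (10): replace a missing x_t with the mean of x_{t - i*period},
    i = 1..n_cycles, using only references that exist and are not missing."""
    x = np.asarray(x, dtype=float).copy()
    for t in range(len(x)):
        if np.isnan(x[t]):
            refs = [x[t - i * period] for i in range(1, n_cycles + 1)
                    if t - i * period >= 0]
            refs = [r for r in refs if not np.isnan(r)]
            if refs:                       # leave the gap if no reference is usable
                x[t] = float(np.mean(refs))
    return x


filled_ff = forward_fill([1.0, np.nan, np.nan, 4.0])
filled_pm = periodic_mean_fill(
    [1.0, 2.0, 3.0, 3.0, 4.0, 5.0, np.nan, 6.0, 7.0], period=3, n_cycles=2
)
```

In the second call, the missing value at index 6 is filled with the mean of the values one and two periods earlier (indices 3 and 0).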

3.2.2. Image Data Augmentation

Image data augmentation is a crucial preprocessing technique designed to diversify training data by applying various transformations to the original samples. This enhances the generalization capability of models, mitigating overfitting issues and improving adaptability to real-world complexities. For the image data in this study, particularly under limited sample conditions, advanced augmentation methods such as Random Erase, GridMask, and CutMix were employed. These techniques perturb or combine local image information to enhance model robustness and generalization to diverse transformations.

Random Erase involves randomly occluding a portion of the image to simulate real-world scenarios where objects may be partially obscured. Let $I \in \mathbb{R}^{H \times W \times C}$ denote an input image with height $H$, width $W$, and $C$ channels. The process selects a random rectangular region and replaces it with fixed values or random noise. The operation is defined as:

$$ I(x : x + l_h,\; y : y + l_w,\; :) = v, \tag{11} $$

where $(x, y)$ is the top-left corner of the occluded region, $l_h$ and $l_w$ are its height and width, and $v$ is the replacement value, typically set to constants or random noise. This method forces the model to rely on global context rather than local features, improving its robustness. GridMask augments images by applying a periodic grid structure for occlusion, which increases spatial feature diversity. For an input image $I$, the grid mask $M$ is defined as:

$$ M(x, y) = \begin{cases} 0, & \text{if } (x \bmod L_x) < d_x \text{ or } (y \bmod L_y) < d_y, \\ 1, & \text{otherwise}, \end{cases} \tag{12} $$

where $L_x$ and $L_y$ represent the grid periods and $d_x$ and $d_y$ are the occlusion widths within each period. The augmented image $I'$ is obtained by element-wise multiplication:

$$ I' = I \odot M. \tag{13} $$

This approach enhances the model’s ability to learn global structural features even with missing local information. CutMix creates augmented samples by mixing two images and their corresponding labels. Given two input images, $I_A$ and $I_B$, with labels $y_A$ and $y_B$, a rectangular region is cut from $I_A$ and replaced with the corresponding region from $I_B$. The region’s position and dimensions are determined as:

$$ r_x = \mathrm{Uniform}(0, W), \quad r_y = \mathrm{Uniform}(0, H), \tag{14} $$

$$ r_w = W \sqrt{1 - \lambda}, \quad r_h = H \sqrt{1 - \lambda}, \tag{15} $$

where $(r_x, r_y)$ specifies the top-left corner of the rectangle, $r_w$ and $r_h$ are its width and height, respectively, and $\lambda \in [0, 1]$ is the mixing ratio, so that the cut region covers a fraction $1 - \lambda$ of the image area. The augmented image $I'$ is defined as:

$$ I'(x, y) = \begin{cases} I_A(x, y), & \text{if } (x, y) \notin \text{cut region}, \\ I_B(x, y), & \text{otherwise}. \end{cases} \tag{16} $$

The labels are mixed linearly as:

$$ y' = \lambda y_A + (1 - \lambda) y_B. \tag{17} $$
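
The CutMix steps in Equations (14) to (17) can be sketched in NumPy as follows. The function name `cutmix` is hypothetical, images are assumed to be stored as height × width × channels arrays, and clipping the rectangle at the image border is an assumption of this sketch; after clipping, the pasted area can be smaller than the nominal fraction $1 - \lambda$, while the label weights below still use the nominal $\lambda$ of Equation (17).

```python
import numpy as np


def cutmix(img_a, img_b, y_a, y_b, lam, rng):
    """Paste a rectangle of img_b into img_a and mix the labels."""
    H, W = img_a.shape[:2]
    r_w = int(W * np.sqrt(1.0 - lam))      # Eq. (15): rectangle width
    r_h = int(H * np.sqrt(1.0 - lam))      # Eq. (15): rectangle height
    r_x = int(rng.uniform(0, W))           # Eq. (14): left edge (column index)
    r_y = int(rng.uniform(0, H))           # Eq. (14): top edge (row index)
    x2 = min(r_x + r_w, W)                 # clip at the border (assumption)
    y2 = min(r_y + r_h, H)
    mixed = img_a.copy()
    mixed[r_y:y2, r_x:x2] = img_b[r_y:y2, r_x:x2]      # Eq. (16)
    y_mixed = lam * np.asarray(y_a, dtype=float) \
        + (1.0 - lam) * np.asarray(y_b, dtype=float)   # Eq. (17)
    return mixed, y_mixed


rng = np.random.default_rng(0)
img_a = np.zeros((4, 4, 3))
img_b = np.ones((4, 4, 3))
mixed, y_mixed = cutmix(img_a, img_b, [1.0, 0.0], [0.0, 1.0], lam=0.75, rng=rng)
```

A common refinement, not stated in this section, is to recompute $\lambda$ from the clipped area so that the label weights match the pixels actually pasted.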

These augmentation methods collectively enhance data diversity and robustness, reducing overfitting risks and improving model performance across diverse tasks.
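
The two occlusion-based augmentations, Random Erase in Equation (11) and GridMask in Equations (12) and (13), can be sketched as follows. The function names are hypothetical, $x$ is treated as the row index and $y$ as the column index to match the slicing order in Equation (11), and the rectangle is sampled so that it always fits inside the image, which is an assumption of this sketch.

```python
import numpy as np


def random_erase(img, l_h, l_w, rng, value=0.0):
    """Eq. (11): overwrite a random l_h x l_w rectangle with a constant value."""
    H, W = img.shape[:2]
    x = int(rng.integers(0, H - l_h + 1))  # top row, chosen so the box fits
    y = int(rng.integers(0, W - l_w + 1))  # left column
    out = img.copy()
    out[x:x + l_h, y:y + l_w, :] = value
    return out


def grid_mask(H, W, L_x, L_y, d_x, d_y):
    """Eq. (12): 0 where (x mod L_x) < d_x or (y mod L_y) < d_y, else 1."""
    xs = np.arange(H)[:, None]
    ys = np.arange(W)[None, :]
    occluded = ((xs % L_x) < d_x) | ((ys % L_y) < d_y)
    return np.where(occluded, 0.0, 1.0)


def apply_grid_mask(img, mask):
    """Eq. (13): element-wise product, broadcasting the mask over channels."""
    return img * mask[:, :, None]


rng = np.random.default_rng(0)
img = np.ones((6, 6, 3))
erased = random_erase(img, l_h=2, l_w=3, rng=rng)

mask = grid_mask(4, 4, L_x=2, L_y=2, d_x=1, d_y=1)
masked = apply_grid_mask(np.ones((4, 4, 3)), mask)
```

With periods of 2 and occlusion widths of 1, only pixels at odd row and odd column indices remain visible, which is an aggressive setting chosen here only to keep the example small.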

3.3. Proposed Method

The proposed data security training framework, based on symmetric projection spaces and adversarial training, consists of several interconnected modules: data input, the symmetric projection space extractor, adversarial training with a generator and discriminator, an autoencoder-based data generation module, and a projection loss function, as shown in Figure 2.

These modules work together to achieve a balance between data privacy protection and model utility. The detailed construction and interactions of each module are described below. First, the preprocessed input data

X \in R^{n \times d}

are fed into the first module of the model—the symmetric projection space extractor. The primary task of this module is to map the high-dimensional data into a lower-dimensional latent space through a nonlinear transformation and obfuscate the input data to protect data privacy. The symmetric projection space extractor learns a mapping function

f (\cdot)

that maps the input data to a low-dimensional abstract space, producing a low-dimensional representation

Z \in R^{n \times k}

, where

k < d

. This low-dimensional representation retains the essential information of the input data. The processed data

Z

are then passed into the adversarial training network framework. In this framework, the generator G is responsible for generating obfuscated data

X^{'}

based on the low-dimensional representation

Z

while maintaining data privacy. The generated data

X^{'}

are encrypted so that they are difficult to reverse-engineer into the original sensitive data. Meanwhile, the discriminator D is tasked with determining whether a sample comes from the real data distribution or from the generator. The generator and discriminator optimize each other through adversarial learning: the generator continuously improves the privacy of the generated data, while the discriminator increases its ability to differentiate between real and generated data. The two objectives are opposed: the generator is trained to reduce the discriminator’s ability to tell real from generated data, while the discriminator is trained to maximize its discriminatory power. Furthermore, an autoencoder-based data generation module is introduced to further enhance the accuracy of privacy protection. In this module, the input data

X

are mapped into the latent space by an encoder to obtain the latent representation

H

, which is then decoded back into data

X^{'}

by the decoder. This process obfuscates the data during encoding and decoding, effectively separating sensitive and nonsensitive information and generating a privacy-preserving representation of the data. Unlike traditional noise addition methods, the autoencoder module improves privacy protection and training efficiency by optimizing the network weights. Finally, a projection loss function is employed to optimize the balance between data obfuscation and model utility. It consists of two components: reconstruction error and privacy loss. The reconstruction error measures the similarity between the generated data

X^{'}

and the original data

X

, while the privacy loss evaluates the effectiveness of data privacy protection based on the output of the discriminator.

3.3.1. Adversarial Training Network Framework

The proposed adversarial training network framework utilizes a game-theoretic mechanism between a generator and a discriminator, aiming to ensure that the generated data not only has high privacy but also maintains distribution consistency with the original data, as shown in Figure 3. This framework is an improvement over the traditional GAN structure, adapted to meet the requirements of data privacy protection tasks. Through the optimization of both the generator and the discriminator, the adversarial training framework enables the balance between privacy protection and data utility, thereby maximizing the privacy protection capability of the model while maintaining the effectiveness of the training data.

In this framework, both the generator and the discriminator are structured as deep neural networks. The generator G is tasked with generating obfuscated training data

X^{'}

based on the low-dimensional representation z (obtained through the symmetric projection space extractor). The structure of the generator G consists of several fully connected layers, combined with activation functions (such as ReLU or LeakyReLU) to introduce nonlinearity and enhance its representational power. The input to the generator is the low-dimensional representation

z \in R^{n \times k}

, where k is the dimension of the latent space, typically set to a value smaller than the input data dimension d (i.e.,

k < d

). The output of the generator is the generated data

X^{'} \in R^{n \times d}

, which shares the same dimensionality as the original data

X

. The network structure of the generator generally includes several fully connected layers, batch normalization layers, and an output layer.

The goal of the generator is to learn the optimal parameters

θ_{G}

so that the generated data closely match the distribution of the real data

X

while ensuring sufficient privacy protection. The discriminator D is responsible for classifying the input samples (real data

X

or generated data

X^{'}

) to determine whether they come from the real data distribution. The discriminator is typically modeled using CNN or FCN, where the structure consists of several convolutional layers or fully connected layers, followed by activation functions to produce an output probability representing the likelihood that the input data come from the real data distribution. The output of the discriminator is

D (X; θ_{D}) \in [0, 1]

, where

D (X)

close to 1 indicates that the data come from the real distribution, and close to 0 indicates that the data are generated. The goal of the discriminator is to maximize its ability to distinguish between real data and generated data. The entire adversarial training process achieves a game-theoretic balance by optimizing the objective functions of the generator G and the discriminator D.

Through adversarial training, the generator continuously improves the privacy of the generated data, while the discriminator improves its ability to differentiate between real and generated data. In privacy protection tasks, the game between the generator G and the discriminator D aims not only to generate more realistic data but also to ensure that the generated data effectively protect privacy. Specifically, the generator decodes the low-dimensional representation z to generate data

X^{'}

with privacy-preserving features, while the discriminator performs classification tasks to determine whether the generated data exhibit sufficient privacy. The adversarial mechanism between the generator and the discriminator effectively prevents the leakage of sensitive information, thus avoiding the accuracy losses that may arise from traditional noise addition methods. Mathematically, the optimization goal of the generator is to minimize its component of the adversarial loss function, which drives the discriminator’s output on generated data, D(G(z)), toward 1 so that the generated data are hard to tell apart from real data, that is:

min_{G} E_{z \sim p_{z}} [log (1 - D (G (z)))],

(18)

while the discriminator aims to maximize its ability to distinguish between generated and real data:

max_{D} E_{X \sim p_{real}} [log D (X)] + E_{z \sim p_{z}} [log (1 - D (G (z)))],

(19)

Through this game-theoretic mechanism, the generator and discriminator optimize each other, ensuring that the generated data maintain privacy while retaining the key features of the original data. The structure and parameter selection of both the generator G and the discriminator D are critical in this task. The generator G must have sufficient representational capacity to ensure that the data

X^{'}

generated from the low-dimensional representation z closely resemble the original data structure and features. At the same time, the discriminator D needs a powerful network structure capable of effectively distinguishing between generated and real data. In practice, the structures of the generator and discriminator can be implemented through a combination of multi-layer fully connected networks, convolutional networks, and other architectures to enhance the model’s representational and discriminative capabilities.
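The two objectives in Eqs. (18) and (19) can be evaluated directly from discriminator outputs. The sketch below is illustrative only: the function names are hypothetical, expectations are replaced by batch means, and the small constant `eps` is an assumption added for numerical safety when an output is exactly 0 or 1.

```python
import numpy as np

def discriminator_objective(d_real, d_fake, eps=1e-12):
    """Quantity maximized by D in Eq. (19): E[log D(X)] + E[log(1 - D(G(z)))]."""
    return np.mean(np.log(d_real + eps)) + np.mean(np.log(1.0 - d_fake + eps))

def generator_objective(d_fake, eps=1e-12):
    """Quantity minimized by G in Eq. (18): E[log(1 - D(G(z)))]."""
    return np.mean(np.log(1.0 - d_fake + eps))

# d_real = D(X) on real samples, d_fake = D(G(z)) on generated samples.
confident_d = discriminator_objective(np.array([0.9]), np.array([0.1]))
confused_d = discriminator_objective(np.array([0.5]), np.array([0.5]))
```

Here `confident_d` exceeds `confused_d`, so the discriminator is rewarded for separating the two sources, while the value returned by `generator_objective` falls as D(G(z)) rises, which is what makes the two players adversaries.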

3.3.2. Symmetric Projection Space Extractor

The symmetric projection space extractor is a key module in the privacy protection framework proposed in this study. Its primary task is to map high-dimensional sensitive data into a lower-dimensional latent space through nonlinear transformations, while ensuring data privacy protection during the projection process by utilizing symmetry design, as shown in Figure 4.

Specifically, this module is designed to learn a mapping function that reduces the input data from their original high-dimensional space to a low-dimensional abstract space, preserving essential information while minimizing the risk of sensitive information leakage. The structure of the symmetric projection space extractor is built upon a deep neural network, primarily composed of several fully connected (FC) layers, with ReLU activation functions employed to enhance the network’s nonlinear representational power. The input to the network is high-dimensional data

X \in R^{n \times d}

, where n represents the number of samples and d denotes the feature dimension of the input data. The output is a low-dimensional latent representation

Z \in R^{n \times k}

, where k is the dimensionality of the latent space and is typically chosen such that

k < d

. The detailed architecture of the symmetric projection space extractor is outlined as follows:

1.: Input layer: The input data $X$ are mapped through a fully connected layer with output dimension $h_{1}$ . Assuming the input feature dimension is d, the parameters of the first layer are $W_{1} \in R^{d \times h_{1}}$ and the bias is $b_{1} \in R^{h_{1}}$ , with the activation function being ReLU:

$H_{1} = ReLU (X W_{1} + b_{1}),$

(20)

where $H_{1} \in R^{n \times h_{1}}$ .
2.: Hidden layers: Data are transformed through multiple hidden layers, with the output dimension gradually decreasing. The parameters of the i-th layer are $W_{i} \in R^{h_{i - 1} \times h_{i}}$ , the bias is $b_{i} \in R^{h_{i}}$ , and the activation function remains ReLU:

$H_{i} = ReLU (H_{i - 1} W_{i} + b_{i}),$

(21)

where $H_{i} \in R^{n \times h_{i}}$ .
3.: Output layer: The final layer maps the data to the low-dimensional latent space $Z \in R^{n \times k}$ , where k is the latent space dimension. The output layer’s parameters are $W_{out} \in R^{h_{m} \times k}$ , the bias is $b_{out} \in R^{k}$ , and the activation function is a linear function to ensure continuous output:

$Z = H_{m} W_{out} + b_{out},$

(22)

where $Z$ is the low-dimensional latent representation.
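The three-stage mapping in Eqs. (20)–(22) amounts to a stack of ReLU layers followed by a linear output layer. The following NumPy forward pass is a minimal sketch under stated assumptions: the layer widths, the He-style random initialization, and the names `init_extractor` and `extract` are all illustrative and are not specified in the text.

```python
import numpy as np

def relu(a):
    return np.maximum(a, 0.0)

def init_extractor(d, hidden, k, rng):
    """Random (W, b) pairs for layer sizes d -> hidden[0] -> ... -> hidden[-1] -> k."""
    sizes = [d, *hidden, k]
    return [(rng.normal(0.0, np.sqrt(2.0 / m), size=(m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def extract(X, params):
    """Eqs. (20)-(21) for the ReLU layers, Eq. (22) for the linear output layer."""
    H = X
    for W, b in params[:-1]:
        H = relu(H @ W + b)        # H_i = ReLU(H_{i-1} W_i + b_i)
    W_out, b_out = params[-1]
    return H @ W_out + b_out       # Z = H_m W_out + b_out

rng = np.random.default_rng(0)
params = init_extractor(d=16, hidden=[12, 8], k=4, rng=rng)
Z = extract(rng.normal(size=(5, 16)), params)   # Z has shape (n, k) = (5, 4)
```

The decreasing widths 16 → 12 → 8 → 4 mirror the gradual compression described in the hidden-layer step; in training, the parameters would be learned by backpropagation rather than left at their random initial values.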

The core idea of the symmetric projection space extractor is to reduce the dimensionality of the data gradually through multiple fully connected layers within the deep neural network. By progressively compressing the feature dimensions, the network ensures that the final low-dimensional representation

Z

effectively retains the key information of the input data, while redundant information is gradually discarded. The network’s parameters are optimized via backpropagation to minimize the loss function. A specific loss function, such as the projection loss function introduced in this study, combines reconstruction error and privacy loss to optimize the balance between data privacy and model utility. During the design of the symmetric projection space extractor, symmetry constraints are applied to ensure the smoothness and robustness of the mapping. The principle of symmetry design is to share or constrain network parameters so that the data transformation during projection has a similar effect on each dimension of the input data. This avoids the overcompression of certain dimensions, while preserving other dimensions, resulting in more balanced feature compression and information retention. These mathematical constraints can be embodied in the loss function, for instance, through regularization terms such as L2 regularization or symmetry regularization. Through these constraints, the network can maximize privacy protection while retaining the structural information of the data. The application of the symmetric projection space extractor ensures that the data are both obfuscated for privacy protection and optimized for downstream tasks. By using nonlinear transformations, the extractor can efficiently reduce data dimensions while maintaining the critical features necessary for model training and inference. Furthermore, the introduction of symmetry constraints guarantees that data transformations do not disproportionately affect certain features, thus promoting better privacy protection and more balanced data representations. This design provides a key advantage in privacy-sensitive tasks, where ensuring that sensitive information is effectively obscured without compromising the utility of the data is paramount.

3.3.3. Autoencoder-Based Data Generation Module

The autoencoder-based data generation module plays a crucial role in the privacy protection framework proposed in this study. Autoencoders effectively separate and reconstruct sensitive and nonsensitive information by encoding the input data into a low-dimensional latent representation and then generating obfuscated data through the decoder, thereby providing efficient data privacy protection. Unlike traditional attention mechanisms, which focus on weighting features to highlight important information, autoencoders rely primarily on the encoding and decoding process to compress and reconstruct data, as shown in Algorithm 1.

In privacy protection tasks, the autoencoder-based data generation module offers a distinct advantage, as it can accurately capture key features in sensitive data through learned network weights while avoiding unnecessary privacy leakage. The specific network structure is as follows:

1.: Encoder design: The encoder consists of several fully connected layers designed to map the input data $X$ to a low-dimensional latent space $H$ . Assume the input data $X$ have dimension d, and the encoder’s output $H$ has dimension k (typically, $k ≪ d$ ). The first layer of the encoder maps the input data to an intermediate layer $h_{1}$ using a fully connected layer, with parameters $W_{1} \in R^{d \times h_{1}}$ , and bias $b_{1} \in R^{h_{1}}$ , using the ReLU activation function. This process continues, progressively compressing the data features down to the latent space $H \in R^{n \times k}$ .
2.: Decoder design: The decoder’s task is to reconstruct the original data $X^{'}$ from the latent representation $H$ . The structure of the decoder is similar to the encoder and consists of several fully connected layers. Assume that the final output of the decoder is the reconstructed data $X^{'} \in R^{n \times d}$ . The first layer of the decoder maps the latent representation $H$ back to the high-dimensional space:

$X^{'} = Sigmoid (H W_{out} + b_{out}),$

(23)

where $W_{out} \in R^{k \times d}$ is the weight matrix of the decoder, $b_{out} \in R^{d}$ is the bias, and the activation function is Sigmoid (or ReLU, depending on the task requirements), ensuring that the reconstructed data maintain the structure of the original data.
3.: Network parameter design: For each layer of the network, careful consideration is given to the matching of input and output dimensions and the network’s expressiveness. The encoder typically uses intermediate layers $h_{1}, h_{2}, \dots, h_{n}$ , where $h_{1}$ is the output of the first layer, progressively increasing the feature abstraction capacity until the data are compressed into the latent space $h_{n} = k$ . The decoder then reconstructs the original data dimensions based on the latent space data through reverse mapping.

The goal of the autoencoder is to minimize the reconstruction error between the input data

X

and the reconstructed data

X^{'}

. To achieve this, the MSE is used as the loss function, which is expressed as:

L_{recon} = \frac{1}{n} {∥ X - X^{'} ∥}^{2} = \frac{1}{n} \sum_{i = 1}^{n} {∥ X_{i} - X_{i}^{'} ∥}^{2},

(24)

where

X_{i}

and

X_{i}^{'}

represent the original and reconstructed data of the i-th sample, respectively. By minimizing this loss function, the network learns an effective mapping that makes the reconstructed data as close as possible to the original data. The advantage of the autoencoder lies in its ability to automatically learn the latent structure of the data during the compression process and then precisely restore the data through the decoder. Compared to traditional noise addition methods, the autoencoder achieves a better balance between privacy protection and data utility by learning the low-dimensional representation of the data through training. Since the network weights are optimized via backpropagation, the autoencoder can precisely control both the reconstruction quality of the data and the privacy protection effects. In the adversarial training network framework of this study, the autoencoder-based data generation module is combined with the game mechanism between the generator (G) and the discriminator (D). Specifically, the generator G generates obfuscated data

X^{'}

from the low-dimensional latent representation

Z

, while the discriminator D distinguishes between real and generated data. Through adversarial training, the generator can retain the key information in the data while ensuring privacy protection. The introduction of the autoencoder module further enhances the quality of the generated data, especially in terms of the separation and reconstruction of sensitive information, making it more accurate in achieving data privacy protection compared to traditional noise addition methods. In this framework, the input to the generator G is the low-dimensional representation

Z

, and the privacy-protecting data

X^{'}

is generated through the encoding and decoding process of the autoencoder. The discriminator D is then used to optimize the privacy and authenticity of the generated data. Through this joint training, the generator not only learns how to generate privacy-protecting data but also enhances the privacy and authenticity of the generated data through adversarial training. The autoencoder module further refines the separation of sensitive and nonsensitive information, enabling the generator to better preserve data privacy while improving the effectiveness of the trained model.
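The encoder, the sigmoid decoder of Eq. (23), and the reconstruction error can be sketched as follows. This is a simplified illustration, not the authors' network: a single encoder layer stands in for the several layers described above, the weights are random rather than trained, and the error is averaged over the n samples, an assumption that follows the term "MSE" used in the text.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def encode(X, W_enc, b_enc):
    """One ReLU layer mapping d features to a k-dimensional latent H."""
    return np.maximum(X @ W_enc + b_enc, 0.0)

def decode(H, W_out, b_out):
    """Eq. (23): X' = Sigmoid(H W_out + b_out)."""
    return sigmoid(H @ W_out + b_out)

def reconstruction_loss(X, X_rec):
    """Squared L2 error per sample, averaged over the n samples."""
    return float(np.mean(np.sum((X - X_rec) ** 2, axis=1)))

rng = np.random.default_rng(0)
n, d, k = 6, 10, 3
X = rng.uniform(size=(n, d))                     # inputs scaled to [0, 1)
W_enc, b_enc = rng.normal(size=(d, k)), np.zeros(k)
W_out, b_out = rng.normal(size=(k, d)), np.zeros(d)
X_rec = decode(encode(X, W_enc, b_enc), W_out, b_out)
loss = reconstruction_loss(X, X_rec)
```

Because the sigmoid output lies strictly between 0 and 1, this decoder only suits inputs normalized to that range, which is one reason the text allows ReLU as an alternative output activation.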

3.3.4. Projection Loss Function

Common loss functions are designed to optimize the model’s predictive accuracy. In privacy-preserving tasks, however, they have limitations: directly optimizing a traditional loss function on sensitive data may leak private information, and the generated data, although accurate, may be overfitted. To address this issue, a new projection loss function is designed, as shown in Algorithm 2.

The projection loss function consists of two components: reconstruction error and privacy loss. The reconstruction error primarily measures the similarity between the generated data and the original data, while privacy loss ensures that the generated data meet the privacy protection requirements. The formula for the projection loss function can be expressed as:

L_{projection} = α L_{recon} + β L_{privacy},

(25)

where

α

and

β

are hyperparameters that control the weights of the reconstruction error and the privacy loss, respectively. After initial experimentation and tuning, the values of

α

and

β

are both set to 0.5, giving the two terms equal weight. To further optimize the model’s performance, the values of

α

and

β

are fine-tuned using a cross-validation strategy, to adapt to different datasets and tasks. Specifically, the reconstruction error

L_{recon}

can be expressed using mean squared error, which measures the difference between the generated data and the original data:

L_{recon} = \frac{1}{n} \sum_{i = 1}^{n} {∥ X_{i} - X_{i}^{'} ∥}^{2}.

(26)

The reconstruction error measures the similarity between the generated data and the original data, and the objective is to optimize this term to preserve the key information of the data. Privacy loss

L_{privacy}

is then measured through the output of the discriminator D, which evaluates the privacy of the data. The goal is to minimize the probability that the generated data are classified as "real" by the discriminator. The privacy loss can be expressed as:

L_{privacy} = - E_{X^{'} \sim p_{gen}} [log (1 - D (X^{'}))],

(27)

where

D (X^{'})

represents the output of the discriminator for the generated data

X^{'}

. The goal is to make the generated data difficult for the discriminator to distinguish, thus ensuring the privacy of the generated data. By optimizing the privacy loss, the generated data must not only retain enough information for training and inference but also avoid leaking sensitive information. The projection loss function is combined with the symmetric projection space extractor, forming a complete privacy protection framework. In the symmetric projection space extractor, the original data are projected into the low-dimensional space

Z

via a nonlinear mapping. This process compresses the redundant parts of the data, while the symmetry design limits information loss. The optimization of the projection loss function ensures the balance between privacy protection and data utility in the generated data. Specifically, after the data are mapped into the low-dimensional space by the symmetric projection space extractor, they are passed to the generator. The generated data are then evaluated by the discriminator and ultimately optimized through the projection loss function so that they achieve a balance between privacy protection and utility. By combining the projection loss function with the symmetric projection space extractor, the privacy of the generated data can be controlled during generation, avoiding the privacy leakage issues common in traditional methods while preserving the effectiveness of the generated data. In the design of the symmetric projection space in particular, the nonlinear transformation of the input data retains the key information, while the projection loss function ensures that the generated data are strongly privacy-preserving and still retain the essential features of the original data during inference and training.
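As a worked illustration of Eqs. (25)–(27), the projection loss can be computed from a batch of original data, generated data, and discriminator outputs. The function name `projection_loss` and the constant `eps` are assumptions introduced here; the default weights of 0.5 are the values reported above, and expectations are approximated by batch means.

```python
import numpy as np

def projection_loss(X, X_gen, d_gen, alpha=0.5, beta=0.5, eps=1e-12):
    """Return (L_projection, L_recon, L_privacy) for one batch.

    X, X_gen : arrays of shape (n, d), original and generated data
    d_gen    : array of shape (n,), discriminator outputs D(X') in [0, 1]
    """
    l_recon = float(np.mean(np.sum((X - X_gen) ** 2, axis=1)))   # Eq. (26)
    l_privacy = float(-np.mean(np.log(1.0 - d_gen + eps)))       # Eq. (27)
    return alpha * l_recon + beta * l_privacy, l_recon, l_privacy

# Toy batch: every feature is off by 1 and the discriminator is undecided.
total, l_recon, l_privacy = projection_loss(
    np.zeros((2, 3)), np.ones((2, 3)), np.array([0.5, 0.5]))
```

In this toy batch the reconstruction term is 3 and the privacy term is log 2, so the combined loss is 0.5 · 3 + 0.5 · log 2. As written in Eq. (27), the privacy term shrinks as D(X') moves toward 0 and grows without bound as D(X') approaches 1.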

3.4. Experimental Setup

3.4.1. Hardware and Software Platforms

For the hardware configuration, a high-performance deep learning server equipped with state-of-the-art hardware was employed in this study. Specifically, the server was configured with two NVIDIA A100 GPUs, each offering 40 GB of memory, enabling large-scale parallel data processing and efficient training of deep learning models. The server was further equipped with an AMD EPYC 7742 processor with 64 physical cores, facilitating multi-threaded parallel computations and significantly accelerating data preprocessing and model training tasks. The system included 512 GB of DDR4 RAM to ensure seamless handling of large datasets during preprocessing and training. Additionally, NVMe SSD storage was utilized to provide high-speed data read/write capabilities, thereby optimizing data loading and model training efficiency throughout the experimental process.

For the software configuration, a widely used deep learning software stack was adopted to ensure flexibility and scalability of the framework. The experimental environment was built on Ubuntu 22.04, offering a stable platform for development and execution. PyTorch 2.0 was utilized as the deep learning framework, renowned for its dynamic computation graph and highly optimized GPU acceleration, which are ideal for the rapid development and validation of complex models. CUDA 12.1 and cuDNN 8.9 were employed to maximize GPU parallel computing capabilities. For data processing and analysis, Python 3.10 was utilized, along with several scientific computing and visualization libraries, including NumPy, Pandas, Scikit-learn, and Matplotlib 3.9. To ensure the reproducibility of experimental results, a Conda virtual environment was used for dependency management, allowing strict control over software versions. These combined hardware and software configurations established a high-performance and stable experimental environment, providing robust support for the proposed methodology.

3.4.2. Hyperparameters and Training Configuration

To ensure efficient convergence and stability during training, critical hyperparameters were carefully tuned. The Adam optimizer was employed, with an initial learning rate set to

η = 0.001

. A learning rate decay strategy was implemented, reducing the learning rate by a factor of 0.1 every 10 epochs to adapt to the convergence requirements of the model. The batch size was set to 64, balancing training efficiency and model performance. Additionally, a dropout mechanism was introduced to prevent overfitting, with a dropout rate of 0.5 applied to randomly deactivate a portion of neurons.
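The decay strategy described above is a step schedule, which can be written in closed form. The function below is a plain-Python sketch with a hypothetical name; in PyTorch the same schedule is commonly obtained with `torch.optim.lr_scheduler.StepLR` using `step_size=10` and `gamma=0.1`, although the text does not state which scheduler was used.

```python
def step_decay_lr(epoch, base_lr=0.001, gamma=0.1, step_size=10):
    """Learning rate at a given epoch: multiplied by gamma every step_size epochs."""
    return base_lr * gamma ** (epoch // step_size)

# Epochs 0-9 use 1e-3, epochs 10-19 use 1e-4, epochs 20-29 use 1e-5, and so on.
schedule = [step_decay_lr(e) for e in (0, 9, 10, 25)]
```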

In the experimental design, the dataset was partitioned to ensure the scientific rigor and robustness of model training and evaluation. The dataset was split into training, validation, and test sets in proportions of 70%, 15%, and 15%, respectively, to provide sufficient training data while reserving adequate validation and test data for evaluating model generalization. To further enhance the reliability of experimental results, five-fold cross-validation was employed. The training set was subdivided into five mutually exclusive subsets, with one subset used as the validation set and the remaining four as the training set in each iteration. The performance metrics from all iterations were aggregated to provide an overall evaluation of the model. Cross-validation effectively mitigates biases caused by data distribution differences, enhancing the stability of experimental outcomes.
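The partitioning scheme can be sketched with index arrays. This is an illustration under assumptions: the helper names are hypothetical, the shuffle is a simple random permutation (the text does not say whether the split was stratified), and the five folds are taken from the training indices as described above.

```python
import numpy as np

def split_indices(n, rng, train=0.70, val=0.15):
    """Shuffle n sample indices and split them 70% / 15% / 15%."""
    idx = rng.permutation(n)
    n_train = int(round(train * n))
    n_val = int(round(val * n))
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]

def k_fold(indices, k=5):
    """Yield (train, validation) index arrays for k mutually exclusive folds."""
    folds = np.array_split(indices, k)
    for i in range(k):
        rest = np.concatenate([f for j, f in enumerate(folds) if j != i])
        yield rest, folds[i]

rng = np.random.default_rng(0)
train_idx, val_idx, test_idx = split_indices(1000, rng)
folds = list(k_fold(train_idx, k=5))
```

With 1000 samples this gives 700 training, 150 validation, and 150 test indices, and each of the five folds holds out 140 of the training indices while fitting on the remaining 560.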

3.4.3. Baseline Methods

To comprehensively evaluate the performance and advantages of the proposed method, several classical privacy-preserving techniques were selected as baseline methods, including DP [47], FL [48], secure multi-party computation (MPC) [49], and HE [50]. DP protects sensitive information by adding noise to data or models. Its core principle ensures that the influence of individual samples is negligible through randomization. In this study, Gaussian noise addition was employed with a privacy budget parameter of

ϵ = 1.0

. Detailed experiments were conducted to analyze the trade-offs between privacy levels and model performance. FL protects privacy by training models locally on devices and sharing only model parameters rather than raw data. A gradient aggregation-based FL framework was adopted, and further experiments were conducted to evaluate its resilience under scenarios of participant dropout and data heterogeneity. To optimize communication overhead, model updates were compressed before transmission, resulting in a 15% reduction in communication costs without compromising model accuracy. Secure multi-party computation ensures collaborative computations among multiple parties in a trustless environment using cryptographic techniques and distributed computation. This study implemented Shamir’s Secret Sharing protocol, focusing on its communication efficiency and computational scalability under varying numbers of participants. HE, which allows computations on encrypted data, provides the highest theoretical security but incurs significant computational overhead. The Paillier HE algorithm was utilized, with a detailed analysis of encryption and decryption delays during training. A batch processing optimization strategy was introduced, reducing the computational overhead by 20% compared to the standard Paillier implementation. By comparing these baseline methods, the proposed framework was demonstrated to offer superior privacy protection, model utility, and computational efficiency.
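For readers unfamiliar with the MPC baseline, the idea behind Shamir's Secret Sharing can be shown in a short standard-library sketch. It is not the implementation used in the experiments: the field prime, the threshold, and the function names are illustrative assumptions, and a real deployment would draw coefficients from a cryptographically secure source such as the `secrets` module rather than `random`.

```python
import random

PRIME = 2**61 - 1  # a Mersenne prime; the field size is an illustrative choice

def make_shares(secret, threshold, n_shares, rng):
    """Split `secret` so that any `threshold` shares recover it."""
    # Random polynomial of degree threshold - 1 with constant term = secret.
    coeffs = [secret] + [rng.randrange(PRIME) for _ in range(threshold - 1)]

    def poly(x):
        acc = 0
        for c in reversed(coeffs):          # Horner evaluation mod PRIME
            acc = (acc * x + c) % PRIME
        return acc

    return [(x, poly(x)) for x in range(1, n_shares + 1)]

def reconstruct(shares):
    """Lagrange interpolation at x = 0 over the prime field."""
    secret = 0
    for i, (x_i, y_i) in enumerate(shares):
        num, den = 1, 1
        for j, (x_j, _) in enumerate(shares):
            if i != j:
                num = (num * (-x_j)) % PRIME
                den = (den * (x_i - x_j)) % PRIME
        secret = (secret + y_i * num * pow(den, -1, PRIME)) % PRIME
    return secret

shares = make_shares(123456789, threshold=3, n_shares=5, rng=random.Random(0))
recovered = reconstruct(shares[:3])   # any three of the five shares suffice
```

Fewer than `threshold` shares reveal nothing about the secret, which is what lets several parties compute jointly without any one of them holding the raw value. The modular inverse via `pow(den, -1, PRIME)` requires Python 3.8 or later.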

3.5. Evaluation Metrics

To comprehensively assess the performance of the proposed method, a set of evaluation metrics was employed, including precision, recall, F1 score, accuracy, mean Average Precision (mAP@50 and mAP@75), and Frames Per Second (FPS). These metrics quantify classification and detection performance, as well as computational efficiency, from different perspectives. Precision measures the proportion of correctly predicted positive samples among all predicted positive samples, serving as an important metric for assessing prediction accuracy. Recall indicates the proportion of actual positive samples correctly identified, reflecting the sensitivity of the model. The F1 score, as the harmonic mean of precision and recall, is particularly useful in scenarios with imbalanced datasets, providing a balanced measure of a model’s ability to correctly identify positive samples while avoiding false positives. Accuracy, the most commonly used classification metric, measures the proportion of all samples that are classified correctly. mAP is the key metric in object detection tasks. Specifically, mAP@50 and mAP@75 were adopted in this study to evaluate the average precision at Intersection over Union (IoU) thresholds of 0.5 and 0.75, respectively, illustrating performance under different levels of stringency. Finally, FPS was utilized to measure the real-time computational efficiency of the method, representing the number of images processed per second, which is critical for deployment in practical scenarios. The mathematical definitions of these metrics are as follows:

Precision = \frac{TP}{TP + FP},

(28)

Recall = \frac{TP}{TP + FN},

(29)

F 1 = 2 \cdot \frac{Precision \cdot Recall}{Precision + Recall},

(30)

Accuracy = \frac{TP + TN}{TP + TN + FP + FN},

(31)

AP = \int_{0}^{1} Precision (r) d r,

(32)

mAP = \frac{1}{C} \sum_{c = 1}^{C} {AP}_{c},

(33)

FPS = \frac{N}{T},

(34)

where TP, TN, FP, and FN denote the numbers of true positives, true negatives, false positives, and false negatives, respectively. The F1 score balances precision and recall, which is particularly useful when one of them dominates the other, as in highly imbalanced datasets. r represents the recall,

Precision (r)

denotes the precision at recall r, C is the total number of classes,

{AP}_{c}

is the average precision for class c, N is the total number of frames processed, and T is the total processing time. By integrating these metrics, including the F1 score, the performance of the proposed method was thoroughly evaluated in terms of predictive accuracy, sensitivity, overall classification capability, object detection precision, and computational efficiency, providing a holistic view of the model’s strengths and limitations.
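The definitions in Eqs. (28)–(34) translate directly into code. The sketch below uses hypothetical function names and made-up confusion-matrix counts, and it approximates the integral in Eq. (32) with the trapezoidal rule over sampled points of the precision–recall curve, which is one common choice among several.

```python
import numpy as np

def classification_metrics(tp, tn, fp, fn):
    """Precision, recall, F1, and accuracy per Eqs. (28)-(31)."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    return precision, recall, f1, accuracy

def average_precision(recall, precision):
    """Eq. (32) by the trapezoidal rule; recall must be sorted ascending."""
    r, p = np.asarray(recall, float), np.asarray(precision, float)
    return float(np.sum((r[1:] - r[:-1]) * (p[1:] + p[:-1]) / 2.0))

def mean_average_precision(ap_per_class):
    """Eq. (33): mean of the per-class AP values."""
    return float(np.mean(ap_per_class))

def fps(n_frames, seconds):
    """Eq. (34): frames processed per second."""
    return n_frames / seconds

# Hypothetical counts for an imbalanced problem (110 positives in 1000 samples).
p, r, f1, acc = classification_metrics(tp=90, tn=880, fp=10, fn=20)
ap = average_precision([0.0, 0.5, 1.0], [1.0, 1.0, 0.5])
```

With these illustrative counts, accuracy is 0.97 while F1 is about 0.86, which shows why F1 is the more informative figure when negatives dominate the data.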

4. Results and Discussion

4.1. Financial Fraud Detection Results

The purpose of the financial fraud detection experiment was to evaluate the effectiveness of the proposed method compared to several existing privacy-preserving techniques in terms of key performance metrics, including precision, recall, F1 score, and accuracy. Financial fraud detection tasks often deal with highly imbalanced data, where fraudulent transactions are much fewer than legitimate transactions. This makes the use of a model capable of distinguishing between these classes crucial. Additionally, the integration of privacy-preserving techniques is essential to ensure that sensitive financial data, such as credit card transactions, are protected while maintaining the model’s performance. The experiment aimed to validate whether the proposed method could not only provide strong privacy protection but also improve or maintain high classification accuracy compared to other existing methods. The results presented in Table 3 reflect the performance of various models, including MPC, HE, DP, FL, and the proposed method.

From the results, it is clear that the proposed method surpasses all other models in terms of precision, recall, F1 score, and accuracy. With a precision of 0.95, recall of 0.91, and F1 score of 0.93, the method achieves both accurate detection and a low rate of false positives, making it particularly effective at identifying fraudulent activities. The baseline methods (MPC, HE, DP, and FL) show inferior performance across all metrics. MPC, with a precision of 0.83 and recall of 0.78, achieves an F1 score of 0.80, indicating difficulty in balancing privacy preservation with detection accuracy. HE, despite offering strong privacy guarantees, achieves only slightly better results, with an F1 score of 0.84 derived from a precision of 0.86 and recall of 0.82. DP, which applies noise to obscure individual information, achieves a precision of 0.89, recall of 0.86, and F1 score of 0.87, but exhibits a clear trade-off between privacy and performance. FL, which enables training on decentralized data while ensuring privacy, achieves a precision of 0.92, recall of 0.89, and F1 score of 0.90, demonstrating its ability to preserve privacy while maintaining relatively high performance.

The proposed method likely benefits from its capability to balance the trade-off between privacy and model utility. By integrating symmetric projection space extraction with adversarial training, the framework obfuscates sensitive information while retaining the critical features needed for fraud detection. From a mathematical standpoint, the approach employs adversarial learning, optimizing the generator (responsible for creating obfuscated data) and the discriminator (tasked with distinguishing real from generated data) in a game-theoretic manner. This ensures that the obfuscated data remain of sufficient quality for decision-making without exposing sensitive details. Moreover, the symmetric projection space provides robust privacy protection with minimal information loss during dimensionality reduction. The method’s strong recall, precision, and F1 score are largely due to its ability to handle imbalanced datasets while ensuring robust privacy protection. Unlike methods such as DP, which degrade performance by adding noise, the proposed approach balances privacy and utility through its design, improving fraud detection without compromising model integrity. The F1 score underscores the method’s balanced effectiveness in achieving high precision and recall on imbalanced data. This advantage, rooted in the adversarial training process and the symmetric projection space, allows the proposed approach not only to exceed existing techniques in performance metrics but also to deliver a secure and practical solution for financial fraud detection in privacy-sensitive environments.
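As a worked check of the figures quoted above, the following Python sketch recomputes each F1 score in Table 3 from the reported precision and recall using F1 = 2PR / (P + R). The tolerance of 0.01 is an assumption made here to allow for the two-decimal rounding of the published precision and recall values; the variable and function names are hypothetical.

```python
# (precision, recall, reported F1) copied from Table 3.
TABLE_3 = {
    "Multi-party Secure Computation": (0.83, 0.78, 0.80),
    "Homomorphic Encryption": (0.86, 0.82, 0.84),
    "Differential Privacy": (0.89, 0.86, 0.87),
    "Federated Learning": (0.92, 0.89, 0.90),
    "Proposed Method": (0.95, 0.91, 0.93),
}


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def check_reported_f1(rows, tol: float = 0.01):
    """Map each model to (recomputed F1, whether it agrees with the reported value)."""
    result = {}
    for model, (p, r, reported) in rows.items():
        computed = f1_score(p, r)
        result[model] = (computed, abs(computed - reported) <= tol)
    return result


if __name__ == "__main__":
    for model, (computed, ok) in check_reported_f1(TABLE_3).items():
        print(f"{model}: recomputed F1 = {computed:.4f}, consistent = {ok}")
```

Under this tolerance, every reported F1 score in Table 3 agrees with the value recomputed from its precision and recall.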

4.2. Image and Semantic Classification Detection Results

The purpose of the image and semantic classification detection experiment was to assess the performance of the proposed method in the context of privacy-preserving machine learning for image-related tasks, in terms of both classification and object detection accuracy. The experiment was designed to compare the effectiveness of different privacy-preserving techniques, including MPC, HE, DP, FL, and the proposed method. The key evaluation metrics are precision, recall, F1 score, accuracy, and mAP at intersection-over-union (IoU) thresholds of 0.50 (mAP@50) and 0.75 (mAP@75). These metrics are particularly important for image classification and object detection tasks, as they provide a comprehensive view of both the accuracy and robustness of the model in handling different types of image data. Through this experiment, the aim was to demonstrate that the proposed method could achieve a high level of privacy protection while maintaining or even improving classification and detection performance compared to traditional methods.

From the results presented in Table 4, it is clear that the proposed method outperforms all the other models in terms of precision, recall, F1 score, accuracy, and mAP. The precision of 0.93, recall of 0.90, and F1 score of 0.91 indicate the method’s high ability to both correctly identify positive instances and minimize false positives, making it highly effective for detecting relevant objects in image data. In terms of accuracy, the proposed method achieves 0.91, surpassing the other models. Additionally, the mAP@50 and mAP@75 values of 0.91 and 0.90, respectively, highlight the model’s strong performance in object detection, especially in challenging detection scenarios where the object-to-background ratio is low or when the objects are obscured. When compared with the baseline models, such as MPC, HE, and DP, the proposed method consistently provides superior results. For example, MPC, with a precision of 0.84, recall of 0.80, and F1 score of 0.82, performs relatively well but still lags behind in terms of both accuracy and mAP. HE offers better performance than MPC, with a precision of 0.86, recall of 0.82, and F1 score of 0.84, but still falls short of the proposed method. The DP method, which applies noise to the data to preserve privacy, achieves a precision of 0.89, recall of 0.86, and F1 score of 0.87, but still faces a trade-off between privacy and accuracy. FL, with its decentralized training approach, yields the best results among the baseline methods with a precision of 0.91, recall of 0.88, and F1 score of 0.89, but it still does not reach the performance level of the proposed approach.

The superior performance of the proposed method can be attributed to its ability to balance privacy protection and model utility. The method integrates symmetric projection space extraction and adversarial training, enabling it to obfuscate sensitive information in the data while retaining the essential features for classification and detection. Mathematically, the symmetric projection space projects the data into a low-dimensional space in such a way that the critical features necessary for training are preserved, while unnecessary information is discarded. The adversarial training framework further strengthens the model’s ability to generate privacy-preserving data by jointly optimizing the generator’s ability to produce realistic data and the discriminator’s ability to detect generated data. This adversarial mechanism, combined with the symmetric projection, enables the proposed method to retain high-quality features for both classification and object detection tasks, which accounts for the higher precision, recall, and F1 score values. Furthermore, because the F1 score is the harmonic mean of precision and recall, it gives a more balanced evaluation of the model’s performance than either metric alone, particularly when the two diverge. Unlike methods such as DP, which add noise and degrade performance, the proposed method balances privacy and utility through its framework design, enhancing classification and detection without compromising the model’s integrity. The combination of these techniques makes the method well suited to tasks requiring both security and accuracy.

4.3. Computational Efficiency Analysis

The purpose of this experiment was to analyze the efficiency of different privacy-preserving methods in practical computational environments, particularly in devices with limited computational power. By comparing the FPS performance of MPC, HE, DP, FL, and the proposed method across different hardware platforms, such as Raspberry Pi, Jetson, and 3080 GPU, the experiment aims to reveal the computational efficiency differences between these models. The goal is to select the most suitable privacy-preserving method for real-world applications. Computational efficiency directly impacts the feasibility and performance of privacy-preserving technologies in large-scale applications, especially in scenarios where real-time data streams need to be processed efficiently. The experimental results provide a comprehensive evaluation of how each method performs under different hardware configurations and offer theoretical insights into how privacy protection can be balanced with computational efficiency.

As observed from the experimental results presented in Table 5, the FPS of all methods improved as the computational platform’s performance increased, with the highest values achieved on the NVIDIA 3080 GPU. On the Raspberry Pi platform, due to its limited computational power, FPS values were generally lower, with the proposed method reaching an FPS of 29.93, slightly ahead of the other methods but still noticeably lower than the results from the higher-performance platforms. The Jetson platform, which lies between the Raspberry Pi and the 3080 GPU, showed better efficiency, with the proposed method achieving an FPS of 46.64, surpassing all four baseline methods. On the 3080 GPU, the proposed method achieved an FPS of 57.89, clearly outpacing the other models, including FL (48.39 FPS) and DP (51.48 FPS). This demonstrates that the proposed method balances privacy protection and computational efficiency, especially when leveraging high-performance computing platforms.
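The relative margins implied by Table 5 can be made explicit with a short calculation. The Python sketch below, using values copied from Table 5 and hypothetical helper names, finds the fastest baseline on each platform and the ratio of the proposed method's FPS to it.

```python
# FPS values copied from Table 5.
FPS_TABLE = {
    "Multi-party Secure Computation": {"Raspberry Pi": 24.64, "Jetson": 37.73, "NVIDIA 3080 GPU": 43.91},
    "Homomorphic Encryption": {"Raspberry Pi": 28.25, "Jetson": 39.67, "NVIDIA 3080 GPU": 42.94},
    "Differential Privacy": {"Raspberry Pi": 15.03, "Jetson": 41.27, "NVIDIA 3080 GPU": 51.48},
    "Federated Learning": {"Raspberry Pi": 26.79, "Jetson": 43.02, "NVIDIA 3080 GPU": 48.39},
    "Proposed Method": {"Raspberry Pi": 29.93, "Jetson": 46.64, "NVIDIA 3080 GPU": 57.89},
}


def speedup_over_best_baseline(table, platform, method="Proposed Method"):
    """Return (fastest baseline name, FPS ratio of `method` to that baseline)."""
    baselines = {m: row[platform] for m, row in table.items() if m != method}
    best = max(baselines, key=baselines.get)
    return best, table[method][platform] / baselines[best]


if __name__ == "__main__":
    for platform in ("Raspberry Pi", "Jetson", "NVIDIA 3080 GPU"):
        best, ratio = speedup_over_best_baseline(FPS_TABLE, platform)
        print(f"{platform}: fastest baseline = {best}, ratio = {ratio:.3f}")
```

On these figures the margin over the fastest baseline grows from roughly 6% on the Raspberry Pi to roughly 8% on the Jetson and roughly 12% on the 3080 GPU, and the fastest baseline differs by platform (HE, FL, and DP, respectively).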

Theoretically, the advantage of the proposed method can be attributed to its design, which combines symmetric projection and adversarial training to effectively reduce computational overhead while maintaining strong privacy protection. This optimization is particularly evident on high-performance hardware, where the ability to handle the increased computational load allows the method to demonstrate superior efficiency. From a mathematical perspective, the reduction in computational complexity is achieved by minimizing the amount of redundant information that needs to be processed during the privacy-preserving transformation. Traditional privacy-preserving methods, such as DP and HE, typically rely on adding noise to the data or performing encryption operations, which incur high computational costs, especially on lower-powered hardware. In contrast, the proposed method uses symmetric projection to generate low-dimensional representations, significantly optimizing both data storage and computation efficiency. Furthermore, adversarial training between the generator and discriminator allows for rapid iteration, enabling the full utilization of hardware capabilities. Consequently, the proposed method shows distinct advantages on high-performance platforms. Overall, the experimental results validate that the proposed method not only provides robust privacy protection but also exhibits high computational efficiency across different hardware platforms, especially on high-performance GPUs, where the advantage is more pronounced. This provides critical experimental evidence for the widespread adoption of the proposed method in large-scale real-world applications.

4.4. Ablation Experiment on Different Loss Functions Results

The purpose of this experiment was to evaluate the effectiveness of different loss functions in privacy-preserving tasks, particularly in time-series and image classification tasks. By training the models with cross-entropy loss, focal loss, and projection loss, the experiment aimed to explore the differences between these loss functions in terms of precision, recall, F1 score, accuracy, and mAP. These metrics are key indicators for evaluating the performance of classification and detection tasks and provide insight into how different loss functions influence the model’s effectiveness. The ablation experiment helps to understand the contribution of each loss function to the model’s training process and final results, offering a theoretical basis for further optimization of the model. The experimental results in the tables clearly show that the projection loss function exhibits significant advantages across different tasks, especially in more challenging scenarios.

Based on the experimental results shown in Table 6 and Table 7 for both time-series and image data, the model using the projection loss function outperforms those using cross-entropy loss and focal loss on all evaluation metrics. For time-series data, the cross-entropy loss model has a precision of 0.77, recall of 0.72, F1 score of 0.74, and accuracy of 0.75. In contrast, the focal loss improves these metrics with a precision of 0.86, recall of 0.81, F1 score of 0.83, and accuracy of 0.84. The model using the projection loss function, however, shows a much more significant improvement with a precision of 0.95, recall of 0.91, F1 score of 0.93, and accuracy of 0.93, demonstrating that the projection loss not only maintains high privacy protection but also improves model performance effectively. In the image data task, a similar trend is observed. The cross-entropy loss achieves a precision of 0.71, recall of 0.67, F1 score of 0.69, and accuracy of 0.69. The focal loss performs better, with a precision of 0.82, recall of 0.78, F1 score of 0.80, and accuracy of 0.80. The projection loss, however, achieves a precision of 0.93, recall of 0.90, F1 score of 0.91, accuracy of 0.91, and mAP@50 and mAP@75 of 0.91 and 0.90, respectively, showing the strong performance of the projection loss in image classification and detection tasks.

From a theoretical perspective, the cross-entropy loss function is widely used for multi-class classification tasks but does not account for class imbalance, leading to lower recall and F1 score on imbalanced datasets. The focal loss addresses this issue by reweighting samples so that easy-to-classify samples contribute less to the loss and hard-to-classify samples contribute more. Although focal loss improves model performance on imbalanced data, it is not designed to protect data privacy or to preserve critical features under obfuscation. In contrast, the projection loss function combines reconstruction error and privacy loss, preserving the critical features of the data while ensuring privacy protection. Mathematically, the projection loss uses a nonlinear mapping to compress the data into a low-dimensional space, reducing the computation spent on redundant information while ensuring that the generated data are difficult to distinguish, thus balancing privacy protection with model utility. This design allows the projection loss to optimize both privacy protection and data utility, resulting in better classification and detection performance in both time-series and image data tasks.
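The reweighting behaviour of focal loss described above can be seen in a small numeric example. The Python sketch below uses the standard per-sample focal loss, -alpha * (1 - p_t)^gamma * log(p_t), where p_t is the predicted probability of the true class. The values gamma = 2 and alpha = 1 are assumptions chosen for illustration; the settings used in the experiments are not stated in this section.

```python
import math


def cross_entropy(p_true: float) -> float:
    """Per-sample cross-entropy, given the predicted probability of the true class."""
    return -math.log(p_true)


def focal_loss(p_true: float, gamma: float = 2.0, alpha: float = 1.0) -> float:
    """Per-sample focal loss; with gamma = 0 and alpha = 1 it reduces to cross-entropy."""
    return -alpha * (1.0 - p_true) ** gamma * math.log(p_true)


if __name__ == "__main__":
    easy, hard = 0.9, 0.1  # illustrative probabilities for an easy and a hard sample
    for name, loss in (("cross-entropy", cross_entropy), ("focal", focal_loss)):
        print(f"{name}: easy={loss(easy):.4f} hard={loss(hard):.4f} "
              f"hard/easy={loss(hard) / loss(easy):.1f}")
```

With these assumed settings, the easy sample's loss shrinks by a factor of 100 under focal loss while the hard sample's loss shrinks by less than 20%, so the hard sample dominates the gradient far more than under cross-entropy.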

5. Conclusions

In this paper, a data security training framework based on symmetric projection space and adversarial training is proposed, aimed at balancing the conflict between data privacy protection and model utility. In traditional privacy protection methods, such as homomorphic encryption and differential privacy, a certain trade-off in computational efficiency or model utility is often required to ensure data privacy. In contrast, the proposed method in this paper utilizes nonlinear mapping to compress high-dimensional data into a low-dimensional space, reducing the computational cost of redundant information while preserving key data, thus achieving a balance between privacy protection and data utility. Experimental results show that the proposed method outperforms traditional privacy protection methods across multiple tasks. In the financial time-series data task, the model using projection loss shows significant advantages in metrics such as precision, recall, and accuracy, especially in enhancing model utility while ensuring privacy protection. In image data tasks, the projection loss function also demonstrates strong performance, not only improving accuracy but also achieving higher values in metrics such as mAP. Furthermore, the experiments indicate that the proposed method maintains good computational efficiency across different hardware platforms, especially on high-performance GPU platforms, showing substantial advantages.

Balancing privacy protection against model utility is a complex task, and in practical applications privacy protection must coexist with efficient model performance. By introducing symmetric projection and adversarial training mechanisms, this paper reduces the computational overhead caused by privacy protection while maintaining the efficiency and practicality of the model. Across different data types and tasks, the proposed method provides strong privacy protection together with efficient training and inference, demonstrating its applicability across various fields. In summary, the framework proposed in this paper provides an effective solution for privacy protection tasks, optimizing model utility while ensuring data privacy. With further optimization and adjustments, the method can play an important role in more complex and diverse application scenarios, offering a theoretical foundation and practical reference for balancing data privacy protection and machine learning model utility.

Author Contributions

Conceptualization, Q.L. (Qianqian Li), S.Z., X.Z., Y.J. and C.L.; Data curation, Q.L. (Qianye Lin) and C.H.; Formal analysis, J.S. and Q.L. (Qianye Lin); Funding acquisition, C.L.; Investigation, J.S. and Y.Y.; Methodology, Q.L. (Qianqian Li), S.Z. and X.Z.; Project administration, C.L.; Resources, Q.L. (Qianye Lin) and C.H.; Software, Q.L. (Qianqian Li), S.Z., X.Z. and Y.Y.; Supervision, Y.J. and C.L.; Validation, J.S.; Visualization, C.H., Y.Y. and Y.J.; Writing—original draft, Q.L. (Qianqian Li), S.Z., J.S., Q.L. (Qianye Lin), C.H., Y.Y., Y.J. and C.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China (grant number 61202479).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data presented in this study are available on request from the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.


Figure 1. Image dataset samples.

Figure 2. Flowchart of proposed method.

Figure 3. Architecture of adversarial training network framework.

Figure 4. Architecture of symmetric projection space extractor.

Table 1. Comparison of privacy protection methods.

Method	Privacy Protection	Model Utility	Computational Efficiency	Complexity
Differential Privacy	High	Moderate	Low	High
Homomorphic Encryption	Very High	Low	Very Low	Very High
Federated Learning	Moderate	High	Moderate	Moderate
Proposed Method	High	Very High	High	Low

Table 2. Summary of datasets used in the study.

Dataset Name	Quantity	Task Type
MNIST	70,009	Image Classification
USPS	9298	Image Classification
CelebA	200,703	Facial Attribute Prediction and Classification
Credit Card Fraud Detection	284,807	Financial Fraud Detection
IMDb Reviews	50,379	Sentiment Classification
BreakHis	7909	Image Classification (Benign/Malignant)

Table 3. Financial fraud detection results.

Model	Precision	Recall	F1 Score	Accuracy	Computation Complexity
Multi-party Secure Computation	0.83	0.78	0.80	0.80	12.83
Homomorphic Encryption	0.86	0.82	0.84	0.84	83.14
Differential Privacy	0.89	0.86	0.87	0.88	4.71
Federated Learning	0.92	0.89	0.90	0.91	31.65
Proposed Method	0.95	0.91	0.93	0.93	3.28

Table 4. Image data experiment results.

Model	Precision	Recall	F1 Score	Accuracy	mAP@50	mAP@75	Computation Complexity
Multi-party Secure Computation	0.84	0.80	0.82	0.82	0.81	0.80	27.03
Homomorphic Encryption	0.86	0.82	0.84	0.84	0.84	0.83	215.81
Differential Privacy	0.89	0.86	0.87	0.87	0.88	0.87	11.54
Federated Learning	0.91	0.88	0.89	0.89	0.90	0.89	63.96
Proposed Method	0.93	0.90	0.91	0.91	0.91	0.90	7.27

Table 5. FPS experiment results.

Model	Raspberry Pi	Jetson	NVIDIA 3080 GPU
Multi-party Secure Computation	24.64	37.73	43.91
Homomorphic Encryption	28.25	39.67	42.94
Differential Privacy	15.03	41.27	51.48
Federated Learning	26.79	43.02	48.39
Proposed Method	29.93	46.64	57.89

Table 6. Ablation experiment on different loss functions for financial data.

Model	Precision	Recall	F1 Score	Accuracy
Cross-Entropy Loss	0.77	0.72	0.74	0.75
Focal Loss	0.86	0.81	0.83	0.84
Projection Loss	0.95	0.91	0.93	0.93

Table 7. Ablation experiment on different loss functions for image data.

Model	Precision	Recall	F1 Score	Accuracy	mAP@50	mAP@75
Cross-Entropy Loss	0.71	0.67	0.69	0.69	0.68	0.67
Focal Loss	0.82	0.78	0.80	0.80	0.79	0.78
Projection Loss	0.93	0.90	0.91	0.91	0.91	0.90

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Li, Q.; Zhou, S.; Zeng, X.; Shi, J.; Lin, Q.; Huang, C.; Yue, Y.; Jiang, Y.; Lv, C. A Symmetric Projection Space and Adversarial Training Framework for Privacy-Preserving Machine Learning with Improved Computational Efficiency. Appl. Sci. 2025, 15, 3275. https://doi.org/10.3390/app15063275
