Article

Dynamic Node Privacy Feature Decoupling Graph Autoencoder Based on Attention Mechanism

1 State Key Laboratory of Public Big Data, College of Computer Science and Technology, Guizhou University, Guiyang 550025, China
2 Department of Electrical & Electronic Engineering, University of Bristol, Bristol BS8 1QU, UK
* Author to whom correspondence should be addressed.
Appl. Sci. 2025, 15(12), 6489; https://doi.org/10.3390/app15126489
Submission received: 22 April 2025 / Revised: 4 June 2025 / Accepted: 6 June 2025 / Published: 9 June 2025

Abstract: The inherent capability of graph autoencoders to capture correlations among node features poses significant privacy risks through attacker inference. Previous feature decoupling approaches predominantly apply uniform privacy protection across nodes, disregarding the varying sensitivity levels inherent in graph structures. To solve these problems, we propose a novel dual-path graph autoencoder incorporating attention-aware privacy adaptation. Firstly, we design an attention-driven metric learning framework that quantifies node-specific privacy importance through attention weights and selects important nodes to construct the privacy distribution, thereby realizing dynamic privacy decoupling and reducing utility loss. Then, we introduce the Hilbert-Schmidt Independence Criterion (HSIC) to measure the dependence between privacy and non-privacy information, which avoids the deviations that occur when using approximate methods such as variational inference. Finally, we use alternating training to comprehensively evaluate the privacy importance of nodes. Experimental results on three real-world datasets (Yale, Rochester, and Credit defaulter) demonstrate that our proposed method significantly outperforms existing approaches such as PVGAE, GAE-MI, and APGE: inference accuracy on private attributes decreases by 25.5%, while link prediction performance reaches 84.7%, the highest among the compared privacy-preserving methods.

1. Introduction

Graph Neural Network (GNN)-based graph embedding techniques have demonstrated powerful capabilities in processing and analyzing large-scale graph data, with applications in social network analysis [1], recommendation systems [2], and traffic flow prediction [3]. Graph autoencoders [4] further enhance the representational capacity of graph embeddings by integrating node representation learning and graph data reconstruction through an encoder-decoder framework. However, since graph autoencoders can highly fit both the topological structure and node attributes of graph data, they may inadvertently capture fine-grained details of the learned data, posing privacy leakage risks in the released graph embeddings. For instance, attackers could exploit the model’s capabilities in multiple ways: they might reconstruct adjacency matrices through the decoder to expose sensitive social relationships, leverage the encoder’s latent space to execute membership inference attacks [5], or perform attribute inference attacks by analyzing and quantifying specific data patterns present in the graph embeddings [6]. Therefore, privacy decoupling within the graph embedding process is a critical technique for preserving privacy.
Moreover, prior work has not accounted for the varying global importance of nodes, which necessitates different degrees of privacy decoupling intensity. Quantifying node importance inherently involves weight information correlated with privacy features, and these weights directly impact the representational performance of nodes, influencing the analytical capability of graph embeddings in downstream tasks. Therefore, developing a privacy decoupling mechanism capable of dynamically calibrating protection intensity according to nodal importance levels emerges as a crucial yet understudied challenge in achieving a trade-off between data privacy and utility.
To solve the above problems, we design a dual-channel privacy decoupling graph autoencoder based on the attention mechanism. Firstly, we propose a node importance ranking algorithm: in the privacy graph autoencoder, a self-attention mechanism quantifies the importance of different nodes for privacy protection, and the nodes are ranked accordingly. Secondly, the nodes are sorted by importance, and the top-ranked nodes are selected to form the data distribution of the privacy information. Then, in the utility graph autoencoder, the Hilbert-Schmidt Independence Criterion (HSIC) [7] is used to measure the dependence between the corresponding nodes and serves as a penalty term to optimize the network parameters of the utility graph autoencoder. Finally, the privacy weight of each node is dynamically evaluated and adjusted through alternating training, enabling the model to achieve a privacy-utility trade-off even when dealing with complex data. The key contributions can be summarized as follows:
  • A novel node privacy ranking algorithm that employs attention mechanisms to dynamically score the privacy features of each node, quantifying their privacy importance. This method mitigates the influence of high-privacy nodes in the embedding while retaining low-privacy nodes to ensure data utility.
  • We introduce the use of the Hilbert-Schmidt Independence Criterion (HSIC) to assess the dependency between the privacy and non-privacy distributions, which, by means of hypothesis testing, avoids the deviations that occur when using approximate methods.
  • A dual-channel privacy graph autoencoder that decouples embedded privacy and utility features of graph data. Freezing parameters during alternating training prevents gradient interference during backpropagation, enhancing the stability of node privacy importance measurement.
  • Comprehensive evaluation on real-world graph datasets using node classification and link prediction. Experimental results show that the proposed method effectively resists inference attacks on private information in node classification. It maintains high utility in link prediction while achieving an optimal privacy-utility trade-off.

2. Related Work

Currently, the methods for privacy-preserving graph embedding can be primarily categorized into two types: data perturbation methods and feature decoupling methods. Data perturbation-based methods, exemplified by k-anonymity [8] and its variants [9], employ data obfuscation strategies through graph modification and generalization to obscure sensitive individual information by reducing data granularity. However, k-anonymity strategies exhibit significant utility degradation as graph scale increases. Differential privacy (DP) [10], as a classical privacy preservation framework, guarantees mathematically rigorous privacy protection through randomized noise injection. For example, Sajadmanesh et al. [11] proposed adding Gaussian noise to graph neural network aggregation outputs, with privacy budget analysis demonstrating compliance with DP requirements. Nevertheless, these approaches fail to account for the heterogeneous importance of individual nodes within global graph structures, resulting in suboptimal personalized protection where critical nodes receive inadequate safeguards while trivial nodes suffer from over-protection. Furthermore, optimal privacy budget allocation for high-dimensional graph data remains computationally intractable [12], with numerical configurations predominantly relying on manual trial-and-error and iterative validation, substantially increasing computational overhead. Large-scale noise injection may also distort critical graph structural information, including node degree distributions, community structures, and subgraph patterns, thereby compromising data utility.
To circumvent the dual challenges of utility degradation in data perturbation and computational complexity in noise scaling, feature decoupling-based methods have been proposed. These techniques aim to disentangle privacy-sensitive features from utility-preserving features during graph embedding, thereby preventing adversarial inference of sensitive node attributes through the embedded representations. Motivated by this paradigm, several privacy-preserving graph embedding frameworks leveraging adversarial learning and fairness learning have been proposed. To begin with, Bose et al. [13] eliminated the implicit association between node representations and sensitive attributes by constructing a composable fair constraint framework based on adversarial training. However, some researchers point out that it is very difficult to obtain all the private information during training in a real environment [14]. Liu et al. [15] and Oh et al. [16] propose a dual-objective framework using adversarial minimax gaming to jointly optimize task performance and privacy protection. A key innovation lies in decoupling graph embedding into primary task learning and privacy preservation sub-objectives, achieved through iterative parameter optimization that directly obfuscates sensitive features. The method constructs adversarial mappings between feature spaces and privacy labels, systematically erasing sensitive information traces while maximizing embedding utility. Lin et al. [17] proposed a graph privacy funnel framework based on the variational method, achieving the decoupling of sensitive attributes through an information bottleneck objective function and a GNN encoder. The work in [18] studies the privacy leakage caused by network structure: it proposes a general homogeneity ratio to quantify privacy risk, develops a graph private attribute inference attack as an evaluation tool, and presents a graph data publishing method combined with learnable graph sampling to protect user privacy while balancing privacy and utility. Furthermore, many researchers achieve privacy decoupling by exploring the relationship between privacy and non-privacy features. In particular, Duddu et al. [19] adopted fairness-aware learning, eliminating statistical biases in graph embeddings toward privacy attributes through orthogonal subspace projection constraints, with particular efficacy in healthcare scenarios requiring structured isolation of sensitive information. Hu et al. [20] introduced variational independence criteria as regularization terms to enforce mutual independence between the distributions of privacy-sensitive and non-sensitive features in low-dimensional embeddings. Nevertheless, the above approaches do not account for the different importance of each node in the global graph and fail to achieve dynamic privacy decoupling, which leads to inadequate protection of important nodes and excessive protection of secondary nodes.

3. Preliminaries

Variational Graph Autoencoder

Variational graph autoencoder (VGAE) [21] is a generative model based on the variational autoencoder and graph convolutional networks that embeds a graph into a low-dimensional space. Firstly, the encoder applies a node-wise transformation through a spectral convolution function:
$$Z^{(l+1)} = f\left(Z^{(l)}, A \mid W^{(l)}\right), \tag{1}$$
where $Z^{(l+1)}$ is the output of the current layer and $Z^{(l)}$ is the output of the previous layer. In this paper, $Z^{(0)} = X$ is the input feature matrix of the graph, and $W^{(l)}$ is the parameter matrix to be learned. The spectral convolution function can be specifically formulated as follows:
$$f\left(Z^{(l)}, A \mid W^{(l)}\right) = \phi\left(\tilde{D}^{-\frac{1}{2}} \tilde{A} \tilde{D}^{-\frac{1}{2}} Z^{(l)} W^{(l)}\right), \tag{2}$$
where $\tilde{A} = A + I_n$ and $\tilde{D}_{ii} = \sum_j \tilde{A}_{ij}$, and $\phi$ is a non-linear activation function such as ReLU. The encoder can then be constructed as follows:
$$Z^{(1)} = f_{\mathrm{ReLU}}\left(X, A \mid W^{(0)}\right), \tag{3}$$
$$Z^{(2)} = f_{\mathrm{linear}}\left(Z^{(1)}, A \mid W^{(1)}\right). \tag{4}$$
By employing a variational inference model, the following forms can be derived as:
$$q(Z \mid X, A) = \prod_{i=1}^{n} q(z_i \mid X, A) = \prod_{i=1}^{n} \mathcal{N}\left(z_i \mid \mu_i, \operatorname{diag}(\sigma_i^2)\right), \tag{5}$$
where $\mu = Z^{(2)}$ is the matrix of mean vectors, with $\mu_i$ representing the embedding of node $i$; similarly, the log-variance $\log \sigma = f_{\mathrm{linear}}(Z^{(1)}, A \mid W'^{(1)})$ shares the parameters of the first-layer graph convolution with the mean $\mu$.
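To make the encoder concrete, the following is a minimal PyTorch sketch of Equations (1)-(5); it assumes dense tensors throughout, and all class and variable names are illustrative rather than taken from the authors' code.

```python
import torch
import torch.nn as nn

class GCNLayer(nn.Module):
    """One spectral convolution phi(D^{-1/2} (A+I) D^{-1/2} Z W), Eq. (2)."""
    def __init__(self, in_dim, out_dim, act=torch.relu):
        super().__init__()
        self.W = nn.Parameter(torch.empty(in_dim, out_dim))
        nn.init.xavier_uniform_(self.W)
        self.act = act

    def forward(self, Z, A):
        A_tilde = A + torch.eye(A.size(0), device=A.device)   # A + I_n (self-loops)
        d_inv_sqrt = A_tilde.sum(dim=1).pow(-0.5)             # diagonal of D^{-1/2}
        A_norm = d_inv_sqrt[:, None] * A_tilde * d_inv_sqrt[None, :]
        return self.act(A_norm @ Z @ self.W)

class VGAEEncoder(nn.Module):
    """Two-layer encoder producing mu and log(sigma), Eqs. (3)-(5)."""
    def __init__(self, in_dim, hid_dim, lat_dim):
        super().__init__()
        self.gc1 = GCNLayer(in_dim, hid_dim, act=torch.relu)            # Z^{(1)}
        self.gc_mu = GCNLayer(hid_dim, lat_dim, act=lambda x: x)        # linear head for mu
        self.gc_logsigma = GCNLayer(hid_dim, lat_dim, act=lambda x: x)  # shares gc1 with mu

    def forward(self, X, A):
        h = self.gc1(X, A)
        return self.gc_mu(h, A), self.gc_logsigma(h, A)
```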
The goal of the VGAE is to embed the graph’s topological structure into a low-dimensional space. Therefore, during the decoding phase, the adjacency matrix is reconstructed through the inner product of the embedding matrix Z. The reconstructed adjacency matrix A ^ can be derived as follows:
$$\hat{A} = \operatorname{sigmoid}\left(Z Z^{\mathsf{T}}\right), \tag{6}$$
where $Z^{\mathsf{T}}$ is the transpose of the embedding matrix $Z$. To preserve the graph's structural information, the VGAE measures the difference between the reconstructed and original adjacency matrices by minimizing the cross-entropy loss:
$$\mathcal{L}_{rec} = -\mathbb{E}_{q_\theta(Z \mid X, A)}\left[\log p_\phi(A \mid Z)\right] = -\frac{1}{N^2} \sum_{i=1}^{N} \sum_{j=1}^{N} A_{ij} \log \hat{A}_{ij}. \tag{7}$$
Finally, the loss function of the variational autoencoder is formulated as:
$$\mathcal{L} = \mathcal{L}_{rec} + \mathcal{L}_{reg} = -\mathbb{E}_{q_\theta(Z \mid X, A)}\left[\log p_\phi(A \mid Z)\right] + \mathrm{KL}\left[q_\theta(Z \mid X, A) \,\|\, p(Z)\right], \tag{8}$$
where $q_\theta(Z \mid X, A)$ denotes the posterior distribution of the embedding $Z$ produced by the graph neural network encoder; $\mathrm{KL}[q(\cdot) \,\|\, p(\cdot)]$ denotes the KL divergence between distributions $q(\cdot)$ and $p(\cdot)$; and $p(Z)$ represents the prior distribution of the embedding $Z$. The second term serves as a regularization term, ensuring that the latent distribution of the graph data explicitly fits a predefined prior distribution.
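The decoding and training steps admit an equally short sketch; the sampling uses the standard reparameterization trick from the VGAE paper [21], and the helper below is our own illustrative wrapper rather than the reference implementation.

```python
import torch
import torch.nn.functional as F

def vgae_loss(mu, logsigma, A):
    """Loss of Eq. (8): reconstruction cross-entropy plus KL regularizer."""
    Z = mu + torch.exp(logsigma) * torch.randn_like(mu)  # reparameterized draw from q(Z|X,A)
    A_hat = torch.sigmoid(Z @ Z.T)                       # inner-product decoder, Eq. (6)
    rec = F.binary_cross_entropy(A_hat, A)               # cross-entropy over all N^2 entries
    # Closed-form KL[ N(mu, sigma^2) || N(0, I) ], averaged over nodes and dimensions
    kl = -0.5 * torch.mean(1 + 2 * logsigma - mu.pow(2) - torch.exp(2 * logsigma))
    return rec + kl, Z
```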

4. System Model

This paper primarily focuses on modeling graph data as undirected graphs. An undirected graph can be denoted as $G = (A, X)$, where $A \in \{0, 1\}^{N \times N}$ represents the adjacency matrix, with $N$ being the number of nodes. The node set of graph $G$ can be expressed as $\{v_1, v_2, \ldots, v_N\}$. Additionally, $X \in \mathbb{R}^{N \times D}$ denotes the node feature matrix, and $D$ denotes the feature dimension of the nodes. Following a setup similar to that in the literature, the feature vector of a node $v_i$ can be characterized as a triplet $(x_i, p_i, y_i)$, where $x_i \in \mathbb{R}^{D-k}$ is the non-private portion of the node vector, $p_i \in \mathbb{R}^{k}$ is the private part, and $y_i \in \mathbb{R}$ is the label of the node. It is assumed that part of the users' private attributes can be accessed during training, and the node embedding learning process involves decoupling to protect user privacy.
As illustrated in Figure 1, the model architecture is a dual-channel privacy-disentangled graph autoencoder with an attention mechanism, comprising two graph autoencoders: the privacy graph autoencoder and the utility graph autoencoder. Firstly, the privacy graph autoencoder is trained to estimate the privacy distribution of the graph data. An attention layer inserted between the privacy latent representation and the privacy decoder measures the global importance of nodes for privacy protection and ranks them; the nodes with higher importance and their latent representations are selected to constitute the main privacy information distribution. Secondly, during the training of the utility autoencoder, the utility latent representations corresponding to the selected nodes form the distribution of utility information, and the HSIC is used to measure the dependence between privacy and non-privacy features; HSIC is also incorporated as a penalty term in the optimization of the utility graph autoencoder. Finally, by alternately training the privacy and utility graph autoencoders, the privacy importance of each node is dynamically calculated, enhancing the robustness of the model across different complex datasets.

5. Method Details

5.1. Node Privacy Ranking by Attention

The proposed method combines two variational graph autoencoders: the former extracts the feature representation of the graph, while the latter estimates privacy-related data. Specifically, the first autoencoder generates low-dimensional embeddings from the graph structure and node features to capture the overall features of the graph, while the second estimates the distribution of privacy information through a privacy loss. The objective function of the privacy autoencoder can be formulated as:
$$\mathcal{L}_p = -\mathbb{E}_{q_{\theta_p}(Z_p \mid A, X)}\left[\log p_{\varphi_p}(P \mid Z_p)\right] + \mathrm{KL}\left[q_{\theta_p}(Z_p \mid A, X) \,\|\, p(Z_p)\right], \tag{9}$$
where $\theta_p$ and $\varphi_p$ represent the parameters of the privacy encoder and privacy decoder, respectively; $P$ is the observed partial private information; $Z_p$ is the privacy latent representation; $q_{\theta_p}(Z_p \mid A, X)$ is the posterior distribution of the privacy information; and $p(Z_p)$ is the prior distribution of $Z_p$. The objective function in (9) contains two parts. The first term estimates the sensitive attribute distribution given the graph structure and non-sensitive attributes; by optimizing this objective, the privacy information can be mapped to a low-dimensional graph embedding, thereby enabling inference and prediction of privacy information. The second term is a regularization term, using the KL divergence to force the privacy distribution to fit a prior distribution.
To quantify the global importance of nodes for privacy protection, we propose an attention-driven node privacy importance measurement scheme. To begin with, an attention layer [22] is added between the privacy latent representation and the privacy decoder, and the attention score matrix is updated through the privacy reconstruction loss. This score matrix reflects the relative importance of each node in the privacy protection process. We adjust the attention scores by minimizing the reconstruction loss of privacy information, so that important nodes contribute more to privacy inference. Summing each column of the attention score matrix and sorting the sums in descending order yields the global importance score of each node for privacy protection.
In practice, for the obtained privacy latent representation, the corresponding query Q, key K, and value V are generated through three linear transformations, which can be represented in matrix form as:
$$Q = W_q Z_p, \quad K = W_k Z_p, \quad V = W_v Z_p, \tag{10}$$
where $W_q$, $W_k$, and $W_v$ are the corresponding learnable weight matrices. $Q$ reflects a node's demand to attend to other nodes in privacy protection; $K$ determines the overall weight of each node in privacy protection; $V$ is a comprehensive measure of the importance of different nodes in privacy protection. Then, by aggregating the information of neighboring nodes through weighted aggregation, the weighted privacy latent representation $\tilde{Z}_p$ can be obtained as:
$$\tilde{Z}_p = \mathrm{attn}_p \cdot V = \operatorname{softmax}\left(\frac{Q K^{\mathsf{T}}}{\sqrt{d_k}}\right) V, \tag{11}$$
where $\mathrm{attn}_p \in \mathbb{R}^{n \times n}$ is the attention score matrix, and $\mathrm{attn}_p[i, j]$ represents the importance of node $j$ to node $i$ in terms of privacy protection. Summing the columns of $\mathrm{attn}_p$ yields the global importance of nodes for privacy protection, which can be written as:
$$\alpha_p[j] = \sum_{i=1}^{n} \mathrm{attn}_p[i, j], \quad j \in \{1, 2, \ldots, n\}, \tag{12}$$
where $\alpha_p$ is the vector of node privacy importance scores.
Compared with other node importance assessment methods, such as node centrality measures [23] based on static topological structure, the attention mechanism can adapt to the topology and task requirements of different graphs. Moreover, by extending the attention mechanism, comprehensive nonlinear relationships and high-order dependencies can be captured.
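A compact sketch of the scoring step in Equations (10)-(12) is given below; the single-head formulation matches the description above, while the module and variable names are our own illustrative choices, not the authors' implementation.

```python
import torch
import torch.nn as nn

class PrivacyAttention(nn.Module):
    """Single-head self-attention over the privacy latent Z_p, Eqs. (10)-(12)."""
    def __init__(self, lat_dim, d_k):
        super().__init__()
        # nn.Linear applies x @ W^T, which matches Q = W_q Z_p up to layout convention
        self.Wq = nn.Linear(lat_dim, d_k, bias=False)
        self.Wk = nn.Linear(lat_dim, d_k, bias=False)
        self.Wv = nn.Linear(lat_dim, d_k, bias=False)
        self.d_k = d_k

    def forward(self, Zp):
        Q, K, V = self.Wq(Zp), self.Wk(Zp), self.Wv(Zp)
        attn = torch.softmax(Q @ K.T / self.d_k ** 0.5, dim=-1)  # n x n score matrix
        Zp_weighted = attn @ V        # weighted aggregation, Eq. (11)
        alpha = attn.sum(dim=0)       # column sums = global importance, Eq. (12)
        return Zp_weighted, alpha

# Ranking: node indices sorted by descending privacy importance
# order = torch.argsort(alpha, descending=True)
```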
Finally, the loss function of the privacy graph autoencoder in (9) can be further expressed as:
$$\mathcal{L}_p = -\mathbb{E}_{q_{\theta_p}(\tilde{Z}_p \mid A, X)}\left[\log p_{\varphi_p}(P \mid \tilde{Z}_p)\right] + \mathrm{KL}\left[q_{\theta_p}(\tilde{Z}_p \mid A, X) \,\|\, p(\tilde{Z}_p)\right]. \tag{13}$$
The detailed steps of the node privacy importance ranking algorithm are shown in Algorithm 1.
Algorithm 1 Node Privacy Importance Ranking Algorithm
Input: Adjacency matrix A, node features X, training epochs epoch_p, learning rate η_p, partially observed privacy information P
Output: Node privacy importance vector α_p ∈ R^{n×1}
 1:  Initialize network parameters θ_p, φ_p and attention weights W_q, W_k, W_v
 2:  for i = 1 to epoch_p do
 3:      // Generate the privacy latent representation
         Z_p ← θ_p(A, X)
 4:      // Generate queries, keys, and values through the attention layer
         Q ← W_q Z_p,  K ← W_k Z_p,  V ← W_v Z_p
 5:      // Normalize the attention matrix
         attn_p ← softmax(Q K^T / √d_k)
 6:      // Weighted aggregation of neighbor information
         Z̃_p ← attn_p · V
 7:      // Predict the privacy information
         P̂ ← φ_p(Z̃_p)
 8:      // Compute the loss via Eq. (13)
         L_p ← −E_{q_{θ_p}(Z̃_p|A,X)}[log p_{φ_p}(P | Z̃_p)] + KL[q_{θ_p}(Z̃_p|A,X) ‖ p(Z̃_p)]
 9:      // Update parameters via gradient descent
         θ_p ← θ_p − η_p ∂L_p/∂θ_p,  φ_p ← φ_p − η_p ∂L_p/∂φ_p
 10:  end for
 11:  // Compute node privacy importance scores via Eq. (12)
      α_p[j] ← Σ_{i=1}^{n} attn_p[i, j],  j ∈ {1, 2, …, n}
 12:  return α_p

5.2. Privacy Decoupling Based on Node Privacy Ranking

In order to achieve more accurate privacy information decoupling, based on the node privacy importance scores obtained from Algorithm 1, we sort them in descending order and select the top-k nodes’ indices and their privacy latent representations to construct the privacy distribution of the graph data, which can be formulated as:
$$Z_p^k = \left\{ z_{p_1}^{(1)}, z_{p_2}^{(2)}, \ldots, z_{p_k}^{(k)} \right\}, \tag{14}$$
where $Z_p^k$ represents the privacy distribution and $p_i$ represents the index of the $i$-th selected node. Correspondingly, the latent representations of the selected top-$k$ nodes in the graph embedding generated by the utility encoder constitute the distribution of non-privacy information:
$$Z_u^k = \left\{ z_{u_1}^{(1)}, z_{u_2}^{(2)}, \ldots, z_{u_k}^{(k)} \right\}, \tag{15}$$
where $Z_u^k$ represents the non-privacy distribution and $u_i$ is the index of the corresponding node.
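Assuming alpha holds the scores returned by Algorithm 1, the two distributions in Equations (14) and (15) reduce to a top-k gather; this is an illustrative sketch, not the authors' code.

```python
import torch

def select_privacy_distribution(alpha, Z_p, Z_u, k):
    """Pair the privacy and utility latents of the k most privacy-important nodes."""
    topk_idx = torch.topk(alpha, k).indices  # indices p_1, ..., p_k
    Z_p_k = Z_p[topk_idx]                    # privacy distribution, Eq. (14)
    Z_u_k = Z_u[topk_idx]                    # non-privacy distribution, Eq. (15)
    return Z_p_k, Z_u_k, topk_idx
```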
To eliminate the dependence between the two learned latent distributions and protect sensitive information, we adopt the Hilbert-Schmidt Independence Criterion (HSIC) as the independence measure. HSIC maps two random variables or distributions into a reproducing kernel Hilbert space (RKHS) through kernel functions, thereby quantifying the dependence between them. Specifically, HSIC evaluates the independence of two embedded distributions in a non-parametric manner, avoiding assumptions about specific distribution forms. Mathematically, HSIC measures independence through the relationship between two kernel matrices, as follows:
$$\mathrm{HSIC}(Z_u, Z_s) = \frac{1}{(n-1)^2} \operatorname{tr}\left(R K_u R K_s\right), \tag{16}$$
$$R = I - \frac{1}{n} e e^{\mathsf{T}}, \tag{17}$$
where $I$ is the identity matrix and $e$ is a column vector of ones. $K_u$ and $K_s$ represent the Gram matrices of $Z_u$ and $Z_s$, respectively, with elements $k_{u,ij} = k_u(z_i^u, z_j^u)$ and $k_{s,ij} = k_s(z_i^s, z_j^s)$, where the kernel functions are derived from inner products. $R$ is the centering matrix used to remove the influence of the distribution means, and $\operatorname{tr}(\cdot)$ denotes the trace of a matrix. A smaller HSIC value indicates lower dependency between the two distributions, thereby achieving the goal of decoupling the latent representations. Compared with mutual information, the HSIC criterion does not require probability density estimation: it computes independence directly through kernel matrices, avoiding errors and complexity in high-dimensional scenarios. Moreover, it can capture both linear and nonlinear dependencies through multiple kernel functions and adapt to more complex data distributions.
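The biased empirical estimator in Equations (16) and (17) is a few lines of PyTorch; note that the Gaussian kernel and the median-distance bandwidth heuristic below are our assumptions, since the paper does not specify the kernel parameters.

```python
import torch

def gaussian_gram(Z, sigma=None):
    """Gram matrix K_ij = exp(-||z_i - z_j||^2 / (2 sigma^2))."""
    d2 = torch.cdist(Z, Z).pow(2)
    if sigma is None:                        # median heuristic (our assumption)
        sigma = d2[d2 > 0].median().sqrt()
    return torch.exp(-d2 / (2 * sigma ** 2))

def hsic(Z_u, Z_s):
    """Biased HSIC estimate tr(R K_u R K_s) / (n-1)^2, Eqs. (16)-(17)."""
    n = Z_u.size(0)
    R = torch.eye(n, device=Z_u.device) - torch.ones(n, n, device=Z_u.device) / n
    K_u, K_s = gaussian_gram(Z_u), gaussian_gram(Z_s)
    return torch.trace(R @ K_u @ R @ K_s) / (n - 1) ** 2
```

Because the estimator is differentiable in the embeddings, it can be added directly to the utility loss as the penalty term β·L_dec.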
Therefore, the overall loss function $\mathcal{L}_{tot}$ of the utility autoencoder can be expressed as the multi-objective optimization function:
$$\mathcal{L}_{tot} = \mathcal{L}_{adj} + \beta \mathcal{L}_{dec} + \mathcal{L}_{reg} = -\mathbb{E}_{q_{\phi_f}(Z \mid A, X)}\left[\log p_{\theta_f}(A \mid Z)\right] + \beta\, \mathrm{HSIC}(Z_u, Z_s) + \mathrm{KL}\left[q_{\phi_f}(Z \mid A, X) \,\|\, p(Z)\right], \tag{18}$$
where the first term $\mathcal{L}_{adj}$ is the utility loss, measuring the representational capability of the latent representations; the second term $\mathcal{L}_{dec}$ is the privacy decoupling loss, quantifying the dependence between the privacy and non-privacy distributions; and the third term $\mathcal{L}_{reg}$ is the regularization term, constraining the distribution $q_{\phi_f}(Z \mid A, X)$ to stay close to the predefined prior distribution $p(Z)$, with the KL divergence $\mathrm{KL}[\cdot]$ measuring the distance between the two distributions. Additionally, $\beta$ is a trade-off factor controlling the strength of privacy decoupling: higher values of $\beta$ increase decoupling strength but may incur greater utility loss, while lower values reduce utility loss but weaken privacy protection.

5.3. Model Training

This paper adopts an alternating training approach to optimize the objective functions of both the privacy graph autoencoder and the utility graph autoencoder, iteratively updating the model parameters. In each training epoch, we first train the privacy graph autoencoder to extract the privacy latent representations $Z_p$ and select the top-$k$ important nodes based on their importance scores to construct the privacy information distribution. Next, we train the utility graph autoencoder, measuring the dependence between the utility latent representations $Z_u$ of the selected $k$ nodes and their corresponding privacy latent representations $Z_p$; the HSIC penalty pushes the learned utility latent codes toward independence from the privacy latent codes, effectively achieving the decoupling of privacy information. During training, we implement a parameter freezing strategy: when the privacy encoder is being trained, the utility encoder parameters $\theta_f$ are kept fixed; conversely, when training the utility encoder, the parameters of the privacy encoder $\theta_p$ remain unchanged. This prevents model oscillation, allowing each encoder to focus on its specific objective. The entire procedure is shown as Algorithm 2, followed by a code-level sketch of the freezing strategy.
Algorithm 2 Dynamic Privacy Decoupling
Input: Adjacency matrix A, node features X, utility autoencoder epochs epoch_u, utility learning rate η_f, privacy autoencoder epochs epoch_p, selected node count k, trade-off factor β
Output: Publishable privacy-preserving graph embedding Z_pub
 1:  Initialize network parameters θ_p, θ_f, φ_p
 2:  for i = 1 to epoch_u do
 3:      // Compute node privacy importance scores α_p ∈ R^{n×1} and privacy latents Z_p
         // via Algorithm 1 (run for epoch_p inner epochs with θ_f frozen)
 4:      // Generate the utility latent representation (θ_p frozen)
         Z_u ← θ_f(A, X)
 5:      // Reconstruct the adjacency matrix
         Â ← sigmoid(Z_u Z_u^T)
 6:      // Compute the reconstruction error
         L_adj ← −(1/N²) Σ_{i=1}^{N} Σ_{j=1}^{N} A_ij log Â_ij
 7:      // Construct the privacy distribution from the top-k nodes
         Z_p^k ← {z_{p_1}^{(1)}, z_{p_2}^{(2)}, …, z_{p_k}^{(k)}}
 8:      // Construct the corresponding non-privacy distribution
         Z_u^k ← {z_{u_1}^{(1)}, z_{u_2}^{(2)}, …, z_{u_k}^{(k)}}
 9:      // Compute the privacy decoupling loss
         L_dec ← HSIC(Z_u^k, Z_p^k)
 10:     // Compute the total utility autoencoder loss, Eq. (18)
         L_tot ← L_adj + β·L_dec + L_reg
 11:     // Update parameters via gradient descent
         θ_f ← θ_f − η_f ∂L_tot/∂θ_f
 12:  end for
 13:  return Z_pub ← Z_u
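At the code level, the freezing strategy amounts to toggling requires_grad on each encoder before its counterpart's update. The schematic of one outer epoch below reuses the hsic and select_privacy_distribution helpers sketched earlier; priv_ae, util_ae, train_privacy_epochs, recon_loss, and kl_loss are hypothetical names standing in for the corresponding components.

```python
def set_frozen(module, frozen):
    """Freeze or unfreeze all parameters of a sub-network."""
    for p in module.parameters():
        p.requires_grad_(not frozen)

for epoch in range(epoch_u):
    # Phase 1: train the privacy autoencoder while the utility encoder is frozen
    set_frozen(util_ae, True)
    set_frozen(priv_ae, False)
    alpha = train_privacy_epochs(priv_ae, A, X, P)       # Algorithm 1, epoch_p inner epochs

    # Phase 2: train the utility autoencoder while the privacy encoder is frozen
    set_frozen(priv_ae, True)
    set_frozen(util_ae, False)
    Z_u = util_ae.encode(X, A)
    Z_p = priv_ae.encode(X, A).detach()                  # block gradients into theta_p
    Z_p_k, Z_u_k, _ = select_privacy_distribution(alpha, Z_p, Z_u, k)
    loss = recon_loss(Z_u, A) + beta * hsic(Z_u_k, Z_p_k) + kl_loss(Z_u)
    opt_u.zero_grad(); loss.backward(); opt_u.step()
```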

6. Simulation Results

6.1. Dataset and Evaluation Metrics

We conducted our experiments on three datasets: Yale, Rochester, and Credit defaulter, which were constructed in [1]. Yale and Rochester are social network datasets collected from Yale University and the University of Rochester; they describe user attributes and the relationships among users. The Yale dataset contains 8578 nodes and 405,450 edges, while the Rochester dataset contains 4563 nodes and 167,653 edges. The Credit defaulter dataset describes credit-card default behavior and contains 30,000 individuals with 14 spending and payment features. These three datasets offer large scale and complexity, enabling the model to be evaluated on social networks of different sizes and structures and allowing both universality and robustness to be tested. All of them are collected from the real world, providing an opportunity to evaluate the model's effectiveness in practical applications. The characteristics of these datasets are summarized in Table 1, and the dataset visualization is displayed in Figure 2.
The evaluation tasks in this study are divided into two main components: utility tasks and privacy tasks. The utility performance of the models is quantified through link prediction, with the Area Under the ROC Curve (AUC) and Average Precision (AP) serving as key metrics—higher values indicate stronger performance. For node classification, accuracy (ACC) and Macro-F1 are employed to assess the models’ resilience against inference attacks. Specifically, the goal is to maintain high inference accuracy for utility attributes while achieving low accuracy for privacy attributes.
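Both utility metrics can be computed directly with scikit-learn; the sketch below assumes score arrays for held-out positive edges and sampled negative edges are already available.

```python
import numpy as np
from sklearn.metrics import roc_auc_score, average_precision_score

def link_prediction_metrics(pos_scores, neg_scores):
    """AUC and AP from decoder scores of positive and negative test edges."""
    y_true = np.concatenate([np.ones_like(pos_scores), np.zeros_like(neg_scores)])
    y_score = np.concatenate([pos_scores, neg_scores])
    return roc_auc_score(y_true, y_score), average_precision_score(y_true, y_score)
```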

6.2. Compared Models, Attack Model and Parameter Settings

This study conducts a comparative analysis of four representative graph embedding models, specifically including:
(1)
VGAE [21]: VGAE employs a variational autoencoder architecture that combines graph structure reconstruction loss with KL divergence by enforcing latent representations to fit a prior distribution, effectively achieving distributed learning of graph embeddings. Notably, this model does not incorporate any privacy protection mechanisms, thus providing a fundamental reference for evaluating the privacy protection efficacy of subsequent models.
(2)
PVGAE [20]: An enhanced version of the VGAE architecture that introduces a dual-encoder alternating training mechanism. By constructing variational independence constraints, it systematically eliminates privacy-sensitive information from embedded representations. This method achieves targeted privacy stripping while maintaining graph structure representation capabilities.
(3)
GAE-MI [24]: Adopts an adversarial training framework to simultaneously optimize utility performance and privacy protection objectives. Innovatively employs mutual information as the metric: by maximizing application utility mutual information while minimizing privacy leakage mutual information, it constructs a dual-objective optimization function. To improve computational efficiency, it uses variational lower bounds for approximation estimation, significantly reducing computational complexity while ensuring model performance.
(4)
APGE [25]: Based on the classical graph autoencoder architecture, proposes an extended layer fusion mechanism to encode privacy label information into latent space. Designs a dual-path adversarial training strategy: while minimizing public label prediction error, it maximizes privacy label prediction loss through adversarial optimization, thereby achieving active obfuscation of privacy information in latent space.
In summary, while VGAE serves as a baseline model with no privacy mechanisms, PVGAE, GAE-MI, and APGE each implement distinct strategies for privacy protection. PVGAE focuses on eliminating sensitivity through variational independence, GAE-MI emphasizes a balance between utility and privacy via mutual information, and APGE enhances privacy labeling through adversarial optimization. This comparative analysis highlights the varying methodologies employed to achieve privacy preservation in graph embedding frameworks.
This study employs attribute inference attacks as the attack model. Attribute inference attacks analyze publicly available or accessible data, such as published graph embeddings and partial users' privacy attributes, by training a classifier to infer the undisclosed private attributes of target individuals, constituting privacy infringement for attributes such as age, gender, and location. For the Yale dataset, class year is selected as the privacy attribute and student/faculty status as the utility attribute; for Rochester, gender is treated as the privacy attribute while class year serves as the utility attribute. For the Credit defaulter dataset, whether a user will default on a credit card payment is treated as the public attribute, and marital status as the private attribute. Additionally, for link prediction, this study randomly samples 10% of links as the test set and 10% as negative edges; for node classification, 20% of nodes are randomly selected as the test set. All experiments are implemented on a PC with a 64-bit Windows OS, an AMD Ryzen 8-core processor, and an NVIDIA GeForce RTX 4060 Ti GPU.
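Operationally, the attribute inference attack reduces to training an off-the-shelf classifier on the published embeddings of nodes whose private attribute is known; a scikit-learn sketch follows, where Z_pub and private_labels are assumed to be available as NumPy arrays and the split ratio mirrors the setup above.

```python
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score, f1_score

def attribute_inference_attack(Z_pub, private_labels, attacker="mlp"):
    """Train an attacker on known (embedding, attribute) pairs and infer the rest."""
    Z_tr, Z_te, y_tr, y_te = train_test_split(
        Z_pub, private_labels, test_size=0.2, random_state=0)
    clf = MLPClassifier(max_iter=500) if attacker == "mlp" else SVC()
    clf.fit(Z_tr, y_tr)
    y_pred = clf.predict(Z_te)
    return accuracy_score(y_te, y_pred), f1_score(y_te, y_pred, average="macro")
```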

6.3. Overall Utility and Privacy Performance

This study compares DU-GAE with VGAE, PVGAE, GAE-MI, and APGE, with experimental results presented in Table 2. The findings reveal that without any privacy protection techniques, VGAE suffers from severe privacy leakage issues, where attackers can easily predict node privacy attributes with extremely high accuracy based on the published graph embeddings.
The experimental results demonstrate that our proposed model effectively resists attribute inference attacks, outperforming other privacy-preserving methods in protecting privacy attributes while simultaneously achieving the highest accuracy in utility attribute inference and minimal utility loss, thereby achieving an optimal privacy-utility trade-off. Specifically, on the Rochester dataset, our method achieves a utility attribute inference accuracy of 0.865 while reducing privacy attribute inference accuracy to the lowest level of 0.575. Notably, our model also achieves the lowest Macro-F1 score for privacy attribute inference among the privacy protection methods, indicating successful suppression of privacy information in the graph embeddings. Additionally, the variances of each metric for our method are relatively small, reflecting that the proposed method performs stably and generalizes well across different datasets. For example, on the Yale dataset, the variance of the classification accuracy reaches as low as 0.001 for both utility and privacy labels, demonstrating the stability of the model.
Unlike PVGAE which decouples privacy latent spaces for all nodes, our approach selectively identifies key nodes through an attention mechanism to construct privacy information distributions. This strategy effectively reduces the utility loss caused by privacy decoupling, demonstrating that uniformly decoupling privacy information across all nodes excessively suppresses non-private information representation. Furthermore, these results confirm that different nodes contribute variably to privacy protection, with some nodes being more critical for data utility—excessive decoupling of such nodes would significantly impair the representational performance of graph embeddings.

6.4. The Impact of Attack Models

To investigate the impact of different attack models on privacy protection, this study implemented two common classifiers, MLP and SVM, with their inference results shown in Figure 3. The experimental results demonstrate that different attack models yield somewhat different privacy attribute inference outcomes, yet overall, MLP and SVM maintain highly consistent accuracy rates with minimal divergence. The discrepancy between the two attack models' inference results was more pronounced on the Yale dataset than on the Rochester dataset; for example, on the graph embeddings generated by PVGAE, the gap in privacy attribute inference accuracy between MLP and SVM reached 0.075.
This phenomenon may stem from the distinct classification strategies employed by each attack model: SVM’s performance depends on kernel function selection, while MLP utilizes different activation functions. However, their consistent overall inference performance suggests that the choice of attack model has limited influence on user privacy inference outcomes. Instead, the inherent privacy protection strength of the graph embedding technique and the complexity of different graph datasets play more dominant roles.
Notably, the graph embeddings generated by our proposed method exhibited the smallest performance gap between MLP and SVM across both datasets. This indicates that our alternating training approach comprehensively evaluates each node's privacy significance while maintaining precise identification of critical privacy nodes, thereby enhancing the model's robustness and generalizability across diverse graph data.

6.5. Sensitivity Study

This section evaluates the impact of the number of key nodes and the number of attention heads on data utility and privacy protection performance. Specifically, this study examines different proportions of key nodes relative to the total node count (10%, 30%, 50%, 70%, 90%) and varies the number of attention heads (1, 2, 4, 8). The experimental results are presented in Figure 4 and Figure 5.
On the one hand, as the number of key nodes increases, the utility attribute classification accuracy generally shows an upward trend on all three datasets, while the privacy label classification accuracy remains stable. Taking the Credit defaulter dataset as an example, when the proportion of key nodes is only 10%, the utility attribute classification accuracy is 0.787; as the proportion grows, the accuracy first decreases and then increases, reaching its maximum of 0.801 at a proportion of 50%. Throughout these changes, the privacy label classification accuracy varies by at most 0.021, remaining almost unchanged. This indicates that the privacy information of graph data is often concentrated in a few important nodes, consistent with the sparse character of social networks; excessive decoupling instead sacrifices utility.
On the other hand, as the number of attention heads increases, the utility and privacy attribute classification accuracies remain relatively stable on all three datasets, with the smallest variation on the Rochester dataset. Specifically, when the number of attention heads increases from 1 to 8, the utility attribute classification accuracy on Rochester changes by 0.05, from 0.811 to 0.861, and the privacy attribute classification accuracy changes by 0.02, from 0.572 to 0.592. This indicates that the self-attention mechanism already captures relatively comprehensive relationships. Furthermore, on the Credit defaulter dataset, the number of attention heads influences the utility and privacy attribute classification accuracies with the same trend, suggesting that model optimization should also attend to other potential factors (such as data processing and feature selection) to improve privacy protection performance.

6.6. The Performance of Downstream Tasks

To verify the representational performance of the graph embeddings generated by the dual-channel privacy decoupled graph autoencoder based on the attention mechanism, this study uses link prediction to measure performance in downstream tasks. Since the edges in the Credit defaulter dataset are established based on node similarity, link prediction on it is not meaningful; we therefore conduct experimental verification only on the Yale and Rochester datasets. The results are shown in Figure 6 and Table 3. The proposed method achieves the best performance among the privacy-preserving baselines on both datasets and both metrics, with an AUC of 0.859 on Yale and 0.912 on Rochester. Moreover, the small variance indicates that our method makes stable predictions for the links in the graph, thereby maintaining the performance of the graph embedding in downstream tasks. For example, on the Yale dataset, in the low false positive rate region such as FPR = 0.2, the proposed method attains a higher true positive rate than the other privacy protection methods, indicating higher accuracy in predicting low-probability links. In addition, link prediction is essentially a measure of node similarity; the superior performance of the proposed method indicates that the embedding process preserves the relative distances between nodes in the original space.
Among the baselines, GAE-MI uses mutual information as its measure but estimates it with a variational lower bound; especially when dealing with high-dimensional graph data, it is prone to falling into local optima, and the complex distribution of the graph data itself exacerbates the difficulty of convergence. PVGAE decouples privacy from all nodes, resulting in over-protection of non-critical nodes and thereby reducing data utility. In this study, the attention mechanism precisely screens the key nodes that matter for privacy protection, avoiding excessive utility loss; meanwhile, decoupling the privacy information of only some key nodes significantly reduces the damage to the topological integrity of the graph.

7. Conclusions

In this paper, we propose a dynamic node privacy feature decoupling graph autoencoder that addresses the static protection dilemma in graph privacy preservation. This study adopts a node-level self-attention weight matrix to dynamically identify highly privacy-sensitive nodes and further screens the key privacy nodes to construct the privacy information distribution of the graph data. We propose to use HSIC to test the dependency between the privacy and non-privacy distributions, and through alternating training, the importance score of each node can be comprehensively evaluated. Unlike conventional methods that apply uniform protection strategies, our method implements a selective shielding mechanism that automatically adjusts the privacy preservation level according to node privacy importance. Experiments on three real-world datasets show that, compared with other privacy protection methods, the proposed method achieves node privacy protection while maintaining a high level of data utility, effectively balancing the inherent conflict between privacy and utility. Future research can consider more potential privacy leakage factors, such as high-order correlations between nodes, interactions between communities or subgraphs, and global quantitative indicators, to establish a measurement model closer to actual privacy threats.

Author Contributions

Conceptualization, Y.H. and J.T.; methodology, Y.H.; software, Y.H.; validation, Y.H.; formal analysis, Y.H.; investigation, Y.H.; resources, J.T.; data curation, Y.H.; writing—original draft preparation, Y.H.; writing—review and editing, Y.H., J.T. and S.D.; visualization, Y.H.; supervision, J.T.; project administration, J.T.; funding acquisition, J.T. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by Guizhou Provincial Science and Technology Projects “Research on Big Data Attribute Privacy Protection and Utility Trade-off Based on Graph Neural Networks” (No. QianKeHe Basic MS [2025] 622) and “Data Feature Privacy Protection Based on Generative AI” (No. QianKeHe Basic—[2024] Youth 126).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The source code of the simulation program and the raw data are available from the corresponding author upon request.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Agarwal, C.; Lakkaraju, H.; Zitnik, M. Towards a unified framework for fair and stable graph representation learning. In Proceedings of Uncertainty in Artificial Intelligence, PMLR, Online, 27–30 July 2021; pp. 2114–2124.
  2. Amara, A.; Taieb, M.A.H.; Aouicha, M.B. A multi-view GNN-based network representation learning framework for recommendation systems. Neurocomputing 2025, 619, 129001.
  3. Sharma, A.; Sharma, A.; Nikashina, P.; Gavrilenko, V.; Tselykh, A.; Bozhenyuk, A.; Masud, M.; Meshref, H. A graph neural network (GNN)-based approach for real-time estimation of traffic speed in sustainable smart cities. Sustainability 2023, 15, 11893.
  4. Hou, Z.; Liu, X.; Cen, Y.; Dong, Y.; Yang, H.; Wang, C.; Tang, J. GraphMAE: Self-supervised masked graph autoencoders. In Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, Washington, DC, USA, 14–18 August 2022; pp. 594–604.
  5. Wang, K.; Wu, J.; Zhu, T.; Ren, W.; Hong, Y. Defense against membership inference attack in graph neural networks through graph perturbation. Int. J. Inf. Secur. 2023, 22, 497–509.
  6. Zhang, Z.; Chen, M.; Backes, M.; Shen, Y.; Zhang, Y. Inference attacks against graph neural networks. In Proceedings of the 31st USENIX Security Symposium (USENIX Security 22), Boston, MA, USA, 10–12 August 2022; pp. 4543–4560.
  7. Wang, T.; Dai, X.; Liu, Y. Learning with Hilbert–Schmidt independence criterion: A review and new perspectives. Knowl.-Based Syst. 2021, 234, 107567.
  8. Sweeney, L. k-Anonymity: A model for protecting privacy. Int. J. Uncertain. Fuzziness Knowl.-Based Syst. 2002, 10, 557–570.
  9. Meyerson, A.; Williams, R. On the complexity of optimal k-anonymity. In Proceedings of the Twenty-Third ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems, Paris, France, 14–16 June 2004; pp. 223–228.
  10. Dwork, C. Differential privacy. In Proceedings of the International Colloquium on Automata, Languages, and Programming, Venice, Italy, 10–14 July 2006; pp. 1–12.
  11. Sajadmanesh, S.; Shamsabadi, A.S.; Bellet, A.; Gatica-Perez, D. GAP: Differentially private graph neural networks with aggregation perturbation. In Proceedings of the 32nd USENIX Security Symposium (USENIX Security 23), Anaheim, CA, USA, 9–11 August 2023; pp. 3223–3240.
  12. Mandal, B.; Amariucai, G.; Wei, S. Uncertainty-autoencoder-based privacy and utility preserving data type conscious transformation. In Proceedings of the 2022 International Joint Conference on Neural Networks (IJCNN), Padua, Italy, 18–23 July 2022; pp. 1–8.
  13. Bose, A.; Hamilton, W. Compositional fairness constraints for graph embeddings. In Proceedings of the International Conference on Machine Learning, PMLR, Long Beach, CA, USA, 9–15 June 2019; pp. 715–724.
  14. Dai, E.; Wang, S. Say no to the discrimination: Learning fair graph neural networks with limited sensitive attribute information. In Proceedings of the 14th ACM International Conference on Web Search and Data Mining, Online, 8–12 March 2021; pp. 680–688.
  15. Liu, J.; Li, Z.; Yao, Y.; Xu, F.; Ma, X.; Xu, M.; Tong, H. Fair representation learning: An alternative to mutual information. In Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, Washington, DC, USA, 14–18 August 2022; pp. 1088–1097.
  16. Oh, C.; Won, H.; So, J.; Kim, T.; Kim, Y.; Choi, H.; Song, K. Learning fair representation via distributional contrastive disentanglement. In Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, Washington, DC, USA, 14–18 August 2022; pp. 1295–1305.
  17. Lin, W.; Lan, H.; Cao, J. Graph privacy funnel: A variational approach for privacy-preserving representation learning on graphs. IEEE Trans. Dependable Secur. Comput. 2024, 22, 967–978.
  18. Yuan, H.; Xu, J.; Wang, C.; Yang, Z.; Wang, C.; Yin, K.; Yang, Y. Unveiling privacy vulnerabilities: Investigating the role of structure in graph data. In Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, Barcelona, Spain, 25–29 August 2024; pp. 4059–4070.
  19. Duddu, V.; Boutet, A.; Shejwalkar, V. Quantifying privacy leakage in graph embedding. In Proceedings of MobiQuitous 2020, the 17th EAI International Conference on Mobile and Ubiquitous Systems: Computing, Networking and Services, Darmstadt, Germany, 7–9 December 2020; pp. 76–85.
  20. Hu, Q.; Song, Y. Independent distribution regularization for private graph embedding. In Proceedings of the 32nd ACM International Conference on Information and Knowledge Management, Birmingham, UK, 21–25 October 2023; pp. 823–832.
  21. Kipf, T.N.; Welling, M. Variational graph auto-encoders. arXiv 2016, arXiv:1611.07308.
  22. Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is all you need. In Advances in Neural Information Processing Systems; NIPS: Long Beach, CA, USA, 2017; Volume 30.
  23. Bloch, F.; Jackson, M.O.; Tebaldi, P. Centrality measures in networks. Soc. Choice Welf. 2023, 61, 413–453.
  24. Wang, B.; Guo, J.; Li, A.; Chen, Y.; Li, H. Privacy-preserving representation learning on graphs: A mutual information perspective. In Proceedings of the 27th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Online, 14–18 August 2021; pp. 1667–1676.
  25. Li, K.; Luo, G.; Ye, Y.; Li, W.; Ji, S.; Cai, Z. Adversarial privacy-preserving graph embedding against inference attack. IEEE Internet Things J. 2020, 8, 6904–6915.
Figure 1. The architecture of the proposed model.
Figure 2. Dataset visualization: (a) Yale. (b) Rochester. (c) Credit defaulter.
Figure 3. The impact of different attack models: (a) Yale. (b) Rochester. (c) Credit defaulter.
Figure 4. The impact of different proportions of key nodes: (a) Yale. (b) Rochester. (c) Credit defaulter.
Figure 5. The impact of different attention heads: (a) Yale. (b) Rochester. (c) Credit defaulter.
Figure 6. The performance on link prediction: (a) Yale. (b) Rochester.
Table 1. Description of datasets.

Dataset            Nodes    Edges       Features   Attributes   Average Degree
Yale               8578     405,450     188        6            94.3
Rochester          4563     167,653     236        6            73.5
Credit defaulter   30,000   1,459,992   14         14           97.3
Table 2. Utility and privacy evaluation on Yale, Rochester, and Credit defaulter.

Dataset            Model    Utility ACC      Utility Macro-F1   Privacy ACC      Privacy Macro-F1
Yale               VGAE     0.902 ± 0.001    0.895 ± 0.001      0.867 ± 0.002    0.865 ± 0.001
                   APGE     0.813 ± 0.001    0.780 ± 0.009      0.614 ± 0.014    0.618 ± 0.015
                   PVGAE    0.840 ± 0.010    0.826 ± 0.002      0.740 ± 0.003    0.740 ± 0.002
                   GAE-MI   0.709 ± 0.005    0.704 ± 0.002      0.625 ± 0.010    0.636 ± 0.004
                   ours     0.848 ± 0.001    0.839 ± 0.002      0.612 ± 0.001    0.609 ± 0.010
Rochester          VGAE     0.873 ± 0.002    0.866 ± 0.002      0.646 ± 0.001    0.645 ± 0.001
                   APGE     0.852 ± 0.005    0.852 ± 0.004      0.611 ± 0.005    0.606 ± 0.005
                   PVGAE    0.830 ± 0.002    0.821 ± 0.014      0.594 ± 0.013    0.593 ± 0.004
                   GAE-MI   0.834 ± 0.011    0.826 ± 0.003      0.612 ± 0.004    0.609 ± 0.006
                   ours     0.865 ± 0.002    0.855 ± 0.002      0.569 ± 0.004    0.575 ± 0.004
Credit defaulter   VGAE     0.796 ± 0.002    0.638 ± 0.002      0.647 ± 0.014    0.654 ± 0.017
                   APGE     0.785 ± 0.003    0.640 ± 0.002      0.629 ± 0.001    0.654 ± 0.001
                   PVGAE    0.778 ± 0.003    0.621 ± 0.003      0.634 ± 0.04     0.632 ± 0.004
                   GAE-MI   0.789 ± 0.002    0.624 ± 0.018      0.619 ± 0.002    0.612 ± 0.002
                   ours     0.775 ± 0.003    0.745 ± 0.003      0.610 ± 0.010    0.611 ± 0.012
Table 3. Performance on link prediction.

Method    Yale AP          Yale AUC         Rochester AP     Rochester AUC
VGAE      0.886 ± 0.003    0.891 ± 0.002    0.926 ± 0.001    0.924 ± 0.011
APGE      0.784 ± 0.002    0.806 ± 0.002    0.892 ± 0.013    0.896 ± 0.002
PVGAE     0.817 ± 0.003    0.827 ± 0.014    0.909 ± 0.003    0.907 ± 0.002
GAE-MI    0.789 ± 0.010    0.811 ± 0.013    0.874 ± 0.002    0.879 ± 0.002
ours      0.847 ± 0.001    0.859 ± 0.001    0.916 ± 0.002    0.912 ± 0.011