Electronics
  • Article
  • Open Access

9 October 2024

TCα-PIA: A Personalized Social Network Anonymity Scheme via Tree Clustering and α-Partial Isomorphism

1 Guangxi Key Laboratory of Trusted Software, Guilin University of Electronic Technology, No. 1, Jinji Road, Qixing District, Guilin 541004, China
2 Guangxi Key Laboratory of Digital Infrastructure, Guangxi Zhuang Autonomous Region Information Center, Nanning 530201, China
* Author to whom correspondence should be addressed.

Abstract

Social networks have become integral to daily life, allowing users to connect and share information. The efficient analysis of social networks benefits fields such as epidemiology, information dissemination, marketing, and sentiment analysis. However, directly published social network data are vulnerable to privacy attacks such as the typical 1-neighborhood attack, which can infer the sensitive information of private users from users’ relationships and identities. To defend against such attacks, k-anonymity is widely used to protect user privacy by ensuring that each user is indistinguishable from at least $k-1$ other users. However, this approach requires extensive modifications that compromise the utility of the anonymized graph. In addition, it applies uniform privacy protection, ignoring users’ different privacy preferences. To address these challenges, this paper proposes an anonymity scheme called TCα-PIA (Tree Clustering and α-Partial Isomorphism Anonymization). Specifically, TCα-PIA first constructs a similarity tree to capture subgraph feature information at different levels using a novel clustering method. Then, it extracts the different privacy requirements of each user based on the node cluster. Using these privacy requirements, it employs an α-partial isomorphism-based graph structure anonymization method to meet the personalized privacy requirements of each user. Extensive experiments on four public datasets show that TCα-PIA outperforms other alternatives in balancing graph privacy and utility.

1. Introduction

With the rise of social platforms like WeChat and Twitter, a vast number of users have joined, forming large-scale social networks and generating massive amounts of data. This published data has significant value, and social network operators often share user relationship graphs and other information with third parties [1] to facilitate data analysis and mining, social group analysis, and personalized recommendations. However, directly publishing unprocessed data exposes users to significant privacy risks, such as privacy theft attacks, identity attacks, and social relationship inference attacks.
To protect user privacy in social networks, various privacy-preserving mechanisms have been widely employed, such as naïve anonymization mechanisms [2], k-anonymity [3,4,5], encryption techniques [6,7,8], and differential privacy [9,10,11,12]. Naïve anonymization replaces identity information in social network data with random pseudo-identities, effectively defending against zero-knowledge adversaries. However, it fails to protect against attackers who possess knowledge of the graph structure. Encryption techniques encrypt social network data to protect the privacy of sensitive information. However, these techniques are costly in terms of data encryption, decryption, and key management. Additionally, they lack flexibility and are vulnerable to privacy attacks once the key has been compromised. Differential privacy introduces noise into data and has been widely applied in data mining, machine learning, and data analysis due to its theoretical security and ease of implementation [13]. However, the added noise can reduce the accuracy of data analysis, affecting the practical utility of the data. In contrast, k-anonymity is based on the concept of data generalization, ensuring that each node in the graph is indistinguishable from at least $k-1$ other nodes by adding or deleting nodes and/or edges in the social network graph. This reduces the probability that an attacker identifies a user to $1/k$. In summary, k-anonymity provides stronger anonymity than naïve mechanisms, avoids the high computational overhead and complex key management required by encryption techniques, and is easier to understand and implement than differential privacy.
Many efforts have been made to develop an efficient k-anonymity scheme. For example, Yazdanjue et al. [14] proposed an EDPSO algorithm based on k-anonymity. This algorithm clusters nodes and edges in the network into super-nodes and super-edges to protect the structural information of individuals, thereby protecting user privacy. Although this approach effectively protects structural privacy, it does not address the personalized anonymization needs of users, indicating potential for further improvement. Wang et al. [15] proposed the GCA-DA algorithm. This algorithm clusters nodes by quantifying the distance and attribute similarity between them, ensuring that each cluster contains at least k nodes before anonymizing them. It protects privacy through attribute generalization and distinguishes between numerical and non-numerical attributes to maintain data usability. While this approach enhances clustering quality, it lacks flexibility. Zhang et al. [16] introduced a social network privacy protection scheme called GPPS. This scheme first employs degree-based graph entropy and spectral clustering algorithms to cluster nodes, and then adjusts 1-neighborhood graphs using a maximum weight bipartite matching method to achieve k-anonymity. However, it does not focus on graph isomorphism and fails to consider different user privacy requirements, indicating that the scheme can be further improved.
Therefore, to address the aforementioned problems and balance the privacy and utility of the anonymity scheme in social networks, this paper proposes a similarity tree clustering method and an α-partial isomorphism anonymization scheme based on the concepts of clustering and isomorphism. First, we perform node clustering using a similarity tree. We calculate the structural similarity between nodes and construct the similarity tree based on the obtained results. Next, we perform a unification operation on each cluster after the initial clustering to prevent attackers from exploiting differences in the number of users across clusters for identification attacks. Finally, we propose an α-partial isomorphism anonymization algorithm based on the concept of isomorphism. This method addresses different structural privacy requirements of users by using mapping relationships and mapping matrices. It aims to minimize modifications to the graph and improve the utility of the anonymized graph. Specifically, the main contributions of this paper are as follows.
  • We propose a new combined criterion for node similarity calculation to capture structural features of the 1-neighborhood subgraph from different factors, providing a foundation for subsequent node clustering.
  • We propose a similarity tree clustering method that constructs a connection relationship tree based on similarity results. This method better reveals correlations between users, achieves node clustering, and effectively mitigates information difference attacks.
  • We propose an α-partial isomorphism ($0 < \alpha \leq 1$) anonymization algorithm based on the concept of isomorphism, which meets users’ personalized structural privacy requirements while enhancing the utility of anonymized graphs.
  • We compare the proposed TCα-PIA scheme experimentally with schemes of the same type on real datasets. The results show that TCα-PIA has higher utility. In particular, its low information loss indicates that TCα-PIA has less impact on the original graph and better preserves the utility of the anonymized graph.
The rest of the paper is organized as follows. Section 2 reviews representative related works. Section 3 introduces the privacy protection scenarios and problem definitions for social networks. Section 4 details the newly proposed TCα-PIA. Section 5 evaluates the scheme and presents the experimental comparisons. Section 6 summarizes the work of this paper and outlines future work.

3. Preliminaries

  • Problem Statement. Given an undirected and unlabeled graph G(V, E) and its anonymized form G*(V*, E*), it is essential to ensure that any attacker with background knowledge of 1-neighborhood information (i.e., degree or structural information) cannot re-identify any individual structural information through queries on G*.
Definition 1
(Social network graph). An undirected, unlabeled graph $G(V, E)$, where $V$ represents the set of nodes (users) in the social network, $E$ represents the set of edges (relationships between users), and each $(v_i, v_j) \in E$ represents the edge between $v_i \in V$ and $v_j \in V$.
Publishing the original graph G without proper anonymization can result in significant privacy risks for users. To mitigate these risks, the graph is typically anonymized before publication, with the k-anonymity method being one of the most widely used approaches.
Definition 2
(k-anonymity). Given a graph $G(V, E)$, the graph is k-anonymous if each node $v \in V$ is indistinguishable from at least $k-1$ other nodes.
k-anonymity is commonly used to protect users’ identities by modifying nodes so that each user is indistinguishable from $k-1$ other users. However, adversaries may use additional information, such as 1-neighborhood knowledge, to re-identify target nodes. Therefore, it is also necessary to protect the neighborhood information.
Definition 3
(1-neighborhood subgraphs [16]). Given a graph $G(V, E)$, $g(v) = (V(g(v)), E(g(v)))$ represents the 1-neighborhood subgraph of $v$ in the graph $G$, where $V(g(v))$ represents the nodes connected to $v$ within 1-hop distance in $G$ together with $v$ itself, denoted as $V(g(v)) = \{u \mid (u, v) \in E\} \cup \{v\}$; $E(g(v))$ represents the set of edges between nodes in $V(g(v))$, denoted as $E(g(v)) = \{(u, w) \mid u, w \in V(g(v)) \wedge (u, w) \in E\}$.
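As an illustration of Definition 3, the following minimal sketch extracts the node set and edge set of a 1-neighborhood subgraph. The adjacency-dictionary representation, the function name `one_neighborhood`, and the toy graph `toy_adj` are assumptions made for this example and are not part of the original scheme.

```python
from itertools import combinations


def one_neighborhood(adj, v):
    """Return (nodes, edges) of the 1-neighborhood subgraph g(v).

    adj maps every node to the set of its neighbors (undirected graph).
    """
    # V(g(v)): the neighbors of v plus v itself.
    nodes = set(adj[v]) | {v}
    # E(g(v)): every edge of G whose two endpoints both lie in V(g(v)).
    edges = {
        frozenset((a, b))
        for a, b in combinations(nodes, 2)
        if b in adj[a]
    }
    return nodes, edges


# Hypothetical toy graph (not one of the paper's datasets).
toy_adj = {1: {2, 3}, 2: {1, 3, 5}, 3: {1, 2, 4, 5}, 4: {3}, 5: {2, 3}}

nodes, edges = one_neighborhood(toy_adj, 2)
print(sorted(nodes))
print(sorted(tuple(sorted(e)) for e in edges))
```

In this toy graph, node 4 is two hops away from node 2, so neither node 4 nor the edge (3, 4) appears in the extracted subgraph.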
To address the limitations of k-anonymity, we introduce k-neighborhood subgraph anonymity and other anonymity methods.
Definition 4
(k-neighborhood subgraph anonymity). A graph $G(V, E)$ satisfies k-neighborhood subgraph anonymity if for each node $v \in V$, there are at least $k-1$ other nodes in $V$ that have the same 1-neighborhood subgraph as $v$.
Definition 5
(Graph isomorphism [33]). Given two graphs $G(V, E)$ and $G^*(V^*, E^*)$, where $|V| = |V^*|$, $G$ and $G^*$ are isomorphic if there exists a bijection $h: V(G) \to V(G^*)$ such that $(u, v) \in E$ if and only if $(h(u), h(v)) \in E(G^*)$. It is also said that there exists an isomorphism from $G$ to $G^*$, or that $(u, v)$ is isomorphic to $(h(u), h(v))$.
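To make Definition 5 concrete, the sketch below tests two small graphs for isomorphism by trying every bijection $h$. This brute-force search is only an illustration for very small graphs (it is factorial in the number of nodes) and is not the method used by TCα-PIA; the function name `are_isomorphic` and the example graphs are assumptions.

```python
from itertools import permutations


def are_isomorphic(nodes_a, edges_a, nodes_b, edges_b):
    """Brute-force isomorphism test for small simple undirected graphs."""
    ea = {frozenset(e) for e in edges_a}
    eb = {frozenset(e) for e in edges_b}
    # A bijection that preserves edges needs equal node and edge counts.
    if len(nodes_a) != len(nodes_b) or len(ea) != len(eb):
        return False
    order_a = list(nodes_a)
    for perm in permutations(list(nodes_b)):
        h = dict(zip(order_a, perm))  # candidate bijection h: V(G) -> V(G*)
        # With equal edge counts, mapping every edge of G onto an edge of G*
        # is equivalent to the "if and only if" condition of the definition.
        if all(frozenset(h[x] for x in e) in eb for e in ea):
            return True
    return False


path = ([1, 2, 3, 4], [(1, 2), (2, 3), (3, 4)])
relabeled_path = (["a", "b", "c", "d"], [("c", "a"), ("a", "d"), ("d", "b")])
star = ([1, 2, 3, 4], [(1, 2), (1, 3), (1, 4)])

print(are_isomorphic(*path, *relabeled_path))
print(are_isomorphic(*path, *star))
```

The path and the star have the same number of nodes and edges, so only the search over bijections can tell them apart.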
Definition 6
(Partial isomorphism [35]). Given two graphs $G(V, E)$ and $G^*(V^*, E^*)$, and their subgraph structures $A$ and $B$, $G$ and $G^*$ are partially isomorphic if there exists a partial function, which is also a bijective function, $s: A \to B$, such that the relationship between the nodes in $A$ is preserved and reflected in $B$ when a node in $A$ is mapped to $B$.
Definition 7
(k-isomorphism [33]). Given a graph $G(V, E)$, $G$ satisfies k-isomorphism if $G$ consists of $k$ disjoint subgraphs, i.e., $G = \{g_1, g_2, \ldots, g_k\}$, and these $k$ subgraphs are isomorphic to each other.

4. TCα-PIA

This paper proposes a novel tree clustering and α-partial isomorphism anonymization scheme (TCα-PIA) that satisfies different privacy requirements of users in the social network, effectively resists 1-neighborhood subgraph attacks, and preserves high utility of the anonymized graph.
As shown in Figure 1, the main process of TCα-PIA includes the following steps: (1) Cluster nodes into multiple groups using the proposed similarity tree method. (2) Reduce inter-cluster differences using the proposed branch unification method. (3) Perform α-partial isomorphism anonymization on the clustered groups to generate the anonymized graph.
Figure 1. An overview of TCα-PIA.

4.1. Node Clustering Based on Similarity Tree

This section proposes a clustering method based on constructing a similarity tree. This method groups similar nodes into multiple clusters, denoted as $C = \{C_1, C_2, \ldots, C_T\}$, where $T$ is the number of clusters. In Algorithm 1, the process is divided into three main steps. First, we compute the similarity between each pair of nodes (lines 1–15). Second, we construct a similarity tree based on the similarity results (lines 16–26). Finally, we perform branch unification on the constructed similarity tree to minimize inter-cluster differences (line 27). The details are as follows.
Algorithm 1 Similarity tree established
  • Require: Original graph $G(V, E)$, anonymity parameter $k$, number of nodes $n$
  • Ensure: Number of clusters $C$, ordered similarity list $D$, similarity tree $Tree$
1: for $i = 1$ to $n$ do
2:     Compute $x_i = \langle d_i, E_{g_i}, \Delta_i, OE_{\Delta_i} \rangle$ for each node
3: end for
4: $m = 0$
5: for $i = 1$ to $n$ do
6:     for $j = i + 1$ to $n$ do
7:         Calculate the similarity of each node 1-neighborhood graph
8:     end for
9:     $M$ = Sort Based on Similarity
10:     for $z = 1$ to $n$ do
11:         $D(m, 0) = M(z - 1, 0)$
12:         $D(m, 1) = M(z - 1, 1)$
13:         $m = m + 1$
14:     end for
15: end for
16: $m = 0$, $Tree \leftarrow$ null, $Tree.root \leftarrow$ null
17: for $i = 1$ to $n$ do
18:     $v_i.parent \leftarrow D(m, 1)$
19:     $v_i.parent.children \leftarrow v_i$
20:     $m = m + n$
21: end for
22: for $i = 1$ to $n$ do
23:     if $v_i.parent$ is null then
24:         $v_i.parent \leftarrow Tree.root$
25:     end if
26: end for
27: Branch unification (see Algorithm 2)
  • return $C$, $D$, $Tree$
Step 1: node similarity calculation. The similarity between each pair of nodes is calculated by the combined criterion, which includes four factors: the node’s degree, the number of edges in the 1-neighborhood subgraph of the node, the number of triangles in which the node participates (triangles involving the node), and the number of common edges of triangles in which the node participates (overlapping edge of participating triangles). The latter two factors are further given as follows:
Definition 8
(Triangles involving the node). An undirected and unlabeled original graph $G(V, E)$, where $V = V(G)$ is the set of nodes and $E = E(G)$ is the set of edges. The triangles involving the node $v$ are defined as $\Delta(v) = \{(u, v, z) \mid u, z \in V \wedge (u, v) \in E(G) \wedge (v, z) \in E(G) \wedge (u, z) \in E(G)\}$, where $u$, $v$, and $z$ are nodes in $G$. For example, Figure 2b is the 1-neighborhood subgraph of $v_2$. It contains two triangles: one comprises $v_1$, $v_2$, and $v_3$, and the other comprises $v_2$, $v_3$, and $v_5$. For ease of understanding, we will use the term “participating triangle” in the following.
Figure 2. Example of triangles involving the node: (a) Original graph. (b) The 1-neighborhood subgraph of $v_2$ containing two participating triangles $(v_1, v_2, v_3)$ and $(v_2, v_3, v_5)$ and one overlapping edge $(v_2, v_3)$.
Definition 9
(Overlapping edge of participating triangles). An undirected, unlabeled graph $G(V, E)$, where $V$ is the set of nodes and $E$ is the set of edges. We define the common edge between several participating triangles in Definition 8 as the overlapping edge of the participating triangles: $OE = \{(u, v) \mid u, v \in V \wedge (u, v) \in E(\Delta_1) \cap E(\Delta_2) \wedge E(\Delta_1), E(\Delta_2) \subseteq E\}$. For example, the red line in Figure 2b is the common edge between the above two triangles.
Then, we give the formula of similarity calculation based on the four factors mentioned above.
$$\Delta f = \lVert x_i - x_j \rVert \tag{1}$$
$$x_i = \langle d_i, E_{g_i}, \Delta_i, OE_{\Delta_i} \rangle \tag{2}$$
where $\Delta f$ is the similarity function, and the smaller its result, the higher the similarity between two nodes. $x_i$ is a vector describing the node $v_i$, and it includes four factors: $d_i$, $E_{g_i}$, $\Delta_i$, and $OE_{\Delta_i}$. $d_i$ is the node’s degree, $E_{g_i}$ is the number of edges in the 1-neighborhood subgraph $g_i$ of the node, $\Delta_i$ is the number of triangles involving the node, and $OE_{\Delta_i}$ is the number of overlapping edges of participating triangles. The last two factors, $\Delta_i$ and $OE_{\Delta_i}$, account for the stability of the triangle structures and the number of participating paths, thus better capturing the 1-neighborhood subgraph structure.
After calculating the similarity, the results are stored in a list $D$ of size $n \times 2$. $D(i, 0)$ stores the similarity values between $v_i$ and other nodes in ascending order, denoted as $Sim(v_i) = \{Sim(v_i, v_j) \mid i, j = 1, 2, \ldots, n - 1 \text{ and } i \neq j\}$. $D(i, 1)$ stores the corresponding nodes in the order of $Sim(v_i)$, denoted as $PN(v_i) = \{v_j \mid j = 1, 2, \ldots, n - 1\}$.
$$D = \begin{bmatrix} Sim(v_1) & PN(v_1) \\ Sim(v_2) & PN(v_2) \\ \vdots & \vdots \\ Sim(v_n) & PN(v_n) \end{bmatrix}$$
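Step 1 can be sketched as follows: compute the four-factor vector of each node and, for every node, the list of other nodes ordered by increasing distance. This is a minimal sketch under stated assumptions: the Euclidean norm is assumed for the similarity function, the per-node sorted list stands in for the rows of $D$, and the names `feature_vector`, `similarity_list`, and `toy_adj` are hypothetical.

```python
from itertools import combinations
from math import dist


def feature_vector(adj, v):
    """Return (degree, edges in g(v), participating triangles, overlapping edges)."""
    nbrs = adj[v]
    nodes = nbrs | {v}
    degree = len(nbrs)
    sub_edges = sum(1 for a, b in combinations(nodes, 2) if b in adj[a])
    # A participating triangle is a pair of neighbors of v that are adjacent.
    triangles = [(u, z) for u, z in combinations(nbrs, 2) if z in adj[u]]
    # An overlapping edge belongs to at least two participating triangles.
    edge_use = {}
    for u, z in triangles:
        for e in (frozenset((v, u)), frozenset((v, z)), frozenset((u, z))):
            edge_use[e] = edge_use.get(e, 0) + 1
    overlapping = sum(1 for count in edge_use.values() if count >= 2)
    return (degree, sub_edges, len(triangles), overlapping)


def similarity_list(adj):
    """For each node, list (distance, other node) pairs in ascending order."""
    x = {v: feature_vector(adj, v) for v in adj}
    # Assumption: Euclidean distance; ties are broken by node identifier.
    return {
        v: sorted((dist(x[v], x[u]), u) for u in adj if u != v)
        for v in adj
    }


# Hypothetical toy graph (not one of the paper's datasets).
toy_adj = {1: {2, 3}, 2: {1, 3, 5}, 3: {1, 2, 4, 5}, 4: {3}, 5: {2, 3}}

D = similarity_list(toy_adj)
print(feature_vector(toy_adj, 2))
print(D[1][0])
```

In this toy graph, node 2 takes part in two triangles that share one edge, and nodes 1 and 5 have identical feature vectors, so each is the other's most similar node.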
Step 2: similarity tree construction. To ensure tight connections within clusters, we construct a similarity tree based on the calculated similarities. Each node is connected to its most similar node, referred to as the parent node, forming the branches of the tree. For two distinct nodes that are most similar to each other, such as $v_2$ and $v_6$ in Table 1, one node is randomly chosen as the parent and the other as the child to form the relationship tree. Nodes without a parent node are connected directly to the root node, which is used only for tree construction and does not correspond to an actual node in the graph, nor does it participate in the clustering process. Each branch under the root node represents a cluster. The relevant definitions are as follows:
Table 1. Example of parent–child node relationship.
Definition 10
(Parent–child node relationship). For each $v_i$, identify the $v_j$ that is most similar to $v_i$ by selecting the first element in $PN(v_i)$, denoted as $P(v_i) = PN(v_i)[1]$, where $P(v_i)$ is the parent node of $v_i$.
Table 1 and Figure 3 show the establishment of parent–child relationships and the construction process of the similarity tree, respectively. Each node is directly connected to its parent node, $P(v_i)$. For example, both $v_7$ and $v_4$ have the same parent node, $v_6$, and are directly connected to it. In cases where two nodes are most similar to each other, such as $v_2$ and $v_6$, one node is randomly selected as the parent. Here, $v_2$ is chosen as the parent of $v_6$; the relationship between $v_3$ and $v_5$ is established in the same manner. Consequently, we obtain two branches in Figure 3a: $\{v_2, v_6, v_7, v_4, v_8, v_9\}$ and $\{v_3, v_5, v_1\}$. From Figure 3a, it is clear that $v_2$ and $v_3$ do not have parent nodes. For such nodes, we directly connect them to the root node; i.e., $v_2$ and $v_3$ are connected directly to the root node, as indicated by the text “root” in Table 1.
Figure 3. Example of similarity tree construction (setting $k = 3$): (a) Two branches, $\{v_2, v_6, v_7, v_4, v_8, v_9\}$ and $\{v_3, v_5, v_1\}$, where $v_2$ and $v_3$ have no parent nodes. (b) Connect the nodes $v_2$ and $v_3$ to the root node. (c) After branch unification, we obtain three independent branches, corresponding to three clusters: $C_1 = \{v_7, v_8, v_9\}$, $C_2 = \{v_2, v_6, v_4\}$, and $C_3 = \{v_3, v_5, v_1\}$.
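The construction in Step 2 can be sketched as below. The sketch takes the most similar node of every node as input, breaks each cycle of mutually most similar nodes by attaching one of its members to the root, and then collects the branches. Two points are assumptions of this example: the cycle is broken deterministically at the smallest node identifier (the text chooses the parent randomly), and the parent entries for $v_1$, $v_8$, and $v_9$ are hypothetical because Table 1 is not reproduced here.

```python
def build_similarity_tree(most_similar):
    """Return (parent, branches); parent[v] is None when v hangs off the root."""
    parent = dict(most_similar)
    for start in sorted(parent):
        seen = []
        node = start
        # Follow parent pointers until the root or an already visited node.
        while node is not None and node not in seen:
            seen.append(node)
            node = parent[node]
        if node is not None:
            # The walk closed a cycle (e.g., two mutually most similar nodes).
            cycle = seen[seen.index(node):]
            # Assumption: deterministic choice instead of a random one.
            parent[min(cycle)] = None
    branches = {}
    for node in parent:
        top = node
        while parent[top] is not None:
            top = parent[top]
        branches.setdefault(top, set()).add(node)
    return parent, branches


# Keys and values are node indices; entries for 1, 8, and 9 are hypothetical.
most_similar = {1: 5, 2: 6, 3: 5, 4: 6, 5: 3, 6: 2, 7: 6, 8: 7, 9: 8}

parent, branches = build_similarity_tree(most_similar)
print(branches)
```

With these inputs, the two branches match the ones listed for Figure 3a, with $v_2$ and $v_3$ attached to the root.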
Step 3: branch unification of the similarity tree. The varying branch sizes in the similarity tree can cause differences in anonymity levels and increase the risk of information leakage. We therefore unify the branches of the similarity tree so that all clusters have the same size. As shown in Figure 3b,c, when $k = 3$, we split the branches that contain significantly more than three nodes. Starting from the bottom, we separate $v_9$, $v_8$, and $v_7$ from the branch to form an independent branch and connect it to the root node.
Algorithm 2 presents the details for the branch unification of the similarity tree. The key steps are as follows: (1) Traverse each branch of the similarity tree and perform removal or splitting operations on nodes within the branches (lines 2–10) to achieve a unified number of nodes among branches. The quality of clustering is ensured by rationalizing and reassigning nodes and tree branches. (2) Mark branches with more than $k$ nodes and those with significantly fewer than $k$ nodes. Remove the excess nodes from branches with more than $k$ nodes and eliminate branches with fewer than $k$ nodes. Based on the similarity results, reassign the removed nodes to the next-most-similar parent node as sub-nodes of that node (lines 11–19). (3) To maintain clustering quality, check the number of neighbors for the nodes within each branch. Remove nodes with significant differences in similarity and reassign them to the next-most-similar node (lines 20–26).
Algorithm 2 Branch unification
  • Require: $k$, $D$, tree root $Tree\_root$
  • Ensure: $C$, $Tree$
1: $S \leftarrow \emptyset$, $root \leftarrow Tree\_root$
2: for branch in root.children do         ▹ Iterate through branches under the root node
3:     if $|branch| < k$ then
4:         Remove the branch from root and add nodes from the branch to $S$
5:     else if $|branch| > k$ then
6:         Place the remaining $|branch| \bmod k$ nodes in $S$
7:         $Rb_i \leftarrow$ Divide branch into equal branches
8:         Add $Rb_i$ into $root.children$        ▹ Split the branch and connect to the root node
9:     end if
10: end for
11: while $S$ not empty do               ▹ Search for the next parent for the removed nodes
12:     for $i = 1$ to $|S|$ do
13:         Find next parent node $pv_i$ from $D$
14:         if $|branch(pv_i)| < k$ then
15:             $v_i.parent \leftarrow pv_i$
16:             Remove $v_i$ from $S$
17:         end if
18:     end for
19: end while
20: $S \leftarrow \emptyset$
21: for $branch$ in $root.children$ do
22:     Remove nodes with significant differences
23:     Add nodes to $S$
24: end for
25: Repeat steps 12–19 until $S$ is empty
26: Update $Tree$, $C \leftarrow$ branches
  • return $C$, $Tree$

4.2. Graph Anonymity Modification Based on α-Partial Isomorphism

After clustering nodes, this section anonymizes nodes within each cluster using the proposed α-partial isomorphism. This method includes the following steps: (1) Select seed nodes and determine α values for each cluster. (2) Establish the mapping relationship between the 1-neighborhood graph of the node and the 1-neighborhood graph of the seed node in each cluster. (3) Based on the mapping relationship, establish a mapping matrix between the nodes and their corresponding seed nodes. Compare the structural information in the matrix and calculate the percentage of identical structures. Then, compare this proportion with the α value and perform the necessary anonymization modifications based on the requirements. We first give the relevant definitions before presenting the α-partial isomorphism algorithm.
Definition 11
(Mapping relationship). Given a graph $G(V, E)$, where $g(v)$ and $g(u)$ are 1-neighborhood subgraphs of nodes $v$ and $u$ in $G$, if there is a correspondence between nodes in $g(v)$ and $g(u)$, then there is a mapping between $g(v)$ and $g(u)$, denoted as $f: g(v) \to g(u)$. Figure 4 and Table 2 show an example of a mapping relationship.
Figure 4. Mapping relationship.
Table 2. Subgraph node sequence and mapping relationship.
Definition 12
(Centrality of edge neighborhood [23]). The neighborhood centrality of an edge is used to measure the importance or centrality of an edge in a network, which describes the influence of an edge ( v i , v j ) in its neighborhood and can be expressed by the following equation:
$$NC_{\{v_i, v_j\}} = \frac{|\Gamma(v_i) \cup \Gamma(v_j)| - |\Gamma(v_i) \cap \Gamma(v_j)|}{2 \max(deg)} \tag{3}$$
where $NC$ represents the neighborhood centrality of the edge $(v_i, v_j)$, and $\Gamma(v_i)$ denotes the set of neighbors of node $v_i$. If the edges between two nodes are removed randomly, there is a risk of destroying the triangles in the graph. Therefore, based on the above definition, this paper uses the edge participation measure mentioned in Definition 9 to minimize the effect on the triangle structure. The calculation is as follows:
$$NC\_remove(v_i, v_j) = NC(v_i, v_j) \times (r_{\Delta g_i} + 1) \tag{4}$$
where $NC\_remove(v_i, v_j)$ is the removal score of the edge $(v_i, v_j)$; the edge with the smallest score is the one to be removed.
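The two measures above can be sketched as follows, reading the neighborhood centrality as the number of nodes adjacent to exactly one endpoint of the edge, normalized by twice the maximum degree. The triangle-participation term of the removal score is not defined in this section, so the sketch takes it as a caller-supplied value `r`; using the number of common neighbors of the two endpoints for `r` in the example is purely an assumption, as are the function names and the toy graph.

```python
def neighborhood_centrality(adj, u, v):
    """NC of edge (u, v): (|union| - |intersection|) / (2 * max degree)."""
    union = adj[u] | adj[v]
    common = adj[u] & adj[v]
    max_degree = max(len(nbrs) for nbrs in adj.values())
    return (len(union) - len(common)) / (2 * max_degree)


def removal_score(adj, u, v, r):
    """NC_remove of edge (u, v); r is the caller-supplied triangle term."""
    return neighborhood_centrality(adj, u, v) * (r + 1)


def cheapest_edge(adj, v, r_of):
    """Neighbor u of v whose edge (v, u) has the smallest removal score."""
    return min(adj[v], key=lambda u: (removal_score(adj, v, u, r_of(v, u)), u))


# Hypothetical toy graph (not one of the paper's datasets).
toy_adj = {1: {2, 3}, 2: {1, 3, 5}, 3: {1, 2, 4, 5}, 4: {3}, 5: {2, 3}}


def triangles_through(u, v):
    # Assumption: r counts the triangles that contain the edge (u, v).
    return len(toy_adj[u] & toy_adj[v])


print(neighborhood_centrality(toy_adj, 2, 3))
print(cheapest_edge(toy_adj, 3, triangles_through))
```

Under these assumptions, the edge from node 3 to the pendant node 4 receives the lowest score because it lies in no triangle, so removing it leaves every triangle intact.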
Figure 5 provides an example to illustrate the process of α-partial isomorphism anonymization based on the above definitions. (1) First, the seed node, highlighted in red, is selected as the node with the largest number of neighbors in each cluster. Simultaneously, the maximum α value for each cluster is determined and labeled as $\alpha_i$ ($i = 1, 2, 3, 4$), also highlighted in red in Figure 5. (2) Next, we establish the mapping relationship between the 1-neighborhood graph of each node and the 1-neighborhood graph of the seed node in descending order by node degree within each cluster, and establish mapping matrices $MD$. For illustration, we use cluster $C_2$, which contains nodes $v_2$ and $v_3$, with a similar process applied to other clusters. We calculate the percentage of identical structures between nodes $v_2$ and $v_3$ by comparing the common edges in $MD$, where $MD[i][j] = 1$ indicates the presence of an edge between the two nodes and $MD[i][j] = 0$ indicates its absence. Although $v_2$ and $v_3$ have the same 1-neighborhood structure with six identical edges, the unequal number of neighbors means that $\alpha_2$ (100%)-partial isomorphism is not satisfied. To achieve structural isomorphism with the seed node $v_2$, a fake node $f_1$ is added and connected to $v_3$. (3) After modifications within all clusters, the fake nodes are merged to avoid the excessive addition of fake nodes, as shown in Figure 5. Merging $f_1$ and $f_2$ is effective because it does not compromise the privacy of nodes $v_3$ and $v_8$.
Figure 5. α-partial isomorphism: (a) Original graph $G$. (b) Select the seed node with the largest number of neighbors and maximum $\alpha_i$ value in each cluster. (c) Establish the mapping relationship between the 1-neighborhood subgraph of a node and the 1-neighborhood subgraph of the seed node in each cluster. (d) Modify the 1-neighborhood subgraph structure of a node to achieve $\alpha_i$-partial isomorphism by referencing the 1-neighborhood structure of the seed node. (e) Anonymized graph before the fake nodes are merged. (f) Merge the fake nodes to obtain the final anonymized graph $G^*$.
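The comparison of mapping matrices can be sketched as below. The text does not give the exact formula for the percentage of identical structures, so this sketch assumes it is the number of edges present in both aligned matrices divided by the number of edges present in at least one of them. The function name `identical_share` and both matrices are hypothetical, and the node's matrix is assumed to be already padded and ordered according to the mapping relationship.

```python
def identical_share(md_seed, md_node):
    """Share of edges common to two aligned 0/1 mapping matrices."""
    n = len(md_seed)
    common = union = 0
    # Only the upper triangle is scanned because the graphs are undirected.
    for i in range(n):
        for j in range(i + 1, n):
            a, b = md_seed[i][j], md_node[i][j]
            if a and b:
                common += 1
            if a or b:
                union += 1
    return common / union if union else 1.0


# Hypothetical matrices: the seed has one more neighbor (index 3) than the node.
md_seed = [[0, 1, 1, 1], [1, 0, 1, 0], [1, 1, 0, 0], [1, 0, 0, 0]]
md_node = [[0, 1, 1, 0], [1, 0, 1, 0], [1, 1, 0, 0], [0, 0, 0, 0]]

before = identical_share(md_seed, md_node)

# Connecting a fake node at index 3 closes the gap, as in the Figure 5 example.
md_node[0][3] = md_node[3][0] = 1
after = identical_share(md_seed, md_node)
print(before, after)
```

With a 100% requirement, the node fails the check before the fake node is attached and passes it afterwards.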
Algorithm 3 presents a detailed description of graph anonymity based on α-partial isomorphism anonymization. (1) First, the node with the largest number of neighbors in each cluster is selected as the seed node, and the maximum value of $\alpha_i$ is determined for each cluster $C_i$ (lines 1–3). The selection of $\alpha_i$ is based on the privacy threshold specified by the users. The privacy requirements of the users are classified into three levels, strong, medium, and weak, corresponding to the intervals [0%, 30%], (30%, 60%], and (60%, 100%], respectively. $\alpha_i$ is determined by identifying the highest privacy requirement within each cluster. For example, if the privacy requirements of three users in cluster $C_i$ are 25%, 52%, and 20%, then $\alpha_i$ is set to 52% for $C_i$, indicating a medium privacy requirement. (2) Second, after selecting the seed node and the maximum $\alpha_i$ in each cluster, we use the mapping matrix to evaluate the structural similarity between the seed node and other nodes. If the similarity meets the threshold $\alpha_i$, these nodes are considered isomorphic to the seed node and require no further anonymization. Otherwise, we adjust their 1-neighborhood subgraph structures until $\alpha_i$ is reached (lines 4–37). (3) Third, since the anonymization process described in Step 2 may involve the addition of fake nodes, we propose a node-merging strategy. This strategy evaluates pairs of fake nodes to determine whether they can be merged without compromising the privacy of the connected nodes. If merging two fake nodes changes the 1-neighborhood structure of the connected nodes so that their $\alpha_i$ structure anonymity requirements are no longer satisfied, the fake nodes are not merged. This approach ensures privacy while optimizing the graph structure and minimizing the disruption caused by the addition of fake nodes (lines 38–42).
Algorithm 3 Graph anonymity
  • Require: $G(V, E)$, structure level anonymity threshold $\alpha$
  • Ensure: Anonymized graph $G^*$
1: for $i = 1$ to $|C|$ do
2:     Find seed node in $C_i$
3:     Determine the maximum $\alpha_i$ value in $C_i$
4:     for node in $C_i$ do
5:         if $sim(node, seed) < \alpha_i$ then
6:             Add $(d_{seed} - d_{node})$ fake nodes into $g_{node}$
7:             Sort the nodes in subgraph $(g_{seed}, g_{node})$ according to degree
8:             $Mp(g_{seed}, g_{node}) \leftarrow$ Mapping node in seed node
9:             $MD(g_{seed}, g_{node}) \leftarrow$ Mapping matrix
10:             $S_1 \leftarrow \emptyset$, $S_2 \leftarrow \emptyset$
11:             for $v$ in $g_{node}$ do
12:                 if $d_v < d_{Mp(v, seed)}$ then
13:                     Add $v$ into $S_1$
14:                 else if $d_v > d_{Mp(v, seed)}$ then
15:                     Add $v$ into $S_2$
16:                 end if
17:             end for
18:             for $v_1, v_2$ in $S_1$ do
19:                 Add the edge $(v_1, v_2)$
20:             end for
21:             for $v_1, v_2$ in $S_2$ do
22:                 if $d_{v_1} > d_{Mp(v_1, seed)}$ and $v_2 \in g_{v_1}$ and $\min\{NC\_remove(v_1, v_2)\}$ then
23:                     Remove the edge $(v_1, v_2)$
24:                 end if
25:             end for
26:             Update $S_1$, $S_2$
27:             if $sim(node, seed) < \alpha_i$ then
28:                 Matching triangle structures with $MD(g_{seed}, g_{node})$
29:                 Repeat similar steps 6–27
30:                 if $sim(node, seed) < \alpha_i$ then
31:                     Modify edges using $MD$
32:                 end if
33:                 Repeat until $sim(node, seed) \geq \alpha_i$
34:             end if
35:         end if
36:     end for
37: end for
38: for each fake node pair $(f_1, f_2)$ do
39:     if no node privacy is compromised by merging $f_1$ and $f_2$ then
40:         Merge fake nodes $f_1$ and $f_2$
41:     end if
42: end for
  • return $G^*$
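The choice of $\alpha_i$ in lines 1–3 of Algorithm 3 can be sketched as follows, using the three privacy levels and the worked example from the description of the algorithm. The function names `privacy_level` and `cluster_alpha` are hypothetical, and requirements are written as fractions (0.52 for 52%).

```python
def privacy_level(alpha):
    """Map a privacy requirement in [0, 1] to strong, medium, or weak."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if alpha <= 0.30:
        return "strong"   # [0%, 30%]
    if alpha <= 0.60:
        return "medium"   # (30%, 60%]
    return "weak"         # (60%, 100%]


def cluster_alpha(requirements):
    """alpha_i of a cluster: the maximum requirement among its users."""
    return max(requirements)


alpha_i = cluster_alpha([0.25, 0.52, 0.20])
print(alpha_i, privacy_level(alpha_i))
```

For the three users in the example, the cluster threshold is 52%, which falls in the medium interval.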
Figure 6 shows examples of edge addition and deletion in the Algorithm 3, covering cases not illustrated in Figure 5.
Figure 6. Edge modification strategy: (a) A degree reduction/edge deletion strategy for when there is an edge between v 1 and v 2 . (b) A degree reduction/edge deletion strategy for when there is no edge between v 1 and v 2 . (c) A degree increase/edge addition strategy for when there is no edge between v 1 and v 2 . (d) A degree increase/edge addition strategy for when there is an edge between v 1 and v 2 . (e) An edge swapping strategy for when there is no edge between v 1 and v 2 . (f) An edge swapping strategy for when there is an edge between v 1 and v 2 .
(1) Both $v_1$ and $v_2$ need to decrease their degrees. If there is an edge between $v_1$ and $v_2$, delete it. If there is no edge between them, use Formulas (3) and (4) to find and delete the edge with the least effect at each node, e.g., $(v_1, v_3)$ and $(v_2, v_4)$. To ensure the degrees of $v_3$ and $v_4$ remain unchanged, add an edge between them.
(2) Both $v_1$ and $v_2$ need to increase their degrees. If there is no edge between $v_1$ and $v_2$, add one. If an edge already exists between them, find a node that is not adjacent to $v_1$ and a node that is not adjacent to $v_2$, and connect them to $v_1$ and $v_2$, respectively. To keep the degrees of the newly connected nodes unchanged, the existing edge between them is removed. For example, connect $v_3$ to $v_1$ and $v_4$ to $v_2$, while removing the edge between $v_3$ and $v_4$.
(3) $v_1$ needs to increase its degree, while $v_2$ needs to decrease its degree. We use Formulas (3) and (4) to remove the edge with the lowest centrality in $v_2$’s neighborhood and reconnect the disconnected node to $v_1$. The degree of the node previously connected to $v_2$ is unchanged, while the degree of $v_1$ increases and the degree of $v_2$ decreases, as required.
(4) $v_1$ needs to decrease its degree, while $v_2$ needs to increase its degree. This process is analogous to (3), so a detailed explanation is omitted here.
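The four cases above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in the experiments: the graph is a dictionary of adjacency sets, the function names (`decrease_both`, `increase_both`, `shift_degree`) are hypothetical, and `edge_cost` is only a stand-in for Formulas (3) and (4), which are not reproduced here.

```python
from typing import Dict, Hashable, Set

Graph = Dict[Hashable, Set[Hashable]]


def add_edge(g: Graph, u: Hashable, v: Hashable) -> None:
    g[u].add(v)
    g[v].add(u)


def remove_edge(g: Graph, u: Hashable, v: Hashable) -> None:
    g[u].discard(v)
    g[v].discard(u)


def edge_cost(g: Graph, u: Hashable, v: Hashable) -> int:
    # ASSUMPTION: placeholder for Formulas (3) and (4).
    # Here the cost is the number of triangles that contain edge (u, v).
    return len(g[u] & g[v])


def decrease_both(g: Graph, v1: Hashable, v2: Hashable) -> None:
    """Case (1): lower the degree of v1 and v2 by one each."""
    if v2 in g[v1]:
        remove_edge(g, v1, v2)
        return
    # Pick (v1, v3) and (v2, v4) with v3 != v4 and v3, v4 non-adjacent,
    # so that adding (v3, v4) restores their degrees.
    pairs = [(a, b) for a in g[v1] for b in g[v2] if a != b and b not in g[a]]
    if not pairs:
        raise ValueError("no admissible pair of edges to delete")
    v3, v4 = min(pairs, key=lambda p: edge_cost(g, v1, p[0]) + edge_cost(g, v2, p[1]))
    remove_edge(g, v1, v3)
    remove_edge(g, v2, v4)
    add_edge(g, v3, v4)


def increase_both(g: Graph, v1: Hashable, v2: Hashable) -> None:
    """Case (2): raise the degree of v1 and v2 by one each."""
    if v2 not in g[v1]:
        add_edge(g, v1, v2)
        return
    # Find an existing edge (v3, v4) with v3 not adjacent to v1 and
    # v4 not adjacent to v2, then rewire it (first match, cost ignored).
    for v3 in g:
        if v3 in (v1, v2) or v3 in g[v1]:
            continue
        for v4 in list(g[v3]):
            if v4 in (v1, v2) or v4 in g[v2]:
                continue
            remove_edge(g, v3, v4)
            add_edge(g, v1, v3)
            add_edge(g, v2, v4)
            return
    raise ValueError("no admissible edge to rewire")


def shift_degree(g: Graph, v_up: Hashable, v_down: Hashable) -> None:
    """Cases (3) and (4): move one edge endpoint from v_down to v_up."""
    candidates = [w for w in g[v_down] if w != v_up and w not in g[v_up]]
    if not candidates:
        raise ValueError("no neighbor of v_down can be moved to v_up")
    w = min(candidates, key=lambda x: edge_cost(g, v_down, x))
    remove_edge(g, v_down, w)
    add_edge(g, v_up, w)
```

Case (3) corresponds to `shift_degree(g, v1, v2)` and case (4) to `shift_degree(g, v2, v1)`. In every branch, only the degrees of the two target nodes change.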

4.3. Algorithm Complexity Analysis

According to the previous sections, TCα-PIA consists of two main phases: node clustering based on the similarity tree, and graph modification based on α-partial isomorphism anonymization.
In the node clustering phase, the complexity of similarity calculation is $O(n^2)$, where $n$ is the number of nodes. The complexity of similarity tree construction is $O(n)$. In the branch unification of the similarity tree, nodes and branches not meeting the requirements are removed, with a complexity of $O(l)$, where $l$ is the number of branches. The complexity of reallocating these removed nodes and branches has two cases. In the best case, each excluded node can easily find its parent node, and the branch of the parent node can continue to accommodate its child nodes. The complexity is $O(m)$, where $m$ is the number of nodes removed from the branches. In the worst case, the parent node’s branch cannot accommodate new nodes, and the excluded node must continue searching for a suitable parent node. The complexity is $O(m \cdot n)$. In summary, the best-case complexity of the clustering phase, including branch unification, is $O(n^2 + n + l + m)$, and the worst-case complexity is $O(n^2 + n + l + m \cdot n)$. Since $l, m \ll n$, the overall complexity of the clustering process is approximately $O(n^2)$.
In the graph-anonymization phase, each cluster is searched for the seed node and α, with a time complexity of $O(|C|)$, where $|C|$ represents the number of clusters identified after branch unification. The complexity of node anonymization in each cluster is $O(|C_i| \cdot d)$, where $|C_i|$ denotes the number of nodes in the cluster and $d$ represents the average degree difference between nodes requiring modification to achieve α-partial isomorphism with the seed node.
In summary, the time complexity of TCα-PIA is $O(n^2 + |C| \cdot |C_i| \cdot d)$.
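The simplification of the clustering cost can be checked term by term. The derivation below is a sketch that only assumes $l \le n$ and $m \le n$. Its last line additionally assumes that the clusters partition the $n$ nodes, which is an assumption made here for illustration and is not stated in the analysis.

```latex
\begin{align*}
T_{\text{cluster}} &= O(n^2 + n + l + m \cdot n) \\
  &\subseteq O(n^2 + n + n + n \cdot n) && \text{using } l \le n,\ m \le n \\
  &= O(n^2), \\
T_{\text{total}} &= O\left(n^2 + |C| \cdot |C_i| \cdot d\right) \\
  &= O\left(n^2 + n \cdot d\right) && \text{if } \textstyle\sum_{i=1}^{|C|} |C_i| = n .
\end{align*}
```

Under these assumptions the similarity calculation dominates whenever $d$ grows no faster than $n$.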

4.4. Privacy Analysis

In this section, we demonstrate the effectiveness of TC α -PIA for anonymizing the social network graph structure while satisfying users’ different privacy requirements. The anonymized graph obtained by the TC α -PIA scheme can resist attacks from adversaries with different background knowledge.
Theorem 1.
For the complete or partial 1-neighborhood graph attack of any target node, the probability that the attacker can re-identify the target node’s identity does not exceed 1/k.
Proof of Theorem 1.
Assume there exist two kinds of attacks according to the capability of the attacker: (1) The attacker has partial knowledge of the 1-neighbor subgraph of the target node. (2) The attacker has complete knowledge of the 1-neighbor subgraph of the target node.
Partial protection analysis. The attacker knows only part of the 1-neighbor subgraph structure of the target node. In this scenario, the attacker’s matching results may be biased and cannot be guaranteed to fully match the corresponding node clusters. In addition, it can be guaranteed that at least $k-1$ other nodes have a 1-neighbor subgraph identical to that of the target node. Therefore, the probability that the attacker can uniquely identify the target node is not higher than $1/k$.
Complete protection analysis. The attacker knows the complete 1-neighbor subgraph structure of the target node. Since our scheme allows subgraphs to be isomorphic, at least $k-1$ other nodes will have the same 1-neighborhood subgraph as the target node. Therefore, TCα-PIA satisfies the anonymity requirement; i.e., the probability that the attacker can uniquely identify the target node is no higher than $1/k$. □
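The bound in Theorem 1 can be spot-checked on an anonymized graph with a simple necessary test. The sketch below is illustrative only and uses hypothetical names: it groups nodes by an isomorphism-invariant signature of their 1-neighborhood (degree, number of edges inside the neighborhood, and the sorted inner degree sequence). Nodes with isomorphic 1-neighborhoods always share a signature, but the converse does not hold, so a signature group smaller than $k$ proves a violation, while passing the test does not prove $k$-anonymity. Self-loops are assumed absent.

```python
from collections import Counter
from typing import Dict, Hashable, Set, Tuple

Graph = Dict[Hashable, Set[Hashable]]


def neighborhood_signature(g: Graph, v: Hashable) -> Tuple[int, int, Tuple[int, ...]]:
    """Isomorphism-invariant summary of the 1-neighborhood subgraph of v."""
    nodes = g[v] | {v}
    # Degree of every node inside the induced 1-neighborhood subgraph.
    inner_degrees = sorted(len(g[u] & nodes) for u in nodes)
    edge_count = sum(inner_degrees) // 2
    return (len(g[v]), edge_count, tuple(inner_degrees))


def smallest_signature_group(g: Graph) -> int:
    """Size of the smallest group of nodes that share a signature."""
    counts = Counter(neighborhood_signature(g, v) for v in g)
    return min(counts.values())


def violates_k_anonymity(g: Graph, k: int) -> bool:
    """True if some node is certainly distinguishable from k - 1 others."""
    return smallest_signature_group(g) < k
```

For example, on a 4-cycle every node has the same signature, so `violates_k_anonymity(cycle, 4)` is `False`, whereas on a 3-node path the middle node is unique and the check fails for any $k \ge 2$.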

5. Experimental Evaluations

In Section 5.1, we introduce the datasets used in the experiments. In Section 5.2, we describe the metrics used to evaluate the experimental results. In Section 5.3, we explore the performance of TC α -PIA and the comparative experiment.

5.1. Datasets

Four real datasets are used in our experiments, and their statistics are shown in Table 3.
Table 3. Real-world graph datasets.
  • The Facebook dataset [36] from SNAP, with 4039 nodes representing users and 88,234 edges representing relationships.
  • The Ca-CondMat dataset [36] on condensed matter physics, with 23,133 nodes representing authors and 186,936 edges representing co-authorship.
  • The email-Eu-core dataset [36], with 986 nodes representing users and 25,571 edges representing email communications.
  • The soc-wiki-Vote dataset [37], with 889 nodes representing Wikipedia users and 2916 edges representing voting interactions.

5.2. Utility Metrics

To evaluate the proposed TC α -PIA scheme, we used the following metrics:
  • Information Loss (IL). IL refers to the data difference between the modified graph and the original graph after modifications. Modifications include adding or deleting nodes and edges, and edge swapping, which can cause information loss in the original graph. The formula for calculating the information loss [13] is as follows.
    $$IL = \delta \cdot ND(G, G^*) + (1 - \delta) \cdot ED(G, G^*)$$
    where
    $$ND = \frac{1}{2} \cdot \left( \frac{|N(G) \cap N(G^*)|}{|N(G)|} + \frac{|N(G) \cap N(G^*)|}{|N(G^*)|} \right)$$
    $$ED = \frac{1}{2} \cdot \left( \frac{|E(G) \cap E(G^*)|}{|E(G)|} + \frac{|E(G) \cap E(G^*)|}{|E(G^*)|} \right)$$
    where $ED$ represents the information loss of edges, $ND$ represents the information loss of nodes, $E(G)$ represents the edges of the original graph, $E(G^*)$ represents the edges of the anonymized graph, and $N(G)$ and $N(G^*)$ represent the nodes in the original and anonymized graphs, respectively. $\delta$ represents the weight parameter, with a value range of [0,1]. The above formula can yield different results depending on how much the anonymity scheme emphasizes nodes versus edges in the graph. Because both terms measure the overlap between $G$ and $G^*$, a larger result indicates a higher degree of information preservation from the original graph.
  • Average Clustering Coefficient (ACC). ACC focuses on the closeness between nodes, that is, the number of connections between nodes. The change in ACC can reveal the degree of change in the connection relationship between nodes in the graph after anonymization.
    $$ACC = \frac{1}{|N|} \sum_{i=1}^{|N|} C_i$$
    where $|N|$ is the number of nodes in the graph and $C_i$ is the local clustering coefficient of node $i$.
  • Average Shortest Path Length (APL). The average shortest path length is the average length of the shortest path between any two nodes in the graph. By comparing the average shortest path length of the original and anonymous graphs, we can understand the influence of anonymization on graph connectivity.
    $$APL = \frac{\sum_{i \neq j} \mathrm{path}(v_i, v_j)}{|N| (|N| - 1)}$$
    where $\mathrm{path}(v_i, v_j)$ is the length of the shortest path between nodes $v_i$ and $v_j$.
  • Eigenvector Centrality (EC). Eigenvector centrality is used to measure the importance of nodes in a network structure. Analyzing the change in eigenvector centrality can provide insight into changes in the importance and influence of nodes caused by anonymization.
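The first three metrics can be computed directly from adjacency sets. The following is a minimal sketch using only the Python standard library, not the evaluation code behind the reported results. The function names are hypothetical. IL is read as the overlap-based measure described above (larger means more of the original graph is preserved), with the edge term normalized by edge counts, which is an assumption. APL assumes a connected graph, since unreachable pairs simply contribute nothing to the sum. EC is omitted for brevity.

```python
from collections import deque
from itertools import combinations
from typing import Dict, FrozenSet, Hashable, Set

Graph = Dict[Hashable, Set[Hashable]]


def edge_set(g: Graph) -> Set[FrozenSet[Hashable]]:
    """Undirected edges as unordered pairs."""
    return {frozenset((u, v)) for u in g for v in g[u]}


def overlap(a: set, b: set) -> float:
    """0.5 * (|a ∩ b| / |a| + |a ∩ b| / |b|); 0.0 if either set is empty."""
    if not a or not b:
        return 0.0
    common = len(a & b)
    return 0.5 * (common / len(a) + common / len(b))


def information_loss(g: Graph, g_star: Graph, delta: float = 0.5) -> float:
    """IL = delta * ND + (1 - delta) * ED, with ED normalized by edge counts."""
    nd = overlap(set(g), set(g_star))
    ed = overlap(edge_set(g), edge_set(g_star))
    return delta * nd + (1 - delta) * ed


def average_clustering(g: Graph) -> float:
    """ACC: mean local clustering coefficient (0 for nodes of degree < 2)."""
    total = 0.0
    for nbrs in g.values():
        k = len(nbrs)
        if k < 2:
            continue
        links = sum(1 for a, b in combinations(nbrs, 2) if b in g[a])
        total += 2 * links / (k * (k - 1))
    return total / len(g)


def average_path_length(g: Graph) -> float:
    """APL: sum of shortest-path lengths over ordered pairs / (n * (n - 1))."""
    n = len(g)
    total = 0
    for source in g:
        dist = {source: 0}
        queue = deque([source])
        while queue:  # breadth-first search from source
            u = queue.popleft()
            for w in g[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        total += sum(dist.values())
    return total / (n * (n - 1))
```

The change in ACC or APL reported in the experiments would then be the difference between the value on $G$ and the value on $G^*$.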

5.3. Comparison and Analysis of Experimental Results

Implementation. The experimental setup consists of a device manufactured by Lenovo located in Beijing, China, equipped with an AMD Ryzen 5 5600H processor with Radeon Graphics clocked at 3.30 GHz and 16 GB RAM. The operating system used is Windows 10 Home, and the programming language used for the implementation is Python 3.7.
Comparison. To demonstrate the effectiveness of the proposed TCα-PIA scheme, we performed the following experimental comparisons: (1) We compared TCα-PIA with GPPS [16], introduced in Section 1. For a fair comparison, we first applied the same α value to all users (uniform anonymity) and compared the results with those of GPPS. In addition, we compared the results obtained when different users require different α values (referred to as TCα-PIAsecond) with those of GPPS. Note that the results of GPPS are averaged over several experiments under non-isomorphic conditions, while the TCα-PIAsecond results are the average of several experiments with different α values required by different users. (2) To validate the effectiveness of tree clustering based on the combined criterion, we compared it with traditional clustering based on Euclidean similarity within the same anonymity scheme.
Experimental settings. We set $k = 5, 10, 15, 20, 25$. For TCα-PIA, we set three α values: $\alpha_1 = 100\%$, $\alpha_2 = 50\%$, and $\alpha_3 = 10\%$.

5.3.1. Comparison of the Overall Performance of Schemes

This experiment was performed on both complete and random graphs. For random graphs, we followed the method in reference [13] and randomly extracted 500 to 2000 nodes and their edges from the larger Ca-CondMat dataset to construct four random graph datasets for further experimental comparison with the GPPS scheme [16]. These datasets are denoted as CA-CM1, CA-CM2, CA-CM3, and CA-CM4, respectively, and contain 500, 1000, 1500, and 2000 nodes and their corresponding edges.
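The construction of the random graph datasets can be sketched as follows. This is an illustration under assumptions, not the sampling procedure of reference [13]: it reads "nodes and their edges" as the induced subgraph on a uniformly sampled node set, assumes sortable node labels (e.g., integers), and uses a hypothetical function name and seed parameter.

```python
import random
from typing import Dict, Optional, Set

Graph = Dict[int, Set[int]]


def sample_induced_subgraph(g: Graph, size: int, seed: Optional[int] = None) -> Graph:
    """Sample `size` nodes uniformly and keep only edges between sampled nodes."""
    rng = random.Random(seed)
    # sorted() gives a reproducible population order for a fixed seed.
    chosen = set(rng.sample(sorted(g), size))
    return {v: g[v] & chosen for v in chosen}
```

Calling `sample_induced_subgraph(g, 500, seed=0)` on the full Ca-CondMat graph would then give a dataset in the style of CA-CM1. Isolated nodes can appear in the sample, and whether they are kept is a design choice the text leaves open.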
Complete graphs
  • Comparison of IL.
Figure 7a shows the effect of different schemes on the IL in the soc-wiki-Vote dataset. It is evident that as α decreases, the IL of TC α -PIA increases, indicating fewer modifications to the original graph and better preservation of the original data. In addition, for the same α , IL decreases as k increases, suggesting that higher privacy levels result in lower utility of the graph. The lower IL at k = 15 compared to k = 20 is due to dataset inhomogeneities causing experimental fluctuations. Compared to GPPS, TC α -PIA( α 1 ), TC α -PIA( α 2 ), and TC α -PIA( α 3 ) perform better, as the proposed scheme effectively limits the addition of fake nodes and edges, reducing their impact on the original graph. Unlike TC α -PIA, TC α -PIAsecond considers the different α requirements of all users. Since not all users require weak privacy, TC α -PIAsecond results in slightly lower IL compared to TC α -PIA( α 3 ). However, it outperforms TC α -PIA( α 2 ), TC α -PIA( α 1 ), and GPPS by considering different privacy requirements, rather than applying a uniform α . This reduces unnecessary modifications and makes anonymization more efficient, preserving more of the original graph structure.
Figure 7. Comparison of IL on complete graphs.
Figure 7b,c show the results for the Email-Eu-core and Facebook datasets, respectively. In Figure 7b, TC α -PIA( α 3 ) significantly outperforms GPPS. For k < 20 , the IL of TC α -PIA( α 2 ) is comparable to that of GPPS, while TC α -PIA( α 1 ) performs slightly worse. The analysis for TC α -PIAsecond is consistent with the findings in Figure 7a.
Overall, TC α -PIA performs better on the soc-wiki-Vote dataset, and all four TC α -PIAs outperform GPPS with different parameters because they are better suited for uniformly distributed and moderately sized graph networks.
  • Change in ACC.
Figure 8a shows the effect of different schemes on ACC in the soc-wiki-Vote dataset. TC α -PIA consistently has a lower rate of change in ACC compared to GPPS, indicating better performance. This result is attributed to the fact that TC α -PIA considers the triangle structure and number during structural anonymization, which helps to preserve more original edges and structures. As k increases, the impact of each scheme on the ACC gradually increases, i.e., the difference between the anonymized graph and the original graph increases. This is because higher anonymity levels lead to more changes in the graph. In the TC α -PIA scheme, a smaller α results in less disruption to the original graph structure, as indicated by a lower effect on the ACC. The performance of TC α -PIAsecond follows a similar trend but outperforms TC α -PIA ( α 2 ), TC α -PIA ( α 1 ), and GPPS due to its more personalized approach to user privacy requirements.
Figure 8. Comparison of the change in ACC on complete graphs.
Figure 8b shows the experimental results for the Email-Eu-core dataset. Under the TC α -PIA scheme, the change in ACC decreases as α decreases, mirroring the trend observed for the soc-wiki-Vote dataset in Figure 8a. Therefore, no further details are needed. Compared to GPPS, TC α -PIA( α 3 ) consistently shows a lower change in ACC for all k values. At k = 15 , TC α -PIA( α 2 ) exhibits slightly higher fluctuations than GPPS, but remains lower for other k values. This fluctuation is due to the non-uniformity of the dataset, but overall TC α -PIA( α 2 ) outperforms GPPS. For k < 20 , TC α -PIA( α 1 ) shows a higher change in ACC compared to GPPS, due to the formation of additional triangles for a stable structure. However, for k > 20 , TC α -PIA( α 1 ) gradually outperforms GPPS.
Figure 8c compares the TC α -PIA and GPPS schemes on the Facebook dataset. The analysis is similar to that of Figure 8a,b.
  • Change in APL.
Figure 9a shows the APL changes for the soc-wiki-Vote dataset. For all k values, the APL changes for TCα-PIA($\alpha_1$), TCα-PIA($\alpha_2$), TCα-PIA($\alpha_3$), and TCα-PIAsecond are consistently lower than those for GPPS. This is because TCα-PIA considers edges that participate more frequently in the shortest paths during anonymization, thus preserving more connection paths between nodes in the original graph.
Figure 9. Comparison of the change in APL on complete graphs.
Figure 9b shows the APL changes for the Email-Eu-core dataset. TCα-PIA($\alpha_2$) and TCα-PIA($\alpha_3$) exhibit significantly lower APL changes compared to GPPS. For $k < 15$, the results for TCα-PIA($\alpha_1$) are similar to those of GPPS. For $k \geq 15$, the APL change for TCα-PIA($\alpha_1$) is slightly higher than that of GPPS. Additionally, as α decreases, the APL impact of TCα-PIA decreases accordingly.
Figure 9c shows the APL changes for the Facebook dataset. The results are similar to those in Figure 9a,b, so no further details are provided.
  • Error rate of EC.
Figure 10a shows the error rate of EC on the soc-wiki-Vote dataset. All three TC α -PIA schemes outperform GPPS on this metric, as TC α -PIA better preserves graph structural features. Additionally, as α decreases, the EC error rate for TC α -PIA decreases significantly.
Figure 10. Comparison of the error rate of EC on complete graphs.
Figure 10b shows the error rate of EC on the Email-Eu-core dataset. TC α -PIA( α 3 ) has a lower error rate than GPPS. For k < 15 , TC α -PIA( α 2 ) shows a slightly higher error rate than GPPS, but for k > 15 , it performs better. This is because TC α -PIA uses triangle-structure-based similarity to minimize the impact on original nodes and controls the addition of fake nodes and edges, preserving more of the original graph’s structure. TC α -PIA( α 1 ) performs slightly worse than GPPS due to additional matrix elements that increase graph modifications.
Figure 10c shows the error rate of EC on the Facebook dataset. The experimental results are similar to those shown in Figure 10a,b, and no further elaboration is needed.
To clarify the impact of controlling the addition of fake nodes on the graph, we compared two approaches based on this metric. The first approach is the original TC α -PIAsecond, which merges and removes nodes after adding fake nodes. The second approach, referred to as TC α -PIAthird, does not apply any post-processing after adding fake nodes and does not control their number. Figure 11 shows the experimental comparison results of the two approaches on three datasets, showing that TC α -PIAsecond consistently outperforms TC α -PIAthird. This suggests that controlling for the addition of fake nodes effectively reduces the impact on the original graph.
Figure 11. Comparison of fake node schemes on the error rate of EC.
Random graphs
  • Comparison of IL.
Figure 12 shows the effect of TC α -PIA and GPPS on IL in four random graph datasets. TC α -PIA consistently outperforms GPPS for all values of α and k. TC α -PIA performs better as α decreases. As k increases, IL decreases for all methods, reflecting more graph modifications to meet stricter privacy requirements. Results from CA-CM2 show worse performance than CA-CM1, indicating that larger datasets require more modifications. While the results fluctuate due to dataset randomness, the overall trend is downward.
Figure 12. Comparison of IL on random graphs.
  • Change in ACC and APL.
Figure 13 and Figure 14 show the effect of TC α -PIA and GPPS on changes in ACC and APL across random graph datasets. In almost all cases, TC α -PIA results in smaller changes in ACC and APL compared to GPPS. This is because TC α -PIA focuses on triangle structures during anonymization, minimizing the addition of edges.
Figure 13. Comparison of the change in ACC on random graphs.
Figure 14. Comparison of the change in APL on random graphs.
  • Error rate of EC.
Figure 15 shows the effect of different schemes on the EC error rate in four random graph datasets. The trend is similar to that in Figure 10, with analogous analysis and underlying reasons.
Figure 15. Comparison of the error rate of EC on random graphs.

5.3.2. Impact of Different Clustering Algorithms on Scheme Performance

To compare the tree clustering algorithm based on the new combined criterion proposed in this paper with the traditional Euclidean similarity clustering algorithm, we applied the same α-partial isomorphism anonymization scheme after each clustering method and examined the effect on the final experimental results. Figure 16a shows the IL of TCα-PIA, which is based on tree clustering, and of ECα-PIA, the traditional clustering scheme based on Euclidean similarity, on the soc-wiki-Vote dataset. For the same k and α, the IL of TCα-PIA is significantly higher than that of ECα-PIA. This indicates that the combined criterion in tree clustering takes more graph structural factors into account, which improves clustering accuracy and lays a foundation for reducing the modifications needed in the subsequent anonymization steps. The analysis for Figure 16b,c is similar to that for Figure 16a and is not repeated.
Figure 16. Comparison of anonymity schemes with different clustering algorithms.

6. Conclusions

This paper proposes TC α -PIA, a privacy protection scheme based on k-anonymity for undirected social network graphs. The main goal is to protect users’ personal information while satisfying different privacy requirements. TC α -PIA sets a threshold for structural similarity between 1-neighborhood subgraphs and anonymizes the graphs accordingly. TC α -PIA involves three main steps: First, it constructs a relationship tree based on node similarity calculation and clusters nodes into distinct groups. Second, to defend against differential attacks between clusters, TC α -PIA ensures that each cluster has a similar size by unifying the number of nodes across clusters. Finally, it performs α -partial isomorphism anonymization on the graph.
Experimental comparisons using different types and scales of datasets indicate the following: (1) TC α -PIA satisfies different privacy requirements while effectively preserving graph utility, achieving a balance between privacy and utility. (2) The TC α -PIA scheme is applicable to other networks that can be abstracted as consisting of nodes and edges, such as the email network dataset and the Wikipedia voting dataset used in our experiments, enabling targeted privacy protection based on user needs.
As social networks expand, anonymization algorithms face efficiency challenges. Future research will focus on optimizing these algorithms to meet the demands of large-scale, complex networks while developing real-time privacy protection schemes to address dynamic changes in user information. Additionally, we will consider a broader range of user privacy requirements and integrate and enhance different privacy mechanisms to propose more comprehensive personalized privacy protection schemes.

Author Contributions

Conceptualization, M.Z. and Y.H.; methodology, M.Z.; software, M.Z.; validation, M.Z. and Y.H.; investigation, M.Z. and P.L.; resources, L.C.; data curation, M.Z.; writing—original draft preparation, M.Z.; writing—review and editing, M.Z., L.L. and Y.H.; project administration, L.L.; funding acquisition, L.C. All authors have read and agreed to the published version of the manuscript.

Funding

This work was funded by the National Natural Science Foundation of China (Nos. U22A2099, 62462019, 62172350), the Open Project Program of Guangxi Key Laboratory of Digital Infrastructure (No. GXDIOP2024019), the Guangdong Basic and Applied Basic Research Foundation (No. 2023A1515012846), the Key Research and Development Program of Guangxi (Nos. AB24010085, AB23026120), and the Natural Science Foundation of Guangxi Province (No. 2021GXNSFBA196054).

Data Availability Statement

The real datasets used in the paper were downloaded from https://snap.stanford.edu/data/ (accessed on 12 June 2024) and https://networkrepository.com/index.php (accessed on 28 June 2024), respectively.

Acknowledgments

Thanks to all the team members who contributed to this work.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Siddula, M.; Li, Y.; Cheng, X.; Tian, Z.; Cai, Z. Anonymization in Online Social Networks Based on Enhanced Equi-Cardinal Clustering. IEEE Trans. Comput. Soc. Syst. 2019, 6, 809–820.
  2. Abawajy, J.H.; Ninggal, M.I.H.; Herawan, T. Privacy Preserving Social Network Data Publication. IEEE Commun. Surv. Tutor. 2016, 18, 1974–1997.
  3. Gangarde, R.; Sharma, A.; Pawar, A.; Joshi, R.; Gonge, S. Privacy Preservation in Online Social Networks Using Multiple-Graph-Properties-Based Clustering to Ensure k-Anonymity, l-Diversity, and t-Closeness. Electronics 2021, 10, 2877.
  4. Mauw, S.; Ramírez-Cruz, Y.; Trujillo-Rasua, R. Preventing active re-identification attacks on social graphs via sybil subgraph obfuscation. Knowl. Inf. Syst. 2022, 64, 1077–1100.
  5. Shakeel, S.; Anjum, A.; Asheralieva, A.; Alam, M. k-NDDP: An Efficient Anonymization Model for Social Network Data Release. Electronics 2021, 10, 2440.
  6. Zheng, Y.; Lu, R.; Zhang, S.; Guan, Y.; Wang, F.; Shao, J.; Zhu, H. PRkNN: Efficient and Privacy-Preserving Reverse kNN Query Over Encrypted Data. IEEE Trans. Dependable Secur. Comput. 2023, 20, 4387–4402.
  7. Wu, A.; Luo, W.; Weng, J.; Yang, A.; Wen, J. Fuzzy Identity-Based Matchmaking Encryption and Its Application. IEEE Trans. Inf. Forensics Secur. 2023, 18, 5592–5607.
  8. Yu, L.; Nan, X.; Niu, S. A Privacy-Preserving Friend Matching Scheme Based on Attribute Encryption in Mobile Social Networks. Electronics 2024, 13, 2175.
  9. Jiang, L.; Yan, Y.; Tian, Z.; Xiong, Z.; Han, Q. Personalized sampling graph collection with local differential privacy for link prediction. World Wide Web 2023, 26, 2669–2689.
  10. Hou, L.; Ni, W.; Zhang, S.; Fu, N.; Zhang, D. PPDU: Dynamic graph publication with local differential privacy. Knowl. Inf. Syst. 2023, 65, 2965–2989.
  11. Huang, H.; Zhang, D.; Xiao, F.; Wang, K.; Gu, J.; Wang, R. Privacy-Preserving Approach PBCN in Social Network with Differential Privacy. IEEE Trans. Netw. Serv. Manag. 2020, 17, 931–945.
  12. Zhu, L.; Lei, T.; Mu, J.; Mu, J.; Cai, Z.; Zhang, J. Differential Privacy-Based Spatial-Temporal Trajectory Clustering Scheme for LBSNs. Electronics 2023, 12, 3767.
  13. Ding, X.; Wang, C.; Choo, K.K.R.; Jin, H. A Novel Privacy Preserving Framework for Large Scale Graph Data Publishing. IEEE Trans. Knowl. Data Eng. 2019, 33, 331–343.
  14. Yazdanjue, N.; Yazdanjouei, H.; Karimianghadim, R.; Gandomi, A. An enhanced discrete particle swarm optimization for structural k-Anonymity in social networks. Inf. Sci. 2024, 670, 120631.
  15. Wang, Z.; Liu, T.; Wang, Y.; Bao, X.; Xu, X.; Huang, X.; Cheng, B. Graph-Clustering Anonymity Privacy Protection Algorithm with Fused Distance-Attributes. J. Phys. Conf. Ser. 2023, 2504, 012058.
  16. Zhang, H.; Lin, L.; Xu, L.; Wang, X. Graph partition based privacy-preserving scheme in social networks. J. Netw. Comput. Appl. 2021, 195, 103214.
  17. Sweeney, L. k-ANONYMITY: A MODEL FOR PROTECTING PRIVACY. Int. J. Uncertain. Fuzziness Knowl.-Based Syst. 2002, 10, 557–570.
  18. Cunha, M.; Mendes, R.; Vilela, J.P. A survey of privacy-preserving mechanisms for heterogeneous data types. Comput. Sci. Rev. 2021, 41, 100403.
  19. Lu, X.; Song, Y.; Bressan, S. Fast Identity Anonymization on Graphs. In International Conference on Database and Expert Systems Applications; Liddle, S.W., Schewe, K.D., Tjoa, A.M., Zhou, X., Eds.; Springer: Berlin/Heidelberg, Germany, 2012; pp. 281–295.
  20. Hartung, S.; Hoffmann, C.; Nichterlein, A. Improved Upper and Lower Bound Heuristics for Degree Anonymization in Social Networks. In Experimental Algorithms; Gudmundsson, J., Katajainen, J., Eds.; Springer: Cham, Switzerland, 2012; pp. 376–387.
  21. Casas-Roma, J.; Herrera-Joancomartí, J.; Torra, V. k-Degree anonymity and edge selection: Improving data utility in large networks. Knowl. Inf. Syst. 2017, 50, 447–474.
  22. Sharma, A.; Pathak, S. Enhancement of k-anonymity algorithm for privacy preservation in social media. Int. J. Eng. Technol. (UAE) 2018, 7, 40–45.
  23. Kiabod, M.; Dehkordi, M.N.; Barekatain, B. TSRAM: A time-saving k-degree anonymization method in social network. Expert Syst. Appl. 2019, 125, 378–396.
  24. Kiabod, M.; Dehkordi, M.N.; Barekatain, B. A fast graph modification method for social network anonymization. Expert Syst. Appl. 2021, 180, 115148.
  25. Xiang, N.; Ma, X. TKDA: An Improved Method for K-degree Anonymity in Social Graphs. In Proceedings of the 2022 IEEE Symposium on Computers and Communications (ISCC), Rhodes, Greece, 30 June–3 July 2022; pp. 1–6.
  26. Yu, G. A modified firefly algorithm based on neighborhood search. Concurr. Comput. Pract. Exp. 2021, 33, e6066.
  27. Ji, S.; Mittal, P.; Beyah, R. Graph Data Anonymization, De-Anonymization Attacks, and De-Anonymizability Quantification: A Survey. IEEE Commun. Surv. Tutor. 2017, 19, 1305–1326.
  28. Zhou, B.; Pei, J. Preserving Privacy in Social Networks Against Neighborhood Attacks. In Proceedings of the 2008 IEEE 24th International Conference on Data Engineering, Cancun, Mexico, 7–12 April 2008; pp. 506–515.
  29. Ren, W.; Ghazinour, K.; Lian, X. kt-Safety: Graph Release via k-Anonymity and t-Closeness. IEEE Trans. Knowl. Data Eng. 2023, 35, 9102–9113.
  30. Tripathy, B.K.; Panda, G.K. A New Approach to Manage Security against Neighborhood Attacks in Social Networks. In Proceedings of the IEEE 2010 International Conference on Advances in Social Networks Analysis and Mining, Odense, Denmark, 9–11 August 2010; pp. 264–269.
  31. Zou, L.; Chen, L.; Özsu, M.T. K-Automorphism: A General Framework for Privacy Preserving Network Publication. VLDB Endow. 2009, 2, 946–957.
  32. Yang, J.; Wang, B.; Yang, X.; Zhang, H.; Xiang, G. A secure K-automorphism privacy preserving approach with high data utility in social networks. Secur. Commun. Netw. 2014, 7, 1399–1411.
  33. Cheng, J.; Fu, A.W.; Liu, J. K-Isomorphism: Privacy Preserving Network Publication against Structural Attacks. In Proceedings of the 2010 ACM SIGMOD International Conference on Management of Data, Indianapolis, IN, USA, 6–10 June 2010; Association for Computing Machinery: New York, NY, USA, 2010; pp. 459–470.
  34. Rong, H.; Ma, T.; Tang, M.; Cao, J. A novel subgraph K+-isomorphism method in social network based on graph similarity detection. Soft Comput. 2018, 22, 2583–2601.
  35. Ó Conghaile, A. Cohomology in Constraint Satisfaction and Structure Isomorphism. In Proceedings of the 47th International Symposium on Mathematical Foundations of Computer Science (MFCS 2022), Vienna, Austria, 22–26 August 2022; Schloss Dagstuhl–Leibniz-Zentrum für Informatik: Dagstuhl, Germany, 2022; pp. 75:1–75:16.
  36. Traud, A.L.; Mucha, P.J.; Porter, M.A. Social structure of facebook networks. Phys. A Stat. Mech. Its Appl. 2012, 391, 4165–4180.
  37. Rossi, R.A.; Ahmed, N.K. The Network Data Repository with Interactive Graph Analytics and Visualization. In Proceedings of the AAAI Conference on Artificial Intelligence, Austin, TX, USA, 25–30 January 2015; pp. 4292–4293.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
