1. Introduction
Graph-structured data [1,2,3] underpins a wide range of real-world applications, including social networks [4], biological interaction networks [5], and recommender systems [6]. Link prediction [7,8,9], a fundamental task in graph learning, aims to infer potential or missing links from existing topology and node features. It plays a crucial role in downstream applications. For example, in co-authorship networks [10,11], it enables the discovery of potential collaborators; in recommender systems [12], it identifies latent associations between users and items; in bioinformatics [13], it helps to predict protein–protein interactions; in knowledge graphs [14], it uncovers new relations; and, in cybersecurity [15], it aids in detecting suspicious connections. The development of graph neural networks (GNNs) [16,17], which jointly model graph structure and node attributes, has significantly enhanced link prediction performance across these domains.
Although GNNs have achieved significant success in link prediction, their effectiveness largely relies on access to large-scale and high-quality training data. However, recent studies [18,19,20] have shown that GNN models may exhibit serious vulnerabilities when training data is intentionally manipulated or poisoned, which has raised growing concerns about their security in real-world deployments. Among these threats, backdoor attacks [21,22] pose a serious risk by enabling adversaries to covertly manipulate model behavior through the injection of specially crafted triggers during training. Once trained, the backdoored model behaves normally on clean inputs but consistently produces attacker-specified outputs when the triggers are present.
Figure 1 illustrates an example of such an attack targeting link prediction. In this context, the attacker may embed covert patterns into the training graph by injecting triggers—such as synthetic nodes and crafted features—so that the backdoored model consistently predicts the existence of links for specific node pairs during inference. For example, in recommender systems, malicious users can forge false interactions with selected items to ensure that these items are always recommended by the system—regardless of actual user preferences—thereby achieving targeted exposure or economic benefits. Given the widespread adoption of GNN-based link prediction models in sensitive fields such as recommender systems, biological research, and network security, it is crucial to understand their vulnerability to backdoor attacks. Studying such attacks not only reveals fundamental flaws in the current learning paradigm but also provides important insights for designing more robust and trustworthy graph learning systems.
Despite recent advances, research on backdoor attacks specifically targeting GNN-based link prediction remains limited. To date, only two prior works have directly addressed this problem in static graphs. Link-Backdoor [23] initiated this line of research by optimizing complex subgraph triggers via gradient-based methods. While effective on small graphs, it incurs high computational overhead and struggles to scale, resulting in limited attack success and degraded performance on large datasets. More recently, Dai et al. [24] proposed a globally shared single-node trigger whose features are crafted from globally rare attributes; all poisoned node pairs are connected to this centralized trigger. However, this “one-to-many” design creates a structural vulnerability: the trigger node becomes a high-degree outlier, making it easily identifiable through its abnormal connectivity. Moreover, the injected features introduce local heterophily, further amplifying its detectability. As a result, such triggers are susceptible to removal by basic graph anomaly detection techniques, undermining the effectiveness of the attack.
To overcome these limitations, we propose a homophily-driven backdoor attack framework tailored for link prediction. The core idea is to craft a trigger that exploits GNNs’ inherent preference for homophily. For each targeted node pair (u, v), we construct a unique path trigger by injecting a bridge node b that serves as a common neighbor, thereby introducing structural homophily. The term “unique” here refers not only to the topological independence of each trigger but also to its adaptively synthesized features. The feature vector of b is not statically defined. Instead, it is dynamically generated through a context-aware probabilistic sampling process. We first analyze the joint neighborhood of (u, v) to obtain the occurrence frequency of each feature dimension, then construct a probability distribution accordingly. Sampling from this distribution yields a feature vector that closely aligns with the local context, ensuring semantic homophily. This design ensures that the trigger conforms to both structural and feature-level patterns, thereby making it more plausible, stealthy, and effective in misleading the model.
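A minimal sketch of this sampling process is given below; the function name, the NumPy representation of a binary feature matrix, and the default floor value are illustrative rather than a verbatim excerpt of our implementation:

```python
import numpy as np

def synthesize_trigger_features(X, neighbors_u, neighbors_v, p_min=0.3, rng=None):
    """Sample a multi-hot feature vector for the bridge node from the
    joint neighborhood of a target pair (u, v).

    X: (N, d) binary node-feature matrix; neighbors_u / neighbors_v:
    index arrays of the two endpoints' neighborhoods; p_min: probability
    floor for features present in the context.
    """
    if rng is None:
        rng = np.random.default_rng()
    joint = np.unique(np.concatenate([neighbors_u, neighbors_v]))
    freq = X[joint].mean(axis=0)  # per-dimension occurrence frequency in [0, 1]
    # Features observed in the joint neighborhood are sampled with
    # probability max(frequency, p_min); unobserved features never are.
    probs = np.where(freq > 0, np.maximum(freq, p_min), 0.0)
    return (rng.random(X.shape[1]) < probs).astype(X.dtype)
```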
To further improve attack efficiency and reduce trigger redundancy, we propose an intelligent trigger injection position selection strategy based on model confidence. Specifically, we leverage a proxy model trained on clean data to estimate the link existence probabilities for all candidate negative edges. Node pairs with the lowest predicted confidence scores (i.e., those the model is most certain are not connected) are selected as attack targets. By injecting triggers into these high-certainty negative samples, the model is forced to override its prior beliefs and internalize a strong generalizable backdoor rule. This confidence-based strategy improves attack effectiveness while reducing the poisoning budget by targeting representative and model-contradictory samples.
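The selection step admits an equally compact sketch, assuming a PyG-style proxy model exposing an encode method and an inner-product decoder (names here are illustrative):

```python
import torch

@torch.no_grad()
def select_target_pairs(proxy_model, data, candidate_pairs, budget):
    """Return the `budget` candidate negative edges (an (M, 2) tensor) that
    the clean proxy model is most confident do NOT exist."""
    z = proxy_model.encode(data.x, data.edge_index)   # embeddings on the clean graph
    u, v = candidate_pairs[:, 0], candidate_pairs[:, 1]
    scores = torch.sigmoid((z[u] * z[v]).sum(dim=1))  # inner-product link scores
    order = torch.argsort(scores)                     # ascending: least likely first
    return candidate_pairs[order[:budget]]
```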
Overall, the novelty of this work lies in proposing a homophily-guided backdoor attack framework for GNN-based link prediction, where context-aware path triggers and confidence-driven injection are combined to achieve both high stealthiness and strong attack effectiveness, even on large-scale graphs. The main contributions of this paper can be summarized as follows:
We propose a novel backdoor attack framework for GNN-based link prediction, theoretically grounded in the principle of homophily, offering a new perspective on attack design.
We design a context-aware path trigger whose features are synthesized via probabilistic sampling of the local joint neighborhood. This approach adheres to graph homophily, significantly enhancing stealth while maintaining attack effectiveness.
We introduce a confidence-based trigger injection location selection strategy that identifies the most cognitively representative attack samples. By selecting links that the model is highly confident do not exist, our method encourages the model to override its learned priors and improves both attack efficiency and effectiveness.
Our overall framework is lightweight, making it applicable in realistic black-box settings. Experiments on five benchmark datasets demonstrate that it outperforms the baselines in ASR while maintaining low BPD.
The remainder of this paper is organized as follows. Section 2 reviews the related work on GNN-based link prediction and backdoor attacks. Section 3 provides the preliminaries and formulates the problem of backdoor attacks on link prediction. Section 4 details our proposed homophily-based backdoor attack framework. Section 5 presents our experimental setup, comprehensive results, and analysis. Finally, Section 6 concludes the paper and discusses potential future work.
5. Experiments
5.1. Experimental Settings
In this section, we detail our experimental settings, including the datasets, target models, evaluation metrics, baselines, and parameter configurations.
5.1.1. Datasets
Our method is evaluated on five widely adopted benchmark datasets—Cora, Citeseer, Pubmed, CS, and Physics—with Physics being a large-scale graph dataset. Table 3 summarizes the key statistics of these datasets.
5.1.2. GNN-Based Link Prediction Models
To assess the effectiveness of our attack across different architectures, we target four popular GNN-based models for link prediction:
GAE [25]: A foundational model that uses a GCN encoder to learn node embeddings and an inner-product decoder to reconstruct the adjacency matrix (formalized after this list).
VGAE [25]: A variational extension of GAE, where the encoder learns a probabilistic distribution for each node’s embedding, enhancing robustness.
ARGA [26]: An adversarially regularized version of GAE that replaces the standard regularizer with a discriminator to better shape the embedding distribution.
ARVGA [26]: A model that combines the variational aspects of VGAE with the adversarial training of ARGA.
5.1.3. Evaluation Metrics
We use two primary metrics to evaluate the effectiveness and evasiveness of the backdoor attacks, consistent with prior work [23,24]:
ASR: This metric measures the attack’s effectiveness. It is the ratio of triggered non-existent links that are successfully misclassified as “existent” by the backdoored model. A higher ASR indicates a more effective attack.
BPD: This metric measures the attack’s evasiveness. It is defined as the difference in the Area Under Curve (AUC) score between a clean model and the backdoored model, both evaluated on a benign test set. A lower BPD indicates a stealthier attack, as it has less impact on the model’s normal functionality.
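As a concrete reading of these two definitions, the following sketch computes both metrics from model scores; the helper names and the 0.5 decision threshold for ASR are our illustrative assumptions:

```python
import numpy as np
from sklearn.metrics import roc_auc_score

def asr(triggered_scores, threshold=0.5):
    # Fraction of triggered non-existent links classified as existent.
    return float(np.mean(np.asarray(triggered_scores) > threshold))

def bpd(labels, clean_scores, backdoored_scores):
    # AUC of the clean model minus AUC of the backdoored model,
    # both evaluated on the same benign test set.
    return roc_auc_score(labels, clean_scores) - roc_auc_score(labels, backdoored_scores)
```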
5.1.4. Baselines
We compare our proposed method against the two state-of-the-art backdoor attacks on link prediction discussed in our related work, as well as two classic backdoor attacks originally designed for graph classification:
Link-Backdoor (LB) [23]: A powerful attack that uses gradient information to optimize a subgraph trigger. We compare against the results reported in their paper.
Single Node (SN) [24]: A gradient-free attack that uses a globally shared single node with rare features as a trigger. We re-implement this method.
Erdős–Rényi Backdoor (ERB) [33]: An attack that uses a randomly generated Erdős–Rényi graph as a trigger.
Graph Trojaning Attack (GTA) [34]: A generative attack that uses an optimization algorithm to craft a trigger.
For ERB and GTA, which were originally designed for graph classification, we adapt them to the link prediction task as described in [23]: the generated trigger subgraph is modified to connect to and include the two endpoint nodes of the target non-existent link.
5.1.5. Parameter Settings
We split the edges of each dataset into 85% for training, 5% for validation, and 10% for testing. The validation and test edges are entirely unseen during training, which is the standard setting for link prediction tasks, ensuring that the evaluation reflects an inductive scenario. The GNN encoders in all models consist of a two-layer GCN with a 128-dimensional hidden layer and a 64-dimensional output embedding layer, as sketched below. During training, we employ an early stopping criterion: training is halted if the validation loss does not decrease for 20 consecutive epochs. All models are trained for 300 epochs unless early stopping is triggered.
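The sketch below shows this encoder configuration in PyTorch Geometric style; the class name and the use of PyG’s GAE wrapper are illustrative of our setup rather than a verbatim excerpt:

```python
import torch
from torch_geometric.nn import GCNConv, GAE

class Encoder(torch.nn.Module):
    """Two-layer GCN encoder: 128-d hidden layer, 64-d output embeddings."""
    def __init__(self, in_dim, hid_dim=128, out_dim=64):
        super().__init__()
        self.conv1 = GCNConv(in_dim, hid_dim)
        self.conv2 = GCNConv(hid_dim, out_dim)

    def forward(self, x, edge_index):
        return self.conv2(self.conv1(x, edge_index).relu(), edge_index)

# e.g., model = GAE(Encoder(num_features)); VGAE/ARGA/ARVGA use their own wrappers.
```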
Table 4 reports the AUC scores of all models trained on the clean datasets. We use GAE as the proxy model when selecting target edges. For our attack, unless otherwise specified, we set the poisoning rate to 1% of the total number of existent links in the training set, and the minimum sampling probability for present features during trigger synthesis (i.e., the probability floor) is set to 0.3 by default. All experiments are repeated ten times with different random seeds, and we report the mean and standard deviation of ASR and BPD. For the baselines, we adhere to their default hyperparameter settings.
5.2. Overall Attack Performance
As shown in Table 5, our proposed backdoor attack consistently achieves the highest ASR across all datasets and model architectures while maintaining a low BPD. This demonstrates that our method is effective in implanting targeted backdoor behavior and can maintain the practicality of the model on clean data. The superior performance can be attributed to the design of our homophily-based path trigger, which integrates seamlessly into the graph structure and node feature space, making it inherently difficult to distinguish from benign patterns. Furthermore, the confidence-based trigger injection location selection strategy ensures that the poisoned links introduce strong yet localized supervisory signals, thereby maximizing the attack’s impact with minimal interference.
Compared to existing baselines, our method exhibits clear advantages. Gradient-based methods like LB tend to suffer from scalability issues and reduced effectiveness on large graphs due to their reliance on global optimization procedures. Random or generative approaches such as ERB and GTA show consistently lower ASR and are less reliable, as they were not originally designed as backdoor attack methods for link prediction tasks. Although SN performs competitively on small-scale datasets, its effectiveness degrades on larger graphs. This is primarily because its trigger, constructed from global feature frequency statistics, exhibits local heterophily with its neighboring nodes, thereby reducing its impact.
Notably, our method maintains strong performance even on large-scale datasets such as Physics, where several baselines experience significant degradation. This highlights the scalability and robustness of our local adaptive trigger mechanism, which remains effective across diverse graph sizes and densities.
In summary, the results confirm that our homophily-guided backdoor framework achieves an excellent balance between attack effectiveness and evasiveness and offers clear advantages over existing methods in terms of both generality and scalability.
5.3. Impact of Poisoning Rate
To further understand the robustness and controllability of our attack strategy, we evaluate its performance under varying poisoning rates. Specifically, we vary the proportion of poisoned training edges from 0.2% to 1.4% in increments of 0.2% and report the corresponding ASR and BPD values. The results are summarized in Figure 3.
We observe that the ASR increases rapidly at lower poisoning rates (e.g., from 0.2% to 0.6%), and then gradually saturates as the rate increases. This indicates that, owing to our confidence-based trigger injection location selection and consistency-aligned trigger design, strong attack effectiveness can be achieved without requiring excessive poisoning.
The BPD remains consistently low across all poisoning rates. Although it exhibits slight fluctuations due to randomness in the poisoned samples, no clear monotonic trend is observed. This confirms the stability and evasiveness of our method, even as the poisoning intensity varies.
5.4. Impact of Training Data Poisoning
To assess the benefit of poisoning the training data, we compare two settings: injecting the trigger during training versus only applying it at inference time. Since the latter does not modify the training data, we report only the ASR as the evaluation metric.
As shown in Figure 4, incorporating the trigger into the training set leads to a noticeable improvement in ASR across all datasets and models, with increases ranging from approximately 5% to 9%. This demonstrates that our homophily-aware trigger, when introduced during training, effectively guides the model to internalize the backdoor pattern, thereby enhancing the overall attack performance. Nonetheless, the relatively high ASR achieved even without poisoning underscores the robustness and transferability of our trigger design.
5.5. Ablation Studies
To validate the effectiveness of the key components in our proposed attack, we conduct two ablation studies on the Citeseer dataset. We investigate the contributions of our context-aware feature synthesis and our confidence-based target selection strategy, respectively. The results are presented in Table 6.
5.5.1. Impact of Feature Synthesis
To verify the importance of our context-aware, homophily-based feature synthesis, we create a variant, No Feature Synthesis, where the bridge node’s features are replaced with a randomly generated multi-hot vector. To ensure a fair comparison, the number of non-zero features in this random vector is set to the average number of non-zero features per node across the dataset. As shown in Table 6, removing our feature synthesis strategy leads to a precipitous drop in attack performance: the ASR decreases by an average of 10.4% across the four GNN models. This significant degradation confirms that synthesizing trigger features that are homophilous with the local environment is a critical factor for the attack’s success, as opposed to using generic, context-free random features.
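For clarity, the random trigger used by this variant can be generated as follows (a sketch; the helper name is ours):

```python
import numpy as np

def random_multihot(X, rng=None):
    # Random trigger features for the "No Feature Synthesis" variant:
    # the number of non-zero entries matches the dataset-wide average.
    if rng is None:
        rng = np.random.default_rng()
    k = int(round(float(X.sum(axis=1).mean())))   # average non-zeros per node
    vec = np.zeros(X.shape[1], dtype=X.dtype)
    vec[rng.choice(X.shape[1], size=k, replace=False)] = 1
    return vec
```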
5.5.2. Impact of Target Selection
To validate our intelligent trigger location selection strategy, we compare our complete method (Full Attack) against the No Confidence-based Selection variant, which uses the same homophilous trigger but injects it into randomly selected non-existent links. The results in Table 6 demonstrate that abandoning the lowest-confidence selection strategy leads to a consistent decrease in ASR across all models, with an average drop of 2.15%. While less dramatic than the feature ablation, this still shows that our principled approach of identifying and poisoning the most challenging negative examples is superior to a naive random strategy and contributes significantly to the attack’s high efficiency.
5.6. Parameter Sensitivity Analysis
In this section, we investigate the impact of a key hyperparameter in our feature synthesis process: the feature sampling probability floor. This parameter controls the minimum sampling probability for any feature present in the target pair’s joint neighborhood, thus directly influencing the density and strength of the trigger’s feature signal. A higher floor leads to a denser, and potentially stronger, trigger feature vector.
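Concretely, this floor (denoted p_min here) acts on the context frequencies as follows:

$$p_i = \begin{cases} \max(f_i,\, p_{\min}) & \text{if } f_i > 0, \\ 0 & \text{otherwise,} \end{cases}$$

where f_i is the occurrence frequency of feature dimension i in the joint neighborhood. Raising p_min lifts every rare context feature toward inclusion, which densifies the sampled trigger vector.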
To analyze its effect, we conduct an experiment on the Citeseer dataset using the GAE model, varying the floor from 0.1 to 0.4. The results are presented in Table 7.
As shown in Table 7, as the probability floor increases, the ASR exhibits a consistent upward trend, improving steadily from 91.50% at a floor of 0.1 to 93.68% at 0.4. This trend indicates that a denser trigger feature vector, which provides a stronger and less ambiguous signal by ensuring that more features from the local context are included, leads to a more effective and reliable backdoor. Notably, the rate of improvement diminishes once the floor surpasses 0.25, suggesting that, while a denser trigger is beneficial, there are diminishing returns once the signal becomes sufficiently potent. Meanwhile, the BPD remains consistently low and stable across all tested values, indicating that varying the trigger’s feature density in this range has a negligible impact on the model’s performance on clean data.
5.7. Generalization Across Architectures
To further evaluate the generality of our proposed attack beyond the two-layer GCN encoder analyzed theoretically, we conduct additional experiments on the Citeseer dataset using three alternative encoders: a two-layer GAT, a two-layer GraphSAGE, and a three-layer GCN. We apply our attack to the four target frameworks (GAE, VGAE, ARGA, and ARVGA) equipped with each encoder, under the same settings described in Section 5.1.5.
As shown in Table 8, our attack generalizes well across different GNN architectures. On the GAT encoder, the ASR tends to be higher than that on the two-layer GCN, suggesting that attention mechanisms can amplify the influence of our homophily-guided trigger. For GraphSAGE, the ASR is generally lower, which can be attributed to its neighborhood sampling that dilutes localized trigger effects. On the deeper three-layer GCN, the ASR is only slightly lower than that of the two-layer version, indicating that increased network depth does not substantially weaken the backdoor. Across all architectures, the BPD remains low and stable, confirming that the attack consistently preserves the model’s utility on clean data.
5.8. Defense
We evaluate two representative defenses against our attack. Cosine-based graph purification prunes a fraction of the edges with the lowest cosine similarity and adds high-similarity non-edges, aiming to purify the structure before training. Feature-noise defense, following prior work [23], perturbs a fraction of the node feature entries at inference time (bit-flips for binary attributes). Figure 5 reports the ASR under these settings across all datasets and models.
The purification strategy consistently raises ASR because its objective—enhancing homophily—is closely aligned with our backdoor design, making the injected path triggers easier to learn. By contrast, feature-noise defense produces a slight reduction in ASR, but the effect remains marginal since random perturbations rarely touch the injected bridge nodes and are quickly smoothed during message passing. Overall, our attack retains high effectiveness under both defenses.
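The pruning half of the purification defense can be sketched as follows; the prune_frac parameter is a placeholder for the fraction used in the experiments, and the symmetric edge-addition half is omitted for brevity:

```python
import torch
import torch.nn.functional as F

def cosine_purification(x, edge_index, prune_frac=0.1):
    # Keep only the (1 - prune_frac) fraction of edges whose endpoint
    # features are most similar; low-similarity edges are treated as suspect.
    u, v = edge_index
    sim = F.cosine_similarity(x[u], x[v], dim=1)
    n_keep = int((1.0 - prune_frac) * sim.numel())
    keep = sim.argsort(descending=True)[:n_keep]
    return edge_index[:, keep]
```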
5.9. Embedding Visualization
To further examine whether our attack leaves detectable traces in the latent space, we visualize node embeddings with t-SNE on the Citeseer dataset. As shown in Figure 6, the injected bridge nodes (red) do not form any separate cluster but are scattered across the embedding space. The attacked endpoints (blue) often appear close to these bridge nodes, reflecting the intended local homophily effect, yet they remain distributed across different clusters. Importantly, the overall embedding structure remains consistent with the benign case, making the backdoor visually indistinguishable and difficult to detect through clustering- or visualization-based defenses.
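The visualization follows the standard t-SNE recipe; a sketch is shown below (the plotting helper and color choices mirror Figure 6):

```python
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE

def plot_embeddings(z, bridge_idx, endpoint_idx):
    # Project the learned embeddings to 2-D; highlight injected bridge
    # nodes (red) and attacked endpoints (blue) against benign nodes.
    pts = TSNE(n_components=2, random_state=0).fit_transform(z)
    plt.scatter(pts[:, 0], pts[:, 1], s=4, c="lightgray", label="benign nodes")
    plt.scatter(pts[endpoint_idx, 0], pts[endpoint_idx, 1], s=10, c="blue", label="attacked endpoints")
    plt.scatter(pts[bridge_idx, 0], pts[bridge_idx, 1], s=10, c="red", label="bridge nodes")
    plt.legend()
    plt.show()
```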
6. Conclusions
In this paper, we presented a novel backdoor attack framework targeting GNN-based link prediction, addressing the key limitations of existing methods in terms of scalability, stealth, and attack effectiveness. Motivated by the principle of homophily, we designed a path-based trigger that introduces both topological and feature-level similarity between target node pairs. To further enhance attack efficiency and reduce poisoning overhead, we proposed a confidence-based trigger injection strategy that selects the most cognitively representative negative links for manipulation.
Extensive experiments on five benchmark datasets and four representative GNN architectures demonstrate that our method consistently achieves superior ASR while maintaining a low BPD. Ablation and sensitivity analyses further verify the robustness and effectiveness of our homophily-aware trigger design and confidence-guided injection mechanism.
In future work, we plan to explore three key directions. First, while our method focuses on path-based triggers, incorporating more complex homophilous motifs may lead to stronger or stealthier attack patterns. Second, we aim to extend our framework to dynamic and heterogeneous graphs, where structural evolution and multi-relational semantics introduce new challenges for both attack and defense. Third, we will investigate defense strategies specifically tailored to detect or mitigate homophily-guided triggers.
We believe this work provides new insights into the security vulnerabilities of GNN-based link prediction and lays a foundation for both adversarial research and robust graph learning in future applications.