1. Introduction
Anomaly detection is a critical task in numerous real-world applications, including identifying malicious behaviors in social networks [1,2], detecting financial fraud in transaction networks [3,4], and discovering anomalous patterns in industrial automation systems [5,6,7], among others. With the widespread adoption of graph-structured data across diverse domains, graph anomaly detection (GAD) has attracted increasing attention [8], since accurately identifying anomalies in graph-structured data is crucial for maintaining the integrity and security of complex systems [9,10].
While significant progress has been made in this field, existing methods still face notable challenges. Early research primarily relied on statistical and community/clustering-based techniques. For instance, OddBall [11] detects anomalies by analyzing node degrees, weight distributions, or local structural patterns (e.g., star/clique subgraphs). NetWalk [12] introduced a real-time anomaly detection framework for dynamic networks, which jointly learns low-dimensional node representations via clique embedding and deep autoencoders while dynamically updating cluster centers using streaming K-means for anomaly identification. However, these methods typically rely on local structural features, making them less effective at capturing complex relationships and global patterns in graphs. With the advancement of deep learning, graph neural networks (GNNs) have emerged as a powerful tool for GAD. CmaGraph [13] pioneered the integration of dynamic community evolution into anomaly detection: its three-module architecture (community detection, metric enhancement, and one-class anomaly scoring) reconstructs intra-/inter-community node distances, significantly improving sensitivity to structural anomalies. FAGAD [14] tackles the loss of anomaly signals caused by the low-pass filtering of GNNs, proposing a frequency-adaptive GNN that enhances high-frequency anomaly feature extraction via all-pass/low-pass/high-pass signal fusion and a self-supervised bootstrapping strategy, without requiring labeled data. G3AD [15] addresses the adverse effects of anomalies on GNNs by designing a dual-auxiliary encoder (decoupling attributes and topology) and an adaptive caching mechanism, which prevents the model from directly reconstructing anomaly-contaminated graph structures and thereby improves robustness against subtle structural deviations in unsupervised settings. However, most existing GNN-based methods focus solely on either node attributes [16,17] or topological structures, resulting in limited sensitivity to structural anomalies. This limitation becomes particularly evident when anomalies manifest as subtle structural deviations rather than distinct attribute differences. For example, as shown in Figure 1, banks have identified anomalous credit card transactions in which small purchases are immediately followed by large withdrawals across thousands of accounts. While individual accounts appear normal, fraudsters mask cash-out activities by coordinating transactions across controlled accounts, and such evasive patterns remain undetectable through neighborhood or attribute analysis alone. By encoding structural similarities, shared behaviors among fraudulent nodes can be highlighted, such as connections to 5–10 peripheral accounts or 90% of funds flowing through these nodes, revealing hidden relationships that expose entire fraud networks rather than isolated suspicious accounts.
Current methods exhibit three key limitations: (1) Over-reliance on reconstruction error: Many existing approaches, such as autoencoder-based methods [18,19], detect anomalies by measuring reconstruction errors on node attributes or graph structure. However, these methods often fail to capture discriminative features, limiting their ability to identify anomalies in complex graph structures. (2) Inadequate exploitation of structural similarity: Most current methods poorly exploit structural similarity between nodes. For instance, SVD [20] and eigenvalue decomposition [21] applied to adjacency or Laplacian matrices are sensitive to minor graph perturbations and struggle to distinguish nodes with subtle yet critical structural differences. Similarly, random walk-based methods such as DeepWalk [22] and Node2Vec [23] often fail to assign similar embeddings to nodes with identical structural roles but distant graph positions. (3) Disjoint handling of attributes and structure: Many methods focus exclusively on either node attributes [24,25] or graph topology, such as DBSCAN [26] and Struc2Vec [27], neglecting their synergistic integration. This separation reduces sensitivity to complex anomalies that exhibit deviations in both attributes and structure [28].
To address these limitations, we present DPGAD, a structure-aware dual-path attention method for graph node anomaly detection. Specifically, DPGAD employs a dual-path attention mechanism to jointly model attribute features and structural similarity features in a unified framework, while an adaptive gating mechanism dynamically balances the contributions of these features, enabling robust detection across diverse anomaly types (a minimal illustrative sketch of this gated fusion is given after the contribution list). Our main contributions are summarized as follows:
- (1) We propose a novel dual similarity attention mechanism that captures both attribute and structural similarities among graph nodes. This mechanism enhances the model's ability to detect anomalies, especially those defined by structural pattern deviations.
- (2) A learnable gating mechanism dynamically adjusts the fusion of attribute and structural features. This allows DPGAD to focus on task-critical patterns, enhancing robustness against diverse anomalies.
- (3) Extensive experiments on real-world datasets demonstrate that DPGAD outperforms state-of-the-art baselines, with notable accuracy gains on structure-sensitive anomalies.
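To make the fusion concrete, the following minimal PyTorch sketch shows how a learned gate can interpolate between an attribute-path embedding and a structural-similarity-path embedding. The module name, the sigmoid gate over concatenated embeddings, and the dimensions are illustrative assumptions, not the exact DPGAD implementation.

```python
import torch
import torch.nn as nn


class DualPathFusion(nn.Module):
    """Illustrative gated fusion of attribute and structural-similarity
    embeddings (a sketch of the idea, not the exact DPGAD architecture)."""

    def __init__(self, dim: int):
        super().__init__()
        # The gate is computed from both paths and squashed to (0, 1).
        self.gate = nn.Sequential(nn.Linear(2 * dim, dim), nn.Sigmoid())

    def forward(self, h_attr: torch.Tensor, h_struct: torch.Tensor) -> torch.Tensor:
        # g -> 1 favors the attribute path; g -> 0 favors the structural path.
        g = self.gate(torch.cat([h_attr, h_struct], dim=-1))
        return g * h_attr + (1.0 - g) * h_struct


if __name__ == "__main__":
    h_attr = torch.randn(8, 64)    # attribute-path embeddings for 8 nodes
    h_struct = torch.randn(8, 64)  # structural-similarity-path embeddings
    fused = DualPathFusion(64)(h_attr, h_struct)
    print(fused.shape)  # torch.Size([8, 64])
```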
4. Experiment
4.1. Experimental Environment
The experiments were run on Microsoft Windows with an Intel(R) Core(TM) i9-10900K CPU @ 3.70 GHz and an NVIDIA GeForce RTX 3090 GPU. All code is implemented in Python 3.11.11 with PyTorch 2.4.1, and PyTorch Geometric is used for the graph-related computation.
4.2. Datasets
This section describes the datasets used to evaluate DPGAD’s performance.
Table 1 summarizes their key statistics.
Weibo: Derived from user interactions on Tencent Weibo, this dataset includes location-based posting patterns and text features extracted via bag-of-words modeling. Users who publish pairs of posts within a short time window (e.g., 60 s) and engage in at least five such activities are labeled as suspicious; the others are considered normal.
Reddit: This dataset captures user–subreddit interactions on Reddit. Posts from users and subreddits are converted into feature vectors using Linguistic Inquiry and Word Count (LIWC) categories. Users banned from subreddits are flagged as anomalous.
Disney: Collected from Amazon’s movie co-purchase network, this dataset includes pricing, review counts, and ratings. Anomaly labels are determined via majority voting among high school students.
Books: Extracted from Amazon’s book co-purchase network, this dataset contains pricing, review volume, and ratings. Anomalies are labeled based on Amazon’s amazonfail tags.
Amazon: This dataset comprises instrument reviews from Amazon.com, designed to detect paid users posting fake reviews. It includes manually curated user features and behavioral statistics.
Tolokers: Sourced from Toloka’s crowdsourcing platform, this dataset tracks worker profiles and task performance. An edge connects two workers if they collaborate on the same task. The goal is to predict which workers are banned in a given project.
YelpChi: Built from Yelp.com reviews, this dataset targets anomalous reviews such as unfair promotions or defamatory comments. Suspicious reviews are flagged based on biased or malicious content.
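All of these benchmarks are attributed graphs with binary node-level anomaly labels. As a point of reference, the sketch below shows how such a dataset is commonly represented as a PyTorch Geometric Data object; the tensors are toy placeholders, not the actual datasets.

```python
import torch
from torch_geometric.data import Data

# Toy placeholder graph: 5 nodes, 8-dimensional attributes, 4 undirected edges.
x = torch.randn(5, 8)                                   # node attribute matrix
edge_index = torch.tensor([[0, 1, 1, 2, 2, 3, 3, 4],
                           [1, 0, 2, 1, 3, 2, 4, 3]])   # COO edge list (both directions)
y = torch.tensor([0, 0, 1, 0, 0])                       # 1 = anomalous node, 0 = normal

data = Data(x=x, edge_index=edge_index, y=y)
print(data)  # Data(x=[5, 8], edge_index=[2, 8], y=[5])
```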
4.3. Baseline
LOF [41]: Local Outlier Factor (LOF) quantifies node anomalies based on their isolation relative to neighboring nodes. LOF relies solely on node features, with neighborhoods selected via k-nearest neighbors (KNN).
DIF [42]: Deep Isolation Forest (DIF) employs a novel representation scheme, where randomly initialized neural networks project raw data into randomized embeddings.
MLPAE [24]: MLPAE adopts a Multilayer Perceptron (MLP) as both encoder and decoder to reconstruct node features, with the reconstruction loss serving as the anomaly score for each node.
GCN: The Graph Convolutional Network (GCN) is the most representative graph convolution-based embedding method, which obtains node embeddings by aggregating the features of neighboring nodes.
GraphSAGE [35]: GraphSAGE learns diverse aggregation functions, including mean, LSTM, and pooling aggregators, to integrate neighborhood information effectively.
GCNAE [19]: GCNAE utilizes a Variational Graph Autoencoder (VGAE), where the encoder learns node embeddings and the decoder reconstructs both the adjacency matrix and node attributes. Anomalies are detected via the reconstruction error.
GAT [25]: GAT enhances GCNs by adaptively weighting neighbor contributions through attention mechanisms.
DOMINANT [18]: DOMINANT is a deep graph autoencoder that learns node representations via a shared encoder while separately reconstructing the adjacency and attribute matrices. The anomaly score combines weighted structural and attribute reconstruction errors.
DONE [43]: DONE employs dual autoencoders to encode topological structure and node attributes independently. Cross-modal interactions capture anomaly patterns, while a unified loss function jointly optimizes node embeddings and anomaly scores.
AdONE [43]: AdONE extends DONE by integrating adversarial learning. A generator–discriminator framework refines node representations, distinguishing normal from anomalous patterns.
AnomalyDAE [44]: AnomalyDAE leverages dual autoencoders with GATs to encode the adjacency matrix and node features into separate embeddings. Attention mechanisms model asymmetric node interactions.
GAAN [45]: GAAN adopts a Generative Adversarial Network (GAN) framework for outlier detection. The generator learns the distribution of normal nodes, while the discriminator differentiates real from generated nodes. Anomaly scores derive from the discriminator outputs.
DiffGAD [34]: DiffGAD introduces a discriminative content-guided generation paradigm. It extracts discriminative features by contrasting unconditional and conditional diffusion models and then computes reconstruction scores as node anomaly metrics.
4.4. Evaluation
To comprehensively evaluate the model's performance, this paper adopts the following core evaluation metrics: ROC-AUC, F1-score, Recall, and Average Precision (AP). These metrics measure model performance from different perspectives. ROC-AUC (Receiver Operating Characteristic–Area Under Curve) is a widely used tool for assessing binary classification performance; it captures the relationship between the True Positive Rate (TPR) and False Positive Rate (FPR) across varying thresholds, and the AUC value ranges from 0 to 1, with higher values indicating stronger discriminative ability. Recall measures a model's ability to identify positive (abnormal) samples, i.e., the proportion of actual abnormal samples correctly identified by the model. The F1-score is the harmonic mean of precision and Recall, providing a balanced assessment of both. Average Precision (AP) measures a model's precision at different Recall levels; it is calculated as a weighted average of the precision values across these levels, offering a more comprehensive evaluation of model performance across thresholds. The calculation formulas for these metrics are given as follows:
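With TP, FP, and FN denoting the numbers of true positives, false positives, and false negatives, respectively, these metrics take their standard forms:

\[
\mathrm{Precision} = \frac{TP}{TP + FP}, \qquad
\mathrm{Recall} = \frac{TP}{TP + FN}, \qquad
F_1 = \frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}, \qquad
\mathrm{AP} = \sum_{n} \left( R_n - R_{n-1} \right) P_n,
\]

where \(P_n\) and \(R_n\) denote the precision and Recall at the \(n\)-th score threshold, and ROC-AUC is computed as the area under the TPR–FPR curve traced by sweeping the decision threshold.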
4.5. Experimental Results
To comprehensively evaluate the effectiveness of the proposed DPGAD, we conducted systematic experiments on multiple real-world datasets. The experiments primarily address three key questions: (1) Does DPGAD significantly outperform existing methods in overall performance? (2) Do structural similarity features play a crucial role in the GAD task? (3) Does DPGAD exhibit strong robustness to noise?
4.5.1. Performance Analysis
This section presents a comprehensive comparison with state-of-the-art methods using the AUC and F1-score metrics. The ROC-AUC results are summarized in Table 2, while the F1-scores are reported in Table 3. The Recall and AP results are shown in Appendix A.
From the experimental results on the two evaluation metrics, we can derive the following observations: (1) DPGAD demonstrates robust anomaly detection across all datasets, achieving the highest AUC on six datasets and close-to-optimal performance on Amazon; with an average AUC improvement of 9.69% over the second-best method, DPGAD significantly outperforms existing graph anomaly detection methods. (2) DPGAD maintains strong F1-scores across all datasets, with a 12.97% average improvement over the second-best method, indicating its insensitivity to class imbalance and highlighting its robustness and stability. (3) The dataset-level results further highlight the benefit of modeling structural similarity: DPGAD surpasses its competitors by 5.63% (AUC) and 12.9% (F1) where anomalies exhibit a higher clustering coefficient than normal nodes (0.400 vs. 0.301), indicating structural deviations, and it achieves gains of 27.63% (AUC) and 28% (F1) over the runner-up on a small-scale dataset; unlike baselines that overfit on this small-scale dataset, DPGAD effectively isolates anomalies by leveraging structural distinctions during encoding.
4.5.2. Ablation Experiment
Effectiveness of structural similarity feature extraction. To validate the importance of structural similarity feature extraction in GAD, we design con_MLP, a standalone model combining structural similarity embeddings with an MLP head. Experiments are conducted on seven datasets to evaluate its effectiveness. We further compare its ROC-AUC performance against two baselines: (1) MLPAE (an MLP-based autoencoder) and (2) a GAT model trained solely on attribute features. The results are summarized in Table 4.
Comparative experiments reveal that using structural similarity features alone achieves competitive performance compared to MLP-based architectures that rely solely on node attributes. This demonstrates the effectiveness of structural features in graph anomaly detection tasks. Notably, the con_MLP variant significantly outperforms attribute-only models on the Weibo and Disney datasets, indicating a heightened sensitivity of anomalous nodes to structural patterns in these datasets, which further explains the superior overall performance observed in Section 4.5.1.
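As a concrete illustration, the con_MLP ablation amounts to scoring nodes with an MLP head on top of precomputed structural-similarity embeddings alone. The sketch below shows this idea with illustrative layer sizes and a placeholder embedding tensor, not the exact architecture used in the experiments.

```python
import torch
import torch.nn as nn


class ConMLP(nn.Module):
    """Ablation-style scorer: an MLP head over structural-similarity embeddings
    only (illustrative sketch, not the paper's exact con_MLP architecture)."""

    def __init__(self, in_dim: int, hidden: int = 64):
        super().__init__()
        self.head = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),      # one anomaly logit per node
        )

    def forward(self, struct_emb: torch.Tensor) -> torch.Tensor:
        return self.head(struct_emb).squeeze(-1)


if __name__ == "__main__":
    struct_emb = torch.randn(100, 32)          # placeholder structural embeddings
    scores = torch.sigmoid(ConMLP(32)(struct_emb))
    print(scores.shape)  # torch.Size([100])
```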
The effect of the gating coefficient on performance. In the adaptive gated fusion unit, the model reduces to a standard GAT when the gating coefficient equals 1, while it becomes a structure-similarity-based GAT when the gating coefficient equals 0. Analyzing different gating coefficients therefore helps reveal the importance of structural similarity features in GAD tasks. To evaluate the impact of the gating coefficient on model performance, we conduct experiments on Weibo, Reddit, Disney, and Books by fixing different gating values and analyzing their effects across datasets. Specifically, we report the ROC-AUC performance at epoch = 500 with gating values ranging from 0.1 to 1.0 over the Weibo, Reddit, Disney, and Books datasets.
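One reading consistent with this description, with notation chosen here purely for illustration, is a convex combination of the two paths,

\[
\mathbf{h}_i = g_i \, \mathbf{h}_i^{\mathrm{attr}} + \left( 1 - g_i \right) \mathbf{h}_i^{\mathrm{struct}}, \qquad g_i \in [0, 1],
\]

so that a gating value of 1 recovers the purely attribute-driven (standard GAT) path and a value of 0 the purely structure-similarity-driven path; in this ablation the gate is fixed to a constant rather than learned.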
Figure 4 reveals that the optimal gating value varies across datasets. A higher value indicates DPGAD's stronger reliance on attribute features, while a lower value suggests greater dependence on structural similarity features. Table 5 reports the mean gating values when DPGAD achieves peak performance after 500 training epochs. These values align closely with the optimal values in Figure 4, demonstrating that the adaptive gating unit converges to appropriate coefficients during training. Notably, the gating coefficients corresponding to peak performance consistently fall below 0.5. As shown in Figure 5, we analyze the distribution of gating coefficients after 500 training epochs on Weibo and Disney. The results reveal that most nodes exhibit gating values below 0.5; more importantly, the vast majority of anomalous nodes show gating coefficients under 0.5, indicating that DPGAD's performance gains stem from its effective utilization of structural dependencies. This further highlights the critical role of structural similarity in graph node anomaly detection.
Robustness Testing in Complex Scenarios. To evaluate DPGAD's robustness and applicability in complex scenarios, we conducted robustness tests. These tests simulate real-world challenges in consumer applications, including data sparsity, disordered review content, and adversarial users evading keyword-based detection. Specifically, we injected varying levels of noise (10%, 20%, 30%) into the attribute features of four datasets and measured DPGAD's ROC-AUC performance under the different noise scales. The results are shown in Figure 6.
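A minimal sketch of one way such attribute noise can be injected is given below. Gaussian corruption of a randomly chosen fraction of nodes is an assumption made for illustration; the exact noise model may differ.

```python
import torch


def inject_attribute_noise(x: torch.Tensor, ratio: float, std: float = 1.0) -> torch.Tensor:
    """Corrupt the attributes of a random `ratio` of nodes with Gaussian noise.
    Illustrative assumption only; the paper's exact noise model is not specified here."""
    x_noisy = x.clone()
    num_noisy = int(ratio * x.size(0))
    idx = torch.randperm(x.size(0))[:num_noisy]           # nodes to perturb
    x_noisy[idx] += std * torch.randn(num_noisy, x.size(1))
    return x_noisy


if __name__ == "__main__":
    x = torch.randn(1000, 16)
    for ratio in (0.1, 0.2, 0.3):                         # 10%, 20%, 30% noise levels
        x_noisy = inject_attribute_noise(x, ratio)
        # Fraction of nodes whose attributes were actually perturbed (~ratio).
        print(ratio, (x_noisy != x).any(dim=1).float().mean().item())
```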
DPGAD maintains consistent performance across multiple datasets under varying noise levels (10%, 20%, 30%). While the ROC-AUC scores decrease at higher noise levels, the degradation remains within acceptable bounds, demonstrating the model's noise tolerance. Notably, on the Weibo and Disney datasets, DPGAD shows less than a 3% performance drop when the noise level increases from 10% to 30%, indicating superior robustness. The gating analysis in Section 4.5.2 further revealed that DPGAD primarily relies on structural similarity features, which explains its resilience to attribute noise across all four datasets.
Performance evaluation of different types of anomalies. To assess DPGAD’s adaptability to diverse anomalies, we conducted experiments on the Weibo dataset. However, the publicly available dataset does not specify distinct anomaly types. Therefore, we employed K-means clustering to categorize anomalies automatically. Specifically, we first concatenated each node’s structural and attribute features as input. Then, we computed the Silhouette Coefficient for different K values to determine the optimal number of clusters. On the Weibo dataset, the best-performing K was 3, indicating three distinct anomaly types. The distribution of these three anomaly categories is visualized in
Figure 7 (left).
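A minimal sketch of this clustering step with scikit-learn is shown below; the concatenated feature matrix is a random placeholder, and the range of candidate K values is an illustrative choice.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

# Placeholder for the concatenated [structural || attribute] features of anomalous nodes.
node_feats = np.random.rand(300, 48)

best_k, best_score = None, -1.0
for k in range(2, 8):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(node_feats)
    score = silhouette_score(node_feats, labels)   # higher = better-separated clusters
    if score > best_score:
        best_k, best_score = k, score

print(f"best K = {best_k} (silhouette = {best_score:.3f})")
```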
The confusion matrix in
Figure 7b demonstrates DPGAD’s classification performance on clustered data, where each cell value represents the proportion relative to its class total. The results indicate DPGAD achieves strong recognition across all categories, confirming its generalization capability for diverse anomaly types.
4.5.3. Visual Analysis
Feature visualization in the training process. To visually validate the effectiveness of the attribute feature embeddings and structural similarity embeddings, we visualize both representations, along with their gated fusion, at early (epoch = 1) and late (epoch = 500) training stages. Using t-SNE [46], we project the attribute feature (AF) and structural similarity feature (SSF) embeddings into 2D space, with normal and anomalous nodes color-coded for distinction. The results are shown in Figure 8 and Figure 9.
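A minimal sketch of the t-SNE projection behind these plots is shown below; the embeddings and labels are random placeholders standing in for the learned representations and ground-truth anomaly labels.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE

# Placeholders for learned node embeddings and ground-truth anomaly labels.
embeddings = np.random.rand(500, 64)
labels = np.random.randint(0, 2, size=500)        # 1 = anomalous, 0 = normal

coords = TSNE(n_components=2, random_state=0).fit_transform(embeddings)
plt.scatter(coords[labels == 0, 0], coords[labels == 0, 1], s=5, c="tab:blue", label="normal")
plt.scatter(coords[labels == 1, 0], coords[labels == 1, 1], s=5, c="tab:red", label="anomalous")
plt.legend()
plt.savefig("tsne_embeddings.png")
```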
From the visualization results, we observe the following: (1) Combining the features distinguishes and isolates anomalous nodes from normal ones more effectively than using attribute features or structural similarity features alone. This indicates that feature fusion enables the model to learn more comprehensive node representations, thereby improving detection accuracy and robustness. The improvement is particularly pronounced on Weibo and Disney, aligning with DPGAD's superior performance on these datasets. (2) At epoch = 1, node distributions using only AF exhibit significant overlap between normal (blue) and anomalous (red) nodes, with no clear separation. However, after incorporating SSF, anomalies begin to diverge from normal nodes even in the early training stages. This demonstrates that SSF plays a critical role in capturing anomalous patterns, especially when AF provides limited discriminative signal. (3) DPGAD's advantages are particularly evident on small-scale datasets such as Disney. By jointly leveraging AF and SSF, the model cleanly separates the minority anomalous nodes from the majority of normal ones, confirming DPGAD's consistent detection efficacy on small datasets, even when anomalies are scarce.
Visualization of classification performance. To visually compare different graph anomaly detection methods on real-world datasets, we visualize how effectively each model separates normal nodes (blue) from anomalies (red). Specifically, we analyze the feature distributions of the deep learning-based methods after 500 training epochs, with the results shown in Figure 10.
The visualization results demonstrate that DPGAD achieves sharper cluster separation between normal and anomalous nodes, with minimal overlap between the two categories. Although GCN and GraphSAGE show clear clustering of anomalous nodes, they fail to effectively separate anomalies from normal nodes, leaving significant overlap regions between the two groups, which explains their inferior performance compared with DPGAD. This observation indicates DPGAD's enhanced sensitivity and precision in detecting structural anomalies within graph data. Notably, DPGAD maintains robust detection performance on the Weibo dataset, effectively identifying anomalous nodes despite their low population ratio (10.3%). This confirms DPGAD's capability to detect minority anomaly groups within imbalanced graph datasets.