TRed-GNN: A Robust Graph Neural Network with Task-Relevant Edge Disentanglement and Reverse Process Mechanism
Abstract
1. Introduction
- We propose TRed-GNN, a novel framework for node classification. By disentangling graph edges into task-relevant and task-irrelevant subgraphs, TRed-GNN mitigates the noise introduced by heterophilous edges and processes the two subgraphs through independent channels. This dual-path design improves classification accuracy, especially on graphs with complex heterophilous structure (a minimal sketch of the mechanism follows this list).
- To handle edge heterophily and node-level noise, TRed-GNN introduces a reverse process on the task-irrelevant subgraph. This mechanism recovers useful information while suppressing noise, effectively alleviating over-smoothing.
- We conducted systematic experiments on multiple real-world datasets (e.g., Cora, Citeseer, Chameleon) to evaluate TRed-GNN, demonstrating its superior performance over existing graph neural network methods on graphs with varying levels of homophily and heterophily.
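To make the dual-path design concrete, the sketch below shows one way the edge disentanglement could be realized in PyTorch. It is an illustration under our own assumptions, not the paper's exact formulation: the class name `EdgeDisentangleLayer`, the two-layer gating MLP, and the weighted-mean aggregation are all ours.

```python
import torch
import torch.nn as nn

class EdgeDisentangleLayer(nn.Module):
    """Illustrative dual-path layer: a learned gate g_ij in (0, 1) splits each
    edge into a task-relevant weight g and a task-irrelevant weight 1 - g,
    and the two resulting subgraphs are aggregated in independent channels."""

    def __init__(self, in_dim: int, out_dim: int):
        super().__init__()
        # Edge gate scored from the concatenated endpoint features.
        self.gate = nn.Sequential(nn.Linear(2 * in_dim, 1), nn.Sigmoid())
        self.lin_rel = nn.Linear(in_dim, out_dim)  # task-relevant channel
        self.lin_irr = nn.Linear(in_dim, out_dim)  # task-irrelevant channel

    def forward(self, x: torch.Tensor, edge_index: torch.Tensor):
        # x: [N, in_dim] node features; edge_index: [2, E] (src, dst) pairs.
        src, dst = edge_index
        g = self.gate(torch.cat([x[src], x[dst]], dim=-1)).squeeze(-1)  # [E]

        def aggregate(weights: torch.Tensor) -> torch.Tensor:
            """Weighted mean over incoming edges of one gated subgraph."""
            out = torch.zeros_like(x)
            out.index_add_(0, dst, x[src] * weights.unsqueeze(-1))
            deg = torch.zeros(x.size(0), device=x.device)
            deg.index_add_(0, dst, weights)
            return out / deg.clamp(min=1e-6).unsqueeze(-1)

        h_rel = self.lin_rel(aggregate(g))        # task-relevant subgraph
        h_irr = self.lin_irr(aggregate(1.0 - g))  # task-irrelevant subgraph
        return h_rel, h_irr
```

A classifier head could then consume, for example, the concatenation of `h_rel` with a denoised version of `h_irr`; in the full model, the task-irrelevant branch would additionally pass through the reverse process before recombination.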
2. Related Work
2.1. Graph Neural Networks
2.2. GNN for Heterophilous Graphs
2.3. Task Relevance and Disentangled Representation Learning
2.4. The Problem of Over-Smoothing in GNNs
3. Notations and Preliminaries
4. The Proposed Method
4.1. Dynamically Updating Graph Topology
4.2. Neighborhood Aggregation
4.3. Computational Complexity Analysis
5. Experiments
- Datasets: We evaluate TRed-GNN on eight real-world datasets: Cora, Citeseer, Cornell, Chameleon, Squirrel, Wisconsin, Texas, and Film.
- Data Splits: For homophilous graphs, we follow the standard setting of selecting 20 nodes per class for training, 500 nodes for validation, and 1500 nodes for testing. For heterophilous graphs, we split the data into training, validation, and test sets with ratios of 60%, 20%, and 20%, respectively.
- Baselines and Implementation Details: To assess the performance of our model, we compare it against several state-of-the-art GNN models and task-specific models. Specifically, the baseline models include GCN [12], GAT [13], SGC [42], GraphSAGE [14], APPNP [43], Geom-GCN [19], ACM-GCN [21], H2GCN [20], FAGCN [39], GPR-GNN [44], LRGNN [37], and MixHop [45]. For all baselines and TRed-GNN, we use the same number of hidden units to ensure a fair comparison, use Adam as the optimizer, and tune hyperparameters for each dataset with Optuna on the validation set. For the multi-layer perceptron, the hidden feature dimension is set to 512, and training is performed for 200 runs. After obtaining the optimal hyperparameters, we train the model for 1000 epochs with an early-stopping strategy with a patience of 100 epochs. The final performance is reported as the average over 10 runs with different random data splits on the test set. A sketch of the split generation and early-stopping loop follows this list.
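The following is a minimal sketch of the protocol described above. The structure and function names are ours, not the authors' released code, and `train_one_epoch` and `evaluate` are assumed helpers.

```python
import copy
import numpy as np

def make_split(labels: np.ndarray, homophilous: bool, num_classes: int,
               rng: np.random.Generator):
    """Homophilous graphs: 20 nodes per class for training, 500 nodes for
    validation, 1500 for testing. Heterophilous graphs: random 60/20/20."""
    n = len(labels)
    idx = rng.permutation(n)
    if homophilous:
        train = np.concatenate(
            [idx[labels[idx] == c][:20] for c in range(num_classes)])
        keep = np.ones(n, dtype=bool)
        keep[train] = False
        rest = idx[keep[idx]]              # remaining nodes, shuffled order
        val, test = rest[:500], rest[500:2000]
    else:
        a, b = int(0.6 * n), int(0.8 * n)
        train, val, test = idx[:a], idx[a:b], idx[b:]
    return train, val, test

def fit(model, optimizer, val_idx, max_epochs=1000, patience=100):
    """1000-epoch budget with early stopping after 100 epochs without
    improvement in validation accuracy, as described above."""
    best_acc, best_state, wait = 0.0, None, 0
    for _ in range(max_epochs):
        train_one_epoch(model, optimizer)   # assumed helper
        acc = evaluate(model, val_idx)      # assumed helper
        if acc > best_acc:
            best_acc, wait = acc, 0
            best_state = copy.deepcopy(model.state_dict())
        else:
            wait += 1
            if wait >= patience:
                break
    model.load_state_dict(best_state)       # restore best checkpoint
    return best_acc
```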
5.1. Classification Results
5.2. Ablation Experiment
5.3. Robustness Analysis
5.4. Relieving the Over-Smoothing Problem
5.5. Limitations
6. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
1. Costa, A.R.; Ralha, C.G. AC2CD: An actor–critic architecture for community detection in dynamic social networks. Knowl.-Based Syst. 2023, 261, 110202.
2. Li, D.X.; Zhou, P.; Zhao, B.W.; Su, X.R.; Li, G.D.; Zhang, J.; Hu, P.W.; Hu, L. BioCAIV: An integrative webserver for motif-based clustering analysis and interactive visualization of biological networks. BMC Bioinform. 2023, 24, 451.
3. Li, Y.; Lin, B.; Luo, B.; Gui, N. Graph representation learning beyond node and homophily. IEEE Trans. Knowl. Data Eng. 2022, 35, 4880–4893.
4. Zheng, Q.; Zhang, Y. TAGnn: Time adjoint graph neural network for traffic forecasting. In Proceedings of the International Conference on Database Systems for Advanced Applications, Tianjin, China, 17–20 April 2023; Springer: Cham, Switzerland, 2023; pp. 369–379.
5. Li, W.; Wang, C.H.; Cheng, G.; Song, Q. Optimum-statistical collaboration towards general and efficient black-box optimization. arXiv 2021, arXiv:2106.09215.
6. Rusch, T.K.; Bronstein, M.M.; Mishra, S. A survey on oversmoothing in graph neural networks. arXiv 2023, arXiv:2303.10993.
7. Chen, D.; Lin, Y.; Li, W.; Li, P.; Zhou, J.; Sun, X. Measuring and relieving the over-smoothing problem for graph neural networks from the topological view. Proc. AAAI Conf. Artif. Intell. 2020, 34, 3438–3445.
8. He, L.; Bai, L.; Yang, X.; Liang, Z.; Liang, J. Exploring the role of edge distribution in graph convolutional networks. Neural Netw. 2023, 168, 459–470.
9. Liu, L.; Wang, Y.; Xie, Y.; Tan, X.; Ma, L.; Tang, M.; Fang, M. Label-aware aggregation on heterophilous graphs for node representation learning. Displays 2024, 84, 102817.
10. Chen, Y.; Jiang, D.; Tan, C.; Song, Y.; Zhang, C.; Chen, L. Neural moderation of ASMR erotica content in social networks. IEEE Trans. Knowl. Data Eng. 2023, 36, 275–280.
11. Guo, J.; Huang, K.; Zhang, R.; Yi, X. ES-GNN: Generalizing graph neural networks beyond homophily with edge splitting. IEEE Trans. Pattern Anal. Mach. Intell. 2024, 46, 11345–11360.
12. Kipf, T.N.; Welling, M. Semi-supervised classification with graph convolutional networks. arXiv 2016, arXiv:1609.02907.
13. Veličković, P.; Cucurull, G.; Casanova, A.; Romero, A.; Liò, P.; Bengio, Y. Graph attention networks. arXiv 2017, arXiv:1710.10903.
14. Hamilton, W.; Ying, Z.; Leskovec, J. Inductive representation learning on large graphs. In Proceedings of the 31st International Conference on Neural Information Processing Systems, Long Beach, CA, USA, 4–9 December 2017.
15. Bruna, J.; Zaremba, W.; Szlam, A.; LeCun, Y. Spectral networks and locally connected networks on graphs. arXiv 2013, arXiv:1312.6203.
16. Gilmer, J.; Schoenholz, S.S.; Riley, P.F.; Vinyals, O.; Dahl, G.E. Neural message passing for quantum chemistry. In Proceedings of the International Conference on Machine Learning, Sydney, NSW, Australia, 6–11 August 2017; PMLR; pp. 1263–1272.
17. He, M.; Wei, Z.; Huang, Z.; Xu, H. BernNet: Learning arbitrary graph spectral filters via Bernstein approximation. Adv. Neural Inf. Process. Syst. 2021, 34, 14239–14251.
18. Defferrard, M.; Bresson, X.; Vandergheynst, P. Convolutional neural networks on graphs with fast localized spectral filtering. In Proceedings of the 30th International Conference on Neural Information Processing Systems, Barcelona, Spain, 5–10 December 2016.
19. Pei, H.; Wei, B.; Chang, K.C.C.; Lei, Y.; Yang, B. Geom-GCN: Geometric graph convolutional networks. arXiv 2020, arXiv:2002.05287.
20. Zhu, J.; Yan, Y.; Zhao, L.; Heimann, M.; Akoglu, L.; Koutra, D. Beyond homophily in graph neural networks: Current limitations and effective designs. Adv. Neural Inf. Process. Syst. 2020, 33, 7793–7804.
21. Luan, S.; Hua, C.; Lu, Q.; Zhu, J.; Zhao, M.; Zhang, S.; Chang, X.W.; Precup, D. Revisiting heterophily for graph neural networks. Adv. Neural Inf. Process. Syst. 2022, 35, 1362–1375.
22. Wang, R.; Mou, S.; Wang, X.; Xiao, W.; Ju, Q.; Shi, C.; Xie, X. Graph structure estimation neural networks. In Proceedings of the Web Conference 2021, Ljubljana, Slovenia, 19–23 April 2021; pp. 342–353.
23. Xu, D.; Cheng, W.; Luo, D.; Chen, H.; Zhang, X. InfoGCL: Information-aware graph contrastive learning. Adv. Neural Inf. Process. Syst. 2021, 34, 30414–30425.
24. Sun, Q.; Li, J.; Peng, H.; Wu, J.; Fu, X.; Ji, C.; Yu, P.S. Graph structure learning with variational information bottleneck. Proc. AAAI Conf. Artif. Intell. 2022, 36, 4165–4174.
25. Yang, M.; Shen, Y.; Qi, H.; Yin, B. Soft-mask: Adaptive substructure extractions for graph neural networks. In Proceedings of the Web Conference 2021, Ljubljana, Slovenia, 19–23 April 2021; pp. 2058–2068.
26. Zheng, C.; Zong, B.; Cheng, W.; Song, D.; Ni, J.; Yu, W.; Chen, H.; Wang, W. Robust graph representation learning via neural sparsification. In Proceedings of the International Conference on Machine Learning, Virtual Event, 13–18 July 2020; PMLR; pp. 11458–11468.
27. Luo, D.; Cheng, W.; Yu, W.; Zong, B.; Ni, J.; Chen, H.; Zhang, X. Learning to drop: Robust graph neural network via topological denoising. In Proceedings of the 14th ACM International Conference on Web Search and Data Mining, Virtual Event, 8–12 March 2021; pp. 779–787.
28. Wang, H.; Leskovec, J. Unifying graph convolutional neural networks and label propagation. arXiv 2020, arXiv:2002.06755.
29. Seo, S.; Kim, S.; Park, C. Interpretable prototype-based graph information bottleneck. Adv. Neural Inf. Process. Syst. 2023, 36, 76737–76748.
30. Higgins, I.; Amos, D.; Pfau, D.; Racaniere, S.; Matthey, L.; Rezende, D.; Lerchner, A. Towards a definition of disentangled representations. arXiv 2018, arXiv:1812.02230.
31. Liu, Y.; Wang, X.; Wu, S.; Xiao, Z. Independence promoted graph disentangled networks. Proc. AAAI Conf. Artif. Intell. 2020, 34, 4916–4923.
32. Ma, J.; Cui, P.; Kuang, K.; Wang, X.; Zhu, W. Disentangled graph convolutional networks. In Proceedings of the International Conference on Machine Learning, Long Beach, CA, USA, 9–15 June 2019; PMLR; pp. 4212–4221.
33. Yang, Y.; Feng, Z.; Song, M.; Wang, X. Factorizable graph convolutional networks. Adv. Neural Inf. Process. Syst. 2020, 33, 20286–20296.
34. Li, H.; Zhang, Z.; Wang, X.; Zhu, W. Disentangled graph contrastive learning with independence promotion. IEEE Trans. Knowl. Data Eng. 2022, 35, 7856–7869.
35. Zhu, J.; Rossi, R.A.; Rao, A.; Mai, T.; Lipka, N.; Ahmed, N.K.; Koutra, D. Graph neural networks with heterophily. Proc. AAAI Conf. Artif. Intell. 2021, 35, 11168–11176.
36. Maurya, S.K.; Liu, X.; Murata, T. Simplifying approach to node classification in graph neural networks. J. Comput. Sci. 2022, 62, 101695.
37. Liang, L.; Hu, X.; Xu, Z.; Song, Z.; King, I. Predicting global label relationship matrix for graph neural networks under heterophily. Adv. Neural Inf. Process. Syst. 2023, 36, 10909–10921.
38. Song, Y.; Zhou, C.; Wang, X.; Lin, Z. Ordered GNN: Ordering message passing to deal with heterophily and over-smoothing. arXiv 2023, arXiv:2302.01524.
39. Bo, D.; Wang, X.; Shi, C.; Shen, H. Beyond low-frequency information in graph convolutional networks. Proc. AAAI Conf. Artif. Intell. 2021, 35, 3950–3957.
40. Chamberlain, B.; Rowbottom, J.; Gorinova, M.I.; Bronstein, M.; Webb, S.; Rossi, E. GRAND: Graph neural diffusion. In Proceedings of the International Conference on Machine Learning, Virtual, 18–24 July 2021; PMLR; pp. 1407–1418.
41. Rong, Y.; Huang, W.; Xu, T.; Huang, J. DropEdge: Towards deep graph convolutional networks on node classification. arXiv 2019, arXiv:1907.10903.
42. Wu, F.; Souza, A.; Zhang, T.; Fifty, C.; Yu, T.; Weinberger, K. Simplifying graph convolutional networks. In Proceedings of the International Conference on Machine Learning, Long Beach, CA, USA, 9–15 June 2019; PMLR; pp. 6861–6871.
43. Gasteiger, J.; Bojchevski, A.; Günnemann, S. Predict then propagate: Graph neural networks meet personalized PageRank. arXiv 2018, arXiv:1810.05997.
44. Chien, E.; Peng, J.; Li, P.; Milenkovic, O. Adaptive universal generalized PageRank graph neural network. arXiv 2020, arXiv:2006.07988.
45. Abu-El-Haija, S.; Perozzi, B.; Kapoor, A.; Alipourfard, N.; Lerman, K.; Harutyunyan, H.; Ver Steeg, G.; Galstyan, A. MixHop: Higher-order graph convolutional architectures via sparsified neighborhood mixing. In Proceedings of the International Conference on Machine Learning, Long Beach, CA, USA, 9–15 June 2019; PMLR; pp. 21–29.
Methods | Task Relevance Handling | Heterophily Handling | Over-Smoothing Mitigation
---|---|---|---
GCN | Does not explicitly model task relevance; all neighbors are treated equally. | Assumes homophily (connected nodes share a class) and struggles under heterophily. | Over-smoothing becomes a major issue in deeper layers.
GAT | Uses attention to weigh the importance of neighbors, but does not separate task-relevant from task-irrelevant edges. | Self-attention helps focus on more relevant neighbors, but still fails under heterophily in some cases. | Attention may reduce over-smoothing, but it remains problematic in deep stacks.
FactorGNN | Factorizes node representations, but does not explicitly separate task-relevant information. | Handles heterophily via factorization, with no direct mechanism for isolating task-irrelevant information. | Does not address over-smoothing effectively.
MixHop | Mixes features from different neighborhood hops, but task relevance is not explicitly modeled. | Captures higher-order neighbors but may struggle when connections do not align with node labels. | Mixing high-order features still leads to over-smoothing as depth increases.
FAGCN | Uses frequency-based aggregation to improve task relevance, but lacks an explicit split into relevant and irrelevant edges. | Improves handling of heterophilous graphs but does not model task relevance explicitly. | Frequency-based aggregation reduces over-smoothing, which can still arise in deeper layers.
ACM-GNN | Adapts the aggregation process but does not explicitly split task-relevant from irrelevant edges. | Adaptive channel mixing improves performance on heterophilous graphs but does not isolate irrelevant edges. | Combines different filter types but does not fully mitigate over-smoothing.
TRed-GNN | Explicitly separates task-relevant and task-irrelevant edges so that only relevant information drives prediction. | Uses a reverse process to recover useful information from task-irrelevant edges, designed specifically for heterophilous graphs. | Uses a reverse diffusion process to recover information and prevent over-smoothing, keeping message passing effective even in deep layers (sketched below).
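The mechanism in the last row can be pictured as running diffusion backwards on the task-irrelevant subgraph: a forward propagation step smooths features toward the neighborhood average, while a reversed step amplifies each node's deviation from that average, recovering node-specific signal. The sketch below follows that reading; the update rule `(1 + alpha) * h - alpha * A_hat @ h` is our assumption, not the paper's exact formula.

```python
import torch

def reverse_step(h: torch.Tensor, adj_norm: torch.Tensor,
                 alpha: float = 0.5) -> torch.Tensor:
    """One reversed diffusion step on the task-irrelevant subgraph.

    A forward smoothing step is h <- (1 - alpha) * h + alpha * A_hat @ h.
    Reversing the direction of that update sharpens instead:
        h <- (1 + alpha) * h - alpha * A_hat @ h,
    amplifying each node's difference from its neighborhood average and
    thereby counteracting over-smoothing. `adj_norm` is a normalized
    [N, N] adjacency matrix (dense or sparse)."""
    ah = torch.sparse.mm(adj_norm, h) if adj_norm.is_sparse else adj_norm @ h
    return (1.0 + alpha) * h - alpha * ah
```

Stacking a few such steps on the task-irrelevant representations before recombining them with the task-relevant channel would keep deep message passing from collapsing node representations, which matches the behavior the table attributes to TRed-GNN.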
Dataset | Nodes | Edges | Features | Classes | Homophily (%)
---|---|---|---|---|---
Cora | 2708 | 5429 | 1433 | 7 | 81
Citeseer | 3327 | 4732 | 3703 | 6 | 74
Chameleon | 2277 | 31,421 | 2325 | 5 | 23
Squirrel | 5201 | 198,493 | 2089 | 5 | 22
Film | 7600 | 26,752 | 931 | 5 | 22
Cornell | 183 | 280 | 1703 | 5 | 31
Wisconsin | 251 | 466 | 1703 | 5 | 21
Texas | 183 | 295 | 1703 | 5 | 11
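The Homophily (%) column matches the standard edge-homophily ratio: the percentage of edges whose two endpoints share a class label (roughly 81% for Cora versus 11% for Texas). A small helper that computes it:

```python
import torch

def edge_homophily(edge_index: torch.Tensor, labels: torch.Tensor) -> float:
    """Fraction of edges whose endpoints have the same class label."""
    src, dst = edge_index  # edge_index: [2, E]; labels: [N]
    return (labels[src] == labels[dst]).float().mean().item()
```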
Methods | GCN | GAT | FAGCN | MixHop | GPR-GNN
---|---|---|---|---|---
Time complexity | | | | |

Methods | SGC | ACM-GNN | FactorGNN | Geom-GCN | TRed-GNN
---|---|---|---|---|---
Time complexity | | | | |
Method | Cora | Citeseer | Chameleon | Wisconsin | Texas | Squirrel | Cornell | Film |
---|---|---|---|---|---|---|---|---|
GCN | 79.74 ± 0.74 | 69.56 ± 0.93 | 67.69 ± 1.18 | 59.51 ± 2.05 | 61.74 ± 2.36 | 55.25 ± 0.93 | 52.85 ± 3.72 | 31.26 ± 0.68 |
GAT | 79.13 ± 0.68 | 69.91 ± 0.74 | 67.96 ± 1.49 | 57.72 ± 2.79 | 55.43 ± 3.35 | 54.76 ± 1.36 | 51.24 ± 3.10 | 30.97 ± 0.81 |
SGC | 82.21 ± 0.62 | 69.92 ± 1.18 | 67.34 ± 1.43 | 57.91 ± 2.17 | 55.42 ± 1.36 | 54.86 ± 0.81 | 50.42 ± 0.62 | 30.58 ± 0.50 |
GraphSAGE | 86.88 ± 0.81 | 72.02 ± 1.24 | 62.24 ± 1.49 | 62.71 ± 2.60 | 58.92 ± 1.55 | 55.25 ± 0.99 | 52.31 ± 0.74 | 30.86 ± 0.74 |
APPNP | 87.36 ± 0.37 | 75.29 ± 0.99 | 54.39 ± 1.18 | 45.69 ± 1.80 | 58.92 ± 1.55 | 35.11 ± 1.12 | 58.65 ± 1.61 | 26.53 ± 0.68 |
GeomGCN | 84.83 ± 0.56 | 75.41 ± 0.80 | 60.92 ± 0.62 | 64.51 ± 0.68 | 68.38 ± 2.17 | 38.09 ± 1.12 | 59.45 ± 1.80 | 31.65 ± 0.93 |
ACM-GCN | 86.71 ± 0.62 | 77.09 ± 1.05 | 66.47 ± 1.36 | 76.47 ± 3.65 | 74.05 ± 0.80 | 54.38 ± 3.04 | 84.86 ± 0.62 | 36.12 ± 0.68 |
H2GCN | 81.46 ± 0.87 | 78.62 ± 1.24 | 82.63 ± 2.48 | 82.63 ± 2.48 | 79.48 ± 2.29 | 50.43 ± 0.81 | 79.62 ± 3.04 | 38.46 ± 0.99 |
FAGCN | 82.65 ± 0.81 | 70.34 ± 0.99 | 69.08 ± 1.12 | 65.32 ± 2.71 | 60.35 ± 3.41 | 50.46 ± 1.61 | 79.25 ± 3.41 | 37.94 ± 0.87 |
GPR-GNN | 81.51 ± 0.93 | 69.63 ± 1.05 | 69.68 ± 1.05 | 82.32 ± 2.54 | 81.76 ± 3.05 | 55.16 ± 0.74 | 79.95 ± 3.47 | 38.31 ± 0.68 |
LRGNN | 72.65 ± 0.81 | 70.53 ± 0.68 | 77.16 ± 1.80 | 78.25 ± 1.99 | 71.14 ± 1.92 | 56.75 ± 1.36 | 56.86 ± 2.48 | 21.65 ± 0.80 |
MixHop | 81.36 ± 0.56 | 71.45 ± 0.56 | 81.03 ± 0.87 | 79.71 ± 2.60 | 77.24 ± 1.18 | 55.31 ± 1.05 | 62.34 ± 2.60 | 32.37 ± 0.56 |
TRed-GNN | 89.13 ± 0.68 | 80.65 ± 1.05 | 83.22 ± 1.30 | 86.48 ± 2.67 | 85.03 ± 2.48 | 60.47 ± 0.93 | 76.12 ± 2.17 | 38.66 ± 0.50
w/o ZIR | 76.65 ± 0.50 | 70.23 ± 0.74 | 73.67 ± 1.55 | 78.11 ± 2.79 | 75.41 ± 2.61 | 48.87 ± 1.18 | 67.21 ± 1.80 | 30.54 ± 0.74 |
w/o | 83.22 ± 0.68 | 77.08 ± 0.87 | 79.12 ± 1.80 | 80.53 ± 3.16 | 78.24 ± 2.54 | 54.96 ± 0.87 | 73.32 ± 1.92 | 34.23 ± 0.80 |