NAGNE: Node-to-Attribute Generation Network Embedding for Heterogeneous Network
Abstract
1. Introduction
- We propose an attributed random walk method on a heterogeneous network, which can integrate node attributes into the generated node sequence and does not require a predefined meta-path.
- We propose a neural network model for heterogeneous networks, i.e., NAGNE, for network embedding. NAGNE is able to capture both structural and attribute heterogeneity, and can flexibly determine the appropriate scope of neighborhood information.
- We conduct extensive experiments on four public datasets, and the results show that our model outperforms state-of-the-art network embedding models on most datasets in node classification and node clustering tasks.
2. Related Work
2.1. Graph Neural Network
2.2. Heterogeneous Network Embedding
3. Preliminary
4. Methodology
4.1. Node Attribute Mapping
4.2. Attributed Random Walks
- Selecting a target attribute type t. We sample one attribute type t from those that have nodes connected to the current node via heterogeneous edges. The candidate set of attribute types can be divided into two parts according to node type. To overcome the problem that the node distribution of the random walk sequences is skewed towards highly visible node types, we first sample one of these two parts with a given probability.
- Sampling the target node. We sample one node from the nodes with the selected attribute type t, which form the candidate set of nodes. After selecting a target attribute type t, we sample one node from this candidate set. Parameter q controls the search procedure in the walk. As we hope the nodes in are more likely to be sampled, we set . Note that by generating the next node based on the attribute type, the attributed random walk samples nodes not only from those connected to the current node, but also from the nodes that have attributes similar to those of the current node, which allows the random walk to explore both the structural and attribute information of the network. The pseudocode is shown in Algorithm 1.
Algorithm 1 Attributed Random Walks
Require: graph , initial stay probability , controlling parameters and s, random walk parameter q, number of random walks per node r, maximum walk length
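The two-step sampling described above can be illustrated with a minimal Python sketch. This is not the authors' Algorithm 1: the function name `attributed_random_walk`, the `neighbors` and `attr_type` dictionaries, the uniform choice among candidate attribute types, and the weighting rule (direct neighbors get weight q, other nodes with the same attribute type get weight 1) are all assumptions made for illustration. The stay probability and the parameter s are omitted.

```python
import random
from collections import defaultdict


def attributed_random_walk(neighbors, attr_type, start, walk_length, q=2.0, rng=None):
    """Generate one walk (illustrative sketch, not the paper's exact procedure).

    neighbors: dict mapping node -> list of adjacent nodes (hypothetical input)
    attr_type: dict mapping node -> attribute type label (hypothetical input)
    q: assumed weight that favors direct neighbors over other candidates
    """
    rng = rng or random.Random()

    # Index nodes by attribute type so step 2 can reach beyond direct neighbors.
    nodes_by_type = defaultdict(list)
    for node, t in attr_type.items():
        nodes_by_type[t].append(node)

    walk = [start]
    current = start
    while len(walk) < walk_length:
        nbrs = neighbors.get(current, [])
        if not nbrs:
            break  # dead end: no heterogeneous edge to follow

        # Step 1: pick an attribute type among those seen on direct neighbors.
        # Choosing uniformly over types (assumption) dampens frequent types.
        candidate_types = sorted({attr_type[n] for n in nbrs})
        t = rng.choice(candidate_types)

        # Step 2: pick a node carrying attribute type t.
        # Assumption: direct neighbors get weight q, all other nodes weight 1.
        candidates = [n for n in nodes_by_type[t] if n != current]
        if not candidates:
            break
        nbr_set = set(nbrs)
        weights = [q if n in nbr_set else 1.0 for n in candidates]
        current = rng.choices(candidates, weights=weights, k=1)[0]
        walk.append(current)
    return walk


# Toy bibliographic graph: authors (A), papers (P), one conference (C).
neighbors = {
    "a1": ["p1"],
    "a2": ["p1", "p2"],
    "p1": ["a1", "a2", "c1"],
    "p2": ["a2", "c1"],
    "c1": ["p1", "p2"],
}
attr_type = {"a1": "A", "a2": "A", "p1": "P", "p2": "P", "c1": "C"}

walk = attributed_random_walk(neighbors, attr_type, "a1", 5, q=2.0, rng=random.Random(0))
print(walk)
```

With q greater than 1 the walk in this sketch mostly follows edges, but it can still jump to a node that shares the selected attribute type without being adjacent, which is the behavior the text attributes to the attributed random walk.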
4.3. Node-to-Attribute Generation
4.3.1. Node Sequence Encoder
4.3.2. Node Attribute Generation
5. Experiment
5.1. Experimental Setup
5.1.1. Datasets
- DBLP [18]: A frequently used bibliographic network in heterogeneous network studies. We extract a subset of DBLP containing nodes from three domains, including 2000 authors (A), 9556 papers (P) and 20 conferences (C).
- IMDB [30]: A movie rating website with nodes also in three domains. We composed a heterogeneous network with 4353 actors (A), 3676 movies (M) and 1678 directors (D). Movies were categorized into action, comedy and drama classes as labels based on their genres.
- ACM [30]: This network contains papers from prestigious conferences like KDD, SIGMOD, SIGCOMM, MobiCOMM and VLDB. Our subset of the ACM includes 3025 papers, 5835 authors and 56 subjects. Paper features are represented through a bag-of-words model of keywords.
- Aminer [31]: Bibliographic graphs. The papers in Aminer are labeled with 17 research fields, e.g., Artificial Intelligence, which are used for node classification. Our subset of the Aminer includes 302402 papers and 520456 authors.
5.1.2. Baselines
- DeepWalk [1]: A homogeneous network embedding model that integrates random walks and the skip-gram to learn network embeddings.
- Metapath2Vec [7]: A heterogeneous network embedding method adopting meta-path-based random walks and a heterogeneous skip-gram model.
- HIN2Vec [32]: A neural network-based approach that learns embeddings through multiple predictive training tasks.
- MAGNN [33]: This model uses node content transformation along with intra- and inter-metapath aggregations to generate node embeddings.
- HAN [8]: This approach utilizes neural networks for heterogeneous networks, generating embeddings by aggregating features from meta-path-based neighbors hierarchically.
- GTN [34]: GTN transforms heterogeneous networks into new graphs using meta-paths and applies convolution to these meta-path-derived graphs to learn node embeddings.
- HGNN-AC [19]: This model transforms a heterogeneous graph into multiple meta-path-defined graphs, learning node representations through convolution and completing attributes for nodes lacking them via weighted aggregation.
- HGSL [20]: HGSL concurrently learns the structure of a heterogeneous graph and the parameters of GNNs for classification tasks.
- Megnn [24]: Megnn merges different bipartite sub-graphs tied to edge types into a novel trainable graph structure using heterogeneous convolution.
5.2. Multi-Label Classification
5.3. Node Clustering
5.4. Visualization
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
1. Perozzi, B.; Al-Rfou, R.; Skiena, S. DeepWalk: Online learning of social representations. In Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, New York, NY, USA, 24–27 August 2014; pp. 701–710.
2. Kipf, T.N.; Welling, M. Semi-Supervised Classification with Graph Convolutional Networks. arXiv 2016, arXiv:1609.02907.
3. Veličković, P.; Cucurull, G.; Casanova, A.; Romero, A.; Liò, P.; Bengio, Y. Graph attention networks. arXiv 2017, arXiv:1710.10903.
4. Hamilton, W.; Ying, Z.; Leskovec, J. Inductive representation learning on large graphs. In Proceedings of the Advances in Neural Information Processing Systems, Long Beach, CA, USA, 4–9 December 2017; pp. 1024–1034.
5. Chang, S.; Han, W.; Tang, J.; Qi, G.J.; Aggarwal, C.C.; Huang, T.S. Heterogeneous Network Embedding via Deep Architectures. In Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, New York, NY, USA, 10–13 August 2015; pp. 119–128.
6. Shang, J.; Qu, M.; Liu, J.; Kaplan, L.M.; Peng, J. Meta-Path Guided Embedding for Similarity Search in Large-Scale Heterogeneous Information Networks. arXiv 2016, arXiv:1610.09769.
7. Dong, Y.; Chawla, N.V.; Swami, A. metapath2vec: Scalable representation learning for heterogeneous networks. In Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Halifax, NS, Canada, 13–17 August 2017; pp. 135–144.
8. Wang, X.; Ji, H.; Shi, C.; Wang, B.; Ye, Y.; Cui, P.; Yu, P.S. Heterogeneous graph attention network. In Proceedings of the World Wide Web Conference, San Francisco, CA, USA, 13–17 May 2019; pp. 2022–2032.
9. Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is all you need. In Proceedings of the Advances in Neural Information Processing Systems, Long Beach, CA, USA, 4–9 December 2017; pp. 5998–6008.
10. Li, Y.; Tarlow, D.; Brockschmidt, M.; Zemel, R. Gated graph sequence neural networks. arXiv 2015, arXiv:1511.05493.
11. Zhang, J.; Shi, X.; Xie, J.; Ma, H.; King, I.; Yeung, D.Y. GaAN: Gated attention networks for learning on large and spatiotemporal graphs. arXiv 2018, arXiv:1803.07294.
12. Mikolov, T.; Chen, K.; Corrado, G.; Dean, J. Efficient estimation of word representations in vector space. arXiv 2013, arXiv:1301.3781.
13. Shi, Y.; Gui, H.; Zhu, Q.; Kaplan, L.; Han, J. AspEm: Embedding learning by aspects in heterogeneous information networks. In Proceedings of the 2018 SIAM International Conference on Data Mining, San Diego, CA, USA, 3–5 May 2018; pp. 144–152.
14. Shi, C.; Hu, B.; Zhao, W.X.; Yu, P.S. Heterogeneous information network embedding for recommendation. IEEE Trans. Knowl. Data Eng. 2018, 31, 357–370.
15. Zhang, C.; Song, D.; Huang, C.; Swami, A.; Chawla, N.V. Heterogeneous graph neural network. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, Anchorage, AK, USA, 4–8 August 2019; pp. 793–803.
16. Hu, Z.; Dong, Y.; Wang, K.; Sun, Y. Heterogeneous graph transformer. In Proceedings of the Web Conference 2020, Taipei, Taiwan, 20–24 April 2020; pp. 2704–2710.
17. Zhao, J.; Wang, X.; Shi, C.; Liu, Z.; Ye, Y. Network Schema Preserved Heterogeneous Information Network Embedding. In Proceedings of the 29th International Joint Conference on Artificial Intelligence (IJCAI), Virtual, 7–15 January 2020.
18. Lu, Y.; Shi, C.; Hu, L.; Liu, Z. Relation structure-aware heterogeneous information network embedding. In Proceedings of the AAAI Conference on Artificial Intelligence, Honolulu, HI, USA, 29–31 January 2019; Volume 33, pp. 4456–4463.
19. Jin, D.; Huo, C.; Liang, C.; Yang, L. Heterogeneous Graph Neural Network via Attribute Completion. In Proceedings of the WWW ’21: The Web Conference 2021, Ljubljana, Slovenia, 19–23 April 2021.
20. Zhao, J.; Wang, X.; Shi, C.; Hu, B.; Song, G.; Ye, Y. Heterogeneous Graph Structure Learning for Graph Neural Networks. In Proceedings of the AAAI Conference on Artificial Intelligence, Virtual, 2–9 February 2021.
21. Jiang, X.; Jia, T.; Fang, Y.; Shi, C.; Lin, Z.; Wang, H. Pre-training on Large-Scale Heterogeneous Graph. In Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining, Virtual, 14–18 August 2021.
22. Wang, X.; Liu, N.; Han, H.; Shi, C. Self-supervised heterogeneous graph neural network with co-contrastive learning. In Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining, Virtual, 14–18 August 2021; pp. 1726–1736.
23. Yang, Y.; Guan, Z.; Li, J.; Zhao, W.; Cui, J.; Wang, Q. Interpretable and efficient heterogeneous graph convolutional network. IEEE Trans. Knowl. Data Eng. 2021, 35, 1637–1650.
24. Chang, Y.; Chen, C.; Hu, W.; Zheng, Z.; Zhou, X.; Chen, S. Megnn: Meta-path extracted graph neural network for heterogeneous graph representation learning. Knowl.-Based Syst. 2022, 235, 107611.
25. Yang, X.; Yan, M.; Pan, S.; Ye, X.; Fan, D. Simple and efficient heterogeneous graph neural network. In Proceedings of the AAAI Conference on Artificial Intelligence, Washington, DC, USA, 7–14 February 2023; Volume 37, pp. 10816–10824.
26. Tian, Y.; Dong, K.; Zhang, C.; Zhang, C.; Chawla, N.V. Heterogeneous graph masked autoencoders. In Proceedings of the AAAI Conference on Artificial Intelligence, Washington, DC, USA, 7–14 February 2023; Volume 37, pp. 9997–10005.
27. Cen, Y.; Zou, X.; Zhang, J.; Yang, H.; Zhou, J.; Tang, J. Representation Learning for Attributed Multiplex Heterogeneous Network. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, Anchorage, AK, USA, 4–8 August 2019; pp. 1358–1368.
28. Liu, M.; Liu, J.; Chen, Y.; Wang, M.; Chen, H.; Zheng, Q. AHNG: Representation learning on attributed heterogeneous network. Inf. Fusion 2019, 50, 221–230.
29. Wang, F.; Li, T.; Wang, X.; Zhu, S.; Ding, C. Community discovery using nonnegative matrix factorization. Data Min. Knowl. Discov. 2011, 22, 493–521.
30. Wang, X.; Cui, P.; Wang, J.; Pei, J.; Zhu, W.; Yang, S. Community preserving network embedding. In Proceedings of the AAAI Conference on Artificial Intelligence, San Francisco, CA, USA, 4–9 February 2017; Volume 31.
31. Wan, H.; Zhang, Y.; Zhang, J.; Tang, J. AMiner: Search and Mining of Academic Social Networks. Data Intell. 2019, 1, 58–76.
32. Fu, T.Y.; Lee, W.C.; Lei, Z. HIN2Vec: Explore meta-paths in heterogeneous information networks for representation learning. In Proceedings of the ACM on Conference on Information and Knowledge Management, Singapore, 6–10 November 2017; pp. 1797–1806.
33. Fu, X.; Zhang, J.; Meng, Z.; King, I. MAGNN: Metapath Aggregated Graph Neural Network for Heterogeneous Graph Embedding. In Proceedings of the Web Conference 2020, Taipei, Taiwan, 20–24 April 2020; pp. 2331–2341.
34. Yun, S.; Jeong, M.; Kim, R.; Kang, J.; Kim, H.J. Graph transformer networks. In Proceedings of the Advances in Neural Information Processing Systems, Vancouver, BC, Canada, 8–14 December 2019; pp. 11983–11993.
35. Shi, C.; Kong, X.; Huang, Y.; Yu, P.S.; Wu, B. HeteSim: A general framework for relevance measure in heterogeneous networks. IEEE Trans. Knowl. Data Eng. 2014, 26, 2479–2492.
36. van der Maaten, L.; Hinton, G. Visualizing data using t-SNE. J. Mach. Learn. Res. 2008, 9, 2579–2605.
Method | Structure Heterogeneity | Attribute Heterogeneity | Predefined Meta-Paths |
---|---|---|---|
Metapath2Vec [7] | Yes | No | Yes |
HAN [8] | Yes | No | Yes |
HetGNN [15] | Yes | Yes | Yes |
HGT [16] | Yes | Yes | Yes |
HGCN [23] | Yes | Yes | Yes |
HGMAE [26] | Yes | Yes | Yes |
NAGNE | Yes | Yes | No |
Method | DBLP Micro-F1 | DBLP Macro-F1 | IMDB Micro-F1 | IMDB Macro-F1 | ACM Micro-F1 | ACM Macro-F1 | Aminer Micro-F1 | Aminer Macro-F1 |
---|---|---|---|---|---|---|---|---|
DeepWalk | 90.21 | 89.52 | 56.53 | 55.34 | 82.22 | 81.77 | 83.23 | 82.91 |
Metapath2Vec | 92.50 | 91.87 | 52.38 | 51.15 | 83.56 | 82.31 | 84.43 | 83.65 |
HIN2Vec | 84.77 | 83.45 | 49.57 | 48.71 | 55.37 | 49.18 | 78.76 | 77.45 |
MAGNN | 92.92 | 92.50 | 59.44 | 58.63 | 83.42 | 82.80 | 84.65 | 83.92 |
HAN | 93.31 | 92.22 | 58.35 | 54.22 | 82.86 | 81.67 | 84.27 | 83.81 |
GTN | 94.16 | 93.63 | 61.24 | 59.45 | 81.23 | 80.53 | 85.31 | 84.87 |
HGNN-AC | 93.53 | 92.57 | 60.96 | 59.27 | 80.33 | 79.66 | 84.54 | 83.48 |
HGSL | 93.07 | 92.56 | 60.05 | 59.25 | 80.22 | 79.68 | 84.08 | 83.45 |
Megnn | 93.35 | 92.64 | 60.85 | 58.91 | 82.30 | 81.92 | 84.37 | 83.52 |
NAGNE | 95.02 | 94.32 | 62.42 | 60.38 | 83.86 | 82.61 | 85.72 | 85.13 |
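The two metrics reported in the table above can be made concrete with a short sketch. Micro-F1 pools true positives, false positives and false negatives over all classes before computing F1, while Macro-F1 averages the per-class F1 scores so that small classes count as much as large ones. The function name `micro_macro_f1` and the toy labels are illustrative assumptions, and the sketch covers the single-label case only; it returns fractions, whereas the table reports percentages.

```python
from collections import Counter


def micro_macro_f1(y_true, y_pred):
    """Return (micro_f1, macro_f1) for single-label predictions."""
    labels = sorted(set(y_true) | set(y_pred))
    if not labels:
        return 0.0, 0.0

    # Per-class counts of true positives, false positives, false negatives.
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class p, but it was wrong
            fn[t] += 1  # true class t was missed

    def f1(tp_c, fp_c, fn_c):
        denom = 2 * tp_c + fp_c + fn_c
        return 2 * tp_c / denom if denom else 0.0

    # Micro: pool the counts over classes, then compute a single F1.
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    # Macro: compute F1 per class, then take the unweighted mean.
    macro = sum(f1(tp[c], fp[c], fn[c]) for c in labels) / len(labels)
    return micro, macro


# Toy example with the three IMDB genre classes.
y_true = ["action", "action", "comedy", "drama", "drama", "drama"]
y_pred = ["action", "comedy", "comedy", "drama", "drama", "action"]
micro, macro = micro_macro_f1(y_true, y_pred)
print(f"Micro-F1 = {micro:.4f}, Macro-F1 = {macro:.4f}")
```

In the single-label setting Micro-F1 coincides with accuracy, so a gap between the two columns of a dataset indicates uneven performance across classes.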
Method | DBLP | IMDB | ACM | Aminer |
---|---|---|---|---|
DeepWalk | 66.25 | 0.41 | 42.82 | 42.86 |
Metapath2Vec | 68.74 | 0.09 | 42.79 | 44.98 |
HIN2Vec | 65.84 | 0.04 | 42.21 | 44.86 |
MAGNN | 69.81 | 0.49 | 42.36 | 44.97 |
HAN | 69.12 | 0.67 | 41.56 | 44.41 |
GTN | 69.54 | 0.70 | 42.69 | 45.26 |
HGNN-AC | 68.31 | 0.64 | 40.59 | 43.16 |
HGSL | 67.24 | 0.59 | 41.81 | 44.43 |
Megnn | 66.34 | 0.47 | 40.95 | 43.59 |
NAGNE | 69.76 | 0.73 | 43.35 | 46.63 |
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Zhang, Z.; Xu, H.; Li, Y.; Zhai, Z.; Ding, Y. NAGNE: Node-to-Attribute Generation Network Embedding for Heterogeneous Network. Appl. Sci. 2024, 14, 1053. https://doi.org/10.3390/app14031053