1. Introduction
Complex real-world systems can usually be modeled as networks, with nodes representing the entities in the system and links representing the relationships between them. Link prediction is the task of predicting whether two nodes in a network are likely to have a link [1]. It is used in diverse applications such as friend suggestion [2], recommendation systems [3,4], biological networks [5] and knowledge graph completion [6].
Existing link prediction methods can be classified into similarity-based methods and learning-based methods. Similarity-based methods assume that the more similar two nodes are, the more likely a link exists between them [7,8]. They define a function over network information, such as network topology or node attributes, to compute the similarity between nodes, and then use that similarity to estimate the likelihood of a link. The prediction accuracy largely depends on how well the network structure features are selected. Learning-based methods construct a model that extracts various features from the given network, train the model with the existing information, and finally use the trained model to predict whether links exist between nodes.
In early works, heuristic scores were used to measure the proximity or connectivity of two nodes for link prediction [1,9]. Popular heuristic methods include similarity methods based on local information and similarity methods based on paths. Methods based on local information consider only the local structure, such as the common neighbors of two nodes, as a measure of similarity [1,10]. Methods based on path similarity use the global structural information of the network, including paths [11,12] and communities [13], as the basis for node similarity. However, structure-based methods rely entirely on the topology of the given network. Furthermore, they are not always reliable: different networks may have distinct clusterings and path lengths yet the same degree distribution [14]. Consequently, their performance varies from network to network, and they cannot effectively capture the underlying topological relationships between nodes.
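The local-information heuristics described above can be illustrated with a minimal Python sketch. It scores unlinked node pairs by their number of common neighbors and by the Jaccard coefficient, then ranks them. The toy graph, the function names and the adjacency-set representation are assumptions made for illustration; they are not taken from the cited works.

```python
from itertools import combinations


def common_neighbors_score(adj, u, v):
    """Number of neighbors shared by u and v."""
    return len(adj[u] & adj[v])


def jaccard_score(adj, u, v):
    """Shared neighbors divided by the size of the combined neighborhood."""
    union = adj[u] | adj[v]
    return len(adj[u] & adj[v]) / len(union) if union else 0.0


def rank_candidate_links(adj, score):
    """Rank all currently unlinked pairs, highest score first."""
    candidates = [
        (u, v) for u, v in combinations(sorted(adj), 2) if v not in adj[u]
    ]
    return sorted(candidates, key=lambda pair: score(adj, *pair), reverse=True)


# Hypothetical undirected toy graph stored as adjacency sets.
toy_adj = {
    "a": {"b", "c"},
    "b": {"a", "c", "d"},
    "c": {"a", "b", "d"},
    "d": {"b", "c", "e"},
    "e": {"d"},
}

ranking = rank_candidate_links(toy_adj, common_neighbors_score)
print(ranking[0])  # the unlinked pair with the most common neighbors
```

Both scores depend only on the immediate neighborhoods of the two nodes, which is why such heuristics are cheap to compute but blind to structure further away.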
In recent years, an increasing number of learning-based algorithms have been proposed. These methods extract the features of the network by constructing a model and then predict links with that model. Learning-based methods can be divided into two categories: methods based on shallow neural networks and methods based on deep neural networks. DeepWalk [15] is an algorithm based on a shallow neural network. It treats the path of a random walk as a sentence and learns the representation vectors of nodes through the skip-gram model [16], which greatly improves performance on network analysis tasks. The LINE (Large-scale Information Network Embedding) [17] algorithm, also based on a shallow neural network, and the node2vec [16] algorithm, a modification of DeepWalk, followed. Because shallow models do not capture highly non-linear network structure and therefore yield sub-optimal network representations, SDNE (Structural Deep Network Embedding) [18], a semi-supervised deep model composed of multiple layers of non-linear functions, was proposed to capture this structure. However, these two kinds of methods exploit only latent features and cannot effectively capture the structural similarity of links [19].
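To make the "random walk as a sentence" idea concrete, the following Python sketch generates truncated uniform random walks and the (center, context) pairs that a skip-gram model would be trained on. It is a simplified illustration rather than the DeepWalk implementation: the toy graph, function names, walk length and window size are all assumptions, and the embedding training step itself is omitted.

```python
import random


def random_walk(adj, start, length, rng):
    """Uniform random walk of at most `length` nodes starting at `start`."""
    walk = [start]
    while len(walk) < length:
        neighbors = sorted(adj[walk[-1]])
        if not neighbors:  # dead end: stop the walk early
            break
        walk.append(rng.choice(neighbors))
    return walk


def build_walk_corpus(adj, walks_per_node, length, seed=0):
    """Start `walks_per_node` walks from every node; each walk is one 'sentence'."""
    rng = random.Random(seed)
    nodes = sorted(adj)
    corpus = []
    for _ in range(walks_per_node):
        rng.shuffle(nodes)
        for node in nodes:
            corpus.append(random_walk(adj, node, length, rng))
    return corpus


def skipgram_pairs(walk, window):
    """(center, context) pairs within `window` positions of each other."""
    pairs = []
    for i, center in enumerate(walk):
        lo, hi = max(0, i - window), min(len(walk), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, walk[j]))
    return pairs


# Hypothetical undirected toy graph stored as adjacency sets.
walk_adj = {
    "a": {"b", "c"},
    "b": {"a", "c", "d"},
    "c": {"a", "b", "d"},
    "d": {"b", "c", "e"},
    "e": {"d"},
}

corpus = build_walk_corpus(walk_adj, walks_per_node=2, length=5)
training_pairs = [p for walk in corpus for p in skipgram_pairs(walk, window=2)]
```

Nodes that co-occur often within a window end up with similar embeddings, which is how such methods encode latent proximity without using explicit structural features of the link itself.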
Deep neural networks have made great progress in image classification, object detection and recognition in recent years because of their powerful feature learning and representation abilities [20]. However, deep neural networks suffer from the vanishing gradient problem as their depth increases [21]. The residual network [21], presented at CVPR 2016 (IEEE Conference on Computer Vision and Pattern Recognition), not only achieved good results in image recognition and object detection tasks but also solved the degradation of learning ability caused by deepening the network [21]. The residual network allows the number of layers to be increased further by introducing an identity mapping [22] into the original network structure. Adding such a shortcut connection essentially reduces the loss of information between network layers. The dense convolutional neural network [23] is an improved version of the residual network. By introducing more shortcut connections, it lets information flow more effectively through the network, thereby achieving better recognition and detection results. Therefore, this paper proposes a link prediction method based on deep convolutional neural networks to predict missing or unknown links in the network.
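The difference between the two kinds of shortcut connection can be written compactly. The notation below follows the usual presentation of residual and densely connected networks and is an assumption with respect to this paper: $x_\ell$ denotes the output of layer $\ell$, $H_\ell$ the non-linear transformation of that layer, and $[\cdot]$ channel-wise concatenation.

```latex
% Residual connection: the layer input is added to the layer output (identity mapping).
x_\ell = H_\ell(x_{\ell-1}) + x_{\ell-1}

% Dense connection: each layer receives the concatenated outputs of all preceding layers.
x_\ell = H_\ell\bigl([x_0, x_1, \ldots, x_{\ell-1}]\bigr)
```

In the residual form, earlier features reach later layers only through summation, whereas in the dense form every earlier feature map is passed on unchanged, which is the sense in which more shortcut connections improve information flow.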
To address the above-mentioned problems, we propose a link prediction method based on deep convolutional neural networks. The main contributions of our work can be summarized as follows:
- To solve the link prediction problem, we transform it into a binary classification problem and construct a deep convolutional neural network model to solve it.
- Because heuristic methods can only use the network's topological structure and representation learning methods such as DeepWalk, LINE and node2vec can only use the latent features of the network, we propose a sub-graph extraction algorithm that better captures the information needed for link prediction. On this basis, a residual attention model is proposed, which can effectively learn link structure features from graph structure features.
- Through further research, we find that the residual attention mechanism may impede the information flow in the whole network. Therefore, a dense convolutional neural network model is proposed to improve link prediction performance.
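As an illustration of the binary classification framing in the first contribution, the sketch below builds a labeled sample set from a graph: observed links are positive examples (label 1) and an equal number of randomly sampled unlinked pairs are negative examples (label 0). This is a common setup and only an assumption here; the function name, the balanced sampling ratio and the toy edge list are hypothetical and do not describe the authors' actual procedure.

```python
import random
from itertools import combinations


def build_link_samples(edges, seed=0):
    """Return shuffled ((u, v), label) pairs: 1 for observed links, 0 for sampled non-links."""
    rng = random.Random(seed)
    positives = {tuple(sorted(edge)) for edge in edges}
    nodes = sorted({node for edge in positives for node in edge})
    # Enumerating all pairs is quadratic in the node count; fine for a toy graph only.
    non_edges = [pair for pair in combinations(nodes, 2) if pair not in positives]
    negatives = rng.sample(non_edges, min(len(positives), len(non_edges)))
    samples = [(pair, 1) for pair in sorted(positives)]
    samples += [(pair, 0) for pair in negatives]
    rng.shuffle(samples)
    return samples


# Hypothetical undirected toy graph given as an edge list.
toy_edges = [("a", "b"), ("a", "c"), ("b", "c"), ("b", "d"), ("c", "d"), ("d", "e")]
samples = build_link_samples(toy_edges)
```

A classifier would then be trained on features derived for each pair, for example a sub-graph around the two nodes, and its predicted probability for an unlinked pair serves as the link score.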
The remainder of the paper is organized as follows. In the next section, related work is presented. In Section 3, the preliminaries of this paper are introduced. In Section 4, the sub-graph extraction algorithm and the residual attention network are proposed. The performance evaluation results and discussion are summarized in Section 5, and concluding remarks are given in the last section.
Author Contributions
W.W. and Y.H. discussed and confirmed the idea; Y.H. and L.W. carried out the experiment and analyzed the data; L.W. wrote the paper; H.W. reviewed the paper; R.Z. carried out project administration.
Funding
This research was funded by The National Natural Science Foundation of China (61772562) and “The Fundamental Research Funds for the Central Universities”, South-Central University for Nationalities (CZY18014) and Innovative research program for graduates of South-Central University for Nationalities (2019sycxjj120).
Conflicts of Interest
The authors declare no conflict of interest.
References
- Liben-Nowell, D.; Kleinberg, J. The link-prediction problem for social networks. JASIST 2007, 58, 1019–1031.
- Aiello, L.M.; Barrat, A.; Schifanella, R.; Cattuto, C.; Markines, B.; Menczer, F. Friendship prediction and homophily in social media. ACM Trans. Web 2012, 6, 1–33.
- Tang, J.; Wu, S.; Sun, J.M.; Su, H. Cross-Domain Collaboration Recommendation. In Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Beijing, China, 12–16 August 2012; pp. 1285–1293.
- Akcora, C.G.; Carminati, B.; Ferrari, E. Network and Profile Based Measures for User Similarities on Social Networks. In Proceedings of the 2011 IEEE International Conference on Information Reuse & Integration, Las Vegas, NV, USA, 2–5 August 2011; pp. 292–298.
- Turki, T.; Wei, Z. A link prediction approach to cancer drug sensitivity prediction. J. BMC Syst. Biol. 2017, 11, 13–26.
- Nickel, M.; Murphy, K.; Tresp, V.; Gabrilovich, E. A Review of Relational Machine Learning for Knowledge Graphs; IEEE: New York, NY, USA, 2016; pp. 11–33.
- Ahn, M.W.; Jung, W.S. Accuracy test for link prediction in terms of similarity index: The case of WS and BA models. J. Phys. A 2015, 429, 3992–3997.
- Hoffman, M.; Steinley, D.; Brusco, M.J. A note on using the adjusted Rand index for link prediction in networks. J. Soc. Netw. 2015, 42, 72–79.
- Lv, L.; Zhou, T. Link prediction in complex networks: A survey. J. Phys. A 2011, 390, 1150–1170.
- Newman, M.E.J. Clustering and preferential attachment in growing networks. J. Phys. Rev. E Stat. Nonlinear Soft Matter Phys. 2001, 64, 025102.
- Chen, H.-H.; Liang, G.; Zhang, X.L.; Giles, C.L. Discovering Missing Links in Networks Using Vertex Similarity Measures. In Proceedings of the 27th Annual ACM Symposium on Applied Computing, Trento, Italy, 26–30 March 2012; pp. 138–143.
- Lichtenwalter, R.N.; Chawla, N.V. Vertex Collocation Profiles: Subgraph Counting for Link Analysis and Prediction. In Proceedings of the 21st International Conference on World Wide Web, Lyon, France, 16–20 April 2012; pp. 1019–1028.
- Li, R.-H.; Jeffrey, X.Y.; Liu, J. Link Prediction: The Power of Maximal Entropy Random Walk. In Proceedings of the 20th ACM International Conference on Information and Knowledge Management, Glasgow, UK, 24–28 October 2011; pp. 1147–1156.
- Shang, Y. Distinct Clusterings and Characteristic Path Lengths in Dynamic Small-World Networks with Identical Limit Degree Distribution. J. Stat. Phys. 2012, 149, 505–518.
- Perozzi, B.; Al-Rfou, R.; Skiena, S. DeepWalk: Online Learning of Social Representations. In Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, New York, NY, USA, 24–27 August 2014; pp. 701–710.
- Grover, A.; Leskovec, J. Node2vec: Scalable Feature Learning for Networks. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; pp. 855–864.
- Tang, J.; Qu, M.; Wang, M.Z.; Zhang, M.; Yan, J.; Mei, Q.Z. Line: Large-Scale Information Network Embedding. In Proceedings of the 24th International Conference on World Wide Web, Florence, Italy, 18–22 May 2015; pp. 1067–1077.
- Wang, D.X.; Cui, P.; Zhu, W.W. Structural Deep Network Embedding. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; pp. 1225–1234.
- Zhang, M.; Chen, Y. Link prediction Based on Graph Neural Networks. In Proceedings of the Thirty-Second Conference on Neural Information Processing Systems, Vancouver, BC, Canada, 2–8 December 2018.
- Hou, J.H.; Deng, Y.; Cheng, S.M.; Xiang, J. Visual Object Tracking Based on Deep Features and Correlation Filter. J. South Cent. Univ. Nat. 2018, 37, 67–73.
- He, K.M.; Zhang, X.Y.; Ren, S.Q.; Sun, J. Deep Residual Learning for Image Recognition. In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778.
- He, K.M.; Zhang, X.Y.; Ren, S.Q.; Sun, J. Identity Mappings in Deep Residual Networks. In Proceedings of the 14th European Conference on Computer Vision, Amsterdam, The Netherlands, 11–14 October 2016; pp. 630–645.
- Huang, G.; Liu, Z.; Maaten, L.V.D.; Weinberger, K.Q. Densely Connected Convolutional Networks. In Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 4700–4708.
- Wang, F.; Jiang, M.; Qian, C.; Yang, S.; Li, C.; Zhang, H.; Wang, X.; Tang, X. Residual Attention Network for Image Classification. In Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 6450–6458.
- Tu, C.; Yang, C.; Liu, Z.; Sun, M. Network representation learning: An overview. J. Sci. Sin. 2017, 47, 980–996.
- Qi, J.S.; Liang, X.; Li, Z.Y.; Chen, Y.F.; Xu, Y. Representation learning of large-scale complex information network: Concepts, methods and challenges. Chin. J. Comput. 2018, 41, 2394–2420.
- Shang, Y. Subgraph Robustness of Complex Networks under Attacks. J. IEEE Trans. Syst. Man Cybern. Syst. 2019, 49, 821–833.
- Adamic, L.A.; Adar, E. Friends and neighbors on the web. J. Soc. Netw. 2003, 25, 211–230.
- Barabási, A.L.; Albert, R. Emergence of scaling in random networks. J. Sci. 1999, 286, 509–512.
- Lecun, Y.; Bengio, Y.; Hinton, G. Deep learning. Nature 2015, 521, 436–444.
- Li, Y.; Bu, R.; Sun, M.; Wu, W.; Di, X.; Chen, B. PointCNN: Convolution on χ–Transformed Points. Available online: https://arxiv.org/abs/1801.07791 (accessed on 10 October 2018).
- Li, Y.; Shang, Y.; Yang, Y. Clustering coefficients of large networks. J. Inf. Sci. 2017, 382–383, 350–358.
© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).