A Link Prediction Method Based on Neural Networks

Abstract: Link prediction to optimize network performance is of great significance in network evolution. Because of the complexity of network systems and the uncertainty of network evolution, it faces many challenges. This paper proposes a new link prediction method based on neural networks, trained with scale-free networks as input data and the optimized networks produced by link prediction models as output data. To counter the influence of the neural network's generalization on the experiments, a greedy link pruning strategy is applied. We consider network efficiency and the proposed global network structure reliability as objectives to comprehensively evaluate link prediction performance and the advantages of the neural network method. The experimental results demonstrate that the neural network method generates optimized networks with better network efficiency and global network structure reliability than the traditional link prediction models.


Introduction
A network is a particular way of expressing the relationships between systems. In recent years, network systems have attracted increasing attention from various disciplines in modeling, network topology and structure optimization [1]. Network systems are common in reliability engineering practice and in daily life: communication networks, computer networks, circuit networks, traffic networks, urban water supply networks, power supply networks, etc. These real networks evolve constantly during operation to sustain the development of their fields. Therefore, a series of network evolution models and algorithms aimed at optimizing network structure have emerged [2].
As an essential mechanism for network evolution, link prediction has received widespread attention since it was proposed and has been applied in many fields [3]. At present, research on link prediction mainly focuses on evaluating the accuracy and precision of predicting missing links. For many real networks, however, network evolution aims to extend the network topology to ensure the long-term development and regular operation of the network. Therefore, it is necessary to study link prediction with the purpose of improving reliability in network evolution.
However, the structural complexity and diversity of real networks make link prediction aimed at improving reliability difficult. In recent years, various disciplines have turned to intelligent methods. Artificial intelligence has been at the forefront of science and technology worldwide because it improves productivity and reduces the difficulty of work; it has influenced all fields profoundly and promoted the evolution of society. Applying artificial intelligence to networks is therefore a general trend. Among artificial intelligence methods, the neural network has become representative because of its powerful self-learning capability and its ability to find optimal solutions. It provides strong support for link prediction aimed at improving reliability.

Literature Review
In network evolution, the primary methods to improve reliability include adding links, reconnecting links and adding nodes, all of which change the network topology. At present, the most widely used method is link prediction based on adding edges [4]. Barabási and Albert [5] regarded the node degree as the basis of link possibility. Newman [6] introduced the concept of the common neighbor to link prediction for the first time, which triggered a boom in the field. Ma et al. [7] proposed a local friend recommendation index that facilitates link prediction through a typical structural property: in many real networks, nodes preferentially link to nodes with a weak clique structure. By considering both the nodes' types of effects and structural similarities, Fan et al. [8] proposed a combined link prediction index and studied its robustness. Gao et al. [9] proposed a link prediction algorithm suitable for bipartite networks. Aghabozorgi and Khayyambashi [10] tested a similarity measure based on structural units through a supervised learning experiment framework, and found that the relations built through the paths between the endpoints determine the influence of new links. Yang et al. [11] proposed a novel link prediction index, significant influence, which was modeled by distinguishing strong influences from weak ones. Pech et al. [12] assumed that the likelihood of a link between two nodes can be expressed as a linear summation of the neighbor nodes' contributions; as a result, they obtained an analytical solution of the optimal likelihood matrix and better performance. Balls-Barker and Webb [13] proposed a new method to predict the formation of unobserved links in real-world networks, relying on the theory of isospectral matrix reduction to calculate the probability of eventually transitioning from one vertex to another in a random walk on the network. Li et al. [14] combined four similarity indexes, namely Common Neighbors, Leicht-Holme-Newman, Cosine based on the Laplacian matrix, and Matrix Forest, using the logistic regression algorithm and the XGBoost algorithm, and introduced the idea of stacking into the link prediction of complex networks. Bai et al. [15] constructed a new model that regards link prediction in multiplex networks as a multi-attribute decision-making problem, in which the potential links in the target layer are the alternatives, the layers are viewed as attributes, and the similarity score of a potential link in each layer is an attribute value.
Artificial intelligence methods, a hot research topic in recent years, are also used in link prediction. Liu et al. [16] argued that different common neighbors play different roles and therefore make different contributions, and proposed a local naive

Problem Description and Formula
Link prediction refers to predicting, from the known nodes and structural information, the possibility of a link between two nodes that are not yet connected in the network [37]. This paper studies the undirected scale-free network $G(V, L)$, where $V$ and $L$ denote the sets of nodes and links, respectively. In this section, we introduce link prediction models and evaluation indexes for undirected scale-free networks. A link prediction method provides a score $S$ that estimates the probability of a link existing between a pair of unconnected nodes. For example, $S^{CN}_{xy}$ is the score of a possible link between two nodes $x$ and $y$ calculated by the Common Neighbors (CN) model. Similarly, $S^{RA}_{xy}$ is the score given by the Resource Allocation (RA) model, and $S^{MF}$ is the matrix of scores of all links obtained using the Matrix Forest (MF) model. All links absent from the original network are sorted in decreasing order of score, and the links at the top are the most likely to exist.

Link Prediction Models
There exist many methods applied in the link prediction field. Some of the well-known techniques [38], including local link prediction methods and global link prediction methods, are selected as the necessary link prediction models to compare with the neural network methods.
Common Neighbors (CN) [39] takes the number of common neighbors as the score that determines whether a link can be generated between two nodes. The more common neighbors two nodes share, the greater the probability that they are connected by a link. For nodes $x$ and $y$, the sets of their neighbors are $\Gamma(x)$ and $\Gamma(y)$, and their common-neighbor set is $\Gamma(x) \cap \Gamma(y)$. CN is defined as:
$$S^{CN}_{xy} = |\Gamma(x) \cap \Gamma(y)| \tag{1}$$
Resource Allocation (RA) [40] measures the possibility of a link through the degrees of the common neighbors. Here, $z$ represents a common neighbor of nodes $x$ and $y$, and $k_z$ is the degree of node $z$. The expression of RA is:
$$S^{RA}_{xy} = \sum_{z \in \Gamma(x) \cap \Gamma(y)} \frac{1}{k_z} \tag{2}$$
The Jaccard Coefficient (JC) [41], proposed by Jaccard, considers both the neighbor nodes and the common neighbors of two nodes. It is defined as:
$$S^{JC}_{xy} = \frac{|\Gamma(x) \cap \Gamma(y)|}{|\Gamma(x) \cup \Gamma(y)|} \tag{3}$$
Hub Promoted (HP) [42] was proposed for quantifying the topological overlap of pairs of substrates in metabolic networks, and is defined as:
$$S^{HP}_{xy} = \frac{|\Gamma(x) \cap \Gamma(y)|}{\min\{k_x, k_y\}} \tag{4}$$
Leicht-Holme-Newman (LHN) [43] considers the number of common neighbors and the degrees of the two nodes. The expression of LHN is:
$$S^{LHN}_{xy} = \frac{|\Gamma(x) \cap \Gamma(y)|}{k_x \cdot k_y} \tag{5}$$
Average Commute Time (ACT) [44] assumes that the shorter the average commute time between two nodes, the more similar they are. The possibility of a link between nodes $x$ and $y$ can then be defined as:
$$S^{ACT}_{xy} = \frac{1}{l^+_{xx} + l^+_{yy} - 2 l^+_{xy}} \tag{6}$$
where $l^+_{xy}$ denotes the corresponding entry of the pseudoinverse of the Laplacian matrix, $L^+$ ($L = D - A$) [45].
Cosine Time (CT) [45] is an inner-product-based measure built on $L^+$. It is defined as the cosine of the node vectors:
$$S^{CT}_{xy} = \cos(x, y)^+ = \frac{l^+_{xy}}{\sqrt{l^+_{xx} \cdot l^+_{yy}}} \tag{7}$$
Matrix Forest (MF) [46] interprets the similarity between $x$ and $y$ as the ratio of the number of spanning rooted forests in which nodes $x$ and $y$ belong to the same tree rooted at $x$ to the number of all spanning rooted forests of the network. MF is defined as:
$$S^{MF} = (I + L)^{-1} \tag{8}$$
where $I$ is the identity matrix.
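The local indexes above depend only on neighbor sets and degrees, so they can be illustrated in a few lines. The following is a minimal sketch, not the implementation used in this paper; the function names (`neighbor_sets`, `local_scores`, `rank_missing_links`) and the convention of returning 0 when a denominator vanishes are assumptions made for illustration.

```python
from itertools import combinations


def neighbor_sets(n_nodes, links):
    """Build the neighbor set of every node from an undirected link list."""
    gamma = {x: set() for x in range(1, n_nodes + 1)}
    for x, y in links:
        gamma[x].add(y)
        gamma[y].add(x)
    return gamma


def local_scores(gamma, x, y):
    """Return the CN, RA, JC, HP and LHN scores of the node pair (x, y)."""
    common = gamma[x] & gamma[y]
    union = gamma[x] | gamma[y]
    kx, ky = len(gamma[x]), len(gamma[y])
    cn = len(common)
    return {
        "CN": cn,
        "RA": sum(1.0 / len(gamma[z]) for z in common),
        # Zero-denominator cases return 0.0 (an assumed convention).
        "JC": cn / len(union) if union else 0.0,
        "HP": cn / min(kx, ky) if min(kx, ky) > 0 else 0.0,
        "LHN": cn / (kx * ky) if kx * ky > 0 else 0.0,
    }


def rank_missing_links(n_nodes, links, index="CN"):
    """Sort all unconnected node pairs by decreasing score of the chosen index."""
    gamma = neighbor_sets(n_nodes, links)
    existing = {tuple(sorted(link)) for link in links}
    candidates = [p for p in combinations(range(1, n_nodes + 1), 2)
                  if p not in existing]
    return sorted(candidates,
                  key=lambda p: local_scores(gamma, *p)[index],
                  reverse=True)
```

Calling `rank_missing_links(n_nodes, links, index="RA")` returns the unconnected pairs in the order in which a model would add them; the global indexes (ACT, CT, MF) would additionally need a matrix pseudoinverse or inverse and are left out of this sketch.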

Network Efficiency (E)
At present, there is no clear direct definition or measurement of network reliability, so network efficiency is usually used to evaluate it indirectly. For nodes $x$ and $y$ in a network, the efficiency is the reciprocal of the shortest distance between them [47]. For an undirected network, the network efficiency is expressed as:
$$E = \frac{1}{N(N-1)} \sum_{x \neq y} \frac{1}{d_{xy}} \tag{9}$$
where $d_{xy}$ is the shortest distance between nodes $x$ and $y$, and $N$ is the number of nodes in the network. $E$ reflects the overall connectivity across the whole network. The greater the value, the better the network structure.
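As an illustration of this definition, the sketch below computes the network efficiency of an unweighted, undirected network with a breadth-first search from every node. It is a sketch under stated assumptions: the function names are hypothetical, nodes are labeled 1..N, and unreachable pairs are assumed to contribute 0 (that is, an infinite distance), a case the text does not discuss.

```python
from collections import deque


def bfs_distances(adj, source):
    """Hop distances from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def network_efficiency(n_nodes, links):
    """Average of 1/d_xy over all ordered node pairs x != y."""
    adj = {x: set() for x in range(1, n_nodes + 1)}
    for x, y in links:
        adj[x].add(y)
        adj[y].add(x)
    total = 0.0
    for x in adj:
        for y, d in bfs_distances(adj, x).items():
            if y != x:
                total += 1.0 / d  # unreachable pairs never appear, so they add 0
    return total / (n_nodes * (n_nodes - 1))
```

For a three-node path 1-2-3, the two adjacent pairs contribute 1 each and the end pair contributes 1/2, so the efficiency is (2 · 2.5) / 6 = 5/6; a triangle reaches the maximum value of 1.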

Global Network Structure Reliability (P)
Gupta and Kundu [48] first proposed the generalized exponential distribution of the reliability index and applied it to machines, components and other equipment [49]. Its form is given in Equation (10), where $\lambda$ is the failure rate. The failure rate of a system is difficult to obtain, so the formula evolves into Equation (11), where $\lambda$ is the reliability factor and $C$ is the equivalent system cost of destructions after achieving the object.
Appl. Sci. 2021, 11, x FOR PEER REVIEW 5 of 18
In research, many complex systems are regarded as networks. Therefore, we adapt these expressions to calculate the global network structure reliability, which comprehensively characterizes the degree of optimization in the process of network evolution. Generally, the parameter $\lambda$ is related to the knowledge level, the computing resources, human resources, time and other measurable factors [50]. Similarly, the critical network structural characteristic indexes include the average shortest path length, the betweenness centrality structural entropy and the average clustering coefficient [51,52]. Therefore, we use a weighted summation of these indexes to describe the parameter $\lambda$. One of the most straightforward indexes for measuring a network is the degree. We consider that the average degree ($K = \frac{1}{N}\sum_{x=1}^{N} k_x$) has the same effect as $C$ in calculating the global network structure reliability, so the global network structure reliability can be expressed in the following form:
where $L$ is the average shortest path length, $E_B$ is the betweenness centrality structural entropy, $C$ is the average clustering coefficient, and $\varepsilon_1$, $\varepsilon_2$, $\varepsilon_3$ are the weights of these parameters, with $\varepsilon_1 + \varepsilon_2 + \varepsilon_3 = 1$. Here, we assume that $\varepsilon_1 = \varepsilon_2 = \varepsilon_3 = 1/3$ in a general scale-free network. These indexes are described and defined below.
A. Average Shortest Path Length ($L$). The average shortest path length is an important feature measure in network analysis; it is the average distance over all node pairs. For an undirected network, it is defined as follows:
$$L = \frac{1}{N(N-1)} \sum_{x \neq y} d_{xy}$$
The smaller $L$ is, the greater the connection intensity between nodes. To align $L$ with the other indexes, we take its reciprocal, $L = 1/L$.
B. Betweenness Centrality Structural Entropy ($E_B$). Node betweenness centrality characterizes the importance of nodes, and network structure entropy is introduced to characterize the dispersion of the nodes' importance [51]. Let $d_{yk}$ denote the total number of shortest paths from node $y$ to node $k$, and let $d_{yk}(x)$ be the number of those shortest paths that pass through node $x$. For a single shortest path, $d_{yk}(x) = 1$ if the path goes through node $x$, and $d_{yk}(x) = 0$ otherwise. The betweenness centrality of node $x$ is $B_x$:
$$B_x = \sum_{y \neq x \neq k} \frac{d_{yk}(x)}{d_{yk}}$$
The betweenness centrality structural entropy is expressed as:
C. Average Clustering Coefficient ($C$). The clustering coefficient is a characteristic index that describes the tightness of the network [52]. A node with a larger clustering coefficient is more closely connected to its neighbor nodes in the network. The clustering coefficient of node $x$ ($C_x$) is:
$$C_x = \frac{2 E_x}{k_x (k_x - 1)}$$
Therefore, the network's average clustering coefficient is:
$$C = \frac{1}{N} \sum_{x=1}^{N} C_x$$
Here, $E_x$ is the number of edges among the neighbor nodes of node $x$, and $k_x$ is the degree of node $x$.
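Two of these indexes, the average shortest path length and the average clustering coefficient, can be sketched with the standard library alone. The code below is illustrative rather than the authors' implementation: the function names are hypothetical, the network is assumed to be connected when computing $L$, and nodes with fewer than two neighbors are assumed to have a clustering coefficient of 0.

```python
from collections import deque


def build_adjacency(n_nodes, links):
    """Adjacency sets for nodes labeled 1..n_nodes."""
    adj = {x: set() for x in range(1, n_nodes + 1)}
    for x, y in links:
        adj[x].add(y)
        adj[y].add(x)
    return adj


def average_shortest_path_length(adj):
    """Mean hop distance over all ordered node pairs (connected network assumed)."""
    n = len(adj)
    total = 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
    return total / (n * (n - 1))


def average_clustering(adj):
    """Mean of C_x = 2 * E_x / (k_x * (k_x - 1)); C_x = 0 when k_x < 2 (assumed)."""
    total = 0.0
    for x, nbrs in adj.items():
        k = len(nbrs)
        if k < 2:
            continue
        # E_x: links that exist among the neighbors of x, each counted once.
        e_x = sum(1 for u in nbrs for v in nbrs if u < v and v in adj[u])
        total += 2.0 * e_x / (k * (k - 1))
    return total / len(adj)
```

For a triangle with one pendant node, links (1,2), (2,3), (1,3), (3,4), the pairwise distances sum to 8 over 6 unordered pairs, giving $L = 4/3$, and the node coefficients are 1, 1, 1/3 and 0, giving $C = 7/12$.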

Neural Networks (NN)
Neural networks have many successful applications in solving problems related to prediction, classification and regression [53]. A neural network has three basic kinds of layers: the input layer, the hidden layer and the output layer, as shown in Figure 1. The data fed to a neural network usually take the form of arrays, whereas a network is usually expressed as an adjacency matrix. Using the adjacency matrix directly as the input is impractical, because it would require a dimension size that can represent $2^{N^2}$ adjacency matrices, where $N$ is the number of nodes in the network. Here, a network representation called the link list (LL) is applied: it contains binary elements that indicate, for each of the $\binom{N}{2}$ possible node pairs, whether the link exists. For an undirected network with N = 5, the possible links are L = {(1, 2), (1, 3), (1, 4), (1, 5), (2, 3), (2, 4), (2, 5), (3, 4), (3, 5), (4, 5)}. If four of the links exist, L = {(1, 4), (2, 3), (2, 4), (3, 5)}, then LL = [0 0 1 0 1 1 0 0 1 0]. In this way, the network's adjacency matrix is transformed into an array, and link prediction can be performed by the neural network method.
The neural network is trained with the original scale-free networks as the input data and the networks optimized by link prediction models as the desired output data. Specifically, we first generate scale-free networks with specified requirements as original data, and the link prediction models produce the optimized networks. Then, the adjacency matrices of the networks are transformed into link lists. In training, the input is the link list of an original scale-free network and the output is the link list of the optimized network generated by the link prediction models. This solves the problem that neural networks cannot be trained on network data directly.
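The link list conversion described above can be sketched as follows. The helper names (`possible_links`, `to_link_list`, `from_link_list`) are hypothetical, and the lexicographic ordering of node pairs is taken from the N = 5 example in the text.

```python
from itertools import combinations


def possible_links(n_nodes):
    """All node pairs (x, y) with x < y, in lexicographic order."""
    return list(combinations(range(1, n_nodes + 1), 2))


def to_link_list(n_nodes, links):
    """Binary vector with one entry per possible link: 1 if the link exists."""
    existing = {tuple(sorted(link)) for link in links}
    return [1 if pair in existing else 0 for pair in possible_links(n_nodes)]


def from_link_list(n_nodes, link_list):
    """Recover the set of links encoded by a binary link list."""
    return [pair for pair, bit in zip(possible_links(n_nodes), link_list) if bit]


ll = to_link_list(5, [(1, 4), (2, 3), (2, 4), (3, 5)])
print(ll)  # [0, 0, 1, 0, 1, 1, 0, 0, 1, 0]
```

The vector has length N(N-1)/2, so both the input and the output layer of the network would have that many units under this encoding.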
The original networks are scale-free networks with an initial number of nodes $N_0 = 3$ and link density $M_0 = 2$, for N = 30, N = 50, N = 80 and N = 100. The link prediction method based on neural networks is evaluated by the network efficiency and the global network structure reliability. To verify the feasibility and advantages of link prediction based on neural networks, we use the traditional link prediction models as performance reference models. In the experiments, we assume that the number of links added by a link prediction model equals the number of nodes in the network. We randomly generate a database of 10 data sets of scale-free networks with the same number of nodes as the input data of the neural network; 80% of them are training data and 20% are test data. In the neural network method, the number of hidden layers is set to 4 [54]. The maximum number of training epochs is 1000, the training goal is 0.00001 and the learning rate is set to 0.001.
We performed two kinds of experiments on scale-free networks with N = 30, 50, 80 and 100, and analyzed the results in MATLAB. First, we randomly generated several scale-free networks with a specified number of nodes as data sets. In Experiment 1, the input data are different networks with the same number of nodes, and the output data are the optimized networks obtained by the same link prediction model. We found that the traditional link prediction models have different advantages for different networks. Therefore, to exploit the advantages of the different link prediction models and obtain better optimized networks, we carried out Experiment 2, in which the input data are the same network and the output data are the optimized networks generated by different link prediction models.

Link Pruning
Because of the generalization properties of neural networks, the number of links in the optimized network produced by the neural network cannot be controlled to equal the number of links in the optimized network produced by link prediction models. To solve this problem, we add a stage named link pruning. Link pruning removes redundant links generated by the neural network according to the goal, so that the optimized network generated by the neural network has the same topological complexity as the one generated by link prediction. Here, in order to maximize reliability, we apply the idea of the greedy algorithm, which always makes the choice that seems best at the moment. Specifically, we add all the links generated by the neural network to the original network to form a new network. Then, the links generated by the neural network are deleted from the new network one at a time, in random order, and the network efficiency and global network structure reliability are calculated after each deletion. After every generated link has been deleted once, the results are sorted from large to small according to the target. Finally, the link pruning process is complete when the number of links in the network equals that generated by the link prediction models.
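One possible reading of this pruning procedure is sketched below. It is an interpretation, not the authors' code: each generated link is removed once from the full network, the objective is evaluated, and the links whose removal leaves the highest objective (the least useful links) are dropped until `n_keep` generated links remain. The function names, the single-pass ranking and the use of network efficiency as the only objective are assumptions; the global network structure reliability could be passed as `objective` instead.

```python
from collections import deque


def efficiency(n_nodes, links):
    """Network efficiency; unreachable pairs contribute 0 (an assumption)."""
    adj = {x: set() for x in range(1, n_nodes + 1)}
    for x, y in links:
        adj[x].add(y)
        adj[y].add(x)
    total = 0.0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(1.0 / d for d in dist.values() if d > 0)
    return total / (n_nodes * (n_nodes - 1))


def prune_links(n_nodes, original, generated, n_keep, objective=efficiency):
    """Keep n_keep generated links, dropping those whose removal costs the least."""
    base = {tuple(sorted(link)) for link in original}
    # Normalize, deduplicate and skip generated links already in the original network.
    candidates = [l for l in dict.fromkeys(tuple(sorted(g)) for g in generated)
                  if l not in base]
    full = base | set(candidates)
    # Objective of the network after deleting each generated link once.
    after_removal = {l: objective(n_nodes, full - {l}) for l in candidates}
    # A high value after removal means the link contributes little: drop it first.
    ranked = sorted(candidates, key=after_removal.get, reverse=True)
    n_drop = max(0, len(candidates) - n_keep)
    return sorted(base | set(ranked[n_drop:]))
```

In the experiments described in this paper, `n_keep` would equal the number of links added by the link prediction models. A stricter greedy variant would recompute the ranking after every removal, at a higher computational cost.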

General Scheme
Next, we describe the general scheme of the neural network method above.
Step 1-The scale-free networks that meet the specified requirements are randomly generated as the original networks.
Step 2-The optimized networks are obtained by link prediction models.
Step 3-The original networks and optimized networks are converted to link lists.
Step 4-The neural network is trained with the scale-free networks as inputs and the optimized networks as outputs.
Step 5-The prediction networks are input into the trained neural network to obtain the new optimized networks.
Step 6-According to the goals, the final optimized networks are obtained by link pruning.
Step 7-According to Equations (11) and (14), the network efficiency and global network structure reliability are calculated.
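The conversion in Step 3 and the reading of the network output in Step 5 can be illustrated with a short Python sketch. The exact link list format is not specified here, so the encoding below is an assumption: a binary vector with one position per node pair, in a fixed order shared by inputs and outputs. The 0.5 threshold and the function names are also hypothetical.

```python
import itertools


def pair_index(n):
    """Fixed ordering of all n(n-1)/2 node pairs."""
    return list(itertools.combinations(range(n), 2))


def encode(links, n):
    """Binary vector with a 1 at each position whose node pair is linked."""
    present = {tuple(sorted(link)) for link in links}
    return [1 if pair in present else 0 for pair in pair_index(n)]


def decode(vector, n, threshold=0.5):
    """Recover links from (possibly real-valued) neural network outputs."""
    return {pair for pair, value in zip(pair_index(n), vector)
            if value >= threshold}


original = {(0, 1), (2, 1), (2, 3)}
x = encode(original, 4)
predicted = decode([0.9, 0.2, 0.6, 0.4, 0.1, 0.8], 4)
```

Because the number of outputs that pass the threshold is not fixed in advance, the decoded network can contain more links than the link prediction models would add, which is why Step 6 applies link pruning afterwards.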

Experiment Results
By comparison and analysis, we found that the optimized networks produced by the link prediction models have clearly visible hubs and a large degree difference between nodes. For example, Figure 2 shows the topology of the original network, the optimized network generated by the RA model and the optimized network generated by the neural network. A hub-dominated structure of this kind is unstable under targeted attacks. In the optimized networks produced by neural networks, by contrast, the number of medium-sized hubs is larger than in those produced by the link prediction models, so the structure can withstand greater attacks. Therefore, the network structure optimized by the neural network performs better.
To further illustrate and verify this conclusion, we calculated the network efficiency and global network structure reliability and plotted the experimental results for the networks with N = 30, 50, 80 and 100 in Figures 3-6. Figure 3a shows the network efficiency of the networks with N = 30 for the different methods, and Figure 3b shows their global network structure reliability. Figures 4a to 6b can be read in the same way. The networks optimized by neural networks score significantly higher than those optimized by the traditional link prediction models in both network efficiency and global network structure reliability. Moreover, comparing Figures 3-6 shows that the effect of the neural network is more obvious for the traditional link prediction models with poor optimization results. For example, in Figure 3a the optimized networks obtained by the LHN and MF models have the lowest network efficiency, yet the optimized network obtained by the neural network has almost the highest. In Figure 4b, the CN and RA models already achieve high global network structure reliability; the trained neural network scores higher still, but the improvement is not apparent. Figures 5 and 6 show the same pattern for networks with N = 80 and 100. Comparing panels (a) and (b) of the four figures, the differences in network efficiency between methods are more prominent, whereas the proposed global network structure reliability gives more similar results across methods. This is because the global network structure reliability is more comprehensive than network efficiency in evaluating network structure. At the same time, this illustrates the usability of the global network structure reliability. We can also see that the global network structure reliability of the original scale-free networks is generally low, which indicates that the structure of randomly generated scale-free networks is unstable.
From the above results, we find that the performance ranking of the traditional link prediction models is inconsistent across networks with different node numbers and structures. For example, the optimized network generated by the ACT model has the maximum reliability for N = 30 in Figure 3b, whereas the optimized network generated by the RA model has the maximum network efficiency for N = 80 in Figure 5a. In Figure 6, for prediction network 1 with N = 100, the network efficiency and global network structure reliability obtained by the CN model are greater than those obtained by the RA model. For prediction network 2 with N = 100, however, they are lower than those obtained by the RA model. Thus, we believe that different link prediction models have different advantages on different networks. To achieve the best results, we therefore carried out new experiments that take the benefits of the different models into account. In these experiments, the input data are the same original network each time, and the output data are the optimized networks generated by the different link prediction models. After training, we input the original network again and obtain a higher-performance network as output.
The experimental results in Table 1 show that the optimized networks generated by neural networks have the best performance. Moreover, the method based on neural networks has obvious advantages over the traditional link prediction models in improving network efficiency and global network structure reliability. The network efficiency and global network structure reliability of the different methods are given in Table 1 and Figures 7 and 8. Table 1 lists the specific values for all methods and all networks. In Figures 7 and 8, the X-axis is the number of nodes, and the Y-axis is the network efficiency or the global network structure reliability, so that we can observe how both quantities vary with the number of nodes. The optimized network generated by neural networks always has the highest network efficiency and global network structure reliability. In addition, the more nodes the network has, the lower its network efficiency and global network structure reliability. This can be attributed to the fact that a larger number of nodes means a more complex network, whose performance is more difficult to improve.

Conclusions
Optimizing network performance with artificial intelligence methods is an important and challenging research topic. In this paper, we propose a new method that produces optimized networks with high network efficiency and reliability in network evolution. The neural network method based on link prediction models greatly reduces the training complexity while improving the performance. To address the generalization problem of neural networks, we propose link pruning, which applies the idea of the greedy algorithm to complete missing links and remove redundant links generated by neural networks according to the goal. Furthermore, we give a quantitative index, the global network structure reliability, to evaluate network optimization performance.
We designed two kinds of experiments in MATLAB to explore the link prediction method based on neural networks for improving reliability. The experimental results show that the optimized networks generated by neural networks have better network efficiency and global network structure reliability than those generated by traditional link prediction models. Therefore, the neural network method has higher accuracy and performance than popular link prediction models, and the artificial intelligence method makes a more positive contribution to link prediction.