AMPS: A Direction-Aware Adaptive Multi-Scale Potential Model for Link Prediction in Complex Networks

Qin, Xinghua; Liu, Sizheng; Zhang, Mengmeng; Tang, Jun; Ruan, Yirun

doi:10.3390/bdcc10020048

by Xinghua Qin ^†, Sizheng Liu ^†, Mengmeng Zhang, Jun Tang and Yirun Ruan ^*

Department of Laboratory for Big Data and Decision, National University of Defense Technology, Changsha 410073, China

^* Author to whom correspondence should be addressed.

^† These authors contributed equally to this work.

Big Data Cogn. Comput. 2026, 10(2), 48; https://doi.org/10.3390/bdcc10020048

Submission received: 27 November 2025 / Revised: 13 January 2026 / Accepted: 29 January 2026 / Published: 3 February 2026

Abstract

To overcome the limitations of current link prediction methods in effectively leveraging topological information and node importance, this paper introduces a new model called AMPS (Adaptive Multi-scale Potential-enhanced Path Similarity). The model is built on a hierarchical structure that captures both global network topology and local interaction patterns, with full compatibility for directed and undirected networks. This is achieved through a process that quantifies node potential fields, enhances multi-scale similarity, and fuses information across scales. Specifically, we define three types of potential field models, global, local, and k-hop, to flexibly measure node importance. We also introduce two complementary prediction modules: an enhanced common neighbor matrix (PCN), which uses potential fields to refine local structural details, and a feature-weighted generalized path similarity (GLP), which integrates node importance into path evaluation. The final similarity score is obtained by adaptively combining the outputs of PCN and GLP. Experiments on 12 undirected datasets and 9 directed datasets demonstrate that AMPS significantly outperforms other mainstream algorithms in terms of the AUC metric. It also exhibits strong robustness under varying training set ratios, maintaining stable advantages in both directed and undirected scenarios. This framework provides a physically intuitive, topology-aware, and high-precision solution for link prediction across various types of networks.

Keywords: link prediction; complex networks; potential field model; multi-scale similarity; node importance

1. Introduction

Complex systems in the real world—such as social networks, transport networks, and biological networks—are often constituted by numerous interacting elements, whose collective behavior cannot be explained solely through isolated individual characteristics. Abstracting such systems as complex networks composed of nodes (entities) and edges (relationships) provides a unified framework for understanding their structure, function, and evolution. This representation not only accommodates multi-source heterogeneous data but also describes local connections and global organization within the same framework, thereby supporting analysis and decision-making. Networks are inherently dynamic and evolving; the continuous emergence of new nodes and connections leads to the modification or disappearance of existing relationships. Faced with this continuously evolving reality, where observations are often incomplete, link prediction emerges as a critical task. Its objective is to estimate latent missing edges or future potential connections based solely on the currently observable network topology. This task not only compensates for the incompleteness of observed networks but also provides forward-looking insights into system evolution, demonstrating significant value across numerous domains [1]. Examples include merchant prediction tasks based on transaction data for constructing recommendation systems [2], exploring potential new ‘follow’ relationships between users in social networks [3], forecasting natural gas prices [4], research on Information Dissemination in Social Media [5], and mining scientific research trends through keyword networks [6].

Despite the significant value of link prediction, balancing accuracy and efficiency remains a challenge. Existing methods generally fall into three categories: similarity-based, likelihood-based, and learning-based approaches. Simple topological metrics are efficient but often lack accuracy. Conversely, learning-based methods offer high performance but suffer from high computational complexity and limited interpretability. Crucially, most existing methods treat nodes as homogeneous entities or rely on static centrality measures, failing to dynamically capture the “field effects” or potential influence of nodes within the network topology. For instance, a connection formed by a highly influential core node should carry a different structural weight than one formed by a peripheral node.

To address these limitations, incorporating node importance and topological potential theory into link prediction offers a promising direction. Physics-inspired potential field models can intuitively characterize a node’s global and local influence. However, effectively integrating these potential fields with multi-scale path information—spanning from local neighbors to global k-hop structures—remains an open problem, especially when dealing with both directed and undirected networks.

Based on this motivation, this paper proposes a novel link prediction algorithm integrating node importance. The algorithm’s core comprises two complementary modules: a prediction module based on path information and node importance, and a prediction module based on structural similarity and node importance. To accommodate diverse network characteristics, we define three potential field models: a global potential field model based on shortest path distances between nodes, a local potential field model integrating node local structural attributes, and a k-hop potential field model that balances local aggregation and multi-hop propagation requirements by adjusting the range of influence.

The main contributions of this paper are:

(1) We propose potential field models as a core tool for quantifying node importance and define three specialized potential fields (global, local, k-hop) to accommodate different network characteristics.
(2) We design two link prediction modules: one integrates path information with node importance, and the other embeds potential-field node importance into structural similarity calculations, enabling differentiated weight allocation for neighboring nodes.
(3) We conduct extensive experiments on multiple real-world network datasets to validate the proposed method against various benchmark approaches and analyze the applicability of different potential field models across diverse network types.
(4) We enhance model interpretability by visually demonstrating, through potential fields, how node importance influences prediction outcomes.

2. Related Work and Preliminaries

2.1. Traditional Link Prediction Approaches

Although link prediction is valuable, existing methods adapt unevenly to different network topologies, which makes it difficult to balance accuracy and efficiency in practice [7]. Current link prediction methods fall into three broad categories. The first category, similarity-based approaches, works directly on the network’s topological structure and predicts links by computing node similarity or path characteristics. It includes local-information metrics such as the common neighbor metric [8], the Adamic-Adar metric [9], and the preferential attachment (PA) metric [10]; path-based metrics such as the local path (LP) metric [11] and the Katz metric [12]; and random-walk metrics such as average commute time (ACT) [13]. These single-metric approaches are computationally efficient but limited in accuracy. In recent years, hybrid metrics have improved performance by integrating local and global features. For instance, Mishra et al. (2022) [14] proposed the MNERLP-MUL method, which predicts links by calculating node and edge correlations on aggregated graphs. Rai et al. (2023) [15] introduced the NSim algorithm, which combines common neighbors, information transmission ratio, and proximity, while Kong et al. (2025) [16] proposed a multi-type network link prediction method integrating structure, attributes, and weighted similarity. Other recent similarity indices exploit higher-order structure or resource transmission. Yao et al. (2025) [17] proposed the CICN index, which calculates node similarity from the clustering mutual information of common neighbors and combines it with higher-order clustering coefficients to merge local and global information for higher-order link prediction. Yang et al. [18] introduced the concept of simple motifs and applied it to link prediction, providing a new paradigm for higher-order link prediction. Liu et al. (2024) [19] proposed the RB and GRB methods, which define four resource transmission paths, quantify nodes’ resource holding and receiving capabilities, and bidirectionally integrate inter-node resource broadcast to measure similarity.

The second category is likelihood-based. These methods infer network formation rules by evaluating the likelihood of latent network structures to predict links. Representative models include the hierarchical structure model (HSM) [20] and the stochastic block model (SBM) [21]. Pan et al. (2016) [22] proposed an algorithmic framework based on likelihood analysis. By predefining a mechanism that integrates network clustering with higher-order ring information, it calculates network probabilities through a structural Hamiltonian. It quantifies the likelihood of missing links via the conditional probability of adding unobserved links and assesses the probability of false links through the conditional probability of removing observed links.

The final category comprises learning-based approaches, which combine complex networks with machine learning and deep learning techniques to generate low-dimensional embeddings for nodes or edges. Representative works include DeepWalk [23], Node2Vec [24], and subgraph embedding methods such as SEAL [25], which enhance performance but may produce identical representations for different links via their decoders. The improved method SEAL+ significantly enhances accuracy by augmenting the decoder input information [26].

2.2. Node Importance and Potential Fields

Despite the significant accuracy gains achieved by learning-based methods, their computational complexity on large-scale networks and poor interpretability remain substantial challenges. Recently, incorporating node importance into link prediction has emerged as a promising research direction. As a core concept in network analysis, node importance enriches the feature dimensions of link prediction [27] and provides deeper topological insight; for example, connections between important nodes, or to important nodes, may be more likely to form. This integrated approach is feasible because network data are widely available, node importance algorithms are compatible with prediction models, and contemporary computational capabilities are sufficient. Wang et al. (2021) [28] proposed a similarity index named DLC, which integrates the degree-related clustering coefficient (DCC) and the link clustering coefficient (ALC) to address the limitation that existing similarity-based link prediction methods mostly focus on node topological features while neglecting link clustering information. Liu et al. (2022) [29] proposed the IICN link prediction algorithm, which first quantifies nodes’ initial information contribution using node degree, network average clustering coefficient, and average degree, then calculates the total information received by node pairs via direct, local, and global bidirectional information transmission, and finally normalizes the total information by the size of the union of node neighbors to obtain node structural similarity for link prediction. The PSI and PLSI methods proposed by Aziz et al. (2020) [30] improve link prediction accuracy by integrating node degree information on local paths with path length weights, addressing the defect that traditional path-based methods ignore node information on paths. The GSIM framework proposed by Kong et al. (2025) [16] improves link prediction accuracy across weighted/unweighted and attributed/non-attributed networks by integrating source node information (via IIC/WIIC) and multi-module fusion into the PSI method, addressing the defect that PSI ignores source node information and lacks adaptability to diverse network types. Wu et al. (2016) [31] proposed the CCLP algorithm, which exploits additional local link and triangle information through the clustering coefficient of common neighbors to measure more accurately how much each common neighbor contributes to the similarity between a node pair.

Moreover, the application of topological potential theory in identifying critical nodes has provided fresh insights for link prediction. Hu et al. (2010) [32] proposed the topological potential model, which defines three forms of topological potential calculation—Gaussian-type, reciprocal-type, and inverse-square-type. This model characterizes node importance by integrating node activity and local effects, optimizes the control parameter σ to determine the optimal influence range of nodes through minimizing Shannon entropy, and calculates the topological potential score of each node for ranking. Du et al. (2020) [33] proposed the Improved Topological Potential Model considering Entropy (ITPE), which constructs the metro network as a directed and weighted graph, integrates three centrality measures (Eigenvector Centrality, Weighted Betweenness Centrality, Weighted Closeness Centrality), calculates the weight of each measure using topological entropy, optimizes the control parameter σ to quantify the improved topological potential of nodes for ranking node importance, and verifies the model’s effectiveness in identifying critical nodes through invulnerability analysis. The Topological Potential Centrality (TPC) method proposed by Zhang et al. (2024) [34] effectively enhances the accuracy of critical node identification through a topological potential model, inspiring us to incorporate potential theory into link prediction.

Approaches that depend on global paths face extremely high computational complexity in large-scale networks because they must traverse all network paths. Consequently, k-hop local-scope modeling is often preferred: node influence decays with distance, and neighboring nodes within the k-hop range suffice to provide critical topological information. Feng et al. [35] studied the expressive power of k-hop message-passing graph neural networks, demonstrating that k-hop message passing is strictly more expressive than the 1-WL test and can distinguish nearly all regular graphs. Li et al. (2024) [36] proposed the LSNI algorithm, which quantifies the contribution of local structure information between endpoints and their common neighbors, calculates the differentiated contribution of 3-hop paths, and then integrates these two contributions to obtain the structural similarity of node pairs for link prediction.

2.3. Link Prediction in Directed Networks

Notably, a large proportion of real-world networks exhibit directed characteristics, such as academic citation networks, social follow networks, and traffic flow networks, where edge direction encodes critical asymmetric interaction logic that undirected network models fail to capture. Early directed link prediction research primarily focused on adapting undirected metrics by distinguishing between in-degree and out-degree. Representative adaptations include the Directed Common Neighbor (DCN) metric [37], which separately calculates the overlap between the out-neighbors of the source node and the in-neighbors of the target node to quantify local structural similarity; the Directed Adamic-Adar (DAA) index [37], which weights common neighbors by the inverse logarithm of their out-degrees to downplay the influence of high-degree nodes; and the Directed Preferential Attachment (DPA) method [37], which predicts link likelihood based on the product of the source node’s out-degree and the target node’s in-degree. Additionally, the local path (LP) index, originally designed for undirected networks, was extended to directed scenarios by accumulating weighted directed path strengths within 2-hop or 3-hop ranges [11]. However, these methods only perform superficial adaptations; DCN and DAA overly rely on immediate common neighbors, failing to capture higher-order directed path patterns; DPA ignores structural proximity entirely, leading to poor accuracy on dense networks; and LP, while incorporating multi-hop paths, uses uniform weighting that cannot distinguish the heterogeneous contribution of paths with opposite directions.

To address this limitation, specialized directed link prediction methods have been developed. Structural motif-based approaches, such as the Bifan metric [38], leverage directed subgraph patterns that inherently encode asymmetric interaction logic, as these motifs are statistically over-represented in real directed networks and closely associated with link formation. Path-based methods have also advanced; some studies optimize LP by introducing direction-aware weight decay, while others propose the Directed Random Walk (DRA) [37] algorithm, which simulates asymmetric information propagation using transition probabilities dependent on node out-degree distributions, thereby capturing global topological dependencies. Despite these advances, directed link prediction still faces distinct challenges; first, many methods overemphasize local directed structures while neglecting the integration of global topological information and node importance, leading to limited performance on sparse directed networks. Second, even methods that incorporate node importance often rely on simple centrality metrics rather than dynamic models that reflect contextual influence. Third, learning-based directed embedding methods, similar to their undirected counterparts, suffer from high computational complexity and poor interpretability, hindering their application in large-scale directed networks.

2.4. Preliminaries

To assess the adaptability of the proposed method across diverse network topologies, we compared it against seven representative similarity-based algorithms spanning local metrics (CN and PA), a semi-local metric (LP), a global metric (ACT), and recent multi-feature fusion methods (NSim, SAC, and GSIM). Together, these baselines cover the main lineage of traditional similarity-based approaches and have demonstrated stable performance in prior studies, supporting fair and reliable experimental comparisons.

Given a graph G = (V, E) with vertex set V and edge set E, we evaluated the following similarity indices: CN, PA, LP, ACT, NSim, SAC, and GSIM. Detailed definitions, parameter settings, and references are provided in this section.

2.4.1. Common Neighbors (CN)

The common neighbor metric is the simplest indicator based on local information [8]. Its fundamental principle is that two unconnected nodes are more likely to be connected if they share a greater number of common neighbors. Common neighbor similarity is defined as

s_{xy}^{CN} = |Γ(x) \cap Γ(y)|

(1)

where Γ(x) denotes the set of neighbors of node v_{x}.
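As an illustration of Eq. (1), the following minimal sketch counts common neighbors on a small graph. The adjacency-dictionary representation, the function name `common_neighbors`, and the toy graph are assumptions made for this example and are not taken from the paper.

```python
def common_neighbors(adj, x, y):
    """Eq. (1): size of the intersection of the two neighbor sets."""
    return len(adj[x] & adj[y])


# Hypothetical toy graph with edges a-b, a-c, b-c, b-d, c-d.
toy_adj = {
    "a": {"b", "c"},
    "b": {"a", "c", "d"},
    "c": {"a", "b", "d"},
    "d": {"b", "c"},
}

# The unconnected pair (a, d) shares the neighbors b and c.
cn_ad = common_neighbors(toy_adj, "a", "d")
print(cn_ad)
```

Storing neighbors as sets keeps the intersection cheap, which is the main reason CN is attractive on large sparse networks.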

2.4.2. Preferential Attachment (PA)

The core principle of preferential attachment is that ‘the rich get richer’ [10]. During network formation, new nodes exhibit a stronger tendency to connect with those possessing the highest degree within the network. From a similarity perspective, nodes with high degrees possess a greater likelihood of forming connections based on this preferential mechanism. Mathematically, this can be defined as

s_{xy}^{PA} = k_{x} k_{y}

(2)

where k_{x} denotes the degree of node v_{x}.
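Eq. (2) needs only node degrees. The sketch below, which assumes an adjacency dictionary of neighbor sets and a hypothetical function name, shows that the score ignores whether the two nodes are close to each other.

```python
def preferential_attachment(adj, x, y):
    """Eq. (2): product of the degrees of the two endpoints."""
    return len(adj[x]) * len(adj[y])


# Hypothetical toy graph with edges a-b, a-c, b-c, b-d, c-d.
toy_adj = {
    "a": {"b", "c"},
    "b": {"a", "c", "d"},
    "c": {"a", "b", "d"},
    "d": {"b", "c"},
}

pa_ad = preferential_attachment(toy_adj, "a", "d")  # degrees 2 and 2
pa_bc = preferential_attachment(toy_adj, "b", "c")  # degrees 3 and 3
print(pa_ad, pa_bc)
```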

2.4.3. Local Path (LP)

Local path metrics focus on path information within the immediate vicinity of nodes [11]. By analyzing characteristics such as the number and length of these paths, they characterize the similarity and degree of association between nodes. Building upon common neighbors, they incorporate third-order path factors, defined as

S = A^{2} + α A^{3}

(3)

where α is an adjustable parameter, and A denotes the network’s adjacency matrix.
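Eq. (3) can be evaluated directly with matrix products. The sketch below uses plain Python lists to stay dependency-free; the helper names and the value α = 0.01 are assumptions chosen for illustration, not settings reported in the paper.

```python
def matmul(P, Q):
    """Dense matrix product for small list-of-lists matrices."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*Q)] for row in P]


def local_path_matrix(A, alpha=0.01):
    """Eq. (3): S = A^2 + alpha * A^3."""
    A2 = matmul(A, A)
    A3 = matmul(A2, A)
    return [[a2 + alpha * a3 for a2, a3 in zip(r2, r3)] for r2, r3 in zip(A2, A3)]


# Hypothetical path graph 0-1-2-3.
path_graph = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
S = local_path_matrix(path_graph, alpha=0.01)
# Nodes 0 and 2 are joined by one 2-step path; nodes 0 and 3 only by one 3-step path.
print(S[0][2], S[0][3])
```

The pair (0, 3) has no common neighbor, so CN would score it zero, whereas LP assigns it a small positive score through the third-order term.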

2.4.4. Average Commute Time (ACT)

The average commute time metric is grounded in random walks: it measures the expected number of steps a random walker needs to travel from one node to another and back [13]. A shorter average commute time indicates greater structural proximity between two nodes and therefore a higher likelihood of a link between them. As a global metric, it considers the network’s overall structural information and can capture connections between nodes that are related only through long or complex paths. ACT-based similarity is defined as follows:

s_{xy}^{ACT} = \frac{1}{l_{xx}^{+} + l_{yy}^{+} - 2 l_{xy}^{+}}

(4)

where l_{xx}^{+}, l_{yy}^{+}, and l_{xy}^{+} denote the corresponding elements of the matrix L^{+}, the pseudoinverse of the network’s Laplacian matrix L (L = D - A, with D the diagonal degree matrix).
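The following sketch evaluates Eq. (4) on a three-node path. To avoid external libraries it obtains the pseudoinverse through the identity L^{+} = (L + J/n)^{-1} - J/n, where J is the all-ones matrix; this identity holds only for connected undirected graphs, which is an assumption of the example. All function names are hypothetical.

```python
def invert(M):
    """Gauss-Jordan inverse of a small dense matrix with partial pivoting."""
    n = len(M)
    aug = [list(map(float, row)) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(M)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def laplacian_pinv(A):
    """Pseudoinverse of L = D - A, assuming a connected undirected graph."""
    n = len(A)
    L = [[(sum(A[i]) if i == j else 0) - A[i][j] for j in range(n)] for i in range(n)]
    shifted = [[L[i][j] + 1.0 / n for j in range(n)] for i in range(n)]
    inv = invert(shifted)
    return [[inv[i][j] - 1.0 / n for j in range(n)] for i in range(n)]


def act_similarity(Lp, x, y):
    """Eq. (4): reciprocal of l_xx + l_yy - 2 * l_xy."""
    return 1.0 / (Lp[x][x] + Lp[y][y] - 2.0 * Lp[x][y])


# Hypothetical path graph 0-1-2.
A = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
Lp = laplacian_pinv(A)
print(act_similarity(Lp, 0, 1), act_similarity(Lp, 0, 2))
```

The denominator of Eq. (4) equals the effective resistance between the two nodes, so adjacent nodes on the path score higher than the two end nodes.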

2.4.5. NSim

Rai et al. [15] proposed a parametric link prediction method based on multi-feature fusion. It integrates a local neighbor feature, an information flow feature, and a global feature: the number of common neighbors between two nodes, the resource allocated through each common neighbor u (the resource allocation index), and global compactness (a closeness term based on the shortest path length). Controllable parameters adjust the weight of each feature, which addresses the reliance of traditional approaches on a single feature and their poor generalizability. The metric is defined as follows:

NSim(x, y) = γ (|τ(x) \cap τ(y)|) + δ (\sum_{u \in τ(x) \cap τ(y)} \frac{1}{|τ(u)|}) + (1 - (γ + δ)) \frac{N}{sh_{xy}}

(5)

Here, sh_{xy} denotes the shortest path length between x and y, while γ and δ govern the weights of the neighbor feature and the resource allocation feature, respectively.
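A rough sketch of Eq. (5) follows. It assumes that τ(x) is the neighbor set of x, that N is the number of nodes, and that the global term is dropped when no path exists; the weights γ = δ = 0.4 and all function names are illustrative choices, not values from [15].

```python
from collections import deque


def shortest_path_length(adj, s, t):
    """Breadth-first search distance from s to t; None if t is unreachable."""
    dist = {s: 0}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        if u == t:
            return dist[u]
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return None


def nsim(adj, x, y, gamma=0.4, delta=0.4):
    """Eq. (5) with N taken as the node count (an assumption)."""
    common = adj[x] & adj[y]
    cn_term = len(common)
    ra_term = sum(1.0 / len(adj[u]) for u in common)
    sh = shortest_path_length(adj, x, y)
    global_term = len(adj) / sh if sh else 0.0  # skipped if unreachable or x == y
    return gamma * cn_term + delta * ra_term + (1 - (gamma + delta)) * global_term


# Hypothetical toy graph with edges a-b, a-c, b-c, b-d, c-d.
toy_adj = {
    "a": {"b", "c"},
    "b": {"a", "c", "d"},
    "c": {"a", "b", "d"},
    "d": {"b", "c"},
}
score_ad = nsim(toy_adj, "a", "d")
print(round(score_ad, 4))
```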

2.4.6. SAC

Nandini et al. [39] proposed a similarity measure based on node average centrality. It quantifies the connectivity potential of a node pair by counting the common neighbors whose centrality is at least the network’s average centrality, thereby addressing the limitation of traditional local similarity methods, which overlook differences in the importance of common neighbors. It can be used with local centrality metrics (degree centrality and clustering coefficient) and with global centrality metrics (betweenness centrality and closeness centrality). The definition is as follows:

SAC_{ζ}(x, y) = |\{u \mid u \in Γ(x) \cap Γ(y) \text{ and } ζ(u) \geq Aζ(G)\}|

(6)

Aζ(G) = \frac{\sum_{v \in V(G)} ζ(v)}{N}

(7)

where SAC_{ζ}(x, y) denotes the similarity score between nodes x and y, ζ(v) is a centrality measure for node v, and V(G) denotes the set of nodes in the network. In our comparison, SAC is computed with degree centrality as ζ.
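Eqs. (6) and (7) with degree as the centrality ζ can be sketched as below. Raw degree is used instead of normalized degree centrality, which leaves the comparison with the average unchanged; the function name and toy graph are assumptions for illustration.

```python
def sac_degree(adj, x, y):
    """Eqs. (6)-(7) with zeta = degree: count common neighbors at or above the mean degree."""
    avg_degree = sum(len(nbrs) for nbrs in adj.values()) / len(adj)
    return sum(1 for u in adj[x] & adj[y] if len(adj[u]) >= avg_degree)


# Hypothetical toy graph with edges a-b, a-c, b-c, b-d, c-d (mean degree 2.5).
toy_adj = {
    "a": {"b", "c"},
    "b": {"a", "c", "d"},
    "c": {"a", "b", "d"},
    "d": {"b", "c"},
}
sac_ad = sac_degree(toy_adj, "a", "d")  # common neighbors b and c both have degree 3
sac_bc = sac_degree(toy_adj, "b", "c")  # common neighbors a and d both have degree 2
print(sac_ad, sac_bc)
```

Both pairs have two common neighbors, so CN cannot separate them, while SAC does because only b and c exceed the average degree.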

2.4.7. GSIM

Kong et al. [16] proposed a general link prediction framework called GSIM. Its similarity metric has two components, which account for the information of all path nodes and of the source nodes, respectively. The path-node component S_{1} accumulates the contributions of all path nodes z, while the source-node component S_{2} assesses the structural similarity between the two source nodes v_{x} and v_{y} themselves. The two components are combined through a product:

W(x, y) = S_{1}(x, y) \cdot S_{2}(x, y)

(8)

S_{1}(x, y) = \sum_{\substack{z \in V \setminus \{x, y\} \\ 1 < d = δ(x, z) + δ(z, y)}} β^{d - 2} \frac{1}{\log(IIC(v_{z}) + 1)}

(9)

S_{2}(x, y) = \frac{1}{|IIC(v_{x}) - IIC(v_{y})| + 1}

(10)

IIC(v_{x}) = (\frac{k_{x}}{K} \cdot \frac{\langle c \rangle}{\langle k \rangle})^{α}

(11)

Here, k_{x}, K, \langle k \rangle, and \langle c \rangle denote the degree of node v_{x}, the maximum degree in the network, the average degree, and the average clustering coefficient, respectively.
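The sketch below covers only the source-node part of GSIM, Eqs. (10) and (11); the path-node sum of Eq. (9) is left out to keep the example short. The exponent α = 1, the local clustering definition, and all function names are assumptions of this example rather than details confirmed by [16].

```python
from itertools import combinations


def clustering(adj, v):
    """Local clustering coefficient: fraction of neighbor pairs that are linked."""
    k = len(adj[v])
    if k < 2:
        return 0.0
    links = sum(1 for u, w in combinations(adj[v], 2) if w in adj[u])
    return 2.0 * links / (k * (k - 1))


def iic(adj, v, alpha=1.0):
    """Eq. (11): ((k_v / K) * (<c> / <k>)) ** alpha."""
    n = len(adj)
    k_max = max(len(nbrs) for nbrs in adj.values())
    k_avg = sum(len(nbrs) for nbrs in adj.values()) / n
    c_avg = sum(clustering(adj, u) for u in adj) / n
    return ((len(adj[v]) / k_max) * (c_avg / k_avg)) ** alpha


def source_similarity(adj, x, y, alpha=1.0):
    """Eq. (10): S2 = 1 / (|IIC(x) - IIC(y)| + 1)."""
    return 1.0 / (abs(iic(adj, x, alpha) - iic(adj, y, alpha)) + 1.0)


# Hypothetical toy graph with edges a-b, a-c, b-c, b-d, c-d.
toy_adj = {
    "a": {"b", "c"},
    "b": {"a", "c", "d"},
    "c": {"a", "b", "d"},
    "d": {"b", "c"},
}
print(source_similarity(toy_adj, "a", "d"), source_similarity(toy_adj, "a", "b"))
```

Note that on a triangle-free graph the average clustering coefficient is zero, so IIC is zero for every node and the logarithm in Eq. (9) would vanish; a full implementation would need to handle that case.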

3. Proposed Method

Accurate prediction of latent links within complex networks hinges on effectively integrating node importance with multi-scale structural information, rather than relying solely on singular topological metrics. To this end, this paper proposes an Adaptive Multi-scale Potential-enhanced Path Similarity Model (AMPS), employing a hierarchical architecture comprising ‘node potential field quantification—multi-scale similarity enhancement—cross-scale information fusion’. This aims to simultaneously capture both global network topology and local node interaction patterns. The core design of this framework possesses the following key characteristics: (1) it proposes three optional potential field models based on global shortest paths, local structural attributes, and k-hop distance decay, respectively. These are applicable to different network structures, avoiding the bias introduced by a single importance assessment metric. (2) It achieves hierarchical fusion of structural similarity. This combines enhanced common neighbor similarity with feature-weighted path similarity, incorporating adjustable weights to balance the influence of local and global structural information on link prediction. (3) It innovates a potential field–distance coupling mechanism. By integrating node potential field values with topological distance constraints, it enhances the validity of similarity assessments for non-adjacent nodes, proving particularly effective for node pairs exhibiting high potential field values yet sparse direct connections. The algorithm framework flow of AMPS is shown in Figure 1 and a summary of key notations and definitions is shown in Table 1.

The core modules of the model comprise:

Node potential field computation: Quantifying a node’s global and local importance within the network.
Enhanced common neighbor matrix: Integrating node potential fields with local neighbor overlap information.
Feature-weighted generalized path similarity: Incorporating node potential fields to adjust global path contributions.
Composite similarity matrix: Synthesizing local and global similarity metrics to generate final predictions.

3.1. Construction of the Node Potential Field

3.1.1. Theoretical Motivation for Potential Functions

Gaussian Decay: The Gaussian kernel corresponds to the solution of the heat diffusion equation and is therefore a natural model of information diffusion in a network. It is also the maximum entropy distribution for a fixed variance, making it the statistically “safest” assumption for global influence estimation without introducing arbitrary biases.

Inverse Decay: This is motivated by gravity-like or electrostatic-like interactions, which are standard models for describing local aggregation forces in complex systems. While strict mathematical “optimality” is difficult to prove for all graph types (due to the heterogeneity of complex networks), we argue that these physically inspired forms provide a robust baseline that covers both diffusive (global) and aggregative (local) behaviors.

3.1.2. Design of Multi-Scale Node Potential Field Models

Node importance serves as the core driver in link formation, yet traditional metrics such as degree centrality can only characterize local properties. To comprehensively quantify node importance, three complementary potential field models have been designed to capture a node’s influence on the network from global, local, and multi-scale perspectives respectively:

(i) Global potential field model

This model is based on the shortest path distance between nodes and emphasizes a node’s reachability across the entire network:

ϕ_{i}^{global} = \sum_{j = 1}^{n} \exp (- (\frac{d_{ij}}{σ})^{2})

(12)

Here, d_{ij} denotes the shortest path distance between nodes i and j, while σ is the Gaussian kernel width, which governs the rate of distance decay. The model quantifies node i’s global radiative capacity by summing the distance-decayed contributions of all nodes.
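A minimal sketch of Eq. (12) using breadth-first search for the shortest path distances is given below. It assumes an unweighted graph, includes the j = i term (distance 0), and treats unreachable nodes as contributing zero; σ = 1 and the function names are illustrative assumptions.

```python
import math
from collections import deque


def bfs_distances(adj, source):
    """Hop distances from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def global_potential(adj, sigma=1.0):
    """Eq. (12): sum of Gaussian-decayed shortest path distances."""
    return {
        i: sum(math.exp(-((d / sigma) ** 2)) for d in bfs_distances(adj, i).values())
        for i in adj
    }


# Hypothetical path graph 0-1-2.
path_adj = {0: {1}, 1: {0, 2}, 2: {1}}
phi_global = global_potential(path_adj, sigma=1.0)
print(phi_global)
```

The middle node receives the highest value because both other nodes are one hop away. One BFS per node costs O(n(n + m)) overall, which is the expense that motivates the cheaper local and k-hop variants.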

(ii) Local potential field model

This model characterizes a node’s capacity for aggregation within local communities by integrating its local structural properties. Nodes exhibiting strong local aggregation are more likely to form new links within the community:

ϕ_{i}^{local} = μ \cdot k_{i} + τ \cdot \sum_{j \in Γ(i)} e^{-1/η} + γ \cdot c_{i}

(13)

Here, k_{i} denotes the degree of node i; Γ(i) is the set of i’s direct neighbors; c_{i} is the clustering coefficient of node i; μ, τ, and γ are weighting coefficients; and η is the decay parameter for neighbor distance.
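Eq. (13) can be sketched with three local quantities per node. The unit weights μ = τ = γ = 1, the value η = 1, and the function names are placeholders chosen for this example, not the settings used in the experiments.

```python
import math
from itertools import combinations


def clustering(adj, v):
    """Local clustering coefficient of node v."""
    k = len(adj[v])
    if k < 2:
        return 0.0
    links = sum(1 for u, w in combinations(adj[v], 2) if w in adj[u])
    return 2.0 * links / (k * (k - 1))


def local_potential(adj, mu=1.0, tau=1.0, gamma=1.0, eta=1.0):
    """Eq. (13): weighted sum of degree, decayed neighbor term, and clustering."""
    phi = {}
    for i, nbrs in adj.items():
        # Every direct neighbor sits at distance 1, so each contributes exp(-1/eta).
        neighbor_term = sum(math.exp(-1.0 / eta) for _ in nbrs)
        phi[i] = mu * len(nbrs) + tau * neighbor_term + gamma * clustering(adj, i)
    return phi


# Hypothetical toy graph with edges a-b, a-c, b-c, b-d, c-d.
toy_adj = {
    "a": {"b", "c"},
    "b": {"a", "c", "d"},
    "c": {"a", "b", "d"},
    "d": {"b", "c"},
}
phi_local = local_potential(toy_adj)
print(phi_local)
```

Because every direct neighbor is one hop away, the middle term reduces to k_i · e^{-1/η}, so the degree effectively enters the formula twice with different weights.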

(iii) k-hop potential field model

To balance local aggregation and multi-hop propagation, this model avoids the high computational complexity of the global model while compensating for the local model’s neglect of long-range correlations. It is suitable for analyzing medium-scale network structures:

ϕ_{i}^{k-hop} = \sum_{k = 0}^{K} (\frac{1}{1 + k^{δ}} \cdot |Γ_{k}(i)|)

(14)

Here, Γ_{k}(i) denotes the set of k-hop neighbors of node i, K is the maximum hop count, and δ is the distance decay exponent. The model quantifies a node’s multi-scale influence within the k-hop range by combining the number of neighbors at each hop distance with a distance-decay weight.
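The sketch below evaluates Eq. (14) under the assumption that Γ_k(i) contains the nodes at shortest path distance exactly k from i, so that Γ_0(i) = {i}. The settings K = 2 and δ = 1 and the function names are illustrative only.

```python
from collections import deque


def bfs_distances(adj, source):
    """Hop distances from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def k_hop_potential(adj, K=2, delta=1.0):
    """Eq. (14): shell sizes weighted by 1 / (1 + k ** delta), for k = 0..K."""
    phi = {}
    for i in adj:
        shell_sizes = [0] * (K + 1)
        for d in bfs_distances(adj, i).values():
            if d <= K:
                shell_sizes[d] += 1
        phi[i] = sum(size / (1.0 + k ** delta) for k, size in enumerate(shell_sizes))
    return phi


# Hypothetical path graph 0-1-2-3.
path_adj = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
phi_khop = k_hop_potential(path_adj, K=2, delta=1.0)
print(phi_khop)
```

In practice the search can stop once depth K is reached, which is what keeps this model cheaper than the global one; the full BFS here is kept only for brevity.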

Using a 30-node WS small-world network, we analyzed the quantitative differences in node influence across the potential field models; the results are shown in Figure 2. To make the three potential fields directly comparable, a unified color scale normalized to the range [0, 1] was applied, with darker colors indicating higher potential values (toward 1) and lighter colors indicating lower values (toward 0). For the global potential field, nodes with high potential values are concentrated around network hubs, confirming its ability to capture global topological information. For the local potential field, high-potential nodes cluster within highly clustered communities, demonstrating its accuracy in depicting local structural features. For the k-hop potential field, high-potential nodes exhibit both local clustering and multi-hop propagation capabilities, highlighting its advantage in balancing multi-scale topological information. The overall distribution patterns are broadly consistent across the three models, but node-specific rankings differ slightly because of the different quantitative definitions. The figure also highlights the three nodes with the highest and lowest potential values under each model. Node 23 emerges as a high-influence core node under all three potential fields. Visual analysis shows that it lies at the network’s geometric center, is surrounded by numerous neighbors, and reaches other nodes via short paths. Such high-potential nodes exhibit significantly higher probabilities of initiating new links. Node 19 is consistently identified as the node with the lowest potential value under all three fields. Topologically, this node connects to only two neighbors and resides at the network periphery, making it significantly less likely to generate new links.

To unify dimensions, all potential field values undergo normalization processing:

ϕ_{i} = \frac{ϕ_{i} - \min (ϕ)}{\max (ϕ) - \min (ϕ)}

(15)
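The normalization in Equation (15) can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation; the function name `min_max_normalize` and the handling of the degenerate case where all potentials are equal are assumptions, since Equation (15) is undefined there.

```python
def min_max_normalize(phi):
    """Rescale potential values to [0, 1] as in Equation (15)."""
    lo, hi = min(phi.values()), max(phi.values())
    if hi == lo:
        # Assumption: Equation (15) is undefined when all values coincide;
        # this sketch maps every node to 0.0 in that case.
        return {node: 0.0 for node in phi}
    return {node: (value - lo) / (hi - lo) for node, value in phi.items()}


# Hypothetical raw potentials for three nodes.
raw = {"a": 2.0, "b": 5.0, "c": 8.0}
normalized = min_max_normalize(raw)
```

After normalization, the lowest-potential node maps to 0 and the highest to 1, which is what allows the three potential fields to share one scale.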

Considering the definition of three potential fields for directed networks, it is necessary to incorporate directional factors. Therefore, the core modification lies in decomposing symmetric relationships in undirected graphs into asymmetric predecessor–successor relationships. The design approach includes direction-aware potential field computation, direction-specific neighbor definition, and path direction constraints. The three potential field models are defined as follows:

i.: Global potential field model

ϕ_{i}^{\mathrm{global}} = \sum_{j = 1}^{n} \exp \left( - \left( \frac{d_{i j}^{\mathrm{dir}}}{σ} \right)^{2} \right)

(16)

Here, $d_{ij}^{\mathrm{dir}}$ denotes the direction-specific shortest path distance from node $i$ to node $j$. For out-edge prediction tasks, $d_{ij}^{\mathrm{dir}} = d_{ij}^{\mathrm{out}}$ is the forward distance; for in-edge prediction tasks, $d_{ij}^{\mathrm{dir}} = d_{ij}^{\mathrm{in}}$ is the backward distance.
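As an illustration of Equation (16), the following sketch computes the directed global potential with one breadth-first search per node. The helper names (`bfs_distances`, `reverse_graph`, `global_potential`) and the toy graph are hypothetical, and treating unreachable nodes as contributing zero (infinite distance) is an assumption not stated in the text.

```python
import math
from collections import deque


def bfs_distances(adj, source):
    """Hop distances from source following the edges stored in adj."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj.get(u, ()):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def reverse_graph(succ):
    """Build predecessor lists so BFS can follow edges backwards."""
    pred = {u: [] for u in succ}
    for u, targets in succ.items():
        for v in targets:
            pred.setdefault(v, []).append(u)
    return pred


def global_potential(succ, sigma=1.0, direction="out"):
    """Equation (16): sum of Gaussian-decayed directed distances."""
    adj = succ if direction == "out" else reverse_graph(succ)
    nodes = set(adj) | {v for vs in adj.values() for v in vs}
    phi = {}
    for i in nodes:
        dist = bfs_distances(adj, i)
        # Assumption: unreachable nodes have infinite distance and add 0.
        phi[i] = sum(math.exp(-(d / sigma) ** 2) for d in dist.values())
    return phi


# Hypothetical toy digraph: a -> b, a -> c, b -> c
succ = {"a": ["b", "c"], "b": ["c"], "c": []}
phi_out = global_potential(succ, direction="out")
phi_in = global_potential(succ, direction="in")
```

In this toy graph, node `a` has the largest out-direction potential and node `c` the largest in-direction potential, which shows how the same node can rank differently depending on the chosen direction.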

ii.: Local potential field model

ϕ_{i}^{\mathrm{local}} = μ \cdot \frac{k_{i}^{\mathrm{in}} + k_{i}^{\mathrm{out}}}{2} + τ \cdot \sum_{j \in Γ^{\mathrm{dir}} (i)} e^{- 1 / η} + γ \cdot c_{i}^{\mathrm{dir}}

(17)

where $k_{i}^{\mathrm{in}}$ and $k_{i}^{\mathrm{out}}$ denote the in-degree and out-degree of node $i$, respectively, $Γ^{\mathrm{dir}}(i)$ is the direction-specific neighborhood set, and $c_{i}^{\mathrm{dir}}$ is the directed clustering coefficient.
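A small sketch may help make Equation (17) concrete. It assumes that $Γ^{\mathrm{dir}}(i)$ is the out-neighborhood and that the directed clustering coefficients are computed elsewhere and passed in as `c_dir`; both choices, along with the function name `local_potential`, are illustrative assumptions. The default parameter values are the ones reported for the local model in the experiments.

```python
import math


def local_potential(succ, pred, c_dir, mu=0.5, tau=0.2, gamma=0.3, eta=1.0):
    """Equation (17), using out-neighbours as the directed neighbourhood."""
    phi = {}
    for i in succ:
        k_out, k_in = len(succ[i]), len(pred.get(i, ()))
        # Every neighbour sits at distance 1, so the sum over the
        # neighbourhood collapses to (number of neighbours) * exp(-1 / eta).
        neighbour_term = k_out * math.exp(-1.0 / eta)
        phi[i] = mu * (k_in + k_out) / 2 + tau * neighbour_term + gamma * c_dir[i]
    return phi


# Hypothetical toy digraph: a -> b, a -> c, b -> c
succ = {"a": ["b", "c"], "b": ["c"], "c": []}
pred = {"a": [], "b": ["a"], "c": ["a", "b"]}
c_dir = {"a": 0.0, "b": 0.0, "c": 0.0}  # placeholder clustering values
phi_local = local_potential(succ, pred, c_dir)
```

Because each term in the neighbor sum is the constant $e^{-1/η}$, the second component is simply proportional to the size of the chosen neighborhood.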

iii.: k-hop potential field model

Based on direction-specific k-hop neighbor counting with distance decay:

ϕ_{i}^{k\text{-hop}} = \sum_{k = 0}^{K} \left( \frac{1}{1 + k^{δ}} \cdot |Γ_{k}^{\mathrm{dir}} (i)| \right)

(18)

Here, $Γ_{k}^{\mathrm{dir}}(i)$ denotes the set of k-hop neighbors of node $i$ in the specified direction.
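Equation (18) can be illustrated with a BFS truncated at depth $K$ that records how many nodes fall in each hop layer. The sketch below is an assumption-laden example: the name `khop_potential` is hypothetical, the 0-hop layer is taken to be the node itself, and the adjacency dictionary is expected to list every node as a key.

```python
from collections import deque


def khop_potential(adj, K=3, delta=1.0):
    """Equation (18): hop-layer sizes weighted by 1 / (1 + k**delta)."""
    phi = {}
    for i in adj:
        dist = {i: 0}
        queue = deque([i])
        layer_sizes = [0] * (K + 1)
        layer_sizes[0] = 1  # Assumption: the 0-hop layer is the node itself.
        while queue:
            u = queue.popleft()
            if dist[u] == K:
                continue  # Do not expand beyond K hops.
            for v in adj.get(u, ()):
                if v not in dist:
                    dist[v] = dist[u] + 1
                    layer_sizes[dist[v]] += 1
                    queue.append(v)
        phi[i] = sum(size / (1 + k ** delta) for k, size in enumerate(layer_sizes))
    return phi


# Hypothetical chain a -> b -> c -> d, following out-edges.
succ = {"a": ["b"], "b": ["c"], "c": ["d"], "d": []}
phi_khop = khop_potential(succ, K=2, delta=1.0)
```

Passing predecessor lists instead of successor lists gives the in-direction variant without changing the function.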

Figure 3 illustrates the node potential distributions of the three potential field models in the same directed network. The local potential field does not require a direction distinction because it fuses the bidirectional (in-edge and out-edge) information of nodes. The node potential distributions of the three models differ to some extent, and the distribution trends of the global and k-hop potential fields are relatively close. Figure 4 focuses on the differences between the global and k-hop potential fields in the “out-edge” and “in-edge” directions: for a given potential field, the potential values of the same node in the two directions are clearly distinct, while the overall distribution patterns of the global and k-hop potential fields in the same direction are roughly similar. In the experiments, only links actively initiated by the predicted node are considered ($d_{ij}^{\mathrm{dir}} = d_{ij}^{\mathrm{out}}$).

3.2. Augmented Co-Matrix

The traditional common neighbor (CN) metric merely counts the ‘number of overlapping neighbors’ without distinguishing the varying importance of different neighbors (e.g., core neighbors contribute more significantly to link formation). Therefore, it is necessary to combine node potential fields to adjust the ‘effective contribution’ of neighbors.

\mathrm{PCN} (i, j) = \sqrt{\min (ϕ_{i}, ϕ_{j})} \cdot \sum_{u \in Γ (i) \cap Γ (j)} \sqrt{ϕ_{u}}

(19)

Here, $\sum_{u \in Γ(i) \cap Γ(j)} \sqrt{ϕ_{u}}$ is the potential-weighted sum over common neighbors, which amplifies the contribution of important neighbors to similarity, and $\min(ϕ_{i}, ϕ_{j})$ is the minimum potential field constraint for nodes $i$ and $j$. We employ the minimum operator instead of a product or sum to model the limiting-factor constraint in link formation: the likelihood of a connection is often constrained by the node with the lower potential or activity level. Specifically, we have the following:

Compatibility Constraint: A connection requires some degree of reciprocity. Even if a high-potential hub initiates a link, a low-potential node may lack the capacity to sustain it. The $\min$ function ensures that the baseline similarity is determined by the structural bottleneck of the pair.

Bias Mitigation: Unlike the product $ϕ_{i} \cdot ϕ_{j}$, which excessively amplifies scores between two high-degree nodes (potentially leading to the “Rich-Club” bias and false positives), the $\min$ operator provides a more conservative and robust estimate, preventing node importance from dominating topological evidence. This model preserves local structural information while incorporating global node importance, thereby making the similarity assessment more reasonable.
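The score in Equation (19) can be sketched as follows. The function name `pcn_score`, the toy graph, and the potential values are hypothetical; the potentials are assumed to be already normalized to [0, 1].

```python
import math


def pcn_score(neighbors, phi, i, j):
    """Equation (19): potential-weighted common-neighbour score."""
    common = neighbors[i] & neighbors[j]
    weighted_overlap = sum(math.sqrt(phi[u]) for u in common)
    return math.sqrt(min(phi[i], phi[j])) * weighted_overlap


# Hypothetical graph and normalised potentials in [0, 1].
neighbors = {
    "a": {"c", "d"},
    "b": {"c", "d"},
    "c": {"a", "b"},
    "d": {"a", "b"},
}
phi = {"a": 1.0, "b": 0.25, "c": 0.64, "d": 0.04}

# sqrt(min(1.0, 0.25)) * (sqrt(0.64) + sqrt(0.04)) = 0.5 * (0.8 + 0.2)
score_ab = pcn_score(neighbors, phi, "a", "b")
```

Here the lower-potential endpoint `b` caps the score, while the high-potential common neighbor `c` contributes four times as much as `d`.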

For directed networks, the enhanced common neighbor matrix is redefined: common neighbors are the intersection of node $i$’s out-neighbors and node $j$’s in-neighbors:

\mathrm{PCN} (i, j) = \sqrt{\min (ϕ_{i}, ϕ_{j})} \cdot \sum_{u \in Γ^{\mathrm{out}} (i) \cap Γ^{\mathrm{in}} (j)} \sqrt{ϕ_{u}}

(20)
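The directed variant in Equation (20) differs only in which neighbor sets are intersected. The sketch below uses hypothetical names (`directed_pcn_score`, `succ`, `pred`) and a toy digraph to show that the resulting score is asymmetric.

```python
import math


def directed_pcn_score(succ, pred, phi, i, j):
    """Equation (20): relays u with an edge i -> u and an edge u -> j."""
    relays = set(succ[i]) & set(pred[j])
    return math.sqrt(min(phi[i], phi[j])) * sum(math.sqrt(phi[u]) for u in relays)


# Hypothetical digraph with a single two-step route: a -> c -> b
succ = {"a": ["c"], "b": [], "c": ["b"]}
pred = {"a": [], "b": ["c"], "c": ["a"]}
phi = {"a": 1.0, "b": 1.0, "c": 0.25}

forward = directed_pcn_score(succ, pred, phi, "a", "b")   # relay c exists
backward = directed_pcn_score(succ, pred, phi, "b", "a")  # no relay
```

The pair (a, b) receives a positive score because `c` relays from `a` to `b`, whereas the reverse pair (b, a) scores zero.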

3.3. Feature-Weighted Generalized LP Similarity

Traditional path similarity (LP) assesses node similarity by counting the number of second- and third-order paths between nodes, employing exponential decay weighting. However, this approach overlooks the heterogeneity of nodes within paths. To address this, this study proposes a feature-weighted generalized LP similarity algorithm. Building upon the original path-weighting scheme, it introduces a node-level weighting mechanism. This ensures that path contribution is modulated simultaneously by both path length and the characteristics of relay nodes.

(i).: Fundamental concept

The strength of a path is determined not only by its length but also by the product of the weights of all relay nodes along the path. Consequently, paths involving core nodes exert a greater influence on similarity, thereby more accurately capturing the latent associations between node pairs within the global topology.

(ii).: Mathematical model

Consider an undirected, unweighted graph $G = (V, E)$, where $V$ denotes the set of vertices and $E$ denotes the set of edges. The adjacency matrix of the graph is $A$, where $A_{ij} = 1$ if vertices $i$ and $j$ are connected and $A_{ij} = 0$ otherwise. The traditional LP similarity matrix is defined as

S = A^{2} + α A^{3}

(21)

where $α > 0$ is an adjustable parameter that controls the weight of third-order paths.
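For reference, Equation (21) can be evaluated directly with dense matrix products on a small graph. The helper names `matmul` and `lp_similarity` and the path graph are illustrative; `alpha=0.1` matches the LP setting used later in the experiments.

```python
def matmul(X, Y):
    """Dense matrix product for small lists of lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


def lp_similarity(A, alpha=0.1):
    """Equation (21): S = A^2 + alpha * A^3."""
    A2 = matmul(A, A)
    A3 = matmul(A2, A)
    n = len(A)
    return [[A2[i][j] + alpha * A3[i][j] for j in range(n)] for i in range(n)]


# Hypothetical path graph 0 - 1 - 2 - 3
A = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
S = lp_similarity(A, alpha=0.1)
```

Nodes 0 and 2 share one length-2 path and score 1, while nodes 0 and 3 are joined only by a length-3 path and score `alpha`.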

To incorporate node importance, a diagonal matrix $W = \mathrm{diag}(w)$ of node weights is introduced, where $w_{i}$ is the weight of node $i$, defined by potential fields or other centrality metrics. Recursively defining the weighted strength matrix of paths yields joint modulation by path length and node importance:

First-order path (direct connection): No relay nodes; the weighted strength matrix is the adjacency matrix itself:

P^{(1)} = A

(22)

Second-order path ($i \to w \to j$): The weight $w_{w}$ of relay node $w$ modulates the path strength, with the total weighted strength being the sum of contributions from all relay nodes:

P^{(2)} = A \cdot W \cdot A

(23)

Generally, the weighted strength matrix for an $l$-order path may be recursively defined as

P^{(l)} = P^{(l - 1)} W A = A {(W A)}^{l - 1}

(24)

Ultimately, the feature-weighted generalized LP similarity is a linear combination of the weighted strengths of paths of each order (truncated at a maximum path order of 3 in actual computation):

S_{\mathrm{GLP}} = \sum_{l = 1}^{L} β^{l} P^{(l)} = \sum_{l = 1}^{L} β^{l} A {(W A)}^{l - 1}

(25)
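The recursion in Equations (22) to (25) can be sketched with plain Python lists. The names `matmul` and `glp_similarity` are hypothetical, and `beta=0.5` is an arbitrary illustrative value, since no value of $β$ is fixed at this point in the text.

```python
def matmul(X, Y):
    """Dense matrix product for small lists of lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


def glp_similarity(A, w, beta=0.5, L=3):
    """Equation (25): sum over l = 1..L of beta**l * A (W A)**(l - 1)."""
    n = len(A)
    # W A scales row u of A by w[u], because W is diagonal.
    WA = [[w[u] * A[u][v] for v in range(n)] for u in range(n)]
    P = [row[:] for row in A]  # P^(1) = A, Equation (22)
    S = [[beta * P[i][j] for j in range(n)] for i in range(n)]
    for l in range(2, L + 1):
        P = matmul(P, WA)  # P^(l) = P^(l-1) W A, Equation (24)
        for i in range(n):
            for j in range(n):
                S[i][j] += beta ** l * P[i][j]
    return S


# Hypothetical path graph 0 - 1 - 2 with a down-weighted relay node 1.
A = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
w = [1.0, 0.5, 1.0]
S_glp = glp_similarity(A, w, beta=0.5, L=2)
```

The only route from node 0 to node 2 passes through node 1, so its contribution is $β^{2} \cdot w_{1} = 0.25 \cdot 0.5$.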

(iii).: Weighting function design

The weight vector w constitutes the model’s core innovation and requires flexible design according to varying task requirements. This paper proposes the following alternative approaches:

Inverse Degree Weighting: Suitable for suppressing the excessive influence of high-degree nodes:

w_{i}^{\mathrm{inv\_deg}} = \frac{1}{k_{i}}

(26)

Logarithmic Inverse Degree Weighting: Further smooths degree variations, with enhanced robustness:

w_{i}^{\mathrm{inv\_log\_deg}} = \frac{1}{\log (k_{i} + 1)}

(27)

Potential Field Weighting: Directly reuses the node potential field values, so that path contributions correlate strongly with node importance:

w_{i}^{\mathrm{PCN}} = ϕ_{i}

(28)
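The three weighting options in Equations (26) to (28) can be gathered in one helper. This is a sketch under stated assumptions: the function name `node_weights` is hypothetical, the logarithm is taken to be the natural logarithm, and isolated nodes (degree 0) are given weight 0 to avoid division by zero; the text does not specify either point.

```python
import math


def node_weights(degrees, phi, scheme):
    """Node weights w_i for Equations (26)-(28)."""
    if scheme == "inv_deg":
        # Assumption: isolated nodes (k_i = 0) get weight 0 to avoid 1/0.
        return [1.0 / k if k > 0 else 0.0 for k in degrees]
    if scheme == "inv_log_deg":
        # Assumption: natural logarithm; log(0 + 1) = 0 is guarded as above.
        return [1.0 / math.log(k + 1) if k > 0 else 0.0 for k in degrees]
    if scheme == "PF_w":
        return list(phi)
    raise ValueError(f"unknown weighting scheme: {scheme}")


# Hypothetical degrees and normalised potentials for three nodes.
degrees = [1, 4, 9]
phi = [0.1, 0.5, 1.0]
w_inv = node_weights(degrees, phi, "inv_deg")
w_log = node_weights(degrees, phi, "inv_log_deg")
w_pf = node_weights(degrees, phi, "PF_w")
```

Comparing `w_inv` and `w_log` shows the smoothing effect: moving from degree 1 to degree 9 cuts the inverse-degree weight by a factor of 9, but the logarithmic weight by only about 3.3.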

3.4. Combination Similarity Matrix

The enhanced common neighbor (PCN) focuses on ‘local neighbor overlap’, while the feature-weighted LP concentrates on ‘semi-local multi-length paths’, with both approaches providing complementary information. Through a weighted combination, a balance is achieved between local accuracy and global generalization, thereby enhancing the robustness of link prediction.

S_{\mathrm{final}} = ω \cdot S_{\mathrm{PCN}} + (1 - ω) \cdot S_{\mathrm{GLP}}

(29)

The weight $ω$ can be optimized using the validation set.
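Equation (29) is an element-wise convex combination, as the following sketch shows. The function name `fuse` and the two toy matrices are hypothetical.

```python
def fuse(S_pcn, S_glp, omega):
    """Equation (29): convex combination of the two similarity matrices."""
    if not 0.0 <= omega <= 1.0:
        raise ValueError("omega must lie in [0, 1]")
    return [
        [omega * p + (1.0 - omega) * g for p, g in zip(row_p, row_g)]
        for row_p, row_g in zip(S_pcn, S_glp)
    ]


# Hypothetical 2 x 2 similarity matrices.
S_pcn = [[0.0, 0.8], [0.8, 0.0]]
S_glp = [[0.0, 0.2], [0.2, 0.0]]
S_final = fuse(S_pcn, S_glp, omega=0.1)
```

With `omega=0.1`, the fused off-diagonal score is 0.1 × 0.8 + 0.9 × 0.2 = 0.26, so the GLP term dominates.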

The pseudo-code of the algorithm is presented in Algorithm 1.

Algorithm 1 Computation Framework of AMPS

Input: Adjacency matrix $A$; potential parameters $(σ, η, μ, τ, γ, δ, K)$; fusion weight $ω$; GLP parameters $(L, β)$.

Output: Final similarity matrix $S^{\mathrm{final}}$

1. // Step 1: Calculate Node Potential Field $ϕ$
2. if model == ‘global’ then
3.     Compute distance matrix $D$
4.     $ϕ_{i} \leftarrow \sum_{j \neq i} \exp (- (d_{ij} / σ)^{2})$ for all $i \in V$
5. else if model == ‘local’ then
6.     $ϕ_{i} \leftarrow μ \cdot k_{i} + τ \cdot \sum_{j \in Γ(i)} e^{-1/η} + γ \cdot c_{i}$ for all $i \in V$
7. else if model == ‘k-hop’ then
8.     $ϕ_{i} \leftarrow \sum_{k=0}^{K} |Γ_{k}(i)| \cdot (1 + k^{δ})^{-1}$ for all $i \in V$
9. end if
10. Normalize $ϕ$ to range [0, 1] via Equation (15)
11. // Step 2: Compute Enhanced Common Neighbor (PCN)
12. for each pair $(i, j) \in V \times V$ do
13.     Identify common neighbors: $U_{ij} = Γ(i) \cap Γ(j)$
14.     $Sum_{ϕ} \leftarrow \sum_{u \in U_{ij}} ϕ_{u}$
15.     $S_{ij}^{\mathrm{PCN}} \leftarrow \min(ϕ_{i}, ϕ_{j}) \cdot Sum_{ϕ}$
16. end for
17. // Step 3: Compute Feature-weighted Generalized LP (GLP)
18. Construct weight matrix $W \leftarrow \mathrm{diag}(\mathrm{weighting\_scheme}(ϕ, A))$
19. Initialize $S^{\mathrm{GLP}} \leftarrow 0$, $P^{(1)} \leftarrow A$
20. for $l = 2$ to $L$ do
21.     $P^{(l)} \leftarrow P^{(l-1)} W A$
22.     $S^{\mathrm{GLP}} \leftarrow S^{\mathrm{GLP}} + β^{l-1} P^{(l)}$
23. end for
24. Normalize $S^{\mathrm{GLP}}$ to range [0, 1]
25. // Step 4: Adaptive Fusion
26. $S^{\mathrm{final}} \leftarrow ω \cdot S^{\mathrm{PCN}} + (1 - ω) \cdot S^{\mathrm{GLP}}$
27. return $S^{\mathrm{final}}$
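A compact, runnable sketch of the four steps of Algorithm 1 for an undirected graph is given below. It is not the authors' code: it follows the steps as listed in the algorithm (global potential, PCN, GLP starting from second-order paths, fusion), picks inverse-log-degree weights as one of the listed schemes, and uses an arbitrary `beta=0.5`. All function and variable names are hypothetical.

```python
import math
from collections import deque


def matmul(X, Y):
    """Dense matrix product for small lists of lists."""
    return [[sum(x * y for x, y in zip(r, c)) for c in zip(*Y)] for r in X]


def min_max(M):
    """Min-max normalise a matrix to [0, 1]."""
    flat = [x for row in M for x in row]
    lo, hi = min(flat), max(flat)
    span = (hi - lo) or 1.0  # Assumption: constant input maps to all zeros.
    return [[(x - lo) / span for x in row] for row in M]


def amps(A, sigma=1.0, beta=0.5, L=3, omega=0.1):
    n = len(A)
    nbrs = [{j for j in range(n) if A[i][j]} for i in range(n)]

    # Step 1: global potential via BFS from every node, then normalisation.
    phi = []
    for i in range(n):
        dist, queue = {i: 0}, deque([i])
        while queue:
            u = queue.popleft()
            for v in nbrs[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        phi.append(sum(math.exp(-(d / sigma) ** 2)
                       for j, d in dist.items() if j != i))
    lo, hi = min(phi), max(phi)
    phi = [(p - lo) / ((hi - lo) or 1.0) for p in phi]

    # Step 2: PCN as listed in Algorithm 1.
    S_pcn = [[min(phi[i], phi[j]) * sum(phi[u] for u in nbrs[i] & nbrs[j])
              for j in range(n)] for i in range(n)]

    # Step 3: GLP with inverse-log-degree weights (assumed scheme).
    w = [1.0 / math.log(len(nbrs[u]) + 1) if nbrs[u] else 0.0 for u in range(n)]
    WA = [[w[u] * A[u][v] for v in range(n)] for u in range(n)]
    P = [row[:] for row in A]
    S_glp = [[0.0] * n for _ in range(n)]
    for l in range(2, L + 1):
        P = matmul(P, WA)
        for i in range(n):
            for j in range(n):
                S_glp[i][j] += beta ** (l - 1) * P[i][j]
    S_glp = min_max(S_glp)

    # Step 4: fusion.
    return [[omega * S_pcn[i][j] + (1 - omega) * S_glp[i][j] for j in range(n)]
            for i in range(n)]


# Hypothetical graph with edges 0-1, 1-2, 2-3, 1-3.
A = [
    [0, 1, 0, 0],
    [1, 0, 1, 1],
    [0, 1, 0, 1],
    [0, 1, 1, 0],
]
S_amps = amps(A)
```

In practice, only the entries of `S_amps` for unconnected node pairs would be ranked as candidate links; the dense lists used here are for readability and would be replaced by sparse matrices on larger networks.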

3.5. Time Complexity Analysis

The computational complexity of the proposed AMPS method is analyzed as follows. Let $N$ and $M$ denote the number of nodes and edges, respectively, and let $\langle k \rangle$ denote the average degree of the network. The node potential field computation depends on the chosen model:

Global potential model: Computing all-pairs shortest paths with the Floyd–Warshall algorithm requires $O(N^{3})$ time. For sparse unweighted graphs, we instead run Breadth-First Search (BFS) from each of the $N$ nodes, reducing the complexity to $O(N(N + M))$.
Local potential model: This involves iterating over neighbors to calculate clustering coefficients. For a node $i$ with degree $k_{i}$, this costs $O(k_{i}^{2})$. Summing over all nodes, the total complexity is approximately $O(N \cdot \langle k \rangle^{2})$.
k-hop potential model: This uses a BFS truncated at depth $K$. In the worst case, the search space grows exponentially in $K$ with branching factor $\langle k \rangle$, giving approximately $O(N \cdot \langle k \rangle^{K})$. Since $K$ is typically small (e.g., $K = 2$ or $3$), this remains efficient.

The prediction modules incur the following costs:

Enhanced common neighbor (PCN): This involves computing weighted neighbor overlaps, which is computationally equivalent to sparse matrix multiplication, taking $O(N \cdot \langle k \rangle^{2})$ time.
Feature-weighted generalized path similarity (GLP): This requires iterative sparse matrix multiplications up to path length $L$. Assuming the matrices remain relatively sparse during early iterations, the complexity is approximately $O(L \cdot N \cdot \langle k \rangle^{2})$.
Fusion: The final adaptive combination is a linear operation on the similarity matrices, taking $O(N^{2})$.

Among the prediction modules, the matrix multiplication components dominate, costing approximately $O(L \cdot N \cdot \langle k \rangle^{2})$ for sparse graphs. When the global potential model is used, the BFS stage adds $O(N(N + M))$, and the dense fusion step adds $O(N^{2})$. Overall, the method remains practical for medium-to-large-scale networks.

4. Experiments and Discussion

In this section, we present the evaluation results of the proposed method on real-world datasets and compare it with the prediction algorithms introduced in Section 2. We also compare the different potential field models and report experiments on optimizing the fusion weight of the composite similarity matrix.

4.1. Datasets

To evaluate the effectiveness of the proposed method in link prediction tasks, this study selected 21 real networks (12 undirected and 9 directed) with distinct topological characteristics as the experimental datasets. Key topological properties of the experimental datasets are tabulated in Table 2.

(i) Undirected networks

(1): KA (Karate) [40]: A network about the social connections of a karate club’s members.
(2): Polbooks [41]: A network of US politics-related books compiled by V. Krebs (Valdis Krebs).
(3): JZ (Jazz) [42]: A network of connections among jazz musicians.
(4): USAir [8]: A network of the US air transportation system, where nodes typically represent US airports and edges represent air routes between them.
(5): Infect [43]: A human contact network where nodes stand for humans and edges between nodes represent physical-world proximity.
(6): CE (C. elegans) [44]: A metabolic network of Caenorhabditis elegans, represented by a list of edges that denote connections in the organism’s metabolic processes.
(7): Food [45]: A network of 620 official blue-tick food-related Facebook pages with links representing their associations.
(8): Email [46]: A network of email communication at the University Rovira i Virgili (Tarragona, southern Catalonia, Spain), where nodes represent individual users and edges indicate that at least one email was sent.
(9): PB [47]: A network that captures hyperlink connections between US politics-themed weblogs.
(10): PPI [48]: A network where nodes represent proteins and edges represent the interaction relationships between different proteins.
(11): Wiki [49]: A network of Wiki links within Wikipedia, where nodes represent individual articles, and each directed edge denotes a single Wiki link.
(12): Openflights [50]: A network of routes between airports worldwide.

(ii) Directed networks

(1): Chess [51]: A network of chess games. Nodes represent chess players, and directed edges represent games; an outgoing edge corresponds to the player using the white pieces, while an incoming edge corresponds to the player using the black pieces.
(2): Highschool [52]: A directed network describing the friendship relationships among male students at a high school in Illinois, USA.
(3): Kohonen [53]: A citation network of papers on the topic of self-organizing maps or referencing Kohonen T.
(4): Physicians [54]: A directed network describing the spread of innovative ideas among 246 physicians across four towns.
(5): Residence [41]: A friendship network comprising 217 residents of the Australian National University dormitory area.
(6): FWFD (Food Web of Florida Bay in dry season) [55]: A dry-season food web in a south Florida cypress wetland.
(7): Wiki-Vote [56]: A social network based on election participation on Wikipedia. Users are treated as nodes, and voting behavior corresponds to directed edges.
(8): Polblogs [47]: The hyperlink network among US political blogs.
(9): Adolescent [57]: A friendship network among students, constructed based on a survey conducted from 1994 to 1995.

4.2. Evaluation Metric

To verify the effectiveness of the proposed algorithm, a standard link prediction evaluation framework was adopted. The real edge set $E$ of the original network is divided into a training set $E^{tr}$ and a test set $E^{te}$, which are mutually exclusive ($E^{tr} \cap E^{te} = \emptyset$) and complete ($E^{tr} \cup E^{te} = E$). $E^{tr}$ is used for algorithm training, and $E^{te}$ is used for performance verification. The set of edges that do not exist in the network is defined as $E^{n} = U - E$, where $U$ is the set of all possible edges in the network.
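The partition described above can be sketched as follows for an undirected simple graph. The function names `split_edges` and `non_edges`, the fixed random seed, and the toy edge list are assumptions made for illustration.

```python
import random


def split_edges(edges, test_fraction=0.1, seed=0):
    """Randomly partition the edge set into disjoint training and test sets."""
    rng = random.Random(seed)
    shuffled = list(edges)
    rng.shuffle(shuffled)
    n_test = max(1, round(test_fraction * len(shuffled)))
    return shuffled[n_test:], shuffled[:n_test]


def non_edges(nodes, edges):
    """All node pairs without an edge, for an undirected simple graph."""
    existing = {frozenset(e) for e in edges}
    nodes = sorted(nodes)
    return [
        (u, v)
        for idx, u in enumerate(nodes)
        for v in nodes[idx + 1:]
        if frozenset((u, v)) not in existing
    ]


# Hypothetical graph with 6 nodes and 10 edges.
edges = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3),
         (3, 4), (4, 5), (2, 5), (0, 5), (1, 4)]
E_tr, E_te = split_edges(edges, test_fraction=0.1)
E_n = non_edges(range(6), edges)
```

Similarity scores are then computed from `E_tr` only, and the scores of pairs in `E_te` are compared with those of pairs in `E_n`.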

The algorithm calculates the similarity scores of all node pairs using only the training set $E^{tr}$. Performance is evaluated by comparing the scores of the missing edges in $E^{te}$ with those of the non-existent edges in $E^{n}$. Two tools are used for quantification: the AUC (Area Under the Receiver Operating Characteristic Curve) index and the ROC (Receiver Operating Characteristic) curve itself. AUC is the probability that a randomly selected missing edge has a higher predicted score than a randomly selected non-existent edge.

The specific calculation process is as follows: conduct $n$ independent comparisons, each time randomly selecting an edge $e^{te}$ from $E^{te}$ and an edge $e^{n}$ from $E^{n}$ and comparing their scores. Two counts are recorded: $n_{1}$ is the number of times that the score of $e^{te}$ is higher than that of $e^{n}$, and $n_{2}$ is the number of times that the two scores are equal. The AUC is calculated according to the following formula [58]:

\mathrm{AUC} = \frac{n_{1} + 0.5 \times n_{2}}{n}

(30)
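The sampling estimate in Equation (30) can be sketched as below, counting a win when the test edge scores higher and a tie with weight 0.5, as in the standard form of this estimator. The function name `sampled_auc`, the default number of comparisons, and the seed are assumptions.

```python
import random


def sampled_auc(scores_test, scores_non, n=10000, seed=0):
    """Equation (30) with n1 = wins and n2 = ties of the test edge."""
    rng = random.Random(seed)
    n1 = n2 = 0
    for _ in range(n):
        s_te = rng.choice(scores_test)
        s_n = rng.choice(scores_non)
        if s_te > s_n:
            n1 += 1
        elif s_te == s_n:
            n2 += 1
    return (n1 + 0.5 * n2) / n


# Hypothetical scores: perfectly separated, and completely tied.
auc_perfect = sampled_auc([0.9, 0.8], [0.1, 0.2])
auc_tied = sampled_auc([0.5], [0.5])
```

A perfectly separating scorer yields 1.0, and a scorer that cannot distinguish the two sets yields 0.5, matching the interpretation of the metric.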

When AUC = 0.5, the algorithm’s predictive performance is indistinguishable from random guessing, indicating it cannot differentiate between positive and negative samples. When AUC approaches 1, the algorithm consistently ranks positive samples above negative ones, achieving optimal predictive performance. In link prediction experiments, a higher AUC value signifies the algorithm’s enhanced ability to capture potential connection relationships between nodes.

4.3. Experimental Results

As previously mentioned, a standardized link prediction experimental framework was employed to evaluate the proposed AMPS algorithm against seven comparison algorithms (CN, PA, LP, ACT, NSim, SAC, and GSIM). The true edge set $E$ of each network was divided into a training set $E^{tr}$ and a test set $E^{te}$ through independent random partitioning, with the training set accounting for 90% and the test set for 10%. Additionally, the set of non-existent edges $E^{n} = U - E$ was used as negative samples for performance evaluation. To mitigate random partitioning bias, the experiment was independently repeated 50 times, and the final evaluation metric is the average AUC value across all 50 runs.

Key parameters for each algorithm use domain-standard or optimized configurations. For the comparison algorithms, the path weight $α$ in LP was set to 0.1; the weights $γ$ and $δ$ in NSim were both set to 0.4; SAC used degree centrality as its core parameter; and the weights $α$ and $β$ in GSIM were set to 1.8 and 0.2, respectively. The proposed AMPS algorithm used the best-performing ‘GPF + inv_log_deg’ configuration, where the global potential field model parameter was $σ = 1.0$; the local potential field model parameters were $μ = 0.5$, $τ = 0.2$, $γ = 0.3$, and $η = 1.0$; the maximum hop count for the k-hop potential field model was $K = 3$ with distance decay coefficient $δ = 1.0$; the maximum path order was set to 3; and the fusion weight ratio between PCN and GLP was set to 1:9. Table 3 reports the average AUC values over 50 experiments for the different methods. The proposed AMPS model achieves the highest AUC values on all 12 datasets, outperforming all other benchmark methods.

To visually demonstrate the performance differences between the proposed AMPS algorithm and various comparison algorithms in link prediction tasks across different network datasets, we plotted ROC curves, as shown in Figure 5. It can be observed that the AMPS algorithm achieves optimal ROC curves across all 12 datasets, validating its universality across diverse network topologies and scales.

Because the Karate dataset has a simple topology and limited data volume, it is difficult to distinguish fine-grained performance differences among algorithm configurations on it. Therefore, Table 4 reports the AUC values on the remaining 11 real-world network datasets for the nine combinations of three potential field models (GPF, LPF, and kPF) and three weighting schemes (inv_deg, inv_log_deg, and PF_w) within the AMPS algorithm. For PF_w, the weights are taken from the global potential field, which gave better results than the other two fields. This comparison quantifies the impact of different configurations on link prediction accuracy. The GPF + inv_log_deg combination achieves the best performance, reaching the highest AUC of 0.991 on the Openflights dataset. In contrast, PF_w-based combinations (e.g., GPF + PF_w and LPF + PF_w) exhibit lower AUC values across most datasets, suggesting that inverse logarithmic degree weighting is the better choice of node weight.

To verify the effectiveness of the proposed algorithm in directed networks, this study first compared the performance of the three potential field models. Experimental results show that their performance is generally comparable across different datasets. Therefore, the global potential field model was selected as the basic framework for subsequent comparative experiments, and the specific results are presented in Table 5. The proposed algorithm ranks first in terms of AUC on seven out of nine datasets. On the ‘Residence’ dataset, AMPS achieves a performance nearly identical to the best baseline (0.852 vs. 0.853). On the ‘Kohonen’ dataset, we acknowledge a noticeable performance gap compared to the motif-based Bifan algorithm (0.725 vs. 0.890), although AMPS still outperforms standard local metrics like DCN and DAA. This suggests that specific directed motifs may play a dominant role in citation networks like Kohonen, and that these motifs are only partially captured by the potential field properties used in AMPS. Overall, these results show that the proposed algorithm is competitive in directed network link prediction, ranking first on most of the tested datasets.

4.4. Parameter Sensitivity and Robustness Analysis

To determine the optimal weight allocation for the composite similarity matrix, we conducted a grid search on the weight parameter $ω$ (controlling the contribution of the PCN module) across all real-world datasets. The results are shown in Figure 6. The parameter $ω$ was varied from 0.1 to 0.9 at intervals of 0.1, covering a broad range of weight allocations between the two modules. The results show that when $ω = 0.1$, the model achieves superior performance across diverse network topologies. Taking the 8:2 training-to-test split as an example, the 1:9 PCN-GLP ratio proves to be the best configuration, achieving peak AUC on multiple datasets, including Jazz (0.967), USAir (0.939), CE (0.943), and Food (0.904). On the CE dataset in particular, the 1:9 ratio delivers an AUC of 0.943, a relative improvement of approximately 2% over the PCN-only configuration (0.925). Even on datasets such as the PPI network, where the 8:2 PCN-GLP ratio reaches an identical peak AUC of 0.939, the 1:9 ratio matches this level. This weight allocation supports our core design concept: the GLP module captures the network’s global structural backbone and is the primary determinant of link prediction accuracy (it receives 90% of the fusion weight), while the PCN module acts as a fine-tuning mechanism that refines local structural information with a 10% weighting. This ratio limits the noise and bias associated with over-reliance on common neighbors, balancing global pattern dominance and local information refinement.
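The grid search over the fusion weight can be sketched as a simple loop. The function `grid_search_omega` is hypothetical, and `toy_auc` is a made-up stand-in for the real procedure of fusing the two similarity matrices with a given weight and measuring AUC on held-out edges; its values are not experimental results.

```python
def grid_search_omega(evaluate, grid=None):
    """Return (best_omega, best_auc) over the candidate fusion weights."""
    if grid is None:
        grid = [round(0.1 * step, 1) for step in range(1, 10)]  # 0.1 ... 0.9
    results = {omega: evaluate(omega) for omega in grid}
    best = max(results, key=results.get)
    return best, results[best]


# Hypothetical stand-in for "fuse with omega, then measure validation AUC".
def toy_auc(omega):
    return 0.95 - 0.05 * (omega - 0.1) ** 2


best_omega, best_auc = grid_search_omega(toy_auc)
```

In a real run, `evaluate` would be averaged over repeated random splits so that the selected weight does not depend on a single partition.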

To evaluate the robustness and stability of algorithms under varying training set sizes, we designed experiments examining the relationship between training set proportion and algorithm performance. As shown in Figure 7, the experiments investigated the variation patterns of AUC performance (averaged over 50 trials) for each algorithm as the training set proportion changed within the range [0.5, 0.9].

Quantitative analysis at the optimal fusion ratio ($ω = 0.1$) shows that the AMPS algorithm is robust to the training set proportion: across the tested proportions, its AUC values are consistently higher than those of all comparison algorithms. Specifically, we see the following:

On the USAir dataset, AMPS maintains a high baseline of 0.9205 even with only 50% training data. As data availability increases to 90%, the performance steadily climbs to 0.9455, demonstrating a clear data-driven gain.

On the CE dataset, the improvement is even more pronounced, surging from 0.8667 (50% training) to 0.9450 (90% training), verifying the model’s capacity to learn from denser structures.

On the JZ dataset, the model achieves a remarkable AUC of 0.9661 at the 90% training scale, up from 0.9425 at the 50% level.

While the Karate dataset shows some variability due to its smaller scale, on all other datasets AMPS exhibits a sustained and stable upward trend as the training set proportion increases. This behavior supports the design of the AMPS model and its PCN-GLP fusion mechanism: it captures key topological features via local potentials when training samples are limited (mitigating the cold-start problem in sparse networks) and refines the exploration of global associations when training data is abundant. As a result, the model maintains stable performance across the tested training scales.

5. Conclusions

Link prediction is one of the most important and challenging tasks in complex network analysis. The goal of a link prediction algorithm is to estimate the likelihood of the existence of missing or future links based on the currently observed network topology. In this paper, we have designed a novel link prediction method, named AMPS, that significantly improves the prediction accuracy compared to existing state-of-the-art methods. The proposed method is based on a multi-scale similarity framework that incorporates the global and local importance of nodes, quantified by potential field models, into both neighborhood overlap and path-based similarity indices. Unlike traditional methods that rely on singular topological features, our approach adaptively fuses the structural diversity of a node’s neighborhood with the topological cohesion between nodes. The extensive experiments presented in this paper on 21 real-world network datasets demonstrate that the proposed method achieves a higher accuracy, measured by AUC, compared to other popular benchmark methods, and exhibits strong robustness across different training set ratios. There are several directions in which the work reported here can be extended. From a theoretical perspective, it would be interesting to explore other node centrality measures and fusion strategies to further enhance prediction performance. From an application perspective, it would be interesting to explore the applications of the proposed link prediction method in other domains such as personalized recommendation and biological interaction inference.

Author Contributions

Conceptualization, X.Q.; methodology, X.Q.; software, X.Q.; validation, M.Z. and J.T.; formal analysis, X.Q.; investigation, X.Q.; resources, X.Q. and M.Z.; data curation, S.L.; writing—original draft preparation, X.Q. and S.L.; writing—review and editing, J.T. and Y.R.; visualization, X.Q.; supervision, M.Z.; project administration, Y.R.; funding acquisition, Y.R. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China, grant numbers 72101265 and 72401286.

Data Availability Statement

All data analyzed during this study are included in this published article.

Conflicts of Interest

The authors declare no conflicts of interest.

References

Tu, H.; Wang, Y.; Zhang, Y.; Wang, X.; Liu, W. A Spectrally Discretized Wide-Angle Parabolic Equation Model for Simulating Acoustic Propagation in Laterally Inhomogeneous Oceans. J. Acoust. Soc. Am. 2023, 153, 3334. [Google Scholar] [CrossRef] [PubMed]
Yilmaz, E.A.; Balcisoy, S.; Bozkaya, B. A Link Prediction-Based Recommendation System Using Transactional Data. Sci. Rep. 2023, 13, 6905. [Google Scholar] [CrossRef] [PubMed]
Dileo, M.; Zignani, M.; Gaito, S. Temporal Graph Learning for Dynamic Link Prediction with Text in Online Social Networks. Mach. Learn. 2024, 113, 2207–2226. [Google Scholar] [CrossRef]
Liu, J.; Qiu, B.; Du, P.; Zhao, X.; Zhu, J. A Novel Probabilistic Connectivity Network Link Prediction Model for Natural Gas Price Based on an Improved K-Shell Algorithm. Phys. A Stat. Mech. Its Appl. 2025, 671, 130672. [Google Scholar] [CrossRef]
Zhou, F.; Lü, L.; Liu, J.; Mariani, M.S. Beyond network centrality: Individual-level behavioral traits for predicting information superspreaders in social media. Natl. Sci. Rev. 2024, 11, nwae073. [Google Scholar] [CrossRef]
Behrouzi, S.; Shafaeipour Sarmoor, Z.; Hajsadeghi, K.; Kavousi, K. Predicting Scientific Research Trends Based on Link Prediction in Keyword Networks. J. Informetr. 2020, 14, 101079. [Google Scholar] [CrossRef]
Tu, H.; Wang, Y.; Zhou, X.; Xu, G.; Gao, D.; Ma, S. Application of a Spectral Scheme for Simulating Slowly Horizontally Varying Three-Dimensional Ocean Acoustic Propagation. Ocean. Eng. 2026, 343, 123035. [Google Scholar] [CrossRef]
Newman, M.E.J. Clustering and Preferential Attachment in Growing Networks. Phys. Rev. E 2001, 64, 025102. [Google Scholar] [CrossRef]
Adamic, L.A.; Adar, E. Friends and Neighbors on the Web. Soc. Netw. 2003, 25, 211–230. [Google Scholar] [CrossRef]
Barabasi, A.-L.; Albert, R. Emergence of Scaling in Random Networks. Science 1999, 286, 509–512. [Google Scholar] [CrossRef]
Zhou, T.; Lü, L.; Zhang, Y.-C. Predicting Missing Links via Local Information. Eur. Phys. J. B 2009, 71, 623–630. [Google Scholar] [CrossRef]
Katz, L. A New Status Index Derived from Sociometric Analysis. Psychometrika 1953, 18, 39–43. [Google Scholar] [CrossRef]
Klein, D.J.; Randi, M. Resistance Distance. J. Math. Chem. 1993, 12, 81–95. [Google Scholar] [CrossRef]
Mishra, S.; Singh, S.S.; Kumar, A.; Biswas, B. MNERLP-MUL: Merged Node and Edge Relevance Based Link Prediction in Multiplex Networks. J. Comput. Sci. 2022, 60, 101606.
Rai, A.K.; Tripathi, S.P.; Yadav, R.K. A Novel Similarity-Based Parameterized Method for Link Prediction. Chaos Solitons Fractals 2023, 175, 114046.
Kong, Z.; Zhai, S.; Wang, L.; Guo, G. A General Link Prediction Method Based on Path Node Information and Source Node Information. Inf. Sci. 2025, 709, 122051.
Yao, Y.; Ti, Z.; Xu, Z.; He, Y.; Liu, Z.; Liu, W.; He, X.; Nian, F.; Tang, J. CICN: Higher-Order Link Prediction with Clustering Mutual Information of Common Neighbors. J. Comput. Sci. 2025, 85, 102513.
Yang, R.; Liu, B.; Lü, L. Simplicial Motif Predictor Method for Higher-Order Link Prediction. Expert Syst. Appl. 2025, 269, 126284.
Liu, Z.; Yao, Y.; Xu, Z. Rb-Based: Link Prediction Based on the Resource Broadcast of Nodes for Complex Networks. Evol. Intel. 2024, 17, 3793–3813.
Clauset, A.; Moore, C.; Newman, M.E.J. Hierarchical Structure and the Prediction of Missing Links in Networks. Nature 2008, 453, 98–101.
Anderson, C.J.; Wasserman, S.; Faust, K. Building Stochastic Blockmodels. Soc. Netw. 1992, 14, 137–161.
Pan, L.; Zhou, T.; Lü, L.; Hu, C.-K. Predicting Missing Links and Identifying Spurious Links via Likelihood Analysis. Sci. Rep. 2016, 6, 22955.
Perozzi, B.; Al-Rfou, R.; Skiena, S. DeepWalk: Online Learning of Social Representations. In Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, New York, NY, USA, 24–27 August 2014; pp. 701–710.
Grover, A.; Leskovec, J. Node2vec: Scalable Feature Learning for Networks. Knowl. Discov. Databases 2016, 2016, 855–864.
Zhang, M.; Chen, Y. Link Prediction Based on Graph Neural Networks. In Proceedings of the Advances in Neural Information Processing Systems, Montreal, QC, Canada, 3–8 December 2018; Curran Associates, Inc.: Nice, France, 2018; Volume 31.
Karami, R.; Vahidipour, S.M.; Rezvanian, A. SEAL+: A Subgraph-Enhanced Framework for Link Prediction with Graph Neural Networks. J. Ind. Inf. Integr. 2025, 44, 100802.
Ruan, Y.; Liu, S.; Tang, J.; Guo, Y.; Yu, T. GLC: A Dual-Perspective Approach for Identifying Influential Nodes in Complex Networks. Expert Syst. Appl. 2025, 268, 126292.
Wang, M.; Lou, X.; Cui, B. A Degree-Related and Link Clustering Coefficient Approach for Link Prediction in Complex Networks. Eur. Phys. J. B 2021, 94, 33.
Liu, Y.; Liu, S.; Yu, F.; Yang, X. Link Prediction Algorithm Based on the Initial Information Contribution of Nodes. Inf. Sci. 2022, 608, 1591–1616.
Aziz, F.; Gul, H.; Muhammad, I.; Uddin, I. Link Prediction Using Node Information on Local Paths. Phys. A Stat. Mech. Its Appl. 2020, 557, 124980.
Wu, Z.; Lin, Y.; Wang, J.; Gregory, S. Link Prediction with Node Clustering Coefficient. Phys. A Stat. Mech. Its Appl. 2016, 452, 1–8.
Hu, J.; Han, Y.; Hu, J. Topological Potential: Modeling Node Importance with Activity and Local Effect in Complex Networks. In Proceedings of the 2010 Second International Conference on Computer Modeling and Simulation, Cambridge, UK, 15–16 May 2010; Volume 2, pp. 411–415.
Du, Z.; Tang, J.; Qi, Y.; Wang, Y.; Han, C.; Yang, Y. Identifying Critical Nodes in Metro Network Considering Topological Potential: A Case Study in Shenzhen City—China. Phys. A Stat. Mech. Its Appl. 2020, 539, 122926.
Zhang, X.; Wang, Z.; Liu, G.; Wang, Y. Key Node Identification in Social Networks Based on Topological Potential Model. Comput. Commun. 2024, 213, 158–168.
Feng, J.; Chen, Y.; Li, F.; Sarkar, A.; Zhang, M. How Powerful Are K-Hop Message Passing Graph Neural Networks. arXiv 2023, arXiv:2205.13328.
Li, T.; Zhang, R.; Niu, B.; Yao, Y.; Ma, J.; Jiang, J.; Zhao, Z. Link Prediction Based on Local Structure and Node Information Along Local Paths. Comput. J. 2024, 67, 45–56.
Zhang, X.; Zhao, C.; Wang, X.; Yi, D. Identifying Missing and Spurious Interactions in Directed Networks. Int. J. Distrib. Sens. Netw. 2015, 11, 507386.
Zhang, Q.-M.; Lü, L.; Wang, W.-Q.; Yu, X.; Zhou, T. Potential Theory for Directed Networks. PLoS ONE 2013, 8, e55437.
Nandini, Y.V.; Lakshmi, T.J.; Enduri, M.K.; Sharma, H. Link Prediction in Complex Networks Using Average Centrality-Based Similarity Score. Entropy 2024, 26, 433.
Zachary, W.W. An Information Flow Model for Conflict and Fission in Small Groups. J. Anthropol. Res. 1977, 33, 452–473.
Kunegis, J. KONECT: The Koblenz Network Collection. In Proceedings of the 22nd International Conference on World Wide Web, Rio de Janeiro, Brazil, 13–17 May 2013; ACM: Rio de Janeiro, Brazil, 2013; pp. 1343–1350.
Gleiser, P.; Danon, L. Community Structure in Jazz. Advs. Complex Syst. 2003, 6, 565–573.
Isella, L.; Stehlé, J.; Barrat, A.; Cattuto, C.; Pinton, J.-F.; Van den Broeck, W. What’s in a Crowd? Analysis of Face-to-Face Behavioral Networks. J. Theor. Biol. 2011, 271, 166–180.
Duch, J.; Arenas, A. Community Detection in Complex Networks Using Extremal Optimization. Phys. Rev. E 2005, 72, 027104.
Rossi, R.; Ahmed, N. The Network Data Repository with Interactive Graph Analytics and Visualization. AAAI 2015, 29, 9277.
Guimera, R.; Danon, L.; Diaz-Guilera, A.; Giralt, F.; Arenas, A. Self-Similar Community Structure in Organisations. Phys. Rev. E 2003, 68, 065103.
Adamic, L.; Glance, N. The Political Blogosphere and the 2004 U.S. Election: Divided They Blog. In Proceedings of the 3rd International Workshop on Link Discovery, Chicago, IL, USA, 21–25 August 2005.
Bu, D. Topological Structure Analysis of the Protein-Protein Interaction Network in Budding Yeast. Nucleic Acids Res. 2003, 31, 2443–2450.
Qiao, H. Fengduqianhe/GraphEmbedding-Master 2025. Available online: https://github.com/fengduqianhe/GraphEmbedding-master (accessed on 18 October 2025).
Openflights|Infrastructure Networks|Network Data Repository. Available online: https://networkrepository.com/inf-openflights.php (accessed on 18 October 2025).
Chess. Available online: http://www.konect.cc/networks/chess/ (accessed on 18 October 2025).
Introduction to Mathematical Sociology|Princeton University Press. Available online: https://press.princeton.edu/books/hardcover/9780691145495/introduction-to-mathematical-sociology (accessed on 19 November 2025).
Batagelj, V.; Mrvar, A. Pajek—Analysis and Visualization of Large Networks. In Proceedings of the International Symposium on Graph Drawing, New York, NY, USA, 29 September–2 October 2004.
Coleman, J.; Katz, E.; Menzel, H. The Diffusion of an Innovation Among Physicians. Sociometry 1957, 20, 253.
Michalski, R.; Palus, S.; Kazienko, P. Matching Organizational Structure and Social Network Extracted from Email Communication. In Business Information Systems; Abramowicz, W., Ed.; Lecture Notes in Business Information Processing; Springer: Berlin/Heidelberg, Germany, 2011; Volume 87, pp. 197–206. ISBN 978-3-642-21829-3.
Leskovec, J.; Huttenlocher, D.; Kleinberg, J. Predicting Positive and Negative Links in Online Social Networks. arXiv 2010, arXiv:1003.2429.
Moody, J. Peer Influence Groups: Identifying Dense Clusters in Large Networks. Soc. Netw. 2001, 23, 261–283.
Lü, L.; Zhou, T. Link Prediction in Complex Networks: A Survey. Phys. A Stat. Mech. Its Appl. 2011, 390, 1150–1170.

Figure 1. The algorithm framework flow of AMPS.

Figure 2. Quantitative distribution of node influence in three potential field models.

Figure 3. Distribution comparison of global, local, and k-hop potential fields in directed networks.

Figure 4. Out/In-direction differences of potential fields in directed networks.

Figure 5. Comparison of ROC curves of various link prediction algorithms on different network datasets.

Figure 6. AUC performance comparison of PCN and GLP weight allocation across different datasets.

Figure 7. Comparison of the AUC robustness of various link prediction algorithms under different training set ratios.

Table 1. Summary of key notations and definitions.

Symbol	Definition
$V$	The node set of the target network
$E$	The edge set of the target network
$N$	The number of nodes
$M$	The number of edges
$A$	Adjacency matrix of the network
$k_{x}$	Degree of node $x$
$Γ(i)$	The set of neighbors of node $i$
$c_{x}$	Clustering coefficient of node $x$
$d_{i j}$	Shortest path distance between node $i$ and $j$
$ϕ_{i}$	The potential field value (importance) of node $i$
$σ$	Gaussian kernel width parameter
$η$	Decay parameter for neighbor distance
$μ, τ, γ$	Weights for degree, clustering, and neighbors
$δ$	Distance decay coefficient
$K$	Maximum hop count
$L$	Maximum path length
$α, β$	Adjustable parameters that control the path weights
$W$	Diagonal weight matrix
$ω$	The weight of PCN
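
To make the roles of $ϕ_{i}$, $σ$, $d_{i j}$, and $K$ concrete, the following is a minimal Python sketch of a Gaussian-kernel topological potential. It assumes the common textbook form ϕ_i = Σ_{j≠i} exp(-(d_ij/σ)^2) with unit node mass, optionally truncated at K hops. This illustrates the notation only and is not the exact global, local, or k-hop definition used by AMPS; the function names and the toy graph are hypothetical.

```python
import math
from collections import deque

def bfs_distances(adj, source, max_hops=None):
    """Hop distances d_ij from source; stop expanding at max_hops (K) if given."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if max_hops is not None and dist[u] >= max_hops:
            continue  # do not look beyond K hops
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def gaussian_potential(adj, sigma=1.0, max_hops=None):
    """Assumed form: phi_i = sum over j != i of exp(-(d_ij / sigma)^2)."""
    phi = {}
    for i in adj:
        dist = bfs_distances(adj, i, max_hops)
        phi[i] = sum(math.exp(-(d / sigma) ** 2)
                     for j, d in dist.items() if j != i)
    return phi

# Toy path graph 0 - 1 - 2 (hypothetical data).
adj = {0: [1], 1: [0, 2], 2: [1]}
phi_all = gaussian_potential(adj, sigma=1.0)              # all reachable nodes
phi_1hop = gaussian_potential(adj, sigma=1.0, max_hops=1) # direct neighbors only
```

In this toy graph the middle node receives the largest potential because both other nodes are one hop away, while the end nodes see one neighbor at distance 1 and one at distance 2.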

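Table 2. Topological properties of the datasets: number of nodes (N), number of edges (M), average shortest path length (<d>), average degree (<k>), and average clustering coefficient (<c>).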

	Datasets	N	M	<d>	<k>	<c>
Undirected networks	KA	34	78	2.3374	4.5880	0.5880
	Polbooks	105	441	3.079	8.4	0.488
	Jazz	198	2742	2.235	27.697	0.633
	USAir	332	2126	2.738	12.807	0.749
	Infect-Dublin	410	2765	3.233	13.488	0.467
	CE	453	2025	2.664	8.94	0.665
	Food	620	2103	5.089	6.781	0.418
	Email	1133	5451	3.606	9.622	0.254
	PB	1222	16,714	2.738	27.355	0.36
	PPI	2375	11,693	5.096	9.847	0.388
	Wiki	2424	17,981	3.652	10.612	0.48
	Openflights	2939	15,677	4.097	10.668	0.589
Directed networks	Highschool	70	366	3.969	5.229	0.329
	FWFD	128	2106	2.412	16.453	0.173
	Residence	217	2672	2.765	12.313	0.287
	Physicians	241	1098	3.31	4.556	0.199
	Polblogs	1224	19,025	3.39	15.543	0.21
	Adolescent	2539	12,969	6.277	5.108	0.104
	Kohonen	3772	12,731	3.272	3.375	0.125
	Wiki-Vote	7115	103,689	3.341	14.573	0.081
	Chess	7301	65,053	4.476	8.224	0.101
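
As a quick consistency check on the <k> column, the sketch below recomputes the average degree from N and M for a few rows. It assumes <k> = 2M/N for undirected networks and <k> = M/N for directed networks; this relation is an assumption about how the column was computed, not something the table states, although it reproduces the listed values for the rows shown. Highschool is treated as directed here, in line with its evaluation in Table 5. The function name is hypothetical.

```python
def average_degree(n_nodes, n_edges, directed):
    """Assumed convention: 2M/N for undirected graphs, M/N for directed graphs."""
    return n_edges / n_nodes if directed else 2 * n_edges / n_nodes

# (name, N, M, directed, <k> as listed in Table 2)
rows = [
    ("Polbooks",   105,  441, False,  8.4),
    ("Jazz",       198, 2742, False, 27.697),
    ("Highschool",  70,  366, True,   5.229),
    ("FWFD",       128, 2106, True,  16.453),
]

for name, n, m, directed, listed in rows:
    computed = round(average_degree(n, m, directed), 3)
    print(f"{name}: computed {computed}, listed {listed}")
```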

Table 3. Comparison of AUC values of AMPS and the baseline link prediction algorithms on the undirected network datasets.

Dataset/Algorithm	CN	PA	LP	ACT	NSim	SAC	GSIM	AMPS
KA	0.754	0.733	0.765	0.607	0.737	0.665	0.844	0.869
Polbooks	0.899	0.669	0.838	0.716	0.913	0.863	0.902	0.942
Jazz	0.951	0.773	0.843	0.789	0.957	0.925	0.876	0.969
USAir	0.956	0.913	0.898	0.902	0.970	0.922	0.975	0.981
Infect	0.945	0.709	0.904	0.803	0.961	0.890	0.957	0.979
CE	0.921	0.826	0.816	0.757	0.951	0.848	0.831	0.966
Food	0.909	0.838	0.911	0.907	0.964	0.892	0.926	0.966
Email	0.858	0.808	0.872	0.808	0.917	0.852	0.905	0.929
PB	0.927	0.911	0.924	0.895	0.931	0.927	0.895	0.948
PPI	0.916	0.861	0.944	0.905	0.972	0.889	0.963	0.974
Wiki	0.914	0.817	0.890	0.807	0.946	0.871	0.862	0.954
Openflights	0.962	0.921	0.930	0.914	0.984	0.942	0.961	0.991

Bold values in the original table indicate the best performance for each dataset; AMPS attains the highest AUC on all 12 datasets.
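
For readers unfamiliar with how AUC values like these are typically obtained in link prediction, the following Python sketch uses the standard sampling estimate AUC = (n' + 0.5 n'') / n, where n' counts comparisons in which a held-out edge scores higher than a nonexistent edge and n'' counts ties. It is a sketch under stated assumptions: the scorer is plain common neighbors (CN), the toy graph and all names are hypothetical, and it is not presented as the evaluation code behind the table.

```python
import random

def sampled_auc(score, missing_edges, nonexistent_edges, n_samples=10000, seed=0):
    """Estimate AUC as (n_higher + 0.5 * n_equal) / n_samples."""
    rng = random.Random(seed)
    n_higher = n_equal = 0
    for _ in range(n_samples):
        s_missing = score(*rng.choice(missing_edges))     # held-out true edge
        s_absent = score(*rng.choice(nonexistent_edges))  # pair with no edge
        if s_missing > s_absent:
            n_higher += 1
        elif s_missing == s_absent:
            n_equal += 1
    return (n_higher + 0.5 * n_equal) / n_samples

# Toy training graph (hypothetical), scored with common neighbors (CN).
train_adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}, 4: set()}

def cn_score(u, v):
    return len(train_adj[u] & train_adj[v])

missing = [(1, 3)]              # held-out edge: one common neighbor (node 2)
nonexistent = [(0, 4), (3, 4)]  # non-edges: zero common neighbors
auc = sampled_auc(cn_score, missing, nonexistent, n_samples=1000)
```

A scorer that cannot separate the two sets (for example, a constant score) yields 0.5 under this estimate, which is the usual random-guess baseline.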

Table 4. Performance comparison of AMPS model configurations.

Configuration/Dataset	Polbooks	Jazz	USAir	Infect	CE	Food	Email	PB	PPI	Wiki
GPF + inv_deg	0.913	0.964	0.976	0.969	0.961	0.955	0.922	0.941	0.974	0.956
GPF + inv_log_deg	0.942	0.969	0.981	0.979	0.966	0.966	0.929	0.948	0.971	0.954
GPF + PF_w	0.908	0.906	0.924	0.950	0.791	0.932	0.905	0.924	0.964	0.926
LPF + inv_deg	0.904	0.962	0.972	0.969	0.954	0.956	0.921	0.941	0.973	0.953
LPF + inv_log_deg	0.919	0.960	0.975	0.970	0.961	0.952	0.921	0.947	0.974	0.956
LPF + PF_w	0.904	0.904	0.920	0.957	0.807	0.937	0.911	0.927	0.962	0.922
kPF + inv_deg	0.945	0.971	0.976	0.966	0.962	0.957	0.924	0.942	0.972	0.952
kPF + inv_log_deg	0.910	0.965	0.974	0.968	0.963	0.960	0.925	0.946	0.972	0.953
kPF + PF_w	0.924	0.922	0.948	0.952	0.909	0.944	0.917	0.941	0.970	0.937

Bold values in the original table indicate the best performance for each dataset.

Table 5. Comparison of AUC performance of different link prediction algorithms on directed network datasets.

Dataset/Algorithm	Bifan	DCN	DAA	DPA	DRA	LP	AMPS
Chess	0.877	0.805	0.806	0.836	0.804	0.843	0.902
Highschool	0.799	0.855	0.846	0.644	0.838	0.735	0.914
Kohonen	0.890	0.682	0.674	0.886	0.680	0.606	0.725
Physicians	0.912	0.841	0.833	0.643	0.850	0.906	0.945
Residence	0.782	0.838	0.844	0.635	0.853	0.688	0.852
FWFD	0.926	0.744	0.742	0.865	0.750	0.871	0.995
Wiki-Vote	0.966	0.920	0.921	0.945	0.919	0.944	0.967
Polblogs	0.927	0.914	0.918	0.885	0.916	0.905	0.929
Adolescent	0.816	0.744	0.743	0.625	0.736	0.809	0.835

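Bold values in the original table indicate the best performance for each dataset; AMPS attains the highest AUC on seven of the nine datasets, with Bifan leading on Kohonen and DRA on Residence.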

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
