1. Introduction
Information diffusion on online platforms is usually initiated from a limited set of selected users rather than from the whole network. In applications such as cross-platform promotion, public-opinion monitoring, and emergency information release, the available budget, time window, and platform access are often limited [1, 2, 3, 4]. Therefore, the selected seed users must reach broad local neighborhoods and support stable diffusion across heterogeneous relations. A poorly selected seed set may waste dissemination resources or amplify information in unintended communities [5].
Against this backdrop, the Influence Maximization (IM) problem was proposed [6, 7]. Its main objective is to select k seed nodes under a given diffusion model so as to maximize the expected reach of information diffusion once the process terminates [8]. IM research on single-layer networks has developed a relatively mature theoretical framework. Methods such as greedy marginal-gain selection and a series of extended approximation frameworks [9, 10, 11] provide effective support for information dissemination decisions in single-platform or simple-relationship scenarios [12].
However, the single-platform or simple-relation assumption is often insufficient for describing modern information diffusion. In practical cross-platform dissemination scenarios, the same user may participate in multiple social platforms, communities, or relationship types, such as following, reposting, interacting, and collaborating. These heterogeneous interactions naturally form multilayer networks with inter-layer coupling. In such networks, information not only propagates along intra-layer edges but may also migrate between layers through users’ cross-platform behaviors. This migration can trigger cascading diffusion in new network layers [13, 14]. Multilayer networks capture various relationships and interaction mechanisms. As a result, they are crucial for reconstructing authentic information diffusion processes, evaluating the value of key nodes, and formulating cross-platform dissemination strategies. This makes multilayer influence maximization (MLIM) a more practical intelligent decision-support technology [15, 16].
In recent years, research on influence maximization in multilayer networks has garnered increasing academic attention, although its complexity far exceeds that of single-layer network studies [17]. On the one hand, researchers have extended classical single-layer IM frameworks to multilayer scenarios, for example by combining greedy marginal-gain selection with Monte Carlo simulations to estimate diffusion benefits. Although such approaches demonstrate certain advantages in diffusion effectiveness, the exponential growth of inter-layer state spaces makes Monte Carlo simulation prohibitively expensive and therefore impractical for large-scale scenarios. On the other hand, to improve practical efficiency, some studies have proposed heuristic seeding strategies based on topological centrality and multilayer structural metrics, such as degree centrality, PageRank, and various multilayer network indicators [18], ranking nodes quickly to determine the seed set. Other studies incorporate swarm intelligence and meta-heuristic optimization methods, such as particle swarm optimization (PSO) and the slime mold algorithm (SMA), to search within discrete combinatorial spaces, striving to balance efficiency with diffusion effectiveness.
For instance, DPSOMIM combines random-connectivity-based candidate screening with discrete particle swarm optimization for multilayer seed selection [19], while NGGA organizes the search through node grouping based on node influence and selection cost [20]. More recently, evolutionary deep reinforcement learning has been introduced to learn seed-selection policies in multilayer social networks [16]. These studies improve MLIM from the perspectives of candidate screening, search-space organization, and learning-based optimization. However, as also summarized in recent surveys on influence maximization models [11], scalable MLIM still requires both efficient diffusion evaluation and controllable search-space reduction. This motivates a method that can simultaneously provide a lightweight evaluation signal for inter-layer and higher-order diffusion potential and preserve effective seed-set structures during stochastic search.
From the evaluation perspective, to mitigate computational complexity in diffusion simulations, a significant research approach involves constructing approximate diffusion evaluation metrics. These leverage node neighborhood information to rapidly estimate diffusion gains, such as using the expected contribution of one-hop or finite-hop neighborhoods as evaluation proxies. Existing research focuses primarily on two directions: efficient evaluation of diffusion effects and scalable seed node selection. However, in multilayer networks, inter-layer information transfer and inter-layer coupling effects may enable nodes that are not prominent in local structures to trigger larger-scale cascading diffusion through inter-layer pathways. Against this backdrop, existing multilayer influence maximization methods often struggle to fully capture the diffusion potential triggered by inter-layer diffusion during the effectiveness evaluation phase [21].
Within multilayer networks, a node’s influence extends beyond its one-hop neighborhood within its own layer. It may migrate to other network layers through cross-platform behavior, triggering higher-order cascading diffusion within the new layer structure. If evaluation models rely solely on local neighborhood information or one-hop expected contributions, they often fail to reflect the potential diffusion gains from such inter-layer penetration. This can introduce systematic biases during candidate solution ranking and search guidance phases, which compromises the accuracy and stability of seed node selection. Simultaneously, as the combinatorial search space continuously expands, heuristic or meta-heuristic search processes become more prone to convergence instability without mechanisms to leverage historically effective structures. Specifically, the above limitations can be summarized into three challenges: (i) how to construct a lightweight surrogate evaluation metric that reflects inter-layer and higher-order diffusion potential; (ii) how to preserve effective seed-set structures during stochastic meta-heuristic search; and (iii) how to control the candidate space while maintaining sufficient structural diversity under computational constraints.
To address these challenges and avoid computationally prohibitive simulations, this paper proposes a novel bio-inspired swarm intelligence framework: a two-stage hybrid slime mold algorithm guided by probabilistic memory, denoted PB-MSMA. First, to tackle the “localization” of surrogate evaluation and the neglect of inter-layer higher-order diffusion in existing methods, this paper constructs the Preference-based Expected Diffusion Value (P-EDV) as a lightweight surrogate fitness metric. Building upon the one-hop direct diffusion benefit, P-EDV incorporates inter-layer penetrability and neighborhood topological potential. Second, from an optimization perspective, the probabilistic pipeline mechanism introduces a lightweight distributed search memory into the search process, so that random exploration is no longer independent of historical search trajectories. The main research contributions of this paper are as follows:
We propose the Preference-based Expected Diffusion Value (P-EDV) metric to capture inter-layer penetration and higher-order diffusion potential, which overcomes the limitations of traditional one-hop surrogate metrics in multilayer networks and successfully bypasses the need for computationally expensive Monte Carlo simulations during the search phase, achieving an average 5.80% improvement over alternative evaluation variants.
We develop a probabilistic pipeline mechanism to guide the discrete combinatorial search. By encoding effective historical node configurations as statistical priors, it preserves structural inheritance and enhances search stability, increasing the final search fitness by 9.61% compared with removing the decay-controlled probability update and by 8.75% compared with using archive-based elite memory.
We design the Probabilistic-Based Multilayer Slime Mold Algorithm (PB-MSMA), a novel swarm intelligence framework. By integrating a dynamic hybrid candidate pool, this two-stage approach effectively restricts the search space and balances exploration with exploitation, obtaining the best overall Friedman average rank of 1.033 in the repeated-run statistical comparison.
Through experiments on six real-world multilayer network datasets and nine seed budgets, we show that PB-MSMA achieves a dataset-level improvement range of 3.68–14.50% over representative baselines, including CELF, DPSOMIM, Degree, DIRCI, and PRGC, with an average improvement of 10.32%. These results indicate that PB-MSMA provides an efficient seed-selection strategy for multilayer diffusion scenarios where repeated simulation-based evaluation is costly.
The remainder of this paper is organized as follows. Section 2 reviews related studies on influence maximization and multilayer networks. Section 3 presents the multilayer diffusion model and the standard slime mold algorithm. Section 4 describes the proposed PB-MSMA framework, including P-EDV, the probabilistic pipeline, and the candidate-pool strategy. Section 5 introduces the experimental settings, and Section 6 reports the experimental results and analysis. Section 7 concludes the paper and discusses future work.
2. Related Work
Existing research on Influence Maximization (IM) includes both single-layer and multilayer network settings. Since this paper focuses on multilayer influence maximization (MLIM), the following review mainly emphasizes methods designed for multilayer diffusion environments and cross-layer influence propagation. Within each category, mainstream solution strategies typically fall into three directions: greedy algorithms, heuristic algorithms, and meta-heuristic algorithms. This section reviews the relevant work sequentially according to this classification framework.
Early studies on Influence Maximization (IM) were initiated by Domingos [6] and Richardson [7], who treated influence analysis as an algorithmic problem in networks. Their work shifted the focus from evaluating isolated individuals to selecting nodes with strong diffusion potential. Kempe et al. [8] later formulated IM under the Independent Cascade (IC) and Linear Threshold (LT) models as a discrete seed-selection problem. They proved that the problem is NP-hard and that the objective function is submodular under standard settings, which provides the theoretical basis for greedy approximation with a $(1 - 1/e)$ guarantee. However, greedy selection still requires repeated marginal-gain estimation. When Monte Carlo simulations are used for this estimation, the computational cost becomes high on large networks. Lazy-forward greedy methods such as CELF [22] reduce redundant evaluations, but simulation-based spread estimation remains costly in large-scale or multilayer diffusion scenarios. This motivates more efficient heuristic and search-based methods.
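As a rough illustration of the lazy-forward idea behind CELF, the following minimal sketch re-evaluates a node's marginal gain only when its cached value is stale. The function name `celf`, the toy `reach` dictionary, and the `coverage` objective are hypothetical; `coverage` is a deterministic stand-in for a Monte Carlo spread estimate and is used here only because it is monotone and submodular.

```python
import heapq
from typing import Callable, Hashable, Iterable, List, Set


def celf(nodes: Iterable[Hashable],
         spread: Callable[[Set[Hashable]], float],
         k: int) -> List[Hashable]:
    """Lazy-forward greedy selection for a monotone submodular spread function."""
    seeds: List[Hashable] = []
    current = spread(set())
    # Max-heap via negated gains. Each entry records the seed-set size at
    # which its marginal gain was last computed; the index breaks ties.
    heap = [(-(spread({v}) - current), i, v, 0) for i, v in enumerate(nodes)]
    heapq.heapify(heap)
    while heap and len(seeds) < k:
        neg_gain, i, v, stamp = heapq.heappop(heap)
        if stamp == len(seeds):
            # Cached gain is valid for the current seed set: accept the node.
            seeds.append(v)
            current -= neg_gain
        else:
            # Cached gain is stale: recompute it and push the node back.
            gain = spread(set(seeds) | {v}) - current
            heapq.heappush(heap, (-gain, i, v, len(seeds)))
    return seeds


# Toy stand-in for the expected spread: number of distinct items covered.
reach = {"a": {1, 2, 3}, "b": {3, 4}, "c": {4, 5, 6, 7}, "d": {1}}


def coverage(seed_set: Set[Hashable]) -> float:
    return float(len(set().union(*(reach[v] for v in seed_set))))


selected = celf(reach, coverage, k=2)
```

In this toy run only one stale gain has to be recomputed before the second seed is accepted, which is the saving that lazy evaluation aims for. With a simulation-based `spread`, every remaining call would still be a full Monte Carlo estimate.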
To address efficiency, topologically based heuristic algorithms have been extensively developed. Early research focused primarily on metrics such as degree centrality [18]. In recent years, physics-inspired gravity models have emerged as a prominent research branch within heuristic algorithms due to their ability to effectively integrate local and global features. Ma et al. [23] pioneered treating the K-shell values as mass, while Xu et al. [24] introduced communicability matrices (CAGM) to further refine the model. In addition, community structure-based strategies [25] have been used to distribute seed nodes. However, such heuristics typically focus on static topological properties, often neglecting the random probabilistic nature of diffusion processes and lacking adaptability to minor network perturbations.
Unlike deterministic heuristic rules, meta-heuristic algorithms treat influence maximization as a combinatorial optimization problem, leveraging swarm intelligence to search for global optimal solutions. Qiu et al. [26] introduced a local influence decay strategy for differential evolution (DE). Wang et al. [27] improved particle swarm optimization (PSO) for discrete spaces. Although these evolutionary algorithms demonstrate strong performance on single-layer networks, most are designed for single-layer topologies, posing significant challenges when applied directly to multilayer networks.
Real-world systems often exhibit multilayer interactions, whose diffusion dynamics are fundamentally different from those of single-layer networks. Studies by Liu et al. [13] and Chen et al. [14] highlight that inter-layer coupling and opinion cascades generate complex nonlinear dynamics (e.g., information-disease competition), making it difficult for traditional single-layer models to accurately assess a node’s true influence within multilayer structures. Although classical greedy strategies from single-layer networks can theoretically be extended to multilayer scenarios using Monte Carlo simulations to capture inter-layer diffusion benefits, the exponential growth of inter-layer state spaces makes precise diffusion impact assessment prohibitively expensive for large-scale heterogeneous networks. Consequently, low-cost evaluation methods tailored for multilayer structures have emerged as a research priority.
In the heuristic domain, researchers have attempted to identify key nodes by aggregating multilayer topological features. Ni et al. [28] proposed a weighted gravity-based selection strategy, while Lv et al. [29] introduced multilayer PageRank to correct for the effective mass of the node (PRGC). Furthermore, community structure is considered a key driver of multilayer diffusion. Lv et al. [30] designed community-based centrality metrics to identify inter-layer hubs, while An et al. [21] utilized dynamic reachability to evaluate node importance. However, although these methods integrate information between layers, most are limited to linear weighting or static aggregation, failing to effectively capture complex nonlinear synergistic mechanisms between layers and lacking adaptability to dynamic diffusion probabilities.
In the field of meta-heuristics, current research on optimizing the search for multilayer network configurations remains relatively scarce. Hu et al. [20] proposed a genetic algorithm based on node grouping (NGGA) for multiplexed networks, aiming to reduce the dimension of the chromosome through grouping strategies. Wang et al. [19] attempted to apply discrete PSO to multilayer networks, using random connectivity centrality for population initialization. Furthermore, with the advancement of artificial intelligence technologies, Tang et al. [16] recently proposed a hybrid framework combining evolutionary strategies with deep reinforcement learning (Evolutionary DRL), aiming to leverage learning mechanisms to optimize the seed selection process. However, existing algorithms face the risk of premature convergence when faced with the high-dimensional search space of multilayer networks. This requires the introduction of algorithms with stronger adaptive optimization capabilities and dynamic oscillation mechanisms to robustly locate optimal seed sets within complex multilayer topologies.
To provide a clear comparison of existing studies, Table 1 summarizes representative methods in terms of category, metric, supported network type, benchmark coverage, comparison scale, and main gap. More detailed supporting information is provided in Appendix C. Specifically, Table A6 lists the methodology, benchmark datasets, and compared baselines of each method, while Table A7 summarizes the reported effects and limitations related to MLIM.
As shown in Table 1, greedy methods provide strong theoretical support but usually rely on repeated diffusion evaluation. Heuristic methods improve efficiency through structural scoring, but many of them depend on single-layer topology, static ranking, or predefined aggregation rules. Meta-heuristic and learning-based methods improve global search ability, but existing multilayer methods still face challenges in surrogate evaluation, search stability, training or runtime cost, and inter-layer diffusion modeling. These limitations motivate the proposed P-EDV surrogate and probabilistic guidance mechanism.
3. Preliminaries
This section first provides a formal definition of multilayer networks, then describes a multilayer independent cascade diffusion model based on platform preferences, and finally introduces the fundamental principles of the standard slime mold algorithm.
3.1. Multilayer Network
Inspired by users’ cross-platform heterogeneous interactions in real-world social networks, this paper models multilayer networks as multiplex networks. Formally, a multilayer network system comprising L layers of relationships is represented as $G = \{G^{1}, G^{2}, \ldots, G^{L}\}$. Each layer is defined as $G^{l} = (V, E^{l})$ ($l = 1, 2, \ldots, L$), where $V = \{v_1, v_2, \ldots, v_n\}$ is a set of n nodes shared across all layers. These nodes represent different user entities. The set $E^{l} \subseteq V \times V$ is the edge set specific to layer $l$.
In this model, the topology of each layer is described by the adjacency matrix $A^{l} = (a^{l}_{ij})_{n \times n}$, where the element $a^{l}_{ij} = 1$ indicates a connection between nodes $v_i$ and $v_j$ in layer $l$, and 0 otherwise. This structure enables the same node to exhibit heterogeneous connection strengths and topological characteristics in different dimensions (layers).
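The multiplex structure described above can be held in a small container with one shared node set and one neighbor map per layer. The sketch below is illustrative only: the class name `MultiplexNetwork`, its methods, and the choice of undirected edges are assumptions rather than details stated in the paper.

```python
from typing import Dict, List, Set, Tuple

Edge = Tuple[int, int]


class MultiplexNetwork:
    """Shared node set {0, ..., n-1} with one undirected edge set per layer (assumed)."""

    def __init__(self, n: int, layer_edges: List[List[Edge]]) -> None:
        self.n = n
        # adj[l][v] is the neighbor set of node v in layer l.
        self.adj: List[Dict[int, Set[int]]] = []
        for edges in layer_edges:
            layer: Dict[int, Set[int]] = {v: set() for v in range(n)}
            for u, v in edges:
                layer[u].add(v)
                layer[v].add(u)
            self.adj.append(layer)

    @property
    def num_layers(self) -> int:
        return len(self.adj)

    def adjacency_matrix(self, layer: int) -> List[List[int]]:
        """Dense 0/1 adjacency matrix of one layer."""
        return [[1 if j in self.adj[layer][i] else 0 for j in range(self.n)]
                for i in range(self.n)]

    def degree(self, v: int, layer: int) -> int:
        return len(self.adj[layer][v])


# Four users; layer 0 has edges 0-1 and 1-2, layer 1 has edge 0-3.
net = MultiplexNetwork(4, [[(0, 1), (1, 2)], [(0, 3)]])
```

Node 1 has degree 2 in layer 0 and degree 0 in layer 1, which is the kind of layer-dependent connectivity the adjacency matrices are meant to capture.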
3.2. Multilayer Independent Cascade Model
To describe influence propagation over multiple relationship layers, this paper uses a multilayer independent cascade model (MLIC) with a platform preference mechanism. Diffusion proceeds in discrete time steps $t = 0, 1, 2, \ldots$. In each layer, every node has one of two states: active or inactive.
At $t = 0$, a seed set $S \subseteq V$ with $|S| = k$ is selected from the node set V, and the corresponding nodes in the specified layer are initialized as active. At time step $t + 1$, each node newly activated at time step t has one chance to activate its neighbors. This setting follows the standard independent cascade assumption.
For intra-layer diffusion, suppose that node u is active in layer $l$ at time step t and that $(u, v) \in E^{l}$. Then u attempts to activate node v in the same layer with probability $p^{l}_{uv}$. Each activation trial is independent. Once an activation attempt fails, the same edge is not used again in later time steps.
The MLIC model further considers inter-layer penetration for the same physical node. A node has a one-to-one identity across layers, but its activation state can differ from layer to layer. After node u becomes active in layer $l$, it first decides whether to initiate an inter-layer attempt according to its platform preference parameter. If this attempt is triggered, u then tries to activate its counterpart in layer $l'$ ($l' \neq l$) with the corresponding inter-layer penetration probability. This two-stage setting separates the tendency of cross-platform diffusion from the strength of inter-layer penetration, allowing different nodes and layer pairs to exhibit heterogeneous transfer behaviors.
Only nodes activated in the previous time step can further diffuse information. The process stops when no new nodes are activated at a time step. Under this diffusion model, multilayer influence maximization aims to find a seed set $S$ that maximizes the expected number of activated node–layer pairs after diffusion terminates.
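The diffusion rules above can be sketched as a single stochastic run plus a Monte Carlo average. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `simulate_mlic`, `estimate_spread`, `pref`, and `q` are hypothetical, the intra-layer probability `p` and penetration probability `q` are taken to be uniform, and a node that passes its preference gate is assumed to attempt every other layer once.

```python
import random
from typing import Dict, List, Set, Tuple

Layer = Dict[int, Set[int]]  # node -> neighbor set within one layer


def simulate_mlic(adj: List[Layer], seeds: Set[int], seed_layer: int,
                  p: float, pref: Dict[int, float], q: float,
                  rng: random.Random) -> Set[Tuple[int, int]]:
    """One stochastic MLIC run; returns the active (node, layer) pairs."""
    active = {(s, seed_layer) for s in seeds}
    frontier = sorted(active)
    while frontier:
        newly_active: List[Tuple[int, int]] = []
        for u, layer in frontier:
            # Intra-layer step: one independent attempt per inactive neighbor.
            for v in sorted(adj[layer][u]):
                if (v, layer) not in active and rng.random() < p:
                    active.add((v, layer))
                    newly_active.append((v, layer))
            # Inter-layer step: preference gate first, then penetration attempts.
            if rng.random() < pref[u]:
                for other in range(len(adj)):
                    if (other != layer and (u, other) not in active
                            and rng.random() < q):
                        active.add((u, other))
                        newly_active.append((u, other))
        # Only pairs activated in this step may spread in the next one.
        frontier = newly_active
    return active


def estimate_spread(adj: List[Layer], seeds: Set[int], seed_layer: int,
                    p: float, pref: Dict[int, float], q: float,
                    runs: int = 200, seed: int = 0) -> float:
    """Monte Carlo estimate of the expected number of active node-layer pairs."""
    rng = random.Random(seed)
    total = sum(len(simulate_mlic(adj, seeds, seed_layer, p, pref, q, rng))
                for _ in range(runs))
    return total / runs


layers = [{0: {1}, 1: {0, 2}, 2: {1}, 3: set()},
          {0: {3}, 1: set(), 2: set(), 3: {0}}]
always = {v: 1.0 for v in range(4)}
never = {v: 0.0 for v in range(4)}
full = simulate_mlic(layers, {0}, 0, 1.0, always, 1.0, random.Random(1))
only_seed = simulate_mlic(layers, {0}, 0, 0.0, never, 0.0, random.Random(1))
```

With all probabilities set to 1 the cascade reaches every node-layer pair, including node 3 in layer 0, which has no intra-layer neighbors there and is reached purely by inter-layer penetration. The cost of `estimate_spread` grows with the number of runs, which is the overhead that surrogate metrics try to avoid.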
Figure 1 gives an example of the MLIC process. The labels 1 to 8 correspond to nodes $v_1, \ldots, v_8$. The meanings of node colors and link types are summarized in Figure 1.
Taking the process shown in Figure 1 as an example, at the initial diffusion time $t = 0$, nodes 1 and 7 are selected as initial seeds in the same layer. Subsequently, node 1 attempts to activate its neighboring nodes with the corresponding intra-layer diffusion probabilities. Node 3 is successfully activated and further activates node 6 in subsequent time steps. Other intra-layer diffusion processes follow the same mechanism. Activated nodes trigger inter-layer penetration behavior. For example, nodes 1, 7, and 8 determine whether to initiate inter-layer diffusion based on their respective platform preference parameters. They successfully activate their corresponding nodes in other layers with the associated inter-layer penetration probabilities. Nodes activated through inter-layer penetration then continue diffusing within their respective layers according to independent cascade rules until no new activated nodes emerge across the entire network.
3.3. Slime Mold Algorithm
The Slime Mold Algorithm (SMA) [31] is a meta-heuristic algorithm inspired by the adaptive foraging behavior of the slime mold Physarum polycephalum. The core of the SMA lies in its dynamic feedback mechanism, which mimics the morphological changes in slime mold filaments to balance exploration and exploitation.
The algorithm’s search process comprises three synergistic phases. During the random exploration phase, individuals perform extensive random movements to prevent the population from becoming trapped in local optima, ensuring a comprehensive search across high-dimensional influence spaces. In the adaptive envelope phase, the algorithm generates adaptive weights based on nonlinear biological oscillation intensity to simulate organism and food concentrations. This enables the population to concentrate its search on the current global best candidate. During the contraction phase, a contraction mechanism fine-tunes the solution, enabling stable convergence toward an optimal seed set. By integrating these behaviors, the SMA demonstrates exceptional robustness and scalability, establishing itself as a suitable foundation for tackling complex discrete optimization challenges involving multilayer influence maximization.
The SMA was selected as the foundational search framework for this paper primarily due to its dynamic feedback and structural construction mechanisms, which align strongly with the combinatorial nature of influence maximization. On the one hand, the SMA achieves a nonlinear switch between global exploration and local exploitation through positive/negative feedback weights and a biologically inspired oscillatory adaptive envelope process, reducing the risk of getting stuck in local optima during exploration. On the other hand, its update process tends toward high-quality structures, facilitating the inheritance of structural information during discrete seed set searches. Taking advantage of these characteristics, this paper adopts the SMA as the core framework. Subsequent sections introduce prior statistical guidance to address structural degradation caused by discretization, thus enhancing intergenerational stability and search efficiency.
4. Proposed Method
The multilayer network influence maximization problem faces dual challenges during its solution process: high evaluation complexity and insufficient stability in discrete search. On the one hand, estimating the expected influence range of multilayer diffusion processes typically relies on extensive Monte Carlo simulations, which makes high-frequency invocation during iterative optimization impractical. On the other hand, seed sets constitute high-dimensional discrete combinatorial variables; directly applying continuous swarm intelligence methods to this scenario often leads to difficulties in preserving structural information across generations, triggering convergence instability or premature search termination.
To address these challenges, this paper proposes the Probabilistic-Based Multilayer Slime Mold Algorithm (PB-MSMA). Through the synergistic design of the P-EDV surrogate evaluation, the probabilistic pipeline mechanism, and the discrete search strategy, PB-MSMA achieves a balance between computational efficiency and diffusion effectiveness. First, the paper constructs the Preference-based Expected Diffusion Value (P-EDV) as a surrogate fitness function, approximating the diffusion potential of candidate seed sets without explicitly simulating multi-hop diffusion. Subsequently, a probabilistic pipeline mechanism is introduced to statistically characterize effective recurring structures from historical searches as probability distributions, guiding subsequent searches as prior preferences. Building upon this foundation, it integrates a dynamic hybrid candidate pool with an improved slime mold algorithm to form a complete multilayer influence maximization solution process.
To visually illustrate the overall flow of the solution and the coordination between modules of the proposed method, Figure 2 presents the schematic diagram of the PB-MSMA. This process can be summarized into four interconnected phases. As shown, the algorithm first takes the original multilayer network as input in state a. Subsequently, in state b, a set of candidate nodes composed of both the structural elite pool and the random exploration pool is constructed using the dynamic hybrid candidate pool strategy. This induces the formation of candidate-induced multilayer subnetworks, effectively constraining the search space scale and reducing combinatorial complexity. Within the constrained search domain, the algorithm enters the slime-mold-based evolutionary search phase (state c). The diffusion potential of candidate seed structures is rapidly evaluated using the Preference-based Expected Diffusion Value (P-EDV). This metric comprehensively characterizes one-hop joint activation capability, inter-layer penetration capability, and two-hop structural potential density. In parallel, a probabilistic pipeline mechanism embeds structural preferences from historical optimal solutions as a probability distribution into the evolutionary process. This guides random exploration directions and improves intergenerational structural memory. As the evolutionary process iterates, the Top-k projection operation maps continuous preference representations to a discrete seed set. The algorithm gradually converges in state d and stabilizes in seed structures with high multilayer diffusion potential, ultimately returning the best seed set found.
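Two of the steps in this overview, building the hybrid candidate pool and the Top-k projection, can be sketched as follows. The sketch is hypothetical in its details: the function names are invented for illustration, the structural elite pool is assumed to be ranked by aggregated degree, and ties are broken by node id. The paper's actual ranking criterion and pool sizes may differ.

```python
import random
from typing import Dict, List, Sequence


def hybrid_candidate_pool(degree: Dict[int, int], elite_size: int,
                          random_size: int, rng: random.Random) -> List[int]:
    """Structural elite pool (assumed: top aggregated degree) plus random explorers."""
    ranked = sorted(degree, key=lambda v: (-degree[v], v))
    elite = ranked[:elite_size]
    rest = ranked[elite_size:]
    # Random exploration pool drawn from nodes outside the elite pool.
    explorers = rng.sample(rest, min(random_size, len(rest)))
    return elite + explorers


def top_k_projection(scores: Sequence[float], candidates: Sequence[int],
                     k: int) -> List[int]:
    """Map a continuous preference vector to a discrete seed set of size k."""
    order = sorted(range(len(candidates)),
                   key=lambda i: (-scores[i], candidates[i]))
    return sorted(candidates[i] for i in order[:k])


degree = {0: 5, 1: 3, 2: 3, 3: 1, 4: 0}
pool = hybrid_candidate_pool(degree, elite_size=2, random_size=2,
                             rng=random.Random(7))
seed_set = top_k_projection([0.1, 0.9, 0.5, 0.7], [10, 11, 12, 13], k=2)
```

Restricting the search to `pool` shrinks the combinatorial space from all nodes to the pool size, while the random part keeps some structural diversity in the candidates.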
4.1. Preference-Based Expected Diffusion Value (P-EDV)
To avoid frequent invocations of Monte Carlo simulations during the search process, this paper constructs the Preference-based Expected Diffusion Value (P-EDV) as a surrogate evaluation function for the seed set. P-EDV is grounded in the one-hop joint activation expectation within multilayer networks, further modulated by inter-layer diffusion capacity and two-hop structural potential. This approach approximates the overall diffusion potential of the seed set without explicitly simulating multi-hop diffusion. This effectively mitigates the localization issue in surrogate evaluation within multilayer networks, where results are prone to being dominated by local structures. Consequently, P-EDV can be regarded as a structure-aware surrogate fitness metric, whose assessment signal not only depends on the scale of the local neighborhood but also explicitly reflects multilayer structural characteristics.
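Following the multilayer network definition in Section 3.1, let $G = \{G^{1}, G^{2}, \ldots, G^{L}\}$ denote a multilayer network with L layers, where $G^{l} = (V, E^{l})$. Here, $L$ is the number of layers, V is the shared node set, $E^{l}$ is the edge set of the $l$-th layer, and $N^{l}(v)$ denotes the neighbor set of node v in layer $l$. Given a seed set $S \subseteq V$ with $|S| = k$, its P-EDV is defined as:
$$
\text{P-EDV}(S) = \left( \sum_{v \in B(S)} P(v \mid S) \right) \left( 1 + \bar{\tau}(S)\, \rho(S) \right),
$$
where $N(S)$ denotes the one-hop neighborhood of S across all layers, i.e., $N(S) = \bigcup_{l=1}^{L} \bigcup_{u \in S} N^{l}(u)$, and $B(S) = N(S) \setminus S$ is the effective one-hop non-seed neighborhood. $P(v \mid S)$ represents the joint probability of node v being successfully activated by seed set S within the first hop in a multilayer diffusion environment. Therefore, $0 \le P(v \mid S) \le 1$. $\bar{\tau}(S)$ denotes the average penetration rate between layers associated with the one-hop diffusion boundary, and $\rho(S)$ denotes the density of the structural potential.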
For $v \in N(S) \setminus S$, let $r^{l}(v) = |N^{l}(v) \cap S|$ denote the number of seed neighbors of v in layer $l$. Given the basic intra-layer propagation probability $p$, the joint activation probability is computed as:
$$
P(v \mid S) = 1 - \prod_{l=1}^{L} (1 - p)^{r^{l}(v)},
$$
where $(1 - p)^{r^{l}(v)}$ is the probability that v is not activated by any seed neighbor in layer $l$. The product over all layers gives the probability that v remains inactive across the multilayer network. The core diffusion gain is given by the expected joint activation probability at the first hop:
$$
\sum_{v \in N(S) \setminus S} P(v \mid S).
$$
This metric directly reflects the first-order diffusion capability of seed sets within multilayer networks. However, in multilayer networks, even when activating seed sets of similar scale in a single hop, subsequent diffusion effects may still vary significantly due to inter-layer coupling and local structural differences. To characterize this variation, this paper introduces a multiplicative modulation term $\left(1 + \bar{\tau}(S)\, \rho(S)\right)$ based on the expectation of joint activation in a single hop. The average inter-layer penetrability $\bar{\tau}(S)$ characterizes the overall penetrability of the diffusion process across different network layers, defined as:
$$
\bar{\tau}(S) = \frac{1}{|B(S)|} \sum_{i \in B(S)} \; \sum_{\substack{j \in N_2(S) \\ (i, j) \in E}} \tau_{ij},
$$
where $|B(S)|$ is the effective one-hop non-seed boundary used for normalization, and $B(S) = N(S) \setminus S$. $N_2(S)$ denotes the effective two-hop neighborhood of S, obtained by collecting nodes that can be reached from $B(S)$ through one additional edge across all layers, excluding the seed nodes. $E = \bigcup_{l=1}^{L} E^{l}$ is the aggregated edge set over all layers. If $|B(S)| = 0$, $\bar{\tau}(S)$ is set to 0. The pairwise transition term is computed as:
$$
\tau_{ij} = \sum_{l=1}^{L} w_l \, p^{l}_{ij},
$$
where $w_l \ge 0$ is the layer preference weight of the $l$-th layer and satisfies $\sum_{l=1}^{L} w_l = 1$. $p^{l}_{ij}$ denotes the propagation probability or normalized edge weight between nodes i and j in layer $l$. For unweighted networks, $p^{l}_{ij} = 1$ if $(i, j) \in E^{l}$, and $p^{l}_{ij} = 0$ otherwise. Therefore, $0 \le \tau_{ij} \le 1$. Since $\bar{\tau}(S)$ aggregates pairwise transition feasibility over the one-hop boundary, it is used as a non-negative expansion coefficient, i.e., $\bar{\tau}(S) \ge 0$, rather than as a single transition probability. This metric reflects the overall inter-layer feasibility when activated nodes propagate to deeper layers in a single hop.
On the other hand, the two-hop structural potential density $\rho(S)$ measures the latent diffusion capability of the seed set within a two-hop range, defined as:
$$
\rho(S) = \frac{1}{|B(S)|} \sum_{j \in N_2(S)} p_j \, d_j,
$$
where $N_2(S)$ denotes the effective two-hop neighborhood of the seed set and $|B(S)|$ is the size of the effective one-hop non-seed neighborhood. $p_j$ denotes the baseline reachability probability of node j in the two-hop neighborhood. Under the uniform propagation setting used in this study, $p_j = p$. $d_j$ indicates the global topological degree of node j. Specifically, $d_j$ is computed as:
$$
d_j = \sum_{l=1}^{L} d^{l}_{j},
$$
where $d^{l}_{j}$ is the degree of node j in layer $l$, and thus $\rho(S) \ge 0$. Since P-EDV is used as a surrogate score for ranking candidate seed sets during the search phase, $\rho(S)$ serves as a relative structural indicator rather than a normalized diffusion probability. If $|B(S)| = 0$, $\rho(S)$ is set to 0. This metric characterizes the potential diffusion strength corresponding to a unit one-hop diffusion boundary by weighting and aggregating the influence capabilities of nodes within the two-hop neighborhood, then normalizing by the size of the effective one-hop neighborhood.
The multiplicative modulation term $\left(1 + \bar{\tau}(S)\, \rho(S)\right)$ is introduced as a structure-aware correction to the one-hop activation expectation. This form is not intended to be an exact closed-form expectation of the complete MLIC diffusion process. Instead, it is derived as a surrogate-level conditional modulation. Specifically, $\rho(S)$ describes the latent two-hop structural resources behind the one-hop boundary, whereas $\bar{\tau}(S)$ describes the feasibility of reaching such resources through multilayer transitions. Therefore, the exploitable two-hop potential can be approximated by their product, $\bar{\tau}(S)\, \rho(S)$. When either the trans-layer penetration capability or the two-hop structural potential is weak, this product becomes small, and P-EDV naturally reduces to a value close to the one-hop activation expectation. Adding 1 preserves the original one-hop expected activation gain as the baseline. Therefore, P-EDV does not replace the one-hop expectation but augments it with trans-layer and two-hop structural information. By jointly incorporating the average inter-layer penetration rate and the two-hop structural potential density into the one-hop joint activation expectation, P-EDV provides an indirect characterization of deep diffusion potential without explicitly simulating multi-hop diffusion. This design provides stable and distinguishable evaluation signals for subsequent swarm-based search processes.
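To make the verbal description of this subsection concrete, the sketch below computes a P-EDV-style score on an unweighted multiplex network. It is one possible reading of the text rather than the authors' code. It assumes that the one-hop expectation is summed over the non-seed boundary, that the penetration term sums layer-weighted edge indicators over boundary-to-two-hop pairs and divides by the boundary size, that the potential density sums `p` times the aggregated degree over the two-hop set and divides by the same size, and that the two-hop set excludes only seed nodes. The name `p_edv` and its parameters are hypothetical.

```python
from typing import Dict, List, Sequence, Set

Layer = Dict[int, Set[int]]  # node -> neighbor set within one layer


def p_edv(adj: List[Layer], seeds: Set[int], p: float,
          layer_weights: Sequence[float]) -> float:
    """Surrogate score: one-hop expectation times (1 + penetration * potential)."""
    # Effective one-hop non-seed boundary across all layers.
    boundary: Set[int] = set()
    for layer in adj:
        for s in seeds:
            boundary |= layer[s]
    boundary -= seeds
    if not boundary:
        return 0.0

    # One-hop joint activation expectation.
    one_hop = 0.0
    for v in boundary:
        stays_inactive = 1.0
        for layer in adj:
            seed_neighbors = len(layer[v] & seeds)
            stays_inactive *= (1.0 - p) ** seed_neighbors
        one_hop += 1.0 - stays_inactive

    # Effective two-hop neighborhood: one more edge from the boundary, seeds excluded.
    two_hop: Set[int] = set()
    for layer in adj:
        for i in boundary:
            two_hop |= layer[i]
    two_hop -= seeds

    # Average inter-layer penetrability (unweighted case: edge indicator per layer).
    tau_sum = 0.0
    for i in boundary:
        for j in two_hop:
            tau_sum += sum(w for w, layer in zip(layer_weights, adj)
                           if j in layer[i])
    tau = tau_sum / len(boundary)

    # Two-hop structural potential density with uniform reachability p.
    rho = sum(p * sum(len(layer[j]) for layer in adj)
              for j in two_hop) / len(boundary)

    return one_hop * (1.0 + tau * rho)


layers = [{0: {1}, 1: {0, 2}, 2: {1}, 3: set()},
          {0: {3}, 1: set(), 2: set(), 3: {0}}]
score = p_edv(layers, {0}, p=0.5, layer_weights=[0.5, 0.5])
```

For seed node 0 the boundary is {1, 3}, each boundary node is activated with probability 0.5, and the only two-hop node is 2. Under the assumptions above the penetration and potential terms are both 0.25, so the one-hop expectation of 1.0 is scaled by 1.0625. No diffusion is simulated, so the cost depends only on the sizes of the one-hop and two-hop neighborhoods.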
4.2. Probabilistic Pipeline Mechanism
In discrete combinatorial optimization based on swarm intelligence, random exploration is crucial to maintaining search diversity. However, without effective constraints on its generation distribution, it often fails to fully utilize structural information revealed during the search process. Although individual fitness reflects the quality of current solutions, the structural preferences embedded in high-quality solutions are typically weakened by uniform perturbations during random exploration, resulting in a lack of stable structural guidance for the search process.
To mitigate this issue, this paper introduces the Probabilistic Pipeline mechanism. It statistically characterizes recurring effective structures in the search process as probability distributions, using these as statistical priors to guide the random exploration distribution. The core idea of the Probabilistic Pipeline is not to directly store a discrete seed set or to construct an explicit historical solution archive. Instead, it maps the optimal solution structure from the current iteration to a probability vector defined over the candidate node space, characterizing the relative tendency of each node to be selected in high-quality solutions. From the perspective of swarm intelligence optimization, the Probabilistic Pipeline can be viewed as an implicit search memory mechanism that preserves and utilizes cross-generational structural information without explicitly storing historical solutions.
Let
denote the probability reward vector corresponding to the global optimal solution obtained during the
t-th generation of the search. The update rule for the Probabilistic Pipeline vector
is defined as:
where
denotes the probabilistic state of the pipeline at generation
t;
is the decay coefficient, which controls the strength of the inheritance of historical information by the statistical prior;
denotes the fitness value of the current global optimal solution;
represents the sum of fitness values across all individuals in the population, used to normalize the relative contribution of the current solution;
is the reward vector constructed from the optimal solution structure, where non-zero components correspond to selected nodes in the optimal solution;
denotes the
norm of a vector, ensuring the normalization of the probability distribution.
From the update perspective, the probabilistic pipeline consists of two superimposed components: the first term maintains the continuity of the statistical prior and prevents premature convergence through the decay mechanism; the second term injects structural information from the current optimal solution into the probability distribution in a weighted manner based on its fitness level, thereby progressively reinforcing the effective structural patterns that recur during the search. This mechanism addresses the issues of excessive randomness and unstable convergence in discrete search mentioned earlier by providing structural guidance for random exploration through a distribution-level statistical prior.
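A minimal sketch of such an update is given below. It assumes that the reward vector is an indicator over the nodes of the current best seed set, that the decay coefficient multiplies the previous state, and that the result is renormalized by its L1 norm; the exact functional form in this paper may differ, and all names are illustrative.

```python
from typing import Dict, Hashable, Iterable, Sequence


def update_pipeline(pipeline: Dict[Hashable, float],
                    best_seeds: Iterable[Hashable],
                    best_fitness: float,
                    population_fitness: Sequence[float],
                    decay: float = 0.8) -> Dict[Hashable, float]:
    """Decay the old state, add a fitness-weighted reward, renormalize."""
    total = sum(population_fitness)
    weight = best_fitness / total if total > 0 else 0.0
    # First term: inherited (decayed) statistical prior.
    updated = {v: decay * p for v, p in pipeline.items()}
    # Second term: reward on the nodes of the current best seed set.
    for v in best_seeds:
        updated[v] = updated.get(v, 0.0) + weight
    # L1 normalization keeps the vector a probability distribution.
    norm = sum(abs(x) for x in updated.values())
    if norm == 0:
        return updated
    return {v: x / norm for v, x in updated.items()}


pipeline = {v: 0.25 for v in "abcd"}
new_pipeline = update_pipeline(pipeline, {"a", "b"}, 6.0, [6.0, 3.0, 3.0])
```

In this example the nodes `a` and `b` of the best seed set gain probability mass relative to `c` and `d`, while the vector still sums to one.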
4.3. PB-MSMA (Probabilistic-Based Multilayer Slime Mold Algorithm)
4.3.1. Candidate Node Selection
To constrain the search space scale, PB-MSMA performs seed selection only on the candidate node set during each iteration. The candidate set is constructed through a dynamic hybrid strategy, and its size and composition adapt to the seed budget and network scale.
Given a multilayer network
, a node set
V with
, and a seed budget
k, the capacity of the candidate pool
is defined as:
where
and
, respectively, control the expansion rate and the maximum size ratio of the candidate pool. After determining
, the candidate set consists of a mixture of two node components. Let the random mixing ratio be
; then the sizes of the elite candidates and the random candidates are
and
, respectively. The elite candidate nodes are chosen by ranking all nodes by their global degree aggregated across layers, and the top
nodes form the elite candidate set. Random candidate nodes are uniformly sampled without replacement from the entire node set
V, producing
nodes. The final set of candidates
is obtained by merging these two parts and removing duplicates. If the size falls below
k, the nodes are supplemented from the remaining pool to ensure feasibility.
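The construction can be sketched as follows. The capacity rule used here, `max(k, min(ceil(alpha * k), ceil(beta * n)))`, is an assumption chosen only to match the stated roles of the expansion rate and the maximum size ratio; `alpha`, `beta`, `rho`, and the function name are hypothetical, and the sketch assumes that k does not exceed the number of nodes.

```python
import math
import random
from typing import Dict, Hashable, List, Optional


def build_candidate_pool(global_degree: Dict[Hashable, int], k: int,
                         alpha: float = 3.0, beta: float = 0.5,
                         rho: float = 0.2,
                         rng: Optional[random.Random] = None) -> List[Hashable]:
    """Mix top-degree (elite) nodes with uniformly sampled nodes."""
    rng = rng or random.Random()
    nodes = list(global_degree)
    n = len(nodes)
    # Assumed capacity rule: grows with k, capped by a fraction of n.
    capacity = max(k, min(math.ceil(alpha * k), math.ceil(beta * n)))
    n_random = int(rho * capacity)
    n_elite = capacity - n_random
    ranked = sorted(nodes, key=lambda v: global_degree[v], reverse=True)
    elite = ranked[:n_elite]
    sampled = rng.sample(nodes, min(n_random, n))  # without replacement
    pool = list(dict.fromkeys(elite + sampled))    # merge, drop duplicates
    if len(pool) < k:                              # keep the pool feasible
        in_pool = set(pool)
        for v in ranked:
            if len(pool) >= k:
                break
            if v not in in_pool:
                pool.append(v)
                in_pool.add(v)
    return pool


degrees = {i: i for i in range(20)}  # toy global degrees
pool = build_candidate_pool(degrees, k=3, rng=random.Random(0))
```

Because the random part is sampled from the whole node set, it may coincide with elite nodes; the de-duplication step then leaves a pool slightly smaller than the nominal capacity, which is why the final feasibility check is needed.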
4.3.2. Node Selection
In the candidate node set
, PB-MSMA employs the Slime Mold Algorithm (SMA) to perform the swarm search, iteratively selecting the seed set.
Section 4.3.3 outlined the standard SMA position update mechanism, which comprises three branches: random exploration, convergence search, and contraction update. While preserving the core SMA framework, this paper employs a probabilistic pipeline mechanism to retain historical seed node structures, guiding algorithmic exploration.
Initialization Phase: At the start of each iteration, a continuous preference vector is initialized for each individual
i.
Each dimension encodes the relative priority of the corresponding candidate node for selection as a seed. The initial preference vector is generated uniformly at random to ensure comprehensive coverage of the candidate space during the initial search phase. Subsequently, the preference vector is converted into a discrete seed set
through Top-
k mapping. In each iteration, a seed set is generated from each individual's current position, and its diffusion potential is evaluated using the Preference-based Expected Diffusion Value
defined in
Section 4.1. Individual fitness is defined as follows:
Based on this, the globally optimal individual of the current generation and its corresponding seed set are identified. This optimal structure is then used to update the probabilistic pipeline, thus forming a structurally statistical memory accumulated over generations.
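The Top-k mapping from a continuous preference vector to a discrete seed set can be sketched as follows; the function name and the toy values are illustrative, and ties are broken by candidate order, which the paper does not specify.

```python
import heapq
from typing import Hashable, Sequence, Set


def top_k_seed_set(preference: Sequence[float],
                   candidates: Sequence[Hashable], k: int) -> Set[Hashable]:
    """Return the k candidates with the largest preference values."""
    if len(preference) != len(candidates):
        raise ValueError("one preference value per candidate is required")
    best = heapq.nlargest(k, range(len(candidates)),
                          key=preference.__getitem__)
    return {candidates[i] for i in best}


candidates = ["u1", "u2", "u3", "u4", "u5"]
preference = [0.10, 0.85, 0.40, 0.90, 0.05]
seed_set = top_k_seed_set(preference, candidates, k=2)
```

Here the two largest preference values belong to `u4` and `u2`, so these two nodes form the seed set whose fitness would then be evaluated with the surrogate function.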
Update Phase: In the standard SMA, the random exploration branch achieves exploration capability by uniformly sampling new positions within the search space. However, this approach does not take advantage of the effective structures discovered during historical searches, leading to repeated discarding of structural information. To address this, this paper introduces a probabilistic pipeline mechanism that preserves historical seed node structures to guide algorithmic exploration.
In PB-MSMA, individual position updates are governed by a unified probability-driven mechanism. Its objective is to enhance search efficiency and stability in discrete combinatorial optimization by incorporating intergenerational structural information while maintaining the inherent stability of the SMA search framework. Specifically, the update behavior of an individual in each iteration is jointly determined by the random variable r and the individual’s state parameter, enabling adaptive switching between different search modes.
As shown in Equation (
12), when
, the individual enters a structure-guided exploration update phase. Its new position is generated centered on the probabilistic pipeline vector
, with a small random perturbation superimposed. This update method statistically constrains the search region, prioritizing exploration around high-quality structures frequently encountered in historical iterations and avoiding completely unguided random walks. When
and
, individuals perform optimality-driven search updates. Their positions are jointly determined by the current optimal individual
and the relative relationships among individuals in the population, strengthening excellent solution structures and accelerating convergence toward potentially optimal regions. When
and
, individuals enter the contraction update phase. This involves proportionally scaling the current state to maintain population diversity and prevent premature convergence to local optima.
The specific individual position update can be expressed as
where
denotes the position vector of the
i-th individual in the
t-th generation, and
represents the position of the individual with the highest fitness in the current population;
and
are two different individuals randomly selected from the population;
denotes the probabilistic pipeline vector for generation
t, characterizing the high-frequency structural patterns of high-quality solutions that emerge during historical searches;
represents the random perturbation vector, maintaining search diversity. The random variable
controls the switching between different search modes for individuals. The parameter
serves as the exploration phase trigger threshold,
denotes the individual-specific update probability;
and
are scaling factors and
W is the weight coefficient.
Through this unified update mechanism, PB-MSMA balances structure-guided exploration, convergence reinforcement, and diversity preservation in each generation. Compared to the original SMA, this update rule explicitly embeds intergenerational structural information from the probabilistic pipeline into individual updates without altering the overall search logic. This transforms the search behavior from a purely random-feedback-driven process into a probability-guided optimization process constrained by historical structures. After the update, boundary constraints are applied to individual positions, and a new seed set is generated via Top-k mapping for the next generation.
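The three-branch update can be sketched as below. The branch conditions (`r < z`, then `r < p_i`, otherwise contraction), the scalar weight, the uniform perturbation, and the clipping of positions to [0, 1] follow the standard SMA form and are assumptions here, since the exact expression of the update equation is not reproduced in this sketch; all names are illustrative.

```python
import random
from typing import List, Optional, Sequence


def update_position(x_i: Sequence[float], x_best: Sequence[float],
                    x_rand_a: Sequence[float], x_rand_b: Sequence[float],
                    pipeline: Sequence[float], r: float, z: float,
                    p_i: float, v_b: float, v_c: float, w: float,
                    noise: float = 0.01,
                    rng: Optional[random.Random] = None) -> List[float]:
    """One position update with a pipeline-guided exploration branch."""
    rng = rng or random.Random()
    if r < z:
        # Structure-guided exploration: pipeline vector plus small noise.
        new = [p + rng.uniform(-noise, noise) for p in pipeline]
    elif r < p_i:
        # Optimality-driven search around the current best individual.
        new = [b + v_b * (w * a - c)
               for b, a, c in zip(x_best, x_rand_a, x_rand_b)]
    else:
        # Contraction: proportionally scale the current state.
        new = [v_c * x for x in x_i]
    # Boundary repair (assumed search range [0, 1]).
    return [min(1.0, max(0.0, x)) for x in new]


contracted = update_position([0.4, 0.8], [0.9, 0.1], [0.5, 0.5], [0.2, 0.2],
                             [0.6, 0.4], r=0.9, z=0.1, p_i=0.5,
                             v_b=0.3, v_c=0.5, w=1.0)
explored = update_position([0.4, 0.8], [0.9, 0.1], [0.5, 0.5], [0.2, 0.2],
                           [0.6, 0.4], r=0.05, z=0.1, p_i=0.5,
                           v_b=0.3, v_c=0.5, w=1.0, rng=random.Random(0))
```

The only departure from a standard SMA update in this sketch is the first branch, which samples around the pipeline vector instead of drawing a uniformly random position.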
4.3.3. Overall Flow of the PB-MSMA
Integrating the previously proposed Preference-based Expected Diffusion Value (P-EDV), probabilistic pipeline mechanism, and dynamic hybrid candidate pool strategy, this paper constructs a comprehensive PB-MSMA solution framework for influence maximization in multilayer network environments. The algorithm takes a multilayer network as input. In each iteration, it dynamically constructs a candidate node set to constrain the search space, then performs a population search within this constrained domain using an improved slime mold algorithm. The diffusion potential of candidate seed sets is rapidly evaluated via P-EDV, while the probabilistic pipeline continuously accumulates structural information from historically high-quality solutions to guide subsequent iterations. The algorithm iterates until the termination condition is met, eventually outputting the seed set with maximum diffusion potential.
The algorithm first initializes the probabilistic pipeline vector
P and the globally optimal seed set
. In each iteration, the set of candidate nodes is constructed using the dynamic hybrid candidate pool operator
, thus constraining the search space for the current iteration. Then the population is initialized on this set of candidates, where each individual represents the relative priority of the candidate nodes selected as seeds in the form of a continuous vector. The corresponding seed set is obtained via Top-
k mapping, and fitness is evaluated using the P-EDV function. Based on the fitness results, the current and global optimal solutions are updated, and corresponding structural information is written into the probabilistic pipeline. Guided by the probabilistic pipeline, the population is updated using an improved slime mold algorithm, enhancing search stability and structural retention capabilities during random exploration. This process repeats until the maximum iteration count is reached, ultimately returning the global optimal seed set as the algorithm’s output. The overall workflow is illustrated in Algorithm 1.
Algorithm 1 PB-MSMA
Input: Multilayer network, seed budget k, maximum iterations T, population size, convergence tolerance, stall threshold
Output: Global best seed set
1: Initialize the probabilistic pipeline state vector for each node
2: Initialize the global best seed set and the global best fitness
3: Initialize the stall counter and the iteration index
4: Construct the initial candidate pool
5: Initialize the population within the candidate pool
6: while the maximum iteration count and the stall threshold have not been reached do
7:     Refresh the candidate pool
8:     Repair or remap the population according to the current candidate pool
9:     for each individual do
10:         Map the continuous representation to a seed set
11:         Evaluate the fitness of the seed set
12:     end for
13:     Identify the current best individual, its seed set, and its fitness
14:     if the current best fitness improves on the global best fitness then
15:         Update the global best seed set and fitness
16:         Reset the stall counter
17:     else
18:         Keep the historical global best unchanged
19:         Increment the stall counter
20:     end if
21:     Select elite seed structures from the current population and the global best solution
22:     Update the probabilistic pipeline
23:     for each individual do
24:         Update the individual's position
25:         Apply boundary repair to keep the position within the candidate search space
26:     end for
27:     Increment the iteration index
28: end while
29: return the global best seed set
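The outer control flow of Algorithm 1, including the stall-based early termination, can be sketched as follows. The improvement test `fit > global_fit + tol` is an assumed reading of how the convergence tolerance is used, and the `step` callback is a hypothetical stand-in for one generation of candidate-pool refresh, fitness evaluation, and population update.

```python
from typing import Any, Callable, Tuple


def search_with_stall(step: Callable[[int], Tuple[float, Any]],
                      max_iters: int, tol: float,
                      stall_limit: int) -> Tuple[Any, float, int]:
    """Run `step` until the iteration or stall limit is reached.

    `step(t)` returns (best fitness, best solution) of generation t.
    """
    global_best, global_fit = None, float("-inf")
    stall, t = 0, 0
    while t < max_iters and stall < stall_limit:
        fit, solution = step(t)
        if fit > global_fit + tol:   # meaningful improvement
            global_best, global_fit = solution, fit
            stall = 0
        else:                        # keep the historical best
            stall += 1
        t += 1
    return global_best, global_fit, t


# Toy fitness trace: improvement stops after generation 1.
trace = [1.0, 2.0, 2.0, 2.0, 2.0, 3.0]
result = search_with_stall(lambda t: (trace[t], t), max_iters=6,
                           tol=1e-9, stall_limit=3)
```

In this toy trace the search stops after five generations because three consecutive generations bring no improvement, so the late improvement in generation 5 is never evaluated; this is the usual trade-off of stall-based stopping.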
4.3.4. Time Complexity Analysis
To avoid notational ambiguity, the population size is denoted by , whereas is used only for the probabilistic pipeline state vector at iteration t. Let be the number of nodes, L be the number of layers, and be the total number of intra-layer edges across all layers. Let , and denote the maximum number of iterations, the seed budget, and the candidate-pool size, respectively.
The computational cost of PB-MSMA consists of four components. First, candidate-pool preparation requires multilayer structural scoring and candidate ranking, with a preprocessing cost of
. After the structural scores are obtained, candidate-pool refreshing or sampling in each iteration costs at most
. Second, population-based search includes Top-
k mapping and SMA position updating. For one individual, these two operations cost
and
, respectively. Thus, the population search cost per iteration is
. Third, P-EDV fitness evaluation requires one-hop and two-hop neighborhood expansion. For a seed set
S, its evaluation cost is denoted as:
where
,
is the number of multilayer edges visited during one-hop expansion from the seed nodes, and
is the number of edges visited during expansion from the one-hop boundary to the two-hop neighborhood. The term
corresponds to the layer-wise computation of the joint activation probability. Let
denote the average P-EDV evaluation cost over the population. Fourth, the probabilistic pipeline update costs
for sparse updates and is bounded by
for a full candidate-space update.
Therefore, the overall time complexity of PB-MSMA is:
In the worst case, the one-hop and two-hop expansion may visit a large fraction of the multilayer edge set. Thus, . This analysis explicitly accounts for candidate-pool preparation, Top-k mapping, SMA updating, P-EDV neighborhood expansion, the number of layers, the multilayer edge size, and probabilistic pipeline updating.
5. Experimental Setup
To validate the effectiveness and superiority of the proposed method in maximizing influence across multilayer networks, experiments were conducted on six public multilayer network datasets. Six representative algorithms were selected for comparative analysis. The experimental results were compared under a unified diffusion model and common evaluation criteria. All algorithms were implemented in Python 3.13.5 and executed under identical software and hardware conditions. The experiments were conducted on a Windows operating system with the following hardware configuration: Intel Core i7-13700K processor with a 3.40 GHz base frequency and 32 GB RAM. To ensure fairness and stability, each experiment was independently repeated 1000 times under identical parameter settings. The average result was used as the final evaluation metric.
5.1. Datasets
To comprehensively evaluate the performance and robustness of the proposed PB-MSMA, experiments were conducted on six real-world multilayer network datasets. The detailed descriptions of these datasets are as follows:
CKM [
32]: Serving as a multiplex social graph in the field of medical sociology, this dataset encompasses three distinct layers. Its topology is constructed based on how healthcare practitioners responded to three different survey questions regarding the adoption of novel medications, with each question forming an independent relational layer.
CElegans [
33]: Functioning as a model for biological nervous systems, this multiplex dataset maps the complex connectome of the
Caenorhabditis elegans roundworm. The diverse layers within the network systematically delineate various categories of synaptic linkages.
PerformingArts: Gathered independently through custom web scraping techniques, this dataset establishes a two-layer social multiplex. It illustrates the interconnected relationships and interactions between popular public figures across two prominent social media platforms, namely TikTok and Weibo.
Gallus [
34]: Extracted from the BioGRID database, this network models the intricate biological interactions of the
Gallus. The multilayer structure reflects different types of documented genetic connections.
London_transport [
35]: Serving as a multilayer representation of urban transit infrastructure, this dataset models the public transportation system in London. Within this structure, the topological nodes signify individual railway stations, while the intra-layer edges delineate the active operational routes linking them.
EUAir [
36]: This large-scale aviation transportation network is structured into 37 distinct topological layers. Each layer exclusively captures the flight trajectories and routes operated by a different European airline carrier.
The fundamental topological properties of these six datasets are summarized in
Table 2.
5.2. Baseline Algorithms
To comprehensively evaluate the performance of the proposed PB-MSMA, this study uses six representative algorithms as benchmarks for comparison. These algorithms encompass greedy strategies, structure-based heuristics and meta-heuristic optimizers, as detailed below:
CELF [
22]: An optimized variant of the greedy algorithm. It exploits the submodularity of the influence function to skip most unnecessary marginal-gain computations through lazy evaluation. This significantly improves computational efficiency while retaining the same approximation guarantee as the original greedy algorithm. In the experiments, the number of Monte Carlo simulations was set to 10,000.
DPSOMIM [
19]: A discrete particle swarm optimization framework specifically designed for influence maximization in multilayer networks. This algorithm uses random connectivity centrality (RCC) to screen candidate nodes and integrates a Neighborhood Optimization (NO) strategy to enhance local search capability. Following the optimal parameter settings reported in the original paper, we set the population size
N to 60, the maximum iteration count
T to 100, the inertia weight
w to 0.6 and the learning factors
. Additionally, the random walk duration
in the RCC phase was set to 10, with bias parameters
and
.
DIRCI [
21]: A recently proposed method for identifying critical nodes in multilayer networks. It models node heterogeneity through dynamic influence ranges and integrates local structural information with global multilayer dependencies in a unified framework by incorporating network-layer importance and community-importance metrics. DIRCI uses multilayer PageRank to model inter-layer correlations and adopts different community structures for different network layers, which improves the accuracy of critical node identification in multilayer networks.
PRGC [
29]: A multilayer network centrality method based on the gravity model. This algorithm uses multilayer PageRank values as node “mass” and models the interaction strength between nodes via inter-layer shortest-path distances, thereby integrating both local and global structural information. Compared to traditional gravity-based centrality methods built on degree or k-shell values, PRGC more accurately characterizes the comprehensive influence of nodes in multilayer networks.
Degree: A fundamental topological heuristic method. In multilayer networks, this approach ranks the nodes according to their aggregate degree across all layers and selects the top k nodes as the seed set. As a parameter-free baseline, it measures the contribution of basic structural properties to information diffusion.
Random-MC: This method randomly selects k nodes from the network as the seed set and evaluates their influence through 10,000 Monte Carlo simulations. It typically serves as a lower-bound benchmark for algorithm performance, validating the effectiveness of optimization strategies.
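The lazy (delayed) evaluation idea behind the CELF baseline can be sketched as follows. For readability the sketch uses a deterministic coverage function in place of the Monte Carlo spread estimate used in the experiments; the function names and the toy data are illustrative.

```python
import heapq
from typing import Callable, Dict, Hashable, List, Sequence, Set, Tuple


def celf(nodes: Sequence[Hashable],
         spread: Callable[[List[Hashable]], float],
         k: int) -> Tuple[List[Hashable], float]:
    """Greedy seed selection with lazily refreshed marginal gains."""
    seeds: List[Hashable] = []
    current = 0.0
    # Heap entries: (-gain, node, number of seeds when gain was computed).
    heap = [(-spread([v]), v, 0) for v in nodes]
    heapq.heapify(heap)
    while len(seeds) < k and heap:
        neg_gain, v, stamp = heapq.heappop(heap)
        if stamp == len(seeds):
            # Gain is up to date, so v is the true best candidate.
            seeds.append(v)
            current += -neg_gain
        else:
            # Stale gain: recompute it and push the node back.
            gain = spread(seeds + [v]) - current
            heapq.heappush(heap, (-gain, v, len(seeds)))
    return seeds, current


# Toy coverage function standing in for the expected spread.
cover: Dict[str, Set[int]] = {"a": {1, 2, 3}, "b": {3, 4},
                              "c": {1, 2}, "d": {5}}


def coverage(selected: List[str]) -> float:
    return float(len(set().union(*(cover[v] for v in selected))))


seeds, total = celf(list(cover), coverage, k=2)
```

Submodularity guarantees that a stale gain is an upper bound on the true gain, so a node whose refreshed gain still tops the heap can be selected without re-evaluating the remaining nodes.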
5.3. Evaluation Metrics
5.3.1. Influence Spread
The core objective of influence maximization in multilayer networks is to maximize the influence spread of a seed set under a given diffusion model. This paper evaluates algorithmic diffusion performance under the Multilayer Independent Cascade (MLIC) model. Let $S \subseteq V$ denote the set of selected seed nodes with $|S| = k$. Let $\sigma(S)$ denote the expected number of final activated nodes triggered by $S$ after the diffusion process ends. Since $\sigma(S)$ is typically difficult to compute analytically, this paper uses Monte Carlo simulation for approximate estimation, calculated as follows:
$$\hat{\sigma}(S) = \frac{1}{R} \sum_{r=1}^{R} \left| A_r(S) \right|$$
where $A_r(S)$ denotes the final set of activated nodes triggered by the seed set $S$ in the $r$-th simulation, and $R$ represents the number of independent simulations, set to 10,000 in this study. The same value of $R$ is used in all experiments to ensure fairness and consistency when evaluating different methods.
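The Monte Carlo estimate can be sketched as follows. The cascade used here is a simplified stand-in for the MLIC model: it assumes a uniform activation probability `p` on every intra-layer edge and treats a node activated in one layer as active in all layers. The function names and the toy network are illustrative assumptions.

```python
import random
from typing import Dict, Hashable, Iterable, List, Set

Layer = Dict[Hashable, List[Hashable]]  # adjacency lists of one layer


def simulate_cascade(layers: List[Layer], seeds: Iterable[Hashable],
                     p: float, rng: random.Random) -> Set[Hashable]:
    """One independent-cascade run over all layers (simplified)."""
    active = set(seeds)
    frontier = list(active)
    while frontier:
        newly_active = []
        for u in frontier:
            for layer in layers:
                for v in layer.get(u, ()):
                    # Each edge gets a single activation attempt.
                    if v not in active and rng.random() < p:
                        active.add(v)
                        newly_active.append(v)
        frontier = newly_active
    return active


def estimate_spread(layers: List[Layer], seeds: Iterable[Hashable],
                    p: float, runs: int = 10_000, seed: int = 0) -> float:
    """Average final cascade size over `runs` independent simulations."""
    rng = random.Random(seed)
    seeds = list(seeds)
    sizes = (len(simulate_cascade(layers, seeds, p, rng))
             for _ in range(runs))
    return sum(sizes) / runs


toy_layers = [{"a": ["b"], "b": ["c"]}, {"a": ["d"]}]
full = estimate_spread(toy_layers, ["a"], p=1.0, runs=10)
none = estimate_spread(toy_layers, ["a"], p=0.0, runs=10)
```

The two limiting cases give a quick sanity check: with `p = 1.0` every reachable node is activated, and with `p = 0.0` only the seeds remain active.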
It should be noted that the proposed Preference-based Expected Diffusion Value (P-EDV) serves only as a surrogate evaluation function during the algorithmic search process, guiding the update and selection of candidate solutions. All final performance comparisons are based on the Monte Carlo estimation results derived from the actual diffusion process.
5.3.2. Influence Overlap Ratio
In multilayer network environments, the diffusion regions influenced by different seed nodes may overlap significantly, leading to redundant diffusion coverage. To quantitatively characterize the degree of overlap in diffusion coverage among the nodes of a seed set, this paper introduces the Influence Overlap Ratio metric. For a given seed set
, its Influence Overlap Ratio is defined as:
where
denotes the expected influence range generated by node
v acting as a seed under the MLIC model, while
represents the expected influence range of the seed set
S under joint diffusion conditions. This metric ranges from
, with higher values indicating greater overlap in influence coverage between different seed nodes. It helps in analyzing how various methods characterize the complementarity between seed selection diversity and diffusion coverage within multilayer network structures.
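One plausible form consistent with this description is one minus the ratio of the joint spread to the sum of the individual spreads. The sketch below uses that assumed form, which may differ from the exact definition in this paper, and a deterministic coverage function in place of a Monte Carlo estimate; all names are illustrative.

```python
from typing import Callable, Dict, Hashable, Iterable, Set


def influence_overlap_ratio(seeds: Iterable[Hashable],
                            spread: Callable[[Set[Hashable]], float]) -> float:
    """Assumed form: 1 - spread(S) / sum of single-seed spreads."""
    seeds = set(seeds)
    individual = sum(spread({v}) for v in seeds)
    if individual == 0:
        return 0.0
    return 1.0 - spread(seeds) / individual


# Toy coverage sets standing in for expected influence ranges.
reach: Dict[str, Set[int]] = {"a": {1, 2, 3}, "b": {3, 4}, "c": {7, 8}}


def coverage(selected: Set[str]) -> float:
    return float(len(set().union(*(reach[v] for v in selected))))


overlapping = influence_overlap_ratio({"a", "b"}, coverage)
disjoint = influence_overlap_ratio({"a", "c"}, coverage)
```

Under this form, seeds with disjoint influence regions yield a ratio of 0, and the ratio grows as the individual regions increasingly cover the same nodes.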
6. Experimental Results
6.1. Parameter Sensitivity
To evaluate the impact of key parameters in PB-MSMA on algorithm performance and determine default configurations for subsequent experiments, this study conducted a parameter sensitivity analysis on random exploration probability
z, pipeline-uniform sampling hybrid ratio
, statistical prior update rate
, candidate pool expansion rate
, candidate pool capacity parameter
, and random injection ratio
. Each experiment varied only one parameter while keeping the others fixed and was repeated 1000 times on multiple multilayer network datasets. The resulting P-EDV values are shown in
Figure 3.
Figure 3a–f collectively demonstrate the sensitivity performance of the proposed method under different parameter configurations. As shown in
Figure 3a, when the random exploration probability
z varies over a wide range, P-EDV remains stable across all datasets, with only a slight decrease at lower values. This indicates strong robustness to random exploration frequency. Balancing performance and stability, a higher
z value was selected to enhance exploration capacity, ultimately adopting
.
Figure 3b illustrates the impact of the mixing ratio
on the performance of the algorithm. The results show minimal variation in P-EDV in the tested range, with smaller
values producing slightly better performance on multiple datasets. This indicates that the probabilistic pipeline itself provides effective guidance. Consequently, subsequent experiments adopt
, allowing the pipeline mechanism to fully dominate candidate generation.
Figure 3c shows that the update rate of the statistical prior has only a weak impact on algorithm performance, causing minor fluctuations. A moderate value achieves a good balance between preserving historical statistical information and adapting to new search feedback. Consequently,
is selected in this article.
Figure 3d,e present sensitivity results for the expansion rate of the candidate pool
and the capacity parameter
, respectively. As both values increase, P-EDV shows a declining trend across most datasets, indicating that an excessively large candidate space weakens the search intensity under a fixed computational budget. To ensure search efficiency, smaller candidate-pool parameters are selected:
and
. As shown in
Figure 3f, the random injection ratio
exhibits minimal impact on P-EDV within the tested range, indicating that this mechanism primarily helps exploration without significantly altering the search structure. Taking into account stability,
is adopted in this paper.
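In summary, PB-MSMA is robust to the random exploration probability z, the pipeline-uniform mixing ratio, the statistical prior update rate, and the random injection ratio, whereas the candidate-pool expansion rate and capacity parameter have a comparatively larger impact on performance. Based on average cross-dataset performance, the values selected above for the six parameters form the default configuration for subsequent experiments.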
The sensitivity results in
Figure 3 also provide guidance for domain-oriented parameter fine-tuning. The numerical results indicate that the candidate-pool parameters
and
are more sensitive to network structure, with smaller values showing more stable P-EDV performance across the tested domains. Therefore, when PB-MSMA is applied to a new domain,
and
should be tuned first, while
z,
,
, and
can be initialized near the selected default values and then adjusted according to convergence stability.
6.2. Comparison of Influence Diffusion
Figure 4 compares the influence spread achieved by the different algorithms on six real multilayer network datasets for varying seed sizes
k for different algorithms. The experiments were conducted using the same diffusion model and parameter configurations, with seed sizes incrementally increasing from
to
. The compared methods include CELF, DPSOMIM, Degree, DIRCI, PRGC, and Random-MC, as described with their corresponding references in
Section 5.2. All results were obtained by averaging multiple Monte Carlo simulations.
Observing
Figure 4 reveals that the diffusion ranges of all methods monotonically increase with increasing values of
k. However, there are significant differences in the growth rates and ultimate coverage capabilities among the different algorithms. Compared to the average levels of CELF, DPSOMIM, DIRCI, PRGC, and Degree, PB-MSMA achieves an average diffusion range improvement of approximately 10.23% across the six datasets. The overall distribution of the curve indicates that this enhancement is not driven by isolated outliers but is consistently maintained across most datasets and within medium to large seed size ranges.
Further analysis of curve patterns across datasets reveals that PB-MSMA’s advantage primarily manifests itself during the mid-to-late seed expansion phase. In networks like CKM, London_transport and Gallus, as k increases beyond moderate scales, the diffusion curves of CELF, Degree, and PRGC show significant deceleration. In some cases, they yield only limited diffusion gains between adjacent k intervals. This reflects the tendency of these methods to concentrate seed selection on already highly covered core regions, thereby limiting further diffusion expansion. DIRCI demonstrates strong competitiveness across these datasets, with its overall curve typically outperforming traditional greedy and simple heuristic methods. However, beyond , it gradually falls behind PB-MSMA, indicating that relying solely on structural metrics and dynamic influence modeling struggles to consistently avoid seed coverage overlap in the mid-to-late stages.
In contrast, PB-MSMA maintains a more stable growth slope across these datasets, indicating its ability to continuously introduce more dispersed and complementary node configurations during the search process. This effectively delays the saturation of the diffusion scope. In relatively compact networks such as EUAir and PerformingArts, where the inherent differences between methods are minimal, the diffusion curves of the algorithms lie closer together. Although the gap between PB-MSMA and DIRCI/CELF remains limited, PB-MSMA maintains leading or joint-optimal performance in most k settings with smoother curve transitions. This shows its ability to maintain stable search behavior across diverse network structures. Results on the CElegans dataset further demonstrate that in networks with complex inter-layer coupling relationships, PB-MSMA gradually establishes an advantage even at moderate seed sizes and sustains it as the seed size grows.
Combining experimental results across six datasets reveals that PB-MSMA exhibits stable and consistent diffusion advantages across varying network sizes, inter-layer connection strengths, and structural heterogeneity. To provide a more explicit numerical evaluation of the diffusion performance,
Table 3 reports the dataset-level average diffusion spread, standard deviation, and relative improvement of PB-MSMA over the reference baselines. The reference baselines are defined as CELF, DPSOMIM, Degree, DIRCI, and PRGC, while Random-MC is retained only as a random lower-bound baseline. Using the revised dataset–seed-size evaluation protocol, PB-MSMA achieves an overall average improvement of 10.23%. The complete per-dataset and per-seed-size results are provided in Appendix
Table A5. This enhancement is more pronounced in networks with sparser structures and more diverse diffusion paths. In compact networks, the gains are smaller but remain stable. These results indicate that PB-MSMA’s performance gains do not depend on a specific dataset or parameter setting. Instead, they stem from its effective control of seed distribution and coverage expansion during the search process. This enables more structurally efficient diffusion expansion in multilayer influence maximization tasks.
To further verify the statistical reliability of the diffusion-spread comparison, we conducted repeated-run tests at
. Pairwise Wilcoxon signed-rank tests and a Friedman ranking test were performed using the repeated results. Appendix
Table A1,
Table A2,
Table A3 and
Table A4 report the repeated-run summaries, confidence intervals, Wilcoxon results, Friedman results, and average ranks. The results show that PB-MSMA obtains the highest repeated-run diffusion spread and the best average rank among the compared algorithms.
6.3. Ablation Study
To further dissect the specific contributions of key design elements to overall performance, this section conducts ablation analyses of PB-MSMA across multiple dimensions. The ablation analysis in this section unfolds on two levels: First, it examines the validity of the surrogate fitness metric P-EDV itself—specifically, the impact of replacing different fitness functions on search results without altering the search framework. Second, it analyzes the role of key mechanisms within the search framework by removing or simplifying modules such as probabilistic memory guidance and dynamic candidate pools to assess their effects on diffusion efficacy and stability. Through these two complementary sets of ablation configurations, we can more clearly distinguish the respective functions of evaluation-driven mechanisms and search mechanisms within PB-MSMA.
6.3.1. Surrogate Fitness Metric P-EDV
This section aims to independently validate the role of the surrogate fitness metric P-EDV in the search process. To this end, while fully preserving the PB-MSMA search framework and parameter configuration, only the fitness function is replaced. The diffusion effects of seed sets obtained under different fitness drivers are then compared in real diffusion evaluations. The fitness function is set to the corresponding node importance metrics from EDV, Degree, DIRCI, and PRGC, respectively. These are uniformly integrated into the PB-MSMA search process, forming control settings such as PB-MSMA(EDV), PB-MSMA(Degree), PB-MSMA(DIRCI) and PB-MSMA(PRGC). All methods generate seed sets with the same seed size k and evaluate their average number of activated nodes through 10,000 Monte Carlo simulations. The diffusion range is represented by the average number of activated nodes .
Figure 5 presents how the diffusion range varies with k on different datasets. PB-MSMA(P-EDV) maintains the highest or joint-highest diffusion curve across all datasets and the vast majority of k intervals, and its advantage persists as k increases. In contrast, replacing P-EDV with EDV generally shifts the curve downward. With purely structural fitness metrics (Degree, DIRCI, PRGC), the downward shift is even more pronounced. This shows that P-EDV favors seed configurations with stronger actual diffusion gains during the search phase, rather than simply increasing local or single-point structural scores.
A detailed examination reveals that, although the absolute performance margin varies across datasets, the relative ordering of methods remains highly stable. Specifically, for datasets such as CKM, EUAir, and C. elegans, the P-EDV curve begins to separate from the curves of competing fitness functions at relatively small values of k, and maintains a clear and consistent advantage throughout the medium-to-large k regime. This manifests as the entire curve shifting upward almost in parallel. This indicates that when networks exhibit stronger inter-layer coupling and richer inter-layer diffusion pathways, relying solely on local structure (e.g., Degree) or single-centrality modeling (the scoring emphasis of DIRCI/PRGC) tends to guide the search toward homogeneous high-score regions, resulting in insufficient real marginal gains between seeds. In contrast, P-EDV better captures differences in “inter-layer diffusion potential” and “multi-step expected returns” during scoring, thereby facilitating broader coverage configurations.
In contrast, in London_transport, the curves of various methods are closer with smaller gaps, reflecting a more “linear” diffusible space and stronger structural dominance within the current parameters and the range k. Although structural metrics still provide some effective guidance here, P-EDV maintains a lead in the middle-to-late stages, indicating that its advantage stems not from accidental early selections but persists throughout the iterative search process. On the other hand, Gallus exhibits the most pronounced differentiation among alternative fitnesses: the curve corresponding to structural fitnesses is noticeably lower and grows more slowly, while P-EDV and EDV better maintain effective gains as k increases, with P-EDV exhibiting a higher upper bound. This indicates that in scenarios with stronger structural heterogeneity and more complex alternative diffusion paths, the ability to incorporate true diffusion benefits into fitness directly determines whether the search can sustainably identify high-quality incremental seeds.
In summary, the comparison of fitness functions in this section shows that, with identical PB-MSMA search operators and hyperparameters, P-EDV consistently yields a larger diffusion range. This improvement stems not from random fluctuations of the search framework but from P-EDV’s more accurate modeling of multilayer diffusion benefits, which lets the population search progress more consistently toward solutions with higher diffusion potential during each generation’s comparison and selection. This result aligns with the core design motivation for proposing P-EDV as a surrogate fitness metric and provides direct evidence for PB-MSMA’s performance gains in the subsequent ablation and overall comparison experiments.
6.3.2. Convergence Analysis
To further analyze the impact of key mechanisms within the PB-MSMA search framework on performance, this section performs an ablation comparison of convergence behavior across different algorithm configurations. The comparison includes Original SMA, SMA + Pool, SMA + Pipe, SMA + Pipe-NoDecay, SMA + Elite Archive, and the full PB-MSMA model, all run with a fixed seed size k and repeated 1000 times. Figure 6 presents the P-EDV convergence curves of these methods on six multilayer network datasets.
Figure 6 compares the surrogate evaluation values over iterations for PB-MSMA and five comparison variants on different multilayer network datasets. In this comparison, “Original SMA” denotes the baseline algorithm without probabilistic memory or candidate pooling. “SMA + Pipe” and “SMA + Pool” denote variants using probabilistic memory and candidate pooling alone, respectively. “SMA + Pipe-NoDecay” denotes the variant that removes the decay term from the probability update, and “SMA + Elite Archive” denotes the variant that uses archive-based elite memory. “PB-MSMA” denotes the full method. Across all datasets, the full method achieves the highest final P-EDV value. Its convergence curve remains above those of the other ablation versions in the mid-to-late stages, indicating that the proposed modules yield clear synergistic gains in overall search performance.
Further comparison of the convergence behavior across the ablation versions reveals a different role for each module during the search process. The version incorporating the dynamic candidate pool (+Pool) exhibits a faster initial ascent, suggesting that restricting the search space and prioritizing nodes with higher structural importance accelerates the discovery of early viable solutions. However, its curve gradually flattens in the mid-to-late stages, which suggests that it tends to become trapped in local structural patterns. In contrast, the version guided solely by probabilistic memory (+Pipe) shows relatively limited improvement in the first few iterations but achieves sustained gains in the middle and late stages. This reflects how probabilistic memory accumulates and reuses high-quality structural information from previous iterations, enhancing search stability and preventing premature convergence. The two memory baselines further clarify the role of the pipeline update rule. SMA + Pipe-NoDecay also uses probability-vector guidance, but its final fitness is generally lower than that of PB-MSMA. This suggests that the decay term helps prevent early search patterns from dominating later updates. SMA + Elite Archive improves the search compared with Original SMA on several datasets, but it remains below PB-MSMA because it reuses discrete elite structures rather than a distribution-level probability prior.
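The effect attributed to the decay term can be illustrated with a toy probability-vector update. The rule below is a generic PBIL-style update used as a stand-in; it is an assumption for illustration and not the update rule defined for PB-MSMA. The function names, the learning rate of 0.3, and the two-node setting are all hypothetical.

```python
def update_with_decay(p, best, rate=0.3):
    """Move each probability toward the current best solution; older evidence decays."""
    return [(1.0 - rate) * pi + rate * bi for pi, bi in zip(p, best)]


def update_no_decay(p, best, rate=0.3):
    """Only reinforce selected nodes; nothing is forgotten (values clipped at 1)."""
    return [min(1.0, pi + rate * bi) for pi, bi in zip(p, best)]


def run(update, phases, start=(0.5, 0.5)):
    """Apply `update` repeatedly; `phases` is a list of (best_solution, n_steps)."""
    p = list(start)
    for best, steps in phases:
        for _ in range(steps):
            p = update(p, best)
    return p


# Early iterations favour node 0, later iterations favour node 1.
phases = [([1, 0], 5), ([0, 1], 5)]
with_decay = run(update_with_decay, phases)
no_decay = run(update_no_decay, phases)
```

In this toy setting the decayed vector ends up strongly preferring node 1, the node favoured by recent iterations, while the non-decayed vector keeps node 0 at its maximum probability, so the early pattern continues to compete with later evidence.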
By combining the “rapid focus” capability of the dynamic candidate pool with the “long-term guidance” mechanism of probabilistic memory, the complete method enables the search process to quickly identify potential high-quality regions early on and continuously mine underutilized structural information later, resulting in a smoother and steadily ascending convergence trajectory. Experimental results across multiple datasets demonstrate that probabilistic memory guidance and dynamic candidate pooling do not merely exhibit additive performance gains. Instead, they form a complementary relationship at the search dynamics level. The former enhances the algorithm’s structural memory capabilities during multi-generation iterations, while the latter effectively mitigates randomness and noise interference arising from high-dimensional discrete search spaces. Their synergistic interaction enables PB-MSMA to outperform single-module variants and simpler memory baselines in both convergence speed and final solution quality, thereby validating the rationality and effectiveness of the proposed framework for multilayer influence maximization problems.
6.4. Comparison of Running Time
Figure 7 presents the running time comparison of different algorithms on six multilayer network datasets (CKM, C. elegans, Gallus, London_transport, EUAir, and PerformingArts) under a fixed seed size k.
It can be observed that PB-MSMA exhibits running times at a lower order of magnitude across all datasets, consistently outperforming the baseline algorithms Original-SMA, DPSOMIM, CELF, and DIRCI. Furthermore, its computational cost is significantly lower than that of CELF, which relies on high-frequency diffusion evaluation. Although Degree remains the fastest method in terms of runtime, it serves only as a structural baseline and is not comparable to PB-MSMA in diffusion effectiveness. Overall, PB-MSMA demonstrates stable runtime performance across different datasets, without significant amplification due to network scale or structural complexity.
Examining dataset-specific performance reveals that PB-MSMA’s computational advantage is particularly pronounced in scenarios with more complex structures or larger networks. For instance, on the EUAir and CKM datasets, CELF’s runtime increases significantly, reflecting its need to repeatedly execute diffusion simulations during each seed selection round, where computational costs accumulate rapidly with network complexity. In contrast, PB-MSMA exhibits only limited fluctuations in runtime. This indicates that PB-MSMA avoids frequent invocations of actual diffusion evaluations during search. Instead, it leverages surrogate metrics for candidate solution screening and comparison, effectively controlling computational costs per iteration.
Compared to DPSOMIM and DIRCI, PB-MSMA also employs an iterative search framework. However, due to the constrained candidate space in each generation, the search process is more focused, avoiding the additional time overhead associated with large-scale candidate updates. This enables PB-MSMA to achieve a runtime lower than that of both methods across all six datasets. Overall, PB-MSMA’s runtime performance directly reflects its design philosophy: it effectively constrains computational costs through surrogate evaluation and controlled search strategies without relying on high-frequency diffusion simulations. Combined with the diffusion range comparison results from the previous section, it is evident that PB-MSMA does not sacrifice diffusion performance for efficiency. Instead, it maintains diffusion advantages while keeping the runtime within a low and stable order of magnitude. This characteristic enhances PB-MSMA’s computational feasibility in practical scenarios involving larger and more complex multilayer networks.
6.5. Overlap Analysis
Figure 8 presents the comparison of the overlap ratio for different algorithms on six real-world multilayer network datasets, with a fixed seed size k. A lower overlap ratio indicates less overlap in the diffusion regions activated by different seed nodes.
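To make the notion of overlap concrete, the sketch below uses one plausible formalization: one minus the size of the union of the per-seed activated regions divided by the sum of their sizes. This formula is an assumption for illustration and may differ from the overlap ratio defined for the experiments; the function name `overlap_ratio` and the example sets are hypothetical.

```python
def overlap_ratio(reach_sets):
    """Return 1 - |union of regions| / (sum of region sizes).

    reach_sets: one set of activated nodes per seed.
    0.0 means the regions are disjoint; values near 1 mean heavy redundancy.
    """
    total = sum(len(region) for region in reach_sets)
    if total == 0:
        return 0.0
    union = set().union(*reach_sets)
    return 1.0 - len(union) / total


disjoint = overlap_ratio([{1, 2}, {3, 4}])    # seeds cover separate regions
redundant = overlap_ratio([{1, 2}, {1, 2}])   # seeds cover the same region
```

Under this definition, two seeds that activate the same region contribute no additional coverage, which is the kind of redundancy a low overlap ratio is meant to rule out.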
The general results show that PB-MSMA consistently achieves lower influence overlap ratios across all datasets, significantly outperforming methods such as CELF, DIRCI, and Degree in most cases. Compared to DPSOMIM and PRGC, PB-MSMA also maintains more stable performance and lower overlap. This indicates that under the same seed size constraints, the seed sets selected by PB-MSMA exhibit better dispersion in diffusion coverage rather than relying on multiple highly overlapping diffusion sources to achieve diffusion effects.
Examining the specific bar distributions across different datasets further reveals that PB-MSMA’s advantage in overlap control is more pronounced in networks with complex structures or diverse diffusion paths. In the EUAir, PerformingArts, and C. elegans datasets, CELF and Degree exhibit high overlap ratios, indicating that these greedy or locally structured sorting methods tend to repeatedly select nodes within similar core regions. This leads to significant early-stage overlap in diffusion paths between different seeds. While DIRCI mitigates this issue to some extent, its overlap levels remain markedly higher than PB-MSMA’s. In contrast, PB-MSMA consistently maintains relatively low overlap ratios across these datasets, reflecting its ability to effectively distinguish potential coverage relationships among candidate nodes during the search and avoid redundant seed deployment within the same diffusion region.
On datasets with inherently low overlap cardinality, such as CKM and London_transport, the differences between algorithms are relatively limited. Nevertheless, PB-MSMA consistently achieves the smallest or near-smallest overlap values, indicating that it avoids introducing additional coverage redundancy across diverse network structures. The results in the Gallus dataset further demonstrate that even in scenarios with more dispersed seed distributions, PB-MSMA maintains stable diffusion coverage while controlling overlap, rather than achieving superficial gains through concentrated deployment.
Combining the results on influence overlap ratio with the diffusion range results from previous sections shows that PB-MSMA’s diffusion advantage does not stem from simply superimposing multiple highly overlapping diffusion sources. Instead, it achieves coverage expansion through a more rational spatial distribution of seeds. While maintaining a low influence overlap ratio, PB-MSMA still achieves a larger diffusion range across multiple datasets, indicating that it effectively balances coverage dispersion and diffusion efficiency during the search process. This characteristic aligns with the improved diffusion performance observed in Section 6.2: the proposed method not only improves the overall diffusion effectiveness in the multilayer influence maximization problem but also significantly reduces structural redundancy in the seed selection process, thus achieving more structurally efficient influence diffusion.
7. Conclusions
In multilayer network environments, the inter-layer coupling of diffusion processes and the high combinatorial complexity of seed selection spaces result in significant nonlinear characteristics for influence maximization problems at both the evaluation and search levels. Addressing the challenge of embedding high-frequency, precise evaluation into iterative search within multilayer independent cascade diffusion models, this paper proposes a novel bio-inspired swarm intelligence solution framework for multilayer influence maximization from an “intelligent surrogate evaluation + guided swarm search” perspective. This framework employs a lightweight surrogate model, the Preference-based Expected Diffusion Value (P-EDV), to rapidly approximate the diffusion potential of candidate seed sets, successfully bypassing computationally prohibitive explicit simulations. By integrating a probabilistic pipeline with a dynamic candidate pool mechanism, it introduces intergenerational statistical guidance and candidate space constraints during the search process, achieving a balance between computational efficiency and diffusion effectiveness.
Under this configuration, the search process does not rely on high-frequency multi-hop simulations or unguided random exploration. Instead, it progressively converges toward seed configurations with high diffusion potential. The experimental results systematically validate the proposed method in multiple dimensions, including parameter sensitivity, module ablation, influence diffusion range, runtime, and influence overlap. The results demonstrate consistent performance ranking across different network structures and seed sizes for each module configuration. The complete model exhibits stable advantages in diffusion effectiveness, convergence behavior, and redundancy control. Under real-world diffusion evaluation conditions, the proposed method achieves an average 10.23% improvement in the diffusion range compared to multiple representative baseline algorithms.
Similarly, runtime comparisons reveal that the proposed method significantly enhances diffusion performance without introducing additional computational overhead of the same order of magnitude, demonstrating strong computational feasibility and scalability for massive real-world datasets. Differences in network scale, inter-layer coupling strength, and structural heterogeneity influence the specific manifestation of performance variations: in smaller or highly centralized networks, performance gaps between methods remain relatively limited, whereas in scenarios with larger node scales and more diverse diffusion paths, the advantages derived from structural guidance and search strategies become more pronounced. This further indicates that surrogate evaluation and structure-guided mechanisms are better positioned to leverage their potential advantages in complex multilayer networks.
Despite these findings, the surrogate evaluation metrics employed in this paper primarily characterize first-order and local diffusion properties within multilayer networks, so there is still room to extend them toward higher-order diffusion structures. The proposed framework offers an efficient intelligent decision-support tool for seed selection in multilayer influence maximization, and it shows potential for real-world applications such as cross-platform viral marketing, public opinion intervention, and multilayer infrastructure diffusion analysis. This study suggests that simulation-free surrogate evaluation and structure-guided search can jointly improve the feasibility of influence maximization in complex multilayer environments. Future work may develop more expressive surrogate metrics for higher-order and deeper cascade structures, characterize inter-layer heterogeneity adaptively by estimating inter-layer penetration parameters from real diffusion traces, and extend the framework to dynamic multilayer networks with time-varying nodes, edges, and layer interactions.
Author Contributions
Conceptualization, W.L.; methodology, S.C. and W.J.; software, S.C.; validation, S.C. and W.J.; investigation, W.L.; data curation, S.C.; writing—original draft preparation, S.C., W.J. and T.Z.; writing—review and editing, W.L.; visualization, T.Z.; funding acquisition, W.L. All authors have read and agreed to the published version of the manuscript.
Funding
This research was supported by the National Natural Science Foundation of China (Grant No. 61971233 and Grant No. 61702441), the Postgraduate Research & Practice Innovation Program of Jiangsu Province (Grant No. KYCX24_3740), and the Municipal-University Cooperation Project of Yangzhou Science and Technology Plan (Grant No. YZ2025208).
Institutional Review Board Statement
Not applicable.
Informed Consent Statement
Not applicable.
Data Availability Statement
The implementation code, hyperparameter configuration file, random seed settings, dataset loading and preprocessing script, and an executable reproducibility demo are publicly available at https://github.com/siyuc5456-design/PB-MSMA-Demo (accessed on 19 May 2026). The repository provides the core PB-MSMA implementation and a runnable demo on the Gallus multilayer dataset using the reported parameter settings.
Acknowledgments
We gratefully acknowledge the support from the National Natural Science Foundation of China and the Jiangsu Provincial Department of Education. Their support was essential to the completion of this research on PB-MSMA in multilayer networks.
Conflicts of Interest
The authors declare no conflicts of interest. Funders did not interfere in the research process.
Abbreviations
The following abbreviations are used in this manuscript:
| IM | Influence Maximization |
| SMA | Slime Mold Algorithm |
| PB-MSMA | Probabilistic-Based Multilayer Slime Mold Algorithm |
| EDV | Expected Diffusion Value |
| IC | Independent Cascade |
| LT | Linear Threshold |
| MLIM | Multilayer Influence Maximization |
Appendix A. Statistical Analysis Results
Table A1.
Repeated-run diffusion spread summary at a fixed seed size k.
| Dataset | PB-MSMA | CELF | DPSOMIM | Degree | DIRCI | PRGC | Random-MC |
|---|---|---|---|---|---|---|---|
| CKM | 128.20 ± 0.83 | 121.21 ± 0.08 | 125.27 ± 1.11 | 115.49 ± 0.08 | 114.97 ± 0.08 | 77.06 ± 0.05 | 113.08 ± 0.10 |
| London_transport | 84.56 ± 0.89 | 76.74 ± 0.14 | 82.32 ± 1.00 | 76.50 ± 0.14 | 74.22 ± 0.13 | 52.76 ± 0.08 | 70.21 ± 0.16 |
| EUAir | 100.20 ± 1.11 | 97.27 ± 0.21 | 97.64 ± 0.88 | 95.61 ± 0.21 | 95.42 ± 0.21 | 93.89 ± 0.21 | 78.77 ± 0.30 |
| PerformingArts | 132.53 ± 0.64 | 129.77 ± 0.33 | 130.52 ± 0.64 | 129.30 ± 0.32 | 129.00 ± 0.32 | 111.26 ± 0.48 | 123.35 ± 0.41 |
| CElegans | 106.89 ± 0.68 | 103.85 ± 0.24 | 103.82 ± 0.84 | 100.07 ± 0.23 | 101.04 ± 0.25 | 97.59 ± 0.24 | 91.92 ± 0.29 |
| Gallus | 82.07 ± 0.67 | 76.41 ± 0.16 | 80.14 ± 0.83 | 75.25 ± 0.16 | 76.46 ± 0.16 | 72.17 ± 0.16 | 72.10 ± 0.22 |
Table A2.
Overall Wilcoxon signed-rank test results.
| Comparison | Sample Size | Mean Diff. | Median Diff. | Holm-adj. p-Value |
|---|---|---|---|---|
| PB-MSMA vs. CELF | 6000 | 4.87 | 4.48 | < |
| PB-MSMA vs. DIRCI | 6000 | 7.22 | 5.86 | < |
| PB-MSMA vs. DPSOMIM | 6000 | 2.46 | 2.46 | < |
| PB-MSMA vs. Degree | 6000 | 7.04 | 6.78 | < |
| PB-MSMA vs. PRGC | 6000 | 21.62 | 15.00 | < |
| PB-MSMA vs. Random-MC | 6000 | 14.17 | 14.48 | < |
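As a consistency check, the mean difference for PB-MSMA vs. CELF can be recomputed from the per-dataset means in Table A1, assuming the 6000 paired samples are split evenly across the six datasets (an assumption, since the split is not stated here). The sketch also shows the standard Holm step-down adjustment; because the adjusted p-values in the table are truncated, the p-values used below are hypothetical placeholders, and `holm_adjust` is an illustrative helper rather than the authors' code.

```python
# Per-dataset means copied from Table A1 (CKM, London, EUAir, PerformingArts, CElegans, Gallus).
pb_msma = [128.20, 84.56, 100.20, 132.53, 106.89, 82.07]
celf = [121.21, 76.74, 97.27, 129.77, 103.85, 76.41]

# With equal numbers of runs per dataset, the overall mean difference
# equals the average of the per-dataset differences.
mean_diff = sum(a - b for a, b in zip(pb_msma, celf)) / len(pb_msma)


def holm_adjust(pvals):
    """Holm step-down adjustment; returns adjusted p-values in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        candidate = min(1.0, (m - rank) * pvals[i])  # multiplier shrinks from m down to 1
        running_max = max(running_max, candidate)    # enforce monotonicity
        adjusted[i] = running_max
    return adjusted


# Hypothetical raw p-values, for illustration only.
adjusted = holm_adjust([0.01, 0.04, 0.03])
```

Rounded to two decimals, `mean_diff` reproduces the 4.87 reported for PB-MSMA vs. CELF.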
Table A3.
Friedman ranking test results.
| Scope | Blocks | Algorithms | Friedman Stat. | p-Value |
|---|---|---|---|---|
| Overall | 6000 | 7 | 33,617.63 | < |
Table A4.
Average ranks from the Friedman ranking analysis.
| Algorithm | Average Rank |
|---|---|
| PB-MSMA | 1.033 |
| DPSOMIM | 2.155 |
| CELF | 2.966 |
| Degree | 4.373 |
| DIRCI | 4.472 |
| Random-MC | 6.433 |
| PRGC | 6.567 |
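The average ranks in Table A4 can be cross-checked against the Friedman statistic in Table A3. The sketch below uses the textbook Friedman formula without tie correction, which is an assumption about how the statistic was computed; the rank values are copied from Table A4 and the block count from Table A3.

```python
# Average ranks copied from Table A4.
avg_ranks = {
    "PB-MSMA": 1.033, "DPSOMIM": 2.155, "CELF": 2.966, "Degree": 4.373,
    "DIRCI": 4.472, "Random-MC": 6.433, "PRGC": 6.567,
}
n_blocks = 6000           # blocks reported in Table A3
k = len(avg_ranks)        # number of algorithms (7)

# Average ranks over k algorithms must sum to k(k+1)/2 = 28.
rank_sum = sum(avg_ranks.values())
expected_sum = k * (k + 1) / 2

# Friedman chi-square without tie correction:
#   12 n / (k (k + 1)) * (sum of squared average ranks - k (k + 1)^2 / 4)
sum_sq = sum(r * r for r in avg_ranks.values())
friedman_stat = 12 * n_blocks / (k * (k + 1)) * (sum_sq - k * (k + 1) ** 2 / 4)

reported = 33617.63       # value reported in Table A3
relative_gap = abs(friedman_stat - reported) / reported
```

The recomputed statistic falls within about 0.1% of the reported 33,617.63; a small gap is expected because the ranks are rounded to three decimals and a tie correction would raise the statistic slightly.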
Appendix B. Complete Numerical Diffusion Results
Table A5.
Per-dataset and per-seed-size numerical diffusion results of PB-MSMA.
| Dataset | Size (k) | PB-MSMA Mean | Std. Dev. | Baselines Mean | Rel. Improv. (%) |
|---|---|---|---|---|---|
| CKM | 5 | 41.39 | 9.57 | 37.55 | 10.21 |
| | 10 | 62.80 | 9.91 | 55.44 | 13.27 |
| | 15 | 76.82 | 9.79 | 68.81 | 11.63 |
| | 20 | 88.87 | 9.72 | 77.32 | 14.94 |
| | 25 | 98.38 | 9.34 | 84.66 | 16.21 |
| | 30 | 107.00 | 9.00 | 92.56 | 15.60 |
| | 35 | 114.72 | 8.70 | 99.11 | 15.75 |
| | 40 | 122.62 | 8.56 | 105.65 | 16.06 |
| | 45 | 128.02 | 8.14 | 110.69 | 15.65 |
| London_trans. | 5 | 15.06 | 4.20 | 14.00 | 7.55 |
| | 10 | 27.68 | 5.19 | 23.97 | 15.49 |
| | 15 | 37.12 | 5.37 | 31.80 | 16.75 |
| | 20 | 45.10 | 5.63 | 39.18 | 15.09 |
| | 25 | 52.71 | 5.96 | 46.85 | 12.50 |
| | 30 | 62.58 | 6.32 | 54.21 | 15.42 |
| | 35 | 69.19 | 6.47 | 60.71 | 13.97 |
| | 40 | 77.89 | 6.58 | 67.11 | 16.05 |
| | 45 | 84.55 | 6.77 | 72.45 | 16.70 |
| EUAir | 5 | 35.23 | 7.68 | 34.08 | 3.39 |
| | 10 | 52.55 | 7.49 | 50.14 | 4.79 |
| | 15 | 63.79 | 7.51 | 61.63 | 3.50 |
| | 20 | 70.85 | 7.16 | 69.49 | 1.96 |
| | 25 | 79.02 | 7.19 | 76.02 | 3.95 |
| | 30 | 82.94 | 7.13 | 81.02 | 2.37 |
| | 35 | 88.43 | 7.17 | 85.98 | 2.86 |
| | 40 | 95.20 | 6.77 | 90.96 | 4.65 |
| | 45 | 101.67 | 6.81 | 96.21 | 5.68 |
| PerformingArts | 5 | 47.00 | 15.67 | 40.35 | 16.47 |
| | 10 | 67.82 | 14.28 | 60.42 | 12.25 |
| | 15 | 82.48 | 13.27 | 74.49 | 10.74 |
| | 20 | 93.76 | 12.80 | 85.88 | 9.18 |
| | 25 | 102.60 | 12.05 | 95.24 | 7.73 |
| | 30 | 111.80 | 11.57 | 104.37 | 7.12 |
| | 35 | 118.93 | 11.28 | 111.94 | 6.25 |
| | 40 | 126.31 | 10.82 | 119.46 | 5.73 |
| | 45 | 132.59 | 10.38 | 126.08 | 5.17 |
| CElegans | 5 | 35.06 | 8.42 | 32.49 | 7.90 |
| | 10 | 50.07 | 8.53 | 46.66 | 7.32 |
| | 15 | 61.52 | 8.41 | 58.36 | 5.41 |
| | 20 | 71.71 | 8.19 | 67.69 | 5.95 |
| | 25 | 79.77 | 8.27 | 76.43 | 4.38 |
| | 30 | 86.58 | 8.08 | 83.88 | 3.22 |
| | 35 | 95.25 | 8.08 | 90.24 | 5.55 |
| | 40 | 99.76 | 7.94 | 96.51 | 3.37 |
| | 45 | 107.12 | 7.82 | 101.09 | 5.96 |
| Gallus | 5 | 35.17 | 5.14 | 26.80 | 31.25 |
| | 10 | 43.50 | 5.39 | 33.72 | 28.99 |
| | 15 | 49.78 | 5.39 | 40.36 | 23.34 |
| | 20 | 55.06 | 5.49 | 49.08 | 12.20 |
| | 25 | 61.12 | 5.57 | 56.43 | 8.31 |
| | 30 | 67.20 | 5.54 | 61.37 | 9.50 |
| | 35 | 71.24 | 5.50 | 67.06 | 6.23 |
| | 40 | 74.88 | 5.47 | 71.52 | 4.70 |
| | 45 | 80.71 | 5.57 | 76.14 | 6.00 |
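The relative-improvement column of Table A5 and the average improvement quoted in the Conclusions can be recomputed with a few lines of arithmetic. The values below are copied from Table A5; the helper name `relative_improvement` is illustrative, and small deviations from the tabulated percentages are expected because the table reports rounded means.

```python
def relative_improvement(method_mean, baseline_mean):
    """Relative improvement of the method over the baseline mean, in percent."""
    return (method_mean - baseline_mean) / baseline_mean * 100.0


# Rel. Improv. (%) column of Table A5, for k = 5, 10, ..., 45.
rel_improv = {
    "CKM": [10.21, 13.27, 11.63, 14.94, 16.21, 15.60, 15.75, 16.06, 15.65],
    "London_transport": [7.55, 15.49, 16.75, 15.09, 12.50, 15.42, 13.97, 16.05, 16.70],
    "EUAir": [3.39, 4.79, 3.50, 1.96, 3.95, 2.37, 2.86, 4.65, 5.68],
    "PerformingArts": [16.47, 12.25, 10.74, 9.18, 7.73, 7.12, 6.25, 5.73, 5.17],
    "CElegans": [7.90, 7.32, 5.41, 5.95, 4.38, 3.22, 5.55, 3.37, 5.96],
    "Gallus": [31.25, 28.99, 23.34, 12.20, 8.31, 9.50, 6.23, 4.70, 6.00],
}

all_values = [v for values in rel_improv.values() for v in values]
average = sum(all_values) / len(all_values)
per_dataset = {name: sum(v) / len(v) for name, v in rel_improv.items()}

# Single-row check against the rounded means (CKM, k = 5).
ckm_k5 = relative_improvement(41.39, 37.55)

print(f"{average:.2f}")  # prints 10.23
```

Averaging all 54 dataset and seed-size combinations gives 10.23%, matching the figure stated in the Conclusions, while the per-dataset averages show that the gain is smallest on EUAir and largest on CKM, London_transport, and Gallus.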
Appendix C. Detailed Related Work Comparison
Table A6.
Detailed comparison of methodologies, benchmark datasets, and compared baselines.
| Method | Methodology | Benchmark Datasets | Compared Baselines |
|---|---|---|---|
| CELF [22] | Submodularity-based lazy-forward greedy evaluation for outbreak detection and sensor/blog placement. | EPA water distribution network; real blog cascades. | Simple greedy; heuristic sensor-placement methods. |
| Gravity [23] | Gravity centrality using k-shell value as node mass and shortest-path distance as interaction distance. | Facebook; Netsci; Email; TAP; Y2H; Blogs; Router; HEP; PGP. | DC; MDD; G+; Cnc+; k-shell; BC; CC; SL. |
| CAGM [24] | Communicability-based adaptive gravity model using adaptive influence radius and communicability matrix. | Facebook; Power; BA-6000; LFR-6000; WS-6000; Erdos; WV; Ca-hepth; PGP; DBLP; Sex; Condmat. | ECRM; EGM; GC; GGC; KSGC; Ksh; LGC; LKG. |
| IM-ELPR [25] | Extended h-index seeding, label propagation, community merging, and top-k node selection. | Football; Email; Jazz; C. Elegans; Facebook Pages; Wiki Vote; Tech Routers; Gnutella P2P 08. | IM-LPA; GLR; LID; DCL; RNR; k-shell; PageRank; Betweenness; Greedy. |
| LIDDE [26] | Differential evolution with local-influence-descending search and EDIV objective function. | CA-HEPTh; CA-GrQc; CA-CondMat; Wiki-Vote; p2p-Gnutella31; Amazon. | Degree; Degree Discount; PMIA; CELF++; DDSE; LAPSO-IM; DPSO; ELDPSO. |
| IDPSO [27] | Improved discrete particle swarm optimization with a local influence evaluation function, initialization strategy, and local search mechanism. | NetInfective; NetGRQC. | CELF++; DPSO; DC. |
| WGCM [28] | Weighted gravity centrality using neighborhood size and weighted social distance in multiplex networks. | BA; WS; London transportation; C. elegans; AUCS CS-Aarhus; Physicians-Innovation. | Neighborhood size; degree centrality. |
| PRGC [29] | Improved gravity centrality using multi-PageRank as node mass and weighted shortest-path distance across layers. | C. elegans; London transportation; Xenopus; DanioRerio; Bos; Lazega-Law-Firm. | GC; SVT; f-EC; GBC; GCC; GEC. |
| CBCM [30] | Tensor-based multilayer centrality combining PageRank score, community importance, gateway influence, and layer importance. | Artificial multilayer network; MIT; Cora; Bos GPI; London transportation. | PR_BIS; f-EC; PC-M; SVT; WGCM. |
| DIRCI [21] | Dynamic influence range, network-layer centrality, and community-based centrality are combined to rank key nodes in multilayer networks. | Three artificial and sixteen real multilayer networks. | PRGC; PR_BIS; SVT; CBCM; DC+; KSGM; HVGC; KS_IF. |
| NGGA [20] | Cost-aware node grouping genetic algorithm with two-hop expected propagation. | Co-author; Cannes2013; Sanremo2016. | RANDOM; MAX_ICR; PPRank; NGGA-R. |
| DPSOMIM [19] | Two-stage MLIM algorithm using RCC candidate screening, discrete PSO seed selection, MLEDV fitness, and neighborhood optimization. | CKM; CElegans; Gallus; Multi-Soc-Wiki-Vote; Multi-lastfm_asia; ArXiv-Netscience. | Four influence-spread baselines; RCC_DPSO; NO_DPSO; DPSO. |
| DEDRL [16] | Differential evolution-aided deep reinforcement learning with multilayer network embedding under the CIC model. | CKM; FB-TT; Leskovec; London transportation; BA; BA-WS-ER. | DPSOMIM; WGCM; K++shell; MA-IMmulti; RRW; MIM-Reasoner. |
Table A7.
Detailed comparison of reported effects and limitations.
| Method | Reported Effect | Limitation for MLIM |
|---|---|---|
| CELF [22] | To select 100 blogs, greedy takes 4.5 h, while CELF takes 23 s, about 700× faster; CELF also outperforms the best water-network heuristic by 45% on the PA objective. | Single-layer outbreak setting; no inter-layer diffusion modeling. |
| Gravity [23] | SIR-based validation shows that gravity centrality identifies influential spreaders more effectively than common centralities; the paper reports ranking lists and monotonicity values on 9 networks. | Fixed-radius single-layer ranking; no inter-layer diffusion. |
| CAGM [24] | CAGM reports up to 2.90% ranking-accuracy margin, up to 3.71% lower imprecision function, and up to 0.49% spreading-probability margin; it also achieves the highest monotonicity on all 12 datasets. | Single-layer centrality ranking; no seed-set search. |
| IM-ELPR [25] | IM-ELPR reports higher infected scale in most datasets; Table 2 gives Kendall’s tau values, with IM-ELPR reaching 0.88 on Email and 0.85 on Gnutella P2P 08. Reported execution time ranges from 0.09 s to 21 s across datasets. | Community-dependent single-layer IM. |
| LIDDE [26] | On five social networks, LIDDE reports average gains of 17.28%, 32.90%, 21.19%, 23.13%, and 57.41% over DDSE, LAPSO-IM, Degree Discount, PMIA, and ELDPSO, respectively; on Amazon, it reports gains of 50.81%, 49.86%, 148%, 3.07%, and 200.72% over the same groups of methods. | Single-layer IM; no multilayer coupling. |
| IDPSO [27] | IDPSO reports influence spread close to CELF++ and better than DPSO and DC while requiring much shorter running time; results are mainly shown by influence-spread and runtime curves. | Single-layer IC setting; no inter-layer diffusion modeling. |
| WGCM [28] | WGCM-2 reports the most competitive Kendall’s tau and Spearman rank correlations with real influence in the illustrative multiplex network; under SSS, TSSS, and RSSS, it achieves higher influence coverage and faster diffusion acceleration in most networks. | Centrality-based seeding; sensitive to radius and edge weights. |
| PRGC [29] | PRGC identifies more infected nodes than other centralities in most real-world multilayer networks; intersection-similarity analysis also shows closer ranking consistency with the LT model. | Static centrality ranking; no combinatorial seed-set search. |
| CBCM [30] | In the artificial network, CBCM and the LT model share two of the top-three vertices; in the MIT network with SS = 12, CBCM influences more vertices at early time steps, and all methods reach 75 vertices when . | Requires community information; no global seed-set optimization. |
| DIRCI [21] | DIRCI reports the highest final activated nodes for top-10 nodes under a 30-step LT model on the tested multilayer networks; Kendall’s tau analysis reports a maximum . | Centrality ranking; no combinatorial seed-set optimization. |
| NGGA [20] | NGGA achieves 8.93% median spread gain over the strongest reported baseline under ; it obtains the highest mean influenced nodes in all six dataset-cost settings. | Cost-aware multiplex IM setting; not a general MLIC-based MLIM formulation. |
| DPSOMIM [19] | DPSOMIM reports the fastest convergence and best MLEDV among ablation variants on six datasets; its time complexity is reported as , where denotes the number of nodes. | MLEDV is mainly local/neighborhood-oriented; search depends on DPSO and neighborhood optimization. |
| DEDRL [16] | DEDRL reports 3.8% average performance improvement; for , it improves influence spread by 19.3% and 20.5% over K++shell and MA-IMmulti on BA, and by 5.72% and 10.9% over DPSOMIM and WGCM on CKM. Wilcoxon tests report p-values such as 0.028 and 0.046 in several comparisons. | Training cost; CIC still simplifies some real-world inter-layer factors. |
References
- Zhou, M.; Liu, H.; Liao, H.; Liu, G.; Mao, R. Finding the key nodes to minimize the victims of the malicious information in complex network. Knowl.-Based Syst. 2024, 293, 111632.
- Ghavasieh, A.; De Domenico, M. Diversity of information pathways drives sparsity in real-world networks. Nat. Phys. 2024, 20, 512–519.
- Hu, X.; Wang, L.; Zhang, C.K.; He, Y. Fixed-time synchronization of fuzzy complex dynamical networks with reaction-diffusion terms via intermittent pinning control. IEEE Trans. Fuzzy Syst. 2024, 32, 2307–2317.
- Ahmed, S.F.; Kuldeep, S.A.; Rafa, S.J.; Fazal, J.; Hoque, M.; Liu, G.; Gandomi, A.H. Enhancement of traffic forecasting through graph neural network-based information fusion techniques. Inf. Fusion 2024, 110, 102466.
- Ou, Z.; Wang, S. Finding robust and influential nodes on directed networks using a memetic algorithm. Swarm Evol. Comput. 2024, 87, 101542.
- Domingos, P.; Richardson, M. Mining the network value of customers. In Proceedings of the Seventh ACM SIGKDD; ACM: New York, NY, USA, 2001; pp. 57–66.
- Richardson, M.; Domingos, P. Mining knowledge-sharing sites for viral marketing. In Proceedings of the Eighth ACM SIGKDD; ACM: New York, NY, USA, 2002; pp. 61–70.
- Kempe, D.; Kleinberg, J.; Tardos, É. Maximizing the spread of influence through a social network. In Proceedings of the Ninth ACM SIGKDD; ACM: New York, NY, USA, 2003; pp. 137–146.
- Tang, J.; Geng, L.; Pang, J.; Fu, J.; Pu, M. FLADE: Guiding differential evolution through fitness landscape sequence analysis for influence maximization. Expert Syst. Appl. 2026, 299, 130229.
- Mohammadi, S.; Nadimi-Shahraki, M.H.; Beheshti, Z. An effective Fuzzy Sign-aware Influence Maximization method in complex networks using an adaptive Improved Grey Wolf Optimizer with dynamic user interactions. Chaos Solitons Fractals 2026, 206, 117912.
- Jaouadi, M.; Romdhane, L.B. A survey on influence maximization models. Expert Syst. Appl. 2024, 248, 123429.
- Strogatz, S.H. Exploring complex networks. Nature 2001, 410, 268–276.
- Liu, Y.; Zeng, Q.; Pan, L.; Tang, M. Identify influential spreaders in asymmetrically interacting multiplex networks. IEEE Trans. Netw. Sci. Eng. 2023, 10, 2201–2211.
- Chen, J.; Li, Y.; Kou, G.; Wang, H. Effect of three-stage cascade of opinion dynamics models in coupled networks. Neurocomputing 2024, 572, 127176.
- Yu, X.; Mi, J.; Tang, L.; Long, L.; Qin, X.; Rezaeipanah, A. Security-aware and scalable community detection in multilayer social networks via semi-supervised matrix factorization. Chaos Solitons Fractals 2025, 200, 116968.
- Tang, J.; Li, C.; Liu, L.; Xu, T.; Yao, Y. A novel evolutionary deep reinforcement learning algorithm for the influence maximization problem in multilayer social networks. Chaos Solitons Fractals 2025, 200, 116967.
- Achour, O.; Romdhane, L.B. A theoretical review on multiplex influence maximization models: Theories, methods, challenges, and future directions. Expert Syst. Appl. 2025, 266, 125990.
- Freeman, L.C. Centrality in social networks conceptual clarification. Soc. Netw. 1978, 1, 215–239.
- Wang, S.; Liu, W.; Chen, L.; Zong, S. Influence maximization based on discrete particle swarm optimization on multilayer network. Inf. Syst. 2025, 127, 102466.
- Hu, X.M.; Zhao, Y.Q.; Yang, Z. Nodes grouping genetic algorithm for influence maximization in multiplex social networks. In Proceedings of the 2023 26th International Conference on Computer Supported Cooperative Work in Design (CSCWD); IEEE: New York, NY, USA, 2023; pp. 1130–1135.
- An, Z.; Hu, X.; Jiang, R.; Jiang, Y. A novel method for identifying key nodes in multi-layer networks based on dynamic influence range and community importance. Knowl.-Based Syst. 2024, 305, 112639.
- Leskovec, J.; Krause, A.; Guestrin, C.; Faloutsos, C.; VanBriesen, J.; Glance, N. Cost-effective outbreak detection in networks. In Proceedings of the 13th ACM SIGKDD; ACM: New York, NY, USA, 2007; pp. 420–429.
- Ma, L.L.; Ma, C.; Zhang, H.F.; Wang, B.H. Identifying influential spreaders in complex networks based on gravity formula. Phys. A Stat. Mech. Its Appl. 2016, 451, 205–212.
- Xu, G.; Dong, C. CAGM: A communicability-based adaptive gravity model for influential nodes identification in complex networks. Expert Syst. Appl. 2024, 235, 121154.
- Kumar, S.; Singhla, L.; Jindal, K.; Grover, K.; Panda, B.S. IM-ELPR: Influence maximization in social networks using label propagation based community structure. Appl. Intell. 2021, 51, 7647–7665.
- Qiu, L.; Tian, X.; Zhang, J.; Gu, C.; Sai, S. LIDDE: A differential evolution algorithm based on local-influence-descending search strategy for influence maximization in social networks. J. Netw. Comput. Appl. 2021, 178, 102973.
- Wang, B.; Ma, L.; He, Q. IDPSO for influence maximization under independent cascade model. In Proceedings of the 2022 4th International Conference on Data-Driven Optimization of Complex Systems; IEEE: New York, NY, USA, 2022; pp. 1–6.
- Ni, C.; Yang, J.; Pang, Z.; Gong, Y. Seeding strategy based on weighted gravity centrality in multiplex networks. IEEE Trans. Netw. Sci. Eng. 2022, 10, 331–345.
- Lv, L.; Zhang, T.; Hu, P.; Bardou, D.; Niu, S.; Zheng, Z.; Yu, G.; Wu, H. An improved gravity centrality for finding important nodes in multi-layer networks based on multi-PageRank. Expert Syst. Appl. 2024, 238, 122171.
- Lv, L.; Hu, P.; Zheng, Z.; Bardou, D.; Zhang, T.; Wu, H.; Niu, S.; Yu, G. A community-based centrality measure for identifying key nodes in multilayer networks. IEEE Trans. Comput. Soc. Syst. 2023, 11, 2448–2463.
- Li, S.; Chen, H.; Wang, M.; Heidari, A.A.; Mirjalili, S. Slime mould algorithm: A new method for stochastic optimization. Future Gener. Comput. Syst. 2020, 111, 300–323.
- Coleman, J.; Katz, E.; Menzel, H. The diffusion of an innovation among physicians. Sociometry 1957, 20, 253–270.
- Chen, B.L.; Hall, D.H.; Chklovskii, D.B. Wiring optimization can relate neuronal structure and function. Proc. Natl. Acad. Sci. USA 2006, 103, 4723–4728.
- De Domenico, M.; Nicosia, V.; Arenas, A.; Latora, V. Structural reducibility of multilayer networks. Nat. Commun. 2015, 6, 6864.
- De Domenico, M.; Solé-Ribalta, A.; Gómez, S.; Arenas, A. Navigability of interconnected networks under random failures. Proc. Natl. Acad. Sci. USA 2014, 111, 8351–8356.
- Cardillo, A.; Gómez-Gardeñes, J.; Zanin, M.; Romance, M.; Papo, D.; Del Pozo, F.; Boccaletti, S. Emergence of network features from multiplexity. Sci. Rep. 2013, 3, 1344.
Figure 1.
Schematic diagram of MLIC diffusion.
Figure 2.
Overall workflow of the proposed PB-MSMA.
Figure 3.
Sensitivity analysis of key PB-MSMA parameters. (a) Random exploration probability z; (b) pipe–uniform mixing ratio ; (c) statistical prior update rate ; (d) candidate-pool expansion rate ; (e) candidate-pool capacity coefficient ; (f) random injection ratio . The vertical axis denotes P-EDV fitness, and the red dotted line marks the selected default value.
Figure 4.
Influence spread comparison of PB-MSMA and baseline methods under different seed-set sizes. Panels (a–f) correspond to CKM, London_transport, EUAir, PerformingArts, CElegans, and Gallus, respectively. The horizontal axis denotes the seed-set size k, and the vertical axis denotes the Monte Carlo-estimated influence spread.
Figure 5.
Influence spread comparison of PB-MSMA under different fitness evaluators. Panels (a–f) correspond to CKM, London_transport, EUAir, PerformingArts, CElegans, and Gallus, respectively. The horizontal axis denotes the seed-set size k, and the vertical axis denotes the Monte Carlo-estimated influence spread.
Figure 6.
Convergence comparison of PB-MSMA and ablation variants. Panels (a–f) correspond to CKM, London_transport, EUAir, PerformingArts, CElegans, and Gallus, respectively. The horizontal axis denotes the number of iterations, and the vertical axis denotes P-EDV fitness.
Figure 7.
Running-time comparison of different algorithms across datasets.
Figure 8.
Influence overlap ratio of different algorithms across datasets.
Table 1.
Compact comparison of representative related methods.
| Method | Category | Metric | Network | Benchmarks | Compared methods | Main Gap |
|---|---|---|---|---|---|---|
| CELF [22] | Greedy | Marginal gain | Single | 2 app. | 1 | Not multilayer |
| Gravity [23] | Heuristic | Gravity score | Single | 9 nets | 8 | Fixed radius |
| CAGM [24] | Heuristic | Adaptive gravity | Single | 12 ds. | 8 | Single-layer ranking |
| IM-ELPR [25] | Heuristic | Community score | Single | 8 ds. | 9 | Community-dependent |
| LIDDE [26] | Meta-heuristic | EDIV | Single | 6 ds. | 8 | Single-layer IM |
| IDPSO [27] | Meta-heuristic | Local influence | Single | 2 ds. | 3 | Not multilayer |
| WGCM [28] | Heuristic | Weighted gravity | Multiplex | 6 nets | 2 | Radius-sensitive |
| PRGC [29] | Heuristic | Multi-PageRank gravity | Multilayer | 6 nets | 6 | Static ranking |
| CBCM [30] | Heuristic | CBCM | Multilayer | 5 nets | 5 | Community-dependent |
| DIRCI [21] | Heuristic | DIRCI score | Multilayer | 19 nets | 8 | Ranking only |
| NGGA [20] | Meta-heuristic | Two-hop EDV | Multiplex | 3 ds. | 4 | Cost-aware setting |
| DPSOMIM [19] | Meta-heuristic | MLEDV | Multilayer | 6 ds. | 7 | Local fitness |
| DEDRL [16] | Learning-based | CIC reward | Multilayer | 6 ds. | 6 | Training cost |
Table 2.
Detailed information about the real-world multilayer networks.
| Datasets | Number of Nodes | Number of Edges | Number of Layers | Domain |
|---|---|---|---|---|
| CKM [32] | 241 | 1551 | 3 | Social Network |
| PerformingArts | 284 | 8240 | 2 | Social Network |
| CElegans [33] | 279 | 5863 | 3 | Biological Network |
| Gallus [34] | 313 | 389 | 6 | Biological Network |
| London_transport [35] | 369 | 441 | 3 | Transportation Network |
| EUAir [36] | 417 | 3588 | 37 | Transportation Network |
Table 3.
Dataset-level summary of PB-MSMA performance.
| Dataset | Mean Spread | Std. Dev. | Baselines Mean | Rel. Improv. (%) |
|---|---|---|---|---|
| CKM | 93.40 | 9.19 | 81.31 | 14.37 |
| London_transport | 52.43 | 5.83 | 45.59 | 14.39 |
| EUAir | 74.41 | 7.21 | 71.73 | 3.68 |
| PerformingArts | 98.14 | 12.46 | 90.91 | 8.96 |
| CElegans | 76.32 | 8.19 | 72.59 | 5.45 |
| Gallus | 59.85 | 5.45 | 53.61 | 14.50 |
| Overall | 75.76 | 8.06 | 69.29 | 10.23 |
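
As a reading aid for Table 3, the following minimal Python sketch checks how the Overall row relates to the six dataset rows. It assumes that Overall is an unweighted column mean, and it recomputes a relative gain from the two mean columns using the assumed formula 100 × (mean spread − baselines mean) / baselines mean. Neither the averaging rule nor this formula is stated in the table, and the helper names `column_means` and `ratio_of_means` are hypothetical.

```python
# Values transcribed from Table 3 as
# (mean spread, std. dev., baselines mean, reported rel. improv. in %).
# Dataset names follow Table 2.
rows = {
    "CKM": (93.40, 9.19, 81.31, 14.37),
    "London_transport": (52.43, 5.83, 45.59, 14.39),
    "EUAir": (74.41, 7.21, 71.73, 3.68),
    "PerformingArts": (98.14, 12.46, 90.91, 8.96),
    "CElegans": (76.32, 8.19, 72.59, 5.45),
    "Gallus": (59.85, 5.45, 53.61, 14.50),
}
reported_overall = (75.76, 8.06, 69.29, 10.23)


def column_means(table):
    """Unweighted mean of each column over the dataset rows (assumed rule)."""
    columns = list(zip(*table.values()))
    return tuple(sum(col) / len(col) for col in columns)


def ratio_of_means(table):
    """Relative gain (%) recomputed from the two mean columns (assumed formula)."""
    return {
        name: 100.0 * (spread - base) / base
        for name, (spread, _, base, _) in table.items()
    }


overall = column_means(rows)
recomputed = ratio_of_means(rows)

for got, want in zip(overall, reported_overall):
    print(f"column mean {got:.3f} vs reported {want:.2f}")
for name, value in recomputed.items():
    print(f"{name}: recomputed {value:.2f}% vs reported {rows[name][3]:.2f}%")
```

Under these assumptions, the Overall row agrees with the column means to within rounding. The recomputed ratios differ from the reported Rel. Improv. values (about 14.87% versus 14.37% for CKM), which suggests that the reported figures are aggregated in a way the table itself does not state, for example per seed-set size before averaging.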

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.