1. Introduction
Complex network analysis studies the structure, dynamics, and interactions of a network to understand the importance of its nodes and connections in information diffusion, resilience, and global information. This field has garnered significant attention in recent years [1]. Processes such as synchronization, diffusion, and cascade effects are predominantly influenced by nodes with higher influence and connectivity [2]. Thus, the study of these problems holds theoretical relevance, which is reflected in its practical applications across fields including computational biology, computer science, social networks, and artificial intelligence [3].
According to the information provided by a network, centrality metrics are classified into three groups (local, semi-local, and global). Each evaluates node influence from the following two perspectives: their topological structure, which analyzes connections and positions within the network, and their valued nodal attributes, which consider quantifiable characteristics and weights assigned to each node to determine its importance in the network.
Local metrics are subdivided into the following two categories: those based on topological structure, such as degree centrality [4], which evaluates direct connections, PageRank [5], which analyzes nodes as web pages, and Trust-PageRank [6]; and those with nodal weights, such as node-weighted degree centrality [7], which incorporates weights through a nodal weight function, and the WNDegree/WNEDegree/WNEOpshalDegree variants [8] that integrate nodal attributes with topological structure.
Semi-local metrics can also be approached from the following two perspectives: those based on topological structure such as K-Shell [9], which identifies influential nodes through iterative removal, mixed degree decomposition [10], which considers residual degrees, semi-local centrality [11], which evaluates first and second-level neighbors, local structural centrality [12], and degree and importance of lines [13]; and those with nodal weights such as node-weighted harmonic centrality [14], which combines weights and geodesic distances, node-weighted betweenness centrality [15], which evaluates flows between nodes, modified node-weighted eigenvector centrality [16], and MCNDI [17], which integrates multiple indicators through the CRITIC method.
Global metrics represent the third group, which utilizes information from the entire network. Among those based on classical structural topological approaches, notable examples include betweenness centrality [4], which analyzes shortest paths between nodes, closeness centrality [17], which evaluates proximity to other nodes [18], and eigenvector centrality [19], which considers the importance of neighboring nodes; while from the nodal attributes perspective, they incorporate developments such as LARSP [20] and LASP [21] that optimize shortest path calculations and ARP [22] that considers reciprocal distances in directed networks.
These metrics have attempted to provide a balance between accuracy and efficiency in complex network analysis; however, local and global metrics possess limitations. Local metrics exhibit constraints, as they only consider highly restricted information from nodes’ immediate neighborhood [23]. While computationally simple and efficient in considering only the nearest neighbors, their capability to identify truly influential nodes is compromised by this limited network vision. Meanwhile, global metrics, although more accurate by utilizing information from the entire network, face considerable practical challenges. Their high computational complexity makes them impractical for large-scale networks [24].
Semi-local metrics, particularly those implementing the extended neighborhood concept (ENC), overcome these limitations by balancing accuracy and cost. By considering local subgraphs with LRASP (Local Relative Average Shortest Path), they generally achieve high accuracy in identifying influential nodes while maintaining manageable computational complexity [25]. This approach enables the evaluation of both topological position and semi-local structure, simultaneously considering node importance and the influence of its nearby neighbors.
Despite advances in semi-local metrics such as LASP, which incorporates LRASP and ENC to evaluate centrality by combining topological structure, there remains a significant gap in developing metrics that effectively integrate both topological structure and nodal values at the semi-local level. While metrics such as node-weighted harmonic centrality evaluate nodal weights with geodesic distances and those such as node-weighted betweenness centrality consider flows between weighted nodes, these analyze weights in isolation without considering how these values affect the structure of local connections. This separation between weight and structure is particularly problematic in networks where a node’s influence depends on both factors in an interrelated manner, as occurs in phenomena such as quality control in manufacturing environments or scientific collaboration networks, where both node attributes and network position jointly determine their actual importance. For instance, the works of [26,27,28,29,30] have addressed quality control through complex network analysis, although limiting themselves to the study of topological structure.
Lexicographic ordering has been utilized in various complex network contexts. Notable applications include the study of information diffusion through nodal configuration mapping [31] and node importance evaluation through minimal winning coalitions [32]. However, its potential for integrating topological structure with nodal values in centrality metrics remains relatively unexplored. This gap motivates the development of a new metric that leverages lexicographic ordering properties to simultaneously evaluate structure and nodal values in specific testing contexts, such as quality control.
In this context, this paper proposes SL-WLEN (Semi-Local Centrality with Weighted Lexicographic Extended Neighborhood), a novel centrality metric specifically designed for weighted networks with valued nodal attributes. The proposed method was applied, and its robustness was evaluated, in a quality control network for semiconductor chip manufacturing composed of 1555 nodes representing critical characteristics of the production process, with weighted connections indicating the degree of correlation between these variables. The metric was evaluated against other established methods in the scientific literature using the SIR propagation model and Kendall’s τ coefficient, demonstrating that SL-WLEN maintained notably consistent values across all analyzed test networks, which validated its effectiveness for identifying influential nodes in complex manufacturing environments.
The methodology of this work encompasses the construction of a quality control process network for chip production followed by the theoretical formulation of the new SL-WLEN metric and its practical implementation, and it culminates with a robustness and efficacy analysis compared to other well-established models.
2. Establishment of a Quality Control Process Network for Chip Production
Complex network theory constitutes a viable methodology for analyzing and modeling interrelationships in quality control systems for chip manufacturing [29]. By establishing a network model that maps the evolution of critical quality parameters during the production process, it becomes possible to precisely identify crucial control points in the manufacturing chain. This approach enables the visualization of how each stage in the chip manufacturing process influences subsequent stages, facilitating the early detection of potential quality deviations [27].
To create a network that represents the dynamics of quality control in the chip manufacturing process, it is necessary to analyze and process information regarding specific characteristics that influence finished product quality. This enables the definition of each node and of how the nodes relate to and interact with one another.
2.1. Baseline Information Configuring the Network
The data used to configure the network consist of two sets. The first is a matrix $X \in \{0,1\}^{n \times m}$, where $n$ corresponds to the number of observations or manufactured products, and $m$ represents the number of quality characteristics. The matrix is defined according to the following Equation (1):
$$X = \begin{pmatrix} x_{11} & x_{12} & \cdots & x_{1m} \\ x_{21} & x_{22} & \cdots & x_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ x_{n1} & x_{n2} & \cdots & x_{nm} \end{pmatrix} \tag{1}$$
where $x_{ij} \in \{0, 1\}$ with $i = 1, \dots, n$ and $j = 1, \dots, m$.
Each row represents the characteristics of the $i$-th product. Each column vector represents the values of the $j$-th characteristic for all products. Each component $x_{ij}$ indicates the presence (1) or absence (0) of quality defects in the $j$-th characteristic for the $i$-th product. Each product is manufactured in the same system. The second set corresponds to a vector $y = (y_1, \dots, y_n)$, where $y_i \in \{0, 1\}$. Each element of vector $y$ contains information associated with the quality of each product or observation from matrix $X$. Specifically, it indicates whether the $i$-th finished product meets the required final quality (1) or is defective (0).
2.2. Network Node Definition
Within the network model, each node represents a quality characteristic of the manufactured product, and the nodal value of each node is defined based on a logistic regression model with Lasso regularization. This allows for the assignment of a numerical value $NV_j$ to each node, representing the relevance or influence of the corresponding characteristic on the manufactured product’s quality. The objective function for logistic regression with Lasso regularization is expressed according to the following Equation (2):
$$\hat{\beta} = \arg\min_{\beta} \left\{ -\sum_{i=1}^{n} \left[ y_i \log p_i + (1 - y_i) \log (1 - p_i) \right] + \lambda \lVert \beta \rVert_1 \right\}, \qquad p_i = \frac{1}{1 + e^{-x_i^{\top} \beta}} \tag{2}$$
where $n$ is the total number of observations in the dataset, $y_i$ is the $i$-th observation of the binary dependent variable, $X$ is the matrix of feature vectors (independent variables) for all $n$ observations, with $x_i$ its $i$-th row, and $y$ is the binary dependent vector, with $\beta$ being the model coefficients and $\lambda$ the regularization parameter.
The equation consists of the following two components: the first is the negative log-likelihood expression for binary logistic regression, $-\sum_{i=1}^{n} \left[ y_i \log p_i + (1 - y_i) \log (1 - p_i) \right]$, where $p_i$ is the predicted probability for the $i$-th observation, obtained after applying the negative natural logarithm to the original likelihood function (Equation (2)), while the second component, the penalty $\lambda \lVert \beta \rVert_1$, integrates two elements, with $\lambda$ representing the regularization parameter and $\lVert \beta \rVert_1$ the norm.
Regarding the expression $\lambda \lVert \beta \rVert_1$, the L1 norm ($\lVert \cdot \rVert_1$) of vector $\beta$ is defined as the sum of the absolute values of its components. In other words, for vector $\beta = (\beta_1, \dots, \beta_m)$, the norm is expressed as $\lVert \beta \rVert_1 = \sum_{j=1}^{m} |\beta_j|$, which measures the total magnitude of the coefficients. $\lambda$ is a regularization parameter that controls the strength of the penalty. The larger $\lambda$ is, the greater the penalty, leading to smaller coefficients. This additional penalty has the effect of ‘shrinking’ some coefficients towards zero and, in some cases, may cause certain coefficients to be exactly zero.
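As a small worked illustration of the penalty term, consider a hypothetical three-coefficient vector and an arbitrary regularization value; both are chosen only for this example and do not come from the fitted model:

```latex
\beta = (0.8,\; -0.3,\; 0), \qquad
\lVert \beta \rVert_1 = |0.8| + |-0.3| + |0| = 1.1, \qquad
\lambda = 0.5 \;\Rightarrow\; \lambda \lVert \beta \rVert_1 = 0.55
```

Doubling $\lambda$ doubles the penalty for the same coefficients, which is why larger values push the optimizer toward smaller, and eventually zero, coefficients.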
The incorporation of the L1 penalty into the objective function of the Lasso logistic regression enables automatic feature selection. By forcing some coefficients to zero, the Lasso tends to select a more relevant subset of features, eliminating less important ones. This can result in simpler and more generalizable models.
From the fitted model, the resulting coefficients $\hat{\beta}_j$ are utilized as nodal values $NV_j$ within the network model. Each node $v_j$ represents a quality characteristic of the manufactured product, and its nodal value $NV_j$ is defined by the absolute magnitude of its estimated coefficient in the Lasso model (Equation (3)):
$$NV_j = |\hat{\beta}_j| \tag{3}$$
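The procedure of this subsection can be sketched in a few lines of standard-library Python. This is only an illustrative sketch, not the authors' implementation: it assumes an averaged negative log-likelihood (the scaling in Equation (2) may differ), an unpenalized intercept, and proximal gradient descent (ISTA) as the solver. The function names, the toy data, and the values of `lam`, `step`, and `iterations` are all hypothetical.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_threshold(value, threshold):
    """Proximal operator of the L1 penalty: shrink toward zero, clip at zero."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def fit_lasso_logistic(X, y, lam, step=0.1, iterations=5000):
    """Minimize (1/n) * negative log-likelihood + lam * ||beta||_1 with ISTA.

    Assumption: the intercept is not penalized.
    """
    n, m = len(X), len(X[0])
    intercept, beta = 0.0, [0.0] * m
    for _ in range(iterations):
        grad_intercept, grad = 0.0, [0.0] * m
        for xi, yi in zip(X, y):
            p = sigmoid(intercept + sum(b * x for b, x in zip(beta, xi)))
            residual = p - yi
            grad_intercept += residual
            for j in range(m):
                grad[j] += residual * xi[j]
        intercept -= step * grad_intercept / n
        # Gradient step on the smooth part, then soft-thresholding for the L1 part.
        beta = [soft_threshold(b - step * g / n, step * lam)
                for b, g in zip(beta, grad)]
    return intercept, beta


# Toy data (hypothetical): a defect in characteristic 0 always yields a defective
# product (y = 0), while characteristic 1 is unrelated to the final quality.
X = [[1, 0], [1, 1], [1, 0], [1, 1], [0, 0], [0, 1], [0, 0], [0, 1]]
y = [0, 0, 0, 0, 1, 1, 1, 1]

intercept, beta = fit_lasso_logistic(X, y, lam=0.1)
node_values = [abs(b) for b in beta]  # nodal value = |coefficient|, as in Equation (3)
print(node_values)
```

In this toy case the uninformative characteristic receives a nodal value of exactly zero, while the informative one keeps a clearly non-zero value, which is the feature-selection behavior described above.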
2.3. Edge Weight Determination
The connections in the quality characteristics network are established through the Phi coefficient, which quantifies the degree and direction of the statistical association between pairs of binary characteristics in matrix $X$. For each pair of characteristics $(j, k)$, the coefficient $\phi_{jk}$ defines the edge connecting them, evaluating the actual correlation between their variation patterns. This coefficient is calculated using the formula $\phi_{jk} = \frac{ad - bc}{\sqrt{(a + b)(c + d)(a + c)(b + d)}}$, where $a$, $b$, $c$, and $d$ correspond to the frequencies in the 2 × 2 contingency table between characteristics $j$ and $k$: $a$ is the positive coincidence frequency (1, 1), $b$ is the frequency of combination (1, 0), $c$ is the frequency of combination (0, 1), and $d$ is the negative coincidence frequency (0, 0).
The weight $w_{jk}$ of each edge is defined through a threshold function applied to the coefficient $\phi_{jk}$. The function establishes that $w_{jk} = \phi_{jk}$ if $\phi_{jk} \geq \tau$ or $\phi_{jk} \leq -\tau$, and it is equal to 0 if $|\phi_{jk}| < \tau$, where $\tau$ represents a statistically significant threshold. This threshold filters weak correlations, allowing only statistically significant relationships to form part of the network structure.
The topological structure of the graph is described by the adjacency matrix $A = (a_{jk})$, where $a_{jk} = 1$ if nodes $v_j$ and $v_k$ are connected, and $a_{jk} = 0$ otherwise. The coefficient $\phi$ has a range of [−1, 1], with extreme values indicating perfect association: $\phi = 1$ for perfect positive association, $\phi = -1$ for perfect negative association, and $\phi = 0$ for absence of association. This enables the construction of a network that faithfully reflects relationships between quality characteristics, capturing both positive and negative associations while avoiding irrelevant connections that could introduce noise into the analysis.
The symmetry of the coefficient and its specificity for binary variables make it ideal for modeling complex processes, such as chip manufacturing.
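The edge construction described in this subsection can be sketched as follows. The Phi formula is the standard one for a 2 × 2 contingency table; the function names, the choice to keep the signed coefficient as the edge weight, the handling of a zero denominator, and the toy matrix are assumptions made only for illustration.

```python
import math


def phi_coefficient(col_j, col_k):
    """Phi coefficient between two binary columns of equal length."""
    a = sum(1 for x, y in zip(col_j, col_k) if x == 1 and y == 1)  # (1, 1)
    b = sum(1 for x, y in zip(col_j, col_k) if x == 1 and y == 0)  # (1, 0)
    c = sum(1 for x, y in zip(col_j, col_k) if x == 0 and y == 1)  # (0, 1)
    d = sum(1 for x, y in zip(col_j, col_k) if x == 0 and y == 0)  # (0, 0)
    denominator = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    if denominator == 0:
        # Assumption: a constant column carries no association information.
        return 0.0
    return (a * d - b * c) / denominator


def build_edges(X, threshold):
    """Return {(j, k): weight} for pairs whose |phi| reaches the threshold."""
    m = len(X[0])
    columns = [[row[j] for row in X] for j in range(m)]
    edges = {}
    for j in range(m):
        for k in range(j + 1, m):
            phi = phi_coefficient(columns[j], columns[k])
            if abs(phi) >= threshold:
                edges[(j, k)] = phi  # assumption: the signed value is kept
    return edges


# Toy matrix (hypothetical): characteristics 0 and 1 always fail together,
# characteristic 2 is unrelated to both.
X = [[1, 1, 1],
     [1, 1, 0],
     [0, 0, 1],
     [0, 0, 0]]

edges = build_edges(X, threshold=0.5)
print(edges)
```

Because the coefficient is symmetric, only pairs with `j < k` are visited, which matches the undirected network used in the rest of the paper.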
2.4. Construction of the Quality Control Network in Chip Manufacturing
During the chip manufacturing process, quality emerges as a complex phenomenon resulting from the dynamic interaction among multiple characteristics. This work adopts a complex network-based approach, visualizing quality control as an integrated system where each characteristic influences both individually and through its interactions with others.
The network is constructed by representing each quality characteristic as a node, whose importance is determined through the analysis of historical production data. Edges between nodes represent significant correlations between characteristics, revealing how changes in one can propagate and affect others. The resulting structure is an undirected weighted network, where nodal values quantify the individual importance of each characteristic, while edge weights reveal the strength of relationships between them. This model enables the visual understanding of how the production system’s equilibrium depends on both individual characteristics and their complex network of interactions.
3. Definition of a Centrality Metric for Identification and Categorization of Quality Characteristics Based on the Network
3.1. Literature Review
The study of complex networks provides methodological frameworks and fundamental structures that enable the development of more advanced and sophisticated artificial intelligence systems [33]. The intersection between AI and complex networks has revolutionized the analysis and optimization of interconnected systems, enabling the development of promising and effective solutions across various technological and social domains [34]. Within the framework of complex network analysis, the identification of influential elements and the understanding of their impact on the global system have garnered significant interest in recent years. This has led to the development and evolution of various metrics and methodologies aimed at quantifying the relative importance of components within these interconnected structures.
In this context, the present review examines the development of these metrics, focusing on the progression from purely structural approaches toward more sophisticated methods that integrate both the intrinsic attributes of nodes and the weights of their connections, thus responding to the growing need for more comprehensive analyses in complex networks that better reflect real-world phenomena. The analysis encompasses both local and semi-local centrality metrics, considering the topological connections between neighbors and their relative influence within the network structure.
Consider an undirected network $G = (V, E)$, weighted or unweighted depending on the application context, where $V$ represents the set of nodes and $E$ the set of edges. The adjacency matrix associated with $G$ is described by $A = (a_{ij})$, where $a_{ij}$ represents the weight of an edge between nodes $v_i$ and $v_j$. The set $N(v_i)$ denotes the neighbors of node $v_i$. The degree of node $v_i$, denoted as $k_i$, is defined according to the following Equation (4):
$$k_i = \sum_{v_j \in N(v_i)} a_{ij} \tag{4}$$
Each node $v_i$ is characterized by an attribute vector $f_i = (f_{i1}, \dots, f_{ip})$, where $p$ is the number of attributes, and each component $f_{ik}$ represents the value of the $k$-th attribute of node $v_i$. This characterization enables the integration of both the topological structure of the network and the intrinsic properties of its nodes in the centrality measure.
Such a description proposes a complex network characterized by weighted edges and nodes with valued attributes, allowing for the modeling of systems where the centrality and influence of each element depend on both its topological structure and the intrinsic properties of the analyzed node and those that form its relational environment. This representation is particularly relevant in contexts where the importance of an element cannot be determined solely by its connectivity patterns but requires considering the heterogeneity of nodal attributes and their interaction with the network structure.
In the context of these complex networks, where centrality depends on both the weighted topological structure and nodal attributes, the scientific literature has followed a progressive development in its approaches to measuring node importance. This development is characterized by the following three distinctive stages: initially, metrics focused exclusively on the network’s topological structure, considering only node connections; subsequently, two parallel research lines emerged, one focused on incorporating edge weights and another on considering nodal attributes independently; finally, recent efforts seek to integrate both aspects into unified metrics, although this implies greater computational challenges. This evolution reflects the growing understanding of the multidimensional nature of centrality in complex networks, where a node’s importance is defined by the interaction between its structural position and intrinsic characteristics.
Table 1 presents the evolution of metrics that exclusively consider topological structure and edge weights, encompassing different network analysis levels. Among global metrics [4], Betweenness Centrality (BC) considers the frequency with which a node appears in the shortest paths between all node pairs in the network, while Closeness Centrality (CC) measures the proximity of a node to all others through geodesic distances, and Degree Centrality (DC) proposed by [35] evaluates importance according to a node’s direct connections. Semi-local metrics include Local Structural Centrality (LSC) by [11], which incorporates both neighbor degrees and their local clustering coefficients, the DIL (Degree and Importance of Lines) metric by [13] that combines node degree with the weighted importance of adjacent connections, LRASP [19], which evaluates centrality considering induced subgraphs, WHC [36] that integrates multiple centrality measures, INASP [18] that combines three different aspects of local influence, K-shell Decomposition (KS) [9], and Gravity Formula Based Method (G) [37].
More advanced metrics focus on information propagation and node distance in complex networks, such as LARSP, ARP, and LASP. LARSP (Local Average Shortest Path) is a local metric that measures node centrality based on the average length of shortest paths from one node to all other nodes in its local subgraph. Its objective is to capture the node’s influence on information propagation within its immediate neighborhood, considering how local connectivity impacts the node’s capacity to transmit information across the network. ARP (Average Reciprocal Path) extends the LARSP concept by considering the reciprocal distance of shortest paths in a directed network. Specifically, it evaluates how the path structure between nodes, considering edge directions, affects node centrality. LASP (Local Average Shortest Path) is an optimized version of LARSP that incorporates a weighted local average of shortest distances, reducing computational complexity by focusing on each node’s local subgraph.
Meanwhile, metrics considering valued nodal attributes (Table 2) also present different analytical scopes. Local metrics include node-weighted degree [7], which modifies the traditional degree definition by incorporating a nodal weight function, and WNDegree variants [8] that integrate nodal attributes with local topological structure. Semi-local metrics include node-weighted harmonic centrality [14] that considers geodesic distances in the extended neighborhood, node-weighted betweenness centrality [15] that incorporates the importance of communication between nearby node pairs, and modified eigenvector centrality [16] that adjusts nodal weight influence through a variable parameter. Additionally, hybrid metrics have been developed, such as the nodal attribute screening method, applicable at both local and global levels, and the MCNDI metric that integrates multiple indicators through the CRITIC method, combining local and global aspects [17]. The Node Attribute Screening Centrality method [38] uses a regression model in which the centrality is the dependent variable, the nodal attributes are the predictor variables, and the regression coefficients represent the influence of each attribute.
The weighted K-shell method [39] uses the node degree and link weights, while the weighted K-shell degree neighborhood method [40] combines the degree and the k-shell index with adjustable parameters.
The integration of nodal attributes and connectivity in combined centrality metrics began with seminal works such as [41], which addressed attributed graph analysis by incorporating categorical attributes in centrality evaluation. That proposal extends classical measures through the E-I homophily index and betweenness metrics, enabling node classification into groups based on qualitative characteristics. While this approach represents an initial step in considering nodal attributes, it is limited to categorical characteristics without exploiting the richness of numerical attributes that could more precisely capture actors’ influence in the network.
A more comprehensive advancement in integrating global structure and attributes was proposed by [8], who developed a metric called node and edge-weighted closeness centrality, which calculates nodal importance considering both normalized distances between nodes and connection weights along shortest paths. This measure integrates the network’s global structure and connection weights into a global centrality metric. It is defined as the product between the weight of node $v_i$ and its weighted closeness centrality. The latter is calculated from the weight of each edge $e$ in the shortest path and from the length of said path, measured as the number of links between nodes $v_i$ and $v_j$. However, its main limitation lies in the need to calculate the shortest distances between each node pair, resulting in high computational complexity, especially in extensive and complex networks. This complexity increases significantly in networks with weights and nodal attributes, due to the additional analysis required for each connection.
The development of centrality metrics reflects a progression from purely structural approaches toward approximations that incorporate edge weights or nodal attributes independently. However, there exists a significant gap in developing metrics that simultaneously integrate both edge weights and valued nodal attributes while maintaining manageable computational complexity. Existing attempts, such as that in [8], although promising, face significant limitations in terms of scalability and computational efficiency. This gap is particularly relevant in the current context, where complex networks frequently exhibit heterogeneity in their connections and diversity in their node characteristics. Therefore, developing a centrality metric that can efficiently capture this duality while maintaining feasible computational complexity represents a necessary research direction to advance the understanding and analysis of real-world complex networks.
3.2. Proposed Metric: Semi-Local Centrality with Weighted and Lexicographic Extended Neighborhood in Node-Attributed Weighted Networks (SL-WLEN)
The SL-WLEN metric quantifies node centrality in complex networks based on the LARSP (Local Average Shortest Path) connectivity analysis through its DegreeLocal and DegreeSemiLocal components, which evaluate partial centrality as a function of connection degrees. SL-WLEN extends this foundation by incorporating the following two additional components: a local component through the normalized node value, and a semi-local component via SemilocalNodeLexOrder, which introduces the lexicographic ordering of neighbors. This component combination enhances the metric’s capability to reflect the influence of characteristics in chip manufacturing, enabling the identification of the most relevant features of the final product by considering both their connectivity and their intrinsic values, as well as their structural position within the network.
Figure 1 illustrates the metric implementation process.
3.3. Integration of Lexicographic Ordering
SL-WLEN integrates SemilocalNodeLexOrder, enabling a more precise characterization of node influence within its structural and attributive context. The implementation of SL-WLEN is based on Extended Neighborhood Connectivity (ENC), which extracts a subgraph encompassing node neighbors up to distance $L$. For details on ENC, see the work of [21], pages 114 and 115.
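The idea of collecting neighbors up to distance $L$ can be sketched with a breadth-first search. This is a minimal illustration of the extended neighborhood concept as described here, not the ENC procedure of [21]; the function names, the adjacency-list representation, and the toy graph are assumptions.

```python
from collections import deque


def neighbors_by_level(adjacency, source, max_level):
    """Return {l: set of nodes exactly l hops from source} for l = 1..max_level."""
    distance = {source: 0}
    levels = {level: set() for level in range(1, max_level + 1)}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if distance[node] == max_level:
            continue  # do not expand beyond the maximum distance
        for neighbor in adjacency[node]:
            if neighbor not in distance:
                distance[neighbor] = distance[node] + 1
                levels[distance[neighbor]].add(neighbor)
                queue.append(neighbor)
    return levels


def extended_neighborhood(adjacency, source, max_level):
    """Induced subgraph on the source and all nodes within max_level hops."""
    levels = neighbors_by_level(adjacency, source, max_level)
    nodes = {source}.union(*levels.values())
    return {u: [w for w in adjacency[u] if w in nodes] for u in nodes}


# Toy undirected graph (hypothetical): path a-b-c-d plus the edge a-e.
adjacency = {
    "a": ["b", "e"],
    "b": ["a", "c"],
    "c": ["b", "d"],
    "d": ["c"],
    "e": ["a"],
}

levels = neighbors_by_level(adjacency, "a", max_level=2)
subgraph = extended_neighborhood(adjacency, "a", max_level=2)
print(levels)
```

Keeping the neighbors grouped by level is convenient because the later components of the metric treat each distance level separately.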
Once the subgraph is obtained through ENC, SemilocalNodeLexOrder quantifies node influence by considering its position in a lexicographic ordering based on attributes and neighborhood structure. At each distance level l, SemilocalNodeLexOrder assigns higher weights to better-positioned nodes within the ordering, allowing the capture of subtle differences in nodes’ relative importance.
The metric operates by considering (1) the prioritization of important features through lexicographic comparison, (2) influence penalization as distance increases and the adjustment of node influence based on neighbor connectivity, and (3) influence accumulation.
The prioritization of important features through lexicographic comparison is discussed as follows:
In the chip quality network, each node represents a quality characteristic, and its importance depends not only on its individual contribution but also on its relationship with other characteristics. The SemilocalNodeLexOrder function enables node ordering based on their relative importance within their neighborhood, ensuring that the most influential characteristics maintain a priority position. Given node $v$, its local influence is measured from the lexicographic ordering of its neighborhood at distance $l$. The set of neighbors at that distance is ordered according to the importance value of each characteristic, from highest to lowest. This ordering favors nodes with highly relevant characteristics for chip quality, ensuring that those with higher values carry greater weight in the metric. In terms of chip manufacturing, this means that characteristics that most influence defects or improvements in the final product will occupy priority positions within the centrality evaluation. The partial contribution of node $v$ at level $l$ depends on $v$’s position in the lexicographic order. If a node has neighbors with a high impact on chip quality, its position in the list will be lower (closer to 1), increasing the numerator of this contribution and, consequently, its influence in the metric.
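The ordering step can be illustrated with a short sketch. It assumes that neighbors are compared lexicographically on the tuple (nodal value, degree), both descending, with the node label as a final tie-breaker; the paper's exact comparison keys and the formula that turns positions into a partial contribution are not reproduced here, and all names and values below are hypothetical.

```python
def lexicographic_positions(neighbors, node_value, degree):
    """Rank neighbors: higher nodal value first, then higher degree, then label.

    Returns {node: position}, with position 1 for the best-placed node.
    """
    ordered = sorted(
        neighbors,
        key=lambda u: (-node_value[u], -degree[u], str(u)),
    )
    return {u: position for position, u in enumerate(ordered, start=1)}


# Hypothetical neighbors at one distance level, with illustrative values.
neighbors = ["f1", "f2", "f3"]
node_value = {"f1": 0.2, "f2": 0.9, "f3": 0.9}
degree = {"f1": 5, "f2": 1, "f3": 3}

positions = lexicographic_positions(neighbors, node_value, degree)
print(positions)  # {'f3': 1, 'f2': 2, 'f1': 3}
```

Here `f2` and `f3` tie on nodal value, so the second key (degree) decides, while `f1` comes last despite its high degree. That is the defining property of a lexicographic comparison: a later key matters only when all earlier keys are equal.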
Distance-based influence penalization and node influence adjustment based on neighbor connectivity are discussed as follows:
In the chip quality network, the effect of a characteristic can propagate through multiple interactions. However, its impact must be reduced with distance to prevent the overvaluation of distant connections. The influence of $v$ at each level $l$ is weighted according to its neighborhood size and the maximum connectivity at that level. In the resulting expression, the first term maintains the lexicographic priority based on the characteristic’s importance, the second term adjusts the relative contribution according to neighborhood size, enabling the differentiation of highly connected characteristics, and the third term introduces a penalization that reduces influence as distance increases, modeling the decreasing effect of characteristic propagation in manufacturing.
This adjustment aims to capture indirect relationships between characteristics without excessively diluting or overestimating their influence, ensuring that closer nodes have a more relevant impact on the metric, while the effects of distant nodes are attenuated in a controlled manner. In the context of chip quality, this approach helps evaluate not only directly influential characteristics but also those affecting the product in a more indirect yet equally relevant way, without excessive overvaluation.
Influence accumulation is discussed as follows:
Finally, the total semi-local influence of node $v$ is obtained by accumulating the partial contributions at each exploration level $l$ up to the maximum $L$. This enables the consideration of how a characteristic affects chip quality not only directly but also through indirect relationships with other characteristics. Additionally, it balances influence across all levels from $l = 1$ to $l = L$, preventing nodes with high connectivity from dominating the metric and providing a fair evaluation based on network structure. Lexicographic ordering proves particularly appropriate for evaluating chip manufacturing quality due to its capability to preserve the importance of critical characteristics. Unlike existing metrics, which tend to dilute the influence of important characteristics through various procedures, lexicographic ordering maintains the relevance of the most significant nodes throughout the analysis.
Traditional metrics present limitations in this context. Some use weighted sums, like node-weighted degree centrality; others rely on distance normalizations, like node-weighted harmonic centrality, or employ shortest paths, like node-weighted betweenness centrality. There are also those that apply products with adjustable parameters, linear regressions, or a combination of multiple indices, such as MCNDI. All these approaches may inadvertently reduce the influence of critical characteristics through their statistical aggregations. In contrast, lexicographic ordering preserves the relative importance of each characteristic through three complementary aspects: it prioritizes nodes based on their individual value, connectivity level, and influence adjusted by distance. This combination enables a more precise evaluation, where the importance of each characteristic is determined by its own value and its relationships with neighboring characteristics, without losing critical information in the process.
3.4. Definitions
The SL-WLEN metric quantifies node centrality in a complex network by considering two levels of analysis, local and semi-local, and integrating weighted connectivity components and nodal attributes. Its purpose is to capture node influence not only through direct connectivity but also by evaluating the importance of its neighbors at different proximity levels, their characteristics, and their relative position in the network. To achieve this, it integrates four main factors: local influence by connectivity (DegreeLocal), local influence by normalized node value, semi-local degree influence (DegreeSemiLocal), and semi-local node value influence based on lexicographic ordering (SemilocalNodeLexOrder).
In the final metric, the normalized node value term relativizes the node value within its neighborhood, capturing its intrinsic importance beyond structural connectivity.
Definition 1. Local influence by connectivity (DegreeLocal).
DegreeLocal captures the local influence of the node based on its direct connectivity, normalizing the node degree with respect to the total network size. This reflects its immediate importance within the network. The local influence by connectivity of node v is defined according to Equation (5), whose terms are the degree of node v and the number of directly connected nodes.
Definition 2. Semi-Local Degree Influence (DegreeSemiLocal).
DegreeSemiLocal, derived from LRASP [20] and based on LASP [21], quantifies semi-local influence by considering nearby neighbors within a subgraph extracted through the ENC (extended neighborhood connectivity) concept. This influence is weighted based on several aspects, including weighted connectivity, which reflects the intensity of relationships between the node and its neighbors through edge weights; proximity, where neighbor influence decreases as distance increases, modeling impact propagation within the network; and structural importance, which prioritizes neighbors with higher topological relevance. The semi-local influence of node v is defined according to Equation (6), which uses two neighbor sets: the set of all neighbors of node v up to level L in the network, and the set of all neighbors of node v at a given level l.
Definition 3. Semi-Local Node Value Influence Based on Lexicographic Ordering (SemilocalNodeLexOrder).
SemilocalNodeLexOrder introduces a novel perspective through the lexicographic ordering of nodes based on their attributes and neighborhood structure. This evaluates how a node’s relative position within this order affects its influence, considering its structural and attributive environment at different distance levels. Additionally, it includes the contribution of nearby neighbors within a subgraph extracted through ENC, enabling a deeper evaluation of the node within its topological and attributive context.
The semi-local node influence based on lexicographic ordering measures node influence by considering its nearby neighbors at different distance levels (up to a maximum L). Each level contributes a partial contribution that depends on the node’s position in the lexicographic order within its neighborhood, the number of neighbors at that distance, and the maximum degree among nodes at the same level, according to Equation (7), where the partial contribution per level is defined according to Equation (8). The quantities involved are the total number of nodes, the position of node v in the lexicographic order at level l, the number of neighbors at distance l from node v, the maximum degree among all nodes at level l, and the maximum exploration level L.
The ordering function is defined as an ordered set of nodes based on a lexicographic comparison, as shown in Equation (9). The comparison uses the following quantities: the set of neighbors of node v at distance l, the values of v’s neighbors sorted in descending order, and the shortest path length between two nodes in the network.
This metric uses normalized values for each node, obtained by dividing its value by the maximum value of its neighbors at the same distance level, which adjusts its influence based on relative importance within the neighborhood.
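The signature-and-ranking step described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the adjacency-dictionary representation, the hop-count notion of distance, the star graph, and the raw nodal values are all hypothetical. Each signature is the descending-sorted list of neighbor values at one distance level, divided by the largest of them, and signatures are compared lexicographically (Python tuples compare this way). Tied signatures share one position, and numbering the positions densely (1, 2, 3, ...) is also an assumption.

```python
from collections import deque


def neighbors_at_distance(adj, source, level):
    """Nodes whose hop distance from `source` is exactly `level` (BFS)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if dist[u] == level:
            continue  # do not expand past the requested level
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return {n for n, d in dist.items() if d == level}


def signature(adj, values, node, level):
    """Neighbor values at `level`, divided by their maximum, sorted descending."""
    ring = neighbors_at_distance(adj, node, level)
    if not ring:
        return ()
    top = max(values[n] for n in ring)
    if top == 0:
        return tuple(0.0 for _ in ring)
    return tuple(sorted((values[n] / top for n in ring), reverse=True))


def lex_rank(adj, values, level):
    """Position of every node in the lexicographic order (1 = highest signature)."""
    sigs = {v: signature(adj, values, v, level) for v in adj}
    distinct = sorted(set(sigs.values()), reverse=True)
    position = {sig: i + 1 for i, sig in enumerate(distinct)}  # ties share a position
    return {v: position[sigs[v]] for v in adj}


# Hypothetical star graph: hub "c" with seven leaves and made-up raw values.
leaves = ["a", "b", "d", "e", "f", "g", "h"]
adj = {"c": leaves, **{leaf: ["c"] for leaf in leaves}}
values = {"c": 5, "a": 3, "b": 1, "d": 1, "e": 1, "f": 1, "g": 3, "h": 8}
hub_signature = signature(adj, values, "c", 1)
ranks = lex_rank(adj, values, 1)
```

With these made-up values the hub's level-1 signature is (1.0, 0.375, 0.375, 0.125, 0.125, 0.125, 0.125), and the hub is ranked ahead of the leaves, whose signatures are all (1.0,).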
3.5. Special Considerations
If several nodes have the same lexicographic order, the tie is resolved by assigning the same order to every node in the tied set. The SemilocalNodeLexOrder term can be interpreted as the sum of the contributions from each level, facilitating a detailed analysis of each node’s behavior at each exploration level.
Definition 4. Total Influence.
For a node v, SL-WLEN is defined by Equation (11), which combines the four components through four adjustable parameters between 0 and 1. The first two control the local and semi-local influence of node connectivity, and the remaining two control the local and semi-local influence of its nodal value. The parameters must satisfy the normalization condition stated with the equation.
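As a rough illustration of Definition 4, the sketch below combines four component scores with four weights. Two points are assumptions rather than statements from the text: that the combination is a plain weighted sum, and that the condition on the parameters is that they sum to 1 (consistent with the equal setting of 0.25 reported in Section 7.1). The function name, argument names, and input numbers are hypothetical.

```python
import math


def sl_wlen(degree_local, degree_semilocal, node_value_local, lex_order_semilocal,
            weights=(0.25, 0.25, 0.25, 0.25)):
    """Weighted sum of the four components (assumed form of the total influence).

    The first two weights apply to the connectivity terms and the last two
    to the nodal-value terms, following the order given in Definition 4.
    """
    if any(not 0.0 <= w <= 1.0 for w in weights):
        raise ValueError("each weight must lie between 0 and 1")
    if not math.isclose(sum(weights), 1.0):
        raise ValueError("weights are assumed to sum to 1")
    components = (degree_local, degree_semilocal, node_value_local, lex_order_semilocal)
    return sum(w * c for w, c in zip(weights, components))


# Made-up component values, used only to show the call.
score = sl_wlen(0.4, 0.2, 0.8, 0.6)
```

Keeping the weights as an explicit tuple makes it easy to test other settings while still enforcing the constraint.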
4. SL-WLEN Example
To clarify the computational procedure of the proposed metric, we describe a numerical example. An undirected weighted graph with 11 nodes and 14 edges is assumed, as shown in Figure 2. The calculation is presented for a single node, using the edge weights shown on the edges and the assigned nodal values.
The local components follow from Definition 1. The DegreeSemiLocal component is calculated according to Definition 2, and SemilocalNodeLexOrder is determined from Definition 3.
At level 1, the neighbors of node V6 are nodes 2, 3, 4, 5, 7, and 9. To calculate the lexicographic order, the normalized values associated with these nodes are considered, which are 0.375, 0.125, 0.125, 0.125, 0.125, 0.375, and 1.00. These values are sorted in descending order to form node V6’s signature, resulting in {1.0, 0.375, 0.375, 0.125, 0.125, 0.125, 0.125}. Comparing this signature with those of the other nodes in the graph generates a ranking in which nodes with higher signatures are placed first. In this case, the node occupies position 4 in the level 1 lexicographic ranking.
At level 2, the neighbors of node V6 are nodes 1, 8, and 11, resulting in three neighbors. The normalized values associated with these neighbors are 0.375, 1.0, and 0.375, which are sorted in descending order to form the node’s signature at this level, obtaining {1.0, 0.375, 0.375}. Following the same signature comparison process to determine lexicographic order, it is concluded that the node occupies position 4 in the level 2 lexicographic ranking. Substituting the values in Equations (6) and (7), SemilocalNodeLexOrder is determined as follows:
Finally, SL-WLEN(3) is calculated under the adjustable-parameter condition, in accordance with the following Equation (16):
The SL-WLEN metric results for all nodes are shown in Table 3.
5. Experimental Results
Figure 3 presents a general view of the complex network for chip manufacturing quality control. The visualization shows the complete network structure, where nodes (circles) represent quality characteristics, and edges (lines) represent the correlations between them.
Figure 4 presents the detailed visualization of the quality control network. Node size and blue color intensity indicate the individual importance level of each characteristic (nodal value); larger size and darker blue tonality correspond to greater importance. Connections between nodes (edges) are represented on a grayscale, where tonalities closer to black indicate stronger correlations between characteristics, while lighter tones represent weaker correlations.
Figure 5 presents the visualization of the quality control network with characteristic identifiers. The features are identified with the prefix “f” followed by a four-digit sequential number. For example, f1263 corresponds to feature number 1263 of the process.
Figure 6 illustrates the visualization of the network’s structural configuration in core and peripheral zones. Panel (a) shows how the network accumulates connections in high-density areas, with strongly interconnected nodes forming clusters that reveal grouping patterns from the network’s center outward. Meanwhile, panel (b) illustrates the network’s peripheral region, where nodes with lower connectivity are located, demonstrating how these elements are spatially distributed in areas furthest from the network’s center. This progressive representation facilitates an understanding of the network’s complexity from different perspectives, enabling the direct appreciation of the relationships between process characteristics.
The visualization of the quality control network in chip manufacturing maintains graphical legibility, enabling the identification of characteristics’ importance hierarchy through node size and tonality, as well as correlation strength through connection intensity. The representation achieves a balance between showing densely connected structures (clusters) and more dispersed zones. This clarity in visualization facilitates an understanding of the complex network of interrelationships in the manufacturing process, providing an effective visual tool for quality control monitoring and analysis.
Table 4 shows the top 20 nodes with highest centrality according to the SL-WLEN metric, including their components and final rankings.
Figure 7 presents visual local subnetworks corresponding to the six highest-ranked nodes according to the SL-WLEN metric, revealing distinctive patterns of connectivity and local structure. The composite visualization shows different topological configurations that justify the ranking obtained through the proposed metric.
Node f625, which occupies the first position, exhibits a high density of local connections with a compact and well-connected structure, characterized by multiple intermediate nodes forming a cohesive community. The second highest-ranked node, f1397, presents a distinctive triangular connectivity pattern, less dense than f625 but with strategically distributed connections in its neighborhood. In the third position, f468 shows a predominantly radial structure with direct connections and a more pronounced dispersion pattern than the previous ones. Node f506, in fourth place, is characterized by minimal but strategic connectivity, with sparse links and a simpler structure compared to higher-ranked nodes. The fifth node, f981, presents moderate connection density with a semi-compact structure and irregular link distribution. Finally, f732, in sixth position, is distinguished by a hexagonal structure with regular and symmetric connections, showing moderate density with an ordered pattern.
This visualization provides empirical evidence of how the SL-WLEN metric captures different aspects of centrality and local structure in the network. Visually, it is possible to appreciate its capability to identify significant nodes based on multiple topological and structural criteria.
The visualizations are consistent with the SL-WLEN ranking. The top three nodes (f625, f1397, f468) demonstrate more sophisticated connectivity patterns that reflect their high metric values: f625 with its dense and cohesive structure, f1397 with strategic triangular connections, and f468 with its efficient radial pattern. The lower-ranked nodes (f506, f981, f732) exhibit simpler or less-integrated structures, consistent with their lower SL-WLEN values (0.3437, 0.3378, and 0.3192, respectively).
The identification of these central characteristics through SL-WLEN reveals not only nodes important for final product quality but also their role in manufacturing system stability. The connection structure of these nodes suggests that they are critical points for maintaining process coherence and stability; alterations in these characteristics could propagate extensively through the network due to their multilevel connectivity patterns. This complements the traditional approach based solely on nodal values by considering how these characteristics act as system stabilizers through their interconnections. For example, the dense and cohesive structure of node f625 suggests that it is crucial not only for final quality but also for maintaining the operational stability of the manufacturing process.
6. Robustness Analysis of the SL-WLEN Metric
To evaluate the robustness of the SL-WLEN metric, we adapted the methodology proposed by [42], which continues to be employed in contemporary research such as [43]. This line of work provides a systematic framework to analyze how classical centrality measures (degree, betweenness, closeness, and eigenvector) maintain their consistency under different conditions of error or perturbation in network data. The same perturbation and evaluation techniques were applied to our composite SL-WLEN metric, which, unlike classical metrics, incorporates both structural aspects and nodal values in its calculation. This analysis matters because, in real situations, networks may be subject to various types of modifications or errors in their structure.
The process began with selecting a representative sample of the network, balancing computational efficiency and structural representativeness. Given that the complete network consists of 1555 nodes, a robustness analysis on the entire network would be computationally intensive and time-demanding. Therefore, a sample size of 100 nodes was determined, large enough to capture the network structure without compromising analysis viability. To ensure representativeness, stratified sampling based on connectivity distribution was implemented, following the power-law distribution observed in real networks. Strata were defined according to node degree, classifying nodes into high, medium, and low connectivity. Node allocation in each stratum was performed using the formula n_h = n × (N_h / N), where n_h is the sample size for stratum h, n is the total sample size, N_h is the stratum size in the population, and N is the total number of nodes. To ensure balanced network representation, the sample distribution was then adjusted, allocating 20% to highly connected nodes (hubs), 60% to medium connectivity nodes, and 20% to peripheral nodes. This allowed for capturing the global network structure while optimizing computational resources during test execution.
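The allocation step can be illustrated with a short Python sketch. It shows both the proportional rule n_h = n × (N_h / N) and the adjusted 20/60/20 split. The stratum sizes below are hypothetical (the text does not report them), as are the function names and the rounding to the nearest integer, which does not always sum exactly to the target sample size.

```python
import random


def proportional_allocation(stratum_sizes, sample_size):
    """n_h = n * N_h / N for each stratum h, rounded to the nearest integer."""
    total = sum(stratum_sizes.values())
    return {h: round(sample_size * size / total) for h, size in stratum_sizes.items()}


def fixed_share_allocation(shares, sample_size):
    """Allocate the sample by preset shares, e.g. 20% / 60% / 20%."""
    return {h: round(sample_size * share) for h, share in shares.items()}


def stratified_sample(strata, allocation, seed=0):
    """Draw the allocated number of nodes from each stratum without replacement."""
    rng = random.Random(seed)
    sample = []
    for h, members in strata.items():
        k = min(allocation[h], len(members))
        sample.extend(rng.sample(sorted(members), k))
    return sample


# Hypothetical stratum sizes that add up to the 1555 nodes of the network.
sizes = {"high": 155, "medium": 1000, "low": 400}
proportional = proportional_allocation(sizes, 100)
adjusted = fixed_share_allocation({"high": 0.2, "medium": 0.6, "low": 0.2}, 100)

# Hypothetical node labels: integers split into three consecutive blocks.
strata = {"high": range(0, 155), "medium": range(155, 1155), "low": range(1155, 1555)}
sample = stratified_sample(strata, adjusted)
```

Comparing `proportional` with `adjusted` shows how far the 20/60/20 split departs from strict proportionality for a given degree distribution.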
Four fundamental types of error that can occur in real networks were considered. Node removal simulates scenarios where data from some actors are lost, randomly selecting n × p nodes, where n is the total number of nodes and p is the modification proportion (0.01, 0.05, etc.). Node addition represents situations where new actors are incorporated into the network, creating new nodes with degrees similar to those of randomly selected existing nodes. Edge removal simulates cases where existing connections are lost, randomly selecting m × p edges, where m is the total number of edges. Edge addition represents scenarios where new connections are created between previously unconnected nodes.
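The four perturbation types can be sketched as small Python functions over a node set and an edge set. This is an illustrative sketch, not the procedure used in the study: the graph representation (integer labels, edges stored as ordered pairs), the function names, the rounding of n × p and m × p, and the way a new node copies the degree of a randomly chosen existing node are all assumptions.

```python
import random


def degrees(nodes, edges):
    """Degree of every node, counted from the edge set."""
    deg = {n: 0 for n in nodes}
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg


def remove_nodes(nodes, edges, p, rng):
    """Drop round(n * p) random nodes together with their incident edges."""
    dropped = set(rng.sample(sorted(nodes), round(len(nodes) * p)))
    kept = {(u, v) for u, v in edges if u not in dropped and v not in dropped}
    return nodes - dropped, kept


def remove_edges(nodes, edges, p, rng):
    """Drop round(m * p) random edges."""
    dropped = set(rng.sample(sorted(edges), round(len(edges) * p)))
    return set(nodes), edges - dropped


def add_edges(nodes, edges, p, rng):
    """Connect round(m * p) random pairs of previously unconnected nodes."""
    ordered = sorted(nodes)
    free = [(u, v) for i, u in enumerate(ordered) for v in ordered[i + 1:]
            if (u, v) not in edges]
    k = min(round(len(edges) * p), len(free))
    return set(nodes), edges | set(rng.sample(free, k))


def add_nodes(nodes, edges, p, rng):
    """Add round(n * p) nodes, each copying the degree of a random original node."""
    deg = degrees(nodes, edges)
    existing = sorted(nodes)
    new_nodes, new_edges = set(nodes), set(edges)
    next_label = max(existing) + 1
    for _ in range(round(len(nodes) * p)):
        template = rng.choice(existing)
        for target in rng.sample(existing, deg[template]):
            new_edges.add((target, next_label))  # target < next_label keeps pairs ordered
        new_nodes.add(next_label)
        next_label += 1
    return new_nodes, new_edges


# Hypothetical test network: a ring of 20 nodes, so every node has degree 2.
ring_nodes = set(range(20))
ring_edges = {(i, i + 1) for i in range(19)} | {(0, 19)}
rng = random.Random(1)
```

Passing an explicit `random.Random` instance keeps each replication reproducible, which helps when changes need to be traced per replication.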
The selection of perturbation levels was made considering a spectrum ranging from minimal modifications to substantial network changes. Levels of 1% were used, representing minimal errors or minor natural changes in the network; levels of 5% and 10% simulated moderate perturbations; levels of 25% represented significant structure alteration; and levels of 50% simulated extreme network modification. This gradation allows for the evaluation of the metric’s sensitivity to small perturbations, its resistance to moderate changes, and its behavior under extreme conditions.
To evaluate different aspects of robustness, five complementary metrics were implemented. The Top 1 metric measures the proportion of times the most important node maintains first position after modification, Top 3 indicates the frequency with which it remains among the top three, and Top 10% represents the proportion of times it stays within the first decile. For example, if a node has values of 0.980 in Top 1, 0.900 in Top 3, and 1.000 in Top 10%, it means that in 98% of replications it retained first position, in 90% it remained among the top three, and in all replications it stayed within the first decile.
The Overlap measure calculates the normalized intersection between the upper deciles of the original and modified networks, with A and B denoting the sets of nodes in the first decile of each network. Its value varies between 0 and 1, indicating the degree of coincidence between both networks; for example, a value of 0.720 means that 72% of nodes in the first decile are the same in both versions. On the other hand, R² represents the square of the Pearson correlation between the SL-WLEN values of the original and modified network. Values close to 1, such as 0.997, indicate that the relative order of nodes is preserved almost perfectly, while lower values reflect a greater discrepancy in ordering.
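The five robustness measures can be sketched for a single replication as follows; averaging over replications then gives the reported proportions. Several details are assumptions: Overlap is computed here as |A ∩ B| / |A ∪ B|, which is one common way to normalize an intersection and may differ from the exact formula used in the study; the first decile is rounded up; R² is computed over the nodes present in both networks; and all function names and scores are hypothetical.

```python
def ranked(scores):
    """Node labels ordered from highest to lowest score (ties broken by label)."""
    return sorted(scores, key=lambda n: (-scores[n], n))


def decile_size(scores):
    """Number of nodes in the first decile, rounded up, at least 1."""
    return max(1, (len(scores) + 9) // 10)


def top_hits(original, perturbed):
    """Does the original top node stay first, in the top 3, in the top 10%?"""
    leader = ranked(original)[0]
    order = ranked(perturbed)
    return {
        "top1": leader in order[:1],
        "top3": leader in order[:3],
        "top10pct": leader in order[:decile_size(perturbed)],
    }


def overlap(original, perturbed):
    """Assumed form: |A & B| / |A | B| over the two first-decile node sets."""
    a = set(ranked(original)[:decile_size(original)])
    b = set(ranked(perturbed)[:decile_size(perturbed)])
    return len(a & b) / len(a | b)


def r_squared(original, perturbed):
    """Squared Pearson correlation over the nodes present in both networks."""
    common = sorted(set(original) & set(perturbed))
    xs = [original[n] for n in common]
    ys = [perturbed[n] for n in common]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0  # correlation is undefined for constant scores
    return (sxy * sxy) / (sxx * syy)


# Made-up scores: the perturbed network swaps the two highest-scoring nodes.
original = {i: 20 - i for i in range(20)}
perturbed = dict(original)
perturbed[0], perturbed[1] = original[1], original[0]
hits = top_hits(original, perturbed)
```

In this toy case the original leader loses first place but stays in the top three and in the first decile, and the two first-decile sets coincide.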
The validation process was designed at multiple levels to ensure the robustness of results. Fifty replications were performed, generating in each one a modified version of the network, called “test network”, on which modifications were applied. The evaluation included an independent analysis of each combination of error type and level, calculating the five robustness measures and averaging the results. The entire process was documented, recording changes in each replication, ensuring modification traceability, and generating detailed reports.
The test results demonstrate that the SL-WLEN metric exhibits robust and reliable behavior under different network perturbation conditions. The metric shows notable stability against element removal, particularly in the case of edges, where it maintains Top1/3/10% values above 0.900 even with 50% modifications. For node removal, the metric preserves its stability up to 25% modification, with Top1/3/10% values equal to or greater than 0.800, and maintains R² values above 0.92 even with 50% alterations.
Regarding element addition, both for nodes and edges, the metric shows progressive deterioration starting from 10% modification, demonstrating greater sensitivity to the incorporation of new elements than to their removal. This sensitivity is reflected in a substantial decrease in consistency for large-scale modifications, where R² decreases to approximately 0.77. Nevertheless, the metric maintains high reliability in scenarios with small perturbations of 1–5%.
The practical implications of these results confirm that the SL-WLEN metric is particularly effective in identifying and maintaining the hierarchy of the most important nodes in the network, even under conditions of moderate data loss. Its greater sensitivity to the incorporation of new elements suggests the need for caution when making modifications that exceed 25% of the network structure. These findings validate the robustness and utility of the SL-WLEN metric for complex network analysis, demonstrating its capability to maintain consistency in identifying critical nodes under various perturbation conditions.
7. Performance of the Proposed Metric
In this section, the results of empirical experiments conducted to test the performance of the proposed centrality metric against several real-world networks are presented.
7.1. Test Parameters
For the execution of the experiments, the R programming language (version 4.3.3) was used on the RStudio development platform (version 2024.04.2+764). The computer employed runs Windows 10, with an Intel Core i3 processor up to 3.5 GHz and 16 GB of RAM. The influence of nodes is calculated using each of the selected centrality metrics, and the nodes are then sorted from highest to lowest impact to form a TopK of the most influential nodes, as proposed by [21,44].
SL-WLEN has four weighting parameters, whose values represent the contribution of each component to the final influence value. These parameters have all been set to 0.25, following the proposal of [18,21]. Meanwhile, the parameter L, which determines the maximum neighborhood level, was chosen after testing its effectiveness with various values.
7.2. Description of Test Datasets
The experiments were conducted on the following six undirected, doubly weighted complex networks (weights on edges and valued nodal attributes): (1) Bitcoin transaction network, (2) Sioux Falls city transport flow network, (3) energy flow network, (4) Reddit social network interaction network, (5) Global Urban Network, and (6) quality control network. The use of real and artificial networks is a common practice for validating node centrality metrics, allowing for the evaluation of their performance in diverse network structures that are representative of real-world scenarios [45].
The Bitcoin transaction network represents a trust system among users trading with Bitcoin, where each node (5881 in total) symbolizes an individual user with an importance value determined by their activity on the platform, while edges (35,592 connections, 89% positive) indicate trust assessments between users with weights. The data were extracted from the Bitcoin OTC platform over a 2.5-year period (January 2014 to April 2017), constituting the first weighted and signed network available for research on reputation systems in anonymous environments.
The Sioux Falls city transport flow network models an urban traffic flow system where each node (24 zones) symbolizes a specific city intersection with an importance value determined by its strategic position, while edges (76 connections) indicate pathways with flow capacities between 2 and 10 units and a constant length of 0.15. This standardized dataset has been utilized as a benchmark in transportation research for traffic assignment studies and road planning.
The bus energy flow network represents an electrical power system where each node (118 buses) symbolizes a substation with an importance value determined by its voltage level (between 100 and 110 kV), while edges indicate transmission lines between these buses with their respective impedances. The dataset originates from a digitization of the American Midwest electrical system from 1962, converted to standard format to serve as a test case in power flow analysis and electrical network stability studies.
The Reddit social network interaction network represents a system of connections between online communities where each node (55,863 subreddits) symbolizes a thematic forum with an importance value determined by its activity, while edges (858,490 connections) indicate hyperlinks between subreddits with an associated sentiment. The data were extracted from Reddit over a 2.5-year period (2014–2017), identifying hyperlinks in post titles and bodies, with temporal metadata and textual property vectors, as part of a research project on interactions and conflicts between online communities.
The Global Urban Network represents a georeferenced spatial system where each node symbolizes an urban element with an importance value determined by its hosting capacity, while edges indicate physical connections with traffic levels as weights. The data were processed using Euclidean buffers of 100 m for nodal attributes and a threshold of 50 m for edge attributes, available in GeoJSON format for multiple global cities, enabling consistent spatial analyses across diverse urban contexts.
The quality control network for chip manufacturing comprises 1763 product observations with 1555 binary quality features, where nodes represent features with importance values determined through logistic regression with Lasso regularization, and edges show significant statistical correlations between features (measured with the Phi coefficient) above an established significance threshold, allowing for the identification of critical control points in the manufacturing process.
7.3. Performance Evaluation
The experiments are based on the application of the SIR (Susceptible-Infected-Recovered) model proposed by [46], using Kendall’s correlation coefficient as a comparative method [47,48]. This allows for examining how information propagates from nodes identified as influential and for evaluating the efficiency of such diffusion in the network structure [18]. The evaluation compares the effectiveness of various centrality metrics, including SL-WLEN, through SIR simulations. The TopK nodes identified by each metric function as initial infection seeds, and their propagation is evaluated under given infection and recovery parameters. Kendall’s τ coefficient quantifies the correlation between the original centrality ranking and the actual diffusion capacity observed in the SIR model, thus validating the predictive accuracy of each metric.
The SIR model is recognized for examining propagation dynamics in complex network systems. Each element is categorized as Susceptible (S), Infected (I), or Removed (R). When considering interactions between elements u and v, the contagion mechanism follows two principles expressed in the following Equations (17) and (18):
One parameter represents the infection rate, while the other indicates the recovery rate. In combination, the two equations describe how an already infected element can transform a susceptible element with the infection probability, incorporating it into the infected group.
For the evaluation, each centrality indicator identifies the TopK most relevant elements, which function as initial infection foci in the model. Following the SIR logic, these elements can transmit the infection to their neighboring elements with the infection probability or can recover with the recovery probability. Equation (19) defines a function that quantifies the sum of both infected and recovered elements at a specific moment t, where t represents a temporal unit in the simulation. This value serves as an indicator of the influence capacity of the initial foci, and its two terms correspond to the number of elements in infected and recovered states, respectively.
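The propagation process can be illustrated with a minimal discrete-time SIR simulation in Python. It is a sketch under assumptions, not the simulation code used in the experiments: the names `beta` (infection probability) and `gamma` (recovery probability), the synchronous update scheme, the function name, and the path graph are all hypothetical. At every step it records the number of infected plus recovered nodes, which is the quantity the text uses as the influence indicator.

```python
import random


def sir_spread(adj, seeds, beta, gamma, steps, rng):
    """Return the count of infected + recovered nodes after each time step."""
    infected = set(seeds)
    recovered = set()
    history = []
    for _ in range(steps):
        newly_infected = set()
        for u in sorted(infected):
            for v in adj[u]:
                # Only susceptible neighbors can be infected.
                if v not in infected and v not in recovered and rng.random() < beta:
                    newly_infected.add(v)
        # Nodes infected before this step may recover; new infections wait a step.
        healed = {u for u in sorted(infected) if rng.random() < gamma}
        infected = (infected | newly_infected) - healed
        recovered |= healed
        history.append(len(infected) + len(recovered))
        if not infected:
            break  # the outbreak has died out
    return history


# Hypothetical path graph 0 - 1 - 2 - 3 seeded at node 0.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
certain = sir_spread(path, [0], beta=1.0, gamma=1.0, steps=10, rng=random.Random(0))
blocked = sir_spread(path, [0], beta=0.0, gamma=1.0, steps=10, rng=random.Random(0))
```

In practice the run would be repeated many times per seed set and the final values averaged, since a single stochastic run is noisy.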
Kendall’s τ coefficient is widely used to compare hierarchically ordered elements [6]. In this coefficient, each observation represents the position of a node according to a centrality metric. Let A and B be two lists that order the nodes according to their influence scores. List A is obtained from the proposed centrality metric, while B is generated from the application of SIR. Consider any two nodes that appear in both lists. If A and B place the two nodes in the same relative order, their influence scores are concordant; otherwise, they are considered discordant. Equation (20) formally defines Kendall’s τ coefficient in terms of the number of concordant and discordant pairs. The coefficient varies between −1 and 1, where higher values indicate a greater similarity between the ranking lists.
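The concordant and discordant pair counting can be sketched in Python as follows. The sketch uses the common tau-a form, (concordant − discordant) divided by the total number of pairs n(n − 1)/2; whether Equation (20) uses exactly this normalization is an assumption, as are the function name and the example scores. Tied pairs are counted as neither concordant nor discordant.

```python
from itertools import combinations


def kendall_tau(scores_a, scores_b):
    """Kendall's tau (tau-a form) between two score dictionaries."""
    nodes = sorted(set(scores_a) & set(scores_b))
    if len(nodes) < 2:
        raise ValueError("at least two common nodes are required")
    concordant = discordant = 0
    for u, v in combinations(nodes, 2):
        # Positive product: both lists order u and v the same way.
        product = (scores_a[u] - scores_a[v]) * (scores_b[u] - scores_b[v])
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
    total_pairs = len(nodes) * (len(nodes) - 1) / 2
    return (concordant - discordant) / total_pairs


# Made-up scores: list A from a centrality metric, list B from SIR spreading.
centrality = {1: 3.0, 2: 2.0, 3: 1.0}
spreading = {1: 3.0, 2: 1.0, 3: 2.0}
tau = kendall_tau(centrality, spreading)
```

Here two of the three pairs are concordant and one is discordant, so the coefficient is (2 − 1) / 3.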
7.4. Benchmark Metrics
In this work, the correlation between SL-WLEN and centrality metrics based on topological structures is evaluated, such as semi-local centrality (SC), k-shell decomposition (KS), Local Relative Average Shortest Path (LRASP), weighted hybrid centrality (WHC), and influential node based on Average Shortest Path (INASP). Additionally, the performance of SL-WLEN is compared with centrality metrics based on nodal attributes, such as node-weighted degree centrality (NWDegC), node-weighted harmonic centrality (NWHC), node-weighted betweenness centrality (NWBC), and modified node-weighted eigenvector centrality (MNWEC). The evaluation was conducted using constant configurations for all methods, ensuring an equitable comparison.
7.5. Analysis of Results
7.5.1. Analysis of the L Parameter of SL-WLEN
The maximum exploration level (L) is the sole adjustable parameter defining SL-WLEN’s neighborhood scope. This parameter is present in the definitions of DegreeSemiLocal and SemilocalNodeLexOrder of SL-WLEN. The value of L plays an important role in identifying influential nodes in the network, as L is applied as a hop count to extract the local subgraph through the extended neighborhood connectivity (ENC) concept; that is, L determines the scope of the local subgraph.
Traditional semi-local centrality metrics generally only consider first- and second-level neighbors [11]. However, exploring higher values of the L parameter could improve precision in identifying influential nodes in networks. Based on this premise, we evaluated various values of L in the DegreeSemiLocal and SemilocalNodeLexOrder components to optimize the performance of SL-WLEN.
Table 9 presents the comparative evaluation between various network structures using Kendall’s τ coefficient. The data presented correspond to simulations performed with fixed recovery and infection parameters.
In the table, the optimal values appear highlighted in bold to facilitate their identification. Upon examining the averages row, it is evident that the SL-WLEN metric achieves its maximum effectiveness at the neighborhood level highlighted there. Consequently, the following experimental phases with the SL-WLEN metric focus specifically on this neighborhood level.
7.5.2. Comparison Based on Kendall’s Coefficient
Kendall’s τ coefficient functions as a standard criterion when evaluating centrality metrics in the context of the SIR model. The experiments were conducted keeping one SIR parameter fixed while varying the other in the range from 0.01 to 0.1, in accordance with the work of [44].
Figure 8 illustrates the results of Kendall’s τ coefficient, comparing the SL-WLEN metric with centrality metrics based on connectivity (KS, SC, LRASP, WHC, INASP) across the test networks. Meanwhile, Figure 9 presents the comparison between SL-WLEN and various metrics based on nodal properties (NWDegC, NWHC, NWBC, MNWEC). This analysis encompasses six distinct networks: Bitcoin transaction network, Sioux Falls city transport flow network, energy flow network, Reddit social network interactions, Global Urban Network, and quality control network.
The experiment examines the correlation between propagation dynamics and various centrality metrics under different values of the varied SIR parameter. The propagation measure quantifies the total of both infected and removed nodes at time t, naturally increasing with temporal advancement. In the presented graphs, the rank correlation r is equivalent to Kendall’s τ coefficient, reflecting the relationship between the accumulation of infected nodes according to each metric and the value of the varied parameter. Higher correlation values indicate greater predictive capacity regarding propagation potential.
The simulation results demonstrate that the SL-WLEN metric maintains consistently high values of the τ coefficient across all analyzed networks. According to the graphs presented in Figure 8 and Figure 9, SL-WLEN improves Kendall’s τ coefficient compared to both connectivity-based metrics and nodal attribute-based metrics. Specifically, in the Bitcoin transaction network, SL-WLEN improves between 8% and 10% over SC; in the energy flow network, where SL-WLEN maintains a particularly notable advantage, the improvement ranges between 15% and 18% compared to traditional metrics; in the Global Urban Network, improvement is between 7% and 9% over NW-DegC; and in the quality control network, it is between 12% and 15% over NW-BC.
The variability in the performance of the SL-WLEN method across the analyzed networks can be attributed to the inherent heterogeneity in the joint distribution of edge weights and nodal attributes. Each network exhibits a distinct structural correlation between these components, where SL-WLEN effectively captures this multidimensional complexity. The variations observed in Kendall’s τ coefficient demonstrate the sensitivity and adaptability of the method to diverse network topologies. This phenomenon corroborates the robustness and general utility of SL-WLEN as an effective analytical tool for centrality evaluation in complex networks with heterogeneous topological characteristics.
Specifically, SL-WLEN demonstrates superior performance when edge weights and nodal values provide complementary information about node importance in propagation dynamics; conversely, when these signals exhibit redundancy or conflicting patterns—as can occur in networks with specific physical constraints, such as in the Sioux Falls transport flow network—comparative performance fluctuations are observed.
A deeper analysis reveals that these variations can be explained through the following interrelated factors: First, in the energy flow network and quality control network, cascade propagation phenomena align closely with the underlying assumptions of the SIR model, whereas in the Sioux Falls transport flow network, the deterministic nature of flow limits the advantages of the probabilistic approach. Second, in networks such as the Bitcoin transaction network, the asymmetric trust structure allows the SemilocalNodeLexOrder component to effectively preserve the influence of high-reputation nodes, while in the Sioux Falls transport flow network, structural predetermination reduces the added value of lexicographic ordering. Third, the degree of modular structure affects relative performance, as observed in the Reddit social interaction network, where the pronounced community structure generates propagation dynamics that vary according to the parameter of the SIR model.
The analysis also identifies specific limitations. When topological importance and nodal values are highly correlated (as in the Sioux Falls network), the components of SL-WLEN may become redundant. Performance also varies considerably with the parameter setting, which requires network-specific calibration. Lexicographic ordering introduces additional computational cost that may limit applicability in extremely dense networks. Finally, in systems where spatial or physical constraints rigidly determine structure, the added value of the lexicographic approach decreases considerably.
This performance variability demonstrates that, while SL-WLEN generally provides superior results, its effectiveness is conditioned by specific network characteristics, allowing for the identification of scenarios where its application is optimal as well as those where alternative methods might be more appropriate.
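The Kendall’s τ comparisons discussed in this section measure how well the ranking induced by a centrality metric agrees with a reference ranking, such as simulated spreading influence. As a minimal sketch, the function below computes the tau-a variant, in which tied pairs count as neither concordant nor discordant. The choice of tau-a and the example scores are assumptions for illustration; the study may use a different tie-handling variant.

```python
from itertools import combinations


def kendall_tau(x, y):
    """Kendall's tau-a between two equal-length lists of scores.

    A pair (i, j) is concordant when x and y order it the same way,
    discordant when they order it oppositely, and ignored when tied.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    concordant = discordant = 0
    for i, j in combinations(range(len(x)), 2):
        sign = (x[i] - x[j]) * (y[i] - y[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs


# Hypothetical scores for four nodes: a centrality metric vs. a reference.
centrality_scores = [1, 2, 3, 4]
reference_scores = [1, 3, 2, 4]
tau = kendall_tau(centrality_scores, reference_scores)  # 5 concordant, 1 discordant
```

A value near 1 indicates that the metric ranks nodes in almost the same order as the reference, a value near 0 indicates no consistent relationship, and a value near -1 indicates a reversed ranking.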
7.6. Complexity Analysis
The complexity of SL-WLEN involves calculating local and semi-local influence. Local influence (DegreeLocal) has complexity O(1), as it only requires dividing the node degree by the size of the network. Semi-local influence (DegreeSemiLocal) has complexity , as it depends on subgraph extraction through ENC and processing neighbors up to level L.
The calculation of SemilocalNodeLexOrder adds the complexity of the lexicographic ordering of neighbors, which at each level l is . The partial contributions by level have complexity , and their accumulation up to is .
Together, the total complexity of SL-WLEN is dominated by the lexicographic ordering. This additional cost allows the metric to better capture the relative importance of nodes through their attributes, which differentiates it from other semi-local metrics. This is consistent with [49], who argue that not all nodes have the same level of importance within a network, since importance depends on each node's impact on the structure and dynamic behavior of the system.
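The two operations that drive this cost, collecting neighbors up to level L and ordering them at each level, can be sketched as follows. This is an illustration under stated assumptions rather than the SL-WLEN implementation: the function names are hypothetical, the neighborhood is gathered with a plain breadth-first search instead of the ENC procedure, and the sort key (nodal value, then degree) is an assumed stand-in for the lexicographic criterion.

```python
from collections import deque


def neighbors_by_level(adj, source, max_level):
    """Breadth-first search from source, truncated at max_level hops.

    Returns {level: [nodes at exactly that hop distance]} for 1..max_level.
    """
    dist = {source: 0}
    levels = {level: [] for level in range(1, max_level + 1)}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if dist[u] == max_level:
            continue  # do not expand beyond the requested level
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                levels[dist[v]].append(v)
                queue.append(v)
    return levels


def lex_sorted_levels(adj, values, source, max_level):
    """Order the neighbors at each level by (nodal value, degree), descending.

    The tuple key is an assumed stand-in for the lexicographic criterion.
    """
    levels = neighbors_by_level(adj, source, max_level)
    return {
        level: sorted(nodes, key=lambda v: (values[v], len(adj[v])), reverse=True)
        for level, nodes in levels.items()
    }


# Hypothetical toy network and nodal values.
toy_adj = {
    "a": ["b", "c"],
    "b": ["a", "d"],
    "c": ["a", "d"],
    "d": ["b", "c", "e"],
    "e": ["d"],
}
toy_values = {"a": 0.5, "b": 0.2, "c": 0.9, "d": 0.7, "e": 0.1}
ordered = lex_sorted_levels(toy_adj, toy_values, "a", max_level=2)
```

Sorting the m neighbors found at one level with a comparison sort costs O(m log m), so in this sketch the ordering step grows faster than the traversal itself as neighborhoods become large, in line with the observation that lexicographic ordering dominates the total cost.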
8. Discussion
The SL-WLEN metric demonstrates applicability in the test networks employed in this study. Results across six heterogeneous networks (Bitcoin, Sioux Falls, energy flow, Reddit, global urban, and quality control) evidence its versatility and robustness in different contexts. In chip manufacturing environments, SL-WLEN enables the identification of critical features that both directly and indirectly affect the quality of the final product. The capacity to integrate nodal value and topological structure facilitates the discovery of strategic control points in the production process, proving particularly valuable for improving inspection processes, designing predictive monitoring systems, and prioritizing investments in quality improvements. Nodes f625 and f1397, identified as the most central according to SL-WLEN, are not only individually important but act as system stabilizers due to their multilevel connectivity patterns. This validates the utility of the method for capturing critical elements in the manufacturing context.
Beyond the principal case study, SL-WLEN shows potential for application in financial and trust networks, as evidenced in the Bitcoin network, where it identifies key users in reputation systems with 8–10% greater precision than traditional metrics. Its applicability extends to transportation systems, characterizing strategic intersections where interventions would have a greater impact on global flow; energy networks, identifying critical substations for system stability with a 15–18% improvement over conventional metrics; social networks, detecting communities with greater information diffusion power; and urban planning, identifying strategic locations for infrastructure development.
The comparative analysis using Kendall’s τ coefficient reveals that SL-WLEN consistently outperforms metrics based solely on connectivity (SC, KS, LRASP, WHC, INASP) or exclusively on nodal attributes (NWDegC, NWHC, NWBC, MNWEC). This superiority is due to its effective integration of structure and attributes, capturing their interaction in a manner that better reflects the reality of complex systems where importance depends on both factors jointly. The lexicographic ordering presents a distinctive advantage by preserving the relative hierarchy of important nodes, avoiding the dilution of their influence that typically occurs with methods based on statistical aggregations. Additionally, SL-WLEN achieves a favorable computational balance, maintaining acceptable complexity, which offers greater precision than local metrics without incurring the prohibitive cost characteristic of global metrics for large-scale networks.
Despite its advantages, SL-WLEN presents important limitations that must be considered for its optimal application. The metric is sensitive to the joint distribution of edge weights and nodal attributes, and its performance varies in networks where these components are redundant or conflict, as observed in certain fluctuations in the Sioux Falls network. Its effectiveness is also conditioned by the selection of the neighborhood level L: although a single value was optimal on average for the analyzed networks, each network type might require a specific value, which implies a prior calibration process. Robustness analyses reveal an asymmetry in SL-WLEN’s response to structural modifications. It maintains greater consistency when elements are removed than when they are added, suggesting potential challenges in application to dynamic networks with rapid growth, where more frequent recalculation might be required to maintain precision.
From a practical perspective, although SL-WLEN is computationally more efficient than global metrics, the lexicographic ordering adds a cost that could matter in extremely large networks. For systems with millions of nodes, this incremental cost could compromise the practical applicability of the method in analyses requiring frequent or real-time updates. The effectiveness of SL-WLEN also depends on adequate normalization of nodal values to allow meaningful comparisons, which implies prior data processing that could be complex when attributes have highly skewed distributions or significant outliers. While conceptually adaptable to different typologies, the present study focused primarily on undirected networks, so its application to directed networks might require methodological modifications to adequately capture the characteristic asymmetry of these relationships. The component weighting parameters were set to equal values following previous proposals, but different network types might benefit from alternative configurations, and identifying optimal values for each domain remains an additional practical challenge.
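The normalization concern raised above can be illustrated with a short sketch. It contrasts plain min-max scaling with a log-compressed variant for a skewed set of nodal values. Both function names and the use of a `log1p` transform are assumptions chosen for illustration; the study does not prescribe a particular normalization scheme here.

```python
import math


def min_max_normalize(values):
    """Scale a dict of nodal values linearly into [0, 1]."""
    lo, hi = min(values.values()), max(values.values())
    if hi == lo:
        # All values identical: no node is distinguishable by attribute.
        return {node: 0.0 for node in values}
    return {node: (v - lo) / (hi - lo) for node, v in values.items()}


def log_min_max_normalize(values):
    """Compress large values with log1p before min-max scaling.

    Assumes non-negative attributes; intended for right-skewed distributions.
    """
    if any(v < 0 for v in values.values()):
        raise ValueError("log1p compression assumes non-negative values")
    return min_max_normalize({node: math.log1p(v) for node, v in values.items()})


# Hypothetical skewed nodal attributes with one dominant outlier.
raw_values = {"a": 1.0, "b": 10.0, "c": 1000.0}
linear = min_max_normalize(raw_values)
compressed = log_min_max_normalize(raw_values)
```

With plain min-max scaling, the outlier pushes every other node close to zero, so differences among ordinary nodes almost vanish. The log-compressed variant keeps the same ordering while spreading those nodes across the unit interval, which is one possible way to keep attribute comparisons meaningful.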
In terms of scalability, the current implementation shows performance deterioration in scenarios with substantial network modifications (greater than 25% in additions), suggesting potential limitations for its application in systems with high structural dynamics or requiring real-time analysis. To address these limitations, future research could explore the development of adaptive versions that automatically adjust parameters according to specific characteristics of each network; incremental implementations that allow updating centrality values without complete recalculations when the network experiences localized changes; parallelization strategies to improve performance in large-scale networks; extensions for temporal networks that capture the dynamic evolution of centrality; and integrations with machine learning techniques to predict changes in the relative importance of nodes. In conclusion, while SL-WLEN represents an advance in the identification of influential nodes in complex networks with weighted edges and nodal attributes, its application requires careful consideration of its limitations and the specific implementation context, balancing precision and computational efficiency according to the particular requirements of each application domain.
9. Conclusions
This paper proposes SL-WLEN as a weighted semi-local centrality metric based on the integration of lexicographic ordering and the extended neighborhood concept for identifying influential nodes in complex quality control networks. Beyond node importance, SL-WLEN incorporates both topological structure and nodal values in its evaluation, considering the following four main components: local influence by connectivity, local node influence, semi-local degree influence, and semi-local influence based on lexicographic ordering. By applying a distributed approach that analyzes subgraphs per node and utilizing lexicographic ordering to evaluate hierarchical importance, SL-WLEN provides an effective balance between accuracy and computational complexity.
The simulation results demonstrate that the SL-WLEN metric maintains consistently high values of Kendall’s τ coefficient across all analyzed networks, outperforming both connectivity-based metrics and nodal attribute-based metrics. Specifically, SL-WLEN improves between 8% and 10% over SC in the Bitcoin transactions network, between 15% and 18% in the energy flow network, between 7% and 9% over NW-DegC in the global urban network, and between 12% and 15% over NW-BC in the quality control network. Additionally, the analysis of the neighborhood-level parameter L shows that the metric's effectiveness peaks at a specific level.
Numerical robustness tests demonstrate SL-WLEN’s high stability, especially against element removal, maintaining its consistency even with significant network modifications. However, the metric shows greater sensitivity to the incorporation of new elements, suggesting areas for improvement in future work. The extension of the metric to consider network temporal dynamics and its adaptation for different types of complex networks represents a promising direction for subsequent research.