2. Methodology
In this section, the well-established models that constitute the core of the proposed centrality are described in detail. Then, an algorithm summarizing the calculations required to obtain the measure for multiplex networks is presented and analysed.
Let
$\mathcal{G}=(\mathcal{N},\mathcal{E})$ be a connected graph with the adjacency matrix
$A=\left({a}_{ij}\right)$, with
2.1. The Eigenvector Centrality for Networks with Data
In [
26], Bonacich presented the classical eigenvector centrality, which measures the importance of a node depending on its connections. However, not all links are necessarily equally relevant. Taking this into account, we argue that the centrality of a node depends not only on the number of its links, but also on the degree of its adjacent nodes.
Denoting by
${x}_{i}$ the centrality of the node
i, it is possible to measure the importance of each node with the expression
$${x}_{i}=\frac{1}{\lambda }\sum _{j}{a}_{ij}{x}_{j},\qquad (1)$$
where
${a}_{ij}$ are the elements of the adjacency matrix corresponding to the row
i, and
$\lambda $ is a constant.
Defining the centrality vector as
$\mathit{x}=({x}_{1},{x}_{2},\dots )$, the expression (
1) can be rewritten as
$$A\mathit{x}=\lambda \mathit{x}.\qquad (2)$$
From the expression (
2),
$\mathit{x}$ is an eigenvector of the adjacency matrix
A associated with the eigenvalue
$\lambda $. Taking into account that
A is nonnegative and irreducible and using the Perron–Frobenius theorem, there exists an eigenvector with positive entries associated with the maximum eigenvalue (in absolute value). This vector is the eigenvector centrality.
This classical eigenvector centrality only takes into account the topology and the links of the neighbouring nodes. It does not incorporate any other data of the spatial network.
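As an illustration of expression (2), the following minimal sketch approximates the dominant eigenpair of a small adjacency matrix by power iteration. It is not the implementation used in this work (which relies on R); the function name `eigenvector_centrality`, the shift by the identity, the tolerance and the three-node path graph are assumptions made only for this example.

```python
def eigenvector_centrality(adj, tol=1e-12, max_iter=10_000):
    """Approximate the dominant eigenpair (lambda, x) of a nonnegative,
    irreducible matrix `adj` (list of lists) by power iteration."""
    n = len(adj)
    x = [1.0 / n] * n  # positive start vector whose entries sum to 1
    for _ in range(max_iter):
        # Iterate with A + I (an assumption of this sketch): it has the same
        # eigenvectors as A, with eigenvalues shifted by 1, and the shift
        # prevents oscillation on bipartite graphs.
        y = [x[i] + sum(adj[i][j] * x[j] for j in range(n)) for i in range(n)]
        total = sum(y)
        y = [value / total for value in y]
        converged = max(abs(a - b) for a, b in zip(x, y)) < tol
        x = y
        if converged:
            break
    # Since A x = lambda x and sum(x) = 1, lambda equals the sum of the entries of A x.
    ax = [sum(adj[i][j] * x[j] for j in range(n)) for i in range(n)]
    return sum(ax), x


# Hypothetical example: the path graph 1 - 2 - 3.
path = [[0, 1, 0],
        [1, 0, 1],
        [0, 1, 0]]
lam, x = eigenvector_centrality(path)
```

In this toy graph the middle node receives the largest entry of the vector, since it is adjacent to both other nodes.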
In [
20], Agryzkov et al. present a new centrality measure for networks that takes into account geolocated data associated with the network. In addition, this measure allows the contribution of the topology to the final classification to be weighted.
The main idea of the model is the construction of a
data vector $\mathit{v}$ with all the information present in the network. This vector is normalized to
${\mathit{v}}^{\ast}$, which makes it possible to establish the
importance of an edge between nodes
i and
j as
Repeating the computation for all edges, a weight matrix W for the data is constructed. To avoid null elements in the weight matrix W, a basic minimum level of importance, denoted by $\alpha $, is introduced. A matrix ${A}^{\ast}$ is then constructed that summarizes the importance of both the topology of the network and the data present in it.
Algorithm 1 summarizes the model proposed in [
20].
Algorithm 1 (Eigenvector centrality for networks with data). Let $\mathcal{G}=(\mathcal{N},\mathcal{E})$ be a primary graph with n nodes, A the adjacency matrix, D the data vector and ${\mathit{v}}_{0}$ the balanced vector for data. Let us denote by ∘ the Hadamard matrix product.
1. The data vector $\mathit{v}=D\cdot {\mathit{v}}_{0}$ is constructed.
2. Normalization of $\mathit{v}$.
3. The weight matrix W is constructed.
4. Calculate $\alpha $ using the expression $\alpha =\min \left({v}_{i}^{\ast}\right)$.
5. Take $\epsilon $, according to the expression $\epsilon <\frac{1}{10}\alpha $.
6. From A, W, and $\alpha ,\epsilon $ construct ${A}^{\ast}$ as
7. Compute the dominant eigenpair of ${A}^{\ast}$, $({\lambda}_{1},{\mathit{x}}_{\mathbf{1}})$.
8. From A and ${\mathit{x}}_{\mathbf{1}}$ compute the eigenvector centrality for networks with data CVP as

Here J is the matrix with all entries equal to 1. It is worth remarking, as a special characteristic of this model, that the data associated with the network make it possible to quantify and qualify the information located in their environments.
Based on the Agryzkov et al. [
20] model, some small modifications are introduced in order to design and implement the centrality measure for multiplex networks.
First, the definition of the parameter
$\alpha $ has been modified slightly, reducing the value of the
basic minimum level of importance of data in the global network. Now,
$\alpha $ is calculated as
Specifically, the weight matrix
W is now defined as
The basic minimum level of importance associated with the edges in matrix W is introduced because of the nature of the centrality itself, in which the importance of a node is given by the influence of its neighbours. A node with no data can thus be regarded as always influenced by the global dataset of the whole network, even by nodes that are not directly connected to it.
Consequently, the definition of the matrix
${A}^{\ast}$ is now
In step 8 of Algorithm 1, the centrality of the nodes is calculated from
A and
${\mathit{x}}_{\mathbf{1}}$ by the expression
which is different from the classic eigenvector centrality model computed using the expression (
1). In (
1), the centrality of a node is only determined by the influence of the nodes to which it is connected. However, in the eigenvector centrality with data, the term
${\mathit{x}}_{\mathbf{1}}$ is added in the expression (
3), which represents the importance of the node itself due to the data associated with it. This is a small variant that is introduced with respect to the classic eigenvector centrality, which aims to evaluate the importance of data associated with a particular node.
Figure 1 shows a schematic representation of the eigenvector centrality model proposed in [
20] for networks with data taking into account the modifications proposed.
2.2. The Two-Layer Approach PageRank
A two-layer approach to PageRank was proposed by Pedroche et al. in [
18]. The key is to consider the PageRank model as a process divided into two parts: one related to the topology of the network and the other related to the probability of jumping between two nodes in the network, under the criterion that all such jumps are equally probable.
In [
18], the authors observe that the classification obtained by PageRank for the graph
$\mathcal{G}$ can be understood as the stationary distribution of a Markov chain that occurs in a two-layer network consisting of:
${\mathit{l}}_{\mathbf{1}}$, a physical layer: the network $\mathcal{G}$.
${\mathit{l}}_{\mathbf{2}}$, a teleportation layer: the network given by the personalization vector.
Within this framework, a block matrix
${M}_{A}$ is constructed, where each diagonal block is associated with every layer. Hence,
${M}_{A}$ can be constructed as
where
${M}_{A}$ defines a two-layer Markov chain.
Note that the matrix
${P}_{A}$ is a probability matrix defined as
where
${c}_{j}$ is the sum of the
jth column of the adjacency matrix
A.
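The column normalization described above can be sketched as follows. Since the defining expression is not reproduced here, the sketch assumes the usual convention that each entry is divided by its column sum ${c}_{j}$ when ${c}_{j}\ne 0$; leaving all-zero columns untouched is a further assumption of this example, and the helper name `column_stochastic` and the three-node matrix are hypothetical.

```python
def column_stochastic(adj):
    """Divide every entry of `adj` by the sum of its column.

    Columns whose sum is zero are left as zero columns (an assumption of this
    sketch; the source does not state how such columns are treated).
    """
    n = len(adj)
    col_sums = [sum(adj[i][j] for i in range(n)) for j in range(n)]
    return [
        [adj[i][j] / col_sums[j] if col_sums[j] else 0.0 for j in range(n)]
        for i in range(n)
    ]


# Hypothetical directed example with three nodes.
A = [[0, 1, 1],
     [1, 0, 0],
     [1, 1, 0]]
P = column_stochastic(A)
```

Each nonzero column of `P` then sums to 1, which is what allows it to be read as a matrix of transition probabilities.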
Since
${M}_{A}$ is irreducible and primitive, Pedroche et al. [
18] defined the
two-layer approach PageRank of an adjacency matrix
A as the vector
where there is a unique normalized and positive eigenvector of matrix
${M}_{A}$ given by
${\left[{\pi}_{u}^{T}\;{\pi}_{d}^{T}\right]}^{T}$.
The idea of separating the centrality based on the PageRank concept into two layers, differentiating the topological part of the network from the concept of personalization vector, can be extrapolated to multilayer networks, as the authors demonstrate in [
18].
2.3. Adapting the Two-Layer PageRank Approach for Eigenvector Centrality
In this section, a modification of the eigenvector centrality described in
Section 2.1 is presented. It is based on the two-layer approach PageRank technique described in
Section 2.2. A
$2\times 2$ block matrix is used to distinguish the topology and the teleportation layer. However, some preliminary reasoning is required to understand the similarity of both models.
The idea of combining a physical layer and a teleportation layer in the PageRank measure, separating the topological part of the network from the idea of jumping at random from one node to another, can be applied here in a similar way. Thus, let us consider a first layer related to the quantity of data from the topology of the network and a second layer where a residual importance of the data is considered globally in the network, regardless of where they are located. The first layer may be called the topological data layer, while the second one may be called the residual data layer.
The key of this model lies in the construction of the matrix
M given by
The first diagonal block of M is related to the topological data layer and may be expressed by the Hadamard product $A\circ W$, where A is the adjacency matrix and W is the weight matrix constructed from the quantity and location of data in the network. This block clearly reflects the influence of data regarding the topology of the network. The second diagonal block is related to the residual data layer and is expressed by the product $\epsilon J$, which summarizes the influence of a residual data value at a global network scale. In this second block, we introduce the basic minimum level of importance in the definition of the weight matrix W. This is in accordance with the idea of teleportation, in which a random jump from one node to any other is considered equally likely.
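The assembly of such a block matrix can be sketched as follows. Only the two diagonal blocks, $A\circ W$ and $\epsilon J$, are taken from the description above. The off-diagonal blocks are not specified in this passage, so the sketch takes them as an optional argument and falls back to the identity purely as a placeholder assumption. The names `hadamard` and `two_layer_matrix`, and the numerical values of `A`, `W` and `eps`, are hypothetical.

```python
def hadamard(a, b):
    """Entrywise (Hadamard) product of two matrices of equal shape."""
    return [[a[i][j] * b[i][j] for j in range(len(a[i]))] for i in range(len(a))]


def two_layer_matrix(adj, weights, eps, coupling=None):
    """Assemble a 2n x 2n block matrix with diagonal blocks A∘W and eps*J.

    `coupling` is used for both off-diagonal blocks. Defaulting it to the
    identity is an assumption of this sketch, not a statement of the model.
    """
    n = len(adj)
    if coupling is None:
        coupling = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    top_left = hadamard(adj, weights)             # topological data layer
    bottom_right = [[eps] * n for _ in range(n)]  # residual data layer, eps * J
    upper = [top_left[i] + coupling[i] for i in range(n)]
    lower = [coupling[i] + bottom_right[i] for i in range(n)]
    return upper + lower


# Hypothetical two-node example.
A = [[0, 1], [1, 0]]
W = [[0.2, 0.7], [0.7, 0.3]]
M = two_layer_matrix(A, W, eps=0.01)
```

Passing the coupling blocks explicitly keeps the sketch usable with whatever off-diagonal blocks the full definition of M prescribes.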
In
Figure 2, a schematic representation of the eigenvector centrality model [
20] is presented taking into account the two-layer approach PageRank.
Note that
M is irreducible since any node has a path to any other node, and this is independent of whether
A is irreducible or not. Besides,
M is also nonnegative and primitive (since it is known that an irreducible nonnegative matrix with a nonzero diagonal element is primitive [
27]). Therefore, the eigenvector centrality corresponding to
M is well defined in the sense that the dominant eigenvalue is unique and we can find an associated eigenvector with all its entries positive.
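The primitivity property invoked here can be checked numerically on small matrices. The sketch below relies on Wielandt's bound, a standard result stating that a nonnegative $n\times n$ matrix is primitive exactly when its power of order ${n}^{2}-2n+2$ is strictly positive. The function name `is_primitive` and the two test matrices are hypothetical and are not taken from the paper.

```python
def is_primitive(m):
    """Return True if the nonnegative square matrix `m` is primitive.

    Only the zero/nonzero pattern matters, so boolean matrix powers are used.
    By Wielandt's bound it suffices to inspect powers 1 .. n^2 - 2n + 2.
    """
    n = len(m)
    pattern = [[m[i][j] > 0 for j in range(n)] for i in range(n)]
    power = pattern
    for _ in range(n * n - 2 * n + 2):
        if all(all(row) for row in power):
            return True  # some power is strictly positive
        power = [
            [any(power[i][k] and pattern[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)
        ]
    return False


# Irreducible but periodic: a 2-cycle, hence not primitive.
cycle = [[0, 1], [1, 0]]
# Same graph with one nonzero diagonal element: primitive.
cycle_with_loop = [[1, 1], [1, 0]]
```

The second matrix illustrates the cited fact: adding a single nonzero diagonal element to an irreducible nonnegative matrix makes it primitive.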
Consequently, because of the good spectral characteristics of
M, the eigenpair
$({\lambda}_{1},{\widehat{\pi}}_{M})$ is obtained, where
${\lambda}_{1}$ is the dominant eigenvalue and
${\widehat{\pi}}_{M}$ is the unique positive eigenvector of matrix
M given by (
5). Therefore,
${\widehat{\pi}}_{M}$ is the vector used to calculate the centrality.
This centrality, which adapts the two-layer approach for PageRank to the eigenvector centrality for networks with data, is denoted as CVP2f and may be calculated by the expression
2.4. The Eigenvector Centrality for Multiplex Networks with Data
Taking as a reference the model described in
Figure 2, it is possible to extend the centrality measure to the case of multiplex networks, where all the layers have the same nodes and the differences are in the relationships among them.
Let us consider a multiplex network
$\mathcal{M}=(\mathcal{N},\mathcal{E},\mathcal{S})$ with layers
$\mathcal{S}=({l}_{1},{l}_{2},\dots ,{l}_{k})$. Then, an eigenvector centrality is defined by associating to each layer
${l}_{i}$ a two-layer approach as described in
Section 2.2. Moreover, the transition between these layers must be allowed.
To begin with, a biplex network $\mathcal{M}=(\mathcal{N},\mathcal{E},\mathcal{S})$ with two layers $\mathcal{S}=({l}_{1},{l}_{2})$ and adjacency matrices ${A}_{1},{A}_{2}\in {\mathbb{R}}^{n\times n}$ is considered. For every layer ${l}_{i}$ (for $i=1,2$) we use the following elements: ${D}_{i}$ data matrix, ${v}_{0i}$ balanced vector, ${W}_{i}$ weight matrix, and ${\alpha}_{i},{\epsilon}_{i}$ parameters associated with the data vector.
It is possible to construct the
$4n\times 4n$ matrix
${M}_{BI}$ as
${M}_{BI}$ inherits the spectral characteristics of
M, since by construction
${M}_{BI}$ is nonnegative, irreducible and primitive. Therefore, there exists a unique dominant eigenvalue and an associated eigenvector with all its elements positive. That is, the eigenvector
${\widehat{\mathit{\pi}}}_{\mathit{BI}}$ is associated with the dominant eigenvalue
${\lambda}_{1}$. This vector is the basis to obtain the classification vector. Therefore, a unique vector is obtained
with all its elements positive.
Regarding the calculation of the centrality, we do not have a single adjacency matrix as in the case of monoplex networks, since there is an adjacency matrix for each layer of the network. It is reasonable to construct a global adjacency matrix of the network that reflects the connections between nodes in all the layers of the network. We call this matrix the
global adjacency matrix and denote it by
${A}_{G}$. This matrix is defined as
Therefore, if this centrality is denoted as CVPBI, it can be calculated by the expression
where
${A}_{G}$ is the global adjacency matrix given by (
10).
The following algorithm summarizes the steps required to calculate the CVPBI centrality.
Algorithm 2 (Eigenvector centrality for biplex networks). Let $\mathcal{M}=(\mathcal{N},\mathcal{E},\mathcal{S})$, with layers $\mathcal{S}=({l}_{1},{l}_{2})$ and adjacency matrices ${A}_{1},{A}_{2}$ be a biplex network with n nodes. Let ${D}_{1},{D}_{2}$ be the data matrices for layers ${l}_{1},{l}_{2}$, respectively.
1. Construct the weighted vectors ${\mathit{v}}_{\mathit{i}}={D}_{i}\cdot {\mathit{v}}_{0i}$, for $i=1,2$.
2. Normalization of ${\mathit{v}}_{\mathit{i}}$, for $i=1,2$.
3. Construct the weighted matrices ${W}_{i}$, for $i=1,2$, as
4. Compute ${\alpha}_{i}$, for $i=1,2$, using the expression ${\alpha}_{i}=\min \left({v}_{i}^{\ast}\right)$.
5. Obtain ${\epsilon}_{i}$, according to the expression ${\epsilon}_{i}<\frac{1}{10}{\alpha}_{i}$.
6. From ${A}_{i}$, ${W}_{i}$, and ${\alpha}_{i},{\epsilon}_{i}$ construct ${M}_{BI}$ as
7. Compute the dominant eigenpair of ${M}_{BI}$, $({\lambda}_{1},{\widehat{\mathit{\pi}}}_{\mathit{BI}})$.
8. Compute $\mathit{x}$ from expression (9).
9. Compute ${A}_{G}$, the global adjacency matrix, using expression (10).
10. From ${A}_{G}$ and $\mathit{x}$ compute the centrality

Algorithm 2, denoted as CVPBI, summarizes the steps required to compute the centrality.
In
Figure 3, a scheme of the eigenvector centrality algorithm for biplex networks is presented.
This biplex measure provides a ranking vector of the nodes according to their importance. This classification is obtained from the importance of the nodes in two layers that share the same nodes but differ in the links between the nodes and in the data associated with them.
Note that ${M}_{BI}$ is built for biplex networks. However, it can be extended to multiplex networks with k layers $\left({l}_{1},{l}_{2},\dots ,{l}_{k}\right)$, defining the adjacency and data matrices $\left({A}_{1},{A}_{2},\dots ,{A}_{k}\right)$ and $\left({D}_{1},{D}_{2},\dots ,{D}_{k}\right)$.
The matrix
${M}_{BI}$ is
with
and
${M}_{1,2}$,
${M}_{2,1}$ are diagonal matrices with the identity
${I}_{n}$ in its blocks.
The centrality for multiple layers may be denoted as CVPM and will be given by the expression
where
${A}_{G}$ is the global adjacency matrix given by
and
$\mathit{x}$ is the eigenvector of
${M}_{multi}$ associated with the dominant eigenvalue
${\lambda}_{1}$.
3. Results
In this section, we present some numerical examples of the theoretical models studied in
Section 2 for different types of networks and sizes. These examples allow the establishment of characteristics and properties of the centralities developed, with special emphasis on the possibilities offered by an eigenvector centrality for multiplex networks.
As was discussed in
Section 2.2, the way in which the final centrality is calculated in the measures described in this paper differs from the way in which it is calculated in the classical model, as can be seen by comparing the expressions (
1) and (
3). To compare the results of the centralities when applying both expressions, we distinguish between two measures of centrality:
CVP The eigenvector centrality for networks with data, using expression (
3).
CVPclassic The eigenvector centrality with data calculating the centrality using the expression (
1). We will refer to this model as
classic eigenvector centrality with data.
Therefore, the different centralities involved in these examples are:
CVPclassic The classic eigenvector centrality with data.
CVP The eigenvector centrality for networks with data.
CVP2f The eigenvector centrality based on the two-layer approach PageRank idea.
CVPBI The eigenvector centrality for multiplex networks.
All the numerical tests have been carried out by implementing these centralities in R [
28], free software distributed under the terms of the GNU project. It is a language and environment especially efficient for computing and graphics.
First, one-layer (monoplex) networks are used to compare the results obtained for the CVPclassic, CVP, and CVP2f centralities, in order to subsequently discuss the coherence of the measures defined with respect to the traditional eigenvector centrality. Later, some examples of the CVPBI centrality for particular biplex networks are described in detail.
3.1. Monoplex Networks
Let
${\mathcal{G}}_{1}=(\mathcal{N},\mathcal{E})$ be a simple graph with 10 nodes where
$\mathcal{N}=\left\{1,2,\dots ,10\right\}$ and
$\mathcal{E}=\left\{(1,2),(1,3),(2,4),(3,4),(4,5),(5,6),(5,7),(5,8),(6,7),(6,10),(7,8),(7,9),(7,10),(8,9),(9,10)\right\}$. Let us consider the following datasets
${D}_{1},{D}_{2}$ and
${D}_{3}$:
Now, we perform the calculations of the CVPclassic, CVP, and CVP2f eigenvector centralities, using the expression (
1) and the Algorithms 1 and 2, respectively. The results are shown in
Table 1.
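The graph ${\mathcal{G}}_{1}$ can be built directly from its edge list. The following minimal sketch constructs its adjacency matrix and node degrees, assuming the graph is undirected; the names `EDGES`, `adjacency_matrix` and `degree` are introduced only for this illustration.

```python
# Edge list of the 10-node graph G1, with nodes numbered from 1.
EDGES = [(1, 2), (1, 3), (2, 4), (3, 4), (4, 5), (5, 6), (5, 7), (5, 8),
         (6, 7), (6, 10), (7, 8), (7, 9), (7, 10), (8, 9), (9, 10)]


def adjacency_matrix(n, edges):
    """Symmetric 0/1 adjacency matrix of an undirected simple graph."""
    a = [[0] * n for _ in range(n)]
    for u, v in edges:
        a[u - 1][v - 1] = 1
        a[v - 1][u - 1] = 1
    return a


A = adjacency_matrix(10, EDGES)
# Degree of each node, keyed by its 1-based label.
degree = {i + 1: sum(row) for i, row in enumerate(A)}
```

With this edge list, node 7 has degree 5 and node 1 has degree 2, the values used when the results for this network are discussed.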
The numerical results of
Table 1 are represented graphically in
Figure 4. The graphs are drawn on the left, while the values of the centralities are shown in the right column. Note that the size of each vertex in the graphs is proportional to the amount of data associated with it. Thus, for example, in the upper part where the data set
${D}_{1}$ is evaluated, the nodes
$1,3$ and 4 are observed with a larger size, since they have the greatest quantity of data, specifically 8. In the following section a brief analysis of the characteristics of these centralities that emerge from this example is carried out, with special emphasis on the differences between the classical model of eigenvector centrality and that proposed by Agryzkov et al. [
20].
3.2. A Simple Biplex Network
In this section, we study the example of a simple biplex network consisting of 10 nodes and two layers. In this case, the links between the nodes in the different layers have been generated randomly, while the data have been assigned directly to the nodes in a simulated way to establish possible differences in the centrality values for each layer. So, let
${\mathcal{M}}_{1}=({\mathcal{N}}_{1},{\mathcal{E}}_{1},{\mathcal{S}}_{1})$ be a biplex network with nodes
${\mathcal{N}}_{1}=\{1,2,\dots ,10\}$, with layers
${\mathcal{S}}_{1}=({l}_{1},{l}_{2})$ and adjacency matrices
${A}_{1},{A}_{2}$ given by
Let
${D}_{1},{D}_{2}$ be the data vectors for layers
${l}_{1},{l}_{2}$, respectively,
It is observed that in layer 1 the largest amount of data has been assigned to those nodes that have less connectivity, that is, degree 2. In layer 2, however, just the opposite is done: the largest amount of data has been assigned to the nodes that have greater connectivity (degree 5).
In
Table 2, we report the following information about each node of the network: the data
${D}_{1},{D}_{2}$ corresponding to layers
${l}_{1}$ and
${l}_{2}$, respectively, the connectivity in each layer (dg
${l}_{1}$, dg
${l}_{2}$), the eigenvector centrality for layer
${l}_{1}$ (CVPl1), its eigenvector centrality for layer
${l}_{2}$ (CVPl2) and, finally, the eigenvector centrality for the biplex network CVPBI calculated from Algorithm 2. Data in
Table 2 may be visualized by the graphs of
Figure 5,
Figure 6 and
Figure 7 respectively.
Algorithm 2 has been run on this network with these datasets. The results for the centrality are summarized in
Table 2.
3.3. A Jazz Musicians Biplex Network
An example of a biplex network related to the history of jazz is shown in this section. Among the many jazz artists that emerged between 1900 and 1930, 75 have been selected from among the most relevant and influential in the following decades:
Louis Armstrong 1, John Coltrane 2, Charles Mingus 3, Charlie Parker 4, Miles Davis 5, Count Basie 6, Dizzy Gillespie 7, Duke Ellington 8, Ella Fitzgerald 9, Billie Holiday 10, Thelonious Monk 11, Abbey Lincoln 12, Alice Babs 13, Art Blakey 14, Arthur Prysock 15, Artie Shaw 16, Ben Webster 17, Benny Goodman 18, Bill Evans 19, Bing Crosby 20, Blue Mitchell 21, Bud Powell 22, George Buster Cooper 23, Cannonball Adderley 24, Cat Anderson 25, Chet Baker 26, Coleman Hawkins 27, Cootie Williams 28, Dexter Gordon 29, Earl Hines 30, Dave Brubeck 31, Grant Green 32, Hank Mobley 33, Harry Carney 34, Helen Merrill 35, Helen Humes 36, Herbie Hancock 37, Jackie Wilson 38, Jeri Southern 39, Gerry Mulligan 40, Jim Hall 41, Jimmy Hamilton 42, Jimmy Jones 43, Jimmy Rushing 44, Joe Williams 45, Johnny Hartman 46, Johnny Hodges 47, Johnny Smith 48, Kenny Burrell 49, King Oliver 50, Lester Young 51, Max Roach 52, Milt Jackson 53, Nat King Cole 54, Nina Simone 55, Lionel Hampton 56, Oscar Peterson 57, Billy Eckstine 58, Paul Desmond 59, Paul Gonsalves 60, Clifford Brown 61, Russell Procope 62, Sam Woodyard 63, Sammy Davis 64, Sarah Vaughan 65, Fletcher Henderson 66, Sonny Rollins 67, Sonny Stitt 68, Stan Getz 69, Art Tatum 70, Teddy Wilson 71, Clark Terry 72, Tony Bennett 73, Dinah Washington 74, Wes Montgomery 75.
This is a personalized list and, therefore, debatable and improvable. However, the majority of the most influential jazz musicians of all time are in this set of 75 great musicians. Only seven of them are out of the range 1900–1930 but were included for their influence on musicians of later times.
The data collected from these jazz figures are: date of birth, place of birth, instrument and discography. Regarding the discography, three kinds of data have been compiled. First, the number of discs (LPs) commercially released by each artist. Second, the number of appearances of an artist on the discs of other colleagues. Finally, the data referring to the production of singles & EPs by each musician have been extracted from specialized web pages. Part of the data collected in the study is shown in
Table 3.
In addition to these data, a more in-depth study is carried out based on the collaborations between them, where collaboration is understood as joint participation in discs, concerts, etc. Note that we also consider a collaborative relationship if an artist has been part of the band of another artist on the list. The majority of the data has been collected from web pages specialized in jazz, such as
https://www.discogs.com,
https://en.wikipedia.org or
https://www.britannica.com/art/jazz. A map with the geographical location of the artists born in the USA can be seen in
Figure 8.
This work aims to study the most influential jazz musicians of the early twentieth century, taking into account both the professional collaborations between them and the number of artists contemporary to each musician. The data associated with each artist are related to the musical production of the artist throughout his or her professional career. For this purpose, we design a biplex network ${\mathcal{M}}_{2}=({\mathcal{N}}_{2},{\mathcal{E}}_{2},{\mathcal{S}}_{2})$ with nodes ${\mathcal{N}}_{2}=\{1,2,\dots ,75\}$, and layers ${\mathcal{S}}_{2}=({l}_{1},{l}_{2})$. The nodes are the jazz artists from the previous list and the two layers are constructed from the following relationships and data:
 layer 1
the nodes are the 75 artists previously enumerated and the relationships we analyze are the musical collaborations between them. That is, two artists are linked by an edge if they have collaborated on a disc or in a remarkable musical event. The data associated with each node are related to its musical production. In this layer each node has a number representing the quantity of discs commercially released throughout the artist's professional career.
 layer 2
the nodes are the same as in layer 1 but the relationships established between them are related to their contemporaneity. Specifically, a link between two artists is established if their age difference is less than 5 years. The data that accompany each node are also related to its musical production, although now we measure the quantity of singles & EPs commercially released throughout the artist's life.
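The linking rule of layer 2 can be sketched as follows. The sketch approximates the age difference by the difference in birth years, which is an assumption of this example. The five birth years are well-known values used only for illustration and are not taken from the dataset of this study, and the function name `contemporaneity_edges` is hypothetical.

```python
from itertools import combinations


def contemporaneity_edges(birth_year, max_gap=5):
    """Link two artists when their birth years differ by less than `max_gap`."""
    return [
        (a, b)
        for a, b in combinations(sorted(birth_year), 2)
        if abs(birth_year[a] - birth_year[b]) < max_gap
    ]


# Illustrative birth years (not the paper's dataset).
birth_year = {
    "Louis Armstrong": 1901,
    "Duke Ellington": 1899,
    "Charlie Parker": 1920,
    "John Coltrane": 1926,
    "Miles Davis": 1926,
}
edges = contemporaneity_edges(birth_year)
```

Because the inequality is strict, two artists born exactly five years apart would not be linked under this rule.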
In
Figure 9 we have drawn the graphs corresponding to the two layers of the biplex network
${\mathcal{M}}_{2}$. In the left image, the graph of layer 1 has been drawn, where each link represents a collaboration between two jazz artists and the size of the nodes is proportional to the degree they have in the graph. Note that it is an undirected graph with 75 nodes and 386 edges, where the node with the highest degree is Duke Ellington, with 25 collaborations. In the graph of
Figure 9 (right), the graph of layer 2 is shown. Now the idea of establishing relationships between artists is given by their contemporaneity. Thus, we establish a link between two artists if the difference of their ages is less than 5 years. Analogously, the size of the nodes is directly proportional to the degree. It is a graph of 75 nodes and 728 edges, where now the node with the highest degree is John Coltrane, with 32 links.
Table 3 summarizes the whole set of data collected regarding the biplex network of jazz artists of the early twentieth century. This table shows the names of the artists and their identifiers as network nodes. The following three columns show the information related to the musical production of each artist. The column
discs 1 shows the number of discs (LPs) released commercially, the column
discs 2 shows, for each artist, the number of discs of other colleagues in which the artist has appeared and in the third column
singles we have the number of singles released commercially. The next columns
degree1 and
degree2 show the degrees of a node in the graphs of layer 1 and layer 2, respectively. Finally, the last three columns show the results of the calculated centralities. The centrality
CVPl1 refers to the eigenvector centrality taking individually the first layer,
CVPl2 refers to the eigenvector centrality taking individually the second layer, while the CVPBI centrality is shown in the third column, having been calculated running Algorithm 2. These results are analyzed and discussed in the next section.
The biplex centrality for the jazz artists network is displayed in
Figure 10, where the biplex centrality CVPBI is plotted against the individual centralities of each layer.
Figure 10 shows how there is a group of nodes with very high CVPBI centrality and a low centrality in layer 2.
4. Discussion
The way in which we calculate the centralities CVP (eigenvector centrality with data) and CVP2f (eigenvector centrality based on the two-layer PageRank approach) differs from the way in which the classical eigenvector centrality is calculated. When considering the CVP centrality, it is assumed that the importance of the data associated with the node itself may not be negligible in the calculation of its importance within the network. If the expression (
3) is observed, we notice the presence of the component
$\mathit{x}$, which fulfills this function precisely and which does not appear in the classical eigenvector centrality.
The importance of this detail on the computation of centrality is shown in the first network of the results section. On a simple network of 10 nodes, with two clearly differentiated components, three data sets are strategically distributed between the different nodes of the network. We analyze them briefly.
We pay attention to the upper graph corresponding to the data set
${D}_{1}$, which concentrates all the data of the network in the first four nodes. First, it is observed that the results of the three centralities studied are coherent, in the sense that the most relevant nodes coincide in the three measures, maintaining the order of importance of the nodes in all cases. However, certain differences are seen in the values of centrality in those nodes where the data are concentrated. Specifically, the biggest differences between the classic eigenvector centrality and the rest are given in nodes
$1,3$ and 4, which are the ones that concentrate the data. This is a consequence of the way in which eigenvector centrality for networks with data is calculated, taking into account not only the degree of the node but also its own importance based on the data it contains. Observing the graphs shown in
Figure 4, the great similarity in the values of the CVPclassic and CVP2f centralities is clear. Likewise, when the nodes do not have data, the three measures of centrality are practically identical. In the central graph of
Figure 4, corresponding to dataset
${D}_{2}$, it can be seen that the most relevant node in the network is 7, which is one of the two nodes that store the data present in the network. One might think that the second most relevant node would be node 4, which is the other node with 10 data. However, this is not the case, since the second node in importance is node 5. The reason for this behavior is that, although node 5 does not contain data, it is connected to the two nodes that contain all the data of the network (nodes 4 and 7). This case gives an intuitive illustration of the idea on which the eigenvector centrality is based.
In the lower graph of
Figure 4, corresponding to dataset
${D}_{3}$, the importance of connectivity is also seen. Although nodes 1 and 7 have the maximum data, node 7 is the most relevant due to its connectivity (degree 5), compared to node 1, which only has degree 2. The nodes connected to node 7 present a higher centrality because of its greater connectivity. The recurring observation is that the greatest differences in the values of the centralities occur when data are present in the nodes.
This coherence in the values of the centralities studied is not only observed in small networks. Tests have been carried out with networks of different sizes, up to $10,000$ nodes. The literature suggests different alternatives to study the correlations between two rankings; in this case a classic one, the Spearman correlation coefficient, has been chosen to perform the numerical tests. The results are conclusive: in all the cases tested with different sizes, the Spearman coefficient between the variables exceeded $0.9999$, being 1 in most cases for sizes from $n=100$ upwards. This positive correlation is very relevant in this proposal, since it shows that the CVP2f centrality is a solid measure on which to design a new measure for networks with multiple layers.
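For reference, the Spearman coefficient between two centrality vectors can be computed as the Pearson correlation of their ranks. The following is a minimal sketch under that standard definition, with ties receiving average ranks. It is not the R code used for the tests reported here, and the function names and sample vectors are hypothetical.

```python
def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average of positions i..j, 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    sd_x = sum((a - mean_x) ** 2 for a in rx) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in ry) ** 0.5
    return cov / (sd_x * sd_y)


# Hypothetical centrality values for three nodes under two measures.
rho = spearman([0.10, 0.40, 0.30], [1.0, 9.0, 5.0])
```

Two measures that order the nodes identically give a coefficient of 1 even if their numerical values differ, which is why the coefficient is suited to comparing rankings.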
Let us consider the network ${\mathcal{M}}_{1}$ with two layers and 10 nodes. In the first layer ${l}_{1}$ the data are associated with the nodes with less connectivity, while in layer ${l}_{2}$ they are located in the two nodes with greatest connectivity. The influence of data on those nodes with more links is clearly observed. If we analyze the global centrality of the biplex network, it is much closer to the eigenvector centrality calculated for layer ${l}_{2}$ than for layer ${l}_{1}$. In fact, the three most central nodes of the measures CVPl2 and CVPBI are the same, although in a different order in the ranking. However, if we consider separately the centrality of layer ${l}_{1}$, it has nothing to do with the global results when analyzing the network by layers. In layer ${l}_{1}$ the two most relevant nodes do not coincide with the nodes that have more data; however, this does not happen in layer ${l}_{2}$, where the combination of data and degree clearly makes the most central nodes those that accumulate more data.
This shows that when a multilayer network with data is considered and evaluated, the results differ from those obtained when the centrality is applied individually to each of the layers.
Now, we discuss the jazz musicians network described in
Section 3.3. Note that the goal is not only to establish a ranking of musicians of this time based on their collaborations and musical production. If this were the objective, it would be enough to calculate the eigenvector centrality of layer
${l}_{1}$ (CVPl1) and we would have this classification. Instead, the collaborations between artists are combined with the relationships between those who are contemporary with each other. Following the idea of centrality based on the eigenvector concept, we consider that the relevance of an artist is also related to the presence of contemporary artists and, in addition, that the more relevant they are, the more value they provide to the artist's work and production.
A portion of the dataset collected is shown in
Table 3, while the geographical location of the artists’ birth places can be seen in the USA map in
Figure 8, where the large production of artists in the east and southeast of the country is clear.
Regarding the data on musical production, some points are worth highlighting:
Many of the artists with the highest production of LPs and singles are singers, such as Ella Fitzgerald, Billie Holiday, Bing Crosby, Nat King Cole, Nina Simone, Sarah Vaughan or Tony Bennett.
The huge musical production of Bing Crosby is remarkable.
For most artists with a very low musical production, this is a consequence of having been part of other bands, though their importance and influence in later times is undeniable.
If we focus on the artists who have a higher number of collaborations with other musicians, most of them appear in all the lists of the best jazz musicians of all time, such as Dizzy Gillespie, Duke Ellington, Ella Fitzgerald, Miles Davis, Charlie Parker, Stan Getz or Louis Armstrong.
To analyze the data obtained from the jazz artists network,
Table 3 is simplified by taking the 15 nodes with the highest centrality values. Accordingly,
Table 4 summarizes the ranking of nodes for the calculated values of centralities CVPl1, CVPl2 and CVPBI.
An extensive analysis of the results reproduced in
Table 4 and displayed in
Figure 11 is performed.
As already mentioned, if we limit ourselves to the calculation of the eigenvector centrality for networks with data in layer
${l}_{1}$ using Algorithm 1, we obtain a classification of the nodes by importance according to their collaborations with other artists, taking as data their musical production in terms of records. We must bear in mind that not only the number of collaborations is valued as relevant but also their quality, always under the premise that a node is relevant if its contacts are relevant. That is, the importance of the musicians with whom each artist collaborates is measured. Looking at
Figure 11 (top left), the first artist in the ranking for the centrality CVPl1 is Dizzy Gillespie, a key trumpeter in the evolution of jazz to the present. His is the node with the highest degree, that is, with the greatest number of connections with other musicians.
Note that this list contains some of the artists of that time best known by the public, such as Dizzy Gillespie, Duke Ellington, Ella Fitzgerald, Count Basie, Oscar Peterson, Louis Armstrong and others. Other names are not as well known, such as Earl Hines, pianist in Louis Armstrong’s band, whose musical production is remarkable, with 182 albums released.
If the eigenvector centrality for layer 2 is now analyzed, a different pattern is observed. To begin with, there are hardly any names on the list that are familiar to nonspecialists in jazz music. Recall that the artists are now related by contemporaneity. It follows that the artists with higher centrality are those born between 1923 and 1924, years in which many artists of unquestionable quality were born, some of whom are on this list. It is not surprising that several artists have the same centrality: since they were born in the same year, they form similar subgraphs with the same degrees.
Figure 11 (top right) displays the first 15 names in the classification.
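The observation that artists born in the same year obtain the same centrality can be illustrated with a small sketch. It is not the construction used for layer ${l}_{2}$: the linking rule (two artists are connected when their birth years differ by at most `window` years), the birth years and the function names are all assumptions made for this example, and the plain eigenvector centrality is used without any associated data. The power iteration is applied to $A+I$, which has the same eigenvectors as $A$ and avoids oscillations on bipartite graphs.

```python
def contemporaneity_adjacency(birth_years, window):
    """Assumed rule: link i and j (i != j) if their birth years differ by <= window."""
    n = len(birth_years)
    return [
        [
            1.0 if i != j and abs(birth_years[i] - birth_years[j]) <= window else 0.0
            for j in range(n)
        ]
        for i in range(n)
    ]

def eigenvector_centrality(adj, iterations=500):
    """Power iteration on A + I, normalised so that the entries sum to 1."""
    n = len(adj)
    x = [1.0 / n] * n
    for _ in range(iterations):
        # (A + I) x: each node keeps its own value and adds its neighbours' values.
        y = [x[i] + sum(adj[i][j] * x[j] for j in range(n)) for i in range(n)]
        total = sum(y)
        x = [v / total for v in y]
    return x

# Hypothetical birth years; positions 2 and 3 share the year 1923.
years = [1917, 1920, 1923, 1923, 1924, 1926]
adjacency = contemporaneity_adjacency(years, window=3)
centrality = eigenvector_centrality(adjacency)
```

In this toy graph the two nodes with the same birth year have the same neighbours apart from each other, so exchanging them leaves the graph unchanged and their centralities coincide.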
Considering the network as a whole and not individually by layers, the influences of the different relationships between the nodes and the data associated with them are mixed together and the layers interact. Applying Algorithm 2 a ranking is obtained (see
Figure 11, bottom). The top-ranked artist is Dizzy Gillespie, a trumpet virtuoso and improviser. In the 1940s Gillespie, with Charlie Parker, became a major figure in the development of bebop and modern jazz. The second artist in the classification is Oscar Peterson, an exceptional pianist in the history of music. Being born in 1925, being a contemporary of many jazz greats with whom he collaborated actively throughout his career, and having an extensive musical production of both albums and singles lead him to occupy a high position in this ranking. In third place appears Sarah Vaughan, born in 1924. As in the previous case, her enormous musical production and having sung with the most relevant artists in the history of jazz place her in this position. The same behavior repeats with the rest of the artists.
If we compare the three classifications, the names do not match. This is what we expected when considering multiplex networks: the value of the individual centrality does not exactly match the global centrality.
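One simple way to quantify how much two classifications differ is to compare the sets formed by their first $k$ names. The sketch below does this with the Jaccard index; the node labels, the scores and the helper names (`top_k`, `jaccard_overlap`) are hypothetical and are not taken from Table 4.

```python
def top_k(scores, k):
    """Names of the k highest-scoring nodes; ties are broken alphabetically."""
    ordered = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [name for name, _ in ordered[:k]]

def jaccard_overlap(scores_a, scores_b, k):
    """Share of names common to both top-k lists (1.0 means identical sets)."""
    set_a, set_b = set(top_k(scores_a, k)), set(top_k(scores_b, k))
    return len(set_a & set_b) / len(set_a | set_b)

# Hypothetical centrality values for one layer and for the whole network.
layer_scores = {"n1": 0.9, "n2": 0.5, "n3": 0.4, "n4": 0.1}
global_scores = {"n1": 0.6, "n2": 0.7, "n3": 0.2, "n4": 0.5}

layer_top = top_k(layer_scores, 3)
global_top = top_k(global_scores, 3)
overlap = jaccard_overlap(layer_scores, global_scores, 3)
```

The index ignores the order inside each list, so it complements a rank correlation such as the Spearman coefficient and does not replace it.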
This example helps us to understand how the data must be analyzed in the context of the networks and their characteristics. Thus, the analysis of the data collected on musical production shows a clear pattern: the singers on this list have a very high musical production. The most obvious cases are those of Bing Crosby, Ella Fitzgerald and Nat King Cole, who have a singles production of 1181, 590 and 550, respectively, occupying the first three positions if we take this data in isolation. Note that these artists were born before 1919, whereas the great explosion of artists in those decades took place between 1921 and 1927, which penalizes them and prevents them from occupying higher positions in the rankings.
Throughout the example, we see the possibilities that a dataset offers when diverse relations between the nodes of the network are studied. If we had related the artists in another way, the results would probably not be the same, but some of the greatest artists in jazz history would certainly still appear in the final list.
5. Conclusions
In this paper, a centrality measure for biplex networks (CVPBI), based on the eigenvector centrality for networks with data, has been designed and implemented. The advantage of this type of measure is twofold. Firstly, it can determine the importance of the nodes of a network by analysing multiple relationships between the nodes. Secondly, it allows working with several datasets associated with the nodes themselves.
As a preliminary step to the design of the measure for multilayer networks, it has been necessary to adapt the eigenvector centrality for networks with data to the idea underlying the two-layer approach PageRank. Following this technique, a new centrality (CVP2f) is designed by means of the construction of a $2\times 2$ block matrix, where the blocks of the main diagonal serve to separate the effect of the network topology on the data from the quantity of the data. Thus, the first block captures the importance of the network topology, while the second block takes into account the influence of the data at a global or residual level.
In the numerical tests performed on networks of different types and sizes, the values given by the CVP2f measure were coherent with those of the classic eigenvector centrality (CVPclassic) and of the eigenvector centrality for networks with data (CVP). This consistent result has allowed us to generalize to multiplex networks the idea of considering blocks in each of the layers, differentiating the influence of the data according to the network topology from that of the data as a whole (following a reasoning similar to that of the CVP2f centrality).
The centrality proposed for multiplex networks has been tested on a real network of jazz musicians of the early twentieth century. It has demonstrated its ability to evaluate different relationships on the same set of nodes when different datasets are considered. With this particular example, we have shown how introducing a layer structure, by distinguishing different types of interactions between the nodes, may change the behaviour of the network.