Article

Corrected Evolutive Kendall’s τ Coefficients for Incomplete Rankings with Ties: Application to Case of Spotify Lists

by Francisco Pedroche 1,* and J. Alberto Conejero 2
1 Institut de Matemàtica Multidisciplinària, Universitat Politècnica de València, Camí de Vera s/n, 46022 València, Spain
2 Instituto Universitario de Matemática Pura y Aplicada, Universitat Politècnica de València, Camí de Vera s/n, 46022 València, Spain
* Author to whom correspondence should be addressed.
Mathematics 2020, 8(10), 1828; https://doi.org/10.3390/math8101828
Submission received: 23 September 2020 / Revised: 7 October 2020 / Accepted: 11 October 2020 / Published: 18 October 2020
(This article belongs to the Special Issue Mathematical Methods, Modelling and Applications)

Abstract

Mathematical analysis of rankings is essential for a wide range of scientific, public, and industrial applications (e.g., group decision-making, organizational methods, R&D sponsorship, recommender systems, voter systems, sports competitions, grant proposals rankings, web searchers, Internet streaming-on-demand media providers, etc.). Recently, some methods for incomplete aggregate rankings (rankings in which not all the elements are ranked) with ties, based on the classic Kendall’s tau coefficient, have been presented. We are interested in ordinal rankings (that is, we can order the elements to be the first, the second, etc.) allowing ties between the elements (e.g., two elements may be in the first position). We extend a previous coefficient for comparing a series of complete rankings with ties to two new coefficients for comparing a series of incomplete rankings with ties. We make use of the newest definitions of Kendall’s tau extensions. We also offer a theoretical result to interpret these coefficients in terms of the type of interactions that the elements of two consecutive rankings may show (e.g., they preserve their positions, cross their positions, and they are tied in one ranking but untied in the other ranking, etc.). We give some small examples to illustrate all the newly presented parameters and coefficients. We also apply our coefficients to compare some series of Spotify charts, both Top 200 and Viral 50, showing the applicability and utility of the proposed measures.

1. Introduction

The analysis of rankings of scores (cardinal rankings) or, in particular, rankings composed of natural numbers (ordinal rankings) has been approached from different perspectives, depending on the ultimate goal of the researchers or practitioners (see [1]). When the interest is in obtaining a consensus score that summarizes the opinion of various judges, the mathematical tools used usually aim to find a ranking that minimizes a given distance metric (see the seminal paper [2,3] for some properties of different metrics). In such a case, we say that a distance metric minimizes disagreement. We can place in this area the methods called voter systems, ranking aggregation, and others (see the detailed review in [4]).
When the interest is focused on comparing two series of rankings, one of the key points is to obtain a measure that describes the evolution of the series. In this case, we have a series of rankings such that each one of them prioritizes the elements based on the scores obtained at a particular time (see [5]). For example, sports rankings belong to this category. Obviously, at the end of a season, there is no need to find a consensus ranking since, by the nature of sports leagues, it is the last ranking that summarizes the result of the overall season. The same happens with the stock market, the richest people rankings made by Fortune magazine [6], university rankings (e.g., [7,8]), song rankings based on the number of downloads, streams, or sales (see [9]), etc. Our work focuses on the behavior of series of rankings.
The terminology applied to rankings is not unique. For example, in [10] the term partial is used to indicate rankings in which ties are present, while in [11] the term partial indicates that not all the objects are compared. In this paper, we use the terminology coined in [4,12]. We talk of complete rankings when all the objects are compared (as in a football league) and incomplete when there are absent objects (as in a Top k ranking). We explicitly use the terms with ties or without ties to indicate whether we consider the presence of tied objects in the rankings. We recall that in [11] the term linear order is used when all objects are compared and no ties are allowed (that is, for us, complete rankings with no ties) and the term weak ordering when all objects are compared, but ties are allowed (that is, for us, complete rankings with ties).
Incomplete rankings appear in multiple areas. For example, in national or European grant calls, judges evaluate only a subset of the applications, and therefore each judge handles an incomplete ranking. The same happens in literary contests, where each judge only reads a small number of manuscripts. In the case of the results shown by search engines, only the Top k web pages are displayed, which results in an incomplete ranking.
We use, and extend, the results of some previous papers. Some concepts are taken from [5], where a method to compare series of complete rankings with no ties was presented, and from [13], where a method to compare series of complete rankings with ties was analyzed. We also make reference to [14], where some theoretical aspects were studied. In all these works, there are two main ingredients:
1.
The use of generalizations of the classical concept of Kendall’s τ coefficient of disagreement [15,16,17];
2.
The use of graphs associated with the series of rankings as a tool to visualize, and also to help in the definition of, the coefficients that summarize the “behaviour” of the series of rankings.
Regarding extensions of Kendall’s τ coefficient, the first attempt to incorporate an axiomatic distance metric was in [2], followed by the works [11,18,19].
More recently, in [4] these previous works were revised and a new axiomatic framework for incomplete rankings was introduced. To the best of our knowledge, the last paper devoted to an axiomatic study of incomplete rankings is [12], where an extension of Kendall’s τ coefficient to the case of incomplete rankings with ties is presented.
Kendall’s τ has been extensively used, and extensions of it continue to appear in the literature (see, e.g., [10,12,20]). In particular, Kendall’s τ has been recently reviewed for ophthalmic research in [21], and it is a tool used in neuroscience studies (e.g., [22]) and in bioinformatics [23].
Regarding the use of graphs to represent a series of rankings, we recall, in particular, that a graph can be used to describe the crossings between two rankings. This graph is called a permutation graph (see [24,25]). When a graph is defined to show the consecutive crossings between a series of m rankings, it is called a Competitivity graph [5]. This concept corresponds to that of intersection graph of a concatenation of permutation diagrams in graph theory (see [26]). For more relations on graphs associated with rankings, see [14].
In this paper, we take some results of [4,12] as our starting point to develop two coefficients to describe the evolution of a series of $m \geq 2$ incomplete rankings with ties. When applied to the case of only two rankings, our measures reduce to the measures given in [4,12].
We also extend the study of a series of complete rankings with ties developed in [13] to the case of incomplete rankings with ties. We make use of the standard modern notation in the field of rankings mainly based on [10,12,27], among others.
We take as our starting point the definition of $\tau_x$ of [12], which is based on the computation of a sum of the form $\sum_{i=1}^{n}\sum_{j=1}^{n} A_{ij}B_{ij}$ that involves the entries of two matrices $A$ and $B$ indicating the relative positions of the elements of two rankings. In Theorem 1, we give an expression of this sum as a function of the type of interactions between a pair of elements $\{i,j\}$ from one ranking to the next one (e.g., changes from tie to untie, absence of one of the elements in one ranking, crossings, etc.). This result allows for writing $\tau_x$ (and $\hat{\tau}_x$) in terms of the interactions of the elements of the rankings.
On the one hand, this theoretical result allows the sum $\sum_{i=1}^{n}\sum_{j=1}^{n} A_{ij}B_{ij}$ to be computed without explicitly building the matrices involved. On the other hand, it allows for interpreting the interactions of a series of rankings by using a permutation graph or, more generally speaking, a competitivity graph, whose edges are weighted according to the corresponding interactions along the whole series of rankings.
We define two coefficients $\tau_{ev}$ and $\hat{\tau}_{ev}$ for series of incomplete rankings with ties by using an analogy based on previous well-established definitions. We recall that, in the field of incomplete rankings, “intuition” is often invoked to prefer some measures over others since, when handling an incomplete ranking, there is no unique way to interpret the results (see this kind of reasoning in [4,12]). In our case, the behaviour of our measures is checked by ensuring that they are well normalized and that they reduce to well-known cases in limit situations.
Finally, other contributions of the paper are of a practical nature. We give a methodology to study the movements of rankings (of songs) in Spotify by using two different approaches: series of incomplete rankings without ties and series of incomplete rankings with ties.
The structure of the paper is as follows. In Section 2, we recall Kendall’s τ and give the fundamental relations that will be useful throughout the paper. In Section 3, we recall the notation and basic results for the case of two incomplete rankings with ties allowed.
In Section 4, we give the fundamental theoretical result of the paper and some remarks that give insight into both the validity and the application of this result. In Section 5, we recall some definitions from [13] to measure the evolution of m complete rankings with ties. In Section 6, we present two coefficients, denoted as $\tau_{ev}$ and $\hat{\tau}_{ev}$, to characterize the evolution of m incomplete rankings with ties, and some examples are given. In Section 7, we illustrate the applicability of the new coefficients by using some real data obtained from Spotify charts. Finally, in Section 8, we outline the main conclusions of the paper.

2. Preliminaries

In [16] it is shown that Kendall’s τ coefficient (also called measure of disarray) associated with two rankings with the same number of elements n can be written in the form
$$\tau = 1 - \frac{2s}{\frac{1}{2}n(n-1)} \tag{1}$$
where s is the minimum number of interchanges required to transform one ranking into the other. This coefficient is a measure of the intensity of rank correlation. The coefficient can also be written as
$$\tau = \frac{P - Q}{\frac{1}{2}n(n-1)} \tag{2}$$
where P is the number of pairs of elements that maintain their relative order when passing from the first ranking to the second one (that is, the first element is above or below the second in both rankings) and Q is the number of pairs of elements that interchange their order (that is, in one ranking, the first element is above the second and, in the other ranking, the first element is below the second, or vice versa).
Note that Q and s are equal. Furthermore, this quantity can be identified with the number of crossings or inversions when passing from the first ranking to the second. For this reason, throughout the paper, we will keep in mind that Equation (1) gives the equivalence between the number of crossings and the associated τ . This will be important in what follows since we will deal with different extensions of Kendall’s τ coefficient and since one of our preferred tools will be counting the number of crossings, as in [5].
We recall from [27] that a distance metric $d(a,b)$ can be transformed into a correlation coefficient $\tau(a,b)$ by the formula
$$\tau(a,b) = 1 - \frac{2\,d(a,b)}{d_{max}(a,b)} \tag{3}$$
where $d_{max}(a,b)$ is the maximum possible distance between two rankings. We recall that a distance metric between two rankings a and b is a non-negative real function f that is symmetric ($f(a,b) = f(b,a)$ for any pair of rankings), regular ($f(a,b) = 0 \Leftrightarrow a = b$), and satisfies the triangle inequality ($f(a,c) \leq f(a,b) + f(b,c)$ for any rankings a, b, and c). Note that Equation (1) is of this form, since $n(n-1)/2$ is the maximum number of crossings between two given rankings. The same happens with Spearman’s ρ coefficient. In [16], Spearman’s ρ for two ordinal complete rankings $x = (x_1, x_2, \ldots, x_n)$ and $y = (y_1, y_2, \ldots, y_n)$ with $x_i, y_i \in \mathbb{N}$ is defined by
$$\rho = 1 - \frac{6\sum_{i=1}^{n}(x_i - y_i)^2}{n^3 - n}$$
and this is of the form (3), since it is easy to show that the maximum value of $\sum_{i=1}^{n}(x_i - y_i)^2$ occurs when one ranking is the reverse of the other and, as a consequence, the maximum value of the distance metric $d(x,y) = \sum_{i=1}^{n}(x_i - y_i)^2$ is $\frac{1}{3}(n^3 - n)$ (see [3] for this and other properties of distance metrics).
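The claim about the maximum of the squared-difference distance can be checked numerically. The following is a minimal sketch, assuming complete rankings without ties stored as Python lists of positions; the function names `squared_distance` and `spearman_rho` are illustrative and not taken from the paper.

```python
def squared_distance(x, y):
    """Sum of squared position differences between two complete rankings."""
    return sum((xi - yi) ** 2 for xi, yi in zip(x, y))


def spearman_rho(x, y):
    """Spearman's rho written as 1 - 6 * d / (n^3 - n)."""
    n = len(x)
    return 1 - 6 * squared_distance(x, y) / (n**3 - n)


n = 5
x = list(range(1, n + 1))   # ranking 1, 2, ..., n
y = x[::-1]                 # the reverse ranking
d_max = (n**3 - n) / 3      # claimed maximum of the distance

print(squared_distance(x, y), d_max, spearman_rho(x, y))
```

For the reversed ranking the distance reaches the claimed maximum, so ρ takes its lowest value, in line with the general form (3).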
We also recall that a permutation graph (called competitivity graph in [5]) is associated with two rankings over the same elements in such a way that the nodes represent the elements, and two nodes are connected by an edge if they cross their positions when passing from one ranking to the other.
In this way, it is clear that the number of edges of this graph is, precisely, s. Furthermore, another quantity (borrowed from graph theory) is also introduced in [5]: the Normalized Mean Strength $NS$; that is, the normalized sum of the weights of the edges of a weighted graph. When considering only two rankings and their corresponding competitivity graph, we have the following relation
$$NS = \frac{1 - \tau}{2} \tag{4}$$
which gives the equivalence between the Normalized Mean Strength and Kendall’s τ for two rankings. Note that $\tau \in [-1, 1]$ and $NS \in [0, 1]$. We consider that the measure $NS$ is more intuitive than τ since it allows us to interpret the movements or activity of a series of rankings as a percentage.
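The relation between crossings, Kendall’s τ, and the Normalized Mean Strength can be illustrated with a short computation. This is a sketch under the assumption of two complete rankings without ties, each stored as a list whose k-th entry is the position of element k; the helper names are hypothetical.

```python
from itertools import combinations


def crossings(a, b):
    """Count the pairs whose relative order is reversed between a and b."""
    return sum(
        1
        for i, j in combinations(range(len(a)), 2)
        if (a[i] - a[j]) * (b[i] - b[j]) < 0
    )


def kendall_tau(a, b):
    """Equation (1): tau = 1 - 2s / (n(n-1)/2)."""
    n = len(a)
    return 1 - 2 * crossings(a, b) / (n * (n - 1) / 2)


def normalized_mean_strength(a, b):
    """Equation (4): NS = (1 - tau) / 2."""
    return (1 - kendall_tau(a, b)) / 2


# Illustrative data: the first two elements swap positions (one crossing).
a = [1, 2, 3, 4]
b = [2, 1, 3, 4]
print(crossings(a, b), kendall_tau(a, b), normalized_mean_strength(a, b))
```

With one crossing out of six possible pairs, NS equals 1/6, which can be read as the fraction of pairs that changed their relative order.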

3. Coefficients for Two Incomplete Rankings with Ties

In this section, we recall some definitions used in [4,12]. We will use the next three ingredients in order to define a coefficient to compare two rankings:
1.
A vector to define the ordinal ranking (including the description of absent elements and tied elements);
2.
A matrix to indicate the relative positions of the elements of the ranking (including absent and tied elements);
3.
A formula to define the coefficients for a pair of rankings by using the entries of their associated matrices defined in the previous step.
Let $V = \{v_1, v_2, \ldots, v_n\}$ be the objects to be ranked, with $n > 1$. The ranking is given by
$$a = [a_1, a_2, \ldots, a_n] \tag{5}$$
where $a_i$ is the position of $v_i$ in the ranking. Note that if $a_i = a_j$, then $v_i$ and $v_j$ are tied. If $v_i$ is not ranked, then it is denoted as $a_i = \bullet$. We also define the set
$$V_a = \{ v_i \in V \mid a_i \neq \bullet \}.$$
We define an $n \times n$ matrix $A = (A_{ij})$, with entries $A_{ij}$ associated to $a$ as follows:
$$A_{ij} = \begin{cases} 1 & \text{if } a_i \leq a_j \\ -1 & \text{if } a_i > a_j \\ 0 & \text{if } i = j,\ a_i = \bullet, \text{ or } a_j = \bullet \end{cases} \tag{6}$$
According to [12], we define the coefficients
$$\tau_x(a,b) = \frac{\sum_{i=1}^{n}\sum_{j=1}^{n} A_{ij}B_{ij}}{n(n-1)} \tag{7}$$
and, when $\bar{n} > 1$,
$$\hat{\tau}_x(a,b) = \frac{n(n-1)}{\bar{n}(\bar{n}-1)}\,\tau_x(a,b) \tag{8}$$
where $\bar{n}$ is the number of elements $v_i$ ranked in both $a$ and $b$. That is:
$$\bar{n} = |V_a \cap V_b| \tag{9}$$
Example 1.
Let $V = \{1, 2, 3, 4, 5, 6, 7, 8\}$, and let us consider two rankings $a$ and $b$. Then, $a = [6, 4, 5, 5, \bullet, 2, 1, 3]$ represents the incomplete ranking with ties $(7, 6, 8, 2, 3\text{–}4, 1)$, where $3\text{–}4$ indicates tied elements. Analogously, $b = [3, 3, 2, 2, \bullet, 1, \bullet, 4]$ represents the ranking $(6, 3\text{–}4, 1\text{–}2, 8)$. Note that $n = 8$ and $\bar{n} = 6$.
Note that $\tau_x$ with complete rankings and no ties reduces to the classic Kendall’s τ given by (1), while $\hat{\tau}_x$ is a renormalization of $\tau_x$, verifying $|\hat{\tau}_x| \geq |\tau_x|$.
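The three ingredients above (ranking vector, matrix of relative positions, and coefficient) can be sketched directly in code. The following is a minimal illustration, assuming that an absent element (•) is encoded as `None`; the function names are hypothetical, and the data are the vectors of Example 1.

```python
ABSENT = None  # stands for the bullet symbol used for unranked elements


def relative_matrix(a):
    """Matrix (6): 1 if a_i <= a_j, -1 if a_i > a_j, 0 on the diagonal or if absent."""
    n = len(a)
    A = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j or a[i] is ABSENT or a[j] is ABSENT:
                continue
            A[i][j] = 1 if a[i] <= a[j] else -1
    return A


def n_common(a, b):
    """Number of elements ranked in both a and b, as in (9)."""
    return sum(1 for x, y in zip(a, b) if x is not ABSENT and y is not ABSENT)


def tau_x(a, b):
    """Coefficient (7): sum of A_ij * B_ij divided by n(n-1)."""
    n = len(a)
    A, B = relative_matrix(a), relative_matrix(b)
    total = sum(A[i][j] * B[i][j] for i in range(n) for j in range(n))
    return total / (n * (n - 1))


def tau_x_hat(a, b):
    """Coefficient (8): tau_x rescaled by n(n-1) / (nbar(nbar-1)); needs nbar > 1."""
    n, nbar = len(a), n_common(a, b)
    if nbar <= 1:
        raise ValueError("tau_x_hat is only defined when nbar > 1")
    return n * (n - 1) / (nbar * (nbar - 1)) * tau_x(a, b)


a = [6, 4, 5, 5, ABSENT, 2, 1, 3]
b = [3, 3, 2, 2, ABSENT, 1, ABSENT, 4]
print(n_common(a, b), tau_x(a, b), tau_x_hat(a, b))
```

Since the rescaling factor is at least 1, the printed value of `tau_x_hat` is never smaller in absolute value than that of `tau_x`.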
As we will see, Definition 6 in Section 6 is based on an analogy with Equation (1). To that end, it will be necessary to count all the possible cases when passing from $a$ to $b$ (interactions between the relative positions of pairs of elements, such as crossings, passing from tie to untie, leaving the ranking, etc.). We do this in the next section.

4. Main Result

The following result is the fundamental theoretical result of this paper. It will allow us to write $\tau_x$ and $\hat{\tau}_x$ in terms of the interactions of the rankings’ elements. It opens the possibility of giving weights to the interactions, as is common practice in modern definitions of Kendall’s tau [10]. This result also constitutes our starting point to define a coefficient for a series of more than two incomplete rankings, and it gives insight into the differences between $\tau_x$ and $\hat{\tau}_x$. Some other consequences are detailed in the remarks below and in Corollary 1.
Theorem 1.
Given two vectors $a$, $b$ representing incomplete rankings of $n$ elements with ties, represented as in (5), and their corresponding matrices $A = (A_{ij})$ and $B = (B_{ij})$ defined by (6), it holds that
$$\sum_{i=1}^{n}\sum_{j=1}^{n} A_{ij}B_{ij} = n(n-1) - 4s - 2n_{tu} - 2N_{inc} \tag{10}$$
where
$$N_{inc} = \binom{n_{\bullet\bullet}}{2} + \binom{n_{\bullet\ast}}{2} + \binom{n_{\ast\bullet}}{2} + n_{\bullet\bullet}\,(n_{\bullet\ast} + n_{\ast\bullet} + n_{\ast\ast}) + n_{\ast\ast}\,(n_{\bullet\ast} + n_{\ast\bullet}) + n_{\bullet\ast}\,n_{\ast\bullet} \tag{11}$$
  • $s$ is the number of crossings, that is, the number of pairs $\{i,j\}$ such that $a_i < a_j$ and $b_i > b_j$, or $a_i > a_j$ and $b_i < b_j$.
  • $n_{tu}$ is the number of pairs that are tied in only one ranking (from tie to untie or vice versa), that is, such that $a_i = a_j$ and $b_i \neq b_j$, or $a_i \neq a_j$ and $b_i = b_j$.
In the definitions of $s$ and $n_{tu}$, it is assumed that $a_i$, $a_j$, $b_i$, and $b_j$ are different from •. For the cases when one or more • may appear, the following notation holds:
  • $n_{\bullet\bullet}$ is the number of entries such that $a_i = b_i = \bullet$;
  • $n_{\bullet\ast}$ is the number of entries such that $a_i = \bullet$ and $b_i \neq \bullet$;
  • $n_{\ast\bullet}$ is the number of entries such that $a_i \neq \bullet$ and $b_i = \bullet$.
Finally, we also need to define $n_{\ast\ast}$ as the number of entries such that $a_i \neq \bullet$ and $b_i \neq \bullet$.
Proof of Theorem 1. 
For each pair $\{i,j\}$ we will evaluate the term $A_{ij}B_{ij} + A_{ji}B_{ji}$ in the expression $\sum_{i=1}^{n}\sum_{j=1}^{n} A_{ij}B_{ij}$. The case $i = j$ gives $A_{ii}B_{ii} + A_{ii}B_{ii} = 0$.
Thus, we focus on pairs $\{i,j\}$ with $i \neq j$. There is a total of $n(n-1)/2$ such pairs. It is useful to consider the basic cell of the pair $\{i,j\}$ with $i < j$:
$$\begin{pmatrix} a_i & b_i \\ a_j & b_j \end{pmatrix}$$
where $a_k$ and $b_k$ can be natural numbers or a • if the element $k$ is not ranked in $a$ or $b$.
Let us study first the cases that can appear when no • is present in the basic cell.
The Complete Case (C):
That is, $a_k \neq \bullet$ and $b_k \neq \bullet$ for all $k \in \{1, 2, \ldots, n\}$. We distinguish four types of basic cells.
Type C.1: Not crossing, and no ties in a nor in b.
For example:
$$\begin{pmatrix} 1 & 3 \\ 2 & 4 \end{pmatrix} \quad \text{or} \quad \begin{pmatrix} 2 & 4 \\ 1 & 3 \end{pmatrix}.$$
Thus, we have $a_i \neq a_j$ and $b_i \neq b_j$, and two cases can appear:
C.1.1. If $a_i < a_j$ and $b_i < b_j$, then $A_{ij}B_{ij} + A_{ji}B_{ji} = 1 \cdot 1 + (-1) \cdot (-1) = 2$.
C.1.2. If $a_i > a_j$ and $b_i > b_j$, then $A_{ij}B_{ij} + A_{ji}B_{ji} = (-1) \cdot (-1) + 1 \cdot 1 = 2$.
Type C.2: Crossing.
For example:
$$\begin{pmatrix} 1 & 4 \\ 2 & 3 \end{pmatrix} \quad \text{or} \quad \begin{pmatrix} 2 & 3 \\ 1 & 4 \end{pmatrix}.$$
Again, we have $a_i \neq a_j$ and $b_i \neq b_j$, and two more cases can appear:
C.2.1. If $a_i < a_j$ and $b_i > b_j$, then $A_{ij}B_{ij} + A_{ji}B_{ji} = 1 \cdot (-1) + (-1) \cdot 1 = -2$.
C.2.2. If $a_i > a_j$ and $b_i < b_j$, then $A_{ij}B_{ij} + A_{ji}B_{ji} = (-1) \cdot 1 + 1 \cdot (-1) = -2$.
Type C.3: From tie to untie or vice versa.
For example:
$$\begin{pmatrix} 1 & 3 \\ 1 & 4 \end{pmatrix}, \quad \begin{pmatrix} 1 & 4 \\ 1 & 3 \end{pmatrix}, \quad \begin{pmatrix} 3 & 1 \\ 4 & 1 \end{pmatrix}, \quad \text{or} \quad \begin{pmatrix} 4 & 1 \\ 3 & 1 \end{pmatrix}$$
We have $a_i = a_j$ and $b_i \neq b_j$, or $a_i \neq a_j$ and $b_i = b_j$. Therefore, four cases can appear:
C.3.1. If $a_i = a_j$ and $b_i < b_j$, then $A_{ij}B_{ij} + A_{ji}B_{ji} = 1 \cdot 1 + 1 \cdot (-1) = 0$.
C.3.2. If $a_i = a_j$ and $b_i > b_j$, then $A_{ij}B_{ij} + A_{ji}B_{ji} = 1 \cdot (-1) + 1 \cdot 1 = 0$.
C.3.3. If $a_i < a_j$ and $b_i = b_j$, then $A_{ij}B_{ij} + A_{ji}B_{ji} = 1 \cdot 1 + (-1) \cdot 1 = 0$.
C.3.4. If $a_i > a_j$ and $b_i = b_j$, then $A_{ij}B_{ij} + A_{ji}B_{ji} = (-1) \cdot 1 + 1 \cdot 1 = 0$.
Type C.4: From tie to tie.
For example:
$$\begin{pmatrix} 1 & 2 \\ 1 & 2 \end{pmatrix}$$
That is, we have $a_i = a_j$ and $b_i = b_j$, and then $A_{ij}B_{ij} + A_{ji}B_{ji} = 1 \cdot 1 + 1 \cdot 1 = 2$.
We denote the number of pairs of each case using the terminology of Table 1: $n_{nc}$ for the non-crossing pairs of Type C.1, $s$ for the crossings of Type C.2, $n_{tu}$ for Type C.3, and $n_{tt}$ for Type C.4. Note that $n_{tt}$ is the number of pairs that are tied in both rankings, that is, such that $a_i = a_j$ and $b_i = b_j$. Note also that $n_{tu}$ is the number of pairs that go from tie to untie or vice versa.
The Incomplete Case (I):
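There is at least one • in the basic cell. In other words, there is some $k \in \{i, j\}$ such that $a_k = \bullet$, or $b_k = \bullet$, or both. We distinguish seven cases:
Type I.1: Four •. That is, $a_i = a_j = b_i = b_j = \bullet$, or graphically
$$\begin{pmatrix} \bullet & \bullet \\ \bullet & \bullet \end{pmatrix}$$
Then $A_{ij}B_{ij} + A_{ji}B_{ji} = 0 \cdot 0 + 0 \cdot 0 = 0$. Let us denote by $n_{\bullet\bullet}$ the number of null rows, of the form $(\bullet\ \bullet)$, that appear in the matrix with columns $a$ and $b$. Therefore, we have $\binom{n_{\bullet\bullet}}{2}$ pairs $\{i,j\}$ of this type.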
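Type I.2: Three •. That is, a cell of one of these forms
$$\begin{pmatrix} \ast & \bullet \\ \bullet & \bullet \end{pmatrix}, \quad \begin{pmatrix} \bullet & \ast \\ \bullet & \bullet \end{pmatrix}, \quad \begin{pmatrix} \bullet & \bullet \\ \ast & \bullet \end{pmatrix}, \quad \text{or} \quad \begin{pmatrix} \bullet & \bullet \\ \bullet & \ast \end{pmatrix}$$
where ∗ is a number (not a •). Therefore, we have four cases, but all are similar to this one: $a_i \neq \bullet$ and $a_j = b_i = b_j = \bullet$. Then, $A_{ij}B_{ij} + A_{ji}B_{ji} = 0 \cdot 0 + 0 \cdot 0 = 0$.
Denoting by $n_{\ast\bullet}$ the number of rows of the form $(\ast\ \bullet)$ in the $n \times 2$ matrix $(a\ b)$, and by $n_{\bullet\ast}$ the number of rows of the form $(\bullet\ \ast)$ in the same matrix, it is clear that the number of pairs $\{i,j\}$ of this type is $n_{\bullet\bullet}\,(n_{\ast\bullet} + n_{\bullet\ast})$.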
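Type I.3: Two •, one on each ranking. That is, any cell of one of these forms
$$\begin{pmatrix} \bullet & \bullet \\ \ast & \ast \end{pmatrix}, \quad \begin{pmatrix} \ast & \ast \\ \bullet & \bullet \end{pmatrix}, \quad \begin{pmatrix} \bullet & \ast \\ \ast & \bullet \end{pmatrix}, \quad \text{or} \quad \begin{pmatrix} \ast & \bullet \\ \bullet & \ast \end{pmatrix}$$
These four cases can be reduced to two:
I.3.1. If $a_i = b_i = \bullet$, $a_j \neq \bullet$ and $b_j \neq \bullet$, then $A_{ij}B_{ij} + A_{ji}B_{ji} = 0 \cdot 0 + 0 \cdot 0 = 0$.
I.3.2. If $a_i = \bullet$, $a_j \neq \bullet$, $b_i \neq \bullet$ and $b_j = \bullet$, then $A_{ij}B_{ij} + A_{ji}B_{ji} = 0 \cdot 0 + 0 \cdot 0 = 0$.
Denoting by $n_{\ast\ast}$ the number of rows of the form $(\ast\ \ast)$ in the $n \times 2$ matrix $(a\ b)$, it is clear that the number of pairs $\{i,j\}$ of this type is $n_{\bullet\bullet}\,n_{\ast\ast} + n_{\bullet\ast}\,n_{\ast\bullet}$.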
Type I.4: Tied in one ranking and two • in the other. For example,
$$\begin{pmatrix} 1 & \bullet \\ 1 & \bullet \end{pmatrix}, \quad \begin{pmatrix} \bullet & 1 \\ \bullet & 1 \end{pmatrix}$$
That is, we have two cases, which are similar to this one: $a_i = a_j$ and $b_i = b_j = \bullet$, and then $A_{ij}B_{ij} + A_{ji}B_{ji} = 0 \cdot 0 + 0 \cdot 0 = 0$.
Let us denote by $n_a$ the number of different natural numbers in $a$ and by $n_b$ the number of different natural numbers in $b$. Let $n_{i\bullet}$ be the number of rows of the form $(i, \bullet)$ in the matrix $(a\ b)$, for $i = 1, \ldots, n_a$ and, analogously, let $n_{\bullet i}$ be the number of rows of the form $(\bullet, i)$ in the matrix $(a\ b)$, for $i = 1, \ldots, n_b$. Then, it is straightforward to see that the number of cases of this type is given by
$$\sum_{i=1}^{n_a} \binom{n_{i\bullet}}{2} + \sum_{i=1}^{n_b} \binom{n_{\bullet i}}{2}.$$
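Type I.5: Tied in one ranking, one • in the other. For example
$$\begin{pmatrix} 1 & \bullet \\ 1 & 2 \end{pmatrix}, \quad \begin{pmatrix} 1 & 2 \\ 1 & \bullet \end{pmatrix}, \quad \begin{pmatrix} \bullet & 1 \\ 2 & 1 \end{pmatrix}, \quad \begin{pmatrix} 2 & 1 \\ \bullet & 1 \end{pmatrix}.$$
We have the following four cases:
I.5.1. If $a_i = a_j$, $b_i = \bullet$ and $b_j \neq \bullet$, then $A_{ij}B_{ij} + A_{ji}B_{ji} = 0 \cdot 0 + 0 \cdot 0 = 0$.
I.5.2. If $a_i = a_j$, $b_i \neq \bullet$ and $b_j = \bullet$, then $A_{ij}B_{ij} + A_{ji}B_{ji} = 0 \cdot 0 + 0 \cdot 0 = 0$.
I.5.3. If $a_i = \bullet$, $a_j \neq \bullet$ and $b_i = b_j$, then $A_{ij}B_{ij} + A_{ji}B_{ji} = 0 \cdot 0 + 0 \cdot 0 = 0$.
I.5.4. If $a_i \neq \bullet$, $a_j = \bullet$ and $b_i = b_j$, then $A_{ij}B_{ij} + A_{ji}B_{ji} = 0 \cdot 0 + 0 \cdot 0 = 0$.
Let $n_{i\ast}$ be the number of rows of the form $(i, \ast)$ (where ∗ can be $i$) in the matrix $(a\ b)$, with $i \in \{1, 2, \ldots, n_a\}$.
Analogously, let $n_{\ast i}$ be the number of rows of the form $(\ast, i)$ (where ∗ can be $i$) in the matrix $(a\ b)$, with $i \in \{1, 2, \ldots, n_b\}$. Then, it is straightforward to see that the number of cases of this type is given by
$$\sum_{i=1}^{n_a} n_{i\bullet}\,n_{i\ast} + \sum_{i=1}^{n_b} n_{\bullet i}\,n_{\ast i}.$$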
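Type I.6: Two • in one ranking and different numbers in the other.
For example
$$\begin{pmatrix} 1 & \bullet \\ 2 & \bullet \end{pmatrix}, \quad \begin{pmatrix} \bullet & 2 \\ \bullet & 1 \end{pmatrix}$$
We have here only two cases:
I.6.1. If $a_i \neq a_j$ and $b_i = b_j = \bullet$, then $A_{ij}B_{ij} + A_{ji}B_{ji} = (\pm 1) \cdot 0 + (\pm 1) \cdot 0 = 0$.
I.6.2. If $a_i = a_j = \bullet$ and $b_i \neq b_j$, then $A_{ij}B_{ij} + A_{ji}B_{ji} = 0 \cdot (\pm 1) + 0 \cdot (\pm 1) = 0$.
Then, it is easy to see that the number of pairs $\{i,j\}$ of this type is
$$\binom{n_{\ast\bullet}}{2} + \binom{n_{\bullet\ast}}{2} - \sum_{i=1}^{n_a} \binom{n_{i\bullet}}{2} - \sum_{i=1}^{n_b} \binom{n_{\bullet i}}{2}$$
where we have subtracted the number of cases of the type I.4.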
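Type I.7: Only one • and no ties.
For example, they are cases of the form
$$\begin{pmatrix} 1 & 1 \\ 2 & \bullet \end{pmatrix}, \quad \begin{pmatrix} 1 & \bullet \\ 2 & 1 \end{pmatrix}, \quad \begin{pmatrix} 1 & 1 \\ \bullet & 2 \end{pmatrix}, \quad \begin{pmatrix} \bullet & 2 \\ 1 & 1 \end{pmatrix}$$
We can have four cases that are similar to these:
If $a_i < a_j$, $b_i \neq \bullet$ and $b_j = \bullet$, then $A_{ij}B_{ij} + A_{ji}B_{ji} = 1 \cdot 0 + (-1) \cdot 0 = 0$.
If $a_i > a_j$, $b_i \neq \bullet$ and $b_j = \bullet$, then $A_{ij}B_{ij} + A_{ji}B_{ji} = (-1) \cdot 0 + 1 \cdot 0 = 0$.
Let $n_{i\ast}$ be the number of rows of the form $(i, \ast)$ (where ∗ can be $i$) in the matrix $(a\ b)$, with $i \in \{1, 2, \ldots, n_a\}$ and, analogously, let $n_{\ast i}$ be the number of rows of the form $(\ast, i)$ (where ∗ can be $i$) in the matrix $(a\ b)$, with $i \in \{1, 2, \ldots, n_b\}$. Then, the number of pairs $\{i,j\}$ of this type is given by
$$n_{\ast\ast}\,(n_{\bullet\ast} + n_{\ast\bullet}) - \sum_{i=1}^{n_a} n_{i\bullet}\,n_{i\ast} - \sum_{i=1}^{n_b} n_{\bullet i}\,n_{\ast i}$$
where we have subtracted the number of cases of the type I.5.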
In Table 2 we overview the number of cases for each type of the incomplete case.
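The seven incomplete types can be told apart mechanically by counting how many • fall in each column of the basic cell. The sketch below is one possible way to do this, assuming • is encoded as `None`; the function name `incomplete_type` is hypothetical, and the vectors are those used in Example 2 of this section.

```python
from collections import Counter
from itertools import combinations

ABSENT = None  # stands for the bullet symbol


def incomplete_type(ai, bi, aj, bj):
    """Return the label I.1 ... I.7 of a basic cell, or None for a complete cell."""
    missing_a = (ai is ABSENT) + (aj is ABSENT)
    missing_b = (bi is ABSENT) + (bj is ABSENT)
    missing = missing_a + missing_b
    if missing == 0:
        return None
    if missing == 4:
        return "I.1"
    if missing == 3:
        return "I.2"
    if missing == 2:
        if missing_a == 1:          # one bullet in each ranking
            return "I.3"
        # both bullets in the same ranking: look at the other ranking
        x, y = (bi, bj) if missing_a == 2 else (ai, aj)
        return "I.4" if x == y else "I.6"
    # exactly one bullet: the other ranking has both elements ranked
    x, y = (bi, bj) if missing_a == 1 else (ai, aj)
    return "I.5" if x == y else "I.7"


a = [1, ABSENT, 2, ABSENT, 3, 2, ABSENT, ABSENT, ABSENT, 1]
b = [2, ABSENT, 4, 2, ABSENT, 1, 3, 3, ABSENT, 2]

counts = Counter()
for i, j in combinations(range(len(a)), 2):
    label = incomplete_type(a[i], b[i], a[j], b[j])
    if label is not None:
        counts[label] += 1

for label in sorted(counts):
    print(label, counts[label])
print("N_inc =", sum(counts.values()))
```

Summing the seven counts gives the total number of incomplete pairs, which is the quantity $N_{inc}$ that enters Theorem 1.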
To end the proof, we add the contributions of all the cases, complete (C) and incomplete (I), to the sum $\sum_{i=1}^{n}\sum_{j=1}^{n} A_{ij}B_{ij}$ and we obtain
$$\sum_{i=1}^{n}\sum_{j=1}^{n} A_{ij}B_{ij} = 2n_{nc} - 2s + 2n_{tt} \tag{12}$$
Now, taking into account that all the cases must add up to the total number of pairs, we have
$$\frac{n(n-1)}{2} = n_{nc} + s + n_{tt} + n_{tu} + N_{inc} \tag{13}$$
where $N_{inc}$ is the sum of all the cases in Table 2. By plugging $n_{nc} = \frac{n(n-1)}{2} - s - n_{tt} - n_{tu} - N_{inc}$ into (12), we finally get
$$\sum_{i=1}^{n}\sum_{j=1}^{n} A_{ij}B_{ij} = n(n-1) - 4s - 2n_{tu} - 2N_{inc}$$
where
$$N_{inc} = \binom{n_{\bullet\bullet}}{2} + \binom{n_{\bullet\ast}}{2} + \binom{n_{\ast\bullet}}{2} + n_{\bullet\bullet}\,(n_{\bullet\ast} + n_{\ast\bullet} + n_{\ast\ast}) + n_{\ast\ast}\,(n_{\bullet\ast} + n_{\ast\bullet}) + n_{\bullet\ast}\,n_{\ast\bullet}$$
In the next example, we illustrate the previous result.
Example 2.
Given the rankings $a = [1, \bullet, 2, \bullet, 3, 2, \bullet, \bullet, \bullet, 1]$ and $b = [2, \bullet, 4, 2, \bullet, 1, 3, 3, \bullet, 2]$, then $n = 10$, $n_{\bullet\bullet} = 2$, $n_{\bullet\ast} = 3$, $n_{\ast\bullet} = 1$, $n_{\ast\ast} = 4$, $s = 2$ (corresponding to the pairs $\{1, 6\}$ and $\{6, 10\}$), $n_{tu} = 1$ (corresponding to the pair $\{3, 6\}$), $n_{tt} = 1$ (corresponding to the pair $\{1, 10\}$), $n_a = 3$, $n_b = 4$, $n_{1\bullet} = n_{2\bullet} = 0$, $n_{3\bullet} = 1$, $n_{\bullet 1} = 0$, $n_{\bullet 2} = 1$, $n_{\bullet 3} = 2$, $n_{\bullet 4} = 0$, $n_{1\ast} = 2$, $n_{2\ast} = 2$, $n_{3\ast} = 0$, $n_{\ast 1} = 1$, $n_{\ast 2} = 2$, $n_{\ast 3} = 0$, and $n_{\ast 4} = 1$.
From the parameters of Table 3, we obtain $N_{inc} = 39$. Thus, it is easy to check that $\sum_{i=1}^{n}\sum_{j=1}^{n} A_{ij}B_{ij} = n(n-1) - 4s - 2n_{tu} - 2N_{inc} = 2$, as stated in Theorem 1.
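The number of pairs $\{i,j\}$ is 45. Writing each basic cell as [a_i b_i; a_j b_j], that is, first row $(a_i, b_i)$ and second row $(a_j, b_j)$, they correspond to the following cells, listed for each $i$ with $j = i+1, \ldots, 10$:
i = 1: [1 2; • •], [1 2; 2 4], [1 2; • 2], [1 2; 3 •], [1 2; 2 1], [1 2; • 3], [1 2; • 3], [1 2; • •], [1 2; 1 2]
i = 2: [• •; 2 4], [• •; • 2], [• •; 3 •], [• •; 2 1], [• •; • 3], [• •; • 3], [• •; • •], [• •; 1 2]
i = 3: [2 4; • 2], [2 4; 3 •], [2 4; 2 1], [2 4; • 3], [2 4; • 3], [2 4; • •], [2 4; 1 2]
i = 4: [• 2; 3 •], [• 2; 2 1], [• 2; • 3], [• 2; • 3], [• 2; • •], [• 2; 1 2]
i = 5: [3 •; 2 1], [3 •; • 3], [3 •; • 3], [3 •; • •], [3 •; 1 2]
i = 6: [2 1; • 3], [2 1; • 3], [2 1; • •], [2 1; 1 2]
i = 7: [• 3; • 3], [• 3; • •], [• 3; 1 2]
i = 8: [• 3; • •], [• 3; 1 2]
i = 9: [• •; 1 2]
The number of cases of each type for the incomplete case appearing in Theorem 1 is shown in Table 3.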
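The identity of Theorem 1 can also be checked numerically on the vectors of Example 2, by comparing the direct double sum over the matrix entries with the expression in terms of $s$, $n_{tu}$, and $N_{inc}$. This is a minimal sketch, assuming • is encoded as `None`; all function and variable names are illustrative.

```python
from itertools import combinations
from math import comb

ABSENT = None  # stands for the bullet symbol


def entry(x, y, same_index):
    """Entry of the matrix (6) for positions x and y."""
    if same_index or x is ABSENT or y is ABSENT:
        return 0
    return 1 if x <= y else -1


def matrix_sum(a, b):
    """Left-hand side of (10): direct double sum of A_ij * B_ij."""
    n = len(a)
    return sum(
        entry(a[i], a[j], i == j) * entry(b[i], b[j], i == j)
        for i in range(n)
        for j in range(n)
    )


def interaction_counts(a, b):
    """Return (s, n_tu, N_inc), with N_inc computed from (11)."""
    rows = list(zip(a, b))
    n_dd = sum(x is ABSENT and y is ABSENT for x, y in rows)          # (., .)
    n_ds = sum(x is ABSENT and y is not ABSENT for x, y in rows)      # (., *)
    n_sd = sum(x is not ABSENT and y is ABSENT for x, y in rows)      # (*, .)
    ranked = [k for k, (x, y) in enumerate(rows)
              if x is not ABSENT and y is not ABSENT]
    n_ss = len(ranked)                                                # (*, *)
    s = n_tu = 0
    for i, j in combinations(ranked, 2):
        da, db = a[i] - a[j], b[i] - b[j]
        if da * db < 0:
            s += 1                       # crossing
        elif (da == 0) != (db == 0):
            n_tu += 1                    # tied in exactly one ranking
    n_inc = (comb(n_dd, 2) + comb(n_ds, 2) + comb(n_sd, 2)
             + n_dd * (n_ds + n_sd + n_ss)
             + n_ss * (n_ds + n_sd)
             + n_ds * n_sd)
    return s, n_tu, n_inc


a = [1, ABSENT, 2, ABSENT, 3, 2, ABSENT, ABSENT, ABSENT, 1]
b = [2, ABSENT, 4, 2, ABSENT, 1, 3, 3, ABSENT, 2]

n = len(a)
s, n_tu, n_inc = interaction_counts(a, b)
lhs = matrix_sum(a, b)
rhs = n * (n - 1) - 4 * s - 2 * n_tu - 2 * n_inc
print(s, n_tu, n_inc, lhs, rhs)
```

The right-hand side only needs a pass over the pairs of commonly ranked elements plus four row counts, which is what makes the interaction-based expression convenient when the matrices are large.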
Remark 1.
By using (10) and (7) we obtain
$$\tau_x = 1 - \frac{4\left(s + \frac{1}{2}n_{tu}\right) + 2N_{inc}}{n(n-1)} \tag{14}$$
which can be thought of as an extension of (1) to the case of two incomplete rankings with ties. This formula is one of the original contributions of this paper. Note that the term $N_{inc}$ is known since it is given by (11). This formula will be useful in Section 6 to define our measure of correlation for a series of incomplete rankings with ties.
Remark 2.
For two complete rankings with ties allowed, Equation (10) simplifies to
$$\sum_{i=1}^{n}\sum_{j=1}^{n} A_{ij}B_{ij} = n(n-1) - 4s - 2n_{tu} \tag{15}$$
If we recall the definition of the distance of Kemeny and Snell [2], which depends on a matrix $C(a) = (C_{ij}(a))$ such that
$$C_{ij}(a) = \begin{cases} 1 & \text{if element } i \text{ is preferred to element } j \\ -1 & \text{if element } j \text{ is preferred to element } i \\ 0 & \text{if } i = j, \text{ or if both elements } i \text{ and } j \text{ are tied} \end{cases}$$
then, by following a similar procedure as in the proof of Theorem 1, it is easy to show that
$$\sum_{i \neq j} |C_{ij}(a) - C_{ij}(b)| = 4s + 2n_{tu}$$
and by using (15) we get
$$\sum_{i=1}^{n}\sum_{j=1}^{n} A_{ij}B_{ij} = n(n-1) - \sum_{i \neq j} |C_{ij}(a) - C_{ij}(b)|$$
which is in agreement with the results shown in [27], but here we obtain it as a particular case of Theorem 1.
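The relation between the Kemeny and Snell matrices and the sum $\sum A_{ij}B_{ij}$ can be verified on a small case. The sketch below assumes two complete rankings with ties and takes the entry-wise absolute differences $|C_{ij}(a) - C_{ij}(b)|$ (this indexing is an assumption of the sketch); the rankings and function names are illustrative and not taken from the paper.

```python
def kemeny_snell_matrix(a):
    """C[i][j] = 1 if i is preferred to j (a_i < a_j), -1 if j is preferred, 0 otherwise."""
    n = len(a)
    return [[(a[i] < a[j]) - (a[i] > a[j]) for j in range(n)] for i in range(n)]


def tau_x_matrix(a):
    """Matrix (6) for a complete ranking: ties count as 1, diagonal is 0."""
    n = len(a)
    return [[0 if i == j else (1 if a[i] <= a[j] else -1) for j in range(n)]
            for i in range(n)]


def kemeny_snell_sum(a, b):
    """Sum over all ordered pairs of |C_ij(a) - C_ij(b)|."""
    Ca, Cb = kemeny_snell_matrix(a), kemeny_snell_matrix(b)
    n = len(a)
    return sum(abs(Ca[i][j] - Cb[i][j]) for i in range(n) for j in range(n))


def product_sum(a, b):
    """Double sum of A_ij * B_ij."""
    A, B = tau_x_matrix(a), tau_x_matrix(b)
    n = len(a)
    return sum(A[i][j] * B[i][j] for i in range(n) for j in range(n))


# Illustrative complete rankings with ties: one crossing ({1,3}) and two
# pairs tied in only one ranking ({1,2} and {2,3}), so 4s + 2 n_tu = 8.
a = [1, 1, 2, 3]
b = [2, 1, 1, 3]
n = len(a)
print(kemeny_snell_sum(a, b), product_sum(a, b), n * (n - 1) - kemeny_snell_sum(a, b))
```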
Remark 3.
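The number of commonly ranked elements in $a$ and $b$, denoted as $\bar{n}$ in (9), is precisely $n_{\ast\ast}$. Let us check that $N_{inc}$ given by (11) can be rewritten as
$$N_{inc} = \binom{n}{2} - \binom{\bar{n}}{2}$$
To that end, we use that $n_{\ast\ast} = \bar{n}$ and
$$n_{\bullet\bullet} + n_{\bullet\ast} + n_{\ast\bullet} = n - \bar{n} \tag{20}$$
First, we note that
$$\binom{n_{\bullet\bullet}}{2} + \binom{n_{\bullet\ast}}{2} + \binom{n_{\ast\bullet}}{2} = \frac{1}{2}\left[ n_{\bullet\bullet}^2 + n_{\bullet\ast}^2 + n_{\ast\bullet}^2 - (n_{\bullet\bullet} + n_{\bullet\ast} + n_{\ast\bullet}) \right] = \frac{1}{2}\left[ n_{\bullet\bullet}^2 + n_{\bullet\ast}^2 + n_{\ast\bullet}^2 - n + \bar{n} \right] \tag{21}$$
Second, by using (20), we can simplify
$$n_{\bullet\bullet}\,(n_{\bullet\ast} + n_{\ast\bullet} + n_{\ast\ast}) = n_{\bullet\bullet}\,(n - n_{\bullet\bullet}) \tag{22}$$
Third, note that, by using (20),
$$n_{\ast\ast}\,(n_{\bullet\ast} + n_{\ast\bullet}) = \bar{n}\,n - \bar{n}^2 - \bar{n}\,n_{\bullet\bullet} \tag{23}$$
Now, by using (21)–(23), and since $\frac{1}{2}(n_{\bullet\ast}^2 + n_{\ast\bullet}^2) + n_{\bullet\ast}\,n_{\ast\bullet} = \frac{1}{2}(n_{\bullet\ast} + n_{\ast\bullet})^2$, we have that $N_{inc}$ given by (11) becomes
$$N_{inc} = \frac{1}{2}n_{\bullet\bullet}^2 + \frac{1}{2}(n_{\bullet\ast} + n_{\ast\bullet})^2 + \frac{1}{2}(\bar{n} - n) + n_{\bullet\bullet}\,(n - n_{\bullet\bullet} - \bar{n}) + \bar{n}\,(n - \bar{n})$$
and since
$$\frac{1}{2}(n_{\bullet\ast} + n_{\ast\bullet})^2 = \frac{1}{2}\left[ n^2 - 2n\bar{n} + \bar{n}^2 + 2\bar{n}\,n_{\bullet\bullet} - 2n\,n_{\bullet\bullet} + n_{\bullet\bullet}^2 \right]$$
we get
$$N_{inc} = \frac{1}{2}(\bar{n} - n) + \frac{1}{2}\left[ n^2 - 2n\bar{n} + \bar{n}^2 \right] + \bar{n}\,n - \bar{n}^2 = \frac{n(n-1)}{2} + \frac{\bar{n} - \bar{n}^2}{2}$$
that is to say
$$N_{inc} = \binom{n}{2} - \binom{\bar{n}}{2}$$
and the proof is done. Note also that, by using (13), we have: $\binom{\bar{n}}{2} = n_{nc} + s + n_{tt} + n_{tu}$.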
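As a sanity check of this remark, the expression (11) and the closed form in terms of binomial coefficients can be compared exhaustively for small values of the four row counts. This is only a numerical sketch and not a substitute for the algebraic argument; the function names and the bound on the counts are arbitrary choices.

```python
from itertools import product
from math import comb


def n_inc_by_classes(n_dd, n_ds, n_sd, n_ss):
    """Expression (11), with n_dd, n_ds, n_sd, n_ss the numbers of rows of the
    form (., .), (., *), (*, .), and (*, *), where '.' denotes an absent entry."""
    return (
        comb(n_dd, 2) + comb(n_ds, 2) + comb(n_sd, 2)
        + n_dd * (n_ds + n_sd + n_ss)
        + n_ss * (n_ds + n_sd)
        + n_ds * n_sd
    )


def n_inc_closed_form(n_dd, n_ds, n_sd, n_ss):
    """Closed form: C(n, 2) - C(nbar, 2), with nbar equal to n_ss."""
    n = n_dd + n_ds + n_sd + n_ss
    return comb(n, 2) - comb(n_ss, 2)


# Compare both expressions for every combination of counts from 0 to 6.
mismatches = [
    counts
    for counts in product(range(7), repeat=4)
    if n_inc_by_classes(*counts) != n_inc_closed_form(*counts)
]
print(len(mismatches))
```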
This last remark motivates the next result.
Corollary 1.
Given two vectors $a$, $b$ representing incomplete rankings of $n$ elements with ties and their corresponding matrices $A = (A_{ij})$ and $B = (B_{ij})$, it holds that
$$\sum_{i=1}^{n}\sum_{j=1}^{n} A_{ij}B_{ij} = \bar{n}(\bar{n}-1) - 4s - 2n_{tu} \tag{24}$$
where $\bar{n}$ is the number of common ranked elements in both rankings (see (9)), $s$ is the number of crossings, that is, the number of pairs $\{i,j\}$ such that $a_i < a_j$ and $b_i > b_j$, or $a_i > a_j$ and $b_i < b_j$, and $n_{tu}$ is the number of pairs that are tied in only one ranking (from tie to untie or vice versa), that is, such that $a_i = a_j$ and $b_i \neq b_j$, or $a_i \neq a_j$ and $b_i = b_j$.
With (24), it is easy to obtain the maximum and minimum of the expression $\sum_{i=1}^{n}\sum_{j=1}^{n} A_{ij}B_{ij}$. When $s = 0$ and $n_{tu} = 0$ we have
$$\sum_{i=1}^{n}\sum_{j=1}^{n} A_{ij}B_{ij} = \bar{n}(\bar{n}-1)$$
which is the maximum value of $\sum_{i=1}^{n}\sum_{j=1}^{n} A_{ij}B_{ij}$. Analogously, by taking $s = \binom{\bar{n}}{2}$, which is the maximum number of crossings, and consequently $n_{tu} = 0$, we obtain from (24)
$$\sum_{i=1}^{n}\sum_{j=1}^{n} A_{ij}B_{ij} = \bar{n}(\bar{n}-1) - 4\binom{\bar{n}}{2} = -\bar{n}(\bar{n}-1)$$
which is the minimum value of $\sum_{i=1}^{n}\sum_{j=1}^{n} A_{ij}B_{ij}$. These facts, which are in agreement with the results shown in [12], explain why $\hat{\tau}_x$ defined by (8) takes values in $[-1, 1]$.
Remark 4.
By using (7) and (24) we obtain
$$\tau_x = \frac{\bar{n}(\bar{n}-1)}{n(n-1)} - \frac{4s + 2n_{tu}}{n(n-1)} \tag{25}$$
and from (8) and (25) we get
$$\hat{\tau}_x = 1 - \frac{4s + 2n_{tu}}{\bar{n}(\bar{n}-1)} \tag{26}$$
Remark 5.
As we have pointed out in (3), a distance metric $d(a,b)$ can be transformed into a correlation coefficient $\tau(a,b)$ by the formula
$$\tau(a,b) = 1 - \frac{2\,d(a,b)}{d_{max}(a,b)}$$
Note that in expression (14), when $N_{inc} \neq 0$, the quantity $n(n-1)$ is not the maximum value of the distance metric $d(a,b) = 2s + n_{tu} + N_{inc}$ (see Example 6). This problem does not appear with the use of $\hat{\tau}_x$ since, by using (26), we can identify a “distance metric” given by $\hat{d}(a,b) = 2s + n_{tu}$, whose maximum value is achieved when $s = \bar{n}(\bar{n}-1)/2$ (and consequently $n_{tu} = 0$) and is
$$\hat{d}_{max} = \bar{n}(\bar{n}-1)$$
Therefore, $\hat{\tau}_x$ should be preferred over $\tau_x$ in terms of normalization (see [12] for other considerations). This fact will be useful for the definition that we will introduce in Section 6.
In the next examples, we illustrate the two previous remarks. Note that when $s = 0$ and $n_{tu} = 0$ then, by (26), $\hat{\tau}_x = 1$, and it is not affected by the presence of • in the rankings. By analogy with (4), we denote the Normalized Mean Strength of two rankings $a_1$ and $a_2$ as
$$NS(a_1, a_2) = \frac{1 - \tau_x}{2}, \quad \text{and} \quad \widehat{NS}(a_1, a_2) = \frac{1 - \hat{\tau}_x}{2}.$$
Example 3.
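Let $a_1 = [1, 2, 3, \bullet, \bullet, \bullet]$ and $a_2 = [1, \bullet, 2, 3, \bullet, \bullet]$. Here $n = 6$, $\bar{n} = 2$, $s = 0$, and $n_{tu} = 0$, so that (11) and (25) give: $N_{inc}(a_1, a_2) = 14$, $\tau_x(a_1, a_2) = 2/30 \approx 0.0667$, $NS(a_1, a_2) \approx 0.4667$, $\hat{\tau}_x(a_1, a_2) = 1$, and $\widehat{NS}(a_1, a_2) = 0.0$.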
Example 4.
Let a 1 = [ 1 , 2 , 3 , 4 , , ] and a 2 = [ 1 , , 2 , 3 , 4 , ] . It is easy to obtain: N i n c ( a 1 , a 2 ) = 12 , τ x ( a 1 , a 2 ) = 0.2 , N S ( a 1 , a 2 ) = 0.4 , τ ^ x ( a 1 , a 2 ) = 1 , and N S ^ ( a 1 , a 2 ) = 0.0 .
The next example shows the results when a ranking is compared to itself and to its reverse ranking for the case of complete rankings (note that $\tau_x=\hat\tau_x$ since $\bar n=n$).
Example 5.
Let $a_1=[1,2,3,4,5,6]$ and $a_2=[6,5,4,3,2,1]$. Then
$$\begin{array}{l|cc}
 & a_1,\,a_1 & a_1,\,a_2\\ \hline
N_{inc} & 0 & 0\\
\tau_x & 1.0 & -1.0\\
NS & 0.0 & 1.0\\
\hat\tau_x & 1.0 & -1.0\\
\widehat{NS} & 0.0 & 1.0
\end{array}$$
The next example shows that $\tau_x$ does not take its limit values when the rankings are incomplete and that $\hat\tau_x$ is not defined when there are no elements in common in both rankings.
Example 6.
Let $a_1=[1,2,3,\bullet,\bullet,\bullet]$, $a_2=[\bullet,\bullet,\bullet,3,2,1]$, $a_3=[1,2,3,4,\bullet,\bullet]$, and $a_4=[\bullet,\bullet,4,3,2,1]$. Then
$$\begin{array}{l|ccc}
 & a_1,\,a_1 & a_1,\,a_2 & a_3,\,a_4\\ \hline
N_{inc} & 12 & 15 & 14\\
\tau_x & 0.2 & 0.0 & -0.0667\\
NS & 0.4 & 0.5 & 0.5333\\
\hat\tau_x & 1.0 & \text{not defined} & -1.0\\
\widehat{NS} & 0.0 & \text{not defined} & 1.0
\end{array}$$
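The pairwise coefficients in (25) and (26) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes a ranking is stored as an element-indexed list whose entries are positions, with `None` standing in for •, that $s$ counts pairs of commonly ranked elements whose relative order is reversed, and that $n_{tu}$ counts pairs tied in exactly one of the two rankings. The names `pair_counts`, `tau_x` and `tau_x_hat` are hypothetical, and returning `None` when fewer than two elements are common is a convention chosen here.

```python
from itertools import combinations

ABSENT = None  # plays the role of the bullet symbol for an unranked element


def pair_counts(a, b):
    """Return (n_bar, s, n_tu) for two rankings given as element-indexed lists."""
    common = [i for i in range(len(a)) if a[i] is not ABSENT and b[i] is not ABSENT]
    s = n_tu = 0
    for i, j in combinations(common, 2):
        da, db = a[i] - a[j], b[i] - b[j]
        if da * db < 0:                  # relative order reversed: a crossing
            s += 1
        elif (da == 0) != (db == 0):     # tied in exactly one of the two rankings
            n_tu += 1
    return len(common), s, n_tu


def tau_x(a, b):
    """Coefficient of Equation (25)."""
    n = len(a)
    n_bar, s, n_tu = pair_counts(a, b)
    return (n_bar * (n_bar - 1) - 4 * s - 2 * n_tu) / (n * (n - 1))


def tau_x_hat(a, b):
    """Coefficient of Equation (26); None when fewer than two elements are common."""
    n_bar, s, n_tu = pair_counts(a, b)
    if n_bar < 2:
        return None
    return 1 - (4 * s + 2 * n_tu) / (n_bar * (n_bar - 1))


# Rankings a_3 and a_4 of Example 6
a3 = [1, 2, 3, 4, ABSENT, ABSENT]
a4 = [ABSENT, ABSENT, 4, 3, 2, 1]
print(round(tau_x(a3, a4), 4), tau_x_hat(a3, a4))   # -0.0667 -1.0
```

The normalized mean strengths then follow as `(1 - tau) / 2` for either coefficient.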
Our main practical result in this paper is the definition of a measure to deal not only with two rankings $a_1$ and $a_2$, as we have seen so far, but with a series of incomplete rankings with ties $\{a_1,a_2,\ldots,a_m\}$ in which, in practical situations, some kind of time evolution is present (e.g., a sport ranking during a session where there may be ties or inclusion/elimination of teams, charts of songs ordered on a daily/weekly basis, etc.). In order to define this measure, it will be useful to recall some concepts defined for complete rankings.

5. Treatment of More Than Two Complete Rankings. Known Results

To study the evolution of more than two rankings we will use the concept of Kendall distance defined in [10], where some weights were introduced to measure the changes when passing from one ranking to the next. After that, we will recall how to extend this definition to a series of m complete rankings, as in [13].

5.1. Kendall Distance for Complete Rankings with Penalty Parameters

We recall the definition of Kendall distance with penalty parameters p and q from [10,13].
Definition 1.
Let $a$ and $b$ be two complete rankings with ties of the set $N=\{1,\ldots,n\}$, and penalty parameters $p\in[0,\tfrac{1}{2}]$ and $q\in[0,\tfrac{1}{2}]$. The Kendall distance with penalty parameters $p$ and $q$ is defined as
$$K^{(p,q)}(a,b)=\sum_{\{i,j\}\subseteq N}\bar K_{i,j}^{(p,q)}(a,b)$$
where $\bar K_{i,j}^{(p,q)}(a,b)$ is computed according to the following cases:
  • Case 1: $i$ and $j$ are tied neither in $a$ nor in $b$. If they cross their positions when passing from $a$ to $b$, then $\bar K_{i,j}^{(p,q)}=1$. Otherwise, $\bar K_{i,j}^{(p,q)}=0$.
  • Case 2: $i$ and $j$ are tied in both $a$ and $b$. Then $\bar K_{i,j}^{(p,q)}=q$.
  • Case 3: $i$ and $j$ are tied in only one ranking. Then $\bar K_{i,j}^{(p,q)}=p$.
Remark 6.
The penalty parameters $p$ and $q$ are bounded and take into account the cases where there exist tied elements in $a$, in $b$, or in both. For our purposes of measuring competitiveness, it is reasonable to assign $p=1/2$ to pairs that are tied in only one ranking, and $q=0$ to pairs that are tied in both of them. These assignments are inspired by [10], where it was proved that $p\in[0.5,1]$ is needed for $K^{(p,0)}$ to be a metric.
Remark 7.
Note that, by using the notation introduced in Theorem 1, it is easy to see that
$$K^{(p,q)}(a,b)=s+p\,n_{tu}+q\,n_{tt}$$
where $n_{tt}$ is the number of pairs $\{i,j\}$ that go from tie to tie. Therefore, by using (14) with $N_{inc}=0$ we get
$$\tau_x(a,b)=1-\frac{4K^{(0.5,0)}(a,b)}{n(n-1)}$$
that is, once more, a relation of the form (3). We see here another consequence of Theorem 1: it opens the possibility of defining new metrics by assigning penalties to the different types of pairs counted in Theorem 1, since it gives an explicit expression for these cases.
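The three cases of Definition 1 and the relation in Remark 7 can be checked numerically with a small sketch. It assumes complete rankings with ties stored as element-indexed lists of positions; the function names `kendall_pq` and `tau_x_complete` are hypothetical and only serve this illustration.

```python
from itertools import combinations


def kendall_pq(a, b, p=0.5, q=0.0):
    """Kendall distance with penalty parameters p and q for complete rankings with ties."""
    total = 0.0
    for i, j in combinations(range(len(a)), 2):
        tied_a = a[i] == a[j]
        tied_b = b[i] == b[j]
        if tied_a and tied_b:
            total += q                                   # Case 2: tied in both rankings
        elif tied_a or tied_b:
            total += p                                   # Case 3: tied in only one ranking
        elif (a[i] - a[j]) * (b[i] - b[j]) < 0:
            total += 1                                   # Case 1: the pair crosses
    return total


def tau_x_complete(a, b):
    """tau_x for complete rankings, written through K^(0.5,0) as in Remark 7."""
    n = len(a)
    return 1 - 4 * kendall_pq(a, b, 0.5, 0.0) / (n * (n - 1))


# The two complete rankings of Example 5
a1 = [1, 2, 3, 4, 5, 6]
a2 = [6, 5, 4, 3, 2, 1]
print(kendall_pq(a1, a2), tau_x_complete(a1, a2))   # 15.0 -1.0
```

With $p=0.5$ and $q=0$ every crossing costs 1 and every tie-to-untie change costs one half, so a full reversal of six elements accumulates $\binom{6}{2}=15$ and yields $\tau_x=-1$.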
With the previous definitions, we can deal with the general case of the study of a series of complete rankings. We do this in the next section.

5.2. Series of Complete Rankings with Ties

In [13], it was shown how to extend Definition 1 to m complete rankings with ties in a natural way. We recall these definitions here because they will be extended in Section 6 to a series of incomplete rankings.
Definition 2.
Given $m$ complete rankings with ties $a_1,a_2,\ldots,a_m$ of $n$ elements, we define the evolutive Kendall distance with penalty parameters $p$ and $q$ as
$$K_{ev}^{(p,q)}(a_1,a_2,\ldots,a_m)=\sum_{i=1}^{m-1}K^{(p,q)}(a_i,a_{i+1}). \tag{30}$$
When handling $m$ rankings it is natural to include a new case (see [13]) that consists of a series of ties between a crossing (see Example 7 further on). Thus it is convenient to define a new case in the definition of $K_{ev}^{(p,q)}(a_1,a_2,\ldots,a_m)$ according to the following rule.
Definition 3.
Given $m$ complete rankings with ties $a_1,a_2,\ldots,a_m$ of $n$ elements, we define the crossing after ties coefficient $\bar K_{i,j}^{cat}(a_1,a_2,\ldots,a_m)$ following the rule
Case 4.
If there exists a maximal set of rankings $a_{t_1},\ldots,a_{t_k}$ such that, for each $\ell=1,\ldots,k$, the pair $\{i,j\}$ is not tied in $a_{t_\ell}$, but is tied in $a_{t_\ell+1},a_{t_\ell+2},\ldots,a_{t_\ell+s_\ell}$, with $s_\ell\geq 1$, it is not tied in $a_{t_\ell+s_\ell+1}$ and, moreover, $\{i,j\}$ exchange their relative positions between $a_{t_\ell}$ and $a_{t_\ell+s_\ell+1}$. In this case $\bar K_{i,j}^{cat}(a_1,a_2,\ldots,a_m)=k$, where $k$ is the number of rankings in the maximal set of rankings $a_{t_1},\ldots,a_{t_k}$ verifying the aforementioned property.
Example 7.
Given the rankings with ties
$$\begin{array}{cccccc}
r_1 & r_2 & r_3 & r_4 & r_5 & r_6\\ \hline
1 & 1,2 & 1,2 & 2 & 1,2 & 1\\
2 & 3 & 3 & 1 & 3 & 2\\
3 & 4 & 4 & 3 & 4 & 3\\
4 &  &  & 4 &  & 4
\end{array}$$
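the corresponding $a_i$ are
$$\begin{array}{cccccc}
a_1 & a_2 & a_3 & a_4 & a_5 & a_6\\ \hline
1 & 1 & 1 & 2 & 1 & 1\\
2 & 1 & 1 & 1 & 1 & 2\\
3 & 2 & 2 & 2 & 2 & 3\\
4 & 3 & 3 & 3 & 3 & 4
\end{array}$$
we have that the only nonzero crossing after ties coefficient is
$$\bar K_{1,2}^{cat}(a_1,a_2,\ldots,a_6)=2$$
since we have the appearance of the two series
$$\begin{array}{cccc}
a_1 & a_2 & a_3 & a_4\\ \hline
1 & 1 & 1 & 2\\
2 & 1 & 1 & 1
\end{array}
\quad\text{and}\quad
\begin{array}{ccc}
a_4 & a_5 & a_6\\ \hline
2 & 1 & 1\\
1 & 1 & 2
\end{array}$$
that show a series of ties between a crossing of the pair $\{i=1,j=2\}$.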
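The rule of Case 4 can be sketched as a single pass over the positions of one pair along the series. This is an illustrative reading of Definition 3, not the authors' code: the function name `crossing_after_ties` is hypothetical, and treating an absent element (`None`) as something that interrupts a run is an assumption made here so that the same function can also be used with incomplete rankings.

```python
def crossing_after_ties(ri, rj):
    """Count Case 4 events for one pair, given its positions in a_1, ..., a_m."""
    count = 0
    last_sign = 0   # sign of ri - rj at the last untied ranking (0 if there is none yet)
    ties = 0        # number of consecutive tied rankings seen since that untied ranking
    for x, y in zip(ri, rj):
        if x is None or y is None:       # assumption: an absent element breaks the run
            last_sign, ties = 0, 0
        elif x == y:                     # the pair is tied in this ranking
            if last_sign != 0:
                ties += 1
        else:
            sign = 1 if x > y else -1
            # untied -> one or more ties -> untied with the relative order exchanged
            if last_sign != 0 and ties >= 1 and sign != last_sign:
                count += 1
            last_sign, ties = sign, 0
    return count


# Positions of elements 1 and 2 along a_1, ..., a_6 in Example 7
print(crossing_after_ties([1, 1, 1, 2, 1, 1], [2, 1, 1, 1, 1, 2]))   # 2
```

A direct exchange with no tie in between is not counted here, since it is already a crossing in the sense of Case 1.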
By including the cases given by Definition 3 in the sum defined in Definition 2, a corrected evolutive distance is defined in [13] in the following form.
Definition 4.
Given $m$ complete rankings with ties $a_1,a_2,\ldots,a_m$ of $n$ elements, we define the corrected evolutive Kendall distance with penalty parameters $p$ and $q$ as follows:
$$K_{cev}^{(p,q)}(a_1,\ldots,a_m)=K_{ev}^{(p,q)}(a_1,\ldots,a_m)+\sum_{\{i,j\}}\bar K_{i,j}^{cat}(a_1,\ldots,a_m), \tag{31}$$
where the summation is over the pairs $\{i,j\}$ that verify Case 4 in Definition 3.
Following the same argument as in [13], it is easy to show that
$$\max\left[K_{cev}^{(0.5,0)}(a_1,\ldots,a_m)\right]=\frac{1}{2}(m-1)\,n(n-1) \tag{32}$$
Now, in analogy with (3) and (14), the Kendall’s evolutive coefficient $\tau_{ev}$ for a series of $m$ complete rankings with ties can be defined as
$$\tau_{ev}(a_1,a_2,\ldots,a_m)=1-\frac{4K_{cev}^{(0.5,0)}(a_1,\ldots,a_m)}{(m-1)\,n(n-1)}\in[-1,1] \tag{33}$$
With these previous definitions we can present the new coefficients for incomplete rankings with ties.

6. New Coefficients for Series of Incomplete Rankings with Ties

Given a series $\{a_1,a_2,\ldots,a_m\}$ of incomplete rankings with ties, Definitions 1–4 can be applied directly to each pair of rankings $a_i$ and $a_j$ by assuming that there is no penalty for the case of absent elements (regarding Definitions 1 and 2) and that these absent elements (denoted by ‘•’) contribute neither to ties nor to crossings after ties (regarding Definitions 3 and 4). That is, those definitions are applied as they are, ignoring the effect of the absent elements.
Keeping this in mind and, in analogy with (14), given a series of $m$ incomplete rankings we could include the effect of the incomplete cases by defining
$$\tau_{ev}=1-\frac{2\,d_{evol}(a_1,a_2,\ldots,a_m)}{\max(d_{evol})}$$
with
$$d_{evol}(a_1,a_2,\ldots,a_m)=2K_{cev}^{(p=0.5,q=0)}(a_1,\ldots,a_m)+\sum_{i=1}^{m-1}N_{inc}(a_i,a_{i+1})$$
where $N_{inc}(a_i,a_{i+1})$ is the number of incomplete cases when passing from ranking $a_i$ to ranking $a_{i+1}$. Note that the explicit form of $N_{inc}(a_i,a_{i+1})$ for each pair of consecutive rankings is given by (11) in Theorem 1 and Corollary 1. The value of $\max(d_{evol})$ depends on $N_{inc}(a_i,a_{i+1})$. We have seen in Remark 5 that the definition of $\tau_x$ corresponds to taking $d_{max}(a,b)$ as the value corresponding to $N_{inc}=0$ (and that is the reason why $\tau_x$ is not well normalized). We can translate the same reasoning here and formalize it in the next definition.
Definition 5.
Given $m$ incomplete rankings with ties $a_1,a_2,\ldots,a_m$ of $n$ elements, we define the corrected evolutive Kendall’s $\tau$ coefficient for the series with penalty parameters $p=0.5$ and $q=0$ as follows:
$$\tau_{ev}=1-\frac{4K_{cev}^{(0.5,0)}(a_1,\ldots,a_m)+2\sum_{i=1}^{m-1}N_{inc}(a_i,a_{i+1})}{(m-1)\,n(n-1)} \tag{35}$$
where $K_{cev}^{(0.5,0)}(a_1,\ldots,a_m)$ is given by Definition 4, and $N_{inc}(a_i,a_{i+1})$ is given by (11).
Here we have the same drawback as we showed for $\tau_x$ in Remark 5: $\tau_{ev}$ is not properly normalized and it cannot reach the values $\pm 1$ if any $N_{inc}(a_i,a_{i+1})\neq 0$. Therefore, in analogy with (26), we introduce a new coefficient in the following definition.
Definition 6.
Given $m$ incomplete rankings with ties $a_1,a_2,\ldots,a_m$ of $n$ elements, such that $\bar n_{i,i+1}>1$ for all $i=1,2,\ldots,m-1$, we define the scaled corrected evolutive Kendall’s $\tau$ coefficient for the series with penalty parameters $p=0.5$ and $q=0$ as follows:
$$\hat\tau_{ev}=1-\frac{2K_{cev}^{(0.5,0)}(a_1,\ldots,a_m)}{\max\left(K_{cev}^{(0.5,0)}(a_1,\ldots,a_m)\right)} \tag{36}$$
where $K_{cev}^{(0.5,0)}(a_1,\ldots,a_m)$ is given by Definition 4 and with
$$\max\left[K_{cev}^{(0.5,0)}(a_1,\ldots,a_m)\right]=\frac{1}{2}\sum_{i=1}^{m-1}\bar n_{i,i+1}(\bar n_{i,i+1}-1) \tag{37}$$
where $\bar n_{i,i+1}$ denotes the number of common ranked elements between $a_i$ and $a_{i+1}$.
Note that we need that, for some $i$, $\bar n_{i,i+1}\neq 0$.
Remark 8.
In the limit case of $m$ complete rankings with ties, note that Equation (37) collapses to Equation (32). Note also that $\hat\tau_{ev}$ is affected by the crossings, the passes from tie to untie (or vice versa), and the long crossings (crossings after ties, given by $\bar K_{i,j}^{cat}(a_1,a_2,\ldots,a_m)$ in Definition 3), due to the term $2K_{cev}^{(p=0.5,q=0)}(a_1,\ldots,a_m)$. The effect of the elements that are out of the rankings appears explicitly through the term $\bar n_{i,i+1}$, which does not take into account the position in $a_i$ nor in $a_{i+1}$. The coefficient $\hat\tau_{ev}$ is well normalized, that is, $\hat\tau_{ev}\in[-1,1]$.
Example 8.
Let $n=6$. Given the series of incomplete rankings with ties $a_1=[1,2,3,4,5,6]$, $a_2=[1,2,3,\bullet,\bullet,\bullet]$, and $a_3=[1,2,\bullet,\bullet,\bullet,\bullet]$, an easy computation shows $K_{cev}(a_1,a_2,a_3)=0$ and thus $\hat\tau_{ev}=1$. Note that $\tau_{ev}=0.1333$.
Example 9.
Let $n=6$. Given the series of incomplete rankings with ties $a_1=[1,2,3,4,5,6]$, $a_2=[3,2,1,\bullet,\bullet,\bullet]$, and $a_3=[1,2,\bullet,\bullet,\bullet,\bullet]$, it is easy to obtain that $K_{cev}(a_1,a_2,a_3)=4=\max(K_{cev})$ and thus $\hat\tau_{ev}=-1$. Note that $\tau_{ev}=-0.1333$.
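The two normalizations in Definitions 5 and 6 can be compared with a short sketch that takes $K_{cev}$ as an input. Two points are assumptions of this illustration rather than statements of the paper: the function name `evolutive_taus` is hypothetical, and $N_{inc}$ is computed as $[n(n-1)-\bar n(\bar n-1)]/2$, an identity that follows from comparing (14) with (25) and reproduces the $N_{inc}$ values of Examples 4 and 6, but is not quoted from Theorem 1.

```python
def n_common(a, b):
    """Number of elements ranked in both a and b (None marks an absent element)."""
    return sum(x is not None and y is not None for x, y in zip(a, b))


def evolutive_taus(series, k_cev):
    """Return (tau_ev, tau_ev_hat) of Definitions 5 and 6 for a given value of K_cev."""
    n, m = len(series[0]), len(series)
    n_bars = [n_common(a, b) for a, b in zip(series, series[1:])]
    # Assumed identity for the incomplete cases of each consecutive pair
    n_inc_total = sum((n * (n - 1) - nb * (nb - 1)) // 2 for nb in n_bars)
    tau_ev = 1 - (4 * k_cev + 2 * n_inc_total) / ((m - 1) * n * (n - 1))
    if min(n_bars) < 2:          # Definition 6 asks for more than one common element
        return tau_ev, None
    k_max = sum(nb * (nb - 1) for nb in n_bars) / 2
    return tau_ev, 1 - 2 * k_cev / k_max


# Example 8 (K_cev = 0) and Example 9 (K_cev = 4)
ex8 = [[1, 2, 3, 4, 5, 6], [1, 2, 3, None, None, None], [1, 2, None, None, None, None]]
ex9 = [[1, 2, 3, 4, 5, 6], [3, 2, 1, None, None, None], [1, 2, None, None, None, None]]
for series, k_cev in ((ex8, 0), (ex9, 4)):
    tau_ev, tau_ev_hat = evolutive_taus(series, k_cev)
    print(round(tau_ev, 4), tau_ev_hat)
```

Keeping $K_{cev}$ as an argument isolates the effect of the denominator: the same distance gives a value close to zero under Definition 5 and a limit value under Definition 6.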
As we have seen in the above definitions, the importance of Theorem 1 and Corollary 1 consists of giving the explicit formula for $N_{inc}(a_i,a_{i+1})$ to allow for the computation of the coefficient $\hat\tau_{ev}$ for the series of $m$ incomplete rankings with ties. Note that $\hat\tau_{ev}\in[-1,1]$. For the particular case when the rankings are complete, we have $N_{inc}(a_i,a_{i+1})=0$ for all the pairs of consecutive rankings and $\bar n_{i,i+1}=n$ for $i=1,2,\ldots,m-1$, and therefore Equation (36) reduces to the complete case given by Equation (33), that is, $\hat\tau_{ev}$ collapses to $\tau_{ev}$.
Another contribution of Theorem 1 and Definition 6 is that they are useful to describe the behavior of the series of $m$ rankings in terms of a competitivity graph. We can define a weighted graph for each one of the interactions between the elements when passing from $a_i$ to $a_{i+1}$: crossings, passing from tie to untie (or vice versa), and crossing after ties. Moreover, for each kind of graph, we can add the contributions of all the pairs of consecutive rankings to obtain a projected graph for any interaction. The procedure is the following. First, we construct an undirected graph for each pair of rankings $a_k$, $a_{k+1}$ by identifying each element $i$ as a node and defining the edges by the rule: there is an edge connecting $\{i,j\}$ with weight $\bar K_{i,j}^{(p,q)}(a_k,a_{k+1})$ when this weight is nonzero. By adding the $m-1$ undirected graphs we obtain a projected graph whose total sum of weights is, by Definition 2, $K_{ev}^{(p=0.5,q=0)}(a_1,\ldots,a_m)$. By adding the crossing after ties term to the projected graph we have all the ingredients appearing in Definition 6. We show this procedure by using the next example with $m=6$ and $n=8$.
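The construction of the projected graph can be sketched with a plain dictionary of edge weights, without any graph library. The function name `projected_graph` and the dictionary representation are choices made for this illustration; absent elements are again stored as `None` and ignored, and the crossing after ties term is not included.

```python
from collections import defaultdict
from itertools import combinations


def projected_graph(series, p=0.5, q=0.0):
    """Sum the pairwise weights over all consecutive rankings; keys are pairs (i, j)."""
    weights = defaultdict(float)
    for a, b in zip(series, series[1:]):
        ranked = [i for i in range(len(a)) if a[i] is not None and b[i] is not None]
        for i, j in combinations(ranked, 2):
            tied_a, tied_b = a[i] == a[j], b[i] == b[j]
            if tied_a and tied_b:
                w = q                                    # tie to tie
            elif tied_a or tied_b:
                w = p                                    # tie to untie or vice versa
            elif (a[i] - a[j]) * (b[i] - b[j]) < 0:
                w = 1.0                                  # crossing
            else:
                w = 0.0
            if w:
                weights[(i + 1, j + 1)] += w             # elements are labelled from 1
    return dict(weights)


# The series of Example 9: three crossings in the first step, one in the second
series9 = [[1, 2, 3, 4, 5, 6], [3, 2, 1, None, None, None], [1, 2, None, None, None, None]]
graph = projected_graph(series9)
print(graph, sum(graph.values()))
```

For this series the edge $\{1,2\}$ accumulates weight 2 because the pair crosses in both steps, and the total weight equals the value $K_{cev}=4$ reported in Example 9 (there are no ties, so no crossing after ties term is needed).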
Example 10.
Given the series of incomplete rankings with ties
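$$\begin{array}{cccccc}
r_1 & r_2 & r_3 & r_4 & r_5 & r_6\\ \hline
5 & 2 & 4 & 6 & 2 & 1\\
7 & 1 & 8 & 1,4 & 1,4 & 5\\
3 & 8 & 3 & 3 & 6,7 & 8\\
8 & 3 & 2,6 & 8 & 5 & 3\\
1,4 & 5,7 & 5,7 & 2 & 3 & 4\\
 & 4 & 1 & 7 & 8 &
\end{array}$$
the corresponding $a_i$ are
$$\begin{array}{cccccc}
a_1 & a_2 & a_3 & a_4 & a_5 & a_6\\ \hline
5 & 2 & 6 & 2 & 2 & 1\\
\bullet & 1 & 4 & 5 & 1 & \bullet\\
3 & 4 & 3 & 3 & 5 & 4\\
5 & 6 & 1 & 2 & 2 & 5\\
1 & 5 & 5 & \bullet & 4 & 2\\
\bullet & \bullet & 4 & 1 & 3 & \bullet\\
2 & 5 & 5 & 6 & 3 & \bullet\\
4 & 3 & 2 & 4 & 6 & 3
\end{array}$$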
In this example we have $n=8$, and an easy computation leads to the parameters shown in Table 4. For each pair of consecutive rankings it is easy to compute the parameters defined in Theorem 1, including $s$, $n_{tu}$, and $n_{tt}$. Then, by using Equation (10) in Theorem 1 we can obtain, for any pair of rankings, the value $N_{inc}$. The value $\bar n$ is the number of common elements, given by (9). The coefficient $\tau_{ev}$ is given by (35), and the coefficient $\hat\tau_{ev}$ is given by (36). In analogy with (4) we can define the corresponding normalized mean strengths given by
$$NS=\frac{1-\tau_{ev}}{2}$$
and
$$\widehat{NS}=\frac{1-\hat\tau_{ev}}{2}$$
Finally, in Table 4 we include the coefficients $\tau_x$ and $\hat\tau_x$ given by (7) and (8), respectively. These last coefficients are included to show that our new coefficients $\tau_{ev}$ and $\hat\tau_{ev}$ reduce to them when only a pair of rankings is considered.
To compute our new coefficients $\tau_{ev}$ and $\hat\tau_{ev}$ for the whole series of rankings $a_1$ to $a_6$ we need some previous parameters. First, we need the value
$$\sum_{i=1}^{5}N_{inc}(a_i,a_{i+1})=52$$
To compute $K_{cev}^{(p=0.5,q=0)}(a_1,\ldots,a_6)$, given by (31), we first need to know the value of the crossing after ties coefficients $\bar K_{i,j}^{cat}(a_1,\ldots,a_6)$, given by Definition 3. Note that the only long crossing occurs for the pair $\{1,4\}$: the elements tagged as 1 and 4 are such that 4 is above 1 in $r_3$, both elements are tied in rankings $r_4$ and $r_5$, and, finally, 4 is below 1 in ranking $r_6$. Note, for example, that the pair $\{5,7\}$ does not satisfy the conditions of crossing after ties. Therefore the only term that contributes to $\sum_{\{i,j\}}\bar K_{i,j}^{cat}$ is $\bar K_{1,4}^{cat}(a_1,\ldots,a_6)=1$.
With respect to $K_{ev}^{(p=0.5,q=0)}(a_1,\ldots,a_6)$, given by (30), we need to compute the terms $\bar K_{i,j}^{(p,q)}(a_i,a_{i+1})$, given by (28), for every pair of consecutive rankings. A detailed computation shows that, in this example, we have 42 crossings and 6 cases of tie to untie or vice versa. The precise pairs of elements that contribute to these cases are shown in the corresponding projected weighted graphs in Figure 1. The crossing after ties case is represented in Figure 2.
Therefore we have all the ingredients to compute $K_{cev}^{(p=0.5,q=0)}$. That is
$$K_{cev}^{(p=0.5,q=0)}(a_1,\ldots,a_6)=K_{ev}^{(p=0.5,q=0)}(a_1,\ldots,a_6)+\sum_{\{i,j\}}\bar K_{i,j}^{cat}(a_1,\ldots,a_6)=\sum_{i=1}^{5}K^{(p,q)}(a_i,a_{i+1})+\sum_{\{i,j\}}\bar K_{i,j}^{cat}(a_1,\ldots,a_6)$$
and, by Remark 7, we know that
$$K^{(p,q)}(a,b)=s+p\,n_{tu}+q\,n_{tt}$$
Therefore, we have
$$K_{cev}^{(p=0.5,q=0)}(a_1,\ldots,a_6)=(9+12+8+9+4)+0.5\,(2+0+2+1+1)+\bar K_{1,4}^{cat}(a_1,\ldots,a_6)=42+3+1=46.$$
By using (35), we obtain
$$\tau_{ev}=1-\frac{4\cdot 46+2\cdot 52}{(6-1)\cdot 8\cdot 7}=1-1.0286=-0.0286$$
that corresponds to an equivalent normalized mean strength
$$NS=\frac{1-\tau_{ev}}{2}=0.5143$$
Finally, regarding $\hat\tau_{ev}$, we have
$$\hat\tau_{ev}=1-\frac{2\cdot 46}{\frac{1}{2}\,(6\cdot 5+7\cdot 6+7\cdot 6+7\cdot 6+5\cdot 4)}=1-\frac{4\cdot 46}{176}=-0.0455$$
that corresponds to
$$\widehat{NS}=0.5227.$$
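The whole computation of Example 10 can be retraced end to end with the sketch below. It is a hedged illustration under stated assumptions: each ranking $r_i$ is typed in as a list of buckets read from the table of the example, the vector $a_i$ stores the bucket position of every element (`None` for •), an absent element is assumed to interrupt a crossing after ties, and $N_{inc}$ is taken as $[n(n-1)-\bar n(\bar n-1)]/2$ (an identity consistent with the value 52 above, not quoted from Theorem 1). All function names are hypothetical.

```python
from itertools import combinations


def to_a(buckets, n):
    """Element-indexed vector: position of each element's bucket, None if absent."""
    a = [None] * n
    for position, bucket in enumerate(buckets, start=1):
        for element in bucket:
            a[element - 1] = position
    return a


def pair_terms(a, b):
    """Return (n_bar, s, n_tu) for two consecutive rankings."""
    common = [i for i in range(len(a)) if a[i] is not None and b[i] is not None]
    s = n_tu = 0
    for i, j in combinations(common, 2):
        da, db = a[i] - a[j], b[i] - b[j]
        if da * db < 0:
            s += 1                       # crossing
        elif (da == 0) != (db == 0):
            n_tu += 1                    # tie to untie or vice versa
    return len(common), s, n_tu


def crossings_after_ties(series):
    """Total number of Case 4 events over all pairs of elements."""
    total = 0
    for i, j in combinations(range(len(series[0])), 2):
        last_sign, ties = 0, 0
        for a in series:
            x, y = a[i], a[j]
            if x is None or y is None:   # assumption: absence interrupts the run
                last_sign, ties = 0, 0
            elif x == y:
                ties += 1 if last_sign else 0
            else:
                sign = 1 if x > y else -1
                if last_sign and ties and sign != last_sign:
                    total += 1
                last_sign, ties = sign, 0
    return total


# Rankings r_1, ..., r_6 of Example 10; tied elements share a bucket
R = [
    [[5], [7], [3], [8], [1, 4]],
    [[2], [1], [8], [3], [5, 7], [4]],
    [[4], [8], [3], [2, 6], [5, 7], [1]],
    [[6], [1, 4], [3], [8], [2], [7]],
    [[2], [1, 4], [6, 7], [5], [3], [8]],
    [[1], [5], [8], [3], [4]],
]
n, m = 8, len(R)
A = [to_a(r, n) for r in R]
terms = [pair_terms(a, b) for a, b in zip(A, A[1:])]
n_bars = [t[0] for t in terms]
s_list = [t[1] for t in terms]
ntu_list = [t[2] for t in terms]

k_ev = sum(s_list) + 0.5 * sum(ntu_list)
k_cev = k_ev + crossings_after_ties(A)
n_inc = sum((n * (n - 1) - nb * (nb - 1)) // 2 for nb in n_bars)
tau_ev = 1 - (4 * k_cev + 2 * n_inc) / ((m - 1) * n * (n - 1))
tau_ev_hat = 1 - 2 * k_cev / (0.5 * sum(nb * (nb - 1) for nb in n_bars))
print(k_cev, round(tau_ev, 4), round(tau_ev_hat, 4))   # 46.0 -0.0286 -0.0455
```

The intermediate lists `s_list` and `ntu_list` hold the per-pair crossings and tie-to-untie counts, so they can be compared directly with the terms $(9+12+8+9+4)$ and $(2+0+2+1+1)$ used above.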
All in all, we conclude that $\hat\tau_{ev}$ is a proper coefficient for the evaluation of $m$ incomplete rankings with ties and can be considered as a natural extension of the coefficient $\hat\tau_x$ presented in [12]. In the next section we apply the new coefficients $\tau_{ev}$ and $\hat\tau_{ev}$ to real rankings appearing on Spotify charts.

7. Results

Spotify is one of the major music streaming services worldwide, with 299 million monthly active users, as of July 2020 [28]. The company Spotify Technology S.A. has been listed on the New York Stock Exchange since 2018. As of September 2020, the company offers a catalog of 60 million tracks and operates in 92 countries from Albania to Vietnam [29]. Spotify divides the monthly active users into four regions [30]: Europe (35%), North America (26%), Latin America (22%) and rest of the world (17%). The app is available on several devices, such as computers, smartphones, tablets, wearable devices, etc. The users can choose between a free service (called Freemium or Ad-Supported) or a Premium service. In any case, the user can listen by streaming any song of the catalog (that is, the user does not own the song’s digital file, but can listen to it). It is accepted that music streaming services have transformed the entire music market—see [31]—and they have evolved very fast, changing their services and capabilities. For example, Spotify has signed some partnerships with Microsoft [32], Sony [33] and Facebook [34] among other big companies. There exists a large amount of literature about Spotify, but it is mainly focused on Economics and Music. To the best of our knowledge, a small number of papers are devoted to the mathematical aspects of the rankings produced by Spotify. Among these papers, we have [35,36]. A paper that studies the relationship between personality and type of music is [37]. See [38] for more details about Spotify.
Like other services on the Internet, Spotify provides some chart lists (song rankings) based on the platform’s number of streams. The Top 200 (see [39,40,41]), which is one of the topics of our study, belongs to this kind of ranking. Another ranking that we are interested in is called Viral 50, which is an evolution of the original Social 50 ranking (see [42,43,44]) that incorporated into the song chart the effect of the social sharing of a track by Spotify users. This sharing included platforms such as Facebook and Twitter. It is not completely clear to us how this ranking is computed, but it aims to gather fresh songs that acquire high impact on social networks through new release promotions, special appearances on TV shows, music festivals, tours, etc. (see [45] for an example of how a viral song transformed into a Top 100 song in 2013).
Due to the situation caused by the COVID-19 pandemic, the live music business suffered some setbacks, such as festivals being cancelled worldwide, a reduction in public-performance licensing, and other related factors (see [46]). As an example, Warner Music Group Corp showed a total revenue fall of 1.7% in the first quarter of 2020 compared to the first quarter of 2019 [47]. Spotify also reported some impact on its business, but in the first quarter of 2020 it seemed that consumption recovered, and monthly active users increased faster than in the same period of 2019 [30]. Some perturbations in Spotify streaming were also reported by the music analytics company Chartmetric, which observed a change in the type of consumption of Spotify streams by music genre in the period from 3 March 2020 to 9 April 2020, and concluded that it seemed to be a pandemic-induced lifestyle change [48].
With regard to the Viral 50, it is reasonable to think that the fact that many artists (such as Lady Gaga, Alicia Keys, and Cardi B [46]) postponed big releases may have decreased the movements in these charts.

7.1. Method to Convert Spotify Lists into Incomplete Rankings

Both the Spotify Top 200 and Viral 50 lists can be treated as incomplete rankings, since some elements (songs) leave the list while others (new songs) enter it. Let us call any of these rankings Top $k$ rankings. In order to handle these Top $k$ rankings, our methodology consists of the following steps:
  1. Select a set of $m$ lists $\{v_1,v_2,\ldots,v_m\}$ with $k$ entries in each $v_i$.
  2. Denote by $n$ the number of different songs that appear on these $m$ lists. We tag these songs from 1 to $n$, following the order in which they first appear, reading the lists from the first to the last one, and each list from top to bottom. Denote by $t_i$ the tagged version of $v_i$, for $i=1,2,\ldots,m$, including all the $n$ songs.
  3. Denote by $r_1$ a vector with entries from 1 to $n$. The first $k$ values correspond to the elements in $v_1$.
  4. Construct the rankings $r_i$, for $i=2,\ldots,m$, in the following form:
     (a) The first $k$ entries of $r_i$ are copied from $t_i$;
     (b) The rest of the entries form a vector $s_i$ and come from the elements that quit from $t_{i-1}$ plus the elements that, being in $s_{i-1}$, are not included in $t_i$.
     These $n-k$ elements preserve their relative order. This order is not important since these elements are not included in the Top $k$ ranking $t_i$.
  5. From each $t_i$, we construct the corresponding incomplete ranking $a_i$ given by (5).
Example 11.
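Let us consider three Top 4 lists ($v_1$, $v_2$, $v_3$) and construct the corresponding three rankings ($a_1$, $a_2$, $a_3$). Here we have $m=3$ and $k=4$.
$$\begin{array}{ccc}
v_1 & v_2 & v_3\\ \hline
A & B & F\\
B & C & C\\
C & E & B\\
D & A & E
\end{array}
\qquad
\begin{array}{c|ccc}
 & r_1 & r_2 & r_3\\ \hline
t_1,\,t_2,\,t_3 & 1 & 2 & 6\\
 & 2 & 3 & 3\\
 & 3 & 5 & 2\\
 & 4 & 1 & 5\\ \hline
s_1,\,s_2,\,s_3 & 5 & 4 & 1\\
 & 6 & 6 & 4
\end{array}
\qquad
\begin{array}{ccc}
a_1 & a_2 & a_3\\ \hline
1 & 4 & \bullet\\
2 & 1 & 3\\
3 & 2 & 2\\
4 & \bullet & \bullet\\
\bullet & 3 & 4\\
\bullet & \bullet & 1
\end{array}$$
We have denoted as $s_i$ the elements beyond the $k$-th position in each ranking $r_i$. The rankings $a_i$ are constructed by looking at $r_i$ from positions 1 to 4. Since the elements that do not belong to $t_i$ are in $s_i$, we tagged them as •.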
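The five steps of the methodology can be sketched as a single function. This is an illustrative reading, not the authors' code: the name `lists_to_rankings` is hypothetical, the incomplete ranking $a_i$ is assumed to store the position of each tagged song inside the Top $k$ (with `None` in place of •), and the tail $s_i$ is assumed to list first the songs that left $t_{i-1}$ and then the surviving songs of $s_{i-1}$, which is the order seen in Example 11.

```python
def lists_to_rankings(lists):
    """Return (r, a): extended rankings r_i and incomplete rankings a_i (None = absent)."""
    tags = {}
    for v in lists:                                  # step 2: tag songs by first appearance
        for song in v:
            tags.setdefault(song, len(tags) + 1)
    n = len(tags)
    t = [[tags[song] for song in v] for v in lists]

    tail = [e for e in range(1, n + 1) if e not in t[0]]     # s_1: songs outside the first list
    r, a = [], []
    for idx, ti in enumerate(t):
        if idx > 0:                                  # step 4(b): leavers, then survivors of s_{i-1}
            leavers = [e for e in t[idx - 1] if e not in ti]
            tail = leavers + [e for e in tail if e not in ti]
        r.append(ti + tail)                          # steps 3 and 4(a): Top k followed by s_i
        ai = [None] * n                              # step 5: positions inside the Top k only
        for position, e in enumerate(ti, start=1):
            ai[e - 1] = position
        a.append(ai)
    return r, a


# The three Top 4 lists of Example 11
r, a = lists_to_rankings([list("ABCD"), list("BCEA"), list("FCBE")])
print(r[2], a[2])   # [6, 3, 2, 5, 1, 4] [None, 3, 2, None, 4, 1]
```

Only the vectors `a` enter the coefficients; the extended rankings `r` are returned to make the bookkeeping of the songs outside the Top $k$ visible.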

7.2. Comparison of Two Series of Top 200 Rankings

From the site [49] we downloaded the series of Top 200 (Global) rankings corresponding to the following time intervals:
  • 2019 Series: 18 weekly rankings ranging from 28 December 2018 to 3 May 2019.
  • 2020 Series: 18 weekly rankings ranging from 27 December 2019 to 1 May 2020.
The term Global means that the charts were produced from streaming on Spotify from all over the world. By using the methodology explained in the previous section, we convert the 18 downloaded rankings to a series of incomplete rankings (with no ties) $a_1,\ldots,a_{18}$, and we compute our parameters. This is repeated for each considered year. The results are shown in Table 5.
In Table 5 we have denoted by $\langle\bar n_{i,i+1}\rangle$ the average of $\{\bar n_{i,i+1}\}$ for $i=1,2,\ldots,17$, that is, the mean number of common elements over the pairs of consecutive rankings. We see that the number of songs involved in the 2019 series is $n=474$, which is lower than the number for the 2020 series. This fact could indicate that there was more activity in the 2020 series, since more new songs appeared than in the previous year. By extension, we can also conclude that the activity of the users on Spotify was higher in the 2020 series.
The same tendency is observed by looking at $N_{inc}$ and $\bar n_{i,i+1}$. Our coefficients $NS$ and $\widehat{NS}$ corroborate this intuition since they take higher values in the 2020 Series than in the 2019 Series. Analogously, by looking at $\tau_{ev}$ and $\hat\tau_{ev}$, we see a decrease when comparing the 2019 Series with the 2020 Series. Recall that the coefficients $NS$ and $\widehat{NS}$ introduced in this paper offer a measure of the movements in the rankings, since they take into account the number of crossings and, given that in this case there are no ties, the effect of absent elements.
In the same manner as we did in Example 10, we can construct the projected graph corresponding to the crossings for each series. We show these graphs in Figure 3, which have been plotted with MATLAB by using the option “subspace”.

7.3. Comparison of Two Series of Viral-50 Rankings

From the site [50], we downloaded the series of Viral 50 (Global) weekly rankings corresponding to the following periods:
  • 2019 Series: 18 weekly rankings ranging from 3 January 2019 to 2 May 2019.
  • 2020 Series: 18 weekly rankings ranging from 2 January 2020 to 30 April 2020.
For each considered year, we convert the 18 downloaded rankings to a series of incomplete rankings (with no ties) $a_1,\ldots,a_{18}$, and we compute again the aforementioned parameters. The results are shown in Table 6.
The number of songs involved in the 2019 series is $n=315$, which is greater than the number involved in the 2020 series. This fact could indicate that there was less viral activity in the 2020 series, since fewer new songs appeared than in the previous year. The same tendency is observed in $N_{inc}$. This intuition is corroborated by our coefficients $NS$ and $\widehat{NS}$, since they take lower values in the 2020 series than in the 2019 series. We also see an increase in $\tau_{ev}$ and $\hat\tau_{ev}$ when comparing the 2019 series with the 2020 series.
If we compare these results with those obtained in the previous section, we conclude that Spotify’s viral activity was negatively affected by the pandemic. This may seem reasonable since many events that produce sharing in social networks, such as shows, new releases, and performances, were postponed during these months, as we have already discussed. We again plot the projected graph corresponding to the crossings for each series in Figure 4.

7.4. Comparison of a Series of Top 200 and a Series of Viral 50 Rankings

Given that our coefficients $\tau_{ev}$, $\hat\tau_{ev}$, $NS$, and $\widehat{NS}$ are normalized, we can compare series of rankings of different types. Looking at Table 5 and Table 6, we conclude (e.g., looking at $\widehat{NS}$) that the Viral-50 rankings present more activity than the Top 200 rankings. For example, in the 2019 series the value of $\widehat{NS}$ is 0.1982 for the Viral-50 rankings, and only 0.0730 for the Top 200 rankings. This conclusion seems reasonable, taking into account that the Viral-50 rankings are constructed by looking at the behaviours of songs that may rapidly change, since they are viral phenomena.

7.5. Comparison of the Evolution of Two Series of Incomplete Ranking with Ties

The Spotify Top 200 charts do not present ties, but we can construct incomplete rankings with ties if we take into account the Top 200 ranking and the rest of the songs that appear in the whole studied interval. In detail, to obtain a series of incomplete rankings with ties from a Top 200 series on Spotify, we consider the whole list of tracks along the $m$ rankings and focus on what happens in positions greater than 200. Using the terminology of Example 11, we consider the elements that appear in the rankings denoted as $s_1,s_2,\ldots$. In these rankings we consider the following:
  (i) All the tracks in $s_1$ are tied. That is, $a_1=[\bullet_{1,200}\;\;\mathbf{1}_{n-200}]$, where $\bullet_{1,200}$ is a row vector of 200 entries of the type •, and $\mathbf{1}_{n-200}$ is the row vector of all-ones with $n-200$ entries, $n$ being the total number of different tracks in the $m$ rankings.
  (ii) For $i=2,3,\ldots,m$, we consider that in $s_i$ we have (at most) two buckets of tied elements. In one bucket we have the elements (if any) that come from $t_{i-1}$. In the other bucket, we consider the rest of the elements of $s_i$.
The next example with a series of $m=7$ Top 4 charts illustrates this methodology.
Example 12.
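Let us consider the series of seven Top 4 tracks $v_i$ with $n=10$ elements $\{A,B,\ldots,J\}$ given by the rankings
$$\begin{array}{ccccccc}
v_1 & v_2 & v_3 & v_4 & v_5 & v_6 & v_7\\ \hline
A & A & F & G & G & G & J\\
B & B & E & H & H & I & C\\
C & E & A & C & C & B & A\\
D & F & C & E & E & A & H
\end{array}$$
from these rankings we construct the rankings $t_i$ and $s_i$ to obtain the rankings in the form
$$\begin{array}{c|ccccccc}
t_i & 1 & 1 & 6 & 7 & 7 & 7 & 10\\
 & 2 & 2 & 5 & 8 & 8 & 9 & 3\\
 & 3 & 5 & 1 & 3 & 3 & 2 & 1\\
 & 4 & 6 & 3 & 5 & 5 & 1 & 8\\ \hline
s_i & 5 & 3 & 2 & 6 & 6 & 8 & 7\\
 & 6 & 4 & 4 & 1 & 1 & 3 & 9\\
 & 7 & 7 & 7 & 2 & 2 & 5 & 2\\
 & 8 & 8 & 8 & 4 & 4 & 6 & 5\\
 & 9 & 9 & 9 & 9 & 9 & 4 & 6\\
 & 10 & 10 & 10 & 10 & 10 & 10 & 4
\end{array}$$
Now, we consider the rankings $s_i$ as a series of incomplete rankings with ties with the convention explained above and we compute the corresponding $a_i$ vectors to obtain the rankings
$$\begin{array}{ccccccc}
a_1 & a_2 & a_3 & a_4 & a_5 & a_6 & a_7\\ \hline
\bullet & \bullet & \bullet & 1 & 1 & \bullet & \bullet\\
\bullet & \bullet & 1 & 2 & 1 & \bullet & 1\\
\bullet & 1 & \bullet & \bullet & \bullet & 1 & \bullet\\
\bullet & 1 & 2 & 2 & 1 & 2 & 2\\
1 & \bullet & \bullet & \bullet & \bullet & 1 & 2\\
1 & \bullet & \bullet & 1 & 1 & 2 & 2\\
1 & 2 & 2 & \bullet & \bullet & \bullet & 1\\
1 & 2 & 2 & \bullet & \bullet & 1 & \bullet\\
1 & 2 & 2 & 2 & 1 & \bullet & 1\\
1 & 2 & 2 & 2 & 1 & 2 & \bullet
\end{array}$$
Note that, since there are at most two buckets, the entries of $a_i$ belong to the set $\{1,2,\bullet\}$. Note also that in $s_5$ there is only one bucket.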
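The two-bucket convention can be sketched as follows. This is a minimal illustration under assumptions read off from Example 12 rather than stated as a rule in the text: the bucket of songs that have just left the Top $k$ is given position 1, the remaining songs of $s_i$ position 2, and when nobody has left the single bucket takes position 1. The function name `tail_rankings_with_ties` is hypothetical, and `None` stands for •.

```python
def tail_rankings_with_ties(t, n):
    """Build a_i for the tails s_i from the tagged Top k lists t_1, ..., t_m."""
    series = []
    previous = None
    for ti in t:
        in_top = set(ti)
        # Songs that were in the previous Top k and have dropped out of the current one
        leavers = set() if previous is None else set(previous) - in_top
        a = []
        for element in range(1, n + 1):
            if element in in_top:
                a.append(None)           # ranked in the Top k, hence absent from s_i
            elif not leavers or element in leavers:
                a.append(1)              # first (or only) bucket
            else:
                a.append(2)              # second bucket: rest of s_i
        series.append(a)
        previous = ti
    return series


# Tagged Top 4 lists t_1, ..., t_7 of Example 12 (n = 10)
t = [[1, 2, 3, 4], [1, 2, 5, 6], [6, 5, 1, 3], [7, 8, 3, 5],
     [7, 8, 3, 5], [7, 9, 2, 1], [10, 3, 1, 8]]
for a in tail_rankings_with_ties(t, 10):
    print(a)
```

Each printed list corresponds to one column of the last table of Example 12, read from top to bottom.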
By using this methodology, we have converted the series of rankings studied in Section 7.2 to the corresponding series $a_i$ with ties. The parameters obtained are shown in Table 7.
If we look at $n$, $N_{inc}$, $\langle\bar n_{i,i+1}\rangle$, and $\widehat{NS}$ in Table 7, we conclude that there has been more activity in the 2020 Series than in the 2019 Series. However, by looking at $NS$ (and $\tau_{ev}$), the conclusion seems to be the reverse. Here we see, therefore, that $\tau_{ev}$ and $\hat\tau_{ev}$ can present different tendencies. This is related to the form in which they are normalized, as we have commented in Remark 5 and in Section 6. These results provide an example of how the transformation from $\tau_{ev}$ to $\hat\tau_{ev}$ is not linear, since $\tau_{ev}$ increases from 2019 to 2020 but $\hat\tau_{ev}$ decreases in the same period.
In Figure 5, we show the plot of the giant component corresponding to the projected graph showing the interactions of the form tie to untie or vice versa. That is, there is a link between elements (nodes) $i$ and $j$ when the pair $\{i,j\}$ goes from tie to untie (or vice versa) in any pair of consecutive rankings $a_k$ and $a_{k+1}$. We see many more interactions of this type in the 2020 series than in the 2019 series.
In Figure 6, we show the plot of the giant component corresponding to the projected graph showing the interactions of the form tie to tie, that is, there is a link between elements (nodes) $i$ and $j$ when the pair $\{i,j\}$ goes from tie to tie in any pair of consecutive rankings $a_k$ and $a_{k+1}$. We also see many more interactions of this type in the 2020 series than in the 2019 series.
Therefore, and taking into account the values of Table 7, we can conclude (for this artificial model of incomplete ranking with ties) that there was more activity in the 2020 series than in the 2019 series.
We have shown the application of the new coefficients introduced in this work, as well as the utility of the visualizations based on the projected graph plots of the (evolutive) competitive graph associated with a series of incomplete rankings with or without ties.

8. Conclusions

We present the main conclusions of our work:
  • We provide a theoretical result that allows for understanding, in terms of the type of interactions between pairs of elements in a series of incomplete rankings with ties, two recently introduced coefficients, given in [4,12].
  • We have defined two new coefficients to characterize a series of incomplete rankings with ties in terms of the interactions mentioned above.
  • We have presented a methodology to treat Spotify charts (both Top 200 and Viral 50) as a series of incomplete rankings. This methodology allows us to obtain conclusions about the movements in the lists and, therefore, on the activity of the users of the app.
  • We have obtained an artificial series of incomplete rankings with ties based on Spotify Top 200 lists, to apply our coefficients and show the applicability of the method.
  • The main theoretical result (Theorem 1) may serve to define new coefficients by giving weight to the interactions between pairs of elements when going from one ranking to the next one. The applications can be of interest in other fields (neuroscience, sports, bioinformatics, etc.).

Author Contributions

All authors contributed equally to this paper and have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Spanish Government, Ministerio de Economía y Competitividad, grant number MTM2016-75963-P.

Acknowledgments

We thank the four anonymous reviewers for their constructive comments, which helped us to improve the readability of the manuscript.

Conflicts of Interest

The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.

References

  1. Langville, A.N.; Meyer, C.D. Who’s #1: The Science of Rating and Ranking; Princeton University Press: Princeton, NJ, USA, 2012.
  2. Kemeny, J.G.; Snell, J.L. Preference rankings. An axiomatic approach. In Mathematical Models in the Social Sciences, 2nd ed.; The MIT Press: Cambridge, MA, USA, 1973; pp. 9–23.
  3. Diaconis, P.; Graham, R.L. Spearman’s Footrule as a Measure of Disarray. J. R. Stat. Soc. B Met. 1977, 39, 262–268.
  4. Moreno-Centeno, E.; Escobedo, A.R. Axiomatic aggregation of incomplete rankings. IIE Trans. 2016, 48, 475–488.
  5. Criado, R.; García, E.; Pedroche, F.; Romance, M. A new method for comparing rankings through complex networks: Model and analysis of competitiveness of major European soccer leagues. Chaos 2013, 23, 043114.
  6. Fortune 500. Available online: https://fortune.com/fortune500/ (accessed on 24 September 2020).
  7. Academic Ranking of World Universities ARWU 2020. Available online: http://www.shanghairanking.com/ARWU2020.html (accessed on 24 September 2020).
  8. CWTS Leiden Ranking 2020. Available online: https://www.leidenranking.com/ranking/2020/list (accessed on 24 September 2020).
  9. Billboard. The Hot 100. Available online: https://www.billboard.com/charts/hot-100 (accessed on 24 September 2020).
  10. Fagin, R.; Kumar, R.; Mahdian, M.; Sivakumar, D.; Vee, E. Comparing Partial Rankings. SIAM J. Discrete Math. 2006, 20, 628–648.
  11. Cook, W.D.; Kress, M.; Seiford, L.M. An axiomatic approach to distance on partial orderings. Rairo-Rech. Oper. 1986, 20, 115–122.
  12. Yoo, Y.; Escobedo, A.R.; Skolfield, J.K. A new correlation coefficient for comparing and aggregating non-strict and incomplete rankings. Eur. J. Oper. Res. 2020, 285, 1025–1041.
  13. Pedroche, F.; Criado, R.; García, E.; Romance, M.; Sánchez, V.E. Comparing series of rankings with ties by using complex networks: An analysis of the Spanish stock market (IBEX-35 index). Netw. Heterog. Media 2015, 10, 101–125.
  14. Criado, R.; García, E.; Pedroche, F.; Romance, M. On graphs associated to sets of rankings. J. Comput. Appl. Math. 2016, 291, 497–508.
  15. Kendall, M.G. A New Measure of Rank Correlation. Biometrika 1938, 30, 81–89.
  16. Kendall, M.G. Rank Correlation Methods, 4th ed.; Griffin: London, UK, 1970.
  17. Kendall, M.G.; Babington-Smith, B. The Problem of m Rankings. Ann. Math. Stat. 1939, 10, 275–287.
  18. Bogart, K.P. Preference structures I: Distances between transitive preference relations. J. Math. Sociol. 1973, 3, 1.
  19. Bogart, K.P. Preference Structures II: Distances Between Asymmetric Relations. SIAM J. Appl. Math. 1975, 29, 254–262.
  20. Cicirello, V.A. Kendall Tau Sequence Distance: Extending Kendall Tau from Ranks to Sequences. arXiv 2019, arXiv:1905.02752v3.
  21. Armstrong, R.A. Should Pearson’s correlation coefficient be avoided? Ophthal. Physl. Opt. 2019, 39, 316–327.
  22. Redman, W. An O(n) method of calculating Kendall correlations of spike trains. PLoS ONE 2019, 14, e0212190.
  23. Pihur, V.; Datta, S.; Datta, S. RankAggreg, an R package for weighted rank aggregation. BMC Bioinform. 2009, 10.
  24. Pnueli, A.; Lempel, A.; Even, S. Transitive orientation of graphs and identification of permutation graphs. Can. J. Math. 1971, 23, 160–175.
  25. Gervacio, S.; Rapanut, T.; Ramos, P. Characterization and construction of permutation graphs. Open J. Discrete Math. 2013, 3, 33–38.
  26. Golumbic, M.; Rotem, D.; Urrutia, J. Comparability graphs and intersection graphs. Discrete Math. 1983, 43, 37–46.
  27. Emond, E.J.; Mason, D.W. A New Rank Correlation Coefficient with Application to the Consensus Ranking Problem. J. Multi-Crit. Decis. Anal. 2002, 11, 17–28.
  28. Spotify Reports Second Quarter 2020 Earnings. 29 July 2020. Available online: https://newsroom.spotify.com/2020-07-29/spotify-reports-second-quarter-2020-earnings (accessed on 24 September 2020).
  29. Spotify. Company info. Available online: https://newsroom.spotify.com/company-info/ (accessed on 24 September 2020).
  30. Business Wire. 29 April 2020. Available online: https://www.businesswire.com/news/home/20200429005216/en/ (accessed on 24 September 2020).
  31. Swanson, K. A Case Study on Spotify: Exploring Perceptions of the Music Streaming Service. J. Music Entertain. Ind. Educ. Assoc. 2013, 13, 207–230.
  32. Warren, T. Microsoft Retires Groove Music Service, Partners with Spotify. The Verge. Vox Media. 2 October 2017. Available online: https://www.theverge.com/2017/10/2/16401898/microsoft-groove-music-pass-discontinued-spotify-partner (accessed on 24 September 2020).
  33. Lempel, E. Spotify Launches on PlayStation Music Today. (30 March 2015) Sony. Available online: https://blog.playstation.com/2015/03/30/spotify-launches-on-playstation-music-today/ (accessed on 24 September 2020).
  34. Perez, S. You Can Now Share Music from Spotify to Facebook Stories. Techcrunch.com. 31 August 2019. Available online: https://techcrunch.com/2019/08/30/you-can-now-share-music-from-spotify-to-facebook-stories (accessed on 24 September 2020).
  35. Mähler, R.; Vonderau, P. Studying Ad Targeting with Digital Methods: The Case of Spotify. Cult. Unbound. 2017, 9, 212–221. Available online: https://cultureunbound.ep.liu.se/article/view/1820 (accessed on 24 September 2020).
  36. Van den Hoven, J. Analyzing Spotify Data. Exploring the Possibilities of User Data from a Scientific and Business Perspective. (Supervised by Sandjai Bhulai). Report from Vrije Universiteit Amsterdam. August 2015. Available online: https://www.math.vu.nl/~sbhulai/papers/paper-vandenhoven.pdf (accessed on 24 September 2020).
  37. Greenberg, D.V.; Kosinski, M.; Stillwell, D.V.; Monteiro, B.L.; Levitin, D.J.; Rentfrow, P.J. The Song Is You: Preferences for Musical Attribute Dimensions Reflect Personality. Soc. Psychol. Pers. Sci. 2016, 7, 597–605.
  38. Eriksson, M.; Fleischer, R.; Johansson, A.; Snickars, P.; Vonderau, P. Spotify Teardown. Inside the Black Box of Streaming Music; The MIT Press: Cambridge, MA, USA, 2019.
  39. Spotify Charts Regional. Available online: https://spotifycharts.com/regional (accessed on 24 September 2020).
  40. Harris, M.; Liu, B.; Park, C.; Ramireddy, R.; Ren, G.; Ren, M.; Yu, S.; Daw, A.; Pender, J. Analyzing the Spotify Top 200 Through a Point Process Lens. arXiv 2019, arXiv:1910.01445v1.
  41. Aguiar, L.; Waldfogel, J. Platforms, Promotion, and Product Discovery: Evidence from Spotify Playlists; JRC Digital Economy Working Paper. No. 2018-04; European Commission, Joint Research Centre (JRC): Seville, Spain, 2018.
  42. Lawler, R. Spotify Charts Launch Globally, Showcase 50 Most Listened to and Most Viral Tracks Weekly. Engadget. 21 May 2013. Available online: https://www.engadget.com/2013-05-21-spotify-charts-launch.html (accessed on 24 September 2020).
  43. Spotify says its Viral-50 chart reaches the parts other charts don’t. Music Ally Blog. 15 July 2014. Available online: https://musically.com/2014/07/15/spotify-says-its-viral-50-chart-reaches-the-parts-other-charts-dont/ (accessed on 24 September 2020).
  44. Stassen, M. Spotify Reveals New Viral 50 Chart. MusicWeek. 15 July 2014. Available online: https://www.musicweek.com/news/read/spotify-launches-the-viral-50-chart/059027 (accessed on 24 September 2020).
  45. Bertoni, S. How Spotify Made Lorde A Pop Superstar. Forbes, 26 November 2013.
  46. Ingham, T. Record Companies Aren’t Safe From the Coronavirus Economic Fallout. Rolling Stone, 31 March 2020.
  47. Warner Music Group Corp. Reports Results for Fiscal Second Quarter Ended 31 March 2020. Available online: https://www.wmg.com/news/warner-music-group-corp-reports-results-fiscal-second-quarter-ended-march-31-2020-34751 (accessed on 24 September 2020).
  48. Joven, J.; Rosenborg, R.A.; Seekhao, N.; Yuen, M. COVID-19’s Effect on the Global Music Business, Part 1: Genre. Available online: https://blog.chartmetric.com/covid-19-effect-on-the-global-music-business-part-1-genre/ (accessed on 23 April 2020).
  49. Spotify. Top 200. Available online: https://spotifycharts.com/regional/global/weekly (accessed on 24 September 2020).
  50. Spotify Charts. Available online: https://spotifycharts.com/viral/ (accessed on 24 September 2020).
Figure 1. Projected weighted graphs representing the pairs of elements that contribute to crossings (left panel) and the pairs corresponding to the case tie to untie or vice versa (right panel), occurring in Example 10.
Figure 2. Projected weighted graph representing the cases of crossing after ties that occur in Example 10.
Figure 3. Graph based on crossings corresponding to the giant connected component of Top 200 2019 Series (left panel, 360 nodes, 16,115 edges) and Top 200 2020 Series (right panel, 374 nodes, 16,564 edges).
Figure 4. Graph based on crossings corresponding to the giant connected component of Viral-50 2019 Series (left panel, 185 nodes, 1685 edges) and Viral-50 2020 Series (right panel, 186 nodes, 1447 edges).
Figure 5. Graph based on crossings of the type from tie to untie or vice versa corresponding to the giant connected component of 2019 Series (left, 377 nodes, 49,915 edges) and 2020 Series (right, 457 nodes, 82,051 edges). We also have 97 isolated nodes in the 2019 Series and 99 in the 2020 Series.
Figure 6. Graph based on crossings of the type from tie to tie corresponding to the giant connected component of 2019 Series (left, 382 nodes, 65,269 edges) and 2020 Series (right, 462 nodes, 101,101 edges).
Table 1. Number of pairs $\{i, j\}$ corresponding to each type for the complete cases.

| Type | Number of Pairs |
|---|---|
| C.1 | $n_{nc}$ |
| C.2 | $s$ |
| C.3 | $n_{tu}$ |
| C.4 | $n_{tt}$ |

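The four counts of Table 1 can be illustrated with a short sketch for two complete rankings with ties. The mapping used here is an assumption inferred from the figure captions rather than a restatement of the paper's definitions: C.1 is read as pairs that keep their relative order, C.2 as pairs that cross, C.3 as pairs that go from tie to untie or vice versa, and C.4 as pairs that go from tie to tie. The function name `count_pair_types` and the dictionary keys are hypothetical labels.

```python
from itertools import combinations


def sign(x):
    """Return -1, 0, or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def count_pair_types(a, b):
    """Classify every pair {i, j} of two complete rankings with ties.

    `a` and `b` map each element to its position; both must rank the
    same elements (complete case).
    """
    counts = {"n_nc": 0, "s": 0, "n_tu": 0, "n_tt": 0}
    for i, j in combinations(sorted(a), 2):
        da = sign(a[i] - a[j])  # relative order of i and j in ranking a
        db = sign(b[i] - b[j])  # relative order of i and j in ranking b
        if da == 0 and db == 0:
            counts["n_tt"] += 1  # tied in both rankings
        elif da == 0 or db == 0:
            counts["n_tu"] += 1  # tied in exactly one ranking
        elif da == db:
            counts["n_nc"] += 1  # same strict order: no crossing
        else:
            counts["s"] += 1     # opposite strict order: crossing
    return counts


# Hypothetical rankings of four elements.
a = {"p": 1, "q": 2, "r": 2, "t": 4}
b = {"p": 2, "q": 1, "r": 3, "t": 3}
counts = count_pair_types(a, b)
print(counts)
```

Under this reading the four counts always add up to the total number of pairs, here $\binom{4}{2} = 6$, because every pair falls into exactly one type.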
Table 2. Number of pairs $\{i, j\}$ corresponding to each type for the incomplete cases.
TypeNumber of Pairs { i , j }
I.1 n 2
I.2 n ( n + n )
I.3 n n + n n
I.4 i = 1 n a n i 2 + i = 1 n b n i 2
I.5 i = 1 n a n i n i + i = 1 n b n i n i
I.6 n 2 + n 2 i = 1 n a n i 2 i = 1 n b n i 2
I.7 n ( n + n ) i = 1 n a n i n i i = 1 n b n i n i
Table 3. Number of pairs $\{i, j\}$ that have some •, corresponding to Example 2. Note that the sum over all the types is, by the definition in (11), $N_{inc}$.
TypeNumber of Pairs { i , j }
I.1 n 2 = 1
I.2 n ( n + n ) = 8
I.3 n n + n n = 11
I.4 i = 1 n a n i 2 + i = 1 n b n i 2 = 1
I.5 i = 1 n a n i n i + i = 1 n b n i n i = 2
I.6 n 2 + n 2 i = 1 n a n i 2 i = 1 n b n i 2 = 2
I.7 n ( n + n ) i = 1 n a n i n i i = 1 n b n i n i = 14
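Table 4. Parameters for pairs of consecutive rankings. Example 10.

|  | $a_1$, $a_2$ | $a_2$, $a_3$ | $a_3$, $a_4$ | $a_4$, $a_5$ | $a_5$, $a_6$ |
|---|---|---|---|---|---|
| n | 1 | 0 | 0 | 0 | 0 |
| n | 1 | 1 | 0 | 1 | 0 |
| n | 0 | 0 | 1 | 0 | 3 |
| n | 6 | 7 | 7 | 7 | 5 |
| $s$ | 9 | 12 | 8 | 9 | 4 |
| $n_{tu}$ | 2 | 0 | 2 | 1 | 1 |
| $n_{tt}$ | 0 | 0 | 0 | 0 | 0 |
| $N_{inc}$ | 13 | 7 | 7 | 7 | 18 |
| $\bar{n}$ | 6 | 7 | 7 | 7 | 5 |
| $\widehat{\tau}_{ev}$ | −0.3333 | −0.1429 | 0.1429 | 0.0952 | 0.1000 |
| $\widehat{NS}$ | 0.6667 | 0.5714 | 0.4268 | 0.4524 | 0.4500 |
| $\tau_{ev}$ | −0.1786 | −0.1071 | 0.1071 | 0.0714 | 0.0357 |
| $NS$ | 0.5893 | 0.5536 | 0.4464 | 0.4643 | 0.4821 |
| $\tau_x$ | −0.1786 | −0.1071 | 0.1071 | 0.0714 | 0.0357 |
| $\widehat{\tau}_x$ | −0.3333 | −0.1429 | 0.1429 | 0.0952 | 0.1000 |
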
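The $N_{inc}$ row of Table 4 can be cross-checked by hand. The following is only a sketch under assumptions that this section does not state explicitly: that Example 10 involves $n = 8$ elements, that $\bar{n}$ is the number of elements ranked in both rankings of a pair, and that $N_{inc}$ counts the pairs $\{i, j\}$ in which at least one element is unranked in at least one of the two rankings. Under those assumptions, the first and last columns would give:

```latex
\begin{align*}
(a_1, a_2):\quad & \binom{8}{2} - \binom{6}{2} = 28 - 15 = 13 = N_{inc},\\
(a_5, a_6):\quad & \binom{8}{2} - \binom{5}{2} = 28 - 10 = 18 = N_{inc}.
\end{align*}
```

The three middle columns, with $\bar{n} = 7$, give $28 - 21 = 7$ in the same way.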
Table 5. Parameters for two series of incomplete rankings obtained from Spotify Top 200 lists.

|  | 2019 Series | 2020 Series |
|---|---|---|
| $n$ | 474 | 556 |
| $N_{inc}$ | $1.6 \times 10^{6}$ | $2.4 \times 10^{6}$ |
| $\langle \bar{n}_{i,i+1} \rangle$ | 182 | 175 |
| $\tau_{ev}$ | 0.1256 | 0.0836 |
| $NS$ | 0.4372 | 0.4582 |
| $\widehat{\tau}_{ev}$ | 0.8540 | 0.8421 |
| $\widehat{NS}$ | 0.0730 | 0.0789 |

Table 6. Parameters for two series of incomplete rankings obtained from Spotify Viral 50 lists.

|  | 2019 Series | 2020 Series |
|---|---|---|
| $n$ | 315 | 300 |
| $N_{inc}$ | $8.3 \times 10^{5}$ | $7.5 \times 10^{5}$ |
| $\langle \bar{n}_{i,i+1} \rangle$ | 33.6 | 35 |
| $\tau_{ev}$ | 0.0067 | 0.0093 |
| $NS$ | 0.4966 | 0.4954 |
| $\widehat{\tau}_{ev}$ | 0.6037 | 0.6922 |
| $\widehat{NS}$ | 0.1982 | 0.1539 |

Table 7. Series of incomplete rankings with ties obtained from Spotify Top 200 charts.

|  | 2019 Series | 2020 Series |
|---|---|---|
| $n$ | 474 | 556 |
| $N_{inc}$ | $1.4 \times 10^{6}$ | $1.7 \times 10^{6}$ |
| $\langle \bar{n}_{i,i+1} \rangle$ | 256 | 331 |
| $\tau_{ev}$ | 0.2577 | 0.3108 |
| $NS$ | 0.3712 | 0.3446 |
| $\widehat{\tau}_{ev}$ | 0.8848 | 0.8757 |
| $\widehat{NS}$ | 0.0576 | 0.0621 |

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Share and Cite

MDPI and ACS Style

Pedroche, F.; Conejero, J.A. Corrected Evolutive Kendall’s τ Coefficients for Incomplete Rankings with Ties: Application to Case of Spotify Lists. Mathematics 2020, 8, 1828. https://doi.org/10.3390/math8101828
