To Rank or Not to Rank with Indices? That Is the Question

Abstract: Ranking countries via index-based league tables is now commonplace and is said by its proponents to provide countries with an ability to compare performance with their peers, spurring them to learn from others and make improvements. The Human Development Index (HDI) is arguably one of the most influential indices of its type in terms of reporting within the media and influence on development policy and funding allocation. It is often used as part of a suite of indices to assess sustainability. The index was first published in the Human Development Report (HDR) of 1990 and has appeared in each of the HDRs published since then. This paper reports the first research of its type designed to explore the impacts of methodological changes over 28 years (1991 to 2018) on the ranks of a sample of 135 countries appearing in the HDRs. Results suggest that methodological changes in the HDI have had a statistically significant impact on the ranking of the majority (82%) of countries in the sample, and the ranks of countries that tend to appear towards the top, middle, or bottom of the HDI league table are just as likely to be influenced by changes in HDI methodology. The paper suggests that after nearly 30 years of the HDI, there is an urgent need for independent and empirical research on the changes that it has helped bring about.


Introduction
The ongoing COVID-19 tragedy has, at the time of writing, been at least partially responsible for the deaths of over 400,000 people globally, and the number of infections is estimated to be over 700 million. These are sobering and upsetting figures, especially since the virus was only confirmed in countries outside of China in January 2020, barely six months ago. Every day since 30 March 2020, the UK government has been giving daily briefings on the state of play with COVID-19, although prior to that there were various statements from the prime minister and other senior politicians. The daily briefings have followed a broadly consistent format of a member of the government (e.g., minister for health) making a statement, followed by a government expert (e.g., chief medical officer) presenting a series of slides showing the progression of the disease in the UK. In the period between 30 March and 9 May 2020, one of the key slides each day comprised a comparison of the UK (England, Scotland, Wales, and Northern Ireland) with other countries in terms of cumulative deaths that could be ascribed to COVID-19. Figure 1 is a graph based on a compilation of COVID-19 mortality data for the UK and a number of European countries. The data used for Figure 1 are those provided by the UK government in a spreadsheet released for each of the daily briefings and have not been adjusted to account for differences in population size or indeed any other factors. Day zero was taken to be the first day when 50 or more deaths attributed to COVID-19 were reported in each country, and the lines are cumulative numbers of deaths. In the UK, this was initially reported in terms of deaths taking place in hospitals but was later replaced by a more accurate figure based on deaths in all settings: primarily hospitals and care homes for the elderly.
The intention was clearly to show a comparison between the UK and other countries in terms of disease progression, even though it was acknowledged that there are differences between them in the ways in which the data were collected. Nonetheless, the graph clearly shows how heavily the UK was affected by the disease, far more so than any of the other European countries. As a result of this stark picture, questions were continually asked of the politicians and government experts as to why the UK was doing so badly. What was it about the UK that resulted in such a high number of deaths due to COVID-19?

Figure 1. Cumulative number of deaths ascribed to COVID-19 in some European countries. Notes: Day zero is taken to be the first day when the number of deaths reached 50 or more in each of the respective countries. There are two lines for the UK; one is for deaths reported in hospitals and the other is for all settings (primarily hospitals and care homes for the elderly). Sources: The graph has been compiled from UK Government data provided at each of the daily briefings. The data can be accessed at www.gov.uk/government/collections/slides-and-datasets-to-accompany-coronavirus-press-conferences.

However, this "ranking" of countries via line graphs of mortality attributed to COVID-19 ceased as of 9 May, the last briefing where the graph was presented. The sudden omission of the ranking chart received a fair level of criticism in the UK press and indeed from politicians. Interestingly, there was a parallel debate around the same time amongst experts about the value of such ranking.
Professor Sir David Spiegelhalter, Chairman of the Winton Centre for Risk and Evidence Communication at the University of Cambridge, wrote an article published in the Guardian newspaper on 30 April 2020 which included the following: Every country has different ways of recording Covid-19 deaths: the large number of untested deaths in care homes have not featured in Spain's statistics, which, like the UK's, require a positive test result. The numbers may be useful for looking at trends, but they are not reliable indicators for comparing the absolute levels … But, of course, people are not so interested in the numbers themselves; they want to say why they are so high, and ascribe blame. But if it's difficult to rank this country, it's even trickier to give reasons for our position. (emphasis added by author) This article was picked up by both the government and its experts as a reason why it is not good to have rankings of the UK with other countries. They all pointed to differences in the way data are collected across countries, and indeed the ways in which mortality from COVID-19 was defined, points which Professor Spiegelhalter highlighted in his article. Thus, the argument goes, if data are not strictly comparable, then country rankings should cease. Indeed, Professor Spiegelhalter began to take issue with the use of his article and critique of rankings by politicians, and in a later edition of the Guardian there can be found the following rebuff: A statistician has asked the government to stop using an article he wrote for the Guardian as justification for why Britain's death toll from coronavirus should not be compared with that of other countries. Prof David Spiegelhalter said in the piece published on 30 April that comparing the number of deaths from Covid-19 between countries was difficult because of the different methodologies used by governments to measure deaths.
The day the article was published, England's chief medical officer, Prof Chris Whitty, praised it during the No 10 daily coronavirus briefing, saying it showed that comparing death rates in different countries was a "fruitless exercise" … Boris Johnson again referred to Spiegelhalter's words on Wednesday in a response to the Labour leader, Keir Starmer, during prime minister's questions, after Britain's death toll became the highest in Europe and second highest globally … However, a few hours later Spiegelhalter tweeted: "Polite request to PM and others: please stop using my Guardian article to claim we cannot make any international comparisons yet. I refer only to detailed league tables; of course we should now use other countries to try and learn why our numbers are high." (Harry Taylor, Guardian, 6 May 2020; emphasis added by author) The message now appears to be far more nuanced; league table rankings appear to be unacceptable ("fruitless exercise") while it seems other forms of "country comparisons" are acceptable if they help to provide "learning". However, this debate over the validity and presumed usefulness of country rankings in league tables is just a recent (albeit intense, given the health and political ramifications) incarnation of one that is much older within the community of researchers and practitioners working with indicators, including those used to assess sustainability. The contextualized and selective use of indices by users has been well known and reported in the literature for many years (a discussion for sustainability indicators can be found in [1]) and, ironically, the UK has been something of a leader in this field, as league table rankings have been embraced by governments of all political shades and have long existed in many sectors, including education and health.
It often comes down to a fundamental question: to rank or not to rank performance with what are, by definition, a simplified set of indicators or even a single index?

Literature Review
This paper is not about COVID-19 or indeed about the UK or international response to the pandemic. Analyses of the epidemiology of the disease and the effectiveness (or not) of the various responses by agencies are best left to a later date. Instead, the paper will build on the prompt provided by the latest manifestation of the league table ranking debate surrounding the use of indices of country performance. It will seek to re-open that question by focusing on one of the oldest indices still published routinely in country league table format: the Human Development Index (HDI).
The HDI is arguably one of the most influential indices of its type in terms of reporting within the media and influence on development policy and allocation of funding. It was developed during the 1970s and 1980s [2], and it is often said that the motive was largely a desire by the United Nations Development Programme (UNDP) to try to move the development discourse away from what it saw as a strong emphasis at the time from other powerful international agencies, such as the World Bank, on economic development and towards a more balanced "human" development. Since 1990, the UNDP has published country league table rankings of the HDI as one of the tables within its annual Human Development Reports (HDRs), usually as the first in a suite of indicator tables at the end of the report [3,4]. The HDI encapsulated some of the social indicators readily available at country level at the time and combined them with a proxy measure of income (initially gross domestic product (GDP) or GDP/capita) within a theoretical framework of human development that drew heavily from Nobel Prize winner Amartya Sen's work on "capabilities" [5]. From the very beginning, the HDI was intended to allow the allocation of a single "headline" number to each country to "capture" its human development, rather than relying on a suite of separate social indicators. The use of a single index to capture human development meant that countries could be ranked in a league table format, with the best performers at the top and the worst at the bottom. Thus, countries could easily compare their performance against those they consider to be their peer group. The logic here is that the government of a poorly performing country in the league table will feel external and internal pressure, an example of the latter being pressure from media reporting [6], to improve its standing.
In almost complete contradiction to the point made by Professor Spiegelhalter, with such league table rankings it can be argued that the value of the HDI becomes less important and what matters is where a country is ranked relative to peers [7,8]. Indeed, the kind of in-depth strategic debate that can be had here about improving a country's HDI rank is illustrated in a paper by Bryane (2018) for Brunei [9].
Since 1990, the underlying conceptualization of the index as combining three components has remained intact:

1. Life expectancy, as a proxy measure for health. It is assumed here that people cannot improve their livelihood unless they are healthy.
2. Education. It is assumed here that higher levels of education provide the capability to develop, as they provide, amongst other things, opportunities for employment and career development.
3. Income (proxied by GDP/capita). It is assumed here that people need financial capital to help improve their livelihood options through the purchase of goods and services.
Thus, good health, education, and income were seen by the creators of the HDI as key to providing the basis for people to break out of low human development and were considered to be of equal importance [10]. Indeed, the HDI has often been seen, rightly or wrongly, as a measure of quality of life and it has often been included within suites of indicators intended to assess sustainability.
The UNDP has consistently refused to expand the components of the HDI and has argued that simplicity is a vital requirement for transparency [11–15]. While the creators of the HDI have not altered its "soul" in any way, they have made changes to the way in which the HDI is calculated as well as to the choice of components to best reflect that soul. This is understandable given that the HDI has been around since 1990 and has attracted much attention and critique from researchers and practitioners from the very start [16–18], with some suggesting alternative weightings of components [19,20] or even new indices altogether [2]. However, and here there is a strong echo of the COVID-19 ranking debate in the UK, any change to the choice of indicators, the quality of the datasets, and decisions over the methodology of calculation would, of course, potentially influence country placement in the league tables irrespective of what a government does (or does not do). As with the COVID-19 indicators, there is a reliance on data collected by the countries themselves, and while politicians and others may rightly point to the variation in the way data on disease incidence and mortality are collected across Europe, the same point could equally, if not far more so, apply to the 100+ countries included in the HDI league table spanning the developed and developing worlds. Why should such league tables of the HDI not also be regarded as a fruitless exercise? Indeed, the uncertainties embedded within the HDI can add much fuzziness to the notion of an objective or "true" country ranking, just as they can for COVID-19 indicators, and this has been well reported in the literature since the origin of the HDI. For example, Høyland et al. (2013, pp. 11–12), following their analysis of the uncertainty in index rankings, have noted, in words that resonate with the concern of Professor Spiegelhalter regarding the use of country ranks for COVID-19 mortality, "Whenever the scores of international index rankings are taken literally, the indexes may be poor guides for policies as each link between indicators and scores is noisy and uncertain, but presented as certain." [21]
Here is the basic conundrum behind the question, "To rank or not to rank?" While the intention may be well-meaning (i.e., to allow countries to compare their performance with peers as a spur for them to learn and improve) there is a danger that the indicators and the ranking of countries based on them may be taken too literally and presented as some kind of objective truth. This results in a grey area that is open to exploitation by those who wish to praise the indicators and report them when they do well in the league table and denigrate them as "noisy and uncertain" when their ranking is not so good. The results may be exactly, and indeed almost predictably given the intense context provided by COVID-19, the sort of debate Professor Spiegelhalter became embroiled in with the politicians and government officials. Just where can the line be drawn between a need for comparing the performance of countries so lessons can be learned and the use of a tool such as a league table which apparently allows just that?
What is less reported in the literature are the impacts that factors such as different methodologies for collecting and reporting data between countries as well as compiling (aggregating) it all into indices have on country rank. In fairness, the latter point (aggregation methodology) has received some attention. Morse (2013) explored the HDI rankings of 167 countries and how they may have been affected by a change in the methodology for the income component of the HDI that took place in 1999 [22]. Results suggest that for the majority (65%) of countries in the dataset their "adjusted ranks" between two periods (1991-1998 and 1999-2009) were not influenced by the change in methodology for handling GDP/capita, while for 35% the change did influence their rank. However, there have been a number of methodological changes in the HDI, not only with regard to how the income component is handled and there has been no exploration to date of the impact coming from the totality of change (all components together) and how these impacts compare between countries. For example, are countries towards the top, middle, or lower end of the league table more vulnerable to such change or is the vulnerability equally distributed across the spectrum of HDI ranks?
The methodology by which the components of the HDI are aggregated has seen some significant changes since 1990, and these have revolved around the following:

1. Education component: A number of components have been used during the life of the HDI, including years of schooling, enrolment in full-time education, and adult literacy rate. The latter was used in the education component until the HDR of 2009, after which it was dropped and only years of schooling was employed.
2. Income component: There have been a number of changes here, some small and some large. Firstly, the UNDP has alternated between the use of logarithmic (base 10) and Atkinson transformations to transform the data and help avoid a dominance of this component in the index [22]. In 1990 and between the HDRs of 1999 and 2018, the UNDP used logarithms, while between the HDRs of 1991 and 1998 it used the Atkinson formula. Secondly, while most of the years (HDRs 1990 to 2009) used real GDP/capita (adjusted for purchasing power parity (PPP) and chained to a particular year), in more recent publications of the HDR (HDRs of 2010 onwards) there was a switch to using gross national income (GNI) per capita (also adjusted for PPP). GDP and GNI are similar metrics but not the same. At the same time as changing to GNI, the natural logarithm (base e) was used rather than logarithm base 10 for transforming GNI/capita.
3. Arithmetic and geometric means: It has always been assumed that the three components of the HDI have the same weight within the index, and until 2009 this was achieved by taking the arithmetic mean of the three HDI components. After that year (HDRs 2010 onwards), the geometric mean was used instead, ostensibly to avoid high values of one component compensating for low values of another [23]. However, this change has been claimed to have a negative impact on the HDI for developing countries [19,24].
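The effect of these aggregation choices can be sketched in a few lines of code. The goalposts used below for the income transform (100 and 75,000) are illustrative only, not the official UNDP values for every HDR year, and the component values are invented:

```python
from math import log, prod

def income_index(gni_pc, lo=100.0, hi=75000.0):
    # Log transform of income of the kind used from HDR 2010 onwards;
    # the goalposts lo/hi here are illustrative, not official values.
    return (log(gni_pc) - log(lo)) / (log(hi) - log(lo))

def combine(components, geometric=True):
    # Equal-weight combination of the three HDI component indices.
    if geometric:
        return prod(components) ** (1 / 3)
    return sum(components) / 3

c = [0.9, 0.9, 0.3]  # two strong component indices, one weak
print(round(combine(c, geometric=False), 3))  # arithmetic mean: 0.7
print(round(combine(c, geometric=True), 3))   # geometric mean: 0.624
```

The example illustrates the rationale given for the 2010 switch: under the geometric mean, the weak component pulls the index down further, so high values of one component compensate less for low values of another.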
As well as these more significant changes, there have been other smaller ones related to data years for the index components and the choice of minimum and maximum values for standardization. Following on from this, it is possible to establish two key years which saw significant changes in the calculation of the HDI:

• Change 1 (HDR 1999): Transformation of the GDP/capita component changed from the Atkinson formula to the use of logarithm (base 10).
• Change 2 (HDR 2010): Adult literacy rate was dropped from the education component, and changes were made to the way in which the income component was calculated and transformed. In addition, there was a shift from arithmetic to geometric mean for combining the three components of the HDI.
Which of these two changes had the larger impact on country rank? If there are impacts, are they greater for countries that occupy the top, middle, or lower ends of the table? Finally, why is there a continued fascination with using indices to rank countries, such that problems like that encountered in the UK with COVID-19 still occur? These three questions form the basis for the work reported here, although the emphasis will be placed on the first two. The third will be explored in the discussion.
The paper will first set out the major methodological changes in the HDI that have occurred during its lifetime, and based upon this a number of key "change" years will be identified. These change years will then be used to explore the first of the questions above. The paper will then move on to the second question and explore whether there is any evidence of the methodological changes in the HDI having differential impacts across countries occupying different parts of the table.

HDI Ranks
The HDI published in the HDR 1990 was not included in the analysis, largely because the index was arguably still experimental at that time and relatively few countries were included. Hence, the focus here is upon the HDI published in the HDRs between 1991 and 2018. Over that period there have been many geopolitical changes, and the reported rank of a country may change between HDRs simply because the number of countries changes. For example, Niger was ranked 187 in the HDR of 2014 but 188 in HDR 2015 and 189 in HDR 2018. In all three cases it was the bottom-ranked country, so the ranks of 187, 188, and 189 simply reflect changes in the number of countries included in the HDI league table. Thus, allowance needs to be made for changes in the number of countries by calculating an adjusted rank for each of them, and in the work reported here this has been based on a fixed scale of 1 (top-ranked) to 2 (bottom-ranked). The original rank of a country is that taken from the HDI table in the HDRs, and adjusted ranks were calculated as follows:

Adjusted rank = 1 + (original rank − 1)/(N − 1)

where N is the number of countries in the HDI table for that year. The result of adjustment in this way is a series of ranks spanning 1 to 2 irrespective of the number of countries in the HDI table. For example, using adjusted ranks for the HDI tables in HDR 2014, HDR 2015, and HDR 2018 means that Niger has a value of 2.0 for each of the years rather than having ranks varying between 187 and 189.
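The adjustment can be sketched as a small helper. This assumes the adjustment is the standard linear rescaling of original rank onto the fixed 1 (top) to 2 (bottom) scale, which reproduces the Niger example:

```python
def adjusted_rank(original_rank, n_countries):
    # Map an original HDI rank onto a fixed scale from 1 (top-ranked)
    # to 2 (bottom-ranked), assuming a linear rescaling.
    return 1 + (original_rank - 1) / (n_countries - 1)

# Niger was bottom-ranked in HDR 2014 (187 of 187), HDR 2015 (188 of 188)
# and HDR 2018 (189 of 189); all three map to the same adjusted rank.
print([adjusted_rank(r, n) for r, n in [(187, 187), (188, 188), (189, 189)]])
# -> [2.0, 2.0, 2.0]
```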
In order to allow for consistency of comparison of adjusted ranks over time, a group of 135 countries was selected that were territorially the same between 1991 and 2018 (Table 1). It should be noted that this sample of 135 countries represents a significant proportion of the total number of countries included in the HDI tables between 1991 (84% of all countries) and 2018 (71% of all countries). Thus, each country had a total of 25 adjusted ranks between 1991 and 2018.

Analysis of HDI Adjusted Rank
In order to test the impact of methodological changes of the HDI on the adjusted rank for countries, the adjusted ranks were analyzed using linear regression (least squares estimation). The model adopted was as follows:

Adjusted rank = intercept + β1 N + β2 Change 1 + β3 Change 2 + error

where N is the number of countries included in the HDI table for that year. While the original ranks were adjusted to a fixed scale of between 1 and 2, the number of countries could still have an impact, as countries may be pushed up or down by the inclusion of new countries below or above them.
Change 1 and Change 2 are dummy variables each having values of 0 or 1. Values of 0 were used to cover the years before the methodological change and values of 1 were used to cover the years after the methodological change. For Change 1, 0 was used for the period from 1991 to 1998 when Atkinson transformation was used for GDP/capita and 1 from 1999 to 2018 when logarithms were used. For Change 2, 0 was used for the period 1991 to 2009 and 1 for the period 2010 to 2018 when changes were made to the education and income components as well as a switch from arithmetic to geometric mean.
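A minimal sketch of this per-country regression follows; the adjusted-rank series, table sizes, and coefficients below are synthetic, invented for illustration rather than taken from the HDRs:

```python
import numpy as np

# Fit adjusted_rank = intercept + b1*N + b2*Change1 + b3*Change2 + error
# for one hypothetical country across the years 1991-2018.
years = np.arange(1991, 2019)
change1 = (years >= 1999).astype(float)  # Atkinson -> logarithm in HDR 1999
change2 = (years >= 2010).astype(float)  # HDR 2010 methodology overhaul
N = np.linspace(160, 189, years.size)    # hypothetical league table sizes

rng = np.random.default_rng(42)
adj_rank = (1.40 + 0.0005 * N + 0.05 * change1
            - 0.03 * change2 + rng.normal(0.0, 0.005, years.size))

X = np.column_stack([np.ones(years.size), N, change1, change2])
beta, *_ = np.linalg.lstsq(X, adj_rank, rcond=None)
print(beta)  # [intercept, coef(N), coef(Change 1), coef(Change 2)]
```

In practice, per-coefficient p-values (as reported in Table 2) would come from an OLS routine that reports standard errors, e.g. a statistical package, rather than from the bare least squares solve shown here.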

Results
The relationship between the standard deviation and mean of adjusted ranks for the 135 countries is shown in Figure 2. If standard deviation is used as a measure of "volatility" in adjusted rank, then this is clearly greater for countries towards the middle of the league table than at the top (high human development) or the bottom (low human development). Countries at the extremes tend to be relatively stable in terms of their rank, but for some middle-ranked countries the volatility is large, a conclusion that matches that of Cilingirturk and Kocak (2018) [8]. Given the volatility in adjusted rank shown in Figure 2, the question that needs to be asked is whether this is due to methodological changes or to countries simply doing "better" than their peers in terms of improving human development.
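As a toy illustration of the volatility measure behind Figure 2 (the adjusted-rank series below are invented, not real HDR data), the standard deviation of a mid-table series exceeds that of series near either extreme:

```python
import statistics

# Invented adjusted-rank series illustrating the Figure 2 pattern:
# mid-table series vary more than those near either end of the scale.
series = {
    "near top":    [1.02, 1.03, 1.02, 1.04, 1.03],
    "mid table":   [1.45, 1.58, 1.38, 1.62, 1.50],
    "near bottom": [1.97, 1.98, 1.97, 1.98, 1.97],
}
for name, ranks in series.items():
    print(name, round(statistics.mean(ranks), 3),
          round(statistics.stdev(ranks), 3))
```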
The results of the least squares regressions using dummy variables (0 or 1) for the periods of relative stasis either side of the key change years (1999 and 2010 for Change 1 and Change 2 respectively) are shown in Table 2. The 135 countries have been grouped into the following categories:

• No significance: none of the regression coefficients is statistically significant at P < 0.05.
• Single significance: one coefficient is statistically significant at P < 0.05 (either the number of countries (N), Change 1, or Change 2).
• Double significance: two coefficients are statistically significant at P < 0.05 (N with Change 1, N with Change 2, or Change 1 with Change 2).
• Triple significance: all three coefficients are statistically significant (N, Change 1, and Change 2).

Note: in Table 2, ns = not significant at 0.05 (P > 0.05); * P < 0.05; ** P < 0.01; *** P < 0.001.

Graphical examples of the influence of the Change 1 and Change 2 factors on HDI rank are shown for four countries (Fiji, Gabon, Botswana, and Malta) in Figure 3. Malta is one of the minority of countries with no influence from any of the factors, and indeed its adjusted HDI rank remained relatively stable from 1991 to 2018. Fiji had a significant influence from Change 1, while Gabon had a significant influence from Change 2. Botswana had significant influences from both Change 1 and Change 2. A summary of the number of countries (and percentage) within each of these four groups (none, single, double, and triple) is provided in Table 3. Only 24 countries (18%) had rankings that were unaffected by changes in the number of countries or by the Change 1 and Change 2 shifts in HDI methodology. The vast majority (82%) of countries had ranks that were influenced by at least one of the changes.
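The four-way grouping can be expressed as a small classifier; the p-value triples passed in below are hypothetical, standing in for the N, Change 1, and Change 2 coefficients of one country's regression:

```python
def significance_group(p_values, alpha=0.05):
    # Count how many of the three coefficients (N, Change 1, Change 2)
    # are statistically significant at the given alpha level.
    k = sum(p < alpha for p in p_values)
    return ("no", "single", "double", "triple")[k]

print(significance_group([0.30, 0.60, 0.45]))   # no
print(significance_group([0.01, 0.60, 0.45]))   # single
print(significance_group([0.01, 0.002, 0.03]))  # triple
```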
Indeed, most countries (60%) had ranks that were influenced by either Change 1, Change 2, or both (Change 1 with Change 2). The number of countries (N) seemed to have a relatively minor influence. Only 2% of countries had adjusted ranks influenced by N alone, and only 11% were influenced by the combinations of (N with Change 1) and (N with Change 2). Only 12 countries (9%) were influenced by all three (N, Change 1, and Change 2). Thus, across the 135 countries, it seems that changes in the number of countries had relatively little influence on adjusted rank relative to Change 1 and Change 2. Figure 4 shows the distribution of countries with statistically significant influences from combinations of N, Change 1, and Change 2, as well as those with no statistically significant influence. In each graph, the vertical axis is the mean adjusted rank, and the countries have been placed in order from lowest mean rank (best human development) at the left-hand side to highest mean rank (lowest human development) at the right-hand side. Each point is a country. Figure 4a,b shows the distributions for those countries having a statistically significant influence on adjusted rank coming from Change 1 and Change 2 respectively, while Figure 4c is a plot of those countries having a significant influence from both Change 1 and Change 2. Given that the number of countries influenced by N, as well as by N in combination with Change 1 and Change 2, is relatively small, these have been combined into a single graph (Figure 4d). Figure 4e shows those countries with a significant influence on adjusted rank coming from N, Change 1, and Change 2, while Figure 4f shows the distribution of countries having no statistical influence from any of the independent variables. There is no obvious "bunching" of points within any of these plots, which suggests that the influences of the independent variables on adjusted rank are not especially concentrated in any place along the distribution.
In other words, the spread of points suggests that countries with low or high adjusted ranks are just as likely to be influenced by methodological changes as those in the middle of the distribution.

In terms of the impact of Change 1 and Change 2 on adjusted rank, this may be gleaned from the values of the respective regression coefficients shown in Table 2. To allow for an easier comparison, the statistically significant regression coefficients for Change 1 and Change 2 are plotted in Figure 5. For some countries the coefficients are negative (adjusted rank is reduced by the change, signifying an apparent increase in human development), while for others they are positive (adjusted rank is increased by the change, signifying a decline in human development), and there is large variation between countries in terms of the size of the coefficient. Overall, there is some suggestion here that the impact on adjusted rank is greater for Change 1 than for Change 2, but the difference is not all that marked.

Discussion
While it has been well-reported that methodological shifts in the HDI do have an influence on country rankings within the reported league tables [22], and there have been studies which looked at uncertainty surrounding country rankings of the HDI and other indices [21], this is the first study of its type to explore the influence on country rank of some of the major methodological changes between 1991 and 2018 taken in their totality. While the analysis was a simple one, it does point to some intriguing conclusions that would certainly warrant more detailed investigation. For some countries (18% of the panel used in the analysis) the methodological changes had no significant impact on rank, while for the vast majority (82%) there is an influence from one of the changes or a combination of them. The Morse (2013) study included 167 countries in an analysis of the impact on rank of a change in the methodology of the income component in 1999, equivalent to Change 1 in this study, and noted that the ranks of 35% of countries were influenced by the change [22]. However, in this study, with a smaller sample (135 countries) and a longer period (1991 to 2018), Change 1 was found to significantly influence the ranks of just 19% of countries. Change 2 had a significant influence on the adjusted ranks of 14% of countries, but 27% were influenced by both Change 1 and Change 2. The number of countries included in the published HDI league table was of relatively minor importance in terms of influencing adjusted rank; the major factors were Change 1 and Change 2. There is no suggestion from the results that "sensitivity" to methodological change is concentrated at any part of the rank spectrum; if there are influences, then these would seem to be just as likely for those countries ranked high, middle, or low in the HDI league table.
Once the ranks for the 135 countries have been adjusted to range from 1 to 2, the influence of the number of countries in the published league table also becomes relatively minor. Indeed, in terms of the increased volatility in adjusted rank seen towards the middle of Figure 2 (and noted by [8]), there is no suggestion from these results that methodological changes per se are the primary cause, although they do make a contribution for the majority (82%) of the countries. It is more likely that shifts up and down the league table for those middle-ranked countries are due to fluctuations in performance across the HDI components, at least in terms of how this is reflected in the data available to the UNDP for constructing the index.

However, while the effects of methodological changes in the HDI appear to be evenly distributed across the group of 135 countries included in this study, it is important to note that for any one country the changes can be important, even if the result is a change of just a few places up or down the HDI league table. There are many illustrations of this that could be provided and here is just one, referring in this case to India: "The robust economic growth notwithstanding, India has garnered a lowly 119th rank in the United Nations' Human Development Index due to poor social infrastructure, mainly in areas of education and healthcare." The HDI has had significant longevity, nearly 30 years at the time of writing, and is widely reported and used by aid agencies, amongst others, to help with the allocation of scarce resources. The HDI often appears within suites of indices designed to assess sustainability, and some have even suggested modifications to the HDI that would make it more of an index of sustainability [25][26][27]. It is also laudable to see flexibility in terms of the construction of the index and an openness to address issues and look for ways to make improvements. The creators of the HDI have always been very open and transparent about the changes they have made and why they made them. They have also been aware of the impact arising from methodological changes in the HDI and have published versions of the HDI in the HDRs based on a consistent methodology to better facilitate an exploration of trends over time. Despite all of this, it still needs to be noted that the current HDI league table for each year of publication is the very first table presented in the collection of tables at the end of the HDR and is thus inevitably a highlight. It is thus reasonable to assume, despite the inclusion of other "adjusted" HDI tables, that the current HDI league table is the one which will "jump out" to the lay reader.
Given the uncertainties inherent in the HDI and the ranking of countries based upon the index [21,28], is it still, for all its faults, a valuable tool to help improve human development? Does it matter if some of the shifts in rank are caused by changes in HDI methodology as long as the index has helped improve people's lives? Perhaps surprisingly, there have been few, if any, empirical and independent (from UNDP) studies that have addressed the impact of the HDI, and this is certainly a gap that needs to be addressed with some urgency. After all, the league table ranking style of HDI presentation was chosen by the UNDP from the very beginning and the intention was a clear one: to provide a vehicle by which countries, and indeed other "consumers" of the rankings such as international aid agencies and the press, could compare themselves with others. The emphasis on ranking was no accident; it was a policy choice and deserves to be assessed dispassionately. It is well-known that the HDI rankings, even with their uncertainties, are touted by politicians in those countries that do well [21,28] and are used by international development agencies to make decisions over allocation of aid. League tables do continue to fascinate, and the COVID-19 rankings noted at the start of the paper attracted a great deal of press and public attention in the UK. However, trying to introduce nuance by claiming that country rankings are not valid while at the same time claiming that country comparisons are useful, thereby allowing politicians and those who advise them to both applaud and dismiss ranking when it suits, is hardly a ringing endorsement of the league table approach. Nuance may be well-meaning and indeed appropriate, given the uncertainty surrounding index rankings and the impacts of methodological change when an index has been around for a long time, but there is always the suspicion that it is the favored retreat of those who do badly in the rankings.
In the case of the UK COVID-19 international data presentation shown as Figure 1 it may no doubt be argued that by the time the government stopped presenting the graph (10 May 2020 onwards) in its daily briefings, the country was so far above any other European country in terms of mortality that it was no longer necessary to keep repeating the point. Hence, some may say that there is a degree of "ranking fatigue"; once a country becomes rooted in one spot at the bottom or top in the league table, it may appear pointless to keep repeating it. If this is so, then surely one could make a similar point regarding the HDI league table. After all, since the first version published in the HDR of 1990, the top place has been dominated by a handful of countries (mostly just two-Canada and Norway) while the bottom of the table has been occupied entirely by African countries (mostly Sierra Leone and Niger).
Nonetheless, for all the faults inherent in ranking countries using what are relatively simple indices, open to issues of varying data collection methods (and hence data quality), let alone the changes to index methodology noted in this study, there may well be significant benefits to be gained. The criticisms of indicator "technocrats" would be insignificant if the indices and the rankings based on them do deliver benefits for communities, and, after all, that is precisely what the HDI and the league table style in which it is presented are meant to achieve. It is always important to remember that behind all the graphs and tables in this paper are people. Figure 1 is an embodiment first and foremost of tragedy, not competition, but so too are the rankings based on the HDI, which embody, for many in the developing world, poor life expectancy and poor access to education. It is sometimes all too easy with indicators to forget the human stories which exist at their heart. Indeed, and looking at it from another angle, there may also be some fundamental questions that need to be asked about how indices were developed and whether (potentially unconscious) cultural biases could favor the rankings of some countries over others. This point has been made before, for example with the Environmental Sustainability Index [29], but to date has not received much attention from researchers.
Nonetheless, while the HDRs are replete with case study stories of success and failure there is little, if any, attempt to link these to the HDI, which is perhaps surprising given the emphasis placed on the index. The HDI was meant to be transformative by refocusing attention of policy makers away from an excessive focus on GDP, but has it achieved that [30]? Assessing the impact of a single index is certainly complex and challenging and could take many inter-related forms with many processes at play, but after nearly 30 years of the HDI and all of the effort that has gone into maintaining and promoting the index, now is the time for a better understanding of the changes, good and bad, that it has helped bring about.

Conclusions
Index-based league table rankings have been popular for many years and are said to help generate a sense of comparison with peers which facilitates learning and pressure to improve performance. The HDI-based league tables published by the UNDP since 1990 provide an example. However, methodological changes in the HDI over 28 years (1991 to 2018) appear to have had a significant impact on the ranking of the majority (82%) of countries in the group of 135 countries, and the impacts are not focused on any part of the distribution of country ranks. Countries at the top, middle, or bottom of the HDI league table are just as likely to be influenced by changes in HDI methodology. Does it matter that there is no precise, robust, and consistent measure of human development upon which the ranks are based? Maybe ranking, for all of its faults, does deliver benefits. This is a question that needs to be addressed, even if assessing the impact of a single index is challenging.
Funding: This research received no external funding.