Evaluation of Existing Indexes of Sustainable Well-Being and Propositions for Improvement

The relationship between sustainability and well-being is inconclusive in the literature, with some studies showing consonance while others show dissonance. Beyond differences of scale (micro or macro) and of methods, part of this conflict in narratives is due to differences in measurement. In this paper I evaluate the quality of existing indexes linking both concepts at a macro level (Happy Planet Index (first generation and second generation), Sustainable Development Goals Index, Human Sustainable Development Index, Sustainable Development Index, Gaucher's index). Recognizing the limits of all of them and acknowledging that the current landscape of measures is over-oriented towards cognitive measures on the well-being side and towards the ecological footprint on the environmental side, I propose some alternatives to complement the current measures and I discuss possible implications.


Sustainability and Well-Being, a Janus-Headed Story
It is generally accepted that protecting the environment and promoting the well-being of populations are two desirable political goals. The relationship between sustainability and well-being has been questioned since the late 1960s and in particular since the Meadows report [1]. Most models include some overlap between the two concepts, suggesting that they go hand in hand to some degree. Although several empirical findings point in this direction, others suggest that the overlap between them has been exaggerated or is even questionable. Some studies go as far as showing that they move in opposite directions, thereby suggesting that sustainability and well-being could be locked in a zero-sum game. The veil of normativity that surrounds both concepts does not help to distinguish their inter-relations, i.e., their possible consonances, dissonances, or both. Given the inconclusiveness of the debate on sustainability and well-being, which I present below, the goal of the present paper is to show that part of this inconclusiveness is due to measurement and, based on the existing pitfalls, to offer possible alternatives.

Studies That Highlight Consonances between Sustainability and Well-Being
The romance between sustainability and well-being has ancient roots, going back to precolonial wisdom [2] and, more recently, running from transcendentalists such as Ralph Waldo Emerson and Henry David Thoreau to the ecologists of the 1960s and most international institutions from the 1970s on. Because sustainability and well-being are both perceived as desirable, many observers wish them to be consonant. According to the United Nations Sustainable Development Goals and the European Commission (in particular the Joint Research Centre), human health, natural protection and natural resources are end points called "areas of protection." Most human actions can be viewed through the lens of planetary well-being and human well-being, and both are assumed to be linked. The field of statistics has also been a ground of confrontation and negotiation between these two concepts. The Conference of European Statisticians' recommendations on measuring sustainable development distinguished three dimensions of sustainable development: (1) the human well-being of the present generation in one particular country, (2) the well-being of future generations and (3) the well-being of people living in other countries. This has also been highlighted more recently in the Stiglitz-Sen-Fitoussi report: "at a minimum, in order to measure sustainability, what we need are indicators that inform us about the change in the quantities of the different factors that matter for future well-being" [2]. The early intuitions and the conceptualizations that followed have been backed up by numerous studies showing cases and situations in which human well-being and planetary well-being positively influence each other. This was shown in particular at the micro level. The rationale runs in both directions. On the one hand, pollution makes people less well and less happy [3,4]. On the other hand, happy people tend to adopt, ceteris paribus, more virtuous behaviors such as recycling, cycling or engaging in pro-environmental actions [5][6][7][8][9][10]. There have also been some studies showing evidence of this relation at a macro level. For instance, Sachs et al. show that the Sustainable Development Index is highly correlated with life evaluation as measured by the Cantril ladder [11]. The top five of the Human Sustainable Development Index (which is the Human Development Index complemented by per capita carbon emissions), namely Norway, New Zealand, Sweden, Switzerland and France, overlaps strongly with happiness rankings: the top four appear in almost every top 10 of happiness rankings, regardless of the indicator used.

Studies That Highlight Dissonances between Sustainability and Well-Being
Studies showing contradictions between sustainability and well-being are fairly recent. They do not stem from conceptual considerations but rather from empirical observations at the macro and the micro level. Some authors document cases in which there are tradeoffs between sustainability and well-being, such as traveling to meet friends (good for well-being but not for the environment) or, at the other extreme, living in a small apartment (good for the environment but not for well-being) [12]. The tensions between these two components have been highlighted in particular in the case of mobility. Not owning a car, or not driving it for environmental reasons, might be a source of personal discomfort with little observable gain in terms of environmental impact [13]. Conversely, having a car has been positively associated with well-being [14]. A positive relation between car wealth and life satisfaction has been observed among European seniors [15]. At a macro level, some indexes indicate that sustainability and well-being do not necessarily go hand in hand, and that this relation depends on the degree of economic development. In particular, the positive correlation between GDP and environmental impact has been widely documented, with the richest countries impacting the environment the most [16,17]. For instance, in the Happy Planet Index (HPI), the top countries in terms of sustainability (Costa Rica, Mexico, Colombia, Vietnam) are not exactly the happiest and vice versa. Although one can find many studies on the links between sustainability and well-being [18][19][20][21][22][23][24][25][26][27], only a few indexes observe this relation directly. Most observations of the relation between sustainability and well-being are made indirectly, via (poor) proxies of subjective well-being (SWB) such as Gross Domestic Product (GDP). This echoes some studies that highlight "the degree to which economic and environmental objectives are in conflict" [28].
The literature is thus still inconclusive regarding the interplay between sustainability and well-being. An issue in most of these studies is that environmental behaviors are self-reported. Apart from a few exceptions, the actual environmental impacts are barely considered. Additionally, studies at a micro level tend to control for many factors (income, education, place of residence, household structure, etc.), which blurs the picture of actual impacts. Finally, the indicators of well-being and sustainability are either proxies or questionable constructs. The lack of a proper indicator to measure the levels of well-being and sustainability jointly is particularly acute at the macro level. There is a need for what the International Union for Conservation of Nature (IUCN) calls "a metric capable of measuring the production of human well-being . . . per unit of extraction from or imposition upon nature." However, there are many different measures of varying quality. Assessing them and proposing some directions for improvement is the goal of this paper.

A Matter of Measurement
Among the studies looking into sustainability and well-being, there is a wide array of indexes, individual or composite, covering sustainability, well-being or both. This diversity of measures does not help to extract consistent findings. Below I review the main indexes linking sustainability and well-being as well as the most used measures for each dimension (ecological footprint, carbon footprint and carbon emissions for sustainability; happiness, life satisfaction and life evaluation for well-being). Many other measures could be considered (specific impacts such as plastic production or ozone layer depletion for sustainability; eudaimonic well-being scales or hedonic scales of affect for well-being), but I only look into the most used ones for each dimension.

Existing Sustainable Well-Being Indexes
Only a few indexes refer explicitly to both human well-being and sustainability. One can cite among others the Happy Planet Index (HPI), the Sustainable Development Goals (SDG) Index, the Human Sustainable Development Index (HSDI), the Sustainable Development Index (SDI) and Gaucher's index of a happy, long and sustainable life.
The Happy Planet Index (HPI) was created by the New Economics Foundation in 2006. It was originally defined as the ratio between happy life years (the product of average life satisfaction and longevity) and ecological footprint [29]. It was later complemented by an indicator of inequality of outcomes, and life satisfaction was replaced by life evaluation as measured by the Cantril ladder.
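As a minimal sketch of the first-generation definition, the ratio of happy life years to ecological footprint can be written in a few lines of Python. The function names and the input values are hypothetical and chosen only for illustration; any rescaling or adjustment applied in the published index [29] is not reproduced here.

```python
def happy_life_years(life_satisfaction: float, life_expectancy: float) -> float:
    """Product of average life satisfaction and longevity, as described in the text."""
    return life_satisfaction * life_expectancy


def hpi_first_generation(
    life_satisfaction: float,
    life_expectancy: float,
    ecological_footprint: float,
) -> float:
    """Ratio of happy life years to ecological footprint (illustrative only)."""
    if ecological_footprint <= 0:
        raise ValueError("ecological footprint must be positive")
    return happy_life_years(life_satisfaction, life_expectancy) / ecological_footprint


# Hypothetical country: satisfaction 7.0 (0-10 scale), life expectancy 80 years,
# ecological footprint 4.0 global hectares per capita.
print(hpi_first_generation(7.0, 80.0, 4.0))
```

Written this way, the index has two well-being terms on the numerator and a single environmental term on the denominator.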
The Sustainable Development Goals (SDG) Index is a composite indicator created by Sachs and colleagues. It is built from indicators covering the 17 SDGs. The 2021 edition is composed of "91 global indicators as well as 30 additional indicators for OECD countries, due to better data coverage" [30]. It draws on a "mix of official and non-official data sources. Most of the data (around two thirds) is developed by international organizations [ . . . ] which have extensive and rigorous data validation processes. Other less traditional statistical sources used (accounting for around a third of our data) include household surveys (Gallup World Poll), data from civil society organizations and networks (among others, Oxfam, Tax Justice Network, World Justice Project, Reporters without Borders) and peer-reviewed journals (to track international spillovers, for example)" (p. 68, [30]).
The Human Sustainable Development Index (HSDI) is an augmented version of the United Nations Human Development Index (HDI). In addition to the traditional indicators used in the construction of the HDI (literacy, life expectancy and GDP), the HSDI also includes ecological degradation, as measured by carbon emissions. It was created to "put a stop to the celebration of gas guzzling developed nations" and to provide a "fast and frugal" way of capturing human sustainable development [31].
The Sustainable Development Index (SDI) is a ratio between a development index (the cubic root of an income index, education index and longevity index) and an ecological impact index encompassing material footprint and emissions [32].
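The structure of the SDI can be illustrated with a short Python sketch. Reading "the cubic root of an income index, education index and longevity index" as the cube root of their product (a geometric mean) is an assumption made here, as are the function names and the example values; the construction of the ecological impact index from material footprint and emissions [32] is not reproduced and is taken as a given input.

```python
def development_index(
    income_index: float, education_index: float, longevity_index: float
) -> float:
    """Cube root of the product of the three sub-indexes (assumed geometric mean)."""
    for value in (income_index, education_index, longevity_index):
        if value < 0:
            raise ValueError("sub-indexes must be non-negative")
    return (income_index * education_index * longevity_index) ** (1.0 / 3.0)


def sustainable_development_index(
    income_index: float,
    education_index: float,
    longevity_index: float,
    ecological_impact_index: float,
) -> float:
    """Development index divided by an ecological impact index (illustrative only)."""
    if ecological_impact_index <= 0:
        raise ValueError("ecological impact index must be positive")
    development = development_index(income_index, education_index, longevity_index)
    return development / ecological_impact_index


# Hypothetical sub-index values between 0 and 1, and a hypothetical impact index.
print(sustainable_development_index(0.9, 0.8, 0.7, 1.5))
```

Taking the cube root keeps the numerator on the same scale as a single sub-index, however many components enter it.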
The index of a happy, long and sustainable life of Gaucher and colleagues is a ratio between happy life years (as measured by longevity and life evaluation) and a "sustainability ratio", which is the ratio between the biocapacity of a country and the ecological footprint of the same country [33].
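To make the "sustainability ratio" concrete, the following Python sketch computes the ratio of biocapacity to ecological footprint for two hypothetical countries. The function name and all values are illustrative assumptions; how the ratio is then combined with happy life years follows [33] and is not reproduced here.

```python
def sustainability_ratio(biocapacity: float, ecological_footprint: float) -> float:
    """Biocapacity divided by ecological footprint, both in the same per capita unit."""
    if ecological_footprint <= 0:
        raise ValueError("ecological footprint must be positive")
    return biocapacity / ecological_footprint


# Two hypothetical countries with the same footprint per capita (3.0)
# but very different biocapacity per capita.
large_resource_rich = sustainability_ratio(biocapacity=9.0, ecological_footprint=3.0)
small_dense = sustainability_ratio(biocapacity=1.5, ecological_footprint=3.0)

print(large_resource_rich, small_dense)
```

A ratio above 1 means that biocapacity exceeds the footprint, and a ratio below 1 means the opposite. The two hypothetical countries have identical footprints, so the difference between their ratios comes from biocapacity alone.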

Existing Measures of Well-Being
As far as well-being is concerned, there exists a wide array of possibilities. Well-being is covered by the field of quality of life (QOL) in the social sciences and by the almost distinct field of health-related quality of life (HRQOL). In the social sciences, some have distinguished eudaimonic, hedonic and evaluative measures. Eudaimonic measures are usually multidimensional whereas evaluative ones are usually unidimensional, hedonic ones being sometimes unidimensional and sometimes multidimensional. Probably because of their simplicity and availability, the most used measures have been unidimensional ones based on single questions, ever since the General Social Survey started to use evaluative questions in 1972. Three measures are used most heavily: happiness, life satisfaction and life evaluation, as shown in the World Database of Happiness, the reference database in the field [34]. The happiness measure is probably the closest to what people mean by "happiness" on a daily basis. Usually measured by the question "how happy are you with your life in general?", it aims at capturing happiness over the long term and not as an immediate emotion. Life satisfaction is probably the most used proxy in well-being studies. It comprises both a cognitive and an affective dimension [35,36]. Usually measured by asking "how satisfied are you with your life as a whole?", it is quite close to the happiness question but with a more cognitive tone. Finally, the Cantril ladder is the most cognitive of all these proxies. After thinking of the worst and the best possible life and assigning them 0 and 10 respectively, respondents are asked to place their current life between these two extremes. The fact that these three measures are the most used does not mean that they are the only ones. There exist many more built around hedonic and eudaimonic concepts, as well as topics related to but slightly different from well-being (locus of control, hope, self-rated health). Each of these concepts and measures is prone to some form of bias, but to different degrees [37].

Existing Measures of Sustainability
There are multiple ways of assessing environmental impacts, some coming from the economics literature, some from the natural sciences, some multidimensional and others unidimensional (see [38][39][40][41] for a review). In this ocean of measures, one can identify three that are widely used for environmental impacts at an international level: the ecological footprint (EF), the carbon footprint (CF) and carbon emissions. The ecological footprint is a measure of "the biologically productive land and water area an individual, population or activity requires to produce all the resources it consumes, to accommodate its occupied urban infrastructure, and to absorb the waste it generates, using prevailing technology and resource management practices". The ecological footprint has multiple components, including cropland, forest land, fishing grounds, grazing land, built-up land and carbon footprint. The carbon footprint represents the largest share of the ecological footprint, arguably 50-60% of it [42,43]. The carbon footprint represents the emissions of CO2 and other gases (methane, nitrous oxides) of a given individual, territory or country, as well as the ability to absorb carbon emissions. Finally, carbon emissions are the same as the carbon footprint but do not take into account the biocapacity of the countries, i.e., their ability to absorb carbon emissions. However, it is more sensible to use the ecological footprint itself rather than a ratio between the footprint and the capacity, which could have undesirable effects, such as excusing large countries for polluting a lot or extracting a lot of resources, whereas one could argue that these resources are helpful worldwide [44]. Some of these measures have been analyzed separately or sometimes together [11,12,45], but (1) footprint indicators were missing and (2) sustainability or well-being were assessed but not sustainable well-being. An extensive assessment of the most used measures is thus still lacking, which I intend to provide in the next section.

Methods: Assessment of Existing Indexes
To evaluate the quality of existing indexes, we need to pass the literature through the filter of quality criteria. In the first part I define the most used criteria for assessing quality. In the second part, I use the current state of the debate on existing measures to evaluate current indexes. Because all of these indexes are based on a ratio of well-being over environmental impacts, I evaluate the numerator (well-being) and the denominator (environmental impacts) concurrently.

Criteria of Evaluation
There are multiple ways of assessing the quality of measures. Some criteria are general and apply to any field; others are specific to a given field. The two main criteria are usually validity and reliability, roughly defined as "measuring the right things" and "measuring the things right". These classic criteria are complemented by other, newer criteria, some of which can seem more subjective than others (e.g., attractiveness in the next example). Expanding a bit further, statisticians focus on other criteria: (1) relevance (attractiveness + responsiveness), (2) accuracy and reliability, (3) timeliness and continuity, (4) comparability (level of harmonization between Member States), (5) clarity (ease of understanding, communicability) [46]. Relevance and validity are sometimes grouped together and sometimes kept separate. Booysen suggests looking at seven dimensions to assess quality: (1) the content, (2) the technique and the method, (3) the comparative application, (4) the focus, (5) the clarity and simplicity, (6) the availability and (7) the flexibility [47]. When looking at the most advanced works on metrics of well-being [48], one realizes that things are not fundamentally different in that field. One could say the same about sustainability indicators, but more principles or criteria seem to have emerged from that field. Among these, one can cite in particular the Bellagio principles and the work of Guy and Kibert [49][50][51]. There are ten Bellagio principles for sustainable development assessment (Hardi and Zdan, 1997) [50]: (1) guiding vision and goals, (2) holistic perspective, (3) essential elements, (4) adequate scope, (5) practical focus, (6) openness, (7) effective communication, (8) broad participation, (9) ongoing assessment, (10) institutional capacity.
Guy and Kibert (1998) detail 13 related desirable characteristics for sustainability measures: (1) community involvement, (2) linkage, (3) valid, (4) available and timely, (5) stable and reliable, (6) understandable, (7) responsive, (8) policy relevance, (9) representative, (10) flexible, (11) proactive, (12) long range, (13) act locally, think globally [51]. I will first use validity and reliability to evaluate the quality of existing measures (see Table 1) and then use the thirteen criteria of Guy and Kibert, which represent the most complete evaluation grid to date, to assess the proposed measures (see Section 7). This set of criteria seems to offer a complete overview of the characteristics and quality of measures. On top of these general characteristics, there are also specific criteria from the well-being literature, such as the emphasis that should be placed on output-based criteria [2], although some prefer a combination of output- and input-based indicators. Nonetheless, there is a consensus that measuring well-being only with objective indicators, without asking individuals about their subjective perception, is not desirable. Even when objective indicators are mixed with subjective indicators, they should be part of a well-explained theoretical framework. As for environmental impacts, as we saw before, it is important to look at footprint and capacity jointly and not to make both invisible by creating a ratio. The summary of the assessment of the measures is presented in Table 1 below.

Evaluation of Current Measures Based on the Literature
As we shall see, the maturity of the debate varies across measures. Some indexes directly use well-debated measures, which makes it relatively simple to evaluate them based on the current state of the debates. For others (the environmental part of the SDG Index, the SDI), little to no direct evaluation exists, so we need to use state-of-the-art criteria to evaluate them. I detail these below.
As far as well-being measures are concerned, the measures used are mostly considered valid and reliable [52][53][54]. There are, however, slight variations among these measures. There are also differences over time, since the socioeconomic and ecological context in which these measures are taken changes. Any assessment of validity and reliability is therefore a snapshot at a given point in time and can never be considered permanent. Happiness questions seem to be closer to what people mean by happiness in everyday life and are therefore slightly more valid than life satisfaction and, even more so, than life evaluation, which is a very cognitive way of measuring well-being. In that sense, the more cognitive the measures, the more prone they are to normative issues and cross-cultural differences, although the extent to which they are normative is still debated [55][56][57]. The validity of life evaluation and, to a lesser extent, of life satisfaction is still debated. The situation is different on the reliability side: affective measures vary more, which leads to lower test-retest reliability coefficients, whereas the coefficients are considered sufficient for more cognitive measures such as life satisfaction and life evaluation [58].
As far as environmental measures are concerned, the picture is also mixed. No measure seems to score well on both validity and reliability. It is quite clear that indicators focused only on carbon emissions capture only a part (although a fundamental one) of environmental impacts. Some argue that global warming is currently the most important environmental issue, but others argue that other impacts such as the loss of biodiversity are just as important [59]. The ecological footprint is in that sense a wider and arguably more faithful representation of the scope of impacts. However, there are serious doubts about its reliability, i.e., its ability to properly measure what it aims at measuring. These doubts have been summarized as a "lack of congruence between the original narrative of the ecological footprint and the protocol presently proposed for its quantification", "the consequent incongruence of the quantitative indications provided by the EF index"; and "the flaws in the pre-analytical assumptions" [60]. The ecological footprint also suffers from a certain number of limitations [22]. A first problem noted by the authors is that the EF is largely dominated by energy and in particular carbon sequestration (more than 50%) in most middle- and high-income countries. Another worry is that administrative boundaries (i.e., mostly national boundaries) are irrelevant. Furthermore, the ecological footprint fails to take into account land degradation, which has important consequences for food production. In an explicitly titled article, "Measuring sustainability: Why the ecological footprint is bad economics and bad environmental science", Fiala concludes: "I believe it is more useful to look directly at sustainability measures, such as land degradation and CO2 aggregations" [60]. Other concerns relate to the success of the ecological footprint and in particular to the concept of overshoot day, which is the calendar representation of the ratio between degradation and biocapacity. Some authors have criticized the "media-friendly narrative" around the ecological footprint and in particular the related Earth Overshoot Day [61]. According to these authors, there is an intrinsic contradiction between the complexity of sustainability and the simplicity of the message to deliver. In the words of the Stiglitz report, "measuring sustainability with a single index number would confront us with severe normative questions" [2]. Going further, it is commonly acknowledged that "[in relation to aggregate measures] normative implications are seldom made explicit or justified" [2], a criticism seen as appropriate for the ecological footprint [61].
Therefore, carbon-only measures seem to be too restrictive whereas the ecological footprint has reliability issues. The carbon footprint has both validity and reliability issues. In particular, there have been concerns about ocean sequestration fraction rates [62], and "considerable uncertainties remain as to the distribution of anthropogenic CO2 in the ocean, its rate of uptake over the industrial era, and the relative roles of the ocean and terrestrial biosphere in anthropogenic CO2 sequestration" [62].
There are other doubts about the reliability of the carbon footprint and the ecological footprint. In the Icelandic case, it seems that "activity data from international databanks rarely match locally sourced data. The change in CF under the data scenarios created range from a 42% decrease in CF to a 147% increase. Relevant caveats regarding estimations in CF calculations are found lacking in GFN's dissemination of results" [63]. Carbon emissions are less affected by doubts about absorption and can therefore be considered more reliable. Other weaknesses of the ecological footprint include the "lack of transparency (e.g., calculations are not always reproducible)".
Finally, the indexes aiming to measure well-being and sustainability jointly seem to combine the problems of each measure and add some new issues. For each index, I therefore took the assessment of its well-being part and of its environmental part, when available. Not all the indexes use off-the-shelf measures of well-being and sustainability; for those that do not, other criteria are needed to assess their validity and reliability. When different measures are combined, I lower the evaluation.
As far as the validity of well-being measures is concerned, I look for output-based indicators in order to measure what really matters to people and not external constructs [56,57]. Following Veenhoven, it is often preferable to look at quality of life as evaluated by the actors themselves and to use output indicators rather than input indicators [56]. Furthermore, there is a fairly high degree of collinearity, as the two other dimensions used by the HSDI (literacy rate, longevity) are also related to its third component, GDP [64][65][66][67][68][69][70][71][72]. The indexes should be balanced so as not to measure growth indirectly, for instance. This means the indexes should have as many dimensions on the numerator as on the denominator [67].
For the SDG index and Gaucher's index, I use the assessment of their respective well-being measure, as the designers directly use life evaluation as such. For the HPI, as it is a mixture of output- and input-based indicators, I take the assessment of its well-being dimension and lower the validity to take the mixture into account. As the SDI and the HSDI are based on input indicators only, in line with the criteria in the literature I give them the lowest possible assessment of well-being.
As for the environmental assessment, for the HPI, SDI and HSDI, I use the direct assessment of the ecological dimensions that they use. For Gaucher's index, I use the assessment of the chosen measure minus a penalty for mistakenly taking biocapacities into account. Indeed, the authors use the biocapacity of countries, which gives large countries control over a large stock of resources that should be shared worldwide. In that sense, I partly join the criticism that administrative boundaries are irrelevant [44]. This is particularly true for biocapacity, as the whole world depends on the biocapacity of some countries. The SDG index is a construction with so many subdimensions that one loses sight of the overall impact. The SDG index seems even more problematic from the perspective of environmental validity. Still stuck in the contradiction between "sustainable" and "development", the SDG index seems to be a better indicator of economic prosperity than of environmental soundness (as we shall see below). This is not surprising since the term sustainable development was an expansionist attempt to keep things as they were [67].
The assessment depends on the degree of maturity of the debate on the index itself or on its dimensions of well-being and environmental impacts. When the debate is advanced, as for the three types of well-being measures and the three types of environmental impact measures, the number of stars represents the overall acceptability of the measure in the literature. In that case, three stars (high) mean that, within the current debate on a given measure, the validity or the reliability is overall judged sufficient; two stars mean that the debate is unsettled, with comparable forces arguing that the measure is good and that it is bad. Finally, one star means that there is a consensus that the validity or the reliability is poor. When the measures are created ad hoc for certain indexes (well-being part of SDI and HSDI or Gaucher, environmental part of SDG), there is little or no evaluation in the literature, so I use state-of-the-art criteria to assess them.

Summary of Validity and Reliability of Current Measures
Results for validity and reliability, based on the literature as well as on my own evaluations, are condensed in Table 1 below. Each criterion ranges from "insufficient" to "high quality".
The first-generation Happy Planet Index combines happy life years and ecological footprint, so its validity and reliability are those of each component, as displayed at the bottom. The second-generation Happy Planet Index is a ratio between life evaluation combined with inequality of outcomes and the ecological footprint. The quality of its environmental part is that of the ecological footprint. As for well-being, it is unclear why life evaluation is mixed with inequality of outcomes and not, say, wealth per capita or safety. The index does not provide a theoretical framework explaining that choice. The SDG Index is a ratio between life evaluation and a complex combination of the SDGs (minus target 3). The atomization of the different targets makes it unreadable and untraceable. The HSDI combines carbon emissions with a well-being index (literacy, life expectancy and GDP) that does not include any subjective assessment. The SDI combines ecological footprint with a well-being index (the cubic root of an income index, education index and longevity index) that does not include any subjective assessment. Gaucher's index combines happy life years with a "sustainability ratio" between the ecological footprint and the biocapacity of the country. As we saw, this no longer measures environmental impacts but rather the degree to which countries underexploit or overexploit their resources.

Data
To build the indexes, I use the following databases and indicators. I first review the databases of well-being before reviewing those of sustainability.
To measure happiness, respondents are asked to position themselves on the question "taking all things together, would you say you are?", with the response options very unhappy, quite unhappy, quite happy or very happy. For the ecological footprint, I use the latest version available, the 2021 edition (2017 data). The data is available on the website of the Data Footprint Network in partnership with Yale University and covers 183 countries. Carbon emissions are drawn from the Emissions Database for Global Atmospheric Research (EDGAR). The latest data is from 2019 and is available for 208 countries and territories. Emissions are calculated annually, country by country and sector by sector. All anthropogenic activities leading to carbon emissions are included in the calculations, following the standards of the Intergovernmental Panel on Climate Change. The latest data for the SDG Index is drawn from the 2021 report using 2020 data [16]. Data is available for 165 countries. The score ranges from 38.2 (Central African Republic) to 85.9 (Finland). HSDI data is from 2011. This is earlier than for the other indexes because the calculations are not done by an official body and are therefore sparsely available. The score ranges from 0.000 (Qatar) to 0.906 (Norway). SDI data is from 2019. The score ranges from 0.079 (Singapore) to 0.853 (Costa Rica).

Current Landscape of Measures on Sustainable Well-Being
Looking at the landscape of current measures of sustainable well-being, one can make several observations: some indexes are indirect measures of economic growth, there is a tropism towards cognitive measures, input and output measures of well-being are combined, and there is a tropism towards the ecological footprint.
First, several indexes end up being indirect measures of economic growth. We saw that this was the case for the Sustainable Development Goals Index, but it is also the case for the Happy Planet Index, the HSDI and Gaucher's index. The reason is that adding variables related to GDP, such as longevity, indirectly gives GDP more weight. With two or three variables closely linked to GDP in the numerator and just one in the denominator, the indexes indirectly reflect the evolution of GDP by a factor of one or two. A "GDP-neutral" index requires as many GDP-related variables in the numerator as in the denominator. This is the case, for instance, of the SDI, which uses the cube root of three indexes in the numerator so that the well-being index and the sustainability index in the denominator are well balanced. The proximity of the different indexes to GDP per capita can be seen in Figure 1 below.
Figure 1. Comparison of the existing indexes of sustainable well-being. Source: authors. * significant at the 10% level, ** significant at the 5% level, *** significant at the 1% level.
Second, there is a tropism towards cognitive measures. All indexes use either the Cantril ladder, the most cognitive of all well-being measures, or life satisfaction, which, although it also contains a hedonic component, is still mostly related to wants and mental projections. Using happiness or affective questions would come closer to what people usually mean when they talk about a happy person.
Third, several indexes use input measures or combine input and output measures. Some of them either use what Veenhoven calls environmental chances or mix them with individual outcomes, which makes them hard to read. This is the case for the second-generation HPI, which multiplies outcome-based indicators such as life evaluation by inequality of outcomes, making it hard to tell what is being measured.
Finally, there is a tropism towards the ecological footprint. Looking at the different measures, one can observe that the ecological footprint is the most used environmental measure. As discussed previously, although it is likely a valid measure, there are reliability concerns around it. Among all the indexes, only the HSDI relies on carbon emissions, a more reliable measure.
In Figure 1, I show the correlations (top right of the figure) and the distributions (bottom left of the figure) for average log GDP per capita, the HPI, the HSDI, the SDI and the SDG Index in 135 countries.

Proposition of New Measures
The previous look at existing measures shows that no measure is considered perfect in the literature and that each faces, to various degrees, issues of validity and reliability, often with tradeoffs between the two criteria. The more affective measures could be seen as closer to what happiness means for most people, but they are somewhat less reliable than cognitive measures such as life satisfaction or life evaluation. The latter are considered reliable but are not exempt from validity doubts, although many of these qualms were probably inflated in the literature [58]. Sustainability measures follow the same pattern: the ecological footprint comes closest to a holistic view of environmental impacts but faces reliability doubts, whereas carbon footprints and carbon emissions capture only a part of environmental damages and cannot validly represent all environmental impacts. The indexes share the problems of these basic measures, plus some extra ones depending on how they were built. It is important to understand this landscape of measures before envisioning possible improvements. Now that we have seen the shortcomings of the existing measures, what could we do to design a measure that better captures the ability to have the "most happiness for the most" and "for the longest"? Six points ought to be discussed in order to set boundaries to the debate and to choose the indicators: output vs. input, complexity vs. communicability, carbon impacts vs. all impacts, footprint and biocapacity vs. footprint only, which dimension of well-being to take, and how to balance the index with regard to prosperity.
First, related to our first observation, I choose to look at output-based indicators for the well-being part. It seems that the movement of the last 70 years, which led from external evaluations of living conditions to people's own appreciation of them, is, although not complete, closer to what matters to people on the ground. Using the four qualities of life [56], moving from environmental chances to individual outcomes is a form of progress from the perspective of social indicators. It does not mean that national statistical offices should rely only on outcomes. However, for the purpose of understanding social progress, asking people about their degree of well-being is what interests us here.
Second, defining the right indicator is about finding the right degree of complexity. It is generally acknowledged that "quantities are a particularly effective attention grabber" (pp. 28-29, [69]). However, they are so only when they are properly understood. If one thousand indicators bring a measure of quality of life closer to reality, is it really worth capturing 99% of the aspects of quality of life with them rather than 80% with five indicators? Opponents of the ecological footprint tend to quote Albert Einstein: "make everything as simple as possible but not simpler". It is possible that both forms of indicators are necessary, one as accurate as possible when details are needed and one simple enough to communicate. In that sense, it is interesting to observe that a strong criticism against the ecological footprint was its ability to be easily understood by the wider public. In the field of well-being, the same critique is often made by eudemonic approaches to hedonic indicators [55]. If one takes the quality criteria of the OECD and of many statistical offices, communicability is an important one. Certainly, the overshoot day does not reflect the complexity of environmental impacts. However, it does give a very understandable indication of the extent to which humanity as a whole and each country live indebted vis-à-vis the environment. The fact that the ecological footprint is considered a "success story" [70], partly due to its endorsement by NGOs and its ability to speak to a wider audience, can in that sense be seen as something positive. Another, more indirect argument of the opponents of the ecological footprint is that the indicator is not perfect, so we should not use it. This is rhetorically easy to answer because no indicator is perfect, and neither is any of the alternatives suggested to replace the ecological footprint, such as the Nature Index, Planetary Boundaries or MuSIASEM. The weaknesses of these proposed replacement measures have been shown [71].
Third, as far as the range of ecological impacts is concerned, there is no straight answer. In particular, one could wonder whether measures should stick to what is increasingly seen as the most pressing problem, i.e., global warming, or whether they should encompass more types of environmental problems. Some say that global warming is the most life-threatening issue for the future of the planet [72], whereas others consider that a loss of biodiversity would be just as threatening and plead for a multi-impact approach [33].
Fourth, there must be an almost philosophical discussion on whether or not one should include the biocapacity of each country in the construction of the index. The inclusion of biocapacity gives an enormous power and responsibility to large countries such as the United States, Russia, Canada, Australia or Brazil. It also gives them the power to say "we are big so our people can pollute a lot". However, we know for instance that the destruction of the Amazon has implications for the whole of humanity. Likewise, the forests of Canada and of Russia can be seen as a world heritage, although they belong to these countries. Therefore, it seems that biocapacity should not be included in the measure, as it would tend to imply that environmental problems are local, which they are indisputably not.
Fifth, the well-being indicators all seem to be reliable, although affective measures such as happiness are a bit less so, and one could question whether the Cantril ladder really depicts well-being as understood by people, rather than an abstract concept defined by scholars. Because life evaluation depicts proximity to ingrained standards better than any form of happiness does, it has become a decent proxy for economic prosperity, as its very high correlation (0.78) with GDP attests. However, its annual publication is also a plus from the perspective of an indicator. This means that happiness, life satisfaction and life evaluation are three candidates for measuring well-being.
Finally, one has to ensure that the index is balanced, that is, that there are as many dimensions related to prosperity in the numerator as in the denominator, in order not to build an indirect measure of prosperity as in the HPI or in Gaucher's index (the HSDI and the SDG Index are closer to GDP by design). In that sense, only the SDI is well balanced. Adding longevity could seem to be a sound idea, but it raises endogeneity questions, as both happiness measures and longevity are related to GDP. Bringing in longevity would therefore favor rich countries: two constructs in the numerator would be related to GDP against only one in the denominator, so the resulting index would itself be related to GDP. To avoid this, we should have as many GDP-related dimensions in the numerator as in the denominator.
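The balance argument can be made explicit with a stylized derivation. It assumes, purely for illustration, that each component scales as a power of GDP per capita $G$; the symbols $W_i$ (well-being components), $E$ (environmental component) and $I$ (index), as well as the exponents $a_i$ and $b$, are notation introduced here and not quantities estimated in this paper.

```latex
\begin{align*}
W_i &\propto G^{a_i} \quad (i = 1, \dots, k), \qquad E \propto G^{b}, \\
I_{\text{product}} &= \frac{\prod_{i=1}^{k} W_i}{E} \;\propto\; G^{\,\sum_{i=1}^{k} a_i - b}, \\
I_{\text{geometric}} &= \frac{\left(\prod_{i=1}^{k} W_i\right)^{1/k}}{E} \;\propto\; G^{\,\frac{1}{k}\sum_{i=1}^{k} a_i - b}.
\end{align*}
```

Under these assumptions, with $k = 2$ and $a_1 = a_2 = b = 1$, the product form scales as $G^{1}$ and thus tracks GDP, whereas the geometric-mean form scales as $G^{0}$. The same cancellation holds for $k = 3$, which is the sense in which taking the cube root of three GDP-related components, as in the SDI, keeps numerator and denominator balanced.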
Taken together, these choices mean that if we are looking for balanced, output-based indexes of well-being and environmental impacts that are based on footprints and not biocapacities and that are easy to communicate, using the best available measures, we should use two measures for environmental impact and three for well-being. In order to observe the ecological cost of happiness, I have decided to use a simple ratio between a well-being dimension and an environmental dimension. This ratio has several advantages: it is simple, easy to understand and as easy to compare as, say, GDP. We have seen previously that there are several candidates for both dimensions. Each should be established and validated in theory, and data should be widely available in order to compare countries on a large scale. From that perspective, I retain ecological footprint (EF) and carbon emissions (CE) on the environmental side and life satisfaction, life evaluation and happiness on the well-being side. This gives us six indexes to observe the links between environmental impacts and well-being: happiness/EF, life satisfaction/EF, life evaluation/EF, happiness/CE, life satisfaction/CE, life evaluation/CE.
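As a minimal sketch of how the six ratios could be assembled, the Python snippet below divides each well-being score by each environmental measure. The country labels, all numeric values and the function name `ratio_indexes` are hypothetical placeholders and do not come from the databases described above.

```python
from itertools import product

# Hypothetical country records: all numbers are illustrative placeholders,
# not values taken from the databases described in this paper.
countries = {
    "Country A": {"happiness": 3.2, "life_satisfaction": 7.0,
                  "life_evaluation": 6.8, "EF": 5.0, "CE": 8.0},
    "Country B": {"happiness": 3.0, "life_satisfaction": 6.0,
                  "life_evaluation": 5.5, "EF": 2.0, "CE": 2.5},
}

WELL_BEING = ("happiness", "life_satisfaction", "life_evaluation")
ENVIRONMENT = ("EF", "CE")  # ecological footprint, carbon emissions


def ratio_indexes(record):
    """Return the six well-being / environmental-impact ratios for one country."""
    return {
        f"{wb}/{env}": record[wb] / record[env]
        for env, wb in product(ENVIRONMENT, WELL_BEING)
    }


for name, record in countries.items():
    rounded = {key: round(value, 3) for key, value in ratio_indexes(record).items()}
    print(name, rounded)
```

Because the well-being questions do not necessarily share the same response scale, each ratio is best compared across countries for a given index rather than across the six indexes.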

Integration of Indexes in the Landscape of Measures
The measures created here complement the existing landscape of measures. Beyond methodological considerations, they add information that the existing measures do not provide.

Comparison of the Indexes
To determine to what extent each measure brings a new perspective to the existing measures, I plot their distributions and their correlations with each other and with a prosperity indicator. Figure 2 below presents some results of the different interactions between the well-being and environmental dimensions. The high correlation between life satisfaction and life evaluation is well known, and so is their relatively lower correlation with happiness [48,49]. Ecological footprint and carbon emissions show a strong correlation with life evaluation and a weaker one with happiness. A look at the correlations with GDP is useful in that sense, as both carbon emissions and ecological footprint are highly related to GDP. Finally, the SDG Index is highly correlated with GDP (0.85).
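The kind of comparison described above can be sketched in a few lines of Python. The snippet computes Pearson correlations between log GDP per capita and a well-being measure, an environmental measure and their ratio. The five data points and the variable names are hypothetical placeholders chosen for illustration; they are not the data behind Figure 2, and the helper `pearson` is written out by hand so that the sketch has no external dependency.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical values for five countries (placeholders, not the paper's data).
gdp_per_capita = [1_500, 6_000, 15_000, 40_000, 60_000]
life_evaluation = [4.1, 5.0, 5.9, 6.9, 7.3]
ecological_footprint = [1.0, 1.8, 3.1, 5.2, 6.0]

log_gdp = [math.log(g) for g in gdp_per_capita]

# Ratio index: life evaluation per unit of ecological footprint.
le_per_ef = [le / ef for le, ef in zip(life_evaluation, ecological_footprint)]

for label, series in [("life evaluation", life_evaluation),
                      ("ecological footprint", ecological_footprint),
                      ("LE/EF", le_per_ef)]:
    print(f"corr(log GDP, {label}) = {pearson(log_gdp, series):.3f}")
```

With real data, the same loop would be run over all six ratios and the existing indexes, which is what a correlation matrix such as the one in the figure summarizes.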

New Landscape of Indexes
The new indexes I created complement the current landscape of indexes. Most indexes at the moment are in some form a ratio between a cognitive form of well-being and the ecological footprint (HPI, SDG, Gaucher's Index). The indexes I propose here make it possible to fill in the affective part of the landscape, which at the moment is largely ignored. Affective questions could be, for instance, the Affect Balance Scale, the Hedonic Level of Affect or simply, as I propose here, happiness with one's life. From the perspective of sustainability, I propose to complement the measures with more carbon-based ones (since the only index using carbon emissions at the moment is the HSDI) and not only the ecological footprint, as both have pros and cons from a methodological perspective. Some of the indexes I created are very close to existing ones, such as the first- and second-generation HPI, which correspond to our LS/EF and LE/EF. The only difference is in the way they are operationalized.

Discussion
Depicting the landscape of measures on sustainable well-being and proposing possible complements to it has led to a few discussion points. I highlight them below.

Economic Prosperity Is Related to Higher Environmental Damages
The indexes I proposed show, in the continuation of the HPI, the extent to which economic prosperity is related to higher environmental damages. High-income countries destroy the planet much more than middle- and low-income countries. Looking at indexes at a macro level gives a clear sign that one has to choose between economic prosperity and environmental protection. Only the Sustainable Development Goals Index tells another story but, as we saw, it is a better proxy of GDP than of environmental protection. In that sense, the SDG Index seems to be more on the "development" than on the "sustainable" side, because its correlation with GDP is so high (0.847) that it could be considered merely a soft version of GDP. Additionally, the correlations between GDP and the ecological and carbon footprints (respectively 0.813 and 0.714) show to what extent economic prosperity is environmentally destructive. Finally, even the well-being measures are quite related to GDP, with correlations of 0.769, 0.778 and 0.184 for life satisfaction, life evaluation and happiness, respectively. This shows a phenomenon well known in the literature: life satisfaction and life evaluation are measures of proximity to ingrained standards rather than real measures of happiness. The "real" measure of happiness is only slightly related to GDP. Following this measure could lead to less environmentally destructive economies. For instance, certain studies show that decent living standards could be maintained in India, Brazil and South Africa with around 90% less per-capita energy use than currently consumed in affluent countries [50].

More or Better Indexes
In this paper, I have shown that quite a lot has been done to try to measure some form of sustainable well-being in the form of an index. Many of these indexes are similar, with a numerator depicting well-being and a denominator depicting sustainability. I have criticized the well-being part of most of these indexes and questioned the sustainability one. However, it is important to realize that discussing what makes a good measure does not happen in a black box. Because other indexes already exist, the social scientists or statisticians building a measure land on an occupied field. In that sense, one could wonder whether it is better to have a great measure of something already measured or simply a good measure of something not yet measured. In other words, in order to avoid the proliferation of measures, one has to ensure that the contribution of the index is significant enough for the information it brings to outweigh the noise. This means it is necessary to evaluate the quality of existing indexes before proposing a new one. In that sense, because most of the existing measures of sustainable well-being use cognitive measures such as life evaluation or life satisfaction for the well-being part and the ecological footprint (with or without biocapacity) for the environmental part, the most innovative measures among the six I created in this paper are the ones that use happiness and carbon emissions. It does not mean they are the best measures in absolute terms, simply that they bring the most information in comparison to the existing ones. We saw that cognitive measures are good proxies of economic prosperity, which is not the case for hedonic or affective measures. Likewise, I do not say that using carbon emissions is better than using the ecological footprint. Both seem necessary at the moment, so I made sure to keep both options open at this stage.
However, I do believe it is a mistake to include biocapacity, because that gives an enormous power of ecological degradation to large countries with large resources, whereas we are all in the same boat and a fire in Amazonia hurts us all.

Building a Good Index Is an Art
When crossing the different criteria identified in the literature review, one can be struck by how different in nature they are: the constructor of an index is bound not only by validity and reliability concerns, but also by communicability, availability of the data, policy relevance (at a given time or in the future?), proactiveness and, in the case of sustainability, the ability to meet local concerns while tackling larger issues. The weight one gives to each criterion depends on the sensibility of the researcher, on the ethos of his or her institution, etc. I have highlighted the debate around the ecological footprint: some researchers prefer reliability over communicability, some pragmatic researchers prefer access to data over validity, and some researchers value the problems of now whereas others prefer to look into a longer future. Possibly, none of them are actually wrong; they just look at the same reality from different angles. However, this does mean, more than ever, that the scientific endeavor of building indicators is a science of justification, argumentation and refutation, as the person bringing a new measure is accountable for improving the debate and not just for bringing noise into it. In any case, building a good index requires juggling many of these different sensibilities, and can be seen, after all, as a form of art.

Conclusions
There are multiple measures of sustainable well-being, which depict with unequal success what they intend to capture, namely the degree of well-being of populations in relation to their environmental impacts. As I showed, the landscape of indicators points towards indirect measures of economic growth, an inclination towards cognitive measures, a combination of input and output measures of well-being, and a tropism towards the ecological footprint. This indirectly points towards what is missing in this landscape (more balanced indexes, more affective measures, complementing measures with carbon footprint). The inclusion of other indexes or other measures could change the conclusions derived here, but it is likely that these differences would be moderate.