Article

Beyond the Preston Curve: Analyzing Variations in Life Expectancy Around the World Using Multivariate Regression Circa 2000 and 2015

Homer Consulting and MIT Research Affiliate, Barrytown, NY 12507, USA
Systems 2025, 13(7), 577; https://doi.org/10.3390/systems13070577
Submission received: 3 May 2025 / Revised: 30 June 2025 / Accepted: 9 July 2025 / Published: 14 July 2025
(This article belongs to the Section Systems Practice in Social Science)

Abstract

Multiple studies, starting with Preston’s work in 1975, have suggested that gross domestic product per capita (GDPPC) is an important explanatory factor for understanding differentials in life expectancy at birth (LEB) in countries around the world. This proposition was tested in the present study using two-period cross-sectional regression across a large number of both advanced and developing countries and 16 socioeconomic factors, including GDPPC. The best-performing regression equations in the periods around 2000 and 2015 included four to six of these factors (government effectiveness, safe sanitation, poverty and contraception, plus, in the circa-2000 period, the Gini index and CO2 emissions); perhaps surprisingly, these equations did not include GDPPC. The results were examined in greater detail for the world’s 15 most populous countries, helping to identify key drivers of LEB growth for each of these countries from circa 2000 to 2015. The fact that GDPPC drops out of the best equations calls into question the view that economic growth is the correct primary target for nations seeking to increase their average life expectancy.

1. Introduction

Fifty years ago, Samuel H. Preston published his observations, based on country-level data from throughout the world, of a positive log-linear relationship between gross domestic product per capita (GDPPC) and life expectancy at birth (LEB) [1]. He found such a relationship held for the decades of the 1900s, the 1930s, and the 1960s, but that the entire curve was rising over time. This he attributed to improvements in health care and public health—such as vaccination and the control of infectious disease. Indeed, global life expectancy increased from 32 years in 1900, to 51 in 1960, to 65 in 1990, and to 72 in 2022 [2,3], more than any single static curve based on GDPPC would have predicted. Nonetheless, the Preston curve was established as a clear demonstration that national income level is associated with LEB.
In the decades since Preston’s 1975 paper, multiple studies have seemed to confirm the relationship between GDPPC and LEB in both developed and developing countries [4,5,6,7,8,9,10,11,12,13,14]. Higher national incomes are generally associated with stronger government support for health and human services (including life-saving technologies), and also with the ability of individuals to afford healthier living conditions for themselves.
But some analysts have cautioned against giving too much weight to GDPPC’s role in explaining LEB. First, the curve reflects some degree of reverse causality because healthier, longer-lived people are more productive and add more to a nation’s income [4]. Second, GDPPC is not fully predictive, and there are many examples of countries that significantly outperform or underperform relative to the Preston curve (as Preston himself noted) [1]. Third, a focus on GDPPC may give the impression that a country need only concentrate on growing its economy, when in fact other socioeconomic factors may be more proximal to direct mortality risks and more salient for understanding how best to improve a nation’s life expectancy.
For example, a nation may choose to focus on the economic goals of reducing poverty or income inequality rather than GDPPC alone. Similarly, it may choose to focus on the social goals of good governance, literacy and educational attainment, family planning, public health, strong social services, health care access, or reducing harmful air pollution. All of these factors have been found, at least in some studies, to be significant contributors to health and longevity, aside from growth in GDPPC [8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24].
Still, after all this research into other explanatory factors, the Preston curve has not been set aside or even much called into question. Indeed, virtually all previous statistical analyses across multiple countries have concluded that GDPPC (specifically, the natural logarithm ln (GDPPC)) is a key independent contributor to LEB, if not the most important [8,9,10,11,12,13,14]. But these previous analyses have all had limitations: some considering only a small number of countries; some considering only high-income countries or only developing countries; some mixing socioeconomic variables with behavioral risk factors (e.g., substance abuse or obesity) or preexisting diseases (e.g., HIV/AIDS); some considering only a single point in time; and some considering only a small subset of the multiple socioeconomic factors described above.
Ideally, one should take a broader systems approach, one that recognizes dynamic complexities and feedback loops like the reverse causality of LEB affecting GDPPC noted above [4], as well as the interplay of GDPPC with the quality of national governance [25]. A causal-loop diagram of what this approach might look like is presented in Figure 1.
However, this is only a high-level picture, and making the approach more useful (e.g., through simulation and scenario testing as in [25]) would first require the identification and quantification of verifiably significant causal relationships. This study is meant to make progress on that initial step by focusing on the thicker black lines in Figure 1, namely, the links from multiple socioeconomic factors to LEB.
First, public data were gathered on LEB and 16 potential explanatory factors across 150 countries around the world at both higher and lower income levels. Second, a stepwise multivariate statistical regression approach was taken to evaluate these data at two different periods, the first around 2000 and the second around 2015, and to identify the best-performing regression models in each period. Third, these best-performing multivariate models (one with six factors and one with four factors) were compared with a Preston-style model containing only ln (GDPPC) as an explanatory factor. Finally, the analysis focused on the world’s 15 largest (most populous) countries to determine how well the six-factor model explains each country’s LEB progress since 2000 and to consider what key determinants the model may be missing in some cases.

2. Materials and Methods

2.1. Data

All data selected for the study were based on the literature described above and drawn from the World Bank’s online public database. All countries with a 2020 population of more than one million people and consistent reporting of LEB and GDPPC were admitted, resulting in the list of 150 countries presented alphabetically as Table S1 in Supplementary Materials.
Table 1 below presents simple summary statistics for LEB and the 16 socioeconomic factors included in the analysis; definitions and data links are provided below the numerical table. These factors include the economic variables of GDPPC, poverty, and income inequality (Gini index). They also include the World Bank’s six World Governance Indicators (WGIs), as well as measures of safe sanitation, education (mean years of schooling and female literacy), air pollution (CO2 emissions per capita), physician density, undernourishment, and use of contraception. Each of these variables is an established metric measured by most countries since 2000 or earlier, reported annually for some variables but with less regular reporting for others.
For each variable, two periods were defined for pooling available data: an earlier period circa 2000 and a later period circa 2015. The choice of pooling period was dictated by data availability, with a narrower pooling period (e.g., 2000–2004 or 2015–2019) used for variables with more regular reporting and a wider period (e.g., 1996–2004 or 2012–2018) used for variables with less regular reporting. Also, the pooling period for each socioeconomic factor was made sufficiently wide (always at least two years) to encompass the time period required to see its plausible impact on LEB.
Available data for each country were averaged within each pooling period. If a country had no data for a given variable within a given pooling period, that country was excluded from any analysis involving that variable in that period. The “Countries” column of Table 1 indicates that most variable/period combinations had 140 or more countries with data, and all variable/period combinations had at least 112 countries with data. Table 1 indicates the best value, worst value, and mean value for each variable/period combination across the countries with data.
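The pooling rule described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code used in the study: the function name `pool_period`, the `(country, year, value)` record layout, and the sample values are all hypothetical.

```python
from statistics import mean

def pool_period(records, start_year, end_year):
    """Average each country's reported values within [start_year, end_year].

    records: iterable of (country, year, value) tuples; value is None when
    the country did not report that variable in that year.
    Countries with no reported value in the window are left out of the
    result, mirroring the exclusion rule described in the text.
    """
    by_country = {}
    for country, year, value in records:
        if value is not None and start_year <= year <= end_year:
            by_country.setdefault(country, []).append(value)
    return {country: mean(values) for country, values in by_country.items()}

# Hypothetical observations for a single variable (illustrative values only).
records = [
    ("A", 2000, 10.0), ("A", 2002, 14.0), ("A", 2007, 30.0),
    ("B", 2001, None), ("B", 2009, 5.0),
    ("C", 2004, 8.0),
]
pooled = pool_period(records, 2000, 2004)
```

In this toy case, country B reports nothing usable inside the 2000–2004 window, so it would drop out of any regression involving this variable for that period.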

2.2. Analysis

A stepwise cross-sectional multivariate linear regression approach, with LEB as the dependent variable, was followed for the circa-2000 data, and then again separately for the circa-2015 data. In the first step, only variables with no missing data were included; namely, ln (GDPPC) and the 6 governance indicators. A “best-performing” combination of variables was identified that maximized the adjusted R-squared value and minimized the mean absolute error (MAE) across all 150 countries. In subsequent steps, additional variables were tested along with the best-performing variables from the previous step. Adding too many variables at once could have meant the removal of too many countries due to missing data and compromised statistical power. No more than 7 independent factors were included in any single regression, and no fewer than 91 countries, in line with standard guidelines for the sample-to-variable ratio in multivariate regression [26].
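The two ranking criteria and the size limits mentioned above can be written out as a short Python sketch. The formulas for R-squared, adjusted R-squared, and MAE are the standard ones; the function names, the `within_limits` helper, and the toy numbers are assumptions introduced here for illustration, and the limits of 7 factors and 91 countries are simply the bounds quoted in the text.

```python
def r_squared(actual, predicted):
    """Unadjusted R-squared: 1 - SS_residual / SS_total."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot

def adjusted_r_squared(r2, n_obs, n_factors):
    """Penalize R-squared for the number of explanatory factors."""
    return 1.0 - (1.0 - r2) * (n_obs - 1) / (n_obs - n_factors - 1)

def mean_absolute_error(actual, predicted):
    """Average absolute gap between actual and predicted values (in years for LEB)."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def within_limits(n_countries, n_factors, min_countries=91, max_factors=7):
    """Check a candidate regression against the size bounds quoted in the text."""
    return n_countries >= min_countries and n_factors <= max_factors

# Toy values (hypothetical, not taken from the study's dataset).
actual = [70.0, 65.0, 80.0, 75.0]
predicted = [72.0, 64.0, 79.0, 75.0]
r2 = r_squared(actual, predicted)
adj_r2 = adjusted_r_squared(r2, n_obs=len(actual), n_factors=1)
mae = mean_absolute_error(actual, predicted)
```

A stepwise search would compute these metrics for each candidate combination of factors and carry the best-scoring combination into the next step; that outer loop is omitted here.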
All 16 candidate factors were considered this way in a stepwise fashion, and a final best-performing equation was identified for each of the two periods. The best multivariate equation for each time period was compared in terms of performance with that of an equation including only ln (GDPPC) as an explanatory variable. The best multivariate equations were also tested for the presence of multicollinearity (which could potentially interfere with model reliability), including the calculation of variance inflation factors (VIFs) [27].
Further analysis focused on the world’s 15 largest countries. The goal was to understand key contributing factors for the change in LEB from the circa-2000 period to the circa-2015 period for each country. One of the best-performing regression models was used to calculate and identify apparent significant factors driving the change. The literature for that country was also examined to see what else might help to explain the observed change through circa 2015 or to provide insight concerning more recent changes.

3. Results

3.1. Regression Results

The results of the regression analysis are summarized in Table 2, with the top portion showing results for the circa-2000 period and the lower portion showing results for the circa-2015 period. For both periods, a best-performing linear regression model is presented first, followed by two or three lesser models for comparison.
For the earlier period, the best-performing regression model (“E1”) has six explanatory factors: government effectiveness, sanitation, poverty, contraception, Gini index, and CO2 emissions per capita. Based on the 102 countries with available data on all six variables in the circa-2000 period (see the X-marked countries in the “E1, E2, E3a” column of Table S1 in Supplementary Materials), this equation has an (unadjusted) R-squared of 86.3% and an MAE of 3.16 years. All six explanatory factors are significant with p-values of 0.005 or less.
For the later period, the best-performing regression model (“L1”) has four explanatory factors: government effectiveness, sanitation, poverty, and contraception. (These are also four of the six factors in model E1.) Based on the 91 countries with data on all four variables in the circa-2015 period (see the “L1, L2, L3a” column of Table S1), this equation has an R-squared of 85.6% and an MAE of 2.05 years. The first three factors are significant with p-values of 0.02 or less, while contraception is weaker, with a p-value of 0.14 that falls short of conventional significance thresholds. Coming in a close second is model “L2”, which does not include contraception as a factor. This three-factor model has an R-squared of 85.3% and an MAE of 2.13 years, with all three factors significant at p-values of 0.02 or less.
A model with the same four explanatory factors as in model L1 was tested for the earlier period, in order to see what difference the exclusion of Gini and CO2 emissions would make. Based on the same 102 countries as in model E1, this “E2” equation has an R-squared of 83.2% and an MAE of 3.25 years: a bit worse than the E1 model. All four explanatory factors are significant with p-values of 0.04 or less.
Table 2 also presents the variance inflation factors (VIFs) for each independent variable in multivariate models E1, E2, L1, and L2. VIFs across these models range from a minimum of 1.2 (corresponding to a cross-factor R-squared of 17%) to a maximum of 5.3 (corresponding to a cross-factor R-squared of 81%). Based on the usual guidelines [27], these VIF values do not present strong evidence of multicollinearity.
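The correspondence between VIF and cross-factor R-squared quoted above follows from the standard definition VIF = 1 / (1 − R²), where R² comes from regressing one explanatory factor on the others. A small Python sketch makes the conversion explicit; the function names are hypothetical.

```python
def vif_from_r_squared(r2):
    """Variance inflation factor implied by a cross-factor R-squared."""
    return 1.0 / (1.0 - r2)

def r_squared_from_vif(vif):
    """Cross-factor R-squared implied by a variance inflation factor."""
    return 1.0 - 1.0 / vif

# The two extremes reported in the text.
low_r2 = r_squared_from_vif(1.2)   # roughly 0.17
high_r2 = r_squared_from_vif(5.3)  # roughly 0.81
```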
None of the best-performing models described above include ln (GDPPC) as an explanatory factor. These may be compared with regressions that include only ln (GDPPC), in the manner of Preston. For the earlier period, such a univariate regression was first performed using the same 102 countries as in models E1 and E2. This “E3a” regression has an R-squared of 62.2% and an MAE of 4.88 years, not as good as the multivariate models E1 (86.3%, 3.16) and E2 (83.2%, 3.25). Next, a regression was performed using all 150 countries in the dataset, as a check on whether the smaller sample size in E3a had affected the outcome. This “E3b” regression has an R-squared of 64.9% and an MAE of 4.54 years, only a slight improvement in performance over E3a.
For the later period, two regressions were similarly performed, including only ln (GDPPC) as an explanatory factor. The “L3a” regression is based on the same 91 countries as in model L1; it has an R-squared of 72.5% and an MAE of 3.03 years, not as good as the multivariate model (85.6%, 2.05). The “L3b” regression is based on all 150 countries in the dataset, as a check on whether the smaller sample size in L3a had affected the result; it has an R-squared of 73.9% and an MAE of 3.02 years, close to the L3a result.
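A Preston-style univariate regression of the kind used for E3a, E3b, L3a, and L3b has the form LEB = a + b · ln(GDPPC). The sketch below fits that form by ordinary least squares in plain Python. It is only an illustration: the function names are hypothetical, and the data are synthetic points generated from an arbitrary line, not values from the study.

```python
import math

def fit_log_linear(gdppc, leb):
    """Fit LEB = intercept + slope * ln(GDPPC) by ordinary least squares."""
    x = [math.log(g) for g in gdppc]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(leb) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, leb))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

def predict_leb(intercept, slope, gdppc):
    """Predicted life expectancy for a given GDP per capita."""
    return intercept + slope * math.log(gdppc)

# Synthetic data lying exactly on LEB = 30 + 5 * ln(GDPPC) (arbitrary coefficients).
gdppc = [1000.0, 5000.0, 20000.0, 50000.0]
leb = [30.0 + 5.0 * math.log(g) for g in gdppc]
intercept, slope = fit_log_linear(gdppc, leb)
```

Because the synthetic points lie exactly on the generating line, the fit recovers its coefficients; with real country data the residuals would be the per-country errors that the MAE summarizes.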

3.2. Results for the 15 Largest (Most Populous) Countries

Table 3 presents the average LEB for the world’s 15 largest countries for the earlier and later periods (2000–2004 and 2015–2019, respectively), as well as regression model errors (i.e., predicted minus actual) for models E1 (six factors), E2 (four factors), E3a (one factor), L1 (four factors), L2 (three factors), and L3a (one factor). Due to missing data, these errors could not be calculated for Japan for models E1 and E2, or for India, Nigeria, and the Russian Federation for models L1 and L2. The bottom row of the table shows the mean absolute errors across the countries for which errors could be calculated. The MAEs across this sample of countries are similar in magnitude to the corresponding MAEs across the much broader sample of about 100 countries (including these largest ones) used for the regression analysis. Table 3 illustrates how the multivariate models generally (but not always) outperform the univariate models for these large countries. In particular, the multivariate models decisively outperform the univariate model in the earlier period for China, India, Nigeria, Bangladesh, Philippines, and Vietnam, and in the later period for the USA and Vietnam.
Having established that the best-performing regression models for the broader sample also typically perform well for the 15 largest countries, calculations were performed using the six-factor model to determine possible key factors for the observed magnitudes of change in LEB from the earlier (circa-2000) to the later (circa-2015) period; see Table 4. This table includes a column describing, in order of calculated impact, which of the six factors in the equation experienced a large enough change, when multiplied by the corresponding regression equation coefficient, to result in an impact (positive or negative) of at least 1.0 year of LEB.
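The screening rule described above (coefficient times change in factor, kept when the result is at least 1.0 year in either direction) can be sketched as follows. The function name `key_drivers`, the coefficients, and the factor values are all hypothetical, and ordering by absolute impact is an assumption about what "in order of calculated impact" means.

```python
def key_drivers(coefficients, earlier, later, threshold=1.0):
    """Return (factor, impact) pairs whose estimated LEB impact is at least
    `threshold` years in absolute value, largest absolute impact first.

    impact = regression coefficient * (later value - earlier value)
    """
    impacts = {}
    for factor, coef in coefficients.items():
        impact = coef * (later[factor] - earlier[factor])
        if abs(impact) >= threshold:
            impacts[factor] = impact
    return sorted(impacts.items(), key=lambda item: abs(item[1]), reverse=True)

# Hypothetical coefficients (years of LEB per unit of each factor) and
# hypothetical factor values for one country in the two periods.
coefficients = {"sanitation": 0.10, "poverty": -0.20, "gini": -0.05}
earlier = {"sanitation": 40.0, "poverty": 30.0, "gini": 35.0}
later = {"sanitation": 60.0, "poverty": 15.0, "gini": 33.0}
drivers = key_drivers(coefficients, earlier, later)
```

In this toy case, the drop in poverty and the rise in sanitation each clear the one-year threshold, while the small change in the Gini index does not.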
In addition, the research literature was examined to see what else might help to explain the observed change in LEB from the earlier period to the later period, particularly for countries where the six factors significantly underestimate or overestimate the actual growth in LEB; or when the research literature brings to light important changes that have occurred more recently. This information is presented in the final column of Table 4.
For example, India experienced growth in LEB of 6.8 years from the earlier period to the later period. The six-factor regression model can explain some but not all of this growth as related to improvements in sanitation and poverty. The literature suggests that improvements in parasitic disease control, literacy, and female education, factors not included in the regression model, also helped boost Indian LEB during those years [28,29].

4. Discussion

4.1. Summary and Contribution

This study addresses the question of whether GDPPC should be considered the primary determinant of LEB in countries around the world, or whether there might be a better way to look at it, with more incisive implications for national policy. A two-period cross-sectional approach was taken. Data on 16 socioeconomic factors were gathered from 150 advanced and developing nations, and regression analysis was used to find equations that gave the best fit to LEB in the circa-2000 and circa-2015 time periods. For both time periods, the best-performing equations included factors of government effectiveness, safe sanitation, poverty, and contraception; the circa-2000 equation also included the Gini index and CO2 emissions per capita. GDPPC is a reasonably good predictor by itself but is not present in the best-performing equations.
The idea that a multivariate equation would perform better than GDPPC alone should come as no surprise and has been found by others previously [8,9,10,11,12,13,14]. What has not been found previously is that GDPPC could drop out of the best-performing equations altogether. This finding suggests that GDPPC is not a significant contributor to explaining international LEB differentials after four to six other key socioeconomic factors are accounted for. To put it another way, it appears that GDPPC only points broadly at what these other salient factors or essential conditions (similar to the recently described vital conditions framework [45]) tell us more precisely.
The analysis suggests that those essential conditions include a clean environment (sanitation and lower CO2 emissions), adequate income (i.e., beyond poverty level), a government that reliably and fairly provides other important services, family planning (i.e., contraception) to help minimize infant and maternal mortality [46], and perhaps a sense that one’s efforts are valued by society as much as anyone else’s (as reflected in the Gini index of income inequality) [19].
Conversely, it is notable that the best-performing regression equations did not include physician density, undernourishment, female literacy, years of schooling, or any of the governance indicators other than government effectiveness. These excluded factors surely all contribute to quality of life and in some cases might be the difference between life and death; but apparently they do not contribute to explaining international LEB differences beyond the six factors that were included.

4.2. Limitations and Extensions

One might also think of other policy-relevant socioeconomic factors that this analysis did not consider, such as clean water, adequate housing, or reduced particulate air pollution (e.g., PM2.5 concentration, an alternative measure to CO2 emissions) [24]. The reason for their exclusion was simply a lack of data for a sufficient number of countries. Perhaps these factors could be included in an analysis for a smaller number of countries or a more recent time period when those data might be more plentiful.
Missing or less-frequently collected data also constrained what could be addressed in the analysis in terms of explaining change over time. If annual data were available on all variables of interest over a period of, say, 10 to 20 years, a fixed effects longitudinal panel regression approach could be taken to more accurately and systematically analyze the drivers of change over time [47]. Several LEB studies have taken such an approach, but in every case, they were limited to the number and variety of countries and the number of socioeconomic variables for which annual panel data were available [10,11,12,13].
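For concreteness, a fixed-effects panel specification of the kind mentioned above is commonly written as shown below. The notation is an assumption introduced here rather than the author's: i indexes countries, t indexes years, x_{k,it} is the k-th socioeconomic factor, and α_i is a country-specific intercept that absorbs time-invariant national characteristics.

```latex
\mathrm{LEB}_{it} = \alpha_i + \sum_{k=1}^{K} \beta_k \, x_{k,it} + \varepsilon_{it}
```

Estimating such an equation requires each country to report each x_{k,it} in most years, which is why the sparse reporting described in Section 2.1 rules it out for the full set of countries and factors considered here.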
Another possible approach to longitudinal analysis is dynamic simulation, which has been used previously for modeling changes in LEB at a country [48] and global [49] level. Even in the face of missing data, it is possible to simulate changes over time starting from known initial conditions and estimate coefficients for achieving a best fit by stochastic filtering and optimization [50]. Such an approach might be worth exploring for better understanding differentials in international LEB growth rates and anticipating the potential impacts of national policy options.

5. Conclusions

The cross-sectional approach taken here may be viewed as a kind of counterpoint to Preston’s groundbreaking 1975 work. This new analysis found that GDPPC is still strongly associated with national LEB, but that a small cluster of four to six essential socioeconomic factors, which does not include GDPPC, is significantly more predictive. Even if additional factors were tested, and even if a more longitudinal approach were taken, the fact that GDPPC drops out of the best-performing LEB equation would not change.
This result comes from a statistical analysis, and causality beyond correlation has not yet been firmly established. But if the result is ultimately confirmed, it would mean we should no longer view economic growth per se as the correct primary target for nations seeking to increase their average life expectancy. We need only look at China, which in 1975 was more than 11 years behind the US on LEB (60.9 vs. 72.6 years) but by 2022 had pulled ahead (78.6 vs. 77.4 years) [3]. Despite having a GDPPC only one-sixth that of the US, China achieved this remarkable feat, as the analysis here suggests (and as others have documented [28]), largely by concentrating its energies on poverty reduction, sanitation improvement, and the improved provision of other essential government services. We may find fault with many aspects of Chinese governance, but perhaps it is time we removed our growth-obsessed blinders and finally recognized that concerted government effort for the public good is the key requirement for population health and longevity.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/systems13070577/s1, “Table S1. Countries included in the best-performing regression models” (Microsoft Word). “LEB global regress 2025 dataset” (Microsoft Excel).

Funding

This research received no external funding.

Data Availability Statement

The data used in this study (as described in Table 1, including links to the original sources) are included in the Supplementary Materials. Further inquiries can be directed to the author.

Conflicts of Interest

The author declares no conflicts of interest.

References

  1. Preston, S.H. The changing relation between mortality and level of economic development. Popul. Stud. 1975, 29, 231–248, reprinted in Int. J. Epidem. 2007, 36, 484–490. [Google Scholar] [CrossRef]
  2. Riley, J.C. Estimates of regional and global life expectancy, 1800–2001. Popul. Devel. Rev. 2005, 31, 537–543. [Google Scholar] [CrossRef]
  3. World Bank. Life Expectancy at Birth, Total (Years). 2024. Available online: https://data.worldbank.org/indicator/sp.dyn.le00.in (accessed on 30 June 2025).
  4. Cutler, D.; Deaton, A.; Lleras-Muney, A. The determinants of mortality. J. Econ. Perspect. 2006, 20, 97–120. [Google Scholar] [CrossRef]
  5. Braveman, P.A.; Cubbin, C.; Egerter, S.; Williams, D.R.; Pamuk, E. Socioeconomic disparities in health in the United States: What the patterns tell us. Am. J. Public Health 2010, 100 (Suppl. S1), S186–S196. [Google Scholar] [CrossRef]
  6. Chetty, R.; Stepner, M.; Abraham, S.; Lin, S.; Scuderi, B.; Turner, N.; Bergeron, A.; Cutler, D. The association between income and life expectancy in the United States, 2001–2014. JAMA 2016, 315, 1750–1766. [Google Scholar] [CrossRef]
  7. Bundy, J.D.; Mills, K.T.; He, H.; LaVeist, T.A.; Ferdinand, K.C.; Chen, J.; He, J. Social determinants of health and premature death among adults in the USA from 1999 to 2018: A national cohort study. Lancet Public Health 2023, 8, e422–e431. [Google Scholar] [CrossRef]
  8. Lin, R.T.; Chen, Y.M.; Chien, L.C.; Chan, C.C. Political and social determinants of life expectancy in less developed countries: A longitudinal study. BMC Public Health 2012, 12, 85. [Google Scholar] [CrossRef]
  9. Mondal, M.N.; Shitan, M. Impact of socio-health factors on life expectancy in the low and lower middle income countries. Iran. J. Public Health 2013, 42, 1354–1362. [Google Scholar]
  10. Bayati, M.; Akbarian, R.; Kavosi, Z. Determinants of life expectancy in eastern Mediterranean region: A health production function. Int. J. Health Policy Manag. 2013, 1, 57–61. [Google Scholar] [CrossRef]
  11. Tarca, V.; Tarca, E.; Moscalu, M. Social and economic determinants of life expectancy at birth in Eastern Europe. Healthcare 2024, 12, 1148. [Google Scholar] [CrossRef]
  12. Karma, E. Socioeconomic determinants of life expectancy: Southeastern European countries. Eur. J. Sust. Devel. 2023, 12, 23–34. [Google Scholar] [CrossRef]
  13. Roffia, P.; Bucciol, A.; Hashlamoun, S. Determinants of life expectancy at birth: A longitudinal study on OECD countries. Int. J. Health Econ. Manag. 2022, 23, 189–212. [Google Scholar] [CrossRef] [PubMed]
  14. Homer, J.B. Life Expectancy in the U.S. and other OECD Countries: A Multivariate Analysis of Economic, Social, and Behavioral Factors. 2024. Available online: https://www.academia.edu/121497712/Life_Expectancy_in_the_U_S_and_Other_OECD_Countries_A_Multivariate_Analysis_of_Economic_Social_and_Behavioral_Factors (accessed on 30 June 2025).
  15. Kaufmann, D.; Kraay, A.; Zoido-Lobaton, P. Governance matters: From measurement to action. Financ. Dev. Int. Monet. Fund 2000, 37, 6. [Google Scholar]
  16. Mackenbach, J.P.; Stirbu, I.; Roskam, A.J.R.; Schaap, M.M.; Menvielle, G.; Leinsalu, M.; Kunst, A.E. Socioeconomic inequalities in health in 22 European countries. N. Engl. J. Med. 2008, 358, 2468–2481. [Google Scholar] [CrossRef]
  17. Bradley, E.H.; Elkins, B.R.; Herrin, J.; Elbel, B. Health and social services expenditures: Associations with health outcomes. BMJ Qual. Saf. 2011, 20, 826–831. [Google Scholar] [CrossRef] [PubMed]
  18. National Research Council (US). US Health in International Perspective: Shorter Lives, Poorer Health; Woolf, S.H., Aron, L., Eds.; National Academies Press: Washington, DC, USA, 2013; 394p. [Google Scholar]
  19. Pickett, K.E.; Wilkinson, R.G. Income inequality and health: A causal review. Soc. Sci. Med. 2015, 128, 316–326. [Google Scholar] [CrossRef]
  20. McCullough, J.M.; Leider, J.P. Government spending in health and nonhealthy sectors associated with improvement in county health rankings. Health Aff. 2016, 35, 2037–2043. [Google Scholar] [CrossRef]
  21. Bor, J.; Cohen, G.H.; Galea, S. Population health in an era of rising income inequality: USA, 1980–2015. Lancet 2017, 389, 1475–1490. [Google Scholar] [CrossRef]
  22. Enroth, L.; Jasilionis, D.; Németh, L.; Strand, B.H.; Tanjung, I.; Sundberg, L.; Fors, S.; Jylhä, M.; Brønnum-Hansen, H. Changes in socioeconomic differentials in old age life expectancy in four Nordic countries: The impact of educational expansion and education-specific mortality. Eur. J. Ageing 2022, 19, 161–173. [Google Scholar] [CrossRef]
  23. IHME-CHAIN Collaborators. Effects of education on adult mortality: A global systematic review and meta-analysis. Lancet Public Health 2024, 9, e155–e165. [Google Scholar] [CrossRef]
  24. EPIC/AQLI. Air Pollution Remains the Greatest External Risk to Human Health as Most Countries Fail to Set or Meet Their Own Standards for Clean Air. Energy Policy Institute, University of Chicago: Chicago, IL, USA, 2024. Available online: https://epic.uchicago.edu/news/air-pollution-remains-the-greatest-external-risk-to-human-health-as-most-countries-fail-to-set-or-meet-their-own-standards-for-clean-air/ (accessed on 30 June 2025).
  25. Homer, J. Can good government save us? Extending a climate-population model to include governance and its effects. Systems 2022, 10, 37. [Google Scholar] [CrossRef]
  26. Memon, M.A.; Ting, H.; Cheah, J.-H.; Thurasamy, R.; Chuan, F.; Cham, T.H. Sample size for survey research: Review and recommendations. J. Appl. Struct. Equ. Model. 2020, 4, i–xx. [Google Scholar] [CrossRef]
  27. Kim, J.H. Multicollinearity and misleading statistical results. Korean J. Anesth. 2019, 72, 558–569. [Google Scholar] [CrossRef] [PubMed]
  28. Drèze, J.; Sen, A. China and India. In Hunger and Public Action; Oxford University Press: Oxford, UK, 1991; Chapter 11; pp. 204–225. ISBN 9780198283652. [Google Scholar]
  29. Prasad, A.; Lakhanpaul, M.; Narula, S.; Patel, V.; Piot, P.; Venkatapuram, S. Accounting for the future of health in India. Lancet 2017, 389, 680–682. [Google Scholar] [CrossRef] [PubMed]
  30. Freeman, T.; Gesesew, H.A.; Bambra, C.; Giugliani, E.R.J.; Popay, J.; Sanders, D.; Macinko, J.; Musolino, C.; Baum, F. Why do some countries do better or worse in life expectancy relative to income? An analysis of Brazil, Ethiopia, and the United States of America. Int. J. Equity Health 2020, 19, 202–220. [Google Scholar] [CrossRef]
  31. Roser, M. Why Is Life Expectancy in the US Lower than in Other Rich Countries? 2020. Available online: https://ourworldindata.org/us-life-expectancy-low (accessed on 30 June 2025).
  32. GBD Indonesia. The state of health in Indonesia’s provinces, 1990–2019: A systematic analysis for the Global Burden of Disease Study 2019. Lancet Glob. Health 2022, 10, e1632–e1645. [Google Scholar] [CrossRef]
  33. GBD Pakistan. The state of health in Pakistan and its provinces and territories, 1990–2019: A systematic analysis for the Global Burden of Disease Study 2019. Lancet Glob. Health 2023, 11, e229–e243. [Google Scholar] [CrossRef]
  34. Aburto, J.M.; Calazans, J.; Queiroz, B.L.; Luhar, S.; Canudas-Romo, V. Uneven state distribution of homicides in Brazil and their effect on life expectancy, 2000–2015: A cross-sectional mortality study. BMJ Open 2021, 11, e044706. [Google Scholar] [CrossRef]
  35. Lawanson, O.I.; Umar, D.I. The life expectancy-economic growth nexus in Nigeria: The role of poverty reduction. SN Bus. Econ. 2021, 1, 127–152. [Google Scholar] [CrossRef]
  36. GBD Bangladesh. The burden of diseases and risk factors in Bangladesh, 1990–2019: A systematic analysis for the Global Burden of Disease Study 2019. Lancet Glob. Health 2023, 11, e1931–e1942. [Google Scholar] [CrossRef]
  37. Mitra, D.K.; Mridha, M.K. Sustaining progress in the health landscape of Bangladesh. Lancet Glob. Health 2023, 11, e1838–e1839.
  38. Brainerd, E. Mortality in Russia since the fall of the Soviet Union. Compar. Econ. Stud. 2021, 63, 557–576.
  39. Ikeda, N.; Saito, E.; Kondo, N.; Inoue, M.; Ikeda, S.; Satoh, T.; Wada, K.; Stickley, A.; Katanoda, K.; Mizoue, T.; et al. What has made the population of Japan healthy? Lancet 2011, 378, 1094–1105.
  40. Zazueta-Borboa, J.-D.; Vázquez-Castillo, P.; Gargiulo, M.; Aburto, J.M. The impact of violence and COVID-19 on Mexico’s life-expectancy losses and recent bounce-back, 2015–22. Int. J. Epidem. 2025, 54, dyaf034.
  41. GBD Ethiopia. Progress in health among regions of Ethiopia, 1990–2019: A subnational country analysis for the Global Burden of Disease Study 2019. Lancet 2022, 399, 1322–1335.
  42. Cruz, G.T.; Cruz, C.J.P.; Saito, Y. Is there compression or expansion of morbidity in the Philippines? Geriatr. Gerontol. Int. 2022, 22, 511–515.
  43. Rauch, J.E. Egypt: An Economic Growth Success, Yet a Health Failure? Cairo Review of Global Affairs. 2023. Available online: https://www.thecairoreview.com/essays/egypt-an-economic-growth-success-yet-a-health-failure/ (accessed on 30 June 2025).
  44. Nguyen, T.T.; Trevisan, M. Vietnam a country in transition: Health challenges. BMJ Nutr. Prev. Health 2020, 3, e000069.
  45. Milstein, B.; Payne, B.; Kelleher, C.; Homer, J.; Norris, T.; Roulier, M.; Saha, S. Organizing around vital conditions moves the social determinants agenda into wider action. Health Aff. Forefr. 2023.
  46. Askew, I.; Raney, L.; Kerrigan, M.; Sridhar, A. Family planning saves maternal and newborn lives: Why universal access to contraception must be prioritized in national maternal and newborn health policies, financing, and programs. Int. J. Gynecol. Obs. 2024, 164, 536–540.
  47. Greene, W.H. Econometric Analysis, 8th ed.; Pearson: Chennai, India, 2018; 392p.
  48. Homer, J. The growth and stagnation of US life expectancy: A dynamic simulation model and implications. Systems 2024, 12, 510.
  49. Homer, J. Modeling global loss of life from climate change through 2060. Syst. Dyn. Rev. 2020, 36, 523–535.
  50. Roy, D.; Rao, G.V. Stochastic Dynamics, Filtering and Optimization; Cambridge University Press: Cambridge, UK, 2017; 742p.
Figure 1. A causal-loop diagram of life expectancy and socioeconomic factors. Thicker arrows indicate the influence of socioeconomic factors on life expectancy. Source: Author’s original diagram.
Table 1. Summary statistics for data (as available) across 150 countries with a 2015 population greater than 1 million; earlier (circa 2000) and later (circa 2015) periods.
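| Variable | Unit | Period | Countries | Best | Worst | Mean |
|---|---|---|---|---|---|---|
| Life expectancy at birth | years | 2000–2004 | 150 | 81.6 | 43.8 | 67.0 |
| | | 2015–2019 | 150 | 84.6 | 52.3 | 72.1 |
| GDP per capita | USD 2015 | 2000–2004 | 150 | USD 74,718 | USD 277 | USD 10,897 |
| | | 2015–2019 | 150 | USD 86,424 | USD 286 | USD 13,337 |
| Stability and peace | percent | 2000–2004 | 150 | 83.5 | 3.6 | 46.3 |
| | | 2015–2017 | 150 | 80.2 | 0.0 | 45.6 |
| Rule of law | percent | 2000–2004 | 150 | 89.4 | 15.9 | 47.3 |
| | | 2015–2017 | 150 | 91.4 | 13.9 | 48.6 |
| Control of corruption | percent | 2000–2004 | 150 | 98.3 | 18.9 | 48.1 |
| | | 2015–2017 | 150 | 94.5 | 15.0 | 47.7 |
| Voice and accountability | percent | 2000–2004 | 150 | 82.8 | 8.3 | 47.4 |
| | | 2015–2017 | 150 | 83.7 | 6.9 | 47.3 |
| Government effectiveness | percent | 2000–2004 | 150 | 93.4 | 14.6 | 49.1 |
| | | 2015–2017 | 150 | 94.8 | 9.4 | 49.6 |
| Regulatory quality | percent | 2000–2004 | 150 | 87.7 | 7.7 | 49.5 |
| | | 2015–2017 | 150 | 93.5 | 4.4 | 50.4 |
| Safe sanitation | percent | 2000–2002 | 149 | 100 | 3.6 | 66.6 |
| | | 2015–2017 | 149 | 100 | 7.1 | 74.2 |
| Years of schooling (age 25+) | years | 1996–2004 | 146 | 12.7 | 1.1 | 7.0 |
| | | 2013–2017 | 149 | 14.1 | 1.4 | 8.5 |
| CO2 emissions per capita | metric tons | 1999–2003 | 147 | 0.0 | 40.1 | 4.5 |
| | | 2014–2018 | 147 | 0.0 | 32.4 | 4.4 |
| Gini index (income inequality) | percent | 1996–2004 | 116 | 24.3 | 64.7 | 40.4 |
| | | 2012–2018 | 146 | 25.8 | 59.1 | 37.6 |
| Physicians per 1000 population | ratio | 1996–2004 | 142 | 5.88 | 0.02 | 1.51 |
| | | 2012–2018 | 142 | 7.82 | 0.03 | 1.86 |
| Undernourishment | percent | 2000–2002 | 136 | 1.5 | 68.6 | 15.2 |
| | | 2015–2017 | 138 | 1.3 | 59.1 | 11.0 |
| Poverty (USD 3.20 PPP 2011 per person per day) | percent | 1996–2004 | 116 | 0.1 | 98.5 | 37.7 |
| | | 2012–2018 | 118 | 0.0 | 91.0 | 23.3 |
| Female literacy (age 15+) | percent | 1996–2004 | 115 | 99.8 | 9.4 | 71.9 |
| | | 2012–2018 | 112 | 100 | 13.9 | 82.2 |
| Contraception (women aged 15–49 with partners) | percent | 1996–2004 | 133 | 86.5 | 4.3 | 49.1 |
| | | 2012–2018 | 112 | 86.2 | 5.7 | 51.2 |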
Source: Author’s extraction of public data from multiple sources. Notes: Table 1 variable definitions and World Bank data links: GDP per capita: gross domestic product in constant 2015 US dollars. https://data.worldbank.org/indicator/NY.GDP.PCAP.CD (accessed on 30 June 2025); six Worldwide Governance Indicators (WGI; https://www.worldbank.org/en/publication/worldwide-governance-indicators; accessed on 30 June 2025): (1) stability and peace: likelihood of political instability and/or politically motivated violence, including terrorism; (2) rule of law: quality of contract enforcement, property rights, the police, and the courts, plus the likelihood of crime and violence; (3) control of corruption: minimizing use of public power for private gain, plus avoiding capture of the state by elites and private interests; (4) voice and accountability: ability of citizens to select their government, plus freedom of expression, freedom of association, and a free media; (5) government effectiveness: quality of public and civil services and government policy process and their independence from political pressures, plus credibility of government commitments; (6) regulatory quality: formulation and implementation of sound policies and regulations affecting private sector development; safe sanitation: piped sewer systems, septic tanks or pit latrines, ventilated improved pit latrines, composting toilets, or pit latrines with slabs. https://databank.worldbank.org/source/world-development-indicators/Series/SH.STA.BASS.ZS (accessed on 30 June 2025); years of schooling: average number of years of education (primary/ISCED 1 or higher) completed by a country’s adult population (25 years and older), excluding years spent repeating grades. https://databank.worldbank.org/Average-years-of-schooling-of-adults-(male-and-female)/id/12d63977# (accessed on 30 June 2025); CO2 emissions per capita: metric tons of emissions from the burning of fossil fuels and the manufacture of cement divided by total population. https://data.worldbank.org/indicator/EN.GHG.CO2.PC.CE.AR5 (accessed on 30 June 2025); Gini index: calculated from the Lorenz curve of income distribution; 0 represents perfect equality, 100 represents perfect inequality. https://data.worldbank.org/indicator/SI.POV.GINI (accessed on 30 June 2025); physicians per 1000 population: includes generalist and specialist medical practitioners. https://data.worldbank.org/indicator/sh.med.phys.zs (accessed on 30 June 2025); undernourishment: people with food consumption insufficient to provide dietary energy levels required to maintain a normal active and healthy life. https://data.worldbank.org/indicator/SN.ITK.DEFC.ZS (accessed on 30 June 2025); poverty: percent of population not achieving threshold of USD 3.65 purchasing power parity (in 2017 dollars) per person per day. https://data.worldbank.org/indicator/SI.POV.LMIC (accessed on 30 June 2025); female literacy: women aged 15+ who can both read and write with understanding a short, simple statement about their everyday life. https://data.worldbank.org/indicator/SE.ADT.LITR.FE.ZS (accessed on 30 June 2025); contraception: women (married or with partners) ages 15–49 using any form of contraception. https://data.worldbank.org/indicator/SP.DYN.CONU.ZS (accessed on 30 June 2025).
Table 2. Best-performing multivariate regression models and comparison to univariate GDPPC explanation of country-level life expectancy.
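Explanatory Variables for Life Expectancy

| Model / statistic | Constant | ln(GDPPC) | Govt Effective % | Sanitation % | Poverty % | Contracept % | Gini % | CO2 MT per Cap |
|---|---|---|---|---|---|---|---|---|
| Earlier period (1996–2004) | | | | | | | | |
| Model E1. All variables, N = 102, R-squared = 0.863, MAE = 3.16 years | | | | | | | | |
| Coefficient | 60.4 | | 0.099 | 0.088 | −0.088 | 0.167 | −0.200 | −0.472 |
| p-value | 2 × 10^−28 | | 0.005 | 0.002 | 0.001 | 2 × 10^−7 | 7 × 10^−5 | 0.003 |
| VIF | | | 2.7 | 5.0 | 4.6 | 3.4 | 1.2 | 2.8 |
| Model E2. All variables except Gini and CO2, N = 102, R-squared = 0.832, MAE = 3.25 years | | | | | | | | |
| Coefficient | 52.6 | | 0.066 | 0.085 | −0.082 | 0.155 | | |
| p-value | 1 × 10^−29 | | 0.04 | 0.003 | 0.005 | 6 × 10^−6 | | |
| VIF | | | 2.0 | 4.5 | 4.5 | 3.3 | | |
| Model E3a. GDPPC only, N = 102, R-squared = 0.622, MAE = 4.88 years | | | | | | | | |
| Coefficient | 19.3 | 5.8 | | | | | | |
| p-value | 8 × 10^−7 | 8 × 10^−23 | | | | | | |
| Model E3b. GDPPC only, N = 150, R-squared = 0.649, MAE = 4.54 years | | | | | | | | |
| Coefficient | 21.1 | 5.5 | | | | | | |
| p-value | 6 × 10^−12 | 2 × 10^−35 | | | | | | |
| Later period (2012–2019) | | | | | | | | |
| Model L1. All variables, N = 91, R-squared = 0.856, MAE = 2.05 years | | | | | | | | |
| Coefficient | 56.8 | | 0.136 | 0.107 | −0.053 | 0.033 | | |
| p-value | 2 × 10^−40 | | 2 × 10^−7 | 2 × 10^−5 | 0.02 | 0.14 | | |
| VIF | | | 1.8 | 5.3 | 4.7 | 2.2 | | |
| Model L2. All variables except contracept, N = 91, R-squared = 0.853, MAE = 2.13 years | | | | | | | | |
| Coefficient | 57.6 | | 0.143 | 0.118 | −0.056 | | | |
| p-value | 6 × 10^−42 | | 4 × 10^−8 | 1 × 10^−6 | 0.02 | | | |
| VIF | | | 1.8 | 4.8 | 4.7 | | | |
| Model L3a. GDPPC only, N = 91, R-squared = 0.725, MAE = 3.03 years | | | | | | | | |
| Coefficient | 30.2 | 4.9 | | | | | | |
| p-value | 6 × 10^−19 | 1 × 10^−26 | | | | | | |
| Model L3b. GDPPC only, N = 150, R-squared = 0.739, MAE = 3.02 years | | | | | | | | |
| Coefficient | 30.2 | 4.9 | | | | | | |
| p-value | 2 × 10^−30 | 5 × 10^−45 | | | | | | |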
Source: Author’s analysis (using Microsoft Excel regression module). Notes: N = number of countries in the regression. MAE = mean absolute error between predicted LEB (using the regression equation) and actual LEB across the countries in the regression. VIF = variance inflation factor. R-squared values reported here are standard, not adjusted.
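As a reading aid, the fitted equations in Table 2 can be evaluated directly. The following is a minimal sketch, not the author's implementation (the regressions were run in Microsoft Excel): it hard-codes the Model L2 and Model L3b coefficients from Table 2 and assumes that ln denotes the natural logarithm. The function names are illustrative. The inputs are the later-period means from Table 1, which are taken over slightly different country sets, so the result describes a notional average country rather than any real one.

```python
import math

# Coefficients copied from Table 2 (later period, 2012-2019).
MODEL_L2 = {
    "constant": 57.6,
    "govt_effectiveness": 0.143,  # percent
    "sanitation": 0.118,          # percent with safe sanitation
    "poverty": -0.056,            # percent below the poverty threshold
}
MODEL_L3B = {"constant": 30.2, "ln_gdppc": 4.9}


def predict_leb_l2(govt_effectiveness, sanitation, poverty):
    """Predicted life expectancy at birth (years) under Model L2."""
    return (
        MODEL_L2["constant"]
        + MODEL_L2["govt_effectiveness"] * govt_effectiveness
        + MODEL_L2["sanitation"] * sanitation
        + MODEL_L2["poverty"] * poverty
    )


def predict_leb_l3b(gdppc):
    """Predicted life expectancy (years) from GDP per capita alone (Model L3b).

    Assumes the natural logarithm of GDPPC in constant 2015 USD.
    """
    return MODEL_L3B["constant"] + MODEL_L3B["ln_gdppc"] * math.log(gdppc)


# Illustrative inputs: later-period means from Table 1.
leb_l2 = predict_leb_l2(govt_effectiveness=49.6, sanitation=74.2, poverty=23.3)
leb_l3b = predict_leb_l3b(gdppc=13337)

print(f"Model L2 prediction:  {leb_l2:.1f} years")
print(f"Model L3b prediction: {leb_l3b:.1f} years")
```

Because the GDPPC-only model is linear in ln(GDPPC), evaluating it at the mean of GDPPC overstates the prediction relative to evaluating it country by country and averaging; the multivariate model is linear in its inputs and has no such effect.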
Table 3. Regression errors (predicted minus actual LEB) for the 15 largest countries, for earlier and later regression periods.
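| Country | Actual 2000–2004 | Earlier error: Model E1 (6 Factors) | Earlier error: Model E2 (4 Factors) | Earlier error: Model E3a (1 Factor) | Actual 2015–2019 | Later error: Model L1 (4 Factors) | Later error: Model L2 (3 Factors) | Later error: Model L3a (1 Factor) |
|---|---|---|---|---|---|---|---|---|
| China | 72.9 | −3.3 | −4.3 | −7.8 | 77.4 | −1.5 | −2.2 | −2.4 |
| India | 63.6 | −2.8 | −4.6 | −5.3 | 70.4 | na | na | −3.4 |
| USA | 77.0 | −5.0 | 1.0 | 4.9 | 78.6 | 2.0 | 2.0 | 5.3 |
| Indonesia | 66.7 | −2.8 | −5.1 | −3.3 | 70.1 | 1.6 | 1.4 | 0.3 |
| Pakistan | 62.6 | −3.8 | −6.2 | −2.9 | 66.2 | 0.9 | 1.5 | −0.4 |
| Brazil | 70.4 | −0.5 | 1.8 | 0.1 | 74.8 | −0.5 | −1.1 | −0.3 |
| Nigeria | 48.0 | 3.8 | 4.8 | 14.4 | 52.3 | na | na | 16.3 |
| Bangladesh | 66.5 | −4.4 | −6.8 | −9.2 | 71.7 | −5.9 | −6.6 | −5.9 |
| Russian Federation | 65.3 | 4.3 | 7.0 | 5.7 | 72.2 | na | na | 3.9 |
| Japan | 81.6 | na | na | −2.3 | 84.1 | −3.8 | −2.9 | −2.6 |
| Mexico | 74.0 | −4.2 | −2.3 | −2.2 | 74.3 | 0.7 | 0.5 | 0.8 |
| Ethiopia | 51.8 | −0.3 | −2.7 | 0.1 | 64.8 | −4.6 | −4.9 | −2.3 |
| Philippines | 69.8 | −4.2 | −4.1 | −6.7 | 71.5 | 0.4 | 0.5 | −1.6 |
| Egypt | 68.4 | 3.9 | 1.4 | −3.3 | 71.1 | 1.8 | 1.9 | −0.5 |
| Vietnam | 72.8 | −4.5 | −6.6 | −12.9 | 74.0 | 0.4 | 0.0 | −5.7 |
| Mean absolute error | | 3.4 | 3.9 | 5.4 | | 2.0 | 2.1 | 3.4 |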
Source: Author’s model testing, data described in Table 1, and author’s spreadsheet calculations. Notes: “na” = data not available (for the time period in question) on one or more factors in the regression equation. Mean absolute error is across countries with full data for the time period in question.
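The mean absolute error row can be reproduced from the per-country errors by averaging their absolute values over the countries with data. The sketch below is illustrative only: the helper name is not from the article, "na" entries are represented as `None`, and the inputs are the rounded errors printed in Table 3, so results may differ slightly from figures computed on unrounded data.

```python
def mean_absolute_error(errors):
    """Average of |error| over entries that are not None ("na")."""
    valid = [abs(e) for e in errors if e is not None]
    if not valid:
        raise ValueError("no countries with full data")
    return sum(valid) / len(valid)


# Errors (predicted minus actual LEB) from Table 3, in table order:
# China, India, USA, Indonesia, Pakistan, Brazil, Nigeria, Bangladesh,
# Russian Federation, Japan, Mexico, Ethiopia, Philippines, Egypt, Vietnam.
l1_errors = [-1.5, None, 2.0, 1.6, 0.9, -0.5, None, -5.9,
             None, -3.8, 0.7, -4.6, 0.4, 1.8, 0.4]
l3a_errors = [-2.4, -3.4, 5.3, 0.3, -0.4, -0.3, 16.3, -5.9,
              3.9, -2.6, 0.8, -2.3, -1.6, -0.5, -5.7]

mae_l1 = mean_absolute_error(l1_errors)    # 12 countries with full data
mae_l3a = mean_absolute_error(l3a_errors)  # all 15 countries

print(f"Model L1 MAE:  {mae_l1:.1f} years")
print(f"Model L3a MAE: {mae_l3a:.1f} years")
```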
Table 4. Life expectancy changes for the 15 largest countries, and key drivers based on the 6-factor regression model and the research literature.
| Country | Life Expectancy: Earlier 2000–2004 | Life Expectancy: Later 2015–2019 | Change | Apparent Key Drivers Based on the 6-Factor Regression Model (in Order of Calculated Impact) | Apparent Key Drivers Based on the Research Literature | Reference Numbers |
|---|---|---|---|---|---|---|
| China | 72.9 | 77.4 | 4.5 | Improvements in poverty, sanitation, and government effectiveness, but worsening CO2 emissions | A “war against pollution” was announced in 2014 and reduced particulate air pollution (PM2.5) by 40% by 2022 | [24,28] |
| India | 63.6 | 70.4 | 6.8 | Improvements in sanitation and poverty | Also, improvements in parasitic disease control, literacy, and female education | [28,29] |
| USA | 77.0 | 78.6 | 1.7 | Improvement in CO2 emissions, but worsening government effectiveness | Also, reductions in crime and violence | [7,24,30,31] |
| Indonesia | 66.7 | 70.1 | 3.4 | Improvements in poverty, sanitation, and contraception, but worsening Gini | (no additional) | [32] |
| Pakistan | 62.6 | 66.2 | 3.6 | Improvements in sanitation, poverty, and contraception | (no additional) | [33] |
| Brazil | 70.4 | 74.8 | 4.4 | Improvements in sanitation and poverty, but worsening government effectiveness | Also, newly instituted universal health coverage | [30,34] |
| Nigeria | 48.0 | 52.3 | 4.3 | Improvements in sanitation | Also, improved educational attainment | [35] |
| Bangladesh | 66.5 | 71.7 | 5.2 | Improvements in sanitation, poverty, and contraception | (no additional) | [36,37] |
| Russian Federation | 65.3 | 72.2 | 6.9 | Improvements in government effectiveness, sanitation, and poverty | Also, reduced alcohol consumption following strong policy measures of 2006 and 2010 | [38] |
| Japan | 81.6 | 84.1 | 2.5 | Improvements in government effectiveness | Also, increased educational attainment and urbanization | [39] |
| Mexico | 74.0 | 74.3 | 0.3 | Improvements in sanitation, poverty, and Gini, but worsening government effectiveness | Also, increased violence | [40] |
| Ethiopia | 51.8 | 64.8 | 13.1 | Improvements in contraception and poverty | Reductions in violence and undernourishment, plus improved female education and community-based health services | [30,41] |
| Philippines | 69.8 | 71.5 | 1.8 | Improvements in sanitation, poverty, and contraception | (no additional) | [42] |
| Egypt | 68.4 | 71.1 | 2.7 | Worsening government effectiveness | Improved control of infectious and parasitic diseases | [43] |
| Vietnam | 72.8 | 74.0 | 1.2 | Improvements in poverty, sanitation, and government effectiveness | Improvement has been hindered by a steep increase in chronic disease due to a more sedentary lifestyle and worse nutrition | [44] |
Source: Author’s analysis and the research literature as indicated with reference numbers. Notes: “no additional” indicates that the six-factor regression model adequately explains the observed change from earlier to later periods, and the research literature does not provide significant further insight.
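One plausible way to rank drivers "in order of calculated impact" is to multiply each Model E1 coefficient (Table 2) by the change in that factor between the two periods and sort by absolute size. This section does not spell out the calculation, so the sketch below is an assumed reading rather than the author's documented procedure. The factor changes shown are hypothetical values chosen only to illustrate the arithmetic; they are not data for any country in Table 4.

```python
# Model E1 coefficients from Table 2 (years of LEB per unit of each factor).
E1_COEFFICIENTS = {
    "govt_effectiveness": 0.099,
    "sanitation": 0.088,
    "poverty": -0.088,
    "contraception": 0.167,
    "gini": -0.200,
    "co2_per_capita": -0.472,
}


def rank_drivers(factor_changes):
    """Return (factor, contribution in years) pairs, largest |impact| first.

    Assumes contribution = coefficient * (later value - earlier value).
    """
    contributions = {
        name: E1_COEFFICIENTS[name] * delta
        for name, delta in factor_changes.items()
    }
    return sorted(contributions.items(), key=lambda kv: abs(kv[1]), reverse=True)


# HYPOTHETICAL changes from the earlier to the later period (not real data).
example_changes = {
    "govt_effectiveness": 5.0,   # percentage points
    "sanitation": 20.0,          # percentage points
    "poverty": -15.0,            # percentage points (a decline)
    "contraception": 10.0,       # percentage points
    "gini": 0.0,                 # percentage points
    "co2_per_capita": 1.0,       # metric tons per capita
}

ranked = rank_drivers(example_changes)
total = sum(value for _, value in ranked)

for name, value in ranked:
    print(f"{name:20s} {value:+.2f} years")
print(f"{'total':20s} {total:+.2f} years")
```

A negative change in a factor with a negative coefficient (here, falling poverty) yields a positive contribution, which matches the table's wording of "improvements in poverty".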
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

