Causal Analysis of Multidimensional Dietary Data to Assess Effects on All-Cause Mortality
Abstract
1. Introduction
2. Materials and Methods
2.1. Study Design and Population
2.2. Dietary Data
2.3. Mortality
2.4. Covariates (Confounding Variables)
2.5. Statistical Analysis
3. Results
3.1. Baseline Characteristics
3.2. Dietary Patterns
3.3. Associations of Diet and Mortality
3.4. Effectiveness in Confounder Balance
4. Discussion
4.1. Comparison with Previous Studies
4.2. Implications
4.3. Future Directions
4.4. Strengths and Limitations
5. Conclusions
Supplementary Materials
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Abbreviations
| Abbreviation | Definition |
|---|---|
| ATE | Average Treatment Effect |
| BMI | Body Mass Index |
| CBPS | Covariate Balancing Propensity Score |
| CI | Confidence Interval |
| CVD | Cardiovascular Disease |
| ESS | Effective Sample Size |
| FNDDS | Food and Nutrient Database for Dietary Studies |
| FPIR | Family Poverty–Income Ratio |
| GED | General Educational Development |
| GPS | Generalized Propensity Score |
| GPAQ | Global Physical Activity Questionnaire |
| HHS | (U.S.) Department of Health and Human Services |
| IPTW | Inverse Probability of Treatment Weighting |
| IPW | Inverse Probability Weighting |
| MET | Metabolic Equivalent Task |
| mvGPS | Multivariable Generalized Propensity Score |
| NHANES | National Health and Nutrition Examination Survey |
| PIR | Poverty–Income Ratio |
| PSU | Primary Sampling Unit |
| RR | Relative Risk |
| SD | Standard Deviation |
| USDA | (U.S.) Department of Agriculture |
| WSS | Within-Cluster Sum of Squares |




| Characteristics | Cluster 1 (Mixed) (N = 6879) | Cluster 2 (Healthy) (N = 2576) | Cluster 3 (Unhealthy) (N = 3180) | Total (N = 12,635) |
|---|---|---|---|---|
| Sex | | | | |
| Male | 2467 (35.9%) | 1306 (50.7%) | 2383 (74.9%) | 6156 (48.7%) |
| Female | 4412 (64.1%) | 1270 (49.3%) | 797 (25.1%) | 6479 (51.3%) |
| Age (years) | | | | |
| Median [Min, Max] | 38.0 [20.0, 85.0] | 42.0 [20.0, 82.0] | 35.0 [20.0, 85.0] | 38.0 [20.0, 85.0] |
| Education | | | | |
| Less Than High School | 1266 (18.4%) | 324 (12.6%) | 764 (24.0%) | 2354 (18.6%) |
| High School Diploma (including GED) | 1497 (21.8%) | 331 (12.8%) | 890 (28.0%) | 2718 (21.5%) |
| More Than High School | 4116 (59.8%) | 1921 (74.6%) | 1526 (48.0%) | 7563 (59.9%) |
| Race | | | | |
| Mexican American | 1042 (15.1%) | 369 (14.3%) | 698 (21.9%) | 2109 (16.7%) |
| Other Hispanic | 710 (10.3%) | 253 (9.8%) | 266 (8.4%) | 1229 (9.7%) |
| Non-Hispanic White | 2992 (43.5%) | 1114 (43.2%) | 1416 (44.5%) | 5522 (43.7%) |
| Non-Hispanic Black | 1363 (19.8%) | 346 (13.4%) | 575 (18.1%) | 2284 (18.1%) |
| Other Race, Including Multi-Racial | 772 (11.2%) | 494 (19.2%) | 225 (7.1%) | 1491 (11.8%) |
| Marital status | | | | |
| Married/living with partner | 3998 (58.1%) | 1715 (66.6%) | 1999 (62.9%) | 7712 (61.0%) |
| Widowed | 194 (2.8%) | 63 (2.4%) | 34 (1.1%) | 291 (2.3%) |
| Divorced | 645 (9.4%) | 220 (8.5%) | 212 (6.7%) | 1077 (8.5%) |
| Separated | 220 (3.2%) | 56 (2.2%) | 95 (3.0%) | 371 (2.9%) |
| Never married | 1822 (26.5%) | 522 (20.3%) | 840 (26.4%) | 3184 (25.2%) |
| Income (FPIR) | | | | |
| Mean (SD) | 2.59 (1.64) | 3.11 (1.68) | 2.34 (1.57) | 2.63 (1.65) |
| Health insurance | 1745 (25.4%) | 516 (20.0%) | 1127 (35.4%) | 3388 (26.8%) |
| Physical activity | | | | |
| Sedentary | 2374 (34.5%) | 653 (25.3%) | 928 (29.2%) | 3955 (31.3%) |
| Moderate | 832 (12.1%) | 314 (12.2%) | 278 (8.7%) | 1424 (11.3%) |
| Vigorous | 3673 (53.4%) | 1609 (62.5%) | 1974 (62.1%) | 7256 (57.4%) |
| Smoking | | | | |
| Never smoked | 4235 (61.6%) | 1743 (67.7%) | 1595 (50.2%) | 7573 (59.9%) |
| Ex-smoker | 1156 (16.8%) | 580 (22.5%) | 576 (18.1%) | 2312 (18.3%) |
| Smoker | 1488 (21.6%) | 253 (9.8%) | 1009 (31.7%) | 2750 (21.8%) |
| Body mass index | | | | |
| Mean (SD) | 28.2 (6.56) | 26.7 (5.23) | 28.1 (6.36) | 27.9 (6.29) |
| Depression | | | | |
| None | 5569 (81.0%) | 2292 (89.0%) | 2584 (81.3%) | 10,445 (82.7%) |
| Mild | 626 (9.1%) | 160 (6.2%) | 287 (9.0%) | 1073 (8.5%) |
| Moderate | 493 (7.2%) | 95 (3.7%) | 230 (7.2%) | 818 (6.5%) |
| Severe | 191 (2.8%) | 29 (1.1%) | 79 (2.5%) | 299 (2.4%) |
| Fruit, g/day | | | | |
| Median [Min, Max] | 77.6 [0, 654] | 316 [0, 2910] | 70.0 [0, 1380] | 109 [0, 2910] |
| Vegetable, g/day | | | | |
| Median [Min, Max] | 92.3 [0, 470] | 189 [0, 2040] | 141 [0, 760] | 118 [0, 2040] |
| Legumes, g/day | | | | |
| Median [Min, Max] | 10.3 [0, 1460] | 50.7 [0, 3160] | 17.6 [0, 3870] | 17.2 [0, 3870] |
| Whole grain, g/day | | | | |
| Median [Min, Max] | 7.80 [0, 140] | 45.2 [0, 516] | 4.54 [0, 182] | 11.6 [0, 516] |
| Processed meat, g/day | | | | |
| Median [Min, Max] | 7.37 [0, 200] | 2.76 [0, 329] | 31.3 [0, 475] | 11.5 [0, 475] |
| Fish, g/day | | | | |
| Median [Min, Max] | 0 [0, 267] | 0 [0, 900] | 0 [0, 407] | 0 [0, 900] |
| Added sugar, g/day | | | | |
| Median [Min, Max] | 61.6 [0, 320] | 55.9 [0, 419] | 144 [0, 733] | 73.6 [0, 733] |
| Meat, g/day | | | | |
| Median [Min, Max] | 21.8 [0, 269] | 12.8 [0, 375] | 77.8 [0, 663] | 31.2 [0, 663] |
| Refined grain, g/day | | | | |
| Median [Min, Max] | 129 [0, 420] | 137 [0, 724] | 247 [0, 1150] | 153 [0, 1150] |
| Energy intake, kcal | | | | |
| Median [Min, Max] | 1680 [212, 4420] | 2200 [409, 10,000] | 2880 [1390, 9480] | 2020 [212, 10,000] |
| All-cause mortality | 245 (3.6%) | 69 (2.7%) | 86 (2.7%) | 400 (3.2%) |

| Method | Euclidean Distance | Maximum Correlation | Average Correlation | Effective Sample Size |
|---|---|---|---|---|
| mvGPS | 0.98 | 0.41 | 0.04 | 5308 |
| Refined grain (PS) | 1.27 | 0.43 | 0.06 | 9180 |
| Added sugar (PS) | 1.21 | 0.49 | 0.05 | 9788 |
| Processed meat (PS) | 1.26 | 0.49 | 0.05 | 11,952 |
| Meat (PS) | 1.24 | 0.49 | 0.05 | 11,626 |
| Legumes, nuts, and soy (PS) | 1.30 | 0.50 | 0.06 | 12,227 |
| Fish (PS) | 1.31 | 0.50 | 0.06 | 12,458 |
| Whole grain (PS) | 1.28 | 0.51 | 0.05 | 12,038 |
| Fruit (PS) | 1.29 | 0.51 | 0.05 | 12,006 |
| Vegetable (PS) | 1.28 | 0.52 | 0.05 | 11,590 |
| Unweighted | 1.32 | 0.50 | 0.06 | 12,635 |
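
The four balance metrics in the table above can be illustrated with a minimal Python sketch. The table does not define its columns, so the definitions here are assumptions: weighted Pearson correlations are computed between every exposure and every confounder, then summarised as their Euclidean norm, maximum absolute value, and mean absolute value, and the effective sample size uses the Kish formula (Σw)² / Σw². The function names `weighted_corr` and `balance_summary` and the synthetic data are hypothetical and are not taken from the study.

```python
import numpy as np


def weighted_corr(x, y, w):
    """Weighted Pearson correlation between two 1-D arrays."""
    w = w / w.sum()  # normalise weights to sum to 1
    mx, my = np.sum(w * x), np.sum(w * y)
    cov = np.sum(w * (x - mx) * (y - my))
    var_x = np.sum(w * (x - mx) ** 2)
    var_y = np.sum(w * (y - my) ** 2)
    return cov / np.sqrt(var_x * var_y)


def balance_summary(exposures, confounders, weights=None):
    """Summarise exposure-confounder balance under a set of weights.

    exposures:   array of shape (n, p), one column per dietary exposure
    confounders: array of shape (n, q), one column per confounder
    weights:     array of shape (n,), or None for the unweighted sample
    """
    exposures = np.asarray(exposures, dtype=float)
    confounders = np.asarray(confounders, dtype=float)
    n = exposures.shape[0]
    w = np.ones(n) if weights is None else np.asarray(weights, dtype=float)

    # One correlation per exposure-confounder pair (p * q values in total).
    corrs = np.array([
        weighted_corr(exposures[:, j], confounders[:, k], w)
        for j in range(exposures.shape[1])
        for k in range(confounders.shape[1])
    ])
    abs_corrs = np.abs(corrs)

    return {
        # Assumed definition: distance of the correlation vector from zero.
        "euclidean_distance": float(np.sqrt(np.sum(corrs ** 2))),
        "max_correlation": float(abs_corrs.max()),
        "average_correlation": float(abs_corrs.mean()),
        # Kish effective sample size; equals n when all weights are equal.
        "effective_sample_size": float(w.sum() ** 2 / np.sum(w ** 2)),
    }


# Synthetic illustration only: two exposures confounded by three covariates.
rng = np.random.default_rng(0)
C = rng.normal(size=(500, 3))
A = C[:, :2] * 0.5 + rng.normal(size=(500, 2))
print(balance_summary(A, C))
```

Under these definitions, equal weights return an effective sample size equal to the number of participants, which matches the unweighted row (12,635). Lower correlation summaries indicate better balance, usually at the cost of a smaller effective sample size, as in the mvGPS row.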
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Melaku, Y.A.; Shi, Z. Causal Analysis of Multidimensional Dietary Data to Assess Effects on All-Cause Mortality. Nutrients 2026, 18, 1629. https://doi.org/10.3390/nu18101629
