Multiple Imputation of a Continuous Outcome with Fully Observed Predictors Using TabPFN
Abstract
1. Introduction
2. Methods
2.1. Brief Introduction to TabPFN
2.2. Using TabPFN for Multiple Imputation
3. Simulation Study
3.1. Data and Missingness Generation
- Linear: ;
- Additive: ;
- Non-additive: .
- MCAR: ;
- Linear MAR: ;
- Additive MAR: ;
- Non-additive MAR: .
- : number of noise variables;
- : number of observations;
- 3 mean structures: linear, additive, non-additive;
- 4 missingness mechanisms: MCAR, linear MAR, additive MAR, non-additive MAR.
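The factors listed above form a full factorial design, which can be enumerated as in the following minimal sketch. The 3 mean structures and 4 missingness mechanisms are taken from the list. The values of `n_values` and `p_values` are an assumption borrowed from the table later in the document that reports results by method, n and p; they may not be the complete design, and reading p as the number of noise variables is likewise an assumption.

```python
from itertools import product

# Taken from the design description above.
mean_structures = ["linear", "additive", "non-additive"]
mechanisms = ["MCAR", "linear MAR", "additive MAR", "non-additive MAR"]

# ASSUMPTION: values read off a later results table, not confirmed as the full design.
n_values = [100, 200, 400]   # number of observations
p_values = [10, 100]         # assumed to be the number of noise variables

# One dictionary per simulation condition (full factorial crossing).
conditions = [
    {"mean_structure": s, "mechanism": r, "n": n, "p": p}
    for s, r, n, p in product(mean_structures, mechanisms, n_values, p_values)
]

print(len(conditions), "conditions under the assumed n and p values")
```

Whatever the actual values of n and p, each pair is crossed with 3 × 4 = 12 combinations of mean structure and missingness mechanism.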
3.2. Comparator Multiple Imputation Methods
3.3. Performance Evaluation
- Mean and Standard Deviation of Bias: For each simulation repetition, the bias was defined as . For each method and simulation condition, we reported the mean and standard deviation of the resulting biases.
- Mean and Standard Deviation of the Standard Error (SE): The within-imputation standard error of , denoted by , was calculated and summarized using the mean and standard deviation across repetitions.
- Coverage of the 95% Confidence Interval: To assess interval performance, we evaluated whether 10 fell within the 95% confidence interval for . Specifically, we checked whether where denotes the 97.5th percentile of the t-distribution with pooled degrees of freedom according to Barnard and Rubin [44]. Coverage was then defined as the proportion of simulations in which this criterion was met.
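The coverage check described above can be sketched as follows. This is a minimal illustration, not the author's code: it pools m point estimates and their variances with Rubin's rules, computes the Barnard and Rubin small-sample degrees of freedom, and tests whether a true value lies inside the resulting 95% interval. The names `pool_rubin`, `covers` and `t_quantile` are hypothetical, the complete-data degrees of freedom are assumed to be `n_complete - n_params`, and the input numbers are made up. The t quantile is computed with the standard library only so the block is self-contained; in practice a library routine such as `scipy.stats.t.ppf` would normally be used.

```python
import math
from statistics import mean, variance


def t_pdf(x, df):
    """Density of Student's t-distribution; df may be non-integer."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def t_cdf(x, df, steps=2000):
    """CDF for x >= 0 via Simpson's rule on [0, x] (density is symmetric)."""
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3


def t_quantile(p, df):
    """Quantile for p > 0.5, found by bisection on the CDF."""
    lo, hi = 0.0, 1.0
    while t_cdf(hi, df) < p:
        hi *= 2
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def pool_rubin(estimates, variances, n_complete, n_params):
    """Pool m estimates with Rubin's rules and Barnard-Rubin degrees of freedom."""
    m = len(estimates)
    q_bar = mean(estimates)              # pooled point estimate
    u_bar = mean(variances)              # within-imputation variance
    b = variance(estimates)              # between-imputation variance
    total_var = u_bar + (1 + 1 / m) * b
    lam = (1 + 1 / m) * b / total_var    # share of variance due to missingness
    df_com = n_complete - n_params       # ASSUMED complete-data df
    df_obs = (df_com + 1) / (df_com + 3) * df_com * (1 - lam)
    if lam > 0:
        df_old = (m - 1) / lam ** 2
        df = df_old * df_obs / (df_old + df_obs)
    else:
        df = df_obs                      # no between-imputation variance
    half = t_quantile(0.975, df) * math.sqrt(total_var)
    return {"estimate": q_bar, "total_var": total_var, "df": df,
            "lower": q_bar - half, "upper": q_bar + half}


def covers(pooled, true_value):
    """True if the pooled 95% interval contains the true value."""
    return pooled["lower"] <= true_value <= pooled["upper"]


# Made-up inputs for one simulation repetition with m = 5 imputations.
pooled = pool_rubin([9.8, 10.1, 10.3, 9.9, 10.0], [0.04] * 5,
                    n_complete=100, n_params=2)
print(pooled, covers(pooled, 10))
```

Coverage for a method and condition is then the mean of `covers(...)` over all repetitions.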
3.4. Secondary Simulation: Sensitivity to the Number of Imputations (m)
4. Results
5. Case Study: Estimating the Effect of Smoking Cessation on Weight Gain
6. Discussion
7. Conclusions
Supplementary Materials
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Appendix A
| Stratified by Smoking | Non-Quitters | Quitters |
|---|---|---|
| n | 363 | 127 |
| Weight change [kg] (mean (SD)) | 2.09 (7.43) | 4.54 (8.68) |
| Weight in 1971 [kg] (mean (SD)) | 72.30 (14.67) | 74.41 (14.46) |
| Sex = Male (%) | 205 (56.5) | 80 (63.0) |
| Race = White (%) | 304 (83.7) | 113 (89.0) |
| Age in 1971 (mean (SD)) | 42.50 (11.95) | 46.06 (12.67) |
| Amount of education (%) | ||
| 8th grade or less | 58 (16.0) | 23 (18.1) |
| College dropout | 22 (6.1) | 12 (9.4) |
| College or more | 36 (9.9) | 20 (15.7) |
| Highschool | 161 (44.4) | 53 (41.7) |
| Highschool dropout | 86 (23.7) | 19 (15.0) |
| Number of cigarettes smoked per day (mean (SD)) | 20.73 (11.26) | 17.86 (12.27) |
| Years of smoking (mean (SD)) | 23.94 (11.99) | 25.49 (13.57) |
| In recreation, how much exercise? (%) | ||
| Little or no exercise | 142 (39.1) | 61 (48.0) |
| Moderate exercise | 137 (37.7) | 46 (36.2) |
| Much exercise | 84 (23.1) | 20 (15.7) |
| In your usual day, how active are you? (%) | ||
| Inactive | 29 (8.0) | 9 (7.1) |
| Moderately active | 145 (39.9) | 56 (44.1) |
| Very active | 189 (52.1) | 62 (48.8) |
| Total family income [USD] (%) | ||
| <1000 | 3 (0.8) | 2 (1.6) |
| 1000–1999 | 16 (4.4) | 2 (1.6) |
| 10,000–14,999 | 90 (24.8) | 41 (32.3) |
| 15,000–19,999 | 54 (14.9) | 17 (13.4) |
| 2000–2999 | 17 (4.7) | 3 (2.4) |
| 20,000–24,999 | 26 (7.2) | 6 (4.7) |
| 25,000+ | 15 (4.1) | 7 (5.5) |
| 3000–3999 | 11 (3.0) | 7 (5.5) |
| 4000–4999 | 15 (4.1) | 3 (2.4) |
| 5000–5999 | 15 (4.1) | 3 (2.4) |
| 6000–6999 | 12 (3.3) | 3 (2.4) |
| 7000–9999 | 89 (24.5) | 33 (26.0) |
| Marital status (%) | ||
| Divorced | 23 (6.3) | 6 (4.7) |
| Married | 305 (84.0) | 110 (86.6) |
| Never married | 21 (5.8) | 8 (6.3) |
| Separated | 14 (3.9) | 3 (2.4) |
| Highest grade of regular school ever (mean (SD)) | 11.23 (2.81) | 11.59 (2.98) |
| Height [cm] (mean (SD)) | 169.76 (8.84) | 170.70 (8.77) |
| Asthma = Never (%) | 353 (97.2) | 121 (95.3) |
| Chronic Bronchitis/Emphysema = Never (%) | 342 (94.2) | 120 (94.5) |
| Tuberculosis = Never (%) | 360 (99.2) | 126 (99.2) |
| Heart failure = Never (%) | 362 (99.7) | 127 (100.0) |
| High blood pressure = Never (%) | 321 (88.4) | 105 (82.7) |
| Peptic ulcer = Never (%) | 333 (91.7) | 110 (86.6) |
| Colitis = Never (%) | 352 (97.0) | 123 (96.9) |
| Hepatitis = Never (%) | 361 (99.4) | 127 (100.0) |
| Chronic cough = Never (%) | 347 (95.6) | 124 (97.6) |
| Hay fever = Never (%) | 340 (93.7) | 117 (92.1) |
| Diabetes = Never (%) | 361 (99.4) | 126 (99.2) |
| Polio = Never (%) | 359 (98.9) | 125 (98.4) |
| Malignant tumor/growth = Never (%) | 355 (97.8) | 124 (97.6) |
| Nervous breakdown = Never (%) | 356 (98.1) | 127 (100.0) |
| Have you had 1 drink past year? = Ever (%) | 363 (100.0) | 127 (100.0) |
| How often do you drink? (%) | ||
| <12 times/year | 42 (11.6) | 14 (11.0) |
| 1–4 times/month | 152 (41.9) | 63 (49.6) |
| 2–3 times/week | 68 (18.7) | 16 (12.6) |
| Almost every day | 101 (27.8) | 34 (26.8) |
| Which do you most frequently drink? (%) | ||
| Beer | 194 (53.4) | 58 (45.7) |
| Liquor | 144 (39.7) | 59 (46.5) |
| Wine | 25 (6.9) | 10 (7.9) |
| When you drink, how much do you drink? (mean (SD)) | 3.28 (3.53) | 3.03 (2.62) |
| Do you eat dirt or clay, starch or other non standard food? = Never (%) | 359 (98.9) | 126 (99.2) |
| Use headache medication = Never (%) | 138 (38.0) | 59 (46.5) |
| Use other pains medication = Never (%) | 282 (77.7) | 100 (78.7) |
| Use weak heart medication = Never (%) | 357 (98.3) | 124 (97.6) |
| Use allergies medication = Never (%) | 348 (95.9) | 123 (96.9) |
| Use nerves medication = Never (%) | 321 (88.4) | 108 (85.0) |
| Use lack of pep medication = Never (%) | 342 (94.2) | 123 (96.9) |
| Use high blood pressure medication = Never (%) | 348 (95.9) | 119 (93.7) |
| Use bowel trouble medication = Never (%) | 325 (89.5) | 108 (85.0) |
| Use weight loss medication = Never (%) | 357 (98.3) | 126 (99.2) |
| Use infection medication = Never (%) | 311 (85.7) | 111 (87.4) |
| Serum cholesterol [mg/100 mL] (mean (SD)) | 218.30 (45.21) | 228.03 (48.67) |
| Avg tobacco price in state of residence [USD] (mean (SD)) | 2.17 (0.22) | 2.14 (0.21) |
| Tobacco tax in state of residence (mean (SD)) | 1.09 (0.21) | 1.06 (0.21) |


References
1. Committee for Medicinal Products for Human Use (CHMP). EMA Guideline on Missing Data in Confirmatory Clinical Trials (EMA/CPMP/EWP/1776/99). 2010. Available online: https://www.ema.europa.eu/en/documents/scientific-guideline/guideline-missing-data-confirmatory-clinical-trials_en.pdf (accessed on 11 October 2025).
2. Little, R.J.A.; Rubin, D.B. Statistical Analysis with Missing Data, 3rd ed.; John Wiley & Sons, Inc.: Hoboken, NJ, USA, 2019.
3. Alabadla, M.; Sidi, F.; Ishak, I.; Ibrahim, H.; Affendey, L.S.; Che Ani, Z.; Jabar, M.A.; Bukar, U.A.; Devaraj, N.K.; Muda, A.S.; et al. Systematic Review of Using Machine Learning in Imputing Missing Values. IEEE Access 2022, 10, 44483–44502.
4. Li, J.; Yan, X.S.; Chaudhary, D.; Avula, V.; Mudiganti, S.; Husby, H.; Shahjouei, S.; Afshar, A.; Stewart, W.F.; Yeasin, M.; et al. Imputation of missing values for electronic health record laboratory data. npj Digit. Med. 2021, 4, 147.
5. Buczak, P.; Chen, J.J.; Pauly, M. Analyzing the Effect of Imputation on Classification Performance under MCAR and MAR Missing Mechanisms. Entropy 2023, 25, 521.
6. Kazijevs, M.; Samad, M.D. Deep imputation of missing values in time series health data: A review with benchmarking. J. Biomed. Inform. 2023, 144, 104440.
7. Liu, M.; Li, S.; Yuan, H.; Ong, M.E.H.; Ning, Y.; Xie, F.; Saffari, S.E.; Shang, Y.; Volovici, V.; Chakraborty, B.; et al. Handling missing values in healthcare data: A systematic review of deep learning-based imputation techniques. Artif. Intell. Med. 2023, 142, 102587.
8. Varol, B.; Omurlu, I.K.; Ture, M. Review of Simulation Studies Evaluating Imputation Methods in High-Dimensional Datasets. WIREs Comput. Stat. 2025, 17, e70044.
9. Thurow, M.; Dumpert, F.; Ramosaj, B.; Pauly, M. Imputing missings in official statistics for general tasks–our vote for distributional accuracy. Stat. J. IAOS 2021, 37, 1379–1390.
10. Schwerter, J.; Gurtskaia, K.; Romero, A.; Zeyer-Gliozzo, B.; Pauly, M. Evaluating tree-based imputation methods as an alternative to MICE PMM for drawing inference in empirical studies. arXiv 2024, arXiv:2401.09602.
11. Grzesiak, K.; Muller, C.; Josse, J.; Näf, J. Do we Need Dozens of Methods for Real World Missing Value Imputation? arXiv 2025, arXiv:2511.04833.
12. Rubin, D.B. Multiple Imputation for Nonresponse in Surveys; John Wiley & Sons, Inc.: Hoboken, NJ, USA, 1987.
13. Seaman, S.R.; Bartlett, J.W.; White, I.R. Multiple imputation of missing covariates with non-linear effects and interactions: An evaluation of statistical methods. BMC Med. Res. Methodol. 2012, 12, 46.
14. Curnow, E.; Carpenter, J.R.; Heron, J.E.; Cornish, R.P.; Rach, S.; Didelez, V.; Langeheine, M.; Tilling, K. Multiple imputation of missing data under missing at random: Compatible imputation models are not sufficient to avoid bias if they are mis-specified. J. Clin. Epidemiol. 2023, 160, 100–109.
15. van Buuren, S.; Groothuis-Oudshoorn, K. mice: Multivariate Imputation by Chained Equations in R. J. Stat. Softw. 2011, 45, 1–67.
16. van Buuren, S. Flexible Imputation of Missing Data; Chapman & Hall/CRC: Boca Raton, FL, USA, 2018.
17. Meng, X.L. Multiple-imputation inferences with uncongenial sources of input. Stat. Sci. 1994, 9, 538–558.
18. White, I.R.; Royston, P.; Wood, A.M. Multiple imputation using chained equations: Issues and guidance for practice. Stat. Med. 2010, 30, 377–399.
19. Breiman, L.; Friedman, J.H.; Olshen, R.A.; Stone, C.J. Classification and Regression Trees; Chapman & Hall/CRC: Boca Raton, FL, USA, 2017.
20. Breiman, L. Random Forests. Mach. Learn. 2001, 45, 5–32.
21. Chen, T.; Guestrin, C. XGBoost: A Scalable Tree Boosting System. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining; ACM: New York, NY, USA, 2016; KDD ’16.
22. Burgette, L.F.; Reiter, J.P. Multiple Imputation for Missing Data via Sequential Regression Trees. Am. J. Epidemiol. 2010, 172, 1070–1076.
23. Shah, A.D.; Bartlett, J.W.; Carpenter, J.; Nicholas, O.; Hemingway, H. Comparison of Random Forest and Parametric Imputation Models for Imputing Missing Data Using MICE: A CALIBER Study. Am. J. Epidemiol. 2014, 179, 764–774.
24. Doove, L.; Van Buuren, S.; Dusseldorp, E. Recursive partitioning for missing data imputation in the presence of interaction effects. Comput. Stat. Data Anal. 2014, 72, 92–104.
25. Deng, Y.; Lumley, T. Multiple Imputation Through XGBoost. J. Comput. Graph. Stat. 2023, 33, 352–363.
26. Carpenito, T.; Manjourides, J. MISL: Multiple imputation by super learning. Stat. Methods Med. Res. 2022, 31, 1904–1915.
27. Sepin, J. Multiple imputation using multivariate adaptive regression splines. Res. Sq. 2025, 1–15.
28. van der Ploeg, T.; Austin, P.C.; Steyerberg, E.W. Modern modelling techniques are data hungry: A simulation study for predicting dichotomous endpoints. BMC Med. Res. Methodol. 2014, 14, 137.
29. Riley, R.D.; Ensor, J.; Snell, K.I.E.; Harrell, F.E.; Martin, G.P.; Reitsma, J.B.; Moons, K.G.M.; Collins, G.; van Smeden, M. Calculating the sample size required for developing a clinical prediction model. BMJ 2020, 368, m441.
30. Romaniuk, M.; Grzegorzewski, P. Fuzzy data imputation with DIMP and FGAIN. J. Comput. Sci. 2026, 93, 102738.
31. Little, R.J.A.; D’Agostino, R.; Cohen, M.L.; Dickersin, K.; Emerson, S.S.; Farrar, J.T.; Frangakis, C.; Hogan, J.W.; Molenberghs, G.; Murphy, S.A.; et al. The Prevention and Treatment of Missing Data in Clinical Trials. N. Engl. J. Med. 2012, 367, 1355–1360.
32. Ye, H.J.; Liu, S.Y.; Chao, W.L. A Closer Look at TabPFN v2: Understanding Its Strengths and Extending Its Capabilities. arXiv 2025, arXiv:2502.17361.
33. Hollmann, N.; Müller, S.; Purucker, L.; Krishnakumar, A.; Körfer, M.; Hoo, S.B.; Schirrmeister, R.T.; Hutter, F. Accurate predictions on small data with a tabular foundation model. Nature 2025, 637, 319–326.
34. Erickson, N.; Purucker, L.; Tschalzev, A.; Holzmüller, D.; Desai, P.M.; Salinas, D.; Hutter, F. TabArena: A Living Benchmark for Machine Learning on Tabular Data. arXiv 2025, arXiv:2506.16791.
35. Grinsztajn, L.; Flöge, K.; Key, O.; Birkel, F.; Jund, P.; Roof, B.; Jäger, B.; Safaric, D.; Alessi, S.; Hayler, A.; et al. TabPFN-2.5: Advancing the State of the Art in Tabular Foundation Models. arXiv 2025, arXiv:2511.08667.
36. Feitelberg, J.; Saha, D.; Choi, K.; Ahmad, Z.; Agarwal, A.; Dwivedi, R. TabImpute: Accurate and Fast Zero-Shot Missing-Data Imputation with a Pre-Trained Transformer. arXiv 2025, arXiv:2510.02625.
37. Jakobsen, J.C.; Gluud, C.; Wetterslev, J.; Winkel, P. When and how should multiple imputation be used for handling missing data in randomised clinical trials—A practical guide with flowcharts. BMC Med. Res. Methodol. 2017, 17, 162.
38. Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is all you need. In Proceedings of the Advances in Neural Information Processing Systems, Long Beach, CA, USA, 4–9 December 2017; Volume 30.
39. Hollmann, N.; Müller, S.; Eggensperger, K.; Hutter, F. Tabpfn: A transformer that solves small tabular classification problems in a second. arXiv 2022, arXiv:2207.01848.
40. Müller, S.; Hollmann, N.; Arango, S.P.; Grabocka, J.; Hutter, F. Transformers can do bayesian inference. arXiv 2021, arXiv:2112.10510.
41. Little, R.J.A.; An, H. Robust likelihood-based analysis of multivariate data with missing values. Stat. Sin. 2004, 14, 949–968.
42. Oberman, H.I.; Vink, G. Toward a standardized evaluation of imputation methodology. Biom. J. 2023, 66, 2200107.
43. Little, R.J.A. Missing-Data Adjustments in Large Surveys. J. Bus. Econ. Stat. 1988, 6, 287–296.
44. Barnard, J.; Rubin, D.B. Miscellanea. Small-sample degrees of freedom with multiple imputation. Biometrika 1999, 86, 948–955.
45. Wilson, E.B. Probable Inference, the Law of Succession, and Statistical Inference. J. Am. Stat. Assoc. 1927, 22, 209–212.
46. Hernán, M.A.; Robins, J.M. Causal Inference: What If; Chapman & Hall/CRC: Boca Raton, FL, USA, 2020.
47. Westreich, D.; Edwards, J.K.; Cole, S.R.; Platt, R.W.; Mumford, S.L.; Schisterman, E.F. Imputation approaches for potential outcomes in causal inference. Int. J. Epidemiol. 2015, 44, 1731–1737.
48. Morris, T.P.; White, I.R.; Royston, P. Tuning multiple imputation by predictive mean matching and local residual draws. BMC Med. Res. Methodol. 2014, 14, 75.
49. Mayer, M. missRanger: Fast Imputation of Missing Values. CRAN Contrib. Packag. 2017.
50. Franklin, J.M.; Schneeweiss, S.; Polinski, J.M.; Rassen, J.A. Plasmode simulation for the evaluation of pharmacoepidemiologic methods in complex healthcare databases. Comput. Stat. Data Anal. 2014, 72, 219–226.
51. Schreck, N.; Slynko, A.; Saadati, M.; Benner, A. Statistical plasmode simulations–Potentials, challenges and recommendations. Stat. Med. 2024, 43, 1804–1825.
52. Murray, J.S. Multiple Imputation: A Review of Practical and Theoretical Findings. Stat. Sci. 2018, 33, 142–159.
53. Neal, B.; Huang, C.W.; Raghupathi, S. RealCause: Realistic Causal Inference Benchmarking. arXiv 2020, arXiv:2011.15007.
54. Qu, J.; Holzmüller, D.; Varoquaux, G.; Morvan, M.L. TabICLv2: A better, faster, scalable, and open tabular foundation model. arXiv 2026, arXiv:2602.11139.



| Method | n | p = 10 | p = 100 |
|---|---|---|---|
| cart | 100 | 0.324 | 1.072 |
| cart | 200 | 0.373 | 1.281 |
| cart | 400 | 0.464 | 1.769 |
| pmm | 100 | 0.208 | 0.688 |
| pmm | 200 | 0.218 | 0.708 |
| pmm | 400 | 0.226 | 0.738 |
| tabpfn | 100 | 14.081 | 18.071 |
| tabpfn | 200 | 14.219 | 18.483 |
| tabpfn | 400 | 14.510 | 19.607 |

| Method | Estimate | Std. Error | Lower 95% | Upper 95% |
|---|---|---|---|---|
| pmm | 2.363 | 0.834 | 0.728 | 3.998 |
| cart | 3.419 | 0.581 | 2.280 | 4.558 |
| tabpfn | 2.746 | 0.568 | 1.633 | 3.859 |
| g-computation | 3.380 | 0.853 | 1.708 | 5.053 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2026 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Sepin, J. Multiple Imputation of a Continuous Outcome with Fully Observed Predictors Using TabPFN. Stats 2026, 9, 38. https://doi.org/10.3390/stats9020038

