This is an early access version; the complete PDF, HTML, and XML versions will be available soon.
Article

Exploring a Diagnostic Test for Missingness at Random

by Dominick Sutton 1,*, Anahid Basiri 1 and Ziqi Li 2
1 School of Geographical & Earth Sciences, University of Glasgow, Glasgow G12 8QQ, UK
2 Department of Geography, Florida State University, Tallahassee, FL 32306, USA
* Author to whom correspondence should be addressed.
Mathematics 2025, 13(11), 1728; https://doi.org/10.3390/math13111728
Submission received: 28 February 2025 / Revised: 12 May 2025 / Accepted: 22 May 2025 / Published: 23 May 2025
(This article belongs to the Special Issue Statistical Research on Missing Data and Applications)

Abstract

Missing data remain a challenge for researchers and decision-makers because of their impact on analytical accuracy and uncertainty estimation. Many studies on missing data rest on assumptions about randomness, but randomness itself is problematic. This makes it difficult to identify missing data mechanisms and limits how effectively the impact of missing data can be minimized. The purpose of this paper is to examine a potentially simple test to diagnose whether data are missing at random. The test is developed using an extended taxonomy of missing data mechanisms. A key aspect of the approach is the use of single mean imputation to handle missing data in the test development dataset. Changing this to random imputation from the same underlying distribution, however, has a negative impact on the diagnosis. This is aggravated by the possibility of high inter-variable correlation, confounding, and mixed missing data mechanisms. The verification step uses a high-quality real-world dataset and finds some evidence in one case that the data may be missing at random, but the evidence is less persuasive in the second case. Confidence in these results, however, is limited by the potential influence of correlation, confounding, and mixed missingness. This paper concludes with a discussion of the test's merits and finds that sufficient uncertainties remain to render it unreliable, even if the initial results appear promising.
Keywords: missing data; survey nonresponse; generalized linear model; verification data; missing at random test; extended missing data taxonomy; confounding
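
The abstract contrasts single mean imputation with random imputation from the same underlying distribution. The following is a minimal sketch of that contrast only, not the authors' diagnostic test. The function names, the simulated variable, the 30% missingness rate, and the choice of a normal distribution fitted to the observed values are all assumptions made for illustration.

```python
import numpy as np


def mean_impute(x):
    """Replace every NaN with the mean of the observed values."""
    filled = x.copy()
    filled[np.isnan(x)] = np.nanmean(x)
    return filled


def random_impute(x, rng):
    """Replace each NaN with a draw from a normal fitted to the observed values.

    Assumption: the 'same underlying distribution' is approximated here by a
    normal with the observed mean and standard deviation.
    """
    filled = x.copy()
    missing = np.isnan(x)
    observed = x[~missing]
    filled[missing] = rng.normal(
        observed.mean(), observed.std(ddof=1), size=missing.sum()
    )
    return filled


rng = np.random.default_rng(0)

# Hypothetical variable with roughly 30% of values removed at random.
x = rng.normal(50.0, 10.0, size=1000)
x[rng.random(1000) < 0.3] = np.nan

x_mean = mean_impute(x)
x_rand = random_impute(x, rng)

# Mean imputation places all filled values at one point, which shrinks the spread;
# random imputation spreads the filled values across the fitted distribution.
print("observed sd:      ", np.nanstd(x))
print("mean-imputed sd:  ", x_mean.std())
print("random-imputed sd:", x_rand.std())
```

Under these assumptions, mean imputation leaves the variable's mean unchanged but reduces its variance, whereas random imputation adds noise at the imputed positions. This difference is one plausible reason a diagnostic built on one scheme could behave differently under the other.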

