Open Access Article

Consequences of Model Misspecification for Maximum Likelihood Estimation with Missing Data

1 School of Behavioral and Brain Sciences, GR4.1, 800 W. Campbell Rd., University of Texas at Dallas, Richardson, TX 75080, USA
2 Martingale Research Corporation, 101 E. Park Blvd., Suite 600, Plano, TX 75074, USA
3 Department of Medicine, Loma Linda University School of Medicine, Loma Linda, CA 92357, USA
4 Center for Advanced Statistics in Education, VA Loma Linda Healthcare System, Loma Linda, CA 92357, USA
5 Department of Economics, University of California San Diego, La Jolla, CA 92093, USA
6 Office of Academic Affiliations (10X1), Department of Veterans Affairs, 810 Vermont Ave. NW, Washington, DC 20420, USA
* Author to whom correspondence should be addressed.
Halbert White sadly passed away before this article was published.
Econometrics 2019, 7(3), 37; https://doi.org/10.3390/econometrics7030037
Received: 22 October 2018 / Revised: 4 June 2019 / Accepted: 29 August 2019 / Published: 5 September 2019
Researchers are often faced with the challenge of developing statistical models with incomplete data. Exacerbating this situation is the possibility that either the researcher’s complete-data model or the model of the missing-data mechanism is misspecified. In this article, we create a formal theoretical framework for developing statistical models and detecting model misspecification in the presence of incomplete data where maximum likelihood estimates are obtained by maximizing the observable-data likelihood function when the missing-data mechanism is assumed ignorable. First, we provide sufficient regularity conditions on the researcher’s complete-data model to characterize the asymptotic behavior of maximum likelihood estimates in the simultaneous presence of both missing data and model misspecification. These results are then used to derive robust hypothesis testing methods for possibly misspecified models in the presence of Missing at Random (MAR) or Missing Not at Random (MNAR) missing data. Second, we introduce a method for the detection of model misspecification in missing data problems using recently developed Generalized Information Matrix Tests (GIMT). Third, we identify regularity conditions for the Missing Information Principle (MIP) to hold in the presence of model misspecification so as to provide useful computational covariance matrix estimation formulas. Fourth, we provide regularity conditions that ensure the observable-data expected negative log-likelihood function is convex in the presence of partially observable data when the amount of missingness is sufficiently small and the complete-data likelihood is convex. Fifth, we show that when the researcher has correctly specified a complete-data model with a convex negative likelihood function and an ignorable missing-data mechanism, then its strict local minimizer is the true parameter value for the complete-data model when the amount of missingness is sufficiently small. 
Our results thus provide new robust estimation, inference, and specification analysis methods for developing statistical models with incomplete data.
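As a rough illustration of the kind of robust (sandwich) covariance estimation the abstract refers to, the sketch below fits a deliberately misspecified toy model by maximizing the observable-data likelihood under an ignorable missing-data mechanism. Everything in it is an assumption made for illustration and is not taken from the article: the toy model (independent unit-variance normals with unknown means), the simulated data (correlated, variance 4), the use of completely random missingness as the simplest ignorable case, and the function name `sandwich_covariance`.

```python
import numpy as np

def sandwich_covariance(x, observed):
    """Toy sketch (hypothetical helper, not the article's method).

    x        : (n, d) array of data
    observed : (n, d) boolean mask, True where the entry is observed

    Assumed model: components are independent N(mu_j, 1).
    Under ignorability, each record contributes only its observed
    components to the negative log-likelihood:
        0.5 * sum_{j observed} (x_ij - mu_j)^2
    """
    n = x.shape[0]
    # MLE of each mean: the average of the observed entries.
    mu_hat = np.where(observed, x, 0.0).sum(axis=0) / observed.sum(axis=0)
    # Per-record gradient of the negative log-likelihood at mu_hat.
    grads = -np.where(observed, x - mu_hat, 0.0)
    # A: average Hessian; B: average outer product of gradients.
    A = np.diag(observed.mean(axis=0))
    B = grads.T @ grads / n
    A_inv = np.linalg.inv(A)
    robust_cov = A_inv @ B @ A_inv / n   # sandwich estimator
    naive_cov = A_inv / n                # valid only if the model is correct
    return mu_hat, robust_cov, naive_cov

# Simulated data that violate the assumed model (variance 4, correlated).
rng = np.random.default_rng(0)
n = 2000
true_cov = np.array([[4.0, 1.5], [1.5, 4.0]])
x = rng.multivariate_normal([1.0, -1.0], true_cov, size=n)
# Roughly 20% of entries missing completely at random (an ignorable mechanism).
observed = rng.random((n, 2)) > 0.2

mu_hat, robust_cov, naive_cov = sandwich_covariance(x, observed)
print("estimated means:", mu_hat)
print("robust standard errors:", np.sqrt(np.diag(robust_cov)))
print("naive standard errors: ", np.sqrt(np.diag(naive_cov)))
```

In this toy setting the robust standard errors come out larger than the naive ones, because the assumed unit variance understates the spread of the data. The two estimators would agree asymptotically only if the assumed model were correct.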
Keywords: asymptotic theory; ignorable; Generalized Information Matrix Test; misspecification; missing data; nonignorable; sandwich estimator; specification analysis
MDPI and ACS Style

Golden, R.M.; Henley, S.S.; White, H.; Kashner, T.M. Consequences of Model Misspecification for Maximum Likelihood Estimation with Missing Data. Econometrics 2019, 7, 37.

