Open Access Article
Improved Doubly Robust Inference with Nonprobability Survey Samples Using Finite Mixture Models: Application to Health Monitoring SMS Survey Data
by Ziying Yang 1, Xu Wang 1, Wenjing Wu 1,2,3 and Jing Gu 1,2,3,4,5,*
1 Department of Medical Statistics, School of Public Health, Sun Yat-sen University, No. 74, Zhongshan Second Road, Guangzhou 510080, China
2 Research Center of Health Informatics, Sun Yat-sen University, Guangzhou 510080, China
3 Sun Yat-sen Global Health Institute, School of Public Health and Institute of State Governance, Sun Yat-sen University, Guangzhou 510080, China
4 Guangzhou Joint Research Center for Disease Surveillance, Early Warning, and Risk Assessment, Guangzhou 510080, China
5 Guangdong Key Laboratory of Health Informatics, Guangzhou 510080, China
* Author to whom correspondence should be addressed.
Mathematics 2026, 14(1), 118; https://doi.org/10.3390/math14010118 (registering DOI)
Submission received: 4 November 2025 / Revised: 21 December 2025 / Accepted: 24 December 2025 / Published: 28 December 2025
Abstract
Nonprobability sampling has been increasingly used in epidemiologic research, yet direct inference based on such samples is subject to selection bias. Current adjustment methods commonly rely on a reference probability-based survey sample that shares a set of covariates with the nonprobability sample. However, these common covariates are often limited, and relying on them alone may bias estimates in the presence of population heterogeneity. Existing methods generally assume population homogeneity in their models and fail to address such heterogeneity adequately. To overcome this limitation, we propose the Nonprobability Heterogeneity-adjusted Doubly Robust (NHDR) method, a novel inference framework that explicitly accounts for population heterogeneity during selection bias adjustment. NHDR proceeds in three stages: (1) identifying latent subpopulations via finite mixture modeling; (2) incorporating the resulting latent-class structure as a grouping variable into mixed-effects models for both the propensity score and outcome projection; and (3) constructing a doubly robust estimator that integrates these adjusted models. The key methodological contribution of NHDR is its formal integration of latent-class-based population structure into a doubly robust estimation framework, which enables more reliable inference under heterogeneous population settings. Simulation studies demonstrate that the proposed method controls the coverage probabilities well in most scenarios. Under heterogeneous conditions, NHDR consistently outperforms existing methods, achieving an average reduction in relative bias of approximately 1.8–4.5% and a corresponding decrease in mean squared error of about 5.1–15.5 compared to the benchmark method. We illustrate the practical utility of NHDR by applying it to estimate nine health indicators using data from the Health Monitoring SMS Survey in Guangzhou, China, with the seventh Guangdong Health Service Survey serving as the reference sample.
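To make stage (3) concrete, the following is a minimal sketch of a generic doubly robust mean estimator of the kind commonly used with a nonprobability sample and a reference probability sample. It is not the authors' NHDR implementation: the function name `doubly_robust_mean` and all argument names are hypothetical, and the propensity scores and outcome predictions are taken as given inputs. In NHDR these inputs would presumably come from the latent-class-informed mixed-effects models of stages (1) and (2), which are not reproduced here.

```python
from typing import Sequence


def doubly_robust_mean(
    y_np: Sequence[float],
    pi_np: Sequence[float],
    m_np: Sequence[float],
    m_ref: Sequence[float],
    w_ref: Sequence[float],
) -> float:
    """Generic doubly robust estimate of a population mean (illustrative only).

    y_np  : observed outcomes in the nonprobability sample
    pi_np : estimated participation propensities for those units (assumed given)
    m_np  : outcome-model predictions for the nonprobability units (assumed given)
    m_ref : outcome-model predictions for the reference-sample units (assumed given)
    w_ref : design weights of the reference-sample units
    """
    if not (len(y_np) == len(pi_np) == len(m_np)):
        raise ValueError("nonprobability-sample inputs must have equal length")
    if len(m_ref) != len(w_ref):
        raise ValueError("reference-sample inputs must have equal length")
    if any(p <= 0.0 or p > 1.0 for p in pi_np):
        raise ValueError("propensities must lie in (0, 1]")

    # Estimated population sizes implied by each sample.
    n_hat_np = sum(1.0 / p for p in pi_np)
    n_hat_ref = sum(w_ref)

    # Inverse-propensity-weighted residuals from the nonprobability sample.
    residual_term = sum(
        (y - m) / p for y, m, p in zip(y_np, m_np, pi_np)
    ) / n_hat_np

    # Design-weighted mean of model predictions over the reference sample.
    projection_term = sum(w * m for w, m in zip(w_ref, m_ref)) / n_hat_ref

    return residual_term + projection_term


if __name__ == "__main__":
    # Toy numbers, invented purely for illustration.
    estimate = doubly_robust_mean(
        y_np=[1.0, 0.0, 1.0],
        pi_np=[0.5, 0.25, 0.5],
        m_np=[0.8, 0.2, 0.6],
        m_ref=[0.5, 0.7],
        w_ref=[100.0, 300.0],
    )
    print(round(estimate, 4))
```

The estimate combines a residual correction from the nonprobability sample with a model-based projection onto the reference sample, so it remains consistent if either the propensity model or the outcome model is correctly specified. When the outcome predictions match the observed outcomes exactly, the residual term vanishes and the estimate reduces to the projection term alone.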