Open Access Article
ISPRS Int. J. Geo-Inf. 2019, 8(1), 13; https://doi.org/10.3390/ijgi8010013

A Cluster-Based Machine Learning Ensemble Approach for Geospatial Data: Estimation of Health Insurance Status in Missouri

1 Integrated & Applied Sciences: Bioinformatics & Geospatial Biology, College of Arts and Sciences, Saint Louis University, St. Louis, MO 63103, USA
2 Department of Sociology and Anthropology, College of Arts and Sciences, Saint Louis University, St. Louis, MO 63103, USA
3 School for Professional Studies, Saint Louis University, St. Louis, MO 63103, USA
4 Department of Epidemiology and Biostatistics, College for Public Health and Social Justice, Saint Louis University, St. Louis, MO 63103, USA
* Author to whom correspondence should be addressed.
Received: 12 October 2018 / Revised: 19 December 2018 / Accepted: 20 December 2018 / Published: 28 December 2018

Abstract

Mainstream machine learning approaches to predictive analytics consistently prove their ability to perform well using a variety of datasets, although the task of identifying an optimally-performing machine learning approach for any given dataset becomes much less intuitive. Methods such as ensemble and transformation modeling have been developed to improve upon individual base learners and datasets with large degrees of variance. Despite the increased generalizability and flexibility of ensemble approaches, the cost often involves sacrificing inference for predictive ability. This paper introduces an alternative approach to ensemble modeling, combining the predictive ability of an ensemble framework with localized model construction through the incorporation of cluster analysis as a pre-processing technique. The workflow not only outperforms independent base learners and comparative ensemble methods, but also preserves local inferential capability by manipulating cluster parameters and maintaining interpretable relative importance values and non-transformed coefficients for the overall consideration of variable importance. This paper demonstrates the ensemble technique on a dataset to estimate rates of health insurance coverage across the state of Missouri, where the cluster pre-processing assists in understanding both local and global variable importance and interactions when predicting high concentration areas of low health insurance coverage based on demographic, socioeconomic, and geospatial variables.
Keywords: ensemble modeling; machine learning; population health; health insurance status; spatial statistics; variable clustering
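
The idea of clustering as a pre-processing step before local model construction can be illustrated with a small sketch. The abstract does not name the clustering algorithm, the base learners, or how the ensemble is combined, so everything below is an assumption made for illustration: a plain one-dimensional k-means on a hypothetical location value, followed by an ordinary least squares line per cluster on synthetic, noise-free data. It shows only the "cluster, then fit interpretable local models" piece, not the paper's ensemble, and the function names (`kmeans_1d`, `fit_line`, `fit_local_models`, `predict`) are hypothetical.

```python
from statistics import mean


def kmeans_1d(values, k, iters=25):
    """Plain Lloyd-style k-means on scalars (an assumed clustering choice)."""
    ordered = sorted(values)
    # Deterministic start: spread the initial centers across the sorted values.
    centers = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iters):
        # Assign every value to its nearest center.
        labels = [min(range(k), key=lambda c: abs(v - centers[c])) for v in values]
        # Move each center to the mean of its members (skip empty clusters).
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            if members:
                centers[c] = mean(members)
    return labels, centers


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0
    return my - slope * mx, slope


def fit_local_models(locations, xs, ys, k):
    """Cluster on location first, then fit one untransformed line per cluster."""
    labels, centers = kmeans_1d(locations, k)
    models = {}
    for c in range(k):
        cx = [x for x, lab in zip(xs, labels) if lab == c]
        cy = [y for y, lab in zip(ys, labels) if lab == c]
        if cx:
            models[c] = fit_line(cx, cy)
    return centers, models


def predict(location, x, centers, models):
    """Route a new observation to the model of its nearest cluster center."""
    c = min(models, key=lambda c: abs(location - centers[c]))
    intercept, slope = models[c]
    return intercept + slope * x


# Synthetic data: two spatial groups whose predictor has opposite effects.
locations = [i / 10 for i in range(10)] + [10 + i / 10 for i in range(10)]
xs = [float(i) for i in range(10)] * 2
ys = [1 + 2 * x for x in xs[:10]] + [5 - 3 * x for x in xs[10:]]

centers, models = fit_local_models(locations, xs, ys, k=2)
global_model = fit_line(xs, ys)

print("local (intercept, slope) per cluster:", models)
print("global (intercept, slope):", global_model)
```

In this toy setting the per-cluster slopes keep the sign and size of each local effect, while the single global fit averages the two opposing effects together. That is the kind of local versus global contrast the abstract refers to, though the paper's actual workflow may differ in every detail.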
This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited (CC BY 4.0).

Share & Cite This Article

MDPI and ACS Style

Mueller, E.; Sandoval, J.S.O.; Mudigonda, S.; Elliott, M. A Cluster-Based Machine Learning Ensemble Approach for Geospatial Data: Estimation of Health Insurance Status in Missouri. ISPRS Int. J. Geo-Inf. 2019, 8, 13.


Note that from the first issue of 2016, MDPI journals use article numbers instead of page numbers.

ISPRS Int. J. Geo-Inf. EISSN 2220-9964. Published by MDPI AG, Basel, Switzerland.