Open Access Article

Clustering Multivariate Time Series Using Hidden Markov Models

1 School of Computing, Engineering and Mathematics, University of Western Sydney, Campbelltown, NSW 2751, Australia
2 Centre for Health Research, University of Western Sydney, Campbelltown, NSW 2751, Australia
* Author to whom correspondence should be addressed.
Int. J. Environ. Res. Public Health 2014, 11(3), 2741-2763; https://doi.org/10.3390/ijerph110302741
Received: 4 December 2013 / Revised: 6 February 2014 / Accepted: 19 February 2014 / Published: 6 March 2014
(This article belongs to the Special Issue Public Health Informatics)
In this paper we describe an algorithm for clustering multivariate time series whose variables take both categorical and continuous values. Time series of this type are frequent in health care, where they represent the health trajectories of individuals. The problem is challenging because categorical variables make it difficult to define a meaningful distance between trajectories. We propose an approach based on Hidden Markov Models (HMMs): we first map each trajectory into an HMM, then define a suitable distance between HMMs, and finally cluster the HMMs with a method based on a distance matrix. We test our approach on a simulated, but realistic, data set of 1,255 trajectories of individuals aged 45 and over, on a synthetic validation set with known clustering structure, and on a smaller set of 268 trajectories extracted from the longitudinal Health and Retirement Survey. The proposed method can be implemented quite simply using standard packages in R and Matlab. It may be a good candidate for solving the difficult problem of clustering multivariate time series with categorical variables, because it relies on tools that do not require advanced statistical knowledge and are therefore accessible to a wide range of researchers.
Keywords: health trajectory; HMM; clustering
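The last two steps of the pipeline described in the abstract (a distance between HMMs, then clustering from a distance matrix) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the authors' implementation: the HMMs are taken as already fitted to each trajectory (for example with Baum-Welch), they are discrete with strictly positive emission probabilities, the distance is a symmetrized Monte Carlo estimate of the Kullback-Leibler divergence rate (one common choice; the paper's own distance may differ), and the clustering is plain average-linkage agglomeration. The function names and the four toy models are hypothetical.

```python
import math
import random

def sample(hmm, length, rng):
    """Draw one observation sequence from a discrete HMM (pi, A, B)."""
    pi, A, B = hmm
    state = rng.choices(range(len(pi)), weights=pi)[0]
    obs = []
    for _ in range(length):
        obs.append(rng.choices(range(len(B[state])), weights=B[state])[0])
        state = rng.choices(range(len(pi)), weights=A[state])[0]
    return obs

def log_likelihood(hmm, obs):
    """Scaled forward algorithm: returns log P(obs | hmm)."""
    pi, A, B = hmm
    n = len(pi)
    alpha = [pi[i] * B[i][obs[0]] for i in range(n)]
    total = 0.0
    for t in range(len(obs)):
        if t > 0:
            alpha = [sum(alpha[i] * A[i][j] for i in range(n)) * B[j][obs[t]]
                     for j in range(n)]
        scale = sum(alpha)  # positive because emissions are assumed > 0
        total += math.log(scale)
        alpha = [a / scale for a in alpha]
    return total

def kl_rate(p, q, rng, n_seq=5, length=200):
    """Monte Carlo estimate of the KL divergence rate from p to q."""
    acc = 0.0
    for _ in range(n_seq):
        obs = sample(p, length, rng)
        acc += (log_likelihood(p, obs) - log_likelihood(q, obs)) / length
    return acc / n_seq

def distance_matrix(models, rng):
    """Symmetrized KL-rate distance between every pair of HMMs."""
    n = len(models)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = 0.5 * (kl_rate(models[i], models[j], rng)
                       + kl_rate(models[j], models[i], rng))
            dist[i][j] = dist[j][i] = d
    return dist

def cluster(dist, k):
    """Average-linkage agglomerative clustering down to k clusters."""
    clusters = [[i] for i in range(len(dist))]
    while len(clusters) > k:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = sum(dist[i][j] for i in clusters[a] for j in clusters[b])
                d /= len(clusters[a]) * len(clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] += clusters.pop(b)  # b > a, so index a stays valid
    return clusters

# Hypothetical toy models: two "sticky" HMMs and two "switchy" HMMs.
emit = [[0.9, 0.1], [0.1, 0.9]]
models = [
    ([0.5, 0.5], [[0.90, 0.10], [0.10, 0.90]], emit),
    ([0.5, 0.5], [[0.85, 0.15], [0.15, 0.85]], emit),
    ([0.5, 0.5], [[0.10, 0.90], [0.90, 0.10]], emit),
    ([0.5, 0.5], [[0.15, 0.85], [0.85, 0.15]], emit),
]
rng = random.Random(0)  # fixed seed so the sketch is reproducible
dist = distance_matrix(models, rng)
groups = sorted(sorted(g) for g in cluster(dist, 2))
print(groups)
```

Working from a distance matrix rather than from the raw trajectories is what lets categorical and continuous variables be treated uniformly: only the emission model inside each HMM needs to know the variable types.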
MDPI and ACS Style

Ghassempour, S.; Girosi, F.; Maeder, A. Clustering Multivariate Time Series Using Hidden Markov Models. Int. J. Environ. Res. Public Health 2014, 11, 2741-2763. https://doi.org/10.3390/ijerph110302741

