Open Access Article

An Integrated Approach for Making Inference on the Number of Clusters in a Mixture Model

1
Instituto de Matemática, Universidade Federal de Mato Grosso do Sul, Campo Grande 79070-900, Brazil
2
Departamento de Matemática Aplicada e Estatística, Universidade de São Paulo, São Carlos 13566-590, Brazil
3
Departamento de Estatística, Universidade Federal de São Carlos, São Carlos 13565-905, Brazil
4
Instituto de Matemática e Estatística, Universidade de São Paulo, São Paulo 05508-090, Brazil
*
Author to whom correspondence should be addressed.
Entropy 2019, 21(11), 1063; https://doi.org/10.3390/e21111063
Received: 23 September 2019 / Revised: 20 October 2019 / Accepted: 26 October 2019 / Published: 30 October 2019
(This article belongs to the Special Issue Data Science: Measuring Uncertainties)
This paper presents an integrated approach for estimating the parameters of a mixture model in the context of data clustering. The method is designed to estimate the unknown number of clusters from the observed data. To do so, we marginalize out the weights to obtain allocation probabilities that depend on the number of clusters but not on the number of components of the mixture model. As an alternative to the stochastic expectation maximization (SEM) algorithm, we propose the integrated stochastic expectation maximization (ISEM) algorithm, which, in contrast to SEM, does not require the number of components of the mixture to be specified a priori. With this algorithm, the parameters of clusters containing at least two observations are estimated via local maximization of the likelihood function. In addition, at each iteration of the algorithm, there is a positive probability that a new cluster is created by a single observation. Using simulated datasets, we compare the performance of the ISEM algorithm against both the SEM and reversible jump (RJ) algorithms. The results show that ISEM outperforms both SEM and RJ. We also report the performance of the three algorithms on two real datasets.
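The abstract does not state the exact form of the allocation probabilities, so the following is only a minimal sketch of the general idea, not the authors' ISEM algorithm. It assumes a Chinese-restaurant-process-style form, which is what marginalizing the weights under a Dirichlet-process-type prior yields: an observation joins an existing cluster with probability proportional to the cluster size times the cluster density, and opens a new cluster with a small positive probability. The one-dimensional Gaussian clusters and the names `gamma`, `prior_mean`, and `prior_sd` are hypothetical choices made for illustration.

```python
import math
import random


def normal_pdf(y, mean, sd):
    """Density of a univariate normal distribution at y."""
    z = (y - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def allocation_probs(y, clusters, gamma=1.0, prior_mean=0.0, prior_sd=10.0):
    """Allocation probabilities for observation y with the weights integrated out.

    clusters: list of (size, mean, sd) tuples describing the current clusters,
              with y itself excluded from the sizes.
    gamma, prior_mean, prior_sd: hypothetical tuning values (assumptions).

    Returns one probability per existing cluster, followed by the
    probability of opening a new cluster that contains only y.
    """
    # Existing cluster j: proportional to n_j * f(y | mean_j, sd_j).
    weights = [n * normal_pdf(y, m, s) for (n, m, s) in clusters]
    # New cluster: proportional to gamma * a diffuse reference density,
    # which is strictly positive, so a new cluster can always be created.
    weights.append(gamma * normal_pdf(y, prior_mean, prior_sd))
    total = sum(weights)
    return [w / total for w in weights]


def sample_allocation(y, clusters, rng, **kwargs):
    """Stochastic allocation step: draw a cluster index for y.

    An index equal to len(clusters) means "create a new cluster".
    """
    probs = allocation_probs(y, clusters, **kwargs)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


if __name__ == "__main__":
    current = [(5, 0.0, 1.0), (5, 10.0, 1.0)]  # (size, mean, sd)
    rng = random.Random(0)
    print(allocation_probs(0.2, current))
    print(sample_allocation(0.2, current, rng))
```

Note that the probabilities depend only on the clusters that currently exist, not on a fixed number of mixture components, which is the property the abstract attributes to the integrated approach.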
Keywords: model-based clustering; mixture model; EM algorithm; integrated approach

MDPI and ACS Style

Saraiva, E.F.; Suzuki,  .K.; Milan, L.A.; Pereira, C.B. An Integrated Approach for Making Inference on the Number of Clusters in a Mixture Model. Entropy 2019, 21, 1063.
