Open Access: This article is freely available.
Learning Genetic Population Structures Using Minimization of Stochastic Complexity
Department of Mathematics and Statistics, University of Helsinki, P.O. Box 68, FIN-00014 University of Helsinki, Finland
Department of Mathematics, Royal Institute of Technology, S-100 44 Stockholm, Sweden
Department of Mathematics, Åbo Akademi University, FIN-20500 Åbo, Finland
* Author to whom correspondence should be addressed.
Received: 21 February 2010; Accepted: 28 April 2010 / Published: 5 May 2010
Abstract: Considerable research effort has been devoted to probabilistic modeling of genetic population structures within the past decade. In particular, a wide spectrum of Bayesian models has been proposed for unlinked molecular marker data from diploid organisms. Here we derive a theoretical framework for learning the genetic population structure of a haploid organism from bi-allelic markers for which potential patterns of dependence are a priori unknown and are to be explicitly incorporated in the model. Our framework is based on the principle of minimizing the stochastic complexity of an unsupervised classification under a tree-augmented factorization of the predictive data distribution. We discuss a fast implementation of the learning framework using deterministic algorithms.
Keywords: factorization of multivariate distributions; finite mixture models; Minimum Description Length; population genetics; statistical learning; structured population
Cite This Article
MDPI and ACS Style
Corander, J.; Gyllenberg, M.; Koski, T. Learning Genetic Population Structures Using Minimization of Stochastic Complexity. Entropy 2010, 12, 1102-1124.