Open Access: This article is freely available.
Evolution and Quantitative Comparison of Genome-Wide Protein Domain Distributions
Computational EvoDevo Group, Department of Computer Science, University of Leipzig, Härtelstraße 16–18, D-04107 Leipzig, Germany
Interdisciplinary Center for Bioinformatics, University of Leipzig, Härtelstraße 16–18, D-04107 Leipzig, Germany
Bioinformatics Group, Department of Computer Science, University of Leipzig, Härtelstraße 16–18, D-04107 Leipzig, Germany
Max Planck Institute for Mathematics in the Sciences, Inselstrasse 22, D-04103 Leipzig, Germany
Fraunhofer Institut für Zelltherapie und Immunologie—IZI Perlickstraße 1, D-04103 Leipzig, Germany
Department of Theoretical Chemistry, University of Vienna, Währingerstraße 17, A-1090 Wien, Austria
Center for non-coding RNA in Technology and Health, University of Copenhagen, Grønnegårdsvej 3, DK-1870 Frederiksberg C, Denmark
Santa Fe Institute, 1399 Hyde Park Rd., Santa Fe, NM 87501, USA
* Author to whom correspondence should be addressed.
Received: 29 August 2011; in revised form: 7 October 2011 / Accepted: 25 October 2011 / Published: 9 November 2011
Abstract: The metabolic and regulatory capabilities of an organism are implicit in its protein content. This content is often hard to estimate, however, due to ascertainment biases inherent in the available genome annotations. The organism's complement of recognizable functional protein domains and their combinations conveys essentially the same information and is at the same time much more readily accessible, although protein domain models trained for one phylogenetic group frequently fail on distantly related sequences. Pooling related domain models based on their GO annotation, in combination with de novo gene prediction methods, provides estimates that seem to be less affected by phylogenetic biases. We show here for 18 diverse representatives from all eukaryotic kingdoms that a pooled analysis of the tendencies for co-occurrence or avoidance of protein domains is indeed feasible. This type of analysis can reveal general large-scale patterns in domain co-occurrence and helps to identify lineage-specific variations in the evolution of protein domains. Somewhat surprisingly, we do not find strong ubiquitous patterns governing the evolutionary behavior of specific functional classes. Instead, there are strong variations between the major groups of Eukaryotes, pointing to systematic differences in their evolutionary constraints.
Keywords: protein domains; HMM models; GO classification; functional genome annotation; Eukarya
Cite This Article
MDPI and ACS Style
Parikesit, A.A.; Stadler, P.F.; Prohaska, S.J. Evolution and Quantitative Comparison of Genome-Wide Protein Domain Distributions. Genes 2011, 2, 912-924.
AMA Style
Parikesit AA, Stadler PF, Prohaska SJ. Evolution and Quantitative Comparison of Genome-Wide Protein Domain Distributions. Genes. 2011; 2(4):912-924.
Chicago/Turabian Style
Parikesit, Arli A.; Stadler, Peter F.; Prohaska, Sonja J. 2011. "Evolution and Quantitative Comparison of Genome-Wide Protein Domain Distributions." Genes 2, no. 4: 912-924.