Biomolecules 2012, 2(1), 1-22; doi:10.3390/biom2010001
Article

Exploring the Optimal Strategy to Predict Essential Genes in Microbes

Jingyuan Deng 1,2, Lirong Tan 1,2, Xiaodong Lin 4, Yao Lu 5 and Long J. Lu 1,2,3,*
1 Division of Biomedical Informatics, Cincinnati Children’s Hospital Research Foundation, 3333 Burnet Avenue, Cincinnati, OH 45229-3026, USA
2 Department of Computer Science, School of Computing Sciences and Informatics, University of Cincinnati, 814 Rhodes Hall, Cincinnati, OH 45221-0030, USA
3 Department of Environmental Health, College of Medicine, University of Cincinnati, 231 Albert Sabin Way, Cincinnati, OH 45267-0524, USA
4 Department of Management Science & Information Systems, Rutgers University, 252 Janice H. Levin Hall, Piscataway, NJ 08854, USA
5 Shanghai Institute of Medical Genetics, Shanghai Jiaotong University, 24/1400 Beijing (W) Road, Shanghai 200040, China
* Author to whom correspondence should be addressed.
Received: 11 November 2011; in revised form: 16 December 2011 / Accepted: 19 December 2011 / Published: 27 December 2011
(This article belongs to the Special Issue Feature Papers)
Abstract: Accurately predicting essential genes is important in many areas of biology, medicine and bioengineering. In previous research, we developed a machine-learning-based integrative algorithm to predict essential genes in bacterial species. This algorithm lends itself to two approaches for predicting essential genes: learning the traits of known essential genes in the target organism, or transferring essential gene annotations from a closely related model organism. For an understudied microbe, however, each approach has potential limitations. The first is constrained by the often small number of known essential genes; the second is limited by the availability of model organisms and by evolutionary distance. In this study, we aim to determine the optimal strategy for predicting essential genes by examining four microbes with well-characterized essential genes. Our results suggest that, unless the known essential genes are few, learning from the known essential genes in the target organism usually outperforms transferring essential gene annotations from a related model organism. In fact, the number of known essential genes required to make accurate predictions is surprisingly small. In prokaryotes, when the number of known essential genes exceeds 2% of total genes, this approach already comes close to its optimal performance. In eukaryotes, achieving the same best performance requires the known essential genes to exceed 4% of total genes, reflecting the increased complexity of eukaryotic organisms. Combining the two approaches increased performance when the known essential genes were few. Our investigation thus provides key information for accurately predicting essential genes and will greatly facilitate the annotation of microbial genomes.
Keywords: essential genes; machine learning; annotation
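
The 2% and 4% thresholds quoted in the abstract can be turned into a rough gene count for a given genome. The sketch below is only an illustration of that arithmetic, not the authors' procedure: the function names are hypothetical, and the genome sizes in the usage lines are round illustrative numbers rather than values from the article.

```python
# Percent of total genes that the known essential genes must exceed,
# as quoted in the abstract (prokaryotes: 2%, eukaryotes: 4%).
THRESHOLD_PERCENT = {"prokaryote": 2, "eukaryote": 4}


def min_known_essential_genes(total_genes: int, domain: str) -> int:
    """Smallest number of known essential genes strictly above the threshold.

    Integer arithmetic is used so the result does not depend on
    floating-point rounding.
    """
    if total_genes <= 0:
        raise ValueError("total_genes must be positive")
    percent = THRESHOLD_PERCENT[domain]
    # count > total * percent / 100  <=>  count >= floor(total * percent / 100) + 1
    return total_genes * percent // 100 + 1


def exceeds_threshold(known_essential: int, total_genes: int, domain: str) -> bool:
    """True if the known essential genes exceed the quoted fraction of total genes."""
    return known_essential * 100 > THRESHOLD_PERCENT[domain] * total_genes


if __name__ == "__main__":
    # Illustrative genome sizes only (assumed, not taken from the article).
    print(min_known_essential_genes(4000, "prokaryote"))
    print(min_known_essential_genes(6000, "eukaryote"))
    print(exceeds_threshold(50, 4000, "prokaryote"))
```

Under these assumptions, a microbe whose known essential genes fall below the computed count is the case in which the abstract reports a benefit from combining the two approaches.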

Cite This Article

MDPI and ACS Style

Deng, J.; Tan, L.; Lin, X.; Lu, Y.; Lu, L.J. Exploring the Optimal Strategy to Predict Essential Genes in Microbes. Biomolecules 2012, 2, 1-22.

AMA Style

Deng J, Tan L, Lin X, Lu Y, Lu LJ. Exploring the Optimal Strategy to Predict Essential Genes in Microbes. Biomolecules. 2012; 2(1):1-22.

Chicago/Turabian Style

Deng, Jingyuan; Tan, Lirong; Lin, Xiaodong; Lu, Yao; Lu, Long J. 2012. "Exploring the Optimal Strategy to Predict Essential Genes in Microbes." Biomolecules 2, no. 1: 1-22.

Biomolecules, EISSN 2218-273X, published by MDPI AG, Basel, Switzerland