Open Access Article
Life 2016, 6(3), 30; doi:10.3390/life6030030

Gene-Family Extension Measures and Correlations

Department of Evolutionary and Environmental Biology, University of Haifa, Haifa 3498838, Israel
* Author to whom correspondence should be addressed.
Academic Editor: David Deamer
Received: 2 June 2016 / Revised: 18 July 2016 / Accepted: 18 July 2016 / Published: 3 August 2016
(This article belongs to the Special Issue Structure and Evolution of Genome)

Abstract

The existence of multiple copies of genes is a well-known phenomenon. A gene family is a set of sufficiently similar genes formed by gene duplication. Earlier studies, conducted on a limited number of completely sequenced and annotated genomes, found that gene-family size and genome size are positively correlated, and that several atypical microbes deviate from this general trend. In this study, we reexamined these associations on a larger dataset of 1484 prokaryotic genomes using several ranking approaches. The ranking methods were applied so that genomes with fewer gene copies receive a lower rank. Until now only simple ranking methods had been used; we applied the Kemeny optimal aggregation approach as well. Regression and correlation analyses were used to quantify and characterize the relationships between paralog indices and genome size, and boxplot analysis was used for outlier detection. We found that, in general, all paralog indices correlate positively with genome size. As expected, different groups of atypical prokaryotic genomes were found for different types of paralog quantities. Mycoplasmataceae and Halobacteria appear to be among the most interesting candidates for further research on evolution through gene duplication.
Keywords: number of paralogs; comparative genomics; combinatorial optimization; Mycoplasmas; Halophiles; Orientia; Mycobacterium leprae; genome size
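
Two of the techniques named in the abstract, Kemeny optimal aggregation and boxplot-based outlier detection, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the genome labels and input values are hypothetical, the Kemeny consensus is found by brute force over all permutations (feasible only for a handful of items, not for 1484 genomes), and the outlier rule assumes the conventional 1.5 × IQR whisker length.

```python
from itertools import combinations, permutations
from statistics import quantiles


def kendall_tau_distance(order_a, order_b):
    """Count the item pairs that the two orderings rank in opposite order."""
    pos_a = {item: i for i, item in enumerate(order_a)}
    pos_b = {item: i for i, item in enumerate(order_b)}
    return sum(
        1
        for x, y in combinations(order_a, 2)
        if (pos_a[x] - pos_a[y]) * (pos_b[x] - pos_b[y]) < 0
    )


def kemeny_aggregate(rankings):
    """Brute-force Kemeny consensus: the ordering with the smallest
    total Kendall tau distance to all input rankings."""
    items = sorted(rankings[0])
    return min(
        permutations(items),
        key=lambda cand: sum(kendall_tau_distance(cand, r) for r in rankings),
    )


def boxplot_outliers(values, k=1.5):
    """Return values outside [Q1 - k*IQR, Q3 + k*IQR] (assumed boxplot rule)."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


# Hypothetical example: three paralog indices each rank four genomes,
# lowest number of gene copies first.
rankings = [
    ["g1", "g2", "g3", "g4"],
    ["g1", "g3", "g2", "g4"],
    ["g2", "g1", "g3", "g4"],
]
consensus = kemeny_aggregate(rankings)
```

Here `consensus` is the single ordering that disagrees with the three input rankings on the fewest genome pairs in total. Exact Kemeny aggregation is NP-hard in general, so a dataset of realistic size would need a heuristic or an integer-programming formulation instead of enumeration.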

This is an open access article distributed under the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


MDPI and ACS Style

Carmi, G.; Bolshoy, A. Gene-Family Extension Measures and Correlations. Life 2016, 6, 30.


Note that from the first issue of 2016, MDPI journals use article numbers instead of page numbers.

Life, EISSN 2075-1729, published by MDPI AG, Basel, Switzerland