Open Access Review
Viruses 2010, 2(8), 1804-1820

Coronavirus Genomics and Bioinformatics Analysis

P.C.Y. Woo 1,2,3,4,†,*, Y. Huang 4,†, S.K.P. Lau 1,2,3,4,* and K.-Y. Yuen 1,2,3,4
1 State Key Laboratory of Emerging Infectious Diseases, The University of Hong Kong, Hong Kong
2 Research Centre of Infection and Immunology, The University of Hong Kong, Hong Kong
3 Carol Yu Centre of Infection, The University of Hong Kong, Hong Kong
4 Department of Microbiology, The University of Hong Kong, University Pathology Building,
† These authors contributed equally to this work.
* Authors to whom correspondence should be addressed.
Received: 1 July 2010 / Accepted: 12 August 2010 / Published: 24 August 2010
(This article belongs to the Special Issue Viral Genomics and Bioinformatics)


The drastic increase in the number of coronaviruses discovered and coronavirus genomes sequenced has given us an unprecedented opportunity to perform genomics and bioinformatics analysis on this family of viruses. Coronaviruses possess the largest genomes (26.4 to 31.7 kb) among all known RNA viruses, with G + C contents varying from 32% to 43%. Variable numbers of small ORFs are present between the various conserved genes (ORF1ab, spike, envelope, membrane and nucleocapsid) and downstream of the nucleocapsid gene in different coronavirus lineages. Phylogenetically, three genera exist: Alphacoronavirus, Betacoronavirus and Gammacoronavirus, with Betacoronavirus consisting of subgroups A, B, C and D. A fourth genus, Deltacoronavirus, which includes bulbul coronavirus HKU11, thrush coronavirus HKU12 and munia coronavirus HKU13, is emerging. Molecular clock analysis using various gene loci placed the time of the most recent common ancestor of human/civet SARS-related coronavirus at 1999-2002, with an estimated substitution rate of 4 × 10⁻⁴ to 2 × 10⁻² substitutions per site per year. Recombination in coronaviruses was most notable between different strains of murine hepatitis virus (MHV), between different strains of infectious bronchitis virus, between MHV and bovine coronavirus, between feline coronavirus (FCoV) type I and canine coronavirus generating FCoV type II, and between the three genotypes of human coronavirus HKU1 (HCoV-HKU1). Codon usage bias was observed in coronaviruses, with HCoV-HKU1 showing the most extreme bias; cytosine deamination and selection of CpG-suppressed clones are the two major independent biological forces that shape this bias.
Keywords: coronavirus; genome; bioinformatics
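
Two of the quantities mentioned in the abstract, G + C content and CpG suppression, are simple to compute from a genome sequence. The Python sketch below is only an illustration and is not taken from the review: the function names `gc_content` and `cpg_obs_exp` are hypothetical, the observed/expected CpG ratio uses one common convention (CpG count × length / (C count × G count)), which may differ from the measure the authors used, and the example sequence is a made-up toy string, not a coronavirus genome.

```python
def _normalise(seq: str) -> str:
    """Upper-case the sequence and map RNA 'U' to 'T' so RNA and cDNA inputs agree."""
    return seq.upper().replace("U", "T")


def gc_content(seq: str) -> float:
    """Fraction of G and C among unambiguous bases (A, C, G, T); other symbols are ignored."""
    bases = [b for b in _normalise(seq) if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no unambiguous bases")
    return sum(b in "GC" for b in bases) / len(bases)


def cpg_obs_exp(seq: str) -> float:
    """Observed/expected CpG ratio; values well below 1 suggest CpG suppression.

    Assumed convention: (number of CG dinucleotides * length) / (C count * G count).
    """
    s = _normalise(seq)
    c_count, g_count = s.count("C"), s.count("G")
    if c_count == 0 or g_count == 0:
        raise ValueError("ratio is undefined without both C and G")
    # Count each adjacent C-then-G pair once.
    cpg = sum(1 for i in range(len(s) - 1) if s[i] == "C" and s[i + 1] == "G")
    return cpg * len(s) / (c_count * g_count)


if __name__ == "__main__":
    toy = "AUGCGAUUACGCUUAA"  # hypothetical toy sequence for illustration only
    print(f"G + C content: {gc_content(toy):.3f}")
    print(f"CpG observed/expected: {cpg_obs_exp(toy):.3f}")
```

Applied to a full genome read from a FASTA file, `gc_content` would be expected to fall within the 32% to 43% range quoted above for coronaviruses, though that has not been checked here.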
This is an open access article distributed under the Creative Commons Attribution License (CC BY 3.0).

MDPI and ACS Style

Woo, P.C.Y.; Huang, Y.; Lau, S.K.P.; Yuen, K.-Y. Coronavirus Genomics and Bioinformatics Analysis. Viruses 2010, 2, 1804-1820.

Viruses EISSN 1999-4915 Published by MDPI AG, Basel, Switzerland