Int. J. Mol. Sci. 2009, 10(4), 1567-1589; doi:10.3390/ijms10041567

Folding by Numbers: Primary Sequence Statistics and Their Use in Studying Protein Folding

B. Wathen and Z. Jia *
Department of Biochemistry, Queen's University, Kingston, Ontario, Canada K7L 3N6
* Author to whom correspondence should be addressed.
Received: 30 January 2009 / Revised: 30 March 2009 / Accepted: 2 April 2009 / Published: 8 April 2009
(This article belongs to the Special Issue Protein Folding)

The exponential growth over the past several decades in both the quantity of primary sequence data available and the number of protein structures determined has provided a wealth of information describing the relationship between protein primary sequence and tertiary structure. This growing repository of data has served as a prime source for statistical analysis, through which underlying relationships between patterns of amino acids and protein structure can be uncovered. Here, we survey the main statistical approaches that have been used for identifying patterns within protein sequences, and discuss sequence pattern research as it relates to both secondary and tertiary protein structure. Limitations of statistical analyses are discussed, and a context for their role within the field of protein folding is given. We conclude by describing a novel statistical study of residue patterning in β-strands, which finds that hydrophobic (i,i+2) pairing in β-strands occurs more often than expected at locations near strand termini. Interpretations involving β-sheet nucleation and growth are discussed.
Keywords: Primary Sequence; Protein Folding; Sequence-Structure Relationship
This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
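
The abstract's closing finding, that hydrophobic (i,i+2) pairs occur "more often than expected" near strand termini, implies a comparison of observed pair counts against some expected count. The following is a minimal sketch of one way such a comparison could be set up, not the authors' actual procedure. The hydrophobic alphabet `HYDROPHOBIC`, the function name `pair_propensity`, and the independence-based null model (each residue hydrophobic with the background frequency measured over all strands) are all assumptions made for illustration.

```python
from collections import Counter

# Assumption: a simple hydrophobic alphabet chosen for illustration only.
HYDROPHOBIC = set("AVILMFWC")


def pair_propensity(strands, offset=2):
    """Observed/expected ratio of hydrophobic (i, i+offset) pairs,
    grouped by the pair's distance from the nearest strand terminus."""
    total = sum(len(s) for s in strands)
    if total == 0:
        return {}
    n_hydrophobic = sum(ch in HYDROPHOBIC for s in strands for ch in s)
    p_h = n_hydrophobic / total  # background hydrophobic frequency
    if p_h == 0:
        return {}

    observed = Counter()  # hydrophobic-hydrophobic pairs per distance
    possible = Counter()  # all candidate pairs per distance
    for s in strands:
        for i in range(len(s) - offset):
            j = i + offset
            dist = min(i, len(s) - 1 - j)  # 0 means the pair touches a terminus
            possible[dist] += 1
            if s[i] in HYDROPHOBIC and s[j] in HYDROPHOBIC:
                observed[dist] += 1

    # Expected count under independence: p_h**2 per candidate pair.
    return {
        d: observed[d] / (possible[d] * p_h ** 2)
        for d in sorted(possible)
    }
```

Called on a list of β-strand sequences, for example `pair_propensity(["VKVTV", "KVKVK"])`, the function returns a dictionary mapping terminus distance to the observed/expected ratio. Under this simplified null model, ratios above 1 at small distances would be consistent with enrichment near strand termini.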

Wathen, B.; Jia, Z. Folding by Numbers: Primary Sequence Statistics and Their Use in Studying Protein Folding. Int. J. Mol. Sci. 2009, 10, 1567-1589.

Int. J. Mol. Sci. EISSN 1422-0067 Published by MDPI AG, Basel, Switzerland