These authors contributed equally to this work

This article is an open-access article distributed under the terms and conditions of the Creative Commons Attribution license (http://creativecommons.org/licenses/by/3.0/).

The conformation spaces generated by short hydrophobic-hydrophilic (HP) lattice chains are mapped to conformation space networks (CSNs). The vertices (nodes) of the network are the conformations and the links are the transitions between them. These networks have been found to have “small-world” properties when the interaction energy of the monomers in the chain, i.e. the hydrophobic or hydrophilic amino acids inside the chain, is not considered. When a weight based on the interaction energy of the monomers in the chain is added to the CSNs, the weighted networks show the “scale-free” characteristic. In addition, a connection between the scale-free property of the weighted CSN and the folding dynamics of the chain is revealed by investigating the relationship between the scale-free structure of the weighted CSN and the noted parameter

The complex network model [

In reference [^{−}^{γ}

Furthermore, the power-law property of the CSNs may have an important relationship with the dynamics of some complex processes such as protein folding. As we know, among all the possible linear amino-acid sequences, only very few are “protein-like” [

Finally, the modular (community) structure [

Here the weighted CSN was studied with the 2D hydrophobic-hydrophilic (HP) square lattice model proposed by Dill [_{HH}_{PP}_{HP}

First, the conformation space was mapped onto an unweighted network. All allowed conformations of a chain that has a unique ground-state conformation were enumerated. The conformation of the chain was evolved by the conventional Monte Carlo elementary moves [
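The mapping from conformation space to an unweighted network can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from the text: conformations are written as strings of bond directions (`U`, `D`, `L`, `R`), the elementary moves are taken to be end moves, corner flips, and crankshaft moves, and conformations related by rotation or reflection are kept as separate vertices. All function names are hypothetical.

```python
from itertools import product

STEPS = {"U": (0, 1), "D": (0, -1), "L": (-1, 0), "R": (1, 0)}
OPPOSITE = {"U": "D", "D": "U", "L": "R", "R": "L"}


def coords(conf):
    """Lattice coordinates of every monomer for a sequence of bond directions."""
    x, y = 0, 0
    points = [(x, y)]
    for step in conf:
        dx, dy = STEPS[step]
        x, y = x + dx, y + dy
        points.append((x, y))
    return points


def is_self_avoiding(conf):
    """True if no lattice site is occupied by two monomers."""
    points = coords(conf)
    return len(set(points)) == len(points)


def enumerate_conformations(n_monomers):
    """All self-avoiding conformations, each a string of n_monomers - 1 bonds."""
    return ["".join(bonds)
            for bonds in product("UDLR", repeat=n_monomers - 1)
            if is_self_avoiding(bonds)]


def neighbors(conf):
    """Conformations reachable from conf by one assumed elementary move."""
    n = len(conf)
    candidates = set()
    # End move: rotate the first or the last bond by 90 degrees.
    for pos in (0, n - 1):
        for step in "UDLR":
            if step not in (conf[pos], OPPOSITE[conf[pos]]):
                candidates.add(conf[:pos] + step + conf[pos + 1:])
    # Corner flip: swap two consecutive bonds that point in different directions.
    for i in range(n - 1):
        if conf[i] != conf[i + 1]:
            candidates.add(conf[:i] + conf[i + 1] + conf[i] + conf[i + 2:])
    # Crankshaft: flip a U-shaped three-bond segment to the other side.
    for i in range(n - 2):
        if conf[i + 2] == OPPOSITE[conf[i]]:
            candidates.add(conf[:i] + conf[i + 2] + conf[i + 1] + conf[i] + conf[i + 3:])
    # Keep only moves that leave the chain self-avoiding.
    return {c for c in candidates if is_self_avoiding(c)}


def build_network(n_monomers):
    """Adjacency map: vertices are conformations, links are single moves."""
    return {conf: neighbors(conf) for conf in enumerate_conformations(n_monomers)}
```

For example, `build_network(5)` returns an adjacency map whose keys are the conformations of a 5-monomer chain. The bond-string representation is translation invariant, so an end move of the first monomer needs no special handling.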

Second, the energy weight was added to the network. According to the interaction energy of the conformation, the Boltzmann factor of each conformation, $\exp(-E/k_{B}T)$, was calculated, where $E$ and $k_{B}$ are the energy of the conformation and the Boltzmann constant, respectively, and $k_{B} = 1$ and _{i}
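As a hedged illustration of the energy weight, the sketch below computes the HP contact energy of a conformation and its Boltzmann factor. It assumes the common HP convention in which each non-bonded H-H contact contributes `e_hh = -1` and all other contacts contribute zero; the temperature is left as a parameter because its value is not recoverable from the text. The function names and the bond-string representation are hypothetical.

```python
import math

STEPS = {"U": (0, 1), "D": (0, -1), "L": (-1, 0), "R": (1, 0)}


def coords(conf):
    """Lattice coordinates of every monomer for a string of bond directions."""
    x, y = 0, 0
    points = [(x, y)]
    for step in conf:
        dx, dy = STEPS[step]
        x, y = x + dx, y + dy
        points.append((x, y))
    return points


def hp_energy(conf, sequence, e_hh=-1.0):
    """Sum e_hh over all non-bonded nearest-neighbour H-H pairs (assumed convention)."""
    points = coords(conf)
    index_at = {p: i for i, p in enumerate(points)}
    energy = 0.0
    for i, (x, y) in enumerate(points):
        if sequence[i] != "H":
            continue
        # Look only right and up so that every lattice contact is counted once.
        for dx, dy in ((1, 0), (0, 1)):
            j = index_at.get((x + dx, y + dy))
            if j is not None and abs(i - j) > 1 and sequence[j] == "H":
                energy += e_hh
    return energy


def boltzmann_factor(energy, temperature, k_b=1.0):
    """exp(-E / (k_B T)) with k_B = 1 by default, as in the text."""
    return math.exp(-energy / (k_b * temperature))
```

For instance, `hp_energy("URD", "HPPH")` evaluates a four-monomer chain folded into a square, where the two terminal H monomers form one contact.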

Finally, we chose HP sequences to construct the weighted CSNs. For the HP model, the conformation–sequence relationship has been studied thoroughly [^{13} HP sequences. By searching the complete set of conformations of each sequence, a set of 309 sequences with a unique lowest-energy state was found among the $2^{13}$ HP sequences. A natural protein sequence has a unique global minimum of free energy that is well separated in energy from other misfolded states. In a lattice HP model, the protein-like folds are associated with sequences that have a minimal number of lowest-energy states [
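The search for sequences with a unique lowest-energy state can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: each H-H contact is assumed to contribute an energy of -1, rotations and reflections are removed by fixing the first bond to `R` and the first turn to `U`, and chain-reversal symmetry is ignored. All names are hypothetical, and the example is run on 4-monomer chains only to keep the enumeration small.

```python
from itertools import product

STEPS = {"U": (0, 1), "D": (0, -1), "L": (-1, 0), "R": (1, 0)}


def coords(conf):
    """Lattice coordinates of every monomer for a sequence of bond directions."""
    x, y = 0, 0
    points = [(x, y)]
    for step in conf:
        dx, dy = STEPS[step]
        x, y = x + dx, y + dy
        points.append((x, y))
    return points


def symmetry_reduced_conformations(n_monomers):
    """Self-avoiding conformations with first bond R and first turn U."""
    confs = []
    for bonds in product("UDLR", repeat=n_monomers - 1):
        if bonds[0] != "R":
            continue
        turns = [b for b in bonds if b != "R"]
        if turns and turns[0] != "U":
            continue
        points = coords(bonds)
        if len(set(points)) == len(points):
            confs.append("".join(bonds))
    return confs


def hp_energy(conf, sequence, e_hh=-1.0):
    """Sum e_hh over non-bonded nearest-neighbour H-H pairs (assumed convention)."""
    points = coords(conf)
    index_at = {p: i for i, p in enumerate(points)}
    energy = 0.0
    for i, (x, y) in enumerate(points):
        if sequence[i] != "H":
            continue
        for dx, dy in ((1, 0), (0, 1)):  # each contact is visited once
            j = index_at.get((x + dx, y + dy))
            if j is not None and abs(i - j) > 1 and sequence[j] == "H":
                energy += e_hh
    return energy


def ground_state_degeneracy(sequence):
    """Return (lowest energy, number of conformations that reach it)."""
    energies = [hp_energy(c, sequence)
                for c in symmetry_reduced_conformations(len(sequence))]
    lowest = min(energies)
    return lowest, energies.count(lowest)


def sequences_with_unique_ground_state(n_monomers):
    """All HP sequences whose lowest-energy conformation is non-degenerate."""
    sequences = ("".join(s) for s in product("HP", repeat=n_monomers))
    return [s for s in sequences if ground_state_degeneracy(s)[1] == 1]
```

Calling `sequences_with_unique_ground_state(4)` lists the 4-monomer sequences that pass this filter; longer chains use the same call but take much longer, since both the sequence and conformation counts grow exponentially.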

To uncover the relationship between the folding dynamics and the power-law property of the weighted CSN, we used the set of 309 sequences with a unique lowest-energy state mentioned above. Following the network construction method described above, we constructed 309 weighted complex networks. All of these weighted networks show power-law tails, so we can obtain all the scaling exponent

Additionally, a known parameter, Z score (_{a}_{0}_{C}

In this article, a multistep greedy algorithm (MSG) in combination with a local refinement procedure named “vertex mover” (VM) [_{i}_{C}

The algorithm works as follows: i. Start with the modularity change matrix Δ_{i}_{x}_{ij}

In the MSG-VM algorithms, the value of _{1} and _{2}, the product of the Boltzmann factors of the two nodes $n_{1}$ and $n_{2}$ is defined as the weight of edge $e$, where $E_{n_1}$ and $E_{n_2}$ are the energies of the conformations which correspond to nodes $n_{1}$ and $n_{2}$, respectively; $k_{B}$ is the Boltzmann constant; $k_{B} = 1$ and
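The edge weight described here, the product of the Boltzmann factors of the two end nodes, can be sketched together with the weighted modularity of a given partition. The modularity formula used is the standard Newman form for weighted networks, which is an assumption about what the optimizer evaluates. The sketch only scores a partition and does not implement the MSG-VM search itself. Function names and the temperature parameter are hypothetical.

```python
import math


def edge_weight(energy_1, energy_2, temperature, k_b=1.0):
    """Product of the Boltzmann factors of the two conformations on an edge."""
    return (math.exp(-energy_1 / (k_b * temperature))
            * math.exp(-energy_2 / (k_b * temperature)))


def weighted_modularity(edges, community):
    """Modularity of a partition of a weighted, undirected network.

    edges: dict mapping (u, v) to a weight, each undirected edge listed once.
    community: dict mapping every node to its community label.
    """
    total = sum(edges.values())
    internal = {}   # total weight of edges inside each community
    strength = {}   # summed weighted degree of the nodes in each community
    for (u, v), w in edges.items():
        cu, cv = community[u], community[v]
        strength[cu] = strength.get(cu, 0.0) + w
        strength[cv] = strength.get(cv, 0.0) + w
        if cu == cv:
            internal[cu] = internal.get(cu, 0.0) + w
    return sum(internal.get(c, 0.0) / total - (s / (2.0 * total)) ** 2
               for c, s in strength.items())
```

As a usage example, two triangles joined by a single unit-weight edge and split into their natural two communities give a modularity of 5/14, while placing all six nodes in one community gives zero.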

For a complex network, the most widely studied characteristic is the degree. The degree of a vertex is the total number of its connections. This quantity is also called “connectivity”. The degree of a vertex is a local quantity, and the overall distribution of vertex degrees often determines some important global characteristics of the network. Different kinds of networks have degree distributions of different forms. The degree distribution of a scale-free network is of a fat-tailed form [

$P(k) \sim k^{-\gamma}$
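A scaling exponent of this kind can be estimated from a list of vertex degrees, as in the minimal sketch below. The fitting method, an ordinary least-squares line through the log-log degree distribution, is an assumption chosen for simplicity and is not taken from the text. Function names are hypothetical.

```python
import math
from collections import Counter


def degree_distribution(degrees):
    """Fraction of vertices having each degree value."""
    counts = Counter(degrees)
    n = len(degrees)
    return {k: c / n for k, c in sorted(counts.items())}


def fit_power_law_exponent(distribution):
    """Estimate gamma in P(k) ~ k^(-gamma) by least squares on log-log data."""
    points = [(math.log(k), math.log(p))
              for k, p in distribution.items() if k > 0 and p > 0]
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    # The slope of log P versus log k is -gamma.
    return -sxy / sxx
```

A straight-line fit on log-log axes is sensitive to noise in the tail, so binning or a maximum-likelihood estimate may be preferable for real data.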

Using the edge weight defined as (5), we firstly calculated the average shortest distance _{w}_{random}_{w}_{w}_{random}_{random}_{random}_{w}^{2}, where _{random}_{w}_{random}

The result indicates that the energy weight is a key factor for the appearance of the power-law property of the CSN. To understand the meaning of the energy weight, we should consider the connectivity of the conformations and more details about the lattice model. It is well known that the lattice model is a simplified model of a polymer. In the lattice model, polymers including proteins can be treated as linear strings of beads, and the details of the structures are not accurately represented. This simplified approximation is based on the following hypothesis: when the real chain conformations are under good solvent conditions or in “theta”[

When the interaction energy of the monomers in the chain is not taken into account, all conformations have the same free energy, i.e. unit free energy. This means that each conformation has the same weight in the conformation space. Therefore, the topology of the CSN is determined simply by the connectivity of the conformations. In this case, the degree distribution of the network is a binomial distribution and deviates significantly from the power-law form, as shown in

The power-law distribution indicates that the network contains a few “hubs” with large connectivity and a large number of vertices with relatively few links [

This result can be understood by the free-energy landscape view. According to the mathematical knowledge of the power-law function [^{−}^{β}

It has been shown [

This result is also helpful for protein design, whose goal is to identify amino acid sequences that can fold well and lead to a given structure [

The value of

The inverse proportion relationships shown in

To obtain the values of the modularity

Finally, several tasks remain to be done in the future. First of all, one may consider the conformation space generated by a three-dimensional (3D) lattice chain and a more reasonable weighting method. In addition, the relationship between the topology of the CSN and protein dynamics could be studied in more detail to obtain deeper insight into both. Furthermore, further modular analysis of the CSN will help us to understand the rugged energy landscape more clearly. For example, through modular analysis, one may find out more information about the micro-structure on the surface of the energy landscape, such as “local minima” [

Complex network theory was applied to analyze the topological properties of the conformation spaces generated by short two-dimensional HP lattice chains and the underlying free-energy landscape. Scaling behavior is observed in the CSN topology when the weight based on the interaction energy of the monomers inside the sequences is considered. This result reveals the importance of the monomer interaction in forming the topology of the CSNs, and thus may provide an alternative explanation of the origin of the scale-free property of the CSNs. Moreover, the significant correlation between the scaling exponent

This work was supported in part by the National Natural Science Foundation of China (Grant Nos. 20773006 and 30670497) and the Specialized Research Fund for the Doctoral Program of Higher Education (No. 200800050003).

(a) An example of a Monte Carlo move. The five different conformations of a 10-monomer lattice chain are labeled

The connectivity distribution of the unweighted conformation space network. In this figure,

Topological properties of weighted conformation space networks. Logarithmic coordinates are used. ^{−}^{γ}

The scaling exponent

Relationship between the scaling exponent

Correlation between the modularity