A Firefly-Inspired Method for Protein Structure Prediction in Lattice Models

We introduce a Firefly-inspired algorithmic approach for protein structure prediction over two different lattice models in three-dimensional space. In particular, we consider three-dimensional cubic and three-dimensional face-centred-cubic (FCC) lattices. The underlying energy models are the Hydrophobic-Polar (H-P) model, the Miyazawa–Jernigan (M-J) model and a related matrix model. The implementation of our approach is tested on ten H-P benchmark problems of a length of 48 and ten M-J benchmark problems of a length ranging from 48 until 61. The key complexity parameter we investigate is the total number of objective function evaluations required to achieve the optimum energy values for the H-P model or competitive results in comparison to published values for the M-J model. For H-P instances and cubic lattices, where data for comparison are available, we obtain an average speed-up over eight instances of 2.1, leaving out two extreme values (otherwise, 8.8). For six M-J instances, data for comparison are available for cubic lattices and runs with a population size of 100, where, a priori, the minimum free energy is a termination criterion. The average speed-up over four instances is 1.2 (leaving out two extreme values, otherwise 1.1), which is achieved for a population size of only eight instances. The present study is a test case with initial results for ad hoc parameter settings, with the aim of justifying future research on larger instances within lattice model settings, eventually leading to the ultimate goal of implementations for off-lattice models.

with the aim of justifying future research on larger instances within lattice model settings, eventually leading to the ultimate goal of implementations for off-lattice models.
Keywords: protein folding; lattice models; H-P and M-J energy functions; Firefly Algorithm

Introduction
Research on protein structure and folding has a long history, dating back to the seminal work by Pauling and Corey [1]; see also [2][3][4][5]. From a computational point of view, all-atom protein structure prediction is a challenging task. Interestingly, Finkelstein and Badretdinov [6] approximated the worst case folding time of a protein of length n as exp(λ · n 2/3 ± χ · n 1/2 /2) ns, where λ and χ are constants close to unity. In terms of algorithmic complexity, this implies a quasi-exponential structure prediction time with an exponent ≈ n 2/3 , if the structure prediction follows the folding path and elementary steps reflect structure transitions. Consequently, protein structure prediction has been shown to be NP-hard for various lattice models [7,8].
A common approach to tackle NP-hard problems are population-based methods and search-based heuristics, such as genetic algorithms [9], tabu search [10], Monte Carlo methods [11], simulated annealing [12][13][14] and quantum optimization [15,16]. Furthermore, simplified energy models are used for approximations of tertiary protein structures, with the most popular being the Hydrophobic-Polar (H-P) energy model [17][18][19][20] and the Miyazawa-Jernigan (M-J) energy model [21,22]. In the H-P model, proteins are represented by chains (within a given lattice) whose vertices are marked either as H (hydrophobic) or P (hydrophilic); H nodes are considered to attract each other, while P nodes are neutral. An optimal lattice embedding is one that maximises the number of H-H contacts, where the underlying justification is the assumption that hydrophobic interactions contribute to a significant portion of the total energy function. The M-J energy model is a more realistic model of the free energy of a given conformation, since it takes into account the interactions between specific pairs of amino acids.
In general, lattice models have been shown to be useful for the study of globular proteins [17,19], of the conformational space induced by (simplified) protein sequences [4,20] and for the analysis of a number of other features important in protein structure prediction, such as long-range interactions in proteins [23]. While typical lattice models account only for the backbone representation of protein sequences, recent research aims at incorporating the space required for side chain packing; see the LatFittool in [24]. As pointed out by Moreno-Hernández and Levitt in [20], protein structure predictions carried out in practice may consist of multiple stages that combine computational models of different complexities, where simplified models are often used in the early stages of structure predictions. In the present study, we focus on lattice models as a test case, with future research aiming at the application of the new search methodology to off-lattice models.
For comprehensive information about recent developments regarding lattice-based protein structure prediction, we refer the reader to [25,26]. In the centre of the present paper is the adaptation of a brand new population-based method to protein structure prediction, namely, the Firefly Algorithm (FA) devised by X.S. Yang [27,28]. To the best of our knowledge, FA has been applied before to protein structure prediction only in [29] (the 2D H-P model).
For the H-P model and face-centred-cubic (FCC) lattices, a new tabu search heuristic is applied in [30] to 21 sequences of a length between 90 and 279. The new heuristic is able to improve on previously known minimum energy values in double digit percentages. The details of how to select favourable intermediate conformations in a population-based search are addressed in [31] (the H-P model and FCC lattices). The information can then be utilised for tabu search or related methodologies. Two energy models, namely the H-P model and a contact matrix similar to the M-J model, proposed in [32], are merged in [33,34]. The authors obtain improvements over existing methods in terms of the root mean square deviation (RMSD, the overall distance between the predicted structure and the native folding identified by X-ray crystallography or nuclear magnetic resonance spectroscopy) for a number of sequences out of 12 proteins with a length ranging from 54 to 160.
In [35], the authors devise a set of general guidelines for the application of Ant Colony Optimisationto protein structure prediction in lattice models. The guidelines are derived from computational experiments on 2D and 3D rectangular lattices combined with the H-P model.
In our study, we try to demonstrate the advantage of the Firefly-inspired method in terms of energy function evaluations, calculated over all individual runs of the population-based method. The results encourage us to analyse further larger instances for the H-P model and to adapt the approach to off-lattice models.

Lattice Models
In order to simplify the modelling of protein conformations, and to make computational processing of the structure easier, the common practice is to represent the protein on a lattice. In informal terms, a lattice can be defined as any two-or three-dimensional graph. In our case, we will be using three-dimensional graphs, which are infinite in all three directions. A valid conformation on a lattice is a self-avoiding walk of the graph representation of the lattice, of a length equal to the length of the protein sequence being modelled. We will consider two lattice types in our experiments: the three-dimensional cubic lattice and the three-dimensional face-centred-cubic (FCC) lattice.
The formal definition of lattices is adopted from [26,36,37]: A lattice, Λ, is a subset of R N defined by: where the vectors, u 0 , ..., u N −1 , form a basis in R N and can be represented by the basis matrix B Λ = (u 0 , ..., u N −1 ). The three-dimensional cubic lattice is the simplest lattice we consider, where B Λ is given by: Two nodes v i = (x i , y i , z i ), v j = (x j , y j , z j ) ∈ B 3Dcubic are neighbours, i.e., connected by an edge, if: Therefore, each node in the lattice has six neighbouring nodes, e.g., for v = (x, y, z), the neighbours are (x + 1, y, z), (x − 1, y, z), (x, y + 1, z), (x, y − 1, z), (x, y, z + 1) and (x, y, z − 1). The FCC lattice is more complex and is a three-dimensional lattice formed around triangles. Each lattice node has 12 possible neighbouring nodes, twice that of the three-dimensional cubic lattice. We consider the FCC lattice, since, even though there are less published results for proteins folded over it, it has been shown that the FCC lattice provides good approximations of actual protein structures (see [14] for the relation to Kepler's conjecture). The basis matrix, B Λ , for the FCC lattice can be defined by: see [26] and also Section 3 in [36]. Two nodes v i = (x i , y i , z i ), v j = (x j , y j , z j ) ∈ B FCC are neighbours according to: For both of these lattice models, a contact between two amino acids is defined as any two amino acids on lattice nodes (x i , y i , z i ) and (x j , y j , z j ) that meet their respective neighbouring properties, defined above (Equations (3) & (5)), and that have the extra requirement that the two nodes do not hold P n and P n+1 , where P n represents the n-th amino acid in the protein sequence, P .
In our population-based model, we consider a separate infinite lattice for each conformation in the population. Conformations are represented as vectors, and as such, we can measure the distance between them over multiple lattices. Using separate lattices for each conformation prevents a move from not being made due to a node already containing an amino acid from another conformation.
All conformations in the population can move independently, which results in an algorithm that, to a certain degree, can be processed in parallel. For a given number of intermediate steps, T , the conformational moves from S 0 to S T can be processed in parallel, for each conformation individually. It is only after each T step that we need all conformational moves to be completed, so that the current best result can be established to become part of the Firefly objective function.

Energy Functions
In a further simplification used to aid computational modelling, bioinformaticians consider simplified energy functions to determine the free energy of any valid conformation. We will consider two energy functions, the simplest, Hydrophobic-Polar (H-P) model proposed by Dill [18], and the Miyazawa-Jernigan (M-J) [21,22] energy model.
The H-P energy model is possibly the simplest energy model in general use. It considers each amino acid to be either hydrophobic or polar. The free energy is calculated as the negative sum of the number of non-sequential H-H bonds in the conformation. Due to its simplicity, the H-P model makes calculating the free energy of a conformation fast, but it does not take into account the interactions between specific pairs of amino acids.
In more detail, let α S denote a lattice embedding of the protein sequence, S. The embedding is valid, if α S lies along a non-self-intersecting path within the lattice in such a way that adjacent vertices of the chain, S, occupy adjacent locations. We define the set of conformations by: Let HH c denote the function that counts the number of neighbouring hydrophobic amino acids in α S that are not neighbours in S, but are neighbours of a distance of one within the lattice. The objective function is then defined by: The energy function corresponds to the notation HP100 introduced in [20], which dates back to the energy function defined in [18]. Apart from HP100, Moreno-Hernández and Levitt [20] also consider HP211, where H-H contacts are weighted with −2, and H-P, as well as P-P contacts are weighted with −1.
The M-J energy model provides a more realistic model of the free energy of a given conformation, since it takes into account the interactions between specific pairs of amino acids. It is represented as a two-dimensional matrix of contact energies between pairs of amino acids; in our experiments, we will use the contact energies matrix presented in [21,22]. Whilst the M-J model is much more biologically realistic, it is more complicated to both implement and compute.
In the original paper [21], where the energy function is introduced, there are two different interaction matrices. The first matrix, often referred to as MJain the literature, stands for the actual energy value of each bond, while the second matrix, MJb, stands for the pairwise contributions to the total free energy related to the fact that two amino acids are forced to expel a solvent molecule and form a contact. In our study, we use the second matrix. The choice is driven by the fact that benchmarks with a provably optimal energy value in the cubic lattice are available for this matrix, MJb [38].
The total free energy under the M-J pairwise interactions is given by Equation (8), where {r i } is the set of bead coordinates that define α S . Let AT(i) denote the amino acid type of the i − th residue of α S , and e u,v are the entries of the MJb matrix. We have, for the contact function D(r i − r j ) = 1, if beads i and j form a contact (that is not a covalent linkage) within the lattice, and D(r i − r j ) = 0, otherwise. We then define: In the evaluation of our methods, we will consider different combinations of lattices and energy functions. We will present the following sets of results: The H-P model over the three-dimensional cubic lattice the H-P model over the three-dimensional FCC lattice, the M-J model over the three-dimensional cubic lattice and a modification of the M-J model introduced in [32] over the three-dimensional FCC lattice. We will use the number of energy evaluations as a performance benchmark of our algorithm.

H-P Benchmarks
The H-P benchmark problems listed in Table 1 are from [39,40] and represent standard structures in this area of research. The benchmark problems have been studied before by the authors [13,14]; see also [41,42]. Table 1. Hydrophobic-Polar (H-P) model benchmark problems (length of 48) from [39,40]. E bench is the benchmark minimum on the 3D cubic lattice and E nat the corresponding value for the face-centred-cubic (FCC) lattice.

. M-J Benchmarks
The M-J sequences taken from [38] are also of a length of 48; see Table 2. Along with the sequences from [38], we also consider four sequences analysed in [37] and [33] for 3D FCC lattices; see Table 3. However, the underlying energy matrix in [33] is taken from [32]. Therefore, we executed computational experiments for 3D FCC lattices on benchmarks from Table 3 for the energy matrix defined in [32].

Firefly Algorithm
Swarm intelligence (SI) and bio-inspired computation have received great interests and attention in the literature. SI-based algorithms, such as particle swarm optimization and the Firefly Algorithm, have simplicity, flexibility and lower time complexity, and thus, they can have many advantages over conventional algorithms [44,45].

Standard Firefly Algorithm
The Firefly Algorithm (FA) was developed by X.S. Yang in 2008 [27,28], which was based on the flashing patterns and behaviour of tropical fireflies. FA is simple, flexible and easy to implement. The basic assumptions in the Firefly Algorithm are: • All fireflies are unisex, so that one firefly will be attracted to other fireflies regardless of their sex; • Attractiveness is proportional to the their brightness; thus for any two flashing fireflies, the less bright one will move towards the brighter one. The attractiveness is proportional to the brightness, and they both decrease as their distance increases. If there is no brighter one than a particular firefly, it will move randomly by random walks; • The brightness of a firefly is affected or determined by the landscape of the objective function.
As a firefly's attractiveness is proportional to the light intensity seen by adjacent fireflies, we can now define the attractiveness, β, of a firefly by: where β 0 ∈ (0, 1] is the attractiveness at r = 0. The movement of a firefly, i, attracted to another more attractive (brighter) firefly, j, is determined by: where the second term is due to the attraction. In addition, γ > 0 is a scaling factor and should be related to the scales of the problem of interest. The third term is randomization with α being the randomization parameter, and t i is a vector of random numbers drawn from a Gaussian distribution at time t. Obviously, the randomization term, t i , can also be drawn from other probability distributions, such as Lévy flights [27,28]. A comprehensive review of the Firefly Algorithm and its variants has been carried out by Fister et al. [46].
For this algorithm to reach convergence properly, randomness should be gradually reduced, and one way to achieve such randomness reduction is to use: where t is the index of iterations/generations. Here, α 0 is the initial randomness factor, and we can set α 0 = 1 without losing generality. As for the parameter values, parametric studies [27,46] suggested that α 0 ∈ [0.5, 1], θ ∈ [0.7, 0.975], n = 15 to 100, and γ = 0.01 to 10, though α 0 = 1 and γ = 1 can work for many applications.
From the above equation, we can see that mutation can be used for both local and global searches. When t i is drawn from a Gaussian distribution and Lévy flights, it produces a mutation on a larger scale. On the other hand, if α is chosen to be a very small value, then mutation can be very small and, thus, limited to a subspace. During the updating steps in the two loops in the FA, ranking, as well as selection is used, see Figure 1. A brief analysis of the firefly algorithm can reveal two distinct advantages: • The automatic subdivision of the whole population into subgroups; • The natural capability of dealing with multi-modal optimization.
All these advantages make FA very unique and very efficient.

Special Cases of FA
The Firefly Algorithm can have rich characteristics. First, it uses attraction to influence the behavior of the population. As local attraction tends to be stronger than long-distance attraction, the population in FA can automatically subdivide into subgroups, depending on the modality of the problem, which enables FA to deal with multimodal, nonlinear optimization problems naturally.
Furthermore, if we look at the updating Equation (10) more closely, we can see that there are special cases when some parameters are approaching certain limits. Firstly, if γ is very large, then attractiveness or light intensity decreases too quickly; this means that the second term in Equation (10) becomes negligible, leading to the standard simulated annealing (SA) [47]; see Section 2.5.2..
Secondly, if γ is very small (i.e., γ → 0), then the exponential factor exp[−γr 2 ij ] → 1. We have: Here, if we further set α = 0, then the above Equation (12) becomes a variant of a differential evolution [48]. On the other hand, if we replace x t j by the current global best solution, g * , we have: which is essentially a variant of particle swarm optimization [44] or, more specifically, the accelerated particle swarm optimization (APSO) introduced by Yang et al. [49]. Thirdly, if we set β 0 = 0 and let t i be related to x i , then Equation (12) becomes a pitch adjustment variant in harmony search (HS) [50].
Therefore, we can essentially say that DE, APSO, SA and HS are special cases of the Firefly Algorithm. Conversely, FA can be considered as a good combination of all four algorithms (DE, APSO, SA and HS), to a certain extent. Furthermore, FA uses a nonlinear updating equation, which can produce richer behaviour and higher convergence than the linear updating equations used in standard PSO and DE. Consequently, it is no surprise that FA can outperform other algorithms in many applications, such as multimodal optimization, classifications, image processing and feature selection [46]. Evaluate new solutions and update light intensity. end for j end for i Rank the fireflies and find the current global best g * . end while Post-process results and visualization.

Firefly-Inspired Protein Structure Prediction
Whilst our approach does not follow the exact ideals of the Firefly methodology, it could be considered a simulated annealing-Firefly hybrid method. We apply the principles of the Firefly optimisation algorithm to a more traditional population-based simulated annealing algorithm, with a primary goal of minimising the number of objective function evaluations (defined as Oval min and Oval avg in our results tables) required to reach a solution.

Pull Move Set
The algorithm we have developed uses the pull moves set [10] with the correction suggested by Györffy et al. in [51] (see p. 1848, left column) for cubic lattices. We choose the (adjusted) pull move set, because it has been shown to be reversible and complete for cubic lattices [10,51] and FCC lattices [36] (Section 3), meaning that every possible valid conformation is reachable using only the pull move set. We use this pull move set in conjunction with a local search approach, combined with simulated annealing, in order to find an optimum embedding in the lattice. This approach is used together with the Firefly methodology, as shown below, over both the three-dimensional cubic and FCC lattices.

Simulated Annealing
Simulated annealing has long been used [47] as an optimisation method for local search-based problems. We implement simulated annealing to our local search method to avoid becoming trapped in a local minimum during the early stages of computation. Becoming trapped in a local minimum could prevent an optimal embedding from being reached, since the move to an optimal conformation may not be possible if intermediate conformations with a less-optimal energy function value are not considered.
Since the objective of this experiment is to determine the effectiveness of the Firefly optimisation method, we consider only a linear cooling schedule and not a logarithmic cooling schedule.
We use a standard cooling schedule in accordance with: and a simulated annealing decision based on: This function is used to determine whether a move is to be taken. The value of n is not increased for each member of the population, that is, after 10 steps in a population of a size of 5, n = 10, not 50. This ensures that population members who are computed sequentially after another have the same probability of accepting a move at any given step.

Population
Our method is a population-based method. We consider a population of a size of N = 8 in our experiments, where this number is at the lower end of the settings for N discussed in population-based computing (e.g., if values of N = 2 n = 4, 8, 16, 32, ... are considered). Moreover, due to the multi-threaded nature of our implementation on four CPU cores, the communication overhead is relatively small for N = 8.
We create our population by identifying a set of initial conformations. To do this for a population of a size of N , we create N infinite lattice models. We then make a random self-avoiding walk of each member lattice, of the same length as the protein sequence, attaching the value of each amino acid in the sequence to its respective node. Once we have this random initial conformation, we run a greedy version of the pull move set algorithm on each initial conformation, until we reach a local minimum. This greedy version of the pull move set will only accept moves from the pull move set that result in a lower (better) energy function evaluation.
In traditional simulated annealing combined with population-based methods, the entire population is considered for simulated annealing-based evaluation using the energy function at every step. A crucial feature of our approach is that simulated annealing-based decisions are executed at each step only for the conformation with the lowest value of the energy function (at the last known iteration of T steps), whereas the remaining members of the population are evaluated at each step according to their Euclidean distance to the best conformation. The selection of the best conformation is then updated after T steps, as described in Section 2.5.4..

Integration with Firefly
We use the Firefly method as a tool for guiding intermediate solutions towards optimum solutions, where the (intermediate) best solution within the population serves as a reference structure for a limited number of T steps and changes during this interval according to simulated annealing decisions.
We consider the initial population (S 1 ... S N ) and an interval, T . At the beginning, pull moves are executed T times with the only criterion that the energy value does not increase (greedy descent, potentially to local minima within T steps). We then perform an energy evaluation of all conformations in the generated initial population, and save the conformation with the lowest energy value as S best (random selection, if there are several conformations with the lowest energy value according to Equation (7) or Equation (8)). It is important to note that the member, S best , of the population remains the same member during simulated annealing-guided conformational changes during each interval of T steps. In terms of the standard Firefly Algorithm, S best , can be considered to be the firefly with the brightest light (and, thus, the light intensity of a population member is inversely proportional to its energy function evaluation; that is, lower energy function evaluations result in brighter firefly lights).
While for each T -interval, standard pull moves are executed and evaluated based upon simulated annealing on S best , for every other conformation in the population (N −1), pull moves are applied with respect to an alternative objective function: In order to mimic the Firefly method, each of these N −1 conformations is evaluated in accordance with its Euclidean distance to S best , i.e., the second objective function favours changes to the corresponding lattice embedding that move the conformation structurally towards the current S best .
In formal terms, the structural similarity of [S best , S i ] is defined by the Euclidean distance: For the 3D coordinates (x t , y t , z t ) of the t − th node (amino acid) at step n within the T -interval, we set: If, now, a pull move affects the positions of m nodes (amino acids) from position j until position k (m = k−j +1), we select the average value of the distance changes as the structural objective function: The pull move is only accepted for the population member, S i , if sof n (S i ) ≥ sof n+1 (S i ). Thus, if the total distance increases, but more nodes are affected, the move could still be accepted, which can contribute to the diversity of the population. Our approach follows the Firefly optimisation principle, since all members of the population with dimmer lights (higher energy function evaluations) move (structurally) to the member with the brightest light (the lowest energy evaluation score).
After T steps, we perform an energy function evaluation on all N members of the population, and the member of the population that is labelled S best is updated.
The procedure terminates if there are no pull moves accepted by simulated annealing for the conformation labelled as S best within the current T -interval, and the member of the population labelled as S best remains the same after this particular interval.
where S x is the conformation with the best energy function evaluation best index ← x iterations ← 0 end if end while As an overview (see the pseudocode in Figure 2), the main steps of the algorithm are: 1. Generate a random initial population of random walks (conformations).
2. Perform pull moves on each conformation greedily until a local minimum is found.
3. Copy the best conformation in the population into S best . Note its index in best index.
5. Perform a pull move on conformation best index using the simulated annealing objective function. 6. Perform a pull move on all other conformations using the structural objective function.
7. If we have reached T iterations since S best , we set a new value for S best and best index by picking randomly from all conformations with the lowest energy function evaluation.

Results and Discussion
In our experiments, we tested the performance of our algorithm on both the H-P and M-J energy functions on three-dimensional cubic and FCC lattices. We take the minimum and average results and objective function counts from five runs of each combination. All experiments are run with a population size of eight and a default interval of T = 15. By selecting such a small value of T , we try to avoid that conformations different from S best do not move for too many steps structurally towards a conformation that is no longer the best with respect to the energy value. Future research will include finding out whether or not one can identify a range of T (depending on the sequence length) more appropriate for this particular parameter setting.
The computational experiments were executed on computers with Intel Xeon E5-4603 CPUs and 16 GB of RAM, running Debian 6 x64. The runtimes for the HP1-HP10 sequences ranged from 35 to 142 minutes on the 3D cubic lattice and from 81 to 250 minutes on the FCC lattice. For sequences MJ1-MJ5 over the cubic lattice, runtimes ranging from 89 to 221 minutes were observed. We expect these runtimes to improve with further fine-tuning of the algorithm and parameter settings.
The performance of algorithms is difficult to compare solely based upon runtime values, since the particular runtimes are affected by the underlying hardware and implementation details. Therefore, we provide data about the total number of objective function evaluations, which is denoted by Oval. The notation has been chosen in order to cover both energy function evaluations according to Equation (7) and Equation (8), which are counted for the member of the population labelled as S best and the number of 'structure-based' objective function evaluations according to Equation (17), which are executed for each of the remaining N −1 members of the population. Thus, if K = k × T + T is the number of steps (without the initial stage) until the termination of the procedure, where 1 ≤ T ≤ T , then: The values of Init i cover the initial stage before the selection of S best , and (k + 1)(N − 1) accounts for the selection of S best at the beginning of each complete T -interval and T (we note that the energy value of S best is known at the beginning of each interval). By selecting such a performance measure, it is also possible to circumvent the runtime impact of how the underlying lattices (which are equal for the same problem setting) are implemented and handled in computational experiments.
In the literature, comparable data with respect to function evaluations are provided in [52] with respect to the H-P model (ten benchmarks from Table 1), and in [41] for the M-J model (six benchmarks from Table 2); in both cases, for the three-dimensional cubic lattice. The function evaluations are counted for the respective energy function, and therefore, we denote by Eval the total number of energy function evaluations. We note that in [52], the value of Eval is equal to the product of the population size (= 20) and the number of 'generations'. Since the number of 'generations' reported in [52] is the average value over 10 independent runs, we use Eval avg for comparison purposes. In [41] (see Tables 9-14 there), 10 independent runs are reported individually with their respective Eval value and minimum energy value. The population size for each of the runs is 100. For comparison purposes, we calculated the average Eval avg value over 10 runs for each of the six M-J instances.
In our computational experiments, we executed five independent runs, and we denote by Oval avg the average number of objective function evaluations according to Equation (18) for each individual benchmark problem. For comparison to [41] and [52], we introduce the parameter: Table 4 displays our results using the H-P model on the three-dimensional cubic lattice. It shows that, over the 10 benchmark problems, our method finds an optimal conformation for every problem. This is comparable to other methods; as E pub shows, the method presented in [52] achieved the same performance. Except for HP1, HP4 and HP6, our method improves on the number of energy function evaluations reported in [52], and on eight out of the ten instances, Oval min is smaller than Eval pub avg . If the worst (HP1) and best (HP7) performances are left out, we achieve an average speed-up of 2.1.
A potential explanation for the relatively large value of Eval pub avg for HP7 could be the landscape analysis carried out in [14] for this instance: HP7 exhibits the smallest γ value, which results in a termination criterion larger by about one margin compared to most of the other sequences (see Table 1 in [14]). Table 5 shows our results for the H-P model over the FCC lattice. This combination shows similar performance to the three-dimensional cubic lattice, in that both our method and the published [53] method found an optimal conformation in 100% of the benchmark problems. We can also see that in 50% of benchmarks (HP2, HP3, HP4, HP5 and HP8), we achieved a better average conformation value than the results published in [53]. To our knowledge, there are no published energy function evaluation counts for the benchmark problems HP1-HP10 over the 3D FCC lattice. Comparing to our results for the three-dimensional cubic lattice, there is an increase in energy evaluations over all benchmark problems. This is to be expected considering the additional complexity of the 3D FCC lattice. Table 6 shows our results from five runs for the M-J model on the three-dimensional cubic lattice. In this combination, we achieved an optimal conformation in four out of the six benchmarks. We recall that the Eval pub avg values are the average values of energy function evaluation counts over all 10 runs obtained by PLS in [41] for the corresponding benchmark. We note that except for MJ3, not all of the 10 runs reported in [41] reached the optimum value (ranges from six up to nine runs on the remaining instances).
On four of the benchmarks, we achieved the optimum free energy value. With the ad hoc parameter settings we used, we missed out by a margin on MJ1 and and MJ4. It should be noted that the speed-up of objective function evaluations in comparison to PLS from [41] is 1.5 and 1.4 on MJ1 and MJ4, respectively. Therefore, we think it is justified to say that, with a slightly higher computational effort, our approach will be able to achieve the optimum free energy values on these two instances. When leaving out MJ3 and MJ5, the average speed-up is 1.2 (otherwise, 1.1). As mentioned before, the population size in [41] is 100, and we presume that with a moderate increase of the population size for the Firefly-inspired algorithms, a stronger speed-up can be achieved. Furthermore, it is important to note that in [41], the runs terminate when a structure with minimum free energy has been found, which is not the case in our computational experiments. Table 6. Results for the M-J Model on 3D cubic lattices for benchmarks from Table 2. E pub are the results published in [41]. Eval pub avg are the average energy function evaluation counts for PLSfrom [41]. 1.2 Table 7 shows our results obtained by using the energy function from [32] over the FCC lattice compared to those from [33]. Our results are comparable to [33] on instance 4RXN, worse on instance 1ENHand better on the longer instances, 4PTIand 2IGD. Unfortunately, no data are available in published literature for comparison of the number of energy function evaluations.

Conclusions
We presented the initial results of protein structure prediction in lattice models by using simplified energy function and a new Firefly-inspired heuristic with ad hoc parameter settings. With regard to H-P instances and the minimum number of objective function evaluations, our methods outperform the published results on eight out of ten benchmark problems for cubic lattices. If the average number of objective function evaluations over five runs is compared to the published data over ten runs, the average speed-up is 2.1 over eight instances, where two extreme cases are left out (0.4 and 70.6). For H-P instances on FCC lattices, all ten optimum free energy values are achieved with a relatively small number of objective function evaluations. For the six M-J instances of a length of 48, we obtained on cubic lattices an average speed-up over four instances of 1.2, leaving out two extreme values (otherwise, 1.1). The speed-up is achieved for a population size of only eight instances, whereas the population size producing the data we used for comparison is 100. We presume the speed-up can be improved, if a larger population size is chosen for our Firefly-inspired method. On four selected benchmark problems of a length between 54 and 61, we achieved for the energy matrix proposed in [32] on FCC lattices an improvement of minimum free energy values on two instances in comparison to the results published in [33]. As in the case of H-P instances on FCC lattices, the number of energy function evaluations is comparably small. The results we obtained can be seen as a proof of concept, which encourages us to move forward towards the analysis of longer benchmark problems and off-lattice models.