Biomimetics
  • Article
  • Open Access

12 November 2025

ACGA a Novel Biomimetic Hybrid Optimisation Algorithm Based on a HP Protein Visualizer: An Interpretable Web-Based Tool for 3D Protein Folding Based on the Hydrophobic-Polar Model

1
Computer Science Department, Babes-Bolyai University, 400114 Cluj-Napoca, Romania
2
Informatics, Mathematics and Electronics Department, University ‘1 Decembrie 1918’ of Alba Iulia, 510009 Alba-Iulia, Romania
3
Doctoral School of Letters, Humanities and Applied Sciences, George Emil Palade University of Medicine, Pharmacy, Sciences and Technology of Targu Mures, 540088 Târgu Mureș, Romania
4
Department of Electrical Engineering and Information Technology, Faculty of Engineering and Information Technology, George Emil Palade University of Medicine, Pharmacy, Science, and Technology of Targu Mures, 540142 Târgu Mureș, Romania
This article belongs to the Special Issue Bio-Inspired Artificial Intelligence in Healthcare

Abstract

In this study, we used the hydrophobic-polar (HP) two-dimensional square and three-dimensional cubic lattice models for the problem of protein structure prediction (PSP). This kind of lattice reduces the computational time and the conformational space, from $9^n$ to $3^{n-2}$ for the 2D square lattice and $5^{n-2}$ for the 3D cubic lattice. Even within this context, it remains challenging for genetic algorithms or other metaheuristics to identify the optimal solutions. The contributions of the paper consist of: (1) the implementation of a high-performing novel genetic algorithm (GA): instead of considering only the self-avoiding walk (SAW) conformations approached in other work, we allow any conformation to appear in the population at all stages of the proposed all conformations biomimetic genetic algorithm (ACGA), which increases the probability of reaching good (self-avoiding) conformations with the lowest energy; (2) in addition to the classical crossover and mutation operators, we introduced specific translation operators for these two operations. We have also proposed and implemented an HP Protein Visualizer tool that analyses and optimises protein structures in the HP model and offers interpretability: the approach is hybrid in that the visualizer gives some insight into the algorithm. The resulting program, based on the research performed, provides a molecular modeling tool for studying protein folding, uses technologies such as Node.js, Express and p5.js for 3D rendering, and includes optimization algorithms to simulate protein folding.

1. Introduction

The molecular structure of a protein can be broken down hierarchically into four levels. The protein’s primary structure is its sequence of amino acids (AAs), the secondary structure is its local folding pattern, the tertiary structure is the globally folded form, frequently a globule, and the quaternary structure is its multimeric organization [].
Protein folding (PF) is the physical process by which a protein chain is translated into its native three-dimensional (3D) structure, typically a “folded” conformation in which the protein becomes biologically functional. This process is fundamental because only in a good tertiary structure (native conformation) can the protein become biologically functional [].
The purpose of PSP is to predict the tertiary structure from the primary structure, so the algorithms must predict the correct native conformation. Because the problem is extremely complex, several simplified protein models have been proposed to reduce the algorithms’ search space. Three approaches can be distinguished: (1) ab initio, starting only from the information contained in the primary structure of the proteins; (2) homology, using information from the primary structure and knowledge of the native conformations of similar sequences; and (3) protein threading (or fold recognition), using knowledge of the fold families [,].
The fundamental motivations of this work are: (1) Complexity of protein modelling: understanding protein structures is crucial to revealing biological functions and diseases associated with protein misfolding, such as Alzheimer’s disease []. The HP model provides a simplified yet robust framework for examining the basic principles of protein folding without the need for extensive computational resources. (2) Lack of accessible tools: most existing tools, such as Rosetta [] or GROMACS [], require advanced programming knowledge and are difficult to use.
The main objective of this work was to provide researchers, students, and bioinformatics professionals with a way to explore protein folding visually in three-dimensional space. Through its interface, HP Protein Visualizer supports the understanding of the relationship between amino acid sequences and the resulting 3D structures, an essential aspect of protein study and its applications in biotechnology and medicine.
Contribution to PSP. This paper describes a new biomimetic genetic algorithm used for solving PSP on the HP model. The algorithm considers, at all stages, populations formed of all configurations, without excluding the invalid (unfeasible) ones. A conformation is valid (feasible) if and only if it respects the self-avoiding walk (SAW) condition, which means that no two beads (i.e., amino acids represented as nodes placed on a 2D or 3D lattice) are placed in the same position on the lattice (thus avoiding overlaps).
Allowing invalid configurations in the populations is beneficial due to the chaotic behavior of the associated energy, increasing the chances of finding good partial solutions (valid configurations) generated through small changes applied to some invalid configurations.
Furthermore, the original approach includes, in addition to classical crossover (rotational) and mutation (rotational and diagonal) operators, the application of translation operators for both crossover and mutation operations.
An important and novel aspect of this work lies in the hybrid integration between the ACGA and the HP Protein Visualizer. The visual component offers interpretability of how genetic operators influence protein geometry and provides an interface for debugging, hypothesis testing, and exploratory purposes.
The remainder of the paper is organized as follows: Section 2 presents the related work in the domain of PSP, followed by Section 3, which describes the HP model and the PSP within heuristic techniques. Section 4 presents the hybrid algorithms, while Section 5 presents and analyzes the results obtained by applying the proposed algorithms to a set of popular benchmarks. Finally, Section 6 and Section 7 offer a conclusive analysis of the results and future work.

3. Protein Structure Prediction on the Hydrophobic-Polar Model

PSP on the HP model involves three stages: (1) choosing a suitable type of lattice (e.g., 2D-square, 2D-trigonal, 2D-rectangular, 3D-cubic, 3D-Face-Centered Cubic); suitability is defined either by the appropriateness to the structure of real proteins or by the higher probability to reach a solution; (2) choosing an arbitrary energy function to model the protein conformation on the lattice, making it similar to the real protein’s native conformation (i.e., forming the hydrophobic kernel/kernels and the polar surface); and (3) creating a new algorithm or modifying an existing one that can solve the protein folding problem in this lattice-based setting.
Two AAs placed on adjacent nodes in the lattice can be either sequence neighbors or topological neighbors. Sequence neighbors are neighbors in the protein’s primary structure, while topological neighbors are positioned on nearby nodes to form contacts.
The arbitrary energy function should be defined to maximize the number of contacts between hydrophobic AAs. This can be achieved by assigning a value of free energy for each type of contact, as follows: $e(H, H) = -1$; $e(H, P) = 0$; $e(P, H) = 0$; $e(P, P) = 0$. Specifically, H-H contacts are encouraged, while other types of contacts are neither encouraged nor penalized. An H-H contact $(h_i, h_j)$ between the i-th and j-th AA from the original sequence can be formed only if $|i - j| > 2$. Based on these criteria, the free energy, $E$, of a conformation $c$ is calculated as the sum of H-H contacts, taken with the minus sign (see Equation (2)):
$$E(c) = \sum_{\substack{i, j = 1 \\ |i - j| > 2}}^{n} e(a_i, a_j) \qquad (2)$$
where n is the sequence length. The negation is applied to ensure similarity with thermodynamic free energy.
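As an illustration, the energy of Equation (2) can be evaluated directly from lattice coordinates. The following is a minimal Python sketch under stated assumptions: the function name `hp_energy` and the coordinate-list input are hypothetical, each H-H contact contributes −1, and each unordered pair of beads is counted once (summing over ordered pairs would double the value).

```python
from itertools import combinations

def hp_energy(hp: str, coords: list[tuple[int, ...]]) -> int:
    """Return minus the number of H-H topological contacts (each pair counted once)."""
    energy = 0
    for i, j in combinations(range(len(hp)), 2):
        if j - i <= 2:  # the text requires |i - j| > 2 for a contact
            continue
        if hp[i] == "H" and hp[j] == "H":
            # Adjacent lattice nodes are at Manhattan distance 1.
            dist = sum(abs(a - b) for a, b in zip(coords[i], coords[j]))
            if dist == 1:
                energy -= 1
    return energy

# A 4-bead chain folded into a unit square: beads 0 and 3 form one H-H contact.
print(hp_energy("HPPH", [(0, 0), (1, 0), (1, 1), (0, 1)]))  # -1
```

The same function works for 2D and 3D coordinates, since the Manhattan distance is computed over however many components each coordinate tuple has.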
For the basic HP model, PSP problems on square lattices have been proved to be NP-complete problems by Crescenzi [] and Berger []. Based on these results, it has been concluded that all other problems on more general lattices and on three-dimensional ones are also NP-complete.
Problem definition. For computational reasons, a more formal and general definition of the k-dimensional problem is needed:
Let $\Lambda$ be a geometrical lattice in $\mathbb{R}^k$ and $HP = (h_1, h_2, \ldots, h_n)$ be a string of H and P letters ($h_i \in \{H, P\}$, $i \in \overline{1, n}$). The lattice is defined by:
$$\Lambda = \left\{ \sum_{i=1}^{k} b_i v_i \;\middle|\; b_i \in \mathbb{Z} \right\}$$
where $\{v_1, v_2, \ldots, v_k\}$ form a basis of the vector space $\mathbb{R}^k$.
For the 2D square lattice ($k = 2$) and the 3D cubic lattice ($k = 3$), the vectors $v_1, \ldots, v_k$ have to satisfy the following two conditions: (1) $v_i \perp v_j$ for all $i, j \in \overline{1, k}$ with $i \neq j$, and (2) $\|v_1\| = \cdots = \|v_k\| = 1$.
The HP string represents an AAs sequence in the HP model, where H corresponds to a hydrophobic AA, and P corresponds to a polar AA. To simplify the explanation, we refer to the modeled AAs as “beads.” In the sequence, each bead has exactly two neighbors, except for the first and last beads, which have only one neighbor (successor and predecessor, respectively). The distance between any two neighboring beads is equal to 1.
Conformations. A conformation is defined by the placement of the HP string on the lattice. The placement must adhere to the following restrictions: (1) only one bead can be placed in each position on the lattice; (2) the HP string cannot be broken: neighboring beads in the string should be placed in neighboring positions in the lattice.
A conformation corresponds to a walk in the lattice, representing the placement of the HP string on it. A conformation is feasible if and only if it respects the self-avoiding walk condition, which means that no two beads are placed in the same position on the lattice.
There are two common types of encodings used to represent a conformation. These encodings specify directions to follow the walk. In the first type, known as absolute encoding, the fold directions are relative to the lattice. In the second type, known as relative encoding, the fold directions are relative to the conformation itself.
  • For the 2D square lattice, the two encodings are denoted as follows:
    1. RULD string: used for absolute encoding, where the letters R, U, L, D stand for right, up, left, and down, respectively;
    2. SRL string: used for relative encoding, where S stands for straight, R for right, and L for left.
  • For the 3D cubic lattice, the two encodings are denoted as follows:
    1. RULDFB string: used for absolute encoding, where R, U, L, D have the same meaning as in the 2D square lattice, and the letters F, B stand for the front and back directions, respectively;
    2. SRLFB string: used for relative encoding, where the letters S, R, L, F, B retain their meaning from the points above.
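To make the absolute encoding concrete, the sketch below decodes a RULD or RULDFB string into lattice coordinates and checks the self-avoiding walk condition. It is a minimal illustration, not the authors' implementation: the names `STEPS`, `decode_absolute` and `is_self_avoiding` are hypothetical, and the assignment of letters to axes (R/L on x, U/D on y, F/B on z) is an assumption.

```python
# Unit steps for the absolute encoding; the axis assignment is an assumption.
STEPS = {
    "R": (1, 0, 0), "L": (-1, 0, 0),
    "U": (0, 1, 0), "D": (0, -1, 0),
    "F": (0, 0, 1), "B": (0, 0, -1),
}

def decode_absolute(conformation: str) -> list[tuple[int, int, int]]:
    """Map a RULD or RULDFB string of n-1 letters to n lattice coordinates."""
    coords = [(0, 0, 0)]  # the first bead is placed at the origin
    for letter in conformation:
        dx, dy, dz = STEPS[letter]
        x, y, z = coords[-1]
        coords.append((x + dx, y + dy, z + dz))
    return coords

def is_self_avoiding(coords) -> bool:
    """A walk is self-avoiding when no lattice node is visited twice."""
    return len(set(coords)) == len(coords)

print(decode_absolute("RUL"))                     # [(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0)]
print(is_self_avoiding(decode_absolute("RULD")))  # False: the walk returns to the origin
```

A 2D conformation is simply a walk whose z component stays at zero, so one decoder covers both lattice types.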
The free energy of a conformation is calculated based on Equation (2). It is a function of the HP string, the encoding string, and the lattice type, as can be seen in Equation (4).
$$E(c) = f(\text{HP\_string}, \text{encoding\_string}, \text{lattice\_type}) \qquad (4)$$
Problem requirement. For a given sequence (HP string), find the optimal configuration for which the free energy is minimal []. This implies that the H beads are positioned in the center of the lattice, and this configuration should directly correlate with the native configuration of the real protein. Therefore, the problem is a combinatorial optimization one [,], which can be solved using either deterministic or non-deterministic algorithms. In this case, we have chosen a non-deterministic GA. In all cases, the problem input is the HP string that represents the unfolded string of beads (Example: HPHPPHHPHPPHPHHPPHPH).
The output of the algorithm depends on the lattice type. For example, in the two-dimensional case, the output could be the RULD string (or SRL string), representing the folded string with the minimal energy configuration that satisfies the SAW condition. The objective function is defined by the energy equation (Equation (2)).

4. Optimisation Algorithms

Three different optimization techniques are used to find protein conformations with the lowest amount of energy.

4.1. Monte Carlo Simulation

Monte Carlo (MC) simulation is a stochastic method that explores the conformational space through random sampling []. It applies the Metropolis criterion [] to decide whether to accept or reject a structural mutation based on energy change. Our implementation includes custom move sets, such as crankshaft motions, translational moves and pivot (rotational) moves. An example of a Monte Carlo execution result is displayed in Figure 1.
Figure 1. Monte Carlo simulation result for a 20-amino-acid HP sequence. Note: The figure shows the Monte Carlo algorithm after completing 10,000,000 iterations for a sequence of 20 amino acids. The input data for both the visualizer and the algorithm (HP sequence, Coordinates type, and Lattice type) are displayed in the upper-right panel. This panel also contains the optimal solution returned by the algorithm, represented by the RULDFB string. The canvas area displays the optimal conformation found by the algorithm. Hydrophobic amino acids, shown in red, are grouped in the central region of the protein, forming a hydrophobic kernel, while polar amino acids, shown in blue, are located in the peripheral region. The algorithm temperature is maintained constant at 1 degree.
The system was evaluated with 10,000,000 iterations and a constant temperature of 1.0. A new conformation $C'$ is accepted probabilistically, according to:
$$P(C \to C') = \begin{cases} 1, & \text{if } \Delta E \le 0 \\ e^{-\Delta E / T}, & \text{if } \Delta E > 0 \end{cases}$$
where $\Delta E = E_{\text{new}} - E_{\text{old}}$ is the energy difference, and $T$ is the constant temperature.
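The acceptance rule can be written as a small function. This is a minimal sketch of the Metropolis criterion as stated above; the name `metropolis_accept` and the explicit random generator argument are hypothetical choices made for illustration.

```python
import math
import random

def metropolis_accept(delta_e: float, temperature: float, rng: random.Random) -> bool:
    """Accept downhill or neutral moves always, uphill moves with probability exp(-dE/T)."""
    if delta_e <= 0:
        return True
    return rng.random() < math.exp(-delta_e / temperature)

rng = random.Random(42)
trials = 10_000
accepted = sum(metropolis_accept(1.0, 1.0, rng) for _ in range(trials))
# At T = 1, a move that raises the energy by 1 should be accepted with
# probability exp(-1), roughly 0.37 of the time.
print(accepted / trials)
```

Passing the generator explicitly keeps runs reproducible, which is convenient when comparing energy traces across parameter settings.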
The analysis of the energy profile in Figure 2 reveals the following characteristics: the x-axis corresponds to the iteration number, while the y-axis represents the energy level multiplied by two. The minimum energy achieved is $-2$ in this case. The algorithm displays a dynamic behaviour with significant fluctuations between 0 and −2, indicating a more in-depth exploration of the solution space, with abrupt transitions that indicate the algorithm’s ability to escape local optima.
Figure 2. Monte Carlo Energy evolution.
Algorithm 1 implements the Metropolis Monte Carlo (MC) strategy for exploring conformational space in the HP lattice protein folding model. The input consists of a binary HP sequence $S = (s_1, s_2, \ldots, s_n)$, where each monomer $s_i$ is either hydrophobic (H) or polar (P), along with the number of iterations and the lattice type (2D square or 3D cubic). The algorithm begins by initializing a valid conformation $C_0$ and computing its energy $E(C_0)$, which serves as the initial reference.
At each iteration, a new candidate conformation $C'$ is generated via a mutation operator. If $C'$ is valid (i.e., free of overlaps), its energy $E'$ is computed and compared to the previous energy $E_{i-1}$. If the energy difference $\Delta E = E' - E_{i-1}$ is non-positive, the new conformation is accepted. Otherwise, it is accepted with probability $p = \exp(-\Delta E / T)$, where $T$ is the system temperature: a random number $r \in [0, 1]$ is drawn, and $C'$ is accepted if $r < p$. If rejected, the previous conformation is retained.
The algorithm iteratively optimizes the conformation, storing the best solution $C^*$ encountered. Accepting some energy-increasing moves allows escape from local minima. The output is the best conformation string found, encoded in RULDFB notation.
Algorithm 1 Metropolis Monte Carlo Algorithm for HP lattice Model
1: Input: $S$, iterations, lattice_type     ▹ HP sequence: $S = (s_1, s_2, \ldots, s_n)$ with $s_i \in \{H, P\}$
2: Output: $C^*$     ▹ RULDFB string (best conformation found)
3: Initialize a valid conformation $C_0$
4: Compute energy of $C_0$: $E(C_0)$, $C^* \leftarrow C_0$
5: $E_0 \leftarrow E(C_0)$
6: for $i = 1$ to iterations do
7:     $C' \leftarrow Mutation(C_{i-1})$     ▹ Generate a new conformation, $C'$
8:     if $C'$ is valid then     ▹ Check for overlaps
9:         Compute energy $E' \leftarrow E(C')$
10:         $\Delta E \leftarrow E' - E_{i-1}$
11:         if $\Delta E \le 0$ then
12:             Accept $C'$: $C_i \leftarrow C'$, $E_i \leftarrow E'$, $C^* \leftarrow C'$
13:         else
14:             $p \leftarrow \exp(-\Delta E / T)$     ▹ Compute acceptance probability
15:             Generate random number $r \in [0, 1]$
16:             if $r < p$ then
17:                 Accept $C'$: $C_i \leftarrow C'$, $E_i \leftarrow E'$
18:             else
19:                 Reject $C'$: $C_i \leftarrow C_{i-1}$, $E_i \leftarrow E_{i-1}$
20:             end if
21:         end if
22:     else
23:         Reject $C'$: $C_i \leftarrow C_{i-1}$, $E_i \leftarrow E_{i-1}$
24:     end if
25: end for
26: return $C^*$
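The loop of Algorithm 1 can be sketched in runnable form as follows. This is a hedged illustration rather than the authors' implementation: the helper names (`decode`, `energy`, `monte_carlo`) are hypothetical, the move set is reduced to redrawing a single letter of the absolute conformation string (standing in for the crankshaft, translational and pivot moves), each H-H pair is counted once in the energy, and the best conformation is updated only when the energy strictly improves on the best value seen so far.

```python
import math
import random

STEPS = {"R": (1, 0, 0), "L": (-1, 0, 0), "U": (0, 1, 0),
         "D": (0, -1, 0), "F": (0, 0, 1), "B": (0, 0, -1)}

def decode(conf):
    """Turn an absolute conformation string into lattice coordinates."""
    coords = [(0, 0, 0)]
    for letter in conf:
        coords.append(tuple(a + b for a, b in zip(coords[-1], STEPS[letter])))
    return coords

def energy(hp, coords):
    """Minus the number of H-H contacts between beads with |i - j| > 2."""
    e = 0
    for i in range(len(hp)):
        for j in range(i + 3, len(hp)):
            if hp[i] == hp[j] == "H":
                if sum(abs(a - b) for a, b in zip(coords[i], coords[j])) == 1:
                    e -= 1
    return e

def monte_carlo(hp, iterations, letters="RULD", temperature=1.0, seed=0):
    rng = random.Random(seed)
    current = "R" * (len(hp) - 1)           # a straight chain is always valid
    e_current = energy(hp, decode(current))
    best, e_best = current, e_current
    for _ in range(iterations):
        # Hypothetical move: redraw one letter of the conformation string.
        g = rng.randrange(len(current))
        candidate = current[:g] + rng.choice(letters) + current[g + 1:]
        coords = decode(candidate)
        if len(set(coords)) != len(coords):
            continue                         # overlap: reject, keep the current state
        e_candidate = energy(hp, coords)
        delta = e_candidate - e_current
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current, e_current = candidate, e_candidate
            if e_current < e_best:           # remember the best state seen so far
                best, e_best = current, e_current
    return best, e_best

best, e_best = monte_carlo("HPHPPHHPHPPHPHHPPHPH", iterations=5000)
print(best, e_best)
```

Passing `letters="RULDFB"` switches the same loop to the 3D cubic lattice, since the decoder already handles the F and B directions.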

4.2. Simulated Annealing

Simulated Annealing (SA) extends the Monte Carlo approach by progressively lowering the temperature to reduce the acceptance probability of suboptimal solutions []. This mimics the annealing process in metallurgy. We use exponential cooling and adaptive temperature steps based on energy fluctuations. Figure 3 shows the results of a sample Simulated Annealing run.
Figure 3. Simulated annealing simulation result for an 84-amino-acid HP sequence. Note: The user interface is almost identical to that of the Monte Carlo implementation, with the addition of the temperature parameter as an input for the algorithm. In this example, the initial temperature is set to 1 degree and gradually decreases throughout the iterations. The SA algorithm was executed on an HP sequence of 84 amino acids using a 3D cubic lattice. It is important to note that the lattice type parameter should not be changed from the graphical interface, since the difference between the 2D square and 3D cubic lattices lies only in the fact that, in the 3D case, the additional move directions F (Forward) and B (Backward) are available in the RULDFB conformation string.
The SA algorithm also ran for 1,000,000 iterations, starting from an initial temperature $T_0 = 1.0$, which was gradually decreased using an exponential cooling schedule:
$$T_i = T_0 \cdot \alpha^i, \qquad \alpha \approx 0.95$$
where $T_i$ is the temperature at iteration $i$; $T_0$ is the initial temperature; $\alpha$ is the cooling rate, and $i$ is the iteration index.
According to the energy analysis in Figure 4, the following observations were made: the x-axis corresponds to the iteration number, while the y-axis represents the energy level multiplied by two. The minimum energy achieved is $-2$. The dynamic behaviour shows a transition from $-1$ to $-2$ without abrupt jumps, and a noticeable plateau around $-1$, indicating prolonged exploration in that region. The final structure has increased linearity with fewer compact hydrophobic interactions, suggesting a risk of being trapped in a local optimum.
Figure 4. Simulated Annealing Energy evolution.
Algorithm 2 implements the SA optimisation strategy for exploring the conformational space of protein sequences modelled using the HP lattice framework. The input comprises an HP sequence $S = (s_1, s_2, \ldots, s_n)$, where each residue $s_i$ belongs to the set $\{H, P\}$, alongside the initial temperature $T_0$, cooling rate $\alpha$, number of iterations, and the lattice type.
The procedure begins by generating a valid initial conformation $C_0$ and computing its associated energy $E(C_0)$, which serves as the reference state. At each iteration, a new candidate conformation $C'$ is produced via a random mutation applied to the current conformation. If $C'$ is valid (i.e., free from overlaps and maintaining chain connectivity), its energy $E'$ is evaluated and compared to the previous energy $E_{i-1}$.
If the energy difference $\Delta E = E' - E_{i-1}$ is non-positive, the candidate is accepted unconditionally. Otherwise, it is accepted probabilistically with $p = \exp(-\Delta E / T_i)$, where $T_i = T_0 \cdot \alpha^i$ represents the temperature at iteration $i$ according to an exponential cooling schedule. A random number $r \in [0, 1]$ is drawn, and $C'$ is accepted if $r < p$; otherwise, the previous conformation is retained.
Throughout the iterative process, the algorithm tracks the best conformation C * encountered. This stochastic approach enables efficient exploration of the solution space, balancing exploitation of low-energy states with the ability to escape local minima. The final output is the conformation with the lowest energy identified during the search.
As illustrated in Algorithm 2, the key difference between SA and MMC lies in the temperature schedule. While both methods start with an initial temperature set to 1, SA applies an exponential decay across iterations (see line 13 in the algorithm). This adjustment balances the rate between exploration and exploitation. Initially, exploration is encouraged by accepting solutions or conformations with higher energy than the parent conformation. Towards the end of the algorithm, the acceptance probability decreases significantly, thereby promoting exploitation of the combinatorial search space.
Algorithm 2 Simulated Annealing for HP Lattice Model
1: Input: $S$, $T_0$, $\alpha$, iterations, lattice_type     ▹ $S$ - HP string, $T_0$ - initial temperature
2: Initialize a valid conformation $C_0$
3: Compute energy of $C_0$: $E(C_0)$, $C^* \leftarrow C_0$
4: $E_0 \leftarrow E(C_0)$
5: for $i = 1$ to iterations do
6:     $C' \leftarrow Mutation(C_{i-1})$     ▹ Generate a new conformation, $C'$, by random mutation
7:     if $C'$ is valid then     ▹ Check for overlaps and chain connectivity
8:         Compute energy $E' \leftarrow E(C')$
9:         $\Delta E \leftarrow E' - E_{i-1}$
10:         if $\Delta E \le 0$ then
11:             Accept $C'$: $C_i \leftarrow C'$, $E_i \leftarrow E'$, $C^* \leftarrow C'$
12:         else
13:             $T_i \leftarrow T_0 \cdot \alpha^i$
14:             $p \leftarrow \exp(-\Delta E / T_i)$     ▹ Compute acceptance probability
15:             Generate random number $r \in [0, 1]$
16:             if $r < p$ then
17:                 Accept $C'$: $C_i \leftarrow C'$, $E_i \leftarrow E'$
18:             else
19:                 Reject $C'$: $C_i \leftarrow C_{i-1}$, $E_i \leftarrow E_{i-1}$
20:             end if
21:         end if
22:     else
23:         Reject $C'$: $C_i \leftarrow C_{i-1}$, $E_i \leftarrow E_{i-1}$
24:     end if
25: end for
26: return $C^*$
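A runnable counterpart of Algorithm 2 might look like the sketch below. It is an illustration under assumptions, not the authors' code: the helper names are hypothetical, the mutation is a single-letter redraw in the absolute encoding, each H-H pair is counted once, and a guard is added for the case where $\alpha^i$ underflows to zero in floating point, which happens for large $i$ when $\alpha = 0.95$.

```python
import math
import random

STEPS = {"R": (1, 0, 0), "L": (-1, 0, 0), "U": (0, 1, 0),
         "D": (0, -1, 0), "F": (0, 0, 1), "B": (0, 0, -1)}

def decode(conf):
    coords = [(0, 0, 0)]
    for letter in conf:
        coords.append(tuple(a + b for a, b in zip(coords[-1], STEPS[letter])))
    return coords

def energy(hp, coords):
    """Minus the number of H-H contacts between beads with |i - j| > 2."""
    return -sum(
        1
        for i in range(len(hp))
        for j in range(i + 3, len(hp))
        if hp[i] == hp[j] == "H"
        and sum(abs(a - b) for a, b in zip(coords[i], coords[j])) == 1
    )

def accept(delta, temperature, rng):
    if delta <= 0:
        return True
    if temperature <= 0.0:       # alpha**i can underflow to zero for large i
        return False
    return rng.random() < math.exp(-delta / temperature)

def simulated_annealing(hp, iterations, t0=1.0, alpha=0.95, letters="RULD", seed=0):
    rng = random.Random(seed)
    current = "R" * (len(hp) - 1)            # straight chain: a valid start
    e_current = energy(hp, decode(current))
    best, e_best = current, e_current
    for i in range(1, iterations + 1):
        temperature = t0 * alpha ** i        # exponential cooling schedule
        g = rng.randrange(len(current))      # hypothetical single-letter move
        candidate = current[:g] + rng.choice(letters) + current[g + 1:]
        coords = decode(candidate)
        if len(set(coords)) != len(coords):
            continue                          # invalid (overlapping) candidate
        e_candidate = energy(hp, coords)
        if accept(e_candidate - e_current, temperature, rng):
            current, e_current = candidate, e_candidate
            if e_current < e_best:
                best, e_best = current, e_current
    return best, e_best

print(simulated_annealing("HPHPPHHPHPPHPHHPPHPH", iterations=5000))
```

With a cooling rate this close to the value quoted in the text, the temperature becomes negligible after a few hundred iterations, after which the loop behaves like a greedy descent that accepts only non-worsening moves.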

4.3. ACGA Algorithm

The proposed ACGA is a population-based method that evolves a set of protein conformations using selection, crossover, and mutation operations. The fitness function incorporates both energy minimisation and structural compactness metrics. ACGA involves the following steps:
1. Population Initialization: First, a population of potential solutions is created. Each solution, referred to as a chromosome, is randomly generated and has an associated objective function value called fitness.
2. Exploration Stage: The mutation and crossover operators are applied to a certain percentage of the chromosomes in the population, which are chosen randomly. These operators ensure the dispersion of the population in the space of possible solutions, promoting exploration of the solution space.
3. Exploitation Stage: Through the selection operation, a certain percentage of individuals with the best fitness are chosen from the population. This process creates a new population, which is usually statistically better and represents the next generation of potential solutions. The selection operation helps exploit the combinatorial space by favoring the fitter individuals for reproduction.
Steps (2) and (3) are iterated for a number of generations, allowing the population to evolve and improve over time. The mutation and crossover operators contribute to the exploration stage, while the selection operator contributes to the exploitation stage of the genetic algorithm.
Chromosomes encoding. The conformation of the modeled proteins can be encoded using either relative or absolute directions. We have chosen to use both in our approach to exploit the advantages of each encoding. The benefits of using relative encoding are as follows: (a) a smaller combinatorial space compared to absolute encoding (relative: $3^n$; absolute: $4^n$); (b) implicit avoidance of returning the current AA to the previous AA position in the walk during the creation of the initial population; (c) mutation and crossover operators do not require modifications of the letters that specify the next positions. On the other hand, absolute encoding allows easy conversion into Cartesian coordinates.
Therefore, we have employed both the absolute and relative encodings, resulting in the following corresponding representations (strings):
  • HP string
  • RULD string—absolute 2D square
  • SRL string—relative 2D square
  • RULDFB string—absolute 3D cubic
  • SRLFB string—relative 3D cubic
The HP string is a sequence string. We refer generically to the RULD, SRL, RULDFB and SRLFB strings as conformation strings.
For computational efficiency reasons, the exploration is performed using the relative encoding, which reduces the conformation space from $4^{n-1}$ to $3^{n-1}$. However, the corresponding absolute encoding is also stored to enable easy and fast computation of the Cartesian coordinates. The size of the sequence string is equal to $n$, and the size of the conformation strings is equal to $n - 1$, with each letter representing the relative successive direction in the conformation, where $n$ is the number of AAs of the sequence.
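The conversion from the relative to the absolute encoding on the 2D square lattice can be sketched as below. The names `HEADINGS`, `TURN` and `relative_to_absolute` are hypothetical, and the turning convention is an assumption: the walk initially faces R, a left turn (L) rotates the heading counterclockwise, and a right turn (R) rotates it clockwise.

```python
HEADINGS = "RULD"                      # absolute directions in counterclockwise order
TURN = {"S": 0, "L": 1, "R": -1}       # relative letter -> change of heading index

def relative_to_absolute(srl: str) -> str:
    """Convert an SRL string into the corresponding RULD string."""
    heading = 0                        # start facing R, so a leading 'S' becomes 'R'
    out = []
    for letter in srl:
        heading = (heading + TURN[letter]) % 4
        out.append(HEADINGS[heading])
    return "".join(out)

print(relative_to_absolute("SLLS"))    # RULL
print(relative_to_absolute("SRR"))     # RDL
```

Because a relative letter can only keep the heading or turn it by a quarter, the converted walk never steps straight back onto the previous bead, which is the property listed as benefit (b) of the relative encoding.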
Generation of the initial population. The sequence string represents the input data, and the conformation string is the output of the ACGA algorithm. The primary structure of the protein is represented by the HP string (sequence string) of $n$ letters corresponding to the $n$ AAs of a sequence.
Below are the steps for building a conformation (chromosome) using SRL string representation in the population initialization stage:
1. Set i = 1. Initialize SRL[i] = ‘S’.
2. Set i = i + 1. If $i \le n - 1$, continue with the next step. Otherwise, the conformation is completely generated in the SRL string.
3. Choose a random direction ‘d’ from {S, R, L}, set SRL[i] = d, and return to step 2.
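The steps above can be sketched in a few lines. The function name `random_srl` is hypothetical; the sketch assumes that the conformation string has $n - 1$ letters and that every letter after the first is drawn uniformly from {S, R, L}.

```python
import random

def random_srl(n: int, rng: random.Random) -> str:
    """Random relative conformation string of n - 1 letters, starting with 'S'."""
    return "S" + "".join(rng.choice("SRL") for _ in range(n - 2))

rng = random.Random(0)
population = [random_srl(20, rng) for _ in range(5)]
for chromosome in population:
    print(chromosome)
```

Chromosomes generated this way may still contain collisions between non-consecutive beads, which is consistent with the approach of keeping all conformations in the population.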
For the 3D case, a similar construction is used based on the SRLFB string. Then, the SRL string is converted to the RULD string. The first letter, which is S, is always converted to the letter R. This reduces the exponential (base 4) combinatorial space by a factor of four. After that, the RULD string is converted to an array of Cartesian coordinates. Based on these Cartesian coordinates, the number of collisions, the number of contacts, and the fitness are computed. The formulas used for finding an H-H contact in the 2D square and 3D cubic lattices are as follows.
For the 2D square lattice:
If $|x_i - x_j| + |y_i - y_j| = 1$, where $x_i, y_i$ are the Cartesian coordinates of the i-th AA, and $x_j, y_j$ are the Cartesian coordinates of the j-th AA, then there is a contact between the two AAs at positions $i$ and $j$, $i, j \in \overline{1, n}$.
For the 3D cubic lattice:
If $|x_i - x_j| + |y_i - y_j| + |z_i - z_j| = 1$, where $x_i, y_i, z_i$ are the Cartesian coordinates of the i-th AA, and $x_j, y_j, z_j$ are the Cartesian coordinates of the j-th AA, then there is a contact between the two AAs at positions $i$ and $j$, $i, j \in \overline{1, n}$.
For finding a collision (two AAs in the same place), we use the following equations, where the terms have the same meaning as above. The differences are taken in absolute value, so that the sum vanishes only when all coordinates coincide:
For the 2D square lattice, if $i \neq j$: $|x_i - x_j| + |y_i - y_j| = 0$;
For the 3D cubic lattice, if $i \neq j$: $|x_i - x_j| + |y_i - y_j| + |z_i - z_j| = 0$.
If these equations are satisfied, there is a collision between the AAs at positions $i$ and $j$ in the conformation.
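The contact and collision tests can be illustrated as follows. The function names are hypothetical, and the way collisions are counted (one per pair of beads sharing a node) is an assumption, since the text does not state whether pairs or surplus beads are counted. The second collision counter shows that the same number can be obtained without comparing every pair.

```python
from collections import Counter
from itertools import combinations

def collisions_pairwise(coords):
    """Naive check: a pair collides when the sum of absolute differences is zero."""
    return sum(
        1
        for p, q in combinations(coords, 2)
        if sum(abs(a - b) for a, b in zip(p, q)) == 0
    )

def collisions_hashed(coords):
    """Same count via hashing: k beads on one node form k*(k-1)/2 colliding pairs."""
    return sum(k * (k - 1) // 2 for k in Counter(coords).values())

def in_contact(p, q):
    """Topological neighbours sit at Manhattan distance 1 on the lattice."""
    return sum(abs(a - b) for a, b in zip(p, q)) == 1

walk = [(0, 0), (1, 0), (1, 1), (0, 1), (0, 0)]  # the last bead returns to the origin
print(collisions_pairwise(walk), collisions_hashed(walk))  # 1 1
print(in_contact(walk[0], walk[3]))                        # True
```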
Figure 5 presents two conformations: on the left side, there is a SAW conformation, and on the right side, there is a conformation that has one collision (non-SAW conformation).
Figure 5. Conformations on 2D Lattice.
Fitness Evaluation Strategy. The fitness function evaluates how close a given chromosome is to the optimum solution, that is, how fit the chromosome is. We have used a fitness function (see Equation (5)) inspired by the code of Alican Toprak (https://github.com/alican/GeneticAlgorithm, accessed on 6 November 2025).
$$Fitness(c) = \frac{E(c) \cdot 100 + 1}{collisions(c)^2} \qquad (5)$$
where $c$ is the conformation, $E(c)$ is the number of contacts, and $collisions(c)$ is the number of collisions of the conformation. The fitness is computed by checking the topological neighborhood of all AAs on the lattice, according to the number of contacts (Equation (2)) and the number of collisions. There are two exceptions: if $collisions(c) = 0$, the formula becomes $Fitness(c) = E(c) \cdot 100 + 1$, and if $collisions(c) = 1$, then $collisions(c)$ is replaced with 2. Thus, the fitness increases with the number of contacts and is strongly penalized by the number of collisions.
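A small sketch of the fitness computation follows, reading Equation (5) as the ratio $(E(c) \cdot 100 + 1) / collisions(c)^2$ together with the two exceptions described above. The function name and its two integer arguments are hypothetical, and the ratio reading itself is an assumption inferred from the stated exceptions.

```python
def fitness(contacts: int, collisions: int) -> float:
    """Fitness of a conformation from its H-H contact and collision counts."""
    if collisions == 0:
        return contacts * 100 + 1      # valid (SAW) conformation: no penalty
    if collisions == 1:
        collisions = 2                 # a divisor of 1 would not penalize at all
    return (contacts * 100 + 1) / collisions ** 2

print(fitness(3, 0))   # 301
print(fitness(3, 1))   # 75.25
print(fitness(3, 3))
```

The added 1 in the numerator keeps the fitness positive for conformations without contacts, so collision-free chromosomes still rank above colliding ones with the same number of contacts.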
Adapted tournament selection. We proposed an adapted variant of tournament selection, which increases the probability of individuals with low energy values entering the next generation, while implicitly conserving the best individual. Specifically, the selection is applied to the previous population by choosing pairs of chromosomes at random, and after comparing them, the best one is copied into the position of the worst one. This way, the best chromosome is preserved through the generations.
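The pairwise copy step of this selection scheme can be sketched as follows. The function name `adapted_tournament`, the parallel-list representation and the `rounds` parameter are hypothetical; the text does not specify how many pairs are drawn per generation.

```python
import random

def adapted_tournament(population, fitness, rounds, rng):
    """Draw random pairs and copy the fitter chromosome over the weaker one, in place."""
    for _ in range(rounds):
        a, b = rng.sample(range(len(population)), 2)
        winner, loser = (a, b) if fitness[a] >= fitness[b] else (b, a)
        population[loser] = population[winner]
        fitness[loser] = fitness[winner]

rng = random.Random(0)
population = ["RRUL", "RULD", "RRRR", "RUUL"]
scores = [101.0, 0.25, 1.0, 1.0]
adapted_tournament(population, scores, rounds=3, rng=rng)
print(population, scores)
```

Since a chromosome is only ever overwritten by one that is at least as fit, the highest fitness in the population cannot decrease, which is how the best individual is implicitly conserved.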
Crossover. In addition to the rotational crossover used in our previous work [], we apply the translational crossover. For both types of crossover operators, the best chromosome ($C^*$) from the current population is protected. The crossover operation is performed using the following formula:
$$C_{t+1} = Crossover(C(t), D(t), C^*)$$
where $C^*$ is the best chromosome from the current population, and $C(t)$ and $D(t)$ are parent chromosomes. The reason for introducing $C^*$ as a parameter of the crossover operator is to protect this chromosome. Figure 6 shows the translational crossover.
Figure 6. Translational crossover on the 2D Lattice—exemplification.
Mutation. We employ translational, rotational, and diagonal mutations. Given a chromosome $C = [d_1, d_2, \ldots, d_n]$, where $d_i \in \{R, U, L, D\}$ for 2D (or $d_i \in \{R, U, L, D, F, B\}$ for 3D), it is mutated to a new chromosome, $C'$. To achieve this, a position $g$ ($1 \le g \le n$), known as the mutation point, is randomly chosen for each conformation. The letter at position $g$ is then replaced by one letter sampled uniformly from the set of possible directions.
For the rotational mutation, the modification is applied to the SRL string, and the next letter after the $g$ point remains unchanged. Then, the SRL string is converted to the RULD string. This modification produces a rotation of the second part of the chain by 90°, 180°, or 270°. In the case of translational mutation, the modification is applied to the RULD string, and the next letter after the $g$ point remains unchanged. Figure 7 shows the two types of mutation. A diagonal move is executed on the two letters of the RULD string that form a corner. Finally, the mutation operation is performed using the following formula:
$$C_{t+1} = Mutation(C(t), C^*)$$
where $C^*$ is the best chromosome from the current population and $t$ represents the iteration number (time). The reason for introducing $C^*$ as a parameter of the mutation operator is to protect it.
Figure 7. Rotational and translational mutation on the 2D Lattice – exemplification.
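The three mutation types can be illustrated on 2D RULD strings with a minimal sketch. It rests on several assumptions: the rotational mutation is implemented by rotating the tail letters directly, which has the same geometric effect as changing one relative letter and converting back; the diagonal move is taken to be a swap of two perpendicular neighbouring letters; positions are 0-based; and all function names are hypothetical.

```python
ABS = "RULD"  # counter-clockwise order, so index + 1 is a 90-degree left turn

def translational_mutation(ruld, g, new_letter):
    """Replace the letter at index g; later letters keep their absolute
    direction, so the rest of the chain is translated."""
    return ruld[:g] + new_letter + ruld[g + 1:]

def rotational_mutation(ruld, g, quarter_turns):
    """Rotate every letter from index g onward by quarter_turns * 90 degrees
    counter-clockwise (1, 2 or 3 quarter turns)."""
    tail = "".join(ABS[(ABS.index(d) + quarter_turns) % 4] for d in ruld[g:])
    return ruld[:g] + tail

def diagonal_mutation(ruld, g):
    """Flip the corner formed by letters g and g + 1 (requires g < len - 1).

    Only perpendicular pairs such as "RU" form a corner; other pairs are
    returned unchanged.
    """
    a, b = ruld[g], ruld[g + 1]
    if (ABS.index(a) - ABS.index(b)) % 2 == 1:
        return ruld[:g] + b + a + ruld[g + 2:]
    return ruld

print(translational_mutation("RRUU", 1, "U"))  # RUUU
print(rotational_mutation("RRUU", 2, 1))       # RRLL
print(diagonal_mutation("RURD", 0))            # URRD
```

In a full run the mutation point g would be drawn at random, and the result may collide with itself, which ACGA permits at every stage.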
The algorithm. In every generation, the following operations are executed in order: (a) rotational crossover, (b) translational crossover, (c) translational mutation, (d) rotational mutation, (e) diagonal mutation and (f) tournament selection. After iterating over all generations, the algorithm returns the best conformation obtained.
Pseudocode of the ACGA algorithm skeleton is given in Algorithm 3. As can be seen, the stopping criterion of the algorithm consists of reaching the number of generations, given as an input parameter.
Algorithm 3 All Conformations Genetic Algorithm (ACGA)
Input: $population\_size$, $generations$, $HPseq$, $lattice\_type$
Output: $C^{*}$ (RULD string)
Initialization of the population $P_i$ ($i = 1, 2, \ldots, n$)
Compute fitness of each conformation conf (Equation (5))
Adapted tournament selection
$C^{*} \leftarrow$ the best conformation
$t \leftarrow 0$
while ($t < generations$) do
    for (every chosen conformation) do
        $C_{t+1} \leftarrow \mathrm{Crossover}(C_t, D_t, C^{*})$
        $C_{t+1} \leftarrow \mathrm{Mutation}(C_t, C^{*})$
    end for
    Compute the fitness of modified chromosome
    Adapted_tournament_selection
    $C^{*} \leftarrow$ the best conformation
    $t \leftarrow t + 1$
end while
return $C^{*}$
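The skeleton of Algorithm 3 can be made concrete with a small, heavily simplified Python sketch for the 2D square lattice. Several parts are assumptions rather than the authors' method: the fitness is a hypothetical stand-in for Equation (5) (H-H contacts minus a penalty per overlapping pair), only translational crossover and mutation are used, the selection is a plain two-way tournament instead of the adapted one, and $C^{*}$ is protected by skipping the operators on it and re-inserting it after selection. The rates 0.4 and 0.3 are loosely borrowed from the parameter list in Section 5.

```python
import random

MOVES = {"R": (1, 0), "U": (0, 1), "L": (-1, 0), "D": (0, -1)}
LETTERS = "RULD"

def fitness(hp, ruld, penalty=2):
    """Hypothetical fitness (higher is better): H-H contacts minus a penalty
    for every pair of residues sharing a lattice site."""
    x = y = 0
    coords = [(0, 0)]
    for d in ruld:
        x, y = x + MOVES[d][0], y + MOVES[d][1]
        coords.append((x, y))
    score = 0
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            dist = abs(coords[i][0] - coords[j][0]) + abs(coords[i][1] - coords[j][1])
            if dist == 0:
                score -= penalty
            elif dist == 1 and j > i + 1 and hp[i] == hp[j] == "H":
                score += 1
    return score

def tournament(population, scores, rng, k=2):
    """Plain k-way tournament; the paper's adapted variant is not reproduced."""
    picks = [rng.randrange(len(population)) for _ in range(k)]
    return population[max(picks, key=lambda i: scores[i])]

def acga_sketch(hp, population_size=50, generations=30, p_cross=0.4, p_mut=0.3, seed=0):
    rng = random.Random(seed)
    m = len(hp) - 1  # assumed encoding: one direction letter per bond
    # Any conformation is allowed, including non-self-avoiding ones.
    population = ["".join(rng.choice(LETTERS) for _ in range(m))
                  for _ in range(population_size)]
    best = max(population, key=lambda c: fitness(hp, c))
    for _ in range(generations):
        offspring = []
        for c in population:
            if c != best and rng.random() < p_cross:   # translational crossover
                mate = rng.choice(population)
                cut = rng.randrange(1, m)
                c = c[:cut] + mate[cut:]
            if c != best and rng.random() < p_mut:     # translational mutation
                g = rng.randrange(m)
                c = c[:g] + rng.choice(LETTERS) + c[g + 1:]
            offspring.append(c)
        scores = [fitness(hp, c) for c in offspring]
        best = offspring[max(range(len(offspring)), key=scores.__getitem__)]
        # Keep the best conformation, fill the rest by tournament selection.
        population = [best] + [tournament(offspring, scores, rng)
                               for _ in range(population_size - 1)]
    return best, fitness(hp, best)

if __name__ == "__main__":
    print(acga_sketch("HPHPPHHPHPPHPHHPPHPH"))
```

Because the best conformation is never modified and is always carried over, its fitness cannot decrease from one generation to the next in this sketch.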

4.4. Computational Complexity Analysis of ACGA

The computational cost of the ACGA algorithm can be expressed as a function of:
  • n: number of amino acids in the sequence (length of HP string);
  • P: population size;
  • G: number of generations;
  • $O_{fit}$: cost of computing the fitness for a single conformation.
Time complexity. In each generation, the algorithm performs the following dominant steps:
1. Fitness evaluation: For each of the P individuals, the number of hydrophobic–hydrophobic (H–H) contacts and collisions is computed. In the naive implementation, this requires pairwise checks between amino acids, leading to $O(n^2)$ time per individual. Thus, the fitness evaluation per generation costs $O(P \cdot n^2)$.
2. Genetic operators: Crossover and mutation operate on conformation strings of length n, requiring $O(n)$ per operation. As a constant fraction of the population is modified in each generation, the total cost for genetic operators is $O(P \cdot n)$, which is asymptotically dominated by the $O(P \cdot n^2)$ fitness term for large n.
3. Selection: The adapted tournament selection compares pairs of individuals, with $O(P)$ comparisons per generation.
Combining these, the overall time complexity is:
$T_{\mathrm{ACGA}} = O(G \cdot P \cdot n^2)$
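The pairwise check behind the $O(n^2)$ term can be sketched as follows for the 2D square lattice. The function names and the choice to return contacts and collisions separately are assumptions; how the two counts are combined into the fitness of Equation (5) is not reproduced here. A dictionary-based variant is included for comparison; it is not described in the paper and runs in roughly linear time when few residues share a site.

```python
MOVES = {"R": (1, 0), "U": (0, 1), "L": (-1, 0), "D": (0, -1)}

def coordinates(ruld):
    """Walk the RULD string from the origin, one lattice site per residue."""
    x = y = 0
    coords = [(0, 0)]
    for d in ruld:
        x, y = x + MOVES[d][0], y + MOVES[d][1]
        coords.append((x, y))
    return coords

def contacts_and_collisions_naive(hp, ruld):
    """O(n^2): inspect every pair of residues once."""
    coords = coordinates(ruld)
    contacts = collisions = 0
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            dist = abs(coords[i][0] - coords[j][0]) + abs(coords[i][1] - coords[j][1])
            if dist == 0:
                collisions += 1                      # two residues on one site
            elif dist == 1 and j > i + 1 and hp[i] == hp[j] == "H":
                contacts += 1                        # non-bonded H-H neighbours
    return contacts, collisions

def contacts_and_collisions_hashed(hp, ruld):
    """Same counts using a site -> residue-indices dictionary."""
    coords = coordinates(ruld)
    occupancy = {}
    for i, site in enumerate(coords):
        occupancy.setdefault(site, []).append(i)
    collisions = sum(len(v) * (len(v) - 1) // 2 for v in occupancy.values())
    contacts = 0
    for i, (x, y) in enumerate(coords):
        if hp[i] != "H":
            continue
        # Look only right and up so that each neighbouring pair is seen once.
        for dx, dy in ((1, 0), (0, 1)):
            for j in occupancy.get((x + dx, y + dy), []):
                if abs(i - j) > 1 and hp[j] == "H":
                    contacts += 1
    return contacts, collisions

print(contacts_and_collisions_naive("HPPH", "RUL"))   # (1, 0)
print(contacts_and_collisions_hashed("HPPH", "RLR"))  # (1, 2)
```

The sketch assumes a direction string with one letter per bond, so an HP string of length n pairs with n - 1 letters.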
Space complexity. The primary memory requirements come from:
1. Storage of the population: P individuals, each with a conformation string ($O(n)$) and auxiliary encodings (absolute, relative) plus Cartesian coordinates ($O(n)$). This is $O(P \cdot n)$.
2. Temporary arrays for crossover/mutation operations: $O(n)$.
Thus, the total space complexity is:
$S_{\mathrm{ACGA}} = O(P \cdot n)$
In conclusion, for typical parameter settings in protein structure prediction benchmarks ($n \leq 1000$, P up to $10^5$, G in the tens of thousands), ACGA remains computationally tractable.
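As a rough worked example of these bounds, consider the largest values listed for the experiments in Section 5 (n = 85, P = 50,000, G = 10,000). Whether these three values were ever combined in a single run is an assumption; the text only states that various combinations were tested. Constant factors are ignored.

```latex
\begin{align*}
G \cdot P \cdot n^{2} &= 10^{4} \cdot (5 \cdot 10^{4}) \cdot 85^{2} \\
                      &= (5 \cdot 10^{8}) \cdot 7225 \\
                      &\approx 3.6 \cdot 10^{12} \ \text{pairwise checks}, \\
P \cdot n             &= (5 \cdot 10^{4}) \cdot 85 = 4.25 \cdot 10^{6} \ \text{stored letters}.
\end{align*}
```

The memory term stays small even for the largest population, while the time term is dominated by the fitness evaluation.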

4.5. Application Architecture Overview

Figure 8 illustrates the high-level architecture of the application, which integrates a web-based front-end (HTML + JavaScript), a middleware server implemented in Node.js, and a back-end component developed in C/C++. The system employs both HTTP and WebSocket communication mechanisms for real-time data exchange.
Figure 8. Application architecture involving browser, Node.js server, and C/C++ processing module.
  • The client-side browser component (node A) allows users to send parameters and receive processed messages. It initiates a WebSocket connection with the Node.js server and receives messages that contain the best protein conformation for every iteration. A p5-based viewer then displays the conformation (p5.js is a free and open-source JavaScript library []).
  • The Node.js server (node B) receives input parameters from the browser via an HTTP POST request. It is responsible for spawning the C/C++ application as a separate process and handling message forwarding through WebSocket.
  • The C/C++ application (node C) performs the core processing logic (MC, SA, ACGA). Its output is redirected to the Node.js server through standard output (stdout). When the execution is finished, a final completion message is sent.
The communication flow is as follows: the browser sends a POST request to start the processing application with the desired parameters. Node.js spawns the C/C++ executable and relays its output back to the browser using WebSocket messages. A special final message indicates the end of processing.
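The spawn-and-relay pattern in this flow can be sketched in a few lines. The authors' server is written in Node.js; the Python version below only mirrors the pattern and is not their code. The `send` callback stands in for the WebSocket connection, and the message fields (`type`, `payload`, `returnCode`) as well as the stand-in solver command are hypothetical.

```python
import subprocess
import sys

def relay_solver_output(command, send):
    """Spawn the solver as a child process, forward each stdout line through
    send(), then emit a final message that marks the end of processing."""
    with subprocess.Popen(command, stdout=subprocess.PIPE, text=True) as proc:
        for line in proc.stdout:
            send({"type": "conformation", "payload": line.rstrip("\n")})
        return_code = proc.wait()
    send({"type": "done", "returnCode": return_code})
    return return_code

# Stand-in for the compiled C/C++ executable: a child process that prints
# one line per iteration.
fake_solver = [sys.executable, "-c", "print('RRUL'); print('RULD')"]
messages = []
relay_solver_output(fake_solver, messages.append)
print(messages)
```

Reading the child's stdout line by line keeps the browser updated while the solver is still running, which is what makes a per-iteration view of the best conformation possible.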

5. Results Analysis

The algorithm has been implemented in the C++ programming language, and tests were conducted on a PC with an Intel(R) Core i7 processor at 2.8 GHz with 4 physical and 4 logical CPU cores and 8 GB of RAM, running Windows 10. Two implementation variants were developed: one using an OOP style for better software qualities (readability, maintainability, usability) but with slightly poorer execution time performance, and another using the standard C style, which is approximately 40 times faster, mainly due to a more optimized implementation of the selection operation. The results presented in this report are based on the second implementation variant.
To evaluate the efficiency of the algorithm, experiments were conducted on nine popular benchmark sequences with lengths ranging from 20 AAs to 85 AAs (see Table 1). The first eight sequences were taken from [], and the ninth sequence was taken from []. The table contains the sequence ID, the protein length, and the HP string.
Table 1. Benchmark data set.
The population size and the number of generations were chosen to allow a wide range of possible conformations to be evaluated. For the selection, mutation, and crossover operators, we tested multiple configurations and adopted the ones that produced the best results after parameter tuning. For the algorithm parameters, we considered various combinations of the following parameters to explore the solution space:
(1) Population size: 10, 50, 100, 300, 500, 1000, 2000, 5000, 10,000, 20,000, 50,000;
(2) Number of generations: 50, 100, 200, 300, 500, 1000, 5000, 10,000;
(3) Translational mutation percent: 0.3;
(4) Rotational crossover percent: 0.4;
(5) Adapted tournament selection percent: 0.35.
For each combination, we repeated the execution 20 times and computed the average energy obtained. We chose 20 repetitions to have enough data for both simple statistical checks (such as verifying the data normality assumption) and more advanced statistical analyses (combined analyses in which different assumptions must be met).
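The repetition scheme can be sketched as a loop over parameter settings with 20 runs each. Enumerating the full Cartesian product of population sizes and generation counts is an assumption, since the text only says that various combinations were considered. `run_once` is a hypothetical callback that would launch one ACGA run and return its final energy.

```python
from itertools import product
from statistics import mean, stdev

POPULATION_SIZES = [10, 50, 100, 300, 500, 1000, 2000, 5000, 10_000, 20_000, 50_000]
GENERATIONS = [50, 100, 200, 300, 500, 1000, 5000, 10_000]
REPETITIONS = 20

def run_grid(run_once, population_sizes=POPULATION_SIZES,
             generations=GENERATIONS, repetitions=REPETITIONS):
    """Call run_once(population_size, generations, seed) `repetitions` times
    per setting and return {(pop, gen): (mean energy, sample std dev)}."""
    summary = {}
    for pop, gen in product(population_sizes, generations):
        energies = [run_once(pop, gen, seed) for seed in range(repetitions)]
        summary[(pop, gen)] = (mean(energies), stdev(energies))
    return summary

# Toy stand-in for a real run: the "energy" depends only on the seed.
demo = run_grid(lambda pop, gen, seed: -seed, [10, 50], [50], repetitions=4)
print(demo)
```

If the full grid were used, it would contain 11 × 8 = 88 settings, or 1760 runs per benchmark sequence at 20 repetitions each.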
Table 2 shows the best results obtained for the nine benchmark sequences, which are highly influenced by the population size. The empty value, marked with a “–” in the table, indicates that no data are available in the literature.
Table 2. Best energy for the 2D square and 3D cubic lattices for the 9 considered benchmarks.
Figure 9 further illustrates how the population size affects the results, with the first benchmark sequence of length 20 used as an example. Similar trends were observed for other sequences, where shorter sequences (less than 50 AAs) reached optimal conformations rapidly by increasing the population size, while longer sequences did not reach the optimum.
Figure 9. Average energy related to population size in the case of the first benchmark.
The algorithm shows good convergence, as the optimum conformation is typically achieved within 100 generations for small conformations. However, for larger conformations, significant improvements were not observed after 100 generations. The algorithm’s robustness was confirmed by repeating executions with the same parameters 20 times, resulting in a relatively low standard deviation. The box plot in Figure 9 shows the median and the quartiles.
Table 3 and Table 4 show the best conformations obtained by MC, SA and ACGA for the 2D square and 3D cubic lattices, respectively. In conclusion, even for long proteins, the ACGA approach offers a reasonable possibility of finding solutions very close to the optimal ones.
Table 3. 2D conformations obtained by ACGA for the benchmark sequences.
Table 4. 3D conformations obtained by ACGA for the benchmark sequences.
Algorithms Comparison. The comparative analysis of the implemented algorithms yielded the following observations:
  • MC achieved low-energy results (see Table 2) and has dynamic search behaviour with exploratory capability. However, due to its fixed temperature, it is prone to high fluctuations and occasional instability.
  • SA displayed convergence and stability similar to MC and was more resilient to local optima thanks to its cooling schedule. Overall, its energy results are slightly better than those of the MC algorithm.
  • ACGA was effective at exploring diverse conformations and maintaining population diversity. Its performance depends strongly on the balance between exploration and exploitation parameters, such as mutation rate and selection pressure. Among the three algorithms, ACGA obtained the best energy for every sequence, as shown in Table 2.

6. Discussion

MC showed the advantages of simple implementation and high efficiency in small search spaces. However, its limitations include large fluctuations that may lead to inconsistent solutions.
SA showed increased stability (without extreme oscillations) and achieved better energy than MC. The main disadvantage is that both MC and SA find conformations with high energy, far from the energy of the optimal conformation.
The ACGA algorithm was efficiently optimized from a computational point of view, enabling the use of large populations to maintain the diversity of individuals in the context of limited computing resources. Experiments were conducted for several benchmark HP sequences, and the comparative analysis showed that the proposed genetic algorithm offers valuable advantages for PSP on the HP model, providing optimal solutions in most cases; for each configuration, we report the standard deviation over at most 20 trials.
In comparison to all these previous results, our approach addresses both 2D square and 3D cubic cases, and we achieved optimum conformations for sequences of length less than or equal to 50. Concerning the 2D square and 3D cubic models, our results are similar for larger sequences.
Unlike HP Protein Visualizer, which prioritises exploratory use and algorithmic approaches through simplified lattice models, AlphaFold and RoseTTAFold aim for highly accurate real-world predictions. While they require substantial computational resources, HP-based tools can be executed in a web browser and are more accessible for didactic and exploratory use.

7. Conclusions

The HP model is an ab initio paradigm to model and understand protein folding and is one of the most extensively studied physical models for protein structure prediction from sequences. While the HP model appears very simple, solving it is proven NP-hard. Based on this fact, it is a very good benchmark problem for GA.
We have introduced a novel hybrid Genetic Algorithm that also offers interpretability for PSP that goes beyond considering only the SAW conformations. Instead, our approach allows any conformations in the populations at all stages. This increased flexibility improves the likelihood of obtaining good conformations, even starting from non-SAW ones, as the energy associated with these conformations shows chaotic behavior.
Our primary focus has been on computational efficiency, as it enables reasonable computation times even for large populations. Working with large populations is crucial to achieving the necessary diversity for convergence. The results obtained on a popular benchmark dataset are highly promising, with the optimal solution being achieved in most cases, and for others, the distance from the optimal solution is minimal.
To further improve the algorithm, we plan to explore parallel computation, which can improve computational performance [], allowing the use of even larger populations. Larger populations further increase the chances of reaching optimal solutions for all cases. Parallelization will be a key focus of our future work. Additionally, we intend to apply the parallel ACGA to real proteins and compare its results with the native conformations from the Protein Data Bank (PDB) [] and those obtained using the AlphaFold algorithm. This comparison will provide valuable insights into the performance of our algorithm on real biological proteins.

Author Contributions

Conceptualization, I.S.; methodology, I.S. and V.N.; software, I.S., V.N. and D.-M.C.; validation, I.S., V.N. and D.-M.C.; formal analysis, I.S., V.N., D.-M.C. and L.B.I.; investigation, I.S., D.-M.C. and L.B.I.; resources, I.S. and D.-M.C.; data curation, I.S.; writing—original draft preparation, I.S., V.N. and D.-M.C.; writing—review and editing, D.-M.C. and L.B.I.; visualization, I.S., D.-M.C. and L.B.I.; supervision, L.B.I. and V.N.; project administration, I.S. and D.-M.C.; funding acquisition, D.-M.C. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by internal university resources by ’1 Decembrie 1918’ University of Alba Iulia, Romania, through Order No. 3259 of 13 February 2025.

Institutional Review Board Statement

Not applicable.

Data Availability Statement

The data set used in the study was taken from Unger1993 [] and Huang2010 []. The conformations achieved by the ACGA on the 2D square lattice and the 3D cubic lattice can be visualized in the files “Experiments.xlsx” and “Experiments3D.xlsx”.

Acknowledgments

The authors thank the Research Group on Artificial Intelligence and Data Science for Healthcare Innovation (REFLECTION) and the Research Center on Artificial Intelligence, Data Science, and Smart Engineering (ARTEMIS), of the George Emil Palade University of Medicine, Pharmacy, Science and Technology of Targu Mureş, Romania, for support of research infrastructure. The authors also thank the WG1 and WG6 Benchmarking Workgroup within CA22137 COST Action, the Randomized Optimization Algorithms Research Network (ROAR-NET), for its work on identifying benchmarking issues. The authors thank Eng. Simona Ispas, responsible for the graphic design.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Baynes, J.W.; Dominiczak, M.H. Medical Biochemistry, 5th ed.; Elsevier: Amsterdam, The Netherlands, 2019.
  2. Felix-Saul, J.C.; García-Valdez, M.; Merelo Guervós, J.J.; Castillo, O. Extending Genetic Algorithms with Biological Life-Cycle Dynamics. Biomimetics 2024, 9, 476.
  3. Rashid, M.; Newton, M.A.H.; Hoque, M.; Sattar, A. Mixing Energy Models in Genetic Algorithms for On-Lattice Protein Structure Prediction. BioMed Res. Int. 2013, 2013, 924137.
  4. Rashid, M.A.; Iqbal, S.; Khatib, F.; Hoque, M.T.; Sattar, A. Guided macro-mutation in a graded energy based genetic algorithm for protein structure prediction. Comput. Biol. Chem. 2016, 61, 162–177.
  5. Hu, C.; Lin, M.; Wang, C.; Zhang, S. Current Understanding of Protein Aggregation in Neurodegenerative Diseases. Int. J. Mol. Sci. 2025, 26, 10568.
  6. Leaver-Fay, A.; Tyka, M.; Lewis, S.M.; Lange, O.F.; Thompson, J.; Jacak, R.; Kaufman, K.W.; Renfrew, P.D.; Smith, C.A.; Sheffler, W.; et al. Rosetta3: An object-oriented software suite for the simulation and design of macromolecules. Methods Enzymol. 2011, 487, 545–574.
  7. Abraham, M.J.; Murtola, T.; Schulz, R.; Páll, S.; Smith, J.C.; Hess, B.; Lindahl, E. GROMACS: High performance molecular simulations through multi-level parallelism from laptops to supercomputers. SoftwareX 2015, 1, 19–25.
  8. Anfinsen, C.B. Principles that govern the folding of protein chains. Science 1973, 181, 223–230.
  9. Unger, R.; Moult, J. Genetic algorithms for protein folding simulations. J. Mol. Biol. 1993, 231, 75–81.
  10. Custódio, F.; Barbosa, H.; Dardenne, L. Investigation of the three-dimensional lattice HP protein folding model using a genetic algorithm. Genet. Mol. Biol. 2004, 27, 611–615.
  11. Jiang, T.; Cui, Q.; Shi, G.; Ma, S. Protein folding simulations of the hydrophobic–hydrophilic model by combining tabu search with genetic algorithms. J. Chem. Phys. 2003, 119, 4592–4596.
  12. Cox, G.; Mortimer-Jones, T.; Taylor, R.; Johnston, R. Development and optimisation of a novel genetic algorithm for studying model protein folding. Theor. Chem. Acc. 2004, 112, 163–178.
  13. Cox, G.; Johnston, R. Analyzing energy landscapes for folding model proteins. J. Chem. Phys. 2006, 124, 204714.
  14. Liang, F.; Wong, W.H. Evolutionary Monte Carlo for protein folding simulations. J. Chem. Phys. 2001, 115, 3374–3380.
  15. Zhou, C.; Hou, C.; Zhang, Q.; Wei, X. Enhanced hybrid search algorithm for protein structure prediction using the 3D-HP lattice model. J. Mol. Model. 2013, 19, 3883–3891.
  16. Huang, C.; Yang, X.; He, Z. Protein folding simulations of 2D HP model by the genetic algorithm based on optimal secondary structures. Comput. Biol. Chem. 2010, 34, 137–142.
  17. Benitez, C.; Parpinelli, R.; Lopes, H. Parallelism, hybridism and coevolution in a multi-level ABC-GA approach for the protein structure prediction problem. Concurr. Comput. Pract. Exp. 2011, 24, 635–646.
  18. Rashid, M.; Newton, M.A.H.; Hoque, M.; Sattar, A. A local search embedded genetic algorithm for simplified protein structure prediction. In Proceedings of the 2013 IEEE Congress on Evolutionary Computation, CEC 2013, Cancún, Mexico, 22–23 June 2013.
  19. Rashid, M.A.; Newton, M.H.; Hoque, M.T.; Sattar, A. Collaborative Parallel Local Search for Simplified Protein Structure Prediction. In Proceedings of the 2013 12th IEEE International Conference on Trust, Security and Privacy in Computing and Communications, Melbourne, VIC, Australia, 16–18 July 2013; pp. 966–973.
  20. Dill, K.A. Theory for the folding and stability of globular proteins. Biochemistry 1985, 24, 1501.
  21. Lau, K.F.; Dill, K.A. A lattice statistical mechanics model of the conformational and sequence spaces of proteins. Macromolecules 1989, 22, 3986–3997.
  22. Boumedine, N.; Bouroubi, S. A new hybrid genetic algorithm for protein structure prediction on the 2D triangular lattice. Turk. J. Electr. Eng. Comput. Sci. 2021, 29, 499–513.
  23. Mazidi, A.; Roshanfar, F. PSPGA: A New Method for Protein Structure Prediction based on Genetic Algorithm. J. Appl. Dyn. Syst. Control 2020, 3, 9–16.
  24. Rezaei, M.; Kheyrandish, M.; Mosleh, M. A novel algorithm based on a modified PSO to predict 3D structure for proteins in HP model using Transfer Learning. Expert Syst. Appl. 2024, 235, 121233.
  25. Geethu, S.; Vimina, E.R. Protein Secondary Structure Prediction Using Cascaded Feature Learning Model. Appl. Soft Comput. 2023, 140, 110242.
  26. Berger, B.; Leighton, T. Protein Folding in the Hydrophobic-Hydrophilic (HP) Model is NP-Complete. J. Comput. Biol. 1998, 5, 27–40.
  27. Burley, S.K.; Bhikadiya, C.; Bi, C.; Bittrich, S.; Chao, H.; Chen, L.; Craig, P.A.; Crichlow, G.V.; Dalenberg, K.; Duarte, J.M.; et al. RCSB Protein Data Bank (RCSB.org): Delivery of experimentally-determined PDB structures alongside one million computed structure models of proteins from artificial intelligence/machine learning. Nucleic Acids Res. 2023, 51, D488–D508.
  28. Chira, C.; Horvath, D.; Dumitrescu, D. Hill-Climbing search and diversification within an evolutionary approach to protein structure prediction. BioData Min. 2011, 4, 23.
  29. Rotar, C. An Evolutionary Technique for Multicriterial Optimization Based on Endocrine Paradigm. Proc. Vol. 2004, 3103, 414–415.
  30. Bahi, J.; Côté, N.; Guyeux, C.; Salomon, M. Protein Folding in the 2D Hydrophobic-Hydrophilic (HP) Square Lattice Model is Chaotic. Cogn. Comput. 2012, 4, 98–114.
  31. Jumper, J.; Evans, R.; Pritzel, A.; Green, T.; Figurnov, M.; Ronneberger, O.; Tunyasuvunakool, K.; Bates, R.; Žídek, A.; Potapenko, A.; et al. Highly accurate protein structure prediction with AlphaFold. Nature 2021, 596, 583–589.
  32. Callaway, E. What’s next for AlphaFold and the AI protein-folding revolution. Nature 2022, 604, 234–238.
  33. Krishna, R.; Wang, J.; Ahern, W.; Sturmfels, P.; Venkatesh, P.; Kalvet, I.; Lee, G.R.; Morey-Burrows, F.S.; Anishchenko, I.; Humphreys, I.R.; et al. Generalized biomolecular modeling and design with RoseTTAFold All-Atom. Science 2024, 384, eadl2528.
  34. Baek, M.; DiMaio, F.; Anishchenko, I.; Dauparas, J.; Ovchinnikov, S.; Lee, G.R.; Wang, J.; Cong, Q.; Kinch, L.N.; Schaeffer, R.D.; et al. Accurate prediction of protein structures and interactions using a three-track neural network. Science 2021, 373, 871–876.
  35. Cruz-Fernández, M.; López-Maldonado, J.T.; Rodriguez-Abreo, O.; Ortiz Verdín, A.A.; Amezcua Tinajero, J.I.; Macías-Socarrás, I.; Rodríguez-Reséndiz, J. Model Parametrization-Based Genetic Algorithms Using Velocity Signal and Steady State of the Dynamic Response of a Motor. Biomimetics 2025, 10, 146.
  36. Gobashy, M.; Abdelazeem, M. Metaheuristics Inversion of Self-Potential Anomalies. In Self-Potential Method: Theoretical Modeling and Applications in Geosciences; Springer International Publishing: Berlin/Heidelberg, Germany, 2021; pp. 35–103.
  37. Jamali, K.; Käll, L.; Zhang, R.; Brown, A.; Kimanius, D.; Scheres, S.H. Automated model building and protein identification in cryo-EM maps. Nature 2024, 628, 450–457.
  38. Zhou, X.; Zheng, W.; Li, Y.; Pearce, R.; Zhang, C.; Bell, E.W.; Zhang, G.; Zhang, Y. I-TASSER-MTD: A deep-learning-based platform for multi-domain protein structure and function prediction. Nat. Protoc. 2022, 17, 2326–2353.
  39. Wang, X.; Zhu, H.; Terashi, G.; Taluja, M.; Kihara, D. DiffModeler: Large Macromolecular Structure Modeling in Low-Resolution Cryo-EM Maps Using Diffusion Model. bioRxiv 2024. bioRxiv:2024.01.20.576370.
  40. Zhang, Z.; Xu, L.; Zhang, S.; Peng, C.; Zhang, G.; Zhou, X. DEMO-EMol: Modeling protein-nucleic acid complex structures from cryo-EM maps by coupling chain assembly with map segmentation. Nucleic Acids Res. 2025, 53, W228–W237.
  41. Zhou, X.; Li, Y.; Zhang, C.; Zheng, W.; Zhang, G.; Zhang, Y. Progressive assembly of multi-domain protein structures from cryo-EM density maps. Nat. Comput. Sci. 2022, 2, 265–275.
  42. Crescenzi, P.; Goldman, D.; Papadimitriou, C.; Piccolboni, A.; Yannakakis, M. On the Complexity of Protein Folding. J. Comput. Biol. 1998, 5, 423–465.
  43. Zhang, H. Enhanced genetic algorithm for indoor low-illumination stereo matching energy function optimization. Alex. Eng. J. 2025, 121, 1–17.
  44. Cristea, D.M. Hybrid Combinatorial Problems Used for Multimodal Optimisation. In Proceedings of the 2024 IEEE 24th International Conference on Bioinformatics and Bioengineering (BIBE), Kragujevac, Serbia, 27–29 November 2024; pp. 1–8.
  45. Sauer, R.T. Protein folding from a combinatorial perspective. Fold. Des. 1996, 1, R27–R30.
  46. Thachuk, C.; Shmygelska, A.; Hoos, H. A replica exchange Monte Carlo algorithm for protein folding in the HP model. BMC Bioinform. 2007, 8, 342.
  47. Metropolis, N.; Rosenbluth, A.W.; Rosenbluth, M.N.; Teller, A.H.; Teller, E. Equation of state calculations by fast computing machines. J. Chem. Phys. 1953, 21, 1087–1092.
  48. Kirkpatrick, S.; Gelatt, C.; Vecchi, M. Optimization by simulated annealing. Science 1983, 220, 671–680.
  49. Sima, I.; Parv, B. Protein Folding Simulation Using Combinatorial Whale Optimization Algorithm. In Proceedings of the 2019 21st International Symposium on Symbolic and Numeric Algorithms for Scientific Computing (SYNASC), Timișoara, Romania, 4–7 September 2019; pp. 159–166.
  50. McCarthy, L. p5.js. Available online: https://p5js.org/ (accessed on 15 January 2024).
  51. Niculescu, V. Parallel Computation: Design and Formal Development of Parallel Programs; Presa Universitară Clujeană Publishing: Cluj-Napoca, Romania, 2005; Originally published in Romanian as Calcul Paralel. Proiectare și dezvoltare formală a programelor paralele.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
