# General Purpose Optimization Library (GPOL): A Flexible and Efficient Multi-Purpose Optimization Library in Python


## Abstract


## 1. Introduction

## 2. Optimization Problems

Optimization problems are materialized through the class **Problem**, with the following instance attributes:

- The search space S, materialized as a dictionary called sspace, holds problem-specific information, such as a problem’s dimensionality and spatial bounds;
- The fitness function f, materialized as a function called ffunction, calculates the fitness of the proposed solutions;
- An indication of the optimization's purpose (whether it is a minimization or a maximization problem), materialized as a Boolean variable called min_.
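As a concrete illustration, a minimal standalone sketch of this interface might look like the following (the attribute names follow the text; the code itself is illustrative, not GPOL's source):

```python
# Sketch of the Problem interface described above (not GPOL's actual code).
class Problem:
    def __init__(self, sspace, ffunction, min_=True):
        self.sspace = sspace        # search-space definition (dict)
        self.ffunction = ffunction  # fitness function
        self.min_ = min_            # True -> minimization, False -> maximization

    def evaluate_sol(self, sol):
        # Evaluates a single candidate solution (single-point metaheuristics).
        sol.fit = self.ffunction(sol.repr_)
        return sol

class Solution:
    def __init__(self, repr_):
        self.repr_ = repr_  # representation (a list or a tensor in GPOL)
        self.fit = None

# Example: a 1D "sphere" problem, minimizing x^2.
prob = Problem(sspace={"n_dims": 1}, ffunction=lambda r: r[0] ** 2, min_=True)
sol = prob.evaluate_sol(Solution([3.0]))
```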

The class exposes two evaluation methods: **evaluate_sol** evaluates individual candidate solutions (objects of type **Solution**) and is dedicated to single-point metaheuristics; **evaluate_pop** evaluates a set of candidate solutions (objects of type **Population**) and is dedicated to population-based metaheuristics.

Moreover, **evaluate_pop** is designed to encapsulate possible performance optimizations, such as parallel processing.

The library also equips **Problem** with two instance methods that verify solutions' feasibility.

**_is_feasible_sol** verifies whether an individual candidate solution (an object of type **Solution**) satisfies the constraints imposed by S (dedicated to single-point metaheuristics). The method returns True if the solution is feasible and False otherwise. **_is_feasible_pop** verifies whether a set of candidate solutions (an object of type **Population**) satisfies the constraints imposed by S (dedicated to population-based metaheuristics). The method returns a tensor of Boolean values, where True indicates that a given solution is feasible and False otherwise.


The current release provides four default **Problem** subclasses: continuous, knapsack and traveling salesman (the latter two being special kinds of combinatorial problems), and supervised machine learning (approached from the perspective of inductive programming). Each of the aforementioned default problems is described in the following sections. Furthermore, examples of how to create them are also provided; the examples of how to solve them can be found in Section 3.

#### 2.1. Continuous Optimization Problems

The continuous problem type is materialized through the class **Box**: a simplistic variant of a constrained continuous problem in which the parameters can take any real number within a given range of values, the box (also known as a hyperrectangle). The box can be regular, when the bounds are the same for every dimension, or irregular, when each dimension is allowed to have different bounds. The search space of an instance of the continuous box problem consists of the following key-value pairs.

- “constraints” is a tensor that holds the lower and the upper bounds of the search space's box. When the box is intended to be regular (i.e., all its dimensions are equally sized), the tensor holds only two values, representing the lower and the upper bound shared by all of the problem's D dimensions. When the box is intended to be irregular, the tensor is a 2 × D matrix; in such a case, the first and the second rows represent the lower and the upper bounds for each of the D dimensions, respectively.
- “n_dims” is an integer value representing the dimensionality (D) of S.
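The two bound layouts can be illustrated with a standalone feasibility check (plain Python lists instead of tensors; a hypothetical helper for illustration, not GPOL's **_is_feasible_sol**):

```python
def is_in_box(point, constraints):
    """Check whether a point lies inside a regular or irregular box.

    constraints: [lo, hi] for a regular box, or
                 [[lo_1, ..., lo_D], [hi_1, ..., hi_D]] for an irregular one.
    """
    if isinstance(constraints[0], (int, float)):  # regular box: two scalars
        lo, hi = constraints
        return all(lo <= x <= hi for x in point)
    lows, highs = constraints                     # irregular box: 2 x D matrix
    return all(l <= x <= h for x, l, h in zip(point, lows, highs))

# A regular box [-5.12, 5.12] in every dimension vs. an irregular one.
regular = is_in_box([0.3, -4.0], [-5.12, 5.12])
irregular = is_in_box([0.3, -4.0], [[-1.0, -5.0], [1.0, 5.0]])
```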

Appendix A.1 exemplifies how to create an instance of **Box**. Note that, in Appendix B, examples will be presented of how to solve this problem by using different iterative search algorithms contained in the current release of the library.

#### 2.2. Combinatorial Optimization Problems

#### 2.2.1. The Traveling Salesman Problem

The problem is materialized through the class **TSP**. The search space of an instance of **TSP** consists of the following key-value pairs:

- “distances” is an $n\times n$ tensor of type torch.float, which represents the distance matrix of the underlying problem. The matrix can be either symmetric or asymmetric. A given row i in the matrix holds the distances between “city” i, taken as the origin, and the n possible destinations (including i itself).
- “origin” is an integer value representing the origin (i.e., the point from where the “traveling salesman” departs).
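A minimal fitness function for this setup might look like the following sketch (nested lists stand in for the torch tensor; this is an illustration, not GPOL's built-in travel-distance function):

```python
def tour_distance(distances, route, origin=0):
    """Total length of a closed tour: origin -> route ... -> origin.

    distances: n x n matrix (nested lists here, a tensor in GPOL);
    route: permutation of the remaining city indices.
    """
    total, current = 0.0, origin
    for city in route:
        total += distances[current][city]
        current = city
    return total + distances[current][origin]  # return to the origin

# 3 cities, symmetric distances.
D = [[0, 2, 9],
     [2, 0, 6],
     [9, 6, 0]]
best = tour_distance(D, [1, 2])  # 0 -> 1 -> 2 -> 0 = 2 + 6 + 9 = 17
```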

Appendix A.2 exemplifies how to create an instance of **TSP**, whereas Appendix B demonstrates how to solve this problem by using different iterative search algorithms.

#### 2.2.2. Knapsack

The 0-1 and the bounded variants of the knapsack problem are materialized through the classes **Knapsack01** and **KnapsackBounded**, respectively. In the first, each item i can be included in the solution only once; as such, the solutions are often represented as binary vectors. In the second, i can be included in the solution at most $ub_{i}$ times; as such, the solutions are often represented as integer vectors. Concerning the latter, our implementation allows the user to define not only the upper bound for i ($ub_{i}$), but also the lower bound ($lb_{i}$); that is, the user can specify the minimum and the maximum number of times an item i can be present in a candidate solution. The search space of an instance of the **Knapsack01** problem, a subclass of **Problem**, consists of the following key-value pairs:

- “capacity” as the maximum capacity of the “knapsack”;
- “n_dims” as the number of items in S;
- “weights” as the collection of items’ weights defined as a vector of type torch.float;
- “values” as the collection of items’ values defined as a vector of type torch.float.
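A minimal 0-1 knapsack fitness under these keys can be sketched as follows (the zero-value penalty for infeasible packings is an assumption for illustration; GPOL's own handling of infeasible solutions may differ):

```python
def knapsack_fitness(repr_, weights, values, capacity):
    """Fitness of a binary knapsack solution: the total value if the
    total weight respects the capacity, 0 otherwise (a simple penalty
    scheme chosen for illustration)."""
    w = sum(wi for xi, wi in zip(repr_, weights) if xi)
    v = sum(vi for xi, vi in zip(repr_, values) if xi)
    return v if w <= capacity else 0

weights = [3, 4, 5]
values = [4, 5, 6]
fit = knapsack_fitness([1, 1, 0], weights, values, capacity=8)  # w=7 -> value 9
bad = knapsack_fitness([1, 1, 1], weights, values, capacity=8)  # w=12 -> penalized
```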

The search space of an instance of the **KnapsackBounded** problem, a subclass of **Knapsack01**, also comprises the key “bounds”, which holds a $2\times n$ tensor representing the minimum and the maximum number of copies allowed for each of the n items in S.

Appendix A.3 exemplifies how to create instances of **Knapsack01** and **KnapsackBounded**, whereas Appendix B demonstrates how to solve this problem type.

#### 2.3. Supervised Machine Learning Problems (Approached from the Perspective of Inductive Programming)

Two problem classes support this problem type. The first, called **SML**, is a subclass of **Problem** and aims to support SML problem solving, more specifically symbolic regression and binary classification, by means of standard GP and its local-search variants. The second, called **SMLGS**, aims to provide efficient support for the same tasks addressed instead by means of geometric semantic GP (GSGP), following the implementation proposed in [26]. The major difference between the two is that the latter does not require storing the GP trees in memory, as it relies on memoization techniques.

The search space of an instance of **SML** must contain the problem's dimensionality (in the context of SML, this corresponds to the number of input features) and those GP-specific parameters that characterize and delimit S. These can be the set of functions and constants from which programs are built, the maximum bound for the trees' initial depth and their growth during the search (which can be seen as a constraint on solutions' validity), and so forth. The following list of key-value pairs fully describes the search space for an instance of **SML**:

- “n_dims” is the number of input features (also known as input dimensions) in the underlying SML problem;
- “function_set” is the set of primitive functions;
- “constant_set” is the set of constants to draw terminals from;
- “p_constants” is the probability of generating a constant when sampling a terminal;
- “max_init_depth” is the trees’ maximum depth during the initialization;
- “max_depth” is the trees’ maximum depth during the evolution;
- “n_batches” is the number of batches to use when evaluating solutions (more than one can be used).
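Since GP solutions are list-based trees (see Section 5), the following standalone sketch illustrates how such a program might be represented and executed; the prefix encoding and helper names are hypothetical conventions for illustration, not GPOL's internal format:

```python
import operator

# Hypothetical primitive set: name -> (callable, arity).
FUNCTIONS = {"add": (operator.add, 2), "mul": (operator.mul, 2)}

def eval_tree(tree, x):
    """Evaluate a prefix-encoded list tree on an input vector x.
    Terminals are either ("x", i) feature references or constants."""
    def walk(pos):
        node = tree[pos]
        if node in FUNCTIONS:
            fn, arity = FUNCTIONS[node]
            left, pos = walk(pos + 1)
            right, pos = walk(pos)
            return fn(left, right), pos
        if isinstance(node, tuple):       # ("x", i): i-th input feature
            return x[node[1]], pos + 1
        return node, pos + 1              # numeric constant

    value, _ = walk(0)
    return value

# add(mul(x0, x0), 2.0) evaluated at x0 = 3.
y = eval_tree(["add", "mul", ("x", 0), ("x", 0), 2.0], [3.0])
```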

The constructor of **SML** additionally receives two objects of type **torch.utils.data.DataLoader**, called dl_train and dl_test, which represent the training and test (also known as unseen) data, respectively. In the proposed library, we decided to rely upon PyTorch's data manipulation facilities, such as **torch.utils.data.Dataset** and **torch.utils.data.DataLoader** [27], for the following reasons: the simplicity and flexibility of the interface, the randomized access of the data by batches, and the framework's popularity (and, consequently, users' likely familiarity with its features). Moreover, the constructor of **SML** receives another parameter, called n_jobs, which specifies the number of jobs to run in parallel when executing trees (we rely on joblib for parallel computing [28]).

The search space of **SMLGS** does not vary from that of **SML**, except that it does not take “max_depth”, as the growth of the individuals in GSGP is an inevitable and necessary consequence of the semantic operators' application; thus, restricting the depth of the individuals is unnecessary. The constructor of **SMLGS** differs significantly from that of **SML** in the sense that data are not manipulated through PyTorch's data-loaders; instead, it uses the input and the target tensors (X and y) directly (similar to what is done in scikit-learn [29]). This difference was motivated by the implementation guidelines in [26]. The module gpol.utils.datasets provides a set of built-in regression and classification datasets.

Appendix A.4 exemplifies how to create instances of **SML** and **SMLGS**, whereas Appendix B demonstrates how to solve this problem type.

## 3. Iterative Search Algorithms

At the root of the algorithms' implementation stands the class **SearchAlgorithm**, characterized by the following instance attributes:

- pi is an instance of an optimization problem (i.e., what to solve/optimize);
- best_sol is the best solution found by the search procedure;
- initializer is a procedure to generate the initial point in S; and
- device is the specification of the processing device (i.e., whether to perform computations on the CPU or the GPU).

- Initializing the search at a given point in S.
- Solving a problem’s instance by iteratively searching, throughout S, for the best possible solution according to the criteria specified in the instance. Traditionally, the termination condition for an iterative metaheuristic is the maximum number of iterations, and it constitutes the default stopping criterion implemented in this library (although the user can specify a convergence criterion, and the search can be automatically stopped before completing all the iterations).

These two tasks are materialized in the methods **_initialize** and **solve**. Every implemented algorithm in the scope of this library is an instance of **SearchAlgorithm**, meaning that it must implement those two methods. Note that **_initialize** is called within **solve**, whereas the latter is to be invoked by the main script. The signature of **solve** does not vary among different iterative metaheuristics and is made of the following parameters:

- n_iter is the number of iterations to execute a metaheuristic (functions as the default stopping criterion).
- tol is the minimum required fitness improvement within n_iter_tol consecutive iterations for the search to continue. When the fitness of the current best solution does not improve by at least tol for n_iter_tol consecutive iterations, the search is automatically interrupted.
- n_iter_tol is the maximum number of consecutive iterations allowed without a tol improvement.
- start_at is the initial starting point in S (i.e., the user can explicitly provide the metaheuristic a starting point in S).
- test_elite is an indication of whether to evaluate the best-so-far solution on the test partition, if such exists. This regards only those problem types which operate upon training and test cases; it allows one to assess solutions' generalization ability.
- verbose is the verbosity level of the search loop.
- log is the detail level of the log file (if such exists).
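The interplay of tol and n_iter_tol can be sketched as follows (a standalone illustration of the described stopping rule, for minimization; not GPOL's actual code):

```python
def should_stop(history, tol, n_iter_tol):
    """Convergence test: stop when the best fitness has not improved by
    at least tol over the last n_iter_tol iterations (minimization).
    history is the sequence of best fitness values, one per iteration."""
    if len(history) <= n_iter_tol:
        return False
    # Improvement achieved over the last n_iter_tol iterations.
    improvement = history[-n_iter_tol - 1] - history[-1]
    return improvement < tol

stalled = should_stop([5.0, 4.99, 4.98, 4.97], tol=0.1, n_iter_tol=3)
progress = should_stop([5.0, 4.0, 3.0, 2.0], tol=0.1, n_iter_tol=3)
```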

Additionally, the **SearchAlgorithm** class implements the following utility methods:

**_get_best** compares two candidate solutions based on their fitness values and returns the best; **_get_worst** compares two candidate solutions based on their fitness values and returns the worst.

**_create_log_event** is designed to create a log event for writing search-related data to the log file; **_verbose_reporter** is designed to report search-related information on the console.

The **SearchAlgorithm** class and its subclasses are described in the rest of this section.

#### 3.1. Random Search

The RS algorithm is materialized through the class **RandomSearch**, a subclass of **SearchAlgorithm**, characterized by the following instance attributes:

- pi, best_sol, initializer, and device, which are inherited from **SearchAlgorithm**;
- seed, which is a random state for pseudo-random number generation (an integer value).

In addition to the methods inherited from **SearchAlgorithm**, **RandomSearch** implements the methods **_initialize**, **solve**, **_create_log_event**, and **_verbose_reporter**. Moreover, it implements **_get_random_sol**, a method that (1) generates a random representation of a candidate solution by means of the initializer, (2) creates an instance of type **Solution**, (3) evaluates the instance's representation, and (4) returns the evaluated object.
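A minimal standalone RS loop reflecting this behaviour can be sketched as follows (GPOL's **solve** additionally handles logging, convergence testing, and device placement, all omitted here):

```python
import random

def random_search(ffunction, initializer, n_iter, seed=0):
    """Minimal random search loop for minimization (a sketch,
    not GPOL's implementation)."""
    rng = random.Random(seed)
    best_repr = initializer(rng)
    best_fit = ffunction(best_repr)
    for _ in range(n_iter):
        repr_ = initializer(rng)   # sample a random point in S
        fit = ffunction(repr_)
        if fit < best_fit:         # keep the best-so-far solution
            best_repr, best_fit = repr_, fit
    return best_repr, best_fit

# Minimize the 1D sphere function over [-5, 5].
sphere = lambda r: r[0] ** 2
init = lambda rng: [rng.uniform(-5, 5)]
sol, fit = random_search(sphere, init, n_iter=200)
```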

Appendix B.1 exemplifies how to create an instance of **RandomSearch** and apply it to different problems.

#### 3.2. Local Search

This library provides two local search (LS) algorithms: hill climbing (HC) and simulated annealing (SA). The former is materialized as a subclass of **RandomSearch**, called **HillClimbing**, whereas the latter is materialized as a subclass of **HillClimbing**, called **SimulatedAnnealing**.

Note that, from the perspective of this library, one can use **RandomSearch**, **HillClimbing**, or **SimulatedAnnealing** to solve potentially any kind of problem, whether it is continuous, combinatorial, or inductive program synthesis for SML problem solving. The only two things one has to take into consideration are the correct specification of the search space and of the operators for a given problem type.

#### 3.2.1. Hill Climbing (HC)

The HC algorithm is materialized through the class **HillClimbing**, a subclass of **RandomSearch**, and it is characterized by the following instance attributes:

- pi, initializer, best_sol, seed, and device are the instance attributes inherited from **RandomSearch**;
- nh_size is the neighborhood's size;
- nh_function is a procedure to generate nh_size neighbours of a given solution (the neighbour-generation function).

The specificity of the **HillClimbing** class lies in the logic that guides the search procedure, which is implemented in the overridden **solve** method. Additionally, the class implements a private method, called **_get_best_nh**, which returns the best neighbor from a given neighborhood. Figure 3 represents the search procedure of an HC algorithm, which is mirrored in the **solve** method.
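The HC search logic can be sketched standalone and greatly simplified as follows (a minimization sketch under the attribute names above, not GPOL's **solve**):

```python
import random

def hill_climbing(ffunction, start, nh_function, nh_size, n_iter, seed=0):
    """At each iteration, sample nh_size neighbours of the current best
    and move to the best neighbour if it improves (minimization)."""
    rng = random.Random(seed)
    best, best_fit = start, ffunction(start)
    for _ in range(n_iter):
        nh = [nh_function(best, rng) for _ in range(nh_size)]
        best_nh = min(nh, key=ffunction)          # _get_best_nh analogue
        if ffunction(best_nh) < best_fit:
            best, best_fit = best_nh, ffunction(best_nh)
    return best, best_fit

# Minimize x^2 starting from x = 4, with Gaussian-perturbation neighbours.
f = lambda r: r[0] ** 2
neighbour = lambda r, rng: [r[0] + rng.gauss(0, 0.5)]
sol, fit = hill_climbing(f, [4.0], neighbour, nh_size=10, n_iter=100)
```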

#### 3.2.2. Simulated Annealing (SA)

The SA algorithm is materialized through the class **SimulatedAnnealing**, a subclass of **HillClimbing**, and it is characterized by the following instance attributes:

- pi, initializer, nh_function, nh_size, best_sol, seed, and device are instance attributes inherited from **HillClimbing**;
- control is the control parameter (also known as the temperature);
- update_rate is the rate of control's decrease over the iterations.
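The acceptance rule implied by the control and update_rate attributes can be sketched as follows (a textbook Metropolis-style rule for minimization; GPOL's exact formulation may differ):

```python
import math, random

def sa_accept(candidate_fit, current_fit, control, rng):
    """Always accept improvements; accept a worse neighbour with
    probability exp(-delta / control)."""
    delta = candidate_fit - current_fit
    if delta <= 0:
        return True
    return rng.random() < math.exp(-delta / control)

rng = random.Random(42)
always = sa_accept(1.0, 2.0, control=1.0, rng=rng)   # improvement: accepted
# At a high temperature, worse moves are accepted frequently; as control
# decays over the iterations (control *= update_rate), they become rare.
hot = sum(sa_accept(2.0, 1.0, control=10.0, rng=rng) for _ in range(1000))
cold = sum(sa_accept(2.0, 1.0, control=0.1, rng=rng) for _ in range(1000))
```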

The algorithmic logic is reflected in the overridden **solve** method of the **SimulatedAnnealing** class. Appendix B.2 exemplifies how to create an instance of **SimulatedAnnealing** and apply it to different problems.

#### 3.3. Population-Based Algorithms

This family of algorithms is materialized through the class **PopulationBased**, a subclass of **RandomSearch**, as it improves the latter by means of “collective intelligence.” The class **PopulationBased** is the root of all the PB metaheuristics and is characterized by the following instance attributes:

- pi, initializer, best_sol, seed, and device are inherited from **RandomSearch**;
- pop_size is the number of candidate solutions to exploit simultaneously at each step (i.e., the population's size);
- pop is an object of type **Population** representing the set of simultaneously exploited candidate solutions (i.e., the population);
- mutator is a procedure to “move” the candidate solutions across S.

#### 3.3.1. Genetic Algorithms (GAs)

The GA is materialized through the class **GeneticAlgorithm**, a subclass of **PopulationBased**, characterized by the following instance attributes:

- pi, initializer, best_sol, pop_size, pop, mutator, seed, and device are inherited from **PopulationBased**;
- selector is the selection operator;
- crossover is the crossover variation operator;
- p_m is the probability of applying the mutation variation operator;
- p_c is the probability of applying the crossover variation operator;
- elitism is a flag which activates elitism during the evolutionary process; and
- reproduction is a flag that states whether the crossover and the mutation can be applied to the same individual (the case when reproduction is set to True). If reproduction is set to False, then either crossover or mutation will be applied (this resembles a GP-like search procedure).

The algorithmic logic is implemented in the overridden **solve** method of the **GeneticAlgorithm** class. Moreover, the class implements a private method, called **_elite_replacement**, which directly replaces P with ${P}^{\prime}$ if the elite is the best offspring; otherwise, when the elite is the best parent, P is replaced with ${P}^{\prime}$ and the elite is transferred to ${P}^{\prime}$ (by replacing a randomly selected offspring).
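The replacement rule can be sketched in isolation as follows (plain integers stand in for solutions; the names are illustrative, not GPOL's **_elite_replacement** itself):

```python
import random

def elite_replacement(parents, offspring, fitness, rng, min_=True):
    """If the overall elite is a parent, transfer it into the offspring
    population by replacing a random offspring; otherwise keep the
    offspring population as is (minimization by default)."""
    key = fitness if min_ else (lambda s: -fitness(s))
    elite = min(parents + offspring, key=key)
    new_pop = list(offspring)
    if elite in parents:
        new_pop[rng.randrange(len(new_pop))] = elite
    return new_pop, elite

rng = random.Random(0)
fit = lambda x: x * x
pop, elite = elite_replacement([3, -1, 5], [2, 4, 6], fit, rng)
```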

#### 3.3.2. Genetic Programming (GP)

Note that one can use **GeneticAlgorithm** to solve potentially any kind of problem, whether it is of a continuous, combinatorial, or inductive program synthesis nature. The only two things one has to take into consideration are (1) the correct specification of the problem-specific S and (2) the operators. Following this perspective, by creating an instance of the class **GeneticAlgorithm** with, for example, ramped half-and-half (RHH) initialization, tournament selection, swap crossover, and sub-tree mutation, all of them implemented in this library, one obtains a standard GP algorithm. Recall that a similar flexible behaviour is present in the branch of LS algorithms: by providing HC or SA with, for example, grow initialization and sub-tree mutation, one obtains an LS-based program induction algorithm. Appendix B.3 exemplifies how to create an instance of **GeneticAlgorithm** and apply it to different problems.

#### 3.3.3. Geometric Semantic Genetic Programming (GSGP)

The algorithm is materialized through the class **GSGP**, a subclass of **GeneticAlgorithm**, which encapsulates the aforementioned efficient implementation of GSOs and is intended to work in conjunction with **SMLGS** (which was also specially designed to incorporate the aforementioned guidelines). The class **GSGP** is characterized by the following instance attributes:

- pi, best_sol, pop_size, pop, initializer, selector, mutator, crossover, p_m, p_c, elitism, reproduction, seed, and device are inherited from the **GeneticAlgorithm** class;
- _reconstruct is a flag stating whether the initial population and the intermediary random trees should be disk-cached. If the value is set to False, then there is no possibility of reconstructing the individuals after the search is finished; this scenario is useful for conducting parameter tuning, for example. If the value is set to True, then the individuals can be reconstructed by means of an auxiliary procedure (the function **gpol.utils.inductive_programming.prm_reconstruct_tree**) after the search is finished; this scenario is useful when the final solution needs to be deployed, for example;
- path_init_pop is a connection string toward the initial population's repository;
- path_rts is a connection string toward the random trees' repository;
- pop_ids are the IDs of the current population (the population of parents);
- history is a dictionary which stores the history of operations applied on each offspring. In abstract terms, it stores a one-level family tree of a given offspring. Specifically, history stores as a key the offspring's ID and as a value a dictionary with the following key-value pairs:
  - “Iter” is the iteration's number;
  - “Operator” is the variation operator that was applied on a given offspring;
  - “T1” is the ID of the first parent;
  - “T2” is the ID of the second parent (if GSC was applied);
  - “Tr” is the ID of a random tree;
  - “ms” is the mutation's step (if GSM was applied);
  - “Fitness” is the offspring's training fitness.

Appendix B.4 exemplifies how to create an instance of **GSGP** and apply it in problem solving, whereas Appendix B.5 demonstrates how to reconstruct an individual generated by means of **GSGP**.

#### 3.3.4. Differential Evolution (DE)

The algorithm is materialized through the class **DifferentialEvolution**, a subclass of **PopulationBased**, characterized by the following instance attributes:

- pi, initializer, best_sol, pop_size, pop, mutator, seed, and device are inherited from **PopulationBased**;
- selector is an operator that selects parents for the sake of mutation;
- crossover is the crossover operator.

Additionally, the class implements a private method, called **_replacement**, which compares each trial vector with its respective target and returns the fitter of the two.

Appendix B.6 exemplifies how to create an instance of **DifferentialEvolution** and apply it to different problems.

#### 3.3.5. Particle Swarm Optimization

The synchronous and asynchronous variants of the algorithm are materialized through the **SPSO** and **APSO** classes. The former extends **PopulationBased**, whereas the latter extends **SPSO**; both require an additional parameter, called v_clamp, which allows one to bound the velocity vector to foster convergence, as suggested in [53]. The **solve** method of **SPSO** and **APSO** reflects the underlying algorithmic logic which guides the search procedure. Additionally, both classes implement a private method, called **_update**, which efficiently encapsulates step number 2 of the procedural steps of S-PSO and A-PSO and constitutes the main difference between the two classes. In the scope of swarm intelligence, the force-generating mechanism that yields ${\overrightarrow{x}}_{(p,i)}$ (and, essentially, dictates how the candidate solutions will “move” across S) is encapsulated in the mutator function and is provided as a parameter during the algorithm's instantiation. In this sense, the force-generating function is completely abstracted from the PSO algorithm, meaning that the user can easily replace it with any other update rule, provided the interface is respected.
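One S-PSO-style update step with velocity clamping can be sketched as follows (an illustration of the kind of force-generating rule the mutator encapsulates; the exact update used by GPOL may differ):

```python
import random

def pso_update(pos, vel, p_best, g_best, w, c1, c2, v_clamp, rng):
    """One velocity/position update per dimension, with the velocity
    bounded to [-v_clamp, v_clamp] to foster convergence."""
    new_vel, new_pos = [], []
    for x, v, pb, gb in zip(pos, vel, p_best, g_best):
        v_new = (w * v
                 + c1 * rng.random() * (pb - x)    # cognitive component
                 + c2 * rng.random() * (gb - x))   # social component
        v_new = max(-v_clamp, min(v_clamp, v_new)) # bound the velocity
        new_vel.append(v_new)
        new_pos.append(x + v_new)
    return new_pos, new_vel

rng = random.Random(1)
pos, vel = pso_update([0.0, 0.0], [5.0, -5.0], [1.0, 1.0], [2.0, 2.0],
                      w=0.7, c1=1.5, c2=1.5, v_clamp=2.0, rng=rng)
```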

Appendix B.7 exemplifies how to create an instance of **SPSO** and apply it to different problems.

## 4. Operators

#### 4.1. Initialization

Initializers for single-point algorithms generate one candidate solution; this is the case for **RandomSearch**, **HillClimbing**, and **SimulatedAnnealing**. In the case of PB algorithms, an initializer has one additional parameter, called n_sols, which represents the population's/swarm's size. Such branching is necessary because, by definition, not all the PB initialization operators can generate one solution. Additionally, it allows the user to encapsulate a computationally more efficient generation of a set of initial solutions.

The prefix “prm” in the names of some initializers, such as those derived from **grow** and **full**, represents the usage of a Python closure; for example, **prm_grow** is a special adaptation of the **grow** function that accepts as a parameter the S of a given problem's instance. Additionally, this solution allows one to have deeper control over the operators' functioning, an important feature for research purposes.

#### 4.2. Selection

Selection operators receive as parameters an object of type **Population** and min_. This configuration suits the majority of selectors. However, some operators require a larger enclosing scope; this is the case of tournament selection, which requires an additional parameter: the selection pressure. To remedy this situation, Python closures are used to provide the selectors with the necessary outer scope. In this sense, the prefix “prm” in **prm_tournament** represents the usage of a Python closure which allows the user to parametrize the necessary pressure. Similarly, **prm_dernd_selection** receives an outer parameter called n_sols, which tells the function how many random vectors to select for the sake of the DE's mutation. In this sense, the parameter n_sols allows for easily including different DE mutation strategies, as these might involve different numbers of random vectors.
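The closure pattern can be illustrated with a standalone tournament selector (the names and the pressure-to-tournament-size mapping are assumptions for illustration, not GPOL's **prm_tournament**):

```python
import random

def prm_tournament_sketch(pressure):
    """The outer function fixes the selection pressure; the inner
    function performs the actual selection."""
    def tournament(pop, fitness, min_, rng):
        # Sample a fraction of the population and return the fittest.
        n = max(1, round(pressure * len(pop)))
        contestants = rng.sample(pop, n)
        return (min if min_ else max)(contestants, key=fitness)
    return tournament

rng = random.Random(0)
select = prm_tournament_sketch(pressure=0.5)
winner = select([4, 1, 3, 2], fitness=lambda x: x, min_=True, rng=rng)
```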

#### 4.3. Variation Functions

- **mutator(repr_)**, where repr_ stands for the representation of a single parent solution to be mutated; the function returns the mutated copy of repr_;
- **crossover(p1_repr, p2_repr)**, where p1_repr and p2_repr stand for the representations of two different parents; the function returns two modified copies of p1_repr and p2_repr.

- DE/rand/N: one type of DE mutation that creates the donor vector (the mutant) by adding N weighted differences between $2N$ randomly selected parent vectors to another ${(2N+1)}^{th}$ random parent. The underlying weights are provided to the functions through Python closures. Thus, the signature of these functions simplifies to **mutator(parents)**, where parents is a collection containing $(2N+1)$ randomly selected parents. The function returns one donor vector (the mutant).
- DE/best/N: another type of DE mutation that creates the donor vector by adding N weighted differences between $2N$ randomly selected parent vectors to the best parent at the current iteration. Similarly to DE/rand/N, the weights are provided through Python closures. The signature of these functions simplifies to **mutator(best, parents)**, where best stands for the best parent and parents contains the $2N$ random parents. The function returns one donor vector (the mutant).
- DE/target-to-best/1: another type of DE mutation that creates the donor by adding to the target vector (the current parent) two weighted differences: one between two randomly selected parents and one between the best parent and the target vector itself. Similarly to the previous operators, the weights are provided through Python closures. The signature of these functions simplifies to **mutator(target, best, parents)**, where target stands for the current parent, best stands for the best parent, and parents contains two random parents. The function returns one donor vector (the mutant).
- **crossover(donor, target)**, where donor and target stand for the representations of two different vectors: the donor vector, generated by means of mutation, and the target vector (the current parent). The function returns the trial vector (the result of the DE's crossover).
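The DE/rand/1 case with binomial crossover can be sketched standalone as follows (plain lists stand in for tensors; the function names are illustrative, not GPOL's operators):

```python
import random

def de_rand_1(parents, weight):
    """DE/rand/1 mutation: donor = x_r0 + F * (x_r1 - x_r2), with the
    weight F fixed by a closure in the scheme described above."""
    x0, x1, x2 = parents
    return [a + weight * (b - c) for a, b, c in zip(x0, x1, x2)]

def binomial_crossover(donor, target, p_cr, rng):
    """Binomial DE crossover: each gene comes from the donor with
    probability p_cr (forcing at least one donor gene, as is common,
    is omitted here for brevity)."""
    return [d if rng.random() < p_cr else t for d, t in zip(donor, target)]

rng = random.Random(0)
donor = de_rand_1([[1.0, 1.0], [3.0, 0.0], [2.0, 2.0]], weight=0.5)
trial = binomial_crossover(donor, [0.0, 0.0], p_cr=0.9, rng=rng)
```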

Similarly, the prefix “prm” in **prm_pso** represents the usage of a closure that allows the user to specify the necessary social (c1) and cognitive (c2) factor weights, along with the inertia's range (w_max and w_min).

## 5. Solutions

Candidate solutions are materialized through the class **Solution**, which encapsulates the necessary attributes and behavior of a given candidate solution, specifically the unique identification, the representation in light of a given problem, the validity state in light of S, and the fitness value(s) (which can be several, depending on whether data partitioning was used). To meet the library's need for flexibility, in this release, the solution's representation can take one of two forms: either a list or a tensor (torch.Tensor). The former relates to GP trees, and the latter relates to the remaining array-based representations.

Sets of candidate solutions, used by the PB metaheuristics, are encapsulated in the class **Population**.

## 6. Conclusions

## Supplementary Materials

## Author Contributions

## Funding

## Data Availability Statement

## Conflicts of Interest

## Appendix A. Creating Problem Instances

#### Appendix A.1. Creating an Instance of Box

Figure A1 exemplifies how to create an instance of the **Box** problem to find the minimum of the 2D Rastrigin function, a popular continuous optimization test function in the scientific community [60,61]. As illustrated in the example and in the code comments, the main steps are (1) to define the search space S, by specifying the number of dimensions of the Rastrigin problem and the (regular) bounds, and (2) to create an instance of the **Box** problem by passing to the constructor the aforementioned S, the fitness function, the optimization's purpose, and whether to bound the outlying solutions' dimensions.
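The Rastrigin fitness function used here can be written in plain Python as follows (an illustrative sketch; GPOL's own fitness functions operate on tensors, and the sspace dict shown merely mirrors the keys described in Section 2.1):

```python
import math

def rastrigin(repr_):
    """Rastrigin function: f(x) = A*D + sum(x_i^2 - A*cos(2*pi*x_i)),
    with A = 10; its global minimum is f(0, ..., 0) = 0."""
    a = 10.0
    return a * len(repr_) + sum(x * x - a * math.cos(2 * math.pi * x)
                                for x in repr_)

# Typical Box setup for this problem: 2 dimensions, regular bounds [-5.12, 5.12].
sspace = {"n_dims": 2, "constraints": [-5.12, 5.12]}
minimum = rastrigin([0.0, 0.0])
```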

#### Appendix A.2. Creating an Instance of TSP

This appendix exemplifies how to create an instance of **TSP** to find the minimum travel distance for a tour among 13 cities. The first part of the example defines a (symmetric) distance matrix whose $(i, j)$ entry corresponds to the distance from location i to location j in miles (for more information, follow the source from which the example was taken [62]). Note that the matrix can also be asymmetric. Once the distance matrix is declared, one has to do only two things: create a TSP-specific S, by providing the distance matrix and the index of the origin city, and declare an instance of **TSP**, by providing the S, the fitness function (in this case, the traveling distance), and the optimization's purpose (minimization). When compared to the example of Figure A1, one can already notice the intrinsic characteristics of the API in regard to problem instances' creation. Note that the variables defined and used there, namely the device, are assumed to be accessible in the “enclosing scope” of the current example.

#### Appendix A.3. Creating Instances of Knapsack01 and KnapsackBounded

This appendix exemplifies how to create instances of the **Knapsack01** and **KnapsackBounded** problems to pack a fixed-size knapsack with the most valuable items from the set of available items. Recall that the latter allows one to have several copies of an item. In this example, the items' weights and values are randomly generated 1D tensors (vectors); however, these can hold any user-specified values (for example, one can import them from a file). Note that both instances in this example use the same S; the only difference is that, before creating an instance of **KnapsackBounded**, S is altered in its capacity and the items' quantity bounds are added (in this example, each item can appear four times at most).

#### Appendix A.4. Creating an Instance of SML

This appendix exemplifies how to create an instance of **SML** to predict the median value of owner-occupied homes in Boston, a popular dataset for ML algorithm benchmarks [29], originally published by [63].

## Appendix B. Algorithm Creation and Applications for Problem-Solving

#### Appendix B.1. Applying Random Search

The only exception is **SMLGS**, because it was specifically designed to work with **GSGP**. The script exemplifies how a given metaheuristic, such as RS, can be used to solve any type of problem in the scope of this library. The modular implementation allows one to easily reuse the code of **RandomSearch** for any type of problem, by simply providing the algorithm's instance with a problem-specific initialization function. In this sense, the initialization functions generate initial solutions according to the instance's search space S.

#### Appendix B.2. Applying Local Search

The **SimulatedAnnealing** constructor is provided the necessary parameters' dictionary, and the search is executed for a given number of iterations. Note that a detailed description of the neighborhood functions, along with the other operators, can be found in Section 4.

#### Appendix B.3. Applying Genetic Algorithms

The flexibility of **GeneticAlgorithm** allows using the GA (and LS) for any kind of representation, including trees. Note that this example follows the variables defined in previous demonstrations, where the reader was shown how to apply RS and SA for solving different types of problems; these variables are assumed to be cached. The current example shows how GA can be applied to solve any type of problem when provided with problem-specific initialization, mutation, and crossover functions. Note that, in the context of GA, the LS's neighborhood size and neighborhood functions are seen as the GA's population size and mutation operators, respectively, and vice versa. The **GeneticAlgorithm** constructor is provided the necessary parameters' dictionary, and the search is executed for a given number of iterations. Note that a detailed description of the crossover functions, along with the other operators, can be found in Section 4.

#### Appendix B.4. Applying GSGP

First, unlike an instance of the **SML** problem type, an instance of **SMLGS** receives two tensors instead: X, representing the input data, and y, the real-valued target. Second, to create an algorithm's instance, one must specify two additional parameters: path_init_pop and path_rts, which represent the connection strings toward the initial population's and random trees' repositories, respectively. During GSGP's execution, constructed trees are stored in those folders to make individual reconstruction possible.

#### Appendix B.5. Reconstructing Trees after GSGP

#### Appendix B.6. Applying Differential Evolution

#### Appendix B.7. Applying Particle Swarm Optimization

## References

- Namkoong, J.E.; Henderson, M. Responding to Causal Uncertainty through Abstract Thinking. Curr. Dir. Psychol. Sci. **2019**, 28, 547–651.
- Smith, P.; Wigboldus, D.; Dijksterhuis, A. Abstract thinking increases one’s sense of power. J. Exp. Soc. Psychol. **2008**, 44, 378–385.
- Vallacher, R.R.; Wegner, D.M. Levels of personal agency: Individual variation in action identification. J. Personal. Soc. Psychol. **1989**, 660–671.
- Optimize Live Editor Task—MATLAB & Simulink. Available online: https://www.mathworks.com/help/matlab/math/optimize-live-editor-matlab.html (accessed on 16 February 2021).
- Optimization (scipy.optimize)—SciPy v1.6.0 Reference Guide. Available online: https://docs.scipy.org/doc/scipy/reference/tutorial/optimize.html (accessed on 16 February 2021).
- DEAP Documentation—DEAP 1.3.1 Documentation. Available online: https://deap.readthedocs.io/en/master/ (accessed on 16 February 2021).
- Welcome to gplearn’s Documentation!—gplearn 0.4.1 Documentation. Available online: https://gplearn.readthedocs.io/en/stable/ (accessed on 16 February 2021).
- Welcome to PySwarms’s Documentation!—PySwarms 1.3.0 Documentation. Available online: https://pyswarms.readthedocs.io/en/latest/index.html (accessed on 16 February 2021).
- Benitez-Hidalgo, A.; Nebro, A.J.; Garcia-Nieto, J.; Oregi, I.; Del Ser, J. jMetalPy: A Python framework for multi-objective optimization with metaheuristics. Swarm Evol. Comput. **2019**, 51, 100598.
- Project-Platypus/Platypus: A Free and Open Source Python Library for Multiobjective Optimization. Available online: https://github.com/Project-Platypus/Platypus (accessed on 20 April 2021).
- Karban, P.; Pánek, D.; Orosz, T.; Petrášová, I.; Doležel, I. FEM based robust design optimization with Agros and Ārtap. Comput. Math. Appl. **2021**, 81, 618–633.
- OR-Tools—Google Developers. Available online: https://developers.google.com/optimization (accessed on 16 February 2021).
- Voß, S. Metaheuristics. In Encyclopedia of Optimization; Floudas, C.A., Pardalos, P.M., Eds.; Springer: Boston, MA, USA, 2009; pp. 2061–2075.
- Aarts, E.; Korst, J. Simulated Annealing and Boltzmann Machines: A Stochastic Approach to Combinatorial Optimization and Neural Computing; Wiley-Interscience Series in Discrete Mathematics and Optimization; Wiley: Hoboken, NJ, USA, 1989.
- Papadimitriou, C.H.; Steiglitz, K. Combinatorial Optimization: Algorithms and Complexity; Taylor & Francis: Abingdon, UK, 1984.
- Fletcher, R.; Leyffer, S. Nonlinear programming without a penalty function. Math. Program. **1999**, 91, 239–269.
- Price, K.; Storn, R.M.; Lampinen, J.A. Differential Evolution: A Practical Approach to Global Optimization (Natural Computing Series); Springer: Berlin/Heidelberg, Germany, 2005.
- Jeyakumar, V.; Rubinov, A. Continuous Optimization: Current Trends and Modern Applications; Springer: Berlin/Heidelberg, Germany, 2005.
- Bartashevich, P.; Grimaldi, L.; Mostaghim, S. PSO-based search mechanism in dynamic environments: Swarms in vector fields. In Proceedings of the 2017 IEEE Congress on Evolutionary Computation (CEC), Donostia, Spain, 5–8 June 2017; pp. 1263–1270.
- Liang, J.J.; Qu, B.Y.; Suganthan, P.N. Problem definitions and evaluation criteria for the CEC 2014 special session and competition on single objective real-parameter numerical optimization. Technical Report; Computational Intelligence Laboratory, Zhengzhou University: Zhengzhou, China; Nanyang Technological University: Singapore, **2014**, 635, 490.
- GEATbx—Genetic and Evolutionary Algorithms Toolbox in Matlab—Main Page. Available online: http://www.geatbx.com (accessed on 16 February 2021).
- Applegate, D.L.; Bixby, R.E.; Chvatal, V.; Cook, W.J. The Traveling Salesman Problem: A Computational Study (Princeton Series in Applied Mathematics); Princeton University Press: Princeton, NJ, USA, 2007.
- Martello, S.; Toth, P. Knapsack Problems: Algorithms and Computer Implementations; John Wiley & Sons, Inc.: Hoboken, NJ, USA, 1990.
- Kitzelmann, E.; Schmid, U. Inductive Synthesis of Functional Programs: An Explanation Based Generalization Approach. J. Mach. Learn. Res. **2006**, 7, 429–454.
- Schmid, U. Inductive Synthesis of Functional Programs, Universal Planning, Folding of Finite Programs, and Schema Abstraction by Analogical Reasoning; Springer Science & Business Media: Berlin/Heidelberg, Germany, 2003; Volume 2654.
- Castelli, M.; Silva, S.; Vanneschi, L. A C++ framework for geometric semantic genetic programming. Genet. Program. Evolvable Mach. **2014**, 16, 73–81.
- PyTorch, an Open Source Machine Learning Framework that Accelerates the Path from Research Prototyping to Production Deployment. Available online: https://pytorch.org/ (accessed on 16 April 2021).
- Joblib: Running Python Functions as Pipeline Jobs. Available online: https://joblib.readthedocs.io/en/latest/ (accessed on 16 April 2021).
- Pedregosa, F.; Varoquaux, G.; Gramfort, A.; Michel, V.; Thirion, B.; Grisel, O.; Blondel, M.; Prettenhofer, P.; Weiss, R.; Dubourg, V.; et al. Scikit-learn: Machine Learning in Python. J. Mach. Learn. Res. **2011**, 12, 2825–2830.
- Mitchell, T.M. Machine Learning, 1st ed.; McGraw-Hill, Inc.: New York, NY, USA, 1997.
- Hoos, H.; Stützle, T. Stochastic Local Search: Foundations & Applications; Morgan Kaufmann Publishers Inc.: San Francisco, CA, USA, 2004.
- Gonçalves, I.; Silva, S.; Fonseca, C.M. Semantic Learning Machine: A Feedforward Neural Network Construction Algorithm Inspired by Geometric Semantic Genetic Programming. In Progress in Artificial Intelligence; Pereira, F., Machado, P., Costa, E., Cardoso, A., Eds.; Springer International Publishing: Cham, Switzerland, 2015; pp. 280–285.
- Holland, J.H. Adaptation in Natural and Artificial Systems: An Introductory Analysis with Applications to Biology, Control and Artificial Intelligence; MIT Press: Cambridge, MA, USA, 1992.
- Darwin, C. On the Origin of Species by Means of Natural Selection; Murray: London, UK, 1859.
- Mitchell, M. An Introduction to Genetic Algorithms; MIT Press: Cambridge, MA, USA, 1998.
- Koza, J.R. Genetic Programming: On the Programming of Computers by Means of Natural Selection; MIT Press: Cambridge, MA, USA, 1992.
- Vanneschi, L.; Poli, R. Genetic Programming—Introduction, Applications, Theory and Open Issues. In Handbook of Natural Computing; Springer: Berlin/Heidelberg, Germany, 2012; pp. 709–739.
- Moraglio, A.; Krawiec, K.; Johnson, C.G. Geometric semantic genetic programming. In International Conference on Parallel Problem Solving from Nature; Springer: Berlin/Heidelberg, Germany, 2012; pp. 21–31.
- Vanneschi, L.; Castelli, M.; Silva, S. A survey of semantic methods in genetic programming. Genet. Program. Evolvable Mach. **2014**, 15, 195–214.
- Vanneschi, L.; Silva, S.; Castelli, M.; Manzoni, L. Geometric semantic genetic programming for real life applications. In Genetic Programming Theory and Practice XI; Springer: Berlin/Heidelberg, Germany, 2014; pp. 191–209.
- Castelli, M.; Vanneschi, L.; Popovič, A. Parameter evaluation of geometric semantic genetic programming in pharmacokinetics. Int. J. Bio Inspired Comput. **2016**, 8, 42–50.
- Bartashevich, P.; Bakurov, I.; Mostaghim, S.; Vanneschi, L. PSO-Based Search Rules for Aerial Swarms Against Unexplored Vector Fields via Genetic Programming. In International Conference on Parallel Problem Solving from Nature; Springer: Berlin/Heidelberg, Germany, 2018; pp. 41–53.
- Bakurov, I.; Castelli, M.; Vanneschi, L.; Freitas, M. Supporting medical decisions for treating rare diseases through genetic programming. In Applications of Evolutionary Computation; Kaufmann, P., Castillo, P., Eds.; Lecture Notes in Computer Science; Springer: Berlin/Heidelberg, Germany, 2019; pp. 187–203.
- Vanneschi, L. An Introduction to Geometric Semantic Genetic Programming. In NEO 2015; Springer: Berlin/Heidelberg, Germany, 2017; Volume 663, pp. 3–42.
- Castelli, M.; Castaldi, D.; Giordani, I.; Silva, S.; Vanneschi, L.; Archetti, F.; Maccagnola, D. An efficient implementation of geometric semantic genetic programming for anticoagulation level prediction in pharmacogenetics. In Portuguese Conference on Artificial Intelligence; Springer: Berlin/Heidelberg, Germany, 2013; pp. 78–89.
- Storn, R.; Price, K. Differential Evolution: A Simple and Efficient Adaptive Scheme for Global Optimization over Continuous Spaces. Tech. Rep. TR-95-012, ICSI **1995**, 23, 341–359.
- Storn, R. On the usage of differential evolution for function optimization. In Proceedings of North American Fuzzy Information Processing, Berkeley, CA, USA, 19–22 June 1996; pp. 519–523.
- Guo, S.M.; Yang, C.C.; Hsu, P.H.; Tsai, J.S.H. Improving Differential Evolution with a Successful-Parent-Selecting Framework. IEEE Trans. Evol. Comput. **2015**, 19, 717–730.
- Eltaeib, T.; Mahmood, A. Differential Evolution: A Survey and Analysis. Appl. Sci. **2018**, 8, 1945.
- Das, S.; Suganthan, P. Differential Evolution: A Survey of the State-of-the-Art. IEEE Trans. Evol. Comput. **2011**, 15, 4–31.
- Kennedy, J.; Eberhart, R. Particle swarm optimization. In Proceedings of the ICNN’95—International Conference on Neural Networks, Perth, WA, Australia, 27 November–1 December 1995; Volume 4, pp. 1942–1948.
- Shi, Y.; Eberhart, R. A modified particle swarm optimizer. In Proceedings of the 1998 IEEE International Conference on Evolutionary Computation (IEEE World Congress on Computational Intelligence), Anchorage, AK, USA, 4–9 May 1998; pp. 69–73.
- Kennedy, J.; Eberhart, R.C. Swarm Intelligence; Morgan Kaufmann Publishers Inc.: San Francisco, CA, USA, 2001.
- Carlisle, A.; Dozier, G. An off-the-shelf PSO. In Proceedings of the Workshop on Particle Swarm Optimization, Purdue School of Engineering and Technology, Indianapolis, IN, USA, 6–7 April 2001.
- Mussi, L.; Cagnoni, S.; Daolio, F. Empirical assessment of the effects of update synchronization in Particle Swarm Optimization. In Proceedings of the 2009 AI*IA Workshop on Complexity, Evolution and Emergent Intelligence, Reggio Emilia, Italy, 9–12 December 2009; pp. 1–10.
- Rada-Vilela, J.; Zhang, M.; Seah, W. A Performance Study on Synchronous and Asynchronous Updates in Particle Swarm Optimization. In Proceedings of the 13th Annual Conference on Genetic and Evolutionary Computation, Dublin, Ireland, 12–16 July 2011; ACM: New York, NY, USA, 2011; pp. 21–28.
- Vanneschi, L.; Bakurov, I.; Castelli, M. An initialization technique for geometric semantic GP based on demes evolution and despeciation. In Proceedings of the Congress on Evolutionary Computation (CEC), San Sebastian, Spain, 5–8 June 2017; IEEE: Piscataway, NJ, USA, 2017; pp. 113–120.
- Bakurov, I.; Vanneschi, L.; Castelli, M.; Fontanella, F. EDDA-V2—An Improvement of the Evolutionary Demes Despeciation Algorithm. In International Conference on Parallel Problem Solving from Nature; Springer: Berlin/Heidelberg, Germany, 2018; pp. 185–196.
- Moraglio, A. Towards a Geometric Unification of Evolutionary Algorithms. Ph.D. Thesis, Department of Computer Science, University of Essex, Colchester, UK, 2007.
- Bajpai, P.; Kumar, M. Genetic Algorithm—An Approach to Solve Global Optimization Problems. Indian J. Comput. Sci. Eng. **2010**, 1, 199–206.
- Mühlenbein, H.; Schomisch, M.; Born, J. The parallel genetic algorithm as function optimizer. Parallel Comput. **1991**, 17, 619–632.
- Traveling Salesman Problem—OR-Tools—Google Developers. Available online: https://developers.google.com/optimization/routing/tsp (accessed on 16 February 2021).
- Harrison, D.; Rubinfeld, D.L. Hedonic housing prices and the demand for clean air. J. Environ. Econ. Manag. **1978**, 5, 81–102.

**Figure 1.** A block diagram giving a high-level overview of the relationships among algorithms, problems, and operators.

**Figure 2.** UML diagram of the algorithm class **SearchAlgorithm** and its subclasses, as implemented in GPOL.

**Figure 6.** A simple visual demonstration of GSM’s genotype–phenotype mapping and its property of inducing a unimodal error surface on any SML problem. Adapted from [44].

| Function | OP Type | MH | Description |
|---|---|---|---|
| prm_rnd_vint(lb, ub) | Knapsack | SP | vector generated under $\sim U\{lb,\,ub\}$ |
| prm_rnd_mint(lb, ub) | Knapsack | PB | matrix generated under $\sim U\{lb,\,ub\}$ |
| rnd_vshuffle | TSP | SP | permutation vector of cities |
| rnd_mshuffle | TSP | PB | permutation matrix of cities |
| rnd_vuniform | Continuous | SP | vector generated under $\sim U(lb,\,ub)$ |
| rnd_muniform | Continuous | PB | matrix generated under $\sim U(lb,\,ub)$ |
| grow, prm_grow(sspace) | SML-IP | SP | LISP tree created with the Grow method [36] |
| full, prm_full(sspace) | SML-IP | SP | LISP tree created with the Full method [36] |
| rhh | SML-IP | PB | list of LISP trees created with RHH [36] |
| prm_edda | SML-IP | PB | list of LISP trees created with EDDA [57,58] |

| Function | MH Type | Description |
|---|---|---|
| prm_tournament(pressure) | {GA, GSGP} | tournament selection of one individual |
| roulette_wheel | {GA, GSGP} | roulette-wheel selection of one individual |
| rank_selection | {GA, GSGP} | rank-based selection of one individual |
| rnd_selection | {GA, GSGP} | selects one individual at random |
| prm_dernd_selection(n_sols) | DE | random selection of n_sols vectors |
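The prm_ prefix in the table above denotes a factory that binds an operator’s parameters and returns the operator itself. The pattern can be sketched as follows; the body is an illustration of the calling convention, not GPOL’s implementation:

```python
import random

def prm_tournament_like(pressure):
    """Factory that binds the tournament size and returns a selection
    function (illustrative sketch of the prm_* pattern)."""
    def tournament(pop, fitnesses, rng, min_=True):
        # sample `pressure` distinct indices, keep the fittest
        picks = rng.sample(range(len(pop)), pressure)
        key = lambda i: fitnesses[i]
        winner = min(picks, key=key) if min_ else max(picks, key=key)
        return pop[winner]
    return tournament

rng = random.Random(42)
pop = ["a", "b", "c", "d"]
fits = [3.0, 1.0, 4.0, 2.0]
select = prm_tournament_like(pressure=4)
winner = select(pop, fits, rng)  # pressure == len(pop): picks the best
```

Binding parameters up front lets the search algorithm call every selection operator with one uniform signature, regardless of how many knobs each operator has.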

| Function | OP | MH | Description |
|---|---|---|---|
| one_point_xo | Knapsack01 | {GA, HC, SA} | one-point crossover |
| prm_n_point_xo(n) | Knapsack01 | {GA, HC, SA} | n-point crossover |
| binary_flip | Knapsack01 | {GA, HC, SA} | flips a randomly selected value ($\vec{x}_i \leftarrow \lnot\vec{x}_i$) |
| prm_ibinary_flip(prob) | Knapsack01 | {GA, HC, SA} | $\vec{x}_i \leftarrow \lnot\vec{x}_i$ with $P(M_i)=prob$ |
| prm_rnd_int_ibound(prob, lb, ub) | KnapsackBounded | {GA, HC, SA} | $\vec{x}_i \sim U\{lb,\,ub\}$ with $P(M_i)=prob$ |
| partially_mapped_xo | TSP | {GA, HC, SA} | partially mapped crossover |
| prm_iswap_mtn(prob) | TSP | {GA, HC, SA} | random swap of the $i$th element with $P(M_i)=prob$ |
| geometric_xo | Continuous Function | {GA, HC, SA} | geometric crossover [59] |
| prm_iball_mtn | Continuous Function | {GA, HC, SA} | ball mutation [59] |
| de_binomial_xo(prob) | Continuous Function | DE | binomial crossover for DE |
| de_exponential_xo(prob) | Continuous Function | DE | exponential crossover for DE |
| de_rand | Continuous Function | DE | DE/RAND/N mutation scheme |
| de_best | Continuous Function | DE | DE/BEST/N mutation scheme |
| de_target_to_best | Continuous Function | DE | DE/TARGET-TO-BEST/N mutation scheme |
| prm_pso(c1, c2, w_max, w_min) | Continuous Function | A-PSO | PSO’s force-generating equation (also known as the update rule) |
| swap_xo | SML-IP | {GA, HC, SA} | standard GP crossover (also known as swap crossover) |
| prm_gs_xo(initializer, device) | SML-IP | {GA, HC, SA} | GSC that works upon tree-like representations [38] |
| hoist_mtn | SML-IP | {GA, HC, SA} | hoist mutation |
| prm_point_mtn(sspace, prob) | SML-IP | {GA, HC, SA} | point mutation |
| prm_subtree_mtn(initializer) | SML-IP | {GA, HC, SA} | standard GP mutation (also known as subtree mutation) [38] |
| prm_gs_mtn(initializer, ms) | SML-IP | {GA, HC, SA} | GSM that works upon tree-like representations |
| prm_efficient_gs_xo(X, initializer) | SML-IP | GSGP | efficient GSC that works upon semantics [45] |
| prm_efficient_gs_mtn(X, initializer, ms) | SML-IP | GSGP | efficient GSM that works upon semantics [45] |
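As a concrete illustration of the simplest entries in the table above, one-point crossover and the independent binary flip for a 0/1 knapsack encoding can be written as follows (a sketch of the operators themselves, not GPOL’s code):

```python
import random

def one_point_crossover(p1, p2, rng):
    """One-point crossover on equal-length binary vectors."""
    cut = rng.randrange(1, len(p1))  # cut strictly inside the vector
    return p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]

def ibinary_flip(sol, prob, rng):
    """Flip each bit independently with probability prob."""
    return [1 - v if rng.random() < prob else v for v in sol]

rng = random.Random(0)
a, b = [0] * 6, [1] * 6
c1, c2 = one_point_crossover(a, b, rng)  # complementary offspring
m = ibinary_flip(a, 0.5, rng)
```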

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Bakurov, I.; Buzzelli, M.; Castelli, M.; Vanneschi, L.; Schettini, R. General Purpose Optimization Library (GPOL): A Flexible and Efficient Multi-Purpose Optimization Library in Python. *Appl. Sci.* **2021**, *11*, 4774.
https://doi.org/10.3390/app11114774
