A Collection of 30 Multidimensional Functions for Global Optimization Benchmarking

Abstract: A collection of thirty mathematical functions that can be used for optimization purposes is presented and investigated in detail. The functions are defined for an arbitrary number of dimensions and can be used as benchmark functions for unconstrained multidimensional single-objective optimization problems. The functions feature a wide variability in terms of complexity. We investigate the performance of three optimization algorithms on the functions: two metaheuristic algorithms, namely the Genetic Algorithm (GA) and Particle Swarm Optimization (PSO), and one mathematical algorithm, Sequential Quadratic Programming (SQP). All implementations are done in MATLAB, with full source code availability. The focus of the study is on the objective functions, the optimization algorithms used, and the suitability of each algorithm for solving each problem. We use the three optimization methods to investigate the difficulty and complexity of each problem and to determine whether the problem is better suited for a metaheuristic approach or for a mathematical, gradient-based method. We also investigate how increasing the dimensionality affects the difficulty of each problem and the performance of the optimizers. Some functions are extremely difficult to optimize efficiently, especially in higher dimensions. Notable examples are the last two new objective functions, F29 and F30, which are very hard to optimize even though the optimum point is clearly visible, at least in the two-dimensional case.


Introduction
Mathematical optimization is the process of finding the best element, with regard to a given criterion, from a set of available alternatives. Optimization problems arise in various quantitative disciplines, from computer science and engineering to economics and operations research. The development of solution methods for optimization problems has been of interest in mathematics and engineering for centuries [1].
Even though there are some well-established optimization methods, the truth is that there is no single method that outperforms all the others when several different optimization problems are considered. This is often referred to as the No Free Lunch (NFL) theorem [2,3]. Consequently, new optimization methods or new variants of existing ones are proposed on a regular basis [4][5][6]. When a new optimization method is proposed, the developers of the method usually choose a set of popular optimization problems (or objective functions) to test the algorithm on, which also serve as a basis for comparing the new algorithm with other, existing ones. The objective functions chosen for the testing phase are known as benchmark functions and play a decisive role in determining whether the new proposed algorithm can be considered successful, i.e., whether its performance is better than, or at least similar to, that of existing, established algorithms. Benchmark functions are usually defined in such a way that they can be computed in an arbitrarily chosen number of dimensions. As the number of dimensions increases, the complexity of the optimization task also increases. A certain optimization algorithm could perform very well for a small number of dimensions but poorly in higher-dimensional spaces. This is the so-called "Curse of Dimensionality", a well-known problem in data science referring to various phenomena that arise when analyzing and organizing data in high-dimensional spaces and that do not occur in low-dimensional settings [7]. Additionally, the size of the search domain is another important variable. Benchmark functions based on explicit mathematics are usually defined over an infinite domain; therefore, an appropriate size of the search space must be chosen a priori. As a result, choosing the benchmark functions, the number of dimensions, and the size of the search domain is not a trivial task when testing and comparing optimization algorithms.
In this study, we investigate a total of 30 mathematical functions that can be used as optimization benchmark functions. There is no consensus among researchers on how an optimization algorithm should be properly tested or which benchmark functions should be used in particular. The goal of the present study is not to answer this question. Instead, the study aims at providing a compilation of ready-to-use functions of various complexities that are suited for benchmarking purposes. We investigate and assess the properties and complexity of these functions by observing and comparing the difficulties encountered by popular optimization algorithms when searching for their respective optimum values. The selected methods used for these comparisons are: Genetic Algorithm (GA) [8][9][10], Particle Swarm Optimization (PSO) [11][12][13][14][15], and Sequential Quadratic Programming (SQP) [11,16,17]. Based on the obtained results, the complete set of the 30 functions can be used for checking the efficiency of any other optimization algorithm.
Before the description of the implemented methodology, a brief introduction to the basic concepts, notation, and common search strategies used in optimization methods is given in the following Sections 1.2 and 1.3.

Formulation of an Optimization Problem
An optimization problem is usually written in terms of an objective function f(x), which needs to be minimized (or maximized) and denotes the purpose of the problem. The vector x is known as the design vector, and it constitutes a candidate solution to the problem. It is composed of several design variables, x = {x1, . . . , xD}, that represent the unknown optimal parameters that are to be found. The number of design variables D is the number of dimensions of the design vector and of the optimization problem in general. Design variables are expressed in various forms and can have binary, integer, or real values. In all cases, some sort of boundaries must be specified to restrict the search space to a realistic domain Ω (i.e., the lower and upper bounds, Ω = [lb, ub], where lbi ≤ xi ≤ ubi for every i ∈ {1, . . . , D}) [18]. The optimization task is then described as the process of finding a design vector x* such that the following expression is fulfilled, for a minimization problem:

f(x*) ≤ f(x), for all x ∈ Ω (1)

The definition expressed through Equation (1) implies that there is no better solution to the optimization problem than the one denoted by the design vector x*. In that case, the solution is known as the global optimum. However, in most practical optimization problems, an exact solution such as x* is difficult or practically impossible to obtain. Instead, an approximate solution of the actual global optimum, which is usually a local minimum, can be acceptable for practical purposes. Moreover, most optimization problems in practice are subjected to restrictions within their search space, meaning that some values of the domain Ω are not valid as solutions to the problem, due to the imposed constraints. These constraints can be expressed as equality functions, h(x) = 0, or more frequently as inequality functions, g(x) ≤ 0.
When there are no constraints, other than the design variable bounds, the formal formulation of an optimization problem (for minimization) is simply:

min f(x), subject to x ∈ Ω = [lb, ub] (2)

Optimization Search Strategies
There are two general types of strategies that can be used to solve optimization problems. On the one hand, deterministic or exact methods are based on a solid mathematical formulation and are commonly used to solve simple optimization problems where the effort grows only polynomially with the problem size. However, if the problem is NP-hard, the computational effort grows exponentially, and even small-sized problems can become unsolvable by these methods as they usually get trapped in local minima. In the present study, we use SQP as a mathematical (exact) method.
Alternatively, metaheuristic optimization algorithms (MOAs) [19] are based on stochastic search strategies that incorporate some form of randomness or probability that increases their robustness [4,20,21]. As a result, such algorithms are very effective in handling hard or ill-conditioned optimization problems where the objective function may be nonconvex, nondifferentiable, and possibly discontinuous over a continuous, discrete, or mixed continuous-discrete domain. Furthermore, these algorithms often show good performance for many NP-complete problems and are widely used in practical applications. Although MOAs usually provide good quality solutions, they can offer no guarantee that the optimal solution has been found. In the present study, we use two well-known MOAs, namely the GA and PSO, as explained in detail in Section 2.1. MOAs have been used to solve mathematical problems as well as more practical optimization problems in various scientific fields, including computer science, economics, operations research, engineering, and others. Other popular MOAs that have been successfully applied to a variety of problems in different disciplines are Evolution Strategies (ES) [22,23], Differential Evolution (DE) [24][25][26][27][28], and Ant Colony Optimization (ACO) [15,29], among others.

Methodology
A total of 30 objective functions that can serve for benchmarking purposes are investigated, denoted as F01 to F30. Their mathematical expressions as well as a 2-dimensional graphical visualization and other details are thoroughly described in Appendix A. The functions are chosen according to the following specific criteria so that they are well-suited for benchmarking purposes: (i) They are scalable in terms of their number of dimensions, i.e., they can be defined for any number of dimensions D. (ii) They can be expressed explicitly in a clear mathematical form without any ambiguities. (iii) All correspond to minimization problems. Therefore, a specific minimum value (and a corresponding solution vector) exists. (iv) All functions have a minimum value of zero, for consistency. This is not a limitation, as a constant number can be easily added to any function, making the minimum value whatever is desired.
All the objective functions are investigated in this study in multiple numbers of dimensions, namely: (i) D = 5, (ii) D = 10, (iii) D = 30, and (iv) D = 50. In other words, each problem is defined and investigated with 5, 10, 30, or 50 variables. Three different optimization algorithms are used to find the minimum value of every objective function, in each of the chosen dimensions. The chosen algorithms and their respective parameters are discussed in Section 2.1.
Each of the optimization tasks is defined as an unconstrained optimization problem (see Equation (2)). All the tested objective functions are scalar-valued, f : R^D → R, where D is the number of dimensions. The search space Ω ⊂ R^D is box-shaped in the D-dimensional space and is defined by the lower and upper bound vectors, Ω = [lb, ub], where lbi ≤ xi ≤ ubi for every i ∈ {1, . . . , D} [18]. A design vector x with design variables x = {x1, . . . , xD} is a candidate solution inside the search space, x ∈ Ω (the adopted notation is introduced in Section 1.2). The obtained results are presented and compared in Section 3, where the complexity and properties of the presented objective functions are discussed.
All the simulations and the numerical work in this study have been completed in MATLAB. All the work is available with its source code (in a GitHub repository), where any interested researcher can download the scripts, run the program, and reproduce the results on their own computer. This is particularly useful for researchers as they can (i) use the provided functions for their optimization and benchmarking work; (ii) use the provided optimizers for other optimization problems; and (iii) investigate the performance and suitability of these algorithms in optimizing the provided functions in various dimensions, replicating and validating the results of the present study.

Optimization Algorithms Used
We have chosen three well-known optimization algorithms to study the selected optimization functions:
GA and PSO are metaheuristic methods that use a population of agents (or particles) at each generation (iteration). In addition, they are stochastic methods, which means that the final result of the optimization procedure will be different each time the method is run. For this reason, we run these two algorithms 50 times each and process the results statistically in the end. On the other hand, SQP is a deterministic method which gives the same result every time the algorithm is run, provided that the starting point of the search is the same. In this study, SQP is also run 50 times, starting from different random points in the design space. After the results of 50 runs for each algorithm have been obtained, we calculate and report the average and the median objective function values, as well as the standard deviation, over the 50 runs, for each problem. In addition, we report the median values of two useful evaluation metrics, ∆x and ∆f [30,31], which are defined in the domain space and the image space, respectively. Finally, we calculate the median value of a third, final evaluation metric, ∆t, which is a combination of the other two. The metrics are described in detail in Section 2.3.
All three algorithms (GA, PSO, SQP) are based on MATLAB implementations and are executed using the MATLAB commands ga, particleswarm, and fmincon, respectively.
GA uses the following default parameters:
• 'CrossoverFraction', 0.8. The CrossoverFraction option specifies the fraction of each population, other than elite children, that is made up of crossover children;
• 'EliteCount', 0.05*PopulationSize. EliteCount specifies the number of elite children.
For the MATLAB fmincon command, which is a mathematical optimizer, we also use the additional option 'Algorithm', 'sqp' to ensure that the SQP variant of the mathematical optimizer is employed.
We use the maximum number of function evaluations as the only convergence criterion for GA and PSO, i.e., both algorithms stop after a certain number of function evaluations is completed. In the case of SQP (fmincon MATLAB command), we additionally enforce very strict tolerance settings that affect the convergence criterion, to try to ensure that the maximum number of function evaluations will be reached, so that the comparison is reasonably fair between the three methods. Since GA, PSO, and SQP are each run 50 times for each problem, the total number of optimization problems solved is 3 (methods) × 4 (different dimensions) × 30 (problems) × 50 (runs) = 18,000. To maintain consistency for all problems and all the different cases, for GA and PSO the population size is set to NP = 10·D and the maximum number of iterations (or generations) is set to MaxIter = 20·D − 50. The maximum number of function evaluations is then MaxFE = NP·MaxIter. Table 1 shows the population size, maximum number of generations/iterations, and maximum number of function evaluations for each category of problems, based on the number of dimensions.
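The setup described above can be sketched in MATLAB as follows. This is a minimal illustration, not the repository code: the objective handle fobj, the bounds lb and ub, and the variable names are placeholders, while the option names follow the standard MATLAB ga, particleswarm, and fmincon interfaces.

```matlab
% Sketch of the optimizer setup described above (illustrative only).
D       = 10;                 % number of dimensions
NP      = 10*D;               % population / swarm size
MaxIter = 20*D - 50;          % maximum generations / iterations
MaxFE   = NP*MaxIter;         % maximum function evaluations

gaOpts  = optimoptions('ga', 'PopulationSize', NP, ...
    'MaxGenerations', MaxIter, 'CrossoverFraction', 0.8);
psoOpts = optimoptions('particleswarm', 'SwarmSize', NP, ...
    'MaxIterations', MaxIter);
sqpOpts = optimoptions('fmincon', 'Algorithm', 'sqp', ...
    'MaxFunctionEvaluations', MaxFE);

% Example calls (fobj maps a 1-by-D row vector to a scalar; lb, ub are
% 1-by-D bound vectors):
% [x, fval] = ga(fobj, D, [], [], [], [], lb, ub, [], gaOpts);
% [x, fval] = particleswarm(fobj, D, lb, ub, psoOpts);
% x0 = lb + rand(1, D).*(ub - lb);   % random starting point for SQP
% [x, fval] = fmincon(fobj, x0, [], [], [], [], lb, ub, [], sqpOpts);
```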

Objective Functions
The selected objective functions, together with their suggested search range and the location of the global optimum x* in the design space, are briefly presented in Table 2. For uniformity reasons, the optimum (minimum) value of all functions is zero in all cases (fi(x*) = 0, for all i = 1, 2, . . . , 30). However, the location of the minimum, x*, varies among the problems. It is at x* = {0, 0, . . . , 0} in 24 of the functions (80% of them), while it is different in the remaining 6, namely F04, F11, F12, F13, F17, and F21. At this point, it is worth noting that some algorithms, such as PSO, tend to converge, at least in the free response of the associated dynamical system, toward the point {0, 0, . . . , 0}, and this can cause a bias in the procedure, favoring these algorithms in cases where the optimum lies at or near {0, 0, . . . , 0}. For a fair and more general comparison, it would be advisable to shift and rotate the functions using proper transformations before using them. On the other hand, the direct comparison of the performance of the different algorithms is not the main purpose of the present study, and to keep things simple and consistent we use these functions in their original form in the paper and the MATLAB code implementation.
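The shift transformation mentioned above can be sketched with a simple function-handle wrapper. This is an illustrative example, not part of the study's code; the Sphere function and the shift vector s are arbitrary choices.

```matlab
% Shifting a benchmark function: given a handle fobj with its optimum at
% the origin, the wrapped handle moves the optimum to the point s without
% changing the minimum value (illustrative sketch).
D      = 10;
fobj   = @(X) sum(X.^2, 2);    % example: Sphere, optimum at the origin
s      = 5*ones(1, D);         % desired optimum location (arbitrary)
fShift = @(X) fobj(X - s);     % shifted function: optimum now at x = s
```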
The properties, mathematical formulation, suggested search space, and the location of the global minimum for each function are given in detail in Appendix A, together with figures visualizing the functions in the simple two-dimensional (D = 2) case. The mathematical functions have been implemented in MATLAB and their code has been optimized to achieve a faster execution time. Wherever possible, the use of "for-loops" is avoided and replaced with vectorized operations, as it is known that MATLAB is slow when processing for-loops, while it is very fast and efficient in handling vectors and matrices. Only 3 of the functions, namely F06 (Weierstrass), F17 (Perm D, Beta), and F19 (Expanded Schaffer's F6), use some limited for-loops in their code, while the other functions use only vectorized operations without any for-loops. Most functions are very fast to calculate using a modern computer, with the exceptions of F06 (Weierstrass function) and F17 (Perm D, Beta function), which require relatively more time, especially for the higher-dimension cases.
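The difference between the loop-based and the vectorized style can be illustrated with a minimal sketch (the Sphere function is used here purely as an example; the matrix X and its size are arbitrary):

```matlab
% Loop-based vs. vectorized evaluation of a simple benchmark function.
% X is an N-by-D matrix holding N candidate solutions, one per row.
X = rand(1000, 50);

% Loop-based version (slow in MATLAB):
f_loop = zeros(size(X, 1), 1);
for n = 1:size(X, 1)
    for i = 1:size(X, 2)
        f_loop(n) = f_loop(n) + X(n, i)^2;
    end
end

% Vectorized version (preferred): one objective value per row of X.
f_vec = sum(X.^2, 2);

max(abs(f_loop - f_vec))   % the two versions agree up to round-off
```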

Evaluation Metrics
Various metrics can be used for the evaluation of the performance of an optimization algorithm on an objective function. In this study, we first use the average value of the objective function, the median value, and the standard deviation over 50 runs. Although these can provide some information on the performance of each algorithm on each problem, they are not normalized metrics and they are not comparable among different functions. The functions are defined over various ranges, in different dimensions, while their values within the multidimensional search space also vary. For this reason, we use three additional normalized evaluation metrics, ∆x, ∆f, and ∆t [30,31], in particular their median values over 50 runs. ∆x is the root mean square of the normalized Euclidean distance (in the domain space) between the optimizer-found optimum location x and the location of the global optimum x*. ∆f is the associated normalized distance in the image space. The first two metrics are defined as follows:

∆x = sqrt( (1/D) ∑_{i=1}^{D} ((xi − x*i)/Ri)^2 ) (3)

∆f = (fmin − f*min)/(f*max − f*min) (4)

where Ri is the range of the i-th variable, i.e., Ri = ubi − lbi, fmin is the final objective function value found by an optimizer, f*min = 0 is the global optimum, which is zero for all functions in the present study, and f*max is the maximum value of the objective function in the search space. The third metric, ∆t, is a combination of the other two, as shown in Equation (5), which gives an overall evaluation of the quality of the final result.
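The two definitions in Equations (3) and (4) can be sketched in MATLAB as follows. The numerical values below (bounds, the found solution, and f*max) are purely illustrative placeholders, not data from the study:

```matlab
% Sketch of the two normalized metrics for one optimization run
% (all numbers here are illustrative placeholders).
D        = 10;
lb       = -100*ones(1, D);   ub = 100*ones(1, D);
xStar    = zeros(1, D);       fStarMin = 0;        % global optimum data
fStarMax = 1e5;               % would be approximated via Monte Carlo
xFound   = 0.5*ones(1, D);    fMin = sum(xFound.^2);  % optimizer output

R      = ub - lb;                                  % per-variable ranges
deltaX = sqrt(mean(((xFound - xStar)./R).^2));     % Equation (3)
deltaF = (fMin - fStarMin)/(fStarMax - fStarMin);  % Equation (4)
```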
Again, the final value of the ∆t evaluation metric reported in this study is the median value over 50 runs. Equation (5) should not be applied to the final median values of ∆x and ∆f to obtain ∆t in a single step; rather, it is applied to the individual values of ∆x and ∆f of each optimization run, and the median of the resulting ∆t values over the 50 runs is then taken.
It should be noted that the exact value of f*max for every function (for a given number of dimensions, D) is not known a priori. For this reason, we perform a Monte Carlo simulation to approximate the f*max value. For every function and every number of dimensions (5, 10, 30, 50), we generate 10,000 sample points in the search space and calculate the corresponding objective function values for all of them. We then take the maximum of these values as f*max and use it in Equation (4).
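The Monte Carlo approximation described above amounts to a few lines of MATLAB. This is a sketch, with the Sphere function standing in for an arbitrary benchmark:

```matlab
% Monte Carlo approximation of f*max: 10,000 uniform samples in the
% box [lb, ub] (the Sphere function is used here as an illustration).
D    = 10;
lb   = -100*ones(1, D);   ub = 100*ones(1, D);
fobj = @(X) sum(X.^2, 2);            % vectorized: one value per row of X

N        = 10000;
X        = lb + rand(N, D).*(ub - lb);   % N-by-D uniform samples
fStarMax = max(fobj(X));                 % approximation of f*max
```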

Obtained Objective Function Values
For all 30 objective functions, the minimum (target) value of the objective function is zero, as shown in Table 2. In our case, we run each algorithm 50 times, for each problem. The total number of optimization runs is therefore 3 × 4 × 30 × 50 = 18,000. Considering that the full convergence history of each individual run is recorded, together with the final optimum, the execution time, and other important parameters, it is obvious that the generated amount of data is massive, and it is not easy to present all these results in a simple, compact, and comprehensive way.
For comparison purposes, we present in the figures: (i) the median values of the final optimum over 50 runs; (ii) the median of the ∆f metric; (iii) the median of the ∆x metric; and (iv) the median of the ∆t metric, for each problem and each optimization algorithm. In case a problem has two global optima (the case of F04, the Quintic function), we take into account the minimum ∆x and ∆t metrics. The results are presented in Figures 1–4. In all four figures, the y-axis is in logarithmic scale for the first subfigure, which has to do with the objective function value, and it has been limited to the value of 10^5 for all cases. For the ∆f, ∆x, and ∆t metrics (2nd, 3rd, and 4th subfigures), the y-axis is in normal scale with automatic min/max values.
More detailed results are presented in table format in Appendix C, where Table A1 shows the results obtained from the three optimizers for the cases D = 5 and D = 10, as averages over 50 runs, and Table A2 shows the corresponding average results for the cases D = 30 and D = 50. As expected, SQP shows the least variance in the results (lowest values of the standard deviation) in most cases, and this is particularly true for higher dimensions. SQP seems to specialize in some specific problems, such as F01, F02, F13, F14, F15, F16, F20, F24, and F28, where it manages to get very close to the optimum solution in comparison to GA and PSO. The values of the ∆f and ∆x metrics provide a good indication of the performance of the algorithms and of which problems are hard to solve. According to the ∆f metric, the functions F05, F06, F08, F10, F18, F19, F21, F24, F26, F29, and F30 are hard to optimize, with F29 and F30 being the hardest.

Convergence History for Each Problem and Each Optimization Algorithm
The convergence histories for each problem and each optimization algorithm for the various numbers of dimensions are presented in the following figures: D = 5 (Figures 5 and 6), D = 10 (Figures 7 and 8), D = 30 (Figures 9 and 10), and D = 50 (Figures 11 and 12), as the median values over 50 runs for each case. The median is the 0.5 quantile of a data set, i.e., the middle number in a sorted list of numbers. The presentation of these results using the median curve is more descriptive than the one using the average curve, as the median is not affected by the existence of any outliers, in contrast with the average. It should be noted that in these convergence history plots, the y-axis (median of objective function values) is in logarithmic scale, while the x-axis (number of iterations) remains in normal scale.
Although the median curve is presented in these figures, there is variation among the 50 independent runs of the algorithms, and it is worth also investigating the spread of these results. For this purpose, at the end of each optimization case (i.e., 50 runs) we calculate the 0.1 quantile, Q0.1, and the 0.9 quantile, Q0.9. The 0.1 quantile is the 10th percentile, i.e., the point below which 10% of the data values lie. Similarly, the 0.9 quantile is the 90th percentile, i.e., the point below which 90% of the data values lie. In our case, with 50 elements (50 runs), these two correspond to the average of the 5th and 6th elements (Q0.1), and the average of the 45th and 46th elements (Q0.9), of the list containing the values of the objective function sorted in ascending order (50 elements in total). Within the range [Q0.1, Q0.9] lie 80% of the values of the objective function (i.e., 40 values in our case).
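The quantile computation described above can be written directly in MATLAB (random data is used here purely as a stand-in for the 50 final objective values):

```matlab
% 0.1 and 0.9 quantiles of 50 final objective values (illustrative data).
v  = rand(50, 1);              % stand-in for 50 final objective values
vs = sort(v);                  % ascending order

Q10 = (vs(5)  + vs(6))/2;      % 0.1 quantile: mean of 5th and 6th values
Q90 = (vs(45) + vs(46))/2;     % 0.9 quantile: mean of 45th and 46th values

% MATLAB's built-in quantile() gives the same values for n = 50:
% quantile(v, 0.1), quantile(v, 0.9)
```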
We see that in some cases this vertical line is long, i.e., there is a large spread of the results above and below the median value, while in other cases the line is barely drawn or not drawn at all, i.e., the spread of the results is small. Again, it should be emphasized that this vertical line is drawn along an axis presented in logarithmic scale; for this reason, its top part (above the median) appears shorter than the bottom part (below the median), even when the two have the same length in absolute values.

Discussion and Conclusions
There are plenty of data to be analyzed from the total number of 18,000 optimization problems that are solved. The number of unique problems is in fact 360, since each problem is solved 50 times, to compute the average, the median, and other statistical quantities and evaluation metrics for every algorithm and every problem. The presented results and the convergence histories show both the relevant difficulty of each optimization problem, in the given range of dimensions, and also a comparison of the performance of the different optimization algorithms in each problem.
Every function has its own unique characteristics, and every optimization algorithm has its own special features, advantages, and drawbacks. Some functions are easily optimized by all algorithms, while others pose a real challenge to some (or even all) of the optimizers. It appears to be impossible to establish a single criterion to determine the complexity of the functions; however, we will try to provide a general overview by identifying some common patterns found in the results. In the following, we will use the labels "low", "middle", and "high" to estimate the function's complexity relative to the challenge that they pose to each optimizer.
The convergence history plots shown in Figures 5-12 provide an overall picture on how easy or difficult the process of finding the minimum value for an optimization algorithm is, for each problem. If the curve shows a steady decrease towards the zero value early in the process, it means that the algorithm is working as intended for the specific problem, and it is a sign of good performance. When the curve is horizontal at a point above zero, it means that the algorithm is trapped in a local minimum, and it cannot move further. Note that the results presented as convergence history plots are median values over 50 runs, for all three algorithms, the GA, PSO, and SQP. The combined ∆ t metric can also give us a good indication of the success of each algorithm in each problem.
The first major pattern found is with functions that appear to be easily solvable by the deterministic SQP approach but much more difficult for the GA or the PSO metaheuristics. These functions are, namely: F01, F02, F04, F11, F12, F13, F14, F15, F16, F20, F24, F27, and F28. Such an observation is not a surprise, as these functions are convex and/or unimodal and pure mathematical methods usually excel in such problems, taking advantage of gradient information. Nevertheless, in most of these cases, it appears that increasing the number of iterations may improve the result for the GA or PSO. These functions are classified as a low level of complexity for the SQP and between the low to middle level for the GA and PSO.
The next distinction is made for problems that show a good convergence history curve (i.e., a steady decrease towards zero) in all the tested dimensions, considering only the metaheuristics, i.e., the GA and PSO algorithms. In other words, these problems are relatively easily solvable by at least one of the tested metaheuristic approaches. SQP is not considered here, to avoid comparing results based on algorithms that serve very different purposes. The identified functions with this characteristic are: F01, F02, F03, F04, F07, F09, F10, F14, F16, F20, F22, F23, F26, and F27. These functions are classified as a low to middle level of complexity for optimizers that are based on metaheuristic approaches.
Functions with a high level of complexity are considered to be the ones where none of the algorithms considered seem to have found a satisfactory solution that lies close to the global minimum in at least one of the tested dimensions. Functions with such properties are: F05, F06, F08, F17, F18, F19, F21, F23, F25, F29, and F30. Some of these functions are difficult only in higher dimensions (i.e., D = 30 or D = 50), while others, such as F05, F17, F29, and F30, are very challenging in all the tested dimensions, even for the simplest D = 5 case. Most of these functions are nonconvex and multimodal and the optimizers get trapped in local minima quite often. For the last two functions, F29 and F30, although the optimum point is clearly visible in the 2D case, as shown in Figures A30 and A31, respectively, it is extremely difficult to locate it in practice using optimization procedures. Due to the presence of numerous local minima and the isolation of the global minimum, these two problems represent difficult "needle in a haystack" optimization cases that are extremely hard to optimize effectively. In both problems F29 and F30, all three optimizers fail to reach objective function values lower than 10,000 in all dimension cases, even for the simplest D = 5 case. Based on some additional tests that were performed, it appears that the task is very challenging even when only two dimensions are considered (case D = 2).
The three optimizers, GA, PSO, and SQP, in their MATLAB implementations, require different computational times to arrive at the optimum solutions. In general, PSO was found to be the fastest algorithm in all the examined problems. In most cases, SQP was the slowest algorithm, requiring more time than the GA, especially when low-dimensional spaces were examined (D = 5 or D = 10 cases). The time needed for each algorithm and each problem is recorded by the program, and the relevant results are available in the GitHub repository hosting the source code of the project.

User Notes
A dedicated GitHub repository, freely available at https://github.com/vplevris/Collection30Functions (accessed on 24 February 2022), has been created for this project, where the interested reader can download the code, run it, and reproduce all the results and data of the paper, including tables, figures, etc. A detailed instruction file, in Word format, on how to run the different modules of the code is also provided.

Funding:
The APC was funded by Oslo Metropolitan University.

Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.

Data Availability Statement:
The full source code of the study, including the 30 functions, the 3 optimizers, and the full methodology, is provided. The code is written in MATLAB. Any reader can download the source code, run it in MATLAB, and reproduce the results of the study. It is available online at https://github.com/vplevris/Collection30Functions (accessed on 24 February 2022).

Conflicts of Interest:
The authors declare no conflict of interest.

Appendix A Detailed Description of the 30 Functions
The properties, mathematical formulation, suggested search space, and the location of the global minimum are given in detail for each function in this appendix. In addition, the 30 functions are plotted for the simple two-dimensional case (D = 2), to provide a visual idea of their shapes and complexities. For each function, there are two plots. The one on the right (b) provides a general overview, as the plotting area is set to the suggested search range according to Table 2. The plot on the left (a) is a closer look (a zoom-in) at the search area by a factor of ×10; in other words, its plot range is limited to 1/10 of the suggested search range.

Sphere function (sphere_func)
The Sphere function [32], also known as De Jong's function [33], is one of the simplest optimization test functions, probably the simplest, easiest, and most commonly used continuous domain search problem. It is continuous, convex, unimodal, differentiable, separable, highly symmetric, and rotationally invariant. The suggested search area is the hypercube [−100, 100]^D. The global minimum is f01(x*) = 0 at x* = {0, 0, …, 0}. The general formulation of the function is:

f01(x) = Σ_{i=1}^{D} x_i²

Figure A1 depicts the function in the 2D case (D = 2). In this case, the function is simplified as:

f01(x) = x1² + x2²

Figure A1. F01-Sphere function in two dimensions.
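As a minimal illustration of how such a benchmark function can be coded (the paper's implementations are in MATLAB; the Python sketch below, with the illustrative name `sphere`, is an equivalent formulation):

```python
def sphere(x):
    """F01 - Sphere function: the sum of squared coordinates, for any dimension D."""
    return sum(xi ** 2 for xi in x)

# The global minimum is 0 at the origin, in any number of dimensions.
print(sphere([0.0, 0.0]))   # 0.0
print(sphere([3.0, 4.0]))   # 25.0
print(sphere([1.0] * 50))   # 50.0
```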

Ellipsoid function (ellipsoid_func)
The Ellipsoid function [32], also called the Hyper-ellipsoid or Axis Parallel Hyper-ellipsoid function, is similar to the Sphere function and is also known as the Weighted Sphere function [33]. It is continuous, convex, differentiable, separable, and unimodal. The suggested search area is the hypercube [−100, 100]^D. The global minimum is f02(x*) = 0 at x* = {0, 0, …, 0}. The general formulation of the function is:

f02(x) = Σ_{i=1}^{D} i·x_i²

Figure A2 depicts the function in the 2D case (D = 2). In this case, the formula is:

f02(x) = x1² + 2·x2²

Sum of Different Powers function (sumpow_func)
The Sum of Different Powers function [33] is a commonly used unimodal test function. The suggested search area is the hypercube [−10, 10]^D. The global minimum is f03(x*) = 0 at x* = {0, 0, …, 0}. The general formulation of the function is:

f03(x) = Σ_{i=1}^{D} |x_i|^(i+1)

Figure A3 depicts the function in the 2D case (D = 2). In this case, the formula is:

f03(x) = x1² + |x2|³
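Under the textbook formulations of these two functions (the weighted sphere Σ i·x_i² and the sum Σ |x_i|^(i+1)), both can be sketched in Python as follows (function names are illustrative; the paper's code is MATLAB):

```python
def ellipsoid(x):
    """F02 - Axis-parallel hyper-ellipsoid (weighted sphere): sum of i * x_i^2, 1-based i."""
    return sum(i * xi ** 2 for i, xi in enumerate(x, start=1))

def sum_of_powers(x):
    """F03 - Sum of Different Powers: sum of |x_i|^(i + 1), 1-based i."""
    return sum(abs(xi) ** (i + 1) for i, xi in enumerate(x, start=1))

# Both have their global minimum 0 at the origin.
print(ellipsoid([1.0, 1.0]))       # 3.0  (= 1*1 + 2*1)
print(sum_of_powers([2.0, 2.0]))   # 12.0 (= 2^2 + 2^3)
```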


Figure A4. F04-Quintic function in two dimensions.

Drop-Wave function (drop_wave_func)
The Drop-Wave function is a multimodal function with high complexity. The suggested search area is the hypercube [−5.12, 5.12]^D. The global minimum is f05(x*) = 0 at x* = {0, 0, …, 0}. The general formulation of the function is: Figure A5 depicts the function in the 2D case (D = 2). In this case, the formula is:

Weierstrass function (weierstrass_func)
The Weierstrass function [32] is multimodal; it is continuous everywhere but differentiable only on a set of points. It is a computationally expensive function. The suggested search area is the hypercube [−0.5, 0.5]^D. In this search area, the global minimum is unique: f06(x*) = 0 at x* = {0, 0, …, 0}. Note that if a larger search area is considered, there might be multiple global optima, as the function is periodic. For this reason, it is strongly suggested to use the aforementioned search area of [−0.5, 0.5]^D. The general formulation of the function is: Figure A6 depicts the function in the 2D case (D = 2). In this case, the formula is: Figure A6. F06-Weierstrass function in two dimensions.
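A Python sketch of the Weierstrass function, assuming the commonly used parameter values a = 0.5, b = 3, and k_max = 20 (the paper's exact constants may differ; the function name is illustrative):

```python
import math

def weierstrass(x, a=0.5, b=3.0, k_max=20):
    """Weierstrass function: a sum of cosines of rapidly increasing frequency,
    shifted by a constant so that f(0, ..., 0) = 0."""
    d = len(x)
    ks = range(k_max + 1)
    total = sum(a ** k * math.cos(2.0 * math.pi * b ** k * (xi + 0.5))
                for xi in x for k in ks)
    # Constant term placing the global minimum value at exactly 0.
    total -= d * sum(a ** k * math.cos(math.pi * b ** k) for k in ks)
    return total
```

Because each coordinate contributes k_max + 1 cosine terms, an evaluation is noticeably more expensive than for the other benchmarks, which matches the remark above about computational cost.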

Alpine 1 function (alpine1_func)
The Alpine 1 function is a non-convex, multimodal, differentiable function. The suggested search area is the hypercube [−10, 10]^D. The global minimum is f07(x*) = 0 at x* = {0, 0, …, 0}. The general formulation of the function is:

f07(x) = Σ_{i=1}^{D} |x_i·sin(x_i) + 0.1·x_i|

Figure A7 depicts the function in the 2D case (D = 2). In this case, the formula is:

f07(x) = |x1·sin(x1) + 0.1·x1| + |x2·sin(x2) + 0.1·x2|

Griewank's function (griewank_func)
Griewank's function [32,33] is a multimodal function with many regularly distributed local minima. The suggested search area is the hypercube [−100, 100]^D. The global minimum is f09(x*) = 0 at x* = {0, 0, …, 0}. The general formulation of the function is:

f09(x) = 1 + Σ_{i=1}^{D} x_i²/4000 − Π_{i=1}^{D} cos(x_i/√i)

Figure A9 depicts the function in the 2D case (D = 2). In this case, the formula is:

f09(x) = 1 + (x1² + x2²)/4000 − cos(x1)·cos(x2/√2)
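Griewank's function combines a quadratic bowl with an oscillatory product term; a Python sketch (illustrative naming; the paper's code is MATLAB):

```python
import math

def griewank(x):
    """F09 - Griewank: sum(x_i^2)/4000 - prod(cos(x_i / sqrt(i))) + 1, 1-based i."""
    quadratic = sum(xi ** 2 for xi in x) / 4000.0
    oscillation = 1.0
    for i, xi in enumerate(x, start=1):
        oscillation *= math.cos(xi / math.sqrt(i))
    return quadratic - oscillation + 1.0

# Global minimum 0 at the origin; positive elsewhere.
print(griewank([0.0, 0.0, 0.0]))   # 0.0
```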

HappyCat function (happycat_func)
The HappyCat function [32] is a multimodal function. The suggested search area is the hypercube [−15, 15]^D. The global minimum is f11(x*) = 0 at x* = {−1, −1, …, −1}. The general formulation of the function is:

f11(x) = |Σ_{i=1}^{D} x_i² − D|^(1/4) + (0.5·Σ_{i=1}^{D} x_i² + Σ_{i=1}^{D} x_i)/D + 0.5

Figure A11 depicts the function in the 2D case (D = 2).

HGBat function (hgbat_func)
The HGBat function [32] is similar to the HappyCat function but even more complex. It is a multimodal function. The suggested search area is the hypercube [−15, 15]^D. The global minimum is f12(x*) = 0 at x* = {−1, −1, …, −1}. The general formulation of the function is:

f12(x) = |(Σ_{i=1}^{D} x_i²)² − (Σ_{i=1}^{D} x_i)²|^(1/2) + (0.5·Σ_{i=1}^{D} x_i² + Σ_{i=1}^{D} x_i)/D + 0.5

Figure A12 depicts the function in the 2D case (D = 2).


Rosenbrock's function (rosenbrock_func)
Rosenbrock's function [33] is a classic optimization problem, also known as Rosenbrock's valley or the Banana function. The global optimum lies inside a long, narrow, parabolic-shaped flat valley. Finding the valley is trivial, but convergence to the global optimum is difficult. The suggested search area is the hypercube [−10, 10]^D. The global minimum is f13(x*) = 0 at x* = {1, 1, …, 1}. The general formulation of the function is:

f13(x) = Σ_{i=1}^{D−1} [100·(x_{i+1} − x_i²)² + (x_i − 1)²]

Figure A13 depicts the function in the 2D case (D = 2). In this case, the formula is:

f13(x) = 100·(x2 − x1²)² + (x1 − 1)²
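The valley structure described above is easy to see in code; a Python sketch of the standard formulation (names illustrative; the paper's code is MATLAB):

```python
def rosenbrock(x):
    """F13 - Rosenbrock: sum of 100*(x_{i+1} - x_i^2)^2 + (x_i - 1)^2."""
    return sum(100.0 * (x[i + 1] - x[i] ** 2) ** 2 + (x[i] - 1.0) ** 2
               for i in range(len(x) - 1))

# The minimum is 0 at (1, ..., 1); points on the parabola x2 = x1^2 lie in the
# flat valley, where only the small (x_i - 1)^2 term remains.
print(rosenbrock([1.0, 1.0, 1.0]))   # 0.0
print(rosenbrock([0.5, 0.25]))       # 0.25
```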

High Conditioned Elliptic function (ellipt_func)
The High Conditioned Elliptic function [32] is a unimodal, globally quadratic, and ill-conditioned function with smooth local irregularities. The suggested search area is the hypercube [−100, 100]^D. The global minimum is f14(x*) = 0 at x* = {0, 0, …, 0}. The general formulation of the function is:

f14(x) = Σ_{i=1}^{D} (10⁶)^((i−1)/(D−1))·x_i²

Figure A14 depicts the function in the 2D case (D = 2). In this case, the formula is:

f14(x) = x1² + 10⁶·x2²

Discus function (discus_func)
The Discus function is a globally quadratic, unimodal function with smooth local irregularities, where a single direction in the search space is roughly a million times more sensitive than all others (the conditioning is about 10⁶). The suggested search area is the hypercube [−100, 100]^D. The global minimum is f15(x*) = 0 at x* = {0, 0, …, 0}. The general formulation of the function is:

f15(x) = 10⁶·x1² + Σ_{i=2}^{D} x_i²

Figure A15 depicts the function in the 2D case (D = 2). In this case, the formula is:

f15(x) = 10⁶·x1² + x2²

Bent Cigar function (bent_cigar_func)
The Bent Cigar function is unimodal and nonseparable, with the optimum located in a smooth but very narrow valley. The suggested search area is the hypercube [−100, 100]^D. The global minimum is f16(x*) = 0 at x* = {0, 0, …, 0}. The general formulation of the function is:

f16(x) = x1² + 10⁶·Σ_{i=2}^{D} x_i²

Figure A16 depicts the function in the 2D case (D = 2). In this case, the formula is:

f16(x) = x1² + 10⁶·x2²

We note that in the 2D case, functions f14, f15, and f16 give essentially the same optimization problem, but for D > 2 this is not the case.
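The remark that f14, f15, and f16 coincide in 2D can be checked directly. Assuming the standard formulations (Σ (10⁶)^((i−1)/(D−1))·x_i² for the Elliptic, 10⁶·x1² + Σ x_i² for the Discus, and x1² + 10⁶·Σ x_i² for the Bent Cigar), a Python sketch:

```python
def elliptic(x):
    """F14 - High Conditioned Elliptic (standard formulation, D >= 2)."""
    d = len(x)
    return sum((10 ** 6) ** (i / (d - 1)) * xi ** 2 for i, xi in enumerate(x))

def discus(x):
    """F15 - Discus: the first direction is a million times more sensitive."""
    return 10 ** 6 * x[0] ** 2 + sum(xi ** 2 for xi in x[1:])

def bent_cigar(x):
    """F16 - Bent Cigar: all directions but the first are heavily penalized."""
    return x[0] ** 2 + 10 ** 6 * sum(xi ** 2 for xi in x[1:])

# In 2D the three problems coincide, up to swapping the two coordinates:
a, b = 3.0, 7.0
print(elliptic([a, b]) == bent_cigar([a, b]))   # True
print(discus([a, b]) == bent_cigar([b, a]))     # True
```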

Perm D, Beta function (permdb_func)
The Perm D, Beta function is a unimodal function. The suggested search area is the hypercube [−D, D]^D; this choice ensures that the global minimum, f17(x*) = 0 at x* = {1, 2, …, D}, always lies inside the search area. In the present study, since D = 50 is the maximum number of dimensions considered, and to keep things consistent, we use the search range [−50, 50]^D for all the cases considered (for all dimensions).
The general formulation of the function is:

f17(x) = Σ_{i=1}^{D} [Σ_{j=1}^{D} (j^i + β)·((x_j/j)^i − 1)]²

Figure A17 depicts the function in the 2D case (D = 2) for the search range considered, [−50, 50]², and the zoomed case [−5, 5]². In this case, the formula is: By setting β = 0.5 in the 2D case, we obtain: For illustration purposes, Figure A18

Expanded Schaffer's F6 function (expschafferf6_func)
The Expanded Schaffer's F6 function is a multidimensional function based on Schaffer's F6 function [34]. It is multimodal and nonseparable. The global minimum is f18(x*) = 0 at x* = {0, 0, …, 0}.

Rotated Hyper-ellipsoid function (rothellipsoid_func)
The Rotated Hyper-ellipsoid function is similar to the Ellipsoid function. It is continuous, convex, and unimodal. The suggested search area is the hypercube [−100, 100]^D. The global minimum is f20(x*) = 0 at x* = {0, 0, …, 0}. The general formulation of the function is:

f20(x) = Σ_{i=1}^{D} (Σ_{j=1}^{i} x_j)²

Figure A21 depicts the function in the 2D case (D = 2). In this case, the formula is:

f20(x) = x1² + (x1 + x2)²
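The nested double sum can be evaluated in linear time by accumulating a running prefix sum; a Python sketch (illustrative naming; the paper's code is MATLAB):

```python
def rotated_hyper_ellipsoid(x):
    """F20 - Rotated Hyper-ellipsoid: sum over i of (x_1 + ... + x_i)^2,
    computed with a running prefix sum instead of a nested loop."""
    total = 0.0
    prefix = 0.0
    for xi in x:
        prefix += xi
        total += prefix ** 2
    return total

print(rotated_hyper_ellipsoid([1.0, 1.0]))   # 5.0  (= 1^2 + (1 + 1)^2)
```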

Schwefel function (schwefel_func)
The Schwefel function [33,34] is quite complex, with multiple local minima. The suggested search area is the hypercube [−500, 500]^D. The global minimum is f21(x*) = 0 at x* = {c, c, …, c}, where c = 420.968746359982025. The general formulation of the function is:

f21(x) = 418.9828872724337·D − Σ_{i=1}^{D} x_i·sin(√|x_i|)

In the literature, the function is also found with the constant value 418.9829·D, where the optimum location is reported as c = 420.9687 [34]. This formulation is not very precise. For details on this and a relevant detailed investigation of the function, please see Appendix B. Figure A22 depicts the function in the 2D case (D = 2).
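A Python sketch of the Schwefel function using the high-precision constants discussed in Appendix B (names are illustrative; the paper's code is MATLAB):

```python
import math

# High-precision constants from the Appendix B investigation.
C_VAL = 418.9828872724337      # per-dimension offset
C_LOC = 420.968746359982025    # coordinate of the optimum

def schwefel(x):
    """F21 - Schwefel: C_VAL * D - sum of x_i * sin(sqrt(|x_i|))."""
    return C_VAL * len(x) - sum(xi * math.sin(math.sqrt(abs(xi))) for xi in x)

# At x* = {c, c, ..., c} the function value is zero to within machine precision.
print(abs(schwefel([C_LOC] * 10)))   # a value extremely close to 0
```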

Sum of Different Powers 2 function (sumpow2_func)
The Sum of Different Powers 2 function [32] is similar to the Sum of Different Powers function, but its formulation is slightly different. It is unimodal and nonseparable, with different sensitivities for the various design variables. The suggested search area is again the hypercube [−10, 10]^D. The global minimum is f22(x*) = 0 at x* = {0, 0, …, 0}. The general formulation of the function is:

f22(x) = (Σ_{i=1}^{D} |x_i|^(2 + 4(i−1)/(D−1)))^(1/2)

Figure A23 depicts the function in the 2D case (D = 2). In this case, the formula is:

f22(x) = (x1² + |x2|⁶)^(1/2)

Xin-She Yang's 1 function (xinsheyang1_func)
Xin-She Yang's 1 function [33] is nonconvex and nonseparable. The function is not smooth, and its derivatives are not well defined at the optimum. The suggested search area is the hypercube [−2π, 2π]^D. The global minimum is f23(x*) = 0 at x* = {0, 0, …, 0}. The general formulation of the function is:

f23(x) = (Σ_{i=1}^{D} |x_i|)·exp(−Σ_{i=1}^{D} sin(x_i²))

Figure A24 depicts the function in the 2D case (D = 2). In this case, the formula is:

f23(x) = (|x1| + |x2|)·exp(−sin(x1²) − sin(x2²))

Figure A24. F23-Xin-She Yang's 1 function in two dimensions.

Schwefel 2.22 function (schwefel222_func)
The Schwefel 2.22 function combines a sum and a product of absolute values. It is unimodal, with the global minimum f(x*) = 0 at x* = {0, 0, …, 0}. Its commonly used formulation is:

f(x) = Σ_{i=1}^{D} |x_i| + Π_{i=1}^{D} |x_i|

Figure A26 depicts the function in the 2D case (D = 2). In this case, the formula is:

f(x) = |x1| + |x2| + |x1|·|x2|

Salomon function (salomon_func)
The Salomon function is nonconvex, continuous, multimodal, and nonseparable. The suggested search area is the hypercube [−20, 20]^D. The global minimum is f26(x*) = 0 at x* = {0, 0, …, 0}. The general formulation of the function is:

f26(x) = 1 − cos(2π·√(Σ_{i=1}^{D} x_i²)) + 0.1·√(Σ_{i=1}^{D} x_i²)

Figure A27 depicts the function in the 2D case (D = 2). In this case, the formula is:

f26(x) = 1 − cos(2π·√(x1² + x2²)) + 0.1·√(x1² + x2²)

Modified Ridge function (modridge_func)
The original Ridge function has the form:

f(x) = x1 + d·(Σ_{i=2}^{D} x_i²)^a

In this formula, d and a are constants, usually set to d = 1 and a = 0.1. Other values (d = 2, a = 0.5, etc.) can also be found in the literature. The Modified Ridge function proposed in this study has the form: The suggested search area is the hypercube [−100, 100]^D. The global minimum is f27(x*) = 0 at x* = {0, 0, …, 0}. Figure A28 depicts the function in the 2D case (D = 2). In this case, the formula is: Figure A28. F27-Ridge function in two dimensions.


Modified Xin-She Yang's 3 function (modxinsyang3_func)
The original Xin-She Yang's 3 function is the third function proposed in the excellent work by Xin-She Yang [33]. The Modified Xin-She Yang's 3 function proposed in this study is based on that, with some modifications. It is a standing-wave function with a defect, which is nonconvex and nonseparable, with multiple local minima, and a unique global minimum. The suggested search area is the hypercube [−20, 20] D . The global minimum is f 29 (x*) = 0 at x* = {0, 0, . . . , 0}. The general formulation of the function is: Figure A30 depicts the function in the 2D case (D = 2). The function is simplified as:


Modified Xin-She Yang's 5 function (modxinsyang5_func)
The original Xin-She Yang's 5 function is the fifth function proposed in the work by Xin-She Yang [33]. The Modified Xin-She Yang's 5 function proposed in this study is based on that, with some minor modifications. The suggested search area is the hypercube [−100, 100]^D. The global minimum is f30(x*) = 0 at x* = {0, 0, …, 0}. The general formulation of the function is: The function has multiple local minima, but the global minimum is unique. Figure A31 depicts the function in the 2D case (D = 2), where its landscape looks like a wonderful candlestick [33]. The function is simplified as:

Appendix B Investigation of the Schwefel Function (F21)
In the literature [33,34], the Schwefel function (F21 in this study) is usually found with the value 418.9829·D in its formula, and the optimum location is reported as c = 420.9687. This formulation is not very precise, compared to the formulation described in this study. Indeed, to find the correct values, one can take the one-dimensional case (D = 1) and find the minimum of the function in the search area [−500, 500]. A plot of the function of Equation (A63) is presented in Figure A32.

Figure A32. Plot of the function y = −x·sin(√|x|) for −500 < x < 500.

As shown in the figure, the minimum obviously lies within the range [400, 500] for x. To find the exact location of the minimum, we can omit the absolute value (since x > 0 in this range) and find the value of x ∈ [400, 500] that makes the derivative of the function equal to zero. Using MATLAB and the function "vpasolve", we can numerically find the root of y'(x) for x ∈ [400, 500] as follows (code in MATLAB):

syms x
y = -x*sin(sqrt(x));
yd = diff(y, x);
s = vpasolve(yd == 0, x, [400 500])

Then we obtain the result:

s = 420.96874635998202731184436501869

We substitute the above value s into the function (as x) to find the minimum value of f, and we obtain:

fmin(x) = −418.9828872724337062747864351956

Practically, there is no need to take into account so many decimal places. In the formulation proposed in this study, the formula for f21 is the following:

f21(x) = 418.9828872724337·D − Σ_{i=1}^{D} x_i·sin(√|x_i|)

For the above function, the global minimum is f21(x*) = 0 at x* = {c, c, …, c}, where c = 420.968746359982025.
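The Appendix B root can also be reproduced without MATLAB's symbolic toolbox. A stdlib-only Python sketch that bisects the hand-derived derivative y'(x) = −sin(√x) − (√x/2)·cos(√x) on [400, 500]:

```python
import math

def y_prime(x):
    """Derivative of y(x) = -x * sin(sqrt(x)), valid for x > 0."""
    r = math.sqrt(x)
    return -math.sin(r) - 0.5 * r * math.cos(r)

# Bisection: y_prime changes sign exactly once on [400, 500].
lo, hi = 400.0, 500.0
for _ in range(200):
    mid = 0.5 * (lo + hi)
    if y_prime(lo) * y_prime(mid) <= 0.0:
        hi = mid
    else:
        lo = mid

c = 0.5 * (lo + hi)
f_min = -c * math.sin(math.sqrt(c))
print(c)       # ~420.968746359982
print(f_min)   # ~-418.982887272434
```

The recovered values agree with the vpasolve result above to machine precision, confirming the high-precision constants used in the proposed formulation.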