Random Orthogonal Search with Triangular and Quadratic Distributions (TROS and QROS) Parameterless Algorithms for Global Optimization

In this paper, the behavior and performance of Pure Random Orthogonal Search (PROS), a parameter-free evolutionary algorithm (EA) that outperforms many existing EAs on well-known benchmark functions under a finite-time budget, are analyzed. Sufficient conditions for convergence to the global optimum are also determined. In addition, we propose two modifications to PROS, namely Triangular-Distributed Random Orthogonal Search (TROS) and Quadratic-Distributed Random Orthogonal Search (QROS). With our local search mechanism, both modified algorithms significantly improve the convergence rates and the errors of the obtained solutions on the benchmark functions while preserving the advantages of PROS: no parameters to tune, excellent computational efficiency, ease of application to all kinds of problems, and high performance under a finite-time search budget. The experimental results show that both TROS and QROS are competitive with several classic metaheuristic optimization algorithms.


Introduction
Black-box optimization refers to optimizing an objective function in the absence of prior knowledge of the function. The only knowledge of the objective function is the observed outputs for the given inputs. Metaheuristics such as evolutionary computation (EC) techniques and evolutionary algorithms (EAs) are widely employed for black-box optimization in various fields across science and engineering [1][2][3] due to certain primary advantages over traditional techniques (such as Newton's method and gradient descent): they are highly robust in solving complex real-world problems, and domain knowledge and regularity properties (e.g., convexity, continuity, differentiability) of the functions to be optimized are generally not required [4].
However, these algorithms often require a large number of function evaluations to evolve or update the candidate solutions in order to reduce errors to a satisfactory level [5]. For many real-world problems, evaluating the objective function is costly. These function evaluations may involve conducting computationally intensive numerical simulations or expensive physical experiments [6][7][8][9]. When only a limited number of objective function evaluations is affordable, EC's trial-and-error approach becomes unattractive.
Moreover, EC techniques are commonly recognized as heuristic search algorithms because their theoretical analysis often lags behind the development of the algorithms [4,10]. Their performance is often sensitive to problem-dependent hyper-parameters [11]. Various approaches have been suggested to overcome these drawbacks, for example, building an inexpensive surrogate model for evaluation so as to reduce the number of objective function evaluations [5,12], and automatic or self-adaptive tuning of the hyper-parameters [11,13,14].
In this paper, we propose two algorithms for expensive black-box optimization. The two algorithms are modified from Pure Random Orthogonal Search (PROS) [15]. PROS is a (1 + 1) Evolution Strategy ((1 + 1) ES), a kind of EA that has only one parent in the population and generates one child in each generation that competes with its parent. The finite-time performance of PROS is promising, outperforming several existing EC techniques on most benchmark functions [15]. Unlike those well-known EAs, PROS is free of parameters and thus parameter tuning is not required. To the best of our knowledge, no mathematical analyses of the behavior and performance of the algorithm have been reported in the literature. In this paper, we aim to fill this gap. The major contributions of this research work are summarized as follows:

1. The behavior and performance analysis of PROS, which is not available in [15], is provided here.

2. Two effective novel (1 + 1) ES based on PROS, namely TROS and QROS, are proposed, and they outperform PROS on a set of benchmark problems.

3. The performance of TROS and QROS is found to be competitive with three well-known optimization algorithms (GA, PSO and DE) on a set of benchmark problems.

Problem Formulation
In this paper, we restrict attention to the global optimization of expensive black-box functions. The goal of global optimization is to find x*, a global minimum of f, where f : R^D → R is a scalar-valued objective function defined on the decision space Ω ⊆ R^D and x = (x_1, x_2, ..., x_D) represents a vector of decision variables with D dimensions.
We have made the following assumptions. Assumption 2. f can be evaluated at all points of Ω in an arbitrary order.
Assumption 3. The slope of f is bounded with a Lipschitz constant L such that |f(x) − f(y)| ≤ L‖x − y‖ for all x, y ∈ Ω, where L > 0 and ‖·‖ denotes the Euclidean norm.
It has to be noted that although f is assumed to be a Lipschitz function, the corresponding Lipschitz constant L is unknown to an optimization algorithm.

Analysis of the Pure Random Orthogonal Search (PROS) Algorithm
The PROS algorithm was originally proposed by Plevris et al. in [15]. We describe it with our revised notation as follows: Initially, a candidate solution vector x^(0) is generated randomly from Ω (lines 1 and 2). Then, for each iteration t, j is chosen randomly from 1 to D (line 4) and a real number r is drawn randomly with uniform distribution within the search space of the j-th decision variable (line 5). A new candidate solution vector y is obtained by replacing the value of the j-th decision variable of the current best solution vector x^(t) with r (lines 6 and 7). The new candidate solution vector y is evaluated, and the new best solution vector x^(t+1) is then updated based on the comparison between f(y) and f(x^(t)). If f(y) is smaller, y is accepted as the new best solution vector. Otherwise, y is rejected (lines 8 to 12). The iteration counter t is updated and the search process is repeated until a termination criterion is met (lines 13 and 14). Finally, the best solution vector found by the algorithm is returned as the final result (line 15).
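The loop described above is short enough to state in full. The following Python sketch follows the line numbering used in this description; the function and variable names are our own, and the sphere function in the usage note below is only an illustrative test problem.

```python
import random

def pros(f, bounds, iterations, seed=None):
    """Pure Random Orthogonal Search, following the description above.

    f      -- objective function taking a list of floats
    bounds -- list of (lower, upper) pairs, one per decision variable
    """
    rng = random.Random(seed)
    # Lines 1-2: initialize a random candidate solution in the search space.
    x = [rng.uniform(a, b) for a, b in bounds]
    fx = f(x)
    for _ in range(iterations):
        # Line 4: pick one decision variable uniformly at random.
        j = rng.randrange(len(bounds))
        # Line 5: draw a uniform replacement value for that variable.
        r = rng.uniform(*bounds[j])
        # Lines 6-7: build the candidate y by replacing coordinate j.
        y = x[:]
        y[j] = r
        fy = f(y)
        # Lines 8-12: greedy (1+1) selection -- keep y only if it improves.
        if fy < fx:
            x, fx = y, fy
    return x, fx
```

For example, `pros(lambda x: sum(v * v for v in x), [(-5.0, 5.0)] * 2, 5000, seed=0)` drives the error of a 2-D sphere function close to zero, since PROS keeps a candidate only when it strictly improves the objective.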
In the following analysis, we assume the algorithm runs continuously until the global optimum is found (defined later in this section), because we are interested in analyzing its ultimate performance. In practice, it may take an infinitely long time to reach the global optimum. Therefore, other termination criteria are usually adopted; for example, the algorithm stops when no improvement is made after a certain number of iterations or when a pre-defined maximum number of iterations is reached.
The error of the algorithm after running for t iterations is given by e(t) = f(x^(t)) − f*. Here, x^(t) is the solution vector found by the algorithm in the t-th iteration and is also the best solution vector found by the algorithm after t iterations. This is because the values of the solution vectors found by PROS are monotonically non-increasing. That is, f(x^(0)) ≥ f(x^(1)) ≥ f(x^(2)) ≥ ⋯. Consider the pseudo code from line 8 to line 12 of Algorithm 1. In the t-th iteration, when a better solution y is found, it will be assigned to x^(t+1); in this case, f(x^(t+1)) < f(x^(t)). When no better solution is found in the t-th iteration, x^(t) will be assigned to x^(t+1); in this case, f(x^(t+1)) = f(x^(t)). Combining the two conditions, f(x^(t+1)) ≤ f(x^(t)), and the inequalities then follow.

Algorithm 1: Pure Random Orthogonal Search (PROS)
input: nil
output: the best solution vector x^(t) found by the algorithm
1: t ← 0
2: Initialize x^(0) randomly from Ω
3: repeat
4:   Choose j uniformly at random from {1, 2, ..., D}
5:   Draw r uniformly at random from [a_j, b_j]
6:   y ← x^(t)
7:   y_j ← r
8:   if f(y) < f(x^(t)) then
9:     x^(t+1) ← y
10:  else
11:    x^(t+1) ← x^(t)
12:  end if
13:  t ← t + 1
14: until a termination criterion is met
15: return x^(t)

Definition 1 (Region of Global Optimum). It is said that a candidate solution vector x is in the region of global optimum R_ε if the error of x is not larger than ε. That is, f(x) − f* ≤ ε, where ε is a small positive real number denoting the tolerance of error.

Definition 2 (Convergence Time).
The convergence time T of an algorithm is defined to be the first time t when x^(t) ∈ R_ε.

One-Dimensional Functions
In this section, we study the performance of PROS on one-dimensional functions defined on the domain Ω = [a, b]. Since we are interested in the case where ε is small, we assume that 2ε/L < b − a.
Lemma 1. For any ε > 0, the region of global optimum contains an interval of length at least ε/L for all one-dimensional functions.
Proof. Let x* = (x*_1) be the global optimum. By Assumption 3, f(x) − f* ≤ L|x_1 − x*_1| for any x = (x_1), so every point with |x_1 − x*_1| ≤ ε/L belongs to R_ε. Consider the interval to the left of x*_1, [max(a, x*_1 − ε/L), x*_1], and the interval to the right of x*_1, [x*_1, min(b, x*_1 + ε/L)]; both lie in R_ε. Since we have assumed 2ε/L < b − a, it is impossible that x*_1 − ε/L < a and x*_1 + ε/L > b hold simultaneously. Therefore, at least one of the two intervals has length ε/L. Hence, the region of global optimum contains an interval of length at least ε/L.
Theorem 1. The expected convergence time of PROS for one-dimensional functions is bounded above by L(b − a)/ε.
Proof. Let l be the length of the region of global optimum R_ε. The probability that a point chosen uniformly at random from [a, b] falls in R_ε is p = l/(b − a) ≥ ε/(L(b − a)). The convergence time T is geometrically distributed with parameter p, so its expected value is E[T] = 1/p ≤ L(b − a)/ε.
Corollary 1. PROS converges in probability to the region of global optimum for all one-dimensional functions, i.e., lim_{t→∞} P{x^(t) ∈ R_ε} = 1.
Proof. For any realization {x^(t) : t = 0, 1, 2, ...}, f(x^(t)) is monotonically non-increasing in t. If x^(t) ∈ R_ε, then x^(t+τ) ∈ R_ε for every non-negative integer τ. Therefore, P{x^(t) ∈ R_ε} is monotonically non-decreasing in t. Since the sequence is bounded above by 1, it is convergent. Given any ε > 0, the expected convergence time is bounded according to Theorem 1. If lim_{t→∞} P{x^(t) ∈ R_ε} were less than 1, the expected convergence time would be unbounded, which leads to a contradiction.
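The geometric-distribution argument in the proof of Theorem 1 can be checked numerically. The sketch below is an illustration, not part of the original analysis: it uses f(x) = |x| on [a, b] = [−1, 1] (Lipschitz with L = 1), for which R_ε = [−ε, ε], p = ε, the exact expected convergence time is 1/ε = 10 for ε = 0.1, and Theorem 1's bound is L(b − a)/ε = 20.

```python
import random

def hitting_time(eps, rng):
    """Count uniform draws from [-1, 1] until one lands in the
    region of global optimum [-eps, eps] of f(x) = |x|."""
    t = 0
    while True:
        t += 1
        if abs(rng.uniform(-1.0, 1.0)) <= eps:
            return t

eps = 0.1
rng = random.Random(1)
# Empirical mean convergence time over 2000 independent runs;
# expected to be close to 1/eps = 10, comfortably below the bound 2/eps = 20.
mean_T = sum(hitting_time(eps, rng) for _ in range(2000)) / 2000
```

Since each successful draw is independent with probability p = ε, the hitting times are exactly geometric, which is what makes the 1/p bound tight up to the length of the region.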

Multi-Dimensional Functions
In this section, we study the performance of PROS on D-dimensional functions with domain Ω = [a_1, b_1] × [a_2, b_2] × ⋯ × [a_D, b_D]. In particular, we focus on the class of totally separable functions as defined below. For notational simplicity, define x_{−i} ≜ (x_1, x_2, ..., x_{i−1}, x_{i+1}, ..., x_D).
Definition 3 (Partially Separable Function). A function f(x) is partially separable with coordinate i if arg min_{x_i} f(x) is the same for all values of x_{−i}.
Definition 4 (Totally Separable Function). A function f(x) is totally separable if it is partially separable with every one of the D coordinates.
In each iteration, PROS minimizes f in one randomly chosen coordinate i. As f is totally separable, each coordinate i can be minimized independently. The following result shows that PROS converges to the region of global optimum in probability.
Theorem 2. PROS converges in probability to the region of global optimum for all totally separable functions, i.e., lim_{t→∞} P{x^(t) ∈ R_ε} = 1.
Proof. Since f is Lipschitz, given any x_{−i}, we have |f(x_{−i}, x_i) − f(x_{−i}, x'_i)| ≤ L|x_i − x'_i|. Therefore, f can be regarded as a one-dimensional Lipschitz function in x_i with the same constant L. As shown in the previous subsection, for any ε_i > 0 there is an interval V_i of length at least ε_i/L such that x_i ∈ V_i implies f(x_{−i}, x_i) is within ε_i of its coordinate-wise minimum. Since f is Lipschitz, it is a continuous function. Given any ε > 0, there exist sufficiently small positive ε_i's such that when x^(t)_i ∈ V_i for all i, we must have x^(t) ∈ R_ε. The statement then follows from Corollary 1 applied to each coordinate. The expected convergence time can be obtained for the following subclass of totally separable functions.

Definition 5 (Additively Separable Function). A function f(x) is additively separable if it can be written as a sum of D one-dimensional functions, i.e., f(x) = Σ_{i=1}^{D} f_i(x_i).
An example of additively separable functions is the sum-of-spheres function f(x) = Σ_{i=1}^{D} x_i². The optimization of a D-dimensional additively separable function is equivalent to optimizing D one-dimensional functions independently, i.e., min_x f(x) = Σ_{i=1}^{D} min_{x_i} f_i(x_i).
Theorem 3. The expected convergence time of PROS for D-dimensional additively separable functions is bounded above by (D²L/ε) Σ_{i=1}^{D} (b_i − a_i).
Proof. Let f*_i be the global minimum of the i-th one-dimensional function f_i. Then

f_i(x_i) − f*_i ≤ ε/D for all i ∈ {1, 2, ..., D} (1)

is a sufficient condition for x ∈ R_ε. Let S_i be the iteration at which PROS first enters the region for x_i stated in (1). Under PROS, at each iteration t, the coordinate to be optimized is chosen uniformly at random. As in the proof of Theorem 1, S_i is a geometric random variable with parameter p_i ≥ (1/D) · (ε/D)/(L(b_i − a_i)) = ε/(D²L(b_i − a_i)). Note that the S_i's are not independent. We bound the convergence time T as T = max_i S_i ≤ Σ_{i=1}^{D} S_i. Hence, E[T] ≤ Σ_{i=1}^{D} E[S_i] = Σ_{i=1}^{D} 1/p_i ≤ (D²L/ε) Σ_{i=1}^{D} (b_i − a_i).

Modified PROS with Local Search Mechanism
Although the PROS algorithm converges to the global optimum provided that the sufficient conditions are satisfied, it converges slowly when compared with other well-known EC algorithms [15]. The major reason is that PROS simply performs uniform orthogonal search in every iteration with no local search mechanism. The probability of finding an improved solution (defined in the subsection below) diminishes as it moves closer to the global optimum. Consider the situation where, in the t-th iteration, x^(t) is already very close to x* but has not yet fallen into the region of global optimum. There is a high chance that x^(t+1) would reach the region of global optimum if a narrow-range local search were performed in the (t + 1)-th iteration. However, with uniform orthogonal search, the chance to reach the global optimum is relatively low, making convergence slow. One may therefore consider using a sampling policy other than the uniform one to perform local search.

Triangular-Distributed Random Orthogonal Search (TROS)
In this section, we present our first proposed algorithm, called Triangular-Distributed Random Orthogonal Search (TROS). The TROS algorithm (Algorithm 2) is presented as follows:

Algorithm 2: Triangular-Distributed Random Orthogonal Search (TROS)
input: nil
output: the best solution vector x^(t) found by the algorithm
1: t ← 0
2: Initialize x^(0) randomly from Ω
3: repeat
4:   Choose j uniformly at random from {1, 2, ..., D}
5:   Draw r from the triangular distribution T(a_j, b_j, x^(t)_j)
6:   y ← x^(t)
7:   y_j ← r
8:   if f(y) < f(x^(t)) then
9:     x^(t+1) ← y
10:  else
11:    x^(t+1) ← x^(t)
12:  end if
13:  t ← t + 1
14: until a termination criterion is met
15: return x^(t)

Compared with PROS, TROS has one change in line 5 of Algorithm 1. Instead of sampling the next point of the j-th decision variable using the uniform distribution, TROS uses the triangular distribution T. The probability density function of the triangular distribution T(a, b, c) is

f(x) = 2(x − a) / ((b − a)(c − a)) for a ≤ x < c,
f(x) = 2(b − x) / ((b − a)(b − c)) for c ≤ x ≤ b,
f(x) = 0 otherwise,

where a, b and c are the parameters of the distribution representing the lower limit of x, the upper limit of x and the mode of x, respectively. The triangular distribution is illustrated in Figure 1.
In each iteration of TROS, the next point is sampled using the triangular distribution with the settings a = a_j, b = b_j, c = x^(t)_j, where j ∈ {1, 2, ..., D} is the randomly chosen decision variable for the current iteration, a_j and b_j are the lower and upper bounds of the j-th decision variable, and x^(t)_j is the value of the j-th decision variable of the current best solution vector. With this distribution, a sample near the current position x^(t)_j is more likely to be drawn than one far from it. As a result, the algorithm performs exploitation (that is, encourages local search) on the j-th decision variable.
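In code, the only change from the PROS step is the sampling call. NumPy ships a triangular sampler, so a TROS proposal can be sketched as follows (the function name and array layout are our own, not the authors' Java implementation):

```python
import numpy as np

def tros_step(x, j, bounds, rng):
    """One TROS proposal: resample coordinate j of the current best
    solution x from the triangular distribution whose mode is the
    current value x[j] and whose limits are the variable's bounds."""
    a_j, b_j = bounds[j]
    y = x.copy()
    y[j] = rng.triangular(a_j, x[j], b_j)  # arguments: (left, mode, right)
    return y

rng = np.random.default_rng(0)
```

Because the density peaks at x[j], proposals cluster around the current best value, which is exactly the local-search bias described above.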

Quadratic-Distributed Random Orthogonal Search (QROS)
In this section, we present our second proposed algorithm, called Quadratic-Distributed Random Orthogonal Search (QROS). The QROS algorithm (Algorithm 3) is presented as follows:

Algorithm 3: Quadratic-Distributed Random Orthogonal Search (QROS)
input: nil
output: the best solution vector x^(t) found by the algorithm
1: t ← 0
2: Initialize x^(0) randomly from Ω
3: repeat
4:   Choose j uniformly at random from {1, 2, ..., D}
5:   Draw r from the quadratic distribution Q(a_j, b_j, x^(t)_j)
6:   y ← x^(t)
7:   y_j ← r
8:   if f(y) < f(x^(t)) then
9:     x^(t+1) ← y
10:  else
11:    x^(t+1) ← x^(t)
12:  end if
13:  t ← t + 1
14: until a termination criterion is met
15: return x^(t)

Compared with TROS, QROS has one change in line 5 of Algorithm 2: the value of the j-th decision variable is sampled using the quadratic distribution Q. The probability density function of the quadratic distribution Q(a, b, c) is

f(x) = 3(x − a)² / ((b − a)(c − a)²) for a ≤ x < c,
f(x) = 3(b − x)² / ((b − a)(b − c)²) for c ≤ x ≤ b,
f(x) = 0 otherwise,

where a, b and c are the lower limit of x, the upper limit of x and the mode of x, respectively. The quadratic distribution is illustrated in Figure 1.
In each iteration of QROS, the next point is sampled using the quadratic distribution Q(a_j, b_j, x^(t)_j), where j ∈ {1, 2, ..., D} is the randomly chosen decision variable for the current iteration, a_j and b_j are the lower and upper bounds of the j-th decision variable, and x^(t)_j is the value of the j-th decision variable of the current best solution vector. With this distribution, a sample near the current position x^(t)_j is even more likely to be drawn than under the triangular distribution, so QROS encourages exploitation even more than TROS.
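Standard libraries do not ship this piecewise quadratic distribution, but it can be sampled exactly by inverting its CDF. The sketch below assumes the density 3(x − a)²/((b − a)(c − a)²) on [a, c) and 3(b − x)²/((b − a)(b − c)²) on [c, b], which is the form consistent with the probabilities in Theorem 5; treat it as an illustration rather than the authors' implementation.

```python
import random

def sample_quadratic(a, b, c, rng):
    """Inverse-CDF sample from the quadratic distribution Q(a, b, c)
    with support [a, b] and mode c.  The CDF is
        F(x) = (x - a)^3 / ((b - a)(c - a)^2)        for a <= x <= c,
        F(x) = 1 - (b - x)^3 / ((b - a)(b - c)^2)    for c <= x <= b.
    """
    u = rng.random()
    if u <= (c - a) / (b - a):          # mass to the left of the mode
        return a + (u * (b - a)) ** (1.0 / 3.0) * (c - a) ** (2.0 / 3.0)
    return b - ((1.0 - u) * (b - a)) ** (1.0 / 3.0) * (b - c) ** (2.0 / 3.0)
```

A QROS iteration would then call `sample_quadratic(a_j, b_j, x_j_t, rng)` in place of the uniform draw; the cubed terms concentrate the samples around the mode more sharply than the triangular density does.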
It has to be noted that both TROS and QROS are still parameter-free algorithms, as a and b are fixed boundaries and c is determined by the current best solution vector. With this simple modification, both TROS and QROS improve the convergence speed of PROS while keeping its major properties: no parameters to tune, excellent computational efficiency, ease of application to all kinds of problems, and high performance under a finite-time search budget.

Analysis of the Modified Algorithms
In this subsection, we explain the motivation behind and analyze the performance of the two algorithms. Sampling using the uniform distribution can be considered as performing global search on a decision variable, while sampling using the triangular distribution or the quadratic distribution can be considered as performing local search on a decision variable.
Definition 6 (Improved Solution). The candidate solution vector x^(t+1) found by the algorithm is said to be improved if its error is less than the error of the current best solution. That is, f(x^(t+1)) − f* < f(x^(t)) − f*, where t denotes the current iteration.
For continuous (or Lipschitz) unimodal functions defined on a bounded domain, finding improved solution vectors always leads to convergence to the global optimum. However, this may not be true for multimodal functions. Therefore, we are not interested in simply finding an improved solution vector. Here, we define a restrictive subset of improved solution vectors.
Definition 7 (Tame Solution). The candidate solution vector x^(t+1) found by the algorithm is said to be a tame solution if the errors of all points on the line segment joining the candidate solution and the optimum solution are less than the error of the current best solution. That is, f(λx^(t+1) + (1 − λ)x*) < f(x^(t)) for all λ ∈ [0, 1], where t denotes the current iteration.
A local optimum of a multimodal function is not a tame solution if there exists a high hill between the local optimum and the global optimum. The purpose of defining tame solutions is that finding one not only means the solution is improved but also implies that simple techniques could be applied to converge to the global optimum. Therefore, we are interested in the probability of finding a tame solution under each of the algorithms.
We define the following terms for the analysis of the performance of the proposed algorithms. Let x^(t) = (x^(t)_1, ..., x^(t)_D) be the current best solution vector, and let j ∈ {1, ..., D} be the chosen decision variable of the current iteration. Given x^(t)_{−j}, we assume f(x^(t)_{−j}, x_j) has a unique global minimum in x_j, denoted x*_j. That is, x*_j = arg min_{x_j ∈ [a_j, b_j]} f(x^(t)_{−j}, x_j).
Let R_j be the set of values of the j-th decision variable that belong to improved solution vectors. That is, R_j = {x_j : f(x^(t)_{−j}, x_j) < f(x^(t)), x_j ∈ [a_j, b_j]}. Let S_j be the set of values of the j-th decision variable that belong to tame solution vectors. Let x^l_j = inf S_j and x^r_j = sup S_j be the smallest and largest x_j values of the tame solutions, respectively, and let l_j = x^r_j − x^l_j be the length of the interval S_j. By Lemma 1, l_j is non-negative and is bounded below by e/L, where e is the error of f(x^(t)). Figure 2 shows two examples of multimodal functions with the corresponding set of tame solutions denoted by S_j.
Definition 8 (Probability of Tame Convergence). The probability of tame convergence is defined as the probability of an algorithm with sampling policy π to find a tame solution in its next iteration, given the current best solution vector. That is, P^π_j = P{x^(t+1)_j ∈ S_j | x^(t), x^(t+1)_j ∼ π}.
Proof of Lemma 2. Let P^U_j be the probability that a tame solution is found by sampling using the uniform distribution on the j-th decision variable in the (t + 1)-th iteration, given the best solution vector in the t-th iteration. Then P^U_j = l_j/(b_j − a_j).
Theorem 4. The probability of tame convergence of TROS is (l_j/(b_j − a_j)) · (l_j + 2v_j)/(u_j + l_j + v_j) when x^(t)_j < x*_j and is (l_j/(b_j − a_j)) · (l_j + 2q_j)/(q_j + l_j + r_j) when x^(t)_j > x*_j, where u_j = x^l_j − x^(t)_j, v_j = b_j − x^r_j, q_j = x^l_j − a_j and r_j = x^(t)_j − x^r_j.
Proof. Let P^T_j be the probability that a tame solution is found by sampling using the triangular distribution on the j-th decision variable in the (t + 1)-th iteration, given the best solution vector in the t-th iteration. That is, P^T_j = P{x^(t+1)_j ∈ S_j | x^(t), x^(t+1)_j ∼ T(a_j, b_j, x^(t)_j)}.
For case 1, x^(t)_j < x*_j, the tame interval [x^l_j, x^r_j] lies to the right of the mode x^(t)_j, so integrating the density over the interval (the area of a trapezoid) gives
P^T_j = (l_j/2) [ (2/(b_j − a_j)) · (l_j + v_j)/(u_j + l_j + v_j) + (2/(b_j − a_j)) · v_j/(u_j + l_j + v_j) ] = (l_j/(b_j − a_j)) · (l_j + 2v_j)/(u_j + l_j + v_j).
Similarly, for case 2, x^(t)_j > x*_j:
P^T_j = (l_j/2) [ (2/(b_j − a_j)) · q_j/(q_j + l_j + r_j) + (2/(b_j − a_j)) · (l_j + q_j)/(q_j + l_j + r_j) ] = (l_j/(b_j − a_j)) · (l_j + 2q_j)/(q_j + l_j + r_j).
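The closed form in Theorem 4 is easy to sanity-check by Monte Carlo. The numbers below are illustrative only: search interval [0, 10], current point x^(t)_j = 4 and tame set S_j = [6, 7] (case 1), so u_j = 2, l_j = 1, v_j = 3 and the theorem predicts P^T_j = (1/10) · 7/6 = 7/60.

```python
import random

a_j, b_j = 0.0, 10.0          # bounds of the j-th decision variable
x_t = 4.0                     # current best value (the mode of T)
x_l, x_r = 6.0, 7.0           # tame interval S_j, to the right of x_t

u = x_l - x_t                 # 2.0
l = x_r - x_l                 # 1.0
v = b_j - x_r                 # 3.0
p_theory = (l / (b_j - a_j)) * (l + 2.0 * v) / (u + l + v)   # 7/60

rng = random.Random(0)
n = 300_000
# Python's random.triangular takes (low, high, mode).
hits = sum(x_l <= rng.triangular(a_j, b_j, x_t) <= x_r for _ in range(n))
p_empirical = hits / n
```

With 300,000 draws, the empirical frequency agrees with 7/60 ≈ 0.1167 to within a few thousandths, matching the trapezoid-area computation in the proof.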
Corollary 2. The conditions under which TROS has a higher probability of tame convergence than PROS, the same probability as PROS, and a lower probability than PROS are as follows, respectively.
For case 1, x^(t)_j < x*_j: P^T_j > P^U_j if v_j > u_j; P^T_j = P^U_j if v_j = u_j; P^T_j < P^U_j if v_j < u_j.
For case 2, x^(t)_j > x*_j: P^T_j > P^U_j if q_j > r_j; P^T_j = P^U_j if q_j = r_j; P^T_j < P^U_j if q_j < r_j.
Corollary 3. The probability of tame convergence of TROS is higher than or equal to that of PROS for all convex functions.
Proof. For convex functions, u_j = 0 (for case 1) and r_j = 0 (for case 2). An example convex function is shown in Figure 3.
For case 1, x^(t)_j < x*_j: P^T_j = (l_j/(b_j − a_j)) · (l_j + 2v_j)/(l_j + v_j) = P^U_j (1 + v_j/(l_j + v_j)).
Similarly, for case 2, x^(t)_j > x*_j: P^T_j = (l_j/(b_j − a_j)) · (l_j + 2q_j)/(q_j + l_j) = P^U_j (1 + q_j/(q_j + l_j)).
Therefore, for convex functions, P^T_j can never be less than P^U_j.
Theorem 5. The probability of tame convergence of QROS is (l_j/(b_j − a_j)) · ((l_j + v_j)(l_j + 2v_j) + v_j²)/(u_j + l_j + v_j)² when x^(t)_j < x*_j and is (l_j/(b_j − a_j)) · ((q_j + l_j)(2q_j + l_j) + q_j²)/(q_j + l_j + r_j)² when x^(t)_j > x*_j, where u_j = x^l_j − x^(t)_j, v_j = b_j − x^r_j, q_j = x^l_j − a_j and r_j = x^(t)_j − x^r_j.
Proof. Let P^Q_j be the probability that a tame solution is found by sampling using the quadratic distribution on the j-th decision variable in the (t + 1)-th iteration, given the best solution vector in the t-th iteration. That is, P^Q_j = P{x^(t+1)_j ∈ S_j | x^(t), x^(t+1)_j ∼ Q(a_j, b_j, x^(t)_j)}.
For case 1, x^(t)_j < x*_j, integrating the density of Q over [x^l_j, x^r_j] gives
P^Q_j = ((l_j + v_j)³ − v_j³)/((b_j − a_j)(u_j + l_j + v_j)²) = (l_j/(b_j − a_j)) · ((l_j + v_j)(l_j + 2v_j) + v_j²)/(u_j + l_j + v_j)².
Similarly, for case 2, x^(t)_j > x*_j:
P^Q_j = ((q_j + l_j)³ − q_j³)/((b_j − a_j)(q_j + l_j + r_j)²) = (l_j/(b_j − a_j)) · ((q_j + l_j)(2q_j + l_j) + q_j²)/(q_j + l_j + r_j)².
Corollary 4. The probability of tame convergence of QROS is higher than or equal to that of TROS and PROS for all convex functions.
For case 1, x^(t)_j < x*_j (u_j = 0): P^Q_j = (l_j/(b_j − a_j)) · ((l_j + v_j)(l_j + 2v_j) + v_j²)/(l_j + v_j)² = P^T_j (1 + v_j²/((l_j + v_j)(l_j + 2v_j))).
For case 2, x^(t)_j > x*_j (r_j = 0): P^Q_j = (l_j/(b_j − a_j)) · ((q_j + l_j)(2q_j + l_j) + q_j²)/(q_j + l_j)² = P^T_j (1 + q_j²/((q_j + l_j)(l_j + 2q_j))).
Therefore, for convex functions, P^Q_j can never be less than P^T_j and P^U_j.
The probability density functions of the distributions U, T and Q are degree-0, degree-1 and degree-2 polynomials, respectively. From Corollary 4, the increment in the probability of tame convergence diminishes as the degree of the polynomial increases. We expect that sampling using an even higher-degree distribution (e.g., a cubic distribution with a mode narrower than those of Q and T) would give a higher probability of tame convergence, but the gain diminishes as the degree grows.

Experiments
In order to evaluate the merits of the two modified algorithms (TROS and QROS), a set of benchmark test problems was selected from the literature [15][16][17] and is shown in Table 1. The benchmark problems include unimodal and highly multimodal functions, convex and non-convex functions, and separable and non-separable functions. TROS and QROS were compared with the original PROS algorithm and three well-known and widely used EC algorithms for optimization: the genetic algorithm (GA) [18], particle swarm optimization (PSO) [19] and differential evolution (DE) [20]. PROS, TROS and QROS were implemented in Java by the authors. GA, PSO and DE were run using pymoo, an open-source framework for multi-objective optimization in Python, version 0.6.0 [21]. The code for the benchmark problems running on pymoo was revised by the authors (for the experiments with shifted search spaces). The hyper-parameters of GA, PSO and DE were set to the default values from the pymoo package (they are the typical values suggested in the literature).

Table 1. Benchmark functions: formulation, global optimum and search space.

The experiments were carried out on a 3.20 GHz computer with 16 GB RAM under a Windows 10 platform. Multiple runs were conducted for each problem by each algorithm, with a different seed for the random number generator in each run. For a fair comparison, all algorithms end with the same maximum number of objective function evaluations. We follow the experimental settings of the PROS paper [15] for the population size and the maximum number of objective function evaluations. In order to increase the reliability of the statistical results, the number of runs was increased to 100, 100 and 30 for the 5D, 10D and 50D problems, respectively (instead of 10 for all dimensions as in [15]). The settings are summarized in Table 2.
Figure 4 shows the mean error over 100 independent runs of each algorithm on each benchmark problem for dimension D = 5. The x-axes are the number of objective function evaluations (in log scale) and the y-axes are the average error of the algorithm after a certain number of objective function evaluations. As shown in the figure, QROS converged faster than the other five algorithms except on f_3 and f_5, where PSO converged the fastest. TROS and PROS had similar trends to QROS on the benchmark functions: they converged faster than GA, PSO and DE on most benchmark functions except on f_3 and f_5. Figures 5 and 6 show the experimental results for dimensions D = 10 and D = 50, respectively. Similar to the results for D = 5, QROS converged faster than the other five algorithms except on f_3 (for 10D) and f_5 (for 10D and 50D), where PSO converged the fastest.
Table 3 shows the mean errors of the final solution vectors returned by PROS, TROS and QROS over 100 independent runs on each benchmark problem for dimension D = 5. The mean errors represent the medium- to long-term performance of the algorithms. The best results (the smallest mean final errors) for each benchmark problem are highlighted in bold font. The corresponding standard deviations are given next to them in parentheses. Owing to limited space, the results of GA, PSO and DE are not shown here, as they have been reported in [15] already. As seen from the table, QROS had the smallest mean final errors on most of the benchmark problems. Similar results can be seen from Tables 4 and 5, which show the mean final errors for dimensions D = 10 and D = 50, respectively. In general, QROS and TROS are more efficient than PROS in reducing the mean final errors on most benchmark problems.
One may notice that both TROS and QROS favor objective functions whose global optimum is located at the center of the search space. For example, the global optimum of f_1, f_2, f_3, f_5, f_6, f_7, f_8, f_9 and f_12 is (0, 0, ..., 0). In order to test the effectiveness of TROS and QROS in the general case, another experiment was conducted. The same set of benchmark problems was used, but the search space was shifted so that the global optimum may be located anywhere within the bounded search space. For each benchmark function and each run, the search space was shifted by a random offset along each decision variable i, where a_i, b_i and x*_i are the lower limit, upper limit and optimum value of the i-th decision variable in the original search space and i ∈ {1, 2, ..., D}. It has to be noted that the benchmark problems in this experiment are carefully chosen to ensure that, for each benchmark problem, x* is still the global optimum in the shifted search space, but x* is not necessarily located at the center of the new search space.
Figures 7-9 show the mean error of each algorithm on each benchmark problem for dimensions D = 5, D = 10 and D = 50, respectively. Similar to the non-shifted experiments, PROS, TROS and QROS converged faster than GA, PSO and DE on most benchmark problems except on f_5 when D = 50. Considering only the three random orthogonal search algorithms, PROS usually converged quickly at the beginning, then TROS surpassed PROS, and QROS in turn surpassed TROS on most of the benchmark problems.
Tables 6-8 show the statistical results of the final error of each algorithm on each benchmark problem for dimensions D = 5, D = 10 and D = 50, respectively. Similar to the non-shifted experiments, both QROS and TROS often improved the mean final errors. The performance of QROS and TROS is quite promising on the shifted benchmark functions.

Future Work
In our future work, we plan to extend the algorithms in two directions. The first is to allow switching among various sampling policies (e.g., uniform, triangular, quadratic) in an adaptive way. This is particularly useful for black-box optimization, where the form of the function is unknown to the optimization algorithm. One could learn the best sampling policy for a particular function and adapt gradually based on the results of previous function evaluations. The second direction is to gradually rotate the objective function based on the current best solution set [22,23]. By doing so, a complex function would be transformed into a simpler convex function in the small area of concern. Simple techniques such as TROS and QROS would then quickly converge to the global optimum if it lies in the area of concern. We are interested in investigating the convergence of TROS and QROS with the addition of such transformation techniques.

Assumption 1.
f has a single global optimum f* = min_{x∈Ω} f(x).

Figure 1 .
Figure 1. The probability density functions of the triangular distribution T(a, b, c), the quadratic distribution Q(a, b, c), and the uniform distribution U(a, b).

Lemma 2.
The probability of tame convergence of PROS is l_j/(b_j − a_j).

Figure 2 .
Figure 2. Left: an example function where x^(t)_j < x*_j. Right: another example function where x^(t)_j > x*_j. The set of tame solutions is denoted by S_j.

Figure 3 .
Figure 3. An example convex function. Left: case 1, x^(t)_j < x*_j; right: case 2, x^(t)_j > x*_j.

Figure 6 .
Figure 6.The convergence curve of 30 runs on 50D benchmark functions.

Table 2 .
Settings of the numerical experiments.

Table 3 .
Statistical results of 100 runs on 5D benchmark functions.

Table 4 .
Statistical results of 100 runs on 10D benchmark functions.

Table 5 .
Statistical results of 30 runs on 50D benchmark functions.

Table 6 .
Statistical results of 100 runs on 5D benchmark functions with shifted global optimum.

Table 7 .
Statistical results of 100 runs on 10D benchmark functions with shifted global optimum.

Table 8 .
Statistical results of 30 runs on 50D benchmark functions with shifted global optimum.