Chaotic Quantum Double Delta Swarm Algorithm using Chebyshev Maps: Theoretical Foundations, Performance Analyses and Convergence Issues

The Quantum Double Delta Swarm (QDDS) algorithm is a recent metaheuristic inspired by the mechanism of convergence to the center of the potential generated within a single well of a spatially co-located double delta well setup. It mimics the wave nature of candidate positions in solution spaces and draws upon quantum mechanical interpretations, much like other quantum-inspired computational intelligence paradigms. In this work, we introduce a Chebyshev map driven chaotic perturbation in the optimization phase of the algorithm to diversify the weights placed on the contemporary and historical, socially-optimal agents' solutions. We follow this up with a characterization of solution quality on a suite of 23 single-objective functions and carry out a comparative analysis with eight other related nature-inspired approaches. By comparing solution quality and successful runs over dynamic solution ranges, insights about the nature of convergence are obtained. A two-tailed t-test establishes the statistical significance of the solution data, whereas Cohen's d and Hedges' g values provide a measure of effect sizes. We trace the trajectory of the fittest pseudo-agent over all function evaluations to comment on the dynamics of the system, and prove that the proposed algorithm is theoretically globally convergent under the assumptions adopted for proofs of other closely-related random search algorithms.


I. INTRODUCTION
The Quantum Double Delta Swarm (QDDS) algorithm [1] extends the well-known Quantum-behaved Particle Swarm Optimization (QPSO) [2][3][4] using an additional Dirac delta well and imposing motional constraints on particles to effect convergence to a single well under the influence of both. The particles are centrally pulled by an attractive potential field, and a recursive Monte Carlo relation is established by collapse of the wavefunctions around the center of the wells. The methodology was put forward and tested on select unimodal and multi-modal benchmarks in [1] and generates promising solution quality when compared to [4]. In this work, we primarily report performance improvements of the QDDS algorithm when its solution update process is influenced by a random perturbation drawn from a Chebyshev chaotic map. The perturbation seeks to diversify the weight array corresponding to the current and socially-optimal agents' solutions. A detailed performance characterization over twenty-three single-objective, unimodal and multi-modal functions of fixed and varying dimensions is carried out. The characterization is repeated for eight other nature-inspired approaches to provide a basis for comparison. The collective potential (cost) quality and precision data from the experimentation provide information on the operating conditions and tradeoffs, while the conclusion drawn from a subsequent two-tailed t-test points to the statistical significance of the results at the α=0.05 level. We follow the path of the best performing agent in any iteration across all iterations and critically analyze the dynamical limitations of the algorithm. Consequently, we also look at the global convergence proof of random search algorithms [5] and contend that the proposed algorithm theoretically converges to the global infimum under certain weak assumptions adopted for convergence proofs of similar random search techniques.
The organization of the article is as follows: in Section II we walk through a couple of major swarm intelligence paradigms and derive our way through the classical and quantum interpretations in these multi-agent systems. In Section III, we discuss swarm propagation under the influence of a double Dirac delta well and set up its quantum mechanical model. In Section IV we outline the QDDS and the Chebyshev map driven QDDS (C-QDDS) algorithms and provide a detailed algorithmic procedure for purposes of reproducibility. Following this, in Section V we detail the benchmark optimization problems and graphically illustrate their three-dimensional representations. This is followed in Section VI by comparative analyses of function evaluations on the benchmarks and statistical significance tests, taking into account the contribution of effect sizes. The trajectory of the best performing agent in each iteration is tracked along the function contours, and the limitations and successes of the approach are identified. In Section VII a critical analysis is presented in light of the findings. In Section VIII a global convergence proof is given for the algorithm, and finally, Section IX charts out future directions and concludes the paper.

II. BACKGROUND
The seminal work of Eberhart and Kennedy on flocking-induced stochastic, multi-particle swarming resulted in a surge in nature-inspired optimization research, specifically after their highly influential paper, Particle Swarm Optimization (PSO) [6], at the International Conference on Neural Networks in 1995. This was a landmark moment in the history of swarm intelligence, and the following years saw a surge of interest in the application of nature-inspired methods to engineering problems that were until then either intractable or simply hard from a computational standpoint. With a steady increase in processor speed and distributed computing abilities over the last couple of decades, gradient-independent approaches have gradually become commonplace. The equations of motion in PSO are powerful owing to their simplicity, intuitiveness and low computational cost. In this section, a formal transition from the classical model of the canonical PSO to that of the quantum-inspired, or Quantum-behaved PSO (QPSO), is explored. The QPSO model assumes quantum properties in agents and establishes an uncertainty-based position distribution instead of the deterministic one in the canonical PSO with Newtonian walks. Importantly, the QPSO algorithm requires the practitioner to tune only one parameter, the contraction-expansion (CE) coefficient, instead of three in PSO. It is worth looking at the dynamics of a PSO-driven swarm to gain a better understanding of singular and double Dirac delta driven quantum swarms, later in the article.
A. The Classical PSO
Assume $X = [x_1\; x_2\; x_3 \dots x_m]$ is the cohort of $m$ particles of dimensionality $n$ and $V = [v_1\; v_2\; v_3 \dots v_m]$ are the velocity vectors which denote incremental changes in their positions in the solution hyperspace. Given this knowledge, a canonical PSO-like formulation may be expressed as:

$$v_i(t+1) = \omega\, v_i(t) + c_1 r_1 \left(P_i(t) - x_i(t)\right) + c_2 r_2 \left(G(t) - x_i(t)\right) \tag{1}$$
$$x_i(t+1) = x_i(t) + v_i(t+1) \tag{2}$$

The parameters $\omega$, $c_1$, $c_2$, $r_1$, $r_2$ are responsible for imparting inertia, cognitive and social weights as well as random perturbations towards the historical best position $P_i(t)$ of any particle (pbest) or $G(t)$, that of the swarm as a whole (gbest). The canonical PSO model mimics social information exchange in flocks of birds and schools of fish and is a simple, yet powerful optimization paradigm. However, it has its limitations: Van den Bergh showed that the algorithm is not guaranteed to converge to globally optimal solutions based on the convergence criteria put forward by Solis and Wets [5]. Clerc and Kennedy demonstrated in [7] that the algorithm may converge if particles cluster about a local attractor $p$ lying at the diagonal end of the hyper-rectangle constructed using its cognitive and social velocity vectors (terms 2 and 3 on the right-hand side of equation (1), respectively). Proper tuning of the algorithmic parameters and limits on the velocity are usually required to bring about convergent behavior. The interested reader may look at [8][9][10][11] for detailed operating conditions, possible applications and troubleshooting of issues when working with the PSO algorithm.
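As a concrete reference for the velocity and position updates in equations (1) and (2), a minimal sketch of one canonical PSO step follows; the parameter defaults (ω = 0.7, c1 = c2 = 1.5) are illustrative choices, not the settings used in this paper's experiments.

```python
import numpy as np

def pso_step(X, V, pbest, gbest, w=0.7, c1=1.5, c2=1.5, rng=None):
    """One canonical PSO update: inertia, cognitive and social terms.

    X, V, pbest are (m, n) arrays of positions, velocities and personal
    bests; gbest is the (n,) swarm-wide best position.
    """
    rng = rng or np.random.default_rng()
    r1 = rng.random(X.shape)  # cognitive random perturbation
    r2 = rng.random(X.shape)  # social random perturbation
    V = w * V + c1 * r1 * (pbest - X) + c2 * r2 * (gbest - X)
    return X + V, V
```

As noted above, velocity clamping and careful parameter tuning are usually layered on top of this bare update to encourage convergent behavior.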

B. The Quantum-behaved PSO
The local attractor $p$, introduced in [7] as the point around which particles should flock in order to bring about swarm-wide convergence, can be formally expressed using equation (3); further simplification leads to the parameter-reduced form in equation (4), after the assumption that the combined weights may take on any values between 0 and 1:

$$p_i = \frac{c_1 r_1 P_i + c_2 r_2 G}{c_1 r_1 + c_2 r_2} \tag{3}$$
$$p_i = \varphi\, P_i + (1 - \varphi)\, G, \qquad \varphi \in (0, 1) \tag{4}$$
Drawing insights from this convergence analysis, Sun et al. in [2][3] outlined the algorithmic working of the Quantum-behaved Particle Swarm Optimization (QPSO). Instead of point representations of a particle, wavefunctions were used to provide a quantitative sense of its state. The normalized probability density function $F$ of a particle may be put forward as:

$$F\left(x_i(t+1)\right) = \frac{1}{L}\,\exp\!\left(-\,2\,\left|p_i - x_i(t+1)\right| / L\right) \tag{5}$$

$L$ is the standard deviation of the distribution: it provides a measure of the dynamic range of the search space of a particle in a specific timestep. Using the Monte Carlo method, equation (5) may be transformed into a recursive, computable closed-form expression of particle positions in equation (6) below:

$$x_i(t+1) = p_i \pm \frac{L}{2}\,\ln(1/u), \qquad u \sim U(0, 1) \tag{6}$$

$L$ is computed as a measure of deviation from the average of all individual personal best particle positions (pbest) in each dimension, i.e., the farther from the average a particle is in a dimension, the larger the value of $L$ is for that dimension. This average position has been dubbed the 'Mean Best' or 'mbest' and is an agglomerative representation of the swarm as if each member were at the personal best position visited in the course of its history.
Therefore, $L$ may be expressed by including the deviation from mbest, as in equation (8):

$$mbest(t) = \frac{1}{m}\sum_{i=1}^{m} P_i(t) \tag{7}$$
$$L = 2\,\alpha\,\left|mbest(t) - x_i(t)\right| \tag{8}$$

The modulation factor $\alpha$ is known as the Contraction-Expansion (CE) factor and may be adjusted to control the convergence speed of the QPSO algorithm depending on the application.
Subsequently, plugging the value of $L$ obtained in equation (8) into equation (6), the position update formulation for QPSO may be re-expressed as:

$$x_i(t+1) = p_i \pm \alpha\,\left|mbest(t) - x_i(t)\right|\,\ln(1/u) \tag{9}$$

Issues such as suboptimal convergence during the application of the QPSO algorithm may arise out of an unbiased selection of weights in the mean best computation, as well as the overdependence on the globally best particle in the design of the local attractor $p$. These issues have been studied by Xi et al. in [12], Sengupta et al. in [13] and Dhabal et al. in [14]. Xi et al. proposed a differentially weighted mean best in [4]: a variant of the QPSO algorithm with a weighted mean best position (WQPSO), which seeks to alleviate the subpar selection of weights in the mean best update process. The underlying assumption is that fitter particles stand to contribute more to the mean best position and should be accorded larger weights, drawing an analogy with the correlation between cultural uptick and the contributions of the societal, intellectual elite to it [4]. Xi et al. also put forward in [12] a local search strategy using a super particle with variable contributions from swarm members to overcome the dependence issues during the local attractor design. However, to date no significant study has investigated the effect of more than one spatially co-located basin of attraction around the local attractor, particularly that of multi-well systems. In the next section we derive state expressions of a particle convergent upon one well under the influence of two spatially co-located Dirac delta wells.
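A minimal sketch of the QPSO position update of equation (9) may help fix ideas; the CE coefficient value and the uniform-draw conventions here are illustrative assumptions.

```python
import numpy as np

def qpso_step(X, pbest, gbest, alpha=0.75, rng=None):
    """One QPSO update: sample new positions around the local attractor p.

    X and pbest are (m, n) arrays; gbest is (n,). alpha is the
    contraction-expansion (CE) coefficient.
    """
    rng = rng or np.random.default_rng()
    mbest = pbest.mean(axis=0)                 # 'mean best' of the swarm
    phi = rng.random(X.shape)
    p = phi * pbest + (1.0 - phi) * gbest      # local attractor, eq. (4)
    u = 1.0 - rng.random(X.shape)              # uniform draw in (0, 1]
    sign = np.where(rng.random(X.shape) < 0.5, 1.0, -1.0)
    return p + sign * alpha * np.abs(mbest - X) * np.log(1.0 / u)
```

Note how the step size collapses as particles approach mbest, which is the contraction behavior the CE coefficient modulates.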

III. SWARMING UNDER THE INFLUENCE OF TWO DELTA POTENTIAL WELLS
The time-independent Schrödinger wave equation governs the different interpretations of particle behavior:

$$-\frac{\hbar^2}{2m}\,\nabla^2 \psi(r) + V(r)\,\psi(r) = E\,\psi(r) \tag{10}$$

$\psi(r)$, $V(r)$, $m$, $E$ and $\hbar$ represent the wave function, the potential function, the reduced mass, the energy of the particle and the reduced Planck constant, respectively. However, the wavefunction $\psi(r)$ has no physical significance on its own: its amplitude squared is a measure of the probability of finding a particle. Let us consider a particle under the influence of two delta potential wells experiencing an attractive potential:

$$V(x) = -\gamma\,\big[\delta(x + a) + \delta(x - a)\big] \tag{11}$$

The centers of the two wells are at $-a$ and $a$, and $\gamma$ is a constant indicative of the depth of the wells. Under the assumption that the particle experiences no attractive potential, i.e., $V = 0$ in regions far away from the centers, the even solutions $\psi_e$ of the time-independent Schrödinger equation (10) for $E < 0$ (bound states) in the regions $\mathbb{R}1: x \in (-\infty, -a)$, $\mathbb{R}2: x \in (-a, a)$ and $\mathbb{R}3: x \in (a, \infty)$, taking $k$ to be equal to $\sqrt{-2mE}/\hbar$, can be expressed as has been proved in [15]:

$$\psi_e(x) = \begin{cases} A_1\, e^{kx}, & x \in \mathbb{R}1 \\ A_2\, \cosh(kx), & x \in \mathbb{R}2 \\ A_1\, e^{-kx}, & x \in \mathbb{R}3 \end{cases} \tag{12}$$

The constants $A_1$ and $A_2$ described in the above equation are obtained by: (a) solving for the continuity of the wave function $\psi_e$ at $x = a$ and $x = -a$, and (b) solving for the continuity of the derivative of the wave function at $x = 0$.
Thus, $\psi_e$ may be rewritten as has been done in [15]. The odd wave function $\psi_o$ is not guaranteed to yield a solution [15]. Additionally, the bound state energy in the double well setup is lower than that in a single well setup by approximately a factor of $(1.11)^2 \approx 1.2321$ [16]. To study the motional aspect of a particle, its probability density function, given by the squared magnitude of $\psi_e$, is formally expressed. Further, the claim that there is a greater than 50 per cent probability of a particle existing in the neighborhood of the center of any of the potential wells (assumed centered at 0) boils down to the following criterion being met [2]:

$$\int_{-|r|}^{|r|} |\psi(x)|^2\, dx > \frac{1}{2} \tag{16}$$

$-|r|$ and $|r|$ are the dynamic limits of the neighborhood. Doing away with the inequality, equation (16) is re-written as:

$$\int_{-|r|}^{|r|} |\psi(x)|^2\, dx = \frac{1}{2} \tag{17}$$

Equation (17) is the criterion for localization of a particle around the center of a potential well in the double Dirac delta well setup.

IV. THE QUANTUM DOUBLE DELTA SWARM (QDDS) ALGORITHM
To ease computations, we make the assumption that one of the two potential wells is centered at 0. Then, solving for the conditions of localization of the particle in the neighborhood around the center of that well and computing $\int_{-|r|}^{|r|} |\psi(r')|^2\, dr'$ for the regions $\mathbb{R}2_{0-}: r' \in (-a, 0)$ and $\mathbb{R}2_{0+}: r' \in (0, a)$, we obtain the relationship in equation (18). Replacing the denominator of the R.H.S. of equation (18), i.e., $\left(\exp(2ka) - 5\exp(-2ka) + 4ka + 4\right)$, with $\delta$, we re-write it as equation (19). Equating $k^2$ on the L.H.S. of equation (18) for any two consecutive iterations (assuming it is a constant over iterations, as it is not a function of time), we get equations (20), (21) and (22). $\Lambda$ is the ratio $(a_t / a_{t-1})$, and it may vary between 0.5 and 2, since $1 < a < 2$. To keep a particle constrained within the vicinity of the center of the potential well, it must meet the condition in equation (23). Thus, we find $a_t$ for any iteration by utilizing $a_{t-1}$, obtained in the immediately past iteration. This is done by accounting for a correction factor in the form of the gradient of $\delta_{t-1}$, multiplied by a learning rate $\chi$. The computation of $a_t$ from $a_{t-1}$ feeds off the relationship of $a_{t-1}$ with $\delta^{-2}$, while taking the sign of the gradient of $\delta_{t-1}$ into consideration. The procedural details are outlined in Algorithm 1. The learning rate $\chi$ is chosen as a linearly decreasing, time-varying (LTV) one, to help facilitate exploration of the solution space early on in the optimization phase and a gradual shift to exploitation as the process evolves. $\nu$ is a small fraction between 0 and 1, chosen at will; one empirically successful value is 0.3, and we use it in our computations.
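The linearly time-varying (LTV) learning-rate schedule described above can be sketched as follows; the exact form of equation (24) is assumed here to be a straight-line interpolation from 1 down to ν = 0.3 over the run, consistent with the parameter settings reported later.

```python
def ltv_learning_rate(t, t_max, chi_max=1.0, chi_min=0.3):
    """Linearly decreasing learning rate: chi_max at t = 0 (exploration),
    chi_min at t = t_max (exploitation). chi_min plays the role of nu."""
    return chi_max - (chi_max - chi_min) * (t / t_max)
```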
Upon computing a value for $a_t$, equation (19) is solved to retrieve an estimate of the particle position $x_t$, which denotes a candidate position as well as a potential solution at the end of that iteration.
We let $x_t$, i.e., a particle's position in the current iteration, maintain a component towards the best position found so far (gbest), in addition to its current solution obtained from equation (19). Let $cheb$ denote the component towards the gbest position and $(1 - cheb)$ be that towards the current solution.
A cost function is subsequently computed, and the corresponding particle position is saved if the cost is the lowest among all the historical swarm-wide best costs obtained. This process is repeated until convergence criteria of choice (solution accuracy threshold, computational expense, memory requirements, success rate, etc.) are met. In this work, we use a Chebyshev chaotic map to generate coefficient sequences for driving the belief $cheb$ in the solution update phase of the QDDS algorithm.
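The update described above amounts to a convex combination of the gbest position and the candidate solution from equation (19); a minimal sketch follows, with the Chebyshev-drawn weight named cheb for illustration.

```python
import numpy as np

def blended_update(x_candidate, gbest, cheb):
    """Pull the candidate position towards gbest with weight cheb,
    keeping weight (1 - cheb) on the current solution."""
    x_candidate = np.asarray(x_candidate, dtype=float)
    gbest = np.asarray(gbest, dtype=float)
    return cheb * gbest + (1.0 - cheb) * x_candidate
```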

A. Chebyshev Map Driven Solution Update
The Chebyshev chaotic sequence can be generated using the following recursive relation [17]:

$$x_{k+1} = \cos\!\left(k\,\cos^{-1}(x_k)\right), \qquad x_k \in [-1, 1] \tag{26}$$

Equation (26) subsequently becomes equation (27) once the sequence is rescaled into $[0, 1]$ to yield the belief weight $cheb$ used in the solution update. The corresponding steps of Algorithm 1 find the learning rate $\chi$ using equation (24), select a particle at random, and update each of its dimensions in turn.
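A sketch of drawing such a chaotic weight sequence follows; the map degree (4) and seed (0.7) are illustrative assumptions, and the rescaling of the map's natural [-1, 1] range into [0, 1] mirrors the role of the weight in the solution update.

```python
import numpy as np

def chebyshev_sequence(n, x0=0.7, degree=4):
    """Chebyshev chaotic map x_{k+1} = cos(degree * arccos(x_k)),
    rescaled from [-1, 1] into [0, 1] for use as a stochastic weight."""
    seq = np.empty(n)
    x = x0
    for i in range(n):
        x = np.cos(degree * np.arccos(x))
        seq[i] = (x + 1.0) / 2.0  # rescale into [0, 1]
    return seq
```

Unlike a uniform pseudorandom draw, consecutive values of this sequence are deterministically linked yet non-repetitive, which is the diversification property the perturbation exploits.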

V. BENCHMARK FUNCTIONS AND PARAMETER SETTINGS

A. Benchmark Functions
A suite of the following 23 optimization benchmark functions (F1-F23), popularly used to inspect the performance of evolutionary optimization paradigms, has been utilized in this work to characterize the behavior of C-QDDS across unimodal and multimodal function landscapes of fixed and varying dimensionality. Coefficients for F20 are defined according to Tables F20.1 and F20.2, respectively; those for F21, F22 and F23 are defined according to Tables F21, F22 and F23, respectively.

B. Parameter Settings
We choose the constant $k$ to be 5 and $\theta$ to be the product of a random number drawn from a zero-mean Gaussian distribution with a standard deviation of 0.5 and a factor of the order of $10^{-3}$, after a sufficient number of trials. The learning rate $\chi$ decreases linearly with iterations from 1 to 0.3 according to equation (24), as an LTV weight [8]. $cheb \in [0, 1]$ is a random number generated using the Chebyshev chaotic map in equation (27). All experiments are carried out on two Intel(R) Core(TM) i7-5500U CPUs @ 2.40GHz with 8GB RAM and one Intel(R) Core(TM) i7-2600U CPU @ 3.40GHz with 16GB RAM, using MATLAB R2017a. All experiments are independently repeated 30 times in order to account for variability in the reported data due to the underlying stochasticity of the metaheuristics used. Clusters from the MATLAB Parallel Computing Cloud are utilized to speed up the benchmarking.

VI. EXPERIMENTAL RESULTS
Tables 10 through 12 report performances of the C-QDDS algorithm on the test problems, stacked against solution qualities obtained using 8 other commonly used, recent nature-inspired approaches. These are: i) Sine Cosine Algorithm (SCA) [18], ii) Dragonfly Algorithm (DFA) [19], iii) Ant Lion Optimization (ALO) [20], iv) Whale Optimization Algorithm (WOA) [21], v) Firefly Algorithm (FA) [22], vi) Quantum-behaved Particle Swarm Optimization (QPSO) [2][3], vii) Particle Swarm Optimization with Damped Inertia* (PSO-I) and viii) the canonical Particle Swarm Optimizer (PSO-II) [6]. Each algorithm has been executed for 1000 iterations with 30 independent trials, following which the mean, standard deviation and minimum values are noted. Testing carried out on the functions adheres to the dimensionalities and range constraints specified in Tables 2-8. A total of 50 agents have been introduced in the particle pool, out of which only one agent is picked in each iteration. The rationale for choosing one agent instead of many or all from the pool is to investigate the incremental effect of a single agent's propagation under different nature-inspired dynamical perturbations. The ripple effect otherwise caused by information exchanges among many or all particles is delayed when a single particle affects the global pool in one iteration.
* PSO-I utilizes an exponentially decaying inertia weight for exploration-exploitation trade-off

VII. ANALYSIS OF EXPERIMENTAL RESULTS
Tables 10-12 report the solution qualities obtained on the suite of test functions F1-F23, followed by Tables 13-17, which report the win/tie/loss counts, average ranks and results of statistical significance tests: a two-tailed t-test along with Cohen's d and Hedges' g values. From Tables 10-12, one can observe that C-QDDS has a distinctive advantage over the other algorithms in terms of the quality of optima found, outperforming competitors on unimodal functions such as F3-F5 and F7 and multimodal ones such as F9-F13. However, solution quality drops for the multimodal functions F14-F23, with the agents getting stuck in local minima. One interpretation is that since communication between particles is limited when only one agent is drawn in an iteration, a considerably large number of iterations is needed for promising regions to be found. Alternatively, because the QDDS mechanism is based on gradient descent, saddle points and valleys introduce stagnation which is difficult to break out of. A two-tailed Student's t-test with significance level α=0.05 in Table 15 is used to accept or reject the hypothesis that the performance of the C-QDDS algorithm is significant when compared to each of the other approaches. It is observed that, in general, C-QDDS provides superior solution quality when applied to the problems in Tables 10-11 and that the difference is statistically significant at α=0.05. A measure of the effect sizes is provided in Table 16 through the computation of Cohen's d values; to account for small-sample bias, the corrected Hedges' g values have also been reported in Table 17.
In Table 18, the number of successful executions against the obtained cost range for each algorithm is demonstrated for all test functions. The horizontal axis represents a value equivalent to the sum of the lowest cost obtained during the 30 runs of an algorithm and a fraction of the cost range (i.e., highest cost minus lowest cost) ranging from 0.1 through 1 at intervals of 0.1. The vertical axis is the cumulative number of trials that resulted in solutions with lower cost than the corresponding horizontal axis value. For example, the vertical axis value at the horizontal tick of 0.1 is the number of trials having cost values less than {minimum cost + 0.1 × (maximum cost - minimum cost)}. These curves are a measure of the variability of the algorithmic solutions within their reported cost ranges and an indicator of how top-heavy or bottom-heavy they are. It is important to note that the cost range for each algorithm is different on every test function execution, and as such the curves are merely meant to convey an intuitive understanding of the variability of the solutions, not to provide a basis for comparison among the algorithms. Algorithms having the least standard deviation among the cohort are expected to have a uniform density of solutions in the cost range and as such should follow a roughly linear relationship between the variables on the horizontal and vertical axes. It may be noted that C-QDDS, which roughly follows this relationship, indeed has the least standard deviation in many cases, specifically for 14 of the 23 functions, as illustrated in Table 13. This is in congruence with the convergence profiles of QDDS in Figures 1 through 12 of [1], which point out that QDDS is fairly consistent in its ability to converge to local optima of acceptable quality in certain problems.
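The cumulative success-count curves of Table 18 can be reproduced with a short routine; the inclusive comparison at each threshold is an implementation assumption.

```python
import numpy as np

def success_curve(costs, n_points=10):
    """Counts of runs at or below min + f*(max - min) for f = 0.1 ... 1.0.

    costs: final cost of each independent run of one algorithm on one
    test function. Returns the fractions and the cumulative counts.
    """
    costs = np.asarray(costs, dtype=float)
    lo, hi = costs.min(), costs.max()
    fracs = np.linspace(0.1, 1.0, n_points)
    thresholds = lo + fracs * (hi - lo)
    counts = np.array([(costs <= t).sum() for t in thresholds])
    return fracs, counts
```

A roughly linear count curve then indicates a uniform density of solutions over the cost range, which is the behavior noted for C-QDDS above.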
Table 19 shows the trajectory evolution of the global best position across the iterations for each test case using C-QDDS. For ease of visualization, the contours of the 30-dimensional functions as well as the obtained gbest, i.e., global best solutions, are plotted using only the first 2 dimensions. $g_0$ represents the initial gbest position and $g_1$ represents the gbest position upon convergence, given the convergence criteria. The interim gbest position transitions are shown by dotted lines. The solutions to the 23 test problems outlined in the paper are local minima; however, the quality of solutions that the C-QDDS and QDDS algorithms provide to some of these problems is markedly better than those reported in some studies in the literature [4,23,24,25]. A logical next step to improve the optima-seeking capability of the QDDS/C-QDDS approach is to introduce a problem-independent random walk in the $a$ re-computation step of the algorithm instead of using gradient descent.

VIII. NOTES ON CONVERGENCE OF THE ALGORITHM
In this section, we discuss the convergence characteristics of the QDDS algorithm by formulating the algorithmic objective as an optimization problem and proving adherence to the required hypotheses under certain weak assumptions. We start by considering the following problem ℂ:
ℂ: Provided there is a function $f$ from $\mathbb{R}^n$ to $\mathbb{R}$ and that $S$ is a subset of $\mathbb{R}^n$, a solution $x$ in $S$ is sought such that $x$ minimizes $f$ on $S$, or an acceptable approximation of the minimum of $f$ on $S$ is found.
A conditioned approach to solving ℂ was proposed by Solis and Wets [5], which we describe below. The rest of the proof follows logically from [5], as has also been shown by Van den Bergh in [26] and Sun et al. in [27]. The mapping £ is the optimization algorithm and should satisfy the following two hypotheses ℍ1 and ℍ2 in order to be theoretically globally convergent.
Hypothesis ℍ1: Given the algorithmic map £ and a random vector $\xi$, $f(£(x, \xi)) \le f(x)$, and if $\xi \in S$, then $f(£(x, \xi)) \le \min\{f(x), f(\xi)\}$.

Hypothesis ℍ2: For any Borel subset $A$ of $S$ with $v(A) > 0$, it can be proved that:

$$\prod_{k=0}^{\infty} \left(1 - \mu_k(A)\right) = 0$$

where $\mu_k(A)$ is the probability of $A$ being visited under the measure $\mu_k$ generated by the algorithm at step $k$.

Figure 1: The Double Well Potential Setup

Table 1. General Terms used in Context of the Algorithms and Experimentation

Contraction-Expansion Co-efficient (CE): Scale factor influencing the convergence speed of QPSO
Mean Best (mbest): Mean of personal bests across all particles, akin to leader election in species
Personal Best (pbest): Position corresponding to historically best fitness for a swarm member
Global Best (gbest): Position corresponding to best fitness over history for swarm members
Inertia Weight Co-efficient (ω): Facilitates and modulates exploration in the search space
Cognitive Random Perturbation (r1): Random noise injector in the Personal Best attractor
Social Random Perturbation (r2): Random noise injector in the Global Best attractor
Quantum-behaved Particle Swarm Optimization (QPSO)
Quantum Double-Delta Swarm Optimization (QDDS)
In this section, we present the pseudocode of the Chaotic Quantum Double Delta Swarm (C-QDDS) Algorithm. Within Algorithm 1, candidate positions are generated from the values of the preceding iteration using equation (19) at each iteration t. Notation: cheb, the component towards the global best position gbest, drawn from a Chebyshev map; γ, the depth of the wells; a, the co-ordinate of the wells.

Figure 4: Schematic of the C-QDDS Workflow

B. Pseudocode of the C-QDDS Algorithm