Abstract
The harmony search (HS) algorithm is an evolutionary computation technique inspired by music improvisation. So far, it has been applied to various scientific and engineering optimization problems including project scheduling, structural design, energy system operation, car lane detection, ecological conservation, model parameter calibration, portfolio management, banking fraud detection, law enforcement, disease spread modeling, cancer detection, astronomical observation, music composition, fine art appreciation, and sudoku puzzle solving. While there are many application-oriented papers, only a few papers exist on how HS performs in finding optimal solutions. Thus, this preliminary study proposes a new approach to show how HS converges on an optimal solution under specific conditions. Here, we introduce a distance concept and prove the convergence based on the empirical probability. Moreover, a numerical example is provided to explain the theorem clearly.
1. Introduction
In recent years, many researchers have utilized various nature-inspired metaheuristic algorithms for solving scientific and engineering optimization problems. One of the popular algorithms is harmony search (HS), which was inspired by jazz improvisation [1]. Just as a musician plays a musical note from memory or at random, HS generates a value from its memory or at random. Just as a new harmony, which is composed of musical notes, is evaluated at each practice and memorized if it sounds good, a new solution vector, which is composed of values, is evaluated at each iteration and memorized if it performs well. This HS optimization process, which utilizes three basic operations (memory consideration, pitch adjustment, and random selection), continues until it finds an optimal solution vector [2].
When compared with other algorithms, HS shares several similarities with them [1]. HS is similar to tabu search in that it keeps past vectors in a memory called the harmony memory (HM). In addition, HS can use adaptive values of HMCR (Harmony Memory Consideration Rate) and PAR (Pitch Adjustment Rate), which play a role similar to the varying temperature in simulated annealing. Furthermore, both HS and the genetic algorithm (GA) manage multiple vectors simultaneously. However, there are also differences between HS and GA. While HS generates a new vector by considering all the existing vectors, GA generates the new vector by considering only two of the existing vectors (the parents). In addition, HS considers each variable in a vector independently, whereas GA cannot do so because its major operation is crossover, which keeps the gene sequence (multiple variables together).
The HS algorithm has been applied to various optimization problems including project scheduling [3], structural design [4], energy system operation [5], car lane detection [6], ecological conservation [7], model parameter calibration [8], portfolio management [9], banking fraud detection [10], law enforcement [11], disease spread modeling [12], cancer detection [13], astronomical observation [14], music composition [15,16], fine art appreciation [17], and sudoku puzzle solving [18]. Furthermore, there are some application-oriented reviews of HS [19,20,21,22,23,24,25].
While many applications have been proposed so far, only a few studies have been dedicated to the theoretical background of the HS algorithm. Beyer [26] dealt with the expected population variance of several evolutionary algorithms, and Das et al. [27] proposed an approximated variance of the expectation of solution candidates and discussed the exploratory power of the HS algorithm. However, there has been no study discussing the convergence of the HS algorithm, whereas the convergence of other optimization algorithms has been discussed in several studies [28,29,30,31].
As an evolutionary computation algorithm, HS considers an optimization problem of the form
$$\min_{\mathbf{x}} f(\mathbf{x}), \quad \mathbf{x} = (x_1, x_2, \ldots, x_n), \quad\quad (1)$$
where $l_i \le x_i \le u_i$, and $l_i$ and $u_i$ are the lower and upper bounds of the decision variable $x_i$, $i = 1, \ldots, n$.
If all the bounds are infinite, the above problem becomes an unconstrained optimization problem. However, most practical optimization problems adopt variables with certain prefixed ranges of values and, therefore, become box-constrained problems [32].
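As a minimal illustration (with an objective function and bounds chosen only for this sketch, not taken from the references), such a box-constrained problem can be stated in code as follows:

```python
# A hypothetical box-constrained problem: minimize f(x) subject to l_i <= x_i <= u_i.
def f(x):
    # Sphere-like objective, chosen only for illustration
    return sum((xi - 1.0) ** 2 for xi in x)

lower = [-5.0, -5.0, -5.0]   # assumed lower bounds l_i
upper = [5.0, 5.0, 5.0]      # assumed upper bounds u_i

def is_feasible(x):
    """Check the box constraints."""
    return all(l <= xi <= u for xi, l, u in zip(x, lower, upper))

print(f([1.0, 0.5, 2.0]), is_feasible([1.0, 0.5, 2.0]))   # 1.25 True
print(is_feasible([6.0, 0.0, 0.0]))                       # False
```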
In this paper, we propose a convergence theory based on the empirical probability for HS, which is a box-constrained optimization algorithm. In particular, we consider convex objective functions in the single-variable and multiple-variable cases. For this, we define a discrete sequence to prove convergence using a distance. This new approach can be applied to algorithms, such as HS, that keep improved solutions in a candidate storage and iteratively update them.
2. Harmony Search Algorithm
2.1. Basic Structure of Harmony Search
The HS algorithm is an optimization method inspired by musical phenomena. It mimics the performing process that occurs when a musician searches for a better harmonic sound, as in jazz improvisation. Jazz improvisation seeks a musically pleasing harmony (a perfect state) determined by aesthetic standards, just as the optimization process seeks a global solution (a perfect state) determined by an objective function. While the pitch of each instrument determines the aesthetic quality of the harmony, the value of each decision/design variable determines the solution quality measured by the objective function.
In order to optimize a problem using HS, we first define the set of all possible candidate values for each decision variable, called the candidate set (universal set). We assume that each candidate set contains a finite number of discrete values sorted in increasing order.
Here, the memory storage HM of the HS algorithm can be expressed as in Equation (2). To initialize HM, we randomly choose values from the universal set and generate as many vectors as HMS (the harmony memory size, that is, the number of vectors stored in HM). The value of the objective function is also kept next to each solution vector.
Once HM is prepared, it is refreshed with better solution vectors iteration by iteration. If a newly generated vector is better than the worst vector stored in HM in terms of the objective function value, the new vector replaces the worst one.
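A minimal sketch of this memory update rule, with an assumed objective function and illustrative parameter values (not the paper's), might look as follows:

```python
import random

random.seed(42)

def f(x):
    # Assumed objective to minimize: sum of squared decision variables
    return sum(xi ** 2 for xi in x)

HMS = 4                                         # harmony memory size
candidates = [i * 0.1 for i in range(-50, 51)]  # assumed discrete candidate set per variable
n_vars = 2

# Initialize HM with HMS random vectors; keep the objective value next to each vector
hm = []
for _ in range(HMS):
    x = [random.choice(candidates) for _ in range(n_vars)]
    hm.append((x, f(x)))

def update_hm(hm, x_new):
    """Replace the worst stored vector if the new vector performs better."""
    worst_idx = max(range(len(hm)), key=lambda i: hm[i][1])
    if f(x_new) < hm[worst_idx][1]:
        hm[worst_idx] = (x_new, f(x_new))

update_hm(hm, [0.1, -0.2])
print(hm)
```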
The value of each variable can be randomly selected from the set of all candidate discrete values with a probability equal to the random selection rate. Otherwise, the value can be selected from the set of stored values in the ith column of HM with the probability with which the value is selected solely from HM, or, once a value is selected from that column, it can be slightly adjusted by moving it to a neighboring candidate value with the probability with which the value is selected using pitch adjustment, as follows:
where the adjusting index is some non-zero integer whose magnitude does not exceed a predetermined adjustment parameter. That is, the selected value is moved up or down by one or more positions in the candidate set. For example, if the adjustment parameter equals one, then the value is moved to either the immediately lower or the immediately higher neighboring candidate value.
These probabilities are determined by the two parameters HMCR and PAR: the random selection rate equals 1 − HMCR, the probability of pure memory consideration equals HMCR × (1 − PAR), and the probability of pitch adjustment equals HMCR × PAR.
Here, for the pitch adjustment, we first draw a random number from a uniform distribution between 0 and 1 and use it to select a value from the ith column of HM. Then, we identify the index of the corresponding value in the candidate set. Next, we further tweak it into a neighboring candidate value as described above.
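The following sketch illustrates how the value of one variable might be generated using these three operations and the probability split described above; the parameter values, candidate set, and HM column are illustrative assumptions:

```python
import random

HMCR, PAR = 0.9, 0.3                  # illustrative parameter values
candidates = [1, 2, 3, 5, 8, 13, 21]  # assumed discrete candidate set for variable i
hm_column = [2, 5, 8, 13]             # assumed ith column of HM

def generate_value():
    if random.random() < HMCR:        # memory consideration (probability HMCR)
        value = random.choice(hm_column)
        if random.random() < PAR:     # pitch adjustment (overall probability HMCR * PAR)
            k = candidates.index(value)
            k = min(max(k + random.choice([-1, 1]), 0), len(candidates) - 1)
            value = candidates[k]
        return value
    return random.choice(candidates)  # random selection (probability 1 - HMCR)

print([generate_value() for _ in range(10)])
```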
This is the basic structure of the HS algorithm [33]. Although there are many structural variants of the HS algorithm [34], most variants basically have the above-mentioned three operations: memory consideration, pitch adjustment, and random selection.
2.2. Solution Formula for Harmony Search without Pitch Adjustment
Let us formulate the solution for the HS algorithm without a pitch adjustment case.
At the first generation, the value of each variable will be chosen with the given probabilities as follows:
where the stored values come from the initial harmony memory. Let the solution at the gth generation stage be chosen with the given probabilities; then
where the left-hand side denotes the newly updated column of HM after the gth generation. At the first generation, the new value is obtained by one of two operations (random selection or memory consideration). If the newly generated vector is better than the worst vector in HM in terms of the objective function value, then the worst vector will be replaced with the new one. Otherwise, the worst vector will stay in the memory.
The element that we obtain from this comparison after the first generation will be represented as follows:
Following a similar procedure, at the gth generation, if the newly generated vector is better than the worst solution vector in HM, then the worst vector will be replaced with the new one. Otherwise, the worst element will stay in the memory.
The element that we obtain from this comparison after the gth generation will be represented as follows:
Using an indicator function, the solution formula in Equation (7) can be equivalently represented by:
where
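For concreteness, an update of this indicator-function form can be written, in generic notation rather than the exact symbols of Equation (7), as
$$x^{(g)} = x^{\mathrm{new}}\,\mathbb{1}\{f(x^{\mathrm{new}}) < f(x^{\mathrm{worst}})\} + x^{\mathrm{worst}}\,\mathbb{1}\{f(x^{\mathrm{new}}) \ge f(x^{\mathrm{worst}})\},$$
where $\mathbb{1}\{\cdot\}$ equals 1 when its argument holds and 0 otherwise, $x^{\mathrm{new}}$ is the newly generated value, and $x^{\mathrm{worst}}$ is the worst stored value.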
2.3. Solution Formula for Harmony Search with Pitch Adjustment
Next, let us formulate the solution for the HS algorithm with a pitch adjustment case.
At the first generation, the value of each variable will be chosen with the given probabilities as follows:
where the stored values come from the initial harmony memory. Then, let the solution at the gth generation stage be chosen with the given probabilities, in which:
where the left-hand side denotes the newly updated column of HM after the gth generation. At the first generation, the new value is obtained by one of three operations (random selection, memory consideration, or pitch adjustment). If the newly generated vector is better than the worst one in HM in terms of the objective function value, then the worst vector is replaced with the new one. Otherwise, the worst vector stays in the memory. That is,
Therefore, after the first generation, HM will be updated by substituting the worst vector with the new one. Then, after the gth generation,
where the replaced element is the solution vector that performs the worst in the current HM. Therefore, after the gth generation, HM will be updated by substituting the worst vector with the new one.
Then, using an indicator function, the solution formula in Equation (13) can be represented by:
where
3. Empirical Convergence of Harmony Search
3.1. One Variable Case
In this section, we discuss the behavior of the solutions in HM. Let us first consider an objective function of one discrete variable. Without loss of generality, we can assume that the candidate values are sorted in increasing order after rearrangement, so that we have subintervals separated by neighboring candidate values (see Figure 1). In addition, the values stored in HM are assumed to be sorted after rearrangement. The element of HM at the gth generation will be written with the generation superscript.
Figure 1.
Elements of the candidate set and the subintervals.
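To make this setup concrete before the formal definitions below, the following sketch builds a small sorted candidate set (with assumed values) and computes the subinterval lengths along with their minimum and maximum:

```python
# Assumed sorted candidate set for one discrete variable
candidates = [0.0, 0.4, 1.0, 1.1, 2.5, 3.0, 4.2]

# Lengths of the subintervals separated by neighboring candidate values
subintervals = [b - a for a, b in zip(candidates, candidates[1:])]

ell_min, ell_max = min(subintervals), max(subintervals)  # minimum / maximum subinterval length
total_range = candidates[-1] - candidates[0]             # total range of the candidate set

print("subinterval lengths:", [round(s, 3) for s in subintervals])
print("min:", round(ell_min, 3), "max:", round(ell_max, 3), "total range:", round(total_range, 3))
assert abs(sum(subintervals) - total_range) < 1e-9       # the lengths sum to the total range
```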
Here, let us define the total ranges of the candidate set and HM. In addition, we define the distances between neighboring candidate values, which are the lengths of the subintervals, as follows:
and we define the minimum and maximum subinterval lengths as follows:
Here, note that we have an important relation as follows:
As mentioned, we deal with a convex function of discrete variables and seek its minimum optimal solution. Note, however, that the theory can also be applied to a concave function for the maximum optimal solution without loss of generality. If the objective function attains its minimum at the optimal solution, then this solution can be located in four different ways. Figure 2 shows these four cases: (a) and (b) show convex objective functions. In this paper, we focus on the convex cases where the optimal solution is located in the interior of the range of HM, which also applies to the concave cases. However, the theory also applies to monotone increasing/decreasing objective functions such as (c) and (d) as trivial cases, where the optimal solution is located at the right or left end point.
Figure 2.
Candidate solutions and the worst element of one-variable functions.
By Equations (5) and (7), the worst element of HM is replaced with a better element as the generation number is increased. Even though not every generation replaces the worst element of HM with a new one, the worst element of HM is eventually replaced as the generation number increases. Now, let us consider the case when HM is updated at the gth generation. From Figure 2, we can observe the following properties.
- (a)
- & (c): In these cases, the left-end smallest-value point of HM is the worst element, so it will be replaced by some value that shows better performance than it. The smallest-value element can be newly updated by memory consideration, pitch adjustment, or random selection. Consider that we are at the gth generation. Then, the previous smallest value is denoted with the superscript (g − 1), and the newly updated smallest value of HM is denoted with the superscript g. It is clear that the new smallest value is greater than or equal to the previous one.
- (b)
- & (d): In these cases, the right-end largest-value point of HM is the worst element, so it will be replaced by some value that shows better performance than it. The largest-value element can be updated by memory consideration, pitch adjustment, or random selection. Consider that we are at the gth generation. Then, the previous largest value is denoted with the superscript (g − 1), and the newly updated largest value of HM is denoted with the superscript g. It is clear that the new largest value is less than or equal to the previous one. Now we prove the following theorem.
Theorem 1.
Assume that there exists exactly one solution in the candidate set for the objective function. If the length of HM at the gth generation is measured by the range of its stored values, then
- (i)
- the length of HM forms a monotone decreasing sequence as the generation number is increased.
- (ii)
- Furthermore, the solutions in HM converge.
Proof.
(i) As we have seen from Equations (21) and (23), the length of HM satisfies a non-increasing relation, which means that the sequence of HM lengths is a monotone decreasing sequence.
Note that not every generation updates HM. Therefore, we need new notation for the generations that do update HM. Suppose the worst element of HM is replaced by a new element at a generation such that
then the set of such generation numbers collects the generations that update HM, where each update reduces the length of HM. It is clear that these generation numbers form a subsequence of the whole sequence of generations.
Note that, if there is more than one worst solution in HM, then even if HM is updated, Equation (25) may not be satisfied. In that case, the corresponding generation number is not an element of the set defined above.
Now, we consider that, at each updating generation, either the smallest value or the largest value stored in HM is replaced. In either case, we have
Furthermore, for the smallest subinterval length, it is easily seen that:
Here, we need to check the existence of such updating generations because we are not sure whether HM is steadily updated as the generation number is increased. That is, we are not sure whether a better value is guaranteed to be selected at a certain generation. This is finally guaranteed based on the relation between the empirical probability and the theoretical probability: if we repeat the generations, a better value is surely selected at some generation. Therefore, each generation number that updates HM exists.
Meanwhile, we have the following property:
HM is updated only at the updating generations. Therefore, the length of HM remains equal to its initial value until the first updating generation, because that is the first generation which updates HM, and the length strictly decreases at that generation, where the initial value is the length of the initial HM. Since the next updating generation is the second one that makes HM updated, the length strictly decreases again at that generation.
If we repeat this procedure for each updating generation, we get:
where the starting value is the initial range of HM. Here, the lengths at the updating generations form a subsequence of the lengths at all generations. A zero initial range is unreasonable from a practical point of view; therefore, we assume that the initial range is positive. Since the subsequence of lengths is monotonically decreasing, the whole sequence of lengths is monotonically decreasing as well. Furthermore, it is bounded below by 0. Note that we also have:
Now, let us define the following quantity:
Since the length of HM decreases as the generation number increases, this quantity can become negative, but we consider only the case in which it is positive. Since the stored values are discrete, the quantity forms a positive decreasing sequence (see Figure 3a). As the generation number is increased, the updating generations exist empirically. Therefore, every time HM is updated, the sequence decreases. That is, as the generation number increases,
Figure 3.
Convergence process when the optimal solution is located at the left end side.
Assume that the solutions in HM converge after a certain generation. Then,
It is also true that:
because the bound involves the minimum subinterval length. Therefore, we continue the procedure even after the solutions in HM converge, until the quantity becomes very small but remains positive. We repeat the generations several more times until we satisfy:
for some positive integer. However, Equation (36) is not possible because of our assumption, so we stop the procedure after finitely many more generations. At that point, the quantity is still a positive constant, as in Figure 3b. But,
Since we have one of the two cases, Equations (35)–(37) guarantee that the quantity is now 0. This means that, at that generation, it is guaranteed that:
In fact, Equation (38) has already been satisfied from an earlier generation, but the convergence is now finally guaranteed mathematically by inequality (34). That is, the values in HM are convergent before that generation. Therefore, we can conclude that there exists some generation number such that the sequence of solutions in HM is convergent.
Therefore, the values in HM converge empirically, which proves the theorem. □
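As an informal numerical check of Theorem 1 (not part of the original proof), the following sketch runs a simplified one-variable HS with memory consideration, pitch adjustment, and random selection on an assumed discrete convex function and prints the range of the values stored in HM as the generations proceed; the candidate set, objective, and parameter values are illustrative assumptions:

```python
import random

random.seed(1)
candidates = [round(0.5 * k, 1) for k in range(21)]  # assumed candidate set {0.0, 0.5, ..., 10.0}

def f(x):
    return (x - 3.0) ** 2                            # assumed discrete convex objective, x* = 3.0

HMS, HMCR, PAR, GENERATIONS = 5, 0.9, 0.3, 300       # illustrative parameter values
hm = random.sample(candidates, HMS)                  # initialize harmony memory

for g in range(GENERATIONS):
    if random.random() < HMCR:                       # memory consideration
        x_new = random.choice(hm)
        if random.random() < PAR:                    # pitch adjustment to a neighboring candidate
            k = candidates.index(x_new)
            k = min(max(k + random.choice([-1, 1]), 0), len(candidates) - 1)
            x_new = candidates[k]
    else:                                            # random selection
        x_new = random.choice(candidates)
    worst = max(hm, key=f)
    if f(x_new) < f(worst):                          # replace the worst value if improved
        hm[hm.index(worst)] = x_new
    if g % 50 == 0:
        print(f"g={g:3d}  HM range = {max(hm) - min(hm):.1f}")

print("final HM:", sorted(hm))                       # values concentrate near x* = 3.0
```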
Remark.
In general probability theory, for each event E of the sample space S, we define n(E) to be the number of times that the event E occurs in the first n repetitions of the experiment. Then P(E), which is the probability of the event E, is defined as:
Here, P(E) is defined as the (limiting) proportion of times that E occurs; it is, thus, the limiting frequency of E. P(E) is called the “theoretical probability” and n(E)/n is called the “empirical probability.” The empirical probability approaches the theoretical probability as the number of experiments is increased. This means that, when the repetition number n is small, the event E may not occur. However, if we repeat the experiment many times, i.e., if n is large enough, the ratio between n(E) and n converges. This guarantees that E will occur (indeed, n(E) times) if we repeat the experiment many times. Therefore, even when HM does not include the solution that minimizes the objective function, if the candidate set includes the unique solution, then it is surely selected as we repeat the iterations. That is:
where the event E is the event of selecting the unique solution. In addition, Equation (25) guarantees that it should be selected as the number of generations increases.
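As a quick numerical illustration of this remark (the candidate-set size and target index below are arbitrary assumptions), the empirical frequency of selecting one particular candidate value by uniform random selection approaches its theoretical probability as the number of repetitions grows:

```python
import random

random.seed(0)
K = 20                  # assumed number of candidate values
target = 7              # index of the (unique) solution in the candidate set
theoretical_p = 1 / K   # probability of picking it by one uniform random selection

for n in (10, 100, 1_000, 10_000, 100_000):
    hits = sum(1 for _ in range(n) if random.randrange(K) == target)
    print(f"n={n:>6}  empirical n(E)/n = {hits / n:.4f}  theoretical P(E) = {theoretical_p:.4f}")
```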
Example 1.
Let us consider the situation when the minimum value of the objective function is at the right end side, as in Figure 2c. Define:
Figure 4.
Convergence process in Example 1.
Furthermore, we get:
Therefore, this follows from Equation (43). However, from
we get a value that is not yet very small. Therefore, we repeat the procedure s more times until we satisfy Equations (35) and (36). The required number of repetitions can be easily calculated. Hence,
One of the two cases holds by the definition given above. Therefore, Equation (45) guarantees the convergence. Figure 4 shows the convergence of HM in Example 1. In fact, the solutions in HM have already converged, but this has not yet been proven mathematically. It is finally shown after 15 more generations, as follows.
Therefore, there exists a positive integer such that the solution sequence in HM is convergent. Example 1 thus confirms Theorem 1.
3.2. Multiple Variable Case
Let the objective function be a function of n variables, and let each candidate solution be a vector from the candidate set. Let us consider the norm of each solution vector as follows:
Without loss of generality, we can assume that the candidate vectors are sorted by their norms after rearrangement, so that we have distances between neighboring vectors. The distance between two vectors is defined by:
In addition, the vectors stored in HM are sorted in the same manner after rearrangement. The vector of HM at the gth generation will be written with the generation superscript.
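The following sketch illustrates this setup under the assumption of the Euclidean norm and Euclidean distance (one common choice; the paper's exact definitions are given in the surrounding equations), with a few assumed candidate vectors:

```python
import math

# Assumed candidate solution vectors with two variables each
candidates = [(1.0, 2.0), (2.0, 1.5), (3.0, 3.0), (0.5, 0.5)]

def norm(v):
    """Euclidean norm of a solution vector (an assumed choice of norm)."""
    return math.sqrt(sum(x * x for x in v))

def distance(u, v):
    """Euclidean distance between two candidate vectors (an assumed choice of distance)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

# Sort the candidates by norm, as in the rearrangement described above
ordered = sorted(candidates, key=norm)

# Distances between neighboring vectors play the role of subinterval lengths
gaps = [distance(u, v) for u, v in zip(ordered, ordered[1:])]
print("norms:", [round(norm(v), 3) for v in ordered])
print("neighboring distances:", [round(g, 3) for g in gaps])
print("min gap:", round(min(gaps), 3), " max gap:", round(max(gaps), 3))
```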
Here, let us define the total ranges of the candidate set and HM. Similarly, without loss of generality, we focus on the minimization case for this optimal solution problem in this section because the theory can easily be applied to the maximization case. In addition, we define the distances between neighboring candidate solution vectors, which are the lengths of the subintervals, as follows:
Then, we define the minimum and maximum subinterval lengths as follows:
Here, note that we have an important relation as follows:
- (i)
- If the worst vector is the one with the smallest norm, then it will be replaced by some vector that shows better performance than it. Therefore, the vector which has the smallest norm will be newly replaced. Consider that we are at the gth generation; then the previous smallest-norm vector is denoted with the superscript (g − 1), and the newly updated smallest-norm vector of HM will be denoted with the superscript g. Here,
It is clear that
- (ii)
- If the worst vector is the one with the largest norm, then it will be replaced by another vector which shows better performance than it. Therefore, the vector which has the largest norm will be newly replaced. Consider that we are at the gth generation. Then, the previous largest-norm vector is denoted with the superscript (g − 1), and the newly updated largest-norm vector of HM will be denoted with the superscript g. Here,
It is clear that:
Corollary 2. Assume that there exists exactly one solution vector in the candidate set for the objective function. In addition, if the length of HM at the gth generation is defined analogously, then
- (i)
- the length of HM forms a monotone decreasing sequence as the generation number is increased.
- (ii)
- Furthermore, the values in HM converge.
Proof.
Based on the notation defined in Equations (47)–(51), the proof follows directly from Theorem 1. □
4. Conclusions
In this communication, we employed the distance concept and proved the convergence of the HS algorithm based on the empirical probability. The solution behavior of HS for one or more discrete variables was discussed, and the given theorem was demonstrated with a numerical example.
In future studies, we will expand the theorem to include non-discrete variables, multi-modal functions, and adaptive parameters [35].
Author Contributions
J.H.Y. developed the conceptualization, proving the theorems, and drafted the manuscript. Moreover, supervising, reviewing, and editing were done by Z.W.G. All authors have read and agreed to the published version of the manuscript.
Funding
This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (No. 2020R1A2C1A01011131).
Institutional Review Board Statement
Not applicable.
Informed Consent Statement
Not applicable.
Data Availability Statement
Not applicable.
Conflicts of Interest
The authors declare no conflict of interest.
References
- Geem, Z.W.; Kim, J.H.; Loganathan, G.V. A new heuristic optimization algorithm: Harmony search. Simulation 2001, 76, 60–68. [Google Scholar] [CrossRef]
- Lee, K.S.; Geem, Z.W. A new metaheuristic algorithm for continuous engineering optimization: Harmony search theory and practice. Comput. Methods Appl. Mech. Eng. 2004, 194, 3902–3933. [Google Scholar] [CrossRef]
- Geem, Z.W. Multiobjective Optimization of Time-Cost Trade-Off Using Harmony Search. J. Constr. Eng. Manag. ASCE 2010, 136, 711–716. [Google Scholar] [CrossRef]
- Geem, Z.W. Harmony Search Algorithms for Structural Design Optimization; Springer: Berlin, Germany, 2009. [Google Scholar]
- Nazari-Heris, M.; Mohammadi-Ivatloo, B.; Asadi, S.; Kim, J.-H.; Geem, Z.W. Harmony search algorithm for energy system applications: An updated review and analysis. J. Exp. Theor. Artif. Intell. 2019, 31, 723–749. [Google Scholar] [CrossRef]
- Moon, Y.Y.; Geem, Z.W.; Han, G.-T. Vanishing point detection for self-driving car using harmony search algorithm. Swarm Evol. Comput. 2018, 41, 111–119. [Google Scholar] [CrossRef]
- Geem, Z.W. Can Music Supplant Math in Environmental Planning? Leonardo 2015, 48, 147–150. [Google Scholar] [CrossRef]
- Lee, W.-Y.; Ko, K.-E.; Geem, Z.W.; Sim, K.-B. Method that determining the Hyperparameter of CNN using HS Algorithm. J. Korean Inst. Intell. Syst. 2017, 27, 22–28. [Google Scholar] [CrossRef]
- Tuo, S.H. A Modified Harmony Search Algorithm for Portfolio Optimization Problems. Econ. Comput. Econ. Cybern. Stud. Res. 2016, 50, 311–326. [Google Scholar]
- Daliri, S. Using Harmony Search Algorithm in Neural Networks to Improve Fraud Detection in Banking System. Comput. Intell. Neurosci. 2020, 2020, 6503459. [Google Scholar] [CrossRef] [PubMed]
- Shih, P.-C.; Chiu, C.-Y.; Chou, C.-H. Using Dynamic Adjusting NGHS-ANN for Predicting the Recidivism Rate of Commuted Prisoners. Mathematics 2019, 7, 1187. [Google Scholar] [CrossRef]
- Fairchild, G.; Hickmann, K.S.; Mniszewski, S.M.; Del Valle, S.Y.; Hyman, J.M. Optimizing human activity patterns using global sensitivity analysis. Comput. Math. Organ Theory 2014, 20, 394–416. [Google Scholar] [CrossRef] [PubMed][Green Version]
- Elyasigomari, V.; Lee, D.A.; Screen, H.R.C.; Shaheed, M.H. Development of a two-stage gene selection method that incorporates a novel hybrid approach using the cuckoo optimization algorithm and harmony search for cancer classification. J. Biomed. Inform. 2017, 67, 11–20. [Google Scholar] [CrossRef] [PubMed]
- Deeg, H.J.; Moutou, C.; Erikson, A.; Csizmadia, S.; Tingley, B.; Barge, P.; Bruntt, H.; Havel, M.; Aigrain, S.; Almenara, J.M.; et al. A transiting giant planet with a temperature between 250 K and 430 K. Nature 2010, 464, 384–387. [Google Scholar] [CrossRef] [PubMed]
- Geem, Z.W.; Choi, J.Y. Music Composition Using Harmony Search Algorithm. Lect. Notes Comput. Sci. 2007, 4448, 593–600. [Google Scholar]
- Navarro, M.; Corchado, J.M.; Demazeau, Y. MUSIC-MAS: Modeling a harmonic composition system with virtual organizations to assist novice composers. Expert Syst. Appl. 2016, 57, 345–355. [Google Scholar] [CrossRef]
- Koenderink, J.; van Doorn, A.; Wagemans, J. Picasso in the mind’s eye of the beholder: Three-dimensional filling-in of ambiguous line drawings. Cognition 2012, 125, 394–412. [Google Scholar] [CrossRef] [PubMed]
- Geem, Z.W. Harmony Search Algorithm for Solving Sudoku. Lect. Notes Comput. Sci. 2007, 4692, 371–378. [Google Scholar]
- Geem, Z.W. Music-Inspired Harmony Search Algorithm: Theory and Applications; Springer: New York, NY, USA, 2009. [Google Scholar]
- Manjarres, D.; Landa-Torres, I.; Gil-Lopez, S.; Del Ser, J.; Bilbao, M.N.; Salcedo-Sanz, S.; Geem, Z.W. A Survey on Applications of the Harmony Search Algorithm. Eng. Appl. Artif. Intell. 2013, 26, 1818–1831. [Google Scholar] [CrossRef]
- Askarzadeh, A. Solving electrical power system problems by harmony search: A review. Artif. Intell. Rev. 2017, 47, 217–251. [Google Scholar] [CrossRef]
- Yi, J.; Lu, C.; Li, G. A literature review on latest developments of Harmony Search and its applications to intelligent manufacturing. Math. Biosci. Eng. 2019, 16, 2086–2117. [Google Scholar] [CrossRef]
- Ala’a, A.; Alsewari, A.A.; Alamri, H.S.; Zamli, K.Z. Comprehensive Review of the Development of the Harmony Search Algorithm and Its Applications. IEEE Access 2019, 7, 14233–14245. [Google Scholar]
- Alia, M.; Mandava, R. The variants of the harmony search algorithm: An overview. Artif. Intell. Rev. 2011, 36, 49–68. [Google Scholar] [CrossRef]
- Gao, X.Z.; Govindasamy, V.; Xu, H.; Wang, X.; Zenger, K. Harmony Search Method: Theory and Applications. Comput. Intell. Neurosci. 2015, 2015, 258491. [Google Scholar] [CrossRef] [PubMed]
- Beyer, H.-G. On the dynamics of EAs without selection. In Foundations of Genetic Algorithms; Banzhaf, W., Reeves, C., Eds.; Morgan Kaufmann: San Francisco, CA, USA, 1999; Volume 5, pp. 5–26. [Google Scholar]
- Das, S.; Mukhopadhyay, A.; Roy, A.; Abraham, A. Exploratory Power of the Harmony Search Algorithm: Analysis and Improvements for Global Numerical Optimization. IEEE Trans. Syst. Man Cybern. Part B Cybern. 2011, 41, 89–106. [Google Scholar] [CrossRef]
- Wu, C.F.J. On the convergence properties of the EM algorithm. Ann. Stat. 1983, 11, 95–103. [Google Scholar] [CrossRef]
- Bull, A.D. Convergence Rates of Efficient Global Optimization Algorithms. J. Mach. Learn. Res. 2011, 12, 2879–2904. [Google Scholar]
- Trelea, I.C. The particle swarm optimization algorithm: Convergence analysis and parameter selection. Inf. Process. Lett. 2003, 85, 317–325. [Google Scholar] [CrossRef]
- Zhang, X.; Zheng, X.; Cheng, R.; Qiu, J.; Jin, Y. A competitive mechanism based multi-objective particle swarm optimizer with fast convergence. Inf. Sci. 2018, 427, 63–76. [Google Scholar] [CrossRef]
- Facchinei, F.; Júdice, J.; Soares, J. Generating Box-Constrained Optimization Problems. ACM Trans. Math. Softw. 1997, 23, 443–447. [Google Scholar] [CrossRef]
- Geem, Z.W. Novel derivative of harmony search algorithm for discrete design variables. Appl. Math. Comp. 2008, 199, 223–230. [Google Scholar] [CrossRef]
- Zhang, T.; Geem, Z.W. Review of Harmony Search with Respect to Algorithm Structure. Swarm Evol. Comput. 2019, 48, 31–43. [Google Scholar] [CrossRef]
- Almeida, F.; Giménez, D.; López-Espín, J.J.; Pérez-Pérez, M. Parameterized Schemes of Metaheuristics: Basic Ideas and Applications with Genetic Algorithms, Scatter Search, and GRASP. IEEE Trans. Syst. Man Cybern. Syst. 2013, 43, 570–586. [Google Scholar] [CrossRef]
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).