Hyper-Heuristic Based on ACO and Local Search for Dynamic Optimization Problems

Hyper-heuristics comprise a set of approaches that are motivated (at least in part) by the objective of intelligently combining heuristic methods to solve hard optimization problems. Ant colony optimization (ACO) algorithms have proven to deal well with Dynamic Optimization Problems (DOPs). Despite the good results obtained by integrating local search operators with ACO, little has been done to tackle DOPs in this way. In this research, one of the most reliable ACO schemes, the MAX-MIN Ant System (MMAS), is integrated with advanced and effective local search operators, resulting in an innovative hyper-heuristic. The local search operators, the Lin-Kernighan (LK) and the Unstringing and Stringing (US) heuristics, are intelligently chosen to improve the solution obtained by ACO. The proposed method combines the adaptation capabilities of ACO for DOPs with the good performance of the chosen local search operators, which are selected adaptively based on their performance, thus creating a hyper-heuristic. The travelling salesman problem (TSP) is used as the base problem to generate both symmetric and asymmetric dynamic test cases. Experiments show that MMAS provides good initial solutions to the local search operators and that the hyper-heuristic is a robust and effective method for the vast majority of test cases.


Introduction
Hyper-heuristics comprise a set of approaches with the common goal of automating the design and tuning of heuristic methods to solve hard computational search problems. Their main goal is to produce more broadly applicable search methodologies. The distinguishing feature of hyper-heuristics is that they operate on a search space of heuristics (or heuristic components) rather than directly on the search space of solutions to the underlying problem, as most meta-heuristic approaches do [1]. For an updated survey on hyper-heuristics for Dynamic Optimization Problems (DOPs), see [2].
Ant colony optimization (ACO) algorithms have proved capable of finding the optimal (or near-optimal) solution for difficult combinatorial optimization problems (e.g., the static travelling salesman problem (STSP) [3]). In this research, we utilize the travelling salesman problem (TSP) as the base problem to generate both symmetric and asymmetric dynamic test cases. The STSP has been studied extensively over the last few decades [4] and is one of the most challenging NP-complete combinatorial optimization problems. On the whole, the literature has dealt only with static problems, without dynamic changes, i.e., the instances do not change during problem solving. However, several real-world applications experience changes during the optimization process (e.g., traffic restrictions), making the problem even more difficult. DOPs are challenging. This paper is an extended version of our paper [8], which presents a single local search operator (LSO) improving the best solution found by an Ant Colony Optimization framework and compares the use of two such local search operators (Lin-Kernighan and Unstringing and Stringing) independently. The hyper-heuristic proposed here (called HULK, Hyper-heuristic with US and LK as local search operators) differs substantially from the conference paper in the intelligence applied to the choice of LSOs and in the order in which they are applied, both of which rely on a self-adaptive function. Another major difference lies in the instances and in the frequency of dynamic changes: the instances used here allow arc blocking, and the frequency of dynamic changes has dropped from 10^4 to 100 evaluations, making HULK faster without losing quality. This new HULK is able to outperform the methods presented in the conference paper, and a whole set of new results is presented here.
HULK improves on its predecessors by combining the best features of LK and US, producing much better solutions, mainly in the asymmetric dynamic change environment. To keep the article self-contained, the explanations of the LSOs are the same as in the conference paper, and we also present some earlier results to allow for a better understanding of the computational experiments. The paper highlights can be listed as follows:
• We integrate one of the best ACO variations with advanced and effective local search operators, i.e., the Lin-Kernighan and the Unstringing and Stringing heuristics, resulting in a powerful hyper-heuristic (HULK).
• The proposed HULK combines the adaptation capabilities of ACO for DOPs with the superior performance of the local search operators. This is done through a smart, self-adaptive choice of the LSO to apply, using a weighted roulette wheel based on the objective function values of previous solutions.
• We include arc blocking and reduce the frequency of dynamic changes without losing quality.
• The proposed method provides better solutions, especially in asymmetric dynamic test cases.
We use the TSP as the base problem to systematically generate dynamic test cases [11]. In the literature, most dynamic changes are symmetric [9,11,35,36]. However, in the real world they are often asymmetric [8,10,37]. For example, in pick-up or delivery problems, the travel time to reach some points can change drastically depending on the direction travelled, due to traffic conditions. For the sake of comparison, both symmetric and asymmetric dynamic changes are considered. The rest of the paper is organized as follows. Section 2 describes the TSP and the proposed dynamic test cases. Section 3 describes the hyper-heuristic method using one of the best variations of ACO as the base. To make the paper self-contained, we also provide the core logic of the two local search operators. Section 4 shows the experimental results and analysis. Finally, the conclusions are presented in Section 5.

Base Problem Formulation
As mentioned before, the TSP is used as the base problem to generate dynamic test cases. Typically, a TSP instance is modelled by a fully connected weighted graph G = (N, A), where N = {1, . . . , n} is a set of n nodes and A = {(i, j) | i, j ∈ N, i ≠ j} is a set of arcs. Each arc (i, j) ∈ A is associated with a non-negative value w_ij ∈ ℝ+, which for the classic TSP represents the distance between nodes i and j. When an arc is blocked or does not exist, we set its associated value to a very large number. In symmetric TSPs, w_ij = w_ji for every pair of nodes; when w_ij ≠ w_ji for at least one pair of nodes, the TSP becomes asymmetric. Every problem instance consists of a weight matrix W = {w_ij}_{n×n} containing all the weights associated with the arcs of the corresponding graph G. A classical mathematical model for the TSP was proposed by [38], later modified and relaxed by the authors of [39] to make it easier to solve, and recently used by [40]. The formulation of the TSP is an integer linear program using n^2 binary variables x_ij, where x_ij = 1 if and only if arc (i, j) (i = 1, . . . , n; j = 1, . . . , n) is in the optimal tour, and x_ij = 0 otherwise. The objective function minimizes the route cost, as shown in Equation (2).
where constraints (3) and (4) impose that the in-degree and out-degree of each vertex, respectively, are equal to one, while constraints (5) are sub-tour elimination constraints, imposing that no partial circuit exists. The variable domains are expressed by constraints (6).
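Equations (2)-(6) are not reproduced in this text. A standard assignment-based statement consistent with the description above (objective, in-degree and out-degree constraints, sub-tour elimination, and binary domains), in the spirit of the classical formulation [38], is the following sketch:

```latex
\begin{align}
\min \quad & \sum_{i=1}^{n} \sum_{j \neq i} w_{ij}\, x_{ij} && (2)\\
\text{s.t.} \quad & \sum_{i \neq j} x_{ij} = 1 \quad \forall j \in N && (3)\\
& \sum_{j \neq i} x_{ij} = 1 \quad \forall i \in N && (4)\\
& \sum_{i \in S} \sum_{j \in S,\, j \neq i} x_{ij} \le |S| - 1
  \quad \forall S \subset N,\ 2 \le |S| \le n-1 && (5)\\
& x_{ij} \in \{0, 1\} \quad \forall i, j \in N && (6)
\end{align}
```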

Generating Dynamic Test Cases
To transform a problem instance into a dynamic instance, we modify the weight matrix as follows [11]: T = t/f is the period of a dynamic change, where t is the counter of algorithmic evaluations and f is the frequency with which changes occur. Note that, since the dynamic changes are synchronized with the optimization process, the parameter f is expressed in algorithmic steps. A specific dynamic test case can be generated by blocking a set of arcs and/or assigning an increasing or decreasing factor value to the arc connecting nodes i and j as follows: B_S(T) ⊂ A is the set of arcs to be blocked, MAX is a sufficiently large constant, w_ij(0) is the original weight of arc (i, j) (for the static problem instance, when T = 0), R_ij is a normally distributed random number (with zero mean and standard deviation set to 0.2 · w_ij(0) [37]) that defines the modification factor of the arc, A_S(T) ⊂ A is the set of arcs randomly selected for change in that period, and T is the period index (see Equation (7)). The difference from the instances presented in [8], which do not allow arc blocking, is the inclusion of the first line on the right-hand side of Equation (8).
The number of blocked arcs corresponds to 1% of the arcs selected for dynamic change.
The size of the changeable arc set A is determined by the number of arcs (n(n − 1)/2 for symmetric and n(n − 1) for asymmetric cases). Consequently, the size of A_S(T) is defined by the size of A and the magnitude of change m ∈ [0, 1]. For example, in a period T, exactly mn(n − 1)/2 and mn(n − 1) arcs are selected to have their weights changed in the symmetric and asymmetric cases, respectively. The higher the value of m, the more arcs change value. Recall that if w_ij changes in the symmetric case, then w_ji changes uniformly (i.e., w_ji = w_ij). On the contrary, when w_ij changes in the asymmetric case, w_ji does not change unless arc (j, i) is itself selected for a change (and not necessarily uniformly with w_ij). The same applies when blocked arcs are introduced, in order to maintain the symmetry or asymmetry of the problem.
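The generator described above (Equations (7)-(8)) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function and variable names are ours, and the additive perturbation w_ij(0) + R_ij is an assumption about how the "modification factor" is applied.

```python
import random

MAX_W = 10**9  # "sufficiently large" weight used for blocked arcs

def apply_dynamic_change(w0, w, m, symmetric=True, block_ratio=0.01, rng=random):
    """Perturb weight matrix `w` in place for one period T.

    w0 : original (T = 0) weight matrix, used for the perturbation scale
    m  : magnitude of change, i.e., the fraction of arcs selected per period
    """
    n = len(w)
    # n(n-1)/2 candidate arcs in the symmetric case, n(n-1) otherwise
    arcs = [(i, j) for i in range(n)
            for j in range(i + 1 if symmetric else 0, n) if i != j]
    k = int(m * len(arcs))
    if k == 0:
        return w
    changed = rng.sample(arcs, k)
    # about 1% of the selected arcs are blocked (at least one, for illustration)
    blocked = set(rng.sample(changed, max(1, int(block_ratio * k))))
    for (i, j) in changed:
        if (i, j) in blocked:
            w[i][j] = MAX_W                     # arc blocking
        else:
            r = rng.gauss(0.0, 0.2 * w0[i][j])  # R_ij ~ N(0, 0.2 * w_ij(0))
            w[i][j] = max(1e-9, w0[i][j] + r)   # keep weights positive
        if symmetric:
            w[j][i] = w[i][j]                   # mirror to preserve symmetry
    return w
```

Note that in the asymmetric case, mirroring is skipped, so (j, i) keeps its old weight unless it is independently selected, exactly as described above.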
A particular solution π = [π_1, . . . , π_n] in the search space is specified by a permutation of the node indices and, for the dynamic TSP (DTSP), is evaluated as shown in Equation (9):
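Equation (9) itself does not survive in this text; it sums the weights of the arcs along the permutation in the current period, including the closing arc back to the first node. A minimal sketch (names are ours):

```python
def tour_cost(pi, w):
    """Cost of permutation `pi` under the current weight matrix `w` (Eq. (9))."""
    n = len(pi)
    return sum(w[pi[i]][pi[(i + 1) % n]] for i in range(n))
```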

HULK: Hyper-Heuristic Based on ACO and Local Search Operators
The hyper-heuristic presented in this paper is integrated with an ACO metaheuristic. ACO consists of a colony of ω ants that construct solutions and share information with each other via their pheromone trails. One of the best-performing ACO variations, the MAX-MIN Ant System (MMAS) [41], is used in the hyper-heuristic framework shown in Algorithm 2. Notice that in [8] the step of choosing the LSOs does not exist, since the LSO is picked before the algorithm starts. We will use the TSP notation to describe the hyper-heuristic based on the ACO framework and LSOs.

Building Solutions Inside ACO
Ants use information from the pheromone trail to build solutions and reinforce the pheromone trail to mark their path. The probability with which ant k, currently at node i, moves to node j is calculated as follows: τ_ij and η_ij are the existing pheromone trail and the heuristic information available a priori between nodes i and j, respectively. The pheromone trails are initialized evenly with the same value τ_0. The heuristic information is calculated as η_ij = 1/w_ij(T), where w_ij(T) is defined as in Equation (7). N_i^k is the set of unvisited nodes adjacent to node i for ant k, and α and β are two parameters that determine the relative influence of τ_ij and η_ij, respectively.
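The random-proportional rule above can be sketched as follows; this is an illustrative implementation with our own names, assuming the usual p_ij^k ∝ τ_ij^α · η_ij^β form over the unvisited nodes:

```python
import random

def choose_next(i, unvisited, tau, eta, alpha, beta, rng=random):
    """Pick ant k's next node from `unvisited` by roulette-wheel selection."""
    weights = [(tau[i][j] ** alpha) * (eta[i][j] ** beta) for j in unvisited]
    total = sum(weights)
    r = rng.random() * total          # spin the wheel
    acc = 0.0
    for j, wgt in zip(unvisited, weights):
        acc += wgt
        if r <= acc:
            return j
    return unvisited[-1]              # numerical safety fallback
```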

Choosing and Applying Local Search Operator
Stützle and Hoos [41] applied local search operators to the iteration-best ant of MMAS after every iteration, while in [29] local search operators were further applied to all ants. However, local search operators are computationally expensive; such massive usage may not be very efficient for the DTSP because of the natural increase in computational effort. As previously discussed, for DTSPs algorithms must produce high-quality solutions quickly (essentially before the next dynamic change occurs) [37]. In [8,10], the local search operator is applied to the best-so-far ant of MMAS only when a new best solution is found, because local search operators are typically executed until no further improvement is possible. In this research, however, since LSOs may be exchanged, there are cases where it is worthwhile to apply an LSO even when there is no improvement in the best-so-far solution: these exchanges between LSOs can change the pheromone trail, guiding the search to promising places not yet explored. We investigate two powerful local search heuristics designed for the TSP, the LK and US heuristics, adapted to cope with both symmetric and asymmetric cases; we describe them in Sections 3.2.1 and 3.2.2, respectively.
The proposed hyper-heuristic is designed to select how and when to apply the local search operator. As previously discussed, each time the ACO finds an improved solution, a local search heuristic is applied to that solution for further improvement, but the hyper-heuristic can also apply LSOs to avoid solution stagnation and to perform a sort of diversification in the search space; the whole hyper-heuristic framework is presented in Algorithm 2. To decide which local search is applied, we use a weighted roulette wheel with choice probabilities initially equal to 0.5 for both operators, updated as shown in Equation (11) if US was chosen and in Equation (12) if LK was chosen. The idea is similar to genetic algorithms, where chromosomes with better fitness values (here, the LSO with better performance) have greater chances of being picked. averUS and averLK are the average improvements over up to the three last solution values produced by US and LK, respectively. The adjustment is made only if both operators have already been selected at least once and the average of the solutions found by the chosen operator is lower than that of the other (minimization problem), i.e., averUS < averLK if US is chosen, and averLK < averUS if LK is chosen. This update attempts to prevent prematurely settling on only one of the available operators; also, γ and δ are limited to the range [0.1, 0.9]. Observe that HULK starts with the same chance of selecting either LSO, but during the search the percentages are self-adjusted in favor of the LSO with better performance; this procedure is easily adapted if there are more than two LSOs or if other DOPs are considered. To choose the LSO, we first generate a random number uniformly distributed in the interval [0.0, 1.0]; if this number is less than γ, US is chosen. Otherwise, we choose LK.
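The self-adaptive selection can be sketched as below. The exact update rules of Equations (11)-(12) are not reproduced in this text, so the fixed 0.1 step is an illustrative assumption; gamma is P(US) and P(LK) = 1 − gamma, and all names are ours.

```python
import random

def pick_lso(gamma, rng=random):
    """Weighted roulette wheel: US with probability gamma, otherwise LK."""
    return "US" if rng.random() < gamma else "LK"

def update_gamma(gamma, chosen, aver_us, aver_lk, step=0.1):
    """Shift gamma toward the better-performing operator (minimisation).

    The update is made only if both operators have already produced
    solutions and the chosen one has the lower (better) average.
    """
    if aver_us is None or aver_lk is None:
        return gamma
    if chosen == "US" and aver_us < aver_lk:
        gamma += step
    elif chosen == "LK" and aver_lk < aver_us:
        gamma -= step
    return min(0.9, max(0.1, gamma))   # keep gamma within [0.1, 0.9]
```

The clipping to [0.1, 0.9] mirrors the limits on γ and δ above, so neither operator is ever completely excluded.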
The integration of the LSO into the hyper-heuristic is made through the procedure ApplyLSO(π_ib, A, DistanceMatrix), where π_ib is the current solution and A is the LSO. This procedure is detailed in Algorithm 1.

Lin-Kernighan Local Search Operator
The LK heuristic performs a series of k-opt moves to transform a TSP tour into a shorter one [33], where a k-opt move consists of exchanging a set of k tour arcs for a set of k new arcs while keeping the tour feasible. The algorithm has an efficient open-source implementation [34], but one with so many modifications accumulated over time that it is very difficult to use and adapt; here, however, we successfully extended it to also handle asymmetry. The LK heuristic starts with two empty arc sets: X (the out-arcs) and Y (the in-arcs). At each step, one arc that currently belongs to the tour is added to X and a new arc that does not belong to the tour is added to Y. After the first step, the LK heuristic favours arc insertions that result in a shorter complete TSP tour. When a new complete tour is achieved, the algorithm begins a new phase of arc exchanges, and this process continues until there is no further improvement.
LK establishes a series of rules, aimed at enhancing performance, which can be summarized as follows:
• Each arc removed must share a node with its added counterpart. After the first arc exchange in each cycle, each arc being removed must also share a node with the previously added arc. Figure 1 illustrates an example of a 2-opt move, where in the first step arc (V_1, V_2) is removed and arc (V_2, V_4), which shares node V_2 with its removed counterpart, is added. In the second step, arc (V_3, V_4), which shares node V_4 with the previously added arc, is removed and arc (V_3, V_1) is added, closing the tour.
• No exchange that breaks the tour into multiple closed circuits is allowed. An example of this type of exchange is shown in Figure 2, where arcs (V_1, V_2) and (V_3, V_4) are removed and arcs (V_4, V_1) and (V_3, V_2) are added. In this case, the additions would not be accepted because they would result in a segment of the tour forming a cycle.
• Each pair of arcs exchanged must be gainful, meaning that each arc being added must be shorter than its removed counterpart. If the problem is asymmetric, both orientations of the resulting tour must be analysed.
• Once an arc is removed, it cannot be reinserted until the tour is closed.
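As a concrete illustration of the rules above, the following sketch applies a single gainful 2-opt exchange, the k = 2 building block of LK, for the symmetric case. It is not the full LK machinery (no sequential deepening, no X/Y sets); names are ours.

```python
def two_opt_pass(tour, w):
    """Apply the first gainful 2-opt exchange found, if any.

    Removes arcs (tour[a], tour[a+1]) and (tour[b], tour[b+1]), adds
    (tour[a], tour[b]) and (tour[a+1], tour[b+1]); each removed arc shares
    a node with an added arc, the tour stays a single circuit, and the
    exchange is applied only if gainful.
    """
    n = len(tour)
    for a in range(n - 1):
        # when a == 0, stop at n - 2 so the two removed arcs are disjoint
        for b in range(a + 2, n if a > 0 else n - 1):
            i, j = tour[a], tour[a + 1]
            k, l = tour[b], tour[(b + 1) % n]
            gain = (w[i][j] + w[k][l]) - (w[i][k] + w[j][l])
            if gain > 1e-12:
                tour[a + 1:b + 1] = reversed(tour[a + 1:b + 1])
                return tour, gain
    return tour, 0.0
```

For example, on four points at the corners of a unit square, a crossing tour is uncrossed by one such pass.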
Although LK was originally designed for symmetric problems [33], it can easily be modified to address asymmetric problems by transforming the asymmetric weight matrix into a symmetric one, normally by doubling the graph nodes [34,42].

Unstringing and Stringing Local Search Operator
The US heuristic is the core part of the Generalized Insertion Procedure with Unstringing and Stringing (GENIUS) [32]. US repeatedly removes (unstringing) and inserts (stringing) nodes, searching for solution improvements. The main feature of the algorithm is that insertions can be made between non-adjacent nodes, resulting in a route where both nodes become adjacent to the node being inserted. All movements are based on local and global neighbourhoods. The asymmetric version of US was proposed and implemented in [43], considering the node V_x to be inserted between any two nodes V_i and V_j. For a given orientation of a tour, consider V_k a node in the sub-tour from V_j to V_i, and V_l a node in the sub-tour from V_i to V_j. We also consider, for any node V_h on the tour, V_{h+1} its successor and V_{h−1} its predecessor. The re-insertion of V_x between V_i and V_j can be done in several ways, using different types of insertions and removals. Since the potential number of choices for V_i, V_j and V_x could be large, we use a candidate set based on the closest neighbours: once V_x is chosen, its neighbours are the q nodes that have the lowest-valued arcs with V_x. To choose the best place to insert a node in the tour, we look only at the neighbourhood of each node involved in stringing or unstringing movements. In [9,32], only symmetric problem instances were considered, tackled with Type I and Type II removals (Figure 3a,b) and Type I and Type II insertions (Figure 4a,b). In [8,10,43], two further types of removals and two further types of insertions were considered in order to tackle asymmetric problem instances: Type III and Type IV removals (Figure 3c,d) and Type III and Type IV insertions (Figure 4c,d). The unstringing procedure removes a node from the tour and reconnects the remaining nodes, trying to obtain an improved closed tour.
The procedure consists of four types of removals, defined as follows:
• Unstringing Type I: consider V_j belonging to the neighbourhood of V_{i+1} and V_k belonging to the neighbourhood of V_{i−1}.
• Unstringing Type II: consider V_j belonging to the neighbourhood of V_{i+1}, V_k belonging to the neighbourhood of V_{i−1}, with V_k being part of the sub-tour (V_{j+1}, . . . , V_{i−2}), and V_l belonging to the neighbourhood of V_{k+1}, with V_l being part of the sub-tour (V_j, . . . , V_{k−1}). The removal of node V_i results in the deletion and insertion of the corresponding arcs; as before, the sub-tours (V_{i+1}, . . . , V_{j−1}) and (V_{l+1}, . . . , V_k) are reversed.
• Unstringing Type III: consider V_j belonging to the neighbourhood of V_{i+1} and V_k belonging to the neighbourhood of V_{i−1}, with V_k being part of the sub-tour (V_{i+1}, . . . , V_{j−1}). The removal of node V_i results in the deletion of arcs, among them (V_{k−1}, V_k), and the insertion of arcs (V_{i+1}, V_j), (V_{i−1}, V_k) and (V_{j−1}, V_{k−1}). As before, the sub-tours (V_{j−1}, . . . , V_k) and (V_{i−1}, . . . , V_j) are reversed.
• Unstringing Type IV: consider V_j belonging to the neighbourhood of V_{i+1}, V_k belonging to the neighbourhood of V_{i−1}, with V_k being part of the sub-tour (V_{l+1}, . . . , V_{i−2}), and V_l belonging to the neighbourhood of V_{j−1}, with V_l being part of the sub-tour (V_{j+1}, . . . , V_{k−1}). The removal of node V_i results in the deletion and insertion of the corresponding arcs; as before, the sub-tours (V_{k+1}, . . . , V_{i−1}) and (V_j, . . . , V_{l−1}) are reversed.
The stringing procedure can be seen as the reverse of the unstringing procedure and consists of four types of insertions, as follows:
• Stringing Type I: node V_x is inserted between V_i and V_j; as before, the sub-tours (V_{i+1}, . . . , V_{l−1}) and (V_l, . . . , V_j) are reversed.
• Stringing Type III: this stringing type can be seen as the inverse of Stringing Type I. Notice that when node V_x is inserted between V_i and V_j, the sub-tour of nodes is rearranged in such a way that almost the entire sequence is reversed; the objective is to explore other promising regions of the search space. As in Stringing Type I, the sub-tours (V_i, . . . , V_{j−1}) and (V_k, . . . , V_{i−1}) are reversed.
• Stringing Type IV: similarly, this type of insertion can be seen as the reverse of Stringing Type II. As in Stringing Type II, the sub-tours (V_i, . . . , V_l) and (V_{l+1}, . . . , V_{j−1}) are reversed.

Pheromone Trail Update
One of the most important steps in ACO algorithms is how to adjust the pheromone trail to guide the search toward promising solutions in the search space, as well as to forget misleading paths. The pheromone trail plays an important role for HULK, since it maintains relevant information about previous dynamic environments, accelerating the search in the new dynamic environment or when a different LSO is applied. The pheromone adjustments in MMAS are made in the same way as in [8], i.e., evaporation is governed by Equation (13).
where ρ is the evaporation rate, satisfying 0 < ρ ≤ 1, and τ_ij is the current pheromone value. After evaporation, the best ant updates the pheromone trail by adding pheromone as described in Equation (14).
where ∆τ_ij^best = 1/φ(π^best, t) is the amount of pheromone deposited by the best ant. The pheromone can be deposited by the best-so-far ant (a special ant that may not necessarily belong to the current constructing colony), in which case π^best = π^bs, or by the iteration-best ant, in which case π^best = π^ib. These two ants allow a transition from stronger exploration early on to stronger exploitation of the search space later. To better explain this process, consider f_bs the frequency with which the best-so-far ant deposits pheromone on its trail, i.e., the number of iterations performed by the algorithm between two best-so-far pheromone updates. This frequency is adjusted as the search progresses [29]: during the first 25 iterations only the iteration-best ant deposits pheromone; f_bs is then set to 5 and reduced to 3, 2 and 1 at every 25 additional iterations, retaining the last value until the end of the ACO. This scheme allows that, as the search evolves, the influence of the iteration-best ant on the pheromone trail decreases while the impact of the best-so-far ant grows. The process is restarted at each dynamic change. Observe that the pheromone trail is updated after the end of the local search, so the improvements obtained by the local search can subsequently be exploited by the proposed hyper-heuristic.
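The evaporation/deposit step and the f_bs schedule can be sketched as follows. The schedule below follows the classic MMAS reading of the description above (iteration-best only at first, then f_bs = 5, 3, 2, 1 every 25 iterations), which we take as an assumption; all names are ours.

```python
def evaporate(tau, rho):
    """Equation (13): multiply every trail by (1 - rho)."""
    n = len(tau)
    for i in range(n):
        for j in range(n):
            tau[i][j] *= (1.0 - rho)

def deposit(tau, tour, cost):
    """Equation (14): the best ant deposits 1/cost on each arc of its tour."""
    delta = 1.0 / cost
    n = len(tour)
    for a in range(n):
        tau[tour[a]][tour[(a + 1) % n]] += delta

def fbs_schedule(iteration):
    """Iterations between two best-so-far deposits (inf = never)."""
    if iteration < 25:
        return float("inf")   # only the iteration-best ant deposits
    if iteration < 50:
        return 5
    if iteration < 75:
        return 3
    if iteration < 100:
        return 2
    return 1                  # best-so-far deposits at every iteration
```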

Keeping Solution Diversity
Diversity is a key factor when solving DOPs. A diverse set of solutions allows the search to quickly adapt to new environments, escaping from outdated ones [44]. Given that the best ant is the only one allowed to deposit pheromone on the trail, the search can get stuck in the solutions found in the first iterations. Consequently, we occasionally reinitialize the pheromone trails to the current τ_max value to increase exploration. The pheromone reinitialization mechanism is triggered every time stagnation occurs or no improvement in the solution is found for a given number of iterations. Stagnation is detected using λ-branching [45], which computes statistics on the distribution of the current pheromone trails.
We also impose lower and upper limits (τ_min = τ_max/(2n), where n is the number of nodes, and τ_max = 1/(ρ · φ(π^bs, t))) on the pheromone trails, keeping p_ij^k > 0 and giving every arc a chance to be selected. The τ_max value is updated every time a new best-so-far ant is found.
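These limits translate directly into code; a minimal sketch (names are ours), clamping every trail into [τ_min, τ_max]:

```python
def trail_limits(rho, best_cost, n):
    """tau_max = 1/(rho * cost of best-so-far tour); tau_min = tau_max/(2n)."""
    tau_max = 1.0 / (rho * best_cost)
    tau_min = tau_max / (2 * n)
    return tau_min, tau_max

def clamp_trails(tau, tau_min, tau_max):
    """Keep every pheromone value within [tau_min, tau_max]."""
    for row in tau:
        for j in range(len(row)):
            row[j] = min(tau_max, max(tau_min, row[j]))
```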

Responding to Dynamic Changes
One of the great advantages of ACO algorithms is that they can be directly applied to DOPs, because they can use the pheromone trails to keep the knowledge obtained from previous environments [36,46]. Normally the environment changes are not huge; therefore, the pheromone trails can retain sufficient knowledge to accelerate the optimization process in the new environment. Meanwhile, ACO must be flexible enough both to accept the knowledge provided by the former pheromone trails and to discard that information when dealing with the new environment. This happens through the evaporation process, which quickly eliminates unpromising areas of the new environment; the ants can then continue exploring the search space for the new optimum. This is also important when the LSOs do their job. However, if the environments become completely different, i.e., the magnitude of change is very large, re-initializing the pheromone trails is sometimes better than transferring knowledge [36,46,47].

Integration between Algorithms in the Dynamic Test Environment
The proposed hyper-heuristic can be summarized as a running environment that allows the intelligent integration of ACO with both the LK and US algorithms. The synchronization between algorithms happens through a shared distance matrix. While running the tests, at every period T a perturbation is applied to the distance matrix, as described in Section 2.2. The algorithm adopted as local search is selected at the beginning of each hyper-heuristic iteration through a weighted roulette wheel. If the ACO improves the solution, or does not produce an improved solution after five or more iterations, the local search is applied. Additionally, if the algorithm does not produce an improvement after 0.4T iterations (where T is the dynamic change period), both LSOs are applied in sequence, the first one selected through the weighted roulette. The general framework for the dynamic test running environment is given in Algorithm 2. Observe that this idea can be applied to a wide variety of DOPs, mainly routing problems; a recent example for the dynamic vehicle routing problem can be found in [48], where local search operators smartly repair the solution obtained by ACO.
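The triggering rules above can be sketched as a small decision function; this is an illustrative reading with our own names, not the authors' code:

```python
import random

def lsos_to_apply(improved, stalled_iters, T, gamma, rng=random):
    """Decide which LSOs to run this iteration.

    improved      : did ACO just improve the best-so-far solution?
    stalled_iters : iterations since the last improvement
    T             : dynamic change period; 0.4*T stalls trigger both LSOs
    gamma         : P(US) from the weighted roulette wheel
    """
    first = "US" if rng.random() < gamma else "LK"
    if stalled_iters >= 0.4 * T:
        # long stall: apply both LSOs in sequence, roulette picks the first
        return [first, "LK" if first == "US" else "US"]
    if improved or stalled_iters >= 5:
        return [first]        # single LSO on improvement or short stall
    return []                 # keep iterating the ACO
```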

Experimental Setup
During the literature review, we noticed that computational experiments for DTSPs are restricted to a small number of TSP instances from TSPLIB, mainly from the KROA family (more specifically KROA100, KROA150 and KROA200). The KROA family was proposed by [49] and comes from examples of truck-dispatching problems. We compare the results in the literature with those obtained by HULK, considering the magnitude of change m = 0.10. To be fair to the literature, we first perform the tests without arc blocking, with the same set of parameters used here. We also compute the results for the KROA family considering arc blocking. We investigate the proposed hyper-heuristic, which cleverly combines the flexibility of ACO to provide solutions for symmetric and asymmetric dynamic changes with the improvement power of two good local search operators. The performance of the following strategies was compared for DTSPs:
• MMAS + US: for each best-so-far ant found by MMAS, we apply the US local search operator pursuing improvements (detailed in [8]).
• MMAS + LK: for each best-so-far ant found by MMAS, we apply the LK local search operator pursuing improvements (detailed in [8]).
• HULK: the proposed hyper-heuristic that combines ACO with the LSOs LK and US, biased by a self-adjusting weighted roulette wheel (detailed in Algorithm 2).
The initial parameters for the tested methods use the following values: α = 1, β = 5, γ = 0.5, δ = 0.5, ρ = 0.8 and ω = 50 (number of ants). In addition to the instances of the KROA family, DTSPs are generated from six originally symmetric static benchmark instances obtained from TSPLIB (available from http://comopt.ifi.uni-heidelberg.de/software/TSPLIB95/, accessed on 9 August 2021): d198, pcb442, u574, pcb1173, rat783 and lin318, using the dynamic generator described in Section 2.2. The first four benchmark instances originate from the operation of drilling holes in printed circuit boards [50], the fifth benchmark instance arises from a rattled grid [51] and the sixth benchmark instance from the travel cost between cities [34].
The frequency of change f was set so that a change occurs every 100 algorithmic evaluations, and the magnitude of change m was set to 0.05, 0.1, 0.2 and 0.4, denoting small to medium changing environments; we also allowed blocking up to 1% of the arcs selected for dynamic change, trying to keep at least one Hamiltonian cycle. On the whole, there were 72 instances to be tested, as follows: a series of four DTSP test cases was constructed from each stationary instance, for symmetric and asymmetric changes, to systematically investigate the performance of the algorithms (all asymmetric problem instances carry the extension .atsp at the end of the problem label). For the asymmetric instances, only asymmetric dynamic changes were allowed. Thirty independent runs of each algorithm were performed on each DTSP, using the same set of random seeds. For each run, we allowed 100 environmental changes, storing the best-so-far solution for each of them. All seeds used in the random generators were kept for all environmental changes, and the random numbers were uniformly distributed. To be fair, all methods performed the same number of evaluations; the proportional evaluations required when applying LSOs were added to the total evaluations of the algorithms.
To evaluate the overall performance of the methods, the offline performance [52] is used, as defined in Equation (15): P̄ = (1/E) · Σ_{t=1}^{E} φ(π^bs, t), where E is the total number of evaluations and π^bs is the best-so-far solution, whose quality is re-evaluated after each dynamic change.
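The metric of Equation (15) is simply the mean best-so-far quality over all evaluations; a minimal sketch (names are ours):

```python
def offline_performance(best_so_far_costs):
    """Mean of phi(pi_bs, t) over all E recorded evaluations (Eq. (15))."""
    return sum(best_so_far_costs) / len(best_so_far_costs)
```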

Experimental Results and Discussion
We first present Table 1, containing the results of 24 instances from the KROA family, without arc blocking. The experiments show that HULK outperforms the other methods in all symmetric and asymmetric instances, and that MMAS + US performs better than MMAS + LK. From this first set of instances, we can already observe HULK's efficiency.
In the literature, we have not found results for asymmetric instances from the KROA family. Therefore, we chose four algorithms against which to compare HULK with a magnitude of change m = 0.1. They use immigrant schemes (which arise as the most promising mechanisms due to their structural simplicity, even superior in computational performance) [35]: RIACO (random immigrants ACO), EIACO (elitism-based immigrants ACO) and MIACO (memory-based immigrants ACO) are detailed in [36]. The last one is the best result among the four strategies presented in [27] for the procedure named ALNS (Adaptive Large Neighborhood Search-based immigrant schemes). Observe that ALNS outperforms the other three immigrant schemes, but HULK is better than the best ALNS by at least 6.8% (KROA100.tsp) and by as much as 14.6% (KROA200.tsp). Since an LSO has an asymptotic behavior with respect to its initial solution, i.e., it returns a solution at least as good as its initial one, the comparisons presented in Table 2 show that HULK is much better than any of the immigrant scheme approaches. Thus, the remaining tests compare HULK to the best procedures presented in [8] on instances of higher dimensions, barely explored previously in the literature, and we show another variant of DTSPs considering arc blocking when generating the dynamic changes, as an extension of the results in [8]. The problems considering arc blocking seem to be more difficult to solve; see Tables 1 and 2 (KROA family), where HULK outperformed the compared methods in all cases. However, when we include arc blocking, the superiority of HULK is still evident, but we observe in Tables 3 and 4 that for KROA100.tsp and KROA100.atsp there are some cases in which HULK loses to the compared methods, though always by less than 1.0%. Table 1. Experimental results regarding the average offline performance of the presented methods on symmetric DTSPs (upper half) and asymmetric DTSPs (lower half) for instances from the KROA family without arc blocking.
The number in brackets is the percentage distance from the current value to the best value, shown in boldface. Tables 3 and 4 present the offline performance results of all methods compared in this paper, considering all DTSPs and the different magnitudes of dynamic change, for symmetric and asymmetric problems, respectively. Table 5 presents the statistical analysis of the results, using pairwise Wilcoxon rank-sum tests with a significance level of 0.05 (see the Julia library HypothesisTests.jl at https://juliastats.org/HypothesisTests.jl/stable/nonparametric/, accessed on 22 October 2021). To better visualize the results, we use the following symbols: "+" (the first algorithm is better than the second one), "++" (the first algorithm is significantly better than the second one), "−" (the second algorithm is better than the first one), "− −" (the second algorithm is significantly better than the first one) and "∼" (there is no significant difference between them). An algorithm is considered significantly better when the test shows a difference between the algorithms and the percentage difference between the average offline performances is greater than 2.0%; the algorithms are considered to have no significant difference when the test shows it or when the percentage difference between the average offline performances is less than or equal to 0.1%. Table 3. Experimental results regarding the average offline performance of the presented methods on symmetric DTSPs. The number in brackets is the percentage distance from the current value to the best value, shown in boldface.
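The significance criterion above (Wilcoxon rank-sum test at the 0.05 level, combined with the 2.0% and 0.1% thresholds on the average offline performance) can be sketched as follows. This is an illustrative Python reimplementation using the normal approximation to the rank-sum test, not the paper's Julia code; the function names `ranksum_p` and `compare` are our own.

```python
from math import erf, sqrt
from statistics import mean

def ranksum_p(a, b):
    """Two-sided Wilcoxon rank-sum p-value via the normal approximation."""
    n, m = len(a), len(b)
    pooled = sorted((v, i) for i, v in enumerate(a + b))
    ranks = [0.0] * (n + m)
    i = 0
    while i < n + m:
        j = i
        while j + 1 < n + m and pooled[j + 1][0] == pooled[i][0]:
            j += 1                         # extend run of tied values
        avg_rank = (i + j) / 2 + 1         # average (1-based) rank for ties
        for k in range(i, j + 1):
            ranks[pooled[k][1]] = avg_rank
        i = j + 1
    w = sum(ranks[:n])                     # rank sum of the first sample
    mu = n * (n + m + 1) / 2
    sigma = sqrt(n * m * (n + m + 1) / 12)
    z = (w - mu) / sigma
    return 2.0 * (1.0 - 0.5 * (1.0 + erf(abs(z) / sqrt(2.0))))

def compare(a, b, alpha=0.05):
    """Return '++', '+', '~', '-' or '--' for offline-performance samples
    a vs. b (minimization: smaller average cost is better)."""
    p = ranksum_p(a, b)
    pct = abs(mean(a) - mean(b)) / min(mean(a), mean(b)) * 100.0
    if p >= alpha or pct <= 0.1:           # statistically indistinguishable
        return "~"
    better = mean(a) < mean(b)
    if pct > 2.0:                          # significant margin
        return "++" if better else "--"
    return "+" if better else "-"
```

The exact base used for the percentage difference is an assumption (here, relative to the smaller mean); the paper does not state it explicitly.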

[Table 3 data omitted; columns: Problem, Instance, m, MMAS + LK, MMAS + US, HULK.]

Table 4. Experimental results regarding the average offline performance of the presented methods on asymmetric DTSPs. The number in brackets is the percentage distance from the current value to the best value, shown in boldface.

[Table 4 data omitted; columns: Problem, Instance, m, MMAS + LK, MMAS + US, HULK.]

To visualize and understand the behavior of the compared methods, we plotted, for the last twenty environmental changes, the dynamic average offline performance of the presented hyper-heuristic (HULK), MMAS + US and MMAS + LK; see Figure 5 for symmetric and Figure 6 for asymmetric dynamic changes. The problems plotted are rat783 and pcb1173. We can observe that in the symmetric change case all methods are competitive, with an advantage to HULK and MMAS + LK, but in the asymmetric change case HULK performs consistently better, showing that the combination of two powerful LSOs can achieve much better solutions. We can also draw the following observations from the computational results. Considering the symmetric environmental changes (Table 3), HULK performs better in 66.7% of test cases and MMAS + LK performs better in most of the rest, with two cases where MMAS + US wins. In numbers, HULK wins in 24 out of 32 instances; in seven the percentage difference is less than 1.0%, and in the other five it remains between 1.0% and 3.1%. There are situations where MMAS + US outperforms MMAS + LK, e.g., u574.tsp and the entire KROA family, but there are only two cases where MMAS + US outperforms HULK. Table 5 shows that HULK always has similar or superior quality compared with MMAS + US and in some cases loses to MMAS + LK. This alternating performance was expected, since both LSOs were originally developed for the symmetric TSP, taking advantage of the properties of instances that use Euclidean distances and respect the triangle inequality.
However, when we observe the asymmetric environmental changes (Table 4), we note total HULK superiority, outperforming MMAS + LK and MMAS + US in 94.4% of test cases. In Table 5, there are only two cases where HULK performs worse, i.e., against MMAS + LK (KROA200.atsp, m = 0.05) and MMAS + US (KROA200.atsp, m = 0.2), and in both the statistics show similar performance between them. This behavior shows how beneficial the clever combination of LSOs is: it allows the hyper-heuristic to take advantage of the best aspects of each LSO, and it shows that MMAS can retain crucial information when a dynamic change occurs. Thanks to this, MMAS is able to provide good initial solutions for the local search operators, and HULK can use the best features of both to work in a very robust way in both symmetric and asymmetric dynamic environments, as expected. The complete results are presented in Tables 3 and 4 and the statistical comparisons in Table 5.
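To illustrate why retained pheromone can seed good initial solutions after a change, a minimal sketch follows. It assumes a simple clamp of the kept trails to the MAX-MIN trail limits; the bound values and the function name are illustrative, not the paper's implementation.

```python
# When the environment changes, the pheromone matrix is kept rather than
# reset, and each trail is only clamped back into the MMAS interval
# [tau_min, tau_max]. Edges that were good before the change therefore
# still bias the next solution constructions. Bounds here are arbitrary
# illustrative values.
def on_environment_change(pheromone, tau_min=0.01, tau_max=1.0):
    """Clamp retained trails into the MMAS bounds after a dynamic change."""
    for i, row in enumerate(pheromone):
        for j, tau in enumerate(row):
            pheromone[i][j] = min(tau_max, max(tau_min, tau))
    return pheromone
```

A reset strategy would instead refill the whole matrix with a constant, discarding exactly the information that makes MMAS adaptive in DOPs.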
Unlike what was presented in [8], we included arc blocking in the dynamic changes and produced environments much closer to reality, creating a new and more difficult problem to solve. We chose two of the best LSOs (LK and US) for the TSP and adapted them to work with both symmetric and asymmetric environmental changes, so we were able to take advantage of the synergy created between them through a hyper-heuristic based on ACO. We also notice from the results that HULK's performance is independent of the frequency and the magnitude of environmental changes. The results showed that MMAS + US and MMAS + LK behave similarly, each at times outperforming the other (for both symmetric and asymmetric dynamic changes). This probably happens because both are based on arc-exchange operations that take advantage of TSP properties. MMAS + LK outperforms HULK in a few symmetric cases (d198.tsp, pcb442.tsp and pcb1173.tsp), while MMAS + US shows a performance similar to HULK only in a few cases (all of them in the KROA family).
HULK shows complete superiority under asymmetric dynamic changes, and it is the better method under symmetric dynamic changes. Overall, it achieves the best results in more than 80.0% of the cases, and its performance improves as the problem dimension grows. The way HULK combines the LSOs significantly improves the method's performance with no considerable additional computational effort. HULK shows how to strengthen the potential of the LSOs by responsively alternating between them, and the choice of the parameters γ and δ proves effective with two LSOs. Applying both LSOs in sequence sometimes improves the overall performance of HULK. For future research, we can add more LSOs and more intelligence to HULK, directing the search to unexplored regions of the solution space.

Conclusions
This paper presented the hyper-heuristic HULK, based on ACO and LSOs. We used one of the best ACO variants, MMAS, and integrated it with two powerful LSOs (LK and US). This integration combined the adaptation capabilities of MMAS under dynamic environmental changes with the solution improvements of the LSOs, leading to better exploitation and exploration of the search space. HULK showed better performance than the other compared methods in more than 80% of the tested instances (66.7% considering symmetric changes and 94.4% for asymmetric changes). The idea of using an adaptive weighted roulette wheel to choose among the available LSOs proved important: when the LSO changed during the same dynamic environment, the ACO retained the information gathered with the previous one, and this synergy could direct the search to unexplored areas of the solution space. We also applied both LSOs in sequence when stagnation was detected. This behavior is best seen when we analyze each environment and see how HULK improves the method's intelligence. The weight-adjustment scheme used during the execution of HULK always seeks to favor the best LSO at the current moment, guided by the average of up to the three latest solutions, while never leaving any LSO without a chance of being chosen. The TSP was the base problem used to generate the dynamic test instances; to make the problem more realistic, we included the possibility of blocking arcs: in both directions if the dynamic change was symmetric, and in just one direction if the change was asymmetric.
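As an illustration, the adaptive weighted roulette wheel described above might be sketched as below. The exact weight formula, the floor value `w_min` and the class and method names are assumptions made for illustration; HULK's actual update rule may differ. The sketch does keep the two properties stated in the text: weights follow the average of up to the three latest solution costs per LSO, and no LSO is ever left without a chance of being chosen.

```python
import random

class LsoRoulette:
    """Adaptive weighted roulette over local search operators (LSOs)."""

    def __init__(self, names, w_min=0.1, window=3):
        self.history = {n: [] for n in names}  # recent solution costs per LSO
        self.w_min = w_min                     # floor: every LSO stays selectable
        self.window = window                   # average over up to 3 latest costs

    def record(self, name, cost):
        h = self.history[name]
        h.append(cost)
        del h[:-self.window]                   # keep only the latest `window` costs

    def _weights(self):
        avgs = {n: (sum(h) / len(h) if h else None)
                for n, h in self.history.items()}
        known = [a for a in avgs.values() if a is not None]
        best = min(known) if known else 1.0
        # The LSO with the best (smallest) recent average gets weight 1.0;
        # worse ones decay toward the floor w_min but never below it.
        return {n: 1.0 if a is None else max(self.w_min, best / a)
                for n, a in avgs.items()}

    def choose(self, rng=random):
        weights = self._weights()
        r = rng.uniform(0.0, sum(weights.values()))
        acc = 0.0
        for name, w in weights.items():
            acc += w
            if r <= acc:
                return name
        return name                            # numerical safety fallback
```

For example, after recording costs of 100 for LK and 1000 for US, LK would be drawn far more often, yet US would still be chosen roughly `w_min / (1 + w_min)` of the time.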
At the beginning of the experiments, we compared HULK with other ACO algorithms with immigrant schemes found in the literature; HULK outperformed them by a minimum of 6.8%. We then turned our attention to instances of higher dimensions that, to the best of our knowledge, had never before been addressed by ACO-based methods for DOPs in the literature. We also extended the dynamic changes, including arc blocking and asymmetry, making the instances more difficult to solve. Section 4 analyzes in detail the behavior of all methods, exploring the results from Tables 1-5 and showing the superiority of HULK, mainly in asymmetric environments.
Finally, we can conclude that HULK performed better than MMAS + LK and MMAS + US in solving dynamic symmetric changes, and even better under dynamic asymmetric changes. In HULK, both LSOs work together, improving the solution-search capacity. In future research, we can explore more widely the parameters that choose the most suitable LSO, and it is possible to include more LSOs. The idea behind HULK is easily adapted to cope with other DOPs, such as dynamic routing problems, which interact very well with ACO and have a wide variety of LSOs in the literature.