Simple Constructive, Insertion, and Improvement Heuristics Based on the Girding Polygon for the Euclidean Traveling Salesman Problem

The Traveling Salesman Problem (TSP) aims at finding the shortest trip for a salesman, who has to visit each of the locations from a given set exactly once, starting and ending at the same location. Here, we consider the Euclidean version of the problem, in which the locations are points in the two-dimensional Euclidean space and the distances are correspondingly Euclidean distances. We propose simple, fast, and easily implementable heuristics that work well, in practice, for large real-life problem instances. The algorithm works on three phases, the constructive, the insertion, and the improvement phases. The first two phases run in time O(n2) and the number of repetitions in the improvement phase, in practice, is bounded by a small constant. We have tested the practical behavior of our heuristics on the available benchmark problem instances. The approximation provided by our algorithm for the tested benchmark problem instances did not beat best known results. At the same time, comparing the CPU time used by our algorithm with that of the earlier known ones, in about 92% of the cases our algorithm has required less computational time. Our algorithm is also memory efficient: for the largest tested problem instance with 744,710 cities, it has used about 50 MiB, whereas the average memory usage for the remained 217 instances was 1.6 MiB.


Introduction
The Traveling Salesman Problem (TSP) is one of the most studied strongly NP-hard combinatorial optimization problems.Given an n × n matrix of distances between n objects, call them cities, one looks for a shortest possible feasible tour which can be seen as a permutation of the given n objects: a feasible tour visits each of the n cities exactly once except the first visited city with which the tour ends.The cost of a tour is the sum of the distances between each pair of the neighboring cities in that tour.This problem can also be described in graph terms.We have an undirected weighted complete graph G = (V, E), where V is the set of n = |V| vertices (cities) and E is the set of the n 2 − n edges (i, j) = (j, i), i = j.A non-negative weight of an edge (i, j), w(i, j) is the distance between vertices i and j.There are two basic sets of restrictions that define feasible solution (a tour that has to start and complete at the same vertex and has to contain all the vertices from set V exactly once).A feasible tour T can be represented as: and its cost is The objective is to find an optimal tour, a feasible one with the minimum cost min T C(T).Some special cases of the problem have been commonly considered.For instance, in the symmetric version, the distance matrix is symmetric (i.e., for each edge (i, j), w(i, j) = w(j, i)); in another setting, the distances between the cities are Euclidean distances (i.e., set V can be represented as points in the two-dimensional Euclidean space).Clearly, the Euclidean TSP is also a symmetric TSP but not vice versa.The Euclidean TSP has a straightforward immediate application in the real-life scenario when a salesman wishes to visit the cities using the shortest possible tour.Because in the Euclidean version the cities are points in plane, for each pair of points, the triangle inequality holds, which makes the problem a bit more accessible in the sense that simple geometric rules can be used for calculating the cost of a tour or the cost of the inclusion of a new point in a partial tour, unlike the general setting.Nevertheless, the Euclidean TSP remains strongly NP-hard; see Papadimitriou [1] and Garey et al. [2].
The exact solution methods for TSP can only solve problem instances with a moderate number of cities; hence, approximation algorithms are of a primary interest.There exist a vast amount of approximation heuristic algorithms for TSP.The literature on TSP is very wide-ranging, and it is not our goal to overview all the important relevant work here (we refer the reader, e.g., to a book by Lawler et al. [3] and an overview chapter by Jünger [4]).
The literature distinguishes two basic types of approximation algorithms for TSP: tour construction and loop improvement algorithms.The construction heuristics create a feasible tour in one pass so that the taken decisions are not reconsidered later.A feasible solution delivered by a construction heuristic can be used in a loop improvement heuristic as an initial feasible solution (though such initial solution can be constructed randomly).Given the current feasible tour, iteratively, an improvement algorithm, based on some local optimality criteria, makes some changes in that tour resulting in a new feasible solution with less cost.Well-known examples of tour improvement algorithms are 2-Opt Croes 2-Opt, its generalizations 3-Opt and k-Opt, and the algorithm by Lin and Kernighan [5], to mention a few.
The most successful algorithms we have found in the literature for large-scale TSP instances are Ant Colony Optimization (ACO) meta heuristics, with which we compare our results.On one hand, these algorithms give a good approximation.On the other hand, the traditional ACO-based algorithms tend to require a considerable computer memory, which is necessary to keep an n × n pheromone matrix.Typically, the time complexity of the selection of each next move using ACO is also costly.These drawbacks are addressed in some recent ACO-based algorithms in which, at each iteration of the calculation of the pheromone levels, the intermediate data are reduced storing only a limited number of the most promising tours in computer memory.With Partial ACO (PACO), only some part of a known good tour is altered.A PACO-based heuristic was proposed in Chitty [6] and the experimental results for four problem instances from library Art Gallery were reported.Effective Strategies + ACO (ESACO) uses pheromone values directly in the 2-opt local search for the solution improvement and reduces the pheromone matrix, yielding linear space complexity (see, for example, Ismkman [7]).Parallel Cooperative Hybrid Algorithm ACO (PACO-3Opt) uses a multi-colony of ants to prevent a possible stagnation (see, for example, Gülcü et al. [8]).In a very recent Restricted Pheromone Matrix Method (RPMM) [9], the pheromone matrix is reduced with a linear memory complexity, resulting in an essentially lower memory consumption.Another recent successful ACO-based Dynamic Flying ACO (DFACO) heuristic was proposed by Dahan et al. [10].Besides these ACO-based heuristics, we have compared our heuristics with other two meta-heuristics.One of them is a parallel algorithm based on the nearest neighborhood search suggested by Al-Adwan et al. [11], and the other one, proposed by Zhong et al. [12], is a Discrete Pigeon-Inspired Optimization (DPIO) metaheuristic.We have also implemented directly the Nearest Neighborhood (NN) algorithm for the comparison purposes (see Section 4 and Appendix A).
In Table A1 in Appendix A, we give a summary of the above heuristics including the information on the type and the number of the instances for which these algorithms were tested and the number of the runs of each of these algorithms.Unlike these heuristics, the heuristic that we propose here is deterministic, in the sense that, for any input, it delivers the same solution each time it is invoked; hence, there is no need in the repeated runs of our algorithm.We have tested the performance of our algorithm on 218 benchmark problem instances (the number of the reported instances for the algorithms from Table A1 vary from 6 to 36).The relative error of our algorithm for the tested instances did not beat the earlier known best results; however, for some instances, our error was better than that of the above-mentioned algorithms (see Table 9 at the end of Section 3).The error percentage provided by our algorithm has varied from 0% to 17%, with an average relative error of 7.16%.The standard error deviation over all the tested instances was 0.03.
In terms of the CPU time, our algorithm was faster than ones from Table A1 except for six instances from Art Gallery RPMM [9] and Partial-ACO [6], and for two instances from TSPLIB DPIO [12] were faster (see Table 10).Among all the comparisons we made, in about 92% of the cases, our algorithm has required less computational time.We have halted the execution of our algorithm for the two of the above-mentioned largest problem instances in 15 days, and for the next largest instance ara238025 with 238,025 cities our algorithm has halted in about 36 h.The average CPU time for the remained instances were 19.2 min.The standard CPU time deviation for these instances was 89.3 min (for all the instances, including the above-mentioned three largest ones, it was 2068.4 min).
Our algorithm consumes very little computer memory.For the largest problem instance with 744,710 cities, it has used only about 50 MiB (mebibytes).The average memory usage for the remained 217 instances was 1.6 MiB (the average for all the instances including the above largest one was 1.88 MiB).The standard deviation of the usage of the memory is 4.6 MiB.Equation ( 3) below (see also Figure 15 in Section 3) shows the dependence of the memory required by our algorithm on the total number of cities n.As we can observe, this dependence is linear: Our algorithm consists of the constructive, the insertion and the improvement phases, we call it the Constructive, Insertion, and Improvement algorithm, the CII-algorithm, for short.The constructive heuristics of Phase 1 deliver a partial tour that includes solely the points of the girding polygon.The insertion heuristic of Phase 2 completes the partial tour of Phase 1 to a complete feasible tour using the cheapest insertion strategy: iteratively, the current partial tour is augmented with a new point, one yielding the minimal increase in the cost in an auxiliary, specially formed tour.We use simple geometry in the decision-making process at Phases 2 and 3.The tour improvement heuristic of Phase 3 improves iteratively the tour of Phase 2 based on the local optimality conditions: it uses two heuristic algorithms which carry out some local rearrangement of the current tour.At Phase 1, the girding polygon for the points of set V and an initial, yet infeasible (partial) tour including the vertices of that polygon is constructed in time O(n 2 ).The initial tour of Phase 1 is iteratively extended with the new points from the internal area of the polygon at Phase 2. Phase 2 also runs in time O(n 2 ) and basically uses the triangle inequality for the selection of each newly added point.Phase 3 uses two heuristic algorithms.The first one, called 2-Opt, is a local search algorithm proposed by Croes [13].The second one is based on the procedure of Phase 2. The two heuristics are repeatedly applied in the iterative improvement cycle until a special approximation condition is satisfied.The number of repetitions in the improvement cycle, in practice, is bounded by a small constant.In particular, the average number of the repetitions for all the tested instances was about 9 (the maximum of 49 repetitions was attained for one of the moderate sized instances lra498378, and for the largest instance lrb744710 with 744,710 points, Phase 3 was repeated 18 times).
The rest of the paper is organized as follows.In Section 2, we describe the CII-algorithm and show its time complexity.In Section 3, we give the implementation details and the results of our computational experiments, and, in Section 4, we give some concluding remarks and possible directions for the future work.The tables presented in Appendix A contain the complete data of our computational results.

Methods
We start this section with a brief aggregated description of our algorithm and in the following subsections we describe its three phases (Figure 1).

Phase 3
The latter solution is further improved Solution for the TSP  At Phase 1, we construct the girding polygon for the points of set V and construct an initial yet infeasible (partial) tour that includes the points of that polygon.The construction of this polygon employs four extreme points v 1 , v 2 , v 3 and v 4 ; the uppermost, leftmost, lowermost, and rightmost, respectively [14], with ones from set V defined as follows.First, we define the sets of points T , L , B and R with and See the next procedure for the extreme points in Table 1.
Lemma 1.The time complexity of Procedure extreme_points is O(n).
Proof of Lemma 1.In this and in the following proofs, we only consider those lines in the formal descriptions in which the number of elementary operations, denote it by f (n), depends on n (ignoring the lines yielding a constant number of operations).In lines 5-9, there is a loop with n − 1 cycles, hence { f (n) = n − 1}.In lines 11-15, there is a loop with n cycles, hence { f (n) = n} In lines 20-21, 22-23, 24-25 and 26-27; there are four loops, each one with at most has n cycles, so { f (n) = 4n}.Hence, the total cost is O(n).

Procedure for the Construction of the Girding Polygon
Before we describe the procedure, let us define function θ(i, j), returning the angle formed between the edge (i, j) and the positive direction of the x-axis (Equation (8) and Figure 2): The girding Polygon P = P(V) is a convex geometric figure in a two-dimensional plane, such that any point in V either belongs to that polygon or to the area of that polygon Vakhania et al. [14].
The input of our procedure for the construction of polygon P (see Table 2), consists of (i) the set of vertices V and (ii) the distinguished extreme points v 1 , v 2 , v 3 and v 4 .Abusing slightly the notation, in the description below, we use: (i) P, for the array of the points that form the girding polygon, and (ii) k for the last vertex included so far into the array P. Initially, P := (v 1 ) and k := v 1 .
append the vertex l to P and update k equal to l.
append the vertex l to P and update k equal to l.
Lemma 2. The time complexity of Procedure polygon is O(n 2 ).

Proof of Lemma 2.
There are four independent while statements with similar structure, each of which can be repeated at most n times.In the first line of each of these while statements, in lines 4, 11, 18, and 25, the set of points V * is formed that yields { f (n) = 2n} operations.In lines 5, 12, 19, and 26, the In lines 6, 13, 20, and 27, the set of angles Θ * consisting of at most n − 1 elements is formed in time { f (n) = n − 1}.In lines 7, 14, 21, and 28 to find the minimum angle in set Θ * at most n − 1 comparisons are needed and the lemma follows.
In Figure 3, we illustrate an example with 5 and v 4 = 5 and P = (4, 2, 5, 4).Initially, P = (4).Then, vertex 2 is added to polygon in Step 1, vertex 5 is added in Step 2; Step 3 is not carried out because v 3 = v 4 ; vertex 4 is added at Step 4. Using polygon P(V) constructed by the Procedure Polygon, we obtain our initial, yet infeasible (partial) tour ) that is merely formed by all the points t 1 , t 2 , • • • , t m of that polygon, where t 1 = v 1 and m is the number of the points.
In the example of Figure 3, P is the initial infeasible tour T 0 = (4, 2, 5, 4).V \ T 0 = {1, 3, 6} is the set of points that will be inserted into the final tour.

Phase 2
The initial tour of Phase 1 is iteratively extended with new points from the internal area of polygon P(V) using the cheapest insertion strategy at Phase 2 [15].
Let l ∈ T h−1 be a candidate point to be included in tour T h−1 , resulting in an extended tour T h of iteration h > 0, and let t i ∈ T h−1 .Due to the triangle inequality, w(t i , l) + w(l, t i+1 ) ≥ w(t i , t i+1 ); i.e., the insertion of point l between points t i and t i+1 , will increase the current total cost C(T h−1 ) by w(t i , l) + w(l, t i+1 ) − w(t i , t i+1 ) ≥ 0 (see Figure 4).Once point l is included between points t i and t i+1 , for the convenience of the presentation, we let and t i+1 := l (due to the way in which we represent our tours, this re-indexing yields no extra cost in our algorithm).In Table 3, we give a formal description of our procedure that inserts point l between points t i and t i+1 in tour T. PROCEDURE insert_point_in_tour(T, l, i)

Procedure construc_tour
At each iteration h, the current tour T h−1 is extended by point l h ∈ V \ T h−1 yielding the minimum cost c h l (defined below), which represents the increase in the the current total cost C(T h−1 ) if that point is included into the current tour T h−1 .The cost for point l ∈ V \ T h−1 is defined as follows: For further references, we denote by i(l) the index of point t i for which the above minimum for point l is reached, i.e., w(t i(l) , l) Thus, l h is a point that attains the minimum whereas the ties can be broken arbitrarily.
To speed up the procedure, we initially calculate the minimum cost for each point l ∈ V \ T h−1 .After the insertion of point l h , the minimum cost c h l is updated as follows: We can describe now Procedure construct_tour as shown in Table 4.
The time complexity of the Procedure construct_tour is O(n 2 ).

Phase 3
At Phase 3, we iteratively improve the feasible tour T delivered by Phase 2. We use two heuristic algorithms.The first one is called 2-Opt, which is a local search algorithm proposed by Croes [13].The second one is based on our construct_tour procedure, named improve_tour.The current solution (initially, it is the tour delivered by Phase 2) is repeatedly improved first by 2-Opt-heuristics and then by Procedure improve_tour, until there is an improvement.Phase 3 halts if either the output of one of the heuristics has the same objective value as the input (by the construction, the output cannot be worse than the input) or the following condition is satisfied: where di f min is a constant (for instance, we let di f min = 0.0001).Thus, initially, 2-Opt-heuristics runs with input T. Repeatedly, Condition ( 12) is verified for the the output of every call of each of the heuristics.If it is satisfied, Phase 3 halts; otherwise, for the output of the last called heuristics, the other one is invoked and the whole procedure is repeated; see Figure 9.

PHASE 3
Feasible Solution T given by Phase 2

2-Opt
It is well-known that the time complexity of this procedure is O(n 2 ).For the completeness of our presentation, we give a formal description of this procedure in Table 5.
The result of a local replacement carried out by the procedure is represented schematically in the Figure 10).

Procedure improve_tour
We also use our algorithm construct_tour to improve a feasible solution T = (t 1 , t 2 , • • • , t n , t 1 ), n = |V|.Iteratively, point t i+1 , 1 ≤ i < n, is removed from the tour T and is reinserted by a call of procedure construct_tour(V, T \ {t i+1 }).If a removed point gets reinserted in the same position, then i := i + 1 and the procedure continues until i ≤ n (see Table 6).Proof of Lemma 4. In lines 2-7, there is a while statement with n − 1 repetitions.The call of Procedure construct_tour in line 5 yields the cost O(n) since with m = n − 1, h = 1; see the proof of Lemma 3 (m is the number of points in the current partial tour).The lemma follows.

Implementation and Results
CII-algorithm was coded in C++ and compiled in g++ on a server with processor 2x Intel Xeon E5-2650 0 @ 2.8 GHz (Cuernavaca, Mor., Mexico), 32 GB in RAM and Ubuntu 18.04 (bionic) operating system (we have used only one CPU in our experiments).We did not keep the cost matrix in computer memory, but we have rather calculated the costs using the coordinates of the points.This does not increase the computation time too much and saves considerably the required computer memory.
We have tested the performance of CII-algorithm for 85 benchmark instances from TSPLIB [16] library and for 135 benchmark instances from TSP Test Data [17] library.The detailed results are presented in the Appendix.In our tables, parameter "Error" specifies the approximation factor of algorithm H compared to cost of the best known solution (C(BKS)): In Table 7 below, we give the data on the average performance of our heuristics.The average error percentage of our heuristics is calculated using Formula (13).It shows, for each group of instances, the average error of the solutions delivered by Phase 2 and, at Phase 3, the number of cycles at Phase 3 and the average decrease in the cost of the solution decreased at Phase 3 compared to that Phase 3. In the diagrams below (on the left hand-side), we illustrate the dependence of the approximation given by our algorithm on the size of the tested instances, and the dependence of the execution time of our algorithm on the size of the instances (right hand-side diagrams).We classify the tested instance into three groups: the small ones (from 1 to 199 points in Figure 12), the middle-sized ones (from 200 to 9999 points in Figure 13), and large instances (from 10,000 to 250,000 in Figure 14).We do not include the data for the largest two problem instances lra498378 and lrb744710 because of the visualization being technically complicated.The error for these instances is 12.5% and 15.9%, respectively, and the CPU time was limited to two weeks for both instances.As we can see, at Phase 3, there is an improvement in the quality of the solutions delivered by Phase 2.   Table 8 shows the summary of the comparison statistics of the solutions delivered by our algorithm CII with the solutions obtained by the heuristics that we have mentioned in the introduction (namely, DFACO [10], ACO-3Opt [10], ESACO [7], PACO-3Opt [8], DPIO [12], ACO-RPMM [9], Partial ACO [6], and PRNN [11]).We may observe in Table 9 that algorithm CII has attained an improved approximation for 17 instances.At the same time, in terms of the execution time, our heuristic dominates the other heuristics.In the Table 9, we specify the problem instances for which our algorithm provided a better relative error than some of the earlier cited algorithms.In terms of the CPU time comparison, see Table 10.In the diagram below (Figure 15), we illustrate the dependence of the memory used by our algorithm of all tested instances.

Conclusions and Future Work
We have presented a simple, easily implementable and fast heuristic algorithm for the Euclidean traveling salesman problem that solves both small and large scale instances with an acceptable approximation and consumes a little computer memory.Since the algorithm uses simple geometric calculations, it is easily implementable.The algorithm is fast, the first two phases run in time O(n 2 ), whereas the number of the improvement repetitions in the third phase, in practice, is not large.The first two phases might be used independently from the third phase, for instance, for the generation of an initial tour in more complex loop improvement heuristics.The quality of the solution delivered already by Phase 2 is acceptable and is expected to greatly outperform that of a random solution used normally to initiate meta-heuristic algorithms.We have implemented NN (Nearest Neighborhood) heuristics and run the code for the benchmark instances (the initial vertex for NN heuristic was selected randomly).Phase 2 gave essentially better results.In average, for the tested 135 instances (6 large, 32 Medium and 97 small ones), the difference between the approximation factor obtained by the procedure of Phase 2 and that of Nearest Neighbor heuristic was 9.65% (the average error of Phase 2 was 16.89% and that of NN was 26.55%, whereas the standard deviations were similar, 0.05% and 0.04%, respectively).As for the overall algorithm, it uses a negligible computer memory.Although for most of the tested benchmark instances it did not improve the best known results, the execution time of our heuristic, on average, was better than the earlier reported best known times.For future work, we intend to create a more powerful, yet more complex, CII-algorithm by augmenting each of the three phases of our algorithm with alternative ways for the creation of the initial tour and alternative insertion and improvement procedures.The next table (Table A2) discloses the headings of our tables.In the tables below, each line corresponds to a particular benchmark instance.For each of these instances, we indicate the performance of Phase 2 and Phase 3, separately, and that of the other heuristics reporting the results for that instance.In addition, 85 benchmark instances were taken from TSPLIB [16] and 135 instances are from TSP Test Data [17] libraries.Tables A3, A4, and A6 include the earlier known results.
In some lines of our tables (e.g., line 1, Table A5), a slight difference in the approximation errors of our algorithm and those of the algorithms from the "Results for National TSP Benchmarks" table can be seen due to the way the distances in the obtained solutions are represented in our algorithm (we do not round the distances represented as decimal numbers, whereas the distances in the best known solutions are rounded).

Figure 1 .
Figure 1.Block diagram of the CII-algorithm: (a) Phase 1 delivers a partial (yet infeasible) solution, (b) Phase 2 extends the partial solution of Phase 1 to a complete feasible solution, and, (c) at Phase 3, the latter solution is further improved.

Figure 3 .
Figure 3. Example that shows the extreme vertices and girding polygon.

6 Figure 6 .
Figure 6.Point 6 was inserted in the tour T 0 between points 4 and 2.

6 Figure 7 .
Figure 7. Point 3 was inserted in the tour T 1 between points 6 and 2.

6 Figure 8 .
Figure 8. Point 1 be inserted in the tour T 2 between points 4 and 6.

Figure 11
Figure 11 illustrates the iterative improvement in the cost of the solutions obtained at Phase 3 for a sample problem instance usa115475.The initial solution T 0 of Phase 2 is iteratively improved as shown in the diagram.

Figure 11 .
Figure 11.The improvement rate at Phase 3 for instance usa115475.

Lemma 4 .
The time complexity of the Procedure improve_tour is O(n 2 ).

Figure 15 .
Figure 15.RAM vs. number of points for all the tested instances.

Table 7 .
Statistics about the solutions delivered by CII.

Table 8 .
Statistics between CII and other heuristics.

Table 9 .
Comparative relative errors for some problem instances.

Table 10 .
Comparative CPU time for the problem instances for which the other heuristics were faster.

Table A1 .
Heuristics used to compare the CII-algorithm.

Table A2 .
Description of the headings of Tables A3-A6.

Table A4 .
Results for Art TSP benchmarks.

Table A5 .
Results for National TSP benchmarks.

Table A6 .
Results for the VLSI TSP benchmark.