Comparative Study of Variations in Quantum Approximate Optimization Algorithms for the Traveling Salesman Problem

The traveling salesman problem (TSP) is one of the most often-used NP-hard problems in computer science to study the effectiveness of computing models and hardware platforms. In this regard, it is also heavily used as a vehicle to study the feasibility of the quantum computing paradigm for this class of problems. In this paper, we tackle the TSP using the quantum approximate optimization algorithm (QAOA) approach by formulating it as an optimization problem. By adopting an improved qubit encoding strategy and a layer-wise learning optimization protocol, we present numerical results obtained from the gate-based digital quantum simulator, specifically targeting TSP instances with 3, 4, and 5 cities. We focus on the evaluations of three distinctive QAOA mixer designs, considering their performances in terms of numerical accuracy and optimization cost. Notably, we find that a well-balanced QAOA mixer design exhibits more promising potential for gate-based simulators and realistic quantum devices in the long run, an observation further supported by our noise model simulations. Furthermore, we investigate the sensitivity of the simulations to the TSP graph. Overall, our simulation results show that the digital quantum simulation of problem-inspired ansatz is a successful candidate for finding optimal TSP solutions.


I. INTRODUCTION
For over a century, the Traveling Salesman Problem (TSP) [1] has inspired hundreds of works and dozens of algorithms, both exact and heuristic. Today, the TSP has become so quintessential in modern computing that it is commonly considered the prototypical NP-hard combinatorial optimization problem, with far-reaching impact on countless applications in science, industry, and society. Consequently, the TSP is frequently taken as an ideal candidate for new computational models and non-standard algorithmic approaches, including approximate approaches like simulated annealing [2] and self-organizing maps [3], which have been widely employed to tackle the TSP.
Recent advancements in quantum technologies have paved the way for various quantum computing approaches to tackle the TSP. These approaches include the quantum Held-Karp algorithm [4], quantum annealing (QA) [5][6][7][8][9], and the more general variational quantum algorithm (VQA) [10,11]. VQA approaches have found extensive applications in diverse fields such as chemistry [11], physics [12], and finance [13], among others. Although complete demonstrations of quantum advantage over classical algorithms are currently limited in the noisy intermediate-scale quantum (NISQ) era [14], exploring these quantum algorithms remains crucial as experimentation on prototype quantum hardware continues to rapidly approach what can be classically simulated by even the world's largest supercomputers. Notably, the quantum approximate optimization algorithm (QAOA) [10,15], a subclass of the general VQA, has been successfully applied to a number of optimization problems [16], including the max-cut problem [17,18], vehicle routing [19], DNA sequencing [20], protein folding [21], as well as the TSP [22]. In comparison to the popular hardware-efficient VQA, the QAOA takes advantage of the domain knowledge of the specific problem at hand to produce a variational ansatz with fewer parameters and a shallower depth. Furthermore, an extension of the original QAOA called the quantum alternating operator ansatz [23,24] offers a generalized approach that specializes in solving problems with hard constraints.
In the NISQ era, the QAOA approach can be particularly advantageous for addressing the challenges of the TSP, owing to the QAOA's hybrid feature, hardware-friendly structure, and controlled optimization. Being a hybrid approach, the QAOA exhibits robust tolerance to systematic errors by leveraging classical computer optimizers. Its layered ansatz structure inspired by the problem Hamiltonian allows for high flexibility in the circuit depth and qubit coherence time, accommodating the capabilities offered by the quantum backends. Compared with the QA [25,26], the QAOA also enables fine control of the optimization through its finite layers, which is particularly beneficial in the current NISQ era. However, the numerical simulation of the QAOA on the TSP, especially in the multiple-layer regime, is not well understood, since the non-adiabatic mechanism of the QAOA differs significantly from QA [27]. Therefore, it becomes imperative to explore various implementations of the QAOA to determine the optimal path for simulation. Conducting investigations of these problems on digital quantum computers or simulators is essential, as it has the potential to unveil new quantum simulation strategies for traditional optimization tasks. We distinguish the present work from previous studies by constructing our QAOA using different ansatzes and comparing their performances in both numerical accuracy and resource cost, which addresses a crucial aspect that is often neglected in conventional studies.
In this work, we study the effectiveness of three distinct designs of the QAOA in solving the TSP by adopting a layerwise learning optimization protocol [28] on digital quantum simulators via Qiskit [29]. We organize this paper as follows: In Sect. II, we introduce the TSP and its mathematical formulation as a binary constraint optimization problem. In Sect. III, we outline the QAOA methods, with particular focus on the initialization, mixer ansatz, and measurement protocol employed in this work. In Sect. IV, we present and compare the numerical results of the QAOA simulation on TSP instances with 3, 4, and 5 cities, utilizing different ansatz designs. We discuss the impact of the device noise and TSP variations on the simulation results. In Sect. V, we summarize the results and discuss plans for the future.

II. TRAVELING SALESMAN PROBLEM
In this section, we first define the TSP as an optimization problem and then improve its formulation by taking advantage of symmetry in the solution.

A. TSP formulation as optimization problem
The Traveling Salesman Problem asks for the shortest path that visits each city exactly once and returns to the starting city. In the symmetric case where the distance between any two cities is the same regardless of the traveling direction, the TSP can be reformulated as an undirected graph problem where the vertices represent cities and the edge weights represent traveling distances. Mathematically, given an undirected graph $G$ with vertices $V$ and edges $E$, i.e., $G = (V, E)$, we aim to find a Hamiltonian cycle that goes through all $|V|$ nodes exactly once with the smallest total weight of the connecting edges on the path.
In this graph formulation of the TSP, any valid cycle, be it minimum or not, can be represented by a visiting order or a permutation of integers, such as $\{0, 1, \ldots, n-1\}$, where the integers are the city indices starting at 0 for a total of $n$ cities. Alternatively, the visiting order on a TSP graph can be conveniently described by a sequence of binary decision variables $x_{i,t}$, indicating whether city-$i$ is visited at time $t$ [30]. If $x_{i,t} = 1$, then city-$i$ is visited at time $t$; otherwise city-$i$ is not visited at that time. Naively, to fully describe the solution to an $n$-city TSP, a total of $n^2$ binary variables is needed in this representation.
Alternatively, this "one-hot" representation of binary decision variables can be written collectively in either matrix or flattened array format for numerical implementation. For instance, a valid Hamiltonian cycle of permutation $(0, 1, 2, 3)$ is translated into binary decision variables $x$ as
$$x = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \equiv 1000\,0100\,0010\,0001, \tag{1}$$
where the matrix row index represents each city index, and the column index represents each time instance. City-$i$ is visited at time $t$ if and only if $x_{i,t} = 1$. In this work, all three notations (permutation, matrix, and bit string array) are used interchangeably. Any Hamiltonian cycle in the TSP has a unique sequence of binary decision variables or "bit string". But the reverse is not true, since a large portion of the possible bit strings may not correspond to any meaningful permutation. Specifically, we classify any bit string $x$ into three categories or states:
$$x \text{ is } \begin{cases} \text{true}, & x \text{ is a permutation and gives the shortest path}, \\ \text{false}, & x \text{ is a permutation but does not give the shortest path}, \\ \text{invalid}, & x \text{ is not a permutation}, \end{cases} \tag{2}$$
where the true and false bit strings are also called valid bit strings. Any bit string can be translated to a Hamiltonian cycle if and only if it is a permutation. Clearly, invalid bit strings correspond to disallowed traveling orders in the TSP.
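As an illustration of the three notations and the three categories, the following minimal Python sketch converts between a permutation and its row-major one-hot bit string and classifies an arbitrary bit string by brute force. The function names and the row-major flattening (bit $ni + t$ holds $x_{i,t}$) are assumptions made for this example; the brute-force search is only practical for a handful of cities.

```python
from itertools import permutations

def perm_to_bits(perm):
    """perm[t] is the city visited at time t; bit n*i + t stores x_{i,t}."""
    n = len(perm)
    bits = ["0"] * (n * n)
    for t, city in enumerate(perm):
        bits[n * city + t] = "1"
    return "".join(bits)

def bits_to_perm(bits, n):
    """Return the visiting order, or None if the bit string is not a permutation."""
    rows = [bits[n * i:n * (i + 1)] for i in range(n)]
    if any(row.count("1") != 1 for row in rows):
        return None  # some city is visited zero or several times
    if any(sum(row[t] == "1" for row in rows) != 1 for t in range(n)):
        return None  # some time slot holds zero or several cities
    order = [None] * n
    for city, row in enumerate(rows):
        order[row.index("1")] = city
    return tuple(order)

def tour_length(perm, w):
    """Length of the closed tour, including the return to the first city."""
    n = len(perm)
    return sum(w[perm[t]][perm[(t + 1) % n]] for t in range(n))

def classify(bits, w):
    """Label a bit string as 'true', 'false', or 'invalid' for distance matrix w."""
    perm = bits_to_perm(bits, len(w))
    if perm is None:
        return "invalid"
    shortest = min(tour_length(p, w) for p in permutations(range(len(w))))
    return "true" if tour_length(perm, w) == shortest else "false"
```

For a 4-city instance, `perm_to_bits((0, 1, 2, 3))` reproduces the flattened identity matrix of the example above.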
With binary decision variables $x$, a true solution to an $n$-city TSP can be found by finding an $x$ that minimizes the following cost function [30],
$$C_{\rm dist}(x) = \sum_{0 \le i,j < n} \omega_{ij} \sum_{t=0}^{n-1} x_{i,t}\, x_{j,t+1}, \tag{3}$$
where $\omega_{ij}$ is the distance (or edge weight in the undirected graph) between city-$i$ and city-$j$,$^1$ and the time index $t+1$ is taken modulo $n$. Here, $C_{\rm dist}(x)$ also gives the shortest TSP distance when $x$ is a true solution. Since the cost function itself does not forbid invalid solutions in general, additional constraint conditions must be satisfied for a valid Hamiltonian cycle, such as
$$\sum_{i=0}^{n-1} x_{i,t} = 1 \quad \text{for all } t, \tag{4}$$
$$\sum_{t=0}^{n-1} x_{i,t} = 1 \quad \text{for all } i, \tag{5}$$
where Eq. (4) forbids multiple cities visited by the traveler at the same time, and Eq. (5) forbids revisiting the same city. Alternatively, in the matrix format, these constraints are easily implemented by requiring that every row and every column sum to exactly one. These two hard constraints are the necessary conditions for any valid solution, though not necessarily a true solution to a TSP. To formulate the TSP as a minimum-optimization problem, these constraint conditions are conveniently incorporated as penalty terms, such that the combined cost function $C(x)$ becomes
$$C(x) = C_{\rm dist}(x) + \lambda \left\{ \sum_{t=0}^{n-1} \Big(1 - \sum_{i=0}^{n-1} x_{i,t}\Big)^2 + \sum_{i=0}^{n-1} \Big(1 - \sum_{t=0}^{n-1} x_{i,t}\Big)^2 \right\}, \tag{6}$$
where $\lambda$ is the weight factor of the penalty term, serving as the Lagrange multiplier. $\lambda$ should be positive and sufficiently large. It is easy to see that a bit string $x$ gives the minimum of $C(x)$ if and only if $x$ is a true solution to the given TSP. Finding a shortest Hamiltonian cycle for the TSP is now equivalent to finding an $x^*$ that minimizes $C(x)$ in Eq. (6), i.e.,
$$x^* = \underset{x}{\arg\min}\; C(x). \tag{7}$$
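The penalty construction can be checked numerically on a small instance. The sketch below is an illustration only: it assumes the standard one-hot form of the cost (distance term with the time index taken modulo $n$, plus $\lambda$ times the squared violations of the two constraints), and the function names are hypothetical. It evaluates the combined cost for every bit string of a small TSP and returns the minimizer.

```python
from itertools import product

def combined_cost(x, w, lam):
    """x[i][t] in {0, 1}; returns distance term + lam * (constraint penalties)."""
    n = len(w)
    dist = sum(w[i][j] * x[i][t] * x[j][(t + 1) % n]
               for i in range(n) for j in range(n) for t in range(n))
    # One city per time slot (column sums equal one).
    pen_time = sum((1 - sum(x[i][t] for i in range(n))) ** 2 for t in range(n))
    # Each city visited once (row sums equal one).
    pen_city = sum((1 - sum(x[i][t] for t in range(n))) ** 2 for i in range(n))
    return dist + lam * (pen_time + pen_city)

def brute_force_min(w, lam):
    """Exhaustive search over all 2**(n*n) bit strings (small n only)."""
    n = len(w)
    best_cost, best_x = None, None
    for bits in product((0, 1), repeat=n * n):
        x = [list(bits[n * i:n * (i + 1)]) for i in range(n)]
        c = combined_cost(x, w, lam)
        if best_cost is None or c < best_cost:
            best_cost, best_x = c, x
    return best_cost, best_x
```

With a penalty weight larger than the tour lengths involved, the minimizer returned by `brute_force_min` has every row and column summing to one, which is the behavior the penalty terms are meant to enforce.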

B. Improved TSP by eliminating rotational symmetry
Symmetry plays a vital role in many graph optimization problems, and exploiting it can help reduce the problem's complexity. In the previously introduced TSP optimization, one uses $n^2$ decision variables for $n$ cities; however, solutions obtained after the optimization display "rotational" symmetry: they are physically identical up to some rotation. For example, a visiting order of permutation $(0, 1, 2)$ is equivalent to $(1, 2, 0)$ and $(2, 0, 1)$ for a 3-city TSP. They form a natural equivalence class on the solution sets. To reduce the size of the search space (and the number of qubits to encode), a simple but significant improvement can be made by fixing the starting city [30].
Without loss of generality, we fix city-0 as our starting point, and the traveling salesman will return to city-0 after visiting all the other cities exactly once. Then, the improved cost functions $C_{\rm dist}(x)$ and $C(x)$ become
$$C_{\rm dist}(x) = \sum_{1 \le i,j < n} \omega_{ij} \sum_{t=1}^{n-2} x_{i,t}\, x_{j,t+1} + \sum_{i=1}^{n-1} \omega_{0i} \left( x_{i,1} + x_{i,n-1} \right), \tag{8}$$
$$C(x) = C_{\rm dist}(x) + \lambda \left\{ \sum_{t=1}^{n-1} \Big(1 - \sum_{i=1}^{n-1} x_{i,t}\Big)^2 + \sum_{i=1}^{n-1} \Big(1 - \sum_{t=1}^{n-1} x_{i,t}\Big)^2 \right\}. \tag{9}$$
In this new cost function, the decision variables $x_{i,t}$ only take indices $i \in \{1, 2, \ldots, n-1\}$ and $t \in \{1, 2, \ldots, n-1\}$, and thus we effectively need only $(n-1)^2$ decision variables for an $n$-city TSP after fixing the initial city. The reduction in the length of the bit string is especially advantageous because it is ultimately equivalent to reducing the number of qubits for encoding the problem on a quantum circuit. Additionally, it is important to point out that this TSP optimization formulation works for a general symmetric TSP and does not rely on a flat surface, so it can be generalized to many real-world applications where non-planar relations are ubiquitous, such as social networks, stock markets, material science, and so forth. The asymmetric TSP ($\omega_{ij} \neq \omega_{ji}$) can in principle be formulated similarly but is not considered within the scope of this work.
There are many other ways to formulate the $n$-city TSP as an optimization problem [31,32], usually requiring more than $n^2$ variables. Recent work [33] using the Hamiltonian Cycle Detection oracle leads to even fewer qubits. Within the $n^2$-variable formulation, an alternative approach to formulating the TSP expresses the cost function in terms of the adjacency matrix [Eq. (10)], where $x^{\rm adj}$ is the adjacency/connectivity representation of a permutation.$^2$ The adjacency matrix representation can be particularly useful in the symmetric TSP because the time degrees of freedom are automatically factored out. Penalty terms for the cost function can be conveniently included by the symmetry about the main diagonal. However, unlike our adopted construction, it is not straightforward to reduce the number of decision variables in Eq. (10), and therefore we leave it for a future study. In the subsequent section, we introduce the quantum approximate optimization algorithm based on the improved TSP optimization formulation according to Eq. (9).
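To make the reduction concrete, the following sketch evaluates a fixed-start cost on an $(n-1) \times (n-1)$ binary matrix. It is a hedged illustration: the split into an "inner" term (steps between cities 1 to $n-1$) and an "ends" term (leaving from and returning to city-0) is our reading of the fixed-start formulation, and the function names are hypothetical.

```python
from itertools import product

def reduced_cost(x, w, lam):
    """x[i-1][t-1] stores x_{i,t} for i, t = 1..n-1; city-0 is fixed as the start."""
    m = len(w) - 1  # number of free cities and of free time slots
    # Steps between consecutive free time slots.
    inner = sum(w[i + 1][j + 1] * x[i][t] * x[j][t + 1]
                for i in range(m) for j in range(m) for t in range(m - 1))
    # Departure from city-0 to the first slot and return from the last slot.
    ends = sum(w[0][i + 1] * (x[i][0] + x[i][m - 1]) for i in range(m))
    pen = (sum((1 - sum(x[i][t] for i in range(m))) ** 2 for t in range(m))
           + sum((1 - sum(x[i][t] for t in range(m))) ** 2 for i in range(m)))
    return inner + ends + lam * pen

def brute_force_reduced(w, lam):
    """Minimum of the reduced cost over all 2**((n-1)**2) bit strings (small n only)."""
    m = len(w) - 1
    return min(reduced_cost([list(b[m * i:m * (i + 1)]) for i in range(m)], w, lam)
               for b in product((0, 1), repeat=m * m))
```

For a 4-city instance the search runs over $2^9$ bit strings instead of $2^{16}$, which mirrors the qubit saving discussed above.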

III. QUANTUM APPROXIMATE OPTIMIZATION ALGORITHM (QAOA)
The quantum approximate optimization algorithm (QAOA) [10,23] is a general quantum heuristic approach for solving optimization problems. In this section, we introduce the QAOA workflow in detail and its application to the TSP formulation introduced in Sect. II B.

A. QAOA workflow
The QAOA is deeply connected with adiabatic quantum computation (AQC) [34], which is based on the adiabatic theorem. In AQC, the whole simulation process can be viewed as a time-dependent Hamiltonian evolution represented by $H(t)$, where
$$H(t) = \left(1 - \frac{t}{T}\right) H_M + \frac{t}{T}\, H_P.$$
Here, $H_M$ represents a known ansatz and $H_P$ is the target Hamiltonian whose ground state one aims to find.
$^2$ For example, the adjacency matrix for the $(0, 1, 2, 3)$ visiting order connects each city to its two neighbors in the cycle: 0-1, 1-2, 2-3, and 3-0.
According to the adiabatic theorem, by gradually introducing perturbation, an initial eigenstate of $H(t = 0) = H_M$ will evolve into the ground state of $H(t = T) = H_P$. However, in practice, simulating this process can be extremely time-consuming, and accurately estimating a suitable duration poses its own challenges. The fundamental idea behind the QAOA is to approximate this adiabatic process by discretizing the infinitely long time evolution into a finite number of parameterized time steps, addressing practical considerations. In both the original QAOA [10] and the extended QAOA [23], the hybrid quantum approach consists of three essential parts:
1. State initialization with initial state $|s\rangle$.
2. Parameterized unitary ansatz $U_p(\vec\beta, \vec\gamma)$, a variational ansatz of $p$ layers for the TSP, based on two alternating Hamiltonians $H_P$ and $H_M$ using respective parameters $\vec\gamma$ and $\vec\beta$.
3. Measurement and optimization of the cost expectation $\langle \vec\beta, \vec\gamma | C(x) | \vec\beta, \vec\gamma \rangle$ for the final state $|\vec\beta, \vec\gamma\rangle$, where an optimizer on a classical computer is used for the minimization.
Putting the three parts together, we construct the complete QAOA circuit, where the final state after the evolution is
$$|\vec\beta, \vec\gamma\rangle = U_p(\vec\beta, \vec\gamma)\,|s\rangle = U_M(\beta_p)\, U_P(\gamma_p) \cdots U_M(\beta_1)\, U_P(\gamma_1)\,|s\rangle, \tag{13}$$
where $p$ is referred to as the depth (or layer number) of the QAOA. Specifically, the two alternating unitary ansatzes in each layer are:
$$U_P(\gamma_i) = e^{-i \gamma_i H_P}, \qquad U_M(\beta_i) = e^{-i \beta_i H_M},$$
where $H_P$ is the problem Hamiltonian derived from the cost function and $H_M$ is the mixer Hamiltonian that explores the feasible subspace. In this work, we refer to the QAOA ansatz with $p$ layers as $p$-QAOA. Note that $\vec\gamma$ and $\vec\beta$ are parameter vectors of length $p$ to be optimized, and there is only one single parameter $\gamma_i$ ($\beta_i$) for the associated unitary ansatz $U_P$ ($U_M$) per layer. This means there are only 2 parameters per layer for the QAOA, independent of the number of qubits (i.e., problem size), which makes the approach highly scalable. These parameters or angles can also be regarded as mimicking the Trotterization time steps in the QAOA to approximate the adiabatic evolution in Eq. (12); nonetheless, the behavior in the finite-layer limit can be drastically different.
In the last few years, many variants of the QAOA approach have emerged [35]. One such variant is the multi-angle QAOA (ma-QAOA) [36], which uses a unique angle for each element of the Hamiltonian. This approach could potentially reduce the circuit depth required for solving the TSP. Another variant, the digitized-counterdiabatic QAOA (DC-QAOA) [37,38], introduces an additional problem-dependent counterdiabatic driving term in each layer to enhance the convergence rate of the optimization process. Additionally, the adaptive QAOA (ADAPT-QAOA) [39], inspired by the adaptive VQE, systematically selects the mixer ansatz based on the optimization, potentially improving the simulation outcome. Since these more advanced QAOAs generally require more than two parameters per layer and additional simulation time, we opted not to incorporate them in this initial work; however, we plan to include these variants in a subsequent study, allowing for a more comprehensive analysis of the QAOA applied to the TSP.
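The layered structure can be illustrated with a small state-vector sketch in plain Python. This is a toy under stated assumptions, not the simulation setup used in this work: the problem Hamiltonian is taken to be diagonal with entries `costs[x]`, the mixer is the sum of Pauli-X operators so that each layer applies $e^{-i\beta X}$ to every qubit, the initial state is the uniform superposition, and all function names are hypothetical.

```python
import cmath
import math

def qaoa_state(costs, gammas, betas):
    """Apply U_M(beta_k) U_P(gamma_k), k = 1..p, to the uniform superposition.

    costs[x] is the (diagonal) problem Hamiltonian value of basis state x,
    so len(costs) must be a power of two.
    """
    dim = len(costs)
    n_qubits = dim.bit_length() - 1
    psi = [1 / math.sqrt(dim)] * dim
    for gamma, beta in zip(gammas, betas):
        # Problem layer: a phase exp(-i * gamma * C(x)) on each basis state.
        psi = [cmath.exp(-1j * gamma * c) * a for c, a in zip(costs, psi)]
        # Mixer layer: exp(-i * beta * X) = cos(beta) I - i sin(beta) X on each qubit.
        cos_b, sin_b = math.cos(beta), -1j * math.sin(beta)
        for q in range(n_qubits):
            psi = [cos_b * psi[idx] + sin_b * psi[idx ^ (1 << q)]
                   for idx in range(dim)]
    return psi

def expectation(costs, psi):
    """Cost expectation of the state psi."""
    return sum(c * abs(a) ** 2 for c, a in zip(costs, psi))
```

A $p$-layer call takes two lists of length $p$, i.e., $2p$ parameters in total regardless of the number of qubits, which is the scaling property noted above.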

B. From binary decision variables to qubits
To carry out the optimization on quantum computers, an efficient qubit encoding scheme is necessary to map the binary decision variables in the TSP formulation to qubits. Here, we use the standard boolean binary variable mapping strategy [40]. For an $n$-city TSP, we simply map
$$x_{i,t} \to \frac{1 - Z_{(i,t)}}{2}, \tag{16}$$
where $Z_{(i,t)}$ is the Pauli-Z matrix (see App. A) at qubit location $(i, t)$ on a two-dimensional lattice. To identify the qubit on the lattice with its realistic index in a quantum device, one may use the ideal mapping (ignoring the device connectivity) that takes $(i, t) \to ni + t$ for the original TSP formulation in Eq. (6). For the improved TSP formulation according to Eq. (9), since both sets of the $i = 0$ and $t = 0$ qubits are never used, we economically map
$$(i, t) \to (n-1)(i-1) + (t-1), \tag{17}$$
such that only a total of $(n-1)^2$ qubits is needed, from index 0 to $(n-1)^2 - 1$, for $n$ cities. Reducing the qubit number is crucial in practical quantum simulation, and therefore we adopt the mapping strategy in Eq. (17) for the improved TSP formulation throughout this work.
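A short sketch of the two index conventions is given below. The row-major layout $(i, t) \to (n-1)(i-1) + (t-1)$ for the improved formulation is an assumption chosen to be consistent with the stated index range 0 to $(n-1)^2 - 1$; the function name is hypothetical.

```python
def qubit_index(i, t, n, fixed_start=True):
    """Map lattice site (city i, time t) to a flat qubit index for an n-city TSP."""
    if fixed_start:
        # City-0 and time-0 are fixed, so only i, t = 1..n-1 carry qubits.
        if not (1 <= i < n and 1 <= t < n):
            raise ValueError("i and t must lie in 1..n-1 when city-0 is fixed")
        return (n - 1) * (i - 1) + (t - 1)
    if not (0 <= i < n and 0 <= t < n):
        raise ValueError("i and t must lie in 0..n-1")
    return n * i + t
```

For $n = 4$, the fixed-start mapping covers the indices 0 to 8 exactly once, compared with 0 to 15 for the original formulation.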

C. State initialization
The initial state is one of the key components in the QAOA approach. In the original QAOA [10], the initial state is always set to be $|+\rangle^{\otimes N}$, where $N$ is the total number of qubits. For an $n$-city TSP, with the original $n^2 = N$ case for simplicity, it means the initial state becomes
$$|s_H\rangle = |+\rangle^{\otimes n^2} = \frac{1}{\sqrt{2^{n^2}}} \sum_{x \in \{0,1\}^{n^2}} |x\rangle.$$
In this way, the initial quantum state $|s_H\rangle$ is a superposition of all possible basis states for the problem. While this strategy is easy to implement on a quantum device using Hadamard gates $H$, the magnitude of each basis state in the initial state shrinks exponentially as the number of cities increases because the dimension of the search space grows as $O(2^N)$.
Recently, additional initialization strategies of a restricted quantum search space following their corresponding mixing ansatzes have been considered in the QAOA. In particular, the so-called $W_N$ states [41] can be especially useful, as they represent one-hot encoding on the quantum circuit suitable for binary decision variables. For example, a $W_3$ state on three qubits is written as
$$|W_3\rangle = \frac{1}{\sqrt{3}} \left( |100\rangle + |010\rangle + |001\rangle \right),$$
where each bit string always sums to one. With the property of the W state, we can construct an improved initial state to satisfy the temporal or spatial constraints of the TSP automatically, i.e., Eq. (4) or Eq. (5),
$$|s_W\rangle = |W_n\rangle^{\otimes n},$$
where the temporal constraint is satisfied by putting together multiple W states in parallel.$^3$ With a sufficiently powerful ansatz, one may also consider a permutation initial state $|s_P\rangle$, a single computational-basis state that encodes one valid permutation and ignores all superpositions, where its construction is simplest, using a few Pauli-X gates. We also considered an equal superposition of all permutation states, representing the minimal Hilbert space containing all the valid solutions; however, we found it to be the most challenging to initialize on the circuit. These choices of initial states provide dramatically different initial search spaces, with dimension going from $O(2^{n^2})$ and $O(n^n)$ to $O(1)$, respectively, along with their set relation $\{|s_P\rangle\} \subset \{|s_W\rangle\} \subset \{|s_H\rangle\}$. Notably, both $|s_H\rangle$ and $|s_W\rangle$ are superpositions of solution states, but $|s_P\rangle$ is not. The selection of initial states plays a vital role in the QAOA, as it can reduce the number of potential candidates in the quantum evolution, albeit at the expense of an increased number of quantum gates. Lastly, these initial states will be used together with their respective mixer Hamiltonians of the QAOA, which are introduced in the next section.
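The sizes of these search spaces can be verified by enumeration for small $n$. The sketch below is an illustration with hypothetical function names; it lists the basis states contained in the product of W states (one 1 per city row) and the subset of them that are permutations, using the original $n^2$-variable layout.

```python
from itertools import permutations, product

def one_hot_rows(n):
    """The n one-hot strings of length n (basis states of a single W_n state)."""
    return ["".join("1" if t == k else "0" for t in range(n)) for k in range(n)]

def w_product_support(n):
    """Bit strings in which every city row holds exactly one 1 (n**n strings)."""
    return {"".join(choice) for choice in product(one_hot_rows(n), repeat=n)}

def permutation_support(n):
    """Bit strings that are valid permutations (n! strings)."""
    rows = one_hot_rows(n)
    return {"".join(rows[slot] for slot in slots)
            for slots in permutations(range(n))}
```

For $n = 3$ this gives 27 strings for the W-state product and 6 permutations, out of $2^9 = 512$ basis states under the Hadamard initialization.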

D. Variational ansatzes
Variational ansatzes are essential in optimizing the quantum state to represent the true solution. The variational ansatz $U_p$ introduced in Eq. (13) consists of two parts:

Problem Hamiltonian
The problem Hamiltonian is the qubitized cost function encoding the specific TSP instance to be solved in the QAOA approach. Specifically, these problem Hamiltonians are obtained by mapping the cost functions (Eq. (6) and Eq. (9)) onto the quantum circuit according to the encoding strategy, Eq. (16), where the obtained operators are a sum of the Pauli-Z and Pauli-ZZ operators, known as the Ising Hamiltonian [43].
Combining them, we obtain $H_P$, the problem Hamiltonian of the TSP instance. As a consequence of qubit encoding, a ground state of $H_P$ is guaranteed to be a true solution state that minimizes the respective TSP cost function. The Ising representation of the Hamiltonian is easily translated into a quantum circuit using a sequence of quantum gates.
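The step from a quadratic binary cost to Pauli-Z and Pauli-ZZ terms follows from substituting $x_a = (1 - z_a)/2$ with $z_a = \pm 1$. The sketch below is a generic illustration rather than the construction used in this work: it assumes the cost has already been expanded into a dictionary of quadratic coefficients (a hypothetical input format), and it returns a constant offset, single-qubit Z coefficients, and two-qubit ZZ coefficients.

```python
from itertools import product

def qubo_to_ising(qubo, n):
    """qubo maps (a, b) to the coefficient of x_a * x_b (a == b for linear terms)."""
    const, h, J = 0.0, [0.0] * n, {}
    for (a, b), c in qubo.items():
        if a == b:
            # x_a * x_a = x_a = (1 - z_a) / 2
            const += c / 2
            h[a] -= c / 2
        else:
            # x_a * x_b = (1 - z_a - z_b + z_a z_b) / 4
            const += c / 4
            h[a] -= c / 4
            h[b] -= c / 4
            key = (min(a, b), max(a, b))
            J[key] = J.get(key, 0.0) + c / 4
    return const, h, J

def qubo_energy(qubo, x):
    return sum(c * x[a] * x[b] for (a, b), c in qubo.items())

def ising_energy(const, h, J, z):
    return (const + sum(hk * zk for hk, zk in zip(h, z))
            + sum(c * z[a] * z[b] for (a, b), c in J.items()))
```

Evaluating both energy functions on every bit string, with $z_a = 1 - 2x_a$, gives identical values, so the minimizing bit string of the binary cost is the ground state of the Ising form.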

Mixer Hamiltonian
The mixer Hamiltonian defines how the state space is to be explored and significantly impacts how the quantum state evolves with each iteration. Based on the Trotter product formula, the mixer Hamiltonian must not commute with the problem Hamiltonian, $[H_M, H_P] \neq 0$, to simulate a Trotterized optimization like the QAOA. Many mixer Hamiltonians have been proposed [24,44,45] for different problems solved via QAOA. For different mixers, appropriate initial states as the eigenstates of the mixer Hamiltonian must be used in accordance with the adiabatic theorem. In evaluating the numerical performance of QAOA for TSP, we consider three types of mixers: X mixer, XY mixer, and Row-swap mixer (RS mixer), with details explained below.
(a) The X mixer is the original mixer proposed in the QAOA that works together with a number of problems such as the Max-cut problem [10]. It takes $|s_H\rangle$ for its state initialization. In the $n$-city TSP, the X mixer is the sum of the Pauli-X operators over all qubits, $H_{M_X} = \sum_{i,t} X_{(i,t)}$. The X mixer strategy proves most useful for quantum annealing applications, especially on practical D-Wave Systems [8]. It is easy to implement on most quantum backends, only requiring $O(n^2)$ single-qubit X-gates per layer in the QAOA.
(b) The XY mixer is another natural candidate for the mixing Hamiltonian, preserving the Hamming weight of the acted qubits [46], which is especially suited to the one-hot encoding realized by the initial state $|s_W\rangle$. Here, we construct the XY mixer for the $n$-city TSP [Eq. (26)] as a sum of XY-gates acting between neighboring time slots within the block of qubits belonging to each city, where the XY-gate is implemented via the Pauli-XX and Pauli-YY gates on the circuit. The block-wise construction allows the conservation of probability for each city in the TSP, reinforcing the satisfaction of the temporal constraint, as in Eq. (4). A generic XY-gate across any two points $(i, t)$ and $(j, s)$ on the 2D lattice combines the products $X_{(i,t)} X_{(j,s)}$ and $Y_{(i,t)} Y_{(j,s)}$, where $X$ ($Y$) is the Pauli-X (Pauli-Y) matrix. Here, one should understand Eq. (26) as a cyclic iteration of the XY-gate. For example, $X_{n-1,n} \equiv X_{n-1,0}$ in the $n$-city case; other variants such as non-cyclic and fully-connected XY-gates can also be used. The XY-gate is often interchangeably referred to as the swap-gate, as they both redistribute the amplitudes between two qubits while preserving the total amplitude of the quantum state. Alternatively, one could use the SWAP-gate [47] instead of the XY-gate to implement the XY mixer; a similar performance is produced, and therefore we choose to use the simpler XY-gate to implement the XY mixer throughout this work. Compared with the X mixer, the XY mixer is more expensive to implement, requiring $O(n^2)$ two-qubit XY-gates per layer.
(c) The Row-swap (RS) mixer has recently been proposed in the QAOA as a means of embedding hard constraints directly into the mixer Hamiltonian [23,24]. Although the RS mixer also uses the XY-gate, it simultaneously swaps all non-overlapping rows of qubits (corresponding to different cities) as a whole. The RS mixer can be represented as a double sum over pairs of cities of a product over time slots, where the two sums represent all possible swaps between city-$i$ and city-$j$, and the product denotes the simultaneous swap of all corresponding entries in the associated cities. In this way, the RS mixer is capable of exploring the entire space of valid solutions when initialized on any single valid state, i.e., $|s_P\rangle$. However, it should be noted that the RS mixer incurs a significant computational cost during the simulation due to the involvement of many tensor products of the Pauli-XX or Pauli-YY matrices. One can mitigate this expense by relying on a set of creation and annihilation operators constructed from four-qubit gates [24]. Nevertheless, the $H_{M_{RS}}$ ansatz remains computationally expensive, requiring $O[(n-1)(n-2)/2]$ four-qubit gates per layer, with each four-qubit gate itself being expensive to construct.
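The weight-preserving property of the XY-gate can be seen directly from its two-qubit matrix. The sketch below builds $(XX + YY)/2$ from Kronecker products in plain Python; the prefactor $1/2$ is a normalization chosen for this illustration and may differ from the convention used in the mixer definitions above, and the helper names are hypothetical.

```python
def kron(a, b):
    """Kronecker product of two square matrices given as nested lists."""
    return [[a[i][k] * b[j][l] for k in range(len(a)) for l in range(len(b))]
            for i in range(len(a)) for j in range(len(b))]

PAULI_X = [[0, 1], [1, 0]]
PAULI_Y = [[0, -1j], [1j, 0]]

def xy_operator():
    """(X⊗X + Y⊗Y) / 2 in the basis |00>, |01>, |10>, |11>."""
    xx, yy = kron(PAULI_X, PAULI_X), kron(PAULI_Y, PAULI_Y)
    return [[(xx[r][c] + yy[r][c]) / 2 for c in range(4)] for r in range(4)]

def preserves_hamming_weight(op):
    """True if every nonzero entry connects basis states with equal numbers of 1s."""
    return all(op[r][c] == 0 or bin(r).count("1") == bin(c).count("1")
               for r in range(len(op)) for c in range(len(op)))
```

The resulting operator exchanges $|01\rangle$ and $|10\rangle$ and annihilates $|00\rangle$ and $|11\rangle$, so a one-hot row stays one-hot under the mixer, whereas a single Pauli-X flips one bit and changes the weight.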

E. Measurement and optimization protocol
Based on the unitary ansatz and their appropriate initial states, the cost expectation of the QAOA is evaluated by measurements performed on quantum devices and subsequently optimized using gradient-free optimizers such as COBYLA [48][49][50] and SPSA [51,52]. The optimization process continues until convergence or until the maximum iteration threshold is reached. The resulting solution to the TSP is then determined by identifying the most dominant quantum state (or binary decision variable encoded in bit string). To account for statistical fluctuations in measurements, we run each quantum simulation multiple times (typically 5-10) with different random seeds and report the result with the lowest converged expectation value. Considering that the expectation values are TSP-specific, we use the standard evaluation metric, the approximation ratio (AR), to evaluate the performance by normalizing against the ideal cost in different TSPs. The AR is calculated as
$$\mathrm{AR} = \frac{\langle \vec\beta, \vec\gamma | C(x) | \vec\beta, \vec\gamma \rangle}{C(x^*)},$$
where a lower AR corresponds to a lower expectation cost, indicating a closer approximation to the exact solution.
Classical optimizers play a vital role in the optimization, and their advantages can be further utilized in the QAOA. The cost values of individual bit strings are cached and retrieved on the classical side to enable fast computation of the final cost expectation during each iteration. The option to use constraint bounds of $[0, 2\pi)$ for the ansatz parameters in the case of COBYLA can also accelerate the convergence, which is the main reason we primarily focused on simulations using the COBYLA optimizer in our study, although a comprehensive analysis with other available optimizers can be explored in future research.
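The caching idea and the approximation ratio can be sketched in a few lines. This is an illustration only: it assumes measurement results are available as a dictionary from bit strings to shot counts, that the AR is the cost expectation divided by the optimal cost, and that a memoizing wrapper such as `functools.lru_cache` stands in for whatever cache the actual implementation uses. All names are hypothetical.

```python
from functools import lru_cache

def make_cached_cost(cost_fn):
    """Wrap a bit-string cost function so repeated bit strings are evaluated once."""
    return lru_cache(maxsize=None)(cost_fn)

def expectation_from_counts(counts, cached_cost):
    """Shot-weighted average cost; counts maps bit string -> number of shots."""
    shots = sum(counts.values())
    return sum(cached_cost(bits) * c for bits, c in counts.items()) / shots

def approximation_ratio(counts, cached_cost, optimal_cost):
    """Cost expectation normalized by the optimal cost (lower is better)."""
    return expectation_from_counts(counts, cached_cost) / optimal_cost
```

Because the same bit strings recur across optimizer iterations, the cost of each one is computed only on its first appearance and looked up afterwards.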
To optimize the QAOA, we employed the layerwise learning (LL) protocol introduced in Ref. [28]. In comparison to complete depth learning (CDL), LL proved to be advantageous in reducing the optimization cost, particularly as the number of qubits and the circuit depth increase. It also helps mitigate the likelihood of barren plateaus (BP) [28]. In short, the LL is a two-part optimization protocol, as illustrated in Fig. 1:
(A) Progressive pretraining: In the first part (Fig. 1(a)), we construct the QAOA ansatz by gradually adding layers. Initially, we train and optimize over the leading few layers (typically two layers). Then, for a $p$-layer QAOA simulation, we freeze the parameters in the first $(p-1)$ layers, obtained from previous simulations, and exclusively optimize the parameters in the $p$-th layer. Optimal parameters of the current layer that yield the lowest cost expectation are selected.$^4$ This progressive optimization protocol proves to be efficient and leads to an increasingly optimized solution as the number of layers increases. It also reduces the computational cost of parameter searching for very deep circuits. We denote this protocol with the letter A and an integer to indicate the depth being optimized.
(B) Randomized retraining: In the second part (Fig. 1(b)), we take the pre-trained QAOA ansatz from part (A) and randomly select a larger portion of the parameters to be trained at a time. Typically, we free 50% of the parameters in each iteration of retraining. Although more computationally expensive, this retraining is still less costly than the CDL, and allows us to train the QAOA ansatz as a whole, mitigating the risk of getting trapped in local minima, which could occur when using the protocol of part (A) exclusively. We use the protocol name B with a number to indicate which iteration of retraining is being conducted.
It should be mentioned that there are also other variations of the LL, such as sequential blockwise learning used in Ref. [53], where one block/layer is optimized at a time while fixing all other blocks. Layerwise learning may also be prone to systematic layer saturations [54] that require special treatments, which we leave for a future study. For the numerical results presented in this work, we always use the LL optimization protocol, as it consistently outperforms CDL in both computational cost and solution accuracy. In Fig. 2, we show an example of the layerwise learning applied to the QAOA with the X mixer, showing the optimization in both one protocol step and the full LL procedure; similar performance is also found for other mixers.
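The two parts of the protocol can be sketched generically as follows. This is a schematic under several assumptions and not the implementation used in this work: the parameters are stored as $[\gamma_1, \beta_1, \gamma_2, \beta_2, \ldots]$, `make_objective(p)` is a hypothetical user-supplied factory returning the cost expectation of a $p$-layer ansatz, new layers start at zero angles, and a simple random local search stands in for the gradient-free optimizer.

```python
import random

def local_search(f, params, free, rng, steps=200, step=0.3):
    """Stand-in gradient-free optimizer: perturb only the free parameters
    and keep a trial point whenever it lowers the objective."""
    best, best_val = list(params), f(params)
    for _ in range(steps):
        trial = list(best)
        for k in free:
            trial[k] += rng.uniform(-step, step)
        val = f(trial)
        if val < best_val:
            best, best_val = trial, val
    return best, best_val

def layerwise_learning(make_objective, p_max, p_start=2, retrain_rounds=3,
                       frac=0.5, seed=0):
    rng = random.Random(seed)
    # Part A: train the leading layers, then add one layer at a time,
    # freezing all earlier layers and optimizing only the newest one.
    params = [0.1] * (2 * p_start)
    params, val = local_search(make_objective(p_start), params,
                               range(2 * p_start), rng)
    history = [val]
    for p in range(p_start + 1, p_max + 1):
        params = params + [0.0, 0.0]  # new (gamma_p, beta_p)
        params, val = local_search(make_objective(p), params,
                                   [2 * p - 2, 2 * p - 1], rng)
        history.append(val)
    # Part B: repeatedly free a random fraction of all parameters.
    f = make_objective(p_max)
    for _ in range(retrain_rounds):
        k = max(1, int(frac * len(params)))
        free = rng.sample(range(len(params)), k)
        params, val = local_search(f, params, free, rng)
        history.append(val)
    return params, history
```

Each call to `local_search` touches only the listed indices, so the number of parameters being optimized stays small in part (A) and at about half of the total in part (B), compared with all $2p$ parameters at once in complete depth learning.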

IV. NUMERICAL RESULTS
With both the TSP optimization and QAOA method introduced, we perform numerical quantum simulation on the IBM Quantum QASM simulator using aer.QasmSimulator. The problem and mixer Hamiltonian operators are constructed using the qiskit.opflow library. For the circuit implementations of the three mixers, we use Pauli-X gates for the X mixer and Pauli-XX and Pauli-YY gates for the XY mixer, and use the PauliEvolutionGate library for the RS mixer. We focus on quantum simulations using the layerwise learning protocol for 3-, 4-, and 5-city TSPs on a sufficiently powerful local Ubuntu machine$^5$, and compare their performances in terms of numerical accuracy and resource costs. To obtain converged results, we always use a sufficient number of TSP instances, varying from 7 to 10 graphs depending on the number of cities, mixer, and simulated noise, each with 5-10 repeated runs of quantum simulation.

A. Simulation accuracy
We follow the LL optimization protocol introduced in Sect.III E and use (n − 1) 2 qubits based on the improved TSP formulation (Eq.( 9)) for each quantum simulation with n cities.In Fig. 3, we present the QAOA simulation results when solving various instances of 4-city and 5-city TSPs using the X, XY, and XY mixers.The performance is evaluated with three criteria: (a) approximation ratio (AR), (b) percentage of the true solution, and (c) rank of the true solution.The two-part LL optimization is indicated by letters A and B, followed by the specific depth and iteration numbers, respectively.We use a sufficient number of layers in the QAOA simulation (4 layers for 3-city cases and 6 for 4-/5-city cases) to ensure convergence.The uncertainty bars depicted in Fig. 3 represent the standard deviations of the respective results calculated for various TSP graph instances.A comprehensive comparison of all the results can be found in Table .I, which includes the results for the 3-city TSP simulations as well.
(a) Approximation ratio (AR): Expectation cost, or equivalently AR, is the primary observable measured during the quantum simulation.It directly influences the classical optimizer's ability for finding the optimal parameters.Fig. 3(a) and Fig. 3(b) demonstrate that both pretraining and retraining parts of the LL are necessary to optimize the AR for various TSP instances.Among the three types of QAOA mixers, the RS mixer achieves the lowest AR, reaching values as low as 1.01±0.01(4-city case) and 1.18±0.14(5-city case).On the other hand, the X mixer performs the poorest, particularly as the problem size increases, partially due to the limitations of the ansatz's expressibility.It is worth noting that even the heuristic VQE ansatz outperforms the X mixer in the 4-city case, with a lower AR around 2.19 ± 0.37 compared to 2.33 ± 0.83 (see Table.I).With consideration of temporal constraints during construction, the XY mixer exhibits intermediate performance, with AR values of around 1.44 ± 0.23 and 1.89 ± 0.66 for 4-and 5-city TSPs, respectively.
(b) True percentage: The percentage of the true solution is also known as the overlap between the quantum state and the expected true solution. While the true percentage is determined only after the simulation, it is desirable to have it as large as possible for the accurate extraction of the optimal solution. In Fig. 3(c) and Fig. 3(d), we present the true percentages for the three mixers as the TSP problem size increases. RS is clearly the dominating mixer, reaching around 96.3 ± 3.8% and 41.1 ± 29.5% for 4- and 5-city TSPs, respectively; however, the large uncertainty in the 5-city case suggests a highly unstable pattern in the obtained solution; see App. B for an explanation. On the other hand, the X mixer gives the lowest percentages, reflecting a poor performance in accurately identifying the true solution. Lastly, the XY mixer again holds a middle ground, with true percentages of approximately 36.6 ± 5.7% and 7.4 ± 0.6%, respectively.
(c) Rank: The rank of the true solution specifies how many other states possess a higher probability than the state corresponding to the true solution, and is a crucial indicator of the simulation's accuracy. Achieving a rank of 1 for the true solution signifies consistent identification of the correct solution, as it means the quantum state with the highest probability is always the true solution's state (so we want the rank to be as low as possible). The results are presented in Fig. 3(e) and Fig. 3(f). The X mixer exhibits a significantly high rank, indicating a low likelihood of picking the correct solution among the top quantum states. On the other hand, the ranks of the XY and RS mixers are comparable, both reaching around rank-1 for 4-city TSPs and around rank-2 for 5-city TSPs. Notably, for the 5-city case, we observe lower ranks of the XY mixer in the early stages compared to the final stages, showcasing the effectiveness of the XY mixer in even shallower QAOA for certain TSP instances.
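The three criteria above can be illustrated with a minimal Python sketch. It assumes that the AR is the expected cost of the measured distribution divided by the optimal tour cost (so that lower is better, consistent with the values quoted above) and that the rank is one plus the number of states with strictly higher probability than the true solution. The function names and the toy distribution are hypothetical and are not taken from the simulations reported here.

```python
def approximation_ratio(probs, cost, optimal_cost):
    """Expected cost over the measured distribution divided by the optimal cost.

    Assumption: AR >= 1 with 1 being ideal, matching how AR is reported in the text.
    """
    expected = sum(p * cost[bits] for bits, p in probs.items())
    return expected / optimal_cost


def true_percentage(probs, true_state):
    """Probability (in percent) of measuring the bit string of the true solution."""
    return 100.0 * probs.get(true_state, 0.0)


def true_rank(probs, true_state):
    """Rank 1 means no other state is more probable than the true solution."""
    p_true = probs.get(true_state, 0.0)
    return 1 + sum(1 for p in probs.values() if p > p_true)


if __name__ == "__main__":
    # Toy numbers for illustration only (hypothetical, not simulation output).
    probs = {"0110": 0.5, "1001": 0.3, "1111": 0.2}
    cost = {"0110": 10.0, "1001": 12.0, "1111": 40.0}
    print(approximation_ratio(probs, cost, optimal_cost=10.0))
    print(true_percentage(probs, "0110"))
    print(true_rank(probs, "0110"))
```

Defining the rank through a strict inequality means that ties with the true solution do not worsen its rank; a different tie-breaking rule would be an equally reasonable choice.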
TABLE I. Comprehensive comparison of the numerical accuracy for the QAOA mixers and heuristic ansatzes used to solve the TSP. The standard deviations of the quantities obtained from variation in TSP graphs are provided in parentheses.
Problem-specific VQE to the TSP such as Ref. [55]
a Estimating the circuit depth exactly is difficult for the RS mixer. In comparison, it appears linearly increasing with a large slope of 1181 for small city numbers.

Based on the observations in AR, true percentage, and rank, several conclusions can be made. First, we can see that X mixers consistently underperform in all three criteria, compared to the other two mixers. This behavior is expected because the Hadamard initialization produces a uniform superposition of all possible states, i.e., $2^{16}$ states in the 5-city case, without any constraints on the solution. As a result, it becomes challenging for the classical optimizer to filter out the invalid and false solutions based solely on the problem Hamiltonian. Particularly when the problem size increases, the X mixer alone is not suitable for the QAOA simulation of the TSP. Secondly, we observe that the RS mixer stands out as the dominating mixer in terms of AR and true percentage, which makes it a reliable candidate for QAOA. In terms of ranks, the performances of the XY and RS mixers are quite similar. The strategies employed by the two mixers are very different: the RS mixer relies heavily on the expressibility of the mixer itself, while the XY mixer combines the initialization and the mixing Hamiltonian to achieve its results. By utilizing a single bit string as the initial state, RS may potentially overlook the benefits of having superposition states in a quantum simulation. In a sense, the XY mixer takes a more balanced approach, whereas the RS mixer takes a more assertive approach; this distinction between the two mixers can have implications for the resource cost, which will be discussed in the following section.
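To give a sense of how sparse the valid solutions are in the unconstrained space seen by the Hadamard-initialized X mixer, the following Python sketch counts feasible bit strings by brute force. It assumes that the $(n-1)^2$ qubits encode an $(n-1) \times (n-1)$ permutation matrix with the first city fixed, which is a common one-hot encoding and may differ in detail from Eq. (9). The function names are hypothetical.

```python
from itertools import product


def is_permutation_matrix(bits, m):
    """True if the flat 0/1 tuple, read as an m x m matrix, has one 1 per row and column."""
    rows = [bits[i * m:(i + 1) * m] for i in range(m)]
    return all(sum(r) == 1 for r in rows) and all(sum(c) == 1 for c in zip(*rows))


def count_valid_states(n_cities):
    """Brute-force count of feasible bit strings among all 2**((n-1)**2) basis states."""
    m = n_cities - 1  # assumption: the first city is fixed, leaving m cities to order
    return sum(is_permutation_matrix(bits, m) for bits in product((0, 1), repeat=m * m))


if __name__ == "__main__":
    for n in (3, 4, 5):
        total = 2 ** ((n - 1) ** 2)
        print(n, total, count_valid_states(n))
```

Under this assumed encoding the feasible set grows as $(n-1)!$ while the full space grows as $2^{(n-1)^2}$, so a uniform superposition places only a small fraction of its weight on valid tours.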

B. Resource evaluations
Besides numerical accuracy, resource cost estimation is another crucial factor to consider in quantum simulation, as any computational resource is always finite. On the quantum computer and simulator, many factors will contribute to the performance of the simulation, including attributes of the transpiled quantum circuits, such as the number of qubits, the number of single-qubit (double-qubit) gates, and the quantum circuit depth. In a practical calculation, properties of the quantum device, such as qubit connectivity, coherent error, and incoherent noise, will also come into play. For this section, we focus on the quantum circuits of the three QAOA mixers and compare their resource costs on ideal devices; practical calculation will be discussed in the subsequent section using noisy simulation.
In Table II, we compare the properties of the quantum circuits of the three mixers after transpilation for both finite and generic TSP cases. As expected, the complexity of the circuit, measured in terms of quantum gates and circuit depth, generally increases with the number of cities, resulting in a longer simulation time. Notably, the RS mixer incurs a significantly higher resource cost compared to the X and XY mixers, as reflected in the simulation time in practice. As discussed earlier, this increased cost is primarily attributed to the utilization of four-qubit gates in the RS mixer, leading to a quartic scaling in the number of cities, i.e., $O(n^4)$, of single-qubit and double-qubit gates. The abundance of double-qubit gates is anticipated to pose serious challenges in executing the simulation on a real quantum device or when employing a noise model [56]. Interestingly, despite requiring fewer resources, the X mixer actually takes a longer time to run in practice compared to the XY mixer, particularly as the number of qubits increases. This observation is likely due to the computational burden of the optimizer when evaluating the expected cost for a dense superposition of bit strings. On the other hand, the XY mixer requires relatively low computational resources, with a circuit depth that scales linearly and a gate count that scales quadratically, which makes it a more economical choice for running QAOA simulations. Considering both optimization accuracy and computational cost, the XY mixer emerges as a more balanced choice for the QAOA. Nonetheless, a resource cost of $O(n^2)$ gates and qubits for the XY mixer is still quite expensive as n increases. Notably, building the XY mixer at the pulse level [57] has the potential to further enhance its numerical performance. Lastly, it should be acknowledged that the resource costs of all mixers would be even higher when simulating on current NISQ or future fault-tolerant quantum computers. To address this aspect, we present noise-model simulations in the subsequent section.

C. Robustness against noise
Estimating the performance of the QAOA simulation in the presence of noise is crucial to implementations on NISQ and fault-tolerant devices in the future. In this section, we employ the NoiseModel class from Qiskit to study the sensitivity of the simulation to different noise levels. In particular, we focus on noisy QAOA simulations with XY and RS mixers for the same set of 4-city TSP problems. In Fig. 4, we compare the performance of various noise simulations in terms of AR, true percentage, and rank. We consider noise models with different degrees of single-qubit errors: 0.005%, 0.01%, 0.05%, and 0.1%. Besides single-qubit errors, we set the double-qubit errors to be ten times their respective single-qubit errors, which is a reasonable approximation for realistic two-qubit gates such as the CX-gate. For the current study, we have omitted other potential errors for simplicity, such as the qubit connectivity and thermal relaxation time, which can also be implemented with the noise model.
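As a rough illustration of why gate counts matter at these noise levels, the Python sketch below tabulates the four error settings and estimates the probability that a circuit runs without any gate error. It assumes independent gate errors, which is a back-of-the-envelope simplification and not the Qiskit noise model used in the simulations. The gate counts in the example are hypothetical placeholders rather than values from Table II.

```python
# Single-qubit error rates quoted in the text, in percent.
SINGLE_QUBIT_ERRORS_PERCENT = (0.005, 0.01, 0.05, 0.1)
# Double-qubit errors are set to ten times the single-qubit errors.
TWO_QUBIT_FACTOR = 10


def noise_levels(percents=SINGLE_QUBIT_ERRORS_PERCENT, factor=TWO_QUBIT_FACTOR):
    """Return (single-qubit, double-qubit) error probabilities as fractions."""
    return [(p / 100.0, factor * p / 100.0) for p in percents]


def error_free_probability(p1, p2, n_single, n_double):
    """Probability that no gate fails, assuming independent gate errors (a simplification)."""
    return (1.0 - p1) ** n_single * (1.0 - p2) ** n_double


if __name__ == "__main__":
    # Hypothetical gate counts for illustration only.
    n_single, n_double = 200, 100
    for p1, p2 in noise_levels():
        print(p1, p2, error_free_probability(p1, p2, n_single, n_double))
```

Because the double-qubit error is ten times larger, the double-qubit gate count dominates this estimate, which is one way to see why circuits with many double-qubit gates are expected to be more sensitive to noise.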
From the results presented in Fig. 4, it is evident that the qubit errors in the noise model directly impact the quality of the simulation. As expected, QAOA simulations with larger errors perform poorly compared to smaller ones. Interestingly, there seems to be a noise threshold in the simulation results: noisy simulations with error less than or equal to 0.01% exhibit qualitatively different behavior compared to those with higher errors, as shown in Fig. 4(a), Fig. 4(b), and Fig. 4(c). Additional details of the noisy simulation are provided in Table III, where we can clearly see that the LL protocol fails to optimize the QAOA simulation at 0.1% and 0.05% noise levels. Comparing the two ansatzes, the XY mixer outperforms the RS mixer in all indicators for all noisy simulations. Surprisingly, the XY mixer achieves performance similar to the ideal simulation with errors less than or equal to 0.01%, indicating its potential resilience against simulation noise. Our result suggests that the XY mixer is a more suitable choice among the three mixers when considering noise effects in the QAOA simulation.

D. Problem dependence
It is important to investigate the problem dependence of the QAOA simulation of the TSP in preparation for the full-fledged quantum simulation. Here, we study several TSP problem dependencies, such as the topology of the TSP graphs and the penalty weight. The topology of the TSP graph could potentially have a significant impact on the performance of the quantum simulation algorithm. One characteristic we consider is the "skewness" of the TSP graphs, which represents the level of asymmetry. To measure the skewness, we analyze the distribution of the edge weights $\omega_{ij}$ in the graph using Fisher-Pearson's moment coefficient [58,59]. Specifically, we calculate the skewness parameter g by

$$ g = \frac{m_3}{m_2^{3/2}}, \qquad m_k = \frac{1}{|\omega|} \sum_{i<j} \left( \omega_{ij} - \bar{\omega} \right)^k, $$

where $\bar{\omega}$ is the mean of the edge weights and $|\omega|$ is the total number of edges in the graph. Here, $m_3$ is the third moment of the edge weights, and $m_2$ is the variance, the square of the standard deviation. Intuitively, the skewness can also be computed as the average value of the cubed z-scores. For instance, a skewness value of 0 indicates a symmetric/normal distribution of the edge weights, and skewness values greater than 1 or less than -1 typically indicate highly skewed distributions. Negative (positive) skewness indicates a left-skewed/right-leaning (right-skewed/left-leaning) distribution. In Fig. 5, we present the quantum simulation using X, XY, and RS mixers on various 4-city TSP graphs with varying skewnesses. Here, we focus on the approximation ratio in the final step of layerwise learning to assess the dependence on the TSP graph's skewness. Taking every bit string solution into account, AR represents the overall effectiveness of the simulation, which makes it suitable for analyzing the skewness. We observe that the simulation tends to perform less effectively with right-skewed edge distributions, possibly due to the presence of more low-weight edges in positively skewed graphs. Further investigations that include sampling uncertainties are necessary to fully study the consequences of varying graph topology for TSPs with more cities.
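The skewness of a TSP graph as described above can be computed with a short Python sketch. It uses the population (biased) form of the Fisher-Pearson moment coefficient, i.e., the average of the cubed z-scores; whether a bias-corrected variant was used in the simulations is not stated, so this choice is an assumption. The function name and the example edge weights are hypothetical.

```python
def edge_weight_skewness(weights):
    """Fisher-Pearson moment coefficient g = m3 / m2**1.5 of a list of edge weights."""
    n = len(weights)
    if n == 0:
        raise ValueError("at least one edge weight is required")
    mean = sum(weights) / n
    m2 = sum((w - mean) ** 2 for w in weights) / n  # variance (second central moment)
    m3 = sum((w - mean) ** 3 for w in weights) / n  # third central moment
    if m2 == 0:
        raise ValueError("skewness is undefined when all edge weights are equal")
    return m3 / m2 ** 1.5


if __name__ == "__main__":
    # A 4-city symmetric TSP graph has 6 edges; these weights are made up.
    right_skewed = [2, 3, 3, 4, 5, 19]  # many light edges, one heavy edge
    left_skewed = [1, 15, 16, 17, 17, 18]  # many heavy edges, one light edge
    print(edge_weight_skewness(right_skewed))
    print(edge_weight_skewness(left_skewed))
```

A positive value corresponds to a right-skewed distribution with many low-weight edges and a few heavy ones, which is the regime where the simulations are observed to perform less effectively.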
Additionally, the penalty weight λ in the TSP cost equation (Eq. (9)) is essential to examine, for it directly controls the gaps between valid and invalid solutions. Through an analysis similar to that for the skewness, we find that the simulation performs optimally when λ is in the range $[1.0\,E_{G,\max},\ 4.5\,E_{G,\max}]$, where $E_{G,\max}$ represents the maximum TSP edge weight. This analysis further supports the choice of the penalty weight used in this study.

V. SUMMARY AND DISCUSSIONS
In this paper, we solved the symmetric TSP (Traveling Salesman Problem) as an optimization problem by using three distinct ansatzes for the QAOA (Quantum Approximate Optimization Algorithm) approach. By adopting a layerwise learning optimization protocol, we performed numerical quantum simulations on gate-based quantum simulators for various 3-, 4-, and 5-city TSPs. In particular, we presented and compared the performance of the three types of mixer ansatzes for the QAOA: the X mixer, the XY mixer, and the RS mixer. For the few-city TSPs studied in this work, we demonstrated that a well-balanced quantum simulation, such as one using the XY mixer, is potentially more suitable in terms of both numerical accuracy and computational cost. These findings are further validated through the noise model simulations. Additionally, we highlighted other factors that may play a role in the quantum simulation, such as the TSP graph skewness and cost function penalty.
Our research is a significant step towards finding a successful strategy for the TSP optimization problem using the gate-based QAOA approach, which is of particular interest in the current NISQ paradigm. The QAOA simulation complements traditional quantum annealing methods in the infinite time region, where efficient qubit reduction techniques, improved optimization protocols, and resource-efficient mixer ansatzes investigated in this work are expected to be valuable for realistic quantum device simulations. Moving forward, we plan to extend our investigations to larger-city TSPs, employing deeper QAOA circuits on noisy quantum backends. By utilizing an adaptive shot-frugal optimizer [60] and implementing digitized-counterdiabatic quantum approximate optimization methods [37,38], we aim to further enhance the accuracy and efficiency of our TSP simulations.

FIG. 1. Two-part layerwise learning protocol for the QAOA. Horizontal lines represent the qubits; rectangular boxes are the unitary operators. Fixed parameters are in black; free parameters are in red.

FIG. 2. Example of the layerwise learning protocol applied to the QAOA X mixer simulation. Left panel (a) shows the simulation for a selected LL protocol step A2 of a specific TSP instance, TSP-3. Right panel (b) shows the overall LL optimization for six different TSP instances. Here, a TSP instance is a random TSP graph with 3 nodes and a maximum edge weight of 20.

FIG. 3. Performance comparison of the three QAOA mixers for samples of the 4-city TSP (left column) and 5-city TSP (right column). In both cases, we compare the AR in panels (a-b), the percentage of the true solution in panels (c-d), and the rank of the true solution in panels (e-f). The uncertainty bars are standard deviations obtained from simulations of different TSPs.

FIG. 4. Noisy QAOA simulation results of the XY and RS mixers compared with the noise-free simulation of the 4-city TSP graph. In the legend, we show the single-qubit error used for each noisy simulation. The uncertainty bars/bands are standard deviations obtained from simulations of different TSPs. The same scale is used for XY and RS, except for the plot of their ranks.

FIG. 5. Results of the approximation ratio (AR) of the three QAOA mixers for the 4-city TSP (panel a) and the 5-city TSP (panel b) of distinct graph skewnesses.

FIG. 6. Performance comparison among the three mixers on a single TSP graph. The respective rank of the true solution in each optimization step is included at the top of the percentage.

may produce a significantly better result.
TABLE II. Quantum resource estimation per QAOA layer of various mixers considered in the 3-, 4-, and 5-city TSPs. The circuit depth, the count of single-qubit gates, and the count of double-qubit gates are evaluated after transpilation (light transpilation, no approximation with qiskit.compiler.transpile) to the standard basis gate set {CX, I, RZ, SX, X} used by IBM Quantum. Exact numbers for the circuit depths and quantum gates are obtained whenever available; otherwise, asymptotic scalings are provided.

TABLE III. Details of noisy simulation for the 4-city TSP case. Noise percentage refers to single-qubit errors used in the noise model simulation. The standard deviations of the quantities obtained from variation in TSP graphs are provided in parentheses.