1. Introduction
The traveling salesperson problem (TSP) is a famous computationally difficult problem in mathematics. The TSP involves a salesperson who wants to visit a number of cities to sell their goods. The salesperson knows the distance between each pair of cities and wants to make a travel itinerary that passes through each city following the shortest possible route. While the problem might appear simple, it is known to be NP-hard. The challenge is that the number of possible routes grows as $N!$, where $N$ is the number of cities. Thus, the brute force method of calculating the distance of each route is intractable on a classical computer for even a relatively small set of cities. Many classical algorithms have been developed that approximately solve the TSP, such as branch and bound techniques [1,2,3,4] and other heuristic algorithms [5,6,7,8,9,10,11]; however, there is no known classical algorithm that solves the general TSP exactly with polynomial resources.
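To make the factorial growth concrete, the following sketch enumerates every ordering of the cities by brute force. The function name `brute_force_tsp` and the 4-city distance matrix are illustrative assumptions, and the route is treated as an open path with no return leg.

```python
import itertools
import math

def brute_force_tsp(dist):
    """Return (best_length, best_route) by checking every ordering of the cities.

    dist[i][j] is the distance from city i to city j. Assumption: the route
    is an open path, so there is no return leg to the starting city.
    """
    n = len(dist)
    best_length, best_route = math.inf, None
    for route in itertools.permutations(range(n)):
        length = sum(dist[a][b] for a, b in zip(route, route[1:]))
        if length < best_length:
            best_length, best_route = length, route
    return best_length, best_route

# Hypothetical symmetric distance matrix for 4 cities.
dist = [
    [0, 1, 4, 3],
    [1, 0, 2, 5],
    [4, 2, 0, 1],
    [3, 5, 1, 0],
]
best_length, best_route = brute_force_tsp(dist)
n_routes = math.factorial(len(dist))  # number of orderings examined
```

Already at 20 cities the loop would have to visit `math.factorial(20)`, roughly 2.4e18, orderings.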
Quantum computing may provide improved algorithms for solving the TSP. Imagine a salesperson who can traverse each of the routes simultaneously. If the salesperson also has a method for finding which of the simultaneously traveled routes is shortest then they could find the shortest route in a single check. This is similar to how a quantum computer operates. A quantum computer can be in a superposition of exponentially many states. If the routes of the TSP are somehow encoded onto the states of a quantum computer, then the quantum computer can simulate our salesperson who can travel all routes simultaneously. The question is, how do we use the quantum computer to find which of the simultaneously traveled routes is shortest?
If we want the quantum computer to return the distance of each route, we would have to take exponentially many measurements of the quantum computer. Therefore, we need a cost function that evaluates a metric for a given quantum superposition that (1) is minimized when the quantum computer is in the state that represents the shortest route and (2) is efficient in the number of required quantum measurements. A typical solution is to derive an operator whose expectation value is minimized for the quantum state that represents the shortest route. This method has been used in a variety of quantum computing algorithms.
There are many quantum algorithms that promise increased efficiency over corresponding classical algorithms. One such algorithm is Grover’s algorithm [12], which offers a quadratic speedup for database search problems. Grover’s algorithm was the first to be proposed as offering a quadratic speedup over classical algorithms for solving the TSP [13,14,15]. However, these algorithms only work when certain conditions are satisfied. Another quantum algorithm is the Quantum Phase Estimation (QPE) algorithm [16], which has been proposed as an addition to Grover’s algorithm in order to generalize to any TSP [17]. However, the QPE algorithm requires substantial quantum resources, far beyond those currently available. Similarly, the Quantum Adiabatic Algorithm (QAA) [18,19,20,21,22] requires a large amount of quantum resources but may exponentially outperform classical algorithms for solving the TSP. Quantum Annealing [23,24] has also been considered for solving the TSP [25,26,27,28]. While Quantum Annealing has the advantage that it does not require a universal quantum computer, it can be difficult to make precise statements about the efficiency. More contemporary algorithms include the Variational Quantum Eigensolver (VQE) [29,30,31,32] and the Quantum Approximate Optimization Algorithm (QAOA) [33]. Both VQE and QAOA are variational algorithms and both require fewer resources than QPE or QAA while still providing exponential improvements over classical algorithms for certain problems [31,33]. VQE and QAOA may be tractable on currently available quantum computers and are potentially more efficient at solving the TSP for certain classes of problems [34,35]. We focus on the VQE algorithm in this paper.
A critical part of all quantum algorithms is the encoding. We need an encoding that maps the TSP onto the quantum computer. To the best of our knowledge, the TSP has been exclusively encoded using basis encodings [17,35]. In a basis encoding, possible solutions of a given problem are mapped onto the basis states of the quantum register. For the TSP, for example, each route is mapped onto a basis state. An alternative to basis encoding is amplitude encoding [36]. In an amplitude encoding, the solutions are mapped onto linear combinations of basis states such that the amplitude of each basis state provides information about the solution. Amplitude encodings tend to require fewer qubits than basis encodings. However, for amplitude encoding, it can be difficult to derive a cost operator that can be efficiently implemented on a quantum computer. The process of converting a cost function to a cost operator is not trivial and there is no guarantee that it can be accomplished without exponential resources.
In this work, we present an amplitude encoding for the TSP along with a method to calculate the cost function. Our unique approach to calculating the cost function does not require the construction of a cost operator but instead evaluates the cost based on the probability distribution collected from measurements of the quantum computer. Our method for calculating the cost function allows us to use an amplitude encoding, which exponentially reduces the required number of qubits as compared to basis encodings. Because there is no efficient classical method for reproducing the probability distributions found using the quantum computer, our method may provide an advantage over classical TSP solvers. We demonstrate our method on a state vector simulator for a few small graphs and show that convergence can be found without fine tuning the hyperparameters of the method.
2. Method
2.1. Statement of the TSP
Let us formally define the TSP. We have $N$ cities, each with a unique label $c_i$ for $i \in \{0, 1, \dots, N-1\}$. Each pair of cities $c_i$ and $c_j$ is assigned a distance $d_{ij}$. We can define a graph $G = (V, E)$ where the vertices $V$ correspond to the cities and the edges $E$ correspond to the paths between the cities. There are at most $N(N-1)$ edges for directed graphs and $N(N-1)/2$ edges for undirected graphs. For our demonstration we confine ourselves to undirected graphs; however, we formalize the algorithm for the general case.
A route $R$ is an ordered list of $n$ cities, e.g., $R = (c_{r_0}, c_{r_1}, \dots, c_{r_{n-1}})$, that defines the cities that the salesperson visits and the order in which each city is visited. Let $r_k$ be the index of the $k$th city in $R$. We define the distance of a route $D(R)$ as
$$D(R) = \sum_{k=0}^{n-2} d_{r_k r_{k+1}}, \tag{1}$$
which is a function that maps routes to real numbers. A route represents a full tour if it contains every city exactly once. Solutions of the TSP are full tours.
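The route distance and the full-tour condition can be sketched in a few lines. This is a minimal illustration that assumes the distance of a route is the sum over consecutive city pairs without a return leg; the names `route_distance`, `is_full_tour`, and the 3-city matrix are hypothetical.

```python
def route_distance(route, dist):
    """Sum the distances between consecutive cities in `route`.

    Assumption: the route is an open path (no return to the first city).
    """
    return sum(dist[a][b] for a, b in zip(route, route[1:]))

def is_full_tour(route, n_cities):
    """A full tour contains every city exactly once."""
    return len(route) == n_cities and set(route) == set(range(n_cities))

# Hypothetical symmetric distances between 3 cities.
dist = [
    [0, 1, 4],
    [1, 0, 2],
    [4, 2, 0],
]
d_012 = route_distance([0, 1, 2], dist)   # 1 + 2
d_021 = route_distance([0, 2, 1], dist)   # 4 + 2
```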
2.2. Quantum Amplitude Encoding for the TSP
We now set up the problem for a quantum computer. We use an amplitude encoding to map the TSP routes to states of the quantum computer. This encoding saves resources as compared to basis encodings and is a major advantage of our method.
Figure 1 shows an example of our construction. We assign a quantum basis state to each edge in the graph $G$. Let $b_{ij}$ be a binary index for the edge connecting cities $c_i$ and $c_j$. We encode this edge on the quantum basis state $|b_{ij}\rangle$, where the binary number describes the state of the qubits. For example, $|110\rangle$ is a quantum state where the first qubit is off and the second and third qubits are on (we assume little-endian ordering). For $N_e$ edges we need only $\lceil \log_2 N_e \rceil$ qubits. A route on the quantum computer $|R\rangle$ is defined by a superposition of states:
$$|R\rangle = \frac{1}{\sqrt{n_R}} \sum_{(i,j) \in R} |b_{ij}\rangle, \tag{2}$$
where the sum runs over the edges $(i,j)$ traversed by $R$ and $n_R$ is the number of basis states in the summation. For example, the route $R = (c_0, c_1, c_2)$ is mapped to $|R\rangle = \left(|b_{01}\rangle + |b_{12}\rangle\right)/\sqrt{2}$. Notice that $|R\rangle$ does not define an order for the paths $b_{ij}$. We define the order classically after the quantum computer is measured. Details are presented in Section 2.3.
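A small sketch of the encoding may help. It assumes one particular edge-to-index assignment (directed edges enumerated in row-major order), which the text does not specify; `edge_index_map`, `num_qubits`, and `route_state` are hypothetical helper names.

```python
import math

def edge_index_map(n_cities):
    """Assign a basis-state index to each directed edge (i, j) with i != j.

    Assumption: edges are numbered in row-major order; any one-to-one
    assignment would serve equally well.
    """
    edges = [(i, j) for i in range(n_cities) for j in range(n_cities) if i != j]
    return {edge: k for k, edge in enumerate(edges)}

def num_qubits(n_edges):
    """Smallest number of qubits whose basis states can label n_edges edges."""
    return max(1, (n_edges - 1).bit_length())  # equals ceil(log2(n_edges))

def route_state(route, index, n_qubits):
    """Amplitude vector with equal weight on each edge traversed by the route."""
    state = [0.0] * (2 ** n_qubits)
    edges = list(zip(route, route[1:]))
    amplitude = 1.0 / math.sqrt(len(edges))
    for edge in edges:
        state[index[edge]] = amplitude
    return state

index = edge_index_map(3)                  # 6 directed edges
n_q = num_qubits(len(index))               # 3 qubits, 8 basis states
state = route_state([0, 1, 2], index, n_q)
label = format(index[(1, 2)], "03b")       # binary label of edge (1, 2)
```

Two of the eight basis states are unused padding here, since six edges do not fill a 3-qubit register.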
It is straightforward to define a quantum distance operator:
$$\hat{D} = \sum_{i \neq j} d_{ij} \, |b_{ij}\rangle\langle b_{ij}|. \tag{3}$$
We can define the distance of an arbitrary quantum state $|\psi\rangle$ to be the expectation value of the quantum distance operator:
$$D(\psi) = \langle\psi|\hat{D}|\psi\rangle. \tag{4}$$
This allows us to calculate the distance of a route $R$ using the distance operator:
$$D(R) = n_R \, \langle R|\hat{D}|R\rangle, \tag{5}$$
where $n_R$ is the number of basis states in $|R\rangle$.
However, the full cost function needs to do more than evaluate the distance of a quantum state. There must also be constraints in the cost function so that the cost is increased for quantum states that do not define true solutions of the TSP. We describe two methods for evaluating the full cost function.
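Because the distance operator is diagonal in the edge basis, its expectation value reduces to a probability-weighted sum of edge distances. The sketch below illustrates this under two assumptions: edges are indexed in row-major order, and unused padding basis states contribute zero distance. The helper names are hypothetical.

```python
import math

def distance_expectation(state, edge_of_index, dist):
    """Expectation value of a diagonal distance operator in state `state`.

    Basis states that do not label an edge (padding) contribute nothing.
    """
    total = 0.0
    for k, amplitude in enumerate(state):
        edge = edge_of_index.get(k)
        if edge is not None:
            i, j = edge
            total += dist[i][j] * abs(amplitude) ** 2
    return total

# Hypothetical 3-city example; edges indexed in row-major order (assumption).
dist = [[0, 1, 4], [1, 0, 2], [4, 2, 0]]
edges = [(i, j) for i in range(3) for j in range(3) if i != j]
edge_of_index = dict(enumerate(edges))
index_of_edge = {edge: k for k, edge in edge_of_index.items()}

# Route (0, 1, 2): equal superposition over edges (0, 1) and (1, 2).
state = [0.0] * 8
for edge in [(0, 1), (1, 2)]:
    state[index_of_edge[edge]] = 1.0 / math.sqrt(2)

expectation = distance_expectation(state, edge_of_index, dist)
route_length = 2 * expectation  # rescale by the number of edges in the route
```

With the normalized route state, the expectation value is the mean edge length, so multiplying by the number of edges recovers the route distance.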
2.3. Evaluating the Cost Function
The amplitude encoding offers a dense encoding of information onto the quantum register. The density of information is an advantage in that the required number of qubits is small. However, it presents a challenge when trying to evaluate the cost function. As seen in the previous section, the distance of a route can be promoted to a quantum operator. However, the cost function must also involve a constraint that only full tours should be considered as solutions. We refer to this constraint as the full-tour constraint. Typically, one would write a cost function that involves both the distance and the full tour constraint and promote this function to a quantum operator. However, it is far from obvious how one can obtain such an operator in this case. The main insight of our work is that we can evaluate the cost function without defining a corresponding quantum operator. Instead we use the probability distribution measured from the quantum computer.
In what follows, we will go into detail about how to use the measurement results of the quantum computer to determine a value for the cost function. It is important to remember that evaluating the cost function is not the same as finding a solution to the TSP. The cost function is simply a measure of how close the current state of the quantum computer is to the state that represents the solution. In order to find a solution, the quantum state must be varied. We will describe a variational quantum algorithm that can be used to find the solution in Section 2.4.
We describe two methods of evaluating the cost function given the probability distribution over the quantum basis states. Given the state of the quantum computer $|\psi\rangle$, the probability distribution is
$$p_{ij} = |\langle b_{ij}|\psi\rangle|^2. \tag{6}$$
A fundamental aspect of quantum computation is that the probability distribution can be well approximated by repeated measurements of the quantum state. It is these measurement results that we use to evaluate the cost function.
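The estimation of the distribution from repeated measurements can be mimicked classically by sampling basis-state indices from a known amplitude vector. This is only a simulation sketch: `sample_distribution` is a hypothetical helper and the seeded generator stands in for the quantum hardware.

```python
import random
from collections import Counter

def sample_distribution(state, shots, rng):
    """Estimate basis-state probabilities from `shots` simulated measurements."""
    probabilities = [abs(a) ** 2 for a in state]
    outcomes = rng.choices(range(len(state)), weights=probabilities, k=shots)
    counts = Counter(outcomes)
    return [counts[k] / shots for k in range(len(state))]

# Equal superposition of basis states 0 and 3 on three qubits (illustrative).
state = [2 ** -0.5, 0.0, 0.0, 2 ** -0.5, 0.0, 0.0, 0.0, 0.0]
rng = random.Random(1234)  # fixed seed so the sketch is reproducible
estimate = sample_distribution(state, shots=20000, rng=rng)
```

The estimated frequencies approach the exact values of 0.5 as the number of shots grows, with statistical error shrinking as one over the square root of the shot count.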
2.3.1. Lagrange Multiplier Method
In this method, we evaluate the full-tour constraint by extracting a route $T$ that represents a candidate for the full tour. We find the candidate tour by consecutively evaluating the highest probability edges, where $p_{ij}$ is the measured probability of the edge from $c_i$ to $c_j$. The steps for finding $T$ are as follows:
- step 0
Choose a starting city $c_i$ and set $T = (c_i)$.
- step 1
Find $j$ such that $p_{ij} \geq p_{ik}$ for all $k$.
- step 2
Append $c_j$ as the last entry in $T$.
- step 3
Set $i = j$ and find a new $j$ such that $p_{ij} \geq p_{ik}$ for all $k$ such that $c_k$ is not in $T$.
- step 4
Append $c_j$ as the last entry in $T$.
- step 5
Repeat step 3 and step 4 until all cities are reached.
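The steps above amount to a greedy walk over the measured edge probabilities. The following sketch assumes a fixed starting city (city 0 by default) and breaks ties in favor of the lowest city index; both choices, and the name `greedy_tour`, are illustrative assumptions rather than details taken from the text.

```python
def greedy_tour(p, start=0):
    """Build a candidate tour by always following the most probable edge.

    p[i][j] is the measured probability of the edge from city i to city j.
    Assumptions: the walk starts at `start`, and ties go to the lowest index.
    """
    n = len(p)
    tour = [start]
    current = start
    while len(tour) < n:
        candidates = [j for j in range(n) if j not in tour]
        next_city = max(candidates, key=lambda j: p[current][j])
        tour.append(next_city)
        current = next_city
    return tour

# Hypothetical measured edge probabilities for 4 cities (entries sum to 1).
p = [
    [0.00, 0.05, 0.30, 0.05],
    [0.05, 0.00, 0.05, 0.20],
    [0.02, 0.15, 0.00, 0.03],
    [0.05, 0.02, 0.03, 0.00],
]
candidate = greedy_tour(p)
```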
To evaluate the cost function, we compare the candidate tour $T$ to the state of the quantum computer $|\psi\rangle$ through a constraint term $\Delta(T, \psi)$. The full cost function is evaluated as
$$C(\psi) = D(\psi) + a \, \Delta(T, \psi),$$
where $D(\psi)$ is evaluated by taking the expectation value of the distance operator as in Equation (4) and $a$ is a Lagrange multiplier.
There is no guarantee that the tour found this way minimizes the distance. However, it is guaranteed that the constraint term vanishes as $|\psi\rangle$ converges to a full tour. Therefore, minimizing the cost function enforces the full-tour constraint.
2.3.2. Tour Averaging Method
In this method, we do not use the expectation value of the distance operator in the cost function. Instead we take the average distance of a number of tours found probabilistically from the probability distribution of the quantum state.
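The method for finding a tour is similar to that in the Lagrange multiplier method, except that we select each edge probabilistically. The steps to find a tour $T$ are as follows:
- step 0
Choose a starting city $c_i$ and set $T = (c_i)$.
- step 1
Select $j$ with probability $P_{ij}$.
- step 2
Append $c_j$ as the last entry in $T$.
- step 3
Set $i = j$ and select a new $j$ with probability $P_{ij}$ from any $j$ such that $c_j$ is not in $T$.
- step 4
Append $c_j$ as the last entry in $T$.
- step 5
Repeat step 3 and step 4 until all cities are reached.
where $P_{ij} = p_{ij} / \sum_k p_{ik}$, with $p_{ij}$ the measured probability of the edge from $c_i$ to $c_j$ and the sum running over the allowed cities $c_k$, unless $\sum_k p_{ik} = 0$, in which case $P_{ij}$ is equally distributed over all $j$.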
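The probabilistic selection can be sketched as follows. The sketch assumes a fixed starting city and renormalizes the edge probabilities over the cities not yet visited, falling back to a uniform choice when all remaining probabilities are zero; `sample_tour` is a hypothetical name.

```python
import random

def sample_tour(p, rng, start=0):
    """Draw one tour by choosing each next city with renormalized probability.

    p[i][j] is the measured probability of the edge from city i to city j.
    Assumption: the walk starts at `start`. If every remaining edge out of
    the current city has zero probability, the next city is chosen uniformly.
    """
    n = len(p)
    tour = [start]
    current = start
    while len(tour) < n:
        candidates = [j for j in range(n) if j not in tour]
        weights = [p[current][j] for j in candidates]
        if sum(weights) == 0:
            next_city = rng.choice(candidates)
        else:
            next_city = rng.choices(candidates, weights=weights, k=1)[0]
        tour.append(next_city)
        current = next_city
    return tour

# Distribution concentrated on edges (0, 2) and (2, 1): the draw is forced.
p_sharp = [
    [0.0, 0.0, 0.5, 0.0],
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.5, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]
rng = random.Random(7)
forced_tour = sample_tour(p_sharp, rng)

# A flat distribution yields a random permutation that starts at city 0.
p_flat = [[0.0 if i == j else 1.0 / 12 for j in range(4)] for i in range(4)]
random_tour = sample_tour(p_flat, rng)
```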
We calculate the distance of the tour using Equation (1). We repeat this process to find a distribution of tours. The cost function is the average distance of all the tours
$$C = \frac{\sum_T n_T \, D(T)}{\sum_T n_T},$$
where $n_T$ is the number of times $T$ was found. Because every distance in the average is calculated based on a full tour, the full-tour constraint is automatically enforced.
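Given a batch of sampled tours, the averaged cost is a count-weighted mean of tour distances. The sketch below assumes open-path distances and uses hypothetical names (`tour_average_cost`, `route_distance`) and a made-up distance matrix.

```python
from collections import Counter

def route_distance(route, dist):
    """Sum of distances between consecutive cities (open path, assumption)."""
    return sum(dist[a][b] for a, b in zip(route, route[1:]))

def tour_average_cost(tours, dist):
    """Average distance over sampled tours, weighting each by its count."""
    counts = Counter(tuple(t) for t in tours)
    weighted = sum(n_t * route_distance(t, dist) for t, n_t in counts.items())
    return weighted / sum(counts.values())

# Hypothetical symmetric distance matrix and a small batch of sampled tours.
dist = [
    [0, 1, 4, 3],
    [1, 0, 2, 5],
    [4, 2, 0, 1],
    [3, 5, 1, 0],
]
tours = [[0, 1, 2, 3]] * 3 + [[0, 2, 1, 3]]
cost = tour_average_cost(tours, dist)  # (3 * 4 + 1 * 11) / 4
```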
2.3.3. Comparison of Lagrange Multiplier and Tour Averaging Methods
In the Lagrange multiplier method we interpret the quantum state as representing a single tour. We attempt to force the quantum state into an exact tour state using the Lagrange multiplier a. In the tour averaging method, we interpret the quantum state as representing a distribution of unique tours. We average over this distribution to calculate the distance. We see that while both methods use the same encoding, the interpretation of the quantum state is subtly different.
The Lagrange multiplier method gives us an extra control parameter a that may help expedite convergence. On the other hand, the evaluation of the cost function does not take full advantage of all of the information in the probability distribution, as many different distributions can produce the same tour. Furthermore, there are switching points in the space of quantum states where the candidate tour changes. The cost function is likely to be discontinuous across these switching points.
The tour averaging method does not suffer from switching points in the same way. Of course, the most probable tour will switch suddenly at certain points in the state space. However, the cost function is continuous across these points. It is unclear which method will ultimately be the most effective; however, the continuity of the cost function is a strong indication that tour averaging may be preferred.
2.4. Variational Quantum Algorithm for Solving the TSP
To demonstrate the encoding method, we perform a VQE-type algorithm. A flow chart for the algorithm is depicted in Figure 2, where the algorithm is compared to standard VQE. In both our algorithm and standard VQE, the quantum computer is used to generate a variational ansatz. The expectation value of the ansatz is then measured for a number of operators. Finally, a cost function is evaluated based on those expectation values. In standard VQE the cost function is simply a sum of expectation values. Our algorithm differs in that the cost function is a nonlinear function of expectation values. The variational parameters are then updated based on the value of the cost function using a classical optimization routine.
For the demonstration, we use the hardware-efficient ansatz [30] as the variational ansatz and we use Simultaneous Perturbation Stochastic Approximation (SPSA) as the classical optimization routine. During each iteration of the optimization routine, we evaluate the cost function based on one of the two methods described in Section 2.3.
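For readers unfamiliar with SPSA, a bare-bones version is sketched below. It is not the routine used in the demonstration: the gain constants `a` and `c`, the decay exponents (the commonly quoted 0.602 and 0.101), and the quadratic test cost are assumptions chosen only to show the update rule, in which two cost evaluations per iteration estimate the full gradient.

```python
import random

def spsa_minimize(cost, theta0, n_iter, rng, a=0.2, c=0.1):
    """Minimal SPSA loop: two cost evaluations per iteration.

    Assumptions: gains a_k = a / k**0.602 and c_k = c / k**0.101, and a
    random +/-1 perturbation applied to all parameters simultaneously.
    """
    theta = list(theta0)
    for k in range(1, n_iter + 1):
        a_k = a / k ** 0.602
        c_k = c / k ** 0.101
        delta = [rng.choice((-1.0, 1.0)) for _ in theta]
        plus = [t + c_k * d for t, d in zip(theta, delta)]
        minus = [t - c_k * d for t, d in zip(theta, delta)]
        diff = (cost(plus) - cost(minus)) / (2.0 * c_k)
        # Gradient estimate for parameter i is diff / delta[i].
        theta = [t - a_k * diff / d for t, d in zip(theta, delta)]
    return theta

def quadratic_cost(theta):
    """Stand-in cost with its minimum at theta = (1, 1)."""
    return sum((t - 1.0) ** 2 for t in theta)

rng = random.Random(0)
theta_opt = spsa_minimize(quadratic_cost, [0.0, 3.0], n_iter=500, rng=rng)
```

In the variational setting, `cost` would wrap a circuit execution followed by one of the cost evaluations of Section 2.3, so each iteration needs only two batches of shots regardless of the number of parameters.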
Two layers of the hardware-efficient ansatz are shown in Figure 3. Each layer consists of a stack of Y-rotations followed by alternating CNOT gates. The Y-rotations are parameterized by the variational parameter $\theta_{q,l}$, where $q$ indicates the qubit and $l$ indicates the layer. Throughout our demonstration, we use a 2-layer ansatz.
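A state-vector sketch of such an ansatz is given below in plain Python. The exact CNOT pattern of Figure 3 is not recoverable from the text, so a linear chain of CNOTs between neighboring qubits is assumed here; the function names and the convention that qubit 0 is the least significant bit are likewise assumptions.

```python
import math

def apply_ry(state, qubit, theta):
    """Apply a Y-rotation by angle theta to `qubit` (qubit 0 = lowest bit)."""
    c, s = math.cos(theta / 2.0), math.sin(theta / 2.0)
    new = state[:]
    for k in range(len(state)):
        if not (k >> qubit) & 1:
            k1 = k | (1 << qubit)
            a0, a1 = state[k], state[k1]
            new[k] = c * a0 - s * a1
            new[k1] = s * a0 + c * a1
    return new

def apply_cnot(state, control, target):
    """Swap amplitudes of basis states that differ in `target` when `control` is 1."""
    new = state[:]
    for k in range(len(state)):
        if (k >> control) & 1 and not (k >> target) & 1:
            k1 = k | (1 << target)
            new[k], new[k1] = state[k1], state[k]
    return new

def hardware_efficient_ansatz(thetas, n_qubits):
    """thetas[l][q] is the rotation angle for qubit q in layer l.

    Assumption: each layer ends with a linear chain of CNOTs (q -> q + 1).
    """
    state = [0.0] * (2 ** n_qubits)
    state[0] = 1.0
    for layer in thetas:
        for q in range(n_qubits):
            state = apply_ry(state, q, layer[q])
        for q in range(n_qubits - 1):
            state = apply_cnot(state, q, q + 1)
    return state

# Two layers on three qubits with arbitrary illustrative angles.
thetas = [[0.3, 1.1, 2.0], [0.7, 0.2, 1.5]]
state = hardware_efficient_ansatz(thetas, n_qubits=3)
probabilities = [a * a for a in state]  # amplitudes stay real for RY and CNOT
```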
2.5. Algorithm Complexity
The advantage of our encoding is that it requires a small number of qubits. To make that statement precise, let us analyze the method. Our method maps each path between pairs of cities onto a basis state. If there are $N$ cities, then there are $N^2$ ordered city pairs. Pairing a city with itself does not define an edge; thus, there are at most $N(N-1)$ directed edges between city pairs. If we restrict the problem so that the return path has the same distance as the forward path, then we only need to consider half of the total number of edges.
Given a number of qubits $N_q$, the number of basis states is $2^{N_q}$. We need a basis state for each edge. Therefore, for $N$ cities with at most $N(N-1)$ directed edges, we need a number of qubits equal to
$$N_q = \lceil \log_2 [N(N-1)] \rceil.$$
Another factor that contributes to the algorithm complexity is the required number of ansatz layers, known as the circuit depth. The depth of the quantum circuit required to optimize the route is an open problem. However, it has been demonstrated that VQE can solve complex problems with reasonably shallow circuits [30].
Another factor in the complexity of the algorithm is the number of shots required to achieve an accurate estimate of the cost function. The standard method of calculating expectation values of an operator on a quantum computer is to separate the operator into a sum of Pauli strings. Each set of commuting Pauli strings must be measured in a separate basis. As the distance operator $\hat{D}$ can be constructed from Pauli-Z and identity operators, it requires only a single basis to be measured. Similarly, while we have to calculate up to $N(N-1)$ different probabilities $p_{ij}$, they all can be calculated from the same measurement basis. Thus, the number of shots $N_s$ required to get an expectation value to an accuracy of $\epsilon$ scales as [37,38]
$$N_s \sim O(1/\epsilon^2).$$
Finally, evaluating the cost function from the results of the quantum computer is achieved by a classical algorithm that scales polynomially with the number of cities. Thus, assuming we do not need exponentially many ansatz layers, no part of the algorithm scales exponentially with the number of cities.
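The inverse-square dependence of the shot count on the target accuracy follows from the standard error of a sample mean. The sketch below makes this explicit for a diagonal observable; the helper names and the two-outcome example are hypothetical.

```python
import math

def observable_variance(probabilities, values):
    """Variance of a diagonal observable with outcomes `values`."""
    mean = sum(p * v for p, v in zip(probabilities, values))
    mean_sq = sum(p * v * v for p, v in zip(probabilities, values))
    return mean_sq - mean ** 2

def shots_for_accuracy(epsilon, variance):
    """Shots needed so that sqrt(variance / shots) is about epsilon."""
    return math.ceil(variance / epsilon ** 2)

# Two equally likely edges with distances 1 and 3 (illustrative numbers).
variance = observable_variance([0.5, 0.5], [1.0, 3.0])
shots_coarse = shots_for_accuracy(0.1, variance)
shots_fine = shots_for_accuracy(0.05, variance)
```

Halving the target error roughly quadruples the number of shots, independent of the number of cities.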
2.6. Comparison to Basis Encoding
The main advantage of our method is the use of amplitude encoding to reduce the required number of qubits. In the standard basis encoding [17,35], one must have a unique basis state for each possible route. In general, there are $N!$ unique routes. As stated previously, the number of basis states in the quantum computer is $2^{N_q}$. Therefore, one needs at least $N_q^{\mathrm{route}} = \lceil \log_2 N! \rceil$ qubits for a basis encoding. Often, the encoding is simplified if you associate each qubit with an edge so that $N_q^{\mathrm{edge}} = N(N-1)$. In either case, both $N_q^{\mathrm{route}}$ and $N_q^{\mathrm{edge}}$ are exponentially larger than the number of qubits required for the amplitude encoding, $N_q = \lceil \log_2 [N(N-1)] \rceil$. This improvement in qubit resources is due to the fact that, in the basis encoding, much of the quantum information that can be registered on a quantum computer is not used. In particular, in a basis encoding, different linear combinations of basis states do not carry unique information about the problem. The advantage of basis encoding is that the information about the problem is simple to read off of the quantum computer. For example, one may consider only the most probable basis state in the measured probability distribution to be relevant. This greatly simplifies the evaluation of the cost function. However, as we show, various strategies exist for evaluating cost functions for an amplitude encoding.
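The difference in qubit requirements can be tabulated directly. The sketch assumes the counts discussed above: one basis state per route or one qubit per directed edge for the basis encodings, and one basis state per directed edge for the amplitude encoding. The function names are hypothetical.

```python
import math

def ceil_log2(n):
    """Exact integer ceiling of log2(n) for n >= 1."""
    return (n - 1).bit_length()

def qubits_amplitude(n_cities):
    """One basis state per directed edge."""
    return max(1, ceil_log2(n_cities * (n_cities - 1)))

def qubits_basis_routes(n_cities):
    """One basis state per route (N! orderings)."""
    return max(1, ceil_log2(math.factorial(n_cities)))

def qubits_basis_edges(n_cities):
    """One qubit per directed edge."""
    return n_cities * (n_cities - 1)

# Rows of (cities, amplitude, basis by route, basis by edge).
table = [
    (n, qubits_amplitude(n), qubits_basis_routes(n), qubits_basis_edges(n))
    for n in (4, 10, 50)
]
```

For 10 cities this gives 7 qubits for the amplitude encoding against 22 and 90 for the two basis encodings.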
One open question is how the convergence of the variational method will compare between the basis and amplitude encodings. This question is beyond the scope of the current work. The small graphs that we consider will not be able to resolve the question of convergence as both encodings should converge quickly for small graphs. The real question is how the convergence will change as we increase the size of the graphs. For large graphs, the hardware-efficient ansatz is not likely to be useful for either encoding. A systematic exploration of various ansatzes is required to make any definite claims about scalability.
2.7. Quantum Advantage
Another open question is whether it is possible to obtain a quantum advantage using our method. Because of the logarithmic scaling in the number of qubits, classical simulations of the quantum computing portion of the algorithm will be viable for even relatively large problems. However, there are still exponentially more basis states than the number of qubits and so the classical simulation still requires exponentially more resources than when using a quantum computer.
There are, however, many other classical algorithms that can be used to address the TSP. A full comparison of each method to our own is beyond the scope of this work, but we will try to provide some insights. Firstly, our algorithm is certainly efficient in the number of qubits. In addition, the evaluation of the cost function scales with the number of measurement shots taken from the quantum computer. Thus, the question becomes: how many quantum operations are required in order to generate a probability distribution from which a good approximation of the cost function can be calculated using a small number of shots?
The number of required quantum operations is an open question for variational algorithms in general. The number of quantum operations depends on the ansatz that is used. For certain ansatzes, arguments have been made that a quantum advantage can be obtained for certain problems once we have fully error-corrected quantum computers [29]. However, other ansatzes are specifically designed for near-term quantum computers and they may not scale in the long term [30]. We have demonstrated our method with the hardware-efficient ansatz. What other ansatzes are applicable is still an open question.