A Genetic Algorithm with Quantum Random Number Generator for Solving the Pollution-Routing Problem in Sustainable Logistics Management

: The increase of greenhouse gases emission, global warming, and even climate change is an ongoing issue. Sustainable logistics and distribution management can help reduce greenhouse gases emission and lighten its inﬂuence against our living environment. Quantum computing has become more and more popular in recent years for advancing artiﬁcial intelligence into the next generation. Hence, we apply quantum random number generator to provide true random numbers for the genetic algorithm to solve the pollution-routing problems (PRPs) in sustainable logistics management in this paper. The objective of the PRPs is to minimize carbon dioxide emissions, following one of the seventeen sustainable development goals set by the United Nations. We developed a two-phase hybrid model combining a modiﬁed k -means algorithm as a clustering method and a genetic algorithm with quantum random number generator as an optimization engine to solve the PRPs aiming to minimize the pollution produced by trucks traveling along delivery routes. We also compared the computation performance with another hybrid model by using a different optimization engine, i.e., the tabu search algorithm. From the experimental results, we found that both hybrid models can provide good solution quality for CO 2 emission minimization for 29 PRPs out of a total of 30 instances (30 runs each for all problems).


Introduction
Global warming and climate change are ongoing issues right now and required immediate attention around the world (Figure 1). Climate change and global warming are related to the emission of greenhouse gases into the atmosphere, theoretically. One of the important observations during the global pandemic of the new coronavirus (COVID-19) is that many regions have been on lockdown for several months. The aerosols and pollutants in the atmosphere reduced by around 9-64% [1]. It is obvious that human activities have a significant influence on the amount of pollutants in the atmosphere. Therefore, methods for reducing greenhouse gases emission in daily life and industries has become an important issue. Inefficient logistics and distribution management will produce more greenhouse gases emission, since trucks need to travel for a longer time and consume more fuel. Globalized enterprises need to pay more attention and take action to green their supply chain [2].
With continuous development of the vehicle routing problems (VRPs), various types of VRPs have been proposed for more than 60 years of time [3][4][5][6]. Among them, the pollutionrouting problems (PRPs) and green vehicle routing problems (GVRPs) are directly related to one of the seventeen sustainable development goals set by the United Nations.
The PRPs were first proposed in 2011 by the authors of [7]. They extended classical VRPs with broader and more comprehensive objective functions that considered not only the travel distance, but also the amount of greenhouse gases emission, fuel consumption, Almost within the same period, the GVRPs were introduced in 2012 by the authors of [11]. They formulated the GVRPs and developed some solution techniques to help organizations with alternative fuel-powered vehicle (AFV) fleets overcome the difficulties that currently limit vehicle driving range and limited refueling infrastructure.
The VRPs and its variants are NP-hard (non-deterministic polynomial-time hardness) problems. Various methods and algorithms have been proposed and tested to optimize solutions, according to the objectives for each problem. Meta-heuristic methods have been widely applied to solve the problems, including traditional genetic algorithms (GA), Tabu searches (TS), simulated annealing (SA), neighborhood searches, [12,13] etc. However, some algorithms may have better results when applied to certain problems, compared to others; determining which method is the best algorithm to solve the VRPs is still argued in research/academic society. Therefore, we designed and compared the different results by utilizing two different optimization algorithms, the GA with quantum random number generator (QRNG) and the TS, as part of the hybrid models to solve the PRPs in this paper. Moreover, we execute grouping techniques before optimizing to design our hybrid models in order to cluster customer points, based on their demands in the first phase. The objective was to develop hybrid models to solve the PRPs in a short time, whether in small-scale problems or large-scale problems, by combining clustering algorithms and optimization algorithms into integrated models.
The GA was first developed by the author of [14], which is inspired by the concept of "natural selection and survival of the fittest" proposed by Darwin's theory of evolution. The construction of the GA included chromosome selection, gene reproduction, crossover, and mutation. The original concept was to mimic the biological chromosomal gene architecture to represent complex system structures. Generally, when the GA is used to solve Almost within the same period, the GVRPs were introduced in 2012 by the authors of [11]. They formulated the GVRPs and developed some solution techniques to help organizations with alternative fuel-powered vehicle (AFV) fleets overcome the difficulties that currently limit vehicle driving range and limited refueling infrastructure.
The VRPs and its variants are NP-hard (non-deterministic polynomial-time hardness) problems. Various methods and algorithms have been proposed and tested to optimize solutions, according to the objectives for each problem. Meta-heuristic methods have been widely applied to solve the problems, including traditional genetic algorithms (GA), Tabu searches (TS), simulated annealing (SA), neighborhood searches, [12,13] etc. However, some algorithms may have better results when applied to certain problems, compared to others; determining which method is the best algorithm to solve the VRPs is still argued in research/academic society. Therefore, we designed and compared the different results by utilizing two different optimization algorithms, the GA with quantum random number generator (QRNG) and the TS, as part of the hybrid models to solve the PRPs in this paper. Moreover, we execute grouping techniques before optimizing to design our hybrid models in order to cluster customer points, based on their demands in the first phase. The objective was to develop hybrid models to solve the PRPs in a short time, whether in small-scale problems or large-scale problems, by combining clustering algorithms and optimization algorithms into integrated models.
The GA was first developed by the author of [14], which is inspired by the concept of "natural selection and survival of the fittest" proposed by Darwin's theory of evolution. The construction of the GA included chromosome selection, gene reproduction, crossover, and mutation. The original concept was to mimic the biological chromosomal gene architecture to represent complex system structures. Generally, when the GA is used to solve problems, the principles of genetic evolution are mainly used for searching for better solutions. Since the GA was proposed, many studies have used it to solve the VRPs and their variants [15][16][17]. The concept of the TS was first introduced in 1986 by the author of [18] to solve integer programming problem and serve as linkage to artificial intelligence at the beginning of personal computer era. The author of [19] proposed theory of the TS in 1990 and the TS was designed to enable searching process to escape local optimal and go to global optimal. The TS is based on introducing flexible memory structures, combined with strategic restrictions and aspiration levels, as a means for exploiting search spaces. This method is typically used in combinatorial optimization, such as travelling salesman problems (TSPs), manufacture scheduling problems, VRPs [20][21][22], etc. The overall approach is to avoid entrapment in cycles by forbidding or penalizing moves which take the solution, in the next iteration, to points in the solution space previously visited. This property is one of the important reasons why the TS can get satisfying solutions efficiently.
With the development of quantum mechanics, applications of quantum technology have been more valued in recent years by powerful quantum computers. Following Industry 4.0 by using artificial intelligence in a smart factory, the potential of quantum technology with artificial intelligence in the future is shown in Figure 2. Quantum computing was introduced by the author of [23]. It is the use of quantum phenomena, such as superposition and entanglement, to perform computation. Quantum superposition states that any two or more quantum states can be added together, and the result will be another valid quantum state. Also, quantum entanglement is a phenomenon where a pair of particles are generated and interact or share spatial proximity, in such a way that the quantum state of each particle of the pair cannot be described independently of the state of the other, even when the distance between the particles is large.
problems, the principles of genetic evolution are mainly used for searching for better solutions. Since the GA was proposed, many studies have used it to solve the VRPs and their variants [15][16][17].
The concept of the TS was first introduced in 1986 by the author of [18] to solve integer programming problem and serve as linkage to artificial intelligence at the beginning of personal computer era. The author of [19] proposed theory of the TS in 1990 and the TS was designed to enable searching process to escape local optimal and go to global optimal. The TS is based on introducing flexible memory structures, combined with strategic restrictions and aspiration levels, as a means for exploiting search spaces. This method is typically used in combinatorial optimization, such as travelling salesman problems (TSPs), manufacture scheduling problems, VRPs [20][21][22], etc. The overall approach is to avoid entrapment in cycles by forbidding or penalizing moves which take the solution, in the next iteration, to points in the solution space previously visited. This property is one of the important reasons why the TS can get satisfying solutions efficiently.
With the development of quantum mechanics, applications of quantum technology have been more valued in recent years by powerful quantum computers. Following Industry 4.0 by using artificial intelligence in a smart factory, the potential of quantum technology with artificial intelligence in the future is shown in Figure 2. Quantum computing was introduced by the author of [23]. It is the use of quantum phenomena, such as superposition and entanglement, to perform computation. Quantum superposition states that any two or more quantum states can be added together, and the result will be another valid quantum state. Also, quantum entanglement is a phenomenon where a pair of particles are generated and interact or share spatial proximity, in such a way that the quantum state of each particle of the pair cannot be described independently of the state of the other, even when the distance between the particles is large.
Quantum computers are believed to be able to solve certain computational problems and calculate faster than classical computers, and that is the reason why more and more researchers began to study from data science to quantum information science. Random numbers, which have important applications in simulation and cryptography, are a fundamental resource in science and engineering. However, the random number generators that we usually use have certain complicated rules or functions to generate outputs by inputting certain seeds, called the pseudo-random number generator Quantum computers are believed to be able to solve certain computational problems and calculate faster than classical computers, and that is the reason why more and more researchers began to study from data science to quantum information science.
Random numbers, which have important applications in simulation and cryptography, are a fundamental resource in science and engineering. However, the random number generators that we usually use have certain complicated rules or functions to generate outputs by inputting certain seeds, called the pseudo-random number generator (PRNG). The QRNG can provide truly random numbers because of the inherent randomness at the core of quantum mechanics, thus making quantum systems a perfect source of entropy. Also, the QRNG is one of the most mature quantum technologies [24], and the optical QRNG is a popular type of the QRNG [25,26]. The QRNG we embedded in our proposed hybrid model was acquired from the Australian National University QRNG server (https://qrng.anu.edu.au/ (accessed on 30 March 2021)).
The research objective focused on the business model of Business to Customers. In addition, we proposed two hybrid models and compared two different optimization engines: the GA with QRNG (GAQ), and the TS, to solve the PRPs.
The assumptions and limitations within the research are listed as follows: 1.
Each route started and ended at the same depot. Depots had all the required demands from customers.

2.
Product categories were not considered in this research.

3.
Each customer was visited exactly once by a truck and had known inhomogeneous demands; additionally, the service time was closed to 0. 4.
The location of each customer was known in advance.

5.
Every truck had the same known capacity. The total demand of each route did not exceed the truck capacity. 6.
There was no limit to the number of places that each truck could visit, nor the total duration of each route.

7.
No time-variant factory (no midnight delivery) or event occurrence could disrupt delivery. 8.
Every truck had the same weight, 3000 kg, when it was empty. The weight of every unit of demand was 100 kg. 9.
The distance, per unit, in the plane coordinates was 1 km. 10. The basic fuel consumption of each vehicle was 6.25 L per 100 km. The fuel consumption increased linearly by 0.5 L per 100 km (for every 100 kg of additional vehicle weight). Every liter of fuel consumed produced 2.2 kg of CO 2 .
Under all the stated assumptions and limitations, trying to search for the minimum total emission of CO 2 during the delivery was the research objective. The parameters in the assumptions could easily be changed (either increased or reduced) in the real world by computer software simulation.

Problem Proposition
The PRPs in this research considered a single depot with a capacity constraint for each truck. The objective of the PRP was to minimize the total emission of CO 2 during the delivery. The PRP was represented by a directed graph: G = (V, E), where V = {v 0 , . . . , v n } implied the set of nodes and E was the set of arcs. Typically, the depot was denoted with node j = 0, and customers were denoted with j = 1, 2, . . . , N, where N was the last customer that needed to be served. The demand was denoted with D j > 0, such that the demand of every customer must not be zero or a negative value. Each arc represented the path from node i to node j, with weight e ij > 0, which corresponded to the CO 2 emission from node i to node j.
Decision Variable: Objective function: Subject to: Sustainability 2021, 13, 8381 x ijk = 0 or 1, for all i, j, k, Notations: K = number of trucks; Q k = capacity of truck k; D i = demand at node i, D 0 = 0; y i was an auxiliary variable that was required, in order to avoid sub-tour elimination.
In the above formulation, the objective function was defined as: (1) to minimize the addition of all the transportation costs associated with the customers. Constraint (2) and (3) to make sure each customer was served by only one vehicle. Constraint (4) implied route continuity and constraint. Constraint (5) represented the vehicle's capacity constraints. Constraint (6) and (7) simply ensured that truck availability did not exceed any constraints. Finally, constraint (9) eliminated any sub-tour. Moreover, the demand at each node should be less than or equal to the capacity of each truck.

Clustering
Our proposed models are two-phase hybrid models, combed with clustering and optimized to solve the PRPs that aim to minimize the pollution produced by trucks traveling along delivery routes. The clustering method applied in this research is a two-stage method, which is the combination of the sweep algorithm and the k-means algorithm. By using clustering first approach, both hybrid models were able to solve large-scale problems.

First Stage (Sweep Algorithm)
The initialization of the k-means algorithm (second stage) was to choose the k initial centers of weight for k clusters, and one of the most popular initialization methods was proposed by the author of [27], which chooses random k data points to be the initial centers of weight for k clusters. However, after repeated attempts, we found that the initialization of the k-means algorithm with constraints had a huge influence on the final clustering results, which means that having a reasonable initialization is very important. Therefore, the objective of the first stage was to use the forward sweep algorithm to generate the initial centers of weight for each cluster. The sweep algorithm was chosen due to the fact that it has been widely, and famously, applied to solve the vehicle dispatching problems for a long time, with a rather simple and straightforward method at the same time.
Hence, the first step of this stage was to convert the Cartesian coordinate system to the polar coordinate system and to use the depot as the origin. After that, sort the customers by their angle increasingly, and dispatch customers according to the order of the vehicles, until they cannot take another customer (the remaining customers were then dispatched to another empty vehicle). Repeat the procedure until all the customers have been dispatched to one vehicle. Figure 3 illustrates the forward sweep algorithm.
was not smaller than the demand of the customer would be candidates for the customer to be dispatched to. Then, the customer would be dispatched to the candidate cluster which had the highest priority among them. This procedure would be repeated until all the customers were already dispatched to one cluster. Then, we could update the centers of weight of the k clusters and repeat the customer dispatching until the centers of weight of the k clusters did not change anymore or reached the set number of iterations.

Second Stage (k-Means Algorithm)
We obtained the initial k clusters of the customers after finishing the first stage of clustering, which means we could obtain the initial k centers of weight for the k-means algorithm by calculating the center of weight for each initial cluster. Before dispatching the customers via the k-means algorithm, we sort the customers by their demand decreasingly, since the customers with higher demands may cause more emission of CO 2 and may have a larger influence on the objective function.
After sorting the dispatching order, we calculated the distances between the customer and k centers of weight; the clusters with the shorter distance had the higher priority for the customer to be dispatched to. Only the clusters with a remaining capacity that was not smaller than the demand of the customer would be candidates for the customer to be dispatched to. Then, the customer would be dispatched to the candidate cluster which had the highest priority among them. This procedure would be repeated until all the customers were already dispatched to one cluster. Then, we could update the centers of weight of the k clusters and repeat the customer dispatching until the centers of weight of the k clusters did not change anymore or reached the set number of iterations.

Genetic Algorithm with Quantum Random Number Generator Optimization
The first optimization method, after the clustering method, is the GAQ. The traditional GA have been long used to solve truck dispatching problems as the attributes used such as genes, chromosomes, and population are largely similar to sequence-related problems (or scheduling problems). For the design in this research, the genes in chromosomes were set to integer values, in order to simplify the process of conversion. We use the notations 1, 2, . . . , n to represent customers, and the sequence was the order for truck operators to visit customers. For example, a gene 5, 4, 3, 2, 1 meant that the truck operator will start from the depot to visit customer number 5, following customer number 4, and so on. The truck would be back to the depot after visiting customer number 1. The objective function of the GAQ was to search for an order of sequence that provided the minimum CO 2 emission for delivery. While the GA was in iteration, the algorithms conduct several stages of operation including initialization, evaluation, selection, crossover and mutation until the final iteration was reached. In addition, all random numbers utilized in the GAQ in this research were generated by the QRNG (Section 2.3.6). In the initialization stage, we inserted the sequence that was the result from the clustering process; thus, the sequence length of each cluster is known. Then the population size was set up as 100. We randomized the initial sequences several times to serve and become the initial population.

Evaluation
After having the populations, we evaluated each of the populations through its fitness value. The fitness value comes from the accumulation of the CO 2 emission calculation from beginning to the end of the population.

Selection
In the selection process, we used a modified roulette wheel method. We created a poll table based on the fitness value of the population. The poll table worked by having the fitness value of the population as f i = 1, 2, . . . , population size, and the accumulation of all the reciprocal of fitness value as F = (1/ f i ). The probability of being selected was (1/f i )/F. However, if Max(f i ) − Min(f i ) << f i , the probability of selection would become more uniform, which means that the better choice would not receive a much higher probability to be selected and may cause the inefficiency of the model convergence.
Therefore, we introduced a new fitness value for the selection, as where r was a constant that 0 ≤ r < 1. The accumulation was S = (1/s i ), and the new probability of the selection was (1/s i )/S. The bigger that r was set, the higher the probability of selecting relatively better chromosomes would be. In this research, we set the value of r to 0.9. We generated a random number from zero to one. The number decided which population was selected to be the first parent, based on the poll table. To avoid selecting the same population to be the second parent, the poll table for the second parent was regenerated with the population without the first parent. Then, we generated another random number to decide which population was the second parent based on the new poll table.

Crossover
In the crossover stage, we generated a random value from zero to one and compared the number with the crossover probability. In the case that the random number was below or equal to the crossover probability, we would conduct a crossover to the previously selected two parents to produce two offspring. We inserted a roulette wheel to decide which crossover method was applied among partially mapped crossover, order one crossover, cycle crossover, and modified order crossover. The roulette wheel was created based on the fitness values of the parents applying each method.
Partially mapped crossover generated offspring by choosing a sequence of elements from one parent and preserving the order and position of as many elements as possible from the other parent. It worked by having the child keep a section of the first and second parents, accordingly, and then recorded the elements in the gene number of the opposing parent those that were not in the child. Then, a mapped operation occurred, until the children were both completed. Figure 4 illustrated a partially mapped crossover example.
Order one crossover worked by selecting a section of the first parent to be set to the same gene number as the first child; the same goes for the second child. From there, we evaluated the next gene number based on the other parent; for the genes that were not yet in the child, we inserted the genes in ascending order, after the last genes in the section. Figure 5 simply illustrates the order one crossover example. Partially mapped crossover generated offspring by choosing a sequence of elements from one parent and preserving the order and position of as many elements as possible from the other parent. It worked by having the child keep a section of the first and second parents, accordingly, and then recorded the elements in the gene number of the opposing parent those that were not in the child. Then, a mapped operation occurred, until the children were both completed. Figure 4 illustrated a partially mapped crossover example. Order one crossover worked by selecting a section of the first parent to be set to the same gene number as the first child; the same goes for the second child. From there, we evaluated the next gene number based on the other parent; for the genes that were not yet in the child, we inserted the genes in ascending order, after the last genes in the section. Figure 5 simply illustrates the order one crossover example. Cycle crossover worked by creating cycles between the two parents, such that the first element in parent one was matched with the first element in parent two. Next, we continued to find the elements from each other and inserted the elements in child one and child two, accordingly. Figure 6 shows an example of cycle crossover. Cycle crossover worked by creating cycles between the two parents, such that the first element in parent one was matched with the first element in parent two. Next, we continued to find the elements from each other and inserted the elements in child one and child two, accordingly. Figure 6 shows an example of cycle crossover.
Modified order crossover divides the parents into right and left sections, at a randomly chosen crossover points. The right sections of the two parents were selected to be the preserving elements for the opposing parent and we got the children from the parents. Then, we inserted the left sections of the two parents to the empty parts of the child from the opposing parent, in order from left to right. Figure 7 illustrates an example of modified order crossover. Cycle crossover worked by creating cycles between the two parents, such that the first element in parent one was matched with the first element in parent two. Next, we continued to find the elements from each other and inserted the elements in child one and child two, accordingly. Figure 6 shows an example of cycle crossover. Modified order crossover divides the parents into right and left sections, at a randomly chosen crossover points. The right sections of the two parents were selected to be the preserving elements for the opposing parent and we got the children from the parents. Then, we inserted the left sections of the two parents to the empty parts of the child from the opposing parent, in order from left to right. Figure 7 illustrates an example of modified order crossover.

Mutation
In the mutation stage, we generated a random number from zero to one and compared the number with the mutation probability. In the case that the random number was below or equal to the crossover probability, then we could conduct mutation to the previously produced offspring to produce a single offspring.
In this research, we applied two-point exchange methods to replace the traditional mutation method and completed the mutation process. We generated two positions of the offspring randomly, then exchanged the elements at these positions and produced the mutated offspring. The following Figure 8 illustrates a two-point exchange mutation example.

Mutation
In the mutation stage, we generated a random number from zero to one and compared the number with the mutation probability. In the case that the random number was below or equal to the crossover probability, then we could conduct mutation to the previously produced offspring to produce a single offspring.
In this research, we applied two-point exchange methods to replace the traditional mutation method and completed the mutation process. We generated two positions of the offspring randomly, then exchanged the elements at these positions and produced the mutated offspring. The following Figure 8 illustrates a two-point exchange mutation example. below or equal to the crossover probability, then we could conduct mutation to the previously produced offspring to produce a single offspring.
In this research, we applied two-point exchange methods to replace the traditional mutation method and completed the mutation process. We generated two positions of the offspring randomly, then exchanged the elements at these positions and produced the mutated offspring. The following Figure 8 illustrates a two-point exchange mutation example. The generation in the GAQ, iterated while executing the mentioned stages (and executing the crossover and mutation) when the criteria was satisfied, until the last iteration was reached and had the best fitness value, in this case the minimum value.

ANU QRNG
In this research, all the randomness in the GAQ (such as random numbers that decided whether to execute crossover or mutation, random numbers that were used in roulette wheels, and random positions of crossover/mutation points) were acquired from the Australian National University QRNG server. We used the R package, called "qrng", to retrieve quantum random numbers in real-time from the quantum computing server and integrated them into the GAQ coding. QRNG can provide true random numbers because of the inherent randomness at the core of quantum mechanics, which makes quantum systems a perfect source of entropy. Moreover, the difference between QRNG and PRNG is that QRNG has some unpredictable physical meanings to the generated numbers, and PRNG uses completely computer-generated probability distribution. The generation in the GAQ, iterated while executing the mentioned stages (and executing the crossover and mutation) when the criteria was satisfied, until the last iteration was reached and had the best fitness value, in this case the minimum value.

ANU QRNG
In this research, all the randomness in the GAQ (such as random numbers that decided whether to execute crossover or mutation, random numbers that were used in roulette wheels, and random positions of crossover/mutation points) were acquired from the Australian National University QRNG server. We used the R package, called "qrng", to retrieve quantum random numbers in real-time from the quantum computing server and integrated them into the GAQ coding. QRNG can provide true random numbers because of the inherent randomness at the core of quantum mechanics, which makes quantum systems a perfect source of entropy. Moreover, the difference between QRNG and PRNG is that QRNG has some unpredictable physical meanings to the generated numbers, and PRNG uses completely computer-generated probability distribution.

Tabu Search Optimization
The TS algorithm is a local search meta-heuristic and this method solves optimization problem widely. The TS guides a local heuristic search procedure to explore the solution space beyond local optimality. The TS is similar to the neighborhood search; that is, the TS begins in the same way as an ordinary neighborhood search, moving iteratively from one solution to another, until a solution is obtained. The motivation to construct the TS is that if one point has been visited once, there is no need to waste computation time to revisit and reevaluate that point.
Contrary to classical methods, the current solution generated from the TS may worsen from one iteration to the next. Thus, solutions possessing some attributes of recently explorations are temporarily declared Tabu or forbidden, unless their cost is less than a so-called aspiration level, to avoid cycling. The aspiration level is a condition that can override the Tabu restraint, if the question is minimizing the problem and movement is Tabu, but objective value is smaller than aspiration level, we allow the movement at this time. A move is defined as going from one solution to another. Moreover, the TS method has some parameters, which need to be set before calculation. It contains a Tabu tenure, a candidate list, and an aspiration condition. The Tabu tenure is the length of time during which a certain move is classified as the Tabu. The candidate list contains all the nodes that can be swapped. Obviously, the size of the TS's candidate list is an important factor. The author of [28] suggested that the ideal candidate size is between n and 2n. In general, there are two types of memory in the TS: recency-based memory and frequency-based memory, characteristics are as follows in Table 1: Table 1. Comparison of recency-based memory and frequency-based memory.

Recency-Based Memory Frequency-Based Memory
Short-term memory Long-term memory The value (tabu) decreased as iteration proceeded.
Measured by the counts of the number of occurrences of a particular move.
In this research, the TS method used the recency-based memory to avoid the reversal of moves and cycling. The following Table 2 was our TS's process.

Two Hybrid Models for Minimizing CO 2
In this research, we proposed two hybrid models to solve the PRPs. In the beginning, we used a two-stage method combining the sweep algorithm and the k-means algorithm for dispatching customers to k vehicles. After completing clustering, the PRP was divided into several TSPs. Finally, we used the GAQ and the TS to optimize every route from the grouping results in the first phase. Combing the above step, the second phase optimization, via the GAQ, was our first hybrid model, called k-means clustering with genetic algorithm with QRNG (kGAQ) and k-means clustering with tabu search optimization was the second hybrid model (kTS). The pseudo-code (Tables 3 and 4) and procedures (Figures 9 and 10) of the two hybrid models are shown as follows. Table 2. The steps of the TS Algorithm.
Step 1. Construct an empty tabu list and set up parameters.
Step 2. Randomly generate an initial solution.
Step 3. Solution moves to its neighborhood and constructs a candidate list by two-point exchange.
Step 4. Evaluate the objective value and sort according to candidate list.
Step 5. Choose lower objective value and check whether this candidate is in the tabu list. If yes, we continue to choose the next order. However, if this candidate's objective value is smaller than the aspiration criteria, we allow it to replace the current solution and for tabu tenure to recalculate. If not on the tabu list, the candidate is added to the tabu list and replaces the current solution.
Step 6. Return to step 3, until a stop criterion is satisfied or a predefined number of iteration is achieved. Table 3. The pseudo-code of kGAQ.
Step 1. Input the point of customer and depot.
Step 2. Use the sweep algorithm to generate the initial k centers of weight for k-means algorithm.
Step 3. Use the k-means algorithm with capacity constraint to assign each customer to each vehicle.
Step 4. For each route: run the GA optimization algorithm.
Step 5. End if all routes are optimized. Else, go back to Step 4.
Step 6. End of the hybrid model. Table 4. The pseudo-code of kTS.
Step 1. Input the point of customer and depot.
Step 2. Use the sweep algorithm to generate initial k centers of weight for k-means algorithm.
Step 3. Use the k-means algorithm with capacity constraint to assign each customer to each vehicle.
Step 4. For each route: run the TS optimization algorithm.
Step 5. End if all routes are optimized. Else, go back to Step 4.
Step 6. End of the second hybrid model. nability 2021, 13, x FOR PEER REVIEW 13 of 19 Figure 9. The procedure of kGAQ model. Figure 9. The procedure of kGAQ model.

Computational Results
In this section, we discuss the results of the computational experiments from the two hybrid models, based on their performance on the benchmark problems. We selected 30 benchmark problems to experiment from set A and set B (proposed by the authors of [29]) and set E (proposed by the authors of [30]). The objective of the experiment in this study was to minimize the emission of CO 2 during the delivery. The parameters for the adjustment of the kGAQ and the kTS were determined through the Design of Experiments. The settings of the kGAQ were as follows: • Iteration per run was 10,000. • Population was 100. • Crossover probability was 0.95.
The settings of the kTS were as follows: • Iteration per run was 10,000. • Tabu tenure was 6. • Size of candidate list was 2n.
From Table 5, it was observed that both hybrid models provided good solution quality and had the same best results in most instances. However, the average results of the TS were all better than the kGAQ, except in instances A-n32-k5 and E-n101-k8. Meanwhile, the standard deviations of kTS were smaller than the kGAQ. In addition, Figures 11-13 illustrated the best solutions for instances E-n22-k4, E-n76-k7, and B-n31-k5, respectively, provided by the kGAQ. We found that the kGAQ can produce a fair solution quality for both small-scale and large-scale problems. From the routing graph, we found that the results sometimes had cross-roads that normally should not occur in traditional TSPs because the emission of CO 2 is not only related to the travel distance but also the weight of vehicles. That is, customers with large amounts of demands may have lower priority to be visited, according to the properties of the PRPs. Figure 13 shows that kGAQ can provide fair results when the location of the depot is not in the center of the map.  Figure 11. The best solution provided by kGAQ in instance E-n22-k4.

Conclusions
In this research, we applied a two-stage clustering method to assign customers to trucks based on their demands, combining with the sweep algorithm and the k-means algorithm in the first phase as clustering first approach. We then used the GAQ and TS to minimize the CO 2 emission along every route in the second phase (for our proposed hybrid models). The major contribution in this paper is that the traditional GA has only one crossover method and one mutation method to produce offspring, which means that each individual could only search for solutions in one dimension for each iteration. In our design, we developed a new crossover process and a new mutation process for the purposed model, including four different crossover methods and four different mutation methods, to realize finite quantum search dimension. The selection process was achieved through a quantum random number generator, which was optimized to reduce the offspring's fitness value, produced by each method in the proposed hybrid model.
The results from the experiment proved that both hybrid models could provide good solutions, with fair quality for CO 2 emission minimization for 29 PRPs out of a total of 30 instances (30 runs each for all problems) and showed no difference between the best results on each instance, if not the same. This research is applicable and usable for solving real-world problems from various fields, especially in sustainable logistics management. Solving the PRPs to minimize carbon dioxide emission reduces fuel consumption and lowers the environmental impacts, following one of the seventeen sustainable development goals set by the United Nations.
For further research, modern metaheuristic algorithms may be considered to increase calculation performance. More advanced quantum computing technology may be applied when quantum computers are widely used in the world.