A Bi-Objective Vehicle-Routing Problem with Soft Time Windows and Multiple Depots to Minimize the Total Energy Consumption and Customer Dissatisfaction

: In recent years, the impact of the energy crisis and environment pollution on quality of life has forced industry to actively participate in the development of a sustainable society. Simultaneously, customer satisfaction improvement has always been a goal of businesses. It is recognized that efﬁcient technologies and advanced methods can help transportation companies ﬁnd a better balance between progress in energy saving and customer satisfaction. This paper investigates a bi-objective vehicle-routing problem with soft time windows and multiple depots, which aims to simultaneously minimize total energy consumption and customer dissatisfaction. To address the problem, we ﬁrst develop mixed-integer programming. Then, an augmented (cid:101) -constraint method is adopted to obtain the optimal Pareto front for small problems. It is very time consuming for the augmented (cid:101) -constraint method to precisely solve even medium-sized problems. For medium- and large-sized problems, two Non-dominated Sorting Genetic Algorithm-II (NSGA-II)-based heuristics with different rules for generating initial solutions and offspring are designed. The performance of the proposed methods is evaluated by 100 randomly generated instances. Computational results show that the second NSGA-II-based heuristic is highly effective in ﬁnding approximate non-dominated solutions for small-size and medium-size instances, and the ﬁrst one is performs better for the large-size instances.

The vehicle-routing problem with time windows (VRPTW) has been extensively studied in the context where high-performance industries hope their commodities and materials are delivered and picked up within an expected time window. Depending on the type of time window, VRPTW can be further distinguished into VRP with hard time windows (VRPHTW) and VRP with soft time windows (VRPSTW). For VRPHTW, a route is feasible only if every customer is served within the time window (c.f. Ho and Haugland [16], Alvarenga et al. [17], Dabia et al. [18], Desaulniers et al. [19]). VRPSTW can be considered as a relaxation of VRPHTW, which has valuable practical applications (Figliozzi [20]). For the VRPSTW, time window violation is permitted but with a penalty, i.e., early or/and late services will generate an extra cost. Liberatore et al. [21] propose a branch-and-price algorithm for the VRPSTW. Lu and Yu [22] study a pickup and delivery VRPSTW and combine data envelopment analysis with a genetic algorithm (GA). Duygu et al. [23] deal with a VRPSTW with stochastic travel times and develop a tabu search method for it. Iqbal et al. [24] propose a hybrid meta-heuristic combining artificial bee colony (ABC) algorithm for a multi-objective VRPSTW, in which the objectives are to minimize total traveling distance, number of time window violations and number of required vehicles. Kumar and Panneerselvam [25] give a comprehensive survey about the various exact methods and the heuristics and meta-heuristics used to solve the VRP and its variants, especially the VRPTW and the capacitated vehicle-routing problem (CVRP). For a more recently survey, Dixit et al. [26] review some of the recent advances in the VRPTW using meta-heuristic techniques, e.g., evolutionary and swarm intelligence-based algorithms such as Particle Swarm Optimization (PSO). The authors also highlighted the research gaps and prospects in this field.
In this work, we investigate a variant of the VRPTW with energy-saving consideration, to minimize total energy consumption and customer dissatisfaction (i.e., total violation of time windows). For small-size problems, the augmented -constraint method is adopted to obtain the exact Pareto fronts. Two NSGA-II-based heuristics are carefully designed to tackle medium-and large-size problems. Extensive computational experiments are conducted to evaluate and compare the performance of these two algorithms.
The objective of this study is to address the bi-objective VRP with soft time windows and multiple depots from the perspectives of the construction of mathematical formulation and the design of heuristic algorithms. There are three tasks in this study, which are: (1) construct a mixed-integer linear programming model to describe the investigated problem comprehensively; (2) design a detailed procedure of the heuristic algorithms to obtain good Pareto fronts quickly; and (3) conduct the computational experiments to evaluate and validate the performance of the proposed algorithms. The main contribution of this paper is threefold: (1) We first consider bi-objective optimization for the energy minimization VRP with soft time windows and multiple depots. (2) A novel mixed-integer programming model is formulated.
(3) An augmented -constraint method is adopted to obtain the optimal Pareto solutions for small-size instances. In addition, two NSGA-II-based heuristics are designed to efficiently solve medium-and large-size instances.
The remainder of the paper is organized as follows. Section 2 reviews the related literature. Section 3 presents the description of the problem and proposes a mixed-integer programming formulation. Section 4 gives details of the augmented -constraint method and two NSGA-II-based heuristics. Computational results on numerous instances are reported in Section 5. Conclusions and future research directions are suggested in Section 6.

Literature Review
Multiple depots and the energy-saving consideration are considered to be important characteristics for the investigated problem. To be more focused, we will only review the most relevant literature bellow.
In practice, there may be many depots for vehicles to depart from. Consequently, multiple-depots VRP has been investigated by many researchers. Cordeau et al. [27] investigated multiple-depots VRP with time windows (MDVRPTW) and develop a hybrid tabu search heuristic to minimize the number of vehicles. Dondo and Cerdá [28] studied the MDVRPTW with a fleet of heterogeneous vehicles. They provided a mixed-integer linear programming, and proposed a three-phase cluster-based algorithm. Goela and Gruhn [29] studied a MDVRPTW with heterogeneous vehicle fleets and pickup and delivery orders. They proposed a large neighborhood search to maximize the transportation profit. Flisberg et al. [30] considered a multi-period MDVRPTW with split pickup and delivery, and a heterogeneous truck fleet. A linear programming and tabu search algorithm is proposed. Bettinelli et al. [31] developed a branch-and-cut-and-price algorithm for MDVRPTW with heterogeneous vehicles to minimize the total costs. Luo and Chen [32] proposed an improved shuffled leapfrogging algorithm for MDVRPTW to minimize the total costs. There has been several pieces of research focusing on multi-objective MDVRPTW. Tan et al. [33] provided an evolutionary algorithm for a multi-objective MDVRPTW, which aims at minimizing total distances and number of vehicles. Ghoseiri and Ghannadpour [34] developed a GA for MDVRPTW to minimize the number of vehicles and total distances. Karakatič and Podgorelec [35] focused on the MDVRP. A detailed survey of GAs designed for solving MDVRP was presented. The results of a thorough experiment are presented and discussed, which evaluate the efficiency of different existing genetic methods on standard benchmark problems. The insights into strengths and weaknesses of specific methods, operators and settings are presented. The above multi-depot VRP (MDVRP) research mainly focused on the improvement of transportation efficiency and effectiveness at minimizing travel costs or distances, but energy saving and environment concern have not been considered.
The following researches take the energy consumption and/or customer satisfaction into consideration. Hassanzadeh and Rasti-Barzoki [36] proposed a new bi-objective mathematical model to reduce the consumption of energy and decrease the tardiness penalty in supply chain scheduling and the VRP. A new Non-dominated Sorting Genetic Algorithm based on shaking and local search strategies of the Variable Neighborhood Search algorithm was developed. The performance of the proposed algorithm was demonstrated by computational experiments.
Zhang et al. [37] considered an Electric Vehicle-Routing Problem (EVRP) to minimize the energy consumption of electric vehicles. The corresponding mathematical model is formulated. An ant colony (AC) algorithm-based meta-heuristics is proposed as the solution method of the problem. The benefits of using an energy consumption-minimizing objective function rather than a distance-related one is also illustrated.
Ghannadpour and Zarrabi [38] presented a new model and solution for the multi-objective heterogeneous vehicle-routing and scheduling problem. A new mathematical formulation for the VRPTW is also presented using the proposed concept of heterogeneities. The customers' priority for servicing is considered.
Abad et al. [39] proposed a bi-objective model for the pickup and delivery pollution-routing problem with integration and consolidation shipments in cross-docking system, to minimize total system cost, and total fuel consumption by vehicles. Three multi-objective meta-heuristic algorithms are proposed to solve this problem.
Androutsopoulos and Zografos [40] formulated and solved a bi-objective, time, load, and path-dependent vehicle-routing problem with time windows (BTL-VRPTW). The need to address simultaneous routing and path-finding decisions is considered to be a key feature. A generic solution framework is proposed to address this problem.
Recently, the trend of sustainable development has gained increased attention as a near-future business driver. Many researchers are focused on sustainable production, sustainable transportation, etc. Demartini et al. [41] introduced a Manufacturing Value Modeling Methodology (MVMM), and sustainability is investigated as a performance dimension when applying the MVMM. In their follow-up study, Demartini et al. [42] used the MVMM as a value-mapping framework to help firms in creating value propositions better suited for sustainability considering economic, environmental and social perspectives. Concerning sustainability, the setting of a catalogue that presents an overview of sustainable external and internal impact factors is constructed. A mapping between them that translates business goals into manufacturing strategy is provided. The operational performance is improved by adopting a set of sustainable industrial practices.
On the other hand, energy and environment problems have been gaining increased attention in the sustainable development context. Various variants of VRP that consider energy consumption and greenhouse gas emissions are beginning to be studied, such as Energy Minimization VRP (EMVRP) (c.f. Kara et al. [43]; Fukasawa et al. [44]), Pollution-Routing Problem (PRP) (c.f. Bektaş and Laporte [45]; Demir et al. [46]) and Green VRP (GVRP) (c.f. Erdoǧan and Miller-Hooks [47]; Lin et al. [48]). Single depot EMVRP is first introduced in Kara et al. [43]. They first defined arc cost as the product of vehicle weight and arc length and assume that the vehicles types, vehicle speed, air condition, and road condition are constant. Xiao et al. [49] proposed a simulated annealing heuristic for the EMVRP. The problem was further studied by Fukasawa et al. [44], and two formulations and a branch-and-cut algorithm are proposed. Liu et al. [50] investigated a VRP with alternative paths. The objective is also related to energy consumption, i.e., finding a route with minimal carbon footprint. The time-dependent heterogeneous-fleet vehicle is considered. A GA is designed to solve this problem. The experimental results show that the alternative path has a significant impact in terms of carbon footprint. However, in existing works, MDVRPTW with energy minimization has not been investigated.
In real applications, energy-saving and customer satisfaction are often incompatible, but a better balance between the two objectives should be achieved by efficient optimal models and methods. Little research can be found in the literature. Demir et al. [46] investigated a bi-objective PRP with single depot focusing on simultaneously minimizing the fuel consumption and total driving time. An adaptive large-neighborhood search algorithm was proposed. To the best of our knowledge, bi-objective EMVRP with soft time windows and multiple depots has not been studied.

Problem Description and Formulation
In this section, the problem is described in detail. Then, a mixed-integer linear programming is proposed to represent the investigated problem comprehensively.
EMVRP is defined on a complete graph G = (V, A) with V = N 0 ∪ N 1 , where A, N 0 and N 1 denote the set of arcs, the set of depots and the set of customer locations, respectively. Let d ij denote the distance between vertex i and j, and d ij = d ji . K = {1, 2, ..., |K|} represents the fleet of heterogeneous vehicles located at N 0 , and vehicle k has the curb weight w k and the capacity V k . A desired pickup time window of customer i ∈ N 1 is denoted by [e i , l i ]. The vehicle must wait until e i to start the service if it arrives early. The customer will be dissatisfied if the service starts late, and the dissatisfaction is measured by max{0, a ik − l i }, where a ik denotes the arrival time of vehicle k at customer i. In this paper, we assume that the traffic condition and the speed of the vehicles are constant. The problem consists of determining a set of routes for vehicles such that (i) each customer is served exactly by a single vehicle; (ii) each vehicle starts from its departure depot with its curb weight w k , and picks up q i when it visits customer i. The total load carried by any vehicle does not exceed its capacity V k ; and (iii) the departure depot of each vehicle is preset, and the arrival depots of vehicles should be optimized.
In this paper, energy consumption on an arc is computed by multiplying the length of arc and the total weight of the vehicle, as in Kara et al. [43] and Fukasawa et al. [44]. The customer dissatisfaction is defined as the total tardiness of vehicles (∑ i∈N 1 ∑ k∈K max{0, a ik − l i }).
In the following, we give parameters and decision variables definitions. Then a novel bi-objective mixed-integer linear programming for the EMVRP is formulated. Since the investigated problem can be reduced to the VRP, which is NP-hard, then the investigated problem is also NP-hard.
α ik : equal to 1 if depot i ∈ N 0 is set to be the departure depot of vehicle k ∈ K, 0 otherwise. d ij : the distance between two vertexes i and j, i, j ∈ N 0 ∪ N 1 , i = j. e i : the earliest pickup time at customer i, i ∈ N 1 . l i : the latest pickup time at customer i, i ∈ N 1 . q i : pickup quantity at customer i, i ∈ N 1 . s i : service time at customer i, i ∈ N 1 . w k : curb weight of vehicle k, k ∈ K. V k : load capacity of vehicle k, k ∈ K. v: the average speed of vehicles. M: a large enough number. Variables:

The Formulation of the EMVRP
The formulation is as follows: Objective function f 1 is to minimize the total energy consumption, f 2 aims to minimize the customer dissatisfaction.
Constraint (3)-(5) ensures that each customer is served exactly once by a vehicle.
Constraint (6) ensures that vehicles do not travel between departure and arrival depots.
Constraint (7) ensures that each vehicle returns back to a depot after serving customers.
Constraint (8) ensures that vehicle k starts from its departure depot.
Constraint (9) ensures that the total load of vehicle k does not exceed its capacity V k .
Constraints (10) and (11) ensure that the load of vehicle k equals its curb weight w k when it departs from the depot.
Constraints (12) and (13) detail the way of calculating the load carried by vehicle k before it loads at vertex i, which also can be used as subtour elimination.
Constraints (14)- (17) explain the calculation method of energy consumption of vehicle k before it loads at vertex i.
Constraints (18)-(21) calculates the arrival time of the vehicle k at each vertex.
Constraints (22) and (23) define the tardiness of vehicle k at vertex i.
Constraints (24) and (25) are the restrictions on decision variables.

Solution Domain
There are various methods for solving multi-objective optimization problems, such as weighting method, -constraint method, evolutionary algorithms, etc. NSGA-II and -constraint method are known to be the most popular methods for bi-objective problems. In this section, we first describe bi-objective optimization and the principle of augmented -constraint method. Then, augmented -constraint method is adopted and two NSGA-II-based heuristics are designed.

Bi-Objective Optimization Problem
We consider the bi-objective optimization problem in the form as bellow, see Equations (26) and (27): where f 1 (x) and f 2 (x) denote total energy consumption and customer dissatisfaction respectively. Item x represents a decision variable vector, which belongs to the feasible solution region X defined by (3)- (25). A solution x is non-dominated only if it cannot be replaced by another solution which reduces one objective without increasing another (Tkindt and Ballaut [51]). A non-dominated solution is said to be Pareto-optimal, and the image of corresponding objective values of non-dominated solutions is called the Pareto front.

The Augmented -Constraint Method
The basic idea of -constraint method is to transform the bi-objective problem into a series of single-objective problems, which optimizes one objective with restricting another by a bound . The definition of the value of in each iteration is one of critical factors for -constraint method. For our problem, the second objective is considered to be a constraint and restricted by .
, the range of , is obtained by following ideal point and nadir point (Bérubé et al. [52]). - Ideal point: Not all solutions obtained by the -constraint method are non-dominated (c.f. Ehrgott and Ruzika [53]). To avoid iterations that generate dominated solutions and accelerate the whole process, augmented -constraint method is proposed by Mavrotas [54], and further improved by Mavrotas and Florios [55]. It introduces a slack/surplus variable and a bypass coefficient. For our problem, we transform the original bi-objective problem into several single-objective problems, as shown in Equations (28)- (30): x ∈ X (30) where eps: a very small number, i.e., eps = 10 −5 .
where i denotes the iteration counter, and step is the step size of epsilon and set as the minimal unit value of f 2 . -X : the feasible region.
The value of is also bounded by interval . By varying the value of , a sequence of single-objective problems can be generated and solved. For each single-objective problem, -constraint is presented by an equality combining the slack variable S, and f 2 . To optimize the objective, f 1 is minimized and slack variable S is maximized with lower priority as its weight coefficient eps is very small, forcing the program to obtain only non-dominated solutions (Mavrotas [54]).
As stated by Mavrotas and Florios [55], when slack variable S is larger than the step, then the same solution will be obtained with the slack variable being S − step in the next iteration. Generation of no new non-dominated solutions makes the iteration redundant and therefore it can be skipped. Then a bypass coefficient b implying the number of consecutive iterations that can be skipped is introduced, which is calculated as the integer part of (S/step). The framework of augmented -constraint method is shown in Algorithm 1.

Algorithm 1:
The augmented -constraint method. 1 i = 1. (Initialize iteration counter) 2 Compute the Ideal point and the Nadir point; Solve problem P ( ) exactly, obtain an optimal solution x * and ( f 1 (x * ), f 2 (x * )), calculate the bypass coefficient b; To obtain exact Pareto front is time consuming for the augmented -constraint method, even impossible for almost large-size NP-hard problems. In the next subsection, we focus on developing heuristics to find approximate Pareto front.

NSGA-II Based Heuristics
Non-dominated Sorting Genetic Algorithm-II (NSGA-II), proposed by Deb et al. [56], is a fast and elitist multi-objective evolutionary algorithm. NSGA-II starts with an initial set of solutions. During each generation, offspring solutions are reproduced by parent solutions using genetic operators. Then the population combining current solutions and offspring solutions is renewed. To be specific, the combined population is sorted by non-domination sort procedure into different ranks {F 1 , F 2 , ..., F t , ...}. Solutions of ranks in ascending order are sequentially added into the new population P, if |P| + |F t−1 | ≤ pop and |P| + |F t | > pop, the first pop − |P| solutions in level F t are selected according to their crowding distances in descending order, where pop denotes the predetermined population size. After the predefined number of generations is completed, a set of approximate Pareto solutions can be obtained.
In this section, two NSGA-II-based heuristics H 1 and H 2 with different rules of generating initial solutions and reproducing offspring are designed for our problem. The proposed heuristics are composed of (1) solution representation; (2) evaluation and feasibility check; (3) population initialization; (4) selection; (5) genetic operators; and (6) recombination.
To differentiate the two algorithms, the following statements should be noticed. H 1 focuses on constructing routing for each vehicle based on the customer sequence, while H 2 searches for available vehicles for each customer. Besides, the initialization procedure of two algorithms are various, as shown in Algorithms 2 and 3. Although these two algorithms share the same mutation operator 1, H 1 adopted a different crossover operator and a different mutation operator 2 compared to H 2 . The detailed steps of H 1 and H 2 are given as follows.

Solution Representation
Solutions are represented by chromosomes. By analyzing the proposed problem, we find that the factors which can affect the objective values are: (1) assignment of each customer to exactly one vehicle; (2) the sequence of customers to be served by each vehicle; and (3) arrival depot of each vehicle. Therefore, in two heuristics, a solution is composed of three decision parts: (i) the first part is composed by the numbers of customers served by vehicles; (ii) the second part is the order of customers to be served by each vehicle; and (iii) the third part is the arrival depot of each vehicle. Figure 1 illustrates the chromosome representation. Assume that there are 10 customers and 3 vehicles, and all vehicles depart from depot 1. The chromosome in Figure 1 shows that vehicle 1 serves 2 customers (8,4) in order and arrives at depot 1, vehicle 2 serves 4 customers (3,9,10,5) in order and then arrives at depot 2, and vehicle 3 serves 4 customers (2, 6, 7, 1) in order and arrives at depot 1.

Evaluation and Feasibility Check
The evaluation step is to determine fitness values of an individual. For our problem, evaluation procedure corresponds to the computation of total energy consumption and total tardiness of a solution. A chromosome must be translated into the routes of vehicles before evaluation.
Consider a chromosome S mentioned above. The route of each vehicle can be obtained accordingly as shown in Figure 1. If a vehicle serves no customer, the energy consumption and tardiness of this vehicle are set to be zero. Otherwise, let vector path k represents the route of vehicle k. For instance, path 2 = {1, 3, 9, 10, 5, 2} denotes the route of vehicle 2 in Figure 1, where the first and the last element represent the departure depot and arrival depot of vehicle 2. The cumulative energy consumption R ik of vehicle k before it loads at a vertex i ∈ N 0 ∪ N 1 is detailed in constraints (10)- (17), and the tardiness L ik of vehicle k at a customer i ∈ N 1 , is defined by constraints (18)- (23). The calculation of two fitness values, i.e., total energy consumption and total tardiness, are given by (1) and (2).
After a solution is generated, it is possible that this solution is infeasible, which violates some constraints provided by the formulation in Section 3. Feasibility check procedure is to ensure the feasibility of the problem. We can find that by using the solution representation in Section 4.3.1, constraints (3)- (8) and (24)- (25) are immediately satisfied, and constraints (10)- (23) correspond to the definition of cumulative energy consumption and tardiness. Therefore, the feasibility check procedure mainly focuses on checking whether the capacity constraint (9) is violated. If the feasibility check is not passed for a solution, that is, the load of at least one vehicle exceeds its capacity in the solution, then a very large penalty will be added to both of its fitness values in order to discard this solution through the non-dominated sorting procedure.

Population Initialization
The initial population includes several feasible solutions. To generate an initial solution in H 1 , firstly the indices of customers are sorted randomly, and based on that a vehicle is selected at random to serve customers one after another until its capacity limit is reached. Then for remaining unserved customers, another vehicle is chosen, and the same operation is repeated until each customer is served by exactly one vehicle. The arrival depot of each vehicle is randomly selected, after which the routes of all vehicles are determined and then coded by the solution representation method. The framework of population initialization in H 1 is illustrated in Algorithm 2.
Population initialization in H 2 is shown in Algorithm 3. For each customer, there is an initial set of available vehicles, which can serve this customer without violating the capacity constraint. From every customer 1 to |N 1 |, after the vehicle that serves each customer is randomly selected from the set of available vehicles, the load of this vehicle and the available vehicles for the next customer are updated. Then the sequence of customers served by each vehicle is randomly arranged, arrival depot of each vehicle is selected at random, and routes of all vehicles are determined and coded by solution representation method. if Veh(k) can load at Seq j without violating capacity constraint then 12 The jth customer is added into the path of the kth vehicle: path Veh(k) = path Veh(k) ∪ Seq(j);

13
The load of the kth vehicle is updated: load Veh(k) = load Veh(k) + demand of Seq(j);

14
The set of unserved customers is updated: Seq(j) = Seq(j)\{Seq}; Update S ini = S ini ∪ S i .

Selection
After the non-dominated sorting procedure is completed, initial solutions are sorted into different non-dominated ranks and their crowding distances are computed. It is implied that between two individuals with different ranks, the one with lower rank is preferred, when they are of the same front, the one with larger crowding distance is selected.

Genetic Operators
Genetic operators, including crossover and mutation procedure, are performed for reproducing offspring in each generation. The crossover procedure of H 1 is shown in Figure 2. We randomly choose a vehicle and exchange its routes between two parent solutions and assign remaining vehicles randomly for the unserved customers. For mutation procedure, there are two ways: (i) exchanging the sequences of two customers in the route of one certain vehicle, which is shown in Figure 3; and (ii) exchanging two customers served by different vehicles as shown in Figure 4.   In terms of crossover procedure in H 2 , we focus on preserving similarities between two parental solutions. For each vehicle, if it serves the same customer in two parental solutions, then the customer will be still served by this vehicle in the offspring solutions. In addition, we randomly assign the remaining customers to vehicles, then arrange the sequence of customers to be served by each vehicle at random. There are two methods for mutation: (i) exchanging the sequences of two customers in the route of a certain vehicle as shown in Figure 3; and (ii) changing the terminal depots for vehicles by selecting a new depot randomly.

Recombination
During each iteration, the population which combines current solutions and offspring solutions will be updated. Population is sorted into various non-dominated ranks {F 1 , F 2 , ..., F t , ...} by non-dominated sorting procedure. Solutions in F 1 are updated into the renewed population if the population size does not exceed pop. Then solutions in F 2 are added into the renewed population as long as the population size is no more than pop, and so on. The operation will not be stopped until the population size exceeds pop by adding the solutions in front F t . To maintain several pop solutions, solutions in front F t are selected by the selection procedure described in Section 4.3.4.

Computational Experiments
In this section, the performance of proposed methods is evaluated by 100 randomly generated instances, which includes small-, medium-, and large-size ones. All algorithms were coded in MATLAB R2014b. All computational experiments were conducted on a PC with 3.60 GHz processor and 8.00 GB RAM under windows 10 operating system. For the augmented -constraint method, CPLEX 12.6 is used to solve single-objective problems, computation time for each single-objective optimization and total computational time are limited within 7200 s and 36,000 s respectively.
For two NSGA-II-based heuristics, a preliminary analysis was conducted to fine-tune the parameters, which is presented in Table 1. Population size and generation number are set to be 500 and 20 respectively. In computational experiments, tournament size is set to be 0.5 × pop as in line with the classic NSGA-II in Deb et al. [56].

Data Generation
For small-size and medium-size instances, the data set is derived from instances for VRPTW proposed by Solomon [57]. Solomon's benchmark problems have been grouped into three different types: C, R, and RC. Each type of data includes 8 to 12 instances with a central depot, 100 customers, homogeneous vehicles, and different time windows. Customers in type R are uniformly and randomly distributed, and those in type C have been clustered. Problems of RC-type are mixtures of both random and clustered locations. Time windows in sets of 'R1', 'C1' and 'RC1' are narrow, while those in sets of 'R2', 'C2' and 'RC2' are wider. Euclidean distances and traveling times are numerically identical.
We set the curb weight of each vehicle w k = ρV k with ρ = 0.15 as in Kara et al. [43]. For instance, with |N| customers, |D| depots and |K| vehicles, data set of customers are derived from the original benchmark problem R-101 (Solomon [57]) by considering the first |N| customers. Euclidean coordinates of the depots are generated by considering the first |D| depots detailed in Table 2, where the coordinates of the first 7 depots are in line with Dondo and Cerdá [28] and those of the last 3 depots are randomly generated. The capacities of vehicles are generated by considering the first |K| vehicles in Table 2. Without loss of generality, departure depot for each vehicle is randomly assigned.

Performance Metrics
Since the output of a bi-objective optimization is a set of non-dominated solutions, solution quality is less straightforward. As stated in Liu et al. [58], an ideal bi-objective method has two features: (i) it can find a set of solutions close to the optimal Pareto front; and (ii) solutions have a large diversity. To evaluate the performance of the approaches, three widely used metrics, which are cardinality, hyper-volume ratio and average e-dominance, are introduced (Cheng et al. [59]).
Cardinality (Q): this metric measures the number of non-dominated solutions obtained by each method (Van Veldhuizen and Lamont [60]). A larger Q implies a better solution set.
The hyper-volume Ratio (H): This measure calculates the ratio of the hyper-volume (HV) of a solution set A and a reference set R, which is denoted as HV A and HV R , separately. It is calculated as Equation (31): The reference solution set can either be the exact Pareto front or a set of high quality non-dominated solutions. HV implies the size of dominated space for a set of solutions (Zitzler et al. [61,62]). This indicator calculates the volume of a given region H. For bi-objective problems, HV is equal to the union of the rectangle shaped by the nadir point and each point in the set of non-dominated solutions, the volume of overlapped area is only calculated once. A larger value of this metric implies a better set of solutions.
Average e-dominance (D): this measure implies the average distance between a solution set A and the reference set R, which is shown in Equation (32). The closer the value of D to 1 implies a better performance. where,

Results and Discussion
Computational results obtained are reported in Tables 3-5. For each instance, we run each algorithm 30 times independently and obtain the average value of these results (Cheng et al. [59]). Table 3 represents the computational results of the small-size instances in which the number of customers ranges from 5 to 11. Since augmented -constraint method direct uses CPLEX and it can obtain the optimal Pareto fronts, which is selected as the reference set. It is understandable that both H and D value of the reference set itself are equal to 1, therefore the values of these two indicators of augmented -constraint method are omitted in Table 3. By comparing the value of Q of the methods, it can be seen that on average the performance of H 1 is a bit worse in terms of the number of non-dominated solutions. Meanwhile, it also can be seen that the average value of H of H 2 is larger and closer to 1 compared with that of H 1 , which implies that the solutions yielded by H 2 have similar quality to the exact Pareto font in terms of dominated space on average. Besides, according to the computational results, we can find that the average D value of H 2 is closer to 1, indicating that the average distance from solutions of H 2 to the reference set is smaller than that of H 1 . Moreover, H 2 performs better in terms of computational time. For small-size instances, concluding, H 2 is much more efficient than H 1 . In the meantime, CPLEX loses its power to solve the problem exactly as it scales up, even failing to obtain optimal solutions for instances with more than 12 customers. Then, for the medium-size and large-size instances, the reference solution set is formed by selecting the non-dominated solutions from the combination of the two fronts obtained by the heuristics. Both algorithms can obtain approximate solutions for instances with 20 to 100 customers within 40 s. Table 4 presents the computational results of the medium-size instances. The average computational time of H 1 is about 1.72 times that of H 2 , which shows that H 2 performs better in terms of the computational time. Furthermore, it can be seen that the average number of non-dominated solutions yielded by H 1 is slightly less than that obtained by H 2 . By comparing H values of two heuristics, we can find that the solutions yielded by H 2 have higher quality than H 1 on average. Besides, there is little difference between the average D values of two heuristics, which implies that the distance between the solutions obtained by the heuristics is small. For medium-size instances, to conclude, H 2 performs better than H 1 in terms of computation time, the number of non-dominated solutions. and the solution quality.
Computational results of large-size instances with 200 to 800 customers are reported in Table 5. It can be obtained that the computational time of H 2 increases much faster than H 1 as the problem scales up, which implies that H 1 is relatively stable and performs better for large-size instances in terms of computational time. The quality of solutions obtained by H 1 is higher than that of H 2 in terms of the average value of H. However, the average number of solutions obtained by H 1 is less than that yielded by H 2 . Moreover, we can find that the average D value of solutions of H 1 is slightly larger than that of H 2 . By comparing the results of two methods under different sizes of problems, we can find that, on average: (1) two heuristics are much more efficient than the augmented -constraint method incorporating with CPLEX in terms of computational time; (2) for small-size and medium-size instances, H 2 performs better than H 1 in terms of solution quality and computational time; (3) for large-size instances, H 1 outperforms H 2 in terms of computational time and the solution quality; and (4) for large-size instances, H 2 can help obtain more non-dominated solutions than H 1 .

Conclusions
This paper investigated a bi-objective VRP with soft time windows and multiple depots, which aims to minimize total energy consumption and customer dissatisfaction simultaneously. A bi-objective mixed-integer linear programming model is presented for the problem. Then augmented -constraint method are adopted to obtain the exact optimal Pareto front for small-size problems, and two NSGA-II-based heuristics are designed to obtain approximate Pareto solutions for medium-and large-size problems. Computational experiments are conducted. The results show that H 2 performs relatively better than H 1 for small-and medium-size problems. For large-size problems, H 2 could find more Pareto solutions, while H 1 is able to obtain approximate Pareto fronts with better convergence in less computation time.
In addition, the property of the investigated problem is not fully explored, which may be the main limitation of this study.
For future research, several interesting issues may be addressed. Firstly, some other factors that may affect the objectives may be considered, such as the traffic conditions and the acceleration of vehicles. Besides, we may extend the study to other variants of the investigated problem, e.g., periodic VRP and pickup and delivery problems. Moreover, we also can combine the proposed method with other heuristics to develop more efficient solutions.