Hybrid Heuristic for the Multi-Depot Static Bike Rebalancing and Collection Problem

: The bike rebalancing problem is one of the major operational challenges in the urban bike-sharing system, which involves the redistribution of bikes among stations to prevent stations from being empty or overloaded. This paper investigates a new bike rebalancing problem, which considers the collection of broken bikes in the multi-depot system. The proposed problem can be classiﬁed as a two-commodity vehicle routing problem with pick-up and delivery. An integer programming model is formulated to ﬁnd the optimal vehicle assignment and visiting sequences with the minimum total working time and ﬁxed cost of vehicles. A hybrid heuristic algorithm integrating variable neighborhood search and dynamic programming is proposed to solve the problem. The computational results show that the proposed method can ﬁnd 26 best solutions out of 36 instances, while the CPLEX obtains 16 best solutions. Impact of broken bikes collection and distribution of depots is examined. Comparison of different practical strategies indicates that the number of vehicles can be signiﬁcantly reduced by allowing multiple visits to depots. Allowing vehicles to return to different depots can help reduce the total working time.


Introduction
In recent years, public bike-sharing systems (BSS) flourished worldwide as a complementary mode to public transportation for last mile service [1].Since the first BSS installed in 1965, the number of systems increased to more than 1000 around the world [2].Bike riding, which is a green mode of transportation, can be used to decrease the dependence on automobiles, thereby relieving traffic congestion and air pollution.A BSS provides rental services of public bikes to citizens.Rental stations are typically located in residential areas, commercial areas, and the places near public transportation stations.Citizens can rent a bike at any station, use it for a journey and automatically return it to any station.Bike rental demand is unevenly distributed; hence, stations easily become empty and full, which results in difficulties in renting and returning bikes.To improve the service quality of the system, the system operators need to redistribute bikes from stations with excess bikes to those with insufficient bikes using heavy vehicles, which is known as bike rebalancing.
The bike rebalancing problem (BRP) is to route vehicles to perform rebalancing requirements with minimum travelling cost [3].The use of heavy vehicles during the operations is expensive and produces heavy carbon emissions; thus, conducting vehicle routing optimization is necessary to decrease the transportation cost of these heavy vehicles [4].The bike rebalancing can be performed in both dynamic and static manners.The dynamic rebalancing is a day-time operation while the system is in high usage and the usage rate changes during the operation.The static rebalancing assumes no user activity during rebalancing, which is a night-time operation.As heavy trucks are always forbidden to enter the inner cities during the day-time and the systems want to guarantee the safety of the riders, we focus on the static rebalancing in this paper.In this case, the rebalancing and collection requests of each night can be extracted from the information system before the planning.The ideal inventory of each station is set based on the historical data analysis and the broken bikes are recognized by the daily inspection and customer reports.
As time passes, bikes may break down because of repeated usage or other reasons.Broken bikes occupy space in stations and cannot serve users, which decrease operational efficiency and user satisfaction of the BSS.Therefore, collecting broken bikes from stations and transporting them to the depot is necessary.Broken bikes will be repaired in the depot and then returned back to the stations.The addition of the broken bike collection activity to the usable bike rebalancing operation is an approach to address such a problem [5,6], and needs to be optimized, as it makes the rebalancing less effective.As the broken bikes cannot be unloaded after collection, it leads to an increase in the cumulative number of bikes on vehicles during routes.Vehicles always stop serving and need to return back to the depot due to limited vehicle capacity.Further, more stations and bikes are installed into the system with the expansion of the BSS.Therefore, more depots are required in the BSS to reduce the traveling cost and improve the service level.In this situation, stations are not necessarily assigned to their nearest depots.These new changes in the real situation motivate us to extend the original studies to capture systems with broken bikes and multiple depots.
In this paper, we propose a multi-depot static bike rebalancing and collection problem (MSBRCP).In the MSBRCP, a set of stations with rebalancing and collection requests need to be served by a set of capacitated vehicles from a set of depots.We need to find the optimal visiting sequences for all the vehicles with the minimum travelling cost.Before starting their routes, vehicles should be assigned to appropriate depots.Vehicles are allowed to start and end at different depots to reduce the cost.When visiting a station, both rebalancing and collection requests should be fulfilled.The proposed problem in this study can be classified as a two-commodity vehicle routing problem with pickup and delivery.The system has two types of bikes, namely, usable and broken bikes, which have different pickup-delivery relationships.The pickup and delivery problem can be classified into three types based on different pickup-delivery relationships [7], namely, many-to-many, one-to-many-to-one, and one-to-one problems.We consider more than one relationship in the proposed problem.The rebalancing of usable bikes is a many-to-many problem, which indicates that any station can be a source or destination for usable bikes.Broken bike collection is a one-to-many-to-one problem.In one route, the bikes that were repaired in a depot can be delivered to many stations, and broken bikes in many stations should be collected and returned to a depot.Therefore, the proposed problem is a type of pickup and delivery with two pickup-delivery relationships.
The MSBRCP differs from that in previous research as follows : (1) the system network includes multiple depots, allowing vehicles' multiple visits.The decisions include not only the routes of vehicles, but also the allocated starting and ending depots for each vehicle; (2) two types of bikes with two pickup-delivery relationships are considered, i.e., the usable bikes which can be reallocated from station to station, and the broken bikes which can only be returned to the depots.
Due to the difficulty of solving the MSBRCP to optimality with commercial solvers, we propose a hybrid heuristic algorithm by integrating variable neighborhood search (VNS) and dynamic programming (DP).The basic concept of VNS is the change in neighborhoods during the local search, which explores a wide solution space and avoids getting stuck in the local optimum.In the algorithm we propose, the restricted DP with two extension rules is embedded in the framework of VNS as a local search operator.It improves the feasibility and quality of solutions by resolving a set of traveling salesman problems (TSPs) included in the solution obtained by VNS.The effectiveness of the proposed algorithm is tested on both benchmark and randomized instances.Some computational experiments are conducted to examine the impact of broken bikes and distribution of depots and to compare different strategies in practice.
The remainder of this paper is structured as follows: Section 2 provides a literature review that focuses on BRP.Section 3 presents the mathematical formulation of the proposed problem.Section 4 describes the hybrid heuristic algorithm.Section 5 shows the computational experiments and results.Section 6 concludes the study and discusses the directions for future research.

Literature Review
As one of the most important operations of a BSS, bike rebalancing attracted increasing attention in recent years.In general, BRP can be categorized into two types: dynamic rebalancing problem (DBRP) and static rebalancing problem (SBRP).
For DBRP, Caggiani and Ottomanelli [8] presented a flexible fuzzy decision support system for the rebalancing process that considered the dynamic variation in demand.The method was applied to a small area with five stations and one vehicle but can be expanded to a larger area.Contardo et al. [9] proposed a mathematical model for a dynamic public bike-sharing balancing problem.A scalable methodology was presented to provide the lower and upper bounds of the problem.Regue and Recker [10] proposed a comprehensive framework to solve DBRP.The simulation results indicate that the framework could improve system performance and reduce insufficiency.Pfrommer et al. [11] combined BRP and dynamic pricing to improve the BSS.The routes of vehicles were determined by a heuristic.Ghosh et al. [12] proposed a model to determine the routes of vehicles for multiple time steps using the expected demand for bikes.The problem was solved with novel approaches based on decomposition and abstraction.Shui and Szeto [13] presented a dynamic green BRP model to minimize the unmet demands and carbon emissions of vehicles.A hybrid rolling horizon artificial bee colony algorithm was developed to solve the model.Caggiani et al. [14] investigated the dynamic management of free-floating BSS and presented a spatio-temporal forecasting model and a relocation decision support system for solving the problem.The relocation process is activated several times and generates how many bikes need to be repositioned each time.Legros [15] proposed a Markov decision process approach to derive a dynamic bike repositioning strategy to minimize the rate of arrival of unsatisfied users.They first proved the optimal inventory level at each station, then presented a policy for the bike repositioning problem using a one-step policy improvement method.Brinkmann et al. [16] presented the stochastic dynamic inventory routing problem for BSS to avoid unsatisfied demand.They proposed a dynamic lookahead policy, which was autonomously parametrized by means of value function approximation.
SBRP was first investigated by Benchimol et al. [17] based on the BSS installed in Paris in 2007.The authors classified the problem into a one-commodity pickup and delivery problem with one vehicle.Multiple visits to a station were allowed.Chemla et al. [18] also focused on single-vehicle SBRP.In their problem, stations can be visited several times and be used as buffers where bikes can be temporary stored before being moving to their final destination.A branch-and-cut algorithm was used to solve the relaxation of the model to obtain the lower bound.A tabu search was used to obtain the upper bound.The effectiveness of the algorithms was tested on a set of instances.Erdo gan et al. [19] extended the single-vehicle SBRP by considering demand intervals.The target inventory of a station was not a certain value but an interval with lower and upper bounds.A branch-and-cut algorithm with a Benders decomposition scheme was developed to solve the model.Li et al. [20] investigated a new SBRP that considered multiple types of bikes.Several types can be substituted by other types and can occupy different spaces in stations.This new problem was solved using a hybrid genetic algorithm.Xu and Zou [21] addressed a broken bike recycling problem by incentivizing participants with monetary rewards.This problem was an extension of the vehicle routing problem and a modified tabu search was developed to solve it.Many studies focus on developing solution methods for single-vehicle BRP, including exact algorithms ( [4,22]) and metaheuristic algorithms ( [23][24][25]).
In addition to the single-vehicle situation, the multiple-vehicle BRP was investigated by several researchers.Raviv et al. [26] investigated SBRP with a fleet of vehicles.The model did not specify a target inventory level for each station, but determined the inventory based on a convex penalty function incorporated into the objective function along with the total travelling cost.Two models were presented and compared through a numerical study.Dell'Amico et al. [3] presented four mathematical formulations for SBRP with a fleet of capacitated vehicles.The branch-and-cut algorithm was developed to solve the problem.Arabzad et al. [27] formulated an integer linear programming model for a periodic BRP in multiple periods.Bulhões et al. [28] studied the SBRP with multiple vehicles and visits.A branch-and-cut algorithm and an iterated local search metaheuristic were developed to solve the proposed model.Szeto and Shui [29] investigated a new BRP that determined the routes and quantities of loading and unloading at each station with the objective of minimizing demand dissatisfaction and service time.Additionally, several solution methods were developed for multi-vehicle SBRP, such as a three-step mathematical programming-based heuristic [30], a series of local search-based metaheuristics [31], a destroy-and-repair algorithm [32], a heuristic based on unsatisfied demand forecasting [6], a hybrid large neighborhood search [33,34], and a cluster-first route-second heuristic [35].Luo et al. [36] investigated the BSS with consideration of minimizing the system's life cycle greenhouse gas emissions.They presented a framework integrating a simulation model, an optimization model, and a life cycle assessment model to obtain the optimal bike fleet size and rebalancing strategy.Shui and Szeto [37] reviewed and systematically classified the existing literature of BRP.Table 1 provides a summary of the literature on the static bike rebalancing problem.

× ×
Two-phase approach [26] Branch and cut algorithm [3] Three-step heuristic [30] Local search based metaheuristics [31] Destroy and repair algorithm [32] Two-step heuristic [35] Lingo [27] Artificial bee colony algorithm [29] Simulation [36] Station However, most problems investigated in the previous studies consider only one depot in a BSS.As a BSS expands, more than one depot will be installed to match the increase in the number of stations and redistribution requests.Before routing, the assignment of vehicles should be considered in the decision making of the daily rebalancing operation.Liu et al. [38] conducted the pioneering research considering multiple depots in a freefloating BSS.They discretized the whole rebalancing duration into many equal intervals as periods, and allowed the vehicles to visit the stations multiple times.The objective was to minimize the total unmet demand, inconvenience of getting a bike, and weighted vehicles' total operational time.Moreover, broken bike collection is rarely considered in previous models.Alvarez-Valdes et al.
[6] investigated a BSS with damaged bikes.They proposed a two-stage method consisting of estimating unsatisfied demand and routing for redistribution.They verified the procedure by real data from the BSS in Spain with a single depot.Wang and Szeto [39] studied BSS considering broken bikes with the objective of minimizing the total CO 2 emissions.They applied a commercial solver to address the problem and a clustering method based on the nearest neighbor heuristic for a real-word instance.They conducted extensive experiments to discuss problem characteristics and the factors that affect the CO 2 emissions.However, the study considered only one single depot, while more depots can decrease the total working time of vehicles in the BSS.Du and Cheng et al. [40] investigated SBRP with multiple depots, visits to the stations and vehicles in free-floating BSS, and proposed a greedy genetic heuristic.They assumed the trucks to be free from travel time and distance, and each truck can only be used once.However, with regard to BRP, the trucks are easily full of malfunctioning bikes and have to return to deports to release the broken bikes.These empty trucks can go back to the system and continue to work.
According to these research gaps, we analyze multi-depot SBRP with broken bike collection in this paper.An enhanced VNS is developed to solve the problem.VNS was proposed by Mladenović and Hansen [41] and was widely and successfully applied to various hard combinatorial optimization problems [42].Cabrera-Guerrero and A Lvarez et al. proposed a VNS-based heuristic to solve the districting problem in BSS [43].VNS is proven to be efficient in solving the multi-depot vehicle routing problems (MDVRP), such as MDVRP with time windows [44], MDVRP with loading cost [45], MDVRP with private fleet and common carriers [46], and MDVRP with heterogeneous fleet vehicles [42].

Problem Description
The problem we investigated in this study is the MSBRCP, which is used to redistribute usable bikes and collect broken bikes using capacitated vehicles with the objective of minimizing total working time and fixed cost of vehicles.A BSS is represented by a complete digraph G = (V, A), which contains a set of vertices V and a set of arcs A, that represents the connections between vertices.The set of vertices is decomposed into a set of depots V D and a set of bike stations V S .The rebalancing demand q i is given for each station, which is obtained from the difference between the ideal inventory and final status when the system stops providing services.This usually happens at night-time, and during this period, no user rents or returns bikes from (to) the BSS.We assume that the ideal inventory of each station in the system is predefined by analyzing historical rental data.When the system stops serving, the rebalancing demand q i will be quickly calculated.If q i > 0, station i is referred to as a pickup station and its pickup request is q i ; if q i < 0, station i is a delivery station and its delivery request is |q i |; and if q i = 0, station i has no rebalancing request.We let q sys = − ∑ i∈V S q i to denote the rebalancing requirement of the whole system, which is used in the model to formulate the initial and final number of usable bikes on vehicles.The collection request r i (r i ≥ 0), which indicates the number of broken bikes in station i, is also given through the daily inspection and user reports.There are two types of bikes having different pickup-delivery relationships.Usable bikes can be redistributed between different stations and brought into or taken out of the system according to the rebalancing targets.Broken bikes in stations should be collected and returned to depots.
A fleet of K homogeneous vehicles (trucks) with capacity Q is available at depots, where Q is the number of bikes that can be held in an individual vehicle.Each vehicle departs from a depot and can return to any depot after redistributing usable bikes from stations with excess bikes to those with insufficient bikes and simultaneously collecting broken bikes to bring back to depots.Vehicles depart from depots empty or with an initial load (usable bikes).A usable bike can also be taken back to depots from the system according to the rebalancing requirements.The capacity of depots is not considered in this paper.We assume depots contain sufficient usable bikes that come from repaired and new supplementary bikes.
Each arc (i, j) is associated with a working time t ij , which is evaluated by the sum of the travel time on the arc (i, j) and the loading/unloading time spent in the station j.The loading and unloading times are assumed to be 1 min per bike [26].There is a maximum working duration T for each vehicle.Figure 1 illustrates an example of MSBRCP.This example has 2 depots (denoted by rectangles) and 11 stations (denoted by circles).There are 4 vehicles with a capacity of 10 bikes available at depot.A 2-tuple is associated with each station, where the former item denotes the rebalancing request of usable bikes and the latter item denotes the collection request of broken bikes.For example, station 7 is a delivery station and the number of usable bikes to be delivered is 4. In addition, the collection request for broken bikes is 2. Information for working time t ij is not shown for simplicity.The problem is to determine the allocation of vehicles to depots and the routes of the vehicles to meet all the rebalancing demand and collection requests.The objective is to minimize the total working time and fixed cost of vehicles subject to satisfying all the service requirements of the BSS.
collected and returned to depots.
A fleet of  homogeneous vehicles (trucks) with capacity  is available where  is the number of bikes that can be held in an individual vehicle.E departs from a depot and can return to any depot after redistributing usable stations with excess bikes to those with insufficient bikes and simultaneously broken bikes to bring back to depots.Vehicles depart from depots empty or w tial load (usable bikes).A usable bike can also be taken back to depots from according to the rebalancing requirements.The capacity of depots is not con this paper.We assume depots contain sufficient usable bikes that come fro and new supplementary bikes.
Each arc (, ) is associated with a working time   , which is evaluated of the travel time on the arc (, ) and the loading/unloading time spent in th The loading and unloading times are assumed to be 1 min per bike [26].maximum working duration  for each vehicle.Figure 1 illustrates an example of MSBRCP.This example has 2 depots ( rectangles) and 11 stations (denoted by circles).There are 4 vehicles with a cap bikes available at depot.A 2-tuple is associated with each station, where the f denotes the rebalancing request of usable bikes and the latter item denotes th request of broken bikes.For example, station 7 is a delivery station and the usable bikes to be delivered is 4. In addition, the collection request for broken Information for working time   is not shown for simplicity.The problem mine the allocation of vehicles to depots and the routes of the vehicles to m rebalancing demand and collection requests.The objective is to minimiz working time and fixed cost of vehicles subject to satisfying all the service req of the BSS.

Mathematical Formulation
The objective of the problem is to minimize the sum of the total working time and fixed cost of using vehicles, as shown in Equation (1).Constraints (2) to (8) restrict the route of each vehicle.Constraint (2) shows that a vehicle should be assigned to one depot at most.Constraint (3) states that every vehicle k in K starts at its assigned depot.Constraint (4) imposes that the vehicle should return to one depot after serving stations.The departure and end depots can be different.Constraint (5) ensures that each station is visited exactly once by one vehicle.Constraint (6) forbids the vehicle to move if the vehicle is not allocated to any depot.Constraint (7) ensures that two stations on an arc are served by the same vehicle.Constraint ( 8) is used to eliminate subtours.
Constraints (9) to (12) model the flow of usable bikes in the network.Constraint (9) illustrates the flow balance of usable bikes entering and leaving the vertex.Constraint (10) derives the lower and upper bounds of the load of usable bikes on a specific arc.Constraints (11) and (12) determine the initial and final number of usable bikes on vehicles.
Similarly, Constraints (13) to (16) model the flow of broken bikes in the network.Constraint (13) specifies the flow balance of broken bikes entering and leaving the vertex.Constraint (14) derives the lower and upper bounds of the load of broken bikes on arcs.Constraint (15) states that the initial load of broken bikes at the depot is zero.Constraint (16) indicates that the sum of usable and broken bikes on a vehicle cannot exceed the capacity of the vehicle.Constraint (17) restricts the maximum route duration for each vehicle.Constraint (18) derives the value ranges of binary variables.
Decision variables x ijk : Equal to 1 if arc (i, j) is travelled by vehicle k, and 0 otherwise y pk : Equal to 1 if vehicle k originated from depot p, p ∈ V D z ij : The number of usable bikes transported over arc (i, j) f ij : The number of broken bikes transported over arc (i, j) ∑ k∈K max{0, q i , −q j }x ijk ≤ z ij ≤ ∑ k∈K min{Q, Q + q i , Q − q j }x ijk , ∀i, j ∈ V S (10)  ∑ i∈V S ∑ p∈V D z pi = max{q sys , 0}

Other Practical Considerations
The formulation can be modified to cater for some other practical scenarios, such as: (a) If the vehicles need to return to the same depot from where they originated, Constraint (4) in the above formulation should be replaced by (19) as shown below: (b) In this problem, each vehicle is restricted to perform at most one route and the routes may end up shorter than the maximum working duration due to limited capacity.The collection of broken bikes may worsen the situation because broken bikes can only be delivered to depots.A consequence is the need for an oversized fleet of vehicles to satisfy all the demand ( [47]).The multiple use of vehicles overcomes the mentioned limitations, which indicates multiple visits to depots.If multiple visits to depots are allowed, Constraints (3), ( 4) and ( 7) should be replaced by (20): Together with Constraint (6), Constraint (20) ensures consecutive paths for each vehicle.We delete Constraints ( 3) and ( 4) to relax the problem that can visit depots multiple times.The solution of this formulation is a set of journeys for vehicles.In a journey, the vehicle may return to depots many times to unload and restart to rebalance and collect till the maximum working duration is reached.
The optimal solution of an instance of MSBRCP defined by ( 1)-( 18) is shown in Figure 2. We can observe that three vehicles are required when they can return to different depots.The total working time of these three vehicles is 16,400 s.If the drivers want to start from and return to the same depot, three vehicles are still required, but with longer working time (16,800 s).If we allow multiple uses of vehicles, only two vehicles are needed to complete the rebalancing and collection, and the working time remains the same with the second scenario.Considering the fixed cost of using vehicles, allowing multiple uses of vehicles attains the minimum objective function value.
The formulation can be modified to cater for some other practical scen (a) If the vehicles need to return to the same depot from where they or straint (4) in the above formulation should be replaced by (19) as show (b) In this problem, each vehicle is restricted to perform at most one routes may end up shorter than the maximum working duration due pacity.The collection of broken bikes may worsen the situation be bikes can only be delivered to depots.A consequence is the need fo fleet of vehicles to satisfy all the demand ( [47]).The multiple use of comes the mentioned limitations, which indicates multiple visits to d tiple visits to depots are allowed, Constraints (3) (4) ( 7) should be repl Together with Constraint (6), Constraint (20) ensures consecutive pat hicle.We delete Constraints ( 3) and ( 4) to relax the problem that can visit d times.The solution of this formulation is a set of journeys for vehicles.In vehicle may return to depots many times to unload and restart to rebalan till the maximum working duration is reached.
The optimal solution of an instance of MSBRCP defined by ( 1)-(18 Figure 2. We can observe that three vehicles are required when they can ferent depots.The total working time of these three vehicles is 16,400 s. want to start from and return to the same depot, three vehicles are still requ longer working time (16800 s).If we allow multiple uses of vehicles, only tw needed to complete the rebalancing and collection, and the working tim same with the second scenario.Considering the fixed cost of using veh multiple uses of vehicles attains the minimum objective function value.

Solution Method
In [3], SBRP was proven to be NP-hard by generalizing the capac routing problem.Thus, the presented problem with more complicated pi relationships is also NP-hard.We developed a hybrid heuristic algorithm MSBRCP, namely, VNS-DP.The algorithm is based on the framework o corporates the restricted DP algorithm as an improvement operator.
VNS-DP starts with an initial solution, predefined sets of neighborh and local search operators.A simple heuristic is used to construct an init which consists of several routes.A route is a partial solution that starts visits several stations, and ends at the same depot.Initially,  is set as t solution   .Then, VNS-DP iteratively improves the incumbent solutio variable neighborhood search and dynamic programming.
The first step of the iteration is shaking, which randomly generates

Solution Method
In [3], SBRP was proven to be NP-hard by generalizing the capacitated vehicle routing problem.Thus, the presented problem with more complicated pickup-delivery relationships is also NP-hard.We developed a hybrid heuristic algorithm to solve the MSBRCP, namely, VNS-DP.The algorithm is based on the framework of VNS and incorporates the restricted DP algorithm as an improvement operator.
VNS-DP starts with an initial solution, predefined sets of neighborhood structures and local search operators.A simple heuristic is used to construct an initial solution x, which consists of several routes.A route is a partial solution that starts from a depot, visits several stations, and ends at the same depot.Initially, x is set as the global best solution x best .Then, VNS-DP iteratively improves the incumbent solution by utilizing variable neighborhood search and dynamic programming.
The first step of the iteration is shaking, which randomly generates a solution x from the kth neighborhood N k of the current solution x; k is set to 1 at the beginning and incrementally increase as the number of iterations grows.Then, a set of local search operators is performed on x to find the local optimal solution x .Subsequently, an acceptance decision is made to determine whether to update the current solution or not.We apply a simulated annealing (SA) acceptance criterion ( [48]) to accept solution x .It is controlled by three parameters: initial temperature i T , cooling rate α, and the number of iterations to reset the initial temperature β.If the acceptance criterion is met, then the current solution is replaced with x and the shaking returns to N 1 in the subsequent iteration.Furthermore, if the new current solution improves the global best solution, then x best is updated.Otherwise, x remains unchanged and the neighborhood iteration continues.In addition, we check if the criterion of using DP is met; if met, the restricted DP is applied to improve every route of x.The algorithm stops and outputs the best solution when a given number of iterations N max is performed.
The pseudo-code of VNS-DP is presented as Algorithm 1.The detailed explanations of the proposed algorithm are introduced in Sections 4.1-4.3.An assignment heuristic is presented in Section 4.4 to decrease the number of vehicles used in the solution obtained by VNS-DP.The computational complexity of VNS-DP is analyzed in Section 4.5.

28:
until the stop criterion is met.

Initial Solution Generation
An initial solution is constructed using an insert-based heuristic, which generates one route at a time.The heuristic initials were obtained by randomly selecting the two depots as the departure and destination of the current route.Afterward, we divided the stations into two ordered sets: pickup set and delivery set.A station is included in the pickup set when it has a pickup rebalancing demand or no rebalancing demand (q i ≥ 0).A station is included in the delivery set when it has a delivery rebalancing demand (q i < 0).The pickup (delivery) set is sorted in descending (ascending) order.Pickup stations are first assigned into a route in order and then delivery stations are inserted until no more stations can be added without violating the vehicle load and time duration constraints.When selecting a station to be inserted, we give priority to the station ranked first in the ordered set.As we allow vehicles to start from and return to different depots, we reset the departure and destination of the current route to the depots with minimum duration.The above procedure is repeated until all the stations are covered.
When deciding if a new station can be inserted into a route, we should check if the vehicle load and time duration constraints are violated.The feasibility check procedure is as follows: First, we compute the cumulative usable and repair demand along a route σ by the Equations ( 21) and (22), respectively.
N σ(i) (σ) and R σ(i) (σ) denote the number of usable and broken bikes after visiting station σ(i) of route σ on the vehicle with an empty initial load.As the initial load of usable bikes is not limited to be 0, negative N σ(i) (σ) may also lead to a feasible route.Considering both types of bikes, a route is feasible if the inequality ( 23) is satisfied and the route duration does not reach the maximum.
Based on the above construction heuristic, the number of routes generated may exceed the maximum number of vehicles, which makes the solution infeasible.Hence, in our algorithm, the solution is evaluated by a fitness function f (x) = C(x) + ω * p f (x) + p t (x) + p v (x) , where C(x) is the objective value and the others are penalties, ω is the penalty parameter which is set to 10,000, p f (x), p t (x) and p v (x) are the values of the violations of the constraints on bike flow, route duration, and the number of vehicles, respectively, p f (x) is the sum of the number of bikes that violate constraint (23) for all routes in solution x, p t (x) is the sum of exceeded duration for all routes in x, and p v (x) is the product of a penalty parameter and the exceeded number of vehicles.The penalty parameter is set as Q.We set penalty parameter for p v (x), as the number of vehicles is more critical than the other two.

Main Iteration 4.2.1. Neighborhood Structures
The VNS procedure starts from the initial solution and systematically explores different neighborhoods to find the global optimum.The set of neighborhood structures is the core of the method.In this study, we use several types of neighborhoods that include the cyclic exchange, depot change, route split, and combine.
The cyclic exchange proposed by Thompson and Psaraftis [49] simultaneously exchanges parts of multiple routes.The cyclic exchange can create a series of neighborhoods with different settings.In this study, three parameters are considered, namely, the number of depots involved (N d ), number of routes involved (N σ ) and the maximum length of the partial route to exchange (N s ). Figure 3 shows a cyclic exchange with three routes σ 1 , σ 2 , σ 3 , i.e., N σ = 3.For j = 1, 2, and a cyclic exchange moves path σ j i j , i j + N s to the same position of route σ j+1 .For j = 3, the path σ j i j , i j + N s is moved to the same position of σ 1 .Then, three new routes are created that are shown in bold line in Figure 3.The starting position i j is determined in three ways: (1) randomly selected; (2) selecting the position that is closest to next route; and (3) selecting the most remote position in that route.For the depots involved, we distinguish between neighborhoods involving one depot (N d = 0) and those involving all depots (N d = 1).For N σ and N s , three and five choices are available, respectively, i.e., N σ = {2, 3, 4} and N s = {1, 2, 3, 4, 5}.We add 10 empty exchange situations when N σ = 2 to explore more neighborhoods.An empty exchange indicates that one of the two partial routes is empty.The cyclic exchange is defined as N b1 in Algorithm 2. The depot change is a new neighborhood for this problem, which is defined as  2 , which is shown in Figure 4.This new neighborhood is useful when the optimal solution includes only a few routes.The neighbor is constructed by changing the depot of one random route in the current solution.The number of routes involved is 1, i.e.,   = 1.The route split and combine are used to improve the feasibility of the current solution [51], and are defined as N b3 and N b4 .The route split attempts to split one selected route into two routes (N σ = 1).The combine attempts to combine two routes into one (N σ = 2).When selecting routes to split, the routes with long travel times are likely to be selected.By contrast, the routes with short travel times are likely to be selected in the combine.These two neighborhoods are, respectively, illustrated in Figures 5 and 6.The depot change is a new neighborhood for this problem, which is defined as N b2 , which is shown in Figure 4.This new neighborhood is useful when the optimal solution includes only a few routes.The neighbor is constructed by changing the depot of one random route in the current solution.The number of routes involved is 1, i.e., N σ = 1.The depot change is a new neighborhood for this problem, which is defined as  2 , which is shown in Figure 4.This new neighborhood is useful when the optimal solution includes only a few routes.The neighbor is constructed by changing the depot of one random route in the current solution.The number of routes involved is 1, i.e.,   = 1.The route split and combine are used to improve the feasibility of the current solution [51], and are defined as N b3 and N b4 .The route split attempts to split one selected route into two routes (N σ = 1).The combine attempts to combine two routes into one (N σ = 2).When selecting routes to split, the routes with long travel times are likely to be selected.By contrast, the routes with short travel times are likely to be selected in the combine.These two neighborhoods are, respectively, illustrated in Figures 5 and 6.The route split and combine are used to improve the feasibility of the current solution [51], and are defined as N b3 and N b4 .The route split attempts to split one selected route into two routes (N σ = 1).The combine attempts to combine two routes into one (N σ = 2).When selecting routes to split, the routes with long travel times are likely to be selected.By contrast, the routes with short travel times are likely to be selected in the combine.These two neighborhoods are, respectively, illustrated in Figures 5 and 6.The depot change is a new neighborhood for this problem, which is defined as  2 , which is shown in Figure 4.This new neighborhood is useful when the optimal solution includes only a few routes.The neighbor is constructed by changing the depot of one random route in the current solution.The number of routes involved is 1, i.e.,   = 1.The route split and combine are used to improve the feasibility of the current solution [51], and are defined as N b3 and N b4 .The route split attempts to split one selected route into two routes (N σ = 1).The combine attempts to combine two routes into one (N σ = 2).When selecting routes to split, the routes with long travel times are likely to be selected.By contrast, the routes with short travel times are likely to be selected in the combine.These two neighborhoods are, respectively, illustrated in Figures 5 and 6.

Shaking Procedure
The shaking procedure is shown in Algorithm 2. Each time a neighborhood structure is sequentially selected from the neighborhood set, the shaking procedure starts by selecting the involved routes.The involved routes are chosen from the available route set L that includes all the routes in the current solution.Two methods are used for route selection.The first method is random, which means that all the routes in the available set have the same probability to be selected.This random method is used in the depot change neighborhood.The second method allows bias on routes, which refers to the adaptive shaking proposed in [46].When selecting routes to combine, the routes with short travel times are likely to be selected.By contrast, the routes with long travel times are likely to be selected in the cyclic exchange and split as it can more effectively explore the diversity of solutions.The remaining routes that are close to the recently selected route are likely to be chosen.The closeness of two routes is measured by the sum of the distance between all the vertices of the two routes.

Shaking Procedure
The shaking procedure is shown in Algorithm 2. Each time a neighborhood structure is sequentially selected from the neighborhood set, the shaking procedure starts by selecting the involved routes.The involved routes are chosen from the available route set L that includes all the routes in the current solution.Two methods are used for route selection.The first method is random, which means that all the routes in the available set have the same probability to be selected.This random method is used in the depot change neighborhood.The second method allows bias on routes, which refers to the adaptive shaking proposed in [46].When selecting routes to combine, the routes with short travel times are likely to be selected.By contrast, the routes with long travel times are likely to be selected in the cyclic exchange and split as it can more effectively explore the diversity of solutions.The remaining routes that are close to the recently selected route are likely to be chosen.The closeness of two routes is measured by the sum of the distance between all the vertices of the two routes.
With the selected routes, the neighbor is constructed based on the predefined neighborhood structures.In the series of cyclic exchange neighborhoods, the start station of each partial route should be selected.Three rules are adopted: (i) random, (ii) distance to the subsequent route, and (iii) distance to the neighboring vertices.For the second rule, the vertices with short distances to all vertices of the subsequent route are likely to be selected as the start of the partial route.For the third rule, the vertices that are far from their previous and subsequent vertices are likely to be selected as the start of the partial route.After the neighborhood operators, a new solution x , which is the kth neighbor of the current solution x, is generated.

Local Search
The solution x obtained from shaking is the input of the local search procedure to derive the best local solution x .Two local search operators are used in this procedure: swap and 2-opt.The two local search operators are applied in a multi-level pattern.If x is better than x , then x = x and the local search returns to the swap operator and restarts.The local search stops when both operators fail to improve solution x .The two local search operators are only applied to the routes which are changed in the shaking procedure to save CPU time.
The swap operator exchanges the positions of a pair of vertices from two different routes to find better solutions, which is an inter-route optimization.The 2-opt operator, which was proposed by [52], is an effective improvement operator that reverses the direction of a partial route between two vertices.This operator works on all possible partial routes between two non-adjacent vertices within one route to observe if any improvement occurs, which is an intra-route optimization.

DP Improvement
VNS may spend a long time to eliminate infeasibility because the fitness of a solution is evaluated by the sum of the objective function and penalties on constraint violations.Furthermore, a local search may easily lead to being stuck in the local optimum.To overcome these drawbacks, we introduce a DP improvement into the VNS framework.The restricted DP algorithm is used as an improvement operator to reduce the travel time for every route in a solution.A solution to the problem is composed of routes that can be regarded as solutions for several TSPs with several additional constraints.For a specific route σ, the set V σ includes all the stations and the depots in this route.The DP improvement is used to solve the TSP with all stations and depots in V σ .The rebalancing and collection requests should be fulfilled, and the route may start and end at different depots.
In the algorithm, a state (ψ, j) defines a path that starts at the depot, visits every station in the subset ψ once, and ends at station j (ψ ⊆ V σ and j ∈ ψ).C(ψ, j) records the minimum total working time for the paths of state (ψ, j).l 1 (ψ) is the number of usable bikes in the vehicle after visiting all the stations in ψ, l 1 (ψ) = ∑ i∈ψ q i .l 2 (ψ) is the number of broken bikes in the vehicle after visiting all the stations in ψ, l 2 (ψ) = ∑ i∈ψ r i .We use 0 to denote the depot in V σ .The initial state is (0, 0), which indicates that the vehicle departs from the depot 0. In the initial state, C({0}, 0) = 0, l 1 ({0}) = q 0 , and l 2 ({0}) = 0. q 0 is the initial load of the vehicle, i.e., q 0 = max{0, − ∑ i∈V σ q i }.
In the first stage, the initial state (0, 0) is extended by adding a new station into subset ψ.For example, when station j is added to subset ψ, if q 0 + q j < 0 or q 0 + q j + r j > Q, then the addition is cancelled because of infeasibility.Otherwise, the new state ({0, j}, j) is generated with C({0, j}, j) = c 0j .The number of usable bikes in the vehicle is l 1 ({0, j}) = q 0 + q j , and the number of broken bikes in the vehicle is l 2 ({0, j}) = r j .
In the mth stage, the state (ψ\{j}, i) is transformed into (ψ, j), |ψ| = m + 1. C(ψ, j) can be computed by the following recursive equation: Feasibility is checked for the transition between states (ψ\{j}, i) and (ψ, j).For ex- ample, when station j, which is a pickup station (q j ≥ 0), is added, the sum of usable and broken bikes on the vehicle should not exceed capacity Q after visiting station j.If station j is a delivery station q j ≤ 0 , the number of usable bikes on the vehicle before visiting station j should exceed the amount of delivery demand.
All the stations should be visited in the (|V σ | + 1)th stage.As vehicles are allowed to return to any depot, all the depots are considered to be expanded at the last stage.The optimal route is the route with minimum working time, which is expressed as follows: Notably, one state typically represents more than one route.For example, the state ({0,1,2,3},3) has two routes, namely, routes [0-1-2-3] and [0-2-1-3], which are transformed from states ({0,1,2},2) and ({0,1,2},1), respectively.In this scenario, the route with minimum working time will be saved to the subsequent stage, which can save considerable computing space and ensure that the optimal route can be obtained at the end of the algorithm.
The number of states generated during state extension increases exponentially with the number of stations involved in the problem, which dramatically increases the computation time of the DP algorithm.In the computation, several states may have no feasible extension that has to be removed from the state set.If these infeasible states can be identified in advance, then the number of states during the computational will be effectively decreased.We propose two extension rules to identify infeasible states in advance.
Rule 1: For states with the given subset ψ, if at least one unvisited station i, i ∈ V σ \ψ, that satisfies l 2 (ψ) + max(q i , 0) + r i > Q, then these states should be removed.This rule can be used to identify the states with which vehicle will be eventually overloaded.
Rule 2: For states with given subset ψ, if at least one unvisited station i, i ∈ V σ \ψ, that satisfies Q − l 2 (ψ) + min(q i , 0) < 0 exists, then these states should be removed.This rule can be used to identify the states that the vehicle will be eventually underloaded.
Although the DP algorithm can find the optimal solution, its computational time increases exponentially with an increase in the number of stations.In this study, we use the restricted DP that restricts the state space via two parameters, H and E, according to [53].H restricts the maximum number of states saved at the end of each stage.The maximum H states with the lowest travelling cost are taken to be expanded in the subsequent stage.For each expansion, a state is not expanded to all feasible states but to partial feasible states.The maximum E best states are select to expand.The DP improvement is applied per γ iterations to all routes of the current solution to improve feasibility and quality.

Assignment Heuristic
The solution obtained by the proposed VNS-DP is a set of routes fulfilling all the rebalancing demand and collection requests with minimum total working time and fixed cost.As we stated in Section 3.3, if the vehicles are restricted to perform at most one route, the system may need an oversized fleet of vehicles to satisfy all the demand.The number of vehicles needed can be decreased by allowing multiple use of vehicles.In this section, a greedy heuristic is proposed to assign routes to vehicles with the objective of minimizing fleet size.
The heuristic works as follows: the initial feasible solution is the best solution obtained by VNS-DP.We remove one vehicle and use the assignment procedure to find next solution.The assignment procedure can be solved by the bin packing problem heuristic [54].Given a set of routes, the longest route that was not assigned is selected and assigned to the emptiest vehicle.When adding one route to a vehicle, the current last depot may be different with the starting depot of the selected route.We retain the one with shorter total duration after adding.This assignment procedure repeats until all the routes are assigned.If the new solution is feasible, one more vehicle is removed and the process is reiterated.The algorithm stops when the new solution build by assignment procedure is infeasible.The output of this heuristic is the last feasible solution encountered in which all the demand and requests are fulfilled.This algorithm is similar to the two-stage algorithms proposed by [23] to minimize the number of vehicles used, but they used different heuristics to build a feasible solution.

Computational Complexity of VNS-DP
The proposed VNS-DP comprises five parts: initial solution generation, shaking, local search, DP improvement, and assignment heuristic.Figure 7 briefly illustrates the relationship of these procedures.We first analyze the computational complexity of each part and then present the whole computational complexity of VNS-DP.
Mathematics 2022, 10, 4583 16 of 29 two-stage algorithms proposed by [23] to minimize the number of vehicles used, but they used different heuristics to build a feasible solution.

Computational Complexity of VNS-DP
The proposed VNS-DP comprises five parts: initial solution generation, shaking, local search, DP improvement, and assignment heuristic.Figure 7 briefly illustrates the relationship of these procedures.We first analyze the computational complexity of each part and then present the whole computational complexity of VNS-DP.Initial solution generation follows an iteratively inserting procedure, and hence its computational complexity is (), where  is the number of stations of the problem instance.
Shaking provides a perturbation on the selected routes by the neighborhood structures, where the computational complexity is ( ).
The swap and 2-opt operators in local search check all possible operations within the selected routes to observe if any improvement occurs.Therefore, the computational complexity of local search is ( 2 ), where  is the maximum number of stations in a feasible route.
Dynamic programming is usually time-consuming for the reason of state explosions.Here, we use  to restrict the maximum number of states saved at the end of each stage.Therefore, the computational complexity of applying DP to one route is (  ).As DP improvement is applied per  iterations to all routes of the current solution, the computational complexity is ( 1    ), where  is the number of routes of the incumbent solution.
Assignment heuristic is also an iterative procedure, and hence its computational complexity is ().Initial solution generation follows an iteratively inserting procedure, and hence its computational complexity is O(N ), where N is the number of stations of the problem instance.
Shaking provides a perturbation on the selected routes by the neighborhood structures, where the computational complexity is O(1).
The swap and 2-opt operators in local search check all possible operations within the selected routes to observe if any improvement occurs.Therefore, the computational complexity of local search is O P 2 , where P is the maximum number of stations in a feasible route.
Dynamic programming is usually time-consuming for the reason of state explosions.Here, we use H to restrict the maximum number of states saved at the end of each stage.Therefore, the computational complexity of applying DP to one route is O(H P ).As DP improvement is applied per γ iterations to all routes of the current solution, the computational complexity is O( 1 γ MH P ), where M is the number of routes of the incumbent solution.Assignment heuristic is also an iterative procedure, and hence its computational complexity is O(M).
The initial solution generation and assignment heuristic are executed only once, while the other procedures will be iteratively conducted N max times, and therefore the computational complexity of the proposed VNS-DP is O(N max (P 2 + 1 γ MH P )).Due to the difficulty in comparing the values of P 2 and 1 γ MH P on different instance scales, they are all kept in the formula.

Numerical Experiment
This section provides the computational results of the proposed model and algorithm.The benchmark instances and randomly generated instances are used to evaluate the quality of the proposed algorithm.VNS-DP was run 10 times for each instance.We presented the minimum, average, and standard deviation of objective functions value found by VNS-DP and compared them with the best lower/upper bound obtained by CPLEX 12.10 over the mathematical model proposed in Section 3.2.The algorithm was coded in the MATLAB software environment (Version R2019a).All experiments were performed on an Intel Core i7-10700 CPU with 2.90 GHz and 8 GB RAM.
The benchmark instances by [3] include 65 real-world instances, ranging from 13 to 116 vertices.The maximum route duration T and the maximum number of vehicles |K| refer to [28], who adapted the 65 real-world instances to the BRP with multiple vehicles and visits.We adapt the 65 real-world benchmark instances to our problem by randomly transforming several vertices into depots and generating collection requests r i between [0 and 3].We add ∑ i∈V\{0} r i /Q to the value of |K| for each instance as the consideration for new broken bike collection requests.The fixed cost of using a vehicle is set as 1000.
The random instances are generated as follows: The sizes of the instances are characterized by the number of vertices, which varies from 10 to 300.Depots are scattered randomly in a square with a side length of 20 km, and stations are randomly distributed around the depots.The working time t ij is the sum of the travel time on the arc (i, j) and the loading/unloading time spent in the station j.The travel speed is set to 40 km/h and the loading/unloading time for a bike is set to 1 min.The rebalancing demand q i is randomly generated between [−Q and Q], and the collection request r i is randomly generated between [0 and 3].The vehicle capacity Q is set as [10,12,15].The maximum working duration T is set as 2 h.The maximum number of vehicles is set by running the initial generation heuristic proposed in Section 4.2.The fixed cost of using a vehicle is set as 100.Some characteristics of the random instances are reported in Table 2.The parameters associated with the proposed method are presented in Table 3.For N max , H and E, it is evident that the larger they are, the better the solution will be.Concerning the expense, the long computation time will be costly.Therefore, the three parameters are set based on the trade-off between solution quality and computation time.In the subsequent experiments, for small-size instances (|V| < 20), N max is set as 2000.For medium-size instances (20 ≤ |V| < 50), N max is set as 4000.For large-size instances (|V| ≥ 50), N max is set as 6000.We set H = 5000 and E = 5 to obtain solutions within a reasonable computation time.The values for the other four parameters in the remainder of this section are set as follows: i T = 800, α = 0.95, β = 700, and γ = 500.Our initial experiments showed that these settings have a good performance.Interval for using DP improvement

Results for Benchmark Instances
Tables 4-6 present the detailed results for the real-world benchmark instances derived by [3].The results of the VNS-DP are tested against the results obtained using CPLEX at a 2 h computation time limit with the default setting.The tables list the characteristics of the instances, the results of using CPLEX, and the results of running VNS-DP 10 times for each instance.We depict the instances by the number of vertices |V|, the number of depots V D and the vehicle capacity Q.Table 4 presents the results of small-size benchmark instances.For CPLEX, the optimal solution value C opt , the CPU time (t) in seconds, and the number of vehicles required in the solutions (n veh ) are reported.For VNS-DP, we report the minimal objective values among the 10 trials (C min ), the average objective values obtained by running the algorithm 10 times C avg , the standard deviation of objective values in percentage (%dev), the average computation time (t) in seconds, the average number of required vehicles (n veh ), and the number of trials that obtain feasible solutions n f eas .To compare the solution quality, the percentage gaps %gap min and %gap avg %gap min = C min − C opt /C opt * 100 , %gap avg = C avg − C opt /C opt * 100) are reported.Thirteen of fifteen optimal solutions are found and all of them are quickly identified by VNS-DP.The gaps of minimal and average solution values generated by VNS-DP, with respect to the optimal values, are 0.1% and 1.8%, respectively.The average standard deviation of objective value obtained by VNS-DP in 10 runs is 1.6%.The computational time of the proposed algorithm has no obvious change, while that of CPLEX changed greatly just in these small-size instances.The average numbers of vehicles needed in solutions are almost equal for the two methods.VNS-DP finds feasible solutions in all trials for this set of instances.
The computational results for medium-size benchmark instances are reported in Table 5.The best lower and upper bounds (LB and UB) found by CPLEX are reported, as most instances cannot be optimally solved within a 2 h time limit.For the instances that are optimally solved by CPLEX (t < 7200 s), UB is the optimal solution value.Note that VNS-DP can also find the optimal solutions for some of these instances.For other instances (t > 7200 s), VNS-DP can improve the best solutions obtained by CPLEX in most instances, which is shown by negative values of %gap UB .Especially for larger instances, for example '42Dublin', VNS-DP improves the best solution of CPLEX by 65.1% in the best case, and by 64.5% in the average case.The number of vehicles needed for '42Dublin' is 6 in the solution by CPLEX, but decreased to 3.7 on average by VNS-DP.For these instances, on average 9.7 out of 10 trials find feasible solutions.We can also observe VNS-DP shows great advantage in computational time.
The results of large-size benchmark instances are given in Table 6.For CPLEX, we just report the lower bounds obtained, as it cannot find feasible solutions for all the large-size instances with a 2 h time limit.For VNS-DP, columns depicted are the same as Table 4.One can notice that VNS-DP finds feasible solutions for all large-size instances in 108.3 s on average.We compare the results generated by VNS-DP with the lower bounds obtained by CPLEX.The minimal solution value is 31.2%above the lower bound.The gap between the average solution value and the lower bound is 37.4%.On average, 10.2 vehicles are required in solutions to this set of instances.For these large-size instances, on average 9.6 out of 10 trials find feasible solutions.

Evaluation of the Proposed VNS-DP
In this subsection, we present the computational results for the proposed VNS-DP for the randomly generated instances, which are compared with computational results for CPLEX and VNS (without the DP operator).The comparison experiment is designed to evaluate the performance of our proposed method VNS-DP and DP operator.In addition, the performance of the proposed VNS-DP for large-size random instances up to 300 stations is presented.
Table 7 displays the comparison results.In summary, we give the overall average deviation (in %) for all the solutions found by VNS and VNS-DP to their minimum solution values, number of best solutions found by each method (# best solutions), and the average performance of the two meta-heuristics.For each instance, we bolded the best objective value among the three results.The best is not necessarily optimal.When the three results are identical at the same time, all of them are bolded the best.We can observe that the first nine small-size instances (|V| ≤ 15) are optimally solved by all three methods, as the objective values obtained by CPLEX, VNS, and VNS-DP are the same as LB.When instance scale increases from |V| = 18 to |V| = 30, the CPLEX has a slight advantage in solution quality, but with a dramatic rise in computational time.However, both VNS and VNS-DP can solve the instance within a few seconds.For instances of |V| ≥ 40, CPLEX cannot provide any feasible solutions in 2 h, but VNS and VNS-DP can still get feasible solutions within a reasonable time.We can see that VNS-DP performs better than VNS, as it obtained more best solutions.In summary, the proposed VNS-DP finds 26 best solutions out of 36 instances, while CPLEX and VNS obtain 16 and 15 best solutions, respectively.This indicates that incorporating the DP operator can improve the VNS to yield better solutions, especially for larger instances.In addition, the number of feasible solutions found in 10 trials increases from 9.6 to 9.7 after using the DP operator, which shows VNS-DP has a higher capability to search for feasible solutions.Furthermore, the average deviation of the solutions generated by VNS-DP is smaller than that of VNS.In terms of computational time, VNS-DP consumes more than VNS as the addition of the DP operator.The average computational time of VNS-DP is 17.3 s, which is still acceptable.

Impact of Broken Bike Collection and Distribution of Depots
In this subsection, we conduct experiments to evaluate the impact of the broken bike collection and distribution of depots.In scenario 1, we compare the results of a set of random instances with different amount of collection requests (Pr = {0%, 50%, 100%}).Let Pr be the percentage of stations that have collection requests.A set of 27 instances with |V| = 30 and V D = 2, comprising of three subsets (Q = {10, 12, 15}), are used to test the impact of Pr. Figure 8 shows the impact of Pr on the objective value, the computational time, the number of required vehicles, and the number of feasible trials.According to computational results, the number of stations |V| shows a decisive influence on computational time.We observe that CPLEX can solve the instances to optimality when |V| is less than 40 (both for benchmark instances and random instances).As for the proposed VNS-DP, it can solve the instances of |V| = 300 within 100 s.

Impact of Broken Bike Collection and Distribution of Depots
In this subsection, we conduct experiments to evaluate the impact of the broken bike collection and distribution of depots.In scenario 1, we compare the results of a set of random instances with different amount of collection requests ( = {0%, 0%, 00%}).Let  be the percentage of stations that have collection requests.A set of 27 instances with || = 30 and |  | = , comprising of three subsets ( = { 0, , }), are used to test the impact of Pr. Figure 8 shows the impact of  on the objective value, the computational time, the number of required vehicles, and the number of feasible trials.
We can verify that the objective values and number of required vehicles tend to increase as the amount of collection requests increases for all the three sets of instances.This indicates that there is an increase in transportation costs when adding the broken bike collection work.However, the average computational time has an obvious decline when the amount of collection requests moves up.The collection of broken bikes leads to an increase in the cumulative number of bikes on vehicles in routes, as the broken bikes cannot be unloaded after collection.This leads to the decrease in the number of feasible stations that can be inserted into a route, which reduces the solution space for DP and other local search operators, then reduces the computational time of VNS-DP.We can verify that the objective values and number of required vehicles tend to increase as the amount of collection requests increases for all the three sets of instances.This indicates that there is an increase in transportation costs when adding the broken bike collection work.However, the average computational time has an obvious decline when the amount of collection requests moves up.The collection of broken bikes leads to an increase in the cumulative number of bikes on vehicles in routes, as the broken bikes cannot be unloaded after collection.This leads to the decrease in the number of feasible stations that can be inserted into a route, which reduces the solution space for DP and other local search operators, then reduces the computational time of VNS-DP.
To further evaluate the benefit of adding collection requests to the optimization, we conduct experiments for considering fulfilling the rebalancing and collection requests separately, and compare the results with the proposed model MSBRCP.In this computational experiment, the amount of collection requests varies from 10% to 90% for an instance with |V| = 30 and V D = 2.The comparison results are presented in Table 9.The "two-stage method" represents optimizing the rebalancing and collection separately.C 1 is the objective value when considering only rebalancing demand in the optimization, and C 2 is the objective value when considering only collection requests.C 1+2 is the sum of C 1 and C 2 , which represents the final objective value for the two-stage method; n veh of the two-stage method is also the sum of the number of vehicles used in the two operations.These figures are compared with objective values and numbers of vehicles used in the proposed model MSBRCP.We can observe that both objective values and the number of vehicles used are improved if we consider rebalancing and collection operations simultaneously instead of planning these two operations separately.The average improvements are 29.2% and 30.9%.The improvement becomes more significant as the amount of collection requests increases.For example, when Pr = 10%, the objective value and number of vehicles are reduced by 8.6% and 20%, respectively.When Pr = 90%, the objective value and number of vehicles are reduced by 37.0% and 37.5%, respectively.Although adding a collection request to rebalancing would increase the objective value, this objective value is still much less than the sum of objective values of completing rebalancing and collection operations separately.In scenario 2, we compare the results of random instances with a different distribution of depots.Central-based distribution and edge-based distribution are considered, which are shown in Figure 9. Central-based distribution represents that depots are located in the centre of some stations to serve them better.All the random instances in this paper are generated in this way.Depots are scattered randomly in a square and stations are randomly distributed around the depots.Edge-based distribution is the situation in which depots are located in the suburb of a city to save the cost.All the depots are distributed at the edge of the BSS.The experiments are conducted on a set of instances with 100 vertices.We compare the objective value, the computational time, the number of required vehicles, and the number of feasible trials of two types of distribution, which are presented in Table 10.The results indicate that the central-based distribution of depots has better objective values than edge-based distribution for all the instances.As the depots are located near stations, the total travel time for rebalancing and collection would be shorter for central-based situation.We note that edge-based distribution has a slight advantage in terms of number of vehicles required (n veh ) for most instances.The reason may be that vehicles would like to serve more stations once they leave depots in an edge-based situation, as returning back to depots may take longer travel time than a central-based situation.

Comparison Results of Three Strategies
We compare the results of three scenarios in this section.For small instances, we compare three strategies: (1) vehicles can start from and return to different depots; (2) vehicles should start from and return to the same depot; and (3) allowing multiple uses of vehicles.The results are reported in Table 11.We can observe that allowing vehicles to return to different depots can obtain better objective values than asking them to return to the same depots for many instances.The number of vehicles used in the first two strategies is always the same, except for instance "15 Treviso".When vehicles are allowed to use multiple times, the number of vehicles used on average is 2.1, which is smaller than that number for the other two strategies (2.3).Although the benefit of reducing needed vehicles is not very significant for these small-sized instances, we can still observe it in instances "3Bari" and "15Treviso".
Table 12 presents the results of the assignment heuristic.We use the assignment heuristic to solutions obtained by VNS-DP to minimize the number of vehicles for large-size instances (|| = 0 to 300).The results of using the assignment heuristic are

Comparison Results of Three Strategies
We compare the results of three scenarios in this section.For small instances, we compare three strategies: (1) vehicles can start from and return to different depots; (2) vehicles should start from and return to the same depot; and (3) allowing multiple uses of vehicles.The results are reported in Table 11.We can observe that allowing vehicles to return to different depots can obtain better objective values than asking them to return to the same depots for many instances.The number of vehicles used in the first two strategies is always the same, except for instance "15 Treviso".When vehicles are allowed to use multiple times, the number of vehicles used on average is 2.1, which is smaller than that number for the other two strategies (2.3).Although the benefit of reducing needed vehicles is not very significant for these small-sized instances, we can still observe it in instances "3Bari" and "15Treviso".Table 12 presents the results of the assignment heuristic.We use the assignment heuristic to solutions obtained by VNS-DP to minimize the number of vehicles for largesize instances (|V| = 150 to 300).The results of using the assignment heuristic are compared with solutions obtained by VNS-DP.Total time is the average total working time of solutions.We can see that there is a significant reduction in the number of vehicles used.For example, for the first instance, the average number of vehicles required is 26.0 before using a heuristic.This value reduces to 6.6 after using this heuristic, and this induces a 9.8% increase in total working time.The results indicate that the number of required vehicles has a reduction of 65.5% on average compared with the single use of vehicles for large-size instances, and the total working time has an increase of 7.4% on average.

Conclusions
This study presents a multi-depot static bike rebalancing and collection problem, considers multiple depots and broken bike collection.We classify the problem as a two-commodity vehicle routing problem with pickup and delivery and formulate it as an integer programming model.A hybrid solution method, i.e., VNS-DP, which incorporates the restricted DP into the framework of VNS, is developed to solve the problem.The algorithm is equipped with a heuristic for constructing initial solutions, a shaking procedure for exploring different neighborhoods, traditional local search operators, and DP improvement to improve solution quality.The effectiveness of the proposed method is proven by computational experiments on benchmark instances and randomly generated instances.These instances vary in size from 10 to 300 vertices and 2 to 6 depots.The results show that the VNS-DP finds 26 best solutions out of 36 instances, while CPLEX obtains 16 best solutions.We also test the impact of broken bike collection and distribution of depots.The results show that considering broken bike collection increases the transportation cost and reduces the computational time compared with traditional SBRP.More depots provide more choices for vehicles to find better departures and destinations, which leads to a reduction in transportation costs.Comparison of different practical strategies indicates that the number of vehicles can be reduced by 69.4% when allowing multiple visits to depots.Allowing vehicles to return to different depots can help reduce the total working time.
In future work, we should extend the current problem and solution method to consider more flexible situations, such as multiple types of vehicles, multiple visits to stations and multiple service levels.Furthermore, BSS without designated docking stations rapidly increased in recent days and differ from the system investigated in this research.New BRPs should be built to help improve the service quality of such new systems.Incorporating determination of ideal inventory for each station is also a good future direction.

Figure 2 .
Figure 2. Optimal solution to the MSBRCP instance (#3 of benchmark instances).sents a depot, a circle represents a station.Numbers beside each circle represent demand and collection request of the station indicated inside the circle.

.
Figure 2. Optimal solution to the MSBRCP instance (#3 of benchmark instances).A square represents a depot, a circle represents a station.Numbers beside each circle represent the rebalancing demand and collection request of the station indicated inside the circle.

Figure 4 .
Figure 4. Illustration example of depot change.

Figure 5 .
Figure 5. Illustration example of route split.

Figure 4 .
Figure 4. Illustration example of depot change.

Figure 4 .
Figure 4. Illustration example of depot change.

Figure 6 .
Figure 6.Illustration example of route combine.

Figure 7 .
Figure 7.The flow chart of the proposed VNS-DP.

Figure 7 .
Figure 7.The flow chart of the proposed VNS-DP.

Mathematics 2022 ,
10, 4583 25 of 29advantage in terms of number of vehicles required ( ℎ ) for most instances.The reason may be that vehicles would like to serve more stations once they leave depots in an edge-based situation, as returning back to depots may take longer travel time than a central-based situation.

Figure 9 .
Figure 9. Central-based distribution and edge-based distribution of depots.

Figure 9 .
Figure 9. Central-based distribution and edge-based distribution of depots.

Table 1 .
Related publications on static bike rebalancing problem.

Table 2 .
Some characteristics of random instances.

Table 3 .
Parameters used in the algorithm.

Table 4 .
Results for small-size benchmark instances.

Table 5 .
Results for medium-size benchmark instances.

Table 6 .
Results for large-size benchmark instances.

Table 7 .
Results for VNS-DP and VNS for random instances.

Table 8 .
Results for VNS-DP for large-size random instances.

Table 10 .
Comparison results of central-based distribution and edge-based distribution of depots.

Table 10 .
Comparison results of central-based distribution and edge-based distribution of depots.

Table 11 .
Comparison results for three scenarios.: vehicles can start from and return to different depots; 2 : vehicles should start from and return to same depot; 1 3: allowing multiple uses of vehicles.

Table 12 .
Performance of assignment heuristic.