A Simheuristic Algorithm for Solving the Stochastic Omnichannel Vehicle Routing Problem with Pick-up and Delivery

: Advances in information and communication technologies have made possible the emergence of new shopping channels. The so-called ‘omnichannel’ retailing mode allows customers to shop for products online and receive them at home. This paper focuses on the omnichannel delivery concept for the retailing industry, which addresses the replenishment of a set of retail stores and the direct shipment of the products to customers within an integrated vehicle routing formulation. Due to its NP-Hardness , a constructive heuristic, which is extended into a biased-randomized heuristic and which is embedded into a multi-start procedure, is introduced for solving the large-sized instances of the problem. Next, the problem is enriched by considering a more realistic scenario in which travel times are modeled as random variables. For dealing with the stochastic version of the problem, a simheuristic algorithm is proposed. A series of computational experiments contribute to illustrate how our simheuristic can provide reliable and low-cost solutions under uncertain conditions.


Introduction
Today, people are changing their shopping behavior. Recent advances in information and communication technologies have introduced new shopping channels and models, which make possible the expansion of e-commerce and, consequently, the emergence of new challenges in operational research, transportation, and logistics areas. Specifically, regarding modern e-commerce business models, new decision variables and constraints have been incorporated in them, leading to emerging variants of distribution problems in supply chain management.
Unlike brick-and-mortar stores, where salespeople are available to support and help customers to make their purchases, the popularization of mobile devices with access to the Internet has promoted the use of different shopping channels. The online channel is an example that has emerged as a competitive marketing channel to that of traditional retail centers, transforming e-commerce into a global trend and an important tool for every business worldwide [1]. With e-commerce, customers are immersed in an environment of a plethora of information, opinions, and access to a vast combined supply of stock, which together allows them to browse through different stores in an online environment. According to some experts, the online shopping channels were predicted to kill off the physical ones. However, they co-exist and have completely transformed the way customers shop nowadays [2]. The use of a variety of shopping channels is referred to as 'omnichannel retailing,' where, instead of having only the single option of physically visiting a store to buy products, consumers can also buy them via online

Literature Review
With recent advances in information and communication technologies, different shopping channels have emerged which have attracted the attention of customers. Nowadays, customers use different channels to shop, giving them a multichannel shopping environment [14]. The transition from multichannel to omnichannel retailing has been discussed by Verhoef et al. [15] and Hübner et al. [16]. On the one hand, in multichannel retailing, the online and offline channels are treated as separate businesses. On the other hand, the use of both channels in an omnichannel environment is completely integrated, providing consumers with a seamless experience [17]. Therefore, the customer experience is different although multichannel and omnichannel retailing environments are often considered the same. Beck and Rygl [18] presented a complete categorization and definition of retailer channels: multi, cross, and omnichannel are clearly defined. According to Hübner et al. [3], there are several operations within the omnichannel distribution system which are responsible for its excellence, such as expanding delivery modes, increasing delivery speed, and service levels.
As mentioned, the OM-VRP combines the VRP and a PDP. The classical VRP, originally proposed by Dantzig and Ramser [19], has been extensively studied by practitioners and academics due to its wide applications in several areas. The VRP belongs to a set of NP-hard combinatorial optimization problems (COPs) [7]. Therefore, the use of exact algorithms is efficient only for solving small-sized VRP instances. Mostly, these exact approaches are based on the combination of column and cut generation algorithms [20,21]. On the other hand, approximate algorithms, such as metaheuristics, are frequently very efficient for solving large-sized instances of COPs. Several metaheuristics have been proposed to solve the VRP, which include tabu search (TS), genetic algorithms (GA), ant colony optimization (ACO), and some hybrid methodologies [22][23][24][25]. Pick-up and delivery problems have also been studied for more than 30 years. These problems incorporate some route order dependencies in which some nodes should be visited before others in order to transfer inventory between them. Similar to the VRP, the PDP is also an NP-hard problem [8], and some exact methodologies, based on branch-and-cut and branch-and-cut-and-price algorithms, have been developed to optimally solve small-sized PDP instances [26,27]. Moreover, several heuristics and metaheuristics algorithms have been developed to solve the PDP and some of its variants. We highlight the use of TS [28], GA [29], large neighborhood search heuristics (LNS) [30], adaptive LNS (ALNS) [31,32], particle swarm optimization (PSO) [33], and greedy clustering methods (GCM) [34]. A literature review and classification of PDPs is presented by Berbeglia et al. [35].
To the best of our knowledge, Abdulkader et al. [1] were the first authors to address the omnichannel VRP as a combination of the VRP and the PDP in an omnichannel retailing context. For solving this novel integrated problem, the authors proposed a two-phase heuristic, based on: (i) inserting consumers into retailers routes and on correcting infeasible solutions, and (ii) on joining the routes through the maximum-savings criterion, i.e., the Clarke & Write Savings heuristic (CWS) [36]. Apart from this two-phase heuristic, a multi-ant colony algorithm (MAC) was proposed. A complete set of instances have been generated to test their methodologies, and the MAC outperformed the heuristic's performance. Martins et al. [9] proposed a simple and deterministic heuristic for solving the OM-VRP. Although no best-known solutions were found, the heuristic showed to be promising since feasible solutions were provided in short computational time and no repair operations were needed. Recently, the same problem have been also solved by Bayliss et al. [37] and Martins et al. [38]. In contrast to Bayliss et al. [37], Martins et al. [38] have framed the problem in the humanitarian logistics field, providing an agile optimization solving methodology-which combines parallel computing with biased-randomized heuristics-which was proposed for solving it in real-time. Competitive results were found in milliseconds, enabling the agile optimization technique able to outperform other heuristics from the literature. On the other hand, Bayliss et al. [37] formulated the OM-VRP as a mixed-integer program (MIP) and proposed a two-phase local search with a discrete-event heuristic for solving the problem. In contrast to Abdulkader et al. [1]'s model, their formulation avoids the need to directly assign retailers to vehicles, since the pick-up and delivery constraints are modeled through routing variables. The first phase of the approach employs a discrete-event constructive heuristic, while the second phase aims to refine the most promising solutions obtained in stage one, by applying a sequence of local search neighborhoods. The authors have found new best-known solutions for the vast majority of problem instances, and improved lower bounds for a set of small instances were obtained.
Constructive heuristics are simple deterministic procedures that follow a logical sequence of decisions and which always generate the same solution when starting from the same point. Biased-randomized algorithms incorporate non-symmetric random sampling in order to diversify the behavior of a base constructive heuristic [39]. At each stage of the base constructive heuristic, a list of candidate decisions is considered. The list is sorted in decreasing order of the benefit concerning an objective function, and a candidate is then randomly selected. In this random selection process, the higher-ranked candidates receive a higher probability of being selected. Biased-randomized algorithms have been employed for solving different COPs in transportation [40][41][42], scheduling [43,44], and facility location problems [39,45].
Simheuristic algorithms combine the use of approximate methods, such as heuristics or metaheuristics, with simulation techniques, in order to cope with stochastic combinatorial optimization problems. They have proven to be an efficient approach to solve stochastic problems [46]. The extension of traditional metaheuristics into simheuristics is gaining popularity, and they have already been applied to solve a wide set of stochastic optimization problems, in different fields such as permutation flow-shop [47], facility location problems [48], inventory routing problems [48][49][50], telecommunication networks [51] and finances [52].
Other examples of simheuristics include Lam et al. [53], Lopes et al. [54], and Santos et al. [55]. Lam et al. [53] employs a simheuristic approach for evolving agent behavior in the exploration of novel combat tactics. They use a genetic algorithm to find the states, during a flight maneuver, at which an aircraft should transit into the next phase of the maneuver. Lopes et al. [54] tackles a stochastic assembly line problem using a simheuristic approach. Demands for different product types are stochastic and occur in real-time. The optimization task lies in assigning tasks to stations in such a way that task precedence constraints are respected and the expected throughput is maximized. Santos et al. [55] develops a simheuristic based decision support tool for an iron ore crusher circuit. They propose and validate a simulation model of throughput efficiency depending on the amount of equipment active in each phase of the crusher circuit. The simheuristic integrates the simulation model with an iterated local search (ILS) metaheuristic. Their solutions improve throughput and reduce energy consumption compared to the all active equipment solution.
Despite the fact that travel times are stochastic in most real-life transportation activities, in the past only deterministic versions of the OM-VRP have been considered. Hence, our work goes one step further by considering random travel times and proposing a simheuristic algorithm to solve the stochastic version of the OM-VRP.

Details on the Stochastic OM-VRP
Usually, retail stores must be supplied from the depot with a large number of products, which are packed and frequently measured in terms of the number of pallets. In turn, and in contrast to retail stores' demand, online customer orders are of negligible size but require processing at a physical retail store before their delivery. Since orders must be processed at retail stores before delivery, products ordered online cannot be shipped directly from the central warehouse to online customers.
In the OM-VRP, a single fleet of vehicles is employed to simultaneously perform three different operations: (i) bulk deliveries to retail stores from a main central depot; (ii) the pick-up of online customer orders from these retail stores; and finally, (iii) the deliver of online customers' orders. Different product types are available at each retail store. Likewise, each retail store has a limited inventory of processed products that can be delivered to online customers. Therefore, the vehicle responsible for a particular online order must first visit a retail store with that product in stock. The available processed inventory and bulk demand for each retail store are known in advance. It is assumed that bulk inventory does not contribute to a retailer's available inventory since bulk deliveries cannot be processed within the drop time of delivery. As a simplifying assumption, each online customer orders a single product. When more than one item is ordered, the consumer is replicated in the same location as a 'virtual' consumer, and each product is delivered separately. According to Abdulkader et al. [1], this strategy of considering single-item orders by consumers guarantees the solution feasibility and also minimizes the distribution cost. Figure 1 depicts how the processes of visiting retail centers and online customers are performed by the same fleet of vehicles. A route starts from the single and central depot, and the same cargo vehicle visits a set of retail centers and customers. Deliveries to customers cannot be directly performed from the depot. Hence, drivers must meet precedence constraints in order to pick-up customer orders from the retail centers before they can be delivered. Since retailers are supplied by the depot, these nodes are both delivery and pick-up points. On the other hand, customers require only the delivery of items from these retail centers, making them, delivery nodes. Apart from precedence and inventory constraints, the service of each route is limited by a maximum tour length and vehicle capacity limit. We can define the distribution network of the OM-VRP as directed graph G = (V, A). The set of vertices V = {0, 1, 2, . . . , r + c} is composed of a single depot (0), r retail centers, and c online customers. In other words, the problem is represented by a set R of retail stores, which are supplied by the depot 0, and a set C of online customers, which are posteriorly supplied by the retail centers. The set N = R ∪ C comprises both sets R and C, in which R = {1, 2, . . . , r} and C = {r + 1, r + 2, . . . , r + c}. The same fleet of capacitated vehicles, initially stationed at the central depot at the start time t 0 , is employed for serving both the retail centers and consumers. For the complete mathematical formulation of this problem, readers are referred to Abdulkader et al. [1]. In this work, we extend previous work on this topic by considering the case in which edge-traversal times are stochastic. We define the time to traverse edge (i, j) as T ij = t ij + D ij where t ij is the deterministic edge-traversal time (which represents the edge-traversal time under 'ideal' traffic conditions), D ij is a log-normally distributed delay term, and T ij denotes the distribution of edge-traversal times. The objective of this extended problem is to minimize the total travel cost of the vehicle routes such that: (i) every route starts and ends at the depot; (ii) the routes do not violate the maximum tour length; (iii) every node i ∈ N is visited by only one vehicle and only once; (iv) the total load of the vehicle does not exceed the vehicle capacity; (v) the total consumer demand to be fulfilled by a vehicle for a specific product does not exceed the inventory picked up by the vehicle; (vi) the retail store determined to satisfy the consumer's demand must be visited before the consumer and by the same vehicle; and (vii) the routes must have a reliability level greater than a user-set parameter, R min . The reliability level of a set of routes is defined as the probability that all routes are completed within the maximum time limit of T max .
Two alternative mathematical formulations of the deterministic OM-VRP can be found in Abdulkader et al. [1], Bayliss et al. [37]. The differences in this case are a stochastic objective function and probabilistic constraint regarding route completion reliability. Let x f kij be a binary decision variable indicating whether or not vehicle f in the fleet F traverses edge ij in the kth node visit in its route. The stochastic objective function is given in Equation (1), where D ij is the stochastic delay associated with traversing edge ij.
The probabilistic route completion reliability constraint is given in Equation (2).
Equation (2) also accounts for the drop times (τ) that are required when performing deliveries and pickups. In this work we employ a simheuristic approach for addressing the stochastic aspects of the OM-VRP. The main novelty of the OM-VRP, in comparison to other VRPs, is the precedence constraints regarding the collection of customer orders from retailers and subsequent delivery to customers. The vehicle capacity constraint regarding the retailer demands that can be satisfied are expressed by Equation (3), where OP j denotes the number of ordered product units of node j (which is zero for customer nodes) and H denotes the vehicle capacity.
The customer order precedence constraints are expressed by Equation (4), where OD jq denotes the number of items of type q ∈ Q ordered by node j (which is zero for retailer nodes) and P jq denotes the number of items picked up of type q ∈ Q at node j (which is zero for customer nodes).
(4) Table 1 provides the details for a small-sized instance, which is composed of one central warehouse, four retail stores, and nine online customers. For this instance, Table 1 provides: the geographic coordinates (X and Y) of each node, the demand required from the depot by each retail center i ∈ R, the demands for each customer j for each product type q (OD jq ), and the inventory of each available product P iq , q ∈ {1, 2, 3} at the retail centers. Since the products are negligible in terms of capacity, they do not affect the vehicle load capacity. Accordingly, Figure 2 presents the optimal solution for the provided example. The routes are performed with a total cost of 455.91 distance units, and a fleet of vehicles each with a capacity of 100 weight units is employed for performing the routes. In the first route, customers 8, 5, and 9 are served by retail center 1, while customers 12 and 13 are served by retail 3. The route requires 272.37 distance units, and the vehicle starts with 93 loaded demand units. Note that the retail centers are first visited by the vehicle in order to deliver the required demand from the depot and to pick-up the products ordered by the customers. Similarly, in the second route, customers 11 and 10 are supplied by retail center 4 while customers 6 and 7 are served by retail store 2. The route requires 183.53 distance units, starting with 79 loaded demand units.   Table 1.

Methodology
The multi-start framework belongs to a family of metaheuristics algorithms [56] which includes approaches such as Evolutionary Algorithms [57,58] and Swarm Intelligence Algorithms [59][60][61][62], among others. Metaheuristics were introduced in order to provide high-quality solutions using a reasonable amount of computation time and memory. Typically, they require a time-consuming parameter tuning process. Although, there are approaches that provide self-adaptive parameter control [58], our methodology has been devised to work with a reduced number of parameters, providing an excellent trade-off between simplicity, computational time, and solution quality.
In order to solve the stochastic OM-VRP, a simheuristic approach is proposed. Our methodology combines a biased-randomized savings-based heuristic with simulation techniques to deal with the stochastic aspects of the problem. Biased-randomized heuristics can be extended into simheuristics in a natural way. Starting from a savings-based heuristic, which is completely deterministic, this solution method is then extended into a biased-randomized multi-start algorithm. Also, a local search procedure is applied to search for locally optimal solutions. Finally, the multi-start framework is extended into a simheuristic approach in order to deal with stochastic travel times.

Savings-Based Heuristic
As introduced, the savings-based heuristic (LH) [9] is based on greedy decisions and designed to solve the deterministic OM-VRP. The LH is composed of the following three stages: 1. In the first stage, a dummy solution is created, where each route in this dummy solution serves one node i ∈ N, which can be either a consumer or a retail store. Routes depart from the depot, travel to the node, and then return to the depot. 2. The second stage of LH, named as CWS 1 , is presented in the Figure 3, where the dummy routes from stage 1 are merged using a maximum-savings criterion [36]. Each box from Figure 3 is numbered to aid the following explanations.
Initially, a savings list (SL) is constructed (step 1). This list considers all possible pairs of nodes-i.e., edges-from the problem. For each edge {i, j}, the corresponding savings value is calculated as s ij = t 0i + t j0 − t ij , where t ij represents the deterministic travel time between nodes i and j. That is, candidate solutions are generated using 'ideal' traffic conditions for the edge-traversal times.
Initially, all edges from SL are eligible. The list is sorted in descending order of the savings value (step 2), and the edge with the highest saving is selected (step 3). At this stage, the selection of edges is restricted to guarantee the assignment of a retail center to each consumer in the problem (step 5). In other words: a route containing one single customer can only be merged with a route containing a retail center, i.e., the selection is restricted to eligible edges {i, j}, where node j is a consumer in a dummy route and i is a node in a route with a retailer that can supply consumer j. According to Martins et al. [9], these attempts at only merging routes containing at least one retail center, which can supply a single consumer, are made first in order to avoid infeasible solutions-i.e., solutions in which some customers are not assigned to any retailer. This approach of addressing solution feasibility first is based on the observation that the availability of feasibility restoring merges will only decrease as the algorithm progresses.
Based on CWS route-merging conditions (step 6), the two corresponding routes, i and j, of an edge {i, j} (obtained in steps 4.1 and 4.2) can be merged only if: (i) nodes i and j are exterior in their respective routes (a node is exterior if it is adjacent to the depot); (ii) i and j belong to different routes; (iii) the maximum tour length is not violated; and (iv) the vehicle capacity is not violated.
The selected edge is deleted from SL (step 9) only if: (a) the corresponding merge is performed (step 7); or (b) at least one of the CWS constraints ((i)-(iv)) are violated (in step 6). Otherwise, the edge becomes temporarily ineligible (step 10), but it is not removed from the list since subsequent merges might restore eligibility. This can occur when a different retail center is merged into a route, increasing the available processed inventory for subsequent consumers. For example, when selecting an edge (i, j), the evolving route of i may have insufficient processed inventory for customer j at this time. However, if in subsequent iterations route i is merged with another route containing retailers, the evolving route of i may then be able to serve customer j.
When a merge is successfully performed (step 7), the entire SL becomes eligible (step 8), since a new inventory scenario is generated.
At the end of this stage, all the consumers are supplied by the retail centers, guaranteeing a feasible final solution. Notice that this is achieved without solving separate assignment and routing problems, as done in Abdulkader et al. [1].
3. Finally, the third stage (CWS 2 ) tries to improve the solution generated in the previous step.
To do that, the algorithm cycles through the SL list, which includes the remaining saving edges that were discarded in the previous step, with the aim of identifying more beneficial merges. Unlike the procedure used in the CWS 1 stage, in this phase all the customers are already assigned to a retail center, so step 5 of Figure 3 is not required. Hence, all edges are eligible. The process attempts all the available merging possibilities which may improve the solution. Each time a new edge is selected from the SL list, it is removed from the list, whether the corresponding merge is performed or not-due to it violating any of the constraints (i)-(iv). In each new iteration, the highest saving edge is selected to restart the merge process. This process is repeated until SL is empty. At the end of the procedure, a feasible solution is generated, without the necessity of repair operations.

Introducing a Local Search
Once an initial and feasible solution is constructed, a fast local search is applied to improve its quality. This local search mechanism is based on a 2-opt movement. Since this problem has some precedence particularities, the 2-opt movement, which consists of the reversal of sub-sequence of nodes, is restricted to movements that do not violate the precedence order between customers and their suppliers-hence, the feasibility of the solution is preserved.

Extending to a Biased-Randomized Algorithm
To modify the original greedy and deterministic behavior of our heuristic, the selection of candidates from the SL is randomized by introducing a skewed probability distribution into the selection process. For the biased-randomized component, we employ a geometric distribution, which is controlled by a single parameter, β ∈ (0, 1). β values closer to 1 increase the probability that the highest saving merges are selected. On the contrary, as β approaches 0, a near-uniform randomization is obtained. In this way, β controls the level at which the selection probabilities decrease along the sorted SL. Hence, unlike deterministic heuristics, which always generate the same solution when starting from the same initial solution, different decisions are taken at each selection iteration, consequently generating different solutions. Algorithm 1 represents the selection process of LH with biased-randomization, which replaces the greedy selection of edges from the SL. Considering all of the stages which have been introduced, Algorithm 2 represents the overall structure of our biased-randomized algorithm with local search (BRLH). This algorithm is then embedded into a multi-start procedure in order to obtain a variety of solutions (MS BRLH ).

Extending to a Simheuristic Approach
Deterministic travel times are widely assumed in transportation problems. However, when dealing with real-life problems, which are often fraught with uncertainty, travel times are usually stochastic in nature. As introduced in Section 3, the deterministic travel time employed to traverse edge (i, j), t ij , can be seen as the travel time required under ideal traffic conditions. In the proposed extension, the stochastic time to traverse (i, j) is computed as T ij = t ij + D ij , where D ij follows a logNormal(µ, σ) probability distribution, and represents a random delay resulting from uncertain conditions. Since the logNormal probability distribution can only take positive values, it follows that T ij > t ij , ∀(i, j) in the set of connecting edges. Algorithm 3 describes our simheuristic approach (Sim BRLH ). Initially, a solution is generated by our BRLH in line 1 (Algorithm 2), by employing the greedy approach (i.e., β ≈ 1). A short simulation is then performed on this initial solution (line 2), in order to estimate its average stochastic cost. This initial solution is set as the best-found stochastic solution cost (line 4). While the termination criterion is not met (line 6), different solutions are generated by BRLH (line 7). The deterministic cost of the initial solution is considered for guiding the search. Therefore, a solution is accepted for being submitted to the simulation module (line 10) only if its deterministic cost is smaller than the best-found deterministic solution cost plus m% of its value (line 9). This solution filtering approach reduces the amount of time spent on testing unpromising solutions in computationally expensive simulations. Moreover, by allowing the acceptance of moderately worse solutions, controlled by the parameter m, a better exploration of the solution space can be achieved [63]. At this stage, q short Monte Carlo simulation runs are used to test the accepted solution. Each simulation run replaces the deterministic travel times of a solution with randomly sampled ones-according to the assumed probability distribution. From this complete simulation process, the average stochastic cost of each solution is computed. Every time a new best stochastic cost is found (line 15), this solution is introduced into a pool of 'elite' solutions E (line 17). This process is repeated while the termination criterion is not met. On this reduced set of solutions, q long Monte Carlo simulation runs are performed (line 23) in order to generate more accurate results for solutions in stochastic environments. During the simulation process we also obtain an estimate of the reliability rate of a solution [64]. This estimate is computed as the rate at which all routes show completion times lower than the maximum allowed travel time. At the end, the set of elite solutions is sorted in descending order of their expected cost (line 25), and the best-found stochastic solution is provided to the manager.

Results and Discussion
To test the proposed methodologies, we have used the 60 problem instances introduced in Abdulkader et al. [1]. These instances differ in the number of retail centers (r ∈ {10, 15, 20, 25}), online customers (c ∈ {20, 50, 75, 100, 150}), and also in the inventory scenarios of the retail centers (tight, relaxed, and abundant). Therefore, for each inventory scenario, 20 different problem combinations are generated. The maximum tour length of the routes and the vehicle capacity are set to 8 h and 100 weight units, respectively. While the BRLH is guided by a single parameter, β, the simheuristic approach is also controlled by the maximum running time time max , the acceptance margin of worst solutions m, and the number, q short and q long , of simulation repeats in short and long simulation runs, respectively. The stochastic travel times for each edge are set by the log-normal distribution parameters, µ and σ. Table 2 summarizes the setup of the parameters employed during the computational experiments. For calibrating these parameters, we have used the methodology proposed in Calvet et al. [65], which is based on a general and automated statistical learning procedure. Regarding the maximum run time time max , the value was set depending on the size of the instance (c + r) × 0.342, which leads to a maximum execution time of 60 s in the case of the largest instance. This stopping criterion is employed in both the multi-start and simheuristic strategies.
The algorithms were coded in Java and all tests were performed on an Intel Core i7-8550U processor with 16 GB of RAM. Note that three different values have been considered for the σ parameter, which is used to modify the deterministic travel times. Since the maximum tour length is fixed independently of the size of the instances, small-sized instances are more likely to generate short routes. Therefore, a larger value for σ introduces more variability in the travel times, which increases route failure rates. On the other hand, large-sized instances are often composed of larger routes, then a small value for σ is introduced. In particular we have, σ = 2.5 for small instances (composed of 25 customers), σ = 1.55 for large instances (composed of 150 customers), and σ = 1.9 for the remaining medium-sized ones. This approach ensures that small, medium, and large instances each have similar levels of difficulty with respect to the risk of route failure.
The initial analysis aims to compare the solutions generated by our greedy heuristic (LH) and by our MS BRLH -in which β is (uniformly) randomly selected in the interval [0.45, 0.75]-with the solutions obtained by the two-phase heuristic (AH) and multi-ant colony metaheuristic (MAC) proposed by Abdulkader et al. [1]. Their methodologies were performed on four 2.1 GHz processors with 16-cores each and a total of 256 GB RAM. That is, we initially focus on comparing each algorithm in terms of deterministic travel cost. Tables 3-5 present the results obtained for tight, relaxed, and abundant inventory scenarios, respectively. For each problem instance (I), we present results for: the cost of the best-found solution obtained by the different methodologies; the average cost of our MS BRLH ; the CPU time (in seconds) required by each methodology; and their percentage gaps. The best results returned by the solution methodologies are highlighted in bold. Figures 4 and 5 present how both the gap and the cost of the solutions, i.e., the objective function (OF) value, behave according to the employed solution approach and inventory scenario, respectively.      By analyzing Tables 3-5, we can observe that our MS BRLH algorithm is able to improve previous results (from LH) by 9-12%, on average (column gap (2)-(1)). When comparing the MS BRLH with the AH heuristic (column gap (3)-(2)), our approach is able to reduce solution costs by up to 26% in short computational times (about 18 s on average). On the other hand, when comparing our results with those generated by the MAC approach (column gap (4)-(2)), our solutions are between 9% and 21% worse, on average. Particularly, in the abundant inventory scenario, MS BRLH 's results are only 9% worse than MAC. Notice, however, that the processing time required by MAC is substantially larger in all inventory scenarios. By analyzing Figure 4, it is evident that MS BRLH performs better in the abundant inventory scenario, being able to find one better solution and several others with a maximum gap of 8%. Moreover, we can observe a variability of around 10% in the gap between MS BRLH and MAC on average. This variability is reduced to around 8% for the relaxed and abundant scenarios. These results demonstrate the robustness of our solution approach for the deterministic case. When analyzing Figure 5, which presents the overall performance of each solution approach for each inventory scenario, we can observe that our multi-start strategy is more efficient than both LH and AH heuristics, by generating solutions with a lower cost. To complement these box-plots, an ANOVA test was run for each inventory scenario. The p-values associated with the tight, relaxed, and abundant inventory scenarios were, respectively, of 0.001, 0.000, and 0.000. Also, the Fisher's LSD test suggests significant differences in all tree scenarios between MAC and AH, between MS-BRLH and AH, as well as between MAC and LH. However, cost differences between MAC and MS-BRLH were not significant in any scenario, despite the fact that MAC employed a noticeably higher amount of computing time than MS-BRLH.
Next, in Figure 6, we present the convergence of three problem instances' solutions, including one from each different inventory scenario (instances b6, b26, and b46), by comparing the MS BRLH with the best-known solutions, in terms of gap. As introduced, these instances require approximately 15 s of processing time, given their magnitude. As we can see in Figure 6, the instances demonstrate the same convergence behavior as solution time increases. It is noticeable that the convergence rate is abrupt during the first few seconds. However, contrary to the tight and relaxed inventory scenarios, where the solutions continue improving over time, the search demonstrates more stable convergence during the remaining execution time in the abundant scenario. Being more efficient in the more flexible inventory case, our multi-start approach quickly achieves its best solutions.
Next, we want to compare the quality of the solutions generated for the deterministic scenario (MS BRLH ) against the solutions generated for the stochastic scenario (Sim BRLH ). Since the only difference between the deterministic and stochastic scenarios is that stochastic delays are added to edge traversal times, we can consider the deterministic cost of the best solutions generated by MS BRLH as a lower bound (LB) of the stochastic travel times of the best Sim BRLH solution. Moreover, since MS BRLH does not account for stochastic travel times, we can consider the stochastic travel time of the best MS BRLH solution as an upper bound (UB) for the stochastic travel time of the best Sim BRLH solutions. Table 6 provides both LBs and UBs values and the best-found stochastic travel times obtained by our Sim BRLH . The solutions reported in the Sim BRLH column are the best-found stochastic travel times. As we can see in Table 6, all the Sim BRLH solution costs are between the LB and the UB, as expected. For 24 problem instances, the solution returned by our simheuristic is better than the best deterministic solution when it is tested in the stochastic scenario (the UB column). From this, we can assert that our Sim BRLH is able to generate competitive results for the stochastic scenario. The reliability value is calculated by simulation for each solution and represents the probability that all routes are completed within maximum tour duration. For visualizing this trade-off between the deterministic cost of the solutions and their reliability rate, which are conflicting objectives, a Pareto frontier of non-dominated is presented. Accordingly, Figure 7a-c present the non-dominated solutions for three different instances (b17, b37, and b57), each one belonging to a different inventory scenario. A solution is non-dominated if, no other solution has a greater reliability and a lower or equal travel cost, or if no other solution has a lower travel cost and a greater or equal reliability level. The b17 solution was randomly chosen, while the b37 and b57 solutions are for the same problem but set in the two other inventory scenarios. The square orange dot represents the best deterministic solution found by our MS BRLH , while the remaining ones, round and blue, represent different solutions with a higher reliability rate, but with higher operating costs.
As we can see in Figure 7, despite being the solutions with the lowest cost, the best deterministic solutions (square dots) are the least reliable ones for stochastic scenarios. Particularly in Figure 7a, the best deterministic solution is approximately only 21% reliable under stochasticity. By selecting higher-cost solutions, the reliability rate reaches more than 70%. The same behavior is noticed in Figure 7b,c, however, the deterministic b57 solution is reasonably reliable. Usually, low-cost solutions are made up of a small number of large vehicle routes, in terms of travel distance and time. Therefore, when increasing the travel time variability in those scenarios, the risk of exceeding the time constraint is higher. On the other hand, higher cost solutions are built from a larger number of smaller routes, and smaller routes exhibit a lower risk of violating the maximum tour duration constraint in stochastic scenarios. In this way, decision-makers should consider that low-cost solutions under deterministic scenarios might not necessarily be the best option when stochasticity is taken into account.

Conclusions and Future Work
With the emergence of online retail channels and the popularization of mobile devices, new retailing modes have become popular. Some of these retailing practices allow customers to browse through different online stores and, then, to get the items bought directly delivered to their homes. Hence, new versions of the vehicle routing problem (VRP) considering additional decision variables and constraints have emerged. Omnichannel retailing leads to an integrated problem combining the VRP and the pick-up and delivery problem. In omnichannel distribution systems, a set of retail stores need to be replenished and, at the same time, products have to be sent from these stores to final customers. The resulting omnichannel VRP consists in two stages: (i) a group of retail stores that must be served from a distribution center; and (ii) a set of online consumers who must be served, by the same fleet of cargo vehicles, from these retail stores.
For solving the deterministic OM-VRP, a simple heuristic was initially introduced. This heuristic was then extended into a multi-start biased-randomized algorithm, which is tested against the state-of-the-art methodologies. Our biased-randomized algorithm performs reasonably well in a set of 60 instances of the deterministic OM-VRP, which allows us to extend it into a full simheuristic for solving the stochastic version of the problem. This is a more realistic version of the OM-VRP where travel times are modeled as random variables following a log-normal distribution. Our simheuristic approach is also capable of measuring the reliability of any proposed solution when it is employed in a stochastic scenario. To the best of our knowledge, this is the first time that such a stochastic variant of the problem has been solved in the literature. Regarding the simheuristic results, we conclude that the best deterministic solutions may perform badly when used in a stochastic scenario. Those solutions are often not reliable in terms of completing all routes within a time limit. On the other hand, our simheuristic approach was able to generate reliable and competitive results for these stochastic scenarios. Therefore, our methodology enables decision makers to choose the solution that better fits his or her utility function in terms of cost and reliability level.
Two-echelon distribution systems, as the one considered here, are typically characterized by the use of different fleets of vehicles at each distribution level. Apart from considering a single fleet of vehicles for serving both delivery levels, the products ordered by customers were assumed to not affect the vehicles' capacity. Therefore, future directions of research include the consideration of non-negligible sized online customer orders, heterogeneous vehicle fleets, and the incorporation of positive demands for multiple product types for each customer. Regarding the solution approach, different perturbation stages and local search operators could be tested in order to speed up the convergence process towards near-optimal solutions. This might be particularly relevant when considering large-sized instances.