Simulation-Optimization Approach for Multi-Period Facility Location Problems with Forecasted and Random Demands in a Last-Mile Logistics Application

: The introduction of automated parcel locker (APL) systems is one possible approach to improve urban logistics (UL) activities. Based on the city of Dortmund as case study, we propose a simulation-optimization approach integrating a system dynamics simulation model (SDSM) with a multi-period capacitated facility location problem (CFLP). We propose this integrated model as a decision support tool for future APL implementations as a last-mile distribution scheme. First, we built a causal-loop and stock-ﬂow diagram to show main components and interdependencies of the APL systems. Then, we formulated a multi-period CFLP model to determine the optimal number of APLs for each period. Finally, we used a Monte Carlo simulation to estimate the costs and reliability level with random demands. We evaluate three e-shopper rate scenarios with the SDSM, and then analyze ten detailed demand conﬁgurations based on the results for the middle-size scenario with our CFLP model. After 36 months, the number of APLs increases from 99 to 165 with the growing demand, and stabilizes in all conﬁgurations from month 24. A middle-demand conﬁguration, which has total costs of about 750,000 e , already locates a suitable number of APLs. If the budget is lower, our approach offers alternatives for decision-makers.


Introduction
Last-mile logistics (LML) is known as the least efficient and most complex part of the supply chain. LML activities have negative impacts on pollution and traffic congestion in urban areas [1]. The growth of e-commerce activities has increased the number of individual home deliveries, thus driving up LML flows. Improving the efficiency of LML in urban areas through research is an important driver for the success of e-commerce and helps to reduce the negative externalities associated with urban logistics (UL).
An automated parcel locker (APL) is a potential solution to LML challenges. In our current work, we analyze the use of APLs such as packstations or locker boxes as a promising alternative to improve UL activities [2]. An APL is a group of electronic lockers with variable opening codes. APLs can be used by different consumers whenever it is convenient for them. APLs are located near consumers' homes, workplaces, and train stations, where online shoppers deliver or ship packages. The costs of home delivery and the related risk of missed delivery are likely to be higher compared APL systems. Online shoppers are likely to use APLs more often in the future [3]. Third-party logistics providers such as DHL, InPost, Norway Post, UPS, or Amazon continue to invest in APLs to gain a competitive advantage [3]. account different estimates for the requirements. Third, the performance of the associated solutions in a stochastic environment is evaluated using a Monte Carlo simulation. Finally, the conclusions present possible future work and applications.

System Dynamics Modeling
The System Dynamics (SD) methodology was developed by Jay W. Forrester [18]. SD was originally introduced to facilitate the understanding of industrial processes. SD is used as a methodological approach to explain the effects of decisions in complex dynamic systems. The SD approach emphasizes time functions [19,20]. SD Models undergo constant interaction, continuous questioning, testing, and refinement. Based on the feedback concepts of control theory, the SD methodology generates the dynamic behavior of the associated model. The feedback loop structure of systems is represented by causal loop diagrams (CLD) [19,21]. A feedback loop contains two or more causality-related variables that are self-contained. The variable relationships in the loop can be either positive or negative. A positive relationship means that when one variable increases, the other one increases, too. In a balanced relationship, the change in the variables is inverse. The stock and flow diagram (SFD) is the underlying physical structure of the system. The SFD is normally structured according to the CLD. The stock (level) represents the state of the system and the flow (rate) is changed by decisions based on the state of the system. The stock and flow structure (including feedback) of a system determines the quantitative modes of behavior that the system can adopt. For the development of an SD model, the work in [19] presents a modeling process with the following steps: (i) problem analysis, (ii) system conceptualization, (iii) model formulation, (iv) simulation and verification, and (v) policy analysis and improvement. Figure 1 illustrates the SDSM process. Steps to build an system dynamics simulation model (SDSM) based on the work in [19].
In the context of logistics, some studies show the application of SD to LML activities. In [22], the authors analyzed the decisions and interdependencies between customers, retailers, and suppliers using an SD model from an economic perspective. In [23], the authors applied an SD approach to model interdependencies of decisions by various stakeholders and their impact on city logistics. In [24], the authors presented a specific SD application in UL operations. They also used an SD model in [25] to understand customer behavior from an LML perspective. Although modeling efforts are important in urban logistics [12,26], the simulation seems to be still in the development phase [10] . The most popular simulation approaches remain multi-agent approaches [10,27] . While SD is still preliminary in its applications for urban freight distribution, it has great potential because it can take into account the complexity of the dynamics and heterogeneity of the systems [23].

Facility Location Problems
Facility location determination is a critical strategic business decision. There are several factors that determine facility location, including competition, costs, and associated impacts. The facility location decision has a profound impact on tactical and operational operations. For dealing with this strategic decision, the Facility Location Problem (FLP) was introduced to the field of operations research in the 1960s [28] and was initially called the Plant Location Problem.
FLPs consist of determining the optimal location for one or more facilities to serve a range of demand points. The importance of the optimal location depends on the nature of the problem in terms of the constraints and optimality criteria considered for the site [29]. FLPs are useful for determining the location of facilities such as hospitals, fire stations, bus stops, train stations, truck terminals, gas stations, blood bank centers, retail stores, neighborhoods, libraries, parks, post offices, airports, and landfills. The FLP can be seen as a generalization of the vehicle routing problem with fixed costs for the installation of facilities. An exhaustive review and discussion of the FLPs is provided in [30].
In a basic formulation, the FLP consists of a number of potential plant locations at which a plant can be opened and a number of demand points that must be served. The aim is to determine what subset of facilities needs to be opened in order to minimize the total costs of delivering goods to customers plus the sum of the facility opening costs. A simple example of a classic FLP instance is shown in Figure 2, where each customer (blue circle) is assigned to the nearest open facility (red square) via an active connection. One of the most frequently investigated discrete location problems is the uncapacitated facility location problem (UFLP). The UFLP is the problem of determining the best location for a given facility-or the best locations for a given group of facilities-given some limitations on the environment in which it can be placed. This contrasts with the capacitated FLP, where facilities limit the number of customers they can serve. In the uncapacitated version, there is no such limitation. Some applications of FLP in UL context are presented in [31][32][33][34][35][36][37][38].

Monte Carlo Simulation
Monte Carlo simulation (MCS) generates distributions of possible result values. By using probability distributions, variables can have different probabilities of different outcomes occurring. Probability distributions are a suitable way to describe the uncertainty in variables of a risk analysis. During an MCS process, values are randomly selected from input probability distributions. Each set of samples is called an iteration, and the result of that sample is recorded. MCS conducts this hundreds or thousands of times, and the result is a probability distribution of possible outcomes. In this way, MCS provides a much more complete prediction of what may happen because it also delivers the probability that it will happen [39].
Many quantitative problems in science, engineering, and finance are solved today with MCS techniques. We list some important application areas, such as industrial engineering and operations research, physical processes, economics and finance, computer-based statistics and parallel computing, adaptive Monte Carlo Algorithms, spatial processes, and quasi Monte Carlo [40]. The MCS has been applied to last mile logistics in the real-world, which depends on many external random factors. This is especially true for last-mile deliveries. Challenging factors include-but are not limited to-traffic, weather, and the size of individual orders. To this end, MCS has found great use in assessing the risk and reliability of supply chains.

Integrated Simulation-Optimization Approach
Hybrid models play an important role in most real-world systems. Multiple perspectives can be obtained, each of which can answer a different important question. For answering the range of questions that can be asked with respect to a system, a combined set of model types can be the answer. Hybrid methods can bring more comprehensive and efficient estimations of a reality by enhancing the synergies among different methods and giving the suitable output for decision-makers [41]. One of the main goals of the SO method as a hybrid model is to efficiently address both the optimization and the uncertainty. An overview of the application of SO methods in designing resilient supply chain networks is presented in [42]. There are many ways to combine simulation and optimization, and the appropriate design depends strongly on the properties of the problem. The guideline 3633.12 [43] by the German Association of Engineers (Verein Deutscher Ingenieure (VDI)) provides a classification for different combinations of simulation and optimization in terms of sequential and hierarchical combination, which has been in detail elaborated by the Arbeitsgemeinschaft Simulation (ASIM) in Germany [44]. A sequential combination assumes that either simulation or optimization is completed before the other one can be executed. Within a hierarchical combined approach that can be called several times during the overall execution. Moreover, the details of the main classifications of various SO combinations are described in [13]. According to their classifications, we consider an analytical model improvement approach, where simulation is used to improve the model results, either by refining its parameters or by extending them, e.g., by considering different scenarios. In this context, the SDSM, based on an SO concept for APLs presented in [45], provides a suitable methodology to determine the behavior of the parameters in our multi-period capacitated FLP model. A well-proven SO approach to solve this kind of problems is provided by simheuristic algorithms [46,47].
In particular, our approach consists of the following steps, which are shown in Figure 3: (i) for each district, we use the SDSM to generate an estimate of expected demand; (ii) for different scenarios, each scenario being defined by a different level of demand (e.g., lower than expected, as expected, or higher than expected), solve the associated CFLP model; and (iii) use a Monte Carlo simulation to evaluate the solutions obtained in the previous step when used in a stochastic environment. Here, we assumed that the demand per district is uncertain and follows a known probability distribution, with the aim of comparing total costs with the reliability.

Application in the City of Dortmund
This paper deals with the case of the city of Dortmund, which is divided into 62 districts, codified from 000 to 960. Figure 4 shows the map of the city of Dortmund with its respective districts.

System Dynamics Simulation Model
We propose an SDSM for APL as a UL delivery scheme. The SDSM is designed to understand the behavior of components of APL systems, and it will be used as a decision support for future implementations. To develop an SD model, we followed the steps shown in Figure 1

Problem Identification
E-commerce does not necessarily mean the absence of physical shops, but rather an evolution in the way retailers carry out orders. For this reason, e-commerce has led to an increase in innovative combinations of physical and digital solutions, resulting in different ways of preparing, distributing, and collecting orders from customers [48].
Examples include home delivery, collection points, and APLs. We focus on understanding APL systems as a UL delivery scheme in the context of e-commerce and evaluating its components over time.

System Conceptualization
From a qualitative point of view, we use an SDSM to understand the behavior and interdependencies of the components of APL systems. From the existing literature on APLs (mainly based on case studies and field data), we define the main components that have an impact on the system. Following SD standard procedures [19], we use the software tool Vensim to create the causal-loop and stock-flow diagrams. Figure 5 describes the main APL system components that third-party logistics service providers will need to consider for future APL applications. The CLD describes that the market size is positively influenced by the population and the population growth rate. The potential number of e-customers is positively influenced by the e-shoppers growth rate and balanced by the number of APL users. In turn, the number of deliveries is positively reinforced by purchases per month and number of APL users. In turn, the purchases per month are positively reinforced by the average purchases per month and the online purchase rate.

Model Formulation
From a quantitative perspective, we present the evaluation of the APL components. Based on the CLD, we built the SFD, as shown in Figure 6. First, the variables of market size, potential e-customers, purchases per month, and APL users are defined as stocks (squared). Then, population growth rate, e-shoppers growth rate, online purchase rate, and APL market growth rate are defined as flows. Finally, population, accessibility, service level, average purchase per month, APL market share, and number of deliveries are auxiliary variables. The main output in this model is the number of deliveries, which are used as input values in the FLP model.

Simulation and Verification
We apply the SDSM based on public data of the city of Dortmund and using the e-commerce trends in the German context. Taking into account the volatility of the ecommerce sector, the SDSM evaluates the system components of APL for a planning horizon of three years (divided into 36 months). Table 1 shows the components used in the SDSM application and their values.  Table 2 shows the significant changes in the e-shoppers rate to build the scenarios. We consider Scenario 1 (S1), Scenario 2 (S2), and Scenario 3 (S3) with increasing rates of e-shoppers.

Multi-Period Facility Location Problem
The FLP is a well-known optimization challenge where the typical goal is to find the minimum costs and location of facilities that must be open to meet customer requirements, either deterministically [30] or stochastically [31,49]. If routing decisions are also included, the FLP turns into the so-called location routing problem [50,51]. In general, FLPs are classified either as capacitated or uncapacitated. The former refers to the case where the facilities have a known limit to the demand they can meet. The latter is the case where the service capacity of each facility exceeds the total demand of customers. Figure 7 illustrates the capacitated FLP (CFLP) for the APL network in the city of Dortmund. Here, each district is a potential APL location (yellow square) and each APL is connected to each other in the APLs configuration (dashed lines). These connections are used to calculate the distance matrix between districts. A multi-period CFLP is taken into account in our work. Decisions made in a given period affect future periods over a time horizon of T. In particular, as demand is expected to increase in future periods, we assume that whenever an APL is opened within a period t ∈ T, it must remain open until the end of the time horizon, i.e., for all t ∈ T : t > t. Similarly, third-party logistics providers indicate that a minimum percentage of m ∈ (0, 1) of total installed capacity must be used. Therefore, with the set I of nodes representing all districts in the city, each district i ∈ I could contain no, one, or more APLs, each with a known capacity a i > 0. Similarly, each district j ∈ I has an aggregated demand in the period t ∈ T, d jt > 0. For two districts i, j ∈ I, the unit costs of assigning an APL located in the district i to a customer located in the district j is c ij > 0. Similarly, the costs of opening an APL in district i ∈ I during the period t ∈ T is indicated as f it > 0. In this context, the binary variable x ijt takes the value 1 if customers in the district j ∈ I are assigned to an APL in the district i ∈ I during the period t ∈ T; otherwise, the value is 0. Similarly, the integer variable y it represents the number of APLs that are open in district i ∈ I and in period t ∈ T. Then, our multi-period CFLP can be formulated as follows.
subject to: ∑ i∈I x ijt = 1 ∀j ∈ I, ∀t ∈ T (2) x ijt ∈ {0, 1} ∀i ∈ I, ∀j ∈ I, ∀t ∈ T (6) The expression (1) indicates the objective function that minimizes the total costs: The first term indicates the service costs of APLs, while the second represents the fixed costs of opening new APLs in the time horizon. Constraints (2) ensure that for each period t ∈ T and each district j ∈ I exactly one APL is assigned. Restrictions (3) ensure that once an APL is opened, it remains open until the end of the time horizon. Constraints (4) ensure that for each APL in district i ∈ I and time period t ∈ T, the demand served by that APL does not exceed its capacity. Constraints (5) guarantee a minimum utilization percentage of the total installed capacity of APLs for each t ∈ T period. Finally, constraints (6) and (7) specify the ranges of the decision variables.

Computational Results and Discussion
Based on the city of Dortmund as a real-world case, a set of experiments considering a 36-month planning horizon has been tested. Table 1 shows the parameters provided by the SDSM. It yields multiple outputs, from which the yearly demand per district is the most relevant one to feed the multi-period CFLP model. Then, the integrated SO approach described in Section 3 is applied to obtain a set of solutions assessed in terms of stochastic cost and reliability level.

System Dynamics Simulation Model Results
The market size increases in line with the population growth rate from from 602,666 in month 1 to 606,182 inhabitants in month 36. The purchases per month, the number of deliveries and the number of APLs show an increasing trend over time. The number of deliveries increases from 125, 353 to 277, 910 packages per month. We applied the SDSM and changed the average purchases per month as shown in Table 2. The results for APL users in the first month are 45,666 for S1, 54,799 for S2, and 63,933 for S3, and at the 36th month 64,331 for S1, 77,202 for S2, and 90,071 for S3. The results of the number of deliveries (units) in the first month are 125,353 for S1, 150,423 for S2, and 175,496 for S3, and in the 36th month 277,910 for S1, 333,512 for S2, and 389,106 for S3. Figure 8 illustrates the scenario comparison of the users of APL and the number of deliveries. The complete results generated by the SDSM of the default scenarios are shown in Tables A1-A3 in Appendix A.

Generating and Simulating Optimal Configurations
As soon as our SDSM provides the number of deliveries (expected demand) for the three scenarios (s ∈ S, where S = {S1, S2, S3}) under consideration of Table 2, are used to feed our CFLP model. We evaluate ten APL network configurations (k ∈ K, where K = {1, 2, ..., 10}) with the demand increasing proportional to k, based on the scenario S2. Each configuration is obtained by optimally solving the CFLP model using the procedure described below.

1.
Consider a uniformly distributed random demand D jtk per district j ∈ J during the period t ∈ T for generating the configurations.

2.
Define µ jt = E[D jtk ] and assume that µ jt is the medium demand corresponding to the scenario S2.

3.
Define a factor δ = 0.01 to increase the size of the uniform interval as we move forward into future periods.

4.
Generate the random demand using Equation (8). The expression 1 + k−1 |K|−1 is useful to increase µ jt proportionally to the value of k. In this way, we guarantee that generated configurations differ in size.
The variable costs c ij are proportional to the distance between each pair of districts. They were estimated using a web mapping service. The fixed costs are f it = 5500e for the first year and each district, and increase according to an average inflation rate of 2% per year. The capacity of each APL in a district i ∈ I is a i = 6000 units per month, and the minimum utilization percentage is m = 40%. Then, our CFLP model is solved with Cplex for all ten configurations. The number of resulting open APLs per month is shown in Figure 9 for three out of these configurations. The lowest and highest lines represent solutions for the lowest and highest demand, respectively. The rest of the solutions are in between. As the demand µ jt increases over time, the number of open APLs will behave the same regardless of the configuration. However, this consistent behavior does not extend beyond the year 1 for k = 10 and beyond the year 2 for k = 1 and k = 5, when the total installed APLs are sufficient to cover the total demand by the end of the planning horizon. Furthermore, there is a sharp increase in open APLs from months 11 to 12. This behavior is caused by two parameters: The annual growth of the fixed costs f it drives the APLs that are opened when they are less expensive, but always limited by the minimum utilization percentage m. Finally, the total number of APLs installed varies significantly from one scenario to another, for example, while 165 APLs are required for k = 10, only 99 APLs are installed in the configuration k = 1 at the end of the planning horizon. All configuration results are stored in Tables A4-A6 in Appendix B. Once all configurations have been generated, they are tested in a stochastic environment, assuming that the demand per district is uncertain and follows a known probability distribution. Consider a random demand D jts whose mean and standard deviation are µ jts and σ jts , respectively, per district j ∈ J during the period t ∈ T for the scenario s ∈ S. We assume that µ jts is the demand generated by our SDSM. Now, as the goal is to evaluate the performance of each configuration, they must be tested under the same demand conditions; therefore, the demand does not depend further on the configuration. Then, D jts is simulated and each configuration is evaluated in terms of total costs (Equation (1)) and reliability. Studies on reliability in supply chains are found in [52,53]. We define the reliability R ks of the configuration k ∈ K for the scenario s ∈ S as the probability that the stochastic demand of all districts in the city can be successfully satisfied, i.e., where b ks is the total number of simulation runs where the configuration does not cover all district demands, and n is the total number of runs. In other words, if at least one APL in a configuration is not able to cover all assigned needs, that configuration will fail. In our experiments, a total of n = 5000 runs are performed for each combination of scenario s and configuration k. Without losing generality, we assume that demand is independent of the customers' district, but our methodology can easily be adapted to take into account correlated demand. For the realization of the demand, three probability distributions have been tested:

2.
A symmetric triangular distribution, according to Equation (11), i.e., the mode equals µ jts . To obtain conditions similar to 1, the lower and upper limits of this distribution are calculated assuming that the standard deviation is equal.

3.
A lognormal distribution, according to Equation (12). Again, the standard deviation is the same as in the point 1 to preserve similar conditions. D jts ∼ Lognormal µ jts , σ jts (12) Figure 10 shows the main results of the simulation process for each configuration. Blue, orange, and green lines represent the results from the demand for uniform, triangular, or log-normal distribution. In addition, dotted, solid, and dashed lines represent the results for the scenarios S1 (low demand), S2 (medium demand), and S3 (high demand), respectively. Each dot on each line represents a single configuration. In general, more expensive configurations result in higher reliability, because they include a larger number of installed APLs. When the demand follows either a uniform or a triangular distribution, the most expensive half of the configurations always achieve a 100.0% reliability level, regardless of the scenario. In other words, the configuration k = 6, with total costs of 748,660e, already locates a suitable number of APLs and eliminates the need to consider more expensive configurations. However, if the budget is lower, our approach offers other good alternatives for the decision makers.
In general, configurations are less reliable when demand scenarios are increased. For example, configuration k = 4, with total costs of 661,100e, only achieves a reliability level of 14.0% under the high demand scenario and a log-normal distribution. Conversely, this configuration achieves a reliability level of 98.8% under the low demand scenario. Furthermore, the reliability is very sensitive to the probability distribution. Broadly speaking, a configuration fails if the demand is too high (Equation (9)). Therefore, configurations simulating a log-normal demand, which has no upper limit, are less reliable than those where the probability distribution is either uniform or triangular (Equations (10) and (11)). This fact underlines the relevance of integrating the study to determine the behavior of demand in the real case.

Conclusions
With the goal of determining the optimal number and location of automated parcel locker (APL) systems in a multi-period time horizon, this paper has proposed the use of an integrated simulation-optimization approach combining system dynamics with exact optimization and Monte Carlo simulation. We propose this integrated model as a decision support tool for future APL implementations as a last-mile distribution scheme. The analysis is based on a real-world case study where service requirements are considered as random variables that evolve over time. First, a system dynamics simulation model is designed to determine the 36-month performance of parameters such as APL users and number of deliveries. Then, these results feed a multi-period facility location model that provides the optimal number of APLs. To deal with the demand uncertainty, different scenarios are considered and solved with precise methods. The solutions associated with each scenario are then sent to a Monte Carlo simulation to estimate both their costs and reliability level.
The model provides an optimal number of APLs, taking into account the expectations of user demands. We have considered three scenarios S1, S2, and S3 for 50%, 60%, and 70% of the e-shopper rate. The results for the number of deliveries (units) after 36 months show a wide range of shipments from about 277,000 in S1 to nearly 400,000 in S3. We used our CFLP to evaluate ten APL network configurations (k = 1, ..., 10) with increasing demand in relation to each scenario. Obviously, there is a strong impact on the number of APLs that the city needs. After 36 months, the number of APLs increases from 99 in the case of the lowest demand to 165 at maximum demand. Interestingly, the number of APLs stabilizes from month 24 in all configurations. Thus, we can conclude that the effect on APLs appears linear in relation to the potential users of APL with no obvious scale effects. From a stochastic environment, we assumed that the demand per district is uncertain and follows a known probability distribution. Whenever the demand follows either a uniform or a triangular distribution, the most expensive configurations always reach a reliability level of 100.0% regardless of the scenario. The configuration k = 6, with total costs of 748,660e, already locates a suitable number of APLs. However, if the budget is lower, our approach offers other alternatives for decision makers.
All in all, the work illustrates the potential of combining different simulation and optimization techniques to correctly address complex optimization problems in real urban logistics, where uncertainties must also be taken into account. The following research lines are still open for the future: (i) increasing the level of detail on the demand side, taking into account correlated and individual customers' demands instead of aggregated ones-which will significantly increase the size of the problem; (ii) develop a metaheuristicbased approach for the optimization phase, as this will be a necessary step when larger instances are to be analyzed; and (iii) extend the approach to a fully simheuristic algorithm, so that the feedback provided by the Monte Carlo simulation can be reused to guide the metaheuristic search.

Conflicts of Interest:
The authors declare no conflict of interest.

Appendix A. Results Generated by the SDSM for the Horizon Planning in the Proposal Scenarios
Appendix A shows the results by the SDSM for the first three years.

Appendix B. Number of APLs by Period and Configuration
Appendix B shows the results for the required number of APLs for the first three years.