Matheuristics for the Design of a Multi-Step, Multi-Product Supply Chain with Multimodal Transport

Supply-chain network design is a complex task because many decisions are involved, and global networks today comprise many actors and variables, for example in the automotive, pharmaceutical, and electronics industries. This research addresses a supply-chain network design problem with four levels: suppliers, factories, warehouses, and customers. The problem decides the number, locations, and capacities of factories and warehouses, as well as the transportation between levels of the supply chain. The problem is modeled as a mixed-integer linear program. The main contribution of this work is the proposal of two matheuristic algorithms to solve the problem. Matheuristics are algorithms that combine exact methods and heuristics, and they are attracting interest in the literature because of their fast execution and high-quality solutions. The proposed matheuristics select the warehouses and their capacities following heuristic rules. Once the warehouses and their capacities are fixed, the algorithms solve reduced models using commercial optimization software. Medium and large instances were generated based on a procedure described in the literature. The algorithms are compared with the results obtained by solving the full model with a time limit. The proposed algorithms obtain better results for the largest instances in shorter execution times.


Introduction
An optimal supply chain is a fundamental part of any company's success; a good design and administration represent a competitive advantage or even a requirement for market participation. Supply chains represent a large part of a company's assets. Additionally, costs or savings are dependent on their design. Among the advantages obtained by a good design of the supply chain are reduced purchase costs, reduced production costs, increased company profits, reduced investment in fixed costs, and increased cash flow, among others.
In the end, the main objective of a supply chain is to provide an efficient way to supply the products to the client at the lowest possible cost (Council of Supply-Chain Management Professionals (CSCMP), 2019). However, within a supply chain, a large number of costs are incurred. Additionally, within large companies, supply chains are becoming increasingly complex. Knowing which is the best option among all the possible combinations becomes quite a complex task. The global production and distribution networks involve many actors and variables, exploding the combinatorial nature of the decisions involved, especially when more characteristics are considered, such as location, transportation, inventory, product architecture, or sustainability considerations. These networks exist in industries such as pharmaceutical, automotive, and electronic products, with suppliers, plants, and customers in different continents.
One way to address the decision-making problem in the design of supply chains has been to propose optimization models based on mathematical programming. As computing power has increased, it has become possible to solve more complex models involving more elements. However, the mixed-integer linear programming models that have been widely used in supply-chain optimization are mostly NP-hard [1], making it impractical to obtain optimal solutions in reasonable times for instances of a size similar to those found in real problems. Fortunately, the ability of computers to calculate results for different scenarios brought the opportunity to use heuristics and randomness to help construct solutions for NP-hard combinatorial problems. The first results showed that, although these solutions were of variable and worse quality than those obtained with exact methods, they required much less computation time. With the goal of improving the quality of these solutions, the research on metaheuristic algorithms was born. Many advances were achieved with algorithms such as simulated annealing, tabu search, ant colony optimization, genetic algorithms, and others. Recently, new algorithms that combine heuristics and exact mathematical programming methods, known as matheuristics, have been proposed. Research in this field is new, and the goal is to combine the processing speed of heuristic components with the solution quality of exact methods. Thus, the main contribution of the work presented in this paper is to advance the research on matheuristics by proposing two algorithms that solve a complex supply-chain network design problem efficiently, with good-quality solutions in a reasonable time.
In this work, a supply-chain network design (SCND) problem will be presented, where a literature review of both the problem and the methods of solution are shown first; then, the problem is described in detail. A mixed-integer linear programming model is presented, and the matheuristics are used to solve the same problem. The instances presented range from 100 to 200 clients, and the results of the mixed-integer linear programming model will be analyzed and compared with those of the matheuristic algorithms proposed. Finally, the conclusions will be presented in the last section.

Literature Review
This section describes some of the literature, first on the problem of supply-chain network design and then on matheuristics, and comments on the gap covered by the research presented in this paper.
Pirkul and Jayaraman [2] created a model called PLANWAR, which focused on optimizing the supply chain through a heuristic to decide on the opening of plants and warehouses and the flow between them. However, the model does not handle different levels of capacity between facilities nor multimodal transportation.
Wu et al. [3] solved the supply-chain planning problem where the same product can be produced in multiple facilities, but their work focuses more on analyzing different algorithms and the complexity of each one of them. In their problem of supply-chain design, Eskigun et al. [4] consider delivery times and transportation modes, but their work focuses mainly on outbound logistics and is not multi-tier. Sadjady and Davoudpour [5] solved a multi-product supply-chain problem using a mixed-integer linear programming model in which the opening of facilities is decided, as well as the level of capacity and the mode of transport of flows between levels of the supply-chain. However, the work takes into account only finished products. Olivares-Benitez et al. [6] focus on optimizing the transportation of a two-tier supply chain, calculating the flow and time between facilities using bi-objective optimization. The problem addressed in this work is for a single product. Rahmaniani and Ghaderi [7] developed a mixed-integer linear programming model and a heuristic based on the evolutionary algorithm of the firefly, where transport and construction costs are taken into account, but their solution method focuses on problems in telecommunications or power distribution companies.
Bertazzi et al. [8] developed min-max methods and a heuristic to solve the multi-tier inventory problem taking into account purchasing, manufacturing, and transportation costs, but they do not address the issue of opening of facilities. Additionally, they do not take into account the bill of materials.
Many more recent models address the problem of supply-chain network design with different considerations. For example, [9] describes a problem with environmental and financial considerations, and [10] analyzes a model that considers inventory decisions in addition to the classic decisions on location and transportation. However, the research in the literature is far from a general model for supply-chain network design that includes all the situations, variables, and decisions. Our purpose was to analyze a problem with a high level of complexity that would be a challenge for new solution methods. In particular, very few models consider the product architecture represented in the bill of materials. One of the reasons is that introducing this element creates an explosion in the number of variables and interactions, making it very hard to solve even small instances. As can be observed in the literature presented, the most used solution methods are based on metaheuristics because of the computational complexity of the models used. We propose new solution methods in the field of matheuristics, which have rarely been used in SCND.
The term matheuristic is relatively new; the term began to be used between 2008 and 2009. Maniezzo et al. [11] talk about the hybridization that can exist between mathematical programming and metaheuristics. Fischetti and Fischetti [12] explain that matheuristics exploit heuristics and metaheuristics to improve and facilitate the mixed-integer programming (MIP) model. They show three applications: optimization of layout, packaging, and routing. Moreover, there is a rising interest in these hybrid methods, demonstrated by a growing literature and specialized tutorials in operations research [13]. Matheuristics have been applied to different decision-making problems, for example in routing [14], production planning [15], lot sizing [16], and risk management [17], among others. However, we found in the literature few applications of matheuristics to supply-chain network design.
Boschetti et al. [18] solve a Single-Source Capacitated Facility Location problem (SCFLP) using different matheuristics. Raa et al. [19] use a matheuristic to solve their aggregate production-distribution problem, but the work is focused on mould-sharing between factories. Tautenhain et al. [20] use the combination of a heuristic called MathFix and a matheuristic called AugMathFix to solve a bi-objective model for sustainable supply-chain design. However, the model has only two tiers: suppliers-plant and plant-clients. Cantú et al. [21] propose a matheuristic for the design of sustainable hydrogen supply chains using a multi-objective perspective, but the model is focused on a single product and transportation mode. Souto et al. [22] propose a matheuristic algorithm to solve a supply-chain network design problem with only two levels, a single transportation mode, and a single product. Table 1 describes the features of the literature presented in this section. It can be noted that the problem solved in this work has a higher level of complexity than other problems solved in the literature using matheuristics. The elements that add this complexity beyond the classic decisions on location and transportation are: a hierarchical product architecture as described by a bill of materials, different capacities in the facilities, and different available transportation modes. The complexity of the problem represents a challenge for any solution method, and is a good choice to demonstrate the efficiency of the matheuristic algorithms proposed.

Problem Description
This paper addresses the problem of designing a four-tier multi-product supply-chain network (SCND): suppliers, factories, warehouses, and customers (Figure 1). The number and locations of plants and warehouses must be chosen from a set of potential plants and warehouses, respectively, and the capacity level of each factory and each warehouse must be chosen from a set of predetermined capacity levels for each location. Each location with each of its capacity levels has a fixed opening cost. Each of the materials can be supplied by a set of suppliers that have this material. Each material has a different purchase price for each supplier. Materials are shipped to factories, where they are converted into finished products. Each finished product has its bill of materials (BOM), which states what materials are needed for each finished product and how many units, so all the materials needed to produce a finished product will have to be taken to the factory where it is produced. Figure 2 shows the representation of the BOM; it exemplifies that to produce product $p_1$, 4 units of material $r_1$ and 3 units of material $r_2$ are needed. In general, for any product $p$, the necessary units of material $r$ are given by the parameter $A_{rp}$.
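The role of the BOM parameter $A_{rp}$ can be illustrated with a short calculation. The following is a minimal sketch, not part of the original model code: the function name `material_requirements` and the production quantity of 10 units are hypothetical, while the values 4 and 3 come from the Figure 2 example.

```python
from collections import defaultdict


def material_requirements(bom, production):
    """Total units of each material needed for a production plan.

    bom[(r, p)] plays the role of A_rp (units of material r per unit of p).
    production[p] is the number of units of product p to manufacture.
    """
    needs = defaultdict(int)
    for (r, p), units_per_product in bom.items():
        needs[r] += units_per_product * production.get(p, 0)
    return dict(needs)


# BOM from the Figure 2 example: p1 needs 4 units of r1 and 3 units of r2.
bom = {("r1", "p1"): 4, ("r2", "p1"): 3}

# Hypothetical plan: manufacture 10 units of p1 at one factory.
requirements = material_requirements(bom, {"p1": 10})
print(requirements)
```

Under these assumptions, the factory producing 10 units of $p_1$ must receive 40 units of $r_1$ and 30 units of $r_2$.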
Each product can be manufactured in a pre-established set of factories. Each of the products has a manufacturing cost in each plant. Once the products are manufactured, they can be taken to a set of pre-established warehouses. Finished products must be supplied to each customer in a single delivery; that is, products cannot be shipped to the customer from two different warehouses. All materials and finished products can be transported by different transport methods for each destination-origin pair, which can be the following: supplier-factory, factory-warehouse, warehouse-customer. Each product or material has a transport cost for each means of transport for each possible pair. Each means of transportation for each origin-destination pair has a minimum quantity to be transported and a maximum transport capacity.
Each supplier has a maximum capacity to supply each material. Each product occupies a different manufacturing or storage capacity. Each factory has a maximum production capacity for each product, a total maximum production capacity, and a minimum production level required for the factory to be opened, for each of its capacity levels. Each warehouse has a maximum storage capacity and a minimum storage level required for it to be opened.
The objective of the supply-chain design is to minimize the sum of the fixed cost by opening factories and warehouses, the total purchasing costs, the total manufacturing costs, and the total transportation costs.
This supply-chain design can be applied to global networks of manufacturing and distribution in industries such as pharmaceutics, automotive, and electronics. For example, in the production of drugs, many ingredients can be blended, purchased from suppliers in different countries, and distributed to customers in other continents using alternative transportation modes. A potential example is to have suppliers in Asia, shipping materials to plants in Europe to manufacture products, to be transported to markets in the US, Canada, and Brazil. Similar cases can be devised to manufacture vehicles, tablets, cell phones, computers, or vaccines.
Some assumptions of the model that may represent limitations to its applicability are:
1. It is a single-period problem that usually is applied to long-term planning.
2. The model is deterministic and does not consider variability, although being fast to solve makes it useful for what-if analysis.
3. The model does not incorporate other operational and strategic elements such as routing, sustainability, inventory, service level, etc.

Mixed-Integer Linear Programming Model (MILP)
Following is the notation used for the MILP, based on the model of [23].

Decision Variables
Binary Variables

Real Variables
$x^{rt}_{sf}$: Amount of material $r$ that is supplied by supplier $s$ to factory $f$ by mode of transport $t$, $(s \in S; f \in F; r \in R; t \in T)$
$x^{pt}_{fw}$: Amount of product $p$ that is sent from factory $f$ to warehouse $w$ by mode of transport $t$, $(f \in F; w \in W; p \in P; t \in T)$

Objective Function
The objective function (1) is to minimize the sum of all the costs that we are taking into account in the model. The first term adds the fixed costs of opening a factory, the second, the fixed costs of opening a warehouse, the third the purchase and transportation costs from the supplier to the factories of the materials, the fourth, the cost of manufacturing and transportation from the factories to the warehouses and the fifth the transportation from the warehouses to the customers.
Supplier constraints: Constraint (2) limits the order quantity to a material supplier's maximum capacity, constraint (3) prevents a factory from ordering a specific material from more than one supplier, and constraint (4) ensures that material can be transported from a supplier to a factory only if that supplier is selected.
Factory and warehouse constraints: Constraint (5) ensures that each potential location can only be opened with one level of capacity. Constraint (6) ensures that the amount of material ordered from the supplier is exactly what is needed to produce the finished products. Constraints (7) and (8) limit production to the overall minimum and maximum quantities of each factory and limit the maximum capacity per product, respectively. Finally, constraint (9) imposes the lower and upper limits of storage in each warehouse.
Customer and flow conservation constraints: Constraint (10) ensures that each customer is served from only one warehouse, and constraint (11) ensures that the quantity of product that reaches customers is the same as that which reaches the warehouses.
Transport-related constraints: Constraints (12)-(14) limit the minimum and maximum quantities to transport between origins and destinations.
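The effect of the transport limits can be illustrated with a small admissibility check. This is a sketch under an assumption that is not stated explicitly in the text: a transport mode on a given origin-destination pair is either unused (zero flow) or carries at least its minimum quantity and at most its maximum capacity. The function name and arguments are hypothetical, and `max_qty=None` stands for a mode without a maximum.

```python
def flow_is_admissible(flow, min_qty, max_qty=None):
    """Check a single flow against the minimum/maximum transport quantities.

    Assumption: a flow of zero means the mode is not used, which is allowed.
    max_qty=None models a mode with no maximum transport capacity.
    """
    if flow == 0:
        return True
    if flow < min_qty:
        return False
    return max_qty is None or flow <= max_qty


# Hypothetical limits for one mode on one origin-destination pair.
print(flow_is_admissible(0, min_qty=50, max_qty=200))    # mode not used
print(flow_is_admissible(20, min_qty=50, max_qty=200))   # below the minimum
print(flow_is_admissible(120, min_qty=50, max_qty=200))  # within the limits
```

In a MILP, this either-or behavior is usually enforced by linking the flow to a binary variable that indicates whether the mode is used on that pair.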
Binary and non-negativity constraints

Matheuristics
The model described above can be reduced to the single-source capacitated facility location problem [23] and therefore belongs to the class of NP-hard problems. Thus, a heuristic method is justified, especially to solve large instances. The matheuristics consist of the steps described in Algorithms 1-3. Algorithm 1 is the general procedure. The problem is divided into two parts. In the first part, the warehouses are opened following one of two methods: simple or selective (with probability). Once the decision for the warehouses is fixed, an allocation sub-problem is solved using commercial optimization software. The sub-problem is described in Section 4.2.1. This sub-problem determines the flows between the warehouses and the customers. Finally, a second sub-problem is solved considering the decisions fixed previously. This sub-problem is described in Section 4.2.2 as a reduced supply-chain network design (SCND) problem. Here, the remaining non-fixed variables are determined.
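The decomposition described above can be outlined as a short driver routine. This is only an illustrative sketch: the three callables are hypothetical placeholders for the warehouse-selection heuristic and the two exact sub-problems (solved with commercial software in the paper), and the repetition over several iterations that keeps the best solution is an assumption, not a statement of Algorithm 1.

```python
import random


def run_matheuristic(select_warehouses, solve_allocation, solve_reduced_scnd,
                     iterations=1, seed=0):
    """Heuristic warehouse selection followed by two exact sub-problems."""
    rng = random.Random(seed)
    best = None
    for _ in range(iterations):
        opened = select_warehouses(rng)                     # heuristic step
        allocation = solve_allocation(opened)               # allocation sub-problem
        solution = solve_reduced_scnd(opened, allocation)   # reduced SCND
        if best is None or solution["cost"] < best["cost"]:
            best = solution
    return best


# Toy stand-ins (hypothetical data) so that the sketch runs end to end.
def toy_select(rng):
    return {rng.choice(["w1", "w2"]): "q1"}


def toy_allocation(opened):
    warehouse = next(iter(opened))
    return {"c1": warehouse, "c2": warehouse}   # single sourcing per customer


def toy_reduced(opened, allocation):
    fixed_cost = {"w1": 100.0, "w2": 80.0}
    cost = sum(fixed_cost[w] for w in opened) + 5.0 * len(allocation)
    return {"cost": cost, "opened": opened, "allocation": allocation}


best = run_matheuristic(toy_select, toy_allocation, toy_reduced, iterations=20)
print(best["cost"], best["opened"])
```

In a real implementation the two `solve_*` callables would build and solve the models of Sections 4.2.1 and 4.2.2 with a MILP solver.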
The algorithm described in Algorithm 2 helps to open warehouses with a certain capacity using a simple, random selection. The process finishes when the aggregated capacity is greater than the total demand. This method is reported as "Simple Heuristic".
The other method for warehouse selection is described in Algorithm 3. In this algorithm, an estimated cost is calculated for each warehouse. This cost is a combination of the fixed cost divided by the capacity of the facility, and an estimation of a potential transportation cost per unit. A probability is calculated such that the more expensive facility has a lower probability, and the cheapest facility has a higher probability. These probabilities are used with a random number to select warehouses with a certain capacity. The process finishes when the aggregated capacity is greater than the total demand. This selective method is reported as "Heuristic with Probability".
The complexity of both matheuristics is dominated by the solution of the Allocation Model, which can be reduced to the Linear Assignment Problem and solved in $O(n^3)$ time [24].

Closing steps of the first warehouse-selection listing:
    while Open_ws = 1
    Open_ws = 1
    z_ws,qs = 1
    Available capacity = Available capacity + SQ_ws,qs
while Available capacity < Demanded storage capacity

Closing steps of the second warehouse-selection listing:
    repeat
        Generate random ns
        Select w and q corresponding to the random number ns
    while Open_ws = 1
    Open_w = 1
    z_wq = 1
    Available capacity = Available capacity + SQ_wq
while Available capacity < Demanded storage capacity
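The two warehouse-selection rules can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names and the toy data are hypothetical, and the probability of each option is assumed to be proportional to the inverse of its estimated cost (the text only states that cheaper options receive a higher probability). Instead of redrawing when an already opened warehouse is selected, the sketch filters out opened warehouses before drawing, which gives the same selection distribution.

```python
import random


def select_simple(options, demand, rng):
    """Open random (warehouse, capacity level) pairs until demand is covered.

    options[(w, q)] is the storage capacity of warehouse w at level q.
    """
    opened, capacity = {}, 0.0
    while capacity < demand:
        remaining = [wq for wq in options if wq[0] not in opened]
        if not remaining:
            raise ValueError("candidate warehouses cannot cover the demand")
        w, q = rng.choice(remaining)
        opened[w] = q
        capacity += options[(w, q)]
    return opened


def select_with_probability(options, est_cost, demand, rng):
    """Same loop, but cheaper options are more likely to be drawn.

    Assumption: selection weight = 1 / estimated unit cost.
    """
    opened, capacity = {}, 0.0
    while capacity < demand:
        remaining = [wq for wq in options if wq[0] not in opened]
        if not remaining:
            raise ValueError("candidate warehouses cannot cover the demand")
        weights = [1.0 / est_cost[wq] for wq in remaining]
        w, q = rng.choices(remaining, weights=weights, k=1)[0]
        opened[w] = q
        capacity += options[(w, q)]
    return opened


# Hypothetical data: three warehouses with two capacity levels each.
options = {("w1", "small"): 40, ("w1", "large"): 80,
           ("w2", "small"): 50, ("w2", "large"): 90,
           ("w3", "small"): 30, ("w3", "large"): 70}
fixed = {("w1", "small"): 400, ("w1", "large"): 600,
         ("w2", "small"): 450, ("w2", "large"): 700,
         ("w3", "small"): 350, ("w3", "large"): 650}
unit_transport = {"w1": 1.5, "w2": 1.0, "w3": 2.0}

# Estimated cost: fixed cost per unit of capacity plus a unit transport estimate.
est_cost = {wq: fixed[wq] / options[wq] + unit_transport[wq[0]] for wq in options}

opened_simple = select_simple(options, 100, random.Random(1))
opened_prob = select_with_probability(options, est_cost, 100, random.Random(1))
print(opened_simple, opened_prob)
```

Both functions stop as soon as the aggregated capacity of the opened warehouses reaches the total demand, matching the stopping condition of the listings.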

Allocation Model
Allocation model 2.1 defines the decision variables $u^{pt}_{wc}$, which, due to the nature of the problem, form the largest set of variables: their number is given by the expression $|W| \cdot |C| \cdot |P| \cdot |T|$, assuming that there are always more customers than warehouses or factories.
In the instances used for this work, the smallest number of variables $u^{pt}_{wc}$ in one instance was 6000, so defining those variables separately greatly reduces the number of branches explored by the branch-and-cut algorithm, which IBM ILOG CPLEX uses in its default configuration to solve mixed-integer linear programming models. The allocation model is presented below. The constraints taken from the original model are constraints (9), (10), (14) and (21), and the objective function is (22).

Model of Reduced SCND
The variables of the model that become parameters are the following: a binary indicator for opening a warehouse $w$ with a client $c$ by mode of transport $t$, $(w \in W; c \in C; t \in T)$, which takes the value 0 otherwise. The parameters of the original model that are no longer used in model 2.2 are the following:

$SQU_{p}$: Storage capacity required to store a unit of $p$, $(p \in P)$
$SQ_{wq}$ (maximum): Maximum storage capacity of warehouse $w$ with level of capacity $q$, $(w \in W; q \in Q)$
$SQ_{wq}$ (minimum): Minimum storage capacity used by warehouse $w$ with level of capacity $q$, $(w \in W; q \in Q)$

The SCND model defines the remaining decision variables of the original model. It takes as parameters the decision variables already resolved by the heuristic and by allocation model 2.1, and removes the constraints that are no longer necessary.
The objective function is the same as in the original model, although some terms now contain parameters instead of decision variables. It is kept unchanged so that the matheuristic solutions can be compared with those of the MILP.

Instances
The instances are generated using the methodology described in [25], which is designed to simulate realistic conditions. The size of the generated instances is shown in Table 2 and the size of the corresponding model is shown in Table 3.

Table 2. Size of the generated instances (first column: number of clients).
100  10  10  20  3  3  20  20
125  13  13  25  3  3  25  25
150  15  15  30  3  3  30  30
175  18  18  35  3  3  35  35
200  20  20  40  3  3  40  40

Transport mode 1 represents the train, which is not available for all locations and has a very high minimum transport quantity. Modes 2 and 3 represent land vehicles: mode 2 represents a small vehicle such as a delivery van, and mode 3 represents a larger cargo vehicle. These two modes are assumed to be hired through an external logistics provider, so they do not have a maximum transport quantity. For mode 3 there is a minimum quantity to transport to justify the use of the larger vehicle.

For each product-customer pair a random demand $DEM_{pc} = X \sim UNIFD[1, 10]$ is generated, where $X$ is a random number that follows a discrete uniform distribution $UNIFD[min, max]$ that goes from $min$ to $max$. The fixed cost of opening a factory and a warehouse are defined by the following expression: The acquisition cost of a material is generated by the expression $PC^{r}_{s} = X \sim UNIFC[0.075, 0.625]$, where $UNIFC[min, max]$ represents a continuous uniform distribution. The manufacturing cost of a product is generated by the following expression MC The other parameters are calculated using a more sophisticated methodology explained in Appendix A of the work by [25]. A Windows console program was developed in C++ to generate these instances automatically.
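The two sampling rules given explicitly above can be sketched in a few lines. This is an illustrative sketch only, not the C++ generator used by the authors: the function names, the seed, and the small index sets are hypothetical, and only the distributions $UNIFD[1, 10]$ for demand and $UNIFC[0.075, 0.625]$ for purchase cost are taken from the text.

```python
import random


def generate_demand(products, customers, rng):
    """DEM_pc drawn from a discrete uniform distribution on [1, 10]."""
    return {(p, c): rng.randint(1, 10) for p in products for c in customers}


def generate_purchase_costs(materials_by_supplier, rng):
    """PC_rs drawn from a continuous uniform distribution on [0.075, 0.625]."""
    return {(r, s): rng.uniform(0.075, 0.625)
            for s, materials in materials_by_supplier.items()
            for r in materials}


rng = random.Random(42)  # fixed seed so the instance is reproducible

# Hypothetical small index sets, much smaller than the instances of Table 2.
products = ["p1", "p2"]
customers = ["c1", "c2", "c3"]
materials_by_supplier = {"s1": ["r1", "r2"], "s2": ["r2"]}

demand = generate_demand(products, customers, rng)
purchase_cost = generate_purchase_costs(materials_by_supplier, rng)
print(demand)
print(purchase_cost)
```

The remaining parameters (fixed costs, manufacturing costs, capacities) would follow the expressions of [25], which are not reproduced here.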

Results
The generated instances were solved with the mixed-integer linear programming model programmed in OPL in the IBM ILOG CPLEX 12.8.0 IDE. For the matheuristics, an application was programmed in C++ using Concert Technology and CPLEX. Both were run on a Lenovo ThinkPad T580 computer with an Intel Core i7-8550U @ 1.80 GHz processor and 16 GB of RAM.
The results obtained are shown in Table 4, indicating the total cost obtained for each instance with the different methods compared.
The results show that as the instances increase in size, the computation time to obtain optimal solutions increases exponentially. In all the cases, the MILP reached the time limit of 4 h. Only for instances with 100 clients, the matheuristic algorithms achieved times below 25 min. However, for instances larger than 100 clients, the matheuristic algorithms reached the time limit of 2 h. Hence, obtaining results through the use of mathematical modeling becomes impractical for large instances.
It can be observed in Table 4 that when the size of the instance grows, i.e., it has a higher number of clients, the total cost also increases. Likewise, it can be seen that as the instance size increases, the difference between the quality of the solutions provided by CPLEX and those provided by the matheuristics decreases. For instances of 100, 125 and 150 clients, the results of the matheuristic algorithm with the "simple heuristic" were 14.20% on average above those obtained with CPLEX, and the results of the matheuristic algorithm with the "heuristics with probability" were 6.06% on average above those obtained with CPLEX.
In the instances of both 175 and 200 clients, at least one of the two implemented matheuristics obtained a better solution than that of CPLEX in most of the cases. For the instances with 175 customers, the average improvement was 1.41% comparing the result of CPLEX and the result of the matheuristic algorithm with the "simple heuristic". For the same instances, the average improvement was 5.96% comparing the result of CPLEX and the result of the matheuristic algorithm with the "heuristics with probability". For the instance 2-200, the improvement was 3.93% comparing the result of CPLEX and the result of the matheuristic algorithm with the "simple heuristic". For the same instance, the improvement was 9.22% comparing the result of CPLEX and the result of the matheuristic algorithm with the "heuristics with probability". In six cases out of 15, with 175 and 200 clients, the matheuristic algorithms were able to find a feasible solution while CPLEX could not.
Comparing the two matheuristics, it can be observed that the heuristic method used to decide the opening of warehouses has a great impact on the objective function for the type of instances used, because the fixed cost of opening a warehouse is the cost with the greatest impact. Both the first and second heuristics diversify the results at each iteration, so that in small instances the algorithms may be run more than once to increase the quality of the solutions. Regarding the probability distribution used in the constructive heuristic with probability, although some randomness is applied, the selected warehouses will tend to be those with the lowest opening cost divided by their capacity, so the quality of the solutions tends to be better. On average, for all the instances tested, the matheuristic algorithm with the "heuristics with probability" was 7.15% better than the algorithm with the "simple heuristic". For the interested reader, a further discussion about the SCND configurations obtained and the distribution of costs, according to the design of instances proposed in [25], can be found in [23].

Conclusions
In this work, a problem for supply-chain network design is addressed. This problem was presented in the literature by Corthinal et al. [23]. We selected this complex NP-hard problem to test the efficiency of novel matheuristic algorithms. Matheuristics are algorithms that combine heuristic rules with exact optimization methods based on mathematical programming. There is a growing interest in these methods in the search for improving the quality of solutions with fast computation. When solving hard combinatorial optimization problems, heuristics and metaheuristics have demonstrated efficiency in obtaining feasible but lower-quality solutions in short computation times, whereas exact methods implemented in commercial and open-source software deliver optimal solutions at the cost of long execution times. In this way, the research in matheuristics looks for the benefits of hybridization.
The problem addressed has characteristics that increase the level of complexity compared to other problems of supply-chain network design solved with matheuristics. In addition to the classic decisions on the location of facilities and transportation, the problem involves determining capacities for the facilities, choosing between different transportation modes, and considering a hierarchical product architecture as described by a bill of materials. Hence, this problem is a good challenge to demonstrate the efficiency of the methods proposed. The structure of the network and decisions presented can be used for the design of global manufacturing and distribution networks in high-technology industries such as pharmaceutics, automotive, or electronics.
The main contribution of this paper is the proposal of novel matheuristic algorithms to solve a challenging combinatorial optimization problem for supply chain network design.
The algorithms proposed were efficient in obtaining solutions of high quality in reasonable computation times for large instances.
To study larger instances or longer supply chains, the problem can be divided into more sub-problems with heuristic decisions in between. As we observed, a well-designed heuristic can obtain better solutions in less time as data sets grow. Additionally, more complex decisions can be added to the model to integrate routing, inventory management, sustainability issues, and variability through stochastic modeling.
The key to designing efficient matheuristics is the adequate decomposition of the problem to take advantage of the quality of exact methods using heuristics and randomness to boost the execution speed.

Data Availability Statement:
The data presented in this study are available on request from the corresponding author.

Conflicts of Interest:
The authors declare no conflict of interest.

Abbreviations
The following abbreviations are used in this manuscript:

Notation and Description

Sets
$S$: Set of suppliers
$C$: Set of clients
$F$: Set of potential factories
$T$: Set of modes of transport
$Q$: Set of capacity levels for factory $F$ or warehouse $W$

Subsets
$R$: Subset of raw material
$P$: Subset of finished products
$W$: Subset of potential warehouses
$P_r$: Subset of finished products that require raw material $r$, $(P_r \subset P; r \in R)$
$S_r$: Subset of suppliers who supply raw material $r$, $(S_r \subset S; r \in R)$
$F_p$: Subset of factories that can produce finished product $p$, $(F_p \subset F; p \in P)$
$W_p$: Subset of warehouses that can store finished product $p$, $(W_p \subset W; p \in P)$
$C_p$: Subset of clients with demand of finished product $p$, $(C_p \subset C; p \in P)$
$T_{od}$: Subset of modes of transport available from origin $o$ to destination $d$, $(T_{od} \subset T)$

Parameters

Parameters of cost
$FC_{fq}$: Fixed cost of opening a factory $f$ with capacity level $q$, $(f \in F; q \in Q)$
$FC_{wq}$: Fixed cost of opening a warehouse $w$ with capacity level $q$, $(w \in W; q \in Q)$
$PC^{r}_{s}$: Cost of buying raw material $r$ with supplier $s$, $(s \in S; r \in R)$
$MC^{p}_{f}$: Cost of manufacturing product $p$ in factory $f$, $(f \in F; p \in P)$
$TC^{rt}_{od}$: Cost of transporting material $r$ from origin $o$ to destination $d$ with transport mode $t$, $((o, d) \in (S, F); r \in R; t \in T)$