A Surrogate Model-Based Hybrid Approach for Stochastic Robust Double Row Layout Problem

: The double row layout problem is to arrange a number of machines on both sides of a straight aisle so as to minimize the total material handling cost. Aiming at the random distribution of product demands, we study a stochastic robust double row layout problem (SR-DRLP). A mixed integer programming (MIP) model is established for SR-DRLP. A surrogate model is used to linearize the nonlinear term in the MIP to achieve a mixed integer linear programming model, which can be readily solved by an exact method to yield high-quality solutions (layouts) for small-scale SR-DRLPs. Furthermore, we propose a hybrid approach combining a local search and an exact approach (LS-EA) to solve large-scale SR-DRLPs. Firstly, a local search is designed to optimize the machine sequences on two rows and the clearance from the most left machine on row 1 to the left boundary. Then, the exact location of each machine is further optimized by an exact approach. The LS-EA is applied to six problem instances ranging from 8 to 50 machines. Experimental results show that the surrogate model is effective and LS-EA outperforms the comparison approaches.


Introduction
Research on the facility layout problem (FLP) usually involves the arrangement of machines in a plant, which is of great significance for improving production efficiency and reducing cost [1]. The three most commonly used material handling devices in FLP [2,3] are gantry robot, automated guided vehicle (AGV) and handling robot. Due to the outstanding performance of the AGV in terms of utilization and flexibility, it is commonly selected to link pairs of machines in FLP. To simplify the material handling, AGVs are restricted to move along a bi-directional straight line and machines are usually placed on one or both sides of the straight aisle. The double row layout problem (DRLP) is to locate machines on both sides of a straight aisle to minimize the material handling cost, and has a wide range of applications in semiconductor manufacturing [4] and LCD fabrication lines [5].
In DRLP, there are material interactions amongst machines on the same side and/or on different sides. Zuo et al. [4] showed that separating adjacent machines with a suitable clearance larger than the minimum could achieve a layout with lower material flow cost. Therefore, a solution to DRLP involves the machine sequence on each row, and the exact location of each machine on its assigned row (i.e., the extra clearance of each machine). That makes the DRLP more difficult to solve.
In FLPs with the aim of minimizing total material handling cost, the material flows amongst machines directly affect the performance of the layout. In practice, product demands dynamically change so as to meet the personalized needs of users, resulting in the material flows amongst machines changing over time. To our knowledge, there are only two studies on DRLP in dynamic production environments. Wang et al. [6] studied a dynamic double row layout problem, where multiple processing periods are considered and material flows amongst machines are fixed during a specific period but change over different periods. They sought to design an effective layout for each processing period to minimize the sum of the material handling cost and the rearrangement cost of machines relocated in consecutive periods. The dynamic layout problem usually involves considerable rearrangement and time costs due to production interruption, while a robust layout is able to deal with different processing periods well. Tang et al. [7] studied a robust double row layout problem that minimizes material handling cost and layout area simultaneously.
All existing studies on DRLP assume that material flows amongst machines are predefined and known. However, it is difficult to accurately forecast specific product demands, due to the uncertainty of market demands. According to the historical trend of product demands, employing a random variable to describe them is more realistic. The normal distribution is one of the most commonly used statistical distributions for this [8,9].
In this paper, we study a stochastic double row layout problem (SR-DRLP), which involves multiple processing periods, and product demands are independent normally distributed random variables. We focus on finding a robust layout which can handle different processing periods. Firstly, a mixed integer programming (MIP) model is established for SR-DRLP. To linearize the nonlinear term in the MIP, we build a linear model to fit the nonlinear term. A collection of training data is obtained by randomly generated layouts to the SR-DRLP, in order to train the linear model. Then, the linear model (termed surrogate model) is employed to replace the nonlinear term in order to make the MIP a mixed integer linear programming model, which can be utilized to find high-quality solutions to small-scale SR-DRLPs by an exact method. Then, a hybrid approach combining a local search and an exact approach (LS-EA) is proposed to solve large-scale SR-DRLPs. Firstly, the local search is designed to determine the sequence of all machines, the number of machines on row 1 (abbreviated as breakpoint) and the clearance from the left-most machine on row 1 to the left boundary (abbreviated as offset). Subsequently, utilizing the linearized MIP, the exact location of each machine is optimized by the exact approach.

Related Literature
FLP refers to determining the locations of machines in a plant, and plays an important role in manufacturing systems in terms of improving production efficiency and reducing operating costs [10]. Several recent reviews on FLP have been published [11,12]. There are two types of layout design closely related to DRLP, the parallel row ordering problem (PROP) and the space-free double-row facility layout problem (SF-DRLP).
Double row layout problem. In DRLP, a machine can be located anywhere on either of the two rows. Thus, there are three sub-problems in DRLP, i.e., the allocation of machines on the two rows, the sequence of machines on each row and the exact locations of these machines. Chung and Tanchoco [5] added more variables and constraints to the existing formulations of the single row layout problem and developed a mixed integer programming model for DRLP. In addition, a high-quality initial solution and corresponding upper bound of DRLP was provided by five proposed heuristic algorithms. However, the mixed integer programming model ignores the clearance constraint between adjacent machines, which may result in an illegal solution. A corrected formulation was proposed by Zhang and Murray [13]. Amaral [14] proposed a mixed integer programming formulation based on the α-incidence vectors for DRLP, where the implicit clearances are considered. This can provide a solution for the DRLP with up to 12 machines. Later, Secchin and Amaral [15] modified the mixed integer programming model in [14] and proposed a tighter model. This modified model can be utilized to solve the DRLP with up to 15 machines. Chae and Reganb [16] proposed a model for DRLP, which is modified from the model in [15]. The model greatly reduces the calculation time and solves DRLP with 16 machines. Amaral [17] established a mixed integer programming model for DRLP, which includes valid inequalities and a symmetry-breaking constraint. The model is more intuitive for handling qualitative inputs. Dahlbeck [18] presented combinatorial lower bounds for DRLP, and combined these lower bounds with a new distance-based mixed integer linear programming model to improve the lower bounds. Amaral [19] established a mixed integer programming model for DRLP based on a linear extension of a partial order on the set of machines. Compared with previously published models in the literature, this mode has the least number of 0-1 variables. All the above studies seek to obtain the optimal solution to DRLP by establishing a mathematical programming model.
To obtain a satisfactory solution for large-scale DRLP, some heuristic algorithms have been proposed. Zuo et al. [4] studied an extended DRLP where nonzero aisle widths are allowed and two optimization objectives are considered (cost and layout area). Moreover, they proposed an approach combining multi-objective tabu search with linear programming for the extended DRLP. Gülşen et al. [20] studied a variant of DRLP, where multiple replicates of the same machine allow the product flows to split into multiple parallel flow lines. For this problem, they established a nonlinear mixed integer programming formulation and developed a hierarchical heuristic to determine machine placements and corresponding material flows. Guan [21] proposed a decomposition-based algorithm for DRLP. First, a local search was applied to optimize machine sequences on two rows and the difference between two starting abscissas. Then, the clearances between adjacent machines were optimized by a particle swarm optimization. Murray [22] studied the DRLP with asymmetric material flows and established a mixed integer linear programming model. The problem was solved by a combination of a constructive heuristic and a local search. Amaral [23] presented four variants of a two-phase algorithm for a DRLP, where the width of aisle between two rows is negligible and required minimal clearance between any two machines is a constant. First, an improvement heuristic was devised to optimize the machine sequence and the leftmost point of the arrangement at row 1. Then, the machine location was further optimized by a linear programming.
For the dynamic processing environment in practice, in our previous work we studied a dynamic DRLP, where a layout was designed for each processing period to minimize total material flow cost and rearrangement cost [6], and proposed a robust layout method for a robust DRLP [7]. In all the above studies on DRLP, the material flows amongst machines are predefined and known.
Parallel row ordering problem. The PROP is an extended version of the single-row layout problem (SRLP) and considers the arrangements of facilities along two parallel rows. In PROP, facilities with certain characteristics in common are placed on one row, and the remaining facilities are arranged on a parallel row. Moreover, the arrangements of facilities on both rows have a common left origin and there is no clearance allowed between adjacent facilities on the same row. The objective of PROP is to find a facility sequence on each row to minimize the total material handling cost. Amaral [24] presented a mixed integer programming model for PROP, which extends a mixed integer programming formulation of the SRLP. Since PROP has fewer constraints and binary variables than SRLP, the PROP can be solved faster. Yang et al. [25] proposed a mixed integer programming model for PROP, which modifies some constraints of the model in [24]. Gong et al. [26] studied a dynamic PROP considering variable inter-facility material flow. The objective is to minimize the sum of the material handling cost and the rearrangement cost. They developed a hybrid algorithm of harmony search and tabu search to solve larger-scale PROPs.
Space-free double-row facility layout problem. The PROP is further extended to a SF-DRLP (also called the corridor allocation problem, CAP), in which the facilities can be placed on any of two rows. The SF-DRLP does not restrict the row to which facilities are allocated, so that SF-DRLP is more complicated than PROP. Amaral [27] first investigated SF-DRLP and proposed a mixed integer programming formulation that can be solved in reasonable time for moderate-scale problem instances. Moreover, three heuristic procedures were presented to handle larger-scale problem instances. Ahonen [28] presented a tabu search and a simulated annealing algorithm for SF-DRLP to minimize the sum of weighted distances. Kalita et al. [29] applied a permutation-based genetic algorithm to handle SF-DRLP, with the objective of minimizing the total material handling cost and the length of the corridor. Kalita et al. [30] further studied a bi-objective SF-DRLP and proposed a permutation-based genetic algorithm incorporating a promising local search technique to solve SF-DRLP. Zhang et al. [31] established a mixed integer programming model for SF-DRLP, which considers the effect of the corridor width. Moreover, they proposed an improved scatter search algorithm for SR-DRLP, which includes several improvement mechanisms, such as the adoption of a simulated annealing operation, a dynamic reference set update method, and an improved subset generation. Fischer [32] presented a mixed integer linear programming formulation for SF-DRFLP to solve instances with 16 machines.

Problem Description
SR-DRLP involves producing a series of different types of products in multiple processing periods and determining the locations of machines participating in the production task on two rows R = {1, 2}. Let L = {1, 2, · · · , l}, T = {1, 2, · · · , t} and M = {1, 2, · · · , m}, respectively, represent the sets of product types, processing periods and machines, where l, t, and m denote the number of the product types, processing periods and machines, respectively. The demands of product p ∈ L in processing period c ∈ T are denoted by D cp . Assume that product demands D cp are independent and normally distributed with known expected value E(D cp ) and variance Var(D cp ). Each machine i ∈ M is a rectangle with fixed width w i and depth d i . The processing technology of the product p is represented by B ijp , such that B ijp = 1 if product p ∈ L is processed on machine i ∈ M and then immediately processed on machine j ∈ M\{i}, otherwise B ijp = 0. Each pair of machines on the same row i ∈ M and j ∈ M\{i} must satisfy a minimum clearance constraint a ij . In addition to the minimum clearance a ij , there may be an extra clearance e j either between machines i ∈ M and j ∈ M\{i} or between machine i ∈ M and the left side of the layout boundary. The load/unload station of the machine is located at the center of its predefined side. The width of aisle between two rows is represented C. The distance of each pair of machines is measured by their rectilinear distance between their load/unload stations. The optimization objective is to determine the location of each machine to minimize the total material flow cost in all the t periods. Figure 1 shows an example of layout for SR-DRLP, which includes six machines and two rows. In this case, machines 1, 2 and 3 (4, 5 and 6) are location on row 1 (row 2). The width and depth of machine 6 are w 6 and d 6 , respectively. Machines 1 and 2 are on the same row and they are separated by a minimum clearance of them a 1,2 and an extra clearance e 2 . The distance between machine 6 and machine 2 is L 2,6 = x 6 − x 2 + C, where x 6 and x 2 are the abscissas of machines 6 and 2, respectively.  The decision variables of SR-DRLP are listed in Table 1. , Auxiliary continuous decision variables are utilized to determine the distance between machines The decision variables of SR-DRLP are listed in Table 1.
Auxiliary continuous decision variables are utilized to determine the distance between machines i ∈ M and j ∈ M\{i}. q ir Binary variable, such that q ir = 1 if machines i ∈ M is placed on the row r, otherwise q ir = 0.
s ij Binary variable, such that s ij = 1 if both machines i ∈ M and j ∈ M\{i} are placed on the same row, otherwise s ij = 0.
z ijr Binary variable, such that z ijr = 1 if both machines i ∈ M and j ∈ M\{i} are placed on the same row r ∈ R and i is placed to the left of j, otherwise z ijr = 0.
Derakhshan et al. [8] established a mixed integer programming model for an unequalarea stochastic facility layout problem. Based on the work in [8], we formulate a mixed integer programming model for SR-DRLP.
The material handling cost of SR-DRLP is given by Equation (1), where the Z 1-α is the standard normal Z value at confidence level 1-α. Constraint (2) is employed to determine the distance between a pair of machines. Constraint (3) ensures that one, and only one, row is selected for each machine to be located. The minimum clearance constraint between each pair of machines on the same row is defined by Equation (4). U is a sufficiently large constant. Constraints (5) and (6) relate binary decision variables z ijr and q ir . Constraint (7) is used to judge whether any two machines are on the same row.
The layout (solution) to SR-DRLP is symmetry, i.e., rotating a layout by 180 • or swapping the machine sequences on two rows can result in a layout with the same material handling cost as the original layout. There are three equivalent layouts for each layout, as shown in Figure 2. If this symmetry can be broken, the size of the decision space can be greatly reduced. Constraints (8) and (9) are utilized to break the symmetry, i.e., we add the following constraints: machine 1 is placed on row 1 and the abscissa of machine 1 is smaller than the abscissa of machine 2. Equations (10) and (11) define the ranges of decision variables. ping the machine sequences on two rows can result in a layout with the same material handling cost as the original layout. There are three equivalent layouts for each layout, as shown in Figure 2. If this symmetry can be broken, the size of the decision space can be greatly reduced. Constraints (8) and (9) are utilized to break the symmetry, i.e., we add the following constraints: machine 1 is placed on row 1 and the abscissa of machine 1 is smaller than the abscissa of machine 2. Equations (10) and (11) define the ranges of decision variables.

Proposed Approach
For the SR-DRLP, a feasible layout contains the machine sequence on each row and the exact location of each machine on its row. SR-DRLP is NP-hard due to involving the machine sorting task. Local search is an effective heuristic approach for combinatorial optimization problems and there are many studies [33][34][35] that apply this to solve FLPs. In this paper, we propose a hybrid approach combining a local search and an exact approach (LS-EA) to solve SR-DRLP.
LS-EA includes two stages: (1) a local search is proposed to determine the sequence of all machines, breakpoint and offset for the SR-DRLP; (2) EA is employed to optimize the exact locations of machines of the solution found by the LS, and the optimal extra clearance of each machine is identified.

Local Search Algorithm
Amaral [23] presented four heuristics for DRLP without minimum clearance between any two adjacent machines. The fourth heuristic optimized the machine sequence, the number of machines on row 1 and the leftmost point of the arrangement at row 1. In this heuristic, each feasible solution corresponds to a layout, where there is no clearance between any adjacent machines and the leftmost point of the arrangement at row 2 is fixed

Proposed Approach
For the SR-DRLP, a feasible layout contains the machine sequence on each row and the exact location of each machine on its row. SR-DRLP is NP-hard due to involving the machine sorting task. Local search is an effective heuristic approach for combinatorial optimization problems and there are many studies [33][34][35] that apply this to solve FLPs. In this paper, we propose a hybrid approach combining a local search and an exact approach (LS-EA) to solve SR-DRLP.
LS-EA includes two stages: (1) a local search is proposed to determine the sequence of all machines, breakpoint and offset for the SR-DRLP; (2) EA is employed to optimize the exact locations of machines of the solution found by the LS, and the optimal extra clearance of each machine is identified.

Local Search Algorithm
Amaral [23] presented four heuristics for DRLP without minimum clearance between any two adjacent machines. The fourth heuristic optimized the machine sequence, the number of machines on row 1 and the leftmost point of the arrangement at row 1. In this heuristic, each feasible solution corresponds to a layout, where there is no clearance between any adjacent machines and the leftmost point of the arrangement at row 2 is fixed at zero. Based on this heuristic, we devise a local search for SR-DRLP with the following improvements: (1) Let the range of the offset be an equally spaced vector with values from 0 to maxO and the range of the breakpoint be in [ m/2 , maxB], where maxO and maxB are predefined parameters corresponding to the maximum value of offset and breakpoint, respectively. Then, all possible combinations of offset and breakpoint are traversed. For each combination, a 2-opt local search with the first improvement search strategy is employed to improve the machine sequence. The final solution is the best solution amongst all solutions found by 2-opt local search for all possible combinations of offset and breakpoint. Traversing all combinations aims to avoid falling into the local optima during the search of offset and breakpoint.
(2) In the 2-opt local search, we add an additional step to improve the global search capability. That is, in each iteration, we first generate all neighbor solutions by swapping the positions of machines, and then randomly shuffle the order of neighbor solutions.

Encoding and Decoding
In the LS, a feasible solution of SR-DRLP is represented by a coding (π, b, o), where π denotes the sequence of all machines; b is the breakpoint that identifies the number of machines on row 1 such that (m-b) machines are on row 2; and o is the offset that indicates the clearance from the most left machine on row 1 to the left boundary (i.e., the extra clearance of the leftmost machine on row 1). For a feasible solution in the LS, the extra clearances of all machines are zero except the leftmost machine on row 1.
The decoding procedure is to convert a coding (π, b, o) into a layout. To clearly describe the decoding process, we take a coding (π = {1, 2, 3, 4, 5, 6}, b = 3, o = 2) as an example to illustrate the decoding process, as shown in Figure 3. As b is 3, the first three machines {1, 2, 3} in π are placed on row 1 from left to right one by one and there is no extra clearance for each machine, i.e., adjacent machines on the same row are separated by their minimum clearance. The next three machines {4, 5, 6} are places on row 2 in the same manner. The extra clearance of the left most machine on row 1 is the value of o, i.e., e 1 = 2.
to improve the machine sequence. The final solution is the best solution amongst all solu-tions found by 2-opt local search for all possible combinations of offset and breakpoint. Traversing all combinations aims to avoid falling into the local optima during the search of offset and breakpoint.
(2) In the 2-opt local search, we add an additional step to improve the global search capability. That is, in each iteration, we first generate all neighbor solutions by swapping the positions of machines, and then randomly shuffle the order of neighbor solutions.

Encoding and Decoding
In the LS, a feasible solution of SR-DRLP is represented by a coding (π, b, o), where π denotes the sequence of all machines; b is the breakpoint that identifies the number of machines on row 1 such that (m-b) machines are on row 2; and o is the offset that indicates the clearance from the most left machine on row 1 to the left boundary (i.e., the extra clearance of the leftmost machine on row 1). For a feasible solution in the LS, the extra clearances of all machines are zero except the leftmost machine on row 1.
The decoding procedure is to convert a coding (π, b, o) into a layout. To clearly describe the decoding process, we take a coding (π = {1, 2, 3, 4, 5, 6}, b = 3, o = 2) as an example to illustrate the decoding process, as shown in

Local Search Algorithm
The pseudo-code of the LS is presented in Algorithm 1. The LS starts with an initial solution y = (π, m/2, 0), where π is a random permutation of all machines. The material handling cost of y, f(y), is computed by Equation (1)

Local Search Algorithm
The pseudo-code of the LS is presented in Algorithm 1. The LS starts with an initial solution y = (π, m/2, 0), where π is a random permutation of all machines. The material handling cost of y, f (y), is computed by Equation (1). For each candidate combination of b and o, the breakpoint and offset in y are first updated to b and o, respectively. Then y is iteratively optimized and the best solution obtained for the combination of b and o is saved. In each iteration, an auxiliary procedure (Algorithm 2) is introduced to structure a perturbation solution y for y. Subsequently, a local optimal solution for y is obtained by

The Inversion Operation
To avoid falling into the local optima during the search of machine sequences, an inversion operation [23] is applied to generate a perturbation solution. The inversion operation is described in Algorithm 2. We randomly generate uniformly distributed random integers i and num in [1, m] and [1 + m/8 , m/4 ], respectively. The i and num represent the starting position in the machine sequence and the number of machines in the sequence, respectively. To avoid inverting too many machines which would change the original solution too much, num should be a small number. The num is reset as a random integer in [3,4] if the number of machines (m) is less than 20. The ending position of the machine sequence to be inverted, denoted by j, is as follows: The num/2 iterations are conducted to reverse the order of machines. In each iteration, the machines at positions i and j are swapped until the order of all the num machines is reversed. If m< 20 then num is set as a random integer in [3,4]; Swap the two machines at positions i and j of π;

Exact Approach
The solution found by LS, y = (π, b, o), does not consider the extra clearances of machines except the leftmost machine on row 1. To achieve the optimal extra clearances of all machines, an exact approach is used to optimize the extra clearances. Since there is a square root term in the objective function in Equation (1), Equation (1) is non-convex. In this case, an exact approach cannot ensure finding the optimal solution for the MIP in Section 3. Thus, we use a linear model to fit the nonlinear term in Equation (1) in order to achieve a mixed integer linear programming model, which is used to identify the optimal extra clearances of all machines.

Surrogate Model
Naslund's approximation in [36] can linearize the nonlinear function in Equation (1), such that Equation (1) can be substituted with the following formula. where Through experiments, we found that Naslund's approximation causes a large mean absolute percentage error (MAPE) [37]. The detailed description of experiments is given in Section 5.3. In this paper, we propose a surrogate model to replace the nonlinear function in Equation (1). For a SR-DRLP, z 1−α , β ijp and Var(D cp ) are all known parameters. The nonconvex term F 1 can be regarded as the Euclidean distance amongst machines weighted by these known parameters. For FLPs, the linear Manhattan distance and Euclidean distance are two commonly used measurements of distance between a pair of machines. Inspired by this idea, we use the Manhattan distance F 2 to fit the Euclidean distance F 1 .
For a SR-DRLP, we randomly generate 10,000 layouts and Figure 4 shows the values of F 1 and F 2 of these layouts. We find that F 1 and F 2 satisfy certain linear relationship. Therefore, a linear model is applied to fit the relationship of F 1 and F 2 , where a and b are the fitting coefficients (slope and intercept) of the linear model and can be readily obtained by MATLAB toolbox. The fitting results (red points) are shown in Figure 4. Thus, the objective function Equation (1) can be approximated by a linear formula in Equation (17).
be readily obtained by MATLAB toolbox. The fitting results (red points) are shown in Figure 4. Thus, the objective function Equation (1) can be approximated by a linear formula in Equation (17).

Optimize the Exact Location of Each Machine by Exact Approach
The pseudo-code of EA is presented in Algorithm 4. The solution y = (π, b, o) found by LS can determine the machine sequences on two rows. Thus, all binary variables sij, qij and zijr in the MIP of SR-DRLP in Section 3 can be fixed by π and b, and objective function Equation (1)

Optimize the Exact Location of Each Machine by Exact Approach
The pseudo-code of EA is presented in Algorithm 4. The solution y = (π, b, o) found by LS can determine the machine sequences on two rows. Thus, all binary variables s ij , q ij and z ijr in the MIP of SR-DRLP in Section 3 can be fixed by π and b, and objective function Equation (1) is replaced with the linear formula Equation (17). The constraints related to those binary decision variables are removed from the MIP model. Then, the model involves only continuous decision variables x i , v + ij and v − ij , and therefore is a linear programming model (denoted by LP 1 ), which can be readily solved by CPLEX (a popular mathematical programming solver) to obtain the optimal abscissa of each machine, i.e., exact location of machine x i . The extra clearances of machines o can be identified by abscissas of machines. Thus, a new solution is produced y = (π, b, o ). Because the surrogate model is in LP 1 , the material handling cost of y needs to be recalculated by Equation (1). The solution y is compared against y , and the better one is kept as the final solution (layout) to SR-DRLP. Input: A solution y = (π, b, o).

Begin:
Utilizing π and b of y to fix the binary variables in the model in Section 3, a linear programming model (LP 1 ) is obtained, where objective function Equation (1) is replaced with the linear formula in Equation (17); LP 1 is solve by CPLEX and a solution, y = (π, b, o ), is produced; Equation (1) is used to calculate the material handling cost of y , f (y ); If f (y ) < f (y) then y = y ;

End
Return y.

Experimental Results
To verify the effectiveness of LS-EA, it is applied to a number of problem instances and compared against an exact method (EM) and a two-stage algorithm combining a heuristic and linear programming (HS-LP) [23]. All of LS-EA, EM and HS-LP are coded in C++ and run on a computer with Intel Core i7 3.2 GHz CPU, 8 GB RAM and Windows 10 operation system. The exact approach (CPLEX 12.9.0) is also performed on the PC.

Problem Instances
There are no problem instances for SR-DRLP since the problem has not been studied previously. Hence, six problem instances with different scales are randomly generated, namely P8, P10, P12, P20, P30, P50. The number in each problem indicates the number of machines involved in that instance.
SR-DRLP has some problem parameters, i.e., the demand of product p in processing period c, D cp ; the number of machines, m; processing period, t; product type, l; the width of aisle, C; the width (depth) of machine i, w i (d i ); and routes of products. For each problem instance, D cp is a normally distributed random number with a variance E(D cp ) and an expected value Var(D cp ). The ranges of E(D cp ) and Var(D cp ) are given in Table 2. C is set to be 1. The w i and d i follow unif [8,20]. For each product type, the percentage of machines visited follows unif [0.25, 0.75], thus that a route for the product type is defined. The values of m, t and l for each problem instance are listed in Table 2. For all problem instances, the confidence level is set as 0.95, i.e., 1 − α = 0.95.

Algorithm Parameters and Comparative Approaches
In [23], the maximum number of machines on row 1 is set to m/2 + 4. In LS-EA, the maximum number of breakpoint (maxB) is also set to m/2 + 4. For LS-EA, the greater the values of MaxIter and maxO are, the better the layout (the larger the computational time) is. After brief experiments, we find that a good solution could be obtained by setting the values of MaxIter and maxO to 10 and 3, respectively. Thus, MaxIter and maxO are set to 10 and 3 for all problem instances, respectively.
An EM is also utilized to solve SR-DRLP. In Section 3, MIP is established for SR-DRLP. The surrogate model is employed to replace the square root term F 1 in Equation (1) of the MIP to achieve a mixed integer linear programming model LP 2 with the objective function Equation (17). LP 2 is solved by CPLEX to obtain a solution and its material handling cost is calculated by Equation (1). For EM, the maximum runtime is restricted as 3600 s for each instance.
There have been no studies for SR-DRLP. A state-of-the-art algorithm was proposed by Amaral [23] to solve a DRLP without considering the minimum clearance amongst machines. We adapt this algorithm to solve SR-DRLP, termed HS-LP. In HS-LP, the fourth heuristic (Heuristic 4) in [23] is employed to optimize the machine sequence, breakpoint and offset. Since the objective function of SR-DRLP is non-convex, the EA, namely Algorithm 4 in Section 4.2.2, replaces the linear programming in [23] to optimize the exact location of each machine. To make a fair comparison, HS-LP is given slightly larger runtime than LS-EA, as shown in Table 3. HS-LP has two parameters, ITER and MAX_k, which were suggested to set the same value. For a problem instance, the ITER decreases as MAX_k increases if the runtime is fixed. We observe that the values of ITER and MAX_k are close when MAX_k is respectively set to 15, 15, 15, 15, 10 and 10 for problem instances P8, P10, P12, P20, P30, P50. Thus, MAX_k is set to those values for each problem instance.

Experimental Results and Analysis
The effectiveness of the surrogate model. A set of experiments are designed to evaluate the surrogate model. The expected value of the product demand is given by a fixed interval, E(Dcp) ∈ [50, 60] and ten large variance intervals of the product demand are given, Var(Dcp) ∈ { [1,20], [20,50] (1), the objective function value based on Naslund's approximation method (MHC ) by Equation (14), the value of square root term F 1 by Equation (13) and linear Manhattan distance F 2 by Equation (15). The top 8000 layouts are used as the training set and the remaining 2000 layouts as the test set. Each element in the training set has a pair of F 1 and F 2 values and MATLAB toolbox is employed to learn the coefficients a and b in Equation (16). Using a and b (the surrogate model), the fitting objective function values (MHC ) of each element in the test set are calculated by Equation (17). Based on the test set, the mean absolute percentage error (MAPE) [37] of Naslund's approximation and surrogate model are calculated by Equation (18). The average MAPE value of all the 30 problem instances is shown in Figure 5.
where G k is the fitting objective function value (MHC or MHC ) of the kth layout; K = 2000 is the number of layouts in the test set; and G k is the real objective function value (MHC) of the kth layout. It can be seen that when the expected value interval is fixed, as the variance interval increases, MAPEs of both Naslund's approximation and the surrogate model increase.
The average values of MAPEs of the surrogate model are obviously lower than those of Naslund's approximation. Even when the value of variance is much larger than expected value, Var(D cp ) ∈ [10,000, 100,000], the surrogate model still maintains a lower average value of MAPE (0.62%).
where ′ k G is the fitting objective function value ( ′ MHC or ′′ MHC ) of the kth layout; K = 2000 is the number of layouts in the test set; and Gk is the real objective function value (MHC) of the kth layout. Comparation with other approaches. LS-EA and HS-LP are independently performed for 20 runs for each problem instance with the parameter settings in Section 5.2. Table 3 presents the minimal and average material handling cost, and average runtime over the 20 runs. The material handling cost, runtime and optimality gap of EM are also provided in Table 3. The best results are highlighted in boldface.
For problem instance P8, it only takes a short time for EM to obtain a feasible solution. Although we cannot guarantee that this solution is the optimal one, it is very close to the optimal, since the average value of MAPE of the surrogate model is very small. Each run of LS-EA achieves the same solution as found by EM. This solution is represented in Fig  Comparation with other approaches. LS-EA and HS-LP are independently performed for 20 runs for each problem instance with the parameter settings in Section 5.2. Table 3 presents the minimal and average material handling cost, and average runtime over the 20 runs. The material handling cost, runtime and optimality gap of EM are also provided in Table 3. The best results are highlighted in boldface.
For problem instance P8, it only takes a short time for EM to obtain a feasible solution. Although we cannot guarantee that this solution is the optimal one, it is very close to the optimal, since the average value of MAPE of the surrogate model is very small. Each run of LS-EA achieves the same solution as found by EM. This solution is represented in Figure 6a. We take P8 as an example to illustrate its decision variables. As the problem size increases, the computational time required for EM increases greatly. The EM can only find the optimal solutions of LP 2 for P8 and P10. For other problem instances, the EM is able to find the feasible solutions within the given runtime. With the increase of scale of problem instances, LS-EA and HS-LP have significant advantages over EM. Although HS-LP is given longer runtime than LS-EA, LS-EA outperforms HS-LP in terms of average material handling cost on all problem instances except for P8. The best solutions (layouts) found by LS-EA for all problem instances are shown in Figure 6.
From Table 3, we can see that, compared with HS-LP, LS-EA reduces the minimum material handling cost most for P10 (reduces by 3.1%). The reason may be that a large offset is required by the best solution of P10. In each iteration of HS-LP, the value of offset is changed by adding a random number in {−0.5, 0, 0.5} to the current value of the offset. This local search strategy may lead to the local optima. That is, if the material handling cost of the current solution is not reduced after the offset value is added (or subtracted) by 0.5, a larger (or smaller) offset value is not considered. As the problem size increases, the computational time required for EM increases greatly. The EM can only find the optimal solutions of LP2 for P8 and P10. For other problem instances, the EM is able to find the feasible solutions within the given runtime. With the increase of scale of problem instances, LS-EA and HS-LP have significant advantages over EM. Although HS-LP is given longer runtime than LS-EA, LS-EA outperforms HS-LP in terms of average material handling cost on all problem instances except for P8. The best solutions (layouts) found by LS-EA for all problem instances are shown in Figure 6.
From Table 3, we can see that, compared with HS-LP, LS-EA reduces the minimum material handling cost most for P10 (reduces by 3.1%). The reason may be that a large offset is required by the best solution of P10. In each iteration of HS-LP, the value of offset is changed by adding a random number in {−0.5, 0, 0.5} to the current value of the offset. This local search strategy may lead to the local optima. That is, if the material handling To verify this speculation, the maxO in LS-EA is set to 20 to make LS-EA experience a larger runtime. MAX_k in HS-LP is set to 30 to ensure that ITER and MAX_k have similar values. Table 4 presents experimental results of LS-EA and HS_LP for problem instance P10. LS-EA can find the same solution as EM in each run when giving a larger maxO value. However, HS-LP cannot improve the solution quality under longer runtime. Figure 7 shows the best solution found by LS-EA and HS-LP for P10. We can see that the offset of the solution obtained by HS-LP is much smaller than that of LS-EA. This means that for some specific problem instances whose optimal solutions have large offsets, LS-EA can find a much better solution than HS-LP by setting a suitable value of maxO. ilar values. Table 4 presents experimental results of LS-EA and HS_LP for problem instance P10. LS-EA can find the same solution as EM in each run when giving a larger maxO value. However, HS-LP cannot improve the solution quality under longer runtime. Figure  7 shows the best solution found by LS-EA and HS-LP for P10. We can see that the offset of the solution obtained by HS-LP is much smaller than that of LS-EA. This means that for some specific problem instances whose optimal solutions have large offsets, LS-EA can find a much better solution than HS-LP by setting a suitable value of maxO.   Table 5 shows the maximum and average material handling cost reduction of solutions obtained by EA, and the average runtime of EA. We can see that the maximum reduction on material handling cost by EA is from 0.14% to 6.18% while the runtime of EA is tiny, ranging from 0.004 s to 0.837 s.

Statistical Significance
To evaluate whether there is a significant advantage of LS-EA compared to HS-LP, the nonparametric statistical tests, Sign test and Wilcoxon signed ranks test are performed. The null hypothesis is defined so that the performance of LS-EA and HS-LP shows no difference and the statistical significance level is set to 0.05. In the experiment shown in Table 3, LE-EA and HS-LP obtain 120 solutions for all six problem instances. Based on the material handling cost of these solutions, the p-values of Sign test and Wilcoxon signed ranks test are calculated by a statistical software, SPSS. The results of two tests are summarized in Table 6. It can be seen that the LS-EA has more Wins than HS-LP and both of   Table 5 shows the maximum and average material handling cost reduction of solutions obtained by EA, and the average runtime of EA. We can see that the maximum reduction on material handling cost by EA is from 0.14% to 6.18% while the runtime of EA is tiny, ranging from 0.004 s to 0.837 s.

Statistical Significance
To evaluate whether there is a significant advantage of LS-EA compared to HS-LP, the nonparametric statistical tests, Sign test and Wilcoxon signed ranks test are performed. The null hypothesis is defined so that the performance of LS-EA and HS-LP shows no difference and the statistical significance level is set to 0.05. In the experiment shown in Table 3, LE-EA and HS-LP obtain 120 solutions for all six problem instances. Based on the material handling cost of these solutions, the p-values of Sign test and Wilcoxon signed ranks test are calculated by a statistical software, SPSS. The results of two tests are summarized in Table 6. It can be seen that the LS-EA has more Wins than HS-LP and both of the p-values of two tests are less than the significance level (0.05). This means that the null-hypothesis is rejected and LS-EA significantly outperforms HS-LP.

Conclusions
In this paper, we study an extended DRLP, where the produce demands are independent and normally distributed random variables with known expected value and variance, termed as SR-DRLP. A MIP for SR-DRLP is established. To linearize the nonlinear term in the MIP, a surrogate model is used to replace the nonlinear term to make the MIP become a mixed integer linear programming model. A high-quality solution to problem instances with no more than ten machines can be obtained by solving the linear programming model by an exact approach. A two-stage hybrid approach combining a local search and an exact approach (LS-EA) is proposed to solve large-scale problem instances. First, the LS is employed to optimize the machine sequences on two rows and the extra clearance of the leftmost machine on row 1. Then, the EA is utilized to determine the optimal extra clearances of all machines.
Experimental results show that the average absolute percentage error of the surrogate model is much smaller than that of Naslund's approximation. LS-EA can find the same solution as EM for small-scale instances and LS-EA significantly outperforms HS-LP. Specifically, for problem instances whose optimal solutions require larger offset values, LS-EA can obtain solutions much better than HS-LP.
Future research directions include considering more optimization objectives (e.g., layout area and closeness of facilities) and extending the SR-DRLP to a stochastic robust multi-row layout problem.