Solving the Manufacturing Cell Design Problem through an Autonomous Water Cycle Algorithm

Metaheuristics are multi-purpose problem solvers devised particularly to tackle large instances of complex optimization problems. However, in spite of their relevance in the optimization world, designing and implementing them properly so that they reach optimal solutions is not a simple task. Metaheuristics require an initial parameter configuration, which is critical for the efficient exploration and exploitation of the search space and, therefore, for effectively finding high-quality solutions. In this paper, the authors propose a variation of the water cycle inspired metaheuristic capable of automatically adjusting its population size parameter by using the autonomous search paradigm. The goal of the proposal is to explore and exploit promising regions of the search space to converge rapidly to optimal solutions. To validate the proposal, the authors tested 160 instances of the manufacturing cell design problem, a relevant problem for the industry whose objective is to minimize the number of movements and exchanges of parts between organizational elements called cells. The experimental analysis shows that the proposal performs similarly to the default approach, but without being specifically configured for solving the problem.


Introduction
Manufacturing plants worldwide are usually structured into manufacturing entities composed of machines, each processing a specific part of a product. This organization presents a problem for most companies because the output of one machine is generally the input of another, which may be located far away, so that moving parts between them reduces the efficiency of the production process. This relevant problem for the industry is a classical optimization problem defined by Flanders [1] and called the manufacturing cell design problem, whose objective is to minimize the number of movements and exchanges of parts between groups of machines, which are called cells.

Related Work
Cellular manufacturing systems are widely considered in the industry due to their economic benefits; indeed, preliminary investigations in this context date back to 1960 with the work of Burbidge [12] on production flow analysis. From this point, mathematical programming was strongly involved in solving the problem. Several examples in this line can be found, from classic linear programming to more advanced goal programming procedures [11][13][14][15][16][17][18]. Techniques derived from mathematical programming and artificial intelligence, such as constraint programming, were also reported [2,3]. More recently, the room for metaheuristics in this area has been growing, since they are more appropriate when problem instances are intractable with exact methods because of computing time restrictions. In this line, some authors applied classic metaheuristics such as tabu search to different variants of the problem [19][20][21][22][23]. Genetic algorithms (GAs) are also well represented. For instance, a bi-criteria model for solving cell formation problems was presented in [24]. A similar approach was reported in [25], but involving three objective functions. Alternative routings for the part flow were incorporated in [26,27]. In [28], another GA was proposed, mostly oriented to minimizing the inter-cell flow cost rather than the number of inter-cell movements. Industrial cases from the automobile and steel industries were solved with GAs in [29,30]. Hybrid GAs, as well as variants such as the predator-prey GA, were also explored [31][32][33][34]. Classic simulated annealing, differential evolution, scatter search, and particle swarm optimization algorithms also appear in the literature [35][36][37][38].
From the previous analysis, the authors conclude that modern swarm intelligence metaheuristics are the most widely applied techniques for solving the problem in recent years. The following works can be cited, among many others. In [39], a migrating bird algorithm was applied, which was later parallelized in [40]. The artificial fish swarm algorithm and the shuffled frog leaping algorithm were applied in [41,42]. The bat algorithm and its autonomous version were applied in [43,44], respectively. The firefly algorithm was considered in [45,46], cat swarm optimization in [47], and the flower pollination algorithm in [48]. The black hole algorithm has been successfully applied to fine-tune a machine learning approach in [49]. Finally, an Egyptian vulture optimization algorithm was reported in [50].
As introduced before, although metaheuristics are widely considered in the literature for solving complex optimization problems, their usage is limited by the need to configure parameters, which strongly affect how the search is performed. This situation implies that the parameter-fitting problem of the algorithm must be solved first. However, finding the optimal configuration is unlikely because of the number of possible combinations, which means that the performance of the algorithm could be biased by an inadequate configuration. One solution could be to consider metaheuristics with the fewest possible parameters. However, parameters usually give metaheuristics the capacity to provide good solutions over a wide range of instances and problems, i.e., adaptability. Another possibility is the one considered in this paper, where the authors study how to provide autonomous adaptability during solving time. This is a novel concept that has not been extensively considered in the literature. For instance, the works in [51][52][53] define an autonomous strategy for guiding specific metaheuristics. On this basis, the authors study how to provide autonomous adaptability to the water cycle algorithm, a metaheuristic that has been successfully applied to several optimization problems [54][55][56][57]. As far as the authors know, this is the first work in the field providing adaptability to the water cycle algorithm while solving the cell design problem.

The Manufacturing Cell Design Problem
The cellular manufacturing strategy proposed by Flanders [1] promotes the separation of the machines involved in a production plant by following a specific strategy. The idea is to group parts of similar functions, geometry or fabrication process into families that are processed in the same section, and thus creating highly independent areas called cells. Some advantages of this production strategy include reduction of cost and material-handling time, labor, and paperwork, a decrease of in-process inventories, shortening delivery time, and an increase of machine utilization and production control. Further analysis of this technology may be found in [58][59][60].

Problem Statement
To find the optimal design of the production plant, where the inter-cell part exchange is minimized, the problem is schematized into the part-machine matrix, where each coordinate shows which machine processes a particular part. Through the reorganization of the rows and columns of the matrix different configurations can be tested. From this initial matrix, two others are derived: the machine-cell and the part-cell matrices, representing the cell that currently allocates the machines and parts, respectively.
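The idea in this optimization problem is to minimize the so-called exceptional elements, which are parts that move from one cell to another to satisfy the production workflow [11]. The mathematical model representing this problem is described as follows:
• Parameters:
- $x_{ij}$: the $(i, j)$ element in the machine-part incidence matrix, where $i \in \{1, \ldots, M\}$ is the machine number and $j \in \{1, \ldots, P\}$ is the part number, meaning
$$x_{ij} = \begin{cases} 1 & \text{if machine } i \text{ processes part } j \\ 0 & \text{otherwise} \end{cases} \tag{1}$$
• Decision variables:
- $y_{ik}$: the $(i, k)$ element in the machine-cell incidence matrix, where $k \in \{1, \ldots, C\}$ is the cell number, meaning
$$y_{ik} = \begin{cases} 1 & \text{if machine } i \text{ belongs to cell } k \\ 0 & \text{otherwise} \end{cases} \tag{2}$$
- $z_{jk}$: the $(j, k)$ element in the part-cell incidence matrix, meaning
$$z_{jk} = \begin{cases} 1 & \text{if part } j \text{ belongs to cell } k \\ 0 & \text{otherwise} \end{cases} \tag{3}$$
The goal of the optimization problem is given by
$$\min \sum_{k=1}^{C} \sum_{i=1}^{M} \sum_{j=1}^{P} x_{ij} \, z_{jk} \, (1 - y_{ik}) \tag{4}$$
subject to
$$\sum_{k=1}^{C} y_{ik} = 1 \quad \forall i \tag{5}$$
$$\sum_{k=1}^{C} z_{jk} = 1 \quad \forall j \tag{6}$$
$$\sum_{i=1}^{M} y_{ik} \le M_{max} \quad \forall k \tag{7}$$
where $M_{max}$ is the maximum number of machines allowed in a cell.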
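The objective can be illustrated with a few lines of Python. This is a minimal sketch, assuming the usual form of the cost in which an entry counts as exceptional when machine i processes part j and the two are assigned to different cells; the function name `exceptional_elements` and the small instance are hypothetical and are not taken from the paper.

```python
def exceptional_elements(x, y, z):
    """Count parts that are processed by a machine outside their own cell.

    x[i][j] = 1 if machine i processes part j
    y[i][k] = 1 if machine i is assigned to cell k
    z[j][k] = 1 if part j is assigned to cell k
    """
    machines, parts, cells = len(x), len(x[0]), len(y[0])
    return sum(
        x[i][j] * z[j][k] * (1 - y[i][k])
        for k in range(cells)
        for i in range(machines)
        for j in range(parts)
    )


# Hypothetical instance with 3 machines, 3 parts and 2 cells.
x = [[1, 1, 0],
     [0, 1, 1],
     [0, 0, 1]]
y = [[1, 0], [0, 1], [0, 1]]  # machine 0 in cell 0, machines 1 and 2 in cell 1
z = [[1, 0], [1, 0], [0, 1]]  # parts 0 and 1 in cell 0, part 2 in cell 1
cost = exceptional_elements(x, y, z)
```

In this toy instance the only exceptional element is part 1, which sits in cell 0 but is also processed by machine 1 in cell 1.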

Problem Example
An example of the manufacturing cell design problem is included in this section to clarify the previous definition. To this end, the authors consider the machine-part matrix in Table 1, which determines how machines and parts are related in a given industrial process. According to this table, there are 10 machines ($M = 10$) and 10 parts ($P = 10$). Suppose that, for the optimized design, there are 3 available cells ($C = 3$) to organize the industrial process and that a cell may contain at most 4 machines ($M_{max} = 4$). With this information, the machine-cell and part-cell matrices are generated by randomly assigning machines and parts to cells. These two matrices are shown in Table 2. From these two matrices, the objective of the optimization problem is to reorganize machines and parts so that inter-cell movements are minimized. A possible solution to this problem is shown in Table 3, where machines {A, E, F, H} and parts {3, 7, 10} are assigned to cell 1, machines {B, C, I} and parts {1, 2, 6, 9} are assigned to cell 2, and machines {D, G, J} and parts {4, 5, 8} are assigned to cell 3. The cost of this solution is 0 because there are no inter-cell movements, meaning that this is an optimal solution to the problem.
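The feasibility of the assignment described for Table 3 can be checked mechanically. The sketch below only uses the cell memberships quoted in the text; the helper `is_feasible` is a hypothetical name, and the check covers the assignment and capacity conditions only, since the contents of Table 1 are needed to recompute the cost.

```python
M_MAX = 4  # maximum number of machines per cell, as stated in the example

machine_cells = {1: {"A", "E", "F", "H"}, 2: {"B", "C", "I"}, 3: {"D", "G", "J"}}
part_cells = {1: {3, 7, 10}, 2: {1, 2, 6, 9}, 3: {4, 5, 8}}


def is_feasible(cells, universe, max_size=None):
    """True if every item is in exactly one cell and no cell exceeds max_size."""
    assigned = [item for members in cells.values() for item in members]
    each_once = len(assigned) == len(universe) and set(assigned) == set(universe)
    within_cap = max_size is None or all(len(m) <= max_size for m in cells.values())
    return each_once and within_cap


machines_ok = is_feasible(machine_cells, set("ABCDEFGHIJ"), M_MAX)
parts_ok = is_feasible(part_cells, set(range(1, 11)))
```

Both checks succeed for this assignment: each of the 10 machines and 10 parts appears in exactly one cell, and the largest cell holds 4 machines.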

The Water Cycle Inspired Solving-Method
As is well known, water exists on Earth in three different states: solid (ice, snow), liquid (water, sea, raindrops) and gaseous (vapor). Even though oceans, seas, rivers, streams, clouds, and rain are constantly changing, the total amount of water on the planet is not affected [61][62][63]. The interconnection of the three water states forms the water life cycle depicted in Figure 1. In this figure, raindrops travel through the mountains towards the sea, forming rivers or streams; however, some streams may flow into rivers and not necessarily into the sea. The water life cycle is composed of three phases [64][65][66]. The first one occurs when the heat produced by the sun affects the water surface, causing seawater to evaporate. This first phase is complemented by the water released by plants during photosynthesis, a process known as transpiration. In the second phase, vapor rises into the atmosphere, forming clouds that store plenty of evaporated water through condensation. The third phase, known as precipitation, occurs when the stored vapor becomes liquid water because the clouds start to cool, for instance when they rise considerably.
The water cycle inspired algorithm is a population-based solving method. In the context of the algorithm, each individual in the population is a solution to the problem, known as a raindrop. Thus, each raindrop has an associated machine-cell matrix and part-cell matrix, as well as a fitness value calculated as given by Equation (4). The population is composed of $N_{pop}$ raindrops.
The solutions in the population are organized on three levels: sea, rivers, and streams. To this end, the population is first sorted in decreasing order of fitness quality, so that the best solution comes first. The first solution is the sea, the following $N_{rivers}$ solutions are the rivers, and the rest of the solutions are the streams. Thus, the sum of rivers and sea is given by
$$N_{sr} = N_{rivers} + 1 \tag{8}$$
and the number of streams is given by
$$N_{streams} = N_{pop} - N_{sr} \tag{9}$$
According to the flow magnitude (how good a solution is), each sea/river has an associated set of streams flowing to it. The cardinality of this set for a given sea/river is defined as
$$NS_n = \mathrm{round}\left( \left| \frac{c_n}{\sum_{i=1}^{N_{sr}} c_i} \right| \times N_{streams} \right), \quad n = 1, \ldots, N_{sr} \tag{10}$$
where $c_n$ is the fitness value of the $n$-th solution in the first $N_{sr}$ solutions in the population, as given by Equation (4). From this expression, the sea will have a higher number of streams than a lower quality river. Following this idea, better solutions have the capacity to attract more water flows than worse ones, as shown in Figure 2. In this figure, rivers (stars) and streams (circles) modify their trajectory to follow stronger flows. When a stream flows into the sea (squares), it is taken as a solution.
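The hierarchy and the allocation of streams can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the population is ranked by cost with the lowest cost first, and the allocation takes generic non-negative weights where a larger weight stands for a stronger flow. Because of rounding, the shares may not add up to the number of streams exactly, and the even split used when all weights are zero is also an assumption of this sketch.

```python
def split_population(costs, n_rivers):
    """Return (sea, rivers, streams) as indices, best (lowest cost) first."""
    order = sorted(range(len(costs)), key=lambda i: costs[i])
    return order[0], order[1:1 + n_rivers], order[1 + n_rivers:]


def streams_per_leader(weights, n_streams):
    """Share n_streams among the sea and rivers in proportion to their weights."""
    total = sum(weights)
    if total == 0:  # assumption: split evenly when every weight is zero
        return [round(n_streams / len(weights))] * len(weights)
    return [round(abs(w) * n_streams / abs(total)) for w in weights]


# Five raindrops with hypothetical costs, one sea and two rivers.
sea, rivers, streams = split_population([5, 2, 9, 0, 7], n_rivers=2)
# Ten streams shared among three leaders with hypothetical weights.
shares = streams_per_leader([6, 3, 1], n_streams=10)
```

Here the raindrop with cost 0 becomes the sea, the next two become rivers, and the leader with the largest weight receives the most streams.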
Additionally, the white color is used to detail advances in each iteration (a new position). The way in which a stream modifies its trajectory is depicted in Figure 3, where $X$ is the new position of the stream as a random displacement in the interval $[0, \alpha \cdot d]$, $d$ is the distance between the river and the stream (usually in terms of fitness value), and $\alpha \in (1, 2)$.
According to the attraction capacity of water flows, the algorithm applies three phases to transform a solution during one iteration of the optimization process. First, it moves streams towards the rivers as given by
$$X_{stream}^{t+1} = X_{stream}^{t} + rand \times \alpha \times (X_{river}^{t} - X_{stream}^{t}) \tag{11}$$
where $rand$ is a random number in the interval $[0, 1]$. Second, streams are directed to the sea as given by
$$X_{stream}^{t+1} = X_{stream}^{t} + rand \times \alpha \times (X_{sea}^{t} - X_{stream}^{t}) \tag{12}$$
Third, rivers flow towards the sea as given by
$$X_{river}^{t+1} = X_{river}^{t} + rand \times \alpha \times (X_{sea}^{t} - X_{river}^{t}) \tag{13}$$
Next, the algorithm evaluates whether the newly generated solution is better than its connection, i.e., whether the stream provides a better fitness value than the river. In such a case, their roles are exchanged, as depicted in Figure 4.
The algorithm also includes a mechanism to avoid premature convergence, based on the evaluation of the evaporation condition. This process evaluates how close rivers and streams are to the sea, so that evaporation is produced. Thus, the evaporation condition for rivers is evaluated as
$$\left| X_{sea}^{t} - X_{river}^{t} \right| < d_{max} \tag{14}$$
and for streams is evaluated as
$$\left| X_{sea}^{t} - X_{stream}^{t} \right| < d_{max} \tag{15}$$
where $d_{max}$ is updated over iterations as given by
$$d_{max}^{t+1} = d_{max}^{t} - \frac{d_{max}^{t}}{T} \tag{16}$$
with $T$ being the maximum number of iterations. Note that $d_{max}$ is reduced over iterations, meaning that the evaporation condition becomes harder to meet as execution progresses. If the evaporation condition is met, a rain process occurs, meaning that the water cycle starts again and a new population is generated, replacing the previous one. That is,
$$X_{stream}^{new} = LB + rand \times (UB - LB) \tag{17}$$
where $LB$ and $UB$ represent the lower and upper bound values, respectively. Based on the above description, the main steps in the algorithm are described in Algorithm 1.
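The movement, evaporation, and rain steps described above can be sketched for real-valued positions. This is a hedged illustration, not the authors' code: it assumes positions are lists of floats, that a single random number is drawn per move, that the evaporation test uses the Euclidean distance to the sea, and that d_max shrinks by a fraction 1/T per iteration. All function names are hypothetical, and how positions map to machine-cell and part-cell matrices is outside the scope of this sketch.

```python
import random


def move_towards(position, leader, alpha=1.5, rng=random):
    """Move a position towards its leader by a random fraction of alpha * distance."""
    r = rng.random()  # one random number in [0, 1) for the whole move
    return [p + r * alpha * (l - p) for p, l in zip(position, leader)]


def evaporates(position, sea, d_max):
    """Evaporation condition: the position is closer to the sea than d_max."""
    dist = sum((p - s) ** 2 for p, s in zip(position, sea)) ** 0.5
    return dist < d_max


def shrink_d_max(d_max, max_iterations):
    """Reduce d_max so that evaporation becomes harder to trigger over time."""
    return d_max - d_max / max_iterations


def rain(lower, upper, rng=random):
    """Create a new raindrop uniformly at random between the bounds."""
    return [lo + rng.random() * (hi - lo) for lo, hi in zip(lower, upper)]


rng = random.Random(0)  # seeded only to make the example repeatable
moved = move_towards([0.0, 0.0], [1.0, 2.0], alpha=1.5, rng=rng)
drop = rain([0.0, -1.0], [1.0, 1.0], rng=rng)
```

Choosing alpha above 1 lets a stream overshoot its leader, which is what allows the search to explore the region around a river or the sea rather than only the segment between the two positions.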

if the stream is better than the river it is connected to then
    Exchange the positions of the stream and the river.
if the stream is better than the sea it is connected to then
    Exchange the positions of the stream and the sea.
if the river is better than the sea it is connected to then
    Exchange the positions of the river and the sea.
if the evaporation condition holds (Equations (14) and (15)) then
    Rain process.
    New raindrops (Equation (17)).
Reduce d_max according to Equation (16).
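The role-exchange steps of Algorithm 1 share a single pattern: a follower that becomes better than its leader takes the leader's place. A minimal sketch of that pattern is shown below, assuming a minimization problem and a caller-supplied cost function; the name `promote_if_better` and the toy numeric solutions are hypothetical.

```python
def promote_if_better(leader, followers, cost):
    """Return (leader, followers) after swapping roles with the best follower
    when that follower has a strictly lower cost than the current leader."""
    if not followers:
        return leader, list(followers)
    best = min(range(len(followers)), key=lambda i: cost(followers[i]))
    if cost(followers[best]) < cost(leader):
        swapped = list(followers)
        swapped[best] = leader  # the old leader takes the follower's place
        return followers[best], swapped
    return leader, list(followers)


# Toy usage: solutions are plain numbers and the cost is the number itself.
river, streams = promote_if_better(5, [7, 3, 9], cost=lambda s: s)
sea, rivers = promote_if_better(4, [river, 8], cost=lambda s: s)
```

Applying the same function first to a river and its streams and then to the sea and its rivers lets a good stream climb the hierarchy within one iteration.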

The Proposed Autonomous Water Cycle Algorithm
The integration of autonomous search into the water cycle algorithm is responsible for varying the population size ($N_{pop}$) while maintaining the same proportion of rivers/sea ($N_{sr}$) and streams ($N_{streams}$). This proportion was experimentally defined as 30% for $N_{sr}$ and 70% for $N_{streams}$. The calculation of $N_{pop}$ is inspired by one of the most important expressions in the water cycle algorithm, Equation (10), which defines the trade-off between exploitation and exploration in the metaheuristic. Thus, the population size is calculated according to the relationship between the worst and best solution in the current population, as given by Equation (18), where $c^{best}_{N_{pop}}$ and $c^{worst}_{N_{pop}}$ are the best and worst fitness values in the current population. In this expression, the population size grows as the difference between the best and the worst fitness value increases; otherwise, the population size shrinks.
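Keeping the 30%/70% proportion while the population size changes amounts to recomputing the two group sizes whenever the size is updated. The sketch below shows one way to do so; the function name `split_sizes`, the rounding rule, and the guarantee of at least one leader (the sea) are assumptions of this illustration.

```python
def split_sizes(n_pop, leader_share=30):
    """Split n_pop into (n_sr, n_streams) with a fixed percentage of leaders."""
    n_sr = max(1, round(n_pop * leader_share / 100))  # keep at least the sea
    return n_sr, n_pop - n_sr


# Population sizes used for the default approach in the experiments.
sizes = {n: split_sizes(n) for n in (30, 50, 80, 100)}
```

For a population of 80 raindrops, for example, this yields 24 rivers/sea and 56 streams.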
The criterion for updating the population size is defined according to the differences observed, in percentage, between the best and worst solution in $N_{sr}$ at two different times. Thus, let $diff^{t1}_{N_{sr}}$ and $diff^{t2}_{N_{sr}}$ be the differences observed between the best and worst solution in $N_{sr}$ at times $t1$ and $t2$, $t1 < t2$, respectively. Then, if $diff^{t2}_{N_{sr}} - diff^{t1}_{N_{sr}} < 0$, the algorithm could be trapped in a local minimum, and the population size should be increased by the number of elements given by Equation (18). Conversely, when this change is too large with respect to the threshold $diff_{th}$, the differences observed could be very large, and the algorithm should focus the search on the most promising areas, reducing the population size in terms of Equation (18). Note that $diff_{th}$ was experimentally defined as 3.00%. The previously discussed hybridization of autonomous search and the water cycle algorithm can be found in Algorithm 2.
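The update criterion can be read as a small decision rule. The sketch below is an interpretation under explicit assumptions: the percentage difference is taken as the gap between the worst and best cost relative to the worst cost, and the population shrinks when the change between the two measurements exceeds the threshold. The exact expressions are not recoverable from the text, so the names `spread_percent` and `population_action`, and both formulas, are hypothetical. The amount by which the size changes, Equation (18), is deliberately left out.

```python
def spread_percent(leader_costs):
    """Percentage gap between the worst and best cost among sea and rivers."""
    best, worst = min(leader_costs), max(leader_costs)
    return 0.0 if worst == 0 else 100.0 * (worst - best) / worst


def population_action(diff_t1, diff_t2, diff_th=3.0):
    """Decide how to adapt the population from two spread measurements."""
    change = diff_t2 - diff_t1
    if change < 0:
        return "grow"    # leaders are converging: possible local minimum
    if change > diff_th:
        return "shrink"  # assumption: large change, focus on promising areas
    return "keep"


spread = spread_percent([10, 12, 20])
action = population_action(diff_t1=50.0, diff_t2=40.0)
```

A caller would take the two measurements at times t1 and t2 and then resize the population according to the returned action.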
Algorithm 2: Hybridization of autonomous search and the water cycle algorithm.

    Define t1 and t2.

When proposing a hybrid approach, it is also important to analyze the time complexity of the proposal in comparison to the default approach. For the default water cycle algorithm, the running time grows linearly with both the maximum number of iterations ($T$) and the population size. Hence, the time complexity of the algorithm is bounded by $O(k \cdot n)$, where $k$ is related to the number of iterations and $n$ to the population size. For the hybrid approach, the time complexity is also bounded by a linear relationship between the number of iterations, which is the same in both approaches, and the population size, which varies over the execution of the algorithm. Hence, the theoretical complexity is not increased. As expected, the execution time could differ between the two approaches because of the size of the population, as occurs for two runs of the default algorithm with two different values of the population parameter. Regarding the memory footprint, it depends on the size of the population. Hence, the usage of main memory will be similar for both approaches when the population size is the same.
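The argument for the hybrid approach can be written as a one-line bound. As assumptions of this sketch, let $n_t$ denote the population size at iteration $t$ and $n_{max}$ the largest size reached during a run (neither symbol appears in the original text), and suppose that the work done in one iteration is proportional to the population size:

```latex
\sum_{t=1}^{T} n_t \;\le\; T \cdot n_{max}
\quad\Longrightarrow\quad
\text{total work} \in O(T \cdot n_{max})
```

With a constant population, $n_t = n$ for every $t$, the sum equals $T \cdot n$, which recovers the bound stated for the default algorithm.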

Experimental Analysis
This section describes the experimental methodology followed to conduct the experiments and discusses the results obtained when comparing the default and autonomous approaches.

Discussion of the Experimental Results
Tables 6-11 show the results obtained by solving Boctor's instances (Tables 6-8) and the set of more complex instances (Tables 9-11) through both approaches. These tables report the average execution time (Stime), in seconds, and the average number of iterations (It) needed to reach the best solution found, whose fitness value is shown in the Opt field. Note that the best fitness value found for each instance of the problem appears in bold font to make the results easier to read.
Analyzing Table 6 for instances BP01 to BP30, the authors observe that the best configuration for the default approach was $N_{pop} = 80$, reaching the best solution in 13 cases, while the autonomous approach reached the best solution in 12 cases. The rest of the configurations for the default approach perform significantly worse than $N_{pop} = 80$. This suggests that the autonomous approach works similarly to the default one with $N_{pop} = 80$, but without being specifically configured. This fact is exemplified in Figure 5, which shows the evolution of the population size over iterations for the autonomous approach. In this figure, the algorithm reduces or increases the population according to the need for exploitation or exploration, respectively. Additionally, the algorithm tends to consider large populations, matching the best configuration found for the default approach. Regarding Stime, the authors observe that this value is significantly lower when considering the autonomous approach, which could be due to the improved search capacity of the algorithm. Note that the linear trend observed in Figure 5 is merely anecdotal and does not represent the usual behavior of the population size. That is, the algorithm adapts the population size according to the need for exploration or exploitation.
Focusing on the results in Table 7 for instances BP31 to BP60, the authors observe that the best configuration for the default approach was $N_{pop} = 100$, reaching the best solution in 12 cases, while the autonomous approach reached the best solution in 10 cases. As before, the rest of the configurations for the default approach perform worse than $N_{pop} = 100$, and the autonomous approach tends to behave similarly to the best static configuration, but without being specifically configured. Note that Stime is also better in the autonomous approach.
Regarding the results in Table 8 for instances BP61 to BP90, the authors observe that three configurations for the default approach provide a good performance ($N_{pop} = 50$, $N_{pop} = 80$, and $N_{pop} = 100$), reaching the best solutions in 7 cases. On the other hand, the autonomous approach provides the best solutions in 10 cases. Thus, the autonomous approach tends to perform similarly to the best corresponding default approach, but without being specifically configured. This situation implies that, in a real case where the best configuration cannot be obtained before the execution, the usage of an autonomous approach could increase the probability of getting a good solution.
Turning to the more complex instances in Table 9, the authors observe that the best configuration for the default approach was $N_{pop} = 30$, reaching the best solution in 6 cases, followed by $N_{pop} = 50$ with 5 cases, while the autonomous approach obtained the best solution in 12 cases. As before, there is no general configuration for the default approach that performs well in most of the instances, so the autonomous approach seems to be a promising way to address the problem.
Focusing on the results in Table 10, the authors observe that there is a configuration for the default approach that performs well in most of the instances. This configuration is $N_{pop} = 80$, reaching the best solution in 11 cases, while the autonomous approach provided the best solution in 8 cases, which is better than the rest of the configurations for the default approach. This suggests that the autonomous approach tended to work with a configuration similar to $N_{pop} = 80$, but without being specifically configured. Finally, in the last set of instances in Table 11, the authors observe that there is no general configuration for the default approach performing well in most of the instances. As expected from the previous analysis, the autonomous approach performs better in this case.
From this analysis, the authors conclude that the performance of the autonomous water cycle algorithm is similar to that of the default approach when the latter is successfully configured. This fact is especially valuable because it implies that the autonomous approach could replace the original one in cases where it is not possible to identify an adequate configuration before running the algorithm. Additionally, the autonomous approach has shown a shorter execution time to find a good solution, because of the improvement in search capacity brought by the dynamic population.

Conclusions
In this research, the authors have presented a study of an alternative approach to the application of the water cycle metaheuristic algorithm. Specifically, the authors considered concepts from the autonomous search paradigm, which is a particular case of adaptive systems, for updating the population parameter of the bio-inspired algorithm. Thus, the proposed metaheuristic can control this relevant parameter during the search process.
To assess the performance of the proposal, the default water cycle algorithm and the autonomous water cycle algorithm were applied to the manufacturing cell design problem, which is a relevant optimization problem in the industry. Two sets of instances were considered, i.e., 90 classic instances proposed by Boctor and 70 more complex instances proposed by several authors. As a result of solving these instances, the authors concluded that the performance of the autonomous approach tends to be similar to that of the default approach when the latter is successfully configured. This conclusion is relevant because the autonomous approach could perform similarly to a well-configured default approach, but without being specifically configured. That means that the autonomous approach could be considered for solving optimization problems where suitable configurations are unknown.
As future work, it could be interesting to improve the autonomy of the algorithm by including the capacity to dynamically update the rest of the parameters in the metaheuristic. It could also be interesting to study how to provide autonomy to other metaheuristics in the literature. Additionally, the authors plan to explore the Learnheuristic approach [67], which considers concepts from the machine learning field, to improve the search capacity of the metaheuristic.