On Solving Large-Size Generalized Cell Formation Problems via a Hard Computing Approach Using the PMP

In this paper, we show that a hard computing approach using the p-median problem (PMP) is a very effective strategy for optimally solving large-size generalized cell formation (GCF) problems. The soft computing approach, which relies on heuristic or metaheuristic search algorithms, has been the prevailing strategy for solving large-size GCF problems: it achieves short computation times at the cost of the global optimum. However, owing to recent advances in computing technology, using hard computing techniques to solve large-size GCF problems optimally is no longer time-prohibitive if an appropriate mathematical model is built. We show that a hard computing approach using a PMP-type model can solve even large 0–1 GCF instances optimally in a very short computation time with a powerful mixed integer linear programming (MILP) solver adopting an exact search algorithm such as branch-and-bound.


Introduction
The cell formation (CF) problem has attracted researchers in academia as well as practitioners in the field since it was introduced as a part of group technology (GT) [1]. The initial step of CF is to create machine cells and their associated part families. A machine cell is a collection of functionally dissimilar machines that are grouped together and dedicated to processing an associated part family, which is a collection of parts that are similar in geometric shape and size or in processing requirements. By creating efficient cells, operations are performed mostly by the machines within cells (intra-cell operations) while transfers of parts from one cell to another (inter-cell operations) are minimized. This leads to numerous operational benefits, such as reductions in setup time and work-in-process inventories, improvements in quality, and a high degree of flexibility in responding to product demand changes [2].
Since CF is an NP-hard problem [3,4], a number of approaches and methods have been proposed to solve it effectively. Papaioannou and Wilson [5] provided a recent review of CF solution methodologies. CF problems are classified into two categories: the standard CF (SCF) problem, which considers only one process plan for each part, and the generalized CF (GCF) problem, which considers alternative process plans for each part. Both problems can include replicate machines, i.e., extra copies of a machine type. The GCF problem is more complicated than the SCF problem, since the latter is a special case of the former. When a part has alternative process plans, its operations can be performed on different types of machines or on extra copies of a machine type. By considering alternative process plans and replicate machines, more independent cells and higher machine utilization can be achieved owing to reduced inter-cell flows [6].
The first step in solving CF problems is to construct the mathematical model best suited to achieving the objectives of a specific CF. However, this usually leads to a huge model with many integer and continuous variables, constraints, and/or nonlinear functions. As the numbers of machines and parts, which directly determine the size of the CF problem, increase, optimal solution methodologies fail to solve large CF instances [7]. More specifically, if a mathematical model of CF contains nonlinear and/or multi-objective functions over multiple periods, it is very difficult to solve that model optimally, even though the nonlinear functions can be linearized. Therefore, a number of soft computing approaches relying on local search methodologies, such as artificial intelligence, heuristic/meta-heuristic, or hybrid algorithms, have been proposed. Soft computing approaches attempt to find good or acceptable solutions to the proposed mathematical model of CF in a short computation time at the expense of the global optimum. Most soft computing approaches use specific mathematical models to set up the CF problem rather than to solve it optimally; in this regard, almost all soft computing approaches for CF are heuristic in nature. However, cell design experience with industrial experts shows that designers would rather spend more time achieving an optimal or near-optimal solution than use a heuristic approach that yields an inferior one [8].
In contrast, hard computing approaches can use exact search algorithms such as branch-and-bound to solve mathematical models of CF optimally if reasonable computation time is allowed. The application of hard computing approaches to CF has relied significantly on recent advances in computer hardware and on commercially available mixed integer linear programming (MILP) solvers such as CPLEX, LINGO, and Gurobi. Borrero et al. [9] stated that hard computing approaches can yield optimal solutions to large MILP problems with a reasonable running time if appropriate mathematical models are constructed.
Recently, two hard computing approaches for solving the mathematical models of CF have been discussed in the CF-related literature. The first is an exact method that attempts to find the best cell configuration by directly maximizing the objective function of CF. The grouping efficacy (GE) measure [10] has been widely used as an objective function of CF. Since the GE takes a fractional form, the CF problem with the GE objective function results in a 0-1 nonlinear fractional programming problem. Thus, maximizing the GE directly has attracted many researchers since the early 2010s [11][12][13][14][15][16][17][18]. To evaluate the performance of these exact methods, 35 small to intermediate-size benchmark incidences [19] have been widely used for benchmark testing, and their solutions have been compared. However, some instances were not solved optimally even under a time limit of 100,000 s using the CPLEX MILP solver.
The other approach aims to indirectly maximize the GE or other alternative performance measures by using the classic or modified p-median problem (PMP). Since Hakimi [20,21] first introduced the PMP on a network of nodes and arcs, the PMP has been widely studied and extended to many practical situations, including the location of plants, warehouses, distribution centers, hubs, and public service facilities [22]. Revelle and Swain [23] used Balinski-type constraints [24] to present an integer linear programming (ILP) formulation of the PMP. Unfortunately, since the original Revelle and Swain model (ORSM) defined on an n-node network contains n² binary variables and n² + 1 constraints, it is computationally infeasible to solve the ORSM exactly even for moderately sized networks. Therefore, many attempts have been made to formulate equivalent PMP models with fewer binary variables and constraints than the original ORSM [25][26][27][28][29][30][31]. These reduced PMP models have been solved using hard computing techniques on MILP solvers, and their performance has been compared to that of past PMP models.
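To make the PMP itself concrete, the following is a minimal brute-force sketch in Python: choose p median nodes so that the total distance from every node to its nearest median is minimized. The five-node distance data are hypothetical; in practice, ILP formulations such as the ORSM and its reduced variants replace this enumeration, which is tractable only for tiny networks.

```python
from itertools import combinations

def solve_pmp(dist, p):
    """Brute-force p-median: pick p median nodes minimizing the total
    assignment distance (tractable only for very small n)."""
    n = len(dist)
    best_cost, best_medians = float("inf"), None
    for medians in combinations(range(n), p):
        # each node is served by its nearest chosen median
        cost = sum(min(dist[i][j] for j in medians) for i in range(n))
        if cost < best_cost:
            best_cost, best_medians = cost, medians
    return best_cost, best_medians

# Symmetric 5-node distance matrix (hypothetical data)
dist = [
    [0, 2, 7, 9, 8],
    [2, 0, 6, 8, 9],
    [7, 6, 0, 3, 2],
    [9, 8, 3, 0, 4],
    [8, 9, 2, 4, 0],
]
cost, medians = solve_pmp(dist, p=2)   # two natural clusters emerge
```

The exponential number of median subsets is exactly why the ILP formulations discussed above, solved by branch-and-bound, are needed for realistic network sizes.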
Kusiak [32,33] first proposed the PMP-type model as an alternative mathematical programming model for CF, replacing exact methods. The PMP itself does not explicitly optimize a CF objective such as the GE. Nevertheless, the PMP captures the clustering nature of CF and provides a flexible framework that allows additional constraints reflecting realistic aspects to be introduced [34]. In this context, the PMP matches the CF problem well and shows good solution performance for small to intermediate-size SCF/GCF instances [35][36][37][38][39][40][41][42][43][44][45][46][47][48][49]. Recently, Goldengorin et al. [34] proposed a flexible PMP-based approach for solving large-sized 0-1 SCF problems and used the Xpress MILP solver to optimally solve most of the SCF instances available in the literature within one second.
However, few studies reporting successful applications of the hard computing approach of the PMP-type model to large-sized GCF instances have appeared in the literature. There are two main reasons for this:


	First, with regard to SCF, the 35 standard incidences available in the literature have been widely used for benchmark testing over the last 20 years, and a recent study adopting a hybrid algorithm [50] has reported the best optimal solutions with huge time savings. However, few open large GCF data sets are available in the literature to serve as standard instances for benchmark testing and for performance comparison of solution algorithms. As far as the present author knows, the largest open GCF example available in the literature has at most 55 machines, 60 part types, and a total of 124 process plans [51].
	Second, the execution strategies for CF and the complicated aspects inherent in the GCF problem itself make the solution quality of CF methods very sensitive to the subsequent part assignment or improvement procedures that must follow after machine cells are obtained. Three different strategies have been used to execute CF algorithms [52]: the part family identification (PFI) strategy forms part families first and then groups machines, the machine group identification (MGI) strategy creates machine cells first and then allocates parts to cells, and the part family/machine grouping (PF/MG) strategy forms machine cells and part families simultaneously. Most soft and hard computing approaches for CF use the MGI strategy, since implementing the PF/MG strategy usually takes an enormous computation time even for intermediate-size incidences. The PMP-based approach for CF also uses the MGI strategy to create cells. Therefore, once machine cells are obtained from the PMP solution, part families need to be formed by allocating parts to their best cells. Danilovic and Ilic [50] and Li et al. [53] have established sufficient conditions for the optimal assignment of parts to machine cells given a partition of the machines of an SCF problem.
However, it should be noted that their sufficient conditions may not guarantee the optimal assignment of parts maximizing the GE in the GCF problem due to the existence of alternative process plans and/or replicate machines.
Motivated by the drawbacks of extant studies attacking the GCF problem, this paper proposes an effective hard computing approach using the PMP-type model to solve large-sized GCF problems. Our hard computing approach has the following distinctive features compared to previous hard computing approaches dealing with the GCF problem:


	Two new linear 0-1 mathematical models of GCF are formulated: an exact model that directly maximizes the GE and a PMP-type model that indirectly maximizes the GE. Because the exact model contains too many binary variables and constraints, the PMP-type model is used to solve large-sized GCF instances optimally. According to the computational experiments on large GCF instances with over 10,000 binary variables, our PMP-type model solves those instances optimally within one second using the LINGO MILP solver.
	Since the PMP-type approach uses the MGI strategy to form machine cells first, a subsequent part-allocation step is needed to form the corresponding part families. In this paper, a systematic heuristic part assignment procedure based on a new classification scheme of part types with alternative process plans is used to assign the best process plan of each part to its best cell. A subsequent refinement procedure is then used to further improve the block diagonal solution by reassigning improperly assigned exceptional machines (EMs) in such a way that the GE is maximized. The computational burden of these extra procedures is negligibly small, since they accomplish a high-quality CF within 0.2 s even for the largest GCF instances tested in our computational experiments.
	Unlike the comparative studies of SCF, which use the standard data set provided by Goncalves and Resende [19], studies of GCF lack a standard data set. Our computational experiment has been conducted over the widest range of GCF incidences that has ever appeared in the CF-related literature. Our collection of GCF incidences can be used as a standard data set for benchmark tests in the future.

Basic Input
Different studies define the alternative process plans or routings of the GCF in different ways. Vin and Delchambre [54] sort the processes or routings into six categories: (i) fixed routing (process route), (ii) routing with replicate machines, (iii) routing with alternate machines for some operations, (iv) several fixed routings, (v) fixed process plan, and (vi) alternative process plans. In this paper, we combine manners (ii) and (iv) to define the alternative process plans of the GCF problem.
To model the GCF problem, the binary part-machine incidence matrix (PMIM), which represents the association between the process plans of parts and machines, is used as the basic input. Given m part types with a total of t process plans and n different machine types, the t × n binary PMIM is defined so that each entry equals 1 if the corresponding process plan is processed by the corresponding machine type and 0 otherwise. In addition, the indices, notation, and decision variables defined in this section will be used throughout the paper. The 0-1 GCF problem can then be illustrated with a generalized 0-1 PMIM, as shown in Figure 1a, in which each row indicates a part and each column a machine. The manufacturing system shown in Figure 1a has eight part types with a total of 19 process plans and five different machine types with an extra copy of machine type 3. An entry of "1" indicates that a part is processed by its associated machine, and an entry of "0", which is not shown for visual convenience, indicates that it is not. The alphabetical letters after the part numbers indicate alternative process plans. Rearranging the rows and columns in such a way that only one process plan is selected for each part and only one copy of each different machine type is allowed in each cell results in a block diagonal solution matrix, as shown in Figure 1b.
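The PMIM construction can be sketched as follows. The routing data here are hypothetical, not the Figure 1a instance; each row of the resulting matrix corresponds to one process plan of one part, and each column to one machine type.

```python
# Routings: (part, plan) -> machine types visited (hypothetical data).
routings = {
    ("1", "a"): [1, 3],
    ("1", "b"): [2, 4],
    ("2", "a"): [1, 3, 5],
    ("3", "a"): [2, 4, 5],
}
n_machines = 5

plans = sorted(routings)          # row order: one row per process plan
# t x n binary PMIM: entry is 1 if the plan visits the machine type
pmim = [[1 if m + 1 in routings[p] else 0 for m in range(n_machines)]
        for p in plans]
```

Keeping one row per process plan (rather than per part) is what distinguishes the generalized PMIM from the standard one: the model later selects exactly one row per part.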

Performance Measure
Several comprehensive grouping efficiency measures considering both EEs and voids have been proposed to evaluate the quality of 0-1 block diagonal solutions, and they have been reviewed critically [55,56]. Of those measures, the grouping efficacy (GE) [10] has been the most widely used for evaluating the performance of the 0-1 GCF problem as well as the 0-1 SCF problem. The GE, Γ, is defined as

Γ = (total no. of 1s in the block diagonal matrix − EEs) / (total no. of 1s in the block diagonal matrix + voids). (1)

According to this definition, the block diagonal solution of Figure 1b gives a GE of 85.19%. A GE of 100% indicates a perfect CF without EEs and voids.
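The GE of Equation (1) can be sketched in Python. This simplified version ignores the replicate-machine correction discussed later for the exact model, and the cell assignments and matrix are hypothetical.

```python
def grouping_efficacy(pmim, part_cell, mach_cell):
    """GE = (e - EEs) / (e + voids) over the selected rows, where e is
    the total number of 1s, EEs are 1s outside the diagonal blocks, and
    voids are 0s inside them."""
    e = ee = voids = 0
    for r, row in enumerate(pmim):
        if r not in part_cell:        # this row's process plan was not selected
            continue
        for m, a in enumerate(row):
            inside = part_cell[r] == mach_cell[m]
            e += a
            if a and not inside:      # a 1 outside its diagonal block
                ee += 1
            elif not a and inside:    # a 0 inside its diagonal block
                voids += 1
    return (e - ee) / (e + voids)

# Tiny example: rows 0-1 in cell 0, rows 2-3 in cell 1 (hypothetical)
pmim = [[1, 1, 0, 0],
        [1, 1, 0, 1],
        [0, 0, 1, 1],
        [0, 1, 1, 1]]
ge = grouping_efficacy(pmim, {0: 0, 1: 0, 2: 1, 3: 1}, [0, 0, 1, 1])
```

Here e = 10, with two EEs and no voids, so Γ = 8/10; a perfect block diagonal matrix would give Γ = 1.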

Exact Model
Several authors have provided exact formulations maximizing the grouping efficacy in the 0-1 SCF problem with a single copy of each machine type [11][12][13][14][15][16][17][18]. In this subsection, we present an exact formulation extended for the 0-1 GCF problem with multiple copies of replicate machine types.
To construct the exact formulation, which directly maximizes the GE, the total number of 1s, the number of EEs, and the number of voids must be calculated. They are given by Equations (2)-(4). Note that once a process plan of a part is processed on one copy of a replicate machine type in a cell, the remaining 1s of that process plan handled by the other copies of the same replicate machine type in other cells are not counted as EEs.
The objective function in Equation (5) maximizes the GE. Constraint (6) ensures that only one process plan of each part is assigned to only one cell. Constraint (7) ensures that each machine belongs to exactly one machine cell. Constraint (8) ensures that at most a single copy of a replicate machine type is assigned to each cell. Constraint (9) ensures that the number of machines in each cell does not exceed U machines. Constraint (10) ensures the binary restriction of variables.
Model 2 can then be solved using a solver such as LINGO, which adopts the branch-and-bound algorithm. However, the computational burden of optimally solving model 2 remains heavy even with a powerful solver, due to the presence of too many binary variables and constraints, whose numbers grow rapidly with the numbers of machines, process plans, and cells. For example, solving a large-sized 0-1 GCF problem containing 110 machines, 120 part types, and 248 process plans, which will be tested in Section 4, requires 331,656 binary variables and 663,554 constraints. Therefore, relying on an alternative model with far fewer binary variables and constraints that leads to the maximization of the GE can be a better strategy. Thus, we use the PMP-type model to solve large-sized GCF problems by maximizing the GE indirectly.

PMP-Type Model
Since the exact model 2 has too many binary variables and constraints to optimally solve large-sized GCF incidences within a reasonably short computation time, we use the PMP-type model as a better alternative formulation that maximizes the GE indirectly. To develop the PMP-type model of GCF, however, we need a definition of similarity coefficients incorporating both alternative process plans and replicate machines. In this paper, we use a modified version of Won and Kim's similarity coefficient [59], defined between pairs of machine types, to formulate the mathematical model of the GCF. Won and Kim's similarity coefficient, which is based on the binary PMIM, is a generalization of the Jaccard similarity coefficient used for the SCF problem. Their similarity coefficient between two machine types is defined in Equation (13). Based on this similarity coefficient, the PMP-type model of GCF can be formulated as Equations (18)-(23). The objective function in Equation (18) maximizes the sum of similarities among all pairs of machines, including replicate machines. Constraint (19) ensures that each machine belongs to exactly one machine cell. Constraint (20) specifies the required number of machine cells. Constraint (21) ensures that the number of machines in each cell does not exceed the cell size limit. Constraint (22) ensures that at most a single copy of a replicate machine type is assigned to each cell. Constraint (23) ensures the binary restriction of the variables.
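Since the coefficient of Equation (13) generalizes the Jaccard similarity, a plain Jaccard sketch between machine columns of a binary PMIM illustrates the underlying idea. Won and Kim's actual coefficient additionally accounts for alternative process plans and replicate machines; the matrix below is hypothetical.

```python
def jaccard_similarity(pmim, i, j):
    """Jaccard coefficient between machine columns i and j of a binary
    PMIM: plans visiting both machines / plans visiting at least one."""
    both = sum(1 for row in pmim if row[i] and row[j])
    either = sum(1 for row in pmim if row[i] or row[j])
    return both / either if either else 0.0

# Rows are process plans, columns are machine types (hypothetical)
pmim = [[1, 1, 0],
        [1, 0, 1],
        [0, 1, 1],
        [1, 1, 0]]
s01 = jaccard_similarity(pmim, 0, 1)   # 2 shared plans out of 4
```

Machines that tend to appear in the same process plans receive a high coefficient, which is exactly why the PMP objective of Equation (18), summing similarities within cells, clusters functionally related machines together.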
Since model 3 is a linear model and the number of machine types is usually much smaller than the number of process plans, moderately large-sized instances can be solved optimally within a reasonable computation time regardless of the total number of process plans. In contrast, the number of process plans critically affects the size of model 2. Solving the example incidence addressed in Section 2.3.1 requires only 12,100 binary variables and 497 constraints. Clearly, model 3 is very economical, since it contains far fewer binary variables and constraints than model 2. We will show that even large GCF incidences can be solved optimally within one second using the LINGO solver.

Part Assignment to Cells
The solution of the PMP-type model 3 identifies only the machine cells. Once the machine cells are formed, parts need to be assigned to their best associated cells in such a way that the objective of CF is optimized. The classic part assignment rule that has been most widely used is the maximum density rule, which assigns a part to the cell in which it has the most operations. Since different part assignment rules affect the solution quality of CF, several modified part assignment rules have been proposed [50,[59][60][61][62][63][64]. A critical drawback of these rules is that, because they are heuristic, the solution quality resulting from the assignment of one part depends on the assignment of the other parts. Some authors [50,53] have established sufficient conditions for optimal part assignment in the SCF problem that do not depend on the assignment of other parts. Several heuristic part assignment procedures for the GCF problem considering the numbers of EEs and voids have been proposed in the literature [59,[61][62][63]; however, due to the existence of alternative process plans and replicate machines, no optimal part assignment rules for the GCF problem have been proposed.
Won [63] used the nonbinary generalized PMIM to develop a heuristic part assignment procedure for the GCF problem. In this paper, we use a modified heuristic part assignment procedure based on the binary generalized PMIM to classify parts into eight categories as follows:


	A type I strongly nonexceptional part (SNEP), for which a unique process plan has the most 1s in a unique machine cell without EEs;
	A type II SNEP, for which multiple process plans have the most 1s in a unique machine cell without EEs;
	A type I neutrally nonexceptional part (NNEP), for which a unique process plan has the most 1s in more than one machine cell without EEs;
	A type II NNEP, for which multiple process plans have the most 1s in more than one machine cell without EEs;
	A type I weakly exceptional part (WEP), for which a unique process plan has the most 1s in a unique machine cell with EEs;
	A type II WEP, for which multiple process plans have the most 1s in a unique machine cell with EEs;
	A type I neutrally exceptional part (NEP), for which a unique process plan has the most 1s in more than one machine cell with EEs;
	A type II NEP, for which multiple process plans have the most 1s in more than one machine cell with EEs.
To determine the specific type of a part and assign its best process plan to its best associated cell in such a way that the GE is maximized, the following measures need to be calculated for all process plans of all parts:

1. The number of EEs due to the assignment of each process plan to each cell;
2. The number of voids due to the assignment of each process plan to each cell;
3. The number of cells which have the most 1s for completing the required operations due to the assignment of each process plan to each cell;
4. The total number of operations (1s) contained in each cell with all the parts assigned up to the current stage;
5. The number of parts assigned to each cell up to the current stage; and
6. The number of operations processed by the machines in each cell.

Criteria 1 and 3 contribute to an independent cell configuration with the fewest inter-cell moves. Criteria 2 and 6 contribute to a compact cell configuration with high machine utilization. Criteria 4 and 5 contribute to a balanced cell configuration with an even workload among cells. The specific type of a part can be determined from criteria 1 and 3. Figure 2 presents a flow chart of the procedure which identifies the type of a part. The assignment of a type I SNEP or type I WEP is straightforward: its unique candidate process plan is assigned to its associated unique cell. The assignment of the remaining part types, however, requires tie-breaking rules to select the best candidate process plan and its best associated cell. Given the information of criteria 1 to 6 for each part, the assignment rules for the remaining part types are stated as follows:

Assignment rule for type II SNEP or type II WEP. There is a unique candidate cell for each of these part types. The process plan with the fewest EEs from criterion 1 is selected. If ties occur, the plan with the maximum number of operations from criterion 6 is selected. If ties occur again, the smallest-numbered process plan is selected.
Assignment rule for type I NNEP or type I NEP. There is a unique candidate plan for each of these part types. The cell with the fewest EEs from criterion 1 is selected. If ties occur, the cell with the fewest voids from criterion 2 is selected. If ties occur again, the cell with the fewest assigned 1s from criterion 4 is selected. If ties occur again, the cell with the fewest assigned part types from criterion 5 is selected. If ties occur again, the cell with the maximum number of operations from criterion 6 is selected. If ties occur again, the smallest-numbered cell is selected.
Assignment rule for type II NNEP or type II NEP. There are multiple candidate plans and cells for each of these part types. The plan and the associated cell with the fewest EEs from criterion 1 are selected. If ties occur, the plan and cell with the fewest voids from criterion 2 are selected. If ties occur again, the plan and cell with the fewest assigned 1s from criterion 4 are selected. If ties occur again, the plan and cell with the fewest assigned part types from criterion 5 are selected. If ties occur again, the plan and cell with the maximum number of operations from criterion 6 are selected. If ties occur again, the smallest-numbered process plan and cell are selected.
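Tie-breaking chains of this kind can be implemented compactly as lexicographic sort keys. A minimal sketch for the type II NNEP/NEP rule follows; the field names and measure values are illustrative, not from the paper's instances.

```python
def select_plan_and_cell(candidates):
    """candidates: one dict per (plan, cell) pair with the measures of
    criteria 1-6.  The tuple key encodes the rule for type II NNEP/NEP
    parts: fewest EEs, fewest voids, fewest 1s already in the cell,
    fewest parts already in the cell, most operations, then smallest
    plan and cell numbers."""
    return min(candidates, key=lambda c: (
        c["ees"], c["voids"], c["cell_ones"], c["cell_parts"],
        -c["ops"], c["plan"], c["cell"]))

# Hypothetical measures for two (plan, cell) pairs of one part
candidates = [
    {"plan": 1, "cell": 2, "ees": 1, "voids": 2,
     "cell_ones": 10, "cell_parts": 3, "ops": 4},
    {"plan": 2, "cell": 1, "ees": 1, "voids": 1,
     "cell_ones": 12, "cell_parts": 4, "ops": 3},
]
best = select_plan_and_cell(candidates)   # tie on EEs; fewer voids wins
```

Encoding the whole chain in one key guarantees that later criteria are consulted only when all earlier ones tie, exactly as the prose rule prescribes.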

Reassigning Improperly Assigned Exceptional Machines (EMs) and Redundant Machines (RMs)
The CF based on the solution from model 3 and the subsequent part family formation may result in an unsatisfactory block diagonal solution due to improperly assigned EMs, which process most of their parts in other cells, or RMs, which process no parts in their parent cell. Therefore, a subsequent refinement procedure inspects the incumbent block diagonal solution and, if any improperly assigned EMs or RMs are found, improves it by reassigning them to their most appropriate cells. The reassignment of improperly assigned EMs/RMs can lead to a higher GE by decreasing EEs and voids.
The improperly assigned EMs/RMs are categorized as follows:
	An absolute RM, which processes no parts in any cell;
	A type I RM, which processes no parts in its parent cell and most of its parts in one other cell;
	A type II RM, which processes no parts in its parent cell and most of its parts in two or more other cells;
	A type I EM, which processes some parts in its parent cell but most of its parts in one other cell;
	A type II EM, which processes most of its parts in two or more cells, including its parent cell.
To determine the type of EM and RM, the following elements need to be evaluated for each machine:


	Its parent cell;
	The cells processing most of its parts;
	The total number of parts it processes; and
	The number of parts it processes in its parent cell.

Based on these elements, the following reassignment rules are applied:
	Reassign a type I RM or type I EM to its unique candidate cell;
	Reassign a type II RM or type II EM to the cell with the fewest 1s; if ties occur, reassign it to the cell with the fewest machines;
	Remove an absolute RM from the current cell configuration, since it does not process any parts under the incumbent cell configuration.
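A simplified sketch of the EM/RM classification from per-cell part counts follows. The category tests paraphrase the definitions above; function and variable names, and the example counts, are hypothetical.

```python
def classify_machine(parent_cell, parts_per_cell):
    """Classify a machine from the number of parts it processes in each
    cell (simplified sketch of the EM/RM categories)."""
    total = sum(parts_per_cell.values())
    if total == 0:
        return "absolute RM"            # processes no parts at all
    top = max(parts_per_cell.values())
    top_cells = [c for c, v in parts_per_cell.items() if v == top]
    in_parent = parts_per_cell.get(parent_cell, 0)
    if in_parent == 0:                  # no parts in its parent cell
        return "type I RM" if len(top_cells) == 1 else "type II RM"
    if parent_cell in top_cells and len(top_cells) == 1:
        return "properly assigned"      # most parts already in parent
    return "type I EM" if len(top_cells) == 1 else "type II EM"

kind1 = classify_machine(0, {0: 0, 1: 5})   # no parts in parent cell
kind2 = classify_machine(0, {0: 2, 1: 6})   # some in parent, most elsewhere
```

The classification drives the reassignment rules above: a unique top cell gives a type I machine with an obvious target cell, while ties require the fewest-1s tie-breaker.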

Illustrative Example
The proposed procedures for part assignment and improperly assigned EM/RM reassignment are illustrated with a hypothetical GCF incidence which includes seven machine types, 15 part types, and 35 process plans, as shown in Table 1.

[Table 1. Routing data of the hypothetical GCF incidence.]

Once the machine cells are determined from the solution of model 3, the partial assignment steps of parts 1, 2, and 3 are presented in Table 2, where column (1) indicates the number of EEs generated by the process plan assigned to a specific cell. The symbol * indicates the entries corresponding to the candidate process plans and cells selected by the criterion of that step, and the symbol ** indicates the entries corresponding to the best process plan and cell selected by the criterion of the final step. Once the best candidate plan is determined, the associated best process plan and cell can be identified by scanning the columns from left to right. After all parts are assigned, no RMs or improperly assigned EMs are found. Figure 5 shows the resulting block diagonal solution matrix, which has a GE of 77.19%.

Results
The purpose of the computational experiment performed in this section was two-fold. One aim was to show the effectiveness of our approach compared to the solutions already reported in the literature, under the same restrictions as the reference approaches with regard to the number of cells and the cell size limit; the other was to provide solutions that can be used as benchmarks for comparative testing of different CF solution approaches by finding better alternative solutions that have not yet been reported in the literature.
The former incidences are used for comparative purposes along with the results obtained by the reference approaches; the latter incidences are used to show the effectiveness of our approach in solving large-sized GCF incidences. The double-expanded large-size incidences have been produced by following Adil et al.'s data expansion scheme [80], which duplicates rows and columns of an existing intermediate-sized binary PMIM instead of using randomly generated data sets.
For the original problems 6, 11, 22, 23, and 24 in Table 3, extra incidences have been tested with numbers of cells and cell size limits different from those used in the original problems. Some of the original data sets include information on the operation sequences and/or production volumes of parts. Those data sets have been slightly changed so that the resulting routing data or PMIMs suit the 0-1 GCF problem format. For problem 5, process plans b and c of part type 2 are merged into a single process plan, since they require the same tools. For problems 17 and 19, process plan c of part type 1 has been deleted, since it requires the same machines as process plan b. All the GCF incidences and block diagonal solution matrices are available upon request.
Model 3 has been solved using the LINGO 16.0 solver on an ASUS laptop computer with an Intel Core i7-9750H processor running at 2.6 GHz and 16 GB of RAM under the Windows 10 operating system. The procedures for part assignment and improperly assigned EM/RM reassignment were coded in C++ and implemented using Visual Studio 2015. The execution times of the reference CF methods selected for comparison are not reported, since each incidence in our experiment was originally solved on different machines with different compilers; only the execution time of our approach is reported for future comparison.

Discussion
In Table 3, the bold-faced GEs indicate the best GEs obtained by the corresponding reference approach and by the proposed PMP hard computing approach. Of these 39 test incidences, the best published GE values under the corresponding numbers of cells and cell size limits have not been reported in the literature for four incidences of problem 22 and for the three extra test incidences of problems 11, 23, and 24. Therefore, these instances are not used for performance comparison between the reference approaches and the proposed PMP approach; they are reported only for the purpose of future comparison.
Regarding the computing time required to solve model 3, the proposed PMP hard computing approach reached the global optimum within 0.3 s even for a large incidence of problem 24 involving 3025 binary variables. The execution times of the subsequent procedures for part assignment and improperly assigned EM/RM reassignment were also negligibly small, since those procedures terminated within 0.1 s for all incidences.
The proposed PMP hard computing approach has also been applied to larger GCF incidences in order to show how efficiently it solves large-sized GCF incidences. To the best of our knowledge, problem 24 is the largest open GCF incidence available in the literature and can therefore be used for comparison with different CF solution approaches. Some authors [61,81] tested their CF solution approaches with randomly generated incidences in order to demonstrate their efficiency on large-sized problems. However, they did not provide solutions showing an explicit configuration of machine cells and associated part families. As a result, the best published GE values for such randomly generated incidences are not available for benchmark testing.
To show how efficiently the proposed PMP hard computing approach solves even larger GCF instances, we chose to double-expand some of the original large instances instead of using randomly generated ones. For this purpose, problems 8 and 21 to 24 were selected and expanded. A typical strategy for randomly generating large matrices is to create an ideal block-diagonal structure first and then degrade it gradually with random flips, changing 1s in the diagonal blocks into 0s and 0s in the off-diagonal blocks into 1s [34]. The more the matrices are flipped, the more random they become. The large GCF instances generated through double-expansion in our computational experiment, however, are not randomly flipped, so the resulting matrices do not approach completely random ones. By avoiding random flips over the expanded matrices, we can test both the efficiency and the robustness of the proposed PMP approach. The only randomly scrambled element is the order of the part numbers.

Table 4 shows the double-expanded instances and the computational results for them. The numbers of machines in the expanded instances vary from 60 to 110, and the numbers of binary variables therefore vary from 3600 to 12,100. To the best of our knowledge, few mathematical models have been solved optimally with a hard computing approach for large GCF instances with so many binary variables. According to Table 4, model 3 reaches the global optimum for all the expanded instances within one second, and the subsequent procedures for part assignment and for reassigning improperly assigned EMs/RMs terminate within 0.2 s. This clearly demonstrates the computational efficiency of the proposed PMP hard computing approach for large-sized GCF problems.
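The block-diagonal-plus-random-flips generation strategy described above can be sketched as follows. The block sizes and flip count are illustrative choices, not the parameters used in [34] or in our experiment.

```python
# Sketch of the matrix-generation strategy described above: start
# from an ideal block-diagonal 0-1 matrix, then degrade it with
# random flips (1s inside the diagonal blocks become voids, 0s in
# the off-diagonal region become exceptional elements).
import random

def block_diagonal(block_sizes):
    """Ideal incidence matrix with square diagonal blocks of 1s."""
    n = sum(block_sizes)
    mat = [[0] * n for _ in range(n)]
    start = 0
    for size in block_sizes:
        for i in range(start, start + size):
            for j in range(start, start + size):
                mat[i][j] = 1
        start += size
    return mat

def random_flips(mat, k, rng=random):
    """Flip k randomly chosen cells (0 <-> 1). Starting from an
    ideal matrix, each flip creates a void or an exceptional
    element, so more flips mean a more random matrix."""
    n = len(mat)
    for _ in range(k):
        i, j = rng.randrange(n), rng.randrange(n)
        mat[i][j] ^= 1
    return mat

m = block_diagonal([3, 3])   # perfectly structured 6x6 matrix
random_flips(m, 4)           # gradually destroy the structure
```

Our double-expanded instances skip the `random_flips` step entirely, which is why they retain the underlying block structure of the original problems.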
Furthermore, except for instance 2, which was double-expanded from problem 21, the double-expanded instances even yielded better GEs than the original instances before double-expansion. This demonstrates the robustness of the proposed PMP approach when applied to large-sized GCF problems.

Conclusions
In this paper, we have proposed an effective hard computing technique that solves large-sized GCF instances to the global optimum within a short computation time. The distinctive contributions of this paper are summarized as follows:


- Two new linear 0-1 mathematical models have been formulated for the GCF problem: an exact model that directly maximizes the GE and a PMP-type model that indirectly maximizes the GE;
- It still seems very difficult to use a hard computing approach to optimally solve large-sized GCF instances with an exact mathematical model that optimizes the objective function directly;
- The PMP-type model, which optimizes the objective function of the GCF indirectly, allows hard computing techniques to solve large-sized GCF instances optimally in a very short computation time. Even a large GCF instance containing 12,100 binary variables can be solved to the global optimum within one second, and the machine cells identified;
- The computation time of the subsequent part assignment procedure, which finds the associated part families, and of the refinement step, which reassigns improperly assigned EMs/RMs, is also negligible, even though some soft computing heuristics can run faster, as shown by Goldengorin et al. [34];
- The solution quality of the proposed hard computing approach compares favorably with that of existing GCF solution approaches;
- The GCF instances collected in our computational experiment can be used as a standard data set for future benchmark tests.
A limitation of this paper should be noted. Unlike Danilovic and Ilic's part assignment rule [50], the part assignment procedure proposed in this paper is heuristic and therefore does not guarantee an optimal part assignment: the solution quality of assigning one part depends on the assignments of the other parts. Establishing sufficient conditions for optimal part assignment in the GCF problem is therefore an attractive direction for future research.
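To illustrate the kind of greedy rule this limitation refers to, the sketch below assigns one part, over its alternative process plans, to the (plan, cell) pair maximizing intra-cell operations. It is an illustrative stand-in, not the paper's procedure or Danilovic and Ilic's rule; precisely because each part is placed in isolation, such a rule need not be globally optimal.

```python
# Minimal greedy part-assignment sketch for the GCF setting: each
# part has alternative process plans (lists of machine ids), and
# is assigned to the plan/cell pair with the most intra-cell
# operations. Illustrative only; not globally optimal in general.

def assign_part(plans, machine_cell, n_cells):
    """plans: list of process plans, each a list of machine ids.
    machine_cell[m]: cell index of machine m.
    Returns (best plan index, best cell) for one part."""
    best = (-1, 0, 0)  # (intra-cell ops, plan index, cell)
    for p, machines in enumerate(plans):
        for cell in range(n_cells):
            intra = sum(1 for m in machines if machine_cell[m] == cell)
            if intra > best[0]:
                best = (intra, p, cell)
    return best[1], best[2]

# Machines 0-1 in cell 0, machines 2-3 in cell 1.
cells = [0, 0, 1, 1]
# Plan 0 uses machines {0, 2}; plan 1 uses {2, 3}.
print(assign_part([[0, 2], [2, 3]], cells, 2))  # -> (1, 1)
```

Plan 1 wins because both of its operations stay inside cell 1, whereas plan 0 forces an inter-cell transfer under either cell choice.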
Funding: This research received no external funding.