An Integrated Method for Modular Design Based on Auto-Generated Multi-Attribute DSM and Improved Genetic Algorithm

Abstract: Modular architecture is very conducive to the development, maintenance, and upgrading of electromechanical products. In the initial stage of module division, the design structure matrix (DSM) is a crucial tool for concisely expressing the component relationships of electromechanical products through a visual, symmetrical structure. However, in previous studies, product structure modeling, a very important activity, was mostly carried out manually by engineers relying on experience, which was inefficient and made it difficult to ensure the consistency of the model. To overcome these problems, an integrated method for modular design based on auto-generated multi-attribute DSM and improved genetic algorithm (GA) is presented. First, a product information extraction algorithm is designed based on the programming interfaces provided by commercial CAD software, to obtain the assembly, degrees of freedom, and material information needed for modeling. Second, based on the evaluation criteria of product component correlation strength, the structural correlation DSM and the material correlation DSM of components are established, and the comprehensive correlation DSM of the product is obtained through weighting. Finally, the improved GA and the modularity evaluation index Q are used to complete the product module division and obtain the optimal modular granularity. Based on a model from the published literature and a bicycle model, comparative studies are carried out to verify the effectiveness and practicality of the proposed method.


Introduction
To cope with the increasingly fierce market competition, manufacturers need to deal with a series of issues such as product diversity, product customization, shorter product life cycles, and rapidly changing policies and environments [1]. The idea of modularity in product architecture is widely recommended as a corporate strategic decision to deal with these issues all at once [2]. As a scientific methodology supporting product development, modularity is to plan a series of modules (including basic modules and optional modules) for product families or product platform modularity based on the products' structural characteristics [3]. Compared with the conventional product architecture, the advantages of modular architecture are reflected in all stages of the product life cycle [4,5]. Reasonable product module planning is not only conducive to product development, manufacturing, and upgrading but also can effectively reduce the impact on the environment during product service and after scrapping [6,7]. Modularity is recognized as a product development strategy by academia and industry, and the openness of modular architecture makes it play an important role in sustainable product development [8,9]. Modular design concepts and methods have been used in many electromechanical product design processes, such as coffee makers [10], industrial steam turbines [11], wind turbines [12], and large tonnage only obtain the structural relationship between product components. Therefore, this method cannot meet actual needs well when companies need to consider sustainable design factors such as the environment or recycling. The product structure model is the basis of modular design. Modeling efficiency and model accuracy have a far-reaching impact on the popularization and application of modular design methods. 
The existing product structure modeling methods have two shortcomings. First, the process of establishing a product multi-attribute DSM is mostly manual, which is inefficient and error-prone. Second, the existing methods for automatically establishing a product DSM consider only the structural relationship; relying on this single factor does not meet the actual needs of enterprises. Therefore, an integrated method for modular design based on auto-generated multi-attribute DSM and improved GA is proposed in this paper. The main contributions and novelty of this work are as follows: (1) An automatic method for establishing the product multi-attribute DSM is proposed. Based on the product's 3D assembly model and the secondary development tools provided by commercial CAD software, a product information extraction algorithm was developed, and the DSM model was then established from the automatically extracted product information. (2) An improved GA is proposed to realize product module division. The mutation operation of the traditional GA is improved so that the number of initial modules no longer limits the algorithm's ability to reach the global optimal solution.
The remainder of the paper is organized as follows. Section 2 introduces the research framework and related operators, such as information extraction of a product's 3D assembly model, digitization of product information, formation of the comprehensive correlation DSM, and an improved GA-based module division method. Two illustrative case studies are presented in Section 3, and the module division of the gear oil pump and the bicycle is executed to prove the effectiveness of the method proposed in this paper. Discussions and conclusions are given in Sections 4 and 5, respectively.

The Proposed Module Partition Methodology
This section introduces the proposed module partition methodology, as shown in Figure 1, which contains the automatic extraction of 3D model information, the pre-processing of product information, the formation of a comprehensive component correlation DSM of the product, and the module partition method based on improved GA.

[Figure 1. Proposed module partition methodology: CAD assembly model, extracted information (step 1, extract), pre-processed information, comprehensive DSM, improved GA.]

(1) Automatic extraction of 3D model information mainly refers to step 1 in Figure 1. First, the type of product information required is determined according to the modeling requirements; then the appropriate interface function is selected to extract the corresponding product information.
(2) Step 2 in Figure 1 refers to the pre-processing of product information. The product information extracted from the product 3D assembly model is mostly text and contains a large number of software design marks. Therefore, the text information is converted into digital information through pre-processing to improve the efficiency of the subsequent product DSM establishment.
(3) The main purpose of step 3 in Figure 1 is to generate the comprehensive component correlation DSM of the product. Correlation strength evaluation standards are formulated separately for each type of extracted product information. Then, the comprehensive product component correlation DSM is obtained by weighted summation.
(4) The main function of step 4 in Figure 1 is to realize the module division of the product. The modularity index Q is selected as the fitness function to transform the module partition problem into an optimization problem, and the optimal product module partition scheme is then obtained with the improved GA.

Information Extraction Algorithm of Product 3D Assembly Model
A product 3D assembly model contains a large amount of product-related information, such as structure, name, material, and matching relationship [33]. In the design of the extraction algorithm, only the information required to establish the product DSM model needs to be extracted. After analyzing the commonly used modular driving factors and the product information contained in the 3D assembly model, the connection strength and material similarity between components are selected as the main factors of module division. Grouping components with high connection strength into one module is conducive to the assembly and disassembly of the module. The materials of components are closely related to their recycling methods, so components with the same or similar materials are grouped together to facilitate the disposal of the product at scrapping.
This work divides the product information to be extracted into three categories: the basic information of product components, the assembly information of product components, and the material information of product components. The basic information of product components mainly refers to the number and name of product components, which are used to determine the size, row, and column elements of the correlation matrix of product components. The assembly information of product components mainly includes the constraint relationship between components and the degree of freedom of components. The type and quantity of the constraint relationship between components can be used to analyze whether two components are in contact, and the strength of the connection relationship can be determined by the degree of freedom of components. As the name suggests, the material information of a product component refers to the name of the material used to produce the component. The material information of the product component can be used to obtain the disposal method of the component when the product is at the end of the lifecycle.
Most current CAD software, such as SolidWorks, Pro/Engineer, and UG, offers its own standard application programming interfaces (APIs) for customizing software applications. Based on the standardized APIs, the assembly model file is instantiated by selecting the corresponding interface, and the required product information can be obtained directly by accessing the members of the instantiated object. The assembly information extraction algorithm developed in this paper is oriented to SolidWorks assembly model files. SolidWorks supports secondary development in a variety of programming languages, and this paper uses the VB language to develop the information extraction algorithm on the Microsoft Visual Studio platform. A brief description of the main interface members used in the extraction algorithm is given in Table 1. For example, the material information of a component can be read directly by calling the member "GetMaterialPropertyName2" under the "IPartDoc" interface, which gets the names of the material database and the material for the specified configuration.
A common SolidWorks assembly model is a typical tree-like hierarchy, which includes a number of components at multiple levels, as shown in Figure 2 (taking the hydraulic cylinder in Figure 1 as an example). As shown in the bottom half of this figure, a list of features offers a concise and clear description of how the assembly was constructed. The name information of a product component can be extracted directly from the feature tree and the sub-feature tree, and the assembly information of the component is extracted from the "Mates" feature at the bottom of the feature tree and the sub-feature tree. Component materials and degrees of freedom data are acquired by locking the pointer to the corresponding target and extracting the data directly. The pseudo code of the information extraction algorithm is shown in Figure 3.
The extraction algorithm proposed in this paper realizes the automatic acquisition of product assembly information, but it has certain limitations in application and in wider adoption. In terms of application, the algorithm operates on a standard SolidWorks assembly document, and the constraint relationships between components must strictly follow reality. Otherwise, there will be errors in the product DSM data established from the extracted information. In terms of wider adoption, the interfaces and functions in the secondary development tools provided by mature commercial CAD packages differ. When information is to be extracted from other types of 3D assembly models (Pro/E, UG, etc.), the corresponding interface and function information must be updated within the framework of the information extraction algorithm.

Pre-Processing of Extracted Product Information
The original product information extracted from the 3D assembly is mostly in text format and contains a lot of CAD system setting information, which makes it unsuitable for directly constructing the correlation matrix of product components. Pre-processing the extracted product information therefore improves the efficiency of constructing the component correlation matrix.
The main operations in pre-processing the basic product information are deleting the CAD system setting information and adding the sequence information of the product components. Deleting the system setting information keeps the extracted product information consistent with reality and makes it more legible. The sequence information of the product components is added to facilitate the subsequent processing of the matching information and the material information of the product components.
The pre-processing of the assembly information of product components mainly includes two steps: assembly information simplification and text information digitization. In this paper, contact between two components is judged by whether an assembly relationship exists between them. The assembly information extraction algorithm extracts all the assembly relationships between components, which leads to information redundancy and is not conducive to the subsequent DSM construction. Through simplification, only the information on whether an assembly relationship exists between components is retained, and the quantity and type of the assembly relationships are removed. During simplification, the text information is digitized by replacing it with numbers. That is, the numbers 0 and 1 indicate whether there is contact between two components, and an N-dimensional vector indicates the contact relationship between a given component and the remaining components. As for the degree of freedom information of product components, the fixed and floating states of a component are represented by the values 1 and 0, respectively.
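The simplification and digitization of assembly information described above can be illustrated with a short Python sketch. It is not the paper's VB implementation: the function name `contact_vectors`, the `(component, component, mate type)` tuple layout, and the part names are assumptions made for illustration only.

```python
def contact_vectors(names, mates):
    """Collapse a raw mate list into one 0/1 contact vector per component.

    names: ordered component names (hypothetical input format).
    mates: iterable of (component_a, component_b, mate_type) tuples; the
           mate type and the number of mates per pair are discarded.
    """
    index = {name: i for i, name in enumerate(names)}
    n = len(names)
    vectors = [[0] * n for _ in range(n)]
    for a, b, _mate_type in mates:
        if a == b:
            continue  # a component is not recorded as contacting itself
        i, j = index[a], index[b]
        vectors[i][j] = 1  # contact is symmetric
        vectors[j][i] = 1
    return vectors


# Illustrative data only: two mates between the same pair collapse to a single 1.
names = ["part_a", "part_b", "part_c"]
mates = [
    ("part_a", "part_b", "concentric"),
    ("part_a", "part_b", "coincident"),
    ("part_b", "part_c", "concentric"),
]
vectors = contact_vectors(names, mates)
```

Stacking the N vectors row by row gives a symmetric 0/1 matrix, so the same data can serve as the adjacency matrix of the product components.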
The pre-processing of the material information of product components mainly includes two aspects: digitizing the material information and adding material compatibility information. Commercial CAD software divides material types in great detail. For example, the types of ordinary carbon steel include Q225, Q235, and Q245. Therefore, when digitizing material information, it is only necessary to replace the material text with a serial number. Digitization alone, however, loses material compatibility information: two materials may be different yet belong to the same kind of material, such as the carbon steels Q235 and Q245. Therefore, material compatibility information is added during digitization to facilitate the subsequent evaluation of component material similarity.
After pre-processing, the information of each product component consists of the component's name and an (N + 6)-dimensional vector, where N is the number of product components. The first two elements of the vector represent the sequence information of the component, and elements 3 to N + 2 represent the matching information of the component; the degree of freedom information of the component is stored in element N + 3, and the last three elements hold the material information and recycling information.
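As a reading aid, the layout of the (N + 6)-dimensional vector can be expressed as index slices. The sketch below only mirrors the positions stated above; the function name, the field names, and the sample values are hypothetical, and the meaning of the individual sequence and material elements is not specified here.

```python
def split_component_vector(vector, n):
    """Split a pre-processed component vector into its four fields.

    Layout (1-based, as in the text): elements 1-2 sequence information,
    elements 3..N+2 matching (contact) information, element N+3 degree of
    freedom state, last three elements material and recycling information.
    """
    if len(vector) != n + 6:
        raise ValueError("expected a vector of length N + 6")
    return {
        "sequence": vector[0:2],
        "contacts": vector[2:n + 2],
        "dof_state": vector[n + 2],
        "material": vector[n + 3:n + 6],
    }


# Made-up vector for a product with N = 3 components.
fields = split_component_vector([0, 2, 1, 0, 1, 1, 5, 2, 1], n=3)
```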

Formation of the Comprehensive Correlation DSM
By analyzing the extracted product information, the comprehensive correlation DSM of products can be established. The process of establishing product comprehensive correlation DSM is mainly divided into three steps: structural correlation analysis between assembled components, material correlation analysis between assembled components, and comprehensive evaluation of correlation strength between assembled components.

Structure Correlation Analysis between Components
The structural correlation strength analysis between components mainly includes two steps: the construction of the adjacency matrix of components and the evaluation of the structural correlation strength between components.
Construction of the adjacency matrix of product components. During manual modeling, engineers or designers judge whether two product components are in contact according to the physical product, design experience, and engineering data, and establish the adjacency matrix of product components accordingly. This process is easy for people but difficult for computers. Therefore, this work uses the extracted product assembly information to determine the contact relationship between components: when an assembly constraint relationship exists between two components, they are considered to be in contact; otherwise, they are considered not to be in contact.
Evaluation of the structural correlation strength between components. To obtain the structural correlation strengths between product components, a dependency rating criterion based on the degrees of freedom (DoFs) of components is proposed in this paper. Many dependency rating standards have been developed in the literature to assess correlation strengths. Helmer et al. investigated these schemes from the perspective of the consistency and applicability of the evaluation results and explained their limitations [34]. These rating schemes usually judge the difficulty of disassembly based on the type of connection relationship between components, to obtain the strength of the association between them. However, the connection relationship type of the component can neither be directly extracted from the 3D model of the product nor indirectly obtained through the extracted  [35]. The main idea of this method is that as the DoFs increase, the intensity of dependence also increases. Based on the methods mentioned in the literature [24,35], this paper proposes a dependency rating criterion based on the counts of DoFs in which four rating scales from 0 to 1 are introduced to indicate different dependency strengths. Table 2 shows the structure correlation strength evaluation between product assembled components, where S(a,b) refers to the structure correlation strength between component a and component b.

Table 2. Structure correlation strength evaluation between components.
No. | Description | S(a,b)
1 | Self-correlation of the component. |
 | There is no contact relationship between the two components. | 0

Material Correlation Analysis between Components
Sustainability has received more and more attention from researchers in modular design. In the process of product modularization, fully considering the relevance of component materials and grouping components with the same or similar materials into the same module can greatly reduce the environmental impact of the product scrapping process. Table 3 shows the material correlation strength evaluation between assembled components, where M(a,b) represents the material correlation strength between component a and component b.

Table 3. Material correlation strength evaluation between components.
No. | Description | M(a,b)
1 | Self-correlation of the component. | 1
2 | Two components have the same material. | 0.8
3 | The materials of the two components are compatible. | 0.4
4 | The materials of the two components are not compatible. | 0
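The rating rules of Table 3 translate directly into a lookup function. The following is a minimal sketch under two assumptions that go beyond the text: "compatible" is interpreted as "belonging to the same material group", and the component names, material names, and group labels are invented for the example.

```python
def material_correlation(a, b, material_of, group_of):
    """Return M(a,b) following the four rating levels of Table 3.

    material_of: component name -> material name (hypothetical mapping).
    group_of:    material name -> material group; two different materials
                 in the same group are treated as compatible (assumption).
    """
    if a == b:
        return 1.0  # self-correlation
    mat_a, mat_b = material_of[a], material_of[b]
    if mat_a == mat_b:
        return 0.8  # same material
    if group_of[mat_a] == group_of[mat_b]:
        return 0.4  # different but compatible materials
    return 0.0      # incompatible materials


# Illustrative data only.
material_of = {"shaft": "Q235", "gear": "Q245", "cover": "Q235", "seal": "rubber"}
group_of = {"Q235": "carbon steel", "Q245": "carbon steel", "rubber": "rubber"}

m_same = material_correlation("shaft", "cover", material_of, group_of)
m_compatible = material_correlation("shaft", "gear", material_of, group_of)
m_incompatible = material_correlation("shaft", "seal", material_of, group_of)
```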

Comprehensive Assessment of Correlation Strength between Components
Based on the above-mentioned structural and material correlation strength evaluation criteria, the structural correlation strength and material correlation strength between components can be easily obtained. The comprehensive correlation strength between product components is calculated with the commonly used weighted summation method. The weights $w_i$ (i = 1, 2) correspond to the structural correlation strength and the material correlation strength, respectively. Each weight $w_i$ is greater than zero and less than one, and the two weights sum to one. The weight values depend on the design specifications and usage scenarios of the mechanical product and are determined by the product engineer in the relevant engineering field. Accordingly, the comprehensive correlation strength C(a,b) between components a and b can be expressed as follows:

$$C(a,b) = w_1 S(a,b) + w_2 M(a,b)$$

where C(a,b), S(a,b), and M(a,b) represent the comprehensive correlation strength, structure correlation strength, and material correlation strength between components a and b, respectively. $w_1$ and $w_2$ represent the weights of S(a,b) and M(a,b), respectively, which are assigned in light of expert evaluation. According to the calculation result of the comprehensive correlation strength, the correlation matrix C between product components is a square matrix of order k, where k is the number of product components, which is given by

$$C = \begin{bmatrix} C(1,1) & C(1,2) & \cdots & C(1,k) \\ C(2,1) & C(2,2) & \cdots & C(2,k) \\ \vdots & \vdots & \ddots & \vdots \\ C(k,1) & C(k,2) & \cdots & C(k,k) \end{bmatrix}$$
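The weighted summation can be checked with a small numerical sketch. The function name and the 2 x 2 matrices below are made up for illustration; in particular, the off-diagonal structural value 0.5 is a placeholder rather than a value taken from the rating criteria.

```python
def comprehensive_dsm(S, M, w1, w2):
    """Element-wise weighted sum C(a,b) = w1 * S(a,b) + w2 * M(a,b)."""
    if not (0 < w1 < 1 and 0 < w2 < 1) or abs(w1 + w2 - 1.0) > 1e-9:
        raise ValueError("weights must lie in (0, 1) and sum to 1")
    k = len(S)
    return [[w1 * S[a][b] + w2 * M[a][b] for b in range(k)] for a in range(k)]


# Placeholder matrices for two components.
S = [[1.0, 0.5], [0.5, 1.0]]
M = [[1.0, 0.8], [0.8, 1.0]]
C = comprehensive_dsm(S, M, w1=0.8, w2=0.2)
# Off-diagonal entry: 0.8 * 0.5 + 0.2 * 0.8 = 0.56 (up to floating-point rounding).
```

Because S and M are both symmetric with ones on the diagonal and the weights sum to one, C is also symmetric with ones on the diagonal.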

Improved GA-Based Module Division Method
The purpose of modular design is to improve the degree of cohesion of each module while reducing the degree of coupling between modules. Therefore, by constructing a reasonable objective function, the problem of module division can be transformed into an optimization problem. As a classic intelligent evolutionary algorithm, GA is often used to solve combinatorial programming problems. However, as the number of modules and the size of the modules need to be set in advance, the traditional GA is susceptible to the influence of the initial settings and it is difficult to obtain the optimal solution. In this paper, an improved GA is proposed to overcome the disadvantages of the classic GA in cluster solving of modularization. The framework of the improved GA-based module division method is shown in Figure 4. Compared with the traditional GA, the improved GA mainly improves the mutation operation to break through the constraint of the initial number of modules on its global optimization. The improved mutation operation process is shown in the dotted box in Figure 4. The mutation operation of the traditional GA is only a transfer mutation, that is, according to the random number, the components corresponding to the mutation point are divided from the corresponding module before mutation to the corresponding module after mutation. As the value of the random number is constrained by the number of initial modules, product components can only be transferred in existing modules. In order to break through the limitation of the initial modular number on the global optimization of the algorithm, the separation mutation is added on the basis of the transfer mutation, that is, the mutation operation is allowed to generate new modules. By generating new modules through mutation operation, the constraint of the number of initial modules is weakened, and the improved GA has better global optimization ability.

Figure 4. The framework of modular design based on the improved GA.

Encoding and Initial Population Generation
In the improved GA, the position index of each chromosome gene indicates the corresponding product component, and the value of the gene represents the module to which the component belongs. The coding method is illustrated with the hydraulic cylinder in Figure 1. As shown in Figure 5a, the hydraulic cylinder has 9 components, so the chromosome length is 9. The gene positions 1 to 9 correspond to the components of the hydraulic cylinder: lifting lug, piston rod, buffering ring, etc. The gene values 1 to 3 indicate that the hydraulic cylinder is divided into 3 modules and show which components each module contains.
The generation of the initial population mainly consists of three steps. The first step is to set the parameters, including the population size, the length of the chromosomes, and the number of initial modules. The second step is to generate a fixed-length chromosome and assign to each gene a random number within the range of the total number of modules. The last step is to repeat step 2 until the initial population is generated.
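The three steps above can be sketched in a few lines of Python. This is an illustrative sketch rather than the authors' MATLAB code; the function and parameter names are assumptions, and module labels are taken to be the integers 1 to the number of initial modules.

```python
import random


def initial_population(pop_size, n_components, n_modules, rng):
    """Generate pop_size chromosomes; gene i holds the module label of component i."""
    return [
        [rng.randint(1, n_modules) for _ in range(n_components)]
        for _ in range(pop_size)
    ]


# Step 1: set the parameters (9 components and 3 initial modules, as in the
# hydraulic cylinder example; the population size of 4 is arbitrary).
rng = random.Random(0)
population = initial_population(pop_size=4, n_components=9, n_modules=3, rng=rng)
```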

Crossover
The crossover operation gives the improved GA a stronger search ability and lets the offspring inherit the high-quality genes of the parents. Crossover is conducted in the following three steps: (1) According to the length of the chromosome, a one-dimensional array of 0s and 1s is randomly generated. (2) When the array value at the position corresponding to a chromosome gene is 1, the values of the two parent chromosome genes at that position are exchanged. When it is 0, no operation is performed. (3) Starting from the first gene of the parent chromosomes, step (2) is performed until the last gene. An example of the crossover is given in Figure 5b, based on two randomly selected parent chromosomes.
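The mask-based exchange described in steps (1) to (3) corresponds to what is commonly called uniform crossover. A minimal sketch follows; the function name and the sample parents are illustrative, and the mask is returned only so that the result can be inspected.

```python
import random


def mask_crossover(parent1, parent2, rng):
    """Swap the genes of two parents wherever a random 0/1 mask holds a 1."""
    mask = [rng.randint(0, 1) for _ in parent1]   # step (1)
    child1, child2 = list(parent1), list(parent2)
    for i, bit in enumerate(mask):                # step (3): scan every gene
        if bit == 1:                              # step (2): exchange on 1
            child1[i], child2[i] = child2[i], child1[i]
    return child1, child2, mask


rng = random.Random(1)
p1 = [1, 1, 2, 2, 3, 3, 1, 2, 3]
p2 = [3, 2, 1, 3, 2, 1, 3, 2, 1]
c1, c2, mask = mask_crossover(p1, p2, rng)
```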

Mutation
In the improved GA, mutation is a very important operation that helps the algorithm break through the limit of the initial number of modules when searching for the global optimal solution. In this work, mutation is performed by transfer mutation and separation mutation. Transfer mutation moves a component from its original module to another existing module, while separation mutation separates a component from its original module to form a new module. The specific process of the mutation operation is shown in Figure 5c. First, according to the mutation probability, a chromosome to be mutated is randomly selected as the parent. Then, a one-dimensional array of 0s and 1s is randomly generated according to the chromosome length, and the genes at the positions where the array value is 1 are mutated. When the mutated value of a gene is less than or equal to the number of initial modules, the operation is a transfer mutation; otherwise, it is a separation mutation.
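One possible reading of the combined transfer and separation mutation is sketched below. Two details are assumptions not fixed by the text: the mutated value is drawn from 1 to the number of initial modules plus one, and a separated component receives a fresh, previously unused module label so that it always forms a module of its own.

```python
import random


def mutate(chromosome, n_initial_modules, rng):
    """Apply transfer or separation mutation at randomly masked positions."""
    mask = [rng.randint(0, 1) for _ in chromosome]
    child = list(chromosome)
    kinds = {}  # position -> "transfer" or "separation", for inspection
    for i, bit in enumerate(mask):
        if bit == 0:
            continue
        value = rng.randint(1, n_initial_modules + 1)  # assumed value range
        if value <= n_initial_modules:
            child[i] = value  # transfer into an existing module
            kinds[i] = "transfer"
        else:
            # separation: open a new module with an unused label
            child[i] = max(max(child), n_initial_modules) + 1
            kinds[i] = "separation"
    return child, kinds


rng = random.Random(2)
parent = [1, 2, 3, 1, 2, 3, 1, 2, 3]
child, kinds = mutate(parent, n_initial_modules=3, rng=rng)
```

With this design, the number of modules in a chromosome can grow beyond the initial setting, which is the effect the separation mutation is meant to achieve.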

Modular Object Adaptive Function
The role of the modular object adaptive function (the fitness function) is to calculate the fitness of each chromosome during the iterations of the genetic algorithm and to give better chromosomes more opportunities to be selected. Therefore, the criterion for selecting this function is whether it can effectively evaluate the pros and cons of a module division scheme. In related research, scholars have proposed many indexes to evaluate product module division schemes, such as the partition coefficient (PC(c)) [36], the modularity index (MI) [37], the minimum description length [38], the integrative complexity (IC) [39], and the modularity assessment index (Q) [15]. Among them, the modularity Q, an index introduced from complex network theory, has been popular in the field of modular design in recent years. Therefore, this paper uses the modularity index Q as the objective function to calculate the fitness of each chromosome. To facilitate the calculation of the value of Q, a module matrix e is constructed as:

$$e = \begin{bmatrix} y_{11} & y_{12} & \cdots & y_{1k} \\ y_{21} & y_{22} & \cdots & y_{2k} \\ \vdots & \vdots & \ddots & \vdots \\ y_{k1} & y_{k2} & \cdots & y_{kk} \end{bmatrix}$$

where $y_{ii}$ represents the fraction of intra-module interfaces while $y_{ij}$ ($i \neq j$) represents the fraction of inter-module interfaces. For the module matrix e, the row sum is written as:

$$a_i = \sum_{j=1}^{k} y_{ij}$$

The modularity index Q is defined as:

$$Q = \sum_{i=1}^{k} \left( e_{ii} - a_i^2 \right)$$

where $e_{ii} = y_{ii}$ represents the fraction of edges with both end vertices in the same module i, and $a_i$ represents the fraction of edges with at least one end vertex inside module i. k refers to the number of modules in the modular scheme. The value of the modularity index Q lies between −1 and 1. The larger the value, the more reasonable the result of the module division, and vice versa.
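To make the fitness evaluation concrete, the sketch below computes Q for a chromosome from a weighted DSM, assuming the usual form from complex network theory, Q = sum over modules of (e_ii − a_i^2). Treating DSM entries as edge weights, ignoring the diagonal (self-correlation), and normalizing by the total off-diagonal weight are assumptions of this sketch, not details confirmed by the text.

```python
def modularity_q(dsm, assignment):
    """Compute Q = sum_i (e_ii - a_i^2) for a module assignment.

    dsm:        symmetric square matrix of correlation strengths.
    assignment: assignment[c] is the module label of component c.
    """
    labels = sorted(set(assignment))
    pos = {label: i for i, label in enumerate(labels)}
    k = len(labels)
    e = [[0.0] * k for _ in range(k)]
    total = 0.0
    n = len(dsm)
    for a in range(n):
        for b in range(n):
            if a == b:
                continue  # self-correlation is ignored (assumption)
            weight = dsm[a][b]
            e[pos[assignment[a]]][pos[assignment[b]]] += weight
            total += weight
    if total == 0:
        return 0.0  # no interfaces at all
    q = 0.0
    for i in range(k):
        a_i = sum(e[i]) / total          # row sum of the module matrix
        q += e[i][i] / total - a_i ** 2  # e_ii - a_i^2
    return q


# Toy DSM: components 0-1 and 2-3 form two unconnected pairs.
dsm = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
q_split = modularity_q(dsm, [1, 1, 2, 2])   # two modules matching the pairs
q_single = modularity_q(dsm, [1, 1, 1, 1])  # everything in one module
```

In this toy case the matching two-module split scores Q = 0.5, while the single-module scheme scores Q = 0, which is consistent with larger Q indicating a more reasonable division.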
Based on the product information automatic extraction algorithm and the component correlation strength evaluation standard proposed in Sections 2.1 and 2.2, the structure correlation DSM and material correlation DSM of the gear oil pump are automatically constructed and shown in Figure 7a,b, respectively. Then, the weights w1 = 0.8 and w2 = 0.2 are taken to obtain the comprehensive correlation DSM of the gear oil pump, as shown in Figure 7c.
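The weighting step can be illustrated with a short Python sketch. It assumes that the "weighting processing" is an element-wise linear combination of the two DSMs, which is the simplest reading of the text but is not confirmed by this section. The function name `combine_dsm` and the 2 × 2 matrices are hypothetical and are not taken from Figure 7.

```python
def combine_dsm(structural, material, w1=0.8, w2=0.2):
    """Element-wise weighted sum of two DSMs of identical shape.

    Assumes the comprehensive DSM is w1 * structural + w2 * material.
    """
    if len(structural) != len(material):
        raise ValueError("DSMs must have the same number of rows")
    combined = []
    for row_s, row_m in zip(structural, material):
        if len(row_s) != len(row_m):
            raise ValueError("DSM rows must have the same length")
        combined.append([w1 * s + w2 * m for s, m in zip(row_s, row_m)])
    return combined


# Hypothetical two-component example (values are illustrative only).
structural_dsm = [[1.0, 0.8], [0.8, 1.0]]
material_dsm = [[1.0, 0.4], [0.4, 1.0]]
comprehensive_dsm = combine_dsm(structural_dsm, material_dsm)
```

Here the off-diagonal entry becomes 0.8 × 0.8 + 0.2 × 0.4 = 0.72, so the structural correlation dominates the combined value, in line with the chosen weights.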
Based on the comprehensive correlation DSM in Figure 7c, the improved GA given in Section 2.3 is implemented to obtain the optimal module division scheme of the gear oil pump. The improved GA is programmed in a MATLAB 9.0 environment and runs on a desktop computer with a dual 2.63 GHz Intel i5 processor and 8 GB RAM. The parameter settings of the improved GA are shown in Table 4. The optimization process is shown in Figure 8: after about 110 iterations, the fitness function reaches its maximum and begins to converge. The value of the modularity index Q at this point is about 0.334, and the corresponding modularization scheme divides the gear oil pump into 3 modules. The specific details of the modular scheme are shown in Table 5.
In reference [40], the authors establish a multi-attribute network model of the gear oil pump by comprehensively considering the structural correlation, functional correlation, and flow correlation of components, and then use the Fast Newman algorithm to identify the community structure in the correlation network. Finally, the modular index Q is employed to evaluate the quality of the module division schemes and select the best one. In product structure modeling, both that method and the one proposed in this paper take structural association as the main driving factor of module division while considering different secondary driving factors. For module division, they use the Fast Newman algorithm and the improved GA, respectively, to obtain the optimal scheme with the modular index Q as the evaluation criterion. The two methods thus use similar product structure models, different module division algorithms, and the same evaluation index, and they obtain the same module division result for the gear oil pump. The comparison demonstrates the feasibility of the module division method proposed in this paper.

Case Study for Bicycle
In this section, a bicycle assembly model is used to further verify the effectiveness of the proposed method. The bicycle is a common vehicle in daily life, and its 3D assembly model is shown in Figure 9. The left side of the figure shows that the bicycle structure contains 23 components, and the right side lists the names and materials of these components.
In the same way as for the gear oil pump, based on the product information automatic extraction algorithm and the component correlation strength evaluation standard proposed in Sections 2.1 and 2.2, the structure correlation DSM and material correlation DSM of the bicycle components are automatically constructed and shown in Figure 10a,b, respectively. Then, the weights w1 = 0.8 and w2 = 0.2 are taken to obtain the comprehensive correlation DSM of the bicycle components, as shown in Figure 10c.
Based on the comprehensive correlation DSM in Figure 10c, the improved GA given in Section 2.3 is implemented to obtain the optimal module division scheme of the bicycle. The parameter settings of the improved GA are shown in Table 6. The optimization process is shown in Figure 11: after about 170 iterations, the fitness function reaches its maximum and begins to converge. The value of the modularity index Q at this point is about 0.512, and the corresponding modularization scheme divides the bicycle into 7 modules. The specific details of the modular scheme are shown in Table 7. From the perspective of structure and recycling, the bicycle module division obtained from the product 3D model information is reasonable. In terms of bicycle structure, each module has a compact structure and relatively independent functions. For example, the power input module is composed of the left pedal, right pedal, middle axle, and chain ring. The assembly relationship between these components is close, and their materials have good compatibility. The modular design method proposed in this paper only provides enterprises with a quickly generated modular scheme. For the specific implementation of the scheme, enterprises need to make appropriate adjustments according to their scale, customer type, and other factors.
Table 6. Improved GA parameter setting (the bicycle).

| Parameter | Value |
| --- | --- |
| Initial population | 200 |
| Generation time | 300 |
| Crossover rate | 0.8 |
| Mutation rate | 0.2 |
| Number of initial modules | 5 |

Figure 11. The optimization process of the improved GA (the bicycle).

Discussion
This section discusses the proposed method from two aspects: modeling and algorithm. The modeling discussion compares automatic modeling with manual modeling to show the advantages of the former. The algorithm discussion compares the proposed improved genetic algorithm with the traditional genetic algorithm to verify its advantage in global optimization.

Comparison with Manual Modeling
Compared with manual modeling, the automatic modeling method proposed in this paper has significant advantages in terms of efficiency, and the time spent on modeling is relatively less affected by the complexity of the model. In addition, the consistency of the automatically established product DSM model is not affected by the modeling operator, and errors to which manual modeling is prone, such as missing or incorrect entries, can also be avoided.
Automatic modeling has great advantages in efficiency, consistency, and accuracy, but its ability to automatically extract product information is limited. At present, the algorithm in this paper can only extract structure information and material information automatically. It cannot obtain data that are not contained in the product assembly model, such as product function information and maintenance information.

Comparison with Traditional GA
The motivation of the improved GA proposed in this paper is to remove the limitation that the number of modules set at initialization imposes on the traditional GA during global optimization of the population. Therefore, when the optimal number of modules is less than or equal to the initial number of modules, the optimal module division schemes obtained by the two algorithms are consistent. For example, in the case of the gear oil pump, the initial number of modules is 4, and the optimal number of modules obtained by both algorithms is 3 when the modularity index Q is maximum.
The discussion in this section focuses on the advantages shown by the improved GA when the optimal number of modules is greater than the initial number of modules. In case 2, the initial number of modules for the bicycle is 5, and the optimal number of modules obtained is 7. With the same initial parameters and settings, the bicycle in case 2 is divided into modules using the traditional GA. The optimization process is shown in Figure 12: after about 140 iterations, the fitness function reaches its maximum and begins to converge. The value of the modularity index Q at this point is about 0.508, and the corresponding modularization scheme divides the bicycle into 5 modules. Since this value is lower than the Q of about 0.512 reached by the improved GA with 7 modules, the traditional GA fails to reach the global optimal solution because of the constraint of the number of initial modules. The comparison of bicycle module division schemes under the two algorithms is shown in Figure 13. The solution obtained by the improved GA is significantly better than the solution obtained by the traditional GA in terms of the compactness of the modules and the uniformity of the modular granularity.

Conclusions and Future Work
Product structure modeling and the solution of module division schemes are two important activities in the process of product modular design. The product structure modeling methods commonly used in existing research suffer from low modeling efficiency and poor consistency of modeling results. As for the solution of the module division scheme, intelligent swarm optimization algorithms are usually used, but these algorithms are susceptible to the limitation of initial parameters when searching for the optimal solution. To solve these problems, an integrated method is developed to identify the modular structure of a product based on auto-generated multi-attribute DSM and an improved GA. Compared with published related work, the innovations of this article are mainly reflected in two aspects. The first is the way product information is obtained. Unlike the traditional modeling method, which mainly relies on consulting product manuals and interviewing design engineers, this article uses information extraction algorithms to extract product information from product 3D assembly models. The second is the mutation operation of the GA. On the basis of the transfer mutation of the traditional GA, separation mutation is added to improve the algorithm's global search ability. The gear oil pump in Han et al.'s paper [40] is used as a case to demonstrate the effectiveness of the method and the reliability of the module division results. In addition, a bicycle is used as a new case to further show that the method generalizes in application. The main contributions of this paper are summarized as follows.
(1) An automatic construction method of product multi-attribute DSM is developed based on the automatic extraction algorithm of product 3D assembly model information.
(2) An improved GA is proposed to solve the problem that the optimal solution is easily affected by the initial parameters in the process of product module division.
Future work will mainly focus on integrating the proposed modular design method with other advanced technologies (mass customization, digital twins, etc.) in the environment of Industry 4.0. For example, this method can promote the intelligence of mass customization production.