Robustness Optimization of Cloud Manufacturing Process under Various Resource Substitution Strategies

Abstract: Cloud manufacturing is characterized by large uncertainties and disturbances due to its networked, distributed, and loosely coupled features. To target the problem of frequent cloud resource node failure, this paper proposes (1) three resource substitution strategies based on node redundancy and (2) a new robustness analysis method for cloud manufacturing systems based on a combination of the complex network and multi-agent simulation. First, a multi-agent simulation model is constructed, and simulation evaluation indexes are designed to study the robustness of the dynamic cloud manufacturing process (CMP). Second, a complex network model of cloud manufacturing resources is established to analyze the static topological robustness of the cloud manufacturing network. Four types of node failure modes are defined, based on the initial and recomputed topologies. Further, three resource substitution strategies are proposed (i.e., internal replacement, external replacement, and internal–external integration replacement) to enable the normal operation of the system after resource node failure. Third, a case study is conducted for a cloud manufacturing project of a new energy vehicle. The results show that (1) the proposed robustness of service index is effective at describing the variations in CMP robustness, (2) the two node failure modes based on the recomputed topology are more destructive to the robustness of the CMP than the two based on the initial topology, and (3) under all four failure modes, all three resource substitution strategies can improve the robustness of the dynamic CMP to some extent, with the internal–external integration replacement strategy being most effective, followed by the external replacement strategy, and then the internal replacement strategy.


Introduction
In the era of Industry 4.0, with the rapid development of Internet technology, information technology, and manufacturing technology, the traditional large-scale manufacturing mode is gradually being replaced by customized service modes. A series of advanced networked manufacturing modes, such as application service providers, manufacturing grids, agile manufacturing, and global manufacturing, have been proposed successively [1]. In this context, Li et al. [2] introduced the concept of "cloud manufacturing", a new service-oriented networked manufacturing mode that gathers manufacturing resources and capabilities together on the cloud platform and thereby escapes the limitations of space and distance; through service integration, the sharing of manufacturing resources and capabilities is fully realized [3]. Since its inception, cloud manufacturing has attracted widespread attention because of its advanced ideas and technical concepts [4].
The cloud manufacturing mode extends the manufacturing environment to multiple user subjects, service subjects, and geographical spaces. As such, it faces a high level of uncertainty and disturbance [5]. The cloud manufacturing system (CMS) can reduce

Multi-Agent Simulation Study of Cloud Manufacturing System
The existing research on the robustness of advanced manufacturing systems mostly uses the complex network analysis method, which is well suited to reflecting the structural characteristics of the system. Because cloud manufacturing is a type of networked manufacturing mode, its structural characteristics are a key topic in robustness research; however, the dynamic operation process, logical judgment, and dynamic temporal relationships among entities are also aspects of cloud manufacturing that deserve attention.
The CMS contains a variety of entities, and various forms of behavior interaction and information transfer exist both among entities of the same type and among entities of different types, which gives the CMS the characteristics of a complex system. Analyzing this complexity through simple, intelligent, and autonomous entities, such as agents in multi-agent systems, is considered an appropriate approach to address this challenge in industrial scenarios [32]. Further, the multi-agent simulation modeling of the CMS is presently a popular research topic. For example, Zhao et al. designed and implemented an agent-based cloud manufacturing simulation platform, where the simple reflective agent was used to encapsulate the resources and the complex agent was used to encapsulate the services. This gave the cloud platform a five-layer architecture (i.e., the data layer, low tool layer, management layer, upper tool layer, and application layer). Dobrescu et al. [33] proposed a cloud simulation platform to provide computing resources and services for the hybrid simulation of virtual manufacturing systems, and Chen and Chiu [34] developed a cloud-based factory simulation experiment system. Zhang et al. [35] analyzed typical smart manufacturing simulation techniques from three aspects: manufacturing unit simulation, manufacturing integration simulation, and manufacturing intelligence simulation. In addition, some scholars have conducted multi-agent simulation modeling research on cloud manufacturing from multiple perspectives, such as cloud service entity encapsulation [36,37], selection and scheduling [38–40], and trust and security issues [5,8], among others.
Multi-agent simulation has become an important tool for cloud manufacturing research, yet current multi-agent simulation research on the robustness of cloud manufacturing appears relatively rare. This paper proposes a robustness analysis method that combines the complex network and multi-agent simulation to investigate the optimization and enhancement of CMS robustness under multiple resource substitution strategies. The complex network perspective can reflect the structural robustness of CMSs, while multi-agent simulation can consider the process robustness of CMSs from multiple dimensions, such as time, cost, quality, and reliability. The combination of these two perspectives extends the robustness analysis object of the CMS from the CMN to the CMP, thereby realizing the dual-dimensional analysis of the static structure and dynamic process of CMS robustness.
The rest of this paper is arranged as follows. Section 3 constructs a multi-agent simulation model of the CMP and proposes process robustness indexes from the simulation perspective. Section 4 establishes a complex network model of cloud manufacturing resources and selects structural robustness metrics from the network perspective. Section 5 defines four types of resource node failure modes and three types of resource replacement strategies to deal with resource failure. Section 6 conducts a case study, combining the multi-agent simulation software AnyLogic and Python 3.0 tools to study the changes in the robustness of cloud manufacturing under different failure modes. Section 7 provides the research conclusions and future prospects.

Construction of Multi-Agent Simulation Model
The CMS contains the cloud platform, cloud task, cloud resource, cloud message, cloud order, and other types of subjects, along with two different types of user roles: cloud service providers and cloud demanders [41]. The CMP [2] basically entails the following steps, as indicated in Figure 1:
(1) Through information transformation, resource sensing, resource access, unified modeling of cloud services, and other technologies, cloud service providers integrate different kinds of manufacturing equipment and manufacturing capability resources into the cloud platform and deposit them into the cloud resource pool. This allows globally distributed resources to be managed and shared centrally, thereby circumventing spatial and geographical limitations.
(2) Using terminal devices, cloud demanders submit their requests for services (i.e., orders) to the cloud platform. The cloud demand set uniformly stores the orders awaiting processing from the various cloud demanders.
(3) In accordance with the service route of the order to be processed, the cloud platform integrates and adapts various cloud tasks to create structured and reliable cloud task sequences.
(4) To perform cloud manufacturing services, the platform imports each order into the appropriate cloud task sequence whenever the cloud demand set is not empty. While cloud tasks are processed, the appropriate resources are requested from the resource pool based on the task type. Once requested, a resource in an idle state transitions to a busy state. The resource is released and returned to an idle state once the task has been finished.
Multiple entity types are present in the CMS, and numerous forms of information transmission and behavior interactions occur among entities of the same and different types. Consequently, the CMS model can be stated as follows:
$$CMS = \{PA, DA, SA, TA, RA, OA, MA, E\} \tag{1}$$
where PA represents the cloud platform agent; DA represents the cloud demander agent; SA represents the cloud service agent; TA represents the cloud task agent; RA represents the cloud resource agent; OA represents the order agent issued by DA; MA represents the message agent sent to SA when TA requests or releases resources; and E represents the external environment of information transmission and inter-entity behavior interaction.

Modeling of Cloud Resource Agent
Through information transformation, resource access, cloud service unified modeling, and other technologies, service providers' manufacturing equipment and manufacturing capacity resources are integrated into the cloud resource pool to create virtual resources known as cloud resources. The primary function of the cloud resource agent is to collaborate with cloud tasks to finish the processing of cloud orders:
$$RA_0 = \langle \mathit{ID}, \mathit{produceLevel}, \mathit{busy}, \mathit{broken}, \mathit{owner}, \mathit{price} \rangle \tag{2}$$
where ID represents the resource's unique identification number; produceLevel is an integer ranging from 1 to 10 that specifies the resource's productivity level; busy indicates whether the resource is in a busy condition; broken indicates whether the resource is faulty; owner identifies which cloud server the resource belongs to; and price denotes the resource's cost, which is randomly generated from a normal distribution at model startup.
Considering the resource substitution strategies developed in this paper in the face of resource failure (see Section 5.2 for details), it is necessary to extend the attributes of cloud resource agents:
$$RA = \langle RA_0, \mathit{replace\_Resource}, \mathit{replace\_Rate} \rangle$$
where replace_Resource specifies the alternative resource type $R_j$ for each resource type $R_i$, with this model assuming that $R_i$ and $R_j$ are mutually substitutive; and replace_Rate is the replacement rate (i.e., matching rate) of the replacement resource. Although the substitute resource can replace the original resource to complete the established cloud task, the total work time increases. The resource replacement rate is generated as a normally distributed random number at the time of model initialization.
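The attribute list above maps naturally onto a small record type. The following is a minimal Python sketch, not the paper's AnyLogic implementation: the helper name `make_resource`, the distribution parameters, the clamping of the matching rate into (0, 1], and the random choice of produceLevel are all illustrative assumptions.

```python
import random
from dataclasses import dataclass
from typing import Optional


@dataclass
class ResourceAgent:
    """Cloud resource agent carrying the attributes named in the text."""
    ID: int
    produceLevel: int                        # integer productivity level, 1..10
    owner: int                               # ID of the owning cloud server
    price: float                             # cost, drawn once at model startup
    busy: bool = False
    broken: bool = False
    replace_Resource: Optional[str] = None   # substitute resource type R_j
    replace_Rate: float = 1.0                # matching rate of the substitute


def make_resource(rid, owner, substitute, rng,
                  price_mean=100.0, price_sd=10.0,   # assumed parameters
                  rate_mean=0.8, rate_sd=0.05):      # assumed parameters
    """Initialise one resource with normally distributed price and rate."""
    price = max(0.0, rng.gauss(price_mean, price_sd))
    # Assumption: the matching rate is kept inside (0, 1].
    rate = min(1.0, max(0.01, rng.gauss(rate_mean, rate_sd)))
    return ResourceAgent(ID=rid, produceLevel=rng.randint(1, 10), owner=owner,
                         price=price, replace_Resource=substitute,
                         replace_Rate=rate)


rng = random.Random(0)
r = make_resource(1, owner=3, substitute="R2", rng=rng)
```

Passing a seeded `random.Random` instance keeps the initialisation reproducible across simulation runs.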

Modeling of Cloud Task Agent
Appl. Sci. 2023, 13, 7418
The development of the cloud task agent is essential to cloud manufacturing simulation modeling. It encompasses both the behavior interaction and information transfer between the cloud server agent and the cloud resource agent, in addition to the processing path for all order kinds (e.g., serial, parallel, and hybrid paths). The cloud task agent further provides (a) the mechanism for choosing the best service provider, (b) numerous statistical data, including the service cycle and cost, and (c) information on the cloud task and cloud resource nodes. The existing process modeling library components are modified accordingly to accomplish this goal. The following is a description of the cloud task agent:
$$TA_0 = \langle \mathit{ID}, \mathit{owner\_Orders}, \mathit{pre\_taskList}, \mathit{after\_taskList}, \mathit{reque\_resourceList}, \mathit{basicWorkingTime}, \mathit{currentOrder}, Func_{selectBestServer}, Func_{selectBestResource}, Func_{recordRouteStamp}, Func_{recordTaskTime}, Func_{recordTaskCost}, Func_{recordTaskReliability} \rangle$$
where ID represents the task's unique identification number; owner_Orders identifies which type of order processing path the task belongs to; pre_taskList and after_taskList designate the pre-order and post-order tasks, respectively; reque_resourceList indicates the resource type requested by the task; basicWorkingTime specifies the average time required to complete a task; currentOrder indicates the order being processed at the moment; $Func_{selectBestServer}$ identifies the best server by considering the resource cost, logistics distance, and other variables; $Func_{selectBestResource}$ decides the best resource; $Func_{recordRouteStamp}$ records the order-task and task-resource relationships for finished orders; and $Func_{recordTaskTime}$, $Func_{recordTaskCost}$, and $Func_{recordTaskReliability}$ record, respectively, the service time, the service cost, and the service reliability of the present task.
Again, it is necessary to extend the attributes of cloud task agents:
$$TA = \langle TA_0, \mathit{is\_internalReplace}, \mathit{internal\_replaceResource}, \mathit{internal\_replaceRate}, \mathit{internal\_replaceSerial}, \mathit{is\_externalReplace}, \mathit{external\_Partner}, \mathit{external\_replaceSerial}, Func_{calcuWorkingTime} \rangle$$
where is_internalReplace indicates whether the current cloud task process has invoked the internal resource replacement strategy; internal_replaceResource specifies the replacement resource selected under the internal replacement strategy; internal_replaceRate is the resource matching rate of the internal replacement resource; internal_replaceSerial is an integer from 1 to 3 indicating which specific case of internal replacement the current process belongs to; is_externalReplace indicates whether the current cloud task process has invoked the external resource replacement strategy; external_Partner indicates the specific external server with which to cooperate under the external resource replacement strategy; external_replaceSerial is an integer from 1 to 3 indicating which specific case of external replacement the current process belongs to; and $Func_{calcuWorkingTime}$ calculates the cloud task time under the current resource replacement strategy.
Figure 2 depicts the comprehensive CMP simulation implemented within the cloud task agent by modifying and adapting existing component codes from AnyLogic's process modeling library. The procedure is as follows:
(1) By means of the enter component, the order is imported into the cloud task's internal procedure. The order is immediately assigned by the cloud platform if the current task is the first in the task sequence; if not, the preceding task assigns the order after it has been finished (e.g., task 2 orders are assigned by task 1 after task 1 is finished).
(2) The queue component temporarily stores the current order while the following determinations are made: (a) if the current task is first in the task sequence or there is only one task in the previous task sequence, the hold and hold1 components are opened concurrently, and the current order is entered into queue2 for further processing; or (b) if there are multiple tasks in the previous task sequence, the current order must wait until the orders of all previous tasks have been processed before entering queue2 for further processing.
(3) The queue1 component combines the information of multiple branch orders, whereas the hold2 component ensures that only a single order is entered for subsequent processing at any given time. Hold2 reopens and proceeds to serve the next order when the current order is fulfilled and exits through exit.
(4) When the order enters queue3, the task agent selects the optimal service provider and sends "resource request" information to it. When the optimal service provider accepts the request, it chooses to adopt or not adopt the resource replacement strategy according to whether the target resource is faulty. If the resource substitution strategy is adopted, it is necessary to further select which resource substitution strategy to adopt. The busy attribute of the corresponding optimal resource is changed to "true", the hold3 component opens, and the order flows through the delay component to simulate the cloud manufacturing service. After a certain delay time, the service is completed.
(5) The order is placed in queue4 and the "release resource" message is sent to the best server. When the best server acknowledges the message, the busy attribute of the best resource changes from "true" to "false", and the hold4 component opens. The order traverses the delay1 component, and the release of the resource is accomplished following a predetermined delay period.
(6) The order passes through the exit component to conclude all of its service procedures for this task. It is then imported into the post-order task sequence of this task: (a) if there is only one post-order task, it is imported instantly into the enter component of the post-order task; (b) if there are numerous post-order tasks, the information of the current order is copied and brought into the enter component of the corresponding post-order tasks; and (c) if there is no post-order task, this indicates that the task is already the last task in the task sequence. As a result, the order is included in the group of completed orders, and data such as the service cycle, service cost, and route record are tallied and output.

Modeling of Cloud Server Agent
The primary function of cloud servers is to convey information and interact with cloud tasks. Each cloud server's resource pool holds all kinds of cloud resources. After receiving the "request resource" message, the server locates the appropriate resource in its resource pool and allots it to the cloud task. After receiving the "release resource" message, the server releases the associated cloud resource and returns it to the cloud resource pool. The cloud server agent can be described as follows:
$$SA = \langle \mathit{ID}, \mathit{location}, \mathit{resourcePool}, \mathit{dScore}, \mathit{pScore}, \mathit{totalScore}, Func_{configureResource}, Func_{common}, Func_{inter\_replace}, Func_{exter\_replace}, Func_{inter\&exter} \rangle$$
where ID represents the cloud service provider's unique identification number; location refers to the server's latitude and longitude coordinates, which are used to initialize the server's location on the GIS map; resourcePool stores the virtual resources of the cloud server; dScore, pScore, and totalScore are the distance score, price score, and total score used when the cloud task chooses the best cloud server; and $Func_{configureResource}$ manages and distributes resources when cloud task messages are received. If the message is "release resource", the server locates the corresponding resource and sets its busy attribute to "false". If the message is "request resource", the broken attribute of the corresponding resource is examined before any other property:
(1) If the value of the broken attribute is "false", the resource is not faulty, so its busy attribute is examined next: (a) if the busy attribute is "false", cloud order processing can start; and (b) if the busy attribute is "true", the resource is being used by other tasks, so the order must wait until those tasks have been completed before its processing starts.
(2) If the value of the broken attribute is "true", the resource is faulty and cannot complete the processing of the corresponding cloud task. In this case, according to the experimental settings, one of the following strategies is chosen: the no-resource replacement strategy, the internal resource replacement strategy, the external resource replacement strategy, or the internal-external integrated strategy.
The four strategy functions are defined as follows. $Func_{common}$ is the no-resource replacement strategy: when the requested resource fails, the order requesting processing is added to the failed order set. $Func_{inter\_replace}$ is the internal resource replacement strategy: when the requested resource fails, a replacement resource is found for the failed resource within the current service provider. $Func_{exter\_replace}$ is the external resource replacement strategy: when the requested resource fails, a resource of the same type as the failed resource is requested from other cloud service providers. $Func_{inter\&exter}$ combines internal and external resource replacement: when the requested resource fails, the priority is to find a replacement resource within the service provider; if the internal replacement resource also fails, external resources are sought from other service providers.
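The decision logic for a "request resource" message can be sketched as a small dispatch routine. This is a hedged Python illustration rather than the authors' AnyLogic code: the class and function names, the strategy labels, the rule of taking the first healthy external partner, and the treatment of a failed substitution as a failed order are assumptions, and busy-state waiting is omitted for brevity.

```python
from dataclasses import dataclass, field


@dataclass
class Resource:
    rtype: str
    broken: bool = False
    busy: bool = False
    replace_Resource: str = ""      # substitute resource type


@dataclass
class Server:
    ID: int
    resourcePool: dict = field(default_factory=dict)   # rtype -> Resource

    def usable(self, rtype):
        """Return the resource of this type if it exists and is not broken."""
        res = self.resourcePool.get(rtype)
        return res if res is not None and not res.broken else None


def inter_replace(server, rtype):
    """Internal replacement: use the substitute type inside the same server."""
    failed = server.resourcePool.get(rtype)
    if failed is None:
        return None
    return server.usable(failed.replace_Resource)


def exter_replace(server, rtype, others):
    """External replacement: same resource type from another server."""
    for other in others:
        if other.ID != server.ID:
            res = other.usable(rtype)
            if res is not None:
                return res          # assumption: first healthy partner wins
    return None


def configure_resource(server, rtype, strategy, others, failed_orders, order):
    """Pick a resource for `order`; return None if the order fails."""
    res = server.usable(rtype)
    if res is None:
        if strategy in ("inter", "inter&exter"):
            res = inter_replace(server, rtype)
        if res is None and strategy in ("exter", "inter&exter"):
            res = exter_replace(server, rtype, others)
    if res is None:
        failed_orders.append(order)   # order joins the failed order set
    return res


s1 = Server(1, {"R1": Resource("R1", broken=True, replace_Resource="R2"),
                "R2": Resource("R2", broken=True)})
s2 = Server(2, {"R1": Resource("R1")})
failed = []
a = configure_resource(s1, "R1", "common", [s1, s2], failed, "order-1")
b = configure_resource(s1, "R1", "inter", [s1, s2], failed, "order-2")
c = configure_resource(s1, "R1", "inter&exter", [s1, s2], failed, "order-3")
```

In this toy setup both R1 and its substitute R2 are broken on server 1, so only the integrated strategy succeeds, by falling back to server 2.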

Multi-Agent Simulation-Based Robustness Evaluation Indicator
Based on the multi-agent model in Section 3.1 and the order-task sequence, the dynamic simulation of the CMP can be realized, and result data (e.g., the order completion time, logistics transportation distance, and resource occupation) can be output to evaluate the performance of the CMP. Quality of service (QoS) is commonly used in academia to evaluate the CMP, with QoS values mostly being evaluated from multiple dimensions, such as time, cost, and reliability [43–45]. Referring to the definition of QoS, this paper proposes robustness of service (RoS) as a simulation-based robustness indicator built on four dimensions: service time, service cost, reliability, and order completion rate. The specific calculation formulae for these four dimensions are as follows:
(1) Service time
The service time is the sum of all orders' completion times during the simulation cycle:
$$T = \sum_{j=1}^{m} t_j$$
where m denotes the overall number of orders, j = 1, 2, ..., m denotes the jth order in the order sequence, and $t_j$ denotes the finish time of the jth order, which can be derived from the results of the simulation.
(2) Service cost
The overall cloud service cost consists of the cloud resource service fee, the logistics service fee, and the cloud resource release fee, and is determined as follows:
$$C = \sum_{j=1}^{m} \sum_{i=1}^{n_j} \left( t^{serving}_{i,j}\, p^{resource}_{i,j} + d_{i,j}\, c_{logistic} + t^{releasing}_{i,j}\, p_{release} \right) \tag{12}$$
where m represents the total number of orders and $n_j$ represents the number of tasks associated with the jth order; j = 1, 2, ..., m is the jth order in the order sequence; i = 1, 2, ..., $n_j$ is the ith task in the task sequence; $t^{serving}_{i,j}$ represents the cloud service time of the ith task in the jth order; $p^{resource}_{i,j}$ represents the service cost per unit time of the resource related to the task; $d_{i,j}$ represents the logistics distance related to the task; $c_{logistic}$ represents the logistics cost per unit distance; $t^{releasing}_{i,j}$ represents the release time of the cloud resources for the task; and $p_{release}$ represents the cost per unit time of releasing resources.
(3) Service reliability
The service reliability is measured by a multiplicative index [44] over the task-level values $rel_{i,j}$, where $rel_{i,j}$ represents the service reliability of the ith task in the jth order, as specified in the order agent's reliabilityAccum property.
The evaluation of the CMP's robustness in this paper also draws on the definition of robustness, namely the ability of the CMS to operate normally and maintain its original performance in the face of various unexpected disruptions and interruptions. For the CMS, the normal completion of cloud orders within a service cycle reflects its normal operation capability most intuitively. Therefore, the order completion rate index is introduced:
(4) Order completion rate
The order completion rate is the ratio of the orders actually completed to the total scheduled orders throughout the simulation cycle:
$$\text{Order completion rate} = \frac{N_1}{N_1 + N_2}$$
where $N_1$ represents the number of orders fulfilled throughout the simulation cycle and $N_2$ represents the number of incomplete orders. In summary, combining the definitions of robustness and QoS, this paper proposes a new robustness evaluation index, RoS, to comprehensively consider the normal operational capability of the CMS in the face of unexpected disturbances.
The RoS index takes the order completion rate as its main component: the higher the order completion rate, the stronger the normal operational ability in the face of interference and the stronger the ability to resist risks (i.e., the stronger the robustness of the cloud service), and vice versa. When the order completion rates are the same, the index compares the differences in the time, cost, and reliability of completing the same order quantity: if the same number of orders is completed with less time, lower cost, and higher reliability, the cloud service is more robust, and vice versa. RoS is therefore computed from the order completion rate together with the following quantities: $T_0$, $C_0$, and $rel_0$ are the respective baseline values of the service time, cost, and reliability under the condition of no interference; T, C, and rel are the actual values of the current experimental group; and $\omega_1$, $\omega_2$, and $\omega_3$ are the respective weight coefficients of the three indexes, satisfying $\sum_{i=1}^{3} \omega_i = 1$.
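Three of the quantities that feed into RoS (service time, service cost, and order completion rate) can be computed directly from simulation output. The Python sketch below is illustrative only: the record type `TaskRecord`, the function names, and the sample numbers are assumptions, and the per-task cost form (service time times unit price, plus distance times unit logistics cost, plus release time times unit release cost) is read from the variable definitions given for the service cost. The reliability index and the RoS aggregation itself are not reproduced here.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TaskRecord:
    t_serving: float     # cloud service time of the task
    p_resource: float    # resource cost per unit time
    d: float             # logistics distance
    t_releasing: float   # resource release time


def service_time(finish_times: List[float]) -> float:
    """T: sum of the finish times t_j of all orders."""
    return sum(finish_times)


def service_cost(orders: List[List[TaskRecord]],
                 c_logistic: float, p_release: float) -> float:
    """C: resource fee + logistics fee + release fee over all tasks."""
    return sum(
        task.t_serving * task.p_resource
        + task.d * c_logistic
        + task.t_releasing * p_release
        for order in orders
        for task in order
    )


def order_completion_rate(n_completed: int, n_incomplete: int) -> float:
    """Completed orders divided by all scheduled orders, N1 / (N1 + N2)."""
    total = n_completed + n_incomplete
    return n_completed / total if total else 0.0


# Hypothetical sample data: two orders with two and one tasks.
orders = [
    [TaskRecord(2.0, 10.0, 5.0, 0.5), TaskRecord(1.0, 20.0, 0.0, 0.5)],
    [TaskRecord(3.0, 10.0, 2.0, 1.0)],
]
T = service_time([4.0, 6.0])
C = service_cost(orders, c_logistic=2.0, p_release=4.0)
completion = order_completion_rate(2, 2)
```

Keeping the three indices as separate functions makes it straightforward to compare an experimental group against the no-interference baseline values.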

Development of a Complex Network Model for Cloud Manufacturing
The CMN consists of cloud service resources and their interconnections. The network can be evaluated utilizing the complex network model due to the vast number of resources and intricate connection relationships. Figure 3a depicts the processing task paths for Order-A, Order-B, and Order-C, the resources utilized by each task along these paths, and the relationships between the resources and servers. If two tasks are linked by a path, their corresponding resources are also linked. Figure 3b illustrates how the CMN is formed by considering all resources to be network nodes and resource connections to be connected edges.
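The construction rule (resources become nodes, and the resources of two path-linked tasks become linked) can be sketched in a few lines of Python. The function name, the undirected adjacency-set representation, and the toy task data are assumptions for illustration and are not taken from Figure 3.

```python
from collections import defaultdict
from itertools import product


def build_resource_network(task_edges, task_resources):
    """Link the resources of every pair of tasks joined by a path segment.

    task_edges:     iterable of (task_u, task_v) pairs from the order paths
    task_resources: dict mapping each task to the resources it uses
    Returns an undirected adjacency dict: resource -> set of neighbours.
    """
    adj = defaultdict(set)
    for resources in task_resources.values():
        for r in resources:
            adj[r]            # touch the key so isolated resources stay nodes
    for u, v in task_edges:
        for ru, rv in product(task_resources[u], task_resources[v]):
            if ru != rv:
                adj[ru].add(rv)
                adj[rv].add(ru)
    return dict(adj)


# Hypothetical toy data: three tasks, one of which uses two resources.
task_edges = [("T1", "T2"), ("T2", "T3"), ("T1", "T3")]
task_resources = {"T1": ["R1"], "T2": ["R2", "R3"], "T3": ["R4"]}
network = build_resource_network(task_edges, task_resources)
```

Merging the task paths of several orders into one `task_edges` list yields a single network, in the same way that the paths of Order-A, Order-B, and Order-C are combined in the text.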

Evaluation Indicator for Network Robustness Based on Static Topology
The term "network robustness" is generally used to describe the degree of network performance retention following the failure of network nodes or edges [44], and the change in the maximum connected subgraph following node failure can reflect the degree of structural integrity retention in the network. As a result, the relative size of the maximal connected subgraph, measured by node count, was chosen as one of the robustness evaluation indicators for this study:
$$S = \frac{N'}{N}$$
where N' represents the number of nodes in the maximal connected subgraph after the network has been attacked, and N represents the number of nodes in the original network. Specifically, S = 0 indicates that the network is completely disconnected, whereas S = 1 indicates that the network is completely connected and there are no isolated nodes.
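Reading S as the ratio N'/N, which is consistent with the limiting cases S = 0 and S = 1 described above, the indicator can be computed with a breadth-first search. The sketch below is plain Python with hypothetical function names and a toy path network; it is not the tooling used in the case study.

```python
from collections import deque


def largest_component_size(adj, removed=frozenset()):
    """Node count of the largest connected component, ignoring removed nodes."""
    seen = set(removed)
    best = 0
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        size = 0
        while queue:
            node = queue.popleft()
            size += 1
            for nb in adj[node]:
                if nb not in seen:
                    seen.add(nb)
                    queue.append(nb)
        best = max(best, size)
    return best


def connectivity_ratio(adj, removed=frozenset()):
    """S = N' / N, with N the node count of the original network."""
    return largest_component_size(adj, removed) / len(adj) if adj else 0.0


# Hypothetical path network A - B - C - D - E.
adj = {"A": {"B"}, "B": {"A", "C"}, "C": {"B", "D"},
       "D": {"C", "E"}, "E": {"D"}}
s_intact = connectivity_ratio(adj)
s_attacked = connectivity_ratio(adj, removed={"C"})
```

Removing the middle node C splits the path into two components of two nodes each, so S drops from 1.0 to 2/5.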

Definition of Failure Modes for Robustness Analysis
The definition of failure modes is the key to robustness analysis. Based on a combination of the cloud manufacturing characteristics and the spatial topology structure of the CMN, this paper proposes two types of failure modes: cloud resource failure based on the initial topology, and cloud resource failure based on the recomputed topology.
The initial topology refers to the initial structural characteristics of the CMN, which is a static network. The recalculated topology refers to the structural characteristics of the CMN that are obtained through recalculation after the initial network is attacked, which is a dynamic network that changes step by step with the attack steps.
Both failure modes are subdivided into degree-based and betweenness-based resource failures. The degree is widely used to measure the importance of the nodes: it represents how closely a resource node is connected to other resource nodes in the CMN. The betweenness reflects the structural importance of the nodes in the network [46,47]: a node with high betweenness has greater control over the logistics and information flow in the network. The specific failure mode definitions are shown in Table 1.
Table 1. Resource failure based on initial topology and recomputed topology.
Failure Mode | Description | Calculation Process
Resource failure based on initial topology | Initial node degree loss (ID) | Sort the resource nodes in the initial network (Network-0) by degree, from largest to smallest. Remove one node at a time, and repeat n times until all nodes in the network are removed.
Resource failure based on initial topology | Initial node betweenness loss (IB) | Sort the resource nodes in the initial network (Network-0) by betweenness, from largest to smallest. Remove one node at a time, and repeat n times until all nodes in the network are removed.
Resource failure based on recomputed topology | Recomputed node degree loss (RD) | Sort the resource nodes in the initial network (Network-0) by degree, from largest to smallest. Remove the first node and generate a new network (Network-1). Recalculate and sort the resource nodes in the new network (Network-1) by degree, from largest to smallest. Remove the first node and generate a new network (Network-2), and so on, until all nodes in the network are removed.
Resource failure based on recomputed topology | Recomputed node betweenness loss (RB) | Sort the resource nodes in the initial network (Network-0) by betweenness, from largest to smallest. Remove the first node and generate a new network (Network-1). Recalculate and sort the resource nodes in the new network (Network-1) by betweenness, from largest to smallest. Remove the first node and generate a new network (Network-2), and so on, until all nodes in the network are removed.
Note: The removal of nodes is handled differently in the complex network model and the multi-agent model: (1) In the complex network model, the corresponding resource nodes and all connected edges on the nodes are deleted. (2) In the multi-agent model, the corresponding resource agent is changed to a "fault" state, which means the resource is unable to provide services.
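The difference between the initial and recomputed failure modes can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the network is an adjacency dictionary, only the degree-based variants (ID and RD) are shown, ties are broken alphabetically by node name, and the toy network is hypothetical. The betweenness-based variants (IB and RB) would follow the same pattern with a betweenness function passed as `score`.

```python
def degree(adj, node):
    """Degree of `node` in the current network."""
    return len(adj[node])


def remove_node(adj, node):
    """Return a copy of the network without `node` and its edges."""
    return {u: {v for v in nbrs if v != node}
            for u, nbrs in adj.items() if u != node}


def initial_attack_order(adj, score=degree):
    """ID/IB style: rank once on Network-0, then remove in that fixed order."""
    return sorted(adj, key=lambda n: (-score(adj, n), n))


def recomputed_attack_order(adj, score=degree):
    """RD/RB style: re-rank on the current network before every removal."""
    order = []
    current = adj
    while current:
        # Highest score first; alphabetical tie-break is an assumption.
        target = min(current, key=lambda n: (-score(current, n), n))
        order.append(target)
        current = remove_node(current, target)  # Network-k -> Network-(k+1)
    return order


# Hypothetical toy network in which the two orders diverge.
network = {
    "h": {"a", "c", "x", "y"},
    "a": {"h", "b", "c"},
    "b": {"a", "d", "e"},
    "c": {"h", "a"},
    "d": {"b"},
    "e": {"b"},
    "x": {"h"},
    "y": {"h"},
}
id_order = initial_attack_order(network)
rd_order = recomputed_attack_order(network)
```

Here both orders start with the hub h, but once h is removed node a loses an edge while b keeps all three, so the recomputed order attacks b before a, whereas the initial order still attacks a first.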

Formulation of Multiple Resource Substitution Strategies
Targeting the cloud resource node failure problem proposed in Section 5.1, this paper aims to develop a robustness enhancement strategy for CMNs that involves adding redundant nodes.
For the supply chain modeling and robustness problem, Zhao [48] took the smartphone supply chain as an example and put forward three different robustness optimization strategies: enterprise internal operation management, cooperative management between enterprises, and a regional development strategy. Inspired by this, and combined with the characteristics of cloud manufacturing, this paper proposes three kinds of robustness improvement strategies: internal resource replacement, external resource replacement, and internal-external integration replacement.
(1) Internal replacement strategy: The cloud service provider Si provides a replacement resource Rj for Ri internally (denoted Rj-Si and Ri-Si, respectively). If Ri-Si fails, Rj-Si replaces it to complete the processing of the cloud manufacturing tasks. The tasks are still completed, but because the resource types differ, the cloud task time increases and additional working hours are incurred.
(2) External replacement strategy: Other cloud service providers (Sj, Sk, etc.) provide resources of the same type Ri as Ri-Si (denoted Ri-Sj, Ri-Sk, etc.). When Ri-Si fails, the strategy selects the best cloud service provider based on multiple factors, such as the resource quotation and the distance between service providers, and then requests an alternative resource from that provider to replace the failed Ri-Si and complete the cloud manufacturing task. Since the resource types are the same, this does not add task time; however, the transfer of resources and information among different service providers generates additional logistics transportation costs and time.
(3) Internal-external integration replacement strategy: This strategy combines the previous two. When a cloud resource fails, the strategy first looks for a replacement resource within the same service provider; if no replacement resource is found, or if the replacement resource has also failed, it then seeks a resource of the same type from other service providers.
The logical flow of these three strategies is shown in Figure 4.
In addition, in the complex network model, node failure is reflected by removing the failed resource node and all the edges connected to it. After the resource replacement strategy is selected, the optimal alternative resource node under that strategy is first determined. If this replacement node is already in the original network, all edges that belonged to the failed node are linked directly to the replacement node; if it is not yet in the original network, the replacement node is first added to the network and then linked in the same way.
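The selection logic of the three strategies and the rewiring step in the complex network model can be sketched as follows. Everything in this block is an illustrative assumption rather than the paper's implementation: resources are modeled as (type, provider) tuples, `substitutes` holds the internal substitution relationships, and `provider_score` is a hypothetical single number per provider standing in for the multi-factor evaluation (quotation, distance, and so on), with lower values preferred.

```python
def internal_candidate(resource, substitutes, failed):
    """Same provider, substitute resource type (e.g. Rj-Si for Ri-Si)."""
    rtype, provider = resource
    sub = substitutes.get(rtype)
    if sub is None:
        return None
    candidate = (sub, provider)
    return None if candidate in failed else candidate


def external_candidate(resource, provider_score, failed):
    """Same resource type from another provider, lowest score first."""
    rtype, provider = resource
    options = [
        (score, (rtype, other))
        for other, score in provider_score.items()
        if other != provider and (rtype, other) not in failed
    ]
    return min(options)[1] if options else None


def choose_replacement(resource, strategy, substitutes, provider_score, failed):
    """Return the replacement resource, or None if the strategy finds none."""
    if strategy == "internal":
        return internal_candidate(resource, substitutes, failed)
    if strategy == "external":
        return external_candidate(resource, provider_score, failed)
    if strategy == "inter&exter":
        found = internal_candidate(resource, substitutes, failed)
        if found is None:  # fall back to other providers
            found = external_candidate(resource, provider_score, failed)
        return found
    return None  # "common": no substitution


def rewire(adj, failed_node, replacement):
    """Remove failed_node; hand its edges to replacement if one exists."""
    neighbours = adj.pop(failed_node, set())
    for n in neighbours:
        adj[n].discard(failed_node)
    if replacement is None:
        return adj
    adj.setdefault(replacement, set())  # add the node if not yet in the network
    for n in neighbours:
        if n != replacement:
            adj[replacement].add(n)
            adj[n].add(replacement)
    return adj


# Hypothetical data: r1 and r2 substitute for each other; three providers.
substitutes = {"r1": "r2", "r2": "r1"}
provider_score = {"S1": 0.0, "S2": 0.4, "S3": 0.7}
broken = ("r1", "S1")

pick_internal = choose_replacement(broken, "internal", substitutes,
                                   provider_score, {broken})
pick_external = choose_replacement(broken, "external", substitutes,
                                   provider_score, {broken})
pick_combined = choose_replacement(broken, "inter&exter", substitutes,
                                   provider_score, {broken, ("r2", "S1")})

net = {broken: {("r5", "S1")}, ("r5", "S1"): {broken}}
net = rewire(net, broken, pick_external)
```

In the combined case, the internal substitute r2-S1 is itself marked as failed, so the sketch falls back to the external candidate r1-S2, mirroring the fallback order described for the internal-external integration replacement strategy.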

Model Parameters Description
The cloud manufacturing project for a new energy vehicle is used as a case study. This project offers life-cycle cloud manufacturing services for new energy vehicles, with the technologies provided including electrification and autonomous driving.
The cloud manufacturing project consists of 24 order types, 95 cloud tasks (t1-t95), and 72 resource types (r1-r72). Table 2 displays the cloud task route associated with each order type, and Table 3 displays the appropriate resource types for each cloud task. This paper assumes a bidirectional substitution relationship between resources (e.g., if the substitute resource of ri is rj, the substitute resource of rj is ri). Based on this, the substitution relationships among internal resources are shown in Table 4.
Each of the project's five cloud servers (S1-S5) offers 72 different kinds of cloud resources. The cloud servers compete for different orders because they charge different prices for their resources and are located at various distances from cloud demanders. The instances of resource r1 offered by servers S1-S5 are labeled r1-S1, r1-S2, r1-S3, r1-S4, and r1-S5, respectively.
Moreover, there are 14 cloud demanders (d1-d14). Each cloud demander submits 24 orders, one of each of the 24 order types. As indicated in Table 5, the fundamental details of each cloud service provider and cloud demander are imported from an external Excel file.
The weight coefficients for the RoS were set in this study to ω1 = 1/3, ω2 = 1/3, and ω3 = 1/3.
Table 2. Order and cloud service route correspondence.
Table 3. Task and resource correspondence.

Structural Robustness Analysis
Based on the network model construction method described in Section 4.1, Figure 5 depicts the initial cloud manufacturing resource network. MATLAB 2020a was used to compute network statistics, yielding the network topology parameters and degree distribution shown in Table 6 and Figure 6.
The network contained 231 resource nodes, and the distribution of node degrees was very unbalanced. A small number of nodes accounted for the vast majority of connected edges, indicating that the network had the traits of a scale-free network. The network's low density showed that it was sparse, and nodes with higher degree values tended to connect to nodes with lower degree values.
Then, using the four failure modes outlined in Section 5.1, Python 3.0 was used to simulate how the structural robustness indexes changed in response to each failure mode.
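The two topology statistics discussed above, network density and degree distribution, can be computed with a few lines of Python. This is only a sketch on a hypothetical star-shaped toy network; it assumes an undirected network stored as an adjacency dictionary and uses the standard density formula 2E / (N(N - 1)). It is not the MATLAB analysis used in the study.

```python
from collections import Counter


def density(adj):
    """Undirected graph density: 2E / (N (N - 1))."""
    n = len(adj)
    if n < 2:
        return 0.0
    edges = sum(len(nbrs) for nbrs in adj.values()) / 2  # each edge counted twice
    return 2 * edges / (n * (n - 1))


def degree_distribution(adj):
    """Map each degree k to the fraction of nodes that have degree k."""
    counts = Counter(len(nbrs) for nbrs in adj.values())
    n = len(adj)
    return {k: c / n for k, c in sorted(counts.items())}


# Hypothetical toy network: one hub connected to four leaves.
star = {
    "hub": {"l1", "l2", "l3", "l4"},
    "l1": {"hub"},
    "l2": {"hub"},
    "l3": {"hub"},
    "l4": {"hub"},
}
star_density = density(star)
star_degrees = degree_distribution(star)
```

Even in this tiny example, one node holds every edge while most nodes have degree 1, which is the kind of unbalanced degree distribution described for the cloud manufacturing resource network.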

Structural Robustness Comparison of Three Substitution Strategies under Initial Node Degree Loss (ID) Failure Mode
As shown in Figure 7, (1) in the ID failure mode, the maximum connected subgraph values under the no-substitution strategy ("common") and the three resource substitution strategies ("internal", "external", and "inter&exter") all decrease as the number of node failures increases. This indicates that, from the complex network perspective, the structural robustness of the CMS gradually decreases. (2) The curves of the three resource substitution strategies all lie above the curve of the no-substitution strategy (i.e., their maximum connected subgraph values are larger than the no-substitution strategy's value). This indicates that all three resource substitution strategies can improve the structural robustness of the CMS in the face of initial node degree failure. (3) The "internal" curve is only slightly higher than the "common" curve, and when the number of node attacks is high, the maximum connected subgraph values of these two strategies decrease to 0, at which point the CMN has completely collapsed. In contrast, the "external" and "inter&exter" curves are significantly higher than the "common" curve, and even when the number of node attacks is high, their maximum connected subgraph values remain at a high level. This indicates that the external replacement strategy and the internal-external integration replacement strategy both offer more significant robustness enhancement than the internal replacement strategy under the ID failure mode.

Structural Robustness Comparison of Three Substitution Strategies under Initial Node Betweenness Loss (IB) Failure Mode
As shown in Figure 8, (1) in the IB failure mode, the maximum connected subgraph values under the no-substitution strategy ("common") and the three resource substitution strategies ("internal", "external", and "inter&exter") all decrease as the number of node failures increases. This indicates that, from the complex network perspective, the structural robustness of the CMS gradually decreases. (2) The curves of the three resource substitution strategies all lie above the curve of the no-substitution strategy (i.e., their maximum connected subgraph values are larger than the no-substitution strategy's value). This indicates that all three resource substitution strategies can improve the structural robustness of the CMS in the face of initial node betweenness failure. (3) The "internal" curve is only slightly higher than the "common" curve, and when the number of node attacks is high, the maximum connected subgraph values of these two strategies decrease to 0, at which point the CMN has completely collapsed. In contrast, the "external" and "inter&exter" curves are significantly higher than the "common" curve, and even when the number of node attacks is high, their maximum connected subgraph values remain at a high level. This indicates that the external replacement strategy and the internal-external integration replacement strategy both offer more significant robustness enhancement than the internal replacement strategy under the IB failure mode.

Structural Robustness Comparison of Three Substitution Strategies under Recomputed Node Degree Loss (RD) Failure Mode
As shown in Figure 9, (1) in the RD failure mode, when the number of node attacks is high, the maximum connected subgraph values under all three substitution strategies decrease to 0, at which point the CMN has completely collapsed. This indicates that the RD failure mode is more destructive to the structural robustness of the CMS than either the ID or IB modes. (2) The "external" curve is always located above the "internal" curve, and the "inter&exter" curve is nearly always located above the "external" curve. This indicates that the maximum connected subgraph value is largest under the internal-external integration replacement strategy, followed by the external replacement strategy, and then the internal replacement strategy. Therefore, for the structural robustness of the CMS under the RD failure mode: internal-external integration replacement strategy > external replacement strategy > internal replacement strategy.

Structural Robustness Comparison of Three Substitution Strategies under
Recomputed Node Betweenness Loss (RB) Failure Mode
As shown in Figure 10, (1) in the RB failure mode, when the number of node attacks is high, the maximum connectivity subgraph values under all three strategies decrease to 0, at which point the CMN has completely collapsed. This indicates that the RB failure mode is more destructive to the structural robustness of the CMS than either the ID or IB modes. (2) The "external" curve is always located above the "internal" curve, and the majority of the "inter&exter" curve is located above the "external" curve. This indicates that the maximum connectivity subgraph value is largest under the internal-external integration replacement strategy, followed by the external replacement strategy, and then the internal replacement strategy. Therefore, for the structural robustness of the CMS under the RB failure mode: internal-external integration replacement strategy > external replacement strategy > internal replacement strategy.
In summary, from the complex network perspective, all three resource substitution strategies significantly improved the structural robustness of the CMS. In the four failure modes (i.e., ID, IB, RD, and RB), the structural robustness levels under all three strategies were higher than those with no strategy. Further, the internal-external integration replacement strategy brought the greatest robustness enhancement to the CMS, followed by the external replacement strategy, and then the internal replacement strategy. This is reasonable because the essence of a resource substitution strategy is to add redundant nodes, so when a resource fails, redundant alternative resources will be there to replace it to complete the task. Therefore, as the number of node failures increased, the strategies with more initial redundant nodes (i.e., the internal-external integration replacement strategy and the external replacement strategy) were more robust. Conversely, the strategy with fewer initial redundant nodes (i.e., the internal replacement strategy) was less robust. In addition, in the failure modes based on the initial topology (i.e., ID and IB), only the structural robustness under the no-replacement strategy and the internal replacement strategy significantly decreased. In contrast, in the failure modes based on the recomputed topology (i.e., RD and RB), the structural robustness under all strategies significantly decreased, indicating that the failure modes based on the recomputed topology were more destructive to the structural robustness of the CMS. However, for all four failure modes, all three resource substitution strategies could protect the structural robustness of the CMS to some extent.
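The difference between attacks ranked on the initial topology and attacks ranked on the recomputed topology can be illustrated with a minimal Python sketch. It is not the implementation used in this study: the function names, the toy graph, and the use of the largest connected component as a fraction of the original node count (as a stand-in for the maximum connectivity subgraph value) are assumptions made for illustration, and only the degree-based modes (ID-style and RD-style) are covered.

```python
from collections import deque


def build_adjacency(edges):
    """Build an undirected adjacency dict from (u, v) pairs (hypothetical helper)."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def largest_component_fraction(adj, alive):
    """Size of the largest connected component among alive nodes,
    divided by the original node count (assumed normalization)."""
    seen, best = set(), 0
    for start in alive:
        if start in seen:
            continue
        seen.add(start)
        queue, size = deque([start]), 0
        while queue:
            node = queue.popleft()
            size += 1
            for nb in adj[node]:
                if nb in alive and nb not in seen:
                    seen.add(nb)
                    queue.append(nb)
        best = max(best, size)
    return best / len(adj)


def degree_attack(adj, n_attacks, recompute):
    """Remove nodes in descending degree order and record the robustness curve.

    recompute=False ranks nodes once on the initial topology (ID-style);
    recompute=True re-ranks the surviving nodes after every removal (RD-style).
    """
    n_attacks = min(n_attacks, len(adj))
    alive = set(adj)

    def live_degree(node):
        return sum(1 for nb in adj[node] if nb in alive)

    initial_order = sorted(adj, key=lambda n: (-len(adj[n]), n))
    curve = [largest_component_fraction(adj, alive)]
    for step in range(n_attacks):
        if recompute:
            target = min(alive, key=lambda n: (-live_degree(n), n))
        else:
            target = initial_order[step]
        alive.discard(target)
        curve.append(largest_component_fraction(adj, alive))
    return curve


if __name__ == "__main__":
    # Toy graph (illustrative only, not the case-study network).
    toy = build_adjacency([(0, 1), (0, 2), (0, 3), (0, 4), (1, 2),
                           (1, 3), (4, 5), (5, 6), (5, 7)])
    print("initial ranking:   ", degree_attack(toy, 2, recompute=False))
    print("recomputed ranking:", degree_attack(toy, 2, recompute=True))
```

On this toy graph the second removal differs between the two modes: once the hub is gone, the initial ranking still targets a node whose neighbors are already isolated from the rest, whereas the recomputed ranking targets the node that currently holds the remaining component together. This is consistent with the observation that the recomputed modes are more destructive.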

Process Robustness Analysis
This research analyzed changes in the RoS under the four failure modes using the multi-agent simulation software AnyLogic and Python 3.0.

Process Robustness Comparison of Three Substitution Strategies under ID Failure Mode
As shown in Figure 11, (1) in the ID failure mode, the RoS values under the no-substitution strategy ("common") and the three resource substitution strategies ("internal", "external", and "inter&exter") all show a decreasing trend as the number of node failures increases (i.e., as the cloud order completion rate decreases). This indicates that, from a multi-agent simulation perspective, the robustness of the CMP gradually decreases. (2) The curves of the three resource substitution strategies are all located above the curve of the no-substitution strategy (i.e., their RoS values are larger than the no-substitution strategy's value, which means their order completion rates are higher). This indicates that all three resource substitution strategies can improve the robustness of the CMP in the face of initial node degree failure. (3) The "internal" curve is only slightly higher than the "common" curve, and when the number of node attacks is high, the RoS values of these two strategies decrease to 0, at which point all cloud orders fail to be processed. In contrast, the "external" and "inter&exter" curves are significantly higher than the "common" curve, and although their RoS values fluctuate when the number of node attacks is high, they remain at a high level (above 0.6). This indicates that the external replacement strategy and the internal-external integration replacement strategy both offer more significant robustness enhancement than the internal replacement strategy under the ID failure mode. In particular, when the number of node attacks is high, the slight fluctuations of the RoS value indicate that the cloud order completion rate tends to be stable at this time, but different resource substitution strategies will lead to changes in the service time, cost, reliability, and other factors.

Process Robustness Comparison of Three Substitution Strategies under IB Failure Mode
As shown in Figure 12, (1) in the IB failure mode, the RoS values under the no-substitution strategy ("common") and the three resource substitution strategies ("internal", "external", and "inter&exter") all show a decreasing trend as the number of node failures increases (i.e., as the cloud order completion rate decreases). This indicates that, from a multi-agent simulation perspective, the robustness of the CMP gradually decreases. (2) The curves of the three resource substitution strategies are all located above the curve of the no-substitution strategy (i.e., their RoS values are larger than the no-substitution strategy's value, which means their order completion rates are higher). This indicates that all three resource substitution strategies can improve the robustness of the CMP in the face of initial node betweenness failure. (3) The "internal" curve is only slightly higher than the "common" curve, and when the number of node attacks is high, the RoS values of these two strategies decrease to 0, at which point all cloud orders fail to be processed. In contrast, the "external" and "inter&exter" curves are significantly higher than the "common" curve, and although their RoS values fluctuate when the number of node attacks is high, they remain at a high level (above 0.6). This indicates that the external replacement strategy and the internal-external integration replacement strategy both offer more significant robustness enhancement than the internal replacement strategy under the IB failure mode. In particular, when the number of node attacks is high, the slight fluctuations in the RoS value indicate that the cloud order completion rate tends to be stable at this time, but different resource substitution strategies will lead to changes in the service time, cost, reliability, and other factors.

Process Robustness Comparison of Three Substitution Strategies under RD Failure Mode
As shown in Figure 13, (1) in the RD failure mode, as the number of node attacks increases, the RoS values under all three substitution strategies rapidly decline and finally decrease to 0, at which point all cloud orders fail to be processed. This indicates that the RD failure mode is more destructive to the robustness of the CMP than either the ID or IB modes. (2) The "external" curve is always located above the "internal" curve, and the "inter&exter" curve is nearly always located above the "external" curve. This indicates that the RoS value is largest under the internal-external integration replacement strategy, followed by the external substitution strategy, and then the internal substitution strategy. Further, under the three strategies, the numbers of node attacks required to make all cloud order processing fail (i.e., when the process robustness decreases to its lowest) are approximately 110 ("internal"), 140 ("external"), and 160 ("inter&exter"). Therefore, the ranking for the robustness of the CMP under the RD failure mode is: internal-external integration strategy > external substitution strategy > internal substitution strategy.

Process Robustness Comparison of Three Substitution Strategies under RB Failure Mode
As shown in Figure 14, (1) in the RB failure mode, as the number of node attacks increases, the RoS values under all three substitution strategies rapidly decline and finally decrease to 0, at which point all cloud orders fail to be processed. This indicates that the RB failure mode is more destructive to the robustness of the CMP than either the ID or IB modes. (2) The "external" curve is always located above the "internal" curve, and the "inter&exter" curve is nearly always located above the "external" curve. This indicates that the RoS value is largest under the internal-external integration replacement strategy, followed by the external substitution strategy, and then the internal substitution strategy. Further, under the three strategies, the numbers of node attacks required to make all cloud order processing fail (i.e., when the process robustness decreases to its lowest) are 110 ("internal"), 150 ("external"), and 160 ("inter&exter"). Therefore, the ranking for the robustness of the CMP under the RB failure mode is: internal-external integration strategy > external substitution strategy > internal substitution strategy.

In summary, from the perspective of multi-agent simulation, all three resource substitution strategies significantly improved the process robustness of the CMS. In the four failure modes (i.e., ID, IB, RD, and RB), the process robustness levels under all three strategies were higher than those with no strategy. Further, the internal-external integration replacement strategy brought the greatest robustness enhancement, followed by the external replacement strategy, and then the internal replacement strategy. This is reasonable because the three strategies provided different amounts of alternative resources: the internal substitution strategy can provide fewer alternative resources, which corresponds to lower robustness; the external substitution strategy can provide five resources of the same type because five external cloud service providers are involved in this paper; and the internal-external integration replacement strategy can provide more than five resources of the same type or alternative resources, which corresponds to higher robustness. In addition, in the failure modes based on the initial topology (i.e., ID and IB), only the process robustness under the no-replacement strategy and the internal replacement strategy significantly decreased. In contrast, in the failure modes based on the recomputed topology (i.e., RD and RB), the process robustness under all strategies significantly decreased, indicating that the failure modes based on the recomputed topology were more destructive to process robustness. However, for all four failure modes, all three resource substitution strategies could protect the process robustness of the CMS to some extent.
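The argument that a strategy offering more alternative resources sustains a higher order completion rate can be illustrated with a minimal Python sketch. It does not reproduce the RoS index or the multi-agent model of this study: the data layout (`primary`, `internal`, `external`), the function names, and the toy orders are hypothetical, and an order is simply counted as completed when every task still has at least one surviving candidate resource. The strategy labels follow the figure legends ("common", "internal", "external", "inter&exter").

```python
def candidates(task, strategy):
    """Resources allowed to serve a task under a given strategy (assumed model)."""
    pool = [task["primary"]]
    if strategy in ("internal", "inter&exter"):
        pool += task["internal"]   # substitutes within the same provider
    if strategy in ("external", "inter&exter"):
        pool += task["external"]   # substitutes from other providers
    return pool


def completion_rate(orders, failed, strategy):
    """Share of orders whose tasks can all still be served after failures."""
    completed = 0
    for tasks in orders:
        if all(any(r not in failed for r in candidates(t, strategy))
               for t in tasks):
            completed += 1
    return completed / len(orders)


def task(primary, internal=(), external=()):
    """Convenience constructor for a toy task record (hypothetical)."""
    return {"primary": primary,
            "internal": list(internal),
            "external": list(external)}


if __name__ == "__main__":
    # Four toy orders; resource names are illustrative only.
    orders = [
        [task("p1", ["p1b"], ["e1"])],
        [task("p2", [], ["e2"])],
        [task("p3", ["p3b"], ["e3"])],
        [task("p4", [], ["e4"]), task("p1", ["p1b"], ["e1"])],
    ]
    failed = {"p1", "p2", "p3", "p4", "e3"}
    for strategy in ("common", "internal", "external", "inter&exter"):
        print(strategy, completion_rate(orders, failed, strategy))
```

In this toy setting the completion rate increases from the no-substitution case to the internal, external, and integrated strategies, which mirrors the ordering reported above. The ordering here is a property of the chosen toy data rather than a general guarantee.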

Management Suggestion
Based on the analysis results in Sections 6.2 and 6.3, the following management suggestions were obtained. First, focus should be placed on protecting the resource nodes with a larger degree and larger betweenness (i.e., the important nodes in the CMS). The structural robustness and process robustness of the CMS decreased rapidly in the failure modes based on the node degree (i.e., ID and RD) and node betweenness (i.e., IB and RB), indicating that nodes with a larger degree and betweenness are crucial to maintaining system robustness. More specifically, nodes with a large degree are closely connected with other nodes, so they can play an important role in maintaining system connectivity, and nodes with a large betweenness have greater control over the logistics and information flow in the system, so they can play an important role in maintaining the information transmission rate of the system. Second, alternative resources must be provided to ensure that when the original resources fail, alternative resources can replace them to complete their tasks. These alternative resources can be set up within (1) the same service provider, (2) other external service providers, or (3) a combination of both, and all these methods can protect the robustness of the CMS to a certain extent.
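The first suggestion, identifying the nodes with the largest degree and betweenness, can be sketched in Python as follows. This is an illustrative sketch rather than the procedure used in this study: it assumes an undirected, unweighted network given as an adjacency dictionary, computes unnormalized betweenness with Brandes' algorithm, and the function names and toy graph are hypothetical.

```python
from collections import deque


def betweenness(adj):
    """Unnormalized betweenness centrality (Brandes' algorithm) for an
    undirected, unweighted graph given as {node: set_of_neighbors}."""
    score = {v: 0.0 for v in adj}
    for s in adj:
        stack = []
        preds = {v: [] for v in adj}   # predecessors on shortest paths
        sigma = {v: 0 for v in adj}    # number of shortest paths from s
        sigma[s] = 1
        dist = {s: 0}
        queue = deque([s])
        while queue:                   # breadth-first search from s
            v = queue.popleft()
            stack.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}
        while stack:                   # accumulate dependencies backwards
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each unordered pair was counted once from each endpoint.
    return {v: b / 2 for v, b in score.items()}


def rank_nodes(adj, k):
    """Return the top-k nodes by degree and the top-k nodes by betweenness."""
    bc = betweenness(adj)
    by_degree = sorted(adj, key=lambda v: (-len(adj[v]), v))[:k]
    by_betweenness = sorted(adj, key=lambda v: (-bc[v], v))[:k]
    return by_degree, by_betweenness


if __name__ == "__main__":
    # Toy network (illustrative only, not the case-study network).
    toy = {0: {1, 2, 3, 4}, 1: {0, 2, 3}, 2: {0, 1}, 3: {0, 1},
           4: {0, 5}, 5: {4, 6, 7}, 6: {5}, 7: {5}}
    print(rank_nodes(toy, 3))
```

In the toy network, node 4 has a small degree but lies on every shortest path between the two halves of the graph, so it ranks high on betweenness without ranking high on degree. This illustrates why the two indicators are treated as complementary criteria for selecting nodes to protect.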

Conclusions
This study combined the complex network with multi-agent simulation to propose a new analysis method for the structural robustness and process robustness of the CMS. To target the frequent failure of resource nodes in the cloud manufacturing environment, three resource substitution strategies were proposed to better ensure the stability and robustness of the system. First, a multi-agent simulation model was constructed to study the dynamic process robustness of the CMS. Here, RoS was proposed as a robustness measure, and the behavior characteristics and modeling methods of several key types of CMP agents were detailed. Second, a complex network model of cloud manufacturing resources was established through the order-task relationship and task-resource relationship to study the static topological robustness of the CMS. Here, the maximum connectivity subgraph was proposed as a robustness measure. Regarding attack strategies, four failure modes (i.e., ID, IB, RD, and RB) were defined, and regarding robustness enhancement strategies, three resource substitution strategies (i.e., internal replacement, external replacement, and internal-external integration replacement) were proposed. Third, a case study of a cloud manufacturing project of a new energy vehicle was conducted. The results show that (1) the proposed RoS index was effective at portraying the variations in CMP robustness, (2) the three resource substitution strategies could improve both the structural robustness and process robustness of the CMS (with the internal-external integration strategy being most effective, followed by the external substitution strategy, and then the internal substitution strategy), and (3) the two node failure modes based on the recalculated topology were more destructive to the robustness of the CMP than the two node failure modes based on the initial topology. However, for all four failure modes, all three resource substitution strategies could protect the robustness of the CMS to some degree.
In combining the complex network with multi-agent simulation, the robustness analysis object of the CMS is extended from the CMN to the CMP, which provides a new perspective with two dimensions (i.e., structure and process). Moreover, the three proposed recovery strategies (elastic measures) are designed based on the idea of adding redundant nodes, which is of great significance to the implementation and deployment of cloud manufacturing projects. This research will be furthered by investigating robustness under cloud path interruption, cloud logistics interruption, city lockdowns, and other phenomena, to provide a quantitative and dynamic decision-making basis for improving the robustness of the CMS.

Data Availability Statement: The datasets used and analyzed during the present study are available from the corresponding author on reasonable request.