1. Introduction
The study of servitization and Product Service Systems (PSS) has evolved considerably over the last thirty years [1]. Since its inception, PSS has been regarded as a highly effective tool for driving society toward a resource-efficient, circular economy and for sparking a necessary ‘resource revolution’ [2]. In recent years, the Industrial Product Service System (IPS2) has been regarded as an integrated offering of products and services that can adapt dynamically to changing customer demands and provider abilities and deliver value in industrial applications [3]. However, as the degree of customization in industrial production increases, the performance parameters, manufacturing processes, and additional services of similar industrial products differ more and more. Therefore, it is important to design an industrial product service system that can fully meet individual needs and properly arrange manufacturing and maintenance services.
With the trend toward product modularization, emerging knowledge bases such as digital innovation, the Internet of Things (IoT), and closed-loop supply chains have received more attention [4]. Nowadays, the competitive capability of companies no longer rests on adding various offline functions to products but on fulfilling customer demands with specific functions such as dynamic scheduling, remote monitoring, and preventive maintenance [5]. As a result, the development and application of emerging technologies can help improve the design and delivery of IPS2 at various stages, including product and service requirements analysis, manufacturing system configuration, comprehensive workflow modeling, and production service activity scheduling.
To maximize service value, human processes and physical processes should be unified and considered together in PSS design [6]. Formalized configuration rules are crucial to configuring the PSS efficiently and rapidly [7]. Notably, few studies address the overall process of IPS2 design, in particular the combination of equipment configuration, service flow modeling, and the optimal scheduling of service activities. This paper takes robot-driven sanding processing lines as a real case study, and the overall architecture is shown in Figure 1. In traditional carbon anode sanding, employees have to endure not only high-intensity physical labor but also a harsh environment caused by high levels of dust. Robot-driven sanding processing lines can address the problems of environment and efficiency; however, they are difficult to popularize because of their high cost and the complexity of their operation and maintenance. In addition, the accompanying multi-process parallelism in production requires appropriate planning and scheduling solutions. Hence, to promote the application of robot-driven sanding processing lines, this paper proposes an order-based, autonomously configured, and flow-driven design approach for developing IPS2, together with a scheduling method for service activities based on deep reinforcement learning.
The main contributions of this paper are as follows: (1) A comprehensive IPS2 design framework is proposed, which includes structured service order design, customized resource configuration, and fine-grained service flow modeling. (2) To handle multiple service flows running in parallel during the delivery process of IPS2, a scheduling method based on deep reinforcement learning is proposed. (3) The proposed design framework and scheduling method are grounded in a real industrial case of robot-driven sanding processing lines, and the results of the IPS2 design using this framework are supported by the relevant data from this case.
The remainder of this paper is organized as follows: Section 2 gives a brief overview of the existing studies related to this paper. Section 3 provides a detailed explanation of the modeling method and the scheduling method proposed in this paper. Section 4 presents a practical case study to verify the performance of the methods proposed in Section 3. Section 5 discusses the main achievements and potential research directions of the proposed methods, and the conclusions are summarized in Section 6.
4. Case Study
In recent years, robots have emerged as the workhorse of modern industrial production and advanced manufacturing facilities globally, especially in repetitive or hazardous tasks [36,37], such as industrial robots in automatic forging processing lines [38] and vision-aided robots in welding tasks [39]. In this paper, we take the robot-driven sanding processing line, which is used to sand prebaked carbon anodes in a severely dusty environment, as a practical research case. The approach to designing an industrial product service system for a robot-driven sanding processing line is studied in detail. In addition, a scheduling method based on deep reinforcement learning is proposed for service flows in the delivery stage of IPS2.
4.1. IPS2 Service Order
Because the prebaked carbon anodes to be sanded have different requirements, developing an industrial product service system solution for the robot-driven sanding processing line is beneficial for customers with a history of long-term cooperation. Service orders are the foundation for implementing IPS2 design and development; they include specific demand information such as hardware selection, production capacity planning, machining features, delivery cycles, and additional services. For IPS2 robot-driven sanding processing lines, the content of the order design is divided into three parts according to the method mentioned: basic requirements, equipment configuration, and customized service modules. An example of the proposed service order can be seen in Figure 9.
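To make the three-part order structure concrete, the following is a minimal sketch of how such a service order might be represented in code. All class names, field names, and validation rules here are hypothetical illustrations; they are not taken from the order shown in Figure 9.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class BasicRequirements:
    # Hypothetical fields standing in for capacity planning and delivery cycles.
    customer: str
    daily_capacity: int          # anodes per day (assumed unit)
    delivery_cycle_days: int


@dataclass
class EquipmentItem:
    # One selected piece of hardware, e.g. an ash collector device.
    name: str
    model: str
    quantity: int = 1


@dataclass
class ServiceOrder:
    # The three parts named in the text: basic requirements,
    # equipment configuration, and customized service modules.
    order_id: str
    basic: BasicRequirements
    equipment: List[EquipmentItem] = field(default_factory=list)
    service_modules: List[str] = field(default_factory=list)

    def validate(self) -> List[str]:
        """Return a list of human-readable problems (empty if none)."""
        problems = []
        if self.basic.daily_capacity <= 0:
            problems.append("daily capacity must be positive")
        if self.basic.delivery_cycle_days <= 0:
            problems.append("delivery cycle must be positive")
        if not self.equipment:
            problems.append("at least one equipment item is required")
        for item in self.equipment:
            if item.quantity < 1:
                problems.append(f"quantity of {item.name} must be at least 1")
        return problems
```

A structured order of this kind can be checked before the resource configuration canvas is bound to it, so that incomplete orders are caught early.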
4.2. IPS2 Resource Configuration
This industrial case uses a graphical editing interface to perform detailed resource configuration of robot-driven sanding processing lines. First, the resource configuration canvas must be bound to the service order. Then, graphic elements, including the main equipment and service activities of robot-driven sanding processing lines and the UI nodes, can be dragged onto the canvas. The detailed information on each graphic element can be edited in the information bar, which covers the selection of types, the equipment cost, and the sensor requirements in the configuration of equipment resources. For instance, if the element named ‘Ash collector device’ on the canvas is clicked, the information bar on the right of the canvas switches to its detailed information settings, where the different types and equipment costs obtained from the database are available for selection.
Some special elements carry extra details describing their quantities or dimensions, such as area and length. The information bar for the UI nodes is quite different: there, the equipment and the types of sensors can be selected to satisfy various requirements for data visualization and real-time monitoring. For example, the UI node for the ash collector device can select the temperature sensor since it has been equipped with the gas flow sensor and the gas pressure sensor in advance. If the processing line bound to the equipment configuration canvas has been put into operation, a data card pops up when the UI node is clicked, and the sensor data are updated in real time. The configuration of equipment resources implemented for the IPS2 of robot-driven sanding processing lines can be seen in Figure 10. The configuration of service resources is similar to that of equipment resources. Using the ‘Sanding Production’ activity as an example, the service resources should include technicians, kits, spare parts, and extra devices, as detailed in Table 3.
4.3. IPS2 Service Flow
After the service order design and the resource configuration of IPS2 are completed, the service flow can be designed based on the concrete service activities in the whole life cycle of robot-driven sanding processing lines. In this industrial case, the main activities include order setup, program administration, the design and simulation of processing lines, the logistics of the required equipment, the installation and testing of processing lines, the production of sanded prebaked carbon anodes, quality inspection, product warehousing, the predictive maintenance of processing lines, unplanned repairs after sudden equipment failures, and the state assessment of processing lines. The service flow design again uses graphical editing to realize the extended UML Activity Diagram. The elements, named after the main service activities, are divided among four swim lanes, one per department: the administration department, the design and development department, the production department, and the maintenance and repair department. Graphic elements can be dragged into the swim lanes on the canvas, connected, and edited with detailed information such as the estimated capacity and cost, so that a service flow for IPS2 can be modeled in a fine-grained form. The service flow model implemented for the IPS2 robot-driven sanding processing lines can be seen in Figure 11.
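As a rough illustration of the swim-lane model described above, a service flow can be stored as a set of activities, each assigned to a department, plus precedence edges between them. The sketch below uses a subset of the activities named in the text; the lane assignment and the edges are assumptions made for illustration only and do not reproduce Figure 11.

```python
from graphlib import TopologicalSorter

# Hypothetical assignment of activities to swim lanes (departments).
LANES = {
    "Order setup": "administration",
    "Design and simulation": "design and development",
    "Installation and test": "production",
    "Sanding production": "production",
    "Quality inspection": "production",
    "Predictive maintenance": "maintenance and repair",
}

# Hypothetical precedence edges (before, after) of the activity diagram.
EDGES = [
    ("Order setup", "Design and simulation"),
    ("Design and simulation", "Installation and test"),
    ("Installation and test", "Sanding production"),
    ("Sanding production", "Quality inspection"),
    ("Sanding production", "Predictive maintenance"),
]


def execution_order(edges):
    """Return one valid execution order; raises CycleError if the flow loops."""
    sorter = TopologicalSorter()
    for before, after in edges:
        sorter.add(after, before)  # 'after' depends on 'before'
    return list(sorter.static_order())


def handoffs(edges, lanes):
    """Edges that cross a swim lane, i.e. hand work to another department."""
    return [(a, b) for a, b in edges if lanes[a] != lanes[b]]
```

Listing the cross-lane edges is one simple way to see where a flow hands work from one department to another, which is where scheduling conflicts between service groups tend to arise.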
4.4. IPS2 Service Flow Scheduling
In the real case of robot-driven sanding processing lines, a considerable number of orders for prebaked carbon anodes are waiting to be completed. The finite resources should be arranged properly and fully utilized to maximize benefit. In this section, we first solve the Service Flow Scheduling Problem (SFSP) with the modified DRL algorithm, and then we conduct comparison experiments to verify the effect of our scheduling method. The numerical experiments are implemented in PyCharm 2022.1.4 and run on a PC with a 2.10 GHz 12th Gen Intel(R) Core(TM) i7-12700 CPU and 16 GB of RAM.
4.4.1. Settings and Hyperparameters
In the context of robot-driven sanding processing lines, we designed a service flow scheduling problem with 10 service flows and 10 service groups, where the groups belong to the four departments mentioned before. The processing time of each service activity in the service flows is randomly generated under some constraints, where the ‘Production’ activity (
) takes the longest time during the whole service flow, and the processing time of the ‘Maintenance & Inspection’ activity (
) is approximately 10% to 30% of the ‘Production’ activity. Considering that minimum tardiness is our goal, we set the due date of each service flow as 2.4 times its duration. In addition, all service flows waiting to be scheduled arrive simultaneously. After experiments on network structure, the number of hidden layers in the Q-Network is set to 5, and the number of nodes in each hidden layer is 50. The ‘GELU’ activation function is applied to hidden layers. Meanwhile, the input layer contains 18 nodes, and the output layer contains 10 nodes, without any activation functions, to preserve the original state information and output actions from the DRL environment. Moreover, we adjust the method for calculating the priority of each experience to set the
as a low value (about 0.5) initially and gradually increase it to 1 to reduce potential instability and overfitting. The details of the hyperparameters can be seen in Table 4. Additionally, an example of the processing times of service activities in service flows is listed in Table 5.
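The instance settings described above can be sketched as a small generator. This is a minimal illustration under stated assumptions: the activity names other than ‘Production’ and ‘Maintenance & Inspection’, the numeric ranges, and the function names are hypothetical, and the due date is taken as 2.4 times the sum of a flow's processing times.

```python
import random


def generate_service_flow(rng, other_activities=("Design", "Logistics", "Installation"),
                          production_range=(200, 400), due_factor=2.4):
    """Generate one service flow under the constraints stated in the text."""
    production = rng.randint(*production_range)
    # Every other activity is strictly shorter than 'Production'.
    times = {name: rng.randint(10, production - 1) for name in other_activities}
    times["Production"] = production
    # 'Maintenance & Inspection' takes roughly 10% to 30% of 'Production'.
    ratio = rng.uniform(0.10, 0.30)
    times["Maintenance & Inspection"] = max(1, round(ratio * production))
    duration = sum(times.values())
    # All flows arrive simultaneously; due date = 2.4 x duration.
    return {"times": times, "arrival": 0, "due": due_factor * duration}


def generate_instance(n_flows=10, seed=0):
    """Generate a reproducible instance with n_flows service flows."""
    rng = random.Random(seed)
    return [generate_service_flow(rng) for _ in range(n_flows)]
```

Seeding the generator keeps instances reproducible, which matters when the same random cases are used to compare several schedulers.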
4.4.2. Experimental Results
We first test the performance of each single dispatching rule in the action set on the SFSP, and the results are listed in Table 6. The dispatching rules SRPT, EDD, SOST, and SROT obtain scheduling schemes without tardiness, but the makespan and the average utilization rate show that these schemes can be further optimized. Then, we verify the effect of Prioritized Experience Replay (PER). The reward curves of DQN, Double DQN, and Dueling DDQN (the combination of Dueling DQN and DDQN) with the mentioned settings and hyperparameters but without PER are shown in Figure 12, which indicates severe instability in the training processes, especially for DQN and Double DQN. The performance of these algorithms with PER can be seen in Figure 13, which shows that PER is effective in reducing instability and accelerating convergence. A comprehensive comparison of DQN and its modified versions can be seen in Figure 14, which shows the gradual improvement in efficiency and stability from the traditional DQN algorithm to the Dueling DDQN with PER. The average reward over 200 episodes is presented in Table 7 for comparison.
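To illustrate how a single dispatching rule yields a schedule and how the three indicators are read off it, the sketch below applies a rule to a toy instance. It is a simplified serial scheduler, not the environment used in the experiments: each flow is a fixed sequence of (service group, duration) operations, each group handles one operation at a time, and the average utilization rate is assumed to be the mean over groups of busy time divided by makespan.

```python
def edd(flow, next_op):
    """Earliest Due Date: prefer the flow with the smallest due date."""
    return flow["due"]


def srpt(flow, next_op):
    """Shortest Remaining Processing Time of the flow's unscheduled operations."""
    return sum(duration for _, duration in flow["ops"][next_op:])


def schedule(flows, rule):
    """Repeatedly pick the unfinished flow ranked first by `rule` and place its
    next operation at the earliest time both the flow and the group are free."""
    next_op = [0] * len(flows)
    flow_ready = [0.0] * len(flows)   # all flows arrive at time 0
    group_ready, busy = {}, {}
    while True:
        pending = [i for i, f in enumerate(flows) if next_op[i] < len(f["ops"])]
        if not pending:
            break
        i = min(pending, key=lambda j: (rule(flows[j], next_op[j]), j))
        group, duration = flows[i]["ops"][next_op[i]]
        start = max(flow_ready[i], group_ready.get(group, 0.0))
        flow_ready[i] = group_ready[group] = start + duration
        busy[group] = busy.get(group, 0.0) + duration
        next_op[i] += 1
    makespan = max(flow_ready)
    return {
        "makespan": makespan,
        "total_tardiness": sum(max(0.0, c - f["due"]) for c, f in zip(flow_ready, flows)),
        "avg_utilization": sum(b / makespan for b in busy.values()) / len(busy),
    }


# Toy instance with hypothetical groups "A" and "B".
example_flows = [
    {"ops": [("A", 3), ("B", 2)], "due": 6},
    {"ops": [("A", 2), ("B", 4)], "due": 8},
]
print(schedule(example_flows, edd))
```

In a DRL setting of the kind described here, the agent's action at each decision point would be the choice of which rule to apply next, rather than a single rule fixed for the whole schedule.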
Then, we conduct experiments on the effect of the priority coefficient using Dueling DDQN, and the comparison of results can be seen in Figure 15, which shows that the priority coefficient can improve training efficiency since it encourages sampling experiences with important state feature transformation. Moreover, the comparison also indicates that a fixed priority coefficient may accelerate convergence, whereas a decreasing one is likely to be more stable.
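For readers unfamiliar with prioritized sampling, the following sketch shows the standard proportional form of PER, in which an exponent `alpha` controls how strongly priorities skew sampling and an exponent `beta` controls the importance-sampling correction. The names `alpha` and `beta` follow the common PER formulation and are assumptions here; they are not claimed to match the symbols or the modified priority calculation used in this paper. The linear schedule is likewise only one possible way to move a coefficient gradually from about 0.5 to 1.

```python
import random


def sampling_probabilities(priorities, alpha):
    """P(i) = p_i^alpha / sum_k p_k^alpha; alpha = 0 gives uniform sampling."""
    scaled = [p ** alpha for p in priorities]
    total = sum(scaled)
    return [s / total for s in scaled]


def importance_weights(probs, beta):
    """w_i = (N * P(i))^(-beta), normalized by the largest weight in the buffer."""
    n = len(probs)
    raw = [(n * p) ** (-beta) for p in probs]
    top = max(raw)
    return [w / top for w in raw]


def linear_schedule(start, end, step, total_steps):
    """Move a coefficient linearly from start to end, then hold it at end."""
    fraction = min(1.0, step / total_steps)
    return start + fraction * (end - start)


def sample_batch(priorities, batch_size, alpha, beta, rng):
    """Draw indices in proportion to priority and return their loss weights."""
    probs = sampling_probabilities(priorities, alpha)
    indices = rng.choices(range(len(priorities)), weights=probs, k=batch_size)
    weights = importance_weights(probs, beta)
    return indices, [weights[i] for i in indices]


# Usage: priorities are positive, e.g. absolute TD errors plus a small constant.
rng = random.Random(0)
beta_now = linear_schedule(0.5, 1.0, step=50, total_steps=200)
batch_indices, batch_weights = sample_batch([0.1, 2.0, 0.5, 1.2], 2, 0.6, beta_now, rng)
```

The weights returned with each batch scale the per-sample loss, which offsets the bias introduced by sampling high-priority experiences more often.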
Through all the experiments above, we find dozens of scheduling schemes without tardiness that differ from the schemes obtained with a single dispatching rule. The scheme with the shortest makespan that we find is presented as a Gantt chart in Figure 16; its makespan is 2422, and its average utilization rate is 0.823292. In the absence of tardiness, this scheme reduces the makespan by over 10% and improves the average utilization rate by 4.5% compared with SOST. Finally, we compare the performance of the modified Dueling DDQN model trained on the specific case with the original model and a genetic algorithm (GA) on random cases with different due dates, and the relevant data can be seen in
Table 8. To avoid potential influence from
since we think it may be dependent on the actual situation of industrial cases, the model is trained without
. The results in
Table 8 show the high effectiveness and stability of the modified Dueling DDQN in obtaining scheduling schemes of SFSP with low tardiness compared with the original model. Meanwhile, we can find that GA achieves better outcomes when the deadlines are generous, whereas in situations with tighter deadlines, the performance of modified Dueling DDQN is superior.
6. Conclusions
In this paper, we propose a design method for IPS2 based on robot-driven sanding processing lines to address the difficulties caused by their high cost and complex operation and maintenance. Moreover, a scheduling method is put forward to handle multiple concurrent service flows. The main contributions of this paper lie in three aspects: (1) A comprehensive design method for IPS2 is proposed to obtain a highly customized scheme, which includes service order design, resource configuration, and service flow modeling. (2) A scheduling method for service flows that adopts a deep reinforcement learning algorithm is proposed to satisfy the due date requirements, optimize the makespan, and improve the average utilization rate. (3) A real industrial case of robot-driven sanding processing lines and its relevant data are used to verify the practicability and performance of the proposed methods. In addition, the modifications of reward shaping and the priority coefficient in the Dueling DDQN are discussed in pursuit of more efficient and robust scheduling schemes.
Nevertheless, some limitations of our methods should be addressed. Firstly, the transformation from the Service Flow Scheduling Problem into the Hybrid Flow Scheduling Problem is somewhat idealized, since it ignores the logistics activity and the potential risk of sudden breakdowns. Furthermore, despite the “Fork” and the “Join” in the UML Activity Diagram, the modeling method for service flows does not fit highly complex combinations of parallel activities well. Therefore, further research is required to establish a more flexible and refined modeling method for service flows. Meanwhile, the possible dynamic factors during the scheduling process deserve serious consideration to prevent unstable and inefficient schemes.