Smart Manufacturing Real-Time Analysis Based on Blockchain and Machine Learning Approaches

The growth of data production in the manufacturing industry has made monitoring systems essential for decision-making and management. Recent technologies such as the sensor-based Internet of Things (IoT) provide suitable ways to monitor the manufacturing process. The system proposed in this research integrates IoT, Machine Learning (ML), and Blockchain for monitoring the manufacturing system. The environmental data are collected from IoT sensors, including temperature, humidity, gyroscope, and accelerometer sensors. The data generated by the sensors are unstructured, massive, and real-time, so various big data techniques are applied to process them further. The hybrid prediction model used in this system removes sensor data outliers and applies the Random Forest classification technique to detect faults in the manufacturing system. The proposed system was evaluated for automotive manufacturing in South Korea. The Blockchain technique applied in this system secures the data and system transactions and improves data trust by preventing real data from being replaced with fake data. The results section shows the effectiveness of the proposed system compared to other approaches. Moreover, the hybrid prediction model provides better fault prediction than the other models compared. The proposed method is expected to enhance decision-making and reduce faults in the manufacturing process.


Introduction
The manufacturing system is an important part of economic development for any country worldwide [1][2][3][4]. The growth of technology pushes the manufacturing industry to be competitive and sustainable. Information and communication technology (ICT) has transformed manufacturing from traditional to advanced operations [5]. The monitoring system is a well-known and important part of manufacturing for controlling and managing the process. Predicting disease [6], production improvement [7], cost reduction [8], and early warning systems [9,10] are applications of monitoring systems. Integrating Internet of Things (IoT) devices with monitoring systems brings advantages such as preventing design errors [11], fault diagnosis [12], predicting quality [13], and improving decision-making [14]. In [15], a survey of smart manufacturing related to industrial technology was presented. That study covers a total of 31 research topics regarding the significance of the circular industry. The circular economy model is based on digital innovation, which offers solutions such as digital platforms, artificial intelligence, and smart devices to optimize assets. Recent technologies can thus support the creation of a circular economy. In [16], the authors presented Industry 4.0 modeling and simulation in the manufacturing industry. This process addresses material flow optimization and modeling for large manufacturing operations. The analysis techniques and software for [...]
• Use of a real-time monitoring system based on integrating IoT environmental sensors, big data, and machine learning in the automotive industry.
• Securing the collected dataset to prevent real data from being replaced with fake data, and recording the transaction information.
• Collecting the environmental dataset from the IoT sensors, e.g., humidity and temperature in the manufacturing line, and processing the data using big data techniques to handle a large dataset.
• Using the hybrid prediction model and the Random Forest model for classification to handle outliers in the dataset.
• Applying fault detection throughout the manufacturing procedure.
• Using the integration method to improve the performance of smart manufacturing for higher security and a standard environment.
• Improving management decision-making.
• Improving classification model performance.
• Identifying the outliers and removing them.
• Making the detection of abnormal processes more accurate through the steps of manufacturing.
• Real-time data extraction to improve the automotive industry prediction preservation.
The remainder of this paper is organized as follows: Section 2 reviews related work on current industrial and technology processes. Section 3 presents the system architecture and design of the proposed manufacturing model for the automotive industry. Section 4 presents the step-by-step implementation of the system. Section 5 presents the system performance and results. Section 6 discusses the proposed system, and the final section concludes the paper.

Related Work
This section presents a brief review of the smart manufacturing and monitoring system literature for the automotive industry. It covers four main topics: monitoring systems based on IoT technology, big data in manufacturing, machine learning in manufacturing, and Blockchain in manufacturing. The proposed system integrates these methods to improve the safety, quality, and analysis of automotive manufacturing. The real-time dataset was collected from the sensors mentioned above and analyzed with the integrated techniques.

Monitoring System Based on IoT Technology
The latest technologies in the Internet of Things, machine learning, big data, and sensors can be employed in monitoring systems, e.g., for prediction, cost reduction, and production improvement, to make decision-making easier. Several studies on IoT-based monitoring systems report positive results. Cheung et al. presented wireless sensor monitoring for manufacturing and safety sites [22][23][24][25][26][27]. The core of that work is the collection of wireless sensor data, which are sent to a remote server. If an unusual situation occurs, an alarm is triggered, which supports safety and good management of the process. In [11], low-cost IoT sensors were applied in the monitoring environment to avoid design phase errors in the manufacturing process. The sensors collected temperature and humidity records, and the reported environmental conditions inform the manufacturing design phase. These recent works mainly focus on monitoring environmental conditions with IoT sensors, which improves system proficiency. The adoption of IoT in the manufacturing system enables the move from traditional to digitalized manufacturing. The sensing elements of sensors capture data and transfer them as electric signals to various devices, which makes them central to collecting data from different points [28][29][30]. Radio frequency identification (RFID) and cameras are important examples of sensing devices in the automotive industry [31].

Big Data in Manufacturing
The amount of data generated by manufacturing systems grows together with the amount of data produced by IoT technology and sensors. Data of this kind are known as big data [32][33][34][35][36]. Processing the generated data is one of the difficulties that need to be addressed, and big data provides multiple applications that can overcome this difficulty in the manufacturing industry. Zhang et al. presented a structure to minimize energy consumption in the manufacturing industry [37]. The presented system contains two main components: data acquisition for collecting energy data and data analysis of energy usage. According to their research, the system reduced energy consumption by three percent and costs by four percent. For quick management of manufacturing datasets, several big data technologies have been presented, e.g., Apache Kafka and NoSQL MongoDB. The first is a scalable, fault-tolerant messaging queue system that structures real-time requests [38]. The second has been used to store patient sensor data for diabetes monitoring. In [39], big data technology was proposed for logistics discovery, mining knowledge from RFID-enabled production data. The results demonstrated the feasibility of the developed system for gaining knowledge from big data, which can improve the scheduling and logistics of the production system. In [40], big data techniques were combined with supply chain social risk. That system applied big data analysis techniques in the supply chain to improve the prediction of different social problems and risks.

Machine Learning in Manufacturing
Recent developments in machine learning (ML) show significant potential for data analysis and support decision-making to improve system performance. Machine learning techniques learn a pattern from data and apply it in different areas. Several studies apply ML in manufacturing systems and report considerable results. Kim et al. compared seven machine learning methods for detecting novel data and faulty wafers. The models are based on classification and fault detection, and the results show that ML has a good chance of identifying the faulty wafers. The availability of manufacturing resources depends on the combination of numbers and sequences and on machine reliability. The simulation model [41][42][43][44][45] in the manufacturing process is used to explain the behavior of each stochastic variable with respect to productivity. This model evaluates manufacturing performance in terms of critical resources, failures, and repair requests, and it estimates machine availability, delivery delays, inventory, and similar quantities. A problem for machine learning techniques in automotive manufacturing is the presence of outliers in the dataset, which decreases the accuracy of the classification model. Outlier detection can serve as a pre-processing step that identifies inconsistencies in the dataset, which leads to a better classifier and better decision-making. Previous studies show that removing outliers improves classification accuracy. In [46], the process of eliminating outliers for better classification was evaluated.

Blockchain in Manufacturing
The automotive industry is expected to gain advantages from Blockchain in three main respects, i.e., transparency, trust, and traceability. Generally, Blockchain technology is divided into two main types: limited access and free access for users. Jean-Paul et al. [47] noted that Blockchain is similar to a book that is accessible to the whole world but that nobody can change. The evolution of smart contracts has simplified supply chain management [48]. Blockchain in the automotive industry provides transparency and vehicle shipment optimization based on digital contracts, providing logistics process and price control information. Using the distributed ledger of Blockchain yields a high level of transparency. Rahul Guhathakurta et al. [49] presented Blockchain as an ongoing database that limits the amount of answers from customers to store a large amount of information. This technique organizes business records, allowing purchasers and vehicle traders to follow the vehicle lifecycle. Other advantages reported in [50,51] include an effective solution for exchanging interactions among suppliers, manufacturers, and customers. Table 1 shows the automotive industry challenges that contrast with Blockchain technology. Eleven stakeholders in the automotive industry are compared based on the difficulties they face in this environment: car owners, temporary management companies, car-sharing systems, car entrepreneurs, car retailers, car manufacturers, insurance companies, repair shops, aftermarket services, public organizations, and telecommunication service providers.

System Architecture of the Proposed Manufacturing System
The proposed integration method for real-time monitoring in the automotive industry aims to improve the manager's access to the manufacturing assembly line and to provide a warning scheme for fault detection during the process. Integrating machine learning and Blockchain technology clarifies system transactions and data preparation steps in the automotive industry. In this section, system design, implementation, the integrated approaches, and fault detection are briefly explained.

System Design
The presented monitoring system manages the manufacturing process in the automotive industry and warns if any issue occurs during the procedure. The proposed system deploys IoT, a hybrid prediction model in ML, and big data analysis. Figure 1 presents the main process applied in the proposed system. There are three main layers: manufacturing intelligence and analysis, automation control, and automotive extensions. The automotive extension layer mainly consists of the web interface, the automotive integration layer, and the automotive repository. The important part of this layer is the monitoring system, which monitors process performance and warns of problems during the process. The next step is to simulate this architecture based on three analysis techniques: impact analysis, statistical analysis, and dynamic analysis. The output of these steps is directly connected to the production repository. Intelligence analysis covers the lifecycle of smart manufacturing by applying machine learning techniques, and it is divided into a knowledge-based structure and intelligence analysis. The knowledge-based structure represents the transformation of the traditional automotive industry; it is used to restructure and improve the organization of companies and mainly focuses on learning in system engineering. The extracted information is connected to the manufacturing production plan and control system. The main focus of the proposed method is to control the integration broker process, which contains production plans and connectivity data. The technologies based on the IEEE 802.11p standard (wireless access for vehicular communication systems) give chip manufacturers the authorities' transport for the automotive industry ecosystem. The primary reason for using this technology is to hold and deploy the infrastructure needed for connectivity.

Evaluating the Essentiality of Blockchain in Automotive Manufacturing
The automotive industry process is based on two main branches: transactions and business networks. This process generates the flow of services and goods. The underlying markets can be open markets, as in car sales, or private markets, as in supply chain transactions. In either case, assets move between various stakeholders in the business network. Assets are divided into two main parts, namely tangible assets and intangible assets, and intangible assets are further divided into financial and intellectual assets. Table 2 shows the use-case information of Blockchain technology, divided into features based on their statements and use cases in the automotive industry. This process has two main functions: keeping records (static consistency, identification, smart contracts) and transactions (dynamic consistency, payment structure).
Figure 2 presents the advantages of Blockchain in the automotive industry. There are six categories: access privileges, transactions, data coordination trust, tracking, identifying, and data transparency. Each of these categories is divided into several parts. Access privileges contain distributed access control, i.e., the public database that controls the variable sets in the dataset. Data coordination trust includes multiple participants and a consensus mechanism. The consensus mechanism is a fault-tolerant process used to reach agreement on the state of the dataset across the distributed network. The participants are the users, manufacturers, and distributors who need to track or add product information. The identifying process contains digital certificate storage and an anti-theft mechanism. Saving the certificates means securing and decentralizing the dataset, which makes it possible to store digital certificates and create further potential value.
The identification section gives each user a unique ID to track and control their activity in the system. Identification records the user's information and the changes they made within their access limits. The transaction category contains the distributed ledger and the cryptographic hash. The distributed ledger covers ownership, trust, security, and the saving of transactions; it is based on digital assets and the cryptographic hash and presents this functionality to users in a single view. The transaction records the payment information with date, time, user ID, etc. Tracking contains database access, which manages the database through the decentralized network. Finally, data transparency contains encryption and control mechanisms. The main use of encryption is to protect sensitive information in the database from misuse.
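To illustrate how a cryptographic hash makes recorded transactions tamper-evident, the following is a minimal sketch using only the Python standard library. It is not the Hyperledger Fabric ledger used in the proposed system; the block layout, the field names, and the helper functions `block_hash`, `append_block`, and `is_valid` are assumptions introduced for illustration.

```python
import hashlib
import json


def block_hash(index, payload, prev_hash):
    """Return the SHA-256 digest of a block's contents."""
    body = json.dumps(
        {"index": index, "payload": payload, "prev_hash": prev_hash},
        sort_keys=True,
    )
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def append_block(chain, payload):
    """Append a payload to the chain, linking it to the previous block."""
    prev_hash = chain[-1]["hash"] if chain else "0" * 64
    index = len(chain)
    chain.append(
        {
            "index": index,
            "payload": payload,
            "prev_hash": prev_hash,
            "hash": block_hash(index, payload, prev_hash),
        }
    )
    return chain


def is_valid(chain):
    """Check every stored hash and every link to the previous block."""
    for i, block in enumerate(chain):
        expected_prev = chain[i - 1]["hash"] if i > 0 else "0" * 64
        if block["prev_hash"] != expected_prev:
            return False
        recomputed = block_hash(block["index"], block["payload"], block["prev_hash"])
        if block["hash"] != recomputed:
            return False
    return True


# Hypothetical records: a user ID plus one sensor value per block.
ledger = []
append_block(ledger, {"user_id": "user-1", "temperature": 24.5})
append_block(ledger, {"user_id": "user-2", "temperature": 25.1})
```

Changing any stored payload after the fact alters its digest, so `is_valid(ledger)` would return `False` for a modified chain. This is the property that guards real sensor data against replacement with fake data.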

Evaluating the Essentiality of Machine Learning in Automotive Manufacturing
Machine learning is closely related to product innovation in the automotive industry and is also effective in business functions. Product quality control and data analysis are organized using machine learning techniques. The automotive industry now requires ML techniques to overcome data classification and analysis problems. Various classification models organize and manage the dataset for further use in multiple environments. Figure 3 presents the machine learning architecture in the proposed system. The main steps of ML model validation are data cleaning, feature pre-processing, model selection, and parameter optimization. Data cleaning prepares the data for further processing through procedures such as removing duplicates, fixing structural errors, handling missing data, and data validation. The second step is feature pre-processing, which includes acquiring data, splitting data, and feature scaling. Model selection, parameter optimization, and validation of the proposed model follow.
Figure 4 shows the flow diagram of the proposed integrated system. There are five layers in this system: the Internet of Things (IoT), big data, Blockchain, cloud computing, and artificial intelligence (AI). The IoT layer collects data from IoT sensors. The second layer is the big data layer, which structures the data and makes a large amount of data easier to handle. The third layer is the Blockchain layer, which is the core of security in this system. The fourth layer is the cloud computing layer, which stores the structured data for easier access during the process. Finally, the artificial intelligence layer is used to predict, classify, and detect faults in the proposed system.
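The data cleaning and feature pre-processing steps described for the machine learning architecture can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: records are plain dictionaries, missing values are represented by `None`, min-max scaling stands in for the unspecified feature scaling method, and the function names and the 75/25 split are hypothetical.

```python
import random


def clean(records):
    """Drop exact duplicates and records with missing (None) values."""
    seen = set()
    cleaned = []
    for rec in records:
        key = tuple(sorted(rec.items()))
        if key in seen or any(v is None for v in rec.values()):
            continue
        seen.add(key)
        cleaned.append(rec)
    return cleaned


def min_max_scale(records, feature):
    """Rescale one feature to the [0, 1] range in place."""
    values = [rec[feature] for rec in records]
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0  # avoid division by zero for constant features
    for rec in records:
        rec[feature] = (rec[feature] - lo) / span
    return records


def train_test_split(records, test_ratio=0.25, seed=0):
    """Shuffle a copy of the records and split it into train and test parts."""
    shuffled = records[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_ratio))
    return shuffled[:cut], shuffled[cut:]


# Hypothetical raw readings with one duplicate and one missing value.
raw = [
    {"temperature": 20.0, "humidity": 40.0},
    {"temperature": 20.0, "humidity": 40.0},  # duplicate
    {"temperature": None, "humidity": 42.0},  # missing value
    {"temperature": 30.0, "humidity": 50.0},
    {"temperature": 25.0, "humidity": 45.0},
    {"temperature": 22.0, "humidity": 41.0},
]
data = clean(raw)
for name in ("temperature", "humidity"):
    min_max_scale(data, name)
train, test = train_test_split(data)
```

Model selection and parameter optimization would then operate on `train`, with `test` held back for validation.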

Implementation
In this section, the data information, IoT sensors' performance, and the implementation process are presented in detail.

Data
The dataset in this process is collected from the IoT-based sensors mentioned above, including temperature, humidity, gyroscope, and accelerometer sensors. Figure 5 shows the data generated by the sensors in JSON format and sent to a Kafka server. The data are then delivered to the hybrid prediction model. The data and the prediction results are saved into NoSQL MongoDB.
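As an illustration of the JSON format mentioned above, the sketch below serializes one sensor reading into a message that could be handed to a Kafka producer and decodes it again on the consumer side. The field names, the `sensor_id` value, and both helper functions are assumptions; the actual message schema of Figure 5 is not reproduced here, and the Kafka client itself is left out so that the example runs with the standard library alone.

```python
import json
from datetime import datetime, timezone


def make_sensor_message(sensor_id, temperature, humidity, accel, gyro):
    """Serialize one sensor reading as a UTF-8 encoded JSON message."""
    reading = {
        "sensor_id": sensor_id,
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "temperature": temperature,
        "humidity": humidity,
        "accelerometer": list(accel),  # three axis values
        "gyroscope": list(gyro),  # three axis values
    }
    return json.dumps(reading).encode("utf-8")


def parse_sensor_message(message):
    """Decode a message back into a dictionary on the consumer side."""
    return json.loads(message.decode("utf-8"))


# Hypothetical reading from one point on the assembly line.
msg = make_sensor_message("line-1", 24.5, 41.0, (0.01, -0.02, 0.98), (0.1, 0.0, -0.1))
decoded = parse_sensor_message(msg)
```

The decoded dictionary carries the eight numeric values (temperature, humidity, three accelerometer values, three gyroscope values) that the prediction model later uses as features.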

IoT-Based Sensor Performance
The IoT investigation answers the need of various companies to detect and track the progress of the industry. IoT-based sensors comprise devices and programs that retrieve the sensor data and send them to the cloud. Evaluating how the IoT sensor data are processed under various conditions is therefore important. In this system, the network delay is defined as the average time needed to send the sensor dataset and reach its destination. The performance metrics are based on CPU and RAM, which are used to evaluate the program's utilization in different scenarios. Four primary sensors are used in this process, namely temperature, humidity, gyroscope, and accelerometer sensors. The data generated by the sensors are transferred wirelessly to the cloud, where they are processed with big data techniques. In total, the experiments used 1 GB of RAM. Table 3 presents the details of the specific programs and sensors used in this process. The system's main components are the programming language, sensor type, list of sensors, RAM, and cloud server. The programming language used in this process is Winpython 3.6.2. The sensors are IoT-based sensors for real-time monitoring of the automotive manufacturing environment.
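The network delay defined above, i.e., the average time between sending a sensor record and its arrival, can be computed as in the following minimal sketch. The timestamps are made-up values in seconds, and `average_delay` is a hypothetical helper, not part of the evaluated program.

```python
from statistics import mean


def average_delay(sent_times, received_times):
    """Mean of per-record delays (receive time minus send time), in seconds."""
    if len(sent_times) != len(received_times):
        raise ValueError("each sent record needs a matching receive time")
    return mean(r - s for s, r in zip(sent_times, received_times))


# Hypothetical send and receive timestamps for four records.
sent = [0.0, 1.0, 2.0, 3.0]
received = [0.25, 1.5, 2.25, 3.5]
delay = average_delay(sent, received)
```

In a real measurement, the send and receive clocks would need to be synchronized, or the delay would be measured as a round trip on a single clock.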

Blockchain Implementation Process
The Blockchain implementation and design are briefly explained in this section. Table 4 shows the development environment of the implemented Blockchain technology in the proposed system. A total of 10 components are defined in this system. The IDE is composer-playground, the memory is 32 GB, the CPU is an Intel(R) Core(TM) i7-8700 @ 3.20 GHz, the Python language version is 3.6.2, and the operating system is Ubuntu Linux 18.04.1 LTS. Furthermore, the Docker environment version is 18.06.1-ce, and the virtual machine runs with Docker Compose version 1.13.0. The Hyperledger Fabric framework is from the Linux Foundation. The main reason for choosing Hyperledger Fabric for the proposed system is its effectiveness compared with Ethereum and other DLTs in network scalability, and its ability to manage huge transactional records [52][53][54]. Figure 7 presents the manufacturer records in the Composer REST server. The manufacturer information can be verified from the manufacturer ID. The client submits a request to "/API/Manufacturer-Manufacturer1" on the REST server. The manufacturer information is stored in Hyperledger Composer, and the REST server answers the query request. The REST server of Hyperledger returns the response in JSON format. The request URL contains the API address with the running port information.

Results and Discussions
This section evaluates the results of the proposed integrated system in detail.

IoT-Based Real-Time Monitoring
The data visualization aims to monitor real-time sensor data records. With it, the manager can easily monitor the assembly line and capture faults (abnormal incidents) during processing. The system's real-time monitoring has three main cores: IoT-based sensors, big data, and a hybrid prediction model. Figure 8 presents the web-based real-time monitoring. In the proposed system, four main sensors, i.e., gyroscope, temperature, accelerometer, and humidity sensors, are used in a real-time environment. The IoT-based sensor devices collect information every second. The hybrid prediction model applied in this system is used to predict fault records in real time. The presented system was implemented and tested with one of South Korea's automotive manufacturers. The test period ran from the first of February 2020 to November 2020. The sensors, which were positioned on the industrial assembly line, transmitted data every second. Within the testing time, 20 million records were accumulated. Figure 9 shows the network delay, and Figures 10 and 11 show the memory and CPU utilization of the program. Four reading and sending intervals were evaluated: 5, 10, 30, and 60 s. Based on the presented results, the reading interval had little effect on CPU and RAM usage. As shown in Figure 9, the network delay increased as the amount of sensor data increased. Sending 1000 pieces of IoT sensor data at the same time took almost fifty seconds. The CPU cost of the user program was less than 3%, and its RAM usage was almost 18 MB for all intervals.

Big-Data Processing Performance
Analyzing system performance for big data processing is one of the important tasks in this process. The performance metrics are system latency, throughput, and concurrency. System latency is the time needed to handle, process, and save the dataset into the database. Throughput is the amount of sensor data processed every second. Concurrency is the number of clients who can access the system simultaneously. The experiments were run with various numbers of servers, and their response times were collected for analysis. Figures 12-15 show a comparison of system latency and throughput. In this process, a single client sent various amounts of sensor data to the cloud services simultaneously. Figure 12 shows that the response time increases as the sensor dataset in the cloud server grows. Response time also changes with the number of clients: the proposed system requires a longer response time when the number of clients is large. To decrease the system's response time, the system can be scaled by adding several servers, which are compared with a single server in Figure 13. Figures 14 and 15 present the system throughput for various numbers of clients; here, too, adding servers yields better performance.
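The latency and throughput metrics defined above can be computed from per-record timestamps as in the sketch below. It assumes each record is logged as a pair of times in seconds, one when the record arrives and one when it has been saved to the database; this logging format and the function name `latency_and_throughput` are hypothetical.

```python
def latency_and_throughput(events):
    """Compute mean latency and throughput from (received_at, stored_at) pairs.

    Latency is stored_at - received_at per record; throughput is the number
    of records divided by the time between the first arrival and last store.
    """
    if not events:
        raise ValueError("no events to summarize")
    latencies = [stored - received for received, stored in events]
    start = min(received for received, _ in events)
    end = max(stored for _, stored in events)
    elapsed = end - start
    throughput = len(events) / elapsed if elapsed > 0 else float("inf")
    return sum(latencies) / len(latencies), throughput


# Hypothetical log: four records, each stored 0.5 s after it arrived.
events = [(0.0, 0.5), (0.5, 1.0), (1.0, 1.5), (1.5, 2.0)]
mean_latency, records_per_second = latency_and_throughput(events)
```

Repeating the measurement for different numbers of clients and servers would produce curves of the kind compared in the latency and throughput figures.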

Blockchain Transaction Process
Transaction management is divided between orderers and peers, which lets the network reach higher concurrency. Each transaction is executed by the peers against the world state. Based on the success or failure, the peers' certificates are signed. The re-execution of an order is not allowed and cannot maintain the ledger. Figure 16 illustrates the overall architecture of the transaction process for all components inside the network. Issuing a transaction proposal is authorized based on the user manager's decision. The transaction starts when the client sends the request to a node that takes part in the network. The encoder node is responsible for evaluating the transaction proposal, validating the result, and responding to the ledger's transaction block. Figure 17 shows the transaction list in the environment. There are four main fields: date, time, type of entry, and participant. The date and time fields show exactly when the transaction happened in the system. The entry type shows whether the entry adds a participant, records an approval, adds an asset, or is another option. The participant field shows the details of the participant, and the view record section shows the contents related to the transaction process. The details of each transaction are presented in Figure 18 with the unique ID for each event. All certified users in the presented network can start a new transaction based on the set rules. If the transaction is successful, the participant receives a response from the system based on the user ID.
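The transaction list described above (date, time, entry type, participant, and a unique ID per event, with only certified users allowed to submit) can be mimicked with the following toy sketch. It is not Hyperledger Fabric or Composer code; the `TransactionLog` class, the `AddAsset` entry type, the participant name, and the hash-derived ID are all assumptions made for illustration.

```python
import hashlib
import json
from datetime import datetime, timezone


class TransactionLog:
    """Toy transaction list with date, time, entry type, and participant."""

    def __init__(self, certified_users):
        self.certified_users = set(certified_users)
        self.entries = []

    def submit(self, user_id, entry_type, details):
        """Record a transaction if the user is certified; return its ID."""
        if user_id not in self.certified_users:
            raise PermissionError(f"{user_id} is not a certified participant")
        now = datetime.now(timezone.utc)
        record = {
            "date": now.date().isoformat(),
            "time": now.time().isoformat(),
            "entry_type": entry_type,
            "participant": user_id,
            "details": details,
            "sequence": len(self.entries),
        }
        # Derive an ID from the record contents; the sequence number keeps
        # IDs distinct even for otherwise identical submissions.
        payload = json.dumps(record, sort_keys=True).encode("utf-8")
        record["transaction_id"] = hashlib.sha256(payload).hexdigest()
        self.entries.append(record)
        return record["transaction_id"]


# Hypothetical participant and asset names.
log = TransactionLog(certified_users={"manufacturer-1"})
first_id = log.submit("manufacturer-1", "AddAsset", {"asset": "sensor-batch-1"})
second_id = log.submit("manufacturer-1", "AddAsset", {"asset": "sensor-batch-1"})
```

A submission from a user outside `certified_users` raises `PermissionError`, which corresponds to the rule that only certified users can start a new transaction.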

Fault Detection Based on Hybrid Prediction
A hybrid prediction model was applied in the proposed system to distinguish normal from abnormal functionality during the procedure. As shown in Figure 19, this procedure detects such functionality in the manufacturing system. The outlier detection step in the hybrid prediction model removes the sensor data outliers, and the remaining data are classified with Random Forest. The final step of this procedure is the performance evaluation, which compares the results of the hybrid prediction model with other ML classification models. The dataset contains 400 instances classified as normal or abnormal throughout the manufacturing process. The data have eight features: temperature, humidity, accelerometer 1, accelerometer 2, accelerometer 3, gyroscope 1, gyroscope 2, and gyroscope 3. The applied machine learning technique is expected to produce a robust classifier from the provided data. Once the monitoring system is running, it shows the prediction results for the sensor dataset.
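The outlier removal step that precedes classification can be sketched as follows. The text above does not name the outlier detection method, so this example swaps in a simple interquartile range (IQR) rule as a stand-in; the fence multiplier `k=1.5`, the function names, and the sample readings are assumptions. The Random Forest classification step is omitted because it requires a machine learning library.

```python
from statistics import quantiles


def iqr_bounds(values, k=1.5):
    """Return (lower, upper) fences: Q1 - k*IQR and Q3 + k*IQR."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def remove_outliers(rows, features, k=1.5):
    """Keep only rows whose listed features all fall inside the IQR fences."""
    bounds = {f: iqr_bounds([row[f] for row in rows], k) for f in features}
    return [
        row
        for row in rows
        if all(bounds[f][0] <= row[f] <= bounds[f][1] for f in features)
    ]


# Hypothetical readings; the last temperature is an implausible spike.
rows = [
    {"temperature": 24.0, "humidity": 40.0},
    {"temperature": 24.5, "humidity": 41.0},
    {"temperature": 25.0, "humidity": 42.0},
    {"temperature": 25.5, "humidity": 43.0},
    {"temperature": 26.0, "humidity": 44.0},
    {"temperature": 95.0, "humidity": 45.0},
]
filtered = remove_outliers(rows, ["temperature", "humidity"])
```

The rows kept in `filtered` would then be passed to the classifier, so that extreme readings do not distort the decision boundaries it learns.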

Sensor Dataset
After collecting the relevant dataset, the pre-processing step removes unsuitable, conflicting, and missing values from the data records. Table 5 shows the dataset's detailed information. Four main sensors are used in this procedure: temperature, humidity, accelerometer, and gyroscope sensors. Moreover, the information gain (IG) technique is used to identify the most significant features [55]. Table 6 shows the dataset attributes and IG scores. Based on these results, temperature is the feature with the greatest effect on abnormal functions in the manufacturing system.

Table 5. Dataset information.

Features                        Introduction
Temperature (Celsius)           Provided environmental temperature
Humidity (Relative humidity)    Provided environmental humidity
Accelerometer 1                 First value of the accelerometer
Accelerometer 2                 Second value of the accelerometer
Accelerometer 3                 Third value of the accelerometer
Gyroscope 1                     First value of the gyroscope
Gyroscope 2                     Second value of the gyroscope
Gyroscope 3                     Third value of the gyroscope

The comparison of the different classification models is shown in Table 7. The models are multiple linear regression, Random Forest, k-nearest neighbor, decision tree, and extra tree. Based on the results, the combination of the hybrid method and Random Forest had the best performance in the proposed system, with an accuracy of 0.956 and an RMSE (Root Mean Square Error) of 0.105. The RMSE score was computed for each model. The highest error rate was for the linear regression model, which also had the lowest accuracy score in this system. The performance metrics of the presented models were evaluated based on Equations (1)
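The information gain score used for feature ranking and the accuracy and RMSE metrics used for model comparison can be sketched with their standard textbook definitions, which may differ in detail from the paper's own equations. The toy labels, the binning of sensor values into "high" and "low", and the function names are assumptions; the values produced here are unrelated to the scores reported in Tables 6 and 7.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())


def information_gain(feature_values, labels):
    """Entropy of the labels minus their entropy after splitting on a feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean square error between true and predicted values."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


# Toy data: 1 = abnormal, 0 = normal; sensor values are binned as "high"/"low".
labels = [1, 1, 0, 0]
temperature_bin = ["high", "high", "low", "low"]
humidity_bin = ["high", "low", "high", "low"]
ig_temperature = information_gain(temperature_bin, labels)
ig_humidity = information_gain(humidity_bin, labels)

# Hypothetical classifier output with one mistake.
predictions = [1, 0, 0, 0]
acc = accuracy(labels, predictions)
err = rmse(labels, predictions)
```

In this toy data the temperature bin separates the two classes perfectly while the humidity bin carries no information, so temperature receives the higher information gain.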

Smart Manufacturing Challenges
The automotive industry is one of the largest and most important areas in terms of business and vehicle production. Manufacturers face several challenges when producing new products. Here we mention seven recent and important challenges in the automotive industry: regulations, the possibility of changing vehicle fuel, the constancy of vehicle brands, power train technology, the automotive industry supply chain, reconnecting with buyers, and global stabilization. The fuel used by automobiles is a challenge because of its effects on the climate and plant life. The power train defines the future of the automotive business and changes customer preferences. The supply chain needs to be updated based on recent developments and changes. In the current manufacturing industry, a delay in the supply chain means that the production line may shut down for a while, which causes a high cost for overall operations.

Conclusions and Future Directions
This research focused on a real-time monitoring system based on the integration of IoT sensors, big data, and a hybrid prediction model. The system is intended to improve monitoring in the manufacturing environment, detect faults during the procedure, and prevent issues in the assembly line. The integration of big data and IoT sensors produced a large amount of real-time sensor data, and the handling of these datasets and the extraction of the proper information were examined. The applied big data technique is NoSQL MongoDB. The results section shows the effectiveness of this system, which is scalable and more reasonable than traditional methods. Moreover, the system performance was analyzed based on the network delay, CPU, and RAM. The results were acceptable: data collection and transmission succeeded within a short time and at low cost. Fault detection, i.e., identifying normal and abnormal functions, is also a severe problem in smart manufacturing. The presented hybrid prediction model contains Random Forest classification, which is used to predict issues in the input data. Compared with other machine learning approaches, Random Forest has higher accuracy and fits the proposed system.
The output of this procedure is expected to improve decision-making in the manufacturing industry and avoid unexpected faults. The Blockchain technology applied in this system secures the collected dataset, prevents the injection of fake data, improves data transmission and costs, and improves system safety. In future research, we will try to improve the supply chain procedure and try other related IoT sensors to improve the fault detection results. Blockchain presents an opportunity for the automotive manufacturing industry to improve its competitiveness in the industrial world. Blockchain improves various types of business models with lower transaction fees and reduces information transfer between various users. Reducing fraud and systemic risks are other advantages of using this system. The use of machine learning techniques in the industrial environment provides better data acquisition and improves system accuracy. The classification and detection algorithms retain the accuracy of various models without reducing their advantages; similarly, handling a large amount of sensor data is much easier and more accurate with machine learning techniques.