An Affordable Fast Early Warning System for Edge Computing in Assembly Line

Maintaining product quality is essential for smart factories; hence, detecting abnormal events on the assembly line is important for timely decision-making. This study proposes an affordable fast early warning system based on edge computing to detect abnormal events during assembly. The proposed model obtains environmental data from various sensors, including gyroscope, accelerometer, temperature, humidity, ambient light, and air quality sensors. The fault model is installed close to the facilities, so abnormal events can be detected in a timely manner. Several performance evaluations were conducted to find the optimal scenario for utilizing edge devices to improve data processing and analysis speed, and the final proposed model provided the highest accuracy for detecting abnormal events compared with other classification models. The proposed model was tested over four months of operation in a Korean automobile parts factory and provided significant benefits for monitoring the assembly line and classifying abnormal events. The model helped improve decision-making by reducing or preventing unexpected losses due to abnormal events.


Introduction
The concept of Industry 4.0 [1] was recently proposed as the new state of the art linking information and communication technology (ICT) and manufacturing technologies, offering opportunities to significantly enhance manufacturing systems, improve product quality and production efficacy, and allow real-time condition monitoring and decision-making [2]. A transition from traditional to advanced manufacturing can be enabled by adopting ICT [3]. Early warning systems within ICT applications can provide important inputs to manufacturing processes and management, and integrating ICT with IoT and sensors enables early warning systems to monitor manufacturing processes.
With recent cloud computing developments, sufficient computing and storage resources can be acquired without requiring physical data centers or servers [4,5]. However, cloud computing has a number of drawbacks, including slow response time due to centralized computation, where the data must first be transmitted to the cloud for computation and the result returned later. On the other hand, short response time and real-time decision support are essential for some IoT applications, such as smart healthcare [6], emergency response [7], and early warning systems [8]. Edge computing can be utilized to overcome these cloud issues by providing computation and storage resources near the data source, minimizing latency, and providing real-time decision-making.
Manufacturing data, e.g., sensor data, process logs, etc., must be analyzed to facilitate meaningful decision-making. Machine learning has great potential to analyze data and has been successfully applied for quality [9][10][11] and fault detection [12][13][14]. For example, random forest (RF) can provide high accuracy for detecting abnormal events in a process, whereas traditional machine learning models encounter challenging issues, such as outlier data and imbalanced datasets, with consequentially low model accuracy. Several studies have demonstrated that eliminating outlier data using density based spatial clustering of applications with noise (DBSCAN) methods [15] and balancing the dataset using synthetic minority over sampling techniques (SMOTE) [16] can significantly improve model accuracy [17][18][19].
The present study proposed an affordable fast early warning system (AFEWS) utilizing edge devices and a hybrid fault model. The edge device was a computation unit close to the data source (sensors), and we used a hybrid fault model to predict whether the process was functioning normally or abnormally. The model comprised DBSCAN based outlier detection, SMOTE, and the RF algorithm. DBSCAN was employed to eliminate outlier data, SMOTE to balance the dataset, and RF to predict faults. We developed a web dashboard to visualize sensor data and fault status with fast response. The proposed system was implemented in a Korean automobile parts factory. Contributions from the present study can be summarized as follows:
• Affordable edge computing system. The proposed system employed an edge device based on an open source single-board computer (SBC), providing low cost, good support, and sufficient computation resources. Sensor devices were combined with the SBC to gather, process, analyze, and present the sensor data and consequential results in a web dashboard without requiring network communication to the cloud server, thus minimizing network latency and improving analysis speed.
• Fault detection based on hybrid fault model. We utilized a hybrid fault model combining DBSCAN outlier detection, SMOTE, and RF to improve prediction accuracy. The hybrid fault model was trained offline, and the generated model was subsequently used in the edge device to predict faults with fast response.
• Implementation and performance analysis. The proposed system was tested in the door trim assembly line for a Korean automobile parts factory. The selected edge device provided sufficient performance, successfully gathering, analyzing, and displaying sensor data with fast response. The proposed hybrid fault model provided the highest accuracy compared with traditional classification models.
The remainder of this paper is organized as follows: Section 2 overviews related studies. Section 3 explains the overall design and implementation of the proposed system. Section 4 presents results from implementing the proposed system and discusses various managerial implications. Section 5 summarizes and concludes the paper and discusses future study avenues.

Edge Computing for Warning Systems
Shi et al. defined edge computing as computation and network resources placed close to the data source [20], which could provide an effective and efficient solution for data-driven low latency IoT applications. Edge systems can respond in real time to urgent events triggered by machines or sensors. Since computation can be completed close to the data sources, real-time decisions and time sensitive applications can be handled in a timely manner. Benefits from edge computing include reduced response time, lower latency, and reduced energy consumption. Sood and Mahajan proposed a warning system to detect and prevent outbreaks of mosquito borne diseases (MBDs) [8], utilizing fog-cloud based computing to process data and machine learning algorithm(s) to detect infected persons, generating early warning alerts when abnormalities were detected, i.e., high possibility of dense mosquito regions and/or breeding sites. Uninfected persons could then take immediate precautions and MBD outbreaks could be prevented. Their results confirmed that the proposed model could effectively monitor, detect, and prevent MBD outbreaks. In addition, Ferrández-Pastor et al. proposed an architecture based on edge and fog computing to solve the integration and interoperability issues in heterogeneous smart building services [21]. Their experimental results showed the feasibility of integration and interoperability between existing services and new services, not only in smart buildings but also in other areas.
Edge computing has been adopted in manufacturing to monitor machine health, improve assembly line productivity, and increase quantity and quality. Wu et al. proposed a framework based on fog computing for cyber manufacturing to monitor machine health and generate predictive analytics [22]. Their results demonstrated positive impacts from adopting the proposed approach for a real factory. Petrali et al. developed a flexible production line edge computing system for a white-appliances industry, using edge computing to process the order and communicate with other devices [23]. Their results showed that the developed model improved system productivity and decreased reconfiguration costs. Hu et al. proposed an intelligent robot (iRobot) factory based on cognitive manufacturing and edge computing in the production line [24]. Edge computing provided optimal computing resources for the iRobot and reduced network transmission time. The proposed model provided significantly increased production compared with a traditional factory approach. Tao et al. showed that cloud-based data storage and analytics were inappropriate for low latency and/or real-time applications [25]. They recommended that fog and edge computing be adopted to store and process data, significantly reducing bandwidth requirements and latency.
Therefore, the present study utilized edge computing to monitor manufacturing processes and integrated edge computing and machine learning to provide early fault detection and warning, helping to prevent further losses due to manufacturing faults.

Machine Learning for Fault Detection
Technology developments in manufacturing generate new approaches to enhance product quality and prevent unexpected losses due to faults. Machine learning can analyze manufacturing data, providing managerial support as well as improving productivity. Soualhi et al. proposed a method to detect ball bearing machine health status. Healthy vibration data was extracted using the Hilbert-Huang transform to track critical bearing component degradation. Then, the support vector machine (SVM) approach was used to detect faults [9]. Their results showed that the proposed method could effectively detect ball bearing faults. Chen et al. proposed welding quality detection using SVM in a high-power disk layer [10]. Their proposed model successfully detected welding quality and could be implemented for real-time monitoring. Several machine learning models were utilized and evaluated to successfully detect metal casting quality [11].
Fault detection and analysis are important engineering problems for identifying abnormal events during a process. Early process fault detection and warning can help reduce or avoid productivity losses. Previous studies have identified that the RF machine learning model has good efficacy for detecting manufacturing faults, e.g., failure detection of rotor bars [13], where RF achieved higher detection accuracy compared to other models and was suitable for in-process real-time fault detection. RF has also been used to detect bearing failure [14], outperforming neural network approaches in terms of performance and accuracy. Finally, RF was used to diagnose induction motor faults [12], achieving higher accuracy and faster execution compared to other models.
Data preprocessing is a critical step to identify inconsistencies and/or outliers and generate better classification models. Previous studies have shown that eliminating outlier data can significantly improve classification accuracy [26,27]. Clustering methods can be used for outlier detection based on the assumption that normal data correspond to dense clusters, whereas outlier data correspond to small groups, which are not included in any cluster [28]. The DBSCAN method distinguishes outlier data by finding dense regions based on the number of data points close to a given point [15]. Points that do not belong to any cluster are regarded as outliers. DBSCAN has been shown to provide excellent performance for distinguishing outlier from normal data, with consequentially improved classification accuracy. Alfian et al. utilized DBSCAN to detect outlier sensor data for a supply chain, successfully distinguishing outlier and normal data [29]. Thang and Kim proposed multiple parameter DBSCAN (DBSCAN-MP) to detect network intrusion, achieving a significantly higher detection rate compared to other methods [30]. Chen and Li subsequently showed that the enhanced DBSCAN method could achieve higher intrusion detection accuracy compared to other methods [31].
Accuracy can also be improved by applying DBSCAN for outlier removal. ElBarawy et al. developed community detection utilizing DBSCAN to eliminate outliers and showed that the proposed method could precisely cluster community data for social network analysis [32]. Ijaz et al. utilized DBSCAN for outlier removal to enhance diabetes and hypertension prediction accuracy [19].
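The density rule described above can be illustrated with a minimal pure-Python sketch. It is not the implementation used in the cited studies; the function names, the toy points, and the parameter values are assumptions chosen for illustration. A point with at least `min_pts` neighbors within radius `eps` seeds or extends a cluster, and any point left outside every cluster is labelled 0 (outlier).

```python
import math


def region_query(points, i, eps):
    """Indices of all points within distance eps of points[i] (including i)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Return one label per point: 1, 2, ... for clusters, 0 for outliers."""
    UNVISITED, NOISE = None, 0
    labels = [UNVISITED] * len(points)
    cluster = 0
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point of the current cluster
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:  # j is a core point: keep expanding
                queue.extend(j_neighbors)
    return labels


# Toy data (hypothetical): two dense groups and one isolated point.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (50, 50)]
labels = dbscan(points, eps=1.5, min_pts=3)
cleaned = [p for p, label in zip(points, labels) if label != 0]
```

Filtering on `label != 0` mirrors the outlier-removal use discussed here: only points that belong to some cluster are passed on to the classifier.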
Unbalanced class distribution is a common problem for supervised learning. Oversampling can be employed to address this problem, creating artificial data to balance the class distribution. The SMOTE oversampling method is widely used for machine learning, and several previous studies highlighted that this method improved model accuracy. Yousefian-Jazi et al. proposed a decision support system to automatically detect thin film transistor liquid crystal display (TFT-LCD) glass substrate defects using SVM [17], employing SMOTE techniques to balance their dataset. Their results showed that model accuracy increased, and the proposed SVM model outperformed classification and regression tree and multilayer perceptron (MLP) models during testing. Kim et al. employed SMOTE to solve data imbalance for a semiconductor dataset [18], providing improved model accuracy. The RF model achieved the highest accuracy compared to the decision tree, logistic regression, and artificial neural network models.
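As a rough illustration of how such artificial data can be created, the following pure-Python sketch follows the basic SMOTE idea of interpolating between a minority sample and one of its nearest minority neighbors. It is a simplified sketch, not the implementation used in the cited studies; the function name, the default `k`, the seed, and the toy data are assumptions.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic samples by interpolating between a minority
    sample and one of its k nearest minority neighbours.

    Assumes `minority` holds at least two numeric tuples of equal length.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of base, excluding base itself
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        target = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> target
        synthetic.append(tuple(b + gap * (t - b) for b, t in zip(base, target)))
    return synthetic


# Toy minority class (hypothetical values): corners of the unit square.
minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
new_points = smote(minority, n_new=6, k=2)
```

Because every synthetic point lies on a segment between two real minority samples, the new data stay close to the existing minority region rather than being scattered at random.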
Previous studies have shown that employing DBSCAN outlier detection and SMOTE to balance the dataset distribution provides improved model accuracy. Therefore, we propose a hybrid fault model consisting of DBSCAN outlier detection, SMOTE, and RF to improve classification accuracy and identify faults at an early stage. Thus, managers can take appropriate actions earlier to prevent further losses due to faults during manufacturing.

System Design
This section describes the overall design for the proposed affordable fast early warning system (AFEWS). Figure 1 shows that the proposed AFEWS consists of an edge device and visualization tools. The edge device consists of a single-board computer (SBC), sensors, the hybrid fault model, and data storage, and gathers and processes environmental conditions from the workstation. The hybrid fault model embedded in the edge device triggers alert/warning messages with fast response when a fault is detected during a process. The generated sensor data and prediction results are stored and also pushed to the data visualization layer. Management can monitor the status of every workstation through the web dashboard and receive warning messages via email and/or instant messaging (IM) when a fault is detected.
Edge devices were installed at every workstation to allow conditions to be monitored for each workstation during the process. Figure 2 shows an example assembly line process for the automobile parts factory, with edge devices in four workstations with different assembly processes. The edge devices act independently to collect, store, process, and predict fault states with fast response without requiring data communication to the cloud.
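The per-reading flow on one edge device (predict locally, store, and alert on a fault) might be sketched as follows. This is an illustrative outline only: `handle_reading`, the record fields, and the threshold rule standing in for the fault model are all hypothetical and are not taken from the implemented system.

```python
from datetime import datetime, timezone


def handle_reading(reading, predict, store, notify):
    """Classify one sensor reading locally, store it, and alert on faults."""
    status = predict(reading)  # expected to return "normal" or "abnormal"
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "reading": reading,
        "status": status,
    }
    store(record)  # e.g., an insert into the local database
    if status == "abnormal":
        notify(f"Fault detected at workstation {reading['workstation']}")
    return record


# Stand-ins for the real components (all hypothetical):
stored, alerts = [], []


def threshold_rule(reading):
    """Placeholder classifier, NOT the hybrid fault model."""
    return "abnormal" if abs(reading["gyroZ"]) > 5 else "normal"


handle_reading({"workstation": 1, "gyroZ": 9.2}, threshold_rule,
               stored.append, alerts.append)
handle_reading({"workstation": 2, "gyroZ": 0.3}, threshold_rule,
               stored.append, alerts.append)
```

Passing the classifier, storage, and notification steps in as callables keeps the loop independent of any cloud service, which is the property the design relies on.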

Edge Devices
Edge devices were attached to relevant workstations, so that sensors could collect various environmental condition data, such as gyroscope, accelerometer, temperature, humidity, ambient light, and air quality, during the process. The SBC provided general-purpose input and output (GPIO) ports, low cost, and low power consumption [33]. There were several suitable SBC options, including Raspberry Pi [34] and BeagleBoard [35]. We selected Raspberry Pi for the present study due to its lower cost, support availability, and reasonable performance [36]. A local database was installed in each edge device to store collected sensor data and analysis results.
The Raspberry Pi was approximately 8 × 5 × 1 cm (width/length/thickness) and provided micro USB power, micro SD card, display serial interface (DSI), USB, camera serial interface (CSI), LAN, HDMI, audio, and video ports, along with 40 pin GPIO connectors for sensor interfaces. The Raspberry Pi used in this study was as follows: We employed several sensor devices, as detailed in Table 1: Sense Hat [37] for accelerometer and gyroscope data, DHT11 [38] for humidity and temperature, BH1750FVI [39] for ambient light, and ZP01-MP503 [40] for air quality. We used MongoDB as the local edge device database for this study. MongoDB is a NoSQL database, which offers a flexible data schema compared to relational databases. Previous studies have shown it can be efficiently used to store continuously generated sensor data [29,41]. Several client libraries, including C, C++, Python, Java, Node.js, etc., were also available to simplify direct communication with MongoDB [42], ensuring the development community can rapidly develop MongoDB based applications [43].
We installed the officially supported operating system (OS) for the Raspberry Pi (Raspbian Stretch with Desktop OS) [44], with MongoDB V3.2, Python V3.6.5, and PyMongo V3.7.2 [45] to collect sensor data and store them to MongoDB. Incoming sensor data were analyzed using the proposed hybrid fault model to predict workstation conditions, and this result was also stored to MongoDB.

Proposed AFEWS Implementation
This section describes the proposed AFEWS implementation. AFEWS was implemented for an automobile parts assembly line producing car door trim parts. Four steps were monitored: inside handle, switch bezel, main parts, and fusion splitting assembly, as shown in Figure 2. The assembly line required approximately 5 min to perform each process step. We identified the fault cause for the door trim process: if the worker screwed and/or hammered too hard, the product would break, causing a process fault.
Figure 3a shows the assembly line layout in the factory, highlighting an example workstation shown in detail in Figure 3b, with typical edge device installation in Figure 3c.
Appl. Sci. 2019, 9

Hybrid Fault Model
We used the hybrid fault model, combining DBSCAN [15], SMOTE [16], and RF [46], to predict normal and abnormal events during the process, as shown in Figure 4. We collected an experimental dataset during initial implementation to evaluate performance of the proposed model. The collected dataset included 614 instances (378 normal and 236 abnormal), each with 10 attributes as follows:
• Gyroscope X, Y, and Z directions (gyroX, gyroY, and gyroZ, respectively).
Random forest classification was employed to learn and generate a robust model from the collected dataset, which was then installed into each edge device to ensure prediction from sensor data could be generated and presented in fast response.
We used preprocessing to remove inappropriate data and correct for missing values, with attribute and class parameters shown in Table 2. The air quality sensor provided categorical values (clean (CL), light pollution (LP), moderate pollution (MP), and severe pollution (SP)); hence, the statistical distribution could not be derived for this attribute. Figure 5 shows the information gain [28] significance for each attribute. Gyroscope Z and Y directions (gyroZ and gyroY, respectively) had the greatest effect on abnormal events.
Due to imperfect sensing devices and network connection problems, some sensor data may be significantly noisy or outliers. Therefore, we filtered the sensor data using DBSCAN [15] implemented in R V3.5.1 [47] to remove outliers, having previously determined optimal values for epsilon (eps = 4) and minimum points (MinPts = 5), i.e., the neighborhood radius around a data point and the minimum number of neighboring data points, respectively. We defined eps by calculating the average distance of every point to its k-nearest neighbors, using R V3.5.1 [47].
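The nearest-neighbor distances used to choose eps can be computed as in the following pure-Python sketch. The study performed this step in R; the function names and toy points here are hypothetical. The first function returns the sorted distance to the k-th nearest neighbor, which is the usual input for a sorted k-distance plot, and the second returns the average distance of every point to its k nearest neighbors.

```python
import math


def k_distances(points, k):
    """Sorted distance from each point to its k-th nearest neighbour."""
    result = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        result.append(dists[k - 1])
    return sorted(result)


def mean_knn_distance(points, k):
    """Average, over all points, of the mean distance to the k nearest neighbours."""
    total = 0.0
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        total += sum(dists[:k]) / k
    return total / len(points)


# Toy data (hypothetical): a tight square of four points.
square = [(0, 0), (0, 1), (1, 0), (1, 1)]
kd = k_distances(square, k=2)
avg = mean_knn_distance(square, k=2)

# Adding an isolated point produces a sharp jump at the end of the sorted curve.
kd_with_outlier = k_distances(square + [(10, 10)], k=2)
```

A common reading of the sorted curve is to pick eps near the point where the distances start to rise sharply, since points beyond that bend sit in sparse regions.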
Figure 6a,b show the sorted k-nearest neighbor (NN) distribution and DBSCAN outlier detection, respectively. DBSCAN grouped the sensor data into four clusters, labelled 1-4, with cluster 0 representing un-clustered data, regarded as outliers. A total of 39 outlier instances were identified (of 614), and the remaining 575 were used for further analysis.
Data distributions in this study were unbalanced between normal and abnormal subsets after DBSCAN, as shown in Table 3. Therefore, we employed SMOTE to balance the dataset, using Weka V3.6.15 [48]. The original dataset distribution was maintained by adding synthetic data close to minority data, with 50% increased abnormal class data balancing the dataset well (see Table 3), allowing machine learning to generate high classification accuracy.
The RF algorithm is a supervised classification model formed by combining several decision tree models. Each tree within the RF is independently constructed by choosing a random subset of attributes and bootstrap samples from the dataset. The predictions of the generated trees are combined using a majority voting method to obtain the final outcome [49]. RF overcomes various decision tree problems, such as high variance and overfitting. We applied DBSCAN to remove outliers, SMOTE to balance the resulting dataset, and RF to learn and generate the final model from the balanced dataset. The generated model was then installed on each edge device for fault prediction.
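The two RF ingredients described above, per-tree randomization (bootstrap samples and a random attribute subset) and majority voting, can be sketched in a few lines of Python. This is not a full random forest and not the model trained in this study: tree induction is omitted, the trees are hypothetical callables, and all names are assumptions.

```python
import random
from collections import Counter


def bootstrap_sample(rows, rng):
    """Draw len(rows) rows with replacement (one training set per tree)."""
    return [rng.choice(rows) for _ in rows]


def random_feature_subset(n_features, size, rng):
    """Pick `size` distinct attribute indices for one tree."""
    return sorted(rng.sample(range(n_features), size))


def majority_vote(trees, x):
    """Combine per-tree predictions for x into a single class label."""
    votes = Counter(tree(x) for tree in trees)
    return votes.most_common(1)[0][0]


rng = random.Random(42)
rows = list(range(20))                      # stand-in for 20 training instances
sample = bootstrap_sample(rows, rng)
features = random_feature_subset(n_features=10, size=3, rng=rng)

# Three hypothetical, already-trained trees voting on one reading.
trees = [lambda x: "abnormal", lambda x: "normal", lambda x: "abnormal"]
label = majority_vote(trees, x={"gyroZ": 9.2})
```

Because each tree sees a different sample and attribute subset, their individual errors are less correlated, which is why averaging their votes tends to be more stable than relying on a single tree.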
Prediction output may have four possible outcomes [50]. True positive (TP) and true negative (TN) outcomes are the numbers of correctly classified data. False positive (FP) and false negative (FN) outcomes are the numbers of data incorrectly classified as normal when they are actually abnormal, and as abnormal when they are actually normal, respectively. We employed 10-fold cross-validation for all classification models, with the final performance metric being the average over the folds. Table 4 shows classification performance metrics based on precision (p), recall (r), F-1 score (f), and accuracy (a); for example, accuracy is $a = (TP + TN)/(TP + TN + FP + FN)$.
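For reference, the four metrics can be computed from the confusion counts as in the sketch below. The accuracy expression is the one given in the text; precision, recall, and F-1 score use their standard definitions, which are assumed here to match Table 4. The function name and the example counts are hypothetical and are not results from this study.

```python
def classification_metrics(tp, tn, fp, fn):
    """Precision, recall, F-1 score, and accuracy from confusion counts.

    Assumes tp + fp > 0 and tp + fn > 0 (otherwise a division by zero occurs).
    """
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    return {"p": precision, "r": recall, "f": f1, "a": accuracy}


# Hypothetical counts for illustration only.
metrics = classification_metrics(tp=90, tn=80, fp=10, fn=20)
```

Under 10-fold cross-validation, a function like this would be applied to each fold and the ten results averaged.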

Data Visualization
The web dashboard and warning messages were implemented in the data visualization layer. These modules enable managers to visually monitor assembly line conditions and immediately receive alert or warning messages when a fault is detected during the process. Information regarding process conditions from the edge device was provided visually in a web dashboard. The web dashboard visualized conditions in chart form to provide historical conditions prior to fault occurrence, to assist in analyzing the root cause. Alerts or warnings were also provided by the web dashboard and sent to registered contacts via email and/or IM apps, e.g., Telegram [51]. Thus, managers can instantly notice fault conditions, which will improve decision-making.
The web dashboard was developed using several open source software (OSS) packages, including Node.js V10.13.0, Express.js V4.16.0, Bootstrap V4.1.3, Chart.js V2.7.3, and Socket.IO V2.1.1. We employed Node.js as a web server, whereas Express.js, Bootstrap, and Chart.js were used for visualization. Socket.IO was used to handle and present sensor data and prediction results with fast response. When the web dashboard initiates, the local IP address and port are visible and can be accessed through a web browser on a computer, smartphone, or tablet connected to the local network. Figure 7 shows how the web dashboard presents sensor data, i.e., gyroscope, accelerometer, temperature, humidity, ambient light, and air quality. The sensor number represents the process identification for the workstation for that specific process. The hybrid fault model was applied within each edge device to predict the fault state, which was presented to the web dashboard with fast response. Once a fault was detected, the warning message was immediately generated, sent to the web dashboard, and subsequently sent to registered contacts by email and/or IM, as shown in Figure 8. Thus, managers can immediately notice the fault and take action to prevent further losses during the process.

Edge Device Performance
This section discusses the proposed edge device performance. The edge devices combine several sensors and a client program to retrieve, store, and analyze sensor data. Previous studies utilized response time and CPU usage as performance metrics to evaluate IoT device performance [52,53]. We defined response time as the average time between sending sensor data from the source (client program) and successful delivery to the destination, i.e., the local database, and CPU usage as average CPU usage under different storing scenarios.
We used Python for the client program on edge devices that collected gyroscope, accelerometer, temperature, humidity, ambient light, and air quality data. Edge devices incorporated the Linux Raspbian OS Stretch and 1 GB RAM for the experiment. We set the client program to retrieve sensor data every 5 s with different storage periodicity into the local database: scenarios I-V stored one data set every 5 s, two every 10 s, three every 15 s, four every 20 s, and five every 25 s, respectively. Hence, incoming sensor data was held in the client program before storage in the local database for scenarios II-V. CPU usage was monitored over 2 min for each scenario.
Figure 9 shows the mean response time and CPU usage under the different scenarios. Response time increased significantly as more sensor data were stored to the database per write, whereas CPU usage decreased. Thus, storing sensor data less frequently, e.g., scenario V, generated the least CPU usage. However, response time and CPU usage were less than 0.05 s and 0.5% for all scenarios, and scenario III provided the optimal tradeoff between response time and CPU usage for this study, although this would require case by case investigation for other implementations. Thus, the proposed edge device has sufficient capability to gather and store sensor data with relatively low CPU usage and response time.
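The storage scenarios can be pictured as a small in-memory buffer that is flushed to the database once a given number of readings has accumulated. The sketch below is an assumed structure, not the client program used in the experiment: the class name and the `sink` callable are hypothetical, and in a PyMongo-based client the sink could be, for example, a collection's `insert_many` method.

```python
class BufferedStore:
    """Hold readings in memory and flush them to a sink in batches."""

    def __init__(self, sink, batch_size):
        self.sink = sink              # callable taking a list of readings
        self.batch_size = batch_size  # 1 to 5 would correspond to scenarios I to V
        self.buffer = []

    def add(self, reading):
        """Queue one reading and flush when the batch is full."""
        self.buffer.append(reading)
        if len(self.buffer) >= self.batch_size:
            self.flush()

    def flush(self):
        """Write any held readings to the sink and empty the buffer."""
        if self.buffer:
            self.sink(list(self.buffer))
            self.buffer.clear()


# Scenario III analogue: one database write for every three readings.
batches = []                      # stand-in for the local database
store = BufferedStore(batches.append, batch_size=3)
for n in range(7):                # seven readings, e.g., one every 5 s
    store.add({"n": n})
pending = len(store.buffer)       # readings still held in memory
store.flush()                     # write out the remainder on shutdown
```

Larger batches mean fewer writes, which is consistent with the lower CPU usage reported for the later scenarios, at the cost of each write carrying more data and of readings waiting longer in memory.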


Fault Model Performance
The proposed hybrid fault model was compared with several current classification models, as shown in Table 5. The proposed model outperformed all other considered models significantly for recall (r), F-1 score (f), precision (p), and accuracy (a), providing 98.64% prediction accuracy.
The proposed hybrid fault model combining DBSCAN, SMOTE, and RF provided improved classification accuracy. We investigated DBSCAN and SMOTE impacts for classification models, as shown in Figure 10. Applying DBSCAN and SMOTE improved classification accuracy for all models except logistic regression (LR), with 1.07% average improvement. The generated hybrid fault model was installed onto the edge devices to provide fault prediction. Input gyroscope, accelerometer, temperature, humidity, ambient light, and air quality sensor data were then handled, processed, and analyzed in the edge device without requiring network communication with the cloud server, minimizing network costs, while simultaneously improving data processing and analysis speeds. The proposed AFEWS will help improve decision-making, as well as reduce or prevent unexpected losses caused by early stage faults.

Managerial Implications
There are three main managerial implications for the proposed AFEWS: low-cost system development, assembly line monitoring, and improved decision-making due to timely fault warnings. The study utilized edge computing technology based on a Raspberry Pi SBC that offered small size, low cost, and sufficient processing power [54,55]. The edge device also included Sense HAT, DHT11, BH1750FVI, and ZP01-MP503 sensors, and provided a 16 GB micro SD memory card for local database storage of sensor and fault prediction data. The total cost for each edge device was approximately $91 [56–61], although component prices may vary between suppliers. Therefore, the edge devices were considered to be cost effective.
Previous studies identified significant advantages to utilizing Raspberry Pi for similar scenarios, including secure and robust healthcare services [62], intelligent video surveillance platforms [63], building management systems [64], and beehive monitoring [65]. Edge computing benefits have been clarified by several studies to include reduced response time [66] and energy consumption [67]. Adopting OSS can also generate economic gains, including software development productivity, low investment cost (i.e., licenses), and external support availability [68–70].
Machine learning has been applied in manufacturing to differentiate carbon fiber fabric types [71], diagnose locomotion gait faults in reconfigurable robots [72], assess machine tool health status [73], detect early stage electrical faults in induction motors [74], detect sealing surface defects on a chili oil production line [75], inspect metallic surfaces on a flat metal component production line [76], predict aerospace deburring [77], and estimate the remaining useful life of turbofan engines [78], among others. The present study adopted a machine learning model in the edge devices to detect fault events during the assembly line process. Once a fault is detected, the proposed AFEWS sends a warning message to registered users via email and/or instant messaging (IM). Thus, managers can notice faults immediately and take action to prevent further losses. Results from the present study provide practical guidelines for industrial practitioners to integrate edge devices and machine learning models for early warning systems into their manufacturing processes.
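The warning step can be sketched as follows. This is a hedged illustration rather than the AFEWS code: the function names, the message wording, and the device and fault labels are hypothetical. The only external interface assumed is the Telegram Bot API `sendMessage` method, and the sketch only builds the messages without sending them.

```python
from email.message import EmailMessage
from urllib.parse import urlencode


def build_email_warning(sender, recipient, device_id, fault_label):
    """Build (but do not send) an email warning for a predicted fault."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = f"AFEWS warning: {fault_label} on {device_id}"
    msg.set_content(
        f"Edge device {device_id} predicted a fault event: {fault_label}."
    )
    return msg


def build_telegram_request(bot_token, chat_id, text):
    """Return (url, body) for a Telegram Bot API sendMessage call."""
    url = f"https://api.telegram.org/bot{bot_token}/sendMessage"
    body = urlencode({"chat_id": chat_id, "text": text}).encode("utf-8")
    return url, body


# Hypothetical identifiers for illustration only.
email_msg = build_email_warning(
    "afews@example.com", "manager@example.com", "edge-01", "abnormal event"
)
tg_url, tg_body = build_telegram_request("TOKEN", 123, "Fault on edge-01")
```

In a deployment, the email could be handed to `smtplib.SMTP(...).send_message(email_msg)` and the Telegram request posted with `urllib.request.urlopen(tg_url, data=tg_body)`; both are left out here so the sketch has no network side effects.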

Conclusions
We proposed AFEWS, based on edge devices and a hybrid fault model, to identify faults quickly during the assembly line process. The present study adopted edge computing, enabling fast data collection and prediction within each edge device without requiring communication with a cloud server. The edge device incorporated several sensor devices and collected and processed sensor data, which were subsequently analyzed on the edge device using the hybrid fault model. Input sensor data and prediction results are then presented on the web dashboard. When a fault is detected, warning messages can be sent immediately to identified managers via the web dashboard, registered email addresses, and IM (Telegram). The proposed AFEWS was based on a low-cost edge device capable of processing the sensor data in an acceptable time. Edge device performance was assessed using various metrics, including network delay and CPU usage. The proposed edge device provided an efficient solution for all experimental scenarios, successfully collecting, processing, and analyzing the sensor data within an acceptable time and at low computation cost.
We implemented a hybrid fault model combining DBSCAN outlier detection, SMOTE, and an RF classifier, which achieved the highest accuracy compared with several common classification models.
The proposed AFEWS was tested over four months of operation on an actual production assembly line at a Korean automobile parts factory, and exhibited significant benefits by allowing managers to monitor the process status and identify process faults, preventing or significantly reducing unexpected losses.
Future studies will consider more comprehensive performance evaluations, as well as different scenarios, operations, and setup locations. Various abnormal events can be further identified, collected, and analyzed to extend the proposed model to learn from new complex datasets.

Figure 2. Example assembly line excerpt with edge devices attached to workstation panels.

Figure 4. Hybrid fault model using density-based spatial clustering of applications with noise (DBSCAN) outlier detection, synthetic minority oversampling technique (SMOTE) class balancing, and random forest (RF) classification.

Figure 5. Attribute significance (attribute labels are defined in the main text above).

Figure 8. Warning message from AFEWS to registered contact via (a) Telegram messenger and (b) email.

Figure 9. Edge device performance in terms of response time and CPU usage for different storing periodicity. Scenario details are provided in the main text above.

Table 1. Proposed affordable fast early warning system (AFEWS) sensor devices.

Table 2. Experimental dataset distributions.
Notes: Min = minimum; Max = maximum; STD = standard deviation; air quality sensor provided categorical values, hence statistical distribution could not be derived; attribute labels are defined in the main text above.

Table 4. Classification model performance metrics.

Table 5. Performance comparison of several classification models.