Improved Ensemble-Learning Algorithm for Predictive Maintenance in the Manufacturing Process

Industrial Internet of Things (IIoT) technologies comprise sensors, devices, networks, and applications from the edge to the cloud. Recent advances in data communication and application using IIoT have streamlined predictive maintenance (PdM) for equipment maintenance and quality management in manufacturing processes. PdM is useful in fields such as device, facility, and total quality management. PdM based on cloud or edge computing has revolutionized smart manufacturing processes. To address quality management problems, we develop a new calculation method that improves ensemble-learning algorithms with adaptive learning to make a boosted decision tree more intelligent. The algorithm predicts the main PdM issues, such as product failure or unqualified manufacturing equipment, in advance, thus improving the machine-learning performance. Semiconductor and blister packing machine data are used separately in manufacturing data analytics. The semiconductor data help in predicting yield failure in a semiconductor manufacturing process, and the blister packing machine data are used to predict the product packaging quality. Experimental results indicate that the proposed method is accurate, with an area under the receiver operating characteristic curve exceeding 96%. Thus, the proposed method provides a practical approach for PdM in semiconductor manufacturing processes and blister packing machines.


Introduction
The Industrial Internet of Things (IIoT) comprises internet-connected devices and cutting-edge analytics platforms that process manufacturing data. Generally, IIoT refers to applying Internet of Things (IoT)-related technologies in the manufacturing industry. These associated technologies are concerned with the interconnection of smart objects within cyber-physical systems for industrial applications. With the rapid advancement in IIoT, considerable industrial data are available on the cloud. Although many studies exist on data analytics [1] and IIoT, few studies have investigated their convergence [2]. Recent IIoT applications have aimed to create more value for their services. Further, predictive maintenance (PdM) is a maintenance approach that predicts the future failure of manufacturing process-related problems (e.g., yield, equipment, and product quality). PdM is used to maintain industrial assets such as devices and production quality and quantity.
IIoT applications have been integrated with the PdM approach, referred to as "IIoT-based PdM." The combination of IIoT and PdM has been widely used in the energy industry to improve production, identify potential failures, and detect anomalies. Chevron Corporation (San Ramon, CA, USA), a famous energy company, leverages IIoT applications to identify corrosion and pipeline damages. Jabil Inc. (Saint Petersburg, FL, USA), an electronic manufacturing company, uses cloud analytics to predict product images on the plant floor [3]. IBM (San Francisco, CA, USA), Salesforce Inc. (San Francisco, CA, USA), General Electric (GE; Boston, MA, USA), and Cloudera Inc. (Santa Clara, CA, USA) provide PdM applications in the factory even though each of these companies has a unique manufacturing process [4]. IIoT-based PdM has transferred large amounts of data from the edge side to the cloud side. The manufacturing industry faces a challenge in using big data and extracting valuable industrial information. For small- or medium-sized manufacturing companies, using data to enable intelligent manufacturing is an economical approach: data analytics using existing equipment is less expensive than purchasing new advanced equipment. Thus, using big data to enhance industrial manufacturing processes is essential to the manufacturing industry. Machine learning (ML) can mine manufacturing process data to obtain useful information applicable to specific manufacturing processes. To enhance the performance of industrial data analytics, we propose a new ML algorithm, an adaptive boosted decision tree that integrates a neural network and a boosted decision tree. The proposed algorithm is based on the boosted decision tree, and the neural network is used for retraining the weaker samples. The proposed method uses adaptive learning to dynamically improve the boosted decision tree, making the ML model smarter.
This study uses the proposed algorithm to improve production quality in semiconductor manufacturing processes and packaging machining.

Semiconductor Manufacturing Process
Semiconductor manufacturing process management is based on signal data collected from process measurement points and sensing elements. Figure 1 shows the necessary steps in a semiconductor manufacturing process, with signal data at each stage. There are more collected signal data than actual signal data in a semiconductor manufacturing process. Received signal data include noise, relevant data, and irrelevant data. Noise and related data may improve the monitoring of the manufacturing process. Signal data optimization increases semiconductor manufacturing process yield, decreases computation time, and lowers production costs per unit.

Scissor Product Packaging Machining Process
In the scissor manufacturing industry, different scissor products have different shapes. After the primary manufacturing process, a scissor product needs to be packaged using a machining device. During the forming process, Polyvinyl Chloride (PVC) and other materials can be combined to form a blister package component. The packaged part of the scissor package product includes the blister top cover, blister bottom cover, blister lining, and card. The blister packing machine is a standard package machining device used to form a blister element and heat seal it with a card. High-quality packaging protects the scissor product from abrasion. However, different scissor shapes may form blisters. The card is difficult to handle because of the blister component's rough surface. Figure 2 shows the packaging procedure. The different lidding structures require different equipment settings, such as pressure, dwell time, and seal temperature settings, in the heat-sealing process. The glue material coating weight is critical for determining the seal quality. An incorrect heat seal may cause package leakage, resulting in the possibility of product abrasion. The heat-seal process may eventually cause moisture to enter the package, affecting product quality. Thus, scissor products should be packaged with heat-seal coatings of appropriate thickness. During the heat-sealing process, the machine seals from the left side to the right side of the package component to maintain appropriate heat-seal coating thickness. Therefore, the scissor product package should be placed with the appropriate area between the product-containing cavities during the trimming process. The package quality should be validated through leakage testing, and any unqualified package should be withdrawn. The main factors in the packaging machining process include the heat-sealing time, cooling time, the blister bottom cover material, and the card material. 
This study uses equipment data such as heating time, pressure, and cooling time to predict packaging quality. Innovative IIoT-related technologies have transformed the manufacturing industry into an intelligent, dynamically scalable, and demand-driven service sector with a cost-effective business model. The application of PdM in the manufacturing industry faces a challenge in using manufacturing data across spatial boundaries. We develop an adaptive boosted decision tree algorithm for yield prediction and fault diagnosis to address this challenge. This study aims to enhance the calculation performance of ensemble-learning algorithms (ELAs). ELAs integrate multiple ML models to produce better predictive performance than a single statistical or ML model. The main principle behind an ensemble model is that it combines a group of weak learners to form a strong learner, thereby improving the overall performance of a learning model. This study aims to predict, detect, identify, and classify failures and degradation in the manufacturing process before they become critical. This study's contribution is to prevent unqualified production, identify root causes for follow-up processes, and enable efficient evidence-based maintenance planning and manufacturing process optimization. Thus, we propose an improved ML algorithm and data analytics technologies for PdM in the manufacturing field. Our research contributions can be summarized as follows.

• Develop a new approach to improve the computation performance of boosted decision trees.
• Investigate the effect of ELAs and single models on manufacturing.
• Forecast yield failure in the semiconductor manufacturing process.
• Predict the package quality of the blister packing machine.
• Provide a prediagnostic suggestion for equipment configuration to improve work efficiency.

The remainder of this paper is organized as follows. Section 2 provides further background on IIoT and PdM in the manufacturing process. In Section 3, we review IIoT- and PdM-related literature for manufacturing. We present the framework of this research in Section 4. Section 5 describes the use case, dataset, and evaluation criteria. We apply the proposed approach to analyze open-source data from the UCI repository and equipment data from a practical factory. Section 6 presents the analytics results, and Section 7 discusses the effect of PdM on IIoT and other related problems. Finally, Section 8 presents the conclusion and future research opportunities.

Current Trend of IIoT-Based PdM
According to the latest forecast by IDC [5], the proportion of IIoT-related applications in the IoT market will be 18.3% in 2023. IIoT integrates a sensing element, hardware, software, and firmware to improve industrial management and manufacturing processes [6]. Such technologies include a cyber network, data connection, data storage and application, and cloud platform. IIoT could grow rapidly; for example, IndustryARC ® reported that the IIoT market could reach USD 123.89 billion by 2021 [7]. In 2023, the IIoT market size could reach USD 310 billion [8]. Worldwide spending on software and hardware in IIoT could reach USD 982 billion in 2024 [9] and USD 949 billion by 2025 [10]. Thus, using IIoT technologies to develop PdM is the current trend. An IIoT vendor has advanced technologies with advantages of scalability, effective data management, and a real-time monitor module. Therefore, IIoT vendors, such as PTC Inc. Furthermore, integrating IIoT technologies into PdM-related technologies can effectively automate more equipment maintenance processes and increase enterprise profits. The combination of IIoT and PdM technologies is called "IIoT-based PdM." Figure 3 shows the global PdM-related market sales forecasted by consulting firms. For example, Statista ® predicted that PdM-related sales would reach USD 9.1 billion in 2021 [11]. According to a report by Consultancy.org ® , the global market sales of PdM-related technologies were USD 6 billion in 2020 [12]. PdM-related technology sales could reach USD 11 billion in 2022, increasing by 83% in two years (2020-2022). According to a report by Consultancy.org ® , IoT analytics company reports have a similar prediction trend, i.e., the increase rate from 2020-2024 is up to 83% [12,13]. A report published by MarketsandMarkets™ projects that PdM-related applications will increase and that the PdM market size could reach USD 10.7 billion in 2024 [14].
Investment is increasing in PdM-related applications, IoT-related applications, artificial intelligence techniques, and ML technologies [14]. Most IoT vendors integrate data analytics technologies and data communication to develop IIoT-based PdM applications. For example, Microsoft Azure, IBM Watson, and Amazon Web Services (Seattle, WA, USA) provide data storage, IoT gateway, ML module, and other services for PdM in manufacturing. These major factors may contribute to the PdM-related market growth during the forecast period. IIoT-based PdM vendors focus on cloud- and edge-based predictive analytics. The present study analyzes the semiconductor and blister packing machine cases using the ML algorithm in data analytics.

IIoT-Based PdM-Related Studies
To survey IIoT-based PdM-related research on the above-listed manufacturing problems, we first investigated and highlighted current research efforts directed at PdM technologies using IIoT. Second, we investigated maintenance-related problems using ELAs.

IIoT-Based PdM-Related Studies
This study surveyed 986 relevant studies from 2015 to 2020 and analyzed the time trend and disciplinary distribution of emerging IIoT-based PdM-related topics. Initially, we used the academic publishing analytics software "Publish or Perish" (PoP) to obtain related papers and analyze them. PoP uses benchmark data sources such as Google Scholar to obtain the papers' citation information, analyzes the obtained information, and presents academic estimate scores (e.g., h-index). We reviewed the obtained results manually and then filtered them on the basis of citations. Table 1 lists the top 20 cited studies and provides detailed descriptions. We included keywords from the top-cited articles in this study to present the current technology trends. The findings of these studies suggest that data-driven technologies, cyber-physical systems, and real-time communication are the core problems that drive research associated with IIoT-based PdM. Critical review, data analytics, and system design are among the primary research topics. "Industry 4.0," "Smart manufacturing," "Cloud computing," and "Data analytics" are high-frequency keywords for IIoT-PdM-related topics. Table 1 demonstrates that papers published earlier have more citations than later-published articles; e.g., articles [15][16][17][18][19][20][21][22][23][24][25]. However, citation analytics over the most recent five years is limited by the publishing time window: earlier articles have had a longer window in which to accumulate citations than later articles. Therefore, the citation analytics in this study may be biased toward articles published in earlier years.
Table 1. IIoT-based PdM-related studies from 2015 to 2020.

Gilchrist et al. presented IIoT potential applications, including data mining, predictive analytics, statistical approaches, and other application opportunities, to increase manufacturing productivity and efficiency [15]. Civerchia proposed an IIoT-based predictive maintenance application for predicting the remaining lifespan of equipment [33]. Rehman indicated that the IIoT incorporates data storage, processing, and analytics technologies [2]. Kanawaday used time-series data to predict IIoT performance. He used an autoregressive integrated moving average model to predict possible failures and quality defects in a slitting machine [35]. Some researchers have proposed that the concentric model [2] and data mining model [35] support a learner model to access and process data; thus, in a different field, such as the edge-cloud environment, the maintenance strategy can be predicted.

Application of Ensemble Learning in Predictive Maintenance
The high costs in maintaining today's sophisticated equipment necessitate the improvement of modern maintenance management efficiency. PdM for Industry 4.0 is an advanced concept for predicting equipment failure by analyzing data (e.g., device data, sensor data, and production data) to identify patterns and anticipate problems before failure. This section provides an overview of ensemble learning's application in PdM in the manufacturing industry. ELAs are an integration module that aggregates multiple ML algorithms and selects a high-performance ML to improve predictive performance. Furthermore, we investigated an ensemble-learning-based analytics method for achieving the goal of PdM, which has attracted considerable research interest. For example, Li integrated similar algorithms to predict the remaining useful lifespan of aircraft engines and bearings [36] and found that the ensemble-learning prognostic had the least prediction error. Wu et al. proposed a random forest-related algorithm that aggregated multiple regression trees to forecast tool wear in the dry milling process [37]. Additionally, some researchers have used the decision jungle algorithm to predict software defects [38]. A boosted decision tree is useful in the event of pipe failure [39]. The results of these previous works demonstrated that the ensemble-learning solution provides users with convenient tools to analyze equipment data and develop maintenance strategies.

Methodology
Previous studies have shown that ELAs exhibit high accuracy in some cases [38,39]. In this study, we used the proposed method, which was specifically designed for predicting yield failure in the semiconductor manufacturing process and the quality of the packing machining process. We processed manufacturing data with the standard data analytics procedure, which is based on the knowledge discovery in databases (KDD) process: the training dataset serves as the input, and the target is the yield failure in the semiconductor manufacturing process or the packing quality in blister packing machines. The procedure involves data collection, preprocessing, and training and evaluation of models, which are discussed below.

Data Preprocessing
The original dataset needs to be preprocessed for various objectives, such as handling missing values. The data obtained from the manufacturing industry and blister packing machines need to be processed before being imported into the machine-learning model. In the manufacturing field, obtaining anomaly data is challenging because an anomaly situation occurs less often than a normal situation. Therefore, manufacturing data may face the data imbalance problem. Data imbalance refers to a dataset in which one class outnumbers the others by a significant proportion. Most real datasets in the manufacturing field suffer from an imbalanced ratio, which may degrade classification accuracy. In both cases, the data should be preprocessed for data balance. This study used the synthetic minority oversampling technique (SMOTE) to balance the proportion of each class. SMOTE, proposed by Chawla et al. [40], is an oversampling method based on the k-nearest neighbor (KNN) algorithm. Nearest neighbors are determined by the Euclidean distance between data points in the feature space. The procedure of SMOTE is summarized as follows.

• Step 1: Select a data point from the minority class.
• Step 2: Find the k nearest neighbors of the selected data point.
• Step 3: Select one of these neighbors and place a new point on the line segment connecting the selected point and the chosen neighbor.
• Step 4: Repeat Steps 1 to 3 until the termination condition is met (i.e., until the data are balanced).

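The SMOTE steps listed above can be illustrated with a minimal sketch. It uses only the Python standard library; the function name `smote_sketch`, its parameters, and the toy data are assumptions made for illustration and are not taken from the study's implementation.

```python
import math
import random

def smote_sketch(minority, n_new, k=5, seed=0):
    """Create n_new synthetic minority samples by KNN interpolation."""
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):  # Step 4: repeat until enough points exist
        # Step 1: pick a minority-class point.
        p = rng.choice(minority)
        # Step 2: find its k nearest minority neighbors (Euclidean distance).
        others = [q for q in minority if q is not p]
        neighbors = sorted(others, key=lambda q: math.dist(p, q))[:k]
        # Step 3: place a new point on the segment between p and one neighbor.
        q = rng.choice(neighbors)
        gap = rng.random()
        synthetic.append([a + gap * (b - a) for a, b in zip(p, q)])
    return synthetic

# Toy minority class (hypothetical values): four corners of the unit square.
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_points = smote_sketch(minority, n_new=6, k=2)
```

To balance a dataset, `n_new` would be set to the difference between the majority and minority class counts. Because every synthetic point is an interpolation between two existing minority points, it always lies inside the region already spanned by the minority class.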
After balancing the data proportion, the features were selected from the preprocessed data via data balance and mutual information (MI) estimations. MI is a score that measures the mutual dependencies among variables. It is useful in feature selection as it maximizes the MI between the target variables and joint distribution in multidimensional datasets.
$$MI(X; Y) = \sum_{x \in X} \sum_{y \in Y} prob(x, y) \log \frac{prob(x, y)}{prob(x)\, prob(y)} \tag{1}$$
where MI (X; Y) represents the mutual information between variables X and Y. The variables prob(x) and prob(y) refer to the probability distributions of x and y, respectively, and prob (x, y) represents the joint probability distribution of X and Y.
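As a hedged illustration of MI-based feature ranking, the sketch below estimates MI for discrete variables from empirical frequencies and sorts features by their MI with the target. The helper names `mutual_information` and `rank_features` and the toy columns are hypothetical; continuous sensor signals would first need to be discretized (or handled by a dedicated estimator), which is not shown.

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    """MI (in nats) between two discrete sequences of equal length."""
    n = len(xs)
    count_x = Counter(xs)
    count_y = Counter(ys)
    count_xy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in count_xy.items():
        p_xy = c / n  # empirical joint probability
        mi += p_xy * math.log(p_xy / ((count_x[x] / n) * (count_y[y] / n)))
    return mi

def rank_features(columns, target):
    """Return feature names sorted by decreasing MI with the target."""
    scores = {name: mutual_information(col, target) for name, col in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)

# Toy data (hypothetical): one feature copies the target, one is independent of it.
target = [0, 0, 1, 1]
columns = {"noise": [0, 1, 0, 1], "informative": [0, 0, 1, 1]}
ranking = rank_features(columns, target)
```

In this toy case the informative feature attains MI = log 2 and the independent feature attains MI = 0, so the informative feature is ranked first.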

Training Module
We used emerging ELAs to predict the manufacturing data. Each ensemble algorithm has a unique ensemble method. The main ensemble methods are "bagging" and "boosting." The decision jungle used the "bagging" method, whereas the boosted algorithm used the "boosting" method. The bagging method produces an additional dataset from the original data and aims to decrease the variance in prediction. The bagging method was used in the decision jungle training process to reduce the complexity of models that overfitted the training data. The boosting method is used to adjust the weight of an observation based on the last classification; the weight of the observation can be updated when an incorrect classification occurs. Boosting was used in the boosted decision tree to improve prediction performance via iterative modification.
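The difference between the two ensemble methods can be sketched in a few lines. The bagging helper draws a bootstrap sample (an additional dataset drawn with replacement), and the boosting helper raises the weights of misclassified observations. The boosting update shown is the classic AdaBoost-style rule, used here purely as an illustration; the exact update used by the boosted decision tree in this study is not specified, and both function names are hypothetical.

```python
import math
import random

def bootstrap_sample(data, rng):
    """Bagging: draw len(data) items with replacement to form an additional dataset."""
    return [rng.choice(data) for _ in data]

def boost_weights(weights, correct):
    """Boosting (AdaBoost-style): increase the weights of misclassified observations."""
    err = sum(w for w, ok in zip(weights, correct) if not ok) / sum(weights)
    err = min(max(err, 1e-10), 1 - 1e-10)  # guard against log(0)
    alpha = 0.5 * math.log((1 - err) / err)  # importance of the current learner
    updated = [w * math.exp(-alpha if ok else alpha) for w, ok in zip(weights, correct)]
    total = sum(updated)
    return [w / total for w in updated], alpha

rng = random.Random(0)
resampled = bootstrap_sample([10, 20, 30, 40], rng)

# Four equally weighted observations; only the last one was misclassified.
new_weights, alpha = boost_weights([0.25] * 4, [True, True, True, False])
```

After the update, the single misclassified observation carries half of the total weight, so the next learner in the sequence concentrates on it.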

Decision Jungle
Shotton et al. proposed a decision jungle algorithm, which is an ensemble of rooted directed acyclic graphs (DAGs) [41]. The main difference between the traditional decision tree and the decision jungle is the number of paths from the root to the leaves. The decision jungle allows multiple paths from the root to each leaf, whereas only one path is allowed in the decision tree. The decision jungle optimizes the graph structure during construction: both node splitting and node merging are driven by the minimization of the same objective function. In the training procedure, each DAG is constructed independently, one level at a time. Equation (2) expresses the set of instances from Node i that travel through the left branches (L) and right branches (R) to any child node.
Here, N parent and N child denote the sets of parent and child nodes, respectively, and each parent Node i ∈ N parent has its own parameters for the split feature function f. The child Node j ∈ N child is assumed to exist. S i denotes the set of labeled training instances (x; y) that reach Node i. Let l i ∈ N child be the current assignment of the left edge from the parent Node i ∈ N parent to a child node, and similarly r i ∈ N child for the right edge. Equation (3) is an objective function that is minimized over the child-node assignments r , l and the split parameters α , where F(S) denotes the Shannon entropy of the class label y in the training instances (x; y). Random forests and decision jungles use trees and DAGs, respectively, as base learners. Shotton et al. reported that decision jungle computation is more memory-efficient than several other ensemble-learning modules [41].

Boosted Decision Trees
Boosting is a method for training a strong classifier using an ensemble of weak trees. The concept is similar to that of reinforcement learning. A boosted decision tree is built iteratively: each subsequent tree corrects the errors of the previous tree. We integrated the outcome of each tree to evaluate and find the optimal prediction result. Equation (4) expresses the objective function (O t ).
where f represents functions that contain the tree and leaf scores. Ls is the training loss function, R represents the regularization function, y' is the predicted value, y is the actual value, and t represents the epoch time.
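For orientation, one common way of writing a regularized boosting objective that is consistent with the symbols defined above is sketched below. This form follows the widely used gradient-boosted-tree formulation and is an assumption, not a reproduction of the study's own expression; the sample index i, the sample count n, and the tree index k are introduced here for illustration only.

$$O_t = \sum_{i=1}^{n} L_s\left(y_i,\, y'^{(t)}_i\right) + \sum_{k=1}^{t} R(f_k), \qquad y'^{(t)}_i = y'^{(t-1)}_i + f_t(x_i)$$

Under this reading, the tree f_t added at epoch t is chosen to reduce the training loss on the residual errors of the previous t - 1 trees, while the regularization term penalizes overly complex trees.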

Proposed Method: Adaptive Boosted Decision Trees
This study improves the boosted decision tree using a retraining mechanism that corrects the weak patterns in the training data. The retraining mechanism improves computational performance by expanding the solution space that the weaker patterns can explore. This study used neural networks to retrain the weaker patterns from the trained boosted decision tree model. The framework of the neural networks comprised the input, hidden, and output layers. Figure 4 depicts the flowchart of the proposed method. When training began, the input patterns were fed forward to the network's input layer. The patterns were then fed forward to the hidden layers, which are the intermediate layers between the input and output layers and where all computation occurs, and finally to the output layer. The error is estimated in the output layer by comparing the output result with the target. The proposed method was applied to data analytics for the PdM problem in the two manufacturing case studies. In this study, we compared the predictive performance of the new approach with that of single- and ensemble-based modules, namely the conventional decision tree algorithm, decision jungle, and boosted decision trees.
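The retraining idea can be illustrated with a small, self-contained sketch. Everything in it is an assumption made for illustration: "weak patterns" are taken to be the training samples that the base model misclassifies, `base_predict` is a hypothetical stand-in for the trained boosted decision tree, and `TinyNet` is a deliberately small network with one hidden layer. How the network's output is combined with the boosted tree at prediction time is not described in this section and is therefore not shown.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

class TinyNet:
    """One-hidden-layer network: input -> hidden (tanh) -> output (sigmoid)."""

    def __init__(self, n_in, n_hidden=4, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b2 = 0.0

    def forward(self, x):
        # Feed the pattern forward through the hidden layer to the output layer.
        h = [math.tanh(sum(w * v for w, v in zip(row, x)) + b)
             for row, b in zip(self.w1, self.b1)]
        out = sigmoid(sum(w * v for w, v in zip(self.w2, h)) + self.b2)
        return h, out

    def fit(self, X, y, epochs=1000, lr=0.5):
        for _ in range(epochs):
            for x, t in zip(X, y):
                h, out = self.forward(x)
                d_out = out - t  # output-layer error (cross-entropy gradient)
                for j, hj in enumerate(h):
                    d_h = d_out * self.w2[j] * (1 - hj * hj)  # error at hidden unit j
                    self.w2[j] -= lr * d_out * hj
                    self.b1[j] -= lr * d_h
                    for i, xi in enumerate(x):
                        self.w1[j][i] -= lr * d_h * xi
                self.b2 -= lr * d_out

    def predict(self, x):
        return 1 if self.forward(x)[1] >= 0.5 else 0

def retrain_weak(X, y, base_predict):
    """Collect the samples the base model gets wrong and fit a network on them."""
    weak = [(x, t) for x, t in zip(X, y) if base_predict(x) != t]
    if not weak:
        return None  # nothing to retrain
    net = TinyNet(n_in=len(X[0]))
    net.fit([x for x, _ in weak], [t for _, t in weak])
    return net

# Toy data (hypothetical): the label equals the second feature,
# but the stand-in base model looks only at the first feature.
X = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
y = [0, 1, 0, 1]
base_predict = lambda x: 1 if x[0] > 0.5 else 0
net = retrain_weak(X, y, base_predict)
```

In this toy setting the base model misclassifies two of the four samples, and the small network is trained only on those two, which mirrors the idea of giving the weaker patterns a second learner.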

Module Deployment
Users invoke functions in the programming language chosen by the developers, so function usage depends on that specific language (e.g., a Python plugin library or a MATLAB library). An application programming interface (API) is an interface through which multiple software applications interact, and it also serves as the communication medium between users and developers. A trained ML API can be used in real-time, batch, or offline data analytics. Therefore, many studies have deployed trained ML models in various forms, such as APIs and web services (WS), which serve as the integration approach for communicating with other applications, platforms, and databases. To increase the use of the trained model, the trained model herein was deployed as an API and a WS that can be used with different programming languages. Users can access an API using any programming language, regardless of the programming language used in developing the API. Microsoft Azure, Google, and Amazon also provide different services for generating modules such as APIs or WS. An API and WS can be used in other platforms to display a result. Figure 5 shows the API deployment in this study.
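A minimal sketch of such a deployment, using only the Python standard library, is given below. The endpoint path `/predict`, the field names in `FEATURES`, the response key `unqualified`, and the placeholder rule inside `predict_quality` are all hypothetical; the study's actual API layout is not described here, and a production service would load the trained model instead of the placeholder.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

FEATURES = ["heating_time", "pressure", "cooling_time"]  # hypothetical field names

def predict_quality(features):
    """Placeholder for the trained model; replace with the real predictor."""
    return 1 if features["pressure"] < 1.0 else 0  # hypothetical rule, illustration only

def handle_predict(body):
    """Turn a JSON request body into a (status code, JSON response) pair."""
    try:
        payload = json.loads(body)
        features = {name: float(payload[name]) for name in FEATURES}
    except (ValueError, KeyError, TypeError):
        return 400, json.dumps({"error": "expected numeric fields: " + ", ".join(FEATURES)})
    return 200, json.dumps({"unqualified": predict_quality(features)})

class PredictHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        if self.path != "/predict":
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        status, response = handle_predict(self.rfile.read(length))
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(response.encode("utf-8"))

def serve(port=8000):
    """Start the service; call this explicitly, since it blocks until interrupted."""
    HTTPServer(("", port), PredictHandler).serve_forever()
```

Because the service exchanges plain JSON over HTTP, a client written in any language can call it, which is the language independence described above. Keeping the request logic in `handle_predict` also allows it to be checked without starting a server.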

Experiment
Semiconductor manufacturing and machining are among the primary manufacturing processes. The open-source data and the real case used in this study serve as instances. With the standard data analytics framework, the data obtained from these sources were used to predict yield failure and product quality separately.

Case Data Description
We used two datasets as case studies to demonstrate how the proposed method uses data analytics for anomaly prediction. The first case aimed to predict yield failure in the semiconductor manufacturing process. This study also aims to address the requirements of a practical manufacturing factory. Hence, we selected a conventional product packaging company as the second case. The case company is a typical small-sized manufacturing enterprise in Asia; it has a standard operating procedure and conventional production facilities to package different types of scissors and other cutting tools. The company used the blister packing machine to package finished goods. Customized scissor products made the packaging process difficult to complete and could lead to faulty packaging and an incomplete seal of a packaging component (see Figure 6). Finished goods with faulty packaging are deemed unqualified products. Small-sized manufacturing enterprises may not be ready for Industry 4.0 migration and face the challenge of insufficient funds (e.g., funds for purchasing advanced equipment). Similar to other small-sized manufacturing enterprises, the case company aimed to improve the quality of its current manufacturing processes by economical means rather than by purchasing new intelligent equipment to achieve Industry 4.0. Therefore, this study improves the manufacturing process using artificial intelligence, which requires only current equipment data to enable smart manufacturing. In this case, we predicted the quality of scissor product packaging from the blister packing machine settings so that package components are sealed correctly. Two case data sources were used for data analysis to address the PdM problem in the manufacturing industry. Herein, the UCI ML repository served as one of the data sources. The prediction target is a binary outcome of the yield status (i.e., pass or fail). The values of 0 and 1 represent a pass and a fail at the test point, respectively.
The semiconductor case aims to predict yield failure in the semiconductor manufacturing process. The semiconductor dataset comprises 1567 instances with 591 features, of which 104 are anomaly instances and 1463 are normal instances. Anomaly data account for 6.6% of the total dataset, so the semiconductor dataset is biased toward the normal class. The other data source is the blister packing machine dataset. The main attributes of this dataset are related to equipment settings, including the heat-sealing time, cooling time, material of the blister bottom cover, and material of the cards. Table 2 describes both datasets. The blister packing machine data contain 17,836 instances with six attributes, of which 17,464 are anomaly instances and 372 are normal instances. The blister packing machine dataset is therefore biased toward the anomaly class (unqualified products), and normal data constitute 2.1% of the total dataset. Both manufacturing datasets faced the data imbalance problem. From the perspective of PdM, the anomaly class is usually the class of interest, regardless of whether it has the fewest or the most instances. We used the SMOTE algorithm to address this problem: it generates synthetic samples by combining existing samples until the number of minority samples equals the number of samples in the other class. After SMOTE was used to enlarge the datasets, both preprocessed datasets had the same number of anomaly and normal instances.

Target | Manufacturing Type | Description | Instances
The yield of the manufacturing process | Semiconductor manufacturing process | The data have 591 features containing methods, classifications, and time stamps for each instance. | 1567
Packing quality of the product | Blister packing machine | The data have six attributes: heating time on the left side, heating time on the right side, cooling time, pressure, the packaging card type, and the packing cover type. | 17,836

Evaluation Criteria
We used accuracy, receiver operating characteristic (ROC), the area under the curve (AUC), and recall rate as the evaluation criteria for evaluating prediction performance. Each criterion is discussed below.

Recall Rate (Sensitivity)
The recall rate is a statistical measure of classification performance (Equation (7)). It is the proportion of actual positives that are identified correctly, where $Q_{FN}$ represents the number of false negatives and $Q_{TP}$ denotes the number of true positives.
$$\text{Recall rate} = \frac{Q_{TP}}{Q_{TP} + Q_{FN}} \tag{7}$$
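As a small worked example, the sketch below computes accuracy and the recall rate from lists of actual and predicted labels. The function names and the toy labels are hypothetical; the positive class is assumed to be encoded as 1.

```python
def recall_rate(actual, predicted, positive=1):
    """Recall = true positives / (true positives + false negatives)."""
    tp = sum(1 for a, p in zip(actual, predicted) if a == positive and p == positive)
    fn = sum(1 for a, p in zip(actual, predicted) if a == positive and p != positive)
    return tp / (tp + fn) if (tp + fn) else 0.0

def accuracy(actual, predicted):
    """Accuracy = correctly classified instances / all instances."""
    return sum(1 for a, p in zip(actual, predicted) if a == p) / len(actual)

# Toy labels (hypothetical): three actual positives, two of them detected.
actual = [1, 1, 1, 0, 0]
predicted = [1, 1, 0, 0, 1]
rec = recall_rate(actual, predicted)  # 2 / (2 + 1)
acc = accuracy(actual, predicted)     # 3 / 5
```

Here two of the three actual positives are detected, giving a recall rate of 2/3, while three of the five predictions are correct, giving an accuracy of 0.6.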

Receiver Operating Characteristic
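The ROC is a probability curve characterizing the performance of an ML model. The ROC curve comprises the recall rate (sensitivity) and the false positive rate (FPR), with the recall rate plotted on the y-axis and the FPR on the x-axis. The formulas of sensitivity and the FPR are expressed as:
$$\text{Sensitivity} = \frac{Q_{TP}}{Q_{TP} + Q_{FN}}, \qquad \text{FPR} = \frac{Q_{FP}}{Q_{FP} + Q_{TN}}$$
where $Q_{FP}$ and $Q_{TN}$ denote the numbers of false positives and true negatives, respectively.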

Area under the ROC Curve
The area under the ROC curve (AUC) refers to the two-dimensional area underneath the entire ROC curve and ranges from 0 to 1 (0% to 100%). A model with 100% false predictions has an AUC of zero, whereas one with 100% correct predictions has an AUC of one. An AUC of 0.5 indicates no discrimination, whereas a value closer to one represents a near-perfect test with outstanding discrimination. Table 3 shows the AUC ranges with an aggregate measure of prediction performance.
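The construction of the ROC curve and its area can be sketched as follows. The code sweeps the decision threshold over the predicted scores, records one (FPR, recall) point per distinct score, and integrates with the trapezoidal rule. The function names and the toy scores are hypothetical, and labels are assumed to be 0 or 1 with both classes present.

```python
def roc_points(labels, scores):
    """Return (FPR, recall) points obtained by sweeping the decision threshold."""
    pos = sum(labels)
    neg = len(labels) - pos
    points = [(0.0, 0.0)]
    tp = fp = 0
    ranked = sorted(zip(scores, labels), reverse=True)
    for i, (score, label) in enumerate(ranked):
        if label == 1:
            tp += 1
        else:
            fp += 1
        # Emit a point only after the last sample of a group of tied scores.
        if i + 1 == len(ranked) or ranked[i + 1][0] != score:
            points.append((fp / neg, tp / pos))
    return points

def auc(points):
    """Trapezoidal area under the ROC points."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

# Toy scores (hypothetical): one negative is ranked above one positive.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
area = auc(roc_points(labels, scores))
```

In this toy case, eight of the nine positive-negative pairs are ranked correctly, so the area equals 8/9 (about 0.89), which would fall in the "excellent discrimination" band.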

AUC Range | Level of Discrimination
AUC = 0.5 | no discrimination
0.7 ≦ AUC ≦ 0.8 | acceptable discrimination
0.8 ≦ AUC ≦ 0.9 | excellent discrimination
0.9 ≦ AUC ≦ 1.0 | outstanding discrimination

Table 4 lists the predictive case studies; thus, this section is divided into two parts according to the case type. The data analysis comprises data importation, feature selection, data preprocessing, model training and evaluation, and model deployment. We used the proposed method in both the semiconductor and blister packing machine cases.

Table 4. List of case studies.
 | Semiconductor Case | Blister Packing Machine Case
Predictive Problem | The yield failure in the semiconductor manufacturing process. | The quality of scissor product packaging in the blister packing machining process.

Yield Failure Prediction in the Semiconductor Manufacturing Process
We used the preprocessed data to predict yield failure in the semiconductor manufacturing process (Table 5). The proposed method obtained an accuracy of 0.974 and a recall rate of 0.957, whereas the boosted decision tree obtained an accuracy of 0.966 and a recall rate of 0.945. The recall rate measures how many of the true anomaly instances in the dataset are predicted; the result showed that the proposed method has higher accuracy and a higher recall rate than the boosted decision tree. The proposed method outperformed both the boosted decision tree and the decision tree. Thus, ensemble learning exhibits better prediction performance than a single decision tree. Figure 7 shows the ROC curves of the proposed method and the boosted decision tree. The AUC of the proposed method is 99.5%, and that of the boosted decision tree is 99.4%. All ensemble methods achieved an AUC greater than 0.9, which indicates that the proposed method and the ELAs distinguish well between the two classes.

Quality Prediction of Scissor Product Packaging in Blister Packing Machining Process
This study used the preprocessed data to predict the quality of packaging in the blister packing machining process (Table 6). The proposed method obtained an accuracy of 0.992 and a recall rate of 0.997, whereas the boosted decision tree achieved an accuracy of 0.991 and a recall rate of 0.993. The result demonstrated that the proposed method and the boosted decision tree achieved nearly the same predictive performance. Figure 8 shows that ensemble learning is suitable for the blister packing machine case, where all ensemble methods achieved an AUC greater than 0.9. The experimental results show that the ensemble methods can outperform the traditional single-model approach in prognostic accuracy: the ELAs achieve higher accuracy than the conventional decision tree, which allows only one path to each node.

In summary, the result obtained using the proposed method was compared with those of the boosted decision tree and the decision tree. The ELAs achieve more than 95% accuracy in both the semiconductor case and the blister packing machine case, and the proposed method outperformed the single learning method in the two cases.

The trained model was deployed as a RESTful API and WS, which can be used in edge- or cloud-based analytics for predictive maintenance in manufacturing. The data analytics process is fully automated once the analytics requirement has been sent to the server side via the API. The server-side process detects the submission and automatically maps it to a calculation workflow that specifies the task type (e.g., preprocessing, ML training module) according to the API message. The server side responds with the analytics result, and equipment data can be imported to the cloud platform to improve product quality by predicting equipment settings. This study uses Python to develop the proposed PdM APIs based on the REpresentational State Transfer (REST) pattern for the web.
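The mapping from an API message to a calculation workflow can be sketched as a simple dispatch table. The Python sketch below is only an illustration under stated assumptions: the message fields ("task", "records"), the two workflow functions, and the response layout are hypothetical and do not describe the actual server-side implementation.

```python
import json


def preprocess(request):
    """Hypothetical preprocessing step: drop records that contain missing values."""
    rows = [r for r in request["records"] if None not in r.values()]
    return {"task": "preprocessing", "rows_kept": len(rows)}


def train(request):
    """Hypothetical training step: only reports how many records would be used."""
    return {"task": "training", "n_records": len(request["records"])}


# Hypothetical mapping from the task type named in the API message to a workflow step.
WORKFLOWS = {"preprocessing": preprocess, "training": train}


def handle_message(message):
    """Parse a JSON API message and route it to the matching workflow step."""
    request = json.loads(message)
    step = WORKFLOWS.get(request.get("task"))
    if step is None:
        return json.dumps({"status": "error", "detail": "unknown task type"})
    return json.dumps({"status": "ok", "result": step(request)})


# Example message with two records, one of which has a missing pressure value.
message = json.dumps({
    "task": "preprocessing",
    "records": [
        {"pressure": 1.2, "coding_time": 0.4},
        {"pressure": None, "coding_time": 0.5},
    ],
})
response = json.loads(handle_message(message))
print(response)
```

Keeping the task-to-workflow mapping in a single table means that a new task type can be supported by registering one more function, without changing the request-handling logic.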
RESTful APIs use commands to obtain resources. The state of a resource at any given timestamp is called a resource representation. RESTful APIs support different request methods: "GET," "PUT," "POST," and "DELETE." GET retrieves information; PUT transmits information to change the state of a resource or update it; POST transmits information to create a resource; and DELETE deletes information. This study manages the APIs through the Swagger platform. The PdM APIs enable users to access the trained model directly via the Hypertext Transfer Protocol (HTTP) and provide an efficient way for users to query analytics programmatically instead of relying on browser-based interfaces. In a typical API design, a user sends an API request with an HTTP media type, and the server responds with the requested information. We consistently use JSON, which is widely used in web and mobile applications, as the API response format (see Figure 9).
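The semantics of the four request methods and the JSON response format can be illustrated without a web server. The following Python sketch keeps resources in memory and returns a status code together with a JSON body for each request. The class name, the status codes chosen, and the example payload are assumptions for illustration; the sketch does not reproduce the PdM APIs or their Swagger definitions.

```python
import json


class ResourceApi:
    """In-memory stand-in for a REST resource collection (hypothetical, for illustration)."""

    def __init__(self):
        self.resources = {}
        self.next_id = 1

    def handle(self, method, resource_id=None, body=None):
        """Return (status_code, json_text) for one request."""
        if method == "POST":  # create a new resource
            resource_id = self.next_id
            self.next_id += 1
            self.resources[resource_id] = body
            return 201, json.dumps({"id": resource_id, "data": body})
        if resource_id not in self.resources:
            return 404, json.dumps({"error": "resource not found"})
        if method == "GET":  # retrieve the current representation
            return 200, json.dumps({"id": resource_id, "data": self.resources[resource_id]})
        if method == "PUT":  # update the state of an existing resource
            self.resources[resource_id] = body
            return 200, json.dumps({"id": resource_id, "data": body})
        if method == "DELETE":  # remove the resource
            del self.resources[resource_id]
            return 200, json.dumps({"id": resource_id, "deleted": True})
        return 405, json.dumps({"error": "method not allowed"})


api = ResourceApi()
created = api.handle("POST", body={"pressure": 1.2})
updated = api.handle("PUT", 1, {"pressure": 1.5})
fetched = api.handle("GET", 1)
deleted = api.handle("DELETE", 1)
missing = api.handle("GET", 1)
print(created, fetched, missing, sep="\n")
```

Because every response body is JSON, a client can parse the result of any request in the same way, regardless of the method used.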

Discussion
The proposed method and the ELAs achieve high accuracy in both the semiconductor case and the blister packing machine case. The decision tree algorithm allows only one path from the root to each leaf, whereas the decision jungle algorithm allows multiple paths from the root to each leaf. Some researchers have reported that decision jungles require less memory while significantly improving generalization and that they have better computational performance than the conventional decision tree algorithm on different datasets [41,42]. Shotton's finding supports our experimental results, which demonstrated that the decision jungle has high accuracy [41]. The boosted decision tree is an ensemble-learning model that improves learning performance by combining an ensemble of weak trees into a robust classifier. The boosted decision tree has been used to forecast failures in water distribution facilities, where it demonstrated high accuracy [39]. Both bagging (decision jungle) and boosting (boosted decision tree) achieve good prediction performance in the ensemble approach; thus, the results of related studies agree with our findings [39][40][41]. In some cases, a boosted decision tree may achieve better prediction accuracy than the decision jungle. However, boosted decision trees may overfit when the data contain a large amount of noise. To address the overfitting problem, the proposed approach uses a neural network to retrain the weaker patterns in the ensemble decision trees and thereby explore other potential solution spaces.
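To make the idea of combining weak trees into a robust classifier concrete, the following Python sketch implements the standard AdaBoost procedure with one-level decision trees (stumps) on a one-dimensional toy dataset. This is a textbook illustration of boosting only: it is neither the proposed neural-network retraining step nor the boosted decision tree module used in the experiments, and the data and function names are assumptions.

```python
import math


def stump_predict(x, threshold, polarity):
    """One-level tree: predict `polarity` when x >= threshold, otherwise -polarity."""
    return polarity if x >= threshold else -polarity


def best_stump(xs, ys, weights):
    """Find the stump with the lowest weighted training error."""
    best = None
    for threshold in sorted(set(xs)):
        for polarity in (1, -1):
            error = sum(w for w, x, y in zip(weights, xs, ys)
                        if stump_predict(x, threshold, polarity) != y)
            if best is None or error < best[0]:
                best = (error, threshold, polarity)
    return best


def adaboost(xs, ys, rounds):
    """Train an ensemble of weighted stumps; labels must be +1 or -1."""
    n = len(xs)
    weights = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        error, threshold, polarity = best_stump(xs, ys, weights)
        error = min(max(error, 1e-10), 1 - 1e-10)  # guard against log(0)
        alpha = 0.5 * math.log((1 - error) / error)  # vote weight of this stump
        ensemble.append((alpha, threshold, polarity))
        # Misclassified samples gain weight, so the next stump focuses on them.
        weights = [w * math.exp(-alpha * y * stump_predict(x, threshold, polarity))
                   for w, x, y in zip(weights, xs, ys)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def predict(ensemble, x):
    """Weighted vote of all stumps in the ensemble."""
    score = sum(alpha * stump_predict(x, threshold, polarity)
                for alpha, threshold, polarity in ensemble)
    return 1 if score >= 0 else -1


# Toy data (hypothetical) that no single stump can separate.
xs = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
ys = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]

single = adaboost(xs, ys, rounds=1)
boosted = adaboost(xs, ys, rounds=3)
acc_single = sum(predict(single, x) == y for x, y in zip(xs, ys)) / len(xs)
acc_boosted = sum(predict(boosted, x) == y for x, y in zip(xs, ys)) / len(xs)
print(acc_single, acc_boosted)
```

On this toy dataset, a single stump misclassifies three of the ten samples, whereas three boosted stumps classify all training samples correctly. The reweighting step is also where noisy samples can receive large weights, which is one way to see why boosting may overfit noisy data.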

Potential of the Improved Ensemble-Learning Algorithm
The ensemble-learning module uses the ensemble framework, which outperforms the single learning framework in training models. The learning is associated with the tree construction approach and the ensemble method, both of which recent studies have advanced. Some studies have used supervised or unsupervised learning to optimize tree classifiers [38,42] and have applied the adaptive apriori technique to tree post-pruning [43]. Improved ensemble learning has recently been used in IIoT-based PdM applications. This study demonstrated a new approach that uses neural networks to improve boosted decision trees. With advancements in predictive analytics, improved ELAs can be used for fault prediction in the manufacturing field.

Edge-Based Analytics and Fog-Based Analytics
Data streams transmitted between edge-side equipment and the cloud side play an essential role in IIoT advancements [2]. This study demonstrates the use of data analytics in the manufacturing process, in which model deployment is a significant stage. The trained model can be deployed as an API: a user sends a service request with an API key, and the server side responds to the request with a prediction. The API can be used on an edge-side platform to form edge-based analytics. Edge-based analytics provides computing resources for applications with networking close to the edge side (e.g., factory floors and end users). In contrast, the fog-based framework lies between the edge and the cloud. Fog-based analytics facilitates computing operations and provides connectivity between the edge-side and cloud-based platforms. Related studies have used fog computing in the predictive maintenance problem [44]. The sustainable development of IIoT-based PdM applications is crucial for smart manufacturing; thus, configurable API use may co-create value for manufacturing maintenance. Herein, the trained model is deployed as an API that can be used flexibly in edge-, fog-, and cloud-based analytics (see Figure 10).
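From the client side, a service request with an API key can be sketched with the Python standard library as follows. The endpoint URL, the use of an "Authorization: Bearer" header to carry the key, the JSON field names, and the feature values are all assumptions made for illustration; the request is only constructed here and is not sent.

```python
import json
import urllib.request


def build_prediction_request(url, api_key, features):
    """Build (but do not send) a POST request carrying an API key and JSON features."""
    body = json.dumps({"features": features}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={
            "Content-Type": "application/json",
            # Assumption: the key is passed as a bearer token; a real service may differ.
            "Authorization": "Bearer " + api_key,
        },
        method="POST",
    )


request = build_prediction_request(
    "https://example.com/api/predict",  # placeholder endpoint, not a real service
    "YOUR_API_KEY",                     # placeholder key
    {"heating_time_left": 1.8, "heating_time_right": 1.7, "pressure": 0.6},
)
print(request.get_method(), request.full_url)

# An edge-, fog-, or cloud-side client would then send the request, for example:
# with urllib.request.urlopen(request) as response:
#     prediction = json.loads(response.read())
```

Because only the endpoint URL changes between deployments, the same client code could address a model hosted on the edge, fog, or cloud side.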

Conclusions
In this study, we surveyed IIoT- and PdM-related research. Based on the literature, we presented results from 986 published academic papers associated with manufacturing PdM and related studies. We used descriptive statistics for PoP to analyze studies from 2015 to 2020 obtained from Google Scholar. The findings demonstrated that PdM-related studies have many citations and that IIoT-based PdM applications are widely used in complex production lines. In addition, most PdM-related studies focused on product quality, facility, device, and manufacturing maintenance. The ensemble-learning concept has been proposed recently, and ensemble-learning methods have been used in diverse manufacturing fields. The results demonstrated that ELAs provide better accuracy and coverage than a single decision tree, indicating better predictive performance. To improve ELAs, we developed a new approach that enforces the training of weaker instances and explores other solution spaces using a neural network algorithm. According to the experimental results, the proposed method has an average accuracy of 98% in the two cases. The analysis can assist managers and machine operators in setting equipment manufacturing parameters to achieve higher product quality. In the semiconductor case, we predicted yield failure in the manufacturing process. In the blister packing machine case, we predicted packaging quality failures and supported the predictive maintenance strategy. We used case studies from different manufacturing industries to demonstrate that ensemble learning provides better predictive analytics for management and helps avoid manufacturing process anomalies.

Limitation of the Study
This study is limited by the unavailability of industrial datasets. The proposed case is a prediction problem for the manufacturing process. However, predictive maintenance analytics in the manufacturing industry is associated with facilities, devices, products, and manufacturing processes. More detailed manufacturing data are used in predictive analytics for various use scenarios, such as equipment and merchandise. For example, predictive maintenance is used in high-value asset maintenance. A model trained on equipment data to identify equipment faults can also make an accurate prognosis to forecast component failure. The more manufacturing scenarios considered in a production line, the more data can be acquired in a factory.

Future Study
For a smart factory, data analytics is not the only critical problem; data integration and real-time predictive analytics are also significant for PdM. In the future, we hope to develop an IIoT-based platform that can be deployed on the edge and fog sides of a manufacturing factory. Supervisory control, data acquisition, and real-time communication technology can be integrated with the trained model to enable efficient predictive maintenance. In addition, users can leverage a graphical user interface to monitor manufacturing data and make efficient decisions based on predicted outcomes.
In summary, this study uses the proposed method to develop a predictive maintenance analytics model, which can be deployed as an API and a WS. Few studies have investigated bagging and boosting ELAs simultaneously to address predictive maintenance in the manufacturing industry. In this study, we developed a new approach and investigated the decision jungle and boosted decision tree methods. The decision jungle is based on the bagging ensemble-learning method, whereas the boosted decision tree module uses the boosting-based ensemble-learning method. We successfully used the proposed method and the ensemble-learning module in the semiconductor manufacturing process and the blister packing machine process. The ensemble-learning module is a new type of ML method that can improve ML results by combining several training models instead of using a single training model. Our proposed algorithm aims to improve ensemble-learning performance by retraining the weaker patterns of an ensemble model. The experimental results showed that the proposed method outperformed the two types of ELAs, which achieved 97% accuracy.

Some small- and medium-sized manufacturing firms may not be ready to migrate to IIoT-based PdM. Our proposed research framework may assist emerging companies and small- and medium-sized manufacturing firms in adopting smart manufacturing using data analytics technologies and digital decision making. The use of data and APIs/WSs is a cost-efficient strategy compared with eliminating old equipment and purchasing advanced equipment. Such a framework is similar to the principle of Industry 3.5, which combines the practice of existing manufacturing for Industry 3.0 with the future Industry 4.0 for smart manufacturing. Industry 3.5 technologies aim to develop an economic strategy using data, sensing elements, and networks. The small-scale manufacturing field can achieve the goal of smart manufacturing by keeping old equipment and adding nonintrusive elements to obtain and analyze data. Predictive analytics is an efficient way to add value to the manufacturing process, and it can be improved by adopting the principles of Industry 3.5 and 4.0.

Conflicts of Interest:
The authors declare no conflict of interest.