A Selectively Fuzzified Back Propagation Network Approach for Precisely Estimating the Cycle Time Range in Wafer Fabrication

Abstract: Forecasting the cycle time of each job is a critical task for a factory. However, recent studies have shown that it is a challenging task, even with state-of-the-art deep learning techniques. To address this challenge, a selectively fuzzified back propagation network (SFBPN) approach is proposed to estimate the range of a cycle time, the results of which provide valuable information for many managerial purposes. The SFBPN approach is distinct from existing methods because the thresholds on both the hidden and output layers of a back propagation network are fuzzified to tighten the range of a cycle time, while most of the existing methods only fuzzify the threshold on the output node. In addition, a random search and local optimization algorithm is proposed to derive the optimal values of the fuzzy thresholds. The proposed methodology is applied to a real case from the literature. The experimental results show that the proposed methodology improved the forecasting precision by up to 65%.


Introduction
This study aims to estimate the cycle time range of a job in a factory. The cycle time (or manufacturing lead time) of a job is the time it takes for the job to pass through the factory [1]. Various methods have been proposed to predict the cycle time of a job in a factory [2][3][4]. However, even when advanced computing techniques such as big data analysis or deep learning are applied, the prediction accuracy is still not good enough [3][4][5][6]. Estimating the range of the cycle time instead is therefore a meaningful alternative.
Estimating the cycle time range is an important issue because it provides valuable information for various managerial activities. For example, in internal due date assignment [1,5], an internal due date must be later than the upper bound of the cycle time in order to ensure timely delivery [6,7]. However, it is also a challenging task because the cycle time of a job is subject to many uncertainties caused by unstable human intervention, unexpected machine breakdown, poor job sequencing and scheduling, and so on [8][9][10]. As a result, the estimated cycle time range of a job may be very wide, or may not even contain the actual value, both of which represent poor estimation precision [4,11,12].
For the above reasons, the research problem of this study is how to improve the precision of estimating the cycle time range of each job in a factory. Existing methods in this field are subject to the following problems:
(1) Some existing methods establish the confidence interval of the cycle time [12][13][14][15]. However, even with advanced computing technologies such as deep learning and big data analysis, the accuracy of predicting the cycle time is still not satisfactory. As a result, the established confidence interval of the cycle time has little reference value.
(2) In some studies, the parameters of a cycle time forecasting method were fuzzified to generate a fuzzy forecast, which represents the range of the cycle time. Most of these methods fuzzified only a single parameter to simplify calculations [4,16]. However, fuzzifying more parameters can further shorten the range of the cycle time.
To solve the problems of the existing methods and to improve the precision of estimating the cycle time range of a job, a selectively fuzzified back propagation network (SFBPN) approach is proposed in this study. The contribution of the proposed methodology is to establish a systematic procedure to efficiently fuzzify multiple network parameters of a BPN, thereby further reducing the cycle time range of each job.
In the proposed methodology, a SFBPN is constructed to estimate the cycle time range of a job. In SFBPN, the thresholds on the hidden-layer and output-layer nodes are fuzzified to further tighten the cycle time range. To derive the optimal values of the fuzzy thresholds, a nonlinear programming (NLP) problem needs to be solved, which is not easy. To tackle this difficulty, a random search and local optimization algorithm is proposed.
The remainder of this paper is organized as follows. Section 2 is dedicated to the literature review. Section 3 introduces the proposed methodology, including the implementation procedure, the SFBPN architecture, and the random search and local optimization algorithm. To assess the effectiveness of the proposed methodology, it has been applied to a real case from the literature, which is described in Section 4. Section 5 presents concluding remarks and puts forth some topics for future investigation.

Literature Review
Chen and Wang [13] applied fuzzy c-means (FCM) to classify jobs in a factory, and then constructed a backpropagation network (BPN) to forecast the cycle times of the jobs for each category. The range of the cycle time was estimated by constructing the confidence interval of the cycle time. However, in theory, the confidence interval does not necessarily contain the actual value. To solve this problem, Chen and Lin [17] fuzzified the parameters of a BPN to generate a fuzzy cycle time forecast. The support of a fuzzy cycle time forecast represented the range of the cycle time. However, nonlinear programming (NLP) problems needed to be solved, which was not easy.
Hsieh et al. [18] applied response surface modelling (RSM) to evaluate the impact of emergency jobs on the cycle times of normal jobs. More emergency jobs extended the cycle times of normal jobs and widened the ranges of these cycle times. Therefore, the percentage of emergency jobs in the factory should be minimized to improve the precision of estimating the cycle time ranges of normal jobs.
Wang and Zhang [14] modified the FCM−BPN method [13] by incorporating a conditional mutual information-based feature selection mechanism to choose the inputs for the FCM−BPN method. The improvement in the forecasting accuracy also helped to establish a narrower cycle time range. However, the range was based on the confidence interval, which might not contain the actual value.
In Chen [4], the threshold on the output node of a BPN was fuzzified to estimate the range of the cycle time. Then, fuzzy intersection (FI) was applied to aggregate the cycle time ranges estimated by multiple experts. In this way, the cycle time range could be further reduced. However, the cycle time range was affected by extreme cases, namely jobs with unexpected long-or short-cycle times [19]. To overcome this problem, Chen [20] slightly adjusted each cycle time forecast before fuzzifying the threshold on the output node. In this way, the impact of extreme cases could be mitigated. However, as only one network parameter was fuzzified, there was still room for improvement.
Wang et al. [21] constructed a two-dimensional long short-term memory (LSTM) network with multiple memory units to forecast the cycle time of a job. The LSTM network was a deep recurrent neural network [22,23]. Through deep learning and considering the correlation between the network parameters, the forecasting accuracy could be improved, thereby helping to reduce the cycle time range. However, the problem of a confidence interval still existed.
Chen and Wu [16] replaced a cycle time forecast with its linear function before fuzzifying the threshold on the output node of a BPN to estimate the cycle time range, which further tightened the lower and upper bounds of the cycle time [17]. However, fuzzifying only one network parameter limited the scope of improvement.
Wang et al. [24] modified the approach proposed by Wang and Zhang [14] by incorporating an adaptive logistic regression correlation analysis-based feature selection mechanism instead. These two studies suffered from the same problem of a confidence interval [25,26].
The novelty of the proposed methodology is highlighted by comparing it with some of the existing methods, as summarized in Table 1.

Methodology
The implementation procedure of the proposed methodology comprises the following steps:
(1) Preprocess the collected data: two major tasks in this step are feature selection and data normalization.
(2) Construct a SFBPN to forecast the cycle time of a job.
(3) Train the SFBPN using an existing algorithm to derive the cores of the network parameters.
(4) Apply the random search and local optimization algorithm to derive the lower and upper bounds of the thresholds.
(5) Estimate the cycle time ranges of all of the jobs.
(6) Evaluate the forecasting precision.
A flow chart is provided in Figure 1 to illustrate the implementation procedure of the proposed methodology. Without a loss of generality, all of the fuzzy parameters and variables in the proposed methodology are given in or approximated with triangular fuzzy numbers (TFNs) [27,28].

Data Preprocessing
There are two major tasks at the stage of data preprocessing, namely, feature selection and data normalization.
First, relevant features are selected based on the way the cycle time of a job is forecasted. One way is to treat the cycle times of jobs as a time series and to apply a time series forecasting method to forecast the job cycle times [29], for which the relevant features are the cycle times of jobs that have been completed recently. The other way is to fit the relationship between the cycle time of a job and the attributes of the job. In this way, relevant features include the attributes of a job, the cycle times of jobs that have been completed recently, production conditions when a job is released into the wafer fab, etc. [30,31]. Features can also be functionalized, split, or combined using techniques such as principal component analysis and stepwise regression, before serving as inputs [24,32]. In the literature, various techniques have been applied to select relevant features, e.g., subjective elimination based on expert knowledge [24,33], backward elimination-based regression analysis [31], backward elimination-based genetic programming [34,35], conditional mutual information-based feature selection [14,21,36], adaptive logistic regression correlation analysis [24], mutual information network deconvolution feature selection [37], etc. In this study, input features were selected using backward elimination-based regression analysis.

Subsequently, the collected data were normalized into [0.1, 0.9] using the partial normalization method [4] to ensure the extrapolation ability of the SFBPN:

$$z_{jp} = N(x_{jp}) = 0.1 + 0.8 \cdot \frac{x_{jp} - \min_r x_{rp}}{\max_r x_{rp} - \min_r x_{rp}} \qquad (1)$$

where N() is the partial normalization function; j and r are both indexes of a job, 1 ≤ j, r ≤ n; p is the index of an input, 1 ≤ p ≤ P; $x_{jp}$ (or $x_{rp}$) is the p-th attribute of job j (or r); and $z_{jp}$ is the normalized value of $x_{jp}$. To convert $z_{jp}$ back to the original value:

$$x_{jp} = U(z_{jp}) = \min_r x_{rp} + \frac{z_{jp} - 0.1}{0.8} \left( \max_r x_{rp} - \min_r x_{rp} \right) \qquad (2)$$

where U() is the partial un-normalization function.
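The partial normalization step can be illustrated with a minimal Python sketch. It assumes a linear min-max mapping of each attribute onto [0.1, 0.9]; the function name `make_partial_normalizer` and the sample values are hypothetical and are not taken from the case study.

```python
def make_partial_normalizer(values, low=0.1, high=0.9):
    """Build N() and U() for one input attribute from its observed values."""
    x_min, x_max = min(values), max(values)
    span = x_max - x_min
    if span == 0:
        raise ValueError("attribute is constant; cannot normalize")
    width = high - low

    def normalize(x):
        # Maps [x_min, x_max] linearly onto [low, high].
        return low + width * (x - x_min) / span

    def unnormalize(z):
        # Inverse mapping back to the original scale.
        return x_min + (z - low) / width * span

    return normalize, unnormalize


# Hypothetical values of one attribute (e.g., fab WIP) for five jobs.
wip = [120.0, 150.0, 135.0, 180.0, 160.0]
N, U = make_partial_normalizer(wip)
z = [N(x) for x in wip]
```

Because the observed minimum and maximum map to 0.1 and 0.9 rather than 0 and 1, a future value slightly outside the observed range still maps inside (0, 1), which is the extrapolation margin the text refers to.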

Forecasting the Cycle Time of a Job Using a SFBPN
In the proposed methodology, a SFBPN is constructed to forecast the cycle time of a job in a wafer fab. The SFBPN is a special FBPN for which the parameters are selectively fuzzified. The architecture of the SFBPN is illustrated in Figure 2, which has three layers, namely: the input layer, the hidden layer, and the output layer.
In the network training phase, inputs to the SFBPN are weighted and transmitted to each node of the hidden layer, on which they are aggregated and then output as $h_{jl}$, where l is the index of a hidden-layer node (l = 1~L), $h_{jl}$ is the output from node l of the hidden layer, $\theta^h_l$ is the threshold on this node, $w^h_{pl}$ is the weight of the connection between input node p and this node, and (−) denotes fuzzy subtraction [38]. After $h_{jl}$ is passed to the output layer, the network output $o_j$ is generated, where $\theta^o$ is the threshold on the output node and $w^o_l$ is the weight of the connection between node l of the hidden layer and the output node. As $h_{jl} \geq 0$, Equation (8) can be shortened. Some theoretical properties of the SFBPN are discussed in the following.
Proof. The required proof is trivial.
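The forward computation of a network with crisp weights and fuzzy thresholds can be sketched in Python as follows. This is only an illustration under stated assumptions: a sigmoid activation is assumed on both layers (the text does not preserve the activation function), each TFN is handled corner by corner as a (lower, core, upper) tuple, and all function names and numbers are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def fuzzy_sub(a, b):
    """Fuzzy subtraction a (-) b of two TFNs given as (lower, core, upper)."""
    return (a[0] - b[2], a[1] - b[1], a[2] - b[0])


def scale(w, a):
    """Multiply TFN a by a crisp weight w, keeping the corners ordered."""
    lo, core, hi = w * a[0], w * a[1], w * a[2]
    return (lo, core, hi) if w >= 0 else (hi, core, lo)


def forward(z, w_h, theta_h, w_o, theta_o):
    """Return the fuzzy output (o1, o2, o3) for one normalized input vector z.

    w_h[l][p] and w_o[l] are crisp weights; theta_h[l] and theta_o are TFNs.
    """
    total = (0.0, 0.0, 0.0)
    for l, weights in enumerate(w_h):
        crisp_sum = sum(w * x for w, x in zip(weights, z))
        n = fuzzy_sub((crisp_sum,) * 3, theta_h[l])
        # The sigmoid is increasing, so the corner order is preserved.
        h = tuple(sigmoid(v) for v in n)
        term = scale(w_o[l], h)
        total = tuple(t + s for t, s in zip(total, term))
    n_o = fuzzy_sub(total, theta_o)
    return tuple(sigmoid(v) for v in n_o)


# Hypothetical two-input, two-hidden-node network.
z = [0.3, 0.7]
w_h = [[0.5, -0.2], [0.1, 0.4]]
theta_h = [(-0.1, 0.0, 0.1), (0.0, 0.05, 0.2)]
w_o = [0.8, -0.6]
theta_o = (-0.05, 0.0, 0.05)
o = forward(z, w_h, theta_h, w_o, theta_o)
```

The core of the fuzzy output coincides with the output of the crisp network, and the lower and upper corners move apart as the thresholds become fuzzier.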

Determining the Values of Network Parameters
The training of the SFBPN is decomposed into three tasks, i.e., determining the core, lower bound, and upper bound of each network parameter [39].
At first, the SFBPN is treated as a crisp one and trained using an existing algorithm, such as the gradient descent (GD) algorithm, the Levenberg−Marquardt (LM) algorithm, the Broyden−Fletcher−Goldfarb−Shanno (BFGS) quasi-Newton algorithm, the GD algorithm with momentum and adaptive learning rate (GDX), or the resilient backpropagation (RP) algorithm [40], to determine the cores of the network parameters (such as $w^h_{pl2}$, $\theta^h_{l2}$, $w^o_{l2}$, and $\theta^o_2$), thereby optimizing the forecasting accuracy measured in terms of the root mean squared error (RMSE) between the network outputs and the actual values, where $a_j$ is the cycle time (i.e., actual value) of job j. Let the obtained optimal solution be denoted by $\{w^{h*}_{pl2}, \theta^{h*}_{l2}, w^{o*}_{l2}, \theta^{o*}_2\}$.
Subsequently, an NLP problem is solved to derive the lower and upper bounds of the network parameters, thereby optimizing the forecasting precision measured in terms of the average range (AR) of fuzzy cycle time forecasts. In the NLP problem, Constraints (30) to (33) define the sequence of the three corners of the corresponding TFN. The NLP problem is not easy to solve. To tackle this difficulty, Chen and Lin [17] fuzzified only the threshold on the output node and set the other network parameters to crisp values to simplify the problem. The threshold on the output node can then be optimized as follows.
Proof. Substituting Equations (17) and (18) into Equations (15) and (16) gives Equations (37) and (38). Substituting Equations (37) and (38) into Constraints (13) and (14) gives Constraints (43) and (44). Applying Theorem 1 to Constraints (43) and (44) gives Constraints (45) and (46), which hold for all jobs. To minimize $o_{j3} - o_{j1}$, $o_{j3}$ and $o_{j1}$ are to be minimized and maximized, respectively. For this purpose, according to Property 1, $\theta^o_1$ and $\theta^o_3$ should be maximized and minimized, respectively. As a result, Theorem 2 is proved.
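The reasoning in the proof can be made concrete with a short Python sketch. It assumes a sigmoid output node and a crisp hidden layer, so that the requirement that every training forecast contain its actual value reduces to one inequality per job on each threshold corner. The closed form below follows from that assumption and is not copied from the theorem; all names and numbers are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def logit(y):
    return math.log(y / (1.0 - y))


def optimal_output_threshold(I, a, theta_core):
    """Tightest fuzzy output threshold that still contains every actual value.

    I[j] is the crisp aggregated input to the output node for job j and
    a[j] is the normalized actual value, with 0 < a[j] < 1.
    """
    gaps = [i - logit(y) for i, y in zip(I, a)]
    # Upper bound o_j3 = sigmoid(I_j - theta_1) >= a_j needs theta_1 <= gap_j.
    theta_1 = min(theta_core, min(gaps))
    # Lower bound o_j1 = sigmoid(I_j - theta_3) <= a_j needs theta_3 >= gap_j.
    theta_3 = max(theta_core, max(gaps))
    return theta_1, theta_core, theta_3


# Hypothetical data for three jobs.
I = [0.2, -0.1, 0.5]
a = [0.55, 0.45, 0.60]
theta_core = 0.0
theta_1, theta_2, theta_3 = optimal_output_threshold(I, a, theta_core)
```

Here the lower corner is pushed as high as the constraints allow and the upper corner as low as they allow, which mirrors the maximization and minimization stated at the end of the proof.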

Fuzzifying Thresholds on Hidden-Layer Nodes
In the proposed methodology, the thresholds on both the hidden-layer and output-layer nodes are fuzzified to further enhance the forecasting precision measured in terms of the average range of fuzzy cycle time forecasts. The optimal values of these thresholds can be derived as follows.
To derive the optimal values of $\theta^{o*}$ and $\theta^{h*}_l$, a two-step procedure is established in this study. First, the thresholds on the hidden-layer nodes are set to certain values. Then, the optimal value of the threshold on the output node is derived according to Theorem 3. By repeating this process, the optimal values of these thresholds can be obtained. Accordingly, the following random search and local optimization algorithm is proposed (Algorithm 1):

Algorithm 1: Random search and local optimization
Step 1. Set t (time index) to 1.
Step 2. Set $AR_{\min}$ (the minimum of AR so far) to a large positive value.
Step 7. If $AR_{\min} > AR$, set $AR_{\min}$ to AR and record the values of $\theta^h_{l3}$, $\theta^h_{l1}$, $\theta^{o*}_1$, and $\theta^{o*}_3$.
Step 9. If t > T (the number of iterations), go to Step 10; otherwise, return to Step 3.
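The overall loop of Algorithm 1 might look like the following Python sketch. Several steps of the algorithm are not preserved in this text, so the sketch fills them with assumptions that are labeled here and in the comments: the hidden-layer threshold bounds are drawn as core minus or plus a uniform random number in [0, v], the activation is a sigmoid, and the local optimization step is the tightest output threshold that keeps every training value inside its range. All names and data are hypothetical.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def logit(y):
    return math.log(y / (1.0 - y))


def output_input_bounds(z, w_h, th_lo, th_hi, w_o):
    """Lower and upper bounds of the aggregated input to the output node."""
    lo = hi = 0.0
    for l, weights in enumerate(w_h):
        s = sum(w * x for w, x in zip(weights, z))
        h_lo, h_hi = sigmoid(s - th_hi[l]), sigmoid(s - th_lo[l])
        if w_o[l] >= 0:
            lo += w_o[l] * h_lo
            hi += w_o[l] * h_hi
        else:
            lo += w_o[l] * h_hi
            hi += w_o[l] * h_lo
    return lo, hi


def random_search(Z, a, w_h, th_core, w_o, theta_o_core, T=100, v=1.0, seed=0):
    rng = random.Random(seed)
    ar_min, best = float("inf"), None
    for _ in range(T):
        # Random step (assumed form): widen each hidden threshold around its core.
        th_lo = [c - rng.uniform(0.0, v) for c in th_core]
        th_hi = [c + rng.uniform(0.0, v) for c in th_core]
        bounds = [output_input_bounds(z, w_h, th_lo, th_hi, w_o) for z in Z]
        # Local step: tightest output threshold containing every actual value.
        theta_1 = min(theta_o_core, min(hi - logit(y) for (_, hi), y in zip(bounds, a)))
        theta_3 = max(theta_o_core, max(lo - logit(y) for (lo, _), y in zip(bounds, a)))
        ar = sum(sigmoid(hi - theta_1) - sigmoid(lo - theta_3) for lo, hi in bounds) / len(Z)
        if ar < ar_min:  # keep the best solution found so far
            ar_min, best = ar, (th_lo, th_hi, theta_1, theta_3)
    return ar_min, best


# Hypothetical normalized training data and crisp (core) parameters.
Z = [[0.2, 0.7], [0.5, 0.4], [0.8, 0.3], [0.6, 0.9]]
a = [0.35, 0.50, 0.62, 0.71]
w_h = [[1.2, -0.7], [-0.4, 0.9]]
th_core = [0.1, -0.2]
w_o = [1.5, -1.1]
theta_o_core = 0.2
ar_min, (th_lo, th_hi, theta_1, theta_3) = random_search(Z, a, w_h, th_core, w_o, theta_o_core)
```

The average range here is in normalized units; it would be un-normalized before being reported in hours. As with any random search, the result depends on the seed and the number of iterations and is not guaranteed to be globally optimal.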

Background
With the advancement of wafer fabrication technologies, more and more advanced semiconductor devices (such as 3D NAND and finFETs) have emerged; however, the required cycle times have also become longer. For example, a 5 nm semiconductor device may have up to 100 mask layers, and each layer takes 0.8 to 1.5 days. To cope with this, wafer fabrication factories (wafer fabs) usually require faster equipment with patterning tools [41]. However, it is well known that a longer cycle time is associated with higher variation [42]. Therefore, forecasting the cycle time of a job becomes even more difficult.
The case of a wafer fab for making dynamic random access memory (DRAM) products is adopted to illustrate the proposed methodology. This case has been investigated by Chen [4]. In this case, the data of 120 jobs in the wafer fab have been collected. After a backward regression analysis, six factors were considered to be most influential to the cycle time of a job, as defined in Table 2. Therefore, P = 6.

Table 2. Factors influential to the cycle time of a job.

| Variable | Definition |
|---|---|
| $x_{j1}$ | job size (pieces) |
| $x_{j2}$ | fab work-in-process (WIP; jobs) |
| $x_{j3}$ | queue length before the bottleneck (jobs) |
| $x_{j4}$ | queue length on the processing route (jobs) |
| $x_{j5}$ | average waiting time of recently completed jobs (h) |
| $x_{j6}$ | fab utilization |

The average and standard deviation of the cycle times were 1229 and 208 h, respectively, showing that the collected cycle times were highly uncertain. The data of the first 80 jobs were used to build/train the model, whereas the remaining data were reserved for testing/evaluation.

Application of the Proposed Methodology
MATLAB was applied to implement the proposed methodology on a PC with an i7-7700 CPU of 3.6 GHz and 8 GB of RAM. First, to derive the cores of the network parameters, SFBPN was treated as a crisp one and was trained using the LM algorithm. To optimize the forecasting accuracy in terms of RMSE, various numbers of nodes in the hidden layer were tried. The results are summarized in Figure 3. With more than eight nodes in the hidden layer, the RMSE for fitting the training data could be reduced to a sufficiently low level (i.e., <30 h). In addition, too many nodes in the hidden layer might increase the possibility of overfitting. Therefore, the number of hidden-layer nodes was set to eight.

Subsequently, the random search and local optimization algorithm was applied to estimate the range of the cycle time by fuzzifying the thresholds on the hidden and output layers. The number of iterations (T) was set to 100. The ranges of the random numbers $\xi_l$ and $\zeta_l$ were set to [0, 1], i.e., v = 1. The estimation results are shown in Figure 5. For the training data, the estimated range of the cycle time contained the actual value. However, such a property might not hold for the test data.

Mathematics 2021, 9, 1430

Figure 5. Estimated ranges of cycle times: (a) training data; (b) test data.

The forecasting precision was evaluated in terms of the average range (AR), hit rate (HR), and the cost for inclusion (CFI). The results are summarized in Table 3.
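The three precision measures can be computed with a few lines of Python. The defining equations are not preserved in this text, so the sketch rests on assumptions: AR is taken as the mean width of the estimated ranges, HR as the fraction of jobs whose actual cycle time lies inside its range, and CFI as AR divided by HR. The last assumption is approximately consistent with the reported test figures of AR = 301 h, HR = 43%, and CFI = 709 h if the hit rate is 17 of 40 test jobs. The function name and sample data are hypothetical.

```python
def precision_metrics(lower, upper, actual):
    """Return (AR, HR, CFI) for lists of range bounds and actual values."""
    n = len(actual)
    # Average range: mean width of the estimated cycle time ranges.
    ar = sum(u - l for l, u in zip(lower, upper)) / n
    # Hit rate: share of jobs whose actual value falls inside its range.
    hr = sum(1 for l, u, a in zip(lower, upper, actual) if l <= a <= u) / n
    # Cost for inclusion (assumed definition): range paid per unit of hit rate.
    cfi = ar / hr if hr > 0 else float("inf")
    return ar, hr, cfi


# Hypothetical ranges and actual cycle times (h) for four jobs.
lower = [1000.0, 1100.0, 1200.0, 900.0]
upper = [1300.0, 1400.0, 1500.0, 1200.0]
actual = [1250.0, 1450.0, 1300.0, 1000.0]
ar, hr, cfi = precision_metrics(lower, upper, actual)
```

Under this definition, CFI equals AR when every actual value is contained and grows as the hit rate falls, so a narrow range that misses many jobs is penalized.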

Comparison with Existing Methods
Several existing methods have also been applied to this case for comparison. The first existing method is a traditional statistical analysis technique, the 6σ confidence interval method [26], in which three times the standard deviation was added to and subtracted from the core in order to determine the upper and lower bounds of the cycle time. However, in theory, the probability that a ±3σ confidence interval contains an actual value is only 99.7% under the residual normality assumption. The standard deviation, σ, using the FBPN was derived as in [1]. In this experiment, σ was 289 h. The forecasting precision using the 6σ confidence interval method is shown in Table 4.
The second existing method is the fuzzy linear regression (FLR)-quadratic programming (QP) method proposed by Donoso et al. [43], in which the relationship between the cycle time of a job and its attributes is fitted with an FLR equation (Equation (69)). To derive the values of the fuzzy parameters in Equation (69), a QP problem was solved, in which $\omega_1$ and $\omega_2$ are the weights, $\omega_1, \omega_2 \in [0, 1]$, $\omega_1 + \omega_2 = 1$, and s is the satisfaction level, $s \in [0, 1]$. In this study, these parameters were set as $\omega_1 = 0.3$, $\omega_2 = 0.7$, and s = 0.25. The forecasting precision achieved by applying the FLR-QP method is shown in Table 5.
The third method is the FBPN method proposed by Chen and Lin [17], in which only the threshold on the output node was fuzzified to estimate the range of the cycle time. The estimated ranges of cycle times are shown in Figure 6. The forecasting precision using Chen and Lin's FBPN method was evaluated, and the results are shown in Table 6.

| Data Part | AR (h) | HR | CFI (h) |
|---|---|---|---|
| Training | 301 | 100% | 301 |
| Test | 301 | 43% | 709 |

From the experimental results, the following discussion was made:
(1) All of the compared methods maximized the hit rate for the training data. However, the average ranges achieved using these methods differed significantly, as illustrated by Figure 7. In this regard, the proposed methodology outperformed the existing methods by establishing the narrowest range for the cycle time.
(2) It remained to be verified whether the advantage of the SFBPN approach over the existing methods was statistically significant. To investigate this, the following hypotheses were tested:
Hypothesis 1 (H1). The estimation precision using the SFBPN approach in terms of the average range is the same as that using the existing method.
Hypothesis 2 (H2). The estimation precision using the SFBPN approach in terms of the average range is better than that using the existing method.
Table 7 presents a summary of the paired t test results. The estimation precision using the SFBPN approach was significantly improved (α = 0.05) when compared with the existing methods.
(3) For the test data, none of these methods optimized the hit rate and the average range simultaneously. The hit rate was usually enhanced at the expense of wider ranges of fuzzy cycle time forecasts. For this reason, CFI might be a better measure of forecasting precision. In this regard, the proposed methodology surpassed the existing methods by reducing the CFI by up to 65%, as shown in Figure 8.
(4) In the random search and local optimization algorithm, it is of interest to know whether the ranges of the random numbers affect the forecasting performance of the proposed methodology. To investigate this issue, various ranges of $\xi_l$ and $\zeta_l$ were tried so as to observe changes in the forecasting precision. The results are summarized in Figure 9. With a wider range, it became more difficult to find the optimal solution, which led to poorer forecasting precision.
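The paired t test used in discussion item (2) can be sketched in Python with the standard library only. The sketch computes the test statistic and the degrees of freedom from per-job ranges produced by two methods; comparing the statistic with a critical value at α = 0.05 is left to a statistical table or package. The function name and the sample ranges are hypothetical and are not the values behind Table 7.

```python
import math
import statistics


def paired_t_statistic(ranges_a, ranges_b):
    """Return (t, degrees of freedom) for paired samples of equal length."""
    diffs = [x - y for x, y in zip(ranges_a, ranges_b)]
    n = len(diffs)
    # Standard error of the mean difference, using the sample standard deviation.
    std_err = statistics.stdev(diffs) / math.sqrt(n)
    return statistics.mean(diffs) / std_err, n - 1


# Hypothetical per-job ranges (h) from the proposed method and an existing one.
proposed = [250.0, 260.0, 240.0, 255.0, 245.0]
existing = [300.0, 310.0, 295.0, 305.0, 290.0]
t_stat, dof = paired_t_statistic(proposed, existing)
```

A strongly negative statistic indicates that the first method yields narrower ranges on the same jobs, which is the direction of the alternative hypothesis stated above.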

Conclusions and Future Research Directions
Forecasting the cycle time of a job is a critical task for a factory. However, owing to the uncertainty of the cycle time, it is a challenging task, even with state-of-the-art deep learning techniques. To address this challenge, estimating the range of the cycle time is a viable treatment. In this study, a SFBPN approach is proposed. In the proposed methodology, thresholds on both the hidden and output layers of a BPN are fuzzified to estimate the range of the cycle time, while in the existing methods, only the threshold on the output node is fuzzified. In this way, the SFBPN approach can further tighten the range of the cycle time. To derive the optimal values of the fuzzy thresholds, a random search and local optimization algorithm is also proposed.
A real case from the literature is adopted to assess the effectiveness of the proposed methodology and to compare it with several existing methods. According to the experimental results, the following conclusions were drawn:
(1) For training data (i.e., learned examples), all of the compared methods were able to include the actual values in the corresponding fuzzy cycle time forecasts or cycle time confidence intervals. However, only the proposed methodology could minimize the average range of the fuzzy cycle time forecasts.
(2) For the test data (i.e., unlearned examples), CFI was a better measure of forecasting precision. In this regard, the advantage of the proposed methodology over existing methods was up to 65%.
However, the proposed methodology is subject to the following limitations: (1) Although the random search and local optimization algorithm is likely to find a promising solution within a short time, it cannot guarantee the global optimality of the solution.
(2) The case used to illustrate the proposed methodology is relatively small. A larger case needs to be analyzed to further elaborate the effectiveness of the proposed methodology.
In future studies, connection weights in the SFBPN can also be fuzzified to further enhance the forecasting precision; however, this is computationally intensive and requires an efficient algorithm [44][45][46]. As an alternative, fuzzifying some thresholds and some connection weights may be more tractable. In addition, the SFBPN approach is a general methodology that can be applied to estimate the range of uncertain data in other fields. Furthermore, other types of fuzzy numbers [47] can also be adopted to represent fuzzy thresholds. Hybridizing various methods to achieve total synergy [48] is another direction worth investigating.
