Adaptive Cuckoo Search-Extreme Learning Machine Based Prognosis for Electric Scooter System Under Intermittent Fault

In this paper, an adaptive Cuckoo search extreme learning machine (ACS-ELM)-based prognosis method is developed for an electric scooter system with intermittent faults. Firstly, bond-graph-based fault detection and isolation is carried out to find possible faulty components in the electric scooter system. Secondly, submodels are decomposed from the global model using structural model decomposition, followed by adaptive Cuckoo search (ACS)-based distributed fault estimation with less computational burden. Then, as the intermittent fault gradually deteriorates in magnitude, and possesses the characteristics of discontinuity and stochasticity, a set of fault features that can describe the intermittent fault’s evolutionary trend are captured with the aid of tumbling window. With the obtained dataset, which represents the fault features, the ACS-ELM is developed to model the intermittent fault degradation trend and predict the remaining useful life of the intermittently faulty component when the physical degradation model is unavailable. In the ACS-ELM, the ACS is employed to optimize the input weights and hidden layer biases of an extreme learning machine, to improve the algorithm performance. Finally, the proposed methodologies are validated by a series of simulation and experiment results based on the electric scooter system.


Introduction
Mechatronic systems, which involve the synergistic integration of mechanical and electrical structures, are essential parts of modern industrial systems [1][2][3][4][5]. Recently, with the increasing requirements for the reliability of mechatronic systems in industrial applications, fault diagnosis and prognosis, as an important technique to ensure the operational safety and stability of systems, has been a popular topic for researchers and practitioners [6,7].
In recent decades, research into fault diagnosis for mechatronic systems has had many achievements [6][7][8][9][10][11][12][13][14]. Fault diagnosis can generally be classified into data-driven methods and model-based methods. Data-driven diagnosis methods do not need accurate physical models of systems; instead, feature data under normal and faulty conditions, extracted from sensor measurements, are needed to implement the fault diagnosis procedure. However, in some cases, the application of data-driven methods may be limited, as it is difficult to obtain feature data under faulty conditions. Compared with data-driven methods, model-based methods can achieve better diagnostic accuracy due to the employment of physical models. However, system modeling is usually not a trivial task. Fortunately, among the various system modeling methods, the bond graph (BG) is an efficient graphical modeling tool that can model complex systems with multiple energy domains. The BG technique has been widely applied for fault diagnosis in mechatronic systems, and many significant results have been obtained [13,14]. In [13], the BG and analytical redundancy relations (ARRs)-based fault diagnosis method for continuous systems is extended to hybrid systems (including continuous dynamics and discrete modes) by introducing the concepts of hybrid BG and global ARRs, where both discrete faults and continuous faults can be detected and isolated. It is noteworthy that the aforementioned model-based fault diagnosis methods are developed based on a centralized architecture or global system model, which may lead to a heavy computational burden in centralized fault estimation as the system scale increases. To address this problem, the structural-model-decomposition-based distributed fault estimation method is developed in [14,15], where a set of local submodels suitable for estimation is decomposed from the global model.
The computationally independent local estimation is formed based on these local submodels, resulting in a scalable distributed estimation approach that allows for the local sub-problems to be solved in parallel, thus decreasing the computational burden.
Differing from the relatively mature fault diagnosis technology for mechatronic systems, the research on prognosis is still in the development stage. Some relevant works can be found in [15][16][17][18][19][20][21][22]. The prognosis methods can be divided into two strategies, i.e., model-based methods and data-driven methods. The model-based methods typically attempt to construct mathematical models to describe the degradation process of faulty components. In [16], an improved Wiener degradation process is proposed for the prognosis of incipient faults in a hybrid mechatronic system. In [18], an adaptive hybrid differential evolution algorithm is used to identify the degradation behavior of incipient faults, by which the remaining useful life (RUL) of faulty components can be predicted. However, in real systems, it is difficult or even impossible to accurately establish physical degradation models for faulty components, which limits the applications of model-based prognostic methods. Unlike the model-based methods, data-driven methods do not need an accurate mathematical model of the monitored object. Instead, they mine the information hidden in the collected historical data of the system, which is more practical for prognosis. At present, neural networks, which can predict the future evolutionary trend according to historical degradation data when the physical degradation model of the faulty component is unavailable, have gradually become popular methods in the data-driven prognosis field [21,22]. Among the various neural networks, the extreme learning machine (ELM) possesses the merits of good generalization and a fast learning ability [23,24]. Therefore, ELM has been used to solve many prognosis problems [25][26][27]. For example, in [26], an enhanced multi-sensor prognostic model based on a Kalman filter-online sequential ELM and a logistic regression model is designed for the RUL prediction of an aircraft engine.
It is notable that the aforementioned works mainly focus on the prognosis of permanent faults, while intermittent faults, which are also common in mechatronic systems, are not discussed. Unlike permanent faults, intermittent faults possess discontinuity and stochasticity. If an effective prognosis approach is not implemented promptly for predictive maintenance in the early stage of intermittent faults, intermittent faults may evolve into permanent faults. Recently, a method to address the prognosis of the electric scooter system with intermittent faults was introduced in [28]. However, this work only solves the problem of RUL prediction under the assumption of a monotonic degradation of intermittent fault magnitude, and does not consider the stochasticity of intermittent faults. Moreover, the RUL prediction research for intermittently faulty components in [28] does not consider the fact that the degradation model of the faulty component is usually unknown in real applications.
Based on the above discussions, the prognosis of intermittent faults is still a challenging issue. Specifically, there are two major problems to be solved. Firstly, considering the discontinuity of intermittent faults and the randomness of fault appearance and disappearance, the construction of intermittent fault features based on the fault estimation results is an essential problem. Secondly, the degradation models of intermittently faulty components are usually unknown in practical applications. Thus, without the exact degradation model, predicting the RUL of the intermittently faulty component based on established intermittent fault degradation features is challenging.
An electric scooter is an essential mode of transportation for people with mobility difficulties. Note that various electrical and mechanical components in the electric scooter may suffer from intermittent faults due to aging and frequent usage. It is easy to neglect the influence of intermittent faults on the system's normal operation at the early stage. If an effective diagnosis and prognosis scheme is not designed in advance for intermittent faults in the electric scooter, the continuous degradation of intermittently faulty components due to frequent usage may cause intermittent faults to eventually evolve into permanent faults, which will lead to system failure and disastrous consequences. Therefore, it is necessary to develop a prognosis method for an electric scooter with intermittent faults, and an adaptive Cuckoo search extreme learning machine (ACS-ELM)-based prognosis method is proposed in this paper. The main contributions of this work are twofold:
(1) An integrated condition-monitoring framework combining distributed model-based diagnosis and data-driven prognosis (which contains the merits of both methods) is developed. On the one hand, the BG-based structural model decomposition is used to build submodels from the global model, based on which the distributed intermittent fault estimation can be implemented with less computational burden. On the other hand, considering the fact that the physical degradation models are usually unknown in practice, the data-driven prognosis method is developed to predict the RULs of intermittently faulty components.
(2) As intermittent faults gradually deteriorate, and possess discontinuity and stochasticity, the intermittent fault features are captured with the aid of a tumbling window (TW). Then, the ACS-ELM is proposed to model the evolutionary trend of the intermittent fault features and to predict the RUL of the intermittently faulty component, where ACS-ELM is developed by introducing adaptive Cuckoo search (ACS) into the ELM to optimize input weights and hidden layer biases.
This paper is organized as follows. Section 2 presents the FDI framework under intermittent fault for an electric scooter based on a diagnostic bond graph (DBG) model. Section 3 discusses the distributed intermittent fault estimation based on structural model decomposition. Section 4 proposes the prognosis method for intermittently faulty components using ACS-ELM. Section 5 analyzes the simulation and presents the experimental results. Finally, Section 6 concludes this paper.

DBG Model of Electric Scooter
The structure diagram of the electric scooter is given in Figure 1, based on which its DBG model can be built, as shown in Figure 2. The DBG model of the electric scooter contains five parts, i.e., DC motor driver, DC motor, rear wheels, body, and front wheels. Descriptions of the main parameters of the model are summarized in Table 1. In the DBG model of the electric scooter system, the mechanical friction $R_m$ of the DC motor consists of a viscous friction coefficient $R_{mv}$ and Coulomb friction torque $R_{mc}$. Similarly, the rear wheel friction $R_r$ contains a viscous friction coefficient $R_{rv}$ and Coulomb friction torque $R_{rc}$. The front wheel friction $R_f$ also contains a viscous friction coefficient $R_{fv}$ and Coulomb friction torque $R_{fc}$. Additionally, there are three flow sensors in the BG model: $Df_1\!:\theta_r$ and $Df_3\!:\theta_f$ are used to measure the angular velocities of the rear wheels and front wheels, respectively, and $Df_2\!:\dot{s}$ is adopted to measure the linear velocity of the electric scooter body.

FDI Method
Based on the DBG model in Figure 2, three modified analytical redundancy relations (MARRs), (1)-(3), can be derived. Differing from traditional ARRs, MARRs are derived by introducing efficiency factors to model the multiplicative faults of non-parametric components (including actuators and sensors).
In (1)-(3), $\beta_{U_{in}}$, $\beta_{\theta_r}$, $\beta_s$, and $\beta_{\theta_f}$ denote the efficiency factors of the non-parametric components (i.e., $U_{in}$, $\theta_r$, $s$, and $\theta_f$). If the residuals (numerical evaluations of MARRs) exceed the preset thresholds, the intermittent faults can be detected. The fault detection results can be represented by a binary coherence vector $CV = [cv_1\ cv_2\ cv_3]$, $cv_i \in \{0, 1\}$, $i = 1, \ldots, 3$, which indicates the consistency of residuals (0 for consistent and 1 for inconsistent). To investigate the fault isolability, the fault signature matrix (FSM), which represents the cause-effect relationships between component faults and residuals, is given in Table 2. If a nonzero CV is obtained from the fault detection process, the fault isolation procedure can be implemented by comparing the CV with the FSM. Then, a set of possible faults (SPF) can be determined.
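The detection and isolation steps above can be sketched in a few lines. This is a minimal sketch under stated assumptions, not the paper's implementation: Table 2 is not reproduced in this text, so the `FSM` entries below are hypothetical signatures chosen only so that the two cases discussed in Section 3.2 are reproduced, and the isolation rule (a candidate is kept when its signature is covered by the observed CV) is an assumed multiple-fault matching rule. The function names are illustrative.

```python
import numpy as np

# Hypothetical fault signature matrix (Table 2 is not available here).
# Each tuple lists the sensitivity of MARR1..MARR3 to a fault in that component.
FSM = {
    "beta_Uin": (1, 0, 0),
    "R_rv": (1, 0, 0),
    "beta_theta_r": (1, 1, 0),
    "R_fv": (0, 0, 1),
    "R_fc": (0, 0, 1),
}

def coherence_vector(residuals, thresholds):
    """cv_i = 1 if |residual_i| exceeds its threshold at any sample, else 0."""
    residuals = np.atleast_2d(np.asarray(residuals, dtype=float))  # (samples, 3)
    exceeded = np.abs(residuals) > np.asarray(thresholds, dtype=float)
    return tuple(int(flag) for flag in exceeded.any(axis=0))

def possible_faults(cv, fsm=FSM):
    """Assumed rule: keep candidates whose signature is covered by a nonzero CV."""
    if not any(cv):
        return []
    return [name for name, sig in fsm.items()
            if all(c >= s for c, s in zip(cv, sig))]
```

With these assumed signatures, `possible_faults((1, 0, 1))` returns the four candidates of Case I and `possible_faults((1, 1, 0))` the three candidates of Case II.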

Parameterization of Intermittently Faulty Component
The intermittent fault estimation aims to identify the intermittent fault magnitudes, together with the appearing and disappearing instants, for the possible faulty components in the SPF. Thus, the value change in $\theta$ ($\theta$ represents a parameter or efficiency factor in Table 2) under intermittent faults in the time interval $t \in [t_s, t_e]$ can be described by the parameterized function (4), where $\varepsilon(\ast)$ is the unit step function and $F_{nom,\theta}$ is the nominal value of $\theta$. Thus, $\theta(\ast)$ is the parameterized function of the faulty component with three sets of variables (i.e., $F_\theta$, $\lambda_\theta$, and $\mu_\theta$) to be identified. Based on (4), the value changes in all possible faulty components in the SPF can be described by parameterization functions.
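To illustrate what such a parameterization looks like, the sketch below builds a piecewise-constant profile from unit steps. The exact form of (4) is not available in this text, so the expression in the docstring is an assumption, as is the reading of $\lambda_\theta$ and $\mu_\theta$ as the appearing and disappearing instants; `theta_profile` is a hypothetical name.

```python
import numpy as np

def theta_profile(t, f_nom, magnitudes, appear, disappear):
    """Value of a component under intermittent faults (assumed form).

    theta(t) = F_nom + sum_m (F_m - F_nom) * [eps(t - lam_m) - eps(t - mu_m)],
    where eps is the unit step and lam_m / mu_m are the assumed appearing /
    disappearing instants of the m-th fault occurrence.
    """
    t = np.asarray(t, dtype=float)
    theta = np.full(t.shape, float(f_nom))
    for f_m, lam_m, mu_m in zip(magnitudes, appear, disappear):
        # Equals 1 on [lam_m, mu_m) and 0 elsewhere.
        active = np.heaviside(t - lam_m, 1.0) - np.heaviside(t - mu_m, 1.0)
        theta += (f_m - f_nom) * active
    return theta
```

For example, an efficiency factor with nominal value 1 that drops to 0.9 on [1, 2) and to 0.8 on [4, 5) is obtained with `theta_profile(t, 1.0, [0.9, 0.8], [1, 4], [2, 5])`.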

Construction of Submodels by Structural Model Decomposition
Since a large number of unknown variables need to be identified under the multiple intermittent faults condition (i.e., one has to identify the fault magnitude vector, fault-appearing instant vector, and fault-disappearing instant vector for each fault candidate in the SPF), there is a heavy computational burden if the centralized fault estimation method, based on the global model, is used. Therefore, a distributed fault estimation technique is adopted to achieve better computational efficiency. The distributed fault estimation is accomplished based on the submodels that are decomposed from the global model using structural model decomposition [14,15]. For illustration, the global model and the submodel can be defined as follows.
Definition I: (Global model) A global model is represented by G = (U, Y, Θ), where U and Y are the sets of inputs and outputs of the global model, respectively, Θ is the set of parameters and efficiency factors.
Definition II: (Submodel) The $i$th submodel is represented by $S_i = (U_i, Y_i, \Theta_i)$, where $U_i$ and $Y_i$ are the sets of local inputs and local outputs of the $i$th submodel, respectively, and $\Theta_i \subset \Theta$ is the set of parameters and efficiency factors in the $i$th submodel.
Theoretically, the number of submodels is determined by the number of sensors in the global model. Therefore, three submodels $S_1$, $S_2$, and $S_3$, as shown in Figure 3, are built from the global DBG model of the electric scooter system. Based on Figure 3, the submodel MARRs can be derived as (5)-(7).
In Figure 3 and (5)-(7), a variable marked with $U_i$ or $Y_i$ is treated as a local input or a local output of the submodel $S_i$, $i = 1, \ldots, 3$, respectively. Note that an output (i.e., sensor measurement) in the global model may be treated as a local input or a local output in different submodels. However, for a faulty sensor, regardless of the role it plays in the submodel (i.e., local input or local output), the efficiency factor should remain the same to ensure consistent detection results from different submodels.
In the electric scooter system, multiple intermittent faults are considered. Two typical cases are discussed in detail as follows.
Case I: Intermittent faults occur in $\beta_{U_{in}}$ and $R_{fc}$. $CV = [1\ 0\ 1]$ and $SPF = \{\beta_{U_{in}}, R_{rv}, R_{fv}, R_{fc}\}$ can be obtained by implementing the FDI procedure. Based on the SPF, $\beta_{U_{in}}$ and $R_{rv}$ are located in submodel $S_1$, while $R_{fv}$ and $R_{fc}$ are located in submodel $S_3$. Therefore, the $S_1$-based local estimator and the $S_3$-based local estimator can be implemented in parallel to identify $\{\beta_{U_{in}}, R_{rv}\}$ and $\{R_{fv}, R_{fc}\}$, respectively.
Case II: The intermittent fault occurs in $\beta_{\theta_r}$. $CV = [1\ 1\ 0]$ and $SPF = \{\beta_{U_{in}}, R_{rv}, \beta_{\theta_r}\}$ are obtained. $\beta_{U_{in}}$ and $R_{rv}$ are located in submodel $S_1$, while $\beta_{\theta_r}$ exists in both submodels $S_1$ and $S_2$. Since the submodel $S_1$ contains all possible faulty components in the SPF, the $S_1$-based local estimator can be used to identify $\beta_{U_{in}}$, $R_{rv}$, and $\beta_{\theta_r}$.

Distributed Fault Estimation via ACS Algorithm
Since the MARR of the submodel $S_i$, denoted $MARR_{S_i}$, is a function of $\theta$, and $\theta$ can be represented as a function of the unknown variables to be identified based on (4), the distributed fault estimation problem for submodel $S_i$ can be treated as an optimization problem over these variables. In the corresponding fitness function, $R$ is the number of samples and $\rho$ is a small constant to avoid division by zero. The possible faults in $MARR_{S_i}$ need to be represented by (4), such that the corresponding unknown variables can be obtained by the optimization algorithm.
After the fitness function of each submodel is obtained, the submodel-based local estimators that are affected by faults can be activated. ACS is utilized for fault parameter identification in the local estimators, where ACS is developed by introducing an adaptive step-size scaling factor into the standard Cuckoo search (CS). The CS is a nature-inspired heuristic algorithm based on the brood parasitism and Levy flight (LF) foraging behaviors of cuckoos [29]. Suppose that $z_d^{l}$ and $z_d^{l+1}$ denote the positions of the cuckoo $d$ in the $l$th and the $(l+1)$th generations, respectively. Then, the LF-based position updating formulation is expressed as
$$z_d^{l+1} = z_d^{l} + \alpha \otimes \mathrm{Levy}(s, \gamma)$$
where $\alpha$ is the step-size scaling factor, $\otimes$ denotes the entry-wise multiplication, $\mathrm{Levy}(s, \gamma)$ represents the LF random search path, and the random step-size $s$ obeys the heavy-tailed Levy distribution with index $\gamma$. The Mantegna algorithm, which can achieve a symmetric Levy stable distribution, is usually an effective means of generating a random step-size that obeys the Levy distribution. Specifically, the step-size $s$ is calculated via two variables with Gaussian distribution, as follows:
$$s = \frac{u}{|v|^{1/\gamma}}, \quad u \sim N(0, \sigma_u^2), \quad v \sim N(0, \sigma_v^2)$$
where
$$\sigma_u = \left[ \frac{\Gamma(1+\gamma)\,\sin(\pi\gamma/2)}{\Gamma\big((1+\gamma)/2\big)\,\gamma\,2^{(\gamma-1)/2}} \right]^{1/\gamma}, \quad \sigma_v = 1.$$
The Levy index $\gamma = 1.5$ and the step-size scaling factor $\alpha = 1$ are the default setups for the standard CS. However, the standard CS lacks dynamical adaptability of the search step-size, which may cause difficulties in algorithm convergence and lower the estimation accuracy [30]. Therefore, the ACS is proposed to alleviate this problem, where the dynamic adaptive strategy of the step-size scaling factor based on (13) is introduced into the original CS. Using (13), the dynamic adaptive strategy is expressed by a nonlinear piece-wise function, where a larger $\alpha$ at the early searching stage helps the algorithm to quickly approach the optimal solution, while a smaller $\alpha$ at a later stage allows for fine-tuning near the optimal solution.
In (13), $\alpha_{max}$, $\alpha_{min}$, and $\alpha_l$ represent the maximum, the minimum, and the $l$th generation step-size scaling factor of the algorithm, respectively, $l$ is the current iteration number, and $l_{max}$ is the maximum number of iterations.
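The search loop described in this section can be sketched as follows. This is a simplified illustration under stated assumptions rather than the authors' algorithm: the Mantegna step follows the standard formula, but `alpha_schedule` is only an illustrative smooth decay between $\alpha_{max}$ and $\alpha_{min}$ (the piece-wise function (13) is not available in this text), abandoned nests are simply re-drawn at random, and the population size, abandonment probability `pa`, and function names are hypothetical choices.

```python
import math
import numpy as np

def mantegna_step(gamma, size, rng):
    """Levy-distributed step via Mantegna's algorithm: s = u / |v|**(1/gamma)."""
    sigma_u = (math.gamma(1 + gamma) * math.sin(math.pi * gamma / 2)
               / (math.gamma((1 + gamma) / 2) * gamma * 2 ** ((gamma - 1) / 2))) ** (1 / gamma)
    u = rng.normal(0.0, sigma_u, size)
    v = rng.normal(0.0, 1.0, size)
    return u / np.abs(v) ** (1 / gamma)

def alpha_schedule(l, l_max, alpha_max=1.0, alpha_min=0.01):
    """Illustrative decreasing step-size factor (NOT the paper's Eq. (13))."""
    return alpha_min + (alpha_max - alpha_min) * (1 - l / l_max) ** 2

def acs_minimize(fitness, lower, upper, n_nests=15, l_max=200, pa=0.25,
                 gamma=1.5, seed=0):
    """Simplified adaptive Cuckoo search that minimizes `fitness` in a box."""
    rng = np.random.default_rng(seed)
    lower, upper = np.asarray(lower, float), np.asarray(upper, float)
    nests = rng.uniform(lower, upper, (n_nests, lower.size))
    cost = np.array([fitness(z) for z in nests])
    for l in range(l_max):
        alpha = alpha_schedule(l, l_max)
        # Levy-flight move z + alpha * Levy(s, gamma), kept only if it improves.
        trial = np.clip(nests + alpha * mantegna_step(gamma, nests.shape, rng),
                        lower, upper)
        trial_cost = np.array([fitness(z) for z in trial])
        better = trial_cost < cost
        nests[better], cost[better] = trial[better], trial_cost[better]
        # Abandon a fraction pa of nests (never the best) and re-draw them.
        abandon = rng.random(n_nests) < pa
        abandon[np.argmin(cost)] = False
        nests[abandon] = rng.uniform(lower, upper, (int(abandon.sum()), lower.size))
        cost[abandon] = [fitness(z) for z in nests[abandon]]
    best = int(np.argmin(cost))
    return nests[best], float(cost[best])
```

In a local estimator, `fitness` would evaluate the submodel residual for a candidate vector of fault magnitudes and appearing/disappearing instants; here any scalar function of a vector can be passed, for example `acs_minimize(lambda z: float(np.sum(z ** 2)), [-5, -5], [5, 5])`.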

ELM Theory
When the value changes in intermittently faulty components are identified by ACS-based distributed fault estimation, the prognosis module can be activated to predict the RULs of intermittently faulty components. If the degradation model is predefined, the RUL of the faulty component can be predicted directly from the identification results. However, the actual physical degradation model is usually unknown in most industrial applications. Therefore, neural networks, which can predict future behaviors based on historical data, have gradually become an important means of implementing RUL prediction. Among the various neural networks, ELM is a kind of single hidden-layer feedforward neural network, which possesses the advantages of fewer training parameters, a faster learning speed, and a stronger generalization ability [23][24][25][26][27]. The basic principle of ELM, as applied to the prognosis of intermittently faulty components, is introduced as follows.
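Suppose that there are $N$ arbitrary distinct samples $(X_j, Y_j)$, $j = 1, \ldots, N$, where $X_j = [x_{j1}, x_{j2}, \ldots, x_{jn}]^T \in \mathbb{R}^n$ and $Y_j = [y_{j1}, y_{j2}, \ldots, y_{jm}]^T \in \mathbb{R}^m$. The single hidden-layer neural network with $L$ hidden layer nodes can be expressed as
$$\sum_{i=1}^{L} \beta_i\, g(W_i \cdot X_j + b_i) = o_j, \quad j = 1, \ldots, N \qquad (14)$$
where $g(\ast)$ is the activation function, $W_i = [w_{i1}, w_{i2}, \ldots, w_{in}]^T$ and $\beta_i = [\beta_{i1}, \beta_{i2}, \ldots, \beta_{im}]^T$ denote the input weight vector and the output weight vector of the $i$th hidden node, respectively, $b_i$ is the bias of the $i$th hidden node, $W_i \cdot X_j$ represents the inner product of $W_i$ and $X_j$, and $o_j$ is the network output for the $j$th sample. The learning goal of a single hidden-layer neural network is to minimize the output error, i.e.,
$$\sum_{j=1}^{N} \left\| o_j - Y_j \right\| = 0 \qquad (15)$$
where the input weight matrix, hidden layer bias vector, and input matrix can be expressed as $W = [W_1\ W_2\ \ldots\ W_L]^T$, $b = [b_1\ b_2\ \ldots\ b_L]^T$, and $X = [X_1\ X_2\ \ldots\ X_N]$, respectively. Thus, (15) can also be represented in matrix form, as follows:
$$H\beta = Y, \quad H = \begin{bmatrix} g(W_1 \cdot X_1 + b_1) & \cdots & g(W_L \cdot X_1 + b_L) \\ \vdots & & \vdots \\ g(W_1 \cdot X_N + b_1) & \cdots & g(W_L \cdot X_N + b_L) \end{bmatrix}_{N \times L} \qquad (16)$$
where $\beta = [\beta_1\ \beta_2\ \ldots\ \beta_L]^T$ and $Y = [Y_1\ Y_2\ \ldots\ Y_N]^T$. Since the input weight matrix $W$ and hidden layer bias vector $b$ are randomly generated, the hidden layer output matrix $H$ can be directly calculated based on (16). Therefore, the ELM aims to find the least-squares solution of the linear system $H\beta = Y$, and the output weight $\hat{\beta}$ can be calculated as
$$\hat{\beta} = H^{\dagger} Y$$
where $H^{\dagger}$ is the Moore-Penrose generalized inverse of $H$.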
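The ELM training procedure amounts to one random draw and one pseudo-inverse. The following is a minimal NumPy sketch, assuming a sigmoid activation and weights drawn uniformly from [-1, 1] (neither choice is stated in this section); the function names are illustrative.

```python
import numpy as np

def elm_hidden(X, W, b):
    """Hidden-layer output matrix H (N x L) with a sigmoid activation g."""
    return 1.0 / (1.0 + np.exp(-(X @ W.T + b)))

def elm_train(X, Y, n_hidden, rng):
    """Draw W (L x n) and b (L,) at random, then solve H beta = Y by pinv."""
    W = rng.uniform(-1.0, 1.0, (n_hidden, X.shape[1]))
    b = rng.uniform(-1.0, 1.0, n_hidden)
    beta = np.linalg.pinv(elm_hidden(X, W, b)) @ Y  # Moore-Penrose solution
    return W, b, beta

def elm_predict(X, W, b, beta):
    """Network output for the rows of X."""
    return elm_hidden(X, W, b) @ beta
```

Here `X` holds one sample per row (N x n) and `Y` is N x m. Only `beta` is learned, which is why training is fast, and also why the quality of the random `W` and `b` matters for the final accuracy.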

RUL Prediction for Intermittently Faulty Components Using ACS-ELM
Based on the ELM theory, the main steps of RUL prediction for intermittently faulty components using ACS-ELM approach can be described as follows.
Step 1: Construct an intermittent fault feature dataset.
To activate the ACS-ELM-based prognoser, the intermittent fault features, which are used to build the training/test dataset, need to be constructed from the identification results. However, if the training/test set is determined by simply recording each intermittent fault magnitude as a fault feature, the intermittent fault features cannot be obtained at equal time intervals due to the discontinuity of intermittent faults and the randomness of fault appearance and disappearance, meaning that the one-step-ahead prediction of ELM cannot be properly implemented. To address this problem, a TW with a fixed length $L_{TW}$ is defined as the sampling length of the intermittent fault feature construction process, and the maximum fault magnitude in each TW is treated as the intermittent fault feature. The resulting intermittent fault feature dataset $f_\theta$ is expressed by (19), with each feature defined by (20), where $Q$ is the number of intermittent fault features in the time interval $t \in [t_{d,\theta}, t_{d,\theta} + Q \cdot L_{TW}]$; $t_{d,\theta}$ is the instant at which the intermittent fault first occurs; $f^k_\theta$ represents the fault feature of $\theta$ in the $k$th TW; $F^{k,m}_\theta$ denotes the magnitude of the $m$th fault appearance in the $k$th TW for $\theta$; the sign "±" is determined by the relation between $F^{k,m}_\theta$ and $F_{nom,\theta}$, i.e., the sign is "+" when $F^{k,m}_\theta > F_{nom,\theta}$ and "−" otherwise; and $M$ is the number of fault appearances in the $k$th TW.
To obtain enough intermittent fault features to establish the ELM prediction model, the evolutionary trend of intermittent faults must be tracked over a long period. Therefore, the corresponding local estimators containing the SPF are activated once every $L_{TW}$ after the first intermittent fault is detected. Then, the intermittent fault feature in this TW can be obtained according to (20). After that, the above process is repeated with $L_{TW}$ as the period until the intermittent fault feature dataset $f_\theta$ is determined, as shown in (19).
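Step 1 can be sketched as follows. Since (19) and (20) are not available in this text, the sketch assumes that the feature of a window is the estimated magnitude that deviates most from the nominal value, that a fault occurrence is assigned to the window containing its appearing instant, and that a window without any occurrence keeps the nominal value; all three are assumptions, and `tumbling_window_features` is a hypothetical name.

```python
def tumbling_window_features(events, f_nom, t_first, l_tw, q):
    """Build q equally spaced features from irregular fault occurrences.

    events : list of (appearing_instant, estimated_magnitude) pairs
    f_nom  : nominal value of the component
    t_first: instant at which the intermittent fault first occurs
    l_tw   : tumbling-window length
    """
    features = []
    for k in range(q):
        start, end = t_first + k * l_tw, t_first + (k + 1) * l_tw
        in_window = [mag for (t, mag) in events if start <= t < end]
        if in_window:
            # Magnitude with the largest deviation from nominal (assumed Eq. (20)).
            features.append(max(in_window, key=lambda mag: abs(mag - f_nom)))
        else:
            features.append(f_nom)  # assumption: no occurrence, nominal value
    return features
```

Because the windows do not overlap and have a fixed length, the resulting sequence is equally spaced in time even though the fault occurrences are not.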
Step 2: Determine the input/output matrix and choose a training/test set.
When the intermittent fault feature dataset $f_\theta$ is obtained, as the ELM uses the one-step-ahead prediction strategy to implement RUL estimation, $U$ consecutive data in $f_\theta$ are taken as the inputs and the datum that follows them is treated as the output. Thus, the input matrix $x$ and the output matrix $y$, each with $Q - U$ rows, are formed by sliding this window along $f_\theta$. To train the ELM model for RUL prediction, the first $V$ rows and the remaining $Q - U - V$ rows of $x$ and $y$ are treated as the training set $\{x_{train}, y_{train}\}$ and the test set $\{x_{test}, y_{test}\}$, respectively.
Step 3: Optimize the input weights and hidden layer biases using ACS.
To establish the canonical ELM model, the input weight matrix $W$ and the hidden layer bias vector $b$ are randomly selected. However, the random selection of $W$ and $b$ may lead to reductions in the generalization ability and prediction accuracy of the canonical ELM. To address this problem, the ACS algorithm developed in Section 3.3 is adopted to optimize the input weights and hidden layer biases of ELM so that an optimal ELM model can be selected, i.e., ACS-ELM.
For ACS-ELM, a host nest $z_d$ (each host nest represents a feasible solution) in the population is composed of input weights and hidden layer biases. The fitness function is formulated as the mean square error between the ELM model output $\hat{f}^j_\theta$, i.e., the actual output after $x_{test}$ is substituted into the trained ELM model, and the desired output $f^j_\theta$ in the test set (i.e., $y_{test}$). By minimizing the fitness function, the optimal solutions, including the input weight matrix, hidden layer bias vector, and output weight matrix, can be determined, after which the ACS-ELM is obtained.
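Steps 2 and 3 can be sketched together: the first function builds the sliding-window matrices and the train/test split, and the second evaluates one host nest. This is a sketch under assumptions, not the paper's code: the row layout of $x$ and $y$ is inferred from the description above, the nest is assumed to store the $L \times U$ input weights followed by the $L$ biases, and a sigmoid activation is assumed. Function names are hypothetical.

```python
import numpy as np

def sliding_window_dataset(f, u, v):
    """Rows of x are U consecutive features; y holds the feature that follows."""
    f = np.asarray(f, dtype=float)
    q = f.size
    x = np.array([f[k:k + u] for k in range(q - u)])  # shape (Q-U, U)
    y = f[u:]                                         # shape (Q-U,)
    return (x[:v], y[:v]), (x[v:], y[v:])             # train, test

def decode_nest(z, n_hidden, n_inputs):
    """Split a nest vector into input weights W (L x U) and biases b (L,)."""
    z = np.asarray(z, dtype=float)
    W = z[:n_hidden * n_inputs].reshape(n_hidden, n_inputs)
    b = z[n_hidden * n_inputs:]
    return W, b

def acs_elm_fitness(z, train, test, n_hidden):
    """Mean square error on the test set of the ELM encoded by nest z."""
    x_train, y_train = train
    x_test, y_test = test
    W, b = decode_nest(z, n_hidden, x_train.shape[1])
    hidden = lambda x: 1.0 / (1.0 + np.exp(-(x @ W.T + b)))
    beta = np.linalg.pinv(hidden(x_train)) @ y_train  # output weights
    return float(np.mean((hidden(x_test) @ beta - y_test) ** 2))
```

The output weights are not part of the nest: for every candidate $W$ and $b$ they are recomputed by the pseudo-inverse, so the search space of the optimizer has only $L \cdot (U + 1)$ dimensions.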
Step 4: Predict the RUL of intermittently faulty components.
When the ACS-ELM prediction model is determined, taking $y_{test}$ as the initial input of the ACS-ELM prediction model, the one-step-ahead prediction procedure is implemented repeatedly to predict the next intermittent fault feature until the predicted feature value $\hat{f}^k_\theta$ exceeds the predefined failure threshold. Due to the stochasticity of intermittent faults, the failure instant can only be located within a TW of length $L_{TW}$ rather than at an exact time. Therefore, the EOL and RUL of the intermittently faulty component are depicted by time intervals, as shown in (25) and (26), in which the magnitude of at least one fault, rather than all faults, will exceed the failure threshold.
Based on the above descriptions, the RUL prediction procedure for the intermittently faulty component using ACS-ELM is summarized in Figure 4.
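The recursive prediction of Step 4 can be sketched as below. The interval formula is an assumption, since (25) and (26) are not available in this text: it treats the RUL as the time from the end of the $Q$th TW to the TW in which the threshold is first crossed. It is consistent with the experimental result reported in Section 5.2 ($k_{fail} = 162$, RUL in (610, 620] s with $L_{TW} = 10$ s) if $Q = 100$ is assumed there as in the simulation. `predict_next` stands for any trained one-step-ahead model and is a hypothetical interface.

```python
def predict_k_fail(predict_next, last_inputs, q, threshold,
                   decreasing=True, k_max=100_000):
    """Index of the first window whose predicted feature crosses the threshold.

    predict_next: callable mapping the last U features to the next feature
    last_inputs : the U most recent features (ending at window q)
    decreasing  : True if degradation lowers the feature (e.g., efficiency factor)
    """
    window = list(last_inputs)
    for k in range(q + 1, k_max + 1):
        f_next = predict_next(window)
        crossed = f_next <= threshold if decreasing else f_next >= threshold
        if crossed:
            return k
        window = window[1:] + [f_next]  # feed the prediction back as input
    return None  # threshold not reached within k_max windows

def rul_interval(k_fail, q, l_tw):
    """Assumed RUL interval ((k_fail - q - 1) * l_tw, (k_fail - q) * l_tw]."""
    return ((k_fail - q - 1) * l_tw, (k_fail - q) * l_tw)
```

For instance, `rul_interval(162, 100, 10)` gives `(610, 620)`, read as the half-open interval (610, 620] s.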

Simulation and Experiment Results
To verify the effectiveness of the ACS-based distributed intermittent fault estimation method and the ACS-ELM-based intermittently faulty component prognosis method, a series of simulation and experiment analyses are carried out based on the electric scooter system in this section. The nominal values of the electric scooter system parameters are given in Table 3 [28]. The sampling time is set to 0.02 s in both the simulation and experiment studies, and the intermittent fault feature sampling length (i.e., $L_{TW}$) is 10 s.

Simulation Study
In the simulation study, the fault scenario described in Case I of Section 3.2 is considered, where intermittent faults occur in $\beta_{U_{in}}$ and $R_{fc}$. The simulation procedure is implemented in the MATLAB/Simulink environment, where the BG model of the electric scooter system is shown in Figure 2. The runtime of the electric scooter model in the simulation is 1100 s; when the simulation runs to 100 s, intermittent faults occur in $\beta_{U_{in}}$ and $R_{fc}$ simultaneously. The designed degradation processes of the intermittent fault features of $\beta_{U_{in}}$ and $R_{fc}$ in the TW are expressed as $f_{\beta_{U_{in}}} = 1 - 0.03 \cdot k^{0.6088}$ and $f_{R_{fc}} = 0.0167 \cdot k^{0.7221} + 1.86 \times 10^{-3}$. Note that the intermittent faults of $\beta_{U_{in}}$ and $R_{fc}$ are both designed for the situation where the intermittent fault gradually deteriorates in magnitude and possesses stochasticity in fault appearance and disappearance. Firstly, the residual responses are shown in Figure 5, where dashed lines are thresholds and solid lines are residuals. The thresholds in the simulation process were chosen by observing the residual responses under normal simulation model conditions (i.e., 0.04 for $|MARR_1|$, 0.6 for $|MARR_2|$, and 0.04 for $|MARR_3|$). Only the residual responses for $t \in [0, 200]$ s are shown in Figure 5 for easy observation; the detection processes of the later 900 s are similar and are not shown. From Figure 5, it can be observed that $MARR_1$ and $MARR_3$ intermittently exceed their thresholds; accordingly, the CV can be determined as $CV = [1\ 0\ 1]$, and the SPF is obtained by comparing the CV with the FSM, i.e., $SPF = \{\beta_{U_{in}}, R_{rv}, R_{fv}, R_{fc}\}$. Then, based on the description of Case I in Section 3.2, $\{\beta_{U_{in}}, R_{rv}\}$ and $\{R_{fv}, R_{fc}\}$ can be estimated in parallel by an $S_1$-based local estimator and an $S_3$-based local estimator, respectively.
According to the construction method of the intermittent fault feature dataset, the $S_1$-based local estimator and the $S_3$-based local estimator are activated once every $L_{TW} = 10$ s to identify the magnitude and the appearing and disappearing instants of each intermittent fault in the corresponding TW after the first intermittent fault is detected; then, the intermittent fault feature in each TW can be determined by (20). To obtain enough intermittent fault features to build the ACS-ELM prediction model, monitoring of the intermittent fault degradation process in the simulation lasts for 1000 s (from 100 s to 1100 s), i.e., $Q = 100$. Therefore, the distributed fault estimators need to be sequentially implemented 100 times to capture 100 intermittent fault features. From the estimation results, the actual values of $R_{rv}$ and $R_{fv}$ are consistent with their nominal values. Therefore, $R_{rv}$ and $R_{fv}$ can be excluded from the SPF, and it can be concluded that $\beta_{U_{in}}$ and $R_{fc}$ suffer from intermittent faults. To demonstrate the effectiveness and accuracy of the ACS-based distributed fault estimation approach, the identification results of the first two TWs are taken as examples for illustration. The distributed fault estimation results of $\beta_{U_{in}}$ and $R_{rv}$ for the first two TWs are shown in Table 4, which lists the fault estimation results (including the magnitude and the appearing and disappearing instants of each intermittent fault in the two TWs) versus the designed values. Thus, based on (20), the intermittent fault features of the first two TWs can be determined. When the ACS-based distributed fault estimation process is terminated, all intermittent fault features are determined to form the intermittent fault feature dataset. After that, the intermittent fault feature datasets of $\beta_{U_{in}}$ and $R_{fc}$, i.e., $f_{\beta_{U_{in}}}$ and $f_{R_{fc}}$, are depicted in Figures 6a and 7a, respectively.

Experiment Study
To further investigate the performance of the proposed prognosis method in an experimental environment, the fault scenario described in Case II of Section 3.2 was considered, where $\theta_r$ (i.e., $\beta_{\theta_r}$) suffers from intermittent faults. The designed degradation process of the intermittent fault features of $\beta_{\theta_r}$ in the TW was expressed as $f_{\beta_{\theta_r}} = 1 - 0.02 \cdot k^{0.7268}$. A diagram of the experimental platform workflow is given in Figure 8. The lead-acid batteries (model: 12 V 20 AH lead-acid batteries) supplied power to the DC motor driver. The control signal to the DC motor driver (model: CRRTIS 1212-2201, nominal voltage 24 V, drive current 45 Amps, peak boost current 55 Amps, maximum boost duration 10 s) was provided by the USB data acquisition card (model: Advantech USB-4711A, USB 2.0 interface, 16 analog input channels, 12 bit resolution, sampling rate 150 kS/s), powered by the onboard laptop. The velocities of the electric scooter were measured by three incremental encoders (model: Omron E6B2-CWZ3E, resolution ratio 1000P/R, rated voltage 24 V). The onboard laptop provided voltage to the data acquisition card and sent an input signal to the motor driver. The FDI procedure of the electric scooter system was accomplished by the LabVIEW module, while distributed fault estimation and ACS-ELM-based prognosis were conducted by introducing a MATLAB script node into LabVIEW, based on which the co-simulation between MATLAB and LabVIEW could be implemented. The proposed methods were implemented in MATLAB R2015a and LabVIEW 2014 using an onboard laptop with an Intel Core i7-6500 2.6 GHz CPU with 8 GB memory and the Microsoft Windows 7 Enterprise SP1 64-bit operating system. Firstly, the fault detection results shown in Figure 9 indicate that $CV = [1\ 1\ 0]$ and $SPF = \{\beta_{U_{in}}, R_{rv}, \beta_{\theta_r}\}$.
Note that a low-pass filter 5/(s + 5) was used to suppress the measurement noise in the experiment, and the experimental thresholds were determined from observations of the filtered residual responses of the actual system under healthy conditions (i.e., 0.2 for |MARR 1 |, 3 for |MARR 2 |, and 0.2 for |MARR 3 |). According to the description of Case II in Section 3.2, submodel S 1 contains all possible faulty components in the SPF, so the S 1 -based local estimator can be used to identify β U in , R rv , and β θ r . The ACS-based intermittent fault estimation for submodel S 1 was then implemented; as in the first fault scenario, the fault estimation results show that an intermittent fault occurred in β θ r , while β U in and R rv were fault-free. The intermittent fault feature dataset of β θ r was obtained according to (19) and (20). From Figure 10b, the prediction results were obtained as k fail,β θr = 162 and t rul,β θr ∈ (610, 620] s.
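The reported RUL interval can be related to the predicted failure window with a short calculation. The sketch below is an illustration under stated assumptions rather than the paper's formula: it assumes the experiment monitored Q = 100 TWs of length L TW = 10 s before prediction (the values given for the simulation, which are consistent with the reported interval but are not stated for the experiment in this section), and the function name `rul_interval` is hypothetical.

```python
def rul_interval(k_fail, q_observed, l_tw):
    """Return (lower, upper) bounds in seconds of the RUL interval.

    The interval is read as (lower, upper]: failure is predicted inside
    TW number k_fail, i.e. between (k_fail - q_observed - 1) and
    (k_fail - q_observed) window lengths after the last observed TW.
    """
    if k_fail <= q_observed:
        raise ValueError("predicted failure window must lie after the observed TWs")
    upper = (k_fail - q_observed) * l_tw
    lower = upper - l_tw
    return lower, upper


# k_fail = 162 is the reported prediction; Q = 100 and L_TW = 10 s are
# assumed to carry over from the simulation setup.
print(rul_interval(162, 100, 10.0))  # prints (610.0, 620.0)
```

Under these assumptions, the result matches the reported interval (610, 620] s.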

Analysis and Comparison
To evaluate the RUL prediction accuracy of the ACS-ELM approach, three metrics are adopted: mean absolute error (MAE), root mean square error (RMSE), and mean absolute percentage error (MAPE). In these metrics, $\hat{f}^{k}_{\theta}$ denotes the predicted intermittent fault feature in the kth TW for θ, and $f^{k,a}_{\theta}$ represents the designed intermittent fault feature in the kth TW according to the pre-designed degradation process of intermittent fault features for θ. Following the designed intermittent fault feature degradation processes of β U in , R f c , and β θ r (for β θ r , f β θr = 1 − 0.02 · k^0.7268), the designed fault degradation trends of β U in , R f c , and β θ r are depicted in Figures 6b, 7b and 10b, respectively. The evaluation metrics of the ACS-ELM-based RUL prediction results of the simulation and experiment are summarized in Table 5. Table 5 shows that the ACS-ELM-based prognosis for intermittently faulty components is accurate in both simulation and experiment. Finally, a comparison study was conducted with traditional ELM and PSO-ELM [27] to validate the superiority of the proposed ACS-ELM-based prognosis method. To make the comparison fair, all approaches were tested on the same simulation or experiment data, and the population size, maximum number of iterations, and parameter search spaces were the same for the three algorithms. Each approach was run in 30 independent tests (simulation for β U in and R f c , experiment for β θ r ). The mean value of MAE over the 30 tests (denoted $\overline{\mathrm{MAE}}$) was taken as the comparison index of algorithm performance, as were the mean values of RMSE and MAPE ($\overline{\mathrm{RMSE}}$ and $\overline{\mathrm{MAPE}}$). The comparison results of ELM, PSO-ELM, and ACS-ELM are summarized in Table 6.
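For reference, the three metrics named at the start of this subsection are conventionally defined as below. This is a sketch of the standard definitions, not a transcription of the paper's equations: the number of evaluated TWs N and the summation range are assumptions, as is expressing MAPE as a percentage.

```latex
\begin{align}
\mathrm{MAE}  &= \frac{1}{N}\sum_{k=1}^{N}\left|\hat{f}^{k}_{\theta} - f^{k,a}_{\theta}\right|, \\
\mathrm{RMSE} &= \sqrt{\frac{1}{N}\sum_{k=1}^{N}\left(\hat{f}^{k}_{\theta} - f^{k,a}_{\theta}\right)^{2}}, \\
\mathrm{MAPE} &= \frac{100\%}{N}\sum_{k=1}^{N}\left|\frac{\hat{f}^{k}_{\theta} - f^{k,a}_{\theta}}{f^{k,a}_{\theta}}\right|.
\end{align}
```

All three are zero for a perfect prediction. RMSE penalizes large single-window errors more heavily than MAE, and MAPE is only meaningful while the designed feature $f^{k,a}_{\theta}$ stays away from zero.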
Meanwhile, the RUL prediction performance of the different algorithms is compared in Figure 11, where the cube roots of MAE, RMSE, and MAPE (i.e., MAE C , RMSE C , and MAPE C ) are plotted for ease of observation. According to Table 6, relative to standard ELM, ACS-ELM reduced the average prediction evaluation indexes over 30 trials (i.e., $\overline{\mathrm{MAE}}$, $\overline{\mathrm{RMSE}}$, and $\overline{\mathrm{MAPE}}$, respectively) by 83.95%, 81.50%, and 83.51% for β U in ; by 86.67%, 87.82%, and 89.45% for R f c ; and by 85.27%, 84.84%, and 81.78% for β θ r . Relative to PSO-ELM, the corresponding reductions were 39.53%, 36.70%, and 54.98% for β U in ; 29.17%, 41.54%, and 42.52% for R f c ; and 38.04%, 32.71%, and 45.69% for β θ r . It can be concluded that ACS-ELM performs better than traditional ELM and PSO-ELM in terms of RUL prediction under intermittent faults.
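The percentage reductions quoted above are relative reductions of a trial-averaged metric. A minimal Python sketch of that arithmetic follows. The helper name `percent_reduction` is hypothetical and the per-trial values are made up for illustration; they are not taken from Table 6.

```python
from statistics import mean


def percent_reduction(baseline, proposed):
    """Relative reduction of `proposed` with respect to `baseline`, in percent."""
    if baseline <= 0:
        raise ValueError("baseline metric must be positive")
    return 100.0 * (baseline - proposed) / baseline


# Made-up per-trial MAE values (the paper uses 30 trials per algorithm).
elm_mae = [0.5, 0.5, 0.5]
acs_elm_mae = [0.125, 0.125, 0.125]

# Average each algorithm's metric over its trials, then compare the averages.
reduction = percent_reduction(mean(elm_mae), mean(acs_elm_mae))
print(f"{reduction:.2f}%")
```

The metric is averaged over trials before the reduction is taken, which matches the description of the comparison indexes as mean values over 30 tests.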

Conclusions
In this paper, an ACS-ELM-based prognosis method is developed for an electric scooter system with intermittent faults. The bond-graph-based FDI framework identifies the possible faulty components. Based on structural model decomposition and the ACS algorithm, distributed fault estimation is implemented to identify the magnitude and the appearing and disappearing instants of each intermittent fault in the faulty components. For the prognosis of intermittently faulty components, ACS-ELM is proposed to model the degradation process of the intermittent fault features and to predict the RUL of the intermittently faulty components. A series of simulation and experiment results verifies the effectiveness of the proposed methods. The experiment and comparison studies show that the ACS-ELM-based RUL predictions for intermittently faulty components are accurate, and that ACS-ELM performs better than traditional ELM and PSO-ELM for prognosis under intermittent faults.
This work provides an effective method for the RUL prediction of intermittently faulty components when degradation models are unavailable. Two challenging issues remain for future research. First, this work only considers ACS-ELM-based RUL prediction for the degradation of the intermittent fault magnitude, while intermittent fault degradation can also be reflected in the fault duration; the proposed method should therefore be applied to RUL prediction for the degradation of the intermittent fault duration. Second, as the system working conditions (road conditions, system input, system mode, etc.) often change in practice, RUL prediction of intermittent faults under varying working conditions should be considered.