Dynamic System Identification and Prediction Using a Self-Evolving Takagi–Sugeno–Kang-Type Fuzzy CMAC Network

This study proposes a Self-evolving Takagi-Sugeno-Kang-type Fuzzy Cerebellar Model Articulation Controller (STFCMAC) for solving identification and prediction problems. The proposed STFCMAC model uses the hypercube firing strength for generating external loops and internal feedback. A differentiable Gaussian function is used in the fuzzy hypercube cell of the proposed model, and a linear combination function of the model inputs is used as the model output. The learning process of the STFCMAC is initiated using an empty hypercube base. Fuzzy hypercube cells are generated through structure learning, and the related parameters are adjusted by a gradient descent algorithm. The advantages of the proposed STFCMAC network are summarized as follows: (1) the model automatically selects the parameters of the memory structure, (2) it requires few fuzzy hypercube cells, and (3) it performs identification and prediction adaptively and effectively.


Introduction
During the past decade, neural networks (NNs) have been widely used in dynamic system applications, such as control, identification, prediction, and signal processing [1][2][3][4]. NNs exhibit the advantages of effective function approximation, adaptive learning, generalization abilities, and computation parallelism. However, their disadvantages include computational complexity and slow convergence.
Based on the neurophysiological properties of the human cerebellum, Albus [5,6] presented a cerebellar model articulation controller (CMAC). The associative memory is constructed using the overlapping receptive fields. A corresponding relationship is present between the input and output of the mapping. The CMAC has a highly standardized computational structure, fast network learning, local generalization, and fast convergence [5][6][7][8][9]. However, because of its constant response and quantified receptive fields, the approximation capacity of the CMAC model is limited. In other words, the receptive fields are fixed in the conventional CMAC model.
To overcome the aforementioned problems, several studies have proposed improving the performance of the CMAC model by using differentiable cells with fuzzy boundaries [10][11][12]. Because the CMAC develops differentiable functions through learning, it exhibits structural flexibility in the local region. Several researchers [13][14][15][16] have combined the CMAC with fuzzy logic linguistic representation to solve uncertain and nonlinear problems. Sim et al. [13] introduced Bayesian Ying-Yang learning to optimize an FCMAC, adopting a forward training phase and a backward running phase for input-output discrimination. Wu et al. [14] proposed an adaptive mechanism for FCMAC learning. Wu [15] investigated the trajectory tracking control of wheeled mobile robots by using an FCMAC. Compared with the CMAC, the FCMAC uses fuzzy membership functions to model the problem; as a result, the FCMAC is highly intuitive and easy to understand.
Previous studies [17,18] have employed different strategies to enhance the efficiency of CMACs. Zeng and Keane [17] combined Kolmogorov's theorem and hierarchical fuzzy systems to promote the universal approximation property. To improve the function approximation ability, Lee et al. [18] proposed a parametric FCMAC (PFCMAC) that is a hybrid of a Takagi-Sugeno-Kang (TSK)-type fuzzy inference system [19] and a CMAC network. The PFCMAC can approximate continuous functions and minimize the number of hypercube cells. However, these models use a feedforward structure and therefore suffer from instability problems due to the local nature of hypercube cells; this problem also arises with overtraining in static or dynamic systems [20,21]. Dynamic systems involve temporal correlation between the input and output, and several types of recurrent technique exploit this mechanism in NNs and fuzzy systems [22][23][24][25][26]. The relative position of the delay units can be adjusted to achieve precise control and enable accurate approximation of actual values. Therefore, recurrent networks can overcome the disadvantages of feedforward networks. There are two types of recurrent structure: one uses global feedback in fuzzy NNs (FNNs) [27][28][29][30], whereas the other uses internal state variables as local recurrent feedback loops [27][28][29][30]. However, the aforementioned structures cannot map the interactions among hypercube cells.
In this study, we extend our previous study [18] by developing a Self-evolving Takagi-Sugeno-Kang-type Fuzzy Cerebellar Model Articulation Controller (STFCMAC) model. The interactively recurrent structure of the proposed model provides a strong search ability for local and global solutions. In the proposed model, global feedback is obtained from a fuzzy hypercube cell itself and from the other fuzzy hypercube cells. By contrast, local feedback, in which a fuzzy hypercube cell receives feedback from itself only, is insufficient to represent all necessary information. Moreover, several studies have considered past states in recurrent structures without referring to current states and have thus obtained insufficient information. The three major contributions of the proposed STFCMAC are summarized as follows: (1) the model automatically selects the parameters of the memory structure, (2) it requires few fuzzy hypercube cells, and (3) it performs identification and prediction adaptively and effectively.
Several types of simple CMAC are introduced in Section 2. The STFCMAC model is proposed in Section 3. Section 4 presents the learning algorithm of the STFCMAC. Section 5 illustrates the experimental results for identifying nonlinear dynamic systems and predicting time series. Finally, conclusions are provided in Section 6.

Fuzzy CMAC Model
The fuzzy CMAC (FCMAC) model is similar to the traditional CMAC model but uses two main mappings, S(x) and P(s), based on fuzzy operations to approximate the nonlinear function y = f(x). The FCMAC model is shown in Figure 1. A Gaussian function is used to model the receptive field basis function, and fuzzy weights are applied to the result. The five layers are described as follows. In Layer 1, each input variable xi is quantized into discrete regions or elements, several of which can accumulate to form a block; each element performs a Gaussian basis function. Layer 2 is an associated memory space, in which each node corresponds to a linguistic variable represented by a membership function; this layer can be regarded as fuzzifying the input variables. Layer 3 is the receptive field space, or fuzzy hypercube, in which each node implements a fuzzy operation and obtains the firing strength s. The fuzzy weights in Layer 4 are inferred to generate a partially fired fuzzy output according to the fuzzy hypercube selection vector, which serves as the matching degree of the inputs. In Layer 5, a centroid of area approach is adopted to obtain the model output.
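As a concrete illustration of this five-layer mapping, the following Python sketch implements the FCMAC forward pass under the assumptions above (Gaussian receptive fields, product firing strength, centroid-of-area output). All function names, sizes, and parameter values are hypothetical and chosen for illustration only, not taken from the paper.

```python
import numpy as np

def gaussian(x, mean, sigma):
    """Gaussian receptive-field basis function (Layer 2 membership)."""
    return np.exp(-((x - mean) ** 2) / (sigma ** 2))

def fcmac_forward(x, means, sigmas, weights):
    """Forward pass of a minimal FCMAC sketch.

    x       : (n_inputs,)          input vector (Layer 1)
    means   : (n_cells, n_inputs)  Gaussian centers
    sigmas  : (n_cells, n_inputs)  Gaussian widths
    weights : (n_cells,)           fuzzy weights (Layer 4)
    """
    mu = gaussian(x, means, sigmas)  # Layer 2: fuzzify each input per cell
    s = np.prod(mu, axis=1)          # Layer 3: product firing strength per hypercube
    # Layers 4-5: weight the firing strengths, then centroid-of-area defuzzification
    return float(np.sum(s * weights) / (np.sum(s) + 1e-12))

# toy usage: three hypercube cells over a 2-D input
rng = np.random.default_rng(0)
means = rng.uniform(-1.0, 1.0, (3, 2))
sigmas = np.full((3, 2), 0.5)
weights = np.array([0.2, -0.4, 0.9])
y = fcmac_forward(np.array([0.1, -0.3]), means, sigmas, weights)
```

Because the output is a normalized weighted average of the firing strengths, it always lies within the range spanned by the fuzzy weights.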

Proposed STFCMAC Model
Structural learning and parametric learning in the STFCMAC model, an extension of our previous study [18], are described in this section. The recurrent structure in the STFCMAC model uses an interactive feedback mechanism that captures key information from other hypercube units and can be combined with TSK-type linear functions to obtain a better solution. The proposed STFCMAC model differs from that proposed by Lin et al. [31], which employs a fuzzy neural network structure; our model employs a TSK-type fuzzy CMAC structure. The proposed STFCMAC model employs a recurrent feedback mechanism in the temporal layer and a linear combination function in the consequent part to ensure high network performance. Figure 2 illustrates the structure of the STFCMAC model, whose layers are described as follows.
Layer 1: Each node in this layer directly transfers the input value to the next layer.
Layer 2: Each node performs a Gaussian membership function, μij(xi) = exp(−(xi − mij)^2/σij^2), to fuzzify the corresponding input variable, where mij and σij are the mean and variance of the receptive field function.
Layer 3: Each node computes the firing strength of a fuzzy hypercube cell as sj = ∏i μij(xi), where ∏i(·) denotes the product operation over the input dimensions and sj denotes the firing strength.
Layer 4: In this layer, each recurrent node performs internal and external feedback loops. The output oj depends on both the previous and current firing strengths.
Layer 5: Each node in this layer combines a linear combination function of the inputs, fj = aj0 + Σi aji xi, with the corresponding feedback loop output from Layer 4, where aj0 and aji are constant values and the sum runs from i = 1 to ND, with ND representing the number of input dimensions.
Layer 6: The centroid of area method is adopted for performing the defuzzification operation in this layer. The actual output y is described as y = Σj oj fj / Σj oj.
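The layered computation described above can be sketched as follows. The class name, all parameter names, and in particular the Layer 4 recurrence (current firing strength plus weighted feedback from every cell's previous output, covering both internal and external loops) are one plausible reading of the description, not the paper's exact equations.

```python
import numpy as np

class STFCMACSketch:
    """Minimal forward-pass sketch of a six-layer recurrent TSK-type fuzzy CMAC.
    Hypothetical names; the recurrent form is an illustrative assumption."""

    def __init__(self, means, sigmas, rec_w, a0, a):
        self.means, self.sigmas = means, sigmas  # (n_cells, n_in) Gaussian cells
        self.rec_w = rec_w                       # (n_cells, n_cells) feedback weights
        self.a0, self.a = a0, a                  # TSK consequents: (n_cells,), (n_cells, n_in)
        self.prev = np.zeros(len(means))         # previous Layer-4 outputs

    def forward(self, x):
        # Layers 1-2: fuzzify the inputs with Gaussian membership functions
        mu = np.exp(-((x - self.means) ** 2) / (self.sigmas ** 2))
        # Layer 3: product firing strength per hypercube cell
        s = np.prod(mu, axis=1)
        # Layer 4: current strength plus feedback from every cell's previous
        # output (internal + external loops) -- an assumed recurrence form
        o = s + self.rec_w @ self.prev
        self.prev = o
        # Layer 5: TSK-type linear combination of the inputs per cell
        f = self.a0 + self.a @ x
        # Layer 6: centroid-of-area defuzzification
        return float(np.sum(o * f) / (np.sum(o) + 1e-12))

# toy usage: two cells on a 2-D input, fed the same input twice;
# the second call is influenced by the stored Layer-4 state
net = STFCMACSketch(means=np.zeros((2, 2)), sigmas=np.ones((2, 2)),
                    rec_w=np.array([[0.1, 0.0], [0.0, 0.3]]),
                    a0=np.array([1.0, 2.0]), a=np.zeros((2, 2)))
y1 = net.forward(np.array([0.0, 0.0]))
y2 = net.forward(np.array([0.0, 0.0]))
```

Feeding an identical input twice yields different outputs, which is exactly the temporal dependence that distinguishes the recurrent model from a feedforward FCMAC.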

Learning Algorithm for Proposed STFCMAC Model
The proposed supervised learning algorithm comprises both structure and parameter learning schemes. A flowchart of the two schemes is shown in Figure 3. Initially, the STFCMAC model has no fuzzy hypercubes. In the structure learning scheme, a degree measure determines the self-partition of the input space. Then, the parameter learning scheme uses a backpropagation algorithm to adjust the parameters of the STFCMAC model so as to minimize a given cost function.

Structure Learning Scheme
A new fuzzy hypercube is generated in the structure learning scheme. The firing strength in Layer 3, obtained after the product operation, is used as the degree measure. The maximum degree measure Smax is determined as
Smax = max1≤j≤N sj, (8)
where N is the current number of fuzzy hypercube cells. A prespecified threshold S̄ is then defined. If Smax ≤ S̄, a new fuzzy hypercube cell is generated; otherwise, no new cell is generated. To avoid increasing the size of the STFCMAC model, the prespecified threshold should be reduced during the learning process. The selected threshold value is problem-dependent; that is, it depends on user experience or trial and error.
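The self-evolving generation criterion can be sketched as below: starting from an empty hypercube base, a new Gaussian cell is created whenever no existing cell fires strongly enough. The function name, the choice of centering a new cell at the current input, and the default width of 0.5 are illustrative assumptions.

```python
import numpy as np

def maybe_add_cell(x, cells, threshold):
    """Structure-learning step: add a fuzzy hypercube cell when the maximum
    firing strength S_max falls at or below the prespecified threshold.

    cells is a list of (mean, sigma) pairs; an empty list is the empty
    hypercube base the learning process starts from.
    """
    if cells:
        strengths = [float(np.prod(np.exp(-((x - m) ** 2) / (s ** 2))))
                     for m, s in cells]
        s_max = max(strengths)  # S_max = max_j s_j over the N current cells
    else:
        s_max = 0.0             # empty base: the first cell is always generated
    if s_max <= threshold:
        # new cell centered at the current input with an assumed default width
        cells.append((x.copy(), np.full_like(x, 0.5)))
    return cells

# toy usage with a fixed threshold of 0.1
cells = maybe_add_cell(np.array([0.0, 0.0]), [], threshold=0.1)
cells = maybe_add_cell(np.array([0.0, 0.0]), cells, threshold=0.1)  # fires strongly: no new cell
cells = maybe_add_cell(np.array([5.0, 5.0]), cells, threshold=0.1)  # far input: new cell
```

A larger threshold makes the model grow more eagerly, which is why the paper recommends reducing it during learning to bound the network size.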

Parameter Learning Scheme
The backpropagation algorithm is used in the parameter learning scheme to adjust the parameters of the STFCMAC model. For ease of explanation, a single output is taken as an example. The cost function E(t) is defined as follows:
E(t) = (1/2)[yd(t) − y(t)]^2,
where yd(t) and y(t) are the desired and actual model outputs, respectively, at time t. The general backpropagation learning algorithm is written as follows:
P(t + 1) = P(t) − η ∂E(t)/∂P,
where η is the learning rate and P denotes an adjustable parameter of the STFCMAC model. The adjustable parameters P are calculated using the gradient of the error function E(·).
A recursive error term is generated in each layer by the chain rule to adjust the tunable parameters in the corresponding layer. The parameters in the corresponding antecedent and consequent parts of the STFCMAC model are then adjusted according to the general gradient rule above. In particular, the recurrent weight parameter of each cell, a real value belonging to (0, 1), is updated in the same manner, where η represents the learning rate and e denotes the difference between the desired output and the model output.
The mean and variance of the receptive field functions are updated by the same gradient rule. All of the aforementioned formulas pertain to a multiple-input, single-output system. For a multi-input, multi-output system, the cost function is rewritten as
E(t) = (1/2) Σk [yd,k(t) − yk(t)]^2,
where n is the number of outputs and k = 1, 2, …, n.
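The gradient-descent update P(t + 1) = P(t) − η ∂E(t)/∂P can be illustrated with the sketch below. It substitutes numerical differentiation for the paper's analytic chain-rule derivatives, and the linear toy model and all names are hypothetical; the point is only the shape of the update loop.

```python
import numpy as np

def gd_step(params, model, x, y_d, eta=0.15, eps=1e-6):
    """One gradient-descent step P <- P - eta * dE/dP for the cost
    E = 0.5*(y_d - y)^2, using central-difference gradients as a
    stand-in for the analytic backpropagation derivatives (sketch)."""
    def cost(p):
        return 0.5 * (y_d - model(p, x)) ** 2
    grads = np.zeros_like(params)
    for i in range(params.size):
        dp = np.zeros_like(params)
        dp[i] = eps
        grads[i] = (cost(params + dp) - cost(params - dp)) / (2 * eps)
    return params - eta * grads

# toy usage: fit a hypothetical linear model y = p0*x + p1 to one sample
def linear_model(p, x):
    return p[0] * x + p[1]

p = np.array([0.0, 0.0])
for _ in range(50):
    p = gd_step(p, linear_model, x=1.0, y_d=2.0)
```

With the learning rate of 0.15 the toy error shrinks geometrically, mirroring the role η plays in the STFCMAC update rule.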

Experimental Results
To illustrate the identification and prediction performance of the proposed STFCMAC model, three simulation examples, involving two dynamic system identification problems and a Mackey-Glass chaotic series prediction problem, are described in this section. In the two dynamic system identification examples, we focus on comparing the performance of the STFCMAC model with those of different recurrent fuzzy neural networks. In addition, networks with different structures are used to demonstrate the superiority of the STFCMAC in Mackey-Glass chaotic series prediction.

Example 1: Identification of Nonlinear System
In this example, a nonlinear dynamic system is identified using the STFCMAC model. The difference equation of the nonlinear system is described as follows:
y(t + 1) = f(y(t), y(t − 1), y(t − 2), u(t), u(t − 1)),
where f(x1, x2, x3, x4, x5) = [x1 x2 x3 x5 (x3 − 1) + x4] / (1 + x2^2 + x3^2). The initial parameters are set as follows: the learning rate is 0.1 and the prespecified threshold is 0.0001. The system output depends on the two most recent inputs and the three previous outputs. In training the STFCMAC model, we use only ten epochs, with 900 time steps in each epoch. Similar to the inputs used in [29,32], the input is an iid uniform sequence over (−2, 2) for approximately half of the 900 time steps and a sinusoid, 1.05sin(πt/45), for the remaining time. These 900 training data are not repeated; that is, a different training set is used in each epoch. The time step used in this paper is 1. After training, three hypercube cells are generated. The testing input u(t) is defined in Equation (23). To fairly compare the experimental results, the STFCMAC and the other methods use the same numbers of training data, testing data, and input variables. The performance of the STFCMAC model is compared with that of the self-organizing recurrent fuzzy CMAC model for dynamic system identification (RFCMAC) [24], the high-order recurrent neuro-fuzzy system (HO-RNFS) [28], the TSK-type recurrent fuzzy network (TRFN) [29], the wavelet recurrent fuzzy NN (WRFNN) [33], and the recurrent self-evolving NN with local feedback (RSEFNN-LF) [32]. The comparison covers the number of fuzzy hypercube cells or fuzzy rules, the number of parameters, the training root-mean-square error (RMSE), and the testing RMSE; the results are presented in Table 1. Figures 4 and 5 respectively display the identification results and the errors between the real output and the output obtained using the STFCMAC model. As shown in the third row of Table 1, the proposed STFCMAC requires fewer parameters than the other methods, except the RFCMAC.
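The hybrid training signal described above (an iid uniform sequence for roughly the first half of the 900 time steps, then the sinusoid 1.05sin(πt/45)) can be sketched as follows; the function name, the exact half-way split, and the seed handling are illustrative assumptions.

```python
import numpy as np

def make_training_input(n_steps=900, half=450, amp=2.0, seed=None):
    """Training input for the identification example: iid uniform on
    (-amp, amp) for the first half of the time steps, then the sinusoid
    1.05*sin(pi*t/45) for the remainder (sketch)."""
    rng = np.random.default_rng(seed)
    u = np.empty(n_steps)
    u[:half] = rng.uniform(-amp, amp, half)   # random excitation phase
    t = np.arange(half, n_steps)
    u[half:] = 1.05 * np.sin(np.pi * t / 45.0)  # sinusoidal phase
    return u

u = make_training_input(seed=0)
```

Regenerating this signal with a fresh seed for every epoch reproduces the paper's setup in which the 900 training data are not repeated across epochs.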
The experimental results indicate that the proposed STFCMAC model exhibits a better identification ability in terms of RMSE than the other methods. In addition, two simulations, involving different training magnitude regions and different input delays, are conducted to observe their effects on the proposed model. First, the testing signal in Equation (23) is used for this simulation, and different training magnitude regions are adopted to illustrate their effects. That is, the input is an iid uniform sequence on (−2, 2), (−1.6, 1.6), or (−1.2, 1.2). These three training magnitudes are used for approximately half of the 900 time steps, and a sine function, 1.05sin(πt/45), is used to generate the remaining 450 training inputs. Figure 6 presents the identification results for the dynamic system obtained using the STFCMAC model with different training magnitude regions. The simulation results show that the narrowest training magnitude region, from −1.2 to 1.2, yields the best identification results.
Second, simulations with different input delays are used to explore the relationship between the testing RMSE and the input delay in the proposed model. In this simulation, input delays of 5 to 30 are used. Figure 7 illustrates the relationship between the testing RMSE and the time delay. The figure shows that when the time delay reaches 30, the proposed model starts to perform poorly.

Example 2: System Identification of Longer Input Delays
In this example, system identification with longer input delays is considered. The plant to be identified depends on four previous inputs and two previous outputs. In the training procedure, 10 epochs are used, and each epoch comprises 900 time steps. The initial learning rate is set at 0.15, and the decay threshold S̄ is set at 0.0001 during the learning process. After training, three hypercube cells are generated. The testing signal used in Example 1 is also used in this example. For a fair evaluation, the same settings, such as the number of input variables, training data, and testing data, are used in the STFCMAC model and the other models. Table 2 compares the results obtained by the STFCMAC model and the other models [24,28,29,32,33]. Figures 8 and 9 respectively illustrate the identification results and the errors between the real output and the output obtained using the STFCMAC model. The proposed STFCMAC model outperforms the other network models.
Figure 8. The results of dynamic system identification with a longer input delay obtained using the STFCMAC model. Figure 9. The errors between the real output and the output obtained using the STFCMAC model for the system identification with longer input delays.

Example 3: Prediction of Chaotic Time Series
The well-known Mackey-Glass chaotic time series prediction problem is used in this example. This chaotic time series is generated using the following delay differential equation:
du(t)/dt = 0.2u(t − τ) / (1 + u^10(t − τ)) − 0.1u(t), (25)
where the initial values are set as u(0) = 1.2 and τ = 17. In this study, four past values are used as inputs to predict u(t); therefore, the input-output data format is [u(t − 24), u(t − 18), u(t − 12), u(t − 6), u(t)]. Based on Equation (25), a total of 1000 data points are generated from t = 124 to t = 1123. The first 500 data points are used for training, and the remaining 500 are used for testing to validate the proposed model. The number of training epochs is set to 500. The initial parameters are set as follows: the learning rate is 0.15 and the prespecified threshold is 0.0001. After training, three fuzzy hypercube cells are generated. Table 3 compares the merits of the various methods, including the numbers of rules, total numbers of parameters, and training and testing RMSEs. The performance of the STFCMAC model is compared with that of the D-FNN [34], G-FNN [27], TRFN-S [29], RSEFNN-LF [32], and PFCMAC [18] and that of neural learning models, namely the SEELA [35], SuPFuNIS [36], and FWNN [37], as shown in Table 3. Figures 10 and 11 respectively illustrate the prediction results and errors between the actual output and the output obtained using the STFCMAC model. The proposed STFCMAC model outperforms all of its competitors.
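A sketch of generating the Mackey-Glass series and the [u(t − 24), u(t − 18), u(t − 12), u(t − 6)] → u(t) training pairs is given below. The coarse Euler step of 1 is an assumption made for illustration; finer integration schemes are common for this benchmark.

```python
import numpy as np

def mackey_glass(n=1200, tau=17, dt=1.0, u0=1.2):
    """Generate the Mackey-Glass series by Euler integration of
    du/dt = 0.2*u(t - tau)/(1 + u(t - tau)**10) - 0.1*u(t)."""
    u = np.zeros(n)
    u[0] = u0
    for t in range(n - 1):
        u_tau = u[t - tau] if t >= tau else 0.0  # history assumed zero before t = 0
        u[t + 1] = u[t] + dt * (0.2 * u_tau / (1.0 + u_tau ** 10) - 0.1 * u[t])
    return u

series = mackey_glass()
# input-output pairs: [u(t-24), u(t-18), u(t-12), u(t-6)] -> u(t), t = 124..1123
X = np.stack([series[t - np.array([24, 18, 12, 6])] for t in range(124, 1124)])
y = series[124:1124]
```

Splitting X and y in half then yields the 500 training and 500 testing points used in this example.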

Conclusions
This study proposes an STFCMAC model with simultaneous structure and parameter learning to solve identification and prediction problems. In the structure learning scheme, no initial structure exists in advance; that is, the proposed scheme can automatically determine the required structure of the network. Therefore, the proposed STFCMAC model has three advantages: (1) The proposed model requires less memory and fewer hypercubes/fuzzy rules.
(2) The proposed model has a lower RMSE value.
(3) The proposed model determines the number of hypercubes/fuzzy rules using the prespecified threshold value.
Inevitably, the proposed model has limitations. For example, determining the predetermined threshold value depends on user experience or trial and error. Therefore, adaptive threshold selection in the STFCMAC model will be considered in future research. In addition, to achieve high-speed operation in real-time applications, the STFCMAC model will also be implemented on a field-programmable gate array in future research.