A Novel Radial Basis Function Neural Network with High Generalization Performance for Nonlinear Process Modelling

Abstract: A radial basis function neural network (RBFNN), with its strong function approximation ability, has proven to be an effective tool for nonlinear process modeling. However, in many instances the sample set is limited and the model evaluation error is fixed, which makes it very difficult to construct an optimal network structure that ensures the generalization ability of the established nonlinear process model. To solve this problem, a novel RBFNN with high generalization performance (RBFNN-GP) is proposed in this paper. The proposed RBFNN-GP makes three contributions. First, a local generalization error bound, introducing the sample mean and variance, is developed to acquire a small error bound and reduce the range of the error. Second, a self-organizing structure method, based on the generalization error bound and network sensitivity, is established to obtain a suitable number of neurons and improve the generalization ability. Third, the convergence of the proposed RBFNN-GP is proved theoretically for both fixed and adjusting structures. Finally, the performance of the proposed RBFNN-GP is compared with several popular algorithms on two numerical simulations and a practical application. The comparison results verify the effectiveness of RBFNN-GP.


Introduction
In recent years, with the continuous development of artificial intelligence and intelligent algorithms, data-driven methods have been widely used as an effective modeling approach because they require neither complex mathematical models nor high maintenance costs. Among them, the radial basis function neural network (RBFNN) is widely used due to its simple structure and strong nonlinear function approximation ability, especially in the fields of pattern classification, industrial control, nonlinear system modeling and so on [1][2][3][4]. However, some problems remain to be solved in practice, for example, how to extend network performance from a limited training set to unseen data, that is, how to design an RBFNN with good generalization ability [5,6]. The generalization performance of an RBFNN is usually measured by the generalization error, which mainly comprises the approximation error, caused by the insufficient representation ability of the network, and the estimation error, caused by the limited number of samples. For the RBFNN to be learnable, the generalization error should tend to zero as the amount of data tends to infinity.
Due to the limitation of sample data, the network model produces an estimation error. For the total generalization error to approach zero, the numbers of parameters and samples should tend to infinity to ensure the learnability of the model. It is worth mentioning that references [5,7–10] deal with the estimation error under different assumptions. On this basis, the sample complexity of finite networks has been studied to demonstrate that, as the data tend to infinity, the estimation error tends to zero. In addition, because the number of samples is limited, even the optimal parameter setting can produce a function far from the target, resulting in errors and poor generalization performance [9]. To address this problem, Barron et al. [10] introduced approximation and estimation bounds for artificial neural networks, pointing out that, for a common class of artificial neural networks, the integrated squared error between the estimated network model and the objective function is bounded, and discussing the combined influence of the approximation and estimation errors on network accuracy. In addition, Yeung et al. [11] developed a localized generalization error model of the RBFNN to identify classifiers. By predefining a neighborhood of the training samples in the local generalization error model, an upper bound on the generalization error for unseen samples was derived. Sarraf [12] proposed a tight upper bound on the generalization error under the assumption of twice continuous differentiability, composed of the estimation error at the sample-space mean and the expected sensitivity of the error to input changes, and showed how the given upper bound can be used to analyze the generalization error of a feedforward neural network.
Although the above methods achieved good results through sensitivity-based generalization error bounds, they still face challenges due to the computational complexity of the partial derivatives; moreover, the generalization error should not be a function of the number of parameters alone, and finding a better structure is also important. In addition, Wu et al. proposed a self-adaptive structural optimal algorithm-based fuzzy neural network (SASOA-FNN) in [13]. This network improves the generalization ability by minimizing a structural risk model over the number of samples. Terada et al. [14] derived fast generalization error bounds for deep learning under the framework developed in [15]. In the derivation, they focused only on empirical risk minimization and eliminated the scale-invariance assumption on the activation function. The common feature of the above references is that they are based on risk minimization and accelerate the convergence speed of the network, while ignoring the influence of the properties of different networks on the generalization error. For example, the RBFNN is essentially a local learning method: each hidden neuron captures the local information of a specific region of the input space through the center and width of its Gaussian activation function [16], whereas training samples far from the center of a hidden neuron have no effect on its learning. Therefore, for such a local learning method, finding the optimal compromise between model accuracy and generalization error is an effective way to improve the generalization ability of the network.
Different from the estimation error caused by insufficient samples, the approximation error of the network is greatly affected by the network structure. Thus, how to obtain a suitable network structure has long been a hot topic. For instance, Zhou et al. [17] proposed a self-organizing fuzzy neural network with a hierarchical pruning scheme (SOFNN-HPS). In SOFNN-HPS, the adaptive ability and robustness of the prediction model were improved through the effective combination of a hierarchical pruning strategy and an adaptive allocation strategy, and accurate prediction of ammonia nitrogen, a key variable in the wastewater treatment process, was realized. To predict the outlet ferrous ion concentration online, Xie et al. [18] developed a self-adjusting-structure radial basis function neural network (SAS-RBFNN). This algorithm uses a supervised clustering algorithm to initialize the RBFNN, and merges or splits the hidden neurons according to the clustering distance to realize the structural self-organization of the RBFNN. In addition, Huang et al. [19] proposed a growing and pruning RBF (GAP-RBF) method based on the significance of a neuron. In GAP-RBF, the number of neurons is self-designed to realize a compact RBFNN by linking the significance of the neurons to a desired accuracy. On this basis, an improved GAP-RBF algorithm (IGAP-RBF) for an arbitrary distribution of input training data was proposed in reference [20]. This algorithm adjusts only the parameters of the nearest neuron to reduce the computational complexity while ensuring the learning performance. A common feature of GAP-RBF and IGAP-RBF is that the self-organizing strategy is based on the contribution of the hidden neurons, and the training samples need to be known in advance.
In reference [21], an adaptive gradient multi-objective particle swarm optimization algorithm was proposed to predict the biochemical oxygen demand, a key water quality parameter in the wastewater treatment process. This method adopts a multi-objective gradient method and an adaptive flight parameter mechanism, which not only greatly reduces the computational complexity, but also improves the generalization performance; other structure self-organization methods are outlined in [22][23][24][25]. The advantage of the above algorithms is that they adjust the network parameters while adjusting the network structure; the disadvantage is that different parameter adjustment methods lead to different learning speeds, which may affect the accuracy of the network.
Most of the existing methods focus on using self-organizing strategies to obtain an appropriate structure, or on using effective learning algorithms to obtain a higher accuracy. However, good training accuracy does not imply good generalization performance. Therefore, designing an effective learning model to improve the generalization performance of the RBFNN remains an urgent problem. Based on the above analysis, a self-organizing RBFNN based on network sensitivity is proposed to improve the generalization performance. The main contributions of this method are as follows.
1. The generalization ability is quantified by network sensitivity, and an RBFNN-GP algorithm is constructed to improve the generalization performance;
2. The convergence of the RBFNN-GP is verified in theory, which ensures its successful application;
3. The effectiveness and feasibility of the RBFNN-GP are verified by predicting key water quality parameters in the wastewater treatment process.
The remainder of this paper is organized as follows. Section 2 briefly introduces the basic RBFNN and the local generalization error bound of the network. Then, the details of RBFNN-GP are given in Section 3. The convergence of RBFNN-GP is discussed in Section 4. Section 5 presents the experimental results of RBFNN-GP to demonstrate its advantages. The application field and future work direction of the proposed method are shown in Section 6. Finally, the conclusions are given in Section 7.

Radial Basis Function Neural Network (RBFNN)
In general, an RBFNN consists of three layers: the input layer, the hidden layer and the output layer. A typical multiple-input, single-output RBFNN (MISO-RBFNN) is shown in Figure 1. The MISO-RBFNN is a k-m-1 network, and each neuron in the RBFNN hidden layer is constructed in the form of a Gaussian function. The mathematical description of the RBFNN output is as follows:

y(t) = Σ_{j=1}^{m} w_j(t) θ_j(t),

where m is the number of hidden neurons, w_j(t) is the weight between the jth hidden neuron and the output layer at time t, and θ_j(t) is the output of the jth hidden neuron, described as:

θ_j(t) = exp(−||x(t) − c_j(t)||² / (2σ_j²(t))),

where c_j(t) = [c_{j,1}(t), ..., c_{j,n}(t)] is the center vector of the jth hidden neuron at time t, and n is the dimension of the input vector; ||x(t) − c_j(t)|| is the Euclidean distance between x(t) and c_j(t), and σ_j(t) is the width of the jth hidden neuron at time t.
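The forward pass defined by the two equations above can be written compactly in NumPy. This is an illustrative sketch only; the function and variable names are ours, not the paper's.

```python
import numpy as np

def rbf_forward(x, centers, widths, weights):
    """Forward pass of a MISO Gaussian RBFNN.

    x       : (n,) input vector x(t)
    centers : (m, n) center vectors c_j(t), one row per hidden neuron
    widths  : (m,) Gaussian widths sigma_j(t)
    weights : (m,) hidden-to-output weights w_j(t)
    Returns the scalar output y(t) = sum_j w_j * theta_j.
    """
    # theta_j = exp(-||x - c_j||^2 / (2 * sigma_j^2))
    dists_sq = np.sum((centers - x) ** 2, axis=1)
    theta = np.exp(-dists_sq / (2.0 * widths ** 2))
    return float(weights @ theta)
```

When the input coincides with every center, each activation θ_j equals 1 and the output reduces to the sum of the weights, which gives a quick sanity check.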

Local Generalization Error Bound
The generalization error over the whole input space is defined as [11]:

E_gen(t) = ∫_I (f̂(x(t)) − F(x(t)))² p(x(t)) dx(t),

where x(t) is the input vector in the input space and p(x(t)) is the unknown probability density function of x(t). Given a training data set D = {(x_a(t), F_a(t))}, a = 1, 2, ..., N, where N is the number of input-output pairs, namely the number of training samples, the empirical error of the network can be defined as:

E_emp(t) = (1/N) Σ_{a=1}^{N} (f̂(x_a(t)) − F(x_a(t)))²,

where f̂(x_a(t)) and F(x_a(t)) represent the approximate and real mapping functions between the ath input and output in the input space, respectively. The ultimate goal of improving the generalization ability is to minimize the approximation error so that the network can directly predict unseen data. Since the RBFNN is a local method, for each sample x_a(t) ∈ D we can find a sample set:

X_{S,x_a(t)} = {x | x = x_a(t) + Δx, |Δx_i| ≤ S, i = 1, ..., n},

where Δx = [Δx_1, ..., Δx_n] is regarded as a perturbation, n denotes the number of input features, and S is a given number. The samples in X_{S,x_a(t)} (except x_a(t) itself) are regarded as unseen samples.
where I is the entire input space. By Hoeffding's inequality, we can derive the definition of the local generalization error bound as follows [11,26]:

E_{gen,S}(t) = (√(E_emp(t)) + √(E_{X_S}[Δy²(t)]) + A)² + ε,

with:

ε = B √(ln η / (−2N)),

where Δy(t) is the difference between the network output and the real value, A and B are constants determined by the range of the target output, η is the confidence parameter, and the term E_{X_S}[Δy²(t)] is introduced in Section 3.1.
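The empirical error and the S-neighborhood construction above can be sketched as follows. This is an illustrative NumPy sketch under the definitions in this section; the function names are ours.

```python
import numpy as np

def empirical_error(f_hat, X, F):
    """Mean squared empirical error over the N training pairs:
    E_emp = (1/N) * sum_a (f_hat(x_a) - F(x_a))^2."""
    preds = np.array([f_hat(x) for x in X])
    return float(np.mean((preds - F) ** 2))

def sample_S_neighborhood(x_a, S, num_samples, rng):
    """Draw 'unseen' samples from the S-neighborhood X_{S,x_a}:
    each feature is perturbed by Delta x_i in [-S, S]."""
    n = x_a.shape[0]
    delta = rng.uniform(-S, S, size=(num_samples, n))
    return x_a + delta
```

A perfect model yields zero empirical error, and every neighborhood sample stays within S of the anchor point in each coordinate, matching the definition of X_{S,x_a(t)}.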

Remark 1.
Different from previous methods, to the best of our knowledge, this work is a new attempt to obtain a looser generalization error bound by eliminating high-order terms, trading some accuracy for better generalization ability.

RBFNN with High Generalization Performance (RBFNN-GP)
The proposed RBFNN-GP, which can improve the network generalization ability, is introduced in this section. It contains the following two parts: (1) sensitivity measurement (SM) method and (2) structural self-organization optimization (SSO) strategy.

Sensitivity Measurement (SM) Method
Sensitivity analysis provides an effective method to evaluate the impact of different inputs on the output results, and it can accurately express the causal response between input changes and the corresponding outputs [27,28]. Different from existing studies that focus on improving modeling accuracy or looking for indicator variables, in this study SM is introduced to quantify the impact of network input changes on output changes and to represent intuitively how sensitive the network output is to input changes, so that the network structure can be adjusted accordingly. Suppose that the inputs are independent but not identically distributed; then each input feature has its own expectation μ_{x_i} = E[x_i] and variance δ²_{x_i} = E[(x_i − μ_{x_i})²]. In theory, as long as the input variation is finite, we do not strictly limit its data distribution. Here, we assume that the unseen samples in the S-neighborhood of a training sample obey a uniform distribution, so that each perturbation Δx_i(t) is drawn from U(−S, S) and thus δ²_{Δx_i(t)} = S²/3. By the law of large numbers, the sensitivity of the RBFNN, E_{X_S}[Δy²(t)], can then be expressed through these statistics, which yields the bound E_{gen,S}(t).

Remark 2. Limiting case of E_{gen,S}(t). Clearly, when S → ∞, the S-neighborhood covers the entire input space; therefore, in the case of S → ∞, E_gen(t) < E_{gen,S}(t).
Remark 3. Statistical performance. Compared with the regression error bound, which only uses the effective parameters and the number of training samples, the proposed RBFNN-GP algorithm has clear advantages, because it considers statistical characteristics, such as the mean and variance of the training data set.
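Since the sensitivity term E_{X_S}[Δy²(t)] is an expectation over uniform input perturbations, it can be approximated numerically. The sketch below is a Monte-Carlo stand-in of our own, not the paper's closed-form expression; for a linear map the estimate should approach the perturbation variance S²/3 noted above.

```python
import numpy as np

def sensitivity_mc(f_hat, X_train, S, num_perturb=200, seed=0):
    """Monte-Carlo estimate of the sensitivity term E_{X_S}[(Delta y)^2]:
    average squared output change under perturbations
    Delta x_i ~ Uniform(-S, S), whose variance is S^2 / 3."""
    rng = np.random.default_rng(seed)
    sq_changes = []
    for x in X_train:
        y0 = f_hat(x)
        delta = rng.uniform(-S, S, size=(num_perturb, x.shape[0]))
        for d in delta:
            sq_changes.append((f_hat(x + d) - y0) ** 2)
    return float(np.mean(sq_changes))
```

For f̂(x) = x₁, the squared output change equals the squared perturbation, so the estimate converges to S²/3, which is a convenient check that the uniform-perturbation assumption is wired in correctly.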

Structural Self-Organization Optimization (SSO) Strategy
In order to construct an RBFNN with high generalization performance, an SSO strategy is designed based on the sensitivity measurement. This SSO strategy adjusts the structure and the parameters (center, width and weight) of the RBFNN at the same time. In the self-organization strategy, c(t + 1) and σ(t + 1) are the center and width vectors of the hidden neurons at time t + 1, w(t + 1) is the output-layer weight at time t + 1, m(t) is the number of hidden neurons at time t with m(t) ≥ 1, and η is the learning rate. The threshold λ_1 ≤ −(2/3)S²nξ_m, where ξ_m is the ratio of the statistical output to the width of the mth neuron (λ_1 is a dynamic threshold and the statistical residual is negative), and the threshold λ_2 ≥ 0 (here set to twice the input dimension) are used to ensure the convergence of the network. It should be noted that only when λ_1 and λ_2 attain equality at the same time does the number of neurons remain unchanged, that is, the structure of the neural network holds. The variables Δc(t), Δσ(t) and Δw(t) denote the changes of the centers, widths and weights at time t, respectively, where Δc_m(t) = [Δc_{m,1}(t), ..., Δc_{m,n}(t)] is the change of the center of the mth neuron at time t, and Δσ_m(t) and Δw_m(t) are the changes of the width and weight of the mth neuron at time t, respectively. The details of the parameter updates are given in Equations (14)-(16).
Based on the above analysis, the detailed steps of neuron growth and pruning are given below.

Growth Stage
If σ_j < λ_1, new neurons are added to the neural network to reduce the approximation error and improve the generalization performance. The number of neurons then becomes m(t) + 1, and the parameters of the new neuron are set as follows, where c_new, σ_new and w_new represent the center, width and weight of the new neuron, x_i(t) represents the ith element of the input vector at time t, and c_{i,new}(t) and σ_{i,new}(t) are the ith elements of the center and width of the new neuron at time t, respectively. After the structural adjustment, the parameters are updated, where c(t + 1), σ(t + 1) and w(t + 1) are the center, width and weight of the RBFNN at time t + 1, respectively.

Prune Stage
If σ_j > λ_2, the jth neuron, the one carrying the least information in the hidden layer, is deleted. In this way, the network complexity is reduced while the generalization ability is preserved. The number of neurons then becomes m(t) − 1, where c_j(t + 1), σ_j(t + 1) and w_j(t + 1) represent the center, width and weight of the jth neuron at time t + 1, respectively. After the structural adjustment, the parameters of the ith neuron, the neuron with the smallest Euclidean distance to the jth neuron, are updated accordingly, where c_i(t + 1), σ_i(t + 1) and w_i(t + 1) are the center, width and weight of the ith neuron at time t + 1.
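The growth and pruning steps above can be sketched as a single structural update. This is an illustrative sketch only: the paper's trigger quantity and thresholds come from its equations, so a generic per-neuron score array stands in for them here, and the new-neuron width heuristic is our assumption.

```python
import numpy as np

def self_organize(centers, widths, weights, scores, x_t, lam1, lam2):
    """One structural step of a grow/prune strategy (illustrative):
    - grow: if some neuron's score falls below lam1, add a neuron
      centred on the current input x_t;
    - prune: if some neuron's score exceeds lam2, delete it and fold
      its weight into the nearest remaining neuron.
    'scores' stands in for the paper's per-neuron criterion."""
    if np.min(scores) < lam1:  # growth condition
        centers = np.vstack([centers, x_t])
        widths = np.append(widths, np.mean(widths))  # heuristic width
        weights = np.append(weights, 0.0)            # start neutral
    elif np.max(scores) > lam2:  # pruning condition
        j = int(np.argmax(scores))
        # nearest neighbour of neuron j (excluding itself)
        d = np.linalg.norm(centers - centers[j], axis=1)
        d[j] = np.inf
        i = int(np.argmin(d))
        weights[i] += weights[j]  # preserve the local output contribution
        centers = np.delete(centers, j, axis=0)
        widths = np.delete(widths, j)
        weights = np.delete(weights, j)
    return centers, widths, weights
```

Folding the pruned weight into the nearest neuron mirrors the paper's idea that deletion should not disturb the network output near the removed center.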

Remark 4.
Because there is only one hidden layer in the model, the network structure is simple, which greatly reduces the error accumulation in the process of back propagation. Furthermore, no extra parameters are added during network training, which reduces the amount of computation. Therefore, the stability of the network is well guaranteed.

Convergence Analysis
Another important problem for a neural network structure is convergence analysis, which not only affects the application in practical engineering, but also reflects the generalization ability of the neural network. If a neural network cannot guarantee convergence or meet the convergence conditions, it is difficult to apply it successfully. In addition, for the RBFNN-GP, convergence is related not only to the parameter optimization algorithm, but also to the structural changes. Therefore, this paper analyzes the convergence from three aspects: the stable stage, the growth stage and the pruning stage.

Hypothesis 1 (H1).
The center c of the hidden layer, the width σ of the hidden layer and the input-output weight w are bounded, that is, ||c|| ≤ μ_c, ||σ|| ≤ μ_σ, ||w|| ≤ μ_w, where μ_c, μ_σ, μ_w are positive real numbers.

Hypothesis 2 (H2).
There is a set of "optimal" network parameters, c * , σ * and w * , that is, the optimal center, width and weight.

Convergence Analysis of RBFNN with Fixed Structure
For the convenience of discussing its convergence, the differential equation of the error e is expressed as follows [29].
In order to transform the nonlinear output of the network into a partially linear form, the Taylor expansion of Δθ is used, where Ω is the higher-order remainder of the Taylor expansion.

Theorem 1.
Suppose the number of hidden-layer neurons of the RBFNN is fixed and the network parameters are updated according to Equations (14)-(16); then, when t → ∞, e(t) → 0, and the convergence of the network is guaranteed.
Proof of Theorem 1. The Lyapunov function is defined as: with: where v is the compensator and v* is the optimal compensator. The time derivative of the Lyapunov function is:

V̇(e, c, σ, w) = eė + Δc^T Δċ + Δσ^T Δσ̇ + Δw^T Δẇ + Δv^T Δv̇.
Thus, V̇ is negative semidefinite in the above space. In light of the Lyapunov theorem, we obtain: lim_{t→∞} e(t) = 0.
So far, the convergence of a fixed-structure RBFNN is proved.

Convergence Analysis of RBFNN with Changeable Structure
Theorem 2. If the network structure is self-organizing in the learning process, then the parameters are adjusted according to Equations (14)-(21). When t → ∞, e(t) → 0, and the convergence of the sensitivity-based RBFNN can be guaranteed.
Proof of Theorem 2. The structure self-organization process of the RBFNN is divided into two parts: the structure growth and structure pruning stages.

Growth Stage
At time t, there are m neurons in the hidden layer of the sensitivity-based self-organizing RBFNN, and the network error is e_m(t). When the growth condition is satisfied, the number of hidden-layer neurons is increased by 1. The number of hidden-layer neurons is then m + 1, and the output error of the RBFNN becomes e_{m+1}(t), where ŷ_m(t) and ŷ_{m+1}(t) represent the network output before and after the addition of the hidden neuron, respectively. According to Equations (17)-(19), we can obtain: e_{m+1}(t) = 0.
It can be seen that, when a new neuron is added to the hidden layer, the parameter setting of the newly added neuron accelerates the convergence of the network.

Prune Stage
When the pruning condition is satisfied, the jth neuron in the hidden layer is deleted, and the error of the network is e_{m−1}(t). Thus, the error of the network can be rewritten accordingly, where ŷ_m(t) represents the network output when the number of hidden-layer neurons is m. In summary, the error of the neural network remains unchanged before and after the jth neuron is deleted; that is, the process of deleting neurons does not destroy the convergence of the original neural network.

Experimental Studies
This section describes the experiments conducted to assess the generalization performance of the RBFNN-GP algorithm. The experiments include two benchmark problems, namely the approximation of the Mexican straw hat function and the prediction of the Mackey-Glass chaotic time series, and one practical application, the prediction of membrane permeability, a key water quality parameter in the wastewater treatment process. In addition, the good generalization performance of the proposed RBFNN-GP is further illustrated by comparisons with six existing algorithms.

Benchmark Example A
In this case, the RBFNN-GP algorithm is applied to approximate the Mexican straw hat function, a benchmark problem used in [30,31] to evaluate many prevalent algorithms. In the training phase, the training samples are {x_1, x_2; y}^N, where x_1 and x_2 are generated randomly within the range (−2π, 2π) and N = 2100 is the number of training samples; the testing samples are {x_1, x_2; y}^M, where M = 700 is the number of testing samples, and the learning rate is set to η = 0.003.
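Several variants of the "Mexican straw hat" surface appear in the literature; the sketch below assumes the common radial sinc form y = sin(r)/r with r = sqrt(x₁² + x₂²), and generates a data set of the stated size. The function form is our assumption; the paper's exact expression may differ.

```python
import numpy as np

def mexican_hat(x1, x2):
    """A common form of the 'Mexican straw hat' benchmark surface
    (assumed here): the radial sinc y = sin(r) / r."""
    r = np.sqrt(x1 ** 2 + x2 ** 2)
    # at r = 0 the limit of sin(r)/r is 1
    return np.where(r == 0, 1.0, np.sin(r) / np.maximum(r, 1e-12))

def make_dataset(n, seed=0):
    """Draw n random points uniformly from (-2*pi, 2*pi)^2, as in the paper."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(-2 * np.pi, 2 * np.pi, size=(n, 2))
    y = mexican_hat(x[:, 0], x[:, 1])
    return x, y
```

With n = 2100 this reproduces the stated training-set size; a second call with n = 700 and a different seed would play the role of the test set.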
The experimental results are shown in Figure 2, where Figure 2a shows the number of hidden neurons during training. It can be seen that the proposed RBFNN-GP can adjust the network structure by pruning or adding neurons in the learning process. Figure 2b shows the trajectory of the root-mean-square error (RMSE) during learning. Meanwhile, the prediction error results and the output error surface are depicted in Figure 2c,d. It is clear that the proposed RBFNN-GP can approximate the Mexican straw hat function with small prediction errors. To further demonstrate the generalization ability of the proposed method, the prediction results of the RBFNN-GP are compared with those of other dynamic neural networks based on structural adjustment, namely SASOA-FNN [12], SOFNN-HPS [17], SAS-RBFNN [18], AG-MOPSO [21], ASOL-SORBFNN [23] and the RBFNN with a fixed structure (fixed-RBFNN). To make the comparison meaningful, all algorithms in this experiment use the same data set, including training samples and test samples, and start from the same initial number of neurons. In addition, each algorithm is run 10 times and the average is taken to make the results more convincing. The results are shown in Table 1, where Max. is the maximum and Dev. is the deviation.
As can be seen from Table 1, the proposed RBFNN-GP requires fewer hidden nodes, produces smaller output errors and has a better generalization ability than the self-organizing networks based on information minimization and structural risk minimization. This is mainly because the method considers not only the number of effective parameters in the network, but also the mean and variance of the input data.

Benchmark Example B
In this example, the RBFNN-GP is applied to the Mackey-Glass chaotic time series prediction problem, a famous benchmark [32,33]. The discrete expression of the time series is given by:

x(t + 1) = (1 − a_1) x(t) + a_2 x(t − τ) / (1 + x^{10}(t − τ)),

where a_1, a_2 and τ are constants, x(0) represents the initial value, and a_1 = 0.1, a_2 = 0.2, τ = 17, x(0) = 1.2. In the training phase, 1000 data samples are extracted from t = 21 to 1021; 500 samples are used as training samples and 500 as test samples. The preset training error is 0.001, the initial learning rate is η = 0.02, and the initial number of neurons is 5. It should be noted that the other parameters in all comparison methods are the same, and the running results of the experiment are shown in Figure 3. From Figure 3a, it can be seen that there are several growing and pruning stages in the learning process of the RBFNN-GP. In the early period of training, the number of neurons changes frequently and the network structure is unstable. Figure 3b shows the comparison between the predicted output of the RBFNN-GP and the real values. Figure 3c,d show the prediction error of the network and the RMSE in the training stage, respectively. It is worth mentioning that the smoothness of the RMSE curve is closely related to the sensitivity of the network: since the mean and variance of the training samples are fully considered, the stability of the network is improved.
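A minimal generator for the series, under the discrete recursion stated above, can be written as follows. The function name and the pre-history convention x(t) = x(0) for t ≤ 0 are our assumptions.

```python
import numpy as np

def mackey_glass(n_steps, a1=0.1, a2=0.2, tau=17, x0=1.2):
    """Generate the discrete Mackey-Glass series
    x(t+1) = (1 - a1) * x(t) + a2 * x(t - tau) / (1 + x(t - tau)^10),
    taking x(t) = x0 for t <= 0."""
    x = np.full(n_steps + 1, x0)
    for t in range(n_steps):
        x_delay = x[t - tau] if t >= tau else x0
        x[t + 1] = (1 - a1) * x[t] + a2 * x_delay / (1 + x_delay ** 10)
    return x
```

Running it for 1021 steps and slicing from t = 21 onward reproduces the 1000-sample window described in the experiment.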
Similarly, Table 2 shows the comparative experimental results against the other six methods: SASOA-FNN [12], SOFNN-HPS [17], SAS-RBFNN [18], AGMOPSO [21], ASOL-SORBFNN [23] and fixed-RBFNN. As shown in Table 2, the optimal number of hidden nodes obtained by the RBFNN-GP is only six, and the mean and deviation of its test error are the smallest among the compared methods. Since the parameters are updated during the structural adjustment, repeated calculations are avoided and the computational complexity is greatly reduced compared with dynamic structural adjustment methods based on information strength. Nevertheless, compared with other self-organizing networks based on structural risk, the proposed RBFNN-GP takes slightly longer to compute the mean and variance information of the input samples. However, we still have reason to believe that RBFNN-GP achieves a good compromise between generalization ability and training accuracy.

Permeability Prediction of Membrane Bio-Reactor
The membrane bio-reactor (MBR) is a new wastewater treatment technology combining membrane separation technology and biotechnology, and it is widely used in the wastewater treatment process (WWTP). However, during MBR wastewater treatment, membrane fouling shortens the service life of the membrane and causes unnecessary losses in process energy consumption. It is therefore of great practical significance to correctly predict the permeability of the membrane and increase its working efficiency [34][35][36]. Accordingly, the proposed RBFNN-GP is applied to predict the permeability of the MBR in the WWTP. The real data for the experiment come from a wastewater treatment plant in Beijing, China. After removing abnormal data, 500 samples were obtained and normalized.
In this experiment, 50 training samples and 50 test samples are selected to test the performance of the RBFNN-GP. In order to remove the correlation between variables, the partial least squares method was used to select nine of the twenty-two variables as inputs to the RBFNN-GP. Due to the wide range of membrane bio-reactor permeability, the number of iterations is T = 200, the learning rate is η = 0.003, and the time length is h = 10. The prediction results are shown in Figure 4: Figure 4a shows the comparison between the actual and predicted values of membrane bio-reactor permeability, and the prediction error is shown in Figure 4b. It can be seen from Figure 4 that the proposed RBFNN-GP has a good prediction performance for the permeability of the MBR, and the prediction error is within the range [−4, 3]. Table 3 shows the comparison results of SASOA-FNN [12], SOFNN-HPS [17], SAS-RBFNN [18], AGMOPSO [21], ASOL-SORBFNN [23] and fixed-RBFNN in predicting the permeability of the MBR. It can be seen from Table 3 that, under the same number of iterations, the learning times of SASOA-FNN [12] and RBFNN-GP are almost the same, yet the accuracy of the latter is better. Thus, we have sufficient reason to believe that the RBFNN-GP method greatly improves on the above methods in training accuracy, calculation speed and generalization ability.
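The variable-selection step can be illustrated with a one-component PLS-style ranking. This is a simplified stand-in of our own for the partial least squares procedure used in the paper, which selects nine of the twenty-two variables; the function name and the single-component simplification are our assumptions.

```python
import numpy as np

def pls_rank_features(X, y, n_keep):
    """Rank input variables by the magnitude of the first PLS weight
    vector w = X_c^T y_c / ||X_c^T y_c|| (NIPALS, one component) and
    keep the n_keep highest-scoring variables."""
    Xc = X - X.mean(axis=0)           # column-centre the inputs
    yc = y - y.mean()                 # centre the target
    w = Xc.T @ yc
    w = w / np.linalg.norm(w)
    order = np.argsort(-np.abs(w))    # most relevant variable first
    return order[:n_keep]
```

In the paper's setting this call would take the 22 normalized process variables and the permeability target and return 9 indices; a full PLS with several components would refine the ranking further.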

Computational Complexity
Computational complexity is an important indicator for evaluating a model. For the proposed RBFNN-GP, the amount of calculation is closely related to the number of hidden neurons m(t) and the number of training samples N. Suppose [x(t), y(t); t = 1, ..., N] is a set of training samples. When the structure of the RBFNN-GP is k-m-l (k is the number of input variables, m the number of hidden-layer neurons, and l the number of output variables), the computational complexity of the RBFNN-GP is O[m(t)]. It can be seen that the computational burden of the RBFNN-GP is not heavy.

Future Trends
In this paper, aiming at the online prediction of membrane fouling, a key water quality parameter in the wastewater treatment process, a prediction model based on a self-organizing RBFNN is established that meets the need for accurate prediction of key water quality parameters, and the corresponding self-organizing network modeling method is developed. Owing to its good generalization performance and theoretical support, the proposed method can also be extended to other types of networks, such as multi-input, single-output fuzzy neural networks and multi-input, multi-output RBF neural networks.

Conclusions
In this paper, an RBFNN-GP algorithm is proposed to improve the generalization ability of the model. Firstly, the upper bound of the local generalization error is found, and the network structure and parameters are adjusted within the allowable error range. Then, a generalization error formula based on network sensitivity and approximation error is introduced to improve the generalization performance while ensuring accuracy. Finally, experimental validation is carried out on two different benchmark data sets and a real application in the wastewater treatment process. The results show that the RBFNN-GP algorithm can learn robust networks with good generalization performance and a compact scale. In summary, the RBFNN-GP has the following advantages:
1. With the help of sensitivity measurements and the local generalization error bound, the network has statistical performance and can reasonably achieve structural self-organization without a high dependence on the number of samples.
2. The convergence of the RBFNN-GP for fixed and variable structures is guaranteed by the thresholds λ_1 and λ_2. Therefore, the proposed RBFNN-GP not only reduces the number of additional parameters, but also decreases the computational burden.
3. Compared with existing algorithms, the proposed RBFNN-GP shows good generalization ability in the prediction of key water quality parameters in wastewater treatment processes. Furthermore, the approach can be extended to other types of networks and industrial domains.