# A Real-Time BOD Estimation Method in Wastewater Treatment Process Based on an Optimized Extreme Learning Machine

## Abstract

## 1. Introduction

## 2. Materials and Methods

#### 2.1. Extreme Learning Machine

where ${w}_{i}$ is the weight vector connecting the i^{th} hidden node and the input nodes; ${\beta}_{i}={\left[{\beta}_{i1},{\beta}_{i2},\dots ,{\beta}_{in}\right]}^{\mathrm{T}}$ is the weight vector connecting the i^{th} hidden node and the output nodes; and b_{i} is the threshold of the i^{th} hidden node. ${x}_{j}\cdot {w}_{i}$ is the inner product of ${x}_{j}$ and ${w}_{i}$. The network topology with linear output nodes is shown in Figure 1.

${\beta}_{i}$, ${w}_{i}$ and b_{i}, such that

**H** is called the hidden layer output matrix of the neural network; the i^{th} column of **H** is the i^{th} hidden node output with respect to inputs x_{1}, x_{2}, …, x_{N}; the i^{th} row of matrix **H** represents the hidden layer feature mapping with respect to input x_{i}, that is ${x}_{i}:h({x}_{i})$.

**R** which is infinitely differentiable in any interval, and N arbitrary distinct samples $({x}_{i},{t}_{i})\in {R}^{n}\times {R}^{m}$, there exists $L\le N$ such that, for any network parameters ${\{({w}_{i},{b}_{i})\}}_{i=1}^{L}$ generated according to any continuous probability distribution, with probability one, $\Vert {H}_{N\times L}{\beta}_{L\times m}-{T}_{N\times m}\Vert <\epsilon $.

**H** is a non-square matrix, there exist ${\widehat{w}}_{i},{\widehat{b}}_{i},\widehat{\beta}$ such that Equation (7) holds.

The input weights **w**_{i} and the hidden layer biases b_{i} do not in fact need to be tuned, and the hidden layer output matrix **H** can remain unchanged once random values have been assigned to these parameters at the beginning of learning, so Equation (7) can be considered a linear system. Training an SLFN is then simply equivalent to finding the least squares solution $\widehat{\beta}$ of the linear system **H**β = **T**, that is, $\widehat{\beta}={H}^{+}T$, where **H**^{+} is the Moore–Penrose generalized inverse of the matrix **H** [20,21].

**Step 1:** Assign the input weights **w**_{i} and the hidden layer biases b_{i} randomly (where i = 1, …, L);

**Step 2:** Calculate the hidden layer output matrix **H**;

**Step 3:** Calculate the output weight $\widehat{\beta}$.
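The three steps above can be sketched in a few lines of NumPy. This is a minimal illustration rather than the authors' implementation: the sigmoid activation, the uniform range [-1, 1] for the random parameters, the toy regression data and the function names `elm_train` and `elm_predict` are all assumptions made for the example.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def elm_train(X, T, n_hidden, rng):
    """Train a single-hidden-layer network with the three ELM steps."""
    n_features = X.shape[1]
    # Step 1: random input weights w_i and biases b_i (not tuned afterwards).
    # The uniform range [-1, 1] is an assumption of this sketch.
    W = rng.uniform(-1.0, 1.0, size=(n_features, n_hidden))
    b = rng.uniform(-1.0, 1.0, size=n_hidden)
    # Step 2: hidden layer output matrix H, one row per sample (N x L).
    H = sigmoid(X @ W + b)
    # Step 3: output weights from the Moore-Penrose generalized inverse of H.
    beta = np.linalg.pinv(H) @ T
    return W, b, beta

def elm_predict(X, W, b, beta):
    return sigmoid(X @ W + b) @ beta

# Toy data (hypothetical): 200 samples, 3 inputs, 1 output.
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(200, 3))
T = np.sin(X.sum(axis=1, keepdims=True))
W, b, beta = elm_train(X, T, n_hidden=20, rng=rng)
train_mse = float(np.mean((elm_predict(X, W, b, beta) - T) ** 2))
print(f"training MSE: {train_mse:.6f}")
```

Because the hidden layer is fixed after Step 1, the only fitted quantity is `beta`, which is why training reduces to a single pseudoinverse computation.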

#### 2.2. Improved Cuckoo Search Algorithm-Based ELM (ICS-ELM)

#### 2.2.1. Improved Cuckoo Search (ICS) Algorithm

p_{a}, which represents the probability of being discovered by the host, in CS (apart from the population size n). Therefore, it is very easy to implement.

p_{a} and an adaptive dynamic adjustment strategy for the search step size S, which are described as follows.

p_{a,max} and p_{a,min} are the dynamic control parameters of p_{a}, which are equal to 0.75 and 0.1, respectively. In Equation (11), $m\in (0,1)$ is a regulatory factor; bestX_{i-1} is the optimal nest position of the previous generation; is the limiting factor; t and t_{max} are the current iteration number and the maximum iteration number; S_{min} is the minimum search step; p is an integer from 1 to 30.

x_{i} (i = 1, 2, …, n); set the population size, the dimension of the independent variables, the maximum iteration number, and the maximum and minimum probabilities of being detected.

x_{i} into the objective function.

p_{a}, randomly change the value of ${x}_{i}^{t+1}$; otherwise, leave it unchanged. Finally, keep the optimal nest position.
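For orientation, the overall loop (initialize the nests, evaluate the objective function, move by Lévy flights, then change discovered positions at random and keep the best nest) can be sketched as follows. This is a sketch of the basic cuckoo search with a fixed discovery probability `pa`, not the improved variant: the adaptive p_{a} and step size S are not reproduced here. The step scale 0.01, the Lévy exponent 1.5, the search bounds and the names `levy_steps`, `keep_better` and `cuckoo_search` are assumptions of the example.

```python
import math
import numpy as np

def levy_steps(rng, shape, lam=1.5):
    """Draw Levy-distributed steps with Mantegna's algorithm."""
    sigma = (math.gamma(1 + lam) * math.sin(math.pi * lam / 2)
             / (math.gamma((1 + lam) / 2) * lam * 2 ** ((lam - 1) / 2))) ** (1 / lam)
    u = rng.normal(0.0, sigma, size=shape)
    v = rng.normal(0.0, 1.0, size=shape)
    return u / np.abs(v) ** (1 / lam)

def keep_better(objective, nests, fitness, cand):
    """Replace a nest (in place) only when its candidate scores lower."""
    cand_fit = np.array([objective(x) for x in cand])
    better = cand_fit < fitness
    nests[better] = cand[better]
    fitness[better] = cand_fit[better]

def cuckoo_search(objective, dim, n_nests=15, pa=0.25, t_max=300,
                  lower=-5.0, upper=5.0, seed=0):
    rng = np.random.default_rng(seed)
    # Initialise the nest positions x_i and evaluate the objective function.
    nests = rng.uniform(lower, upper, size=(n_nests, dim))
    fitness = np.array([objective(x) for x in nests])
    history = [float(fitness.min())]
    for _ in range(t_max):
        best = nests[fitness.argmin()].copy()
        # Levy-flight move relative to the current best nest.
        step = 0.01 * levy_steps(rng, nests.shape) * (nests - best)
        cand = np.clip(nests + step * rng.normal(size=nests.shape), lower, upper)
        keep_better(objective, nests, fitness, cand)
        # Discovery: each component is changed at random with probability pa,
        # otherwise it is left as it is.
        found = rng.random(nests.shape) < pa
        walk = rng.random() * (nests[rng.permutation(n_nests)]
                               - nests[rng.permutation(n_nests)])
        cand = np.clip(nests + walk * found, lower, upper)
        keep_better(objective, nests, fitness, cand)
        history.append(float(fitness.min()))
    # Keep the optimal nest position at the end.
    return nests[fitness.argmin()].copy(), float(fitness.min()), history

def sphere(x):
    # Hypothetical test objective with its minimum at the origin.
    return float(np.sum(x ** 2))

best_x, best_f, history = cuckoo_search(sphere, dim=2)
print(best_x, best_f)
```

Since a nest is only ever replaced by a better candidate, the best objective value recorded in `history` never increases from one iteration to the next.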

#### 2.2.2. ICS-Based ELM

**w**_{i} and the hidden layer biases b_{i} should be optimized through the training process by the ICS algorithm.

_{train} is the number of training samples; m is the number of hidden nodes; and f is the fitness function.
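One way to connect a search algorithm to the ELM is to let each nest position encode all input weights and hidden biases, and to score it by the training error of the resulting network. The sketch below assumes a root-mean-square training error as the fitness, a sigmoid activation, and a flat encoding (weights first, then biases); the fitness definition used in the paper is not reproduced here, and the function name `elm_fitness` and the toy data are hypothetical.

```python
import numpy as np

def elm_fitness(nest, X_train, T_train, n_hidden):
    """Training RMSE of the ELM encoded by one nest position (assumed fitness)."""
    n_features = X_train.shape[1]
    # Decode the flat nest vector into input weights W and hidden biases b.
    W = nest[: n_features * n_hidden].reshape(n_features, n_hidden)
    b = nest[n_features * n_hidden:]
    # Hidden layer output matrix for the candidate parameters.
    H = 1.0 / (1.0 + np.exp(-(X_train @ W + b)))
    # Output weights are still obtained analytically, as in the basic ELM.
    beta = np.linalg.pinv(H) @ T_train
    residual = H @ beta - T_train
    return float(np.sqrt(np.mean(residual ** 2)))

# Toy data (hypothetical) shaped like a 7-input, 40-hidden-node, 1-output network.
rng = np.random.default_rng(1)
X_train = rng.uniform(-1.0, 1.0, size=(100, 7))
T_train = np.tanh(X_train @ rng.normal(size=(7, 1)))
n_hidden = 40
nest = rng.uniform(-1.0, 1.0, size=7 * n_hidden + n_hidden)
score = elm_fitness(nest, X_train, T_train, n_hidden)
print(f"fitness of a random nest: {score:.6f}")
```

With this encoding the optimizer searches a space of dimension (inputs + 1) × hidden nodes, which is 320 for a 7-40-1 network, while the output weights never enter the search.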

#### 2.3. Experimental Data Processing

#### 2.3.1. Acquisition of Experimental Data from Benchmark Simulation Model No. 1 (BSM1)

#### 2.3.2. Fuzzy Rough Monotone Dependence Algorithm for Data Processing

the m^{th} column is the decision attribute (that is, the effluent water quality data), and the 1^{st} to (m − 1)^{th} columns are the conditional attributes (that is, the wastewater influent data);

## 3. Results and Discussion

#### 3.1. Data Attribute Reduction

#### 3.2. Comparison and Discussion of Simulation Results

S_{s}, X_{i}, X_{s}, X_{bh}, S_{nh}, and S_{nd} of the influent and the flow rate Q are used as the auxiliary variables for the network input, so the ICS-based ELM network structure has 7 input nodes, 40 hidden nodes and one output node (7-40-1). The 1344 groups of data are randomly divided into a training dataset (500 groups), a validation dataset (460 groups), and a testing dataset (384 groups) for the simulation. The training accuracy and prediction accuracy of the soft sensor model are measured by the mean square error (MSE).

^{*}(n) is the actual measured value; N is the sample size.
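The data split and the error measure can be illustrated as follows. The sketch assumes the standard definition of the mean square error (the average of the squared differences between predicted and measured values over N samples) and a simple random permutation for the split; the function names `split_indices` and `mse` are hypothetical.

```python
import numpy as np

def split_indices(n_total, n_train, n_valid, rng):
    """Randomly partition sample indices into training, validation and test sets."""
    order = rng.permutation(n_total)
    return (order[:n_train],
            order[n_train:n_train + n_valid],
            order[n_train + n_valid:])

def mse(predicted, measured):
    """Mean square error over N samples (standard definition, assumed here)."""
    predicted = np.asarray(predicted, dtype=float)
    measured = np.asarray(measured, dtype=float)
    return float(np.mean((predicted - measured) ** 2))

rng = np.random.default_rng(0)
# 1344 groups of data: 500 for training, 460 for validation, the remaining 384 for testing.
train_idx, valid_idx, test_idx = split_indices(1344, 500, 460, rng)
print(len(train_idx), len(valid_idx), len(test_idx))

# Small worked example: squared errors are 0, 0 and 4, so the MSE is 4/3.
example_mse = mse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
print(example_mse)
```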

S_{min} = 0.01.

**w**_{i} and the hidden layer biases b_{i}.

## 4. Conclusions

p_{a} and an adaptive dynamic adjustment strategy are designed to improve the cuckoo search algorithm. The input weights and the hidden layer biases of the proposed ICS-based ELM are optimized during the offline training process. The verification results show that the proposed method effectively improves prediction accuracy, with better anti-interference and generalization abilities.

## Author Contributions

## Funding

## Conflicts of Interest

## References

1. Dellana, S.A.; West, D. Predictive modeling for wastewater applications: Linear and nonlinear approaches. Environ. Model. Softw. **2009**, 24, 96–106.
2. Wang, J.; Zhang, Y. Research advances in biosensor for rapid measurement of biochemical oxygen demand (BOD). Acta Sci. Circumstantiae **2007**, 27, 1066–1082.
3. Zhang, M.; Han, H.; Qiao, J. Research on dynamic feed-forward neural network structure based on growing and pruning methods. CAAI Trans. Intell. Syst. **2011**, 6, 101–106.
4. Qiao, J.; Ju, Y.; Han, H. BOD soft-sensing based on SONNRW. J. Beijing Univ. Technol. **2016**, 42, 1451–1460.
5. Liu, W. Online biochemical oxygen demand soft measurement based on echo state network. Comput. Meas. Control **2014**, 22, 1351–1354.
6. Huang, G.B.; Zhu, Q.Y.; Siew, C.K. Extreme learning machine: Theory and applications. Neurocomputing **2006**, 70, 489–501.
7. Huang, G.B.; Zhou, H.; Ding, X.; Zhang, R. Extreme learning machine for regression and multiclass classification. IEEE Trans. Syst. Man Cybern. Part B **2012**, 42, 513–529.
8. Han, F.; Huang, D.S. Improved extreme learning machine for function approximation by encoding a priori information. Neurocomputing **2006**, 69, 2369–2373.
9. Deng, C.; Huang, G.; Xu, J.; Tang, J. Extreme learning machines: New trends and applications. Sci. China Inf. Sci. **2015**, 58, 1–16.
10. Wang, Y.; Di, K.; Zhang, S. Melt index prediction of polypropylene based on DBN-ELM. CIESC J. **2016**, 67, 5163–5168.
11. Li, R.; Wang, L. Soft Measurement modeling and chemical application based on ISOMAP-ELM neural network. Acta Metrol. Sin. **2016**, 37, 548–552.
12. Zhu, Q.Y.; Qin, A.K.; Suganthan, P.N.; Huang, G.B. Evolutionary extreme learning machine. Pattern Recognit. **2005**, 38, 1759–1763.
13. Yan, D.; Chu, Y.; Zhang, H.; Liu, D. Information discriminative extreme learning machine. Soft Comput. **2018**, 22, 677–689.
14. Kassani, P.H.; Teoh, A.; Kim, E. Sparse pseudoinverse incremental extreme learning machine. Neurocomputing **2018**, 287, 128–142.
15. Du, X.; Wang, J.; Jegatheesan, V.; Shi, G. Parameter estimation of activated sludge process based on an improved cuckoo search algorithm. Bioresour. Technol. **2018**, 249, 249–447.
16. Liang, J.; Luo, F.; Xu, Y. Fuzzy rough monotone dependence algorithm based on decision table and its application. J. South China Univ. Technol. Nat. Sci. Ed. **2011**, 39, 7–12.
17. Alex, J.; Benedetti, L.; Copp, J. Benchmark Simulation Model No. 1 (BSM1); Technical Report; Lund University: Lund, Sweden, 2008.
18. Jeppsson, U.; Pons, M.N. The COST benchmark simulation model—Current state and future perspective. Control Eng. Pract. **2004**, 12, 299–304.
19. Huang, G.B. Learning capability and storage capacity of two hidden layer feedforward networks. IEEE Trans. Neural Netw. **2003**, 14, 274–281.
20. Rao, C.R.; Mitra, S.K. Generalized Inverse of Matrices and Its Applications; Wiley: New York, NY, USA, 1971.
21. Serre, D. Matrices: Theory and Applications; Springer: New York, NY, USA, 2002.
22. Yang, X.; Deb, S. Cuckoo search via Lévy flights. In Proceedings of the World Congress on Nature & Biologically Inspired Computing, Coimbatore, India, 9–11 December 2009.
23. Tao, T.; Zhang, J.; Xin, K.; Li, S. Optimal valve control in water distribution systems based on cuckoo search. J. Tongji Univ. Nat. Sci. **2016**, 44, 106–110.
24. Xu, Y.; Cao, T.; Luo, F. Wastewater effluent quality prediction model based on relevance vector machine. J. South China Univ. Technol. Nat. Sci. Ed. **2014**, 42, 103–108.

**Figure 5.** Compared prediction results of improved cuckoo search algorithm-based extreme learning machine (ICS-ELM) and basic ELM under dry weather condition: (**a**) effluent BOD concentration, (**b**) error in predicting the measured effluent BOD.

**Figure 6.** Prediction of effluent BOD under dry weather condition using five soft sensors: (**a**) effluent BOD concentration, (**b**) error in predicting the measured effluent BOD.

**Figure 7.** Prediction of effluent BOD concentration under rainy condition: (**a**) effluent BOD concentration, (**b**) error in predicting the measured effluent BOD.

**Figure 8.** Prediction of effluent BOD under stormy weather condition with a storm event: (**a**) effluent BOD concentration, (**b**) error in predicting the measured effluent BOD.

Parameters | Values | Units | Descriptions |
---|---|---|---|
KLa_{3}, KLa_{4} | 240 | mg/day | Oxygen transfer coefficient of the 3rd and 4th bioreactors |
KLa_{5} | 83 | mg/day | Oxygen transfer coefficient of the 5th bioreactor |
Qint | 55338 | m^{3}/day | Internal recirculation flow rate |
Qr | 18446 | m^{3}/day | Returned sludge flow rate |
Qw | 385 | m^{3}/day | Waste sludge flow rate |

Component | Unit | Description |
---|---|---|
S_{i} | mg COD/L | Soluble inert organic matter |
S_{s} | mg COD/L | Readily biodegradable substrate |
X_{i} | mg COD/L | Particulate inert organic matter |
X_{s} | mg COD/L | Slowly biodegradable substrate |
X_{bh} | mg COD/L | Active heterotrophic biomass |
X_{ba} | mg COD/L | Active autotrophic biomass |
X_{p} | mg COD/L | Particulate product arising from biomass decay |
S_{o} | mg -COD/L | Oxygen (negative COD) |
S_{no} | mg N/L | Nitrate and nitrite nitrogen |
S_{nh} | mg N/L | NH_{4}^{+} and NH_{3} nitrogen |
S_{nd} | mg N/L | Soluble biodegradable organic nitrogen |
X_{nd} | mg N/L | Particulate biodegradable organic nitrogen |
S_{alk} | mol/m^{3} | Alkalinity |
TSS | mg S_{S}/L | Total amount of solids |
Q | m^{3}/day | Influent flow rate |

**Table 3.** Prediction results of the six soft sensor models. MSE: mean square error; CS-ELM: cuckoo search-extreme learning machine; RVM: relevance vector machine; BP: back propagation; LS-SVM: least squares support vector machines.

Model | MSE | Hidden Nodes | Training Time (s) |
---|---|---|---|
ELM | 1.3011 | 40 | 1.78 |
CS-ELM | 0.0640 | 15 | 76.67 |
RVM | 0.0513 | - | - |
BP^{1} | 0.0909 | 25 | - |
LS-SVM^{1} | 0.0865 | - | - |
ICS-ELM | 0.0254 | 40 | 61.4 |

© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Yu, P.; Cao, J.; Jegatheesan, V.; Du, X. A Real-Time BOD Estimation Method in Wastewater Treatment Process Based on an Optimized Extreme Learning Machine. *Appl. Sci.* **2019**, *9*, 523.
https://doi.org/10.3390/app9030523
