Prediction of Wastewater Quality at a Wastewater Treatment Plant Inlet Using a System Based on Machine Learning Methods

Abstract: One of the important factors determining the biochemical processes in bioreactors is the quality of the wastewater inflow to the wastewater treatment plant (WWTP). Information on wastewater quality that is available sufficiently in advance makes it possible to select bioreactor settings properly and obtain optimal process conditions. This paper presents the use of classification models to predict the variability of wastewater quality at the inflow to wastewater treatment plants, the values of which depend only on the amount of inflowing wastewater. The methodology of an expert system for predicting selected indicators of wastewater quality at the inflow to the treatment plant (biochemical oxygen demand, chemical oxygen demand, total suspended solids, and ammonium nitrogen) is presented on the example of a selected WWTP, Sitkówka-Nowiny. In the considered system concept, the values of the measured wastewater quality indicators were divided into lower (reduced in relation to the average), average (typical and most common), and upper (increased) values. On the basis of the calculations performed, it was found that the values of the selected wastewater quality indicators can be identified with sufficient accuracy by means of statistical models based on the support vector machines and boosted trees methods.


Introduction
The key factors having a significant impact on the operation of wastewater treatment plants, and thus the choice of setting values, include the quantity and quality of the influent wastewater [1,2]. They constitute independent variables in the process models for forecasting the operation of a treatment plant [3][4][5]. This is important in terms of the operation of the facilities, as having the values of wastewater quality indicators allows the simulation calculations of a biological reactor to be performed in advance using a process model. This allows optimization of the wastewater treatment process, reduction of operating costs, and obtaining high stability of the biological reactor [6][7][8][9].
The values of wastewater quality indicators at the inlet to the WWTP, as well as the amount of wastewater flowing into the WWTP, change over a wide range, e.g., as a result of heavy rains or uncontrolled discharges of pollutants into the sewage system. This has a significant impact on the treatment process in the biological reactor. Therefore, in order to obtain the required values of wastewater quality indicators at the outflow and to maintain high operational reliability of the treatment plant, it is necessary to constantly control [...] Nowiny commune, and parts of the Masłów commune. As far as the wastewater flowing into the treatment plant is concerned, 80% is municipal wastewater and 20% originates from industrial plants in the city. The nominal capacity of the treatment plant is 72,000 m³/d, which corresponds to a load of 27,500 PE. The influent wastewater is pre-treated mechanically on step grates and in aerated sand traps with separate fat removal, where coarser dirt and sand are captured. After the mechanical part, the wastewater flows into the reactor, where the purification process is based on the single-stage three-phase activated sludge (BARDENPHO) method. Next, the wastewater flows into the secondary settling tank, where the treated wastewater is separated from the activated sludge and from where it flows into the Bobrza River.
Monitoring in the Sitkówka-Nowiny WWTP includes measurements of the quantity and quality of the influent wastewater and the bioreactor operating parameters. The following reactor parameters are measured in an on-line mode with hourly resolution: oxygen concentration (DO) in the nitrification, denitrification, and dephosphatation chambers; mixed liquor suspended solids concentration (MLSS); settling rate of activated sludge in the chambers (SE); concentration and size of the recirculated sludge stream (RAS); the amount of the dosed external carbon source in the form of methanol (mMET); the amount of dosed coagulant (mCOAG); excess sludge stream (WAS); pre-sludge stream (QPRIM); redox potential in the dephosphatation chamber (ORP); wastewater inlet temperature (T) and temperature in the reactor; and pH. During the research period, a qualitative analysis of the influent wastewater was performed once or twice a week to determine its BOD5, COD, TN, and TP. The organic compounds were determined as COD in accordance with PN-ISO 6060:2006 and as BOD5 by the OXITOP method, in accordance with PN-EN 1899-1:2002. TP was determined in accordance with PN-EN ISO 6878:2006. TN was determined in accordance with PN-C-04576-14:1973 [29]. Each quality indicator was determined three times per wastewater sample, and the average values were adopted for the analysis (the standard deviation was about 2%).
On the basis of the data on the wastewater quality at the inlet to the treatment plant (BOD5, TN) and the above-mentioned operational parameters, a model was created for the simulation of total nitrogen at the outlet from the treatment plant [29].
In the study period of 2012-2016, the annual rainfall was 537-757 mm, and the number of days with rainfall varied in the range of 155-266. The average annual air temperature varied from 8.1 °C to 9.6 °C, whereas the number of days with snowfall amounted to 36-84.

Expert System Methodology for Identifying the Quality of Wastewater to Treatment Plant
Taking into account the problems associated with monitoring and forecasting continuous values of wastewater quality indicators, a system methodology (Figure 2) that allows for identifying atypical states in the inflow with respect to selected indicators (BOD5, COD, TN, and TP) was proposed. This system is based on the models for identifying the states of selected wastewater quality indicators (reduced, typical, and elevated values) and the models simulating increased inflows to wastewater treatment plants. It was assumed in the calculations that the 50% of all measured values located around the median are "typical" values for the inflow to the treatment plant in the analyzed observation period. Values of wastewater quality indicators lying below this range were assumed to be "reduced", and those above it "increased". From the point of view of wastewater treatment plant operation, events accompanied by an increased inflow, during which the values of indicators decrease, are particularly dangerous and can potentially lead to disturbances in the operation of the reactor. In the presented diagram, the variability of selected wastewater quality indicators at the inflow at time t is identified based on the values of flow rate Q(t − z). The calculations use classification models (one per quality indicator), described by equations of the form:
• identifier of reduced values of wastewater quality indicators: $C(t)_{f,\mathrm{lower}} = f(Q(t-1), Q(t-2), \ldots, Q(t-z))$
• identifier of increased values of wastewater quality indicators: $C(t)_{f,\mathrm{upper}} = f(Q(t-1), Q(t-2), \ldots, Q(t-z))$
where f is the function describing the dependence of the indicator on the inflow (Q) at different moments of time (t − 1, t − 2, ..., t − z), Q(t) is the inflow to the wastewater treatment plant at time t, and z denotes the delay values.
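The lagged inflows Q(t − 1), ..., Q(t − z) that serve as classifier inputs can be assembled as in the minimal sketch below. The function name and the flow values are hypothetical and purely illustrative; they are not taken from the plant data.

```python
from typing import List, Tuple


def lagged_flow_features(q: List[float], z: int) -> Tuple[List[List[float]], List[int]]:
    """Build rows [Q(t-1), Q(t-2), ..., Q(t-z)] for every t that has a full history.

    Returns the feature rows and the matching time indices t.
    """
    if z < 1:
        raise ValueError("z must be at least 1")
    rows, times = [], []
    for t in range(z, len(q)):
        rows.append([q[t - k] for k in range(1, z + 1)])
        times.append(t)
    return rows, times


# Hypothetical daily inflows in m3/d (illustrative values only).
flows = [40000.0, 42000.0, 65000.0, 58000.0, 41000.0, 39500.0]
X, t_index = lagged_flow_features(flows, z=3)
for t, row in zip(t_index, X):
    print(t, row)
```

Each row can then be paired with the class of the quality indicator observed at the same time t.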
Independent variables in the classifier models were selected using the chi-square test. The first of the system classifiers in Figure 2 allows for forecasting the events when $C(t)_f < C(t)_{f,\mathrm{lower}}$ (identification of minimum values) and $C(t)_{f,\mathrm{upper}} > C(t)_f > C(t)_{f,\mathrm{lower}}$ (identification of typical values). The advantage of this is the option to recognize cases where $C(t)_f > C(t)_{f,\mathrm{upper}}$, which allows for determining three ranges of variability of selected wastewater quality indicators. For the models for flow classification, the median inflow is for the ranges [...]
On the basis of the results of wastewater quantity (Q) and quality (BOD5, COD, TN, TP) measurements at the Sitkówka-Nowiny wastewater treatment plant from 2012-2016 (which gives 360 data sets per quality indicator), calculations of the models for identifying wastewater quality were carried out. The data on individual wastewater quality indicators were collected as part of constant monitoring at the wastewater treatment plant. As part of the on-site monitoring, wastewater is sampled at the treatment plant inflow and measured once or twice a week. Daily average samples, collected with a sampler, are representative of the day. In the 2017-2018 period, the wastewater treatment plant was modernized; the data were therefore not collected regularly and would not be representative.

The Choice of Method to Identify the Quality of Wastewater at the Inflow to the Wastewater Treatment Plant
On the basis of the literature data [35,37-39], it can be stated that one of the commonly used methods for simulating the biochemical processes at the wastewater treatment plant and at its inlet is the artificial neural network method. A review of the works in this area indicates that neural networks of the multilayer perceptron (MLP) type are used to simulate the quality of wastewater at the outlet and inlet, as well as the processes in the bioreactor and technological facilities (nitrification, denitrification, and dephosphatation chambers, etc.). Perceptrons were inspired by the human brain and were introduced by Rosenblatt (1958) [40]. In this method, the learning stage identifies the values of the weights connecting neurons in successive layers for the assumed number of neurons in the hidden layer and the assumed activation functions. If the results obtained using the MLP method are not satisfactory, the model can be modified by adding connections between successive layers (which yields cascaded neural networks) or by introducing feedback (the resulting models are the so-called recursive neural networks, RvNN).
Despite the fact that neural networks are an effective tool for process simulation and in many cases an alternative to physical models, they have a number of disadvantages that affect the obtained calculation results. First of all, the use of MLP neural networks raises reservations when only a limited amount of data is available for model creation. If the network architecture is not selected properly, an insufficient amount of training data may cause so-called overfitting: the model draws too far-reaching conclusions from the training data, which results in poor performance on new, previously unseen data. Secondly, the goal function, which measures the discrepancy between the values measured on the physical model, $y_i^{mes}$, and the values obtained from the simulation, $y_i^{sym}$, over the n data points (i = 1, ..., n), is nonlinear with respect to the optimized weights; consequently, the MLP network is characterized by many local minima and is sensitive to the initial values of the optimized weights. Considering these disadvantages and limitations, a modification of the MLP method was developed, which resulted in the support vector machine (SVM) method [41][42][43]. In the case of a non-linear relationship between the model output (y) and the explanatory variables ($x_1, x_2, \ldots, x_n$), the n-dimensional space is transformed using the kernel function into a k-dimensional feature space, in which the relations $y = \varphi(x_1, x_2, \ldots, x_n)_k$ are linear. The optimization is carried out for l data points of the form $((x_1, x_2, \ldots, x_n)_i, y_i)$, i = 1, ..., l, and the parameters C and ε are responsible for improving the model's generalization.
In the support vector method, Vapnik [44] developed a special calculation algorithm in which the identification of weight values (parameters w and b of the optimal hyperplane) is reduced to a quadratic programming problem, which guarantees that the goal function has a single, global minimum. In this method, however, the predictive abilities of the obtained models also depend on the values of several hyperparameters: the capacity (C), the kernel function parameter (γ), where the kernel is $K(x_i, x_j) = \varphi(x_i)^T \varphi(x_j)$, and the insensitivity threshold (ε). Detailed information on the SVM method can be found in the literature on data mining and artificial intelligence methods [45]. MLPs and SVMs have been thoroughly analyzed and compared over the years, e.g., in [46] or [47].
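For orientation, the optimization problem behind the roles of C and ε is commonly written in the textbook ε-insensitive form shown below. This is the standard formulation, stated here on the assumption that it corresponds to the one used in the work; the slack variables $\xi_i$, $\xi_i^*$ are notation introduced only for this illustration.

```latex
\begin{aligned}
\min_{w,\,b,\,\xi,\,\xi^*} \quad & \frac{1}{2}\,\lVert w \rVert^2 + C \sum_{i=1}^{l} \left( \xi_i + \xi_i^* \right) \\
\text{subject to} \quad & y_i - w^T \varphi(x_i) - b \le \varepsilon + \xi_i, \\
& w^T \varphi(x_i) + b - y_i \le \varepsilon + \xi_i^*, \\
& \xi_i \ge 0, \quad \xi_i^* \ge 0, \qquad i = 1, \ldots, l .
\end{aligned}
```

Here C sets the penalty on deviations larger than ε, so a larger C fits the training data more closely, while a larger ε tolerates wider deviations without penalty. The objective is quadratic and the constraints are linear, which is why the problem has a single global minimum.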
To assess the impact of model structure on simulation results, calculations were also made using the concept of regression trees. This method is much less complex than the SVM model, and its results are simpler to interpret. However, the results of calculations in the regression tree method are stepwise, i.e., the model only uses a finite set of possible predicted values that can be assigned to any incoming data point. Thus, a small change in the value of one input variable can lead to a stepwise change in the value of the model output. This, in turn, weakens the model's predictive ability. Hence, a modification of the regression tree method, i.e., the boosted tree method (BT), was used in the presented work. This approach implements the concept of stochastic gradient boosting of the created trees to improve the predictive capabilities of the model, which is a vital idea behind state-of-the-art prediction and classification models like XGBoost [48]. In the gradient boosting algorithm, multiple trees are built one by one and the final prediction is obtained by adding the predictions of all individual trees in the ensemble. With this approach, it is believed that multiple models with mediocre predictive capabilities can, when combined, perform as well as or better than a single complex model. The subsequent trees are created based on a random sample from the entire data set. This solution aims to eliminate model overfitting (each subsequent regression tree in the model structure is built on a different data set) and allows models with generalization properties to be obtained, which improves their predictive ability.
As part of these analyses, the SVM method was used to identify the quality of wastewater at the inlet to the treatment plant. In the analyses, the measurement data were divided into training (50%), test (25%), and validation (25%) sets. STATISTICA 10 software was used to develop the prediction models described above for selected wastewater quality indicators; in this software, the data for the training and test sets are selected at random. Cross-validation was used when optimizing the predictive models on the training dataset.
Optimal parameter values in individual models were sought in the respective ranges C = 1-10,000, γ = 0.1-2.5, and ε = 0.001-0.1 [45]. The final selection of their values was iterative: for the initially adopted values of C, γ, and ε, the weight values in the model were calculated; then, the parameter values were changed until the best fit of the calculation results to the measurements was obtained. The fit was assessed on the basis of SENS, the sensitivity (the correctness of data classification within the set of cases where BOD5/COD/TN/TP > (BOD5/COD/TN/TP)lower/upper), and SPEC, the specificity (the correctness of data classification within the set of cases where BOD5/COD/TN/TP < (BOD5/COD/TN/TP)lower/upper). In the boosted trees method, the maximum number of trees in the model was limited to 300 to prevent overfitting. This is one of the criteria that allows for obtaining a model with high generalization capabilities, i.e., one that performs as well on new, unseen data (e.g., the test dataset) as on the training data.
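The idea of adding many weak trees, each fitted to a random sample, can be illustrated with the minimal sketch below. It is a simplified stand-in, not the STATISTICA implementation used in the work: it assumes one-split trees (stumps), a squared-error loss applied to 0/1 labels, and hypothetical data and parameter names.

```python
import random
from typing import List, Sequence, Tuple

# A stump is (feature index, threshold, value if <= threshold, value if > threshold).
Stump = Tuple[int, float, float, float]


def fit_stump(X: Sequence[Sequence[float]], r: Sequence[float]) -> Stump:
    """Fit a least-squares one-split tree to the residuals r."""
    best, best_err = None, float("inf")
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2.0
            left = [ri for row, ri in zip(X, r) if row[j] <= thr]
            right = [ri for row, ri in zip(X, r) if row[j] > thr]
            if not left or not right:
                continue
            lmean, rmean = sum(left) / len(left), sum(right) / len(right)
            err = sum((ri - lmean) ** 2 for ri in left)
            err += sum((ri - rmean) ** 2 for ri in right)
            if err < best_err:
                best, best_err = (j, thr, lmean, rmean), err
    if best is None:  # no split possible: fall back to a constant prediction
        mean = sum(r) / len(r)
        best = (0, float("inf"), mean, mean)
    return best


def predict_stump(stump: Stump, row: Sequence[float]) -> float:
    j, thr, left_value, right_value = stump
    return left_value if row[j] <= thr else right_value


def fit_boosted(X, y, n_trees=100, learning_rate=0.1, subsample=0.5, seed=0):
    """Stochastic gradient boosting of stumps for the squared-error loss."""
    rng = random.Random(seed)
    base = sum(y) / len(y)
    pred = [base] * len(y)
    stumps: List[Stump] = []
    m = min(len(y), max(2, int(subsample * len(y))))
    for _ in range(n_trees):
        sample = rng.sample(range(len(y)), m)  # random subset for this tree
        residuals = [y[i] - pred[i] for i in sample]  # negative gradient of the loss
        stump = fit_stump([X[i] for i in sample], residuals)
        stumps.append(stump)
        pred = [p + learning_rate * predict_stump(stump, row) for p, row in zip(pred, X)]
    return base, learning_rate, stumps


def predict_boosted(model, row: Sequence[float]) -> float:
    base, learning_rate, stumps = model
    return base + learning_rate * sum(predict_stump(s, row) for s in stumps)


# Hypothetical one-feature data: class 0 for small inputs, class 1 for large ones.
X = [[float(i)] for i in range(20)]
y = [0.0] * 10 + [1.0] * 10
model = fit_boosted(X, y)
print(round(predict_boosted(model, [2.0]), 3), round(predict_boosted(model, [17.0]), 3))
```

A prediction above 0.5 can be read as class 1. Capping `n_trees`, as the work does with its limit of 300 trees, is one simple guard against overfitting.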
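A random 50/25/25 division of this kind can be sketched as follows. The function name, the seed, and the use of Python are assumptions made for illustration; the work itself relied on STATISTICA 10 for the random selection.

```python
import random
from typing import List, Sequence, Tuple


def split_indices(
    n: int,
    fractions: Sequence[float] = (0.5, 0.25, 0.25),
    seed: int = 0,
) -> Tuple[List[int], List[int], List[int]]:
    """Randomly split the indices 0..n-1 into training, test, and validation sets."""
    if abs(sum(fractions) - 1.0) > 1e-9:
        raise ValueError("fractions must sum to 1")
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_train = int(round(fractions[0] * n))
    n_test = int(round(fractions[1] * n))
    train = idx[:n_train]
    test = idx[n_train:n_train + n_test]
    valid = idx[n_train + n_test:]  # the remainder goes to validation
    return train, test, valid


# 360 records per quality indicator, as reported for the 2012-2016 data.
train, test, valid = split_indices(360)
print(len(train), len(test), len(valid))
```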
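The two fit measures can be computed from binary labels as in the sketch below, assuming the coding in which 1 marks a value above the threshold and 0 a value below it. The function name and the sample labels are hypothetical.

```python
from typing import Sequence, Tuple


def sens_spec(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (SENS, SPEC) in percent for labels coded as 0 or 1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    sens = 100.0 * tp / (tp + fn)  # share of class-1 cases classified correctly
    spec = 100.0 * tn / (tn + fp)  # share of class-0 cases classified correctly
    return sens, spec


# Hypothetical measured and predicted classes (illustrative only).
measured = [1, 1, 1, 1, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 0, 0, 1, 0]
sens, spec = sens_spec(measured, predicted)
print(f"SENS = {sens:.2f}%, SPEC = {spec:.2f}%")
```

The share of misclassified cases in each class is simply 100% minus the corresponding measure.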

Variability of Wastewater Quantity and Quality Inflow to the Treatment Plant
On the basis of the results of the quantity and quality measurements of selected wastewater quality indicators (BOD5, COD, TN, and TP), statistical characteristics (minimum, average, maximum, and standard deviation) were determined with a breakdown into winter and autumn-spring periods (Table 1). On the basis of the data in Table 1, it can be stated that in the winter and autumn-spring periods, the amount of wastewater flowing into the treatment plant varied widely. The highest flow values were recorded in the autumn and spring period, during intense rainfall events, which resulted in an increased inflow to the facility. At the same time, the standard deviation values in relation to the measured Q values indicate a smaller variability of the inflow to the wastewater treatment plant in the winter (due to thaws) than in autumn and spring (caused by rainfall).
In addition, based on Table 1, it can be stated that in winter the average values of selected wastewater quality indicators were lower than in the autumn and spring period. This may be caused, first of all, by a change in the kinetics of processes occurring in the wastewater flowing through the sewers due to a decrease in air temperature [49][50][51]. Secondly, it may also result from the seasonal nature of the functioning of various types of services in the city, including industrial plants. Within the city, there are food industry plants, the operation of which is seasonal; moreover, in the autumn and spring period, the amount of wastewater they contribute is greater than in winter. The low values of wastewater quality indicators that were recorded in autumn and spring may be due to dilution of the wastewater flowing into the treatment plant during intense rainfall events, which is also confirmed by the works of other authors [18,32].
Following the described methodology, the lower and upper limits for the analyzed wastewater quality indicators were determined from the measurement data, which allowed the ranges of typical, reduced, and elevated values to be set. The results of the analyses are presented in Figure 3. The results indicate significant variations of the measured parameters. Similar variations in total ammonium nitrogen (TAN) are indicated by Kerrio and Bae [9], who also showed that these changes did not affect the stability of the WWTP.
On the basis of the lower and upper values given above, the measurement data were classified into binary, i.e., zero-one form. For example, when creating the model for the classification of reduced values (x_lower), it was assumed that when x < x_lower the value is coded as 0; otherwise (x ≥ x_lower) it is coded as 1. In the model for the classification of increased values (x_upper), it was assumed that when x < x_upper the value is coded as 0; otherwise it is coded as 1.
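The zero-one coding can be sketched as follows. The sketch assumes that the lower and upper limits are the first and third quartiles, which is one reading of "50% of the values around the median"; the function names and the sample TN values are hypothetical.

```python
import statistics
from typing import List, Sequence, Tuple


def typical_band(values: Sequence[float]) -> Tuple[float, float]:
    """Return (x_lower, x_upper), assumed here to be the first and third quartiles."""
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q1, q3


def code_against(values: Sequence[float], limit: float) -> List[int]:
    """Code each value as 0 when it lies below the limit and as 1 otherwise."""
    return [0 if v < limit else 1 for v in values]


# Hypothetical TN measurements in mg/L (illustrative only).
tn = [60, 70, 76, 80, 82, 86, 90, 100, 110]
x_lower, x_upper = typical_band(tn)
y_lower = code_against(tn, x_lower)  # target for the model of reduced values
y_upper = code_against(tn, x_upper)  # target for the model of increased values
print(x_lower, x_upper)
print(y_lower)
print(y_upper)
```

A value is in the typical range when its lower code is 1 and its upper code is 0, so the two binary targets together recover the three classes.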

Selection of Independent Variables for Classification Models
On the basis of the measurement results of selected wastewater quality indicators (BOD5, COD, TN, TP) and daily flow values (Q), independent variables for the classification models identifying the minimum and maximum values were selected using the chi-square test at the assumed significance level α = 0.05; the obtained test probability values are presented in Tables 2 and 3.
On the basis of the data in the tables below, it can be concluded that the variability of wastewater quality for BOD5 and TP in the maximum range is described by the independent variables Q(t − 1) to Q(t − 4), whereas for COD and TN it is described by Q(t − 1) to Q(t − 6) and Q(t − 1) to Q(t − 5), respectively. In the above-mentioned cases, it was found that the variability of selected wastewater quality indicators can be described using only four to six independent variables covering flow values. The variability of the minimum values of selected wastewater quality indicators is more complicated than that of the maximum values, which is indicated by the greater number of independent variables obtained from the chi-square test (Tables 2 and 3). This relationship is confirmed by the work of numerous authors [18,35,52] dealing with the impact of the amount of wastewater flowing into the treatment plant on the quality of wastewater. However, the relationships given by the aforementioned researchers related to the entire range of variability of the wastewater quality indicators they examined (BOD5: Dogan et al. [52]; COD: Ahnert et al. [36]), which is a much simpler task than the one addressed in this work.
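A chi-square screening of one candidate variable can be sketched as below. The work does not state how the flows were discretized, so the sketch assumes that a lagged flow is first reduced to a binary "above median" flag; the function name and the data are hypothetical. For a 2×2 table (one degree of freedom) the p-value can be obtained from the complementary error function.

```python
import math
from typing import Sequence, Tuple


def chi_square_2x2(a: Sequence[int], b: Sequence[int]) -> Tuple[float, float]:
    """Pearson chi-square test of independence for two 0/1 sequences.

    Returns the statistic and the p-value for one degree of freedom.
    """
    n = len(a)
    table = [[0, 0], [0, 0]]
    for ai, bi in zip(a, b):
        table[ai][bi] += 1
    row = [sum(table[0]), sum(table[1])]
    col = [table[0][0] + table[1][0], table[0][1] + table[1][1]]
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row[i] * col[j] / n
            if expected == 0:
                raise ValueError("each variable must take both values")
            chi2 += (table[i][j] - expected) ** 2 / expected
    p_value = math.erfc(math.sqrt(chi2 / 2.0))  # survival function for 1 d.o.f.
    return chi2, p_value


# Hypothetical flags: high lagged flow vs. coded quality class (illustrative only).
flow_high = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
quality_class = [0, 0, 0, 0, 1, 1, 1, 1, 1, 0]
chi2, p_value = chi_square_2x2(flow_high, quality_class)
keep_variable = p_value < 0.05  # retain the lag only if the association is significant
print(round(chi2, 2), round(p_value, 4), keep_variable)
```

With samples this small the expected counts are too low for the test to be reliable; the numbers only show the mechanics of the selection step.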
The literature review [18,32,36] indicates the lack of analyses related to the study of the impact of selected groups of independent variables (wastewater quality indicators, flows, weather conditions, etc.) on the variability of wastewater quality at the inlet to municipal wastewater treatment plants divided into classes (typical values, minimum, maximum).

Designation of Classification Models for Forecasting the Quality of Selected Wastewater Indicators
On the basis of the independent variables Q(t − z) selected for the classification models identifying the wastewater quality (BOD5, COD, TN, TP) at the inlet to the wastewater treatment plant, mathematical models were determined using the support vector and boosted trees methods. Table 4 gives the parameters describing the structure of the developed models (C, ε, γ) and the measures of fit of the calculation results to the measurements for the test and validation sets (SENS, SPEC). On the basis of the data in the table, it was found that the parameter values were in the ranges C = 25-55, γ = 0.27-0.41, and ε = 0.005-0.1. In addition, the determined models are characterized by satisfactory predictive capabilities, which is confirmed by the calculated SENS and SPEC values. Among the models determined, the smallest errors in wastewater quality identification were obtained for TP_lower, as confirmed by SPEC = 99.23% and SENS = 97.03%, values close to 100% correct classification. In the models obtained using the boosted trees method, the number of trees did not exceed 300, which indicates that the obtained models are not overfitted. The highest number of classification errors was made when identifying TN_upper, as indicated by SENS = 88.79% and SPEC = 88.10%. At the operational stage, this means that 11.21% of the cases with TN < 76 mg/L and 11.90% of the cases with TN > 86 mg/L will be incorrectly identified. In these episodes, this may lead, for example, to non-optimally selected set values in the aeration system, which affects the amount of air supplied to the system by the blowers and is associated with higher costs. The values of the fit measures determined using BT are 4-10% lower than in the SVM method. As a result, this may lead to an increase in the number of incorrect decisions at the stage of wastewater treatment plant operation.
Therefore, in order to achieve the highest possible efficiency of the wastewater treatment plant from a technological point of view (obtaining the required reduction of wastewater pollution), it is advisable to implement the model that yields smaller calculation errors. Analyzing the results obtained above (Table 4), it can be stated that, compared to the simulations carried out by other researchers, the models represent far-reaching simplifications. Ansari et al. [53] showed the possibility of modeling the quality of wastewater at the inlet to a treatment plant on the basis of flow rate and rainfall depth, obtaining R > 0.87. However, the object they analyzed was a relatively small wastewater treatment plant in Kuala Lumpur with a size of PE = 10,000. At the same time, the model they used was a complicated tool (ANFIS + PSO), the implementation of which would not be a simple task. These results confirmed the analyses of Rousseau et al. [54] and Ahnert et al. [36], who showed the possibility of COD forecasting based on flow rate by analyzing facilities in Germany and Belgium. Ebrahimi et al. [55] analyzed a single object and showed that it is possible to model TP based on BOD5 and TSS; on BOD5, TSS, and COD; or on BOD5, TSS, and TN, obtaining values of the determination coefficient R² > 0.87. However, the possibility of performing analytical determinations of the above-mentioned wastewater quality indicators in a short time (less than 1 day) is limited. Thus, the model they proposed has limited application. The problem of forecasting the wastewater quality at the inflow to the treatment plant was also raised by Dogan et al. [52], who showed a significant impact of COD, TSS, TN, and TP on the BOD5 value. The amount of input data necessary to determine BOD5, compared to that proposed in this research, generates higher costs of determining wastewater quality indicators and extends the time needed to perform the determinations.
In addition to the above-mentioned solutions, the quality of wastewater at the inlet to the wastewater treatment plant has been modeled using autoregressive models, in which the values of wastewater quality indicators are determined based on previous measurement results. This solution requires maintaining the continuity of measurement data in a time series, in the absence of which the model has limited application [12]. Harmonic analysis has also been used to model the quality of wastewater [56]. The simulation results have been normalized and depend on the size of the wastewater treatment plant. Nevertheless, these values are arithmetic means, which may introduce uncertainty into the settings selected for the bioreactor at the stage of simulating the treatment plant operation. As a result, process control at the treatment plant, selection of optimization strategies, and control of biochemical transformations require the implementation of stochastic control algorithms.
The fact that the ranges of variability of selected wastewater quality indicators (BOD5, COD, TN) can be modeled on the basis of flows may be applied at the stage of facility operation. Namely, using the measurement data collected in the city of Kielce [57], it seems appropriate to develop a model for forecasting the flow based on their variability. The data obtained in this way can be an input to the designated model for forecasting the wastewater quality. From the point of view of facility operation, it is important that the variability of wastewater quality (BOD5, COD, TN) at the facility inlet could be identified 24 h in advance. The results obtained in this way can be an input to the designated model for the forecast of total nitrogen [58]. Therefore, in the case under consideration, it would be possible to identify the operational parameters of the bioreactor settings (MLSS, DO, WAS) in advance [4,57,59,60].
This system would operate in such a way that for the determined value of wastewater quality indicators at the inlet on the basis of the classification model and the assumed TN value at the outlet, the bioreactor setting values would be calculated. This would allow for the control and continuous monitoring of the wastewater quality at the inlet and outlet of the treatment plant, as well as the selection of bioreactor setpoints to also reduce electricity consumption.

Conclusions
The results obtained in this work confirm that it is possible to identify the ranges of selected wastewater quality indicators on the basis of the results of measurements of the inflow to the treatment plant. The proposed solution is a simplification in relation to the models in which the quality of wastewater at the inlet to the treatment plant was modeled on the basis of many wastewater quality indicators, the measurement of which takes a long time and is costly.
On the basis of the simulation results, it can be stated that in the case of unsatisfactory results of modeling wastewater quality indicators, the classification model developed gives the opportunity to identify the ranges of variation of BOD5, COD, TN, and TP (reduced, typical, and elevated values) at the inlet. This is an important advantage of the expert system given in the work. The presented model allows for the identification of typical states in the inflow to the wastewater treatment plant and makes it possible to forecast incidental states that disturb the bioreactor's operation.
The simulation results obtained in this work may be helpful in making the technologist's decision at the stage of operation of the wastewater treatment plant and constitute the basis for choosing the value of settings in the biological reactor.
At the same time, further analyses are needed to assess the possibilities of using the models obtained in the work for optimization and control of wastewater treatment plants using the process models in which the data is uncertain and is subject to measurement errors.