Fault-Tolerant Model Predictive Control Applied to a Sewer Network

Abstract: This paper presents a Fault-Tolerant Model Predictive Control (FTMPC) algorithm applied to a simulation model of a sewer network. The aim is to preserve the operation of the predictive controller as far as possible, in accordance with its operational objectives, when anomalies affect elements of the control system, mainly sensors and actuators. For this purpose, a fault detection and diagnosis (FDD) system based on moving window principal component analysis (MWPCA) is developed to provide online fault monitoring for large-scale complex processes with dynamically changing characteristics (e.g., sewer systems), together with a reconfiguration algorithm for the MPC controller that takes advantage of its own features, such as constraint handling. The results obtained for various types of faults are compared with normal controlled operation and with the behavior of the sewer network when no control is applied, and conclusions are drawn from this comparison.


Introduction
Urban drainage networks (UDN) collect both urban wastewater and stormwater and carry them to wastewater treatment plants (WWTPs) for treatment before discharge into the environment, constituting a combined urban drainage system (CUDS). During periods of heavy rain, the resulting mixed water can overload the urban system and produce overflows (combined sewer overflows, CSOs) that can be harmful to the environment. To avoid CSOs, current UDNs have retention systems capable of storing the water that reaches the network in times of intense rain and later releasing the stored volume at lower flows suitable for treatment by the WWTPs. Adequate real-time control (RTC) of the volume of water stored in the tanks can significantly improve the operation of the network and minimize the impact of CSOs [1][2][3][4].
Among the techniques used for optimal control of these systems, those that use a simplified model of the process to predict its behavior stand out. This is how Model Predictive Control (MPC) works [5]. MPC is a control methodology that uses a process prediction model to calculate the manipulated variables over a future horizon so as to optimize a certain cost function. It has been successfully implemented for several decades and has also been applied to UDNs with great success [6][7][8][9].
On the other hand, urban water treatment systems (UWS), which integrate both UDNs and WWTPs, have a high degree of interconnection, and their proper functioning depends on the reliability of the equipment used, such as sensors (flowmeters, level sensors), actuators (pumping stations, gates, valves) and communication systems. The environmental conditions surrounding this equipment can cause its deterioration and malfunction. For this reason, it is necessary to develop Fault-Tolerant Control Systems (FTCS) to maintain safe and efficient operation. A Fault-Tolerant Controller (FTC) is one that can achieve the control objectives even when faults are present, although these may reduce system performance [10][11][12]. Fault-tolerant control takes advantage of the physical and analytical redundancies of the system to increase its performance when an element malfunctions. Furthermore, rapid detection and identification of a fault can help avoid serious and even dangerous breakdowns.
Generally, FTCS can be classified into two types: passive (PFTCS) and active (AFTCS). AFTCS react to system component faults by actively reconfiguring control actions so that stability and performance remain acceptable, even if performance has degraded [11,13]. Normally, AFTCS consist of four subsystems: (1) a reconfigurable controller, (2) a fault detection and diagnosis scheme (FDD), (3) a controller reconfiguration mechanism, and (4) a command/reference governor.
Existing FDD approaches can be generally classified into two categories: (1) model-based schemes and (2) data-driven (model-free) schemes [14,15]. Data-driven schemes are divided mainly into two approaches: multivariate statistical process control (MSPC) methods and machine learning (ML) methodologies. In the first case, the most widely applied methodology is principal component analysis (PCA) [16][17][18][19][20]. The second approach comprises machine learning or artificial intelligence techniques [21][22][23][24][25]. Furthermore, deep learning strategies have become increasingly popular for handling complex nonlinearity and can be used for modeling, control, or management of WWTPs, as can be seen in [26][27][28][29][30][31]; however, very few studies have addressed fault detection problems in sewer networks.
For sewer systems, some methods are based on data analysis [32]. Furthermore, there are methods that use closed-circuit television (CCTV) inspections and artificial intelligence to classify defects automatically [33]. Others are based on state estimation, using, for example, a Luenberger observer [34], or on the determination of normal operating ranges for sensors and actuators [35,36]. Often, the controller used is an MPC, so the result is a fault-tolerant model predictive controller (FTMPC) [35,36], but no work has been found in which the PCA technique has been applied to fault detection and diagnosis in a sewer network.
Due to complex physical and chemical processes as well as changing operating conditions and the nonlinearity of the sewer networks, this technique could be applied successfully to this process.
The benchmark considered as a case study is described in [37]. The main problem of this system is the high variability of the disturbances (collected flows in each area) that affect the process. Among the whole set of data that the benchmark realistically integrates, there are time intervals of weeks in which rainfall is very low, which means that the control system, even when working properly, has little influence on the performance of the system. In this case, a fault of any sensor or actuator, even a large one, would be virtually undetectable, although it would also have little impact on the system.
Something similar occurs in the opposite situation: if very heavy and repeated rainfall saturates the sewer network, the control system cannot prevent overflows, which can be significant even when the controller works properly. A fault in any equipment will then have little additional impact on the system, so its detection and classification will be more difficult.
These same reasons lead us to think that when any type of fault occurs, if it is not very significant, it will most likely go unnoticed. Benchmark simulation tests demonstrate this.
Consequently, it is advisable to focus on intermediate situations, i.e., situations after times of moderately high-intensity rainfall, or longer time intervals in which medium-intensity rainfall occurs with more continuity. It has been found that it is in these cases that the MPC controller is most useful in reducing overflows at different points in the network and in keeping the inflow to the treatment plant closer to its nominal value.
The main contribution of this work is the development of a real-time online FTMPC applied to the UDN system considered, consisting of three subsystems: a fault detection system based on an adaptive online PCA moving data window technique, capable of providing a real-time fault monitoring solution for the sewerage system despite the dynamically changing properties of the system; a fault diagnosis system, which classifies the detected fault through statistical calculations that identify the variable that deviates the most from its normal behavior; and a system for reconfiguring the MPC controller, which takes advantage of its constraint handling capability to try to maintain control over the whole plant while minimizing the effects of the fault. Several case studies with different disturbance profiles are analyzed. The results are compared with the behavior of the system without control, with the normal MPC control algorithm and with different fault situations without reconfiguration of the system.
This article is structured as follows: after this introduction, the mathematical algorithms used, i.e., MPC and PCA, are described. Afterwards, the fault detection, diagnosis and reconfiguration methodology is detailed. The following section presents a case study to which this methodology is applied: first, the sewer system is described, followed by the MPC control algorithm and the FTC system. Next, the results obtained in each case are shown, and the article finishes with the conclusions of the work.

Introduction
The proposed methodology is a real-time MPC-based fault-tolerant system that follows a mixed approach (model-based and data-based) and uses an online PCA technique for fault detection and diagnosis. In this section, the mathematical foundations of MPC and PCA are presented, as well as the statistics used for process monitoring and for fault detection and diagnosis.

MPC Formulation
Model Predictive Control (MPC) is a control technique that calculates the control law by solving an optimization problem online [5]. Control objectives are formulated as an objective function J to be minimized over a finite time horizon N, subject to constraints such as the mathematical model of the system, actuator or sensor limits, and disturbances. At a given time instant t, the problem can be stated as the minimization of J with respect to the control sequence U(t), subject to the system model, where N is the optimization horizon; $g_1$ and $g_2$ represent the system model; $x_0$ is the value of the system states at instant t; X(t) is the sequence of states; U(t) is the sequence of control signals; and D(t) is the sequence of disturbances. These sequences are calculated or estimated at instant t and extend over the horizon N.
This formulation is suitable for adding fault-tolerant properties to the controller because the MPC problem can be updated with the information provided by the FDD module [35].

Principal Components Analysis (PCA)
This method has two advantages: on the one hand, it allows data from a higher-dimensional space to be represented in a reduced-dimensional space, and on the other hand, it transforms the original correlated variables into new uncorrelated variables, which facilitates the understanding of the data [16][17][18][19].
A matrix $X \in \Re^{n \times m}$ containing n samples of m process variables is arranged. So that all variables are weighted equally, the data are normalized by columns to zero mean and unit variance, giving $X_n$. From this matrix, the covariance matrix S is calculated and decomposed using singular value decomposition (SVD):

$$S = \frac{1}{n-1} X_n^{T} X_n = V \Lambda V^{T} \qquad (2)$$

where $\Lambda \in \Re^{m \times m}$ is a diagonal matrix formed by the non-negative real eigenvalues of S in decreasing order ($\lambda_1 \geq \lambda_2 \geq \dots \geq \lambda_m \geq 0$) and V is formed by the eigenvectors of S.
The principal components can be obtained as:

$$T = X_n P \qquad (3)$$

where P is formed by taking the first a columns of V (also called loadings), and T contains the a principal components of $X_n$. The components of the new space T are called scores.
An important aspect to consider is the criterion used to select the number of principal components a. One method consists of adding principal components until a cumulative percentage of variance (CPV) limit is reached. CPV values between 80% and 90% are usually selected. This method has been applied in this work.
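The normalization, decomposition and CPV selection described above can be sketched as follows. This is a minimal illustration using NumPy, not the authors' implementation: the function name `fit_pca`, the default CPV target of 0.85 and the guard for constant columns are assumptions made for the example.

```python
import numpy as np

def fit_pca(X, cpv_target=0.85):
    """Fit PCA on X (n samples x m variables), keeping components up to a CPV target."""
    mean = X.mean(axis=0)
    std = X.std(axis=0, ddof=1)
    std[std == 0] = 1.0                     # assumption: leave constant columns unscaled
    Xn = (X - mean) / std                   # zero mean, unit variance by columns
    S = Xn.T @ Xn / (Xn.shape[0] - 1)       # covariance matrix of the normalized data
    # S is symmetric positive semidefinite, so its singular values are its
    # eigenvalues (returned in decreasing order) and U holds the eigenvectors.
    V, eigvals, _ = np.linalg.svd(S)
    cpv = np.cumsum(eigvals) / np.sum(eigvals)
    a = int(np.searchsorted(cpv, cpv_target)) + 1   # first count reaching the target
    a = min(a, X.shape[1])                  # guard against rounding when target is 1.0
    P = V[:, :a]                            # loadings
    T = Xn @ P                              # scores
    return {"mean": mean, "std": std, "P": P, "eigvals": eigvals, "a": a, "T": T}
```

With two strongly correlated columns and one independent column, the first two components typically capture almost all the variance, so a target between 80% and 90% retains two components.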
There are different statistics for monitoring a process with PCA, which are used to generate control charts of the state of the process. The most used are:
-Hotelling's statistic ($T^2$): For an observation vector $x \in \Re^{m \times 1}$, this index is defined as:

$$T^2 = x^{T} P \Lambda_a^{-1} P^{T} x \qquad (4)$$

where $\Lambda_a$ is formed by the first a rows and columns of $\Lambda$, and a is the number of principal components selected.
If this index exceeds a preset threshold, it follows that the process is not operating normally, that is, a fault has occurred. The threshold can be calculated offline using historical data, as follows:

$$T^2_{\alpha} = \frac{a\,(n^2-1)}{n\,(n-a)}\, F_{\alpha}(a, n-a) \qquad (5)$$

where n is the number of samples considered for the calculation of the PCA, $F_{\alpha}(a, n-a)$ is the critical value of the Fisher-Snedecor F distribution with a and n − a degrees of freedom, and α is the significance level, which sets the accepted false alarm rate. Its typical values vary between 0.01 and 0.05.
-Q or SPE statistic: This index is known as the squared prediction error (SPE). It is calculated from an observation vector $x \in \Re^{m \times 1}$ as:

$$Q = r^{T} r, \qquad r = \left(I - P P^{T}\right) x \qquad (6)$$

where r is the residue vector, and I is the $m \times m$ identity matrix.
As for the $T^2$ index, a threshold can be calculated to determine when the process stops operating normally. This threshold is obtained from historical data taken offline for the PCA:

$$Q_{\alpha} = g\, \chi^{2}_{\alpha}(h), \qquad g = \frac{v}{2\mu}, \qquad h = \frac{2\mu^{2}}{v} \qquad (7)$$

where $\chi^{2}_{\alpha}(h)$ is the value of the inverse cumulative $\chi^2$ distribution at probability 1 − α with h degrees of freedom, α is the tolerance to false alarms, and µ and v are the mean and variance of Q, respectively.
If the process is operating normally, the Q index measures noise fluctuations. When an abnormal event occurs that affects the covariance of X, it is detected because the Q statistic exceeds the set threshold.
The tests carried out for the system considered, introducing various types of faults that affect sensors and actuators, have shown that the Q index is more effective than $T^2$ for detecting abnormal operating situations, due to the great variability of the disturbances that affect the process. For this reason, the Q statistic is used in this work.
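As an illustration of the Q statistic and its threshold described above, the following sketch computes the squared prediction error of a normalized observation and a threshold from the mean and variance of a set of Q values. It assumes the usual moment-matching form (scale v/(2µ), degrees of freedom 2µ²/v). To stay within the standard library, the chi-square quantile is replaced by the Wilson-Hilferty approximation, which is a substitution made for this example, not part of the original method. All function names are hypothetical.

```python
from statistics import NormalDist
import numpy as np

def q_statistic(x, P):
    """Squared prediction error of a normalized observation x for loadings P (m x a)."""
    r = x - P @ (P.T @ x)          # residue r = (I - P P^T) x
    return float(r @ r)

def chi2_quantile(p, dof):
    """Wilson-Hilferty approximation of the chi-square quantile (approximate, not exact)."""
    z = NormalDist().inv_cdf(p)
    c = 2.0 / (9.0 * dof)
    return dof * (1.0 - c + z * c ** 0.5) ** 3

def q_threshold(q_values, alpha=0.05):
    """Threshold g * chi2_{1-alpha}(h), with g = v / (2 mu) and h = 2 mu^2 / v (assumed form)."""
    mu = float(np.mean(q_values))
    v = float(np.var(q_values, ddof=1))
    g = v / (2.0 * mu)
    h = 2.0 * mu ** 2 / v
    return g * chi2_quantile(1.0 - alpha, h)
```

In an online setting, `q_values` would be the Q statistics of the normal-operation data in the current window, so the threshold moves as the window moves.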

Proposed Method for Fault Detection, Diagnosis and Reconfiguration
When the PCA-based fault detection and diagnosis techniques explained in the previous section are applied directly, they do not work, for two reasons: the high variability of the data taken from a system affected by disturbances, and the method used to calculate the tank outlet flow setpoints, which change continuously depending on the wastewater stored in the tanks at any given time. These conditions mean that thresholds calculated offline under normal operating conditions are not useful for online fault detection and diagnosis. To solve this problem, normal operating data must be generated online so that the threshold can be calculated for the current range of values, considering the effect of disturbances and changes in output flow references. These data will be generated by a feedforward neural network trained with normal operating data and will form a moving data window at each time instant, which allows an adaptive threshold to be calculated for the whole simulation data set [20,38,39]. Once trained, the network will be able to generate normal operating data based on the disturbances.
The procedure to apply will be the following:
STEP 1: Initialization: a matrix $X_p \in \Re^{n \times m}$ formed by n samples of m process variables is filled with operating data obtained by simulation. This matrix is normalized and is treated as a sliding window of process data of length n.
STEP 2: Obtaining the Q statistic: a data vector $x \in \Re^{m \times 1}$ with the measurements of the system variables is formed. The x data are normalized, the PCA method is applied to these new data, and the Q statistic is calculated as indicated in Equation (6). This value is stored and updated every time a new sample is acquired, up to a maximum of n values, because each time a new sample is taken, the oldest one is discarded.
STEP 3: $Q_\alpha$ (Q threshold) calculation: to obtain the threshold of the Q statistic adaptively in the sliding window, the values of the actual disturbances taken from the plant and the values of the rest of the process variables under normal operating conditions are required. The latter are calculated with a neural network whose inputs are the disturbances and the past values of the process variables. The output of the network is the prediction of the process variables at the next time instant. Thus, for the online procedure, a matrix $X \in \Re^{n \times m}$ is constructed with the current disturbances and the network outputs in the sliding window, and the threshold of Q is calculated according to Equation (7). Since the oldest sample is discarded each time a new one is taken, the calculated threshold adapts to new disturbances.
STEP 4: Fault detection: the number of times that the Q statistic exceeds the calculated threshold (number of alarms) is counted, and if this number reaches M consecutive values, it is considered that a fault has occurred. Then, go to STEP 5; otherwise, go to STEP 2.
STEP 5: Fault diagnosis: the fault instant is determined, and to calculate the contribution of each variable to the fault, the residue of the first H samples of the M values that were used to detect the fault is evaluated by Equation (6). The variable with the highest mean residue over the H samples considered is determined, and the failing device is identified.
STEP 6: Reconfiguration of the MPC: mainly, sensor and actuator faults are considered. If a sensor fails, its real value can be estimated using other measurements of the process, taking advantage of redundancy. If an actuator fails, the MPC can be recalculated by adding new constraints related to this device.
Section 4.3 will specify the details necessary for the application of this methodology to the system under consideration.
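The detection and diagnosis logic of STEPS 4 and 5 can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' code: the function names are hypothetical, the fault instant is taken as the first sample of the run of M consecutive alarms, and the mean of the absolute residue is used to rank variables (the text speaks only of the mean of the residue).

```python
def detect_fault(q_values, thresholds, M=20):
    """Return the index where the first run of M consecutive alarms starts, or None."""
    run = 0
    for k, (q, q_lim) in enumerate(zip(q_values, thresholds)):
        run = run + 1 if q > q_lim else 0   # an alarm extends the run, otherwise reset
        if run == M:
            return k - M + 1                # assumed definition of the fault instant
    return None

def diagnose(residues, H=10):
    """Return the index of the variable with the largest mean absolute residue
    over the first H residue vectors (one vector per sample)."""
    first = residues[:H]
    m = len(first[0])
    means = [sum(abs(row[j]) for row in first) / len(first) for j in range(m)]
    return max(range(m), key=means.__getitem__)
```

Requiring M consecutive alarms trades detection delay for robustness: a single isolated threshold crossing caused by a strong disturbance resets the counter and raises no fault.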

Benchmark Model Description
Figure 1 shows the sewer system used as a benchmark to test control algorithms [37]. It is made up of six wastewater and rainwater collection areas (numbered 1 to 6 in the figure), six wastewater storage tanks (ST1, ..., ST6; one of them, ST5, is off-line), wastewater pipes, five valves and a pump for flow control, and a wastewater treatment plant (WWTP). The goal is to collect all the wastewater and convey it to the treatment plant, maintaining a supply flow rate with the least possible variability and as close as possible to its nominal value. This is achieved by retaining the volume collected in the tanks during heavy rainfall and releasing that volume during dry periods.
Appl. Sci. 2024, 14, x FOR PEER REVIEW

As part of the control algorithm, a simplified model of the process has been developed that will be used to predict the behavior of the system during a given time interval [40]. The simplified mathematical model of the process is made up of the following elements:
-WATER COLLECTION AREA: all water collected in the area constitutes an inflow to the system that is treated as a disturbance.
-LINK ELEMENTS: these are gravity-driven wastewater conduits in open channels. Their discrete mathematical model involves the following quantities: $q_{u,i}(k)$ is the sum of inflows to link element i; $q_i(k)$ is the output flow of element i; $\tau_i$ is the time constant of element i; T is the sampling period.
-STORAGE TANKS: all parameters of the tank model are related to tank i and instant k: the outlet flow rate; $q_{ov,i}(k)$ is the overflow flow rate; $V_{max,i}$ is the maximum capacity of the tank; $V_i(k)$ is the volume stored at instant k; $c_{0i}$ is the discharge coefficient, calculated empirically for each tank i; $A_i$ is the surface area of the base of tank i; $h_{max,i}$ is the height of tank i; $h_i(k)$ is the water level in tank i; $a_i(k)$ is the opening of the outlet gate of tank i (control variable).
-NODES: they represent places of confluence of several wastewater pipes. The resulting flow is the sum of the tributary flows.
The states considered are the levels of the tanks ST1, ST2, ST3, ST4 and ST6 ($x_1, \dots, x_5$) and the output flows of the link elements that connect the tanks ST2 and ST3, ST4 and ST6, and ST6 with the WWTP, which correspond to the states ($x_6, \dots, x_9$). The output flows of the link elements of the water collection zones 1, 2, 4, 5 and 6 and the flow collected in zone 3 are considered as measurable disturbances of the process ($d_1, \dots, d_6$). The system inputs are the desired flows at the outlet of each of the tanks (manipulated variables): ($u_1, \dots, u_5$). The model appears in detail in [40], which gives the equations of the linearized model used as the prediction model in the MPC algorithm, with:
$x = (h_1, h_2, h_3, h_4, h_5, q_3, q_7, q_8, q_9)$, $u = (u_1, u_2, u_3, u_4, u_5)$, $d = (q_1, q_2, q_4, q_5, q_6, q_{r3})$.
Faults that can affect system operation usually occur in sensors (level meters, flowmeters) or actuators (gates controlling the tank output flow rates). Thus, the study of system faults will focus on these pieces of equipment.

Model Predictive Control Algorithm
The control objective is to ensure a flow rate in the treatment plant that maximizes its capacity without exceeding a maximum value, avoiding overflows in the tanks and in the station itself as much as possible, and minimizing operating costs.
The predictive control algorithm uses a linear state-space model of the process for prediction, which includes the disturbances, and a cost function from which the outlet flow rates of the tanks for optimal operation are calculated. The flow rates calculated by the MPC are the setpoints of local PI regulation loops that control the output flow of each tank.
The cost function or objective function of the MPC, Equation (17), is a quadratic form that considers both the tracking errors in the states and the deviations of the control sequence with respect to the flow reference (penalizing the control efforts), assuming that the prediction and control horizons coincide and that their value is N [5]. The optimization problem that the MPC solves is the minimization of this cost function subject to the linear prediction model and to bound constraints, where $q_{maxj}$ and $u_{maxj}$ are the upper bounds for the flow rate in the link elements and in the tank outputs, respectively.
The matrices Q(k), P and R serve to penalize the tracking errors and the control efforts (inputs) and will be used as controller tuning parameters, as will the control horizon N (P is a terminal penalty for MPC stability obtained by means of the Riccati equation [5]).
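For orientation, a quadratic cost of the kind described here is commonly written as below. This is a generic sketch and not a reproduction of the paper's Equation (17): the summation limits, the symbols $x_{ref}$ and $u_{ref}$, and the exact grouping of terms are assumptions. Only the roles of Q(k), R and P, and the existence of upper bounds $q_{maxj}$ and $u_{maxj}$, come from the text.

```latex
% Generic finite-horizon quadratic cost (assumed form, for illustration only)
J = \sum_{k=0}^{N-1} \Big[
      \big(x(k) - x_{ref}(k)\big)^{T} Q(k) \big(x(k) - x_{ref}(k)\big)
    + \big(u(k) - u_{ref}(k)\big)^{T} R \, \big(u(k) - u_{ref}(k)\big)
    \Big]
    + \big(x(N) - x_{ref}(N)\big)^{T} P \, \big(x(N) - x_{ref}(N)\big)

% Bound constraints named in the text (only the upper bounds are stated there)
q_j(k) \le q_{maxj}, \qquad u_j(k) \le u_{maxj}
```

In this form, increasing an entry of Q(k) makes the optimizer track the corresponding state reference more tightly, while R trades that tracking against deviations of the flow commands from their references.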
A time-varying matrix Q(k) has been considered so that, if an overflow occurs in a tank, the corresponding weight is modified to avoid it as much as possible. The non-zero elements of Q(k) that constitute MPC tuning parameters are $q_9$ and the tank-related weights, whose expressions involve the parameters $f_i$ and $\alpha_i$. These parameters are used to tune the MPC algorithm. To achieve optimal system operation, the level setpoints are calculated following the strategy of distributing the current volume of water among all tanks as evenly as possible considering their capacity. This is achieved by calculating the reference level of each tank as a function of the total capacity of the network and the capacity of that tank (Equation (19)) [40], where $x_{iref}(k)$ is the reference level for tank i at time k; $V_G(k)$ is the total volume occupied at instant k; $v_i$ is a factor that represents the weight of the capacity of tank i in the total available volume (sum of all tank volumes); and $V_i$ and $A_i$ are the maximum capacity and the surface area of tank i, respectively. The reference values for the flow rates are zero because they are not considered, except $x_{ref9} = 60{,}000$ m³/d, which is the desired inlet flow for the WWTP.
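The level-setpoint strategy can be illustrated with a short sketch. It assumes that each tank receives a share of the total stored volume proportional to its capacity and that the reference level is that volume divided by the tank surface area. This is one plausible reading of the description, not a reproduction of Equation (19), and the function name and argument layout are hypothetical.

```python
def reference_levels(volumes, capacities, areas):
    """Distribute the stored volume among tanks in proportion to capacity.

    volumes:    current stored volume in each tank
    capacities: maximum capacity V_i of each tank
    areas:      base surface area A_i of each tank
    Returns the reference level of each tank (assumed form: v_i * V_G / A_i).
    """
    total_volume = sum(volumes)              # V_G(k), total volume occupied
    total_capacity = sum(capacities)
    levels = []
    for V_max, A in zip(capacities, areas):
        share = V_max / total_capacity       # v_i, weight of this tank's capacity
        levels.append(share * total_volume / A)
    return levels
```

For example, with 100 units stored in total, capacities of 300 and 100, and areas of 10 and 5, the two tanks are assigned 75 and 25 units, so both end up filled to the same fraction (25%) of their capacity.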
The R matrix is a diagonal matrix whose first five diagonal values are used to penalize the deviations of the flow commands from their reference values; the rest are null, since they correspond to the disturbances and cannot be optimized. The flow references for these first five values are calculated from the desired level of each tank according to Equation (19).

Fault-Tolerant Control System for the Sewer Network
A Fault-Tolerant Control (FTC) system will be implemented following a hierarchical control architecture, as shown in Figure 2.

As already indicated, the MPC generates the setpoints of the local PI regulators by solving a Quadratic Programming (QP) optimization problem that considers the measurements of the system variables (levels and flows) and the measurable disturbances of the process (inlet flows to the sewer network that come from urban wastewater and precipitation). At the supervision level is the fault detection and diagnosis (FDD) system, which receives the measurements of the system variables and disturbances from the level immediately below and allows the MPC algorithm of that level to be reconfigured after a fault has been detected and diagnosed correctly.
To reduce the number of cases in this study, only faults in actuators (gates that regulate the outlet flow rate of the tanks) and in the level sensor of each tank of the network will be considered. The performance of the fault detection system will be studied considering different moments at which a fault occurs in the controlled system and two different disturbance scenarios. The faults under study will be:
-Faults in the level sensors of each tank: the system behavior will be analyzed considering faults in the sensor gain, which is reduced to 10% of its nominal value.
-Faults in the actuators: the behavior of the detection and diagnosis system will be studied considering the gate of each tank blocked at 20% of its total opening.
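The two fault types listed above can be expressed as simple fault-injection helpers for a simulation. This is a hypothetical sketch: the function names are invented for illustration, gate openings are assumed to be expressed as fractions between 0 and 1, and the fault is assumed to remain active from its onset day until the end of the simulation.

```python
def fault_active(t_days, fault_day):
    """Assumed convention: the fault starts at fault_day and persists afterwards."""
    return t_days >= fault_day

def faulty_level_reading(true_level, active, gain=0.1):
    """Level sensor whose gain drops to 10% of its nominal value while the fault is active."""
    return gain * true_level if active else true_level

def faulty_gate_opening(commanded_opening, active, stuck_at=0.2):
    """Gate blocked at 20% of its total opening while the fault is active."""
    return stuck_at if active else commanded_opening
```

A simulation loop would call these helpers on every sample, for instance `faulty_level_reading(h, fault_active(t, 5))` for a sensor fault starting on the fifth day.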
Both types of faults are of high magnitude, for the reasons explained in the introduction. These faults will occur on the second, fifth or eighth day of the total 10 days for different simulation scenarios. The methodology explained in Section 3 will be applied to the sewer system considering the following parameter values:
-The length of the sliding window is n = 50, a value found heuristically to be suitable. Matrix $X_p \in \Re^{n \times m}$ contains these samples.
-The data vector $x \in \Re^{m \times 1}$, taken from the system, includes the system disturbances, the system state variables (tank levels and flow rates of the link elements 6, 7, 8 and 9), the output flows and the output flow setpoints of each tank, so its length is m = 25.
-A matrix $X \in \Re^{n \times m}$ is created with normal operating data using the neural network, including disturbances, and, considering a variance percentage of 95%, a fault threshold for the Q statistic is calculated in that interval of 50 samples.
-The number of consecutive alarms considered to be a fault is M = 20. This is a suitable value to avoid false detections caused by the strong disturbances affecting the system.
-For diagnosis, the residue of the first 10 samples of the set of 20 that were used to detect the fault is evaluated (H = 10), as explained in Section 3. This value has been chosen experimentally. The variable whose mean residue is the greatest over the set of 10 samples considered is determined. If this variable is the level of a specific tank, it is considered that the corresponding level sensor has failed. If the variable is an outlet flow rate, it follows that the corresponding tank gate has failed.
-The MPC reconfiguration depends on the element that presents the fault, so different strategies are applied to reconfigure the MPC controller and minimize the effects caused by the faults:
  -Faults in level sensors: when a level sensor fails, the level of that tank $h_i(k)$ can be estimated at instant k by Equation (20) if its output flow $u_i(k)$, its discharge coefficient $c_{0i}$ and the gate opening degree $v_i(k)$ are known.
  -Faults in gates: in this case, the MPC algorithm is reconfigured by adding to the QP optimization problem an equality constraint for the calculation of the reference output flow rate of the affected tank, $u_{iref}(k)$, since its output flow $u_i(k)$ can be measured at instant k, so that $u_{iref}(k) = u_i(k)$.
To have normal online operating data, a neural network trained with fault-free data extracted from the benchmark will be used. The neural network is composed of 25 neurons in the hidden layer and 19 in the output layer, as shown in Figure 3. To improve the performance of the neural network, it is fed back with the system outputs from the previous instant. Thus, the disturbances ($d_1, \dots, d_6$) and the previous outputs are applied to the input ($i_1, \dots, i_{25}$), and the process variables of interest are obtained at the output ($o_1, \dots, o_{19}$): state variables ($x_1, \dots, x_9$), flow references ($u_{ref1}, \dots, u_{ref5}$) and output flow rates of every tank ($u_1, \dots, u_5$).
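The way the fed-back network produces a window of normal operating data can be sketched as a simple rollout. The sketch below does not implement the neural network itself: `predict` is a stand-in for any trained one-step predictor that maps 25 inputs (6 disturbances plus the 19 previous outputs) to 19 outputs, and the function name and row layout are assumptions made for illustration.

```python
def generate_normal_window(predict, disturbances, initial_outputs):
    """Roll a one-step predictor over a window, feeding its outputs back as inputs.

    predict:         callable taking a list of 25 inputs and returning 19 outputs
    disturbances:    sequence of disturbance vectors (6 values per sample)
    initial_outputs: the 19 process variables at the instant before the window
    Returns one row per sample: 6 disturbances followed by 19 predicted variables.
    """
    outputs = list(initial_outputs)
    window = []
    for d in disturbances:
        inputs = list(d) + outputs          # 6 disturbances + 19 previous outputs
        outputs = list(predict(inputs))     # predicted process variables
        window.append(list(d) + outputs)    # one row of the m = 25 column matrix
    return window
```

Each row has the same 25-column layout as the measured data vector, which is what allows the Q threshold to be computed from the generated window and compared with the Q statistic of the real measurements.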

Results and Discussion
Three scenarios extracted from the benchmark time series have been considered. They were chosen because their flow variations are the most significant, for the reasons explained in the introduction. The first scenario provides the neural network training data that will be used to generate normal online operating data. The second and third scenarios serve to evaluate the fault detection and diagnosis system, as well as the controller reconfiguration.
To perform the simulation tests, the weights of the MPC cost function (non-null elements of matrices Q(k) and R) have been adjusted for Equation (17) and are shown in Table 1:

The model system parameters are shown in Table 2.

Then, to validate the trained network, the data provided by the system and the neural network will be compared with input data corresponding to scenarios 2 and 3, shown in Figures 5 and 6, respectively.

Below are some results related to the evaluation of the trained network. Figures 7 and 8 show the reservoir levels provided by the system and by the neural network under normal operating conditions with input data from scenario 2. Similarly, Figures 9 and 10 depict the same levels with input data from scenario 3.

The results provided by the network in both cases largely match those generated by the system, so the network can be used to generate normal operating data based on the disturbances affecting the system.
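The way the network generates normal operating data from the disturbances alone can be sketched as a closed-loop rollout: at each instant, the 6 disturbances and the 19 outputs of the previous instant form the 25 inputs. The layer sizes below come from the text; the random weights, the `tanh` activation and all function names are assumptions for illustration only, since a real network would be trained on the fault-free data of scenario 1.

```python
import numpy as np

N_DIST, N_HIDDEN, N_OUT = 6, 25, 19   # sizes stated in the text

def init_network(rng):
    """Placeholder weights (ASSUMPTION: untrained, random) with the stated layer sizes."""
    n_in = N_DIST + N_OUT             # 6 disturbances + 19 fed-back outputs = 25 inputs
    return {
        "W1": rng.normal(scale=0.1, size=(N_HIDDEN, n_in)),
        "b1": np.zeros(N_HIDDEN),
        "W2": rng.normal(scale=0.1, size=(N_OUT, N_HIDDEN)),
        "b2": np.zeros(N_OUT),
    }

def step(net, d, y_prev):
    """One prediction from the current disturbances and the previous outputs."""
    inputs = np.concatenate([d, y_prev])
    hidden = np.tanh(net["W1"] @ inputs + net["b1"])   # ASSUMPTION: tanh activation
    return net["W2"] @ hidden + net["b2"]

def rollout(net, disturbances, y0):
    """Feed each output back as input for the next instant."""
    y = np.asarray(y0, dtype=float)
    outputs = []
    for d in disturbances:
        y = step(net, d, y)
        outputs.append(y)
    return np.array(outputs)

def split_outputs(y):
    """Split the 19 outputs into states (9), flow references (5) and output flows (5)."""
    return y[:9], y[9:14], y[14:19]
```

Each row of the array returned by `rollout`, combined with the corresponding disturbances, gives one 25-element sample of normal operating data.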

Fault Detection and Diagnosis Tests
As described in Section 4.3, the faults under study are:

- Faults in level sensors: sensor gain is reduced to 10% of its nominal value.
- Faults in actuators: the gate is blocked at 20% of its total opening.

These faults are introduced on the second, fifth or eighth day of a 10-day simulation interval, for both scenarios 2 and 3.

It was first verified that, in the absence of faults, the detection system does not report any fault. The alarm rate is 9.57% for scenario 2 and 11.13% for scenario 3, but since 20 consecutive alarms are necessary to declare a fault, none is detected.
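The difference between the alarm rate and an actual detection can be made concrete with a short sketch of the consecutive-alarm rule. The function names are hypothetical and the alarm sequence in the usage lines is synthetic, not taken from the benchmark.

```python
def alarm_rate(alarms):
    """Percentage of samples whose Q statistic exceeded the threshold."""
    return 100.0 * sum(alarms) / len(alarms)

def first_detection(alarms, m_consecutive=20):
    """Index of the sample that completes the first run of m_consecutive alarms, or None."""
    run = 0
    for k, is_alarm in enumerate(alarms):
        run = run + 1 if is_alarm else 0   # any non-alarm sample resets the run
        if run >= m_consecutive:
            return k
    return None

# Synthetic example: isolated alarms on 10% of the samples never trigger a detection
scattered = [k % 10 == 0 for k in range(1000)]
print(alarm_rate(scattered), first_detection(scattered))
```

With scattered alarms the rate is comparable to the values reported above, yet no run reaches 20 samples, so no fault is declared.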
Figures 11 and 12 show the calculated Q threshold and the Q statistic in the absence of faults for scenarios 2 and 3. In both figures, the Q statistic sometimes exceeds the calculated threshold, but no fault is detected because 20 consecutive alarms are necessary.

The results obtained in different fault situations are shown in Tables 3 and 4 for scenarios 2 and 3, respectively. The tables show the detection result (detection instant) and the diagnosis for each type of fault considered (fault variable: $h_i$ is the level of tank $i$; $u_i$ is the output flow rate of tank $i$). The correct diagnosis is highlighted in green.

In terms of fault detection, both scenarios show that all faults are detected relatively quickly (almost all before the next day). As already mentioned, these are large faults. In the tests performed with less significant faults, detection was considerably delayed with respect to the time of fault generation, and in some cases the fault was not detected at all.
Regarding fault classification, success depends on how close the disturbances are to the moment the fault is generated, as well as on their magnitude and frequency. As these characteristics are highly variable, success in identifying the faulty element is also variable. Further investigation is needed to improve fault classification.

Fault Detection, Diagnosis and MPC Reconfiguration Tests
In this section, to assess the MPC reconfiguration performance, comparative results of the control system are shown for four cases:

- Case 1: sewer network without control, that is, always with all the gates open.
- Case 2: sewer network controlled with MPC in the absence of faults.
- Case 3: sewer network controlled with MPC in the presence of a certain fault.
- Case 4: sewer network controlled with reconfigured MPC (FTMPC). Once the fault is correctly detected and identified, the controller is reconfigured to improve system performance compared to the previous case.
In each case, scenarios 2 and 3 are considered. Two of the most representative faults have been selected:

- Fault in the tank 1 level sensor, in which its gain is reduced to 10% of its normal value.
- Fault in the tank 3 gate, which is assumed to be blocked at 20% of its total opening.

Furthermore, to better evaluate the effect of the fault and the reconfiguration of the system, it is assumed that, in all cases, the fault is generated on the second day of the 10-day simulation period considered for each scenario.
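The two selected faults can be sketched as simple transformations applied to the simulated signals. This is an illustration only: it assumes that time is expressed in days, that the fault starts at $t = 2$ days, and that gate openings are fractions between 0 and 1. The function and constant names are hypothetical.

```python
FAULT_DAY = 2.0            # ASSUMPTION: fault onset at t = 2 days
SENSOR_GAIN_FAULT = 0.10   # sensor gain reduced to 10% of its normal value
GATE_STUCK_OPENING = 0.20  # gate blocked at 20% of its total opening

def measured_level(true_level, t_days, fault_day=FAULT_DAY):
    """Level reported by the sensor: the gain drops to 10% once the fault is active."""
    gain = SENSOR_GAIN_FAULT if t_days >= fault_day else 1.0
    return gain * true_level

def applied_gate_opening(commanded_opening, t_days, fault_day=FAULT_DAY):
    """Opening actually applied: the gate ignores the command once it is blocked."""
    if t_days >= fault_day:
        return GATE_STUCK_OPENING
    return commanded_opening
```

Before the fault day both functions return their input unchanged, so the same simulation loop can be used for the fault-free and faulty cases.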
The performance evaluation criteria are the same as those detailed in [40]. In summary, these criteria are the number of overflows ($N_{ov}$), the duration of overflow ($T_{ov}$) in days, the volume overflowed ($V_{ov}$) in m$^3$, the degree of utilization of the WWTP ($G_u$) in %, and the smoothness in the application of control signals ($S$) in m$^3$/d.

• Fault in the tank 1 level sensor (scenario 2): alarm percentage before fault detection: 2.2%. Detection instant: 2.285 days. The MPC controller is reconfigured using Equation (20) to estimate the tank 1 level, assuming the tank 1 outlet flow rate measurement is correct.

Figure 13 shows the Q threshold and the Q statistic calculated online for fault detection. Detection occurs when Q exceeds the threshold 20 consecutive times.

Table 5 provides the comparative data of system performance for four cases: no control, normal MPC, MPC with the $h_1$ fault (no reconfiguration), and reconfigured MPC (FTMPC). Note that, because of the sewer configuration, the volume overflowed from tank 1, $V_{ov,1}$, returns to the network; for this reason, it is not added to $V_{ov}$ in any of the tables [40].

For simplicity, the main indices to be considered are $V_{ov}$, $G_u$ and $S$. As can be seen in Table 5, the normal MPC controller offers the best performance: it gives the lowest total overflow volume and the lowest value of $S$, and the highest degree of utilization of the WWTP. MPC with the fault considered reduces system performance, worsening all indices, but it still performs much better than the no-control case: $G_u$ is 57.92% vs. 53.96%, and $V_{ov}$ is $3.8003 \times 10^4$ vs. $6.8473 \times 10^4$ m$^3$. Finally, compared with MPC with the fault, FTMPC improves system performance: the total overflow volume is reduced from $3.8003 \times 10^4$ to $3.4520 \times 10^4$ m$^3$ and the degree of WWTP utilization increases from 57.92% to 58.55%, although $S$ is worse because the system needs greater control effort. Therefore, this reconfiguration strategy improves system performance when this fault occurs.

• Fault in the tank 3 gate (scenario 2): alarm percentage before fault detection: 2.01%. Detection instant: 2.31 days. The MPC controller is reconfigured using Equation (21), which adds a new constraint to the MPC problem, assuming the tank 3 outlet flow rate measurement is correct.

Figure 14 shows the Q threshold calculated online and the value of the Q statistic for fault detection. Detection occurs when Q exceeds the threshold 20 consecutive times. Table 6 provides the comparative data of system performance in each case for scenario 2.

Considering the main indices ($V_{ov}$, $G_u$ and $S$), the discussion is similar to that of the previous case, with the normal MPC controller having the best performance. MPC with the fault considered reduces system performance, but it still performs much better than the no-control case: $G_u$ is 58.49% vs. 53.96%, and $V_{ov}$ is $3.5235 \times 10^4$ vs. $6.8473 \times 10^4$ m$^3$. Finally, compared with MPC with the fault, FTMPC improves system performance slightly: the total overflow volume is reduced from $3.5235 \times 10^4$ to $3.5162 \times 10^4$ m$^3$ and the degree of WWTP utilization increases from 58.49% to 58.52%, but $S$ is worse because the system needs greater control effort.
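The gate-fault reconfiguration, in which an equality constraint is added to the QP, can be illustrated on a deliberately reduced problem. The sketch below is not the MPC problem of Equation (17): it assumes a single-step quadratic cost with no inequality constraints, so that the equality-constrained QP can be solved directly from its KKT linear system. The function names and the numerical values in the usage lines are hypothetical.

```python
import numpy as np

def solve_qp_equality(H, f, A=None, b=None):
    """Minimize 0.5*u'Hu + f'u subject to A u = b (H symmetric positive definite)."""
    n = H.shape[0]
    if A is None:
        return np.linalg.solve(H, -f)           # unconstrained minimizer
    A = np.atleast_2d(A)
    b = np.atleast_1d(b)
    p = A.shape[0]
    # KKT system: [H A'; A 0] [u; lambda] = [-f; b]
    kkt = np.block([[H, A.T], [A, np.zeros((p, p))]])
    rhs = np.concatenate([-f, b])
    return np.linalg.solve(kkt, rhs)[:n]

def gate_fault_constraint(n_tanks, faulty_tank, measured_flow):
    """Equality constraint pinning the faulty tank's flow reference to the measured flow."""
    A = np.zeros((1, n_tanks))
    A[0, faulty_tank] = 1.0
    return A, np.array([measured_flow])

# Toy usage: five flow references, tank 3 (index 2) pinned to a measured flow of 0.4
target = np.array([0.5, 0.6, 0.7, 0.8, 0.9])
H, f = 2.0 * np.eye(5), -2.0 * target
A, b = gate_fault_constraint(5, 2, 0.4)
print(solve_qp_equality(H, f, A, b))
```

The point of the design is that the fault is handled entirely inside the optimization problem: the remaining references are still chosen optimally while the blocked gate's reference is forced to a value the plant can actually deliver.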

• Fault in the tank 1 level sensor (scenario 3): alarm percentage before fault detection: 2.44%. Detection instant: 2.26 days. MPC reconfiguration is performed in the same way as in Section 5.3.1 for the tank 1 level sensor.

Figure 15 shows the Q threshold and the value of the Q statistic calculated online. Detection occurs when Q exceeds the threshold 20 consecutive times. Table 7 provides the comparative data of system performance in every case. These results lead to the same conclusions as the scenario 2 results.

Regarding the main indices ($V_{ov}$, $G_u$ and $S$), the discussion is similar to that of the previous cases, with the normal MPC controller having the best performance. MPC with the fault considered reduces system performance, but it still performs better than the no-control case: $G_u$ is 67.28% vs. 61.67%, and $V_{ov}$ is $4.77 \times 10^4$ vs. $8.74 \times 10^4$ m$^3$. Compared with MPC with the fault, FTMPC improves system performance slightly: the total overflow volume is reduced from $4.77 \times 10^4$ to $4.41 \times 10^4$ m$^3$ and the degree of WWTP utilization increases from 67.28% to 68.81%, but $S$ is worse because the system needs greater control effort.

• Fault in the tank 3 gate (scenario 3): alarm percentage before fault detection: 3.05%. Detection instant: 2.271 days. MPC reconfiguration is performed in the same way as in Section 5.3.1 for the tank 3 gate.

Figure 16 shows the Q threshold calculated online and the value of the Q statistic for fault detection, from which the detection instant can be read. Detection occurs when Q exceeds the threshold 20 consecutive times.

Table 8 presents the comparative data of system performance in each case. The main indicators show the same behavior as in the previous cases. For instance, comparing MPC with the fault against FTMPC, $G_u$ is 67.28% vs. 68.81% and $V_{ov}$ is $4.6066 \times 10^4$ vs. $4.5926 \times 10^4$ m$^3$. Therefore, FTMPC improves the performance of the system.
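To compare the benefit of reconfiguration across the four fault cases, the figures quoted in the discussion of Tables 5 to 8 can be turned into relative changes. The short script below only rearranges those reported numbers; the dictionary layout and function names are illustrative.

```python
# ((V_ov with fault, V_ov with FTMPC) in m^3, (G_u with fault, G_u with FTMPC) in %),
# copied from the discussion of Tables 5 to 8
CASES = {
    "scenario 2, tank 1 level sensor": ((3.8003e4, 3.4520e4), (57.92, 58.55)),
    "scenario 2, tank 3 gate": ((3.5235e4, 3.5162e4), (58.49, 58.52)),
    "scenario 3, tank 1 level sensor": ((4.77e4, 4.41e4), (67.28, 68.81)),
    "scenario 3, tank 3 gate": ((4.6066e4, 4.5926e4), (67.28, 68.81)),
}

def relative_reduction(before, after):
    """Percentage reduction of a quantity with respect to its initial value."""
    return 100.0 * (before - after) / before

def summarize(cases):
    """Return (V_ov reduction in %, G_u gain in percentage points) for each case."""
    rows = {}
    for name, ((v_fault, v_ft), (g_fault, g_ft)) in cases.items():
        rows[name] = (relative_reduction(v_fault, v_ft), g_ft - g_fault)
    return rows

for name, (dv, dg) in summarize(CASES).items():
    print(f"{name}: V_ov reduced by {dv:.2f}%, G_u up by {dg:.2f} points")
```

With these figures, the overflow reduction obtained by reconfiguration is roughly 9% and 7.5% for the level sensor fault in scenarios 2 and 3, and well under 1% for the gate fault in both scenarios.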

Conclusions
In this paper, a methodology for fault detection and diagnosis in certain types of sensors and actuators of a wastewater sewer network, based on an adaptive PCA technique, has been presented and analyzed. Because the system is subject to strong, highly variable disturbances, only large faults have been detected and classified; low-intensity faults have little effect on system performance and are therefore more difficult to detect. Even so, the detection algorithm has detected faults in different elements and in different scenarios reasonably quickly and reliably. Diagnosis of the detected faults is very difficult, not only because of the disturbances in the system but also because the set point of the flow regulators is constantly being recalculated; the results obtained can therefore be improved, and future work will continue in this direction.

For cases in which both detection and diagnosis have been successful, the MPC reconfiguration strategies improve system performance compared with the faulty situation in which the controller is left unchanged. Moreover, the structure of MPC itself facilitates reconfiguration when a fault occurs, for instance, by adding a new constraint to the optimization problem. MPC reconfiguration is therefore usually easy to implement, and many systems combine the FTC and MPC strategies in an FTMPC scheme. Finally, although this FTMPC controller has been designed for a sewer system, it can be easily adapted to other types of processes that present the same difficulties.

Figure 4 represents the inlet flows to the sewer network collected in each of the catchment areas considered, due to precipitation and wastewater, over a period of 10 days (scenario 1). It shows the training data profile of the neural network extracted from the benchmark.

Figure 11. Q statistic and threshold calculated in a normal operating situation (scenario 2).

Figure 12. Q statistic and threshold calculated in a normal operating situation (scenario 3).

Figure 13. Q statistic and threshold calculated in level fault situation (scenario 2).

Figure 14. Q statistic and threshold calculated in gate fault situation (scenario 2).

Figure 15. Q statistic and threshold calculated in level fault situation (scenario 3).

Figure 16. Q statistic and threshold calculated in gate fault situation (scenario 3).

Table 3. Fault detection and diagnosis results for scenario 2.

Table 4. Fault detection and diagnosis results for scenario 3.

Table 5. System performance in every case: fault in tank 1 level sensor (scenario 2).

Table 6. System performance in every case: fault in tank 3 gate (scenario 2).

Table 7. System performance in every case: fault in tank 1 level sensor (scenario 3).

Table 8. System performance in every case: fault in tank 3 gate (scenario 3).