# Prediction of Surface Water Quality by Artificial Neural Network Model Using Probabilistic Weather Forecasting


## Abstract


## 1. Introduction

_{3}-N concentrations only based on hydrologic data but did not consider water quality variation characteristics through the reaction mechanism in water bodies. Water quality is sensitive both to rainfall runoff flowing into the watershed and to changes in the water environment. Dunn et al. [10] stated that rainfall runoff increases the concentration of heavy metals in water bodies and can occur on both pervious and impervious surfaces in urban areas. Jeong et al. [11] analyzed the correlations between phytoplankton biomass (chlorophyll a concentration) and rainfall and explained that, for water quality management, dam operation must be managed effectively according to the rainfall received. Meteorological factors as well as water quality factors should therefore be considered when developing a water quality prediction model.

## 2. Materials and Methods

#### 2.1. Study Area and Data Description

#### 2.2. Methodology

#### 2.2.1. Probability Forecast System

#### 2.2.2. Exploratory Factor Analysis (EFA)

#### 2.2.3. Artificial Neural Network (ANN)

#### 2.2.4. Model Evaluation

The coefficient of determination (R^{2}), which is widely used in various fields, including water quality modeling, is a quantitative measure of the linear relationship between measurements and simulation values. The coefficient ranges between 0 and 1; the more linear the relationship, the closer the coefficient is to 1. The Nash–Sutcliffe efficiency (NSE) is the statistical measure most widely used in the water quality modeling field. It is recommended by the ASCE [14], Legates and McCabe [15], and Moriasi et al. [16], and it is still used by researchers who perform water quality modeling. A value closer to 1.0 means that the simulation values reflect the tendency of the measurements more accurately. The root mean square error (RMSE) is expressed in the units of the simulated variable and quantifies the error directly. However, it is difficult for non-experts to interpret because it represents only the absolute magnitude of the error. Care should also be taken because the equation squares the residuals and is therefore strongly affected by high values or outliers.
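The three measures described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names and the sample values are hypothetical, and R^{2} is computed here as the squared Pearson correlation, an assumption that matches the description of a linearity measure bounded by 0 and 1.

```python
import numpy as np


def rmse(obs, sim):
    """Root mean square error, in the units of the observed variable."""
    obs, sim = np.asarray(obs, dtype=float), np.asarray(sim, dtype=float)
    return float(np.sqrt(np.mean((sim - obs) ** 2)))


def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 minus (residual variance / observed variance)."""
    obs, sim = np.asarray(obs, dtype=float), np.asarray(sim, dtype=float)
    residual = np.sum((obs - sim) ** 2)
    spread = np.sum((obs - obs.mean()) ** 2)
    return float(1.0 - residual / spread)


def r2(obs, sim):
    """Squared Pearson correlation (assumed definition of R^2 for this sketch)."""
    obs, sim = np.asarray(obs, dtype=float), np.asarray(sim, dtype=float)
    r = np.corrcoef(obs, sim)[0, 1]
    return float(r ** 2)


# Hypothetical DO values (mg/L); the simulation is biased by a constant +1.
obs = [8.0, 9.0, 10.0, 11.0, 12.0]
sim = [9.0, 10.0, 11.0, 12.0, 13.0]

print("RMSE:", rmse(obs, sim))
print("NSE :", nse(obs, sim))
print("R2  :", r2(obs, sim))
```

With a constant offset the relationship stays perfectly linear, so R^{2} remains at 1 while the NSE drops to 0.5 and the RMSE reports an error of 1.0 in the measured units. This is one reason the measures are usually read together rather than in isolation.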

## 3. Results

#### 3.1. Exploratory Factor Analysis (EFA) Results

#### 3.2. ANN Model Learning

#### 3.2.1. ANN Learning System

#### 3.2.2. ANN Learning Results

R^{2} was 0.810–0.929 for DO, 0.671–0.863 for BOD_{5}, 0.802–0.878 for COD, 0.766–0.842 for TOC, 0.747–0.906 for T-P, and 0.627–0.784 for SS. The NSE was 0.806–0.913 for DO, 0.576–0.853 for BOD_{5}, 0.769–0.878 for COD, 0.766–0.859 for TOC, 0.698–0.925 for T-P, and 0.315–0.673 for SS. The RMSE was 0.529–0.818 for DO, 0.214–0.473 for BOD_{5}, 0.320–0.683 for COD, 0.260–0.673 for TOC, 0.007–0.022 for T-P, and 1.792–5.569 for SS. R^{2} was found to be above 0.8 on average in six water quality variables at five unit watersheds. It showed a high model explanatory coefficient. The model evaluation results of water quality variables were generally good, except for the SS. Because the SS exhibits large variations in measurement values and the variation characteristics for precipitation events are dominant, the ANN model could be improved by reflecting the hydrological elements as much as possible. Furthermore, the model learning results of the Namgang E unit watershed were generally excellent.

#### 3.3. Evaluation of the ANN Model That Utilizes Probability Forecasts

R^{2} was 0.673–0.866 for DO, 0.315–0.673 for BOD_{5}, 0.570–0.926 for COD, 0.512–0.809 for TOC, 0.391–0.785 for T-P, and 0.471–0.602 for SS. The NSE was 0.658–0.865 for DO, 0.401–0.658 for BOD_{5}, 0.496–0.864 for COD, 0.507–0.749 for TOC, 0.341–0.705 for T-P, and 0.338–0.587 for SS. The RMSE was 0.675–1.012 for DO, 0.310–0.578 for BOD_{5}, 0.381–0.903 for COD, 0.283–0.718 for TOC, 0.009–0.032 for T-P, and 3.214–6.187 for SS.

## 4. Conclusions

- Based on the EFA results, the water temperature (W.T), air temperature (T), and dissolved oxygen (DO) showed negative correlations at most locations and were classified as the same factor. This indicates that the decrease in the solubility of gas (oxygen) with increasing W.T is reflected well. Immediately downstream of the Namgang Dam, water quality variables such as COD and nutrients were classified as the same factor. In Namgang E, BOD and Chl-a were classified as the same factor. This suggests that the native Chl-a and BOD are highly correlated owing to the hydraulically stagnant flow at the junction of the main stream and the tributary.
- Most of the meteorological variables were not classified together with the water quality variables. The meteorological variables are not direct influencing factors for the water quality variables but indirect factors related to the W.T or saturation, so they did not exhibit large shared variability with them. In other words, the nonlinear relationship between meteorological variables and water quality variables could not be examined statistically through EFA. We therefore attempted to build a model that embodies the nonlinear correlation between the meteorological factors and water quality factors through ANN model learning.
- A water quality prediction model was built for each unit watershed and evaluated using the coefficient of determination and the other measures; the results were good for all water quality variables except for the SS. The poorer SS results seem attributable to the large changes in observed values caused by rainfall-driven changes in watershed runoff characteristics; moreover, the number of observations is too small to reflect these variation characteristics. An enhanced model could be expected if detailed ANN learning were performed through continuous accumulation of water quality data from the existing water quality monitoring network. A statistically meaningful quantitative model evaluation is difficult owing to the limited probabilistic weather forecasting data, which started in 2014, and the irregular water quality measurement dates. However, accuracy can be expected to improve as data accumulate in the future.
- The meteorological and water quality changes in the watershed have large spatiotemporal variability. Water quality data exhibit the strongly nonlinear characteristics of the ecosystem because of very complex reaction mechanisms. Because the meteorological effects already contain some of the characteristics of water quality, probabilistic forecasting of water quality will be possible in the future through the ANN-based water quality forecast model.

## Author Contributions

## Funding

## Institutional Review Board Statement

## Data Availability Statement

## Acknowledgments

## Conflicts of Interest

## References

1. Lee, S.Y.; Dunn, R.J.K.; Young, R.A.; Connolly, R.M.; Dale, P.E.R.; Dehayr, R.; Lemckert, C.J.; McKinnon, S.; Powell, B.; Teasdale, P.R.; et al. Impact of urbanization on coastal wetland structure and function. Austral. Ecol. **2006**, 312, 149–163.
2. Freeman, L.A.; Corbett, D.R.; Fitzgerald, A.M.; Lemley, D.A.; Quigg, A.; Stepe, C.N. Impacts of Urbanization and Development on Estuarine Ecosystems and Water Quality. Estuaries Coasts **2019**, 42, 1821–1838.
3. Wellen, C.; Kamran-Disfani, A.R.; Arhonditsis, G.B. Evaluation of the current state of distributed watershed nutrient water quality modeling. Environ. Sci. Technol. **2015**, 49, 3278–3290.
4. Ji, Z.G. Hydrodynamics and Water Quality: Modeling Rivers, Lakes, and Estuaries; John Wiley & Sons: Hoboken, NJ, USA, 2017.
5. Wenyan, W.; Graeme, C.D.; Holger, R.M. Protocol for developing ANN models and its application to the assessment of the quality of the ANN model development process in drinking water quality modelling. Environ. Model. Softw. **2014**, 54, 108–127.
6. Kim, S.E.; Seo, I.W. Artificial Neural Network ensemble modeling with conjunctive data clustering for water quality prediction in rivers. J. Hydro-Environ. Res. **2015**, 9, 325–339.
7. Palani, S.; Liong, S.Y.; Tkalich, P. An ANN application for water quality forecasting. Mar. Pollut. Bull. **2008**, 56, 1586–1597.
8. Patki, V.K.; Jahagirdar, S.; Patil, Y.M.; Karale, R.; Nadagouda, A. Prediction of water quality in municipal distribution system. Mater. Today Proc. **2021**, in press.
9. Chang, F.J.; Tsai, Y.H.; Chen, P.A.; Alexandra, C.; Georges, V. Modeling water quality in an urban river using hydrological factors – Data driven approaches. J. Environ. Manag. **2015**, 151, 87–96.
10. Dunn, R.J.K.; Teasdale, P.R.; Warnken, J.; Jordan, M.A.; Arthur, J.M. Evaluation of the in situ, time-integrated DGT technique by monitoring changes in heavy metal concentrations in estuarine waters. Environ. Pollut. **2007**, 148, 213–220.
11. Jeong, K.S.; Kim, D.K.; Shin, H.S.; Yoon, J.D.; Kim, H.W.; Joo, G.J. Impact of summer rainfall on the seasonal water quality variation (chlorophyll a) in the regulated Nakdong River. KSCE J. Civil. Eng. **2011**, 15, 983–994.
12. Kim, S.E.; Seo, I.W.; Choi, S.Y. Assessment of water quality variation of a monitoring network using exploratory factor analysis and empirical orthogonal function. Environ. Model. Softw. **2017**, 94, 21–35.
13. Rojas, R. The Backpropagation Algorithm. In Neural Networks; Springer: Berlin/Heidelberg, Germany, 1996; pp. 149–182.
14. ASCE. Criteria for evaluation of watershed models. J. Irrig. Drainage Eng. ASCE **1993**, 119, 429–442.
15. Legates, D.R.; McCabe, G.J. Evaluating the use of “goodness-of-fit” measures in hydrologic and hydroclimatic model validation. Water Resour. Res. **1999**, 35, 233–241.
16. Moriasi, D.N.; Arnold, J.G.; Van Liew, M.W.; Bingner, R.L.; Harmel, R.D.; Veith, T.L. Model evaluation guidelines for systematic quantification of accuracy in watershed simulations. Trans. ASABE **2007**, 50, 885–900.

**Figure 5.** Flowchart for the development of the ANN-based water quality prediction model that applies the probability forecasts.

**Figure 6.** Learning results of the ANN-based water quality prediction model in the Namgang A unit watershed.

**Figure 7.** Learning results of the ANN-based water quality prediction model in the Namgang B unit watershed.

**Figure 8.** Learning results of the ANN-based water quality prediction model in the Namgang C unit watershed.

**Figure 9.** Learning results of the ANN-based water quality prediction model in the Namgang D unit watershed.

**Figure 10.** Learning results of the ANN-based water quality prediction model in the Namgang E unit watershed.

Weather Station | Input Variables | Collection Period | Reference |
---|---|---|---|
Sancheong | Precipitation, Relative Humidity, Temperature, Solar Radiation, Wind Speed | 2007–2016 | KMA * |
Jinju | Precipitation, Relative Humidity, Temperature, Solar Radiation, Wind Speed | 2007–2016 | KMA * |

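Gauging Station | Input Variables | Collection Period | Reference |
---|---|---|---|
Namgang A | Water Temperature, EC, pH, DO, BOD, COD, SS, T-N, NH_{3}-N, NO_{3}-N, T-P, PO_{4}-P, Chl-a, TOC, Flow | 2007–2016 | KWIS ** |
Namgang B | Water Temperature, EC, pH, DO, BOD, COD, SS, T-N, NH_{3}-N, NO_{3}-N, T-P, PO_{4}-P, Chl-a, TOC, Flow | 2007–2016 | KWIS ** |
Namgang C | Water Temperature, EC, pH, DO, BOD, COD, SS, T-N, NH_{3}-N, NO_{3}-N, T-P, PO_{4}-P, Chl-a, TOC, Flow | 2007–2016 | KWIS ** |
Namgang D | Water Temperature, EC, pH, DO, BOD, COD, SS, T-N, NH_{3}-N, NO_{3}-N, T-P, PO_{4}-P, Chl-a, TOC, Flow | 2007–2016 | KWIS ** |
Namgang E | Water Temperature, EC, pH, DO, BOD, COD, SS, T-N, NH_{3}-N, NO_{3}-N, T-P, PO_{4}-P, Chl-a, TOC, Flow | 2007–2016 | KWIS ** |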

Method | Basic Equation | Description of Variables |
---|---|---|
RMSE | $\mathrm{RMSE}=\sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(P_{i}-O_{i}\right)^{2}}$ | $O_{i}$ = observed value; $P_{i}$ = simulated value; $\overline{O}$ = mean observed value; $n$ = number of data |
NSE | $\mathrm{NSE}=1-\frac{\sum_{i=1}^{n}\left(O_{i}-P_{i}\right)^{2}}{\sum_{i=1}^{n}\left(O_{i}-\overline{O}\right)^{2}}$ | |
R^{2} | $\mathrm{R}^{2}=\frac{\sum_{i=1}^{n}\left(O_{i}-\overline{O}\right)^{2}-\sum_{i=1}^{n}\left(O_{i}-P_{i}\right)^{2}}{\sum_{i=1}^{n}\left(O_{i}-\overline{O}\right)^{2}}$ | |

Unit Watershed | Factor 1 Variables | Factor 1 Eigenvalue | Factor 1 Cumulative | Factor 2 Variables | Factor 2 Eigenvalue | Factor 2 Cumulative | Factor 3 Variables | Factor 3 Eigenvalue | Factor 3 Cumulative | Factor 4 Variables | Factor 4 Eigenvalue | Factor 4 Cumulative |
---|---|---|---|---|---|---|---|---|---|---|---|---|
Namgang A | W.T, T, DO, T-N, NO_{3}-N | 4.713 | 0.295 | Q, SS, Qs, COD, P | 2.671 | 0.462 | T-P, TOC, pH, BOD, Chl-a | 2.557 | 0.621 | Sun, R.H. | 1.534 | 0.717 |
Namgang B | W.T, T, DO, T-N, NO_{3}-N | 5.366 | 0.335 | COD, BOD, TOC, SS, T-P, Chl-a | 2.555 | 0.495 | Q, Qs, pH | 2.150 | 0.630 | Sun, R.H. | 1.455 | 0.720 |
Namgang C | W.T, T, DO, EC | 6.246 | 0.347 | SS, COD, TOC, T-P, Q | 2.755 | 0.500 | Sun, Rad, R.H., P | 2.262 | 0.626 | pH, BOD, Chl-a | 1.266 | 0.696 |
Namgang D | BOD, COD, TOC, T-P, Chl-a | 5.492 | 0.305 | W.T, T, EC, DO, T-N | 3.965 | 0.525 | Sun, R.H., P, Rad | 2.510 | 0.665 | pH, SS, Q | 1.907 | 0.771 |
Namgang E | W.T, T, EC, DO, T-N, NO_{3}-N, NH_{3}-N | 5.159 | 0.287 | BOD, COD, TOC, T-P, Chl-a | 4.008 | 0.509 | SS, Q, Q_{n}, pH, PO_{4}-P | 2.505 | 0.649 | Rad, R.H., Sun | 2.010 | 0.760 |

Common input variables (all unit watersheds and all prediction variables): Temperature_{t−1}, Temperature_{t}, Temperature_{t+1}, Precipitation_{t−1}, Precipitation_{t}, Precipitation_{t+1}

Unit Watershed | Water Quality Prediction Variable | Input Variable |
---|---|---|
Namgang A | DO_{t+1} | DO_{t}, DO_{t−1}, DO_{t−2}, T-N_{t} |
Namgang A | BOD_{t+1} | BOD_{t}, BOD_{t−1}, BOD_{t−2}, TOC_{t}, T-P_{t}, Chl-a_{t} |
Namgang A | COD_{t+1} | COD_{t}, COD_{t−1}, COD_{t−2}, SS_{t} |
Namgang A | TOC_{t+1} | TOC_{t}, TOC_{t−1}, TOC_{t−2}, BOD_{t}, T-P_{t}, Chl-a_{t} |
Namgang A | T-P_{t+1} | T-P_{t}, T-P_{t−1}, T-P_{t−2}, BOD_{t}, TOC_{t}, Chl-a_{t} |
Namgang A | SS_{t+1} | SS_{t}, SS_{t−1}, SS_{t−2}, COD_{t} |
Namgang B | DO_{t+1} | DO_{t}, DO_{t−1}, DO_{t−2}, T-N_{t} |
Namgang B | BOD_{t+1} | BOD_{t}, BOD_{t−1}, BOD_{t−2}, TOC_{t}, T-P_{t}, COD_{t}, SS_{t}, Chl-a_{t} |
Namgang B | COD_{t+1} | COD_{t}, COD_{t−1}, COD_{t−2}, BOD_{t}, TOC_{t}, T-P_{t}, SS_{t}, Chl-a_{t} |
Namgang B | TOC_{t+1} | TOC_{t}, TOC_{t−1}, TOC_{t−2}, BOD_{t}, T-P_{t}, COD_{t}, SS_{t}, Chl-a_{t} |
Namgang B | T-P_{t+1} | T-P_{t}, T-P_{t−1}, T-P_{t−2}, BOD_{t}, TOC_{t}, COD_{t}, SS_{t}, Chl-a_{t} |
Namgang B | SS_{t+1} | SS_{t}, SS_{t−1}, SS_{t−2}, BOD_{t}, TOC_{t}, T-P_{t}, COD_{t}, Chl-a_{t} |
Namgang C | DO_{t+1} | DO_{t}, DO_{t−1}, DO_{t−2} |
Namgang C | BOD_{t+1} | BOD_{t}, BOD_{t−1}, BOD_{t−2}, Chl-a_{t} |
Namgang C | COD_{t+1} | COD_{t}, COD_{t−1}, COD_{t−2}, TOC_{t}, T-P_{t}, SS_{t} |
Namgang C | TOC_{t+1} | TOC_{t}, TOC_{t−1}, TOC_{t−2}, COD_{t}, T-P_{t}, SS_{t} |
Namgang C | T-P_{t+1} | T-P_{t}, T-P_{t−1}, T-P_{t−2}, COD_{t}, TOC_{t}, SS_{t} |
Namgang C | SS_{t+1} | SS_{t}, SS_{t−1}, SS_{t−2}, COD_{t}, TOC_{t}, T-P_{t} |
Namgang D | DO_{t+1} | DO_{t}, DO_{t−1}, DO_{t−2}, T-N_{t} |
Namgang D | BOD_{t+1} | BOD_{t}, BOD_{t−1}, BOD_{t−2}, TOC_{t}, T-P_{t}, COD_{t}, Chl-a_{t} |
Namgang D | COD_{t+1} | COD_{t}, COD_{t−1}, COD_{t−2}, BOD_{t}, TOC_{t}, T-P_{t}, Chl-a_{t} |
Namgang D | TOC_{t+1} | TOC_{t}, TOC_{t−1}, TOC_{t−2}, BOD_{t}, T-P_{t}, COD_{t}, Chl-a_{t} |
Namgang D | T-P_{t+1} | T-P_{t}, T-P_{t−1}, T-P_{t−2}, BOD_{t}, TOC_{t}, COD_{t}, Chl-a_{t} |
Namgang D | SS_{t+1} | SS_{t}, SS_{t−1}, SS_{t−2} |
Namgang E | DO_{t+1} | DO_{t}, DO_{t−1}, DO_{t−2}, T-N_{t} |
Namgang E | BOD_{t+1} | BOD_{t}, BOD_{t−1}, BOD_{t−2}, TOC_{t}, T-P_{t}, COD_{t}, Chl-a_{t} |
Namgang E | COD_{t+1} | COD_{t}, COD_{t−1}, COD_{t−2}, BOD_{t}, TOC_{t}, T-P_{t}, Chl-a_{t} |
Namgang E | TOC_{t+1} | TOC_{t}, TOC_{t−1}, TOC_{t−2}, BOD_{t}, T-P_{t}, COD_{t}, Chl-a_{t} |
Namgang E | T-P_{t+1} | T-P_{t}, T-P_{t−1}, T-P_{t−2}, BOD_{t}, TOC_{t}, COD_{t}, Chl-a_{t} |
Namgang E | SS_{t+1} | SS_{t}, SS_{t−1}, SS_{t−2} |
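The lag structure in the input-variable table can be made concrete with a short sketch. It assembles the input rows and targets for one prediction variable from equally spaced series, using the DO_{t+1} configuration listed for Namgang A (three lags of DO, T-N at time t, and temperature and precipitation at t−1, t, and t+1). The function name `build_samples` and all sample values are hypothetical. In operation the t+1 weather terms would presumably come from the weather forecast, which is an assumption of this sketch rather than something shown in the table.

```python
import numpy as np


def build_samples(target, temp, precip, extras=()):
    """Assemble (X, y) pairs for predicting target[t+1].

    Each row holds, in order:
      target[t], target[t-1], target[t-2],
      temp[t-1], temp[t], temp[t+1],
      precip[t-1], precip[t], precip[t+1],
      then extra[t] for every additional series in `extras`.
    """
    target = np.asarray(target, dtype=float)
    temp = np.asarray(temp, dtype=float)
    precip = np.asarray(precip, dtype=float)
    extras = [np.asarray(e, dtype=float) for e in extras]

    rows, labels = [], []
    # t must leave room for two past steps and one future step.
    for t in range(2, len(target) - 1):
        row = [target[t], target[t - 1], target[t - 2],
               temp[t - 1], temp[t], temp[t + 1],
               precip[t - 1], precip[t], precip[t + 1]]
        row.extend(e[t] for e in extras)
        rows.append(row)
        labels.append(target[t + 1])
    return np.array(rows), np.array(labels)


# Hypothetical, equally spaced observations (not data from the study).
do = [10.0, 9.0, 8.0, 7.0, 6.0, 5.0]          # mg/L
tn = [2.0, 2.1, 2.2, 2.3, 2.4, 2.5]           # mg/L
temp = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]         # deg C
precip = [0.0, 0.0, 5.0, 0.0, 0.0, 0.0]       # mm

X, y = build_samples(do, temp, precip, extras=[tn])
print(X.shape, y)
```

Six time steps yield three usable samples because each row needs two earlier observations and one later step. The resulting `X` and `y` could then be passed to any regression model, including a feed-forward network.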

**Table 6.** Evaluation results for the ANN-based water quality prediction model that utilizes probability forecasts.

Unit Watershed | R^{2} DO | R^{2} BOD_{5} | R^{2} COD | R^{2} TOC | R^{2} T-P | R^{2} SS | RMSE DO | RMSE BOD_{5} | RMSE COD | RMSE TOC | RMSE T-P | RMSE SS | NSE DO | NSE BOD_{5} | NSE COD | NSE TOC | NSE T-P | NSE SS |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Namgang A | 0.793 | 0.602 | 0.612 | 0.512 | 0.561 | 0.598 | 0.872 | 0.420 | 0.801 | 0.718 | 0.032 | 3.889 | 0.798 | 0.597 | 0.525 | 0.507 | 0.409 | 0.587 |
Namgang B | 0.796 | 0.505 | 0.570 | 0.601 | 0.571 | 0.471 | 0.896 | 0.578 | 0.903 | 0.614 | 0.020 | 6.187 | 0.789 | 0.589 | 0.496 | 0.584 | 0.350 | 0.426 |
Namgang C | 0.866 | 0.315 | 0.761 | 0.730 | 0.629 | 0.529 | 0.807 | 0.448 | 0.405 | 0.283 | 0.009 | 4.761 | 0.865 | 0.401 | 0.764 | 0.730 | 0.595 | 0.504 |
Namgang D | 0.673 | 0.663 | 0.620 | 0.554 | 0.391 | 0.533 | 1.012 | 0.310 | 0.502 | 0.376 | 0.017 | 3.223 | 0.658 | 0.605 | 0.606 | 0.551 | 0.341 | 0.338 |
Namgang E | 0.854 | 0.673 | 0.926 | 0.809 | 0.785 | 0.602 | 0.675 | 0.472 | 0.381 | 0.424 | 0.012 | 3.214 | 0.847 | 0.658 | 0.864 | 0.749 | 0.705 | 0.561 |
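The per-variable ranges quoted in Section 3.3 are the column-wise minimum and maximum of Table 6 across the five unit watersheds. The sketch below recomputes them from the tabulated values as a cross-check; the variable names and the dictionary layout are illustrative choices, not part of the study.

```python
import numpy as np

variables = ["DO", "BOD5", "COD", "TOC", "T-P", "SS"]

# Rows: Namgang A, B, C, D, E; columns follow `variables` (values from Table 6).
table6 = {
    "R2": np.array([
        [0.793, 0.602, 0.612, 0.512, 0.561, 0.598],
        [0.796, 0.505, 0.570, 0.601, 0.571, 0.471],
        [0.866, 0.315, 0.761, 0.730, 0.629, 0.529],
        [0.673, 0.663, 0.620, 0.554, 0.391, 0.533],
        [0.854, 0.673, 0.926, 0.809, 0.785, 0.602],
    ]),
    "RMSE": np.array([
        [0.872, 0.420, 0.801, 0.718, 0.032, 3.889],
        [0.896, 0.578, 0.903, 0.614, 0.020, 6.187],
        [0.807, 0.448, 0.405, 0.283, 0.009, 4.761],
        [1.012, 0.310, 0.502, 0.376, 0.017, 3.223],
        [0.675, 0.472, 0.381, 0.424, 0.012, 3.214],
    ]),
    "NSE": np.array([
        [0.798, 0.597, 0.525, 0.507, 0.409, 0.587],
        [0.789, 0.589, 0.496, 0.584, 0.350, 0.426],
        [0.865, 0.401, 0.764, 0.730, 0.595, 0.504],
        [0.658, 0.605, 0.606, 0.551, 0.341, 0.338],
        [0.847, 0.658, 0.864, 0.749, 0.705, 0.561],
    ]),
}

# Column-wise (min, max) over the five watersheds for each metric.
ranges = {
    metric: {
        name: (float(values[:, j].min()), float(values[:, j].max()))
        for j, name in enumerate(variables)
    }
    for metric, values in table6.items()
}

for metric, per_variable in ranges.items():
    for name, (low, high) in per_variable.items():
        print(f"{metric:4s} {name:5s} {low:.3f}-{high:.3f}")
```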

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Jung, W.S.; Kim, S.E.; Kim, Y.D. Prediction of Surface Water Quality by Artificial Neural Network Model Using Probabilistic Weather Forecasting. *Water* **2021**, *13*, 2392.
https://doi.org/10.3390/w13172392
