Quantitating Wastewater Characteristic Parameters Using Neural Network Regression Modeling on Spectral Reflectance
Abstract
1. Introduction
- What data preprocessing tasks must be carried out on the WW spectral data to produce regression NN models with good prediction performance?
- What are the effects of common hyperparameters in NN modeling, including the number of hidden layers and number of neuron units in each hidden layer?
- How does the number of modelled outputs, i.e., WW parameters, affect the prediction performance of NN models?
- How can hyperparameter tuning and k-fold cross-validation on regression NN models improve prediction performance?
2. Methodology
2.1. Dataset
2.1.1. WW Data Source and Structure Overview
2.1.2. Training Set and Test Set
2.2. NN Model Training and Testing
2.2.1. Learning Cost Function
2.2.2. Optimization Algorithm, Learning Rate, Activation Function, and Training Epoch
2.2.3. Effect of Number of Hidden Layers, Number of Neuron Units, and Outputs
2.2.4. NN Model Hyperparameter Grid-Search
2.2.5. Repeated K-Fold Cross-Validation during NN Training
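The repeated k-fold idea named in this heading can be sketched in plain Python as follows. This is a minimal illustration, not the authors' implementation: the function name `repeated_kfold_indices`, the sample count, the number of folds, and the number of repeats are all hypothetical values chosen for the example. Scikit-Learn offers `RepeatedKFold` for the same purpose.

```python
import random


def repeated_kfold_indices(n_samples, n_splits, n_repeats, seed=0):
    """Yield (train_idx, val_idx) index lists for repeated k-fold CV.

    Each repeat reshuffles the sample order, then partitions it into
    n_splits folds; every fold serves once as the validation set.
    """
    rng = random.Random(seed)
    for _ in range(n_repeats):
        order = list(range(n_samples))
        rng.shuffle(order)
        # Fold sizes differ by at most one sample.
        base, extra = divmod(n_samples, n_splits)
        fold_sizes = [base + (1 if i < extra else 0) for i in range(n_splits)]
        start = 0
        for size in fold_sizes:
            val_idx = order[start:start + size]
            train_idx = order[:start] + order[start + size:]
            yield train_idx, val_idx
            start += size


# Hypothetical settings: 20 samples, 5 folds, 3 repeats.
splits = list(repeated_kfold_indices(n_samples=20, n_splits=5, n_repeats=3))
print(len(splits))  # 5 folds x 3 repeats = 15 train/validation pairs
```

Averaging the validation metric over all pairs gives a performance estimate that depends less on any single random partition of the training set.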
3. Results
3.1. Need for a Dedicated NN Model for Wastewater Stream Groups
3.2. Challenge with Increasing Number of NN Model Output Variables
3.3. Improving the NN Model via Hyperparameter Grid-Search and K-Fold Cross-Validation
4. Discussion
4.1. Answers to the Main Questions of the Study
4.2. Limitations of the Proposed Data Analytics
4.3. Perspective
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
NN Model Settings | Model Output(s) Y Settings |
---|---|
(1) One hidden layer: H1 with 32 neuron units; (2) One hidden layer: H1 with 1000 neuron units; (3) Two hidden layers: H1 with 64 neuron units, H2 with 32 neuron units; (4) Two hidden layers: H1 with 1000 neuron units, H2 with 32 neuron units | Multiple outputs: (1) BOD, COD, NH3-N, TDS, TA, TH (all); (2) BOD, COD, NH3-N, TDS, TA; (3) BOD, COD, NH3-N, TDS; (4) BOD, COD, NH3-N; (5) BOD, COD. Single output: (6) BOD; (7) COD; (8) NH3-N; (9) TDS; (10) TA; (11) TH |
Total Number of Modelling Settings = (NN Model Settings) × (Model Output(s) Settings) = 4 × 11 = 44 |
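The count of 44 modelling settings in the table above follows from crossing the 4 NN architectures with the 11 output configurations. The sketch below reproduces that enumeration with the Python standard library; the variable names and the string labels for the architectures are illustrative assumptions, not identifiers from the study's code.

```python
from itertools import product

# The four NN architectures from the table (labels are illustrative).
nn_settings = [
    "1 hidden layer: H1=32",
    "1 hidden layer: H1=1000",
    "2 hidden layers: H1=64, H2=32",
    "2 hidden layers: H1=1000, H2=32",
]

# The eleven output configurations: five multi-output, six single-output.
output_settings = [
    ("BOD", "COD", "NH3-N", "TDS", "TA", "TH"),
    ("BOD", "COD", "NH3-N", "TDS", "TA"),
    ("BOD", "COD", "NH3-N", "TDS"),
    ("BOD", "COD", "NH3-N"),
    ("BOD", "COD"),
    ("BOD",),
    ("COD",),
    ("NH3-N",),
    ("TDS",),
    ("TA",),
    ("TH",),
]

# Full cross of architectures and output configurations.
modelling_settings = list(product(nn_settings, output_settings))
print(len(modelling_settings))  # 4 x 11 = 44
```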
NN Model Hyperparameter Grid-Search Settings | Best NN Model Hyperparameter Setting |
---|---|
Optimizer: [‘Adam’, ‘Adadelta’, ‘SGD’] | Optimizer: ‘Adam’ |
Activation function in H1: [‘ReLU’, ‘Linear’] | Activation function in H1: ‘Linear’ |
Activation function in H2: [‘ReLU’, ‘Linear’] | Activation function in H2: ‘ReLU’ |
Activation function in output layer: [‘ReLU’, ‘Linear’] | Activation function in output layer: ‘Linear’ |
Number of neuron units in H1: [1600, 1000, 64] | Number of neuron units in H1: 1000 |
Number of neuron units in H2: [64, 32, 9] | Number of neuron units in H2: 32 |
Learning rate *: [0.00001, 0.0001, 0.001, 0.01] | Learning rate *: 0.0001 |
Total hyperparameter grid-search settings with full-factorial grid via Scikit-Learn ‘GridSearchCV’ = 864 |
Epochs for each NN model training setting = 5000 |
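The total of 864 grid-search settings can be checked by multiplying the number of candidate values for each hyperparameter in the table above. The following sketch does this with the standard library only. The dictionary keys are hypothetical names chosen for illustration; they are not claimed to match the argument names used with `GridSearchCV` in the study.

```python
from itertools import product
from math import prod

# Candidate values copied from the grid-search table (key names are assumed).
param_grid = {
    "optimizer": ["Adam", "Adadelta", "SGD"],
    "activation_h1": ["ReLU", "Linear"],
    "activation_h2": ["ReLU", "Linear"],
    "activation_out": ["ReLU", "Linear"],
    "units_h1": [1600, 1000, 64],
    "units_h2": [64, 32, 9],
    "learning_rate": [0.00001, 0.0001, 0.001, 0.01],
}

# Size of the full-factorial grid: 3 * 2 * 2 * 2 * 3 * 3 * 4.
n_settings = prod(len(values) for values in param_grid.values())
print(n_settings)  # 864

# Explicit enumeration, one dict per candidate setting.
all_settings = [
    dict(zip(param_grid, combo)) for combo in product(*param_grid.values())
]

# The best setting reported in the table is one point of this grid.
best_setting = {
    "optimizer": "Adam",
    "activation_h1": "Linear",
    "activation_h2": "ReLU",
    "activation_out": "Linear",
    "units_h1": 1000,
    "units_h2": 32,
    "learning_rate": 0.0001,
}
print(best_setting in all_settings)  # True
```

With 5000 epochs per setting, a grid of this size is computationally expensive, and any cross-validation folds multiply the cost further.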
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Fortela, D.L.B.; Travis, A.; Mikolajczyk, A.P.; Sharp, W.; Revellame, E.; Holmes, W.; Hernandez, R.; Zappi, M.E. Quantitating Wastewater Characteristic Parameters Using Neural Network Regression Modeling on Spectral Reflectance. Clean Technol. 2023, 5, 1186-1202. https://doi.org/10.3390/cleantechnol5040059