# A Practical Protocol for the Experimental Design of Comparative Studies on Water Treatment

Long Ho, Olivier Thas, Wout Van Echelpoel and Peter Goethals

## Abstract


## 1. Introduction

## 2. The Protocol

### 2.1. Stage I: Pre-Experimental Planning

#### 2.1.1. Target Definition and a List of Variables of Interest

#### 2.1.2. Preliminary Study

#### 2.1.3. Selection of Design Factors

The number of runs in a 2^k full factorial design grows exponentially with the number of factors k (e.g., with k = 10, 2^10 = 1024 or, for three-level factors, 3^10 = 59,049 experiments are required). Therefore, one may prefer a fractional factorial design, which reduces the total number of experiments by a factor of 2, 4, or 8. This reduction comes at a cost: not all effects can be estimated, and estimable effects may be totally confounded with other (interaction) effects. Care must be taken that the effects of greatest interest are estimable and not confounded with other important effects. For each fractional factorial design, the list of estimable and confounded effects can be computed; this list is referred to as the alias structure of the design. Textbooks, e.g., Quinn and Keough [7] and Montgomery [8], and software are available to assist with setting up an appropriate fractional factorial design. Special designs have been proposed to reduce the number of experiments while still allowing the estimation of the typically important effects (e.g., main, or linear, effects and first-order interactions), such as Plackett‒Burman and Box‒Behnken designs. Once the most important factors have been selected with such screening designs, these factors can be studied with more precision in a follow-up experiment. The same designs can be used with a smaller number of factors; at that point a full factorial design, or a fractional factorial design with a larger fraction, can be considered, so that more effects (e.g., interaction or nonlinear effects) can be estimated. Whenever possible, replication of experiments is advocated to further increase the precision and power of the statistical analysis.
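As an illustration of how a fractional factorial design and its aliasing arise, the following minimal Python sketch (not part of the original protocol; the function name and −1/+1 level coding are our own choices) generates a two-level design from a defining relation. Here the generator D = ABC turns a 16-run 2^4 full factorial into an 8-run 2^(4−1) fraction:

```python
from itertools import product

def fractional_factorial(k_base, generators):
    """Build a two-level fractional factorial design.

    k_base: number of basic factors (a full 2^k_base design is built for them).
    generators: dict mapping each extra factor to the indices of the basic
        factors whose product defines it, e.g. {"D": (0, 1, 2)} encodes
        the defining relation D = ABC (equivalently I = ABCD).
    Factor levels are coded -1/+1, one tuple per experimental run.
    """
    runs = []
    for base in product((-1, 1), repeat=k_base):
        row = list(base)
        for cols in generators.values():
            level = 1
            for c in cols:
                level *= base[c]  # extra factor = product of basic columns
            row.append(level)
        runs.append(tuple(row))
    return runs

# 2^(4-1) design: 8 runs instead of 16, with factor D generated as ABC
design = fractional_factorial(3, {"D": (0, 1, 2)})
```

Because D is generated as the product ABC, its main effect is completely confounded (aliased) with the ABC interaction; for larger designs, the textbooks and software cited above report the full alias structure.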

### 2.2. Stage II: Detailed Experimental Planning

#### 2.2.1. Predictive Models

#### 2.2.2. Hypothesis Testing

The evidence in the data against a null hypothesis (H_{0}) is measured by a test statistic. Using probability theory, the null hypothesis is formally rejected when the test statistic exceeds a threshold, thereby controlling the probability of falsely rejecting the null hypothesis at a small fixed value (the significance level, α). The correct use of probability theory, however, often requires assumptions on the distribution of the response variable, and environmental engineers have frequently failed to check these assumptions. A critical example is the assumed independence of the collected samples, which is frequently violated because collected data depend on time and space [15]. In particular, longitudinal and/or spatial measurements taken close together in time and/or space show larger correlation than observations taken further apart, a phenomenon known as autocorrelation. To correct for the impact of spatiotemporal autocorrelation, mixed-effects modeling is highly advisable as a valid statistical inference procedure, since mixed-effects methods can incorporate the dependency through the introduction of random effects [16]. Other statistical approaches are also recommended, including autocovariate regression, spatial eigenvector mapping, autoregressive models, and generalized estimating equations [17].

#### 2.2.3. Power Analysis Tests

The power of a statistical test is the probability of correctly rejecting a false null hypothesis (H_{0}). More specifically, under the alternative hypothesis (H_{a}), this probability equals 1 − β, where β, the type II error rate, is the chance of falsely retaining H_{0} (Figure 3a). The power depends on three factors: (1) the significance level (α); (2) the sample size and the variance of the experimental observations; and (3) the true effect size [18]. Firstly, as α and β are inversely related, increasing α decreases β and hence increases power (Figure 3b). The balance between α and β depends on the research objective, but α is normally set smaller than β because the consequences of a false positive inference are considered more serious than those of a false negative [19]. Figure 3c shows the relationship between sample size and power using the sampling distributions under H_{0} and H_{a}: when the sample size increases, the variance of the sample mean decreases, reducing the overlap between the two distributions and thereby increasing the power. Lastly, the true effect size, i.e., the difference between the two means in a two-sample t-test, is the third controlling factor: when the true effect size increases, the overlap between the two distributions decreases, resulting in higher power (Figure 3d). Based on these relationships, the required sample size for an experiment can be calculated. Specifying the variance is the most difficult part because no data are yet available at the design stage of the study; this is another reason to set up a small-scale preliminary experiment and obtain a more reliable estimate of the variance.
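The simulation-based logic behind such a power calculation can be sketched as follows (a hypothetical Python example using a normal approximation to the two-sample t-test; the case study below used R with dedicated packages): repeatedly simulate experiments under the alternative hypothesis and count how often H_{0} is rejected.

```python
import random
from statistics import NormalDist, fmean, stdev

def power_two_sample(n, effect, alpha=0.05, sims=4000, seed=42):
    """Monte Carlo power of a two-sample test (normal approximation).

    n: observations per group; effect: true mean difference in units of
    the within-group standard deviation. Returns the fraction of simulated
    experiments in which H0 (equal means) is rejected at level alpha.
    """
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    rejections = 0
    for _ in range(sims):
        a = [rng.gauss(0, 1) for _ in range(n)]       # control group
        b = [rng.gauss(effect, 1) for _ in range(n)]  # treatment group
        se = ((stdev(a) ** 2 + stdev(b) ** 2) / n) ** 0.5
        z = (fmean(b) - fmean(a)) / se
        if abs(z) > z_crit:
            rejections += 1
    return rejections / sims

# Power grows with sample size at a fixed effect size (0.5 SD here)
p_small = power_two_sample(n=20, effect=0.5)
p_large_n = power_two_sample(n=80, effect=0.5)
```

In this sketch, the estimated power at n = 20 per group falls well below the conventional 0.8 target, while n = 80 exceeds it; searching over n in this way yields the required sample size for a given effect size, variance, and α.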

### 2.3. Stage III: Lab Experiment Execution

#### 2.3.1. Sampling Plan

#### 2.3.2. Experimental Implementation

### 2.4. Stage IV: Advanced Data Analysis

#### 2.4.1. Results Analysis and Interpretation

#### 2.4.2. Conclusions and Recommendations

### 2.5. The Loops

## 3. A Case Study: Performance Comparison of a Conventional Activated Sludge and a Waste Stabilization Pond in a Peak Load Scenario

### 3.1. Pre-Experiments

#### 3.1.1. Step 1: Target Definition and a List of Variables of Interest

#### 3.1.2. Step 2: Preliminary Study

#### 3.1.3. Step 3: Selection of Design Factors

Operational variables, including dissolved oxygen (DO) (mg O_{2} × L^{−1}), EC (µS × cm^{−1}), and the sludge volume index (SVI) (mL × g^{−1}), were manually monitored on a daily basis. Some uncontrollable factors were identified, including chemical contamination of the influent, fluctuations of the peristaltic pump, and variation of the influent wastewater constituents. Although great attention was paid to limiting their influence, these uncontrollable factors were the main reasons for the variation among replicate samples.

### 3.2. Experimental Planning

#### 3.2.1. Step 4: Preliminary Models

#### 3.2.2. Step 5: Hypothesis Testing

#### 3.2.3. Step 6: Power Analysis Tests

The effect size was specified in mg × L^{−1} for both COD and TN. The significance level α was set at 0.05. The simulation-based power analysis was carried out in R [29]. The results of these simulations can be found in Supplementary Material C.

### 3.3. Experimental Conducting

#### 3.3.1. Step 7: Sampling Plan

#### 3.3.2. Step 8: Experimental Implementation

### 3.4. Experimental Analysis

#### 3.4.1. Step 9: Statistical Analysis and Results Interpretation

#### 3.4.2. Step 10: Conclusions and Recommendations

- Both systems showed a relatively high capacity for removing organic matter (>90%), while higher nitrogen removal was obtained in the WSP systems compared to the CAS;
- Regarding resilience, the WSP systems proved their ability to replace CAS in dealing with the shock load, especially regarding nitrogen removal;
- First-order kinetic models showed higher accuracy for the CAS systems than for the WSP systems; a more sophisticated model is suggested for further studies, such as system optimization and performance analysis;
- To investigate the shock-load tolerance threshold of both systems, scenarios with higher-strength wastewater can be implemented in future experiments;
- To assess the sustainability of the two systems, other indicators of economic performance and societal sustainability need to be considered in subsequent studies.

## 4. Discussion

## 5. Conclusions

## Supplementary Materials

## Author Contributions

## Funding

## Acknowledgments

## Conflicts of Interest

## References

- Water, U. Tackling a Global Crisis: International Year of Sanitation 2008. Available online: http://www.wsscc.org/fileadmin/files/pdf/publication/IYS_2008_tackling_a_global_crisis.pdf (accessed on 23 April 2016).
- Muga, H.E.; Mihelcic, J.R. Sustainability of wastewater treatment technologies. *J. Environ. Manag.* **2008**, *88*, 437–447.
- van Loosdrecht, M.C.; Nielsen, P.H.; Lopez-Vazquez, C.M.; Brdjanovic, D. *Experimental Methods in Wastewater Treatment*; IWA Publishing: London, UK, 2016.
- APHA (American Public Health Association). *Standard Methods for the Examination of Water and Wastewater*; APHA: Washington, DC, USA, 2005.
- Johnson, P.C.D.; Barry, S.J.E.; Ferguson, H.M.; Muller, P. Power analysis for generalized linear mixed models in ecology and evolution. *Methods Ecol. Evol.* **2015**, *6*, 133–142.
- Doran, G.T. There’s a S.M.A.R.T. way to write management’s goals and objectives. *Manag. Rev.* **1981**, *70*, 35–36.
- Quinn, G.P.; Keough, M.J. *Experimental Design and Data Analysis for Biologists*; Cambridge University Press: Cambridge, UK, 2002.
- Montgomery, D.C. *Design and Analysis of Experiments*; Wiley: Hoboken, NJ, USA, 2013.
- Goos, P.; Jones, B. *Optimal Design of Experiments: A Case Study Approach*; Wiley: Hoboken, NJ, USA, 2011.
- Casler, M.D. Fundamentals of experimental design: Guidelines for designing successful experiments. *Agron. J.* **2015**, *107*, 692–705.
- Claeys, F.; Chtepen, M.; Benedetti, L.; Dhoedt, B.; Vanrolleghem, P.A. Distributed virtual experiments in water quality management. *Water Sci. Technol.* **2006**, *53*, 297–305.
- Refsgaard, J.C.; van der Sluijs, J.P.; Hojberg, A.L.; Vanrolleghem, P.A. Uncertainty in the environmental modelling process—A framework and guidance. *Environ. Model. Softw.* **2007**, *22*, 1543–1556.
- Ho, L.T.; Van Echelpoel, W.; Goethals, P.L.M. Design of waste stabilization pond systems: A review. *Water Res.* **2017**, *123*, 236–248.
- Popper, K. *The Logic of Scientific Discovery*; Routledge: Abingdon, UK, 2005.
- Keitt, T.H.; Bjornstad, O.N.; Dixon, P.M.; Citron-Pousty, S. Accounting for spatial pattern when modeling organism-environment interactions. *Ecography* **2002**, *25*, 616–625.
- Zuur, A.F.; Ieno, E.N.; Walker, N.J.; Saveliev, A.A.; Smith, G.M. *Mixed Effects Models and Extensions in Ecology with R*; Springer: New York, NY, USA, 2009.
- Dormann, C.F.; McPherson, J.M.; Araujo, M.B.; Bivand, R.; Bolliger, J.; Carl, G.; Davies, R.G.; Hirzel, A.; Jetz, W.; Kissling, W.D.; et al. Methods to account for spatial autocorrelation in the analysis of species distributional data: A review. *Ecography* **2007**, *30*, 609–628.
- Cohen, J. *Statistical Power Analysis for the Behavioral Sciences*, 2nd ed.; Academic Press: Cambridge, MA, USA, 1988.
- Krzywinski, M.; Altman, N. Points of significance: Power and sample size. *Nat. Methods* **2013**, *10*, 1139–1140.
- Vanrolleghem, P.A.; Schilling, W.; Rauch, W.; Krebs, P.; Aalderink, H. Setting up measuring campaigns for integrated wastewater modelling. *Water Sci. Technol.* **1999**, *39*, 257–268.
- Festing, M.F.W.; Altman, D.G. Guidelines for the design and statistical analysis of experiments using laboratory animals. *ILAR J.* **2002**, *43*, 244–258.
- Cleveland, W.S. *Visualizing Data*; AT&T Bell Laboratories: Murray Hill, NJ, USA, 1993.
- Dochain, D.; Gregoire, S.; Pauss, A.; Schaegger, M. Dynamical modelling of a waste stabilisation pond. *Bioprocess Biosyst. Eng.* **2003**, *26*, 19–26.
- Verstraete, W.; Vlaeminck, S.E. ZeroWasteWater: Short-cycling of wastewater resources for sustainable cities of the future. *Int. J. Sustain. Dev. World* **2011**, *18*, 253–264.
- Mara, D.D. Waste stabilization ponds: Past, present and future. *Desalin. Water Treat.* **2009**, *4*, 85–88.
- Ho, L.; Van Echelpoel, W.; Charalambous, P.; Gordillo, A.; Thas, O.; Goethals, P. Statistically-based comparison of the removal efficiencies and resilience capacities between conventional and natural wastewater treatment systems: A peak load scenario. *Water* **2018**, *10*, 328.
- Reichert, P. Aquasim—A tool for simulation and data analysis of aquatic systems. *Water Sci. Technol.* **1994**, *30*, 21–30.
- Morrell, C.H. Likelihood ratio testing of variance components in the linear mixed-effects model using restricted maximum likelihood. *Biometrics* **1998**, *54*, 1560–1568.
- R Core Team. *R: A Language and Environment for Statistical Computing*; R Foundation for Statistical Computing: Vienna, Austria, 2014; ISBN 3-900051-07-0.
- Pinheiro, J.; Bates, D.; DebRoy, S.; Sarkar, D.; R Development Core Team. *nlme: Linear and Nonlinear Mixed Effects Models*; R Package Version 3.1-103; R Foundation for Statistical Computing: Vienna, Austria, 2013.
- Mitchell, C.; McNevin, D. Alternative analysis of BOD removal in subsurface flow constructed wetlands employing Monod kinetics. *Water Res.* **2001**, *35*, 1295–1303.
- Vanrolleghem, P.A.; Gernaey, K.; Petersen, B.; De Clercq, B.; Coen, F.; Ottoy, J.P. Limitations of short-term experiments designed for identification of activated sludge biodegradation models by fast dynamic phenomena. *Comput. Appl. Biotechnol.* **1998**, *31*, 535–540.
- Van der Star, W.R.; Abma, W.R.; Blommers, D.; Mulder, J.W.; Tokutomi, T.; Strous, M.; Picioreanu, C.; Van Loosdrecht, M.C. Startup of reactors for anoxic ammonium oxidation: Experiences from the first full-scale anammox reactor in Rotterdam. *Water Res.* **2007**, *41*, 4149–4163.
- Nakajima, J.; Sakka, M.; Kimura, T.; Furukawa, K.; Sakka, K. Enrichment of anammox bacteria from marine environment for the construction of a bioremediation reactor. *Appl. Microbiol. Biotechnol.* **2008**, *77*, 1159–1166.
- Ho, L.; Pompeu, C.; Van Echelpoel, W.; Thas, O.; Goethals, P. Model-based analysis of increased loads on the performance of activated sludge and waste stabilization ponds. *Water* **2018**, *10*, 1410.
- Ho, L.T.; Alvarado, A.; Larriva, J.; Pompeu, C.; Goethals, P. An integrated mechanistic modeling of a facultative pond: Parameter estimation and uncertainty analysis. *Water Res.* **2019**, *151*, 170–182.
- Reichert, P.; Vanrolleghem, P. Identifiability and uncertainty analysis of the river water quality model no. 1 (RWQM1). *Water Sci. Technol.* **2001**, *43*, 329–338.
- Spriet, J. Structure characterization—An overview. *IFAC Proc. Vol.* **1985**, *18*, 749–756.
- Reichert, P.; Omlin, M. On the usefulness of overparameterized ecological models. *Ecol. Model.* **1997**, *95*, 289–299.
- Henze, M.; van Loosdrecht, M.; Ekama, G.A.; Brdjanovic, D. *Biological Wastewater Treatment: Principles, Modelling and Design*; IWA Publishing: London, UK, 2008.
- Refsgaard, J.C.; Henriksen, H.J.; Harrar, W.G.; Scholten, H.; Kassahun, A. Quality assurance in model based water management—Review of existing practice and outline of new approaches. *Environ. Model. Softw.* **2005**, *20*, 1201–1215.
- Jakeman, A.J.; Letcher, R.A.; Norton, J.P. Ten iterative steps in development and evaluation of environmental models. *Environ. Model. Softw.* **2006**, *21*, 602–614.
- Aumann, C.A. A methodology for developing simulation models of complex systems. *Ecol. Model.* **2007**, *202*, 385–396.
- Todo, K.; Sato, K. Directive 2000/60/EC of the European Parliament and of the Council of 23 October 2000 establishing a framework for Community action in the field of water policy. *Environ. Res. Q.* **2002**, 66–106.
- Walker, W.E.; Harremoës, P.; Rotmans, J.; van der Sluijs, J.P.; van Asselt, M.B.; Janssen, P.; Krayer von Krauss, M.P. Defining uncertainty: A conceptual basis for uncertainty management in model-based decision support. *Integr. Assess.* **2003**, *4*, 5–17.
- Reichert, P. A standard interface between simulation programs and systems analysis software. *Water Sci. Technol.* **2006**, *53*, 267–275.
- Matott, L.S.; Babendreier, J.E.; Purucker, S.T. Evaluating uncertainty in integrated environmental models: A review of concepts and tools. *Water Resour. Res.* **2009**, *45*.
- Liptak, B.G. *Analytical Instrumentation*; Taylor & Francis: Abingdon, UK, 1994.
- Jennions, M.D.; Moller, A.P. A survey of the statistical power of research in behavioral ecology and animal behavior. *Behav. Ecol.* **2003**, *14*, 438–445.
- Ioannidis, J.P.A. Why most published research findings are false. *PLoS Med.* **2005**, *2*, 696–701.
- Murphy, K.R.; Myors, B.; Wolach, A. *Statistical Power Analysis: A Simple and General Model for Traditional and Modern Hypothesis Tests*; Taylor & Francis: Abingdon, UK, 2003.
- Faul, F.; Erdfelder, E.; Lang, A.G.; Buchner, A. G*Power 3: A flexible statistical power analysis program for the social, behavioral, and biomedical sciences. *Behav. Res. Methods* **2007**, *39*, 175–191.
- Muthen, L.K.; Muthen, B.O. How to use a Monte Carlo study to decide on sample size and determine power. *Struct. Equ. Model.* **2002**, *9*, 599–620.
- Martin, J.G.A.; Nussey, D.H.; Wilson, A.J.; Reale, D. Measuring individual differences in reaction norms in field and experimental studies: A power analysis of random regression models. *Methods Ecol. Evol.* **2011**, *2*, 362–374.
- Reich, N.G.; Myers, J.A.; Obeng, D.; Milstone, A.M.; Perl, T.M. Empirical power and sample size calculations for cluster-randomized and cluster-randomized crossover studies. *PLoS ONE* **2012**, *7*, e35564.
- Donohue, M.; Edland, S. *longpower: Power and Sample Size Calculators for Longitudinal Data*; R Package Version 1.0-11; 2013.
- Green, P.; MacLeod, C.J. simr: An R package for power analysis of generalized linear mixed models by simulation. *Methods Ecol. Evol.* **2016**, *7*, 493–498.
- Bas, D.; Boyaci, I.H. Modeling and optimization I: Usability of response surface methodology. *J. Food Eng.* **2007**, *78*, 836–845.
- Myers, R.H.; Montgomery, D.C.; Vining, G.G.; Borror, C.M.; Kowalski, S.M. Response surface methodology: A retrospective and literature survey. *J. Qual. Technol.* **2004**, *36*, 53–77.
- Box, G.E.P.; Draper, N.R. *Response Surfaces, Mixtures, and Ridge Analyses*; Wiley: Hoboken, NJ, USA, 2007.
- Jones, B.; Nachtsheim, C.J. Split-plot designs: What, why, and how. *J. Qual. Technol.* **2009**, *41*, 340–361.

**Figure 1.** Overview of different types of comparative studies on water treatment and the use of comparative indicators.

**Figure 2.** Main structure of the protocol. The protocol includes two feedback loops that allow for new research objectives or research questions, leading to further experiments.

**Figure 3.** The effect of the significance level (α), sample size, and true effect size on statistical power. (**a**) Statistical power for a one-sample test of the mean of a normal distribution with known variance. The statistical power increases when: (**b**) α increases or β decreases; (**c**) the sample size increases; (**d**) the effect size increases [19].

**Figure 4.** The effluent pollutant levels of the CAS and WSP, with the p-values of the hypothesis tests, during the peak scenario ((**a**): COD; (**b**): TN).

**Figure 5.** Predicted and observed effluent pollutant levels of the conventional activated sludge ((**a**) and (**b**)) and waste stabilization pond ((**c**) and (**d**)) systems during the shock load scenario.

| Null Hypotheses | Performance Comparison |
| --- | --- |
| H_{0-1}: The mean pollutant levels in the effluent of the two systems are equal during the beginning period. | Removal efficiency |
| H_{0-2}: The mean pollutant levels in the effluent of the two systems are equal during the shock load. | Resilience capacity |
| H_{0-3}: The mean pollutant levels in the effluent of the two systems are equal during the recovery phase. | Removal efficiency |
| H_{0-4}: The mean pollutant levels in the effluent are equal before and after the peak load. | Recoverability |

© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Ho, L.; Thas, O.; Van Echelpoel, W.; Goethals, P.
A Practical Protocol for the Experimental Design of Comparative Studies on Water Treatment. *Water* **2019**, *11*, 162.
https://doi.org/10.3390/w11010162
