Relationship between Instant Sampling and Daily Average Values of COD for Urban Wastewater Treatment Plants in China

Abstract: Most guidelines for urban wastewater treatment plants (WWTPs) define discharge standards and limits on a time-averaged basis (e.g., a maximum daily average) to determine performance and compliance. Nevertheless, a systematic analysis of the relationship between instant sampling and daily average discharge concentration values is lacking. The present study used chemical oxygen demand (COD) automated monitoring data collected from 1738 WWTPs in China to examine the relationship between instant sampling and daily average values. A ratio model (K value) was developed to study the relationship between the reliability of the instant sampling value and the daily average limit for the COD measurements. The K value revealed that the ratio of COD instant and daily average measured concentrations for WWTPs in China collectively ranged from 1.00 to 1.45. The results of this study suggest setting the K value of COD to 1.3 for WWTPs in China to estimate the instant sampling limit of COD that corresponds to the daily average limit. For most WWTPs in China, to ensure stable compliance with the daily average limit, the instant sampling value of COD should exceed the instant sampling limit, which is 1.3 times the daily average limit, no more than five times in 24 h.


Introduction
Like most countries worldwide, China's discharge standards have been designed to regulate the end-of-pipe wastewater discharges. The averaging time is very important in the determination of discharge limits. Most countries and international organizations set the discharge limits according to a certain time, such as hourly, daily, monthly, and annually, while others, such as Germany, set them through instant sampling limits [1]. The considerations of setting the averaging time on the discharge limits include (1) acute or chronic environmental effects of the pollutant(s), (2) the variations of the industrial process, (3) the time needed to obtain statistically representative samples, (4) the response time of the instrument involved, and (5) environmental objectives [2].
The "daily average limit" was adopted in China's wastewater discharge standards for most of the regulated pollutants in order to reflect the stability of wastewater treatment systems and avoid acute toxic effects from above-the-limit wastewater discharge at a specific time. However, with the popularization of automatic monitoring systems, many continuous instant sampling discharge concentration values (e.g., one per hour) are now available, and these values fluctuate. Thus, a study characterizing the fluctuation of the instant sampling values, and its relationship with the daily average values, would help when automated monitoring data are used in discharge standards compliance assessment.
Three main challenges need to be addressed for determining the relationship between instant sampling and daily average values. The first is the determination of the concentration level of instant sampling for a certain daily average concentration level, which is helpful in defining the instant sampling limit under the same treatment condition. The second is the determination of the number of failures in meeting the instant sampling limit, which would lead to exceeding the daily average limit. The third is an evaluation of the probabilities of failures or non-failures of a wastewater treatment system, which can be used to solve the second problem that is mentioned above.
Some discharge limits set in other countries have demonstrated certain relationships between different averaging times of the limits. For example, the "maximum for any one day" in the United States (US) water pollutant discharge standards is generally twice the "monthly average" [3][4][5][6]. The "7-day average" in "Secondary Treatment Standard" (40 CFR 133) of the US is 1.5 times that of the "30-day average" [7]. In Japan's National Effluent Standards, the ratio of "permissible limit" to "daily average" of chemical oxygen demand (COD), biological oxygen demand (BOD), and total suspended solids (TSS) is 1.33, and the ratio for the nitrogen and phosphorus is 2 [8]. In Canada's paper industry discharge requirements, the ratio of the daily average to the monthly average of BOD ranges from 1.6 to 2 and it is different in different provinces [9]. However, there has been a lack of a systematic method for evaluating the ratio between two different averaging times of limits, particularly for the ratio between the instant sampling limit and the daily average limit.
Reliability analysis has been used to evaluate the probability of the non-failures of a wastewater treatment system. For example, Wheatland used statistical methods in 1972 to discuss the relationship between the averages of the BOD and SS discharge concentrations and the limits of municipal wastewater treatment plants in the United Kingdom (UK). He proposed that the average discharge concentration should be below a certain value in order to meet the limits under a certain probability (such as 50% or 95%) [10]. Niku et al. also used statistical methods to study the discharge stability and reliability of activated sludge, biofilms, and other wastewater treatment processes [11][12][13][14]. Djeddou et al. evaluated the daily discharge stability of COD, BOD, and TSS of municipal wastewater treatment plants by an activated sludge process in eastern Albania while using a statistical model [15]. However, there is still a lack of research on the relationship between the reliability of a short averaging time and a longer averaging time, which is important in treatment process control and a discharge standards compliance assessment.
At present, approximately 40% of the urban WWTPs in China are listed as key water pollution sources, and the automatic monitoring systems of COD and ammonia nitrogen are installed as required. Our previous study [16] used the COD automated monitoring data of WWTPs to analyze the statistical distribution and developed the discharge limits for COD that were based on statistical methods. In the present study, we further investigated the relationship between the instant sampling and daily average values of COD for WWTPs in China by determining the ratio between the two values. Additionally, we studied the relationship between the reliability of the instant sampling value and the daily average limit. The results of this study may provide relevant insights for more precise COD control in WWTPs in China and discharge standards compliance assessments.

Data Source and Analysis
The automated monitoring data of the COD concentrations that were discharged from 1753 WWTPs in 2015 were used as the data source. Among them, 139 were from Northeast China, 324 from the North, 615 from the East, 416 from the South, 81 from the Northwest, and 178 from the Southwest. Among the WWTPs, 185 were large-scale, 1385 were mid-scale, and 183 were small-scale.
For treatment techniques, 106 WWTPs used the anoxic-oxic (A/O) process, 468 the anaerobic-anoxic-oxic (A/A/O) process, 347 the sequencing batch reactor (SBR) process, 544 the oxidation ditch process, 185 the traditional activated sludge process, 45 the biological membrane process, and 58 used other techniques, referring to some tertiary treatment processes (ozonation, etc.) that were used after secondary treatment. Regarding the ratio of industrial wastewater treated, 126 WWTPs were >70%, 63 WWTPs were between 50% and 70% (including 50%), and 1564 WWTPs were <50%. The selected WWTPs were representative of Chinese WWTPs in terms of locations, sizes, treatment techniques, and ratios of the received industrial wastewater. Fifteen WWTPs did not provide effective COD automated monitoring data; hence, the number of WWTPs with effective data was 1738.
The data collected were the COD concentrations that were sampled instantly once per hour and mainly analyzed using the dichromate method by the automated monitoring system. There were >7200 h of data for each WWTP. We eliminated negative and zero concentrations, which were invalid according to the "Technical Specifications for Validity of Wastewater Online Monitoring Data (HJ/T 356-2007)" [17]. We also eliminated the data for which the COD concentrations were <10 mg/L, because these data were below the method detection limit according to the "Water Quality Determination of the Chemical Oxygen Demand-Dichromate Method (GB 11914-89)" [18]. The daily arithmetic average concentrations of COD were calculated for days in which there were >18 h of data. Lastly, the daily arithmetic average concentrations for each WWTP were obtained.
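The screening and averaging steps described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the record layout (ISO timestamp string, COD in mg/L) and the function name `daily_averages` are hypothetical, and the ">18 h" rule is assumed to count the values that remain after invalid readings are removed.

```python
from collections import defaultdict
from datetime import datetime

MIN_VALID_COD = 10.0    # mg/L, method detection limit cited in the text
MIN_HOURS_PER_DAY = 18  # a day is kept only with more than this many valid hours


def daily_averages(records):
    """Return {date: arithmetic mean COD} from hourly (iso_timestamp, cod) records.

    Hypothetical helper: the record layout is an assumption, not the
    monitoring platform's actual export format.
    """
    by_day = defaultdict(list)
    for stamp, cod in records:
        # Dropping values below 10 mg/L also removes negative and zero readings.
        if cod < MIN_VALID_COD:
            continue
        by_day[datetime.fromisoformat(stamp).date()].append(cod)
    return {
        day: sum(values) / len(values)
        for day, values in sorted(by_day.items())
        if len(values) > MIN_HOURS_PER_DAY
    }
```

A day with 22 valid hourly values is averaged, whereas a day with exactly 18 valid values is skipped because the rule requires more than 18 h of data.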
Many researchers [19][20][21][22] have illustrated that a log-normal distribution provides a reasonable and practical basis for analyzing conventional pollutant discharge concentration data. Our previous study [16] demonstrated that the COD data of WWTPs in China followed a log-normal distribution. The Kolmogorov-Smirnov (K-S) test applied to the logarithm of the measured COD discharge concentrations of the 1738 WWTPs verified the log-normal distribution, with a p-value > 0.05, for 84% of the WWTPs. The WWTPs that did not pass the K-S test were also approximately log-normally distributed: the correlation coefficient of the linear fit of the logarithm of the COD concentrations against the normal probabilities was higher than 0.8. Further analysis was therefore based on the log-normal distribution of the COD data of WWTPs.
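The probability-plot check mentioned above can be illustrated with a short Python sketch. It is not the authors' implementation: the function name is hypothetical, and the plotting positions (i - 0.5)/n are an assumption, since the text does not say which positions were used.

```python
import math
from statistics import NormalDist, mean


def lognormal_plot_correlation(concentrations):
    """Pearson correlation between sorted ln(COD) and standard normal quantiles.

    Values near 1 indicate that the data are close to log-normal.
    """
    logs = sorted(math.log(c) for c in concentrations)
    n = len(logs)
    # Assumed plotting positions (i - 0.5) / n.
    quantiles = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    mx, my = mean(quantiles), mean(logs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(quantiles, logs))
    sxx = sum((x - mx) ** 2 for x in quantiles)
    syy = sum((y - my) ** 2 for y in logs)
    return sxy / math.sqrt(sxx * syy)
```

Under the criterion quoted in the text, a plant whose coefficient exceeds 0.8 would be treated as approximately log-normal even if it fails the K-S test.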

Determination of the K Value
Discharge limits were derived with Equation (1), using the long-term average (LTA) multiplied by the variability factor (VF):

$$\text{Limit} = \mathrm{LTA} \times \mathrm{VF} \tag{1}$$

This equation has been used by the US Environmental Protection Agency (US EPA) for establishing water discharge limits for many industries [4][5][6][23], and it was adopted in our previous study [16] to determine the COD discharge limits for WWTPs in China. The LTA reflects the average discharge level of a WWTP, and the mean of the instant sampling in a certain period is equal to that of the daily average. Therefore, the difference between the instant sampling limit and the daily average limit is mainly reflected in the VF. The VF is defined by Equation (2), based on the 99th percentile of the data:

$$\mathrm{VF} = \frac{\hat{P}_{99}}{\hat{E}(X)} \tag{2}$$

where $\hat{P}_{99}$ is the estimated 99th percentile of the measured discharge concentration, $\hat{E}(X)$ is the estimated expected value of the measured discharge concentration, and $X$ is the measured discharge concentration.

According to log-normal probability and statistics theory, the VF of the instant sampling and daily average values can be obtained using Equations (3) and (4), respectively:

$$\mathrm{VF}_{i} = \frac{\exp(\mu_{i} + 2.326\,\sigma_{i})}{\exp(\mu_{i} + 0.5\,\sigma_{i}^{2})} = \exp(2.326\,\sigma_{i} - 0.5\,\sigma_{i}^{2}) \tag{3}$$

$$\mathrm{VF}_{d} = \frac{\exp(\mu_{d} + 2.326\,\sigma_{d})}{\exp(\mu_{d} + 0.5\,\sigma_{d}^{2})} = \exp(2.326\,\sigma_{d} - 0.5\,\sigma_{d}^{2}) \tag{4}$$

In this case, the K value is defined as the ratio of the instant sampling limit to the daily average limit, reflecting the relationship between the two under the same discharge control level, as given by Equation (5):

$$K = \frac{\text{Limit}_{i}}{\text{Limit}_{d}} = \frac{\mathrm{VF}_{i}}{\mathrm{VF}_{d}} \tag{5}$$

where $\mu$ is the average of the natural logarithm of the measured discharge concentration, $\sigma$ is the standard deviation of the natural logarithm of the measured discharge concentration, and the subscripts $i$ and $d$ denote the instant sampling values and the daily average values, respectively.

It should be emphasized that the "limit" here is defined and calculated by Equation (1). It is therefore not a limit that has already been set, but a value calculated to reflect the real discharge level, and it can be used to decide the discharge limit.
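As a hedged illustration of the K value calculation, the sketch below assumes that the 99th percentile and the expected value are taken from a log-normal fit, i.e., exp(mu + 2.326 sigma) and exp(mu + sigma^2/2), and that the long-term averages of the instant and daily series cancel in the ratio. The function names are hypothetical.

```python
import math
from statistics import NormalDist, stdev

Z99 = NormalDist().inv_cdf(0.99)  # standard normal 99th percentile, about 2.326


def variability_factor(concentrations):
    """Log-normal 99th percentile divided by the log-normal expected value."""
    logs = [math.log(c) for c in concentrations]
    sigma = stdev(logs)  # sample standard deviation of ln(concentration)
    # exp(mu + Z99 * sigma) / exp(mu + sigma**2 / 2); mu cancels out.
    return math.exp(Z99 * sigma - 0.5 * sigma ** 2)


def k_value(instant_values, daily_average_values):
    """Ratio of the instant sampling VF to the daily average VF.

    Assumes both series share the same long-term average, as stated in the text.
    """
    return variability_factor(instant_values) / variability_factor(daily_average_values)
```

Because hourly values normally scatter more than their daily means, the instant sampling VF is the larger of the two and K comes out above 1.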

Relationship between the Reliability of Instant Sampling and Daily Average Limit
Based on the influent load, wastewater treatment process, and management level, the wastewater discharges fluctuate to a certain extent. Niku et al. [11] defined the "reliability (R)" of a wastewater treatment system for an activated sludge process as "the ability to perform the specified requirements free from failure" or "the probability of adequate performance for at least a specified period of time under specified conditions". Thus, the "reliability (R)" can be defined using Equations (6) and (7); this method has been used in many studies [24][25][26][27] for the assessment of the performance of water treatment systems.

$$R = 1 - P(F) \tag{6}$$

$$P(F) = P(X > X_s) \tag{7}$$

where $R$ is the reliability, $P(F)$ is the probability of failure, $X$ is the measured discharge concentration (e.g., instant sampling value), and $X_s$ is the discharge limit concentration (e.g., instant sampling limit).

Niku et al. [11] also developed the coefficient of reliability (COR) on the basis of the assumed log-normality of the discharge concentration data, which can be used to estimate the reliability of the treatment plants. The COR relates the average of the discharge concentrations to the limits to be achieved, on a probability basis, as follows:

$$m_x = \mathrm{COR} \times X_s \tag{8}$$

where $m_x$ is the average of the measured discharge concentration (e.g., daily average), and COR is the coefficient of reliability.

Because of the log-normal distribution, and after the logarithmic and standardization transformations, the following equations can be drawn:

$$P(X \le X_s) = 1 - \alpha \tag{9}$$

$$Z_{1-\alpha} = \frac{\ln X_s - \mu_{\ln x}}{\sigma_{\ln x}} \tag{10}$$

where $X$ is the discharge concentration in accordance with the log-normal distribution, $\alpha$ is the probability of failure of meeting the discharge limit, $Z_{1-\alpha}$ is the value of the standardized normal distribution, $\sigma_{\ln x}^{2}$ is the variance of the log-normal distribution calculated using Equation (11), and $\mu_{\ln x}$ is the expected value of the log-normal distribution calculated using Equation (12).

$$\sigma_{\ln x}^{2} = \ln(CV^{2} + 1) \tag{11}$$

$$\mu_{\ln x} = \ln m_x - \tfrac{1}{2}\,\sigma_{\ln x}^{2} \tag{12}$$

$$CV = \frac{\sigma_x}{m_x} \tag{13}$$

where $CV$ is the coefficient of variation of the discharge concentration, and $\sigma_x$ is the standard deviation of the discharge concentration.

After some algebraic manipulations, COR and the value of the standardized normal distribution were obtained, as follows:

$$\mathrm{COR} = \sqrt{CV^{2} + 1}\;\exp\left\{-Z_{1-\alpha}\,[\ln(CV^{2} + 1)]^{1/2}\right\} \tag{14}$$

$$Z_{1-\alpha} = \frac{\ln X_s - \left[\ln m_x - \tfrac{1}{2}\ln(CV^{2} + 1)\right]}{[\ln(CV^{2} + 1)]^{1/2}} \tag{15}$$

According to Equation (14), if $m_x$ is assumed to be the mean of the instant sampling for 24 h, which is equal to the daily average, and $X_s$ is assumed to be the discharge limit for the instant sampling, then the relationship between the reliability of the instant sampling and the daily average can be obtained from Equation (15) and written as Equation (16):

$$\mathrm{COR} = \frac{m_x}{X_s} = \sqrt{CV^{2} + 1}\;\exp\left\{-Z_{1-\alpha}\,[\ln(CV^{2} + 1)]^{1/2}\right\} \tag{16}$$

Figure 1 shows the relationship between COR and the coefficient of variation, calculated using Equation (16). At a given reliability value, the COR value decreases with an increase in the coefficient of variation until it reaches a minimum, and increases thereafter. Furthermore, from Equation (16), the minimum value for COR can be obtained, as follows:

$$\mathrm{COR}_{\min} = \exp\left(-\tfrac{1}{2}\,Z_{1-\alpha}^{2}\right) \tag{17}$$

COR and Coefficient of Variation
Moreover, the corresponding coefficient of variation ($CV$) can be obtained when COR takes the minimum value, as shown in Equation (18):

$$CV = \sqrt{\exp\left(Z_{1-\alpha}^{2}\right) - 1} \tag{18}$$

The minimum COR value and the corresponding coefficient of variation are shown in Table 1 for different values of the reliability of the instant sampling. When 6 of the 24 instant sampling values exceeded the instant sampling limit (24 h reliability = 75%), CORmin = 0.797, i.e., (the daily average/the instant sampling limit)min = 0.797, showing that the daily average is no less than 0.797 times the instant sampling limit. Accordingly, if the daily average limit is set at 0.797 times the instant sampling limit or less (i.e., the instant sampling limit is 1.254 times the daily average limit or greater), the daily average would definitely exceed the daily average limit. Hence, when the instant sampling limit was 1.3 times the daily average limit (COR = 1/1.3 = 0.7692) and the 24 h reliability was 75%, the daily average would definitely exceed the daily average limit. Equations (19)-(21) summarize this reasoning:

$$\mathrm{COR} = \frac{m_x}{X_s} \ge 0.797 \tag{19}$$

$$m_x \ge 0.797 \times X_s \tag{20}$$

If $X_d / X_s$ = 0.797 or less, then:

$$m_x \ge X_d \tag{21}$$

where $X_d$ is the daily average discharge limit.
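The minimum COR quoted above can be checked numerically. The sketch below assumes the Niku-type expression COR = sqrt(CV^2 + 1) * exp(-Z * sqrt(ln(CV^2 + 1))) for log-normal data, from which the minimum exp(-Z^2/2) at CV = sqrt(exp(Z^2) - 1) follows by differentiation. The function names are hypothetical, and the closed forms hold for reliabilities above 50% (Z > 0).

```python
import math
from statistics import NormalDist


def cor(cv, reliability):
    """Coefficient of reliability for a coefficient of variation and reliability."""
    z = NormalDist().inv_cdf(reliability)
    s = math.sqrt(math.log(cv ** 2 + 1.0))
    return math.sqrt(cv ** 2 + 1.0) * math.exp(-z * s)


def cor_min(reliability):
    """Smallest COR over all CV (valid for reliability > 0.5)."""
    z = NormalDist().inv_cdf(reliability)
    return math.exp(-0.5 * z ** 2)


def cv_at_cor_min(reliability):
    """Coefficient of variation at which COR reaches its minimum."""
    z = NormalDist().inv_cdf(reliability)
    return math.sqrt(math.exp(z ** 2) - 1.0)
```

For a 24 h reliability of 18/24 = 75%, `cor_min(0.75)` rounds to 0.797, which agrees with the value quoted from Table 1.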

K Value of COD for WWTPs
Using the method discussed above, the K values of the COD of 1738 WWTPs were calculated, and Table 2 shows the statistical characteristics. According to the statistical results, the average K value of the COD of WWTPs was 1.10, and the median value was 1.09. The K value of COD was less than 1.20 in 95% of the WWTPs and less than 1.30 in 99% of the WWTPs. According to the one-way analysis of variance, the three factors of scale, treatment process, and ratio of treated wastewater from industries were not significantly correlated with the K value, with all of the p-values of the t-test being higher than 0.05. The geographical location was significantly correlated with K, with a p-value < 0.05. Table 3 shows the percentiles of the K values for different geographical locations. The K values in the Northeast and Northwest of China were higher than those in the other geographical locations, which indicated that the COD discharge concentrations from WWTPs in these two areas fluctuated considerably. This translates into higher instant sampling values relative to the daily average value. The reason for this might be the relatively lower level of social and economic development and of technical management in these two regions, and the corresponding management of WWTPs should be further strengthened. However, from a national perspective, the K value of COD for WWTPs in China was recommended to be 1.3, which ensured that 99% of the WWTPs met the requirement (see Table 2). Therefore, the instant sampling limit was recommended to be 1.3 times the daily average limit for the COD of WWTPs.
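The percentile-based choice of K and its use to convert a daily average limit into an instant sampling limit can be sketched as follows. The nearest-rank percentile is an assumed convention (the text does not say how percentiles were computed), and both function names are hypothetical.

```python
import math


def nearest_rank_percentile(values, percent):
    """Nearest-rank percentile; an assumed convention, not taken from the paper."""
    ordered = sorted(values)
    rank = max(1, math.ceil(percent * len(ordered) / 100))
    return ordered[rank - 1]


def instant_limit(daily_limit, k=1.3):
    """Instant sampling limit implied by a daily average limit and a K value."""
    return k * daily_limit
```

For instance, with a hypothetical daily average limit of 50 mg/L, K = 1.3 gives an instant sampling limit of 65 mg/L.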

Relationship between Reliability of Instant Sampling and Daily Average Limit of COD for WWTPs
We obtained Equation (22) from Equation (16) using the result that the instant sampling limit should be 1.3 times the daily average limit for the COD of WWTPs, i.e., COR = 1/1.3 = 0.7692:

$$0.7692 = \sqrt{CV^{2} + 1}\;\exp\left\{-Z_{1-\alpha}\,[\ln(CV^{2} + 1)]^{1/2}\right\} \tag{22}$$

If $y = -[\ln(CV^{2} + 1)]^{1/2}$, then $y^{2} = \ln(CV^{2} + 1)$, and Equation (22) can be rewritten, as follows:

$$\tfrac{1}{2}\,y^{2} + Z_{1-\alpha}\,y = \ln 0.7692 \tag{23}$$

If $y = [\ln(CV^{2} + 1)]^{1/2}$, then $y^{2} = \ln(CV^{2} + 1)$, and Equation (22) can be rewritten, as follows:

$$\tfrac{1}{2}\,y^{2} - Z_{1-\alpha}\,y = \ln 0.7692 \tag{24}$$

Therefore, there were two circumstances:

$$[\ln(CV^{2} + 1)]^{1/2} = Z_{1-\alpha} \pm \sqrt{Z_{1-\alpha}^{2} + 2\ln 0.7692} \tag{25}$$

From Equation (25), we obtained the information presented in Table 4. When all 24 instant sampling values in one day met the instant sampling limit, which was 1.3 times the daily average limit (COR = 0.7692), the daily average would still exceed the daily average limit if the coefficient of variation of the WWTP was <0.07 or very large. Moreover, when one of the 24 instant sampling values exceeded the instant sampling limit, the daily average would exceed the daily average limit if the coefficient of variation of the WWTP was <0.16 or >234.47. When six of the 24 instant sampling values exceeded the instant sampling limit, CORmin (0.797, according to Table 1) was larger than the developed COR (0.7692), and the daily average would exceed the daily average limit, irrespective of the coefficient of variation. For WWTPs with K < 1.3, the daily average would always conform to the daily average limit when no more than five of the instant sampling values exceeded the instant sampling limit. According to the data analysis, the coefficient of variation of the COD concentration in 24 h of WWTPs in China was generally <1, so the upper threshold is rarely relevant in practice. Therefore, the daily average of the COD would exceed the limit mainly when the coefficient of variation was small, i.e., when the instant sampling values were kept at a high concentration level and the standard deviation was relatively small.
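The two coefficient-of-variation thresholds for a given number of exceedances can be reproduced approximately with the sketch below. It assumes COR = sqrt(CV^2 + 1) * exp(-Z * sqrt(ln(CV^2 + 1))) set equal to 1/K, solved as a quadratic in sqrt(ln(CV^2 + 1)), with the reliability taken as the share of the 24 hourly values that meet the limit. The function name is hypothetical, and small differences from Table 4 can arise from rounding of Z.

```python
import math
from statistics import NormalDist


def cv_thresholds(n_exceed, k=1.3, samples_per_day=24):
    """Return (lower, upper) CV bounds between which COR stays below 1/k.

    Returns None when COR is above 1/k for every CV, i.e. the daily average
    exceeds the daily limit regardless of the coefficient of variation.
    """
    if not 0 < n_exceed < samples_per_day:
        raise ValueError("n_exceed must be between 1 and samples_per_day - 1")
    reliability = 1.0 - n_exceed / samples_per_day
    z = NormalDist().inv_cdf(reliability)
    # Quadratic t**2 - 2*z*t - 2*ln(1/k) = 0 with t = sqrt(ln(CV**2 + 1)).
    disc = z ** 2 + 2.0 * math.log(1.0 / k)
    if z <= 0 or disc < 0:
        return None
    roots = (z - math.sqrt(disc), z + math.sqrt(disc))
    return tuple(math.sqrt(math.exp(t ** 2) - 1.0) for t in roots)
```

With one exceedance the lower bound comes out close to 0.16. With six exceedances the quadratic has no real root, so the function returns `None`, matching the statement that the daily average then exceeds the limit irrespective of the coefficient of variation.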

Conclusions
This study constructed the K value model to reveal the relationship between the instant sampling and daily average values. The K value revealed that the ratio of COD instant sampling and daily average values for WWTPs in China ranged from 1.00 to 1.45. From a national perspective, the study suggested setting the K value of COD to 1.3 for WWTPs to determine the instant sampling limit of COD that corresponded to the current daily average limit. Based on the reliability statistical model, this study further investigated the relationship between the reliability of the instant sampling value and the daily average limit. The results revealed that if the K value of COD was 1.3, which meant that the instant sampling limit was 1.3 times the daily average limit, the daily average would exceed the daily average limit when six of the 24 instant sampling values exceeded the instant sampling limit in one day. According to this result, the study suggested that, for most of the WWTPs in China, the instant sampling value of COD should exceed the instant sampling limit no more than five times in 24 h in order to ensure stable compliance with the daily average limits.
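The suggested operating rule can be expressed as a simple screening check. The sketch below is illustrative only: the function names and default arguments are hypothetical, and it assumes one instant sampling value per hour over a 24 h day.

```python
def count_exceedances(instant_values, daily_limit, k=1.3):
    """Number of instant sampling values above k times the daily average limit."""
    limit = k * daily_limit
    return sum(1 for value in instant_values if value > limit)


def meets_instant_rule(instant_values, daily_limit, k=1.3, max_exceedances=5):
    """True if no more than max_exceedances values exceed the instant limit."""
    return count_exceedances(instant_values, daily_limit, k) <= max_exceedances
```

A day with five hourly values above the instant sampling limit passes this check, whereas a day with six such values does not.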