Estimation of River Pollution Index in a Tidal Stream Using Kriging Analysis

Tidal streams are complex watercourses that represent a transitional zone between riverine and marine systems; they occur where fresh and marine waters converge. Because tidal circulation processes cause substantial turbulence in these highly dynamic zones, tidal streams are among the most productive of water bodies. Their rich biological diversity, combined with the convenience of land and water transport, provides sites for concentrated populations that evolve into large cities. Domestic wastewater is generally discharged directly into tidal streams in Taiwan, necessitating regular evaluation of the water quality of these streams. Given the complex flow dynamics of tidal streams, only a few models can effectively evaluate and identify pollution levels. This study evaluates the river pollution index (RPI) in tidal streams by using kriging analysis, a geostatistical method that interpolates random spatial variation to produce linear estimates at grid points in two or three dimensions. A kriging-based method is developed to evaluate RPI in tidal streams, which are typically treated as 1D in hydraulic engineering. The proposed method efficiently evaluates RPI in tidal streams with a minimal amount of water quality data. Data from the downstream, estuarine reach of the Tanshui River validate the accuracy and reliability of the proposed method. Results of this study demonstrate that this simple yet reliable method can effectively estimate RPI in tidal streams.


Introduction
The complex flow of tidal streams is mainly influenced by interactions between river water and seawater. Thus, tidal streams are in constant flux as they adapt to river and climate conditions. The half-day tidal variation of the sea is the main driver of cyclic fluctuation in tidal streams [1]. Seasonal changes in flow conditions from estuaries determine water salinity. Upstream flooding changes the vertical structure of water bodies and causes flow irregularities. Among climatic factors, wind conditions significantly affect tidal streams. Waves caused by wind shear alter the circulation patterns and mixing process of river water and seawater. Within this process, a difference of only 2% in density between river water and seawater results in a horizontal pressure gradient, which affects water flow. This difference is mainly attributed to fluctuations in temperature and water salinity; the role of the latter is significantly stronger [2]. The physical process of the flow mechanism is complex and rather difficult to explain. Moreover, this phenomenon drives additional sedimentological, biological, and chemical processes in the water body [3]. Thus, traditional models fail to accurately estimate the flow and water quality of tidal streams. The flow field must be estimated through a hydraulic model before water quality can be determined. Therefore, a considerable amount of river data, including sectional features, flow, level, and quality, is necessary to calibrate the numerous model parameters, which requires substantial time, labor, and capital [4][5][6].
The application of geostatistical methods, alone or combined with other models, to water quality monitoring and estimation has been discussed extensively during the past two decades. Lo et al. (1999) [7] applied a steady-state quality model to simulate the biochemical oxygen demand (BOD) and kriging theory to select optimal sampling locations and frequency. Their results indicated a total of 21 monitoring stations in the Keelung River and a sampling frequency of approximately 2-3 times per month. Mohammad et al. [8] also applied genetic algorithm-, kriging-, and analytical hierarchy process-based methods to evaluate suitable sampling locations and frequency in Iran. Yang et al. [9] proposed a spatial regression method in conjunction with the kriging approach to estimate the nitrogen concentrations of nonpoint source pollution (NOP) in some Iowa (USA) streams. Polus et al. [10] demonstrated that geostatistical methods reduce uncertainties in distributed physically based models (DPBMs) along the Seine River in France. Liu et al. [11] used hierarchical clustering analysis, principal component analysis, and factor analysis with geostatistics to assess the water quality of an alpine lake in Taiwan. Moreover, other studies focused on spatial modeling of the scaling issue. Militino et al. [12] proposed a linear mixed model incorporating both spatial and longitudinal information for detecting excessive nitrate. Garreta et al. [13] proposed various geostatistics-based methods for producing prediction and error maps of the Meuse and Moselle basins in France.
Tidal stream water quality is difficult to simulate with the water quality model because of the many effects of run-off convergence from rainfall upstream and tide recession downstream; thus, a geostatistical method was used in this study. Four variables of water quality were obtained simultaneously from sampling stations along the tidal streams. The water quality for each station was then estimated by kriging analysis to evaluate the pollution of tidal streams. In particular, our proposed algorithm is based on 1D kriging and provides a simple and efficient solution for the complexities of boundary conditions encountered in traditional 2D hydrological models.

Theory and Methods
The geostatistical method adopted in this study for estimating RPI is based on the sampling data obtained from the Tanshui River. The approach rests on the observation that, although the drainage area is spatially 2D, pollution in the mainstream is generally carried as a 1D variable flow from upstream to downstream. Thus, spatial estimation over a 2D random variable domain is not appropriate. In this study, RPI calculations were performed at various separate sampling points in the mainstream of the Tahan River upstream from the converging point between the Hsintien and Keelung rivers to establish the experimental (testing) semivariogram on an hourly basis. On the basis of the optimal theoretical semivariogram model, the RPI values were then estimated hourly for locations without measuring points along the three rivers. Because the Tanshui River is a tidal stream, pollutants are likely transmitted back upstream at high tide. The overlapping estimates of RPI per river kilometer among the three rivers were averaged; estimates for the non-overlapping reaches were used directly.

Kriging Analysis
Geostatistics is a scientific method for analyzing spatial structure; it characterizes natural phenomena through the structural properties of their spatial distribution. In this method, quantities observed at various locations are treated as regionalized variables, and their estimation is based on variograms. Research fields that apply the theory of regionalized variables include meteorology, soil physics, groundwater, mining and metallurgy, environmental monitoring, and hydrology [14][15][16][17][18].
Matheron [19,20] pioneered kriging analysis, naming the method after Krige. The spatial variation of rainfall was interpolated by using an ordinary kriging method. While not designed to optimize the appearance of the interpolation, kriging is characterized by its statistical capability to increase estimation accuracy at grid points. Kriging decomposes the variable Z(x) into the sum:

$$Z(x) = m(x) + e(x) \tag{1}$$

where m(x) represents the mean and e(x) represents a zero-mean random function at a given position x. Notably, treating the mean m(x) as an unknown constant leads to ordinary kriging, which is the best linear unbiased estimator. The kriging estimator $Z_0^*$ of the unknown value $Z_0$, derived from all n observed data, takes the linear form:

$$Z_0^* = \sum_{i=1}^{n} \lambda_i Z_i \tag{2}$$

where $Z_i$ represents observed data and $\lambda_i$ represents the weight placed on $Z_i$. For the estimator to be unbiased,

$$E\left[Z_0^* - Z_0\right] = 0 \quad\Longrightarrow\quad \sum_{i=1}^{n} \lambda_i = 1 \tag{3}$$

To satisfy the optimal condition, $\lambda_i$ is selected to minimize the variance of the error $Z_0 - Z_0^*$:

$$\min_{\lambda_1,\dots,\lambda_n} \; E\left[\left(Z_0 - Z_0^*\right)^2\right] \tag{4}$$

Equation (4) can be solved by the method of Lagrange multipliers, subsequently yielding:

$$\sum_{j=1}^{n} \lambda_j \gamma_{ij} + \mu = \gamma_{i0}, \quad i = 1, \dots, n; \qquad \sum_{j=1}^{n} \lambda_j = 1 \tag{5}$$

where $\gamma_{ij} = \gamma(|x_i - x_j|)$ represents the semivariogram between points i and j; $|x_i - x_j|$ represents the distance between $x_i$ and $x_j$; and $\mu$ is the Lagrange multiplier. The kriging variance ($\sigma_{ok}^2$), which provides a measure of the error associated with the kriging estimator, is obtained by premultiplying the first n equations of (5) by $\lambda_i$ and summing:

$$\sigma_{ok}^2 = \sum_{i=1}^{n} \lambda_i \gamma_{i0} + \mu \tag{6}$$

Based on the hypothesis of second-order stationarity, the development of kriging assumes that the mean and variogram are known. Therefore:

$$E\left[Z(x)\right] = m$$

$$\mathrm{Var}\left[Z(x+h) - Z(x)\right] = E\left\{\left[Z(x+h) - Z(x)\right]^2\right\} = 2\gamma(h)$$

The variance of the increments has a finite value 2γ(h) that depends only on the length h within the domain. The variogram indicates the extent to which the dissimilarity between Z(x) and Z(x + h) evolves with distance h. The graph of γ(h) against h reveals that the semivariogram increases with h, as shown in Figure 1. However, the semivariogram is bounded by a finite value known as the sill. Notably, Z(x) and Z(x + h) are uncorrelated with each other when h is larger than the range, that is, the distance at which the sill is reached.
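The ordinary kriging system and kriging variance described above can be illustrated with a minimal 1D sketch. It is not the authors' implementation: the station positions, RPI values, and the exponential semivariogram parameters (`sill`, `a`) are hypothetical, and the small Gaussian elimination routine is included only to keep the example free of external libraries.

```python
import math


def exponential_semivariogram(h, sill=1.0, a=5.0):
    """Assumed model for illustration: gamma(h) = sill * (1 - exp(-h / a))."""
    return sill * (1.0 - math.exp(-h / a))


def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x


def ordinary_kriging_1d(xs, zs, x0, gamma):
    """Return (estimate, kriging variance, weights) at position x0.

    xs: station positions along the river; zs: observed values;
    gamma: semivariogram function of the separation distance h.
    """
    n = len(xs)
    # Left-hand side: semivariogram matrix bordered by the unbiasedness constraint.
    A = [[gamma(abs(xs[i] - xs[j])) for j in range(n)] + [1.0] for i in range(n)]
    A.append([1.0] * n + [0.0])
    # Right-hand side: semivariogram between each station and the target point.
    b = [gamma(abs(xs[i] - x0)) for i in range(n)] + [1.0]
    sol = solve_linear(A, b)
    weights, mu = sol[:n], sol[n]  # mu is the Lagrange multiplier
    estimate = sum(w * z for w, z in zip(weights, zs))
    variance = sum(w * g for w, g in zip(weights, b[:n])) + mu
    return estimate, variance, weights


# Hypothetical stations (river km from the estuary) and RPI values.
stations_km = [2.0, 8.0, 14.0, 23.0]
rpi_values = [3.5, 4.25, 5.0, 6.5]
est, var, w = ordinary_kriging_1d(stations_km, rpi_values, 10.0,
                                  exponential_semivariogram)
print(est, var, w)
```

Because the weights are constrained to sum to one, the estimator reproduces a constant field exactly, and at a sampled station it returns the observed value with zero kriging variance.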
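A nugget effect may occur when significant variance occurs over a very short distance h. Additionally, the semivariogram and covariance function shown in Figure 2 are related: the value of γ(h) approaches C(0) when distance h increases to infinity. Several models, including spherical, exponential, Gaussian, and power-law models, are used to describe the relation between γ(h) and h and to determine the sill and range (Figure 3). In their standard forms, with $\omega$ denoting the sill (scale) and $a$ the range parameter (the exponent in the power model), the power model is expressed as:

$$\gamma(h) = \omega h^{a}, \quad 0 < a < 2 \tag{11}$$

the spherical model:

$$\gamma(h) = \begin{cases} \omega\left[\dfrac{3h}{2a} - \dfrac{h^{3}}{2a^{3}}\right], & h \le a \\ \omega, & h > a \end{cases} \tag{12}$$

the exponential model, with a practical (influence) range of about $3a$:

$$\gamma(h) = \omega\left[1 - \exp\left(-\frac{h}{a}\right)\right] \tag{13}$$

and the Gaussian model:

$$\gamma(h) = \omega\left[1 - \exp\left(-\frac{h^{2}}{a^{2}}\right)\right] \tag{14}$$

The exponential model is a conventionally used covariance function for modeling discontinuity at the origin of the variogram. In addition to the four basic theoretical models described above, a nested structure consisting of these models can be used to represent the realistic variance of a random field. Additional details of the kriging theory were reported by Journel and Huijbregts [21].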
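The four theoretical semivariogram models named above can be written as short functions. This is a sketch under an assumption: it uses the common textbook parameterizations, in which `omega` is the sill (scale) and `a` is the range parameter (the exponent in the power model). The parameter names are illustrative and not taken from the paper.

```python
import math


def power_model(h, omega, a):
    """Power model: gamma(h) = omega * h**a, with 0 < a < 2 (unbounded, no sill)."""
    return omega * h ** a


def spherical_model(h, omega, a):
    """Spherical model: reaches the sill omega exactly at h = a."""
    if h >= a:
        return omega
    r = h / a
    return omega * (1.5 * r - 0.5 * r ** 3)


def exponential_model(h, omega, a):
    """Exponential model: approaches the sill asymptotically."""
    return omega * (1.0 - math.exp(-h / a))


def gaussian_model(h, omega, a):
    """Gaussian model: parabolic near the origin, asymptotic sill."""
    return omega * (1.0 - math.exp(-(h / a) ** 2))


# Evaluate each model at a few separation distances (hypothetical parameters).
for h in (0.0, 1.0, 5.0, 15.0):
    print(h,
          power_model(h, 0.5, 1.5),
          spherical_model(h, 1.0, 5.0),
          exponential_model(h, 1.0, 5.0),
          gaussian_model(h, 1.0, 5.0))
```

With these forms, the exponential model reaches about 95% of its sill at h = 3a and the Gaussian model at h = √3·a, which is why those distances are often quoted as practical ranges.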

River Pollution Index
The conventionally adopted classification system in Taiwan for monitoring water quality is an RPI [22] that includes four variables: dissolved oxygen (DO), biochemical oxygen demand (BOD5), suspended solids (SS), and ammonia nitrogen (NH3-N). DO is an important index for the quality of water bodies and includes dissolutions from the atmosphere, natural and artificial aeration, and photosynthesis from water plants. In water contaminated by organic matter, DO is consumed by aquatic microorganisms during decomposition; hypoxia occurs when DO in the water is diminished. BOD5 indicates the content of organic matter that can be decomposed by aquatic microorganisms, indirectly representing the degree of contamination by organic matter in water bodies. Organic matter containing nitrogen is derived mainly from the decomposition of animal waste, animal corpses, and plant remains. During the decomposition process, amino acids are released first, followed by the sequential release of ammonia nitrogen, nitrite nitrogen, and nitrate nitrogen until stabilization. Therefore, the presence of ammonia indicates the short-term contamination of the water body. SS refer to organic or inorganic particles suspended in water by stirring or flowing, including colloids. SS impair light penetration in the water, and their effects on aquatic organisms are similar to those of turbidity. SS deposited on riverbanks block water flow, while solids deposited in reservoir areas diminish reservoir capacity.
Each variable of water quality used to determine RPI is converted to one of four index scores ($S_i$ = 1, 3, 6, or 10). Notably, RPI refers to the arithmetic average of these index scores with respect to the water quality:

$$\mathrm{RPI} = \frac{1}{4}\sum_{i=1}^{4} S_i \tag{15}$$

where $S_i$ represents the index scores based on Table 1 and the RPI value ranges from 1 to 10. According to the river pollution index listed in Table 1, the four classifications of pollution are unpolluted, negligibly polluted, moderately polluted, and severely polluted.

The Beishih and Nanshih rivers converge near Hsintien, then discharge into the Tanshui River at Jiangzicui. The Keelung River originates at Jingtong Mountain with gorges above Badu and flows downward into a plain to converge with the Tanshui River at Guandu, which has a drainage area of 600 km².
This study analyzed water quality data from nine sampling stations in the drainage area of the Tanshui River: the Shain and Shinhai bridges along the Tahan River; the Zonan and Chung Cheng bridges along the Hsintien River; the Jansho, Nanhu, and Banlin bridges along the Keelung River; and the Taipei and Guandu bridges along the Tanshui River (Figure 4). The data included DO, BOD5, NH3-N, and SS values obtained from the sampling stations during 13 h from 5 a.m. to 5 p.m. on 29 September 2010. Each data point was converted to the corresponding index score; the integrated indicator value was then calculated by combining the scores of the four variables. The results represent the RPI for each sampling point.
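The step from four index scores to a station RPI is a simple arithmetic mean, sketched below. The function name and the example scores are hypothetical; the conversion from measured concentrations to scores follows Table 1 and is not reproduced here.

```python
ALLOWED_SCORES = {1, 3, 6, 10}


def rpi_from_scores(scores):
    """Return the RPI as the arithmetic mean of the four index scores.

    scores: index scores for DO, BOD5, SS, and NH3-N, each one of 1, 3, 6, or 10.
    """
    if len(scores) != 4:
        raise ValueError("RPI requires exactly four index scores")
    for s in scores:
        if s not in ALLOWED_SCORES:
            raise ValueError(f"invalid index score: {s}")
    return sum(scores) / 4.0


# Hypothetical station: DO score 3, BOD5 score 6, SS score 6, NH3-N score 10.
print(rpi_from_scores([3, 6, 6, 10]))
```

Because each score is drawn from a fixed set, a station RPI computed this way is always a multiple of 0.25 between 1 and 10; values between those steps arise only after spatial interpolation.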

One-Dimensional Design along the Tanshui River
1D ordinary kriging analysis was performed along the river to estimate RPIs of the Tanshui River during a 13 h period. First, this study divided the drainage area of the Tanshui River into three sections. The first section included four sampling stations along the Keelung and Tanshui rivers: the Jansho, Nanhu, Banlin, and Guandu bridges from upstream to downstream. The second section included four sampling stations along the Hsintien and Tanshui rivers: the Zonan, Chung Cheng, Taipei, and Guandu bridges from upstream to downstream. The third section included four sampling stations along the Tahan and Tanshui rivers: the Shain, Shinhai, Taipei, and Guandu bridges from upstream to downstream. The 1D coordinate of each sampling station was its distance from the estuary measured along the river; the separation between two stations was the difference between these distances. Figure 5 describes the relative spatial positions and distances of the rivers in the Tanshui River drainage area. The measured values of the four water quality variables, DO, BOD5, SS, and NH3-N, are listed in Table 2.

Spatial Variability Analysis
1D ordinary kriging analysis was performed along the river to estimate RPI. The hourly experimental (testing) semivariograms for the three rivers in the Tanshui River drainage area were calculated, and the experimental semivariogram data were pooled. The results were then fitted with the theoretical semivariogram models, including the power, spherical, exponential, and Gaussian models. Finally, RPIs of the Tanshui River were estimated by using the selected theoretical semivariogram model. Individual semivariograms were established for three estuaries. The RPI obtained at 5 a.m. on 29 September 2010, from the Tanshui River was chosen as the standard value. In this study, subsequent experimental semivariograms were fitted with the theoretical semivariogram models. Table 3 and Figure 6 summarize the fitting results for the Tanshui River. The error sum of squares is expressed as RSS; a smaller value implies a smaller error. The highest error sum of squares occurred in the power model, while the results from the other models were similar. Additionally, the coefficient of determination of the regression is expressed as R²; a higher coefficient implies a better fit. According to Table 3, the coefficients of determination from the spherical, exponential, and Gaussian models were higher than that of the power model. The Gaussian model showed an inflection at short distances, while the spherical model was limited to certain distances. Therefore, in this study, the exponential model was selected to calculate the RPIs for those rivers. Table 4 summarizes the fitting results of the exponential model for the 13 h of study along the Tanshui River.
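The comparison of candidate models by RSS and R² can be sketched as follows. This is an illustration, not the procedure used to produce Table 3: the lag binning, the definition of R² as 1 − RSS/TSS, and all names and data are assumptions.

```python
def experimental_semivariogram(xs, zs, lag_width):
    """Return {lag_center: gamma} from all station pairs, binned by separation.

    gamma for a bin is the mean of 0.5 * (z_i - z_j)**2 over the pairs in it.
    """
    sums, counts = {}, {}
    n = len(xs)
    for i in range(n):
        for j in range(i + 1, n):
            h = abs(xs[i] - xs[j])
            k = int(h // lag_width)  # index of the lag bin
            sums[k] = sums.get(k, 0.0) + 0.5 * (zs[i] - zs[j]) ** 2
            counts[k] = counts.get(k, 0) + 1
    return {(k + 0.5) * lag_width: sums[k] / counts[k] for k in sorted(sums)}


def goodness_of_fit(lags, gammas, model):
    """Return (RSS, R2) of a fitted semivariogram model against experimental values.

    R2 is computed as 1 - RSS / TSS, an assumed definition for this sketch.
    """
    fitted = [model(h) for h in lags]
    rss = sum((g - f) ** 2 for g, f in zip(gammas, fitted))
    mean_g = sum(gammas) / len(gammas)
    tss = sum((g - mean_g) ** 2 for g in gammas)
    r2 = 1.0 - rss / tss if tss > 0 else float("nan")
    return rss, r2


# Hypothetical stations (river km) and RPI values at one hour.
xs = [2.0, 8.0, 14.0, 23.0]
zs = [3.5, 4.25, 5.0, 6.5]
sv = experimental_semivariogram(xs, zs, lag_width=5.0)
lags, gammas = list(sv.keys()), list(sv.values())
# Compare against a hypothetical linear candidate, gamma(h) = 0.01 * h.
print(goodness_of_fit(lags, gammas, lambda h: 0.01 * h))
```

Repeating `goodness_of_fit` for each candidate model and choosing the one with a low RSS and a high R² mirrors the selection logic described in the text, although a fitted model must still be checked for physically reasonable behavior at short distances.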

Estimation of RPI
This study used 1D ordinary kriging analysis to examine the RPI of the Tanshui River. The four water quality values obtained from the sampling stations were converted to the corresponding index scores, and the RPIs at the various sampling times were calculated for each sampling station. Finally, 1D ordinary kriging analysis was performed again to estimate the RPIs along the Tanshui River. As shown in Figure 5, RPIs were individually estimated for three estuaries from Herko to upstream points. Hence, the computation of RPIs for the main estuary of the Tanshui River can be divided into three sections. In the first section, from its origin to its convergence with the Hsintien River, the RPI was the same as that estimated for the Tanshui River itself. In the second section, between the confluences with the Hsintien and the Keelung rivers, the RPI was the average of those estimated for the Tanshui and Hsintien rivers. In the third section, between the convergence with the Keelung River and Herko, the RPI was equal to the average of all three RPI estimates.
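The averaging of overlapping estimates across the three sections can be sketched as a merge keyed by river kilometer. The data layout, function name, and values below are hypothetical and serve only to show the rule: where several profiles cover the same river kilometer, their estimates are averaged; elsewhere, the single available estimate is kept.

```python
def combine_reach_estimates(estimates_by_river):
    """Merge per-river RPI profiles into one profile.

    estimates_by_river: {river_name: {river_km: rpi_estimate}}.
    Returns {river_km: mean of all estimates available at that km}.
    """
    merged = {}
    for profile in estimates_by_river.values():
        for km, rpi in profile.items():
            merged.setdefault(km, []).append(rpi)
    return {km: sum(vals) / len(vals) for km, vals in sorted(merged.items())}


# Hypothetical kriged estimates; km 0 is the estuary, shared by all three profiles.
profiles = {
    "Tahan": {0: 4.0, 10: 6.0, 20: 7.0},
    "Hsintien": {0: 5.0, 10: 5.0},
    "Keelung": {0: 3.0},
}
print(combine_reach_estimates(profiles))
```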
Based on Table 1, water quality is classified as unpolluted for an RPI under 2.0, negligibly polluted for an RPI between 2.0 and 3.0, and moderately polluted for an RPI above 3.0 but under 6.0. For an RPI above 6.0, water quality is classified as severely polluted. In this study, the RPIs are assigned gradient colors to indicate the levels of river pollution. The estimated RPI from data obtained at 3 p.m. on 29 September 2010, served as the standard value, as shown in Figure 7a. According to the figure, the Tahan River showed the highest pollution level, followed by the Hsintien River; the Keelung River was the least polluted. The water quality of the Tahan River was classified as moderately polluted; the RPI value of the section near the Hsintien River reached 6.83 and was classified as severely polluted. The water quality of the Hsintien River was also classified as moderately polluted; however, the RPI values ranged from 4 to 5, which were lower than those of the Tahan River. Finally, the water quality of the Keelung River was also classified as moderately polluted; however, the RPI values were largely under 4.3, indicating a lower pollution level than that of the Hsintien River. Moreover, because the Tanshui River is a tidal stream, the hourly fluctuations during the daytime can be understood through analysis of the RPIs of the various river sections. In addition, the inverse distance weighting (IDW) method was applied for spatial interpolation of RPIs along the Tanshui River, as illustrated in Figure 7b. A comparison of Figures 7a and 7b reveals that, under IDW, if the RPI values of two adjacent sampling stations are the same, the RPIs between those two stations remain at the same value. However, the RPIs determined through kriging are estimated from the semivariogram as a function of distance, which is nonlinear, as shown in Figure 6. In such a case, the RPIs differ.
Moreover, the RPI value estimated by IDW was smaller than that by kriging between Herko and the Guandu Bridge because the RPI at Herko was assumed to be zero. In particular, IDW estimation indicates that the adjacent upstream and downstream river sections are all severely polluted at approximately 23 river kilometers, near the Shinhai Bridge. Figure 8 shows the temporal fluctuations in estimated RPI at that location. The highest RPI value was apparent at 1 p.m.
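The pollution classes and the IDW behavior discussed above can be illustrated with a short sketch. Several details are assumptions: the handling of the exact boundary values 2.0, 3.0, and 6.0, the IDW exponent `power=2.0`, and the use of only the two bracketing stations in the example. None of these is specified in the text.

```python
def classify_rpi(rpi):
    """Map an RPI value to a pollution class.

    Boundary handling (2.0, 3.0, and 6.0 themselves) is an assumption.
    """
    if rpi < 2.0:
        return "unpolluted"
    if rpi <= 3.0:
        return "negligibly polluted"
    if rpi <= 6.0:
        return "moderately polluted"
    return "severely polluted"


def idw_1d(xs, zs, x0, power=2.0):
    """Inverse distance weighting estimate at x0 from stations xs with values zs."""
    num = 0.0
    den = 0.0
    for x, z in zip(xs, zs):
        d = abs(x - x0)
        if d == 0.0:
            return z  # exactly at a station: return the observed value
        w = 1.0 / d ** power
        num += w * z
        den += w
    return num / den


# Two hypothetical adjacent stations with the same RPI: IDW stays flat between them.
print(idw_1d([10.0, 20.0], [5.0, 5.0], 13.0))
print(classify_rpi(6.83), classify_rpi(4.3))
```

IDW weights depend only on distance, so equal values at the contributing stations always yield that same value in between. Kriging weights also depend on the fitted semivariogram and on all stations included in the system, which is one reason the two maps can differ.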

Conclusions
The water quality of tidal streams is conventionally calculated by using hydrological and water quality models, which are time-consuming and costly. In this study, the pollutant transfer from upstream to downstream was first estimated by a 1D concept and later used to determine the value of pollutants in a 2D space. The spatial distribution of RPIs of the Tanshui River and its branches was simulated successfully by combining the 1D ordinary kriging method with water quality data collected in the field. This approach is simpler than simulation through conventional 2D variable hydrological models. Moreover, this approach avoids the problem of determining the complex initial and boundary conditions that such models require; instead, only the sampled data are used to represent the average water quality of the studied river section. With this method, water quality along a tidal stream can be obtained efficiently. This study also analyzed the spatial distribution of RPIs obtained from various sections at given times in addition to the time distribution for each sampling station. The water quality estimation model in this study was constructed on the basis of the water quality of the tidal reach of the Tanshui River, subsequently allowing for determination of the water quality of various river sections. The results of this study demonstrate the feasibility of using the geostatistical method to estimate the complex water quality of tidal streams.