Article

Water Transparency Prediction of Plain Urban River Network: A Case Study of Yangtze River Delta in China

1
State Key Laboratory of Hydrology-Water Resources and Hydraulic Engineering, Nanjing Hydraulic Research Institute, Nanjing 210029, China
2
College of Water Conservancy and Hydropower Engineering, Hohai University, Nanjing 210098, China
3
Department of Civil and Environmental Engineering, The University of Auckland, Auckland Mail Centre, Private Bag 92019, Auckland 1142, New Zealand
4
School of Water Resources and Hydropower Engineering, Wuhan University, Wuhan 430072, China
*
Authors to whom correspondence should be addressed.
Sustainability 2021, 13(13), 7372; https://doi.org/10.3390/su13137372
Submission received: 6 May 2021 / Revised: 18 June 2021 / Accepted: 22 June 2021 / Published: 1 July 2021

Abstract

Water transparency is commonly used to indicate the combined effect of hydrodynamics and the aquatic environment on water quality throughout a river network. However, how water transparency responds to these factors, and in particular the complicated nonlinear relationships involved, still needs to be explored; this study therefore analyses the Suzhou urban river network. Using an artificial neural network (ANN) hydrological model and a multiple linear regression (MLR) model with in-situ data from 2013 to 2019, we investigated the sensitivity of water transparency, measured as Secchi depth (SD), to six factors: flow velocity and five categories of water-quality monitoring data, namely total suspended matter (TSS), water temperature (TE), dissolved oxygen (DO), chlorophyll (Chl) and chemical oxygen demand (COD). The results suggest that the ANN model performs better than the MLR model. The results also show a well-established correlation between enhanced hydrodynamics and improved water transparency when the flow velocity ranges from 0.22 to 0.45 m/s. Overall, COD is a vital factor for SD prediction, because including COD notably improves the ANN model (correlation coefficient of 0.918). This study demonstrates that an ANN model with hydrodynamic and water quality parameters can predict water transparency better than the other models discussed for a coastal plain urban river network.

1. Introduction

Urbanization has accelerated the degradation of urban aquatic ecosystems, and the associated ecological issues have been recognized in China [1,2,3]. To mitigate these effects and improve urban river health, large-scale water-clearing regulations have been instituted for plain urban river networks, such as the one that empties into the Yangtze River Delta [4,5]. Water transparency is a commonly used indicator of water quality [6] because it incorporates physical, chemical, and biological processes [7,8] (e.g., flow rate, nutrient cycling, and phytoplankton photosynthesis, respectively [9,10,11]). Moreover, it is a key indicator for measuring the effect of ecological restoration [12,13] and is widely valued in China when constructing urban water environments [14,15,16,17,18].
Additionally, studies have shown that the three abovementioned processes also relate to geographical location [19,20,21]. For instance, water transparency is dominated by dynamic conditions associated with wind, waves, and human activities in shallow areas such as the Yangtze Delta. Other environmental water quality indicators include chlorophyll-a (Chl-a) [22,23]; nutritional status [24]; total phosphorus [25]; sediment resuspension, transportation and settlement; and total suspended substance [26]. For a long time, the Yangtze River was diverted to bring water to the cities on the coastal plain, but this method was once considered to be a waste of money and labour. Therefore, an investigation into how hydrodynamic and hydro-environmental factors affected water transparency was begun. This improved understanding of the large-scale water clearing regulation can provide a theoretical basis for a better evaluation of water diversion projects.
The Secchi disk depth, or Secchi depth (SD), is a simple, traditional measure of water transparency [27,28]. A black-and-white disk is lowered vertically into the water, and the depth at which it is no longer visible is called the Secchi depth [29], which indicates the water transparency. Although the Secchi disk is a powerful tool, its main disadvantage over large areas is that observations are discrete in space and time and are not synchronous [30,31]. Additionally, errors may occur due to limits in the observer's visual acuity. Therefore, the traditional method might not be adequate to evaluate water quality for large river networks. The Secchi method also consumes considerable labour and resources because of the complex branch system of a plain urban river network. By contrast, satellite sensors like China’s CBERS-1 can provide high-quality water transparency measurements with high spatial-temporal resolution [32]. Colour sensors combined with remote sensors have been successfully applied to estuarine and thalassic water [21], but there are limited studies on their application to China’s inland shallow areas.
Previous studies have shown that SD is a function of hydrodynamic factors like velocity and water level [33,34]. Further, hydrodynamic variations may lead to variations in hydro-environmental indicators [35], and variations in hydro-environmental indicators might in turn cause variations in SD. As such, hydrodynamic indicators alone might be inadequate to predict SD. For this reason, previous studies have explored SD as a function of various environmental parameters [32,36,37,38,39]; a partial list is given in Table 1. Over the past 50 years, several regression models have been developed in which the SD (or its natural or decimal logarithm) was expressed as a function of one or two parameters. Compared with regression models, studies that have employed an ANN for SD prediction are limited to date [40].
In particular, few studies have used an ANN for SD prediction while considering the impact of the three processes (physical, chemical and biological) on a plain urban river network. Inspired by big-data analysis and machine learning techniques, we attempted to develop a machine learning model based on remotely sensed SD and other hydrodynamic and environmental parameters from 2013 to 2019 to assess the response of SD to the large-scale water clearing regulation in the Yangtze Delta. Selected input candidates for the machine learning model include a hydrodynamic condition index and water environmental factors: surface velocity (V), total suspended solids (TSS) concentration, dissolved oxygen (DO) concentration, near-surface chlorophyll (Chl) concentration, chemical oxygen demand (COD) concentration, and water temperature (TE).
The objectives of this study are:
  • To evaluate the big data analysis and self-learning ability of the developed machine learning model in SD prediction for a plain urban river network with long-term field observations;
  • To compare the SD prediction performance between a machine learning model and a regression model to provide a better prediction model and highlight suitable parameters.

2. Materials and Methods

2.1. Overview of the Study Area

The study area includes a plain urban river network throughout Suzhou city in the Yangtze River Delta, located in the southeastern part of Jiangsu province (city centre: 31.19° N, 120.37° E, altitude from −998.09 m to 611.9 m) (Figure 1). The river network across the city is 34.72 km long, with a river catchment area of 14.2 km². Its water depth varies between 2.8 and 3.2 m all year round [41]. Complicated water systems and flat terrain result in poor hydrodynamics of the river network [42,43]. Furthermore, the river network is the primary water resource, as water is conveyed from Lake Taihu through this river.

2.2. Field Observation

Fourteen monitoring sites were set up in the study area to monitor hydrodynamic and water quality conditions, and the corresponding indicators throughout the river network were collected. They were recorded between March 2013 and December 2019 at 10 a.m. Hydrodynamic data include the flow rate (Q) and water depth (H) measured with installed instruments: a fixed acoustic Doppler flowmeter and an electronic water gauge (MXT04, China). The on-site data were then standardized, and detailed explanations are given in the following subsections. Gridded Secchi-disk transparency (SD) and other water quality parameters were collected from the Suzhou Ecological Environment Bureau (http://sthjj.suzhou.gov.cn/, accessed on 31 December 2019) once a week.

2.3. Analytical Methods

2.3.1. Flow Data Processing Method

Previous studies have proposed a method for predicting the concentration of both chemical and biological water quality indicators in river networks that accounts for the cumulative effect of previous flow on the indicators. The discounted flow is given as [44]:
$$Q_d = \frac{\sum_{i=1}^{j} d^{\,j+1-i}\, Q_i}{\sum_{i=1}^{j} d^{\,j+1-i}}$$
where $Q_d$ is the discounted flow and $d$ is the discount coefficient, which ranges from 0 to 1. A coefficient of 1 indicates that the current water quality parameters in the river network are entirely contributed by previous water flow, and vice versa. Here, $i$ indexes the time series, $j$ is the total number of observations, and $Q_i$ represents the flow at time step $i$. The value of $d$ is generally set to 0.95, considering that the stepped pattern of the flow rate during water transportation gives the most weight to the cumulative changes in water quality in the river networks.
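The discounted-flow weighting above can be illustrated with a short Python sketch. It assumes the weights take the form d^(j+1−i), so that the most recent observation receives weight d; the function name `discounted_flow` and the restriction of d to (0, 1] (to avoid a zero denominator) are illustrative choices, not part of the original method description.

```python
def discounted_flow(flows, d=0.95):
    """Discount-weighted mean of a flow series Q_1..Q_j (oldest first).

    Assumption: the weight of Q_i is d ** (j + 1 - i), so older flows
    are discounted more strongly when d < 1.
    """
    if not flows:
        raise ValueError("flows must contain at least one observation")
    if not 0.0 < d <= 1.0:
        raise ValueError("d must lie in (0, 1]")
    j = len(flows)
    # i runs from 1 to j; the latest observation (i = j) gets weight d.
    weights = [d ** (j + 1 - i) for i in range(1, j + 1)]
    weighted_sum = sum(w * q for w, q in zip(weights, flows))
    return weighted_sum / sum(weights)


if __name__ == "__main__":
    # Hypothetical weekly flow rates, oldest first.
    print(discounted_flow([12.0, 15.0, 9.0, 11.0], d=0.95))
```

With d = 1 the expression reduces to the ordinary mean of the series, which is a convenient sanity check.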
The weekly mean, maximum, and minimum SD were calculated from the observations and are referred to as SD_WA, SD_MX, and SD_MI, respectively. The mean, maximum, and minimum of the other parameters throughout the monitoring period are denoted Xmean, Xmax, and Xmin. Considering the practicability and accuracy of the model, appropriate parameters were chosen for the SD prediction based on the correlation coefficients.

2.3.2. Multiple Linear Regression Method

The multiple linear regression (MLR) model was used to predict or estimate the dependent variable through the optimal combination of multiple optional independent variables with a set of coefficients. In this study, there are several potential predictors for SD prediction in the study area, which can be described as:
$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \cdots + \beta_i x_i$$
where $y$ is the dependent variable (SD), $x_i$ denotes the independent parameters (hydrodynamic parameters and water quality parameters), and $\beta_i$ denotes the coefficients of the multiple linear regression. The equation was converted to a logarithmic scale for the MLR model.

2.3.3. Artificial Neural Networks

Artificial neural networks (ANN) are a machine learning technique generally employed to uncover unknown relationships in large volumes of information; they can handle complex nonlinear features in big datasets and perform classification and regression [45]. Inspired by the principle of neurons in human brains, artificial neurons are arranged in different layers, and each layer contains numerous neurons. The layers are mainly grouped into three categories: the input layer, the hidden layer, and the output layer. The ANN models employed in this study have a hidden layer with a sigmoid activation function, which is often used in biology because it is smooth and easy to differentiate. Applications of ANN models in environmental research include: estimating the reference evapotranspiration (ET0) in a river ecosystem [46], predicting the total phosphorus (TP) concentration in the overlying water of the Huai River based on hydrological and hydrodynamic parameters [47], predicting the total dissolved gas (TDG) downstream of dam spillways [48], and predicting the algae distribution in a large shallow lake based on a wind speed index [49]. However, there are few studies on the application of ANN to SD prediction.
The number of neurons in the input layer corresponds to the number of input parameters. The hidden layer is the most important part of the ANN model for predicting SD; each hidden neuron calculates the weighted sum of its inputs and adds a deviation value (threshold). The running process of the ANN model (Figure 2) can be presented as:
$$A_j = B_1 + \sum_{i=1}^{h} w_{ij}\, x_i$$
Here, $A_j$ is the weighted sum of the $j$th hidden neuron, $h$ is the total number of inputs, $x_i$ is the $i$th input, $w_{ij}$ represents the weight of the connection from the $i$th input to the $j$th hidden neuron, and $B_1$ is the deviation term of each neuron in the hidden layer. The output of the $j$th hidden neuron is given by:
$$Y_j = f(A_j)$$
The adopted activation function is the sigmoid function:
$$f(A) = \frac{1}{1 + e^{-A}}$$
The ANN output is given by
$$O_k = B_2 + \sum_{j=1}^{m} w_{jk}\, Y_j$$
where $w_{jk}$ represents the weight of the connection from the $j$th hidden neuron to the $k$th output neuron, $m$ is the total number of hidden neurons, and $B_2$ is the deviation term.
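As an illustration of the forward pass described above, the following Python sketch evaluates a network with one sigmoid hidden layer and a single linear output. It is only a minimal sketch: the function names, the per-neuron list of hidden deviation terms, and the example weights are assumptions made for illustration and are not the trained model of this study.

```python
import math


def sigmoid(a):
    """Sigmoid activation f(A) = 1 / (1 + exp(-A))."""
    return 1.0 / (1.0 + math.exp(-a))


def ann_forward(x, w_hidden, b_hidden, w_out, b_out):
    """Forward pass of a single-hidden-layer network with one output.

    x        : list of input values (one per input parameter)
    w_hidden : w_hidden[j][i] is the weight from input i to hidden neuron j
    b_hidden : deviation (bias) term of each hidden neuron (assumed per neuron)
    w_out    : w_out[j] is the weight from hidden neuron j to the output
    b_out    : deviation (bias) term of the output neuron
    """
    hidden_outputs = []
    for weights_j, bias_j in zip(w_hidden, b_hidden):
        # Weighted sum of the inputs plus the deviation term.
        a_j = bias_j + sum(w * xi for w, xi in zip(weights_j, x))
        hidden_outputs.append(sigmoid(a_j))
    # Linear output: deviation term plus weighted sum of hidden outputs.
    return b_out + sum(w * y for w, y in zip(w_out, hidden_outputs))


if __name__ == "__main__":
    # Hypothetical normalized inputs (e.g. V, TSS, Chl, COD) and made-up weights.
    x = [0.2, -1.1, 0.4, 0.7]
    w_hidden = [[0.5, -0.3, 0.1, 0.2], [-0.4, 0.8, 0.0, -0.6]]
    print(ann_forward(x, w_hidden, [0.1, -0.2], [0.9, -0.5], 0.05))
```

In practice the weights and deviation terms would be obtained by training, and the number of hidden neurons (15 in this study) by trial and error.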

2.3.4. Random Variables Regression Model

In this study, we investigate the linear relationship between flow velocity and SD, which is given by:
$$y = a + b x + \varepsilon$$
where $a$ and $b$ are fitting parameters and $\varepsilon$ is an error term. In most cases $\varepsilon$ follows a normal distribution, $\varepsilon \sim N(0, \sigma^2)$, and thereby $y \sim N(a + bx, \sigma^2)$.
The maximum likelihood method (MLM) was used here to estimate the parameters $a$ and $b$ [50]. For a given sample set $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$, the joint density $L$ was calculated by:
$$L = \prod_{i=1}^{n} \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{1}{2\sigma^2}(y_i - a - b x_i)^2\right] = \left(\frac{1}{\sigma\sqrt{2\pi}}\right)^{n} \exp\left[-\frac{1}{2\sigma^2}\sum_{i=1}^{n}(y_i - a - b x_i)^2\right]$$
The goal of the MLM is to obtain the maximum of $L$, i.e., the minimum of:
$$Q(a, b) = \sum_{i=1}^{n} (y_i - a - b x_i)^2$$
To obtain the minimum of $Q(a, b)$, the following equations should be satisfied:
$$\left.\begin{aligned} \frac{\partial Q}{\partial a} &= -2\sum_{i=1}^{n} (y_i - a - b x_i) = 0 \\ \frac{\partial Q}{\partial b} &= -2\sum_{i=1}^{n} (y_i - a - b x_i)\, x_i = 0 \end{aligned}\right\}$$
Solving these equations gives the estimators $\hat{a}$ and $\hat{b}$:
$$\left.\begin{aligned} \hat{b} &= \frac{n\sum_{i=1}^{n} x_i y_i - \left(\sum_{i=1}^{n} x_i\right)\left(\sum_{i=1}^{n} y_i\right)}{n\sum_{i=1}^{n} x_i^2 - \left(\sum_{i=1}^{n} x_i\right)^2} = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n}(x_i - \bar{x})^2}, \\ \hat{a} &= \frac{1}{n}\sum_{i=1}^{n} y_i - \frac{\hat{b}}{n}\sum_{i=1}^{n} x_i = \bar{y} - \hat{b}\,\bar{x}, \end{aligned}\right\}$$
where $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$ and $\bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i$. The solution of Equation (6) equals the solution of the least squares method when $y_i$ follows the normal distribution.
The unbiased estimator $\hat{\sigma}^2$ is calculated as follows:
$$\hat{\sigma}^2 = \frac{1}{n-2}\left(S_{YY} - \hat{b}\, S_{xY}\right),$$
where $S_{YY} = \sum_{i=1}^{n} (y_i - \bar{y})^2$ and $S_{xY} = \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})$.
Determining SD with the newly developed function differs from common regressions. For a given $x_0$, an interval for the SD $y_0$ with a confidence level of $1 - \alpha$, instead of a unique value, is predicted by:
$$\hat{a} + \hat{b} x_0 - t_{\alpha/2}(n-2)\,\hat{\sigma}\sqrt{1 + \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{S_{xx}}} < y_0 < \hat{a} + \hat{b} x_0 + t_{\alpha/2}(n-2)\,\hat{\sigma}\sqrt{1 + \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{S_{xx}}}$$
where $t_{\alpha/2}(n-2)$ is the upper $\alpha/2$ quantile of the t-distribution with $n-2$ degrees of freedom and $S_{xx} = \sum_{i=1}^{n} (x_i - \bar{x})^2$. As seen from Equation (13), the length of the predicted interval is also a function of flow velocity. Following this idea of random variable regression, the support vector machine (SVM) is introduced to explore the relationship between SD and flow velocity of the urban river network, aiming to verify the rationality of using flow velocity as a predictive model parameter in this study.
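The estimators and the prediction interval derived in this subsection can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions: the names `LineFit`, `fit_line` and `prediction_interval` are hypothetical, and the t-distribution critical value is passed in by the caller (for example, read from a t-table) rather than computed.

```python
import math
from typing import NamedTuple, Sequence, Tuple


class LineFit(NamedTuple):
    a_hat: float      # intercept estimate
    b_hat: float      # slope estimate
    sigma_hat: float  # square root of the unbiased variance estimate
    x_bar: float      # mean of x
    s_xx: float       # sum of squared deviations of x
    n: int            # sample size


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> LineFit:
    """Least-squares (equivalently, maximum-likelihood) fit of y = a + b x."""
    n = len(xs)
    if n != len(ys) or n < 3:
        raise ValueError("need at least three paired observations")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    b_hat = s_xy / s_xx
    a_hat = y_bar - b_hat * x_bar
    # Unbiased variance estimate (S_YY - b_hat * S_xY) / (n - 2);
    # clamp at zero to guard against tiny negative rounding errors.
    sigma_hat = math.sqrt(max((s_yy - b_hat * s_xy) / (n - 2), 0.0))
    return LineFit(a_hat, b_hat, sigma_hat, x_bar, s_xx, n)


def prediction_interval(fit: LineFit, x0: float, t_crit: float) -> Tuple[float, float]:
    """Interval for a new observation y0 at x0.

    t_crit is the upper alpha/2 quantile of the t-distribution with
    n - 2 degrees of freedom, supplied by the caller.
    """
    centre = fit.a_hat + fit.b_hat * x0
    half_width = t_crit * fit.sigma_hat * math.sqrt(
        1.0 + 1.0 / fit.n + (x0 - fit.x_bar) ** 2 / fit.s_xx
    )
    return centre - half_width, centre + half_width


if __name__ == "__main__":
    # Hypothetical velocity (m/s) and SD (m) pairs, for illustration only.
    velocity = [0.22, 0.28, 0.33, 0.39, 0.45]
    sd = [0.41, 0.47, 0.50, 0.58, 0.61]
    fit = fit_line(velocity, sd)
    # 3.182 is the upper 2.5% quantile of the t-distribution with 3 degrees of freedom.
    print(prediction_interval(fit, 0.30, t_crit=3.182))
```

Because the half-width grows with the distance between x0 and the sample mean of x, the interval is narrowest near the centre of the observed velocity range.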

2.3.5. Model Performance Assessment Methods

The performance of models used in this study was evaluated through the coefficient of determination (R2), the root mean squared error (RMSE) and the mean absolute error (MAE) [51].

R2 (Coefficient of Determination)

The R2 coefficient is used to estimate the goodness of fit between predicted values and observed values. The mean absolute error (MAE) is the average value of the absolute error between the predicted value and the observed value.
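For reference, the three metrics can be computed as in the Python sketch below. The formulas are the standard textbook definitions (R2 as one minus the ratio of residual to total sum of squares), which is an assumption here because the exact expressions used in the study are not reproduced in the text; the function names are illustrative.

```python
import math


def rmse(observed, predicted):
    """Root mean squared error between observed and predicted values."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def mae(observed, predicted):
    """Mean absolute error between observed and predicted values."""
    n = len(observed)
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / n


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Hypothetical observed and predicted SD values in metres.
    obs = [0.42, 0.55, 0.61, 0.48]
    pred = [0.45, 0.52, 0.60, 0.50]
    print(rmse(obs, pred), mae(obs, pred), r_squared(obs, pred))
```

Lower RMSE and MAE and higher R2 indicate better agreement between predictions and observations.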

2.3.6. Support Vector Machine Methods

The support vector machine (SVM) is a class of generalized classifiers that performs binary classification of data by supervised learning; its decision boundary is the maximum-margin hyperplane solved for the learning samples. SVM is usually used to analyze complex nonlinear relationships between two variables. Given input data X = {X1, …, Xn} and learning targets Y = {y1, …, yn}, each sample of the input data contains multiple features and thus constitutes a feature space, and the learning target is a binary variable. Suppose there is a hyperplane in the feature space of the input data that serves as the decision boundary, divides the learning targets into positive and negative classes, and keeps the distance from every sample point to the plane greater than or equal to 1. The decision boundary and the margin condition are then given by:
$$w^{T}X + b = 0$$
$$y_i\left(w^{T}X_i + b\right) \geq 1$$
where $w$ and $b$ are the normal vector and intercept of the hyperplane, respectively.
In that case, the classification problem is said to be linearly separable, and its parameters are the normal vector and intercept of the hyperplane.
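The margin condition above can be checked numerically for a candidate hyperplane. The Python sketch below is only an illustration of that condition, not an SVM solver; the function names and the toy data are assumptions, and the margin width 2/||w|| is the standard result for a hyperplane satisfying the condition.

```python
import math


def satisfies_margin(w, b, samples, labels):
    """Return True if every sample satisfies y_i * (w . x_i + b) >= 1.

    labels must be +1 or -1; w and each sample are equal-length lists.
    """
    return all(
        y * (sum(wk * xk for wk, xk in zip(w, x)) + b) >= 1.0
        for x, y in zip(samples, labels)
    )


def margin_width(w):
    """Distance between the two supporting hyperplanes, 2 / ||w||."""
    return 2.0 / math.sqrt(sum(wk * wk for wk in w))


if __name__ == "__main__":
    # Toy two-feature samples with labels +1 / -1 (hypothetical data).
    samples = [[2.0, 1.0], [3.0, 0.5], [-1.5, 0.0], [-2.0, 1.0]]
    labels = [1, 1, -1, -1]
    w, b = [1.0, 0.0], 0.0
    print(satisfies_margin(w, b, samples, labels), margin_width(w))
```

A solver would search for the w and b that satisfy the condition while maximizing this width.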

3. Results

3.1. Characteristics of the SD Measured

SD_WA, SD_MX, and SD_MI during the observation period from 5 January 2013 to 12 December 2019 in the study area are shown in Figure 3, and their descriptive statistics are summarized in Table 2. Figure 3a shows that the weekly mean SD over the long observation period ranges from 0.26 m to 0.89 m. The weekly maximum SD varied from 0.544 m to 1.103 m with an average value of 0.803 m, while the weekly minimum SD ranged from 0.179 m to 0.566 m with a mean of 0.321 m (Figure 3b and Table 2). Statistical results showed that observed SD values exceeding 0.4 m accounted for 75% of the total dataset. This proportion was higher than in most other plain river network cities in the Yangtze River Delta and was believed to benefit from long-term water transfer projects. Figure 3 shows a tendency for SD to improve gradually under long-term hydrodynamic control measures, which indicates that several years of hydrodynamic regulation have gradually improved the water environment of the Suzhou urban river network.
Figure 3a also illustrates the SD seasonality. Generally, the mean value is high in the warm season and low in the cool season. This finding is in line with previous studies [52,53]. Conversely, the DO concentration in the study area is low in the warm season and high in the cool season, as it is affected by temperature [54]. Interestingly, these two parameters (DO and TE) have always been among the most important parameters used in environmental prediction models, where they are believed to help handle the complex relationships between various parameters and to improve the effectiveness of the models. In contrast, the other five parameters do not show noticeable seasonal changes.

3.2. Input and Output of SD Prediction Model

Prior to the model training, the datasets were preprocessed to reduce the impact of outliers on the model performance [55]. Data below 1.5 times the 25th percentile value or higher than 1.5 times the 75th percentile value were not considered; these accounted for 5% of the total data. Based on previous studies (Table 1), six parameters (flow velocity, DO, TSS, COD, Chl, and TE) were chosen as potential input parameters for model development and were selected according to their correlation coefficients with SD. Thereafter, four different models were developed and evaluated based on the selected inputs, and the statistics of the weekly SD are summarized in Table 3.
Table 4 shows the descriptive statistics of the input dataset that were related to SD in previous research. Here Xmean, Xmax, Xmin, Sx, Cv and CC represent the mean, the maximum, the minimum, the standard deviation, the coefficient of variation and the correlation coefficient with the Secchi depth, respectively. The three parameters most strongly related to transparency are TSS, Chl and COD, and the velocity also has a significant correlation. Normalization has been shown to be an important process that can significantly increase the performance of models [56]. Therefore, all the input data were normalized to have zero mean and unit variance.
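The normalization to zero mean and unit variance can be sketched as a z-score transform in Python. The use of the population standard deviation is an assumption made so that the transformed values have variance exactly one; the function name `zscore` and the sample values are illustrative.

```python
import statistics


def zscore(values):
    """Scale values to zero mean and unit (population) variance."""
    mean = statistics.mean(values)
    std = statistics.pstdev(values)
    if std == 0:
        raise ValueError("cannot normalize a constant series")
    return [(v - mean) / std for v in values]


if __name__ == "__main__":
    # Hypothetical TSS concentrations (mg/L).
    print(zscore([18.0, 25.0, 31.0, 22.0, 27.0]))
```

To avoid leaking information, the mean and standard deviation would normally be computed on the training subset and then reused for the verification and testing subsets.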
Based on the selected parameters, four models were developed (M1, M2, M3 and M4). The M1 model was developed with only velocity, TSS and Chl; DO and TE were added to M1 to form the M2 model; and the M3 model was developed using all input parameters (V, TSS, DO, COD, Chl and TE). Finally, the M4 model was developed with V, TSS, Chl and COD. For the four models based on the ANN technique, the numbers of input and output neurons are closely related to the structure of the model. In this study, trial and error was needed to find the optimal hidden layer, and a hidden layer with 15 neurons gave the best result. For the four models based on the MLR technique, the discounted flow rate was obtained with a discount coefficient of 0.95 from the original flowmeter data, and the corresponding flow velocity at each observing point was then obtained from the discounted flow according to the river topography. Note that 60% of the data points in each model were randomly selected as the training dataset, and the remaining data were split into 20% for verification and 20% for testing. Each model was tested three times.
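The random 60/20/20 partition described above can be sketched as follows. The helper name `split_indices`, the rounding of subset sizes, and the use of a fixed seed for reproducibility are assumptions for illustration; the study does not state how the random selection was implemented.

```python
import random


def split_indices(n, train_frac=0.6, val_frac=0.2, seed=0):
    """Randomly partition range(n) into training, verification and testing index lists."""
    indices = list(range(n))
    # A dedicated Random instance keeps the shuffle reproducible for a given seed.
    random.Random(seed).shuffle(indices)
    n_train = int(round(n * train_frac))
    n_val = int(round(n * val_frac))
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]
    return train, val, test


if __name__ == "__main__":
    train, val, test = split_indices(350)
    print(len(train), len(val), len(test))
```

Repeating the split with three different seeds would mirror the statement that each model was tested three times, although the exact procedure used in the study may differ.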
The MLR model and ANN model were run on the training dataset, verification dataset, and testing dataset. The testing dataset served as unseen data to evaluate the performance of the fitted model through RMSE; each model was run three times, and the average of the results is given in Table 5. The ANN-based models performed remarkably better than the MLR models in all phases (Table 5). Among the ANN models, the M4 model has the best performance in all the phases. The ANN results clearly show significant improvement in performance from M1 (CC = 0.859) to M3 (CC = 0.882). The CC increases from 0.875 to 0.897, a 2.5% rate of improvement, and the MAE decreases from 0.859 to 0.834, a 3.0% rate of improvement. The MLR models fitted the training dataset well but performed poorly in the other phases. Among them, the MLR-based M3 model has the best performance. The CC declined slowly from model M1 (CC = 0.594) to model M2 (CC = 0.573), and increased slightly from model M2 to model M3 (CC = 0.607). Regarding the RMSE and MAE, the improvements are less than 10.5% and 6.2%, respectively, which is negligible. This is not the case for the ANN models, whose performance on the training data was notably better than that of the MLR models.
Additionally, increasing the number of input parameters in the models from three (M1) to five (M2) does not yield a remarkable improvement in model performance, and only a slight improvement is obtained using the ANN model: 2.5% and 6.7% of improvement in favour of the M1 model based on the CC and RMSE, respectively. However, 3.1% of improvement was obtained regarding the MAE in favour of the M3 model: the MAE drops from 0.902 to 0.643 (decreased by 40.1%). As can be seen from Table 5, the results are intriguing and encouraging when COD is added to the SD prediction model. The scatter plots of the predictions against the observations of SD for the ANN and MLR models are shown in Figure 4.

3.3. Analysis of the Correlation between SD and Velocity by Using Machine Learning Methods

The plain urban river network has a slow flow rate, and its hydrodynamics are completely manually controlled. Therefore, velocity is considered an important input parameter of the SD model in this study, and the correlation between flow velocity and SD needs to be explored after the model is established.
In view of the complex relationship between flow velocity and SD, two models in the scikit-learn library, the Linear Regression model (LR model) and the Support Vector Machine Regression model (SVR model), were selected to fit the two preprocessed datasets. Regression results were plotted in Figure 5 and the RMSE of training and test datasets were summarized in Table 6.
A comparison indicated that the SD predicted using the LR and SVR models exhibited patterns of a straight line, a curve and scattered points, respectively. The LR model showed an RMSE below 25.00 and an R2 over 0.41, which agrees with the results in Figure 5, where the majority of the predicted SD values based on the training dataset overlapped with the observed values. However, the RMSE increased significantly to more than 27.36, and R2 decreased to less than 0.33, when predicting the SD in the test dataset using the same model. This is a typical result of overfitting. LR and SVR models, in contrast, exhibited RMSE values on the training and test datasets that were close to each other. In order to avoid overfitting, the SVR model was optimized by adjusting the parameters. The RMSE and R2 of the optimized SVR model were added to Table 6. Since the effect of the relationship was estimated by RMSE, the R2 values of the SVR model on the testing and training datasets were still significantly different.
Results in Table 7 show that the SVR model has the higher RMSE when predicting the SD in the test dataset, while the differences in RMSE for SVR are within 2.00. Compared with the LR model, the SVR model showed a higher RMSE of 26.82, but its R2 was better than that of the LR model in the two datasets, indicating better performance in clarifying the relationship between SD and velocity.

4. Discussion

The ANN model and MLR model are compared based on their performances in the (i) training, (ii) verification, and (iii) testing phases, with results summarized in Table 4. The ANN model appears to be more accurate and consistent across the different subsets: all the values of RMSE and MAE are similar, all the correlation coefficients are close to unity, and the performance of this model is well demonstrated by the RMSE. The ANN model also yields a much higher CC than the MLR model. The CC value during the verification phase improved by approximately 38.1%, and the CC value during the test phase improved by approximately 36.9%. In some previous studies, the reported prediction of SD was not tested on the training dataset because of the insufficient data size [57]. In this study, as can be seen from Table 4, the results regarding the modelling of SD are very encouraging, and the results were fitted in all phases.
According to Table 4, the ANN model gives a reasonable estimation of SD during the verification phase. Furthermore, models M1 and M3 reach an acceptable level, and the comparison of various statistical indices (CC, RMSE and MAE) shows that the ANN models perform better than the MLR models, which demonstrates that the ANN method has a clear advantage in predicting the SD of the plain urban river network. In the verification phase, the best ANN results are achieved using the M4 model; in this comparison, the prediction performance of M4 is slightly better than that of M1 and M3. In the testing phase, as shown in Table 3, model M4 is always the best ANN model, while for the MLR model, M1 is the best model. For good predictive ability, RMSE and MAE should be as low as possible, whereas CC should be as high as possible.
Consequently, the inclusion of the two parameters DO and TE may not improve the performance of the model. Interestingly, besides TSS and Chl, COD assumed major importance when included simultaneously as an input to the model. Although DO and TE are water quality parameters that affect the SD of a water body, they did not contribute significantly to the performance of the model in predicting the SD of the urban river network when COD was included. As the most important environmental factors in water bodies, DO and TE mainly affect the degradation rate of pollutants in urban river networks. As the river network of plain cities in the Yangtze River Delta has undergone years of diversion and flow control, the water quality of the river network has gradually improved and entered a steady state. Therefore, transparency is less sensitive to DO and TE, which affect the chemical process. In contrast, COD is extremely difficult to degrade in urban river network water bodies and is closely related to TSS, which causes the SD to be more sensitive to COD in the plain urban river network. The final model selected to predict the SD of the urban river network in this study contained velocity, TSS, Chl and COD (M4). The inclusion of DO and TE may not improve the model performance and sometimes even increases the values of the error indices. Additionally, fewer and more suitable inputs help to simplify the implementation and the calculation procedure, which improves the practicality of the model.
Finally, for a given regression model, the LR model exhibited a slightly lower RMSE for the exponential correlation than for the power correlation, whereas the SVR model gave the opposite result. The result indicates that within a certain flow rate threshold, there is a positive correlation between transparency and flow rate, which reveals that the hydrodynamic factors of the plain river network have a significant impact on the water transparency and can be used as an effective parameter of the prediction model. Within the flow velocity range of 0.22–0.45 m/s, an increased flow rate has a positive effect on SD. On the one hand, the improvement in hydrodynamics brought by water resources regulation does have a positive impact on the water environment of urban river networks. On the other hand, improving the water environment through hydrodynamic regulation is effective only within a flow rate threshold, which means hydrodynamic control is not a once-and-for-all method.
Additionally, in the analysis of long-term SD changes and ANN model results, it is essential to consider the influence of flow velocity changes caused by water regulation. The comparison of the correlation coefficient shown in Table 3 reveals that flow velocity has a larger impact weight on the water transparency of urban river network, and the absolute value of its correlation coefficient is ranked before dissolved oxygen and temperature.

5. Conclusions

In this study, an artificial neural network model is proposed for estimating Secchi depth in a plain urban river network using long-term observed data. The comparison of results between the ANN model and the MLR model reveals that hydrodynamic parameters can be used as effective parameters for SD prediction models of the urban river network. Additionally, the impact of COD concentration on transparency is crucial in the river network, as shown by the notable improvement when COD is included as an input to the model. The most accurate and practical SD model is the one with flow velocity, TSS, COD, and Chl as input parameters, with sensitivity ranked from high to low as TSS, Chl, COD and flow velocity.
In addition, the ANN models perform better than the MLR models, which demonstrates the existence of a complex nonlinear relationship between SD and various parameters. The support vector machine was used to deduce the relationship between SD and the hydrodynamic parameter, and a strong positive correlation was found in this study when the velocity ranges from 0.22 m/s to 0.45 m/s. Over 90% of the data fall within the predicted intervals of SD for the method, reflecting the flow rate threshold of hydrodynamic regulation for improving water transparency in the urban river network.

Author Contributions

Conceptualization, Y.L. (Yipeng Liao) and Y.L. (Yun Li); Methodology, Y.L. (Yipeng Liao) and Z.W.; Validation, Y.L. (Yipeng Liao) and B.J.; Formal Analysis, Y.L. (Yipeng Liao) and Z.F.; Writing—Original Draft Preparation, Y.L. (Yipeng Liao) and J.S.; Writing—Review & Editing, Y.L. (Yipeng Liao) and J.S.; Visualization, Y.L. (Yipeng Liao); Supervision, Y.L. (Yun Li). All authors have read and agreed to the published version of the manuscript.

Funding

This research was jointly funded by the Major Science and Technology Program of China (2017ZX07205003), Jiangsu Water Conservancy Science and Technology Project (2017001ZB) and NHRI Special funds for basic research operations (Y120013).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The Secchi-disk transparency (SD) data and other related water quality parameters used in this study can be acquired from the Suzhou Ecological Environment Bureau official website (http://sthjj.suzhou.gov.cn/) (accessed on 31 December 2019).

Acknowledgments

We are grateful to the anonymous reviewers for their constructive comments and helpful suggestions.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Zhang, J.; Gilbert, D.; Gooday, A.J.; Levin, L.; Naqvi, S.W.A.; Middelburg, J.J.; Scranton, M.; Ekau, W.; Peña, A.; Dewitte, B.; et al. Natural and human-induced hypoxia and consequences for coastal areas: Synthesis and future development. Biogeosciences 2010, 7, 1443–1467.
  2. The State Council of China. Action Plan for Prevention and Control of Water Pollution; People’s Publishing House: Beijing, China, 2015.
  3. Zhang, J.Y.; Li, Y.; Wang, X.J. Reconsideration on issues related to water ecological civilization construction in China. China Water Resour. 2016, 19, 8–11. (In Chinese)
  4. Lu, X.Y. Impact of Water Diversion on the Growth of Dominant Eutrophic Algae in Lake Taihu; Nanjing Hydraulic Research Institute: Nanjing, China, 2013. (In Chinese)
  5. Cui, G.B.; Chen, X.; Xiang, L.; Zhang, Q.; Xu, Q. Evaluation of water environment improvement by interconnected river network in plain area. J. Hydraul. Eng. 2017, 48, 1429–1437. (In Chinese)
  6. Arnberger, A.; Eder, R. The influence of green space on community attachment of urban and suburban residents. Urban For. Urban Green. 2012, 11, 41–49.
  7. Phlips, E.J.; Aldridge, F.J.; Schelske, C.J.; Crisman, T.L. Relationships between light availability, chlorophyll a, and tripton in a large, shallow subtropical lake. Limnol. Oceanogr. 1995, 40, 416–421.
  8. Kukushkin, A.S. Long-term seasonal variability of water transparency in the surface layer of the deep part of the Black Sea. Russ. Meteorol. Hydrol. 2014, 39, 178–186.
  9. Abu Hanipah, A.H.; Guo, Z.R. Reaeration caused by intense boat traffic. Asian J. Water Environ. Pollut. 2019, 16, 15–24.
  10. Aarup, T. Transparency of the North Sea and Baltic Sea-a Secchi depth data mining study. Oceanologia 2002, 44, 323–337.
  11. Blenckner, T.; Adrian, R.; Livingstone, D.M.; Jennings, E.; Weyhenmeyer, G.A.; George, D.G.; Jankowski, T.; Järvinen, M.; Aonghusa, C.N.; Noges, T.; et al. Large-scale climatic signatures in lakes across Europe: A meta-analysis. Glob. Chang. Biol. 2007, 13, 1314–1326.
  12. Bricaud, A.; Morel, A. Light attenuation and scattering by phytoplanktonic cells: A theoretical modeling. Appl. Opt. 1986, 25, 571–580.
  13. Bulkley, J.W.; Mathews, A.P. Water Quality Relationships in the Great Lakes: Analysis of a Survey Questionnaire; The University of Michigan: Ann Arbor, MI, USA, 1974; pp. 1–84.
  14. Wang, C.; Wei, Z.; Zhang, L. Experimental study on improvement of water environment by water diversion in plain river networks. J. Hohai Univ. Nat. Sci. 2005, 33, 136–138. (In Chinese)
  15. Xia, J.; Zhai, X.Y.; Zeng, S.; Zhang, Y. Systematic solutions and modeling on eco–water and its allocation applied to urban river restoration: Case study in Beijing, China. Ecohydrol. Hydrobiol. 2014, 14, 39–54.
  16. Zhang, C.; Yu, Z.G.; Zeng, G.M.; Jiang, M.; Yang, Z.Z.; Cui, F.; Zhu, M.Y.; Shen, L.Q.; Hu, L. Effects of sediment geochemical properties on heavy metal bioavailability. Environ. Int. 2014, 73, 270–281.
  17. Cai, J.; Wang, C.S.; Wang, W. Discussion on improving the water environment in the river course by diverting clean water and conducting better management. China Water Resour. 2011, 7, 39–41. (In Chinese)
  18. Yang, Q.Q.; Wu, S.Q.; Wu, X.F.; Yang, Z.Z.; Cui, F.; Zhu, M.Y.; Shen, L.Q.; Hu, L. Effects of simulated water diversion on water quality and phytoplankton community in Meiliang bay. J. Hydroecol. 2015, 36, 42–49. (In Chinese)
  19. Mukundan, R.; Pierson, D.C.; Schneiderman, E.M.; O’Donnell, D.M.; Pradhanang, S.M.; Zion, M.S.; Matonse, A.H. Factors affecting storm event turbidity in a New York City water supply stream. Catena 2013, 107, 80–88.
  20. Zhang, Q.H.; Yan, B.; Wai, O.W.H. Fine sediment carrying capacity of combined wave and current flows. Int. J. Sediment Res. 2009, 24, 425–438.
  21. Chung, E.G.; Bombardelli, F.A.; Schladow, S.G. Modeling linkages between sediment resuspension and water quality in a shallow, eutrophic, wind-exposed lake. Ecol. Model. 2009, 220, 1251–1265.
  22. Lorenzen, M.W. Use of chlorophyll-Secchi disk relationships. Limnol. Oceanogr. 1980, 25, 371–372.
  23. Fleming-Lehtinen, V.; Laamanen, M. Long-term changes in Secchi depth and the role of phytoplankton in explaining light attenuation in the Baltic Sea. Estuar. Coast. Shelf Sci. 2012, 102, 1–10.
  24. Avigliano, E.; Schenone, N. Water quality in Atlantic rainforest mountain rivers (South America): Quality indices assessment, nutrients distribution, and consumption effect. Environ. Sci. Pollut. Res. 2016, 23, 15063–15075.
  25. Johengen, T.H.; Biddanda, B.A.; Cotner, J.B. Stimulation of Lake Michigan Plankton Metabolism by Sediment Resuspension and River Runoff. J. Great Lakes Res. 2008, 34, 213–227.
  26. Taillie, D.M.; O’Neil, J.M.; Dennison, W.C. Water quality gradients and trends in New York Harbor. Reg. Stud. Mar. Sci. 2020, 33, 100922.
  27. Bessell-Browne, P.; Negri, A.P.; Fisher, R.; Clode, P.L.; Duckworth, A.; Jones, R. Impacts of turbidity on corals: The relative importance of light limitation and suspended sediments. Mar. Pollut. Bull. 2017, 117, 161–170.
  28. Tyler, J.E. The Secchi disc. Limnol. Oceanogr. 1968, 8, 1–6.
  29. Davies-Colley, R.J. Measuring water clarity with a black disc. Limnol. Oceanogr. 1988, 33, 616–623.
  30. Preisendorfer, R.W. Secchi disk science: Visual optics of natural waters. Limnol. Oceanogr. 1986, 31, 909–926.
  31. Larson, G.L.; Hoffman, R.L.; Hargreaves, B.R.; Collier, R.W. Predicting Secchi disk depth from average beam attenuation in a deep, ultra-clear lake. Hydrobiologia 2007, 574, 141–148.
  32. McClain, C.R. A decade of satellite ocean color observations. Annu. Rev. Mar. Sci. 2009, 1, 19–42.
  33. Carlson, R.E. A trophic state index for lakes. Limnol. Oceanogr. 1977, 22, 361–369.
  34. Kim, S.H.; Yang, C.S.; Ouchi, K. Spatio-temporal patterns of Secchi depth in the waters around the Korean Peninsula using MODIS data. Estuar. Coast. Shelf Sci. 2015, 164, 172–182.
  35. Brezonik, P.L. Effect of organic color and turbidity of secchi disk transparency. J. Fish Res. Board Can. 1978, 35, 1410–1416.
  36. Gikas, G.D.; Yiannakopoulou, T.; Tsihrintzis, V.A. Water quality trends in a coastal lagoon impacted by non-point source pollution after implementation of protective measures. Hydrobiologia 2006, 563, 385–406.
  37. Gikas, G.D.; Tsihrintzis, V.A.; Akratos, C.S.; Haralambidis, G. Water quality trends in Polyphytos reservoir, Aliakmon River, Greece. Environ. Monit. Assess. 2009, 149, 163–181.
  38. Wu, G.; Leeuw, J.D.; Liu, Y. Understanding seasonal water clarity dynamics of Lake Dahuchi from in situ and remote sensing data. Water Resour. Manag. 2009, 23, 1849–1861.
  39. Yan, Z.; Ding, F.Y.; Qian, Y.; Pan, S.; Gai, Y.; Cheng, W.; Liu, X.; Tang, S. Variations of Water Transparency and Impact Factors in the Bohai and Yellow Seas from Satellite Observations. Remote Sens. 2021, 13, 514.
  40. Xu, Y.X.; Wang, W.C.; Zeng, W.F.; Li, Y.; Lai, Q.; Yin, X.; Zhang, S. Simulation on improvement of water environment in plain river network by water diversion. Water Resour. Prot. 2018, 34, 70–76. (In Chinese)
  41. Ibáñez Civera, J.; Garcia Breijo, E.; Laguarda Miró, N.; Gil Sánchez, L.; Garrigues Baixauli, J.; Romero Gil, I.; Masot Peris, R.; Alcañiz Fillol, M. Artificial neural network onto eight bit microcontroller for secchi depth calculation. Sensors Actuators B Chem. 2011, 156, 132–139.
  42. Liao, Y.P.; Zhou, Y.L.; Fan, Z.; Jia, Y.; Li, Y. Study on water quality changes of river work in Suzhou ancient city under summer drainage conditions. Hydro Sci. Eng. 2019, 5, 18–26. (In Chinese)
  43. Jia, H.F.; Yang, C.; Zhang, Y.H.; Chen, Y.R. Simulations of water quality improvement for urban river networks. Qinghua Daxue Xuebao 2013, 53, 665–672. (In Chinese)
  44. Chaouni, A. River remediation and urban development scheme. In Proceedings of the Global Holcim Awards, Fez, Morocco, 17 July 2009.
  45. Adamala, S.; Raghuwanshi, N.S.; Mishra, A. Generalized quadratic synaptic neural networks for ET0 modeling. Environ. Process. 2015, 2, 309–329.
  46. Chen, H.K. Effects of Sediment Motions on the Transport and Transformation of Phosphorus at Water-Sediment Interface; Hohai University: Nanjing, China, 2017. (In Chinese)
  47. Zeng, C.J.; Mo, K.L.; Chen, Q.W. Improvement on numerical modeling of total dissolved gas dissipation after dam. Ecol. Eng. 2020, 156, 105965.
  48. Zhu, M.; Zhu, G.; Zhao, L.; Yao, X.; Zhang, Y.; Gao, G.; Qin, B. Influence of algal bloom degradation on nutrient release at the sediment–water interface in Lake Taihu, China. Environ. Sci. Pollut. Res. Int. 2013, 20, 1803–1811.
  49. Yin, C.M.; Zhao, L.C. Strong consistency and asymptotic normality of maximum likelihood estimates in generalized linear models. Chin. J. Appl. Probab. Stat. 2005, 21, 249–260.
  50. Wang, Y.; Kuhnert, P.; Henderson, B. Load estimation with uncertainties from opportunistic sampling data–a semiparametric approach. J. Hydrol. 2011, 396, 148–157.
  51. Heddam, S. Generalized regression neural network based on approach as a new tool for predicting total dissolved gas (TDG) downstream of spillways of dams: A case study of Columbia River Basin dams, USA. Environ. Process. 2017, 4, 235–253.
  52. Legates, D.R.; McCabe, G.J. Evaluating the use of “goodness of fit” measures in hydrologic and hydroclimatic model validation. Water Resour. Res. 1999, 35, 233–241.
  53. Zhang, Y.L. Distribution seasonal variation and correlation analysis of the transparency in Taihu Lake. Trans. Oceanol. Limnol. 2003, 96, 30–36. (In Chinese)
  54. Zhang, Y.; Qin, B.; Hu, W.; Wang, S.; Chen, Y.; Chen, W. Temporal-spatial variations of euphotic depth of typical lake regions in Lake Taihu and its ecological environmental significance. Sci. China Ser. D 2006, 49, 431–442.
  55. Zheng, S.S.; Wang, P.F.; Wang, C.; Hou, J. Sediment resuspension under action of wind in Taihu Lake, China. Int. J. Sediment Res. 2015, 30, 48–62.
  56. Li, X.; Zecchin, A.C.; Maier, H.R. Improving partial mutual information-based input variable selection by consideration of boundary issues associated with bandwidth estimation. Environ. Model. Softw. 2015, 71, 78–96.
  57. Reshef, D.N.; Reshef, Y.A.; Finucane, H.K.; Grossman, S.R.; McVean, G.; Turnbaugh, P.J.; Lander, E.S.; Mitzenmacher, M.; Sabeti, P.C. Detecting novel associations in large data sets. Science 2011, 334, 1518–1524.
Figure 1. Location of the study area in Suzhou urban river network (China).
Figure 2. Process of Artificial neural network for predicting Secchi depth.
Figure 3. Different types of SD at Suzhou urban river network from 5 January 2013 to 28 December 2019: (a) weekly average SD, (b) weekly maximum and minimum SD.
Figure 4. Scatter plots of measured versus predicted values of SD (M4), for ANN model: (a) training, (b) verification, and (c) testing (upper panel), and for MLR model: (d) training, (e) verification, and (f) testing (lower panel).
Figure 5. Predicted results of correlation between velocity and SD using LR, and SVR models with the preprocessed datasets. (a): Training; (b): Testing.
Table 1. Selected studies for different Secchi depth (SD) prediction models.

| Authors | Model | Inputs | Output | R² |
| --- | --- | --- | --- | --- |
| Carlson (1977) [33] | Ln(SD) = 2.040 − 0.68Ln(Chl-a) | Chl-a | SD | 0.86 |
| Carlson (1977) [33] | Ln(SD) = 3.876 − 0.98Ln(TP) | TP | SD | - |
| Brezonik (1978) [35] | Log(SD) = 0.63 − 0.55Log(Chl-a) | Chl-a | SD | 0.75 |
| Brezonik (1978) [35] | Log(SD) = 0.48 − 0.72Log(TUR) | TUR^d | SD^a | 0.53 |
| Gikas et al. (2006) [36] | SD = 0.52(Chl-a) − 0.05 | Chl-a^e | SD | 0.02^h |
| Gikas et al. (2006) [36] | SD = 0.85(Chl-a) − 0.22 | Chl-a^e | SD | 0.34^i |
| Gikas et al. (2009) [37] | Log(SD) = 5.32Log(Chl-a) − 0.38 + 2.11Log(TSS) − 0.16 | Chl-a^e, TSS | SD | 0.37 |
| Gikas et al. (2009) [37] | Log(SD) = 10.96Log(TP) − 0.54 | TP^f | SD | 0.35 |
| Wu et al. (2009) [38] | Ln(SD) = −0.712 + 0.093WL − 0.278WV [WL ≥ 14.75] | WL, WV | SD | 0.72 |
| Wu et al. (2009) [38] | Ln(SD) = −19.887 + 1.393WL − 0.217WV [WL < 14.75] | WL, WV | SD | 0.72 |

Abbreviations: SD^a: Secchi depth in centimeters (cm); SD^b: Secchi depth in feet; TUR^d: turbidity in Formazin Nephelometric Units (FNU); Chl-a^e: chlorophyll-a (μg/L); TP^f: total phosphorus (μg/L); B: blue band of the Moderate Resolution Imaging Spectroradiometer (MODIS); R: red band of MODIS; WL: water level in meters; WV: wind velocity (m/s); 0.34^i: model from 1998 to 1999.
Table 2. Three processes of water transparency change in plain urban river networks.

| No. | Type | Content |
| --- | --- | --- |
| 1 | Physical | Changes in the distribution of sediment in the water body caused by the flow rate or wind speed lead to a decrease in transparency, etc. |
| 2 | Chemical | Pollutants in the water body deteriorate the water quality, resulting in a decrease in transparency, etc. |
| 3 | Biological | Transparency changes caused by increased aquatic biomass, such as excessive growth of algae and mud turning by fish, etc. |
Table 3. Statistical characteristics of different SD (m).

| SD type | Mean | Standard Deviation | Minimum | Maximum |
| --- | --- | --- | --- | --- |
| Weekly average SD | 0.558 | 1.79 | 0.26 | 0.89 |
| Weekly maximum SD | 0.803 | 1.95 | 0.544 | 1.103 |
| Weekly minimum SD | 0.321 | 1.70 | 0.179 | 0.566 |
Table 4. Statistical parameters of the input data.

| Parameters | Xmean | Xmax | Xmin | Sx | Cv | CC |
| --- | --- | --- | --- | --- | --- | --- |
| SD (m) | 0.558 | 1.216 | 0.147 | 0.439 | 0.692 | 1.000 |
| V (m/s) | 0.236 | 0.624 | 0.089 | 0.348 | 0.761 | 0.146 |
| TSS (mg/L) | 25.8 | 54.4 | 9.1 | 6.574 | 1.235 | −0.497 |
| TE (°C) | 17.8 | 32.6 | 1.3 | 4.223 | 0.542 | −0.134 |
| DO (mg/L) | 5.14 | 8.92 | 0.23 | 1.174 | 0.211 | −0.085 |
| COD (mg/L) | 20.23 | 36.72 | 11.48 | 5.241 | 0.463 | −0.363 |
| Chl (μg/L) | 14.96 | 24.67 | 6.72 | 3.621 | 0.283 | −0.47 |
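Table 5. Performance of the ANN and MLR in different phases.

| Method | Model | Train CC | Train RMSE | Train MAE | Verify CC | Verify RMSE | Verify MAE | Test CC | Test RMSE | Test MAE |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| ANN | M1 | 0.876 | 1.308 | 0.911 | 0.856 | 1.441 | 0.869 | 0.887 | 0.882 | 0.848 |
| ANN | M2 | 0.895 | 1.223 | 0.872 | 0.837 | 1.619 | 1.014 | 0.861 | 0.931 | 1.009 |
| ANN | M3 | 0.909 | 0.885 | 0.659 | 0.884 | 1.543 | 0.846 | 0.912 | 1.248 | 0.938 |
| ANN | M4 | 0.927 | 0.867 | 0.646 | 0.893 | 1.546 | 0.848 | 0.919 | 0.906 | 0.869 |
| MLR | M1 | 0.588 | 2.611 | 1.902 | 0.528 | 2.813 | 2.267 | 0.578 | 2.738 | 2.352 |
| MLR | M2 | 0.571 | 2.783 | 1.928 | 0.507 | 3.128 | 2.443 | 0.558 | 2.862 | 2.486 |
| MLR | M3 | 0.605 | 2.655 | 2.007 | 0.572 | 2.965 | 2.491 | 0.583 | 2.923 | 2.573 |
| MLR | M4 | 0.616 | 2.612 | 1.915 | 0.589 | 2.854 | 2.373 | 0.602 | 2.774 | 2.409 |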
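For readers who want to reproduce the three scores reported in Table 5, the sketch below computes CC, RMSE and MAE from paired measured and predicted values. It assumes the standard definitions (Pearson correlation coefficient, root mean square error, mean absolute error). The function name `evaluate` and the sample values are hypothetical and are not taken from the study's data.

```python
import numpy as np

def evaluate(measured, predicted):
    """Return CC, RMSE and MAE for paired measured/predicted values."""
    m = np.asarray(measured, dtype=float)
    p = np.asarray(predicted, dtype=float)
    cc = float(np.corrcoef(m, p)[0, 1])            # Pearson correlation
    rmse = float(np.sqrt(np.mean((m - p) ** 2)))   # root mean square error
    mae = float(np.mean(np.abs(m - p)))            # mean absolute error
    return {"CC": cc, "RMSE": rmse, "MAE": mae}

# Hypothetical SD values, for illustration only.
measured = [0.40, 0.55, 0.62, 0.48, 0.71]
predicted = [0.42, 0.50, 0.65, 0.47, 0.68]

scores = evaluate(measured, predicted)
print(scores)
```

A higher CC together with lower RMSE and MAE indicates a closer match between measured and predicted SD, which is the pattern the ANN rows show relative to the MLR rows.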
Table 6. Description of regression equations by the MLR models.

| Model | Regression Equations |
| --- | --- |
| M1 | 8.316 + 0.306LnV − 0.132TSS − 0.117Chl |
| M2 | 8.723 + 0.289LnV − 0.141TSS − 0.052TE − 0.124DO − 0.105Chl |
| M3 | 9.852 + 0.267LnV − 0.129TSS − 0.052TE − 0.084DO − 0.113COD − 0.102Chl |
| M4 | 8.521 + 0.294LnV − 0.131TSS − 0.106Chl − 0.112COD |
Table 7. RMSE and R² of different models.

| Models | Training RMSE | Training R² | Test RMSE | Test R² |
| --- | --- | --- | --- | --- |
| LR model | 27.36 | 0.3226 | 24.44 | 0.4139 |
| SVR model | 25.03 | 0.6535 | 26.82 | 0.4718 |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Share and Cite

MDPI and ACS Style

Liao, Y.; Li, Y.; Shu, J.; Wan, Z.; Jia, B.; Fan, Z. Water Transparency Prediction of Plain Urban River Network: A Case Study of Yangtze River Delta in China. Sustainability 2021, 13, 7372. https://doi.org/10.3390/su13137372
