Predicting Runoff Chloride Concentrations in Suburban Watersheds Using an Artificial Neural Network (ANN)

Abstract: Road salts in stormwater runoff, from both urban and suburban areas, are of concern to many. Chloride-based deicers [i.e., sodium chloride (NaCl), magnesium chloride (MgCl2), and calcium chloride (CaCl2)] dissolve in runoff, travel downstream in the aqueous phase, percolate into soils, and leach into groundwater. In this study, data obtained from stormwater runoff events were used to predict chloride concentrations and seasonal impacts at different sites within a suburban watershed. Water quality data for 42 rainfall events (2016–2019) greater than 12.7 mm (0.5 inches) were used. An artificial neural network (ANN) model was developed using measured rainfall volume, turbidity, total suspended solids (TSS), dissolved organic carbon (DOC), sodium, chloride, and total nitrate concentrations. The network was trained on the water quality data using the Levenberg-Marquardt back-propagation algorithm. The model was then applied to six different sites. The new ANN model proved accurate in predicting chloride concentrations. This study illustrates that road salt and deicers are the prime cause of high chloride concentrations in runoff during winter and spring, threatening the aquatic environment. Finally, density distribution analysis was used to examine the spatial distribution of the clustered concentration data. Density distributions showed that about 70% of the clustered values were detected at Sites 5 and 6, owing to the nearby parking lot. The other four sites, which are close to the agricultural zone, cover less than 50%. These findings again point to applied road salt as the main agent of chloride delivery to the groundwater. ANN modeling of environmental data has great potential in future work on improved prediction of chloride and other pollutant concentrations, and provides a useful tool for water resource and environmental managers.


Introduction
Urban areas require the construction of buildings, roads, and parking areas, yet such development causes hydrologic impacts and pollution as pervious surfaces are made impervious [1]. For safety, given abundant snowfall during the winter season, most communities in New England apply road salt or other deicers to roads and parking areas. Winter application of road salts and deicers is the primary factor increasing salinity in surface soils, surface water, groundwater, and runoff. In the USA, an average of 24 million metric tons of road salt is applied to roads each year [2]. It is well established that the application of road salts leads to the accumulation of sodium and chloride in soils and surface waters [3][4][5], with adverse impacts on downstream aquatic ecosystems [6]. Moreover, as impervious surface area increases, so does the area that needs to be deiced.
Chloride-based deicers [i.e., sodium chloride (NaCl), magnesium chloride (MgCl2), and calcium chloride (CaCl2)] dissolve in runoff, percolate into soils, and leach into groundwater. Chloride from chloride-based deicers does not efficiently precipitate or biodegrade but is absorbed by mineral/soil surfaces [7]. Although winter road deicing is an essential service for urban areas in the USA (especially in the upper Midwest and Northeast), it contributes to a significant increase in chloride concentration [8]. The United States Geological Survey (USGS) (2014) conducted a temporal, seasonal, and environmental analysis of chloride concentrations in urban areas and assessed the effects on water quality and the environment, especially on aquatic organisms, across the USA [8]. That study concluded that there is an increasing trend of high chloride concentrations in urban areas due to the expansion of impervious cover that requires deicing.
Recently, several studies have applied artificial neural network (ANN) methods to predict water quality from input variables [12]. Since 1990, ANNs have been applied in many fields, including environmental sciences, ecological sciences, and water engineering [13]. According to Haykin (1999) [14], ANNs are highly capable of modeling nonlinear systems and are highly adaptable. ANNs allow precise predictions of the target parameter for specific materials or stages [14,15].
In this study, an ANN model is developed with a back-propagation algorithm. The back-propagation algorithm incorporates highly nonlinear relationships [15]. The ANN model was developed for rapid calculation and prediction of selected water quality variables at any location of interest. Within the model, unknown parameter weights are adjusted to obtain the best correlation between appropriate input parameters, or a historical set of model inputs, and the corresponding outputs [16]. This study provides the ANN modeling method needed to simulate and forecast chloride concentrations in runoff. The aims of the study are to (i) develop an ANN model of the system trained using a small data set, (ii) obtain the best-fit models for predicting chloride concentrations using data from monitoring sites, (iii) evaluate the ANN model performance using 3 years (2016–2019) of observed data versus predicted data from the model, and (iv) determine the accuracy of the ANN model performance. The study also assesses the impact of road salt applications through a spatial density distribution focused on probable high chloride concentrations in an area.

Materials and Methods
A three-year study on the effectiveness of stormwater best management practices (BMPs) was conducted in the Chipuxet watershed of South Kingstown, Rhode Island, USA (Figure 2).
[Figure: exact location of the study site (Google Earth screenshot on the right).]
Hydrology 2020, 7, x FOR PEER REVIEW
An overview of the chloride concentrations over three years (2016–2019) at the six sites is presented in Figure 3. Based on the analysis of the stormwater runoff quality data, an apparent seasonal variation in chloride concentration is observed (Figure 3). Higher chloride concentrations were found during winter and early spring (at the tail end of winter). Our results are highly consistent with the study conducted by the USGS (2014) [7]. The USGS study showed an increasing trend of chloride concentration in the New England zone, and our data give the same impression. As stated above, high chloride concentrations (0.8–197.9 mg/L) were seen in the winter samples at all sites. Sites 5 and 6 are in close proximity to the parking lot; because parking lots are highly impermeable, these sites exhibited higher chloride concentrations (0.9–197.9 mg/L) than the remaining four sites, which are close to the agricultural field (Figure 3). The chloride concentration data were then used to investigate future scenarios through the ANN model. The steps used to develop the model include the choice of model performance criteria, preprocessing of available data, the selection of appropriate model inputs, and the choice of network structure.

Artificial Neural Network (ANN)
The ANN concept was first introduced by McCulloch and Pitts in 1943, and ANN applications in research areas started with the back-propagation algorithm for feed-forward ANNs in 1986 [17,18]. ANNs consist of multiple layers: the input and output layers are common to all models, and one or more hidden layers may be needed between them [19]. Each layer in an ANN consists of a parameterizable number of neurons. Neurons are activation functions with adjustable weights, set based on a priori and domain knowledge [20]. In this study, three approaches, namely a back-propagation neural network (Levenberg-Marquardt), curve fitting, and density distribution, were considered and adapted to develop the final model for predicting and validating chloride concentration in the runoff. The overall objective of the ANN model was to reduce the model error, $E$, defined as
$$E = \frac{1}{P}\sum_{p=1}^{P} E_p \quad (1)$$
where $P$ = total number of training patterns and $E_p$ = error for training pattern $p$. $E_p$ is calculated with the following equation:
$$E_p = \frac{1}{2}\sum_{k=1}^{N}\left(o_k - t_k\right)^2 \quad (2)$$
where $N$ = total number of output nodes; $o_k$ = network output at the $k$th output node; and $t_k$ = target output at the $k$th output node. Additional details on the mechanics of this study are described below.
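The error definition can be illustrated numerically. The following is a minimal sketch, assuming the common form in which each pattern error is half the sum of squared differences between network and target outputs, and the overall error is the mean over all training patterns. The function names and the sample values are hypothetical and are not taken from the study's data.

```python
import numpy as np

def pattern_error(outputs, targets):
    """E_p = 1/2 * sum_k (o_k - t_k)^2 for a single training pattern."""
    outputs = np.asarray(outputs, dtype=float)
    targets = np.asarray(targets, dtype=float)
    return 0.5 * float(np.sum((outputs - targets) ** 2))

def mean_error(all_outputs, all_targets):
    """E = (1/P) * sum_p E_p, averaged over the P training patterns."""
    errors = [pattern_error(o, t) for o, t in zip(all_outputs, all_targets)]
    return sum(errors) / len(errors)

# Hypothetical values: three patterns, one output node (chloride, mg/L).
predicted = [[10.0], [52.0], [190.0]]
observed = [[12.0], [50.0], [194.0]]
E = mean_error(predicted, observed)
print(E)
```

With a single output node, as assumed here, each pattern error reduces to half the squared residual for that storm event.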

Back Propagation (BP) Algorithm
Back propagation (BP) is the most widely used method for training multilayer feed-forward networks. Before BP, almost all networks used non-identifiable complex binary nonlinear methods to self-test, such as step functions, statistical time series models, autoregressive integrated moving average (ARIMA), and moving average (MA) [21,22]. Layered networks trained with the BP algorithm are useful for nontrivial calculations, with attractive features such as fast response, fault tolerance, the ability to learn from input observations, and the capability to generalize beyond the training data. A set of input variables is needed to train the network to match desired outputs, together with a function that measures the "value" of the differences between network outputs and desired values [22]. The most straightforward implementation of the standard BP algorithm adjusts the network weights and biases in the direction of the target output, and this adjustment improves model accuracy.
The back-propagation neural network structure consists of two or more layers of neurons, and network weights connect all the neurons [22,23]. The final output is produced when the input data pass through the hidden layers to the output layer. This process is shown in Equation (3).
$$y_j = f\left(\sum_{i} W_{ji}\, x_i\right) \quad (3)$$
In Equation (3), $W_{ji}$ represents the weight that connects neurons $i$ and $j$, $x_i$ is the input from neuron $i$, $f$ is the activation function, and $y_j$ is the output of neuron $j$. Every neuron calculates its output based on the stimulation it receives from the given input vector. The "net input" of a neuron is the weighted sum of its inputs, and the output of the neuron is determined by the activation function applied to the magnitude of the "net input" [24]. BP is a training algorithm consisting of two steps: first, values are fed forward, and second, the error is calculated and propagated back to the earlier layers.
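The two BP steps (feed forward, then propagate the error back) can be sketched for a small network. This is an illustrative example only: it uses plain gradient descent rather than the Levenberg-Marquardt variant used in the study, and it assumes one sigmoid hidden layer, a linear output node, and a half-squared-error loss. The layer sizes, learning rate, and input values are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, W1, b1, W2, b2):
    """Feed-forward pass: weighted sums, then activation functions."""
    hidden = sigmoid(W1 @ x + b1)   # net input -> sigmoid hidden layer
    output = W2 @ hidden + b2       # linear output layer
    return hidden, output

def loss(x, t, W1, b1, W2, b2):
    _, output = forward(x, W1, b1, W2, b2)
    return 0.5 * float(np.sum((output - t) ** 2))

def backprop_step(x, t, W1, b1, W2, b2, lr=0.1):
    """One gradient-descent update using back-propagated errors."""
    hidden, output = forward(x, W1, b1, W2, b2)
    delta_out = output - t                                   # error at output
    delta_hidden = (W2.T @ delta_out) * hidden * (1.0 - hidden)  # error sent back
    return (W1 - lr * np.outer(delta_hidden, x),
            b1 - lr * delta_hidden,
            W2 - lr * np.outer(delta_out, hidden),
            b2 - lr * delta_out)

# Hypothetical setup: 3 scaled inputs, 4 hidden neurons, 1 output.
rng = np.random.default_rng(0)
x = np.array([0.5, -0.2, 0.1])
t = np.array([0.3])
W1, b1 = rng.normal(scale=0.5, size=(4, 3)), np.zeros(4)
W2, b2 = rng.normal(scale=0.5, size=(1, 4)), np.zeros(1)

loss_before = loss(x, t, W1, b1, W2, b2)
W1, b1, W2, b2 = backprop_step(x, t, W1, b1, W2, b2)
loss_after = loss(x, t, W1, b1, W2, b2)
print(loss_before, loss_after)
```

Repeating the update over all training patterns for many epochs is what drives the overall error down; Levenberg-Marquardt replaces the fixed-step update with a damped Gauss-Newton step.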

Curve Fitting Algorithm
Polynomial models for curves are given by Equation (4):
$$y = \sum_{i=1}^{n+1} p_i\, x^{\,n+1-i} \quad (4)$$
where $n + 1$ and $n$ represent the order of the polynomial and the degree of the polynomial, respectively, and the range of $n$ is $1 \le n \le 9$.
A third-degree (cubic) polynomial, Equation (5), is also pertinent:
$$y = p_1 x^3 + p_2 x^2 + p_3 x + p_4 \quad (5)$$
Polynomials (as in Equation (5)) are frequently used when a simple empirical model is required, or when a model is needed for interpolation or extrapolation. The main advantage of polynomial fitting is its flexibility, even for complex and large data sets [25]. Because the model is linear in its coefficients, the polynomial curve fitting process is simple [25].
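A cubic fit of this kind can be sketched with NumPy. The example below is an assumption-laden illustration: it uses ordinary least squares via `numpy.polyfit` rather than the robust fitting applied in the study, and the coefficients and sample points are made up so that the fit can be checked against known values.

```python
import numpy as np

# Hypothetical cubic: y = p1*x^3 + p2*x^2 + p3*x + p4
true_coeffs = [0.5, -2.0, 1.0, 3.0]
x = np.linspace(0.0, 4.0, 9)
y = np.polyval(true_coeffs, x)

# polyfit returns coefficients from the highest power down (p1..p4).
fitted = np.polyfit(x, y, deg=3)
y_hat = np.polyval(fitted, x)

# Sum of squared errors between the data and the fitted curve.
sse = float(np.sum((y - y_hat) ** 2))
print(fitted, sse)
```

In practice `y` would hold the ANN-predicted concentrations and `x` the observed targets, and the residual statistics of the fit would then be inspected.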

Density Distribution Algorithm
Distribution fitting is used to model the probability distribution of a single variable. The normal distribution is the most widely applied statistical distribution in research. In this study, we calculated the probability density function (PDF) of the predicted chloride concentration. Equation (6) is used to specify the probability of the predicted output [25]:
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right) \quad (6)$$
where $\sigma$ is the standard deviation, $\sigma^2$ is the variance, and $\mu$ is the mean.
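The normal density can be evaluated directly from its definition. The following sketch uses only the standard library; the function names are illustrative, and the coverage helper relies on the standard identity that the probability of falling within k standard deviations of the mean equals erf(k/√2).

```python
import math

def normal_pdf(x, mu, sigma):
    """Normal probability density with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def fraction_within(k):
    """Probability that a normal variable lies within k standard deviations of its mean."""
    return math.erf(k / math.sqrt(2.0))

peak = normal_pdf(0.0, mu=0.0, sigma=1.0)   # density at the mean
coverage_2sd = fraction_within(2.0)         # roughly 95%
print(peak, coverage_2sd)
```

For predicted concentrations, `mu` and `sigma` would be estimated from the model output at each site before evaluating the density.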

Model Structure
In recent years, neural network technology has been adopted in water quality prediction, with the back-propagation network most commonly used [26,27]. The model created in this study is a BP neural network model with a single hidden layer (Figure 4). In this ANN, the input layer is R, the hidden layer is a1, the output layer is a2, the weight matrix of the input layer is IW1.1, and the weight matrix from the hidden layer to the output layer is LW2.1. The threshold values of the hidden and output layers are b1 and b2, respectively, and f1 and f2 are the neuron transfer functions of the hidden and output layers, respectively.
As theoretically verified, the BP model shown in Figure 4 can handle any nonlinear function with minimal discontinuities to any accuracy, as long as there is a sufficient number of neurons in the hidden layer; the number of neurons is determined based on a priori and domain knowledge [28]. The proposed ANN model (Figure 5) has two hidden layers of sigmoid neurons followed by an output layer of linear neurons. Sigmoid neurons make the output smoother, so that a small change in the input causes only a small variation in the output [29]. This network can be used as a general function approximator: it can estimate any function with a finite number of discontinuities, given sufficient neurons in the hidden layer [30,31].

ANN Parameter Selection: Hidden Layers and Nodes
The number of hidden layers in an ANN model is usually determined by trial and error. As a rule of thumb for defining the number of hidden nodes, the number of training set samples should be higher than the number of synaptic weights [31,32]. Most ANN modelers consider a one-hidden-layer network in which the number of hidden nodes lies between the number of input nodes and 2 × (number of input nodes) + 1 [30]. However, the number of hidden nodes should not be less than the larger of one third of the number of input nodes and the number of output nodes. The optimum number of hidden nodes is fixed by trial and error. Networks with the minimum number of hidden nodes are usually preferred because they generalize better and are less prone to overfitting. For this study, a trial-and-error procedure for selecting the number of hidden nodes was carried out by gradually changing the number of hidden layer nodes.
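These rules of thumb translate into simple arithmetic that can narrow the trial-and-error search. The sketch below is illustrative only: the function names are hypothetical, the one-third bound is rounded up, and the weight count assumes a single hidden layer with one bias term per hidden and output node.

```python
import math

def hidden_node_bounds(n_inputs, n_outputs):
    """Lower bound: max(inputs / 3, outputs); upper bound: 2 * inputs + 1."""
    lower = max(math.ceil(n_inputs / 3), n_outputs)
    upper = 2 * n_inputs + 1
    return lower, upper

def synaptic_weight_count(n_inputs, n_hidden, n_outputs):
    """Weights plus biases for one hidden layer (bias handling is an assumption)."""
    return (n_inputs + 1) * n_hidden + (n_hidden + 1) * n_outputs

# Illustrative sizes: 11 inputs, 1 output, 9 hidden nodes.
bounds = hidden_node_bounds(11, 1)
n_weights = synaptic_weight_count(11, 9, 1)
print(bounds, n_weights)
```

Comparing `n_weights` with the number of training samples gives a quick check of the sample-size rule before each trial network is trained.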

In the developed model structure (Figure 5), the input and output variables are established for the evaluation of water quality. Multiple layers of neurons of the developed ANN structure with nonlinear transfer functions let the network assess nonlinear and linear relationships that underlie input and output vectors. The final output layer is linear, allowing the network to produce values outside the range −1 to +1.

ANN Parameter Selection: Learning Rate and Momentum
The learning rate and momentum parameters serve to enhance model training and ensure that the error is reduced. There is no precise rule for selecting values for these parameters. Here, the learning rate was controlled by internal validation: at the end of each epoch, the weights were updated. The number of epochs with the smallest internal validation error indicates which weights to select [33]. In this study, the learning rate for the weights connecting the input layer and the hidden layer was set at double the learning rate for the weights connecting the hidden layer to the output layer, to increase the rate of network convergence. The momentum was initially fixed at a value of 0.015, with the number of hidden nodes initially estimated as the number of input nodes + 1, similar to the study conducted by Maier and Dandy (1996) [34].

ANN Parameter Selection: Initial Weights
When the weights of a network are trained by BP, it is always better to initialize them with small, non-zero random values, although ANN modelers can start over with a different set of initial weights [22,23]. In this study, the amplitudes of the connections between nodes (synaptic weights) of the proposed ANN networks were initialized using normally distributed random numbers in the range from −1 to 1.

ANN Parameter Selection: Selection of Input Variables
In an ANN, one of the main tasks is to determine the model input variables that significantly affect the output variable(s). The selection of input variables is usually based on a priori knowledge of the output variables, inspection of time series plots, and statistical analysis of potential inputs and outputs. In this study, the input variables for the neural network modeling were selected based on a statistical correlation analysis of the runoff quality data, the prediction accuracy of water quality variables, and domain knowledge. Domain knowledge is the field-specific knowledge that supports the interpretation of data when applying machine learning algorithms such as regression, stepwise approaches, and classification to predict test data [35]. In a stepwise approach, separate networks are trained for each input variable [36]. We experimented with the water quality variables listed above in several models, both to identify the optimal predictive model and to reduce monitoring cost by including fewer input parameters. The input variables selected to develop the ANN model are rainfall amount, duration of rainfall, intensity, runoff coefficient, runoff depth, peak discharge, turbidity, total suspended solids (TSS), dissolved organic carbon (DOC), sodium, chloride, and total nitrate concentrations. After selecting the appropriate input variables, the next step involved determining appropriate lags for each of these variables. Appropriate lags are needed for complex problems, where the number of potential inputs is large and no a priori knowledge is available. Lags allow the model to establish a meaningful connection between the output and the input variables. By doing so, the best network performance is retained, and the effect of adding each of the remaining inputs in turn is assessed.
The correlations between the input variables and output variables are computed separately for each lagged input variable [37]. In this study, optimal networks for each of these combinations were obtained with these time-lagged variables, and the results were compared with the target dataset.
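Screening lags by correlation can be sketched as follows. This is a hypothetical illustration, not the study's procedure: the helper names are invented, the series are synthetic, and the Pearson correlation is used as the screening statistic.

```python
import numpy as np

def lagged_correlation(x, y, lag):
    """Pearson correlation between the input x delayed by `lag` steps and the output y."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if lag < 0:
        raise ValueError("lag must be non-negative")
    xs, ys = (x, y) if lag == 0 else (x[:-lag], y[lag:])
    return float(np.corrcoef(xs, ys)[0, 1])

def best_lag(x, y, max_lag):
    """Return the lag with the largest absolute correlation, plus all scores."""
    scores = {lag: lagged_correlation(x, y, lag) for lag in range(max_lag + 1)}
    return max(scores, key=lambda k: abs(scores[k])), scores

# Synthetic example: the output simply repeats the input two steps later.
rng = np.random.default_rng(1)
x = rng.normal(size=50)
y = np.roll(x, 2)
lag, scores = best_lag(x, y, max_lag=5)
print(lag, scores[lag])
```

Each candidate input would be screened this way, and the lag with the strongest correlation carried forward into the trial networks.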

ANN Parameter Selection: Data Partition
It is essential to divide the data set in such a way that the training and overfitting test data sets are statistically comparable. The test set should be approximately 10–30% of the size of the training set [38]. In this study, the water quality data were divided into three partitions: the first set contained 70% of the records and was used as the training set, the second set contained 15% of the records and was used as the overfitting test set, and the rest of the data (15%) were used as the validation set. This process is necessary because the efficiency of the developed neural network model is highly dependent on the quantity and quality of the data, as stated by Palani et al. (2008) [18].
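A 70/15/15 partition can be sketched in a few lines. The example assumes a random shuffle of record indices, which the text does not specify; the function name, the seed, and the rounding of partition sizes are likewise illustrative choices.

```python
import numpy as np

def partition_indices(n_records, train_frac=0.70, test_frac=0.15, seed=0):
    """Randomly split record indices into training, overfitting-test, and validation sets."""
    rng = np.random.default_rng(seed)
    shuffled = rng.permutation(n_records)
    n_train = int(round(train_frac * n_records))
    n_test = int(round(test_frac * n_records))
    train = shuffled[:n_train]
    test = shuffled[n_train:n_train + n_test]
    validation = shuffled[n_train + n_test:]   # remainder of the records
    return train, test, validation

# 42 storm events, as in this study.
train_idx, test_idx, val_idx = partition_indices(42)
print(len(train_idx), len(test_idx), len(val_idx))
```

Because 42 records cannot be split into exact 70/15/15 shares, the validation set here absorbs the rounding remainder.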

ANN Parameter Selection: Model Performance Evaluation
The model's efficiency was evaluated using the root mean square error (RMSE, see Equation (7)), the mean absolute error (MAE, see Equation (8)), and $R^2$ (see Equation (9)) [39]. Scatter plots and time series plots were used for visual comparison of the observed and predicted chloride concentration values.
$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(y_i - \hat{y}_i\right)^2} \quad (7)$$
$$\mathrm{MAE} = \frac{1}{N}\sum_{i=1}^{N}\left|y_i - \hat{y}_i\right| \quad (8)$$
$$R^2 = 1 - \frac{F}{F_0} \quad (9)$$
An $R^2$ value of zero indicates that the observed mean is as good a predictor as the model, an $R^2$ value of one represents a perfect fit, and a negative $R^2$ value indicates that the observed mean is a better predictor than the model [40]. Depending on the sensitivity of the water quality parameters and any mismatch between the forecasted and measured water quality variables, one can decide whether the predictive power of the ANN model is accurate enough to inform crucial decisions regarding data usage.
$F$ and $F_0$ are described by the following two equations:
$$F = \sum_{i=1}^{N}\left(y_i - \hat{y}_i\right)^2$$
$$F_0 = \sum_{i=1}^{N}\left(y_i - \bar{y}\right)^2$$
where $N$ is the total number of observations, $y_i$ and $\hat{y}_i$ are the observed and predicted values, respectively, and $\bar{y}$ is the mean of the observed values. The other primary criterion used to select the optimum ANN model was the sum of squared errors (SSE), determined from the following empirical equation:
$$\mathrm{SSE} = \sum_{i=1}^{N} w_i\left(y_i - \hat{y}_i\right)^2$$
where $w_i$ are the weights and $y_i$ and $\hat{y}_i$ are the observed response value and the fitted response value, respectively. The weights determine how much each response value influences the final parameter estimates. A high-quality data point influences the fit more than a low-quality data point. Weighting data is recommended if the absolute weights are known, or if there is good cause for weighting data differently.
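The evaluation criteria can be computed directly from paired observed and predicted values. The sketch below assumes the standard definitions of RMSE, MAE, and R² (as one minus the ratio of residual to total sum of squares) and a weighted SSE with unit weights by default; the sample values are hypothetical.

```python
import numpy as np

def rmse(y, y_hat):
    return float(np.sqrt(np.mean((np.asarray(y) - np.asarray(y_hat)) ** 2)))

def mae(y, y_hat):
    return float(np.mean(np.abs(np.asarray(y) - np.asarray(y_hat))))

def r_squared(y, y_hat):
    y, y_hat = np.asarray(y, dtype=float), np.asarray(y_hat, dtype=float)
    F = np.sum((y - y_hat) ** 2)       # residual sum of squares
    F0 = np.sum((y - y.mean()) ** 2)   # spread of observations about their mean
    return float(1.0 - F / F0)

def weighted_sse(y, y_hat, w=None):
    y, y_hat = np.asarray(y, dtype=float), np.asarray(y_hat, dtype=float)
    w = np.ones_like(y) if w is None else np.asarray(w, dtype=float)
    return float(np.sum(w * (y - y_hat) ** 2))

# Hypothetical observed and predicted chloride concentrations (mg/L).
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 39.0]
print(rmse(observed, predicted), mae(observed, predicted),
      r_squared(observed, predicted), weighted_sse(observed, predicted))
```

Predicting the observed mean for every sample gives an R² of exactly zero, which is the baseline against which the ANN models are judged.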

Model Output
The BP ANN architecture was applied with five hidden layers with different activation functions, initial weights of 0.3, an optimum learning rate of 0.1, and a momentum of 0.015, as described in Sections 2.5.2 and 2.5.3. The proposed ANN model was designed considering 11 input variables from 42 storm events. The sensitivities of the input parameters for the chloride concentration prediction are smaller than those used for the validation dataset. An individual ANN model was run for each of the sites using the same ANN model structure shown in Figure 5 and the same input parameters. The parameters that produced the "best results" for all sites (Table 1) were then used for the final chloride concentration prediction. The performance of the ANN model was evaluated based on the $R^2$ values for training, validation, and testing. $R^2$ values for each of the sites were similar for the three data partitions, as indicated in Table 1. The weights are methodically changed by the learning algorithm such that, for a given input, the difference between the ANN output and the actual output is small. The developed ANN model with nine hidden nodes was considered optimal here, considering the output (Tables 1 and 2). The optimum network parameters associated with the model output are presented in Tables 1 and 2. Validation errors were calculated after the optimization of the network parameters and the topology. Error was calculated in two ways. First, cross validation was applied. In this method, data were separated into three parts: training (70%), validation (15%), and testing (15%). The output of this first technique is shown in Table 1. Second, the ANN-predicted outputs were validated using a curve fitting technique. Curve fitting analysis showed a good fit between the targeted or observed values and the predicted values (Figure 6). The model outputs were considered acceptable based on the $R^2$ values for training, validation, and testing.
In general, the accuracy of the model can be improved by adding data to the validation step or to the input variables.

Curve Fitting Analysis
Curve fitting analysis examines the relationship between the target output and the model output. The fit between the target and predicted values is shown for all six sites (Figure 6). Except for Site 1, all sites showed a very good fit between the two datasets (target and model output). This fulfilled the aims of applying the polynomial bi-square fitting for the presence of concentrated chloride. Polynomial fitting with a high-order polynomial uses the large predictor values as the basis for a matrix, which often creates scaling problems [25]. In this study, most of the analyzed chloride concentrations range from 1 to 20 mg/L, but during the winter they range from 25 to 200 mg/L. No axis range modifications were made here; axis ranges were kept as appropriate for the chloride concentration range. A reasonably good match between the output from the developed ANN model and the curve fitting output was obtained for the sites. To illustrate this, a prediction boundary with a 95% confidence interval is provided in Figure 6 for every site.
Curve-fitting information regarding the developed ANN model output is presented in Table 2; note that the SSE, R-square, and RMSE values are robust. All the ANN models constructed using nine nodes in the hidden layer produced the lowest SSE. Among them, the Site 3 ANN model showed the lowest SSE, and the model for Site 6 showed the highest SSE and the lowest R-square value. In summary, all the ANN models developed here for the six sites showed an acceptable range for all the model justification factors. The predicted values are reasonable for all the sites. Curve fitting assessment is a cross-validation approach that supports the accuracy of the developed ANN model. No significant difference in the $R^2$ values can be seen in Table 2.

Density Distribution of the Predicted Chloride Concentration
The density distribution was applied to show the spatial distribution of the predicted chloride concentration values (Figure 7). In a normal distribution, continuous data values tend to cluster around the mean, and the farther a value is from the mean, the less likely it is to occur. The tails are asymptotic, which implies that they approach but never meet the X-axis. In this study, density distribution curve fitting was carried out with a 95% confidence interval, meaning that approximately 95% of values fall within two standard deviations of the mean.
Of the six study sites, four are close to an agricultural field and two are close to a parking lot; the spatial distribution range of the agricultural field sites is smaller than that of the sites close to the parking lot. For Sites 5 and 6, more than 70% of predicted chloride concentrations are clustered at the mean value, and the peak is wide. On the other hand, the density distributions for Sites 1, 2, 3, and 4 cover less than 50% of the predicted chloride concentration values. Sites 5 and 6 have higher chloride concentrations because they received salt/chloride from both sides (from the road and the parking lot), and the highest spatial range was also observed at these two sites.

Cross Validation Based on Snow and Precipitation Events
Predicted data were cross-validated using temporally close snow and precipitation events. Chloride concentrations were high in storm events that followed severe snow events. Here, we analyzed three years (2017, 2018, and 2019) of snow and daily precipitation data from US climate data repositories [41] (Figure 8). We then used the ANN model to predict chloride concentrations based on the results obtained for the 2018 data (Figure 9).
Rhode Island receives approximately 94 cm of snow every year, but snowfall totals can vary significantly from town to town, even though the state is relatively small and the terrain is flat [42]. Moreover, the number of snow events varies widely from year to year. Both salt and sand are used on roads during snow events. Given the additional chloride derived from snow removal deicers, Rhode Island collects a considerable amount of chloride in its surface water and groundwater; the salts accumulate in the soils and later percolate into the groundwater. The groundwater becomes saltier every year, since chloride remains in the dissolved phase and is not removed naturally from the water [42][43][44]. Seventy percent of the salt applied to roads stays within the region's watershed [45]. In this study, chloride data for rain events occurring immediately after snow events in 2017, 2018, and 2019 showed increased chloride concentrations.
Runoff pollutant concentrations also depend on the size and duration of the precipitation event. Both longer-duration storms and high-intensity storms affect chloride concentrations, because they can dilute the pollutant. For example, on 4 January 2018, a 220 mm snow event preceded 830 mm of rainfall on 13 January 2018, in a storm of 4 hours' duration. Chloride concentrations were highest after the 13 January storm event. On the other hand, a 147 mm snow event occurred on 10 March 2017, followed by a storm of 7 hours' duration on 17 March 2017. The chloride concentrations detected after the 17 March 2017 storm event were not significant relative to those of the 2018 storm events. In the ANN model predicting chloride concentration, runoff volume and rainfall duration are treated as positive sensitivity parameters. The March 2017 storm pair, which did not lead to elevated chloride, could reveal the counteracting impacts of street density, street width, and street location. Furthermore, the accumulated chloride could be attributed primarily to the amount of salt applied, which varies from event to event.
Chloride concentrations greater than 600 mg/L (1 mg/L is approximately 1 ppm) are considered harmful for freshwater aquatic life and for the groundwater in general [46]. The developed ANN model prediction (Figure 9) and probability density output (Figure 7) indicated that aquatic habitats at Sites 4, 5, and 6 are at risk. State planners need to take necessary action regarding the implications of road salting and snow removal.

Conclusions
In this study, a new ANN model was developed to predict elevated chloride concentrations due to road salt and deicer applications in a suburban watershed, based on three years of data (2016–2019) collected at six study sites. The study sites are close to agricultural land (Sites 1, 2, 3, and 4) and an impervious parking lot (Sites 5 and 6). Seasonal variation is evident in the three years of collected data. For the ANN model, input variables were derived from the hydrometeorological database, stormwater runoff quality, key network parameters, and network topology. Preliminary ANN models were constructed using a subset of all data (42 storm events from 2016 to 2019) that covered all four seasons (15 winter events, 6 spring events, 7 summer events, and 15 fall events). A series of sensitivity analyses was performed to determine the relative significance of the input variables used in the ANN models. With the BP algorithm, the developed ANN models showed a good fit between observed and predicted data (about 91%). Model accuracy was initially optimized using a cross-validation approach, and the developed model offers an appropriate and time-efficient approach to constraining the target water quality parameter. The curve-fitting assessment resulted in a 95% confidence interval, used here as cross validation of the ANN outputs, and provided an optimum summary for every site. The ANN predictions could be more robust, and the model better trained, if the study duration were longer than three years and/or included more frequent events. This study focused on the winter season because of the amount of road salt applied to impervious surfaces, which generates high concentrations of chloride in runoff water. Chloride in non-winter data is negligible compared to the winter season; where it was detected, it could be due to fertilizer use in the agricultural zone.
According to the best-fit results, chloride in the study area is mostly affected by the rain and snow that occur during the winter season, and chloride concentration depends on storm duration, intensity, and runoff volume. We propose neural network modeling as an effective tool for water quality parameter prediction. Finally, density distribution analysis revealed the spatial distribution (the amount of data clustered around the mean) of the chloride concentration. For Sites 5 and 6, about 70% of values are clustered around the mean, due to the nearby parking lot. For the other four sites, which are close to the agricultural zone, the density distributions cover less than 50%. These findings again point to road salt as the main agent of chloride delivery to the groundwater. ANN modeling of environmental data has great potential in future work on improved prediction of chloride and other pollutant concentrations, and provides a useful tool for water resource and environmental managers.

Author Contributions: K.J. and S.M.P., conceptualization; K.J., data collection, analysis, and original draft preparation; S.M.P., writing, review, and editing; S.M.P., funding acquisition. Both authors have read and agreed to the published version of the manuscript.