# Novel Salinity Modeling Using Deep Learning for the Sacramento–San Joaquin Delta of California


## Abstract


## 1. Introduction

#### 1.1. Background

^{2}of farmlands [10]. The Delta is also used by millions for recreation and transportation [11]. Upstream riverine runoff (typically controlled by reservoirs) provides water to meet the water supply needs of the projects, and to meet Delta salinity requirements for both agriculture and wildlife. Within the Delta, consumptive uses of water include evaporation, seepage, and crop evapotranspiration. Salinity levels across the Delta depend upon the complex interactions between fresh water and seawater, which vary by location and are affected by river channel geometry, physical structures such as gates and barriers, diversions, and upstream reservoir releases.

#### 1.2. Literature Review

#### 1.3. Scope of the Current Work

## 2. Methodology

#### 2.1. Study Area and Dataset

#### 2.2. Machine Learning Models

#### 2.3. Input Preprocessing

#### 2.4. Forecasting Setup

- **Step 1:** We prepare the model inputs in the same way as discussed in Section 2.3. They consist of ${\widehat{x}}_{i}^{t},\dots ,{\widehat{x}}_{i}^{(t-7)}$ ($1\le i\le 8$) and $\overline{{\widehat{x}}_{i}^{(t-8)\to (t-18)}},\dots ,\overline{{\widehat{x}}_{i}^{(t-107)\to (t-117)}}$.
- **Step 2:** We formulate the target output values by shifting the salinity values forward by ${t}_{l}$ days, represented by ${y}_{k}^{t+{t}_{l}},k=1,2,\cdots ,23$.
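
The two steps above can be sketched in NumPy as follows. This is a minimal illustration rather than the authors' implementation: the function name `build_samples`, the array layout, and the assumption that the inputs are already normalized are all hypothetical. The window boundaries follow the indices in Step 1, giving 8 daily values and 10 averages over 11-day windows per input (18 values per input).

```python
import numpy as np

N_DAILY = 8      # daily values at t, t-1, ..., t-7
N_BLOCKS = 10    # averaged windows (t-8)->(t-18), ..., (t-107)->(t-117)
BLOCK_LEN = 11   # days in each averaged window
LOOKBACK = N_DAILY + N_BLOCKS * BLOCK_LEN - 1  # 117 days of history before t


def build_samples(x, y, lead):
    """Build forecasting samples (hypothetical helper).

    x: array (T, n_inputs) of normalized daily inputs.
    y: array (T, n_stations) of daily salinity.
    lead: forecast lead time t_l in days.
    Returns features of shape (n, 18, n_inputs) and targets (n, n_stations).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    features, targets = [], []
    for t in range(LOOKBACK, x.shape[0] - lead):
        # Step 1a: the 8 most recent daily values, ordered t, t-1, ..., t-7.
        daily = x[t - N_DAILY + 1 : t + 1][::-1]
        # Step 1b: 10 averages over consecutive 11-day windows further back.
        blocks = []
        for b in range(N_BLOCKS):
            newest = t - N_DAILY - b * BLOCK_LEN   # t-8 for the first window
            oldest = newest - BLOCK_LEN + 1        # t-18 for the first window
            blocks.append(x[oldest : newest + 1].mean(axis=0))
        features.append(np.vstack([daily, np.stack(blocks)]))
        # Step 2: the target is the salinity t_l days after t.
        targets.append(y[t + lead])
    return np.stack(features), np.stack(targets)
```

Each sample needs 117 days of history plus `lead` days of future salinity, so a record of `T` days yields `T - 117 - lead` samples.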

#### 2.5. Evaluation Metrics

#### 2.6. Implementation Details

## 3. Results

#### 3.1. Model Performance on the Daily Scale

#### 3.2. Forecasting Performance

#### 3.3. Model Performance on the Hourly Scale

## 4. Discussions

#### 4.1. Overfitting Potential versus Model Complexity

#### 4.2. Comparing with a Process-Based Model

#### 4.3. Implications

#### 4.4. Limitations and Future Work

## 5. Conclusions

## Author Contributions

## Funding

## Data Availability Statement

## Acknowledgments

## Conflicts of Interest

## Appendix A. Data Sources

## Appendix B. Summary of Datasets from Previous Studies

Study | Dataset |
---|---|

Rath et al. (2017) [2] | The input features of this study are daily freshwater flow to the estuary, daily mean coastal water level, and the daily tidal range for water years 1922–2012. Labels are salinity data from nine locations collected from sensors in the Delta. |

Chen et al. (2018) [28] | In this study, the machine learning emulator is based on data generated using DSM2 (a process-based model) including its outputs at 17 locations for 10 scenarios (two decades each). The use of 10 scenarios is intended to augment the dataset to bound the range of possible water management operations. The study period 1990–2010 was strategically selected as it contains widely varying hydrology and is a period where the DSM2 model is well-calibrated. |

Mosavi et al. (2018) [31] | In this study, the authors examined studies that used field data from rain gauges and other sensing devices, including data from remote sensing technologies. |

He et al. (2020) [10] | The dataset includes a 24-year period (WY 1991–2014) of daily observed water stage at Martinez, Martinez salinity and the net Delta outflow. |

Jayasundara et al. (2020) [26] | Input features used are Northern flow, San Joaquin River flow, exports, Delta cross channel gate operation, net Delta consumptive use, tidal energy, and San Joaquin River inflow salinity at Vernalis. Labels include multiple sets of DSM2 simulated salinity data representing a range of operational conditions. |

Qi et al. (2021) [27] | The input features in this study are the same as Jayasundara et al. (2020) and the labels are DSM2-simulated salinity data at 12 locations. |

Qi et al. (2022) [29] | The input features of this study are eight inputs representing boundary flows or operating rules for Delta flow and salinity management. DSM2 simulated daily salinity at the 28 study locations during 1990–2019 is used as the training label dataset. |

## Appendix C. Diagrams of MLP, ResNet, LSTM and GRU Networks

**Figure A1.** Diagram of the MLP network from [29]. The number in the input layer denotes input shape and those in the subsequent FC layers represent the numbers of neurons of the layers.

**Figure A2.** Diagram of the ResNet network from [29]. The number in the input layer denotes input shape and those in the FC layers represent the numbers of units/neurons of the layers. In the convolutional layers following the input layer, “f” denotes the number of convolutional filters, “k” denotes size of convolutional kernels and “s” denotes stride.

**Figure A3.** Diagram of the LSTM network from [29]. The number in the input layer denotes input shape and those in the subsequent layers represent the numbers of units/neurons of the layers.

**Figure A4.** Diagram of the GRU network from [29]. The number in the input layer denotes input shape and those in the subsequent layers represent the numbers of units/neurons of the layers.

## Appendix D. Detailed Values for Figure 7 and Figure 12

${\mathit{r}}^{2}$ | 0∼75% | 75∼95% | 95∼100% |
---|---|---|---|

RSAC064 | 0.9529 | 0.5940 | 0.6464 |

SLCBN002 | 0.9748 | 0.8334 | 0.6973 |

SLSUS012 | 0.9909 | 0.8803 | 0.8518 |

SLMZU011 | 0.9904 | 0.9111 | 0.9003 |

RSAC075 | 0.9760 | 0.8008 | 0.7332 |

SLMZU025 | 0.9777 | 0.7759 | 0.7619 |

RSAC081 | 0.9623 | 0.7956 | 0.7625 |

RSAN007 | 0.9568 | 0.8286 | 0.8028 |

ROLD059 | 0.9728 | 0.7287 | 0.6267 |

RSAN058 | 0.9809 | 0.9135 | 0.9437 |

OLD MID | 0.9817 | 0.8729 | 0.6846 |

RSAN072 | 0.9860 | 0.8975 | 0.8538 |

SLDUT007 | 0.9905 | 0.9156 | 0.9509 |

CHDMC006 | 0.9064 | 0.6368 | 0.5679 |

CHSWP003 | 0.9379 | 0.5695 | 0.4932 |

RSAN018 | 0.9434 | 0.7981 | 0.7969 |

CHVCT000 | 0.9894 | 0.9708 | 0.9405 |

ROLD024 | 0.9860 | 0.8916 | 0.8700 |

SLTRM004 | 0.9027 | 0.7200 | 0.8676 |

RSAC092 | 0.5819 | 0.8165 | 0.8177 |

RSAN037 | 0.9815 | 0.9221 | 0.8310 |

RSAN032 | 0.8751 | 0.7713 | 0.7742 |

RSMKL008 | 0.9484 | 0.7347 | 0.8541 |

Bias | 0∼75% | 75∼95% | 95∼100% |
---|---|---|---|

RSAC064 | 0.4684 | −2.8600 | −3.7659 |

SLCBN002 | −1.3600 | −0.8149 | −1.4523 |

SLSUS012 | 0.2163 | −0.1661 | −0.6855 |

SLMZU011 | −0.2336 | −0.7538 | −1.0799 |

RSAC075 | 1.5492 | −0.4411 | −0.8134 |

SLMZU025 | 2.1184 | −0.0241 | −1.2001 |

RSAC081 | 4.8211 | −0.0190 | −1.6114 |

RSAN007 | −2.6612 | −0.8044 | −2.3899 |

ROLD059 | 0.0202 | −1.0317 | −1.8503 |

RSAN058 | −0.3692 | −1.2292 | −1.3330 |

OLD MID | −0.0381 | −0.3858 | −0.9698 |

RSAN072 | 0.5610 | −0.1732 | −0.1627 |

SLDUT007 | 1.9058 | 0.8568 | −0.0001 |

CHDMC006 | 1.3642 | −1.4991 | −3.1108 |

CHSWP003 | 0.9000 | −1.1438 | −3.7689 |

RSAN018 | 3.4133 | 2.5636 | −2.7383 |

CHVCT000 | 0.5407 | 0.2339 | 0.1670 |

ROLD024 | 0.5134 | 0.3675 | 0.1518 |

SLTRM004 | −1.2958 | −0.8454 | −1.6861 |

RSAC092 | −1.6766 | 1.0572 | −3.6682 |

RSAN037 | 0.9873 | 0.2697 | −0.8718 |

RSAN032 | −1.3944 | −0.8808 | −4.2074 |

RSMKL008 | 0.6971 | −0.2226 | −0.6141 |

RSR | 0∼75% | 75∼95% | 95∼100% |
---|---|---|---|

RSAC064 | 0.2200 | 0.8297 | 0.8356 |

SLCBN002 | 0.1609 | 0.4508 | 0.6052 |

SLSUS012 | 0.0960 | 0.3660 | 0.4214 |

SLMZU011 | 0.0981 | 0.3109 | 0.3718 |

RSAC075 | 0.1582 | 0.4841 | 0.5874 |

SLMZU025 | 0.1552 | 0.5096 | 0.5669 |

RSAC081 | 0.2087 | 0.4888 | 0.5559 |

RSAN007 | 0.2201 | 0.4363 | 0.5086 |

ROLD059 | 0.1661 | 0.6113 | 0.7402 |

RSAN058 | 0.1385 | 0.3238 | 0.2714 |

OLD MID | 0.1359 | 0.3856 | 0.6345 |

RSAN072 | 0.1191 | 0.3383 | 0.4005 |

SLDUT007 | 0.1023 | 0.3063 | 0.2214 |

CHDMC006 | 0.3243 | 0.7288 | 0.8950 |

CHSWP003 | 0.2598 | 0.8280 | 0.9698 |

RSAN018 | 0.2541 | 0.5150 | 0.4731 |

CHVCT000 | 0.1037 | 0.1754 | 0.2459 |

ROLD024 | 0.1207 | 0.3509 | 0.3848 |

SLTRM004 | 0.3391 | 0.6186 | 0.3708 |

RSAC092 | 0.9923 | 0.4407 | 0.4853 |

RSAN037 | 0.1392 | 0.2900 | 0.4270 |

RSAN032 | 0.3860 | 0.5242 | 0.5271 |

RSMKL008 | 0.2320 | 0.5844 | 0.4060 |

NSE | 0∼75% | 75∼95% | 95∼100% |
---|---|---|---|

RSAC064 | 0.9516 | 0.3117 | 0.3017 |

SLCBN002 | 0.9741 | 0.7968 | 0.6337 |

SLSUS012 | 0.9908 | 0.8660 | 0.8224 |

SLMZU011 | 0.9904 | 0.9033 | 0.8618 |

RSAC075 | 0.9750 | 0.7657 | 0.6549 |

SLMZU025 | 0.9759 | 0.7403 | 0.6786 |

RSAC081 | 0.9564 | 0.7611 | 0.6910 |

RSAN007 | 0.9516 | 0.8096 | 0.7413 |

ROLD059 | 0.9724 | 0.6263 | 0.4521 |

RSAN058 | 0.9808 | 0.8952 | 0.9264 |

OLD MID | 0.9815 | 0.8513 | 0.5975 |

RSAN072 | 0.9858 | 0.8856 | 0.8396 |

SLDUT007 | 0.9895 | 0.9062 | 0.9510 |

CHDMC006 | 0.8948 | 0.4689 | 0.1990 |

CHSWP003 | 0.9325 | 0.3145 | 0.0595 |

RSAN018 | 0.9354 | 0.7348 | 0.7762 |

CHVCT000 | 0.9892 | 0.9692 | 0.9395 |

ROLD024 | 0.9854 | 0.8769 | 0.8519 |

SLTRM004 | 0.8850 | 0.6174 | 0.8625 |

RSAC092 | 0.0154 | 0.8058 | 0.7645 |

RSAN037 | 0.9806 | 0.9159 | 0.8177 |

RSAN032 | 0.8510 | 0.7252 | 0.7222 |

RSMKL008 | 0.9462 | 0.6585 | 0.8351 |

${\mathit{r}}^{2}$ | 0∼75% | 75∼95% | 95∼100% |
---|---|---|---|

RSAC064 | 0.9744 | 0.5923 | 0.7580 |

SLCBN002 | 0.9854 | 0.8367 | 0.6724 |

SLSUS012 | 0.9915 | 0.8796 | 0.8667 |

SLMZU011 | 0.9865 | 0.8907 | 0.8532 |

RSAC075 | 0.9836 | 0.8220 | 0.7386 |

SLMZU025 | 0.9834 | 0.8091 | 0.7832 |

RSAC081 | 0.9764 | 0.8359 | 0.7744 |

RSAN007 | 0.9730 | 0.8189 | 0.7607 |

ROLD059 | 0.9805 | 0.7518 | 0.7275 |

RSAN058 | 0.9797 | 0.9156 | 0.9489 |

OLD MID | 0.9792 | 0.8817 | 0.6887 |

RSAN072 | 0.9834 | 0.9060 | 0.8040 |

SLDUT007 | 0.9910 | 0.9033 | 0.9512 |

CHDMC006 | 0.9697 | 0.7823 | 0.8735 |

CHSWP003 | 0.9720 | 0.7799 | 0.8500 |

RSAN018 | 0.9801 | 0.7914 | 0.8068 |

CHVCT000 | 0.9855 | 0.9608 | 0.9206 |

ROLD024 | 0.9852 | 0.8897 | 0.8608 |

SLTRM004 | 0.9730 | 0.8626 | 0.8684 |

RSAC092 | 0.8405 | 0.8825 | 0.8410 |

RSAN037 | 0.9850 | 0.9560 | 0.8381 |

RSAN032 | 0.9575 | 0.8381 | 0.7932 |

RSMKL008 | 0.9572 | 0.8070 | 0.9379 |

Bias | 0∼75% | 75∼95% | 95∼100% |
---|---|---|---|

RSAC064 | 3.6008 | −1.2224 | −1.6747 |

SLCBN002 | 3.7842 | 1.2626 | −0.3216 |

SLSUS012 | 4.7572 | 1.8593 | 0.4100 |

SLMZU011 | 0.9901 | 0.0644 | −0.6901 |

RSAC075 | −1.6408 | −1.1783 | −1.2080 |

SLMZU025 | −2.5367 | −0.9088 | −1.4725 |

RSAC081 | −2.2831 | −1.7392 | −1.0961 |

RSAN007 | 2.0662 | −0.4297 | −0.7600 |

ROLD059 | 1.9174 | 0.1989 | −0.3832 |

RSAN058 | −0.5765 | −1.3418 | −0.4266 |

OLD MID | 0.4040 | −0.0756 | −0.7731 |

RSAN072 | −2.1535 | −1.7981 | −1.3735 |

SLDUT007 | 3.1500 | 1.0906 | −0.1996 |

CHDMC006 | 0.3502 | −1.3688 | −0.7877 |

CHSWP003 | 1.0587 | −0.6732 | −0.5765 |

RSAN018 | −1.9600 | −0.1313 | −2.5437 |

CHVCT000 | −0.7863 | −0.4636 | −1.1248 |

ROLD024 | 3.5323 | 0.7066 | 0.0888 |

SLTRM004 | 6.3652 | 0.9577 | −0.8914 |

RSAC092 | −11.2734 | −2.8386 | −1.7970 |

RSAN037 | −2.6610 | −2.3256 | −2.4599 |

RSAN032 | 0.4398 | −0.0980 | −2.0932 |

RSMKL008 | −1.2886 | −1.8978 | −1.8280 |

RSR | 0∼75% | 75∼95% | 95∼100% |
---|---|---|---|

RSAC064 | 0.1681 | 0.7889 | 0.6164 |

SLCBN002 | 0.1332 | 0.4571 | 0.5819 |

SLSUS012 | 0.1194 | 0.4326 | 0.3862 |

SLMZU011 | 0.1183 | 0.3456 | 0.3991 |

RSAC075 | 0.1287 | 0.4669 | 0.6117 |

SLMZU025 | 0.1359 | 0.4665 | 0.5618 |

RSAC081 | 0.1552 | 0.4445 | 0.5323 |

RSAN007 | 0.1716 | 0.4434 | 0.5567 |

ROLD059 | 0.1446 | 0.5735 | 0.5859 |

RSAN058 | 0.1420 | 0.3179 | 0.2296 |

OLD MID | 0.1433 | 0.3778 | 0.6007 |

RSAN072 | 0.1329 | 0.3776 | 0.5197 |

SLDUT007 | 0.1010 | 0.3287 | 0.1968 |

CHDMC006 | 0.1745 | 0.5584 | 0.3696 |

CHSWP003 | 0.1681 | 0.5464 | 0.4189 |

RSAN018 | 0.1459 | 0.4913 | 0.4763 |

CHVCT000 | 0.1201 | 0.2027 | 0.3275 |

ROLD024 | 0.1358 | 0.3567 | 0.3865 |

SLTRM004 | 0.1801 | 0.3922 | 0.3667 |

RSAC092 | 0.4723 | 0.3580 | 0.4551 |

RSAN037 | 0.1303 | 0.2589 | 0.5091 |

RSAN032 | 0.2149 | 0.4265 | 0.5118 |

RSMKL008 | 0.2116 | 0.5490 | 0.2906 |

NSE | 0∼75% | 75∼95% | 95∼100% |
---|---|---|---|

RSAC064 | 0.9718 | 0.3777 | 0.6201 |

SLCBN002 | 0.9823 | 0.7910 | 0.6613 |

SLSUS012 | 0.9857 | 0.8129 | 0.8508 |

SLMZU011 | 0.9860 | 0.8806 | 0.8407 |

RSAC075 | 0.9834 | 0.7820 | 0.6258 |

SLMZU025 | 0.9815 | 0.7824 | 0.6843 |

RSAC081 | 0.9759 | 0.8024 | 0.7166 |

RSAN007 | 0.9705 | 0.8034 | 0.6901 |

ROLD059 | 0.9791 | 0.6711 | 0.6567 |

RSAN058 | 0.9798 | 0.8989 | 0.9473 |

OLD MID | 0.9795 | 0.8572 | 0.6392 |

RSAN072 | 0.9823 | 0.8574 | 0.7299 |

SLDUT007 | 0.9898 | 0.8920 | 0.9613 |

CHDMC006 | 0.9696 | 0.6881 | 0.8634 |

CHSWP003 | 0.9718 | 0.7015 | 0.8246 |

RSAN018 | 0.9787 | 0.7586 | 0.7731 |

CHVCT000 | 0.9856 | 0.9589 | 0.8927 |

ROLD024 | 0.9816 | 0.8727 | 0.8506 |

SLTRM004 | 0.9676 | 0.8461 | 0.8655 |

RSAC092 | 0.7769 | 0.8718 | 0.7929 |

RSAN037 | 0.9830 | 0.9330 | 0.7408 |

RSAN032 | 0.9538 | 0.8181 | 0.7381 |

RSMKL008 | 0.9552 | 0.6986 | 0.9155 |
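
The range-wise breakdown reported in the tables above can be illustrated with a short sketch. It assumes that the three ranges are defined by percentiles of the observed salinity at each station and that each metric is evaluated only on the samples falling in that range; both points are assumptions, as is the helper name `metrics_by_range`. Percent bias follows the formula in the evaluation-metrics table, while NSE is written in its standard Nash–Sutcliffe form, which is not spelled out in this section.

```python
import numpy as np


def percent_bias(obs, sim):
    # Sum of (simulated - observed) relative to the sum of observed, in percent.
    return 100.0 * np.sum(sim - obs) / np.sum(obs)


def nse(obs, sim):
    # Standard Nash-Sutcliffe efficiency (assumed definition).
    return 1.0 - np.sum((obs - sim) ** 2) / np.sum((obs - obs.mean()) ** 2)


def metrics_by_range(obs, sim, edges=(0, 75, 95, 100)):
    """Evaluate metrics within percentile ranges of the observations."""
    obs = np.asarray(obs, dtype=float)
    sim = np.asarray(sim, dtype=float)
    cuts = np.percentile(obs, edges)
    results = {}
    for i in range(len(edges) - 1):
        lo, hi = cuts[i], cuts[i + 1]
        is_last = i == len(edges) - 2
        # Upper bound is inclusive only for the top range.
        upper = (obs <= hi) if is_last else (obs < hi)
        mask = (obs >= lo) & upper
        label = f"{edges[i]}~{edges[i + 1]}%"
        results[label] = {
            "bias": percent_bias(obs[mask], sim[mask]),
            "nse": nse(obs[mask], sim[mask]),
        }
    return results
```

Because the variance of the observations within a narrow range is smaller than over the full record, NSE computed this way tends to be lower in the upper ranges even when absolute errors are similar.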

## Appendix E. Numbers of Parameters in Simplified or Complicated Architectures

Number of Units in the Recurrent Layer | LSTM | GRU |
---|---|---|

322 | 627,279 | 486,243 |

276 | 486,887 | 378,695 |

230 | 363,423 | 283,843 |

184 (Baseline) | 256,887 | 201,687 |

138 | 167,279 | 132,227 |

92 | 94,599 | 75,463 |

46 | 38,847 | 31,395 |

23 | 17,319 | 14,122 |

**Table A11.** Numbers of parameters of simplified or complicated MLP, ResNet, Res-LSTM and Res-GRU models.

Numbers of Neurons in Hidden Layers | MLP | ResNet | Res-LSTM | Res-GRU |
---|---|---|---|---|

368,184 | 125,511 | 768,119 | 256,292 | 237,340 |

368,92 | 89,447 | 732,055 | 220,228 | 201,276 |

184,184 | 64,975 | 386,503 | 140,004 | 129,516 |

184,138 | 55,407 | 376,935 | 130,436 | 119,948 |

184,92 (Baseline) | 45,839 | 367,367 | 120,868 | 110,380 |

184,46 | 36,271 | 357,799 | 111,300 | 100,812 |

138,46 | 27,485 | 268,743 | 88,576 | 80,204 |

92,46 | 18,699 | 179,687 | 65,852 | 59,596 |

46,46 | 9913 | 90,631 | 43,128 | 38,988 |

46,23 | 8303 | 89,021 | 41,518 | 37,378 |
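
The MLP column can be cross-checked with a short parameter count. The sketch below assumes a plain fully connected network with bias terms, 144 inputs (8 input features times 18 values each, as described in Section 2.4) and 23 outputs (one per study location); the function name is hypothetical.

```python
def mlp_param_count(n_inputs, hidden, n_outputs):
    """Count weights and biases of a fully connected network."""
    sizes = [n_inputs, *hidden, n_outputs]
    # Each dense layer has (fan_in * fan_out) weights plus fan_out biases.
    return sum(a * b + b for a, b in zip(sizes[:-1], sizes[1:]))


# Two rows of the MLP column, under the assumptions stated above.
print(mlp_param_count(144, (46, 23), 23))    # 8303
print(mlp_param_count(144, (184, 46), 23))   # 36271
```

Under these assumptions the count grows linearly in the size of the second hidden layer when the first is held fixed, which matches the constant step between consecutive rows that share a first-layer size of 184.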

## Appendix F. Preliminary Data Distortion and Cross-Validation Results

**Figure A5.** Comparison of six models on observed data without (“w/o”) or with (“w/”) data distortion.

**Figure A6.** Comparison of the 5-fold cross-validation on the MLP architecture using observed data. “SP” stands for “split”.

**Figure A7.** Comparison of the 5-fold cross-validation on the Res-LSTM architecture using observed data. “SP” stands for “split”.

**Figure A8.** Comparison of the 5-fold cross-validation on the Res-GRU architecture using observed data. “SP” stands for “split”.

## Appendix G. Time Series Plots of Observed Salinity Levels Versus Model Simulations

**Figure A9.** Time series plots of observed salinity levels versus Res-LSTM simulations and DSM2 simulations of the 23 stations. Detailed values of four evaluation metrics of Res-LSTM and DSM2 are marked for each station.

## References

- Alber, M. A conceptual model of estuarine freshwater inflow management. Estuaries **2002**, 25, 1246–1261.
- Rath, J.S.; Hutton, P.H.; Chen, L.; Roy, S.B. A hybrid empirical-Bayesian artificial neural network model of salinity in the San Francisco Bay-Delta estuary. Environ. Model. Softw. **2017**, 93, 193–208.
- Xu, J.; Long, W.; Wiggert, J.D.; Lanerolle, L.W.; Brown, C.W.; Murtugudde, R.; Hood, R.R. Climate forcing and salinity variability in Chesapeake Bay, USA. Estuaries Coasts **2012**, 35, 237–261.
- Tran Anh, D.; Hoang, L.P.; Bui, M.D.; Rutschmann, P. Simulating future flows and salinity intrusion using combined one-and two-dimensional hydrodynamic modelling—The case of Hau River, Vietnamese Mekong delta. Water **2018**, 10, 897.
- Mulamba, T.; Bacopoulos, P.; Kubatko, E.J.; Pinto, G.F. Sea-level rise impacts on longitudinal salinity for a low-gradient estuarine system. Clim. Chang. **2019**, 152, 533–550.
- MDBMC. The Salinity Audit of the Murray-Darling Basin, A 100-Year Perspective; Murray-Darling Basin Commission: Canberra, Australia, 1999. Available online: https://www.mdba.gov.au/sites/default/files/archived/mdbc-salinity-reports/2072_Salinity_audit_of_MDB_100_year_perspective.pdf (accessed on 1 July 2022).
- MDBMC. Basin Salinity Management 2030 (BSM2030), MDBA Publication No 21/15; Murray–Darling Basin Ministerial Council: Canberra, Australia, 2015.
- Myers, N.; Mittermeier, R.A.; Mittermeier, C.G.; Da Fonseca, G.A.; Kent, J. Biodiversity hotspots for conservation priorities. Nature **2000**, 403, 853–858.
- Moyle, P.B.; Brown, L.R.; Durand, J.R.; Hobbs, J.A. Delta smelt: Life history and decline of a once-abundant species in the San Francisco Estuary. San Fr. Estuary Watershed Sci. **2016**, 14.
- He, M.; Zhong, L.; Sandhu, P.; Zhou, Y. Emulation of a process-based salinity generator for the sacramento–san joaquin delta of california via deep learning. Water **2020**, 12, 2088.
- Healey, M.; Dettinger, M.; Norgaard, R. Perspectives on Bay–Delta Science and Policy. San Fr. Estuary Watershed Sci. **2016**, 14.
- CDWR. Minimum Delta Outflow Program. In Methodology for Flow and Salinity Estimates in the Sacramento-San Joaquin Delta and Suisun Marsh: 11th Annual Progress Report; CDWR: Sacramento, CA, USA, 1990.
- CDWR. Calibration and verification of DWRDSM. In Methodology for Flow and Salinity Estimates in the Sacramento-San Joaquin Delta and Suisun Marsh: 12th Annual Progress Report; CDWR: Sacramento, CA, USA, 1991.
- Denton, R.A. Accounting for Antecedent Conditions in Seawater Intrusion Modeling—Applications for the San Francisco Bay-Delta. In Hydraulic Engineering; ASCE: Reston, FL, USA, 1993; pp. 448–453.
- Cheng, R.T.; Casulli, V.; Gartner, J.W. Tidal, residual, intertidal mudflat (TRIM) model and its applications to San Francisco Bay, California. Estuarine, Coast. Shelf Sci. **1993**, 36, 235–280.
- DeGeorge, J.F. A Multi-Dimensional Finite Element Transport Model Utilizing a Characteristic-Galerkin Algorithm; University of California: Davis, CA, USA, 1996.
- Hutton, P.H.; Rath, J.S.; Chen, L.; Ungs, M.J.; Roy, S.B. Nine decades of salinity observations in the San Francisco Bay and Delta: Modeling and trend evaluations. J. Water Resour. Plan. Manag. **2016**, 142, 04015069.
- MacWilliams, M.; Bever, A.J.; Foresman, E. 3-D simulations of the San Francisco Estuary with subgrid bathymetry to explore long-term trends in salinity distribution and fish abundance. San Fr. Estuary Watershed Sci. **2016**, 14.
- MacWilliams, M.L.; Ateljevich, E.S.; Monismith, S.G.; Enright, C. An overview of multi-dimensional models of the Sacramento–San Joaquin Delta. San Fr. Estuary Watershed Sci. **2016**, 14.
- Chao, Y.; Farrara, J.D.; Zhang, H.; Zhang, Y.J.; Ateljevich, E.; Chai, F.; Davis, C.O.; Dugdale, R.; Wilkerson, F. Development, implementation, and validation of a modeling system for the San Francisco Bay and Estuary. Estuarine, Coast. Shelf Sci. **2017**, 194, 40–56.
- Sandhu, N.; Finch, R. Application of artificial neural networks to the Sacramento-San Joaquin Delta. In Estuarine and Coastal Modeling; ASCE: Reston, FL, USA, 1995; pp. 490–504.
- CDWR. Modeling Flow-Salinity Relationships in the Sacramento-San Joaquin Delta Using Artificial Neural Networks; Technical Information Record OSP-99-1; CDWR: Sacramento, CA, USA, 1999.
- Wilbur, R.; Munevar, A. Integration of CALSIM and Artificial Neural Networks Models for Sacramento-San Joaquin Delta Flow-Salinity Relationships. In Methodology for Flow and Salinity Estimates in the Sacramento-San Joaquin Delta and Suisun Marsh: 22nd Annual Progress Report; CDWR: Sacramento, CA, USA, 2001.
- Mierzwa, M. CALSIM versus DSM2 ANN and G-model Comparisons. In Methodology for Flow and Salinity Estimates in the Sacramento-San Joaquin Delta and Suisun Marsh: 23rd Annual Progress Report; CDWR: Sacramento, CA, USA, 2002.
- Seneviratne, S.; Wu, S. Enhanced Development of Flow-Salinity Relationships in the Delta Using Artificial Neural Networks: Incorporating Tidal Influence. In Methodology for Flow and Salinity Estimates in the Sacramento-San Joaquin Delta and Suisun Marsh: 28th Annual Progress Report; CDWR: Sacramento, CA, USA, 2007.
- Jayasundara, N.C.; Seneviratne, S.A.; Reyes, E.; Chung, F.I. Artificial neural network for Sacramento–San Joaquin Delta flow–salinity relationship for CalSim 3.0. J. Water Resour. Plan. Manag. **2020**, 146, 04020015.
- Qi, S.; Bai, Z.; Ding, Z.; Jayasundara, N.; He, M.; Sandhu, P.; Seneviratne, S.; Kadir, T. Enhanced Artificial Neural Networks for Salinity Estimation and Forecasting in the Sacramento-San Joaquin Delta of California. J. Water Resour. Plan. Manag. **2021**, 147, 04021069.
- Chen, L.; Roy, S.B.; Hutton, P.H. Emulation of a process-based estuarine hydrodynamic model. Hydrol. Sci. J. **2018**, 63, 783–802.
- Qi, S.; He, M.; Bai, Z.; Ding, Z.; Sandhu, P.; Zhou, Y.; Namadi, P.; Tom, B.; Hoang, R.; Anderson, J. Multi-Location Emulation of a Process-Based Salinity Model Using Machine Learning. Water **2022**, 14, 2030.
- Zounemat-Kermani, M.; Matta, E.; Cominola, A.; Xia, X.; Zhang, Q.; Liang, Q.; Hinkelmann, R. Neurocomputing in surface water hydrology and hydraulics: A review of two decades retrospective, current status and future prospects. J. Hydrol. **2020**, 588, 125085.
- Mosavi, A.; Ozturk, P.; Chau, K.W. Flood prediction using machine learning models: Literature review. Water **2018**, 10, 1536.
- Yaseen, Z.M.; Sulaiman, S.O.; Deo, R.C.; Chau, K.W. An enhanced extreme learning machine model for river flow forecasting: State-of-the-art, practical applications in water resource engineering area and future research direction. J. Hydrol. **2019**, 569, 387–408.
- Tongal, H.; Booij, M.J. Simulation and forecasting of streamflows using machine learning models coupled with base flow separation. J. Hydrol. **2018**, 564, 266–282.
- Islam, A.R.M.T.; Talukdar, S.; Mahato, S.; Kundu, S.; Eibek, K.U.; Pham, Q.B.; Kuriqi, A.; Linh, N.T.T. Flood susceptibility modelling using advanced ensemble machine learning models. Geosci. Front. **2021**, 12, 101075.
- Shahabi, H.; Shirzadi, A.; Ghaderi, K.; Omidvar, E.; Al-Ansari, N.; Clague, J.J.; Geertsema, M.; Khosravi, K.; Amini, A.; Bahrami, S.; et al. Flood detection and susceptibility mapping using sentinel-1 remote sensing data and a machine learning approach: Hybrid intelligence of bagging ensemble based on k-nearest neighbor classifier. Remote Sens. **2020**, 12, 266.
- Costache, R.; Hong, H.; Pham, Q.B. Comparative assessment of the flash-flood potential within small mountain catchments using bivariate statistics and their novel hybrid integration with machine learning models. Sci. Total. Environ. **2020**, 711, 134514.
- Tang, Y.; Zang, C.; Wei, Y.; Jiang, M. Data-driven modeling of groundwater level with least-square support vector machine and spatial–temporal analysis. Geotech. Geol. Eng. **2019**, 37, 1661–1670.
- El Bilali, A.; Taleb, A.; Brouziyne, Y. Groundwater quality forecasting using machine learning algorithms for irrigation purposes. Agric. Water Manag. **2021**, 245, 106625.
- Yin, J.; Medellín-Azuara, J.; Escriva-Bou, A.; Liu, Z. Bayesian machine learning ensemble approach to quantify model uncertainty in predicting groundwater storage change. Sci. Total. Environ. **2021**, 769, 144715.
- Kumar, D.; Pandey, A.; Sharma, N.; Flügel, W.A. Daily suspended sediment simulation using machine learning approach. Catena **2016**, 138, 77–90.
- Choubin, B.; Darabi, H.; Rahmati, O.; Sajedi-Hosseini, F.; Kløve, B. River suspended sediment modelling using the CART model: A comparative study of machine learning techniques. Sci. Total. Environ. **2018**, 615, 272–281.
- Melesse, A.M.; Khosravi, K.; Tiefenbacher, J.P.; Heddam, S.; Kim, S.; Mosavi, A.; Pham, B.T. River water salinity prediction using hybrid machine learning models. Water **2020**, 12, 2951.
- Nauman, T.W.; Ely, C.P.; Miller, M.P.; Duniway, M.C. Salinity yield modeling of the Upper Colorado River Basin using 30-m resolution soil maps and random forests. Water Resour. Res. **2019**, 55, 4954–4973.
- Derot, J.; Yajima, H.; Jacquet, S. Advances in forecasting harmful algal blooms using machine learning models: A case study with Planktothrix rubescens in Lake Geneva. Harmful Algae **2020**, 99, 101906.
- Alizadeh, M.J.; Kavianpour, M.R.; Danesh, M.; Adolf, J.; Shamshirband, S.; Chau, K.W. Effect of river flow on the quality of estuarine and coastal waters using machine learning models. Eng. Appl. Comput. Fluid Mech. **2018**, 12, 810–823.
- Shamshirband, S.; Jafari Nodoushan, E.; Adolf, J.E.; Abdul Manaf, A.; Mosavi, A.; Chau, K.w. Ensemble models with uncertainty analysis for multi-day ahead forecasting of chlorophyll a concentration in coastal waters. Eng. Appl. Comput. Fluid Mech. **2019**, 13, 91–101.
- Jiang, Y.; Zhang, T.; Gou, Y.; He, L.; Bai, H.; Hu, C. High-resolution temperature and salinity model analysis using support vector regression. J. Ambient. Intell. Humaniz. Comput. **2018**, 1–9.
- Thai-Nghe, N.; Thanh-Hai, N.; Chi Ngon, N. Deep learning approach for forecasting water quality in IoT systems. Int. J. Adv. Comput. Sci. Appl. **2020**, 11, 686–693.
- Granata, F.; Papirio, S.; Esposito, G.; Gargano, R.; De Marinis, G. Machine learning algorithms for the forecasting of wastewater quality indicators. Water **2017**, 9, 105.
- Barzegar, R.; Asghari Moghaddam, A.; Adamowski, J.; Ozga-Zielinski, B. Multi-step water quality forecasting using a boosting ensemble multi-wavelet extreme learning machine model. Stoch. Environ. Res. Risk Assess. **2018**, 32, 799–813.
- Ahmed, A.N.; Othman, F.B.; Afan, H.A.; Ibrahim, R.K.; Fai, C.M.; Hossain, M.S.; Ehteram, M.; Elshafie, A. Machine learning methods for better water quality prediction. J. Hydrol. **2019**, 578, 124084.
- Ghalambor, C.K.; Gross, E.S.; Grosholtz, E.D.; Jeffries, K.M.; Largier, J.K.; McCormick, S.D.; Sommer, T.; Velotta, J.; Whitehead, A. Ecological Effects of Climate-Driven Salinity Variation in the San Francisco Estuary: Can We Anticipate and Manage the Coming Changes? San Fr. Estuary Watershed Sci. **2021**, 19.
- Lund, J.R. California’s agricultural and urban water supply reliability and the Sacramento–San Joaquin Delta. San Fr. Estuary Watershed Sci. **2016**, 14.
- Namadi, P.; He, M.; Sandhu, P. Salinity-constituent conversion in South Sacramento-San Joaquin Delta of California via machine learning. Earth Sci. Informatics **2022**, 15, 1–16.
- He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778.
- Kingma, D.P.; Ba, J. Adam: A method for stochastic optimization. arXiv **2014**, arXiv:1412.6980.
- Panda, S.S.; Amatya, D.M.; Muwamba, A.; Chescheir, G. Estimation of evapotranspiration and its parameters for pine, switchgrass, and intercropping with remotely-sensed images based geospatial modeling. Environ. Model. Softw. **2019**, 121, 104487.
- Dietterich, T. Overfitting and undercomputing in machine learning. ACM Comput. Surv. (CSUR) **1995**, 27, 326–327.
- Ying, X. An overview of overfitting and its solutions. In Journal of Physics; Conference Series; IOP Publishing: Bristol, UK, 2019; Volume 1168, p. 022022.
- Adadi, A.; Berrada, M. Peeking inside the black-box: A survey on explainable artificial intelligence (XAI). IEEE Access **2018**, 6, 52138–52160.
- Ancona, M.; Ceolini, E.; Öztireli, C.; Gross, M. Towards better understanding of gradient-based attribution methods for deep neural networks. arXiv **2017**, arXiv:1711.06104.
- Shrikumar, A.; Greenside, P.; Kundaje, A. Learning important features through propagating activation differences. In Proceedings of the International Conference on Machine Learning, PMLR, Sydney, Australia, 6–11 August 2017; pp. 3145–3153.
- Dettinger, M.; Anderson, J.; Anderson, M.; Brown, L.R.; Cayan, D.; Maurer, E. Climate change and the Delta. San Fr. Estuary Watershed Sci. **2016**, 14.
- Wilson, T.S.; Sleeter, B.M.; Cameron, D.R. Future land-use related water demand in California. Environ. Res. Lett. **2016**, 11, 054018.
- Kimmerer, W.; Wilkerson, F.; Downing, B.; Dugdale, R.; Gross, E.S.; Kayfetz, K.; Khanna, S.; Parker, A.E.; Thompson, J. Effects of drought and the emergency drought barrier on the ecosystem of the California Delta. San Fr. Estuary Watershed Sci. **2019**, 17.
- Raissi, M.; Perdikaris, P.; Karniadakis, G.E. Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. J. Comput. Phys. **2019**, 378, 686–707.
- Lu, L.; Meng, X.; Mao, Z.; Karniadakis, G.E. DeepXDE: A deep learning library for solving differential equations. Siam Rev. **2021**, 63, 208–228.
- Gay, P.S.; O’Donnell, J. A simple advection-dispersion model for the salt distribution in linearly tapered estuaries. J. Geophys. Res. Ocean. **2007**, 112.

**Figure 1.** Schematic showing the Sacramento–San Joaquin Delta (Delta), the 23 study locations, and the DSM2 model domain.

**Figure 2.** Boxplot of salinity observations (represented by electrical conductivity) at study locations, sorted by their medians. Numbers next to each station’s box represent the ratios of available observations in the dataset during the 20-year study period. Each box represents the interquartile range from the 25th to the 75th percentiles. The line inside each box represents the median value. The open circles represent outliers.

**Figure 3.** Diagram of the proposed Res-LSTM network. The number in the input layer denotes input shape and those in the subsequent layers represent the numbers of units/neurons of those layers.

**Figure 4.** Diagram of the proposed Res-GRU network. The number in the input layer denotes input shape and those in the subsequent layers represent the numbers of units/neurons of those layers.

**Figure 6.** Exceedance probability plot and time series plot of Res-LSTM simulated versus observed salinity at daily time step.

**Figure 7.** Heatmap showing Res-LSTM performance at different salinity ranges on the daily time step: low–middle range (lowest 75%), high range (75th to 95th percentile), and extreme high range (highest 5%) at the study locations.

**Figure 11.** Exceedance probability plot and time series plot of Res-LSTM simulated versus observed salinity at daily time step with daily inputs.

**Figure 12.** Heatmap showing Res-LSTM performance at different salinity ranges on the hourly time step: low–middle range (lowest 75%), high range (75th to 95th percentile), and extreme high range (highest 5%) at the study locations.

**Figure 13.** Model performance versus total numbers of parameters of the proposed models and their variants with varying structural complexities.

**Figure 14.** Time series plots of observed salinity levels versus Res-LSTM simulations and DSM2 simulations of six key stations. Detailed values of four evaluation metrics of Res-LSTM and DSM2 are marked for each station.

Index | Input Feature Name | Definition |
---|---|---|
1 | Northern Flow | Sum of Sacramento River, Yolo Bypass, Mokelumne River, Cosumnes River, and Calaveras River flows. |
2 | San Joaquin River Flow | San Joaquin River flow at Vernalis. |
3 | Pumping | Sum of pumping from Banks Pumping Plant, Jones Pumping Plant, and Contra Costa Water District at Rock Slough, Old River, and Victoria Canal. |
4 | Delta Cross-Channel Gate Operation | Delta Cross-Channel gate openings. |
5 | Consumptive Use | Net Delta consumptive use estimated by the Delta Channel Depletion (DCD) and Suisun Marsh Channel Depletion (SMCD) models. |
6 | Martinez Tidal Energy | Tidal energy at Martinez, calculated as the daily maximum minus the daily minimum astronomical tide at Martinez. |
7 | San Joaquin River EC | Electrical conductivity measured at San Joaquin River at Vernalis. |
8 | Sacramento River EC | Electrical conductivity measured at Sacramento River at Greens Landing. |
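
Two of the inputs in the table above are derived quantities rather than direct measurements. The following Python sketch illustrates how they could be computed with NumPy. It is not the authors' code: the function names, the argument names, and the assumption that the astronomical tide is available at a fixed number of samples per day (hourly by default) are hypothetical and introduced only for illustration.

```python
import numpy as np


def northern_flow(sacramento, yolo_bypass, mokelumne, cosumnes, calaveras):
    """Feature 1: element-wise sum of the five northern inflow series."""
    series = [np.asarray(s, dtype=float)
              for s in (sacramento, yolo_bypass, mokelumne, cosumnes, calaveras)]
    return np.sum(series, axis=0)


def daily_tidal_energy(tide, samples_per_day=24):
    """Feature 6: daily maximum minus daily minimum of the astronomical tide.

    Assumes (hypothetically) a regularly sampled tide series; any trailing
    partial day is dropped.
    """
    tide = np.asarray(tide, dtype=float)
    n_days = tide.size // samples_per_day
    daily = tide[: n_days * samples_per_day].reshape(n_days, samples_per_day)
    return daily.max(axis=1) - daily.min(axis=1)
```

For example, `daily_tidal_energy(tide_hourly)` returns one value per complete day, which can then be preprocessed together with the other seven daily inputs.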

Architecture | MLP | ResNet | LSTM | GRU | Res-LSTM | Res-GRU |
---|---|---|---|---|---|---|
Number of parameters | 36,271 | 357,799 | 227,263 | 201,687 | 111,300 | 100,812 |

Name | Definition | Formula |
---|---|---|
MSE | Mean Squared Error | $\mathrm{MSE}=\sum_{t=t_{l}+1}^{T}\left(S_{Observed}^{t}-S_{ANN}^{t}\right)^{2}$ |
$r^{2}$ | Squared Correlation Coefficient | $r^{2}=\left(\frac{\sum_{t=t_{l}+1}^{T}\left\lvert\left(S_{Observed}^{t}-\overline{S_{Observed}}\right)\times\left(S_{ANN}^{t}-\overline{S_{ANN}}\right)\right\rvert}{T\times\sigma_{Observed}\times\sigma_{ANN}}\right)^{2}$ |
Bias | Percent Bias | $\mathrm{Bias}=\frac{\sum_{t=t_{l}+1}^{T}\left(S_{ANN}^{t}-S_{Observed}^{t}\right)}{\sum_{t=t_{l}+1}^{T}S_{Observed}^{t}}\times 100\%$ |
RSR | RMSE-observations standard deviation ratio | $\mathrm{RSR}=\frac{\sqrt{\sum_{t=t_{l}+1}^{T}\left(S_{Observed}^{t}-S_{ANN}^{t}\right)^{2}}}{\sqrt{\sum_{t=t_{l}+1}^{T}\left(S_{Observed}^{t}-\overline{S_{Observed}}\right)^{2}}}$ |
NSE | Nash–Sutcliffe Efficiency coefficient | $\mathrm{NSE}=1-\frac{\sum_{t=t_{l}+1}^{T}\left(S_{Observed}^{t}-S_{ANN}^{t}\right)^{2}}{\sum_{t=t_{l}+1}^{T}\left(S_{Observed}^{t}-\overline{S_{Observed}}\right)^{2}}$ |
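
As an illustration of the metrics in the table above, the following Python sketch computes all five for one station from paired observed and simulated series. It is a minimal sketch rather than the authors' implementation. The function name `evaluate_salinity` is hypothetical, and two choices are assumptions: MSE is normalized by the number of valid samples (as its name suggests), and $r^{2}$ is taken as the square of the conventional Pearson correlation coefficient from `np.corrcoef`. Dropping time steps with missing values is also an assumption.

```python
import numpy as np


def evaluate_salinity(observed, simulated):
    """Return MSE, r^2, percent bias, RSR, and NSE for one station."""
    obs = np.asarray(observed, dtype=float)
    sim = np.asarray(simulated, dtype=float)

    # Assumption: time steps with a missing observation or simulation are skipped.
    valid = ~(np.isnan(obs) | np.isnan(sim))
    obs, sim = obs[valid], sim[valid]

    sse = np.sum((obs - sim) ** 2)          # sum of squared errors
    sst = np.sum((obs - obs.mean()) ** 2)   # spread of observations about their mean

    return {
        "mse": sse / obs.size,                              # assumption: mean over samples
        "r2": np.corrcoef(obs, sim)[0, 1] ** 2,             # assumption: Pearson r, squared
        "bias": np.sum(sim - obs) / np.sum(obs) * 100.0,    # percent bias
        "rsr": np.sqrt(sse) / np.sqrt(sst),
        "nse": 1.0 - sse / sst,
    }
```

With these definitions, NSE and RSR are linked by $\mathrm{NSE}=1-\mathrm{RSR}^{2}$, so the two metrics rank models identically and differ only in scale.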

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Qi, S.; He, M.; Bai, Z.; Ding, Z.; Sandhu, P.; Chung, F.; Namadi, P.; Zhou, Y.; Hoang, R.; Tom, B.; Anderson, J.; Roh, D.M. Novel Salinity Modeling Using Deep Learning for the Sacramento–San Joaquin Delta of California. *Water* **2022**, *14*, 3628.
https://doi.org/10.3390/w14223628
