Hydrological Responses to Various Land Use, Soil and Weather Inputs in Northern Lake Erie Basin in Canada

In the last decade, Lake Erie, one of the Great Lakes bordering Canada and the USA, has been under serious threat due to increased phosphorus levels originating from agricultural fields. Large-scale watersheds contributing to Lake Erie from the USA side are being simulated using hydrological and water quality (H/WQ) models such as the Soil and Water Assessment Tool (SWAT), and the model results are being used by policy and decision makers to implement better management decisions to address emerging phosphorus issues. On the Canadian side, modeling applications are limited to either small watersheds or one major watershed contributing to Lake Erie. To the best of our knowledge, no efforts have been made to model all of the watersheds contributing to Lake Erie from Canada. This study applied the SWAT model to the Northern Lake Erie Basin (NLEB; the entire contributing basin to Lake Erie). Various provincial, national and global inputs of weather, land use and soil at various resolutions were assessed to evaluate the effects of input data types on the simulation of hydrological processes and streamflows. Twelve scenarios were developed using the input combinations, and selected scenarios were evaluated at selected locations along the Grand and Thames Rivers using model performance statistics and graphical comparisons of time variable plots and flow duration curves (FDCs). In addition, various hydrological components such as surface runoff, water yield, and evapotranspiration were also evaluated. Global-level coarse resolution weather and soil data did not perform better than fine resolution national data. Interestingly, in the case of land use, the global and national/provincial land use results were close; however, the fine resolution provincial data performed slightly better.
This study found that interpolated weather data from Environment Canada climate station observations performed slightly better compared to the measured data and therefore could be a good choice to use for large-scale H/WQ modeling studies.


Introduction
Water 2018, 10, 222
The Great Lakes, bordering Canada and the USA, hold one-fifth of all the fresh water on Earth and are an unparalleled treasure for Canada, in particular the province of Ontario [1]. Most Ontarians live in the watersheds of the Great Lakes and benefit from the lakes in numerous ways. In the last few decades, the health of Lake Erie (one of the Great Lakes) has been under serious threat due to increased levels of harmful pollutants and rising levels of phosphorus. During the 1960s, high phosphorus levels caused blue-green algae (cyanobacteria) growth in Lake Erie, making it a major environmental issue. The governments of Canada and the USA signed the first Great Lakes Water Quality Agreement (GLWQA) in 1972, which helped reduce algae levels in Lake Erie. Algae levels have risen again in recent decades. In 2011, the algae levels in Lake Erie were 50 times above the World Health Organization limit for safe physical contact and 1200 times higher than the limit for safe drinking water, according to the Environmental Protection Agency (EPA). The summer of 2015 produced the largest algae bloom in Lake Erie in 100 years. To reduce the algae problems, Canada and the USA amended the GLWQA and announced a goal to reduce phosphorus levels by 40 percent by 2025, with an interim reduction goal of 20 percent by 2020. Several studies [2][3][4][5][6] have used hydrological and water quality (H/WQ) models and reported that the main source of elevated phosphorus levels in Lake Erie is polluted runoff from agricultural areas within the watersheds contributing to Lake Erie. Therefore, there is an urgent need to develop management decisions within the watersheds that contribute to Lake Erie to reduce algae blooms.
H/WQ models, along with monitoring data, are being used in various studies to determine the impacts of land management, land use, climate, and agricultural conservation practices on water resources, ecology, and water-related ecosystem services. They are also expected to provide better insight into proposed strategies for the reduction of phosphorus loads. One such H/WQ model that is widely recognized and used extensively around the world is the Soil and Water Assessment Tool (SWAT) [7,8]. Watersheds contributing to Lake Erie from the USA side have been extensively simulated using the SWAT model. For example, Bosch et al. [9] simulated six watersheds contributing to Lake Erie using the SWAT model and evaluated streamflow and sediment, nitrogen, and phosphorus concentrations in various watersheds. Several other studies [10][11][12][13][14] have used the SWAT model in various large-scale watersheds (e.g., Western Lake Erie basin; Maumee basin) that contribute to Lake Erie to investigate the impacts of land use changes, agricultural practices, and climate changes, to determine the effectiveness of best management practices (BMPs) at preventing non-point source pollution, and to provide valuable input to policy and decision makers. The SWAT model is also being used in several small nested watersheds to understand various hydrological and nutrient fate and transport mechanisms. On the Ontarian/Canadian side, SWAT model application has mostly been limited to small to medium size watersheds [15][16][17]. However, Liu et al. [18] simulated the Grand River basin, which is a major contributor to Lake Erie, using the SWAT model and reported that nutrient management and wetland restoration are the most effective BMPs to reduce phosphorus in Lake Erie. The Thames River basin, which is another major contributor to Lake Erie, is currently being modeled using SWAT, but no reported literature is available at this time. The Northern Lake Erie Basin (NLEB), which includes all the contributing watersheds to Lake Erie from the Canada/Ontario side, needs to be simulated to support a wide variety of watershed planning and regional-scale economic and policy analysis so that better policy decisions related to watershed management can be made. To the best of our knowledge, no efforts have been made to model the entire NLEB in Canada.
The SWAT model needs input data such as topography (digital elevation models (DEMs)), soil, land use, land management and weather to simulate real-world conditions within a simulated watershed [19]. The input data can be readily downloaded from various data sources [21], including global, national, provincial, and local sources. Romanowicz et al. [20] reported that SWAT model output is sensitive to the input data provided to the model. Several studies in the literature have compared data inputs and their effects on various SWAT-produced outputs. For example, Wang and Melesse [22] and Peschel et al. [23] evaluated the effects of two soil layers that are widely used in the USA (the Soil Survey Geographic (SSURGO) database [24] and the State Soil Geographic (STATSGO) database [25]) on SWAT streamflow predictions and found that the SSURGO soil layer predicted streamflow better than the STATSGO soil layer. Heathman et al. [26] evaluated the impact of different combinations of soil data (SSURGO and STATSGO) and land use data (GAP and NASS) on streamflow prediction and found that the two land use layers studied resulted in greater differences than the two soil layers studied. Daggupati et al. [21] compared the impacts of land use (field-reconnaissance land use layer vs. National Agricultural Statistics Service (NASS) vs. National Land Cover Dataset (NLCD) data), topography (10 m vs. 30 m DEM) and soils (SSURGO vs. STATSGO) on sediment yields. They found that the sediment yield outputs were highly sensitive to the land use data source and less sensitive to the topographic and soil data sources. Radcliffe and Mukundan [27] compared the impacts of PRISM (parameter-elevation relationships on independent slopes model) and climate forecast system reanalysis (CFSR) weather data on streamflow and found that PRISM provided a better estimate than CFSR data in US watersheds where rain gauge data were not available. Gao et al. [28] compared next generation radar (NEXRAD), PRISM and National Climatic Data Center (NCDC) data on hydrological processes using SWAT and found that the PRISM dataset performed better than NCDC and NEXRAD. Most of the studies reported above are from the USA, and few studies comparing the effects of input data sources on modeling outputs have been reported in Canada.
The overall goal of this study is to apply the SWAT model to the entire contributing basin of Lake Erie from the Canadian/Ontarian side. The specific objective is to compare and evaluate the impacts of various data inputs (land use, soil and weather) available from various sources (e.g., global, national and provincial) on watershed hydrological processes and streamflows.

Study Area
The Northern Lake Erie Basin (NLEB) is in an agricultural region of southern Ontario, Canada and has a drainage area of approximately 21,750 km² (Figure 1). The major rivers in the watershed are the Grand River (280 km) and the Thames River (273 km), followed by Sydenham Creek (165 km), Bear Creek (75 km) and Big Creek (66 km). The Grand River basin (6800 km²) and the Thames River basin (5825 km²) together constitute around 60% of the NLEB. The Grand River basin contributes directly to Lake Erie, while the Thames River basin contributes to Lake St. Clair. Lake St. Clair drains from its southwestern end into the Detroit River and then into Lake Erie. The elevation in the watershed ranges from 170 to 539 m above sea level. The south-western part of the watershed is relatively flat, with elevation differences within 5 m. Agriculture is the dominant land use in the watershed, followed by forest, rangeland, urban areas and others.

Input Data
The prediction accuracy of the SWAT model, and perhaps all hydrological models, depends on how well the model input variables describe the characteristics of the watershed. A range of information is required to run the comprehensive SWAT model. The main inputs are the digital elevation model (DEM), land use, soil, and forcing weather data (mainly precipitation and minimum and maximum temperature). SWAT model inputs can be obtained from various sources, such as global, national and provincial sources, as well as local level databases. Table 1 provides an overview of the input data used in this study, and a detailed description of each input is given below.
The DEM used in this study was obtained from the Ontario Ministry of Natural Resources and Forestry. The provincial level DEM is a seamless three-dimensional model that represents ground surface elevation in meters. The resolution of the DEM is 30 m. The change in elevation in the NLEB is shown in Figure 1.

Land Use and Land Cover
Currently, both global and provincial level databases are available to download land use data for Canadian watersheds. In this study, two different sources of land use data were used: the Global Land Cover Classification (GLCC) [29] and the Southern Ontario Land Resource Information System (SOLRIS) [30] (Figure 2a,b).
The GLCC land use data, which is made available through the U.S. Geological Survey's (USGS) database (https://lta.cr.usgs.gov/GLCC), covers the whole globe at 1 km resolution and is intended to be used for various environmental research and modeling applications. The USGS National Center for Earth Resources Observation and Science (EROS), the University of Nebraska-Lincoln (UNL) and the Joint Research Centre of the European Commission derived the data from the Advanced Very-High-Resolution Radiometer (AVHRR) ten-day Normalized Difference Vegetation Index (NDVI) composites [31]. They generated the global land use data by using a multi-temporal unsupervised classification of NDVI data with post-classification refinement using multi-source earth science data. The AVHRR source imagery used to prepare the GLCC data covers 1992 to 1993. In the NLEB, the agricultural area in the GLCC makes up 91%, followed by forest (5%), urban and transportation (2%) and rangeland (2%) (Figure 2b).
The other land cover data used in the study is SOLRIS, a provincial land use dataset that is available through the Ontario Geographic Data Exchange (OGDE) as part of the Land Information Ontario (LIO) data warehouse. The land use data is produced mainly to support landscape-scale planning initiatives in southern Ontario such as source water protection, biodiversity conservation, natural spaces, and state of resources reporting. The SOLRIS data was derived from different sources including topographic maps, aerial photos and satellite imagery. SOLRIS has 23 categories, which were created based on the Ontario Ministry of Natural Resources' Ecological Land Classification (ELC) [32]. SOLRIS data has a scale of 1:50,000 and a spatial resolution of 30 m. In the NLEB, the agricultural area in SOLRIS is 76%, followed by rangeland (8%), urban and transportation (8%) and forest (8%) (Figure 2b).

Soil
In this study, global and national level soil databases were used. At the global level, the FAO-UNESCO (Food and Agricultural Organization of the United Nations-United Nations Educational, Scientific and Cultural Organization) soil map of the world was investigated. The FAO-UNESCO soil map covers the entire world at a spatial scale of 1:5 million (~5 km spatial resolution). In the NLEB, the FAO-UNESCO soil hydrologic groups are C (20,198 km²) and D (1552 km²) (Figure 3b). At the national level, the Soil Landscapes of Canada (SLC) version 3.2 [33] was investigated. The SLC is a dataset developed by the federal government, in particular Agriculture and Agri-Food Canada. The data contains the soil map of Canada together with the major characteristics of the soil for the whole country. The SLC was compiled at a scale of 1:1 million (~1 km spatial resolution), and each polygon on the map describes a distinct type of soil and its associated characteristics. In the NLEB, the SLC soil hydrologic groups are A (8105 km²), B (8250 km²) and C (5365 km²) (Figure 3a).
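To make the contrast between the two soil datasets easier to see, the following minimal sketch converts the hydrologic group areas quoted above into percentage shares. The areas come from the text; the function name `group_shares` is only illustrative. Because the SLC areas as quoted sum to 21,720 km², each share is computed relative to that dataset's own total rather than the basin area.

```python
# Hydrologic soil group areas in the NLEB (km^2), as quoted in the text.
FAO_AREAS_KM2 = {"C": 20198, "D": 1552}
SLC_AREAS_KM2 = {"A": 8105, "B": 8250, "C": 5365}


def group_shares(areas_km2):
    """Return each group's share (%) of the dataset's own total area."""
    total = sum(areas_km2.values())
    if total <= 0:
        raise ValueError("total area must be positive")
    return {group: 100.0 * area / total for group, area in areas_km2.items()}


if __name__ == "__main__":
    for name, areas in (("FAO-UNESCO", FAO_AREAS_KM2), ("SLC", SLC_AREAS_KM2)):
        shares = group_shares(areas)
        summary = ", ".join(f"{g}: {s:.1f}%" for g, s in shares.items())
        print(f"{name} (total {sum(areas.values())} km^2): {summary}")
```

Under these numbers the global map places roughly 93% of the basin in group C, whereas the national map places about three quarters of it in groups A and B, which have lower runoff potential in the standard hydrologic soil group classification.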

Weather
Spatially explicit, high-resolution meteorological data are becoming increasingly available as inputs for hydrological models. Such data can be obtained from different sources, at both global and local scales. In this study, three different sources of weather data were assessed for their applicability as forcing weather data for the SWAT model: the daily 10 km Gridded Climate Dataset for Canada (GCDC) [34], CFSR, and measured climatic data sets (Figure 4b-d).
The GCDC data contains daily precipitation and maximum and minimum temperature data at a spatial resolution of 10 km for the Canadian watersheds south of 60° N latitude. The GCDC data was prepared by interpolating daily Environment Canada climate station observations using a thin plate smoothing spline surface fitting method implemented by ANUSPLIN V4.3. The data, which covers 1961 to 2003, is available for download through the Agriculture and Agri-Food Canada database. The CFSR is a global domain third generation reanalysis product [35]. CFSR is a high-resolution coupled atmosphere-ocean-land surface-sea ice system available globally at a spatial resolution of 38 km from 1979 to 2014 at daily time steps. Several studies [36][37][38][39] across the globe have successfully used CFSR data for various hydrological applications. The measured climate data from rain gauges installed on the ground was extracted from the national climate data archive of Environment Canada. Altogether, the NLEB contains 222 GCDC stations, 22 CFSR stations and 17 measured climate stations (Figure 4b-d).

SWAT Model and Scenario Development
The Soil and Water Assessment Tool (SWAT) is a semi-distributed, physically-based hydrological model that is widely used across the world [8]. The SWAT model divides the basin into sub-watersheds and further into Hydrologic Response Units (HRUs), which have unique land use, soil and slope combinations. All processes are simulated at the HRU level at user-defined time steps, summed up to the sub-basin level, and routed through reaches until finally reaching the outlet of the basin. The model is supported by documentation (http://swat.tamu.edu/documentation/), which reviews all the hydrological processes simulated by the model. In this study we used ArcSWAT (rev 664) for the ArcGIS 10.3 Geographic Information System interface to set up four models with combinations of land use and soil. In each model, three different weather datasets were used to develop a total of twelve scenarios. Table 2 summarizes the different scenarios (SCs) developed using the SWAT model. Slope classes of 0-2%, 2-4% and >4% and a HRU threshold of 5% were used for land use, soil and slope, respectively, to delineate HRUs. Altogether, Model 1 (SC1 to SC3) had 3831 HRUs, Model 2 (SC4 to SC6) had 1470 HRUs, Model 3 (SC7 to SC9) had 2961 HRUs, and Model 4 (SC10 to SC12) had 1159 HRUs. In each model, agricultural lands were given a corn-soybean rotation based on heat units. In addition, tile drainage was implemented for all agricultural lands in flat areas (0 to 2% slope). The parameters and values used to represent tile drainage in SWAT were Ddrain: 1000 mm; Gdrain: 48 h; Tdrain: 24 h; and D_IMP: 2100 mm. The daily curve number was calculated using the plant-based ET method with CNCOEFF = 0.5. The model was run at a monthly time step from 1980 to 1993, with three years as the warm-up period [40].
Flow calibration was not conducted in this study, because calibrating each model individually would introduce biases and prevent a fair comparison between inputs, which is the purpose of this study. Moriasi et al. [41] recommended comparing crop yields when flow calibration is not performed, so that the model represents real-world conditions reasonably well. Therefore, in this study, the average corn and soybean yields generated by the model were compared with historical observed yields for this region. The simulated average yields were 7000 kg/ha for corn and 2200 kg/ha for soybean, while the average observed yields were 6500 kg/ha and 2400 kg/ha, respectively. In addition, the overall water balance, a major intra-watershed process, as affected by tile drainage, was also evaluated. Figure 5 shows that the tile-drainage contribution to the total water yield ranged from 40 to 50%, which is in close agreement with various watershed experts' opinions (personal communication with conservation authorities). Finally, simulated outputs from each scenario were saved for further analysis.
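The twelve-scenario design described above (four land use and soil models, each run with three weather datasets) can be sketched as a simple enumeration. This is a minimal illustration, not the authors' setup script. The dataset names come from the text. The mapping of SC7 to SC9 onto SOLRIS and SC10 to SC12 onto GLCC, and the weather ordering within Models 2 to 4, are assumptions inferred from how the scenarios are paired in the comparisons, since Table 2 is not reproduced here.

```python
from itertools import product

# Dataset names from the text; the ordering below is an assumption chosen so
# that SC1 = SLC + SOLRIS + GCDC, SC4 = SLC + GLCC + GCDC, and SC7/SC10 use
# FAO soils, matching the scenario pairs compared in the results.
SOILS = ["SLC", "FAO"]
LAND_USES = ["SOLRIS", "GLCC"]
WEATHER = ["GCDC", "CFSR", "Measured"]


def build_scenarios():
    """Return {scenario_number: configuration} for the 4 x 3 design."""
    scenarios = {}
    number = 1
    for model, (soil, land_use) in enumerate(product(SOILS, LAND_USES), start=1):
        for weather in WEATHER:
            scenarios[number] = {
                "model": model,
                "soil": soil,
                "land_use": land_use,
                "weather": weather,
            }
            number += 1
    return scenarios


SCENARIOS = build_scenarios()

if __name__ == "__main__":
    for number, cfg in SCENARIOS.items():
        print(
            f"SC{number}: Model {cfg['model']}, soil={cfg['soil']}, "
            f"land use={cfg['land_use']}, weather={cfg['weather']}"
        )
```

Holding two inputs fixed while varying the third is what lets each comparison isolate one input: SC1 to SC3 differ only in weather, SC1 and SC4 only in land use.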

Input Data Assessment
The water balance of a watershed is greatly influenced by the weather and by the geophysical characteristics of the watershed such as topography, land use and soil [42]. Evaluating various hydrological components of the water balance is crucial for understanding the hydrological status of the watershed. In this study, various hydrological components such as surface runoff, water yield, and evapotranspiration were evaluated for the twelve scenarios to evaluate the impacts of land use, soil and weather.
The observed and simulated streamflows at two locations (G1 and G2) in the Grand River and two locations (T1 and T2) in the Thames River were selected to compare the twelve scenarios in order to evaluate the impact of inputs (land use, soil and climate) (Figure 4a). Model performance criteria (statistics) recommended by Moriasi et al. [43] were used to evaluate the performance of the model for the compared scenarios. In this study, percent bias (PBIAS) and Nash-Sutcliffe efficiency (NSE) were used, as these two statistics have been used in various hydrological studies [12,21,44,45,46]. In addition, graphical comparisons of time variable plots and flow duration curves (FDCs) were also used to compare simulated and observed flows for different scenarios. Graphical plots provided greater insight into hydrograph representation and into low flow and high flow comparisons. In this paper, model performance criteria are presented for all scenarios at four locations, while the graphical comparisons are presented only at the T1 location for selected scenarios to compare inputs (weather, land use, and soil).
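The text does not spell out the formulas for the two statistics, so the following is a minimal sketch using their commonly used definitions: NSE = 1 - Σ(obs - sim)² / Σ(obs - mean(obs))² and PBIAS = 100 · Σ(obs - sim) / Σ(obs). With this sign convention, a negative PBIAS indicates overprediction, which is consistent with how the values are interpreted in the comparisons that follow. The flow values in the usage example are hypothetical.

```python
def nse(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, values <= 0 mean the
    model is no better than the mean of the observations."""
    if len(observed) != len(simulated) or not observed:
        raise ValueError("series must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    if variance == 0:
        raise ValueError("observed series has zero variance")
    return 1.0 - residual / variance


def pbias(observed, simulated):
    """Percent bias: positive means underprediction, negative means
    overprediction (under the sign convention assumed here)."""
    if len(observed) != len(simulated) or not observed:
        raise ValueError("series must be non-empty and of equal length")
    total_obs = sum(observed)
    if total_obs == 0:
        raise ValueError("sum of observed values is zero")
    return 100.0 * sum(o - s for o, s in zip(observed, simulated)) / total_obs


if __name__ == "__main__":
    # Hypothetical monthly flows (m^3/s), for illustration only.
    obs = [10.0, 20.0, 30.0, 40.0]
    sim = [12.0, 18.0, 33.0, 41.0]
    print(f"NSE   = {nse(obs, sim):.3f}")
    print(f"PBIAS = {pbias(obs, sim):.1f}%")
```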

Compare Different Climate Datasets Using Hydrological Budgets and Measured Streamflow
The SWAT outputs were compared between weather scenarios (SC1 (GCDC), SC2 (CFSR) and SC3 (measured)) with the combination of SLC soil and SOLRIS land use. The average annual values in millimeters (mm) for various components of the hydrological cycle for SC1, SC2 and SC3 are shown in Figure 5. Scenario SC2 (CFSR) overpredicted various components of the hydrological budget compared to SC1 and SC3. The major reason for the overprediction by SC2 was its higher precipitation of 1129 mm, compared to 939 mm for SC1 and 970 mm for SC3. However, it was interesting to note that the precipitation and various components of the hydrological budgets for SC1 and SC3 were similar.
The simulated streamflows of SC1, SC2, and SC3 were compared with observed streamflows at four locations (Table 3). The Nash-Sutcliffe efficiency (NSE) values for SC1, SC2 and SC3 were 0.73, -0.69 and 0.70 at T1 (Thames at Ingersoll) and 0.71, -1.42 and 0.75 at G1, respectively. The corresponding PBIAS values were 13.56, -50.72 and 14.65 at T1 and 1.09, -65.19 and -4.47 at G1. The model performance for SC1 and SC3 was good based on the NSE and ranged from satisfactory to good based on PBIAS [41]. The model performance for SC2 was unsatisfactory based on both NSE and PBIAS. However, the model performance improved significantly at T2 and G2, which are located downstream on the Thames River and the Grand River (Table 3), for SC1 and SC3, while SC2 remained unsatisfactory. For example, for SC1 the NSE at T2 (Thames at Thamesville) was 0.84, compared to 0.73 at T1. The time series and FDCs of the observed and simulated streamflows for SC1, SC2 and SC3 are shown in Figure 6a,b. The temporal streamflows in SC2 were overpredicted compared to SC1 and SC3. When compared to the observed streamflows, SC2 overestimated during high flow (10% exceedance probability) and middle flow, while the low flow (90% exceedance probability) was close to the observed values. SC1 and SC3 were close to each other and were also in close agreement with observed streamflow during high, middle, and low flow periods. SC3 performed slightly better than SC1 in terms of statistics (at certain locations) as well as graphical comparisons.
The overprediction of SC2 compared to SC1 and SC3 at all locations was mainly due to higher precipitation amounts in SC2, which resulted in higher streamflow. The hydrological budgets and streamflow comparisons for SC1 and SC3 were similar. This provides confidence in the GCDC climate data, which are interpolated at 10 km from the Environment Canada climate station observations, and suggests that they can be an appropriate choice for large-scale modeling because they offer greater spatial representation of data points than measured climate station data. It would also be interesting to compare the GCDC with observed streamflow in a small watershed where measured data are not present. However, this is beyond the scope of this paper; it is currently being investigated and will be reported in a future publication.

Comparison of Land Uses Using Hydrological Budgets and Measured Streamflow
The average annual values in millimeters (mm) for various components of the hydrological cycle associated with the SOLRIS and GLCC land uses were compared using SC1 and SC4, where the soil is SLC and the climate is GCDC (Figure 7a). The precipitation (939 mm) was the same for both scenarios, while a minimal change (<10%) was noticed in the other components of the hydrological cycle. In addition, the two land uses were also compared using SC7 and SC10 under a different soil condition (FAO soils) (Figure 7b). A difference of less than 5% was noticed in various components of the hydrological cycle between the SOLRIS and GLCC land uses with the same precipitation.
The model performance of SC1 and SC4 was compared with observed streamflow at four gauging stations (Table 3). At T1, the NSE values for SC1 and SC4 were 0.73 and 0.70 and the PBIAS values were 13.56 and 15.43, respectively; at G1, the NSE values were 0.71 and 0.68 and the PBIAS values were 1.09 and 3.59, respectively. The model performance was considered good based on the NSE at T1 and G1. Based on PBIAS, it was very good at G1, while at T1 it was satisfactory for SC1 and unsatisfactory for SC4. In the case of SC7 and SC10, the NSE values at T1 were 0.48 and 0.50 and the PBIAS values were 5.87 and 4.15, respectively; at G1, the NSE values were 0.36 and 0.30 and the PBIAS values were −0.47 and −1.40, respectively. The model performance was unsatisfactory based on the NSE and ranged from very good to good based on PBIAS at the T1 and G1 locations. The model performance significantly improved at the downstream locations (G2 and T2) for the compared scenarios.
The temporal streamflow and the FDCs for SC1 and SC4 at the T1 location are shown in Figure 8a,b. The streamflows of SC1 and SC4 were close to each other and were in agreement with the observed flows during high flow. However, the low flows differed significantly from the observed values, especially beyond 70% exceedance probability. This difference in the low flow is reflected in the high PBIAS (Table 3). The underprediction of the low flow by the model is attributed to the baseflow contribution in this watershed, which was not accounted for in the model simulations because the model was not calibrated, as mentioned above. The temporal streamflow and the FDCs for SC7 and SC10 at the T1 location are shown in Figure 9a,b. The streamflows of SC7 and SC10 were close to each other but were overpredicted during peaks or high flow and underpredicted during low flow compared to the observed streamflow values. This scenario is a very good example of where statistics can be misleading. For example, if only PBIAS were used to discuss the results, the model performance would be considered good. The real performance of the model can only be seen when using other statistics, such as NSE, together with graphical comparisons. Therefore, we recommend that model users use as many statistics as possible, along with time series and flow duration curves, to discuss modelling efforts. The poor model performance of SC7 and SC10 when the FAO soil was used is discussed in the next section.
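The point that PBIAS alone can be misleading can be illustrated with a small sketch. It assumes the standard NSE and PBIAS definitions; the flow values are hypothetical and were chosen so that overpredicted peaks and underpredicted low flows cancel in the volume balance. They are not data from this study.

```python
def nse(observed, simulated):
    """Nash-Sutcliffe efficiency (standard definition, assumed here)."""
    mean_obs = sum(observed) / len(observed)
    squared_error = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    return 1.0 - squared_error / sum((o - mean_obs) ** 2 for o in observed)


def pbias(observed, simulated):
    """Percent bias; negative values indicate overprediction (assumed)."""
    return 100.0 * sum(o - s for o, s in zip(observed, simulated)) / sum(observed)


# Hypothetical monthly flows: the simulated peak is too high and the
# simulated low flows are too low, but the total volumes are equal.
observed_flow = [8.0, 12.0, 40.0, 20.0, 10.0, 6.0]
simulated_flow = [3.0, 8.0, 62.0, 18.0, 4.0, 1.0]

print(f"PBIAS = {pbias(observed_flow, simulated_flow):.2f} %")
print(f"NSE   = {nse(observed_flow, simulated_flow):.2f}")
```

Because the errors compensate, the percent bias is zero while the NSE stays well below 0.5, which is the pattern described for SC7 and SC10.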
As seen above, the two land uses (SOLRIS and GLCC), despite having different resolutions and categories (e.g., 76% agriculture vs. 94% agriculture), did not cause much variation among the various components of the hydrological cycle. Model performance statistics show that the model outputs for the two land uses were in close agreement with the observed data; however, the performance for the SOLRIS land use was slightly better than that for GLCC.

Impact of Soil on Hydrological Budgets and Streamflow
SLC and FAO soils were compared by evaluating the average annual values (mm) of various components of the hydrological cycle for SC1 (SLC) and SC7 (FAO), where the land use was SOLRIS (Figure 10a). SC4 (SLC) and SC10 (FAO) were also compared to evaluate the soils under a different land use (GLCC) (Figure 10b). In both cases, the weather used was GCDC. SC1 and SC7 differed significantly from each other in terms of hydrological budgets for a given precipitation. For example, surface runoff varied by 45% and total water yield by 11.9%. Similarly, SC4 and SC10 also differed significantly where SLC and FAO soils were compared under a different land use (GLCC). The SLC soil was of fine resolution, with 75% of the area in the NLEB in the A and B hydrologic soil groups (Figure 3b). The FAO soils were of coarser resolution, with 93% of the area in the C hydrologic soil group (Figure 3a). The C hydrologic soil group in the FAO soils resulted in higher runoff in SC7 and SC10 and thus affected other hydrological cycle components.
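The link between hydrologic soil group and surface runoff can be illustrated with the SCS curve number equation, which SWAT commonly uses for surface runoff; whether that option was selected in this study is an assumption here. The curve numbers below (78 for group B, 85 for group C) are illustrative values for cropland and the 50 mm storm is hypothetical; none of these numbers come from the paper.

```python
def cn_runoff_mm(precip_mm: float, curve_number: float) -> float:
    """Event runoff depth (mm) from the SCS curve number equation,
    using the conventional initial abstraction of 0.2 * S."""
    retention = 25400.0 / curve_number - 254.0  # potential retention S (mm)
    initial_abstraction = 0.2 * retention
    if precip_mm <= initial_abstraction:
        return 0.0  # all rainfall is abstracted before runoff starts
    excess = precip_mm - initial_abstraction
    return excess ** 2 / (precip_mm + 0.8 * retention)


# Illustrative curve numbers (assumptions, not values from the study).
storm_mm = 50.0
runoff_group_b = cn_runoff_mm(storm_mm, curve_number=78.0)
runoff_group_c = cn_runoff_mm(storm_mm, curve_number=85.0)
print(f"Group B runoff: {runoff_group_b:.1f} mm")
print(f"Group C runoff: {runoff_group_c:.1f} mm")
```

For the same storm, the higher curve number associated with group C yields more runoff, which is the mechanism invoked above for the FAO scenarios.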
Simulated and observed streamflows were compared for SC1 and SC7, and for SC4 and SC10, using model performance criteria (Table 3). At T1, the NSE values for SC1 and SC7 were 0.73 and 0.48 and the PBIAS values were 13.56 and 5.87, respectively, while at G1 the NSE values were 0.71 and 0.36 and the PBIAS values were 1.09 and −0.47, respectively. Based on the NSE, the model performance was considered good for SC1 and unsatisfactory for SC7 at the T1 and G1 locations. Based on PBIAS, the performance of SC1 was satisfactory at T1 and very good at G1, and the performance of SC7 was good at T1 and very good at G1. The model performance for SC4 and SC10, where the soils were compared under a different land use (GLCC), was very similar to that for SC1 and SC7, mainly because of the minimal differences between the land uses. The model performance significantly improved at T2 and G2 even with FAO soils. For example, the NSE values at T2 and G2 were 0.71 and 0.75, which correspond to good model performance. Evaluating model performance only at the downstream locations can be misleading and may not accurately represent the intra-watershed processes. In addition, calibration using the downstream location would further alter the bio-physicochemical variations in the watershed; calibration efforts should therefore include upstream locations, in line with the recommendations of Daggupati et al. [40]. The time series and FDCs for observed and simulated streamflows at T1 for SC1 and SC7 are shown in Figure 11a,b. SC7 overpredicted high flows compared to SC1, but the two were close during low flows. When compared with observed streamflows, SC7 overpredicted during high flow and both SC1 and SC7 underpredicted the low flow. Similar trends were seen in SC4 and SC10 but are not shown in this paper. The underprediction of low flow is mainly attributed to baseflow contributions in the watershed, which were not accounted for in the model simulations because the model was not calibrated.
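A flow duration curve of the kind compared above can be built by ranking the flows and assigning each an exceedance probability. The sketch below assumes the Weibull plotting position, rank / (n + 1); the paper does not state which formula was used, and the function names and flow values are hypothetical.

```python
def flow_duration_curve(flows):
    """Return (exceedance probability in %, flow) pairs, with the
    largest flow first. Uses the Weibull plotting position (assumed)."""
    ranked = sorted(flows, reverse=True)
    n = len(ranked)
    return [(100.0 * rank / (n + 1), q) for rank, q in enumerate(ranked, start=1)]


def flow_at_exceedance(flows, percent):
    """Flow on the curve whose exceedance probability is nearest to percent."""
    curve = flow_duration_curve(flows)
    return min(curve, key=lambda point: abs(point[0] - percent))[1]


# Hypothetical monthly flows (not data from the study).
monthly_flow = [5.0, 30.0, 12.0, 8.0, 50.0, 20.0, 3.0, 15.0, 9.0]
print("High flow (10% exceedance):", flow_at_exceedance(monthly_flow, 10))
print("Low flow (90% exceedance):", flow_at_exceedance(monthly_flow, 90))
```

Applying the same function to observed and simulated series and comparing the values at 10% and 90% exceedance separates high-flow errors from low-flow errors, which a single aggregate statistic can hide.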
As seen above, the SLC and FAO soils compared in this study caused considerable variations among the various components of the hydrological cycle. In addition, the model performance statistics, time series and FDCs showed that the SLC soil performed better than the FAO soil. The poor performance of the FAO soil (SC7 and SC10) was mainly due to the larger area in the C hydrologic soil group, which resulted in higher runoff and eventually in higher peak flows.

Conclusions
In this study, a SWAT model was developed for the entire basin contributing to Lake Erie from the Canadian (Ontario) side. Land use, soil and weather datasets obtained from various data sources (e.g., global, national and provincial) were assessed to evaluate the effects of input data types on the simulation of hydrological processes and streamflows. Weather comparisons (GCDC vs. CFSR vs. measured) showed that the GCDC and measured data performed better than the CFSR data. The CFSR had higher precipitation amounts, which resulted in the overprediction of various components of the hydrological cycle and thereby of streamflow. The GCDC, which is interpolated from the Environment Canada climate station observations, performed slightly better than the measured data and therefore could be a good choice for hydrological modeling studies. Not much change was seen between the two compared land uses (SOLRIS and GLCC) in terms of hydrological components and streamflow comparisons with observed data. However, the SOLRIS land use performed slightly better due to its fine resolution and its accounting for various land use categories. The soils (SLC and FAO) compared in this study showed major differences in terms of hydrological components and streamflow when compared with observed data. The FAO soils (coarse resolution) have higher runoff potential, which alters the hydrological budgets and thereby streamflow, resulting in poor performance. This study shows that fine resolution data available from national or provincial sources better predict the hydrological processes and streamflow in the NLEB. The next steps would be to use the SWAT model with national and provincial level data, to calibrate and validate the model for water quantity (flow) and water quality (sediment, nitrogen and phosphorus), and to use the model to support appropriate management decisions to solve emerging water quality (phosphorus) issues in Lake Erie. In addition, this study recommends using various performance statistics along with graphical comparisons when presenting hydrological model results. Presenting model results only at downstream locations (outlets) can be misleading. Evaluating and presenting modeling results at upstream locations would give a greater perspective on the hydrological and intra-watershed processes occurring within the watershed.

Figure 1. Study area location map: Northern Lake Erie basin, Ontario, Canada.

Figure 2. Two land use coverage maps of the Northern Lake Erie basin: (a) Southern Ontario Land Resource Information System (SOLRIS) with 30 m resolution. (b) Global land cover classification (GLCC) with 1 km resolution.

Figure 3. Two soil coverage maps for the Northern Lake Erie basin at different spatial scales: (a) Food and Agricultural Organization (FAO) soil with 1:5 M resolution. (b) Soil Landscapes of Canada (SLC) with 1:1 M resolution.

Figure 4. Three weather coverage maps of the Northern Lake Erie basin with streamflow gauge stations: (a) Location of the four streamflow gauge stations (G1, G2, T1 and T2) with the drainage pattern within the study area. (b) Locations of the measured climate stations. (c) Gridded Climate Dataset for Canada (GCDC) with a grid resolution of 10 km. (d) Climate Forecast System Reanalysis (CFSR) with a spatial resolution of 38 km.

Figure 7. Water budget scenario comparison: (a) SC1 and SC4 land uses with a combination of SLC soil and GCDC weather. (b) SC7 and SC10 land uses with a combination of FAO soil and GCDC weather.

Figure 8. Comparison of simulated streamflow scenarios (SC1 and SC4) with observed data at the Ingersoll station (T1). (a) Monthly time series plot for the period 1983 to 1993. (b) Flow duration curve (FDC) at station T1.

Figure 9. Comparison of simulated streamflow scenarios (SC7 and SC10) with observed data. (a) Monthly time series plot for the period 1983 to 1993. (b) Flow duration curve (FDC).

Figure 10. Annual hydrologic balance components: (a) Comparison of scenarios SC1 (SLC soil) and SC7 (FAO soil) in combination with SOLRIS land use and GCDC weather. (b) Comparison of scenarios SC4 (SLC soil) and SC10 (FAO soil) in combination with GLCC land use and GCDC weather.

Figure 11. Comparison of simulated streamflow scenarios (SC1 and SC7) with observed data. (a) Monthly time series plot for the period 1983 to 1993. (b) Flow duration curve (FDC) at station T1.

Table 1. Inputs used, their resolution, availability and website to download the data.

Table 2. Twelve scenarios developed in this study. SC 1 to 3 constituted SLC soil and SOLRIS land use, SC 4 to 6 constituted SLC soil and GLCC land use, SC 7 to 9 constituted FAO soil and SOLRIS land use, and SC 10 to 12 constituted FAO soil and GLCC land use. During SWAT model development, slopes

Table 3. Model performance statistics.
