Improved Prediction of Stream Flow Based on Updating Land Cover Maps with Remotely Sensed Forest Change Detection

The water balance in a watershed can be disrupted by forest disturbances such as harvests and fires. Techniques to accurately and efficiently map forest cover changes due to disturbance are evolving quickly, and it is of interest to ask how useful maps of different types of disturbances over time can be in the prediction of water yield. We assessed the benefits of using land cover maps produced at annual vs. five-year intervals in the prediction of monthly streamflows across 10 watersheds contained entirely within the US National Forest System. We found that annually updating land cover maps with forest disturbance data significantly improved water yield predictions using the Soil and Water Assessment Tool (SWAT; p < 0.01 improvement for both the Nash–Sutcliffe efficiency measure and the ratio of the root mean square error to the standard deviation of the measured data). Improvement related to using annually updated land cover maps was directly related to the amount of disturbance observed in a watershed. Our results lay a foundation to apply new high-resolution disturbance datasets in the field of hydrologic modeling to monitor ungauged watersheds and to explore potential water yield changes in watersheds if climate conditions or management practices were to change forest disturbance processes.


Introduction
Changes in forest structure can impact water balance dynamics of a given catchment area. There is ample evidence that disturbance-such as fires [1], harvests [2], and bark beetles outbreaks [3]-can affect interception, infiltration, evapotranspiration, and runoff. However, while a variety of options exist for using remote sensing platforms such as Landsat to develop long-term spatial records of disturbance [4], there has been no effort to date that the authors know of to integrate these products in the models of annual stream flow.
Approximately 65% of the water supply in the western US states originates on forested lands, with 33% of the total being contributed solely from upper watersheds located within national forests and grasslands [5]. Prediction of forest water yield, particularly as it is affected by recent disturbances, is therefore important in managing the water supply of millions of people.
While land cover change information has been included in hydrologic models to better understand water dynamics [6][7][8][9], no reference was found of work that integrates comprehensive, annually specific disturbance maps into hydrologic models. There are, however, examples [10][11][12] of recent research using different dates (five years apart) of the National Land Cover Dataset NLCD [13,14] in an effort to integrate land cover change in hydrologic simulations.
The current research was conducted to address this question: do annual land cover maps, updated annually to reflect forest disturbances, produce better water yield estimates compared to five-year (i.e., multitemporal NLCD) land cover maps generated on an approximate five-year interval? We used the Soil and Water Assessment Tool SWAT [15,16] to model flows using both annual (land cover maps that include 22 years of disturbance from the National Forest System (NFS)), and five-year (land cover maps based on multitemporal NLCD products) across watersheds of the Intermountain West. We used SWAT due to its versatility with respect to geospatial inputs, ability to generate multiple scenarios under different land cover and climate regimes, and adaptability to work in agricultural [17], tropical [18], and snowmelt-driven [19] watersheds. Here, we combine highly-detailed disturbance datasets with a physically-based hydrologic model to predict water dynamics in 10 forested watersheds. Each of the watersheds is within the US NFS, which manages 75 million hectares of public land. A detailed map of forest disturbances was created across the NFS using Landsat imagery and conventional change detection and image interpretation methods.
Our objective-to assess any benefits in predicting water yield by using annual land cover data-is a starting point for leveraging recent advances in satellite time series analysis [20] in the monitoring and prediction of important elements of the forest hydrologic cycle.

Study Area
We selected 10 watersheds of different sizes and ecological characteristics across the Northern US Rocky Mountains ( Figure 1). The criteria for selecting watersheds to be used as pilots were as follows: (a) the watershed's spatial extent should be completely contained within the spatial domain of disturbance datasets developed across the NFS-this would guarantee a comprehensive understanding of disturbance in these watersheds; and (b) within the watershed, there should be at least one hydrometric station with sufficient (at minimum, five years) flow records that could be used to assess the goodness of fit of SWAT's monthly flow outputs. A selected number of landscape and climatic characteristic for the watersheds in the pilot study are summarized in Table 1. The attributes of the USGS hydrometric stations that were used for calibration and validation of the SWAT model in each watershed area were also provided. Notice that six stations had complete time series for the discharge data, and four stations had partial coverage for the 1990-2011 time series. Therefore, model performance for units 07, 10, 13, and 14 when reported only covers those years that overlap this study's time series. Links to all stations used in this study are provided in Table S1 in the Supplementary material. The daily stream gauge discharge data are reported in cubic feet per second, so in order to use it with SWAT's outputs, the data had to be converted to cubic meters per second.  55 58" 1995-2005 In the table above, the areas of the different forest type groups (FTG) were calculated for each watershed by intersecting the FTG geospatial layer with each watershed's boundaries. The first and second largest FTG are listed for each watershed. The FTG dataset and corresponding metadata for the contiguous US and Alaska can be accessed from the US Forest Service Geodata Clearinghouse (https://data.fs.usda.gov/geodata/rastergateway/forest_type/). One of the key parameters that has a big influence on SWAT's streamflow outputs is that of the soil hydrologic group (SHG). These are classified into four categories in a spectrum-A, B, C, and D-based on infiltration characteristics and surface runoff conditions that are required to simulate the movement of water in the soil, and the corresponding conveyance to the reach. Soils type A have the best characteristics for high infiltration and low runoff whereas soils type D are the opposite, and B and C are intermediate classifications [23]. The predominance of the soil hydrologic groups in each watershed was obtained following the same procedure as the FTG. In the table above, the areas of the different forest type groups (FTG) were calculated for each watershed by intersecting the FTG geospatial layer with each watershed's boundaries. The first and second largest FTG are listed for each watershed. The FTG dataset and corresponding metadata for the contiguous US and Alaska can be accessed from the US Forest Service Geodata Clearinghouse (https://data.fs.usda.gov/geodata/rastergateway/forest_type/). One of the key parameters that has a big influence on SWAT's streamflow outputs is that of the soil hydrologic group (SHG). These are classified into four categories in a spectrum-A, B, C, and D-based on infiltration characteristics and surface runoff conditions that are required to simulate the movement of water in the soil, and the corresponding conveyance to the reach. Soils type A have the best characteristics for high infiltration and low runoff whereas soils type D are the opposite, and B and C are intermediate classifications [23]. The predominance of the soil hydrologic groups in each watershed was obtained following the same procedure as the FTG.

The SWAT Model
SWAT (the Soil and Water Assessment Tool) has been widely used for the analysis of the hydrologic impact of land cover and management in tropical and temperate regions. The following basic description of the model was obtained from [24][25][26]. For modeling, SWAT divides the basin into sub-basins, and these into hydrologic response units (HRU), consisting of homogenous areas of soils, land cover/use, and topography that can impact the hydrology of the study area in a similar way. The SWAT model simulates what happens in the basin by predicting each component of the hydrologic cycle. The main processes are divided into two phases: terrestrial and hydrological. A brief summary of both phases is provided as supplement material.

The SWAT Model
SWAT (the Soil and Water Assessment Tool) has been widely used for the analysis of the hydrologic impact of land cover and management in tropical and temperate regions. The following basic description of the model was obtained from [24][25][26]. For modeling, SWAT divides the basin into sub-basins, and these into hydrologic response units (HRU), consisting of homogenous areas of soils, land cover/use, and topography that can impact the hydrology of the study area in a similar way. The SWAT model simulates what happens in the basin by predicting each component of the hydrologic cycle. The main processes are divided into two phases: terrestrial and hydrological. A brief summary of both phases is provided as supplement material.

SWAT's Inputs and Principal Processes
In this study, we used ArcSWAT version 2012.10_2.18 [27] within the ArcMap 10.2.2 environment. ArcSWAT is freely available, and ArcGIS version-specific ArcSWAT extensions can be obtained from the US Department of Agriculture-Agricultural Research Service website (https://swat.tamu.edu/ software/arcswat/).

Watershed Delineation
Delineation of the pilot watersheds was based on a 30 m grid cell size digital elevation model (DEM) obtained from the Geospatial Data Gateway [28]. A source area delineation threshold [29] was adopted for each pilot watershed. Following this criterion guaranteed that the average size of delineated sub-watersheds within each watershed would be an average of 4.5% of the delineated basin, which has been suggested [29,30] as the minimum threshold to detect the impact of management practices. In this context, each pilot watershed had an average of 22 sub-watersheds. The outlet location for each pilot watershed was determined by the coordinates of the hydrometric stations ( Table 1) used to assess the model's goodness of fit.

Climate Data
Daily summaries of total precipitation and air temperature (maximum, minimum, and mean) for all the pilot watersheds were obtained from the National Centers for Environmental Information [31]. A period of record dating back to the five years before the beginning of our time series (1985), and ending with the most recent records (2016) was defined, hence completely including the time period of our disturbance datasets (1990-2011). Hydrologic cataloging units (HUC levels 08 and 12) were used as spatial domains to select all available climatic data. All stations available at the HUC level 08 (even if they were located outside the actual boundaries of the watersheds) were initially selected. This provided the best observed climatic data within each watershed or in closest proximity to it. However, only those climate stations with a temporal coverage of at least 90% of our time series were ultimately chosen. An advantage to using this climate source is that data are already provided in the proper units (metric), and in the format required for SWAT analyses. Table S2 is available in the Supplementary Material. This table contains the name of the HUC level 08 that each watershed belongs to, as well as weblinks where the daily climatic records can be accessed. We defined elevation bands for each pilot watershed to account for elevation gradients and their effects on snow accumulation and melt processes [30] in mountainous watersheds. Three elevation bands were used in each pilot area.

Land Cover Characteristics
The focal point of this paper is to determine the effects, if any, of using alternative land cover maps with SWAT. The two alternatives were: five-year (multitemporal NLCD), and annual (integration of disturbance products) maps. SWAT models were fit to each watershed keeping all other inputs (DEM, climate, and soils-described below) and SWAT default parameters constant. A description of each land cover set follows. Whenever five-year maps are mentioned throughout the paper, it means that these land cover maps were refreshed or updated with a five-year delay instead of annually, which is the case for the annual maps.
Our assumption here is that the 1992 product resembles the land cover conditions at the time when our disturbance datasets start (i.e., 1990), and then the subsequent products (2001, 2006, and 2011) can be used to describe changes in the landscape that have occurred for the complete duration of our time series. No changes were made to these maps in this stage. Nevertheless, SWAT only ingests a single description of the land cover conditions for the entire duration of a regular simulation. In other words, it assumes by default that land cover does not change for whatever number of years are included in the simulation. Even though SWAT users are able to utilize NLCD products during simulations without adjustment, we needed to establish a process to make both datasets (annual and five-year maps) directly comparable in order to guarantee that no advantages were given to one dataset over the other. The process to integrate the four available temporal NLCD scenarios in one SWAT simulation is described directly after the description of the annual land cover.
• Annual land cover maps A long-term annual (30 m spatial resolution) time series was created that showed the timing, extent, and type of disturbance across the NFS from 1990-2011. As a first step in the production of these maps, an automated forest change detection algorithm [32] was run on an annual, cloud-free series of Landsat TM and ETM+ imagery. Technicians manually edited this map by adding, subtracting and changing the shape of disturbance patches using several sources of reference data, including: multi-temporal composites of Landsat data transformed to one-band per year with the Disturbance Index [33]; high-resolution time series of aerial imagery served through Google Earth; a spatial database of historical harvest activities (the National Forest System's Forest service Activity Tracking System or, FACTS); a national interagency database of fires over 405 hectares in size [34]; and insect activity records obtained from aerial detection surveys [35]. This process also allowed the technicians to label the type of disturbance (harvest, fire, bark beetle) for each affected patch.
• Fusion of a forested land cover baseline with disturbance maps For SWAT models using both five-year and annual land cover maps, we utilized a product that describes the spatial distributions of forest types groups (FTG) as described in [21]. It has been demonstrated [30] that explicitly integrating the configuration of the forested landscape (i.e., which pixels are Ponderosa Pine forests, which pixels are Douglas-Fir forests, etc.) in SWAT provides much better results. The FTG geospatial dataset was used as baseline for the 1990 year for both datasets. The five-year maps were spatially-intersected with the FTG dataset, and the forest spatial domain from the FTG was maintained. Pixels from the five-year maps that did not exactly match the FTG spatial domain were assigned to a FTG class based on proximity. This process was repeated for all the five-year maps. In this context, the parameters for the NLCD forest classes were not used in SWAT. Rather, the specific parameters for FTG classes were used. This cross-walking (NLCD to FTG) activity guaranteed that the five-year maps were using the same SWAT parameters that the annual maps used. In the case of the annual maps, and starting with the 1990 FTG map, each subsequent year was then updated to reflect the changes that occurred on the landscape due to the mapped disturbances. For instance, the land cover conditions for the very next year (1991) were determined to be the same FTG from 1990 with the exception of the disturbed pixels, which were recorded as a new land cover class that we named 'disturbed'. After another year, the classification was allowed to go back to 'forest' and the original FTG class. This is, in essence, the difference between the annual and five-year maps: the periodic use of the disturbed class. We had considered keeping disturbed pixels as disturbed in subsequent years but did not have the data to determine what the land cover condition was post disturbance. This is a limitation in our analysis. Parameters for these forest classes, and the disturbed class with regards to interception capacity, canopy density, and surface roughness were obtained from the application of SWAT in snowmelt-driven catchments reported by [30].

Soil Characteristics
We used the State Soil Geographic database STATSGO [23] to characterize the effects of different types of soils in the hydrologic regime of each watershed. These datasets describe general soil properties and contain the required attributes for SWAT to simulate the movement of water in the soil, and the corresponding conveyance to the reach. While the Soil Survey Geographic (SSURGO) [36] has a more detailed spatial resolution than that of STATSGO we did not find complete coverage for all of our pilot watersheds. According to [37] SWAT models fitted using STATSGO produced better accuracies for low streamflows and better performance during validation. The predominant soil hydrologic groups (SHGs) (i.e., A, B, C or D) [38] for each watershed are listed in Table 1.

Hydrologic Response Units
Hydrologic response units (HRUs) are intrinsically the main modeling unit in SWAT. Each watershed is divided into unique combinations of land cover/FTG, soils, and management practices prior to conducting a simulation. Due to their spatial homogeneity, uniform hydrologic responses are expected from each HRU. There are several options in SWAT which control the spatial variability and number of HRUs, which can influence the outputs [39] obtained during simulations (i.e., a greater number of HRUs capture more variability). In this study, we defined a threshold of 0% minimum of watershed area for land cover and soils units, thereby keeping all possible combinations of land cover and soils. This guaranteed a high level of resolution for modeling purposes. More details are provided below. A single slope class was used during HRUs definition, but as mentioned above the effects of snow accumulation and melt processes were included during the modeling by defining three elevation bands, which has been found [40] to improve SWAT's performance in mountainous terrain with snowmelt-driven hydrology.

Integration of Land Cover Change in SWAT
With the implementation of SWAT version 2009, a new module was added to account for land use and land cover changes that take place over the time series or the modeling period. Before this module was added, an assumption that the land cover in a given watershed did not change during the modeling period was made. This assumption is usually not true, especially during modeling periods that span several decades, such as the ones conducted in this study. A cloud-based software tool (SWAT LUU) can ingest multiple land cover geospatial datasets and produce the required inputs for the SWAT land cover change module. The multiple five-year (1992,2001,2006, and 2011) temporal datasets (modified as explained above through fusion with a forested land cover baseline), and the annual land cover maps (1990-2011) were uploaded to SWAT LUU [41] in conjunction with their corresponding SWAT modeling projects to obtain the required inputs for simulation of land cover change. This process was conducted for each one of the pilot watersheds (i.e., one project using the five-year maps, and another project using the annual maps). With each new land cover map that is available, SWAT LUU recalculates the proportions that each HRU (i.e., Ponderosa pine with an 'x' soil series, disturbed forest with a 'y' soil series, etc.) occupies in the watershed under study. With an updated description of the landscape, and with the changing climatic conditions, SWAT is then able to model the effects of land cover change. Because SWAT cannot use land cover classes not present in the first year of the simulation, a negligible amount of 'disturbed' land (five 30-m pixels per soil group) were added to the FTG map in 1990. This process was required for both annual and five-year maps.

SWAT Simulations
For each pilot watershed, two simulations were set up in SWAT: one using the five-year maps datasets and the other using the annual land cover maps which include the disturbance time series. All inputs (DEM, climate, soils), and parameters influencing those inputs were kept the same for the two simulations with the exception of land cover. Default parameters that relate to canopy structure, interception capacity, and surface roughness for the simulations using the annual maps were modified following [30] the forest classes (Lodgepole, Spruce-fir, and disturbed forest), and for the rangeland and barren classes. The Ponderosa and Douglas-fir forests classes were adjusted using a combination of the characteristics detailed by [30], and an assessment of leaf area index time series [42][43][44]. Tree growth default parameters for both simulations (five-year and annual land cover) were also modified. Preliminary runs of these models indicated that transpiration was much lower than expected. As pointed out by [45], this was due to a default operation practice in SWAT that kills and harvests the crops or forests every year. Such a forestry operation was not expected in our watersheds, and thus was removed, and the initial settings for tree growth for both types of simulations (five-year and annual) were modified accordingly.

Model Calibration
Once the initial models were fitted as described above, we conducted a calibration process so that the SWAT parameters used during model validation would be based on the local specific conditions of the 10 watersheds, and not on SWAT's default parameter values. This is particularly important given that all of our watersheds' hydrologic regimes are snowmelt dependent, and SWAT's default parameters are specific for agricultural watersheds. SWAT's parameters are process-based and must be used within a sensible and realistic range that can change from watershed to watershed. The main goal of the calibration process is to determine which parameters have a predominant influence on the hydrology of the watershed, and to estimate a range of values for those parameters that better reflect the physical processes while at the same time reduces prediction uncertainty. An initial set of 26 parameters that have been previously documented [30] as being significant in affecting the hydrology of this type of watersheds was used. The list of parameters, and the initial range of values are provided as supplement material.
Model calibration was performed using the Sequential Uncertainty Fitting (SUFI2) semi-automated algorithm that is available within the SWAT Calibration and Uncertainty Programs SWAT-CUP software [46]. In SUFI2 many iterations are performed over the selected subset of parameters, and the initial range of values. As more iterations are applied, the "parameter ranges get smaller zooming in a region of the parameter space, which produces better results than the previous iteration" [47]. The user can tune the uncalibrated model by minimizing the differences between observed and simulated flows. This process can be done by defining an objective function to optimize, which in this study it was the Nash-Sutcliffe efficiency (NSE) technique. The NSE is "a normalized statistic that determines the relative magnitude of the residual variance" [48], and indicates how well the plot of observed versus simulated data fits the 1:1 line [49] A thorough description of calibration and validation concepts, and applications can be found in [47]. The calibration was conducted independently for each of the pilot watersheds, however the initial set of parameters to be calibrated was the same for all watersheds. The calibration for each watershed was conducted as follows: (a) with the initial set of parameters and their initial values Table S3, 750 simulations were run to obtain the results for the first iteration; (b) the calibration outputs-i.e., the global sensitivity analysis, of the iteration were reviewed to determine which parameters should not be included in the next iteration. We chose to leave out those parameters (usually one or two) with the highest p-values, which may be interpreted as not having a significant impact in changing the predicted flow values; (c) we then reviewed the new parameter range values after the iteration, and copy those values for the parameters that we decided to include in the next iteration; (d) a new iteration was run with the chosen subset of parameters and their new values using 750 simulations; (e) the process to select the new subset of parameters to be used in the next iteration was repeated, and the change in NSE was noted; (f) this process was repeated until there was not a visible and noteworthy change in NSE improvement from iteration to iteration; (g) the SWAT original model was modified accordingly using the final subset of parameters, and their calibrated values. This is a computationally demanding process, and SWAT-CUP had to be used with an extension that allows parallel processing, thereby reducing the required time to run hundreds of simulations.

Model Validation
The accuracy of the SWAT calibrated models using the annual and five-year land cover was assessed independently. The NSE technique [48], the percent bias (PBIAS) measure [50], and the ratio of the root mean square error to the standard deviation (RSR) were calculated by comparing the simulated SWAT flows with the observed monthly statistics (monthly average of flow discharge) at each one of the watersheds outlets. These averages were obtained directly from the time series: monthly statistics that have been calculated for each hydrometric station and that are available from the USGS website (https://waterdata.usgs.gov), Table 1 and Table S1. During SWAT's model configuration, the spatial location of these outlets was defined by using the same coordinates of USGS's surface-water hydrometric stations. The performance of SWAT was estimated for either the entire disturbance time series  or just for the years that data were available at each hydrometric station and overlapped our period of analysis. All stations had complete monthly availability, except Wat03 for which no records were measured for the months of December through March, thus the performance that is reported for this watershed is based only on eight months of data per year. The characteristics of these stations are provided (Table S1) as supplement material.
NSE has been recommended [49,51] in the literature as the standard to compute hydrologic model performance. NSE is an excellent indicator of how well the simulated data mimics the observed data and ranges from −∝ to 1.0. Values < 0 are thought to indicate an unacceptable performance, and values > 0 and <1.0 indicate acceptable degrees of accuracy. If one model renders a value <0, then the mean of the observed data is a better predictor than the simulated values [49]. The PBIAS is usually recommended in the literature as it measures the average tendency of the simulated flows to be larger or smaller than the observed values. A PBIAS value of 0.0 is the optimal while positive values indicate model underestimation and negative values indicate a bias toward overestimation [50]. RSR varies from an optimal value of 0, which would mean a zero-residual variation-in other words, a perfect prediction by the model, to a large positive value which indicates very poor model performance. Thus, the lower the RSR, the lower the RMSE, and the better flow predictions by the hydrologic model [47]. In order to see how well the calibrated models performed using independent data, we conducted the validation phase in a period not covered during the calibration phase. We conducted the calibration at any particular watershed using data from the decade 1991-2000-namely for the years 1991-1995. We then conducted the validation by running the calibrated version of the model with the data that pertained to the period 2000-2010, namely the years 2001-2005. We considered that this would provide a much better idea of how well the calibrated models would perform in different temporal conditions.

Improvement in SWAT Predictions
To better understand any improvements due to more temporally precise disturbance data, we explored the relationship between the number of HRUs and the relative improvement in the streamflow predictions defined by Improvement = NSE or RSR using dynamic maps − NSE or RSR using static maps , The total number of HRUs is defined by Total HRUs = Initial HRUs 1991 × Number o f years with changes, Change defined by the availability of a five-year or an annual map.

Disturbance Dynamics in the Pilot Watersheds
The fraction of the land disturbed in each year for most watersheds was generally less than 3% of the total area of each watershed. However, there are some watersheds that observed larger disturbed fractions such as Wat04 in 2007 and 2008, and Wat01, Wat07, and Wat10 toward the end of the time series. In general, fire was the most prolific disturbance agent, followed by insect outbreaks. The distribution varies by watershed, however. The aggregated percent of the land that was disturbed each year per watershed is shown in the Figure 2. Details about the annual area (hectares) and proportion (%) disaggregated by causal agent (fires, abiotic, insects, and harvests) can be found in the supplementary information (Table S4).

Major Impacts of Disturbance on Streamflows
There were some notable effects on streamflow regimes in certain watersheds. These effects were conspicuous particularly on watersheds that observed the highest fractions of disturbed land. We present these impacts aggregated by the major disturbance causal agent.

Insect Attacks
Three watersheds (Wat01, Wat07, and Wat10) observed the highest per-year proportions of insect attacks. There is a comprehensive daily flow (1933-2017) dataset for Wat01 but not for Wat07 or Wat10, for which records are available from 1965-1994 and 1963-1996, respectively. Thus, an assessment of the impact of insects on streamflow could not be conducted for units Wat07 and Wat10. For Wat01, disturbances did not exceed 0.5% of the total area until 2006. Taking that year as a breakpoint, we generated the hydrologic regime for this watershed from the period 1991-2005, and then 2006-2015 (see Figure 3). The pre-disturbance period showed a relatively stable mean streamflow during the summer months, whereas the post-disturbance period exhibits a rather variable (multiple peaks) mean streamflow during the same period. Larger flows (up to 90th percentile) have decreased in the summer months as well hovering around 25 m 3 /s compared to around 35 m 3 /s in the pre-disturbance period. We utilized the R package called "FlowScreen" to prepare these graphs that show the hydrologic regime for the pre and post disturbance periods. The package "FlowScreen" automatically shades the area between the 10th and 90th percentile.
Although no trend was detected (data not shown) for the pre-disturbance median annual streamflow, there is a significant (p < 0.05) upward trend in the post-disturbance time. This can be observed in Figure 4. No upward trend was observed for precipitation during the same period of time. Insect attacks peaked at the very end of the disturbance mapping time series with 28.5% of the watershed being affected by this causal agent. To test that there was no trend in the precipitation we used the Mann-Kendall test available in the R package "Trend".

Major Impacts of Disturbance on Streamflows
There were some notable effects on streamflow regimes in certain watersheds. These effects were conspicuous particularly on watersheds that observed the highest fractions of disturbed land. We present these impacts aggregated by the major disturbance causal agent.

Insect Attacks
Three watersheds (Wat01, Wat07, and Wat10) observed the highest per-year proportions of insect attacks. There is a comprehensive daily flow (1933-2017) dataset for Wat01 but not for Wat07 or Wat10, for which records are available from 1965-1994 and 1963-1996, respectively. Thus, an assessment of the impact of insects on streamflow could not be conducted for units Wat07 and Wat10. For Wat01, disturbances did not exceed 0.5% of the total area until 2006. Taking that year as a breakpoint, we generated the hydrologic regime for this watershed from the period 1991-2005, and then 2006-2015 (see Figure 3). The pre-disturbance period showed a relatively stable mean streamflow during the summer months, whereas the post-disturbance period exhibits a rather variable (multiple peaks) mean streamflow during the same period. Larger flows (up to 90th percentile) have decreased in the summer months as well hovering around 25 m 3 /s compared to around 35 m 3 /s in the pre-disturbance period. We utilized the R package called "FlowScreen" to prepare these graphs that show the hydrologic regime for the pre and post disturbance periods. The package "FlowScreen" automatically shades the area between the 10th and 90th percentile.
Although no trend was detected (data not shown) for the pre-disturbance median annual streamflow, there is a significant (p < 0.05) upward trend in the post-disturbance time. This can be observed in Figure 4. No upward trend was observed for precipitation during the same period of time. Insect attacks peaked at the very end of the disturbance mapping time series with 28.5% of the watershed being affected by this causal agent. To test that there was no trend in the precipitation we used the Mann-Kendall test available in the R package "Trend".

Forest Fires
For the other watersheds, fire was the major causal disturbance agent. As opposed to the insect attacks which were observed by the end of the time series, fires were a prominent feature throughout the period of study. There is a significant (p < 0.1) upward trend in the median annual flows for watersheds affected by fires, which can be observed in Figure 5. In this Figure 5a we can observed that breakpoints [52] were also detected at several years in the time series. The breakpoint found during the mid-1990s corresponds well with a peak in precipitation around those years. Using the year 1997 as a cutoff for the time series we see (Figure 5b) that there is an upward trend (p < 0.01) for

Forest Fires
For the other watersheds, fire was the major causal disturbance agent. As opposed to the insect attacks which were observed by the end of the time series, fires were a prominent feature throughout the period of study. There is a significant (p < 0.1) upward trend in the median annual flows for watersheds affected by fires, which can be observed in Figure 5. In this Figure 5a we can observed that breakpoints [52] were also detected at several years in the time series. The breakpoint found during the mid-1990s corresponds well with a peak in precipitation around those years. Using the year 1997 as a cutoff for the time series we see (Figure 5b) that there is an upward trend (p < 0.01) for

Forest Fires
For the other watersheds, fire was the major causal disturbance agent. As opposed to the insect attacks which were observed by the end of the time series, fires were a prominent feature throughout the period of study. There is a significant (p < 0.1) upward trend in the median annual flows for watersheds affected by fires, which can be observed in Figure 5. In this Figure 5a we can observed that breakpoints [52] were also detected at several years in the time series. The breakpoint found during the mid-1990s corresponds well with a peak in precipitation around those years. Using the year 1997 as a cutoff for the time series we see (Figure 5b) that there is an upward trend (p < 0.01) for the median annual streamflows in watersheds affected by fire. For reference, and for all watersheds, we have included a breakdown (total area and percent annual disturbed) by causal agent (fires, harvests, insects) in the supplementary material (Table S4). the median annual streamflows in watersheds affected by fire. For reference, and for all watersheds, we have included a breakdown (total area and percent annual disturbed) by causal agent (fires, harvests, insects) in the supplementary material (Table S4).

SWAT's Outputs
Although SWAT is capable of modeling all components of the water balance (i.e., interception, evapotranspiration, water yield, etc.), and the dynamics at the reach (i.e., streamflows, sediments, chemical concentrations, etc.), we only extracted the mean monthly streamflows (m 3 /s) because it was the only consistent field measurement that was available for all pilot basins. Figure 6 depicts a subset of a typical observed vs. simulated time series flows for one of the pilot areas. A visual evaluation suggests that in this watershed the utilization of the annual maps in the SWAT simulation generated a better fit compared to using the five-year maps. The results observed during the calibration phase are shown in Figure 6a, while the results for the validation period are depicted in Figure 6b. Both land cover datasets seem to capture the monthly trends adequately during the calibration period. However, during the validation period the annual LC maps were able to capture the seasonality much better than the five-year maps. Both LC datasets generated higher values than the observed annual means, but the annual maps showed narrower differences for the majority of the years shown in this calibration and validation subsets.

Annual Maps Provide Better Streamflow Estimates than Five-Year Maps
Performance metrics were obtained for all pilot watersheds during the calibration and the validation phases for either the entire time series or only the synchronized months as explained in the methods section. The performance metrics obtained during the validation phase are shown in Table 2.

SWAT's Outputs
Although SWAT is capable of modeling all components of the water balance (i.e., interception, evapotranspiration, water yield, etc.), and the dynamics at the reach (i.e., streamflows, sediments, chemical concentrations, etc.), we only extracted the mean monthly streamflows (m 3 /s) because it was the only consistent field measurement that was available for all pilot basins. Figure 6 depicts a subset of a typical observed vs. simulated time series flows for one of the pilot areas. A visual evaluation suggests that in this watershed the utilization of the annual maps in the SWAT simulation generated a better fit compared to using the five-year maps. The results observed during the calibration phase are shown in Figure 6a, while the results for the validation period are depicted in Figure 6b. Both land cover datasets seem to capture the monthly trends adequately during the calibration period. However, during the validation period the annual LC maps were able to capture the seasonality much better than the five-year maps. Both LC datasets generated higher values than the observed annual means, but the annual maps showed narrower differences for the majority of the years shown in this calibration and validation subsets.

Annual Maps Provide Better Streamflow Estimates than Five-Year Maps
Performance metrics were obtained for all pilot watersheds during the calibration and the validation phases for either the entire time series or only the synchronized months as explained in the methods section. The performance metrics obtained during the validation phase are shown in Table 2.    Note: Bold/Underlined in this table denote "unsatisfactory" performance for streamflow simulations according to [49].
The annual land cover maps produced better performance estimates across the time series available for each pilot watershed as it can be deducted from the NSE values. For all watersheds, the NSE for SWAT models using annual maps were positive, and in eight units the NSE values using the annual maps were higher than those using the five-year maps. Further, three watersheds (03, 04, and 05) had negative NSE values using the five-year maps, meaning that simply using the mean of the observed values is better than the simulated values. With regards to the PBIAS, we found that annual and five-year LC maps frequently produced values of the opposite sign. Using the five-year maps, seven watersheds had unsatisfactory performance ratings, whereas when the annual maps were used, only three watersheds (02, 04, and 07) were rated as such. Unsatisfactory model predictions are PBIAS values ≥ ±25 for streamflow predictions [49]. Similar results are observed using RSR. Six watersheds qualified as having unsatisfactory performances when the five-year maps were used as oppose to only one unit (014) when the annual maps were used. Unsatisfactory flow predictions are those with an RSR > 0.70. With the exception of units 05 and 13, the annual maps had better R 2 values (data not shown here but included in the supplementary information) when comparing observed vs. predicted flows using the calibrated models.

Source of the Improvement in Predictions
The introduction of many years of available disturbance maps in comparison to just four NLCD different maps creates a difference in the total number of hydrologic response units HRUs that were available for SWAT to model changes from year to year. Due to the differences found in the PBIAS noted above (overestimation using one set of maps and underestimations using the other set), and to the linearity in the NSE and RSR interpretations, we provide the improvement in predictions estimates only for NSE and RSR.
The data used to evaluate the relationship is shown in Table 3. Linear models fitting improvement in NSE and RSR to the difference in the number of HRUs in a watershed yielded significant positive slopes coefficients for both estimators NSE and RSR.
The coefficients, R 2 , and p-values obtained are shown in Figure 7. Increasing the number of HRUs considered by the model by accounting for annual disturbances was strongly associated with better predictions of flow using different estimators of the model's goodness of fit.
Linear models fitting improvement in NSE and RSR to the difference in the number of HRUs in a watershed yielded significant positive slopes coefficients for both estimators NSE and RSR. The coefficients, R 2 , and p-values obtained are shown in Figure 7. Increasing the number of HRUs considered by the model by accounting for annual disturbances was strongly associated with better predictions of flow using different estimators of the model's goodness of fit.
(a) (b) Figure 7. Linear models fitting the improvement in NSE (a); and RSR (b) to the difference in the total number of HRU used by SWAT indicating significant gains in performance by using the annual maps.

Using Annual Maps in SWAT Produce Better Streamflow Predictions
To the best of our knowledge, there are no references that have used SWAT or other hydrologic models with land cover inputs updated on an annual basis. SWAT and the National Land Cover Dataset NLCD have been used to assess the impact of projected land cover change [53], historic landscape change in agricultural settings [54], and the changes in phosphorus loads between two NLCD years 1992 and 2001 [55]. We demonstrated that increasing the refresh rate of canopy cover data using remote sensing derived change maps enables SWAT to produce better streamflow estimates. Keeping all other data sources (i.e., topography, soils, and climate) constant, and individually calibrating each watershed, NSE and RSR values improved in all 10 watersheds when annual land cover maps were used. While five-year land cover maps produced negative NSE value even after calibration, indicating it would have been better to use the mean of the observed data, not a single watershed had negative NSE values when using the annual maps. All NSE values from annual maps can be considered satisfactory (NSE > 0.5) [49], which is similar to what was found for the RSR values, except one watershed (Wat014). This is a clear indication that the enhanced temporal resolution (multiple years available during the calibration period) of the annual disturbance maps, helped SWAT to better accommodate the changing landscape effects on the watershed hydrology that was observed during the validation period. While the PBIAS results showed that three of the units that used the annual maps had unsatisfactory values (Units 02, 04, 07), the situation was much worse using the five-year maps where 70% of the units had poor model performance.

Using Annual Maps in SWAT Produce Better Streamflow Predictions
To the best of our knowledge, there are no references that have used SWAT or other hydrologic models with land cover inputs updated on an annual basis. SWAT and the National Land Cover Dataset NLCD have been used to assess the impact of projected land cover change [53], historic landscape change in agricultural settings [54], and the changes in phosphorus loads between two NLCD years 1992 and 2001 [55]. We demonstrated that increasing the refresh rate of canopy cover data using remote sensing derived change maps enables SWAT to produce better streamflow estimates. Keeping all other data sources (i.e., topography, soils, and climate) constant, and individually calibrating each watershed, NSE and RSR values improved in all 10 watersheds when annual land cover maps were used. While five-year land cover maps produced negative NSE value even after calibration, indicating it would have been better to use the mean of the observed data, not a single watershed had negative NSE values when using the annual maps. All NSE values from annual maps can be considered satisfactory (NSE > 0.5) [49], which is similar to what was found for the RSR values, except one watershed (Wat014). This is a clear indication that the enhanced temporal resolution (multiple years available during the calibration period) of the annual disturbance maps, helped SWAT to better accommodate the changing landscape effects on the watershed hydrology that was observed during the validation period. While the PBIAS results showed that three of the units that used the annual maps had unsatisfactory values (Units 02, 04, 07), the situation was much worse using the five-year maps where 70% of the units had poor model performance.

Updated Land Cover Helps Yield Prediction: The Relationship between Disturbance and Flow
Forest disturbances are generally followed by higher water yields due to changes in the water balance in a watershed [56,57]. Peak discharges tend to be larger after harvest events, leading to a reduction of recharge of base flow which is needed for streams to flow during dry seasons [1]. If prompt recovery of the lost canopy cover occurs, then water yield will tend to decrease after the initial temporal increments due to amplified evapotranspiration and improved interception and infiltration due to vegetation regrowth [58].
We observed a sustained upward trend in median annual flows for two watersheds in this study: Wat01 and Wat04. This trend was visible and statistically significant for about 10 years in Wat01, and more than 15 years in Wat04. A positive trend in precipitation was not observed for these watersheds (data not shown), and thus upward trends in streamflows cannot be directly associated with a tendency of wetter years. The upward trends in streamflows may signal that land cover has not gone back to pre-disturbance conditions [58][59][60]. The potentially variable duration of disturbance effect raises the possibility that our single-year representation of the disturbed category may be insufficient and additional effort should be made to track recovery through time. This is, currently, not a trivial matter since land cover will transition to a number of different states until it returns to a condition similar to pre-disturbance [61].
Despite this limitation, the annual disturbance map data improved yield predictions in a majority of tested cases. More detailed canopy cover information is manifested in a higher number of HRUs, essentially giving the model more information to work with in predicting water yield. The increase in number of HRUs due to the addition of an annually updated disturbance was strongly correlated at the watershed level with SWAT's prediction of water yield: increased thematic and temporal resolution improved performance.

Usefulness of SWAT Outputs in Forested Watersheds
Enhancing the capacity of models such as SWAT to predict water yield under specific climatic, edaphic, and disturbance conditions will improve our ability to monitor flow in ungauged watersheds. Furthermore, improved models increase the confidence with which water managers can project water yield trends under different climatic scenarios. The proposed use of annual disturbance maps, in particular, will allow managers to explicitly consider changes in disturbance rates that may accompany a changing climate. The same predictive property may likewise be useful in understanding likely trends in water yield associated with proposed changes in harvest or treatment frequency. Such considerations are particularly important when, as is frequently the case across the NFS, watersheds are an important source of drinking water.
The disturbance-mapping process used in this study was conventional and labor-intensive. However, several automated disturbance-mapping processes based on remote sensing imagery from satellites such as Landsat are beginning to offer high-quality disturbance maps across the country and the globe [62,63]. To the degree that forest disturbance affects water yield-and all evidence here and elsewhere suggests that it does-automated remote disturbance detection offers tremendous opportunity for water management decision support.

Conclusions
We assessed the benefits of using annually-updated vs. five-year land cover maps to predict streamflows in 10 watersheds in the western US. Enhanced temporal resolution of land cover conditions during the calibration phase resulted in more precise streamflow predictions.. Improved performance was observed in the widely used [49] metrics of NSE, PBIAS, and RSR, with detected improvements on NSE and RSR being statistically significant (p < 0.01). Monitoring and predicting water yield represents an important potential application as satellite-based methods of land cover change mapping become more sophisticated and more available.

Supplementary Materials:
The following are available online at http://www.mdpi.com/1999-4907/9/6/317/s1, Document S2: description of the land and water phases of the SWAT model. Table S1: Hydrometric stations used in each watershed-weblinks to access the data online. Table S2: Climatic information used for each watershed-weblinks to access the data online. Table S3: SWAT parameters and ranges used during calibration.