Tree-Ring Reconstructions of Streamflow for the Tennessee Valley

This study reports preliminary results from a statistical screening of tree-ring width records from the International Tree-Ring Data Bank (ITRDB), conducted to evaluate the strength of the hydrological signal in dendrochronological records from the Tennessee Valley. We used United States Geological Survey (USGS) streamflow data from 11 gages within the Tennessee Valley, together with regional tree-ring chronologies, to analyze the dendroclimatic potential of the region and to create seasonal flow reconstructions. Prescreening methods included correlation, date, and temporal stability analysis of predictors to ensure practical and reliable reconstructions. Seasonal correlation analysis revealed that large numbers of regional tree-ring chronologies were significantly correlated (p ≤ 0.05) with May–June–July streamflow. Stepwise linear regression was used to create the May–June–July streamflow reconstructions. Ten of the 12 streamflow stations were considered statistically skillful (R² ≥ 0.40). Skillful reconstructions ranged from 208 to 301 years in length and were statistically validated using leave-one-out cross-validation, the sign test, and a comparison of the distribution of low flow years. Long-term streamflow variability was analyzed for the Nolichucky, Nantahala, Emory, and South Fork (SF) Holston stations. The reconstructions revealed that, while most of the Western United States (U.S.) was experiencing some of its highest flow years during the early 1900s, the Tennessee Valley region was experiencing very low flow. The results show the potential benefit of using tree-ring chronologies to reconstruct hydrological variables in the Southeastern U.S. by demonstrating the ability of proxy-based reconstructions to provide useful data beyond the instrumental record.


Introduction
Water planners and managers can make more accurate decisions based on information provided by the expanding hydrological records. Tree rings have been widely used as a proxy to reconstruct hydrological variables in the Western United States (U.S.) [1][2][3][4]. Relatively little dendroclimatological research has been conducted within the Southeastern U.S. during the past 20 years, when compared to the number of studies conducted in the Southwestern, Northwestern, and Rocky Mountain regions of the U.S. In the Southeastern U.S., many misconceptions still linger among scientists that tree-ring research is not possible due to the high decomposition and decay rates, a lack of trees that are long-lived, and the absence of climatically sensitive patterns of tree rings to facilitate cross-dating [5]. Furthermore, a lower priority is put on the hydrological reconstructions in the Southeast U.S., due to the abundant water supplies.
The limited number of reconstructions for the Southeastern U.S. can be explained by several factors. Tennessee Valley Authority (TVA) dam construction has limited the number of undisturbed streams in the region. The region's natural topography divides the area into many small catch basins and obstructs rainfall pathways within watersheds. These topographic effects may explain why the proximity of a tree-ring chronology to a streamflow gage is not always indicative of a statistically significant streamflow-tree-growth relationship. In addition, the Southeastern U.S. receives more precipitation than most parts of the country, especially when compared to the Western U.S., providing less motivation for water quantity studies. Finally, the lack of streamflow gage and tree-ring datasets spanning comparable lengths contributes to the difficulty of obtaining long calibration windows.
Although misconceptions still exist regarding the applicability of dendroclimatology in the southeast, tree rings in the region have been used to investigate the relationships between climate and tree-growth. Blasing et al. [6] found that tree-rings were a good predictor of May-June precipitation for East Tennessee. Phipps [7] reconstructed the Occoquan River monthly summer streamflow in Virginia, finding June streamflow to be the strongest predictand. Stahle et al. [8] created a 1000-year spring-summer precipitation reconstruction within North Carolina, South Carolina, and Georgia, which was found to replicate most of the multidecadal variability apparent in the available instrumental rainfall data. More recent studies have found strong climate signals in tree-ring patterns, from Texas to Florida to Virginia, and sites that are further inland [9][10][11][12], confirming the potential for the development of a more extensive network of sites, for spatial reconstructions of the past climate.
The first objective of this research was to analyze the dendroclimatic potential of a critical flood control and hydropower region in the Southeastern U.S. (the Tennessee Valley), using streamflow and regional tree-ring chronology datasets. The selected streamflow gages contribute to the Tennessee River, the largest tributary of the Ohio River, which has a length of over 1000 km and a watershed area of over 100,000 km². The river originates in eastern Tennessee and, thus, the selected streamflow gages are in, and adjacent to, the headwaters of the basin. Based on previous studies, we hypothesized that regional tree growth would be significantly correlated with spring-summer streamflow. This study focused on the development of skillful reconstructions of streamflow and did not assess the relationship between climate signals and ring growth variations. Our second objective was to create statistically skillful (based on the overall variance explained and model stability) streamflow reconstructions for 11 gages within the Tennessee Valley. Our final objective was to examine the long-term hydrological variability of Tennessee Valley streamflow on a timescale exceeding the instrumental record. The current research evaluated the hydrological reconstruction potential in the Tennessee Valley and the need for additional sampling of tree-ring proxies in the region to improve the understanding of past climates. Doing so might provide valuable water availability information to Tennessee Valley water resource planners and managers.

Materials and Methods
The methodology for developing streamflow reconstructions begins with the collection of streamflow and tree-ring chronology datasets. The streamflow data collected was converted from flowrate to seasonal volume, and was the dependent variable in the regression model. Tree-ring chronology data was then collected and was the independent variable in the regression model. Prior to inputting the tree-ring chronology data into the regression model, prescreening (date of collection, correlation, and stability) was performed. Regression models were then developed and model fit (skill) was evaluated.
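The conversion from flow rate to seasonal volume can be illustrated with a short sketch. It assumes that the input is a monthly mean discharge in cubic feet per second; the function name, the input layout, and the input unit are hypothetical choices made for this example, since the conversion procedure is not detailed here.

```python
import calendar

CFS_TO_M3S = 0.3048 ** 3      # cubic feet per second -> cubic meters per second
SECONDS_PER_DAY = 86_400


def seasonal_volume_hm3(monthly_mean_cfs, year, months=(5, 6, 7)):
    """Return the seasonal flow volume in hm^3 (million cubic meters).

    monthly_mean_cfs is a hypothetical mapping {month number: mean discharge in cfs}.
    """
    total_m3 = 0.0
    for month in months:
        days = calendar.monthrange(year, month)[1]   # days in that month
        total_m3 += monthly_mean_cfs[month] * CFS_TO_M3S * days * SECONDS_PER_DAY
    return total_m3 / 1e6


# Example: a constant 1000 cfs through May, June, and July (92 days).
volume = seasonal_volume_hm3({5: 1000.0, 6: 1000.0, 7: 1000.0}, 1950)
```

Summing the three monthly volumes gives the May-June-July value that serves as the dependent variable in a regression of this kind.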

Streamflow (United States Geological Survey (USGS))
Streamflow data for 11 gages within the Tennessee Valley were obtained from the United States Geological Survey (USGS) website, via the National Water Information System [13]. One of the most important components in a streamflow reconstruction is the accuracy and length of the existing streamflow gage records. Although the USGS streamflow-gaging program began collecting streamflow data as early as 1887, not all USGS gage stations had the same period of record. Some USGS gage stations had missing data due to technical, mechanical, or otherwise unknown reasons. The USGS gage stations used in this study contained no missing data, and most of the stations had a record of acceptable length for calibration with the regional tree-ring chronologies. Although these rivers were in close proximity (Figure 1), the elevation and drainage area of each station were unique (Table 1). Monthly cumulative flow in million cubic meters (hm³, MCM) was used. The monthly variability of streamflow for four stations (Nolichucky, Nantahala, Emory, and SF Holston) is shown in Figure 2.
Reconstruction TRCs indicate tree-ring chronologies that were found to be statistically correlated with streamflow and were used in the developed reconstructions. Non-Reconstruction TRCs indicate tree-ring chronologies that were not found to be statistically correlated with streamflow and were not used in the developed reconstructions.
Hydrology 2019, 6, 34

Tree-Ring Chronologies (ITRDB)
Tree-ring chronology datasets within and around the Southeastern U.S. were retrieved from the International Tree-Ring Data Bank (ITRDB) [14], which is maintained by the National Oceanic and Atmospheric Administration (NOAA) Paleoclimatology Program. All ring width series were uniformly processed and standardized using the AutoRegressive STANdardization (ARSTAN) program [15], and those results are available on the ITRDB. Conservative detrending methods (negative exponential/straight line fit or a cubic spline two thirds the length of the series) were applied before all series were combined into a single site chronology [16]. Low-order autocorrelation in the chronologies, which may be attributed in part to biological factors [17], was removed by autoregressive modeling, and the resulting residual chronologies were used for analysis. The residual chronology type (rather than the standard chronology type, which retains autocorrelation) has previously been found to be appropriate when modeling hydrological variables in the Western [1][2][3][4] and Southeastern U.S. [18]. As the reconstruction length and moisture sensitivity of Eastern U.S. tree species were unknown at the time of data collection, we initially examined 102 chronologies across 12 states (Figure 1) for the strength of their responses to Tennessee Valley streamflow.

Predictor Prescreening Methods
Three prescreening methods were used to identify the most suitable tree-ring chronologies to use as predictors for the reconstruction models. First, a date screen was used. Many of the tree-ring samples within the Southeastern U.S. were last collected during the early 1980s. We used 1980 as the cutoff year for the initial predictor pool and removed from the analysis any chronologies cored before 1980.
Next, we inspected correlation coefficients between various streamflow seasons and residual tree-ring chronologies (in and adjacent to the Tennessee Valley) to identify the streamflow season most influential to tree growth and, therefore, most suitable for reconstruction. One of the most important aims of the seasonal correlation analysis was to determine a common streamflow season to reconstruct for all 11 streamflow gages. Based on similar studies in the surrounding regions, we hypothesized that a strong relationship would be found between tree growth and spring-summer (April-August) streamflow. However, streamflow seasons of various lengths were analyzed for completeness. We considered the relationship between tree growth and ten streamflow seasons of different durations. The three-month seasons investigated were January-March, April-June, May-July, July-September, and October-December. The six-month seasons were January-June, April-September, and July-December. May-June and annual streamflow were also considered. We retained significant (p ≤ 0.05), positive r-values for the analysis.
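The correlation screen described above can be sketched as follows. This is a minimal illustration rather than the procedure used in the study: the function names and the toy data are hypothetical, and the p-value is approximated with the Fisher z-transform (a normal approximation) so that the example needs only the standard library.

```python
import math
from statistics import NormalDist


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def approx_p_value(r, n):
    """Approximate two-sided p-value via the Fisher z-transform (assumption)."""
    r = max(min(r, 0.999999), -0.999999)      # keep atanh finite
    z = math.atanh(r) * math.sqrt(n - 3)
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def screen_chronologies(seasonal_flow, chronologies, alpha=0.05):
    """Keep chronologies with a positive, significant correlation with flow."""
    retained = {}
    for name, series in chronologies.items():
        r = pearson_r(series, seasonal_flow)
        if r > 0 and approx_p_value(r, len(series)) <= alpha:
            retained[name] = r
    return retained


# Toy data (hypothetical): one responsive, one inverse, one weak chronology.
flow = [10, 12, 9, 15, 11, 14, 8, 13, 16, 7, 12, 10]
good = [1.0, 1.15, 0.95, 1.4, 1.1, 1.35, 0.85, 1.3, 1.5, 0.75, 1.2, 1.05]
chronologies = {"good": good, "inverse": [-v for v in good], "weak": [1.0, 1.1] * 6}
retained = screen_chronologies(flow, chronologies)
```

Running the same screen once per candidate season and counting the retained chronologies gives the kind of season-by-season comparison used to select a common reconstruction season.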
The last prescreening method was temporal stability analysis. This consisted of computing correlations in a 30-year moving window (using MS Excel), similar to Biondi and Waikul [19], between the various streamflow seasons and the residual chronologies. Chronologies with negative 30-year correlation values with seasonal flow were considered unstable and were removed from the analysis. Stability analysis ensured that reliable and practical streamflow reconstructions were generated.
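A moving-window stability check of this kind can be sketched in a few lines. The function names and the synthetic series below are hypothetical; the rule that any negative window correlation marks a chronology as unstable follows the description above.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def moving_correlations(chronology, flow, window=30):
    """Correlation in every consecutive window of `window` years."""
    last_start = len(chronology) - window
    return [pearson_r(chronology[i:i + window], flow[i:i + window])
            for i in range(last_start + 1)]


def is_stable(chronology, flow, window=30):
    """A chronology is treated as stable if no window correlation is negative."""
    return all(r >= 0 for r in moving_correlations(chronology, flow, window))


# Synthetic 40-year example (hypothetical values).
chron = [(i * 7) % 11 for i in range(40)]
flow_stable = [2 * v + 1 for v in chron]                              # tracks the chronology
flow_unstable = [v if i < 20 else -v for i, v in enumerate(chron)]    # relationship flips
```

With 40 years of overlap and a 30-year window, eleven correlations are computed; the second synthetic flow series fails the check because the later windows turn negative.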

Reconstruction Methodology
Model calibration windows were controlled by the date that streamflow was first collected at each gage station. While all calibration windows ended in 1980, the beginning dates of the calibration windows ranged from 1919 to 1949 (Table 1). The ability of the statistically significant and stable moisture-sensitive tree-ring chronologies to predict streamflow was tested using a forward and backward (standard) stepwise regression model. A standard stepwise regression adds and removes predictors, as needed, at each step. The procedure stops when all variables not in the model have p-values greater than the specified alpha-to-enter value and all variables in the model have p-values less than or equal to the specified alpha-to-remove value. Following the procedure of Woodhouse et al. [20], a predictor chronology was entered into the stepwise regression model if the p-value of its F-statistic was at most 0.05, and it was retained if that p-value was at most 0.10.
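The entry and removal logic of a standard stepwise regression can be sketched as below. This is an illustrative implementation under stated assumptions, not the software used in the study: all function names are hypothetical, the least-squares fit uses the normal equations, and the p-value of the partial F-test (1 and n - k - 1 degrees of freedom) is obtained by numerically integrating the Student-t density, which is adequate for comparison against thresholds such as 0.05 and 0.10.

```python
import math


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            m[r] = [u - f * v for u, v in zip(m[r], m[c])]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][k] * x[k] for k in range(r + 1, n))) / m[r][r]
    return x


def sse(cols, y):
    """Residual sum of squares of y regressed on an intercept plus cols."""
    rows = [[1.0] + [c[i] for c in cols] for i in range(len(y))]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve(xtx, xty)
    return sum((yi - sum(b * v for b, v in zip(beta, r))) ** 2 for r, yi in zip(rows, y))


def t_two_sided_p(t, df):
    """Two-sided Student-t p-value by Simpson integration of the density."""
    t = min(abs(t), 100.0)                      # beyond this the p-value is negligible
    steps = 2 * (int(20 * t) + 50)              # even number of intervals
    c = math.exp(math.lgamma((df + 1) / 2) - math.lgamma(df / 2)) / math.sqrt(df * math.pi)
    pdf = lambda x: c * (1 + x * x / df) ** (-(df + 1) / 2)
    h = t / steps
    s = pdf(0) + pdf(t) + sum((4 if i % 2 else 2) * pdf(i * h) for i in range(1, steps))
    return max(0.0, 1.0 - 2.0 * s * h / 3.0)


def partial_p(y, base, extra):
    """p-value of the partial F-test for adding `extra` to the columns in `base`."""
    full = sse(base + [extra], y)
    df = len(y) - len(base) - 2
    if full <= 0.0:
        return 0.0
    f = max((sse(base, y) - full) / (full / df), 0.0)
    return t_two_sided_p(math.sqrt(f), df)      # F(1, df) equals t(df) squared


def stepwise(y, predictors, alpha_enter=0.05, alpha_remove=0.10):
    """Forward-and-backward selection; returns predictor names in order of entry."""
    selected = []
    for _ in range(4 * len(predictors)):        # guard against cycling
        changed = False
        outside = [n for n in predictors if n not in selected]
        if outside:
            base = [predictors[n] for n in selected]
            p, best = min((partial_p(y, base, predictors[n]), n) for n in outside)
            if p <= alpha_enter:
                selected.append(best)
                changed = True
        if selected:
            tests = []
            for n in selected:
                base = [predictors[k] for k in selected if k != n]
                tests.append((partial_p(y, base, predictors[n]), n))
            p, worst = max(tests)
            if p > alpha_remove:
                selected.remove(worst)
                changed = True
        if not changed:
            break
    return selected


# Hypothetical data: flow depends on chronology x1; x2 is an unrelated series.
x1 = [0.8, 1.2, 1.0, 0.6, 1.4, 0.9, 1.1, 1.3, 0.7, 1.0,
      1.5, 0.5, 1.2, 0.9, 1.1, 0.8, 1.3, 1.0, 0.7, 1.4]
x2 = [1.1, 0.9, 1.3, 1.0, 0.8, 1.2, 0.7, 1.0, 1.1, 0.9,
      1.2, 0.8, 1.0, 1.3, 0.9, 1.1, 0.7, 1.2, 1.0, 0.8]
noise = [3, -2, 1, -4, 2, 0, -3, 4, -1, 2, -2, 1, 3, -3, 0, 2, -1, -2, 4, -4]
flow = [50 + 100 * a + e for a, e in zip(x1, noise)]
selected = stepwise(flow, {"x1": x1, "x2": x2})
```

Because the entry threshold (0.05) is stricter than the removal threshold (0.10), a predictor that has just entered is not immediately removed, which keeps the procedure from oscillating in typical cases.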
Several statistical measures were used to establish the statistical skill of each streamflow reconstruction. R² measures the proportion of variance explained by each model. R²-predicted was calculated from the Predicted REsidual Sums of Squares (PRESS) statistic. The PRESS statistic is based on a leave-one-out cross-validation, in which a single year or observation is removed when fitting the model. As a result, the prediction errors are independent of the predicted value at the removed observation [21]. The Variance Inflation Factor (VIF) indicates the extent to which multicollinearity is present in a regression analysis. Generally, a VIF value close to 1.0 indicates low correlation between predictors and is ideal for a regression model [22]. The Durbin-Watson (D-W) statistic was used to analyze the autocorrelation structure of the model residuals. The sign test, a nonparametric procedure that counts the number of agreements and disagreements between instrumental and reconstructed flow, was used for additional model validation.
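The skill measures listed above can be written out compactly. The sketch below is illustrative only: the function names and data are hypothetical, R²-predicted is computed by explicitly refitting the model with each year left out, and the sign test is taken as a one-sided binomial test on departures from the mean, which is an assumption about the form of the test.

```python
import math


def ols(X, y):
    """Least-squares coefficients (intercept first) for rows X and response y."""
    A = [[1.0] + list(r) for r in X]
    k = len(A[0])
    m = [[sum(r[i] * r[j] for r in A) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(A, y))] for i in range(k)]
    for c in range(k):                               # Gaussian elimination
        p = max(range(c, k), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, k):
            f = m[r][c] / m[c][c]
            m[r] = [u - f * v for u, v in zip(m[r], m[c])]
    beta = [0.0] * k
    for r in reversed(range(k)):
        beta[r] = (m[r][k] - sum(m[r][j] * beta[j] for j in range(r + 1, k))) / m[r][r]
    return beta


def predict(beta, row):
    return beta[0] + sum(b * v for b, v in zip(beta[1:], row))


def residuals(X, y):
    beta = ols(X, y)
    return [yi - predict(beta, r) for r, yi in zip(X, y)]


def total_ss(y):
    mean = sum(y) / len(y)
    return sum((yi - mean) ** 2 for yi in y)


def r_squared(X, y):
    return 1.0 - sum(e * e for e in residuals(X, y)) / total_ss(y)


def r_squared_predicted(X, y):
    """1 - PRESS / SST, with PRESS from leave-one-out refits."""
    press = 0.0
    for i in range(len(y)):
        beta = ols(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        press += (y[i] - predict(beta, X[i])) ** 2
    return 1.0 - press / total_ss(y)


def durbin_watson(resid):
    return sum((b - a) ** 2 for a, b in zip(resid, resid[1:])) / sum(e * e for e in resid)


def vif(X, j):
    """Variance inflation factor of predictor column j."""
    target = [r[j] for r in X]
    others = [[v for i, v in enumerate(r) if i != j] for r in X]
    return 1.0 / (1.0 - r_squared(others, target))


def sign_test(observed, reconstructed):
    """Return (agreements, disagreements, one-sided binomial p-value)."""
    n = len(observed)
    mo, mr = sum(observed) / n, sum(reconstructed) / n
    agree = sum((o - mo) * (r - mr) > 0 for o, r in zip(observed, reconstructed))
    p = sum(math.comb(n, k) for k in range(agree, n + 1)) / 2 ** n
    return agree, n - agree, p


# Hypothetical calibration data: two chronologies and a seasonal flow series.
x1 = [0.8, 1.2, 1.0, 0.6, 1.4, 0.9, 1.1, 1.3, 0.7, 1.0, 1.5, 0.5]
x2 = [1.1, 0.9, 1.3, 1.0, 0.8, 1.2, 0.7, 1.0, 1.1, 0.9, 1.2, 0.8]
noise = [8, -5, 3, -9, 6, -2, -7, 9, -4, 5, -6, 2]
X = [[a, b] for a, b in zip(x1, x2)]
y = [50 + 100 * a + 20 * b + e for a, b, e in zip(x1, x2, noise)]
r2, r2_pred = r_squared(X, y), r_squared_predicted(X, y)
dw = durbin_watson(residuals(X, y))
```

R²-predicted cannot exceed R², so a large gap between the two is a sign of overfitting; a D-W value near 2 indicates little first-order autocorrelation in the residuals.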

Results
After the date screening, 72 of the 102 chronologies were retained and used for seasonal correlation analysis. As in Blasing et al. [6], the two-month period May-June contained the largest number of significant tree-ring chronologies for the majority of the 11 gages. In contrast, the winter months never yielded many highly correlated tree-ring chronologies. While the number of significant tree-ring chronologies was similar for the April-June and May-July seasons, tree growth contained a stronger moisture signal (higher correlation) with May-July streamflow than with April-June streamflow. Rather than reconstructing May-June streamflow, as in Blasing et al. [6], we reconstructed May-July streamflow, because a three-month season provides more information on the temporal characteristics of climate variability over a longer season. The number of chronologies with positive, significant (p ≤ 0.05) r-values after seasonal correlation varied for each streamflow station and ranged from three (Watauga gage) to thirty-five (NF Holston, Nolichucky, and Valley gages). Following stability analysis, the final number of chronologies entered as initial predictors in the calibration models ranged from three (Watauga gage) to thirty-four (NF Holston gage).
For all streamflow gages, the most feasible calibration models and reconstructions were chosen (Table 2). We based feasibility on the length of the reconstruction, the overall variance explained by the model, and the predictability of the model. Ten of the 12 calibration models were considered to be statistically skillful (R² ≥ 0.40). The D-W test for autocorrelation in the regression residuals showed that the autocorrelation was not significant for most of the models, indicating that the residuals were random and the models were appropriate [23]. The D-W value for the Nolichucky calibration suggested that the model had serial correlation, but the results were not conclusive. VIF values for all models were within acceptable ranges, and the sign test results were significant (p ≤ 0.01) for 11 of the 12 calibration models. Tree-ring chronologies that were retained by at least one of the stepwise regression models came from a variety of locations (Figure 1) and species (Table 3). The Knob Job chronology (eastern red cedar) was retained by the highest number of calibration models. More oak chronologies were available on the ITRDB in the Southeastern U.S. than chronologies of any other species, and they were retained by the greatest number of models. While the Hampton Hills chronology (white oak) contained a strong moisture signal and was retained in four of the models, it only dated to 1772, which limited the reconstruction length for those gages. Furthermore, many of the bald cypress tree-ring chronologies on the Atlantic coast, previously found to contain a high moisture signal [8], were also retained in many of our models.
We chose four streamflow stations (Nolichucky, Nantahala, Emory, and SF Holston) that had sufficient calibration windows (≥40 years) and covered a large spatial region of the Tennessee Valley (Figure 1) for analysis. These four calibration models (Figure 3) explained 42%-52% of the variance in the May-June-July streamflow records. The models generally captured the year-to-year trends and the peaks of the regional streamflow (Figure 3). May-June-July streamflow reconstructions, smoothed with five-year end-year filters, were created for the Nolichucky, Nantahala, Emory, and SF Holston gages (Figure 4). Flow was reconstructed back to 1686 at the Nolichucky gage, to 1679 at the Nantahala gage, and to 1772 at the Emory and SF Holston gages. The reconstructions revealed numerous wet and dry periods that varied slightly at each gage. The distribution of flow years in the lowest 10th percentile from 1772 to 1980 was analyzed for the visual validation of the streamflow reconstructions (Figure 5). The distribution of low flow years across the four stations was consistent from 1772 to 1910. The period from 1910 to 1940 revealed numerous dry years that matched favorably across the four stations.
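The smoothing and the low-flow comparison can be illustrated with a brief sketch. It rests on two assumptions: that a five-year end-year filter is a five-year mean assigned to the last year of each window, and that the lowest 10th percentile is defined by a nearest-rank rule. The function names and the toy series are hypothetical.

```python
def end_year_filter(flow_by_year, width=5):
    """Mean of each year and the preceding width - 1 years, keyed by the end year.

    Assumes flow_by_year holds consecutive years with no gaps.
    """
    years = sorted(flow_by_year)
    smoothed = {}
    for i in range(width - 1, len(years)):
        window = years[i - width + 1:i + 1]
        smoothed[years[i]] = sum(flow_by_year[y] for y in window) / width
    return smoothed


def low_flow_years(flow_by_year, percentile=10):
    """Years whose flow is at or below the nearest-rank percentile threshold."""
    values = sorted(flow_by_year.values())
    rank = max(1, (percentile * len(values) + 99) // 100)   # integer ceiling
    threshold = values[rank - 1]
    return sorted(y for y, v in flow_by_year.items() if v <= threshold)


# Toy reconstruction (hypothetical): flows 1, 2, ..., 20 for 1901-1920.
flows = {1901 + i: float(i + 1) for i in range(20)}
smoothed = end_year_filter(flows)
dry_years = low_flow_years(flows)
```

Applying `low_flow_years` to each gage over a common period and comparing the resulting lists is one simple way to check whether dry years coincide across stations.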
In the Western U.S., specifically the Upper Colorado River Basin, the highest sustained flows in the last 500 years occurred in the early decades of the 20th century [20]. In contrast, the Tennessee Valley experienced numerous May-June-July low flow years from 1910 to 1940. Stahle et al. and Stahle and Cleaveland [8,24] also found dry periods during this interval in their spring-summer precipitation reconstructions for North Carolina, South Carolina, and Georgia. We noted for the first time that, while most of the Western U.S. was experiencing some of its highest flow years during the early 1900s, the Tennessee Valley region was experiencing very low spring-summer flows. In comparing the observed and reconstructed extreme (low and high) flows for the four streams after applying the five-year end-year filter (Figure 4), the most extreme observed low flows (when compared to the reconstructed flows) generally occurred in the late 1980s, while the most extreme high flows were in the recent (1990s and 2000s) records. Additionally, the Emory River and SF Holston displayed a decline in streamflow at the end of the observed record.
Although our reconstructions were not as robust (in terms of length and explained variance) as those developed in the Western U.S., they could provide regional water managers with a visual tool to analyze current and future spring-summer streamflow patterns and extremes within the Tennessee Valley. Climatic persistence from year to year and biological persistence in tree growth in the Southeastern U.S. make it difficult to create statistically skillful hydrological reconstructions, because tree growth is likely driven by several environmental variables. There would be value in collecting more recent samples from the tree species that showed a significant response to precipitation in our research. Many of the chronologies in the region available on the ITRDB were last cored in the 1980s, making it difficult to compare recent changes in climate with the climate of past centuries.

Discussion
Reconstructions of hydrological parameters provide valuable information to water managers and planners, given the limited period of record of the observed data. While preliminary, the current research represents the first comprehensive evaluation of the streamflow reconstruction potential in eastern Tennessee and western North Carolina. Statistically skillful reconstructions of seasonal streamflow were developed for multiple gages, providing useful information about past drought and pluvial periods in the region. As noted previously, the distribution of low flow years across the four stations was consistent from 1772 to 1910. Additionally, the most recent period (the 1990s and 2000s in the observed record) appeared to be a pluvial period when compared to the reconstructed flows. Climate signals (e.g., El Niño Southern Oscillation (ENSO), Atlantic Multidecadal Oscillation (AMO)) are well established in the Southeast U.S. and have been shown to influence streamflow [25] and, in turn, tree growth [8,26]. While these climate signals have not been shown to extend to the Midwest U.S., streamflow [25] and tree-ring-based reconstructions of drought [27] have been linked to the North Atlantic Oscillation (NAO), indicating that the method of utilizing tree-ring proxies influenced by climate signals would be applicable in other regions. Future collections of new tree-ring proxies would likely increase the statistical skill of the reconstructions and, perhaps, lengthen the season (i.e., May-June-July) of the streamflow reconstruction, providing more information on past water availability.
