Nitrogen and Phosphorus Concentration Thresholds toward Establishing Water Quality Criteria for Pennsylvania, USA

Abstract: Nutrient enrichment is currently a leading cause of impairment to streams in Pennsylvania. Evaluating the water quality condition and eutrophic status of streams and rivers is a challenge without established thresholds for nutrient concentrations, which can vary depending on climate and landscape characteristics. The US Environmental Protection Agency (USEPA) has published nutrient criteria for nutrient ecoregions nationwide that are used as regional baseline values, and has encouraged states to develop more refined values if better data are available. In this study, we quantified long-term nutrient concentrations observed in streams and rivers across Pennsylvania using a robust water quality dataset compiled from monitoring data collected over the past two decades (2000–2019) by multiple agencies. We estimated nutrient criteria concentration thresholds for each ecoregion using USEPA's percentile approach. The 25th percentile median concentrations observed in streams and rivers ranged from 0.27 to 2.30 mg/L for total nitrogen (TN), and from 0.010 to 0.053 mg/L for total phosphorus (TP). The percent of sites with available data that exceeded the 25th percentile was 53% for TN and 60% for TP, reflecting longstanding problems with nutrient pollution of rivers and streams in Pennsylvania. The 25th percentile may overestimate background condition levels, as nutrient conditions vary substantially within and among ecoregions. We compared our contemporary concentrations at the threshold values to other published recommended criteria for the region and explored the influence of landscape heterogeneity and seasonality on nutrient concentrations. The spatial and temporal variability of nutrient conditions emphasizes the importance of using percentile analysis as only a guide toward more robust response-based methods, rather than as a method for setting nutrient criteria in Pennsylvania.


Introduction
Nutrient pollution from nitrogen and phosphorus is associated with many surface water impairments, including eutrophication of lakes and estuaries, nuisance and harmful algae blooms, and coastal hypoxia [1,2]. The US Clean Water Act requires states to assess water quality conditions, and nutrient enrichment from nitrogen and phosphorus is a common cause of impairment to water bodies across the Nation [3,4]. Despite long-term policy and regulation efforts and substantial investments in pollution management and control, concentrations of nutrients in many streams and rivers of Pennsylvania exceed natural background levels [2,5]. Currently, nutrients are the fifth-leading cause of impairment in Pennsylvania streams (Figure 1) [4].
Initial reductions of nutrient loadings in Pennsylvania resulted from early statutes such as the state's Clean Streams Law that regulated discharge of sewage to streams [6] and laid the foundation for regulating pollutants under the U.S. Clean Water Act [4]. The primary mechanism to set legal pollution limits on phosphorus and nitrogen in streams is through development of Total Maximum Daily Load (TMDL), computed for impaired waters that do not meet their designated use [7]. Federal and state control measures have been mandated and enforced for point sources such as atmospheric emissions and wastewater effluent, but less so for non-point runoff to surface waters. For example, under the U.S. Clean Air Act Amendments of 1990, the United States Environmental Protection Agency (USEPA) has implemented several regulations including the Acid Rain Program, the Clean Air Interstate Rule, and the Cross-State Air Pollution Rule, which have successfully reduced emissions of nitrogen oxides from power plants and have had beneficial effects on water quality [8,9]. Further, under the U.S. Clean Water Act [4], USEPA has implemented pollution control programs under which it is unlawful to discharge wastewater effluent into surface waters unless permits are obtained through the National Pollutant Discharge Elimination System. While USEPA has developed national water quality criteria recommendations for pollutants in surface waters, progress has been limited in achieving nutrient goals. Compliance with erosion and nutrient management plans to reduce nutrient loss from agriculture sectors has historically been mostly voluntary, except for more recent requirements for agricultural inspections of farms in the Chesapeake Bay watershed [10].
Evaluating the water quality condition and eutrophic status of streams and rivers is a challenge without established thresholds for nutrient concentrations, which can vary depending on climate, land use, soils, and geology. The USEPA has instructed states and tribes to adopt numeric nutrient criteria as one strategy toward reducing pollution of waterways with excess nitrogen and phosphorus [11,12]. Toward that goal, USEPA has published technical guidance and criteria recommendations to help states develop scientifically defensible nutrient criteria for rivers and streams using percentile and/or predictive analysis approaches [13–16]. For a percentile analysis, a data frequency distribution is used, where the suggested criterion for an ecoregion is determined by the 75th percentile of reference streams or the 25th percentile of all streams with nutrient concentration data available [13,17]. Predictive analysis uses relationships of response-based variables with nutrients to identify reference conditions. In the landscape classification developed by USEPA based on geology, land use, ecosystem type, and nutrient conditions, the contiguous United States has 14 nutrient ecoregions that are an aggregation of Omernik Level III ecoregions [18,19]. These ecoregions provide groupings of similar ecosystem types that can be used for analysis [18]. The USEPA has published baseline nutrient criteria for the aggregate nutrient ecoregions across the Nation based on percentile analysis [13] and has encouraged states to replicate this approach with more refined regional or statewide datasets, or to develop alternative predictive models.
Except for waterways designated as public supplies, there are no accepted scientifically defensible numeric criteria for nutrients in Pennsylvania to date. Reliable estimates of nutrient criteria values are needed, based on the best available water quality observations, to guide policy and management. In this study, we quantified long-term nutrient concentrations observed in streams and rivers across Pennsylvania, and calculated nutrient thresholds using USEPA's recommended percentile approach. Our nutrient concentration values were developed using a new, robust water quality dataset compiled from all reliable monitoring data statewide, collected over the past two decades (2000-2019) by multiple agencies [20]. We compare our new data-based values for Pennsylvania to USEPA's recommended ecoregional nutrient criteria and to other published estimates. Further, we consider the influence of geology, physiography, and land use on nutrient concentration values. Our results provide environmental managers with new insights regarding the contemporary status of nutrient conditions in Pennsylvania's ecoregions toward establishing statewide numeric nutrient criteria.

Study Area
The state of Pennsylvania, USA, covers an area of 116,083 km² with a population of 12.8 million people. With 137,029 km of streams [4], Pennsylvania has the second-most miles of streams in the United States and the highest stream density per unit area of any state. The streams and rivers of PA drain to the Delaware Bay (from the Delaware River basin), the Chesapeake Bay (from the Susquehanna and Potomac basins), the Gulf of Mexico (from the Ohio River basin), or to the Great Lakes (from the Lake Erie and Genesee basins) (Figure 2a). The environmental conditions are heterogeneous, as the state spans large gradients of terrain, climate, and land use. Pennsylvania's physiographic provinces include the Appalachian Plateaus (60.8%), Ridge and Valley (27.7%), Piedmont (10%), New England (0.5%), Central Lowlands (0.5%), and Coastal Plain (0.5%). The geology is composed mainly of siliciclastic, crystalline, carbonate, and sand/gravel rocks overlain by a landscape of forested (54.2%), agricultural (23.5%), and developed (15.5%) land [21].
Pennsylvania has five major nutrient ecoregions [15–17] (Figure 2b). Ecoregion XI-Central and Eastern Forested Uplands covers 52.7% of PA and is predominantly an unglaciated forested mountainous and upland plateau region with high-relief terrain and steeply sloped streams. In ecoregion VII-Glaciated Dairy Region (15.6%), unconsolidated sediments deposited from past glaciation events have left many wetlands and lakes surrounded by rolling hills covered with forests and dairy operations. The portion of Pennsylvania in ecoregion VIII-Nutrient-Poor, Largely Glaciated Upper Midwest and Northeast (21.1%) is a heavily forested plateau of horizontal siliciclastic rocks with less glaciation in the west compared to the eastern glaciated Pocono region. Ecoregion IX-Southern Temperate Forested Plains and Hills (10.2%) is composed of irregular plains and hills, often with more intensive agricultural pasture and cropland or developed areas. The small portion of ecoregion XIV-Eastern Coastal Plain in PA (0.4%) is mostly developed with some agriculture and woodland on a mostly flat landscape of poorer drainage.
Water 2020, 12, x FOR PEER REVIEW 3 of 13

Estimating Nutrient Water Quality Threshold Concentrations
In this study, we followed a method specified in the US Environmental Protection Agency's technical guidance to develop scientifically defensible nutrient criteria for Pennsylvania rivers and streams. To estimate numeric values of nutrient thresholds, USEPA recommends synthesizing water quality data from all streams and rivers within each nutrient ecoregion and evaluating the frequency distribution of the concentration values using a percentile approach [13–15]. This frequency distribution is based on seasonal median concentration values for a given nutrient water quality parameter, observed at various monitoring locations across broad geographic regions (Figure 3). Seasonal median concentration values are used to represent typical conditions, minimizing bias from outliers or undue influence by a small number of monitoring sites [14]. The threshold nutrient criterion value is determined by the first quartile of all streams with nutrient concentration data available [13,17]. This lower 25th percentile of all nutrient data is taken to approximate background conditions, that is, good water quality without nutrient impairments [14] (Figure 3).
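The percentile calculation described above can be illustrated with a minimal Python sketch. It is not the code used in this study: the function names, the linear-interpolation rule for the percentile, and the concentration values are all assumptions chosen for illustration.

```python
from collections import defaultdict


def percentile(values, q):
    """Return the q-th percentile (0-100) of values.

    ASSUMPTION: linear interpolation between ranked values; the
    interpolation rule used in the study is not stated in the text.
    """
    if not values:
        raise ValueError("percentile() needs at least one value")
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def thresholds_by_ecoregion(records, q=25):
    """Group seasonal median concentrations by ecoregion and return
    the q-th percentile of each group.

    records: iterable of (ecoregion, seasonal_median_mg_per_L) pairs.
    """
    grouped = defaultdict(list)
    for ecoregion, conc in records:
        grouped[ecoregion].append(conc)
    return {eco: percentile(vals, q) for eco, vals in grouped.items()}


# Hypothetical seasonal median TN values (mg/L); not data from this study.
records = [
    ("VII", 0.4), ("VII", 0.6), ("VII", 0.8), ("VII", 1.0), ("VII", 1.2),
    ("IX", 2.0), ("IX", 3.0), ("IX", 4.0), ("IX", 5.0), ("IX", 6.0),
]
thresholds = thresholds_by_ecoregion(records)
```

With real data, each record would be one site-season-year median, so frequently sampled sites contribute more values to the ecoregion distribution than rarely sampled ones.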

A multi-agency dataset of nutrient concentrations observed at stream and river monitoring sites within Pennsylvania from 2000 to 2019, monitored by federal, state, and local agencies, was synthesized using USEPA's suggested methodology [20]. We retrieved data from samples collected by the US Geological Survey, the US Environmental Protection Agency, the Pennsylvania Department of Environmental Protection, the New York State Department of Environmental Conservation, the Susquehanna River Basin Commission, Stroud Water Research Center, and other agencies, all of which are publicly available via the national Water Quality Portal [22]. We compiled concentration data on total nitrogen (TN, expressed in units of mg N/L) and total phosphorus (TP, expressed in units of mg P/L). In the various databases, TN and TP concentration values were described as either the total or the unfiltered fraction of the water sample, and both were used interchangeably. Data values reported as missing, and outliers more than six standard deviations from the mean, were removed from the dataset [23]. Censored data (non-detects) having concentration values less than the reporting limit or reported as values of zero were set to the detection limit for that parameter, and only medians or percentiles above the highest reporting level were used to present descriptive statistics [24]. Any detection value < 0.0001 was reported as 0.0001.
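The screening steps above can be sketched in Python as follows. This is a hedged illustration, not the processing code behind the dataset: the function name, the order of the steps, the use of a single detection limit per parameter, and the choice to compute the mean and standard deviation over all non-missing raw values are assumptions. The restriction of descriptive statistics to values above the highest reporting level is not implemented here.

```python
from statistics import mean, pstdev

REPORTING_FLOOR = 0.0001  # values below this are reported as 0.0001 (from the text)


def clean_concentrations(values, detection_limit, n_sd=6.0):
    """Screen one parameter's raw concentration values (mg/L).

    values: list of floats; None marks a missing value.
    detection_limit: ASSUMED single detection limit for the parameter.
    """
    # 1. Drop values reported as missing.
    present = [v for v in values if v is not None]
    if not present:
        return []

    # 2. Remove outliers more than n_sd standard deviations from the mean.
    #    ASSUMPTION: mean and SD are computed over all non-missing values.
    mu = mean(present)
    sd = pstdev(present)
    kept = [v for v in present if abs(v - mu) <= n_sd * sd]

    # 3. Set censored values (below the limit, or reported as zero)
    #    to the detection limit.
    substituted = [max(v, detection_limit) for v in kept]

    # 4. Report anything still below the floor as the floor value.
    return [max(v, REPORTING_FLOOR) for v in substituted]


# Hypothetical TP values (mg/L): two non-detects, one missing, one extreme outlier.
raw = [0.5] * 98 + [0.0, 0.004, None, 500.0]
cleaned = clean_concentrations(raw, detection_limit=0.01)
```

A six-standard-deviation screen only removes values from fairly large samples, since a single value among n observations can sit at most about sqrt(n - 1) population standard deviations from the mean.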
For each monitoring site, a median concentration value for each season for each year from 2000 to 2019 was used as the aggregate dataset for analysis in this study (Figure 3) [15]. If only two observations were available for a site during a season of a year, their average was computed; if only one observation was available, that single value was used. This approach preserves the spatial extent of observations and the range of streams sampled among the land use and geological settings in Pennsylvania. Eliminating data (e.g., site-seasons with three or fewer observations) would greatly reduce the spatial extent of observations and bias the data analysis toward frequently sampled sites such as large rivers and locations monitored for pollution issues. Seasonal timeframe classifications were based on those used for ambient water quality criteria recommendations for nutrient ecoregions of PA [16]. This aggregate dataset of median seasonal concentrations was then used to create summary and frequency distributions for total nitrogen (TN) and total phosphorus (TP) for this 20-year period for the nutrient ecoregions of Pennsylvania. To simplify terminology, the use of nutrient concentration in this paper refers to the aggregate values of median-seasonal concentration described above.
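The aggregation to one value per site, season, and year can be sketched as below. The month-to-season mapping is a placeholder assumption (meteorological seasons, with December kept in its calendar year); the study takes its season definitions from the ecoregional criteria documents [16], which are not reproduced here. The function name and sample values are likewise hypothetical.

```python
from collections import defaultdict
from statistics import median

# ASSUMPTION: meteorological seasons; the study's own season
# definitions follow its reference [16] and may differ.
SEASON_BY_MONTH = {
    12: "winter", 1: "winter", 2: "winter",
    3: "spring", 4: "spring", 5: "spring",
    6: "summer", 7: "summer", 8: "summer",
    9: "fall", 10: "fall", 11: "fall",
}


def seasonal_medians(observations):
    """Collapse raw observations to one median per (site, year, season).

    observations: iterable of (site_id, year, month, conc_mg_per_L).
    statistics.median returns the mean of the two values when only two
    observations exist and the value itself when only one exists,
    which matches the rules stated in the text.
    """
    groups = defaultdict(list)
    for site, year, month, conc in observations:
        groups[(site, year, SEASON_BY_MONTH[month])].append(conc)
    return {key: median(vals) for key, vals in groups.items()}


# Hypothetical TN observations (mg/L); not data from this study.
obs = [
    ("A", 2010, 6, 1.0), ("A", 2010, 7, 3.0), ("A", 2010, 8, 2.0),
    ("A", 2010, 1, 0.4), ("A", 2010, 2, 0.6),
    ("B", 2011, 10, 0.9),
]
result = seasonal_medians(obs)
```

Because every site-season-year with at least one observation yields a value, sparsely sampled sites remain in the aggregate dataset, which is the design choice the paragraph above argues for.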

Results
From our synthesis of water quality data, a total of 55,437 individual nutrient observations of TN or TP concentrations from 1307 stream sampling locations across Pennsylvania were used [20]. This is a rich dataset allowing robust summary statistics, ensuring that the median values can be determined properly even in the presence of outlier concentration values. These data were used to calculate the median seasonal concentrations observed in streams and rivers from 2000 to 2019, with 13,861 values for TN and 13,822 values for TP statewide (Table 1). The frequency, 25th percentile, median, and 75th percentile for the median seasonal concentrations of TN and TP for each aggregate nutrient ecoregion, physiographic province, geological setting, and land use category of Pennsylvania were calculated from our water quality dataset (Table 1). A full data table of these percentile values, from 5 to 95% in 5% increments, is provided along with this manuscript as Supplementary Material. A large population and range of concentrations were available for each of USEPA's aggregate nutrient ecoregions found in Pennsylvania, except for ecoregion XIV, which occupies a small geographic area of the state. The distribution of TN and TP varied greatly across the regions (Table 1, Figure 4). Overall, the highest nutrient concentrations were found in ecoregions IX and XIV, which are largely within the Piedmont and Atlantic Coastal Plain physiographic settings.
Assessments of water quality condition require the ability to compare the current status against an expectation of what natural conditions might be in minimally disturbed areas referred to as a reference condition [25]. Following USEPA's guidance, the 25th percentile median value of an ecoregion is suggested as a simple metric to represent reference conditions that can be used for nutrient criteria development. Based on our statewide seasonal median concentrations, the percent of sites with available data that exceeded the USEPA suggested nutrient criteria in ecoregions across the state for TN and TP was 85% and 87%, respectively. The percent of sites that exceeded the 25th percentile from the data compiled in this study with a more refined and robust statewide dataset was 53% for TN and 60% for TP. Similarly, long-term TP concentrations exceeded natural background levels in 52% of streams nationwide [5]. The high percentage of streams with TN and TP concentrations exceeding the 25th percentile median values highlights longstanding problems with nutrient pollution of rivers and streams in Pennsylvania.
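A site-level exceedance percentage of the kind reported above could be computed as in the following sketch. The text does not state how a site with many seasonal values is judged to exceed a threshold, so the rule used here (the median of a site's seasonal medians is above the threshold) is an assumption, as are the function name and the sample values.

```python
from statistics import median


def percent_sites_exceeding(site_values, threshold):
    """Return the percent of sites whose typical concentration exceeds threshold.

    site_values: dict mapping site_id -> list of seasonal median
                 concentrations (mg/L) for that site.
    ASSUMPTION: a site exceeds the threshold when the median of its
    seasonal medians is strictly greater than the threshold.
    """
    if not site_values:
        raise ValueError("no sites supplied")
    exceeding = sum(
        1 for vals in site_values.values() if median(vals) > threshold
    )
    return 100.0 * exceeding / len(site_values)


# Hypothetical seasonal median TN values (mg/L) for four sites.
sites = {
    "A": [0.2, 0.3, 0.4],
    "B": [1.0, 1.2],
    "C": [0.5],
    "D": [2.0, 0.1, 0.2],
}
pct = percent_sites_exceeding(sites, threshold=0.45)
```

Under this rule a site with one high season, such as site D, does not count as exceeding; a stricter rule (any seasonal median above the threshold) would give a higher percentage.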
We compared the nutrient concentration threshold values in ecoregions derived from our contemporary Pennsylvania water quality dataset with estimates from previous studies (Table 2) that were conducted regionally or nationally [16,26–29]. The 25th percentile values of TN from this study were comparable to USEPA's regional criteria values for ecoregions VII and VIII, whereas our values for ecoregions IX, XI, and XIV were larger by a factor of 2 to 3 (Figure 4, Table 2). The 25th percentile values of TP from this study were comparable to USEPA's regional criteria values for ecoregions VII, VIII, IX, and XI, but were almost twice as high for ecoregion XIV. The regions where our Pennsylvania-specific results differ most from other studies have large areas of agricultural and developed land. Although our 25th percentile values for PA differed from USEPA's recommended criteria in some ecoregions, the ecoregions with high and low criterion values were generally consistent between the two.
Figure caption fragment: The red line is USEPA's recommended regional criterion value for each ecoregion. The dashed line is the detection limit.

Discussion
This study is the first to provide a percentile analysis of long-term median nutrient concentrations in Pennsylvania using a robust multi-agency water quality dataset. Across the USEPA-defined ecoregions of Pennsylvania, the 25th percentile median concentrations observed in streams and rivers ranged from 0.27 to 2.30 mg/L for TN, and from 0.010 to 0.053 mg/L for TP (Table 1). This range is broadly consistent with the statewide nutrient concentration thresholds (2.01 mg/L TN and 0.07 mg/L TP) estimated for earlier decades [30]. Within the aggregate ecoregions of Pennsylvania, the 25th percentile threshold may overestimate true background or reference condition levels, as nutrient conditions differ substantially within and among ecoregions due to both natural and anthropogenic factors.
We quantified the variability in median nutrient concentrations and the associated potential nutrient criteria threshold values observed across common statewide geographic characterizations (see Table 1, Figure 4). These results highlight that using a single statewide numeric criterion for nutrients (TN or TP) is unreasonable, as concentrations show orders of magnitude differences across geological, land use, and ecoregional settings, which in turn are controlled by both natural processes and anthropogenic inputs. Our results also highlight that a single numeric nutrient threshold criterion (i.e., a single value of TN or TP) for an entire USEPA aggregate ecoregion is not well suited for policy and management purposes in Pennsylvania (Figures 4 and 5). Interrelated variables like land use have been shown to be better predictors of variability than ecoregions [31]. To illustrate the variability within ecoregions, we explored how seasonality affects nutrient concentrations among land use and geological settings within Pennsylvania (Figure 5). Across all ecoregions, TN concentrations are often lower in the spring and summer than in fall and winter, likely because of plant uptake and assimilation of nutrients during the growing season. Land uses such as agriculture in carbonate settings have a large influence on TN criteria for ecoregions IX and XI. Similarly, developed land use increases the TP criterion in ecoregions such as IX.
The shortcomings of the percentile approach to estimate numeric nutrient criteria have been well documented (e.g., [17,27]) and were apparent in this analysis as well. The percentile approach produces nutrient concentration values dependent on the land use, geology, climate, and time-period of the sample population. For example, the 25th percentile median value in ecoregion IX in southeastern Pennsylvania is not likely representative of background or reference conditions, as environmental characteristics such as agricultural land use produce higher concentrations of nutrients observed throughout the general dataset (Table 1). Inconsistencies in data and reporting practices across agencies for censored data limited the use of smaller percentiles such as the 5th percentile, which might be a better metric toward representing reference or background conditions. (Note that Supplementary Material Tables S1 and S2 provide the 5th to the 95th percentiles of median seasonal concentrations for TN and TP, respectively). The spatial and temporal variability of nutrient conditions emphasizes the importance of using percentile analysis as only a guide toward more robust response-based methods, rather than as a method for setting nutrient criteria in Pennsylvania.
Differences in 25th percentile potential criteria values estimated among studies (Table 2) can result from dissimilar datasets, time periods, and study regions. The larger geographic region used in the development of USEPA ecoregion criteria has been shown to be too coarse [28] and often produces a lower criterion and a less accurate picture of conditions compared with the use of a more robust state-specific dataset. For instance, the portion of ecoregion IX within Pennsylvania represents more intensive agriculture than exists for the same ecoregion outside of the state. Limitations of the percentile approach include the difficulty of assembling similar multi-agency datasets across comparable time periods. Importantly, the agency water quality monitoring programs themselves may cause bias in the results, for example if a data collection program focused on sampling mostly streams with water quality impairments, sampling particular land use types, or sampling mainly under baseflow rather than storm flow conditions.

Conclusions
The USEPA views numeric nutrient criteria as needed for protecting and restoring a waterbody's designated uses related to pollution of streams and rivers with TN and TP [12]. The aim is to use the criteria to assess the water quality conditions of a waterbody relative to its designated use in order to help formulate discharge permits, and to inform management plans for restoring impaired water bodies [12]. The percentile-based reference condition approach is one of several tools promoted by USEPA for the development of nutrient criteria, along with predictive analysis such as empirical stressor-response relationships and mechanistic modeling approaches. Numeric criteria may not necessarily be needed to accomplish the goal of controlling excessive nutrient loadings. For example, many published modeling studies for individual watersheds or water regions directly link inputs of TN or TP to response variables that represent water quality impairments such as dissolved oxygen and algal growth. There remains a need to consider how the development of nutrient criteria fits with decision making or regulatory processes within Pennsylvania and other states; for example, establishing linkages among designated uses for stream and river segments, their water quality condition (e.g., legally impaired or not in relation to criteria set for their designated use), and nutrient inputs.
Simulations of spatially and temporally averaged TN and TP concentrations and loadings to individual reaches of the stream and river network are now publicly available from the USGS Spatially Referenced Regression on Watersheds (SPARROW) approach, applied at regional and national scales [32][33][34]. Incorporating further monitoring datasets from multiple agencies into future SPARROW models could help improve predictions of nutrient concentrations for freshwater tributaries across the watersheds of Pennsylvania. In turn, there is the potential to use modelled results to provide a new perspective on water quality conditions and threshold nutrient criteria values.
As required by Section 304(a) of the U.S. CWA and mandated by the 1998 Clean Water Action Plan, each state must develop numeric criteria for nutrients. It has now been several decades since the U.S. Environmental Protection Agency published recommended regional values for N and P using the percentile approach as a starting point for states [14]. The Pennsylvania Department of Environmental Protection is currently working to further develop water quality criteria and standards consistent with the CWA. The department has yet to formalize legally defensible nutrient criteria values for the state, and is currently pursuing response-based approaches [35] and other methods. Our composite, contemporary dataset on nutrient concentrations observed across the heterogeneous environmental settings of Pennsylvania, and the median 25th percentile values produced in this study (Table 2), will be useful toward guiding the further development of nutrient criteria within the state.
Even with the large amount of monitoring data available for Pennsylvania, there were still too few observations for streams in several settings, particularly undeveloped areas underlain by carbonate rock, to adequately understand background levels as affected by natural and human processes. The paucity of data representing specific background conditions points to the need for additional monitoring in representative reference settings. Similarly, the seasonal periods for which no observational data are available over the past two decades in certain geographic settings (Figure 5) point to the need for additional consideration of water quality monitoring over representative time frames. Furthermore, much of the government-funded investment in water quality data collection programs among multiple agencies within Pennsylvania is lost to secondary use, because the data cannot be used beyond their original intent. This cost could be avoided with a coordinated effort to address ambiguous data issues, implement common metadata practices, and collect additional essential parameters such as streamflow, which is needed for nutrient load and trend analysis, to improve the usability of the data [20,36]. A future statewide framework that utilizes all agency data could provide useful datasets to assess nutrient conditions across all streams, develop spatially consistent regional reference sites and support nutrient criteria development.
Supplementary Materials: The following are available online at http://www.mdpi.com/2073-4441/12/12/3550/s1. Table S1: Percentile values (from 5% to 95% in increments of 5%) for median seasonal concentrations for total nitrogen (mg N/L) at stream and river monitoring sites across Pennsylvania from 2000 to 2019. Table S2: Sample percentiles for median seasonal concentrations for total phosphorus (mg P/L) at stream and river monitoring sites across Pennsylvania from 2000 to 2019. Values are shown by the major USEPA ecoregions, physiographic provinces, geological settings, and land uses of Pennsylvania.