An Analysis of Terrestrial and Aquatic Environmental Controls of Riverine Dissolved Organic Carbon in the Conterminous United States

: Analyses of environmental controls on riverine carbon ﬂuxes are critical for improved understanding of the mechanisms regulating carbon cycling along the terrestrial-aquatic continuum. Here, we compile and analyze riverine dissolved organic carbon (DOC) concentration data from 1402 United States Geological Survey (USGS) gauge stations to examine the spatial variability and environmental controls of DOC concentrations in the United States (U.S.) surface waters. DOC concentrations exhibit high spatial variability in the U.S., with an average of 6.42 ± 6.47 mg C/L (Mean ± Standard Deviation). High DOC concentrations occur in the Upper Mississippi River basin and the southeastern U.S., while low concentrations are mainly distributed in the western U.S. Soil properties such as soil organic matter, soil water content, and soil sand content mainly show positive correlations with DOC concentrations; forest and shrub land have positive correlations with DOC concentrations, but urban area and cropland demonstrate negative impacts; and total instream phosphorus and dam density correlate positively with DOC concentrations. Notably, the relative importance of these environmental controls varies substantially across major U.S. water resource regions. In addition, DOC concentrations and environmental controls also show signiﬁcant variability from small streams to large rivers. In sum, our results reveal that general multi-linear regression of twenty environmental factors can partially explain (56%) the DOC concentration variability. This study also highlights the complexity of the interactions among these environmental factors in determining DOC concentrations, thus calls for processes-based, non-linear methodologies to constrain uncertainties in riverine DOC cycling.


Introduction
Riverine carbon cycling is an important but insufficiently investigated component of the global carbon cycle [1][2][3][4]. Global carbon export from land to rivers ranges from 1.9 to 2.7 Pg C/year [1,4], which is comparable with the net carbon uptake by terrestrial ecosystems [5]. The inclusion of riverine carbon in regional carbon monitoring and assessment is necessary and critical, as highlighted in the Fifth Assessment Report of the International Panel on Climate Change (IPCC) [6]. However, our incomplete understanding of the variability and environmental controls of riverine carbon limits our confidence in integrating this component into regional carbon budgeting [2,7].
(NWIS): http://qwwebservices.usgs.gov/. According to the USGS, DOC was measured through UV-promoted persulfate oxidation and infrared spectrometry: https://nwql.usgs.gov/pubs/Method% 20holding%20times%205- 25-10.pdf. Most of the samples were collected during the 1980s to 2010s. To explore the impacts of autochthonous production on DOC concentrations, we obtained chlorophyll-a (parameter code: 70953) data from USGS (246 stations with valid observations). We obtained total nitrogen (parameter code: 00600) and total phosphorus (parameter code: 00665) observations to examine the impacts of riverine nutrients on DOC.
In this study, we primarily focus on the spatial variability of DOC concentrations. We calculated the average DOC concentrations at each station for the subsequent statistical analysis. Since outliers may have significant impacts on the statistical analyses and interpretation of environmental controls on DOC concentrations, data quality control was performed before the statistical analyses. Specifically, we screened the water quality data to remove observations beyond three standard deviations from the mean [35].
In this study, we primarily focus on the spatial variability of DOC concentrations. We calculated the average DOC concentrations at each station for the subsequent statistical analysis. Since outliers may have significant impacts on the statistical analyses and interpretation of environmental controls on DOC concentrations, data quality control was performed before the statistical analyses. Specifically, we screened the water quality data to remove observations beyond three standard deviations from the mean [35]. River basin attributes for each gauge station were obtained from the Geospatial Attributes of Gages for Evaluating Streamflow (GAGES-II) database [36]. This database provides geospatial attributes for stations with streamflow records longer than 20 consecutive years: https://water.usgs.gov// GIS/metadata/usgswrd/XML/gagesII_Sept2011.xml#stdorder. These attributes represent the average conditions of the drainage area of each gauge station. The environmental factors considered in this study include climate conditions (temperature and precipitation), hydrological factors (runoff), land cover (cropland, forest, wetlands, urban area, and grassland), soil properties (soil texture, soil organic matter, soil water content, and soil bulk density), topographic factors (river basin slope and river order), nutrient availability (total nitrogen and total phosphorus), and anthropogenic impacts (dam density). These variables are closely related to either the carbon cycle or the water cycle, thus may affect DOC concentrations in surface waters directly or indirectly. The unites of the selected variables can be found in the Supplementary Material (S1). To reduce the impacts of non-normal distribution on the outcome of statistical analyses, we used log transformation to convert highly skewed variables to approach symmetric distributions [25,37]. The variables that were transformed are listed in Table  1, and the results of the logarithmic transformations can be found in the Supplementary Material (S2). River basin attributes for each gauge station were obtained from the Geospatial Attributes of Gages for Evaluating Streamflow (GAGES-II) database [36]. This database provides geospatial attributes for stations with streamflow records longer than 20 consecutive years: https://water.usgs.gov//GIS/ metadata/usgswrd/XML/gagesII_Sept2011.xml#stdorder. These attributes represent the average conditions of the drainage area of each gauge station. The environmental factors considered in this study include climate conditions (temperature and precipitation), hydrological factors (runoff), land cover (cropland, forest, wetlands, urban area, and grassland), soil properties (soil texture, soil organic matter, soil water content, and soil bulk density), topographic factors (river basin slope and river order), nutrient availability (total nitrogen and total phosphorus), and anthropogenic impacts (dam density). These variables are closely related to either the carbon cycle or the water cycle, thus may affect DOC concentrations in surface waters directly or indirectly. The unites of the selected variables can be found in the Supplementary Material (S1). To reduce the impacts of non-normal distribution on the outcome of statistical analyses, we used log transformation to convert highly skewed variables to approach symmetric distributions [25,37]. The variables that were transformed are listed in Table 1, and the results of the logarithmic transformations can be found in the Supplementary Material (S2). We divided the 1402 stations into 18 groups based on the delineation of the major U.S. water resource regions. This allows us to examine the regional variations in DOC concentrations and associated environmental controls. We also investigated the impact of the environmental factors on DOC concentrations by river orders through grouping all the stations based on their Strahler river order numbers. In this study, headwaters refer to first-through third-order streams, whereas large rivers are those of the sixth and higher-order streams.
We employed linear regression analyses to quantify the impacts of environmental factors on DOC concentrations [38]. First, we conducted bivariate regression analyses across all the 1402 stations to evaluate whether the linear correlation between each of the selected factors and DOC concentrations is statistically significant. Next, general linear models (GLMs) were employed to combine all environmental factors and quantify the collective influences of the selected factors on DOC concentrations. The GLM analyses were performed at two spatial levels; over the entire U.S. and in each water resource region. In addition, we grouped DOC data by four river orders to understand how the hierarchy of river networks can influence DOC concentrations. Factors with high collinearity were removed from these GLM analyses. The variance inflation factor (VIF, ranges from 1 to infinity) is an index quantifying collinearity among multiple independent factors; this index increases with the degree of collinearity. We checked the VIF values of all selected factors in each GLM simulation. The factor with the highest VIF value was removed from the GLM simulations repeatedly until the VIF of each factor was less than 10. Details of the GLM models and the statistical analyses can be found in the Supplementary Material (S3, S4, and S5). The average DOC of the available observations at each station was used as the dependent variable for the GLM simulations. All statistical analyses were conducted using the R statistical software (http://www.R-project.org). A confidence level of 95% (or p value < 0.05) was used to detect statistically significant correlations between an environmental factor and DOC concentrations.

Spatial Variability of DOC Concentrations in the U.S.
The average DOC concentration of all 1402 stations is 6.42 ± 6.47 mg C/L (Mean ± Standard deviation), with low DOC concentrations (<2 mg C/L) mainly located in the Piedmont area of the eastern U.S. and in the northwestern U.S. (Figure 1). High DOC concentrations (>15 mg C/L) occur in the Upper Mississippi River basin and Florida. The stations in the coastal plains in the eastern U.S. and the Lower Mississippi River Basin also have high DOC concentrations (>10 mg C/L).
Regional analyses indicate that the Souris-Red-Rainy region (region 4) and the South Atlantic-Gulf region (region 15) have higher DOC concentrations than the other regions (

Spatial Variability of DOC Concentrations in the U.S.
The average DOC concentration of all 1402 stations is 6.42 ± 6.47 mg C/L (Mean ± Standard deviation), with low DOC concentrations (<2 mg C/L) mainly located in the Piedmont area of the eastern U.S. and in the northwestern U.S. (Figure 1). High DOC concentrations (>15 mg C/L) occur in the Upper Mississippi River basin and Florida. The stations in the coastal plains in the eastern U.S. and the Lower Mississippi River Basin also have high DOC concentrations (>10 mg C/L).
Regional analyses indicate that the Souris-Red-Rainy region (region 4) and the South Atlantic-Gulf region (region 15) have higher DOC concentrations than the other regions (

Bivariate Analyses of Impacts of Environmental Factors on DOC Concentrations
Most of the selected variables exert statistically significant impacts on DOC concentrations at the national scale (Table 1). In general, DOC concentrations increase with the drainage area of the corresponding gauge stations. Temperature and precipitation show contrasting impacts; temperature has positive impact on DOC concentrations, while precipitation has negative impact. As precipitation is the major driver of runoff, it is not surprising to see the negative impact of runoff on DOC concentrations. Both soil water content and soil bulk density are positively correlated with DOC concentration. All of the land use types examined here, including urban area, cropland, and forest, significantly correlate with DOC concentration. Riverine nutrients (nitrogen and phosphorus) have positive correlations with DOC concentration. The percentage of first order streams is negatively correlated with DOC concentration.
Notably, although most of the selected variables have significant correlations with DOC concentrations, each single factor only explains a small portion of the DOC spatial variability (Table 1). Only six factors, including percentage of first order streams, total nitrogen, total phosphorus, wetlands, river basin slope, and forest cover, could each explain more than 10% of the DOC variability.

Multivariate Analyses of Environmental Impacts on DOC Concentrations
The results of the general regression simulations suggest that DOC concentrations are affected collectively by multiple variables (Table 2). At the national level, temperature, soil organic matter, soil sand content, soil water content, and soil bulk density demonstrate positive impacts. Among the examined land cover types, forest, shrub land, and wetlands have positive impacts, but urban area and cropland show negative influences on DOC concentrations. Slope of landscapes shows negative impacts on DOC concentrations. Levels of instream total phosphorus concentrations and dam density positively correlate with DOC concentrations. At the national level, the GLM explains ca. 56% of the variability in DOC concentrations by combining all twenty factors.
It is worth noting that the relative contribution of each of the twenty selected variables varies from region to region (Table 2). For Missouri, Lower Colorado, Rio Grande, and Tennessee, the regression model failed to provide statistically significant estimates of DOC concentrations due to insufficient observations. In the Missouri region, other factors such as runoff (negative) and soil clay (positive) also significantly affect riverine DOC. For New England, Great Lakes, Arkansas-White-Red, and Texas-Gulf, no variables show significant impact on DOC at the confidence level of 95%. Soil organic matter content has significant positive impact on DOC in Upper Mississippi, Mid-Atlantic, and South Atlantic-Gulf. Wetlands positively affect DOC in California and South Atlantic-Gulf but exert negative influence in Missouri. In the southeastern U.S. (region 15), precipitation tends to negatively affect DOC concentration. Temperature increases DOC in Mid-Atlantic and Lower Mississippi but decreases DOC in Missouri. In the western U.S. (regions 8 and 10), DOC concentrations decrease with drainage area. Barren area exert negative impact on DOC concentration in Ohio. Shrub land has positive effect in Upper Colorado. Although all selected soil properties (soil water content, sand content, and soil bulk density) significantly affect DOC at the national scale, their influences are significant in only four regions, and soil bulk density has varied influence. For example, soil water content exhibits positive impacts in South Atlantic-Gulf, while soil bulk density has positive impact in Pacific Northwest but shows negative influence in Lower Mississippi. DOC concentration tend to be lower in Ohio and Lower Colorado that have higher percentages of first-order streams. Total nitrogen and total phosphorus concentrations positively affect DOC in Great Basin and South Atlantic-Gulf, respectively. Dam density positively affects DOC in Upper Colorado and Ohio but shows negative impact in Mid-Atlantic. The U.S. P ** P * P * P ** P * P ** P ** P ** P ** P ** N * N ** N * 0.

DOC Variability and Environmental Controls by River Orders
To investigate the variability of DOC concentrations over the river order hierarchy, we regrouped all stations by their Strahler river orders (Figure 3). The stations on the 2nd order rivers have the highest average DOC concentration of 8.37 mg C/L, and these stations also have the highest standard deviation among all six groups (S7 and S8). The lowest concentrations occur at the 1st order stations, with an average of 5.30 mg C/L. Average DOC concentration increases from 5.65 mg C/L to 6.94 mg C/L from order 3 to order 5 rivers. The average DOC concentration for stations on > 6th order rivers is 6.05 mg C/L. Detailed statistics of DOC concentrations by river groups can be found in the Supplementary Material (S7 and S8).

DOC Variability and Environmental Controls by River Orders
To investigate the variability of DOC concentrations over the river order hierarchy, we regrouped all stations by their Strahler river orders (Figure 3). The stations on the 2nd order rivers have the highest average DOC concentration of 8.37 mg C/L, and these stations also have the highest standard deviation among all six groups (S7 and S8). The lowest concentrations occur at the 1st order stations, with an average of 5.30 mg C/L. Average DOC concentration increases from 5.65 mg C/L to 6.94 mg C/L from order 3 to order 5 rivers. The average DOC concentration for stations on > 6th order rivers is 6.05 mg C/L. Detailed statistics of DOC concentrations by river groups can be found in the Supplementary Material (S7 and S8). The relative impacts of the selected factors on DOC concentrations vary with river orders. Since we obtained observations for the 1st and 2nd order stations, we combined stations of order 1 through order 3 to derive valid GLM analyses (Table 3). For this combined group, temperature and runoff show significant correlations with DOC. The DOC concentrations of stations on 4th order rivers have positive correlation with soil sand content, soil water content, dam density, and soil organic matter content but demonstrate negative correlation with urban area and runoff. For the 5th order stations, soil sand content, soil water content, dam density, total phosphorus, forest cover, and shrub land have positive correlations, whereas slope and runoff have negative correlations. For the 6th and above order stations, DOC concentrations are positively correlated with soil sand content, soil water content, dam density, soil organic matter, total phosphorus, forest cover, total nitrogen, and wetland but are negatively correlated with urban area, soil bulk density, slope, and runoff. The relative impacts of the selected factors on DOC concentrations vary with river orders. Since we obtained observations for the 1st and 2nd order stations, we combined stations of order 1 through order 3 to derive valid GLM analyses (Table 3). For this combined group, temperature and runoff show significant correlations with DOC. The DOC concentrations of stations on 4th order rivers have positive correlation with soil sand content, soil water content, dam density, and soil organic matter content but demonstrate negative correlation with urban area and runoff. For the 5th order stations, soil sand content, soil water content, dam density, total phosphorus, forest cover, and shrub land have positive correlations, whereas slope and runoff have negative correlations. For the 6th and above order stations, DOC concentrations are positively correlated with soil sand content, soil water content, dam density, soil organic matter, total phosphorus, forest cover, total nitrogen, and wetland but are negatively correlated with urban area, soil bulk density, slope, and runoff. Order 1-3 P * N ** 0.40 361 Order 4 P * P * P ** P * N ** N ** 0.52 302 Order 5 P * P ** P * P ** P ** P ** N ** N ** 0.65 321 Order 6 and higher orders P * P * P * P * P * P * P * P * N ** N ** N ** N ** 0. 59 418 Om: soil organic matter; Wet: wetland area; T: temperature; Fost: percentage of forest cover; Shrb: percentage of shrub area; Urbn: percentage of urban area; R: runoff; Sand: percentage of soil sand; Sw: soil water content; Sd: soil bulk density; Slop: average slope of the drainage area; Tn: total nitrogen in water; Tp: total phosphorus in water; Damd: dam density. N: negative correlation; P: positive correlation; Ad-R 2 : adjusted coefficient of determination for the general linear models; *: p < 0.05; **: p < 0.01.

Spatial Patterns of DOC Concentrations
A better understanding of the environmental controls on riverine DOC production and cycling requires a comprehensive analysis based on observational data over different climatic, geological, and hydrological zones [9,28,39]. The variability of DOC concentrations from head waters to downstream regions suggests the importance of considering the spatial patterns of DOC in analyzing DOC dynamics along the terrestrial-aquatic continuum [28,40].
The spatial distribution of high DOC concentrations from this study is consistent with the previous investigations reporting high DOC concentrations in the Upper Mississippi River basin and the southeastern U.S. [41,42]. Similarly, the DOC concentration distribution matches the higher DOC loads in the southeastern parts than other regions of the U.S. [9]. In addition to regions with a large area of wetlands, the coastal plains along the eastern U.S. and Gulf of Mexico also have high DOC concentrations, as a result of high carbon input from riparian swamps in these low-relief landscapes [43,44]. Identifying regions with high DOC concentrations is useful for detecting critical zones where land-to-river transport of DOC may significantly affect the net carbon budgets of terrestrial ecosystems [45].
We did not find consistent patterns of DOC variations and environmental controls across different climatic or hydrological zones. This result corroborates that riverine DOC is regulated by a series of complex and interactively connected processes [17]. Climatic and hydrological factors mainly affect temporal variability rather than spatial distribution of riverine DOC [46,47]. By contrast, for long-term average DOC concentrations, as depicted in our correlation analysis of land cover types, soil properties, and topographic factors, may have more significant impacts than climate factors over large spatial scales [48].
Changes in DOC concentrations by river order reflect the sources of carbon inputs and varied removal rates from headwaters to large rivers [49]. Due to dense riparian plant canopy, litter fall provides the majority of organic carbon loads to headwaters, which is characterized by high bioavailability and decomposition rates. The high litter inputs combined with high decomposition rates favor the generation of DOC and explain the relatively high average DOC concentrations in the 2nd order rivers (Figure 3). For large rivers, terrestrial sources tend to be outweighed by instream processes due to differences in water depth, land cover types, and nutrient conditions, as compared with small streams [49]. The standard deviation for stations on 6th and higher order rivers is smaller than that for low order (order < 5) rivers, which may be caused by the relatively balanced DOC input and removal in downstream regions [50].

Environmental Controls on DOC Concentrations
Comprehensive statistical analyses of DOC concentrations and environmental controls are necessary and critical for identifying key processes that are regulating aquatic DOC cycling [51,52]. Previous investigations showed large uncertainties or even contrasting results in explaining riverine DOC using climate variables, soil properties, land cover types, and hydrological conditions [17,46,53]. Our analyses, based on the extensive USGS observational dataset, provide a systematic examination of DOC variability and the associated environmental controls at regional and national scales. We observed substantial changes in the role of different environmental factors for explaining variations in DOC concentrations across different water resource regions in the U.S., which testifies to the complexity of coupled-terrestrial aquatic carbon cycling [49,54]. Significant spatial variability in environmental factors resulted in region-specific environmental controls on DOC concentrations [53]. Analyses by river orders indicated that the numbers of factors demonstrating significant controls on DOC increased from headwaters to large basins. This pattern may reflect the increasing complexity and heterogeneity in factors controlling DOC such as land use, human activities, and geographical setting [55,56].
In line with previous site-level investigations [57,58], our analyses highlight the important role of soil properties in determining DOC concentrations. For example, soil organic matter has significant impacts on DOC concentrations [59,60] at the national scale and in major three water resource regions ( Table 2). All soil property indicators (i.e., soil sand content, soil water content, and soil bulk density) have significant positive correlations with DOC concentrations at the national-scale, highlighting the important role of soil DOC leachate production and movement for controlling riverine DOC concentrations. DOC adsorption/desorption in soils determines the DOC partition between soil colloids and soil water [61]. Due to the lower adsorption capacity of sandy soils relative to other soil types, saturation of adsorption sites in sandy soils may result in elevated DOC export [62], which may be the reason for the positive impact of soil sand content on DOC concentrations at the national scale. Positive impacts of soil water content on DOC concentrations at the national scale and in the South Atlantic-Gulf region suggest that soil water availability is likely to be a limiting factor in DOC transport from soils to rivers [63].
The results of this study indicate that temperature plays a more important role in affecting DOC spatial variability than hydrological factors (precipitation and runoff), at the national scale. The positive correlation between temperature and DOC concentration is attributable to the fact that the decomposition of litter and soil organic matter is regulated by microbial activities that are sensitive to temperature changes [64]. The findings of this study are also in line with watershed-scale investigation, which demonstrated the significant influence of temperature on DOC [46]. Although the reasons for increased DOC discharge from peat land in recent decades are still under debate [21,51,65], the positive correlation between temperature and DOC concentrations from this study indicate that the potential impact of a warmer climate may enhance soil organic matter loss through DOC export.
Our results also show the important role of wetlands in DOC cycling, as reported previously [66]. Due to the low decomposition rate under anaerobic conditions, wetlands accumulate large amounts of organic carbon and act as important sources of riverine DOC [48]. Forest and shrub land have positive impact, while cropland demonstrates negative influence on DOC concentrations. This may be caused by higher soil organic matter and litter inputs in natural ecosystems than in managed lands, because fresh litter decomposition is an important source of riverine DOC [67]. Additionally, intensive management activities in cropland such as tillage and irrigation may enhance soil organic matter decomposition and reduce DOC leaching from arable soils [68]. The negative influence off urban sprawl may be caused by the reduced water infiltration and associated DOC transport from land surface to deeper soil layers [56]. Our findings highlight the importance of considering the interactions among multiple land use types in explaining the variability of DOC concentrations.
Our analyses also suggest that anthropogenic activities have noticeable impact on riverine DOC concentrations. At the national scale, total phosphorus is an important factor shaping the spatial patterns of DOC concentrations. Positive correlations suggest that instream primary production may have more important impact on DOC flux than previously assumed [1]. In contrast with [46], which reported insignificant impact of in-stream production on DOC, the significant correlation between DOC and chlorophyll-a in this study indicates the important contribution of instream primaryproduction to riverine DOC in the U.S. (Figure 4). The difference between results and those reported earlier [46] may be caused by the higher phosphorus concentrations in U.S. rivers, which enhances in-stream primary production. Our findings are consistent with the conclusions of [68,69] that algae growth can be an important source of riverine organic carbon. Our results also suggest that chemical fertilizer use may alter carbon cycling across the terrestrial-aquatic interface by enhancing algae growth in aquatic ecosystems.
We found that phosphorus, rather than nitrogen, has significant impact on DOC production. This result is in accordance with previous findings that the primary production of aquatic ecosystems is more limited by phosphorus than by nitrogen [70]. The impact of dams should also be taken into consideration in explaining DOC spatial variability [71]. Positive correlations between dam density and DOC concentration may be attributed to the elevated algae growth following the construction of dams [72].

Interplays among Multiple Processes in Determining DOC Concentrations
DOC concentration in riverine systems reflect the combined impacts of multiple processes during DOC leachate production, transport, and transformation. Factors showing significant correlation with DOC concentrations in the bivariate analyses (Table 1) do not necessarily have significant impact in the multiple-variable regression analysis (Table 2) since some processes may be counteractive and cancel out each other's influences. The difference between the single-and multifactor analyses, as demonstrated in Tables 1 and 2, implies the complex interplays of environmental factors in affecting DOC concentrations. Identifying these interactions helps to elucidate the variability in DOC concentrations across multiple spatial scales.
Temperature either enhances or reduces DOC, and the net effects depend on the balance between its influences on litter and organic carbon decomposition in soils and DOC removal in in riverine systems. The production of DOC leachate during decomposition is sensitive to temperature increase, which leads to the positive correlation between DOC concentrations and temperature [73]. Meanwhile, the removal of DOC from surface waters also increases under warming temperatures [53], and this may be the reason for the negative correlation between temperature and DOC concentrations in the Missouri region ( Table 2). The varying role of temperature found in this study corroborates results from seven Swiss rivers with long-term DOC observations [46].
Although wetlands can provide large amounts of organic carbon to downstream waters, we find that wetlands have negative impact on DOC concentrations in the Missouri region. High water tables may change wetlands from DOC sources to sinks as a result of the high DOC retention rates [22]. Admittedly, these assumptions need to be further tested to avoid non-causative correlations.
Dam density has positive influence on DOC at the national level but reduces its concentrations in Mid-Atlantic. Although attenuated flow velocity and prolonged residence time of streamflow following dam construction may favor algae growth, enhanced DOC removal with prolonged residence time in reservoirs may offset this effect and eventually reduce the DOC concentrations [74,75].

Uncertainties and Future Work
In this study, we analyzed the impacts of terrestrial and aquatic factors on DOC concentrations in riverine systems using statistical models. The GLM models provided valuable insights into understanding the spatial patterns of DOC and the associated physical processes regulating riverine carbon. However, the uncertainties associated with the data and statistical methods should be

Interplays among Multiple Processes in Determining DOC Concentrations
DOC concentration in riverine systems reflect the combined impacts of multiple processes during DOC leachate production, transport, and transformation. Factors showing significant correlation with DOC concentrations in the bivariate analyses (Table 1) do not necessarily have significant impact in the multiple-variable regression analysis (Table 2) since some processes may be counteractive and cancel out each other's influences. The difference between the single-and multi-factor analyses, as demonstrated in Tables 1 and 2, implies the complex interplays of environmental factors in affecting DOC concentrations. Identifying these interactions helps to elucidate the variability in DOC concentrations across multiple spatial scales.
Temperature either enhances or reduces DOC, and the net effects depend on the balance between its influences on litter and organic carbon decomposition in soils and DOC removal in in riverine systems. The production of DOC leachate during decomposition is sensitive to temperature increase, which leads to the positive correlation between DOC concentrations and temperature [73]. Meanwhile, the removal of DOC from surface waters also increases under warming temperatures [53], and this may be the reason for the negative correlation between temperature and DOC concentrations in the Missouri region ( Table 2). The varying role of temperature found in this study corroborates results from seven Swiss rivers with long-term DOC observations [46].
Although wetlands can provide large amounts of organic carbon to downstream waters, we find that wetlands have negative impact on DOC concentrations in the Missouri region. High water tables may change wetlands from DOC sources to sinks as a result of the high DOC retention rates [22]. Admittedly, these assumptions need to be further tested to avoid non-causative correlations.
Dam density has positive influence on DOC at the national level but reduces its concentrations in Mid-Atlantic. Although attenuated flow velocity and prolonged residence time of streamflow following dam construction may favor algae growth, enhanced DOC removal with prolonged residence time in reservoirs may offset this effect and eventually reduce the DOC concentrations [74,75].

Uncertainties and Future Work
In this study, we analyzed the impacts of terrestrial and aquatic factors on DOC concentrations in riverine systems using statistical models. The GLM models provided valuable insights into understanding the spatial patterns of DOC and the associated physical processes regulating riverine carbon. However, the uncertainties associated with the data and statistical methods should be considered in interpreting the findings of this work. First, additional environmental controls should be included to explain DOC variability. The environmental factors selected for our statistical analyses were primarily from the GAGES-II database. However, other factors such as soil pH, land management activities, nitrogen deposition [76], acid deposition [27], soil iron content [77], duration of the main growing season [78], riparian area, and anthropogenic organic carbon [46] also demonstrated significant impacts on DOC [18]. Therefore, the inclusion of additional factors may contribute to explaining more DOC variability. In addition, only a few stations have long-term DOC observations, and thus hinder the analysis of temporal patterns of DOC in the U.S. Additional field data and experiments are needed to further constrain uncertainties in the statistical analyses.
Second, although the GLM models identify possible aquatic and terrestrial controls of riverine DOC, these impacts need to be further evaluated by fieldwork due to the inherent limitations in statistical models. For example, sample size affects the stability of the correlation coefficient [79]. However, due to the availability of DOC observations, GLM analyses by water resource regions or river orders used different sample sizes, which may have introduced uncertainties into our analyses. We also note that high correlations do not necessarily indicate causal relationships between environmental factors and DOC concentrations. Whether the correlations identified in this study hold in other regions or scales also needs further investigation.
Third, process-based methodologies need to be developed to better understand DOC variability. Although the selected variables and GLM models could explain up to 56% of the DOC variability, there is a large portion of variability left unexplained by the GLM models. Statistical analyses, however, confirm the complex interactions of the terrestrial and aquatic processes that affect riverine DOC at different spatial and temporal scales [46]. As a result, the development of process-rich modeling frameworks will be needed to constrain uncertainties in DOC modeling and to improve our understanding of DOC dynamics in the context of environmental changes.

Conclusions
We investigated the spatial variability of riverine DOC concentrations and associated environmental controls using data from over 1402 USGS stations in conterminous U.S. In general, DOC concentrations exhibit high spatial variability at the national level, with an average of 6.42 ± 6.47 mg C/L (Mean ± Standard Deviation). Statistical analyses suggest that the variability of DOC could be partially explained (56% at the national level; 29-78% across the 18 water resources regions) by combining soil properties, temperature, land cover, total phosphorus concentration, and dam density into general multi-linear models. Our findings agree with previous investigations, which emphasized the impacts of soil characteristics, climate, and human activities on riverine organic carbon fluxes. In addition to terrestrial factors, our examination of riverine chlorophyll-a indicates that autochthonous production should not be neglected in understanding DOC dynamics along the terrestrial-aquatic continuum.
Notably, the relative importance of different environmental factors in controlling DOC concentration varies substantially across regions and by river order. We did not identify any single factor that can exert statistically significant impact on DOC concentrations across all eighteen water resource regions in the U.S. and all river orders.
Overall, the high variability of DOC concentration, changes in significant environmental controls across different spatial scales, and remaining unexplained variance in DOC variation point to the complexity of interactions between a series of terrestrial and aquatic processes in regulating riverine DOC. These results highlight the need for process-based, non-linear methodologies that consider both autochthonous and allochthonous carbon sources, transport pathways, and processing of DOC along the terrestrial-aquatic continuum in order to effectively incorporate this important carbon cycling component into regional carbon budgeting.
Supplementary Materials: The following are available online at http://www.mdpi.com/2073-4441/9/6/383/s1. S1: Units of the variables selected for the statistical analyses; S2: Variables transformed for statistical analyses; S3: General linear regression models for the national and regional analyses; S4: Summary of GLM models for the U.S. and 18 water resource regions; S5: Summary of GLM models for the four river order groups; S6: Detailed statistics of DOC concentration observations in the U.S. and 18 water resource regions; S7: Detailed statistics of DOC concentration observations of six river order groups; S8: P values for ANOVA analysis among the six river groups.