Timing of Landsat Overpasses Effectively Captures Flow Conditions of Large Rivers

Abstract: Satellites provide a temporally discontinuous record of hydrological conditions along Earth’s rivers (e.g., river width, height, water quality). The degree to which archived satellite data effectively capture the overall population of river flow frequency is unknown. Here, we use the entire archives of Landsat 5, 7, and 8 to determine when a cloud-free image is available over the United States Geological Survey (USGS) river gauges located on Landsat-observable rivers. We compare the flow frequency distribution derived from the daily gauge record to the flow frequency distribution derived from ideally sampling gauged discharge based on the timing of cloud-free Landsat overpasses. Examining the patterns of flow frequency across multiple gauges, we find that there is not a statistically significant difference between the flow frequency distribution associated with observations contained within the Landsat archive and the flow frequency distribution derived from the daily gauge data (α = 0.05), except for hydrological extremes like maximum and minimum flow. At individual gauges, we find that Landsat observations span a wide range of hydrological conditions (97% of total flow variability observed in 90% of the study gauges) but the degree to which the Landsat sample can represent flow frequency distribution varies from location to location and depends on sample size. The results of this study indicate that the Landsat archive is, on average, representative of the temporal frequencies of hydrological conditions present along Earth’s large rivers with broad utility for hydrological, ecological and biogeochemical evaluations of river systems.


Introduction
Rivers serve as the chief source of renewable water to humans and to freshwater habitats [1] and thus they represent an important nexus between the water cycle, civilization, and aquatic ecology. Effective water resource management depends on the ability to monitor river systems and understand how they are responding to changes in climate and land use. While hydrological simulations can provide useful data for evaluating river system dynamics, they are often unable to resolve unpredictable processes and events, leading to high uncertainty, particularly when operating over large areas [2]. Observational monitoring of rivers is therefore essential for a complete understanding of the real-world complexities of river systems.
However, an important unknown is whether the accumulated observations of rivers from Landsat, or the Landsat sample, correspond to the true population of hydrological conditions in rivers. If the available Landsat sample adequately captures the on-the-ground frequency distribution of river flow, then Landsat-based surveys of river characteristics can be interpreted to be representative of the hydrological conditions present along Earth's rivers. Quantifying the ability of the Landsat archive to represent hydrological conditions, represented here by flow frequency, is motivated by several unexplored applications in the field of remote sensing of rivers. These include 1) using water occurrence information from the Landsat archive to estimate discharge by constructing river width-based rating curves; 2) using water occurrence information to quantify the seasonality of river and stream inundation extent, an important metric for estimating biogeochemical exchange between rivers and the atmosphere [33]; 3) using water color information to understand the variability and dynamics of river water quality [19].
If the Landsat archive is of adequate length or, in other words, if the Landsat sample size is large enough, then fundamental assumptions can be made regarding the representativeness of the Landsat record of hydrological conditions along Earth's large rivers. Thus, given the long Landsat archive (Landsat 5 launched on March 1, 1984), we hypothesize that long-term aggregations of Landsat imagery capture the flow frequency of Earth's large rivers. Here, we test this hypothesis by conducting a simple temporal sampling analysis of the United States Geological Survey (USGS) gauge records based on cloud-free overpass timing of Landsat 5, 7 and 8 (Figure 1). While a similar approach has been taken to predict the observational potential of the future Surface Water and Ocean Topography (SWOT) satellite mission [34], there has been no evaluation conducted for the widely used Landsat archive.

Figure 1. Landsat 5, 7, and 8 data availability based on the number of images with less than 30% cloud cover. Regions affected by cloud cover and low sun angle tend to have lower data availability. Rivers equal to or wider than Landsat's 30 m resolution at mean discharge are shown as blue lines [28].

Gauge Data
We represent the true flow frequency of rivers using daily streamflow records from gauge stations in the United States (US) operated by the USGS (Figure 1). To exclude gauges that are located on rivers too narrow to be observable by Landsat, we only consider the 1134 USGS gauges that were used to validate the Global River Widths from Landsat (GRWL) database [28]. These gauges 1) have upstream drainage areas larger than 1000 km²; 2) have records that span at least 10 years; 3) are located within 1 km Euclidean distance of a GRWL centerline; 4) are not immediately adjacent to lakes, reservoirs or river confluences (determined using the GRWL flag field and by visually examining each gauge location [28]); 5) have in situ river width data available [35]. We only analyze discharge records occurring between March 1, 1984 and August 14, 2019, because this time period spans the range of available data from the Landsat 5, 7 and 8 missions at the time of this study's analysis. To help ensure a representative distribution of discharge at each gauge, we omit from our analysis 102 gauges that contain fewer than 5 years of discharge records within the 35-year study period. Additionally, because discharge conditions can be obscured by the presence of river ice [36], we exclude all gauge measurements that were taken when river ice was recorded at the gauge (USGS Quality Code set to "ice"). Similarly, we exclude discharge records that the USGS reported as provisional, estimated, underestimated, or overestimated (USGS Quality Code set to "p", "E", "<", or ">"). Combined, these exclusions remove 5.2% of the daily discharge measurements but they produce a more accurate and representative discharge record for this study. The gauges used here are located on rivers with a median width at mean annual discharge of 76 m and first and third quartiles of 50 and 117 m, respectively, as measured by the USGS [28].
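The record screening described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tuple layout `(day, discharge, code)`, the function names, and the rule of counting 5 years as retained daily values divided by 365.25 are hypothetical and are not taken from the study's own processing code. The quality-code labels are the ones quoted in the text.

```python
from datetime import date

# Quality-code labels quoted in the text; the record layout is an assumption.
EXCLUDED_CODES = {"ice", "p", "E", "<", ">"}
STUDY_START = date(1984, 3, 1)
STUDY_END = date(2019, 8, 14)


def filter_records(records):
    """Keep (day, discharge, code) tuples inside the study window
    whose quality code is not in the excluded set."""
    return [
        (day, q, code)
        for day, q, code in records
        if STUDY_START <= day <= STUDY_END and code not in EXCLUDED_CODES
    ]


def has_enough_record(records, min_years=5.0):
    """True if the retained daily values add up to at least `min_years`
    of data (assumed counting rule: number of days / 365.25)."""
    return len(records) / 365.25 >= min_years
```

A gauge would then be retained only if `has_enough_record(filter_records(records))` is true. Whether the 5-year threshold was applied before or after the quality-code screening is not stated in the text, so the order used here is an assumption.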

Landsat Data
To assess the ability of Landsat to capture the true flow frequency distribution, we match coincident Landsat 5, 7, and 8 overpasses with daily discharge measurements at each USGS gauge over the 35-year study period (Figure 2a,c). Note that this matching process is idealized in that the discharge value at the gauge is exactly assigned to the corresponding Landsat image. This approach effectively assumes that river discharge can be accurately estimated from Landsat width measurements, an active field of research [7,22,27,37–39]. This simplifying assumption is a necessary first step that neglects potential errors in remote sensing and discharge algorithms, but allows for our stated focus on analyzing Landsat sampling capabilities. We conduct this Landsat availability analysis using the Google Earth Engine platform [24], which hosts the entire digitized Landsat archive. Each Landsat satellite observes the same location at least once every 16 days, although in areas with frequent cloud cover, the actual interval of cloud-free observations can be much longer (Figure 1). To account for the impact of cloud cover, we also determine when an available Landsat scene is cloud-free within a 500-m radius around the gauge. We use a 500-m radius around each gauge because this distance is longer than the corresponding in situ river width at mean discharge of all but 14 gauges considered in this study [28]. We identify clouds based on the USGS Landsat Bitwise Quality Assessment (BQA) product [40]. To account for the impact of the ETM+ Scan Line Corrector failure on data quality [41], we conservatively omit all Landsat 7 observation dates after May 31, 2003, when the failure occurred. In total, the number of observations considered in this analysis after excluding clouds and problematic gauge measurements is 327,177.
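The idealized matching step amounts to a date lookup. The sketch below assumes the cloud screening has already been done elsewhere and is summarized as a boolean per scene. The satellite labels ("L5", "L7", "L8"), the tuple layout, and the function names are hypothetical and are not Google Earth Engine calls.

```python
from datetime import date

# Date of the ETM+ Scan Line Corrector failure, as given in the text.
SLC_FAILURE = date(2003, 5, 31)


def usable_overpasses(scenes):
    """Return sorted unique dates of cloud-free scenes.

    scenes: iterable of (day, satellite, cloud_free) tuples, where
    satellite is a hypothetical label such as "L5", "L7", or "L8".
    Landsat 7 scenes acquired after the SLC failure are dropped.
    """
    return sorted({
        day
        for day, satellite, cloud_free in scenes
        if cloud_free and not (satellite == "L7" and day > SLC_FAILURE)
    })


def sample_discharge(daily_q, overpass_dates):
    """Assign the gauged discharge to each overpass date that has a value.

    daily_q: dict mapping date -> discharge (m^3/s) after quality screening.
    """
    return [daily_q[d] for d in sorted(set(overpass_dates)) if d in daily_q]
```

The list returned by `sample_discharge` plays the role of the "Landsat sample" at one gauge, and the values of `daily_q` play the role of the daily gauge record.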

Statistical Comparison at Individual Gauges
At each individual USGS gauge, we compare the flow frequency distribution of the idealized Landsat sample to the flow frequency distribution of the daily gauge record (Figure 2). The upper panel in Figure 2 shows an example of where the Landsat sample and the gauge record have flow distributions that are relatively similar, whereas the lower panel shows an example where the Landsat sample does not represent the gauged flow distribution with high fidelity. We use the nonparametric Kolmogorov-Smirnov (K-S) test [42] to characterize the statistical difference between the Landsat sample and the daily record of flow at each gauge. We use a significance level of α = 0.05 for all statistical tests in this study. Note that the K-S p-value is highly sensitive to sample size, making it an impractical statistic for comparing across individual gauges. For example, a gauge with more Landsat observations (a larger sample size) is more likely to be considered significantly different than the same gauge with fewer observations (a smaller sample size), according to the K-S p-value [43,44]. Thus, the K-S p-value often yields contrary results, in which a large sample with a distribution that appears similar to that of its population will be considered to be significantly different (e.g., Figure 2a,b), while a small sample that appears to deviate strongly from the gauge record will not be considered to be significantly different (e.g., Figure 2c,d). Due to this contradictory behavior, we do not place emphasis on the K-S p-value in this analysis but rather focus on the descriptive K-S D statistic (KSD statistic), which provides a clear summary of the difference in flow frequency distributions between the Landsat sample and gauge record.
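For readers unfamiliar with the KSD statistic, it is the largest vertical gap between two empirical cumulative distribution functions. The following is a minimal standard-library sketch of that definition, treating the Landsat sample and the gauge record as two separate samples (an assumption; the study's own implementation is not shown in the text). The function name is hypothetical.

```python
from bisect import bisect_right


def ks_d_statistic(sample, population):
    """Maximum absolute difference between the empirical CDFs
    of `sample` and `population` (two-sample K-S D statistic)."""
    a, b = sorted(sample), sorted(population)
    d = 0.0
    # The difference of two step functions can only change at data
    # points, so checking every observed value is sufficient.
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d
```

A value of 0 means the two empirical distributions coincide, and a value of 1 means they do not overlap at all, which is why D is easier to compare across gauges with very different sample sizes than the p-value.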
Additionally, we analyze the ability of Landsat to capture hydrological extremes at each individual gauge by determining the maximum and minimum percentile of gauged flows sampled by Landsat. We then explore potential factors affecting the ability of the Landsat sample to represent true flow frequency distributions across gauges including climate (cloudiness), watershed area and flow regime (flashiness). Flashiness is quantified according to the Richards-Baker Flashiness Index that sums the differences in daily flow divided by the total flow over a given time period [45]. This non-dimensional index is commonly used and has been observed to vary from 0 to ~1.5, although large rivers are generally less flashy and tend to exhibit index values of less than ~0.5 [34]. We correlate these factors with the KSD statistic using the Spearman rank correlation test [46] across all gauges and we examine the spatial patterns in the KSD statistic.
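As an illustration of the flashiness metric, the Richards-Baker index can be computed directly from a daily flow series. The sketch below assumes one convention for the denominator (the sum of flows from the second day onward, so that it covers the same days as the differences); other implementations sum all days, and the text does not say which was used. The function name is hypothetical.

```python
def richards_baker_index(daily_q):
    """Sum of absolute day-to-day flow changes divided by total flow.

    daily_q: sequence of daily mean discharge values in time order.
    Assumed convention: the denominator sums flows from day 2 onward.
    """
    if len(daily_q) < 2:
        raise ValueError("need at least two daily values")
    total = sum(daily_q[1:])
    if total <= 0:
        raise ValueError("total flow must be positive")
    changes = sum(abs(q1 - q0) for q0, q1 in zip(daily_q, daily_q[1:]))
    return changes / total
```

A perfectly steady river gives an index of 0, while a series that swings by a large fraction of its flow every day approaches or exceeds 1.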

Statistical Comparison Across Multiple Gauges
To determine the ability of Landsat to represent river flow frequency across space, we compare the Landsat sample to the daily gauge record at different flow frequencies (Figure 2e). This approach tests whether different locations can be combined to represent flow frequency and is analogous to the classic hydrological concept of downstream hydraulic geometry [47]. In such a conceptual framework, the approach taken in Section 2.3 is equivalent to at-a-station hydraulic geometry. At each gauge, we calculate (1) discharge from the full gauge record and (2) discharge from the Landsat sample at multiple flow percentiles. Figure 2b, d shows two examples of calculating these two metrics at the 50th flow percentile (median flow). In addition to median flow, we calculate the two metrics at the following flow percentiles: 0% (minimum flow), 5%, 10%, 90%, 95%, and 100% (maximum flow).
For each of these percentiles, we compare the Landsat sample discharge to the gauge record discharge at every gauge using the non-parametric Mann-Whitney-Wilcoxon test (MWW) [48]. To avoid bias from outliers, we use the Theil-Sen median estimator [49] to derive a robust linear regression between the Landsat-sampled discharge and the gauge record discharge at each percentile. Additionally, we calculate the relative root mean square error,

$$\mathrm{rRMSE}_j = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(\frac{Q_{i,j}-\hat{Q}_{i,j}}{\hat{Q}_{i,j}}\right)^{2}} \qquad (1)$$

and the relative bias,

$$\mathrm{rBias}_j = \frac{1}{N}\sum_{i=1}^{N}\frac{Q_{i,j}-\hat{Q}_{i,j}}{\hat{Q}_{i,j}} \qquad (2)$$

where $N$ is the total number of gauges used in this study, $Q_{i,j}$ is the flow (m³ s⁻¹) from the Landsat sample distribution at a given percentile, $j$, at gauge $i$, and $\hat{Q}_{i,j}$ is the flow from the gauge distribution at the same percentile and same gauge. By comparing the Landsat sample to the gauge record across all the gauges at each selected percentile, we evaluate the ability of Landsat to represent a given flow frequency through spatial averaging.
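The cross-gauge error metrics and the robust regression can be sketched as follows. The relative error is assumed here to be normalized by the gauge-record flow, which is the conventional form and is consistent with the later remark about small denominators at low flows; the function names are hypothetical. The Theil-Sen fit is written out as the median of all pairwise slopes, with the intercept taken as the median residual, which is one common variant.

```python
from itertools import combinations
from math import sqrt
from statistics import median


def relative_errors(landsat_q, gauge_q):
    """Return (rRMSE, rBias) across gauges at one flow percentile.

    Assumed form: errors are normalized by the gauge-record flow.
    """
    rel = [(q - g) / g for q, g in zip(landsat_q, gauge_q)]
    r_bias = sum(rel) / len(rel)
    r_rmse = sqrt(sum(r * r for r in rel) / len(rel))
    return r_rmse, r_bias


def theil_sen(x, y):
    """Median-of-pairwise-slopes line fit; returns (slope, intercept)."""
    slopes = [
        (y[j] - y[i]) / (x[j] - x[i])
        for i, j in combinations(range(len(x)), 2)
        if x[j] != x[i]
    ]
    slope = median(slopes)
    intercept = median(yi - slope * xi for xi, yi in zip(x, y))
    return slope, intercept
```

Because the slope is a median, a single gauge with a badly underestimated maximum flow shifts the fitted line far less than it would in an ordinary least-squares fit.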

Minimum Length of Landsat Observations
We expect that the ability of satellites to effectively represent river flow frequency is related to the length of the observational archive and hence the sample size. Theoretically, longer sampling periods enable better capture of the flow frequency distribution during the sample period. To examine the effect of observation length on Landsat's capacity to represent flow frequency, we simulated different observation period lengths over which Landsat sampled discharge from the gauge record. For each gauge in our analysis, we created 50 random permutations of a continuous temporal range with an n-year duration (n = 1, 2, 3, …, 10). To allow for random permutations of a 10-year period, we only included gauges that contained more than 15 years of continuous data (N = 927). Within each temporal range, we compared the Landsat-sampled flows with those from daily gauge records from the same period. Specifically, we calculated flow values at the 0%, 1%, 5%, 50%, 95%, 99%, and 100% percentiles. From these data, we calculated the coefficient of determination (R²), relative error metrics (rBias, rMAE, and rRMSE), and absolute error metrics (Bias, MAE, and RMSE). Together, these statistics help characterize the ability of Landsat to represent flow conditions with increasing observation duration.
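One way to picture the observation-length experiment is the sketch below. It interprets the "random permutations of a continuous temporal range" as randomly placed continuous windows, which is an assumption, and it indexes days by integer position rather than by calendar date. The function names, the linear-interpolation percentile, and the seed handling are all hypothetical choices made for illustration.

```python
import random


def percentile(values, p):
    """Linear-interpolation percentile (p in 0..100) of a non-empty list."""
    s = sorted(values)
    k = (len(s) - 1) * p / 100.0
    lo = int(k)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (s[hi] - s[lo]) * (k - lo)


def random_windows(n_days_total, n_years, n_windows=50, seed=0):
    """(start, stop) day indices of randomly placed continuous windows."""
    length = int(round(n_years * 365.25))
    if length > n_days_total:
        raise ValueError("window is longer than the record")
    rng = random.Random(seed)
    starts = [rng.randint(0, n_days_total - length) for _ in range(n_windows)]
    return [(s, s + length) for s in starts]


def window_percentiles(daily_q, overpass_idx, window,
                       percentiles=(0, 1, 5, 50, 95, 99, 100)):
    """Map each percentile to (sampled flow, full-record flow) in a window.

    overpass_idx: day indices with a cloud-free overpass.
    Returns None if no overpass falls inside the window.
    """
    start, stop = window
    full = daily_q[start:stop]
    sampled = [daily_q[i] for i in overpass_idx if start <= i < stop]
    if not sampled:
        return None
    return {p: (percentile(sampled, p), percentile(full, p))
            for p in percentiles}
```

Collecting the paired values for every gauge and window at a given duration yields the inputs for the R² and error metrics described above. Note that the sampled maximum can never exceed the full-record maximum, and the sampled minimum can never fall below the full-record minimum, which is the structural reason the extremes behave differently from the other percentiles.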

How Well Can the Landsat Archive Capture Flow Conditions at Individual Gauges?
We find that the Landsat archive contains observations corresponding to nearly the full range of discharge conditions for the vast majority of gauges (Figure 3a). For example, at 90% of the study gauges, the idealized Landsat sample captures at least 97% of the full range of discharge percentiles recorded by the gauge. The majority of gauges (55%) show no significant difference between the Landsat sample and the gauge record of flow according to the K-S p-value at the 95% confidence level. However, as previously noted, the K-S p-value is exceedingly sensitive to sample size and often produces contradictory results when comparing samples of different sizes, as we do here. On the other hand, the descriptive KSD statistic, which does not exhibit this contradictory behavior, ranges from 0.016 to 0.36, with a median value of 0.083. Among the sites with no significant difference between the Landsat-sampled and gauged flow frequency, the KSD statistic ranges from 0.016 to 0.27, with a median value of 0.06. Thus, while there is considerable variation in the potential ability of Landsat to reconstruct flow frequency from gauge to gauge, the average difference between the idealized Landsat sample and the gauge flow frequency distribution is small.
To determine potential drivers of Landsat's ability to capture river flow frequency, we examine correlations between the KSD statistic and the environmental variables of cloudiness, watershed area, and flow flashiness. We find a negative correlation (p < 0.001) between the proportion of cloud-free observations at a gauge and the KSD statistic (Figure 3b). This pattern is evident for locations where there is a statistically significant difference between the Landsat sample and the gauge record (gray points in Figure 3b) as well as locations that exhibit no significant difference (black points in Figure 3b; Spearman rank correlation coefficients of r = −0.40 and r = −0.47, respectively). We also find a weak negative correlation between the KSD statistic and watershed area (r = −0.09 and −0.16; p = 0.02 and p < 0.001; Figure 3c) for locations with a significant difference and locations with no significant difference, respectively. Conversely, we find no significant correlations between flow flashiness and the KSD statistic at sites with statistically significant differences (r = 0.012, p = 0.78) nor at sites with statistically insignificant differences (r = 0.018, p = 0.66; Figure 3d). Landsat-observable rivers are large and span a relatively narrow range of the Richards-Baker Flashiness Index, from 0.016 to 0.84, compared to small streams, whose index values can reach 1.5 [34]. The lack of correlation between flashiness and the KSD statistic implies that flow regimes on large Landsat-observable rivers do not affect Landsat's ability to capture flow frequency distribution. We find no readily apparent spatial patterns in the KSD statistic (see Figure S1 for an interactive map showing flow frequency distributions of the Landsat sample and the gauge record for each stream gauge). While more sophisticated statistical approaches could be employed to predict locations where Landsat best captures river flow frequency, this task is beyond the scope of this study.
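The Spearman rank correlation used for these comparisons is the Pearson correlation of the ranks of the two variables. A minimal standard-library sketch follows, with ties assigned their average rank. The function names are hypothetical, and the sketch returns only the coefficient, not the p-value reported above.

```python
from math import sqrt


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of positions i..j, 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_r(x, y):
    """Pearson correlation of the rank-transformed inputs."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)
```

Here `x` could be the proportion of cloud-free observations at each gauge and `y` the KSD statistic at the same gauges. A negative coefficient then indicates that gauges with more cloud-free observations tend to have smaller distribution differences.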

How Well Can the Landsat Archive Capture Flow Conditions Across Multiple Gauges?
Comparing the Landsat sample to the gauge record across all gauge stations reveals consistent patterns between a given flow percentile and the ability of Landsat to potentially represent flow at that percentile (Figure 4). For a wide range of flows (i.e., the 1% through the 95% percentiles in Figure 4), we find no statistically significant difference between the Landsat sample and the full gauge record according to the MWW test. Thus, except for extreme hydrological conditions like maximum and minimum flow, the idealized Landsat sample is not statistically different from the daily flow measured by multiple gauges (α = 0.05). Error metrics tend to be highest at extreme flow conditions and lowest at median flow. For intermediate percentiles that show no statistically significant difference, rRMSE values range from 12% to 78% and rBias values range from −6.4% to 8%.
Examining the error statistics of the extreme flow percentiles yields additional insights. At minimum flow, the Landsat sample always either matches or overestimates minimum flow, as seen by the points always being above the 1:1 line in the 0% percentile panel of Figure 4, resulting in a high positive relative bias of 200%. Conversely, at maximum flow, the Landsat sample either equals or underestimates maximum flow, producing a negative relative bias of −31%. These underestimates tend to increase in magnitude with increasing discharge (errors exhibit heteroscedasticity), resulting in a Theil-Sen median estimator that significantly deviates from unity at the 100% percentile (red line). These patterns also persist at the 99% percentile, in which the Landsat sample tends to underestimate discharge, albeit to a lesser degree than at maximum flow.

What is the Minimum Length of Landsat Observations Needed to Represent Flow Frequency?
Our findings confirm that long periods of Landsat observation correspond with an improved ability of the Landsat archive to contain observations that can effectively represent river flow frequency. Generally, after 3 years of Landsat observation, all percentiles except for minimum and maximum flow are in close agreement with the reference values derived from the gauge record, with R² values above 0.9 (Figure 5a). Moreover, with the exception of minimum and maximum flow, we find a general trend of decreasing relative error statistics (rBias, rMAE, and rRMSE; Figure 5b) and absolute error statistics (Bias, MAE, RMSE; Figure 5c) with increasing duration of observation. Thus, as satellite observation duration increases, the distribution of flow frequency observed by Landsat converges, as expected, with the flow frequency of the daily gauge record, except for extreme flows like minimum and maximum discharge.
Additionally, our results reveal that the degree of increasing similarity between the Landsat sample and the gauge record flow frequency distributions strongly depends on the flow percentile being studied. With increasing observation duration, smaller percentile flows generally show a more dramatic improvement in performance relative to larger flows. For example, Figure 5a shows a more substantial increase in R² for low flow percentiles (0%, 1%, and 5% flow) at a 3-year duration relative to the larger flow percentiles. This pattern also persists for the relative error metrics, whereby the decreasing trend is greatest in smaller percentiles, with the exception of minimum and maximum flow. We note that low flow frequencies correspond to higher relative errors compared to higher flow frequencies, likely because of their smaller denominators in Equations (1) and (2). Conversely, high flow frequencies correspond to higher absolute error metrics relative to lower flow frequencies, and the non-linear gap between the percentile curves may result from the positive skewness of flow frequency distributions in most rivers.

Interpretations of Primary Findings
Our results indicate that Landsat can effectively capture river flow frequency over large spatial areas given an adequate duration of observation and accurate remote sensing and discharge algorithms. Although Landsat's sampling capability varies at individual gauge sites, we find that spatially averaging over multiple gauges enables effective representation of flow frequency, with the exception of extremely high and low flows (Figure 4). Landsat more aptly captures flow frequency at the lowest flows rather than at the highest flows, likely because of the positive skew characteristic of flow frequency distributions (e.g., Figure 2b). Small changes in flow frequency at the highest flows represent large variations in discharge that are difficult to capture due to the 16-day repeat orbit of Landsat. Cloudy conditions often accompany high flows, which further inhibits Landsat's ability to characterize the frequency of high flow events. The duration of observation also plays an important role in the ability of Landsat to capture river flow frequency. As expected, a longer observation duration will produce a better representation of river flow frequency, albeit with diminishing returns (Figure 5). Several error metrics initially improve rapidly with time until approximately a 3-year duration, after which they improve more gradually. Thus, we suggest that at least 3 years of Landsat observations should be aggregated before they adequately contain observations representative of river flow frequency.

Remote Sens. 2020, 12, x; doi: FOR PEER REVIEW www.mdpi.com/journal/remotesensing

Implications for River Remote Sensing Applications
River flow frequency analysis is a key tool for a variety of hydrological applications including flood hazard and risk evaluations, hydraulic engineering, and water resources management [50,51]. Flow frequency is also related to a river's water quality and distribution of freshwater habitats [52–54]. While satellite remote sensing cannot measure discharge directly, it can measure other attributes of rivers that scale with discharge including river morphology and water quality [47,55]. For example, Landsat can observe surface water inundation extent from which river surface area and width can be extracted, which both scale with discharge [27,56–58].
This study's findings indicate that river water occurrence data derived from long-term aggregations of Landsat observations correspond to the flow frequency of Earth's large rivers. For example, on average, median river width derived from long-term temporal composites of classified Landsat data [25,26] corresponds to median river flow. Our results also indicate that these same relationships can be extended to a wide range of flow frequencies, except for extremely high and low flows. This result has key utility for developing percentile-based width rating curves for estimating discharge [14,22] or for estimating variability in river surface area [59]. We emphasize that these relationships do not necessarily apply at any single given location but are valid when averaging over space. Thus, applying at-a-station hydraulic geometry [47] at individual cross-sections solely from Landsat water occurrence data may often be invalid. However, our results suggest that developing at-a-station hydraulic geometry relationships across multiple cross-sections over a large area is valid for non-extreme flow frequencies, given an adequate Landsat sample size, with potential implications for remote sensing of discharge approaches [15,37–39].
Our results also have implications for riparian ecology and river water quality applications. Flow is a "master" variable in river ecology and water quality [54] and, like river width and surface area, water quality parameters can also be measured by Landsat [31,60]. Specifically, Landsat imagery is commonly paired with in situ water samples to derive empirical relationships with optically active constituents such as suspended sediment concentrations [61,62], chlorophyll-a [60,63], and colored dissolved organic matter (CDOM) [64,65]. While these water quality parameters generally vary with discharge, the relationship between river flow and water quality varies. Suspended sediment generally increases non-linearly with flow [66] but chlorophyll-a and CDOM can increase, decrease, or vary independently of flow depending on river size, season, and watershed properties [67–70]. Understanding flow conditions captured by satellite observations is therefore important for deriving representative surveys of river water quality measurements. The most extreme flow events are rarely observed by Landsat, but Landsat observes a wider range of flows (97% of flow percentiles at 90% of gauges) than well-designed water quality field sampling programs, which sample 80% of flow percentiles at best [71]. Thus, as methods for remote sensing of water quality typically rely on in situ field measurements, it is critical that field sampling programs collect measurements at high and low flows to match the wide range of hydrological conditions captured by Landsat observations.

Limitations and Future Directions
While the results of this study are encouraging for river remote sensing applications, our study does have limitations. First, we assume that daily discharge measured from the USGS study gauges represents the true flow frequency of rivers, but river gauges are biased in their placement and fluctuations in river discharge can often occur over sub-daily timescales [5,72]. Related to this point, our results may differ in regions outside the US, which have dissimilar conditions. The gauge stations used in this study measure the majority (52%) of US Landsat-observable river reaches but only 6.6% of Earth's observable rivers are located within the US, according to the GRWL database [28]. Future work could explore this uncertainty by using an international gauge database [73] or a global hydrological simulation [74] to sample streamflow worldwide. Second, we emphasize that Landsat will not necessarily produce a representative sample of flow at any given single location along a river network. However, our results indicate that, given an adequate period of observation, spatially averaged Landsat observations can capture flow frequency distribution over large spatial areas. Further, Landsat does not adequately capture minimum and maximum flow conditions in most locations. Third, while this study does not assume stationarity, we emphasize that river flow frequency is non-stationary such that flow frequencies derived over one time period cannot necessarily be used to infer flow frequencies over another period [3]. Finally, this study does not consider the uncertainty associated with the Landsat remote sensing measurements themselves. So, while we find that the timing of Landsat observations is adequate to capture river flow frequency, the ability of Landsat to accurately measure the parameter of interest (e.g., river width, suspended sediment concentration, discharge) itself remains unconstrained here. Constraining this uncertainty is application and algorithm specific [7,20] and beyond the scope of this analysis. Additionally, determining the drivers of the heterogeneity in Landsat's ability to represent flow frequency from location to location (Figure S1) is a recommended topic for further research.
Similar approaches to this study may be used to understand river sampling capabilities of satellite missions other than Landsat. Satellite programs with optical sensors and shorter revisit times, such as Sentinel-2 (2-5 days) and Planet (~1 day), likely capture river flow frequency over a shorter study duration, although this advantage of more frequent retrieval will be limited by persistent cloud cover in some regions. Other sensor technologies enable remote sensing of rivers over a broader array of atmospheric and solar illumination conditions. Indeed, thermal, passive microwave, radar and lidar remote sensing have distinct advantages over optical remote sensing and can provide alternative observations of river systems. Similar to this study, a recent analysis found that 3 years of SWOT data were sufficient to represent the flow frequency distribution over the Mississippi River basin [34]. SWOT will have a longer repeat orbit (21 days) and narrower swath width (100 km) than Landsat but these attributes are counterbalanced by the ability of its Ka-band radar instrument to collect surface returns during cloudy and nighttime conditions [11].

Conclusions
This study's findings show that the Landsat archive can effectively represent river flow frequency over large areas given an adequate period of observation and accurate remote sensing and discharge algorithms. At individual locations, the ability of Landsat to capture flow frequency in large rivers is negatively correlated with cloud occurrence, weakly correlated with watershed area, and does not correlate with flow flashiness (Figures 3 and S1). While the Landsat record captures a wide range of flow conditions (97% of the flow percentiles at 90% of sites), its ability to capture the flow frequency distribution varies widely from location to location (KSD statistic ranges from 0.016 to 0.36). This implies that, at any single site along a river, the Landsat archive cannot be assumed to adequately capture flow frequency. Nevertheless, we find that spatially averaging over multiple locations effectively enables representation of hydrological conditions at a given flow frequency, with the exception of hydrological extremes like maximum and minimum flow (α = 0.05; Figure 4). Such an averaging process could hence be used to improve existing Landsat-based discharge algorithms. Landsat can better capture flow frequency at the lowest flows than at the highest flows, likely because of the positive skew characteristic of river flow frequency distributions. We also find that a longer Landsat observation time period positively impacts the representation of river flow frequency, albeit with diminishing returns with increasing observation time (Figure 5). We suggest that, on average, a minimum of 3 years of aggregated observations is necessary if Landsat is used to reconstruct flow frequency distribution. We emphasize that this analysis only considers the timing of Landsat observations in relation to river flow frequency and does not consider the ability of Landsat to accurately measure the parameter of interest (e.g., river width, suspended sediment, discharge). We also note that the gauges used in this analysis only measure a relatively small portion of the global observable river network, solely located within the United States. Regardless, the results of this study support the hypothesis that long-term aggregations of Landsat data can be used to capture the flow frequency of Earth's large rivers. Thus, Landsat-based surveys of river characteristics can be interpreted to be representative of the hydrological conditions present along Earth's large rivers, with wide-ranging utility for river hydrology, water quality, and ecology.