1. Introduction
Lakes and reservoirs provide an important water resource for human use. Their uses range from recreation, power generation, and drinking water supply to agricultural irrigation and other commercial and industrial applications. They are also notable indicators of climatic changes [
1,
2] as well as local and regional drought/flooding events [
3,
4]. Furthermore, monitoring changes in lake and reservoir water levels benefits both local and regional water managers by allowing them to make more informed decisions about water management policies. This is especially true at the continental scale, where the spatiotemporal changes in water levels are notably diverse across the landscape [
5,
6]. However, continental-scale monitoring of lake and reservoir water level changes is particularly difficult to accomplish solely with in situ water level gages. This holds true even for countries that have a meaningful in situ gaging and monitoring system in place, such as the United States Geological Survey’s (USGS) National Water Information System (NWIS). Figure 1 highlights the spatial distribution of the ~6700 lakes and reservoirs >1 km² in the contiguous United States (CONUS). The USGS monitored water levels for only 430 (~6%) of those waterbodies within the past three years. In comparison, the water level monitoring technique described herein provides lake and reservoir water elevation measurements for 6163 (~92%) waterbodies. This highlights a clear need for other methodologies and datasets for monitoring water level changes in the CONUS.
Remote sensing techniques have been used for many years to augment the use of in situ surface water level measurements [
7,
8]. For example, Ref. [
9] used early spaceborne radar altimetry data from the United States Navy’s Geosat platform to monitor temporal changes in water levels within large lakes and inland seas. Furthermore, work from the early to mid-2010s utilized a multi-platform approach for longer temporal scale lake level changes [
10,
11,
12,
13] at notably larger spatial scales. More recent water level studies using radar altimeters have improved upon past research [
14,
15,
16,
17]. However, most spaceborne radar altimeters are limited in their ability to meaningfully resolve water level changes for a large number of lakes. This is mostly due to spatial and temporal gaps in data coverage as well as the size of the altimeter’s ground footprint (e.g., Topex/Poseidon’s ~1 km footprint). Spaceborne laser altimeters (like NASA’s first ICESat mission) have notably smaller footprint diameters (~70 m) than their radar altimeter counterparts. The reduced footprint size allows water level changes to be derived for waterbodies with smaller areal extents. This increases the number of lakes and reservoirs where meaningful measurements can be acquired and allows for a more complete picture of surface water changes. Several researchers have successfully utilized highly accurate ICESat (IS-1) laser altimetry products to monitor water level changes over the sensor’s lifetime [
18,
19,
20,
21].
In 2018, NASA launched IS-1’s successor, ICESat-2 (IS-2), into polar orbit. Like its predecessor, IS-2’s main scientific objectives revolve around cryospheric measurements in the polar regions [
22]. However, secondary mission objectives do involve the monitoring of inland surface water height changes. The Advanced Topographic Laser Altimeter System (ATLAS) is the primary sensor onboard the IS-2 platform and is a notable improvement over IS-1’s Geoscience Laser Altimeter System (GLAS). The ATLAS sensor utilizes three different pairs of beam tracks (six in total), which enables an increased spatial coverage as compared to the single track of the GLAS sensor onboard the IS-1 mission. Furthermore, the ATLAS sensor utilizes a novel photon counting approach that allows for an increase in both the precision and the accuracy of the vertical time-of-flight elevation measurements [
23]. The ground footprint and the laser posting for the IS-2 platform have both been vastly improved from the GLAS sensor as well. The ATLAS sensor’s footprint is ~17.5 m with an along-track posting of about 70 cm [
22]. The specifications of the ATLAS sensor allow it to resolve more and smaller waterbodies than its predecessor. This is especially true when comparing it to spaceborne radar altimeter platforms with their notably larger footprints.
Recently, more water level analyses have begun to incorporate IS-2 ATLAS datasets. For example, Refs. [24,25] employ ATLAS products to accurately monitor water level changes for several lakes on the Tibetan Plateau (TP) and for ~220,000 global waterbodies, respectively. A recent study also notes that utilizing IS-2 data doubles the number of measurable lakes on the TP as compared to IS-1 datasets [
24]. An accuracy comparison of in situ water level gage readings with levels derived from IS-2 and a modern spaceborne altimeter (Satellite with ARgos and ALtika [SARAL]) for around 30 reservoirs in China shows that the relative altimetric accuracy from IS-2 data is nearly two decimeters better than SARAL’s [
26]. Some studies have compared IS-2 water level measurements with in situ gage data and have found high relative accuracies [
26,
27,
28,
29]. These studies further highlight the quality of lake and reservoir monitoring products from IS-2 data as compared to previous spaceborne altimeters. However, there are some limitations to consider when utilizing IS-2 data to monitor water levels. In particular, there is a known issue that occurs over smooth open water where false multiple surface returns are found in the raw ATL03 data. These double echoes are likely caused by after-pulses or electronic noise after the primary surface return [
30]. The workflow proposed herein helps to alleviate this issue.
This study seeks to provide a novel automated workflow that utilizes the latest spaceborne altimetric products to monitor lake and reservoir water level changes for all waterbodies >1 km² in the CONUS. The vast number of waterbodies in the CONUS makes automation essential. Furthermore, users of the readily available surface water height product derived from IS-2’s ATLAS sensor (i.e., ATL13) are reliant on the static waterbody extents that are built into its processing pipeline, which further motivates the workflow proposed herein. An added objective of this work is to assess the accuracy of these remotely sensed water level products against thousands of temporally and spatially overlapping water level changes from USGS gage readings. Another goal is to disseminate these results via an interactive website where an interested reader may better understand the spatiotemporal differences in lake and reservoir level changes at this scale. Lastly, this work seeks to provide these IS-2-based water level products to other users in the community in the hope that they will benefit future studies. This research was undertaken to provide baseline water level monitoring on a much larger scale than is traditionally possible with in situ monitoring. Furthermore, knowledge of the accuracy of these spaceborne laser altimeter water level products helps validate the quality of the IS-2 ATLAS platform for use in future water level monitoring studies. To the best of our knowledge, this work presents the largest comparison of IS-2 water level measurements to in situ gage measurements.
To satisfy these goals, we apply careful processing techniques and cluster computing workflows to a validated waterbody extent product, all spatially overlapping ATL03 photons for lakes and reservoirs >1 km² in the CONUS, and Landsat Dynamic Surface Water Extent (DSWE) products.
2. Data and Methods
2.1. ICESat-2 ATLAS ATL03
This study uses photons from the IS-2 ATLAS platform to derive water level changes for all overlapping lakes >1 km² in the CONUS. The spaceborne laser altimeter platform was launched in 2018 and the first available products were acquired on 12 October of the same year. The ATLAS sensor uses a beam splitting approach to separate photons into three pairs of beams (six in total). These individual beams are named GT1L/GT1R, GT2L/GT2R, and GT3L/GT3R. The beams in each pair (e.g., GT1L/GT1R) are separated by ~90 m in the across-track direction and by ~2.5 km in the along-track direction, and the three beam pairs are separated from one another by ~3.3 km. Each pair consists of a “strong” beam and a “weak” beam, where the “strong” beam has ~4 times the energy of the “weak” beam [22]. The ground footprint size and along-track spacing of the photons are ~17.5 m and ~70 cm, respectively.
Photon data from the ATLAS sensor are provided to the scientific community in different data products at varying levels of post-processing. There is an inland water surface height Level-3A product (ATL13) derived by the IS-2 science team that provides along-track water surface heights from overlapping lakes, reservoirs, bays, estuaries, and rivers [
31]. These ATL13 data are derived from a Level-2 global geolocated photon data product (ATL03). The ATL13 dataset consists of estimated mean water surface heights within different segment lengths (~100 m and ~1–3 km). These segmented heights were derived from individual ATL03 photons that fall within the inland waterbody mask used in the ATL13 processing workflow [32]. Users of the ATL13 product are therefore limited to the waterbodies included in that mask. Previous versions of the ATL13 product did not allow for a complete analysis of all lakes >1 km² in the CONUS because of the particular waterbody mask that was selected.
This study utilizes individual photons from the more robust ATL03 Level-2 product so that we have complete control over which photons will be processed in our water level workflows. The ATL03 products for this study were acquired from the National Snow and Ice Data Center Distributed Active Archive Center (NSIDC DAAC) [
33]. Each IS-2 orbit is divided into 14 ATL03 data granules, so each granule encompasses around 1/14 of a full orbit. We acquired all ATL03 (Version 3) data granules overlapping the CONUS (Regions 01 and 02: ascending; Regions 06 and 07: descending) from the NSIDC servers. These consisted of ~7900 ATL03 granules with a temporal range of October 2018 to November 2020. The ATL03 products consist of processed geolocated photons; the main fields of interest are the height above the WGS84 ellipsoid, the time of the photon event, various confidence flags, and the geodetic latitude and longitude of the individual photons [
34]. The IS-2 processing methodology employed for this research is further described in
Section 2.3.
2.2. Waterbody Extents
This study uses spatial extents of all lakes and reservoirs larger than 1 km² in the CONUS as the initial waterbody mask for the ATL03 photons. The mask was derived from techniques described in [
6] and underwent significant human-aided quality assurance and quality control to improve the waterbody extent product. The extents from this initial water mask were buffered inward by 30 m to reduce the quantity of edge photons (i.e., photons that capture elevations from the water/land periphery). This in-buffered extent file consists of 6690 waterbodies, and the centroids of those extents are displayed in Figure 1. Using a custom lake mask allows complete control over the size and location of the waterbodies used as an initial filter for the ATL03 photons. This contrasts with the segmented photons of the ATL13 product, which rely on that product’s own waterbody mask. Our custom waterbody mask is used in the water level workflow described in Section 2.3 to initially filter the ATL03 photons within full IS-2 granules into their spatially overlapping water boundaries.
Because surface water hydrologic systems (e.g., lakes and reservoirs) are dynamic, the ATL03 photons must be further spatially filtered by waterbody extents that are as temporally coincident with each IS-2 acquisition date and time as possible. The waterbody mask described in the previous paragraph can be thought of as an initial static extent filter for lakes >1 km² in the CONUS. Landsat Level-3 DSWE datasets are used in this study to further spatially filter ATL03 photons into their actual active bodies of water. DSWE data are a gridded Landsat Level-3 science product derived from Landsat Analysis Ready Data (ARD) that provides cell-by-cell information pertaining to the existence and condition of surface water extents [
35]. Landsat ARD products consist of the most geometrically accurate Landsat 4–5 Thematic Mapper (TM), Landsat 7 Enhanced Thematic Mapper Plus (ETM+), and Landsat 8 Operational Land Imager (OLI)/ Thermal Infrared Sensor (TIRS) data that are processed to the highest scientific standards and level of processing required for direct use in assessing and monitoring landscape changes [
36]. These DSWE products are used in the water level workflow as described in
Section 2.3 to better spatially filter ATL03 photons into the actual water boundaries of the temporally closest DSWE grid.
2.3. Altimeter Water Levels
All 7898 ATL03 granules acquired from the NSIDC that overlap the CONUS were used as the basis of the IS-2 ATL03 processing workflow described in this subsection (Figure 2). A parallel computing and spatial indexing approach was used to spatially filter the IS-2 granules because of the vast quantity of ATL03 photons and the number of vertices in the waterbody mask. Each granule was sent to its own core within UCLA’s Hoffman2 Cluster in an embarrassingly parallel workflow that splits the granule’s photons into their respective spatially overlapping waterbody extents using the lake/reservoir mask described in the first portion of Section 2.2. To speed up processing, the initial static water mask is first clipped to the buffered extent of the granule being processed, before photons are assigned to their overlapping waterbodies. This step uses spatial indexing techniques to rapidly determine which of the waterbodies in the mask are likely to contain photons from the granule, which greatly speeds up the precise spatial assignment of photons to their respective waterbodies. This initial photon intersection analysis is done for each of the granule’s six tracks (three “weak” and three “strong”), and only photons whose land, inland ice, and inland water “signal_conf_ph” values are flagged as high confidence (i.e., a value of 4) are saved for further processing. The outputs of this initial processing step are arrays of high confidence photons from all 7898 granules for every track for each waterbody in the inward-buffered mask.
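The static filtering step can be illustrated with a minimal sketch. It is not the implementation used in this study: a plain bounding-box test stands in for the spatial index, each photon carries a single hypothetical "conf" value in place of the per-surface-type “signal_conf_ph” entries, and the function names and the dictionary layout are assumptions made for illustration only.

```python
def bbox(polygon):
    """Return (min_x, min_y, max_x, max_y) for a list of (x, y) vertices."""
    xs, ys = zip(*polygon)
    return min(xs), min(ys), max(xs), max(ys)


def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) lies inside the polygon."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def assign_photons(photons, waterbodies, min_conf=4):
    """Group high-confidence photons by the waterbody that contains them.

    photons: list of dicts with 'lon', 'lat', 'conf' (hypothetical layout).
    waterbodies: dict mapping a waterbody id to a list of (lon, lat) vertices.
    """
    boxes = {wid: bbox(poly) for wid, poly in waterbodies.items()}
    assigned = {wid: [] for wid in waterbodies}
    for p in photons:
        if p["conf"] < min_conf:
            continue  # keep only photons flagged as high confidence
        for wid, (x0, y0, x1, y1) in boxes.items():
            # Cheap bounding-box rejection (stand-in for a spatial index).
            if not (x0 <= p["lon"] <= x1 and y0 <= p["lat"] <= y1):
                continue
            # Precise test, run only for candidate waterbodies.
            if point_in_polygon(p["lon"], p["lat"], waterbodies[wid]):
                assigned[wid].append(p)
                break
    return assigned
```

Because every granule is processed independently, a function of this kind could be mapped over the granules by any process pool or cluster scheduler without shared state.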
The extents of several waterbodies in the lake/reservoir database exhibit notable inter- and intra-annual variations in areal extent, but the initial waterbody mask provides only a snapshot of these dynamic extents. Care must therefore be taken to filter out ATL03 photons that made it through the initial static spatial filtering process but do not actually fall within regions in which surface water was present at the time of the individual IS-2 granule acquisitions. These false surface water photons are removed using a dynamic waterbody extent mask so that the final water level products have an increased accuracy. Each waterbody that had photons within its extent after the initial static spatial filter was passed to its own core for the dynamic spatial filtering process, which dramatically speeds up processing. For a given waterbody, the ATL03 photons of each track on each date were filtered with the temporally closest DSWE scene in which less than 20% of that track’s photons fell within regions flagged as “cloud, cloud shadow, and snow”; photons that did not fall within the DSWE “Water–high confidence” or “Water–moderate confidence” regions were removed. For tracks where no DSWE scene acquired within ±4 years of the track’s acquisition date met the 20% cloud threshold, the DSWE scene with the lowest percentage of cloudy photons was used instead. This dynamic spatial extent filtering removes as many false surface water photons as possible over waterbodies that see larger changes in spatial extent.
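The scene-selection rule described above can be sketched as follows. This is an illustrative reading of the text, not the study’s code: the function name, the tuple layout, and the approximation of ±4 years as 4 × 365 days are assumptions, and the cloud fraction of each scene is taken as already computed for the track in question.

```python
from datetime import date


def pick_dswe_scene(track_date, scenes, max_cloud=0.20, window_days=4 * 365):
    """Choose the DSWE scene used to filter one track.

    scenes: list of (scene_date, cloud_fraction) tuples, where cloud_fraction
    is the share of the track's photons that fall on cells flagged as cloud,
    cloud shadow, or snow (hypothetical layout).
    """
    # Restrict the search to scenes acquired within the time window.
    nearby = [s for s in scenes if abs((s[0] - track_date).days) <= window_days]
    if not nearby:
        return None
    # Preferred case: temporally closest scene below the cloud threshold.
    clear = [s for s in nearby if s[1] < max_cloud]
    if clear:
        return min(clear, key=lambda s: abs((s[0] - track_date).days))
    # Fallback: no scene meets the threshold, so take the least cloudy one.
    return min(nearby, key=lambda s: s[1])
```

Once a scene is selected, the photons of the track would be kept only where the scene’s cell is classed as high or moderate confidence water.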
The individual tracks were then processed into “weak” beam and “strong” beam water levels for each date for every lake or reservoir in the database using an outlier filtering, segmenting, and clustering technique. First, photons in a given waterbody’s track were converted from WGS84 ellipsoidal height into orthometric height using the EGM2008 geoid model. Next, outlier photons were filtered out using the “dem_h” data within the ATL03 product: all photons outside of the elevation range from the mean “dem_h” value minus 200 m to the mean “dem_h” value plus 100 m were excluded from further processing. The remaining photons were then histogrammed into 1 m bins and the water level of the most frequent bin was obtained. Only photons within +3 m and −2 m of this max bin level were kept for further processing.
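A minimal sketch of this coarse filtering stage is given below. Several details are assumptions rather than statements about the study’s code: the photon heights and “dem_h” values are taken to share one vertical reference, the level of the most frequent bin is taken at the bin center, and all function and parameter names are hypothetical. The height conversion uses the standard relation H = h − N, where N is the geoid undulation.

```python
import math
from collections import Counter


def to_orthometric(h_ellipsoid, geoid_undulation):
    """Orthometric height H = h - N (N taken from a geoid model such as EGM2008)."""
    return h_ellipsoid - geoid_undulation


def coarse_filter(heights, dem_heights, bin_m=1.0, up=3.0, down=2.0):
    """Return (kept_heights, max_bin_level) for one track over one waterbody.

    heights and dem_heights are assumed to share the same vertical reference.
    """
    dem_mean = sum(dem_heights) / len(dem_heights)
    # Step 1: discard photons far from the reference DEM elevation.
    kept = [h for h in heights if dem_mean - 200.0 <= h <= dem_mean + 100.0]
    if not kept:
        return [], None
    # Step 2: histogram into 1 m bins and locate the most frequent bin.
    counts = Counter(math.floor(h / bin_m) for h in kept)
    top_bin = counts.most_common(1)[0][0]
    level = (top_bin + 0.5) * bin_m  # bin center (assumption)
    # Step 3: keep photons within +3 m / -2 m of the max bin level.
    return [h for h in kept if level - down <= h <= level + up], level
```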
The photons were then segmented based on a maximum number of photons per segment and a ~100 m distance threshold. Segments consist of 50 photons for “strong” beams and 25 photons for “weak” beams, where the photons in each segment are within ~100 m of each other. Segments containing fewer than 50 (“strong” beam) or 25 (“weak” beam) photons were removed from further processing. A histogram peak filter was run on each set of segmented (50 or 25) photons to remove false subsurface water signals (displayed in Figure 3). This peak filter entails histogramming the photons of each segment into 5 cm bins and then determining the three highest frequency bins. Of these, bins containing less than 33% of the photon count of the maximum frequency bin were removed. The water level of the maximum frequency bin among the remaining bins was selected, since this bin is typically associated with the actual surface reflectance (and not the false subsurface photons). However, the water level of the second most frequent bin was selected if it was more than 55 cm higher than that of the most frequent bin. This was done in case the segment’s photons exhibited a higher number of pulses within a lower return bin as compared to the upper return bin. Only photons in the segment that were within ±50 cm of the water level of the selected bin were kept for additional processing. A subsequent filtering step removed photons whose absolute deviation exceeded the median absolute deviation (MAD) of the segment. The final segment water levels were assigned by taking the mean of the remaining photons within each segment.
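The per-segment peak filter can be sketched as below for a segment that has already been formed. This is one possible reading of the description, under stated assumptions: bin levels are taken at bin centers, the second most frequent bin replaces the first only when it lies more than 55 cm above it, deviations are measured from the segment median, and the function name is hypothetical.

```python
import math
from collections import Counter
from statistics import mean, median


def segment_level(heights, bin_m=0.05):
    """Estimate the water level of one photon segment (heights in meters)."""
    # Histogram into 5 cm bins and take the three most frequent bins.
    counts = Counter(math.floor(h / bin_m) for h in heights)
    top = counts.most_common(3)
    max_count = top[0][1]
    # Drop candidate bins holding less than 33% of the maximum bin's count.
    top = [(b, c) for b, c in top if c >= 0.33 * max_count]
    chosen = top[0][0]
    # Prefer the second bin if it sits more than 55 cm above the first,
    # i.e. the most frequent bin is likely a false lower return.
    if len(top) > 1 and (top[1][0] - chosen) * bin_m > 0.55:
        chosen = top[1][0]
    level = (chosen + 0.5) * bin_m  # bin center (assumption)
    # Keep photons within +/-50 cm of the selected level.
    near = [h for h in heights if abs(h - level) <= 0.50]
    # Remove photons whose absolute deviation exceeds the MAD.
    med = median(near)
    mad = median(abs(h - med) for h in near)
    core = [h for h in near if abs(h - med) <= mad]
    return mean(core)
```

In the example used to check this sketch, five photons near 99.3 m outnumber four photons near 100.0 m, and the upper group is still selected because it lies about 70 cm above the more populated lower bin.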
These segments were then clustered by along-track distance and water level using the density-based spatial clustering of applications with noise (DBSCAN) method with an epsilon value of 50 and a minimum segment sample setting of 1 [37]. The epsilon value is the maximum distance between two segments for one segment to be considered within the neighborhood of the other. Prior to clustering, the segments’ latitudes were normalized such that an along-track distance of ~500 m corresponded to a value of ~50 in the data. This allowed the epsilon value of 50 to cluster the segments appropriately by distances of 50 cm (water level) and ~500 m (along-track distance). Because the minimum sample setting of 1 allows single-segment clusters to form, clusters consisting of only one segment were subsequently removed from the processing workflow. The mean cluster water levels were determined from the individual segments within each cluster. Clusters whose means were more than two standard deviations from the all-segment mean, as well as outlier clusters that brought the all-cluster standard deviation above 20 cm, were removed from further processing. Clusters with a mean absolute deviation greater than 2.5 cm were further filtered: segments whose water levels were more than ±5 cm from the peak of the Gaussian-smoothed non-parametric kernel density estimate of the intra-cluster segment water levels were removed, and a new cluster mean was derived. Otherwise, the previous cluster mean was used. The final lake level for the track was determined as the median of all remaining cluster water levels for each beam type. Figure 3 highlights the output of this workflow and displays photons, segments, and clusters for two example “strong” beam tracks.
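The clustering step can be illustrated without external libraries. With a minimum sample setting of 1, every segment is a core point, so DBSCAN reduces to finding the connected components of the graph that links segments lying within epsilon of each other; the sketch below implements that special case only. The scaling (along-track meters divided by 10, water level in centimeters), the use of along-track distance instead of normalized latitude, and the function names are assumptions, and the standard deviation and kernel density refinements described above are omitted.

```python
import math
from statistics import mean, median


def cluster_segments(points, eps=50.0):
    """Connected components of the eps-neighborhood graph.

    Equivalent to DBSCAN when the minimum sample setting is 1.
    points: list of (scaled_distance, scaled_level) tuples.
    Returns a list of clusters, each a sorted list of point indices.
    """
    labels = [-1] * len(points)
    clusters = []
    for start in range(len(points)):
        if labels[start] != -1:
            continue
        label = len(clusters)
        labels[start] = label
        stack, members = [start], []
        while stack:
            i = stack.pop()
            members.append(i)
            for j in range(len(points)):
                if labels[j] == -1 and math.dist(points[i], points[j]) <= eps:
                    labels[j] = label
                    stack.append(j)
        clusters.append(sorted(members))
    return clusters


def track_level(segments, eps=50.0):
    """Median of cluster means for one track.

    segments: list of (along_track_m, level_m) tuples (hypothetical layout).
    """
    # Assumed scaling: 500 m along track -> 50 and 50 cm in level -> 50.
    points = [(d / 10.0, z * 100.0) for d, z in segments]
    # Single-segment clusters are discarded, as in the text.
    clusters = [c for c in cluster_segments(points, eps) if len(c) > 1]
    if not clusters:
        return None
    return median(mean(segments[i][1] for i in c) for c in clusters)
```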
2.4. In Situ Gage Level and Altimeter Water Level Comparison
USGS surface water gage locations and their IDs were acquired from the USGS NWIS for twenty parameter codes related to surface water levels (i.e., 61055, 62615, 00062, 72292, 99064, 72277, 62600, 72293, 62614, 99065, 72264, 99020, 62617, 00065, 72214, 72020, 30211, 62616, 30207, 72275). These 765 gages were manually matched to their corresponding lake or reservoir in the initial static waterbody mask described at the beginning of Section 2.2. An automated spatial matching approach was considered, but a manual approach was ultimately chosen, mostly because of the occasionally poor accuracy of the reported gage coordinates that would serve as input to a minimum-distance-based spatial matching approach. The centroids of the 351 lakes in our custom static waterbody mask that have both IS-2 and gage water levels are plotted in light blue in Figure 1. All available temporally coincident in situ water level readings for each waterbody with an active water level gage were acquired from the NWIS via the hydrofunctions Python package from Martin Roberge and contributors (https://github.com/mroberge/hydrofunctions, accessed on 21 March 2021). The initial analysis of the gage levels sought to convert the many different water height datums to a uniform vertical datum for direct comparison with the spatially and temporally overlapping water levels derived from the IS-2 processing workflow. However, this method was abandoned because of the poor quality of many of the gages’ metadata: most of the gages’ vertical datum offsets were not precise or accurate enough for direct comparison with the IS-2 water levels. Instead, we compared temporally coincident relative water level changes from the gages and the IS-2 water level products.
Relative water level changes were derived for each of the gages’ temporally overlapping IS-2 date pairs. The IS-2 acquisition times were converted to local gage times in order to acquire the proper coincident gage readings. Furthermore, only gage readings acquired within ±24 h of their corresponding IS-2 granule’s acquisition time were utilized in the comparison. USGS gage data are served at two temporal scales: (1) daily values (DVs) and (2) instantaneous values (IVs). Typically, DVs for a given gage are derived using some statistical analysis (e.g., daily mean or median) of IVs, or they are simply an IV at a given time of day (e.g., noon). All gage/IS-2 comparisons were done with the IV closest in time to the IS-2 acquisition, so that the water level readings from the two methods could be compared as accurately as possible. These relative water level changes from temporally overlapping USGS gage data were ultimately compared to the IS-2 relative water level changes in order to assess the accuracy of the ATL03-derived water level changes produced by the workflow described in the previous subsection.
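The relative-change comparison can be sketched as follows. The sketch makes several assumptions that the text does not settle: both series are expressed in the same time zone and the same length unit, a "date pair" means every pair of matched acquisition dates rather than only consecutive ones, and the function names and tuple layout are hypothetical. Because only differences between dates are compared, any constant datum offset between the gage and the altimeter cancels out.

```python
from datetime import datetime, timedelta
from itertools import combinations


def nearest_reading(gage, when, max_gap=timedelta(hours=24)):
    """Return the gage level closest in time to `when`, or None if too far.

    gage: list of (datetime, level) instantaneous readings.
    """
    t, level = min(gage, key=lambda r: abs(r[0] - when))
    return level if abs(t - when) <= max_gap else None


def relative_changes(is2, gage):
    """Pair up water level changes from IS-2 and from the gage.

    is2: list of (datetime, level) altimeter water levels.
    Returns a list of (is2_change, gage_change) tuples, one per date pair.
    """
    matched = []
    for when, level in sorted(is2):
        g = nearest_reading(gage, when)
        if g is not None:
            matched.append((level, g))
    # Differences remove any constant vertical datum offset.
    return [(b[0] - a[0], b[1] - a[1]) for a, b in combinations(matched, 2)]
```

The differences between the paired changes could then be summarized with an error statistic of the reader’s choice.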