Monitoring Variations in Lake Water Storage with Satellite Imagery and Citizen Science

: Despite lakes being a key part of the global water cycle and a crucial water resource, there is limited understanding of whether regional or lake-speciﬁc factors control water storage variations in small lakes. Here, we study groups of small, unregulated lakes in North Carolina, Washington, Illinois, and Wisconsin, USA using lake level measurements gathered by citizen scientists and lake surface area measurements from optical satellite imagery. We show the lake level measurements to be highly accurate when compared to automated gauges (mean absolute error = 1.6 cm). We compare variations in lake water storage between pairs of lakes within these four states. On average, water storage variations in lake pairs across all study regions are moderately positively correlated ( ρ = 0.49) with substantial spread in the degree of correlation. The distance between lake pairs and the extent to which their changes in volume are correlated show a weak but statistically signiﬁcant negative relationship. Our results indicate that, on regional scales, distance is not a primary factor governing lake water storage patterns, which suggests that other, perhaps lakes-speciﬁc, factors must also play important roles.


Introduction
Surface water stored in lakes is a key part of the global water cycle and a crucial water resource that provides drinking water, supports irrigation systems, promotes economic activity and tourism, and generates hydroelectric power. In addition to supporting human activities, lakes sustain diverse ecosystems important for many natural processes. Lakes encompass geographically diverse areas, are the lowest points in their surrounding landscapes, serve as records of past hydrologic and geologic events, and regulate surrounding climate. Because they integrate so many processes, lakes and reservoirs act as sentinels of climate change and are among the areas most threatened by climate change and other human impacts [1][2][3][4]. Lake water level, especially in endorheic (or seepage) lakes, is very sensitive to changes in the water balance [3,5]. Studying lake levels allows us to better understand the effects of climate change on the water cycle and lake ecosystems [2,3].
Lake measurement systems, whether in situ or satellite-based, tend to focus mostly on larger lakes. Lakes smaller than 10 km 2 are generally poorly observed and monitored [6,7]. There are, however, many more small lakes than large lakes in the world. For example, of the >1.4 million lakes larger than 0.1 km 2 in the HydroLakes database [8], >1.2 million are smaller than 1.0 km 2 . Furthermore, as lakes decrease in waterbody size, the total shoreline length of those lakes increases [8]. As such, most land-lake interfaces occur in small lakes, in return giving these small lakes a significant role in biogeochemical cycles [6,8]. Natural lakes are not spread evenly over the globe; some areas have many lakes, while others have none. In lake-rich regions, it is possible to conceptualize lakes as either individual features or as common expressions of a shared water table. Which of these makes more sense depends on the specific question at hand but also on how correlated variations in lake water storage are within a region.
While there have been studies looking at regional patterns of change in lake water level [9,10] or surface area [11] and while many ecologists study small lake ecosystems, spatiotemporal patterns in lake water storage, especially for small lakes, remain poorly studied despite their abundance and hydrological significance. A handful of studies [12][13][14], mostly focused on the Tibetan Plateau, have examined controls on lake volume across multiple lakes within a region. However, our understanding remains limited regarding how the storage of water in regional groups of small lakes fluctuates, how that differs from large lakes, what controls these fluctuations, and the implication of these fluctuations on the ecosystem services they provide [6,15].
This gap in knowledge regarding small lakes is in large part caused by limitations in current lake monitoring systems [3]. To date, lake monitoring is primarily done through government-run gauge networks, studies by individual scientists, and satellite radar altimetry. In the United States, the most robust gauge network is maintained by the United States Geological Survey (USGS) and includes around 300 lakes, most of which are relatively large, man-made, or dammed reservoirs [16]. Many state and local governments have their own gauge networks, such as Lake Level Minnesota [17] and Water Data for Texas [18]. These data are often limited for scientific purposes by major temporal discontinuities (as in the case of Minnesota) or by a focus on regulated reservoirs (as in Texas). Datasets collected by individual scientists tend to be of short duration and cover a limited number of lakes. This leaves a large majority of the natural lakes in the United States unmonitored [6][7][8]. Similar deficiencies exist elsewhere in the world [19][20][21]. Furthermore, there has been an overall decline in gauging networks due to a lack of funding, lack of personnel for installation and data collection, and difficulty of installation [22]. Satellite radar altimetry can help monitor lake water level and lake water volume in areas where in situ measurements are not available, as well as provide measurements over relatively long-time spans [23]. These measurements, which are increasingly applicable to smaller lakes [24,25], remain most effective for lakes larger than 1-10 km 2 [26,27] and are available only for a small fraction of lakes worldwide because satellite altimeter ground tracks are widely spaced [28].
These limitations of current lake monitoring systems suggest that new strategies are needed to understand variations in small lakes. High-resolution remote sensing of inundation extent paired with measurements of lake stage by citizen scientists offers a promising alternative approach. Time series of optical satellite imagery from sensors on satellites such as Landsat and Sentinel 2 can be used to measure variations in lake area over time, an important metric that provides key information about changes in lake water storage [11,23,[29][30][31]. However, on their own, these lake area measurements can be difficult to interpret, since they provide only indirect information about water surface elevation. Meanwhile, a new emphasis on citizen science in hydrology has shown that it is possible to work with members of the public to measure water levels across a broad range of water bodies [32][33][34][35]. By combining measurements of lake area from satellites and lake level from citizen scientists, we can measure variations in lake volume and storage to address key science questions in lake hydrology.
In this paper, we aim to better understand the factors that control regional patterns of lake water storage. We test whether the storage of water in regional groups of small lakes varies in concert, suggesting regional-scale drivers, or if such variations are primarily driven by local factors unique to each lake. To achieve this objective, we calculate changes in lake water storage over time in groups of small natural lakes in Wisconsin, Illinois, North Carolina, and Washington, USA through the use of lake level measurements gathered by citizen scientists and lake extent measurements taken from optical satellite imagery. We compare these time series of water storage between lakes within regional groups. If time series within a group are highly correlated, then controls on lake water storage are likely regional in nature. If they are uncorrelated, then local controls specific to each lake must be relatively important.

Study Areas
For this study, we selected groups of natural lakes in North Carolina, Washington, Illinois, and Wisconsin ( Figure 1). We chose these lakes based on the feasibility of gauge installation, ease of access, and lack of active controls on water levels. Because we were interested in observing natural patterns, it was important to exclude any actively humancontrolled lakes or reservoirs. However, some of the lakes have structures that, while not actively managed, were designed to generally keep water levels higher than would naturally occur. The one exception is Horsepen Lake, in North Carolina, which experienced minor changes to its outlet following Hurricane Florence in 2018. We retained it in our analysis, though removing it would not substantially affect our results. We included both drainage lakes and seepage lakes and did not explicitly attempt to differentiate between them. The lake regions selected are of varying geographies and contain lakes with diverse characteristics, such as area, depth, and proximity to other study lakes (Table 1) [36]. A table of lake properties can be found in the Appendix A (Table A1). We worked with citizen scientists to collect water level data for lakes in North Carolina, Washington, and Illinois as part of the Lake Observations from Citizen Scientists and Satellites (LOCSS) Project [37], and we obtained data for lakes in Wisconsin from an existing network operated by the Wisconsin Department of Natural Resources [38]. is possible to work with members of the public to measure water levels across a broad range of water bodies [32][33][34][35]. By combining measurements of lake area from satellites and lake level from citizen scientists, we can measure variations in lake volume and storage to address key science questions in lake hydrology.
In this paper, we aim to better understand the factors that control regional patterns of lake water storage. We test whether the storage of water in regional groups of small lakes varies in concert, suggesting regional-scale drivers, or if such variations are primarily driven by local factors unique to each lake. To achieve this objective, we calculate changes in lake water storage over time in groups of small natural lakes in Wisconsin, Illinois, North Carolina, and Washington, USA through the use of lake level measurements gathered by citizen scientists and lake extent measurements taken from optical satellite imagery. We compare these time series of water storage between lakes within regional groups. If time series within a group are highly correlated, then controls on lake water storage are likely regional in nature. If they are uncorrelated, then local controls specific to each lake must be relatively important.

Study Areas
For this study, we selected groups of natural lakes in North Carolina, Washington, Illinois, and Wisconsin ( Figure 1). We chose these lakes based on the feasibility of gauge installation, ease of access, and lack of active controls on water levels. Because we were interested in observing natural patterns, it was important to exclude any actively humancontrolled lakes or reservoirs. However, some of the lakes have structures that, while not actively managed, were designed to generally keep water levels higher than would naturally occur. The one exception is Horsepen Lake, in North Carolina, which experienced minor changes to its outlet following Hurricane Florence in 2018. We retained it in our analysis, though removing it would not substantially affect our results. We included both drainage lakes and seepage lakes and did not explicitly attempt to differentiate between them. The lake regions selected are of varying geographies and contain lakes with diverse characteristics, such as area, depth, and proximity to other study lakes (Table 1) [36]. A table of lake properties can be found in the Appendix A (Table  A1). We worked with citizen scientists to collect water level data for lakes in North Carolina, Washington, and Illinois as part of the Lake Observations from Citizen Scientists and Satellites (LOCSS) Project [37], and we obtained data for lakes in Wisconsin from an existing network operated by the Wisconsin Department of Natural Resources [38].

Methods
This project combines lake water levels collected by citizen scientists (Section 3.1) with same-day surface water areas calculated from Landsat and Sentinel 2 imagery (Section 3.2) to determine the change in lake water storage over time (Section 3.3). We then compared the time series of lake water storage between lakes within each regional group to determine if lakes fluctuate in coherence (Section 3.4) as well as determine if the distance between lakes drives coherence (Section 3.5).
3.1. Measuring Lake Water Levels 3.1.1. Data Acquisition Two citizen science projects provide the lake level data used in this study. For North Carolina, Washington, and Illinois, the lake level data come from measurements collected through the LOCSS Project. LOCSS is a NASA-funded project, begun in 2017, that combines lake level data reported by a network of citizen scientists with satellite images, from which lake area is determined, to better understand how the volume of water in natural lakes is changing over time [39]. To collect these measurements, the LOCSS project installed water level gauges into natural lakes. On top of the gauge is a sign with instructions, a unique gauge ID, and a phone number ( Figure 2). Citizen scientists passing by the gauge read the lake level and text in the measurement, along with the gauge ID, to the phone number. Alternatively, citizen scientists provide measurements on datasheets, via the LOCSS website [37], or occasionally via other means such as email or a phone call. The measurements are recorded and displayed in real-time on the LOCSS website. The record begins on 18 April 2017 for the North Carolina lakes, on either 10 September 2018 or 11 June 2019 for the Washington lakes, and on 13 May 2019 for the Illinois lakes, depending on the lake gauge installation date. All US LOCSS data are collected from gauges incremented in units of 0.02 feet (~6 mm). Because it is possible to estimate between two increments, we generally receive measurements to the nearest 0.01 foot (~3 mm). The data submitted by the citizen scientists occur mostly at random time intervals, with some gauges having more readings than others. We were interested in capturing measurements on or near Landsat and Sentinel 2 overpass dates, however, to complete our volumetric analysis, it is also worth noting that citizen scientists can sign up for email notifications and the LOCSS newsletter on the LOCSS website (www.locss.org, accessed 1 March 2021), which encourages and reminds them to make measurements on or close to Landsat and Sentinel 2 overpass dates. 1 March 2021), which encourages and reminds them to make measurements on or close to Landsat and Sentinel 2 overpass dates. The lake level data from Wisconsin come from multiple sources but are dominated by volunteer lake level monitoring efforts coordinated by the Wisconsin Department of Natural Resources (WDNR) and partners (USGS, The North Lakeland Discovery Center, county-level land and water departments, Table 1) [38]. Like the LOCSS project, volunteers in Wisconsin take readings on a staff gauge and various professionals install/remove and survey the staff gauges each spring and fall. Unlike LOCSS, individual volunteers are responsible for monitoring a specific lake at least monthly, but ideally weekly, during the ice-free season and are either trained to enter their data into WDNR's Surface Water Integrated Monitoring System or mail in a paper copy of their datasheet. The earliest data used for this study start from on 1 April 2015, but start dates vary by lake. For this study, we analyzed data collected until 1 February 2020. The LOCSS data collection method was, in part, inspired by the CrowdHydrology text messaging system [32,34] and the Wisconsin network protocols [38], so this study is partially indebted to their previous work. An example of the lake level time series for Bay Tree Lake in eastern North Carolina shows the typical format of the data (Figure 3a). Outliers were automatically filtered out of the North Carolina, Washington, and Illinois lakes based upon the maximum and minimum values on the gauge boards. Measurements that exceeded the maximum possible value on a gauge were automatically removed. There are eight outliers in North Carolina (out of 2238 total measurements), 33 (out of 1296) in Washington, and none (out of 827) in Illinois. In Wisconsin, no outlier removal was required because the data were preprocessed by Wisconsin's DNR. In total, 5287 lake level measurements were used in this study. The lake level data from Wisconsin come from multiple sources but are dominated by volunteer lake level monitoring efforts coordinated by the Wisconsin Department of Natural Resources (WDNR) and partners (USGS, The North Lakeland Discovery Center, county-level land and water departments, Table 1) [38]. Like the LOCSS project, volunteers in Wisconsin take readings on a staff gauge and various professionals install/remove and survey the staff gauges each spring and fall. Unlike LOCSS, individual volunteers are responsible for monitoring a specific lake at least monthly, but ideally weekly, during the ice-free season and are either trained to enter their data into WDNR's Surface Water Integrated Monitoring System or mail in a paper copy of their datasheet. The earliest data used for this study start from on 1 April 2015, but start dates vary by lake. For this study, we analyzed data collected until 1 February 2020. The LOCSS data collection method was, in part, inspired by the CrowdHydrology text messaging system [32,34] and the Wisconsin network protocols [38], so this study is partially indebted to their previous work. An example of the lake level time series for Bay Tree Lake in eastern North Carolina shows the typical format of the data (Figure 3a). Outliers were automatically filtered out of the North Carolina, Washington, and Illinois lakes based upon the maximum and minimum values on the gauge boards. Measurements that exceeded the maximum possible value on a gauge were automatically removed. There are eight outliers in North Carolina (out of 2238 total measurements), 33 (out of 1296) in Washington, and none (out of 827) in Illinois. In Wisconsin, no outlier removal was required because the data were preprocessed by Wisconsin's DNR. In total, 5287 lake level measurements were used in this study.

GPS Processing
For three lakes in North Carolina and one in Washington, there are multiple gauges on the same lake. We focused on studying change in lake level for the whole lake, not the individual gauges, so for these cases, the lake level records were merged together to create one record of change in lake level. To do this accurately, the scales of the lake levels were matched based on GPS elevation data taken in the field during gauge installation. GPS data were recorded at a rate of one measurement per second using a high-precision Septentrio PolaRx-5 receiver (Leuven, Belgium) that was floated on a small raft [40] for one hour immediately adjacent to the gauge.
This elevation was paired with the lake level measurement that was taken on that day. If the height of a gauge were to change due to the effects of ice, human impacts, or other factors, a new GPS elevation would be collected once the gauge was reinstalled or fixed. Depending on the elevation and lake level measurement pair taken on that day the old and new record could be matched. To match up multiple gauges on one lake, we used the GPS and lake level pairing for each gauge to determine what the lake levels were at the same elevations. We then calculated the difference between the levels at the same elevation and added or subtracted that value from the lake level data record for each gauge.

Validation
One important concern regarding citizen science data is data accuracy, especially when collected by those without specific training [41]. To address this concern and validate our lake level data, we installed Levelogger Junior Edge pressure transducers (Manufactured by Solinst, Georgetown, ON, Canada) , corrected for atmospheric pressure variations using a Solinst Barologger located within at least 40 km. We collected coincident citizen science and pressure transducer measurements at 14 gauges in North Carolina, 7 gauges in Washington, and 12 gauges in Illinois. The pressure transducers recorded measurements every 15 min for periods of 6-10 months. The pressure transducer

GPS Processing
For three lakes in North Carolina and one in Washington, there are multiple gauges on the same lake. We focused on studying change in lake level for the whole lake, not the individual gauges, so for these cases, the lake level records were merged together to create one record of change in lake level. To do this accurately, the scales of the lake levels were matched based on GPS elevation data taken in the field during gauge installation. GPS data were recorded at a rate of one measurement per second using a high-precision Septentrio PolaRx-5 receiver (Leuven, Belgium) that was floated on a small raft [40] for one hour immediately adjacent to the gauge.
This elevation was paired with the lake level measurement that was taken on that day. If the height of a gauge were to change due to the effects of ice, human impacts, or other factors, a new GPS elevation would be collected once the gauge was reinstalled or fixed. Depending on the elevation and lake level measurement pair taken on that day the old and new record could be matched. To match up multiple gauges on one lake, we used the GPS and lake level pairing for each gauge to determine what the lake levels were at the same elevations. We then calculated the difference between the levels at the same elevation and added or subtracted that value from the lake level data record for each gauge.

Validation
One important concern regarding citizen science data is data accuracy, especially when collected by those without specific training [41]. To address this concern and validate our lake level data, we installed Levelogger Junior Edge pressure transducers (Manufactured by Solinst, Georgetown, ON, Canada) , corrected for atmospheric pressure variations using a Solinst Barologger located within at least 40 km. We collected coincident citizen science and pressure transducer measurements at 14 gauges in North Carolina, 7 gauges in Washington, and 12 gauges in Illinois. The pressure transducers recorded measurements every 15 min for periods of 6-10 months. The pressure transducer data and citizen science data were then adjusted to the same units and scale, matched by date and time, and regressed against each other to determine accuracy.

Data Acquisition
To monitor change in lake surface extent, we used Landsat 8 Optical Land Imager (OLI) and Sentinel 2 MultiSpectral Imager (MSI) images. The Landsat satellites have a temporal resolution of 16 days and a spatial resolution of 30 m. The Sentinel 2 satellites have a temporal resolution of five days and a spatial resolution of 10 m. The use of both Landsat and Sentinel 2 sensors greatly increases image availability. We acquired the images and calculated the lake surface area using code written in Google Earth Engine (GEE) [42,43]. The satellite images were selected using lake polygons from the National Hydrography Dataset (NHD) [44]. Lake polygons do not exist in the NHD database for Horsepen Lake in North Carolina or for Deep Quarry, Harrier, and Herrick lakes in Illinois. In these cases, lake polygons are hand drawn in GEE. Clouds in images lead to commission errors, false positives, and overestimation, so images with more than 30% cloud cover were filtered out. After this filter was applied, we retained 8590 satellite images during the timeframe of our study.

Water Mask
For all images in this collection, we calculated lake surface area using the Dynamic Surface Water Extent (DSWE) method [45] on both the Landsat and Sentinel 2 imagery. In order to use DSWE for the Sentinel 2 imagery, we first applied a transformation function to the Sentinel 2 imagery to approximate reflectance values for equivalent Landsat bands [46,47]. The DSWE algorithm then classified each pixel into one of five categories: high probability of water, moderate probability of water, low probability of water, not water, or wetland ( Figure 4b). Open water pixels with a high or moderate level of confidence and wetland pixels were selected to create the water mask over each lake (Figure 4d). We included wetland pixels due to the large amounts of inundated vegetation in some lakes. To allow the inclusion of images in which a lake is partially obscured by clouds, we employed a cloud filling algorithm. We identifyed cloudy areas over lakes using the CFmask algorithm for Landsat [48] and the built-in quality band for Sentinel 2. It fills in the cloud-created gaps in the lake mask (Figure 4c) within the lake polygon boundary using the Surface Water Occurrence (SWO) Value developed by Pekel et al. [28]. The SWO value represents the frequency, between 1984 and 2015, that surface water was observed by Landsat. We used the mean water occurrence value from the shoreline pixels along the DSWE water mask as a dynamic threshold for water occurrence to fill in the cloud-obscured areas within the lake and designate those pixels as water (Figure 4e). Using a function written in Google Earth Engine (GEE), we then calculated the lake surface area for each lake image in the time series (Figure 3f).
Values >±25% from the median surface area value in the lake surface area time series were treated as outliers and removed. This threshold was chosen by manually checking lakes with apparent changes in area. In lakes with very high variations in surface area, this threshold might require adjustment, but the surface areas of lakes studied here are sufficiently stable so the application of this threshold is unlikely to result in the removal of correct observations. After removing the outliers, the number of usable satellite images dropped to 7961. These images were matched to the lake level data to calculate variations in lake volume (See Section 3.3).  Values >±25% from the median surface area value in the lake surface area time series were treated as outliers and removed. This threshold was chosen by manually checking lakes with apparent changes in area. In lakes with very high variations in surface area, this threshold might require adjustment, but the surface areas of lakes studied here are sufficiently stable so the application of this threshold is unlikely to result in the removal of correct observations. After removing the outliers, the number of usable satellite images dropped to 7961. These images were matched to the lake level data to calculate variations in lake volume (See Section 3.3).

Validation
To evaluate the accuracy of our automated lake surface extent calculations from the Landsat and Sentinel 2 imagery, we used high-resolution satellite data from Planet (San Francisco, CA, USA), PlanetScope and RapidEye imagery, to provide a second estimate of lake surface extent [49]. Planet imagery was collected using 170 CubeSat satellites and provided daily temporal resolution [49]. It has three satellite types: PlanetScope with 3.7 m resolution, RapidEye with 6.5 m resolution, and SkySat with 72 cm resolution [49]. For this analysis, we used the PlanetScope and RapidEye images because they are freely available to researchers, and we conducted the analysis on the Planet imagery platform, which includes a built-in tool that automatically calculates the area of a polygon. While Cooley et al. [11,50] developed automated surface area calculation methods using Planet imagery, here we manually delineated surface areas primarily due to the small scale of the analysis and potential errors associated with automated detection, such as errors caused by clouds, shadows, and wetlands. We chose three lakes for the validation: Bay Tree Lake in eastern North Carolina, Hastings Lake in Illinois, and Lake Wenatchee in Washington. These lakes were chosen because they represent different test cases. Bay Tree Lake poses little difficulty when calculating lake surface area. Hastings Lake, however, is one of our smallest lakes, leading to potential errors. Lake Wenatchee, meanwhile, experiences both topographic shadow and adjacent snow cover. We selected five dates from each lake's surface area record that also had a corresponding lake level measurement and found the corresponding Planet image on that date. In three cases, images were matched plus or minus a day due to a gap in the Planet record or clouds. Some dates had

Validation
To evaluate the accuracy of our automated lake surface extent calculations from the Landsat and Sentinel 2 imagery, we used high-resolution satellite data from Planet (San Francisco, CA, USA), PlanetScope and RapidEye imagery, to provide a second estimate of lake surface extent [49]. Planet imagery was collected using 170 CubeSat satellites and provided daily temporal resolution [49]. It has three satellite types: PlanetScope with 3.7 m resolution, RapidEye with 6.5 m resolution, and SkySat with 72 cm resolution [49]. For this analysis, we used the PlanetScope and RapidEye images because they are freely available to researchers, and we conducted the analysis on the Planet imagery platform, which includes a built-in tool that automatically calculates the area of a polygon. While Cooley et al. [11,50] developed automated surface area calculation methods using Planet imagery, here we manually delineated surface areas primarily due to the small scale of the analysis and potential errors associated with automated detection, such as errors caused by clouds, shadows, and wetlands. We chose three lakes for the validation: Bay Tree Lake in eastern North Carolina, Hastings Lake in Illinois, and Lake Wenatchee in Washington. These lakes were chosen because they represent different test cases. Bay Tree Lake poses little difficulty when calculating lake surface area. Hastings Lake, however, is one of our smallest lakes, leading to potential errors. Lake Wenatchee, meanwhile, experiences both topographic shadow and adjacent snow cover. We selected five dates from each lake's surface area record that also had a corresponding lake level measurement and found the corresponding Planet image on that date. In three cases, images were matched plus or minus a day due to a gap in the Planet record or clouds. Some dates had two usable images due to overlapping satellite orbits. In these cases, both were used and tested. For the analysis, we first manually drew the lake boundary on each image, creating a lake polygon ( Figure 5). Next, we calculated the surface area over that polygon using the Planet interface tool. Everything inside of the polygon is considered water, so the lake surface area is equivalent to the polygon area. Additionally, we compared the same-day area estimates from Sentinel 2 and Landsat for the subset of lakes/days when both sensors captured images (n = 209). two usable images due to overlapping satellite orbits. In these cases, both were used and tested. For the analysis, we first manually drew the lake boundary on each image, creating a lake polygon ( Figure 5). Next, we calculated the surface area over that polygon using the Planet interface tool. Everything inside of the polygon is considered water, so the lake surface area is equivalent to the polygon area. Additionally, we compared the same-day area estimates from Sentinel 2 and Landsat for the subset of lakes/days when both sensors captured images (n = 209).

Measuring Lake Water Storage
To calculate change in lake water storage on dates without coincident satellite images, we developed rating curves between lake level and lake volume. We calculated the initial lake water volume using the lake level measurements collected by the citizen scientists and the surface area measurements derived from the Landsat and Sentinel 2 imagery. We then used a rating curve to estimate variations in volume from lake level alone.

Calculation of Lake Water Storage
To calculate change in lake storage, or volume, over time, the lake water level measurements for each of the lakes were combined with lake surface area measurements. If a lake level and a surface area measurement fell on the same day, they were automatically paired. If there were two lake level measurements on the same day, then a mean surface area was used. If there was no lake level taken on the day a surface area measurement was taken, then the nearest lake level measurement ±1 day was used as a pairing. If there was a lake level measurement from the day before as well as one for the day after the image was collected, we averaged the lake level measurements. If there was no matching lake surface area measurement for a lake level measurement or vice versa, those data were not used in calculating volume variations.

Measuring Lake Water Storage
To calculate change in lake water storage on dates without coincident satellite images, we developed rating curves between lake level and lake volume. We calculated the initial lake water volume using the lake level measurements collected by the citizen scientists and the surface area measurements derived from the Landsat and Sentinel 2 imagery. We then used a rating curve to estimate variations in volume from lake level alone.

Calculation of Lake Water Storage
To calculate change in lake storage, or volume, over time, the lake water level measurements for each of the lakes were combined with lake surface area measurements. If a lake level and a surface area measurement fell on the same day, they were automatically paired. If there were two lake level measurements on the same day, then a mean surface area was used. If there was no lake level taken on the day a surface area measurement was taken, then the nearest lake level measurement ±1 day was used as a pairing. If there was a lake level measurement from the day before as well as one for the day after the image was collected, we averaged the lake level measurements. If there was no matching lake surface area measurement for a lake level measurement or vice versa, those data were not used in calculating volume variations. Lake volume change was then calculated for each date with a lake level and area pairing based on a linear equation that assumes lake volume change can be approximated by trapezoidal volume: where V is volume, h is height, B 1 is one base measurement of the trapezoid and B 2 is the other base measurement of the trapezoid. This basic equation can then be applied to capture the change in water volume for lakes [51]: where ∆V t i t io is change in volume from t i , time of image, to t io , time of the first image in the time series. B(t i ) is the lake water surface area at the time of the image, while B(t io ) is the lake water surface area at the time of the first image in the time series. The variable h(t i ) is the height at time of image and h(t io ) is the height of the lake in the first image in the time series. Applying this equation to each level and area pair leaves us with a time series of variations in lake water volume for each lake (Figure 3c). Absolute volume cannot be calculated because lake bathymetry is unknown. We worked under the assumption that lake bathymetry remains constant throughout the study period. It is also worth noting that the trapezoidal hypothesis most likely only works for a certain range of lake level to area pairings since area grows quadratically; however, in this study, the hypsometries found are very linear, confirming the applicability of the trapezoidal hypothesis. Note that in Figure 3 the water level and storage change data are highly correlated, while the inundated area data are not. Because variations in lake area are so small (standard deviation 0.04 km 2 , 3% of mean area), nearly all variations in storage are explained by increases or decreases in water level. In addition, some of the observed variation in area for this lake is likely due to errors in satellite retrievals.

Hypsometric Curve
In order to maximize data usage for each lake, we used a hypsometry approach to estimate variations in lake volume. This approach has been used by others [23,[52][53][54]. Ogilvie et al. (2018) [52] noted that for most small lakes, change in lake water volume primarily functions as a response to changes in lake water level; we made the same observation when looking at the degree of variation of change in lake water levels, areas, and volumes in each region. To develop a level-volume change rating curve for each lake, we regressed the calculated variations in lake water volume against the corresponding lake level measurements. A linear equation was determined for this relationship ( Figure 6). The equation was then applied to all lake level measurements in the lake's record to calculate a volume variation time series.
Water 2021, 13, x FOR PEER REVIEW 11 of 22 Figure 6. Example of the linear relationship determined between change in height and calculated change in volume to create a rating curve at Bay Tree Lake.

Propagation of Lake Height Errors into Volume Variations
Ideally, we would like to estimate errors in volume change calculations. However, doing so robustly would require bathymetric measurements that we did not have for most lakes. Instead, we evaluated how the error in the lake height measurements propagates into volume estimates. To do so, we also calculated lake volume changes using lake level data collected from the pressure transducers using the same steps described in Sections 3.3.1-3.3.2. The change in volume time series for both the pressure transducer data and the citizen science data were overlaid on the same plot. The time series show similar variations, though some of the high-frequency variability is not captured by the less Figure 6. Example of the linear relationship determined between change in height and calculated change in volume to create a rating curve at Bay Tree Lake.

Propagation of Lake Height Errors into Volume Variations
Ideally, we would like to estimate errors in volume change calculations. However, doing so robustly would require bathymetric measurements that we did not have for most lakes. Instead, we evaluated how the error in the lake height measurements propagates into volume estimates. To do so, we also calculated lake volume changes using lake level data collected from the pressure transducers using the same steps described in Sections 3.3.1 and 3.3.2. The change in volume time series for both the pressure transducer data and the citizen science data were overlaid on the same plot. The time series show similar variations, though some of the high-frequency variability is not captured by the less temporally dense citizen science data (Figure 7). We then calculated Pearson's Correlation Coefficient (r) between the pressure transducer data and the citizen science data time series. change in volume to create a rating curve at Bay Tree Lake.

Propagation of Lake Height Errors into Volume Variations
Ideally, we would like to estimate errors in volume change calculations. However, doing so robustly would require bathymetric measurements that we did not have for most lakes. Instead, we evaluated how the error in the lake height measurements propagates into volume estimates. To do so, we also calculated lake volume changes using lake level data collected from the pressure transducers using the same steps described in Sections 3.3.1-3.3.2. The change in volume time series for both the pressure transducer data and the citizen science data were overlaid on the same plot. The time series show similar variations, though some of the high-frequency variability is not captured by the less temporally dense citizen science data (Figure 7). We then calculated Pearson's Correlation Coefficient (r) between the pressure transducer data and the citizen science data time series. Figure 7. Time series of change in lake water volume calculated from the citizen science data versus the change in lake water volume calculated from pressure transducer data in Grays Lake, located in Illinois. This demonstrates the accuracy of the citizen science data.

Data Acquisition
Once variations in lake water storage are calculated for all lakes in all regions, we calculated the Spearman's correlation coefficient (ρ) [55] of those changes between all pairs of lakes in each region. Lake pairs were simply identified as all possible combinations of study lakes within a study region. We chose to use Spearman's ρ because it is less sensitive than a Pearson's correlation to the assumption that relationships between paired lake water levels are linear. We also calculated Pearson's correlations (not reported here) and noted only minor differences with Spearman's ρ. Observations for each lake were matched by date, or ±1 day if no same-date match. If there were fewer than 10 paired measurements, a correlation coefficient was not calculated, and the lake to lake pair was not used. We used a threshold of n = 10 to avoid a case where a very small number of observations led to a spurious correlation. However, we noted that including all lakes does not substantially change our results. This analysis was repeated for all lake pairs in a region, no matter how close or distant, and completed separately for each of the four regions.

Validation
In order to evaluate whether paired lake correlations based on sparse citizen science data are useful, we compared them against paired lake correlations based on much more temporally dense pressure transducer measurements. If correlation coefficients are similar using the two different methods, then we have higher confidence in the analysis based on citizen science measurements.

Spatial Analysis
To assess whether correlations in paired lake water storage are controlled by distance, we regressed the correlation coefficients described in Section 3.4.1 against the distance between the pair of lakes associated with each correlation. We assessed whether distance is a major control on correlation using Spearman's ρ between paired lake distance and paired lake correlation coefficient.

Citizen Science Data
Comparison of lake level measurements made by citizen scientists against pressure transducers show that measurements by citizen scientists are highly accurate (Figure 8). Across 2702 corresponding lake level measurements, we observe an R 2 value of >0.99, a mean absolute error (MAE) of 1.6 cm, and a root mean squared error (RMSE) of 2.7 cm. The measurement error of the pressure transducers is 0.8 cm. There are several (n = 9) outliers in the citizen science data, and we attributed most of these to data entry errors by citizen scientists.

Lake Surface Area
Comparison against manual classifications based on high-resolution Planet imagery suggests that automated Landsat and Sentinel 2 classifications are generally accurate. For Bay Tree Lake, the average surface area delineated from Planet imagery is 98.95% of the surface area delineated from Landsat or Sentinel 2 imagery (mean absolute error (MAE): 0.066 km 2 ); for Hastings Lake, it is 80.19% (MAE: 0.059 km 2 ); and for Lake Wenatchee, it is 103% (MAE: 0.373 km 2 ). In Bay Tree Lake, the most straight-forward case, the percent difference between the Landsat or Sentinel 2 imagery and the Planet imagery is minimal. Lake Wenatchee, a more difficult case due to the topographic shadow that covers the lake at times as a result of the lake's location in the Cascade Mountains, has a slightly larger percent difference. The values of topographically shadowed pixels in the imagery are similar to those of the water, leading to a modest overestimation of surface water in our automatic Landsat and Sentinel 2 classifications compared to Planet imagery. Hastings Lake, our representative small lake case, has the greatest percent difference. We see an underestimation of lake surface area in our automated classifications. This is most likely due to the coarse spatial resolution of the imagery compared to the lake area, especially in the case of Landsat. In general, we see a larger difference between Landsat and Sentinel 2 derived areas in small lakes because of Sentinel 2's higher spatial resolution. This

Lake Surface Area
Comparison against manual classifications based on high-resolution Planet imagery suggests that automated Landsat and Sentinel 2 classifications are generally accurate. For Bay Tree Lake, the average surface area delineated from Planet imagery is 98.95% of the surface area delineated from Landsat or Sentinel 2 imagery (mean absolute error (MAE): 0.066 km 2 ); for Hastings Lake, it is 80.19% (MAE: 0.059 km 2 ); and for Lake Wenatchee, it is 103% (MAE: 0.373 km 2 ). In Bay Tree Lake, the most straight-forward case, the percent difference between the Landsat or Sentinel 2 imagery and the Planet imagery is minimal. Lake Wenatchee, a more difficult case due to the topographic shadow that covers the lake at times as a result of the lake's location in the Cascade Mountains, has a slightly larger percent difference. The values of topographically shadowed pixels in the imagery are similar to those of the water, leading to a modest overestimation of surface water in our automatic Landsat and Sentinel 2 classifications compared to Planet imagery. Hastings Lake, our representative small lake case, has the greatest percent difference. We see an underestimation of lake surface area in our automated classifications. This is most likely due to the coarse spatial resolution of the imagery compared to the lake area, especially in the case of Landsat. In general, we see a larger difference between Landsat and Sentinel 2 derived areas in small lakes because of Sentinel 2's higher spatial resolution. This difference is generally negligible. The 209 coincident Landsat and Sentinel 2 observations that occurred during our study period show a mean difference of −0.8% (σ = 5%) in their predicted areas. We note that because the underestimation typical of small lakes is relatively consistent in time, it will likely have a small impact on the time series of water storage used for lake correlations and will, instead, represent a systematic offset. We also note that, during our study period, we observed minimal change in lake area in all study lakes. Thus, in the cases studied here, changes in lake volume are almost entirely driven by variations in water surface elevation.

Lake Volume Correlations
In order to test propagation of errors in lake heights from citizen scientists into lake volume calculations, we compared the correlation coefficients of change in lake volume between pairs of lakes derived from the citizen science measurements against those calculated from the automated water level loggers. We see that they compare very well and are similar, though not identical (Figure 9). The Pearson's correlation coefficient is 0.76 between paired lake correlation coefficients from the two different data sources. That, plus the relative similarity of the best fit and one-to-one lines, suggests that, in general, correlation coefficients from temporally sparse citizen science data are likely reliable. The average difference in correlation coefficients calculated for the same pair of lakes using data from citizen scientists and from pressure transducers is 0.23. There is no relationship between the number of measurements collected and the difference between citizen sciencebased and pressure transducer-based correlation coefficients in any of the lake regions.

. Lake Volume Correlations
In order to test propagation of errors in lake heights from citizen scientists into lake volume calculations, we compared the correlation coefficients of change in lake volume between pairs of lakes derived from the citizen science measurements against those calculated from the automated water level loggers. We see that they compare very well and are similar, though not identical (Figure 9). The Pearson's correlation coefficient is 0.76 between paired lake correlation coefficients from the two different data sources. That, plus the relative similarity of the best fit and one-to-one lines, suggests that, in general, correlation coefficients from temporally sparse citizen science data are likely reliable. The average difference in correlation coefficients calculated for the same pair of lakes using data from citizen scientists and from pressure transducers is 0.23. There is no relationship between the number of measurements collected and the difference between citizen science-based and pressure transducer-based correlation coefficients in any of the lake regions. Figure 9. Correlation coefficients between lake pairs in a region of the citizen-scientist-based change in volume versus the correlation coefficients between lake pairs in a region of the pressuretransducer-based change in volume. A strong correlation between the two is seen. The dashed line is a one-to-one line and the blue line is the line of best fit.

Variations in Lake Water Storage
4.2.1. Regional Coherence of Changes in Lake Water Volume On average, water storage variations in pairs of lakes within each of the study regions are positively correlated (average = 0.48, Figure 10). The distributions of correlations are similar for all four regions, though interestingly they are somewhat higher in Washington, the only region with notable topography, and lowest in North Carolina and Wisconsin.

Regional Coherence of Changes in Lake Water Volume
On average, water storage variations in pairs of lakes within each of the study regions are positively correlated (average ρ = 0.48, Figure 10). The distributions of correlations are similar for all four regions, though interestingly they are somewhat higher in Washington, the only region with notable topography, and lowest in North Carolina and Wisconsin. There is a substantial spread in the degree of correlation, with some pairs of lakes highly correlated and others uncorrelated.

Correlations and Relationship with Distance
There is a weak negative relationship between distance and paired lake correlation in two regions (Washington and Illinois) and no significant relationship in the other two regions (North Carolina and Wisconsin) (Figure 11b-e). When lakes in all regions are combined, there is a weak but statistically significant negative relationship between distance and paired lake correlation (Figure 11a).

Correlations and Relationship with Distance
There is a weak negative relationship between distance and paired lake correlation in two regions (Washington and Illinois) and no significant relationship in the other two regions (North Carolina and Wisconsin) (Figure 11b-e). When lakes in all regions are combined, there is a weak but statistically significant negative relationship between distance and paired lake correlation (Figure 11a).
We tested whether the correlations with distance are influenced by the number of paired measurements and found no relationship in all of the regions combined or in any region individually. We also tested whether the variability in paired lake correlation increases with paired lake distance. There is a slight but significant (p < 0.01) negative relationship in North Carolina and a positive relationship in Illinois. Results for other regions, and for all regions together, show no significant relationship.

Correlations and Relationship with Distance
There is a weak negative relationship between distance and paired lake correlation in two regions (Washington and Illinois) and no significant relationship in the other two regions (North Carolina and Wisconsin) (Figure 11b-e). When lakes in all regions are combined, there is a weak but statistically significant negative relationship between distance and paired lake correlation (Figure 11a). Figure 11. Relationship between the Spearman's correlation coefficient (ρ) of change in lake water storage between paired lakes in a region and the distance between those two lakes: all regions (a), Figure 11. Relationship between the Spearman's correlation coefficient (ρ) of change in lake water storage between paired lakes in a region and the distance between those two lakes: all regions (a), North Carolina (b), Illinois (c), Washington (d), and Wisconsin (e). Overall, there is a negative linear relationship between correlation and distance. This trend is different region by region.

Discussion
First and foremost, this study demonstrates that two still-evolving approaches, citizen science and optical remote sensing of lake area, can be combined to accurately monitor changes in lake water storage over time. Citizen science lake level data are nearly as accurate as pressure transducer data, with the primary difference being that the pressure transducer data capture a near-continuous time series. The lake surface areas automatically calculated from Landsat and Sentinel 2 imagery are quite similar to the areas manually calculated from Planet imagery in most cases. For simple cases, like Bay Tree Lake (Figure 5a), lake surface areas compare very well to manually digitized maps made from same-day, high-resolution Planet imagery. For cases where topographic shadows are important, like Lake Wenatchee (Figure 5c), lake surface area is overestimated in the automated classifications, likely due to topographic shadow pixels misclassified as water. For small lakes, like Hastings Lake (Figure 5b), we see an underestimation of lake surface area in our automated classifications, likely due to the fact that a large portion of the lake is covered by mixed water-land pixels. In general, the Planet imagery has a finer spatial resolution than that of the Landsat and Sentinel 2 imagery, meaning that Planet imagery can capture more heterogeneity and detect subtle changes in inundation extent not apparent in Landsat or Sentinel 2 imagery [11]. It is important to note that there could still be errors in the Planet surface area delineations due to user error. Ultimately, there is no available dataset that represents absolute truth when measuring surface area through satellite imagery [51]. Overall, our results suggest that citizen science data and optical satellite imagery accurately capture changes in lake water storage.
The study presented here begins to provide a baseline in the level of correlation we may expect to see in the changes of lake water storage between lakes in a region, at least on timescales of 1-2 years. Perhaps surprisingly, the average correlation between pairs of lakes within our four study regions is only moderately positive (ρ = 0.48) and highly variable. This result suggests that measurements made in one lake provide some regional information, but, at least in the regions we examined, there is also a great deal of lake-to-lake variability in water storage time series. This result contrasts with findings by Watras et al. (2014) [10], in four lakes in northern Wisconsin, who found that, over the last 70 years, water level variations were controlled by a near decadal oscillation (~13 years), independent of lake or aquifer hydrology and that the oscillation has been driven by net atmospheric flux as well as level-dependent outflow. In contrast, Euliss and Mushet (1996) [9], who studied the controlling factors of 36 small wetland waterbodies in the prairie-pothole region in North and South Dakota, found that during one spring and summer season, water level fluctuations were influenced by lake-to-lake variability. Variations were greater in wetlands with more agricultural activity, rather than grasslands, and water level fluctuations in semi-permanent wetlands were more stable due to the inputs of groundwater, whereas seasonal and temporary wetlands solely depended on runoff. Perales et al. [56] studied changes in lake water storage across 47 lakes in Northern Wisconsin during a five-year drought between 2005 and 2010. They found that most lakes lost water during this period, but the degree of loss varied considerably; much of this variability could be explained by lake-specific factors, including soil characteristics, elevation, the presence of littoral wetlands, and whether the lake was a drainage or seepage lake. Correlations between surface areas in small lakes and their sub-seasonal dynamics have also been studied by Cooley et al. (2019) [11] across Alaska and the Canadian Shield. They noted that, as a whole, lake surface area declined across all study regions, with some distinct localized exceptions, which primarily reflected negative net summer atmospheric flux in North America in 2017. They also note that changes in surface area are primarily driven by lake level fluctuation, not shoreline vegetation growth. As the body of work exploring controls on lake storage fluctuations remains limited, further work needs to be done to determine what localized and regional factors control observed patterns. Unlike Watras et al. [10], our study spans up to two years, and hence we do not capture decadal-scale patterns. The patterns we see are most likely yearly cyclical patterns and sub-seasonal dynamics, like in Cooley et al. [11], or localized weather events and also likely represent a small fraction of the total change in storage lakes would experience at longer time scales. A future study would benefit from at least 10 years of data, which would allow better comparison to long-term studies such as the work of Watras et al. [10].
We observed only a limited correlation between paired lake Spearman's ρ and paired lake distance. The hypothesis that as the distance between lakes increases, the correlation of change in lake water storage between those two lakes decreases is an intuitive one. However, the results from this study provide only limited evidence for this relationship. When aggregated together across all regions, our data show a significant but weak correlation (ρ = −0.26). Two regions, Illinois and Washington, show slightly stronger negative relationships between correlation and distance (ρ = −0.37 and −0.54, respectively), while North Carolina and Wisconsin show no significant relationship between correlation and distance. We also tested for any relationship between the variation in paired lake correlation (as measured by the absolute value of residuals) and distance but found no statistically significant relationship in any region. All of this suggests that, when thinking about the spatial synchrony between the change in lake water storage of the lakes, there are other factors in play that are more important than distance [57].
With such limited evidence supporting distance as a primary driver of the spatial synchrony of lake volume change, other local factors must be in play that matter more in governing lake volume change. Watras et al. [10] observed stronger lake-to-lake correlations than we did, which may reflect the possibility that regional drivers are more important at longer time scales. However, they also examined a much smaller number of lakes (focused mostly on one pair in Northern Wisconsin, along with the Great Lakes and groundwater well records). The Perales study referenced above [56] from Northern Wisconsin found that lake and watershed characteristics (e.g., seepage versus drainage lakes, conductivity, elevation, soil permeability, percent wetland buffer) were good predictors of water level change, with very different magnitudes of response to drought among some neighboring lakes. In addition, the lakes in our study experienced some major and somewhat unusual events during the study period-for example, the large spike in water levels in Salters Lake (Figure 3) is the result of Hurricane Florence, which impacted some North Carolina lakes much more than others, pointing to the local influence of the storm occurring at shorter time scales. Our results suggest that, at least in the places and over the time periods studied here, both regional and lake-specific factors control variations in lake water storage.
While the results of this study represent the first analysis of lake-to-lake water storage correlation across many lakes, there is still a great deal that we do not understand about the drivers of lake water levels. Exploring what these drivers might be is of great interest but is beyond the scope of this paper. Many relevant questions will be addressed after the launch of the upcoming SWOT satellite mission in 2022 [58]. Since SWOT will measure lakes as small as 250 m by 250 m, it will provide a time series of variations in water storage for millions of lakes globally. Continuing measurements from LOCSS and other sources based on citizen science and satellite imaging will provide key ground-truthing data for SWOT, especially in areas where on-the-ground measurements are limited.
Overall, this study addresses a gap in understanding regional patterns of lake water storage. We suggest that change in lake water volume in small lakes can be monitored accurately by combining measurements of water level from citizen scientists with optical satellite imaging of lake area. Assessment of relationships in water volume between pairs of lakes, on average, produce moderate correlations, suggesting that both regional patterns and lake-specific factors are important drivers of variations in lake water storage. To our knowledge, this is the first study to assess the degree of correlation in water storage time series among many small lakes in different regions. It presents the first results that can guide future hypotheses and data collection efforts. We hope that continued data collection by citizen scientists and satellites can further our understanding of natural lakes, including the usually overlooked small ones, especially in a time of rapid environmental change.