A High-Resolution Airborne Color-Infrared Camera Water Mask for the NASA ABoVE Campaign

: The airborne AirSWOT instrument suite, consisting of an interferometric Ka-band synthetic aperture radar and color-infrared (CIR) camera, was deployed to northern North America in July and August 2017 as part of the NASA Arctic-Boreal Vulnerability Experiment (ABoVE). We present validated, open (i.e., vegetation-free) surface water masks produced from high-resolution (1 m), co-registered AirSWOT CIR imagery using a semi-automated, object-based water classiﬁcation. The imagery and resulting high-resolution water masks are available as open-access datasets and support interpretation of AirSWOT radar and other coincident ABoVE image products, including LVIS, UAVSAR, AIRMOSS, AVIRIS-NG, and CFIS. These synergies o ﬀ er promising potential for multi-sensor analysis of Arctic-Boreal surface water bodies. In total, 3167 km 2 of open surface water were mapped from 23,380 km 2 of ﬂight lines spanning 23 degrees of latitude and broad environmental gradients. Detected water body sizes range from 0.00004 km 2 (40 m 2 ) to 15 km 2 . Power-law extrapolations are commonly used to estimate the abundance of small lakes from coarser resolution imagery, and our mapped water bodies followed power-law distributions, but only for water bodies greater than 0.34 ( ± 0.13) km 2 in area. For water bodies exceeding this size threshold, the coe ﬃ cients of power-law ﬁts vary for di ﬀ erent Arctic-Boreal physiographic terrains (wetland, prairie pothole, lowland river valley, thermokarst, and Canadian Shield). Thus, direct mapping using high-resolution imagery remains the most accurate way to estimate the abundance of small surface water bodies. We conclude that empirical scaling relationships, useful for estimating total trace gas exchange and aquatic habitats on Arctic-Boreal landscapes, are uniquely enabled by high-resolution AirSWOT-like mappings and automated detection methods such as those developed here.


Introduction
Accurate mapping of terrestrial surface water bodies is necessary for understanding the hydrologic cycle, energy and biogeochemical cycles, aquatic habitats, and improving earth system models [1].
Spatially, the world's greatest density of surface water bodies is in the Arctic-Boreal regions, and these water bodies change in extent based on climate variability and permafrost presence [2][3][4][5].Current water maps are derived from lake censuses [6,7] and satellite remote sensing [8][9][10].Large-scale satellite-based studies typically produce datasets with 30 m resolution, which limit the observable lakes to 0.002 km 2 (2000 m 2 ) or larger [6,8].Increasingly, the surface water area distribution has been shown to be a dynamic variable, and some satellite products contain multi-temporal information at both coarse global [11] and detailed regional scales [12].As it stands, there is no one surface water map suitable for all spatial and temporal scales, and regional maps remain crucial for detailed hydrologic studies.
Regional studies have used high-resolution remote sensing data (<6 m resolution) and ground-based methods to map lakes as small as 0.0001 km 2 (100 m 2 ) [13][14][15].However, broad-scale mappings of water bodies <0.001 km 2 are limited and often focused on constructed farm ponds in temperate climates [14,16,17].This limitation impedes scientific understanding of freshwater resources for at least three reasons: (1) the total count of small (<0.001 km 2 ) water bodies is known to be large but remains highly uncertain, especially in Arctic-Boreal regions [8], (2) lakes, ponds and wetlands are often a net source of greenhouse gases, and their extent is used as inputs to climate models, although it is uncertain [18][19][20], and (3) these water bodies are important for ecosystem services such as nutrient processing and biodiversity [13,21,22].
In the absence of global, high-resolution maps, area scaling is often applied to estimate the abundance of small water bodies and, subsequently, total surface water area [8,23,24].Especially at high resolution, it is hard to define a water body, as they are often ephemeral and can merge and sub-divide as water tables rise and fall [15].A power-law scaling relationship (Equation (1)) has frequently been invoked to extrapolate lake areas for unmapped lakes [7,16,20,23,25,26], although the scale over which power-laws can be used is unclear.The form of the power-law equation is: where A is water body area, P(A) is exceedance probability, the number of water bodies with area > A, and C and α are empirically-derived fitting coefficients.For example, Downing et al. [23] used a Pareto distribution, a form of a power-law with a horizontal shift, to model global lake areas larger than 0.1 km 2 .More recent studies, using higher-resolution data, found that other distributions may be better suited to modeling small lakes [13,24] and that there is no regionally-consistent distribution [14,15].Extrapolating with a lognormal distribution, for example, would imply 10-100 times fewer lakes, decreasing the global surface area and the number of small water bodies [24].New, Landsat-based observations of global river area, which exhibits similar area-abundance relationships, are between 15% and 59% greater than scaling-based estimates [25,27,28].Thus, while models of lake area distributions derived from coarse-resolution data may appear scale-invariant, including smaller water bodies demonstrates the contrary.This difference underscores the need for high-resolution surface water and wetland mapping.
To that end, we have produced a conservative, high-resolution map of open water bodies greater than 40 m 2 using airborne imagery.Data were collected from a color-infrared (CIR) camera included in the AirSWOT Ka-band radar instrument suite, designed to measure water surface elevation (WSE).These flights were part of the 2017 NASA Arctic-Boreal Vulnerability Experiment (ABoVE) campaign, which aims to quantify the vulnerability and resilience of North American arctic ecosystems and society, focusing on surface water, permafrost, carbon and disturbance [29].There are existing surface water maps covering the ABoVE spatial domain derived from satellite observations, but they are much coarser (30-5000 m 2 pixel size) than airborne data [30,31].The only broad-scale study of lake distribution at high resolution (<6 m pixel size) covers 923 km 2 [14,32] with limited coverage of the ABoVE domain.Our product was acquired over 23,380 km 2 of Arctic-Boreal land and water, including ground areas imaged within days or weeks of other ABoVE airborne sensors.Its broad extent and high spatial resolution produce a unique water map for the ongoing ABoVE campaign and allow us to test the effects of physiography on the water body distribution.
Table 1.Study regions, acquisition dates, and coverage areas of 2017 ABoVE AirSWOT color-infrared camera image acquisitions.The designation "river valley" refers to sorties roughly following a lowland river or river valley.The designation "wetland" refers to Arctic-Boreal wetlands (including river deltas), and "pothole" refers to glacially formed, prairie pothole lakes.[33], covering broad physiographic gradients and some of the most waterrich regions in the world.

Image Acquisition
The 16-megapixel (MP) CIR camera was mounted in a NASA B200 King Air and enclosed in a box fitted with a 16-inch diameter glass window.Mapping flights occurred at roughly 8-11 km altitude with a target along-track overlap of 60% and variable across-track overlap.For flights prior  [33], covering broad physiographic gradients and some of the most water-rich regions in the world.

Image Acquisition
The 16-megapixel (MP) CIR camera was mounted in a NASA B200 King Air and enclosed in a box fitted with a 16-inch diameter glass window.Mapping flights occurred at roughly 8-11 km altitude with a target along-track overlap of 60% and variable across-track overlap.For flights prior to 29 July 2017, data were collected with a fixed-length, 60 mm lens with a 34 • field-of-view.Individual raw image dimensions were 4072 × 4072 pixels before mosaicking.For flights after 29 July 2017, data were collected with a fixed-length 80 mm lens with a 25 • field-of-view.The auxiliary shutter on these cameras sometimes malfunctioned, resulting in only partial shutter opening, which increased vignetting at the top and bottom of raw images.To minimize vignetting effects for orthomosaic generation, raw images were clipped, resulting in final dimensions of 4072 (width) × 3472 (along-track) prior to image-stitching and orthorectification.Raw images having less than 50% cloud cover were digitally stitched and orthorectified in Agisoft PhotoScan 1.3.4.Out of 22,631 raw images acquired, 7086 were used to create 38 orthomosaics, each pertaining to a specific ground area flown on a single day.On average, each ground pixel in an orthomosaic was imaged 5.7 times from different angles.The orthomosaics were resampled to 1 m pixels from a native resolution ranging from 0.92 to 1.32 m.

Image Quality
Atmospheric conditions and mechanical failures caused challenges during the 2017 AirSWOT ABoVE sorties.Approximately one-third of the final images contain some clouds or cloud shadows.There are missing data values due to insufficient photogrammetric pixel correlations between raw image overlaps, typically over clouds, uniform lakes, and moving water.Most missions were flown during morning hours, where shifting solar conditions resulted in illumination inconsistencies between paths during individual missions, as well as fog.No atmospheric or illumination correction was applied, leading to subtle striping effects, with each stripe occurring along the track of a flight path.Per standard practice, the camera was integrated with the aircraft without consideration for circulation of warm air across the camera window, which resulted in ice and condensation on the lens during operation.These effects resulted in small, dark spots over some areas (appearing as roughly 10 m 2 patches on the ground).Despite these limitations and lack of radiometric calibration, the imagery is suitable for mapping fine-scale water bodies and other features on the land surface.

Geolocation Correction
The CIR orthomosaics were initially georeferenced using aircraft positional data (IMU and GPS).Initial orthomosaics had geolocation offsets ranging from 0-120 m between identical ground points imaged on different days.These offsets were resolved by manually georeferencing 29 orthomosaics using 303 user-selected ground control points (GCPs).The remaining nine orthomosaics had negligible registration errors, so no correction was needed.GCPs were manually digitized from the Digital Globe EV-WHS image service, which provides orthorectified imagery with spatial resolutions of ≤1 m [35].The four operational Digital Globe satellites [36] used as input to the service have a 90th percentile geolocation error ranging from 3.0 m (GeoEye-1 satellite) to 6.5 m (Worldview-1 satellite) [36].GCPs were chosen as persistent landscape features identifiable in both the orthomosaics and the Digital Globe image service, including road intersections, tree stand boundaries, and shorelines from water bodies having stable sizes and shapes.Images were warped using a 1st-order polynomial (affine) transformation in ArcMap 10.6, and the average and root-mean-squared average distances between the source and map GCPs were computed.If they differed by more than 20%, the image was split into two or more parts and the warp re-applied using the corresponding subsets of GCPs.These operations were performed on the original orthomosaics using the same geographic coordinate system as the DigitalGlobe service (WGS-84).
After georeferencing, 90% of GCPs deviated from DigitalGlobe by ≤13.3 m (13.3 pixels) (Figure 2).Including the accuracy of DigitalGlobe, the horizontal accuracy of the CIR imagery and masks is 19.8 m or better for 90% of the images.The 38 orthomosaics were then projected to Canada Equal Albers Conic and split to the ABoVE grid C [30], yielding 330 orthomosaic images in geotiff (.tif) format.Individual image accuracies are included in the data sets [37].

Open Water Classification
We designed a semi-automated image classifier to identify open water unobscured by aquatic or riparian vegetation, algae, and built objects (hereafter referred to as open water).This conservative classification yields a water body product uncontaminated by vegetation or other material that could bias the water surface elevation retrievals obtained from coincident AirSWOT Ka-band interferometric radar images.This conservatism, along with synchronous acquisition with Ka-band radar imagery, makes the CIR dataset presented here optimal for AirSWOT studies of surface water extent and elevation.
In some cases, the restrictive classification causes patches of open water within a vegetated lake to appear as individual water bodies.To mitigate these effects, polygon aggregation was used to ensure each mapped open water body had a one-to-one relationship with the lake, pond, or wetland feature it represented (Section 2.5).Other effects such as clouds, shadows, and lack of atmospheric correction that complicated the optical classification are noted (along with quality metrics) for each classified image.To assess the effects of this area-measurement method on open water area, the highresolution CIR mask was compared to typical lake area measurement from the global lake database HydroLAKES [38].Open water areas were correlated (r 2 = 0.62), as determined by comparing 729 lakes greater than 0.1 km 2 with their full extents imaged by the CIR camera.In general, although the two water body definitions differ, these results still provide insight into the distribution of lake areas.

Automated Classification Steps
The open water classifier (Figure 3) is based on the Normalized Difference Water Index (NDWI) [39], a normalized ratio of the near-infrared and green bands.In some cases, classification was improved by using only the near-infrared (NIR) band instead of the NDWI.Prior to classification, each orthomosaic image was split into processing tiles of ~ 40 km 2 to conserve memory and oversegmented into pixel-cluster objects, also called superpixels.Known as object-based image classification (OBIA), this procedure is a common technique in image [40][41][42] and water [43,44] Figure 2. Geolocation errors for CIR camera imagery after manual georeferencing with Digital Globe EV-WHS image service.Geolocation errors range from undetectable to 49.8 m, with a mean value of 6.0 ± 5.7 m).Over 90% of CIR images have an RMSE ≤14.7 m relative to high-resolution Digital Globe satellite imagery.

Open Water Classification
We designed a semi-automated image classifier to identify open water unobscured by aquatic or riparian vegetation, algae, and built objects (hereafter referred to as open water).This conservative classification yields a water body product uncontaminated by vegetation or other material that could bias the water surface elevation retrievals obtained from coincident AirSWOT Ka-band interferometric radar images.This conservatism, along with synchronous acquisition with Ka-band radar imagery, makes the CIR dataset presented here optimal for AirSWOT studies of surface water extent and elevation.
In some cases, the restrictive classification causes patches of open water within a vegetated lake to appear as individual water bodies.To mitigate these effects, polygon aggregation was used to ensure each mapped open water body had a one-to-one relationship with the lake, pond, or wetland feature it represented (Section 2.5).Other effects such as clouds, shadows, and lack of atmospheric correction that complicated the optical classification are noted (along with quality metrics) for each classified image.To assess the effects of this area-measurement method on open water area, the high-resolution CIR mask was compared to typical lake area measurement from the global lake database HydroLAKES [38].Open water areas were correlated (r 2 = 0.62), as determined by comparing 729 lakes greater than 0.1 km 2 with their full extents imaged by the CIR camera.In general, although the two water body definitions differ, these results still provide insight into the distribution of lake areas.

Automated Classification Steps
The open water classifier (Figure 3) is based on the Normalized Difference Water Index (NDWI) [39], a normalized ratio of the near-infrared and green bands.In some cases, classification was improved by using only the near-infrared (NIR) band instead of the NDWI.Prior to classification, each orthomosaic image was split into processing tiles of ~40 km 2 to conserve memory and over-segmented into pixel-cluster objects, also called superpixels.Known as object-based image classification (OBIA), this procedure is a common technique in image [40][41][42] and water [43,44] classification.For the segmentation, the simple linear iterative clustering with zero parameters (SLIC0) algorithm was used [45], implemented in Matlab 9.4, with a scale (controlling the average size of a cluster) of 100 pixels (100 m 2 ).The resulting clusters ranged in size from 20 (small, uniform regions, such as ponds) to 500 pixels (no-data boundaries, featureless water patches or fields) and served as the basis for the classification.Our choice of algorithm and parameters was informed by a previous study [46] and prevents clusters from being larger than the smallest observable water bodies.
Remote Sens. 2019, 11, x FOR PEER REVIEW 7 of 28 size of a cluster) of 100 pixels (100 m 2 ).The resulting clusters ranged in size from 20 (small, uniform regions, such as ponds) to 500 pixels (no-data boundaries, featureless water patches or fields) and served as the basis for the classification.Our choice of algorithm and parameters was informed by a previous study [46] and prevents clusters from being larger than the smallest observable water bodies.Next, each processing tile was assigned an optimal global NDWI or NIR threshold as per the connectivity-preserving algorithm of O'Gorman [47].This step maximized the number of connected regions without inducing noise by selecting too many total regions.To account for images with varying backgrounds, the original algorithm outputs several intensities that can be used to binarize the tile.
Here, we modified it to produce only one output level (representing the optimal cutoff between open water and land), given an a priori range of user-selected plausible thresholds.This algorithm has previously been implemented for remote sensing of landslides [48], ocean vessels [49], and sea ice [50].
Here, we present its first application to remote sensing of inland surface waters.
A local water index threshold was computed for each potential water body in the resulting binary image [51][52][53].First, clusters with rough textures were removed, as indicated by their entropy value [54], a statistical measure of randomness.High-entropy regions generally occur along shorelines and are analogous to mixed-pixels, so this operation morphologically erodes the edges of water bodies.Next, region-growing, based on the mean and standard deviation of the local water index, was applied to each eroded region.Adjacent pixel clusters were included if their mean value fell within two standard deviations of the mean value of the region.To produce the final open water mask, data gaps surrounded by water were reclassified as water, and small regions (<40 pixels) were removed to reduce commission errors caused by tree and terrain shadows.

Manual Classification Steps and Quality Assessment
After running the open water classifier, each classified image was visually inspected for clouds, haze, or other image quality concerns that prevented accurate image classification.Two post-classification adjustments were made to address image quality.First, the processing tile aspect ratio was adjusted to values between 1:1 and 1:2 for tiles with narrow data coverage due to linear flight lines.This approach was most effective on tiles covering ~5 km 2 or less lacking representative samples of both land and water classes.Second, the a priori upper and lower NDWI bounds used for the connectivity-preserving binarization were varied between −0.72 to −0.39 (lower) and 0.48 to 0.78 (upper).This change was necessary for images containing roads and buildings, whose superpixels were morphologically connected and were, thus, preferentially selected in the connectivity-preserving binarization.These procedures were then repeated until satisfactory results were obtained.For 30 of the images, the NIR band was used for the classification instead of the NDWI in order to avoid false positives.In total, approximately half of the 330 CIR images required re-processing.

Validation of Open Water Classification
Classification error was quantified by manually digitizing water bodies and computing a confusion matrix [55].Since the imagery contained mostly land, a random pixel-based validation scheme would not have included enough water pixels.Instead, 197 open water bodies were selected from six validation areas across the study area, including one of the areas visited in-situ.Boundaries were digitized solely around open water to avoid mixed pixels, using a consistent image stretch between regions.These digitized polygons were then rasterized and confusion matrix statistics calculated on a per-pixel and per-water-body basis between the manual and classified water bodies.
The CIR open water classification was further validated against in situ shoreline surveys of 26 lakes and rivers islands.These shorelines were surveyed using a handheld Garmin eTrex GPS with horizontal accuracy ~3 m.Surveys were conducted in YFB, CSH, and PPN within 30 h of CIR image acquisition (with the exception of YFB, which was collected 73 h after the flight).In areas of emergent aquatic vegetation, the outermost wet shoreline was mapped.Although this technique did not necessarily capture open water boundaries, it provided an independent verification of the maximum shoreline extent.

Water Body Morphometric Analysis
From the classified images, open water area for each cloud-free water body was computed and stored as a shapefile (.shp) attribute.Water masks from these image files were converted to polygons with adjacent boundaries dissolved to account for water bodies split between different files.Rivers and incomplete lakes were removed by deleting polygons that intersected the study region boundaries, with the rest removed based on visual inspection.Finally, water bodies within 20 m of each other were aggregated so that those erroneously appearing discontinuous due to vegetation were counted as one, and the final polygons were stored as a single shapefile (Figure 4).The conservative 20 m distance was chosen to avoid aggregating neighboring ponds when they should be treated separately.The resulting shapefile used for analysis covers 21,644 km 2 (37.7%) of the ABoVE foundational flight areas and contains only open water bodies not connected to river networks.

Water Body Morphometric Analysis
From the classified images, open water area for each cloud-free water body was computed and stored as a shapefile (.shp) attribute.Water masks from these image files were converted to polygons with adjacent boundaries dissolved to account for water bodies split between different files.Rivers and incomplete lakes were removed by deleting polygons that intersected the study region boundaries, with the rest removed based on visual inspection.Finally, water bodies within 20 m of each other were aggregated so that those erroneously appearing discontinuous due to vegetation were counted as one, and the final polygons were stored as a single shapefile (Figure 4).The conservative 20 m distance was chosen to avoid aggregating neighboring ponds when they should be treated separately.The resulting shapefile used for analysis covers 21,644 km 2 (37.7%) of the ABoVE foundational flight areas and contains only open water bodies not connected to river networks.Since some imaged lakes were too wide to be fully observed in an AirSWOT flight line, lake polygons falling across image boundaries were fused with the HydroLAKES database [6] before analysis.This approach increased the range of documented lake sizes by an order of magnitude and removed a sample bias towards small water bodies.As the study area includes lakes as large as Point Since some imaged lakes were too wide to be fully observed in an AirSWOT flight line, lake polygons falling across image boundaries were fused with the HydroLAKES database [6] before analysis.This approach increased the range of documented lake sizes by an order of magnitude and removed a sample bias towards small water bodies.As the study area includes lakes as large as Point Lake and Eskimo Southern Lake on the Canadian Shield (977 and 843 km 2 ), this fused dataset includes lakes from all size classes intersected by 2017 AirSWOT flight lines across Arctic-Boreal Alaska and Canada.

Power-Law Scaling of Water Body Area Distributions within Physiographic Subregions
We categorized the study area into five dominant, physiographic classes (Arctic-Boreal wetlands, Canadian shield, lowland river valleys, prairie pothole lakes, and thermokarst) and 13 smaller sub-regions (SAG, YFB, OCF, MRD, TKP, MRV, CSM, CSH, SLR, PAD, ATR, PPN, PPS, Figure 1 and Table 1).This categorization enables statistical comparisons of water body morphology and area distribution across a variety of typical Arctic-Boreal landscapes and spatial scales.A summary of the acquisition dates, physiographic categorization, and flight line areas is presented in Table 1.
To analyze the scaling behavior of mapped water bodies within these various sub-regions, we tested the hypothesis that surface water body area distributions follow a power-law [23].Maximum-likelihood estimation (MLE) was used to estimate the most probable value of α and the most probable onset (if any), A 0 , of power-law behavior in the area distribution.The MLE estimation was performed for each sub-region (n = 13), physiographic terrain class (n = 5), and over the entire dataset [56][57][58].To understand how the choice of sub-region boundary might have influenced sampling, a bootstrap method [59] was used to estimate p-values and one-standard-deviation confidence intervals for α and A 0 as follows: 1000 subsets for A > A 0 were generated using the MLE estimates of α.
Finally, p-values for α were estimated using the Kolmogorov-Smirnov (KS) statistic [56].Each MLE estimate of α was used to construct 1000 random distributions of P(A) for A > A 0 , and p-values were estimated as the fraction of those distributions with KS statistics that were further away from the true power-law distribution.We defined statistical significance as p < 0.1, meaning mapped water body size-distributions were further from a true power-law than 10% of the synthetic datasets and the non-power-law null hypothesis was supported.This procedure was repeated with A 0 set to the minimum water body area to test for power-law behavior across the full range of A.

Validation of the Open Water Classification
As expected, the open water classifier performed best on CIR images free from clouds, haze, shadows, and mosaicking artifacts.From manual inspection (Section 2.3.1),features including roads, buildings, shadows, and agricultural fields were sometimes incorrectly classified as water.PPN and PPS had the highest commission error due to human-made features, whereas CSH and CSM had the highest commission error due to tree-shadowing.Omission error can most often be attributed to the spectral properties of turbid water (e.g., sediment-rich rivers resembling the spectrum of concrete), especially if found within the same scene as clear water.The classification performed poorly over industrial ponds in the oil sand extraction regions of northern Alberta due to their extremely high turbidity.Misclassification of gravel bars was mitigated through the use of O'Gorman's connectivitypreserving algorithm [47], which performed better in an object-based framework than the popular Otsu method of maximizing between-class variance [60] (Figure 5).The classifier also did not perform well on narrow streams (<15 m wide), likely due to overhanging riparian vegetation and a high percentage of mixed pixels.However, remaining streams were manually removed and not used for the area scaling analysis (Section 2.4).
The 197 open water bodies used for manual digitization have a total area of 3.09 km 2 .The equivalent areas in the classified product contain 200 water bodies, with a total area of 3.34 km 2 .Thus, the classifier demarcated more water body area (7.7% difference), and slightly more water bodies overall (1.5% difference).The overall accuracy is 98.0%, with user's and producer's accuracies of 87.1% and 94.0%, respectively, and a kappa coefficient of 89.3% (Tables 2 and 3).As noted in Section 2.4, validation regions were digitized extremely conservatively, so the 7.7% extra classified areas represent an upper bound.

CIR Camera Water Body Classification Summary Statistics
The CIR open water classification identified 43,562 water bodies with areas ranging from 40 m 2 to 15 km 2 .The total open water extent mapped (F water ) is 2885 km 2 , or 12.34% of the 23,380 km 2 study area, varying from 2.57% in SAG to 37.60% in MRD.The portion of the landscape occupied by lakes and wetlands (i.e., with rivers removed, F lakes ) is 7.71%, varying from 0.36% in SLR to 23.3% in CSH.The median water body area (A med ) is 0.0007 km 2 (700 m 2 ), varying from 0.0003 km 2 (300 m 2 ) in SAG to 0.008 km 2 (8000 m 2 ) in TKP.These results show considerable variation in morphometry between sub-regions, largely driven by physiographic setting.
These data are presented as 330 raster files, 236 of which are free from significant classification errors due to clouds or haze.Also included in the dataset are: (1) a shapefile with attributes denoting geolocation accuracy, classification quality and the presence of clouds or haze; and (2) a shapefile containing classified water bodies, with areas given as an attribute.Data are freely available for download via the Oak Ridge National Laboratory (ORNL) Distributed Active Archive Center (DAAC) for Biogeochemical Dynamics at https://doi.org/10.3334/ORNLDAAC/1707.

Area Distributions of Mapped Water Bodies
Water body area distributions confirm the abundance of small water bodies (Figure 7), particularly those with areas <0.001 km 2 , which is the lower limit of current global studies [8,15,26].Of the 43,562 classified water bodies, 56.2% are <0.001km 2 , comprising 3.46 km 2 or 0.12% of total open water area (Table 4).The fraction of water bodies under this threshold (F 0.001 ) is 56% and ranges from 29% in TKP to 77% in ATR.
Table 4. Regional distribution statistics for 43,562 mapped water bodies.N is the number of water bodies in the region, A med is the median area, F water and F lakes are the fractions of area covered by open water and lakes/ponds, F 0.001 is the fraction of water bodies of areas <0.001 km 2 in area, A 0.001 is the fractional contribution to total area from water bodies <0.001 km 2 , A 0 is the water body area that produces the best goodness-of-fit for a power-law distribution based on water bodies of greater or equal-area (standard deviation σ A0 ), α is the power-law exponent taken for areas A 0 and greater (standard deviation σ α ), and p is the p-value from the power-law fitting test, considered for water bodies larger than A 0 .Regions having p values less than 0.1 (in bold) are rejected as power-law distributions.Most sub-regions have scale-invariant tails to their area distributions (Table 4), as shown by the linear portions of their cumulative distribution functions (CDFs, Figures 8 and 9).A power-law distribution is plausible for water bodies >0.34 ± 0.13 km 2 across the entire study region, representing 1.5% to 3.2% of water bodies by count and 89.6% to 93.4% by area.Scaling estimates pertaining to the remaining water bodies would thus, remain highly uncertain.Regionally, all categories except for the prairie pothole region PPN (p = 0.00) could fit a power-law for A0 ranging from 0.00050 (SAG) to 0.54 km 2 (PPS).Although PPN appears linear on a log-log plot, its much larger sample size (1-2 orders of magnitude larger than the other sites) provides more data points for the maximum-likelihood estimation (MLE) to rule out the power-law hypothesis, so this finding should be interpreted with caution.The hypothesis is plausible for other sub-regions, with the onset of power-law scaling A0 ranging from 350.5 m 2 (ATR, 99.18 to 100.00% of water bodies) to 0.54 km 2 (PPS, 59.16 to 64.5%).Finally, CSH is indistinguishable from a power-law in the tail end of the distribution (p = 1.00).The power-law scaling exponent α varies from 1.59 (CSM) to 2.51 (YFB), with a value of 1.89 for the entire study area.The variability in A0, α, and p suggest that a single power-law distribution does not apply to the areas classified here, as supported by the non-linear curves in Figure 8.Most sub-regions have scale-invariant tails to their area distributions (Table 4), as shown by the linear portions of their cumulative distribution functions (CDFs, Figures 8 and 9).A power-law distribution is plausible for water bodies >0.34 ± 0.13 km 2 across the entire study region, representing 1.5% to 3.2% of water bodies by count and 89.6% to 93.4% by area.Scaling estimates pertaining to the remaining water bodies would thus, remain highly uncertain.Regionally, all categories except for the prairie pothole region PPN (p = 0.00) could fit a power-law for A 0 ranging from 0.00050 (SAG) to 0.54 km 2 (PPS).Although PPN appears linear on a log-log plot, its much larger sample size (1-2 orders of magnitude larger than the other sites) provides more data points for the maximum-likelihood estimation (MLE) to rule out the power-law hypothesis, so this finding should be interpreted with caution.The hypothesis is plausible for other sub-regions, with the onset of power-law scaling A 0 ranging from 350.5 m 2 (ATR, 99.18 to 100.00% of water bodies) to 0.54 km 2 (PPS, 59.16 to 64.5%).Finally, CSH is indistinguishable from a power-law in the tail end of the distribution (p = 1.00).The power-law scaling exponent α varies from 1.59 (CSM) to 2.51 (YFB), with a value of 1.89 for the entire study area.The variability in A 0 , α, and p suggest that a single power-law distribution does not apply to the areas classified here, as supported by the non-linear curves in Figure 8.Based on the power-law exponent α, the regions analyzed generally fall into two end-members, with wetlands and thermokarst having the greatest α values (1.94 to 2.04) and pothole lakes having lower α values (1.51) and thus a flatter distribution.However, values between sub-regions PPN and PPS differed by 43%, implying contamination from other types of driving physiographic factors.This contamination was most prevalent in the larger and more heterogeneous PPN region, which transitions from sparsely populated boreal forest in the north to agricultural regions with constructed lakes in the south.The high-α end members make sense since the two thermokarst regions, TKP and OCF, contain seasonal tundra wetlands.The remaining regions, Canadian Shield and lowland river valleys, had intermediate α values (1.77 to 1.91).Coarse-scale water body maps in regions with large α values would be most affected, as they have a greater proportion of small water bodies, and, indeed, wetland regions have both high α values and a high A0.The two prairie pothole sub-regions in aggregate (PPN and PPS, Figure 9) and PPN individually did not fit a power-law distribution (p = 0.00), and they had among the highest percentage of small water bodies, making their abundance difficult to estimate from coarse-scale maps.In contrast, shield lakes very likely fit a power-law distribution (p = 1.00) and had low A0 values, implying that coarse-scale information in these regions could be sufficient to estimate water body abundance.Based on the power-law exponent α, the regions analyzed generally fall into two end-members, with wetlands and thermokarst having the greatest α values (1.94 to 2.04) and pothole lakes having lower α values (1.51) and thus a flatter distribution.However, values between sub-regions PPN and PPS differed by 43%, implying contamination from other types of driving physiographic factors.This contamination was most prevalent in the larger and more heterogeneous PPN region, which transitions from sparsely populated boreal forest in the north to agricultural regions with constructed lakes in the south.The high-α end members make sense since the two thermokarst regions, TKP and OCF, contain seasonal tundra wetlands.The remaining regions, Canadian Shield and lowland river valleys, had intermediate α values (1.77 to 1.91).Coarse-scale water body maps in regions with large α values would be most affected, as they have a greater proportion of small water bodies, and, indeed, wetland regions have both high α values and a high A 0 .The two prairie pothole sub-regions in aggregate (PPN and PPS, Figure 9) and PPN individually did not fit a power-law distribution (p = 0.00), and they had among the highest percentage of small water bodies, making their abundance difficult to estimate from coarse-scale maps.In contrast, shield lakes very likely fit a power-law distribution (p = 1.00) and had low A 0 values, implying that coarse-scale information in these regions could be sufficient to estimate water body abundance.This analysis has a strong dependence on how sub-regions are constructed; thus, distributions at three spatial scales were analyzed (Section 2.5).In general, as the number of water bodies grows, p decreases, although sample size only explains a small percentage of the variation (r 2 = 0.11, Figure A2).This effect results from smaller regions being more likely to have an unvarying distribution across their spatial domain.Spatial variability within regions could cause the slope breaks seen in the category-scale plots (Figure 8) and might also explain why the power-law hypothesis was rejected for PPN.This sub-region is 1-2 orders of magnitude larger than the others and includes numerous agricultural ponds and impoundments, as well as glacially-formed prairie pothole lakes.The variance among the fitted α and A0 parameters indicates that sub-region divisions captured boundaries between hydrologic landscapes.Thus, there is a preferential scale for power-law fitting analyses, determined by landscape physiography, and biases can be reduced by analyzing multiple scales to prevent arbitrary sub-region divisions from impacting results.[11,61,62] and area distribution scaling [6,23,63] using spaceborne and airborne color-infrared digital imagery by: (1) advancing a new, high-resolution AirSWOT color-infrared (CIR) camera dataset and open water classification for the NASA Arctic-Boreal Vulnerability Experiment (ABoVE); (2) applying the connectivity-preserving binarization technique [47] and an object-based framework to classify open water in the CIR dataset;

This study builds on prior studies of open water classification
(3) using the classification to map small (<0.001 km 2 ) water bodies over physiographic gradients; and (4) testing whether the mapped water body area distributions display scale-invariant power-law behavior, with a focus on selecting the appropriate area domain.Each of these four contributions is discussed next.This analysis has a strong dependence on how sub-regions are constructed; thus, distributions at three spatial scales were analyzed (Section 2.5).In general, as the number of water bodies grows, p decreases, although sample size only explains a small percentage of the variation (r 2 = 0.11, Figure A2).This effect results from smaller regions being more likely to have an unvarying distribution across their spatial domain.Spatial variability within regions could cause the slope breaks seen in the category-scale plots (Figure 8) and might also explain why the power-law hypothesis was rejected for PPN.This sub-region is 1-2 orders of magnitude larger than the others and includes numerous agricultural ponds and impoundments, as well as glacially-formed prairie pothole lakes.The variance among the fitted α and A 0 parameters indicates that sub-region divisions captured boundaries between hydrologic landscapes.Thus, there is a preferential scale for power-law fitting analyses, determined by landscape physiography, and biases can be reduced by analyzing multiple scales to prevent arbitrary sub-region divisions from impacting results.

Discussion
This study builds on prior studies of open water classification [11,61,62] and area distribution scaling [6,23,63] using spaceborne and airborne color-infrared digital imagery by: (1) advancing a new, high-resolution AirSWOT color-infrared (CIR) camera dataset and open water classification for the NASA Arctic-Boreal Vulnerability Experiment (ABoVE); (2) applying the connectivity-preserving binarization technique [47] and an object-based framework to classify open water in the CIR dataset; (3) using the classification to map small (<0.001 km 2 ) water bodies over physiographic gradients; and (4) testing whether the mapped water body area distributions display scale-invariant power-law behavior, with a focus on selecting the appropriate area domain.Each of these four contributions is discussed next.

Utility of CIR Open Water Classifications for the SWOT Satellite Mission
AirSWOT was developed as a calibration-validation platform for the forthcoming NASA Surface Water and Ocean Topography (SWOT) satellite mission (https://swot.jpl.nasa.gov/),anticipated to launch in 2021 [64,65].A key SWOT mission objective is to obtain water surface elevation (WSE) to ±10 cm vertical accuracy for 1 km 2 open water regions [66].Previous work used CIR-derived water masks and spatially-averaged AirSWOT Ka-band interferometric radar data to estimate AirSWOT WSE accuracies of ~9-10 cm in rivers and ~21 cm in lakes [53,[67][68][69].The conservative AirSWOT color-infrared (CIR) water mask provided here should enable unambiguous extraction of open water pixels used for spatial averaging of AirSWOT interferometric radar data, and, therefore, improve estimates of WSE accuracy (Figure 10).CIR imagery would also help identify phenomenology issues that should be excluded from spatial averaging of AirSWOT radar data, for example, exclusion of wet sediment bars as noted in Section 3.1, which could introduce height errors if erroneously classified as open water.Similarly, the presence of emergent aquatic vegetation may be identified by overlaying AirSWOT backscatter on CIR imagery.In general, the synchronicity and co-registration between the AirSWOT CIR camera and Ka-band interferometric radar is a powerful combination for accurate mapping of surface water extent and WSE.Paired AirSWOT CIR and InSAR imagery thus provide an effective way to validate both surface water extent and WSE during the forthcoming SWOT mission.
Several improvements to the AirSWOT camera system are recommended to further improve the value provided by this system.The CIR camera image registration could be improved by upgrading the positional and image timestamping systems, increasing the overlap between flight paths (to assist photogrammetric correlation), or both.Adding an additional spectral band in the mid-range infrared (1500-2000) nm would improve identification of water, which cannot be detected by the current 760-900 nm sensitivity.Including these wavelengths would permit the use of spectral indices such as the Automated Water Extraction Index (AWEI) [70] and Modified Normalized Difference Water Index (MNDWI) [71] which provide better contrast between land and water pixels, especially over human-influenced landscapes.Absolute radiometric calibration and solar illumination correction would also improve automated open water classification by enhancing consistency between images.Finally, the classifier presented here should be modified to improve automation and deliver a quicker product.These modifications would permit timely validation of SWOT-observed water extent over large areas where AirSWOT is flown during the proposed satellite mission.

Utility of CIR Imagery and Open Water Classifications for the NASA Arctic-Boreal Vulnerability Experiment (ABoVE)
AirSWOT CIR imagery and open water classifications also hold value for other flight campaigns conducted for the NASA Arctic-Boreal Vulnerability Experiment (ABoVE).The 2017 AirSWOT flight lines described here were designed to maximize overlap with other ABoVE airborne campaigns, including LVIS, UAVSAR, AIRMOSS, AVIRIS-NG, and CFIS.The user-friendly CIR image catalog and manually derived cloud flags enable end-users to locate high quality, high-resolution images over areas targeted by other field and remote sensing investigations [37].The high-resolution open water classification presented here [73] may provide a useful dataset for studies exploring links between surface hydrology and permafrost [53,74], ecosystem services [75], and carbon cycles [18,77] that currently rely on satellite products, lake censuses, or other coarse-resolution open water maps.
To illustrate this cross-discipline utility, we overlay the CIR open water mask with UAVSAR polarimetric L-band SAR data (Figure 11).Given its relatively long wavelength of 23.8 cm, L-band radiation can penetrate moderately dense vegetation and give strong double-bounce returns off vertical wetland vegetation emerging from the water surface.The CIR imagery helps identify this vegetation, and the open water mask can be used to identify the extent of emergent vegetation where double-bounce returns begin.This dual-sensor information can be used to map the entire surface area  AirSWOT CIR imagery and open water classifications also hold value for other flight campaigns conducted for the NASA Arctic-Boreal Vulnerability Experiment (ABoVE).The 2017 AirSWOT flight lines described here were designed to maximize overlap with other ABoVE airborne campaigns, including LVIS, UAVSAR, AIRMOSS, AVIRIS-NG, and CFIS.The user-friendly CIR image catalog and manually derived cloud flags enable end-users to locate high quality, high-resolution images over areas targeted by other field and remote sensing investigations [37].The high-resolution open water classification presented here [73] may provide a useful dataset for studies exploring links between surface hydrology and permafrost [53,74], ecosystem services [75], and carbon cycles [18,76] that currently rely on satellite products, lake censuses, or other coarse-resolution open water maps.
To illustrate this cross-discipline utility, we overlay the CIR open water mask with UAVSAR polarimetric L-band SAR data (Figure 11).Given its relatively long wavelength of 23.8 cm, L-band radiation can penetrate moderately dense vegetation and give strong double-bounce returns off vertical wetland vegetation emerging from the water surface.The CIR imagery helps identify this vegetation, and the open water mask can be used to identify the extent of emergent vegetation where double-bounce returns begin.This dual-sensor information can be used to map the entire surface area of lakes with higher confidence than either sensor alone.Although not available at this time, a combined product would minimize counting errors associated with disconnected open water patches and allow water to be partitioned between open and vegetated regions.
The CIR imagery may also add value to studies using the hyper-spectral sensor AVIRIS-NG and the waveform LiDAR sensor LVIS.For example, patterned ground features such as fine-scale polygonal networks and ponds caused by permafrost thaw are barely detectable in 5 m-resolution AVIRIS-NG imagery and are easier to resolve in the high-resolution CIR imagery (Figure 12).Although too small to be detected by the classifier, the raw CIR imagery resolves small ephemeral ponds formed above thawing permafrost that are important to the permafrost-carbon feedback [77].The hyperspectral capabilities of AVIRIS-NG offer the potential to study vegetation, water quality, trace gases, snow, soil, and other systems [33].The CIR open-water mask could also aid interpretation of 1064 nm laser backscatter from open water as from LVIS and together could offer an independent method for partitioning open and vegetated water surfaces.High-resolution digital elevation models (DEMs) provided by this sensor can enhance fine-scale runoff modeling and studies involving ephemeral water bodies.These synergies present opportunities for more precise water and vegetation mapping using concurrent ABoVE datasets.
Remote Sens. 2019, 11, x FOR PEER REVIEW 19 of 28 of lakes with higher confidence than either sensor alone.Although not available at this time, a combined product would minimize counting errors associated with disconnected open water patches and allow water to be partitioned between open and vegetated regions.
The CIR imagery may also add value to studies using the hyper-spectral sensor AVIRIS-NG and the waveform LiDAR sensor LVIS.For example, patterned ground features such as fine-scale polygonal networks and ponds caused by permafrost thaw are barely detectable in 5 m-resolution AVIRIS-NG imagery and are easier to resolve in the high-resolution CIR imagery (Figure 12).Although too small to be detected by the classifier, the raw CIR imagery resolves small ephemeral ponds formed above thawing permafrost that are important to the permafrost-carbon feedback [78].The hyperspectral capabilities of AVIRIS-NG offer the potential to study vegetation, water quality, trace gases, snow, soil, and other systems [33].The CIR open-water mask could also aid interpretation of 1064 nm laser backscatter from open water as from LVIS and together could offer an independent method for partitioning open and vegetated water surfaces.High-resolution digital elevation models (DEMs) provided by this sensor can enhance fine-scale runoff modeling and studies involving ephemeral water bodies.These synergies present opportunities for more precise water and vegetation mapping using concurrent ABoVE datasets.

Validated Open Water Classification Performance
The accurate water classifications obtained here suggest the object-based, connectivitypreserving classifier is well-suited for open water mapping.As noted by O'Gorman [47], pixel connectivity is a local measurement used for a global threshold, with the advantage that all regions of the image are used when choosing a threshold.Although this technique was originally developed for document image scanning, it works well for mapping open water bodies, which exhibit pixel connectivity just like letters in text.Moreover, we applied this technique to an object-based framework and demonstrated that small shadows and speckle noise in the uncalibrated imagery are evened-out.Working with pixel regions rather than pixels reduced the size of the data set, which saved memory and resulted in a reasonably fast processing time of about four minutes for a 50million-pixel processing tile (using a 3.7 GHz Intel Xeon processor with 16 GB of RAM).As in other high-resolution studies [32,51], classifications were manually edited to remove false-positives caused by shadows and other factors.
We took a conservative approach to validation, using manually-delineated water body shorelines and in-situ mapping.Consequently, the classifier is more likely to miss than falsely identify water.This is supported by most classified open water boundaries falling inside walked shorelines (Figure 6) in areas with littoral zone vegetation.The lower user's accuracy (87.1%) than

Validated Open Water Classification Performance
The accurate water classifications obtained here suggest the object-based, connectivity-preserving classifier is well-suited for open water mapping.As noted by O'Gorman [47], pixel connectivity is a local measurement used for a global threshold, with the advantage that all regions of the image are used when choosing a threshold.Although this technique was originally developed for document image scanning, it works well for mapping open water bodies, which exhibit pixel connectivity just like letters in text.Moreover, we applied this technique to an object-based framework and demonstrated that small shadows and speckle noise in the uncalibrated imagery are evened-out.Working with pixel regions rather than pixels reduced the size of the data set, which saved memory and resulted in a reasonably fast processing time of about four minutes for a 50-million-pixel processing tile (using a 3.7 GHz Intel Xeon processor with 16 GB of RAM).As in other high-resolution studies [32,51], classifications were manually edited to remove false-positives caused by shadows and other factors.
We took a conservative approach to validation, using manually-delineated water body shorelines and in-situ mapping.Consequently, the classifier is more likely to miss than falsely identify water.This is supported by most classified open water boundaries falling inside walked shorelines (Figure 6) in areas with littoral zone vegetation.The lower user's accuracy (87.1%) than producer's accuracy (94.0%,Table 2) is indicative of the conservative boundaries used for digitization.In addition, the 40 m 2 size limit and the spectral limitations of the camera exclude narrow streams.Optical classification of the stream and river channels is difficult due to the necessity of producing continuous networks from narrow regions [81].In addition, the process of creating a binary water mask is subjective.Even ground-based shoreline surveys may incorrectly identify shorelines obscured by vegetation or saturated sediments.In general, we found that small water bodies (<0.001 km 2 ) are particularly challenging for high-resolution water classification, as vegetation often obscures shorelines, making individual water bodies hard to count.Furthermore, these small water bodies are the most ephemeral.More work is needed to create multi-temporal products and understand seasonal variation.

Improving Mapping of Very Small Water Bodies
Small water bodies are important for methane release [82,83], biodiversity [22], and organic matter burial [22], but are under-represented in studies limited by the spatial resolution of Landsat or coarser sensors [8,21,22].One approach to solving this problem is the use of scale-based estimates, but such estimates may not apply at small scales [24].Remote sensing offers the only consistent method to identify these water bodies globally, but most space-borne sensors lack sufficient spatial resolution to test for scale-invariant behavior over all size scales.This dataset provides much-needed information on the distribution of these small water bodies in Arctic-Boreal North America.
Waterbody size metrics, such as area and shoreline length, could be valuable for earth system models that use coarse water cover fraction as an input.Inaccuracies in wetland maps used for methane models have been attributed to both undercounting [84] and double-counting by inclusion with lake classes [85].For models too coarse to resolve small water bodies, the fractions A 0.001 or similar perimeter metrics could be used to upscale the processes that occur in small ponds or wetlands.Accurate water body maps are crucial to improving methane emission estimates from models, so upscaling approaches may represent one way to transfer fine-scale information to coarse-scale maps used for models [8,38,82].In Alaska, high-resolution imagery has been used to upscale methane fluxes based on vegetation mapping [86], and these techniques can also be applied to water and wetland maps.The CIR imagery and open water masks cover large areas at high resolution and offer a valuable airborne-scale census of surface water over a variety of landscapes.
Boundaries used to define representative areas can influence conclusions drawn about enclosed small water bodies.Regionally, the Canadian Shield and thermokarst landscapes had the highest overall water fraction (14.4%), which is consistent with other work [6,23,38].For the YFB, we find F 0.001 is 52%, whereas McDonald et al. [15] found 28%, using a region five times smaller and different procedures for defining small water bodies.This discrepancy highlights the sensitivity of morphometric analyses to study area boundaries.
The contribution of small water bodies <0.001 km 2 to total water area has been considered negligible [82], on the order of fractions of a percent, and our findings of 0.12% for the entire study domain support this conclusion.However, river valley sub-regions such as SAG and SLR have A 0.001 > 3%.Ponds below 0.001 km 2 are estimated to make up 6-11% of the global surface area of lakes and ponds, yet comprise up to 23% of CO 2 emissions and 70% of diffusive CH 4 emissions from lentic freshwaters [83].They tend to have higher rates of carbon burial, fish productivity, and waterfowl, amphibian, plant and fish species richness than larger lakes [22].They can also provide ecosystem services such as small-scale water supply and nutrient processing [21].Thus, the fraction of small water bodies observed is not negligible for a variety of carbon cycle and ecosystem processes.

Testing Power-Law Regimes
The power-law hypothesis for lake size distribution is often invoked [7,16,20,23,25,26] but lacks rigorous tests on its adequacy for modeling small lakes.Log-log slope breaks (analogous to A 0 ) are known to occur at small areas [14,15] but these area thresholds have not been systematically calculated over a variety of regions.The domain-wide A 0 threshold we find (0.34 ± 0.13 km 2 ) is nearly identical to the 0.35 km 2 truncation threshold for lake area data fitted with a Pareto distribution [38], and the 0.46 km 2 threshold used for a power-law fit [26], both from global datasets.The similarity between our results in the 23,380 km 2 study area and these global studies suggest similar controls on water body distributions and the departure from power-law behavior at small areas.For illustration, if the CIR-mapped A 0 value presented here held globally, the smallest ~6% of freshwater bodies would not fit a power-law distribution.Given the large range in A 0 (0.00035-0.54km 2 ) across the ABoVE domain, it would be imprudent to extrapolate from the area distribution of large water bodies based on this equation without accounting for regional differences in A 0 .Most sub-regions demonstrate a slope break, often near A 0 , for which the power-law α coefficient would be smaller for the small area end of the domain.Consequently, a Pareto or power-law extrapolation would imply a far greater abundance of small lakes than are present [8,22,24].
The properties that govern A 0 , are not understood, but the ephemeral nature of small lakes and ponds is hypothesized to contribute to the location of the slope break [22].At small scales, micro-topography governs the area distribution [14] and it is therefore not scale-invariant [26].This hypothesis is supported by the strong power-law fit (p = 1.00) we find for the Canadian Shield, which is characterized by constant bedrock-controlled topography, which is more likely to be scale-invariant than more heterogeneous regions.We suggest that climatic and topographic variability within sub-regions could produce mixed distributions with one or more breakpoints.Attributing these factors is beyond the scope of this study, but our high-resolution water map provides an example of the type of direct observations of lakes and ponds needed to accurately assess water body scaling and improve accounting of global surface water.
Our findings support prior evidence against a power-law fit to the lake area distributions in high-latitude regions is scale-invariant.We used three spatial scales (study area, physiographic category, and local region) to show that scale-invariant area distributions are common, but only hold over the tail end of distributions, which generally begins at 0.34 km 2 .Individual regions showed considerable variability in morphology and power-law slopes, highlighting the complexity of choosing the correct scale for sub-region divisions.Furthermore, the proportion of small water bodies below the typical limit of 0.001 km 2 can vary regionally by several orders of magnitude, and may not be negligible, depending on the case study.In all cases, distribution plots had flatter slopes for smaller water bodies, implying fewer than would be estimated by extrapolation.Finally, physiography drives water body distribution, and the regional setting should be considered when making global estimates.

Conclusions
We present a collection of color-infrared (CIR) airborne camera imagery from the AirSWOT sensor suite and open water masks derived from an object-based classification.These products are provided at one-meter spatial resolution and are designed to aid spatial averaging of AirSWOT interferometric radar retrievals of water surface elevation (WSE).These data provide a rare, synchronous and co-registered reference for Ka-band water classification algorithms such as will be used for the upcoming NASA Surface Water and Ocean Topography (SWOT) satellite mission [64,87].These CIR products should also aid the interpretation and validation of other NASA ABoVE airborne sensors and high-resolution Arctic-Boreal surface water studies more generally.CIR mapping of very small water bodies (<0.001 km 2 ) across the ABoVE domain shows limitations in area-scaling extrapolations over a variety of physiographic terrains.We find that most collections of water bodies within the study area follow power-law distributions only if the smallest water bodies (<0.34 km 2 ) are removed from the distribution.Our regional analysis identifies clear differences in lake abundance and size distribution for different physiographic terrains found within the ABoVE spatial domain.Some regions, such as the Canadian Shield, appear well-suited for empirical scaling-law based extrapolations, but any extrapolation should be based on data from the same physiographic setting.We conclude that high-resolution mapping, such as demonstrated here using airborne CIR imagery, remains the most accurate approach for quantifying the full extent and abundance of Arctic-Boreal surface water.corresponding to lakes that had joined or split due to different water levels at the times of dataset acquisitions were removed.Thus, while the two variables are not identical, they can be used as correlates for each to compare between studies.

Figure A2.
The number of water bodies in a sub-region or category is correlated with its p-value in the maximum likelihood estimation (MLE) power-law fit-test but doesn't explain the total variance (r 2 = 0.11).

Figure A2.
The number of water bodies in a sub-region or category is correlated with its p-value in the maximum likelihood estimation (MLE) power-law fit-test but doesn't explain the total variance (r 2 = 0.11).

Figure 1 .
Figure 1.Study area map of NASA ABoVE (Arctic-Boreal Vulnerability Experiment) AirSWOT flight lines over 13 sub-regions organized into five physiographic categories (Arctic-Boreal wetlands, Canadian Shield, lowland river valleys, prairie pothole lakes, and thermokarst regions).AirSWOT sorties were flown along these flight lines from S-N in July and N-S in August 2019 along ABoVE foundational flight lines[33], covering broad physiographic gradients and some of the most waterrich regions in the world.

Figure 1 .
Figure 1.Study area map of NASA ABoVE (Arctic-Boreal Vulnerability Experiment) AirSWOT flight lines over 13 sub-regions organized into five physiographic categories (Arctic-Boreal wetlands, Canadian Shield, lowland river valleys, prairie pothole lakes, and thermokarst regions).AirSWOT sorties were flown along these flight lines from S-N in July and N-S in August 2019 along ABoVE foundational flight lines[33], covering broad physiographic gradients and some of the most water-rich regions in the world.

Figure 2 .
Figure 2. Geolocation errors for CIR camera imagery after manual georeferencing with Digital Globe EV-WHS image service.Geolocation errors range from undetectable to 49.8 m, with a mean value of 6.0 ± 5.7 m).Over 90% of CIR images have an RMSE ≤14.7 m relative to high-resolution Digital Globe satellite imagery.

Figure 3 .
Figure 3. Example of open water classification workflow over seven open water bodies from the Mackenzie River Delta (MRD).Panels (a) and (b) show the original AirSWOT color-infrared (CIR) image, (c) Normalized Difference Water Index (NDWI), (d) effect of varying l, the threshold level used for binarization, (e) erosion based on entropy value to provide seed pixels, and (f) discrete regions after region growing from the seed pixels.

Figure 3 .
Figure 3. Example of open water classification workflow over seven open water bodies from the Mackenzie River Delta (MRD).Panels (a,b) show the original AirSWOT color-infrared (CIR) image, (c) Normalized Difference Water Index (NDWI), (d) effect of varying l, the threshold level used for binarization, (e) erosion based on entropy value to provide seed pixels, and (f) discrete regions after region growing from the seed pixels.

Figure 4 .
Figure 4. To avoid counting open water patches within the same lake as separate water bodies, a 10 m buffer was used to aggregate open water patches within 20 m of each other.At Johnny's Cabin Pond in PAD (shown here), this step aggregated 24 open water patches (blue) to produce one water body of 0.16 km 2 .The area solely within the pale green buffer was not counted.

Figure 4 .
Figure 4. To avoid counting open water patches within the same lake as separate water bodies, a 10 m buffer was used to aggregate open water patches within 20 m of each other.At Johnny's Cabin Pond in PAD (shown here), this step aggregated 24 open water patches (blue) to produce one water body of 0.16 km 2 .The area solely within the pale green buffer was not counted.

Figure 5 .
Figure 5. Misclassification sometimes occurred if NDWI histogram peaks for turbid water were closer to the peaks for land than clear water.(a) Turbid water in large rivers was colored differently than lakes, potentially confounding classification in Yukon flats, AK, (b) Image histogram shows Normalized Difference Water Index (NDWI) peaks at 0.18 from the dark lake in the bottom left corner of the image, at 0.05 from the turbid Yukon River, and at −0.01 and −0.1 from wet sandbars and shadows, respectively.The connectivity-preserving method selected fewer gravel bars (circled in yellow) than the variance-maximizing method in this image.(c) The variance-maximizing threshold, and (d) proposed connectivity-preserving binarizations.Both classifiers selected both turbid and nonturbid water in this image.

Figure 5 .
Figure 5. Misclassification sometimes occurred if NDWI histogram peaks for turbid water were closer to the peaks for land than clear water.(a) Turbid water in large rivers was colored differently than lakes, potentially confounding classification in Yukon flats, AK, (b) Image histogram shows Normalized Difference Water Index (NDWI) peaks at 0.18 from the dark lake in the bottom left corner of the image, at 0.05 from the turbid Yukon River, and at −0.01 and −0.1 from wet sandbars and shadows, respectively.The connectivity-preserving method selected fewer gravel bars (circled in yellow) than the variance-maximizing method in this image.(c) The variance-maximizing threshold, and (d) proposed connectivity-preserving binarizations.Both classifiers selected both turbid and non-turbid water in this image.
Walked shorelines are inland of or equal to the classified shorelines for vegetated lakes, implying that the classifier conservatively mapped open water (Figure 6a,b).The classified shorelines of non-vegetated sandbars (Figure 6c-f) are also generally inland of walked shorelines, implying omission errors from water resembling wet sediment or changing water levels (the Yukon River surveys were conducted days after the flight).The maximum Euclidean distance between classified and walked shorelines is 45 m in the Yukon River sandbars, 16 m in the North Saskatchewan River, and 15 m for the ponds in PPN.Thus, the in-situ surveys provide a ground-verified reference to assess uncertainty in shoreline locations.Remote Sens. 2019, 11, x FOR PEER REVIEW 12 of 28 Walked shorelines are inland of or equal to the classified shorelines for vegetated lakes, implying that the classifier conservatively mapped open water (Figure6a,b).The classified shorelines of nonvegetated sandbars (Figure6c-f) are also generally inland of walked shorelines, implying omission errors from water resembling wet sediment or changing water levels (the Yukon River surveys were conducted days after the flight).The maximum Euclidean distance between classified and walked shorelines is 45 m in the Yukon River sandbars, 16 m in the North Saskatchewan River, and 15 m for the ponds in PPN.Thus, the in-situ surveys provide a ground-verified reference to assess uncertainty in shoreline locations.

Figure 6 .
Figure 6.Ground validation sites used to validate the automated CIR image classifier.Walked shorelines are plotted in gold and classified shorelines in red (only outlines are shown for visual clarity).(a) and (d) Prairie pothole lakes near Redberry Lake in PPN (shoreline survey less than 4.5 h after AirSWOT flight), (b) and (e) Sandbar islands in the North Saskatchewan River near Saskatoon, Saskatchewan in PPN (survey less than 21 h before flight), (c) and (f) Sandbar islands in the Yukon River in YFB (survey less than 73 and 30 h after flight, respectively).The automated classification is more conservative than the shoreline surveys for the purpose of mapping only open water body edges in areas of emergent aquatic vegetation (note gray-colored dead trees around edges of lakes in (a) and (d) not classified as open water.

Figure 6 .
Figure 6.Ground validation sites used to validate the automated CIR image classifier.Walked shorelines are plotted in gold and classified shorelines in red (only outlines are shown for visual clarity).(a,d) Prairie pothole lakes near Redberry Lake in PPN (shoreline survey less than 4.5 h after AirSWOT flight), (b,e) Sandbar islands in the North Saskatchewan River near Saskatoon, Saskatchewan in PPN (survey less than 21 h before flight), (c,f) Sandbar islands in the Yukon River in YFB (survey less than 73 and 30 h after flight, respectively).The automated classification is more conservative than the shoreline surveys for the purpose of mapping only open water body edges in areas of emergent aquatic vegetation (note gray-colored dead trees around edges of lakes in (a,d) not classified as open water.

Figure 7 .
Figure 7.The percentage of (a) water bodies, F0.001, and (b) areas, A0.001, from water bodies under 0.001 km 2 along with (c) median areas Amed, and (d) open water fraction Fwater across the five physiographic terrain categories.Prairie pothole and river valley lakes had the largest percent of lakes under 0.001 km 2 by area and number.Shield and thermokarst lakes overall held the largest median water body areas and open water fractions.

Figure 7 .
Figure 7.The percentage of (a) water bodies, F 0.001 , and (b) areas, A 0.001 , from water bodies under 0.001 km 2 along with (c) median areas A med , and (d) open water fraction F water across the five physiographic terrain categories.Prairie pothole and river valley lakes had the largest percent of lakes under 0.001 km 2 by area and number.Shield and thermokarst lakes overall held the largest median water body areas and open water fractions.

Figure 8 .
Figure 8. Reverse cumulative distribution functions (CDFs) for water body areas (black line), plotted with three logarithmically-spaced bins per decade.The start of the most likely power-law tail (A0), if applicable, is shown as a vertical dark gray line with standard deviation σA0 indicated by the width of the shaded light grey region around it.The solid red line is the maximum likelihood estimate (MLE) power-law distribution with exponent α computed from lakes larger than size A0, and the solid blue line is the MLE power-law distribution for all measured water bodies in the sub-region.Dashed red lines are the standard deviation in power-law exponent α (σα).Note that the axes obscure the magnitude of the difference between data and power-law fits.p-values for each MLE fit are indicated in the legend.The number, n, of data points in the tail and power-law exponent, α, are shown in the title.The power-law hypothesis is ruled out for p-values < 0.10 and is only considered valid across the study area for water bodies >0.34 km 2 .

Figure 8 .
Figure8.Reverse cumulative distribution functions (CDFs) for water body areas (black line), plotted with three logarithmically-spaced bins per decade.The start of the most likely power-law tail (A 0 ), if applicable, is shown as a vertical dark gray line standard deviation σ A0 indicated by the width of the shaded light grey region around it.The solid red line is the maximum likelihood estimate (MLE) power-law distribution with exponent α computed from lakes larger than size A 0 , and the solid blue line is the MLE power-law distribution for all measured water bodies in the sub-region.Dashed red lines are the standard deviation in power-law exponent α (σ α ).Note that the logarithmic axes obscure the magnitude of the difference between data and power-law fits.p-values for each MLE fit are indicated in the legend.The number, n, of data points in the tail and power-law exponent, α, are shown in the title.The power-law hypothesis is ruled out for p-values < 0.10 and is only considered valid across the study area for water bodies >0.34 km 2 .

Figure 9 .
Figure 9. Testing for scale-invariant power-law behavior in water body area distribution for the 13 sub-regions.All lines are as in Figure 8.Among the sub-regions, Prairie Potholes North (PPN) fails the power-law hypothesis test (p = 0.00), and all others have A0 ranging from 350.5 m 2 to 0.54 km 2 .

Figure 9 .
Figure 9. Testing for scale-invariant power-law behavior in water body area distribution for the 13 sub-regions.All lines are as in Figure 8.Among the sub-regions, Prairie Potholes North (PPN) fails the power-law hypothesis test (p = 0.00), and all others have A 0 ranging from 350.5 m 2 to 0.54 km 2 .

Figure 10 .
Figure 10.Use of CIR imagery to improve AirSWOT Ka-band radar retrievals [72] of water surface elevation (WSE) in the Mackenzie River Delta (MRD).(a) AirSWOT Ka-band radar backscatter, (b) AirSWOT CIR image, (c) Interferometrically derived WSEs, (d) CIR open-water classification used as a mask (black) to define correct areas for spatial averaging of (c).In (c) and (d), white regions correspond to no data, which is more prevalent in the far range of the radar image (bottom portion).

Figure 10 .
Figure 10.Use of CIR imagery to improve AirSWOT Ka-band radar retrievals [72] of water surface elevation (WSE) in the Mackenzie River Delta (MRD).(a) AirSWOT Ka-band radar backscatter, (b) AirSWOT CIR image, (c) Interferometrically derived WSEs, (d) CIR open-water classification used as a mask (black) to define correct areas for spatial averaging of (c).In (c,d), white regions correspond to no data, which is more prevalent in the far range of the radar image (bottom portion).

4. 2 .
Utility of CIR Imagery and Open Water Classifications for the NASA Arctic-Boreal Vulnerability Experiment (ABoVE)

Figure 11 .
Figure 11.Wetlands and a river-connected lake in the Peace-Athabasca Delta (PAD).(a) UAVSAR Lband backscatter in HH polarization, (b) AirSWOT color-infrared (CIR) imagery, (c) L-band polarimetric returns are shown in a Freeman-Durden decomposition [79], and (d) same, with CIR open water classification superimposed.Color scale for (c) and (d): red = double-bounce scattering, green = volume scattering, blue = single-bounce scattering.The CIR water classification excludes the littoral zone vegetation surrounding the lake, which can be seen as high double-bounce regions in the UAVSAR image.UAVSAR data courtesy NASA/JPL-Caltech.

Figure 11 .
Figure 11.Wetlands and a river-connected lake in the Peace-Athabasca Delta (PAD).(a) UAVSAR L-band backscatter in HH polarization, (b) AirSWOT color-infrared (CIR) imagery, (c) L-band polarimetric returns are shown in a Freeman-Durden decomposition [78], and (d) same, with CIR open water classification superimposed.Color scale for (c,d): red = double-bounce scattering, green = volume scattering, blue = single-bounce scattering.The CIR water classification excludes the littoral zone vegetation surrounding the lake, which can be seen as high double-bounce regions in the UAVSAR image.UAVSAR data courtesy NASA/JPL-Caltech.

Figure 12 .
Figure 12.Patterned ground typical of permafrost regions.(a) LVIS bare-earth digital elevation model (DEM) [80], (b) AVIRIS-NG hyperspectral imagery shown in natural color [81], (c) CIR image from AirSWOT suite, and (d) Same as (c), overlain by the CIR open water classification.The high-resolution CIR camera offers the potential for mapping individual ponds within the patterned ground.In (a), the LVIS waveform LIDAR ground returns have been rasterized to a 5.2 m grid cell to match AVIRIS-NG.

Figure 12 .
Figure 12.Patterned ground typical of permafrost regions.(a) LVIS bare-earth digital elevation model (DEM) [79], (b) AVIRIS-NG hyperspectral imagery shown in natural color [80], (c) CIR image from AirSWOT suite, and (d) Same as (c), overlain by the CIR open water classification.The high-resolution CIR camera offers the potential for mapping individual ponds within the patterned ground.In (a), the LVIS waveform LIDAR ground returns have been rasterized to a 5.2 m grid cell to match AVIRIS-NG.

Table 2 .
Confusion matrix between 197 manually digitized water bodies and the automated open water classification (units in m 2 , and percentage of pixels).Evaluation areas came from six geographically distributed regions (YFB, OCF, MRD, CSH, PAD, PPN).

Table 2 .
Confusion matrix between 197 manually digitized water bodies and the automated open water classification (units in m 2 , and percentage of pixels).Evaluation areas came from six geographically distributed regions (YFB, OCF, MRD, CSH, PAD, PPN).

Table 3 .
Summary accuracy metrics for open water detection.Negative percent differences imply larger values for the classified water (map) than the digitized water (reference).

Table 3 .
Summary accuracy metrics for open water detection.Negative percent differences imply larger values for the classified water (map) than the digitized water (reference).