SnowCloudMetrics: Snow Information for Everyone

Snow is a critical component of the climate system, provides fresh water for millions of people globally, and affects forest and wildlife ecology. Snowy regions are typically data sparse, especially in mountain environments. Remotely-sensed snow cover data are available globally but are challenging to convert into accessible, actionable information. SnowCloudMetrics is a web portal for on-demand production and delivery of snow information including snow cover frequency (SCF) and snow disappearance date (SDD) using Google Earth Engine (GEE). SCF and SDD are computed using the Moderate Resolution Imaging Spectroradiometer (MODIS) Snow Cover Binary 500 m (MOD10A1) product. The SCF and SDD metrics are assessed using 18 years of Snow Telemetry records at more than 750 stations across the Western U.S. SnowCloudMetrics provides users with the capacity to quickly and efficiently generate local-to-global scale snow information. It requires no user-side data storage or computing capacity, and needs little in the way of remote sensing expertise. SnowCloudMetrics allows users to subset by year, watershed, elevation range, political boundary, or user-defined region. Users can explore the snow information via a GEE map interface and, if desired, download scripts for access to tabular and image data in non-proprietary formats for additional analyses. We present global and hemispheric scale examples of SCF and SDD. We also provide a watershed example in the transboundary, snow-dominated Amu Darya Basin. Our approach represents a new, user-driven paradigm for access to snow information. SnowCloudMetrics benefits snow scientists, water resource managers, climate scientists, and snow related industries providing SCF and SDD information tailored to their needs, especially in data sparse regions.


Introduction
We have developed SnowCloudMetrics (www.snowcloudmetrics.app) as a cloud-computing portal where users can access and customize snow information globally to help address a wide range of challenges. Snow is Earth's most climatically sensitive land cover type. Streamflow derived from snowmelt provides critical water supply for hydropower, agriculture, ecosystems, industry, and municipalities [1,2]. Biologically, the onset of snow in the winter and the timing of its disappearance in spring and summer is a key control on phenology and ecology of vegetation and wildlife [3][4][5][6]. The timing of melt also affects the timing and severity of wildfire in seasonally snow-covered landscapes [7]. Satellite observations show global changes in the spatial extent of snow [8,9], maximum snow water equivalence (SWE), elevational distributions in mountainous regions, and the timing of meltwater production and snow disappearance from the land surface [10][11][12][13]. Data suggest that high elevation mountain regions are warming faster than low elevations, with the most significant trends along the 0 • C isotherm level due to the albedo-temperature feedback associated with declining snow cover and glacier extent [14,15]. The annual duration of snow cover is also critically important to the winter tourism economy. In the United States (U.S.) alone, the snow sports economy generates about $11.3B per year and supports over 191,000 jobs [16]. In spite of snow's critical importance to multiple sectors, informative in situ snow observations are sparse in most regions of the world.
Long-term, high quality observational snow datasets exist primarily in Canada, the U.S., the European Union, Russia, and China. Yet seasonal snow cover is vital to mountain hydrology across the globe, including many countries in Central Asia where long-term environmental monitoring programs once existed but were terminated following the collapse of the Soviet Union [17][18][19].
Most of the world's major rivers begin in mountain regions, with alpine headwaters often crossing international boundaries. International, transboundary watersheds where nations have effective data sharing provisions are more likely to have improved adaptive capacity and increased flexibility with regards to flow variability, especially in the context of climate change [20,21]. Despite their importance many transboundary watersheds have not implemented data sharing provisions [20,22,23], further exacerbating the challenges of sparse snow data. For example, the snowy reaches of the upper Amu Darya Basin (ADB) primarily spans four Central Asian countries: Afghanistan, Turkmenistan, Tajikistan, and Uzbekistan. Currently this watershed does not have a treaty of river basin organization to facilitate data sharing [23]. In addition to the lack of international agreements, many Central Asian watersheds do not have long-term environmental monitoring programs despite the fact that agriculture heavily relies on surface water runoff for irrigation in these transboundary watersheds, equaling >85% of water withdrawals for the region [18,19].
Even in locations where snow data are available, such data may not translate into relevant information. For example, in the western U.S., ground-based transects and station-based snow observations have been the standard for measuring mountain snowpacks for decades [11,24]. However, the snow telemetry (SNOTEL) and snow course observations may not be spatially representative of snow distributions across steep elevation gradients, land cover types, or remote mountain regions [25][26][27]. As mountain snowpacks continue to decline, information from the fixed locations of such networks may become less spatially representative than originally intended [28,29]. In regions where seasonal snowpacks are highly susceptible to warming winter temperatures, such as the Oregon Cascade Range, U.S. [30,31], lower elevation stations lose their predictive value for streamflow forecasting when their snow cover disappears, and high elevation snow cover is not routinely monitored [32]. Globally these challenges are even more acute, especially when focusing on data-scarce regions.
Changes in the duration of snow cover due to anthropogenic climate change in mountain and Arctic environments are occurring globally, and decreases in the duration of snow cover have been observed and modeled, especially in the North American Arctic [33][34][35]. The changing cryosphere is spatially heterogeneous throughout mountain environments and the Pan-Arctic region, and local changes are occurring in the timing of snow cover onset in autumn, the timing of snow disappearance in spring, mid-winter melt events, and snow cover duration [35]. These changes in snow conditions profoundly affect the ecosystems of mountain and Arctic environments [35,36]. For example, two studies have found that snow cover duration is one of the primary factors that prompt migratory behavior in red deer populations in Europe and moose in Scandinavia [37,38]. Similarly, an earlier onset of the green-up period for vegetation has been found to positively affect reindeer calf body mass in the subsequent months [39]. Another study found that some populations of Dall sheep in the Arctic have more success rearing their young when the snow disappears earlier in the spring season [40]. This recent research in plant phenology and wildlife ecology suggests the importance and utility of global-to-local scale snow metrics, which can characterize changing snow conditions in the context of climate change.
Remotely-sensed snow (RSS) products provide a rich data resource for understanding and quantifying snow cover and runoff [32,41]. While RSS data are freely available, there are technological and cost barriers that limit their use in developed and developing countries. Satellite-derived snow data products are not easily customized for user-specific needs. RSS data, such as the National Aeronautics and Space Administration (NASA) MOD10A1 version 6 snow cover product [42][43][44], in addition to other satellite-derived snow cover datasets, are freely available via the EarthData web portal, but multiple barriers limit their use. First, the lack of data storage capacity limits the user's capabilities for analyses that span the full temporal and spatial scope of the satellite collections [45]. Even with the digital storage capacity, these large datasets require intensive computing power for analysis [45]. Similarly, there are obstacles related to the financial cost of performing geospatial analysis that are not unique to satellite-derived snow datasets, but nevertheless important to highlight. Access to expensive proprietary software limits the usage of these datasets among non-academic researchers, and the cost associated with a geospatial technician further raises the budget required for many potential users of RSS data. The GEE framework and data library allows the user to take their questions to the data and harness the power of cloud computing. This transition democratizes data and computing access, bringing applied research questions to the data. This represents a new paradigm in data processing and information delivery for scientists, data providers, natural resource managers, and the general public.

Datasets and Methods
Here, we introduce SnowCloudMetrics, and two of its data products that are calculated entirely in a cloud-based platform using the GEE data library and analysis tools: (1) snow cover frequency (SCF) and (2) snow disappearance date (SDD). The SnowCloudMetrics app allows users to define temporal and spatial subsets of snow cover data based upon time period, watershed, elevation range, political boundary, or user-defined region. The interface allows users to explore the snow information and download scripts for access to tabular and image data in non-proprietary formats for additional analyses. These metrics employ the global, daily, 500 m MOD10A1 version 6 snow cover dataset to identify when and where snow cover occurs on the landscape. The MOD10A1 relies on the Normalized Difference Snow Index (NDSI) to determine the presence or absence of snow [42]. We note that Moderate Resolution Imaging Spectroradiometer (MODIS) binary snow classification has been shown to have lower detection accuracy in areas of patchy snow, such as during the spring season and at lower elevations in the seasonal snow zone [46].

Snow Cover Frequency
SCF is computed as the number of days that snow is observed on the land surface divided by the number of valid observations in the time period of interest (Equation (1)).
A valid observation is one that does not include pixels with ocean, lakes, detector saturation, missing data, night, or no-decision within the MOD10A1 data set. The result is a single image of SCF with pixel floating point values ranging from 0 to 1.0, with zero denoting a pixel having no snow covered days in the time period and 1.0 denoting all days in the time period are snow covered (Figure 1a). Within SnowCloudMetrics, users can compute SCF for any water year within the MOD10A1 data set.
A lower SCF threshold of <=0.07 is set for the SnowCloudMetrics app images based on the snow persistence metric developed by Hammond et al. [47,48]. Additionally, the MOD10A1 version 6 product contains all values of NDSI that are >0, but previous versions of the product set the minimum NDSI value at 0.40 [42,49]. In areas with forest cover, setting the minimum NDSI value at 0.15 allows for detection of snow in the gaps [46,[50][51][52]. For the SCF images, we set the NDSI threshold at 0.15 to increase the detection of snow in areas with forest cover. . The number of satellite observed snow days, along with a cloud correction, are divided by the valid number of observations from the same time period to create an image that contains the number of days snow was observed in each pixel. (b) An idealized schematic representing the snow disappearance date (SDD) algorithm in GEE for a water year in the Northern Hemisphere. The SDD algorithm builds a yearly image collection based on the water year from October 1 to September 30, and searches from end to beginning for the day when the pixel switches from snow covered to no-snow.

Snow Disappearance Date
SDD also uses the MOD10A1 snow cover product. Computationally, SDD is based upon an image collection of 365 days in the water year (WY; described subsequently) by searching each grid cell temporally backwards, from the end of the water year to the beginning. We locate the last 5-day time period without snow at the end of the water year that is preceded by a 5-day presence of snow, and find the date that the grid cell switches from 'snow' to 'no-snow' in the image collection. This 5day period was selected to avoid missing late-season snow events but also to disregard transient late season snow events. The result is an image with values that represent the day of snow disappearance from the landscape based on the water year ( Figure 1b). The WY designation contains assumptions about the seasonality of the snowpack. The SDD algorithm for SnowCloudMetrics assumes an October 1 to September 30 WY because of the timing and prevalence of seasonal snow in the Northern Hemisphere. If the SDD algorithm were to be employed in the Southern Hemisphere, a different beginning and end for the WY would need to be implemented for it to accurately assess the switch from 'snow' to 'no-snow' at the end of the season. When no snow is found during the water year, the pixel is masked, and no value is assigned for SDD. Since glacier surfaces may have the presence of snow year-round, we account for the glacier surfaces in the SnowCloudMetrics app using the Global Land Ice Measurements from Space (GLIMS) 2017 dataset available in GEE. Additionally, the Southern Hemisphere calculation of SDD lies outside the scope of this research, but will be addressed in subsequent versions of the platform.

Cloud Correction in GEE Algorithm
One of the realities of snowfall is that it is coincident in time with cloud cover. Thus, we need to account for cloudy days in both the SCF and SDD algorithms. Pixels are designated as cloud-covered or cloud-free in the MOD10A1 version 6 product. The version 6 product contains a cloud designation based on the MOD32_L2 cloud mask noted in the product user guide. We use this built-in cloud mask to adjust both the SCF and SDD algorithms based on the following three temporal cases: (1) snow

Snow Disappearance Date
SDD also uses the MOD10A1 snow cover product. Computationally, SDD is based upon an image collection of 365 days in the water year (WY; described subsequently) by searching each grid cell temporally backwards, from the end of the water year to the beginning. We locate the last 5-day time period without snow at the end of the water year that is preceded by a 5-day presence of snow, and find the date that the grid cell switches from 'snow' to 'no-snow' in the image collection. This 5-day period was selected to avoid missing late-season snow events but also to disregard transient late season snow events. The result is an image with values that represent the day of snow disappearance from the landscape based on the water year ( Figure 1b). The WY designation contains assumptions about the seasonality of the snowpack. The SDD algorithm for SnowCloudMetrics assumes an October 1 to September 30 WY because of the timing and prevalence of seasonal snow in the Northern Hemisphere. If the SDD algorithm were to be employed in the Southern Hemisphere, a different beginning and end for the WY would need to be implemented for it to accurately assess the switch from 'snow' to 'no-snow' at the end of the season. When no snow is found during the water year, the pixel is masked, and no value is assigned for SDD. Since glacier surfaces may have the presence of snow year-round, we account for the glacier surfaces in the SnowCloudMetrics app using the Global Land Ice Measurements from Space (GLIMS) 2017 dataset available in GEE. Additionally, the Southern Hemisphere calculation of SDD lies outside the scope of this research, but will be addressed in subsequent versions of the platform.

Cloud Correction in GEE Algorithm
One of the realities of snowfall is that it is coincident in time with cloud cover. Thus, we need to account for cloudy days in both the SCF and SDD algorithms. Pixels are designated as cloud-covered or cloud-free in the MOD10A1 version 6 product. The version 6 product contains a cloud designation based on the MOD32_L2 cloud mask noted in the product user guide. We use this built-in cloud mask to adjust both the SCF and SDD algorithms based on the following three temporal cases: (1) snow cover, followed by cloud cover, followed by snow cover; (2) snow cover, followed by cloud cover, followed by snow-free; and (3) snow-free, followed by cloud cover, followed by snow cover. For a complete description of the cloud mask algorithm, please refer to Sproles et al. [41].

An Example Application of SCF and SDD in the Amu Darya Basin
The ADB is used as an example of the application of SCF and SDD. We selected the ADB because it is the largest watershed in Central Asia (623,000 km 2 ). It is also a transboundary watershed, with its headwaters originating in Afghanistan, Tajikistan, and Uzbekistan, before flowing downstream to Kyrgyzstan, and Turkmenistan. The snowy and glacierized upper elevations of the ADB serve as the main source of surface water for the lower elevation, desert-steppe region [53]. The ADB makes up a portion of the headwaters for the Aral Sea, along with the Syr Darya Basin to the north. Precipitation in the ADB is spatially heterogeneous, with much of the arid portion receiving as little as 100 mm yr -1 , while parts of the higher elevation regions receive 1000 mm yr −1 [53]. For these reasons, the ADB is a location where the SCF and SDD snow metrics have the potential to add supplemental snow information where little currently exists.
For mapping and subsetting, the SCF and SDD metrics within the ADB, we used the publicly available United States Geological Survey HydroSHEDS global watershed dataset [54]. Following computation of SCF and SDD, further geospatial post-processing was completed using open-source QGIS software, along with a free, publicly available hypsometric shaded relief global raster, and a country border vector layer from Natural Earth. The spatial analysis described here used the SnowCloudMetrics app in GEE, publicly available datasets, and publicly available, free software, so that it can be replicated without the need for expensive, proprietary software. Ancillary elevation data incorporated into SnowCloudMetrics allows users to subset regions by elevation. The elevation model, accessed via GEE, is the Shuttle Radar Topography Mission's (SRTM) 30 m global digital elevation model [55]. For the ADB country-level analysis, country boundary shapefiles were downloaded from the Thematic Mapping and accessed in GEE through uploaded shapefiles.
First, the nineteen SCF and SDD yearly images were exported in GeoTIFF format, in the same manner that a potential SnowCloud user might export these datasets to a Google Drive folder. Second, these images were imported back into GEE as nineteen separate assets for each metric. Third, an image collection was created in GEE that was spatially bounded by the ADB shapefile from the HydroSHEDS dataset. From each 19-layer image collection (one per year), three calculations were made on a pixel-by-pixel basis: (1) the 19-year maximum value per metric (SCFmax, SDDmax); (2) the 19-year minimum value per metric (SCFmin, SDDmin); and (3) the 19-year mean value per metric (SCFmean, SDDmean). These min, max, and mean images were then subset into six different elevation bands ranging from 0 to >5000 m by creating masks using the SRTM 30 m DEM. Lastly, each elevation-subset metric was spatially averaged and compared to the WY2016 SCF and WY2016 SDD images.

Assessment Using Snow Telemetry Network in the Western U.S.
The Snow Telemetry (SNOTEL) network is used to assess the SCF and SDD metrics generated in GEE from the MOD10A1 version 6 product. The SNOTEL network contains over 800 automated data collection sites in the Western U.S., with daily records of snow depth, SWE, precipitation, solar radiation, and temperature [24]. For the assessment of the SCF and SDD metrics, the SWE record at 755 SNOTEL sites in 12 states is accessed during the MOD10A1 version 6 satellite collection years, from October 1st, 2000 through September 31st, 2018. We use the Bias Correction/Quality Control (BCQC) SNOTEL dataset from the Pacific Northwest National Laboratory for our analysis [56,57]. We use an 18-year period for the SNOTEL analysis because the 2019 water year has not yet been added to the BCQC dataset.
At each SNOTEL station, the record contains daily measurements of SWE obtained from the snow pillow. A snow pillow relies on measuring the pressure from the overlying mass of snow at a specific location, whereas the MODIS satellite product relies on the reflectance values of light collected passively by the instrument, with snow cover determined by the NDSI values. The reported SNOTEL snow pillow sensitivity is 2.54 mm (0.1 in) of SWE. The instrument limitations have been discussed by previous studies, with accuracy issues divided into two categories, SWE-over measurement errors and SWE-under measurement errors, most of which occur during mid-accumulation season [24,58]. In order to compare the GEE-derived SCF and SDD metrics to the SNOTEL dataset, a SNOTEL-derived SCF and SDD value are calculated at each station for each WY over the 18-year assessment time period. The SNOTEL-derived SCF metric counts the number of days in the WY when the snow pillow data is > 0 and divides this number by 365. The SNOTEL-derived SDD metric finds the last 5-day time period with snow present during each WY, and records the day of the WY when the SNOTEL switches from 'snow' to 'no-snow'. This is repeated for all stations and all years. For more information on the snow pillow sensors used by SNOTEL stations, see the data availability section.

Global Mapping of SCF and SDD
Global SCF (WY2016- Figure 2) is a global representation of the number of days in WY2016 that the MODIS satellite observed snow on the landscape relative to the total number valid observations. The resolution was modified to 2 km from the MODIS 500 m resolution due to the increased image export time required when generating an image with more than 100 million pixels from the GEE code editor. Mountain regions such as coastal Alaska, the Patagonian Andes, the Himalayan Mountains, and the North American cordillera and high latitudes in North America, Asia, and Europe have the highest annual values of SCF. The lowest values represent locations where snowpack is ephemeral or only persists for a few months during the winter. Locations that received no snow are masked out of the image.
Remote Sens. 2018, 5, x FOR PEER REVIEW 6 of 18 SNOTEL snow pillow sensitivity is 2.54 mm (0.1 in) of SWE. The instrument limitations have been discussed by previous studies, with accuracy issues divided into two categories, SWE-over measurement errors and SWE-under measurement errors, most of which occur during midaccumulation season [24,58]. In order to compare the GEE-derived SCF and SDD metrics to the SNOTEL dataset, a SNOTEL-derived SCF and SDD value are calculated at each station for each WY over the 18-year assessment time period. The SNOTEL-derived SCF metric counts the number of days in the WY when the snow pillow data is > 0 and divides this number by 365. The SNOTEL-derived SDD metric finds the last 5-day time period with snow present during each WY, and records the day of the WY when the SNOTEL switches from 'snow' to 'no-snow'. This is repeated for all stations and all years. For more information on the snow pillow sensors used by SNOTEL stations, see the data availability section.

Global Mapping of SCF and SDD
Global SCF (WY2016- Figure 2) is a global representation of the number of days in WY2016 that the MODIS satellite observed snow on the landscape relative to the total number valid observations. The resolution was modified to 2 km from the MODIS 500 m resolution due to the increased image export time required when generating an image with more than 100 million pixels from the GEE code editor. Mountain regions such as coastal Alaska, the Patagonian Andes, the Himalayan Mountains, and the North American cordillera and high latitudes in North America, Asia, and Europe have the highest annual values of SCF. The lowest values represent locations where snowpack is ephemeral or only persists for a few months during the winter. Locations that received no snow are masked out of the image. Northern Hemisphere SDD (WY2016- Figure 3), represents the day of water year that snow disappears from the landscape. We aggregated the day of year values to monthly values for the SDD map visual representation in Figure 3. The highest latitudes and mountain ranges in North America, Europe, and Asia have the latest SDD values, and Greenland contains no data values because it remains primarily snow covered all year. Antarctica was masked out of the analysis completely for the SDD and SCF metrics, and is not represented in our global maps. We focus our SDD results on Northern Hemisphere SDD (WY2016- Figure 3), represents the day of water year that snow disappears from the landscape. We aggregated the day of year values to monthly values for the SDD map visual representation in Figure 3. The highest latitudes and mountain ranges in North America, Europe, and Asia have the latest SDD values, and Greenland contains no data values because it remains primarily snow covered all year. Antarctica was masked out of the analysis completely for the SDD and SCF metrics, and is not represented in our global maps. We focus our SDD results on the Northern Hemisphere for the present study. Recall that SDD represents a water year from October to September. If the same algorithm were applied to the Southern Hemisphere, the results would not align with the definition of a water year defined in this study. In a future version of SnowCloudMetrics, we plan to implement an SDD algorithm that is specific to the water year in the Southern Hemisphere.
Remote Sens. 2018, 5, x FOR PEER REVIEW 7 of 18 the Northern Hemisphere for the present study. Recall that SDD represents a water year from October to September. If the same algorithm were applied to the Southern Hemisphere, the results would not align with the definition of a water year defined in this study. In a future version of SnowCloudMetrics, we plan to implement an SDD algorithm that is specific to the water year in the Southern Hemisphere. The SDD values represent the day of the water year that snow disappears from the landscape and are binned into months for visualization and interpretation. The Southern Hemisphere is not included because the water year date range differs by hemisphere.

Results from the SNOTEL Assessment
In order to assess the GEE-derived SCF and SDD snow metrics, we accessed 18 years of SNOTEL site SWE records at >750 stations across the Western United States. A SNOTEL-derived SCF and SDD value for each WY and each station are compared to the GEE-derived SCF and SDD values in Figure  4 below. The assessment was performed at the colocated 500 m MODIS pixels and the individual SNOTEL station sites. In Figure 4, the two datasets are subset by state, averaged over the 18-year time period, and the SNOTEL dataset is subtracted from the GEE dataset. For the SCF metric, the bulk of the GEE-derived values are lower than the SNOTEL-derived SCF. Likewise, for the SDD metric, the bulk of the GEE-derived occur earlier than the SNOTEL-derived SDD dates. A detailed comparison of the SNOTEL-derived SCF and the GEE-derived SCF metric subset by state is available in Table 1. The root mean squared error (RMSE) of the SCF values, calculated at each station and aggregated by state, show that the GEE-derived SCF metric shows more agreement with the SNOTEL dataset in some regions and elevation bands (Utah, California, and Colorado) than others (Washington, Oregon, and New Mexico). Table 2 compares the GEE-derived and SNOTEL-derived SDD values divided by state. The high elevation stations consistently show more agreement with the SNOTEL dataset than the low-and mid-elevation stations. The effects of these differences between the SNOTEL-derived metrics and the GEE-derived metrics are examined further in the discussion section. The SDD values represent the day of the water year that snow disappears from the landscape and are binned into months for visualization and interpretation. The Southern Hemisphere is not included because the water year date range differs by hemisphere.

Results from the SNOTEL Assessment
In order to assess the GEE-derived SCF and SDD snow metrics, we accessed 18 years of SNOTEL site SWE records at >750 stations across the Western United States. A SNOTEL-derived SCF and SDD value for each WY and each station are compared to the GEE-derived SCF and SDD values in Figure 4 below. The assessment was performed at the colocated 500 m MODIS pixels and the individual SNOTEL station sites. In Figure 4, the two datasets are subset by state, averaged over the 18-year time period, and the SNOTEL dataset is subtracted from the GEE dataset. For the SCF metric, the bulk of the GEE-derived values are lower than the SNOTEL-derived SCF. Likewise, for the SDD metric, the bulk of the GEE-derived occur earlier than the SNOTEL-derived SDD dates. A detailed comparison of the SNOTEL-derived SCF and the GEE-derived SCF metric subset by state is available in Table 1. The root mean squared error (RMSE) of the SCF values, calculated at each station and aggregated by state, show that the GEE-derived SCF metric shows more agreement with the SNOTEL dataset in some regions and elevation bands (Utah, California, and Colorado) than others (Washington, Oregon, and New Mexico). Table 2 compares the GEE-derived and SNOTEL-derived SDD values divided by state. The high elevation stations consistently show more agreement with the SNOTEL dataset than the lowand mid-elevation stations. The effects of these differences between the SNOTEL-derived metrics and the GEE-derived metrics are examined further in the discussion section.      In Table 3 we use a confusion matrix to estimate the ability of the GEE algorithm to correctly identify the presence or absence of snow after performing the cloud adjustment. We disaggregated the GEE dataset into daily images of snow occurrence and compare these images to the SNOTEL time-series for each SNOTEL station. We subset the datasets into early-, mid-, and late-snow season using a confusion matrix. We include the accuracy and Kappa Coefficient values for each subset. The mid-season true positive (TP) values are the highest and the true negative (TN) are the lowest of any subset. The mid-season also contains the best accuracy score while the entire dataset contains the highest Kappa Coefficient. The confusion matrix, overall accuracy, Kappa Coefficient all show that the GEE-derived snow metrics are adequately identifying the presence and absence of snow during all parts of the water year, although there is some variation during different parts of the year. The expected increase in FNs during the early-and late-snow season are also revealed by the confusion matrix analysis, and these are examined further in the discussion section. The SNOTEL station locations include a wide ranges of elevations and latitudes. Figure 5

Results for the Amu Darya Basin Example
For the ADB, yearly SCF images were calculated in GEE from 2001 to 2019. Figure 6a is a map of the ADB, while Figure 6b displays an SCF image exported from GEE for WY2016, and Figure 6c shows the maximum, mean, and minimum values spatially binned by elevation in the ADB. For the following case study, WY2016 was chosen as a sample year to compare with the 19-year averages for each metric. The SRTM 30 m elevation data were spatially averaged into six 1000 m elevation zones. While this short-term record is not long enough to analyze trends or to create SCF climatologies, this

Results for the Amu Darya Basin Example
For the ADB, yearly SCF images were calculated in GEE from 2001 to 2019. Figure 6a is a map of the ADB, while Figure 6b displays an SCF image exported from GEE for WY2016, and Figure 6c shows the maximum, mean, and minimum values spatially binned by elevation in the ADB. For the following case study, WY2016 was chosen as a sample year to compare with the 19-year averages for each metric. The SRTM 30 m elevation data were spatially averaged into six 1000 m elevation zones. While this short-term record is not long enough to analyze trends or to create SCF climatologies, this snow metric provides valuable spatial information that can augment in situ observations of snow water equivalent and snow depth that may exist within the study area.
Remote Sens. 2018, 5, x FOR PEER REVIEW 11 of 18 snow metric provides valuable spatial information that can augment in situ observations of snow water equivalent and snow depth that may exist within the study area. SDD images for the ADB were computed for the same 2001-2019 time period. Maximum, minimum, and mean SDD images were compiled and spatially binned by elevation. Figure 7a is a map of the ADB, while Figure 7b displays the yearly SDD image exported from GEE for WY2016 in which the seasonality of SDD changes from April to September throughout the high elevation Pamir Mountains. Figure 7c shows the maximum, mean, and minimum values spatially binned by elevation. The WY2016 mean SDD values are lower than the 19-year average at the lowest elevations and higher than average at the highest elevations, much like the WY2016 mean SCF values within the ADB Basin. Since these trends are likely to vary by country in this transboundary watershed, we spatially subset the SCF and SDD datasets by country and elevation. Turkmenistan contains only the lowest elevation ranges in the ADB, so we focus our country-level analysis on Afghanistan, Tajikistan, and Uzbekistan. The results of the country-level SCF and SDD metrics within the transboundary ADB ( Figure 8) show that differences arise by country and elevation. The range in SCFmax and SCFmin values for Afghanistan and Uzbekistan is lower in the 3000-4000 m elevation bands, while Tajikistan shows more variability in the SCFmax and SCFmin at the same elevations. The SDDmax and SDDmin values for Afghanistan and Tajikistan have higher ranges in the upper elevations. Note that the ADB within Uzbekistan does not contain elevations above 5000 m and, therefore, were not included in the SDD images for the ADB were computed for the same 2001-2019 time period. Maximum, minimum, and mean SDD images were compiled and spatially binned by elevation. Figure 7a is a map of the ADB, while Figure 7b displays the yearly SDD image exported from GEE for WY2016 in which the seasonality of SDD changes from April to September throughout the high elevation Pamir Mountains. Figure 7c shows the maximum, mean, and minimum values spatially binned by elevation. The WY2016 mean SDD values are lower than the 19-year average at the lowest elevations and higher than average at the highest elevations, much like the WY2016 mean SCF values within the ADB Basin.
Remote Sens. 2018, 5, x FOR PEER REVIEW 11 of 18 snow metric provides valuable spatial information that can augment in situ observations of snow water equivalent and snow depth that may exist within the study area. SDD images for the ADB were computed for the same 2001-2019 time period. Maximum, minimum, and mean SDD images were compiled and spatially binned by elevation. Figure 7a is a map of the ADB, while Figure 7b displays the yearly SDD image exported from GEE for WY2016 in which the seasonality of SDD changes from April to September throughout the high elevation Pamir Mountains. Figure 7c shows the maximum, mean, and minimum values spatially binned by elevation. The WY2016 mean SDD values are lower than the 19-year average at the lowest elevations and higher than average at the highest elevations, much like the WY2016 mean SCF values within the ADB Basin. Since these trends are likely to vary by country in this transboundary watershed, we spatially subset the SCF and SDD datasets by country and elevation. Turkmenistan contains only the lowest elevation ranges in the ADB, so we focus our country-level analysis on Afghanistan, Tajikistan, and Uzbekistan. The results of the country-level SCF and SDD metrics within the transboundary ADB ( Figure 8) show that differences arise by country and elevation. The range in SCFmax and SCFmin values for Afghanistan and Uzbekistan is lower in the 3000-4000 m elevation bands, while Tajikistan shows more variability in the SCFmax and SCFmin at the same elevations. The SDDmax and SDDmin values for Afghanistan and Tajikistan have higher ranges in the upper elevations. Note that the ADB within Uzbekistan does not contain elevations above 5000 m and, therefore, were not included in the Since these trends are likely to vary by country in this transboundary watershed, we spatially subset the SCF and SDD datasets by country and elevation. Turkmenistan contains only the lowest elevation ranges in the ADB, so we focus our country-level analysis on Afghanistan, Tajikistan, and Uzbekistan. The results of the country-level SCF and SDD metrics within the transboundary ADB ( Figure 8) show that differences arise by country and elevation. The range in SCFmax and SCFmin values for Afghanistan and Uzbekistan is lower in the 3000-4000 m elevation bands, while Tajikistan shows more variability in the SCFmax and SCFmin at the same elevations. The SDDmax and SDDmin values for Afghanistan and Tajikistan have higher ranges in the upper elevations. Note that the ADB within Uzbekistan does not contain elevations above 5000 m and, therefore, were not included in the Figure 6. The WY2016 has a unique SCF and SDD signature among all countries in the ADB. At the highest elevations in WY2016, the SCF and SDD values are varied when compared to the 19-year average, with some above, below, or basically equivalent to the 19-year average. Remote Sens. 2018, 5, x FOR PEER REVIEW 12 of 18 Figure 6. The WY2016 has a unique SCF and SDD signature among all countries in the ADB. At the highest elevations in WY2016, the SCF and SDD values are varied when compared to the 19-year average, with some above, below, or basically equivalent to the 19-year average.

Discussion
The GEE cloud computing and visualization tools provide easy, web-based access to petabytes of snow-centric satellite data for scientists and a variety of other potential users. The SCF and SDD metrics calculated in SnowCloudMetrics provide information on the spatial and temporal characteristics of local to global snow cover dynamics. These metrics can offer valuable insights on snow cover processes across multiple spatial and temporal scales, and cloud-based GEE analysis has the potential to transform the types of questions and the geographic scope of scientific snow research. The SnowCloudMetrics app is freely accessible and available to natural resource managers, snow industry representatives, scientists, and the general public. This gives the users the ability to create information based on these satellite datasets that is "fit for purpose" and customizable. We have designed the GEE app functionality to give multiple options to users that include global WY visualization for both metrics, graphing of point-based time series, multi-band GeoTIFF image export, importing of vector-based shapefiles, and SNOTEL station location. See Table 4 for more information about SnowCloudMetrics functionality. In order to get access to the table and GeoTIFF export SCF and SDD images, the user must become a developer with Google Earth Engine and download the source code from GitHub. Detailed instructions are located in the SnowCloudMetrics app, at our website, and in the data availability section of this publication.

Discussion
The GEE cloud computing and visualization tools provide easy, web-based access to petabytes of snow-centric satellite data for scientists and a variety of other potential users. The SCF and SDD metrics calculated in SnowCloudMetrics provide information on the spatial and temporal characteristics of local to global snow cover dynamics. These metrics can offer valuable insights on snow cover processes across multiple spatial and temporal scales, and cloud-based GEE analysis has the potential to transform the types of questions and the geographic scope of scientific snow research. The SnowCloudMetrics app is freely accessible and available to natural resource managers, snow industry representatives, scientists, and the general public. This gives the users the ability to create information based on these satellite datasets that is "fit for purpose" and customizable. We have designed the GEE app functionality to give multiple options to users that include global WY visualization for both metrics, graphing of point-based time series, multi-band GeoTIFF image export, importing of vector-based shapefiles, and SNOTEL station location. See Table 4 for more information about SnowCloudMetrics functionality. In order to get access to the table and GeoTIFF export SCF and SDD images, the user must become a developer with Google Earth Engine and download the source code from GitHub. Detailed instructions are located in the SnowCloudMetrics app, at our website, and in the data availability section of this publication. The access that GEE provides empowers users to create custom datasets that address their questions and specific problems, and may lead to the creation of new information for questions of global or regional interest. While this is generally applicable to all remotely-sensed, global satellite collections (Landsat, Sentinel 1, MODIS, etc.), we have found it to be particularly applicable in the case of global snow products and the application of new snow metrics for scientists. For example, Sproles et al. outline a simple operational hydrologic model, SnowCloudHydro, which predicts streamflow for snow-dominated watersheds [41]. In an Alaskan wildlife study of Dall sheep populations, Van de Kerk et al. used GEE-derived SCF and SDD information to understand the spatially varying effects of snow conditions on lamb survival across multiple sheep ranges [40]. They found that SCF and SDD have a greater impact on northern Dall sheep populations compared to their southern counterparts. Information derived from SnowCloudMetrics is also being incorporated into a local scale climate action plan for the town of Whitefish, Montana [59]. The authors of this plan are using SCF and SDD to explore past and potential future impacts of climate change on their local ski area and the related economic impact. While still preliminary, these examples demonstrate how users can customize these snow metrics for their unique purposes without a considerable investment of hardware or software.
Global snow metrics based on satellite-derived snow cover datasets can provide new information about seasonal snow in data sparse regions. SNOTEL assessment is the first step to verify that these new, global scale metrics perform adequately for use in scientific research. SNOTEL stations are also often located in open, flat, meadow areas with snow cover that persists longer in the season than the surrounding more complex terrain and forest cover [60,61]. An earlier GEE-derived SDD is expected, given the difference in spatial resolution between the two methods and the known challenges of using MODIS snow cover in forested regions [60,62]. During the melt season, the presence of snow at a SNOTEL site is not likely to be represented by the patchy snow cover in a MODIS pixel with heterogeneous land cover. There is likely differential melting of snow occurring throughout the surrounding 0.25 km 2 , with rock, vegetation, and water contributing to the spectral signature of the pixel, thus altering the NDSI values calculated by MODIS during the ablation season [63]. For similar reasons, we expect the GEE-derived SCF values to be lower than the SNOTEL-derived SCF values. This is because ephemeral snow days (several days of coverage followed by complete melt-off) that occur at the beginning or end of the season would likely be measured by the SNOTEL snow pillow, and possibly not characterized as snow cover at the 500 m resolution of the MODIS satellite product.
Still, assessment limitations exist for these datasets. One limitation is that the 500 m scale of the snow cover observation may not be representative of the physical process scales that affect snowpacks, especially in mountain and forested regions [62,64]. Furthermore, a 500 m pixel is likely to represent a combination of snow, rock, vegetation, water, etc. [63] making accurate snow cover mapping a challenge. This brings into focus two issues regarding the assessment and interpretation of SCF and SDD. First, in situ data that are spatially extensive and have daily temporal resolution simply do not exist. A second issue is that spatially extensive data from finer-scale remote sensing data such as Landsat, which can tease apart snow cover in mixed pixels, is only available at 16-day temporal resolution. Interpretation of SCF in terms of SWE is challenging. Conceptually, pixels that have high SCF may have high SWE but this is likely to be regionally specific. For instance, the tundra snow climate can retain snow cover for nine months of the year, yet tundra snowpacks are relatively shallow [65]. Maritime snowpacks of the Pacific Northwestern United States may persist for five or six months, but store considerable amounts of water as snow [65]. We see from our preliminary analyses that lower elevation snowpacks have lower SCF and earlier SDD, implying that they have less SWE than higher elevation snowpacks. Thus, interpretation of SCF in terms of SWE requires understanding of the local-to-regional snow climate conditions and further investigations into the relationship between SCF and SWE are merited.
The confusion matrix analysis reveals some interesting and expected characteristics of the GEE-derived dataset. The SNOTEL station elevations and latitudes would most likely have snow occurrences during the vast majority of the mid-season time periods. The early-and late-season false negatives (FN) are both higher than the mid-season FN values, and this is expected. Short-term or transient snow storms are likely in the early-season, and often they include shallow snow depths that melt away quickly after the storm passes. These storms would likely include multiple cloudy days during snowfall events. Similarly, as the snow melt season progresses, patchy late-season snow would increase the likelihood that a 500-m MODIS pixel may record the absence of snow (according to the NDSI value) and, thus, provide a FN value when in reality, the SNOTEL stations have recorded some amount of SWE on the snow pillow. Additionally, there is one temporal sequence that was omitted from the cloud correction algorithm, the case in which a pixel is snow-free, followed by cloud, followed by snow-free. This would represent a snowfall event at the beginning or end of the season that quickly melts. This case may potentially add to the discrepancy between the GEE-derived metric values and the SNOTEL-derived values.
Despite the assessment concerns, global snow metrics generated using GEE have the potential to overcome these challenges simply because of the diverse global-scale datasets that can be simultaneously accessed within the platform. The MODIS Snow Cover Area and Grain Size (MODSCAG) product from NASA uses spectral unmixing to differentiate the fractional components of the spectral end-members, and provides a proportion of each pixel that is composed of those end-members [60]. While the MODSCAG dataset has not been added to the GEE datasets as of this writing, GEE developers have the ability to program the MODSCAG algorithm in GEE using the spectral unmixing capabilities that already exist within the platform. Many global-scale spectral datasets (top-of-atmosphere reflectance, surface reflectance, etc.) can be accessed in GEE, as well as a ready-made spectral unmixing function. Additionally, the entire Landsat image archive is accessible and GEE allows users to upload their own grid-based products. While global-scale assessment of these snow metrics is at the moment unattainable, the distinct advantage that GEE offers is the ability to cross-reference spatially and temporally continuous global satellite datasets simultaneously within the same platform, algorithm, or function, without the need for the computational resources of a supercomputer.

Conclusions
In this study, we present SnowCloudMetrics (https://www.snowcloudmetrics.app), a snow information web portal that produces two global-to-local scale snow metrics, snow cover frequency (SCF) and snow disappearance date (SDD). Access to relevant and timely global snow datasets has traditionally required proprietary software, remote sensing expertise and prohibitive financial costs for storage, computing resources, and geospatial analysis. Yet seasonal snow plays a crucial role in the hydrological cycle in remote, mountainous regions around the globe where much needed snow information is often sparse or non-existent. By employing GEE cloud computing and cloud storage capabilities, SnowCloudMetrics efficiently accesses the entire spatial (global) and temporal (daily from February 2000 to present) satellite collection for the MODIS/Terra snow covered area binary product (MOD10A1). Anyone with a web browser can create customizable snow information rather than simply downloading spatially specific and temporally partial datasets. This new, cloud-based paradigm for data processing and delivery is applied to the ADB basin in Central Asia, a transboundary watershed where snow dynamics play a critical role in the hydrologic regime of multiple countries. Potential users, such as natural resource managers or snow industry representatives, can generate annual SCF and SDD datasets for any local-to-global scale inquiry in minutes. This new snow information will help these users understand the snow component of their landscapes and watersheds.
Author Contributions: A.W.N. conceived of the new snow metric, Snow Cover Frequency (SCF), and additional development of the Snow Disappearance Date (SDD) metric as an extension of the commonly used snow cover duration concept; R.L.C. wrote the original programming to implement the SCF metric in Google Earth Engine, along with A.W.N. and E.J.M. who further developed the SDD and SCF metric for the SnowCloudMetrics app; R.L.C. performed the spatial analysis for the paper; R.T.P. coded the user interface and SnowCloudMetrics app within Google Earth Engine; E.A.S. contributed to the analysis and development of the hydrologic model application for the SCF metric; R.L.C. wrote the paper, with editorial assistance from R.L.C., A.W.N. and E.A.S. All authors have read and agreed to the published version of the manuscript.
Funding: This research received an incubator grant from the Earth Science Information Partners (ESIP) and by National Aeronautics and Space Administration Grant # NNX16AG35G.