Estimation of Daily Spatial Snow Water Equivalent from Historical Snow Maps and Limited In-Situ Measurements

: We present a scheme aimed at estimating daily spatial snow water equivalent (SWE) maps in real time and at high spatial resolution from scarce in-situ SWE measurements from Internet of Things (IoT) devices at actual sensor locations and historical SWE maps. The method consists of ﬁnding a background SWE ﬁeld, followed by an update step using ensemble optimal interpolation to estimate the residuals. This novel approach allowed for areas with parsimonious sensors to have accurate estimates of spatial SWE without explicitly discovering and specifying the spatial-interpolation features. The scheme is evaluated across the Tuolumne River basin on a 50 m grid using an existing LiDAR-based product as the historical dataset. Results show a minimum RMSE of 30% at 50 m resolutions. Compared with the operational SNODAS product, reduction in error is up to 80% with historical LiDAR-measured snow depth as input data.


Introduction
Estimating the spatial distribution of snow water equivalent (SWE) at a basin scale and in a timely manner is crucial for efficiently operating downstream water-supply reservoirs and hydropower networks. This need has become more pressing with a changing climate where past trends no longer predict the future. For instance rising temperatures and increase in precipitation is causing unexpected increases in streamflow in the Indian Himalayan Region [1]. On-the-ground sensors across watersheds in California and across the world lack sufficient coverage to accurately determine spatial snowpack information or other hydrologic inputs and states [2]. However, with The advent and rapid development of accessible cyber-physical systems technology, watershed instrumentation is expected to increase. We find multiple such systems reported in the recent literature [3][4][5]. Four clusters of state-of-the-art Wireless Sensor Network based measurement systems recently deployed in the Feather River watershed better represent the variability of SWE across the landscape and complement spatial estimates from existing snow pillows [6].
Existing methods for spatial SWE estimation are outlined in Reference [7] and fall into the following categories: (i) spatial interpolation from in-situ sensors constrained by remote sensing, (ii) SWE reconstruction using snowmelt models given the point of disappearance of snow determined from remote sensing, (iii) global SWE remote sensing based on passive microwave, (iv) a snow model assimilated by dense in-situ sensors or by remote sensing products such as Sentinel-2 [8] or Lansat [9] and (v) emerging methods such as air-borne Light Detection and Ranging (LiDAR) altimetry. When no SWE estimation is possible due to the lack of in-situ sensors or remote sensing, a mass balance physical model, calibrated by manual snow measurements is used to directly estimate streamflow [10]. Multiple methods described above can be combined together, for instance in Reference [11] an energy balance model is updated with both remotely sensed snow cover and extrapolated in-situ snow sensor data. SWE reconstruction on its own cannot be used in a real-time context because its estimate depends on a future observation of snow disappearance, typically determined from remote sensing such as MODIS [12], Landsat, or The state-of-the-art Sentinel-2 [13]. Remote-sensing data from NASA, such as the Advanced Microwave Scanning Radiometer-EOS (AMSR-E) and Moderate Resolution Imaging Spectroradiometer (MODIS) [14], provide coarse estimates of SWE (16 × 16 km) and fractional snow cover (500 × 500 m), respectively, with no information about sub-pixel variability. MODIS fractional snow cover does not provide information about the amount of snow after the pixel is saturated (i.e., totally covered with snow); and when it is useful during the melt phase, it is often subject to canopy-induced error [15]. AMSR-E's low resolution renders it unusable for estimating the intra-basin distribution of SWE in complex terrain. On the other hand, in-situ operational sensors provide frequent and accurate measurements but with low spatial coverage. Air-borne LiDAR scans combined with snow density values provide SWE estimates at a large spatial scale, but as a stand-alone method are currently expensive to conduct at a frequent temporal scale.
Many studies have tried to combine different measurements with a physical snow model to benefit from both the accuracy of the measurements and the temporal resolution and continuity of the model [11,[16][17][18][19]. Such methods are typically used with dense in-situ measurements, remote sensing or interpolation from scarce measurements, and require forcing data such as precipitation and temperature.
Regression with augmented explanatory variables from a historical remote-sensing-based SWE reconstruction model was used in Reference [20] to interpolate in real-time in-situ SWE measurements across basins in the upper Colorado River basin in the US. Results show that not only did the added features largely improve physiographically based regression beyond the physiographic conditions represented by the snow-sensor network, but on average it explained more than twice the variance of any one of the physiographic variables. Using an accurate historical product for real-time snow interpolation is thus more desirable than solely interpolating based on physiographic features, especially when available instrumentation is scarce.
To that end, Reference [21] exploited the inter-annual consistency in mountain snow distribution shown in References [22][23][24] mainly due to snow loading, and used the nearest-neighbor technique to find the nearest historical LIDAR-derived SWE scan that best fits current conditions, and then corrected the residuals using the Gaussian Process to predict from physiographic features such as elevation, canopy cover, aspect and slope. Results were promising, however around 30 hypothetically placed sensors were used, which implies a significant investment in additional operational instrumentation. Basins with such high instrumentation are very rare in reality, and thus a need arises to achieve similar performance using fewer sensors, which would enable near-real-time spatial SWE estimation across wide geographies. Many mountainous regions across the globe such as the Himalaya [10] and the Andes [25] suffer from a deficiency of measurements and would benefit from such endeavor. Ideally, if better accuracy is needed, we should aim to supplement existing systems with those that add the most information.
The work described here aims to combine the spatial information contained in historical products with realistic scarce in-situ measurements of SWE, in order to generate a near real-time daily SWE spatial product. Unlike the majority of work described above, the method we propose does not explicitly use any physiographic features to spatially interpolate SWE. Rather, it uses historical SWE patterns, which have embedded spatial dependencies. it blends various approaches outlined by Reference [7]. We use information derived from (ii)-(v), depending on the historical product that is selected, to implement (i) using methods derived from data assimilation (iv). Section 2 presents the statistical Ensemble Optimal Interpolation (EnOI) method we use, its mathematical formulation and variations. Variations in ensemble selection schemes are first investigated (see Supplementary Materials) using simulations conducted on the Feather River Basin using a daily 90 m resolution SWE product spanning 30 years and the entirety of the Sierra Nevada mountain chain. We compare our results to other published methods using simulations conducted on the Tuolumne River Basin, where bi-weekly 50 m LiDAR-derived SWE scans have been conducted beginning in 2014. LiDAR-derived SWE maps that should provide closest representation of the truth are scarce in existence and are only available for few basins. This study seeks to answer the following questions: 1. to what extent does the proposed method based on the Ensemble Optimal Interpolation (EnOI) improve estimation of daily spatial SWE compared to existing methods? 2. is it better to use a LiDAR-derived but temporally sparse historical SWE product or a Landsatderived daily product in the daily SWE estimate? 3. How does the proposed method compare to SNODAS, a current operational SWE product?

Materials and Methods
The main scheme of the study consists of assimilating an a priori field representing the possible spatial SWE distributions with daily point SWE measurements using the Ensemble Optimal Interpolation method (EnOI) described in Section 2.1.

Ensemble Optimal Interpolation (EnOI)
The adopted EnOI method consists of a Bayesian approach to optimally combine background estimates with measurements. The forward model is not dynamical or physically based, and thus the method can be thought of as more similar to an Ensemble based Gaussian Process (GP). An Ensemble approach is used because due to the high resolution of the SWE maps, the dimension of our system is high, making it impractical to handle the covariance matrices required for this method, which would have the size of the square of the number of pixels. After estimating the background SWE map as the mean of the field, a difference between the estimate and the actual observations, often called residuals, will likely remain. Such residuals are re-evaluated to further improve the estimate by updating the background field with observations across the spatial extent. We first define ensemble members Ψ i ∈ R n (i = 1, ..., N), where N is the ensemble size and n is the dimension of the model state. The ensemble of model states is stored in a matrix A: In our case, A holds all the N historical maps Ψ i of SWE linearized into columns of length n. n is equal to the product of the map's rows and columns. The state-space equations can be written as: where x t is the background forecast map field of day t and is completely defined by the selected prior ensemble A with meanĀ. ffl accounts for the forecast error, assumed to be a random normal field and is defined by the covariance of the selected ensemble A. The observed states (or pixels) are retrieved from the state space using the following equation: where d t is the observation vector made of a collection of SWE sensor measurements at time t. H ∈ R mxn is the measurement operator relating the true model state Ψ t to the observations d t and allowing for zero mean normally distributed measurement errors fl. The Ensemble Kalman Filter (EnKF) and EnOI were introduced and used by References [26,27] to efficiently assimilate sensor measurements into non-linear physical models. EnOI is an efficient approximation of the EnKF method and is extensively used in oceanology [28][29][30]. The ensemble mean of A can be expressed as A = A1 N where 1 N ∈ R NxN is the matrix where each element is equal to 1 N . We define the ensemble perturbation matrix: Given a vector of measurements d ∈ R m , where m is the number of measurements, D ∈ R mxN is defined to hold the perturbed measurements and Υ ∈ R mxN to hold the perturbations only with ensemble mean equal to zero. The measurement-error covariance matrix can then be expressed as: P e = ΥΥ T N−1 ∈ R mxm . Detailed derivation of the Ensemble Kalman Filter analysis equation can be found in Reference [31] and can be expressed by both the standard equation in terms of the ensemble covariance matrices: A a = A + P e H T (HP e H T + R e ) −1 (D − HA) (5) or using the ensemble of innovation vectors defined as D = D − HA where P e = A A T N−1 ∈ R nxn is the ensemble covariance matrix. At every model time step, the EnKF integrates N ensemble members in time regardless of the availability of measurements and dynamic covariance of the time-evolving ensemble is used at the analysis step. On the other hand, the EnOI only integrates 1 member Ψ ∈ R n and the analysis is computed in the space spanned by a stationary ensemble denoted A s of model states as shown in the equation: where a α ∈ (0, 1] is a parameter giving different weights for the forecast and measurement-error covariances. α is used because the variance of a stationary (i.e., not changing in time) ensemble over a long period usually overestimates the instantaneous variability. In the subsequent experiments, the parameter α is tuned to minimize the cross-validation error at the sensor locations of the day during the test year. The EnOI is thus more computationally efficient than the EnKF but at the cost of slightly lower performance. The forecast model is usually a dynamical model, but in our case, it is simply the mean of the background field. In addition, the forecast model and observations must be unbiased. Some have used a Multivariate Optimal Interpolation (MVOI) technique where the covariance Pe is modeled by explicit functions [29,32]. However, Reference [29] demonstrated that using the model-based covariances of EnOI has more benefits . The rich data provided by historical snow-cover products should provide adequate values for the complex spatial correlations and the anisotropic nature of the snowpack, a task more challenging, if at all possible, to achieve using explicit functions.
Unlike the EnKF, EnOI does not provide an absolute uncertainty estimate for the updated state because of the method of procuring the prior ensemble distribution and the α scaling parameter; however the post-analysis ensemble spread of the field does describe the relative uncertainty between the state pixels. Localization is typically used when the ensemble size is small, to mitigate the effect of spurious correlations appearing between physically non-correlated states (for instance between distant pixels). However, in our case, severe localization will have a potentially detrimental effect since the background estimate's bias is not negligible. Regions with reduced correlations due to localization will retain the background mean estimate after analysis, contributing to strong biases in the result. The appropriate way to deal with this issue is to choose a background estimate with low bias. Moreover, localization becomes an attractive option when the number of sensors becomes large enough that their field of influence becomes highly superimposed. This is not our case, where only scarce measurements are available. Nevertheless, a study of the effect of localization is encouraged as future work.
Finally, it is interesting to observe the similitude between EnOI and GP; where GP can be thought of as EnOI with background estimate with mean equal to zero and where the α and localization characteristic length in EnOI are indirectly estimated using optimization methods in GP. Moreover GP supports kernels that could substantially change the state-space covariance structure compared to a simple historical sample covariance used EnOI.

Prior-Ensemble Sampling
Before evaluating the performance of the proposed method, we evaluated three different schemes to generate the prior-ensemble estimate that describes our prior state as well as the error covariance of that state. Results of this method selection are shown in the supplementary material. The reader who is interested in the main results rather than the details of this evaluation may want to skip Section 2.2.
In the first scheme, the ensemble consists of all the available historical maps of SWE (except for the test water year). The historical maps of SWE used consist of the 90 m Landsat-derived reanalysis product in Reference [9] and further described in Section 2.4. We refer to this scheme in supplementary by the label menoi. This ensemble is static and does not change throughout the daily simulation.
The use of a static ensemble to represent the error covariance is probably sub-optimal. The covariance structure likely changes in time, especially between the accumulation and ablation phases of the snowpack. In the second scheme, the ensemble consists of the collection of historical SWE maps occurring at the same day of each of the other water years. We name this scheme menoic. The Ensemble mean in this scenario consists of the well-known climatologic-mean estimate. The ensemble thus changes every day.
In the third scheme, the ensemble consists of the collection of yearly nearest-neighbor maps to the measured SWE from the historical record. We name this scheme menoi_ynn. This approach should reduce the bias in the prior when sensor locations used in the nearest-neighbor procedure are non-biased estimates of the spatial SWE. Note that in all three schemes we employ only data-driven covariances. The menoic and menoi_ynn schemes use a dynamic ensemble that changes daily, while the menoi scheme uses the same static ensemble for all the days of the water year.
The three different methods outlined above are compared with each other over the Feather River basin where eight existing snow-pillow sensors are currently operational. Summary statistics are in Table S1 and Figure S1, with annual evaluations in Figures S2-S12.

Experimental Setup
We used Tuolumne basin as test basin for our central study, as a biweekly LiDAR-derived SWE product is readily available for use as reference for true spatial SWE. The ensemble is sampled as all the available historical maps of SWE (except for the test water year). The historical maps of SWE used consist of the 90 m Landsat-derived reanalysis product in Reference [9] and the 50 m LiDAR-based product in Reference [33] for the schemes menoi_Pr and menoi_Li respectively and are found in the summary Table 1.
The sensor locations used in the experiments are those of existing snow pillows. The snow-pillow network is maintained by the California Department of Water Resources (DWR). These remote in-situ sensors "directly" measure SWE through the weight of snow and transmit the measurements to an online database. They are typically sparsely scattered throughout the basin, as shown in Figure 1. Measurements from other systems such as the WSNs can be incorporated into the estimation process by simply appending rows to the measurement operator matrix H in Equation (3) that maps the new measurement value appended to vector d to the state (or pixel number) in the background vector x. The uncertainties of the background forecast field is calculated as the sample covariance of the ensemble selected, whose mean is the forecast map. In Situ measurements are assumed to be near-perfect during the simulation run and their uncertainties are modelled as white noise with negligible standard deviation. Therefore, at each update step, the total uncertainty of the prior SWE map will decrease. When using actual sensor measurements from the field, it is desirable to model and include their uncertainty, taking into consideration the different sources of error: intrinsic sensor imperfections, environmental noise, measurement error, and interpolation error from in-situ to the area of the grid cell. One way to model measurement uncertainty is to use the standard deviation of burst measurements.  [9]. b described in Reference [33]. c described in Reference [20].
Both the menoi_Li and menoi_Pr are then compared with the feature-selective multivariate regression (a potentially real-time method) schn [20] and the non-real-time Landsat-derived method marg [9] obtained from the recent literature, as well as with the operational method SNODAS [34] at the Tuolumne River basin using the LiDAR-derived spatial data product in Reference [33] as reference.

Dense Historical Samples
In the menoi_Pr scheme, we use a newly developed state-of-the-art snow water equivalent (SWE) Bayesian reanalysis dataset based on the assimilation of remotely sensed fractional snow-covered area data over the Landsat 5-8 record [9,35] to select the ensemble from (representing covariances) This product was also used for all the supplementary material experiments on the Feather river. The reanalysis datasets ranges from 1985 to 2015. The product's spatial and temporal resolutions are 90 m and daily, respectively. It is reported that a comparison with in-situ data showed a mean and root-mean-squared errors (RMSE) less than 3 and 13 cm, respectively [9].
The product covers the entirety of the Sierra Nevada range. The choice of a widely available product allows for convenient experiment replication on all the watersheds in the Sierra Nevada. We should note that using Sentinel-2 with a 5 days of revisit time for snow cover data instead of, or in addition to Landsat (which has 16 days) would provide a more-accurate dataset of daily SWE, and should be considered for future work [36].
For the EnOI update step, using the entire historical product as ensemble in method menoi_Pr, yields an ensemble size on the order of 10,000. Such a size can be prohibitive to store in memory using ordinary Desktop computers. We thus reduce it to 100 using the method outlined in Reference [31]first, the Singular Value Decomposition (SVD) of the ensemble perturbations A is computed. We then retain only the first N singular values of the decomposition and regenerate a 100-member ensemble to represent the covariance. This eigenvalue-based dimensionality reduction retains the directions of dominant variances.

Scarce Historical Samples
Given the scarcity of available LiDAR scans, the LiDAR-derived historical [33] product yields few ensemble members that are used in the menoi_Li scheme. An ensemble size of only approximately 20 was used. LiDAR provides rich spatial information, 3 m resolution measured snow depth and 50 m resolution modeled SWE, and is considered a highly accurate spatial product. NASA's Jet Propulsion Lab in partnership with the California DWR developed and applied an imaging spectrometer and scanning LiDAR system, termed the Airborne Snow Observatory, which measures snow depth at unprecedented basin-wide scale and spatial resolution of 3 m and produces spatial SWE maps at 50 m resolution spanning the Tuolumne and Merced river basins by coupling measured snow depth with the iSNOBAL snow model [33].
Four sensor measurements at existing snow-pillow locations, shown overlaid on the elevation map in Figure 1, were simulated by extracting their values from the withheld LiDAR test year. Unlike the Landsat-based product, where the modeled daily SWE maps exist for each day of the test water year, LiDAR-derived SWE maps were available biweekly for parts of 2014 -2017, and thus we estimated the SWE only for those days. The results can demonstrate the potential limit of the method in terms of number of historical scenes available. Results of this experiment can reveal both the reconstruction skill of the proposed methods using both a dense and scarse historical product, and The SWE accuracy of the result compared to LiDAR. We should note, however, that a modeling error is inevitably introduced when raw LiDAR snow-depth measurements were converted to SWE, as described in Reference [33].
The real-time product proposed in References [20,37] uses a method that selects the most-informative independent variables out of a collection of up to 16 to interpolate SWE measured by a collection of sensors at sparse locations using the multivariate regression. Those independent variables include: elevation, north-western barrier, south-western barrier, distance to ocean, latitude and closest SWE map from the historical product. The referenced research uses a reconstruction product as historical product, while we will use the LiDAR-derived product instead for the Tuolumne and call the product schn.
The product termed marg cannot be generated in realtime, as it depends on future satellite observations; nevertheless we also simulate it to compare with LiDAR at Tuolumne.
Finally, after downscaling the SWE maps obtained from using the menoi_Li and menoi_Pr methods, we compare them to SNODAS.
Note that for all of the experiments, each test water year is excluded from the ensemble selection process and is subsequently used as the "truth".
The different experiments underline the flexibility in dealing with different products and spatio-temporal scales (both 90 m, 50 m and 1 km resolution daily to bi-weekly) and evaluate whether such method is appropriate and ready to implement operationally.

Datasets Summary
The following datasets and source codes are used in the experiment: •

Results
In Figure 2, menoi_Li extracts the ensemble from the sparse LiDAR-derived product (up to 21 scans) while the menoi_Pr extracts the ensemble from the Landsat-derived product (total of >3000 scenes per pixel). The RMSE plots show that on average the menoi methods both outperform the feature-selective multivariate-regression method schn. For the 2016 average water year, menoi_Li shows an approximately 25%, 30% and 50% lower SWE RMSE compared to marg, menoi_Pr and schn, respectively during the peak snow season (on days 1 April 2016, 7 April 2016, 16 April 2016 and 26 April 2016) in Figure 3). In the midst of the snowmelt season of the wet water year 2017, both menoi_Li and menoi_Pr show a 50% lower RMSE compared to schn. menoi_Li has a higher RMSE during the early melt season of the dry periods but lower during the later melt season of 2015. menoi_Pr shows a lower RMSE than the marg method during the early melt season of the dry year 2014, but not during the end of its snow season; and unlike menoi_Li, consistently had higher RMSE during 2016 compared to marg. On average, both menoi methods have substantially lower RMSE (up to 80% and 70% during dry and wet years respectively) than the operational SNODAS product. The heat maps in Figure 3 show a consistent improvement in error distribution in terms of mean and standard deviation for menoi_Pr compared to SNODAS. The spatial patterns for menoi_Pr are also better matching with the "truth" map (LiDAR), although not always better than SNODAS in terms of the amount of SWE for every pixel. SNODAS seems to be generally over-estimating SWE across the basin especially during the dry year 2014. On The other hand, menoi_Pr tends to underestimate SWE (with exception of day 7 April 2016).

Discussion
The key contribution of this work is that we present a method that can be used to generate real-time high-quality SWE maps (90 m and 50 m resolution) using a small number of representative SWE sensors (e.g., Tuolumne only 4 stations were used), which is not possible with traditional regression techniques that necessitate a higher number of sensor stations to adequately fit pre-selected features and observations to models. For instance, the method based on NN and Gaussian Process Regression (GPR) described in Reference [21] is a state-of-the-art method that aims to achieve a similar goal as that described here, however finding the GPR parameters for a good performance requires approximately 10 times the number of SWE sensors than are currently available. The work presented here is an attempt to break those constraints, while safeguarding the quality of the result.
Comparing the results obtained on Tuolumne basin with a similar elevation region such as the mountainous reaches of central Chile and Argentina in Reference [11], we see potentially better performance during the peak season. Noting that the other study was conducted at 500 m resolution on a region 100 times the size of Tuolumne and using 3 times the number of sensors as well as forcing data, energy balance model and remotely sensed fractional snow-covered area, their overall reported peak SWE RMSE is 274 mm. Our results indicate that for both average year 2016 and wet year 2017, the peak SWE spatial RMSE (aggregating every grid on the map) of menoi_Pr is below 250 mm and that of menoi_Li even below 100 mm for 2016 when using LiDAR historical data. While our approach could be applied at a larger scale, there is also benefit to making predictions at the scale of water-resources decision making, which for the Sierra Nevada is the watershed-scale.
Feature-based methods such as Reference [21], among many others, might better adapt to changing feature conditions. Our proposed method is vulnerable to changes in spatial correlations of snow. For example, wildfires can change the landscape and the snowfall and snowmelt dynamics. This could be easily accounted for in feature-based methods by updating the vegetation feature. On The other hand, the method presented here would require historical spatial maps of SWE with the new reduced canopy. Nevertheless, most features currently used are stationary, for example, elevation, aspect, northness. A warming climate will change relative snow patterns with elevation [25,38,39]. For example, an earlier recession in the snow line. In Reference [25], trends with significant increase in the snowline and changes in the persistence of snow in some areas were detected. This could result in a change in inter-pixel correlation especially at low elevations. As weather events become more extreme, new and unseen changes in snow patterns could challenge the accuracy of the spatial correlations extracted from historical SWE maps, however this can be compensated by (i) increasing the number of in-situ snow sensors measuring real-time data, (ii) improving the ensemble-selection method and (iii) fusing near-real-time remotely sensed fractional-snow-cover data. For instance, the ensemble can be sampled using a sliding window approach of most recent N scenes. As those new climatic changes become a trend, their effect become more pronounced on the spatial correlations being used. Another approach can consist of constructing the ensemble representing the co-variance from a truncated set of yearly nearest-neighbor snow maps that could outperform the static method of selecting all maps and that of selecting a climatologic sample. This in part reflects the inter-yearly similarities in snow patterns depending on the timing of precipitation and ablation patterns. For instance, rain-on-snow events tend to dramatically change SWE spatial patterns and hence it is best to only include historical scenes that share such events in the ensemble.
The second contribution is the novel and purely data-driven ability to interpolate in-situ SWE observations without the explicit use of physiographic features or satellite observations, but by statistical means through the historical ensemble that incorporates effects of features unknown to the researcher or potentially challenging to measure. For instance, snow loading necessitates wind data that, unlike elevation, are rarely if at all available at the needed resolution or spatial extent. Other features such as differing geographical orientation of mountain ranges also influence snow distribution and can be challenging to model.

Alternative Selections of Prior Ensemble
As shown in Table S1, on average the menoi ensemble selection scheme slightly outperforms the other two (menoic and menoic_ynn) in terms of reducing the 32-year median MAE (from 22 to 20%) and RMSE (from 39 to 35%). menoic_ynn performed best in terms of median bias (7 compared to 9 and 13%). This superior performance of menoi could be explained by the use of more scenes in the ensemble selection process compared to the other two methods (100 vs. 31). This results in a finer spatial covariance structure. The plots show that the menoi, having a stationary ensemble, resulted in the smoothest RMSE series, while the menoi_ynn exhibits occasional jumps due to the changes in the nearest-neighbor-matched maps.
A synergistic combination of methods could lead to an even superior outcome since no method is consistently better than the other. In some periods, menoic had better performance, whereas in other periods menoi_ynn was better. The disadvantages of menoi_ynn is that NN scenes for a given day could be selected from historical days that are outside the snow season of that day; for instance, a best-matching scene during a summer melt period could be a scene from the winter accumulation period of a different year. It is desirable to limit the yearly NN search window to a few months from the current simulation/estimation day to address this potential disadvantage, and benefit from a more appropriate climatic covariance structure.
Moreover, it could be attractive to prune or replace scenes (NN of years with low matching score) from the ensemble with the objective of reducing the background ensemble bias as much as possible without too adversely affecting the covariance structure or the ensemble size (to avoid spurious correlations). The menoic's main disadvantage is that snow-season timings are not inter-yearly constant on the daily scale. Dry years often have a shorter snow season than wet years, and The dates of onset and disappearance of snow vary. Therefore it constitutes a more-biased background field in non-average scenarios. Note that our current estimation method is stateless. it would be interesting to add a state to the simulation, where the evolution of the state (SWE) and not the state itself is estimated based on menoi_ynn with the ensemble sampling strategy, considering accumulation and ablation periods and amount of snowfall.
It is evident in Table S1 that errors as low as 22% (RMSE of 2009) and 13% (MAE of 2009) and 0 biases can be achieved. Those well-performing scenarios typically occur when the biases of the estimates are low. The methods do not perform well during dry years and at the end of the snow-melt seasons. This is where real-time remote-sensing information becomes crucial for accuracy, especially when the majority of sensors are reporting zero SWE because they are not located at high-elevation where snow persists. Given the scarcity of the sensor network used, this could explain the high errors reported during such situations, shown by the U shaped error curves of Figures S2-S12. This error pattern is also found in most of the literature employing interpolation methods [37]. However, we should note that near-real-time satellite information on snow cover from MODIS, Landsat and Sentinel-2 is needed to significantly improve the snow-melt season error as demonstrated in Reference [40]. This concern could also be addressed by strategically adding sensors. Another possible reason is that during dry years, snow distribution is more dependent on less-stationary variables such as wind direction, temperature fluctuations and storm occurrence compared to wet seasons where stationary-variables such as elevation and topography have the dominant influence on snow distribution. This implies that real-time satellite based information are mostly beneficial during dry years.

LiDAR vs. Satellite-Derived Products
Comparison with the 50 m biweekly LiDAR-derived products showed that, on average, the LiDARderived menoi_Li outperformed other methods during the peak snow-season. During those periods it is better to use LiDAR for ensemble selection in the menoi_Li method instead of the Landsat-derived product menoi_Pr. This is mostly evident during water-year 2016, where it even outperformed the Landsat-derived product from Reference [9]. Note that the latter product cannot be generated in real time since it uses Landsat information from later in the snowmelt season for the reanalysis. However the statistical advantage of sampling from an extensive historical product could be highlighted during the few days where menoi_Pr outperformed menoi_Li even during the peak snow season of the wet year 2017. Moreover, menoi_Pr's RMSE is less volatile compared to menoi_Li, probably due to the larger number of ensembles used.
The Landsat-derived menoi_Pr outperformed the selective-regression method [20] during the majority of the peak snow periods across all simulation years ( Figure 2). It only outperformed the non-real-time Landsat-product during the dry 2014 water year but not the 2016 wet water year ( Figure 2). Given that the product used in menoi_Pr is Landsat-derived, it explains why it would have better performance during dry seasons since Landsat snowcover does not report SWE information but snowcover and thus is mostly useful during the patchy snow cover of dry/melt seasons.
Moreover, the menoi_Li method resulted in water-year 2016 RMSE errors (ex: 14-15 cm from 01-04-2016 to 26-04-2016) similar to those obtained in Reference [21] from the interpolation of SWE across the same region, but using 22 measurements with GPR instead of four with menoi_Li. This comparison implies that the added information from the historical LiDAR product was able to mitigate the information loss of reducing the sensors from 22 to 4, and further reinforces prior findings that SWE exhibits strong inter-annual spatial patterns.
The plots comparing RMSEs with SNODAS ( Figure 2) show significant error differences that highlight the importance of the airborne and ground measurement systems in estimating spatial SWE.
In summary, results imply that the proposed method could have a beneficial impact in the spatial interpolation of SWE whenever accurate historical data that are either LiDAR-or remote sensing-derived are available and only a few continuous ground measurements are available ( Figure 3). However sensors installed at higher elevations are crucial for dry and end of melt seasons. We assume that LiDAR-derived SWE maps are closer to ground truth than the energy-balance reconstruction methods. However this is yet to be evaluated. Furthermore, it is not known how this method would compare with the feature-based regression methods when dense sensor measurements are available. More work is needed for such cases to generalize the conclusion from those findings.
An additional benefit of the presented method is that the information held in the covariance of the historical ensemble can be used to strategically place sensors. The near-optimal algorithm consists of sequentially finding the sensor location that maximizes the reduction in total basin ensemble variance during the EnOI update step explained in Section 2. The algorithm is only near optimal because the sensor locations are found sequentially and thus we do not go through all the placement combinations to find the one that maximizes the variance reduction. The latter would be intractable, especially at high resolutions. With every new sensor location found, the remaining ensemble variance decreases until it reaches a saturation point. This is demonstrated in the left plots of Figures S14 and S16 for Feather and Tuolumne basins respectively where near-optimal sensor locations shown in Figures S13 and S15 respectively were selected. Sensor placement is validated by the observed saturated decrease in 1985 to 2016 mean of April 1st simulated SWE spatial RMSE as more sensors are added, shown in the right plots of Figures S14 and S16.

Conclusions
We have thus presented and evaluated a purely data-driven scheme that proved to be beneficial for real-time estimation of high-resolution spatial SWE across mountain basins having a small number of on-the-ground measurement points.
The scheme performed better during average and wet water years and not as well during dry years and late melt season where snow disappeared from the few sensor locations and spatial snow patterns probably depends on less stationary processes. This problem will only become more acute in a warming climate. To tackle this disadvantage, also observed in existing real-time methods such as SNODAS, we recommend strategically placing additional sensors. Another approach successfully used in the literature is to use real-time snow cover information from remote sensing.
Such findings validate previous studies where historical SWE maps were found to provide important information for estimating SWE in real time. We find that past spatial patterns of SWE are still the best predictors of future or current ones. However, the challenge is to identify those past patterns most informative of the current conditions, as many possibilities exist. As climatic conditions are changing and extreme-weather conditions are occurring more frequently, strategically placed ground measurements systems that minimize such uncertainty are the best tools to fill this gap.
It is best to use, on average, EnOI with a covariance represented by a large sample of LiDAR-derived product for SWE interpolation. Historical LiDAR data seem to capture most accurately the SWE of regions with deep snow where satellite observations cannot. As LiDAR scans are presently limited in coverage, more scans are needed. Alternatively, deriving the covariance from the synergistic combination of historical reconstruction products with remote-sensing-derived products (such as Landsat and Sentinel-2) could approach the LiDAR performance.
Finally, though such a method relies on a historical dataset, it provides an alternative to a dense and extensive measurement system that cover large portions of the watershed and potentially complex modeling using physiographic features. Only a few sensor clusters at key locations and with a representative estimate of their surrounding resolution were needed. it thus provides a convenient and scalable data-driven approach to spatially interpolate SWE sensors' measurements that could be very useful for high-mountain or glacio-hydrology such as the Arctics, Sierra Nevada, Himalaya, and Andes regions. For regions where no historical SWE maps are available, one must first use readily available historical remotely sensed fractional snow cover to generate a reconstruction record of SWE maps. Related future work of interest would be to formulate a method to identify such sensor placements with maximum information gain adapted for large dimensions and find the number of sensors needed to achieve a pre-specified error tolerance.
Supplementary Materials: The following are available online at http://www.mdpi.com/2306-5338/7/3/46/s1, Figure S1: Peak snow-season error statistics for each ensemble-selection method for water years from 1985 to 2016, Table S1: Peak snow-season error statistics for each ensemble-selection method for water years from a 1985 to 2016, Figure S2: Feather River menoi RMSE as percentage of the daily spatial mean SWE with different Ensemble selection for water-years 1985, 1986 and 1987, Figure S3: Feather River menoi RMSE as percentage of the daily spatial mean SWE with different Ensemble selection for water-years 1988, 1989 and 1990, Figure S4: Feather River menoi RMSE as percentage of the daily spatial mean SWE with different Ensemble selection for water-years 1991, 1992 and 1993, Figure S5: Feather River menoi RMSE as percentage of the daily spatial mean SWE with different Ensemble selection for water-years 1994, 1995 and 1996, Figure S6: Feather River menoi RMSE as percentage of the daily spatial mean SWE with different Ensemble selection for water-years 1997, 1998 and 1999, Figure  S7: Feather River menoi RMSE as percentage of the daily spatial mean SWE with different Ensemble selection for water-years 2000, 2001 and 2002, Figure S8: Feather River menoi RMSE as percentage of the daily spatial mean SWE with different Ensemble selection for water-years 2003, 2004 and 2005, Figure S9: Feather River menoi RMSE as percentage of the daily spatial mean SWE with different Ensemble selection for water-years 2006, 2007 and 2008, Figure S10: Feather River menoi RMSE as percentage of the daily spatial mean SWE with different Ensemble selection for water-years 2009, 2010 and 2011, Figure S11: Feather River menoi RMSE as percentage of the daily spatial mean SWE with different Ensemble selection for water-years 2012, 2013 and 2014, Figure S12: Feather River menoi RMSE as percentage of the daily spatial mean SWE with different Ensemble selection for water-years 2015 and 2016, Figure S13: Near-optimal sensor placement of 23 (black and green) and additional 15 (red) complementing the existing 8 sensors at Feather basin, overlaid on the normalized ensemble variance map of the Landsat-derived product, Figure S14: Reduction in normalized basin-wide ensemble variance (left) and in the 1985 to 2016 mean of April 1st SWE spatial RMSE (right) given the near optimal sensor placement of the 23 and complimentary 15 (to the existing 8) sensors at Feather basin shown in Figure S13. The red horizontal