Mountain ecosystems in the western US are very sensitive to changes in climate [1]. Snowpack is melting earlier [4], affecting fragile alpine systems; for example, the peak wildflower season, which closely follows the snow disappearance day each year, is shifting [5]. Unprecedented changes to mountain forests from fires and insect outbreaks [7] are also the result of a changing climate, and forest managers are already opening gaps in areas of dense forest to improve ecosystem health and resilience [8]. Ecologists and conservation biologists study these systems using in situ monitoring, which is limited to point observations and therefore precludes understanding the areal extent and magnitude of the ongoing changes [9].
Changes to snow cover in particular are likely to influence mountain forests and meadows, because snow cover in the mountains dictates the growing degree days (the accumulation of warmth). Snow cover can thus affect the phenological advancement of species and, in turn, the emergence of pollinators (e.g., insects). Earlier snowmelt could lead to earlier flowering, which might cause asynchrony with pollinators or make the plants susceptible to frost events [10]. In a warming climate, an early spring can also lead to spring runoff that causes flooding, followed by drier summers [12]. Understanding the heterogeneity of snow cover could inform biologists about changes in ecosystems and water managers about the snow cover variability determined by seasonal snowmelt, which is a major component of hydrological estimates [14].
The analysis of high-resolution (m-scale) lidar-derived snow depth datasets provides insights into the main factors controlling snow distribution and melt patterns; the most important driver has been shown to be elevation, followed by slope, aspect, and vegetation cover. With lidar-derived snow depth datasets, one can observe snow in openings, in forest gaps, and under forest canopies, and how it varies in space and time at high spatial resolutions of 1–3 m [12]. Comparisons of airborne lidar snow depths with manual measurements of snow around trees have shown that airborne lidar provides reliable estimates of snow depth in forested areas [12], with agreement generally within 5 cm at 0.5–5 m resolutions [16]. High-resolution snow-covered areas can be derived from the lidar-derived snow depth data using a prescribed threshold [17]. In 2013, NASA started airborne snow depth collections in the Tuolumne watershed in California, USA, weekly during the ablation season under a program called the Airborne Snow Observatory (ASO [18]). The airborne lidar snow depth collections have since been expanded to other areas, but expanding them globally is cost-prohibitive, and the collections remain limited.
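As an illustration of deriving a binary snow-covered area from lidar snow depth with a prescribed threshold, the following is a minimal sketch. The function names, the no-data code of -1, and the default threshold of 0.10 m are assumptions made for this example and are not taken from the ASO processing chain.

```python
import numpy as np

def snow_mask_from_depth(snow_depth_m, threshold_m=0.10):
    """Return 1 where depth exceeds the threshold, 0 elsewhere, -1 for no-data (NaN).

    The 0.10 m default is an illustrative assumption.
    """
    depth = np.asarray(snow_depth_m, dtype=float)
    mask = np.full(depth.shape, -1, dtype=np.int8)   # start as no-data
    valid = ~np.isnan(depth)
    mask[valid] = (depth[valid] > threshold_m).astype(np.int8)
    return mask

def snow_covered_fraction(mask):
    """Fraction of valid pixels flagged as snow."""
    valid = mask >= 0
    if not valid.any():
        return float("nan")
    return float((mask[valid] == 1).mean())

# Small synthetic snow depth grid in metres (not real data).
depth = np.array([[0.00, 0.05, 0.30],
                  [0.12, np.nan, 0.80]])
mask = snow_mask_from_depth(depth, threshold_m=0.10)
fraction = snow_covered_fraction(mask)
```

Keeping no-data pixels as a separate code, rather than treating them as snow-free, avoids biasing the snow-covered fraction where the lidar returned no valid depth.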
Fortunately, high spatiotemporal resolution imagery is available via Planet Labs, Inc. (Planet) [19], which has the potential to transform the study of earth processes through remote sensing. Planet operates a constellation of CubeSats, small satellites with a 3U form factor (10 by 10 by 30 cm), that images the entire land surface of the earth daily. The “PlanetScope” constellation is composed of roughly 130 satellites that operate in sun-synchronous orbit. The resulting product is orthorectified surface reflectance (SR), delivered at 12-bit radiometric resolution and 3–5 m spatial resolution, covering the visible and near-infrared bands [20].
Detecting snow from remote-sensing imagery requires a model to translate bands into predictions of snow, and machine learning (ML)-based methods have proven to be extremely promising at detecting snow cover. The majority of studies have used multispectral sources (e.g., MODIS and Landsat-8) because of their band diversity. Commonly used methods are support vector machines (SVM) and random forests (RF), but deep learning-based methods are increasingly being applied [21]. Along with the multispectral bands, derived indices have also been used (e.g., the Normalized Difference Vegetation Index (NDVI) and the Normalized Difference Snow Index (NDSI)); these yield snow cover area (SCA) estimates that have coarser spatial resolutions (e.g., 500 m) but finer temporal resolutions (e.g., daily) [22]. Studies that applied a suite of ML methods to a multispectral dataset combined with ancillary information such as topography found that non-spectral attributes are of limited importance in increasing model performance [23]. Spatial resolution and accuracy tend to be issues in mountainous areas when using multispectral data (e.g., MODIS), because coarser data products capture less of the environmental variation in very heterogeneous environments. Newer approaches have begun fusing multispectral satellite data with unmanned aerial vehicle (UAV)-based acquisitions, or using only UAV-produced datasets, to address deficiencies over localized areas [25]. A hybrid approach has also been proposed, in which a convolutional neural network (CNN) extracts features and an RF-based method then estimates the snow-covered area [27]. Other explorations using spaceborne synthetic aperture radar (SAR) data instead of multispectral data have also shown promise in mapping snow-covered areas; SAR-based methods are less affected by clouds, but SAR data have limitations in dense forests [28]. Furthermore, recent work by Cannistra et al., 2021 [29], has demonstrated that snow cover can be successfully mapped from PlanetScope data using a CNN-based machine-learning approach with the 4-band (R, G, B, and NIR) PlanetScope surface reflectance data as input. The study highlighted the viability of a CNN-based model trained on lidar-derived snow cover data despite limited radiometric bandwidth and band placement, but, similar to other satellite-derived data, the performance of the model is lower in forested areas.
Snow cover mapping in forested areas remains a complex challenge, as it is driven by vegetation type, topography, and climate. The accumulation patterns of snow in forests vary as a function of distance from the canopy: under the canopy, at the canopy edge, and in gaps [12]. Dense canopy contributes to shading caused by tall trees, and the proximity of varied overstories moderates the interception of snow [30]. Furthermore, the directional variance around the trees has also been found to influence snow accumulation patterns [15]. The relative influence of these factors on snow depth varies as a function of location and snow season [31]. For instance, Tennant et al., 2017 [32], showed that elevation explained most of the variability in snow depth (16–79%) in forested areas, but aspect explained more variability (11–40%) in open areas. Cristea et al., 2017, also showed that terrain may matter more in open areas than in vegetated areas. Snow–forest interactions also vary as a function of climate and radiation effects, with snow disappearing first in the open or under the forest depending on local conditions [33]. These studies demonstrated that small-scale variability can be observed with lidar and identified the terrain and vegetation features controlling the spatial distribution of snow depth and, hence, snow cover. Based on these observations, we hypothesized that augmenting CNN-based models with terrain-derived and vegetation predictors is likely to improve predictive models in forested areas (e.g., [32]).
Therefore, in this study, we augment the Cannistra et al., 2021, CNN snow cover model with additional predictors: vegetation structure (lidar-derived canopy height from a canopy height model (CHM) and the PlanetScope-derived Normalized Difference Vegetation Index (NDVI)), and the digital elevation model (DEM, or elevation) and its derived attributes (i.e., slope, aspect, and northness). The use of NDVI as a reliable predictor for canopy is supported by its sensitivity to vegetation response, despite the availability of many other vegetation indices [35]. We then evaluate whether these augmentations improve upon the original band-only model performance using the m-scale SCA produced from PlanetScope imagery. In our assessments, we test two hypotheses: (1) terrain-derived predictors and vegetation information improve snow mapping accuracy in both forested areas (FA) and open areas (OA), and (2) terrain-derived predictors such as slope and aspect are more accurate over open areas than in forested areas, where elevation is more important. We evaluate performance across forested areas, near the canopy edge, and in open areas (gaps) using a set of canopy classification metrics.
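The additional predictors named above can be sketched with NumPy as follows. This is an illustrative sketch, not the processing used in the study: the function names are hypothetical, the grid is assumed to be north-up (row 0 is the northern edge), and northness is assumed to be the cosine of aspect, since the exact definition is not given in the text.

```python
import numpy as np

def ndvi(nir, red):
    """NDVI = (NIR - Red) / (NIR + Red); NaN where the denominator is zero."""
    nir = np.asarray(nir, dtype=float)
    red = np.asarray(red, dtype=float)
    den = nir + red
    out = np.full(den.shape, np.nan)
    np.divide(nir - red, den, out=out, where=den != 0)
    return out

def terrain_attributes(dem, cell_size_m):
    """Slope (rad), aspect (rad, clockwise from north) and northness from a DEM.

    Assumes a north-up grid: row 0 is the northern edge, column 0 the western edge.
    """
    dem = np.asarray(dem, dtype=float)
    dz_drow, dz_dcol = np.gradient(dem, cell_size_m)
    dz_dnorth = -dz_drow            # row index increases southward
    dz_deast = dz_dcol
    slope = np.arctan(np.hypot(dz_deast, dz_dnorth))
    # Aspect is the downslope direction, i.e. opposite to the gradient.
    aspect = np.mod(np.arctan2(-dz_deast, -dz_dnorth), 2 * np.pi)
    # Assumed definition: cos(aspect); aspect is undefined on flat cells, so use 0 there.
    northness = np.where(slope > 0, np.cos(aspect), 0.0)
    return slope, aspect, northness

# Synthetic 3 m DEM that rises toward the south, so the slope faces north.
dem = np.array([[0.0, 0.0, 0.0],
                [3.0, 3.0, 3.0],
                [6.0, 6.0, 6.0]])
slope, aspect, northness = terrain_attributes(dem, cell_size_m=3.0)
```

With this convention, a north-facing slope has northness close to 1 and a south-facing slope close to -1, which is one common way to encode aspect without the discontinuity at 0°/360°.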
Overall, our results suggest that combining Planet imagery with other landscape- or satellite-derived features (especially elevation and vegetation) will improve predictions of snow cover, especially under dense canopies. However, modeling approaches, site locations, and data availability added complexity to the predictions, implying that higher-quality remote-sensing products and further methodological improvements could further improve snow-cover predictions from Planet data. In our study, we evaluated the addition of new features to an existing CNN-based model, originally based only upon unaltered PlanetScope band data, to map snow-covered areas from PlanetScope imagery at m-scale spatial resolution. These additional features included a vegetation index (NDVI), canopy height, DEM, and topographic features derived from the DEM (slope, aspect, and northness).
Our hypothesis that terrain and vegetation structure improve the mapping of snow in forested areas was substantiated, with models using NDVI (a proxy of canopy greenness) giving more accurate predictions than those using terrain-based features (DEM, slope, aspect, and northness). The NDVI-based model was the best-performing model overall, and the DEM-derived model was better for pixels directly under trees. Our second hypothesis, that slope, aspect, and northness features would be more accurate in open areas than in forested areas, was not supported: contrary to our expectation, models including these features overestimated snow in open areas.
5.1. NDVI Model (Preferred) Performance
In general, all the models perform well, with band-based models (models that use only the spectral bands, including the NDVI model) still outperforming the DEM- and canopy-height-based models. Given that the F-scores across all our predictor combinations are close to one another (except for the model with slope, aspect, and northness), all of them are suitable candidate models. The band-based models capture the dynamics of snow cover, but the model with the derived vegetation index (NDVI) is more precise (Precision of 0.88 vs. 0.94). Furthermore, the NDVI-based model has the highest balanced accuracy (0.78), a metric valued in the case of imbalanced classes. The DEM-based models (except for the slope-, aspect-, and northness-based model) have similar performances (both have F-scores of 0.88), suggesting that they would have similar predictive abilities and that the DEM-only model might be the more parsimonious choice. Canopy-based models are comparable to “BASE” (F-scores of 0.86 and 0.85), suggesting that adding canopy height on top of the spectral bands does not drastically improve model performance. The slope, aspect, and northness model is less precise and less accurate (F-score of 0.65), but it does show high Recall. This is likely because of a high proportion of true positives (pixels correctly classified as no snow) compared to a significantly smaller number of false negatives (pixels incorrectly marked as snow) but a higher number of false positives (pixels incorrectly marked as no snow).
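For reference, the metrics discussed above follow directly from the confusion-matrix counts. The sketch below uses the standard definitions; the function name and the counts are hypothetical and do not reproduce the values reported in this study. It is agnostic to which class is treated as positive.

```python
def classification_metrics(tp, fp, fn, tn):
    """Precision, recall, F-score and balanced accuracy from confusion-matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    f_score = (2 * precision * recall / (precision + recall)
               if (precision + recall) else 0.0)
    # Balanced accuracy averages the per-class recalls, so it is not
    # dominated by the majority class when classes are imbalanced.
    balanced_accuracy = (recall + specificity) / 2
    return {
        "precision": precision,
        "recall": recall,
        "f_score": f_score,
        "balanced_accuracy": balanced_accuracy,
    }

# Hypothetical counts: a fairly balanced classifier.
balanced = classification_metrics(tp=80, fp=20, fn=10, tn=90)

# Hypothetical counts: few false negatives but many false positives,
# which gives high recall together with low precision and F-score.
high_recall = classification_metrics(tp=95, fp=100, fn=5, tn=50)
```

The second example shows how a model can combine high Recall with a low F-score: missing few positives does not compensate for a large number of false positives.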
We show that multispectral bands alone are sufficient for mapping snow-covered areas in forested areas. It is well known that band indices work well for deriving SCA; for example, the MODIS-based snow cover product uses a combination of NDVI and NDSI to determine snow cover [44], and more recent CNN-based models have also found spectral bands to be sufficient for determining snow cover [23]. The NDVI-based model (Model 1) was the best model (F-score of 0.89) in our study. In this case, the model is likely associating the range of NDVI with the presence or absence of snow, i.e., both high NDVI values (closer to 1, as for trees) and low values (close to −1) are associated with no snow (Figure 7E). This result is contrary to some other studies in which NDVI was found to be limiting as a predictor [22]; however, this could be because of the coarser resolution of the product [45] that was used (MODIS has a 500 m multispectral resolution vs. the 3–5 m multispectral resolution of PlanetScope). Several studies using ML affirm the importance of multispectral bands in determining snow cover regardless of spatial resolution [22]. The other models were also relatively good, with the F-score varying by 15% across the remaining models. The elevation-based models were the second best: elevation is an important driver of the presence of snow and hence helps delineate snow cover. Because NDVI is calculated from the existing bands, no additional datasets are required.
5.2. Effect of DEM (Elevation) and Its Derived Attributes
Models using the DEM-derived attributes were also better at classifying snow in forested and open areas. This is consistent with other studies in which northness, slope, and aspect have been shown to influence the presence of snow [24]. We found, in general, that the DEM-based models show comparable performance across different geographic areas (Figure 6A,B). Although they still underestimate snow in forest understories, they perform better than the NDVI- and CHM-based models. At Gunnison, the DEM-derived model (especially the one with slope, aspect, and northness) performs better under canopies than the rest of the models (F-score of 0.92 at the Gunnison site using DCE), but the same model is comparable to the rest of the models in Engadin (note that we had less snow on the ground; see Appendix B).
5.2.1. Effects of the DEM Resolution on the Training Performance
We show that the accuracy of SCAs generated using PlanetScope imagery in forested complex terrain is also subject to DEM resolution. The use of the fine-resolution DEM (3 m) improves the detection of snow (Figure 7 and Figure 8) over the 30 m resolution DEM: the 3 m model provides an accuracy of 0.85 vs. 0.78 from the SRTM-based model (Figure 10). However, we posit that the performance achieved in this study using the coarser DEM might be reproducible in other geographic areas, because coarser-resolution DEMs with global coverage are readily available [47].
5.2.2. Effects of the DEM Resolution on the Prediction Performance
We find that the coarser DEM was a reasonable replacement for the 3 m DEM. Specifically, the F-scores were comparable whether models incorporated the 3 m or the coarser DEM as the input (Figure 10). Over the Gunnison site (n = 19), the accuracy score of the model with the 3 m DEM and its derived attributes as input was 0.89, whereas a similar model trained using a coarser DEM gave a score of 0.87, a loss in accuracy of 0.02 (about 2%). Moreover, with the coarser DEM, the model was not able to distinguish snow in 1 out of the 19 scenes. The difference in accuracy is not substantial, but it does highlight the limitations of using a coarser DEM in DEM-based models. However, we expect that using DEM features from existing publicly available DEMs such as SRTM will give reliable model performance at other sites. Moreover, fine-resolution DEMs are not available everywhere, so the promise of a coarser DEM with only a slight decrease in accuracy suggests that the model is theoretically applicable in many more geographic areas than it would be if only fine-resolution DEMs were used.
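To give a sense of what is lost when a 30 m DEM stands in for a 3 m DEM, the sketch below coarsens a synthetic fine-resolution grid by block averaging and then resamples it back onto the fine grid with nearest-neighbour repetition. The text does not state how the coarser DEM was resampled to the image grid, so the resampling method, the function names, and the synthetic terrain are all assumptions for illustration.

```python
import numpy as np

def coarsen_block_mean(dem, factor):
    """Average non-overlapping factor x factor blocks (e.g. 3 m -> 30 m for factor=10)."""
    dem = np.asarray(dem, dtype=float)
    rows, cols = dem.shape
    if rows % factor or cols % factor:
        raise ValueError("DEM shape must be divisible by factor")
    blocks = dem.reshape(rows // factor, factor, cols // factor, factor)
    return blocks.mean(axis=(1, 3))

def upsample_nearest(dem, factor):
    """Repeat each coarse cell so the grid matches the fine (image) resolution."""
    return np.repeat(np.repeat(dem, factor, axis=0), factor, axis=1)

# Synthetic 3 m DEM: a 20 x 20 ramp with small-scale roughness (not real terrain).
rng = np.random.default_rng(0)
ramp = np.add.outer(np.arange(20.0), np.arange(20.0))
fine = ramp + rng.normal(0.0, 0.5, size=(20, 20))

coarse = coarsen_block_mean(fine, 10)       # 2 x 2 cells, nominally 30 m
resampled = upsample_nearest(coarse, 10)    # back on the 20 x 20 grid
rmse = float(np.sqrt(np.mean((resampled - fine) ** 2)))
```

The nonzero `rmse` reflects the within-cell elevation detail that a coarse DEM cannot represent; slope and aspect derived from the resampled grid are smoothed in the same way.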
5.3. Applicability of Explored Models
Models can represent the SCA across forested terrain, but their skill in doing so varies as a function of forest-cover density. We note that the NDVI-based model can map the snow in open areas, gaps, and areas with sparse trees. The use of canopy quantification provided insights into model performance across canopy classes. Generally, models (including the best-performing NDVI-based model) performed better at canopy edges and in open areas than in under-canopy areas. Models also showed significantly better performance in the under-canopy areas at the Engadin site than at Gunnison. The F-scores across Gunnison ranged from 0.78 to 0.90 and were 6% higher than when using the BASE model.
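A per-class evaluation of this kind can be sketched by masking the predicted and reference snow maps with a canopy-class raster before computing the F-score. The class codes, class names, and pixel values below are hypothetical and only illustrate the bookkeeping; they are not the canopy metrics or results of this study.

```python
import numpy as np

# Hypothetical canopy-class codes for this example.
CLASS_NAMES = {0: "open", 1: "canopy_edge", 2: "under_canopy"}

def f_score(pred, ref):
    """F-score for binary maps where 1 is the positive class."""
    tp = int(np.sum((pred == 1) & (ref == 1)))
    fp = int(np.sum((pred == 1) & (ref == 0)))
    fn = int(np.sum((pred == 0) & (ref == 1)))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else float("nan")

def f_score_by_class(pred, ref, classes, names=CLASS_NAMES):
    """Compute the F-score separately for the pixels of each canopy class."""
    pred, ref, classes = (np.asarray(a) for a in (pred, ref, classes))
    return {name: f_score(pred[classes == code], ref[classes == code])
            for code, name in names.items()}

# Synthetic flattened pixels: reference map, model prediction, canopy class.
ref     = [1, 1, 1, 0,  1, 1, 0, 1,  1, 1, 1, 0]
pred    = [1, 1, 1, 0,  1, 1, 0, 0,  1, 0, 0, 0]
classes = [0, 0, 0, 0,  1, 1, 1, 1,  2, 2, 2, 2]
scores = f_score_by_class(pred, ref, classes)
```

Reporting the score per class, rather than one scene-wide value, makes it visible when good performance in open areas hides poor performance under the canopy.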
The use of different canopy classes allowed us to benchmark model performance as a function of land-cover classifications and identify the critical thresholds at which the models succeed or fail. The addition of NDVI improves the performance (F-score of 0.87) of mapping snow cover in forested areas (Under Canopy metric in Table 2) and is better than the CHM (canopy height)-based and DEM (elevation)-based models in these areas [44], but the inclusion of DEM with slope, aspect, and northness is far more accurate (F-score of 0.92). The use of DEM-derived attributes improves the prediction under canopies, where optical methods clearly have disadvantages.
We also note model improvements in different land-cover classes—open, sparse medium, and forested areas, with the performance varying by geographical area by 25% in F-score across the Gunnison site. Snow-covered tall vegetation would have a relatively higher albedo than snow-free shorter vegetation (<1 m) that has lower albedo; hence, optical collections such as PlanetScope are able to detect snow in low vegetative and open areas.
We document geographic differences in snow predictability in all our model evaluations. The F-score varies between 0.73 and 0.93. The snow characteristics at the two test sites inherently differ because of geomorphic and climatic differences. Snowmelt dynamics are complex, as physiographic features play vital roles in snow accumulation and melt [30]. This is particularly relevant to our study, which used training data from the snow ablation period, when snow distribution is most heterogeneous. We also caution that this might limit the model’s transferability to a different geographical area (e.g., the Tibetan plateau [51]). Multi-site composite training might alleviate this issue and could be explored as a next step [23].
The vertical accuracy used in thresholding snow depth was found to be reliable. The threshold value used for lidar-derived snow classification played a minimal role in the differences we observed in the model’s performance (Figure 9). We find that the performance is similar for thresholds from 8 cm to 10 cm and is lower below 8 cm and above 10 cm. This suggests that any threshold between 8 and 10 cm should suffice in model development, a finding echoed in other works as well [18].
5.4. Model Feature Selection and Training Volume
Adding more features increased the training time substantially: for example, a BASE model took an average of 4 h to train, whereas a DEM-derived model (DEM along with slope, aspect, and northness) took five times longer. In machine-learning models, a simpler model is preferred, as it helps prevent overfitting [52]; increasing the number of dimensions in the model input makes it more likely that the model captures both real and random effects.
Firstly, we chose a limited set of predictors based on our physical understanding of snow dynamics in the system, and we believe it is likely that the models we ultimately selected would also perform well at predicting snow under trees in other snowy and forested systems. However, we suggest that those wishing to build similar predictive models in their own system consider other predictive variables to optimize their model to their local circumstances.
Secondly, the NDVI model captures the terrain dynamics, as shown by the canopy quantifications, but still misses snow in dense canopies. This limitation is likely due to correlated reflectance in PlanetScope bands [54], variable radiometric quality, and the general difficulty in capturing the reflectance of snow via optical methods in forest understories [48]. We speculate that performance would be similar if other fractional band measures were considered (e.g., a Green index that differs from the NIR and Green bands that are linked to forest canopy [55]), again because of the narrower bands of PlanetScope and the high degree of correlation between the bands [56]. We expect that improved radiometric resolution and the addition of new bands (such as the red edge) will help improve the predictions of snow in forest understories in the near future. Moreover, in general, snow under a forest canopy is challenging to observe via optical methods because of forest canopy cover and the resulting low signal-to-noise ratio [57]. The availability of additional bands (e.g., shortwave infrared (SWIR)) would also enable the use of the Normalized Difference Snow Index (NDSI), which might improve the predictability of the model in forests [23] and would further allow MODIS-type explorations that utilize broader band availability [44].
We used predictors such as vegetation structure (canopy height and NDVI), digital elevation models (DEM), and DEM-derived attributes to produce 3 m snow-covered area from PlanetScope imagery and to evaluate improvements in two representative river basins, Tuolumne and Gunnison (USA), and at Engadin (Switzerland). Overall, we find that the inclusion of NDVI in the model increases model transferability more than DEM and DEM-derived attributes such as slope, aspect, and northness. Our best model, which used NDVI along with the visible (red, green, and blue) and NIR bands, captures the influence of vegetation on snow distribution. Specifically, the use of vegetation proxies (NDVI and canopy height) and terrain-derived measures was found to improve the accuracy of detecting snow in forested areas. The use of slope, aspect, and northness improves the ability to predict snow in forest understories. The best model, with F-scores of 0.89 (Gunnison) and 0.93 (Engadin), was found to be 4% and 2% better than the models using canopy height and terrain-derived measures at Gunnison, respectively. The NDVI-based model results in the best snow-identification performance in both forested and open areas compared to the other models. Furthermore, the model that added only the DEM and its derived attributes was also found to be transferable in the test areas. The use of slope, aspect, and northness was found to overpredict snow in open areas. Even though optical methods are known to have shortcomings in observing snow in dense forest understories, our model’s improvements, along with detailed canopy-based evaluation metrics such as those presented here, can be used to improve model performance for various types of forest features (i.e., gaps, canopy edges, and dense overgrowth). Climate change projections show hydrologic changes in many mountainous basins; the approaches used in our study could be beneficial in mitigation efforts regarding climate change uncertainties.
Improving snow-covered area identification in forested areas could improve hydrologic modeling accuracy and help to estimate late-season snowpack distribution. Our model holds promise, as it can better predict snow in forested areas in sync with the captured imagery.