Automated Classification of Terrestrial Images: The Contribution to the Remote Sensing of Snow Cover

The relation between the fraction of snow cover and the spectral behavior of the surface is a critical issue that must be approached in order to retrieve the snow cover extent from remotely sensed data. Ground-based cameras are an important source of datasets for the preparation of long time series concerning the snow cover. This study investigates the support provided by terrestrial photography for the estimation of a site-specific threshold to discriminate the snow cover. The case study is located in the Italian Alps (Falcade, Italy). The images taken over a ten-year period were analyzed using an automated snow-not-snow detection algorithm based on Spectral Similarity. The performance of the Spectral Similarity approach was initially investigated comparing the results with different supervised methods on a training dataset, and subsequently through automated procedures on the entire dataset. Finally, the integration with satellite snow products explored the opportunity offered by terrestrial photography for calibrating and validating satellite-based data over a decade.


Introduction
Snow cover is an important component of the cryosphere that plays a key role for climate dynamics and the resources availability: the seasonality of the snow cover influences, in fact, weather patterns, hydropower generation, agriculture, forestry, tourism, and aquatic ecosystems [1][2][3].Remote sensing is the most common tool for the routine estimation of the snow cover extent.However, two different aspects must be considered for the enhancement of the final output: time and spatial resolutions.Both components, using remotely sensed data, are connected to each other, since the higher the spatial resolution (below hundreds of meters), the lower the revisit time interval (more than 1 week) [4].
The state-of-the-art snow products concerning the snow extent are remotely sensed and they are based mainly on multispectral optical sensors.They can investigate the snow cover and give information about the size and the shape of snow grains [5]; the presence of impurity soot; the age of the snow; and the presence of depth hoars.Furthermore, the short-wave infrared signal can support the discrimination between snow and clouds [6].The estimation of the snow extent from remotely sensed multispectral images is based on the relation between the radiative behavior of the surface and the Fractional Snow Cover (FSC).This parameter describes the percentage of surface covered by snow [7] in a pixel element of a remotely sensed image.Considering that snow-covered surfaces are highly reflective in the visible range and low reflective in the short-wave infrared (swir) [8], it is possible to define an index that enhances the discrimination between snow and not snow in a single pixel.This index, defined as Normalized Difference Snow Index (NDSI), is calculated as follows: NDSI = green − swir green + swir (1) The green and the swir parameters are the bands available for each satellite sensor and their selection includes generally wavelength ranges centered at 500-600 nm (green) and 1500-1600 nm (swir).The relation between the FSC and the Normalized Difference Snow Index (NDSI) represents the most common inference required by remote sensing studies.There are two options for estimating the NDSI-FSC relation: the first one consists in combining satellite products with different spatial resolution [9,10]; and the second one can be approached having a ground truth information.The first solution is based on [8] combining Landsat and MODIS data and a NDSI to FSC relation is defined.FSC = 1.45 × NDSI − 0.01 (2) This knowledge is implemented in the SNOWMAP algorithm [11], which is the core of the MODIS data chain for the definition of remotely sensed snow products.The second solution can be approached defining an empirical reflectance-to-snow-cover model that requires a calibration having a number of reference sites in the satellite image.The most important example is the so-called Norwegian Linear Reflectance-to-snow-cover algorithm (NLR) [12] that is the core of the GlobSnow Snow Extent (SE) data chain [13].From this perspective, the availability of webcam networks is an important data source for calibration and validation processes.The attention of the scientific community of this proxy is increasing, and the literature about this topic is growing [9,[14][15][16][17].Furthermore, several tools (for example, FMIPROT and PRACTISE) can be considered for research purposes [18][19][20].The solutions available for the analysis of webcam imagery are commonly based on two different processes: orthorectification and classification.While the geometrical issue is based on the mathematical solution of the relationship between pixel elements and the ground surface, the detection of snow cover represents the real cognitive gap.The classification issue can be approached, following the applications available for the remote sensing imagery, using supervised, unsupervised or object-oriented methods [21], depending on the number of images that must be processed.
The focus of this paper is to investigate the contribution of the terrestrial photography to define site-specific thresholds useful for studying the snow cover with remotely sensed data.The expected outcomes are: (i) the description of an automated procedure able to process long time series of ground-based images; (ii) the comparison between automated approaches and supervised methods; (iii) and the evaluation of the potential contribution of terrestrial photography to the snow cover retrieval from remotely sensed data.

Methods
The purposes of this study required the investigation of different components and the integration of different data sources (Figure 1).The accomplishment of the declared objectives was approached selecting a study site where ground-based cameras were positioned for a decade.The first part of the effort was devoted to the analysis of the available terrestrial dataset.In this case, the selection of the most appropriate procedure was obtained considering the automated procedures and the supervised methods in order to check the overall performance of automated solutions under different conditions of illumination and snow coverage.Secondly, the collection of different satellite products provided the material useful for evaluating the potential impact of terrestrial photography on the estimation of snow extent from remotely sensed data.
Geosciences 2019, 9 FOR PEER REVIEW 3 different conditions of illumination and snow coverage.Secondly, the collection of different satellite products provided the material useful for evaluating the potential impact of terrestrial photography on the estimation of snow extent from remotely sensed data.

Study Area
The considered study area (Figure 2) is located in the Italian northeastern Alps (Lago di Cavia, Falcade, Italy).The webcam (46°21'24" N, 11°49'20" E, WGS84) was positioned in a ski resort at 2200 m above the sea level.The study site is characterized by a snow cover duration almost complete from mid-November to late April, a melting season completed at the beginning of June and occasional snowfall in the rest of the year [22].The selection of the site for the camera setup was supported by the topographic behavior of the location, which is an almost flat area with a soft slope where an artificial water body is located.The presence of an important ski facility and the management of this water resource outline the importance of this location.

Camera Setup
The webcam system was provided by Sistemi Video Monitoraggio S.r.l.(Romito Magra-SP) using a digital camera (Olympus C765).The camera was deployed at 2 m above the ground.The camera was featured by 4-megapixel sensor and a 1/2.5"CCD, the focal length was 6.3 mm and images were saved in the jpeg data format with an 800 × 600 pixel resolution.Data logging and transmission were provided by specific hardware placed into a waterproof case and the power supply was ensured by the direct connection to the electric mains and by photovoltaic panels with a buffer battery.Data transfer was performed using an intranet connection with the receiving station located in Arabba through a mobile connection.The Veneto Regional Agency for Environmental Protection and Prevention developed a webcam section in the website (www.arpa.veneto.it) that supported the near-real-time availability of the images.The field of view defined by the camera perspective considered an area of about 5000 m 2 with a maximum distance from the camera up to

Study Area
The considered study area (Figure 2) is located in the Italian northeastern Alps (Lago di Cavia, Falcade, Italy).The webcam (46 • 21 24" N, 11 • 49 20" E, WGS84) was positioned in a ski resort at 2200 m above the sea level.The study site is characterized by a snow cover duration almost complete from mid-November to late April, a melting season completed at the beginning of June and occasional snowfall in the rest of the year [22].The selection of the site for the camera setup was supported by the topographic behavior of the location, which is an almost flat area with a soft slope where an artificial water body is located.The presence of an important ski facility and the management of this water resource outline the importance of this location.

Camera Setup
The webcam system was provided by Sistemi Video Monitoraggio S.r.l.(Romito Magra-SP) using a digital camera (Olympus C765).The camera was deployed at 2 m above the ground.The camera was featured by 4-megapixel sensor and a 1/2.5"CCD, the focal length was 6.3 mm and images were saved in the jpeg data format with an 800 × 600 pixel resolution.Data logging and transmission were provided by specific hardware placed into a waterproof case and the power supply was ensured by the direct connection to the electric mains and by photovoltaic panels with a buffer battery.Data transfer was performed using an intranet connection with the receiving station located in Arabba through a mobile connection.The Veneto Regional Agency for Environmental Protection and Prevention developed a webcam section in the website (www.arpa.veneto.it) that supported the near-real-time availability of the images.The field of view defined by the camera perspective considered an area of about 5000 m 2 with a maximum distance from the camera up to 180 m.The camera acquired all-year-round images every 1 hour since 2004 to 2013.For this study, we considered a "complete" dataset with about 8000 images where every melting season was included in order to have a large range of snow cover and illuminating conditions.In addition of that, we defined a "small" dataset with 30 images dating back to 2008 and 2009, which included a large variability in terms of illuminating conditions and snow cover.
Geosciences 2019, 9 FOR PEER REVIEW 4 180 m.The camera acquired all-year-round images every 1 hour since 2004 to 2013.For this study, we considered a "complete" dataset with about 8000 images where every melting season was included in order to have a large range of snow cover and illuminating conditions.In addition of that, we defined a "small" dataset with 30 images dating back to 2008 and 2009, which included a large variability in terms of illuminating conditions and snow cover.

Terrestrial Image Classification
Following the guidelines developed for the analysis of multispectral remotely sensed images, the classification issue can be approached using different principles depending on the methods for measuring the spectral matching or the spectral similarity: the deterministic-empirical methods and the stochastic approaches [23].The deterministic measures include the spectral angle, the Euclidean distance and the cross-correlation of spectral vectors in the hyperspectral space.The stochastic measures evaluate the statistical distributions of spectral reflectance values of the targeted region of interests.Within this framework, a large variety of classification methods that can be grouped from different perspectives [24].

Supervised Methods
The requirement for the automated solution is a "parametric" method, based on a "per-pixel" classification about the presence of snow cover.The description of the pixel content must be definitive and, consequently, a "hard" classifier is necessary.Furthermore, the classification process cannot be iterative and specific for a single image.Consequently, the generalization for different images, under different illumination conditions, can be obtained with a "supervised" classifier, which considers a "training" Region Of Interest (ROIs) associated with the theoretical "white" snow.Looking at the supervised methods, we can include classifiers that are sensitive to the user experience during the definition of the region of interests and to the selection of discriminant parameters between snow and not snow.Some methods are associated with the threshold selection defined by the statistics of

Terrestrial Image Classification
Following the guidelines developed for the analysis of multispectral remotely sensed images, the classification issue can be approached using different principles depending on the methods for measuring the spectral matching or the spectral similarity: the deterministic-empirical methods and the stochastic approaches [23].The deterministic measures include the spectral angle, the Euclidean distance and the cross-correlation of spectral vectors in the hyperspectral space.The stochastic measures evaluate the statistical distributions of spectral reflectance values of the targeted region of interests.Within this framework, a large variety of classification methods that can be grouped from different perspectives [24].

Supervised Methods
The requirement for the automated solution is a "parametric" method, based on a "per-pixel" classification about the presence of snow cover.The description of the pixel content must be definitive and, consequently, a "hard" classifier is necessary.Furthermore, the classification process cannot be iterative and specific for a single image.Consequently, the generalization for different images, under different illumination conditions, can be obtained with a "supervised" classifier, which considers a "training" Region Of Interest (ROIs) associated with the theoretical "white" snow.Looking at the supervised methods, we can include classifiers that are sensitive to the user experience during the definition of the region of interests and to the selection of discriminant parameters between snow and not snow.Some methods are associated with the threshold selection defined by the statistics of the identified ROIs.This is the case, for example, of the Parallelepiped classifier (PA), where the user defines a threshold based on the standard deviation.Some other approaches consider the probability associated with a specific ROI [25], calculating the Euclidean distance for the Minimum Distance (MD) method, the Mahalanobis distance for the Mahalanobis classifier (MA), and the covariance-based discriminant function for the Maximum Likelihood method (ML).These algorithms are all implemented in the commercial suite ENVI version 4.7 (Exelis Visual Information Solutions, Boulder, Colorado).

Blue Thresholding
Within the group of automated methods, there is a well-established method that is currently in use for snow-cover purposes with some limitations: it is a linear classifier based on thresholding of the blue channel (BT) that was introduced by [26] in the Snow-noSnow software.The method is based on the frequency counting of the blue component, and its hardness is associated with the definition of snow-not-snow limit looking at increments in the blue-channel histogram.This method has been used in several studies and it has shown some limitations.The illuminating conditions, the surface roughness and the distance from the camera are critical issues that affect the performance on retrieving snowed covers [27].These limitations are the grounds of research for a higher performing method that possibly increases the depth of analyzing RGB imagery.

Spectral Similarity
The approach proposed in this paper is based on measuring the spectral variations in a 3D color space where reference endmembers are a theoretical "white" snow and a theoretical "black" target.The parameters estimated in this vector system are the spectral angle defined by [28] and the Euclidean distance [21], respectively calculated considering white and black references.While the parameter based on the Spectral Similarity (SS) represents an independent spectral feature, the Euclidean distance of the vector can be defined as a brightness-dependent feature.The involvement of all the three-color components will support the increase of surface types that can be discriminated: snow, shadowed snow and not snow.The proposed approach (Figure 1) was developed in the R programming environment [29].
The first step consists of rearranging the three-color components of each pixel into a new two-dimensional vector space, mathematically defined as follows: The spectral angle θ in Equation (3) represents the relative proportion of the three-pixel components (P R , P G and P B ) in relationship to the reference composition (R R = 1, R G = 1 and R B = 1).The angle varies from zero, which can be associated with a "flat" behavior of colors (R = G = B), to π 2 , referring to a very dissimilar behavior from the theoretical "white" reference.
The spectral distance ∆ in Equation ( 4) is conversely an estimation of the vector length in the RGB space.It can range from 0 (black) to 1.73 (white) and it can be associated with the Euclidean distance from a "black" reference RGB composition (R R = 0, R G = 0 and R B = 0).While this parameter is sensitive to the brightness of colors, the spectral angle is invariant with brightness [23].The outcome of this step consists in the frequency counting of pixels considering the two spectral components with a 0.05 resolution.Furthermore, the total number of included pixels ( f tot ) and the area included in the cluster perimeter (P f ) were estimated.
The second step of the procedure consists in discriminating clusters from the obtained frequency distribution, and a watershed algorithm [30] can support this segmentation phase.Each cluster was fitted with a normal distribution in order to retrieve modes (defined by µ ∆ and µ θ ) and deviations (σ ∆ and σ θ ).If clusters are very close to each other, they can be combined in one larger group depending on their probability to be discriminated using the Mahalanobis distance.The criteria adopted for the definition of the cluster perimeter was based on the pixel frequency f (∆ , θ ) higher than the Poisson error of the adjacent pixel f (∆, θ) (Equation ( 5)).
The procedure for the delimitation of the cluster perimeter was implemented using a per-pixel method following [31].
The final step consists in the identification of the surface type (snow, not snow and shadowed snow).This step was defined observing the frequency distributions of pixels in the defined spectral space (Figure 3).It was possible to detect that snow covers were generally characterized by higher θ angles and lower ∆ values than not-snow covers.Snowed centroids (defined by µ ∆ and µ θ ) were generally positioned where angles were higher than 0.9 and distances were lower than 0.1.
Geosciences 2019, 9 FOR PEER REVIEW 6 definition of the cluster perimeter was based on the pixel frequency  ′, ′ higher than the Poisson error of the adjacent pixel  ,  (Equation ( 5)).
′, ′ >  , The procedure for the delimitation of the cluster perimeter was implemented using a per-pixel method following [31].
The final step consists in the identification of the surface type (snow, not snow and shadowed snow).This step was defined observing the frequency distributions of pixels in the defined spectral space (Figure 3).It was possible to detect that snow covers were generally characterized by higher  angles and lower  values than not-snow covers.Snowed centroids (defined by  and  ) were generally positioned where angles were higher than 0.9 and distances were lower than 0.1.Furthermore, the range of cluster values ( ,  ,  and  ) were characterized by short distance variations compared to angles in the case of snow-covered surfaces.From this point of view, clusters with limited perimeters ( < 0.04) and a high number of included pixels ( > 50 of the analyzed pixels) described surfaces with homogeneous reflective behavior, as expected for snowcovered surfaces.The second rule that can be considered includes clusters with limited perimeters ( < 0.04) and consistent number of included pixels (10 <  < 50 of the analyzed pixels).The optical behavior of those clusters must be coupled to their centroid position that must have low spectral angles ( < 0.5).These constraints describe, also in this case, clusters characterized by a homogeneous spectral behavior coherent with a snow-covered surface.The third rule that completes the classification procedure consisted on estimating the range of  between the defined clusters in the image and on defining a threshold ( ) that discriminates snow and other surface types.Two situations can occur for defining clusters above the threshold as snow-covered surface: one with multiple clusters (Equation 6) and one with a single polygon (Equation 7).

𝜇 > 𝑚𝑎𝑥𝛥
−  (6) Furthermore, the range of cluster values (∆ max , ∆ min , θ max and θ min ) were characterized by short distance variations compared to angles in the case of snow-covered surfaces.From this point of view, clusters with limited perimeters (P f < 0.04) and a high number of included pixels ( f tot > 50 of the analyzed pixels) described surfaces with homogeneous reflective behavior, as expected for snow-covered surfaces.The second rule that can be considered includes clusters with limited perimeters (P f < 0.04) and consistent number of included pixels (10 < f tot < 50 of the analyzed pixels).The optical behavior of those clusters must be coupled to their centroid position that must have low spectral angles (µ ∆ < 0.5).These constraints describe, also in this case, clusters characterized by a homogeneous spectral behavior coherent with a snow-covered surface.The third rule that completes the classification procedure consisted on estimating the range of ∆ between the defined clusters in the image and on defining a threshold (T ∆ ) that discriminates snow and other surface types.Two situations can occur for defining clusters above the threshold as snow-covered surface: one with multiple clusters (Equation ( 6)) and one with a single polygon (Equation ( 7)).
Once performed the classification, the amount of snow-covered surface was obtained adding the contribution of each cluster identified as snow covered.Furthermore, the quality of the final output was checked by the target area over the 10-year series of images.From this perspective, the ground control points were used to estimate eventually-induced shifting of the target view, and also to control the occurrence of adverse meteorological conditions (fog, clouds, intense raining/snowing) that could affect the image.Finally, the dataset was filtered from artifacts coupling this analysis to some basic tests about the file corruption and the image resolution.

Orthorectification
The orthorectification module was based on the geometrical correction of the perspective view.This step was implemented following [32].The available digital elevation model [33], with a 5 m spatial resolution and 1 m vertical resolution, provided about 300 topographic points that were projected on the camera view (Figure 1c).The effectiveness of the correction was estimated considering eight ground control points.

Satellite Snow Products
Several satellite products are available for the remote sensing of the cryosphere and for this study we considered products obtained by optical sensors, characterized by different spatial resolutions: high (below 100 m); intermediate (below 1 km); and low (higher than 1 km).The integration between those products and ground-based imagery will be tested, in order to improve the dataset concerning the snow cover over a decade.

Optical Remote Sensing with High Spatial Resolution
The available remotely sensed snow products with a higher spatial resolution (below 100 m) were limited to Landsat missions, considering the studied time range (2004-2013).The selected sensors included Landsat satellites from 5 to 8, taking some differences into account in terms of band spectral ranges.All these data are now processed and available in the Swiss Data Cube [34].The Landsat satellites are characterized by a spatial resolution of 30 m and a revisit time of 16 days.The considered data were geometrically and atmospherically corrected (Level 2A) using the Second Simulation of the Satellite Signal in the Solar Spectrum (6S) algorithm available in the Atmospheric and Radiometric Correction of Satellite Imagery (ARCSI) software [35].The final estimation of NDSI was possible considering Eq.1 and the first short-wave infrared band of Landsat sensors.The wavelength ranges are specific for each sensor and they correspond to 520-600 nm and 1550-1755 nm for missions 5 and 7, and 533-590 nm and 1566-1651 nm for mission 8 [36].

Optical Remote Sensing with Intermediate Spatial Resolution
The highest time resolution available for optical remote sensing at our latitudes is provided, in the framework of the Earth Observing System (EOS) flagship, by NASA's satellites Terra and Aqua.Both platforms are equipped with the Moderate Resolution Imaging Spectroradiometer (MODIS) and they provide the coverage of the Earth two times daily (Terra in the local morning and Aqua in the local afternoon).The instrument is characterized by 36 bands with a spatial resolution of 250 m in the visible range and 500 m in the short-wave infrared.The NASA's data chain provides the retrieval of NDSI at the ground, and we considered the MOD10A1_006 and the MYD10A1_006 products, for Terra and Aqua respectively, obtained using the National Snow and Ice Data Center services [11].The NDSI values, calculated using the MODIS bands 4 (545-565 nm) and 6 (1628-1652 nm) in Equation1, were obtained in absence of clouds 2314 times over 6556 overpasses within the studied period.

Optical Remote Sensing with Low Spatial Resolution
The daily estimation of the snow cover extent is being provided, over the considered period, by the European Space Agency as a component of the Data User Element.The GlobSnow Snow Extent (SE) product covers the Northern Hemisphere and it is going to be extended to the Sentinel missions.The GlobSnow SE processing system applies optical measurements in the visual in the thermal part of the electromagnetic spectrum acquired by the ERS-2 sensor ATSR-2 and the Envisat sensor AATSR.The first step of the data chain is based on a cloud-cover retrieval algorithm (SCDA) where clouds, as well as large water bodies (oceans, lakes and rivers) and glaciers, are masked out.This algorithm is based on the brightness -temperature difference between 11 and 3.7 µm and on a set of additional rules, useful for certain sky conditions.Furthermore, the snow cover information is retrieved for not-vegetated areas by the NLR algorithm [37] where the band 2 (670 nm) is considered.This step is based on a semi-empirical reflectance model, where reflectance from a target is expressed as a function of the snow fraction.The Fractional Snow Cover can then be derived from the observed reflectance based on the given reflectance constants and the transmissivity values.The product is provided daily with a spatial resolution of 1 km and the data are available using the GlobSnow service [13].

Statistical Analysis
The statistical analysis performed on the available datasets was carried out using state-of-the-art tools [38] that were implemented in the R-Project programming environment [29].

Results
Results will be presented separating the three objectives of the paper.The first part of the analysis will consider a small dataset where different supervised and automated classifiers will be compared.The second section will consider a ten-year dataset where about 8000 images will be processed using automated solutions.Finally, the FSC estimated by terrestrial photography will be compared to the output obtained by remotely sensed data.

Comparison between Supervised and Automated Classifiers
This first part of the analysis includes two steps: one dedicated to the orthorectification of the panoramic view observed by the webcam; the other focused on the image classification performed considering the color components associated with a RGB color space.The first process produced a weighting mask applying a geometrical correction and all the considered classification algorithms used this product successively.The classification step was operated on a small dataset of 30 images due to the user intervention required by the supervised methods (ML, MD, MA and PD) for the definition of snowed ROIs.This is a strong limitation for the analysis of long time series, and it outlines the need of automated solutions since BT and SS algorithms, for example, did not require any user decision.The results obtained by the BT method and the SS algorithm were preliminarily analyzed considering the confusion matrix of each image and estimating the average overall accuracy as reported in Table 1.
Considering only two classes of cover (snow and not snow pixels), the comparison between automated and supervised classifiers showed in general a good agreement with an overall accuracy higher than 90%.Furthermore, SS showed a better performance compared to BT with an increased average accuracy of about 1-2% in terms of pixel number.While BT reached the full agreement with the supervised methods in 10% of images, SS matched the classifications obtained by the traditional approaches in more than 30% of images.The goodness of the automated algorithms is confirmed by the Cohen's kappa coefficient, which increases from 0.89 for BT to 0.93 for SS.Both averages indicated very good agreements between supervised and automated solutions but they confirmed the increased performance of the algorithm based on Spectral Similarity.Although these differences may seem limited, the contribution of 2000-5000 pixels (in a masked part of the camera image of 250,000 pixels) in terms of surface can be important, depending on the distance of each pixel.The projection of each pixel on the surface could increase consistently from closer to faraway pixels.From this perspective, the impact of omissions and false discoveries on the projected area could be higher than the overall accuracy in terms of pixels and it should be analyzed case by case.

Comparison between Automated Classifiers
The comparison between the estimated snow-covered areas obtained by the two automated algorithms (Figure 4a and Figure S1 for one example) confirmed the trend on underestimating the snow extent by BT compared to SS (see Table S1 for the raw data).The FSC estimated by the two methods differed slightly (the non-parametric Kruskal-Wallis chi-squared test indicated a non-significant statistical difference) and the Root Mean Squared Error (RMSE) was about 7.4%.The relation between the two FSC estimations showed a good correlation (R 2 close to 0.95) and the slope of the regression was 0.91 with an intercept of 11.5%.
Geosciences 2019, 9 FOR PEER REVIEW 9 seem limited, the contribution of 2000-5000 pixels (in a masked part of the camera image of 250,000 pixels) in terms of surface can be important, depending on the distance of each pixel.The projection of each pixel on the surface could increase consistently from closer to faraway pixels.From this perspective, the impact of omissions and false discoveries on the projected area could be higher than the overall accuracy in terms of pixels and it should be analyzed case by case.

Comparison between Automated Classifiers
The comparison between the estimated snow-covered areas obtained by the two automated algorithms (Figure 4a and Figure S1 for one example) confirmed the trend on underestimating the snow extent by BT compared to SS (see Table S1 for the raw data).The FSC estimated by the two methods differed slightly (the non-parametric Kruskal-Wallis chi-squared test indicated a nonsignificant statistical difference) and the Root Mean Squared Error (RMSE) was about 7.4%.The relation between the two FSC estimations showed a good correlation (R 2 close to 0.95) and the slope of the regression was 0.91 with an intercept of 11.5%.
Although BT and SS estimations were almost consistent considering only the small dataset, the complete dataset highlighted an improved performance of SS (Figure 4b).The Kruskal-Wallis test indicated differences with a significance level higher than 99% and the RMSE was about 12%.The relation between the two FSC estimations showed a limited correlation (R 2 close to 0.87) compared to the small dataset, and the slope of the regression was 0.87 with an intercept of 14.5%.The detection of snow-covered areas using SS was generally higher than that obtained by BT and in few occurrences, it was completely missed by BT (see Table S2 for the raw data).The points closer to the left axis were, in fact, situations where light conditions (low sun elevation or intense cloud cover) affected the BT output.Those illumination conditions were important also in additional cases, where BT underestimated the snow-covered area compared to SS.

Comparison between FSC Estimations Obtained by Terrestrial Photography and Remote Sensing
The comparison between satellite products and terrestrial photography retrievals was focused on evaluating the relationship associated with the two data sources (see Table S3 for the raw data).We considered remotely sensed data with different spatial resolutions and data chains.The Landsat images available in the considered time range was 189, but 55 images were discarded due to the intense cloud coverage.The MODIS values were obtained in absence of clouds 2314 times over 6556 Although BT and SS estimations were almost consistent considering only the small dataset, the complete dataset highlighted an improved performance of SS (Figure 4b).The Kruskal-Wallis test indicated differences with a significance level higher than 99% and the RMSE was about 12%.The relation between the two FSC estimations showed a limited correlation (R 2 close to 0.87) compared to the small dataset, and the slope of the regression was 0.87 with an intercept of 14.5%.The detection of snow-covered areas using SS was generally higher than that obtained by BT and in few occurrences, it was completely missed by BT (see Table S2 for the raw data).The points closer to the left axis were, in fact, situations where light conditions (low sun elevation or intense cloud cover) affected the BT output.Those illumination conditions were important also in additional cases, where BT underestimated the snow-covered area compared to SS.

Comparison between FSC Estimations Obtained by Terrestrial Photography and Remote Sensing
The comparison between satellite products and terrestrial photography retrievals was focused on evaluating the relationship associated with the two data sources (see Table S3 for the raw data).We considered remotely sensed data with different spatial resolutions and data chains.The Landsat images available in the considered time range was 189, but 55 images were discarded due to the intense cloud coverage.The MODIS values were obtained in absence of clouds 2314 times over 6556 overpasses within the studied period.Finally, 289 GlobSnow data points were available during the considered period.While Landsat and MODIS data were converted in FSC considering the state-of-the-art relation described by [8], the GlobSnow product is ready-to-be-used considering the ground-truth support of the calibration sites identified in the images.
The Landsat sensors provided 24 observations (Figure 5a) and 10 were characterized by NDSI higher than 0.6, indicating the total coverage of snow in pixels.While two observations showed coherent NDSI values with the camera estimates (when snow cover was absent, the NDSI was negative), intermediate values were 3 times slightly above the expected results estimated using Equation (2) and 9 times consistently higher (more than 30% of overestimation).Whereas illumination differences can be related to the definition of a possible site-specific relation, heavy differences occurred when a partial shadow of clouds on the ground was present during the satellite revisit.The non-parametric Kruskal-Wallis chi-squared test indicated differences with a significance level of 80%, the RMSE was about 21% and the correlation coefficient was 0.59.
The MODIS sensors provided 430 observations (Figure 5b) and 205 were characterized by NDSI higher than 0.6, indicating the total coverage of snow in terms of pixels.The intermediate values were, also in this case, generally above the expected results.A first group of 26 observations showed camera FSC higher than expected NDSI-derived values with a difference higher than 30%; 33 observations were up to 30% higher; and 15 times MODIS products didn't detect any snow cover while the camera measured FSC ranging between 10-60%.All of these situations occurred when the cloud screening missed to identify partial cloud shadows on the ground while the satellite was overpassing.This comparison, in addition to Landsat indications, showed negative estimations in eight cases.These estimations (more than 20%) were artifacts associated with wrong cloud masking (there was no snow on the ground and it was full of clouds in the sky).The non-parametric Kruskal-Wallis chi-squared test indicated differences with a significance level of 99%, the RMSE was about 14% and the correlation coefficient was 0.91.
Finally, the GlobSnow SE product provided 62 observations (Figure 5c) and the estimated output was coherent 57 times (with full snow coverage at the ground), whereas the GlobSnow product missed to detect the snow cover 5 times, compared to the camera observations.The non-parametric Kruskal-Wallis chi-squared test indicated differences with a significance level of 99%, the RMSE was about 18% and the correlation coefficient was 0.84.From a statistical point of view, all the satellite products showed significant differences compared to the camera-based estimations even if the correlation was good.This observation is influenced, of course, by the number of outliers included in the available dataset composed by the different satellite revisits, which depends mostly on cloud screening.negative), intermediate values were 3 times slightly above the expected results estimated using Equation (2) and 9 times consistently higher (more than 30% of overestimation).Whereas illumination differences can be related to the definition of a possible site-specific relation, heavy differences occurred when a partial shadow of clouds on the ground was present during the satellite revisit.The non-parametric Kruskal-Wallis chi-squared test indicated differences with a significance level of 80%, the RMSE was about 21% and the correlation coefficient was 0.59.The MODIS sensors provided 430 observations (Figure 5b) and 205 were characterized by NDSI higher than 0.6, indicating the total coverage of snow in terms of pixels.The intermediate values

Discussion
The first part of the results evidenced that automated solutions provide FSC estimations compatible to the supervised solutions available in literature.The major advantage of automated methods consists in the reduction of time consumption and, consequently, in the opportunity of processing long time series of terrestrial images.We described an automated approach based on the concept of Spectral Similarity [23], which could prevent artifacts under particular illuminating conditions.While a small training dataset supported the training of an SS-based algorithm, the ten-year dataset, with about 8000 images, showed a better performance compared to a state-of-the-art automated method BT described by [26].The trend of FSC underestimation (about 10%) outlined by the small training dataset was confirmed by the large decadal comparison.The observed statistically significant differences were limited in terms of pixel number (less than 1%), but these discrepancies were important in terms of surface.The projection of each pixel on the surface could increase consistently from closer to faraway pixels and from this perspective; the impact of omissions and false discoveries on the projected area could be high.Furthermore, the ability to analyze the "difficult" conditions affecting the BT performance [10] was confirmed by statistically significant differences detected between the two data series.The limitations of BT retrievals can be associated with poor illuminating conditions (low Sun elevation or heavy cloud coverage) and surface roughness.While low Sun elevation can occur in the early morning or in the late afternoon, surface roughness and cloud coverage are not time dependent.Furthermore, while the illuminating conditions can alter the reflective behavior of snow in response to a more blueish incident light, the roughness can imply the presence of shadowed surfaces that BT cannot discriminate compared to SS.While BT tends to separate shadowed and illuminated areas, SS can be trained to integrate both types since the spectral angle is similar and its only variation is the spectral distance.While BT can generally provide good results between 11:00 am and 3:00 pm local time, SS can enlarge the range of performing conditions in terms of both Sun elevation and cloud cover.These preliminary results concerning the SS approach represent a first step towards the development of a machine learning strategy aimed to analyze routinely ground-based images.Artifacts associated with purely-BT classification [19,26,28], which are well documented in literature [19,27,28], were reduced and the need to consider all the information present in a RGB composite image [27] was followed.Differently from [27], which combined principal component analysis to BT, SS is independent from BT and considers all the bands at the beginning of the classification step obtaining a discrimination between surface types based directly on the spectral behavior of each classified feature.Furthermore, SS considers the color variations induced by illumination conditions and the probability to separate different surface types is associated with statistical measurements such as the Mahalanobis distance.
Finally, the FSC estimated by terrestrial photography and satellite products evidenced different aspects to be considered: the spatial resolution and the cloud screening.The cloud screening is a critical step present in all of the data chain considered in this study.Our data demonstrated, in fact, that a large number of satellite omissions were associated with a wrong detection of clouds.In addition to those exclusions, different situations evidenced an underestimation of FSC affected by the presence of cloud shadows that reduce the reflection of light from the surface.Although the different data chains [6,8,13] of course, consider these anomalies, the contribution of terrestrial photography, in this case, could support for the validation of remotely sensed retrievals.Moving to the spatial resolution, we considered data ranging from a 30 m resolution (Landsat), to 500 m (MODIS), to 1 km (GlobSnow SE) in order to test different data chains with different spatial and time resolutions.The spatial resolution had, of course, an impact and we found a more reliable relation with Landsat data than with those characterized by a coarser resolution.While the projected area of the camera view is five times the surface covered by a single Landsat pixel, it represents the 2% of a MODIS pixel the 0.5% of a single GlobSnow grid element.This implies that the surface morphology can affect the final estimates due to the presence of hills and small valleys.
This framework outlines the potentiality of facilities where different satellite snow products can have a common term of comparison such as terrestrial cameras.Ground-based images represent a good proxy, useful for validating the coherence between different products.On the one hand, this data-source can support the reconstruction of long time series useful for climate change studies.On the other one, this kind of proxy can assist the definition of site-specific relation between FSC and the optical behavior of the surface.

Conclusions
The contribution of terrestrial photography for the definition of the relation between the Fractional Snow Cover and the spectral behavior of the surface is a major issue.Ground-based cameras represent a valuable proxy of data useful for investigating the snow cover extension over a long period.From this perspective, terrestrial photography can be used as ancillary information and it supports the integration among different multispectral remotely sensed datasets.The availability of an automated procedure useful for the discrimination between snow and not-snow covered surfaces can support the analysis of large datasets.The selected approach based on Spectral Similarity was compared with supervised methods and with the Blue Thresholding procedure on a training dataset.Considering the supervised methods as a reference, the Spectral Similarity approach showed better performance

Figure 1 .
Figure 1.Description of the workflow followed in the manuscript.While the green boxes represent the considered data sources, the other colored boxes constitute the final products obtained by the different procedures required for the estimation of the Fractional Snow Cover.

Figure 1 .
Figure 1.Description of the workflow followed in the manuscript.While the green boxes represent the considered data sources, the other colored boxes constitute the final products obtained by the different procedures required for the estimation of the Fractional Snow Cover.

Figure 2 .
Figure 2. The study site of Cima Pradazzo (a), close to Falcade (Italy).Panoramic view of the camera (b) with the considered mask in red.The orthorectified views of the camera (c): the grey shaded area with red contour shows the camera view projected on the ground; and the colored lines indicate the pixel grids of the different satellite products.

Figure 2 .
Figure 2. The study site of Cima Pradazzo (a), close to Falcade (Italy).Panoramic view of the camera (b) with the considered mask in red.The orthorectified views of the camera (c): the grey shaded area with red contour shows the camera view projected on the ground; and the colored lines indicate the pixel grids of the different satellite products.

Figure 3 .
Figure 3. Examples of two different snow-not-snow mixtures.Colored polygons identified areas of clusters in presence of two different situations: partial (a) and full (b) snow cover.Lower plots are frequency distribution of pixels at the different spectral angles () and spectral distances ().

Figure 3 .
Figure 3. Examples of two different snow-not-snow mixtures.Colored polygons identified areas of clusters in presence of two different situations: partial (a) and full (b) snow cover.Lower plots are frequency distribution of pixels at the different spectral angles (θ) and spectral distances (∆).

Figure 4 .
Figure 4. Performance of Blue Thresholding (BT) algorithm versus the Spectral Similarity (SS) method considering only the test dataset (a).Comparison between the two methods considering the complete dataset (b).

Figure 4 .
Figure 4. Performance of Blue Thresholding (BT) algorithm versus the Spectral Similarity (SS) method considering only the test dataset (a).Comparison between the two methods considering the complete dataset (b).

Figure 5 .
Figure 5.Comparison between Fractional Snow Cover estimations obtained by terrestrial photography and remote sensing.Plots refer to Landsat (a), MODIS (b) and GlobSnow (c).

Figure 5 .
Figure 5.Comparison between Fractional Snow Cover estimations obtained by terrestrial photography and remote sensing.Plots refer to Landsat (a), MODIS (b) and GlobSnow (c).