Using the MODIS Sensor for Snow Cover Modeling and the Assessment of Drought E ﬀ ects on Snow Cover in a Mountainous Area

: Snow is one of the essential factors in hydrology, freshwater resources, irrigation, travel, pastimes, ﬂoods, avalanches, and vegetation. In this study, the snow cover of the northern and southern slopes of Alborz Mountains in Iran was investigated by considering two issues: (1) Estimating the snow cover area and the (2) e ﬀ ects of droughts on snow cover. The snow cover data were monitored by images obtained from the Moderate Resolution Imaging Spectroradiometer (MODIS) sensor. The meteorological data (including the precipitation, minimum and maximum temperature, global solar radiation, relative humidity, and wind velocity) were prepared by a combination of National Centers for Environmental Prediction-Climate Forecast System Reanalysis (NCEP-CFSR) points and meteorological stations. The data scale was monthly and belonged to the 2000–2014 period. In the ﬁrst part of the study, snow cover estimation was conducted by Multiple Linear Regression (MLR), Least Square Support Vector Machine (LSSVM), Group Method of Data Handling (GMDH), Multilayer Perceptron (MLP), and MLP with Grey Wolf Optimization (MLP-GWO) models. The most accurate estimations were produced by the MLP-GWO and GMDH models. The models produced better snow cover estimations for the northern slope compared to the southern slope. The GWO improved the MLP’s accuracy by 10.7%. In the second part, seven drought indices, including the Palmer Drought Severity Index (PDSI), Bahlme–Mooley Drought Index (BMDI), Standardized Precipitation Index (SPI), Multivariate Standardized Precipitation Index (MSPI), Modiﬁed Standardized Precipitation Index (SPImod), Joint Deﬁcit Index (JDI), and Standardized Precipitation-Evapotranspiration Index (SPEI) were calculated for both slopes. The results showed that the e ﬀ ects of a drought event on the snow cover area would remain up to 5 (or 6) months in the region. The highest impact of drought appears after two months in the snow cover area, and the drought index most related to snow cover variations is the 2–month time window of SPI (SPI 2 ). The results of both subjects were promising and the methods can be examined in other snowy areas of the world.


Snow Cover Data
To monitor snow cover, MODIS model and Digital Elevation Model (DEM) images were used. To that end, the MOD10A1 and MOD10A2 of MODIS data were first obtained from the NASA National Snow and Ice Data Center (NSIDC) database with a spatial resolution of 500 × 500 m in HDF format. Heavy snowfall in central Alborz usually occurs from early November to May and the snow cover persists until June. For this area, images were available for the 15

Snow Cover Data
To monitor snow cover, MODIS model and Digital Elevation Model (DEM) images were used. To that end, the MOD10A1 and MOD10A2 of MODIS data were first obtained from the NASA National Snow and Ice Data Center (NSIDC) database with a spatial resolution of 500 × 500 m in HDF format. Heavy snowfall in central Alborz usually occurs from early November to May and the snow cover persists until June. For this area, images were available for the 15 years from 2000 to 2014. To process the images, first, a preprocessing operation including geometric, radiometric, and atmospheric corrections was applied to them in ENVI 5.3 software. This operation included calibrating images, converting image coordinates to real terrain coordinates (WGS84 UTM 39N), and removing clouds from images. The following provides a description of the MODIS sensor's usage applied in the current study to monitor the snow cover area.
One of the five sensors on the Terra satellite is the MODIS sensor, which was launched on December 18, 1999. This sensor's coverage significantly increased with the launch of the Aqua satellite on May 4, 2002. The MODIS sensor has a spatial resolution (250, 500, and 1000 m), 12-bit radiometric resolution, high temporal resolution (1 to 2 days), and 36-band spectral separation of bands 20 to 23, excluding radiation bands and bands 1 to 7, which are reflective. It covers wavelengths from 0.4 microns (visible light) to 14.4 microns (thermal infrared). The MODIS sensor provides snow cover detection maps using the Normalized Difference Snow Index (NDSI) on a macro-scale and provides a rapid application in regional studies [13]. NDSI is a spectral measure of the relative magnitude of the characteristic reflectance difference between the visible and short-wave infrared reflectance of snow in the MODIS sensor, employed to detect changes in snow cover using the Normalized Difference Vegetation Index (NDVI) algorithm [14]. NDSI uses snow spectral reflectance advantages that exist in the visible band with a high reflectance and infrared spectrum with a low reflectance, and NDSI as an algorithm for detecting snow from clouds and snow-covered areas with a set of thresholds applied and calculated pixel to pixel [14]. In the first method, the dual reflectance criteria (i.e., the pixel reflectance values in band 6 > 11% and the pixel reflectance values in band 4 ≥ 10%) are applied if NDSI > 0.4, based on Equation (1). In the next step, three conditional tests are performed to extract the NDSI values, according to Equation (1). Similar to many spectral rationing methods, this index reduces atmospheric effects [15]. Applying the NDSI will result in the formation of pixels with a value of −1 to +1, for which values ranging from −1 to 0 indicate areas where there is no snow and those from 0 to +1 indicate areas where snow has positive coefficients due to its lightness and heaviness (depending on its depth). When the snow depth is higher, the number is closer to one, and for a lower depth of snow, the number tends toward zero. The MODIS snow map algorithm from bands 4 and 6 of this sensor is automatically implemented to extract the Normalized Difference Snow Index (NDSI) and is calculated based on Equation (1) [16].

Meteorological Data
In this study, the combined use of the National Centers for Environmental Prediction-Climate Forecast System Reanalysis (NCEP-CFSR) database and observations due to insufficient meteorological stations located at altitude (from the Iranian Meteorological Organization) is provided. The NCEP-CFSR points and the ground stations are specified in Figure 1. The meteorological variables used include the minimum temperature (T min ), maximum temperature (T max ), global solar radiation (GSR), relative Remote Sens. 2020, 12, 3437 5 of 24 humidity (RH), precipitation (P), and wind velocity (W), and their units are • C, • C, MJ m 2 ·day , %, mm, and m s , respectively. These meteorological data, prepared on monthly scales, belong to the period of 1979-2014 (36 years). These datasets were integrated to be used as inputs for snow cover estimation. As there was one data series for snow cover for each of the northern and southern slopes, the meteorological data had to be integrated into one equivalent series for the estimation and investigation. Therefore, the Thiessen polygon method was used to merge the data into one series, for each of the two slopes of Alborz mountain (according to Guhathakurta et al.'s [17] study). The statistical characteristics of Table 1 refer to the equivalent series. In this investigation, the drought indices were calculated by precipitation and temperature, recorded during the 1979-2014 period. However, according to the limitation of the snow cover data (because of the nonexistence of snow cover images before 2000), investigating the relationship between drought indices and snow cover was conducted for the period of 2000-2014. Furthermore, the estimation of snow cover was implemented for the 2000 to 2014 period. All of the variables, including the snow cover area and meteorological variables, were provided at a monthly scale for all 12 months of the years under study (Table 1 relates to the monthly scale data).

Drought Indices
Drought indices are used to monitor drought events. Droughts usually begin by precipitation deficit and affect other water resources, appearing as different types, such as meteorological droughts, agricultural droughts, etc. Different types of droughts can be monitored by special individual indices. For example, the Bhalme and Mooley Drought Index (BMDI) is a meteorological drought index and the Palmer Drought Severity Index (PDSI) is an agricultural drought index. On the other hand, there is the Standardized Precipitation Index (SPI), which refers to different types of drought in its different time windows [11]. There are also multivariate drought indices, such as the Joint Deficit Index (JDI) and Multivariate SPI (MSPI), which can monitor different types of drought simultaneously [12,18].
In the current research, seven different drought indices were employed to investigate the effect of drought on snow cover. Table 2 shows the drought indices used. Additionally, references for their calculation methods and supplementary information are provided. Multiple linear regression is a linear model implemented between one or more independent variables (input variables) and one dependent variable (target variable). This model is based on the regression coefficients for input variables, which are optimized by the least squares algorithm. In the current research, the inputs were the meteorological variables and the target was the snow cover area. The mathematical form of this model is as follows: where y is the target (dependent) variable, a is the model's constant coefficient or intercept of the regression line, b i represents the regression coefficients, x represents the input (independent) variables, and e is the model's error. Originating from support vector machines (SVMs), LSSVM is a powerful approach for solving nonlinear classification problems, function estimation, and regression. The SVM, developed by Cortes and Vapnik [26], is based on the principle of construction risk minimization (SRM), which reduces the overestimation of the model [27]. The parameters of this model are γ, representing the adjustment constant, and σ2, representing the radial basis function (RBF) kernel width optimized by the least squares algorithm in this study [6].

Group Method of Data Handling (GMDH)
Ivakhnenko presented GMDH in 1970 at the Ukrainian Institute of Technology. This is a nonlinear multivariate statistical analysis method that can be used to determine the complex structure of input-output relationships in complex systems [28]. GMDH is based on the separation of secondor third-order nested polynomials to simulate regression and classification topics [28]. GMDH is a multilayer network in which each layer is composed of multiple nodes. The results from each layer are transferred to the next layer; in fact, the results of one layer are used as inputs for the next layer [29,30].

Multilayer Perceptron (MLP)
The concept of the perceptron was first introduced by McCulloch and Pitts [31] as an artificial neuron. An MLP network provides a nonlinear relationship between the input and output vectors. This is accomplished by connecting neurons from one layer to another one (previous or next layer). The output of each neuron is multiplied by weight coefficients and given as the input for a nonlinear excitation function [32]. In the training phase, the training data are given to the perceptron, and the grid weights are then adjusted to minimize the error between the target and output of the model, or to reach the default number of training times. Then, like all modeling processes, different inputs (not present in the training phase) are used for model validation. The training of neural networks is generally very complex and can be stated to be an optimization problem with a large number of variables [33].

Multilayer Perceptron-Grey Wolf Optimization (MLP-GWO)
This model results from the integration of the MLP model and the GWO algorithm. The GWO algorithm is a population-based meta-heuristic algorithm that is inspired by the behavior of gray wolves during hunting [34]. This algorithm consists of three main steps: (1) Tracking and approaching; (2) pursing and encircling; and (3) attacking. There are four types of wolf in this algorithm. The alpha wolf is the main driver of the algorithm, the beta and delta wolves act as assistants to the alpha wolf, and the omega wolf follows both the alpha and beta wolves. Gray wolves hunt their prey through surrounding them. At the hunting stage, the gray wolf attacks the surrounded prey. The alpha wolf follows the hunting process, and the beta and delta wolves sometimes participate. For mathematically simulating wolf hunting behavior, it is usually assumed that alpha, beta, and delta wolves have better information on the location of prey [35]. Gray wolves attack and hunt when the target stops moving (see [34] for more information on this optimization algorithm). A schematic diagram of the models used to estimate the snow cover area is shown in Figure 2. Remote Sens. 2020, 12, x FOR PEER REVIEW 9 of 24

Model Performance Criteria
Several statistical criteria are used in each study to evaluate the performance of the models. These criteria for current research include the Root Mean Square Error (RMSE), the Normalized Root Mean Square Error (NRMSE), Nash-Sutcliff (NS), and the Willmott Index (WI). The mathematical equations of these indices are as follows: In these equations, Oi represents the observational data, Ei represents the estimated data, Ō represents the average observational data, Omax represents the maximum observational data, and Omin represents the minimum observational data. The closer values of WI and NS are to 1, and RMSE and NRMSE are to 0, the better the performance of the models.

Investigating the Relationship between Changes in the Snow Cover Area and Drought Indices
The seven drought indices of PDSI, BMDI, SPI, MSPI, SPImod, JDI, and SPEI were initially calculated. Then, the relationship between these indices and the snow cover area for both the northern and southern slopes of the Alborz Mountains was evaluated by the Spearman correlation test. The results of the Spearman test are presented in Tables 3 and 4.

Model Performance Criteria
Several statistical criteria are used in each study to evaluate the performance of the models. These criteria for current research include the Root Mean Square Error (RMSE), the Normalized Root Mean Square Error (NRMSE), Nash-Sutcliff (NS), and the Willmott Index (WI). The mathematical equations of these indices are as follows: In these equations, O i represents the observational data, E i represents the estimated data, O represents the average observational data, O max represents the maximum observational data, and O min represents the minimum observational data. The closer values of WI and NS are to 1, and RMSE and NRMSE are to 0, the better the performance of the models.

Investigating the Relationship between Changes in the Snow Cover Area and Drought Indices
The seven drought indices of PDSI, BMDI, SPI, MSPI, SPImod, JDI, and SPEI were initially calculated. Then, the relationship between these indices and the snow cover area for both the northern and southern slopes of the Alborz Mountains was evaluated by the Spearman correlation test. The results of the Spearman test are presented in Tables 3 and 4. Here, 24 time windows of the indices SPI, SPEI, and SPImod were calculated for each slope. The results indicate a significant relationship (p = 0.01) between the snow cover area and SPI for both the northern and southern slopes of Alborz and some SPEI time windows for the southern slope. The highest correlation coefficients were observed in SPI 1-, 2-, 3-, 4-, 5-, and 6-month time windows and accordingly, MSPI was calculated in time windows of 1-3 months (MSPI 1-3 ), 1-6 months (MSPI 1-6 ), 1-9 months (MSPI 1-9 ), and 1-12 months (MSPI 1-12 ). The five main time windows mentioned in the introduction of MSPI [23] were also calculated and investigated (including MSPI 3-6 , MSPI 6-12 , MSPI 3-12 , MSPI [12][13][14][15][16][17][18][19][20][21][22][23][24] , and MSPI  ). The drought indices, including SPImod (in all time windows), MSPI (in all time windows), JDI, BMDI, and PDSI, were not significantly related to snow cover variations. The highest correlation coefficients for the northern slope were 0.723 and 0.720, which were reported for SPI 2 and SPI 3 , respectively. For the southern slope, the highest correlation coefficients were 0.617 and 0.638, which respectively belong to SPI 1 and SPI 2 . The most important reasons for this are the higher humidity and cloudiness and consequently lower evaporation rates of the northern slopes, and conversely, the higher received radiation rates for the southern slopes. This causes the snow to melt later on the northern slopes. Therefore, the effects of snow cover on droughts will remain longer and consequently, larger time windows of SPI can be related to the snow cover area for the northern slopes. As a result, SPI 2 can be identified as the best time window of SPI and the drought variable (among these indices) most related to the snow cover variations in this region. In general, it can be reported that the drought status will be most effective during the increasing and decreasing of the snow cover area with a two-month time lag, remaining for up to five (or six) months.

Input Selection
Prior to modeling, the correlation matrix was used at the input selection stage. The correlation results of the variables are shown in Figure 3.
There is a rational and strong negative correlation between the maximum and minimum temperatures, as well as between radiation and snow cover areas, for both slopes. Moreover, in terms of the positive correlations, the relative humidity has a strong correlation with the snow cover and precipitation has a weaker positive correlation coefficient, but both of them are significant at the 0.01 level. The variable wind speed had the weakest correlation for both slopes, being significant at the 0.05 level for the northern slope and insignificant for the southern slope. The opposite direction of the wind speed's correlation for these two slopes can be related to the influential synoptic systems. The northern slope of Alborz Mountains is affected by the fronts from the Mediterranean Sea, Black Sea, and Caspian Sea. The absolute majority of rainfall due to these systems falls at the northern slope and a much smaller amount falls in the southern zones, which causes a big difference between these two areas' climates (north of these mountains, there are humid and super-humid climates and, to the south, there is an arid climate). The wind speed close to the earth's surface, especially at altitude and in mountainous regions, is significantly related to the atmospheric fronts' speed. Therefore, when the wind speed increases, a front transfers the moisture to altitude and causes snowfall, so the snow cover area will increase. However, the front loses its humidity and after descending the altitudes in the southern part, it usually produces chinook wind. This phenomenon causes snowmelt and decreases the snow cover area, so there will be a negative relation between snow cover and wind speed in this region.
In these datasets, the precipitation refers to rainfall data. Also at high altitudes, the snowfall events are transformed into water equivalents of snow and count as rainfall. It is obvious that snowfall can increase the snow cover and causes a positive correlation with the snow cover area. In contrast, warm precipitation causes a negative correlation. As we can see for the correlations, there are positive correlations (+0.333 for the southern slope and +0.479 for the northern slope) between the snow cover area and precipitation. To extend this issue, it can be stated that, at lower altitudes, warmer seasons have warm precipitation, which melts the snow and causes a negative correlation with the snow cover area, and colder seasons have both cold and warm precipitation. However, almost all of the precipitation at high altitudes occurs in the form of snowfall, which increases the snowy area and has a positive correlation with the snow cover area. Since the snow cover area exists at higher altitudes, cold precipitation and snowfall will be more effective in terms of the snow cover. Therefore, in the general form of this region, precipitation will be positively correlated with the snow cover area.
conversely, the higher received radiation rates for the southern slopes. This causes the snow to melt later on the northern slopes. Therefore, the effects of snow cover on droughts will remain longer and consequently, larger time windows of SPI can be related to the snow cover area for the northern slopes. As a result, SPI2 can be identified as the best time window of SPI and the drought variable (among these indices) most related to the snow cover variations in this region. In general, it can be reported that the drought status will be most effective during the increasing and decreasing of the snow cover area with a two-month time lag, remaining for up to five (or six) months.

Input Selection
Prior to modeling, the correlation matrix was used at the input selection stage. The correlation results of the variables are shown in Figure 3. There is a rational and strong negative correlation between the maximum and minimum temperatures, as well as between radiation and snow cover areas, for both slopes. Moreover, in terms of the positive correlations, the relative humidity has a strong correlation with the snow cover and precipitation has a weaker positive correlation coefficient, but both of them are significant at the 0.01 level. The variable wind speed had the weakest correlation for both slopes, being significant at the 0.05 level for the northern slope and insignificant for the southern slope. The opposite direction of the wind speed's correlation for these two slopes can be related to the influential synoptic systems. The northern slope of Alborz Mountains is affected by the fronts from the Mediterranean Sea, Black Sea, and Caspian Sea. The absolute majority of rainfall due to these systems falls at the northern slope and a much smaller amount falls in the southern zones, which causes a big difference between these The correlation of snow cover with meteorological variables is somewhat stronger for the northern slope, because the standard deviation, skewness, and coefficient of variation of this phenomenon (snow cover area) are lower for the northern slope, so it is somewhat closer to exhibiting a normal distribution. This phenomenon can lead to less variability and, consequently, greater consistency with other meteorological variables. To select the inputs, the meteorological variables for each slope were sorted separately by the correlation coefficients, and the input scenarios were selected as given in Table 5.
Therefore, input scenarios, inputs (meteorological variables), and the target (snow cover area) were arranged, and in each scenario, 75% of the number of months was selected for training and 25% for the test. Then, five methods, including MLR, LSSVM, GMDH, MLP, and MLP-GWO, were used for estimating the snow cover.

Models' Performances
This section only covers the test part, and the results are listed in Table 6. It can be observed that the MLR model is the weakest model in most cases. This indicates that the nonlinear relationships between the meteorological variables and the snow cover area are stronger than the linear relationships. The best performance in the northern and southern slopes belongs to the MLR6 and MLR3 scenarios, respectively. The equations derived from MLR models are given below: These equations are applicable for these two slopes for estimating the snow cover if no snow cover data are available. Equation (8) is the best MLR model for southern slopes, as it has the least error and can also be used with just three variables, including T min , T max , and RH.
Subsequently, the SVM and MLP artificial intelligence models performed relatively well in terms of the accuracy. Both of these models displayed their best performance with scenario 6 and the model performance can be considered very desirable, with NS > 0.75 and NRMSE < 0.10. However, the NRMSE value for these two models is more than 0.10 for the southern slope.
Among the AI-based models, the GMDH and/or MLP-GWO models produced the best results in all scenarios. GMDH provided its best estimation with the fourth input scenario (GMDH4), exhibiting results of RMSE = 816.011 km 2 , NRMSE = 0.089, NS = 0.918, and WI = 0.979 for the northern slope, and with the sixth input scenario (GMDH6), displaying results of RMSE = 1111.549 km 2 , NRMSE = 0.087, NS = 0.877, and WI = 0.965, for the southern slope. The most accurate performance of MLP-GWO for the northern slope resulted from the MLP-GWO6 input scenario, with RMSE = 754.711 km 2 , NRMSE = 0.083, NS = 0.930, and WI = 0.980. For the southern slope, the MLP-GWO2 input scenario produced the best estimation, with RMSE = 1261.877 km 2 , NRMSE = 0.098, NS = 0.842, and WI = 0.953. The NRMSE < 0.10 and NS > 0.80 for the southern slope and NS > 0.90 for the northern slope show that the performances of GMDH and MLP-GWO can be highly desirable and superior to the other models in the estimation of snow cover.
The GMDH model, with many optimization parameters, such as the number of layers and the number of each layer's neurons, can provide several make ups to adjust its network structure, whereas the LSSVM model only uses two parameters for optimization ( Table 6). The MLP model is similar to the GMDH in terms of having several optimization parameters. However, it uses the GWO algorithm, which can increase the accuracy of MLP (with an average of 10.7% improvement) by selecting the optimal parameters. The optimal parameters of the machine learning approaches are provided in Table 7.
Correlation graphs were used to compare the correlation between the snow cover estimation and the actual values (Figures 4 and 5). Here, the weakest coefficient of determination (for both slopes) is also reported in MLR estimation. For the northern slope (Figure 4), the highest R 2 is observed in the GMDH outputs (R 2 = 0.92 in GMDH4 and GMDH5), and then in the MLP-GWO model estimation (where R 2 reaches 0.91 in MLP-GWO3 and MLP-GWO4). The mean coefficients of determination are lower for the southern slope ( Figure 5). The weakest coefficient of determination belongs to MLR1 and MLR2, with R 2 = 0.68. The strongest result of this slope belongs to GMDH6, with R 2 = 0.88. These values indicate the favorable estimates of the snow cover provided by the GMDH and MLP-GWO models.
Taylor diagrams were plotted for the models' outputs, as shown in Figures 6 and 7. As there are a large number of models, only two Taylor diagrams were plotted for each slope: The first diagram displays the MLR, SVM, and GMDH outputs, and the second diagram displays the MLP and MLP-GWO outputs. The weakest estimates (points farthest from Obs) are points G and H, which belong to LSSVM1 and LSSVM2, respectively, for the northern slope ( Figure 6). The most accurate estimates are the points P, Q, u, and x, which were obtained for GMDH4, GMDH5, MLP-GWO3, and MLP-GWO6, respectively. These points are located closest to the Obs point between the two dashed lines of RMSE = 1000 km 2 and RMSE = 500 km 2 and below the R = 0.95 line. For the southern slope (Figure 7), the weakest estimates were obtained for points A and B, which belong to MLR1 and MLR2, and they are close to the R = 0.80 line. The most accurate estimation for this slope is the R point of GMDH6. This point is the closest mode to the R = 0.95 line and RMSE = 1000 km 2 dashed line. The effect of using the GWO algorithm on MLP optimization can be observed for both slopes, where, in all cases (with similar inputs), the MLP-GWO points are in a better position than the MLP points, indicating the relatively better performance of this model in estimating snow cover areas.   Taylor diagrams were plotted for the models' outputs, as shown in Figures 6 and 7. As there are a large number of models, only two Taylor diagrams were plotted for each slope: The first diagram displays the MLR, SVM, and GMDH outputs, and the second diagram displays the MLP and MLP-GWO outputs. The weakest estimates (points farthest from Obs) are points G and H, which belong to LSSVM1 and LSSVM2, respectively, for the northern slope ( Figure 6). The most accurate estimates are the points P, Q, u, and x, which were obtained for GMDH4, GMDH5, MLP-GWO3, and MLP-GWO6, respectively. These points are located closest to the Obs point between the two dashed lines of RMSE = 1000 km 2 and RMSE = 500 km 2 and below the R = 0.95 line. For the southern slope ( Figure  7), the weakest estimates were obtained for points A and B, which belong to MLR1 and MLR2, and  In another graphical form, the frequency distribution of the observed and estimated snow cover data was plotted by a violin plot (Figure 8). Here, the frequency distribution of the most accurate performances of each model is examined in the form of violins [36]. The similarity of the violins to the actual snow cover data violins may indicate that the frequency distribution of the estimated data is close to the actual data. For the northern slope, the GMDH outputs, followed by the MLP-GWO outputs, have the most curvature similar to the observational data violin (particularly GMDH, which performed very desirably in showing data curvature greater than the third quartile). The violin of the southern slope observational data is more complex. This complexity is due to the presence of two months of snow cover at the upper tail end of the distribution. Among the models, the GMDH model was able to estimate one of these two exceptional months, which is beyond the capability of the other In another graphical form, the frequency distribution of the observed and estimated snow cover data was plotted by a violin plot (Figure 8). Here, the frequency distribution of the most accurate performances of each model is examined in the form of violins [36]. The similarity of the violins to the actual snow cover data violins may indicate that the frequency distribution of the estimated data is close to the actual data. For the northern slope, the GMDH outputs, followed by the MLP-GWO outputs, have the most curvature similar to the observational data violin (particularly GMDH, which performed very desirably in showing data curvature greater than the third quartile). The violin of the southern slope observational data is more complex. This complexity is due to the presence of two months of snow cover at the upper tail end of the distribution. Among the models, the GMDH model was able to estimate one of these two exceptional months, which is beyond the capability of the other four models, particularly MLR and LSSVM. In estimating the violin curvature of the observational data, GMDH generally has the closest possible curvature to the observational data.
Remote Sens. 2020, 12, x FOR PEER REVIEW  21 of 24 four models, particularly MLR and LSSVM. In estimating the violin curvature of the observational data, GMDH generally has the closest possible curvature to the observational data.

Discussion
Due to the lack of studies on snow cover area estimation, we can follow studies on the estimation of other snow variables [9], as discussed. Estimates of snow depth in the German Black Forest mountains were obtained using inputs similar to this study (except for the fact that the average temperature was used instead of the maximum and minimum temperatures) [37]. In the study, the MODIS sensor was used to monitor the snow depth and the estimator models were similar to the artificial intelligence-based models. The results of this study with 0.83 < R < 0.93, like the current study, indicated the accuracy of artificial intelligence models for estimating the desired snow variables. The MLP model with precipitation, minimum, and maximum temperature inputs was used to estimate the maximum daily fresh snow accumulation (MDFSA) in South Korea [9]. This model presented a desirable result, with a correlation coefficient of R = 0.90, which is consistent with the results of the present study. Binaghi et al. [38] used the RBF neural network to estimate the thickness (depth) of snow in the central Alpine area in Italy. Meteorological inputs (temperature and

Discussion
Due to the lack of studies on snow cover area estimation, we can follow studies on the estimation of other snow variables [9], as discussed. Estimates of snow depth in the German Black Forest mountains were obtained using inputs similar to this study (except for the fact that the average temperature was used instead of the maximum and minimum temperatures) [37]. In the study, the MODIS sensor was used to monitor the snow depth and the estimator models were similar to the artificial intelligence-based models. The results of this study with 0.83 < R < 0.93, like the current study, indicated the accuracy of artificial intelligence models for estimating the desired snow variables. The MLP model with precipitation, minimum, and maximum temperature inputs was used to estimate the maximum daily fresh snow accumulation (MDFSA) in South Korea [9]. This model presented a desirable result, with a correlation coefficient of R = 0.90, which is consistent with the results of the present study. Binaghi et al. [38] used the RBF neural network to estimate the thickness (depth) of snow in the central Alpine area in Italy. Meteorological inputs (temperature and precipitation) were used in the study and NRMSE ranged from 0.04 to 0.07, indicating a better accuracy than the current study. This difference may be due to the nature of the variable (snow depth) itself, which can be explained by the fact that its estimation (based on the normalized RMSE) is more accurate than estimating the snow cover area.

Conclusions
This study investigated the relationships between several drought indexes, including PDSI, BMDI, SPI, MSPI, SPImod, JDI, and SPEI, and the snow cover area, and compared the accuracies of the artificial intelligence-based models, MLR, LSSVM, GMDH, MLP, MLP-GWO, and MLR in estimating the snow cover area using meteorological inputs, precipitation, the minimum and maximum temperature, the global solar radiation, the relative humidity, and the wind velocity. The evaluation of the correlations between drought indices and the snow cover area showed that the effect of drought on snow cover would remain up to 5 (or 6) months in the region. The most related index is the time window 2 of SPI (SPI 2 ), which shows that the effectiveness of a drought event will appear after two months based on snow cover variations. The outcomes of this study obtained in the modeling part confirmed that MLP-GWO and GMDH models can be considered acceptable for estimating the snow cover area of a region. The application of these data-driven models is significant in the following cases: (1) Difficulty in field measurements of the snow cover area; (2) a lack of accessibility to satellite images for monitoring the snow cover area; (3) estimation of the snow cover area in missing months; and (4) estimation of the snow cover area in years without satellite images. This study was conducted on a monthly scale, but a limitation for snow studies, at a daily scale, is the separation of snow from cloud. It can cause errors in monitoring the snow cover area and produces missing daily data. For this subject, the 8-day images of MODIS, or images of the other sensors such as Landsat, SPOT, ASTER, etc., can be used. The results of this study in both sections are acceptable and promising, and are well worth researching elsewhere in the world. They suggest using and testing other types of drought indices to further investigate snow cover variations' effects on droughts. Furthermore, other artificial intelligence-based models, such as ANFIS, RBF, and GRNN, and different optimization algorithms, such as the Genetic Algorithm, Particle Swarm Optimization, Firefly Algorithm, etc., are suggested for snow cover estimation.