Parsimonious Models of Precipitation Phase Derived from Random Forest Knowledge: Intercomparing Logistic Models, Neural Networks, and Random Forest Models

: The precipitation phase (PP) affects the hydrologic cycle which in turn affects the climate system. A lower ratio of snow to rain due to climate change affects timing and duration of the stream ﬂow. Thus, more knowledge about the PP occurrence and drivers is necessary and especially important in cities dependent on water coming from glaciers, such as Quito, the capital of Ecuador (2.5 million inhabitants), depending in part on the Antisana glacier. The logistic models (LM) of PP rely only on air temperature and relative humidity to predict PP. However, the processes related to PP are far more complex. The aims of this study were threefold: (i) to compare the performance of random forest (RF) and artiﬁcial neural networks (ANN) to derive PP in relation to LM; (ii) to identify the main drivers of PP occurrence using RF; and (iii) to develop LM using meteorological drivers derived from RF. The results show that RF and ANN outperformed LM in predicting PP in 8 out of 10 metrics. RF indicated that temperature, dew point temperature, and speciﬁc humidity are more important than wind or radiation for PP occurrence. With these predictors, parsimonious and efﬁcient models were developed showing that data mining may help in understanding complex processes and complements expert knowledge.


Introduction
The precipitation phase has a fundamental role in the hydrologic cycle and the energy fluxes between land and atmosphere, which affects the climate system. Due to climate change (CC) and climate warming, the rain to snow ratio is expected to increase, affecting the water supply stored as snowpack for one billion people world-wide in winter as well as affecting large cities in mountainous tropical regions such as the Andes [1]. Additionally, as global temperature raises, the impact on snowpack dynamics and streamflow amount and timing may affect an adequate management of water resources, added to the pressing need of water supply due to urbanization and migration to main cities.
The Andean glaciers are important for fresh water supply, especially in dry arid regions, making them vulnerable to changes in temperature. This happens because the glacier depends on the snow cover to delay mass loss [2]. However, the phase of precipitation is controlled by atmospheric profiles of temperature and moisture [3,4], identified an air temperature increase by approximately 0.1 • C/decade with a tendency to increase in the  period. For this reason, it is necessary to study the driving factors governing the occurrence of the precipitation phase and its variability.
The trend of higher rain to snow ratio under CC affects several aspects of the hydrological functioning of a basin. Furthermore, the phase of the hydrometeors depends on the energy transfer from the atmosphere's temperature and its water content. Thus, the hydrometeors phases observed at ground stations are influenced by factors such as the vertical profile of temperature, the height of the planetary boundary layer, and the type of temperature gradient [5][6][7]. Changes in precipitation leading to lower snow accumulation decrease the response time of the streamflow and affect the magnitude and the duration of the baseflow in summer. Lower response times of hydrologic basins added to enhanced rainfall extremes due to CC may produce higher frequency of flashfloods with its negative impacts associated. Furthermore, this affects the partitioning of water between evapotranspiration and runoff. Thus, changes in the rain to snow ratio affect the hydrological cycle, the land-atmosphere interaction, and beyond; they have societal impacts from the water management perspective and the assessment of risks related to natural disasters.
Traditionally, air temperature is used to estimate the PP. Empirical relations are calibrated for specific locations [8,9], and are mainly focused on the lower 5% fraction (snow) or values higher than 95% (rain) [10,11]. Other models use a lower and an upper threshold for mixed phase precipitation and then fit a linear or a more complex function to describe the mixed phase range [12,13]. More recently, [11], included humidity content related variables in snow covered mountain regions to model PP, which provided acceptable results. Therefore, the consideration of air moisture related variables provides proxy information about the latent heat fluxes driving energy change in the hydrometeors by sublimation or condensation [14]. Models that consider the hydrometeor temperature (Th) or the dew point temperature (Td) as predictors for PP are grounded in these concepts [15,16].
Few studies investigated the thresholds driving the precipitation phase in the tropical Andes. For instance, [8], compared snow and rain air temperature thresholds between Charquini in Bolivia (ca. 4800 m.a.s.l) and Davos in the Swiss Alps (1590 m.a.s.l). They found differences lower than 0.5 • C, despite its strong humidity and cloudiness seasonality [3,13], with 0% occurrence of snow for temperatures higher than 3-3.5 • C and 100% snow occurrence for temperatures lower than −1-−1.5 • C. Following a classical approach, [17], applied −1 • C and 3 • C thresholds for snow and rain occurrence in the Antisana glacier in Ecuador and fitted a polynomial equation based on the work of [18], to represent the mixed PP. However, the complex orography of the Antisana glacier added to the influence of strong easterly winds, which transport high humidity from the Amazon [4,19], makes this location adequate and scientifically interesting to evaluate the influence of other meteorological variables besides air temperature to derive PP.
The approaches to predict PP change from the variables used for prediction to the type of models used for this purpose. However, due to its simplicity and good performance, logistic models using temperature and/or humidity are of special interest. For instance, [7], used logistic regression models to compute the fraction of snow as a function of various meteorological variables, e.g., air temperature and relative. Their ability to incorporate humidity makes them better to predict the precipitation phase than methods using air temperature alone [11], especially in warm humid sites where the temperature range for solid precipitation increases in relation to dry cool zones [20]. Thus, logistical regression models produced the lowest mean biases compared to observations of snow depth and snow water equivalent [21]. However, despite the relative simplicity to identify the relation of the PP with the predictors in these models, which could make it easier to derive the processes involved, the consideration of more variables is far from straightforward, because expert knowledge is necessary to determine how some processes may be translated into equations. Thus, the use of highly effective data mining models may be an option to study the effect of considering additional variables for PP prediction.
Artificial intelligence (AI) methods have been used successfully in several applications for water resource management [22], where the most widely used approach is utilization of artificial neural networks (ANN). However, due to the simplicity and the highly accurate capabilities in classification and regression problems, random forest (RF) has been explored in water resources more recently. Therefore, in the present study, AI models were used to analyze the complex interacting processes between meteorological variables and PP occurrence. The application of AI methods for PP prediction is a pioneering approach according to the available literature, and the good performance of its application in several fields makes it appealing to evaluate these methods for PP research.
Thus, the aims of the present study were threefold: (i) to evaluate the performance of RF and ANN models in relation to logistic models (LM) to derive PP; (ii) to identify the main drivers of PP occurrence from RF models; and (iii) to use these drivers to develop parsimonious logistic models that can help in understanding the underlying processes in relation to PP occurrence. In Section 2, the study area and the data are described. In Section 3, the methodology is presented. In Section 4, the results are discussed, and in Section 5, the discussion and the concluding remarks are presented.

Study Area and Data
As a part of the Andean Regional Climate Change Adaptation Project/Adaptation to the impact of the accelerated retreat of glaciers in the tropical Andes (PRAA by the acronym in Spanish), the PRAA-Morrena station is in glacier 12 of the Antisana volcano at 4725 m a.s.l, 0 • 29 38 S, 78 • 09 40 O in the west flank of the Andes Cordillera in Ecuador ( Figure 1). Due to its closeness to the glacier, this station is more reliable to study the processes occurring at the terminal zone of the glacier and is more sensitive to changes in temperature and precipitation [23]. In Table 1, the sensors installed in this station are presented, and the study site is shown in Figure 1. The meteorological data were sampled every 10 s and recorded every 15 min due to limitations in memory space. The disdrometer OTT Parsivel 2 [24], measures the hydrometeor type and records the cumulative precipitation every 15 min. The disdrometer is located at 4730 m.a.s.l in the glacier foreland of the Antisana volcano. Kipp & Zonen CNR4-G3 5 < λ < 50 µm (1.00 m) Daily value ±10% The disdrometer measured data were corrected [25][26][27][28][29], for diameter, speed, and bulk variables. However, the PP recorded by the disdrometer was considered adequate.
The data were filtered, selecting precipitation higher than 0.1 mm h −1 , and data with error flags were discarded. Additionally, only data with the meteorological variables available were selected for this study to avoid time series infilling procedure. Finally, 12,248 events were obtained, with 10,747 events between July 2012 and January 2016 for the calibration of models and 1483 observations between July 2019 and February 2020 for the validation. The dew point temperature, T d , was derived from air temperature, T a ( • C), and relative humidity (RH) using the Magnus equation: con r = ln(RH/100) + bT a /(c + T a ) where b = 17.67 and c = 243.5 • C [30]. The hydrometeor temperature, T h , may be considered as the ventilated temperature of wet bulb thermometer related to the surface thermodynamic fluxes. Using the approach of [9]: Due to the non-linear dependence of T h on both sides of the equation, iteratively, the Newton-Rapson numerical setting was applied [9].
where Mvd is the ratio between the water vapor molecular weight and the mean molecular weight for dry air.
Here, e is the saturated vapor pressure for moist air given by the next equation: The list of variables used in the present study is presented in Table 2. These variables were used independently or in groups to generate data availability scenarios, as explained below.

Methods
The first aim of this study was the evaluation of RF and ANN models to predict PP in relation to LM. In Figure 2, the schematic representation of the methodology is shown. The LMs are simple parsimonious models based on expert knowledge which depend on one or two temperature or humidity related variables, limiting the possibility to evaluate the influence of other variables on PP occurrence. However, RF and ANN models can take several variables as inputs, which may improve the predictive performance of PP. Therefore, data availability scenarios (DAS) were devised to evaluate and compare several LM, RF, and ANN models, which may help to identify the performance of AI models depending on the availability of meteorological data. For the second aim, the mean decrease Gini index (MDG) was used to identify the main predictors of PP occurrence (see Figure 2). MDG was calculated as the average throughout the forest of the decrease in Gini impurity for a predictor, obeying higher values for the most important variables. Therefore, MDG was used to identify the most important PP weather drivers, which were used to create LMs. These models were created to evaluate the capacity of automatic generation of predictors and compare them with LMs based on expert knowledge. Thus, these models were evaluated against LM, RF, and ANN models from the first objective. A more detailed explanation of the methodology is presented in the following subsections.

Quality Control of Data
The technical information of the disdrometer used for this study indicated that the accuracies in solid and liquid precipitation were ±5% and ±20%, respectively [24]. In addition, only a few events were classified as solid PP, while high air temperature, e.g., 7 • C, was measured. As such, outliers affected the generation of the liquid phase as well as solid and mixed curves.
Due to the multidimensional nature of the problem, e.g., for each event, in addition to the disdrometer variables, meteorological variables were monitored to classify the large dataset of multidimensional data, and the unsupervised clustering approach k-means was applied. To identify the optimal number of clusters, the silhouette criteria was used. Then, the groups showing physically unfeasible center means (e.g., of PP, air temperature, dew point temperature, and humidity) were removed from the dataset and verified by performing the "S" curve with and without this group. Finally, to furnish the "S" curve [13], temperature bins were selected. For each bin, the median of PP was obtained. This was done for liquid and solid phases. The results obtained using K-means outliers' detection are shown in Figure 3 (T a , T h , T d from top to bottom rows). The original data are presented as "S" curves in cyan color, and the black curves were obtained after K-means filtering, which were smoother, as expected [7]. The change produced by this filter was more pronounced for T a than T h or T d , as shown in Figure 3. Figure 3. Plots of T a , T h , T d (from top to bottom) shown for solid, mixed, and liquid precipitation (from left to right). The "S" curves in cyan were plotted from the observed data, and the black curves were plotted with data after multidimensional K-means outliers' detection. Temperature class is in the horizontal axis in all plots and is represented by the median value per bin.

Data Availability Scenarios (DAS)
Despite the data availability of several meteorological variables for this study, 8 data availability scenarios (DAS) were considered. In this fashion, the results may be evaluated depending on the data available to the researcher, helping to evaluate the performance of the models for several DASs. This also indicates the versatility of the models to be applied.
The scenarios are shown in Table 3. It is worth mentioning that these scenarios were applied only to ANN and RF models, because the LMs are dependent on either T a , T h , or T d . DAS 6 used Ta and RH to compare AI models to LM1. DAS 7 used Th to compare to LM2, and DAS 8 used Td to compare to LM3. Table 3. Description of the data availability scenarios (DAS) and the corresponding variables used for each scenario. For all DAS, the target was the precipitation phase.
Logistic models 2 and 3 (LM2) and (LM3) were dependent of the hydrometeor temperature, Th, and the dew point temperature, Td, respectively. Ref. [9], proposed the power logistic equation: where: T x : is a temperature. For LM2 and LM3, Th and Td were used; b x , c x : are parameters to be calibrated.
To calculate the precipitation fraction related to a specific temperature, increments of 0.5 • C were used as: In similar fashion, this equation was applied to solid and mixed precipitation.
To express the LMs as the multiple linear regression (MLR) [33], presented in Equation (8), it was necessary to use the transformation shown in Equation (9). Then, the calibration of parameters for the MLR was conducted.
Here, y is the dependent variable calculated as Equation (9), x i represents independent variables, β i represents parameters to be calibrated, and ε is a stochastic error. The Table 4 shows the structure of each LM used based on the Equation (9). Table 4. Multiple linear regression models used for LM.

Artificial Neural Networks Models
ANNs are inspired by the functioning of the human brain. They are designed to detect patterns between inputs and outputs by learning algorithms. The artificial neurons have inputs and one output (see Figure 4), and a transfer function transforms the inputs to the output using, as its argument, the weighted sum of inputs. Among the most used transfer functions are the linear, the sigmoid, and the tansigmoid [34]. Artificial neurons are arranged in layers, which are fully connected, and the weights assigned to each connection are defined during the training of the network by complex, highly multidimensional optimization algorithms such as the gradient descent or the Levenberg-Marquart, among others [34]. ANNs are used in classification and regression [35,36]. They are considered as part of the data mining paradigm; therefore, the performance of this approach depends on the data availability. Usually, for calibration, 70% of the data are used, and 30% are used for validation. For a more detailed description of the ANN approach, see [37].
The selection of the transfer functions and the number of neurons for the hidden layer and the output layer was determined iteratively. The tansigmoid transfer function was used in the hidden layer, and the SoftMax function was used in the output layer. These transfer functions are available in the ANN toolbox of MATLAB as part of the "Classification network" option used in this study. For the number of neurons, 3 were used for the output layer and 10 for the hidden layer using the mean square error as loss function. The back-propagation algorithm was used for the optimization of the network synaptic weights. The proportions of data used for calibration, validation, and testing were 70%, 15% and 15%. The data were preprocessed according to the data availability scenarios previously explained, from DAS1 to DAS8. The target vector was one of the following: [0,0,1], [1,0,0], [0,1,0], representing rain, snow, and mixed PP, respectively. For each interval, the rain fraction was calculated as the number of rain events divided by the total events.

Random Forest Models
RF is an AI method based on supervised learning with extensive applications in science and engineering due to its good performance for classification and regression. RF is a group of tools called trees. The forest is formed by a random group of decision trees [38]. Within the forest, every tree is formed by a random sample called bagging, which is the starting point of decision branches. A tree in the end has a decision which counts as a vote, and the decision with the most votes is taken ( Figure 5). If more trees are considered, the prediction is more accurate [39]. The RF algorithm may be used to infer discrete or continuous data. Its versatility copes easily with gaps in the data; however, as with any black box model, at the end, the results are shown, but little information is available about the cause of the results [40]. RFs' ability to cope with forecasting and classification problems makes this technique more attractive for new applications in water resource management. The R library in random forest was used in this study [41]. The reader is referred to the studies of [42] for further information.

Metrics of Evaluation
To evaluate the performances of LM, ANN, and RF models, two kinds of metrics were used, e.g., metrics of detection and metrics of PP fraction quantification, hereafter referred to as fraction quantification metrics. The former measures the detection of snow, rain, or mixed PP events, and the latter measures the fraction of PP for a bin of a variable, for instance, air temperature. The proportions of data for calibration and validation were 80% and 20%, respectively. The data were selected randomly to avoid any trend, which also improved the robustness of the models.

Metrics of Fraction Quantification
The observed and the simulated precipitation phases are represented as Xobs(i) and Xsim(i), and the corresponding fractions are fr obs (k) and fr sim (k). N is considered as the number of observations, and M is the number of bins for the representation of a variable.
The mean bias, MB, indicates positive or negative departure from the observed data. Thus, a value of 0 indicates a null difference of means. The MB is calculated as: The RMSE of the precipitation fraction was calculated as: The explained variance of the precipitation fraction was calculated as: The Nash-Sutcliffe index (NSE) evaluates the representation of extreme values. NSE values of 1, 0, and negative are related to the best model, a model as the mean value, and a model inferior in prediction to the mean value [43]. NSE was calculated as: The Kling-Gupta coefficient (KGE) was used to overcome some limitation of the NSE. The best model has a value of 1, and some authors argue that positive values are generally good models and negative values are bad models, although others argue that values less than 0, e.g., −0.41 [43], indicate the benchmark comparison with the mean. The KGE was calculated as: where: σ f r sim and σ f r obs are the standard deviation of the simulated and the observed fractions. r is the linear correlation.

Metrics of Detection
To evaluate the misclassified events (PM) and the unclassified events (PU), the indexes PM and PU were calculated as shown in the paragraphs below. Events with fractions between 5% and 95% were considered as unclassified, whereas a misclassification was considered if the model simulated below 5% (snow) and rain was observed, or if the model simulated a PP fraction over 95% (the rain) and snow was observed. Following [7], to minimize uncertainties related to the phase of precipitation, the mix precipitation values were excluded.
The critical success index (CSI) is the ratio of the correct forecast to the sum of the correct forecast, plus false alarms, plus misses. The perfect model has a value of 1.
The snowfall change (SFC) and the precipitation accumulative error (PAE) measured the cumulative error in snow to rain ratio [10]. As mentioned, mixed events were excluded. where: S obs (i) and S sim (i) are the observed and the simulated snow indicators.

Normalization of Metrics
To ease the interpretation of the results, a normalization process was applied to all models' evaluation metrics. All metrics were mapped to the range 0 to 1, with a value of 0 for the best model and 1 for the worst model. It is important to clarify that, with metrics such as r 2 where 1 is the best value, the transformation 1-r 2 was applied.
For other metrics, the normalization was achieved by simply dividing the metric for a model x i by the maximum value among all models, as follows: where: N_x i is the normalized value of a x metric for a model i.
In Table 5, the description of the metrics together with the normalized description are shown.

Meteorological Drivers
An important step towards the development of a model is the identification of predictors. Despite the higher performance of AI models over LM, a pitfall is the lack of information about the processes being modeled and the interrelation between predictors and targets. The principal aim of this section was to evaluate the main meteorological drivers for each DAS. Once these main predictors were determined, simpler models, e.g., LM, were developed, as discussed in the next section. This approach might help to understand complex processes unveiling the interrelation of predictors and targets using "automatic methods" or algorithms. The evaluation of the importance of variables in RF can be achieved using two indexes, e.g., the mean decrease Gini (MDG) and the mean decrease accuracy (MDA) [39]. Authors such as [44], concluded that the ranking stability of MDG was superior to MDA for data without correlation between parameters. The main reason for this affirmation is that MDA measures may be scaled by dividing by its empirical standard error. However, MDG has shown to be sensitive to predictors with different scales of measurement, and MDG may be preferred over MDA because of increased stability [45]. Thus, MDG was used in this study to identify the main predictor for RF models in all DAS.

Development of LM Models from the Knowledge of Artificial Intelligence Models
Once the best predictors derived from RF using MDG were determined, logistic models were implemented and evaluated. This step was important because the logistic models use expert knowledge, whereas, here, we proposed to use AI derived knowledge to implement conceptual models.
Thus, after the evaluation of the best predictors for each DAS using the MDG index, LM type models were devised as: where: fr(rain) is the fraction of rain; α LM , β LM , γ LM are parameters to be calibrated; x 1 and x 2 are independent variables extracted from MDG.

Logistic Models
In Table 6, the calibrated parameters for LM1 to LM3 are shown, in addition to the range used for its calibration. As shown in Figure 6, from left to right, the performance of LM1 was superior to LM2 and LM3 and had a better fit, especially for lower and upper values of PP. Considering that LM1 considered relative humidity in addition to temperature, this result is in concordance with the study of [46], who reported that discrimination of PP at ground level was more accurate considering air saturation parameters in addition to temperature parameters. A higher influence of Th for the processes related to PP was evident from its better performance to predict PP in comparison to Td, which showed shortcomings, especially in values between −1 • C and 1 • C PP. However, both models LM2 and LM3 slightly underestimated higher values of PP and overestimated the lower values of PP. Further evaluation is needed in relation to the discrimination thresholds of percentiles 5 and 95 to separate solid from liquid precipitation. Such an analysis will be conducted in future research. Such an analysis is important to compare the thresholds determined in other latitudes, as in as Bolivia by [8].

ANN Models
In Figure 7, the results of rain fraction for the eight DASs are presented. The simulation of DAS1 to DAS6 was good, but for DAS7 and DAS8, ANN models overestimated the high values of PP. For mixed PP, DAS1 to DAS6 were well represented with the observed percentile of 50. This fact shows that ANN models captured intermediate PP better than LM1 and LM2. For low PP, ANN models showed clear shortcomings, underestimating PP for all DASs. This is a clear limitation of this approach because it was present for all DASs, regardless of the predictor variables used in the models.
As mentioned in Section 3.2.1, DAS6, DAS7, andDAS8 were devised to compared AI models to LM1, LM2, and LM3. The results of Figure 7 show that more variables such as DAS1 helped to improve PP representation, but the results for DAS6 surpassing DAS7 and DAS8 agree with the results obtained for LM1, LM2, and LM3. Thus, in addition to temperature related variables, the use of humidity related variables showed better results, regardless of the model.

Random Forest Models
The representation of rain fraction for RF is shown in Figure 8. For high PP values, RF captured well the observed data. The PP under 50% was captured adequately for all DASs. The higher and the lower PP values were well represented, clearly outperforming ANN and LM models. The comparison between DAS6, DAS7, and DAS8, did not show a significant difference, as was evident for LM and ANN models. Thus, the importance of variables using the RF approach may be more adequate using MDG, as discussed in the next section.

Intercomparison of LM, ANN, and RF Models
For further analysis and models intercomparison, in this section, several metrics of fraction quantification and detection are calculated. In Table 5, the fraction quantification metrics are in lower case, e.g., r 2 , kge, rmse, nash, and bias, whereas the detection metrics are in upper case, e.g., CSI, PU, PM, SFC, and PAE.
LMs showed similar results for quantification and detection metrics. However, LM1 metrics corresponded to the visual appreciation of Figure 6, showing it was better at simulating extremes than LM2 and LM3 according to kge. Due to the slight differences among models, the last column in Table 7 shows a score calculated as the sum of the best of the group, 1, or 0 otherwise. Thus, models LM1 and LM3 showed better performance than LM2. ANN models showed that NSE was less sensitive to kge among models. Thus, based on the kge index, ANN models for DAS1 and DAS3 performed slightly better than the rest of the models. RF models in all DASs showed similar results on quantification metrics, with DAS7 and DAS8 having a better fit on extreme values. However, DAS1 and DAS3 had better results in detection metrics. The scores showed that RF for DAS7 was the best RF model in five out of 10 metrics, followed by DAS4 and DAS5. Again, despite the good performance of RF models, it was difficult to easily interpret the reasons why a DAS showed better results. The results of the mean values (Table 7) showed that, overall, RF models had a better performance than ANN models and that LMs were less skillful than both AI models. Thus, RF models showed a robust performance for modeling PP in relation to LM and ANN.
The normalized metrics are displayed in Figure 9. The good performance of RF models was evident for fraction quantification metrics, where five out of five results were the best models, followed by LM and ANN. For detection metrics, ANN models showed a better performance than RF, and LMs were less skillful than ANN and RF. Based on these results, added to the analysis of the results of Table 7, overall, AI models performed better than LM, and RF showed better results than ANN models. The superiority of AI models over LM was expected, firstly because of their ability to map several input variables to represent an output and secondly because of the power of machine learning algorithms. However, an interesting result is the superiority of RF over ANN models. This may be related to the fact that detection of rain, snow, or mixed PP is a classification problem, where discrete outcomes are expected.

Meteorological Drivers
For an exploratory analysis in Figure 10, the center and the right plots showed that PP was more sensible to SH than RH. This interpretation assumed that less sensitive PP showed overlapping boxplots, while boxplots showing different ranges of a variable were more sensitive to it. For instance, in Figure 10 (center), the specific humidity showed the lowest values for snow, the highest values for rain, and mixed PP presented intermediate values of specific humidity. Contrarily, in Figure 10 (right), there was not a clear range of relative humidity associated with rain, snow, or mixed PP. This may imply that specific humidity may be a better predictor than relative humidity when considering air moisture related variables.
In Figure 10 (left), Th and Td were indicated to be better predictors than Ta. However, this assumption was still a qualitative one, and a better assessment of meteorological drivers was obtained from the MDG index of RF models.
To derive the meteorological drivers for PP, the MDG obtained from RF was calculated for the eight DASs. In Figure 11, the average value of MDG for each variable is shown. For DAS1, it was clear that specific humidity (SH) was the most important variable when all variables were considered. This was already reflected in the qualitative analysis of Figure 10. For DAS1, SH was followed by air temperature and then by outgoing radiation variables, finally followed by wind speed. Relative humidity was less important than SH as well as wind direction. These results agree with the findings from [47] in relation to the development of an empirical mass balance model for the Antisana glacier; although they used monthly values from reanalysis data only for temperature and wind, they argued that wind may be used as a proxy of moisture flux from the Amazon.  The MDG index for all DASs reflected that, in all cases, temperature and humidity related variables were more important than radiation related variables or wind speed or direction. This result is valuable from a phenomenological perspective because it indicates that monitoring temperature and air moisture content may be useful for surface PP forecasting. Additionally, when air temperature, hydrometeor temperature, and dew point temperature were available, Td was more important than Th, and this was more important than air temperature. This was shown in DAS3 and DAS4. Additionally, when specific humidity and relative humidity were available, specific humidity was more important than relative humidity.

Implementation of Logistic Models Based on AI Knowledge
The results of MDG for all DASs showed that some variables persistently surpassed others in importance. Here, three variables were selected to implement in two LM models, e.g., LM4 was devised with T d and SH d as independent variables and LM5 with T a and SH a . Afterwards, these models were compared to expert knowledge based classical LMs, LM1 to LM3.
The structures of LM4 and LM5 models are shown in Table 8 as well as the calibrated parameters. The MLR optimization returned a wider range for α, β, and γ parameters in model LM4 compared to LM5. This was probably due to the difference in the range of T d and T a .

Evaluation of Logistic Models Derived from MDG Predictors
According to Figure 12, LM5 performed better than LM1, LM2, and LM3 in six out of 10 metrics. However, LM4 could not surpass the performance of LM5 despite the use of Td and SH, the most important variables derived from MDG. This model showed, overall, a similar performance to LM3, an expert knowledge-based model. However, the use of SH improved the representation of LM1, which used RH as a humidity related variable. Although further research is necessary, these results highlight the possibility of "automatic" discovery of predictors by AI models and the conceptual implementation in simpler models. Nevertheless, LM4 did not show a significantly better performance than LM1 to LM3, although a similar performance showed the potentiality of AI-based development of knowledge, which is a limitation of black box models.
To evaluate the performance of LM5 in relation to RF and ANN models, in Figure 13, the normalized metrics are shown. This plot is like Figure 9, where the averages of LM1, LM2, and LM3 are presented. LM5 improved in kge as the best model and also improved for rmse with similar results to RF, surpassing ANN performance and SFC with better results than RF models. These results prove very valuable, because using both predictors selected by MDG improved the performance of parsimonious logistic models, turning information derived from black box models into more transparent models.

Models for Precipitation Phase and Predictor Variables
The temperature and the humidity profiles together with the depth of the surface layer control the precipitation phase reaching the ground surface [3]. However, when the surface layer temperatures are close to freezing, and the mixing ratios are neither close to saturation nor very dry, the phase at the surface becomes difficult to forecast. Furthermore, changes in temperature and atmosphere's saturation can also alter the precipitation phase due to pressure changes driven by changes in vertical air velocity or vertical air movement [48]. The PRAA station fulfills all these conditions due to (1) its altitude near the 0 • C isotherm located between 4800-5100 m.a.s.l [49]; (2) large variability of temperature and specific humidity between days; and (3) strong wind speed around 4.6 m/s with gusts up to 12 m/s [29]. Therefore, models based on both temperature and humidity variables achieved better results to forecast PP, as occurred with LM1 and LM5.
Comparing the calibrated parameters of LM1 with respect to the results presented by [7], the P5-P95 range and the P50 values were very different. This could be related to differences in climatic conditions between inner tropics and the Alps. On the other hand, we computed the rain fraction with P50 calibrated values for LM1 (Table 5) and the mean temperature (2.1 • C) and the relative humidity (81%) of the site (according [29]). Afterwards, it was compared with the rain fraction computed with the mean temperature and the relationship derived by [17] for the Antisana. With the LM1 model, we obtained a rain fraction of 0.83, less than the 0.91 obtained by Wagnon. Thus, the RH inclusion in LM1 decreased the rain fraction with respect to another validated approach based on Ta only.
The current use of Parsivel OTT2 disdrometer is an excellent opportunity to monitor present weather in the site and to explore new approaches to discriminate PP. Since only simple classifications based on temperature are available for the Andes [8,17], these findings will help to improve the knowledge about precipitation in alpine sites of tropical Andes, where solid precipitation is limited to altitudes above 4000 m.a.s.l. Furthermore, the parameters of LM models may be used to obtain solid precipitation, the key variable for glaciological modeling in the inner tropics [2].
The application of RF and ANN allowed us to recognize that only Ta provided a very poor explanation of the PP, while the inclusion of RH or SH improved the performance of LM models. This could be explained due to humidity variations being related to latent heat exchange of hydrometeors during their fall [9]. These facts justify why logistics models using temperature and humidity variables were demonstrated to be the most suitable to explain the sigmoidal S-shaped curve of rain-snow partitioning [7,11,46,50].

Precipitation Phase Trends and Climate Change
Studies related to PP are almost inexistent in Ecuador despite the important fact that, for instance, the rain to snow ratio represents important information for water resource management and risks related to flash floods occurrence. Due to climate change, the rain to snow ratio is expected to increase, affecting water supply stored as snowpack for one billion people world-wide in winter while also affecting large cities in mountainous tropical regions such as the Andes [1]. Snow falling in the Andes is stored as ice in glaciers and then is released during the dry season, contributing to the base flow. Thus, the buffering effect of glaciers is a vital service during the dry season for water consumption as well as industrial and agricultural uses [51]. Although the retreat in glaciers may not have a great impact in the hydrologic cycle at large scale, due to the buffering effect of wetlands in Ecuador and Perú [1], the change in snowpack may modify the hydrologic functioning of watersheds, wetlands, and groundwater reservoirs in addition to surface drainage [52]. Another aspect that deserves attention is the increasing risk of flash floods occurrence due to increasing rain to snow fraction in CC. However, the change of rain to snow fraction is highly complex in space, depending on several aspects such as topographic features, mean temperature, and snow cover duration, among others [53].
Considering similar thresholds under CC, e.g., −1 and 3, for snow or rain PP, this study showed that temperature and specific humidity are much more important than other variables such as incoming and outgoing radiation or wind. Although it is uncertain, the amount of precipitation in Ecuador is expected to increase due to CC [1,54], but climate models have less uncertainty with temperature, which will raise 4.5-5 • C by the end of this century under "business as usual" type emissions scenarios [55]. Additionally, more moisture is expected to affect the tropical Andes due to enhanced easterlies in CC. Therefore, these changes in the driving variables of PP may affect the processes involved, possibly showing non-linear responses, typically related to phase change processes. Thus, to develop informed base adaptation strategies to CC, more monitoring and studies related to the societal impact of PP change and occurrence are necessary, especially for large cities under increasing need for water provisioning and safety.

Conclusions
The dramatic decrease in the ratio of snow to rain due to climate change affects the timing and the duration of the stream flow, affecting the hydrologic cycle and the climate system through complex feedback processes. Therefore, more knowledge about the PP occurrence and the driving mechanisms is necessary. The understanding of these processes is especially important for water management in cities dependent on water coming from glaciers, such as several cities in the Andes, and along other mountain regions around the world.
This study first compared the performance of RF and ANN models in relation to logistic models to derive PP; then, it identified the main drivers of PP occurrence from RF models and finally used these drivers to develop parsimonious logistic models that can help in understanding the underlying processes in relation to PP occurrence. The results showed that RF and ANN outperformed LM in predicting PP in eight out of 10 metrics. Additionally, RF indicated that temperature, dew point temperature, and specific humidity were more important than wind or radiation for PP occurrence. With these predictors, parsimonious and efficient models were developed.
In this research, we presented a pioneering approach in this field. Several advantages of artificial intelligence methods were evident in helping to improve PP forecasting capabilities, identify predictors, and formulate parsimonious models. Thus, information and possibly "knowledge" could be extracted from massive amounts of data, which may complement the existing expert knowledge in this field.
Further work is necessary to evaluate, in other regions, the methodology developed in this study as well findings such as the model structure and the main predictors of PP, which may be specific for the Antisana glacier. Additionally, given the nonlinear response of PP to temperature thresholds, the development and the comparison of models for specific seasons would help to improve our understanding of the physical processes behind PP occurrence.

Data Availability Statement:
Measured and derived data that support the findings of this study are available from the corresponding author, L.C. on request.