A Model for the Relationship between Rainfall, GNSS-Derived Integrated Water Vapour, and CAPE in the Eastern Central Andes

Abstract: Atmospheric water vapour content is a key variable that controls the development of deep convective storms and rainfall extremes over the central Andes. Direct measurements of water vapour are challenging; however, recent developments in microwave processing allow the use of phase delays from L-band radar to measure the water vapour content throughout the atmosphere: Global Navigation Satellite System (GNSS)-based integrated water vapour (IWV) monitoring shows promising results for measuring vertically integrated water vapour at high temporal resolution. Previous works also identified convective available potential energy (CAPE) as a key climatic variable for the formation of deep convective storms and rainfall in the central Andes. Our analysis relies on GNSS data from the Argentine Continuous Satellite Monitoring Network (Red Argentina de Monitoreo Satelital Continuo, RAMSAC) from 1999 to 2013. CAPE is derived from version 2.0 of the ECMWF (European Centre for Medium-Range Weather Forecasts) Re-Analysis (ERA-Interim) and rainfall from the TRMM (Tropical Rainfall Measuring Mission) product. In this study, we first analyse the rainfall characteristics of two GNSS-IWV stations by comparing their complementary cumulative distribution functions (CCDFs). Second, we separately derive the relations of rainfall to CAPE and to GNSS-IWV. Based on our distribution fitting analysis, we observe an exponential relation of rainfall to GNSS-IWV. In contrast, we report a power-law relationship between the daily mean value of rainfall and CAPE at the GNSS-IWV station locations in the eastern central Andes that is close to the theoretical relationship based on parcel theory. Third, we generate a joint regression model through a multivariable regression analysis using CAPE and GNSS-IWV to explain the contribution of both variables in the presence of each other to extreme rainfall during the austral summer season.
We found that rainfall can be characterised with a higher statistical significance for higher rainfall quantiles, e.g., the 0.9 quantile, based on the goodness-of-fit criterion for quantile regression. We observed different contributions of CAPE and GNSS-IWV to rainfall for each station for the 0.9 quantile. Fourth, we identify the temporal relation between extreme rainfall (the 90th, 95th, and 99th percentiles) and both GNSS-IWV and CAPE at 6 h time steps. We observed an increase before the rainfall event and at the time of peak rainfall, both for GNSS-IWV and CAPE. We show higher values of CAPE and GNSS-IWV for higher rainfall percentiles (99th and 95th percentiles) compared to the 90th percentile at a 6 h temporal scale. Based on our correlation analyses and the dynamics of the time series, we show that both GNSS-IWV and CAPE had comparable magnitudes, and we argue that both climatic variables should be considered when investigating their effect on rainfall extremes.


Introduction
The south-central Andes is an area that is affected by hydrometeorological extreme events, e.g., [1][2][3][4][5][6][7]. The combination of topography and climate forms the most important driver for generating deep convective storms along the eastern central Andes, e.g., [8][9][10][11]. Atmospheric water vapour content is a crucial variable triggering the convection and rainfall extremes in the south-central Andes [12]. Water vapour also plays an important role in controlling atmospheric stability as it is the primary variable leading to the formation of convective storm systems [13] by enhancing the convective available potential energy (CAPE) [14].
However, direct and three-dimensional measurements of water vapour in the atmosphere are difficult and require atmospheric sounding [15] or recent developments in radar processing, such as Global Navigation Satellite System (GNSS) methods, to monitor the atmospheric integrated water vapour (IWV) content [16][17][18]. GNSS-derived troposphere products calculated from the zenith total delay (ZTD) [19] are now used as reliable meteorological data in climate studies [20]. The advantage of GNSS-IWV measurements is their high spatial and temporal resolution.
Several previous research studies used GNSS atmospheric data to study extreme rainfall events [21,22]. Some studies have shown and documented that the variations in the GNSS-IWV are temporally correlated with rainfall. These studies have shown an increase in the GNSS-IWV several hours before extreme rainfall, mostly followed by a decrease after the event [23,24]. Furthermore, Priego et al. [13] investigated the joint effect of GNSS-IWV and atmospheric pressure on extreme rainfall and they showed a high spatiotemporal correlation between the variations of GNSS-IWV and severe rainfall in eastern Spain. Calori et al. [25] indicated that GNSS-IWV can show moisture variability in connection with severe storms in the Cuyo region in Mendoza in the south-central Andes. Thus, GNSS-IWV has been identified as a reliable parameter for detecting atmospheric convection and extreme rainfall.
An additional parameter that triggers convection and rainfall extremes and describes the atmospheric stability is CAPE [26]. CAPE indicates the amount of energy available for convection [27]. Mesgana et al. [28] defined CAPE as a proxy for extreme rainfall over the United States and Southern Canada. Furthermore, Murugavel et al. [29] have shown a high contribution of CAPE to heavy rainfall during the monsoon season over the Indian region. A different study suggested that the spatial pattern of extreme-rainfall events can be described by a combination of the dew-point temperature and CAPE in the south-central Andes [4].
These previous studies separately analysed the spatiotemporal distribution of extreme rainfall related to IWV and CAPE [25,28], but they did not fully explain the joint contribution of both variables to extreme rainfall. Our study aims to identify the relationship between rainfall, GNSS-IWV, and CAPE and to analyse the effect of GNSS-IWV and CAPE in the presence of each other on extreme rainfall generation. We specifically focus on the south-central Andes, where convection plays an important role in extreme rainfall generation.

Data
We used the ERA-Interim reanalysis data (1979-present), version 2.0, of the ECMWF (European Centre for Medium-Range Weather Forecasts) [30,31] to analyse CAPE. CAPE is defined as [32]:

$$\mathrm{CAPE} = \int_{P_{LNB}}^{P_{LFC}} R_d \,(T_{vp} - T_{ve})\, d\ln P \qquad (1)$$

It is the integral between the level of free convection (LFC) and the level of neutral buoyancy (LNB). $T_{vp}$ and $T_{ve}$ are the virtual temperatures of the air parcel and of the surrounding environment, respectively, $R_d$ is the gas constant for dry air, and $P$ is pressure.
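As a rough numerical illustration of this definition, the following sketch integrates R_d (T_vp − T_ve) over ln P with the trapezoidal rule, assuming the standard form CAPE = ∫ R_d (T_vp − T_ve) d ln P between the LNB and the LFC. The function name `cape_from_profile`, the value used for R_d, and the sample profile are illustrative assumptions; this is not the ERA-Interim algorithm.

```python
import math

R_D = 287.04  # approximate gas constant for dry air, J kg^-1 K^-1 (assumed value)

def cape_from_profile(pressure_hpa, t_vp, t_ve):
    """Trapezoidal estimate of CAPE (J/kg) between the LFC and the LNB.

    pressure_hpa runs from the LFC (highest pressure) to the LNB (lowest).
    t_vp and t_ve are parcel and environment virtual temperatures in K.
    """
    cape = 0.0
    for k in range(len(pressure_hpa) - 1):
        buoy_lo = t_vp[k] - t_ve[k]
        buoy_hi = t_vp[k + 1] - t_ve[k + 1]
        # ln P decreases upward, so take the positive layer thickness in ln P.
        dlnp = math.log(pressure_hpa[k]) - math.log(pressure_hpa[k + 1])
        cape += R_D * 0.5 * (buoy_lo + buoy_hi) * dlnp
    return cape

# Hypothetical profile: the parcel is 2 K warmer than its environment mid-layer.
p = [800.0, 600.0, 400.0, 200.0]
tvp = [285.0, 272.0, 252.0, 220.0]
tve = [285.0, 270.0, 250.0, 220.0]
print(round(cape_from_profile(p, tvp, tve), 1), "J/kg")
```

The sketch takes the LFC and LNB as given (first and last profile levels); locating them from a sounding is a separate step.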
For the calculation of CAPE in ERA-Interim there are two key assumptions: First, there is no mixing of the parcels with the surrounding air. Second, the ascent is pseudo-adiabatic, i.e., all condensed water falls out as precipitation [33]. The ERA-Interim reanalysis data used in this study have a spatial resolution of 0.75° × 0.75° and a temporal resolution of 6 h. ERA-Interim reanalysis data are interpolated to the station points using a nearest-neighbour interpolation method.
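A minimal sketch of nearest-neighbour selection on a regular latitude-longitude grid is given below. The helper names, the 3 × 3 grid, and the field values are hypothetical, and the nearest node is found separately along each axis, which is a simplification that ignores great-circle distances.

```python
def nearest_grid_index(coords, target):
    """Index of the grid coordinate closest to target (first one on ties)."""
    return min(range(len(coords)), key=lambda i: abs(coords[i] - target))

def nearest_neighbour_value(lats, lons, field, lat, lon):
    """Value of field[i][j] at the grid node nearest to (lat, lon)."""
    i = nearest_grid_index(lats, lat)
    j = nearest_grid_index(lons, lon)
    return field[i][j]

# Hypothetical 0.75-degree grid around a station near 26.83 S, 65.22 W.
lats = [-26.25, -27.0, -27.75]
lons = [-66.0, -65.25, -64.5]
cape = [[100.0, 200.0, 300.0],
        [400.0, 500.0, 600.0],
        [700.0, 800.0, 900.0]]
value = nearest_neighbour_value(lats, lons, cape, -26.83, -65.22)
print(value)
```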
We used GNSS-integrated water vapour data from the Argentine Continuous Satellite Monitoring Network (RAMSAC) for two stations: San Miguel de Tucumán (TUCU, 1999-2013) located at 65°13' W and 26°50' S and San Fernando del Valle de Catamarca (CATA, 2008-2013) located at 65°46' W and 28°28' S (Figure 1A). The distance between these two GNSS stations is about 189 km. These two stations were selected as they have the longest data availability for this region.
The RAMSAC network was created in 1998 and has grown to include around 100 continuously operating GNSS stations in northern and central Argentina [34]. The GNSS-IWV data used in this study have a temporal resolution of 30 min. The GNSS-IWV data set contains gaps (3% of the record at CATA and 14% at TUCU); the CAPE and rainfall values for the corresponding hours or days were removed from the associated time series.
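One possible way to handle such gaps is sketched below: 30 min IWV samples are averaged into 6 h blocks, and time steps without a valid IWV value are dropped from all three series. The aggregation rule, the use of `None` for missing samples, and the function names are assumptions made for illustration; the text does not specify these details.

```python
def six_hour_means(iwv_30min):
    """Average 30 min samples into 6 h blocks (12 samples per block).

    Missing samples are None; a block with no valid sample yields None.
    """
    means = []
    for start in range(0, len(iwv_30min) - 11, 12):
        valid = [v for v in iwv_30min[start:start + 12] if v is not None]
        means.append(sum(valid) / len(valid) if valid else None)
    return means

def drop_gaps(iwv, cape, rain):
    """Keep only the time steps where the IWV value is present."""
    keep = [k for k, v in enumerate(iwv) if v is not None]
    return ([iwv[k] for k in keep],
            [cape[k] for k in keep],
            [rain[k] for k in keep])

# Hypothetical 18 h record with one fully missing 6 h block.
raw = [10.0] * 12 + [None] * 12 + [20.0] * 6 + [None] * 6
iwv_6h = six_hour_means(raw)
iwv_ok, cape_ok, rain_ok = drop_gaps(iwv_6h, [900.0, 1200.0, 1500.0], [0.0, 3.5, 8.0])
print(iwv_ok, cape_ok, rain_ok)
```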
TRMM (Tropical Rainfall Measuring Mission) data [35], product 3B42 [36,37] (Version 7), with a spatial resolution of 0.25° × 0.25° and a 3-hourly temporal resolution have been used to analyse rainfall. TRMM rainfall data are interpolated to the station location using a nearest-neighbour interpolation method. TRMM data have been shown to be a reliable dataset for investigating rainfall in South America [4,38,39].

GNSS Integrated Water Vapour (IWV) Processing
The Global Navigation Satellite System (GNSS) data are organized in 24 h units and were processed using the Earth Parameter and Orbit System (EPOS) software at the German Research Centre for Geosciences (GFZ) [40]. To estimate the zenith path delay (ZPD), we used the Vienna Mapping Functions (VMF) [41]. The elevation cutoff angle is 7 degrees. The data processing was done in the following steps: First, the two stations were processed in Precise Point Positioning (PPP) mode using the GFZ's own second reprocessed Global Positioning System (GPS) satellite clock and orbit products [42]. In the PPP processing, low-quality observations and outliers were removed.
In a second step, the remaining data were used for network processing. Several well-distributed IGS (International GNSS Service) core stations are included in this network. The ZPD changes mainly with the temperature and water vapour content. In the last step, the estimated ZPD was converted to integrated water vapour (IWV) [17]. The retrieved IWV has an accuracy of 1-2 kg m⁻² and a precision of 1 kg m⁻² [43]. The humidity-induced part of the ZPD provides a valuable source of vertically integrated water vapour, e.g., [16,17]. To estimate the IWV, meteorological information (ground pressure and the mean temperature above the station) is needed. In this study, the ECMWF ground pressure and mean temperature data were used to convert the ZPD to IWV [44].
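The conversion step can be sketched as follows, assuming a Saastamoinen-type hydrostatic delay and the commonly used refractivity constants for the wet-delay conversion factor. These formulas and constants are assumptions chosen for illustration and are not necessarily those of the EPOS processing; the function names and the sample inputs are hypothetical.

```python
import math

R_V = 461.5       # specific gas constant of water vapour, J kg^-1 K^-1
K2_PRIME = 0.221  # refractivity constant, K Pa^-1 (22.1 K hPa^-1), assumed
K3 = 3739.0       # refractivity constant, K^2 Pa^-1 (3.739e5 K^2 hPa^-1), assumed

def zenith_hydrostatic_delay(pressure_hpa, lat_deg, height_km):
    """Saastamoinen-type hydrostatic delay in metres."""
    f = (1.0 - 0.00266 * math.cos(2.0 * math.radians(lat_deg))
         - 0.00028 * height_km)
    return 0.0022768 * pressure_hpa / f

def iwv_from_zpd(zpd_m, pressure_hpa, tm_k, lat_deg, height_km):
    """Integrated water vapour (kg m^-2) from the zenith path delay (m)."""
    # Wet delay = total delay minus the pressure-dependent hydrostatic part.
    zwd = zpd_m - zenith_hydrostatic_delay(pressure_hpa, lat_deg, height_km)
    # Conversion factor in kg m^-2 per metre of wet delay; depends on Tm.
    factor = 1.0e6 / (R_V * (K3 / tm_k + K2_PRIME))
    return factor * zwd

# Hypothetical inputs: ZPD 2.35 m, 960 hPa, Tm 275 K, 26.83 S, 0.45 km height.
print(round(iwv_from_zpd(2.35, 960.0, 275.0, -26.83, 0.45), 1), "kg m^-2")
```

The ground pressure controls the hydrostatic part, while the mean temperature enters only through the conversion factor, which is why both quantities are needed.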

Identifying the Effect of GNSS-IWV and CAPE on Extreme Rainfall Formation
Understanding the conditions leading to extreme rainfall events is difficult, because of complex, interfering atmospheric processes in the eastern central Andes. Previous research indicates that extreme rainfall in the south-central Andes is often caused by deep convective storms [5,10,46]. An analysis requires the investigation of the dominant climatic variables leading to extreme rainfall events with reliable data. In this study, we analyse the joint effect of GNSS-IWV and CAPE on extreme rainfall generation.
First, we analyse the rainfall distribution for each GNSS-IWV station (TUCU and CATA, see Figure 1A) in the south-central Andes using the complementary cumulative distribution function (CCDF) and by comparing the best-fit parameters.
Second, we investigate the seasonal behaviour and the fluctuations in the GNSS-IWV and CAPE in conjunction with rainfall for both stations (TUCU and CATA) on the daily scale. We use the wavelet coherence analysis to confirm the seasonal agreement between rainfall, GNSS-IWV, and CAPE.
Third, we analyse the relation between the daily mean of GNSS-IWV and rainfall for both station locations (TUCU and CATA). We perform a curve fitting process to model the rainfall as a function of the GNSS-IWV. After fitting several different models to the data, we found that the data were best fitted by an exponential model (Equation (2)) based on model fitting statistics (root mean squared error, R-squared value, F-statistic of the regression model, and p-value):

$$\text{rainfall} = e^{\alpha \cdot \text{GNSS-IWV}} \qquad (2)$$

where α is the regression coefficient for GNSS-IWV. We take the natural logarithm of both sides of the equation to linearize it:

$$\ln \text{rainfall} = \alpha \cdot \text{GNSS-IWV} \qquad (3)$$

Fourth, we analyse the correlation between the daily mean of rainfall and CAPE at both station locations (TUCU and CATA) using a power-law relationship as described in previous research based on parcel theory [4,26,27]. Based on parcel theory, we expect the rainfall intensity to scale with $\sqrt{\text{CAPE}}$ (β = 0.5) if CAPE is efficiently transferred to parcel kinetic energy [26,27]:

$$\text{rainfall} = \text{CAPE}^{\beta} \qquad (4)$$

where β is the regression coefficient for CAPE. After linearisation, the parcel-theory relationship becomes:

$$\ln \text{rainfall} = \beta \cdot \ln \text{CAPE} \qquad (5)$$

Fifth, we generate the regression model in Equation (6), which shows a log-linear relationship between the daily mean rainfall and both variables at both station locations:

$$\ln \text{rainfall} = c + \alpha \cdot \text{GNSS-IWV} + \beta \cdot \ln \text{CAPE} \qquad (6)$$

where α and β are the regression coefficients for GNSS-IWV and CAPE, respectively. Our approach and the formulations are based on previous studies for extreme rainfall [4,26,27].

Sixth, we compare the contribution of each variable to the rainfall extreme events in the presence of each other by joint regression of both variables using quantile regression [47]. We then test the goodness of our model based on the goodness-of-fit criterion for a certain quantile [47]. For the quantile regression, a linear model for the conditional quantile function is defined as follows [47]:

$$Q_{y_i}(\tau \mid x) = x_{i1}^{\top}\beta_1(\tau) + x_{i2}^{\top}\beta_2(\tau)$$

Let $\hat{\beta}(\tau)$ be acquired by minimizing the problem

$$\hat{V}(\tau) = \min_{b \in \mathbb{R}^{p}} \sum_{i} \rho_\tau\left(y_i - x_i^{\top} b\right)$$

which is an unrestricted quantile regression. Let $\tilde{\beta}(\tau)$ be acquired by minimizing the constrained problem

$$\tilde{V}(\tau) = \min_{b_1 \in \mathbb{R}^{p-q}} \sum_{i} \rho_\tau\left(y_i - x_{i1}^{\top} b_1\right)$$

which is a restricted quantile regression, and where $\rho_\tau(u) = u\,(\tau - I(u < 0))$ is the loss function.

A goodness-of-fit value is then calculated following [47]:

$$R^{1}(\tau) = 1 - \frac{\hat{V}(\tau)}{\tilde{V}(\tau)}$$

$R^{1}(\tau)$ has a value between 0 and 1, and a value of 1 shows a perfect quantile regression model.
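To make the quantile-regression step concrete, the sketch below fits the three coefficients of Equation (6) at a chosen quantile and computes a goodness-of-fit value as one minus the ratio of the unrestricted to the restricted loss. It is a brute-force illustration for small samples, not the estimator used in the study: it relies on the property that some minimiser passes exactly through three observations, and it assumes that the restricted model contains only the intercept. All function names and the sample data are hypothetical.

```python
from itertools import combinations

def pinball_loss(residuals, tau):
    """Sum of rho_tau(u) = u * (tau - I(u < 0)) over all residuals."""
    return sum(u * (tau - (1.0 if u < 0 else 0.0)) for u in residuals)

def solve3(a, b):
    """Solve a 3x3 linear system with Cramer's rule; None if singular."""
    def det(m):
        return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
                - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
                + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))
    d = det(a)
    if abs(d) < 1e-12:
        return None
    sol = []
    for col in range(3):
        m = [row[:] for row in a]
        for r in range(3):
            m[r][col] = b[r]
        sol.append(det(m) / d)
    return sol

def quantile_fit(x_rows, y, tau):
    """Brute-force quantile regression for three coefficients."""
    best_loss, best_coef = float("inf"), None
    for idx in combinations(range(len(y)), 3):
        coef = solve3([x_rows[i] for i in idx], [y[i] for i in idx])
        if coef is None:
            continue
        res = [yi - sum(c * xi for c, xi in zip(coef, row))
               for row, yi in zip(x_rows, y)]
        loss = pinball_loss(res, tau)
        if loss < best_loss:
            best_loss, best_coef = loss, coef
    return best_coef, best_loss

def goodness_of_fit(x_rows, y, tau):
    """Return the coefficients and 1 - V_hat / V_tilde (intercept-only restriction)."""
    coef, v_hat = quantile_fit(x_rows, y, tau)
    # The intercept-only minimum is always attained at an observed y value.
    v_tilde = min(pinball_loss([yi - q for yi in y], tau) for q in y)
    return coef, 1.0 - v_hat / v_tilde

# Hypothetical wet-day sample: IWV in kg m^-2, ln(CAPE), and ln(rainfall).
iwv = [20.0, 25.0, 30.0, 35.0, 40.0, 45.0, 22.0, 28.0, 33.0, 38.0]
ln_cape = [5.0, 6.5, 5.5, 7.0, 6.0, 7.5, 7.2, 5.2, 6.8, 5.8]
noise = [0.2, -0.1, 0.3, -0.2, 0.1, 0.0, -0.3, 0.25, -0.15, 0.05]
x_rows = [[1.0, a, b] for a, b in zip(iwv, ln_cape)]
ln_rain = [0.5 + 0.05 * a + 0.4 * b + e for a, b, e in zip(iwv, ln_cape, noise)]
(c, alpha, beta), r1 = goodness_of_fit(x_rows, ln_rain, tau=0.9)
print(f"c={c:.2f} alpha={alpha:.3f} beta={beta:.2f} R1={r1:.2f}")
```

The enumeration grows with the cube of the sample size, so a linear-programming solver would be the practical choice for station records of several years.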
Seventh, we analyse the temporal relation between extreme rainfall and both GNSS-IWV and CAPE on the 6 h scale. We select all data with rainfall above the 90th, 95th, and 99th percentiles, respectively, and their corresponding CAPE and GNSS-IWV amounts. We then average the 6 h GNSS-IWV and CAPE data within the 72 h window (event day plus day before and day after) for each percentile at both stations, and we show the correlation between both variables and extreme rainfall.

In this study, we aim to decipher the influence of both GNSS-IWV and CAPE on extreme-rainfall events. We use the 90th percentile of daily mean rainfall to characterize extreme rainfall. The corresponding 90th percentiles for the stations TUCU and CATA are 22 mm/day and 20 mm/day, respectively. The median value of daily mean rainfall is 3.1 mm/day for TUCU and 2.8 mm/day for CATA. We first show that each station (TUCU and CATA) in the eastern central Andes has a different rainfall distribution.
We rely on a two-sample Kolmogorov-Smirnov (KS) test to compare the distribution of rainfall at both stations during 2008-2013, the period for which data are available at the CATA station. Based on the test result, the hypothesis that rainfall at both stations is drawn from the same continuous distribution is rejected. Therefore, we accept the alternative hypothesis that each station has a different rainfall distribution. The p-value (p = 0.04) confirms the difference between both distributions at the 5% significance level. Similarly, the KS test for the comparison of IWV data shows that they are drawn from different distributions.
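For reference, a two-sample KS test can be sketched as below. The statistic is the largest distance between the two empirical distribution functions, and the p-value uses a common asymptotic approximation of the Kolmogorov distribution; the approximation, the function name, and the sample values are assumptions for illustration and do not reproduce the values reported here.

```python
import math

def ks_two_sample(a, b):
    """Two-sample KS statistic D and an asymptotic p-value."""
    a, b = sorted(a), sorted(b)
    n1, n2 = len(a), len(b)
    i = j = 0
    d = 0.0
    while i < n1 and j < n2:
        x = min(a[i], b[j])
        # Advance both empirical CDFs past every value equal to x.
        while i < n1 and a[i] == x:
            i += 1
        while j < n2 and b[j] == x:
            j += 1
        d = max(d, abs(i / n1 - j / n2))
    ne = n1 * n2 / (n1 + n2)
    lam = (math.sqrt(ne) + 0.12 + 0.11 / math.sqrt(ne)) * d
    if lam < 0.3:
        # The series below converges poorly here; the p-value is ~1.
        return d, 1.0
    # Kolmogorov series: Q(lam) = 2 * sum_{k>=1} (-1)^(k-1) exp(-2 k^2 lam^2)
    p = 2.0 * sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
                  for k in range(1, 101))
    return d, min(max(p, 0.0), 1.0)

# Hypothetical daily rainfall samples (mm/day) for two stations.
station_a = [0.5, 1.2, 3.1, 4.0, 7.5, 12.9, 22.0, 30.1]
station_b = [0.2, 0.9, 2.8, 3.5, 6.0, 9.4, 20.0, 28.0]
d_stat, p_val = ks_two_sample(station_a, station_b)
print(d_stat, p_val)
```

The null hypothesis of a common distribution is rejected when the p-value falls below the chosen significance level, e.g., 0.05.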
Similar to a previous work [48], our analysis suggests that the rainfall and GNSS-IWV distributions are best fitted by a lognormal distribution. The estimated parameters µ and σ reveal different values for each station that underline their different climatic environments.
The density function of the lognormal distribution is defined as follows [49]:

$$f(y) = \frac{1}{y\,\sigma\sqrt{2\pi}} \exp\left(-\frac{(\ln y - \mu)^2}{2\sigma^2}\right), \quad y > 0$$

where the logarithm of y has a normal distribution, µ is the mean of the logarithmic values, and σ is the standard deviation of the logarithmic values. We note that the tail of the rainfall distribution can be described by a power law starting at $x_{min}$ = 12.9 with the estimated exponent α = 2.5 for the TUCU station and at $x_{min}$ = 28 with the estimated exponent α = 2.9 for the CATA station (Figure 2A). In order to compare both stations, we show the exceedance probabilities of the binned rainfall and GNSS-IWV data (Figure 2).

Based on the approach described by Clauset et al. [50], we fit the power-law distribution to the CAPE data using a maximum likelihood estimator. Relying on the goodness-of-fit parameter, we observe that a power law is a plausible hypothesis for the CAPE data (Figure 3A,B). Because there are many more observations at lower magnitudes (Figure 3A,B), we use a logarithmic binning of CAPE to homogenize the number of observations per bin.

Figure 3. The power-law behaviour using a maximum likelihood approach of log-binned data for the independent variable CAPE (A) for TUCU and (B) for CATA following methods described in [50]. We identify a power-law-like behaviour for CAPE values above 2500 J/kg (for TUCU) and 2900 J/kg (for CATA). The p-value greater than 0.1 (TUCU = 0.9, CATA = 0.4) confirms that a power law is a plausible hypothesis for the data.
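The distribution fits described above can be illustrated with a short sketch that estimates the lognormal parameters µ and σ, the power-law exponent of the tail by maximum likelihood, and the empirical exceedance probabilities. Here the lower cut-off x_min is taken as given rather than estimated, which is a simplification; the function names and the sample values are hypothetical.

```python
import math

def lognormal_params(values):
    """Maximum-likelihood mu and sigma of ln(y) for positive values."""
    logs = [math.log(v) for v in values]
    mu = sum(logs) / len(logs)
    sigma = math.sqrt(sum((l - mu) ** 2 for l in logs) / len(logs))
    return mu, sigma

def powerlaw_exponent(values, x_min):
    """Continuous power-law MLE: alpha = 1 + n / sum(ln(x / x_min))."""
    tail = [v for v in values if v >= x_min]
    return 1.0 + len(tail) / sum(math.log(v / x_min) for v in tail)

def ccdf(values):
    """Empirical exceedance probabilities P(X >= x) at the sorted values."""
    xs = sorted(values)
    n = len(xs)
    return [(x, (n - k) / n) for k, x in enumerate(xs)]

# Hypothetical daily rainfall sample in mm/day.
rain = [0.4, 1.1, 2.8, 3.1, 5.6, 9.0, 14.2, 19.5, 27.0, 41.3]
print(lognormal_params(rain))
print(powerlaw_exponent(rain, x_min=12.9))
print(ccdf(rain)[-3:])
```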

Correlating Seasonal Pattern of GNSS-IWV and CAPE with Rainfall
Next, we show the seasonal pattern of both variables with respect to rainfall. Our results show that the daily values of GNSS-IWV and CAPE increased during the austral summer months and coincided with an increase in rainfall at both station locations (TUCU and CATA) (Figure 4). In contrast to the GNSS-IWV, which shows higher absolute values during austral summer at the TUCU station location (Figure 4A,C), CAPE showed larger absolute values for the CATA station (Figure 4B,D) during austral summer months.
We then show the cyclic behaviour of rainfall and CAPE as well as rainfall and GNSS-IWV using wavelet coherence analysis. As can be seen in Figure 5, significant coherence between the CAPE and rainfall time series, as well as between the GNSS-IWV and rainfall time series, is observed at periods of 8 to 16 months, most significantly around 12 months, from 2008 to 2013. The phase arrows, which point to the right in the 8-16 month period band, show in-phase coherence and suggest that CAPE and GNSS-IWV contribute to the rainfall.

Relation between Rainfall and GNSS-IWV
In the next step, we analyse the relation of rainfall with GNSS-IWV. We show that, for wet days with rainfall above 0.1 mm/day, there exists an exponential relationship between the TRMM rainfall data and GNSS-IWV. The Q-Q (Quantile-Quantile) plot also shows an identical distribution within the assumed log-linear relation (after linearisation) between rainfall and GNSS-IWV for wet days and above the 10th percentile (Figure 6A,B). We observe that, below the 10th percentile, rainfall and GNSS-IWV do not follow an identical distribution (Figure 6A,B).
We tested our data for other possible model fits such as power, Gaussian, and polynomial, but we found that the exponential model fit the data better compared to other models based on p-value < 0.001 and the statistics of the fit (Table 1).
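An exponential model of this kind can be fitted by ordinary least squares after taking the natural logarithm of rainfall, as in the following sketch. The intercept term, the function name, and the sample data are assumptions for illustration; only the wet-day threshold of 0.1 mm/day is taken from the text.

```python
import math

def fit_exponential(iwv, rain, wet_threshold=0.1):
    """Least-squares fit of ln(rain) = c + alpha * iwv on wet days.

    Returns alpha, c, and the R-squared value of the linearised fit.
    """
    pairs = [(x, math.log(r)) for x, r in zip(iwv, rain) if r > wet_threshold]
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    sxx = sum((x - mx) ** 2 for x, _ in pairs)
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    alpha = sxy / sxx
    c = my - alpha * mx
    ss_res = sum((y - (c + alpha * x)) ** 2 for x, y in pairs)
    ss_tot = sum((y - my) ** 2 for _, y in pairs)
    return alpha, c, 1.0 - ss_res / ss_tot

# Hypothetical daily means: IWV in kg m^-2 and rainfall in mm/day.
iwv_daily = [12.0, 18.0, 24.0, 30.0, 36.0, 42.0, 15.0]
rain_daily = [0.0, 0.9, 1.3, 1.6, 2.4, 3.0, 0.05]
alpha_hat, c_hat, r_squared = fit_exponential(iwv_daily, rain_daily)
print(round(alpha_hat, 3), round(c_hat, 2), round(r_squared, 2))
```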

Relation between Rainfall and CAPE
Next, we analyse the relation between rainfall and CAPE. Previous research based on the idealized parcel theory suggested a power-law relationship between rainfall and CAPE with an exponent β = 0.5 (Equation (4)) [4,26]. The Q-Q plot indicates that, below the 10th percentile, rainfall and CAPE do not follow the same distribution within the assumed log-linear relation at both stations (Figure 7A,B).
We find a power-law relationship between logarithmically binned CAPE and the median rainfall of each bin (Figure 8A,B).
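The binning and fitting procedure can be sketched as follows: CAPE is grouped into logarithmically spaced bins, the median rainfall of each bin is taken, and the exponent β is estimated as the least-squares slope in log-log space. The number of bins, the use of geometric bin centres, and the function names are assumptions for illustration.

```python
import math
from statistics import median

def log_binned_medians(cape, rain, n_bins=8):
    """Median rainfall in logarithmically spaced CAPE bins.

    Returns (geometric bin centre, median rainfall) for non-empty bins.
    """
    pos = [(c, r) for c, r in zip(cape, rain) if c > 0 and r > 0]
    lo = math.log(min(c for c, _ in pos))
    hi = math.log(max(c for c, _ in pos))
    width = (hi - lo) / n_bins
    bins = [[] for _ in range(n_bins)]
    for c, r in pos:
        # The largest value falls on the upper edge; keep it in the last bin.
        k = min(int((math.log(c) - lo) / width), n_bins - 1)
        bins[k].append(r)
    return [(math.exp(lo + (k + 0.5) * width), median(b))
            for k, b in enumerate(bins) if b]

def powerlaw_slope(points):
    """OLS slope beta of ln(median rainfall) against ln(CAPE bin centre)."""
    xs = [math.log(c) for c, _ in points]
    ys = [math.log(r) for _, r in points]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    return (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
            / sum((x - mx) ** 2 for x in xs))

# Hypothetical data following rainfall ~ CAPE^0.5.
cape_values = [10.0 * 1.2 ** k for k in range(40)]
rain_values = [0.3 * math.sqrt(c) for c in cape_values]
beta_hat = powerlaw_slope(log_binned_medians(cape_values, rain_values))
print(round(beta_hat, 2))
```

Because the bin median is paired with the bin centre rather than with the median CAPE of the bin, the recovered slope is only approximately equal to the exponent used to generate the data.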

Relation between Rainfall, CAPE, and GNSS-IWV Based on Quantile Regression
As mentioned above, our regression analysis revealed an exponential relationship between rainfall and GNSS-IWV and a power-law relationship between rainfall and CAPE, which is supported by [4,26]. We used a linear multivariable regression model (Equation (6)) to show the joint effect of both climatic variables on extreme rainfall at both station locations. Quantile regression is a frequently used statistical tool for analysing extreme rainfall events [51][52][53]. It has the advantage of being less sensitive to outliers and skewed distributions [51]. Therefore, we analyse the joint effect of both climatic variables on extreme rainfall (0.9 quantile) as well as for the 0.75, 0.8, and 0.85 quantiles at both station locations using quantile regression [47,54]. Because the relationship between the daily mean rainfall and both variables, especially between rainfall and IWV, does not hold in the cold season [13], we restrict the quantile regression analysis to the austral summer season.
Our results indicate a regression coefficient β close to, but generally lower than, the value of 0.5 predicted by parcel theory (0.3 to 0.4) and a regression coefficient α of about 0.04 to 0.06 for extreme rainfall (0.9 quantile) at both stations (Figure 9). We observe a higher contribution of CAPE and a lower contribution of GNSS-IWV to rainfall for the CATA station at the 0.9 quantile compared to lower quantiles (Figure 9). In contrast, we find a lower contribution of CAPE and about the same contribution of GNSS-IWV to rainfall for the TUCU station at the 0.9 quantile compared to lower quantiles. We then test the goodness of our joint model based on the goodness-of-fit criterion for a certain quantile [47]. Here, we select the 0.75, 0.8, 0.85, and 0.9 quantiles. Our results show a better fit at higher quantiles (extreme rainfall) for both the CATA and TUCU stations (Figure 10).

Temporal Relation between Extreme Rainfall and Both GNSS-IWV and CAPE at the 6-Hour Time Scale
We show the correlation between GNSS-IWV and CAPE and extreme rainfall events averaged for the events above the 90th, 95th, and 99th rainfall percentiles. We average the correlation over 72 h: 24 h before the event, 24 h on the event day, and 24 h afterward. We averaged the 6 h mean values of CAPE and GNSS-IWV for all events above the 90th, 95th, and 99th rainfall percentiles separately. We show that, for all three rainfall percentiles, the GNSS-IWV and CAPE generally increase during the day before the event and that peak values, both for GNSS-IWV and CAPE, are observed on the day of the 90th, 95th, and 99th percentile rainfall events.
We also find that this increase is mostly followed by a decrease afterwards. Our results show higher values of CAPE for the CATA station on the event day. In contrast, higher GNSS-IWV values for the TUCU station are observed on the days of the 90th, 95th, and 99th percentile rainfall events. We show higher values of CAPE and GNSS-IWV for the averaged 99th percentile rainfall events as compared to the averaged 90th and 95th percentiles at both stations (Figure 11).

Figure 11. [...] rainfall (B,E), and for the 99th percentile rainfall (C,F). We selected all times with rainfall above the 90th, 95th, and 99th percentiles, respectively, and their corresponding GNSS-integrated water vapour and CAPE amounts. We then show the correlation for 72 h (event day plus day before and day after). Note that the GNSS-integrated water vapour and CAPE generally increase during the day before the event and that peak values, both for GNSS-integrated water vapour and CAPE, are observed on the day of the 90th, 95th, and 99th percentile rainfall events.
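The event selection and 72 h averaging described in this section can be sketched as a simple composite analysis. In the sketch below, event days are those whose daily rainfall exceeds the chosen percentile, and the 6 h values of a variable are averaged over the day before, the event day, and the day after. The percentile definition (linear interpolation), the strict inequality, and the function names are assumptions for illustration.

```python
def percentile(values, q):
    """Linear-interpolation percentile (q in 0..100) of a list."""
    xs = sorted(values)
    pos = (len(xs) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (xs[hi] - xs[lo]) * (pos - lo)

def composite_72h(daily_rain, series_6h, q=90.0):
    """Mean 6 h evolution of a variable around extreme-rainfall days.

    series_6h holds four values per day. Returns 12 means covering the
    day before, the event day, and the day after (empty list if no event).
    """
    threshold = percentile(daily_rain, q)
    # Skip the first and last day so that every window is complete.
    event_days = [d for d in range(1, len(daily_rain) - 1)
                  if daily_rain[d] > threshold]
    windows = [series_6h[4 * (d - 1): 4 * (d + 2)] for d in event_days]
    if not windows:
        return []
    return [sum(w[k] for w in windows) / len(windows) for k in range(12)]

# Hypothetical 12-day record with a single rainfall event on day 3.
daily_rain = [0.0, 0.0, 0.0, 50.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
iwv_6h = [float(v) for v in range(48)]
composite = composite_72h(daily_rain, iwv_6h, q=90.0)
print(composite)
```

Running the same function with q set to 95 or 99 gives the composites for the higher percentiles.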

Discussion
Our observations show that the two stations TUCU and CATA in the eastern central Andes have different rainfall distributions. This can be attributed to the fact that the stations are located in different topographic and climatic regions. Past studies [4,5,11] investigated the different characteristics of rainfall along the climatic and topographic gradient over the south-central Andes. We show that the minima and maxima of the GNSS-IWV time series coincide with the rainfall minima during the austral winter and the rainfall maxima during the austral summer. Past studies also indicated a nearly homogeneous annual cyclical signal for GNSS-IWV data and a relation with heavy rainfall in Spain [13]. Our wavelet coherence analyses support a seasonal agreement between GNSS-IWV and rainfall as well as between CAPE and rainfall.
Based on our results from the correlation between rainfall and GNSS-IWV, we show that the relation between both variables at the daily scale can be explained with an exponential relationship. In order to explain the extreme rainfall over the eastern central Andes, where convection plays an important role in extreme rainfall generation, it is important to consider multi-proxy approaches that include both dominant climatic variables (GNSS-IWV and CAPE) and separate out the effect of each variable in the presence of the other on rainfall.
Previous research [4,26,27] suggested a model that displays an exponential relationship between rainfall and the dew-point temperature and describes a power-law relationship between rainfall and CAPE. Based on the fitting function analysis in this study, we show that GNSS-IWV can be substituted as a reliable humidity variable, together with CAPE, to analyse extreme rainfall. Both variables are responsible for convection and extreme rainfall.
Since the rainfall data at both stations show a skewed distribution, quantile regression analysis can be considered as a suitable statistical analysis. Several previous research studies [51][52][53] also used quantile regression as a reliable statistical tool for analysing extreme rainfall events. Therefore, we investigated the joint effect of both climatic variables on extreme rainfall for the austral summer season relying on quantile regression [47]. We showed that the GNSS-IWV is more important for extreme rainfall at the TUCU station compared to the CATA station.
This argument is supported by the lower contribution of GNSS-IWV to rainfall for CATA station at the 0.9 quantile compared to lower quantiles. This may be related to the fact that extreme convection has occurred more often in the northern (tropical) part of the Andes, where wide convective cores are part of a large mesoscale convective system and are more frequently observed than deep convective cores [46].
The correlation between rainfall and CAPE indicates that there is a higher contribution of CAPE at the CATA station compared to the TUCU station for extreme rainfall events. Previous research [4] also indicated a higher importance of CAPE for extreme rainfall in the transition zone between the tropical and subtropical regions compared to tropical regions. These regions are characterised by the intense rising of warm and moist air that triggers the formation of deep convective storms [1,4,10,46].
The goodness of fit for quantile regression [47] improves at higher rainfall quantiles. Therefore, we argue that our joint model is more successful in explaining rainfall at higher quantiles.

Conclusions
We investigated the contribution of GNSS-IWV and CAPE to extreme-rainfall events at two GNSS station locations in the eastern central Andes. We used a quantile regression analysis to describe the effect of both atmospheric variables on extreme rainfall in the presence of each other. We obtained the following key results: First, we observed that the two GNSS-IWV stations in the eastern central Andes (CATA and TUCU) are subject to different climatic conditions, with varying lognormal parameters in the exceedance probability domain for rainfall and IWV (Figure 2). Second, based on the correlation analysis, we found that there was an exponential relationship between the GNSS-IWV and extreme rainfall at both station locations (Table 1).
Third, we support a power-law relationship between rainfall and CAPE at the GNSS-IWV station locations in the eastern central Andes. The regression coefficient reveals a value close to the one predicted from parcel theory (0.5) at both station locations. Fourth, we show the effect of both variables (GNSS-IWV and CAPE) on rainfall generation by multivariable regression analysis relying on quantile regression. We present different contributions of CAPE and GNSS-IWV to rainfall for each station for extreme rainfall.
Fifth, we observe that the temporal variations of GNSS-IWV and CAPE are well correlated with extreme rainfall just before and after the extreme rainfall (Figure 11A,B). In this study, we show the effect of two important climatic variables (GNSS-IWV and CAPE) that trigger deep convection and lead to extreme rainfall in the eastern central Andes. We show that high-temporal-resolution GNSS-IWV can be used as a reliable data source for extreme rainfall investigation.