Local Meteoric Water Line of Northern Chile (18 ◦ S–30 ◦ S): An Application of Error-in-Variables Regression to the Oxygen and Hydrogen Stable Isotope Ratio of Precipitation

: In this study, a revision of the previously published data on hydrogen ( 2 H / 1 H) and oxygen ( 18 O / 16 O) stable isotope ratio of precipitation in northern Chile is presented. Using the amount-weighted mean data and the combined standard deviation (related to both the weighted mean calculation and the spectrometric measurement), the equation of the local meteoric line calculated by error-in-variables regression is as follows: Northern Chile EIV-LMWL: δ 2 H = [(7.93 ± 0.15) δ 18 O] + [12.3 ± 2.1]. The slope is similar to that obtained by ordinary least square regression or other types of regression methods, whether weighted or not (e.g., reduced major axis or major axis) by the amount of precipitation. However, the error-in-variables regression is more accurate and suitable than ordinary least square regression (and other types of regression models) where statistical assumptions (i.e., no measurement errors in the x-axis) are violated. A generalized interval of δ 2 H = ± 13.1% (cid:24) is also proposed to be used with the local meteoric line. This combines the conﬁdence and prediction intervals around the regression line and appears to be a valid tool for distinguishing outliers or water samples with an isotope composition signiﬁcantly di ﬀ erent from local precipitation. The applicative examples for the Pampa del Tamarugal aquifer system, snow samples and the local geothermal waters are discussed.


Introduction
To fit a model to data pairs, the linear least squares regression is by far the most widely used modeling method (e.g., [1]). The ordinary least squares regression (OLSR) minimizes the sum of the squared vertical distances between the y data values and the corresponding y values on the fitted line (the predictions). The OLSR design assumes that there is no variation in the independent variable (x) because it is controlled by the researcher. This is also known as 'Model I' regression [2]. In contrast, in 'Model II' regression, both the response and the explanatory variables of the model are random (i.e., not controlled by the researcher), and there are errors associated with the measurements of both x and y [2]. Among the 'Model II' regression, major axis regression (MA) assumes equal variance on both variables and minimizes the orthogonal (perpendicular) distances from the data points to the fitted line. To the best of our knowledge, one of the first suggestions for applying orthogonal regression to the precipitation amount weighted stable isotope ratios of the oxygen (δ 18 O, the independent variable and administrative regions (modified from [17]). In blue: Pampa de Tamarugal aquifer. (C) Morphotectonic units of the central Andes (modified from [18]). SBS: Santa Barbara System; SP: Sierras Pampeanas. The "Modern Forearc" coincides with the Precordillera and the Coastal Cordillera [19].
This presents a narrow strip of land (~300 km) between the Andean ridge and the Pacific coast [17]. The latitudinal range of this northern part of Chile aligns with a major extension of the arid desert climate (BW climate zone [20,21]), followed by a tundra climate (ET climate zone [20,21]), and a semi-arid climate in a minor extension [14,21,22]. The Atacama Desert, the driest place on earth with mean precipitation as low as 10 mm per decade, and the higher elevations in the Andes are striking examples of the BW and the ET climates, respectively. The Amazonian influence in the South American monsoon system, with concentrated summer rainfall or dry winters, affects the climate in this area from the north boundary to Nevado Ojos del Salado [22], a stratovolcano with an altitude of 6,893 m, the highest point in Chile and the second highest outside of Asia.
The previous studies on isotope composition of meteoric water revealed a general increase in δ 18 O on the Andean plateau (the "Altiplano", Figure 1C) from north to south, concomitant with an increase in aridity and decrease in convective moistening (amount effect; [23]). Stable isotope and seasonal precipitation patterns suggest an eastern provenance of the vast majority of moisture that falls as precipitation across the Andean Plateau and Western Cordillera ( Figure 1C), with Pacificderived moisture contributing a minor amount at low elevations near the coast ( [23,24]). However, over most regions, the δ 18 O signal of precipitation is influenced by a combination of factors ( [24]). The δ 18 O -depleted values observed in the high altitude area (i.e., the Andean plateau) were related to processes that affect the air masses that (i) originated over the Atlantic Ocean, (ii) cross the Amazon Basin (continental effect), (iii) ascended the Andes (altitude effect) and (iv) precipitated (convective effect) in the Andean plateau ( [25]). In particular, over the eastern Andes, precipitation at low elevations has δ 18 O from −2 to −8‰, but δ 18 O becomes more depleted toward the west as vapor is lifted across the Eastern Cordillera of the Andes ( [26][27][28][29]). The dominant Altiplano summer rain has This presents a narrow strip of land (~300 km) between the Andean ridge and the Pacific coast [17]. The latitudinal range of this northern part of Chile aligns with a major extension of the arid desert climate (BW climate zone [20,21]), followed by a tundra climate (ET climate zone [20,21]), and a semi-arid climate in a minor extension [14,21,22]. The Atacama Desert, the driest place on earth with mean precipitation as low as 10 mm per decade, and the higher elevations in the Andes are striking examples of the BW and the ET climates, respectively. The Amazonian influence in the South American monsoon system, with concentrated summer rainfall or dry winters, affects the climate in this area from the north boundary to Nevado Ojos del Salado [22], a stratovolcano with an altitude of 6893 m, the highest point in Chile and the second highest outside of Asia.
The previous studies on isotope composition of meteoric water revealed a general increase in δ 18 O on the Andean plateau (the "Altiplano", Figure 1C) from north to south, concomitant with an increase in aridity and decrease in convective moistening (amount effect; [23]). Stable isotope and seasonal precipitation patterns suggest an eastern provenance of the vast majority of moisture that falls as precipitation across the Andean Plateau and Western Cordillera ( Figure 1C), with Pacific-derived moisture contributing a minor amount at low elevations near the coast ( [23,24]). However, over most regions, the δ 18 O signal of precipitation is influenced by a combination of factors ( [24]). The δ 18 O -depleted values observed in the high altitude area (i.e., the Andean plateau) were related to processes that affect the air masses that (i) originated over the Atlantic Ocean, (ii) cross the Amazon Basin (continental effect), (iii) ascended the Andes (altitude effect) and (iv) precipitated (convective effect) in the Andean plateau ( [25]). In particular, over the eastern Andes, precipitation at low elevations has δ 18 O from −2 to −8% , but δ 18 O becomes more depleted toward the west as vapor is lifted across the Eastern Cordillera of the Andes ( [26][27][28][29]). The dominant Altiplano summer rain has δ 18 O values from −8% to −15% , with the expectation that precipitation in the driest places (e.g., Atamaca desert) should be moderately to strongly depleted at all elevations up to approximately −18% ( [28,30]).
For this study, a dataset of 32 stations has been constructed through collecting previously published data, considering only the isotope values of precipitation with the measured amount and recalculating the amount-weighted mean in order to have only one δ 18 O and δ 2 H data pair per station (Table 2; Supplementary File S1). Table 2. Amount-weighted oxygen and hydrogen stable isotope ratios (δ 18 O and δ 2 H in permil versus the standard mean ocean water SMOW, respectively) of the previously published precipitation samples collected in northern Chile [31,32,35,37,[40][41][42][43]. Calculations are described in the Supplementary File S1. Samples were mainly collected in Region I (from #1 to #27), Region II (from #28 to #31) and Region IV (#32) ( Figure 1B). a.w.m. ± c.s.: amount-weighted mean ± combined sigma (see methods paragraph for details); N a.p. : Number of accumulation periods of precipitation. This value coincides with the total number of the available isotope data pairs per station (Supplementary File S1); * sample with only one accumulation period (N a.p. = 1). In this case, the combined sigma (c.s.) contain only the standard deviation related to spectrometric measurement (Supplementary File S1). The regression has been calculated both including ( Assuming x i to be the measurement of the stable isotope ratios expressed as δ 18 O and δ 2 H in % versus SMOW (standard mean ocean water)-determined on a specific precipitation amount p i (mm) collected over a specific time period-the standard deviation related to the amount weighted mean x w [4] of the meteoric stations with more than one accumulation periods of precipitation (N a.p. > 1) was calculated as follows [44]:

ID # Location
It should be noted that the accumulation periods, which coincides with the total number of the isotope data pairs and precipitation amounts available per each station, are not uniformly distributed through the year and not equally long for most of the stations considered (Supplementary File S1).
In the existing literature, there are several definitions of 'combined standard deviation' or 'combined uncertainty.' Here, we deal with two different 'uncertainty' sources [8,45], that is, a standard deviation related to the amount-weighted mean calculation σ w , as described above, and a standard deviation related to the mass spectrometric measurements σ ms , which differs depending on the publication (Supplementary File S1). One of the easiest ways to calculate the combined standard deviation σ csd involves foreseeing the combination of σ w and σ ms following the law of propagation of uncertainty, that is, the square root of the sum-of-the-squares [45][46][47]: The combined standard deviations, associated with the δ 18 O and δ 2 H amount-weighted mean data in 30 of the total 32 stations, were used to calculate the local meteoric water line through EIV regression. In two stations, Quisquiro [32] and La Serena [37], the calculation of the standard deviation of the weighted mean σ w was performed differently for two reasons. In the case of Quisquiro, the complete dataset is unavailable; therefore, the standard deviation was calculated among the amount-weighted data corresponding to the extreme seasons (winter and summer, [32]). Meanwhile, in the case of La Serena, the σ w was calculated among the years in which more than 70% of precipitation was analyzed for a given isotope composition [4]. Following this, the σ csd was calculated through using the σ ms declared in the analytical section of the publication dataset [32] or elsewhere [25]. With this, the σ csd on δ 18 O and δ 2 H is within the mean of the other stations.

Results
The parameters of the northern Chile meteoric water lines obtained through the different software and regression methods are listed in Table 3. The values of slope and intercept through EIV appear to be slightly higher than those obtained by other methods that do not take into account the combined standard deviation of the oxygen and hydrogen measurements. However, there is no difference with other regression methods, aside from the results from Cantrell's bivariate worksheet (EIV-d, Table 3b), which was tested on atmospheric chemical data [54]. The regression results of EIV-d method include: (i) a lower standard error in comparison with those from other codes used for EIV calculation; and (ii) a higher slope when compared to PWLSR, but not significantly different (p > 0.05). The difference between the slopes are further smoothed when the sample #9 is removed (N a.p. = 1; Supplementary File S2). It should be noted that lower standard errors on slope and intercept similar to those of Cantrell's bivariate worksheet are obtained in the other EIV models introducing a scale factor in the maximum likelihood method, which is estimated by the square root of the reduced chi-square statistic that results from the fit analysis [11,53]. Lower than half standard error are also obtained by "Deming regression" with a constant variance ratio of λ = δ 18 O var /δ 2 H var = 0.019 (EIV-e, Table 3b), that is the mean ratio of the squared combined sigma (Table 2). Therefore, we consider as most representative of the samples the regression results with the standard errors on slope and intercept calculated without scale factor and with combined sigma for each station (EIV results from "a" to "c" in Table 3b): The upper and lower confidence intervals at 95% probability of the slope and intercept in the EIV regressions (results from "a" to "c" in Table 3b) include all the previous published values obtained by OLSR (Table 1) and all the other slopes and intercepts of the meteoric water lines calculated by different regression methods (Table 3a). This is clear and straightforward, in particular looking to the standard errors of slope and intercept, which are higher in EIV regression results.

Discussion
In isotope hydrology applications, it is often desirable to know whether a water sample from the surface (river, lake, or lagoon) or from an aquifer is shifted in a statistically significant way in relation to the local meteoric water line. This is particularly true in the following: (i) arid zones, where fractionation due to evaporation produces isotopically enriched residual waters; and (ii) geothermal areas, where hot water samples could be isotopically enriched due to water-rock interaction. Northern Chile offers an ideal location for this kind of study because both of these conditions exist here [65,72].
The calculated confidence intervals obtained through EIV regression represent the scatter of the weighted mean data around the fitted line, disregarding the combined standard deviation (Table 3). In OLSR, prediction intervals for y at a specific x value estimate the bands around the fitted line in which the future observation will fall, with a certain probability (usually 95% as confidence interval), given what has already been observed with the sample dataset employed for the regression (t-distribution multiplied by the standard error of the fit). The OLSR prediction interval can be quickly calculated by using commercial or free codes (e.g., [71,73]). However, OLSR does not take into account the measurement error in x when predicting y, either in the combined standard deviations of the data in the fittings or the prediction interval calculations. Moreover, an officially recognized method for the calculation of prediction intervals related to EIV regression still does not exist.
An approximate prediction area around the EIV local meteoric water line could be traced through using an empirical-graphical method, joining the outer limits of the bars representing the combined standard deviations around the mean of the isotope data of precipitation. However, its irregular shape, related to the non-homogeneous values of the combined standard deviations of the regressed stations, necessitates an ad-hoc correction. Specifically speaking, the lower sector of the line (approximately, δ 18 O < −12% and δ 2 H < −80% ) shows numerous monitored stations with wider standard deviations-especially in terms of hydrogen composition δ 2 H-in comparison with the higher sector of the line (Figure 2A).  [69]), snow and penitentes (dark and light blue diamonds, respectively [41,43,59,61,[63][64][65][66][67][68]) and hydrothermal groundwater (red triangles, [70]; PL: Pampa Lirima, TT: Torta de Tocorpuri) samples compared with the lines described in 'A'. Arrows explain the isotope effects in hydrothermal groundwater [70,74]. y and x bars in snow and penitentes depict the combined sigma related to the volume weighted mean (samples from Cerro Tapado [67], Supplementary File S5). Light-blue diamonds without error bars are single samples of the penitentes from Parinacota volcano ( [64], Supplementary File S5).  Table 2) along with the combined standard deviation on hydrogen and oxygen isotope composition (y and x bars, respectively); error-in-variables local meteoric water line (EIV-LMWL solid line) and the generalized interval (dashed lines; δ 2 H = ±13.1% up and down the fitted values on the line). (B) Pampa de Tamarugal aquifer (thick-orange line, [69]), snow and penitentes (dark and light blue diamonds, respectively [41,43,59,61,[63][64][65][66][67][68]) and hydrothermal groundwater (red triangles, [70]; PL: Pampa Lirima, TT: Torta de Tocorpuri) samples compared with the lines described in 'A'. Arrows explain the isotope effects in hydrothermal groundwater [70,74]. y and x bars in snow and penitentes depict the combined sigma related to the volume weighted mean (samples from Cerro Tapado [67], Supplementary File S5). Light-blue diamonds without error bars are single samples of the penitentes from Parinacota volcano ( [64], Supplementary File S5).
The wider combined standard deviation of the samples that fall within the lower sector is mainly due to the irregular/inhomogeneous frequency of the sampling and to the different accumulation periods of precipitation in the same station (mixing of monthly and seasonal data). In contrast, the stations on the higher sector of the line, which are not depleted by the "altitudinal effect" (lower isotope values at a higher elevation, [74,75]), show a minor combined standard deviation. This is probably due to the fact that only four stations for rainwater are located at an altitude of lower than 2500 m, and monitored by a more regular frequency of sampling (e.g., La Serena station).
An innovative method has recently been proposed for the calculation of a "generalized interval," which combines the confidence and predictive concepts in EIV [76][77][78]. Using this approach, the obtained interval forcing a mean on 12 values per station (qx = qy = 12; i.e., as supposing a sampling of precipitation extended to 12 months) is 8.9 ± 4.2% up and down the fitted line, which is fairly similar to the mean combined standard deviation of δ 2 H (9.1 ± 4.8% , excluding the station #9 with N a.p. = 1, Table 2; Supplementary File S3). In contrast, the prediction intervals calculated through the jackknife method [71]-a statistical approach which simulate the resampling of the collected data taking into account their variance-and through Deming regression [71]-which necessitate a constant variance ratio (constant λ [71], calculated on the variances ratio of the 32 samples)-give a prediction interval of ±1.53% , which is narrower than the combined standard deviation (Supplementary File S3).
Considering that the OLSR and EIV meteoric water lines of northern Chile are not different in terms of slope, the stable isotope ratio data on 115 precipitation samples available from the existing literature (Supplementary File S4; [28,30,36,[56][57][58][59][60]68,79]) could help with checking the effectiveness of prediction and generalized intervals. While all these isotope data do not report the amount of precipitation and are therefore not included in the EIV regression calculation, most of them are clustered between or extended over the enriched and depleted sector of the meteoric water line, thus completing the station's gap. The upper and lower 95% prediction intervals on the δ 2 H values related to the OLSR regression of this latter dataset are shifted ±13.4% up and down respectively on the regressed line (Supplementary File S4). This latter value is not significantly different, neither from the upper/lower δ 2 H combined standard deviation values of the weighted-amount dataset (±13.8% ), nor from the upper/lower extremes of the generalized interval (±13.1% ). This is particularly true if we take into account that the unweighted rainfall dataset pertains to single rain events, meaning it is quite normal that their predicted δ 2 H values are shifted towards the higher value of the combined standard deviation of the weighted rainfall dataset.
In terms of the first groundwater sample type (i), in previous studies, the clustering of the δ 18 O and δ 2 H values of the groundwater from Pampa del Tamarugal falling on the right side of the meteoric water line has led several authors to consider this aquifer system as isotopically different from local precipitation [35,80]. This has been attributed to the following: (i) a recharge in climatic conditions different from the current ones [35]; and (ii) to the evaporation that affects the precipitation in the unsaturated zone of the recharge area [80]. A recent re-evaluation of the isotope data published up to that point shows that the isotope composition of the groundwater from that area is effectively parallel to the meteoric water line, while it was noted that the previous hypothesis on climate variation must be reconsidered [69]. However, the "uncertainty wings" around the meteoric water line have not been traced in any of the above-described cases.
In Figure 2B it is shown how the recent calculated regression line (in orange) representing the groundwater in Pampa del Tamarugal [69] falls within the generalized intervals of the EIV meteoric water line. This demonstrates that the mean composition of the groundwater from this area is not significantly different from the precipitation. This does not exclude the fact that some groundwater samples could be significantly affected by evaporation during recharge. Indeed, the right side of the generalized interval of the EIV regression is the location of the enriched precipitation samples associated with the evaporation during rainfall. Furthermore, the regression line of the groundwater samples should have a proper prediction interval, which certainly does not cover exactly that of the rainwater on that side. However, it is beyond the scope of this study to show also the prediction interval related to groundwater regression.
In terms of the snow samples, all but two of the 61 snow samples from northern Chile fall within the area between the generalized intervals determined in this study, confirming that this area could be approximated as an uncertainty ribbon around the EIV regression ( Figure 2B). It should be noted that this dataset includes both samples from precipitation samplers and the Andean snow cover. The two samples out of this area are one outlier (left side of the meteoric water line, Figure 2B) and one isotopically enriched sample from the so-called "penitents" of the Parinacota volcano ( [64], Figure 1B-Region I, Supplementary File S5). The penitentes are an annual phenomenon of snow sublimation related to solar radiation and arid climates; consequently, the isotope composition of these samples could be enriched, in particular, if sampled from surface layers [64,67,81]. The volume weighted mean and the related combined standard deviation calculated on the isotope composition of the snow samples from the Cerro Tapado glacier ( [67], Figure 1B-Region IV, Supplementary File S5) fall exactly on the EIV-LMWL's extension beyond the most negative value and within the generalized interval ( Figure 2B). As expected, the penitentes from the same area are enriched in comparison with the mean isotope composition of the local snow due to sublimation ( Figure 2B, [67]). Moreover, although their mean values fall close to EIV-LMWL, the combined sigma of the penitentes can be higher than the generalized interval (up to ±16% ; Figure 2B, Supplementary File S5).
Finally, geothermal water samples from a recent study were plotted with the EIV regression line and the generalized interval, showing more clearly the distinction between the meteoric waters and the hydrothermal water samples that can be isotopically enriched as a result of high-temperature isotopic fractionations between water-rock, mixing with andesitic water or water-gas interactions [70,74] ( Figure 2B). Specifically speaking, the Torta de Tocorpuri (TT in Figure 2B) and Pampa Lirima (PL in Figure 2B) samples are not different from meteoric waters. According to the Tassi et al. [70], the limited temperature and involvement of the so-called "andesitic water" does not produce a relevant shift from the meteoric composition in these two sample groups ( Figure 2B). In particular, the PL sample group lies precisely between Pampa del Tamarugal line and the EIV local meteoric water line ( Figure 2B).

Conclusions
The regression analysis based on the error-in-variables approach (EIV) produces a meteoric water line with a slope similar to those obtained with ordinary least squares regression or other regression approaches.
Further studies that focus on other areas and which use a greater number of amount-weighted data pairs (for example, the GNIP/WMO dataset [37]) could be useful in terms of verifying this.
The present study offers an EIV local meteoric water line that summarizes all the previously published isotope data of precipitation from northern Chile. Moreover, the use of the innovative generalized interval (locally, δ 2 H = ±13.1% up and down the fitted values on the line), which is a good compromise between the confidence and the prediction intervals, appears to be a useful tool for distinguishing significantly different data from precipitation.
The case of the groundwater system of Pampa del Tamarugal, snow samples and the local geothermal waters is emblematic. An efficient application of the generalized interval or prediction band could be extended to waters that have been subjected to evaporative fractionation (surface water or precipitation samples from desert climate areas), while it could be used to distinguish the outliers.
Author Contributions: Study design, data search, data and interpretation, writing the manuscript-original draft preparation, communicating with the journal, T.B.; data search, components for discussion on results and conclusions, writing-review and editing, J.C.; components for discussion on results and conclusions, writing-review and editing, P.I. and E.S.
Funding: This research received no external funding.