In Saskatchewan, groundwater sources are primarily utilized by residents of rural and remote areas [53]. A considerable number of water samples exceeded drinking water standards and objectives in the surveillance data, highlighting the need to promote adequate testing of drinking water in rural areas. Generally, it appears that contaminants listed as aesthetic objectives exceed guideline values at a higher frequency than the health-related standards. As expected, the raw groundwater sampled from private wells exceeded standards and objectives more frequently than the water from the regulated, treated public water supplies. However, a considerable number of samples from public supplies still exceeded guidelines, especially for aesthetic objectives.
A previous study of 283 wells in Saskatchewan [19] found that approximately 45% of the wells exceeded the Saskatchewan drinking water objective for sulfate, 47% exceeded the objective for iron, 61% exceeded the objective for hardness, and 79% exceeded the objective for manganese. Our study reflected a similar pattern, although the rates of exceedance were lower: approximately 39% for sulfate, 40% for iron, 31% for hardness, and 68% for manganese in the private wells, and 32% for sulfate, 19% for iron, 21% for hardness, and 53% for manganese in public supplies. A previous study from Saskatchewan reported that having aesthetic complaints about tap water was associated with the perception that tap water was unsafe [24]. The aesthetic quality of tap water could act as a determinant of health by increasing consumption of water alternatives, which may include sugar-sweetened beverages [8].
Previous studies have also investigated concentrations of arsenic [17] and nitrate [18] in Saskatchewan wells. Thompson et al. [17] sampled 61 wells (private wells and wells maintained by rural municipalities) for arsenic, and found that 23% exceeded the current Saskatchewan drinking water standard applied to regulated public water supplies. In our study, just over 13% of private wells exceeded the standard, while approximately 7% of public supply samples exceeded the standard. Thompson [18] found that 14% of wells tested exceeded the standard for nitrate for regulated waterworks, while 12% of the private wells included in our study exceeded the standard. However, only 4% of public supplies exceeded the nitrate standard.
4.1. Principal Components Analysis
PCA has been used in previous studies to examine and interpret patterns of groundwater quality parameters [30, 31, 32, 33, 34, 54]. These types of studies typically identify common factor patterns and interpret them with respect to presumed natural and anthropogenic processes that impact groundwater quality, and are often focused on major ions (e.g., sodium, chloride, magnesium, sulfate) that would fall under aesthetic objectives in the Saskatchewan Drinking Water Quality Standards and Objectives. PCA of groundwater has often included nitrate, which falls under Saskatchewan health standards, as a marker for anthropogenic influences on groundwater (e.g., [30, 31, 32, 34]). However, the full range of parameters included in such studies has not been consistent, particularly with respect to the inclusion of trace metals, making it somewhat difficult to compare results. Comparison to our study was further hampered because we analyzed health standards and aesthetic objectives separately to align our analysis with Saskatchewan Drinking Water Quality Standards and Objectives.
We limited our analyses to include parameters that were routinely sampled from both the public and private supply data to facilitate comparison between the differing supplies. We expected the results to differ between the types of systems because the public supply data represent treated water supplies and the private well data represent raw water samples. While there were some differences in the principal components extracted from the public and private data, there were some striking similarities, especially in the results for the aesthetic objectives, even though four PCaesthetic were retained for the public supply data and three for the private well data. The first PCaesthetic was associated with the same group of variables in both datasets: sodium, chloride, sulfate, alkalinity and total dissolved solids. Additionally, hardness and magnesium were strongly associated with the second PC, and iron and manganese with the third PC in both public and private water supplies. The consistent patterns of these parameters between the datasets suggest relatively strong associations between these parameters in Saskatchewan groundwater.
The PCA for health standards also exhibited some consistencies: nitrate and selenium were strongly associated with PC1health for both public and private supplies. Arsenic was associated with PC3health in both datasets, but was strongly associated with that PC only in the public supply data. In contrast, uranium was associated with PC3health in public supplies and with PC1health in private wells. In addition, lead was associated with PC1health in public supplies, and with PC3health in private wells. Because lead contamination of water can be associated with leaching from distribution systems, differences in the covariance of lead with other parameters between public supplies and private wells are not unexpected. However, caution is warranted in the interpretation of the PCA for health standards from the public supply data considering the low Kaiser’s measure of sampling adequacy for these data.
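As a rough illustration of how loadings tie groups of parameters to individual components, the following sketch runs a PCA on the correlation matrix of synthetic data. It is not the analysis performed in this study: the data are simulated, the latent variables `salinity` and `redox` and the function name `pca_correlation` are hypothetical, and no rotation or retention rule is applied.

```python
import numpy as np

def pca_correlation(X):
    """PCA on the correlation matrix of X (rows = samples, columns = parameters)."""
    # Standardize so parameters on different measurement scales contribute equally.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    corr = np.corrcoef(Z, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)      # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]            # reorder so PC1 explains the most variance
    eigvals = np.clip(eigvals[order], 0.0, None)
    eigvecs = eigvecs[:, order]
    loadings = eigvecs * np.sqrt(eigvals)        # correlation of each parameter with each PC
    explained = eigvals / eigvals.sum()          # proportion of total variance per PC
    scores = Z @ eigvecs                         # component scores for each sample
    return loadings, explained, scores

# Synthetic data (hypothetical, not the study data): two independent latent
# processes each drive one group of parameters.
rng = np.random.default_rng(0)
n = 500
salinity = rng.normal(size=n)
redox = rng.normal(size=n)
names = ["sodium", "chloride", "sulfate", "iron", "manganese"]
X = np.column_stack([
    salinity + 0.3 * rng.normal(size=n),
    salinity + 0.3 * rng.normal(size=n),
    salinity + 0.3 * rng.normal(size=n),
    redox + 0.3 * rng.normal(size=n),
    redox + 0.3 * rng.normal(size=n),
])

loadings, explained, scores = pca_correlation(X)
for name, row in zip(names, loadings):
    print(f"{name:10s} PC1 {row[0]: .2f}  PC2 {row[1]: .2f}")
print("cumulative variance, first two PCs:", round(float(explained[:2].sum()), 3))
```

In this simulated case the three parameters driven by the first latent process load on one component and the remaining two load on another, which is the kind of grouping described above. The sign of each component is arbitrary, so only the magnitudes of the loadings should be compared between datasets.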
4.2. Geostatistical Analysis
Kriging has previously been validated as a method to summarize arsenic concentrations in groundwater and in one study was found to be superior to using an area average or the nearest well as a proxy to predict well concentrations [26]. While some studies have investigated the use of indicator kriging to model the probability of higher arsenic concentrations using geological and hydrological covariates [25, 27], some recent studies have compared various kriging methods that are accessible in GIS software to investigate prediction of arsenic concentrations in groundwater [28, 29]. James et al. [29] evaluated the performance of various kriging methods (ordinary, universal, simple kriging with varying means, kriging with external drift, cokriging with ordinary kriging and cokriging with universal kriging) over a relatively small area in Colorado and found that ordinary kriging performed best. Gong et al. [28] compared inverse distance weighted interpolation with kriging using Gaussian and spherical models as well as cokriging in predicting arsenic concentrations over various regions in Texas, found regional differences in the performance of kriging, and concluded that kriging over smaller areas was more accurate than over large geographic regions.
In the present study, Bayesian kriging had the lowest RMSE for the greatest number of variables and was considered the optimal method for our data. However, values of RMSE for ordinary kriging were very similar to those for Bayesian kriging, so there does not appear to be much difference between these methods in the accuracy of the predicted values on cross validation.
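The comparison of methods by RMSE on cross validation can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure used in this study: inverse distance weighting stands in for kriging because it needs no semivariogram fitting, the data are simulated, and the function names `idw_predict`, `global_mean` and `loo_rmse` are hypothetical.

```python
import numpy as np

def idw_predict(xy_known, z_known, xy_target, power=2.0):
    """Inverse distance weighted prediction at one target location (stand-in for kriging)."""
    d = np.linalg.norm(xy_known - xy_target, axis=1)
    if np.any(d == 0):
        # Target coincides with a sampled location: return the observed value.
        return float(z_known[d == 0].mean())
    w = 1.0 / d ** power
    return float(np.sum(w * z_known) / np.sum(w))

def global_mean(xy_known, z_known, xy_target):
    """Baseline predictor that ignores location entirely."""
    return float(z_known.mean())

def loo_rmse(xy, z, predict):
    """Leave-one-out cross validation: predict each site from all other sites."""
    n = len(z)
    errors = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        errors[i] = predict(xy[keep], z[keep], xy[i]) - z[i]
    return float(np.sqrt(np.mean(errors ** 2)))

# Simulated sites (hypothetical): a smooth east-west trend plus local noise.
rng = np.random.default_rng(1)
xy = rng.uniform(0, 100, size=(200, 2))
z = 5 + 0.05 * xy[:, 0] + rng.normal(scale=0.5, size=200)

rmse_idw = loo_rmse(xy, z, idw_predict)
rmse_mean = loo_rmse(xy, z, global_mean)
print("RMSE, inverse distance weighting:", round(rmse_idw, 3))
print("RMSE, global mean baseline:", round(rmse_mean, 3))
```

Because `loo_rmse` accepts any predictor with the same signature, two interpolation methods can be compared on identical held-out sites, which is the sense in which similar RMSE values indicate similar predictive accuracy.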
We elected not to use covariate information such as well depths or geological data in our models due to difficulty in obtaining accurate covariate data over our study area. Furthermore, a previous study in Saskatchewan demonstrated a lack of correlation between well depths and concentrations of the water parameters studied [55]. While depth might be expected to improve modeling of arsenic concentration, conflicting results from other studies suggest that the contribution of depth may be dependent on the study area. For example, a negative correlation between well depth and arsenic concentration has been reported for wells in Bangladesh [56, 57], while a positive association between well depth and arsenic was reported in North Carolina [58]. Yang et al. [27] did not detect any association between arsenic concentration and well depth in Maine. In one study, including well depth in cokriging models did not improve the ability of kriging to predict arsenic levels [29]. Gong et al. [28] found that incorporating well depth in cokriging did not necessarily improve the correlation between predicted and actual values, but did improve the performance of regression models used to predict arsenic levels. Furthermore, Yu et al. [57] investigated factors affecting arsenic at different geographic scales and concluded that much of the variability in arsenic concentrations at a scale of less than 3 km could be explained by well depth, while geology was the most important factor at scales of greater than 10 km. This suggests that given the large scale of our study area relative to other reported studies, it is unlikely that adding well depth as a covariate would have improved our models. While incorporation of geological data might have improved our predictions, this information was not available for the large study area.
Others have reported a tremendous amount of poorly understood heterogeneity in groundwater concentrations of arsenic over small scales [27, 57]. In Bangladesh, wells within a radius of less than 1 km were found to vary by up to 1000 μg/L [57]. In another study of a relatively small region of Bangladesh, wells in close proximity exhibited extremely variable arsenic concentrations, especially wells less than 30 m in depth [59]. This issue was also highlighted in the geostatistical analysis of arsenic in wells in Michigan; residuals for predicted arsenic values were mapped and no spatial pattern in the residuals was detected [26]. The close proximity of wells with negative and positive residuals of greater than 10 μg/L reflected high variability in arsenic concentrations over short distances [26]. Additionally, a study in Texas compared geostatistical methods among regions, and found the performance of the different methods varied less within a given area than across the different regions [28]. This suggests that variability in the distribution of groundwater arsenic across regions is a limiting factor in identifying a single method that would perform uniformly well in different geographic areas. Given the apparent differences in processes influencing spatial variability of arsenic at different scales, it is possible that developing kriging models over smaller targeted areas with a high density of samples could improve the performance of predictions for some local regions. However, this analysis was intended to estimate the mean arsenic concentrations along with principal components representing drinking water quality over a total area of approximately 327,900 km².
Interpretation of mapped results of PCA is less straightforward because the values are a representation of a combination of parameters that contribute to the PCA components. For example, areas with high values for the first PC for aesthetic objectives represent higher predicted concentrations of one or more of the contributors to this component, including sodium, chloride, sulfate, alkalinity and total dissolved solids. Nevertheless, this method is useful for examining patterns in common grouping of parameters and allowed extraction of factor scores to summarize mixtures of variables over geographic regions for use in epidemiological analyses.
Previous studies have used geostatistical methods to map the scores resulting from PCA or factor analysis (FA) and used the resultant maps to predict the factors that may be impacting groundwater quality, such as pollution or salt water intrusion [34, 35, 36, 37]. It does not appear that the use of kriging with PCA or factor analysis has been well-validated for prediction of groundwater quality. We are not aware of other studies that have assessed the ability of kriging to accurately predict PCA scores at unmeasured locations, so we have not compared our results to others.
We would expect to see a reduction in predictive ability by combining a variable reduction method such as PCA with kriging. PCA reduces the dimensionality of a dataset while capturing as much of the information in the original variables as possible. In our data, the percentage of variance in the original measures described by the retained PCs ranged from 63.7% to 77.8%. While spatial patterns of arsenic have been studied extensively, spatial patterns of the other variables, and especially mixtures of variables, have not. Therefore, it is possible that the PCs we extracted are subject to variability at scales not captured by our analysis. The use of PCA combined with kriging of factor scores should not be discounted as a means of summarizing water quality but should be investigated further, ideally with higher density sampling over smaller geographical regions.
Although the data from public water supplies consisted of repeated measures over time for most sites, the decision was made to model a mean value for each parameter at each site rather than specifically estimating concentrations and component scores at particular points in time. The capacity to estimate time-specific values was limited as not all sites had the same intensity of sampling and sufficient samples to provide a precise estimate for every year in the dataset. Finally, and most importantly, the primary objective of this study was to estimate exposures to metals and ions in drinking water for an epidemiologic study of associations between water quality and chronic diseases with uncertain induction periods. The relevant exposure period over which environmental exposures contribute to chronic diseases is uncertain and represents a potential source of misclassification [60, 61]. Because the precise time period of interest for estimating exposure was unknown, an estimate of average past exposure was deemed more appropriate for the planned epidemiological analysis than estimating exposures for specific time points, for example, using space-time kriging models.
4.3. Limitations
It is estimated that there are over 66,000 wells in Saskatchewan [17] and our sample of 4093 private wells is a non-random sample of less than 10% of privately owned wells in the province. Because the database consists of samples taken through participation in a voluntary water quality program, it could disproportionately represent residents with concerns about their well water quality. The results from private wells and public water supplies were similar, suggesting this was not a substantial issue.
Although the public supply data represent data from all available public water supplies across Saskatchewan, there were relatively few locations represented in the public supply data relative to the size of the study area, resulting in a low sampling density that may have particularly impacted the ability of kriging to capture the variability of arsenic at small spatial scales.
Our PCA may have been hampered by not being able to make use of a full suite of parameters and high proportions of concentrations below detection limits for some variables, especially with respect to the health-related standards. We also made the decision to separately analyze aesthetic and health parameters because they are segregated into drinking water standards and objectives. It is possible that considering all available parameters together could have improved the performance of the PCA, although it seems likely that the high number of samples below detection limits would continue to limit the usefulness of some of the variables measured as health standards.
Kriging methods rely on an estimation of the spatial structure of data. While semivariogram models provide a means of investigating spatial relationships, kriging typically requires the assumption that the chosen semivariogram model represents the true spatial structure. This assumption is relaxed with empirical Bayesian kriging, which allows for uncertainty in the semivariogram parameters and likely contributes to the superior predictive performance of this method in our study. However, the performance of empirical Bayesian kriging was only marginally better than that of ordinary kriging. Other researchers have investigated Bayesian statistical methods to predict arsenic groundwater concentrations which incorporate spatial relationships using alternatives to semivariograms [58, 62]. Methods such as these could potentially be used to improve prediction of arsenic concentrations and overcome some of the limitations of kriging, especially when spatial variability arises from processes at different scales, limiting the effectiveness of variogram modeling even after allowing for uncertainty in the semivariogram.