Multivariate statistics are routinely used in hydrogeochemistry. Trace elements for which numerous samples show concentrations below the detection limit (censored data) are removed from the dataset in the multivariate treatment, which truncates the dataset. This study proposes an approach that avoids this truncation for critical elements, such as those recognized as sensitive with regard to human health (fluoride, iron, and manganese). The method reduces the dataset in order to increase the statistical representativeness of the critical elements, and it allows a robust statistical comparison between a comprehensive regional dataset and a subset of that regional database. Results from hierarchical cluster analysis (HCA) and principal component analysis (PCA) were generated for the subset and compared with results from the whole dataset. The proposed approach improved the understanding of the chemical evolution pathways of groundwater. From a statistical point of view, the samples of the subset belong to the same flow line, so other samples from the database can be compared with them and discussed according to their stage of evolution. The results obtained after introducing fluoride into the multivariate treatment suggest that dissolved fluoride can be gained either from the interaction of groundwater with marine clays or from the interaction of groundwater with Precambrian bedrock aquifers. These results partly explain why the chemical background of groundwater in the region is relatively high in fluoride, which frequently exceeds drinking water standards.
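The dataset reduction described above can be illustrated with a minimal sketch. It assumes, as one possible reading of the method, that the subset keeps only samples in which every critical element is at or above its detection limit. The detection limits, sample values, and function names below are hypothetical and are not taken from the study; censored values are assumed to be stored as numbers below the limit.

```python
# Hypothetical detection limits in mg/L (illustrative values only).
DETECTION_LIMITS = {"F": 0.1, "Fe": 0.05, "Mn": 0.01}


def is_fully_detected(sample, limits=DETECTION_LIMITS):
    """Return True when every critical element is at or above its limit."""
    return all(sample[element] >= limit for element, limit in limits.items())


def select_subset(samples, limits=DETECTION_LIMITS):
    """Keep only samples with no censored critical element."""
    return [s for s in samples if is_fully_detected(s, limits)]


def censored_fraction(samples, element, limits=DETECTION_LIMITS):
    """Fraction of samples whose value for `element` is below its limit."""
    if not samples:
        return 0.0
    below = sum(1 for s in samples if s[element] < limits[element])
    return below / len(samples)


# Made-up regional dataset: one record per groundwater sample.
regional = [
    {"id": "W1", "F": 0.45, "Fe": 0.20, "Mn": 0.03},
    {"id": "W2", "F": 0.05, "Fe": 0.30, "Mn": 0.02},  # F below its limit
    {"id": "W3", "F": 1.60, "Fe": 0.01, "Mn": 0.04},  # Fe below its limit
    {"id": "W4", "F": 0.90, "Fe": 0.12, "Mn": 0.05},
]

subset = select_subset(regional)
print([s["id"] for s in subset])
print(censored_fraction(regional, "F"), censored_fraction(subset, "F"))
```

In this sketch the critical elements stay in the subset as variables, so HCA and PCA could be run on both `regional` (without the censored elements) and `subset` (with them) using any standard statistics package, and the two sets of results compared.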
This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.