A Broad-Scale Method for Estimating Natural Background Levels of Dissolved Components in Groundwater Based on Lithology and Anthropogenic Pressure

The Water Framework Directive (WFD) requires EU member states to assess the chemical status of groundwater bodies, a status defined according to threshold values for harmful elements and based on/the natural background level (NBL). The NBL is defined as the expected value of the concentration of elements naturally present in the environment. The aim of this study is to propose a methodology that will be broadly applicable to a wide range of conditions at the regional and national scale. Using a statistical approach, the methodology seeks to determine NBLs for SO4, As, Cd, Cr, Cu, Ni, Zn, and F based on the lithology of aquifers from which groundwater monitoring data were collected. The methodology was applied in six EU countries to demonstrate validity for a wide range of European regions. An average concentration was calculated for each parameter and chosen water point and linked to a lithology. Based on the dataset created, significant differences between lithologies and pressure categories (urban, agricultural, industrial, and mining) were tested using a nonparametric test. For each parameter, 90th percentiles were calculated to provide an estimation of the maximum natural concentrations possible for each lithology.


Introduction
The Water Framework Directive 2000/60/EC (WFD) [1] requires EU member states to assess the chemical status of groundwater bodies, a status defined by threshold values that should take into account concentrations of elements that may be both naturally present in aquatic environments and discharged by human activities. Member states must therefore characterize the natural background levels (NBLs) for each component and groundwater body while recognizing that a wide range of approaches have been developed. The NBL Table 1. Drinking water (DW) standards [5] and groundwater threshold value (GW-TV). Previous EU studies focused on WFD procedures for estimating the chemical status of groundwater and consideration of NBLs. The BRIDGE (Background Criteria for the Identification of Groundwater Thresholds) approach is based on the generalization that aquifers with similar petrographic properties have similar water composition under analogous hydrodynamic and hydrologic conditions [8]. The BRIDGE approach uses the BRIDGE aquifer typology to distinguish between NBLs [9,10].

DW Standards GW-TV
BRIDGE methods consist of using only sampling points unaffected by anthropogenic factors (natural areas, groundwater with low NO 3 concentrations, etc.) [11]. As a result, this approach limits NBL determination to specific areas. Lack of data and analytical method limitations (reporting limit being too high) made it difficult to determine NBLs for most trace elements. For this reason, first estimations were based primarily on bibliographic review [12,13].
Groundwater chemistry variation depends on a number of factors, including rainfall composition, aquifer lithology, groundwater flow pathways, and residence time [14]. Accordingly, each European country developed its own studies to meet its obligation to determine NBLs; several key European studies have been published [15][16][17][18][19][20][21][22][23]. More broadly, NBLs have been determined for the purposes of groundwater and health risk assessment and drinking water quality management [24][25][26]. Based on the numerous data, statistical methods have been applied, e.g., frequency distributions, probability plots, box plots, and histograms. Concentrations that deviate from the basic distribution (between the 10th and 90th percentiles) are generally excluded from the derivation of background concentrations. Masciale et al. [27] used Huber's non-parametric test to identify and eliminate anomalous data before NBL calculation, also introducing a confidence level for NBL values. Recent studies have also considered model-based statistical approaches, such as iterative 2-δ techniques, maximum normalized residual tests, and geostatistical multi-model approaches [26,[28][29][30][31]. Furthermore, an online tool evaluation of natural background levels (eNaBLe) was developed in Italy in 2020 [32].
The purpose of this study is to propose and test a methodology common to multiple countries, taking into account national specificities and available data related to groundwater, lithology, and anthropogenic pressure. Prior to this work, previous studies had been conducted at different scales and in specific areas (high mountain karst aquifers, specific geologic units) using data provided by groundwater monitoring networks [33][34][35][36][37][38]. Each study has adapted the BRIDGE approach methodology in different ways. In some cases, data from wells with only a median nitrate concentration below 10 mg/L were used (e.g., Slovenia, France, Austria). In others, such as Slovenia, sites with up to 20% anthropogenic impact in recharge areas were considered [39]. In this study, a new methodology aims to work at a large scale (river basin or national scale) and focus on the components presented in Table 1.
However, this exercise faces two primary challenges in particular, because the data used are provided by monitoring under the WFD framework, which was not primarily designed to characterize NBLs. The challenges are:

•
Monitoring network sampling points that are selected to be representative for a groundwater body, but may not be representative for the NBL calculation. • Laboratory reporting limits (LRLs) that may not be in the most appropriate ranges for NBL calculation, because they were initially designed to verify compliance with drinking water standards. Several approaches can be used to estimate the ranges of natural concentrations for water management plans, either by determining the average value and standard deviation or by describing low and high values after eliminating outliers. Spatial heterogeneity and temporal variability of groundwater composition make it difficult to determine a single value. Heterogeneity is scale-dependent, which leads to the consideration of a concentration range or a confidence limit adapted to each scale of work [40]. Indeed, upper limits for groundwater NBLs may exceed environmental or drinking water quality standards, particularly for elements such as As and F [41,42].

Censored Data
A primary challenge in using statistical approaches to process geochemical datasets is the presence of values that are below the limit of quantification (LOQ)/detection limit (LOD), also known as left-censored values. In some datasets, more than 50% of the values may be left-censored. Sophisticated methods require, for example, work on the data distribution law (distributional methods) or an extrapolation from >LOQ data to <LOQ data (robust methods), and are therefore less direct than the substitution of the value by a constant [43]. The data substitution method also depends on the amount of data below the LOQ. For example, to estimate the median, Antweiler (2015) [44] confirmed that, for a low-censoring dataset, LOQ/2 can be used. In addition, when 50% of the data are censored, no technique gives a reliable distribution estimate. Other more robust approaches for the imputation of left-censored values take multivariate approaches into account [45].
Analytical methods have evolved significantly in recent years, and the quality of the results has improved significantly, reducing uncertainties, detection limits, and/or quantification limits. For these reasons, the temporal evolution of the methods and laboratory reporting limits needs to be reviewed before selecting the dataset (see Section 2.3).

Uncertainties
Before interpreting water quality data, it is important to consider the uncertainty associated with the measurement. Traditionally, only analytical uncertainties have been applied to water quality interpretation [46]. However, these uncertainties represent only part of the uncertainty related to interpreting environmental measurements. Water sample collection is also part of the measurement acquisition process. Ghestem (2009) [47] evaluated the overall uncertainty in groundwater measurement. Overall variations were estimated at 5-10% for most substances, including the major components (SO 4 , NO 3 , Cl) and a range of 10 to 35% for trace elements. For As, Cu, Cr, Ni, and Zn (>5 µg/L), the uncertainties are between 15 and 20%.

Proposed Methodology
Based on the WFD context and data limitations, the following methodology ( Figure 1) shows the steps for determining NBLs based on groundwater network data and a consideration of the potential anthropogenic effects.

Groundwater Quality Dataset
The method involves processing data stored in national/regional groundwater monitoring databases (Table 2), which usually contain large datasets. The first step (step 1, Figure 1) is an inventory of the available and usable data; this step involves the selection of relevant parameters for NBL calculation and information on the lithological and hydrogeological context of each sampling point. Additional chemical parameters are needed to determine the hydrochemical groundwater conditions, such as pH, redox potential, NO 3 , and dissolved O 2 and Fe. Thermal and mineral water are excluded from this assessment. After considering data availability and laboratory reporting limits (LOD/LOQ), step 2 is an examination of the time period for which the data are considered (reference period) for each dataset (step 2).

Groundwater Quality Dataset
The method involves processing data stored in national/regional groundwater monitoring databases (Table 2), which usually contain large datasets. The first step (step 1, Figure 1) is an inventory of the available and usable data; this step involves the selection of relevant parameters for NBL calculation and information on the lithological and hydrogeological context of each sampling point. Additional chemical parameters are needed to determine the hydrochemical groundwater conditions, such as pH, redox potential, NO3, and dissolved O2 and Fe. Thermal and mineral water are excluded from this assessment. After considering data availability and laboratory reporting limits (LOD/LOQ), step 2 is an examination of the time period for which the data are considered (reference period) for each dataset (step 2). Sampling point link to the groundwater bodies delineated based on the DK model [49].
Values < LOD substituted with LOQ/2, where LOQ = 3*LOD. High LOD removed JUPITER Figure 1. Flowchart for NBL calculation applied in this study.  Dataset construction and formatting are based on the extraction of data and metadata (e.g., sampling date, unit, analytical method, LOD/LOQ) from national or regional databases (step 3) (Table 2), followed by the use of a set of controls and corrections prior to any treatment (step 4). Controls are applied to check systemic errors that may be observed in large databases (unit mg/L instead of µg/L), duplicated values (multiple uploads), and anomalous data (pH, temperature); when an error can be corrected (generic error), it is possible to substitute the value. Otherwise, the sample is deleted.
A large proportion of trace elements were censored at various LRLs. The censoring level was adjusted (Table 2), thus providing a lower LRL for each element. This is a simple way to deal with non-detected data, but it is a rather arbitrary approach depending on the analytical method and laboratory. In the case of the LOQ exceeding expected NBL ranges, it is more appropriate to remove the censored value with such a high LOQ from the dataset to avoid bias [36]. This removal may require using an iterative approach.
The number of analyses and parameters varies for each dataset. The objective is to have one value for each parameter per sampling point (step 5), so that each sampling point has the same representativeness in the dataset. When several analyses (>3) were available for water points, we chose to calculate a median for these datasets so as to assign a single value to each sampling point. The median value is preferred because it is a more robust value than the mean as a result of being less affected by outliers. When several LOQ levels were reported for one sampling point, we applied the median LOQ.
Summary statistics, which include the minimum, maximum, and common percentiles, were computed for all parameters. Particular attention must be paid to the number of analyses available before conducting this first statistical analysis. In addition, redox and pH types were defined for each sampling point (step 6) according to the following classes (Tables 3 and 4): pH and redox states that control mineral solubility and trace metal mobility, often through sorption/desorption processes for pH and the redox process for redox potential. If no direct measurements are available, dissolved O 2 , NO 3 , SO 4 , and Fe can be used to estimate the groundwater redox state based on the definition in Table 4.

Hydrogeological Characteristics of Sampling Points
For each sampling point, hydrogeological information (aquifer, lithology, depth) are linked to the data (step 7). In addition (step 8), lithological information related to water wells aids in the definition of local geology at the sampling point and in controlling consistency with the assigned groundwater body. For a common approach among all investigated aquifers, major categories were defined on the basis of eight major simplified lithologies (sedimentary: sand, gravel, carbonates, clay/marls, others; volcanic rocks; crystalline bedrock; and metamorphic rocks). This global approach makes it possible to define NBLs for simplified lithologies. Table 3. Redox and pH types applied.

REDOX TYPE Redox Condition
Finally, each sampling point is linked to its groundwater catchment or associated surface watershed (step 9). This information is necessary for the assessment of anthropogenic pressure at each sampling point. For superficial aquifers, the link is quite important, whereas, for deep aquifers that are disconnected from the surface, the link is not critical. This link is also an illustration of groundwater vulnerability for the following steps.

Anthropogenic Pressure
Evaluating anthropogenic pressure and relating activities to specific dissolved components would make it possible to determine the expected NBLs in the area under agricultural, industrial (including mining), and urban conditions. The accuracy of this approach would depend on the information regarding anthropogenic activities and the relationship between the activities and dissolved components released. The pressure inventory (step 10) is provided for anthropogenic pressure, such as diffuse pollution (agriculture, urban), point source pollution (industrial), and mining areas. Anthropogenic pressure is determined on the basis of the Corine Land Cover (CLC) inventory and the following classification: urban pressure; industrial pressure; agricultural pressure; and mining activities. This inventory is complete when databases related to industrial sites and service activities, as well as a register of pollution discharges, are available ( Table 2).
All information for each water point and anthropogenic pressure is summarized using geographical information systems (GISs) (step 11). This facilitates a mapping approach to check areas with abnormal concentrations or exceedances of DWS. Due to the heterogeneity of the available data and anthropogenic conditions among countries, different GIS approaches have been tailored to determine the prevailing pressure at each sampling point (step 12). This approach can be performed at the groundwater body scale, by lithology, or for the entire dataset. The approach for determining the pressure area around a water point will vary depending on hydrogeological and watershed characteristics. The goal of this step is to identify elements affected by anthropogenic activity (after a statistical test to evaluate the impact on concentration distribution) and to discard data for which an anthropogenic pressure is recognized.

Statistical Treatments
Using one average value for each water point, descriptive statistics (step 13) are calculated for each parameter: • Median-Quartile 1 and 3-10th percentile and 90th percentile • Percentage of quantified data by group in order to assess the weight of unquantified (censored) values in the calculation The description of the distribution is illustrated by boxplots or a cumulative frequency plot. Cumulative probability curves may be preferred to facilitate data comparison, because they are useful for evaluating different data populations. Spatial distributions were highlighted by mapping with GIS tools. Outliers potentially illustrate water with anthropogenic contamination or a high geochemical background, which are to be distinguished from the NBLs defined in this study.
Discriminant function analysis (DFA) (step 14) is applied to find patterns within the dataset and classify them. DFA is useful for determining whether water points can be grouped into one water family for NBL determination or if subgroups are more appropriate.
Distribution comparison and non-parametric tests (step 15) may be applied to each parameter to determine if sampling points have the same concentrations under anthropogenic pressure or to compare different classes of sampling points using simplified lithology or geochemical type (acidic or basic water, redox conditions) so as to determine whether lithology or geochemical parameters may affect NBLs. This step makes it possible to identify water families for which NBLs need to be determined. Assumptions of data normality and homogeneity of variances between types within a domain are not always met. In this situation, a parametric test is less powerful than non-parametric alternatives. In a dataset containing groups with large sample numbers (>100), a parametric approach, such as ANOVA, may still be suitable. Nevertheless, groups with a limited amount of data (five to 20) are included in the datasets in this study. For this reason, non-parametric one-way tests, such as Mann-Whitney and Kruskal-Wallis tests, were applied to test the null hypothesis assuming identical distribution for different groups [43]. Results from the non-parametric tests were analyzed with the understanding that they were less powerful than parametric tests. Non-parametric statistics do not require distributional assumptions, just mutual independence of samples. The significance of differences is established from p < 0.05. In the event that the Kruskal-Wallis test was significant, that is, the null hypothesis could be clearly rejected, a post hoc analysis (Nemenyi test) was performed to determine which groups were identified in the dataset.

NBL Calculation
Based on the results of step 15, sampling points identified as being affected by anthropogenic pressure were removed from each parameter data subset (step 16).
NBL determination (step 17) consists of a calculation of natural maximum concentrations representative of a water family. A water family is a lithology type or groundwater body with a dominant lithology. NBLs are calculated for SO 4 , As, Cd, Cr, Cu, Ni, Zn, and F within each lithology. When geochemical parameters, such as pH and redox, are identified as affecting the data distribution (based on the Kruskal-Wallis test result, steps 14-15), the NBL is determined for subgroups. For each water family, the 90th percentile is determined. This percentile is usually applied as a cutoff to define natural concentrations by lithology without any anthropogenic effects or geogenic anomaly; the 90th percentile defines the concentration at which 90% of sampling points associated with a simplified lithology are of a lower concentration. The concentration range between the 10th and 90th percentiles represents the element's "expected" concentration [11]. For this reason, the 90th percentile was selected for NBL determination in this study.

Results of Methodology Application to Regional Studies
The NBL study was conducted in six (6) countries; a summary of groundwater chemistry data is shown in the following tables and detailed in the Lions et al. report [56].

Groundwater Quality Dataset Construction (Steps 1-9)
Each dataset was based on the available analysis with consideration of the specificities and limits of each dataset (Tables S1 to S7). In several cases, the LRL varies significantly from year to year and between laboratories. For example, LRLs for As can range from 0.1 to 10 µg/L in France, whereas, in Denmark, they range from 0.01 to 1 µg/L. For the highest LOQs, which may be higher than NBLs, only data with a maximum LOQ were considered. LOQ cutoff values were optimized according to data distribution, because data with a large amount of high LOQs (i.e., higher than NBLs) disturb the natural concentration distribution [36]. In the internal basins of Catalonia, assuming that a censored value may imply decreased data quality, only parameters (As, Cu, Zn, SO 4 ) with a LOD/LOQ below 33% were considered in the analysis. By considering a large amount of data < LOQs, 90th percentiles may be identical to the LOQ/2 or <LOQ; in this case, the NBL is indicated as <LOQ, and no concentration is calculated.

Anthropogenic Pressure (Steps 10-15)
Various GIS approaches have been used to determine the pressure at each sampling point.

Watershed (Loire-Bretagne)
In the Loire-Bretagne river basin, the methodology consists of linking the prevailing pressure to watershed units at each sampling point (Table S8). Watershed units are determined using the TauDEM (Terrain Analysis Using Digital Elevation Models) toolbox plugin combined with the Digital Terrain Model (DEM). This DEM (BD ALTI) has a spatial resolution of 25 m [57]. The territory was divided into 36,453 basins and sub-basins. Upstream and downstream basins were identified for each basin. Sampling points and pollution sources, such as discharge points, industrial sites, mining areas, non-point sources, or diffuse pollution per municipality and CLC [48], were linked to each watershed.

Individual Analysis
In Slovenia, for groundwater bodies that were declared to be of poor quality, all sampling points were listed as having anthropogenic impacts. Surrounding sites were investigated where potential pollution sources or urban areas were identified. This approach was used because the number of sampling points was limited. This data management led to the removal of 24 sampling points from the original dataset of 203 and one sampling point as an outlier for As, Cd, Cr, and Fe.

Buffer around the Sampling Point
For each sampling point, the areal proportion of CLC inventory types was calculated for the area surrounding sampling points (Table 2, Tables S9 and S10). Based on the large number of wells studied in Denmark ( Figure 2) and Catalonia (>1000), it was decided that a one (1) km buffer would be an adequate proxy for well catchment zones, because actual well catchment areas are unknown at these national scales (Figure 3). In Austria, CLC [50] information was extracted as point data (10 m buffer), because most sampling points are found in gravel aquifers with complex catchment areas.

DRASTIC Method
The possibility of contaminants infiltrating from the surface in the Fruška Gora national park area was studied using the DRASTIC method [58], which was developed to evaluate groundwater vulnerability and make vulnerability maps [52,53]. This study noted that, in the central part of Fruška Gora (ridge), the vulnerability level was low (75-100) to very low (<75). Down the Fruška Gora slope, the vulnerability level increased; thus, ridges in the mountains, which are densely populated, are the most vulnerable (150-185). By combining DRASTIC with CLC, anthropogenic influences (discontinuous urban fabric, industrial units, mineral extraction sites, agriculture areas) can be seen/are clearly observable in areas of high vulnerability. This approach can be useful for an improved determination of NBLs, because prevailing pressure depends on all DRASTIC factors and hydrogeological settings.

Buffer around the Sampling Point
For each sampling point, the areal proportion of CLC inventory types was calculated for the area surrounding sampling points (Table 2, Tables S9 and S10). Based on the large number of wells studied in Denmark ( Figure 2) and Catalonia (>1000), it was decided that a one (1) km buffer would be an adequate proxy for well catchment zones, because actual well catchment areas are unknown at these national scales (Figure 3). In Austria, CLC [50] information was extracted as point data (10 m buffer), because most sampling points are found in gravel aquifers with complex catchment areas.

Buffer around the Sampling Point
For each sampling point, the areal proportion of CLC inventory types was calculated for the area surrounding sampling points (Table 2, Tables S9 and S10). Based on the large number of wells studied in Denmark ( Figure 2) and Catalonia (>1000), it was decided that a one (1) km buffer would be an adequate proxy for well catchment zones, because actual well catchment areas are unknown at these national scales (Figure 3). In Austria, CLC [50] information was extracted as point data (10 m buffer), because most sampling points are found in gravel aquifers with complex catchment areas.

Geochemistry (pH, Redox, Lithology) (Steps 13-15)
Data analysis addresses the geological composition of aquifers and geochemical parameters (pH, redox), which are among the main factors that affect trace element occurrence. Some examples are presented here; for additional details, see Lions et al. [56].
Lithology is the primary factor that controls water composition, including pH, conductivity, major ions, and trace elements. In Austria, major cations Na, K, Ca, and Mg were considered as quantitative variables to characterize aquifer lithology using a confusion matrix (Table S11). DFA (specifically, linear discriminant analysis) showed that, for Denmark, based on predictor variables (pH, SO 4 , As, Ni, F, Cl, O 2 , NO 3 , Fe), it was possible to discriminate easily between the two lithology classes sedimentary: carbonates and sedimentary: sand (accuracy and 95% confidence interval: 81.7% (79.3-83.9%)). Therefore, these lithology classes were used for determining Danish NBLs. For Slovenia, the Mann-Whitney U test results showed significant differences between NBLs of carbonates and sands at the 95% confidence level (α = 0.05). This was true for all elements except Cd, due to its low concentration (0.01 µg/L).
In Loire Bretagne, the Kruskal-Wallis test and Nemenyi post hoc test showed that trace elements belong to different groups that overlap when the lithology is not well-constrained, such as for clays and marls or gravels (Figure 4). Results show that volcanic rocks belong to an individual group for components such as Ni, Zn, F, or SO 4 . Crystalline rocks are in a group of their own for elements such as As (including the group sedimentary-others, which contains crystalline and sedimentary formations) and Cr, whereas Ni is individualized for metamorphic rocks and sand. The distribution of an element across lithology classes was tested to justify the NBL definition for each lithology. In addition, descriptive statistics showed that crystalline and metamorphic rocks were primarily represented by acidic water (>90% with a pH < 7). This fact is directly related to the water-rock interaction in these lithologies. Conversely, carbonates and clays/marls were represented by neutral water (7-7.5) (>83%), whereas basic water was present at 12% of the sampling points in carbonates. Water in gravels can be either acidic or neutral, whereas sand and volcanic aquifers are represented in each family (acidic, basic, neutral). Redox conditions will depend more on hydrogeological settings (depth, confined aquifer) than on lithology.  The Kruskal-Wallis test was used to evaluate the effect of pH and redox on trace element concentrations for each element ( Figure S1). Reduced water predominates in Denmark, whereas oxic water is predominant in Loire-Bretagne. However, As is affected by the redox class, as illustrated in Figure 5 for sedimentary aquifers, whereas pH is not discriminant. The Kruskal-Wallis test was used to evaluate the effect of pH and redox on trace element concentrations for each element ( Figure S1). Reduced water predominates in Denmark, whereas oxic water is predominant in Loire-Bretagne. However, As is affected by the redox class, as illustrated in Figure 5 for sedimentary aquifers, whereas pH is not discriminant. Figure 4. Results for the Kruskal-Wallis test and the Nemenyi post hoc test for lithology families in the Loire-Bretagne area (number of samples in brackets). The categories sharing letters (above the boxplot) are not significantly different at the 95% confidence level based on the post hoc Nemenyi test [56].
The Kruskal-Wallis test was used to evaluate the effect of pH and redox on trace element concentrations for each element ( Figure S1). Reduced water predominates in Denmark, whereas oxic water is predominant in Loire-Bretagne. However, As is affected by the redox class, as illustrated in Figure 5 for sedimentary aquifers, whereas pH is not discriminant.

NBLs (Steps 16-17)
The following NBLs were determined for each dataset based on aquifer lithology. Concentrations that deviated due to anthropogenic factors were excluded from the dataset prior to NBL derivation according to the conclusions of each dataset. Only families with more than five (5) sampling points are presented in Tables 5 and 6. For Denmark, the minimum number of sampling points for a lithological group was set at 20, but when the pH and redox were used, this threshold was increased to 50.
The resulting NBLs considering pH and redox conditions were determined for As, Ni, SO 4 , and F (Table S12). For the Loire-Bretagne area, the results showed that Ni and Zn were clearly affected by acidic conditions with higher concentrations. Reduced conditions led to high concentrations of Fe, Mn, and As. For this dataset, NBLs were determined for As based on redox only and for Ni and Cu based on redox and pH. For carbonates, only neutral and basic waters were considered. NBLs for Zn in sandy aquifers were considered based on pH (Table S13).

Lithology
For the elements in this study, lithology is one of the primary factors that controls NBLs. The results show that using simplified aquifer lithologies is appropriate for comparing large datasets from different countries. Further subdivisions of lithology groups would lead in many cases to very low numbers of data per lithology. The NBL variations shown in Tables 5 and 6 can be partly explained by the broad spectrum of elements found. NBLs can also show multiple and different geological settings (e.g., paleo-environment) that explain the possible variations of a parameter across several countries. However, the approach of this study highlights valuable primary characteristics of lithology groups, including NBL ranges for specific elements.
A discussion of lithology groups by country may be helpful in clarifying NBL variations that can be linked to lithology.
For Denmark, it is important to distinguish between different depositional ages for the sedimentary: sand class. For example, the DFA analysis showed a difference between quaternary sands (mostly of glacial origin) and pre-quaternary sands when pH, SO 4 , As, Ni, F, Cl, O 2 , NO 3 , and Fe are used as predictors [56].
We correctly identified some national specificity. In Denmark and Serbia, the NBL of As in sandy aquifers exceeds threshold values. In Denmark, groundwater resources have often geogenic As with high concentrations [59]. The Serbian dataset refers to a location north of Fruška Gora (Vojvodina), where groundwater usually contains elevated As [60][61][62].
The sedimentary: others class includes a wide range of area-specific lithologies, usually described as having distinct characteristics linked to local geology (specific environment, mineralization). Therefore, it is not a well-defined lithology, and NBL determination will reflect each regional lithology. In Loire-Bretagne, the sedimentary: others class includes aquifers composed of a mixture of sedimentary and other lithologies, such as volcanic strata or weathered crystalline or metamorphic rocks. In the Duero River Basin, there is an undifferentiated sequence of interbedded sedimentary sand, gravel, and clay. In such multilayered aquifer faces, NBLs are calculated for all sedimentary units. For this reason, all of them were aggregated into a single category (cenozoic). In Slovenia, the sedimentary: others class points that sample quaternary alluvial aquifers are mixtures of sand and gravel. Results for the clays/marls class are not presented here, because they were found only in France and parts of Austria [56].
Furthermore, regarding the crystalline or metamorphic classes, all formations cannot be merged into only one lithological class, because each geological domain (e.g., variscan, hercynian) has specific petrologies and mineralization that may affect the NBLs differently. This approach also excludes the contribution of local mineralization, faults, and mineral deposits. This study provides several standard NBLs for crystalline and metamorphic rocks, but further studies are needed to characterize local NBLs.

Effects of Physico-Chemical Parameters
Lithology and water chemistry control pH buffering in groundwater: acidic water in crystalline aquifers and neutral to basic water in carbonate aquifers. These factors will directly control the geochemical equilibrium for trace element release and resulting NBLs. In addition, redox potential may be directly controlled by depth or confined/unconfined conditions. In this case, redox potential is an indicator of groundwater conditions, which is correctly demonstrated by reduced water speciation (low NO 3 , low Fe, low dissolved O 2 ) in deep or confined aquifers. In the Loire-Bretagne area and Denmark, F concentrations differ according to redox classes. For this element, redox is not directly involved in F mobilization, because it is not sensitive to redox processes; however, F is higher in confined aquifers and deep groundwater.

Anthropogenic Effects
This land use-based method for assessing anthropogenic effects aids in the identification of statistical differences between sampling points subject to prevailing agricultural pressure compared to those affected by industrial and/or urban pressures. Sampling point distribution according to prevailing pressures is presented in [56] and Table S8-S10. Regarding statistical significance among various lithologies and anthropogenic impacts, it is possible to conclude that prevailing pressures may affect trace element concentrations in several cases.
In the Duero River Basin (Spain), most sampling points are mainly affected by agricultural pressure. In other words, within a 1 km buffer around each sampling point, most of the land is used for agricultural purposes. The pre-selection of sampling locations to include only drinking water supply wells may explain the absence of points affected by industry, mining, or urban activities.
In the Danish case, less than 1% of the sampling points were devoid of any anthropogenic pressure (i.e., are natural). All are located near the coast, and are not considered to be representative of undisturbed conditions. Depending on the components considered, between 74.8 and 85.8% of the sampling points show effects of agricultural pressure. This finding limits the application of the method, because it means that a statistical test for comparing elemental distributions cannot be used against the natural group. As a result, the normal state is assumed to be dominant agricultural pressure. In this case, only sampling point groups for which industrial or urban pressure predominates can be compared to the agricultural group. The shortcoming of this method is that it fails to distinguish between polluted and potentially polluted sampling points, but excludes the entire group with the specific dominating pressure. As a consequence, an entire data group is excluded, and some NBLs increase because sampling points with low concentrations (potentially the NBL) were excluded.
In Slovenia, a previous study [38] used only sampling points with a minimum probability of being affected by anthropogenic activities, and therefore determined NBLs for NO 3 (3.8 mg/L) and SO 4 (6.9 mg/L). Serianz et al. (2020) [34] assigned 6.0 mg/L for NO 3 after removing all analyses exceeding 10 mg/L of nitrate; SO 4 NBLs were 17.0 mg/L (by the probability method) and 26.0 mg/L (by the pre-selection method). In this study, the natural group NBLs are 15.5 mg/L for NO 3 and 14.0 mg/L for SO 4 , but as much as 41.5 mg/L and 25.0 mg/L for the anthropogenic group, respectively. This result indicates that many points in the Slovenian national monitoring network used for groundwater body assessment are noticeably affected by anthropogenic pressures, so trace element concentrations should be evaluated with care. As a result, only 87 sites without obvious anthropogenic impact were used for the NBL calculation. Thus, it is important to carefully assess the statistical test results (step 15, Figure 1), which can serve as a first screening followed by expert assessment before data removal. In addition, a more detailed assessment of anthropogenic pressure and exclusion of polluted sites can be done for a target area within a specific groundwater body or aquifer.
The results of each prevailing pressure assessment (Table 7) could not be generalized to other groundwater bodies, because they are directly linked to the distribution and nature of anthropic activities. However, anthropogenic impacts have been statically observed for each trace components (As, Cd, Cr, Cu, Ni, Zn, F, SO 4 ) under the following pressures: agriculture (As, Cd, Cu, Ni, Zn), urban and industrial (all), and mining (As, Cu). Other trends may exist, but they have not been observed in the investigated datasets.

NBLs
Discriminant analyses of lithology classes (qualitative variables) and selected major and minor elements (quantitative variables) were performed to characterize the most prevalent water families in each dataset. While the median is representative of the average concentration, the 90th percentile gives a high range for natural concentrations [63]. The 90th percentile makes it possible to estimate the possible concentrations of natural origin in each group. Values defined by higher percentiles, for example, 95 or 97.7%, would be also appropriate as a reference for maximum natural concentrations, but higher percentiles require larger datasets and well-selected sampling points for higher confidence intervals [64]. When a large variability is observed, NBLs could be much higher than the average concentration in groundwater. Therefore, NBLs are not average concentrations, but concentrations in a higher range. These values are helpful to emphasize anomalous concentrations in groundwater. The NBLs obtained from these seven case studies are well within the range of NBLs determined by international studies [65].
In this study, we were able to apply the proposed method and establish NBL values that are within the expected range compared to previous studies [34][35][36][37][38]40,66]. In the case of the Duero River Basin, NBLs are compared to threshold values provided by local authorities. In the case of SO 4 , the NBL obtained when considering the cenozoic aquifer is about 241 mg/L (anthropogenically affected value), while the value without the anthropogenically affected data is about 100 mg/L.
For groundwater management, these NBLs help to highlight groundwater bodies where high concentrations need further investigation, so as to identify the origin of the anomaly. An anomalous concentration in a groundwater body may be a result of groundwater contamination, but may also come from a natural occurrence in a specific aquifer or in a particular geological context, such as mineralization or thermal waters [67][68][69]. Even so, the NBLs established in this study should not be applied directly to management and planning purposes at the groundwater body scale, because the data processing was conducted at a different scale (regional, national). In addition, in the case of high geological heterogeneity, such as in the Catalonian aquifers, NBL estimation needs more criteria to define water families for groundwater management implementation.

Conclusions
The methodology developed in this study was demonstrated in multiple countries and groundwater monitoring contexts. The use of the methodology made it possible to distinguish water families on the basis of lithology and geochemical conditions (pH, redox), for which different NBLs are expected. The proposed method can be used for the broad contexts investigated in this study; it provides coherent results. Therefore, it can be used as a step to explore a large dataset from groundwater monitoring networks. The methodology is an overall lithological approach; its results can be extrapolated to aquifers of comparable lithology by excluding local specificity links, for example, to the presence of mineralization, mining (metals and metalloids), or evaporites (gypsum). For each aquifer, this first estimation is open to the addition of new qualitative and hydrogeological data to refine the preliminary NBLs and the area of application.
The statistical tests applied depend on the amount of available data and the number of hydrogeological and lithological families. This approach may be limited for some hydrogeological entities (e.g., heterogeneous aquifers) depending on the number of available sampling points and characterization of hydrogeological settings (depth, confined/unconfined, pumped layer). NBLs determined for families characterized by only a few water points or that are not representative of the groundwater body should be viewed with caution. NBLs may be refined on the basis of additional data, the definition of lithology or groundwater entities, and the lithologic and petrographic description of the sampled aquifer. Hydrogeological settings for each groundwater need to be carefully addressed.
Determination of NBLs in alluvial aquifers is challenging because of obvious contamination and presence of anthropic activities. This caution is also relevant for unconsolidated and unprotected shallow aquifers, such as the aeolian or glacio-fluvial aquifers in Denmark [70]. In general, these vulnerable aquifers are usually impacted by nitrates or/and organic pollutants according to prevailing pressures [71]. Conversely, confined aquifers are naturally protected from surface infiltration and, as a result, anthropogenic contamination is less common.
Using monitoring network data and comparing measured concentrations to calculated NBLs, it is possible to identify groundwater bodies for additional investigation to identify the origin of anomalous concentrations or anthropogenic inputs. Delineation of these entities and the interpretation of exceedances are beyond the scope of this study, because they require local investigation. Specific geogenic features (mineralization, evaporites, thermal waters) require a different methodology from the lithological NBL as defined in this study. The results of each prevailing pressure assessment cannot be generalized to other groundwater bodies, because trace element contaminations are directly linked to the type of anthropic activity; only general trends for each element can be highlighted.
This work constitutes a first step in the definition of NBLs. At this scale, data analytical methods might also be accompanied by process-oriented analyses and supported by expert background knowledge at the local scale. This methodology will depend strongly on the selection of relevant water points and the use of sampling and analytical methods suitable for reducing uncertainty in NBL characterization. To provide better estimates of trace element NBLs in the future, monitoring networks should be improved or additional sites besides the national monitoring network should be used with the selection of water points less subject to anthropogenic impacts to provide additional information regarding trace element concentrations. In addition, targeted analytical methods should be used to obtain lower LOQs and thereby provide more useful datasets.
Supplementary Materials: The following are available online at https://www.mdpi.com/article/ 10.3390/w13111531/s1, Table S1. Denmark: The number of chemical analyses for each element (n), account of LODs in the dataset, range of LODs, the number and percentage of analyses < LOD, the max substitute value, and the number of analyses with the max substituted. Table S2. Serbia: The number of chemical analyses for each element (n), account of LODs in the dataset, the percentage of analyses < LOD, the max substitute value, and max LOD substituted. Table S3. Slovenia: The number of chemical analyses for each element, account of LOQs in the dataset (two were reported in 2016), range of LOQs, the number and percentage of analyses < LOQ, the max substitute value, and the number of analyses with the max substitute. Table S4. Loire-Bretagne, France: The number of chemical analyses for each element (n), account of value > LOQs in the dataset, range of LOQs (low, high), percentage of analyses < LOQ, percentage with high LOQ, and the maximum value substituted. Table S5. Internal Basins, Catalonia: The number of analyses for each element (n) and number and percentage of analyses < LOD, range of LOD, and maximum value imputed with each method. Table S6. Austria: Dataset overview, ranked by percentage of values below LOD or LOQ. Table S7. Spain, DRB: The number and percentage of chemical analyses below the limit of quantification (LOQ), range of LOQs in the dataset, and min and max substituted value. Table S8. Loire-Bretagne, France: The number of water sampling points with data for each element and each group of prevailing pressure. Table S9. Denmark: The number of water sampling points with data for each element and each group of prevailing pressure and results from the Kruskal-Wallis rank sum test. Table S10. Spain, DRB: The number of water sampling points with data for each element and each group of prevailing pressure and results from the Kruskal-Wallis rank sum test. Table S11. Austria: The confusion matrix for lithology. Table S12. Denmark: Natural background levels (NBLs) for sedimentary aquifers combined with pH and redox conditions. Table S13. Loire-Bretagne, France: Natural background levels (NBLs) for the sedimentary aquifers combined with pH and redox conditions.  Data Availability Statement: Final Report for HOVER WP3.3 is available here: https://repository. europe-geology.eu/egdidocs/hover/hover_33_statistical_data_treatment_backgroundlev.pdf. For France, monitoring data (ADES) are available https://ades.eaufrance.fr/ (accessed on 12 February 2021) and information for aquifer are available on https://bdlisa.eaufrance.fr/ (accessed on 15 September 2020). For Denmark, database JUPITER is available at https://eng.geus.dk/productsservices-facilities/data-and-maps/national-well-database-jupiter (accessed on 3 July 2019). Austrian data are available on the Austrian groundwater monitoring network (WISA): Wasserinformationssystem; Gewässerzustandsüberwachungsverordnung, BGBl. II Nr. 479/06 i.d.g.F. Catalonian data from SDIM-ACA are available on http://aca-web.gencat.cat/sdim21/ (accessed on 25 September 2020). Data from the Slovenian Environmental Agency are available on https://www.arso.gov.si/vode/ podatki/ (accessed on 15 July 2020) and land use on www.mkgp.gov and https://rkg.gov.si/vstop/ (accessed on access on 21 September 2020). Data from Duero River Basin (Spain) are available on https://www.chduero.es/web/guest/red-control-estado-quimico (accessed on 30 October 2020) and land use on http://centrodedescargas.cnig.es/CentroDescargas/catalogo.do?Serie=SIOSE (accessed on 29 December 2020).

Conflicts of Interest:
The authors declare no conflict of interest.