Calculation of Visual Background Values of Major Groundwater Components Taking Karamay City as an Example

Abstract: Based on groundwater chemistry data from Karamay City, Xinjiang Province, this study examines seven major components: K⁺, Ca²⁺, Na⁺, Mg²⁺, SO₄²⁻, Cl⁻, and HCO₃⁻. Sampling was conducted during two periods: the flood period and the dry period. After analyzing the regional geologic background and hydrogeologic conditions, the study area was divided into calculation units and the test data were screened for validity. Outliers were then eliminated by Grubbs' method and Piper's trilinear plot method, and the effectiveness of the elimination was evaluated with box plots. Next, the distribution type of the groundwater chemistry data in each calculation unit was determined to obtain the background values of the seven test indexes in the different calculation units. The results show, first, that Grubbs' method and Piper's trilinear plot method are effective in removing outliers. Second, the background values of Na⁺ and Cl⁻ in the groundwater of Karamay City are mostly higher than those of the other anions and cations during both sampling periods, which may be due to the concentration effect of evaporation. Additionally, the groundwater background values of Ca²⁺, Na⁺, and Cl⁻ differed more between the two sampling periods than those of the other ions.


Introduction
With rapid economic and social development, human activities have been increasing, resulting in intense groundwater disturbance and the deterioration of the groundwater environment [1]. Under such circumstances, groundwater quality evaluation for groundwater protection and pollution prevention has become one of the hot spots of groundwater research. The concentration of chemical substances in groundwater results from a combination of factors. It depends on natural factors such as the reactions between water, minerals, and gases. Additionally, human activities such as urban, agricultural, industrial, and mining activities also affect the concentration of substances [2,3]. Analyzing the groundwater background value amounts to evaluating the natural level of chemical substances in groundwater, so studying the background value can provide a scientific basis for the early warning of groundwater contamination [4]. It also establishes a foundation for the sustainable development and utilization of groundwater resources [5].
International research on the groundwater background value in the traditional sense started very early. As early as 1978, the United States carried out related research. However, due to differences in understanding and methodology at the beginning of the study, the definition of the groundwater background value was not yet uniform [6,7]. By the 1990s, the more widely recognized definition was "Groundwater background value refers to the chemical components and content of groundwater under uncontaminated conditions" [8]. This definition was based on the evolution and internal development of groundwater in its natural state [9]. Still, human activities inevitably affect the evolution of groundwater quality and quantity to varying degrees, which makes it impossible to determine the background value in the strict sense at this stage. To solve this problem, the concept of the visual background value has been proposed in recent years. It refers to the characteristic content, variation range, proportional relationship, etc., of each water chemical component formed by superimposing the influence of normal human activities on groundwater in its natural state. It reflects the process and characteristics of each water chemical element as it evolves with the natural environment and under the influence of normal human activities [10][11][12]. The visual background value is therefore more targeted and operational for groundwater pollution prevention and control. This study also uses the concept of the visual background value to analyze and evaluate groundwater quality.
Currently, there are many methods for analyzing the background value of groundwater [7,13,14], among which mathematical statistics and the pre-selection method are the most widely used and the most intuitive. Daniele Parrone et al. [15] used the pre-selection method to study the volcano-sedimentary aquifers in Central Italy, with the regional NO₃⁻ concentration as an indicator substance to exclude the influence of human activities. The groundwater background values were then determined based on the different distribution types of the concentration data. The key to the pre-selection method is to determine the indicator substances and their thresholds, but these are usually based on experience or expert knowledge, which is highly subjective. This is the most significant limitation of the pre-selection method. Guanxing Huang et al. [4] used Grubbs' test to remove outliers in calculating the background values of As and Mn for the four computational units of the Pearl River Delta. After the removal, the dataset was tested for normality and found to be normally distributed. However, when the background value calculation relies too heavily on mathematical statistics, the intrinsic evolution law and characteristics of the groundwater hydrochemistry itself are insufficiently considered. As a result, the background value obtained cannot truly reflect the actual situation of the groundwater in the study area. To address this problem, many scholars have combined the hydrochemical mapping method with the mathematical statistical method. This approach not only takes the hydrogeological situation of the regional groundwater into account when removing outliers, but also reflects, to a certain extent, the regional groundwater migration and transformation laws and the changes in the groundwater components. Guggenmos et al. [16] combined the multivariate statistical method with the hydrochemical analysis method in a study of the interaction between groundwater and surface water to remove anomalous monitoring data from the water samples. Grubbs' test is one of the statistical methods used to detect outliers. It is not only suitable for discarding one or more outliers in a dataset but is also reliable for limited measurements. Piper's trilinear diagram method, in turn, vividly displays hydrochemical characteristics such as regional groundwater ion concentrations. Moreover, when combined with the Mahalanobis distance, it can identify outlier data in ion concentrations by considering the intrinsic evolution patterns of groundwater. Therefore, considering the evolution of groundwater chemistry in Karamay City, this study combines Grubbs' test and Piper's trilinear diagram method to remove outliers and thereby characterize the groundwater background values.
Karamay City epitomizes an arid region characterized by minimal precipitation but significant evaporation. Additionally, it serves as a quintessential mining hub, primarily centered on oil field exploration and development [17]. The extraction of oil and other socio-economic endeavors profoundly influence the groundwater [18]. The quantity and quality of groundwater in Karamay City are therefore facing a severe test. This paper utilizes the groundwater chemistry data collected and tested during the two sampling periods in Karamay City in 2022 and carries out a groundwater background value study for the seven main components: K⁺, Ca²⁺, Na⁺, Mg²⁺, SO₄²⁻, Cl⁻, and HCO₃⁻. The groundwater background values were also used to calculate the groundwater contamination onset values for the seven testing indicators. This process aims to provide a scientific basis for the development of groundwater and the management of key industrial enterprises in Karamay City.

Study Area Description
Karamay City belongs to Xinjiang Province (Figure 1b) and is located in northwestern China (84°44′-86°1′ E, 44°7′-46°8′ N), in the western part of the Junggar Basin [19,20] (Figure 1a). The geomorphological features of the study area are relatively homogeneous, mostly broad and flat desert with elevations ranging from 270 m to 500 m [21]. Karamay is located in the hinterland of the Eurasian continent. It has a typical temperate continental arid desert climate, with hot summers (maximum temperature of 42.9 °C) and cold winters (extreme minimum temperature of −35.9 °C). The region experiences arid conditions with little rainfall and high evapotranspiration (annual evapotranspiration of 3545.2 mm). The evapotranspiration-to-precipitation ratio is greater than 27.6 [17].
The aquifer structure in Karamay City is influenced by geomorphology, stratigraphic lithology, and tectonics, resulting in a regular distribution from mountainous areas to plains. Fissure aquifers prevail in mountainous regions, while pore aquifers dominate the southern and central parts of the city. In the northern and western areas, pore aquifers primarily occupy the upper sections, while a mixed pore-fissure structure characterizes the lower segments [22].
The hilly area is affected by the arid climate of the Junggar Desert to the east and receives scanty precipitation, which contributes little to groundwater recharge in the plains. Most recharge in the delta occurs through leakage from rivers or diversion canals. Atmospheric precipitation in the plains averages less than 10 mm and evaporation is intense, so recharge from precipitation is negligible. Groundwater is discharged primarily through phreatic evaporation and plant transpiration. Shallow phreatic water follows an infiltration-evaporation cycle, while deep phreatic water and confined water follow an infiltration-runoff cycle.

Samples and Tests
This study adopts the principle of "unit method as the main method, grid method as a supplement" for the distribution of points. Considering the actual situation of the study area, the calculation units were preliminarily divided according to the hydrogeological data of the area. Large lakes, hills, canyons, and areas with intensive oil field production activities were excluded. Monitoring points were initially positioned based on a grid measuring 6.5 km × 6.5 km. Points within each grid cell were placed as close to the center of the cell as possible. The influence of factors such as topography and the distribution of pollution sources was also taken into account in selecting the points. Points were shifted where a pollution source was nearby or where there were traces of other significant disturbance from human activities. In areas with heightened human activity, the density of points was increased as appropriate. In the oilfield, the mountains, and other challenging terrain, the sampling points were thinned out. Ultimately, a total of 137 monitoring points were established, with their distribution illustrated in Figure 1c. Groundwater samples were collected in March 2022 and July 2022, and a total of 274 sets of groundwater samples were tested over the two periods. The items tested include K⁺, Ca²⁺, Na⁺, Mg²⁺, total hardness, total dissolved solids (TDS), SO₄²⁻, Cl⁻, HCO₃⁻, etc. This study carries out the background value study only for the seven main components: K⁺, Ca²⁺, Na⁺, Mg²⁺, SO₄²⁻, Cl⁻, and HCO₃⁻. The testing was carried out in accordance with the specifications, and the specific testing methods are shown in Table S1. The detection results of the indicators are shown in Tables S2 and S3.

Available Dataset
To ensure the reliability of the data used in calculating groundwater background values, the acquired datasets should first be screened for validity. Data screening primarily involves conducting a quality audit and checking the balance of anions and cations. The quality audit encompasses evaluating data integrity (e.g., sample completeness, accuracy of detection indices) and analyzing the causes of abnormal data. The anion-cation balance check determines whether the charges of the main anions and cations are balanced [23,24]. Imbalanced samples (relative error > 10%) cannot be used in the calculation of background values of the main components and need to be excluded. After data screening, no significant anomalies were found in the 274 sets of test samples, thus yielding a valid dataset for subsequent analysis and calculation.
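The charge-balance check described above can be sketched as follows. This is a minimal illustration, not the authors' procedure: it assumes concentrations are given in mg/L and that the relative error is defined as (Σcations − Σanions)/(Σcations + Σanions) × 100 in milliequivalents, which is one common convention. The names `IONS`, `charge_balance_error`, and `is_valid` are hypothetical.

```python
# Molar mass (g/mol) and signed charge of each major ion (standard values).
IONS = {
    "K": (39.10, 1), "Na": (22.99, 1), "Ca": (40.08, 2), "Mg": (24.31, 2),
    "Cl": (35.45, -1), "SO4": (96.06, -2), "HCO3": (61.02, -1),
}


def charge_balance_error(sample_mg_per_l):
    """Return the relative charge-balance error in percent.

    Assumed definition: 100 * (cations - anions) / (cations + anions),
    with both sums expressed in meq/L.
    """
    cations = 0.0
    anions = 0.0
    for ion, conc in sample_mg_per_l.items():
        mass, charge = IONS[ion]
        meq = conc / mass * abs(charge)  # mg/L -> meq/L
        if charge > 0:
            cations += meq
        else:
            anions += meq
    return 100.0 * (cations - anions) / (cations + anions)


def is_valid(sample_mg_per_l, limit=10.0):
    """Keep a sample only if its absolute error is within the limit (10% in the text)."""
    return abs(charge_balance_error(sample_mg_per_l)) <= limit
```

For example, a sample with 229.9 mg/L Na⁺ and 354.5 mg/L Cl⁻ carries about 10 meq/L of each charge and passes, whereas halving the Cl⁻ concentration produces an error of roughly 33% and the sample would be excluded.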

Division of Calculation Units
Various factors, including aquifer hydrogeological characteristics, groundwater flow conditions, and land use, can influence the background value of groundwater, leading to variations between neighboring areas [25]. With the help of geospatial data, areas with different attributes can be classified more reliably and the background values calculated by category. Hence, this study divides the calculation units for groundwater background values based on hydrogeological units, groundwater chemical characteristics, and the spatial distribution of TDS. The specific division steps are as follows. First, regional hydrogeological map data were used to delineate the different hydrogeological units. Second, on the basis of the hydrogeological units, further refinement and adjustment were made according to the actual groundwater chemical characteristics and TDS data. Accordingly, the study area was finally divided into ten calculation units, and the specific division is shown in Figure 1c.

Removal of Outliers
Screening out data that may have been affected by human activities is an essential step in assessing groundwater background values. The process of identifying and removing outliers in this study is shown in Figure 2. First, the Piper trilinear plot method is used to identify and remove the outlier data. During the identification process, the Piper trilinear plot is converted into a two-dimensional coordinate plot. Second, the similarity of each hydrochemical plot is assessed using the Mahalanobis distance, which measures the similarity between two unknown sample sets. This measure of similarity is then used to identify hydrochemical abnormalities among the sample points [26,27]. The two-dimensional coordinate conversion of the Piper trilinear plot uses the following quantities: $m_c$ is the concentration of each cation in terms of the amount of substance; $m_a$ is the concentration of each anion in terms of the amount of substance; $m_{\mathrm{Na+K}}$ is the sum of the concentrations of K⁺ and Na⁺ in terms of the amount of substance; $m_{\mathrm{HCO_3+CO_3}}$ is the sum of the concentrations of HCO₃⁻ and CO₃²⁻ in terms of the amount of substance; and $v$ is the charge of the corresponding ion.
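As an illustration of how a sample can be reduced to two coordinates from the quantities defined above, the sketch below uses the equivalent fraction of Na⁺ + K⁺ among the cations and the equivalent fraction of HCO₃⁻ + CO₃²⁻ among the anions. This particular projection is an assumption chosen for simplicity and should not be read as the conversion formula used in this study; `piper_xy`, `CHARGE`, `CATIONS`, and `ANIONS` are hypothetical names.

```python
# Charge magnitude v of each ion.
CHARGE = {"K": 1, "Na": 1, "Ca": 2, "Mg": 2, "Cl": 1, "SO4": 2, "HCO3": 1, "CO3": 2}
CATIONS = ("K", "Na", "Ca", "Mg")
ANIONS = ("Cl", "SO4", "HCO3", "CO3")


def piper_xy(m):
    """Map one sample to (x, y) in the unit square.

    m maps ion name -> concentration in mmol/L (amount of substance);
    ions that are missing from m are treated as 0.
    x = (m_Na+K * v) / sum(m_c * v)         over all cations
    y = (m_HCO3+CO3 * v) / sum(m_a * v)     over all anions
    """
    eq = {ion: m.get(ion, 0.0) * CHARGE[ion] for ion in CHARGE}
    total_cations = sum(eq[ion] for ion in CATIONS)
    total_anions = sum(eq[ion] for ion in ANIONS)
    x = (eq["Na"] + eq["K"]) / total_cations
    y = (eq["HCO3"] + eq["CO3"]) / total_anions
    return x, y
```

Each sample then becomes one point in the plane, which is the form needed for a distance-based similarity measure.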
Sustainability 2024, 16, x FOR PEER REVIEW
The Mahalanobis distance formula is as follows:
$$D_a^2 = (X - \bar{X})^{T} S^{-1} (X - \bar{X})$$
where $X$ denotes the coordinate values of the array; $\bar{X}$ denotes the mean of the coordinate values of the array; $S^{-1}$ denotes the inverse of the covariance matrix; and $D_a$ denotes the Mahalanobis distance.
After obtaining the Mahalanobis distances corresponding to Piper's trilinear plots, the mean plus or minus three times the standard deviation of the Mahalanobis distances was used as the critical distance ($D_i^2$). The magnitude of $D_i^2$ was then compared with that of $D_a^2$, and all samples whose $D_a^2$ values exceeded the critical distance $D_i^2$ were excluded. That is, the Mahalanobis distances of all remaining samples lay within the range of the mean plus or minus three times the standard deviation. Next, the mean, covariance matrix, and Mahalanobis distances were recomputed for the remaining samples, and $D_i^2$ was again compared with $D_a^2$. This process was repeated until the Mahalanobis distances of all remaining samples were below the critical distance $D_i^2$, indicating the absence of outlier samples [25].
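The iterative screening can be sketched for two-dimensional points as follows. This is a minimal illustration under stated assumptions: the sample (n − 1) covariance and sample standard deviation are used, only the upper bound (mean + 3 standard deviations of the squared distances) acts as the critical distance, and the function names are hypothetical.

```python
import statistics


def mahalanobis_sq(points):
    """Squared Mahalanobis distance of each 2-D point to the sample mean."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    # Sample covariance matrix [[sxx, sxy], [sxy, syy]].
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    det = sxx * syy - sxy ** 2
    distances = []
    for x, y in points:
        dx, dy = x - mx, y - my
        # (dx, dy) S^-1 (dx, dy)^T with the closed-form 2x2 inverse.
        distances.append((syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det)
    return distances


def remove_outliers(points):
    """Drop points above the critical distance until none remain above it."""
    kept = list(points)
    while len(kept) > 3:
        d2 = mahalanobis_sq(kept)
        critical = statistics.mean(d2) + 3 * statistics.stdev(d2)
        remaining = [p for p, d in zip(kept, d2) if d <= critical]
        if len(remaining) == len(kept):
            break  # no sample exceeds the critical distance
        kept = remaining  # recompute mean and covariance on the next pass
    return kept
```

Recomputing the mean and covariance after each pass matters because a single extreme sample inflates the covariance and can mask less extreme outliers in the first pass.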
After abnormal points had been identified and excluded by Piper's trilinear plot method, Grubbs' method or the mean plus or minus three times the standard deviation method was used to identify anomalies in each index of the remaining points. As long as one index was abnormal, the main component data of the sample were regarded as abnormal. Grubbs' method can be used when the number of samples n < 100. When the number of samples n ≥ 30, the mean plus or minus three times the standard deviation method is used. In the present study, Grubbs' method was applied at the significance level α = 0.01 for the exclusion of abnormal data [28,29]. If the calculated statistic G is greater than the critical value at the significance level α in the table of critical values of Grubbs' test (Table S4), the value is considered an outlier [25,30,31]:
$$G = \frac{|X_d - \bar{X}|}{S_n}$$
where $X_d$ is the suspect value to be tested, and $\bar{X}$ and $S_n$ are the mean and standard deviation calculated from the n values including the suspect value.
The mean method uses the average value plus or minus three times the standard deviation of each water chemical index as the criterion and performs cyclic elimination until all remaining samples are within the range of the average value plus or minus three times the standard deviation.
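Both elimination rules can be sketched as below. This is an illustrative sketch rather than the study's implementation: the Grubbs critical value is supplied by the caller through a hypothetical callable `g_crit(n)` (in practice it would be looked up in a table such as Table S4), the statistic is taken as |suspect − mean| / standard deviation, and the sample standard deviation is assumed.

```python
import statistics


def grubbs_statistic(values):
    """Return (G, index) for the value farthest from the mean."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    idx = max(range(len(values)), key=lambda i: abs(values[i] - mean))
    return abs(values[idx] - mean) / sd, idx


def grubbs_filter(values, g_crit):
    """Repeatedly drop the most extreme value while G exceeds g_crit(n)."""
    kept = list(values)
    while len(kept) > 2:
        g, idx = grubbs_statistic(kept)
        if g <= g_crit(len(kept)):
            break
        del kept[idx]
    return kept


def three_sigma_filter(values):
    """Cyclic elimination until every value lies within mean +/- 3 sd."""
    kept = list(values)
    while len(kept) > 2:
        mean = statistics.mean(kept)
        sd = statistics.stdev(kept)
        inside = [v for v in kept if abs(v - mean) <= 3 * sd]
        if len(inside) == len(kept):
            break
        kept = inside
    return kept
```

One reason to reserve the three-standard-deviation rule for larger samples: with the sample standard deviation, no value can lie more than (n − 1)/√n standard deviations from the mean, so for n ≤ 10 the rule can never flag anything.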

Groundwater Background Value Statistics and Characterization
The data obtained after excluding outliers serve as the dataset for groundwater background value analysis. The background value statistics of the main components comprise three steps: testing the distribution type of the background value data, calculating the background value statistics, and adjusting the background value range. After removing the outliers, the Shapiro-Wilk test is used to determine the distribution type of the background data for each index, with the goodness-of-fit test taken at the significance level α = 0.05. Three distribution types are distinguished: normal distribution, lognormal distribution, and other distributions. For background data with a normal or lognormal distribution, the median is used as the central value. The 5% quantile is used as the lower limit of the background value and the 95% quantile as the upper limit, which together form the background value domain of the regional groundwater. Where the lower limit of the background value is less than 0, the lower quartile of the background data is used instead. For other distribution types, the background domain is defined as "median ± two times the median absolute deviation." Using this statistical method, the range of background values for the indicators in each calculation unit is determined separately for the flood and dry periods. To ensure that the statistical findings accurately reflect real-world conditions, the background values of identical indicators should be compared across calculation units and between the two periods and checked for coherence and consistency.
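The range calculation can be sketched as follows, assuming the lower limit is the 5th percentile and the upper limit the 95th percentile for normal or lognormal data, and median ± 2 × the median absolute deviation otherwise. The distribution type is passed in as a label, since the Shapiro-Wilk test itself is not implemented here (it could be run separately, for example with `scipy.stats.shapiro`). The linear-interpolation percentile and the function names are assumptions.

```python
import math
import statistics


def percentile(values, q):
    """Percentile with linear interpolation between order statistics, q in [0, 100]."""
    s = sorted(values)
    pos = (len(s) - 1) * q / 100
    lo, hi = math.floor(pos), math.ceil(pos)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)


def background_range(values, distribution):
    """Return (lower limit, median, upper limit) for one index in one unit.

    distribution: "normal", "lognormal", or any other label for "other".
    """
    med = statistics.median(values)
    if distribution in ("normal", "lognormal"):
        lower, upper = percentile(values, 5), percentile(values, 95)
    else:
        mad = statistics.median(abs(v - med) for v in values)
        lower, upper = med - 2 * mad, med + 2 * mad
    if lower < 0:
        # A negative concentration is not meaningful; fall back to the lower quartile.
        lower = percentile(values, 25)
    return lower, med, upper
```

Calling `background_range` once per indicator, calculation unit, and sampling period yields the table of intervals that is compared across units and periods.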

Calculation of the Groundwater Contamination Onset Values
The upper limit of the background value (ABV) of the regional groundwater environment obtained from the calculation is compared with the Class III groundwater quality standard (SV) to obtain the groundwater contamination onset value (TV). The Class III groundwater quality limits are shown in Table 1.
(1) If the upper limit of background value is less than the groundwater quality Class III water standard: (2) If the upper limit of background value is more than the groundwater quality Class III water standard:

Note to Table 1: "−" indicates no standard.
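The two-case comparison of ABV with SV can be sketched as below. The specific rule is an assumption based on a commonly used convention for threshold values, not a formula taken from this text: when ABV is below SV, TV is taken as the midpoint of the two; otherwise TV equals ABV. The function name `onset_value` is hypothetical.

```python
def onset_value(abv, sv):
    """Contamination onset value (TV) from the background upper limit and the standard.

    Assumed convention (not confirmed by the text):
      ABV <  SV  ->  TV = (ABV + SV) / 2
      ABV >= SV  ->  TV = ABV
    """
    if abv < sv:
        return (abv + sv) / 2
    return abv
```

Under this assumed rule, an index with ABV = 100 mg/L and SV = 250 mg/L would have an onset value of 175 mg/L, while an index whose natural upper limit already exceeds the standard keeps its own upper limit as the onset value.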

Groundwater Chemical Characteristics
Groundwater in the investigation area comprises multiple types. Under the combined effects of dissolution and leaching, ion exchange and adsorption, and other processes, the groundwater in the area shows clear zoning in both water chemistry type and TDS. In this evaluation of groundwater chemical characteristics, groundwater is first classified according to TDS content and then named according to the milliequivalent percentages of the ions. The Piper trilinear diagram can also be used to evaluate the hydrochemical evolution and hydrochemical types of groundwater in a basin [32][33][34][35]. Combining the analytical test results of the groundwater samples with the Piper trilinear diagram of the study area (Figure 3), the chemical types of the groundwater in Karamay City can be divided into the following main categories: Na•Mg-HCO₃, Na-HCO₃•Cl, Na-SO₄•Cl, Na•Mg-SO₄•Cl, Na-HCO₃•SO₄•Cl, Na-Cl, Na•Mg-Cl, Na•Ca•Mg-HCO₃, and Na•Mg-HCO₃•SO₄. The TDS content of groundwater also varies markedly across the area.
The Na•Mg-HCO₃ water is mainly distributed in most parts of units 1 and 2, the western part of calculation unit 3, and the western part of unit 4. These areas are located in the western part of Baikouquan and the Wuerhe River. The groundwater mainly relies on recharge by groundwater runoff from the upstream of the southern part of the river, with good runoff conditions, and the water is freshwater. The Na-HCO₃•Cl water is mainly distributed in the western part of units 3 and 4 and part of the western side of Little Erik Lake in unit 5. Most of the TDS values in this groundwater are less than 3 g/L, except for the area close to Little Erik Lake, where TDS is 3-10 g/L and the water is salty. The Na-Cl groundwater has a broader area of distribution. It occupies most of Karamay District and part of the eastern part of Xiaokai Township, spanning calculation units 5, 6, 7, 8, and 9. Except for the southern part of unit 8, where the groundwater is mostly freshwater, the TDS content of this groundwater is greater than 10 g/L and locally even greater than 50 g/L. The Na•Mg-SO₄•Cl water is distributed in the north-central part of unit 7, which is a piedmont alluvial-proluvial plain. The groundwater mainly relies on lateral groundwater runoff from the mountainous areas for recharge, and the runoff conditions are relatively poor, with the TDS content of the groundwater greater than 10 g/L, which is saline water. The Na-SO₄•Cl water is distributed in the northern part of unit 8 and is salt water according to its TDS content. The Na-HCO₃•SO₄•Cl and Na•Mg-Cl water types are primarily found in unit 9. The former dominates the majority of the area within unit 9, while the latter forms a band extending from the south to the central part of the unit. Unit 10 contains two types of groundwater, Na•Ca•Mg-HCO₃ and Na•Mg-HCO₃•SO₄, which are distributed in the north and south of the unit, respectively. Calculation unit 10 is the Dushanzi District, which belongs to the alluvial-proluvial plain. The groundwater there mainly relies on recharge by groundwater runoff; the runoff conditions are relatively good, and the TDS values of the groundwater are all less than 1 g/L.

The Elimination of the Background Value Outliers
To obtain reasonable background values of the groundwater environment, Piper's trilinear plot method and Grubbs' method were used, on the basis of the division into calculation units, to identify and eliminate outliers in the indicator concentration data at the monitoring points of each calculation unit. Box plots were used to test and evaluate the effect of outlier removal.
Through screening, a total of 42 groups of outliers were identified and removed from the 274 sets of test sample concentration data in the study area, including 19 groups from the flood period and 23 groups from the dry period. After identification, data verification and cause analysis of the outliers should be carried out. The outliers in the groundwater monitoring data can be divided into natural outliers and anthropogenic outliers. Among the 42 groups of outliers identified in this study, LY50 in unit 9 during the flood period may have been affected by anthropogenic activities due to its proximity to farmland and residential areas. LY59 in unit 10 is located in the urban area of Dushanzi, with roads and enterprises distributed within a range of 500 m. Based on the monitoring location, pollution source distribution, and analysis of anthropogenic factors, it is inferred that the outliers at this location may have been influenced by human activities. LY53 and LY56 in unit 9 during the dry period may have been affected by human activities due to the proximity of farmland. The other outliers were mostly identified by Grubbs' method, which focuses on checking the consistency of each dataset: when the concentration value of an indicator is too high and exceeds the threshold value, it is excluded.
To test the effect of the rejection, box plots were used to visualize the removal of the outliers. The effect of the two rejection methods was also assessed by the dispersion of the data distribution before and after the rejection. A larger difference between the outliers and the quartiles of an indicator indicates a larger dispersion of the data in the group and a worse rejection effect [25]. Taking the box plots before and after the rejection of the indicator concentration data in the flood period as an example (Figure 4), the box plots first clearly show the 19 outliers identified by Piper's trilinear plot method and Grubbs' method before rejection. Second, the dispersion of the data before rejection of the outliers was larger than that after rejection, which indicates that the two methods were effective in rejecting the outliers.

Analysis of Groundwater Background Values and Contamination Onset Values
Following the removal of outliers, the remaining dataset underwent a distribution test. Groundwater background values were then characterized according to distribution type. Table 2 shows the background value intervals of the seven monitoring indicators in the ten calculation units for the two sampling periods. The table shows that the data for each indicator in each unit mainly follow either a normal or a lognormal distribution, with the normal distribution being the most common. For both distribution types, the 5th and 95th percentiles were used as the limits of the background value. Figure 5 shows the background value intervals of the seven indicators in each unit during the two sampling periods, which intuitively reflects the spatial and temporal distribution characteristics of the groundwater background values of the seven indicators in Karamay City.
From a spatial distribution perspective, Na + , Cl − , and SO 4 2− are the most prevalent ions in Karamay City and are widely distributed. The groundwater background values of these indicators in units 6, 7, and 8 are significantly higher than those in other calculation units. The total dissolved solids (TDS) content in these three units is also higher than in other units, indicating a greater degree of mineralization. In contrast, the groundwater background values of the indicators in Dushanzi District, i.e., in calculation units 9 and 10, are lower, and the mineralization degree is less than 1 g/L.
In terms of temporal distribution, the clearest differences between the two sampling periods appear in Ca 2+ , Na + , and Cl − , while the other indicators change little or remain essentially the same. Firstly, the TDS content in Karamay City is high during both sampling periods and shows little variation, being slightly higher during the flood period than the dry period. Secondly, among the four major cation components in groundwater, the background values of Ca 2+ and Na + change significantly between the flood and dry periods. In units 4, 6, and 8, the background values of Ca 2+ during the flood period are significantly higher than those during the dry period. In unit 5, the background values of Ca 2+ during the flood period are significantly lower than those during the dry period. The other units show no significant differences in the background values of Ca 2+ between the two sampling periods. For Na + , the background values in units 5 and 6 are slightly lower during the flood period than the dry period, with little difference in the other calculation units. Among the three major anion components, only the Cl − background values in units 5, 6, and 8 are significantly lower during the flood period than the dry period, while the other units and the remaining two anions show little difference in background values between the two sampling periods.
The groundwater contamination onset value is a direct application of the groundwater background value in management. Given its calculation formula, the contamination onset value has essentially the same spatial and temporal distribution as the groundwater background value. The results of its calculation for the two sampling periods are shown in Table 2, and its spatial distribution in Karamay City in the different sampling periods is shown in Figure 6.
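The percentile rule described above for the background value limits can be illustrated with a short sketch. It is not the authors' implementation: the text does not state how the 5th and 95th percentiles were obtained, so the sketch assumes they are read from a normal distribution fitted to the data (or to the log-transformed data in the lognormal case). The function name `background_interval` and the labels `"normal"` and `"lognormal"` are hypothetical.

```python
import math
import statistics


def background_interval(values, distribution="normal"):
    """Return (lower, upper) limits as the 5th and 95th percentiles.

    Assumption of this sketch: percentiles come from a distribution fitted
    to the outlier-free data, not from the empirical sample quantiles.
    """
    if distribution == "lognormal":
        # Fit in log space, then transform the limits back to concentration units.
        logs = [math.log(v) for v in values]
        fitted = statistics.NormalDist(statistics.mean(logs), statistics.stdev(logs))
        return math.exp(fitted.inv_cdf(0.05)), math.exp(fitted.inv_cdf(0.95))
    # Normal case: fit directly on the measured concentrations.
    fitted = statistics.NormalDist(statistics.mean(values), statistics.stdev(values))
    return fitted.inv_cdf(0.05), fitted.inv_cdf(0.95)
```

For a normally distributed indicator the resulting interval is symmetric about the sample mean, whereas for a lognormal indicator it is symmetric about the geometric mean on a logarithmic scale, which is why the two distribution types need to be distinguished before the limits are computed.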

Discussion
(1) In this study, the hydrogeological unit zoning of Karamay City was optimized and adjusted. The city was first divided into seven fourth-level hydrogeological unit zones. Following the calculation unit division method, these fourth-level hydrogeological units were then refined by comprehensively considering geomorphology, lithology, and groundwater types, and ten calculation units were finally delineated. The calculation units can basically reflect the spatial distribution of the groundwater environment in the investigation area, indicating that the delineation process is reasonable. In addition, compared with studying the groundwater background value of the study area as a whole, the division into calculation units can improve the accuracy of the calculation results. This is because groundwater varies regionally: analyzing the area as a whole can cause normal data to be excluded or abnormal data to be retained, which reduces the scientific soundness and reliability of the calculation results.
(2) As can be seen from Figure 4, the combination of Piper's trilinear plot method and Grubbs' method used in this study is effective in removing outliers. The rejection process also makes it evident that combining hydrochemical and numerical statistical methods is essential. Numerical statistics alone cannot fully identify outliers in monitoring data, whereas the hydrochemical method takes full account of the hydrochemical characteristics of the regional groundwater, so the two together identify outliers more reliably. In addition, the analysis of the causes of the outliers, i.e., their attribution to anthropogenic or natural causes, relies to a large extent on judgment after site investigation and a review of historical data, which has certain limitations. How to differentiate between anthropogenic and natural factors more objectively therefore merits further investigation.
(3) In this study, groundwater chemical characteristics were classified based on total dissolved solids (TDS) content and ion milligram equivalent percent, in conjunction with Piper's trilinear diagram. Karamay City is located in the tilted plain at the front of the alluvial fans on the northwest edge of the Junggar Basin. Its topography is dominated by extensive plains, with hills in parts of the west of the city. The main components of groundwater in the plain areas studied in this paper are K + , Ca 2+ , Na + , Mg 2+ , SO 4 2− , Cl − , and HCO 3 − . Among the cations, Ca 2+ and Mg 2+ have similar chemical properties and easily form water-insoluble salts with anions. However, Ca 2+ is abundant, while Mg 2+ is relatively scarce. The salts of Na + and K + are highly soluble in water, and these ions readily undergo ion exchange and adsorption onto the surfaces of medium particles. Among the anions, HCO 3 − is chemically unstable, with higher concentrations in recharge areas, where it is mainly influenced by the recharge source. SO 4 2− is chemically stable, and the salts formed by Cl − with the major cations are highly soluble in water [36]. These groundwater chemical characteristics, combined with cation exchange adsorption during water-rock interactions, have resulted in the salinization of groundwater in Karamay City.
(4) The reasons behind the groundwater background value results in Karamay City are analyzed as follows. The groundwater is recharged mainly by atmospheric precipitation, followed by surface water. Atmospheric precipitation in arid areas is mostly mixed with dust and is generally dominated by Ca 2+ and HCO 3 − . Its TDS is relatively high and can reach 0.1 g/L to 1 g/L. Therefore, the interaction of atmospheric precipitation and surface water with rock and soil has a great influence on the relevant components of groundwater [37]. The groundwater background values of Na + and Cl − in each calculation unit in Karamay City were mostly higher than those of the other five indicators in both sampling periods, which may be due to the evaporation-concentration effect, whose direct result is an increase in mineralization [38]. Additionally, in terms of temporal distribution, the groundwater background values of Ca 2+ during the flood period in units 4, 6, and 8 are significantly higher than during the dry period, whereas the background values of Na + in these units are significantly lower during the flood period. This indicates that atmospheric precipitation recharge and evaporation concentration are both significant processes in these units. During the flood period, the higher precipitation leads to higher Ca 2+ background values. As precipitation decreases and evaporation concentration intensifies, Ca 2+ precipitates and its background value decreases, and the ions of soluble salts gradually become the main components, leading to higher Na + background values. In unit 5, the Ca 2+ background value during the flood period is significantly lower than during the dry period, which differs from the pattern in the other units. The likely reason is that unit 5 belongs to the Baiyang River hydrogeological unit, where groundwater recharge mainly occurs through vertical infiltration from the Baiyang River and discharge is primarily through artificial extraction. Where other forms of recharge dominate over atmospheric precipitation, the Ca 2+ -rich precipitation may not cause Ca 2+ enrichment.
The flowing groundwater carries components leached from the recharge area to the discharge area. In arid regions such as dry plains and lowland basins, where the water table is shallow, evaporation becomes the primary discharge pathway for groundwater. Evaporation removes only water, while the salts remain in the groundwater. Over time, the groundwater solution gradually concentrates and the TDS increases. As the concentration rises, salts with lower solubility reach saturation and precipitate, and the ions of readily soluble salts gradually become the main components [39]. The surface water flowing into the urban area of Karamay City comes mainly from the Baiyang River system. The flow is large in spring, when there is substantial diffuse overland flow, and the river then dries up in summer. The study area also has a typical temperate continental climate characterized by intense evaporation and concentration. Shallow groundwater rises continuously along soil capillaries in the aeration zone, where the water evaporates and the salt remains. The soluble salts in the soil are brought into the groundwater through repeated cycles of infiltration, dissolution, evaporation, and concentration [18,35,40,41], which is the primary reason for the high mineralization of groundwater in Karamay City.
Before evaporation and concentration, the TDS content of groundwater is low, the anions are dominated by HCO 3 − , followed by SO 4 2− , the content of Cl − is very low, and the cations are dominated by Ca 2+ and Mg 2+ . With evaporation and concentration, the sparingly soluble bicarbonates of calcium and magnesium partially precipitate, and SO 4 2− and Na + gradually become the main components. With further concentration, the water becomes saturated with sulfate, which begins to precipitate, forming high-TDS water dominated by Cl − and Na + .
Additionally, in a reducing environment with organic matter present, desulfurizing bacteria contribute to the reduction of SO 4 2− to H 2 S. As a result, SO 4 2− decreases to the point of disappearance, HCO 3 − increases, and the pH of the groundwater rises. The strong odor of H 2 S in some of the monitoring wells sampled in the mountain front area in the western part of the Karamay district is due to desulfurization in the oil field strata. Closed geological structures, such as oil reservoirs, are conducive environments for sulfate reduction. Therefore, some oilfield waters contain H 2 S while having very low SO 4 2− content.
(5) Through this study, the spatial distribution characteristics of the groundwater background values of seven indicators, namely K + , Ca 2+ , Na + , Mg 2+ , SO 4 2− , Cl − , and HCO 3 − , in Karamay City were preliminarily established. Combined with the groundwater contamination onset value, the background value is applied to the actual protection and management of groundwater, which is of practical significance for the evaluation and management of local groundwater environmental quality. In-depth application analysis of groundwater background values is therefore a possible direction for future research, moving the analysis of groundwater background values from the cognitive level to the application level.
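The numerical-statistics half of the outlier screening discussed in point (2) can be sketched as an iterative Grubbs-type rejection loop. This is a minimal illustration under stated assumptions, not the procedure used in the study: the function name `grubbs_reject` and the caller-supplied `critical_value_for(n)` are hypothetical, and the critical values themselves must come from a published Grubbs table for the chosen significance level. The hydrochemical screening with Piper's trilinear plot is not covered here.

```python
import statistics


def grubbs_reject(values, critical_value_for):
    """Iteratively remove the most extreme value while it fails Grubbs' test.

    critical_value_for: hypothetical callable mapping the current sample
    size n to the Grubbs critical value (taken from a published table).
    Returns (kept, rejected).
    """
    kept = list(values)
    rejected = []
    while len(kept) > 2:
        mean = statistics.mean(kept)
        sd = statistics.stdev(kept)
        if sd == 0:
            break  # all remaining values are identical; nothing to test
        # The suspect is the value farthest from the sample mean.
        suspect = max(kept, key=lambda v: abs(v - mean))
        g = abs(suspect - mean) / sd
        if g <= critical_value_for(len(kept)):
            break  # no further outlier at this significance level
        kept.remove(suspect)
        rejected.append(suspect)
    return kept, rejected
```

Because the test statistic depends only on the numbers, a value flagged this way may still be hydrochemically plausible and a value that passes may still be contaminated, which is the reason point (2) argues for pairing the statistical test with a hydrochemical check.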

Figure 1. An overview map of the study area. (a,b) Map of the geographic location of the study area; (c) Map of the division of the calculation units, also showing the location of the monitoring sites and other areas not included in the calculation units.

Figure 2. Flowchart of outlier identification and rejection.

Figure 3. Piper trilinear diagram of groundwater in the study area. (a) Piper trilinear diagram of groundwater in the 10 calculation units in Karamay City during the flood period; (b) Piper trilinear diagram of groundwater in the 10 calculation units in Karamay City during the dry period.

Figure 4. Box plots of the concentration data for each indicator before and after the removal of outliers. (a-j) Box plots of the concentration of each indicator before and after the removal of outliers in the 1st to 10th units during the flood period, respectively.
1 N in distribution type denotes normal distribution; 2 LN denotes lognormal distribution.

Figure 5. Plot of background value intervals for each indicator for two sampling periods.(a) Histogram of background value intervals of Ca 2+ in two sampling periods, where 1 indicates the flood period and 2 indicates the dry period.(b-g) Histograms of background value intervals for Mg 2+ , Na + , K + , Cl − , SO 4 2− , and HCO 3 − in two sampling periods, respectively.

Figure 6. Distribution of the groundwater contamination onset value in different calculation units. (a) Histogram of the spatial distribution of the groundwater contamination onset value for the ten units during the flood period; (b) Histogram of the spatial distribution of the groundwater contamination onset value for the ten units during the dry period.

Table 1. Groundwater quality Class III water criteria.

Table 2. Calculation results of background values.