Potential Toxic Elements Pollution Status in Zones of Technogenic Impact in Central Regions of Perú

Abstract: Soil is a component of the environment. An environmental policy should identify the sources of trace metals in the soil and their effects on people and other living beings. Element concentrations in 29 surface soil samples (0–25 cm) were determined using EPA Method 3050B. The data were analyzed using simple and robust statistical analysis that allowed for determining geochemical baseline values. Principal component and correlation analyses were performed, which, together with a spatial analysis, allowed us to distinguish between geogenic and anthropogenic sources. The degree of soil contamination was evaluated using different ecological indices, and the health risks to children and adults were calculated using formulas proposed by the United States Environmental Protection Agency (USEPA). The median concentrations of the analyzed elements (in mg/kg) are Al 17,666, As 8.7, Ba 61.4, Cd 0.17, Cr 11.3, Cu 20.5, Fe 25,953, Hg 0.06, Mn 499, Ni 20.8, Pb 15.9, and Zn 60.6. In the principal component analysis, four factors were identified that explain 70.3% of the variability of the elements, which, together with the correlation analysis, suggest that the origin of the elements is mainly geogenic with some possible anthropic contributions. The elements with moderate contamination in the soil are As, Cd, and Pb; in addition, As is the only element that showed a value above the limit for carcinogenic risk in children. The estimated geochemical baseline values (in mg/kg) are Al 34,734, As 15.3, Ba 113, Cd 0.41, Cr 33.8, Cu 42.9, Fe 46,181, Hg 0.12, Mn 1015, Ni 42.2, Pb 21.6, and Zn 121. Overall, 89.7% of the total samples are at a low level of contamination. The carcinogenic risk due to As in children affects 3.4% of the total samples, so it is considered insignificant.


Introduction
Soil contamination is a global problem in which abnormal concentrations of elements or substances that can affect the development of life are generated [1]. These elements can have a geogenic origin and be toxic at high concentrations [2], although contamination often originates from anthropogenic factors. One of the unique properties of soils is their ability to store long-term information about the state of health of the territory [3]. In Peru, soil contamination is a familiar concern. It is a mining and agricultural country with abundant forest and water resources and a population of approximately 33,396,700 people [4] that generates solid, organic, or liquid waste. This waste contaminates the water of streams and rivers and, when that water is used for irrigation, increases the concentration of physical and chemical components, leading to soil contamination [5]. Moreover, soil acts as a sink for a wide variety of emissions through the aerial deposition of particles emitted by human activities, which contain several heavy metals, some of them toxic. In recent years, researchers have studied the concentrations of heavy metals in the soils of agricultural and mining areas in Peru [6][7][8][9]. Their results indicate areas with moderate soil contamination, with As and Pb showing the highest concentrations that put people's health at risk. As a control measure, there is the [10], which was created to regulate the country's environmental aspects and thus minimize the impact of anthropogenic activities. Peru needs more research on soil contamination to protect human health and ecosystems.
Scientists around the world have not agreed on a single digestion method for the environmental interpretation of potentially toxic element (PTE) concentrations determined in soils. This study used EPA Methods 3050B [11] and 6020B [12]. Changes in environmental components, retrospective and predictive studies of particular systems, and regional and global generalizations of the state of the soil system are becoming increasingly important, since they make it possible to predict changes in the ecological properties of the soil when new human activities are introduced in the territory. Environmental monitoring of soils is required to identify changes in due time and to assess, predict, prevent, and eliminate the consequences of negative processes occurring in soil under technogenic impact. In the Central Region of Peru (the locations that make up this region are listed in the Supplementary Material), eight transmission lines of 138 kV, 220 kV, and 500 kV will be constructed and operated within the project "Nueva Yanango-Nueva Huánuco 500 kV link and associated substations". The transportation sector associated with the construction of transmission lines provides many services. Unfortunately, transportation also has a serious negative impact on the natural environment. Heavy metals are commonly considered one of the most important groups of inorganic pollutants related to the transportation sector and infrastructure development [13,14].
The objectives of this study are: (1) to determine the geochemical baseline values of Al, As, Ba, Cd, Cr, Cu, Fe, Hg, Mn, Ni, Pb, and Zn in soils of the Central Regions of Peru; (2) to estimate the level of contamination in soils based on different ecological indices; and (3) to assess the health risk of exposure in children and adults.

Study Area
The study zone is located in the Central Regions of Peru, between parallels 9°10′ and 11°40′ south latitude and meridians 74°40′ and 77°20′ west longitude (Figure 1), across the "Nueva Yanango-Nueva Huánuco 500 kV link and associated substations" project. The study zone was divided into two areas, Area 1 and Area 2, linked to the 500 kV and the 220 kV-138 kV transmission lines, respectively. The length of the 500 kV transmission line is approximately 187.46 km, while the associated 220 kV and 138 kV lines total approximately 208 km. The minimum widths of the easement strip considered for the project are 64 m for the 500 kV line, 25 m for the 220 kV line, and 20 m for the 138 kV line [15] (Figure 1).
The area of direct influence of the study was extended to 50 m on each side of the transmission lines, covering an area of 3788.28 ha in which a total of 115 population groups are located (rural communities, village centers, housing associations, sectors, and annexes), distributed in the departments of Junín (18), Pasco (6), Huánuco (90), and Ancash (1). The number of inhabitants is estimated at 13,040 [4]. The main economic activities are agriculture and, to a lesser extent, commerce. Crops include maize (Zea mays), olluco (Ullucus tuberosus), oca (Oxalis tuberosa), barley (Hordeum vulgare), wheat (Triticum genus), peas (Pisum sativum), and potatoes (Solanum genus).
Two climatic units were identified. The steppe highlands and puna (highland zone), with a dry and very sunny climate that is cold at night, are located in the departments of Ancash and Huánuco. The average monthly temperature varies from 11.1 °C to 23.9 °C and the average monthly precipitation is between 4.6 mm and 88.2 mm, with higher precipitation between December and March. The high jungle, with a warm and very humid climate, is located in the departments of Pasco and Junín. The average monthly temperature varies from 5 °C to 22 °C and the average monthly precipitation is between 10 mm and 132 mm, with higher precipitation between December and April. Information was also collected from 12 meteorological stations of the National Meteorological and Hydrological Service of Peru (SENAMHI) close to the study area (within a radius of approximately 50 km) [15].
The geology of the study zone is composed of a series of lithostratigraphic units ranging from the Neoproterozoic to the Cenozoic (Figure 2). Within the area located in the department of Ancash, conglomerates of the La Unión Formation and limestones of the Jumasha Formation and Celedín Formation are identified [16]. Within the department of Huánuco, the following units are identified: sequences of shales and gneisses of the Marañón Basal Complex, conglomerate rocks of the Ambo Group, continental rocks of the Mitu Group, carbonate series of the Pucará Group, sandstones and shales of the Sarayaquillo Formation, Goyllarisquizga Group and Oriente Group Formation, limestones of the Chulec-Pariatambo Formation, carbonate deposits of the Chonta Formation, sandstones of the Vivian Formation, limestones of the Jumasha Formation and Celedín Formation, reddish layers of sandstones and clays of the Huayabamba Group and Lantorache Formation, and conglomerates of the La Merced Formation and part of the La Unión Formation [16][17][18][19][20][21]. Lithostratigraphic units such as the Mitu Group, Sarayaquillo Formation, Oriente Group Formation, Chonta Formation (Ks-ch), and Lantorache Formation are also identified within the department of Pasco. The department of Junín includes clastic and carbonate rocks of the Copacabana Group and Chonta Formation and conglomerates of the La Merced Formation [19]. Quaternary deposits such as Morainic Deposits, Fluvioglacial Deposits, Alluvial Deposits (Qr-al), Colluvial Deposits, and Fluvial Deposits are observed in all the departments of the study area [16,17,21]. The study area includes various metallic mineralizations such as Au, Au-Ag, Cu-Pb-Zn-Ag, and Sb-bearing quartz veins found in micaceous shales of the Marañón Complex. Additionally, there are disseminations and veins of Cu-Ag and Pb-Zn-Ag in sandstones and shales of the Mitu Group. Stratiform bodies of Pb-Zn-Ag are present in limestones of the Pucará Group, and copper deposits in sedimentary rock are found in the Goyllarisquizga Group, as reported by [16].

Soil Analysis
The concentrations of the elements studied were obtained from the Environmental Impact Study carried out by Consorcio Transmantaro S.A. in 2019 for the "Nueva Yanango-Nueva Huánuco 500 kV Link and Associated Substations" project [15].
A total of 29 soil samples, 0-25 cm in depth, were collected adjacent to the vertices of the entire line route. Samples were taken with a plastic shovel and placed in zip-lock bags for transport. Sample selection was based on environmental criteria such as lithology, life zones, and current land use, as well as technical criteria such as the location of the project components. The samples were obtained to characterize the areas where the main and secondary components of the project will be located and where some contamination could be generated in the future.

Soil Data Analysis
Concentrations reported below the detection limit were replaced by ½ of the detection limit; duplicate samples were averaged to obtain a single value. Statistical analysis of the data set was performed using RStudio and IBM SPSS Statistics v.21 software (IBM, New York, United States). Histograms, box-plot diagrams, and cumulative frequency plots were obtained for exploratory data analysis (EDA); measures of central tendency, distribution, and dispersion were calculated in univariate statistics; Spearman correlation analysis and principal component analysis (PCA) were used to help identify the origin of pollution sources.
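The two data-cleaning rules above can be sketched in a few lines of Python. This is only an illustration, not the authors' RStudio or SPSS workflow: the function name `clean_results`, the use of `None` to mark a result below the detection limit, and the sample values are all assumptions.

```python
from statistics import mean


def clean_results(records, detection_limit):
    """Return one concentration per sample ID.

    records: iterable of (sample_id, value) pairs, where value is a float in
    mg/kg, or None for a result reported below the detection limit
    (the None convention is an assumption of this sketch).
    """
    by_sample = {}
    for sample_id, value in records:
        # Censored results are replaced by half of the detection limit.
        substituted = detection_limit / 2 if value is None else value
        by_sample.setdefault(sample_id, []).append(substituted)
    # Duplicate analyses of the same sample are averaged into a single value.
    return {sid: mean(values) for sid, values in by_sample.items()}


# Hypothetical Cd results (mg/kg) with an assumed detection limit of 0.1 mg/kg.
records = [("S1", 0.2), ("S1", 0.4), ("S2", None)]
cleaned = clean_results(records, detection_limit=0.1)
```

Here sample S1 has a duplicate that is averaged, and S2 is a censored result that receives half of the assumed detection limit.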

Spatial Distribution
Spatial distribution maps were produced in ArcGIS v.10.5 using ordinary kriging for the twelve elements under study. Figures 8-10 contain the resulting maps, with ranges divided at the 25th, 50th, 75th, 90th, and 95th percentiles.

Threshold Values
The establishment and use of threshold values or geochemical baselines allow the comparison of the estimated value with the actual concentration of the samples and are an important factor in the assessment of possible contamination in soils [22].
In general, there are geochemical (direct) and statistical (indirect) methods. For this study, three statistical methods were applied and compared: the "MAD" method, which uses the median ± 2 × the median absolute deviation [23]; the upper whisker method, which is calculated as the 75th percentile + 1.5 × the interquartile range [24,25]; and the 95th percentile [26,27]. In this study, the MAD method was used for the geochemical background values.
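As a rough illustration of how the three upper thresholds compare, the sketch below computes them for a list of concentrations using only the Python standard library. Several details are assumptions, because the text does not state them: the MAD is taken on the raw (untransformed) data, it is unscaled by default (`mad_scale=1.0`; a consistency constant such as 1.4826 could be passed instead), and percentiles use linear interpolation between order statistics. The input values are hypothetical.

```python
from statistics import median


def percentile(values, p):
    """Percentile with linear interpolation between order statistics."""
    x = sorted(values)
    k = (len(x) - 1) * p / 100
    lower = int(k)
    upper = min(lower + 1, len(x) - 1)
    return x[lower] + (x[upper] - x[lower]) * (k - lower)


def upper_thresholds(values, mad_scale=1.0):
    """Upper threshold of each method for one element (same units as input)."""
    med = median(values)
    # Median absolute deviation from the median (optionally rescaled).
    mad = mad_scale * median([abs(v - med) for v in values])
    q1, q3 = percentile(values, 25), percentile(values, 75)
    return {
        "median+2MAD": med + 2 * mad,
        "upper_whisker": q3 + 1.5 * (q3 - q1),  # 75th percentile + 1.5 IQR
        "P95": percentile(values, 95),
    }


# Hypothetical concentrations (mg/kg), not data from the study.
thresholds = upper_thresholds([1.0, 2.0, 3.0, 4.0, 5.0])
```

A sample is then counted as an outlier for a given method when its concentration exceeds the corresponding upper threshold.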

Environmental Indices
The potential for soil contamination was assessed using several well-established environmental indices [28][29][30]: the geoaccumulation index ($I_{GEO}$), enrichment factor (EF), contamination factor ($C_f$), pollution index (IPI), and degree of contamination ($C_{DEG}$) [25,28,31–33].
The geoaccumulation index ($I_{GEO}$) is calculated using Equation (1):
$$I_{GEO} = \log_2\left(\frac{C_n}{1.5\,B_n}\right) \qquad (1)$$
where $C_n$ is the measured content of the element in the soil, $B_n$ is the estimated geochemical baseline value of that element in the soil, and the constant 1.5 allows for natural fluctuations of the baseline. The classification according to [29] is shown in Supplementary Material, Table S1 [29,30].
The enrichment factor (EF) is computed using Equation (2). This metric normalizes a tested element against a reference element, which is chosen for its consistently low variability of occurrence. As suggested by [34], Sc, Mn, Ti, Al, and Fe are frequently employed as reference elements. In this investigation, Al serves as the reference element for eleven of the twelve analyzed elements, and Fe is the reference element for standardizing Al. These elements were chosen because of their minimal variability in the soils traversed by the transmission line.
$$EF = \frac{C_i / C_{ref}}{C_{oi} / C_{oref}} \qquad (2)$$
where $C_i$ is the concentration of the element in the sample, $C_{oi}$ is the corresponding geochemical baseline value, $C_{ref}$ is the concentration of the reference element in the sample, and $C_{oref}$ is the geochemical baseline value of the reference element (in mg/kg). Table S6 shows the classification of EF [30].
The contamination factor is calculated using Equation (3):
$$C_f = \frac{C_i}{C_{oi}} \qquad (3)$$
where $C_i$ is the concentration of the element in the sample and $C_{oi}$ is the corresponding geochemical baseline value. Table S7 shows the classification of the contamination factor [28].
The pollution index is defined as the average of the contamination factors calculated for each of the $n$ elements. Its calculation is presented in Equation (4):
$$IPI = \frac{1}{n}\sum_{i=1}^{n} C_f^{\,i} \qquad (4)$$
The classification of IPI values is shown in Table S1.
The last index is the degree of contamination ($C_{DEG}$), defined as the sum of the individual contamination factors. Its calculation is presented in Equation (5):
$$C_{DEG} = \sum_{i=1}^{n} C_f^{\,i} \qquad (5)$$
The classification of the degree of contamination proposed by [28] was based on eight elements. In this study, $C_{DEG}$ was used to evaluate the five pollutants with EQS values for agricultural soils, so the classification was modified as proposed by [32]. Table S8 shows the classification of the degree of contamination.
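The five indices described above reduce to a few arithmetic operations, sketched below in Python. The function names are illustrative, and the geoaccumulation index is written in its commonly used form with the factor 1.5, which is assumed here to be the form used in the study. The As values (58.1 mg/kg at sampling point 20 and a baseline of 15.3 mg/kg) come from the text; the list of contamination factors at the end is hypothetical.

```python
import math


def igeo(c_n, b_n):
    """Geoaccumulation index: log2 of concentration over 1.5 x baseline."""
    return math.log2(c_n / (1.5 * b_n))


def enrichment_factor(c_i, c_ref, c_oi, c_oref):
    """Sample ratio to the reference element divided by the baseline ratio."""
    return (c_i / c_ref) / (c_oi / c_oref)


def contamination_factor(c_i, c_oi):
    """Concentration in the sample divided by the baseline value."""
    return c_i / c_oi


def pollution_index(cfs):
    """Average of the contamination factors of all elements."""
    return sum(cfs) / len(cfs)


def contamination_degree(cfs):
    """Sum of the contamination factors of all elements."""
    return sum(cfs)


# As at sampling point 20 against the As baseline reported in the study.
igeo_as = igeo(58.1, 15.3)
cf_as = contamination_factor(58.1, 15.3)

# Hypothetical contamination factors for one sample (not study data).
example_cfs = [0.4, 0.9, cf_as]
ipi = pollution_index(example_cfs)
c_deg = contamination_degree(example_cfs)
```

Each result would then be compared with the class limits given in the supplementary classification tables.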

Environmental Indices with Peruvian Environmental Quality Standards (EQS) for Soil
Additionally, for comparison purposes, the concentrations of potentially toxic elements were compared against the maximum permissible limits established by Peruvian legislation in Supreme Decree N° 011-2017-MINAM [35] for agricultural soils.
The EQS values are presented in Table S2 and are established for some elements.

Health Risks
Carcinogenic and non-carcinogenic health risks were evaluated using formulas derived by the US Environmental Protection Agency [36][37][38][39].
High concentrations of potentially toxic elements (PTEs) in soils pose various health risks to humans. To evaluate these risks, the United States Environmental Protection Agency (USEPA) has established reference doses (RfDs) that allow for the assessment of carcinogenic and non-carcinogenic risks associated with ingestion, dermal absorption, and inhalation as exposure pathways [36,39,40]. The daily doses received through ingestion ($ADI_{ing}$), dermal absorption ($ADI_{dermal}$), and inhalation ($ADI_{inh}$) were calculated using Equations (6)-(8), adapted from various USEPA publications [36,38,39,40].
where the total element concentration in soil is denoted by $C_{soil}$ (mg/kg), the intake rate by IR (mg/day), the exposure frequency by EF (days/year), the exposure duration by ED (years), body weight by BW (kg), and the average exposure time by AT (days), calculated as AT = ED × 365. The surface area is represented by SA (cm²), the skin adhesion factor by AF (mg/cm²), the dermal absorption factor by ABS (dimensionless), the inhalation rate by InhR (m³/day), and the particulate emission factor by PEF (m³/kg). The values of these parameters used in the study are presented in Table S3.
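For orientation, the sketch below writes the three daily doses in the form commonly found in USEPA-based soil risk assessments. It is an assumption that the study's Equations (6)-(8) take exactly this form; in particular, the factor 1e-6 (kg of soil per mg of soil) is added here so that the ingestion and dermal doses come out in mg/kg/day. The exposure parameters in the example are placeholders, not the values of Table S3; only the As median of 8.7 mg/kg is taken from the study.

```python
KG_PER_MG = 1e-6  # assumed unit conversion: kg of soil per mg of soil


def adi_ingestion(c_soil, ir, ef, ed, bw, at):
    """Daily dose by soil ingestion (mg/kg/day)."""
    return c_soil * ir * ef * ed / (bw * at) * KG_PER_MG


def adi_dermal(c_soil, sa, af, abs_factor, ef, ed, bw, at):
    """Daily dose by dermal absorption (mg/kg/day)."""
    return c_soil * sa * af * abs_factor * ef * ed / (bw * at) * KG_PER_MG


def adi_inhalation(c_soil, inhr, ef, ed, pef, bw, at):
    """Daily dose by inhalation of soil particles (mg/kg/day)."""
    return c_soil * inhr * ef * ed / (pef * bw * at)


# Placeholder child exposure scenario (hypothetical values, not Table S3).
ed_years = 6
dose_as_child = adi_ingestion(
    c_soil=8.7,          # median As concentration reported in the study (mg/kg)
    ir=200,              # intake rate, mg/day (placeholder)
    ef=350,              # exposure frequency, days/year (placeholder)
    ed=ed_years,         # exposure duration, years (placeholder)
    bw=15,               # body weight, kg (placeholder)
    at=ed_years * 365,   # averaging time, AT = ED x 365
)
```

Because each dose is linear in $C_{soil}$, the ranking of samples by dose follows the ranking by concentration for a fixed exposure scenario.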

Non-Carcinogenic Risk
For the non-carcinogenic risks, Equations (9)-(11) show how the hazard quotient (HQ) is derived for each exposure route. To evaluate the overall potential for non-carcinogenic effects across all exposure routes and to estimate the combined risk, Equation (12) is used, which gives the hazard index (HI).
$$HQ_{ing} = \frac{ADI_{ing}}{RfD_o} \qquad (9)$$
$$HQ_{dermal} = \frac{ADI_{dermal}}{RfD_d} \qquad (10)$$
$$HQ_{inh} = \frac{ADI_{inh}}{RfD_i} \qquad (11)$$
$$HI = HQ_{ing} + HQ_{dermal} + HQ_{inh} \qquad (12)$$
where $RfD_o$ is the oral reference dose (soil ingestion), $RfD_d$ is the dermal soil absorption reference dose, and $RfD_i$ is the inhalation reference dose of soil particles (all in mg/kg/day).
The values used for this study are presented in Table S4 and were compiled from USEPA [38,41–43]. HI values > 1 indicate a significant risk to human health [33,44].
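A minimal Python sketch of the hazard quotient and hazard index follows, assuming each quotient is the route-specific daily dose divided by the matching reference dose, as the definitions above imply. The doses and reference doses in the example are hypothetical round numbers, not the values of Table S4.

```python
def hazard_quotient(adi, rfd):
    """Route-specific hazard quotient: daily dose over reference dose."""
    return adi / rfd


def hazard_index(adi_by_route, rfd_by_route):
    """Sum of the hazard quotients over all exposure routes."""
    return sum(
        hazard_quotient(adi_by_route[route], rfd_by_route[route])
        for route in adi_by_route
    )


# Hypothetical doses and reference doses (mg/kg/day), for illustration only.
doses = {"ing": 0.5, "dermal": 0.2, "inh": 0.1}
reference_doses = {"ing": 1.0, "dermal": 0.4, "inh": 1.0}
hi = hazard_index(doses, reference_doses)
significant = hi > 1  # HI > 1 is read as a significant non-carcinogenic risk
```

In this made-up case no single route exceeds its reference dose, yet the combined index does, which is why the routes are summed.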

Carcinogenic Risk
The carcinogenic risk is based on the doses for ingestion ($ADI_{ing}$), dermal absorption ($ADI_{dermal}$), and inhalation ($ADI_{inh}$) calculated by Equations (6)-(8), using the time over which the dose for carcinogens is averaged (AT-C). For each route of intake, the carcinogenic risk (CR) is calculated through Equations (13)-(17), and finally, the cumulative or total carcinogenic risk (TCRI) is calculated using Equation (18). It should be noted that, for this study, the oral carcinogenic risk was assessed only for As, Pb, Cr, and Ni because these were the only elements with available $SF_o$ values [40].
In these equations, $ABS_{GI}$ is the gastrointestinal absorption factor (dimensionless); $SF_i$ is the inhalation slope factor ((mg/kg/day)$^{-1}$), obtained by Equation (17); URF is the unit risk factor (m³/µg); InhR is the inhalation rate (m³/day); BW is body weight (kg); and $10^3$ is the unit conversion factor.
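The sketch below shows one common way these quantities are combined; it is an assumption that the study's Equations (13)-(18) follow the same form. The oral risk is taken as dose times $SF_o$, the dermal risk uses $SF_o$ adjusted by $ABS_{GI}$, the inhalation slope factor is derived from URF, BW, InhR, and the factor $10^3$ (µg per mg), and the total risk is the sum over routes. All numerical inputs are hypothetical.

```python
def inhalation_slope_factor(urf, bw, inhr):
    """SF_i in (mg/kg/day)^-1 from a unit risk factor in m3/ug (assumed form)."""
    return urf * bw * 1e3 / inhr


def carcinogenic_risk(adi_ing, adi_dermal, adi_inh, sf_o, sf_i, abs_gi):
    """Route-specific and total carcinogenic risk (dimensionless)."""
    risk = {
        "ing": adi_ing * sf_o,
        # Dermal route: oral slope factor adjusted by gastrointestinal absorption.
        "dermal": adi_dermal * sf_o / abs_gi,
        "inh": adi_inh * sf_i,
    }
    risk["total"] = risk["ing"] + risk["dermal"] + risk["inh"]
    return risk


# Hypothetical inputs, for illustration only.
sf_i = inhalation_slope_factor(urf=0.001, bw=70, inhr=20)
risk = carcinogenic_risk(
    adi_ing=1e-5, adi_dermal=1e-6, adi_inh=1e-9,
    sf_o=1.5, sf_i=sf_i, abs_gi=1.0,
)
```

The total is then compared with the acceptable risk limit adopted in the assessment.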

Exploratory Data Analysis
Figures S1-S3 present the histograms, box-plot diagrams, and cumulative frequency plots obtained, which summarize the main characteristics of the data under study. The histograms show that the elements follow an asymmetric, positively skewed distribution (with a tail to the right). As, Cr, Cu, Hg, and Pb fit this description best. Some isolated bars are also observed; these represent values far from the rest of the data, called outliers. They are best observed for As, Ba, Cd, Fe, Hg, and Pb in the box-plot diagrams, where outliers appear as points beyond the upper whisker. In the cumulative frequency plots, none of the elements follows a 45° straight line, which confirms that no element follows a normal distribution.

Univariate Statistics
Table 1 presents the summary statistics for the elements under study. The elements with the highest mean concentrations are Al and Fe, because they are major elements in soils [46]. The coefficients of variation are higher than 80% for elements such as As, Cd, Cr, Cu, Hg, and Pb, so their mean is not representative of the data set. In addition, large differences between minimum and maximum values are observed, indicating heterogeneity in the data.
Table 2 contains a summary of the median concentrations (mg/kg) reported in some publications on Peruvian soils. A comparison with the data obtained in this study shows that the concentrations of As, Cd, Hg, and Pb are higher in the Colquirrumi, Julcani, and La Zanja sectors. This may be because, although these were natural areas with land use considered agricultural, they have a high density of mining sites in operation.
On the other hand, the Infierno Community, a control site with agricultural land use that has not been impacted by mining activity and has no records of such activity in the vicinity, reports the lowest concentrations of most of the elements under study. In the La Pastora area, the concentrations are similar to those of the previously described sector, with some variations, since the samples correspond to agricultural land use in an area of abandoned gold mining.

Multivariate Statistics
Since the data do not follow a normal distribution, Spearman's method was used to compute the correlation matrix between the twelve elements under study for all samples (Table 3). The element pairs with the strongest correlations, ordered from highest to lowest, are Fe-Cr, Fe-Cu, Ni-Cr, Cu-Ni, Cu-Cr, Fe-Ni, Zn-Ni, and Pb-As. According to [47], in the Huánuco Region, the metamorphic and ultramafic rocks of the Marañón Basal Complex have a high content of magnetite and mineralization of Fe, Ni, and Cu, so the high correlations between these elements could be due to their common geogenic origin. Correlations such as Pb-As may arise because inorganic arsenic occurs naturally in the soil and in many types of rocks, especially in ores containing copper or lead [48].
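Spearman's coefficient is the Pearson correlation computed on ranks, which is why it suits skewed, non-normal concentration data. The sketch below implements it with the Python standard library, using average ranks for ties; it is an illustration rather than the software routine used in the study, and the two concentration series are hypothetical.

```python
import math
from statistics import mean


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical Fe and Cr concentrations (mg/kg) in five samples.
fe = [21000, 25000, 26000, 31000, 46000]
cr = [8.0, 11.0, 10.5, 19.0, 34.0]
rho_fe_cr = spearman(fe, cr)
```

Because only the ordering of the values matters, a single extreme concentration does not dominate the coefficient as it would with Pearson's method.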
Table 4 presents the factor analysis matrix obtained through principal component analysis of all samples. The number of components was reduced to four based on the scree plot and the Kaiser criterion, which retains principal components with eigenvalues greater than one [49]. Additionally, items with absolute loadings of 0.45 or greater were considered to describe the composition of each factor. The four principal components explain 70.3% of the total variance of the data. Component 1 explains 27.4% of the total variance, with the associated elements being Ni, Zn, Cr, Cu, Cd, Fe, Pb, and Ba. These elements are usually related to the parent material, so this component may reflect a natural factor such as the lithology and geology of the existing rock. Component 2 explains 20.5% of the total variance, with the associated elements being Fe, Cr, Al, Cu, and, with a negative loading, Cd. The cluster analysis (Figure 3) shows how Fe and Al are grouped in the third level, while Cr and Cu are in the first level. Component 3 explains 11.5% of the variance, with As and Hg being associated, as well as Ba and Pb with equal but opposite loadings. Finally, component 4, which explains 10.8% of the variance, has Al and As associated with loadings of similar magnitude and opposite signs. Most of the elements have high loadings in more than two components, which would indicate that they may have more than one source.
For the cluster analysis, two dendrograms were generated using Ward's method and Euclidean distance: the first considers the twelve elements under study, while the second excludes Al and Fe because they are major elements, allowing the different groups to be appreciated on a larger scale (Figure 3). The first level contains most of the elements under study, which form three groups very close to each other: Ba-Zn, As-Cd-Hg, and Pb-Cu-Cr-Ni. The second level contains Mn, which has a significant loading only on the first principal component, together with elements such as Cu, Cr, Ni, and Zn. Finally, a third level is formed by Al and Fe, which may have been separated from the other elements because of their high concentrations. Elements such as Cd, Cr, Cu, and Pb suggest more than one source. Cd is strongly associated with Zn, Cu, and Pb minerals, but since a number of the samples were obtained from agricultural soils, phosphate fertilizers and sewage sludge could be contributing to increased Cd levels [50]. Similarly, although the samples are located at least 30 m from roads and secondary roads, the Pb concentrations in the south of Area 2 could be influenced by vehicular traffic [51].

Threshold Values and Assessment of Potential Contamination
Table 5 presents the threshold values obtained by the different methods described in Section 2.4. The upper limits of the geochemical baseline obtained with the TIF and P95 methods differ markedly from those obtained with the median ± 2 MAD, which yields lower upper limits than the other two methods. The highest upper limits are obtained with TIF, except for Cd, Fe, Hg, and Pb, for which they are obtained with P95. The soils have been predominantly influenced by natural element dispersion and accumulation processes, but there are a few points with an anthropic influence. Cd and Pb are the heavy metals with the highest number of outliers in the three methods. Compared with the baseline values established by [48] for soils of the Andes Mountain Range (Peru), the upper limits of the geochemical baseline in this study are higher for Cr, Cu, Zn, and Pb and lower for As, Cd, and Hg with all three methods. The median ± 2 MAD method gave the most plausible estimate of the geochemical baseline in the soils of the Central Regions of Peru; therefore, the ranges obtained with it are adopted as geochemical baseline values.
Table 5. Threshold values (mg/kg) calculated with three methods (median ± 2 MAD; TIF, the upper whisker method; and P95, the 95th percentile) and number of outlier samples with concentrations above the geochemical baseline value in surface soil samples (0-25 cm) from the project.

Assessment of Potential Contamination
The Environmental Quality Standards (EQS) for soil were evaluated considering that the majority of the samples were collected from non-intervention and agricultural areas. Figure 4 presents a graphical representation of the results for all samples. Sampling point 20 showed an As concentration of 58.1 mg/kg, exceeding the EQS for agricultural land use. This point is located in the Huánuco department, northwest of study Area 1, on the Jumasha geological formation, with a predominantly limestone lithology. The current land use is crop cultivation, and the point is situated 23.9 m from the project's closest vertex. The elevated As value is hypothesized to be of geogenic origin or due to the area's mineralogy, which could have been influenced by fertilizer application. Sampling point 29 presented Cd and Pb concentrations of 1.5 mg/kg and 112.1 mg/kg, respectively, both exceeding the EQS for agricultural land use. This sampling point is located in the Huánuco department, southeast of study Area 2, on the Pucará geological group, with a predominantly limestone lithology. The current land use is grassland in a secondary forest. The elevated Cd and Pb values are suggested to be of geogenic origin or due to the metallic mineralization in the limestones, a characteristic of this geological unit.
The results differ when normalized with different threshold values because the EQS values are higher than those obtained with the MAD method. The EQS values are generic reference values (GRL) or soil screening values (SSV). Generic reference values are typically established by assessing the potential harm that contaminants can cause to receptors such as humans or ecosystems. This evaluation considers both the contaminant's inherent toxicity and the level of exposure experienced by the receptors at risk [52].
Table S6 summarizes the results obtained for the geoaccumulation index ($I_{GEO}$), enrichment factor (EF), contamination factor ($C_f$), and pollution index (IPI) for ten of the elements under study. The geochemical baseline values used in the assessment are those provided by the median ± 2 MAD method, as it yields the highest number of outliers. These results are represented through box-plot diagrams in Figure 5. According to the values obtained for the geoaccumulation index, the elements analyzed are in the uncontaminated to moderately contaminated range. Cd, As, and Pb are the elements with samples in the latter range, but these represent only 6.9% and 3.4% of the total analyzed samples.
For the EF, the elements analyzed fall between the poorly enriched and significantly enriched ranges. Elements within the latter range represent only between 3.4% and 10.3% of the total analyzed samples. Regarding the contamination factor, between 3.4% and 6.9% of the total samples analyzed for Cd, As, and Pb are categorized as significantly contaminated. For the pollution index, 89.7% of the total samples are at a low level of contamination.
The geoaccumulation index (I_GEO), enrichment factor (EF), contamination factor (C_f), and degree of contamination (C_DEG) were also evaluated using the EQS values for agricultural land use as geochemical baseline values. The results are presented graphically in Figure 6. The geoaccumulation index indicates that only one value for Pb is in the moderately contaminated range, while 100% of the samples for As, Ba, Cd, and Hg are in the practically uncontaminated range. For the enrichment factor, the analyzed elements are in the low to moderate enrichment range; As, Cd, and Pb samples in the latter range represent only between 6.9% and 10.3% of the total analyzed samples. The contamination factor indicates that 3.4% of the total analyzed samples for As, Cd, and Pb are within the moderate contamination range. Finally, the C_DEG indicates that 100% of the total samples are in the low contamination range.

Health Risks
Table S7 summarizes the results obtained for the non-carcinogenic risks. For the non-carcinogenic risk in children, the elements As, Mn, Cr, and Pb each have at least 20.7% of the samples above the significant risk limit, with As and Mn representing high risks for 96.6% and 79.3% of the total samples, respectively. For the non-carcinogenic risk in adults, only As has samples outside the safe limit, representing 10.3% of the total number of samples. For carcinogenic risks in children, only As shows values above the limit, although these represent only 3.4% of the total samples. For carcinogenic risks in adults, As is within the acceptable risk range (Table S8). The results are also shown graphically in the box-plot diagrams in Figure 7.
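The structure of a USEPA-style risk calculation can be sketched in Python for the soil ingestion pathway alone. The exposure parameters, reference dose, and slope factor below are commonly cited generic values assumed for illustration; they are not taken from the study, whose own parameters and exposure pathways are defined in its methods, so the output is not expected to reproduce Tables S7 and S8. The concentration used is the median As value from the abstract.

```python
def average_daily_dose(conc_mg_kg, ing_rate_mg_day, exp_freq_d_yr,
                       exp_dur_yr, body_weight_kg, avg_time_d):
    """Ingestion dose in mg/kg/day; 1e-6 converts mg of soil to kg of soil."""
    return (conc_mg_kg * ing_rate_mg_day * 1e-6 * exp_freq_d_yr * exp_dur_yr
            / (body_weight_kg * avg_time_d))


def hazard_quotient(dose, reference_dose):
    # Non-carcinogenic risk: dose relative to the reference dose.
    return dose / reference_dose


def cancer_risk(lifetime_dose, slope_factor):
    # Carcinogenic risk: lifetime-averaged dose times the slope factor.
    return lifetime_dose * slope_factor


# Assumed child exposure parameters (illustrative defaults, not the study's values).
child = dict(ing_rate_mg_day=200.0, exp_freq_d_yr=350.0,
             exp_dur_yr=6.0, body_weight_kg=15.0)

as_median = 8.7          # mg/kg, median As concentration from the abstract
rfd_as = 3.0e-4          # mg/kg/day, assumed oral reference dose for As
sf_as = 1.5              # (mg/kg/day)^-1, assumed oral slope factor for As

# Non-carcinogenic dose is averaged over the exposure duration,
# carcinogenic dose over an assumed 70-year lifetime.
dose_nc = average_daily_dose(as_median, avg_time_d=6.0 * 365.0, **child)
dose_ca = average_daily_dose(as_median, avg_time_d=70.0 * 365.0, **child)

hq = hazard_quotient(dose_nc, rfd_as)
cr = cancer_risk(dose_ca, sf_as)
print(f"HQ = {hq:.3f}, CR = {cr:.2e}")
```

The only difference between the two doses is the averaging time, which is why the same soil concentration can be assessed differently for non-carcinogenic and carcinogenic endpoints.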


Spatial Distribution
The spatial distribution of the elements under study (Figures 8-10) shows higher concentrations in Area 1. To the north and northwest, high concentrations of As, Ba, Cu, Hg, Mn, and Ni are observed. These zones correspond to crops, secondary forests, pastures, and scrublands, where the sampling points are located more than 30 m from secondary roads, highways, and the Buena Vista population center. In addition, one of the samples is located within meters of an electrical substation. The sampled soils developed on conglomerates, granite, colluvial-alluvial deposits, schists with Cu-Pb-Zn-Ag, and limestones with stratiform bodies of Pb-Zn-Ag.

High concentrations of Ba, Cd, Hg, Mn, and Ni are observed towards the southeast of Area 1, where secondary forests, pastures, and crops have developed on granite and limestone. These sampling points are located at least 30 m from high-voltage pylons and secondary roads.
The highest lead concentrations are found to the south of Area 2 in areas of secondary forest or grassland, with one sampling point close to a house, over limestone lithology with Pb-Zn-Ag stratiform bodies. High lead concentrations are also observed in the N, NW, and E of Area 1, in areas close to a public road and the Buena Vista village center, over limestone and granite lithology. Cd and Ni also show high concentrations to the south of Area 2 in an area close to the Chinchavito village center. Finally, Cr and Zn concentrations are relatively low and mostly fall between the 50th and 75th percentiles. It should be noted that the study area is free of industries, and the spatial distribution of the elements under study is based on models associated with the recorded concentrations, which are mostly below the geochemical baseline and EQS values.

Figure 2. Correlation between the different lithostratigraphic units and the different departments in the study area.

Figure 3. Dendrograms of cluster analysis, using Ward's method, of heavy metals in surface soil (0-25 cm) of the project.

Figure 4. Threshold values and soil sample distribution for As, Ba, Cd, Hg, and Pb with EQS values and geochemical baseline values by median method.

Figure 5. Box-plot diagram for ecological indices using geochemical baseline values from the median ± 2 MAD method. The red dashed lines point out the classification of each environmental index.

Figure 6. Box-plot diagrams for the ecological indices using the EQS values for agricultural land use as generic reference values.

Figure 7. Box-plot diagrams for carcinogenic and non-carcinogenic risks in children and adults.

Table 1. Statistical summary of the concentrations of potentially harmful elements (mg/kg) in soil samples from the project.

Table 2. Median concentrations of potentially toxic elements (mg/kg) reported in publications on agricultural soils in Peru.

Table 3. Spearman correlation matrix between concentrations of potentially harmful elements (mg/kg) in soil samples from the project.

Table 4. Factor analysis matrix for soil samples from the project.