Arsenic Distribution Assessment in a Residential Area Polluted with Mining Residues

Mining is a major source for metals and metalloids pollution, which could pose a risk for human health. In San Guillermo, Chihuahua, Mexico mining wastes are found adjacent to a residential area. A soil-surface sampling was performed, collecting 88 samples for arsenic determination by atomic absorption. Arsenic concentration data set was interpolated using the ArcGis models: inverse distance weighting (IDW), ordinary kriging (OK), and radial basis function (RBF). For method validation purposes, a set of the data was selected and two tests were performed (P1 and P2). In P1 the models were processed without the validation data; in P2 the validation data were removed one by one, models were processed every time that a data point was removed. An arsenic concentration range of 22.7 to 2190 mg/kg was reported. The 39% of data set was classified as contaminated soil and 61% as industrial land use. In P1 the method of interpolation with the lowest RMSE was RBF (0.80), the highest coefficient of E was RBF (46.25), and the highest Ceff value was with RBF (0.48). In P2 the method with the lowest RMSE was OK (0.76), the highest E value was 50.65 with OK, and the Ceff reported the highest value with OK (0.52). The high arsenic contamination in soil of the site indicates an abundant dispersion of this metalloid. Furthermore, the difference between the models was not very wide. The incorporation of more parameters would be of interest to observe the behavior of interpolation methods.


Introduction
The development of mining in Mexico has had a high impact on the environment. This activity, which has been the economic basis for the foundation of several communities, unfortunately it also generated a large amount of liquid, solid and gaseous wastes, mainly in form of sewage, gas, and slags [1]. The most common potentially toxic elements (PTE) derived from mining activities are lead (Pb), cadmium (Cd), zinc (Zn), arsenic (As), selenium (Se), and mercury (Hg) [2]. The PTE do not decompose through the processes of natural degradation, they have low mobility in soil, so they are accumulate over time [3]. The presence of metals in soil represents a risk for human health,

Research Area
The studied area is located in San Guillermo, municipality of Aquiles Serdán, Chihuahua, Mexico, in the geographical coordinates among the parallels 28 • 27 and 28 • 43 north latitude; and the meridians 105 • 41 and 106 • 00 west longitude; altitude between 2100 and 2300 m. It borders to the north with Aldama, to the east with Julimes municipalities, to the south with Rosales and to the west with the municipality of Chihuahua (Figure 1). The climate of the area is classified as semiarid, its maximum temperature is 40 • C and its minimum −14 • C, with an average annual rainfall of 350 mm, and rainy season of 60 days average. The prevailing winds vary throughout the year. The east winds have duration of four months (June to October), while the west winds have a greater frequency (October to June), lasting from seven to eight months [19]. The area involves a housing zone located at 400 m of distance from deposits of mining wastes. The residues are found at open air, there it is fine material of an inactive extraction industry, which used the flotation method for mineral separation, where it was extracted mainly Pb and Zn. The production material used in the benefit plant was extracted from the mines of the area of Santa Eulalia, Aquiles Serdán, Chihuahua. These mines were outstanding for their production of gold, silver, lead, zinc, and copper, throughout the mining history in the region [20].

Soil Sampling
Sampling of surface soil, aimed for As quantification, was performed according to NMX-AA-132-SCFI-2006 [21]. Dust samples in the housing zone were collected according to the US Environmental Protection Agency AP-42 section 13.2.1 [22]. Sampling was realized within an area of 53 ha by a systematic technique based in a radial grid, with a spacing of 100 m and 15° of angle ( Figure  2). A total of 88 samples were collected on spring in the residential area close to the mining wastes and surface soil in the neighborhood.

Arsenic Determination
The 88 samples were homogenized and sieved to a degree of fineness ≤75 μm. The homogenization of the samples was done by manual quartet. Acid digestion was made in a microwave MARSx CEM using the digestion method SW 846-3051 recommended by the manufacturer. Samples were prepared by placing 0.5 g of soil + 10 mL of nitric acid (HNO3), considering a blank. Once digested the samples were filtered and diluted up to 50 mL with distilled water.

Soil Sampling
Sampling of surface soil, aimed for As quantification, was performed according to NMX-AA-132-SCFI-2006 [21]. Dust samples in the housing zone were collected according to the US Environmental Protection Agency AP-42 section 13.2.1 [22]. Sampling was realized within an area of 53 ha by a systematic technique based in a radial grid, with a spacing of 100 m and 15 • of angle ( Figure 2). A total of 88 samples were collected on spring in the residential area close to the mining wastes and surface soil in the neighborhood.

Soil Sampling
Sampling of surface soil, aimed for As quantification, was performed according to NMX-AA-132-SCFI-2006 [21]. Dust samples in the housing zone were collected according to the US Environmental Protection Agency AP-42 section 13.2.1 [22]. Sampling was realized within an area of 53 ha by a systematic technique based in a radial grid, with a spacing of 100 m and 15° of angle ( Figure  2). A total of 88 samples were collected on spring in the residential area close to the mining wastes and surface soil in the neighborhood.

Arsenic Determination
The 88 samples were homogenized and sieved to a degree of fineness ≤75 μm. The homogenization of the samples was done by manual quartet. Acid digestion was made in a microwave MARSx CEM using the digestion method SW 846-3051 recommended by the manufacturer. Samples were prepared by placing 0.5 g of soil + 10 mL of nitric acid (HNO3), considering a blank. Once digested the samples were filtered and diluted up to 50 mL with distilled water.

Arsenic Determination
The 88 samples were homogenized and sieved to a degree of fineness ≤75 µm. The homogenization of the samples was done by manual quartet. Acid digestion was made in a microwave MARSx CEM using the digestion method SW 846-3051 recommended by the manufacturer. Samples were prepared by placing 0.5 g of soil + 10 mL of nitric acid (HNO 3 ), considering a blank. Once digested the samples were filtered and diluted up to 50 mL with distilled water.
Analytical As determination was performed by atomic absorption spectrometry (AAS) with a hydride generator in a GBC-brand AVANTA SIGMA (Waltham, MA, USA). Triplicate and control samples were taken. The As quantification limit was 0.005 mg/kg. The As concentrations were compared with the reference levels indicated in NOM-147-SEMARNAT/SSA1-2004 [23]. These were classified for residential use (less than 22 mg of As/kg of soil), industrial use (less than 260 mg/kg), and soil above the reference level [23], as shown in Table 1.

Spatial Distribution of Arsenic by Interpolation Methods
A layer of vector information of point type was created based in the As concentrations with software ArcGis 10.3 © (ESRI, Redlands, CA, USA). The interpolations were made through the extension Geoestatistical Analyst Wizard (ESRI, Redlands, CA, USA). The interpolation methods used were IDW, OK, and RBF.
The IDW method to estimate the concentration of nonsampled sites uses the existing values around the research area. The values of the closest observations have greater influence than those that are further away; this influence decreases with the distance [24]. Equation (1) for obtaining the estimated values with IDW is as follows: where: Z(S 0 ) = value to be estimated in place S 0 , N = number of observations close to the place to be estimated, λ = weight assigned to each observation to be used, weight decreases with distance, Z(S i ) = observed value of place S i . The radial base function (RBF) estimates values using a mathematical function that minimizes the overall curvature of the surface, resulting in a smooth surface passing exactly through the entry points. The estimation of non-sample values shall be calculated by the following Equation (2) [25]: where Z(S 0 ) is the value to be estimated in the place S 0 , φ(r) is the radial base function, r = ||s i -s 0 ||, is the Euclidean distance between the estimated location and each location s i , and ω i , i = 1, 2, . . . n + 1 , are the weights to estimate. The ordinary kriging (OK) method incorporates the statistical properties of the observations (spatial autocorrelation). OK employs the semivariogram (possible combinations between pairs of points) to express spatial continuity, this quantifies the weight of the correlation as a function of distance. The data closest to a known point have greater weight or influence in the interpolation [26]. The OK method is obtained based on Equation ( Analytical As determination was performed by atomic absorption spectrometry (AAS) with a hydride generator in a GBC-brand AVANTA SIGMA (Waltham, MA, USA). Triplicate and control samples were taken. The As quantification limit was 0.005 mg/kg. The As concentrations were compared with the reference levels indicated in NOM-147-SEMARNAT/SSA1-2004 [23]. These were classified for residential use (less than 22 mg of As/kg of soil), industrial use (less than 260 mg/kg), and soil above the reference level [23], as shown in Table 1.

Spatial Distribution of Arsenic by Interpolation Methods
A layer of vector information of point type was created based in the As concentrations with software ArcGis 10.3 © (ESRI, Redlands, CA, USA). The interpolations were made through the extension Geoestatistical Analyst Wizard (ESRI, Redlands, CA, USA). The interpolation methods used were IDW, OK, and RBF.
The IDW method to estimate the concentration of nonsampled sites uses the existing values around the research area. The values of the closest observations have greater influence than those that are further away; this influence decreases with the distance [24]. Equation (1) for obtaining the estimated values with IDW is as follows: where: Z(S0) = value to be estimated in place S0, N = number of observations close to the place to be estimated, λ= weight assigned to each observation to be used, weight decreases with distance, Z(Si) = observed value of place Si. The radial base function (RBF) estimates values using a mathematical function that minimizes the overall curvature of the surface, resulting in a smooth surface passing exactly through the entry points. The estimation of non-sample values shall be calculated by the following Equation (2) [25]: where Z(S0) is the value to be estimated in the place S0, Ф(r) is the radial base function, r = ||si-s0||, is the Euclidean distance between the estimated location and each location si, and ωi, i = 1, 2, … n +1, are the weights to estimate. The ordinary kriging (OK) method incorporates the statistical properties of the observations (spatial autocorrelation). OK employs the semivariogram (possible combinations between pairs of points) to express spatial continuity, this quantifies the weight of the correlation as a function of distance. The data closest to a known point have greater weight or influence in the interpolation [26]. The OK method is obtained based on Equation (3) [27]: where n is the number of observations, m is the Lagrange multiplier, used to minimize constraints, λi is the weight assigned to each observation (the sum of all weights equals 1), i, j are observations, 0 is the point of estimation, s is the point of estimation (measured variable), and d(si,sj) is the distance between si and s0 from the semivariogram. The interpolation models used in this research have as common parameter the data neighborhood, necessary to perform spatial searches [28]. In the case of d s i , s j +m = Int. J. Environ. Res. Public Health 2019, 16, x Analytical As determination was performed by atomic absorption spectrometry (AAS) w hydride generator in a GBC-brand AVANTA SIGMA (Waltham, MA, USA). Triplicate and co samples were taken. The As quantification limit was 0.005 mg/kg. The As concentrations compared with the reference levels indicated in NOM-147-SEMARNAT/SSA1-2004 [23]. These classified for residential use (less than 22 mg of As/kg of soil), industrial use (less than 260 mg and soil above the reference level [23], as shown in Table 1.

Spatial Distribution of Arsenic by Interpolation Methods
A layer of vector information of point type was created based in the As concentrations software ArcGis 10.3 © (ESRI, Redlands, CA, USA). The interpolations were made through extension Geoestatistical Analyst Wizard (ESRI, Redlands, CA, USA). The interpolation met used were IDW, OK, and RBF.
The IDW method to estimate the concentration of nonsampled sites uses the existing v around the research area. The values of the closest observations have greater influence than that are further away; this influence decreases with the distance [24]. Equation (1) for obtainin estimated values with IDW is as follows: where: Z(S0) = value to be estimated in place S0, N = number of observations close to the place estimated, λ= weight assigned to each observation to be used, weight decreases with distance, = observed value of place Si. The radial base function (RBF) estimates values using a mathematical function that minim the overall curvature of the surface, resulting in a smooth surface passing exactly through the points. The estimation of non-sample values shall be calculated by the following Equation (2) where Z(S0) is the value to be estimated in the place S0, Ф(r) is the radial base function, r = ||siis the Euclidean distance between the estimated location and each location si, and ωi, i = 1, 2, … are the weights to estimate. The ordinary kriging (OK) method incorporates the statistical properties of the observa (spatial autocorrelation). OK employs the semivariogram (possible combinations between pa points) to express spatial continuity, this quantifies the weight of the correlation as a functi distance. The data closest to a known point have greater weight or influence in the interpolation The OK method is obtained based on Equation (3) [27]: where n is the number of observations, m is the Lagrange multiplier, used to minimize constrain is the weight assigned to each observation (the sum of all weights equals 1), i, j are observation the point of estimation, s is the point of estimation (measured variable), and d(si,sj) is the dis between si and s0 from the semivariogram. The interpolation models used in this research ha common parameter the data neighborhood, necessary to perform spatial searches [28]. In the ca where n is the number of observations, m is the Lagrange multiplier, used to minimize constraints, λ i is the weight assigned to each observation (the sum of all weights equals 1), i, j are observations, 0 is the point of estimation, s is the point of estimation (measured variable), and d(s i ,s j ) is the distance between s i and s 0 from the semivariogram. The interpolation models used in this research have as common parameter the data neighborhood, necessary to perform spatial searches [28]. In the case of OK, the determination coefficients (R 2 OK ) were compared by means of the GS+ © program, in order to determine which of them fitted better to the semivariogram generation.
For the surface calculation by degree of contamination, data was interpolated with the IDW, OK, and RBF methods. The interpolation was made with the extension Geoestatistical Analyst Wizard using the As concentrations. After the interpolation, the resulting map from the Geostatical Analyst was rasterized. The raster files were classified in accordance to Table 1, which describes the soil classification by level of As contamination. The classified raster files were converted in a shapefile with a UTM Zone 13 North reference system. Finally, the surfaces were obtained by the level of As contamination.

Normality Analysis
For the tests performance by interpolation methods with GIS, a statistical analysis of the As concentrations with the SPSS 20.0 © software (IBM, CHI, USA,) was performed. The Kolmogorov-Smirnov (KS) normality test was applied. The distribution of As concentrations in the dataset was not normal, thus the As dataset was logarithmically transformed As (ln) to obtain a normal distribution (Figure 3). OK, the determination coefficients (R 2 OK) were compared by means of the GS+ © program, in order to determine which of them fitted better to the semivariogram generation. For the surface calculation by degree of contamination, data was interpolated with the IDW, OK, and RBF methods. The interpolation was made with the extension Geoestatistical Analyst Wizard using the As concentrations. After the interpolation, the resulting map from the Geostatical Analyst was rasterized. The raster files were classified in accordance to Table 1, which describes the soil classification by level of As contamination. The classified raster files were converted in a shapefile with a UTM Zone 13 North reference system. Finally, the surfaces were obtained by the level of As contamination.

Normality Analysis
For the tests performance by interpolation methods with GIS, a statistical analysis of the As concentrations with the SPSS 20.0 © software (IBM, CHI, USA,) was performed. The Kolmogorov-Smirnov (KS) normality test was applied. The distribution of As concentrations in the dataset was not normal, thus the As dataset was logarithmically transformed As (ln) to obtain a normal distribution ( Figure 3).

Validation and Interpolation Comparison
For the validation of interpolation methods, crossvalidation with As(ln) data was used. A set of the data were randomly selected using the SPSS 20.0 © program (test data and validation data) ( Figure  4). Two cross-validation tests were performed. The first test (P1) consisted on running the interpolation models without the selected validation data (one run per interpolation method).
The second test (P2) consisted of removing the selected validation data one by one and performing the interpolation with the remaining data. This procedure was performed 26 times by each method. With this, the absolute value of the deviation between the value estimated by the interpolation for both tests and the experimental data was determined.

Validation and Interpolation Comparison
For the validation of interpolation methods, crossvalidation with As(ln) data was used. A set of the data were randomly selected using the SPSS 20.0 © program (test data and validation data) ( Figure 4). Two cross-validation tests were performed. The first test (P1) consisted on running the interpolation models without the selected validation data (one run per interpolation method).
The second test (P2) consisted of removing the selected validation data one by one and performing the interpolation with the remaining data. This procedure was performed 26 times by each method. With this, the absolute value of the deviation between the value estimated by the interpolation for both tests and the experimental data was determined. For the methods precision, was used the root mean square error (RMSE), the estimated predictive effectiveness (E), the Nash and Sutcliffe Efficiency Coefficient [29], known as Ceff, and the coefficient of determination (R 2 ) between experimental and estimated data. The RMSE was estimated by Equation (4) [27,30]: where yi is the observed value at point "i", ŷi is the estimated value at point "i", and n is the number of used points.
The prediction effectiveness of each method (E) was determined through the following Equation (5) [31]: where z(xi) is the observed value at point "i", (xi) is the estimated value at point "i", n is the number of used points, ̅ is the sample average. E equal to 100%, indicates a perfect prediction. The coefficient Ceff was determined by the following equation: where RMSE is the root mean square error, and SD is the standard deviation [32]. The coefficient of determination (R 2 ) was applied to determine the association between the experimental concentrations and the estimated data by the different methods (Equation (7)). The linear correlation between the experimental vs estimated values was done using the GLM procedure [33]: For the methods precision, was used the root mean square error (RMSE), the estimated predictive effectiveness (E), the Nash and Sutcliffe Efficiency Coefficient [29], known as Ceff, and the coefficient of determination (R 2 ) between experimental and estimated data. The RMSE was estimated by Equation (4) [27,30]: where y i is the observed value at point "i",ŷ i is the estimated value at point "i", and n is the number of used points. The prediction effectiveness of each method (E) was determined through the following Equation (5) [31]: where z(x i ) is the observed value at point "i",ẑ(x i ) is the estimated value at point "i", n is the number of used points, z is the sample average. E equal to 100%, indicates a perfect prediction. The coefficient Ceff was determined by the following equation: where RMSE is the root mean square error, and SD is the standard deviation [32]. The coefficient of determination (R 2 ) was applied to determine the association between the experimental concentrations and the estimated data by the different methods (Equation (7)). The linear correlation between the experimental vs estimated values was done using the GLM procedure [33]: where y i are the observed values,ŷ i are the estimated values,ȳ are the mean values, and n is the number of observations used.

Arsenic Characterization
The As concentrations obtained from the soil samples reported a minimum value of 22.7 and a maximum value of 2190 mg/kg. The As data in soil indicated that 100% of the samples exceeded the reference level for residential land use (22 mg/kg). The 61% of the samples were within the range levels of industrial land use (260 mg/kg). Finally the 39% were classified as contaminated or severely contaminated soil (>260 mg/kg) ( Table 2).
These concentrations demonstrated the dispersion of this contaminant to the inhabited zone, causing a high degree of exposure to As particles, and with this a high risk to public health for the inhabitants of the zone, since the As in soil and dust can be absorbed in human body through dermal contact, inhalation, and ingestion exposures [34]. Table 2. Soil classification by arsenic concentrations.

Spatial Distribution of Arsenic
The As distribution in the studied area by the interpolation methods indicated a similar geographical trend, where high concentrations were found in the southern area. In other areas, As concentration was lower. The spatial distribution of As displayed differences due to the mode in which the data were statistically grouped by each interpolator [27].
The As concentration decreased as the zone was farther from the mining wastes, however, a part of the inhabited area was covered by high As concentrations (>260 mg/kg, above reference level for industrial land use). The other part of the inhabited area was covered by As concentrations from 22 to 260 mg/kg, which ones are above reference level for residential/agricultural use. Possibly the As direction is associated with the land topography where the elevation varied from 1550 to 1510 masl (Figure 1), with a slope of 17.5%. There is also a barrier formed by hills in the east, coupled with the 60 days of average rainfall, contributing to its dispersion, which agrees with the mentioned by Delgado-Caballero (2018) [35]. The spatial distribution of As by the three different interpolators visually tends to be similar. The IDW method has a greater tendency to form hotspots, this tendency is common where surrounding data values are closer to the estimated value [36]

Surface by Arsenic Range
Considering the predominant ranges of As concentrations in the experimental sampling ( Figures  1 and 2), in the classification of industrial land use (22-260 mg As/kg) the IDW method estimated an area of 21.39 ha, OK an area of 21.89 ha and RBF 23.37 ha. In the contaminated soil range (260-1000 mg As/kg) IDW estimated an area of 21.44 ha, OK 20.81 ha and RBF 19.06 ha. In the severely contaminated soil range (>1000 mg As/kg) the IDW method reported a surface area of 1.88 ha, OK of 2.04 and RBF of 2.3 ha. The estimated data are presented in Figure 6 and Table 3.
The IDW method reported a minimum estimated value of As of 15.3 and a maximum of 2183 mg/kg; OK a minimum value of 36.0 and a maximum of 1989 mg/kg; and RBF a minimum value of 17.0 and maximum of 2.166 mg/kg. The above display a great similarity between the minimum and maximum As values in the IDW and RBF methods in comparison to the experimental values. The OK values were distant from the experimental data. This can be explained by the nature of the interpolators. The IDW and RBF methods are exact interpolators, which generate the estimated surface by adjusting the values to the real data, marking abrupt changes in the estimated surface [37]. The OK eliminates the highest and lowest values to obtain a smaller error in the estimation, which causes a smoothing in the estimated surface [38]. This explains the discrepancy of the maximum and minimum values of this model with the experimental ones, and justifies the discretion in the calculation of the surface compared with the other two methods.

Surface by Arsenic Range
Considering the predominant ranges of As concentrations in the experimental sampling (Figures 1  and 2), in the classification of industrial land use (22-260 mg As/kg) the IDW method estimated an area of 21.39 ha, OK an area of 21.89 ha and RBF 23.37 ha. In the contaminated soil range (260-1000 mg As/kg) IDW estimated an area of 21.44 ha, OK 20.81 ha and RBF 19.06 ha. In the severely contaminated soil range (>1000 mg As/kg) the IDW method reported a surface area of 1.88 ha, OK of 2.04 and RBF of 2.3 ha. The estimated data are presented in Figure 6 and Table 3.

Validation and Interpolators Comparison
The mean and the variation for the estimated values of As(ln) in P1 and P2 by interpolation methods are presented in Table 3. The descriptive statistics demonstrated that for both tests, P1 and P2, the lowest variance corresponded to IDW (0.33 and 0.35, respectively). The mean of the experimental data was 5.31, in P1 the method that approached the mean value was OK (5.48), and in P2 the method with the closest value to the mean was RBF (5.48). In P1 the amplitude between the The OK values were distant from the experimental data. This can be explained by the nature of the interpolators. The IDW and RBF methods are exact interpolators, which generate the estimated surface by adjusting the values to the real data, marking abrupt changes in the estimated surface [37]. The OK eliminates the highest and lowest values to obtain a smaller error in the estimation, which causes a smoothing in the estimated surface [38]. This explains the discrepancy of the maximum and minimum values of this model with the experimental ones, and justifies the discretion in the calculation of the surface compared with the other two methods.

Validation and Interpolators Comparison
The mean and the variation for the estimated values of As(ln) in P1 and P2 by interpolation methods are presented in Table 3. The descriptive statistics demonstrated that for both tests, P1 and P2, the lowest variance corresponded to IDW (0.33 and 0.35, respectively). The mean of the experimental data was 5.31, in P1 the method that approached the mean value was OK (5.48), and in P2 the method with the closest value to the mean was RBF (5.48). In P1 the amplitude between the minimum and maximum values were lower for IDW (4.44-6.65), followed by RBF and OK. The amplitude at P2 was lower for IDW (4.38-6.62), followed by RBF and, finally, OK (Figure 7). The precision obtained by RMSE, E (%) and Ceff for the data of the interpolation methods are presented in Table 4. In P1 the method with lowest RMSE was RBF (0.800), followed by IDW (0.803). The interpolator with the highest coefficient of E was RBF (46.25), followed by IDW (45.5). The highest Ceff value was with RBF (0.48), then IDW (0.47). In P2 the interpolation method with the lowest RMSE was OK (0.76), followed by RBF (0.78). The highest E value was 50.65 with OK, followed by 48.19 with RBF. The Ceff reported the highest value with OK (0.52), followed by RBF (0.50). This result shows the difference between both validation tests. In P1 the most accurate method was RBF while in P2 the most accurate method was OK.    In Figures 8 and 9, the data distribution is graphically presented by comparing the experimental As(ln) values with those estimated by the interpolation methods. In P1 the highest coefficient of determination (R 2 ) was with IDW (0.538), followed by RBF (0.533). For P2 OK presented the highest value of R 2 (0.54), followed by RBF (0.528). According to the data obtained the IDW and RBF methods are very similar in precision in P1 showing high values of E(%) and R 2 . However, in P2, OK was the most accurate in agreeing with the highest values of E(%) and R 2 ( Table 5). High R 2 and low RMSE values indicate a good agreement between observed and estimated concentrations of As at the different sampling points (Table 5)       n † = test data, IDW = inverse distance weight, OK = ordinary kriging, RBF= radial basis function, P1 = Test 1, P2 = Test 2, * significant p < 0.05, ** highly significant p < 0.0001.
The statistical analysis of P1 and P2 shows that the smallest variation of the estimated data was with IDW and OK the one with the highest variance. The precision parameter for interpolation methods indicated that in P1 RBF was the most accurate, while in P2 was OK. The correlation coefficients between the observed values and those estimated at P1 were higher with IDW. In P2, OK exposed a significant correlation coefficient (p < 0.0001).
Observing the estimates of each of the models and tests, good correlations were obtained between models of the same test (>0.95), however comparing the models between correlation tests were low (<0.60) ( Table 6). Those results suggest, that behavior between tests is different and the difference between models is the same.

Error Mapping
In the error maps, the areas in red indicate a greater degree of error between the experimental and estimated data. Zones in blue indicate an intermediate degree of error. Zones with yellow indicate a lower error rate, which represents surfaces with values close to the real ones. The zones with a high error rate for the three interpolation models concur in the northwestern part, at the boundaries of the studied area and in close proximity to a topographic transitional zone. This transition zone corresponds to the boundary between the residential area and the hills beginning; moreover it is the further zone of the contamination source. Thus, this area could be subject to greater variability.
For P1 the interpolation of the errors between the experimental values and estimated by each method are presented in Figure 10. The OK and RBF methods present great similarity in the yellow surface (low error degree). IDW modeled in the northeastern part of the studied area a zone in yellow tonality, revealing uncertainty between the experimental data and the As estimated. This showed that although the IDW method had the lowest variance and the highest value of R 2 . The values estimation presents a greater error in the spatial distribution. For RBF homogeneous surfaces are observed in the central pair in yellow tonality, which reaffirms their accuracy in P1.  The interpolation errors in P2 are presented in Figure 11. A zone in the southern part generated by the IDW method presents a hotspot effect with an intermediate error rate. In the northwestern part, the three interpolation methods show the same zone with high error. It is possible to see that the OK method shows different surfaces in yellow tonality, indicating a smaller error between the experimental and estimated values. Following, RBF presents several zones with a low error rate. Finally, in IDW, zones with an intermediate error degree are observed, making it less reliable for P2. However, this test had a greater error than the P1 test.

Discussion
Soil and dust are important components in the housing environment and, therefore, have the most significance in the effects of human health. Exposure to As could cause harmful effects to human health, such as skin lesions, cardiovascular diseases, and metabolic disorders [34]. The spatial distribution suggests the control of high concentrations of the metalloid (up to 2190 mg As/kg) in the neighborhood of the deposit of mining residues, marking this zone as the area with the highest concentration of As. The southern zone of the housing complex also has high concentrations of As (from 22 to 1000 mg As/kg) indicating that the slope and direction of the wind is strongly associated with the distribution [39]. The lowest concentration of As was presented in the northern part of the housing complex, which match with the assumption related to be the farthest from the waste zone, however, it is worrying that these As concentrations exceed the reference levels for land use. The As dispersion is primarily caused by rain runoff due to slopes and wind transport in the direction from south to north [35].
Descriptive statistics expressed that in P1, the model with the best evaluation indicators was IDW, with the smallest variance, good precision and the highest correlation between estimated and observed values (p < 0.0001). In this test, RBF the model with more precision showed values very close to those of IDW in variance and correlation. In P1, KO was not the best in the precision parameters of RMSE, E (%), Ceff, and R 2 , however, the spatial distribution of the error shows surfaces in yellow very similar to RBF. In P2 the model with the highest precision and correlation was OK, nevertheless it was the one that presented greater variance; IDW presented a small variance and high precision and correlation values.
In general, with correlation tests, a high correlation was determined between models per test, however, a different behavior was observed between tests, which is according to how the experimental data are removed to run the model. Additionally, IDW and RBF are easier to use because they require fewer input parameters. In contrast, OK is more difficult to use. The typical OK includes several stages, such as statistical testing, data transformation, spatial structural analysis, and semivariogram function [40].
The error maps showed a high degree of error, on the northwest side (lowest As values). This could be due to the fact that in the zone with high concentrations of As can be present in a punctual way or the edge effect that affected the three methods of interpolation. The difference between tests (P1-P2) shows that OK best estimates the As(ln) values with less data (P2), in comparison to IDW and RBF. A higher data density can explain the superiority of RBF and IDW in P1. This shows that OK is better interpolator with a lower density of randomly distributed data, therefore, it is of greater use for the data simulation when the sampling and analysis is costly in the whole area, such is the case of As. Bhunia et al. (2016) [10] compared interpolators for organic carbon in soil using IDW and RBF, in such research they found that the IDW method was better in regards to RBF among others. In a research by Fortis-Hernández et al. (2010) [41] about the dispersion of nitrate and ammonium in soil, the IDW method turned out to be the most significant, followed by RBF in comparison to OK. Mueller et al. (2001) [42] compared IDW and OK in the analysis of pH, P, K, Ca, and Mg, concluding that IDW was a better option than OK, in data lacking spatial structure. Robinson and Metternicht [17] compared the IDW, OK, and splines methods in the estimation of subsoil pH, soil pH, electrical conductivity, and organic matter. They found that IDW was better at interpolating subsoil pH, OK was better interpolator in surface soil for pH, while organic matter and electrical conductivity were better interpolated with splines. These comparisons with other types of soil elements show the great variability in the accuracy of interpolation methods. Yan et al. (2015) [43] tested the performance of the Kriging interpolator, finding that the mine activity had no influence in the contiguous zones. Chaoyang et al. (2009) [44] used the IDW and kriging interpolators and Fu and Wei [45] the kriging interpolator finding that high concentrations of heavy metals come from mining settlements, which is consistent with our findings.
The efficiency of pollution assessment depends on the accurate mapping of metals and metalloids. The factors that affect accuracy are the number of samples, the distance between the sampling points and the sampling method [46]. Generally, higher sampling density would produce more accurate contamination mapping of heavy metals [42]. However, due to the cost and time of the sampling as well as the analysis cost of the samples, a high sampling density is impractical [40].
With the error mapping, although the results showed few differences, it was possible to show the performance of the interpolation methods. This highlights the importance of verifying the results in a descriptive and spatial way to know the performance of the methods of interpolation [47]. In comparison with other studies [43,[48][49][50], only the result of the spatial distribution of the variables under study is mapped, however, the mapping of uncertainty is essential to know the viability of the application of interpolation methods. Before proposing risk strategies it is important that the error mapping be verified as subsequent efforts may be poorly focused on the space and surface.
The increase in population, the demand for residential buildings [51], and the introduction of industries in areas with soils contaminated by mining waste pose a major health risk. This study tests the proximity and movement of arsenic in the human settlements of Aquiles Serdán. The interpolation maps are the basis for making decisions to minimize the health risk to the population.
Due to this, it is recommended to continue working with interpolation methods at these scales, for the management of the urban environment versus the As and other metals contamination. It is important to continue working with interpolation methods, nevertheless for the tests revealed in this research it is recommended to work at the submethod level of the interpolator and to verify the existence of significant differences between methods

Conclusions
The high environmental contamination of As present in soil of the studied housing area indicates a high dispersion of this metalloid presented in the mining wastes of the research area. The largest concentration of As is located in the southern part of the housing area, the one closest to the mining wastes. Waste deposits constitute a high risk for public health that must be addressed immediately, because if the contaminants dispersion in the area is not minimized, the health risk of the exposed population remains latent as the main routes of exposure: Inhalation of fine particles of airborne mining wastes, and particles suspended by vehicular traffic, as well as contaminated soil and dust intake.
Clear understanding of the distribution, extent and direction of As is the key in assessing the dispersion of such contaminant. As dispersion with the different interpolation methods IDW, OK, RBF was represented with similarities between them. The three methods enclosed the same area with higher As concentration and had minimal variation in the other classifications. Statistic data demonstrated that IDW was more precise in P1, and KO in the P2. Thus, the accuracy of the models could differ depending of the way of run the data. Although, differences between the models were not wide. The incorporation of more parameters, such as concentration of other metals, topography, soil type, and geology, among others, would be of interest to observe the behavior of interpolation methods.