Combination of GIS and Multivariate Analysis to Assess the Soil Heavy Metal Contamination in Some Arid Zones

Abstract: Recent decades have witnessed a growing danger to food security and human health because of the negative impact of pollutants on crop quality. An accurate estimate of heavy metal concentrations in the soils of Egypt's north Nile Delta is required to develop a remediation strategy that stabilizes heavy metals in contaminated soil and lowers their high concentration levels. Six heavy metals (As, Co, Cu, Ni, V, and Zn) were analyzed in 61 soil samples collected from the layers of 15 soil profiles to determine the extent of soil contamination in the study area, using the geo-accumulation index (I-geo), contamination factor (CF), Improved Nemerow's Pollution Index (Pn), and Potential Ecological Risk Index (PERI), supported by GIS, principal component analysis (PCA), and cluster analysis. According to the I-geo, contamination by As, Co, Cu, Ni, V, and Zn is widespread in different layers. The I-geo values ranged from −8.2 to 5.3, −4.11 to 1.8, −6.4 to 1.9, −9.7 to 2.8, −6.3 to 2.9, and −12.5 to 2.4 for As, Co, Cu, Ni, V, and Zn, respectively. The I-geo categorization therefore ranged from uncontaminated to strongly/extremely contaminated. The CF values varied from 0.01 to 60.6, 0.09 to 5.17, 0.02 to 10.51, 0 to 10.51, 0.02 to 7.12, and 0 to 7.68 for As, Co, Cu, Ni, V, and Zn, respectively. In decreasing order, the CFs rank as follows: CF (As) > CF (Ni) > CF (Zn) > CF (V) > CF (Cu) > CF (Co). Most of the research region (71.9%) fell into the moderately to heavily polluted class. Additionally, a large portion of the study region (49.17%) has a very high risk of contamination according to the PERI. Evaluating the variability in the soil's chemical content with a correlation matrix, cluster analysis, and PCA revealed the impact of anthropogenic activities on heavy metal concentrations in the soil of the study area. The current findings reflect the poor quality of management in the research region, which led to the increase in the concentration of heavy metals in the soil. Decision-makers could use the spatial distribution maps of contaminants and their levels as a basis for creating heavy metal mitigation strategies.


Introduction
Concern about the effects of heavy metal soil pollution on human health and food security has grown globally with the expansion of urbanization, industrialization, and population growth [1]. A high concentration of heavy metals decreases crop yields, degrades the environment, and damages the soil [2]. Heavy metal enrichment in the food chain has a negative impact on human health [3]. The Nile Delta's soil quality is continuously declining for many reasons, such as the increase in the number of factories and their emissions, urban expansion, the rising volume of traffic, and the use of wastewater and garbage deposits [4]. The Nile Delta is one of the world's most densely populated areas and has long been noted for its agricultural activities [5]. This area relies on wastewater for irrigation; the Nile water is mixed with agricultural and industrial drainage wastewater from the El Gharbia main drain (Kitchener), Egypt [6]. At regional scales, the distribution of heavy metal soil pollution is uneven [7], and the spatial dispersion of the contributing factors typically varies. Some contributors, such as sewage irrigation, pesticide addition, traffic, mining, or industrial emissions, are more likely to be observed locally or at close range, whereas others, such as the weathering of parent materials, are more likely to be observed over greater distances [8-11]. Intensive farming requires the regular application of large quantities of fertilizer in order to supply enough nitrogen (N), phosphorus (P), and potassium (K) for crop growth. The compounds used to supply these nutrients may contain trace levels of heavy metals, so continuous fertilizer application can substantially raise the soil's heavy metal content [6,12]. Human activity thus redistributes heavy metals within a particular area [2].
Analyzing the distribution pattern of heavy metal contamination allows for a preliminary assessment and screening of the different types of pollution sources [13]. Multivariate statistical analysis is mostly used to analyze pollution sources by first determining the internal structure of the data and then identifying the main pollution factors. Additionally, when combined with GIS technology, the spatial interpolation of soil heavy metal content can be used to interpret the two-dimensional and three-dimensional distribution of heavy metals, enabling a more precise depiction of the research findings [2,14]. Inverse distance weighting (IDW) is helpful for thoroughly analyzing pollution patterns [15,16]. The index method, quotient method, fuzzy comprehensive evaluation, geo-accumulation index, potential ecological risk index, and pollution load index are used to estimate soil ecological risk [17]. Principal component analysis (PCA) has also been utilized to recognize several soil pollution sources, including industrial and agricultural activity, as well as the percentage of heavy metals responsible for soil contamination [18,19]. PCA also has the benefit of handling large amounts of data without being constrained to a particular number of variables [20]. The current study aims to examine soil contamination with selected heavy metals in the northwest of the Nile Delta, Egypt, by mapping the spatial distribution of heavy metals in the study area, defining contamination levels using multivariate statistical methods (PCA and cluster analysis), and calculating the degree of contamination using several indices. The objectives of this investigation are:
1. Assessing the studied soil properties [pH, soil salinity (EC), %clay, soil organic matter content (%SOM)] and total concentrations of As, Co, Cu, Ni, V, and Zn for soil profiles in order to improve the understanding of the environmental impacts in that region and their causes.
2. Utilizing contamination indices such as the geo-accumulation index (I-geo), contamination factor (CF), Improved Nemerow's Pollution Index (Pn), and Potential Ecological Risk Index (PERI) to evaluate the risk of contamination for six heavy metals in different layers of soil profiles.
3. Using PCA and cluster analysis to uncover relationships among variables, to improve estimations for the variables under study, and to determine the sources of heavy metals within the study area.

Materials and Methods
The research region is located in Egypt's northwest Nile Delta. Its total area is estimated at about 797.00 km², and it lies between longitudes 30°15′0″-30°40′0″ E and latitudes 31°7′15″-31°30′45″ N, as shown in Figure 1. Based on the average meteorological data for a period of 50 years, from 1960 to 2011, the region is classified as having a Mediterranean climate [21]. During the dry season in August, a comparatively high average maximum temperature of 30 °C is typically reported. In January, the average low temperature is 13 °C. From November through February, precipitation is often mild and hazy, with a mean annual rainfall of approximately 17.23 mm. The highest evaporation rates are noted in June and September, due to relatively high temperatures, while the lowest rates are observed in January and December due to low temperatures. Nearly 2 million farmers receive water from the Nile River through a network of 40,000 km of canals for crop irrigation, and a similar network of drainage canals is also present in the same area [22,23]. These drainage canals amount to approximately 18,000 km, giving almost 58,000 km in total when combined with the irrigation canals [22]. The most common type of irrigation is surface irrigation, in which water is pumped from irrigation canals and drained into furrows and basins [24]. Rice, wheat, maize, and alfalfa are the most widely grown field crops in the study region. Citrus (orange), guava, and mango are the most widely grown tree fruits in orchards [25]. The soils can be classified as Typic Torrifluvents, Typic Torripsamments, Typic Haplosalids, Vertic Natrargids, and Vertic Torrifluvents [25].

Analysis of Collected Samples
Soil samples were taken from 15 representative profiles, as shown in Figure 2. The soil profiles were chosen based on the geomorphologic units in the study area to reflect various agricultural activities. Morphological descriptions and classifications of the soil profiles followed FAO [26] and the USDA [27]. The three principal landscape elements in the research area were the flood plain, lacustrine plain, and marine plain. The flood plain (713 km²) was produced by Nile deposits prior to the building of the high dam. This landscape is made up of a variety of landforms, including meandering belts, river terraces, overflow basins, decantation basins, and river levees. The lacustrine plain, which covers 40 km², developed from lacustrine deposits of the Holocene epoch. Fish farms, wet and dry sabkha, and coastal sand dunes are all present in this landscape. The study area's northern zone (marine plain) covers approximately 40 km² and has sand sheet landforms. The remaining 4 km² of the total area is covered by lake [25,28]. Soil profiles were dug down to the water table or to a depth of 150 cm; consequently, the soil profiles vary in depth from 80 to 150 cm. Layers were distinguished by morphological changes, and one composite sample was taken from each layer of each soil profile to ensure adequate representation. The soil chemical parameters of the 61 soil samples were analyzed at the soil, water, and plant laboratory of the Faculty of Agriculture, Tanta University, which is accredited and compliant with ISO/IEC 17025 (2017). Before analysis, the collected soil samples were air-dried, crushed to pass through a 2 mm sieve, and stored in plastic bags at around 4 °C. Soil pH was measured with a pH meter in a 1:2.5 soil-to-water suspension [29], and soil electrical conductivity (EC) was measured in soil paste extract [30]. Soil organic carbon (OC) was measured by dichromate oxidation according to the Walkley and Black method [31]. Mechanical (particle-size) analysis of the soil samples was carried out using sodium hexametaphosphate as a dispersant in accordance with the international technique [32]. Soil samples were digested with 7 mL of concentrated nitric acid and 3 mL of hydrofluoric acid [33]. The concentrations of As, Co, Cu, Ni, V, and Zn were measured by inductively coupled plasma mass spectrometry (model Prodigy Plus).

Geo-Accumulation Index (I-Geo)
The I-geo conveys contamination by comparing the measured levels of trace elements with the background values. The geo-accumulation index was calculated using Equation (1):

$$I_{geo} = \log_2\left(\frac{C_n}{1.5\,B_n}\right) \quad (1)$$

where $C_n$ = the heavy metal concentration as measured in the soil samples, and $B_n$ = the geochemical background concentration as observed in the average upper crust, after Wedepohl [34]. The constant 1.5 is the conventional correction for natural fluctuations in the background values. Because soil is a component of the Earth's crust and has a chemical composition that is related to that of the crust, the index relates the measured concentration to the concentration of the element in the crust [35].
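
As a worked illustration of the geo-accumulation index, the following minimal Python sketch evaluates I-geo in its usual form, log2(Cn / (1.5 Bn)). The function name `igeo` is hypothetical, and the arsenic background of 2.0 mg kg⁻¹ is an assumption inferred from the reported maximum As concentration (121.20 mg kg⁻¹) and maximum CF (60.6); it is not quoted directly from the text.

```python
import math


def igeo(c_n: float, b_n: float) -> float:
    """Geo-accumulation index: log2(Cn / (1.5 * Bn)).

    c_n: measured concentration in the soil sample (mg/kg)
    b_n: geochemical background concentration (mg/kg)
    """
    if c_n <= 0 or b_n <= 0:
        raise ValueError("concentrations must be positive")
    return math.log2(c_n / (1.5 * b_n))


# Assumed As background (mg/kg), inferred from 121.20 / 60.6 = 2.0.
AS_BACKGROUND = 2.0

# Reported minimum and maximum total As concentrations (mg/kg).
print(round(igeo(121.20, AS_BACKGROUND), 1))  # 5.3
print(round(igeo(0.01, AS_BACKGROUND), 1))    # -8.2
```

Under that assumed background, the two extremes reproduce the reported I-geo range for As (−8.2 to 5.3).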

Improved Nemerow's Pollution Index (Pn)
The Nemerow Pollution Index made it possible to evaluate the quality of the soil ecosystem comprehensively. For each sampling site, the index was calculated with the modified formula presented by [36], Equation (2):

$$P_n = \sqrt{\frac{I_{geo,max}^{2} + I_{geo,ave}^{2}}{2}} \quad (2)$$

where $I_{geo,max}$ = the maximum I-geo value, and $I_{geo,ave}$ = the arithmetic mean of the I-geo values.
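
The improved Nemerow index combines the maximum and the mean I-geo at a site. A minimal sketch, assuming the commonly used form Pn = sqrt((Igeomax² + Igeoave²) / 2); the function name and the six I-geo values below are hypothetical and are not taken from the study's data.

```python
import math


def improved_nemerow(igeo_values: list[float]) -> float:
    """Improved Nemerow index from the I-geo values of one sampling site."""
    if not igeo_values:
        raise ValueError("at least one I-geo value is required")
    igeo_max = max(igeo_values)
    igeo_ave = sum(igeo_values) / len(igeo_values)
    return math.sqrt((igeo_max ** 2 + igeo_ave ** 2) / 2)


# Hypothetical I-geo values for As, Co, Cu, Ni, V, and Zn at one site.
site = [3.0, 1.0, 1.0, 2.0, 2.0, 0.0]
print(round(improved_nemerow(site), 2))  # max = 3.0, mean = 1.5
```

Because the maximum enters with the same weight as the mean, a single strongly enriched metal raises Pn even when the other metals are near background.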

Contamination Factor (CF)
The contamination factor (CF) of each metal in the study was calculated by dividing the measured total concentration of the metal by its background value, using Equation (3) [38]:

$$CF = \frac{C_{metal}}{C_{background}} \quad (3)$$

where $C_{metal}$ = the measured total concentration of the heavy metal, and $C_{background}$ = the background value of that metal. According to Hakanson [39], there are four levels of CF: CF ≤ 1 (low contamination), 1 < CF ≤ 3 (moderate contamination), 3 < CF ≤ 6 (considerable contamination), and CF > 6 (high contamination).
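
The contamination factor and its four classes can be sketched as follows. The function names are hypothetical, the value exactly at 6 is assigned to the "considerable" class (an assumption about the boundary), and the As background of 2.0 mg kg⁻¹ is inferred from the reported maximum concentration (121.20 mg kg⁻¹) and maximum CF (60.6) rather than stated in the text.

```python
def contamination_factor(c_metal: float, c_background: float) -> float:
    """CF = measured total concentration / background value."""
    if c_background <= 0:
        raise ValueError("background value must be positive")
    return c_metal / c_background


def cf_class(cf: float) -> str:
    """Four CF levels after Hakanson, with upper bounds at 1, 3, and 6."""
    if cf <= 1:
        return "low contamination"
    if cf <= 3:
        return "moderate contamination"
    if cf <= 6:
        return "considerable contamination"
    return "high contamination"


AS_BACKGROUND = 2.0  # assumed, inferred from 121.20 / 60.6

cf_max_as = contamination_factor(121.20, AS_BACKGROUND)
print(round(cf_max_as, 1), cf_class(cf_max_as))  # 60.6 high contamination
```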

Potential Ecological Risk Index (PERI)
The potential ecological risk index (PERI) in the present investigation was computed as the sum of the individual risk factors of all six heavy metals in order to objectively analyze the possible risks from soil contamination by heavy metals, using Equations (4) and (5), adopted from [39]:

$$E_r^i = T_r^i \times CF^i \quad (4)$$

$$PERI = \sum_{i=1}^{n} E_r^i \quad (5)$$

where $E_r^i$ = the potential ecological risk of heavy metal $i$, $T_r^i$ = the toxic response factor for heavy metal $i$, and $CF^i$ = the contamination factor of heavy metal $i$, as mentioned above. The toxic response factors for As, Co, Cu, Ni, V, and Zn are 10, 5, 5, 5, 2, and 1, respectively [36,39,40]. The PERI is classified as follows [36]: PERI ≤ 50 (low risk), 50 < PERI ≤ 100 (moderate risk), 100 < PERI ≤ 150 (high risk), 150 < PERI ≤ 200 (very high risk), and PERI > 200 (extreme risk).
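
A short sketch of the PERI calculation, assuming the usual formulation in which each metal's risk is its toxic response factor multiplied by its CF and the PERI is the sum over the six metals. The toxic response factors and class limits are those listed above; the function names and the CF values in the example are hypothetical.

```python
# Toxic response factors as listed in the text.
TOXIC_RESPONSE = {"As": 10, "Co": 5, "Cu": 5, "Ni": 5, "V": 2, "Zn": 1}


def peri(cf_by_metal: dict[str, float]) -> float:
    """Sum of (toxic response factor * contamination factor) over all metals."""
    return sum(TOXIC_RESPONSE[metal] * cf for metal, cf in cf_by_metal.items())


def peri_class(value: float) -> str:
    """Risk classes with upper bounds at 50, 100, 150, and 200."""
    if value <= 50:
        return "low risk"
    if value <= 100:
        return "moderate risk"
    if value <= 150:
        return "high risk"
    if value <= 200:
        return "very high risk"
    return "extreme risk"


# Hypothetical CF values for one sample (not taken from the study).
sample_cf = {"As": 12.0, "Co": 2.0, "Cu": 2.0, "Ni": 4.0, "V": 3.0, "Zn": 2.0}
total = peri(sample_cf)
print(total, peri_class(total))  # 168.0 very high risk
```

Because As carries the largest toxic response factor, its CF dominates the sum in this example (120 of 168).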

Statistical Analysis
The minimum, maximum, arithmetic mean, and standard deviation for the investigated soil characteristics and heavy metals were computed using SPSS version 25. The linear correlations between the soil variables were verified by using the Pearson correlation coefficient. For the entire data set, the Kaiser-Meyer-Olkin (KMO) method was used to determine whether the samples were adequate. KMO values greater than 0.6 indicated that the data were suitable for PCA [41]. To eliminate multi-collinearity between the original variables, PCA was employed to reduce the dataset into principal component (PC) variables. Additionally, the Bartlett test was used to verify data fitness, and the results showed that p < 0.05, further confirming the data fitness for PCA [42].
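
The standardization and correlation steps can be illustrated without a statistics package. The sketch below is not the SPSS procedure itself; it assumes z-scores based on the sample standard deviation (n − 1 denominator), and the clay and Zn values are hypothetical.

```python
import math


def zscores(values: list[float]) -> list[float]:
    """Standardize values to zero mean and unit sample standard deviation."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [(v - mean) / sd for v in values]


def pearson_r(x: list[float], y: list[float]) -> float:
    """Pearson correlation as the mean cross-product of z-scores."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    zx, zy = zscores(x), zscores(y)
    return sum(a * b for a, b in zip(zx, zy)) / (len(x) - 1)


# Hypothetical clay content (%) and Zn concentration (mg/kg) for five samples.
clay = [12.0, 25.0, 38.0, 44.0, 51.0]
zinc = [40.0, 95.0, 160.0, 150.0, 230.0]
print(round(pearson_r(clay, zinc), 2))
```

Standardizing first puts variables measured in different units (pH, dS m⁻¹, %, mg kg⁻¹) on a common scale, which is why PCA on z-scores is equivalent to PCA on the correlation matrix.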

Utilizing IDW to Map Soil Properties and Heavy Metal Concentrations
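Interpolation maps of selected soil characteristics and heavy metal concentrations were created using the IDW tool in ArcGIS 10.7. This method calculates the value at each grid node from nearby sites within a user-defined search radius. Because it is simple to operate, IDW is frequently employed in soil studies [28,43-47]. The local influence of a measurement site diminishes with distance from it, as expressed by the general form of the IDW estimator:

$$\hat{Z}(s_0) = \frac{\sum_{i=1}^{n} w_i\,Z(s_i)}{\sum_{i=1}^{n} w_i}, \qquad w_i = \frac{1}{d_i^{\,p}}$$

where $\hat{Z}(s_0)$ = the predicted value at the unsampled location $s_0$, $Z(s_i)$ = the measured value at site $s_i$, $d_i$ = the distance between $s_i$ and $s_0$, $n$ = the number of sites within the search radius, and $p$ = the power parameter.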
In this research, agricultural and industrial drainage directly impact the elements present in soil at a certain concentration, which fluctuates depending on the distance from the source; thus, it is preferable to choose this IDW technique [6].
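
The distance-decay idea behind IDW can be sketched in a few lines of Python. This is a generic illustration, not the ArcGIS implementation: the function name `idw`, the power of 2, and the sample coordinates and values are all assumptions, since the text does not state the settings that were used.

```python
import math


def idw(known, target, power=2.0, radius=None):
    """Inverse distance weighted estimate at `target`.

    known:  list of (x, y, value) tuples for the sampled sites
    target: (x, y) of the grid node to estimate
    power:  distance exponent (assumed 2.0 here)
    radius: optional search radius; sites farther away are ignored
    """
    num = 0.0
    den = 0.0
    for x, y, value in known:
        d = math.hypot(x - target[0], y - target[1])
        if d == 0:
            return value  # the node coincides with a sampled site
        if radius is not None and d > radius:
            continue
        w = 1.0 / d ** power
        num += w * value
        den += w
    if den == 0:
        raise ValueError("no sampled sites within the search radius")
    return num / den


# Hypothetical sites: (x km, y km, Zn mg/kg).
sites = [(0.0, 0.0, 10.0), (2.0, 0.0, 20.0)]
print(idw(sites, (1.0, 0.0)))  # 15.0 (both sites equally distant)
print(idw(sites, (0.5, 0.0)))  # 11.0 (the closer site dominates)
```

A larger power makes the estimate follow the nearest site more closely, which suits contaminants whose concentration falls off quickly with distance from a drain or other source.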

Some Soil Properties and Heavy Metal Concentrations of the Investigated Area
Table 1 presents a detailed statistical analysis of the concentrations of heavy metals (HMs) (As, Co, Cu, Ni, V, and Zn). The concentration of total arsenic ranged between 0.01 and 121.20 mg kg⁻¹ with an average value of 25.52 ± 25.97 mg kg⁻¹. Arsenic in the soil is typically thought to have a geological origin and is more prevalent in clayey soils. However, because more anthropogenic sources than natural ones emit arsenic into the environment, there is a significant amount of anthropogenic arsenic pollution [48]. The cobalt concentration ranged between 1.05 and 59.94 mg kg⁻¹, with an average value of 17.39 ± 13.33 mg kg⁻¹. At low concentrations, cobalt has a crucial role in the growth of leguminous plants. It also benefits human health because it is a part of Vitamin B12. Nevertheless, in excess it can have negative consequences, such as limiting plants' nitrogen metabolism and photosynthesis, and it may also have serious effects on the human lungs and heart [49]. The average copper concentration is 37.80 ± 24.90 mg kg⁻¹. Cu accumulation in soils is due mainly to human activity, such as mining or industrial activity. Copper-containing compounds are frequently used in agriculture, particularly in insecticides used in vineyards and orchards [50]. This could be the reason for the increased Cu contents detected in soil samples from Mediterranean regions where these land uses are widespread [51]. The total nickel content ranged from 0.03 to 195.43 mg kg⁻¹, with a mean value of 70.29 ± 54.60 mg kg⁻¹. Like the majority of other heavy metals, nickel in the soil can have either a natural or a human-made origin, and soil contamination may also be caused by industrial activities [52]. The vanadium concentration varied from 1.01 to 377.50 mg kg⁻¹ with an average value of 170.95 ± 123.84 mg kg⁻¹; the role of V in soil toxicity to the environment is debatable [53]. It is an essential nutrient for many organisms, including humans, in trace amounts [16]. The average value of the total zinc concentrations is 149.27 ± 144.02 mg kg⁻¹. Zinc is a necessary metal for both plants and humans, but too much of it can be hazardous [54]; thus, it is essential to manage its level in agricultural soils. Excess zinc may have immediate, harmful effects and, among other things, result in digestive and immune system issues. Additionally, excessive levels of zinc may decrease the absorption of copper, leading to signs of copper insufficiency [53]. All of the selected heavy metal concentrations are higher than the levels proposed by [34], whereas the average As, V, and Cu concentrations are greater than the values recommended by [55]. The pH values ranged between 7.79 and 8.90; pH is a very important property, as the availability of soil nutrients to plant roots, the biological activity in various soil settings, and the activity of enzymes are all affected by the soil pH [56]. The research area's EC values have a wide range, from 0.55 to 23.20 dS m⁻¹ with an average of 5.55 ± 5.75 dS m⁻¹. The high salinity of the water table and the influence of lake water and seawater may have contributed to the high levels of EC in some locations of the study area. Most of the soil in the northern Delta is classified as having high soil salinity; therefore, this fits the region's typical trend [57,58]. The data we obtained indicated that the soil organic carbon ranged from 0.06 to 12.30 (5.26 ± 3.61 mg kg⁻¹).

Principal Component and Cluster Analysis
According to the (Pearson) correlation matrix of the studied variables in Figure 3, the correlation between As, Co, Cu, Ni, V, and Zn is considerable. Regardless of the correlation significance, the soil pH was negatively correlated with each variable [59]. The association between the selected soil properties and the selected heavy metals was found to be strong. Strong positive relationships between organic matter and heavy metals were found; these results are consistent with [60]. The clay content and heavy metal concentrations also have a strong correlation, similar to that reported by [61]. PCA was used to extract the variables from the original data and group them into factors (principal components) [42,62]. In other investigations [63,64], PCA has been used to evaluate soil contamination with selected heavy metals. Before the PCA, the variables were standardized by means of Z-scores in SPSS software version 25. The Kaiser-Meyer-Olkin (KMO) test was employed to assess the sampling adequacy for all variables [65]. The KMO value of 0.85 (Table 2) exceeded the 0.6 threshold [59], so the sampling was considered adequate. The PCA was based on 61 observations and 10 variables. The scree plot displays the eigenvalues of the principal components (Figure 4). According to the data shown in Table 3, only two components had eigenvalues greater than 1, with a cumulative variability of 72.76%, and the remaining components were discarded based on the Kaiser criterion [66]. According to the PCA, the eigenvalue and variability for PC1 were 5.99 and 59.95%, respectively, whereas PC2 explained a further 12.81% of the variability, bringing the cumulative variability to 72.76%. In this study, the data were divided into two groups using agglomerative hierarchical clustering (AHC clusters). PCA was used to help in detecting pollution source locations by taking the correlation matrix's eigenvalues. 
A strong correlation between the metals could indicate a common source, such as the industrial activity in the research region or the use of agricultural pesticides and fertilizers. The dendrogram in Figure 5 demonstrates how the two clusters were different from one another; each cluster had unique traits. With differing ranges, averages, and standard deviations (St. Dev) for all variables, the first cluster had 35 observations and the second had 26 observations. These two clusters were taken from the PCA. The collected results revealed considerable changes in As-Co-Cu-Ni-V-Zn-Clay-OM between the two clusters, and all the heavy metals had a negative correlation with the soil pH.
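
The component-retention rule can be checked with simple arithmetic. For a PCA on standardized variables, the eigenvalues sum to the number of variables, so an eigenvalue can be recovered from a component's share of the variance. The sketch below assumes that property and reads the reported 59.95% and 12.81% as the variance shares of PC1 and PC2; the function name is hypothetical.

```python
def eigenvalue_from_percent(percent_variance: float, n_variables: int) -> float:
    """Eigenvalue implied by a variance share in a correlation-matrix PCA."""
    return percent_variance / 100.0 * n_variables


N_VARIABLES = 10  # variables entered into the PCA, as reported
reported_percent = {"PC1": 59.95, "PC2": 12.81}  # reported variance shares

eigenvalues = {
    pc: eigenvalue_from_percent(pct, N_VARIABLES)
    for pc, pct in reported_percent.items()
}

# Kaiser criterion: keep components whose eigenvalue exceeds 1.
retained = [pc for pc, ev in eigenvalues.items() if ev > 1.0]
cumulative = sum(reported_percent.values())

print(retained)               # ['PC1', 'PC2']
print(round(cumulative, 2))   # 72.76
```

Under these assumptions the implied PC1 eigenvalue is about 5.995, in line with the reported 5.99, and the implied PC2 eigenvalue of about 1.28 also passes the eigenvalue-greater-than-1 rule.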

The Spatial Distribution of the Mean Soil Properties and Heavy Metal Weight
As shown in Figure 6, the mean As content increases toward the southeast of the research area (Figure 6a), whereas only modest amounts were present in the northwest. The eastern portions of the research area (Figure 6b) had significantly higher concentrations of Co, which may be related to local human activity and agricultural management techniques in terms of fertilizer application [67,68]. According to the interpolation maps in Figure 6d,e, the sites in the south, east, and center of the study area had the highest values for Ni, whereas the spatial distribution of V was irregular, with the highest values dispersed throughout the north and west of the study region. The main sources of soil Ni pollution are industries involved in metal plating, the burning of fossil fuels, nickel mining, and electroplating. Other anthropogenic sources include sewage sludge and other wastes that are used as soil conditioners, phosphate-based agricultural fertilizers, atmospheric deposition, and inorganic fertilizers [69-74]. The spatial patterns demonstrated that Cu and Zn concentrations rose from the north to the southwest (Figure 6c,f). The highest EC levels, shown in red on the EC map, are located in the places adjacent to Idko Lake, while the highest pH values were identified in the northeast and southeast of the study area (Figure 6g,h). These results are consistent with [28]. SOC levels were highest in the west of the study area (Figure 6i). The study area has numerous textural classes, such as sandy, silty clay, silty clay loam, sandy loam, and clay (Figure 6j).

Contamination Hazards Indices
The findings demonstrate that the I-geo values of the six heavy metals ranged from non-polluted to highly contaminated. The results show significant differences in the levels of As, Co, Cu, Ni, V, and Zn in the various soil layers, as seen in Table S1. I-geo (As) ranged between −8.2 and 5.3, I-geo (Co) between −4.1 and 1.8, I-geo (Cu) between −6.4 and 1.9, I-geo (Ni) between −9.7 and 2.8, I-geo (V) between −6.3 and 2.9, and I-geo (Zn) between −12.5 and 2.4. Therefore, the I-geo classification varied from uncontaminated to strongly and extremely contaminated. I-geo (As) fell in the strongly to extremely contaminated class in some surface samples, which suggests that several surface and sub-surface samples had a strong level of contamination. For As, Co, Cu, Ni, V, and Zn, the CF values ranged from 0.01 to 60.6, 0.09 to 5.17, 0.02 to 10.51, 0 to 10.51, 0.02 to 7.12, and 0 to 7.68, respectively. As a result, the classification of CF in the study area ranges from low contamination to very high contamination. CF values are arranged in descending order as follows: CF (As) > CF (Ni) > CF (Zn) > CF (V) > CF (Cu) > CF (Co) (Table S2). Various soil contamination indices, including Pn and PERI, have been used by many other researchers [75,76]. According to the Pn index classification, the study area comprised moderately polluted, moderately to heavily polluted, heavily polluted, and heavily to extremely polluted classes, as shown in Table 4 and Figure 7a. The largest part of the study area fell in the moderately to heavily polluted class (79.19%). The moderately polluted class occupied only about 1.36% of the study area, whereas 12.14% of the research region was heavily polluted and 7.31% was classified as heavily to extremely polluted. According to the PERI values, 19.45% of the study area was in the low risk class, while 1.36% of the research area was in the high risk class. 
The largest share of the study area (49.17%) was under a very high risk of contamination, and approximately 30.02% of the study area was at extreme risk, as shown in Table 5 and Figure 7b. This area is one of Egypt's most populated, fertile, and agriculturally developed areas; therefore, it can maintain a sizable population [6]. It is also under significant stress because it is close to the sea and receives a large volume of domestic drainage in the Nile Delta. In the absence of official planning, residential neighborhoods and industrial zones frequently overlap. As a result, contaminants are released into the area's soil, water, and air from a variety of anthropogenic activities, such as industrial operations, transportation networks, residential domestic waste, sewage sludge effluents from within or near urban areas, and the use of fertilizers, pesticides, and insecticides in farmlands [4]. Numerous techniques have been developed to clean up heavy metal-contaminated soil; they are commonly classified as physical, biological, or chemical. In situ chemical amendment of soil is a remediation method that stabilizes the heavy metals in polluted soil and is cost-effective when compared to other remediation strategies [77]. Inorganic amendments are used in this method to lessen the mobility and bioavailability of heavy metals through adsorption, precipitation, ion exchange, and complexation [78]. The use of tourmaline (a borosilicate mineral with a very complex chemical composition) has been suggested for cleaning up soils historically contaminated with heavy metals, and it has a good chance of success in contaminated alkaline soils [79]. In addition, with increasing quantities of tourmaline applied to the soil, the concentrations of water-soluble Ca and Mg in the soil rose significantly. Moreover, tourmaline decreased the metal mobility in polluted soil without increasing the soil pH.

Conclusions
Soil contamination by heavy metals in the northern Nile Delta, Egypt, is regarded as one of the major obstacles to sustainable development and food security, and its assessment is the focus of the current study. The results revealed that I-geo (As) varied from −8.2 to 5.3, I-geo (Co) from −4.11 to 1.8, I-geo (Cu) from −6.4 to 1.9, I-geo (Ni) from −9.7 to 2.8, I-geo (V) from −6.3 to 2.9, and I-geo (Zn) from −12.5 to 2.4. I-geo (As) ranged from strongly to extremely contaminated in some surface samples, indicating that several surface and sub-surface samples had a high level of contamination. CFs rank in decreasing order as follows: CF (As) > CF (Ni) > CF (Zn) > CF (V) > CF (Cu) > CF (Co). According to the Pn classification, the moderately to heavily polluted class made up the majority (71.9%) of the research region. Additionally, according to the PERI results, the largest share of the study region (49.17%) had a very high risk of contamination. The use of a correlation matrix, cluster analysis, and principal component analysis to assess the variability in the soil's chemical content revealed the influence of anthropogenic activities on the heavy metal concentrations in the soils of the study area. The present study indicates that human activity, such as the overuse of mineral fertilizers and pesticides, has harmful consequences for the soils investigated in the study area. The study concludes by advising the implementation of agricultural management rules to curb the human activities that create environmental pollution. Future research will also concentrate on how to adapt to and minimize the consequences of soil contamination.

Supplementary Materials:
The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/agronomy12112871/s1, Table S1: Geo-Accumulation Index (Igeo) and associated contamination levels of study area; Table S2: Contamination factor (CF) and associated contamination levels of study area.