A Soil Environmental Quality Assessment Model Based on Data Fusion and Its Application in Hebei Province

Soil pollution has become one of the most important environmental issues in China, and it is important to evaluate soil environmental quality comprehensively and objectively. This paper proposes a soil environmental quality assessment model based on the Driving Force-Pressure-State-Impact-Response (DPSIR) model and data fusion. First, 18 evaluation indicators are selected, including complex indexes, such as the industrialization index, heavy metal pollution index, organic pollution index, potential ecological risk index, and human health risk index, and single indexes, such as population density, fertilizer/pesticide application intensity, and annual average air quality index. Then, a hierarchical analysis model is constructed, and the weight of each indicator is calculated using the Analytic Hierarchy Process (AHP) method. According to the quartiles of the indicator values of 32 provincial administrative divisions on the Chinese mainland, the values of each indicator are standardized and graded. Finally, the soil environmental quality index (SEQI) is calculated as the weighted average of the standard values of the 18 indicators. The assessment model is then applied to evaluate the soil quality of Hebei Province, China. The results show that the soil environmental quality of Hebei's agricultural land is in a medium state, and that of the industrial land is approaching the alert state. The pressure of soil pollution mainly comes from the discharge of industrial pollutants and the application of pesticides and fertilizers. Soil pollutants such as lead, copper, zinc, benzo[a]pyrene, and benzo[a]anthracene should be especially controlled.


Introduction
Soil quality has become one of the most important environmental issues in China's modernization process. Soil pollution is related to agricultural safety production, cultivated land quality, and people's health. The National Soil Pollution Status Survey Bulletin, jointly issued by the former Ministry of Environmental Protection and the Ministry of Land and Resources in 2014, showed that the total over-standard rate of soil pollutants in China is as high as 16.1%, indicating that the soil conditions in China are not optimistic [1]. With the awareness of environmental protection and sustainable development, people began to realize the negative effects of soil pollution on production systems and their lives. Soil pollution refers to the accumulation of heavy metals and organic and other pollutants, which lead to an unbalanced nutrient availability for plants, changes in soil microbial community abundance and structure, soil ecosystem degradation and the pollution of groundwater, further affecting the quality and safety of crops and human health [2][3][4][5]. Due to the diversity of soil resource utilization and management methods, there are many indicators to evaluate soil quality [6][7][8][9]. It is necessary to construct a comprehensive assessment model to evaluate soil environmental quality, according to different evaluation purposes. Besides, the soil pollution data sources are widely distributed in the departments of environmental protection, land resources, agriculture, forestry, industry, etc. It is necessary to integrate multi-source soil environmental data into the soil environmental quality assessment system as well as to ensure the reliability of soil quality assessment results.
At present, soil quality assessment is still at an early, exploratory stage. The commonly used soil quality evaluation methods are summarized as follows.
(1) The multiple variable indicator kriging method (MVIK) [10] converts multiple single soil quality indicators into a comprehensive soil quality index. The standard of each indicator represents the optimal range or threshold of soil quality, which is established and evaluated based on soil properties in the region.
(2) The soil quality model method [11] subdivides the soil quality assessment into 6 specific quality elements: $SQ = f(SQ_{E1}, SQ_{E2}, SQ_{E3}, SQ_{E4}, SQ_{E5}, SQ_{E6})$, where $SQ$ is the comprehensive soil quality score and $SQ_{E1}$ to $SQ_{E6}$ represent the crop yield, erosion resistance, groundwater quality, surface water quality, air quality, and food quality, respectively. The soil quality score $SQ$ is calculated by a simple multiplication of $SQ_{E1}$ to $SQ_{E6}$. This method is widely applicable, but the available information is sometimes insufficient, so it cannot fully reflect the optimal functional relationship among the different soil quality elements.
(3) The soil quality index method [12] consists of three steps: selecting the appropriate indicators, transforming the indicators into component values, and generating the comprehensive evaluation score. Karlen et al. used a standardized score curve to calculate the score of each indicator and obtain the soil quality index. They then selected several important soil functions related to soil quality to evaluate the impact of different management systems on soil quality. This method is suitable for the sustainable management of soil and can give an early indication of whether management measures have positive or negative effects on the sustainable development of soil [13].
(4) The soil relative quality method [14] first assumes an ideal soil condition; the soil quality indicators are then obtained by comparison with the ideal soil. This method is convenient and reasonable, and its results are practical and therefore suitable for the long-term management of soil. However, the choice of ideal soil is critical: a different ideal soil must be selected according to the soil conditions in the study area.
(5) The fuzzy mathematics comprehensive method [15] describes the gradualness and ambiguity of changes in soil quality status through membership degrees, which reflect the comprehensive influence of each factor on the overall evaluation result and make the result more accurate and reliable. In this method, the membership degree of each indicator, $md_i$ with $0 \le md_i \le 1$, is determined by the membership function and the weighted average model, and is used to represent the soil quality. This method not only makes use of the boundary fuzziness of each indicator but also considers the combined influence of the value, the weight, and the interaction among indicators, minimizing the influence of detection errors on the evaluation results. Thus, the main restrictive factors affecting soil quality in the study area can be identified from the evaluation results.
In summary, there are no internationally or locally unified standards or methods for evaluating soil environmental quality. People usually use one of the above methods, or a combination of several, depending on the evaluation object and purpose. Soil environmental quality assessment is a complex procedure. Firstly, current soil environmental quality indicators are too simple to trace the state of the contaminated soil. Soil pollution is closely related to regional economic, social, and agricultural development, climate and geographical environment, environmental protection measures, and technical means. However, the evaluation indicators focus only on the physical and chemical characteristics of the soil and do not address the relationship between soil pollution and economic and social factors, agricultural development, climate, and meteorological changes. These relatively simple soil environmental assessment methods have a limited effect on the prevention and reduction of soil pollution [16][17][18][19]. Secondly, given the complexity of soil environmental data, existing evaluation methods cannot describe the overall characteristics of the soil environment well and cannot trace the main causes of soil pollution.
To solve the above problems, this study proposes a comprehensive soil environmental quality assessment method based on the Driving Force-Pressure-State-Impact-Response (DPSIR) model [20]. Developed from the pressure-state-response (PSR) model [21], the DPSIR model can analyze the interaction between human activities and the resource environment from a system perspective, reflecting their causality and constraint relationships. It can also decompose and simplify the complex process of interactions among factors, making the indicator system clear and simple. Once an indicator value changes, the DPSIR model can provide timely and continuous feedback, which makes the indicator system more flexible and relevant. Besides, the DPSIR model can first decompose complex problems and then effectively synthesize them, which is a strong advantage in the complex task of soil environmental quality assessment. The DPSIR model was first proposed by the European Environment Agency (EEA) and has been widely used in ecological environment assessment [22][23][24]. Our contributions in this study are summarized as follows. We propose a soil environmental quality assessment method based on the DPSIR model, and 18 evaluation indicators are selected according to the principles of the DPSIR model. The weight of each indicator is calculated by the Analytic Hierarchy Process (AHP) method. According to the quartiles of the indicator values of 32 provinces and municipalities of Mainland China, the 18 indicator values are graded and standardized. Finally, the weighted average of these indicators is calculated as the soil environmental quality index. This method is used to evaluate the soil environmental quality of Hebei Province, China. The evaluation results show that the soil environmental quality of Hebei's agricultural land is in a medium state, and that of the industrial land is approaching the alert state. Some indicators, such as pesticide application intensity and environmental protection investment, are in an extremely poor state and deserve attention.
The rest of this paper is organized as follows. Section 2 introduces the soil quality evaluation model based on DPSIR and data fusion. Section 2.1 introduces the selection and description of indicators, especially the complex indicators. Section 2.2 constructs the soil environmental quality assessment model, consisting of three procedures: the establishment of a hierarchical analysis model and calculation of each indicator's weight, the standardization of each indicator's value and the integration of each indicator's value, for a comprehensive evaluation. Section 3 is an application of this model in Hebei province, China. Finally, some significant discussion points and conclusions are listed in Sections 5 and 6.

Soil Environmental Quality Assessment Method Based on the DPSIR Model
The DPSIR model is an indicator system commonly used in the evaluation of ecological environment systems [25,26]. It divides the evaluation indicators into five modules: driving force, pressure, state, impact, and response. For soil environmental quality evaluation, the five modules can be explained as follows (see also Figure 1).
(1) Driving force: Refers to the socio-economic development factors, such as industrial development and population growth, that lead to changes in soil environmental quality. These are the most fundamental factors forcing changes in the soil environment.
(2) Pressure: Refers to the pressure directly applied to the soil environmental quality after the driving force acts. Similar to the driving force, pressure refers to external factors, such as corporate sewage discharge and pesticide and fertilizer application, that cause soil environmental change. While the driving force affects the soil environmental quality implicitly, the pressure acts explicitly.
(3) State: Refers to the pollution status of the soil environment under the above pressures, such as organic pollution and heavy metal pollution in soil.
(4) Impact: Refers to the impact of current soil environmental conditions on human health and the social economy, such as potential ecological risks and human health risks.
(5) Response: Refers to countermeasures, policies, and environmental technologies that are used to prevent soil pollution, such as soil environmental restoration, government environmental protection investment, and policies.
The evaluation indicator system of the DPSIR model aims to capture the essence of the evaluated object comprehensively. However, in a specific application of soil environmental assessment, the selection of indicators depends on the evaluation object, the evaluation objectives, the knowledge background, and the theoretical basis of the evaluation. Therefore, it is necessary to select suitable indicators for the soil environmental assessment system, based on the principles of sensitivity and independence.

Indicator Selection and Description
According to the rules of sensitivity and independence, indicators of soil environmental quality assessment are selected, as shown in Table 1.  Different from other simple indicators, which can be directly gained from statistical yearbook or other data sources, the following indicators (X 1 , X 11 , X 12 , X 13 and X 14 ) are obtained from complex calculations of several parameters. Calculation processes of these indicators are explained in the following part.
(1) X 1 Industrialization index. X 1 refers to the industrial development level of the studied area. Generally, the higher the industrialization level is, the worse the soil environmental quality is. Four parameters are taken into account: the ratio of heavy to light industry, the share of non-agricultural industry, per capita GDP, and the non-agricultural labor ratio. Following the calculation method in [27], the initial value is obtained by principal component analysis in SPSS. The industrialization index of a province is between 0 and 10.
(2) X 11 Heavy metal pollution index. According to the Soil Environmental Quality Standard [28,29], the main heavy metal pollutants are cadmium (Cd), arsenic (As), mercury (Hg), copper (Cu), lead (Pb), chromium (Cr), zinc (Zn), and nickel (Ni). To measure the synthesized pollution of heavy metals, the commonly used Nemerow index is introduced:
$$P_z = \sqrt{\frac{P_{i\max}^2 + P_{i\mathrm{avg}}^2}{2}} \tag{1}$$
where $P_{i\max}$ and $P_{i\mathrm{avg}}$ represent the maximum single pollution index and the average pollution index of the i-th sampling point, respectively, and $P_z$ is the synthesized heavy metal pollution value.
(3) X 12 Organic pollution index. We also use the Nemerow index to calculate the synthesized pollution of organics.
(4) X 13 Potential ecological risk index. This index is calculated by a method proposed by Hakanson [28].
The calculation not only takes heavy metal pollution into account but also considers the multi-element synergy and other factors. The calculation formula is shown as Equation (2):
$$C_f^i = \frac{C_s^i}{C_n^i}, \qquad E_r^i = T_r^i \, C_f^i \tag{2}$$
where $C_f^i$ represents the pollution index of the i-th pollutant; $C_s^i$ and $C_n^i$ represent the detected value and the reference value of the i-th pollutant, respectively; and $T_r^i$ is the toxicity response coefficient of the i-th pollutant. The result, $E_r^i$, is the potential ecological risk index of the i-th pollutant. The overall potential ecological risk is obtained by summing $E_r^i$ over all pollutants. The toxicity response coefficients are set according to references [28,29], as shown in Table 2.
(5) X 14 Human health risk index. According to China's Technical Guidelines for Contaminated Site Assessment and the carcinogenicity data provided by the integrated risk information system of the United States Environmental Protection Agency (US EPA), the carcinogens among the soil pollutants are arsenic, lead, hexachlorocyclohexane, and aldrin. The human health risk index consists of the total cancer risk and the total non-carcinogenic risk. The health risk of each pollutant is calculated separately, and the total health risk is obtained by adding the individual risks together. Three types of land (agricultural, residential, and industrial) and three intake pathways (oral, inhalation, and dermal contact) are taken into account. We use EDI 1 to EDI 3 to denote the oral, inhalation, and dermal contact intakes, respectively. The Risk-based Corrective Action (RBCA) model is used to calculate the index, as shown in Equation (3).
In the formula, CS is the concentration of the chemical substance in the soil (mg/kg), and ABS is the skin absorption coefficient. The exposure assessment parameters follow the US Environmental Protection Agency's Exposure Factors Handbook and the Technical Guidelines for Contaminated Site Risk Assessment [30,31]; the specific values are shown in Table 3. The total cancer risk and the non-carcinogenic risk are expressed by Equation (4), in which SF is the slope factor of the carcinogen and RfD is the chronic reference dose. The calculation is based on the Chinese Technical Guidelines for Risk Assessment of Contaminated Sites [30]. For the carcinogenic risk, we set 10⁻⁶ as the lower limit of acceptable risk.
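Equation (4) is not reproduced in this text. As a minimal sketch, assuming the form commonly used in RBCA-style assessments (cancer risk as intake times slope factor, non-carcinogenic risk as intake divided by reference dose, each summed over the three intake pathways), the calculation could look like the following. The function names and all numeric inputs are hypothetical and chosen only for illustration; they are not values from Table 3.

```python
# Threshold taken from the text: 1e-6 is the lower limit of acceptable cancer risk.
ACCEPTABLE_CANCER_RISK = 1e-6

def cancer_risk(edi, sf):
    """Sum of EDI_k * SF_k over the intake pathways (assumed form of Eq. 4)."""
    return sum(e * s for e, s in zip(edi, sf))

def hazard_index(edi, rfd):
    """Sum of EDI_k / RfD_k over the intake pathways (assumed form of Eq. 4)."""
    return sum(e / r for e, r in zip(edi, rfd))

# Hypothetical daily intakes in mg/(kg*day): oral, inhalation, dermal contact.
edi = [2e-6, 1e-7, 5e-7]
# Hypothetical slope factors and reference doses, one per pathway.
sf = [1.5, 1.5, 1.5]
rfd = [3e-4, 3e-4, 3e-4]

risk = cancer_risk(edi, sf)
hi = hazard_index(edi, rfd)
exceeds_limit = risk > ACCEPTABLE_CANCER_RISK
```

Summing per-pollutant results of `cancer_risk` would then give a total cancer risk in the sense described in the text.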

Evaluation Based on the DPSIR Model
In this section, we will introduce (1) the establishment of a hierarchical analysis model and calculation of each indicator's weight, (2) the standardization of each indicator's value, and (3) the integration of each indicator's value, for a comprehensive evaluation.

Establishment of Hierarchical Analysis Model
Based on the principle of hierarchical analysis, a hierarchical structure of the soil environmental quality assessment model is constructed, including the target layer (T), criterion layer (D), and indicator layer (X). The criterion layer contains five criteria: D 1 (driving force), D 2 (pressure), D 3 (state), D 4 (impact), and D 5 (response), and the indicator layer contains 18 indicators. The hierarchical analysis model is shown in Figure 2. There are several methods that can be used for calculating weights in the DPSIR model, such as the AHP method [32,33], the Delphi method [34], the coefficient of variation method [35], the entropy weight method [36], and the principal component analysis method [37]. When using the entropy weight method or the principal component analysis method, we need to know the values of each indicator before calculating the weights; however, this is not realistic in our application. Therefore, we choose the AHP method to calculate the indicator weights.
The relative weights of the criteria and indicators are calculated by pairwise comparisons. The relative importance values of the indices are shown in Table 4. The criterion layer contains D 1 to D 5 , and the indicator layer corresponds to X 1 to X 18 , a total of 18 indicators. We construct 4 matrices in the indicator layer. Since X 11 to X 14 are regarded as equally important within their criteria in our evaluation, each is assigned a weight of 0.5000.
Firstly, the maximum eigenvalue, λ max , of each matrix and its corresponding eigenvector, w, are calculated. Then, a consistency check on each matrix is performed. If it passes the consistency check, the eigenvector corresponding to the largest eigenvalue is the weight. The consistency index is calculated by the formula CI = (λ max − n)/(n − 1), where n is the order of the judgment matrix.
To measure the size of CI, the random consistency index RI is set as [0, 0.58, 0.90, 1.12, 1.24, 1.32, 1.41, 1.45, 1.49, 1.51]. The consistency ratio is CR = CI/RI. When CR < 0.1, the matrix passes the consistency check. We use Matlab to calculate the weights; the results are shown in Table 5. The weight of D 3 (state) is 0.4057, the weight of D 2 (pressure) is 0.2312, and the weight of D 4 (impact) is 0.1649, indicating that the pollution state, pollution source pressures, and pollution impacts are the main factors that affect soil environmental quality. In the D 2 (pressure) layer, the indicators X 5 and X 4 have the highest weights, 0.2465 and 0.2329, respectively, indicating that heavy metal and organic emissions are the main pressures affecting soil quality. In the D 5 (response) layer, the indicators X 17 and X 16 have the highest weights, 0.4914 and 0.1046, respectively, meaning that wastewater reuse is a very important technical means of controlling soil pollution.
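The eigenvector and consistency-check steps above can be sketched in Python with NumPy (the paper itself uses Matlab). The judgment matrix below is hypothetical, not one of the matrices behind Table 5, and the `RANDOM_INDEX` table uses Saaty's commonly cited values indexed by matrix order, which is an assumption about how the RI list in the text is meant to be indexed.

```python
import numpy as np

# Commonly cited random consistency index values, keyed by matrix order n (assumed).
RANDOM_INDEX = {1: 0.0, 2: 0.0, 3: 0.58, 4: 0.90, 5: 1.12,
                6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45, 10: 1.49}

def ahp_weights(matrix):
    """Return (weights, lambda_max, CR) for a pairwise comparison matrix."""
    a = np.asarray(matrix, dtype=float)
    n = a.shape[0]
    eigenvalues, eigenvectors = np.linalg.eig(a)
    k = int(np.argmax(eigenvalues.real))       # index of the largest eigenvalue
    lam_max = float(eigenvalues[k].real)
    w = np.abs(eigenvectors[:, k].real)
    w = w / w.sum()                            # normalize so the weights sum to 1
    ci = (lam_max - n) / (n - 1) if n > 1 else 0.0
    ri = RANDOM_INDEX[n]
    cr = ci / ri if ri > 0 else 0.0            # orders 1 and 2 are always consistent
    return w, lam_max, cr

# Hypothetical 3x3 judgment matrix (perfectly consistent by construction).
judgment = [[1, 2, 4],
            [1 / 2, 1, 2],
            [1 / 4, 1 / 2, 1]]
weights, lam_max, cr = ahp_weights(judgment)
passes = cr < 0.1
```

In a two-layer hierarchy such as the one described here, the global weight of an indicator would be its local weight multiplied by the weight of its criterion, for example 0.5000 × 0.4057 for an indicator holding half of the D 3 weight.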

Standardization of Indicator Values
The indicators selected in Section 2.1 are graded and standardized according to the quartiles of their values across the provincial administrative divisions, as shown in Table 6. Suppose the original value of indicator X i is v i ; it is standardized to x i based on Table 6, using Equations (5a) and (5b),
where R s and R e are the starting and ending points of the standardized half-open interval (R s , R e ], respectively, and the original value of indicator X i lies in the range (V il , V ih ], as shown in Table 6. Equation (5a) is for indicators X 14 to X 18 , and Equation (5b) is for X 1 to X 13 .
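Equations (5a) and (5b) are not reproduced in this text. One plausible reading, offered only as an assumption, is a linear interpolation that maps a raw value in (V il , V ih ] onto the grade interval (R s , R e ], increasing for indicators where larger values are better and decreasing for indicators where larger values are worse. The function name, the `higher_is_better` flag, and the sample numbers are hypothetical.

```python
def standardize(v, v_low, v_high, r_s, r_e, higher_is_better):
    """Linearly map v from (v_low, v_high] onto (r_s, r_e] (assumed form of Eq. 5)."""
    fraction = (v - v_low) / (v_high - v_low)
    if higher_is_better:
        # Larger raw values move toward the upper end of the grade interval.
        return r_s + fraction * (r_e - r_s)
    # Larger raw values move toward the lower end of the grade interval.
    return r_e - fraction * (r_e - r_s)

# Hypothetical raw range (20, 40] mapped onto the "medium" grade (0.5, 0.75].
benefit_score = standardize(25, 20, 40, 0.5, 0.75, higher_is_better=True)
cost_score = standardize(25, 20, 40, 0.5, 0.75, higher_is_better=False)
```

Under this reading, the two branches are mirror images within the same grade interval, so a raw value at the midpoint of its range receives the same score either way.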

Synthetic Evaluations
The quality of the soil environment is described by the synthetic evaluation of the above indicators. After the weight and standard value of each indicator have been calculated, the soil environmental quality index (SEQI) is calculated as the weighted average of the standard values of the 18 indicators:
$$\mathrm{SEQI} = \sum_{i=1}^{18} w_i x_i \tag{6}$$
where $w_i$ is the weight of indicator X i and $x_i$ is its standard value. The result of Equation (6) lies in [0, 1]. The larger the value is, the better the soil environmental quality is. We grade the evaluation results into 4 levels using the upper and lower quartiles: (0.75, 1] means that the soil environmental quality is excellent; (0.5, 0.75] means that it is in a medium state; (0.25, 0.5], an alert state; and (0, 0.25], a bad state.
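As a small illustration of the weighted average and the four-level grading described above, the following sketch uses hypothetical weights and standard values (three indicators rather than 18, for brevity). The function names are illustrative; only the grade boundaries and the two Hebei scores passed to `grade` come from the text.

```python
def seqi(weights, values):
    """Weighted average of standardized indicator values."""
    total = sum(weights)
    return sum(w * x for w, x in zip(weights, values)) / total

def grade(score):
    """Map a score in (0, 1] to the four levels defined in the text."""
    if score > 0.75:
        return "excellent"
    if score > 0.5:
        return "medium"
    if score > 0.25:
        return "alert"
    return "bad"

# Hypothetical global weights and standard values.
weights = [0.5, 0.3, 0.2]
values = [0.8, 0.4, 0.6]
score = seqi(weights, values)

# The two scores reported for Hebei in the case study.
agricultural_grade = grade(0.5815)
industrial_grade = grade(0.5127)
```

Both reported scores fall in (0.5, 0.75]; the industrial score sits close to the 0.5 boundary, which matches the description of industrial land as approaching the alert state.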

Research Area Overview
Hebei Province is located in North China, bounded by 36°05′ to 42°40′ N and 113°27′ to 119°50′ E, with a total area of 188,800 square kilometers. Hebei Province has 11 prefecture-level cities, including 49 municipal districts, 21 county-level cities, 91 counties, and 6 autonomous counties. By the end of 2019, Hebei Province had a permanent population of 75.9197 million and a GDP of 3510.45 billion yuan. Industrial production is based on coal, iron and steel, textiles, and pharmaceutical raw materials. It is also an important grain and cotton production area in China.

Data Acquisition
The original data come mainly from the 2015 Hebei Economic Statistics Yearbook, the 2015 Hebei Statistical Yearbook, the 2015 China Environmental Statistics Yearbook, the Status of Soil Environmental Quality Survey, and the 2015 Environmental Status Bulletin of Hebei Province. Part of the soil pollution data comes from the Hebei Province Economic Geographic Information Big Data Platform.

Soil Quality Assessment of Hebei Province
Using the evaluation method described in Section 2, the scores of each indicator for Hebei Province are calculated, as shown in Table 7. To obtain the comprehensive soil environmental quality index, we sum the weighted d i values (in Table 7) for agricultural land and industrial land separately. The results are 0.5815 for agricultural land and 0.5127 for industrial land. This indicates that the soil environmental quality of agricultural land is in a medium state and that of industrial land is approaching the alert state.

Results
The soil quality index of Hebei obtained from our assessment model is 0.5815 for agricultural land and 0.5127 for industrial land, indicating that the soil environmental quality of agricultural land is in a medium state and that of industrial land is approaching the alert state. From the perspective of the DPSIR model, the criterion layer values d i in Table 7 show that the pressure (D 2 ) is in the range (0.25, 0.5], which means the pressure on soil environmental quality is in an alert state. The standard values of X 8 (fertilizer application intensity) and X 9 (pesticide application intensity) for agricultural land are in the range (0.25, 0.5], indicating that the pressure caused by fertilizer and pesticide application is large. The current pollution state of industrial land (D 3 ) is 0.4150, more serious than that of agricultural land. The impact (D 4 ) is in a medium state, and the response (D 5 ) is excellent. However, the standard value of X 15 is 0.01, indicating that environmental protection investment is extremely low and in a very poor state. Therefore, Hebei province should increase investment in soil pollution prevention. Some of the complex indicators are discussed below.

X 11 Heavy Metal Pollution Index
The heavy metal pollution indexes of agricultural and industrial land are 1.41 and 2.35, respectively. Soil samples were collected at a depth of 0 to 10 cm. We sampled 3224 points in Hebei to describe the soil pollution status, including 1537 samples on industrial land and 1687 on agricultural land; the samples are distributed across the 11 prefecture-level cities at about 1 point per 15 km². The Nemerow index of heavy metal pollution is calculated for each sample point. The results are shown in Table 8. These results indicate that the surface soil of Hebei province is in a state of heavy metal pollution, and that the pollution of industrial land is relatively serious. The relevant departments should treat this as a priority to prevent further pollution of the soil.
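The per-point Nemerow calculation can be sketched as follows. The sketch assumes the usual definition of the single pollution index as measured concentration divided by the standard limit, which the text does not state explicitly; the concentrations and limits are hypothetical and are not taken from Table 8 or from the Soil Environmental Quality Standard.

```python
import math

def single_pollution_index(measured, standard):
    """Single-factor index: measured concentration / standard limit (assumed definition)."""
    return measured / standard

def nemerow_index(single_indexes):
    """Nemerow composite index from the single indexes of one sampling point."""
    p_max = max(single_indexes)
    p_avg = sum(single_indexes) / len(single_indexes)
    return math.sqrt((p_max ** 2 + p_avg ** 2) / 2)

# Hypothetical concentrations and limits in mg/kg for one sampling point.
measured = {"Cd": 0.45, "Pb": 60.0, "Zn": 250.0}
limits = {"Cd": 0.30, "Pb": 120.0, "Zn": 250.0}

p = [single_pollution_index(measured[m], limits[m]) for m in measured]
p_z = nemerow_index(p)
```

Because the maximum single index enters the formula directly, one strongly exceeded pollutant raises the composite value even when the average is low, which is why the index is often used for screening.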

X 12 Organic Pollution Index
The Nemerow indexes of organic pollution of agricultural and industrial land are 1.06 and 1.71, respectively. The sample points for organic pollution are the same as those for heavy metal pollution. The statistics for organic pollution are shown in Table 9. Farmland organic pollution mainly comes from chemical fertilizer application, pesticide application, and sewage irrigation. The pesticide application intensity and fertilizer application intensity scores of Hebei province are both in the range (0.25, 0.5], indicating that pesticide and fertilizer application intensity is in an alert state and requires attention. At the same time, the organic pollution of industrial land is in a medium state. The main pollutants are benzo[a]anthracene and benzo[a]pyrene. The pollution mainly comes from the discharge of industrial pollutants.

X 13 Potential Ecological Risk Index
The original value of the potential ecological risk index is 203, which is calculated by the data of 3224 sampling points. Among the 3224 sampling points, the maximum potential ecological risk index is 263.12, the minimum value is 74.56, and the average value is 203, which reached the medium ecological hazard level (150, 500). Of the sampling points with an ecological hazard index value less than 150, 70% appear in the agricultural land. Besides, the average ecological risk index of the industrial land is also larger than that of the agricultural land.
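The per-point potential ecological risk calculation described in Section 2.1 can be sketched as below. The toxicity response coefficients are the values commonly cited for Hakanson's method and are an assumption here, since Table 2 is not reproduced in this text; the detected and reference concentrations are hypothetical.

```python
# Commonly cited toxicity response coefficients (assumed; Table 2 may differ).
TOXICITY = {"Hg": 40, "Cd": 30, "As": 10, "Cu": 5, "Pb": 5, "Ni": 5, "Cr": 2, "Zn": 1}

def potential_ecological_risk(detected, reference, toxicity=TOXICITY):
    """Return (per-pollutant risk indexes, their sum) for one sampling point."""
    per_pollutant = {}
    for name, c_s in detected.items():
        c_f = c_s / reference[name]              # pollution index of this pollutant
        per_pollutant[name] = toxicity[name] * c_f
    return per_pollutant, sum(per_pollutant.values())

# Hypothetical detected and reference concentrations in mg/kg.
detected = {"Cd": 0.2, "Pb": 40.0, "Zn": 100.0}
reference = {"Cd": 0.1, "Pb": 20.0, "Zn": 50.0}

e_r, total_risk = potential_ecological_risk(detected, reference)
```

In this example every pollutant is at twice its reference value, yet cadmium dominates the total because of its much larger toxicity coefficient.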

X 14 Human Health Risk Index
The original value of the human health risk index is 1.5 × 10⁻⁵, of which the cancer risk from organic pollutants is below 10⁻⁶ and therefore small. In agricultural land, the average cancer risk of lead is 3.4 × 10⁻⁵, which is greater than the acceptable cancer risk for human health (10⁻⁶). Overall, because the exposure time on industrial and agricultural land is much shorter than in residential areas, the total cancer risk in our study is within an acceptable range. However, the cancer risk of lead has reached the alert value.

Discussion
The Hebei case study indicates that the evaluation model proposed in this paper is workable and that it could be applied to other provinces or cities. The following actions can be taken by the Hebei government to improve the soil environmental quality: (1) Improve farming patterns and promote organic fertilizers. The pollution of agricultural land in Hebei is mainly caused by unsound agricultural production practices. Rational selection of cultivated crops, selection of pollution-resistant crops, development of cash crops, and reduced use of chemical fertilizers and pesticides can prevent the continued pollution of agricultural land. (2) Develop green industry and improve resource utilization. The government should take measures to limit the entry of enterprises with high energy consumption, high water consumption, and heavy pollution. It should continue to close down inefficient, heavily polluting enterprises that fail to rectify their problems, promote the optimization of the industrial structure, and thereby achieve fundamental pollution reduction. (3) Increase investment in environmental protection. The government can cooperate with the Environmental Protection Bureau, the National Development and Reform Commission, and other departments to actively report environmental management projects to higher levels and seek state investment in Hebei's environmental protection.

Conclusions
This study proposes a soil environment quality assessment method based on the DPSIR model. Eighteen evaluation indicators are selected, a hierarchical analysis model is constructed and the weight of each indicator is calculated based on the AHP method, and the values of each indicator are standardized and graded. The soil environment quality is calculated by the weighted average of these indicators. This method is applied in evaluating the soil quality of agricultural and industrial lands of Hebei province, China. The results show that the soil environmental quality of Hebei's agricultural land is in a medium state and that the industrial land is approaching the alert state. The pressure of soil pollution mainly comes from the discharge of industrial pollutants and the application of pesticides and fertilizers. Specifically, the pesticide application intensity of Hebei province has reached a limit value and Hebei's environmental protection investment level is extremely low, which must be paid attention to.
Regarding the research method, the index selection of the DPSIR model adopted in this paper can be further optimized. For example, an ecological environment index can be added to the impact module. Moreover, this paper only evaluates the soil environmental quality of agricultural and industrial areas in Hebei, because the pressure and driving force indicators in the DPSIR model relate mainly to industrial and agricultural production and have little bearing on residential areas, roads, and other areas. In future research, a more comprehensive evaluation model can be designed to cover the diverse soil use types in the urban soil environment, and an in-depth analysis of the factors that affect the soil quality of Hebei province can be performed.