Haze Influencing Factors: A Data Envelopment Analysis Approach

This paper investigates the meteorological factors and human activities that influence PM2.5 pollution by employing the data envelopment analysis (DEA) approach to a chance constrained stochastic optimization problem. This approach has the two advantages of admitting random input and output, and allowing the evaluation unit to exceed the front edge under the given probability constraint. Furthermore, by utilizing the meteorological observation data incorporated with the economic and social data for Jiangsu Province, the chance constrained stochastic DEA model was solved to explore the relationship between the meteorological elements and human activities and PM2.5 pollution. The results are summarized by the following: (1) Among all five primary indexes, social progress, energy use and transportation are the most significant for PM2.5 pollution. (2) Among our selected 14 secondary indexes, coal consumption, population density and civil car ownership account for a major portion of PM2.5 pollution. (3) Human activities are the main factor producing PM2.5 pollution. While some meteorological elements generate PM2.5 pollution, some act as influencing factors on the migration of PM2.5 pollution. These findings can provide a reference for the government to formulate appropriate policies to reduce PM2.5 emissions and for the communities to develop effective strategies to eliminate PM2.5 pollution.


Introduction
Recognizing the key influencing factors of haze is critically important for haze control. The purpose of this study is to explore contributing factors related to human activities or meteorological events.
Haze not only pollutes the ecological environment, but also seriously affects the physical and mental health of humans. Low visibility in haze weather could easily lead to traffic accidents, endangering human life and property [1]. The particulate pollutants in haze can cause great harm to humans and the toxic effect is mainly ascribed to PM 2.5 , defined as fine particulate matter with a diameter smaller than 2.5 µm [2]. Due to their size, PM 2.5 particles can penetrate into the respiratory tract and reach the alveoli [3], and can even enter the systemic circulation and affect other organs [4]. PM 2.5 may also directly affect the cardiovascular system by entering into the systemic circulation and causing cardiovascular dysfunction [5]. They can also cause depression and other mental illnesses [6]. As PM 2.5 is the main component of haze and causes great harm to humans, the key to control haze pollution is to decrease PM 2.5 pollution. Therefore, this paper focuses on investigating the influencing factors of PM 2.5 pollution.
In recent years, the haze condition in China has become so significant that the problem has triggered a warning to the government, the communities, and all citizens. The annual average concentration of PM 2.5 from 2013 to 2016 is as high as 57.75 µg/m 3 in China [7], which is over six times higher than that of the United States (8.47 µg/m 3 [8]) or 5 times higher than that of the WHO standard (10 µg/m 3 ).

Human Activities
Many scholars have conducted various studies on the causes and mechanisms of haze, hoping to provide scientific advice for the government to prevent and control haze [9][10][11]. It is known that human activities, such as industrial pollution and fuel combustion, automobile exhaust, construction dust, road dust and industrial dust, contribute substantially to haze pollution. Taking Beijing as an example, in 2017, vehicle exhaust, coal combustion, industrial production and dust emission were the main sources of PM 2.5 , accounting for 31.1%, 22.4%, 18.1% and 14.3%, respectively [12].
Direct emissions mainly come from various types of combustion, including fossil fuel (coal, oil, natural gas) combustion and biomass burning (straw, firewood). Coal accounts for a large proportion of China's energy consumption structure. In 2017, coal consumption accounted for 60.4% of total energy consumption [13].
Haze pollution caused by coal combustion in China can be divided into two types by scale: one is large-scale high-density coal combustion such as power plants and iron and steel enterprises. You and Xu [14] along with Tang et al. [15] found that coal combustion is the main source of particulate pollution in China.
The other is small-scale coal combustion, such as household coal combustion, which is widely distributed and easily overlooked. Cheng et al. [16] pointed out that the actual coal consumption by residents is 4.6 times that of the amount reported in the statistical yearbook for that year, and hence household coal combustion should not be ignored. The studies by Xue et al. [17] and Zhang et al. [18] indicated that PM 2.5 emissions from the residents of the Beijing-Tianjin-Hebei region accounted for nearly 50% of the total PM 2.5 emissions and is exhibiting an upward trend.
Pollutants that can be turned to PM 2.5 pollution include nitrogen oxides, sulfur dioxide and volatile organic compounds. Vehicle exhaust is a main source of this kind of PM 2.5 pollution. Ming et al. [19], Gao et al. [20] and Yang et al. [21] pointed out that the increase in the number of cars has caused a significant increase in haze pollution. Gao et al. [22] assessed the health damage caused by vehicle exhaust. Goel and Kumar [23] studied the exposure to vehicle exhaust under different traffic conditions, and their findings show that PM 2.5 pollution caused by vehicle exhaust when traffic jams is more serious than when traffic flow is smooth.
Construction dust, road dust and industrial dust are also important sources of PM 2.5 . Kinsey et al. [24] and Hassan et al. [25] found that the dissipation of dust during construction is one major source of PM emissions. Zhao et al. [26] pointed out that high-intensity economic activities and large-scale infrastructure construction are reasons for the increase in PM 2.5 concentration in China. Lin and Zhu [27] and Wu et al. [28] found that the concentration of PM 2.5 is closely related to the urbanization process in China. Zhou et al. [29] showed that there was a significant positive relation between the emission of industrial dust and the concentration of PM 2.5 .

Meteorological Factors
Some scholars suggest that meteorological events are natural causes of haze formation. Studies found that meteorological factors related to haze include relative humidity, wind speed, and precipitation. The studies by Fu et al. [30], Chambers et al. [31], Yang et al. [32] and Li et al. [33] indicated that relative humidity and wind speed have a great relationship to haze pollution. Guo et al. [34] found that the washing effect of precipitation can significantly reduce haze pollution.
While meteorological factors are natural causes or objective factors for the formation of haze, human activities account for a significant part of the pollution, especially for PM 2.5 pollution. PM 2.5 pollution conditions are the combined effects of meteorological and human factors. However, in the past, only a few studies have considered the combined effects of meteorological conditions and human activities on haze formation.

Our Approach
Jiangsu is an important industrial province in China, with a GDP (Gross Domestic Product) of 8.59 trillion RMB in 2017, which is about 10.03% of China's total GDP or the national GDP of Russia (8.51 trillion RMB) [35]. With the fast development of industrialization and urbanization, people there are also facing heavy pollutant emissions, and haze and fog have become a serious concern for the residents and a burden on the public. Under China's new standard economic system, people urgently demand air quality improvement. As the key task of ecological environment governance, PM 2.5 pollution prevention and treatment have become a critical task to win "the battle for blue skies". Differentiating the impact of different factors on PM 2.5 pollution in Jiangsu Province can provide more precise information for the government and communities to design suitable policies and care strategies.
Meteorological factors and human reasons will be taken as the influencing factors of PM 2.5 pollution (Section 3). Through related literature analyses, human factors are divided into four major categories: (1) Social progress: including the impact of urbanization, housing construction and demolition, and population growth. (2) Transportation: vehicle exhaust emissions. (3) Energy use: fossil fuel combustion. (4) Environmental protection: human efforts to reduce pollution. Because the social and economic data are described by years, but the meteorological data are provided by units of hours or days, we face the problem of the time scale not being uniform when the relevant data is processed.
DEA is a linear programming methodology to evaluate the efficiency of multiple decision-making units (DMUs) when the production process presents a structure of multiple inputs and outputs [36]. This method is not required to determine the explicit expression of the relationship between input and output variables, which eliminates many subjective factors, and hence, has strong objectivity. Therefore, the DEA method is widely used in efficiency evaluation [37]. However, traditional DEA does not consider the random errors of input and output variables, such as measurement error and data noise [38,39], which are unavoidable in data collection [40].
In this paper, the stochastic data envelopment analysis model is applied to resolve this problem. When compared with traditional DEA, chance constrained stochastic DEA has two advantages: admitting random input and output, and allowing the evaluation unit to exceed the front edge under the given probability constraint [41,42]. We study the coupling effect of multiple factors in the effort to identify the influencing factors of PM 2.5 pollution in Jiangsu province.
The remainder of the article is organized as follows: Section 2 presents the traditional DEA method and chance constrained stochastic data envelopment analysis method. Section 3 introduces (1) the sources of meteorological data, as well as economic and social data, for the 13 prefecture-level cities in Jiangsu Province, (2) the construction of primary and secondary level indicators affecting PM 2.5 pollution are provided, (3) three sets of experiments are designed to identify key factors of PM 2.5 pollution based upon different levels of indicators. Section 4 contains an empirical analysis. Through corresponding analyses and comparisons of the results from the three sets of experiments, the key factors affecting PM 2.5 pollution are analyzed, and the comprehensive effects of various factors for PM 2.5 pollution at different risk levels are evaluated. Section 5 is the summary and conclusion.

Model and Method
Using the chance constrained stochastic DEA method, different experiments were carried out using different levels of indexes. If the result changed substantially after a certain index was deleted, the deleted index had a strong relation with PM 2.5 . Therefore, the following three experiments were designed: All 14 secondary indexes were used as input indicators, GDP and PM 2.5 as output indicators, and chance constrained stochastic DEA was used for the calculation. II.
Based on the settings of experiment I, a secondary index was deleted and the remaining 13 input indicators and 2 output indicators were used for the calculation. The purpose of this experiment was to identify which secondary input index has the greatest impact on output indicators. III. As indicators may interact with each other, considering the interrelatedness and mutual influence between indicators, and treating the same type of indexes as one group, we obtained five primary indexes. Based on the settings of experiment I, a primary index was deleted and the remaining four input indicators and two output indicators were used for the calculation. The purpose of this experiment was to identify which primary index has the greatest impact on output indicators.
Data envelopment analysis (DEA) was proposed by the famous operational research scholars, Charnes, Copper and Rhodes, in 1978 [43], and is a non-parametric evaluation method that is relatively efficient when compared to the regression analysis method and least squares method. It is suitable for the evaluation of several decision units with the same type of multiple inputs and outputs. This method does not require determination of the explicit expression of the relationship between input and output variables, which eliminates many subjective factors, and hence, has increased objectivity. Therefore, the DEA method is widely applied in efficiency evaluation.
In this paper, we employ the BCC model: a model for evaluating technical efficiency, named after proposers Banker, Charnes and Cooper in 1984 [44], which is the following optimization problem: Objective function: Constraints: In (2), the input variable of a decision making unit (DMU) in an economic (production) activity is The output variable is Y = (y 1 , · · · , y r , · · · , y s ), where y r is the r-th output; X j is the input vector of the j-th DMU, Y j is the output vector of the j-th DMU; X o and Y o are the corresponding indices of the evaluation DMU; S − is the slack variable of the input variable; S + is the slack variable of the output variable; e 1 T is the unit vector in the m-th dimension; e 2 T is the unit vector in the s-th dimension; ε is the Archimedes infinitesimal; λ j is the nonnegative weight of the j-th DMU; Thus, the entire production activity of the DMU can be represented by (X, Y), and the input set of n DMUs can form an n × m matrix, while the output set can form an n × s order output matrix. When the optimum output is 1, i.e., θ = 1, the j-th DMU is technically effective. When θ < 1, the j-th DMU is not technically effective.
When dealing with practical problems, input and output variables are often disturbed by random factors such as measurement error and data noise. Because of the random nature of the phenomena and the laws themselves, the input and output variables of DMU may not be accurately determined, and these variables are subject to certain random distribution. There will be great errors when using the traditional DEA model to deal with such problems. In addition, the traditional DEA model has some difficulty in dealing with data with different time scales. Therefore, the chance constrained stochastic DEA method becomes important in practice and can be used to calculate data with different time scales [45].
The chance constrained stochastic DEA model based on the BCC output orientation is the following [46]: Objective function: Constraints: Pr(.) represents the probability; x ij is the input variables of DMU j ; y rj is the output variable of DMU j ; In our study, n = 13, representing 13 prefecture-level cities in Jiangsu province. We have m different input variables x ij , with three different schemes: (1) m = 14, representing the 14 secondary indexes used in Experiment I; (2) m = 13, representing the 13 secondary indexes used in Experiment II; (3) m = 5, representing the 5 primary indexes used in Experiment III.
For output variables y rj , s = 2, representing the 2 output variables. As Model (2) is an output-oriented model, the reciprocal of the optimal solution of the target value obtained from Model (4) is the stochastic chance constraint efficiency of the evaluated DMU. Model (4) can be rewritten as (without the slack variable): Objective function: Constraints: Φ −1 (α) is the value of the inverse distribution function of the standard normal distribution function at α; is the standard deviation of the linear combination distribution of all i-th input variables; σ I ij and σ o rj are the standard deviations of x ij and y rj , respectively; x ij and y rj are corresponding mean values of the DMU j input and output. For the unit to be evaluated DMUo, θ o in the objective function is the target optimal solution, o ∈ { 1, 2, . . . n} . Let its stochastic efficiency be ϕ o , then it holds that ϕ o = 1 θ .

Data Sources and Indexes Explanation
In this section, we describe the data collection and data characteristics, as well as indexes, we have constructed.

Data Sources
The meteorological data comes from the China Meteorological Science Data Sharing Service Network (CMSDSSN) ranging from 2013 to 2016 [47]. The CMSDSSN is a meteorological data sharing network system, and a service platform for all domestic and global users.

Indexes Explanation
In this section, five indexes; meteorological conditions, social progress, transportation, energy use and environmental protection, were constructed as primary indexes. Each primary index contained one or more secondary indexes. The setting of the indexes is shown in Table 1. The following are further explanations for the five primary indexes: Weather conditions refers to different meteorological elements, including six secondary indexes: wind speed, precipitation, temperature, atmospheric pressure, sunshine hours and relative humidity.
Social progress includes the progress and development of material civilization. It is mainly reflected in the progress of material production methods and economic life. It is the achievement of humans in reforming and conquering nature, including improvement of the living standard, improvement of food and clothing, change of life style and many other things. Based on the existing data, three secondary indexes were selected: urbanization rate, population density, and building Air pollution data comes from the daily monitoring data of atmospheric pollutants from the Environmental Protection Department of Jiangsu ranging from 2013 to 2016. The Jiangsu Environmental Protection Department is responsible for establishing and improving the basic system of environmental protection, monitoring the emission of pollutants.

Indexes Explanation
In this section, five indexes; meteorological conditions, social progress, transportation, energy use and environmental protection, were constructed as primary indexes. Each primary index contained one or more secondary indexes. The setting of the indexes is shown in Table 1. The following are further explanations for the five primary indexes: Weather conditions refers to different meteorological elements, including six secondary indexes: wind speed, precipitation, temperature, atmospheric pressure, sunshine hours and relative humidity.
Social progress includes the progress and development of material civilization. It is mainly reflected in the progress of material production methods and economic life. It is the achievement of humans in reforming and conquering nature, including improvement of the living standard, improvement of food and clothing, change of life style and many other things. Based on the existing data, three secondary indexes were selected: urbanization rate, population density, and building construction area.
Transportation: Automobile exhaust emissions are one of the major sources of PM 2.5 and a significant part of air pollutants. While the automobile exhaust monitoring data is not available, two relative indexes from the statistical yearbook were selected: civil car ownership and number of public transportation vehicles in operation.
Energy use: Coal accounts for the majority of all energy use in China, and this particular energy structure makes pollution by coal burning especially serious. It is necessary to study the impact of coal consumption and energy efficiency on air pollution. Two secondary indexes were selected: per 10,000 yuan gross output value of industry energy consumption and total coal consumption.
Environmental protection: The Chinese central government has adopted various policies to reduce PM 2.5 pollution and damage. Considering the efforts made by people in order to reduce PM 2.5 pollution, this index was set up. One secondary index was selected: green coverage rate of built-up areas.
Output indexes were divided into two categories: one is the desirable output and the other is the undesirable output. The desirable output is the output that is beneficial to the overall goal and meets expectations, and the undesirable output is a byproduct of the beneficial output.
For economic applications in our setting, larger values mean better economic performance. At the same time, the pollutant discharge should be as low as possible. Therefore, we choose GDP as the desirable output and PM 2.5 as the undesirable output.
Because the weather data and PM 2.5 data are collected daily, and the social and economic data from the statistical yearbook is annual data, they need to be pre-processed before the calculation based on the requirements of chance constrained stochastic DEA. For weather data and PM 2.5 data, the annual mean and variance of data were calculated. For social and economic data, the variance of a sequence of 4-year data was calculated. When the calculation was with the chance constrained stochastic DEA, the variance was the same for 4 years. The percentage of energy consumption by enterprises to the total industrial output value.

Total Coal Consumption
The amount of coal consumed converted into the amount of standard coal.

Environmental Protection Green Coverage Rate of Built-up Area
The percentage of green coverage area to built-up area.

Output
Expected Output Gross Regional Product The market value of all final goods and services produced in a region. Undesirable Output PM 2.5 Concentration of particles in air with diameters less than or equal to 2.5 microns.

Result Analysis
In this section, we carried out the three designed sets of experiments, analyzed and compared the results from them.

Chance Constrained Stochastic DEA with All 14 Secondary Indexes as Input Indicators Subsection
Experiment I was carried out with all 14 secondary indexes as input indicators. In Formula (3), α is the risk level. In this section, we attempt to analyze the difference in the results obtained by selecting different risk levels α, and compare the results obtained in different years with the same risk level α. We took the risk levels α = 0.05, α = 0.1, α = 0.2, α = 0.5, α = 0.8, α = 0.9, and α = 0.95 for the calculation. By using the chance constrained stochastic DEA model, we calculated the target value and stochastic efficiency for the thirteen cities in Jiangsu. The results of all 4 years are shown in Figure 2.

Result Analysis
In this section, we carried out the three designed sets of experiments, analyzed and compared the results from them.

Chance Constrained Stochastic DEA with all 14 Secondary Indexes as Input Indicators Subsection
Experiment I was carried out with all 14 secondary indexes as input indicators. In Formula (3), α is the risk level. In this section, we attempt to analyze the difference in the results obtained by selecting different risk levels α, and compare the results obtained in different years with the same risk level α. We took the risk levels α = 0.05, α = 0.1, α = 0.2, α = 0.5, α = 0.8, α = 0.9, and α = 0.95 for the calculation. By using the chance constrained stochastic DEA model, we calculated the target value and stochastic efficiency for the thirteen cities in Jiangsu. The results of all 4 years are shown in Figure  2. For all 4 years, when the risk level was less than or equal to 0.5, the random efficiency of all cities was 1. When the risk level was greater than 0.5, most cities with changed stochastic efficiency exhibited a decrease in stochastic efficiency with increased risk level. Table 2 below shows summary results of stochastic efficiency for all 13 cities and the years 2013-2016, with the same risk level in different years. Since the results obtained with different risk levels were highly similar, we used risk level α = 0.95 as an example to illustrate. For all 4 years, when the risk level was less than or equal to 0.5, the random efficiency of all cities was 1. When the risk level was greater than 0.5, most cities with changed stochastic efficiency exhibited a decrease in stochastic efficiency with increased risk level. Table 2 below shows summary results of stochastic efficiency for all 13 cities and the years 2013-2016, with the same risk level in different years. Since the results obtained with different risk levels were highly similar, we used risk level α = 0.95 as an example to illustrate.
From Table 2

Chance Constrained Stochastic DEA with 1 Secondary Index Deleted Each Time
There were 14 input indicators in Experiment I, however, the result of Experiment I cannot distinguish which indicator has the greatest impact on the stochastic efficiency, as seen in Experiment I where all 14 secondary indexes were used. Therefore, experiment II was designed. By deleting 1 secondary index each time, we tried to find which secondary index has the greatest impact on the output indicators.
From Figure 3, it can be seen that for all 4 years, when the risk level was α = 0.95, the change in stochastic efficiency caused by deleting the total coal consumption was the largest, accounting for 23.81% of the changes in stochastic efficiency. The second largest was population density, accounting for 22.62% of the changes in stochastic efficiency. The third largest was civil car ownership, accounting for 15.48% of the changes in stochastic efficiency. The per 10,000 yuan gross output value of industry energy consumption was the fourth, accounting for 9.52% of the changes in stochastic efficiency.
For all 4 years and at a risk level of α = 0.95, when the secondary indexes; wind speed, precipitation, air temperature and atmospheric pressure were deleted, the stochastic efficiency did not change, indicating that these four secondary indexes had no significant effect on the regional GDP and PM 2.5 .
For all secondary indexes that lead to stochastic efficiency changes, removal of human factors accounted for 98.81% of the total changes in stochastic efficiency. Removal of natural causes (sunshine hours) accounted for only 1.19% of the total change. This clearly indicates that human activities play a leading role in PM 2.5 pollution.
For 2013 alone, secondary indexes that changed the stochastic efficiency were (in order from high to low): total coal consumption, population density, civil car ownership and per 10,000 yuan gross output value of industry energy consumption. This order remained unchanged in the subsequent years 2014-2016.
By comparing the stochastic efficiency between the deleted and non-deleted secondary index for all 14 secondary indexes, and observing whether the stochastic efficiency changed or not, we could tell which secondary index had the greatest impact on the stochastic efficiency. Therefore, when dealing with PM 2.5 pollution, corresponding policies and measures can be introduced. For all 4 years and at a risk level of α = 0.95, when the secondary indexes; wind speed, precipitation, air temperature and atmospheric pressure were deleted, the stochastic efficiency did not change, indicating that these four secondary indexes had no significant effect on the regional GDP and PM2.5.
For all secondary indexes that lead to stochastic efficiency changes, removal of human factors accounted for 98.81% of the total changes in stochastic efficiency. Removal of natural causes (sunshine hours) accounted for only 1.19% of the total change. This clearly indicates that human activities play a leading role in PM2.5 pollution.
For 2013 alone, secondary indexes that changed the stochastic efficiency were (in order from high to low): total coal consumption, population density, civil car ownership and per 10,000 yuan gross output value of industry energy consumption. This order remained unchanged in the subsequent years 2014-2016.
By comparing the stochastic efficiency between the deleted and non-deleted secondary index for all 14 secondary indexes, and observing whether the stochastic efficiency changed or not, we could tell which secondary index had the greatest impact on the stochastic efficiency. Therefore, when dealing with PM2.5 pollution, corresponding policies and measures can be introduced.

Chance Constrained Stochastic DEA with 1 Primary Index Deleted Each Time
As indicators may interact with each other, considering the interrelatedness and mutual influence between indicators, treating the same type of indexes as one group, we get five primary indexes. It is of great significance to examine which primary index has the largest impact on output indicators. Unlike experiment II, experiment III deleted one primary index at a time.

Chance Constrained Stochastic DEA with 1 Primary Index Deleted Each Time
As indicators may interact with each other, considering the interrelatedness and mutual influence between indicators, treating the same type of indexes as one group, we get five primary indexes. It is of great significance to examine which primary index has the largest impact on output indicators. Unlike experiment II, experiment III deleted one primary index at a time.
The risk levels α = 0.05, α = 0.1, α = 0.2, α = 0.5, α = 0.8, α = 0.9, and α = 0.95 were used for the calculation. Four primary indexes were used as input indicators after removing the remaining one primary index. The chance constrained stochastic DEA model was used to calculate the target value and stochastic efficiency of 13 cities in Jiangsu. For α = 0.05, α = 0.1, α = 0.2, and α = 0.5, both the target value and stochastic efficiency were 1, therefore, these results were omitted. We used α = 0.95 as an example to illustrate as we did in Figure 3.
From Figure 4, it can be seen that for all 4 years, when the risk level was α = 0.95, the change in stochastic efficiency caused by deleting the social progress index was the largest, accounting for 42.86% of the changes in stochastic efficiency. The second largest was the energy use index, accounting for 30.00% of the changes in stochastic efficiency. The third largest was the transportation index, accounting for 20.00% of the changes in stochastic efficiency.
For all primary indexes that changed stochastic efficiency, the removal of the weather condition index accounted for only 1.43% of the total change in stochastic efficiency. Removal of the human factor index accounted for 98.57% of the total change in stochastic efficiency, meaning that human activities are more associated with PM 2.5 pollution. Hence, human activities are the main reason for PM 2.5 pollution.
For 2013 alone, primary indexes that changed the stochastic efficiency were (in order from high to low): social progress, energy use and transportation, which remained unchanged in the subsequent years from 2014 to 2016. Now we summarize results from all three experiments: (1) For the same year, the stochastic efficiency decreased with an increased risk level. For the same risk level, the stochastic efficiency increased with later years. (2) By deleting one secondary index at a time, we determined the secondary index (total coal consumption) that had the greatest impact on output indicators. The second largest was population density, the third largest was civil car ownership, and per 10,000 yuan gross output value of industry energy consumption was the fourth. Removal of human factors accounted for 97.94% of the total changes in stochastic efficiency. Removal of natural causes (wind speed and sunshine hours) accounted for only 2.06% of the total change. While some meteorological elements such as wind speed and temperature do have a large impact on PM 2.5 , we used annual data in this study and did not show this. Daily or hourly data will be more likely to show the importance of meteorological elements. (3) By deleting one primary index at a time, we found the primary index (social progress) that had the greatest impact on output indicators. The second largest was energy use, and the third largest was transportation. Removal of the human factor index accounted for 97.51% of the total changes in stochastic efficiency while removal of the weather condition index accounted for only 2.49% of the total change. (4) Human activities are deeply associated with PM 2.5 pollution and the impact of meteorological factors is far less than that caused by human activities. value and stochastic efficiency were 1, therefore, these results were omitted. We used α = 0.95 as an example to illustrate as we did in Figure 3. From Figure 4, it can be seen that for all 4 years, when the risk level was α = 0.95, the change in stochastic efficiency caused by deleting the social progress index was the largest, accounting for 42.86% of the changes in stochastic efficiency. The second largest was the energy use index, accounting for 30.00% of the changes in stochastic efficiency. The third largest was the transportation index, accounting for 20.00% of the changes in stochastic efficiency. For all primary indexes that changed stochastic efficiency, the removal of the weather condition index accounted for only 1.43% of the total change in stochastic efficiency. Removal of the human factor index accounted for 98.57% of the total change in stochastic efficiency, meaning that human activities are more associated with PM2.5 pollution. Hence, human activities are the main reason for PM2.5 pollution.
For 2013 alone, primary indexes that changed the stochastic efficiency were (in order from high to low): social progress, energy use and transportation, which remained unchanged in the subsequent years from 2014 to 2016. Now we summarize results from all three experiments: (1) For the same year, the stochastic efficiency decreased with an increased risk level. For the same risk level, the stochastic efficiency increased with later years. (2) By deleting one secondary index at a time, we determined the secondary index (total coal consumption) that had the greatest impact on output indicators. The second largest was population density, the third largest was civil car ownership, and per 10,000 yuan gross output value of industry energy consumption was the fourth. Removal of human factors accounted for 97.94% of the total changes in stochastic efficiency. Removal of natural causes (wind speed and sunshine hours) accounted for only 2.06% of the total change. While some meteorological elements such as wind speed and temperature do have a large impact on PM2.5, we used annual

Discussion
One of the main drawbacks of the traditional DEA method is that it cannot handle measurement errors. Traditional DEA is very sensitive to data errors because efficient units determine the efficiency [49]. Therefore, it is very important to incorporate random variations into DEA analysis. Compared with traditional DEA, chance constrained stochastic DEA has the following two advantages: it admits random input and output, and allows the evaluation unit to exceed the front edge under the given probability constraint.
In developed countries, people have experienced severe haze pollution during the industrialization era, e.g., the Los Angeles photochemical smog incident [50], the Donora smoke event [51], the great smog of London [52] and the Meuse valley fog [53]. These pollution incidents caused huge damage to people's health and personal property. These pollution incidents were controlled through targeted measures and persistent governance [54]. In China, the Central Government took various measures to lower the level of haze pollution and reduce the damage caused by haze. On 29 February 2012, the government issued a newly revised "ambient air quality standard" that established a standard to dynamically monitor the haze pollution and officially release timely real time air quality indexes. For the first time, PM 2.5 was added to the national environmental quality standard. Since then, China's PM 2.5 pollution has been significantly reduced. The annual average concentration of PM 2.5 decreased from 72 µg/m 3 in 2013 to 47 µg/m 3 in 2016.

Conclusions
In this paper, we applied the chance constrained stochastic DEA to explore the relationship between PM 2.5 pollution and meteorological elements, along with human activities, by analyzing the meteorological data, daily air pollutants monitoring data, and social and economic data derived from the Jiangsu Statistical Yearbook and the Statistical Yearbook of 13 prefectural cities in Jiangsu Province. The results showed that chance constrained stochastic DEA can clearly identify that human activities are the main cause of PM 2.5 pollution in Jiangsu Province. Chance constrained stochastic DEA is rarely used to assess PM 2.5 pollution. Our research results show that this method is very suitable for evaluating the PM 2.5 pollution problem.