Correlation Analysis of PM10 and the Incidence of Lung Cancer in Nanchang, China

Air pollution and lung cancer are closely related. In 2013, the World Health Organization listed outdoor air pollution as carcinogenic and regarded it as the most widespread carcinogen that humans are currently exposed to. Here, grey correlation and data envelopment analysis methods are used to determine the pollution factors causing lung cancer among residents in Nanchang, China, and identify population segments which are more susceptible to air pollution. This study shows that particulate matter with particle sizes below 10 micron (PM10) is most closely related to the incidence of lung cancer among air pollution factors including annual mean concentrations of SO2, NO2, PM10, annual haze days, and annual mean Air Pollution Index/Air Quality Index (API/AQI). Air pollution has a greater impact on urban inhabitants as compared to rural inhabitants. When gender differences are considered, women are more likely to develop lung cancer due to air pollution. Smokers are more likely to suffer from lung cancer. These results provide a reference for the government to formulate policies to reduce air pollutant emissions and strengthen anti-smoking measures.


Introduction
Air pollution has a serious impact on human health, in particular on the incidence of lung cancer, which is the most common cancer in the world. In 2013, the World Health Organization (WHO) classified outdoor air pollution as carcinogenic to humans (International Agency for Research on Cancer, Group 1) [1]. Residents exposed to air pollution show higher frequencies of cell mutations, DNA damage and other chromosomal aberrations, which increase the incidence of lung cancer; polycyclic aromatic hydrocarbons (PAHs) are the most important pathogenic factor causing these lesions [2,3]. Exposure to PM10, SO2, and NO2 significantly increases the risk of lung cancer [4]. As a developing country, China has the world's most serious burden of lung cancer [5]; in 2015, the incidence rate of lung cancer was 50.9 per 100,000 for men and 22.4 per 100,000 for women [6]. Therefore, knowing which air pollutant is most closely related to lung cancer is of great significance.
Scholars have studied the relationship between components of air pollution and lung cancer in many regions of the world. For example, Raaschou-Nielsen et al. [7] found that between 1985 and 2005, sulfur and nickel pollutants had the most significant effect on lung cancer in Europe. Furthermore, Li et al. [8] found that in Hong Kong, automobile exhaust and trace metals accounted for most of the PM10 pollution and caused lung cancer. Finally, Guo et al. [9] determined that in China, PM2.5 and O3 both had a statistically significant relationship with the incidence of lung cancer.
Many studies have explored the potential health risks of PM10 among residents in China. An et al. [10] found that in 2009, most of China's residents were exposed to air with an average annual PM10 concentration of 40-100 µg/m3. However, the safe limit for PM10 concentration suggested by WHO is 20 µg/m3. Shen et al. [11] found that the majority of the population in Henan Province, China were exposed to toxic air, while Zhang et al. [12] determined that in Beijing, with more city buildings and denser roads, the health risk to local residents increased as the concentration of PM10 increased.
In recent years, rapid economic development has resulted in deleterious health impacts for the residents of many Asian countries. Many of these health impacts have been attributed to elevated concentrations of PM10, and hence there have been several studies focusing on PM10-induced mortality. Chen et al. [13] found that as the PM10 concentration in northern China increased by 10 mg/m3, lung cancer mortality increased by 3.4% to 13.6%. Furthermore, Chen et al. [14] found that between 2010 and 2014, increased concentrations of PM10 in Chengdu resulted in increased mortality. Maleki et al. [15] found that between 2009 and 2014, high concentrations of PM10 caused 3777 deaths in Ahvaz, Iran. Ho et al. [16] determined that during 2012, high concentrations of PM10 in Ho Chi Minh City, Vietnam killed 204 residents. Other scholars have focused on the negative effects of increased PM10 concentrations on the economy. Hou et al. [17] determined that in 2009, China's health-related economic losses due to PM10 reached US$106.5 billion, approximately equal to 2.1% of China's Gross Domestic Product (GDP) in that year. Hou et al. [18] also found that from 2008 to 2012, the average annual economic loss associated with PM10 pollution was US$9.5 billion.
Developed nations such as the United States and those in Europe have already experienced the industrialization process and suffered from the severe air pollution it brought about. Consequently, these regions fully recognize the importance of improving air quality and have adopted various measures to reduce pollutant emissions in order to improve the health of the population. As a result of this process, new industries have developed which have helped to boost economic growth. Keuken et al. [19] found that in Rotterdam, The Netherlands, by applying stringent emission standards to prevent high emitters near residential areas, the concentration of PM10 was reduced by 18 µg/m3 during the past 24 years, and the average life expectancy of the residents increased by 13 months over the same period. Carugno et al. [20] found that between 2003 and 2014, after the implementation of air quality control policies such as the renewal of vehicle exhaust systems and engines and incentives to purchase hybrid and electric vehicles, the decrease in PM10 concentrations in Lombardy, Italy resulted in a reduction in the number of deaths caused by PM10: approximately 343 people died during the first 5 years of the study, the number was reduced to 253 between 2007 and 2010, and between 2011 and 2014 the death toll was reduced to 208. Research conducted by Castro et al. [21] showed that, through national regulations such as fuel requirements and maximum permitted pollution levels, the decrease in PM10 concentrations had a positive effect on economic growth in the Agglomeration of Lausanne-Morges, Switzerland, during 2005 to 2015; the health-related impacts of the reduction in PM10 concentration were monetarized at approximately CHF 36 million annually.
Many scholars have conducted air pollution and lung cancer studies in China, but there are few studies related to the air pollution and population health in the middle reaches of the Yangtze River. The aim of this paper was to understand the relationship between the incidence of lung cancer and air pollution in Nanchang, China and explore the characteristics of lung cancer incidence in different population demographics such as residence, gender and smoking history. Nanchang, the capital city of Jiangxi Province, is one of the central cities of the middle reaches of the Yangtze River and the core city of Poyang Lake eco economic zone. Automobile exhaust, soil dust, coal combustion, building dust and metallurgical dust are the major sources of air pollution in Nanchang. The incidence rate of newly diagnosed lung cancer cases was 55.9 per 100,000 for men and 24.8 per 100,000 for women in 2013 [22]. It is of great practical significance to study the association between air pollution and the incidence of lung cancer, to give scientific countermeasures and suggestions for the protection of public health.

Air Pollution Index and Air Quality Index
Based on the Environmental Protection Laws of the People's Republic of China (Standing Committee of the National People's Congress, 2014) [23] and the Law of the People's Republic of China on the Prevention and Control of Atmospheric Pollution (Standing Committee of the National People's Congress, 2015) [24], China has formulated the Ambient Air Quality Standard. Based on this standard, the air pollution index (API) is suitable for studying short-term air quality trends. API condenses the concentrations of several routinely monitored air pollutants into a single index value, which reveals both the air quality status and the air pollution situation.
Prior to 2012, China's air pollution index included SO2, NO2 and PM10. When the concentration of a pollutant satisfies $C_{i,j} \le C_i \le C_{i,j+1}$, its pollution index is:
$$I_i = \frac{C_i - C_{i,j}}{C_{i,j+1} - C_{i,j}} \left(I_{i,j+1} - I_{i,j}\right) + I_{i,j} \tag{1}$$
where $I_i$ is the pollution index of the i-th pollutant, $C_i$ is the concentration of the i-th pollutant, $I_{i,j}$ is the pollution index value of the i-th pollutant at turning point j, $C_{i,j}$ is the concentration of the i-th pollutant (for $I_{i,j}$) at turning point j, and $C_{i,j+1}$ is the concentration of the i-th pollutant (for $I_{i,j+1}$) at turning point j + 1. The air pollution index, API, is the largest of all the pollution indices:
$$\mathrm{API} = \max(I_1, I_2, \cdots, I_n) \tag{2}$$
China subsequently adopted the air quality index (AQI) [26] to quantitatively describe air quality. AQI adopts more stringent standards and includes more pollutant indicators such as PM2.5, O3 and CO; consequently, results evaluated using the AQI better reflect the atmospheric conditions.
To calculate the AQI, the individual air quality index (IAQI) of each pollutant was calculated from the measured concentrations of fine particulate matter (PM2.5), inhalable particulate matter (PM10), sulfur dioxide (SO2), nitrogen dioxide (NO2), ozone (O3), and carbon monoxide (CO). IAQI is calculated as follows:
$$\mathrm{IAQI}_P = \frac{\mathrm{IAQI}_{Hi} - \mathrm{IAQI}_{Lo}}{BP_{Hi} - BP_{Lo}} \left(C_P - BP_{Lo}\right) + \mathrm{IAQI}_{Lo} \tag{3}$$
where $\mathrm{IAQI}_P$ is the individual air quality index of pollutant P, $C_P$ is the mass concentration of pollutant P, $BP_{Hi}$ is the concentration breakpoint that is no less than $C_P$, $BP_{Lo}$ is the concentration breakpoint that is no more than $C_P$, $\mathrm{IAQI}_{Hi}$ is the index breakpoint corresponding to $BP_{Hi}$, and $\mathrm{IAQI}_{Lo}$ is the index breakpoint corresponding to $BP_{Lo}$. The AQI is the maximum of the IAQIs of all pollutants:
$$\mathrm{AQI} = \max(\mathrm{IAQI}_1, \mathrm{IAQI}_2, \mathrm{IAQI}_3, \cdots, \mathrm{IAQI}_n) \tag{4}$$
When the AQI is greater than 50, the pollutant with the largest IAQI is defined as the primary pollutant.
The Environmental Protection Bureau of Nanchang City did not monitor AQI-related data (PM2.5, O3 and CO) until 2013, when this was required by the Ministry of Environmental Protection of China. Therefore, in this paper the API was used prior to 2013 because the additional pollutant indicators were lacking, and the AQI was used from 2013 onwards. Because AQI and API are two different indices, a significance test is required to determine whether the two indices can be used together [27,28].
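The piecewise-linear interpolation behind the API and IAQI sub-indices described above can be sketched in Python as follows. This is a minimal illustration rather than the official calculation: the breakpoint table is hypothetical (it is not taken from the standard cited in the text), and the names `sub_index` and `overall_index` are assumptions introduced here.

```python
from bisect import bisect_left

# Hypothetical breakpoint table for one pollutant: (concentration, index) pairs.
# Illustrative values only; they are NOT taken from the cited standard.
BREAKPOINTS = [(0.0, 0), (50.0, 50), (150.0, 100), (250.0, 150), (350.0, 200)]


def sub_index(conc, breakpoints=BREAKPOINTS):
    """Piecewise-linear sub-index of one pollutant between two breakpoints."""
    concs = [c for c, _ in breakpoints]
    if not concs[0] <= conc <= concs[-1]:
        raise ValueError("concentration outside the breakpoint table")
    hi = max(1, bisect_left(concs, conc))  # first breakpoint >= conc
    bp_lo, i_lo = breakpoints[hi - 1]
    bp_hi, i_hi = breakpoints[hi]
    return (i_hi - i_lo) / (bp_hi - bp_lo) * (conc - bp_lo) + i_lo


def overall_index(sub_indices):
    """Overall index = largest sub-index; primary pollutant only if index > 50."""
    name = max(sub_indices, key=sub_indices.get)
    value = sub_indices[name]
    return value, (name if value > 50 else None)


# Placeholder sub-indices for three pollutants (not study data).
print(overall_index({"PM10": sub_index(100.0), "SO2": 40.0, "NO2": 60.0}))
```

With the hypothetical table, a concentration of 100 lies halfway between the breakpoints 50 and 150, so its sub-index lies halfway between 50 and 100; the overall index is then simply the maximum over the pollutants.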
Daily AQI and API values from 2013 to 2014 were used to conduct the significance test. The relationship between the two indices was statistically significant: the p-value ($4.11 \times 10^{-8}$, reported as 0.000) was less than 0.05, indicating that AQI and API were significantly related.

Grey Correlation Analysis
Grey correlation analysis is a quantitative, comparative method for describing trends within a system [29,30]. It has several advantages: it requires little data and is therefore well suited to small samples; it does not require a strict calibration procedure, so it is simpler and easier to use than other statistical methods; and its results are intuitive and easy to understand.
The method works by determining the similarity of the sequence reference data array with comparative data. Grey correlation analysis has been widely used in air quality assessment and prediction. For example, You et al. [31] assessed the air quality in Japan and found that air quality declined year on year from 2008. Furthermore, Qin et al. [32] assessed the winter particulate matter concentration for Beijing, Shanghai, Guangzhou and Lanzhou during 2013 and 2014. Their results showed that particulate matter was highly correlated with CO, NO 2 , and SO 2 . Wang et al. [33] simulated the air pollution of five cities in the Jiangsu Province, China using the improved grey dynamic trend model, which was shown to simulate air pollution with more accuracy. Chen et al. [34] simulated the hourly concentrations of PM 10 and PM 2.5 of Taichung in 2008, then compared the results with data obtained from a back-propagation artificial neural network (BPNN). Their results showed that the grey model can predict the hourly PM concentration accurately. Finally, Pan et al. [35] found that the grey correlation model can be used to simulate the air quality in Tianjin during 2001 to 2009, and they predicted that air quality will continue to improve during 2010 to 2015.
Grey correlation analysis was applied following a five-step process:
Step 1: Define the reference and comparison sequences.
The data sequence that reflects the characteristics of a system is the reference sequence and the data sequences that affect the behavior of a system are the comparison sequences.
Step 2: Apply the non-dimensional method to the reference and comparison sequences.
Because the physical meaning and dimensions of each factor in the system differ, it is difficult to draw an accurate conclusion when factors are compared with each other directly. Thus, in grey relational analysis the data are usually made dimensionless as follows:
$$x'_{0j} = \frac{x_{0j} - x_{0,\min}}{x_{0,\max} - x_{0,\min}} \tag{5}$$
$$x'_{ij} = \frac{x_{ij} - x_{ij,\min}}{x_{ij,\max} - x_{ij,\min}} \tag{6}$$
In Equation (5), $x_{0j}$ is the j-th sample of the reference sequence, j is the serial number of samples from the reference sequence, $x_{0,\max}$ is the maximum value of the reference sequence and $x_{0,\min}$ is the minimum value of the reference sequence.
In Equation (6), $x_{ij}$ is the j-th sample of the i-th factor in the comparison sequences, $x_{ij,\max}$ is the maximum value of the i-th factor and $x_{ij,\min}$ is the minimum value of the i-th factor.
Step 3: Determine the grey correlation coefficient for the reference and comparison sequences.
The degree of correlation refers to the difference between the geometric shapes of the curves for the reference and comparison sequences. As such, the difference between the curves can be used as a measure of the degree of correlation. For reference sequence $X_0$ and several comparison sequences $X_1, X_2, \cdots, X_n$, the correlation coefficient of the reference sequence and the i-th comparison sequence is denoted by $\xi(X_i)$; its value at sample j is:
$$\xi_i(j) = \frac{\min_i \min_j \left|x'_{0j} - x'_{ij}\right| + \rho \max_i \max_j \left|x'_{0j} - x'_{ij}\right|}{\left|x'_{0j} - x'_{ij}\right| + \rho \max_i \max_j \left|x'_{0j} - x'_{ij}\right|} \tag{7}$$
where $x'_{0j}$ and $x'_{ij}$ are the dimensionless values of the reference and comparison sequences, and $\rho$ is called the resolution coefficient. Its values range from 0 to 1, and it is usually set to 0.5.
Step 4: Determine the degree of correlation ($r_i$).
The correlation coefficient is the degree of relevance between a comparison sequence and the reference sequence at different moments; consequently, it has more than one value. It is therefore necessary to condense the correlation coefficients at each point of the curve into a single average value that measures the degree of correlation between the comparison sequence and the reference sequence:
$$r_i = \frac{1}{N} \sum_{j=1}^{N} \xi_i(j) \tag{8}$$
where N is the number of samples in each sequence.
Step 5: Sort the degrees of correlation.
Relevance between factors is described by the order of the degrees of correlation rather than by their magnitude. The relational sequence is therefore formed by arranging the degrees of correlation of the subsequences with respect to the same generating sequence in order of size, denoted by $\{x\}$; this reflects the relative merits of each subsequence with respect to the generating sequence. If $r_{0i} > r_{0j}$, then for the same generating sequence $\{x_0\}$, $\{x_i\}$ is superior to $\{x_j\}$, which is denoted by $\{x_i\} > \{x_j\}$; $r_{0i}$ is the characteristic value of the i-th subsequence with respect to the generating sequence.
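The five steps above can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes min-max scaling for the dimensionless step, and the function names (`min_max`, `grey_relational_grades`, `rank_factors`) and the toy sequences are hypothetical.

```python
def min_max(seq):
    """Step 2: scale a sequence to [0, 1] (assumed min-max scaling)."""
    lo, hi = min(seq), max(seq)
    if hi == lo:
        raise ValueError("a constant sequence cannot be min-max scaled")
    return [(v - lo) / (hi - lo) for v in seq]


def grey_relational_grades(reference, comparisons, rho=0.5):
    """Steps 3-4: degree of correlation of each comparison sequence."""
    if any(len(seq) != len(reference) for seq in comparisons.values()):
        raise ValueError("all sequences must have the same length")
    x0 = min_max(reference)
    # Absolute differences between the scaled reference and each scaled factor.
    deltas = {
        name: [abs(a - b) for a, b in zip(x0, min_max(seq))]
        for name, seq in comparisons.items()
    }
    flat = [d for row in deltas.values() for d in row]
    d_min, d_max = min(flat), max(flat)
    if d_max == 0:  # every factor coincides with the reference after scaling
        return {name: 1.0 for name in deltas}
    grades = {}
    for name, row in deltas.items():
        coeffs = [(d_min + rho * d_max) / (d + rho * d_max) for d in row]
        grades[name] = sum(coeffs) / len(coeffs)  # average over all samples
    return grades


def rank_factors(grades):
    """Step 5: factor names sorted from highest to lowest degree of correlation."""
    return sorted(grades, key=grades.get, reverse=True)


# Toy sequences (not study data): "a" follows the reference, "b" mirrors it.
grades = grey_relational_grades([1, 2, 3, 4], {"a": [10, 20, 30, 40], "b": [4, 3, 2, 1]})
print(rank_factors(grades))
```

Because only the ordering of the grades is interpreted, a factor whose scaled curve tracks the reference closely (here "a") ranks above one that moves in the opposite direction ("b").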

Data Envelopment Analysis
Data envelopment analysis (DEA) is a linear programming methodology to evaluate the efficiency of multiple decision-making units (DMUs) when the production process presents a structure of multiple inputs and outputs [36]. This method is not required to determine the explicit expression of the relationship between input and output variables, which eliminates many subjective factors, and hence, has strong objectivity. Therefore, the DEA method is widely used in efficiency evaluation. Many scholars have used DEA to evaluate environmental pollution, and recommend decision-making plans. Sueyoshi and Yuan [37] determined that the Chinese government should allocate economic resources to cities, strengthen environmental protection, and bring energy consumption under control. In a follow up study, Sueyoshi and Yuan [38] found that industries need to reduce fossil fuel use and use more renewable green energy. Wang et al. [39] determined that China's industrial zones should reduce pollution through technical investment and reduce the proportion of coal use. Finally, Moutinho et al. [40] argued that, although high environmental taxes and high pressure environmental policies can make European countries livable, these policies can also hamper economic growth.
Data envelopment analysis regards an economic system or a production process as an entity (unit). Within a certain range, units invest a quantity of production factor and produce a quantity of "product"; such units are called DMUs. The same type of DMU, with the same target and task, the same external environment, and the same input and output indicators can form a DMU set.
The input vector of a DMU in an economic (production) activity is $X = (x_1, \cdots, x_i, \cdots, x_m)$, where $x_i$ is the i-th input; the output vector is $Y = (y_1, \cdots, y_r, \cdots, y_s)$, where $y_r$ is the r-th output; $(X_j, Y_j)$ is the input and output vector of the j-th DMU, and $(X_0, Y_0)$ are the corresponding vectors of the DMU under evaluation. Thus, the entire production activity of a DMU can be represented by $(X, Y)$; the input set of n DMUs forms an n × m input matrix, while the output set forms an n × s output matrix.
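In this paper, we use the BCC model, which was proposed by Banker, Charnes and Cooper [41] and is defined as follows:
$$\begin{aligned} \min\ & \theta \\ \text{s.t.}\ & \sum_{j=1}^{n} \lambda_j X_j \le \theta X_0, \\ & \sum_{j=1}^{n} \lambda_j Y_j \ge Y_0, \\ & \sum_{j=1}^{n} \lambda_j = 1, \quad \lambda_j \ge 0, \quad j = 1, \cdots, n \end{aligned} \tag{9}$$
where $\lambda_j$ is the weight assigned to the j-th DMU. The production possibility set of the BCC model for computing pure technical efficiency can be expressed as:
$$T_{BCC} = \left\{ (X, Y) \ \middle|\ \sum_{j=1}^{n} \lambda_j X_j \le X,\ \sum_{j=1}^{n} \lambda_j Y_j \ge Y,\ \sum_{j=1}^{n} \lambda_j = 1,\ \lambda_j \ge 0 \right\} \tag{10}$$
When θ = 1, the evaluated DMU is technically efficient. When θ < 1, the evaluated DMU is not technically efficient.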
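To make the meaning of θ concrete, the following Python sketch computes an input-oriented BCC efficiency score for the special case of one input and one output per DMU. It is an assumption-laden illustration, not the MATLAB procedure used in the study: in this special case the linear program has an optimum that uses at most two DMUs, so the sketch enumerates single DMUs and pairs instead of calling a solver. The function name `bcc_efficiency_1x1` and the toy data are hypothetical, and the general case with several inputs needs a linear programming solver.

```python
def bcc_efficiency_1x1(inputs, outputs, k):
    """Input-oriented BCC efficiency (theta) of DMU k, one input and one output.

    Finds the smallest input that a convex combination of observed DMUs needs
    to produce at least outputs[k], then divides it by inputs[k].
    """
    if len(inputs) != len(outputs):
        raise ValueError("inputs and outputs must have the same length")
    x0, y0 = inputs[k], outputs[k]
    if x0 <= 0:
        raise ValueError("the evaluated DMU must have a positive input")
    n = len(inputs)
    candidates = []
    for j in range(n):
        # A single DMU that already produces at least y0.
        if outputs[j] >= y0:
            candidates.append(inputs[j])
        for m in range(n):
            # Convex combination of DMU j (below y0) and DMU m (above y0)
            # that produces exactly y0.
            if outputs[j] < y0 < outputs[m]:
                t = (y0 - outputs[j]) / (outputs[m] - outputs[j])
                candidates.append((1 - t) * inputs[j] + t * inputs[m])
    return min(candidates) / x0  # DMU k itself is a candidate, so theta <= 1


# Toy data (not study data): four DMUs with one input and one output each.
x = [2.0, 4.0, 8.0, 6.0]
y = [1.0, 3.0, 4.0, 2.0]
print([bcc_efficiency_1x1(x, y, k) for k in range(4)])  # [1.0, 1.0, 1.0, 0.5]
```

In the toy data the first three DMUs lie on the frontier (θ = 1), while the fourth could produce its output with half of its input by mixing the first two, so its θ is 0.5.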

Meteorological Data
Meteorological observational data from 2003 to 2014 (12 years in total) was provided by the Nanchang National Meteorological Station. According to the Technical Regulation for Haze Pollution Day Judging (on trial, Ministry of Environmental Protection of the People's Republic of China, 2014) [42], the haze days per year of Nanchang were obtained.

Air Quality (Air Pollutants) Monitoring Data
Air quality data was provided by Environmental Protection Bureau of Nanchang City collected from nine environmental monitoring sites. These monitoring data from 2003 to 2014 (12 years in total) include many air pollution factors such as daily mean PM 10 , NO 2 and SO 2 . Nanchang began monitoring air quality earlier than most other cities in China. Starting in 2001, Environmental Protection Bureau of Nanchang City and Nanchang Municipal Meteorological Bureau began to work together to set up environmental monitoring sites. The collected data were processed by professional personnel. Therefore, the accuracy, continuity and authority of the data are ensured.

Lung Cancer Case Data
Lung cancer case data were obtained from a local hospital that specialises in oncology; it is the research center for cancer prevention and control of Jiangxi Province and a tertiary referral hospital combining medical treatment, prevention, teaching and scientific research. The data take the form of lung cancer registration records, which include case number, gender, age, time of diagnosis, place of residence (urban or rural), and smoking history.

Population Statistical Data
Population data for Nanchang during 2003 to 2014 was provided by the Nanchang Statistical Yearbook (Bureau of Statistics of Nanchang, 1996-2014) [43].

Results
Newly diagnosed lung cancer patients in Nanchang from 2003 to 2014 are shown in Table 1. It is clear that the number of lung cancer patients is on the rise: in 2003, there were 119 lung cancer patients; by 2014, that number had increased to 454.
Across different parts of Nanchang, the number of rural lung cancer patients is growing much faster than the number of urban lung cancer patients. In 2003, the number of lung cancer patients in urban areas (99) was approximately five times that in rural areas (20), whereas in 2014 the numbers in urban areas (229) and rural areas (225) were almost identical.

Grey Correlation Analysis
This section employs the grey correlation analysis method to identify the air pollution factors most closely related to lung cancer. The incidence of lung cancer in Nanchang was used as the reference sequence, while the air pollution factors were used as comparison sequences. There can be a time lag between air pollution and lung cancer, so different time lags were taken into consideration. Specific data used in the grey correlation analysis are shown in Table 2. The accumulated incidence of lung cancer in Nanchang was used as the reference sequence (2004-2014, 2005-2014, 2006-2014, 2007-2014, 2008-2014, 2009-2014, 2010-2014, 2011-2014, 2012-2014, 2013-2014). Corresponding to the reference sequence, the accumulated annual mean concentrations of SO2, NO2 and PM10, annual haze days, and annual mean API/AQI were used as comparison sequences with time lags of between one and ten years (2003-2013, 2003-2012, 2003-2011, 2003-2010, 2003-2009, 2003-2008, 2003-2007, 2003-2006, 2003-2005, 2003-2004). The grey correlation for the respective time lags was then calculated and the results are shown in Table 3 and Figure 1.
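The pairing of reference and comparison years for a given lag can be sketched in Python as follows. This only illustrates the year alignment described above; the function name `align_with_lag` and the series values are placeholders introduced here, and the accumulation applied to the sequences in the study is not reproduced.

```python
def align_with_lag(years, incidence, pollutant, lag):
    """Pair the pollutant value in year t with the incidence in year t + lag."""
    n = len(years)
    if not (len(incidence) == len(pollutant) == n):
        raise ValueError("all series must cover the same years")
    if not 0 <= lag < n:
        raise ValueError("lag must be between 0 and len(years) - 1")
    reference = list(zip(years[lag:], incidence[lag:]))           # later years
    comparison = list(zip(years[:n - lag], pollutant[:n - lag]))  # earlier years
    return reference, comparison


years = list(range(2003, 2015))                  # 2003..2014, the study period
incidence = [float(i) for i in range(12)]        # placeholder values, not study data
pollutant = [float(10 * i) for i in range(12)]   # placeholder values, not study data

reference, comparison = align_with_lag(years, incidence, pollutant, lag=1)
print(reference[0][0], reference[-1][0])    # 2004 2014
print(comparison[0][0], comparison[-1][0])  # 2003 2013
```

With a lag of one year, the incidence for 2004-2014 is compared against pollution for 2003-2013; with a lag of ten years, only 2013-2014 incidence and 2003-2004 pollution remain, which is why the sequences shorten as the lag grows.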
When the time lag is between one and four years, the grey correlation for PM10 is higher than that of any other pollution factor, indicating that PM10 has the greatest impact on the incidence of lung cancer. For time lags of five, nine and ten years, PM10 has the second highest grey correlation, suggesting that PM10 also has an impact on the incidence of lung cancer at these lags. For time lags between one and three years, the grey correlation of API/AQI is the second highest, behind PM10. When the time lag is between six and eight years, the API/AQI grey correlation is second only to SO2. At a lag of nine years, the grey correlation of API/AQI is the highest, suggesting that the incidence of lung cancer in Nanchang residents was also closely related to API/AQI.
The grey correlation of haze is the lowest at all time lags, meaning that haze is less closely related to the incidence of lung cancer than the other pollution factors.
In summary, as the time lag increases, the grey correlation between each air pollution factor and the incidence of lung cancer shows a decreasing trend, and PM10 appears to be one of the most significant factors influencing the incidence of lung cancer in Nanchang.
Lung cancer is a chronic disease that relies on cumulative effects, so the association between air pollution and incidence would be expected to strengthen with longer lags, which seems to contradict our result. However, practical circumstances need to be taken into account. In China, many government officials blindly pursue rapid GDP growth and ignore the pollution caused by the introduction of heavily polluting industries. The situation in Nanchang is the same. The GDP of Nanchang was 64.1 billion Yuan in 2003. After 2003, annual GDP growth exceeded 10%, far higher than the annual growth rate of GDP in China as a whole. In 2014, the GDP of Nanchang was 336.7 billion Yuan, over 5 times that of 2003 [42]. The rapid GDP growth brought by heavily polluting industry, together with the surge in car numbers, has made air pollution worse year by year. Automobile exhaust and coal combustion contain large amounts of PAHs, and increased PAH concentrations can significantly increase the incidence of lung cancer [3]. The cumulative amount of air pollution factors that cause lung cancer varies annually, increasing almost every year, which leads to the increasing number of lung cancer patients seen in Table 2.

Grey Correlation for Air Pollution Factors during a 5-Year Period and the Incidence of Lung Cancer during 5-Year Period with Different Time Lags
The grey correlation between air pollution factors during 5-year periods and the incidence of lung cancer during 5-year periods was calculated for different time lags. Firstly, the incidence of lung cancer in Nanchang during 2010-2014 was used as a reference sequence. Corresponding to the reference sequence, annual mean concentrations of SO2, NO2 and PM10, annual haze days, and annual mean API/AQI were used as comparison sequences with time lags of between one and seven years (2009-2013, 2008-2012, 2007-2011, 2006-2010, 2005-2009, 2004-2008, 2003-2007). The grey correlations for the respective time lags were then calculated and the results are shown in Table 4. Results show that the grey correlation for PM10 is the highest (0.7716) when the time lag is one year and seven years. This indicates that the annual mean concentration of PM10 ...
Next, the incidence of lung cancer in Nanchang during 2009-2013 was used as a reference sequence, and annual mean concentrations of SO2, NO2 and PM10, annual haze days, and annual mean API/AQI were used as comparison sequences with time lags of between one and six years (2008-2012, 2007-2011, 2006-2010, 2005-2009, 2004-2008, 2003-2007). The grey correlation for the different time lags was calculated and the results are shown in Table 5. Results show that, with a lag of one, two, three, and six years, PM10 has either the highest or the second highest grey correlation, indicating that PM10 has a significant impact on the incidence of lung cancer during 2009 to 2013. For lags of one to six years, the grey correlation for API/AQI is either the highest or the second highest, suggesting that API/AQI also has a major impact on the incidence of lung cancer during 2009 to 2013.
Air pollution factors other than PM10 and API/AQI rarely display high grey correlations. The grey correlation for SO2 is the second highest at a four-year lag and the grey correlation for NO2 is the highest at a five-year lag. The grey correlation for annual haze days never ranks first or second. Thus, we can infer that SO2, NO2 and annual haze days do not substantially impact the incidence of lung cancer during 2009 to 2013.
Finally, the incidence of lung cancer in Nanchang during 2008-2012 was used as a reference sequence, and annual mean concentrations of SO2, NO2 and PM10, haze days, and API/AQI were used as comparison sequences with time lags of between one and five years (2007-2011, 2006-2010, 2005-2009, 2004-2008, 2003-2007). The grey correlation for the different time lags was calculated and the results are shown in Table 6. At time lags of two, three, and five years, the grey correlation for PM10 is either the highest or the second highest. Furthermore, at time lags of between one and three years, the grey correlation for API/AQI is either the highest or the second highest. These results suggest that both PM10 and API/AQI have a significant impact on the incidence of lung cancer from 2008 to 2012.
In summary, our results show that PM10 is the most serious air pollution factor causing lung cancer.

DEA
In this section, we use DEA to study the main air pollution factors contributing to lung cancer and study the relationship between the incidence of lung cancer and air pollution factors for different types of patients.

Lung Cancer Incidence and Air Pollution Factors in Nanchang
Based on the BCC model, the impact of air pollution factors on the incidence of lung cancer in Nanchang was evaluated using MATLAB (version R2016b). Grey correlation analysis shows that PM10 is the most significant air pollution factor causing lung cancer. Consequently, the following two configurations are used to verify the effectiveness of DEA for simulating the incidence of lung cancer. The first configuration contains all the pollution factors, while the second contains only PM10. For both configurations, the time lag of the incidence of lung cancer was varied from zero to four years. These results are shown in Table 7 and Figure 2.
Table 7. Effectiveness of DEA for different time lags: (1) between PM10 and the incidence of lung cancer; (2) between all factors and the incidence of lung cancer.
The results for the configuration containing all five pollution factors are very similar to those for PM10 alone, and the line graphs of the two configurations are also very similar. These similarities reinforce our earlier finding that PM10 is the most critical air pollution factor contributing to lung cancer.

The Relationship between the Incidence of Lung Cancer and Air Pollution for Different Types of Patients
Lung cancer patients in Nanchang during 2003 to 2014 were classified as rural or urban, male or female, and smoker or non-smoker. To further understand the relationship between the incidence of lung cancer and air pollution, these classification criteria were used simultaneously. Taking both gender and smoking history into consideration, four detailed categories were formed: male smoker (male-s), male non-smoker (male-ns), female smoker (female-s), and female non-smoker (female-ns). Using age and smoking history a further four categories were formed: smoker older than 70 (s > 70), smoker younger than 70 (s ≤ 70), non-smoker older than 70 (ns > 70), and non-smoker younger than 70 (ns ≤ 70).
The DEA method was applied to evaluate the impact of all 5 air pollution factors on the incidence of lung cancer with different time lags of lung cancer incidence ranging from none to four years. Then the patients were sorted from large to small according to their DEA effectiveness. These results are shown in Table 8.
For nearly all the time lags, DEA effectiveness suggests that urban dwellers are at the highest risk from air pollution induced lung cancer, followed by female, smoker, male, non-smoker, and Results show that the analyses containing all five pollution factors are very similar to the results of PM 10 . Furthermore, line graphs of the two conditions are very similar. These similarities reinforce our earlier finding that PM 10 is the most critical air pollution factor contributing to lung cancer.

The Relationship between the Incidence of Lung Cancer and Air Pollution for Different Types of Patients
Lung cancer patients in Nanchang from 2003 to 2014 were classified as rural or urban, male or female, and smoker or non-smoker. To further understand the relationship between the incidence of lung cancer and air pollution, these classification criteria were also applied in combination. Taking both gender and smoking history into consideration, four detailed categories were formed: male smoker (male-s), male non-smoker (male-ns), female smoker (female-s), and female non-smoker (female-ns). Using age and smoking history, a further four categories were formed: smoker older than 70 (s > 70), smoker aged 70 or younger (s ≤ 70), non-smoker older than 70 (ns > 70), and non-smoker aged 70 or younger (ns ≤ 70).
The DEA method was applied to evaluate the impact of all five air pollution factors on the incidence of lung cancer, with time lags of lung cancer incidence ranging from zero to four years. The patient categories were then ranked in descending order of DEA effectiveness. These results are shown in Table 8.
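The ranking step can be sketched in a few lines of Python. The helper `rank_by_effectiveness` and the scores below are hypothetical and serve only to show the descending sort per time lag; they are not values from Table 8.

```python
from typing import Dict, List


def rank_by_effectiveness(scores: Dict[str, float]) -> List[str]:
    """Return category names sorted from highest to lowest DEA effectiveness."""
    return sorted(scores, key=lambda name: scores[name], reverse=True)


# Hypothetical effectiveness scores per time lag (NOT values from Table 8).
table = {
    0: {"urban": 1.00, "female": 0.95, "smoker": 0.90, "rural": 0.70},
    1: {"urban": 0.98, "female": 0.99, "smoker": 0.85, "rural": 0.65},
}

for lag, scores in sorted(table.items()):
    print(lag, rank_by_effectiveness(scores))
```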
For nearly all time lags, DEA effectiveness suggests that urban dwellers are at the highest risk of air pollution-induced lung cancer, followed by female, smoker, male, non-smoker, and rural patients. Table 8 shows that the DEA effectiveness between air pollution and urban patients is greater than that between air pollution and rural patients. This coincides with the incidence of lung cancer in Nanchang from 2003 to 2014. Table 9 shows that in 2003, the incidence of lung cancer in urban areas (2.6817/100,000) was approximately 6 times that in rural areas (0.4507/100,000), whereas in 2014 the incidence in urban areas (4.3701/100,000) and in rural areas (4.2937/100,000) was almost identical. This change is related to the acceleration of China's urbanisation and rural development. The DEA effectiveness between air pollution and female patients is greater than that between air pollution and male patients.
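The urban-to-rural comparison can be checked directly from the four incidence figures quoted from Table 9. The short Python calculation below uses only those figures; the growth ratios are derived here for illustration and are not stated in the original tables.

```python
# Incidence per 100,000, as quoted in the text from Table 9.
urban_2003, rural_2003 = 2.6817, 0.4507
urban_2014, rural_2014 = 4.3701, 4.2937

ratio_2003 = urban_2003 / rural_2003    # urban-to-rural ratio in 2003
ratio_2014 = urban_2014 / rural_2014    # urban-to-rural ratio in 2014
urban_growth = urban_2014 / urban_2003  # urban growth factor, 2003 to 2014
rural_growth = rural_2014 / rural_2003  # rural growth factor, 2003 to 2014

print(f"{ratio_2003:.2f} {ratio_2014:.2f} {urban_growth:.2f} {rural_growth:.2f}")
# prints: 5.95 1.02 1.63 9.53
```

The gap closes mainly because rural incidence rose by a factor of about 9.5 over the period, while urban incidence rose by a factor of about 1.6.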
The DEA effectiveness between air pollution and smoking patients is greater than that between air pollution and non-smoking patients, because smoking is one of the major causes of lung cancer. The incidence among smoking patients is approximately twice that among non-smoking patients. According to the actual incidence data, the incidence of lung cancer is increasing in both smokers and non-smokers.
As shown in Tables 8 and 10, across all 14 categories, the DEA effectiveness between air pollution and male smoking patients is greater than that between air pollution and male non-smoking patients. This indicates that the combined effect of smoking and air pollution exposure greatly increases the risk of lung cancer. The incidence of lung cancer in smoking patients is more than three times that in non-smoking patients. Considering that most smokers in China are male, this phenomenon is particularly noteworthy.
The DEA effectiveness between air pollution and female non-smoking patients is greater than that between air pollution and female smoking patients. This suggests that air pollution was not significantly associated with lung cancer among female smoking patients.

Discussion
As far as we know, this study is one of the few to examine the relationship between air pollution and lung cancer in the middle reaches of the Yangtze River. Using grey correlation analysis and DEA, the relationship between the incidence of lung cancer and air pollution factors in Nanchang was explored. The results obtained by the two analysis methods were similar, which reinforces the reliability of the results.
Using grey correlation analysis, our study found that among five air pollution factors (annual mean concentrations of SO2, NO2, and PM10, annual haze days, and annual mean API/AQI), PM10 was the most closely related to the incidence of lung cancer in Nanchang. This result was reinforced by DEA, which showed that PM10 was the most critical air pollution factor for lung cancer.
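For readers unfamiliar with grey correlation analysis, the following Python sketch computes a Deng-style grey relational grade for each factor against a reference series. It is an illustration under stated assumptions, not the procedure used in this study: the mean-based normalisation, the distinguishing coefficient rho = 0.5, the function name `grey_relational_grades`, and the example series are all assumed or hypothetical.

```python
from typing import Dict, List


def grey_relational_grades(
    reference: List[float],
    factors: Dict[str, List[float]],
    rho: float = 0.5,
) -> Dict[str, float]:
    """Grey relational grade of each factor series against the reference."""
    for name, series in factors.items():
        if len(series) != len(reference):
            raise ValueError(f"length mismatch for factor {name!r}")

    def scale(series: List[float]) -> List[float]:
        # Assumed normalisation: divide each value by the series mean.
        mean = sum(series) / len(series)
        return [x / mean for x in series]

    ref = scale(reference)
    # Absolute differences between the scaled reference and each scaled factor.
    deltas = {
        name: [abs(r - x) for r, x in zip(ref, scale(series))]
        for name, series in factors.items()
    }
    all_deltas = [d for row in deltas.values() for d in row]
    d_min, d_max = min(all_deltas), max(all_deltas)
    if d_max == 0:
        # Every factor matches the reference exactly after scaling.
        return {name: 1.0 for name in factors}

    grades = {}
    for name, row in deltas.items():
        coeffs = [(d_min + rho * d_max) / (d + rho * d_max) for d in row]
        grades[name] = sum(coeffs) / len(coeffs)  # mean relational coefficient
    return grades


# Hypothetical series (NOT the study's data).
incidence = [1.0, 2.0, 3.0, 4.0]
example_factors = {
    "factor_a": [2.0, 4.0, 6.0, 8.0],  # moves in step with the reference
    "factor_b": [4.0, 3.0, 2.0, 1.0],  # moves against the reference
}
grades = grey_relational_grades(incidence, example_factors)
print(sorted(grades, key=grades.get, reverse=True))
```

In this toy example the factor that tracks the reference series receives the higher grade, which is the sense in which a factor is said to be "most closely related" to the incidence series.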
One interesting finding is that the DEA effectiveness between air pollution and urban patients was greater than that between air pollution and rural patients, which means that air pollution has more impact on urban dwellers than on rural dwellers. From related research on pollution sources in Nanchang, we found that automobile exhaust, soil dust, coal-burning dust, building dust, and metallurgical dust are the major sources of PM10. The main sources of pollution in urban areas are automobile exhaust emissions and industrial pollution. In rural areas, the main sources are automobile exhaust emissions, industrial pollution, and pollution from ore (marble and limestone) mining and building stone processing [44]. Quarries, cement plants, and building materials processing plants around Nanchang are among the traditional industries that have caused serious air pollution. During rural development, some outdated or heavily polluting industries that were unable to meet environmental requirements were relocated from urban to rural areas to avoid being shut down. Some government officials welcomed these companies in order to raise GDP in their jurisdictions. Another fact that cannot be ignored is that many rural residents actually work in urban areas and breathe the same air as urban dwellers. All of these factors have greatly increased the incidence of lung cancer among rural residents.
Another interesting finding is that the DEA effectiveness between air pollution and female patients was greater than that between air pollution and male patients, indicating that women are more susceptible to lung cancer caused by air pollution. Most adult Chinese women take on household responsibilities, including cooking. Unlike Western cooking, Chinese cooking uses a lot of oil. Hot pans and oil produce large amounts of cooking oil fumes, which can potentially cause lung cancer. Several studies have shown a significant association between cooking oil fumes and lung cancer in Chinese women [45,46]. Another reason is that the female physique is more sensitive to air pollution than the male physique, and women are more likely to be exposed to passive smoking, which can cause lung cancer [45,47].
The DEA effectiveness between air pollution and smoking patients was greater than that between air pollution and non-smoking patients, meaning that smokers are more likely to develop lung cancer due to air pollution.
Industrialisation and urbanisation can promote economic development and increase the GDP growth rate. Between 2003 and 2014, Nanchang's economy grew rapidly, and GDP increased more than five-fold within twelve years (from 64.1 billion Yuan in 2003 to 336.7 billion Yuan in 2014). However, economic development and rapid population growth have put severe pressure on the atmosphere. In Nanchang, the capital city of Jiangxi Province, the problem of atmospheric pollution has become increasingly prominent in recent years. The main causes of air pollution are: (1) Many enterprises have entered Nanchang, but environmental management standards are inadequate, so many enterprises choose coal combustion equipment and outdated equipment, resulting in an increase in air pollutant emissions. (2) Population growth has driven a boom in the real estate industry in Nanchang, with construction under way everywhere. The lack of dust control measures during the demolition of old houses, the removal of debris, and the construction of new houses has significantly increased PM10. The real estate boom has also raised demand for products from cement plants and quarries around Nanchang. (3) Car ownership has increased, but exhaust control measures are lacking. Motor vehicles emitting black smoke can be seen everywhere in the streets, and motor vehicle exhaust has become a major source of air pollution. (4) The layout of the industrial zones is unreasonable and does not match the meteorological conditions. The northern industrial zone of Nanchang is located upwind of the city along the dominant wind direction, which is one of the direct reasons for the decline of air quality in the urban area.
According to our findings, we make the following recommendations: (1) Priority should be given to low-polluting industries, and the introduction of heavily polluting industries should be reduced. Heavily polluting industries need to meet environmental quality standards and pollutant discharge or emission standards before they can resume production. Increasing the use of clean and renewable energy and reducing coal consumption can greatly reduce the emission of air pollutants. (2) Control of PM10 generated by the real estate industry and related industries should be strengthened. Water mist should be used to suppress dust during demolition, and muck trucks should be fitted with dust covers. The use of environmentally friendly building materials should be encouraged. Trees and grass should be planted on open ground to reduce soil exposure, and the pits left by quarries need more vegetation cover. By the end of 2016, the urban green coverage rate was 45%, and the greening rate of villages was 32%. (3) The government should increase the proportion of new energy vehicles, and more people should be encouraged to use public transport. (4) Industrial areas need scientific planning to avoid affecting the health of residents: sites upwind of the city along the dominant wind direction should be avoided, and industrial areas should be kept at a certain distance from densely populated areas.

Conclusions
Air pollution is a serious problem in Nanchang. Our findings show that PM10 is closely related to lung cancer in Nanchang. Air pollution has more impact on urban dwellers than on rural dwellers. Women are more susceptible to lung cancer caused by air pollution. If appropriate measures are taken, the incidence of lung cancer in Nanchang can be expected to decline in the future.