Ambient Cumulative PM2.5 Exposure and the Risk of Lung Cancer Incidence and Mortality: A Retrospective Cohort Study

Smoking, sex, air pollution, lifestyle, and diet may act independently or in concert with each other to contribute to the different outcomes of lung cancer (LC). This study aims to explore their associations with the carcinogenesis of LC, which will be useful for formulating further preventive strategies. This retrospective, longitudinal follow-up cohort study was carried out by connecting to the MJ Health Database, Taiwan Cancer Registry database, and Taiwan cause of death database from 2000 to 2015. The studied subjects were persons attending the health check-ups, distributed throughout the main island of Taiwan. Cox proportional hazards regression models were used to investigate the risk factors associated with LC development and mortality after stratifying by smoking status, with a special emphasis on ambient two-year average PM2.5 exposure, using a satellite-based spatiotemporal model at a resolution of 1 km2, and on dietary habit including consumption of fruits and vegetables. After a median follow-up of 12.3 years, 736 people developed LC, and 401 people died of LC-related causes. For never smokers, the risk of developing LC (aHR: 1.32, 95%CI: 1.12–1.56) and dying from LC-related causes (aHR: 1.28, 95%CI: 1.01–1.63) rises significantly with every 10 μg/m3 increment of PM2.5 exposure, but not for ever smokers. Daily consumption of more than two servings of vegetables and fruits is associated with lowering LC risk in ever smokers (aHR: 0.68, 95%CI: 0.47–0.97), and preventing PM2.5 exposure is associated with lowering LC risk for never smokers.


Introduction
Lung cancer (LC) is the most common cancer in the world, with 2.2 million new cases in 2019, and it is by far the leading cause of cancer death among both men and women, making up 25% of all cancer deaths [1]. The overall incidence of LC in Taiwan ranks 15th globally and is the second highest in Asia, trailing only North Korea [1]. It has consistently been the leading cause of cancer-related death in Taiwan since 2011, with a mortality rate of 41.1 per 100,000 people in 2019 [2].
A synergistic interaction of many risk factors contributes to the development of LC, including environmental toxin exposure, genetic predisposition, infectious comorbidities, and individual lifestyle. Despite a decline in incidence and mortality of LC following global tobacco control policies [3,4], LC among never smokers still accounts for an estimated 20% of cases in men and more than 50% of cases in women [4,5]. Additionally, the global incidence of lung adenocarcinoma is still increasing against the trend of smoking rates in both sexes [4,6-9]. We can, therefore, speculate that tobacco smoke accounts for only a minority of LC development among nonsmokers. The corresponding effect of exposure to ambient fine particulate matter (PM2.5) on the development of LC and LC-related mortality, particularly in patients who have never smoked, has been demonstrated in several studies [10-14]. The impact of PM2.5 on LC development has been reported to be greater in Asia than in Western countries [15]. In Taiwan, never smokers constitute a major proportion of LC patients, up to 50% and 90% in males and females, respectively [2,10].
A nationwide registry-based study conducted in Taiwan demonstrated a steadily increasing age-adjusted incidence of lung adenocarcinoma in male never smokers, from 9.06 to 23.25 per 100,000 population as the smoking rate decreased from 59.4 to 29.9%, and from 7.05 to 24.22 per 100,000 population in female never smokers, from 1995 to 2015. The accelerated increase in PM2.5 levels, particularly in southern Taiwan, might be a possible explanation [10]. However, that retrospective study did not consider the impact of personal lifestyle and could not reflect the temporal relationship between PM2.5 and LC development.
The impact of dietary factors on LC development is uncertain. A negative association between the consumption of vegetables and fruits rich in β-carotene and vitamin C and LC development has been reported in female never smokers [16-18], while the opposite trend, an increase in LC, was noted among smokers receiving dietary supplementation of beta-carotene [19].
With the gradual implementation of LC prevention policies around the world, this retrospective, longitudinal follow-up study has two aims: (1) to evaluate and compare the risk factors of LC incidence and LC-related mortality in ever smokers and never smokers, especially emphasizing cumulative PM 2.5 exposure and individualized dietary habits; (2) to further investigate the factors that affect the survival of LC patients with a smoking habit vs. never smokers. Our findings provide evidence for further preventive strategy guidance.

Study Design and Data Source
The data for this retrospective, longitudinal follow-up cohort study were obtained from a population-based database of health examinations, the MJ Health Database (MJHD), from 2000 to 2015 in Taiwan. MJ is the name of the health check-up clinics and also the name of the database. The studied subjects were persons attending health check-ups, and they were distributed throughout the main island of Taiwan. The details of the MJHD have previously been described in the literature [20]. For each health examination visit, the results of physical examinations (anthropometric measurement and biological test data) and the results of self-administered questionnaires on lifestyle behaviors of participants were recorded. All participants signed an informed consent form before physical examination. We also linked the MJHD to Taiwan Cancer Registry (TCR) data and cause of death (COD) data for further evaluation of LC-related incidence and mortality. All participants were followed from baseline (i.e., the first health examination) until the endpoint (event of interest), the date of death, or the end of 2015, whichever came first. The study was approved by the Institutional Review Board (IRB) of Biomedical Science Research, Academia Sinica (IRB number: AS-IRB-BM-17044).

Selection of Study Population
The selection process of the study population is shown in Figure 1. A total of 471,669 participants who received at least one health examination recorded in the MJHD between 1 January 2000 and 31 December 2015 were enrolled. We first excluded 5208 participants with missing encrypted personal identification numbers (PIDNs), because a missing PIDN precludes linkage and follow-up. After linking the MJHD to the TCR and COD data through PIDNs, a total of 466,461 participants were included in the initial datasets. Participants with a cancer diagnosis or self-reported history of any cancer prior to the date of health examination, or who died within three months after the first visit, were excluded. We also excluded participants who had missing values on baseline covariates or implausible estimated glomerular filtration rate (eGFR) values (i.e., eGFR < 2 or eGFR ≥ 200), or who provided fewer than two eGFR measurements during the study period, since at least two eGFR records were needed to compute an annual decline in eGFR. A total of 174,431 participants were enrolled for the final analysis, including 130,559 never smokers and 43,572 ever smokers.
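The exclusion criteria described here can be sketched as a simple eligibility filter. This is a minimal illustration only, not the authors' SAS code: the `Participant` record and its field names are hypothetical, and the sketch assumes that a single out-of-range eGFR value excludes the participant, which the text does not specify.

```python
from dataclasses import dataclass
from typing import List

# Counts reported in the text for the first exclusion step.
ENROLLED = 471_669
MISSING_PIDN = 5_208
LINKED = 466_461
assert ENROLLED - MISSING_PIDN == LINKED


@dataclass
class Participant:
    """Hypothetical record; field names are illustrative assumptions."""
    has_pidn: bool
    prior_cancer: bool
    died_within_3_months: bool
    missing_covariates: bool
    egfr_values: List[float]  # mL/min/1.73 m2, one value per visit


def is_eligible(p: Participant) -> bool:
    """Apply the exclusion criteria in the order given in the text."""
    if not p.has_pidn:
        return False  # cannot be linked to the registry or death data
    if p.prior_cancer or p.died_within_3_months:
        return False
    if p.missing_covariates:
        return False
    if len(p.egfr_values) < 2:
        return False  # a slope needs at least two measurements
    # Assumption: any value outside [2, 200) excludes the participant.
    if any(v < 2 or v >= 200 for v in p.egfr_values):
        return False
    return True
```

Filtering a list of such records with `is_eligible` would yield the analysis cohort under these assumptions.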

Endpoints
For our first aim, the primary endpoint was incident LC after the first health examination visit, which was identified from the TCR data using the International Classification of Diseases for Oncology, 3rd Edition (ICD-O-3) codes C33-C34. The secondary endpoint was LC-related death after the first health examination visit. We identified LC-related death from the COD data using the 9th revision (ICD-9) or the 10th revision (ICD-10) of the International Classification of Diseases (ICD-9 code 162; ICD-10 codes C33-C34).
For our second aim, the endpoint was all-cause death after LC diagnosis among patients whose LC was diagnosed before 31 December 2014; these patients were followed up until death or the end of 2015. The numbers of lung cancer cases diagnosed in different years during the studied period are listed in Table S1.

Definition of Variables
Information regarding participants' demographic characteristics, lifestyle habits, family history, comorbidities, and medical history was collected via standard self-administered questionnaires and health examination records.
Diabetes mellitus was defined as fasting glucose levels >126 mg/dL or the current use of antihyperglycemic drugs. Hypertension, cardiovascular disease, and stroke were defined according to self-reported history or the current use of antihypertensive drugs and cardiac drugs. The estimated glomerular filtration rate (eGFR) was calculated from the Modification of Diet in Renal Disease (MDRD) equation [21]. The annual decline in eGFR was computed as the slope of the linear regression of eGFR on follow-up year. Both eGFR and carcinoembryonic antigen (CEA) are common indicators in health examinations. Previous studies have suggested that eGFR decline may be associated with the incidence of some cancers, including urothelial cancer and lung cancer [22,23]. CEA was used as a surrogate biomarker of cancers, particularly lung adenocarcinoma [24]. Never smokers were defined as those participants who self-reported never smoking. Participants who self-reported ever having smoked or smoking at least once a week were classified as ever smokers.
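The two eGFR-derived quantities can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: the text cites the MDRD equation [21] without giving its coefficients, so the widely used four-variable form for standardized creatinine (constant 175, without the race term) is assumed here, and the function names are hypothetical.

```python
from typing import Sequence


def egfr_mdrd(scr_mg_dl: float, age_years: float, female: bool) -> float:
    """Four-variable MDRD eGFR in mL/min/1.73 m2.

    Assumption: the 175-constant version for standardized serum
    creatinine; the study may have used a different variant.
    """
    egfr = 175.0 * scr_mg_dl ** -1.154 * age_years ** -0.203
    return egfr * 0.742 if female else egfr


def annual_egfr_slope(years: Sequence[float], egfr: Sequence[float]) -> float:
    """Ordinary least-squares slope of eGFR on follow-up year.

    A negative value indicates declining renal function per year.
    """
    n = len(years)
    if n < 2 or n != len(egfr):
        raise ValueError("need at least two paired measurements")
    mean_x = sum(years) / n
    mean_y = sum(egfr) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    if sxx == 0:
        raise ValueError("all measurements share the same follow-up year")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, egfr))
    return sxy / sxx


# Hypothetical participant: eGFR of 90, 88 and 86 at years 0, 1 and 2.
slope = annual_egfr_slope([0, 1, 2], [90.0, 88.0, 86.0])  # -2.0 per year
```

The requirement of at least two eGFR records follows directly from the slope calculation, which is undefined for a single measurement.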
LC cases captured in our data were further classified into four types: small-cell carcinoma (ICD-O-3 morphology codes 8041, 8045), adenocarcinoma (8140, 8250, 8255, 8260, 8550, 8551, and 8560), squamous cell carcinoma (8070, 8071, and 8072), and other carcinomas. LC stages were classified into four stages (I, II, III, and IV) according to the 6th and 7th editions of the American Joint Committee on Cancer (AJCC, https://cancerstaging.org/references-tools/deskreferences/Pages/default.aspx, (accessed on 3 May 2021)). PM2.5 exposure was estimated at the coordinates of each participant's address, as reported in the questionnaire, by using a satellite-based spatiotemporal model [25] with a high spatial resolution of 1 × 1 km on the basis of National Aeronautics and Space Administration (NASA) aerosol optical depth (AOD) data [26]. The two-year mean PM2.5 concentration (µg/m3) prior to the health examination date was used as an indicator of long-term exposure to ambient PM2.5 air pollution.
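The two-year exposure indicator can be sketched as below. The text does not state exactly how the two-year window was aligned to the examination date, so this sketch assumes annual estimates for one grid cell and averages the two calendar years preceding the examination year; the function name and the example concentrations are hypothetical.

```python
from statistics import mean
from typing import Mapping


def two_year_mean_pm25(annual_pm25: Mapping[int, float], exam_year: int) -> float:
    """Mean PM2.5 (ug/m3) over the two calendar years before the exam year.

    Assumption: `annual_pm25` maps calendar year to the modeled annual
    mean concentration for the 1 x 1 km grid cell containing the
    participant's address.
    """
    window = (exam_year - 2, exam_year - 1)
    missing = [year for year in window if year not in annual_pm25]
    if missing:
        raise KeyError(f"no PM2.5 estimate for year(s): {missing}")
    return mean(annual_pm25[year] for year in window)


# Hypothetical grid-cell series, not taken from the study data.
series = {2003: 22.0, 2004: 20.0, 2005: 25.0}
exposure = two_year_mean_pm25(series, exam_year=2005)  # mean of 2003 and 2004
```

A window defined on exact dates (for example, the 24 months before the visit) would need exposure estimates at a finer temporal resolution than assumed here.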

Statistical Analysis
Baseline characteristics of participants were presented as mean ± standard deviation or median (interquartile range (IQR)) for continuous variables and frequency (percentage) for categorical variables. Intergroup differences in continuous variables were compared by using the independent t-test or the Mann-Whitney U test, depending on whether the data were normally distributed. A Chi-square test or Fisher's exact test was used to compare intergroup differences in categorical variables, as appropriate.
To investigate the effect of PM2.5 and fruit or vegetable consumption on the incidence and mortality of lung cancer among all study participants, Cox proportional hazards regression models were used to estimate the hazard ratios (HR) and 95% confidence intervals (95% CI) after adjusting for the related covariates (details in Supplementary File S1). The cut-off point for the servings of fruits and vegetables was chosen empirically: after applying cut-off points from 2 to 5 servings, statistical significance was observed only at 2 servings. All statistical analyses were conducted with SAS software version 9.4 (SAS Institute, Cary, NC, USA). A p-value ≤ 0.05 was considered statistically significant.
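As an illustration of how a Cox regression coefficient translates into a hazard ratio per 10 µg/m3, the sketch below converts a log-hazard coefficient and its standard error into an HR with a Wald 95% CI. It is not the authors' SAS code. The coefficient and standard error in the example are back-calculated from the reported aHR of 1.32 (95% CI: 1.12-1.56) purely for demonstration; they are not fitted values from the study.

```python
import math
from typing import Tuple


def hazard_ratio(beta: float, se: float,
                 increment: float = 10.0, z: float = 1.96) -> Tuple[float, float, float]:
    """Hazard ratio and Wald 95% CI for a given exposure increment.

    `beta` is the log-hazard change per 1 ug/m3 and `se` its standard
    error; both scale linearly with the increment on the log scale.
    """
    point = math.exp(beta * increment)
    lower = math.exp((beta - z * se) * increment)
    upper = math.exp((beta + z * se) * increment)
    return point, lower, upper


# Illustrative inputs back-calculated from the reported estimate
# (assumption: a symmetric Wald interval on the log scale).
beta_per_unit = math.log(1.32) / 10.0
se_per_unit = (math.log(1.56) - math.log(1.12)) / (2 * 1.96 * 10.0)

hr, lo, hi = hazard_ratio(beta_per_unit, se_per_unit)
print(f"aHR per 10 ug/m3: {hr:.2f} (95% CI: {lo:.2f}-{hi:.2f})")
```

Changing `increment` rescales the estimate to a different exposure contrast, under the log-linear exposure-response relationship that the Cox model assumes for a continuous covariate.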

Results
Overall, a total of 736 (0.42%) LC cases were identified after a median follow-up of 12.3 years. The demographic data of the enrolled population are shown in Table 1. Compared to those who did not develop LC, LC patients were more likely to be older (55.0 vs. 39.0 years old, p < 0.001), and had a lower education level (67.7% vs. 39.0%, p < 0.001), poorer renal function (77.3 vs. 86.5 mL/min/1.73 m2, p < 0.001), a higher level of carcinoembryonic antigen (CEA) (2.3 vs. 1.5 mg/dL, p < 0.001), greater family history of LC (8.0% vs. 5.2%, p < 0.001), and more comorbidities. Additionally, the LC group had more ever smokers (36.4% vs. 25.1%, p < 0.001) but lower PM2.5 exposure concentration (20.2 vs. 21.5 µg/m3, p < 0.001) than those without LC.

Table 1. Baseline characteristics of enrolled population, stratified by the incidence of LC and LC-related mortality.

The characteristics of LC patients are summarized in Table 2. Compared to the LC patients who never smoked, the ever smokers were likely to be older (57.9 vs. 53.3 years old, p < 0.001) and male (90.3% vs. 28.9%, p < 0.001), and had a lower education level (77.2% vs. 62.2%, p < 0.001), higher CEA level (3.1 vs. 1.9 mg/dL, p < 0.001), less fruit and vegetable intake (87.3% vs. 96.4%, p < 0.001) and lower long-term PM2.5 exposure concentration (20.0 vs. 20.3 µg/m3, p = 0.012). Most of the LC patients without smoking habits were diagnosed with adenocarcinoma (82.26%). The lag from health examination to LC diagnosis was not significantly different between the groups with different smoking habits (p = 0.631).

Table 3 shows the association between each factor and both LC incidence and LC-related mortality. Ever smokers had a higher cumulative incidence of LC than never smokers (0.61% vs. 0.36%).
When treating incident LC as the event of interest, age greater than 50, lower education level, family history of LC, lower eGFR, and higher CEA level were found to be significant LC risk factors for both never smokers and ever smokers. For those who never smoked, being female was a significant risk factor for developing LC (aHR: 1.33, 95%CI: 1.07-1.64, p < 0.01), and a much higher proportion of adenocarcinoma was observed in never smokers than ever smokers (82.26% vs. 52.24%, p < 0.001, Table 2). This result is in line with previous studies [27,28]. The effect of PM2.5 and vegetable and fruit intake on LC incidence differed between smokers and never smokers (Table 3 and Figure 2). For never smokers, the risk of developing LC rose significantly with every 10 µg/m3 increment of PM2.5 exposure (aHR: 1.32, 95%CI: 1.12-1.56), while such an effect was not observed among ever smokers (aHR: 0.96, 95%CI: 0.76-1.20). Conversely, consuming more than two servings of vegetables and fruits per day was associated with a lower LC risk in ever smokers (aHR: 0.68, 95%CI: 0.47-0.97), yet this was not seen in never smokers.

Regarding LC-related deaths in the whole population as the event of interest, a total of 401 death events (0.23%) were reported after the median follow-up duration of 12.3 years (Table 1). Those who died of LC were older (59.1 vs. 39.0 years old, p < 0.001), had a lower educational level (75.3% vs. 39.0%, p < 0.001), were more often ever smokers (46.4% vs. 25.1%, p < 0.001) and had more comorbidities at baseline when compared to survivors; men accounted for around 60%, and adenocarcinoma was the commonest type (Table 1). The LC-related mortality in ever smokers was 2.63 times that in never smokers (0.42% vs. 0.16%), and the associations between risk factors and LC-related mortality are summarized in Table 3 and Figure 3. Every 10 µg/m3 increment of PM2.5 exposure concentration was a significant risk factor for LC-related mortality for never smokers (aHR: 1.28, 95% CI: 1.01-1.63, p < 0.05), but not for ever smokers (aHR: 0.96, 95%CI: 0.76-1.20).

Table 4 explores the association between risk factors and all-cause mortality for LC patients. Older age and advanced cancer stage at diagnosis of LC raised mortality risk in both never smokers (aHR: 1.03, 95%CI: 1.01-1.04, p < 0.001; aHR: 6.09, 95%CI: 3.87-9.57, p < 0.001) and ever smokers (aHR: 1.03, 95%CI: 1.01-1.05, p = 0.001; aHR: 7.48, 95%CI: 4.15-13.48, p < 0.001). For never smokers, non-adenocarcinoma type cancer (aHR: 2.55, 95%CI: 1.73-3.75, p < 0.001) increased mortality. For the LC patients, regardless of their smoking status, PM2.5 was no longer a significant risk factor for death. The relationship between PM2.5 and mortality is plotted in Figure 4.
A total of 736 participants developed LC during the studied period (Table S2). Among them, 72 participants did not have stage information of LC from the cancer registry. The numbers of LC cases from stage 1 to stage 4 were 201 (30.27%), 30 (4.52%), 121 (18.22%), and 312 (46.99%), respectively. Higher CEA levels were noted in ever smokers, who also had more comorbidities, regardless of the stage of lung cancer. A higher percentage of never smokers consumed ≥2 servings of fruits and vegetables per day. The median follow-up durations from joining the database to lung cancer diagnosis were 9.5 years for the early stages of LC (stage ≤ 2) and 8.31 years for the later stages of LC (stage ≥ 3).
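The stage percentages are computed over the cases with known stage rather than over all 736 cases. The short check below reproduces them from the counts given in the text; the variable names are illustrative.

```python
# Counts reported in the text.
total_cases = 736
missing_stage = 72
stage_counts = {"I": 201, "II": 30, "III": 121, "IV": 312}

# Denominator: cases with stage information in the cancer registry.
staged_cases = total_cases - missing_stage
assert staged_cases == sum(stage_counts.values())

stage_percent = {
    stage: round(100 * count / staged_cases, 2)
    for stage, count in stage_counts.items()
}
print(stage_percent)
# {'I': 30.27, 'II': 4.52, 'III': 18.22, 'IV': 46.99}
```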

Discussion
Several major findings were obtained from the current study. First, each 10 µg/m3 increment of cumulative PM2.5 exposure was associated with a 32% higher risk of LC incidence (aHR: 1.32) and a 28% higher risk of LC-related mortality (aHR: 1.28) in never smokers, but no such increase was seen in ever smokers. Second, daily consumption of at least two portions of fruits and vegetables was associated with an approximately one-third lower risk of LC development in ever smokers (aHR: 0.68), but not in never smokers.
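For readers converting between hazard ratios and percentage changes, the relation is simply (HR − 1) × 100. The helper below is a hypothetical convenience function applied to the adjusted hazard ratios reported in this study.

```python
def percent_change(hazard_ratio: float) -> float:
    """Percentage change in hazard implied by a hazard ratio.

    Positive values mean higher risk; negative values mean lower risk.
    """
    return round((hazard_ratio - 1.0) * 100.0, 1)


print(percent_change(1.32))  # LC incidence, never smokers, per 10 ug/m3
print(percent_change(1.28))  # LC-related mortality, never smokers
print(percent_change(0.68))  # fruit and vegetable intake, ever smokers
```

An aHR of 1.32 therefore corresponds to a 32% higher hazard, and an aHR of 0.68 to roughly a 32% lower hazard, relative to the reference group.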
The current study shows that each 10 µg/m3 increment of cumulative PM2.5 exposure was associated with a 32% increase in the risk of LC incidence in never smokers, consistent with the results of previous meta-analyses showing a 1-46% increase in the risk of LC incidence per 10 µg/m3 increase in PM2.5 concentration [15,29,30]. Previous study results should be interpreted with caution, given their geographical diversity and heterogeneous definitions of smoking status. Stronger associations between PM2.5 and LC in Asia than in North America and Europe have been reported [12,15]. The PM2.5 exposure concentration, with an average of 20.40 µg/m3, was higher in the current study than the levels reported in most previous Western studies, which had average concentrations from 6.6 to 13.0 µg/m3 [13,29,30]; this might be a potential contributor to higher incidences of LC in Taiwan. Several possible mechanisms linking PM2.5 to LC development have been proposed. Under long-term PM2.5 exposure, epigenetic and microenvironmental alterations, mediated by microRNA dysregulation, DNA methylation, and cell autophagy and apoptosis, may activate oncogene-associated pathways and induce carcinogenesis in the lungs [31].
In contrast, the current study shows that the association between PM2.5 and LC risk was not significant in ever smokers, perhaps because the much larger risk conferred by cigarette smoking masks the effect of PM2.5. Regarding the risk of LC development, cigarette smoking increased the relative risk by a factor of 15 to 50 in current and ever smokers, respectively, whereas the relative risks reported for PM2.5 seldom exceeded 1.2 [32]. One study reported that LC risk increased by 32% (95% CI: 1.02, 1.69), 20% (95% CI: 1.01, 1.41) and 16% (95% CI: 1.02, 1.30) per increase of 10 µg/m3 PM2.5 in former, current, and never smokers, respectively [15].
Previous studies have reported that each increase of 10 µg/m3 in the ambient concentration of PM2.5 is associated with a 15 to 27% increase in mortality in LC patients, particularly for former and current smokers [11,15]. However, our study shows that the impact of PM2.5 increment on LC-related mortality was seen only in never smokers, not in ever smokers. Effects of ambient PM2.5 on LC incidence and mortality among ever smokers may not have been observed because smoking itself produces polycyclic aromatic hydrocarbons (PAH) contained in indoor PM2.5 and PM10, which are directly inhaled into the body and cause health impacts [33].
Furthermore, regardless of smoking status, older age and advanced cancer stage at diagnosis, rather than cumulative PM2.5 exposure concentration, determined poor outcomes after LC diagnosis. The lack of information on LC genetic mutations and detailed treatment might be a potential confounder for prognosis in LC patients.
Daily consumption of at least five portions (≥400 g) of fruit and vegetables is recommended to reduce the risk of cardiovascular disease, cancer, and all-cause mortality [34,35]. The antioxidant activity of biologically active compounds such as flavonoids, carotenoids, and other vitamins might be responsible for such risk reduction. However, the protective role of fruits and vegetables in LC development and outcomes has not been comprehensively evaluated in the literature, and findings on whether they have a beneficial effect are mixed. A daily supplementation of 20-30 mg of beta-carotene has been found to increase the incidence of LC among smokers and asbestos workers [36]. A meta-analysis showed an overall 8-18% risk reduction for LC with daily intake of 70-300 g of fruits and vegetables, while the protective effect was attenuated after stratifying by smoking status, with only a marginally significant association among current smokers and a non-significant inverse trend in former or never smokers [37]. In our study, we also found a protective effect against LC in ever smokers who consumed at least two portions of fruits and vegetables daily.
There are several limitations in this study. First, the current study lacks a standardized follow-up protocol for health condition monitoring, and there are potential confounders from self-reported questionnaire data. Though environmental tobacco smoke (ETS) is an important exposure risk for LC in non-smokers, we cannot quantify the ETS effect among non-smokers due to inadequate information from questionnaires, particularly on the amount and duration of tobacco consumption (pack-years). Second, the exact time point of LC diagnosis is uncertain because the pre-diagnostic information of LC patients could not be obtained. Third, the two-year PM2.5 concentration might not reflect a direct impact on the development of lung cancer. In our case, underestimation of the risk might have occurred because PM2.5 was declining during the studied period [38]. Participants who joined the health check-up program later might therefore have had lower measured exposure than earlier participants. Fourth, these datasets did not capture the effects of LC genetic mutations or the potential effects of indoor air pollution and exposure to other carcinogens on LC outcomes. Fifth, although we have considered geographical differences in our exposure model by addressing the participants' locations, the temporal resolution of exposure is in years. Therefore, it was hard to correlate exposure with community-level characteristics and variations of ambient exposure using this dataset. In southern Taiwan, serious air pollution can be attributed to three major factors: hubs of the petrochemical industry, traffic-related air pollution, and a downwind location that affects the diffusion of air pollutants.

Conclusions
In conclusion, the strategy to lower LC incidence could differ by smoking status. For non-smokers, preventing long-term exposure to PM2.5 may attenuate the risk of LC development. For ever smokers, smoking cessation and daily consumption of at least two portions of fruits and vegetables are suggested.
Supplementary Materials: The following are available online at https://www.mdpi.com/article/10.3390/ijerph182312400/s1, Table S1: The number of lung cancer cases diagnosed by year from 2000 to 2015; Table S2: The descriptive statistics for the different stages of lung cancer cases; File S1: The detailed statistical analysis and variables' definitions.
Informed Consent Statement: Informed consent was obtained from all subjects involved in the study.

Data Availability Statement:
The data that support the findings of this study are available from the MJ Health Research Foundation and Ministry of Health and Welfare, Taiwan, but restrictions apply to the availability of these data, which were under approval for the current study and so are not publicly available. The linked data set used in this study had to be analyzed in person in the Health and Welfare Data Science Center, Ministry of Health and Welfare, Taiwan.