Survival Analysis in Patients with Lung Cancer and Subsequent Primary Cancer: A Nationwide Cancer Registry Study

With improved survival in patients with cancer, the risk of developing multiple primary malignancies (MPMs) has increased. We aimed to characterize MPMs involving lung cancer and compare these characteristics between patients with single lung cancer and those with lung cancer and subsequent primary cancer (known as lung cancer first [LCF]). Methods: This retrospective study was based on the Taiwan Cancer Database from Taiwan’s National Health Insurance Registry Database. Patients diagnosed with lung cancer (n = 72,219) from 1 January 2011 to 31 December 2015 were included in this study, and their medical records were traced back to 1 January 2002 and followed until 31 December 2019. Results: MPMs occurred in 10,577 (14.65%) patients with lung cancer, and LCF and other cancer first (OCF) accounted for 35.25% and 64.75% of these patients, with a mean age at lung cancer diagnosis of 65.18 and 68.92 years, respectively. The median interval between primary malignancies in the OCF group was significantly longer than that in the LCF group (3.26 vs. 0.11 years, p < 0.001). Patients in the single lung cancer group were significantly older than those in the LCF group (67.12 vs. 65.18 years, p < 0.001). The mean survival time of patients with LCF was longer than that of patients with single lung cancer. Following initial lung cancer, the three most common second primary malignancies were lung, colon, and breast cancers. For patients with advanced lung cancer, survival in patients with mutant epidermal growth factor receptor (EGFR) was longer than that in patients with undetected EGFR. In stage 3 and 4 patients with EGFR mutations, the LCF group showed better survival than the single lung cancer group. Conversely, in stage 1 patients with mutant EGFR, the LCF group exhibited worse survival than the single lung cancer group. Conclusions: Survival in patients with MPMs depends on baseline characteristics and treatments. 
Our findings may contribute to the development of precision medicine for improving personalized treatment and survival as well as the reduction of medical costs.


Introduction
In Taiwan, lung cancer is the most common cause of cancer deaths, resulting in over 9000 deaths per year [1]. Lung cancer is also the top cause of cancer deaths in the United States, causing approximately 28% of cancer deaths [2,3]. According to data from the Surveillance, Epidemiology and End Results database, the incidence of lung cancer increased and peaked in 1992. The survival rate of lung cancer has increased in the past four decades [2,4]. The average 10-year survival rate in patients with early-stage tumors is 88%, which may increase to 92% if they undergo surgical resection immediately after diagnosis [5]. However, improved survival in patients with cancer increases the risk of developing subsequent primary malignancies. In the United States, one in five newly diagnosed patients with cancer has a history of cancer. Subsequent primary cancers significantly impact morbidity and mortality in cancer survivors [6]. Previous reports have revealed that the incidence of multiple primary malignancies (MPMs), defined as at least two independent primary malignancies in the same or different organs of the same individual, has increased [7,8]. The incidence of MPMs has been reported as 0.73-5.2% in all patients with tumors. This large discrepancy may be attributed to the different experiences of physicians and the different diagnostic tools used by various hospitals [7]. The clinical manifestations and prognostic factors of MPMs involving lung cancer have been addressed in several studies in the past decades [7,9-11], but some conclusions are controversial. Numerous studies have demonstrated that patients with MPMs do not have worse prognoses than those with single cancers [9,10]; however, opposite results were obtained by other studies [12]. Independent prognostic factors include the stage of lung cancer, age, smoking, and interval between primary malignancies [11,13,14]. 
Furthermore, sex and the order of MPM occurrences affect the prognosis of patients with MPMs [10,11]. Sex also plays a role in the development of comorbid cancers, including esophageal squamous cell carcinoma and laryngeal cancers [15]. Most published studies are based on single centers or regional cancer registries [16], and few national database studies have been performed. Additionally, epidemiologic data vary between countries and regions, and there is limited information on subgroup analyses of different cancer stages and epidermal growth factor receptor (EGFR) expression. Therefore, in the present study, a thorough analysis using data from multiple hospitals and institutions was performed. This study aimed to investigate the characteristics and prognosis of MPMs involving lung cancer and compare these characteristics between patients with single lung cancer and those with lung cancer and subsequent primary cancer (known as lung cancer first [LCF]). Finally, subgroup analyses were performed to investigate the effects of different cancer stages and EGFR expression on prognosis in patients with single lung cancer and LCF.

Data Sources and Research Samples
Taiwan's National Health Insurance Registry Database (NHIRD) is a comprehensive real-world database established in 1995. NHIRD provides information on over 99% of individuals in Taiwan, including hospitalization and outpatient attendance [17]. Taiwan Cancer Registry (TCR) files are included in NHIRD and contain detailed information on cancer diagnoses and treatments.
The completeness rate of TCR is 98.4%. Since 2002, TCR has recorded the International Classification of Diseases for Oncology (third edition, ICD-O-3), cancer staging, diagnosis and recurrence dates, histological cancer types, and detailed treatment information in long-form files. Furthermore, laboratory and clinical data have been included in the long-form files since 2011 [18,19]. Informed consent for this study was fully waived because the personal demographic information was anonymized in NHIRD. The Ethics Institutional Review Board of Fu Jen Catholic University in Taiwan reviewed and approved the study protocol (IRB number: C108121).

Study Population and Exclusion Criteria
This retrospective study included data on patients with lung cancer (n = 72,219) retrieved from the TCR files from 1 January 2011 to 31 December 2015. The medical records of patients diagnosed with lung cancers were traced back to 1 January 2002, and followed until 31 December 2019, to identify existing or subsequent second primary cancers. The longest follow-up time was 14.7 years. The second primary cancer was identified based on the coding of another primary cancer in the registry in a different organ or with pathology distinguished from that of the primary lung cancer, according to the criteria proposed by Warren and Gates [20]. TCR is recorded with strict clinical judgment of pathological results. Thus, metastatic cancer was recorded as a recurrence of an initial primary cancer and not as a second primary cancer.
The date of lung cancer diagnosis (ICD-9-CM: 162/ICD-O-3: C34) was set as the index date, and the follow-up time was from 1 January 2002 to 31 December 2019. The endpoint was the date of death or end of the follow-up time. Figure 1 presents the research flow chart.
Patients with lung cancer were categorized into patients with single lung cancer (n = 61,642; 85.35%) and those with two or more malignancies involving lung cancer (n = 10,577; 14.65%). Based on the classification proposed by Warren and Gates in 1932, synchronous MPMs (SMPMs) were defined as tumors occurring within ≤6 months of each other, and metachronous MPMs (MMPMs) were defined as primary tumors that developed with a period of >6 months between their occurrences [20]. Further, patients with MPMs involving lung cancer were divided into those with LCF and other cancer first (OCF) based on the order of the primary cancer occurrences. Patients in whom the first malignancy was lung cancer were categorized as those with LCF (n = 3728; 35.25%), whereas patients in whom the first malignancy was a cancer other than lung cancer (lung cancer was the second primary cancer) were classified as those with OCF (n = 6849; 64.75%) [9].
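The grouping rules described above can be illustrated with a minimal Python sketch. It is not the registry's or the authors' code: the `Primary` record, the `classify` function, and the approximation of 6 months as 183 days are assumptions made only for illustration.

```python
from dataclasses import dataclass
from datetime import date

# Assumption: "6 months" is approximated as 183 days for this illustration.
SYNCHRONOUS_WINDOW_DAYS = 183


@dataclass
class Primary:
    """One primary malignancy (hypothetical record layout)."""
    site: str        # e.g. "lung", "colon"
    diagnosed: date  # date of diagnosis


def classify(first: Primary, second: Primary) -> tuple[str, str]:
    """Return (order, timing) for a patient with two primaries, one of them lung."""
    # Sort by diagnosis date so the argument order does not matter.
    earlier, later = sorted((first, second), key=lambda p: p.diagnosed)
    # LCF: lung cancer diagnosed first; OCF: another cancer diagnosed first.
    order = "LCF" if earlier.site == "lung" else "OCF"
    gap_days = (later.diagnosed - earlier.diagnosed).days
    # SMPM: interval <= 6 months; MMPM: interval > 6 months.
    timing = "SMPM" if gap_days <= SYNCHRONOUS_WINDOW_DAYS else "MMPM"
    return order, timing


# Toy example with made-up dates, not study data.
print(classify(Primary("lung", date(2012, 1, 10)), Primary("colon", date(2012, 3, 1))))
# ('LCF', 'SMPM')
```

Sorting by diagnosis date before labelling keeps the order (LCF vs. OCF) and the timing (SMPM vs. MMPM) as two independent attributes, which matches how the two classifications are reported separately in this study.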

Statistical Analyses
Chi-square test was used to compare categorical variables, expressed as N (%), and t-test was used to compare continuous variables, expressed as means ± standard deviations. Multivariate Cox proportional hazard regression analysis, with adjusted hazard ratios and 95% confidence intervals for variables that potentially confounded survival, was applied to estimate survival in the single lung cancer and LCF groups. The Kaplan-Meier method was used to analyze the survival rates of patients with lung cancer. Statistical analyses were performed using SAS 9.4 (SAS Institute, Cary, NC, USA) and R software version 3.4.1 (The R Project for Statistical Computing, Vienna, Austria).

Results
Table 1 summarizes the baseline demographic characteristics of 72,219 patients with lung cancer, including 61,642 patients with single cancer and 10,577 patients with two or more primary malignancies. The cumulative incidence of MPMs was 14.65% with a follow-up time of 14.7 years. The LCF group comprised 3728 patients, whereas the OCF group consisted of 6849 patients. Both groups showed a high proportion of male patients over 65 years. The main histological type was adenocarcinoma (AC), which accounted for 65.88% and 60.58% of patients in the LCF and OCF groups, respectively. The mean age at diagnosis of the first primary cancer was 65.18 and 65.28 years, whereas that at the diagnosis of the second primary cancer was 66.03 and 68.92 years in the LCF and OCF groups, respectively. The median interval between the two primary malignancies was 0.11 and 3.26 years in the LCF and OCF groups, respectively. In terms of timing, most patients in the LCF group had SMPMs (65.33%). In contrast, most patients in the OCF group had MMPMs (94.48%). The EGFR mutation rate in the LCF group was significantly higher than that in the OCF group (17.11% vs. 11.02%). A higher percentage of patients underwent surgery in the LCF group than in the OCF group (47.05% vs. 36.75%) (Table 1).
The characteristics of patients with single lung cancer and LCF are shown in Table 2. A higher proportion of patients were over 65 years in the single lung cancer group than in the LCF group (p < 0.001). The mean age at diagnosis of lung cancer in the single lung cancer and LCF groups was 67.12 and 65.18 years, respectively (p < 0.001). The percentages of patients who smoked or consumed alcohol were lower in the LCF group. A higher proportion of patients were diagnosed with stage 4 cancer in the single lung cancer group than in the LCF group (56.06% vs. 29.61%, p < 0.001). The distribution of histological types in the single lung cancer group was significantly different from that in the LCF group (p < 0.001).
Table 3 and Table S1 show the mean and median survival times; the mean survival times of patients with single lung cancer and LCF are presented in Table 3. The mean survival time of patients with LCF in stages 2, 3, and 4 was longer than that of patients with single lung cancer. In stage 3, the mean survival time of patients with single lung cancer was 2.22 years, whereas that of patients with LCF was 2.96 years. In stage 4, the mean survival time of patients with single lung cancer was 1.35 years, whereas that of patients with LCF was 1.80 years. Among patients with a history of smoking or alcohol drinking, and across all histological types, the mean survival time was longer in the LCF group than in the single lung cancer group. The mean survival time of patients with EGFR mutations was 2.58 and 3.51 years in the single lung cancer and LCF groups, respectively. The mean survival time of patients without EGFR mutations was 1.77 years in the single lung cancer group and 2.78 years in the LCF group.
Table 4 shows the ten most common second primary cancers in patients with LCF. The three most common cancers after the initial lung cancer were lung (ICD: 162), colon (ICD: 153), and breast (ICD: 174) cancers.
Figure 2 and Table S2 present the multivariate analyses of overall survival in the single lung cancer and LCF groups. The LCF group showed significantly better survival than the single lung cancer group, with a hazard ratio of 0.84. Table S2 shows the univariate and multivariate analyses of overall survival in both groups in detail, including crude and adjusted models. The univariate analysis revealed that age, sex, smoking, drinking, EGFR mutation, cancer stage, histological type, and operation were significantly associated with all-cause mortality.
In Table 5, a matrix plot with pairwise comparisons obtained from the Kaplan-Meier analysis demonstrates survival in the single lung cancer and LCF groups in different stages with or without EGFR mutations. Survival differed significantly between the single lung cancer and LCF groups in stages 1 and 4 with EGFR mutations (p < 0.001).
Kaplan-Meier survival curves are shown in Figure 3. In stage 1 patients with EGFR mutations, the single lung cancer group exhibited better survival than the LCF group (Figure 3A). Conversely, in stage 4 patients with EGFR mutations (Figure 3B) or undetected EGFR (Figure 3D), the single lung cancer group showed worse survival than the LCF group. Among patients with undetected EGFR, survival differed significantly between the two groups in stage 3 (p = 0.0429; Figure 3C) and stage 4 (p < 0.001; Figure 3D).
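For readers unfamiliar with the Kaplan-Meier (product-limit) estimator behind such survival curves, the following is a minimal Python sketch. It is not the SAS or R code used in this study; the function name `kaplan_meier` and the toy follow-up data are hypothetical and serve only to show how censored observations enter the estimate.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Product-limit estimate of survival.

    times:  follow-up durations (any consistent unit, e.g. years)
    events: 1 if the patient died at that time, 0 if censored
    Returns a list of (t, S(t)) at each distinct time with at least one death.
    """
    deaths = Counter(t for t, e in zip(times, events) if e)
    leaving = Counter(times)  # every patient leaves the risk set at their own time
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = deaths.get(t, 0)
        if d:
            # Multiply by the conditional probability of surviving past t.
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        # Deaths and censored patients at t are both removed from the risk set.
        at_risk -= leaving[t]
    return curve


# Toy data (not from the registry): five patients, two of them censored.
curve = kaplan_meier([1, 2, 2, 3, 4], [1, 1, 0, 1, 0])
print([(t, round(s, 3)) for t, s in curve])
# [(1, 0.8), (2, 0.6), (3, 0.3)]
```

Censored patients contribute to the number at risk until their last follow-up but never trigger a drop in the curve, which is why patients still alive on 31 December 2019 can be included without biasing the estimate downward.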

Heterogeneity in the Incidence of MPMs Involving Lung Cancer
Wide variations in the incidence and characteristics of MPMs involving lung cancer are mainly due to differences in study populations and follow-up time. The characteristics and prognosis of MPMs have been previously reported. MPMs are defined by the diagnostic criteria proposed by Warren and Gates [20], which include the following: "(1) each malignancy must be histologically confirmed, (2) each malignancy occurs in a different region or organ, (3) the new emergent cancer must be confirmed to be non-metastatic, and (4) each cancer has its own pathological features" [21]. MPMs are categorized into two groups based on the timing of the two occurrences. In the LCF group, lung cancer is the first primary malignancy, whereas in the OCF group, lung cancer is the second primary malignancy [9]. However, the definitions of SMPM and MMPM vary in the literature.
Although several studies used a 6-month interval between malignancies as the criterion for MMPM, others used a 60-day or 2-year interval to distinguish SMPMs from MMPMs [14,22]. The overall incidence of MPMs varies by country and region, which may be attributed to different study populations, diagnostic techniques, and healthcare system facilities. The risk of developing a second primary malignant neoplasm is higher in cancer survivors than in the general population, showing a 3.8% higher probability of developing a metachronous second primary malignant neoplasm within a median follow-up time of 2.5 years [21]. Moreover, the 10-year cumulative risk of developing a second primary cancer is as high as 13% among patients diagnosed at 60-69 years of age [21]. The incidence of MPMs is approximately 5% for all tumors [23] and 0.86-6.4% for MPMs involving lung cancer [7,24,25].
Although the duration of follow-up varied in previous studies, the cumulative incidence of MPMs involving lung cancer increased over time. In our study, the cumulative incidence was 14.65% over a follow-up time of 14.7 years. The cumulative incidence in our study was much higher than that reported in previous studies in Asia. A retrospective analysis revealed that 2.5% (364/14,528) of patients with lung cancer developed MPMs over a median follow-up time of 5.37 years [9]. The incidence was potentially underestimated because the follow-up time was relatively short. Another single-center study in Taiwan reported that the incidence of MPMs involving lung cancer during the follow-up was only 0.86%, i.e., 193 of 22,405 patients with cancer had MPMs involving lung cancer between 1993 and 1997 [24]. In the LCF subgroup, the incidence of second primary malignancy also varied widely. In a previous study, the incidence of second primary malignancy following initial primary non-small cell lung cancer (NSCLC) was 6.4%, and lung cancer was the most common second cancer (45.1%) [25]. Similarly, in our study, 3728 (5.16%) of 72,219 patients with lung cancer developed second primary cancers, and the most common second cancer was lung cancer. MPM risk factors have been recently highlighted. A previous study revealed that smoking was a significant risk factor for developing MPMs involving lung cancer [24]. However, the genetic, iatrogenic, or environmental risk factors for MPMs remain unclear [8]. Patients who underwent radiotherapy and chemotherapy had more MPMs than those who did not receive these treatments [26]. Moreover, a study showed that increasing age and being divorced/widowed/separated were independent risk factors for second primary lung cancer (SPLC) in most primary cancer types, and over half of the patients died of SPLC [27]. Men are more likely than women to have second malignant neoplasms [12].
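The proportions quoted for this cohort follow directly from the patient counts reported in the study. The short Python check below recomputes them; the helper `pct` and the variable names are hypothetical and chosen only for this illustration.

```python
def pct(part, whole):
    """Percentage of `whole` represented by `part`, rounded to two decimals."""
    return round(100 * part / whole, 2)


# Counts reported in this study.
lung_cancer_total = 72_219   # all patients with lung cancer
mpm_total = 10_577           # patients with MPMs involving lung cancer
lcf = 3_728                  # lung cancer first
ocf = 6_849                  # other cancer first

print(pct(mpm_total, lung_cancer_total))  # 14.65 (cumulative incidence of MPMs)
print(pct(lcf, lung_cancer_total))        # 5.16  (LCF among all lung cancer patients)
print(pct(lcf, mpm_total))                # 35.25 (LCF share of MPMs)
print(pct(ocf, mpm_total))                # 64.75 (OCF share of MPMs)
```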
We investigated the clinical characteristics of patients, including a history of smoking, prognosis, and common accompanying malignancies in MPMs involving lung cancer. In our study, 3728 (35.2%) of the 10,577 patients with MPMs involving lung cancer had LCF, and 6849 (64.8%) patients had OCF. These results are consistent with those reported in previous studies [7,16,21,28]. A study in Taiwan showed 26.4% (51/193) had LCF and 73.6% (142/193) had OCF [24]. The mean age at diagnosis of the first and second primary malignancies was significantly different between the LCF and OCF groups [24]. These results were consistent with our results showing that the median interval between the two primary malignancies in the OCF group was 3.26 years, which was significantly longer than that in the LCF group. Moreover, the mean age at diagnosis of lung cancer in the LCF group was lower than that in the single lung cancer group. The most common second primary malignancies accompanying LCF were lung, colon, breast, and prostate cancers in our study. A previous study revealed that upper digestive tract, colorectal, and cervical cancers were the most common cancers accompanying lung cancer [24]. These results varied across sexes and countries. For instance, the most common concomitant malignancies among males with lung cancer were gastric, prostate, and colon cancers, whereas those among females with lung cancer were breast, thyroid, and colon cancers [29]. The incidence of gastric cancer is higher in Japan, and prostate cancer is less frequent in Asian males than in African and Caucasian males [9]. The most common second primary malignancy following NSCLC was lung cancer (45.1%) [25]. Our data consistently showed that the most common second primary malignancy in the LCF group was lung cancer in both males and females.

Survival and Prognosis of MPMs Involving Lung Cancer
In contrast to the conventional belief that patients with multiple cancers exhibit inferior survival rates, a previous study revealed that the survival of patients with second lung cancer coupled with other cancers showed no significant differences compared with that of patients with single lung cancer [10]. Additionally, patients with metachronous MPMs tended to have a better prognosis than those with synchronous MPMs, and the clinical stage was a significant risk factor [9]. Patients with initial lung cancer SMPMs exhibited worse prognosis, and patients with second primary non-lung cancers had better prognosis than those with SPLC [9,30]. We compared survival in the single lung cancer and LCF groups and analyzed the subgroups in different stages and with different EGFR expression. The overall mean survival time of patients with LCF was significantly longer than that of patients with single lung cancer. This difference may be due to survivorship bias or the age at lung cancer diagnosis. Patients with LCF were younger than those with single lung cancer. Univariate analyses revealed that old age, male sex, history of smoking and alcohol drinking, synchronous cancer, small cell lung cancer, late-stage cancer, and undetected EGFR were associated with poor prognosis. Multivariate analyses revealed that old age, male sex, small cell lung cancer, late-stage cancer, and negative EGFR were independent risk factors for poor prognosis. Not surprisingly, these poor prognostic factors for lung cancer were also the prognostic factors for MPMs involving lung cancer.
Asian patients with lung adenocarcinoma are more likely to have EGFR mutations. In China, the overall frequency of EGFR mutations is 50.2%, and the frequency of EGFR mutations in regular smokers is 35.3% [31]. EGFR testing is recommended for all patients with lung cancer in advanced stages (stages 3b, 3c, and 4) of adenocarcinoma, especially in females and nonsmokers [31]. A recent study revealed the impact of EGFR mutations on survival; the clinical stage of lung cancer, order of occurrence of lung cancer, and existence of EGFR mutations were important factors for patient survival [32]. Unlike these studies, we conducted subgroup survival analyses of different stages and EGFR expression statuses. In all stages, the LCF group with or without EGFR mutations had longer mean survival time than the single lung cancer group. Survival differed significantly in different stages and with different EGFR mutation statuses between the LCF and single lung cancer groups. Compared with patients with undetected EGFR, survival in stage 3 and 4 patients with mutant EGFR was superior in both groups. In stage 3 and 4 patients with EGFR mutation and undetected EGFR, survival in the LCF group was better than that in the single lung cancer group. Conversely, in stage 1 patients with mutant EGFR, the LCF group had worse survival than the single lung cancer group.

Conclusions and Limitations
The characteristics and outcomes of MPMs are important issues because the survival of patients with MPMs depends on baseline characteristics and treatments. The cumulative incidence of MPMs was 14.65% over a follow-up time of 14.7 years. The mean age at diagnosis of lung cancer in patients with LCF was lower than that in patients with single lung cancer. The overall mean survival in patients with LCF was better than that in patients with single lung cancer. Survival in patients with advanced lung cancer with mutant EGFR was superior to that of patients with undetected EGFR. However, this result was not observed in patients with early lung cancers.
The limitations of this study are common to studies utilizing large databases. Some coding may have been incomplete, especially regarding personal behavior, such as smoking and alcohol drinking. The results concerning survival in advanced stages of lung cancer with mutant EGFR compared with that in stage 1 with mutant EGFR were notable. Data may have been missing because EGFR screening is not recommended for early-stage lung cancer. In our study of real-world practice, complete molecular information in all stages of lung cancer was not always provided, which is another limitation of this study. The study did not analyze ethnicity, and risk factors for MPMs were not discussed. Thus, further studies are required to clarify any discrepancy. In this study, we investigated patients with lung cancer in Taiwan. As in other Asian countries, EGFR mutations account for half of lung adenocarcinomas. This ethnic difference highlights the need for personalized treatment and follow-up in each country. Finally, we expect that the results of this study will lead to better personalized treatments and survival prediction along with reductions in the medical costs for treating MPMs.
Supplementary Materials: The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/jcm11195944/s1, Table S1: Mean survival time of single and second primary malignancy; Table S2: Univariate and multivariate regression analysis of overall survival among single lung cancer and lung cancer first patients.
Informed Consent Statement: Patient consent was waived because NHIRD provided de-identified baseline and administrative information. All NHIRD data have been reviewed and processed in the research project before analysis.