The Indirect Efficacy Comparison of DNA Methylation in Sputum for Early Screening and Auxiliary Detection of Lung Cancer: A Meta-Analysis

Background: DNA methylation in sputum has been an attractive candidate biomarker for the non-invasive screening and detection of lung cancer. Materials and Methods: Databases including PubMed, Ovid, Cochrane library, Web of Science databases, Chinese Biological Medicine (CBM), Chinese National Knowledge Infrastructure (CNKI), Wanfang, Vip Databases and Google Scholar were searched to collect the diagnostic trials on aberrant DNA methylation in the screening and detection of lung cancer published until 1 December 2016. Indirect comparison meta-analysis was used to evaluate the diagnostic value of the included candidate genes. Results: The systematic literature search yielded a total of 33 studies including a total of 4801 subjects (2238 patients with lung cancer and 2563 controls) and covering 32 genes. We found that methylated genes in sputum samples for the early screening and auxiliary detection of lung cancer yielded an overall sensitivity of 0.46 (0.41–0.50) and specificity of 0.83 (0.80–0.86). Combined indirect comparisons identified the superior genes as SOX17 (sensitivity: 0.84, specificity: 0.88), CDO1 (sensitivity: 0.78, specificity: 0.67), ZFP42 (sensitivity: 0.87, specificity: 0.63) and TAC1 (sensitivity: 0.86, specificity: 0.75). Conclusions: The present meta-analysis demonstrates that methylated SOX17, CDO1, ZFP42, TAC1, FAM19A4, FHIT, MGMT, p16, and RASSF1A are potential superior biomarkers for the screening and auxiliary detection of lung cancer.


Introduction
Lung cancer is the leading cause of malignant tumor death, with the morbidity and mortality of lung cancer gradually increasing over the past decades [1]. Only 13% of lung cancer patients survive more than 5 years, and the mortality is close to the morbidity (the ratio of mortality to morbidity is 0.87) [1][2][3]. Despite research on the diagnosis of lung cancer and the use of increasingly advanced technology in its treatment, the prognosis remains poor because most cases are diagnosed at stage III or IV [3]. Therefore, early diagnosis, auxiliary detection, and treatment have become a major focus to reduce the mortality caused by lung cancer.
With the emergence and development of molecular epidemiology, which opens the "black box" of the disease process, molecular biomarkers have much potential to improve the understanding of the occurrence, development, and prognosis of disease, to detect lesions early at the pre-invasive stage, and even to predict the progress of the disease [4,5]. It is now widely acknowledged that genetic alterations are accompanied by equally important epigenetic modifications in the pathogenesis of lung cancer [6,7]. In particular, DNA methylation is one of the earliest epigenetic modifications; it is closely associated with the occurrence and development of lung cancer, and it appears earlier than an obvious malignant phenotype [7]. Abundant evidence indicates that the methylation of a variety of well-known genes can be used as a promising biomarker for the early diagnosis of lung cancer, such as p16 [8], RASSF1A [9], APC [10], MGMT [11], DAPK [12] and RARβ [13]. Many studies have demonstrated that the aberrant methylation of some genes in sputum samples could be a novel "remote medium" for the early detection of lung cancer, which avoids the necessity of invasive procedures [5,14]. The development of next-generation sequencing technology and the maturation of methylation detection technology [15,16] have made methylation detection much more stable and cheaper, and could provide the necessary conditions for clinical utility. Taken together, these findings suggest that methylated genes could serve as efficient diagnostic biomarkers for lung cancer.
However, studies of the DNA methylation of specific genes in sputum samples have not yet provided a convincing and superior gene panel with high sensitivity and specificity for clinical utility in lung cancer. In addition, there has been no comparative evaluation of the diagnostic accuracy of methylation in sputum samples for the screening and detection of lung cancer. Recently, network meta-analysis has been developed to assess the comparative effectiveness of several interventions and to synthesize evidence across a network of studies [17,18]. This method makes it possible to estimate the comparative diagnostic accuracy of multiple biomarkers that have not been directly compared with each other in one study but have been reported by multiple studies under a common comparator. Therefore, we performed a network meta-analysis to indirectly compare the efficacy of the DNA methylation of multiple genes in the diagnosis of lung cancer.

Materials and Methods
We followed the reporting guidelines of the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement when conducting this review. The PRISMA statement includes a four-phase flow diagram to systematically guide the inclusion and exclusion of research papers [19]. In addition, the guidelines provide a 27-item checklist that describes the requirements per review section (e.g., title, abstract, introduction, methods, results, discussions, and funding) to ensure that systematic reviews are properly conducted and reported [19].

Search Strategy
We conducted a comprehensive literature search in PubMed, Ovid, Cochrane library, Web of Science databases, Chinese Biological Medicine (CBM), Chinese National Knowledge Infrastructure (CNKI), Wanfang, Vip Databases, and Google Scholar. The main search terms included: lung cancer or lung carcinoma or non-small cell lung cancer or "NSCLC", "sputum or flema", "diagnostic", "sensitivity and specificity" and "methylation or hypermethylation or hypomethylation or demethylation". All articles published until December 2016 were considered. In addition, the reference lists of all identified studies were manually searched to identify any additional studies.

Inclusion Criteria and Exclusion Criteria
Articles that could not be excluded based on the title and abstract were retrieved for full-text review. Studies were included in this meta-analysis if they met the following criteria: (1) the study evaluated the diagnostic potential of sputum DNA methylation for lung cancer; (2) the study design was case-control; (3) the patients were diagnosed with lung cancer by pathology; (4) the study provided data on the numbers of true positives (TP), false positives (FP), true negatives (TN), and false negatives (FN); (5) methylation was detected by methylation-specific polymerase chain reaction (MSP) or quantitative methylation-specific PCR (qMSP).
Studies were excluded from the meta-analysis for the following reasons: (1) abstracts, letters, reviews, expert opinions, case reports, or nonclinical studies; (2) studies had duplicate or overlapping data; (3) study was based on tissue, blood, or animals.

Data Extraction and Quality Assessment
Data extraction was conducted in duplicate by two investigators (Di Liu and Hongli Peng) based on title, abstract, author, year of publication, country of origin, sample size, assay methods, and diagnostic performance (sensitivity (SEN), specificity (SPE), TP, FP, FN, TN), target gene(s), and the score of the quality assessment of studies of diagnostic accuracy (QUADAS) [20] and the standards for reporting of diagnostic accuracy (STARD) [21]. Any disagreements in data extraction were resolved by consensus.
STARD and QUADAS guidelines were utilized to assess the methodological quality of each study. There are 25 items in the STARD initiative checklist, and a score of 1 was given when an item was fulfilled [21]. Fourteen items were included in the QUADAS tool, whereby a score of 1 was given when a specific item was fulfilled, 0 if the item was unclear, and −1 if the item was not achieved [20]. All of these studies were evaluated independently and discussed by the reviewers until a consensus was reached.
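The scoring rules above can be sketched in a few lines of Python. This is only an illustration of the arithmetic described in the text; the rating labels ("yes", "unclear", "no") and the function names are hypothetical and do not come from the STARD or QUADAS documents.

```python
# Hypothetical rating labels mapped to the QUADAS points described in the text:
# fulfilled = 1, unclear = 0, not achieved = -1.
QUADAS_POINTS = {"yes": 1, "unclear": 0, "no": -1}


def quadas_score(ratings):
    """Sum the points over the 14 QUADAS items of one study."""
    ratings = list(ratings)
    if len(ratings) != 14:
        raise ValueError("QUADAS expects exactly 14 item ratings")
    return sum(QUADAS_POINTS[r] for r in ratings)


def stard_score(items_fulfilled):
    """Count fulfilled items among the 25 STARD checklist items (1 point each)."""
    items = list(items_fulfilled)
    if len(items) != 25:
        raise ValueError("STARD expects exactly 25 checklist items")
    return sum(1 for fulfilled in items if fulfilled)


# Illustrative study: 10 fulfilled, 2 unclear, and 2 unmet QUADAS items.
example_quadas = quadas_score(["yes"] * 10 + ["unclear"] * 2 + ["no"] * 2)
example_stard = stard_score([True] * 20 + [False] * 5)
```

Under this scheme a QUADAS total can range from −14 to 14 and a STARD total from 0 to 25.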

Statistical Analysis
We used the standard methods recommended for direct meta-analysis, which estimate the diagnostic accuracy of DNA methylation against the gold standard [22]. The numbers of TP, TN, FP, and FN were retrieved from each article. The SEN, SPE, positive likelihood ratio (PLR), negative likelihood ratio (NLR), and diagnostic odds ratio (DOR) estimates with 95% confidence intervals (CI) from each study were analyzed using a random-effects model, and the bivariate summary receiver operating characteristic (SROC) curve was generated. The area under the curve (AUC) represents an analytical summary of the test performance and illustrates the trade-off between sensitivity and specificity [23]. The heterogeneity among studies was assessed with the chi-square test based on the Cochran Q statistic. The I² statistic, which measures the extent of inconsistency between studies, was also assessed [24]. Spearman's correlation coefficient between the logit of sensitivity and the logit of 1-specificity for each gene was calculated to detect a threshold effect [25]. Analyses were performed using two statistical software programs (Meta-Disc 1.4 for Windows and Stata version 12.0, Stata Corp, College Station, TX, USA).
For indirect comparisons, the comparative diagnostic accuracy of all biomarkers was estimated through the common comparator (the gold standard). Because no pair of biomarkers had been compared directly, we did not assume consistency between them (consistency is normally evaluated by comparing the direct estimate with the indirect estimate for each comparison). We took the step-wise approach, which is suitable for a simple star-network meta-analysis, to obtain the indirect estimates. Deeks' test and Egger's test were used to assess funnel plot asymmetry and publication bias [26]. The indirect meta-analysis was conducted using indirect treatment comparison (ITC) software and Stata 12.0 (Stata Corp, College Station, TX, USA) [27]. A two-sided p-value of less than 0.05 was considered significant.
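As an illustration of the per-study accuracy measures named above, the following minimal Python sketch derives SEN, SPE, PLR, NLR, and DOR from a single 2×2 table. The class and function names are hypothetical, and the 0.5 continuity correction for tables with a zero cell is a common convention assumed here, not a detail reported in this study. Pooling across studies with a random-effects or bivariate model is not shown.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    tp: float  # true positives
    fp: float  # false positives
    fn: float  # false negatives
    tn: float  # true negatives


def diagnostic_metrics(table, correction=0.5):
    """Return SEN, SPE, PLR, NLR, and DOR for one study's 2x2 table."""
    tp, fp, fn, tn = table.tp, table.fp, table.fn, table.tn
    # Assumed convention: add 0.5 to every cell when any cell is zero,
    # so that the ratios below remain defined.
    if 0 in (tp, fp, fn, tn):
        tp, fp, fn, tn = (x + correction for x in (tp, fp, fn, tn))
    sen = tp / (tp + fn)
    spe = tn / (tn + fp)
    plr = sen / (1 - spe)
    nlr = (1 - sen) / spe
    dor = plr / nlr  # algebraically equal to (tp * tn) / (fp * fn)
    return {"SEN": sen, "SPE": spe, "PLR": plr, "NLR": nlr, "DOR": dor}


# Illustrative counts only; these are not data from any included study.
metrics = diagnostic_metrics(TwoByTwo(tp=40, fp=10, fn=60, tn=90))
```

For these made-up counts the sensitivity is 0.40 and the specificity is 0.90, which gives a DOR of (40 × 90) / (10 × 60) = 6.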
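The text does not spell out the formula behind the indirect comparison. One widely used approach for comparisons through a common comparator is the adjusted indirect comparison (Bucher method), in which the log effect of gene B is subtracted from that of gene A and their variances are added. The sketch below applies that idea to log diagnostic odds ratios; it is an assumption about how such a comparison could be computed, not a description of the ITC software's internals, and all function names and counts are hypothetical.

```python
import math


def log_dor(tp, fp, fn, tn):
    """Return (ln DOR, standard error) for one 2x2 table with nonzero cells."""
    ln_dor = math.log((tp * tn) / (fp * fn))
    se = math.sqrt(1 / tp + 1 / fp + 1 / fn + 1 / tn)
    return ln_dor, se


def indirect_comparison(effect_a, effect_b, z=1.96):
    """Adjusted indirect comparison of A versus B via a common comparator.

    Each argument is a (ln DOR, standard error) pair measured against the
    same reference standard. Returns the relative DOR and its 95% CI.
    """
    ln_a, se_a = effect_a
    ln_b, se_b = effect_b
    diff = ln_a - ln_b
    se = math.sqrt(se_a ** 2 + se_b ** 2)  # variances add for independent estimates
    return math.exp(diff), math.exp(diff - z * se), math.exp(diff + z * se)


# Illustrative counts only; these are not data from any included study.
gene_a = log_dor(tp=80, fp=10, fn=20, tn=90)  # DOR = 36
gene_b = log_dor(tp=40, fp=10, fn=60, tn=90)  # DOR = 6
ratio, ci_low, ci_high = indirect_comparison(gene_a, gene_b)
```

A relative DOR above 1 whose confidence interval excludes 1 would favor gene A. Because the variances add, indirect estimates are less precise than a direct comparison of the same size would be.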

Study Characteristics and Quality of Included Studies
The flowchart of included studies is presented in Figure 1. A total of 424 studies were preliminarily reviewed, of which 33 were available for the indirect meta-analysis [5,14,. The characteristics of each study are shown in Table 1, including name of the study, number of patients and controls, biomarkers, and quality assessment based on STARD and QUADAS.
The SROC (summary receiver operating characteristic) curve is shown in Figure 2, which demonstrates the trade-off between SEN and SPE values across multiple studies. The SROC curve results showed that the AUC (area under the curve) of the 32 different methylated genes was 0.69 (0.64–0.73), indicating that the 32 pooled methylated genes differentiate lung cancer patients from non-lung-cancer patients with mid-level accuracy.
Figure 3 shows the star network of comparisons for methylated genes (1–32) with the gold standard of being diagnosed with lung cancer (33). We established a network to compare the diagnostic accuracy of 32 methylated genes, with the results of indirect comparisons presented in Table S1. The OR and 95% CI of SOX17, TAC1, ZFP42, and CDO1 differed significantly from those of the other 28 methylated genes and were higher than 1, indicating that SOX17, TAC1, ZFP42, and CDO1 have a higher diagnostic accuracy. More information is shown in the supplemental content (Figures S1–S32).
We combined indirect comparisons to evaluate the comparative efficacy of different methylated genes; the superior genes were SOX17 (sensitivity: 0.84, specificity: 0.88), CDO1 (sensitivity: 0.78, specificity: 0.67), ZFP42 (sensitivity: 0.87, specificity: 0.63), and TAC1 (sensitivity: 0.86, specificity: 0.75).

Test of Heterogeneity and Meta-Regression
In the meta-analysis, the Spearman's rank correlation coefficient between the logit of sensitivity and that of 1-specificity of sputum DNA testing was 0.465 (p < 0.001), indicating heterogeneity due to a threshold effect. We also investigated the non-threshold effects; the results indicated the existence of significant heterogeneity in the overall sensitivity (I² = 91.2%, p < 0.001), specificity (I² = 93.5%, p < 0.001), PLR (I² = 85.4%, p < 0.001), NLR (I² = 88.8%, p < 0.001), and DOR (I² = 72.6%, p < 0.001). Therefore, a bivariate binomial mixed model was applied to summarize the pooled estimates in this study. To determine the sources of heterogeneity, we performed meta-regression to test the effect of ethnicity (Asian/others), sample size (n = 0–100/101–200/≥201), and the quality of study (low/medium/high). Multivariable regression showed that ethnicity (coefficient = −0.785, p = 0.001) and sample size (coefficient = −0.324, p = 0.036) had statistically significant effects, while the quality of study (coefficient = −0.074, p = 0.552) did not. We then conducted subgroup analyses based on ethnicity and sample size, as shown in Table 3. In addition, we performed a meta-regression to test the effect of ethnicity, sample size, and the quality of study for each gene. The results showed that ethnicity (coefficient = −1.117, p = 0.048) and sample size (coefficient = −1.177, p = 0.026) had statistically significant effects for p16, but no significant effects for the other candidate genes (p > 0.050).
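For readers unfamiliar with the I² values quoted above, the following Python sketch shows the usual relationship between Cochran's Q and I², namely I² = max(0, (Q − df) / Q) × 100 with df equal to the number of studies minus one. The function names and the example numbers are illustrative assumptions and are not taken from the included studies.

```python
def cochran_q(effects, variances):
    """Cochran's Q for study effects with inverse-variance weights."""
    weights = [1.0 / v for v in variances]
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    return sum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))


def i_squared(q, n_studies):
    """I^2 (percent) from Cochran's Q and the number of studies."""
    df = n_studies - 1
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0


# Illustrative values only: Q = 20 across 5 studies gives I^2 = 80%.
example_i2 = i_squared(20.0, 5)

# Two hypothetical log effects with unit variance: Q = 2 and df = 1, so I^2 = 50%.
q_two = cochran_q([0.0, 2.0], [1.0, 1.0])
i2_two = i_squared(q_two, 2)
```

I² describes the share of total variability attributable to between-study differences rather than chance, so values above roughly 75%, such as those reported here, are conventionally read as high heterogeneity.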
Publication bias was evaluated by Deeks' test and Egger's test. The funnel plots for publication bias showed no asymmetry (Figure 4). The result of Deeks' test showed that p = 0.008, indicating that publication bias could exist in the meta-analysis.

Discussion
Lung cancer has become a global burden, further substantiating the need for early screening and auxiliary detection [1,3]. The key to accomplishing both these goals is a better understanding of the genes or pathways disrupted in causing lung cancer [6,45,59]. The fact that silencing genes through hypermethylation or activating genes through hypomethylation plays an important role in the initiation and progression of lung cancer has stimulated the development of screening approaches to identify additional genes and pathways that are disrupted within the epigenome [59]. In addition, DNA methylation in sputum samples has the potential to serve as a non-invasive screening method for the identification of specific biomarkers, enabling the early detection of lung cancer [31,39].
In the direct meta-analysis, we found that methylated genes in sputum samples for the early screening and auxiliary detection of lung cancer yielded an overall sensitivity of 0.46 and a specificity of 0.83. Furthermore, the PLR (positive likelihood ratio) was 2.72, the NLR (negative likelihood ratio) was 0.64, the DOR (diagnostic odds ratio) was 4.28, and the AUC (area under the curve) was 0.69, indicating mid-level accuracy. Therefore, the superior genes should be selected for clinical utility as diagnostic biomarkers for lung cancer. Combined indirect comparisons identified the superior genes as SOX17 (sensitivity: 0.84, specificity: 0.88), CDO1 (sensitivity: 0.78, specificity: 0.67), ZFP42 (sensitivity: 0.87, specificity: 0.63), and TAC1 (sensitivity: 0.86, specificity: 0.75). A single DNA methylation biomarker cannot be expected to detect all cases of lung cancer. Some studies demonstrated that combining multiple methylated genes could improve the diagnostic value for cancers [37,60]. We found that the sensitivity of methylated FAM19A4 and the PLR values of methylated RASSF1A, FHIT, MGMT, and p16 are relatively high, suggesting that they are comprehensive parameters for the screening test [61]. In addition, methylated RASSF1A and p16 genes are reported to be promising driving molecules in many cancers under the concept of precision medicine [9,62–67]. Moreover, the methylation of FAM19A4, FHIT, and MGMT was reported to play important roles in the occurrence and deterioration of lung cancer [68–71]. Therefore, we advocate that methylated SOX17, CDO1, ZFP42, TAC1, FAM19A4, FHIT, MGMT, p16, and RASSF1A are useful in the screening and auxiliary detection of lung cancer.
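The pooled figures quoted at the start of the preceding paragraph can be cross-checked against the textbook identities PLR = SEN / (1 − SPE), NLR = (1 − SEN) / SPE, and DOR = PLR / NLR. The short Python sketch below does this with the reported values. Since each pooled measure was presumably estimated by its own random-effects model, only approximate agreement is expected; the variable names are illustrative.

```python
# Pooled values reported in the text.
sen, spe = 0.46, 0.83
reported_plr, reported_nlr, reported_dor = 2.72, 0.64, 4.28

# Values implied by the pooled sensitivity and specificity.
implied_plr = sen / (1 - spe)
implied_nlr = (1 - sen) / spe

# DOR implied by the reported likelihood ratios.
implied_dor = reported_plr / reported_nlr

summary = {
    "PLR": (round(implied_plr, 2), reported_plr),
    "NLR": (round(implied_nlr, 2), reported_nlr),
    "DOR": (round(implied_dor, 2), reported_dor),
}
```

The implied values (about 2.71, 0.65, and 4.25) sit close to the reported 2.72, 0.64, and 4.28, so the summary measures are mutually consistent within rounding and pooling differences.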
To our knowledge, this study is the first systematic review and indirect meta-analysis to assess the comparative diagnostic effectiveness of the methylation profile of multiple candidate genes in sputum samples for the early screening and detection of lung cancer. Following the method of network meta-analysis [18,27], we used indirect comparison to estimate the comparative diagnostic accuracy of two methylated genes based on a common comparator (the gold standard of being diagnosed with lung cancer). Consequently, the inconsistency between direct and indirect comparisons could not be assessed.
However, we observed a large degree of heterogeneity among studies investigating the methylation profile in sputum samples used for lung cancer. Threshold effect is one of the primary causes of heterogeneity among diagnostic accuracy studies [24]. In the present meta-analysis, we found obvious heterogeneity as a result of threshold effect, which may be caused by different genes. There is a clear and unified cut-off value for methylation/unmethylation regardless of whether it is based on qualitative analysis or quantitative analysis for each gene tested by the method of MSP [16]. Moreover, we performed meta-regression to test the heterogeneity caused by ethnicity, sample size, and the quality of study. The results suggested that study region (p = 0.001) and sample size (p = 0.036) might be sources of heterogeneity for this meta-analysis. The results of subgroup analyses showed that large-sample studies had higher sensitivity than the small- and moderate-sample studies, while studies in Asia had lower sensitivity than those in other regions.
In addition, publication bias could exist in the meta-analysis. This meta-analysis was based only on published studies, which introduces the possibility of publication bias. Deeks' test and Egger's test not only detect publication bias, but also indicate heterogeneity due to the effect of ethnicity, sample size, the quality of study, etc. [72,73]. Therefore, we propose that the heterogeneity was potentially due to different genes, study region, and sample size. However, this speculation needs to be investigated in future studies.
The present network meta-analysis included 33 articles and 32 candidate genes, with the majority of genes included in only one article. This makes it difficult to directly compare the diagnostic efficacy among multiple genes; thus, only indirect comparisons were evaluated in this study. However, the absence of direct comparisons may lead to bias. In one study, pairwise meta-analysis and network meta-analysis were carried out sequentially for direct and indirect comparisons of migraine headache days among three interventions compared with those treated by three placebos, and the results showed that there was no significant inconsistency between the direct and indirect evidence for the majority of comparisons [74]. Another network meta-analysis was performed to directly and indirectly compare the effectiveness of several oral antidiabetic drugs in the prevention of cardiovascular mortality and morbidity, and the results indicated that the inconsistency between direct and indirect estimates of all-cause mortality, cardiovascular-related mortality, acute coronary syndrome, and myocardial infarction was significantly low [75]. In summary, the present results from indirect comparisons should be reliable and acceptable.
Based on the focus of diagnostic accuracy studies, we identified other common limitations. Firstly, all the publications included in this analysis reported case-control studies, so selection bias could lead to over-estimation of diagnostic accuracy compared with cross-sectional and cohort studies [76,77]. In addition, the effects of language selection bias and publication bias cannot be ignored in any meta-analysis [19]. Finally, the detection performance of sputum DNA testing was not good enough. We think two approaches might have the potential to identify valid and good biomarkers for the advancement of the field. Firstly, a panel with multiple methylated genes may achieve high performance in diagnostic models. Secondly, instead of qualitative methods (MSP), quantitative methods for the determination of the methylation patterns in candidate genes may increase the diagnostic performance.

Conclusions
In conclusion, despite these limitations, our meta-analysis advocates that methylated SOX17, CDO1, ZFP42, TAC1, FAM19A4, FHIT, MGMT, p16, and RASSF1A are useful biomarkers in the screening and auxiliary detection of lung cancer. Our findings provide new avenues for assessing the comparative diagnostic effectiveness of several methylations in lung cancer based on the method of network meta-analysis. Further high-quality and large-scale studies are needed to confirm our analysis.