Prediction of Chronic Atrophic Gastritis and Gastric Neoplasms by Serum Pepsinogen Assay: A Systematic Review and Meta-Analysis of Diagnostic Test Accuracy

Serum pepsinogen assay (sPGA), which reveals serum pepsinogen (PG) I concentration and the PG I/PG II ratio, is a non-invasive test for predicting chronic atrophic gastritis (CAG) and gastric neoplasms. Although various cut-off values have been suggested, PG I ≤70 ng/mL and a PG I/PG II ratio of ≤3 have been proposed. However, previous meta-analyses reported insufficient systematic reviews and only pooled outcomes, which cannot determine the diagnostic validity of sPGA with a cut-off value of PG I ≤70 ng/mL and/or PG I/PG II ratio ≤3. We searched the core databases (MEDLINE, Cochrane Library, and Embase) from their inception to April 2018. Fourteen and 43 studies were identified and analyzed for the diagnostic performance in CAG and gastric neoplasms, respectively. Values for sensitivity, specificity, diagnostic odds ratio, and area under the curve with a cut-off value of PG I ≤70 ng/mL and PG I/PG II ratio ≤3 to diagnose CAG were 0.59, 0.89, 12, and 0.81, respectively and for diagnosis of gastric cancer (GC) these values were 0.59, 0.73, 4, and 0.7, respectively. Methodological quality and ethnicity of enrolled studies were found to be the reason for the heterogeneity in CAG diagnosis. Considering the high specificity, non-invasiveness, and easily interpretable characteristics, sPGA has potential for screening of CAG or GC.


Introduction
Gastric cancer (GC) is a global health-related burden and the fourth most common cause of cancer-related deaths worldwide [1]. The sequential cascade of histopathology for development of intestinal-type gastric adenocarcinoma is from normal gastric epithelium to chronic gastritis, chronic atrophic gastritis (CAG), and intestinal metaplasia (IM), followed by dysplasia, and finally GC [2]. Patients with premalignant lesions, such as CAG or dysplasia, have a considerable risk for developing GC, and early detection of these lesions is important for the screening of GC [3,4].
For the population-based screening of GC, the endoscopic mass screening program has shown its efficacy in GC-prevalent countries such as Korea and Japan [5]. The endoscopic screening program reduced GC-related mortality by 47% in a nested case-control study in Korea [6]. However, it is not cost-effective in regions with low incidence of GC, and stepwise or individualized screening according to the risk factors of GC has been recommended [4,5].

Selection Criteria
We included studies that met the following criteria: (1) patients: those who have CAG or gastric neoplasms (dysplasia or cancer); (2) intervention: sPGA with a cut-off value of PG I ≤70 ng/mL and/or PG I/PG II ratio ≤3; (3) comparison: none; (4) outcome: diagnostic performance indices of sPGA for CAG or gastric neoplasms, including sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), positive likelihood ratio (PLR), negative likelihood ratio (NLR), accuracy, or diagnostic odds ratio (DOR), which enable an estimation of true-positive (TP), false-positive (FP), true-negative (TN), and false-negative (FN) values; (5) study design: all types; (6) studies of human subjects; and (7) full-text publications written in English. Studies that met all of the inclusion criteria were sought and selected. The exclusion criteria were as follows: (1) narrative review; (2) letter, comment, editorial, or reply to questions; (3) study protocol; (4) publication with incomplete data; and (5) systematic review/meta-analysis or consensus report. Studies meeting at least one of the exclusion criteria were excluded from this analysis.

Methodological Quality
The methodological quality of the included publications was assessed using the Quality Assessment of Diagnostic Accuracy Studies-2 (QUADAS-2) tool, which contains four domains, including "patient selection", "index test", "reference standard", and "flow and timing" (flow of patients through the study and timing of the index tests and reference standard) [16]. Each domain is assessed in terms of high-, low-, or unclear risk of bias, and the first three domains are also assessed in terms of high-, low-, or unclear concerns about applicability [16]. Two of the evaluators (C.S.B. and J.J.L.) independently assessed the methodological quality of all the included studies, and any disagreements between the evaluators were resolved by discussion or consultation with a third evaluator (G.H.B.) [4].

Data Extraction and Primary and Modifier-Based Analyses
Two evaluators (C.S.B. and J.J.L.) independently used the same data fill-in form to collect the summary of primary outcomes (TP, FP, FN, and TN) and modifiers in each study. Disagreements between the two evaluators were resolved by discussion or consultation with a third author (G.H.B.).
Diagnostic test accuracy (DTA) was the primary outcome of this study. We calculated the values for TP (subjects with positive sPGA who have CAG or gastric neoplasms), FP (subjects with positive sPGA who do not have CAG or gastric neoplasms), FN (subjects with negative sPGA who have CAG or gastric neoplasms), and TN (subjects with negative sPGA who do not have CAG or gastric neoplasms) of sPGA for the diagnosis of CAG or gastric neoplasm. To calculate the values, we used 2 × 2 tables whenever possible, from the original articles that contain various diagnostic performance indices (sensitivity, specificity, PPV, NPV, PLR, NLR, accuracy, or DOR, etc.). If only part of the data was presented, we calculated the values for TP, FP, FN, and TN using the following formulas: sensitivity = TP/(TP + FN); specificity = TN/(FP + TN); PPV = TP/(TP + FP); NPV = TN/(FN + TN); PLR = sensitivity/(1 − specificity); NLR = (1 − sensitivity)/specificity; accuracy = (TP + TN)/(TP + FP + FN + TN); DOR = (TP × TN)/(FP × FN); standard error of ln(DOR) = (ln(upper confidence interval (CI)) − ln(lower CI))/3.92 = √(1/TP + 1/FP + 1/FN + 1/TN). The following data were also extracted from each study, whenever possible: study design, distribution of age, gender or ethnicity of enrolled population, sample size, published year, measurement method of sPGA, and the proportion of smokers and H. pylori-infected individuals.
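The formulas above can be illustrated with a minimal Python sketch. The function names and the example study (sensitivity 0.60, specificity 0.90, 100 diseased and 200 non-diseased subjects) are hypothetical and are not taken from any included article; the sketch also assumes that the group sizes are reported and that no cell of the 2 × 2 table is zero.

```python
import math


def counts_from_sens_spec(sensitivity, specificity, n_diseased, n_healthy):
    """Back-calculate TP, FP, FN, TN from reported sensitivity/specificity
    and the numbers of diseased and non-diseased subjects (hypothetical helper)."""
    tp = round(sensitivity * n_diseased)
    fn = n_diseased - tp
    tn = round(specificity * n_healthy)
    fp = n_healthy - tn
    return tp, fp, fn, tn


def indices(tp, fp, fn, tn):
    """Diagnostic performance indices from a 2 x 2 table (all cells non-zero)."""
    sens = tp / (tp + fn)
    spec = tn / (fp + tn)
    return {
        "sensitivity": sens,
        "specificity": spec,
        "ppv": tp / (tp + fp),
        "npv": tn / (fn + tn),
        "plr": sens / (1 - spec),
        "nlr": (1 - sens) / spec,
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "dor": (tp * tn) / (fp * fn),
        # Standard error of ln(DOR)
        "se_ln_dor": math.sqrt(1 / tp + 1 / fp + 1 / fn + 1 / tn),
    }


def dor_confidence_interval(dor, se_ln_dor, z=1.96):
    """95% CI of the DOR; note (ln(upper) - ln(lower)) / 3.92 equals se_ln_dor."""
    log_dor = math.log(dor)
    return math.exp(log_dor - z * se_ln_dor), math.exp(log_dor + z * se_ln_dor)


# Hypothetical study used only for illustration
tp, fp, fn, tn = counts_from_sens_spec(0.60, 0.90, 100, 200)
result = indices(tp, fp, fn, tn)
lower, upper = dor_confidence_interval(result["dor"], result["se_ln_dor"])
print(tp, fp, fn, tn, round(result["dor"], 2), round(lower, 2), round(upper, 2))
```

Rounding in the back-calculation means that the recovered counts may differ by one subject from the true table when the reported indices were themselves rounded.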

Statistical Analysis
Stata Statistical Software, version 15.1 (College Station, TX, USA) including relevant packages, such as metandi, midas, and mylabels, was used for this meta-analysis.

Identification of Relevant Studies
Among the 28 studies that adopted the cut-off standard of PG I ≤70 ng/mL and PG I/PG II ratio ≤3, two studies [46,51] evaluated the diagnostic performance of sPGA in the same population with slightly different inclusion criteria. Therefore, to avoid dependence between multiple outcomes derived from a single population, the study with the larger population [46] was included in the meta-analysis as the representative outcome. Finally, 27 studies were included for the diagnosis of GC with the cut-off standard of PG I ≤70 ng/mL and PG I/PG II ratio ≤3.

Methodological Quality of the Included Studies
Methodological qualities of the included studies were similar for the diagnosis of CAG except for five studies. Most of the studies used histological diagnosis as a reference standard of CAG diagnosis; however, three studies [20,28,31] deployed endoscopic diagnosis (visual inspection) as a reference standard of CAG diagnosis. One study [21] included only high-risk patients, such as patients with severe CAG, IM, and dysplasia, excluding the healthy population. Another study [33] also included high-risk patients as a population for reference standard. These five studies for the diagnosis of CAG were rated as "high-risk" in at least one of the seven domains (Figure 2).
Methodological qualities of the included studies were similar for the diagnosis of gastric neoplasm except for 13 studies. Ideally, all the patients should be tested with the same reference standard method (endoscopy). However, seven studies [35,39,46,47,49,51,63] performed endoscopy to diagnose gastric neoplasm only for patients with positive sPGA or positive double-contrast barium X-ray, introducing partial verification bias.
One study [48] conducted endoscopy every 2 years for patients with positive sPGA and every 5 years for patients with negative sPGA, adopting different standards of reference test (differential verification bias).
Two studies [46,51] evaluated the diagnostic performance of sPGA in the same population with slightly different inclusion criteria, and another two studies [46,47] also evaluated diagnostic performance in the same population using different cut-off values. Therefore, these studies were ranked as "high-risk" for the applicability concerns.
Since most of the studies were case-control studies, they were not ranked as "high-risk". A total of 13 abovementioned studies for the diagnosis of gastric neoplasm were rated as "high-risk" in at least one of the seven domains (Figure 3).

[…] (Figure 4B). The SROC curve with 95% confidence region and prediction region is illustrated in Figure 5B. Fagan's nomogram shows that the posterior probability of CAG is 66% if patients are diagnosed as positive, and the posterior probability of CAG is 12% if patients are diagnosed as negative according to the sPGA with the cut-off value of PG I/PG II ratio ≤3 (Figure 6B).

DTA of sPGA in GC
Since the minimum number of studies required for the quantitative analysis is four, DTA summary of sPGA in dysplasia or neoplasm was not calculated with a specific cut-off standard (only two or three studies were included with a specific cut-off value) (Figure 1). Sensitivity, specificity, PLR, NLR, DOR, and AUC with 95% CI for the cut-off value of PG I ≤70 ng/mL and PG I/PG II ratio ≤3 for GC diagnosis were 0.59 (0.50-0.67), 0.73 (0.64-0.81), 2.2 (1.7-2.9), 0.56 (0.46-0.68), 4 (3-6), and 0.70 (0.66-0.74), respectively (Table 6, Figure 7A). The SROC curve with 95% confidence region and prediction region is illustrated in Figure 8A. Assuming 20% prevalence of GC (prior probability), Fagan's nomogram shows that the posterior probability of GC is 36% if patients are diagnosed as positive, and the posterior probability of GC is 13% if patients are diagnosed as negative according to the sPGA with the cut-off value of PG I ≤70 ng/mL and PG I/PG II ratio ≤3 (Figure 9A).

Values for sensitivity, specificity, PLR, NLR, DOR, and AUC with 95% CI for the cut-off value of PG I ≤70 ng/mL for GC diagnosis were 0.62 (0.38-0.82), 0.57 (0.32-0.79), 1.4 (0.9-2.3), 0.67 (0.40-1.11), 2 (1-5), and 0.63 (0.58-0.67), respectively (Table 6, Figure 7B). The SROC curve with 95% confidence region and prediction region is illustrated in Figure 8B. Fagan's nomogram shows that the posterior probability of GC is 26% if patients are diagnosed as positive, and the posterior probability of GC is 14% if patients are diagnosed as negative according to the sPGA with the cut-off value of PG I ≤70 ng/mL (Figure 9B).
Values for sensitivity, specificity, PLR, NLR, DOR, and AUC with 95% CI for the cut-off value of PG I/PG II ratio ≤3 for GC diagnosis were 0.56 (0.35 […] Figure 7C). The SROC curve with 95% confidence region and prediction region is illustrated in Figure 8C. Fagan's nomogram shows that the posterior probability of GC is 39% if patients are diagnosed as positive, and the posterior probability of GC is 12% if patients are diagnosed as negative according to the sPGA with the cut-off value of PG I/PG II ratio ≤3 (Figure 9C).
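The posterior probabilities read from Fagan's nomogram follow from Bayes' theorem in odds form: posterior odds = prior odds × likelihood ratio. The following minimal Python sketch applies this to the 20% prior probability and the rounded pooled PLR (2.2) and NLR (0.56) reported for the combined cut-off; the function name is hypothetical. With these rounded inputs the results are approximately 35% and 12%, within one percentage point of the reported 36% and 13%; the small difference is presumably due to rounding of the pooled likelihood ratios, which is an assumption and not something stated in the source.

```python
def posterior_probability(prior, likelihood_ratio):
    """Convert a prior probability to a posterior probability using a
    likelihood ratio (the calculation underlying Fagan's nomogram)."""
    if not 0 < prior < 1:
        raise ValueError("prior must be strictly between 0 and 1")
    prior_odds = prior / (1 - prior)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1 + posterior_odds)


prior = 0.20  # assumed prevalence of GC, as in the text
post_positive = posterior_probability(prior, 2.2)   # rounded pooled PLR
post_negative = posterior_probability(prior, 0.56)  # rounded pooled NLR
print(f"positive sPGA: {post_positive:.3f}, negative sPGA: {post_negative:.3f}")
```

The same calculation with the PLR of 1.4 and NLR of 0.67 reported for PG I ≤70 ng/mL gives approximately 26% and 14%, matching the values stated for that cut-off.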

Exploring Heterogeneity with Meta-Regression and Subgroup Analysis of sPGA in CAG
For the diagnosis of CAG with the cut-off value of PG I ≤70 ng/mL and PG I/PG II ratio ≤3, the SROC curve was symmetric (Figure 5A). We observed a negative correlation coefficient between logit-transformed sensitivity and specificity (−0.92) and an asymmetry parameter, β, with a non-significant p value (p = 0.14), indicating no heterogeneity among studies. However, the 95% prediction region in the SROC curve was wide, and age (p = 0.01) and methodological quality of the included studies (p = 0.01) were found to be the sources of heterogeneity in meta-regression. Subgroup analyses according to the modifiers of heterogeneity showed lower AUCs in studies with a younger population (<60 years) and high methodological quality (Table 5).
For the diagnosis of CAG with the cut-off value of PG I/PG II ratio ≤3, the SROC curve was symmetric (Figure 5B). We observed a negative correlation coefficient between logit-transformed sensitivity and specificity (−0.72) and an asymmetry parameter, β, with a non-significant p value (p = 0.70), indicating no heterogeneity among studies. However, the 95% prediction region in the SROC curve was wide, and ethnicity (p = 0.02), age (p = 0.03), methodological quality of included studies (p = 0.01), and total number of patients (p = 0.05) were found to be the sources of heterogeneity in meta-regression. Subgroup analyses according to the modifiers of heterogeneity showed lower AUCs in studies with a younger population (<60 years), an Asian population, low methodological quality, and a higher number of included patients (≥1000) (Table 5).

Exploring Heterogeneity with Meta-Regression and Subgroup Analysis of sPGA in GC
For the diagnosis of GC with the cut-off value of PG I ≤70 ng/mL and PG I/PG II ratio ≤3, the SROC curve was symmetric (Figure 8A). We observed a negative correlation coefficient between logit-transformed sensitivity and specificity (−0.38) and an asymmetry parameter, β, with a non-significant p value (p = 0.26), indicating no heterogeneity among studies. However, the 95% prediction region in the SROC curve was wide, and ethnicity (p = 0.02), published year (p = 0.01), and total number of patients (p = 0.01) were found to be the sources of heterogeneity in meta-regression. Subgroup analyses according to the modifiers of heterogeneity showed lower AUCs in studies with a Western population, more recent publications (2010-2018 vs. 1995-2009), and a lower number of included patients (<1000) (Table 6).
For the diagnosis of GC with the cut-off value of PG I ≤70 ng/mL, the SROC curve was symmetric (Figure 8B). We observed a negative correlation coefficient between logit-transformed sensitivity and specificity (−0.61) and an asymmetry parameter, β, with a non-significant p value (p = 0.92), indicating no heterogeneity among studies. However, the 95% prediction region in the SROC curve was wide, and methodological quality of included studies (p = 0.05), detection method of sPGA (p <0.01), and total number of patients (p <0.01) were found to be the sources of heterogeneity in meta-regression. Subgroup analysis according to the modifiers of heterogeneity was possible only for methodological quality, because the number of studies in the subgroups classified according to the other modifiers was lower than four. Subgroup analysis showed lower AUCs in studies with high methodological quality (Table 6).
For the diagnosis of GC with the cut-off value of PG I/PG II ratio ≤3, the SROC curve was symmetric (Figure 8C). We observed a negative correlation coefficient between logit-transformed sensitivity and specificity (−0.83) and an asymmetry parameter, β, with a non-significant p value (p = 0.57), indicating no heterogeneity among studies. Only ethnicity (p <0.01) was found to be the source of heterogeneity in meta-regression. Subgroup analyses according to the modifier of heterogeneity showed lower AUCs in studies with Asian populations (Table 6).
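To illustrate what a correlation between logit-transformed sensitivity and specificity measures, the following Python sketch computes a simple sample (Pearson) correlation across studies. This is a simplification: the coefficients reported above are estimated within the bivariate model fitted in Stata, not by this calculation. The function names and the four study-level values are hypothetical and serve only as an example.

```python
import math


def logit(p):
    """Logit transform of a proportion strictly between 0 and 1."""
    return math.log(p / (1 - p))


def pearson(xs, ys):
    """Sample Pearson correlation coefficient."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def logit_correlation(sensitivities, specificities):
    """Correlation between logit(sensitivity) and logit(specificity)."""
    return pearson([logit(s) for s in sensitivities],
                   [logit(s) for s in specificities])


# Hypothetical study-level estimates (not from the included studies)
sens = [0.45, 0.55, 0.62, 0.70]
spec = [0.92, 0.88, 0.80, 0.71]
r = logit_correlation(sens, spec)
print(f"correlation of logit sensitivity and specificity: {r:.2f}")
```

In this hypothetical example, studies with higher sensitivity have lower specificity, so the correlation is strongly negative, which is the pattern expected when sensitivity is traded against specificity across studies.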

Publication Bias
Publication bias was not evaluated for diagnosis of CAG, as fewer than 10 studies on this subject were included with any cut-off values.
For the diagnosis of GC, 27 studies were included with cut-off value of PG I ≤70 ng/mL and PG I/PG II ratio ≤3. Deeks' funnel plot asymmetry test showed no evidence of publication bias (p = 0.71) (Figure 10A). Publication bias was not evaluated for cut-off of PG I ≤70 ng/mL, as only six studies were included with this cut-off value. Eleven studies were included with cut-off of PG I/PG II ratio ≤3. Although Deeks' funnel plot asymmetry test for 11 studies with a cut-off value of PG I/PG II ratio ≤3 showed a p value of 0.02, indicating publication bias, the plot was symmetrical with respect to the regression line (Figure 10B).
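Deeks' funnel plot asymmetry test regresses ln(DOR) on 1/√(effective sample size, ESS), weighted by ESS, where ESS = 4 × n1 × n2/(n1 + n2) for n1 diseased and n2 non-diseased subjects; a slope that differs significantly from zero suggests small-study effects. The following Python sketch shows this regression under stated assumptions: the function names and the three studies are hypothetical, no continuity correction is applied (all cells are assumed non-zero), and only the slope and its t statistic are returned rather than a p value.

```python
import math


def weighted_linear_fit(xs, ys, ws):
    """Weighted least-squares fit of y = intercept + slope * x.
    Returns (intercept, slope, t statistic of the slope)."""
    n = len(xs)
    if n < 3:
        raise ValueError("need at least three studies")
    sw = sum(ws)
    x_bar = sum(w * x for w, x in zip(ws, xs)) / sw
    y_bar = sum(w * y for w, y in zip(ws, ys)) / sw
    sxx = sum(w * (x - x_bar) ** 2 for w, x in zip(ws, xs))
    sxy = sum(w * (x - x_bar) * (y - y_bar) for w, x, y in zip(ws, xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    rss = sum(w * (y - intercept - slope * x) ** 2
              for w, x, y in zip(ws, xs, ys))
    se_slope = math.sqrt(rss / (n - 2) / sxx)
    t_stat = slope / se_slope if se_slope > 0 else float("nan")
    return intercept, slope, t_stat


def deeks_regression(studies):
    """studies: list of (TP, FP, FN, TN) tuples with non-zero cells."""
    xs, ys, ws = [], [], []
    for tp, fp, fn, tn in studies:
        n_diseased, n_healthy = tp + fn, fp + tn
        ess = 4 * n_diseased * n_healthy / (n_diseased + n_healthy)
        xs.append(1 / math.sqrt(ess))
        ys.append(math.log((tp * tn) / (fp * fn)))
        ws.append(ess)
    return weighted_linear_fit(xs, ys, ws)


# Hypothetical studies sharing the same DOR, so no asymmetry is expected
example = [(60, 20, 40, 180), (30, 10, 20, 90), (120, 40, 80, 360)]
intercept, slope, t_stat = deeks_regression(example)
print(f"slope: {slope:.4f}")
```

A two-sided p value for the slope can be obtained from a t distribution with (number of studies − 2) degrees of freedom, for example with `scipy.stats.t.sf` if SciPy is available.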


Discussion
There are two main types of pepsinogen (PG), namely PG I and PG II, which are proenzymes of pepsin, an endoproteinase present in the gastric juice [21]. PG I is secreted mainly by chief cells in the fundic glands of the stomach fundus and body, whereas PG II is secreted by all the gastric glands and the proximal duodenal mucosa (Brunner's glands) [5,21,74,75]. The secretion ability of gastric mucosa is usually intact in the case of no infection or acute H. pylori infection [75]. However, when chronic H. pylori infection with CAG extends from antrum to corpus of stomach, chief cells are replaced by pyloric glands [7]. Therefore, concentration of serum PG I decreases due to the damaged secretion ability of gastric mucosa, whereas the concentration of PG II remains relatively intact, leading to a low PG I/PG II ratio and this value reflects the severity of CAG [4,7,75].
Although various cut-off values have been suggested, PG I ≤70 ng/mL and PG I/PG II ratio ≤3 have been proposed for the prediction of CAG or GC [4,8]. However, previous meta-analyses presented only pooled outcomes, which cannot determine the diagnostic validity of sPGA with cut-off value of PG I ≤70 ng/mL and PG I/PG II ratio ≤3 [9,10], although no threshold effect was detected [9]. Moreover, the meta-analysis determined publication bias with Begg's test, which is inappropriate for DTA because of type I error inflation [9]. Serum concentration of gastrin, which is produced and secreted primarily by the G cells in antrum, is increased when the corpus mucosa is predominantly involved, and decreased with antral predominant gastric atrophy [5,75]. Combined efficacy of sPGA with H. pylori antibody [11] and/or gastrin-17 [12,13] has been indicated for the prediction of gastric cancer [11] and CAG [12,13], and it is mainly used in Europe (as panel test). However, sPGA is preferable to serum gastrin measurement because sPGA reflects gastric mucosal status better [75]. Moreover, previous meta-analyses could not determine the diagnostic validity of sPGA alone [11][12][13]. Although previous meta-analyses, published in 2004 and 2006, reported diagnostic validity of sPGA with cut-off value of PG I ≤70 ng/mL and PG I/PG II ratio ≤3, these studies cannot reflect recently published data and had several methodological pitfalls [8,14] (Table 1).
The results of our study confirm that the performance of sPGA is better for the diagnosis of CAG than GC, and sPGA has potential for CAG or GC screening (triage test) considering its high specificity (Tables 5 and 6). Another finding of this study is the diagnostic validity of sPGA with a cut-off value of PG I/PG II ratio ≤3. Although direct comparison of DOR does not have significant implications, the DTA of sPGA with a cut-off value of PG I/PG II ratio ≤3 was similar to that with a cut-off value of PG I ≤70 ng/mL and PG I/PG II ratio ≤3 (Tables 5 and 6). A recent study also indicated that the PG I/PG II ratio is one of the stomach-specific circulating biomarkers for GC risk assessment [69]. It is also known that sPGA is a cost-effective diagnostic test and is useful for reducing intestinal-type GC, especially in high-risk populations [76,77]. Considering its non-invasiveness and easily interpretable characteristics, the results of this study indicate the utility of sPGA as a population-based screening tool for CAG or GC.
Compared to the previous meta-analyses that combined the diagnostic values with various cut-off standards in a single outcome, the results of this study showed slightly lower diagnostic values (AUC for the diagnosis of CAG: 0.81 vs. 0.85/AUC for the diagnosis of GC: 0.70 vs. 0.76/DOR for the diagnosis of GC: 4 vs. 5.41), indicating overestimation of diagnostic validity in previous studies [9,10].
In terms of the sources of heterogeneity, subgroup analyses showed decreased I² values in high-quality studies with the cut-off value of PG I ≤70 ng/mL and PG I/PG II ratio ≤3 for the diagnosis of CAG compared to those of the main analysis (I² of sensitivity: 96.4% to 61.5%, I² of specificity: 96.1% to 88.6%) (Figure 4 and Table 5). In the subgroup analyses with a cut-off value of PG I/PG II ratio ≤3 for the diagnosis of CAG, high-quality studies (I² of sensitivity: 98.3% to 88.5%, I² of specificity: 98.2% to 74.2%) and a Western population (I² of sensitivity: 98.3% to 85.1%, I² of specificity: 98.2% to 80%) also showed decreased I² values compared to those of the main analysis, indicating a need for high-quality studies with a Western population to enhance the evidence level in this topic. Although studies with a Western population showed a slightly higher AUC (0.88 vs. 0.85) than the pooled AUC, the value is closer to that of the high-quality studies subgroup (0.92), indicating that it is not an overestimation; rather, more Western population data are needed to enhance the level of evidence. In Table 6, the recently published subgroup showed a much lower AUC (0.61 vs. 0.76) than that of older publications; however, the AUC of the recently published subgroup was closer to that of the high-quality subgroup (0.68; data not shown because it was not a source of heterogeneity in meta-regression), indicating overestimation in older publications. There was a change in diagnostic values according to the modifiers in the subgroup analyses for the diagnosis of GC; however, no such decrease of I² values was detected in these subgroup analyses (Table 6) (data about I² not shown in the results section).
The distribution of CAG or IM (known as pre-malignant or high-risk lesions of GC) in the entire population affects the determination of the optimal cut-off value of sPGA (spectrum bias). In our meta-analysis, a study by Dinis-Ribeiro et al. [21] included patients at high risk of GC, such as those with atrophic gastritis, IM, or dysplasia, excluding the healthy population, and showed higher sensitivity compared to that of the pooled analysis with a cut-off of PG I/PG II ratio ≤3 (0.66 vs. 0.50) (Table 5). A previous study by Valli De Re et al. [78] also included high-risk patients, such as first-degree relatives of patients with GC or CAG, and showed high sensitivity and specificity of 0.96 and 0.93 for the prediction of Operative Link on Gastric Intestinal Metaplasia Assessment (OLGIM) stage ≥2 with a cut-off of PG I ≤47.9 ng/mL. The proposed cut-off of PG I was lower than 70 ng/mL because they included a high-risk population. However, they proposed an algorithmic approach of using gastrin-17 first, because they included high-risk patients and gastrin-17 showed the highest discrimination capacity for CAG among the proposed biomarkers. As the next step, they recommended using PG I ≤47.9 ng/mL for the prediction of OLGIM stage ≥2. PG I generally shows a low level in CAG; however, if an optimal cut-off is to be determined in a high-risk population, a lower cut-off value might be required. A combination with a marker such as gastrin-17, which shows high discriminative performance for CAG, could be considered.
The present study rigorously investigated the diagnostic validity of sPGA with the well-known cut-off value of PG I ≤70 ng/mL and/or PG I/PG II ratio ≤3 for the diagnosis of CAG or GC, excluding threshold effect. However, the study has several limitations. Firstly, a relatively small number of studies were enrolled with a cut-off value of PG I ≤70 ng/mL or PG I/PG II ratio ≤3 compared to the combination of both values. Secondly, potential publication bias was suspected in the diagnosis of GC with a cut-off value of PG I/PG II ratio ≤3 (Deeks' funnel plot asymmetry test showed a p value of 0.02, although the plot showed a symmetrical shape), probably due to the relatively small number of enrolled studies (n = 11) (Figure 10B). Thirdly, substantial heterogeneity among studies was suspected, although rigorous subgroup analyses were performed and interpreted. Fourthly, this meta-analysis included many case-control studies, which tend to overestimate the diagnostic validity of the index test. Fifthly, the diagnostic validity of sPGA is known to be associated with smoking, H. pylori infection status, or the proportion of diffuse-type GC in the enrolled population [79]. However, this information was presented in only a small portion of the enrolled studies, limiting further analysis.
In conclusion, sPGA has the potential for use as a CAG or GC screening (triage test). Considering the heterogeneity among studies found in this analysis, high-quality studies based on Western populations could enhance the evidence level in this topic. Most importantly, considering that the usefulness of sPGA may be different between countries, this biomarker should be validated before practically using it for the screening of CAG or GC, because the enrolled studies were conducted in only a few countries.

Conflicts of Interest:
The authors declare no conflict of interest.
Access to Data: All investigators will have access to the final dataset. All of the data is accessible and available upon request by corresponding author.