Systematic Review with Meta-Analysis: Endoscopic and Surgical Resection for Ampullary Lesions

Ampullary lesions (ALs) can be treated by endoscopic (EA) or surgical ampullectomy (SA) or pancreaticoduodenectomy (PD). However, EA carries significant risk of incomplete resection while surgical interventions can lead to substantial morbidity. We performed a systematic review and meta-analysis for R0, adverse-events (AEs) and recurrence between EA, SA and PD. Electronic databases were searched from 1990 to 2018. Outcomes were calculated as pooled means using fixed and random-effects models and the Freeman-Tukey-Double-Arcsine-Proportion-model. We identified 59 independent studies. The pooled R0 rate was 76.6% (71.8–81.4%, I2 = 91.38%) for EA, 96.4% (93.6–99.2%, I2 = 37.8%) for SA and 98.9% (98.0–99.7%, I2 = 0%) for PD. AEs were 24.7% (19.8–29.6%, I2 = 86.4%), 28.3% (19.0–37.7%, I2 = 76.8%) and 44.7% (37.9–51.4%, I2 = 0%), respectively. Recurrences were registered in 13.0% (10.2–15.6%, I2 = 91.3%), 9.4% (4.8–14%, I2 = 57.3%) and 14.2% (9.5–18.9%, I2 = 0%). Differences between proportions were significant in R0 for EA compared to SA (p = 0.007) and PD (p = 0.022). AEs were statistically different only between EA and PD (p = 0.049) and recurrence showed no significance for EA/SA or EA/PD. Our data indicate an increased rate of complete resection in surgical interventions accompanied with a higher risk of complications. However, studies showed various sources of bias, limited quality of data and a significant heterogeneity, particularly in EA studies.


Introduction
Ampullary lesions (AL) are rare conditions and comprise 7 to 10% of periampullary lesions [1]. However, the incidence of ALs has clearly increased from 1973 to 2005 particularly in patients over the age of 50 [2]. Nowadays, most ALs are incidentally diagnosed by endoscopy or radiology imaging. However, in symptomatic patients, painless jaundice and cholangitis, acute pancreatitis, nausea, vomiting or weight loss have been described [3]. Most ALs are ampullary adenomas (AA) and adenocarcinomas (AC) following an adenoma-to-carcinoma sequence [4]. Some other rare entities such as neuroendocrine or mesenchymal lesions have also been described [5].
Treatment of ALs is historically a surgical approach but advances in endoscopic techniques have facilitated minimally invasive therapies [6]. Non-invasive ALs, carcinoma in situ and T1-node-negative adenocarcinoma can be treated either by endoscopic ampullectomy or papillectomy (EA) [7], surgical or trans-duodenal ampullectomy (SA) [8] or pancreaticoduodenectomy (PD) [9]. However, clear consensus guidelines or recommendations are lacking and the therapeutic strategy for ALs depends on local expertise. EA is usually performed for smaller lesions without any sign of invasive carcinoma, clear margins, soft tissue and absence of ulceration [10]. In contrast, recent studies describe the feasibility of "piece-meal" EA [11], even in large laterally spreading lesions, with deep ductal invasion [12] and supposed node-negative T1 adenocarcinoma [13]. Additionally, EA could be used as a "macrobiopsy" for tumor staging and as a bridge to surgery [14]. This is important, as recent studies still show limited pre-interventional accuracy of the endoscopic biopsy despite the use of EUS [15]. In addition, the significant morbidity and mortality of PD, as well as a notable recurrence of up to 43% in EA, have to be considered in the therapeutic decision for resection of ALs [16].
To date, randomized controlled trials that have matched EA and SA or PD directly are lacking. Only a few studies retrospectively compared endoscopic and surgical techniques for the treatment of ALs. These works revealed different inclusion criteria, outcomes and surgical approaches. In addition, the conclusions drawn were, at least in part, counterintuitive [11,16,17]. Thus, the aim of this systematic review and meta-analysis was to analyze the outcomes of EA, SA and PD for non-invasive ALs, carcinoma in situ and T1-node-negative adenocarcinoma in terms of complete resection (R0), complications and recurrences.

Data Sources
Our search strategy and inclusion criteria were based on the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) recommendations [18]. The analysis was registered in the PROSPERO database (on 19 August 2019, registration number 147795). We performed a systematic literature review in Medline, EMBASE and SCOPUS on 27 September 2018 to identify studies that analyzed endoscopic or surgical interventions for ALs. Additionally, we performed a manual search of references from representative articles and reviews.

Study Selection Process and Outcome Measures
Only human subject studies were considered in the analysis. Any retrospective or prospective study analyzing EA, SA or PD for ALs that reported at least one of the outcomes was included. The primary outcome measurement of our study was the rate of complete resection (R0) determined by histology. Secondary outcomes were overall complications (AEs) for endoscopic and surgical interventions and rate of recurrence. Recurrence was defined as a new lesion on endoscopy after initial negative follow-up endoscopy (in EA and SA) or local or distant recurrence in cross-sectional imaging (SA and PD). Further inclusion criteria were histologically proven ALs, patients >18 years, >10 patients per study, >90% ampullary adenoma or adenocarcinoma (T1 or carcinoma in situ) and publication later than 1990. Studies were excluded if they were deemed to have insufficient data, did not report independent data (e.g., review articles, editorials, correspondence letters, duplication and others) or did not fulfil the inclusion criteria. A bibliography of full texts was established for all studies that could possibly meet the inclusion criteria.

Quality Assessment
The data quality of the included studies was evaluated by the modified Newcastle-Ottawa Scale (NOS) (Table 1) for non-randomized studies, ranging from 0 (low-quality) to 9 (high-quality) [19]. The overall NOS-Score was calculated by 7 items (4 points for adequate selection, 2 for comparability and 3 for outcome). A study was scored as good quality if at least 3 points for selection, 1 for comparability and 2 for outcome were achieved. Fair quality was defined as 2 points for selection, 1 point for comparability or 1 point for outcome. Studies of poor quality reached 1 point for selection, no points for comparability or 1 point in outcome. All reviewers (CH, EAA, FA, AG, MH) assessed the specific quality indicators of the included publications.
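The point ranges and the good-quality threshold described above can be sketched in a few lines of Python. This is an illustrative sketch only: the class name `NosScore` and its fields are hypothetical, and only the good-quality rule is encoded, because it is the one stated as a joint threshold on all three domains.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NosScore:
    """Hypothetical container for one study's Newcastle-Ottawa points."""
    selection: int      # 0 to 4 points
    comparability: int  # 0 to 2 points
    outcome: int        # 0 to 3 points

    def __post_init__(self):
        # Reject values outside the point ranges given in the text.
        limits = {"selection": 4, "comparability": 2, "outcome": 3}
        for name, upper in limits.items():
            value = getattr(self, name)
            if not 0 <= value <= upper:
                raise ValueError(f"{name} must be between 0 and {upper}, got {value}")

    @property
    def total(self) -> int:
        # Overall score, from 0 (low quality) to 9 (high quality).
        return self.selection + self.comparability + self.outcome

    def is_good_quality(self) -> bool:
        # Good quality: at least 3 selection, 1 comparability and 2 outcome points.
        return self.selection >= 3 and self.comparability >= 1 and self.outcome >= 2
```

For example, `NosScore(3, 1, 2).is_good_quality()` evaluates to `True`, whereas a study with only 2 selection points does not meet the threshold.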

Statistical Analysis
Statistical analyses were performed by means of openMeta[Analyst] (v. 12.11.14, http://www.cebm.brown.edu/openmeta/), "R" (version 3.5.3 distributed by www.r-project.org) and SPSS (version 25, IBM). We calculated pooled values for R0, complications and recurrence and presented the proportion with a 95% confidence interval (CI). The binary random effects model (DerSimonian-Laird) was used for analysis of EA and SA, as these studies showed high heterogeneity. In contrast, meta-analyses for PD were calculated by a fixed effect model with inverse variance weighting as data were of low heterogeneity. Statistical heterogeneity between the included studies was determined by forest plots and by calculating the I2 index. In the case of high heterogeneity, a sensitivity analysis was performed. All pooled event rates were shown in forest plots, regardless of the level of heterogeneity.
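To illustrate the pooling step, the following Python sketch computes an inverse-variance pooled proportion, Cochran's Q, the I2 index and the DerSimonian-Laird between-study variance. It is a minimal sketch under stated assumptions, not a reproduction of openMeta[Analyst]: it works on the raw proportion scale, assumes every study proportion lies strictly between 0 and 1, and the function name `pool_proportions` and all inputs are hypothetical.

```python
import math


def pool_proportions(events, totals, random_effects=True):
    """Return (pooled, ci_low, ci_high, i2_percent) for study proportions.

    Assumption: every study has 0 < events < total, so the within-study
    variance p * (1 - p) / n is strictly positive.
    """
    props = [e / n for e, n in zip(events, totals)]
    variances = [p * (1 - p) / n for p, n in zip(props, totals)]
    weights = [1 / v for v in variances]

    # Fixed-effect (inverse-variance) estimate, also needed for Cochran's Q.
    fixed = sum(w * p for w, p in zip(weights, props)) / sum(weights)
    q = sum(w * (p - fixed) ** 2 for w, p in zip(weights, props))
    df = len(props) - 1

    # I2: share of total variation attributed to between-study heterogeneity.
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0

    # DerSimonian-Laird estimate of the between-study variance tau^2.
    tau2 = 0.0
    if random_effects and df > 0:
        c = sum(weights) - sum(w ** 2 for w in weights) / sum(weights)
        tau2 = max(0.0, (q - df) / c)

    # With tau2 == 0 these weights reduce to the fixed-effect weights.
    w_star = [1 / (v + tau2) for v in variances]
    pooled = sum(w * p for w, p in zip(w_star, props)) / sum(w_star)
    se = math.sqrt(1 / sum(w_star))
    return pooled, pooled - 1.96 * se, pooled + 1.96 * se, i2
```

Calling `pool_proportions(events, totals, random_effects=False)` gives the fixed-effect result; the random-effects variant widens the confidence interval whenever the estimated between-study variance is greater than zero.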
Studies were evaluated for publication bias by Egger's test [20], and funnel plots were drawn by using JASP (v. 0.11.1, https://jasp-stats.org/). To compare endoscopic and surgical interventions, a proportion-meta-analysis was conducted. The individual study proportions were transformed into a quantity using the Freeman-Tukey Double Arcsine Proportion model. Then, the pooled proportion was calculated by transforming the weighted mean back out of the transformed proportions. For the transform, we used inverse arcsine variance weights for the fixed-effects model and DerSimonian-Laird weights for the random effects model [21]. Statistical significance was assessed by considering the 95% CIs of the two pooled proportions and the differences of proportions and their 95% CIs, respectively. All p-values calculated by Student's t-test were 2-sided, and p < 0.05 was considered statistically significant. Combined weighted proportions were determined using openMeta[Analyst].
Search strategy, screening and data extraction can be found in the methods of the supplemental material. Overview of all rated publications in our systematic review, using the Newcastle-Ottawa Scale. Green shows good, yellow mediocre and red low reporting quality.
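The transformation and back-transformation can be sketched as follows. This is a minimal sketch, assuming the sum form of the Freeman-Tukey double arcsine transform with variance 1/(n + 0.5), Miller's closed-form inverse evaluated at the harmonic mean of the sample sizes, and fixed-effects weights only; the function names are hypothetical and the exact variant implemented in openMeta[Analyst] is not confirmed here.

```python
import math


def ft_transform(events, n):
    """Freeman-Tukey double arcsine transform (sum form) of events out of n."""
    return (math.asin(math.sqrt(events / (n + 1)))
            + math.asin(math.sqrt((events + 1) / (n + 1))))


def ft_back_transform(t, n):
    """Miller's inverse of the transform for a (possibly non-integer) n."""
    # Clamp t to the range the transform can produce for this n.
    t_min = math.asin(1 / math.sqrt(n + 1))
    t = min(max(t, t_min), math.pi - t_min)
    s = math.sin(t)
    inner = 1 - (s + (s - 1 / s) / n) ** 2
    sign = 1.0 if math.cos(t) >= 0 else -1.0
    return 0.5 * (1 - sign * math.sqrt(max(0.0, inner)))


def pool_ft_fixed(events, totals):
    """Fixed-effects pooled proportion with a 95% CI on the original scale."""
    weights = [n + 0.5 for n in totals]  # inverse of the variance 1 / (n + 0.5)
    t_values = [ft_transform(e, n) for e, n in zip(events, totals)]
    t_pooled = sum(w * t for w, t in zip(weights, t_values)) / sum(weights)
    se = math.sqrt(1 / sum(weights))
    # Assumption: back-transform with the harmonic mean of the sample sizes.
    n_harm = len(totals) / sum(1 / n for n in totals)
    return (ft_back_transform(t_pooled, n_harm),
            ft_back_transform(t_pooled - 1.96 * se, n_harm),
            ft_back_transform(t_pooled + 1.96 * se, n_harm))
```

The transform stabilizes the variance of proportions close to 0 or 1, which is why pooling is done on the transformed scale and only the final estimate and its confidence limits are converted back.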

Study Characteristics and Quality
Our literature search revealed 2395 articles in total. Out of these, 1660 did not meet the topic. After screening of titles and abstracts, 367 studies remained. Finally, 59 papers were eligible to be included into the meta-analysis after full-text evaluation (42 EA, 8 SA, 1 PD, 3 SA + PD, 4 EA + SA, 1 EA + PD; Figure 1, PRISMA chart). We did not include conference abstracts that were not published as full-texts. Although some studies reported outcomes of different interventions, no single study directly compared different approaches with relevant patient count. Relevant study characteristics and quality assessment are reported in Tables 2 and 3. The majority of studies (n = 54, 92%) were retrospective single center reports. In total, out of the EA studies, we included 2658 patients (54.1% male) at an average age of 59.2 years for our analysis. Moreover, SA studies involved 393 patients (47.5% male) with a mean age of 63.5 years and PD studies involved 778 patients (53.3% male) with a mean age of 63.9 years. The follow-up after the procedures ranged between 5 and 190 months but was not adequately described in 18 studies (31%, Table 3).
The quality of the included articles was heterogeneous. Out of 59 papers, 35 were of good quality (59.4%) and 23 revealed poor scientific quality (38.9%) according to the NOS. One study was of fair quality (1.7%, all Table 2). Study quality was included in the sensitivity analysis.

Complete Resection in Endoscopic and Surgical Interventions
We were able to include 39 datasets of EA, 10 datasets of SA and 4 of PD in the meta-analysis for complete resection. Patients were asymptomatic in 54.4% (EA), 35.3% (SA) and 21.6% (PD). The mean lesion size of the different study cohorts was 19.7 mm (EA), 17.5 mm (SA) and 17.0 mm (PD). T1-adenocarcinoma and carcinoma-in-situ were found in 8.7% of patients in EA, in 30.8% in SA and in 100% of patients in PD (all Table 3). The remaining patients had non-invasive ALs. Thereby, a strict comparison, matching or propensity scoring was impossible to perform.
In addition, we evaluated the endoscopic publications for biliary/pancreatic duct stent implantation, submucosal injection and additional therapy that could influence the development of post-interventional complications. Thereby, papers also showed a high heterogeneity in this regard (Table 4). Sensitivity analysis failed to reduce heterogeneity of the calculated proportions as no single or systematic outliers could be identified.

Publication Bias
The funnel plots for reported R0 rates for EA (Figure 5A) and SA (Figure 5B) showed no evidence of publication bias. The p-values of Egger's test were 0.061 for EA and 0.229 for SA. We did not analyze publication bias for PD as fewer than 10 PD studies were included. Our data revealed no evidence for publication bias in regard to rates of complete resection.
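For readers unfamiliar with Egger's test, the regression behind it can be sketched as below: each study's standardized effect (effect divided by its standard error) is regressed on its precision (one over the standard error), and an intercept far from zero suggests funnel plot asymmetry. This is an illustrative sketch with a hypothetical function name; it returns the t statistic and degrees of freedom rather than a p-value, which would be read from a t distribution with k − 2 degrees of freedom.

```python
import math


def egger_intercept(effects, std_errors):
    """Return (intercept, intercept_se, t_stat, df) of Egger's regression."""
    k = len(effects)
    if k < 3:
        raise ValueError("at least 3 studies are needed")

    x = [1 / se for se in std_errors]                    # precision
    y = [e / se for e, se in zip(effects, std_errors)]   # standardized effect
    x_mean = sum(x) / k
    y_mean = sum(y) / k
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("standard errors must not all be identical")

    # Ordinary least squares fit of y on x.
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean

    # Standard error of the intercept from the residual variance.
    sse = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    resid_var = sse / (k - 2)
    intercept_se = math.sqrt(resid_var * (1 / k + x_mean ** 2 / sxx))
    t_stat = intercept / intercept_se if intercept_se > 0 else math.nan
    return intercept, intercept_se, t_stat, k - 2
```

With only a handful of studies the intercept is estimated very imprecisely, which is consistent with the decision above not to test the PD group.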

Discussion
ALs are a rare but increasingly diagnosed entity. Although mainly benign and premalignant lesions are diagnosed, ALs have the potential for malignant transformation. In addition, ALs can cause cholangitis or pancreatitis and thereby reduce patients' quality of life. Thus, treatment of ALs either by EA, SA or PD is indicated in all cases [2][3][4][80]. Nevertheless, prospective randomized trials comparing EA, SA or PD are lacking and the treatment of AL for non-invasive lesions, carcinoma in situ and T1-node-negative adenocarcinoma is far from a consensus. The choice of treatment is currently based on the decision of the treating clinician and the available endoscopic or surgical resources, and is not supported by evidence-based guidelines.
This study is the first systematic review to report R0 resection rate, complications and recurrence of all three interventional procedures for the treatment of ALs. Our analyses clearly indicated a superiority of surgical approaches to endoscopic therapy with regard to complete resection (EA 76.6%, SA 96.4%, PD 98.9%). Nevertheless, this high efficacy was accompanied by a considerable rate of complications, in particular for PD (EA: 24.7%, SA: 28.3%, PD: 44.7%), which is possibly underestimated in these retrospective single cohort series. Statistical calculations indicate a significant difference between endoscopic and surgical interventions. One could argue that there is sufficient evidence to recommend surgery for invasive ALs as first line therapy. However, for non-invasive or merely suspicious lesions, especially in patients with high risk for surgical complications, the best approach remains to be determined. In addition, there is high heterogeneity and many limitations in the available literature.
First, we had to exclude numerous studies of PD data as they provided R0-rates and complications for the whole study populations including distal bile duct carcinoma, pancreatic cancer, duodenal lesions and all stages of ALs and ACs. As a consequence, distinct data of AAs or ACs (T1) could not be assessed. Therefore, many articles that presumably would fit our analysis had to be excluded. It is nevertheless unlikely that they would have significantly modified the R0 resection rate. Finally, only four publications of PD for ALs were eligible. Hence, we cannot exclude a publication bias as the Egger's test required a minimum of 10 publications to sufficiently address this issue. Moreover, all PD papers analyzed data of ACs but not AAs or other ALs. Unfortunately, this finding leads to the assumption of a selection bias, as ACs are preferentially treated surgically rather than endoscopically. In addition, the selection of ACs could explain the low heterogeneity of PD data due to high similarity of included patients. In contrast, ACs were found only in 8.7% of EA patients and in 30.8% of SA data, which might reflect accurate patient selection. Indeed, in T1 adenocarcinoma, only those with an intact submucosal layer (D0) are eligible for local treatment (either EA or SA) as their likelihood of negativity in lymph node invasion is very high [81][82][83]. This may also have introduced a selection bias in the published data as described above. Finally, EUS should be used to evaluate ALs prior to intervention [84] but was mentioned in only 70.6% of surgical and 65.2% of EA studies. The inequality in ACs between the three interventional groups could also explain the similar rate of recurrence, although SA and PD showed a convincing rate of complete resection. A recurrence of a completely resected AA is more unlikely compared to a T1 AC, according to its nodal status. Additionally, recurrences were not described as local or distant, which is of utmost importance. 
The mean follow-up after the procedures was possibly too short to capture all local recurrences.
Moreover, the histologic type (intestinal vs. pancreaticobiliary) and grade of dysplasia (low grade vs. high grade vs. T1d1 invasive lesion vs. T1d2 advanced lesion) also have an impact on outcome. It is acceptable to perform a second papillectomy for an initially incompletely resected AA rather than refer the patient for surgery. In contrast, an incidentally diagnosed AC, either incompletely resected or associated with a significant risk of lymph node involvement, might require additional surgery. In addition, it is important to report the subtypes of ACs (intestinal vs. pancreaticobiliary), as they influence the risk of local or distant recurrences [85]. However, the discrimination of AC subtypes was reported in only two of all included EA studies [23,65] and in none of the surgical series. It is also important to note that EA is a recent technique that has improved over the years, and results from the 1990s or early 2000s might not be as good as today's regarding patient selection, the rate of R0 resection and post-procedure outcomes. Another important parameter to discriminate AAs from ACs, in particular in large ALs, is the molecular marker profile. A KRAS mutation was found more often in advanced AAs and in ACs [86,87]. It is nevertheless important to note that KRAS mutation is an early event in tumorigenesis and can be found in early AA. Data on molecular markers in ALs were not sufficient to incorporate into our analysis but are interesting for future projects.
Another limitation of our meta-analysis is based on the design of the included studies. Almost all data were published in monocentric retrospective cohort studies. Prospective comparative randomized trials are lacking. Thus, the quality of the included studies is heterogeneous; only 59.4% of papers were of good quality according to NOS. In particular, the quality of EA data remains, at least in part, limited. Technical specifications of EA procedures have developed over time and thus, different techniques were published. Moreover, the definition of R0 by histology and/or complete endoscopic resection was not used consistently in all studies. Besides the use of EUS, a prior ERCP procedure could influence results. Many publications did not report on prior sphincterotomy and, maybe more importantly, the use of pancreatic and bile duct stents was inhomogeneous. The current ESGE guidelines clearly recommend the implantation of a pancreatic duct stent to prevent post-ERCP-pancreatitis [88,89] in EA. A lot of EA data were published prior to these guidelines and, thus, did not regularly use pancreatic duct stents. Moreover, in 40.4% of patients with EA, a prior submucosal injection was performed to lift the lesions. A submucosal injection is not recommended as such injection resulted in reduced rates of complete resection [10,31,41,80]. Furthermore, in some EA studies, additional therapy by radiofrequency ablation (RFA), argon plasma coagulation (APC), additional EA or surgery was performed. All these confounders may have substantially influenced the results, in particular for complete resection, and could explain the high heterogeneity in EA studies. We aimed to reduce heterogeneity by different sensitivity analyses. Unfortunately, the results remained similar (success rates did not differ >10% between different analyses).
In addition to a complete resection, we analyzed the overall rate of complications between endoscopic and surgical interventions. The rate of complications was clearly higher in PD compared to EA, but not compared to SA. Since complications are defined very specifically with respect to every interventional approach and are difficult to compare, we decided to solely compare rates of overall complications. Moreover, if overall complications were reported in the included studies, most of them were not graded according to modern classifications such as Clavien-Dindo and it was not possible to accurately classify and compare them between PD, SA and EA according to their severity. Therefore, our analysis is limited, as we cannot judge the burden of the reported adverse events. We think that an evaluation of severity of complications ought to be examined if there is an extension of in-patient time. In addition, the postoperative mortality of PD was not assessable in the present meta-analysis but should not be underestimated, as recently shown by nationwide data analyses [90,91].
Despite all the above mentioned limitations, our meta-analysis has some strengths. The baseline characteristics of age and gender were comparable between the three groups. Interestingly, the mean size of the lesions was quite similar (EA 19.7 mm, SA 17.5 mm, PD 17.0 mm). This finding was surprising, because large lesions are often more likely to be treated by surgery. But our finding is in line with current data indicating that there is no correlation of size and malignancy, if malignancy was excluded by EUS [11,14,17,92]. Hence, a selection bias depending on the size of the lesions is very unlikely.
Our results are in line with a recently published (non-systematic) meta-analysis of five studies summarizing that surgery was more effective compared to EA in AAs, but was associated with higher rates of complications [93]. However, this analysis showed several limitations in design and statistical analysis. The current data raised several questions that need to be addressed by further studies. Even if surgery is more effective in treating AL in our meta-analysis, a risk-benefit analysis has to be performed for PD according to the high rates of AEs. Another concern is the role of SA, as it is restricted to expert centers and not widely available, and might call for centralization in the management of such rare lesions. A more fundamental issue is the long-term outcome of repeated EA for recurrent AA. There are no data comparing repetitive EA with surgery for AA or EA as bridge to surgery.
In the present meta-analysis, very few of the included studies were comparative. Additionally, because of the heterogeneity in the reported variables, a strict comparison between the endoscopic and surgical groups was not possible. In addition, with regard to PD, the use of laparoscopic and robotic-assisted PD is emerging. Retrospective studies showed comparable efficacy with reduced hospital stay and blood loss compared to open PD [94,95]. Nevertheless, recently published evidence failed to show the superiority or even equivalence of the laparoscopic approach to the Whipple procedure [96]. Indeed, a randomised controlled trial was stopped prematurely because of an excess of mortality in the laparoscopic group compared with the open approach [97]. In our meta-analysis, no studies analyzing laparoscopic or robotic-assisted PD could be included but these techniques might be interesting in future treatment of patients with advanced ALs.
To address these unanswered questions, a prospective randomized controlled trial should ideally be initiated. However, due to the rarity of ALs and the lack of clear stratification attributes, such a trial is unlikely to be feasible. This calls for an international multicenter retrospective study that analyzes EA, SA and PD for AL in detail to provide evidence for therapeutic algorithms and data for the implementation of guidelines in the treatment of different types of ALs, including recurrent or incompletely resected lesions and additional ablative therapies [98].
In conclusion, the present meta-analysis provides an overview of the different therapeutic options for non-invasive ALs, carcinoma in situ and T1-node-negative adenocarcinoma. Our data suggest an improved rate of complete resection in surgical interventions compared to EA. However, this benefit was accompanied by a significantly higher rate of complications, in particular after PD. Moreover, high heterogeneity and limited data quality affect the reliability of the data, and results should be interpreted with caution. Further research is needed to establish clear guideline recommendations for the treatment of ALs.