Is Long-Term Survival in Metastases from Neuroendocrine Neoplasms Improved by Liver Resection?

Background and Objectives: Although many neuroendocrine neoplasms (NEN) have a prolonged natural history compared with other gastrointestinal tract cancers, at least 40% of patients develop liver metastases. This study aims to determine whether liver resection improves the overall survival of patients with liver metastases from NEN. Materials and Methods: We conducted a retrospective study at “Fundeni” Clinical Institute over a period of 15 years and identified a series of 93 patients treated for NEN with liver metastases, whom we divided into 2 groups: group A (45 patients) underwent liver resection complemented by systemic therapies, and group B (48 patients) underwent systemic therapy alone. To reduce patient selection bias, we first performed propensity score matching. This was followed by a bootstrapping selection with Jackknife error correction, with the purpose of obtaining a statistically illustrative sample. Results: The overall survival of the matched virtual cohort was 41 months (95% CI 37–45). Matched virtual patients in group A showed longer survival (52 months, 95% CI: 45–59) than those in group B (31 months, 95% CI: 27–35) (p < 0.001, log-rank test). Upon multivariate analysis, seven independent factors were found to influence survival: location (midgut) and grading (G3) of the primary tumor, absence of concomitant LM, number (2–4), location (unilobar), and grading (G3) of LM, and 25–50% hepatic involvement at the time of diagnosis of metastatic disease. Conclusions: Hepatic resection is currently the main treatment offering potential cure and prolonged survival for patients with NEN when integrated into a multimodal strategy based on systemic therapy.


Introduction
Neuroendocrine neoplasms (NEN) comprise a heterogeneous group of malignancies that arise from neuroendocrine cells throughout the body [1]. There are some features that characterize their ability to synthesize and secrete hormonally active polypeptides; thus,

Materials and Methods
We conducted a retrospective study at "Fundeni" Clinical Institute, a center of excellence dedicated to hepato-bilio-pancreatic surgery and hepatic transplant.

Patients
We queried a retrospective institutional database for patients who received multimodal therapy for NELM, diagnosed histologically and immunohistochemically, over a period of 15 years (between 1 January 2004 and 31 December 2018). We thereby identified a series of 93 patients, divided according to their treatment approach into 2 groups: group A (45 patients) underwent liver resection complemented by systemic therapies, and group B (48 patients) underwent systemic therapies only. The therapeutic approach was decided for each patient by a multidisciplinary team that included an expert in hepato-bilio-pancreatic surgery, an oncologist specialized in GI tract neoplasms, and a specialist in interventional radiology. Consequently, surgery was offered to patients who were fit for surgery (all analyzed patients had normal liver function, considered a primary selection criterion) and whose disease was considered oncologically and technically resectable. The patient's performance status was used to evaluate the capability to tolerate surgery. Oncological resectability was established based on the location and resectability of the primary tumor, the extent of metastatic disease (organs involved, number and size of the extrahepatic metastases), and the response to oncological treatment. We considered liver resection in the presence of extrahepatic disease if the latter was also resectable or controlled by oncological treatment. Liver resection was not contraindicated if liver metastases progressed on oncological treatment. Technical resectability was assessed based on the number and size of the liver metastases, their relationship with the main vascular and/or biliary structures, the background liver, coagulation status, liver function, and the liver remnant volume. To ensure resectability, R1 and even R2 resections were allowed; in advanced liver disease, with many and/or large liver metastases, removal of about 70-90% of the tumor burden was considered an optimal surgical treatment.
The reasons why LR was judged unfeasible and patients were consequently assigned to group B were as follows: 21 patients for technical reasons; 10 patients because of multiple, disseminated, high-burden extrahepatic metastatic disease; 14 patients refused surgery; and 3 patients had comorbidities that precluded LR.
As a first step in reducing selection bias, we excluded 4 patients from group A and 5 from group B who had received other types of liver-directed therapies, such as radiofrequency ablation, whole liver transplantation, or transarterial chemoembolization, either alone or associated with liver resection (e.g., combined resection and ablation). Because of the small number of these patients (9 in total), creating a separate subgroup was not feasible.
In addition, we excluded 11 "end-stage disease" patients from the systemic therapy group who had been referred for palliation. They were considered end-stage because of high tumor burden and spread of the metastatic disease (intra- and/or extrahepatic): metastatic liver replacement volume >75%, not suited even for debulking liver resection; multiple locations with extensive extrahepatic disease; and high risk of morbidity and mortality following systemic therapy.

Study Endpoints and Baseline Definitions
We set OS as the endpoint, defined as the time from therapy onset until death [20], with death data supplied by the "Directorate for Persons Record and Database Management". Follow-up data were obtained from the patients' medical records. The "Response Evaluation Criteria in Solid Tumors", version 1.1, were used to grade the response to systemic treatment [21].

Statistical Analyses
We described categorical variables as frequencies and percentages and compared them by means of the Chi-square and Fisher's exact tests. We presented continuous variables as mean and standard deviation (SD) or as median and range. We compared continuous variables that deviated significantly from normality using the appropriate nonparametric tests, and those with quasi-normal distributions by means of Student's t-test. We assessed OS by means of the Kaplan-Meier method, computing the time to event in months. The log-rank test was used to test survival differences.
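As an illustration of the Kaplan-Meier method mentioned above, the following minimal Python sketch computes the product-limit survival estimate and the median survival. It is not the SPSS procedure used in the study; the function names and the follow-up data are hypothetical, and the log-rank test is not covered.

```python
from collections import Counter

def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with at least one death.

    times  -- follow-up in months for each patient
    events -- 1 if the patient died at that time, 0 if censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e)
    leaving = Counter(times)   # deaths plus censored patients at each time
    at_risk = len(times)
    surv = 1.0
    curve = []
    for t in sorted(leaving):
        d = deaths.get(t, 0)
        if d:
            # Product-limit step: multiply by the fraction surviving time t.
            surv *= 1.0 - d / at_risk
            curve.append((t, surv))
        at_risk -= leaving[t]
    return curve

def median_survival(curve):
    """First time at which the survival estimate drops to 0.5 or below."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None  # median not reached during follow-up

# Hypothetical follow-up data (months); not taken from the study.
times = [6, 12, 12, 20, 31, 40, 52, 60]
events = [1, 1, 0, 1, 1, 0, 1, 0]
curve = kaplan_meier(times, events)
print(curve)
print("median OS (months):", median_survival(curve))
```

Censored patients leave the risk set without lowering the survival estimate, which is why the curve only changes at times with at least one death.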

Propensity Score Matching (PSM)
To minimize patient selection bias, we performed a PSM analysis. This statistical tool is known to significantly reduce bias in the effect size and to give the study design characteristics of randomized studies. However, it has a primary shortcoming: it reduces the size of the sample, which in turn lowers statistical power. To compute the propensity score (PS), we started from the parameters provided in the existing literature, namely: age; Charlson Comorbidity Index; PT-related: location, grading, resection status; NELM-related: number; extrahepatic disease [6,8,22,23]. To these we added parameters that we considered adequate to depict the LM more precisely: ECOG performance status, size of the largest LM (measured in centimeters), neoplastic liver volume at the time of diagnosis, and LM grading. In the next step, we estimated the PS by logistic regression using the following overall parameters: age; ECOG; Charlson score; PT: location, grading, resection status; NELM: number, size of the largest liver metastasis (centimeters), neoplastic liver volume at diagnosis, grading; and existence of extrahepatic disease. Afterwards, we matched patients from groups A and B using one-to-one nearest-neighbor matching with a caliper of 0.15. This yielded a small matched sample of 15% of cases (i.e., 7 patients in each group, 14 patients in all). Therefore, given the reduced size of the sample, we proceeded to a bootstrapping selection with Jackknife error correction [24,25] to obtain a more statistically illustrative sample, on which we could perform a pertinent analysis of the survival differences between groups A and B and identify the factors associated with better OS; the result was a larger sample of 2000 virtual patients identical in terms of characteristics to the real patients.
We repeated the PSM analysis on this virtual sample, using the same caliper and covariates as described above, which yielded the same matching percentage of 15% (i.e., 152 in each group, 304 virtual patients in total), thereby validating the computation; finally, we conducted the survival analysis.
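The resampling step can be pictured with the following minimal Python sketch. It assumes that the virtual cohort is drawn by resampling real patient records with replacement and that the jackknife is used to estimate a standard error; the exact SPSS procedure is not detailed in the text, and the record fields, function names, and values are hypothetical.

```python
import random

def bootstrap_sample(records, size, seed=0):
    """Draw `size` records with replacement from the observed records."""
    rng = random.Random(seed)
    return [rng.choice(records) for _ in range(size)]

def jackknife_se(values):
    """Jackknife (leave-one-out) standard error of the mean."""
    n = len(values)
    total = sum(values)
    # Mean of the sample with each observation left out in turn.
    loo_means = [(total - v) / (n - 1) for v in values]
    centre = sum(loo_means) / n
    return ((n - 1) / n * sum((m - centre) ** 2 for m in loo_means)) ** 0.5

# Hypothetical records; not taken from the study.
records = [
    {"id": 1, "group": "A", "os_months": 52},
    {"id": 2, "group": "A", "os_months": 47},
    {"id": 3, "group": "B", "os_months": 31},
    {"id": 4, "group": "B", "os_months": 28},
]
virtual = bootstrap_sample(records, size=2000)
print(len(virtual))
print(jackknife_se([r["os_months"] for r in records]))
```

Every virtual patient in this sketch is an exact copy of a real record, so the resampled cohort reproduces the characteristics of the observed patients without adding new independent observations.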
Covariates found to have a statistically significant influence on survival in the extended matched sample upon univariate analysis were entered into a Cox proportional hazards model using forward stepwise selection. We expressed the results as hazard ratios (HR) with 95% confidence intervals (CI).

Quality Assessment of PS
We examined the logistic regression model more closely to evaluate the quality of the PS. First, the rate of correct classification of participants into groups was compared with the null hit rate; this showed an improvement of 36 percentage points (86% vs. 50%). Second, we performed a Hosmer-Lemeshow test (inferential goodness-of-fit), which indicated good model fit (Chi-square = 3.047, p = 0.931). These findings show that the therapeutic approach was not assigned randomly and can therefore be reasonably predicted from the parameters included in the PS estimation. Third, we used an independent-samples t-test to compare the groups in terms of selection bias, i.e., the probability of undergoing hepatic resection or not. We consider this early assessment important for evaluating the magnitude of the bias, as well as for recording the improvement after matching. The results showed that the PS of the two groups were statistically different, indicating selection into the LR group (p < 0.001, standardized mean difference (SMD) = 1.458). We consider the SMD value of paramount importance, insofar as it shows that the two groups should not be compared directly to estimate treatment effects.

Nearest Neighbor Matching within a Specified Caliper
Even though we generated such a large virtual sample, obtaining a perfect match on all 11 covariates when matching patients from the two groups by PS is highly unlikely. With this in mind, we specified a maximum matching distance (a caliper) from the very beginning. A larger caliper would have yielded more matched pairs, but at the cost of weaker bias reduction. We therefore considered a caliper of 0.15 SD of the PS an acceptable interval for reducing selection bias between the groups.
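A minimal Python sketch of one-to-one nearest-neighbor matching within a caliper is given here for illustration. It assumes greedy matching without replacement and a caliper expressed in units of the SD of the pooled raw PS; the matching order, the scale on which the caliper is measured, and all identifiers and scores are assumptions rather than details reported for the SPSS analysis.

```python
import statistics

def caliper_match(ps_treated, ps_control, caliper_sd=0.15):
    """Greedy 1:1 nearest-neighbor matching without replacement.

    ps_treated, ps_control -- dicts mapping patient id -> propensity score
    caliper_sd -- maximum allowed PS distance, in units of the pooled PS SD
    Returns a list of (treated_id, control_id) pairs.
    """
    all_scores = list(ps_treated.values()) + list(ps_control.values())
    caliper = caliper_sd * statistics.stdev(all_scores)
    available = dict(ps_control)   # controls not yet matched
    pairs = []
    # Process treated patients in order of increasing propensity score.
    for t_id, t_ps in sorted(ps_treated.items(), key=lambda kv: kv[1]):
        if not available:
            break
        c_id = min(available, key=lambda k: abs(available[k] - t_ps))
        if abs(available[c_id] - t_ps) <= caliper:
            pairs.append((t_id, c_id))
            del available[c_id]    # each control is used at most once
    return pairs

# Hypothetical propensity scores; not taken from the study.
group_a = {"A1": 0.90, "A2": 0.62, "A3": 0.55}
group_b = {"B1": 0.60, "B2": 0.20, "B3": 0.57, "B4": 0.10}
print(caliper_match(group_a, group_b))
```

Patients whose nearest available neighbor lies outside the caliper remain unmatched, which is how a narrow caliper trades sample size for bias reduction.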

Post-Matching Analyses for Balance Evaluation
In addition, it was mandatory to assess the balance resulting from the PSM model. The literature suggests that the standardized difference in the mean PS between groups A and B should be less than 0.20 and that the ratio of the PS variances in A and B should be close to 1 (within 0.80-1.20) [26,27]. Moreover, differences in covariates in the matched sample, whether continuous or categorical, should not be statistically significant. Once the PS and covariates are balanced, the groups can be compared directly with respect to the study objectives, i.e., OS and the factors that potentially influence it.
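The two balance criteria can be checked as in the following minimal Python sketch. It reads the first criterion as a standardized mean difference computed with the pooled SD of the two groups, which is an assumption; the function name and the PS values are hypothetical.

```python
import statistics

def balance_report(ps_a, ps_b):
    """Standardized mean difference and variance ratio of the PS."""
    var_a = statistics.variance(ps_a)
    var_b = statistics.variance(ps_b)
    pooled_sd = ((var_a + var_b) / 2) ** 0.5
    smd = abs(statistics.mean(ps_a) - statistics.mean(ps_b)) / pooled_sd
    ratio = var_a / var_b
    return {
        "smd": smd,
        "variance_ratio": ratio,
        # Thresholds quoted in the text: SMD < 0.20, ratio within 0.80-1.20.
        "balanced": smd < 0.20 and 0.80 <= ratio <= 1.20,
    }

# Hypothetical matched propensity scores; not taken from the study.
matched_a = [0.50, 0.55, 0.60, 0.65]
matched_b = [0.51, 0.56, 0.59, 0.66]
print(balance_report(matched_a, matched_b))
```

Running the same function on the unmatched groups gives a before-and-after comparison of the kind described for the pre- and post-PSM balance evaluation.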
For all analyses, we considered a p-value < 0.05 statistically significant. We used IBM SPSS Statistics (version 23.0) with the Python extension for all statistical analyses. Table 1 shows the demographic and baseline data of the unmatched patients (93 in all). Tables 2 and 3 show the results of the balance evaluation pre- and post-PSM. Table 4 offers a detailed description of the systemic therapy administered both pre- and post-PSM.

Systemic Therapy Administered in the Unmatched Groups
In the present study, no patient received a single stand-alone systemic therapy. Instead, several types of therapy were combined in carefully selected patients to obtain a maximal response. Thus, multiple types of systemic therapy were used according to the existing guidelines, which were updated during the treatment period, and the therapeutic approach was adjusted to achieve prolonged survival.

Estimating Treatment Effects on the Virtual Matched Cohort
Upon multivariate analysis, seven independent factors were identified as potentially associated with improved overall survival or with a negative outcome (Table 5).

Discussion
Neuroendocrine neoplasms (NEN) are defined as epithelial neoplasms with predominantly neuroendocrine differentiation that originate from neuro-ectodermal cells. Although these cells are distributed throughout the body, 54-90% of NENs arise from the pancreas and gastrointestinal tract, a location currently defined as gastroenteropancreatic (GEP) NEN. In 2019, the World Health Organization (WHO) proposed classification and grading criteria that divide NEN into neuroendocrine tumors (NET) and neuroendocrine carcinomas (NEC), which are further subdivided accordingly [28]. NECs are often aggressive, with a high tendency to metastasize, while NETs usually have a much better 5-year OS, of up to 67%. Fortunately, high-grade NEN are very rare, varying between 0.04 and 0.54 [29,30].
According to a population-based study conducted within the United States Surveillance, Epidemiology, and End Results (SEER) program, the age-adjusted incidence rate of NETs increased 6.4 times from 1973 (1.09 cases per 100,000) to 2012 (6.98 cases per 100,000).
Because a primary hepatic location is extremely rare, representing just 0.3% of NENs [31], and is difficult to prove in the clinical setting, a liver localization of NEN is formally considered LM; when no other NEN location is found clinically, it is defined as LM of unknown primary site [31].
Surgery alone may be curative only for localized NEN, whereas multimodal treatment is always recommended for liver metastases of unknown primary or for primary liver tumors. The role of LR in the treatment of NELM is not conclusively established, and the indication for surgery is currently individualized. Surgical removal of LM is generally not recommended in GEP NEC [32]. However, even in such situations, LR could prove beneficial in selected cases [33,34].
The present study reports the experience of a single center of excellence with NELM patients and offers feedback on the role of LR in a tertiary referral center. Our results indicate that LR alongside systemic therapy improves survival compared with systemic therapy alone. Although the baseline traits of groups A and B are similar in many respects, we found differences in the extent and aggressiveness of the metastatic disease. These were expected, given the current therapeutic approach established by the multidisciplinary team: group A patients, with better ECOG status and lower tumor load than those in group B, were more likely to be candidates for LR. Nevertheless, matching between the two groups was possible mainly because of a subset of patients in group B with similar tumor burden and comorbidities but with liver metastases that were technically unresectable owing to their topography (deep-located).
To reduce the impact of selection bias on the clinical results, we applied several statistical methods. We first computed a PS and then performed PSM. This technique is known to reduce bias in the effect size and to give the study design features of randomized studies by avoiding the comparison of groups that differ significantly in their characteristics; additionally, for rare tumors, it is considered a useful statistical instrument for identifying correct relationships among data [6]. Zhang et al. identified no survival difference between their samples of NELM after matching [35]. Norlen et al., using the same method, found no survival benefit for NELM patients undergoing surgery and RFA compared with patients who underwent systemic therapy alone [36]. Therefore, some authors have considered the conclusions that highlight the benefits of LR to be biased by the selection of patients with a lower liver metastasis burden and fewer comorbidities [15].
However, the literature shows different results between unmatched and matched groups. Daskalakis et al. investigated prophylactic resection in stage IV small-intestine NET. They identified a much longer survival for asymptomatic unmatched patients; however, no difference was detected after PSM [37]. Schreckenbach et al. also found an important survival advantage for the LR group when comparing unmatched patients, but found no improvement in survival when analyzing matched patients [6]. The literature suggests that LR and RFA may be equivalent [36]; Schreckenbach et al. therefore considered that the lack of survival differences between the groups occurred because patients in the comparison group had also received other liver-directed therapies besides resection (e.g., radiofrequency ablation (RFA) and trans-arterial chemoembolization (TACE)) [6]. For this reason, we decided to compare only patients who underwent LR alongside systemic therapies with those who received systemic therapies alone; this choice could also contribute to our results favoring liver resection.
We emphasize that in our study LR patients had their PT removed more frequently than patients in the systemic therapy group. Schreckenbach et al. made the same observation and found that the survival advantage vanished after PSM. Their finding is consistent with the research of Citterio et al., who reported better survival in NELM patients undergoing resection of their PT [22]. In our view, the parameters used to compute the PS in previously published studies [6,8,22,23] focused mainly on the characteristics of the primary tumor. Thus, to compare similar patients and obtain a meaningful survival analysis, we considered it more appropriate to introduce, besides the ECOG performance status, other parameters that depict the LM more precisely: size of the largest LM (measured in centimeters), neoplastic liver volume at diagnosis, and LM grading. In addition, to identify robustly the factors potentially associated with better OS, we analyzed them with the Cox proportional hazards model using forward stepwise selection (on both samples, i.e., the original and the matched one). Upon multivariate analysis, we identified the seven abovementioned factors that independently affect the outcome. The literature suggests that LR with R0 resection margin status is the only possibility of cure and is recommended as the first-choice approach for patients with grade 1 or 2 disease (Ki67 < 20%) and a burden of a single metastasis of any size, with no extrahepatic disease or with effectively manageable limited disease [8]. Unfortunately, curative-intent LR is suitable for only a small percentage of patients; moreover, this gold-standard perspective is supported by limited evidence (retrospective, non-controlled case series of highly selected patients with lower liver burden, probably younger and with fewer comorbidities).
There is evidence that R0 LR confers a 5-year benefit in terms of OS; however, this does not translate into a significant effect over a 10-year period [38]. In reality, R0 LR is palliative in the longer term, offering a longer disease-free survival; regardless of margin status, post-resection relapse should be kept in mind and actively anticipated, although, unfortunately, the optimal approach is still not clear [39].
A Nordic study offers a conceptually divergent view, insofar as it examined intended curative resection +/− radiofrequency ablation of NELM from G3, poorly differentiated NET, which are usually assigned to systemic treatment (median OS: 11 months). In a case series of 32 GEP NELM patients, 20 of whom had Ki67 ≥ 55%, the 3-year and 5-year OS after LR/RFA was 47% and 43%, respectively, with a median post-LR progression-free survival of 8.4 months. A Ki67 of less than 50% and adjuvant chemotherapy were associated with favorable OS [35]. Similarly, we performed LR even in cases of aggressive and rapidly evolving NELM (Ki-67 > 20%: 13%, grade 3: 11%, hepatic involvement 50-75%: 13%), as well as in patients with functional syndrome or with poorly controlled symptoms caused by hormonally hypersecreting tumors. Thus, we performed R1 resections in 2% and R2 resections in 11% of patients.
We believe that a multimodal neoadjuvant/adjuvant treatment concept that combines surgical and medical therapeutic strategies may comprehensively treat macroscopic and microscopic neuroendocrine disease and offer the possibility of long-term disease control. Following the policy of other centers, we also used SSA in the adjuvant setting after LR; the literature supports this approach, reporting a 5-year OS of 34% in patients with metastatic pancreatic NEN treated with LR alone versus 79% in those who received adjuvant SSA (p < 0.01) [40].
The limitations of this study are the potential patient selection bias, its retrospective nature, and the reduced sample size. PSM has a primary shortcoming: it reduces the size of the sample, which in turn lowers statistical power (we obtained a small matched sample of 15%, i.e., 7 patients in each group, 14 patients in all). Therefore, given the reduced size of the sample, we proceeded to a bootstrapping selection with Jackknife error correction [24,25] to obtain a more statistically illustrative sample, on which we could perform a pertinent analysis of the survival differences between groups A and B and identify the factors associated with better OS; the result was a larger sample of 2000 virtual patients.
We repeated the PSM analysis on this virtual sample, with the same caliper and covariates as described above, which yielded the same matching percentage of 15% (i.e., 152 in each group, 304 virtual patients in total), thereby validating the computation; finally, we conducted the survival analysis. One can, however, debate this method of analysis, which is conducted on a surrogate extended matched sample of subjects. Nevertheless, it is a recognized method in the mathematical literature [24,25], in which the virtual patients have the same features as the real ones, as shown by the computations in the subsection Post-Matching Analyses for Balance Evaluation. Although it has not been used before in the dedicated literature, we consider it suitable for reaching an adequate number of patients when reporting a single center's experience with low-incidence tumors. In this respect, our research team pioneered this methodology by applying it in a study of gastrointestinal stromal tumor liver metastases [41].
Naturally, selection bias can best be eliminated in a randomized trial; however, given that LR brings benefits, a fact repeatedly pointed out in the literature, randomization is debated on ethical grounds. At present, in the absence of randomized controlled trials of surgery vs. other modalities [42], the evidence consists of retrospective, non-controlled case series of highly selected patients with lower liver burden, probably younger and with fewer comorbidities, and is subject to major limitations. In addition, the outcomes of randomized controlled trials are often not sufficiently substantiated and therefore cannot be generalized to a larger patient population. Ultimately, a viable option would be the use of non-randomized observational data extracted from databases; these might reflect common clinical practice and supplement the conclusions of clinical trials.
Some authors consider overall survival the gold standard when estimating the clinical benefit of a treatment [43]. Our team's medical statistician considered the analysis of progression-free survival redundant. We also consider that the main purpose of treatment in this particular disease is to increase overall survival, not necessarily progression-free survival. Moreover, it is generally accepted that recurrence after liver resection is inevitable, regardless of resection margin extent, and that the difference between curative-intent resection and debulking is best considered as "resetting the clock" in order to offer patients improved survival [4]. In liver metastases from neuroendocrine tumors, the goal is not to achieve complete liver clearance (i.e., tumor-free patients after R0 liver resection, who would inevitably have tumor recurrence), but rather to prolong survival [4]. In brief, our goal was to improve overall survival in a disease characterized by inevitable recurrence, in which metastatic clearance is meaningless, thus rendering progression and recurrence not significant in this condition.

Conclusions
In patients with LM from NEN, liver resection combined with systemic therapy seems to provide a better overall survival compared with systemic therapy alone, in the absence of NEC or LM grading and high liver tumor burden.
Institutional Review Board Statement: Ethical review and approval were waived for this study because the analyzed data were standard information available in our retrospective institutional database and no additional procedures were conducted on the patients for the present study; thus, no ethical concerns were raised by the institution's Ethics Committee.
Informed Consent Statement: Informed consent was obtained from all subjects involved in the study.

Data Availability Statement:
The data presented in this study are available on request from the authors.