Risk Factors for Pancreatic Cancer in Patients with New-Onset Diabetes: A Systematic Review and Meta-Analysis

Simple Summary
New-onset diabetes patients are a high-risk group for pancreatic cancer. Since pancreatic cancer is responsible for less than 1% of new-onset diabetes cases, testing all of them might lead to an unfavorable risk/benefit balance. Additional risk factors can contribute to a better definition of the population that needs further screening. Currently, 22 studies examining additional risk factors have been published, but often they have a limited number of participants for the individual risk factor. By pooling their results in a meta-analysis, we could establish the magnitude of several risk factors. We found that pancreatic cancer cases were older than controls by 6.14 years (CI 3.64–8.65, 11 studies). Among new-onset diabetes patients, the highest risk of pancreatic cancer involved a family history of pancreatic cancer (3.78, CI 2.03–7.05, 4 studies), pancreatitis (5.66, CI 2.75–11.66, 9 studies), gallstones (2.5, CI 1.4–4.45, 4 studies), weight loss (2.49, CI 1.47–4.22, 4 studies), and high/rapidly increasing glycemia (2.33, CI 1.85–2.95, 4 studies) leading to more insulin use (4.91, CI 1.62–14.86, 5 studies). Risk factors or symptoms that are distinct in the new-onset diabetes patient group and strongly connected to pancreatic cancer are ideal for targeted screening, using a score or model as the first step.

Abstract
(1) Background: Patients with new-onset diabetes (NOD) are at risk of pancreatic ductal adenocarcinoma (PDAC), but the most relevant additional risk factors and clinical characteristics are not well established. (2) Objectives: To compare the risk of PDAC in NOD patients with that in persons without diabetes, and to identify risk factors for PDAC among NOD patients. (3) Methods: Medline, Embase, and Google Scholar were last searched in June 2022 for observational studies of NOD patients that assessed risk factors for developing PDAC. Data were extracted, and a meta-analysis was performed. Pooled effect sizes with 95% confidence intervals (CI) were estimated with DerSimonian & Laird random-effects models. (4) Findings: Twenty-two studies were included, and 576,210 patients with NOD contributed to the analysis, of which 3560 had PDAC. PDAC cases were older than controls by 6.14 years (CI 3.64–8.65, 11 studies). The highest risk of PDAC involved a family history of PDAC (3.78, CI 2.03–7.05, 4 studies), pancreatitis (5.66, CI 2.75–11.66, 9 studies), cholecystitis (2.5, CI 1.4–4.45, 4 studies), weight loss (2.49, CI 1.47–4.22, 4 studies), and high/rapidly increasing glycemia (2.33, CI 1.85–2.95, 4 studies) leading to more insulin use (4.91, CI 1.62–14.86, 5 studies). Smoking (ES 1.20, CI 1.03–1.41, 9 studies) and alcohol (ES 1.23, CI 1.09–1.38, 9 studies) had smaller effects. (5) Conclusion: Important risk factors for PDAC among NOD patients are age, family history, and gallstones/pancreatitis. Symptoms are weight loss and a rapid increase in glycemia. The identified risk factors could be used to develop a diagnostic model to screen NOD patients.


Introduction
The incidence of pancreatic ductal adenocarcinoma (PDAC) doubled over the last 2 decades [1]. The cumulative lifetime risk is 0.91% [2]. Diagnosis of PDAC comes too late for curative treatment in 80% of cases. This contributes to PDAC being one of the deadliest cancers worldwide, accounting for 4.7% of all cancer-related deaths [3]. Among diagnosed patients, the 5-year survival rate does not exceed 10% [4]. In countries that have screening programs for breast and colorectal cancers, PDAC has become the second most frequent cause of cancer mortality [5].
It has been established that all cancers discovered in the first years after diabetes diagnosis were already present and caused the diabetes, and several underlying mechanisms are under research [6][7][8][9][10][11][12]. Diabetes or prediabetes is often the first symptom of PDAC: diabetes diagnosis happens up to 3 years before the cancer diagnosis [13]. Among pancreatic cancer patients, about 80% have a diagnosis of either hyperglycemia or diabetes. Blood glucose levels slowly increase as early as 10 years before PDAC diagnosis, in the prediabetes range [14]. This has led to the idea that NOD or even prediabetes could be a potential clue to the early diagnosis of pancreatic cancer [15].
As pancreatic cancer is responsible for less than 1% of NOD cases, using a biomarker test for every patient with NOD might lead to an unfavorable risk/benefit balance if the performance of the test is not exceptional [16] (Figure 1). To further stratify the group that would need biomarker and then imaging testing, the use of a simple model or score is interesting. This strategy of 3 sieves would be more cost-effective and cause less harm than a strategy leaning on biomarkers and imaging alone.
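The effect of low prevalence on the risk/benefit balance can be illustrated with a short calculation of the positive predictive value (PPV). This is only a sketch: the sensitivity and specificity below are hypothetical values chosen for illustration, not the performance of any real biomarker, and the 5% prevalence stands for an assumed subgroup enriched by a prior score or model.

```python
def positive_predictive_value(prevalence, sensitivity, specificity):
    """Share of positive test results that are true positives (Bayes' rule)."""
    true_pos = prevalence * sensitivity
    false_pos = (1.0 - prevalence) * (1.0 - specificity)
    return true_pos / (true_pos + false_pos)


# Hypothetical test performance, for illustration only.
SENSITIVITY = 0.90
SPECIFICITY = 0.95

# 1% is the upper bound quoted in the text for unselected NOD patients;
# 5% is an assumed prevalence in a subgroup pre-selected by a score.
for prevalence in (0.01, 0.05):
    ppv = positive_predictive_value(prevalence, SENSITIVITY, SPECIFICITY)
    print(f"prevalence {prevalence:.0%}: PPV {ppv:.1%}")
```

Under these assumed values, most positive results in the unselected group would be false positives, which is the motivation for a first sieve that raises the prevalence before a biomarker is applied.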
Currently, 22 studies examining additional risk factors have been published, but often they have a limited number of participants for the individual risk factor. Pooling their results in a meta-analysis should increase the precision.
Based on a systematic review with meta-analysis, this paper aims to assess PDAC risk in NOD individuals and to identify risk factors among NOD patients, which are needed for a stepwise diagnostic strategy.

Materials and Methods
We performed a systematic literature search and last updated it in June 2022 in the three major databases, PubMed (RRID:SCR_004846), Embase (RRID:SCR_001650), and Google Scholar (RRID:SCR_008878), using the terms described in Appendix A. We did not apply any search restrictions. The study is registered in the INPLASY registry (INPLASY202220065).
We included observational studies (both cohort and case-control studies) reporting on NOD patients and assessing additional factors regarding the risk of developing PDAC. Our objective was to identify risk factors that further enrich the NOD population for PDAC. We also aimed to analyze the risk of PDAC in NOD patients compared with non-diabetic persons.
We excluded studies with a sole focus on biomarkers or medication. We did not include case reports, small case series, reviews, opinions, or articles without an English abstract. When we found relevant conference abstracts, we searched by the authors' names for follow-up publications and, if relevant, included those. As the data were presented in a very heterogeneous way, we sometimes contacted the authors for additional data so that their study could be included. However, not all authors answered (Appendix A, Table A1 of studies excluded at the full-text screening).
Two team members voted blindly during each step of the paper selection and quality assessment and made consensus decisions, resolving conflicts by discussion.
We extracted the following data from eligible studies: the name of the first author, journal and publication year, country and period, sample size, study type, patient characteristics, NOD definition, risk of PDAC in the NOD population, and additional risk factors (Figure 2).
Studies reporting associations were used in the meta-analysis. Using the method of DerSimonian & Laird (an estimate of heterogeneity after the Mantel-Haenszel model), we performed a random-effects meta-analysis of risk factors that were reported in at least [...]. [...] performed a quality assessment using the 10 criteria defined by Hoy et al. in a specific bias assessment tool for prevalence studies [18]. We judged overall bias for selected papers, following the corresponding bias flags among the 10 criteria. As the overall number of studies per risk factor was small, we did not exclude any study. To determine the risk of publication bias, we used a funnel plot and the Egger test (Appendix A, Figure A2).
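The random-effects pooling described above can be sketched as follows. This is a minimal illustration, assuming inverse-variance weights on the log scale and standard errors back-calculated from 95% confidence intervals; the function names and the three input effect sizes are hypothetical, and the software and weighting actually used for the review may differ.

```python
import math


def se_from_ci(ci_low, ci_high):
    """Standard error of a log effect size, back-calculated from its 95% CI."""
    return (math.log(ci_high) - math.log(ci_low)) / (2 * 1.96)


def dersimonian_laird(log_effects, std_errors):
    """Pool log effect sizes with a DerSimonian & Laird random-effects model."""
    k = len(log_effects)
    w = [1.0 / se ** 2 for se in std_errors]  # fixed-effect weights
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, log_effects)) / sum_w
    # Cochran's Q measures the spread of the studies around the fixed estimate.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, log_effects))
    df = k - 1
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c)  # between-study variance
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    w_star = [1.0 / (se ** 2 + tau2) for se in std_errors]
    pooled = sum(wi * yi for wi, yi in zip(w_star, log_effects)) / sum(w_star)
    se_pooled = math.sqrt(1.0 / sum(w_star))
    return {
        "pooled": math.exp(pooled),
        "ci_low": math.exp(pooled - 1.96 * se_pooled),
        "ci_high": math.exp(pooled + 1.96 * se_pooled),
        "tau2": tau2,
        "i2": i2,
    }


# Hypothetical effect sizes as (estimate, CI low, CI high), for illustration only.
studies = [(2.0, 1.2, 3.3), (3.5, 1.5, 8.2), (1.4, 0.9, 2.2)]
logs = [math.log(es) for es, _, _ in studies]
ses = [se_from_ci(lo, hi) for _, lo, hi in studies]
print(dersimonian_laird(logs, ses))
```

The sketch needs at least two studies; with a single study the between-study variance cannot be estimated.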
We extracted data on the definition of NOD/subgroups of duration, age, sex, ethnicity, lifelong smoking, alcohol abuse, family history of PDAC, gall stones/cholecystitis, pancreatitis, a rapid increase of glycemia, weight loss, insulin use, obesity, and hyperlipidemia. When more than 2 groups were reported, we either combined groups (for example, former smokers + current smokers = lifelong smokers) or introduced the most meaningful cut-off (for example, for reported groups of BMI (body mass index), we distinguished BMI < 30 = not obese and BMI ≥ 30 = obese). Details are in Appendix A.
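Combining reported groups into a dichotomous exposure can be sketched as below. The function name, the category labels, and all counts are hypothetical and serve only to show the bookkeeping.

```python
def collapse_counts(groups, exposed_labels):
    """Merge reported categories into exposed vs. unexposed (cases, controls) counts."""
    exposed = [0, 0]
    unexposed = [0, 0]
    for label, (cases, controls) in groups.items():
        target = exposed if label in exposed_labels else unexposed
        target[0] += cases
        target[1] += controls
    return tuple(exposed), tuple(unexposed)


# Hypothetical (cases, controls) counts per reported category.
smoking = {"never": (40, 900), "former": (25, 400), "current": (35, 300)}
ever, never = collapse_counts(smoking, exposed_labels={"former", "current"})
print("ever smokers:", ever, "never smokers:", never)

bmi = {"<25": (30, 500), "25-29.9": (45, 700), ">=30": (25, 400)}
obese, not_obese = collapse_counts(bmi, exposed_labels={">=30"})
print("obese:", obese, "not obese:", not_obese)
```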
We also extracted the percentage of NOD patients that developed PDAC (in the cohort studies) and the OR for PDAC for NOD versus no diabetes in the case-control studies.

Studies
The search yielded 779 references, which we imported into Covidence. After removing duplicates and excluding irrelevant studies, we selected 15 studies for data extraction. Reference lists and citation searches (for studies that cited those we had already included) provided an additional 6 studies to be included in the analysis. There was one paper from other sources. Twenty-two studies were included. In total, 576,210 patients with NOD contributed to the analysis, of which 3560 had PDAC (Figure 2).
The study designs were heterogeneous, including retrospective cohorts (some with prospective analysis) (n = 13), case-control studies (n = 8), and one small prospectively recruited screening study, with recruitment at a diabetes clinic [19] (Table 1).

Risk Factors for PDAC in NOD Patients
The strongest demographic risk factor was older age. The overall mean age difference in the studies was more than 6 years (pooled mean age difference 6.14 years, CI 3.64–8.65, I² = 96%, 11 studies), which seemed to be even more pronounced in European studies. Sex was not a statistically significant risk factor: the overall effect size (ES, from either odds ratio or incidence rate ratio) across studies was 1.07 for male sex (CI 0.96–1.18, I² = 28.6%, 18 studies). Race was analyzed in only a few studies, which showed that white patients had a slightly higher risk for PDAC (ES 1.46, CI 1.25–1.71, I² = 0.0%, 5 studies) (Figure 3).

Association between NOD and PDAC
All studies identified a strong association between NOD and PDAC. The overall effect size was 3.35 (CI 2.75–4.09, I² = 83.3%), with a clear tendency of the ES to be higher when the interval since NOD diagnosis was shorter: in the first year after diabetes diagnosis, it was 5.52 (CI 3.61–8.46, I² = 85.6%).

Proportion of NOD Caused by PDAC
As we assumed that all PDAC was present before the diabetes diagnosis and the cause of NOD, we calculated the cumulative percentage of observed PDAC diagnosis. It ranged from 0.13% in the Taiwanese registry study by Tseng et al. [25] to 2.7% in the prospectively recruited screening study by Illés et al. [19]. Studies excluding people under 50 found 0.74% (CI 0.63–0.85%) of PDAC cases among NOD patients. The overall cumulative percentage of PDAC in NOD patients was 0.36% (CI 0.3–0.42, I² = 86.3%, 14 studies) (Figure 5).
[Figure: Meta-analysis of the OR of PDAC in NOD as opposed to no diabetes, with patients grouped by the allowed duration of NOD as defined in the study or by the corresponding subgroup.]
[Figure: The proportion of NOD with pancreatic adenocarcinoma as a probable reason for diabetes in the cohort studies, in subgroups of applied age restriction. It was highest when only NOD patients older than 50 were included.]

Limitations and Strengths of the Study
Despite our systematic approach, we could have overlooked critical studies through our choice of search terms. We minimized this by using several formulations and searching the reference lists of the included papers. The most significant limitations of our findings are biases in the included studies and the disparity of representation of geographical regions. Many studies are from the USA, Europe, and Asia, and one is from Australia, but we could not identify any South American or African studies.
In extracting the data, we were limited by the heterogeneity of the included studies. To have enough data to analyze, we included studies with slightly different definitions. Definitions of how we extracted the data are in Appendix A. The results of our meta-analyses still show considerable heterogeneity, partially explained by the difference in inclusion criteria (for example, age), ethnicity, and definition of new-onset diabetes, all of which we also examined as risk factors or subgroups. Some risk factors might also interact with each other.
A strength of our review is that it gives a complete, systematic overview of the current body of evidence regarding additional risk factors for PDAC in NOD populations. Our paper is, to our knowledge, also the first to conduct a meta-analysis on the risk factors.

Interpretation of Findings
The association between diabetes and PDAC has long been recognized. Several papers have shown that the risk is highest directly after diagnosis and then decreases over subsequent years [41]. The association might be confounded by commonly shared risk factors such as obesity or chronic pancreatitis. The actual frequency of pancreatic cancer in the population of NOD is still unclear, as most studies are retrospective, and the percentage in the only prospective study is much higher. Currently, four prospective studies are recruiting patients and might bring more clarity [42][43][44][45].
It is essential to look specifically at the group of NOD patients, as they differ from the general population. For instance, NOD patients tend to be more obese than the general population, as obesity is a very important risk factor for diabetes mellitus. Within the population of NOD patients, obesity is not associated with more PDAC cases, as our analysis shows. In fact, the mean BMI of pancreatic cancer cases was lower than that of NOD controls. This might be even more pronounced through tumor-induced recent weight loss. It was surprising to find at most a weak association of smoking and alcohol abuse in this meta-analysis. Possibly these risk factors are more important for non-diabetic PDAC patients, or their importance has generally been overestimated.
Risk factors or symptoms that are distinct in the NOD patient group and are strongly connected to pancreatic cancer are ideal for targeted testing. They can be used for statistical model fitting. Our analysis showed that age, family history of PDAC, pancreatitis/cholecystitis, weight loss, and rapid increase in glycemia/necessity of insulin are robust candidates. A tendency to lower lipids, unusual in newly diagnosed diabetes patients, is also interesting. Unfortunately, some of the strongest risk factors are rather rare, which negatively impacts the sensitivity of such models. The correct balance between the frequency and magnitude of those risk factors remains to be found.

Importance of the Presented Work and Future Directions for Early Diagnosis Programs
Screening programs aim to diagnose cancer in the asymptomatic, early stages amenable to curative treatment. Careful scrutiny of the balance between benefits and burdens, cost, survival extension, and quality-adjusted life years gained is essential. As pancreatic cancer has a low incidence in the total population, this is a challenge. The main risk of pancreatic cancer screening is a too-high rate of false-positive results, leading to unnecessary further investigations. Including the identified additional risk factors or symptoms can help define the target population.
A stepwise approach of first identifying a group with increased risk of pancreatic cancer within the NOD population through a scoring or diagnostic model and then further reducing the number of patients needing imaging by a biomarker test has been proposed by Pannala et al. [15]. Several studies have proposed scores to identify the best group for testing [19,22,35,36]. A scoring system has advantages, as it is objective and can be validated. Nevertheless, it also has disadvantages, such as being time-consuming for the family physician or challenging to apply when data is missing. The complexity of a scoring model should consider the balance between the accuracy of prediction and the simplicity of daily use. Considering the slightly different associations of risk factors in different regions (for example, the USA, Europe, Asia), such scoring might differ depending on the location. These regional differences are related not only to the characteristics of PDAC patients but also to NOD. Diabetes is closely related to diet and obesity, which are subject to socio-cultural and genetic influences. In the USA, the average age for diabetes diagnosis is lower than it is in Europe. In Asia, patients with a much lower BMI than that in western countries suffer from an increased risk for diabetes [46]. In conclusion, before using a score as a diagnostic model in a new population, it will need adaptation, or at least calibration and validation.

Embase search of: Pancreas carcinoma AND non-insulin-dependent diabetes mellitus AND high-risk population OR Pancreas carcinoma AND non-insulin-dependent diabetes mellitus AND risk assessment.
Google Scholar search of: All in the title: diabetes risk OR diagnosis "pancreatic cancer".
After the selection of relevant articles, we also checked their references for additional possible matches with our research topic, which had been missed in the initial search, and checked for publications that cite those we previously included.
Additional papers that were already known to the authors or came to their knowledge from other sources were also included.

Appendix A.2. Effect Size of PDAC in NOD Patients versus No Diabetes Patients
Each study had a different definition of NOD. Some used a definition of 1 year, others 2, 3, or 4 years after diabetes diagnosis (Table 1). Other studies had several subgroups for the duration of diabetes. This variability of definitions influenced the results considerably. For that reason, we did a subgroup analysis, either with the subgroups as published or with the definition used in the study. This has the disadvantage that a study without subgroups that uses a 3-year definition will include in that group patients with diabetes onset less than a year ago; as these are not reported separately, we cannot distinguish them.

Appendix A.3. Parameters for Meta-Analysis, Remarks about the Reported Risk Factors/Symptoms
There was considerable heterogeneity between the studies and the published values. To include as many different studies as possible, we did a meta-analysis of the reported effect size. Where a crude odds ratio was reported, we used it. When possible, we calculated an odds ratio from the published case frequency numbers. In a small number of studies, only a relative risk or incidence ratio was published, and there were no case numbers to calculate an odds ratio. In this case, we used the published effect size.
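Calculating a crude odds ratio from published case frequencies can be sketched as follows. The confidence interval uses the standard error of the log odds ratio (Woolf method), and the 0.5 continuity correction for empty cells is an assumption of this sketch rather than a step described in the review; the counts are hypothetical.

```python
import math


def odds_ratio(exposed_cases, exposed_controls, unexposed_cases, unexposed_controls):
    """Crude odds ratio with a 95% CI from the log-OR standard error."""
    a, b, c, d = exposed_cases, exposed_controls, unexposed_cases, unexposed_controls
    if min(a, b, c, d) == 0:
        # Continuity correction for empty cells (an assumption of this sketch).
        a, b, c, d = (x + 0.5 for x in (a, b, c, d))
    estimate = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    ci_low = estimate * math.exp(-1.96 * se_log)
    ci_high = estimate * math.exp(1.96 * se_log)
    return estimate, ci_low, ci_high, se_log


# Hypothetical counts: 60 exposed cases, 700 exposed controls,
# 40 unexposed cases, 900 unexposed controls.
estimate, ci_low, ci_high, se_log = odds_ratio(60, 700, 40, 900)
print(f"OR {estimate:.2f} (CI {ci_low:.2f}-{ci_high:.2f})")
```

The returned standard error of the log odds ratio is the quantity a random-effects model needs as input for each study.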
Smoking: Of the 6 studies with data on smoking, 3 reported 3 categories (never smokers, ex-smokers, and current smokers); the others reported only 2 categories, exposed or not exposed. We put all patients who were ever exposed to smoking into one group.
Alcohol abuse: The reporting on alcohol consumption was also heterogeneous, with 2, 3, or 4 different exposure groups. To group participants according to their alcohol status, we used the cut-off of 20 g/day for risky consumption (independent of gender) and sorted the published groups accordingly.
Obesity: Whenever several groups of body mass index were reported, we introduced dichotomous sorting with the limit of body mass index equal to or above 30 as the definition of "obesity".
Pancreatitis: Some studies reported on "chronic pancreatitis" and others on status "post pancreatitis", but as there were few studies, and acute pancreatitis can lead to chronic pancreatitis, we analyzed them together.
Gall stones/Cholecystitis: Some studies reported on gall stones, others on cholecystitis. We grouped those together under "Gall stones", as cholecystitis without gall stones is very rare.
Rapid increase/High glycemia: Here the heterogeneity was huge, as some papers reported means and differences in means, others a slope, and still others a proportion. Some referred to HbA1c, others to fasting glucose. To be able to meta-analyze it at all, we used all papers that reported numbers of patients, though some reported the numbers with a rapid increase [27,28], while others reported those with high fasting glucose (>160 mg/dL) at diagnosis [35,36].
Insulin use: Medication was not the focus of our review, and studies that looked solely at medication were excluded, so we have not included all studies that look at the association of insulin use with PDAC.

Appendix A.4. Bias Assessment
External validity: The most relevant concern was selection bias. Some studies [19,23,26] sampled selectively from hospital populations that were probably not representative of the general population. Other studies [27,29] examined military veterans, a rather specific cohort comprising predominantly males and not representative of the overall population.
The choice of controls was also prone to some bias, as in some studies [26,28], convenience samples were used. The controls in Ben et al. [23] consisted of a hospital population. Moreover, they excluded all malignant diseases and all patients with diagnoses related to alcohol, tobacco, and drugs, which introduces considerable bias in assessing risk factors.
Internal validity: The registry studies, which did not collect data directly from the patients, are at risk of misclassifications. Generally, the retrospective assessment of records is problematic because missed diagnoses regarding PDAC and diabetes might lead a study to underestimate the connection between those two diseases (Figure A1).
Figure A1. Bias assessment of publications (risk of bias assessment; legend: major concerns, some concerns, no concerns).

External validity
Was the study's target population a close representation of the national population in relation to relevant variables?
Was the sampling frame a true or close representation of the target population?
Was some form of random selection used to select the sample/the control, or was a census undertaken?
Internal validity
Were data collected directly from the subjects (as opposed to a proxy)?
Was an acceptable case definition used in the study?
Was the study instrument that measured the parameter of interest shown to have validity and reliability?
Was the same mode of data collection used for all subjects?
Was the length of the longest duration for the parameter of interest (NOD) appropriate?
Were the numerator(s) and denominator(s) for the parameter of interest appropriate?
Overall risk of bias
For assessing the overall risk of bias, we considered patient selection as the most crucial factor, which dominated our decision.

Appendix A.5. Publication Bias
To test for publication bias (done only for effects reported in at least 10 studies), we calculated a funnel plot for the effect size of PDAC in the NOD population (11 studies with 26 OR (different age groups) reported this), the age difference (reported by 10 studies), and for the effect size of sex as a risk factor within the NOD subgroup (reported by 18 studies). There was no suspicion of relevant publication bias (Figure A2).
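The Egger test that accompanies the funnel plot can be sketched as a simple regression of the standardized effect (effect divided by its standard error) on precision (one divided by the standard error); an intercept far from zero suggests funnel-plot asymmetry. This is a minimal illustration with hypothetical inputs and a hypothetical function name. It returns the t statistic only, which would be compared against a t distribution with n − 2 degrees of freedom.

```python
import math


def egger_regression(effects, std_errors):
    """Egger's test: regress effect/SE on 1/SE; return intercept, its SE, t statistic."""
    n = len(effects)
    x = [1.0 / se for se in std_errors]                     # precision
    y = [e / se for e, se in zip(effects, std_errors)]      # standardized effect
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residual_ss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    sigma2 = residual_ss / (n - 2)
    se_intercept = math.sqrt(sigma2 * (1.0 / n + mean_x ** 2 / sxx))
    t_stat = intercept / se_intercept if se_intercept > 0 else float("nan")
    return intercept, se_intercept, t_stat


# Hypothetical log effect sizes and standard errors, for illustration only.
log_effects = [0.10, 0.35, 0.05, 0.60, 0.20]
std_errors = [0.10, 0.25, 0.15, 0.40, 0.20]
intercept, se_intercept, t_stat = egger_regression(log_effects, std_errors)
print(f"intercept {intercept:.2f}, SE {se_intercept:.2f}, t {t_stat:.2f}")
```

The sketch needs at least three studies with differing standard errors, since the regression is otherwise undefined.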
