Healthy Diet, Polygenic Risk Score, and Upper Gastrointestinal Cancer Risk: A Prospective Study from UK Biobank

Dietary and genetic factors are considered to be associated with upper gastrointestinal (UGI) cancer risk. However, examinations of the effect of a healthy diet on UGI cancer risk, and of the extent to which a healthy diet modifies the impact of genetic susceptibility on UGI cancer, remain limited. Associations were analyzed using Cox regression on UK Biobank data (n = 415,589). Adherence to a healthy diet was measured with a “healthy diet score” based on fruit, vegetable, grain, fish, and meat consumption. We compared adherence to a healthy diet with the risk of UGI cancer. We also constructed a UGI polygenic risk score (UGI-PRS) to assess the combined effect of genetic risk and healthy diet. High adherence to a healthy diet reduced UGI cancer risk by 24% (HR for high-quality diet: 0.76 (0.62–0.93), p = 0.009). A combined effect of high genetic risk and unhealthy diet on UGI cancer risk was observed, with HR reaching 1.60 (1.20–2.13, p = 0.001). Among participants with high genetic risk, the absolute five-year incidence risk of UGI cancer was significantly reduced, from 0.16% to 0.10%, by having a healthy diet. In summary, a healthy diet decreased UGI cancer risk, and individuals with high genetic risk can attenuate UGI cancer risk by adopting a healthy diet.


Introduction
Upper gastrointestinal (UGI) cancer, including esophageal cancer (ESC) and gastric cancer (GC), accounts for 1.7 million new cancer cases and 1.3 million deaths each year worldwide [1]. Previous studies have identified several common environmental risk factors for UGI cancer, including tobacco [2] and alcohol consumption [3], obesity [4], physical activity [5], and dietary factors [6]. Dietary components have received an increasing amount of attention as a potentially modifiable factor [7,8].
It was estimated that 5.1-5.9% of cancer cases each year worldwide can be attributed directly to poor diet [9]. As recently reported by the World Cancer Research Fund International/American Institute for Cancer Research, evidence on the role of individual dietary components in UGI cancer risk remains controversial and limited [10]. People consume diverse foods together rather than individual dietary components, and the resulting complex combination of dietary components is likely to have interactive or synergistic effects [11]. In this context, dietary pattern analysis has been recommended as an approach because it considers the complexity of overall diet and can potentially facilitate public health interventions [12]. In recent years, cancer prevention guidelines have shifted from reductionist or nutrition-centric approaches to more holistic dietary concepts characterized by dietary patterns. Holistic dietary concepts emphasize how food as a whole can prevent chronic disease, associating nutrients, foods or food groups with health rather than studying the role played by nutrient/food interactions in health [13-15]. Adherence to a dietary pattern can be assessed using an a priori method, in which a score is constructed on the basis of a predefined set of criteria (generally based on guidelines) to measure diet quality in a given population [16]; this makes comparisons between different studies and populations easier. A meta-analysis of the association of GC risk with dietary patterns indicated that Western dietary patterns (generally considered unhealthy, characterized by an increased consumption of meat, high-fat dairy products, sweets, and starchy foods) were associated with a higher GC risk, while prudent dietary patterns (generally considered healthy, characterized by higher intake of vegetables and fruits) were protective [17].
A case-control study suggested that adherence to a healthy dietary pattern represented by high loadings of vegetables and fruits was associated with a lower risk of GC [18]. However, there is no large-scale prospective cohort study that systematically investigates the association between dietary patterns and UGI cancer risk.
Accumulating evidence has shown that genetic factors have major roles in the development of UGI cancer [19,20]. Recent genome-wide association studies (GWAS) have identified dozens of genetic variants associated with UGI cancer risk [21,22]. Polygenic risk scores (PRSs), which aggregate the contributions and effects of UGI cancer-associated genetic variants, have been shown to effectively predict incident cases of ESC and GC [23,24]. Both dietary factors and genetic risk play an essential role in the development of the disease. A gene-diet interaction study from the UK Biobank showed that, compared with those in the lowest intraocular pressure (IOP) PRS quartile who consumed no caffeine, those in the highest IOP PRS quartile who consumed ≥321 mg/day showed a 3.90-fold higher glaucoma prevalence [25]. Moreover, one recent study suggested that genetic factors modified the association between diet and cardiovascular disease (CVD) [26]. However, previous studies have typically focused on the separate effects of dietary factors and genetic factors on UGI cancer risk. Few studies have provided insight into the combined effect of dietary factors and genetic factors on UGI cancer risk. It is unclear whether there is a gene-diet combined effect or interaction in the risk of UGI cancer development, and to what extent participants with a high genetic risk of UGI cancer can offset that risk by adhering to a healthy diet.
In this study, we conducted dietary pattern analysis based on examining the adherence to healthy diet and investigated the association of adherence to healthy diet with UGI cancer risk using UK Biobank data. We also tested the hypothesis that dietary factors and genetic factors jointly contribute to incident UGI cancer and that adopting a healthy diet can attenuate UGI cancer risk for individuals at high genetic risk.

Study Design and Participants
UK Biobank is a large, population-based prospective study with genetic and phenotypic data. Between 2006 and 2010, UK Biobank recruited over 500,000 participants from the general population who were aged 40-69 years. Participants were recruited at 22 assessment centers located throughout England, Wales, and Scotland [27]. Participants completed a touch-screen questionnaire, took physical measurements, and provided biological samples at assessment centers. The basic collection details are described elsewhere [28,29]. We excluded participants with prevalent cancer (n = 46,531), those who were missing any dietary information (n = 40,132), and individuals who had withdrawn consent for future linkage (n = 157), leaving 415,589 participants (193,083 men and 222,506 women) included in the study. First, we examined the association between the degree of adherence to a healthy diet, defined by the healthy diet score, and UGI cancer risk. Then, we compared the combined effect and interactions of healthy diet and genetic risk categories on UGI cancer risk across genetic risk groups. Last, we compared the benefit of adherence to a healthy diet within genetic risk groups (Figure 1). For the analyses of healthy diet and genetic risk across and within genetic risk groups, participants without available genetic information were excluded (n = 21,032).

Dietary Intake Assessment
The touch-screen questionnaire, self-completed at baseline, included a food frequency questionnaire (FFQ) that was used to collect the frequency of consumption of the following 12 food items over the previous year: beef, lamb, pork, processed meat, oily fish, non-oily fish, fresh fruit, dried fruit, raw vegetables, cooked vegetables, cereal, and bread. We also created new data fields based on food items: (1) Red meat intake, (2) Total fish intake, (3) Total vegetables intake, (4) Total fruit intake, (5) Whole grains intake, and (6) Refined grains intake. We summed beef, lamb and pork intake to create red meat intake. We also summed oily fish and non-oily fish intake to generate total fish intake. To calculate total vegetable and fruit consumption, we aggregated cooked vegetables and salad/raw vegetables as total vegetables intake, and fresh fruit and dried fruit as total fruit intake. We divided grains into whole grains and refined grains according to the type of bread and cereal mainly consumed. We defined wholemeal or wholegrain bread, bran cereal, oat cereal, and muesli as whole grains; white bread, brown bread, other bread, biscuit cereal, and other cereals as refined grains. We categorized the 12 food items into 7 food groups: red meat, processed meat, total fish, total fruit, total vegetables, whole grains and refined grains. We also defined a serving size for each baseline food item. For bread and cereal, data were provided as weekly consumption and were converted into daily consumption. Detailed serving sizes and coding for each food item/food group are shown in Table S1.
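The aggregation of food items into food groups described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the dictionary keys and function names are hypothetical, and all inputs are assumed to be already expressed in a common unit such as servings per week (the actual serving sizes and coding are given in Table S1 and are not reproduced here).

```python
def aggregate_food_groups(items):
    """Sum individual food items into the combined food groups.

    `items` maps hypothetical item names to intake values that are
    assumed to share one unit (for example, servings per week).
    """
    return {
        "red_meat": items["beef"] + items["lamb"] + items["pork"],
        "total_fish": items["oily_fish"] + items["non_oily_fish"],
        "total_vegetables": items["cooked_vegetables"] + items["raw_vegetables"],
        "total_fruit": items["fresh_fruit"] + items["dried_fruit"],
    }


def weekly_to_daily(weekly_amount):
    """Convert a weekly amount (as recorded for bread and cereal) to a daily one."""
    return weekly_amount / 7


# Made-up intake values, for illustration only.
example = {
    "beef": 1.0, "lamb": 0.5, "pork": 0.5,
    "oily_fish": 1.0, "non_oily_fish": 1.0,
    "cooked_vegetables": 2.0, "raw_vegetables": 1.0,
    "fresh_fruit": 2.0, "dried_fruit": 0.5,
}
groups = aggregate_food_groups(example)
print(groups)
print(weekly_to_daily(14))
```

Whole and refined grains are not included in this sketch because their classification depends on the type of bread and cereal reported, which would need the original coding.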

Healthy Diet Score Estimation
We adopted seven dietary factors and cut-offs according to recommendations for dietary priorities on cardiometabolic health [30], that is, increasing fruit, vegetables, whole grains, and fish consumption, and decreasing red meat, processed meat, and refined grains intake. The healthy diet score was calculated using the seven dietary components: Total fruit ≥ 4 servings/day; Total vegetables ≥ 4 servings/day; Total fish ≥ 2 servings/week; Processed meat ≤ 1 serving/week; Red meat ≤ 1.5 servings/week; Whole grains ≥ 3 servings/day; Refined grains ≤ 1.5 servings/day. Each favorable dietary factor was given one point (Table S2). The score ranged from 0 to 7; we defined score 0-1 as low-quality diet, 2-4 as intermediate-quality diet, and 5-7 as high-quality diet, according to data distribution characteristics. Next, we categorized the scores into unfavorable diet (healthy diet score < 4) and favorable diet (healthy diet score ≥ 4).
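As a minimal sketch of the scoring rule described above, the following Python code awards one point per favorable dietary factor and maps the total to the categories used in this study. The cut-offs are those quoted in the text; the class and function names are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Intake:
    """Hypothetical container for one participant's intake, in servings."""
    fruit_per_day: float
    vegetables_per_day: float
    fish_per_week: float
    processed_meat_per_week: float
    red_meat_per_week: float
    whole_grains_per_day: float
    refined_grains_per_day: float


def healthy_diet_score(x):
    """Return the number of favorable dietary factors met (0 to 7)."""
    criteria = [
        x.fruit_per_day >= 4,
        x.vegetables_per_day >= 4,
        x.fish_per_week >= 2,
        x.processed_meat_per_week <= 1,
        x.red_meat_per_week <= 1.5,
        x.whole_grains_per_day >= 3,
        x.refined_grains_per_day <= 1.5,
    ]
    return sum(criteria)


def diet_quality(score):
    """Three-level category: 0-1 low, 2-4 intermediate, 5-7 high."""
    if score <= 1:
        return "low"
    if score <= 4:
        return "intermediate"
    return "high"


def diet_favorability(score):
    """Two-level category used in the gene-diet analyses."""
    return "favorable" if score >= 4 else "unfavorable"


# Made-up participants, for illustration only.
all_met = Intake(4, 4, 2, 1, 1.5, 3, 1.5)
none_met = Intake(1, 2, 0, 3, 4, 0, 2)
print(healthy_diet_score(all_met), diet_quality(healthy_diet_score(all_met)))
print(healthy_diet_score(none_met), diet_favorability(healthy_diet_score(none_met)))
```

Note that a score of 4 falls in the intermediate-quality group under the three-level definition but counts as a favorable diet under the two-level definition.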

PRS Calculation and UGI-PRS Construction
The genotyping process and the single nucleotide polymorphisms (SNPs) used in UK Biobank (UKB) research have been described elsewhere in detail [31,32]. We extracted variants with p < 5 × 10⁻⁸ and minor allele frequency (MAF) ≥ 0.01 from the GWAS with the largest sample size in European ancestry [23,33]. For variants that were not available in the UKB genotyping data, strongly correlated SNPs (r² > 0.8) were included instead. If more than one correlated variant was reported in the same locus, the SNP with the smallest reported p-value was selected using the linkage disequilibrium (LD) clumping procedure (at r² < 0.2) in PLINK. We excluded SNPs with allele mismatches or MAF differences > 0.10 compared with the European population of 1000 Genomes, as well as palindromic SNPs (A/T, G/C) with an MAF ≥ 0.45. Finally, we estimated site-specific PRSs based on 13 SNPs for ESC and 3 SNPs for GC (Table S3). No SNPs were shared or in high LD (r² > 0.6) with each other across the site-specific PRSs. First, each site-specific PRS was created following an additive model [34], by multiplying the genotype dosage of each risk allele by its respective effect size and summing across all variants. Then, we built a UGI-PRS to assess UGI cancer risk by summing the site-specific PRSs weighted by the ESC and GC age-standardized incidence rates in the UK population [35]. Cancer site-specific PRSs have been shown to effectively identify individuals at high risk of overall cancer and gastrointestinal cancer [36,37]. The UGI-PRS was divided into three levels of genetic risk: low (lowest quintile), intermediate (quintiles 2-4), and high (top quintile).
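The additive PRS construction described above can be illustrated with a short Python sketch. It is not the authors' code: the function names are hypothetical, the dosages and effect sizes are made up, the incidence-rate weights are applied without normalization (an assumption, since the exact weighting formula is not given in the text), and quintile cut points are taken from the standard library's default quantile method.

```python
import statistics


def site_specific_prs(dosages, effect_sizes):
    """Additive PRS: sum over SNPs of risk-allele dosage times effect size."""
    if len(dosages) != len(effect_sizes):
        raise ValueError("dosages and effect_sizes must have the same length")
    return sum(d * b for d, b in zip(dosages, effect_sizes))


def ugi_prs(prs_esc, prs_gc, rate_esc, rate_gc):
    """Combine site-specific PRSs, weighting each by an incidence rate.

    Assumption: a plain weighted sum with unnormalized weights.
    """
    return rate_esc * prs_esc + rate_gc * prs_gc


def genetic_risk_groups(scores):
    """Label each score as low (bottom quintile), intermediate, or high (top quintile)."""
    cuts = statistics.quantiles(scores, n=5)  # four cut points
    low_cut, high_cut = cuts[0], cuts[3]
    labels = []
    for s in scores:
        if s <= low_cut:
            labels.append("low")
        elif s > high_cut:
            labels.append("high")
        else:
            labels.append("intermediate")
    return labels


# Made-up values, for illustration only.
prs_example = site_specific_prs([2, 1, 0], [0.1, 0.2, 0.3])
labels = genetic_risk_groups(list(range(1, 11)))
print(prs_example)
print(labels)
```

In a real cohort the quintile boundaries would be computed once on the analysis population and then applied to every participant, as done here on the toy list of ten scores.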

Outcome Assessment
The outcomes in the study were first primary incident UGI cancer events (ESC and GC), which were identified through the national cancer registries of England, Wales, and Scotland and coded according to the 10th revision of the International Classification of Diseases (ICD-10) as C15 for ESC and C16 for GC. Following baseline recruitment (2006-2010), participants were followed from baseline until UGI cancer diagnosis, death, completion of follow-up, or loss to follow-up, whichever occurred first. The time at risk was calculated from the date the participant attended the assessment center (Data Field: 53), the date of cancer diagnosis (Data Field: 40005), and the end date of follow-up. The end date of follow-up was updated to September 2018 for Scotland and to June 2021 for England and Wales. For participants who developed UGI cancer, time at risk was the interval between the date of attending the assessment center and the date of cancer diagnosis. For participants without UGI cancer, time at risk was the interval between the date of attending the assessment center and the end date of follow-up.
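The time-at-risk rule described above (follow-up ends at the earliest of diagnosis, death, loss to follow-up, or the administrative end date) can be sketched as follows. The function name, its arguments, and the example dates are hypothetical; in particular, the text gives only the month of the end of follow-up, so the exact day used here is an assumption.

```python
from datetime import date


def time_at_risk(baseline, end_of_follow_up, diagnosis=None, death=None, lost=None):
    """Return (years at risk, event indicator) for one participant.

    Follow-up stops at the earliest available date among diagnosis,
    death, loss to follow-up, and the administrative end of follow-up.
    The event indicator is True only if diagnosis is that earliest date.
    """
    candidates = [end_of_follow_up]
    candidates += [d for d in (diagnosis, death, lost) if d is not None]
    exit_date = min(candidates)
    event = diagnosis is not None and diagnosis == exit_date
    years = (exit_date - baseline).days / 365.25
    return years, event


# Made-up participants; the end date assumes 30 June 2021 (day is an assumption).
end = date(2021, 6, 30)
case_years, case_event = time_at_risk(date(2008, 1, 1), end, diagnosis=date(2012, 1, 1))
cens_years, cens_event = time_at_risk(date(2008, 1, 1), end)
print(case_years, case_event)
print(cens_years, cens_event)
```

The resulting pair of duration and event indicator is the usual input format for a Cox model in most survival analysis software.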

Statistical Analysis
Cox proportional hazard models were used to investigate the associations between healthy diet and UGI cancer risk and to estimate hazard ratios (HRs) and 95% confidence intervals (CIs), with follow-up time as the timescale. The proportional hazard assumptions were checked using Schoenfeld residuals. We determined UGI cancer risk across healthy diet score categories (low-quality, intermediate-quality, and high-quality diet groups). We also estimated the UGI cancer risk per two-point increase in healthy diet score. Furthermore, we investigated the combined effect and interactions of dietary and genetic factors on UGI cancer risk according to healthy diet and genetic risk categories, to explore the extent to which healthy diet modified the associations between genetic susceptibility and UGI cancer risk across genetic risk groups. We examined the results for potential additive and multiplicative interaction between healthy diet and genetic risk [38]. The additive interaction was evaluated using two indexes: the relative excess risk due to the interaction (RERI) and the attributable proportion due to the interaction (AP) [39]. The 95% CIs of the RERI and AP were generated by drawing 5000 bootstrap samples from the estimation data set [40]. If there was no additive interaction, the CIs of the RERI and AP would include 0. In addition, we used the RHR (ratio of HRs) to evaluate gene-diet multiplicative interactions by including cross-product terms of healthy diet with genetic risk in the models. The 95% CI of the RHR would contain 1 if there was no multiplicative interaction. We also calculated the absolute risk as the percentage of incident UGI cancer cases occurring in each genetic risk group, to compare the benefit of adherence to a healthy diet within genetic risk groups.
The absolute risk reduction was calculated as the difference in UGI cancer incidence between the given groups, and the difference in five-year event rates was then extrapolated. The 95% CIs for the absolute risk reduction were calculated by drawing 1000 bootstrap samples from the estimation dataset.
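For readers unfamiliar with the interaction indexes named above, the following Python sketch shows the standard point-estimate formulas for RERI and AP, together with a simple absolute risk difference. The function names are hypothetical, and the single-exposure hazard ratios in the example are made-up values (only the joint HR of 1.60 and the five-year risks of 0.16% and 0.10% come from this study). Bootstrap confidence intervals are omitted.

```python
def reri(hr_both, hr_gene_only, hr_diet_only):
    """Relative excess risk due to interaction.

    All HRs are relative to the doubly unexposed reference group (HR = 1).
    A value of 0 corresponds to no additive interaction.
    """
    return hr_both - hr_gene_only - hr_diet_only + 1


def attributable_proportion(hr_both, hr_gene_only, hr_diet_only):
    """Proportion of the joint effect attributable to interaction (RERI / HR_both)."""
    return reri(hr_both, hr_gene_only, hr_diet_only) / hr_both


def absolute_risk_reduction(risk_reference, risk_comparison):
    """Difference in absolute risk, in the same units as the inputs."""
    return risk_reference - risk_comparison


# hr_gene_only and hr_diet_only are hypothetical values for illustration.
example_reri = reri(1.60, 1.30, 1.20)
example_ap = attributable_proportion(1.60, 1.30, 1.20)
# Five-year risks (in %) reported for the high genetic risk group.
example_arr = absolute_risk_reduction(0.16, 0.10)
print(example_reri, example_ap, example_arr)
```

With the reported five-year risks, the absolute risk reduction in the high genetic risk group is about 0.06 percentage points.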
Two models were applied in our analyses: a minimally adjusted model, adjusted for age at recruitment, sex, Townsend deprivation index, assessment center (10 regions) and ethnic background; and a fully adjusted model, additionally adjusted for BMI (kg/m², <25, 25-29.9, ≥30), glycosylated hemoglobin (HbA1c, mmol/mol, quintiles), smoking status (never, former, current, unknown), alcohol intake frequency (never/rare, twice or less per week, at least three times per week, unknown), education (college or university degree, no degree, unknown), multimorbidity (none, ≥1, unknown), physical activity (<600 MET minutes/week, 600-3000 MET minutes/week, >3000 MET minutes/week) [41] and family cancer history (yes, no, unknown) (Table S4). We additionally adjusted for the top 10 genetic principal components of ancestry in the analyses including genetic risk. Missing data for categorical variables were coded as a separate "unknown" category, while missing data for continuous variables were imputed with sex-specific median values.
We performed the following sensitivity analyses to further investigate the robustness of our results: (1) excluding participants who reported that they had made a major change in their diet in the past 5 years due to illness (n = 41,292); (2) excluding participants followed up for less than two years (n = 1648); (3) excluding non-white participants (n = 21,680).
All statistical analyses were performed with R software version 4.2.0 (R Core Team, Auckland, CA, USA). All p values were two-sided and p < 0.05 was considered statistically significant.

Participants and Characteristics
A total of 415,589 participants (53.54% women) had available dietary data in this study. The median follow-up period was 12.12 (interquartile range: 11.32-12.84) years for UGI cancer incidence. A total of 1389 UGI cancer cases developed during the period, including 564 GC and 831 ESC. The baseline characteristics of participants are shown in Table 1. Values in Table 1 are presented as mean ± SD or n (%) unless otherwise indicated.

Healthy Diet and the Risk of UGI Cancer
The association between adherence to a healthy diet and UGI cancer risk is shown in Table 2. Individuals with a high-quality diet, which included a high intake of fruit, vegetables, fish and whole grains and a reduced amount of red meat, processed meat and refined grains, had a lower risk of incident UGI cancer compared with those in the low-quality diet group, with an HR of 0.76 (95% CI: 0.62-0.93, p = 0.009). A two-point increase in healthy diet score was associated with a lower UGI cancer risk, with an HR of 0.90 (95% CI: 0.83-0.97, p = 0.006). Similar results were noted in a series of sensitivity analyses (Table S5).

Combined Effect and Interactions of Healthy Diet and Genetic Risk on UGI Cancer Risk
We determined that participants who had an unhealthy diet and were in the high genetic risk group had an approximately 1.60-fold risk of UGI cancer (HR: 1.60, 95% CI: 1.20-2.13, p = 0.001) compared with participants with a healthy diet and low genetic risk (Figure 2). The results of the sensitivity analyses did not change materially (Figure S1A-C). The RERI, AP, and RHR were not significant, which indicated no additive or multiplicative interaction between healthy diet and genetic risk on the risk of UGI cancer (Table 3).
Figure 2. Risk of incident UGI cancer according to healthy diet and genetic risk categories in the UKB cohort. The HRs were estimated using Cox proportional hazard models with adjustment for age at recruitment, sex, assessment center (10 regions), ethnicity, Townsend deprivation index, education, BMI, glycosylated hemoglobin (HbA1c), smoking status, alcohol intake frequency, physical activity, multimorbidity, family history of cancer, and the first 10 principal components of ancestry. For the analyses of healthy diet and genetic risk across and within genetic risk groups, participants without available genetic information were excluded (n = 21,032). Unfavorable diet: healthy diet score < 4; favorable diet: healthy diet score ≥ 4.
Notes to Table 3: RERI = relative excess risk due to the interaction; AP = attributable proportion due to the interaction; RHR = ratio of hazard ratios. Genetic risk was defined by PRS: low (lowest quintile), intermediate (quintiles 2-4), and high (quintile 5). Cox proportional hazards regression was adjusted for age at recruitment, sex, assessment center (10 regions), Townsend deprivation index, ethnicity, education, BMI, glycosylated hemoglobin (HbA1c), smoking status, alcohol intake frequency, physical activity, multimorbidity, and family history of cancer.

Benefits of Adherence to a Healthy Diet with UGI Cancer Risk
In further stratified analyses by genetic risk category, with an unhealthy dietary pattern as the reference group, we found that in the intermediate and high genetic risk groups, similar reductions in UGI cancer risk were observed in those who adhered to a healthy dietary pattern compared with those who adhered to an unhealthy dietary pattern. Among participants with an intermediate genetic risk, the absolute five-year incidence risk of UGI cancer was 0.13% for participants with an unhealthy dietary pattern versus 0.11% for those with a healthy dietary pattern. Similarly, for individuals with high genetic risk, the absolute five-year incidence risk of UGI cancer decreased from 0.16% for participants with an unhealthy dietary pattern to 0.10% for those with a healthy dietary pattern (Table 4). The results of the sensitivity analyses were similar (Table S6).
Notes to Table 4: Cox proportional hazards regression was adjusted for age at recruitment, sex, assessment center (10 regions), Townsend deprivation index, ethnicity, education, BMI, glycosylated hemoglobin (HbA1c), smoking status, alcohol intake frequency, physical activity, multimorbidity and family history of cancer. Unfavorable dietary pattern: healthy diet score < 4; favorable dietary pattern: healthy diet score ≥ 4.

Discussion
In this large, prospective study using UK Biobank, we performed dietary pattern analyses based on a healthy diet score and examined the association with UGI cancer risk. We found that higher diet quality was associated with a lower risk of UGI cancer. Across genetic risk groups, analysis further showed that individuals with high genetic risk and an unhealthy dietary pattern were at a greater risk of UGI cancer compared to those with low genetic risk and a healthy dietary pattern. Within genetic risk groups, analysis indicated that adherence to a healthy dietary pattern was consistently associated with a decreased absolute five-year incidence risk of UGI cancer in the intermediate and high genetic risk groups.
Recent studies suggest that dietary pattern analysis is a good way to explore the relationship between diet and cancer risk. A systematic review and meta-analysis of prospective cohort studies supported an association between healthy dietary patterns and decreased risks of colon and breast cancer [42]. One study that focused on nutrition and breast cancer showed that adherence to a healthy dietary pattern might improve overall survival after diagnosis of breast cancer [43]. We performed dietary pattern analyses based on healthy diet score and the risk of UGI cancer. A systematic review and meta-analysis on dietary patterns and gastric cancer risk indicated that there is an approximately two-fold difference in GC risk between a 'prudent/healthy' diet and a 'Western/unhealthy' diet [17]. A population-based case-control study suggested that a diet high in fruit and vegetables may decrease the risk of ESC [44]. Another systematic review and meta-analysis suggested that a healthy dietary pattern was significantly associated with a decreased risk of ESC [45]. Our study found similar results, i.e., that adherence to a healthy diet reduced the UGI cancer risk. We also compared the benefit of adherence to a healthy dietary pattern within genetic risk groups based on the calculation of absolute five-year incidence risk of UGI cancer. We found that individuals with intermediate and high genetic risk who adopted a healthy diet had a decreased risk of developing UGI cancer. For participants with high genetic risk, the absolute five-year incidence risk of UGI cancer was significantly reduced from 0.16% to 0.10% by having a healthy diet.
Taken together, our findings along with previous evidence not only demonstrated the significance of adherence to healthy diet, but also provided collective support for public health interventions to promote a healthy dietary pattern for everyone, especially people with intermediate or high genetic risks, which will ultimately lead to a reduction of UGI cancer burden.
It has been estimated that ESC and GC could be prevented in 54% and 59% of patients in the UK, respectively [46]. It is important to understand the contribution of modifiable risk factors to UGI cancer and how they affect or add to the inherited genetic factors. At present, several studies have summarized the association of diet and nutrition with UGI cancer risk; however, reported meta-analytic estimates from observational studies may not represent causality. Instead, they may result from common biases across studies, such as exposure measurement error, residual confounding, and publication bias, which weaken the strength of the scientific evidence [47-49]. In addition, few studies have focused on gene-diet combined effects and interactions on the risk of UGI cancer. We systematically and comprehensively investigated the association between modifiable dietary factors and UGI cancer risk and tested the hypothesis that UGI cancer risk can be modified or reduced by adopting a healthy diet in a large prospective cohort study.
UK Biobank is a large, general population-based prospective cohort, which provides health outcomes and a wide range of potential confounders, including diet. One of the inevitable problems with large-sample studies is that p values are more likely to be statistically significant. In detail, a p value reflects the distance between the data and the null hypothesis, measured by an estimate of the parameter of interest. This distance is usually expressed in units of the standard error. The standard error shrinks as the sample size increases; in a very large sample, the standard error becomes very small, so that even a negligible distance between the estimate and the null hypothesis can be statistically significant. Therefore, to reduce type I errors, the null hypothesis should not be rejected on the basis of the p value alone in a large-sample study. These problems can be addressed by additionally reporting effect sizes and 95% confidence intervals (CIs) [50]. In our study, we provided 95% CIs as well as p values to more cautiously infer the association between healthy diet and UGI cancer.
The present study has several limitations. First, participants in the UK Biobank are mostly of European descent; therefore, the findings should be generalized to other populations with caution. Second, the use of self-reported FFQ data could introduce some level of recall bias. Third, it is generally accepted that associations between nutrients and disease should only be considered primary if the effects are independent of energy intake [51]. We were not able to adjust for total energy intake because the brief baseline touchscreen FFQ only covered some commonly consumed foods. Therefore, our findings may be biased by the differences in body size, physical activity, and metabolic efficiency that are reflected in energy intake. Last, covariates were evaluated only once at baseline, and changes during follow-up or competing risks from other illnesses may have affected the risk estimates.

Conclusions
Our findings confirm and broaden the results from previous studies. Healthy diet was associated with a lower risk of UGI cancer. Dietary factors and genetic risk had a combined effect on risk of UGI cancer. Individuals with high genetic risk can attenuate UGI cancer risk by adopting a healthy dietary pattern.
Supplementary Materials: The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/nu15061344/s1, Figure S1A: Risk of incident UGI cancer according to healthy diet and genetic risk categories in the UKB cohort after excluding participants who reported changing their diet in the last 5 years due to illness; Figure S1B: Risk of incident UGI cancer according to healthy diet and genetic risk categories in the UKB cohort after excluding participants with less than 2 years of follow-up; Figure S1C: Risk of incident UGI cancer according to healthy diet and genetic risk categories in the UKB cohort after excluding non-white participants; Table S1: Serving size and coding of intake for each touchscreen food item/food group; Table S2: Healthy diet score factors definition; Table S3: Single nucleotide polymorphisms utilized to build the polygenic risk scores for UGI cancer; Table S4: Definition of covariates; Table S5: Associations between healthy diet and the risk of UGI cancer after excluding participants who reported changing their diet in the last 5 years due to illness, after excluding participants with less than 2 years of follow-up, or after excluding non-white participants; Table S6: UGI cancer risk associated with healthy diet by genetic risk level after excluding participants who reported changing their diet in the last 5 years due to illness, after excluding participants with less than 2 years of follow-up, or after excluding non-white participants.