Comparing Calculated Nutrient Intakes Using Different Food Composition Databases: Results from the European Prospective Investigation into Cancer and Nutrition (EPIC) Cohort

This study aimed to compare calculated nutrient intakes from two different food composition databases using data from the European prospective investigation into cancer and nutrition (EPIC) cohort. Dietary intake data of the EPIC cohort was recently matched to 150 food components from the U.S. nutrient database (USNDB). Twenty-eight of these nutrients were already included in the EPIC nutrient database (ENDB, based upon country-specific food composition tables) and were used for comparison. Paired sample t-tests, Pearson’s correlations (r), weighted kappas (κ) and Bland-Altman plots were used to compare the dietary intake of 28 nutrients estimated by the USNDB and the ENDB for 476,768 participants. Small but significant differences were shown between the USNDB and the ENDB for energy and macronutrient intakes. Moderate to very strong correlations (r = 0.60–1.00) were found for all macro- and micronutrients. A strong agreement (κ > 0.80) was found for energy, water, total fat, carbohydrates, sugar, alcohol, potassium and vitamin C, whereas a weak agreement (κ < 0.60) was found for starch, vitamin D and vitamin E. Dietary intakes estimated via the USNDB compare adequately with those obtained via the ENDB for most macro- and micronutrients, although the agreement was weak for starch, vitamin D and vitamin E. The USNDB will allow exposure assessments for 150 nutrients to investigate associations with disease outcomes within the EPIC cohort.


Introduction
Detailed information on the nutritional composition of foods can be found in food composition databases (FCDBs) [1]. FCDBs are usually country- or region-specific, and represent a fundamental information resource for nutrition science by, for example, estimating exposure to various food components with both positive and negative health outcomes [1][2][3]. However, when estimating dietary intake across multiple countries with different eating cultures and traditional diets, the lack of a single standardised dietary database that provides internationally comparable nutritional data poses the methodological challenge of how to determine the nutrient content of consumed foods [3,4].
In Europe, considerable efforts have been made to harmonise national FCDBs [3,5,6]. Within the frame of the European prospective investigation into cancer and nutrition (EPIC) study, the EPIC nutrient database (ENDB) project was conducted between 2002 and 2005. The ENDB was a pioneer project for the harmonisation of food composition data across 10 European countries [7], resulting in an end-user nutrient database including information on 28 food components.
Despite the efforts made through the European projects to harmonise European FCDBs, many national FCDBs still have a limited food list, lack information for many food items, especially on micronutrients, and are still not comparable for important food components because of methodological variations [3,4]. These variations may include different analytical approaches and methods for nutrient calculations, definitions of nutrients and units of measurement [4,8]. As a result of the limitations found in national FCDBs, extending the ENDB with extra food components using the EPIC country-specific FCDBs is not feasible, and would introduce substantial measurement error in dietary intakes. This measurement error may lead to non-differential misclassification of exposure and reduced power to detect associations with disease outcomes.
As part of the EPIC study, a new dietary intake database has now been compiled to investigate the intake of a broader range of nutrients than initially covered by the ENDB project (e.g., individual fatty acids, amino acids, individual sugars). For this, the EPIC dietary intake data was matched to the U.S. nutrient database (USNDB, National Nutrient Database for Standard Reference of the U.S. Department of Agriculture-USDA), which includes more than 8000 foods and 150 food components [9].
As a matter of relative validation, the aim of this study was to compare the already available dietary intakes of 28 nutrients in the ENDB with the same nutrients from the USNDB. The selection of the USNDB as a more extensive source of food components reflects a pragmatic approach when more comprehensive standardisation of national FCDBs is not feasible, and will allow exposure assessments for 150 nutrients to investigate associations with disease outcomes within the EPIC cohort.

EPIC Study Design
EPIC is a large on-going multicentre prospective cohort study consisting of 521,324 adults (366,521 women and 153,437 men) mostly aged 35-70 years at recruitment [10]. The objective of this cohort was to investigate the role of diet, lifestyle, metabolic factors and genetics in cancer development, as well as other non-communicable diseases [10,11]. Study participants were enrolled between 1992 and 2000 from 23 centres across 10 European countries: Denmark, France, Germany, Greece, Italy, The Netherlands, Norway, Spain, Sweden and the United Kingdom. EPIC's study rationale, study population and data collection have been described elsewhere [10,11]. All participants provided written informed consent and the ethical review boards from the International Agency for Research on Cancer (IARC-Lyon, France) and from all local centres approved the study.

Dietary Intake Assessment Methods
The collection of long-term dietary intake data was conducted at baseline through country- or centre-specific validated dietary questionnaires (DQ), spanning the previous 12 months and designed to capture the geographical specificity of the diet [10]. In most centres, DQs were self-administered food-frequency questionnaires, with the exception of the centres in Ragusa (Italy), Naples (Italy) and Spain, where food-frequency questionnaires were administered by face-to-face interviews [10]. Using different dietary assessment methods across study countries and centres may induce systematic and random errors in dietary intake measurement when the dietary data from the different countries is combined. To address this issue, a calibration approach was developed to adjust for possible systematic over- or underestimation in dietary intake measurements [12]. This approach included a single 24-h dietary recall (24-HDR), conducted by trained interviewers using a computer-assisted, interactive dietary interview program (EPIC-soft) [13]. This procedure was standardised within and between all EPIC centres. The 24-HDR was collected at baseline for a representative sample (N = 36,994) of the entire EPIC cohort [12].

Initial Compilation of a Harmonised Nutrient Database for the EPIC Project
The ENDB aimed to harmonise the nutrient values of the national FCDBs across the 10 participating EPIC countries and originally focussed on energy and 26 nutrients [7]. An inventory of nutrient definitions, methods of analyses and modes of expression among the FCDBs in nine EPIC countries formed the basis of the ENDB [4]. Since 2010, a folate database has been compiled as an extension of the ENDB [14], based on a new inventory and critical evaluation of folate data in 15 European and three non-European FCDBs [15]. The ENDB counts 28 fully documented food components today (Table 1).

Matching of the EPIC Food List with the USNDB
To match the EPIC food list with the USNDB, we used the USNDB release 26 (October 2013) with 8463 food items and 150 food components, with further completion using the 28th release (September 2015) containing 8789 food items. The USNDB is highly transparent about the source of its nutritional data, documents the methods, definitions, and calculation and imputation methods used, and gives information on the data quality assessment for all analytical nutrient profiles [9].
The procedure to match the EPIC food list with the USNDB builds upon the standardised procedure used to compile the ENDB [7,14]. In brief, consumed foods derived from the dietary assessments in EPIC were matched as closely as possible to foods available in the USNDB. Nutrient values of foods unavailable in the USNDB were estimated either by recipe calculation or by weighted averaging, i.e., the nutrient values of related foods were averaged using their consumption frequencies as weights (e.g., vegetable oil: weighted average of vegetable oils including olive oil, rapeseed oil and corn oil).
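The weighted-averaging step can be illustrated with a minimal sketch. The function name, the nutrient values and the consumption shares below are hypothetical and are not taken from the USNDB or from EPIC; the sketch only assumes that each related food contributes in proportion to how often it is consumed.

```python
def weighted_average_nutrient(values_per_100g, consumption_weights):
    """Consumption-weighted mean of one nutrient across related foods."""
    if set(values_per_100g) != set(consumption_weights):
        raise ValueError("both mappings must list the same foods")
    total_weight = sum(consumption_weights.values())
    if total_weight <= 0:
        raise ValueError("consumption weights must sum to a positive number")
    weighted_sum = sum(
        values_per_100g[food] * consumption_weights[food]
        for food in values_per_100g
    )
    return weighted_sum / total_weight


# Hypothetical illustration values (mg per 100 g), not real database entries.
vitamin_e_mg_per_100g = {"olive oil": 14.0, "rapeseed oil": 17.0, "corn oil": 15.0}
# Hypothetical relative consumption frequencies of each oil.
relative_consumption = {"olive oil": 0.5, "rapeseed oil": 0.3, "corn oil": 0.2}

generic_vegetable_oil = weighted_average_nutrient(
    vitamin_e_mg_per_100g, relative_consumption
)
print(round(generic_vegetable_oil, 2))
```

The weights do not need to sum to one, since the function normalises by their total.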

Quality Assessment of the Matching Procedure
Three quality controls were performed to safeguard the accuracy of the matching procedure linking EPIC food data to the USNDB. First, a random sample of food items was matched in duplicate to the USNDB by two researchers independently. Second, the fully matched food list and the assigned nutrient values were checked for errors once by an accredited nutritionist and once by an expert of the ENDB project. Finally, systematic quality controls were performed based upon the distributions of intakes. Extreme intake values were inspected and identified errors were corrected.

Statistical Analyses
The reported food intakes of 476,768 participants for the DQ data (whole EPIC sample) and 34,064 participants for the 24-HDR data (EPIC sample with 24-HDR data) were analysed. For the DQ data, participants with missing information on more than 80% of the relevant dietary questions (N = 6837) and participants with implausible energy intakes (i.e., the top and bottom 1% of the distribution of the ratio of reported total energy intake to energy requirement; N = 10,242) were excluded from the analyses. No participants were excluded from the analyses of the 24-HDR data, because of its detailed and standardised nature and built-in quality controls. Data from Greece were not included in this study. Missing nutrient values were replaced by zeros to allow the calculation of nutrient intakes for all subjects.
Mean dietary intakes of energy and the other 27 components, their standard deviations (sd) and medians were calculated using the USNDB and the ENDB. Differences in dietary intakes were reported as absolute mean differences and paired samples t-tests were conducted. Pearson correlation coefficients were calculated to investigate the associations of dietary intakes estimated using the USNDB with those estimated using the ENDB. As a measurement of agreement between both methods, rather than a measurement of differences, Bland-Altman plots with their corresponding limits of agreement were presented for each of the nutrients [16], using the Statistical Package for the Social Sciences (SPSS Inc., Chicago, IL, USA; version 26). Weighted kappa coefficients (κ) were calculated to assess the agreement on the classification of individual dietary intakes into quintiles. Cut-offs for quintiles of equal size were assigned separately for the USNDB and ENDB, based on the distribution of dietary intake of each food component.
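As a minimal sketch of the paired comparison described above, the following computes the mean difference (bias), the limits of agreement and the paired-samples t statistic for one nutrient. It is not the SAS or SPSS code used in the study; the function name and the intake values are hypothetical, and the limits of agreement are assumed to follow the conventional bias ± 1.96 standard deviations of the differences.

```python
from math import sqrt
from statistics import mean, stdev


def paired_comparison(usndb, endb):
    """Summarise paired differences (USNDB minus ENDB) for one nutrient."""
    if len(usndb) != len(endb) or len(usndb) < 2:
        raise ValueError("need two equally long series with at least 2 values")
    diffs = [u - e for u, e in zip(usndb, endb)]
    bias = mean(diffs)   # mean difference, the centre line of a Bland-Altman plot
    sd = stdev(diffs)    # sample standard deviation of the differences
    # The t statistic is undefined when all differences are identical (sd == 0).
    t_stat = bias / (sd / sqrt(len(diffs))) if sd > 0 else None
    return {
        "bias": bias,
        "lower_loa": bias - 1.96 * sd,  # lower limit of agreement (assumed 1.96 * sd)
        "upper_loa": bias + 1.96 * sd,  # upper limit of agreement (assumed 1.96 * sd)
        "t_statistic": t_stat,
    }


# Hypothetical energy intakes in kcal/day, for illustration only.
usndb_kcal = [2100, 1850, 2400, 1990, 2250]
endb_kcal = [2050, 1800, 2300, 1960, 2200]
summary = paired_comparison(usndb_kcal, endb_kcal)
print(summary)
```

In a Bland-Altman plot, each participant's difference would be plotted against the mean of the two estimates, with the three horizontal lines given by `bias`, `lower_loa` and `upper_loa`.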
In this study, it was decided to be stricter with the interpretation of the weighted kappas than suggested by Cohen [17], considering the following interpretation: 0.01-0.39 as none to slight, 0.40-0.59 as weak, 0.60-0.79 as moderate, 0.80-1.00 as strong to very strong agreement.
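The quintile classification and the interpretation scale can be sketched as follows. The source does not state whether linear or quadratic weights were used, so linear weights are an assumption here, as are the rank-based quintile assignment, the function names and the example intakes. Values below 0.40 (including those below 0.01, which the scale does not cover) are grouped as "none to slight".

```python
def quintile_labels(values, k=5):
    """Assign rank-based groups 0..k-1 of (near) equal size; ties follow input order."""
    order = sorted(range(len(values)), key=lambda idx: values[idx])
    labels = [0] * len(values)
    for rank, idx in enumerate(order):
        labels[idx] = rank * k // len(values)
    return labels


def weighted_kappa(cat_a, cat_b, k=5):
    """Weighted kappa with linear disagreement weights |i - j| / (k - 1) (assumed)."""
    n = len(cat_a)
    observed = [[0.0] * k for _ in range(k)]
    for i, j in zip(cat_a, cat_b):
        observed[i][j] += 1.0 / n
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]
    obs_dis = sum(abs(i - j) / (k - 1) * observed[i][j]
                  for i in range(k) for j in range(k))
    exp_dis = sum(abs(i - j) / (k - 1) * row[i] * col[j]
                  for i in range(k) for j in range(k))
    if exp_dis == 0:
        raise ValueError("kappa is undefined when all subjects fall in one category")
    return 1.0 - obs_dis / exp_dis


def interpret_kappa(kappa):
    """Map a weighted kappa to the stricter interpretation scale used in this study."""
    if kappa >= 0.80:
        return "strong to very strong"
    if kappa >= 0.60:
        return "moderate"
    if kappa >= 0.40:
        return "weak"
    return "none to slight"


# Hypothetical intakes of one nutrient for ten participants, illustration only.
usndb_intake = [5, 1, 9, 3, 7, 2, 8, 4, 10, 6]
endb_intake = [6, 1, 8, 4, 7, 3, 9, 2, 10, 5]
kappa = weighted_kappa(quintile_labels(usndb_intake), quintile_labels(endb_intake))
print(round(kappa, 2), interpret_kappa(kappa))
```

Because the quintile cut-offs are derived separately for each database, the kappa reflects agreement in ranking rather than agreement in absolute intake.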
All statistical tests were two-sided, with a statistical significance level of α = 0.05, and were carried out with the Statistical Analysis Software (SAS Institute Inc., SAS Campus Drive, Cary, NC, USA) version 9.4, unless otherwise specified.

Table 2 shows the mean dietary intakes, their standard deviation and the median dietary intakes of energy and 27 nutrients, as estimated by the USNDB and the ENDB, and the absolute difference in mean nutrient intakes. Differences in mean nutrient intake between the USNDB and ENDB were statistically significant (paired samples t-test: p < 0.001) for all nutrients. Results per country can be found in Table S1.
Concerning the DQ data, dietary intakes for energy, total fat and carbohydrates estimated by the USNDB were higher than those calculated by the ENDB. For proteins, dietary intake estimates by the USNDB were lower. Absolute mean differences for energy and the three principal classes of macronutrients between the USNDB and ENDB were 61.2 kcal/day (3% relative to the ENDB) for energy intake, 1.2 g/day (1.5% of ENDB) for total fat, −4.3 g/day (−4.9% of ENDB) for proteins and 24.0 g/day (10.4% of ENDB) for carbohydrates. Similar results were found for the 24-HDR: the mean difference between the USNDB and ENDB was 20.2 kcal/day (1% relative to the ENDB) for energy intake, −0.2 g/day (−0.2% of ENDB) for fat, −6.1 g/day (−7.2% of ENDB) for proteins and 17.4 g/day (7.6% of ENDB) for carbohydrates. The largest mean difference in dietary intake between the USNDB and the ENDB was found for starch: −72.3 g/day (−59.3% relative to the ENDB) and −76.8 g/day (−64.7% relative to the ENDB) for the DQ data and 24-HDR data, respectively. The direction of the difference (USNDB higher or lower than the ENDB) observed in the DQ data was consistently reproduced in the 24-HDR data, except for the nutrients with a mean difference between −5% and 5%.

Table 2. Mean, standard deviation and median of dietary intakes of 28 nutrients estimated by the U.S. nutrient database (USNDB) and the EPIC nutrient database (ENDB) and their absolute mean difference in nutrient intake, reported for the 24-hour dietary recall data (24-HDR) and the dietary questionnaire data (DQ).

Pearson correlations and weighted kappas for dietary intakes of energy and 27 priority nutrients between the USNDB and the ENDB can be found in Table 3. Results per country can be found in Table S2. Regarding the DQ data, Pearson correlation coefficients for the associations of nutrient intakes estimated using the USNDB and the ENDB ranged from r = 0.62 for vitamin E (alpha-tocopherol) to r = 1.00 for water. For most of the nutrients, strong to very strong correlations were found (r ≥ 0.80). Moderate correlations (r = 0.60-0.79) were only found for thiamine (r = 0.78), magnesium (r = 0.78), starch (r = 0.72) and vitamin E (r = 0.62). Similarly, for the 24-HDR data, strong to very strong correlations were found for most of the nutrients, followed by moderate correlations for vitamin B6, food folate, magnesium, retinol, iron and vitamin E. Weak correlations were found only for vitamin B1 (r = 0.56), starch (r = 0.54) and vitamin D (r = 0.48).

Results of the weighted kappa analysis for the DQ data ranged from κ = 0.43 for starch to κ = 0.98 for water. Energy intake, total fat, protein and carbohydrate intakes showed strong to very strong agreement (κ = 0.80-1.00). Moderate agreement (κ = 0.60-0.79) was found for the majority of the nutrients and only starch, vitamin D and vitamin E showed weak agreement (κ = 0.40-0.59). Regarding the 24-HDR, weighted kappas were lower compared to those of the DQ data and ranged from κ = 0.30 for starch to κ = 0.96 for water. Strong to very strong agreement was only shown for water, energy, fat and alcohol and a much larger share of nutrients showed a weak agreement (iron, magnesium, vitamin D, vitamin E, vitamin B1, vitamin B2 and vitamin B12).
Bland-Altman plots for energy, total fat, proteins, carbohydrates and alcohol intakes for the 24-HDR data and DQ data are presented in Figures 1 and 2, respectively. Bland-Altman plots for all other nutrients can be found in Figure S1 for the 24-HDR data and in Figure S2 for DQ data. The mean difference, or bias, is the same as the mean difference presented in Table 2. Visual inspection of these Bland-Altman plots shows a divergent pattern for the majority of the nutrients (e.g., cholesterol intake, iron intake, magnesium intake, vitamin E intake, retinol intake), which reflects an increasing mean difference with increasing intakes.

Main Results and Interpretation
Matching EPIC dietary intake data to the USNDB will allow exposure to 150 food components to be assessed and investigated in relation to disease outcomes within the EPIC cohort. Comparative analyses showed significant, but rather small, absolute differences between dietary intakes of the 28 selected nutrients estimated by the USNDB and the ENDB for participants in the EPIC study. Among the three classes of macronutrients, the greatest mean difference was found for carbohydrates (10.4% difference for DQ data and 7.6% difference for 24-HDR relative to the ENDB), representing higher carbohydrate intake estimates by the USNDB than by the ENDB. Within the USNDB, data on total carbohydrates was calculated 'by difference' (i.e., the difference between 100 and the sum of the percentages of water, protein, total fat, ash and, when present, alcohol), and includes total dietary fibre [18]. Within the ENDB, the sum of analysed fractions was the reference method, excluding dietary fibre (i.e., glycaemic carbohydrates), whereas values obtained by difference were graded as 'non comparable' [7]. Note that the carbohydrate values were considered as possibly presenting the most heterogeneity in terminology, definition, mode of expression and methods used (analytical or calculations) across EPIC countries [4]. These differences in definition and calculation methods likely explain the absolute difference in carbohydrate intakes between the USNDB and the ENDB. Across all nutrients, the greatest mean difference was shown for starch, a fraction of carbohydrates, with much higher estimates reported by the ENDB. In addition to the heterogeneity described in carbohydrate values across European FCDBs, the level of detail with regard to the coverage of the different carbohydrate fractions varies significantly across European countries [4]. Therefore, the starch values reported in both the USNDB and ENDB should be handled with caution.
A short overview of the USNDB and ENDB reference component-specific definition and standard analytical methods and approaches for all 28 nutrients is shown in Table S3.
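The 'by difference' definition of total carbohydrates described above is simple arithmetic and can be written out as a short sketch. The function names and the composition values are hypothetical. Subtracting dietary fibre is used here only as a rough stand-in for a fibre-excluding (glycaemic) value, to make the definitional gap visible; it is not the ENDB reference method, which sums analysed fractions.

```python
def carbohydrate_by_difference(water, protein, total_fat, ash, alcohol=0.0):
    """Total carbohydrate in g per 100 g food, computed 'by difference'.

    All inputs are percentages (g per 100 g). The result includes dietary fibre.
    """
    return 100.0 - (water + protein + total_fat + ash + alcohol)


def fibre_excluded_carbohydrate(total_carbohydrate, dietary_fibre):
    """Rough illustration of a fibre-excluding value (assumption, not the ENDB method)."""
    return total_carbohydrate - dietary_fibre


# Hypothetical composition of a bread-like food, g per 100 g, illustration only.
total_cho = carbohydrate_by_difference(water=35.0, protein=9.0, total_fat=3.0, ash=2.0)
glycaemic_cho = fibre_excluded_carbohydrate(total_cho, dietary_fibre=4.0)
print(total_cho, glycaemic_cho)
```

For a fibre-containing food, the by-difference value is therefore higher than a value that excludes fibre, which is consistent with the direction of the carbohydrate difference reported between the two databases.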
Relative differences in dietary intake estimates between the USNDB and the ENDB were examined using Pearson correlations. Moderate to very strong correlations for the majority of the food components under study demonstrate a good ranking of the subjects according to their nutrient intake. However, Pearson correlation coefficients can be misleading when assessing agreement, because the significant correlations describe a linear relationship between two sets of data, but do not necessarily imply good agreement between the USNDB and ENDB. Therefore, Bland-Altman plots were used to describe the agreement between the two methods. The divergent pattern shown in the majority of the Bland-Altman plots indicates an increase in mean difference with increasing intakes.
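Why a high correlation does not imply agreement can be shown with a small constructed example. The data are hypothetical: one series is simply 1.2 times the other, which is an assumption chosen to mimic a proportional difference between two databases, and the function name is illustrative.

```python
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long series."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical intakes (g/day): the second database reads 20% higher throughout.
endb_like = [50.0, 100.0, 150.0, 200.0, 250.0]
usndb_like = [1.2 * value for value in endb_like]

r = pearson_r(usndb_like, endb_like)
differences = [u - e for u, e in zip(usndb_like, endb_like)]
print(round(r, 3), [round(d, 1) for d in differences])
```

The correlation is essentially perfect, yet the difference between the two estimates grows with the level of intake, which is the kind of divergent pattern a Bland-Altman plot makes visible.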
In addition to Bland-Altman plots, the results of the weighted kappa analysis indicated strong agreement for the three principal classes of macronutrients and moderate agreement for the majority of the other nutrients for the DQ data. However, weighted kappas of the 24-HDR data were lower compared to those of the DQ data. Overall, this project shows a good level of agreement for energy intake and the majority of the nutrients, although not for starch, vitamin D, vitamin E and thiamine.
In addition to the specific arguments for carbohydrate related compounds explained above, some more generic issues in the matching of dietary intake data with food items available in the FCDBs may further explain the absolute and relative differences found in this comparison between dietary intakes estimated by the USNDB and ENDB. Three major issues in the matching with FCDB can be addressed.
First, there are significant differences in the food lists (e.g., number of foods, kinds of food stuffs, level of detail) available in the different FCDBs. Matching European food data, which have country- and local-specific unique foods and dishes, to the USNDB is not unequivocal. Because exact matches between the food items in the EPIC food list and the USNDB were often lacking, difficult and sometimes arbitrary decisions had to be made during the matching procedure. This issue also arose during the compilation of the ENDB. The food lists available in the national FCDBs, originating from the late 1990s and early 2000s, were, for some countries, limited in the number of food items available, requiring them to borrow food composition data from neighbouring countries [7].
Second, differences could be caused by advancements and variation in definitions and laboratory technologies to measure the different nutrients. Indeed, more advanced laboratory methods available in the past few years may also contribute to differences in values between the different FCDBs. These differences in methodologies already complicated the harmonisation of the different national FCDBs used in the ENDB, as various methods had been used across Europe. Considering the further advancements in technologies over the past decades, these innovations may contribute to the differences found between the more recent USNDB and the ENDB.
Third, changes over time in product composition, processing and potential changes in national food regulations may result in a variation of nutrient content of (processed) foods over time. Furthermore, geographical and environmental variations are likely to exist between the same foods in the different national FCDBs, especially in the vitamin and mineral content. These differences in food composition are likely to be found between continents, between European countries, and even between foods originating from the same grower or manufacturer and/or over different harvests (e.g., due to differences in species, exposure to sunlight or pesticides, storage conditions and period). Considering the current global food system, with import and export of foods between regions, matching with a non-European database that delivers standardised and high-quality food composition data was appraised as a pragmatic and scientifically justifiable solution.
Since agreement was higher for energy and macronutrient intakes than for certain micronutrient intakes, it is likely that the vitamin and mineral content of food is the most vulnerable to environmental and climatic conditions, food processing and regulations and/or the analytical method used. This is particularly the case for unstable components (i.e., labile to temperature, pH and oxidation), leading to potential problems in the accurate measurement of these nutrients.

Comparison with Similar Studies
Only a few studies have compared dietary intakes of an adult population estimated by different FCDBs [19][20][21][22][23], and most of them focused on dietary intakes measured by different European FCDBs [19][20][21]. Good correlations for macronutrients (r > 0.70) were reported by most of these studies comparing different European FCDBs [19,20,22,23], although one comparative study suggested a discrepancy between FCDBs for energy, fat and carbohydrate intakes [21]. Only one study examined the level of agreement between macro- and micronutrients of the USNDB (modified by Chilean food items) and the British FCDB [23]. This study concluded that results for dietary intakes are similar for the USNDB and the British FCDB for the majority of the nutrients under study. However, the USNDB tended to give relative overestimates of macronutrients in comparison to the British FCDB, whereas no consistent trend was seen for micronutrients [23]. Within the EPIC cohort, similar results were found when comparing dietary intakes by the USNDB and ENDB. However, the magnitude of the overestimation of macronutrients was rather low, ranging from 1.5% to 10% higher intakes relative to the ENDB, and no such trend was observed for micronutrients. Because of their methodological nature (analytical methods used, definition, and measurement units), micronutrients are less likely to be expressed in comparable ways across different FCDBs [4,7,19–23].

Recommendations for Future Studies and Food Composition Data Compilers
The EuroFIR (European food information resource) project and INFOODS (International Network of Food Data Systems) are appreciated for their efforts in promoting international cooperation and harmonisation of standards to improve data quality, availability and reliability [24,25]. Unfortunately, national FCDBs are not necessarily conceived to provide internationally comparable or interchangeable data, which is a major constraint for multi-centric studies that require standardisation across continents and regions [26]. Therefore, international (cross-continental) efforts are needed to study, inventory and eventually put into practice reference analytical methods for assessing the nutrient contents of foods and the use of universal definitions, measurement units and classification into food groups. Once these are settled, European FCDBs could benefit from them and expand their number of food components. However, high-quality local food composition data remains important, especially when it comes to typical local food products and dishes or for countries that are mainly self-sufficient for their food supplies.

Strengths and Limitations of the Comparison Study
A major limitation of this comparison study is the lack of a gold standard, since both approaches, i.e., estimating dietary intake using the USNDB or the ENDB, are prone to error. Furthermore, the results of the relative validation study for the 28 food compounds might not be generalisable to the remaining 122 food components of the USNDB. The lack of nutritional composition data for several food items for less common nutrients should be taken into account. It may affect exposure estimations (underestimation of true intakes) and lead to the attenuation of associations found between nutrient intakes and health outcomes. Despite these limitations, this study has several strengths. To match the EPIC food list to the USNDB, a standard procedure was maintained, building on the previous experiences of the ENDB project [7,14]. To assure the continuation of the standard approach, quality controls were built in during the matching procedure. The strong methodology of the EPIC study allowed assessment of the relative validity in duplicate, using the 24-HDR and the DQ and taking advantage of its large sample size. Furthermore, matching the EPIC food list to the USNDB is a strong added value to the EPIC study, and is of crucial importance: providing up to 150 food components to the EPIC cohort dataset will give the opportunity to investigate additional risk factors for specific cancers and other chronic diseases.

Recommendations for Users of the EPIC Nutrients Database
If the dietary nutrients of interest are available in the ENDB, it is recommended to make use of the ENDB data, as this enables interpretation of the results in the context of the previously published work in EPIC. In case one or more nutrients are not available in the ENDB, it is recommended to use the USNDB exclusively for all nutrients as mixing both approaches is discouraged in order to avoid discrepancies between nutrients (e.g., total energy intake should remain the sum of the energy of the different macronutrients). For nutrients with rather weak agreement (κ < 0.60) between the USNDB and ENDB, researchers should be prudent in using the data from both the USNDB and the ENDB, because both databases have limitations, and sensitivity analyses using the nutrients from both databases could be suggested.

Conclusions
In this study, the EPIC dietary assessment data was matched to the USNDB. Good agreement was shown between the USNDB and the original ENDB for energy intake, total fats, proteins, carbohydrates, sugar, alcohol, potassium and vitamin C, although not for starch, vitamin D, vitamin E and thiamine. The USNDB will allow the analysis of the dietary exposure of up to 150 food components in relation to health and disease risk within the EPIC study.
Supplementary Materials: The following are available online at http://www.mdpi.com/2072-6643/12/10/2906/s1: Table S1: Mean, standard deviation and median of dietary intakes of 28 nutrients of the U.S. nutrient database (USNDB) and the EPIC nutrient database (ENDB) and their absolute mean difference in nutrient intake, reported for the 24-h dietary recall data (24-HDR) and the dietary questionnaire data (DQ) by country. Table S2: Pearson correlation coefficients and weighted kappas (κ) for dietary intakes of 28 nutrients of the U.S. nutrient database (USNDB) and the EPIC nutrient database (ENDB), reported for the 24-h dietary recall data (24-HDR) and the dietary questionnaire data (DQ) by country. Table S3: Short overview of the U.S. nutrient database (USNDB) and the EPIC nutrient database (ENDB) reference component-specific definition and standard analytical methods and approaches used. Figure S1: Bland-Altman plots for 24-h dietary recall (24-HDR) data for 23 nutrients intakes between the U.S. nutrient database (USNDB) and the EPIC nutrient database (ENDB). The mean difference is represented by the full line, the upper and lower limit of agreement are presented by dotted lines. Figure S2: Bland-Altman plots for dietary questionnaire (DQ) data for 23 nutrients intakes between the U.S. nutrient database (USNDB) and the EPIC nutrient database (ENDB). The mean difference is represented by the full line, the upper and lower limit of agreement are presented by dotted lines.