Empirically Derived Dietary Patterns in UK Adults Are Associated with Sociodemographic Characteristics, Lifestyle, and Diet Quality

The aim of this study was to examine empirical dietary patterns in UK adults and their association with sociodemographic characteristics, lifestyle factors, self-reported nutrient intake, nutrient biomarkers, and the Nutrient-based Diet Quality Score (NDQS) using National Diet and Nutrition Survey data 2008–2012 (n = 2083; mean age 49 years; 43.3% male). Four patterns explained 13.6% of the total variance: ‘Snacks, fast food, fizzy drinks’ (SFFFD), ‘Fruit, vegetables, oily fish’ (FVOF), ‘Meat, potatoes, beer’ (MPB), and ‘Sugary foods, dairy’ (SFD). ‘SFFFD’ was associated positively with: being male; smoking; body mass index (BMI); urinary sodium; intake of non-milk extrinsic sugars (NMES), fat and starch; and negatively with: age; plasma carotenoids; and NDQS. ‘FVOF’ was associated positively with: being non-white; age; income; socioeconomic classification (National Statistics Socio-economic Classifications; NSSEC); plasma carotenoids; intake of non-starch polysaccharides and polyunsaturated fatty acids; and NDQS. It was negatively associated with: being male; smoking; BMI; urinary sodium; and intake of saturated fat and NMES. Whilst the patterns explained only 13.6% of the total variance, they were associated with self-reported nutrient intake, biomarkers of nutrient intake, sociodemographic and lifestyle variables, and the NDQS. These findings provide support for dietary patterns analyses as a means of exploring dietary intake in the UK population to inform public health nutrition policy and guidance.


Introduction
The identification of underlying, consistent patterns of dietary intake is valuable in nutritional epidemiology [1]. Whilst looking at single or a few dietary components remains an important focus for nutritional epidemiological studies, this approach has some conceptual and methodological limitations [2]. Individuals do not purchase or consume dietary components, nutrients or, frequently, foods, as single items in isolation. Individuals usually consume multiple nutrients in one food item and a number of food items as part of one meal. The combinations of foods and meals consumed day to day may also be habitual. This has implications for public health nutrition policy and guidance, as messages that focus on single nutrients (e.g., saturated fat) or foods (e.g., free sugars) may not be easy for the general public to translate into dietary behaviour change. In addition, it is likely that diseases such as cancer are influenced by multiple biochemical and physiological interactions between nutrients and foods. Other substances within food with no nutritive value, such as phytochemicals, may also be influential in disease development or protection [3,4] and biochemical and metabolic interactions in the body between micronutrients, dietary components, and foods may complicate or confound

Dataset
The data used in the analyses were from the UK National Diet and Nutrition Survey (NDNS) Rolling Programme. Computerised files from the NDNS were obtained from the UK Data Service archives (www.ukdataservice.ac.uk, accession number 64197). The NDNS is an annual cross-sectional survey undertaken in the UK since 2008. Dietary intake data are gathered from a self-reported unweighed 3- or 4-day food diary. Blood plasma and urine are also collected from a consenting sub-sample approximately 8 weeks after the self-reported data collection, from which biomarkers of nutritional status are derived. For the purposes of the analyses here, all dietary data collected are assumed to be representative of habitual dietary intake and nutrient status.
The NDNS contains data on more than 7000 foods, which are aggregated into 59 'main' food groups and then disaggregated from these 'main' categories into 'subsidiary' food groups. The 'main' food groups categorise all dietary intake data. Each food code has a value for 54 nutrients including energy, sugars, vitamins, and minerals [45]. Table 1 lists the 60 food group variables included in the principal component analysis (PCA). All 'main' NDNS food groups were included in the analyses with the exception of 'Commercial Toddlers Foods and Drinks', which was irrelevant for the adult sub-sample, and 'Dietary sweeteners' as these data were not provided in the dataset. 'Subsidiary' food groups for 'low fat spreads', 'reduced fat spreads', and 'miscellaneous' were included due to their widely varying nutritional composition and contribution to levels of consumption of key nutrients such as non-milk extrinsic sugars (NMES) and polyunsaturated fatty acids (PUFA) [45]. Food groups were presented as average grams consumed daily by each individual.
Data were included in the analyses from years 1 to 4 of the NDNS (2008-2012) comprising a sample of 2083 adults (aged ≥ 19 years, mean age 49 years) [46]. In line with other studies of the NDNS, the full sample was included in the analyses, as it was not considered possible to separate under-reporters from under-consumers, for example, those who were unwell [47].

Statistical Analyses
A number of statistical analysis methods, such as cluster analysis and factor analysis, can be employed to undertake a posteriori dietary patterns analyses. Exploratory factor analysis and, in particular, principal component analysis (PCA) have been used in many studies to reduce large sets of dietary intake variables representing the total diet into smaller sets of variables and to identify underlying 'dietary patterns' [7,48]. For this study, PCA was used to derive the dietary patterns, as this method uses the degree to which foods are correlated with each other to derive a new, smaller set of composite variables [48]. These composite variables are uncorrelated with each other and can therefore be considered representative of discrete dietary patterns. The extracted sets of composite variables are 'components' and the variables within them are 'factors'. PCA produces as many 'components' as there are variables, but only the first few, which explain the largest proportion of the variance in the data, are selected. All analyses were conducted using the SPSS Version 22 statistical software package.
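As a rough illustration of the procedure described above, and not a reproduction of the SPSS workflow used in this study, the following Python sketch runs a PCA on the correlation matrix of a small synthetic intake table. The data, the function name `pca_from_correlation`, and the use of NumPy are assumptions made purely for illustration.

```python
import numpy as np


def pca_from_correlation(intakes):
    """PCA on the correlation matrix of an (individuals x food groups) array.

    Returns eigenvalues (descending), loadings, and component scores.
    """
    # Standardise each food group so that the PCA operates on correlations.
    z = (intakes - intakes.mean(axis=0)) / intakes.std(axis=0, ddof=1)
    corr = np.corrcoef(z, rowvar=False)

    # eigh suits symmetric matrices; it returns eigenvalues in ascending order.
    eigenvalues, eigenvectors = np.linalg.eigh(corr)
    order = np.argsort(eigenvalues)[::-1]
    eigenvalues = eigenvalues[order]
    eigenvectors = eigenvectors[:, order]

    # Loadings: correlations between the original variables and the components.
    loadings = eigenvectors * np.sqrt(np.clip(eigenvalues, 0.0, None))
    scores = z @ eigenvectors
    return eigenvalues, loadings, scores


# Synthetic data (assumption): 200 hypothetical individuals and 6 hypothetical
# food groups in grams per day. Two latent "habits" induce the correlations.
rng = np.random.default_rng(0)
habits = rng.normal(size=(200, 2))
mixing = np.array([[1.0, 0.8, 0.6, 0.0, 0.0, 0.1],
                   [0.0, 0.1, 0.0, 1.0, 0.7, 0.5]])
intakes = 50 + 10 * (habits @ mixing + rng.normal(size=(200, 6)))

eigenvalues, loadings, scores = pca_from_correlation(intakes)
explained = eigenvalues / eigenvalues.sum()  # proportion of variance per component
print(np.round(explained * 100, 1))
```

In this sketch the eigenvalues sum to the number of variables, so each eigenvalue divided by that number gives the proportion of variance explained by its component, and the component scores are mutually uncorrelated.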
Prior to conducting the PCA, the Kaiser-Meyer-Olkin (KMO) measure and Bartlett's test of sphericity were used to check that the data were suitable for factor analysis. The KMO reached the acceptable limit of 0.5 [49] and Bartlett's test of sphericity was significant, indicating that correlations between items were sufficiently large to undertake PCA. The 60 pre-defined food variables (average grams consumed per day) shown in Table 1 were entered into the PCA. Orthogonal (varimax) rotation was applied; rotation redistributes the explained variance across the components and thus achieves a simpler structure [50]. As per convention for this type of analysis, the number of components retained was based on a visual assessment of the scree plot (Figure 1) and on eigenvalues above approximately 1.5, in order to identify the fewest patterns that explained the largest proportions of variance [7].
The 'dietary patterns' generated were characterised by high and low consumption of particular foods and drinks. All foods and food groups have an associated factor loading for each component or 'dietary pattern'. Foods and food groups with loadings of ≥0.25 or ≤−0.25 were considered 'moderate' contributors, and those with loadings of ≥0.3 or ≤−0.3 'strong' or 'significant' contributors, to a dietary pattern, in line with previous studies [50–52].
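The two numerical rules described above (the eigenvalue cut-off of approximately 1.5 and the loading cut-offs of 0.25 and 0.3) can be sketched in Python as follows. The eigenvalues, food names, and loadings are hypothetical, the 'weak' label is an assumption added for completeness, and the visual scree-plot assessment is not captured by this sketch.

```python
def retain_components(eigenvalues, threshold=1.5):
    """Return indices of components whose eigenvalue exceeds the threshold."""
    return [i for i, value in enumerate(eigenvalues) if value > threshold]


def classify_loading(loading, moderate=0.25, strong=0.30):
    """Label a factor loading using absolute-value cut-offs.

    'weak' is a hypothetical label for loadings below the moderate cut-off.
    """
    size = abs(loading)
    if size >= strong:
        return "strong"
    if size >= moderate:
        return "moderate"
    return "weak"


# Hypothetical eigenvalues and loadings, for illustration only.
eigenvalues = [2.3, 2.2, 1.9, 1.7, 1.4, 1.2]
retained = retain_components(eigenvalues)

loadings = {"crisps": 0.52, "fruit": -0.31, "tea": 0.27, "water": 0.10}
labels = {food: classify_loading(value) for food, value in loadings.items()}
print(retained)
print(labels)
```

Using the absolute value means that a food with a large negative loading is still treated as characterising the pattern, through low rather than high consumption.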
Main effects regression analysis identified the associations between the dietary patterns and sociodemographic characteristics, lifestyle factors, self-reported intake, and biomarkers of single nutrients and overall diet quality. The following variables were included in the 'sociodemographics and lifestyle' model: age, household income (standardised to adjust for the number of individuals in the household), BMI (kg/m²), sex (male/female), ethnicity (white/non-white), National Statistics Socio-economic Classifications (NSSEC) (1-Higher managerial, administrative, and professional occupations; 2-Lower managerial, administrative, and professional occupations; 3-Intermediate occupations; 4-Small employers and own account workers; 5-Lower supervisory and technical occupations; 6-Semi-routine occupations; 7-Routine occupations; 8-Never worked and long-term unemployed), and smoking status (smoker/non-smoker). BMI and household income were included in the model as continuous variables in order to avoid the problems that have been demonstrated to result from artificial stratification [53]. For the variables sex, ethnicity, and smoking status, the category with the largest number of subjects was selected as the reference group. For NSSEC, the 'never worked' category was selected as the reference category [54]. The following variables were included in the 'nutrient intake and biomarkers' model: self-reported intake per 1000 kcal of vitamins C, D, E, B6, B12, iron, folate, and magnesium; % food energy from non-milk extrinsic sugars (NMES), saturated fat, total fat, n-3 polyunsaturated fats (PUFA), n-6 polyunsaturated fats, non-starch polysaccharides (NSP); biomarkers from blood and urine samples of vitamin C, 25-hydroxy vitamin D, retinol, ferritin, triglycerides, total cholesterol, urinary sodium, and total plasma carotenoids.
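The choice of reference group for the categorical predictors can be illustrated with a short Python sketch of indicator (dummy) coding. The helper `dummy_code` and the example responses are hypothetical; the sketch only shows the general idea of omitting the reference level, either the most frequent level by default or one named explicitly (as for NSSEC category 8), and does not reproduce the SPSS procedure used here.

```python
from collections import Counter


def dummy_code(values, reference=None):
    """Indicator-code a categorical variable, omitting the reference level.

    If no reference is given, the most frequent level is used.
    """
    counts = Counter(values)
    if reference is None:
        reference = counts.most_common(1)[0][0]
    levels = sorted(level for level in counts if level != reference)
    rows = [[1 if value == level else 0 for level in levels] for value in values]
    return reference, levels, rows


# Hypothetical smoking-status column for six respondents.
smoking = ["non-smoker", "smoker", "non-smoker", "non-smoker", "smoker", "non-smoker"]
reference, levels, rows = dummy_code(smoking)
print(reference, levels, rows)

# Hypothetical NSSEC codes, with category 8 set explicitly as the reference.
nssec = [1, 8, 3, 1, 7, 8]
nssec_reference, nssec_levels, nssec_rows = dummy_code(nssec, reference=8)
print(nssec_reference, nssec_levels)
```

Each regression coefficient for an indicator column is then read as the difference in pattern score between that level and the reference level, with the other predictors held constant.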
Overall 'diet quality' was measured using the Nutrient-based Diet Quality Score (NDQS): a composite measure scoring levels of consumption of 12 nutrients and alcohol. The score was developed to reflect UK Dietary Reference Values (DRVs) and government dietary guidelines [55–57]. Inclusion of items and scoring was developed to take account of a priori knowledge of key nutrients for health and prevention of diseases, with a focus on those diet-related issues that are most prevalent in the UK as well as population level nutrient deficiencies or over-consumption (for example, salt and saturated fat) [58–61]. The score was validated against biomarkers of nutrient intake derived from blood and urine samples [62]. Regression analysis was used to explore associations between the dietary patterns and the NDQS. Plots of residuals were visually assessed for homoscedasticity (constancy of variance) and for outliers. Where variables were not normally distributed, regressions including the square of the independent variables were carried out to test for evidence of curvilinearity. Where the coefficient of the squared term was significant (thus indicating curvilinearity), the coefficients for both the value and the squared values were plotted to visually assess for potential curvilinear (quadratic) effects.
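The test for curvilinearity described above can be sketched in Python as an ordinary least squares fit that includes a squared term. The data are synthetic, the function name `quadratic_fit` is hypothetical, and the cut-off of 1.96 on the t-statistic is a large-sample normal approximation assumed here for simplicity; it is not the exact test produced by SPSS.

```python
import numpy as np


def quadratic_fit(x, y):
    """OLS fit of y = b0 + b1*x + b2*x**2; returns coefficients and t-statistics."""
    design = np.column_stack([np.ones_like(x), x, x ** 2])
    coef, _, _, _ = np.linalg.lstsq(design, y, rcond=None)

    # Residual variance and coefficient covariance under standard OLS assumptions.
    residuals = y - design @ coef
    dof = len(y) - design.shape[1]
    sigma2 = residuals @ residuals / dof
    covariance = sigma2 * np.linalg.inv(design.T @ design)
    t_stats = coef / np.sqrt(np.diag(covariance))
    return coef, t_stats


# Synthetic example (assumption): a predictor with a genuine quadratic effect.
rng = np.random.default_rng(1)
x = rng.uniform(-2, 2, size=300)
y = 1.0 + 0.5 * x - 0.8 * x ** 2 + rng.normal(scale=0.5, size=300)

coef, t_stats = quadratic_fit(x, y)
# Approximate two-sided 5% test on the squared term (normal approximation).
curvilinear = abs(t_stats[2]) > 1.96
print(np.round(coef, 2), bool(curvilinear))
```

When the squared term is flagged in this way, plotting the fitted curve over the observed range of the predictor shows whether the curvature is large enough to matter in practice.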

Results
Study population characteristics are described in Table 2. The mean age of the sample was 49 years and there were significantly more females than males and white than non-white participants.

Principal Component Analysis of Dietary Patterns
The first four principal components, which explained the largest proportions of variance in the dietary intake data, each had an eigenvalue above 1.5 and, following a visual assessment of the scree plot (Figure 1), were retained as 'dietary patterns'. Together, the four components explain 13.6% of the total variance in the dietary intake data (3.9%, 3.7%, 3.1%, and 2.8%, respectively). Table 3 shows the rotated solution of the PCA. Foods with moderate or strong factor loadings were interpreted as those characterising each dietary pattern. The patterns were labelled subjectively, for ease of translation, based on these foods: 'Snacks, fast food, fizzy drinks' (SFFFD), 'Fruit, vegetables, oily fish' (FVOF), 'Meat, potatoes, beer' (MPB), and 'Sugary foods, dairy' (SFD). Table 4 illustrates the foods with moderate/strong factor loadings for each of the four components or dietary patterns.
Table 5 shows the main effects of each dietary pattern for sociodemographic characteristics and lifestyle factors. The majority of the associations were as might be expected. The SFFFD pattern was positively (p ≤ 0.05) associated with being male (0.24), being a smoker (0.16), and BMI (0.13) and was negatively associated with age (−0.03). The FVOF pattern was negatively associated with being male (−0.08), being a smoker (−0.37), and BMI (−0.002). This pattern was positively associated with being non-white (0.72) and with age (0.002). There was a clear gradient for the FVOF pattern with NSSEC, which was significant for all categories, with a lower NSSEC, such as never worked, being negatively associated with this pattern and a higher NSSEC, such as higher managerial and professional occupations, being positively associated with this pattern. The SFD pattern was positively associated with being male (0.19), with age (0.003), and with all categories of occupation other than routine occupations. The SFD pattern was most strongly associated with lower supervisory and technical and semi-routine occupations (0.53 and 0.52, respectively, p < 0.01). It was negatively associated with being a smoker (−0.21), being non-white (−0.46), and with BMI (−0.02), which is contrary to the relationship that might be expected for this dietary pattern. The MPB pattern was significantly positively associated with being male (0.63), age (0.01), and being a smoker (0.21) and negatively with being non-white (−0.56).
Table 6 shows the main effects of each dietary pattern for nutrient intake derived from self-reported dietary intake diaries and nutritional biomarkers derived from urine and blood plasma samples. The FVOF pattern was positively associated with intake per 1000 kcal of vitamins C, D, E, B12, B6, iron, folate, and magnesium; proportion of food energy from n-3 and n-6 PUFAs; fibre (NSP); and biomarkers of vitamins C, D (25-hydroxy vitamin D), and A (retinol), iron (ferritin), and total carotenoids. It was negatively associated with the proportion of food energy from NMES, saturated fat, total fat, and starch, and with urinary sodium. The SFFFD pattern was negatively associated with intake per 1000 kcal of vitamins C, D, E, B6, B12, folate, magnesium, proportion of food energy from n-3, intake of fibre (NSP), and biomarkers of vitamins C, D, A, total cholesterol, and total carotenoids. This pattern was positively associated with the proportion of food energy from NMES, total fat, n-6 PUFA, starch, and with urinary sodium. The MPB dietary pattern was negatively associated with intake of vitamins C and E, iron, and magnesium and with biomarkers for total carotenoids. It was positively associated with the proportion of food energy from NMES, saturated fat, total fat, and fibre (NSP) and plasma biomarkers of triglycerides and ferritin and urinary sodium. The SFD pattern had very similar associations to those of the SFFFD pattern, with the exception of being positively associated with the proportion of food energy from saturated fat and intake of fibre (NSP) and negatively associated with intake of n-6 PUFA, starch and biomarkers of retinol, ferritin and urinary sodium.
Table 7 shows the main effects of the NDQS for each dietary pattern unadjusted and adjusted for: age, gender, NSSEC, household income, ethnicity, smoking status, BMI, and total energy intake. All patterns were significantly predictive of diet quality as measured by the NDQS (p < 0.001). In both models, the SFFFD and MPB patterns were negatively associated with the NDQS, but the effect was slightly attenuated in the SFFFD pattern in the adjusted model compared with the unadjusted model. The SFFFD pattern was more strongly negatively predictive of the NDQS than the MPB pattern, with an expected decrease in NDQS score of 3.7 with every incremental increase in SFFFD factor score, compared with an expected decrease in NDQS score of 1.2 with every increase in MPB factor score, all else being equal. The effect of the MPB pattern on the NDQS was strengthened in the adjusted model (a 1.8 decrease on the NDQS compared with a 1.2 decrease). The FVOF and the SFD patterns were positively associated with the NDQS in both models; the FVOF pattern was more strongly positively predictive of the NDQS than the SFD. In the unadjusted model, for every incremental increase in FVOF score, the expected NDQS increase was 4.5 and for SFD it was 2.4. The effects of both patterns were slightly attenuated in the adjusted model compared with the unadjusted (with expected increases in NDQS of 3.8 and 1.3, respectively, compared with 4.5 and 2.4 in the unadjusted model).
Table 6. Main effects (95% confidence intervals) of dietary patterns for self-reported nutrient intakes and biomarkers.

Discussion
Analysis of dietary patterns is an important method in nutritional epidemiological research, as it allows diet to be explored and investigated as a multi-dimensional exposure, which more accurately reflects the way that free-living individuals consume food. Nutrients and foods are rarely consumed in isolation, but as part of meals and habitual patterns of consumption. The aim of this study was to explore empirical patterns in UK adults and their associations with sociodemographic characteristics, lifestyle factors, self-reported intake, and biomarkers of nutrients and overall diet quality. Four patterns explained 13.6% of the total variance: 'Snacks, fast food, fizzy drinks' (3.9%), 'Fruit, vegetables, oily fish' (3.7%), 'Meat, potatoes, beer' (3.1%), and 'Sugary foods, dairy' (2.8%). Individuals scoring higher on the SFFFD pattern, which might also have been labelled as the 'unhealthy', 'processed', or 'Western' dietary pattern, were more likely to be male, white, a smoker, have a higher BMI, consume a greater proportion of food energy from (non-milk extrinsic) sugars, total fat, starch, and n-6 PUFA, and have higher urinary sodium levels. This pattern was negatively associated with age, self-reported intake per 1000 kcal, and biomarkers of a range of key nutrients, food energy from n-3 PUFA, intake of fibre (NSP), total plasma carotenoids (which are an indicator of fruit and vegetable intake [64]), total cholesterol, and a composite diet quality score calculated from self-reported dietary intake data (NDQS). The FVOF pattern, which might also have been labelled as the 'healthy' or 'prudent' diet, was almost the inverse of the SFFFD pattern in its associations, with the exceptions of also being negatively associated with proportion of food energy from saturated fat, positively associated with biomarkers of 25-hydroxy vitamin D, retinol (vitamin A), and ferritin (iron), and having no significant association with total cholesterol.
The MPB pattern, which could also be categorised as a 'traditional British' diet, and the SFD or 'sweet tooth' pattern were both positively associated with being male, white, and older, and with consuming a higher proportion of food energy from NMES, saturated fat, total fat, and fibre (NSP). Both patterns were negatively associated with intake per 1000 kcal of vitamin C, vitamin E, iron, magnesium; consuming a higher proportion of food energy from starch; and with total plasma carotenoids. There were some differences between these two patterns in that the SFD pattern was also negatively associated (where the MPB pattern had no significant association) with self-reported intake per 1000 kcal of vitamin D, B12, and folate, proportion of food energy from both n-3 and n-6 PUFA, and biomarkers for vitamin A. MPB was also positively associated with being a smoker, urinary sodium, and plasma ferritin (iron), where SFD was the inverse. MPB was also associated with plasma triglycerides, where SFD had no significant association. The differences in nutrient intake, urine and plasma levels of nutrients, and nutrient biomarkers are reflected in the differences in the foods that characterise each of these patterns. For example, the MPB pattern is characterised by processed red meat, which is high in salt and iron. Notably, three of the four dietary patterns, none of which represents a high-quality diet, were more likely to be consumed by white males.
The findings suggest that sections of the UK adult population have patterns of dietary intake that vary in dietary quality and are associated not only with demographic characteristics but also with lifestyle factors and socioeconomic measures. This is important, as lifestyle factors (such as smoking status) and socioeconomic measures (such as income and NSSEC) are associated with health outcomes independently of diet [65]. These findings support the use of empirically derived patterns as a method for exploring and describing dietary intake in the UK to inform public health nutrition policy and research. In addition, these data suggest that particular foods and other variables such as smoking status may be useful as proxies for dietary patterns in nutritional epidemiological studies. The use of the theory-driven Nutrient-based Diet Quality Score also highlights the importance of these methods of analysis as a means of exploring diet as a multi-dimensional exposure. Therefore, this study demonstrates the usefulness of dietary patterns analysis methods, both empirically derived and a priori defined.
The results of this study are likely to be generalisable to the UK population. The NDNS is a robust dataset containing detailed dietary intake data from a nationally representative sample of UK adults. The methods for data collection, recruitment of participants, processing, and analysis of the data, including sourcing and updating food composition data, have been developed with close scrutiny and oversight from the commissioning departments in government [58]. PCA is a widely used method in nutritional epidemiological studies to reduce the detailed complexity of dietary intake into reduced sets of variables that represent 'patterns' of consumption that can be labelled for easy interpretation [48]. In addition, the dietary patterns identified in this study reflect those of other similar studies undertaken in the UK, which supports their generalisability and external validity [25,50,66]. Studies undertaking PCA on data from the LIDNS [6] and the ALSPAC datasets [50] have reported numbers and types of dietary patterns similar to those identified in this study. PCA of the LIDNS dataset resulted in four dietary patterns explaining 16.5% of the total variance that were labelled as 'fast food', 'health aware', 'traditional', and 'sweet' [6]. Similarly, a study analysing dietary patterns in pregnant women in the ALSPAC dataset identified five dietary patterns explaining 32.7% of variance that were also similar to those identified in this study, which were described as: 'health conscious', 'traditional', 'processed', 'confectionery', and 'vegetarian'. PCA in men in the same dataset identified four dietary patterns which were labelled similarly as: 'health conscious', 'traditional', 'confectionery/processed', and 'semi-vegetarian' [25].
There are a number of limitations to this study. The four identified dietary patterns explained only 13.6% of the total variance in the dietary intake data, which is a smaller proportion than other studies undertaking similar types of analyses in the UK [50,51,66–68]. This finding was potentially a result of the inclusion of a greater number of variables in the PCA than these other studies [69]. Studies with a lower number of dietary variables included in the PCA have resulted in a greater proportion of the variability explained [70]. Where food group categories are broad, foods that are weakly associated with a pattern may be classified in the same category as foods more strongly associated, thus increasing the amount of information captured by a specific pattern. This in turn may have an impact on the sensitivity of the components and thus their associations with disease or other variables [69]. Therefore, in some studies greater granularity may be more important in extracting the patterns than the amount of variance explained. This may be why some authors do not report the proportion of variance explained by the factors [48]. Another limitation is that some of the food group categories pre-defined in the NDNS dataset that were included in the PCA, such as 'yoghurt, fromage frais, and dairy desserts', include a broad range of foods with widely varying nutritional compositions and impacts on health. The use of the NDQS as a composite measure of diet quality is both a strength and a limitation. The NDQS is a 13-item construct, based on UK DRVs, developed to score sensitively to current UK public health priorities and validated against nutrient biomarkers [62]. However, as with all such scores, decisions regarding the definition of 'diet quality', the inclusion of items, scoring ranges, and weighting were made with some level of subjectivity.

Conclusions
The findings in this study contribute to the current understanding of dietary intake in UK adults and have implications for the way that population level dietary intake is assessed and evaluated to inform public health policy and guidance. The findings show that empirically derived dietary patterns in UK adults are associated with sociodemographic characteristics, lifestyle, and diet quality, as measured by single nutrient indicators (both self-reported and biomarkers) and a composite Nutrient-based Diet Quality Score. They also suggest that there are combinations of foods and other sociodemographic or lifestyle variables that could be explored as proxies for dietary patterns in nutritional epidemiological studies and dietary assessment. The dietary patterns identified in this study are similar to some of those identified in other UK studies where different datasets, methods of data collection, and population subgroups have been utilised. This supports their validity despite only a relatively low proportion of total variance in the dietary intake data being explained. This is an important finding for public health policy, as it highlights the value of focusing on the 'whole diet', as opposed to single foods or nutrients, in exploring population level diet, targeting interventions, and developing public health messages. The findings are also relevant for public health researchers, as they support the use of dietary patterns analyses as valid and insightful methods for exploring dietary intake and habits in the UK population.