Can a Simple Dietary Index Derived from a Sub-Set of Questionnaire Items Assess Diet Quality in a Sample of Australian Adults?

Large, longitudinal surveys often lack consistent dietary data, limiting the use of existing tools and methods that are available to measure diet quality. This study describes a method that was used to develop a simple index for ranking individuals according to their diet quality in a longitudinal study. The RESIDential Environments (RESIDE) project (2004–2011) collected dietary data in varying detail, across four time points. The most detailed dietary data were collected using a 24-item questionnaire at the final time point (n = 555; age ≥ 25 years). At preceding time points, sub-sets of the 24 items were collected. A RESIDE dietary guideline index (RDGI) that was based on the 24 items was developed to assess diet quality in relation to the Australian Dietary Guidelines. The RDGI scores were regressed on the longitudinal sub-sets of six and nine questionnaire items at T4, from which two simple index scores (S-RDGI1 and S-RDGI2) were predicted. The S-RDGI1 and S-RDGI2 showed reasonable agreement with the RDGI (Spearman’s rho = 0.78 and 0.84; gross misclassification = 1.8%; correct classification = 64.9% and 69.7%; and Cohen’s weighted kappa = 0.58 and 0.64, respectively). For all of the indices, higher diet quality was associated with being female, undertaking moderate to high amounts of physical activity, not smoking, and better self-rated health. The S-RDGI1 and S-RDGI2 explained 62% and 73% of the variation in RDGI scores, demonstrating that a large proportion of the variability in diet quality scores can be captured using a relatively small sub-set of questionnaire items. The methods described in this study can be applied elsewhere, in situations where limited dietary data are available, to generate a sample-specific score for ranking individuals according to diet quality.


Introduction
Diet is a major modifiable risk factor for a range of chronic diseases [1]. Analysis of diet in large populations is crucial for identifying dietary aspects that can increase or decrease the risk of chronic disease. Studies are recognizing the importance of characterizing the total diet, rather than single nutrients or foods, since a wide range of dietary components play a role in the development of chronic diseases, such as obesity, cardiovascular disease, diabetes, and some cancers [2][3][4]. As such, within nutritional epidemiology, there is a trend towards the use of single indices that provide an overall assessment of diet quality [5]. Diet quality indices describe the nutritional adequacy of an individual's diet as it adheres to pre-defined national dietary guidelines [6,7]. A measure of diet quality that is based on national dietary guidelines can be used as a predictor of chronic disease, in order to investigate associations with other behaviors, as a control in epidemiological research, or to examine diet-exposure relationships.
Detailed methods, such as multi-item, semi-quantitative food frequency questionnaires (FFQ), three-day food records, and 24-h recalls are widely used for the assessment of diet quality [8]. Whilst these tools provide information on food and nutrient intake, they are often time consuming and too resource intensive for use in large epidemiological research settings. Therefore, short dietary questionnaires are increasingly used to assess diet quality in large population surveys. Rather than quantifying actual intakes, short dietary questionnaires aim to rank individuals according to their dietary intake. A number of short dietary questionnaires or screeners have been successfully developed and validated, demonstrating the utility of such tools for the assessment of certain food groups [9][10][11][12][13], nutrients [10][14][15][16], specific dietary patterns [17,18], or diet quality [19][20][21][22][23][24][25]. Furthermore, several approaches have been taken to reduce the length of existing dietary questionnaires whilst still maintaining the validity of reduced-item scores. These involve regression methods [26][27][28], data reduction techniques, such as factor analysis [29,30] and principal component analysis [31], or the use of correlations between nutrient estimates from full and reduced versions [32].
However, in large scale, longitudinal studies, inconsistent dietary data are often collected at successive time points. This presents a challenge when attempting to make longitudinal assessments of diet quality, owing to lack of consistent reporting. Thus, this study investigates whether it is possible to construct a longitudinal measure of diet quality in situations where inconsistent dietary data are available across time points.
In the RESIDential Environments (RESIDE) project, which is an Australian longitudinal study of adult health and behavior, dietary data were collected at four time points in varying detail. The most comprehensive dietary data were obtained at the fourth and final time point (T4) in the form of a 24-item questionnaire containing questions about dietary behaviors and the frequency of consumption of selected foods and food groups. Preceding time points collected dietary data on sub-sets of the 24 items. Given that full dietary data were not available across all of the time points, a method of regression was used to construct a longitudinal measure of diet quality from the sub-sets of dietary data. The aim was to demonstrate that a simple diet quality index based on a sub-set of the 24 RESIDE questionnaire items, can rank individuals according to their diet quality, as determined by a more comprehensive index that was designed to reflect adherence to the Australian Dietary Guidelines.

Study Design and Participants
This study used data from the RESIDential Environments (RESIDE) project. RESIDE was a quasi-experimental, longitudinal study that was conducted across Perth, Western Australia (WA), from 2003 to 2012. The primary aim of RESIDE was to evaluate the impact of the Western Australian government's new sub-division design code policy "Livable Neighborhoods Community Design Guidelines" (LNG) on participant health and behavior. A cohort of adults (n = 1811) moving from a house located within an established residential area into one of 74 new housing developments was surveyed at four time points (T1, T2, T3, and T4), with T1 occurring prior to moving. At each time point, the participants completed questionnaires on self-reported physical activity, health, lifestyle behaviors, usual food intake, and sociodemographic variables. Of the 1811 participants at baseline (40% male), ages ranged from 19 to 78 years, with a mean age of 41 (SD = 13) in men and 39 (SD = 11) in women. Detailed descriptions of the study design and sampling procedures are presented elsewhere [33]. The current study utilized cross-sectional data (n = 565) from T4. Ethics approval was provided by the University of Western Australia Human Research Ethics Committee.

Sociodemographic, Health and Lifestyle Variables
Self-reported sociodemographic data at T4 included gender, age (years), marital status (married/de facto; separated/divorced/widowed; single), education (secondary or less; trade/apprentice/certificate; bachelor or higher), and income (<50,000; 50,000-69,999; 70,000-89,999; ≥90,000). Health and lifestyle variables included smoking status, self-rated health, and body mass index (BMI). For smoking status, participants were asked, "Which of the following best describes your cigarette smoking status?", with five response options: (1) I smoke daily; (2) I smoke occasionally; (3) I don't smoke now but I used to; (4) I've tried it a few times but never smoked regularly; and (5) I've never smoked. Responses (1) and (2) were classified as current smoker, (3) and (4) as ex-smoker, and (5) as never smoked. For self-rated health, participants were asked, "In general, would you say that your health is excellent, very good, good, fair, or poor?" BMI was based on self-reported height and weight and categorized as healthy (BMI ≥ 18.5 to <25 kg/m²), overweight (BMI ≥ 25 to <30 kg/m²), or obese (BMI ≥ 30 kg/m²) [34]. The frequency and intensity of physical activity was also assessed in minutes per week. Participants reported the number of times and minutes per week of walking (for recreation and transport) and of moderate or vigorous intensity leisure time activities. These data were then used to derive an overall measure of physical activity using standardized scoring and levels of activity [35,36].

Dietary Intake
At T4, the participants completed a 24 item dietary assessment that included 12 semi-quantitative food frequency questions (FFQ) and 12 dietary behavior questions (DBQ). At previous time points (T1, T2 and T3) data were collected on a sub-set of the T4 survey items (Table A1 in Appendix A). All of the dietary survey items were sourced from questions from the 1995 Australian National Nutrition Survey [37,38] or previously evaluated questions that were shown to be valid measures of food intake and dietary behaviors [39][40][41][42]. A one-week test-retest of survey items provided intraclass correlations (ICC) ranging from 0.79 to 0.95 [43]. The standard serving size of fruit, vegetables, and beverages was defined in the surveys in terms of metric measurements (cups and milliliters), and included additional information to guide the participants about how many cups in everyday drinks, i.e., 1 can of soft drink = 1.5 cups, 1 bottle of Gatorade = 2 cups, and 1 bottle of soft drink = 2.5 cups.

Assessment of Diet Quality
Diet quality was assessed at T4 using three indices: a RESIDE dietary guideline index (RDGI), developed using the most comprehensive data (24-items) available at T4; and two simple RESIDE dietary guideline indices (S-RDGI1 and S-RDGI2) constructed using the sub-sets of six survey items available at T1, T2, T3, and T4 (S-RDGI1), and nine survey items available at T2, T3, and T4 (S-RDGI2).
Development of the RDGI score: The RDGI was adapted from previously validated food-based indices that were developed to reflect the Australian Dietary Guidelines (ADG) [44][45][46] and was designed to assess the adherence with components of the ADG relevant to adult (>18 years) dietary intake [4], and the Australian Guidelines to Reduce Health Risks from Drinking Alcohol [47]. All 24 survey items contributed to the RDGI, and were assigned as indicators for each ADG component. A total of 10 ADG components contributed to the RDGI score, including six adequacy components (i.e., foods to increase in the diet as per ADG 2: Enjoy a wide variety of nutritious foods from these five groups every day) and four moderation components (i.e., foods to limit in the diet as per ADG 3: Limit intake of foods containing saturated fat, added salt, added sugars, and alcohol) ( Table 1). See Table A2 for relevant guidelines and components used to construct the RDGI.
The aim of scoring and cut-off points was to achieve a balanced contribution of components to the overall score that resulted in high discriminating power. Using data-driven group or population medians as cut-offs may provide greater discriminating power, but are not comparable across groups and may not be related to recommended intakes as per guidelines. This study aimed to assess diet quality in relation to the ADG. Thus, cut-offs reflected age and sex-specific ADG recommendations for number of serves. Each component was scored from 0-10, such that equal weights were attributed to all of the components within the index. This is the common approach for most dietary indices [6].
When components consisted of more than one indicator item, scores totaled to a maximum of 10. The maximum score was assigned when the guideline was met, with a proportionate score given for levels of intake above/below this, and zero assigned as the minimum score (furthest from guideline). Items were scored proportionately, to the extent to which the guideline was met, to allow for the final score to reflect the degree to which individuals met recommendations. For example, for fruits and vegetables, maximum points (n = 10) were assigned to intakes at or above recommendations (number of serves per day), and a minimum score of zero was assigned to non-consumers. This is consistent with evidence that suggests an inverse dose-response association between intake of fruits and vegetables and the risk of coronary heart disease, stroke, cardiovascular disease, total cancer, and all-cause mortality [48]. For components that were shown to follow a U-shaped association with health, e.g., red meat [49], maximum points were assigned to participants at or below the recommended intakes, and a lower proportionate score was given to intakes greater than recommendations. This was done so as not to penalize individuals who chose not to consume alcohol or red meat for various reasons. In situations where the ADG were not quantitative, or did not align with the response categories of the survey items, criteria provided by another set of national recommendations or an empirical study were sought to inform cut-offs. Cut-offs for maximum and minimum scores are detailed in Table 1.
This scoring approach was similar to that of McNaughton et al. (2008) and Thorpe et al. (2016), and recommended dietary index methodology, which recognizes the existence of correlations and interactions between individual dietary components such that strongly correlated dietary variables contribute more heavily to the score [6,44,46]. The final total score was the sum of the 10 components so that the index ranged from 0-100, with a higher score reflecting greater diet quality in relation to the ADG.
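The proportionate scoring described above can be illustrated with a minimal Python sketch. It is not the scoring code used in RESIDE: the function names, the linear decline for moderation components, and the `zero_at` parameter (the intake at which a moderation score reaches zero) are assumptions made for illustration, and the actual cut-offs are those given in Table 1.

```python
def adequacy_score(intake, recommended, max_points=10.0):
    """Full points at or above the recommendation, proportionate below, 0 for non-consumers."""
    if recommended <= 0:
        raise ValueError("recommended must be positive")
    return max_points * min(max(intake, 0.0) / recommended, 1.0)


def moderation_score(intake, recommended, zero_at, max_points=10.0):
    """Full points at or below the recommendation, falling linearly to 0 at `zero_at`.

    `zero_at` is a hypothetical upper cut-off, not a value taken from the paper.
    """
    if zero_at <= recommended:
        raise ValueError("zero_at must exceed recommended")
    if intake <= recommended:
        return max_points
    if intake >= zero_at:
        return 0.0
    return max_points * (zero_at - intake) / (zero_at - recommended)


def total_index(component_scores):
    """Sum ten 0-10 component scores into a 0-100 index."""
    if len(component_scores) != 10:
        raise ValueError("expected 10 component scores")
    return sum(component_scores)
```

Under these assumptions, a participant eating one serve of fruit against a recommendation of two serves would receive half of the available points for that component, and a non-drinker would receive full points for the alcohol component.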
Development of the S-RDGI1 and S-RDGI2 scores: The sub-sets of six survey items available at T1, T2, T3, and T4 are highlighted in Table 1 and the additional three items available at T2, T3, and T4 are in bold italic (nine survey items in total available at T2, T3, and T4). The six survey items totaled to a maximum score of 31.5, whilst the nine survey items totaled to a maximum score of 46.5. In order to generate a representative measure of diet quality (RDGI) for use across all of the time points, a method of linear regression was used to examine the relationship between the available sub-sets of scores and the full RDGI, from which the fitted regression model can be used to predict the dependent variables (S-RDGI1 and S-RDGI2) at time points when only the independent variables (sub-sets of scores) are known. A multiple linear regression model was fitted using the RDGI scores (dependent variable) and the scores of the six survey items (independent variables) at T1, T2, T3, and T4 (Table 1). The resultant S-RDGI1 score was calculated as the predicted index score from the estimated regression equation. The same procedure was repeated to determine the S-RDGI2 score using the nine survey items that are available at T2, T3, and T4 (Table 1).

Table 1 (excerpt).

How often do you eat cheese? (including ricotta, cottage, processed, cream cheese, hard and soft cheese): <once per month = 0; once per month = 1; 2-3 times per month = 2; 1-2 times per week = 3; 3-5 times per week = 4; 6-7 times per week = 5.

Drink plenty of water. Survey items: How many cups of water, including sparkling water, do you drink in a day? How many cups of diet or sugar-free soft drinks, cordial or sports drinks do you drink in a day? (such as coke zero or sugar free Gatorade) How many cups of hot drinks do you drink in a day? (such as tea, coffee, herbal tea)
Total beverage intake: zero cups = 0; M 1-9 cups or F 1-7 cups = 2.5; M ≥ 10 cups or F ≥ 8 cups = 5.
Proportion of water to total beverage intake: 0% = 0; >0% to <50% = 2.5; ≥50% = 5.

Limit intake of foods high in saturated fat. Survey item: How often do you eat chips, French fries, wedges, fried potatoes or crisps?

Statistical Analysis
From the 565 participants at T4, 555 (212 men and 343 women) provided complete responses to all of the 24 dietary survey items that were required to calculate the RDGI scores and undertake subsequent analyses.
Descriptive statistics were calculated for participant sociodemographic, health and lifestyle variables, RDGI scores, and ADG components, along with the percentage of participants that met maximum criteria for each ADG component. Using the outputs that were obtained from the fitted linear regression models, the respective S-RDGI1 and S-RDGI2 scores were calculated as the model intercept plus the sum of the corresponding survey item scores multiplied by their regression coefficients, i.e., Y' = A + Σ(βi × Xi), where Y' is the predicted index score (S-RDGI1 or S-RDGI2), βi is the regression coefficient for the i-th independent variable Xi (survey item score), and A is the intercept.
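The fit-then-predict step can be sketched in Python as follows. This is an illustrative sketch only: the study's analyses were run in SPSS, the helper names `fit_ols` and `predict` are hypothetical, and the item scores and index values below are made up so that the relationship is exactly linear.

```python
def fit_ols(rows, y):
    """Least-squares fit of y on the columns of rows; returns [intercept, b1, b2, ...]."""
    design = [[1.0] + [float(v) for v in r] for r in rows]  # prepend intercept column
    k = len(design[0])
    # Normal equations (X'X) b = X'y, stored as an augmented matrix.
    aug = [
        [sum(d[i] * d[j] for d in design) for j in range(k)]
        + [sum(d[i] * yi for d, yi in zip(design, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("predictors are collinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(k):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][k] / aug[i][i] for i in range(k)]


def predict(coefs, item_scores):
    """Y' = A + sum(beta_i * X_i): intercept plus weighted item scores."""
    return coefs[0] + sum(b * x for b, x in zip(coefs[1:], item_scores))


# Made-up scores for two survey items and a full index, for illustration only.
items = [[0, 1], [2, 0], [4, 3], [5, 5], [1, 4], [3, 2]]
full_index = [50 + 3 * a + 2 * b for a, b in items]  # exact linear relation
coefs = fit_ols(items, full_index)
simple_score = predict(coefs, [2, 2])
```

In the study's setting, the model would be fitted once on the T4 data, where both the full index and the item sub-set are available, and the stored coefficients would then be applied to item scores from earlier time points.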
Several approaches were used to assess the agreement between RDGI scores and S-RDGI scores (S-RDGI1 and S-RDGI2) [53]. Spearman's rank correlation coefficients were used to assess the strength of the associations between diet scores (RDGI, S-RDGI1, and S-RDGI2) and the intakes of food and drink items and ADG component scores. The following categories were used for the interpretation of correlation coefficients: 0.9-1 almost perfect; 0.7-0.9 very high; 0.5-0.7 high; 0.3-0.5 moderate; 0.1-0.3 low; and, 0-0.1 insubstantial [54].
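For readers who wish to reproduce the rank correlations outside SPSS, the following Python sketch computes Spearman's rho as the Pearson correlation of average ranks. The function names are illustrative and not part of the original analysis, and the sketch assumes that neither variable is constant.

```python
def average_ranks(values):
    """1-based ranks, with tied values sharing the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson correlation of the average ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / (vx * vy) ** 0.5
```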
Cross-classification was used to determine the percentage of participants that were correctly classified by the S-RDGI1 and S-RDGI2 into the same tertile (good outcome = ≥ 50%) or grossly misclassified into the opposite tertile (good outcome = ≤ 10%) of diet quality, as measured by the RDGI. Since this method of cross-classification may include agreement that has occurred by chance, Cohen's weighted kappa (Kw) was also calculated to provide a measure of agreement that is adjusted for chance, where Kw = (Po − Pe)/(1 − Pe), Po = observed agreement, and Pe = expected agreement by chance. Weights were applied such that no difference between tertiles = 1, a difference of one tertile = 0.5, and a difference of two tertiles = 0. Values of Kw > 0.80 indicate very good agreement, 0.61-0.80 good agreement, 0.41-0.60 moderate agreement, 0.21-0.40 fair agreement, and <0.20 poor agreement [54].
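The tertile cross-classification and weighted kappa can be sketched in Python as below, using the agreement weights stated above (1, 0.5, and 0). This is a minimal illustration rather than the SPSS procedure used in the study; in particular, the rank-based tertile assignment, which breaks ties by position, is an assumption.

```python
def tertiles(scores):
    """Assign each score to tertile 0, 1 or 2 by rank (ties broken by position)."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    groups = [0] * len(scores)
    for rank, i in enumerate(order):
        groups[i] = rank * 3 // len(scores)
    return groups


def cross_classify(ref, test):
    """Proportions classified into the same, an adjacent, and the opposite tertile."""
    n = len(ref)
    diffs = [abs(a - b) for a, b in zip(ref, test)]
    return tuple(diffs.count(d) / n for d in (0, 1, 2))


def weighted_kappa(ref, test, weights=(1.0, 0.5, 0.0)):
    """Cohen's weighted kappa, Kw = (Po - Pe) / (1 - Pe), with agreement weights."""
    n = len(ref)
    po = sum(weights[abs(a - b)] for a, b in zip(ref, test)) / n
    ref_p = [ref.count(c) / n for c in range(3)]
    test_p = [test.count(c) / n for c in range(3)]
    pe = sum(weights[abs(i - j)] * ref_p[i] * test_p[j]
             for i in range(3) for j in range(3))
    return (po - pe) / (1 - pe)
```

Here `ref` would hold the RDGI tertiles and `test` the S-RDGI1 or S-RDGI2 tertiles for the same participants.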
Bland-Altman plots further examined the agreement between S-RDGI scores (S-RDGI1 and S-RDGI2) and RDGI scores [55]. The difference between the scores (S-RDGI − RDGI) was plotted against the mean of both the scores ((S-RDGI + RDGI)/2), for both S-RDGI1 and S-RDGI2. The mean difference and 95% limits of agreement (LOA; mean difference ± 1.96 SD of the differences) were calculated. Linear regression analysis of the difference on the average of the two measures examined whether the mean difference varied significantly with diet score.
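The quantities behind a Bland-Altman analysis can be computed with a short Python sketch. The function name and return structure are illustrative assumptions, and the significance test for the slope is omitted for brevity.

```python
from statistics import mean, stdev


def bland_altman(simple, full):
    """Mean difference, 95% limits of agreement, and slope of difference on mean."""
    diffs = [s - f for s, f in zip(simple, full)]
    means = [(s + f) / 2 for s, f in zip(simple, full)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    limits = (bias - 1.96 * sd, bias + 1.96 * sd)
    # Least-squares slope of difference on mean; a non-zero slope suggests
    # that the disagreement changes with the level of the score.
    m_bar = mean(means)
    sxx = sum((m - m_bar) ** 2 for m in means)
    slope = sum((m - m_bar) * (d - bias) for m, d in zip(means, diffs)) / sxx
    return bias, limits, slope
```

A negative slope corresponds to a simple score that over-estimates low values and under-estimates high values of the full index.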
Associations between diet scores (RDGI, S-RDGI1, and S-RDGI2) and participant characteristics (gender, age, education, income, self-rated health, smoking status, physical activity, and weight status) were tested using analysis of variance (ANOVA), with post hoc t-tests if more than two categories. All of the analyses were conducted using IBM SPSS Statistics for Windows, Version 23.0. Armonk, NY: IBM Corp.

Sample Characteristics
The distributions of participant characteristics are shown in Table 2. The mean age was 48 years (range = 25-80). Greater than 85% were married or in a de facto relationship, and over half had greater than secondary school education or an income ≥ $90,000. More than 50% of participants rated their health as very good or excellent, only 6.8% were current smokers, and over 55% of participants were classified as either overweight or obese. The mean total amount of moderate to vigorous physical activity per week was 5.6 h. The mean RDGI score was 69.9 (SD = 8.8) and the distribution was slightly negatively skewed (skewness = −0.33). Participants scored the highest on the saturated fat, water/fluids, alcohol, and fruit components. Overall, there were few ADG components that were well met by participants, with fruit (53.5%) and alcohol (46.7%) having the greatest percentage of participants achieving maximum scores.

Table 3 shows the estimated coefficients from the fitted linear regression models for the prediction of S-RDGI scores from the scores of the six survey items (S-RDGI1) and nine survey items (S-RDGI2). All six and nine items had a coefficient that was significantly different from zero (p < 0.05). All of the coefficients were positive, indicating that a higher diet quality was associated with higher intakes of vegetables, fruit, fish and reduced fat dairy and lower intakes of red meat, chips, meat products, and alcohol. For both of the models, fruit then vegetables had the greatest effect on diet quality scores (based on standardized coefficients, i.e., coefficient × SD = standardized coefficient), followed by alcohol (days/week), red meat and milk type for the S-RDGI2, and red meat then milk type for the S-RDGI1. Overall, 62% of the variation in the RDGI scores in this sample was explained by the six survey items (R² = 0.62, p < 0.001) and 73% was explained by the nine survey items (R² = 0.73, p < 0.001). The mean diet quality score was 69.8 (SD = 8.8) for the RDGI, 69.8 (SD = 6.9) for the S-RDGI1, and 69.8 (SD = 7.5) for the S-RDGI2.

Table 3. Estimated coefficients from the fitted linear regression models for prediction of S-RDGI scores (S-RDGI1 and S-RDGI2) from the scores of the sub-sets of survey items (n = 555).

Agreement between RDGI and S-RDGI Scores
Correlations between the diet quality scores (S-RDGI1, S-RDGI2 and RDGI) and specific food and drink items and ADG component scores are presented in Table 4. Intakes of fruit and vegetables were highly positively correlated with all of the diet quality scores, whereas the intakes of discretionary foods, including chips; fried roast or BBQ chicken, pizza, burgers, or fish and chips; and meat pies, sausage rolls, or other savory pastries, were highly negatively correlated with all of the diet scores. As expected, food and drink items that were not included in the S-RDGI1 and S-RDGI2 were not strongly correlated with these scores.

Table 4. Correlations between diet quality scores and intakes of food and drink items (based on original frequency categories outlined in Table A1) and Australian Dietary Guidelines (ADG) components (based on scoring outlined in Table 1).

The S-RDGI1 scores correctly classified 360 participants (64.9%) into the same tertile, misclassified 185 (33.3%) into adjacent tertiles, and grossly misclassified 10 (1.8%) into opposite tertiles of the RDGI. The S-RDGI2 scores correctly classified 387 participants (69.7%) into the same tertile, misclassified 158 (28.4%) into adjacent tertiles, and grossly misclassified 10 (1.8%) into opposite tertiles of the RDGI. The weighted kappa statistic showed that the RDGI and S-RDGI1 had moderate agreement (Kw = 0.58; 95% CI = 0.53, 0.64; p < 0.001), and the RDGI and S-RDGI2 had good agreement (Kw = 0.64; 95% CI = 0.59, 0.69; p < 0.001).

The relationship between the difference in diet scores and their mean (Bland-Altman plot) is illustrated in Figure 1a (S-RDGI1 and RDGI) and Figure 1b (S-RDGI2 and RDGI). The S-RDGI1 and S-RDGI2 appear to over-estimate the lower RDGI scores and under-estimate the higher RDGI scores. The slope of the fitted regression line for the difference versus the means of diet scores was significantly different from zero for both the S-RDGI1 (slope β = −0.269; p < 0.001) and S-RDGI2 (slope β = −0.171; p < 0.001).

Associations between Diet Scores and Participant Characteristics
Mean diet quality scores are shown in Table 5 for the RDGI, S-RDGI1, and S-RDGI2, according to categories of sociodemographic, health, and lifestyle characteristics. A higher diet quality, as determined by the RDGI, S-RDGI1, and S-RDGI2, was associated with being female, undertaking ≥150 min of physical activity per week, and being in excellent health (compared with good/fair). Current smokers had significantly lower diet quality when compared with ex-smokers or those who had never smoked. Those who were overweight or obese had lower mean diet quality, however, this was not statistically significant. Similarly, although there was an increasing trend in mean diet quality for both RDGI and S-RDGI1 with an increasing education level, this was not significant. Diet quality tended to increase with age. There was no significant association or trend in mean diet quality between RDGI, S-RDGI1, and S-RDGI2 and income.

Discussion
Using a method of regression, we found that diet quality scores that were calculated using a sub-set of six (S-RDGI1) and nine (S-RDGI2) survey items showed good agreement with those that were derived from the full 24-item questionnaire (RDGI). All three indices (S-RDGI1, S-RDGI2, and RDGI) showed expected variations across sociodemographic, health, and lifestyle variables (with the exception of income and education), and correlated well with intakes of key food and drink items and ADG component scores. Overall, 62% of the variation in the RDGI scores was explained by the six survey items that were included in the S-RDGI1 and 73% was explained by the nine survey items that were included in the S-RDGI2. Gross misclassification was very low, correct classification was high, and kappa scores were moderate to good.
Diet scores from all three indices (S-RDGI1, S-RDGI2, and RDGI) were significantly correlated with most of the ADG components and the intakes of key food and drink items that are consistent with a healthy diet (i.e., as diet scores increased, intakes of fruit, vegetables, and fish increased, whereas the intakes of red meat, discretionary foods, and alcohol decreased). Correlations ranged from very high to moderate, with the exception of biscuits, cakes, desserts, pastries, lollies, and/or chocolate and sugary drinks. Although not directly comparable, due to variations in methodology, the correlation coefficients from this study were in range with those that were reported elsewhere in the literature [56][57][58][59].
Food and drink items that were not included in the S-RDGI1 and S-RDGI2 were not strongly correlated with these scores; neither were underrepresented ADG components (i.e., grains/cereals). This was expected given that the simple index scores included only a sub-set of survey items and they were not designed to adequately measure intakes of food groups but to differentiate individuals based on their diet quality. Intakes of bread; pasta, rice, noodles, or other cooked cereals; milk; and cheese did not correlate well with any of the three indices. Furthermore, the corresponding ADG components for grains/cereals and dairy or alternatives did not correlate well with any of the scores. It is possible that the insufficient detail collected on these dietary components (e.g., portion sizes, reduced fat versus full fat, and whole grains versus refined grains) limited their contribution to discriminating individuals with high or low diet quality. Alternatively, these dietary components may not be important indicators of diet quality. Other studies have reported similar findings with intakes of grains and dairy showing weak correlations with diet quality [56,59,60].
Both the S-RDGI1 and S-RDGI2 over-estimated lower RDGI scores and under-estimated higher RDGI scores. This bias was less apparent for the S-RDGI2, as expected given the inclusion of three extra survey items in the S-RDGI2. However, achieving absolute agreement between the scores was not the main aim of this study, rather, it was intended to rank participants according to their diet quality. Results showed that 95% of participants had a difference in diet quality scores within ±11 (S-RDGI1) and ±9 (S-RDGI2) points of the RDGI, and this is adequate to successfully classify individuals into high, medium, or low diet quality.
There were few ADG components that were well met by participants, as consistent with previously published Australian work [44,46,61]. Participants scored higher on components for alcohol and fruit intake, and lower for vegetables, grains/cereals, and dairy components. Other studies have reported similar findings [44,46], as did the 2014 Health and Wellbeing Survey for adults in Western Australia, in which 52% and 8.8% of respondents reported eating the recommended daily serves of fruit and vegetables, respectively [62].
Both the S-RDGI1 and S-RDGI2 performed comparatively well in distinguishing between groups with known differences in diet quality. For all of the indices, a higher diet quality was associated with being female, undertaking moderate and high amounts of physical activity, not smoking, and being in excellent health. These findings are consistent with previous studies [44,46]. The lack of a significant association between diet quality indices and either income or education was likely due to the characteristics of the RESIDE study sample being of a relatively higher socioeconomic status with over half having greater than secondary school education or an income ≥ $90,000. Similarly, the lack of a clear trend in diet quality with age for all of the indices may be due to the disproportionate sample, with only 12 participants being under the age of 30. However, diet quality was significantly higher in the oldest age group for the S-RDGI1 and S-RDGI2, as seen in the latest 2011-2012 National Nutrition and Physical Activity Survey (NNPAS) [63]. Although participants that were classified as overweight or obese had lower diet quality, as determined by all three indices, this was not significant and was possibly due to the unreliable nature of self-reported height and weight data [64], or dietary under-reporting being more common in overweight people [65].

Implications
The S-RDGI2 (nine item score) performed slightly better than the S-RDGI1 (six item score) across all of the measures of agreement. However, the addition of three extra survey items in the S-RDGI2 for fish and alcohol intake only improved the regression model R² by 0.11 (from 0.62 to 0.73). Since the S-RDGI1 demonstrated good agreement with similar results to the S-RDGI2, this suggests that a large proportion of the variability in diet quality can be explained by relatively few survey items relating to dietary behaviors, consumption, and food frequency. These findings are consistent with previous studies. For example, the UK Eating Choices Index (ECI), which was developed from a four-item questionnaire, successfully discriminated between healthy and unhealthy eating choices and was significantly correlated with nutrient profiles consistent with a healthy diet [25]. Similarly, an eight-item Danish dietary quality score (DQS) was associated with key nutrients and biomarkers that are consistent with higher diet quality [66]. Indeed, research has shown that intakes of certain food types can significantly predict an individual's diet quality [24]. This has implications for the development of new short dietary questionnaires or screeners, suggesting that just a few key questions relating to individual food items can be successfully used to classify populations or groups according to their diet quality.

Strengths and Limitations
This study was limited by the modest response rate (31%), and participant characteristics may not be comparable to national statistics due to the nature of the RESIDE study design. As such, results may not be transferable to other populations. Furthermore, it is acknowledged that given the low response rate and the period of data collection that was used in this study, differences in food consumption patterns amongst groups of people (responders and non-responders) or over time, may mean that the six and nine-item questionnaires do not predict diet quality as effectively in situations outside the present study. Rather, researchers should apply the techniques that are described in this paper to an existing dataset in order to generate a sample-specific measure of diet quality. No information on portion sizes of food frequency items was obtained, nor was it possible to adjust scores by energy intake. However, this is consistent with brief dietary assessment methods and other studies [44,46]. The present study did not measure dietary variety across food groups. Studies have demonstrated variety across food groups to be a stronger predictor of dietary quality than variety within those groups [67]. However, the inclusion of dietary variety at the food group level can be considered to be unnecessary since only with a varied diet is it possible to score high on all food group items [6]. In addition, individuals with high intakes as a result of higher energy needs will more likely also have a higher dietary variety. There was limited dietary information on unsaturated fats and meat alternatives, e.g., nuts, seeds, beans/legumes, tofu, or eggs. Therefore, these aspects of diet were underrepresented in scores and may have led to the misclassification of diet quality for some participants. 
Similarly, fruit juice was not included as a serving of fruit on the RESIDE questionnaire and this may have resulted in the underreporting of fruit intake and contributed towards lowered scores for diet quality.
The same tool was used to create the RDGI and S-RDGI1/S-RDGI2, which can potentially lead to an overestimation of relative validity [68] and contribute to correlated errors. As such, actual agreement may be lower than the reported findings. Ideally, estimated index scores should be compared against a standard reference that is constructed from an independent dietary assessment method to avoid correlated errors [69]. However, other studies have applied similar approaches within the literature in the absence of more appropriate data [58].
This study was strengthened by the use of a food-based index of diet quality (RDGI), consistent with the current focus of diet research [5]. Furthermore, although the RDGI was not independently validated, it was modelled on an index that has been previously validated and shown to reflect intakes of key nutrients [44]. In addition, this study used age- and sex-specific cut-offs with proportionate intermediate scoring within the guidelines to incorporate additional variation in scores.
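Proportionate intermediate scoring against a group-specific cut-off can be sketched as follows. This is an illustration of the general idea only, not the RDGI scoring rules: the function name `component_score`, the maximum of 10 points, and the cut-off values in `HYPOTHETICAL_CUTOFFS` are all assumptions made for the example.

```python
# Placeholder cut-offs (serves per day) keyed by (sex, age band).
# These values are illustrative only and are not taken from the study.
HYPOTHETICAL_CUTOFFS = {
    ("female", "25-50"): 5.0,
    ("male", "25-50"): 6.0,
}


def component_score(intake, cutoff, max_score=10.0):
    """Score an adequacy component in proportion to how far intake
    reaches towards the cut-off, capped at max_score."""
    if cutoff <= 0:
        raise ValueError("cutoff must be positive")
    ratio = min(max(intake / cutoff, 0.0), 1.0)
    return max_score * ratio


# A participant meeting half of the group-specific cut-off receives half marks.
cutoff = HYPOTHETICAL_CUTOFFS[("female", "25-50")]
print(component_score(2.5, cutoff))
```

Compared with all-or-nothing scoring, this approach lets participants who partly meet a guideline receive partial credit, which spreads scores over a wider range.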

Recommendations for Future Research
Future research to test the reliability and validate the S-RDGI1 and S-RDGI2 against a reference method (three-day food diary and/or validated FFQ), and examine scores in relation to objectively measured health outcomes, nutrient intakes, and/or relevant biomarkers, is recommended. In addition, investigating whether the S-RDGI1 and S-RDGI2 can detect changes in diet quality over time will further evaluate the effectiveness of the described methods as a means of generating a measure of diet quality for use in epidemiological research and studies of public health interventions.

Conclusions
The S-RDGI1 and S-RDGI2 successfully ranked participants according to their diet quality and performed well across a range of measures of agreement. However, due to the discussed limitations and the potential for correlated errors, results should be interpreted with caution and we emphasize that our observations would benefit from further validation. Yet, findings indicate that a large proportion of the variability in diet quality scores can be captured using a relatively small sub-set of dietary survey items or indicator questions based on individual food groups or items. Therefore, in large-scale, longitudinal studies with limited time and resources available for dietary assessment, a sub-set of questionnaire items may be sufficient for the development of a sample-specific measure of diet quality for ranking individuals or as a confounding variable in subsequent analyses. Furthermore, in situations where incomplete dietary data are available across time points, a method of regression that is based on an available sub-set of questionnaire items may be an effective approach for generating a consistent measure of diet quality and overcoming this common limitation of longitudinal studies.

How often do you eat cheese (including ricotta, cottage, processed, cream cheese, hard and soft cheese)? [39] 0.89
- 3-5 times per week
- Most days (6-7 times per week)

[4] and Australian Guidelines to Reduce Health Risks from Drinking Alcohol [47] used to construct the RDGI.

Australian Dietary Guidelines
Guideline 2: Enjoy a wide variety of nutritious foods from these five groups every day
• Plenty of vegetables of different types and colors, and legumes/beans.
• Milk, yoghurt, cheese and/or their alternatives, mostly reduced fat.
• Drink plenty of water.

Australian Dietary Guidelines
Guideline 3: Limit intake of foods containing saturated fat, added salt, added sugars and alcohol
• Limit intake of foods high in saturated fat such as many biscuits, cakes, pastries, pies, processed meats, commercial burgers, pizza, fried foods, potato chips, crisps and other savory snacks.
• Limit intake of foods and drinks containing added salt.
• Limit intake of foods and drinks containing added sugars such as confectionary, sugar-sweetened soft drinks and cordials, fruit drinks, vitamin waters, energy and sports drinks.
• If you choose to drink alcohol, limit intake. For women who are pregnant, planning a pregnancy or breastfeeding, not drinking alcohol is the safest option.

Australian Guidelines to Reduce Health Risks from Drinking Alcohol
Guideline 1: Reducing the risk of alcohol-related harm over a lifetime
• No more than 2 standard drinks on any day.