Reproducibility and Validity of a Semi-Quantitative Food Frequency Questionnaire for Assessing Dietary Intake of Vegetarians and Omnivores in Harbin, China

This study aimed to evaluate the reproducibility and validity of a semi-quantitative food frequency questionnaire (SQFFQ) developed for vegetarians and omnivores in Harbin, China. Participants (36 vegetarians and 64 omnivores) completed the SQFFQ at baseline (SQFFQ1) and six months later (SQFFQ2) to assess reproducibility. To determine validity, 24 h recalls (24 HRs) for three consecutive days were completed between the two SQFFQ administrations. For reproducibility, Pearson correlation coefficients between SQFFQ1 and SQFFQ2 for vegetarians and omnivores were 0.45~0.88 and 0.44~0.84, respectively. For validity, unadjusted Pearson correlation coefficients between SQFFQ1 and the mean 24 HRs for vegetarians and omnivores were 0.46~0.83 (average 0.63) and 0.43~0.86 (average 0.61), respectively; energy-adjusted Pearson correlation coefficients were 0.43~0.82 (average 0.61) and 0.40~0.85 (average 0.59), respectively. The majority of the correlation coefficients for food groups and macronutrients decreased or remained unchanged after energy adjustment. All correlations were statistically significant (p < 0.05). Bland–Altman plots also showed reasonably acceptable agreement between the two methods. In conclusion, the SQFFQ developed in this study has reasonably acceptable reproducibility and validity.


Introduction
Chronic diseases, especially obesity, diabetes, cancer and cardiovascular diseases, are global public health issues that have attracted increasing attention. Epidemiological studies have suggested that dietary habits are likely to be related to the occurrence of chronic disease. A vegetarian diet can reduce the incidence of obesity [1], diabetes [2], cancer [3] and cardiovascular diseases [4], while an omnivorous diet may have the opposite effect [5]. To determine the relationship between dietary habits and chronic diseases, and to improve health status through effective dietary interventions, it is necessary to assess the dietary intake of vegetarians and omnivores reliably. Weighed food records can measure dietary intake accurately, but they are time-consuming and better suited to short-term surveys of individual dietary intake [6]. The semi-quantitative food frequency questionnaire (SQFFQ) is widely used to assess dietary intake over various periods in epidemiological research because it saves time, costs little, is simple to administer and achieves a high response rate; moreover, the data collected reflect long-term past dietary intake, which is more valuable than short-term data [7,8]. Since an SQFFQ is prone to some degree of measurement error that may attenuate associations between dietary intake and disease. All newly developed

Study Participants
The participants in the present study were recruited through advertisements, email and telephone in the Harbin area, China. To be included, participants were required to be 25~40 years of age; free of chronic, nutritional and infectious diseases; neither pregnant nor breastfeeding; and without smoking or drinking habits. Vegetarians were additionally required to have followed a vegetarian diet for at least one year [24,25]. Written informed consent was obtained from all participants. We collected information about participants' age, education level and employment status. Height was measured without shoes to the nearest 0.1 cm using a research-grade digital stadiometer (Model: HT-DM40, Faenza, Italy). Weight was measured in light clothing without shoes to the nearest 0.1 kg with a portable digital scale (Model: Yolanda-CS10A, Shenzhen, China). Body mass index (BMI) was calculated according to the following formula [26]: BMI = weight (kg)/height² (m²). In total, 121 participants (46 vegetarians and 75 omnivores) were recruited.

Study Design
The study started in July 2017 and lasted six months. During the study period, participants were required to complete two administrations of the SQFFQ and 24 HRs on three consecutive days, including two weekdays and one weekend day [21]. For the reproducibility study, SQFFQ1 was administered by a trained interviewer at baseline and SQFFQ2 at the follow-up visit six months later [27]. To validate the SQFFQ, the 24 HRs for three consecutive days were completed in the interval between the two SQFFQs [28]. In return, each participant received a detailed dietary assessment and personalized dietary guidance based on the analysis of his or her nutrient intake. The study design and schedule are shown in Figure 1.

Figure 1.
Study design and schedule used in this study. A semi-quantitative food frequency questionnaire (SQFFQ) covering 22 food groups was administered to vegetarians and omnivores at baseline (SQFFQ1) and 6 months later (SQFFQ2) by trained interviewers in face-to-face interviews. Between SQFFQ1 and SQFFQ2, participants completed 24 h dietary recalls (24 HRs) for three consecutive days (including two weekdays and one weekend day), recalling the items and portion sizes of all foods consumed from 22:00 on one day to 22:00 on the next. Reproducibility was tested by comparing the results of the two SQFFQs, and validity was assessed by comparing the data obtained from SQFFQ1 with the mean of the 24 HRs.

Semi-Quantitative Food Frequency Questionnaire
The SQFFQ was developed based on the methodology proposed by Willett [6]. It consisted of three parts: the food item list, the frequency of food consumption, and the amount of food consumed each time. The food list contained 116 items, divided into 22 groups (Table 1). The food items were selected based on the dietary guidelines for Chinese residents [26], the National Health and Dietary Survey in China [29] and the dietary habits of local Chinese vegetarians and omnivores. The frequency options provided in the SQFFQ were (1) none/no consumption; (2) number of times per day; (3) number of times per week; (4) number of times per month; (5) number of times per year. The average amount consumed each time was recorded in grams (g) or milliliters (mL). To improve the accuracy of participants' estimates of food weight, we provided plastic food models and photos of standard food portion sizes [30]. For seasonal foods (e.g., watermelon, grape, and cucumber), participants were asked to recall how often they ate these foods during the season, and interviewers then converted the in-season consumption frequency to an average consumption frequency over a year [31]. For example, if a participant ate watermelon for 3 months (June to August) in the past year, 3 times a week on average and 1000 g each time, the interviewer recorded "1000 g" in the column for average consumption per time and "36" (3 times a week × 12 weeks) in the column for "annual" eating times. The mean daily intake of each food item (g/d or mL/d) was calculated by multiplying the daily consumption frequency by the amount consumed each time [32].
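The seasonal conversion and the daily intake calculation described above can be sketched as follows. This is a minimal illustration, not the authors' procedure: the function names are hypothetical, the season is counted as 12 weeks (4 weeks per month, which reproduces the "36" in the example), and the divisor of 365 days per year is an assumption, since the text does not state it.

```python
DAYS_PER_YEAR = 365  # assumption: the divisor is not stated in the text


def annual_frequency(times_per_week, weeks_in_season):
    """Convert an in-season weekly frequency to eating times per year."""
    return times_per_week * weeks_in_season


def mean_daily_intake(times_per_year, amount_per_time):
    """Mean daily intake (g/d or mL/d): daily frequency x amount per time."""
    return times_per_year / DAYS_PER_YEAR * amount_per_time


# Watermelon example from the text: 3 times a week for about 12 weeks
# (June to August), 1000 g each time.
watermelon_per_year = annual_frequency(times_per_week=3, weeks_in_season=12)
watermelon_g_per_day = mean_daily_intake(watermelon_per_year, amount_per_time=1000)

print(watermelon_per_year)             # 36
print(round(watermelon_g_per_day, 1))  # 98.6
```

Under these assumptions, the 36 annual eating times of 1000 g each correspond to roughly 98.6 g of watermelon per day averaged over the year.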

24 Hour Dietary Recall
Each participant was asked to complete 24 HRs on three consecutive days during the interval between the two SQFFQs. The three days included two weekdays and one weekend day. For each recall, participants were required to report the items and portion sizes of all foods consumed during the past 24 h, from 22:00 on one day to 22:00 on the next. Mixed dishes were converted into single food items. The recalled food items were assigned to the corresponding food groups as defined by the SQFFQ. Trained interviewers administered the SQFFQs and 24 HRs through face-to-face interviews. The mean of the three 24 HRs was used as the reference method to validate the SQFFQ [28]. Throughout the study period, each participant was interviewed by the same interviewer to reduce possible bias.

Data Cleaning
Participants who did not complete the SQFFQs or 24 HRs were excluded from the analyses. Participants with implausible energy intakes (< 600 kcal/day or > 4000 kcal/day) were also excluded [19,33]. The total energy intake of each participant was calculated based on the Chinese food composition tables [34].
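The two exclusion rules above can be expressed as a simple filter. The sketch below is illustrative only: the record layout and field names are hypothetical, the sample records are made up, and it assumes a single energy value per participant, because the text does not say which instrument's energy estimate was used for the plausibility check.

```python
MIN_KCAL = 600   # lower plausibility bound given in the text
MAX_KCAL = 4000  # upper plausibility bound given in the text


def is_plausible(energy_kcal_per_day):
    """True when daily energy intake lies within the plausible range."""
    return MIN_KCAL <= energy_kcal_per_day <= MAX_KCAL


def keep_participant(record):
    """Keep participants with complete SQFFQs and 24 HRs and plausible energy."""
    complete = (record["sqffq1_done"] and record["sqffq2_done"]
                and record["recalls_done"])
    return bool(complete) and is_plausible(record["energy_kcal"])


# Made-up records for illustration (not study data).
sample = [
    {"id": "A", "sqffq1_done": True, "sqffq2_done": True,
     "recalls_done": True, "energy_kcal": 2100},
    {"id": "B", "sqffq1_done": True, "sqffq2_done": False,
     "recalls_done": True, "energy_kcal": 1900},
    {"id": "C", "sqffq1_done": True, "sqffq2_done": True,
     "recalls_done": True, "energy_kcal": 4500},
]
kept = [r["id"] for r in sample if keep_participant(r)]
print(kept)  # ['A']
```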

Statistical Analysis
All statistical analyses were performed with SPSS 20.0. A value of p < 0.05 was considered statistically significant. Categorical variables, such as gender, education level and employment status, were expressed as frequency (n) and percentage (%). Continuous variables, such as age, height, weight and BMI, were expressed as mean and standard deviation (SD). The daily intake of each food item was determined from its average consumption frequency and the amount consumed [35]. The macronutrient intake from each food item was calculated as the daily intake of the item (g) divided by 100 and multiplied by its nutrient content per 100 g [36]. The macronutrient composition of foods was taken from the Chinese Food Composition Tables [34]. Descriptive statistics for energy, macronutrient and food intake are presented as mean and standard deviation (SD) and as median and interquartile range. Differences in food and macronutrient intake between the two SQFFQs, and between SQFFQ1 and the mean 24 HRs, were compared using the Wilcoxon signed-rank test [20]. Pearson correlation coefficients were calculated to assess the association between the average daily intakes of nutrients and foods obtained from the two SQFFQs and from SQFFQ1 and the mean 24 HRs [37]. Correlation coefficients of 0.10~0.39, 0.40~0.69, 0.70~0.89 and 0.90~1.00 represent a weak, moderate, strong and very strong correlation, respectively [38]. Energy-adjusted intakes of food and macronutrients were calculated using the residual method [39] to remove the within-person variation caused by day-to-day fluctuations and seasonal variations, and were used to calculate correlation coefficients for assessing the relationship between SQFFQ1 and the mean 24 HRs. For visualization, Bland-Altman plots were drawn to examine the agreement between SQFFQ1 and the mean 24 HRs for energy and macronutrients [40]. Good agreement was defined as no more than 10% of the points falling outside the 95% limits of agreement, with the points lying close to the mean line [32].
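The correlation and energy-adjustment steps described above were run in SPSS; the following Python sketch only illustrates the underlying calculations. It assumes the common form of the residual method, in which intake is regressed on total energy and the intake predicted at mean energy is added back to each residual. The function names and the sample values are hypothetical, and the "negligible" label for coefficients below 0.10 is an assumption, since the text does not name that range.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def energy_adjust(intake, energy):
    """Residual method: regress intake on energy and keep the residuals,
    shifted by the intake predicted at mean energy (the mean intake)."""
    n = len(intake)
    mean_i, mean_e = sum(intake) / n, sum(energy) / n
    slope = (sum((e - mean_e) * (i - mean_i) for i, e in zip(intake, energy))
             / sum((e - mean_e) ** 2 for e in energy))
    intercept = mean_i - slope * mean_e
    return [i - (intercept + slope * e) + mean_i
            for i, e in zip(intake, energy)]


def strength(r):
    """Label a coefficient using the cut-offs quoted in the text [38]."""
    r = abs(r)
    if r >= 0.90:
        return "very strong"
    if r >= 0.70:
        return "strong"
    if r >= 0.40:
        return "moderate"
    if r >= 0.10:
        return "weak"
    return "negligible"  # assumption: range not named in the text


# Made-up protein (g/d) and energy (kcal/d) values, not study data.
energy_ffq = [1800, 2000, 2200, 2400, 2600]
protein_ffq = [60, 66, 70, 81, 83]
energy_24h = [1900, 1950, 2300, 2350, 2500]
protein_24h = [62, 64, 73, 78, 80]

adj_ffq = energy_adjust(protein_ffq, energy_ffq)
adj_24h = energy_adjust(protein_24h, energy_24h)
r_unadjusted = pearson(protein_ffq, protein_24h)
r_adjusted = pearson(adj_ffq, adj_24h)
print(round(r_unadjusted, 2), strength(r_unadjusted))
print(round(r_adjusted, 2), strength(r_adjusted))
```

By construction, the adjusted intakes are uncorrelated with total energy, so a correlation computed on them reflects agreement in dietary composition rather than in overall amount eaten.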

Results
All enrolled participants (n = 121) took part in the surveys. Participants who did not satisfactorily complete the SQFFQs or 24 HRs (7 vegetarians and 5 omnivores) or who had implausible energy intakes (3 vegetarians and 6 omnivores) were excluded from the analyses, leaving 100 subjects (82.6%; 36 vegetarians and 64 omnivores) who completed the study.
The characteristics of the 100 subjects are shown in Table 2. The mean age was 32.8 ± 4.8 years (range 25 to 45 years), and 52.0% were women. The mean height was 167.0 ± 7.4 cm. The mean weight was 65.1 ± 5.9 kg, and the mean BMI was 23.1 ± 3.1 kg/m² (range 18.8 to 30.0 kg/m²). In total, 18.0% of the subjects had a university degree or above, and 72% were employed. There were no significant differences in age, height, weight or BMI between vegetarians and omnivores (p > 0.05).

Reproducibility
Table 3 compares the intakes of foods, energy and macronutrients estimated by the two SQFFQs. For vegetarians, the intakes of rice, flour food, buns, eggs, dark vegetables, fruits, nuts, beverages, energy, protein, fat and carbohydrates were higher when estimated by SQFFQ2 than by SQFFQ1; for omnivores, the intakes of buns, pastry food, fried food, red meat, processed meat, freshwater fish, seafood, bean products, light vegetables, mushrooms, beverages, energy, protein, fat and carbohydrates were higher when estimated by SQFFQ2 than by SQFFQ1. The differences between SQFFQ1 and SQFFQ2 for vegetarians and omnivores were 0~9.7% and 0.1%~15.8%, respectively. The Wilcoxon signed-rank test showed no significant difference in food, energy or macronutrient intake between SQFFQ1 and SQFFQ2 (p > 0.05). The Pearson correlation coefficients between the two SQFFQs ranged from 0.45 for eggs to 0.88 for fruits (average 0.65) in vegetarians and from 0.44 for coarse cereals to 0.84 for dairy (average 0.64) in omnivores. All correlations were statistically significant (p < 0.05). These results show that SQFFQ1 and SQFFQ2 were consistent, indicating reasonably acceptable reproducibility.

Validity
Table 4 shows the validity of the food, energy and macronutrient intakes estimated by SQFFQ1 against the mean 24 HRs. Among vegetarians, except for red meat, poultry, processed meat, freshwater fish and seafood, some food groups (i.e., rice, flour food, buns, eggs, bean products and dark vegetables), energy and macronutrients were underestimated by SQFFQ1 compared with the mean 24 HRs, with differences of 1.7%~9.5%, and the other food groups were overestimated, with differences of 0.3%~13.5%. Among omnivores, some food groups (i.e., rice, porridge, fried food, coarse cereals, potato, dairy, eggs, poultry, freshwater fish, dark vegetables, light vegetables, mushrooms and fruits) were overestimated, with differences of 0.8%~13.9%, and the others were underestimated, with differences of 0.4%~17.2%. Despite this underestimation and overestimation, no significant difference was observed for the food groups, energy or macronutrients between SQFFQ1 and the mean 24 HRs (p > 0.05). The unadjusted Pearson correlation coefficients between SQFFQ1 and the mean 24 HRs for vegetarians and omnivores were 0.46~0.83 (average 0.63) and 0.43~0.86 (average 0.61), respectively. All correlations were statistically significant (p < 0.05). Most of the correlation coefficients for food groups and macronutrients decreased or remained unchanged after energy adjustment. The energy-adjusted Pearson correlation coefficients between SQFFQ1 and the mean 24 HRs for vegetarians and omnivores were 0.43~0.82 (average 0.61) and 0.40~0.85 (average 0.59), respectively. All correlations were statistically significant (p < 0.05). These results show that SQFFQ1 and the 24 HRs were consistent, indicating reasonably acceptable validity.

Bland-Altman Analyses
Bland-Altman plots graphically show the agreement between SQFFQ1 and the 24 HRs for energy and macronutrients; some of them are shown in Figure 2. The horizontal axis represents the mean of the energy or macronutrient intakes estimated by SQFFQ1 and the 24 HRs, whereas the vertical axis represents the difference in intake between SQFFQ1 and the 24 HRs. The dashed line represents the mean difference between the two methods, while the solid lines represent the 95% limits of agreement (mean difference ± 1.96 standard deviations of the differences). Good agreement was defined as no more than 10% of the points falling outside the 95% limits of agreement, with the points lying close to the mean line [32]. As shown in Figure 2, apart from a few points, most of the points were within the 95% limits of agreement, and most of them were close to the mean line.
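The quantities plotted in a Bland-Altman analysis can be computed as in the sketch below. It is a minimal illustration with made-up energy values, not the study data; the function name is hypothetical, the sample standard deviation is assumed, and only the "no more than 10% outside the limits" part of the agreement criterion is encoded, because closeness to the mean line is judged visually.

```python
from statistics import mean, stdev


def bland_altman(method_a, method_b):
    """Return plot points, bias, 95% limits of agreement and share outside."""
    diffs = [a - b for a, b in zip(method_a, method_b)]         # vertical axis
    means = [(a + b) / 2 for a, b in zip(method_a, method_b)]   # horizontal axis
    bias = mean(diffs)       # dashed line: mean difference
    sd = stdev(diffs)        # sample SD of the differences (assumption)
    lower, upper = bias - 1.96 * sd, bias + 1.96 * sd           # solid lines
    outside = sum(1 for d in diffs if d < lower or d > upper)
    return {
        "points": list(zip(means, diffs)),
        "bias": bias,
        "lower": lower,
        "upper": upper,
        "share_outside": outside / len(diffs),
    }


# Made-up energy intakes (kcal/d) for five participants.
sqffq1_energy = [2000, 2100, 1900, 2300, 2200]
recall_energy = [1950, 2150, 1900, 2250, 2250]

result = bland_altman(sqffq1_energy, recall_energy)
print(result["bias"], result["lower"], result["upper"])
print(result["share_outside"] <= 0.10)  # first part of the agreement criterion
```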

Discussion
To determine the relationship between dietary habits and chronic diseases, an SQFFQ consisting of 116 food items was developed to assess the dietary intake of vegetarians and omnivores in Harbin, China. The number of food items falls within the range of 5 to 350 suggested by Cade [15] and was therefore considered appropriate. In the present study, we evaluated the reproducibility and validity of the SQFFQ. Reproducibility is assessed by administering the same questionnaire to the same subject at two different time points. The larger the correlation coefficient between the data obtained from the two surveys, the better the reproducibility of the questionnaire [12]. Time intervals between SQFFQ1 and SQFFQ2 ranging from 15 days to several years have been reported in previous studies [11,41]. The interval between the two questionnaires should be long enough that respondents cannot remember their previous answers, yet short enough that their dietary habits do not change between the two administrations. Some researchers consider an interval of half a year to one year to be preferable [42]. Therefore, the interval between SQFFQ1 and SQFFQ2 in this study was 6 months. However, such an interval can capture changes in intake caused by seasonality, which may have occurred in this study and may have lowered the observed correlations, especially for fruits and vegetables.
In the reproducibility study, there was no statistically significant difference between SQFFQ1 and SQFFQ2. Pearson correlation coefficients between the two SQFFQs for vegetarians and omnivores were 0.45~0.88 and 0.44~0.84, respectively (p < 0.05). These results were similar to those of a previous study [43]. Among vegetarians, the correlation coefficients of coarse cereals (0.81), dark vegetables (0.85), light vegetables (0.84) and fruits (0.88) were higher than those of the other items; among omnivores, the correlation coefficients of red meat (0.84), poultry (0.83), processed meat (0.79) and seafood (0.78) were higher than those of the other items. These higher correlation coefficients may be related to the dietary habits of each group. Some researchers consider that when the correlation coefficient between dietary survey methods exceeds 0.4 and the correlation is statistically significant, the survey results can be regarded as consistent, indicating reasonably acceptable reproducibility [44].
Validity refers to the effectiveness and authenticity of the survey results, that is, the consistency between the data obtained by the SQFFQ and the actual intake data. At present, no assessment method can estimate dietary intake with complete accuracy. Therefore, validity can only be evaluated by comparing the results of the SQFFQ with those of a relatively accurate assessment method. The 24 HRs for three consecutive days are often used as a reference method [45,46] because they do not affect the SQFFQ measurements and the measurement errors of the two methods are uncorrelated [47]. In this study, there was no statistically significant difference between SQFFQ1 and the 24 HRs in the food, energy and macronutrient intake of vegetarians and omnivores. The unadjusted Pearson correlation coefficients between the two methods for vegetarians and omnivores were 0.46~0.83 and 0.43~0.86, respectively (p < 0.05). The energy-adjusted Pearson correlation coefficients between the two methods for vegetarians and omnivores were 0.43~0.82 and 0.40~0.85, respectively (p < 0.05). The correlation coefficients of most foods and macronutrients ranged from 0.4 to 0.7, showing moderate agreement. Similar to a previous FFQ study [48], the correlation coefficients of most foods and nutrients in this study decreased after energy adjustment. This may be due to the large differences in energy intake among individuals. Similarly, the variability may be associated with systematic errors of overestimation or underestimation. In addition, we used Bland-Altman plots to evaluate the agreement between the SQFFQ and the 24 HRs. Good agreement was defined as no more than 10% of the points falling outside the 95% limits of agreement, with the points lying close to the mean line [32]. The Bland-Altman analysis showed good agreement between SQFFQ1 and the 24 HRs in both vegetarians and omnivores, indicating that the SQFFQ has reasonably acceptable validity.

Strengths and Limitations
FFQs and other memory-based dietary assessment methods are useful tools in epidemiological studies for understanding subjects' dietary intake [49]. Although the limitations of these methods are acknowledged, SQFFQs remain the most widely used dietary assessment method for studying dietary patterns.
The main strength of this study is that trained interviewers administered the SQFFQs and 24 HRs through face-to-face interviews, and each participant was interviewed by the same interviewer throughout the study period to minimize possible bias. Moreover, plastic food models and photos of standard food portion sizes were provided to improve the accuracy of participants' estimates of food weight.
On the other hand, this study has a few limitations. Willett [6] has suggested a sample size of 100 to 200 as reasonable for validation studies; however, this study excluded many participants on the basis of the recruitment criteria and questionnaire quality, which might have resulted in a relatively small sample size for validity assessment. The 24 HRs might not adequately reflect seasonal effects and other poorly defined fluctuations in dietary consumption. For seasonal foods, factors such as forgetfulness and errors in estimating food portion size can cause underestimation [50]. In addition, biological markers can be used as reference methods for validity assessment. We did not use biomarkers to assess dietary intake because they are affected by bioavailability and absorption, which may lead to underestimation [51]; consequently, we could rely only on the 24 HRs to assess the validity of the SQFFQ.

Conclusions
The results demonstrated that the SQFFQ has reasonably acceptable reproducibility and validity for assessing the dietary consumption of vegetarians and omnivores in Harbin, China. Based on the present study, this SQFFQ may be applied to epidemiological investigations of the relationship between dietary intake and chronic diseases among vegetarians and omnivores in similar areas.