Food frequency questionnaires (FFQs) are the most commonly used tools in nutrition monitoring because they are inexpensive, easy to implement, and provide useful information about dietary intake. They are usually carefully drafted by experts in nutrition and/or medicine and can be validated against other dietary monitoring techniques. FFQs can become very long, which suggests that some questions are less informative than others and could be omitted without losing much information. In this paper, machine learning is used to explore how reducing the number of questions affects the predicted nutrient values and diet quality score. The paper addresses the problem of removing redundant questions and finding the best subset of questions in the Extended Short Form Food Frequency Questionnaire (ESFFFQ), developed as part of the H2020 project WellCo. Eight common machine-learning algorithms were compared on different subsets of questions using the PROMETHEE method, which compares methods and subsets across multiple performance measures. According to the results, for some of the targets, specifically sugar intake, fiber intake and protein intake, a smaller subset of questions is sufficient to predict diet quality scores. Additionally, for smaller subsets of questions, machine-learning algorithms generally perform better than statistical methods at predicting intake and diet quality scores. The proposed method could therefore also be useful for finding the most informative subsets of questions in other FFQs. This could help experts develop FFQs that provide the necessary information without being overly burdensome for respondents.
This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.