The European Food Safety Authority has developed a standardized food classification and description system called FoodEx2. It uses facets to describe food properties and aspects from various perspectives, making it easier to compare food consumption data from different sources and perform more detailed data analyses. However, both food composition data and food consumption data, which need to be linked, are lacking in FoodEx2 because the process of classification and description has to be manually performed—a process that is laborious and requires good knowledge of the system and also good knowledge of food (composition, processing, marketing, etc.). In this paper, we introduce a semi-automatic system for classifying and describing foods according to FoodEx2, which consists of three parts. The first involves a machine learning approach and classifies foods into four FoodEx2 categories, with two for single foods: raw (r) and derivatives (d), and two for composite foods: simple (s) and aggregated (c). The second uses a natural language processing approach and probability theory to describe foods. The third combines the result from the first and the second part by defining post-processing rules in order to improve the result for the classification part. We tested the system using a set of food items (from Slovenia) manually-coded according to FoodEx2. The new semi-automatic system obtained an accuracy of 89% for the classification part and 79% for the description part, or an overall result of 79% for the whole system.
This is an open access article distributed under the Creative Commons Attribution License
which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited