Evaluation of the Scientific Quality and Usability of Digital Dietary Assessment Tools

The importance of digital tools for dietary assessment has increased in recent years, both commercially and scientifically. In the field of nutrition research, the digitization of dietary assessment methods presents many opportunities and risks. One of the main challenges is ensuring scientific quality while maintaining good usability. In this context, an evaluation tool was developed based on the guidelines of the European Food Safety Authority (EFSA; 2009 and 2014), complemented by the usability aspect of health-related applications. This was followed by a literature search concerning the available dietary assessment tools, which were analyzed according to the evaluation criteria. Eight applications were included in the study after reviewing the inclusion and exclusion criteria for the digital tools. A total of thirty-eight requirements in eight main categories were defined for the evaluation, which the best possible dietary assessment tool should meet. The evaluation showed that none of the tested tools currently meet all the defined requirements or categories. The aspects of usability and the accuracy of data collection showed a positive correlation, suggesting a direct link between the two categories and providing an important approach for future developments.


Introduction
In view of growing social challenges and global crises, contemporary and target-group-oriented nutrition research is becoming increasingly important. In this context, the digitalization of nutritional research, including the associated methods of nutritional assessment, is seen as an important concern by leading scientific societies. At the national and international levels, there is a great need for research in the field of nutritional sciences, which is often limited in its realization due to a lack of financial resources [1]. In this context, digital dietary assessment methods offer a cost-effective alternative to traditional methods such as computer-assisted telephone interviews (CATIs) and computer-assisted personal interviews (CAPIs) [2].
CAPIs and CATIs, e.g., using EPIC-Soft, have been successfully used for many years. However, the digital environment and, therefore, user habits have changed dramatically since they were first used. Accordingly, digital dietary survey methods have also evolved. There is now a wide range of web-based tools, apps, food diaries and 24 h dietary recalls (24 h recall), both commercial and research-based. These tools are available worldwide and offer a plethora of options, which can quickly lead to a loss of overview or make it difficult to choose the right tool [3,4].
Nutritional surveys can be used to record, analyze and evaluate the dietary behavior of individuals, population groups or even the population as a whole [5]. The fundamental basis is the selection of an appropriate dietary survey method that provides qualitatively and quantitatively reliable data on the dietary behavior of the respective target group. Whether a method is considered suitable for the specific purposes of a study depends, among other things, on the objective, the time period, the size of the study, the validity of the method, the costs and available resources, as well as spatial and logistical factors [6,7]. Each method has individual advantages and disadvantages that need to be taken into account when making a choice [5].
Weighed food records (WFRs) are still the gold standard for dietary survey methods and are therefore recommended by the European Food Safety Authority (EFSA) as the main method for national dietary surveys [2,8]. For international organizations such as the EFSA, it is particularly important for well-founded risk assessments that data from dietary surveys be collected in a manner as detailed and uniform as possible [2].
The commonly used 24 h recalls are not defined as the methodological gold standard. However, 24 h recalls are regarded as more compatible with all socioeconomic strata, more cost-effective and more applicable to studies in an international context than WFRs. Due to several advantages of the 24 h recall, particularly its high practicability, this method is often used in large national consumption studies. According to the EFSA, the 24 h recall has been the main method for international (more precisely, pan-European) dietary surveys since 2009 [2]. The 24 h recalls are mostly in the CATI and/or CAPI formats.
However, with increasing digitization, it is questionable whether traditional interviewing methods such as telephone or face-to-face methods will be accessible in the future. In the context of mobile technologies, the concept of usability is becoming increasingly important. Software for health research is considered user-friendly if it meets the following criteria, according to Nouri et al.: ease of use, operability, visibility of system status, user control and freedom, consistency and standards, error prevention, completeness, quality requirements, adaptability, competence, style, behavior and structure [9]. The collection of dietary information usually requires a high level of involvement from participants [10]. It is therefore very important that the chosen method not only meets the content needs but is also accepted by the participants [11]. When choosing the appropriate method for a dietary survey, it is therefore always necessary to find a compromise between the involvement of the participants and the accuracy of the information [10]. Less user-friendly programs may be an obstacle to the adoption of digital alternatives [12]. The response rate and the engagement of participants can be significantly influenced by the usability of the tool [11].
In addition to flexible application possibilities, mobile digital technologies also have various advantages and disadvantages. A comparison of the different perspectives on the two options mentioned can be found in Table 1.
Table 1. Comparison of the advantages and disadvantages of computer-based and mobile dietary assessment methods compared to analogue assessment methods [13–15].
To identify the best possible tool and its strengths and weaknesses for future developments, an evaluation of the existing digital tools was carried out. The results of this study could primarily contribute to a better understanding of the potential of the existing digital dietary assessment tools and, if necessary, to the identification of opportunities for improvement.

Literature Search
The literature search was based on the 2009 and 2014 EFSA guidelines, which specify the general principles for the collection of national food consumption data [2,16]. Using these guidelines, further literature was reviewed to provide information on the content requirements for dietary survey methods. In addition, German and English literature on existing analogue and digital tools in Germany and other European countries was screened, considering the usefulness of digitizing dietary survey methods and the needs for user-friendly health apps. The keywords and operators used in the PubMed database for the literature search are listed in Appendix A (Table A1).

Definition of the Evaluation Criteria
The evaluation was based on the fulfilment of the EFSA nutritional criteria and the requirements for user-friendly health apps. Furthermore, the categories used were derived from the quality criteria for quantitative research (validity, reliability, objectivity) and supplemented with criteria from recent publications [17,18]. The evaluation categories identified are validity, reliability, objectivity, practicability, acceptance (user acceptance), usability (user-friendliness), functionality, and accuracy. The combination of criteria aimed to define the requirements for the best possible dietary assessment method. An overview of the evaluation categories and associated requirements is provided in Appendix B, Table A2.

Inclusion and Exclusion Criteria
The publications by Murai et al. and Gazan et al. were the primary sources for finding digital dietary assessment tools [18,19]. Missing tools were also added from the UK Medical Research Council's Nutritools website (https://www.nutritools.org (accessed on 10 April 2023)). The inclusion of tools was assessed using predefined inclusion and exclusion criteria (Table 2).
Access or a demo version was requested from the institution by e-mail or telephone for tools that were not directly accessible. If access was not granted, the tool was excluded from the survey according to Table 2.

Evaluation and Functional Testing of the Tools
The evaluation of the tools was based on the foods consumed by a nutritionist in the 24 h prior to the test, which were recorded in advance in a food diary to eliminate recall bias. For each test, both a weekday and a weekend day were selected in accordance with the EFSA recommendations [2,16].
To test the functionality of the tools, data protection regulations were checked, foods were deliberately omitted, custom food data were added where possible, spelling mistakes were deliberately made when searching for foods, common brands were searched for, the messaging functions of the apps were checked, and unrealistic information was entered to test whether implausible entries were recognized. Each tool was tested using the above sources of error until a program-defined end of testing occurred and the process was completed for the program.
A requirement within the evaluation categories was only deemed to be met if it was fully present or fully functional. If no statements could be made due to demo access or if the definition of the requirement was only partially met, the requirement was classified as 'partially fulfilled'. Large deviations from the requirements or a complete absence corresponded to a 'not fulfilled' categorization. All the requirements were satisfied if 38/38 requirements were met.
The evaluation categories were considered to be achieved if at least 75% of the requirements in a category were satisfied. To calculate the percentage of achievement, the rating 'partially fulfilled' was treated as 'not fulfilled' to simplify the calculation. The assessment of the degree of fulfilment of each requirement and the overall category was carried out by two assessors independently of each other. In case of disagreement, a third person was consulted.
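The scoring rule described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the rating labels as string constants and the example ratings are hypothetical, and only the 75% threshold and the treatment of 'partially fulfilled' as 'not fulfilled' are taken from the text.

```python
FULFILLED = "fulfilled"
PARTIAL = "partially fulfilled"
NOT_FULFILLED = "not fulfilled"

# From the text: a category counts as achieved at >= 75% fulfilled requirements.
CATEGORY_THRESHOLD = 0.75


def category_share(ratings):
    """Return the share of requirements rated 'fulfilled'.

    'partially fulfilled' is counted like 'not fulfilled', as in the text.
    """
    if not ratings:
        raise ValueError("a category needs at least one requirement")
    return sum(r == FULFILLED for r in ratings) / len(ratings)


def evaluate_tool(ratings_by_category):
    """Map each category to its fulfilment share and whether it is achieved."""
    results = {}
    for category, ratings in ratings_by_category.items():
        share = category_share(ratings)
        results[category] = {
            "share": share,
            "achieved": share >= CATEGORY_THRESHOLD,
        }
    return results


# Hypothetical ratings for illustration only (not data from the study).
example = {
    "usability": [FULFILLED, FULFILLED, FULFILLED, PARTIAL],
    "accuracy": [FULFILLED, PARTIAL, NOT_FULFILLED],
}
result = evaluate_tool(example)
for category, outcome in result.items():
    print(category, f"{outcome['share']:.0%}", outcome["achieved"])
```

Counting partial fulfilment as zero makes the rule conservative: a category with three of four requirements fully met just reaches the threshold, whereas any additional partial rating would push it below.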

Selection of the Tools
The literature search identified a total of 33 tools. After removing duplicates (n = 7) and excluding age- and target-group-specific tools, a total of 19 tools remained; these tools were supplemented by 5 tools from the Nutritools website (n = 24). Due to the restriction of the German and English languages, including availability as a full or demo version, a total of 16 tools were excluded, resulting in 8 tools being included in the evaluation (Figure 1). The 8 remaining tools were the web-based 24 h dietary recalls ASA24, myfood24, Intake24 and SACANA and the smartphone apps Nutrihand, Traqq, Keenoa and MyFitnessPal.
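The selection flow above is simple arithmetic and can be checked with a short sketch. The variable names are hypothetical; all counts are taken from the text except the number of age- and target-group-specific tools excluded, which is not stated explicitly and is derived here as an assumption from the other figures.

```python
# Counts reported in the text.
identified = 33
duplicates_removed = 7
remaining_after_specific_exclusion = 19
added_from_nutritools = 5
excluded_language_or_access = 16

# Derived, not stated in the text: tools excluded as age- or target-group-specific.
excluded_specific = identified - duplicates_removed - remaining_after_specific_exclusion

candidate_pool = remaining_after_specific_exclusion + added_from_nutritools
included = candidate_pool - excluded_language_or_access

included_tools = [
    "ASA24", "myfood24", "Intake24", "SACANA",    # web-based 24 h recalls
    "Nutrihand", "Traqq", "Keenoa", "MyFitnessPal",  # smartphone apps
]

print(f"excluded as age-/target-group-specific (derived): {excluded_specific}")
print(f"candidate pool: {candidate_pool}, included: {included}")
```

The derived pool of 24 and the final count of 8 agree with the numbers reported in the text and with the list of tool names.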

Evaluation of the Tools
The Keenoa tool met most of the requirements (32/38; ~84%) for a digital dietary survey method (Figure 2). Similarly, Keenoa also showed the highest fulfilment rate at the level of the evaluated categories compared to the other applications (6/8; 75%; Figure 3). According to the present evaluation, Keenoa can be described as functional, user-friendly, accepted, practicable, objective, and reliable, but not as sufficiently valid and accurate. The mobile food diary MyFitnessPal also met a comparatively high proportion of the requirements (27/38; ~71%; Figure 2) and evaluation criteria (5/8; 62.5%; Figure 3). Compared to Keenoa, the difference lay in the reliability category, which MyFitnessPal did not fulfil. The digital 24 h recall myfood24 (24/38; ~63%), Traqq (24/38; ~63%) and ASA24 (23/38; ~61%) were close behind in terms of the accomplished requirements. Overall, ASA24 showed the greatest discrepancy between the requirements and the criteria fulfilled, as only the functional and user-friendly criteria could be considered satisfied (2/8; 25%). The other tools, i.e., Intake24 (18/38; ~47%), Nutrihand (13/38; ~34%) and SACANA (11/38; ~29%), all met less than 50% of the requirements. Figure 3 provides an overview of the individual tools and the percentage achievement for each category.
The requirements that have been met by very few tools and that form the basis for future developments include:
- Listing of regional and seasonal foods
- Notices/notifications
- Documentation through images
- Independent representative days
- Completeness of recording
- Indication of place of consumption
- Health hazard warnings
- Survey several times a day
- Recording of dietary supplements
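The percentages reported above follow directly from the counts of requirements met out of 38. The following sketch recomputes them; the counts are those given in the text, while the variable and function names are hypothetical.

```python
TOTAL_REQUIREMENTS = 38

# Number of requirements met per tool, as reported in the text.
requirements_met = {
    "Keenoa": 32,
    "MyFitnessPal": 27,
    "myfood24": 24,
    "Traqq": 24,
    "ASA24": 23,
    "Intake24": 18,
    "Nutrihand": 13,
    "SACANA": 11,
}


def percent_met(count, total=TOTAL_REQUIREMENTS):
    """Return the share of requirements met, rounded to a whole percent."""
    return round(100 * count / total)


percentages = {tool: percent_met(n) for tool, n in requirements_met.items()}

# Tools that met fewer than half of the 38 requirements.
below_half = [
    tool for tool, n in requirements_met.items() if n / TOTAL_REQUIREMENTS < 0.5
]

for tool, pct in percentages.items():
    print(f"{tool}: {requirements_met[tool]}/{TOTAL_REQUIREMENTS} (~{pct}%)")
print("below 50%:", ", ".join(below_half))
```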

Focus on Usability and Accuracy
Focusing on the aspect of usability, Traqq, MyFitnessPal and myfood24 were found to meet 92%, 83% and 75% of the requirements within the category, respectively, making them among the most user-friendly methods. In particular, the Keenoa app (100%) met all the usability requirements evaluated, while SACANA (~33%) and Nutrihand (25%) could be considered the least user-friendly (Figure 3).
In terms of accuracy, none of the tested tools provided sufficiently accurate and detailed data to meet the defined requirement criteria. Again, Keenoa (~67%) fulfilled most of the requirements for providing the most accurate and detailed data. In contrast, the tool SACANA met the fewest requirements (~11%). The other tools satisfied ~33-56% of the accuracy requirements (Figure 3). ASA24, myfood24, SACANA, Intake24, Traqq, Keenoa and MyFitnessPal scored higher in the usability category than in the accuracy category. The Nutrihand tool was more accurate but less user-friendly. None of the analyzed tools had both high accuracy and high usability. Overall, there was a positive correlation (r = 0.72; r² = 0.51) between the accuracy and usability categories (Figure 4).
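The text does not state how the correlation was computed. Assuming it is a Pearson correlation across the eight tools, a minimal sketch is given below. The function name is hypothetical, and the input values are synthetic placeholders chosen to be perfectly linear; they are not the per-tool percentages from Figure 3, which are not fully reported in the text.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sum((a - mean_x) ** 2 for a in x)
    spread_y = sum((b - mean_y) ** 2 for b in y)
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return covariance / math.sqrt(spread_x * spread_y)


# Synthetic placeholder values (NOT study data): one entry per tool,
# usability and accuracy fulfilment in percent.
usability = [25.0, 50.0, 75.0, 100.0]
accuracy = [20.0, 30.0, 40.0, 50.0]

r = pearson_r(usability, accuracy)
r_squared = r ** 2  # coefficient of determination for a simple linear fit
print(f"r = {r:.2f}, r^2 = {r_squared:.2f}")
```

With the actual per-tool category percentages, the same function would yield the reported coefficient if a Pearson correlation was indeed used. Squaring the rounded r = 0.72 gives about 0.52, so the reported r² = 0.51 presumably stems from the unrounded coefficient.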

Discussion
The results indicate that none of the evaluated tools fully meet all the requirements or categories; therefore, there is currently no tool that completely fulfils the EFSA 2009 and 2014 guidelines and is also satisfactorily user-friendly. This finding is in line with the current state of research according to Khazen et al., namely that there is no 'one-size-fits-all' solution. This is mainly due to tool-specific limitations such as geographical aspects, language availability, target-group-specific validations, and reliability [10]. Gazan et al. also point out similar weaknesses while showing that there is a high degree of heterogeneity between individual nutrient intake validation studies and that not all the tools have been tested for usability [18].
A total of nine requirements were identified that were not met by more than half of the tools and therefore provided a starting point for optimizing the digital tools. These include the integration of regional or seasonal foods and the querying of dietary supplements. By incorporating these aspects, a higher data collection accuracy could be achieved. The accuracy category was not fully met by any of the tools in this survey, indicating the potential for optimization. In addition, more notifications could help to ensure that data are entered at the point of food consumption and several times a day, which could in turn improve the accuracy of the information collected. It was also noted that many of the tools used do not meet the standards of today's modern user interfaces, which may reduce the motivation to use them.
In addition, the 'validity' category was only met by one tool. This result initially contradicts the data from Gazan et al., who were able to identify validation data or studies for almost all the analyzed tools [18]. However, it should be noted that the category 'validity' in the present survey does not refer to the existence of validation data but includes the requirements 'regional and/or seasonal foods', 'documentation with pictures', 'personalization of the interviewer', 'multistep procedure' and 'plausibility and consistency check of the entered data'. Thus, the basic approach to assessing validity is very different and cannot be compared with an examination of the existence of validation data.
Interestingly, the categories of usability and accuracy are not mutually exclusive but are positively correlated. This should be seen as an important stimulus for future development, as accuracy can be improved by focusing on the needs of the person performing the test. In this context, Gazan et al. reported that many tools have difficulties in easily finding the right food in databases, which leads to biases and inaccuracies in the collected data. Optimizing the search function is directly related to improving the accuracy and usability [18]. In the future, more attention should be paid to a combination of the two categories when developing tools to ensure good compliance from participants and good to very good accuracy at the same time.
A limitation of the present evaluation is that further evaluation criteria could be included that are not specific to the EFSA guidelines but are more specific to modern digital dietary survey approaches. Furthermore, only a limited number of tools could be tested, mainly due to the availability of demo versions or full access, in addition to the general inclusion and exclusion criteria. Some requirements or criteria were difficult to assess because some tools were only available as demo versions, which limited the overall range of functions and therefore the quality of the evaluation. Another limitation is that both web-based and mobile technologies were tested against the same criteria. Some requirements are more closely related to the basic methods used. For example, the 'notifications' requirement is more easily met by apps than by web-based tools. A future focus on one of these categories, preferably mobile technologies, can be seen as useful in terms of advancing digitization. It was also striking that some of the criteria defined were either directly related or mutually exclusive. For example, a tool that meets the 'weighing' requirement cannot meet the 'estimation' requirement. This limitation could be overcome by broader criteria with more choices. Overall, it should be noted that despite the fixed set of criteria, such evaluations are subject to a certain degree of subjectivity. Nevertheless, such evaluations provide good support for the functionality and further development of digital approaches to nutrition surveys.
Another limitation is the different targeting of the tools, in relation to which MyFitnessPal occupies a special position. While the other seven tools are designed for use in research or in the health sector in general [20–26], MyFitnessPal is aimed more at individuals [27]. However, consideration of this tool represents enrichment in that the criteria of acceptance, usability, functionality, and practicality are particularly desirable (see Figure 3).
Further research and ongoing monitoring and analysis of the available tools are needed to ensure that usability and data accuracy interact. One interesting feature that has been available for some time to improve both categories is the use of integrated barcode scanners, which can make it much easier for participants to identify specific foods [18]. Using MyFitnessPal as an example, it has been shown that most foods, especially common foods, can be added quickly and correctly by scanning the barcode on the packaging. However, the accuracy of the food data must be criticized, as in some cases, especially with more unconventional foods, the wrong food may be stored. Furthermore, the function does not guarantee the completeness of the products available on the market, and the creation of products by users without subsequent checking by administrators can lead to inaccuracies in the data record.

Conclusions
The optimal method for combining usability and scientific quality must first and foremost be valid, reliable, objective, practicable, accepted, accurate and functional. Accordingly, there is a high profile of requirements that can be regarded as possible in principle based on the available information and the present evaluation. The limitations described are a useful starting point for the future development of a modified evaluation system. In particular, the aforementioned limitation of potentially mutually exclusive requirements should be taken into account. Nevertheless, our results show that even the best tool, Keenoa, does not meet all the requirements, indicating the need for a more comprehensive nutritional assessment tool.

Figure 1. Overview of the tool search by inclusion and exclusion criteria.

Figure 2. Percentage of requirements met per tool, regardless of the evaluation category.

Figure 3. Overview of the percentage achievement of the evaluation categories of the tested tools: ASA24, myfood24, SACANA, Intake24, Nutrihand, Traqq, Keenoa and MyFitnessPal.

Table 2. Inclusion and exclusion criteria for the selection of appropriate tools.