1. Introduction
Accurately measuring dietary intake in individuals is inherently challenging. Common dietary assessment methods include food diaries, food frequency questionnaires [1], and emerging technologies such as photographic records [2]. Among these, the 24 h dietary recall (24HR) is one of the most widely used tools and is designed to provide a detailed account of all foods and beverages consumed on the previous day [1].
Unlike other dietary assessment methods, the 24HR typically relies on a trained interviewer to elicit detailed information. For instance, when a participant reports consuming “a bagel,” the interviewer may follow up with specific questions about the type (e.g., whole grain vs. refined flour), flavour (e.g., plain vs. cinnamon raisin), brand, preparation method (e.g., toasted or not), condiments (e.g., butter, cream cheese), accompanying beverages, and portion size [1]. Although this level of detail allows for more accurate nutrient analysis [3], it also increases respondent burden and the cognitive demand of recall.
While trained interviewers enhance the accuracy of dietary recalls, they also represent a key limitation of traditional 24HRs [4]. Since these interviews are typically conducted one-on-one [3], larger-scale studies must hire multiple interviewers or extend the study duration to accommodate all participants. Each 24HR can take up to an hour to complete [3], making the use of 24HRs a resource-intensive method. As a result, researchers often turn to alternatives such as food frequency questionnaires (FFQs), which are self-administered, less time-consuming, and require minimal training [5]. However, despite their practicality, FFQs are less accurate than 24HRs in estimating specific nutrient intakes [4], limiting their suitability for certain types of analyses.
To address the time and resource burden associated with traditional 24HRs, the National Cancer Institute developed the Automated Self-Administered 24 h Dietary Assessment Tool (ASA24) [6]. The ASA24 utilizes the Automated Multiple-Pass Method (AMPM) to collect detailed information about respondents’ dietary intake from the previous day. Conducted entirely online, ASA24 is well-suited for large-scale nutrition research [6]. By eliminating the need for trained interviewers, ASA24 overcomes many logistical challenges of traditional 24HRs, enabling multiple participants to complete recalls simultaneously and making it feasible for use in large epidemiological studies.
Despite its advantages, the absence of an interviewer in ASA24 introduces new limitations. Participants may misreport their intake, either unintentionally (through typographical errors or misinterpretation of prompts) or intentionally (through selective reporting), resulting in implausible dietary recalls (IDRs) [7]. Many of these errors might have been identified and corrected during interviewer-led 24HRs, but in automated settings, they can distort both mean and variability estimates of nutrient intake. Although ASA24 incorporates standardized portion size prompts and facilitates anomaly detection during data cleaning [8], the removal or exclusion of IDRs still represents a loss of potentially valuable data.
Missing data can reduce statistical power, increase the risk of Type II errors [9,10,11], and bias study results, particularly when the missing data are systematically different from those observed [12]. While researchers may commonly use listwise deletion to address this issue, this approach can exacerbate bias [13]. A more sophisticated approach is multiple imputation (MI), which replaces missing values with plausible estimates derived from fitted models across multiple newly created datasets [14]. MI is particularly effective when data are Missing Completely at Random (MCAR) or Missing at Random (MAR), that is, when the probability of missingness is either unrelated to the data (MCAR) or explainable by observed variables (MAR) [15].
Nevertheless, all imputation methods, including MI, share a fundamental limitation: they involve estimation without knowing the true values. While MI is more robust than simpler approaches, its accuracy in the context of high-variability outcomes like daily nutrient intake remains unclear [15,16]. Given the day-to-day fluctuation in dietary behaviours (and subsequent nutrient intake), it is essential to understand how well MI can reconstruct missing dietary data, particularly in large-scale nutrition studies that use ASA24.
To date, no study has directly evaluated the accuracy of MI in estimating missing nutrient intake data derived from 24HRs. Therefore, the overall goal of this study was to assess the performance of MI in accurately reconstructing nutrient intake values under conditions of simulated missingness. To do so, the following specific objectives were identified:
To assess the correlation between imputed and true values using Spearman’s rho at 10%, 20%, and 40% levels of simulated missing data.
To evaluate the accuracy of imputed values, defined as being within ±10% of the actual value for each nutrient.
To examine trends in correlation strength and accuracy across increasing proportions of missing data.
2. Materials and Methods
2.1. Study Design and Data Source
Data for this study were drawn from the SmartAPPetite for Youth Study, a cluster-randomized controlled trial conducted from 2017 to 2020 in Southwestern Ontario, Canada, among adolescents aged 13–18 years. This age range was selected because it corresponds to the standard age span for high school students in Ontario, Canada, which was the intended target population for the SmartAPPetite intervention. That study aimed to evaluate a smartphone application (“SmartAPPetite”) intended to improve food knowledge, food purchasing, and diet quality [17]. Relevant to the current study, the SmartAPPetite for Youth participants completed two tools at three time points: baseline, post-intervention, and follow-up. The tools were (1) a 24 h dietary recall using ASA24, and (2) a “youth survey” that assessed dietary habits and related psychosocial factors.
A formal sample size calculation was not performed for this secondary analysis, which was based on pre-existing data collected during the SmartAPPetite for Youth cluster-randomized trial. While the original study was powered to detect intervention effects on dietary outcomes, the current analytic sample of 743 adolescents is sufficiently large to support a robust evaluation of multiple imputation accuracy across varying levels of simulated missingness.
2.2. Measures
2.2.1. ASA24 Dietary Recall
The ASA24 dietary recall aimed to capture participants’ dietary intake for the previous 24 h. This validated tool follows a 7-step AMPM process [18]: (1) inquiring about foods consumed and associated mealtimes; (2) a probe for additional foods not previously reported; (3) detailed questions about food items, including preparation method, portion size, brand, and condiments; (4) review and editing of entered data; (5) prompting for commonly forgotten foods (e.g., snacks consumed while commuting or shopping); (6) final confirmation of entries; and (7) a self-assessment of whether the reported intake reflected usual intake. Upon completion, ASA24 automatically calculates nutrient intakes using its internal food composition database.
While ASA24 may slightly underestimate the intake of certain nutrients (e.g., energy, protein) when compared to recovery biomarkers [19], it demonstrates comparable accuracy to traditional 24HRs in estimating nutrient intake [19,20,21]. This makes ASA24 a practical and reliable tool for use in large-scale epidemiological studies.
2.2.2. Youth Survey
The youth survey included questions on demographics (e.g., age, sex, ethnicity), self-reported physical and mental health, food-related behaviours and general eating habits (e.g., allergies, cooking frequency, meal skipping), perceived importance of healthy eating, and food purchasing behaviours. Participants were also asked to provide their primary residence’s postal code, which was used to calculate median neighbourhood income. A food knowledge quiz, adapted from two validated instruments [22,23], was administered at the end of the survey.
2.2.3. Nutrient and Diet Quality Measures
From the full nutrient output generated by ASA24, the following 21 nutrients were selected for analysis based on their relevance and previous epidemiologic research: calories (kcal), protein (g), total fat (g), saturated fat (g), carbohydrates (g), total sugars (g), fibre (g), calcium (mg), iron (mg), magnesium (mg), potassium (mg), sodium (mg), zinc (mg), vitamin C (mg), thiamin (mg), riboflavin (mg), niacin (mg), folate (mcg), vitamin B12 (mcg), and vitamin A (mcg, RAE). Two composite diet quality scores were also calculated: Healthy Eating Index-2015 (HEI-2015) [24,25] and Nutrient Rich Foods Index 9.3 (NRF 9.3) [26].
2.3. Additional Covariates
Variables from the youth survey included in the analysis were sex, age, ethnicity (White/Caucasian: yes/no), self-rated physical and mental health, number of physically active days in the past week (0–7 days), perceived importance of eating healthy, and total food knowledge score (ranging from 0 to 50). Additionally, neighbourhood-level median household income was included as a proxy for socioeconomic status; it was calculated by linking each participant’s primary residence’s postal code to 2016 Canadian census data at the dissemination area level [27,28]. Information on how each question from the youth survey was asked can be found in Appendix A.
2.4. Data Cleaning and Identification of Implausible Dietary Recalls
To identify IDRs, thresholds were applied to ASA24-derived nutrient intakes based on established guidelines [8]. Specifically, records were set to “missing” if any of the nutrient values listed in Table 1 fell outside plausible ranges.
These thresholds were derived from the upper and lower 5% bounds of National Health and Nutrition Examination Survey (NHANES) data distributions [8].
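The range check described above can be sketched in a few lines of Python. This is only an illustration: the bounds in `PLAUSIBLE_RANGES` are placeholder numbers, not the thresholds in Table 1, and the field names (`kcal`, `protein_g`) and both function names are hypothetical. The actual guidance may also differ by sex, which this sketch ignores.

```python
# Placeholder bounds for illustration only; these are NOT the Table 1 values.
PLAUSIBLE_RANGES = {
    "kcal": (500.0, 5000.0),
    "protein_g": (10.0, 250.0),
}


def is_implausible(recall, ranges=PLAUSIBLE_RANGES):
    """Return True if any checked nutrient falls outside its plausible range."""
    return any(not (low <= recall[name] <= high)
               for name, (low, high) in ranges.items())


def blank_implausible(recalls, nutrient_keys, ranges=PLAUSIBLE_RANGES):
    """Copy the recalls, setting every nutrient field to None for flagged records."""
    cleaned = []
    for recall in recalls:
        row = dict(recall)  # copy so the input records are left untouched
        if is_implausible(recall, ranges):
            for key in nutrient_keys:
                row[key] = None
        cleaned.append(row)
    return cleaned


recalls = [
    {"id": 1, "kcal": 2100.0, "protein_g": 85.0},
    {"id": 2, "kcal": 9400.0, "protein_g": 90.0},  # energy above the placeholder bound
]
cleaned = blank_implausible(recalls, ["kcal", "protein_g"])
```

A single out-of-range nutrient blanks the whole dietary record, mirroring the "any of the nutrient values" rule in the text, while non-dietary fields such as `id` are kept.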
2.5. Simulation of Missing Data
Following the creation of a cleaned dataset—one with no missing entries and no implausible nutrient values—a simulation procedure was implemented to artificially introduce missing data. A random number generator was used to randomly select dietary records to be set as missing. For each selected case, the corresponding dietary intake data were exported and stored separately as the reference “true” values. These original values were then removed from the dataset to simulate realistic patterns of missingness.
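A minimal sketch of this masking step is given below, assuming each record is a dictionary and that a fixed fraction of records is selected uniformly at random. The function name `simulate_missing`, the `seed` argument, and the example fields are illustrative assumptions, not details taken from the study.

```python
import random


def simulate_missing(records, nutrient_keys, fraction, seed=0):
    """Blank the nutrient fields of a random subset of records.

    Returns (masked_records, truth), where truth maps a record's index
    to a copy of its original nutrient values.
    """
    rng = random.Random(seed)  # seeded so the selection is reproducible
    n_missing = round(len(records) * fraction)
    chosen = rng.sample(range(len(records)), n_missing)
    masked = [dict(r) for r in records]  # copies; the originals stay intact
    truth = {}
    for i in chosen:
        truth[i] = {k: records[i][k] for k in nutrient_keys}  # reference "true" values
        for k in nutrient_keys:
            masked[i][k] = None  # covariates such as age are left in place
    return masked, truth


# Toy data: 50 records with one covariate and one nutrient.
records = [{"id": i, "age": 13 + i % 6, "kcal": 1500.0 + 10 * i} for i in range(50)]
masked, truth = simulate_missing(records, ["kcal"], fraction=0.20, seed=42)
```

Because selection does not depend on any variable, this sketch produces data that are missing completely at random.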
The dataset was subsequently prepared for MI. An MI model using chained equations was fitted, generating 200 imputed datasets. Predictor variables included in the imputation model were sex, age, White/Caucasian ethnicity (yes/no), self-reported physical health, self-reported mental health, number of physically active days in the past week, perceived importance of healthy eating, total food knowledge score, and neighbourhood-level median household income.
Once imputation was completed, the average of the 200 values was calculated for each nutrient and used as the “final” imputed estimate. The original (true) values were then reinserted into the dataset for comparative analysis.
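The idea of drawing many imputations and averaging them can be illustrated with a deliberately simplified stand-in. The sketch below is not the chained-equations model used in the study: it assumes a single numeric predictor, refits a straight line on a bootstrap sample of the observed cases for each of `m` draws, and adds a resampled residual. All names (`fit_line`, `impute_mean_of_draws`) and the toy data are hypothetical.

```python
import random
import statistics


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x (xs must vary)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((a - mx) ** 2 for a in xs)
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def impute_mean_of_draws(x, y, m=200, seed=0):
    """Return {index: mean of m imputed draws} for the entries of y that are None."""
    rng = random.Random(seed)
    observed = [(xi, yi) for xi, yi in zip(x, y) if yi is not None]
    missing = [i for i, yi in enumerate(y) if yi is None]
    draws = {i: [] for i in missing}
    for _ in range(m):
        # Bootstrap the observed cases so each draw reflects parameter uncertainty.
        boot = [rng.choice(observed) for _ in observed]
        slope, intercept = fit_line([p[0] for p in boot], [p[1] for p in boot])
        residuals = [yi - (intercept + slope * xi) for xi, yi in boot]
        for i in missing:
            # Prediction plus a resampled residual keeps draw-to-draw variability.
            draws[i].append(intercept + slope * x[i] + rng.choice(residuals))
    # Averaging the m draws gives one "final" imputed value per missing entry.
    return {i: statistics.fmean(values) for i, values in draws.items()}


# Toy data: energy intake loosely related to a food knowledge score.
noise = random.Random(1)
score = [float(s) for s in range(50)]
kcal = [1500.0 + 10.0 * s + noise.gauss(0.0, 200.0) for s in score]
for i in (4, 17, 33):
    kcal[i] = None  # simulated missing entries
final_estimates = impute_mean_of_draws(score, kcal, m=200, seed=7)
```

Averaging across draws pulls each estimate toward the model's prediction, so the averaged values will generally vary less than real intakes do.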
2.6. Analysis of Imputation Data
To assess the performance of the imputation model, comparisons were made between the imputed values and their corresponding true values on an intra-individual basis. Descriptive statistics, including means and standard deviations, were computed for both actual and imputed values. Spearman’s rho was used to calculate correlation coefficients between true and imputed nutrient values, with corresponding p-values to determine statistical significance. A p-value < 0.05 was considered statistically significant.
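For reference, Spearman’s rho is the Pearson correlation of the ranks of the two variables, with tied values sharing their average rank. The sketch below shows that calculation using only the Python standard library. The function names and example values are illustrative, and the p-value is omitted; in practice a statistics package would supply both.

```python
import math
import statistics


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1  # extend the run of tied values
        rank = (pos + end) / 2 + 1  # mean of 1-based positions pos+1 .. end+1
        for j in range(pos, end + 1):
            ranks[order[j]] = rank
        pos = end + 1
    return ranks


def pearson(a, b):
    """Pearson correlation of two equal-length sequences (neither may be constant)."""
    ma, mb = statistics.fmean(a), statistics.fmean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def spearman_rho(true_values, imputed_values):
    """Spearman's rho: Pearson correlation computed on average ranks."""
    return pearson(average_ranks(true_values), average_ranks(imputed_values))


# Hypothetical true and imputed energy intakes (kcal) for five participants.
true_kcal = [1800.0, 2400.0, 1500.0, 3100.0, 2000.0]
imputed_kcal = [2050.0, 2100.0, 1950.0, 2200.0, 2150.0]
rho = spearman_rho(true_kcal, imputed_kcal)
```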
To evaluate practical accuracy, a ±10% threshold was applied. Specifically, for each nutrient, an imputed value was considered accurate if it fell within 10% of the individual’s true value. For instance, for a participant with an actual energy intake of 2000 kcal, any imputed value between 1800 and 2200 kcal would be classified as accurate. The proportion of imputed values meeting this criterion was then calculated.
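The ±10% criterion amounts to a relative-error check on each imputed value. A minimal sketch follows; the function names are hypothetical, and treating the boundaries (exactly 1800 or 2200 kcal in the example above) as accurate is an assumption, since the text does not say whether the interval is inclusive.

```python
def within_tolerance(true_value, imputed_value, tol=0.10):
    """True if the imputed value lies within tol (as a fraction) of the true value."""
    # Boundaries are counted as accurate here; that inclusiveness is an assumption.
    return abs(imputed_value - true_value) <= tol * abs(true_value)


def proportion_accurate(true_values, imputed_values, tol=0.10):
    """Share of imputed values that meet the tolerance criterion."""
    hits = sum(within_tolerance(t, i, tol)
               for t, i in zip(true_values, imputed_values))
    return hits / len(true_values)


# Hypothetical true and imputed energy intakes (kcal).
true_kcal = [2000.0, 1500.0, 3000.0, 2500.0]
imputed_kcal = [2150.0, 1900.0, 2800.0, 1800.0]
share = proportion_accurate(true_kcal, imputed_kcal)
```

The tolerance is scaled by the true value, so the acceptable window is narrow in absolute terms for nutrients or participants with low intakes.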
This entire process was repeated across three levels of simulated missing data: 10%, 20%, and 40%. These levels were chosen to reflect pragmatic real-world scenarios. A missing data level under 10% might not substantially affect results, while missingness exceeding 40% [29] may undermine study validity regardless of the missing data handling method (e.g., listwise deletion or multiple imputation).
4. Discussion
This study evaluated the accuracy of MI for estimating missing nutrient intake data among Canadian adolescents, using 24 h dietary recall data from ASA24. To our knowledge, this is the first study to assess imputation accuracy using a reference dataset with known true values.
Across all levels of missingness, correlations between imputed and actual nutrient values were weak. While MI is regarded as a robust method when data are MAR, our findings suggest that its performance may be limited when applied to high-variability outcomes such as nutrient intake. Notably, the modest improvement in correlation coefficients at 20% and 40% missingness could be attributed to increased sample size and statistical power rather than improved model performance.
The accuracy of imputed values, defined as being within 10% of the true value, was low for most nutrients. Furthermore, and contrary to expectations, MI did not become less reliable with more missing data [30]. For example, the correlation coefficient for calories was lowest in the dataset with 10% missing data and increased as the percentage of missing data increased, while other nutrients showed no clear relationship between missing data percentage and coefficient values. Even the most accurately imputed variable (HEI-2015) had accurate estimates in fewer than one-third of cases. These findings suggest that although MI may preserve overall distributions and allow for full sample inclusion, it does not reliably reproduce individual-level nutrient intake data. This is critical, as nutritional epidemiology research often relies on accurate intake values to examine exposure–outcome relationships.
Our findings diverge from previous research that used MI in FFQs or registry data, largely because those studies lacked true values for comparison. For example, studies from Japan [31], Italy [32], and the U.S. [33] assessed MI performance via comparisons to complete-case analyses, rather than direct validation against known intakes. While such approaches may provide insight into relative bias, they cannot assess absolute accuracy, as the missing values remain unknown.
Importantly, this study leveraged a rare opportunity to simulate missingness within a dataset of plausible recalls, thereby enabling direct comparison between imputed and true values. While this design enhances internal validity, it also introduces limitations and potential biases. The study excluded IDRs to establish a known reference, which may not reflect real-world patterns where data are often missing not at random (MNAR) [30]. Moreover, our use of a single 24 h recall per participant limited our ability to estimate usual intake, which likely constrained the imputation model’s performance. This restriction was necessary because the source data came from an intervention study: the intervention itself could have changed participants’ “usual intake” over time, so recalls from different time points could not be treated as repeated measures of the same usual intake.
Additionally, the classification of implausible recalls based on extreme values in energy and select nutrients, while conservative and consistent with ASA24 guidance [8], may have excluded some valid recalls. We opted for this method as we did not have the necessary data to apply more refined methods such as the Goldberg cutoff [34], which would have required measured body weight or energy expenditure data.
Nevertheless, this study has important strengths. It is the first to assess the absolute accuracy of MI for nutrient data using known true values, across varying levels of missingness. It also adopts an individual-level evaluation approach, offering insights that average-based comparisons cannot. Additionally, because ASA24 allowed many participants to complete dietary recalls without an interviewer, we were able to collect a sample large enough to examine participants’ dietary intakes in detail, which may not have been possible using traditional 24HRs. Our pragmatic use of ASA24 and common demographic covariates makes this study directly applicable to dietitians and other nutrition researchers conducting primary nutrition studies.