Relative Validity and Reliability of the Remind App as an Image-Based Method to Assess Dietary Intake and Meal Timing in Young Adults

Image-based dietary records have been validated as tools to evaluate dietary intake. However, to determine meal timing, previous studies have relied primarily on image-based smartphone applications without validation. Notably, validation is necessary to determine how accurately a test method measures meal timing compared with a reference method over the same time period. Thus, we aimed to assess the relative validity and reliability of the Remind® app as an image-based method to assess dietary intake and meal timing. For this purpose, 71 young adults (aged 20–33 years, 81.7% women) were recruited for a 3-day cross-sectional study, where they completed a 3-day image-based record using the Remind app (test method) and a 3-day handwritten food record (reference method). The relative validity of the test method versus the reference method was assessed using multiple tests including Bland–Altman, % difference, paired t-test/Wilcoxon signed-rank test, Pearson/Spearman correlation coefficients, and cross-classification. We also evaluated the reliability of the test method using an intra-class correlation (ICC) coefficient. The results showed that, compared to the reference method, the relative validity of the test method was good for assessing energy and macronutrient intake, as well as meal timing. Meanwhile, the relative validity of the test method to assess micronutrient intake was poor (p < 0.05) for some micronutrients (iron, phosphorus, potassium, zinc, vitamins B1, B2, B3, B6, C, and E, and folates) and some food groups (cereals and grains, legumes, tubers, oils, and fats). Regarding the reliability of an image-based method to assess dietary intake and meal timing, results ranged from moderate to excellent (ICC 95% confidence interval [95% CI]: 0.50–1.00) for all nutrients, food groups (except oils and fats, which had low to moderate reliability), and meal timings.
Thus, the results obtained in this study provide evidence of the relative validity and reliability of image-based methods to assess dietary intake (energy, macronutrients, and most food groups) and meal timing. These results open up a new framework for chrononutrition, as these methods improve the quality of the data collected and also reduce the burden on users to accurately estimate portion size and the timing of meals.


Introduction
The future of chrononutrition relies on what we can learn about individual behavior, which requires an accurate assessment of what people eat and drink, as well as when and how often they consume any type of food or beverages [1,2]. The latter are also known as temporal eating patterns [2]. Commonly used methods to assess temporal eating patterns include pen-and-paper tools such as food records and 24 h dietary recalls [3–5]. Image-based dietary records are currently delivered by users via smartphone applications, which have been validated as tools to evaluate dietary intake [7,13,14]. However, to determine meal timing, previous studies have relied primarily on image-based smartphone applications without validation [2,15]. Importantly, the validation process is necessary to determine how accurately a test method measures meal timing compared with a reference method over the same time period (known as relative validity) [16]. Furthermore, the validation process is necessary to identify the magnitude and direction of measurement error, the potential causes of measurement error, and how these errors can be minimized or accounted for in the analyses [16]. Of note, a recent validation study by Giogia et al. [17] showed significant differences in the timing of most meals when reported through recall-based survey questions versus paper-based food records. Specifically, the authors showed a significant delay in meal timings when reported through recall-based surveys, compared to those reported through food records [17]. These results highlight the relevance of evaluating the relative validity of image-based dietary records to assess meal timing, versus a reference method.
Taking the above into account, our aim was to assess the relative validity and reliability of the Remind app [18] as an image-based method to jointly assess dietary intake and meal timing versus food recording. Note that the Remind app is a real-time mobile application that allows for file and photo sharing, as well as immediate feedback from respondents [18]. The latter is relevant considering that daily life and work styles have been characterized by instantaneous communication that is increasingly accessible via digital technology [19]. Furthermore, the COVID-19 pandemic has accelerated the use of telemedicine, making the use of emerging technologies in clinical practice more likely [1]. From this perspective, we emphasize the need for new validated applications to jointly assess dietary intake and meal timing.

Participants and Study Design
Young adults (aged 20-35 years) were recruited for a 3-day cross-sectional study among undergraduate students at the University of Barcelona (Barcelona, Spain). Recruitment consisted of an informative talk, explaining the details of the research to the volunteers and inviting them to take part in the study. Exclusion criteria consisted of not owning or having access to a smartphone capable of downloading the Remind app and/or unwillingness to participate in the study. Based on these criteria, a total of 78 subjects were included in the study, all of whom gave their written informed consent. We further excluded 7 subjects with missing information in their food records (e.g., serving sizes, method of preparation), which resulted in a final analytical sample of 71 participants. All study procedures were conducted according to the general recommendations of the Declaration of Helsinki and were approved by the Ethics Committee of the University of Barcelona (IRB00003099).

Anthropometric Measurements
Weight was measured to the nearest 0.1 kg using a body composition analyzer (InBody 720, Biospace, Seoul, Korea), with the subjects wearing light clothing and without shoes. Height was determined using a fixed wall stadiometer (Seca 217, Seca, Hamburg, Germany) to the nearest 0.1 cm. Body mass index (BMI) was calculated as weight (kg) divided by height squared (m²).

Dietary Intake and Meal Timing Assessment Methods
During the study period, all participants were asked to complete a handwritten 3-day food record (reference method) and a 3-day image-based dietary record using the Remind app (test method) [18]. Participants were asked to complete both records within one week, but on the same days and including two weekdays and one weekend day. In addition, a registered dietitian instructed participants to record the type of food or beverage (including alcoholic beverages) with the brand, if possible, the method of preparation, the serving size (in grams or household measurements), and the location of the meal (e.g., home or restaurant). In addition, participants were required to report meal times in both methods during the study period. This allowed us to evaluate the time and frequency in which each food or beverage was consumed.
For the image-based dietary record, participants were asked to download the Remind app onto their mobile phones. It should be noted that the Remind® app complies with the requirements of the EU General Data Protection Regulation, is free, and is compatible with all mobile operating systems [18]. Additionally, all participants received training on how to use the mobile app and how to take the pictures so that they could accurately reflect food intake (Figure 1). As such, participants were taught to photograph all foods/beverages consumed at a 45° angle using a fiducial marker (a reference object with known dimensions), which could be a pen or cutlery [20]. Participants were also asked to take pictures of second servings and leftovers. In addition, as shown in Figure 1, participants were required to enter a brief written description of what they ate, as well as the time the food was consumed. It is worth noting that the Remind app provides real-time communication to monitor participants' progress, which can reduce participant burden and improve data quality.

Dietary Intake
Data from handwritten food records and image-based dietary records were processed by a registered dietitian using PCN Pro 1.0 software [21]. To estimate the daily dietary intake for both methods, we standardized the food entries, including selected food portions, according to values provided in the software (e.g., a normal serving of pasta), unless the participant provided the exact quantity of food or beverage. Furthermore, specific brands were not selected unless indicated by the participant. Additionally, we used photographic guides of food portions consumed in Spain [22,23] to help to quantify the foods included in the image-based dietary records. The latter allowed us to estimate the average daily energy (kcal/day), macronutrient (g/day), and micronutrient (mg/day or µg/day) intakes for both methods. In addition, we estimated the average daily intake (g/day) of the following food groups:
i. Fruits: fresh fruits, canned fruits, and dried fruits.
ii. Vegetables: leaf, flower, or stem vegetables, root vegetables, bulbs, and mushrooms.
iii. Cereals and grains: cereals, grains and flour, pasta, baked goods, cookies, pastries, and breakfast cereals.

Meal Timing
For each meal, participants were asked to record the time at which they started eating on handwritten food records, as well as on the image-based food records. Note that in both cases, the participants recorded meal times based on the time of their cellphones. Subsequently, meals were classified as breakfast, lunch, dinner, or mid-morning and mid-afternoon snacks, based on the designation that each participant indicated. We then calculated the average time at which breakfast, lunch, dinner, and mid-morning and mid-afternoon snacks were consumed.
Relative validity was determined by comparing a test method (Remind app) to a reference method (3-day handwritten food records), where the reference method had a higher degree of demonstrated validity, although it was not an exact measure of the underlying concept [24]. For this purpose, we applied the methodology proposed by Lombard et al. [16], where a combination of 5 statistical tests (Bland-Altman, % difference, paired t-test/Wilcoxon signed-rank test, Pearson/Spearman correlation coefficients, and cross-classification) is used to test different facets of validity such as agreement, association, or bias, either at the group or individual levels [16].

Agreement at Group Level
To test group-level agreement, we first used the Bland-Altman test, which reflects the presence, direction, and extent of bias, as well as the limits of agreement [16,25]. The latter was assessed by plotting the mean difference (y-axis) and the mean intakes (x-axis) between both methods and for each subject, to illustrate the magnitude of disagreement and to identify outliers and trends in bias [16,25]. In this case, for each subject, the mean difference was the difference between the two methods and the mean intake was the average of the two methods. Then, the upper and lower limits of agreement were calculated as follows:
Lower limit of agreement (LLA) = mean difference − 1.96 × standard deviation of the differences
Upper limit of agreement (ULA) = mean difference + 1.96 × standard deviation of the differences
Note that about 95% of the recordings should be between the LLA and the ULA. In addition, Bland-Altman Spearman correlation coefficients between mean difference and mean intake were calculated to reflect the presence of proportional bias as well as its direction [16]. Then, according to the p-value, the outcomes were classified as "good" (p > 0.05) or "poor" (p ≤ 0.05). In this case, poor outcomes would reflect proportional bias [16].
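As an illustration only, the limits of agreement described above can be sketched in Python. The sign convention (test minus reference), the function names, and the sample values are assumptions made for this sketch and are not taken from the study. The proportional-bias correlation and its p-value are not included.

```python
from statistics import mean, stdev


def bland_altman(test, reference):
    """Return per-subject differences and means plus the limits of agreement."""
    # Assumed sign convention: test method minus reference method.
    diffs = [t - r for t, r in zip(test, reference)]
    means = [(t + r) / 2 for t, r in zip(test, reference)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    return {
        "diffs": diffs,
        "means": means,
        "bias": bias,
        "lla": bias - 1.96 * sd,
        "ula": bias + 1.96 * sd,
    }


def share_within_limits(result):
    """Fraction of subjects whose difference lies between the LLA and the ULA."""
    inside = [result["lla"] <= d <= result["ula"] for d in result["diffs"]]
    return sum(inside) / len(inside)


# Hypothetical energy intakes (kcal/day) for five subjects.
test_kcal = [1850.0, 2100.0, 1720.0, 2300.0, 1990.0]
reference_kcal = [1800.0, 2150.0, 1700.0, 2250.0, 2010.0]
result = bland_altman(test_kcal, reference_kcal)
print(round(result["bias"], 1), round(result["lla"], 1), round(result["ula"], 1))
print(share_within_limits(result))
```

Plotting `result["diffs"]` against `result["means"]`, with horizontal lines at the bias, LLA, and ULA, would give the usual Bland-Altman plot.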
We also calculated the difference (in percentage) between the reference and the test measure. The latter reflects the size and direction of the error at the group level [16]. The % difference was calculated for the total sample. According to the % difference, the outcomes were classified as "good" (0.0-10.9%), "acceptable" (11.0-20.0%), or "poor" (>20.0%) [16].
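A minimal sketch of this classification is given below. It assumes that the % difference is expressed relative to the reference method and that the thresholds apply to its absolute size. Both choices, along with the function names and the sample means, are assumptions of this sketch.

```python
def percent_difference(test_mean, reference_mean):
    """Group-level % difference; the sign shows the direction of the error."""
    # Assumption: the difference is expressed relative to the reference method.
    return (test_mean - reference_mean) / reference_mean * 100.0


def classify_percent_difference(pct):
    """Apply the good / acceptable / poor thresholds to the absolute size."""
    size = abs(pct)
    if size < 11.0:
        return "good"
    if size <= 20.0:
        return "acceptable"
    return "poor"


# Hypothetical group means of energy intake (kcal/day).
pct = percent_difference(1950.0, 2000.0)
print(round(pct, 1), classify_percent_difference(pct))
```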
Subsequently, we used the paired t-test (parametric) or the Wilcoxon signed-rank test (non-parametric) to evaluate the agreement between the test and reference methods [26,27]. Then, according to the p-value, the outcomes were classified as "good" (p > 0.05) or "poor" (p ≤ 0.05) [27].

Agreement at Individual Level
First, we measured the strength of the association between the test and the reference method [27]. This parameter was assessed based on the data distribution, using Pearson's (parametric) or Spearman's (non-parametric) correlation coefficients. Then, according to the correlation coefficient (r/Rho), the outcomes were classified as "good" (≥0.50), "acceptable" (0.20-0.49), or "poor" (<0.20) [28].
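The correlation step above could be sketched as follows, using only the Python standard library. Spearman's coefficient is computed as Pearson's coefficient on average ranks. The helper names and the sample data are hypothetical, and coefficients between 0.49 and 0.50 are assumed to be "acceptable".

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho as the Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def classify_correlation(r):
    """Thresholds from the text: good >= 0.50, acceptable 0.20-0.49, poor < 0.20."""
    if r >= 0.50:
        return "good"
    if r >= 0.20:
        return "acceptable"
    return "poor"


# Hypothetical protein intakes (g/day) from the test and reference methods.
test_protein = [62.0, 80.0, 71.0, 95.0, 55.0]
reference_protein = [60.0, 84.0, 70.0, 90.0, 58.0]
rho = spearman(test_protein, reference_protein)
print(round(rho, 2), classify_correlation(rho))
```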
We also used the cross-classification method, which provides an indication of how well the dietary method separates subjects into classes or consumption categories [27]. To do so, we used tertiles and then calculated the percentage of subjects who were correctly classified in the same tertile. Likewise, we estimated the percentage of subjects who were in the opposite category overall [27]. Then, according to the % of subjects who were correctly or grossly classified in each tertile, the outcomes were classified as follows [28]:
i. "Good" if ≥50% of the sample was classified in the same tertile and ≤10% in the opposite tertile.
ii. "Poor" if <50% of the sample was classified in the same tertile and >10% in the opposite tertile.
Reliability refers to the extent to which a measurement process gives the same results when repeated under similar circumstances [24]. Therefore, this parameter is defined as the extent to which measurements can be replicated [29]. To evaluate this parameter, we used the intraclass correlation coefficient (ICC) based on a 2-way random-effects model, which is a widely used reliability index that reflects the absolute agreement between measurements of the same quantitative variable, in the same subjects [29]. Note that the appropriate ICC interpretation to evaluate the level of reliability should be based on the 95% confidence interval [95% CI] of the ICC estimate, not the ICC estimate itself [29]. Thus, according to the ICC's [95% CI], the outcomes were classified as "excellent" (>0.90), "good" (>0.75-0.90), "moderate" (0.50-0.75), or "poor" (<0.50) [29].
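To make the interval-based reading of the ICC concrete, the sketch below computes a single-measurement, absolute-agreement ICC from two-way ANOVA mean squares and labels a confidence interval by the bands of its two bounds. The text does not state whether single or average measures were used, so the single-measurement form is an assumption. The function names and sample data are hypothetical, and the interval itself is taken as given, since computing it requires F-distribution quantiles.

```python
def icc_2_1(data):
    """Two-way random-effects, absolute-agreement, single-measurement ICC.

    data: list of rows (subjects), each a list of k repeated measurements.
    """
    n, k = len(data), len(data[0])
    grand = sum(sum(row) for row in data) / (n * k)
    row_means = [sum(row) / k for row in data]
    col_means = [sum(row[j] for row in data) / n for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for row in data for v in row)
    ss_err = ss_total - ss_rows - ss_cols
    msr = ss_rows / (n - 1)              # between-subject mean square
    msc = ss_cols / (k - 1)              # between-measurement mean square
    mse = ss_err / ((n - 1) * (k - 1))   # residual mean square
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


def reliability_band(value):
    """Bands from the text: >0.90, >0.75-0.90, 0.50-0.75, <0.50."""
    if value > 0.90:
        return "excellent"
    if value > 0.75:
        return "good"
    if value >= 0.50:
        return "moderate"
    return "poor"


def classify_icc_interval(lower, upper):
    """Label a 95% CI by the bands of its lower and upper bounds."""
    low_band, high_band = reliability_band(lower), reliability_band(upper)
    return low_band if low_band == high_band else f"{low_band} to {high_band}"


# Hypothetical breakfast times (decimal hours), two measurements per subject.
breakfast = [[8.0, 8.1], [7.5, 7.4], [9.0, 9.2], [8.5, 8.5]]
print(round(icc_2_1(breakfast), 3))
print(classify_icc_interval(0.50, 1.00))
```

Under these bands, an interval of 0.50–1.00 is labelled "moderate to excellent", which matches the wording used for the results.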

Statistical Analyses
Normality was assessed for all variables using histograms and Q-Q plots. Descriptive characteristics were presented for all participants, including mean ± standard deviation for parametric variables, median (interquartile range) for non-parametric variables, and proportions for categorical variables. Then, for the validation process of the Remind app (test method) versus the 3-day handwritten food records (reference method), we constructed Bland-Altman plots and calculated Bland-Altman Spearman correlation coefficients to assess any systematic bias between methods. In addition, we calculated the % difference and compared the differences between methods using paired t-test/Wilcoxon signed-rank tests to assess agreement at the group level. We then evaluated individual-level agreement using the correlation coefficient (Pearson or Spearman, based on data distribution) and cross-classification. Finally, we calculated the ICC [95% CI] based on a 2-way random-effects model to assess reliability. All analyses were performed using SPSS statistical computer software, version 25.0 (IBM SPSS Statistics, Armonk, NY, USA).
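For illustration, the tertile cross-classification step listed above could be reproduced outside SPSS along the following lines. The tertile cut points come from Python's `statistics.quantiles`, which may differ slightly from the SPSS definition. The function names and data are hypothetical, and any outcome that does not meet the "good" criteria is treated as "poor", which is an assumption of this sketch.

```python
from statistics import quantiles


def tertile_labels(values):
    """Assign each value to tertile 0, 1, or 2 using the sample's own cut points."""
    low_cut, high_cut = quantiles(values, n=3)
    return [0 if v <= low_cut else 1 if v <= high_cut else 2 for v in values]


def cross_classify(test, reference):
    """Return (% in the same tertile, % in the opposite tertile)."""
    t_lab, r_lab = tertile_labels(test), tertile_labels(reference)
    n = len(test)
    same = sum(a == b for a, b in zip(t_lab, r_lab)) / n * 100
    opposite = sum(abs(a - b) == 2 for a, b in zip(t_lab, r_lab)) / n * 100
    return same, opposite


def classify_cross_classification(same_pct, opposite_pct):
    # Assumption: anything that is not "good" is reported as "poor".
    return "good" if same_pct >= 50 and opposite_pct <= 10 else "poor"


# Hypothetical fruit intakes (g/day) for nine subjects.
test_fruit = [120.0, 80.0, 200.0, 150.0, 95.0, 310.0, 40.0, 260.0, 175.0]
reference_fruit = [110.0, 90.0, 210.0, 140.0, 100.0, 290.0, 60.0, 240.0, 180.0]
same, opposite = cross_classify(test_fruit, reference_fruit)
print(round(same, 1), round(opposite, 1), classify_cross_classification(same, opposite))
```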

Energy and Nutrient Intake
The agreement at the group level was analyzed using Bland-Altman plots to compare the mean differences in energy and nutrient intakes between the test and the reference method. As shown in Figure 2, most values of energy (kcal/day) and macronutrient intake (carbohydrate (g/day), protein (g/day), and fat intake (g/day)) were within acceptable limits of agreement. Likewise, most of the values regarding micronutrient intake were within acceptable limits of agreement.
Furthermore, according to the Bland-Altman Spearman correlation coefficients (Table 1), we found good outcomes (p > 0.05) for energy, macronutrients, dietary fiber, and most micronutrients. However, the outcomes for iron (p = 0.031), vitamin E (p = 0.006), and vitamin C (p = 0.042) intakes were considered to be poor and thus reflected proportional bias. Regarding the % difference (Table 1), our results showed that energy and all nutrient intakes had a difference of <11%, which indicates a good outcome. Likewise, the results from the paired t-test/Wilcoxon signed-rank test (Table 1) showed a good outcome for energy, protein, fat, and fiber intakes, as well as calcium, magnesium, phosphorus, and sodium intakes, and vitamins A, D, and B12 intakes. Meanwhile, carbohydrate and the remaining micronutrient (iron, phosphorus, potassium, zinc, vitamins E, B1, B2, B3, B6, C, and folate) intakes showed poor agreement at the group level.
As for the individual level of agreement between the test and the reference method (Table 1), we observed that the outcomes for the correlation coefficients and cross-classification analyses were good for all items (energy and nutrient intake).

Food Group Intake
The Bland-Altman plots of the intake of the different food groups showed that most values were within the acceptable limits of agreement (Figure S1). In addition, according to the Bland-Altman Spearman correlation coefficients (Table 2), there was a good outcome for most of the food groups (fruits, vegetables, cereals and grains, legumes, milk and dairy products, meats, eggs, fish, and non-alcoholic drinks), except for tubers (p = 0.003) and oils and fats (p = 0.028) where the outcome was poor, indicating that there was proportional bias. Regarding the % difference, our results revealed that most of the food groups (fruits, vegetables, cereals and grains, tubers, milk and dairy products, meats, and fish) had good agreement at the group level. Meanwhile, eggs and non-alcoholic beverages had an acceptable level of agreement, while legumes and oils and fats showed a poor group-level agreement (Table 2). Likewise, according to the paired t-test/Wilcoxon signed-rank test (Table 2), most food groups showed good agreement at the group level, except for cereals and grains and tubers, which showed poor agreement at the group level.
Concerning the relative validity at the individual level (Table 2), the correlation coefficients and the results of the cross-classification analyses showed that the intakes of all of the food groups exhibited a strong relationship at the individual level. Only oils and fats showed a poor outcome in the cross-classification analysis, with 14% of individuals being misclassified in the opposite tertile.

Meal Timing
As shown in Figure 3, the Bland-Altman plots showed that most meal timing values fell within the acceptable limits of agreement. In addition, the results of the Bland-Altman Spearman correlation coefficients, the % difference, and the paired t-test/Wilcoxon signed-rank tests revealed that the timing of all meals, reported using the test method, had good agreement at the group level compared to the reference method (Table 3). Similarly, the agreement at the individual level of the test method evaluated using correlation coefficients and cross-classification showed good outcomes for all meal times (Table 3).

Reliability of the Remind App as a Tool to Evaluate Dietary Intake and Meal Timing
Regarding the reliability of the Remind app (Table 4), we observed that according to the ICC [95% CI], energy intake had moderate to excellent reliability, while carbohydrates, and fats and fatty acids (saturated, monounsaturated, and polyunsaturated) had moderate to good reliability. We also found that the reliability of the test method to assess protein, cholesterol, and dietary fiber intake could be considered as good to excellent. Likewise, its reliability to evaluate several vitamin intakes (A, D, B1, B6, B12, and folates) was good to excellent, while the reliability in assessing magnesium was good and the reliability in assessing other mineral intakes (calcium, iron, phosphorus, potassium, and zinc) and some vitamin intakes (E, B2, B3, and C) was moderate to good.
Among food groups (Table 4), the reliability of the test method to evaluate fish intake was excellent, while for legumes and tubers the reliability was good to excellent. For the other food groups (fruits, vegetables, cereals and grains, milk and dairy products, meats, eggs, and non-alcoholic drinks), the ICC [95% CI] indicated that the Remind app had moderate to good reliability, while for oils and fats the reliability was poor to moderate.
Finally, our results showed that the reliability of the test method to evaluate the timing of most meals was excellent. Only in the case of the timing of the mid-afternoon snack and dinner was the reliability good to excellent (Table 4).

Discussion
Our results showed that, compared to the reference method, the Remind app has good relative validity and moderate to excellent reliability to jointly assess dietary intake (energy, macronutrient, and food groups) and meal timing. To our knowledge, this is the first study to evaluate the relative validity and reliability of a mobile app as a tool to assess meal timing. In this regard, our results suggest that image-based records have good agreement at the individual and group levels for assessing meal timing. The latter is important considering that temporal eating patterns, that is, what and when we eat, are important contributors to health [2,30]. Furthermore, evidence from experimental and epidemiological studies in the field of chrononutrition has shown the importance of meal timing and its implications in obesity and its management [2,31–33]. Therefore, validated tools, such as ours, are needed to capture the temporal components of energy and nutrient intake.
Among other significant findings, our results showed that the Remind app had good reliability for assessing the average daily consumption (g/day) of fruits, vegetables, milk and dairy products, meats, eggs, fish, and non-alcoholic beverages. This is in agreement with Matthiessen et al. [13] who evaluated the relative validity of image-based records to assess food group intake. Only in the case of cereals and grains, legumes, tubers, oils and fats were the outcomes suboptimal. In this regard, Boushey et al. [8] postulated that these differences may be due to the fact that a photograph can provide more information than paper-based food records. In our experience, this could be the case for oils and fats, where some participants did not report the use of oil for cooking or salad dressing in paper-and-pen food records, while in image-based records the presence of oil could be clearly seen. This observation is in line with another study, which also indicates that the food group "oils and fats" is commonly misclassified and thus reflects poor agreement at the individual and group levels [34].
Regarding the differences at the group level in the intake of legumes, cereals, and tubers, we hypothesized that the discrepancies could lie in the way in which users and experts (in our study, a registered dietitian) estimate the size of the portions consumed [1,3]. Note that user estimation of portion size is a well-established limitation of pen-and-paper food records [1], and so it is plausible that image-based records may provide a better estimate of portion sizes. The latter could also be in line with Boushey's observation, noting that a photograph can provide more information than paper-based food records. However, it should be taken into account that this type of study investigates the "relative validity" of one method with respect to another, which implies that neither has the absolute truth [7,16].
Among other significant findings, our results showed that the use of the Remind app as an image-based method has good relative validity and a moderate to excellent reliability to assess energy and macronutrient intakes. Similar results were found in other validation studies of image-based methods, where data from energy and macronutrient intakes were within the limits of agreement [14,34]. Furthermore, the results of a systematic review and meta-analysis showed that when used as a primary dietary record, image-based dietary assessments can provide valid results for assessing energy and macronutrient intakes, as with traditional methods (i.e., 24 h dietary recall and estimated/weighted food records) [9].
However, as with any other method to assess dietary intake [7], image-based records provided through the Remind app had their limitations with regard to micronutrient assessment. As such, compared to the reference method, the Remind app showed some differences at the group level when assessing the intake of iron, potassium, zinc, vitamins E, B1, B2, B3, B6, and C, and folates. These results are partially consistent with other validation studies that also found low agreement between image-based records and food records to assess the intake of iron, zinc, and folates [7,35].
Despite our findings, some caution needs to be taken when interpreting our results. First, this study included a relative validation; therefore, it is not possible to conclude that one method is closer to the "true dietary intake" than the other [7]. Second, previous research has shown that three days may be adequate for establishing the mean energy intakes of groups; however, it may not be a long enough duration to accurately measure micronutrient intake [36]. Third, our final sample included 71 young adults, which while similar to other studies that have assessed the relative validity of image-based methods to assess dietary intake [14,37,38], may not be representative of the entire population. Fourth, we also recognize that our sample of young adults may not be representative of the entire population with respect to technology use. Therefore, future studies are needed to assess the relative validity of the Remind app for assessing dietary intake and meal timing in other populations. Nonetheless, our study had several strengths, beginning with the fact that we used a combination of five different tests to evaluate the relative validity and reliability. These, according to Lombard et al. [16], reflect different facets of validity such as agreement, association, or bias at the group or individual level. Second, our study was focused on the validation of the temporal aspects of food intake; so, we evaluated the relative validity of the app to jointly assess dietary intake (energy, nutrients, and food groups) and meal timing in free-living conditions. Third, in the image-based dietary records, participants provided written information in addition to the photograph which may have improved the accuracy of estimates [37]. Fourth, a registered dietitian performed the dietary data entry and analysis. Finally, the Remind app allowed us to access real-time communication to monitor participant progress, potentially reducing the participant burden and improving data quality.

Conclusions
In summary, our results provide evidence of the relative validity and reliability of an image-based method to assess the temporal aspects of food intake. As such, the Remind app has good relative validity and moderate to excellent reliability to jointly assess dietary intake (energy, macronutrients, and most food groups) and meal timing. The latter opens a new framework for chrononutrition, as these image-based methods improve the quality of data collected and also reduce the burden on users to accurately estimate portion size and meal timing. Furthermore, considering that our current lifestyles are characterized by technology and instant communication, the use of novel technologies to assess dietary intake and meal timing in clinical and epidemiological settings is necessary.
Supplementary Materials: The following supporting information can be downloaded at https://www.mdpi.com/article/10.3390/nu15081824/s1, Table S1. Mean values of dietary intake and meal timing estimated through the test (3-day image based food record-Remind app) and reference (3-day handwritten food record) methods. Figure S1. Bland-Altman plots showing mean difference vs. mean intakes (solid line) between the test (Remind app) and reference (3-day handwritten food record) methods, and the lower and upper limits of agreement (dotted lines) for food group intake.