1. Introduction
In everyday life, we frequently have to choose between options offering immediate rewards and those providing benefits in the future. Consider, for instance, the decision-making process at a cafeteria when selecting lunch, where one must decide between a burger and a salad. While the burger may offer immediate satisfaction of taste cravings, the salad represents a healthier and more advantageous choice in the long term. Such decisions, characterized by subjective values and potential costs for each option without an objectively correct answer, are termed value-based decisions [
1]. Each option’s different benefits and costs create a conflict during the decision-making process [
2,
3,
4]. Although there is no objectively right or wrong decision in this type of conflict, consistently favoring immediate rewards over long-term benefits can lead to significant harmful consequences. This is a particular challenge for dietary decisions, where regularly preferring immediately rewarding but unhealthy foods can contribute to obesity, which in turn is associated with adverse outcomes such as depression, anxiety, hypertension, type 2 diabetes mellitus, and substantial healthcare expenditures [
5,
6]. Despite individuals recognizing that a healthier diet is preferable, many attempts to adhere to a diet remain unsuccessful [
7]. Therefore, it is crucial to comprehend how the decision conflict between healthy but less tasty and tasty but unhealthy food options is processed and how it can be influenced in favor of options with long-term benefits.
In experimental psychology, this conflict is typically investigated using a binary food choice task. In this task, participants must decide between one food option that is tasty but less healthy (e.g., a burger) and another food option that is healthy but less tasty (e.g., a salad). The food options are typically presented as images [
8,
9,
10,
11]. To reach a decision, participants are assumed to trade off the taste and health properties of both food items. Since taste is processed faster [
11] and is more immediately rewarding, choosing the healthy option is considered to require self-control [
12]. The number of healthy choices can be increased by manipulations such as providing caloric information [
9] or by presenting explicit or implicit cues that focus on long-term goals and consequences [
13,
14,
15] or emphasize the tastiness of healthy options [
16]. Given the trade-off between taste and health that occurs during decision-making, these manipulations could affect choices by changing the relative weights of the health and taste attributes, thereby introducing a stronger focus on the healthy option [
17]. Similar effects have been found for other types of value-based decisions, such as intertemporal choice, where participants have to choose between an immediate but smaller monetary reward and a delayed but larger monetary reward, e.g., [
10]. However, there is an important difference to consider when comparing food choice tasks to other types of value-based decision-making. In food choice tasks, the health and taste attributes are not presented explicitly; rather, food options are presented as an image and participants assign their own subjective health and taste values to each option. Therefore, it is unclear how this trade-off between health and taste occurs and how other factors determine the decision-making process in food choice.
Hence, exploring potential differences between a value-based trade-off and typical food choice tasks can provide new insights. This can be achieved by comparing a typical food choice task (e.g., choose between an apple and a chocolate bar) and an explicit value-based task, where attribute values of taste and health are presented separately for each food item (i.e., choose between an option that is 80% healthy and 40% tasty and an option that is 40% healthy and 80% tasty). In such a comparison, differences may arise even just through the different presentation modalities (image vs. text). This is because humans can semantically process information faster if it is presented pictorially instead of in words [
18,
19], which can result in more impulsive decisions [
20]. However, regardless of how the decision is framed (as images or as text), participants have to weigh which option they find more valuable overall (i.e., they have to trade off health and taste) to come to a decision. This should induce decision conflict, particularly when the difference in subjective value between both options is small [
8,
21]. Decisions with a small difference in value between the options, i.e., where both options are equally liked, are considered more difficult and are associated with longer response times than decisions with a high difference in value, i.e., where there is a strong preference for one option. In a previous study, we showed that decision conflict and difficulty effects occur in both food choices (presented as images) and in abstract value-based trade-offs of health and taste (presented as text), but that cognitive processing might differ [
22].
In addition, further insights into the decision-making process can be gained by tracking the underlying attentional processes. This can be achieved by recording eye movements, which serve as an indicator of overt visual attention. As stated by the gaze-bias theory [
23], people spend more time looking at the decision option that they eventually end up choosing. Considering food choices, it has been shown that the chosen option has more fixations and longer dwell times compared to the alternative option [
17,
24,
25,
26], i.e., that more attention is paid to it. The visual attention an object attracts is influenced by top-down and bottom-up processes [
27]. Top-down (goal-oriented) processes rely on the voluntary allocation of attention to objects relevant to the goal or task currently being pursued [
28]. In contrast, bottom-up (stimulus-oriented) processes are based on visual features (e.g., color, size, texture) or internal attributes of stimuli that automatically attract someone’s attention [
29]. The attention an object captures can be changed by influencing these processes. Possible ways to influence top-down attention are, for example, health goals and health logos on food. Both reinforce the importance of a healthy diet. Previous research showed that health goal primes increase the probability of choosing a healthy food item and the dwell times on low-energy food products [
30]. Bottom-up attention can be influenced by modifying the stimulus properties. For example, visual salience can affect decision making through its effects on attention by biasing choices towards more visually salient options [
31,
32]. These results also apply to more realistic situations like the supermarket [
33]. As stated above, the modality of the stimulus is also an important factor influencing the decision-making process, as it affects bottom-up attention. Freijy, Mullan, and Sharpe [
34] used a dot-probe task to measure attentional bias toward high-caloric and low-caloric food items presented as pictures or words. The dot-probe task involves short presentations of pairs of images or words on the screen. Subsequently, a probe (typically represented by a dot or asterisk) appears at the location of one of the previously displayed stimuli. Participants are asked to indicate the probe’s location as fast as possible. This procedure allows the discrimination between attention directed toward stimuli and attention oriented away from stimuli, offering a more refined assessment of attentional allocation. Consequently, an attentional bias toward target stimuli is evidenced by the faster detection of probes replacing such stimuli. In contrast, delayed detection of probes replacing such stimuli indicates attentional avoidance of target stimuli. Freijy et al. [34] found an attentional bias toward high-caloric foods when the stimuli were presented as pictures and toward low-caloric foods when stimuli were presented as words. This finding suggests that the modality of the stimulus can influence attentional biases. Although taste and health attributes were not explicitly evaluated, high-caloric foods are often associated with taste, whereas low-caloric foods are connected to health [
35,
36]. However, it is still unknown whether and how attentional biases affect the actual decision-making process in food choices.
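The scoring logic of the dot-probe task described above can be illustrated with a minimal Python sketch. It assumes the common convention of subtracting the mean response time for probes that replace the target stimulus from the mean response time for probes that replace the paired stimulus. The function name and the response times are hypothetical and are not taken from the cited study.

```python
from statistics import mean


def attentional_bias_score(rt_probe_at_target, rt_probe_at_other):
    """Return a bias score in milliseconds.

    Positive values indicate faster detection of probes replacing the
    target stimulus (bias toward the target); negative values indicate
    slower detection (attentional avoidance).
    """
    return mean(rt_probe_at_other) - mean(rt_probe_at_target)


# Hypothetical response times (ms), for illustration only.
bias_toward = attentional_bias_score([480, 500, 490], [520, 530, 510])
bias_away = attentional_bias_score([540, 550, 560], [500, 510, 520])

print(bias_toward, bias_away)
```

Under this convention, the sign of the score distinguishes attention directed toward the target stimuli from attention oriented away from them.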
Therefore, the aim of our study is to close this gap and to understand if and how attentional processing in a typical food choice task deviates from a value-based trade-off. This could have powerful implications for understanding why people behave inconsistently with their own dietary beliefs and goals, for example preferring tasty but unhealthy food even when health is an important attribute for them. Identifying potential attentional biases can hence help to influence the decision conflict and promote healthier food choices and, thus, a healthy lifestyle. In the present study, we combined eye-tracking with a novel experimental setup that we introduced in our previous study [
22]. In the task, participants completed both a typical food choice consisting of food images (subsequently referred to as image condition) and an abstract value-based decision task consisting of options presented as a percentage of perceived taste and health values (subsequently referred to as text condition). We expected to replicate the following behavioral effects from our previous study [
22]:
H1: The proportion of healthy choices is lower in the image condition than the text condition.
H2: Response times in the image condition are shorter than in the text condition.
H3a: For both conditions, response times in healthy choices are longer than in unhealthy choices.
H3b: The effect of choice type on response times differs between both conditions.
H4a: For both conditions, response times for difficult decisions (i.e., decisions with a small value difference between the two decision options) are longer than for easy decisions.
H4b: The effect of difficulty on response times differs between both conditions.
In addition, we derived the following eye-tracking hypotheses from the literature described above:
H5a: For both conditions, there are longer dwell times and more fixations on the decision option that participants eventually choose (i.e., the healthy item in healthy choices and the tasty item in tasty choices), as predicted by the gaze-bias theory [23] and previous findings in food choice tasks [17, 24, 25, 26].
H5b: These effects of dwell time and fixations differ between both conditions, as derived from attentional differences for high- vs. low-caloric food [34] and from effects of visual salience [31, 32].
2. Materials and Methods
2.1. Data Statement
We report how we determined our sample size, all data exclusions (if any), all manipulations, and all measures in the study. All data and analysis scripts are openly accessible at
https://osf.io/7qmdc/. The study was not preregistered.
2.2. Participants
We based our sample size estimation on a small effect size of ηp² = 0.01. Using MorePower 6.0.4 [
37], the power analysis yielded a sample size of 74 participants for a 2 × 2 × 2 repeated measures ANOVA with a power of 0.8. To compensate for possible drop-out, we aimed for a final sample size of 81 participants (74 participants + 10%). We recruited participants from the ORSEE-based database [
38] of the Faculty of Psychology of the Dresden University of Technology, Dresden, Germany. Participants had to meet the following criteria: 18–35 years of age, no weight-loss-oriented restrictive diet, no dyschromatopsia, normal or corrected-to-normal vision, no other eye diseases (e.g., strabismus), not currently in psychotherapeutic treatment, and German language skills at native level. A total of 91 participants took part in the experiment (64 female, 24 male, 3 not specified, mean age = 23.69 years, SD = 4.16 years). Due to technical difficulties, eight participants could not complete the experiment, and two additional participants had to be excluded during data processing (see data preprocessing). The final sample consisted of 81 participants (59 female, 20 male, 2 not specified, mean age = 23.69 years, SD = 4.29 years). For three participants, the behavioral data were recorded but no eye-tracking data were recorded. Therefore, the final sample for analyzing the eye-tracking data included 78 participants (56 female, 20 male, 2 not specified, mean age = 23.69 years, SD = 4.24 years). All participants either received a payment of EUR 12 (EUR 8/h) or course credit for their involvement in the study. Prior to engaging in the experiment, all participants provided informed consent. The study was performed in accordance with the Declaration of Helsinki and was approved by the ethics committee of Dresden University of Technology, Germany (IRB00001473/EK47022016).
2.3. Apparatus and Stimuli
We used an EyeLink 1000 desk-mounted eye tracker (SR Research, Kanata, ON, Canada) to record eye movements, with a reported average accuracy between 0.25° and 0.50° of visual angle and a root mean square resolution of 0.01° (www.sr-research.com). We tracked the movements of the participant’s right eye using the combined pupil and corneal reflection setting at a 500 Hz sampling rate. A chin rest was employed to ensure minimal interference from head movements and to maintain a consistent eye-to-monitor distance of 60 cm. Before each block of the experimental task, we ran a nine-point calibration (a grid of three horizontal by three vertical positions) and a drift correction. The eye tracker was re-calibrated if validation yielded an average error of over 1°. Overall, the eye tracker achieved an average error across participants of 0.38° (SD = 0.11°), with a mean maximum error of 0.85° (SD = 0.37°).
The task was presented on a 24-inch screen (1920 × 1080 pixels, 72 Hz). We used the Psychophysics Toolbox 3 [
39] in MATLAB 2018b (The MathWorks Inc., Natick, MA, USA) on a Windows 10 computer to present the experiment. Participants used a wired USB keyboard to make their responses (QWERTZ layout, ‘Y’ for the left and ‘M’ for the right option). In the image condition, we employed a set of colored food images (300 × 225 pixels) sourced from the Food-pics image database [40]. In the text condition, we used numbers as percentages and the German words ‘gesund’ (meaning healthy) or ‘lecker’ (meaning tasty) to display taste and health values for each option. The text was presented in the font ‘Calibri’ at font size 36. For each option, both attributes (taste and health percentages) were displayed together, with a combined size of 176 × 80 pixels (see Figure 1).
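To relate the stimulus sizes in pixels to the eye tracker's accuracy in degrees of visual angle, the setup can be converted with a short calculation. The following Python sketch is an illustration only: it assumes a 16:9 panel with square pixels (suggested by the 1920 × 1080 resolution but not stated explicitly), a viewing distance of exactly 60 cm, and a stimulus centered on the line of sight. All function names are hypothetical.

```python
import math

DIAGONAL_CM = 24 * 2.54        # 24-inch screen diagonal in centimeters
RES_X, RES_Y = 1920, 1080      # screen resolution in pixels
DISTANCE_CM = 60.0             # eye-to-monitor distance

# Assumption: square pixels, so the aspect ratio equals RES_X / RES_Y.
width_cm = DIAGONAL_CM * RES_X / math.hypot(RES_X, RES_Y)
px_per_cm = RES_X / width_cm


def px_per_degree(distance_cm=DISTANCE_CM):
    """Pixels spanned by one degree of visual angle at screen center."""
    one_degree_cm = 2 * distance_cm * math.tan(math.radians(0.5))
    return one_degree_cm * px_per_cm


def size_in_degrees(size_px, distance_cm=DISTANCE_CM):
    """Visual angle (degrees) subtended by a centered extent in pixels."""
    size_cm = size_px / px_per_cm
    return math.degrees(2 * math.atan(size_cm / (2 * distance_cm)))


ppd = px_per_degree()
image_width_deg = size_in_degrees(300)   # food image width
text_width_deg = size_in_degrees(176)    # text stimulus width
print(round(ppd, 1), round(image_width_deg, 2), round(text_width_deg, 2))
```

Under these assumptions, one degree corresponds to a few dozen pixels, so both stimulus formats are several times wider than the reported average validation error.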
2.4. Procedure
First, participants provided informed consent and completed a brief pre-experiment questionnaire, disclosing information about their age, gender, current health status, and mood. Additionally, they underwent a vision assessment to confirm normal or corrected-to-normal vision and assess color perception. Subsequently, each participant was presented with two sets of materials printed on paper. The first set featured 274 food images [
40], encompassing various items ranging from vegetables and fruits to indulgent treats like chocolate and candy and fast food items like hamburgers or pizza. The second set comprised 28 abstract textual descriptions detailing combinations of healthiness and tastiness values represented as percentages (e.g., 20% healthy and 80% tasty). Participants were asked to associate specific food items from the first set with the abstract value combinations presented in the second set. For instance, each participant selected a food item they considered 90% healthy and 10% tasty. These personally chosen food images then formed the basis for the individualized stimulus sets used for the trials in the main experiment. Next, participants were informed about the eye-tracking setup. If necessary, the chair height was changed so the participants could comfortably position their heads on the chin rest.
Subsequently, participants engaged in a food choice task requiring them to indicate their preference between two presented food items. The decision task comprised four blocks, each containing 148 trials (for task details, refer to the task design section), with self-paced breaks provided between these blocks. Each trial began with the display of a fixation cross, followed by the presentation of the stimuli (see Figure 1). Following the onset of the stimulus, participants had to make their decision within four seconds. Upon responding or upon the expiration of this time limit, the subsequent trial commenced, marked by the appearance of another fixation cross. The intertrial interval (ITI) was randomized between 750 ms and 1000 ms. To familiarize themselves with the task, participants completed a tutorial of 30 practice trials before performing the actual decision task. The eye-tracking system was validated (and re-calibrated if necessary) after the tutorial and between blocks.
After the food choice task, participants completed a brief post-experiment questionnaire, which inquired about their current emotional state, mood, and conjecture regarding the purpose of the experiment. Overall, the experiment had a duration of approximately 1.5 h.
2.5. Task Design
We implemented a within-subjects factorial design with the factors presentation format and trial type. The presentation format consisted of two levels: percentage-based descriptions (referred to as the text condition) or images of food items (the image condition), which were manipulated on a block-wise level. In the text condition, food items were described using numerical percentages for their health and taste attributes. For instance, participants were presented with a choice between an item that was 90% healthy and 50% tasty and another item that was 20% healthy and 100% tasty. In the image condition, food items were represented by images (e.g., an apple and a cookie) (see Figure 1).
The trial type consisted of conflict trials and control trials. Conflict trials required self-control by offering the choice between one food option with a higher health percentage (termed the “healthy item”) and another food option with a higher taste percentage (termed the “tasty item”). The health value of the healthy item was either 90% or 100%, and its taste value was either 10%, 20%, 40%, 50%, 70%, or 80%, yielding twelve unique combinations. The taste value of the tasty item was either 90% or 100%, and its health value was 10%, 20%, 40%, 50%, 70%, or 80%, resulting in twelve additional combinations. Pairing each healthy item with each tasty item yielded a total of 144 unique conflict trials. Control trials were designed to assess the consistency of participants’ decision making. They consisted of one food item that was both healthy and tasty and one item that was neither healthy nor tasty. The healthy and tasty items were either 90% healthy and 100% tasty or 100% healthy and 90% tasty. The items that were neither healthy nor tasty were 10% healthy and 20% tasty or 20% healthy and 10% tasty. This yielded a total of four control trials. All trials (conflict and control) were presented in four blocks, with each block containing all trials in randomized order (4 × 148 = 592 trials). Two of these blocks presented images as stimuli and two blocks presented percentages (order balanced across participants). Additionally, we counterbalanced the positions of the healthy food item (right or left side of the screen) and, in the text condition, the position of the attributes (health value above taste value and vice versa) across participants.
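The trial counts above follow from fully crossing the item sets, which can be verified with a short enumeration. The Python sketch below is an illustration and not the original MATLAB implementation; items are represented as hypothetical (health, taste) tuples, and all variable names are assumptions.

```python
from itertools import product

HIGH_VALUES = (90, 100)
LOW_VALUES = (10, 20, 40, 50, 70, 80)

# Items are (health, taste) percentage tuples.
healthy_items = [(h, t) for h, t in product(HIGH_VALUES, LOW_VALUES)]
tasty_items = [(h, t) for t, h in product(HIGH_VALUES, LOW_VALUES)]

# Each conflict trial pairs one healthy item with one tasty item.
conflict_trials = list(product(healthy_items, tasty_items))

# Control trials pair a healthy-and-tasty item with an item that is neither.
control_good = [(90, 100), (100, 90)]
control_bad = [(10, 20), (20, 10)]
control_trials = list(product(control_good, control_bad))

trials_per_block = len(conflict_trials) + len(control_trials)
total_trials = 4 * trials_per_block
unique_items = set(healthy_items + tasty_items + control_good + control_bad)

print(len(conflict_trials), len(control_trials), trials_per_block, total_trials)
```

The number of unique items in this enumeration also matches the 28 value combinations that participants matched to food images before the main task.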
To assess decision difficulty, we summed the health and taste values of each food item and calculated the absolute difference between the sums of the two items in a trial. These differences could be 0, 10, 20, 30, 40, 50, 60, 70, or 80. Decisions with differences falling within the range of 0 to 20 were categorized as difficult, while those within the range of 60 to 80 were considered easy.
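The difficulty measure can be sketched in a few lines of Python. This is a minimal illustration under the assumption that items are represented as (health, taste) tuples; the function names are hypothetical. Differences of 30 to 50 are not assigned to either category in the text, so the sketch returns None for them.

```python
from itertools import product

HIGH_VALUES = (90, 100)
LOW_VALUES = (10, 20, 40, 50, 70, 80)


def value_difference(item_a, item_b):
    """Absolute difference between the summed health and taste values."""
    return abs(sum(item_a) - sum(item_b))


def difficulty(item_a, item_b):
    """Classify a trial as 'difficult' (0-20) or 'easy' (60-80)."""
    diff = value_difference(item_a, item_b)
    if diff <= 20:
        return "difficult"
    if diff >= 60:
        return "easy"
    return None  # 30-50: not assigned to either category in the text


# Enumerate all conflict trials to list the differences that can occur.
healthy_items = [(h, t) for h, t in product(HIGH_VALUES, LOW_VALUES)]
tasty_items = [(h, t) for t, h in product(HIGH_VALUES, LOW_VALUES)]
possible_differences = sorted(
    {value_difference(a, b) for a, b in product(healthy_items, tasty_items)}
)

print(possible_differences)
print(difficulty((90, 50), (20, 100)))   # sums 140 vs. 120
print(difficulty((100, 80), (10, 90)))   # sums 180 vs. 100
```

For example, the choice between an item that is 90% healthy and 50% tasty and an item that is 20% healthy and 100% tasty has a value difference of 20 and therefore counts as difficult.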
2.6. Data Preprocessing
To check whether participants made valid choices, we analyzed choices in control trials (where we expected participants to choose the item that was both tasty and healthy instead of the item that was neither). Two participants were excluded due to their proportion of incorrect choices in control trials (25% and 50%, respectively). Moreover, we calculated the mean overall response times for each participant. One participant had very short response times (M = 0.64 s; SD = 0.18 s; Min = 0.26 s; Max = 1.53 s). However, since the exclusion of this participant did not influence the quality of the results, we decided to retain this dataset in the analysis.
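The control-trial consistency check amounts to computing, per participant, the proportion of control trials in which the item that was neither healthy nor tasty was chosen. The Python sketch below illustrates this with made-up data. The cutoff value is purely hypothetical, since no exclusion threshold is stated here, and all names are assumptions.

```python
def control_error_rate(chose_good_item):
    """Proportion of control trials with an unexpected choice.

    chose_good_item: list of booleans, one per control trial, True if the
    healthy-and-tasty item was chosen.
    """
    errors = sum(1 for ok in chose_good_item if not ok)
    return errors / len(chose_good_item)


# HYPOTHETICAL cutoff for illustration; not taken from the study.
MAX_ERROR_RATE = 0.20

# Made-up data: 16 control trials per participant (4 per block x 4 blocks).
control_choices = {
    "p01": [True] * 16,
    "p02": [True] * 12 + [False] * 4,
    "p03": [True] * 8 + [False] * 8,
}

error_rates = {pid: control_error_rate(c) for pid, c in control_choices.items()}
flagged = sorted(pid for pid, rate in error_rates.items() if rate > MAX_ERROR_RATE)

print(error_rates)
print(flagged)
```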
2.7. Data Analysis of Behavioral Data
We used MATLAB 2020b for data processing and analysis. For statistical analyses, we utilized JASP version 0.10.2.0 [41]. We analyzed participants’ healthy choice percentage in conflict trials using a repeated measures analysis of variance (rmANOVA) with the factors Visualization (text; image) and Difficulty (easy; difficult). We analyzed response times in conflict trials using an rmANOVA with the factors Visualization (text; image), Difficulty (easy; difficult), and Choice (healthy; unhealthy). For the analysis of the eye-tracking data, we also calculated an rmANOVA including the within factors Visualization (text; image), Choice (healthy; unhealthy), and Item (healthy; tasty). Furthermore, as an exploratory analysis, we calculated an rmANOVA including the within factors Visualization (text; image), Difficulty (easy; difficult), and Item (healthy; tasty).
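Before a repeated measures ANOVA can be run, trial-level data have to be aggregated into one value per participant and design cell. The following Python sketch illustrates this step for the healthy choice percentage in the Visualization × Difficulty design. It is not the authors' MATLAB or JASP pipeline; the trial dictionary layout and all names are assumptions, and the data are made up.

```python
from collections import defaultdict


def healthy_choice_percentages(trials):
    """Aggregate conflict trials into healthy-choice percentages.

    Returns a dict keyed by (participant, visualization, difficulty).
    """
    counts = defaultdict(lambda: [0, 0])  # key -> [healthy choices, trials]
    for trial in trials:
        key = (trial["participant"], trial["visualization"], trial["difficulty"])
        counts[key][0] += trial["choice"] == "healthy"
        counts[key][1] += 1
    return {key: 100.0 * healthy / n for key, (healthy, n) in counts.items()}


# Made-up example trials for a single participant.
example_trials = [
    {"participant": 1, "visualization": "text", "difficulty": "easy", "choice": "healthy"},
    {"participant": 1, "visualization": "text", "difficulty": "easy", "choice": "healthy"},
    {"participant": 1, "visualization": "text", "difficulty": "easy", "choice": "unhealthy"},
    {"participant": 1, "visualization": "text", "difficulty": "easy", "choice": "healthy"},
    {"participant": 1, "visualization": "image", "difficulty": "easy", "choice": "healthy"},
    {"participant": 1, "visualization": "image", "difficulty": "easy", "choice": "unhealthy"},
]

cell_means = healthy_choice_percentages(example_trials)
print(cell_means)
```

The same aggregation pattern applies to response times, dwell times, and fixation counts by replacing the counted quantity and the factor keys.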
4. Discussion
In the present study, we examined how the framing of food options affects decision conflict and visual attention in food choices. Participants made binary choices between food items that were healthy but less tasty or less healthy but tasty (based on participants’ individual ratings). The food items were presented either as a typical food choice task in the form of images (e.g., an apple; image condition) or as pre-matched value-based percentages of health and taste properties (e.g., 90% healthy and 20% tasty; text condition). At the behavioral level, we found a lower proportion of healthy choices and shorter response times in the image condition compared to the text condition, replicating our previous findings [22]. In terms of visual attention, the results showed more fixations and longer dwell times on the item corresponding to the subsequent choice in the text condition (i.e., on the healthy item for healthy choices and on the tasty item for unhealthy choices). In the image condition, however, this applied only to the healthy item.
The current findings indicate that the presentation of food items as images, as is typical for food choice tasks, induces a greater temptation to opt for the unhealthy choice compared to a mere trade-off choice based on health and taste values. Participants showed a significantly higher preference for the tasty (i.e., unhealthy) option in the image condition than in the text condition. Furthermore, we found shorter response times in the image condition compared to the text condition. There are two possible explanations for this finding: either a generally faster decision-making process leads to more impulsive decisions, or there are general differences in the decision-making process between the conditions.
The assumption of a faster decision-making process is supported by the fact that pictorial information is processed faster [
18,
19], potentially resulting in faster decision-making and consequently more impulsive selections [
42]. Moreover, in the text condition, where both health and taste information were explicitly presented, the need for explicit processing of additional information could lead to extended decision times. The prolonged processing in itself causes longer decision times and could contribute to less impulsive choices. Furthermore, previous research has demonstrated that taste is processed more rapidly and therefore has a greater influence on decisions [
11,
43], which could support more impulsive decisions.
However, there are also some indications suggesting that it is not only a matter of faster and hence more impulsive decisions in the image condition, but that the decision processes differ between both conditions. First, in the text condition, we found effects in the response times that are comparable with prior research concerning value-based and rational choice results, like longer response times for difficult choices [
44]. This is to be expected for a trade-off between attributes of two options. Interestingly, this effect was much smaller in the image condition. In addition, the attention processes seem to differ between the two conditions. In the text condition, results show more fixations in difficult trials [
45] and longer dwell times for difficult decisions. Furthermore, we found evidence for the gaze cascade effect in the text condition, which refers to the tendency for attention to shift toward the eventually chosen option over the course of decision-making [
46,
47]. For example, we found more fixations and longer dwell times on the item corresponding to the subsequent choice, and the item fixated last was, on average, the one subsequently chosen (i.e., the healthy choice if the healthy item was fixated last and the unhealthy choice if the tasty item was fixated last). In the image condition, however, we found no differences in response times as a function of difficulty and no differences in fixations or dwell times between difficult and easy decisions. Moreover, only minor evidence of the gaze cascade effect [
47], according to which the last fixation predicts choice, was found in this condition. Taken together, this suggests that decisions in typical food choice tasks that present food items as images go beyond a trade-off of health and taste attributes. In these tasks, participants deviate from their own subjective trade-off between health and taste, as assessed in the text condition. Instead, when food options are presented as images, the tasty option seems to influence participants’ attention and decision-making processes quite strongly, as evidenced by the tasty item receiving the same attention in healthy decisions as in unhealthy decisions, and by participants choosing the tasty option even if their last fixation was on the healthy option.
As different decision strategies might be involved in both conditions, an interesting question for future studies is whether an option-based or attribute-based strategy is used for decision-making in the text condition. These two approaches to decision-making are discussed in the existing body of research on classical intertemporal choice tasks [
48,
49,
50,
51]. In the option-based approach, it is posited that a subjective value is derived for each option by considering both the amount and time associated with it, and the option with the higher subjective value is subsequently chosen. Conversely, the attribute-based approach contends that decisions are made through a direct comparison of attributes, such as amount versus amount and time versus time. When applied to the context of food choice in our study, both decision strategies, namely the option-based approach (where subjective values are compared) and the attribute-based approach (where attributes are directly compared), can be employed in the text condition. However, in the image condition, the health and taste attributes are already presented in an integrated manner, suggesting a predisposition towards an option-based decision-making process. This aspect may have also contributed to the observed differences between the two conditions. However, this could not be investigated in the present study, as we presented the health and taste values close to each other to achieve comparable stimulus sizes between the text and image condition.
Our findings have important implications for the field of food choice research. Our results suggest that decision-making in food choices does not merely reflect a trade-off between the health and taste properties of each food item. This could explain why people sometimes make dietary choices that are inconsistent with their own dietary goals and beliefs (e.g., preferring a tasty but less healthy option even though healthiness of food items matters to them). Our results also offer suggestions on how to promote healthy food choices. First, when participants have to make an abstract trade-off between health and taste attributes of two food items, they make a higher number of healthy choices than when matching food items are presented as images. This suggests that the representation of food as an image should be avoided in order to promote healthy choices, e.g., in school cafeterias or canteens. However, prior literature also shows that simultaneous presentation of image and additional information (e.g., caloric information, tastiness of healthy items) leads to more healthy choices [
9,
16]. Future studies could investigate the question as to whether it is solely the speed of decision-making that leads to more impulsive decisions with visual stimuli, or if it is specifically the stronger weighting of taste-related information. This could be investigated, for instance, by either imposing a minimum decision-making time or by comparing the effects of potentially relevant information (such as caloric information) with potentially irrelevant information (such as unrelated text) when presented concurrently with the visual stimulus. Second, the results suggest that visual attention has a different role in the image condition than in the text condition, but still has an effect on the decision process. This could indicate that healthy choices can be promoted by controlling visual attention (e.g., through salience of the stimuli or through presentation order). This is supported by a study by Dai et al. [
52], who showed that perceptual salience can influence food choices irrespective of health and taste preferences. Therefore, making healthy food items more salient when presenting them visually might increase healthy food choices.
Some limitations should be considered when interpreting the results of the current study. First, the stimulus material for the two task conditions was very different. While the typical food choice task consisted of images, the value-based trade-off task consisted of abstract numerical health and taste attributes. It could be argued that this abstract presentation is too far removed from realistic food choices to draw conclusions about decision making in food choices. However, as such a trade-off is often assumed to be the basis for food choices, it is important to explicitly investigate this question. Future studies could build on this line of research by comparing the value-based trade-off task with a text-based food choice task (i.e., where food stimuli are presented as words instead of images).
Second, as outlined above, we presented health and taste values together as one area of interest. While this was necessary to achieve a setup comparable to the image condition, it limits the research questions that can be addressed in our experiment. In future studies, both properties could be presented in such a way that the fixations and gaze duration for the individual properties can be examined separately. This would allow us to gain further insights into potential differences in option-based vs. attribute-based decision-making strategies.