1. Introduction
Bowel cancer, also known as colorectal cancer, is responsible for 10% of all cancer-related deaths in the UK [1] (more recent data were not fully available at the time of writing) and ranks as the fourth leading cause of cancer mortality worldwide [2]. Early detection is vital for effective treatment and for reducing the mortality associated with this disease [3]. One widely used screening tool is the Faecal Immunochemical Test (FIT), which identifies traces of blood in stool, a possible sign of bowel cancer [4]. Detecting the disease at an early stage through FIT enables healthcare providers to intervene promptly, improving patient outcomes and helping to lower overall mortality rates [4].
Systematic screening programmes are standard across much of the world and routinely use visual materials to support uptake and encourage participation [5]. Despite this effort and promotion, wide inequalities in screening uptake persist; in some countries, completion rates for such programmes are as low as 16% [5]. According to the latest available figures from the NHS Bowel Cancer Screening Programme (England, April 2023–March 2024), the uptake (more precisely, the completion rate) for bowel cancer screening among the eligible age group of 60 to 74 years is 67.6%, but it is markedly uneven across regions and socioeconomic groups. For instance, uptake was 55.8% in the most deprived wards, compared with 75.8% in the least deprived wards [1].
Health marketing is the planned application of marketing and communication principles to promote healthy behaviours, services, and interventions [6]. It entails crafting targeted messages, delivered via media including YouTube and Facebook [7], alongside community mobilization, to create awareness, change attitudes, and promote positive health actions. Over time, health marketing has been a key driver of changes in public behaviour, e.g., increasing vaccination rates, smoking cessation, balanced diets, and participation in screening programmes [8,9]. By making health more understandable and accessible, such campaigns have helped to bridge knowledge gaps, de-stigmatize conditions, and enable individuals to make informed choices about their health.
Numerous studies have used eye-tracking and other neuromarketing tools to investigate the impact of health promotion campaigns on behaviour change. For example, Skandali et al. [10] examined breast cancer advertisements using Facial Expression Analysis software (Noldus FaceReader version 9); Champlin et al. [11] used eye-trackers to measure the attention attracted by various adverts around obesity, pregnancy, and other topics; and Milošević et al. [12] used eye-trackers to evaluate the effect of monochromatic public health messages, particularly those related to alcohol consumption and tobacco use, among 58 adults. Our eye-tracking study builds on this research: we investigate the individual elements shown in bowel-cancer screening advertisements, contrasting anticipated-regret (AR) with positive (P) slogan framing, and assess whether these effects depend on image type. The complexity associated with AR must also be acknowledged, as some research suggests it can reduce bowel-cancer screening intentions [13].
Regret is an aversive emotion people experience when they perceive that their present circumstances could have been better had they made different choices at an earlier moment in time [14,15]. Scholars agree that regret can be felt retrospectively, about previous decisions, or anticipated when contemplating future ones. They note that regret is a multifaceted, comparative emotion that is normally grounded in self-blame. Landman [16] defines regret as a cognitively motivated reaction to irreversible choices that generate uncertainty. Similarly, ref. [17] defines regret as a painful awareness that one’s existing situation is not as desirable as it could otherwise have been.
At the centre of the regretful experience is counterfactual thinking: mentally simulating “what could have been” by comparing what has transpired with what could have transpired [18]. Counterfactuals normally take the form of “if–then” statements, where the “if” is a person’s action and the “then” is a valued outcome. They not only help people make sense of what has happened in life; they also shape future choices and behavioural change [19,20]. Counterfactual thinking therefore plays a critical role in goal-directed behaviour [18], including cancer screening [2,21].
Research has revealed that regret about a future action, or inaction, can powerfully affect decision-making. In particular, anticipating regret about performing a specific action reduces one’s propensity to perform it, while anticipating regret about failing to act raises one’s propensity to act. Anticipated regret has been found to predict healthy behavioural intentions, such as drinking alcohol within safe limits [22], avoiding junk food [23], and engaging in physical activity [24].
Importantly, anticipated regret predicts health intentions uniquely, over and above the standard determinants of attitudes, perceived behavioural control, and subjective norms [14,15,25,26]. Contemporary meta-analyses confirm that interventions aimed at increasing negative anticipated feelings such as regret have the potential to effect large changes in health behaviour [24,26,27]. Overall, this literature points to the contribution of anticipated regret to better health-related decision-making.
Historically, many public health campaigns have aimed to increase individuals’ awareness of disease risks, operating under the assumption that greater knowledge would lead to healthier behaviour. However, research suggests that emotional responses to health-related activities are often stronger predictors of behaviour than factual understanding of the associated risks. In particular, people tend to perceive greater threats to their health when they experience a decline in trust [8,9,10,28].
Although there has been extensive research on (i) message framing for health messages, (ii) visual style and imagery in advertisements, and (iii) visual salience and layout, existing research tends to analyse these levers in isolation, and very seldom in bowel-cancer screening using ROI-level eye-tracking. Framing studies usually measure self-reported attention or intention rather than overt attentional capture and maintenance; visual style comparisons (e.g., illustration vs. photography vs. AI) usually do not investigate interactions with slogan processing; and salience/layout manipulations shift placement or contrast without a theory-informed framing contrast. We therefore lack cumulative evidence as to whether an anticipated-regret frame consistently accelerates early orienting (TTFF) and sustained processing (dwell, fixations, revisits) across image types, or whether identity congruence merely speeds up the “first glance” without deepening processing. Filling this cumulative gap is essential for practice because creative decisions (copy + imagery + placement) are made jointly; guidance limited to any one factor may lead to suboptimal campaign design.
Against this backdrop and the integrated gap outlined above, we examine how message framing and visual design influence attention to bowel-cancer screening advertisements at two processing stages: early orienting versus maintenance. Anticipated-regret (AR) framing should increase vigilance by eliciting counterfactual thinking (“what if I miss screening?”), perhaps speeding early orienting to the call-to-action slogan, while positive (P) framing should facilitate maintained inspection via benefit-highlighting and reassurance. We therefore examine whether AR and P frames differentially direct attention dynamics over time, indexed by time to first fixation (TTFF) compared with dwell-based measures, onto the slogan area of interest (AOI). We also examine how image category shapes these effects. Visual form can modify perceptual fluency, novelty, and credibility cues: AI images can trigger fast orienting on the basis of novelty but cast doubt on authenticity; hand drawings can feel intimate and prosocial, supporting fixation; older stock images can convey institutional credibility but compete less for attention. We therefore manipulate AI-generated, hand-drawn, and older stock images to assess how design form enhances or reduces framing effects on both initial (TTFF) and sustained (dwell, number of fixations, revisits) attention. Lastly, we treat identity congruence, the fit between viewer-perceived ethnicity and the ethnicity depicted in the ad, as our boundary condition. Identity match should enhance perceived self-relevance and trust, two drivers of attention and compliance. If AR framing is stronger when self-relevance is high, identity congruence could enhance initial capture and increase time spent on the slogan; mismatch could weaken or even reverse these effects. In sum, our study ties the theory of anticipated regret and counterfactual thinking to empirically testable attentional mechanisms in real advertising material, with the expectation that visual design and social identity content could affect these mechanisms systematically.
Our main goal is to measure how audiences distribute visual attention between the slogan and other ad components in bowel-cancer screening ads as a function of slogan frame (positive vs. anticipated-regret) and image type (hand-drawn, AI-generated, older stock photo). Early orienting is operationalized as TTFF to the slogan AOI, whereas sustained interest is indexed by total dwell time, number of fixations, and number of revisits to the slogan AOI, compared with non-slogan AOIs (e.g., person/scene, logos, symbols). A further goal is to explore moderation by (a) image type, i.e., whether framing effects vary across AI-generated, hand-drawn, and stock photos, and (b) identity congruence, i.e., whether congruence between viewers’ ethnicity and that of the people depicted increases or decreases framing effects on initial and sustained attention. Exploratorily, we investigate whether effects generalize across ad layouts and whether image-type differences reflect novelty (fast orienting) rather than credibility (extended looking). Collectively, these goals yield an evidence-based account of how creative decisions in message framing and imagery translate into measurable attention profiles that are theoretically associated with screening intentions. Thus, the following hypotheses were formulated:
H1 (early attention): AR slogans will yield faster time to first fixation (TTFF) on the slogan AOI compared to P slogans.
H2 (sustained attention): AR slogans will receive greater total dwell time on the slogan AOI compared to P slogans.
H3 (image-type moderation): Attention to slogans (TTFF, dwell time, fixation count, revisits) will differ by image type (AI vs. hand-drawn vs. older stock).
3. Data Analysis and Results
We structure the results according to attention phase and hypothesis. We first test H1 (early orienting) with time to first fixation (TTFF) on the slogan AOI [36]. Second, we test H2 (sustained engagement) with dwell time on the slogan AOI, with fixation count and revisit count as further measures of persistence [37,38]. For H3, we examine ImageType (HD, OLD, AI) both as a main effect and as a moderator of framing effects [39,40]; as a boundary condition, we examine IdentityMatch (match vs. non-match) on the same measures, and in cross-test analyses we examine ImageType main effects and moderation while controlling for IdentityMatch. All models were adjusted for ImagePopulation (W vs. SA), age (centred), gender, and education, and participant-level clustering or random effects were applied as needed [41,42].
We used complementary estimators to align with outcome distributions and to test robustness rather than to duplicate findings: the base models were a linear mixed-effects model (REML) on log-TTFF (with random intercepts and slopes by participant), a Gamma GEE with a log link for dwell time, a Poisson GEE for number of fixations (switching to negative binomial where over-dispersion was identified), and a negative binomial GEE for number of revisits. Median (τ = 0.50) quantile regressions with cluster-bootstrap CIs offer distribution-robust estimates of central tendency. Convergent inferences across model families increase confidence that results are not artefacts of a single modelling decision. Robustness analyses comprise random-slope LMM alternatives, median (τ = 0.50) quantile regression with cluster-bootstrap CIs, and sensitivity to winsorisation/log-transform alternatives and TTFF censoring rules [43,44,45]. We present effects as time ratios (TRs) for TTFF, ratios of means (RMs) for the Gamma models, and incidence rate ratios (IRRs) for the count models; the models are linear mixed-effects models (LMMs) and generalized estimating equations (GEEs). Model assumptions and diagnostics were verified as follows: residual scale–location plots and QQ-plots (normality/heteroscedasticity) and convergence notes for LMMs; deviance/Pearson residuals and appropriateness of the log link for Gamma GEEs; and over-dispersion tests (Pearson χ²/df) with negative binomial substitution where necessary, plus working-correlation testing (QIC) with stability checks for count GEEs. Figures show estimated marginal means and effect sizes with 95% CIs; tables show model coefficients and ratios (TR for TTFF; RM/IRR for dwell/counts). Model diagnostics (convergence notes, residual/over-dispersion tests, and working-correlation fit) are reported with each result as appropriate.
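To make the count-model workflow concrete, the following is a minimal sketch (not the authors’ code) of the over-dispersion check and negative binomial substitution described above, using Python’s statsmodels; the file and column names (trials.csv, Fixations, ParticipantID, and so on) are assumptions for illustration.

```python
import pandas as pd
import statsmodels.api as sm

df = pd.read_csv("trials.csv")  # hypothetical trial-level data

formula = ("Fixations ~ C(SloganType) * C(ImageType) + C(IdentityMatch)"
           " + C(ImagePopulation) + Age_c + C(Gender) + C(Education)")

poisson_fit = sm.GEE.from_formula(
    formula, groups="ParticipantID", data=df,
    family=sm.families.Poisson(),
    cov_struct=sm.cov_struct.Exchangeable(),
).fit()

# Pearson chi-square divided by residual df; values well above 1 indicate
# over-dispersion, in which case we refit with a negative binomial family.
dispersion = (poisson_fit.resid_pearson ** 2).sum() / poisson_fit.df_resid
if dispersion > 1.5:  # cut-off is a judgment call, not stated in the paper
    nb_fit = sm.GEE.from_formula(
        formula, groups="ParticipantID", data=df,
        family=sm.families.NegativeBinomial(alpha=1.0),
        cov_struct=sm.cov_struct.Exchangeable(),
    ).fit()

# Working-correlation comparison via QIC (lower is better); the qic() method
# is available in recent statsmodels versions.
print(poisson_fit.qic())
```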
3.1. H1—Early Attention (TTFF to Slogan AOI)
We fit a linear mixed-effects model (REML) of log-TTFF (ms) with fixed effects of SloganType (AR vs. P), ImageType (HD, OLD, AI), IdentityMatch (match vs. non-match), and the interactions SloganType × ImageType and SloganType × IdentityMatch; covariates were ImagePopulation (W vs. SA), age (centred), gender, and education. Random effects were a by-participant intercept and a random slope for SloganType. Optimizer warnings suggested non-convergence, so fixed-effect estimates must be interpreted with caution, even though the patterns were consistent with the marginal means (Table 1).
Compared to AR, P slogans evoked slower TTFF to the slogan AOI, b = 0.342, SE = 0.014, z = 23.998, p < 0.001, time ratio (TR) = 1.41 [1.37, 1.45]. Relative to HD, OLD photos slowed TTFF (b = 0.141, SE = 0.013, z = 11.066, p < 0.001, TR = 1.15 [1.12, 1.18]) and AI photos slowed TTFF even more (b = 0.239, SE = 0.013, z = 18.760, p < 0.001, TR = 1.27 [1.24, 1.30]). Identity match predicted quicker TTFF (b = −0.147, SE = 0.011, z = −13.330, p < 0.001, TR = 0.86 [0.85, 0.88]). The interaction SloganType × ImageType was not significant (OLD: p = 0.602; AI: p = 0.076). Baseline (AR, HD, W, non-match, Age_c = 0) was exp(6.350) ≈ 572 ms.
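As a check on the reported ratios: the time ratio is simply the exponentiated log-scale coefficient, so the headline framing effect reproduces as TR = exp(b) = exp(0.342) ≈ 1.41, with 95% CI = exp(0.342 ± 1.96 × 0.014) ≈ [1.37, 1.45], and the baseline latency as exp(6.350) ≈ 572 ms.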
Support was found for H1: AR > P in early orienting (shorter TTFF). Large main effects of ImageType (HD < OLD < AI) were also found, and identity match speeded orienting.
Specification: log_TTFF ~ C(SloganType) × C(ImageType) + C(SloganType) × IdentityMatch + C(ImagePopulation) + Age_c + C(Gender) + C(Education) + (1 + C(SloganType) | ParticipantID).
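A minimal Python sketch of this specification follows, assuming a trial-level data frame with the variables named above; statsmodels’ MixedLM is used here for illustration and is not necessarily the authors’ software.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("trials.csv")  # hypothetical trial-level data

# Random intercept and random slope for SloganType by participant, mirroring
# the (1 + C(SloganType) | ParticipantID) term in the specification above.
lmm = smf.mixedlm(
    "log_TTFF ~ C(SloganType) * C(ImageType)"
    " + C(SloganType) * IdentityMatch"
    " + C(ImagePopulation) + Age_c + C(Gender) + C(Education)",
    data=df,
    groups="ParticipantID",
    re_formula="~C(SloganType)",
)
fit = lmm.fit(reml=True)          # REML estimation, as reported
print(fit.summary())
print(fit.converged)              # inspect the convergence warning noted above
print(np.exp(fit.fe_params))      # fixed effects on the ratio scale (TRs)
```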
In the P panel, all lines except the bottom one slope upward (HD < OLD < AI), and the mean increases from ~760 to ~870 to ~940 ms, showing slower orienting to older stock and AI pictures. The minimal crossing and broadly parallel lines indicate similar image-type effects across groups, as expected from the significant ImageType main effects and the non-significant Slogan × ImageType interaction in the TTFF LMM. This trend supports H3 (image-type differences) and is consistent with the ECDF in Figure 1 (AR faster than P overall).
The participant traces in both panels move in the direction HD → OLD → AI, with progressively slower orienting for stock and AI images; this replicates the large ImageType main effects (HD < OLD < AI). Most notably, the AR panel (Figure 2) sits lower than the P panel (Figure 1) for all image types, as would be expected for a ~30–40% AR benefit (TR ≈ 1/1.41) and for additivity rather than moderation. The approximately parallel slopes between panels reflect the non-significant Slogan × ImageType interaction: image type shifts the baseline (the vertical position of the lines), but AR yields a relatively stable gain across HD, OLD, and AI. Within each condition, the thin participant lines converge near the bold mean with similar ordering (HD fastest; AI slowest), demonstrating that the group pattern is not driven by a small number of participants. Together, the plots constitute a graphical test of H1 (AR < P in TTFF) and H3 (HD < OLD < AI), and help explain why the interaction fell short of significance: the effects are additive rather than multiplicative.
In Figure 2, lines also slope upward (HD < OLD < AI) but with smaller overall means (~545 to ~630 to ~690 ms) than in the P panel, in line with quicker orienting under AR (supporting H1) and with the same ordering of image types (supporting H3). The visual comparability of the slopes between panels A and B supports the absence of Slogan × ImageType moderation in the LMM and quantile-regression tests.
The TTFF slopes in Figure 1 and Figure 2 are approximately parallel across HD → OLD → AI for P and AR, indicating an additive structure: image type alters the baseline latency, and AR provides a relatively stable benefit. This pattern is consistent with (i) an early, slogan-level orienting mechanism operating before image-style variation has much impact, and (ii) limited statistical power for detecting cross-level interactions in within-subject latency data, where random-slope variance absorbs some of the cross-product signal. Small departures from parallelism (e.g., p = 0.076 for the weak AI term) fit this account: directional, but below conventional significance thresholds once participant heterogeneity is controlled.
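For readers wishing to reproduce the quantile-regression robustness check, a minimal sketch follows (assumed variable names; not the authors’ code): resample participants with replacement, refit the τ = 0.50 regression, and take percentile intervals.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("trials.csv")  # hypothetical trial-level data
formula = ("log_TTFF ~ C(SloganType) * C(ImageType)"
           " + C(SloganType) * IdentityMatch"
           " + C(ImagePopulation) + Age_c + C(Gender) + C(Education)")

point = smf.quantreg(formula, df).fit(q=0.5).params  # median regression

rng = np.random.default_rng(2024)
ids = df["ParticipantID"].unique()
boot = []
for _ in range(2000):
    # Cluster bootstrap: resample whole participants, keeping trials intact.
    sample_ids = rng.choice(ids, size=len(ids), replace=True)
    resampled = pd.concat([df[df["ParticipantID"] == i] for i in sample_ids])
    boot.append(smf.quantreg(formula, resampled).fit(q=0.5).params)

ci = pd.concat(boot, axis=1).T.quantile([0.025, 0.975])  # percentile 95% CIs
```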
3.2. H2—Sustained Attention (Dwell on Slogan AOI)
We fit a Gamma GEE with a log link (exchangeable working correlation; robust SEs clustered by participant) to dwell time (ms) on the slogan AOI, with fixed effects SloganType × ImageType and covariates ImagePopulation, age (centred), gender, and education (Table 2).
Relative to AR slogans, P slogans attracted less dwell on the slogan AOI, b = −0.256, SE = 0.007, z = −35.11, p < 0.001, with a ratio of means (RM) of 0.77 [0.76, 0.79]. Compared with hand-drawn (HD) images, OLD images elicited less attention, b = −0.092, SE = 0.006, z = −15.88, p < 0.001, RM = 0.91 [0.90, 0.92], and AI images elicited markedly less attention, b = −0.209, SE = 0.008, z = −27.68, p < 0.001, RM = 0.81 [0.80, 0.82].
There was a small but reliable Slogan × ImageType interaction for AI images, b = 0.025, SE = 0.009, z = 2.82, p = 0.005, RM = 1.03 [1.01, 1.04]: the AR benefit (AR > P) in dwell was roughly 2–3% smaller for AI than for HD images (AR vs. P RM ≈ 0.77 in HD; ≈ 0.79 in AI). The Slogan × ImageType interaction for OLD images was nonsignificant, p = 0.984. ImagePopulation (SA), gender, education, and age were not significant predictors (all p > 0.05). The estimated baseline dwell (AR, HD, W, non-match, Age_c = 0) was exp(6.894) ≈ 986 ms [975, 998].
The results confirm H2 (AR > P in dwell). Consistent with H3, dwell depends on ImageType (HD > OLD > AI), and the AR–P difference is slightly smaller for AI images, indicating modest moderation by image type. Specification: Dwell_ms ~ C(SloganType) × C(ImageType) + C(SloganType) × C(ImagePopulation) + Age_c + C(Gender) + C(Education).
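A minimal statsmodels sketch of this Gamma GEE, again assuming a trial-level data frame with the variables named in the specification (not necessarily the authors’ code):

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

df = pd.read_csv("trials.csv")  # hypothetical trial-level data

gee = sm.GEE.from_formula(
    "Dwell_ms ~ C(SloganType) * C(ImageType)"
    " + C(SloganType) * C(ImagePopulation)"
    " + Age_c + C(Gender) + C(Education)",
    groups="ParticipantID",
    data=df,
    family=sm.families.Gamma(link=sm.families.links.Log()),  # statsmodels >= 0.13 naming
    cov_struct=sm.cov_struct.Exchangeable(),
)
fit = gee.fit()                  # robust (sandwich) SEs clustered by participant
print(np.exp(fit.params))        # log-link coefficients as ratios of means (RM)
```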
Dwell shows a small but consistent reduction in the AR benefit for AI images (P × AI RM ≈ 1.03), while OLD behaves the same as HD. This suggests that image style influences maintenance rather than capture: AI imagery imposes a mild disfluency that trims, but does not eliminate, the AR benefit. The modest magnitude of this moderation (≈2–3%) is consistent with an additive main effect of framing plus a mild maintenance-stage penalty for AI, rather than a full reversal. Combined with the TTFF findings, this accounts for the mixed interaction pattern: AR functions mostly as a framing prior (additive across styles), with image category contributing stage-specific baseline shifts and only modest interaction under prolonged viewing.
All AR vs. P comparisons are below 1, supporting H2 (AR > P in dwell) for all image types and showing only minimal moderation by ImageType (consistent with the small or nonsignificant Slogan × ImageType terms). These plots supplement the GEE table (effects on a log scale) and the EMM line graphs by presenting effect sizes with uncertainty in a single view (Figure 3 and Figure 4).
6. Conclusions, Limitations, and Future Directions
On four eye-tracking measures (TTFF, dwell time, fixations, and revisits), anticipated-regret (AR) messages outperformed positive (P) messages, grabbing viewers’ attention more quickly and holding it longer. Baseline attention depended on picture type (hand-drawn > older stock > AI), but the AR–P difference did not always. Identity congruence between viewers and the people depicted on screen also speeded first orienting (shorter TTFF). These effects survived mixed-effects, GEE, and quantile-regression analyses.
This research is not without limitations. First, although the sample was small (N = 42), it accords with eye-tracking norms, where repeated-measures, trial-level data provide many within-subject observations that offset smaller group-level samples [12,55]. The demographic restriction to UK residents aged 40–65 limits generalizability; cultural differences in visual attention and age-cohort differences in cognitive control and message sensitivity mean that replication with younger cohorts and across cultures is needed to determine the broader applicability of regret-based framing effects. Second, the paradigm assessed only short, free-viewing presentations (5 s) [11,49]. Such exposure durations are appropriate for examining early-stage attentional capture and maintenance but do not speak to later stages of persuasion, such as comprehension, recall, or behavioural uptake. Subsequent research could therefore draw on hybrid paradigms that combine eye-tracking with delayed recall tasks or behavioural choice paradigms, to relate attentional measures more directly to decision outcomes [11]. Lastly, because the comparison between AI, OLD, and HD images was intended to reflect differences in the generation process, these types necessarily varied on other visual attributes such as texture, contrast, visual clutter, and face salience [48]. Attentional differences therefore cannot be entirely attributed to the generation process. Subsequent work should cross manipulations of low-level visual features with framing manipulations to determine whether perceptual fluency contributions can be distinguished from stylistic or technological inferences. We also did not obtain coder audits or participant ratings of “uncanny” attributes (e.g., realism, facial/hand anomalies), so any link to uncanny-valley mechanisms is interpretive and should be tested directly in future work. Similarly, extending the set of AR wordings could determine whether the observed attentional benefits are specific to regret per se or extend to more general categories of negatively framed appeals.
Future research should, first, test generalizability in larger, preregistered international and age-diverse samples. Second, it should connect attention to downstream effects (recall, intentions, actual booking/kit return) with hybrid eye-tracking and follow-up designs. Third, it should dissociate low-level visual features (contrast, clutter, face salience) to tease apart image “generation” from perceptual fluency; we did not standardize or quantify these low-level properties across image types, so residual confounding cannot be ruled out. The within-subjects design, counterbalancing, and robustness checks mitigate, but do not eliminate, this issue; future work should compute objective image metrics (e.g., luminance, RMS contrast, edge density, clutter) and include them as covariates (a sketch of such metrics follows below). Fourth, it should contrast different negative frames and AR wordings to calibrate effect sizes and reduce reactance. Fifth, it should investigate context effects (mobile feed vs. print; static vs. brief video) and CTA placement/duration. Lastly, it should investigate mechanisms with causal mediation and individual-difference moderators (trust, health anxiety, prior screening) to optimize targeting without compromising equity.
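As a pointer for that future work, the following is a hedged sketch of the objective image metrics mentioned above (mean luminance, RMS contrast, edge density); the edge threshold and file handling are illustrative assumptions, not part of this study.

```python
import numpy as np
from PIL import Image
from scipy import ndimage

def image_metrics(path: str) -> dict:
    """Compute simple low-level metrics for one stimulus image."""
    gray = np.asarray(Image.open(path).convert("L"), dtype=float) / 255.0
    grad_x = ndimage.sobel(gray, axis=1)   # horizontal intensity gradient
    grad_y = ndimage.sobel(gray, axis=0)   # vertical intensity gradient
    magnitude = np.hypot(grad_x, grad_y)
    return {
        "luminance": gray.mean(),          # mean luminance in [0, 1]
        "rms_contrast": gray.std(),        # RMS contrast
        "edge_density": (magnitude > 0.3).mean(),  # 0.3 cut-off is illustrative
    }

# Usage: compute metrics per stimulus, e.g., image_metrics("stimuli/ad_HD_01.png"),
# and merge them into the trial-level data frame as covariates.
```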
Overall, across models and media, anticipated-regret framing paired with identity-congruent imagery consistently drew eyes to the screening slogan. Hand-drawn, human-focused images invited close reading, whereas AI-generated images trailed behind. Across varying mixes and styles, the AR advantage was consistent, offering a reliable avenue for public health communications: a headline glanced at quickly and held just long enough to matter.