Article

Role of the Instructor’s Social Cues in Instructional Videos

1 Key Laboratory of Modern Teaching Technology (Ministry of Education), Shaanxi Normal University, Xi’an 710062, China
2 Department of Psychological & Brain Sciences, University of California, Santa Barbara, CA 93106, USA
3 Manchester Institute of Education, The University of Manchester, Manchester M13 9PL, UK
* Author to whom correspondence should be addressed.
Educ. Sci. 2026, 16(1), 82; https://doi.org/10.3390/educsci16010082
Submission received: 1 December 2025 / Revised: 29 December 2025 / Accepted: 3 January 2026 / Published: 7 January 2026

Abstract

Little attention has been paid to whether an instructor’s hand-pointing gestures or use of a mouse-guided arrow can mitigate the attentional loss caused by an instructor’s happy facial expressions or can enhance the social benefits of these expressions in instructional videos. The goal of the present study is to determine whether social cues in an instructional video affect learning processes and outcomes. The participants were 57 female students from a university. We employed a 2 × 2 mixed experimental design. The instructor’s facial expression was a within-subject variable, while the type of pointing cue was a between-subject variable. Students who had the smiling instructor rather than the bored instructor gave higher ratings of the perceived positive emotion of the instructor, felt more positive emotion, and had more motivation to learn. Eye-tracking technology showed that students who learned with the smiling instructor spent more time looking at the content on the slides than those who learned with a bored instructor. Students who learned with the smiling instructor scored higher on a learning outcome post-test than those who learned with the bored instructor. Among female Chinese students, this pattern is consistent with the five steps posited by the positivity principle, which holds that people learn better from instructors who exhibit positive social cues. Pointing with a human hand was not superior to pointing with an arrow, suggesting that in this case hand-pointing was not a strong social cue and did not moderate the effects of facial expression. Given the exclusively female sample, future research should examine whether these effects generalize across genders.

1. Introduction

1.1. Objective and Rationale

Imagine a student learning second language vocabulary from an instructional video featuring a series of slides next to an instructor who talks about them. What is the role of social cues from the instructor, such as exhibiting happy rather than bored facial expressions, or pointing toward the slide with her hand rather than a mouse-guided arrow? The primary goal of the present study was to determine how a happy or bored emotional tone to the onscreen instructor’s facial expression affects learning processes and outcomes. The secondary goal was to determine if the effects of facial expression are moderated by whether the instructor uses her hand or a computer-generated arrow to point to the slide.
The rationale for this study is that although much research has focused on cognitive factors for designing instructional videos (Chikha et al., 2024; Shi et al., 2024; Yoon et al., 2024), less research has investigated the role of affective factors in designing instructional videos (Plass & Hovey, 2022). According to the positivity principle (Horovitz & Mayer, 2021; Lawson et al., 2021a, 2021b), an instructor with a happy facial expression can trigger students’ positive emotions and enhance their learning engagement. Visual cues, such as mouse-guided arrows and hand-pointing gestures (which also can be a social cue), can effectively guide students to pay attention to the relevant information on the slides that the instructor is verbally referring to (van Gog, 2014). Research suggests that happy facial expressions and visual cues from instructors can effectively enhance learning in instructional video settings (Stull et al., 2021). However, there is a dearth of studies examining their interaction effects on student learning. The present study helps fill this gap.

1.2. Research on Social Cues in Instructional Video

The happy facial expressions of an instructor, which convey positive emotions, are critical social cues in video lectures. According to the positivity principle (Horovitz & Mayer, 2021), instructors’ facial expressions can influence students’ emotions, motivation, engagement, and learning outcomes. Empirical studies provide initial evidence that an instructor’s happy facial expressions can trigger students’ positive emotions and enhance their motivation (Lawson et al., 2021b; Marius et al., 2025; Suen & Hung, 2025). However, investigations of the impact of happy facial expressions in video lectures on learning outcomes have yielded mixed results (Lawson et al., 2021a, 2021b; Polat, 2023; C. Zhang et al., 2024). While some studies have found that an instructor’s happy facial expressions, compared to bored ones, facilitate students’ positive emotions, motivation, and learning outcomes (Pi et al., 2023, 2024; R. Zhang et al., 2023; C. Zhang et al., 2024), others have not observed immediate benefits for learning outcomes (Lawson et al., 2021a, 2021b; Y. Wang et al., 2022). These mixed findings suggest that, in addition to the social cue effects emphasized in previous research, other factors—such as increased arousal, novelty effects, or expectancy violations—may also influence students’ attention and motivation (Chan & Saunders, 2023; Margoni et al., 2024; Xie et al., 2023).
Beyond these general mechanisms, one specific factor that might account for the variability in learning outcomes is the visual competition between the instructor and the learning content (e.g., slides). Although students may be more engaged in learning when seeing an instructor’s happy facial expressions, their cognitive effort might be redirected towards visually processing the instructor’s image rather than focusing on the relevant learning content in the slides. Research utilizing eye-tracking technology reveals that an instructor’s happy facial expressions, as opposed to bored ones, capture significantly more of students’ visual attention (Pi et al., 2023, 2024). This splits students’ attention between the instructor and the instructional slides (Polat et al., 2025; Sweller et al., 2019; M. Wang et al., 2022). Therefore, instructional videos featuring an instructor’s facial expressions could potentially cause cognitive overload for students.
A potential way to mitigate the issue of visual competition triggered by the instructor’s happy facial expressions is adding visual cues to guide students’ attention to the learning materials in instructional videos (Chikha et al., 2024; Dargue et al., 2019; Gallagher-Mitchell et al., 2018; Shi et al., 2024). Visual cues refer to non-content information that directs students’ attention to the relevant information on the screen (De Koning et al., 2007). In instructional videos, various tools can serve as visual cues for guiding student attention, such as an instructor’s gaze, hand-pointing gestures, and a mouse-guided arrow (Emhardt et al., 2022; Gallagher-Mitchell et al., 2018; Li et al., 2023; Meier et al., 2023). Previous studies have shown that students perform better when viewing instructional videos with hand-pointing gestures or a mouse-guided arrow compared to those without visual cues (Gallagher-Mitchell et al., 2018; Li et al., 2023; Yoon et al., 2024; C. Zhang et al., 2024).
Therefore, while an instructor’s happy facial expression may trigger students’ positive social responses (e.g., positive emotions and enhanced motivation), without visual cues, students may not be effectively guided to where to focus their cognitive effort. This study aims to establish design guidelines for optimizing the benefits of instructors’ happy facial expressions in instructional videos.

2. The Present Study

Overall, compared to bored facial expressions, an instructor with a happy facial expression in an instructional video can trigger students’ positive emotions and enhance their engagement with the content of the instructional video, leading to better learning outcomes (Horovitz & Mayer, 2021; Lawson et al., 2021b; Pi et al., 2023). However, few studies have examined the chain of affective, motivational, and cognitive events leading to superior learning outcomes caused by positive social cues, such as an instructor’s happy facial expressions. Furthermore, relatively little attention has been paid to whether an instructor’s hand-pointing gestures or mouse-guided arrow can mitigate the attentional loss caused by the instructor’s happy facial expressions or enhance the social benefits of these expressions in instructional videos.
The present study aimed to investigate the conditions under which an instructor’s happy facial expressions are beneficial in instructional videos, especially when combined with hand-pointing gestures or a mouse-guided arrow, as opposed to the instructor’s bored facial expressions. We compared students’ emotions, motivation, eye movements, and learning outcomes across four types of video lectures, created by pairing the instructor’s facial expression type (happy vs. bored) with the type of pointing (hand-pointing gestures or mouse-guided arrow).

Theoretical Framework and Predictions

This project provides a systematic test of the positivity principle, which posits that positive social cues in a multimedia lesson set off a chain of events in the learner as summarized in Figure 1.
In the first link (perceived emotion), students recognize the valence of the instructor’s emotional tone. We predict that students who receive an onscreen instructor exhibiting a happy facial expression will rate the instructor’s emotional state as more positive than students who receive a bored instructor (Hypothesis 1).
In the second link (felt emotion), learners experience emotions corresponding to the instructor’s tone. We predict that students with a happy instructor will report more positive feelings than those with a bored instructor (Hypothesis 2).
In the third link (motivation), learners are more motivated when experiencing a positive instructor. We predict higher motivation ratings with a happy instructor compared to a bored instructor (Hypothesis 3).
In the fourth link (attention), learners focus more on core onscreen material under a positive instructor. We predict that students with a happy instructor will allocate more visual attention to the slides than those with a bored instructor (Hypothesis 4).
In the fifth link (learning outcome), learners achieve better outcomes when experiencing a positive instructor. We predict that students with a happy instructor will score higher on the post-test than those with a bored instructor (Hypothesis 5).
Concerning the role of the type of pointing cue, the embodiment principle (Fiorella, 2022; Mayer, 2021) states that people learn better when the instructor uses their body to support the instructional message, such as by pointing or writing with their hand. It therefore suggests that pointing with a human hand is a stronger social cue than pointing with a computer-generated arrow, leading to the same predictions as listed above for positive rather than negative facial expressions. However, in the present study, the arrow-pointing condition may have been superior in some respects because the arrow was more precisely embedded within the instructional slides. We are also interested in whether one form of pointing is more effective than the other in boosting the effects of a happy facial expression.

3. Method

3.1. Participants and Design

An a priori power analysis was conducted to determine an appropriate sample size. The effect size f, derived from multi-group comparisons using F, is considered small, medium, and large at values of 0.1, 0.25, and 0.4, respectively (Cohen, 1988). G*Power (Version 3.1.9.6) showed that in order to have sufficient power to detect a medium effect size (f = 0.25, α = 0.05, power = 0.95), N = 54 would be an adequate sample size for the planned analyses. The participants were 57 female college students recruited from a university. Each participant received CNY 25 as a token of appreciation for their participation. Their mean age was 21.02 years (SD = 4.30) and they came from diverse majors, including geography, physics, and psychology.
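As a rough cross-check of the reported power analysis (a sketch, not the authors’ G*Power computation), the power of a one-way between-subjects ANOVA at a given Cohen’s f can be computed from the noncentral F distribution. Note that this between-subjects approximation ignores the within-subject correlation that G*Power’s repeated-measures module credits, so it implies a larger required N than the 54 reported above; the four-cell group count (k_groups = 4) is an assumption for illustration.

```python
from scipy import stats

def anova_power(n_total, k_groups=4, f=0.25, alpha=0.05):
    """Power of a one-way between-subjects ANOVA with effect size Cohen's f."""
    df1 = k_groups - 1                    # between-groups degrees of freedom
    df2 = n_total - k_groups              # error degrees of freedom
    nc = f ** 2 * n_total                 # noncentrality parameter
    f_crit = stats.f.ppf(1 - alpha, df1, df2)
    # Power = P(F > f_crit) under the noncentral F distribution
    return 1 - stats.ncf.cdf(f_crit, df1, df2, nc)

# Power grows monotonically with sample size:
for n in (60, 120, 240, 480):
    print(n, round(anova_power(n), 3))
```

Because the mixed design reuses each participant across the within-subject factor, the effective power at N = 57 is substantially higher than this between-subjects sketch suggests.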
We employed a 2 (instructor’s facial expression: happy vs. bored) × 2 (the type of pointing cue: arrow pointing vs. hand pointing) mixed experimental design. The instructor’s facial expression was a within-subject variable, while the type of pointing cue was a between-subject variable. The experiment included four video lectures featuring an instructor demonstrating: (1) a happy facial expression with hand pointing (happy/hand), (2) a happy facial expression with arrow pointing (happy/arrow), (3) a bored facial expression with hand pointing (bored/hand), and (4) a bored facial expression with arrow pointing (bored/arrow). Participants were randomly assigned, with half (n = 29) viewing two instructional videos featuring arrow pointing, and the other half (n = 28) viewing two instructional videos featuring hand-pointing gestures. The instructional content presented to both groups was identical, with one video involving a happy instructor and one video involving a bored instructor. To minimize potential order effects, half of the participants in each group first viewed the video featuring a happy facial expression, followed by the video with a bored expression, while the sequence was reversed for the remaining participants.

3.2. Materials

3.2.1. Video Lessons

As shown in Figure 2, we created four instructional videos focused on teaching English vocabulary to Chinese students, drawing on the research conducted by Pi et al. (2023, 2024): a happy facial expression with hand pointing (happy/hand), a happy facial expression with arrow pointing (happy/arrow), a bored facial expression with hand pointing (bored/hand), and a bored facial expression with arrow pointing (bored/arrow). Each video lasted approximately 3–4 min and introduced five vocabulary words, covering each word’s Chinese meaning, pronunciation, and an illustrative sentence. The instructional videos were identical in terms of presentation size, position, duration of pointing cues, and proportion of text content between the slides and the instructor.
The instructor’s facial expression (happy vs. bored) served as a within-subject variable. To ensure that any observed effects were attributable to the instructor’s social cues rather than differences in material difficulty, the vocabulary words used in the happy and bored conditions were carefully equated. Specifically, five words were selected for each condition, and their frequencies were compared using norms from the Corpus of Contemporary American English (COCA), a widely used and genre-balanced corpus of American English. A Mann–Whitney U test indicated no significant difference in word frequency between the two sets of words (U = 5.00, p = 0.151), suggesting a comparable lexical difficulty across conditions.
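The word-frequency comparison described above can be reproduced with SciPy’s Mann–Whitney U test. The frequency values below are hypothetical placeholders, since the paper does not report the actual COCA counts for the two word sets.

```python
from scipy.stats import mannwhitneyu

# Hypothetical COCA frequencies (per million words) for the two 5-word
# sets; the actual values are not reported in the paper.
happy_words = [12.4, 3.1, 7.8, 1.5, 9.2]
bored_words = [10.6, 2.2, 15.3, 4.9, 6.7]

u, p = mannwhitneyu(happy_words, bored_words, alternative="two-sided")
print(f"U = {u}, p = {p:.3f}")
```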
The type of pointing used by the instructor (i.e., hand or mouse-guided arrow) served as a between-subject variable, and the vocabulary taught remained consistent across the two pointing conditions. Regarding the manipulation of pointing cues, in the hand-pointing condition, the instructor used her right hand to guide students towards the relevant teaching content on the slides. In the arrow-pointing condition, instructional content was highlighted on the slides by a red cursor controlled by a mouse, complementing the instructor’s verbal explanations. During this, the instructor’s hands were stationary, resting naturally in front of her. Moreover, the instructor maintained a consistent facial expression (either happy or bored) throughout the lecture (Pi et al., 2023).

3.2.2. Participant Questionnaire

We used a participant questionnaire to solicit participants’ demographic information, such as age, gender, and major.

3.2.3. Prior Knowledge Test

To assess participants’ familiarity with the English words taught, we produced a prior knowledge test consisting of the words from the lesson and asked participants to indicate their knowledge of these words (Pi et al., 2023).

3.2.4. Perceived Emotion and Felt Emotion Questionnaires

We adapted the Self-Assessment Manikin (SAM; Bradley & Lang, 1994) to measure two distinct constructs: participants’ perceptions of the instructor’s emotion (perceived emotion) and their own felt emotions (felt emotion). Each construct was assessed with a single item. The items were “How pleasant do you think the instructor appeared in the video?” for perceived emotion and “How pleasant are you feeling at the moment?” for felt emotion. Participants rated each item on a nine-point Likert scale, ranging from 1 (“Very Unpleasant”) to 9 (“Very Pleasant”), with higher scores indicating more positive ratings. As each construct was measured by a single item, internal consistency reliability was not calculated.

3.2.5. Motivation Questionnaire

We used the motivation dimension of the learning experience questionnaire developed by Stull et al. (2018), which consists of six items. For example, one item was “I enjoyed learning this way.” Participants rated the items using a seven-point Likert scale ranging from 1 (“strongly disagree”) to 7 (“strongly agree”). The final score was determined by averaging the ratings across the six items, with higher ratings indicating greater motivation. The questionnaire in this study had satisfactory internal consistency (Cronbach’s alpha = 0.88).
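Cronbach’s alpha for a multi-item scale like this one can be computed directly from a participants × items rating matrix. A minimal sketch with hypothetical ratings (the study’s actual response data are not reported):

```python
import numpy as np

def cronbach_alpha(ratings):
    """Cronbach's alpha for a participants x items rating matrix."""
    ratings = np.asarray(ratings, dtype=float)
    k = ratings.shape[1]
    item_vars = ratings.var(axis=0, ddof=1)       # per-item variance
    total_var = ratings.sum(axis=1).var(ddof=1)   # variance of sum scores
    return k / (k - 1) * (1 - item_vars.sum() / total_var)

# Hypothetical ratings: 4 participants x 6 motivation items (1-7 scale)
ratings = [[6, 5, 6, 7, 5, 6],
           [3, 4, 3, 2, 4, 3],
           [5, 5, 6, 5, 6, 5],
           [7, 6, 7, 7, 6, 7]]
print(round(cronbach_alpha(ratings), 2))
```

As a sanity check, a matrix of perfectly correlated items yields alpha = 1.0.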

3.2.6. Learning Outcome Post-Tests

The learning outcome tests were based on learning performance tests developed by Pi et al. (2023), tailored to the content of the two instructional videos. We employed two parallel versions of the learning outcome test corresponding to the two videos. Each test comprised two parts. (1) The first part included five fill-in-the-blank questions requiring participants to recall the meanings of each word in Chinese. Correct answers in this section were awarded 3 points each, for a maximum of 15 points. (2) The second part consisted of five cloze passages and five synonym selection questions. The following is an example of a cloze passage question: “Keep your feet under your desk; do not let them _____ into the aisle. A. protrude; B. siren; C. dwindle; D. fallacy.” The following is an example of a synonym selection question: “The antonym of Elucidate is _____. A. eliminate; B. clarify; C. eulogy; D. oscillate.” Correct answers in this section were also awarded 3 points each, for a maximum of 30 points. The overall learning outcome score was based on the sum of the scores from the first and second parts. The internal consistency of the two learning outcome tests was satisfactory (Cronbach’s alpha = 0.79).
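The scoring rule above (3 points per correct answer; maximum of 15 + 30 = 45) can be expressed as a small helper, shown here as an illustrative sketch rather than the authors’ scoring code:

```python
def learning_outcome_score(part1_correct, part2_correct):
    """Total post-test score: 3 points per correct answer.
    Part 1: 5 fill-in-the-blank items (max 15).
    Part 2: 5 cloze + 5 synonym items (max 30)."""
    if not (0 <= part1_correct <= 5 and 0 <= part2_correct <= 10):
        raise ValueError("correct-answer counts out of range")
    return 3 * part1_correct + 3 * part2_correct

print(learning_outcome_score(5, 10))  # -> 45 (maximum score)
```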

3.2.7. Eye Tracking Apparatus and Measures

We recorded participants’ eye movements using an Eyelink 1000 eye tracker (SR Research Ltd., Ottawa, ON, Canada) with a 1000 Hz sampling rate and a 60 cm viewing distance. We created two dynamic areas of interest (AoIs): (1) the slides and (2) the instructor. We measured participants’ attention towards each AoI through their dwell time (i.e., the total time fixated on an AoI) and their fixation counts. These measures were used to assess participants’ focus on specific areas of the video lectures (Alemdag & Cagiltay, 2018; Huangfu et al., 2025).
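Conceptually, dwell time and fixation count for an AoI are obtained by summing the durations of, and counting, the fixations that fall inside the AoI’s bounds. A simplified sketch assuming static rectangular AoIs and a hypothetical fixation format (the study’s actual AoIs were dynamic):

```python
def aoi_stats(fixations, aoi):
    """Dwell time (ms) and fixation count for one rectangular AoI.
    fixations: iterable of (x, y, duration_ms); aoi: (x0, y0, x1, y1)."""
    x0, y0, x1, y1 = aoi
    durations = [d for (x, y, d) in fixations
                 if x0 <= x <= x1 and y0 <= y <= y1]
    return sum(durations), len(durations)

# Hypothetical data: two fixations on the slide area, one elsewhere
fixations = [(100, 100, 200), (300, 250, 180), (500, 100, 300)]
slides_aoi = (0, 0, 400, 300)            # assumed slide region
print(aoi_stats(fixations, slides_aoi))  # -> (380, 2)
```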

3.2.8. Procedure

Participants first provided demographic information and completed the prior knowledge test. They were then randomly assigned to one of two between-subject conditions: the hand-pointing group or the arrow-pointing group. Within each group, a counterbalancing procedure was applied to the within-subject factor (the instructor’s facial expression). Specifically, half of the participants viewed the happy-expression video first followed by the bored-expression video, whereas the other half viewed the videos in the reverse order. Eye movements were recorded throughout the video viewing phase. After viewing each instructional video, participants completed the perceived emotion questionnaire, felt emotion questionnaire, motivation questionnaire, and learning outcome test at their own pace. The entire experiment lasted approximately 30 min. The study received ethics approval from the ethics committee of the lead author’s institution.

4. Results

4.1. Were Participants Equivalent Across Conditions?

Prior to the main analyses, participants’ prior knowledge of the target vocabulary was examined to ensure equivalence across experimental conditions. A 2 (facial expression: happy vs. bored) × 2 (pointing cue: hand vs. arrow) repeated-measures ANOVA on the prior knowledge test scores revealed no significant main effect of facial expression, F(1, 55) = 0.91, p = 0.35, and ηp2 = 0.02, no significant main effect of pointing cue, F(1, 55) = 1.90, p = 0.17, and ηp2 = 0.03, and no significant interaction, F(1, 55) = 1.58, p = 0.22, and ηp2 = 0.03.
Furthermore, the two between-subject groups were equivalent on age. An independent-samples t-test revealed no significant difference in mean age between the hand-pointing group and the arrow-pointing group (t(55) = 0.96, p = 0.34, and Cohen’s d = 0.25).
Together, these results indicate that participants were comparable across conditions prior to the experimental manipulation.
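As a numerical cross-check (a sketch, not the authors’ analysis code), the reported Cohen’s d can be recovered from the t statistic and the group sizes given in Section 3.1 (n = 29 and n = 28):

```python
from math import sqrt

def cohens_d_from_t(t, n1, n2):
    """Independent-samples Cohen's d recovered from t and group sizes."""
    return t * sqrt(1 / n1 + 1 / n2)

print(round(cohens_d_from_t(0.96, 29, 28), 2))  # -> 0.25, matching the reported d
```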

4.2. Hypothesis 1: Perceived Emotion of the Instructor

The first link in the positivity principle is that learners perceive the positive or negative tone displayed by the onscreen instructor (Hypothesis 1). The first line of Table 1 shows the mean ratings (and standard deviations) for each condition on the perceived emotion questionnaire. In support of Hypothesis 1, a repeated measures analysis of variance (ANOVA) revealed a significant main effect of the instructor’s facial expressions, in which happy instructors were rated as displaying more positive emotion than bored instructors; F(1, 55) = 250.68, p < 0.001, and ηp2 = 0.82. This provides initial evidence that the instructor’s facial expression is a salient social cue for learners.
The main effect of the type of pointing cue on participants’ perceptions of the instructor’s emotional valence was not significant, as F(1, 55) = 0.35, p = 0.56, and ηp2 = 0.006, and there was no significant interaction; F(1, 55) = 0.28, p = 0.60, and ηp2 = 0.005. Thus, in contrast to the embodiment principle, there was no evidence that hand-pointing was a stronger social cue than an arrow.
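The reported effect sizes can be cross-checked from the F ratios and their degrees of freedom, using the identity ηp2 = F·df_effect / (F·df_effect + df_error). A minimal sketch, not the authors’ code:

```python
def partial_eta_squared(F, df_effect, df_error):
    """Recover partial eta squared from an F ratio and its dfs."""
    return (F * df_effect) / (F * df_effect + df_error)

# Cross-checks against the values reported in Section 4.2
print(round(partial_eta_squared(250.68, 1, 55), 2))  # facial expression: 0.82
print(round(partial_eta_squared(0.35, 1, 55), 3))    # pointing cue: 0.006
```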

4.3. Hypothesis 2: Felt Emotion

The second link in the positivity principle is that students tend to feel the positive or negative emotion displayed by the onscreen instructor (Hypothesis 2). The second line of Table 1 shows the mean ratings (and standard deviations) for each condition on the felt emotion questionnaire. In support of Hypothesis 2, a repeated measures ANOVA revealed a significant main effect of the instructor’s facial expressions, in which students who viewed happy instructors reported feeling stronger positive emotion themselves than students who viewed bored instructors; F(1, 55) = 53.34, p < 0.001, and ηp2 = 0.49. This provides additional evidence that the instructor’s facial expression is a social cue that affects the learner’s emotional state.
Neither the main effect of the type of pointing cue on the participants’ emotional valence nor the interaction effect were significant; F(1, 55) = 0.22, p = 0.64, and ηp2 = 0.004 and F(1, 55) = 1.11, p = 0.30, and ηp2 = 0.02. Thus, in contrast to the embodiment principle, there was no evidence that hand pointing was a stronger social cue than an arrow.

4.4. Hypothesis 3: Motivation to Learn

The third link in the positivity principle is that learners feel more motivated to learn when the onscreen instructor displays positive rather than negative emotion (Hypothesis 3). The third line of Table 1 shows the mean ratings (and standard deviations) for each condition on the motivation questionnaire. In support of Hypothesis 3, a repeated measures ANOVA revealed a significant main effect of the instructor’s facial expressions on motivation, in which students who viewed a happy onscreen instructor gave higher motivation ratings than students who viewed a bored onscreen instructor; F(1, 55) = 33.72, p < 0.001, and ηp2 = 0.38. This provides additional evidence that the instructor’s facial expression is a social cue that affects the learner’s motivation to learn.
The main effect of the type of pointing cue on motivation was not significant, as F(1, 55) = 0.08, p = 0.78, and ηp2 = 0.001, and the interaction was not significant; F(1, 55) = 3.34, p = 0.07, and ηp2 = 0.06. In contrast to the embodiment principle, this result suggests that hand-pointing was not a stronger social cue than an arrow.

4.5. Hypothesis 4: Visual Attention to the Lesson

The fourth link in the positivity principle is that learners tend to focus their eyes more on the lesson when the onscreen instructor displays positive rather than negative emotion (Hypothesis 4). The fourth line of Table 1 shows the mean dwell time on the slides (and their standard deviation) for each condition based on eye tracking. In support of Hypothesis 4, a repeated measures ANOVA revealed a significant main effect of the instructor’s facial expressions on learners’ dwell time on the slides, in which learners who viewed happy onscreen instructors spent more time looking at the slides than learners who viewed bored onscreen instructors; F(1, 55) = 24.24, p < 0.001, and ηp2 = 0.31.
Additionally, in contrast to the embodiment principle, there was a main effect of the type of pointing cue, in which dwell time on the slides was greater with arrow pointing than hand pointing, as F(1, 55) = 45.42, p < 0.001, and ηp2 = 0.45, perhaps because the arrow pointing was more precisely embedded in the slides. There was also a significant interaction; F(1, 55) = 7.74, p = 0.007, and ηp2 = 0.12. A simple analysis of the main effects indicated that when viewing videos with hand pointing, there was no significant difference in the time participants spent on the slides between the happy and bored facial expression conditions (MD = 2.53 and p = 0.14), but when viewing videos with arrow pointing, participants spent more time looking at the slides in the happy facial expression conditions compared to the bored facial expression conditions (MD = 9.09 and p < 0.001). This pattern is shown in Figure 3.
The mean dwell time on the instructor (and the standard deviation) for each group is shown in the fifth line of Table 1. As could be expected, the dwell time on the instructor was significantly higher when the instructor displayed a happy facial expression rather than a bored facial expression; F(1, 55) = 4.18, p = 0.046, and ηp2 = 0.07. However, there was no significant main effect of the type of pointing cue, as F(1, 55) = 0.50, p = 0.48, and ηp2 = 0.01, and no significant interaction effect; F(1, 55) = 2.71, p = 0.11, and ηp2 = 0.05. Overall, learners appear to be more drawn to looking at people with happy faces than bored faces.
Similar patterns were observed using the eye-tracking metrics of the number of fixations on the slide (row 6) and on the instructor (row 7). Consistent with Hypothesis 4, a repeated measures ANOVA showed that learners allocated more fixations to the slides when the instructor displayed happy rather than bored facial expressions; F(1, 55) = 13.78, p < 0.001, and ηp2 = 0.20. In line with the previously discussed results for dwell time, learners allocated more fixations to the slides when the lesson involved arrow pointing rather than hand pointing, as F(1, 55) = 12.91, p < 0.001, and ηp2 = 0.19, perhaps because the arrows were more precisely embedded in the slides. The interaction effect was not significant; F(1, 55) = 3.23, p = 0.078, and ηp2 = 0.06. However, given that the effect was marginal, we conducted a simple main effects analysis. When viewing videos featuring hand pointing, there was no significant difference in the fixation count for the slides between the happy and bored facial expression conditions (MD = 13.04 and p = 0.19). However, when viewing videos featuring arrow pointing, participants gazed at the slides more frequently in the happy facial expression conditions compared to the bored facial expression conditions (MD = 37.52 and p < 0.001). This pattern is shown in Figure 4.
Regarding participants’ fixation counts on the instructor, we did not observe a significant main effect of the instructor’s facial expression, as F(1, 55) = 1.26, p = 0.27, and ηp2 = 0.02; in contrast to the embodiment principle, there was also no significant main effect of the type of pointing cue, F(1, 55) = 0.08, p = 0.78, and ηp2 = 0.001, and no significant interaction; F(1, 55) = 1.61, p = 0.21, and ηp2 = 0.03.
Overall, the pattern of results for both the dwell time and the number of fixations is consistent with the positivity principle’s prediction that learners attend more to the lesson when the onscreen instructor displays positive rather than negative facial expressions.

4.6. Hypothesis 5: Learning Outcome

The final link in the positivity principle tested in this study is that learners achieve better learning outcomes, as indicated by post-test performance, when the onscreen instructor displays positive rather than negative emotion (Hypothesis 5). The bottom line of Table 1 shows the mean learning post-test score (and the standard deviation) for each condition. In support of Hypothesis 5, a repeated measures ANOVA revealed a significant main effect of the instructor’s facial expressions on the post-test score, in which learners who viewed happy onscreen instructors achieved higher scores than learners who viewed bored onscreen instructors; F(1, 55) = 45.88, p < 0.001, and ηp2 = 0.46. This completes the chain shown in Figure 1, with all five links showing the predicted effects of positive rather than negative instructors.
In contrast to the embodiment principle, a two-way mixed ANOVA showed that the main effect of pointing type was not significant, as F(1, 55) = 0.26, p = 0.61, and ηp2 = 0.005; the absence of a hand-pointing advantage could be attributed to the greater precision of the arrow-pointing condition. However, the interaction effect was significant; F(1, 55) = 4.11, p = 0.048, and ηp2 = 0.07. Simple effects analysis showed that learners demonstrated better learning outcomes in the happy facial expression conditions as opposed to the bored expression conditions. This difference in learning outcome was notably greater in the arrow-pointing conditions (MD = 7.88 and p < 0.001) than in the hand-pointing conditions (MD = 4.25 and p = 0.002). This pattern is shown in Figure 5.

5. Discussion

5.1. Empirical Contributions

There are five primary empirical contributions of this study concerning the role of the instructor’s facial expressions in learning from instructional videos. First, learners rate the emotional state of onscreen instructors as more positive when the onscreen instructor displays happy rather than bored facial expressions. This suggests that learners are able to recognize whether instructors are displaying positive or negative facial expressions. Second, learners rate their own felt emotion as more positive when the onscreen instructor displays happy rather than bored facial expressions. Third, learners give higher ratings of motivation when the onscreen instructor displays happy rather than bored facial expressions. Fourth, learners spend more time looking at the slides and allocate more eye fixations on the slides, based on eye-tracking, when the onscreen instructor displays happy rather than bored facial expressions. Fifth, learners score higher on learning outcome post-tests when the onscreen instructor displays happy rather than bored facial expressions. We also note that pointing with a human hand was not superior to pointing with an arrow, suggesting that, in this case, hand pointing was not a strong social cue.

5.2. Theoretical Contributions

Our results support the five links of the positivity principle as shown in Figure 1. Positive instructors appear to initiate a sequence of affective, motivational, and cognitive processes that enhance learning. Specifically, students recognized the instructor’s positive social cues, experienced more positive emotions and motivation, allocated greater visual attention to the lesson content, and achieved higher post-test scores. These findings provide evidence for the final link in the positivity principle, which has not always been confirmed in prior research (Horovitz & Mayer, 2021; Pi et al., 2023, 2024; R. Zhang et al., 2023; C. Zhang et al., 2024).
Additionally, our finding that a mouse-guided arrow outperformed hand pointing in guiding learners’ attention is theoretically informative, as it challenges a strong interpretation of the embodiment principle, which posits that human-like gestures function as more effective social cues for learning (Fiorella, 2022; Mayer, 2021). The present results suggest that, in certain instructional contexts, functional precision may outweigh social affordance. Specifically, the arrow cue was directly embedded within the instructional slides, enabling it to indicate target words or elements with high spatial precision. In contrast, the instructor appeared in a separate video panel positioned adjacent to the slides and could only gesture toward the general region of the content, resulting in lower pointing precision. As a consequence, the hand-pointing gesture may have been less effective for directing attention to specific static elements on the slides. These findings imply that when the instructional goal is to guide learners’ visual attention to precise onscreen information, spatially precise, non-human cues may be more effective than embodied social cues. Thus, the effectiveness of embodied cues may depend on the degree to which they can be tightly integrated with instructional content, rather than on their social nature alone.

5.3. Practical Implications

Based on the evidence supporting the positivity principle in this study, instructors creating video lectures should give deliberate attention to their facial expressions during instruction. Specifically, instructors are advised to maintain a warm, engaged, and consistently positive facial expression (e.g., a natural smile) throughout the lecture, while avoiding expressions that may appear disengaged or bored. Such emotional cues can function as salient social signals that foster learners' positive affect and motivation. In addition, when the instructional goal involves directing learners' attention to specific onscreen elements, instructional designers should prioritize the spatial precision of visual cues. Our findings suggest that clearly embedded cues, such as onscreen arrows that directly indicate target information, may be more effective than less precise gestural cues. Designers of video lectures should therefore consider integrating attention-guiding cues directly into instructional materials to support efficient visual processing. Taken together, these findings suggest that effective instructional video design benefits from coordinating instructors' emotional expressions with precisely embedded visual cues. Such deliberate integration can help create video lectures that better support learner engagement and learning outcomes.

6. Limitations and Future Directions

We recognize two limitations in the current study. The first limitation concerns the perception of an instructor’s pointing gestures. We proposed that, although pointing gestures play a role as social cues, they are not as effective as arrow pointing in guiding students’ attention to learning materials in instructional videos. Previous studies have primarily focused on the role of pointing gestures in guiding attention in instructional videos but have overlooked their role as social cues (Li et al., 2023; Meier et al., 2023; Yoon et al., 2024; C. Zhang et al., 2024). Therefore, further research should investigate students’ perceptions of an instructor who uses happy facial expressions and hand-pointing gestures to verify our assumption.
The second limitation of our study is that all participants were female. Prior research suggests that there are gender differences in the perception and processing of facial expressions (Donges et al., 2012; Montagne et al., 2005). For instance, evidence indicates that men are less accurate and less sensitive than women in labelling facial expressions, and that men perform worse than women overall on tasks measuring the processing of emotional faces (Montagne et al., 2005). Therefore, future studies should test whether students' gender moderates the positivity principle.

7. Conclusions

This study examined the effects of an onscreen instructor's facial expression in short instructional videos for English vocabulary learning among female Chinese college students. In this context, when the instructor displayed a happy rather than a bored facial expression, female learners perceived the instructor as more positive, reported stronger positive emotions and higher motivation, allocated more visual attention to the lesson content, and achieved higher learning post-test scores. Theories of multimedia learning should incorporate the role of affective processing as well as cognitive processing, and instructional designers should consider the instructor's social cues alongside the cognitive design of a lesson.

Author Contributions

Conceptualization, Z.P., R.E.M. and X.L.; Methodology, Z.P. and X.H.; Formal analysis, Z.P. and X.H.; Investigation, X.H.; Writing—original draft, Z.P., X.H., R.E.M. and X.L.; Writing—review & editing, X.Z. and X.L.; Project administration, Z.P. and X.L.; Funding acquisition, Z.P. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the National Natural Science Foundation of China under Grant [62377035] and the National Social Science Foundation of China under Grant [24XSH023].

Institutional Review Board Statement

The protocol was approved by the Ethical Committee of Shaanxi Normal University on 18 June 2023 (Ethical Approval Reference Number: SNNU2023060189).

Informed Consent Statement

Informed consent was obtained from all subjects involved in the study.

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Alemdag, E., & Cagiltay, K. (2018). A systematic review of eye tracking research on multimedia learning. Computers & Education, 125, 413–428.
  2. Bradley, M. M., & Lang, P. J. (1994). Measuring emotion: The self-assessment manikin and the semantic differential. Journal of Behavior Therapy and Experimental Psychiatry, 25(1), 49–59.
  3. Chan, H. M., & Saunders, J. A. (2023). The influence of valence and motivation dimensions of affective states on attentional breadth and the attentional blink. Journal of Experimental Psychology: Human Perception and Performance, 49(1), 34–50.
  4. Chikha, H. B., Mguidich, H., Zoudji, B., & Khacharem, A. (2024). Uncovering the roles of complexity and expertise in memorizing tactical movements from videos with coach’s pointing gestures and guided gaze. International Journal of Sports Science & Coaching, 19(5), 1883–1896.
  5. Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). L. Erlbaum Associates.
  6. Dargue, N., Sweller, N., & Jones, M. P. (2019). When our hands help us understand: A meta-analysis into the effects of gesture on comprehension. Psychological Bulletin, 145(8), 765–784.
  7. De Koning, B. B., Tabbers, H. K., Rikers, R. M. J. P., & Paas, F. (2007). Attention cueing as a means to enhance learning from an animation. Applied Cognitive Psychology, 21(6), 731–746.
  8. Donges, U.-S., Kersting, A., & Suslow, T. (2012). Women’s greater ability to perceive happy facial emotion automatically: Gender differences in affective priming. PLoS ONE, 7(7), e41745.
  9. Emhardt, S. N., Jarodzka, H., Brand-Gruwel, S., Drumm, C., Niehorster, D. C., & van Gog, T. (2022). What is my teacher talking about? Effects of displaying the teacher’s gaze and mouse cursor cues in video lectures on students’ learning. Journal of Cognitive Psychology, 34(7), 846–864.
  10. Fiorella, L. (2022). The embodiment principle in multimedia learning. In R. E. Mayer, & L. Fiorella (Eds.), The Cambridge handbook of multimedia learning (3rd ed., pp. 286–295). Cambridge University Press.
  11. Gallagher-Mitchell, T., Simms, V., & Litchfield, D. (2018). Learning from where “eye” remotely look or point: Impact on number line estimation error in adults. Quarterly Journal of Experimental Psychology, 71(7), 1526–1534.
  12. Horovitz, T., & Mayer, R. E. (2021). Learning with human and virtual instructors who display happy or bored emotions in video lectures. Computers in Human Behavior, 119, 106724.
  13. Huangfu, Q., He, Q., Luo, S., Huang, W., & Yang, Y. (2025). Does teacher enthusiasm facilitate students’ chemistry learning in video lectures regardless of students’ prior chemistry knowledge levels? Journal of Computer Assisted Learning, 41(1), e13116.
  14. Lawson, A. P., Mayer, R. E., Adamo-Villani, N., Benes, B., Lei, X., & Cheng, J. (2021a). Do learners recognize and relate to the emotions displayed by virtual instructors? International Journal of Artificial Intelligence in Education, 31(1), 134–153.
  15. Lawson, A. P., Mayer, R. E., Adamo-Villani, N., Benes, B., Lei, X., & Cheng, J. (2021b). The positivity principle: Do positive instructors improve learning from video lectures? Educational Technology Research & Development, 69(6), 3101–3129.
  16. Li, W., Wang, F., & Mayer, R. E. (2023). How to guide learners’ processing of multimedia lessons with pedagogical agents. Learning and Instruction, 84, 101729.
  17. Margoni, F., Surian, L., & Baillargeon, R. (2024). The violation-of-expectation paradigm: A conceptual overview. Psychological Review, 131(3), 716–748.
  18. Marius, M., Iasmina, I., & Diana, M. C. (2025). The role of teachers’ emotional facial expressions on student perceptions and engagement for primary school students-an experimental investigation. Frontiers in Psychology, 16, 1613073.
  19. Mayer, R. E. (2021). Multimedia learning (3rd ed.). Cambridge University Press.
  20. Meier, J., de Jong, B., van Montfort, D. P., Verdonschot, A., van Wermeskerken, M., & van Gog, T. (2023). Do social cues in instructional videos affect attention allocation, perceived cognitive load, and learning outcomes under different visual complexity conditions? Journal of Computer Assisted Learning, 39, 1339–1353.
  21. Montagne, B., Kessels, R. P. C., Frigerio, E., de Haan, E. H. F., & Perrett, D. I. (2005). Sex differences in the perception of affective facial expressions: Do men really lack emotional sensitivity? Cognitive Processing, 6, 136–141.
  22. Pi, Z., Liu, R., Ling, H., Zhang, X., Wang, S., & Li, X. (2024). The emotional design of an instructor: Body gestures do not boost the effects of facial expressions in video lectures. Interactive Learning Environments, 32(3), 952–971.
  23. Pi, Z., Liu, W., Ling, H., Zhang, X., & Li, X. (2023). Does an instructor’s facial expressions override their body gestures in video lectures? Computers & Education, 193, 104679.
  24. Plass, J., & Hovey, C. (2022). The emotional design principle in multimedia learning. In R. E. Mayer, & L. Fiorella (Eds.), The Cambridge handbook of multimedia learning (3rd ed., pp. 324–336). Cambridge University Press.
  25. Polat, H. (2023). Instructors’ presence in instructional videos: A systematic review. Education and Information Technologies, 28, 8537–8569.
  26. Polat, H., Kayaduman, H., Taş, N., Battal, A., Kaban, A., & Bayram, E. (2025). Learning declarative and procedural knowledge through instructor-present videos: Learning effectiveness, mental effort, and visual attention allocation. Educational Technology Research and Development, 73, 3479–3513.
  27. Shi, Y., Chen, Z., Wang, M., Chen, S., & Sun, J. (2024). Instructor’s low guided gaze duration improves learning performance for students with low prior knowledge in video lectures. Journal of Computer Assisted Learning, 40(3), 1309–1320.
  28. Stull, A. T., Fiorella, L., Gainer, M. J., & Mayer, R. E. (2018). Using transparent whiteboards to boost learning from online STEM lectures. Computers & Education, 120, 146–159.
  29. Stull, A. T., Fiorella, L., & Mayer, R. E. (2021). The case for embodied instruction: The instructor as a source of attentional and social cues in video lectures. Journal of Educational Psychology, 113(7), 1441–1453.
  30. Suen, H. Y., & Hung, K. E. (2025). Enhancing learner affective engagement: The impact of instructor emotional expressions and vocal charisma in asynchronous video-based online learning. Education and Information Technologies, 30, 4033–4060.
  31. Sweller, J., Van Merriënboer, J. J., & Paas, F. (2019). Cognitive architecture and instructional design: 20 years later. Educational Psychology Review, 31, 261–292.
  32. van Gog, T. (2014). The signaling (or cueing) principle in multimedia learning. In R. E. Mayer (Ed.), The Cambridge handbook of multimedia learning (2nd revised ed., pp. 263–278). Cambridge University Press.
  33. Wang, M., Chen, Z., Shi, Y., Wang, Z., & Xiang, C. (2022). Instructors’ expressive nonverbal behavior hinders learning when learners’ prior knowledge is low. Frontiers in Psychology, 13, 810451.
  34. Wang, Y., Feng, X., Guo, J., Gong, S., Wu, Y., & Wang, J. (2022). Benefits of affective pedagogical agents in multimedia instruction. Frontiers in Psychology, 12, 797236.
  35. Xie, L., Liu, C., Li, Y., & Zhu, T. (2023). How to inspire users in virtual travel communities: The effect of activity novelty on users’ willingness to co-create. Journal of Retailing and Consumer Services, 75, 103448.
  36. Yoon, H. Y., Kang, S., & Kim, S. (2024). A non-verbal teaching behaviour analysis for improving pointing out gestures: The case of asynchronous video lecture analysis using deep learning. Journal of Computer Assisted Learning, 40(3), 1006–1018.
  37. Zhang, C., Wang, Z., Fang, Z., & Xiao, X. (2024). Guiding student learning in video lectures: Effects of instructors’ emotional expressions and visual cues. Computers & Education, 218, 105062.
  38. Zhang, R., Cheng, G., & Wu, L. (2023). Influence of instructor’s facial expressions in video lectures on motor learning in children with autism spectrum disorder. Education and Information Technologies, 28, 11867–11880.
Figure 1. Five links in the positivity principle.
Figure 2. Screenshots of each instructional video (in the arrow-pointing conditions, the red cursors generated by a mouse are emphasized here with red circles for illustration; this highlighting was not present in the videos themselves).
Figure 3. The graphs illustrate the impact of the instructor’s facial expression and the type of visual cue (main and interaction effects) on participants’ dwell time on slides. The vertical error bar is one standard error around the mean. Significance levels for p-values are as follows: *** p < 0.001.
Figure 4. The graphs illustrate the impact of the instructor’s facial expression and the type of visual cue (main and interaction effects) on participants’ fixation count on the slides. The vertical error bar is one standard error around the mean. Significance levels for p-values are as follows: *** p < 0.001.
Figure 5. This graph illustrates the impact of the instructor’s facial expression and the type of visual cue (main and interaction effects) on learning outcome. The vertical error bar is one standard error around the mean. Significance levels for p-values are as follows: ** p < 0.01, *** p < 0.001.
Table 1. The means and standard deviations of each dependent variable measure for the four conditions.
Dependent Variable Measure | Happy/Hand (M ± SD) | Happy/Arrow (M ± SD) | Bored/Hand (M ± SD) | Bored/Arrow (M ± SD)
Perceived emotion rating | 6.79 ± 1.32 | 6.79 ± 1.42 | 2.21 ± 1.10 | 2.52 ± 1.84
Felt emotion rating | 5.61 ± 1.17 | 5.90 ± 1.05 | 4.68 ± 1.31 | 4.66 ± 1.29
Motivation rating | 3.91 ± 1.12 | 4.20 ± 1.24 | 3.46 ± 1.28 | 3.34 ± 1.01
Dwell time (s): Slides | 106.48 ± 13.53 | 127.99 ± 11.76 | 103.95 ± 9.69 | 118.90 ± 9.05
Dwell time (s): Instructor | 5.56 ± 5.81 | 8.05 ± 9.78 | 5.24 ± 4.62 | 5.08 ± 6.31
Fixation counts: Slides | 386.11 ± 58.04 | 448.38 ± 68.78 | 373.07 ± 48.39 | 410.86 ± 56.63
Fixation counts: Instructor | 14.75 ± 13.72 | 18.21 ± 18.48 | 15.04 ± 14.12 | 13.52 ± 12.77
Learning outcome score | 29.09 ± 6.76 | 30.00 ± 7.63 | 24.84 ± 8.42 | 22.12 ± 7.16
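As a quick consistency check (not part of the original analysis), the simple-effect mean differences reported in the Results for the learning outcome can be reproduced from the condition means in Table 1:

```python
# Learning outcome means (M) for the four conditions, from Table 1
means = {
    "happy_hand": 29.09, "happy_arrow": 30.00,
    "bored_hand": 24.84, "bored_arrow": 22.12,
}

# Happy-minus-bored difference within each pointing condition
md_arrow = round(means["happy_arrow"] - means["bored_arrow"], 2)  # 7.88
md_hand = round(means["happy_hand"] - means["bored_hand"], 2)     # 4.25
print(md_arrow, md_hand)  # 7.88 4.25
```

Both values match the mean differences (MD = 7.88 and MD = 4.25) reported for the simple effects analysis.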

Share and Cite

MDPI and ACS Style

Pi, Z.; Huang, X.; Mayer, R.E.; Zhao, X.; Li, X. Role of the Instructor’s Social Cues in Instructional Videos. Educ. Sci. 2026, 16, 82. https://doi.org/10.3390/educsci16010082
