1. Introduction
As reported by Csikszentmihalyi in “Beyond Boredom and Anxiety” [1], activities can be classified as autotelic (those that inherently encourage flow or optimal experience) and non-autotelic (those less likely to foster such experiences). However, an autotelic activity does not always guarantee an autotelic experience, and vice versa. The outcome plausibly depends to a large extent on the individual’s personality: those with a more autotelic personality are more likely to experience flow, even during less engaging activities. Context also plays a crucial role in facilitating autotelic experiences, and this role has yet to be fully explored. The context of flow experiences is therefore the focus of this paper, specifically during gaming experiences in educational settings. Even when activities are performed individually but in a shared physical space, the collective atmosphere can enhance the likelihood of flow, even for individuals less inclined toward autotelic experiences. To examine this in educational settings, we focus on how the characteristics of gaming in class can reveal experiences and behaviors that are not apparent when considering individual students. The flow experience during gaming makes it clearer that a class is not simply an aggregation of students but a social and physical space in which collective behaviors, shared emotions, and even rituals can arise, alongside individual behaviors that are otherwise hard to observe.
More specifically, the state of flow is a condition in which “people are so involved in an activity that nothing else seems to matter; furthermore, the experience itself is so enjoyable that people will continue to do it even at great cost, for the sheer sake of doing it” [2]. Various studies in the gaming field have identified key criteria for defining the state of flow, such as enjoyment, a distorted perception of time, and a balance between the level of challenge of the video game and the player’s skills [3]. Another fundamental characteristic is that the experience is autotelic, meaning “there is an intrinsic motivation towards that specific activity […] and performing the activity is the goal in itself” [3] or, in other words, “an activity that is enjoyable and intrinsically rewarding” [4]. Therefore, the state of flow involves complete engagement in a specific activity, and the concept of intrinsic motivation is crucial, particularly in gaming for educational purposes, which inherently requires a higher level of cognitive and physical investment [4]. Moreover, the state of flow in video games “is one of the main sources of attraction, and this is reflected in prolonged gaming times” [4]. Key factors also include attention, as the activity must fully absorb the individual, leaving no room for external distractions. Additionally, control over the game, clear objectives, and immediate feedback are essential to sustaining the flow state [5]. From a cross-cultural perspective, flow is considered a universally shared experience: “Optimal experiences are common: 85% of individuals report experiencing flow in daily life; of these, 54% chose structured sports activities, 52% physical activities, while 47% chose hobbies and games” [6].
However, flow is not only an individual experience; it can also be reached together with others, in what is termed “group flow”. This term was developed by Sawyer [7], a psychologist and educator, who described it as a state particularly conducive to fostering creative collaboration [8]. Sawyer explained this phenomenon as “a collective state of mind (…) a peak experience, a group performing at its maximum potential.” [7]. The concept of “group flow” has evolved, and various subcategories have been proposed. “Individual flow experienced in the presence of others is termed social flow, which can be further subdivided into co-active and interactive flow, the latter divided into private and shared interactive flow” [9]. In the case of co-active flow, individuals experience flow in the presence of others but without interaction. The social context serves mainly as a background factor, although it can still influence the flow experience through subtle cues, either facilitating or hindering it. In addition, the emotional environment can play a role, as a positive and welcoming atmosphere enhances the flow experience. Studies have shown that the mere presence of others can influence the emotions experienced at that time, which can lead to a more enjoyable flow experience [9].
We chose to focus our studies on co-active flow, which we also term “environmental flow” (a flow state that concerns the environment), because it closely resembles a very common situation in the educational context, where schoolmates share the same physical space, the classroom, and simultaneously engage in the same individual tasks. In this context, we are interested in uncovering the possible link between the flow state and shared emotions, as studies have shown that “optimal experiences are particularly intense and enjoyable when they occur collectively. In these situations, participants transcend their egos and become part of a complex system that generates collective emotions” [6]. When people engage in collective activities, they become more closely connected to their co-participants, fostering synergy and enhancing their sense of belonging and group cohesion. This positively affects both individuals and the group [6]. As a result, “interactive optimal social experiences seem to have a significant impact on collective well-being and group efficacy” [6]. Kessler and Hollbach also found that “group emotions emerge in collective situations where shared identities are derived from being part of a group.” [10]. Consistent with the findings of Zumeta and colleagues [6], participating in collective activities fosters emotional synchrony, which enhances perceived similarity, emotional sharing, and, ultimately, performance improvement; these effects are particularly pronounced in group activities.
In the first study, conducted in Sovere, the main objective is to investigate whether playing a single game in class could lead to differences not only at the level of individual players but also at the level of classes. In the second study, the main goal is to investigate whether playing two games of different genres (one more activating in terms of physical performance, the other more narrative and emotionally engaging) influences the behaviors of individual students and of the class as a whole.
3. The First Study: Students in Sovere
The primary objective of this study is to identify whether playing in each class and reaching the flow state favor the preservation of homogeneity between classes in terms of certain variables measured at the beginning of the study. The initial variables measured are related to physiology, performance in a specific subject, and flow experience with video games. During the study, other variables related to the behaviors connected to the initial variables are analyzed to understand whether each class implements the observed behaviors differently, establishing its own uniqueness or not. More specifically, the hypotheses of the study are as follows.
Hypothesis 1.
Class-level homogeneity in mathematics achievement (i.e., the average math grades of the class) is associated with class-level homogeneity in task-solving performance (i.e., the time taken to solve exercises).
Hypothesis 2.
Class-level homogeneity in gaming experience (i.e., students’ familiarity with the game and weekly hours of gameplay) is associated with class-level homogeneity in flow experience (as measured by the Flow State Scale).
Hypothesis 3.
Class-level homogeneity in physiological characteristics (i.e., students’ age, gender, height, and weight) is associated with class-level homogeneity in metabolic activity during different tasks, as measured by the Metabolic Equivalent of Task (MET) derived from CO2 emissions within the classroom.
3.1. Materials and Methods
The study was conducted at the “Daniele Spada” comprehensive institute in Sovere, a town in the province of Bergamo. The structure of the study involved five main phases: the completion of a questionnaire on the demographic data and gaming abilities of the participants, an arithmetic exercise (10 min), a 20 min gaming session using Sonic Dash, the completion of the Flow State Scale and the School Well-Being Questionnaire, and finally, a second arithmetic exercise (10 min).
The experimental sample comprised 86 students aged 11–12 years (second year of middle school): 45 male, 38 female, and three who did not specify their gender. The participants belonged to four second-year middle school sections: section 2A (19 students: 9 male, 9 female, and one with missing data); section 2B (22 students: 11 male, 10 female, and one “prefer not to answer”); section 2C (23 students: 12 male, 10 female, and one “prefer not to answer”); and section 2D (22 students: 13 male and 9 female). For the purposes of the analysis, both “missing data” and “prefer not to answer” were treated as missing data.
As concerns the initial variables, for physiology we measured gender, weight, and height, as these values affect CO2 emission [11]. Regarding school performance, we measured mathematics performance, since participants were asked to solve arithmetic problems. Finally, for gaming experience, we measured gamer status and familiarity with the game, as these could affect whether the flow state is reached [5].
We used three instruments: a timer embedded in the online questionnaire to track exercise completion time, the Flow State Scale, and a carbon dioxide concentration sensor. The Dioxcare sensor allowed us to record variations in CO2 concentration levels within individual classrooms throughout the different experimental phases.
The Flow State Scale [12] was used to assess the level of flow experienced during the gaming session. The scale captures not only an overall value of the flow state, but also the nine dimensions identified by Csikszentmihalyi [2] (e.g., the transformation of the sense of time and the autotelic experience). Each dimension is measured by 4 items; e.g., for the challenge–skill balance dimension, an item is “I was challenged, but I believed my skills would allow me to meet the challenge”. The scale was originally developed to measure the flow experience of sports professionals during optimal performance.
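As a purely illustrative sketch, the scoring structure described above (nine dimensions, four items each, plus an overall value) can be expressed in a few lines of Python. The function name `score_fss` is hypothetical, and so is the assumption that responses arrive grouped in consecutive blocks of four items per dimension; the published questionnaire defines its own item order, which would have to be mapped accordingly. Taking the overall value as the mean of the dimension scores is likewise an assumption.

```python
from statistics import mean

N_DIMENSIONS = 9          # flow dimensions, as stated in the text
ITEMS_PER_DIMENSION = 4   # items per dimension, as stated in the text


def score_fss(responses):
    """Return (dimension_scores, overall_score) for one respondent.

    ASSUMPTION: `responses` is ordered so that each consecutive block of
    four values belongs to the same dimension (hypothetical layout).
    """
    expected = N_DIMENSIONS * ITEMS_PER_DIMENSION
    if len(responses) != expected:
        raise ValueError(f"expected {expected} item responses, got {len(responses)}")
    dimension_scores = [
        mean(responses[start:start + ITEMS_PER_DIMENSION])
        for start in range(0, expected, ITEMS_PER_DIMENSION)
    ]
    # ASSUMPTION: overall flow is taken as the mean of the nine dimension scores.
    return dimension_scores, mean(dimension_scores)
```

Because every dimension has the same number of items, the mean of the dimension scores equals the mean of all 36 item responses.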
To analyze the potential effect of flow on the environment, we examined overall CO2 levels and the Metabolic Equivalent of Task (MET from now on) per student. MET represents the energy consumption ratio relative to body mass during specific physical activities, standardized against a reference value of 3.5 mL of oxygen per kilogram per minute. This choice was motivated by the need for reliable statistical comparisons: rather than relying on single CO2 concentration values, we considered each student’s contribution to overall CO2 levels through their metabolic rate, which correlates with individual CO2 emission rates [11]. The MET for each student was calculated from Equation (4) proposed in [13], where RQ (the Respiratory Quotient), i.e., the molar ratio of CO2 exhaled to O2 inhaled, is dimensionless and can reasonably be approximated as 0.85 [11], H is height in meters, and W is weight in kilograms. Instead of using the calculated CO2 generation rate, we derived this value from the measured CO2 data as the increment of CO2 over time for the volume (as per Formula (6) in [13]), divided by n (the number of students) to obtain the average value per person. We then converted the metabolic rate M to MET, with x denoting the MET corresponding to M. This yielded four MET values, one for each phase of the study: total MET (METtotal), MET pre-game (METpre), MET during the game (METgame), and MET post-game (METpost).
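For orientation, the chain of calculations described here (measured CO2 increment, per-person CO2 rate, metabolic rate M, MET) can be sketched as follows. This is a minimal sketch under stated assumptions, not the exact formulas of [13]: it assumes the DuBois body surface area, a widely cited relation between oxygen consumption and metabolic rate from the thermal comfort literature, a value of 58.2 W/m² for 1 MET, and no air exchange with the outside during the measurement window. All function names and the example room data are hypothetical.

```python
RQ = 0.85             # respiratory quotient, approximated as in the text
MET_W_PER_M2 = 58.2   # ASSUMPTION: metabolic rate corresponding to 1 MET


def body_surface_area(height_m, weight_kg):
    """DuBois body surface area in m^2 (ASSUMED formulation)."""
    return 0.202 * height_m ** 0.725 * weight_kg ** 0.425


def co2_rate_per_person(delta_ppm, seconds, room_volume_m3, n_students):
    """Average CO2 generation per student in mL/s from a concentration rise.

    ASSUMPTION: ventilation and infiltration are ignored.
    """
    co2_m3 = delta_ppm * 1e-6 * room_volume_m3    # CO2 volume added to the room
    return co2_m3 * 1e6 / seconds / n_students    # m^3 -> mL, per second, per student


def met_from_co2(co2_ml_s, height_m, weight_kg, rq=RQ):
    """Convert a per-person CO2 rate (mL/s) into MET."""
    o2_ml_s = co2_ml_s / rq                       # RQ = CO2 exhaled / O2 inhaled
    area = body_surface_area(height_m, weight_kg)
    m = 21.0 * (0.23 * rq + 0.77) * o2_ml_s / area   # metabolic rate M in W/m^2
    return m / MET_W_PER_M2                       # x = MET corresponding to M


def co2_from_met(met, height_m, weight_kg, rq=RQ):
    """Inverse of met_from_co2, useful as a consistency check."""
    area = body_surface_area(height_m, weight_kg)
    o2_ml_s = met * MET_W_PER_M2 * area / (21.0 * (0.23 * rq + 0.77))
    return o2_ml_s * rq


if __name__ == "__main__":
    # Hypothetical classroom: 200 ppm rise in 10 min, 150 m^3, 20 students.
    rate = co2_rate_per_person(200, 600, 150, 20)
    print(met_from_co2(rate, height_m=1.50, weight_kg=45.0))
```

Dividing the room-level increment by n yields a class average, so students in the same class share the same CO2 rate and differ only through their height and weight.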
For the study, we implemented a 20 min gaming session involving all classmates simultaneously in the same room, playing the commercial game Sonic Dash. Sonic Dash is a free platformer and action game available on smartphones and tablets, easily downloadable from app stores. Players navigate various levels composed of platforms, avoiding obstacles and enemies while collecting rings that can be used for power-ups or character enhancements. In mission mode, players complete various objectives to earn rewards and progress in the game, with additional modes allowing for leaderboard rankings based on individual scores. The game’s accessible mechanics, coupled with continuous challenges, make it particularly conducive to facilitating the flow state among participants.
To obtain consent for data collection in the school, we produced two main documents: one for informed consent from parents and another for student assent. The parental consent form was completed before the research began, and the informed assent document for the students was presented and completed (along with a verbal explanation) at the beginning of the study.
Statistical analyses were performed using R (version 4.x) to investigate the effects of gameplay within classroom groups on performance, flow, and collective engagement. Given that all participants played the same game (Sonic Dash), the primary focus was on identifying potential differences between class groups. Descriptive statistics (means and standard deviations) were calculated for performance measures (exercise completion times), as well as for Flow State Scale and Metabolic Equivalent of Task measures. Assumptions of normality and homogeneity of variances were tested through the Shapiro–Wilk and Levene tests, respectively. When these assumptions were violated, non-parametric analyses were applied (e.g., Wilcoxon signed-rank, Kruskal–Wallis, and Dunn post hoc tests with Bonferroni correction). All pairwise post hoc comparisons were corrected for multiple testing using the Bonferroni adjustment, in order to control the familywise error rate (FWER) across analyses. This approach enabled the identification of within-class and between-class variations in both behavioral and experiential measures, revealing how co-active flow contexts may differently shape emotional and cognitive engagement across groups.
3.5. H3. Class-Level Homogeneity in Physiological Characteristics Is Associated with Class-Level Homogeneity in Metabolic Activity During Different Tasks
Across the entire sample, the mean values (± SD) were 2.46 ± 0.81 for METtotal, 3.01 ± 0.82 for METpre, 3.14 ± 1.03 for METgame, and 1.69 ± 0.79 for METpost (see Table 1). When examined by class, Class 2A showed notably lower METtotal scores (1.41 ± 0.34) compared with Classes 2B (2.92 ± 0.78), 2C (2.62 ± 0.42), and 2D (2.73 ± 0.67), with similar variability patterns across the other three measures (see Table 2).
Normality testing (Shapiro–Wilk) for each class–variable combination revealed that at least one group per variable deviated from normality; therefore, non-parametric Kruskal–Wallis tests were applied.
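To make the test concrete, the Kruskal–Wallis H statistic and a rank-based effect size can be computed as in the sketch below. It is written in plain Python for illustration only (the analyses themselves were run in R), and the effect size shown, eta squared based on H, is one common choice; whether it is the measure reported here is an assumption. Function names and the toy data are hypothetical.

```python
from collections import Counter


def average_ranks(values):
    """Ranks starting at 1; tied values receive the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis(groups):
    """Return (H, eta_squared_H) for a list of samples (one list per class).

    Not defined when every observation has the same value.
    """
    pooled = [x for g in groups for x in g]
    n = len(pooled)
    ranks = average_ranks(pooled)
    total, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12.0 / (n * (n + 1)) * total - 3 * (n + 1)
    # Standard correction for tied observations.
    ties = sum(t ** 3 - t for t in Counter(pooled).values())
    h /= 1 - ties / (n ** 3 - n)
    # ASSUMPTION: effect size taken as eta squared based on H.
    eta_sq = (h - len(groups) + 1) / (n - len(groups))
    return h, eta_sq


if __name__ == "__main__":
    toy = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]   # hypothetical data, 3 groups
    print(kruskal_wallis(toy))
```

Under the null hypothesis H is referred to a chi-square distribution with k − 1 degrees of freedom, where k is the number of classes.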
For METtotal, the Kruskal–Wallis test indicated a significant effect of Class, , , with a large effect size = 0.47, 95% CI [0.32, 0.64]. Dunn–Bonferroni post hoc comparisons confirmed that Class 2A () scored significantly lower than Classes 2B (), 2C (), and 2D () (all ), with corresponding pairwise effect sizes ranging from to (95% CI [0.25, 0.80]). No significant differences emerged among Classes 2B, 2C, and 2D.
For METpre, class differences were again significant, , , with a large effect size , 95% CI [0.20, 0.55]. Dunn–Bonferroni comparisons revealed that Class 2C () scored significantly higher than Classes 2A (, ) and 2B (, ), and that Class 2D () scored significantly lower than Class 2C (, ) and Class 2B (, ). Pairwise effect sizes ranged from to (95% CI [0.09, 0.74]).
The METgame measure also varied significantly across classes, , , with a large effect size , 95% CI [0.32, 0.61]. Post hoc comparisons showed that Class 2C () scored significantly lower than Classes 2A (, ) and 2B (, ), while Class 2A () scored higher than Class 2B (, ) and Class 2D (, ). The strongest pairwise effect was observed between Classes 2A and 2C (, 95% CI [0.47, 0.81]).
Finally, METpost displayed the largest class differences, , , with a very large effect size , 95% CI [0.66, 0.85]. Dunn–Bonferroni tests showed Class 2D () achieving the highest scores, significantly exceeding Classes 2A (), 2B (), and 2C () (all ). Effect sizes for these comparisons ranged from to (95% CI [0.13, 0.92]), with the strongest contrast between Classes 2A and 2D.
These results collectively (see Table 3 and Table 4) indicate that class membership accounted for substantial portions of variance in all MET-related measures, as confirmed by consistently large effect sizes (ranging from 0.35 to 0.77). The strong and coherent pattern of effects across pre-, in-game, and post-game phases suggests that physical activation and engagement levels were not uniformly distributed among classes. In particular, Class 2A showed markedly lower physiological activation across phases, while Class 2D maintained the highest post-game values, reflecting different collective dynamics in energy expenditure and behavioral engagement. Taken together, these findings contradict the third hypothesis and show that each class behaves differently in terms of CO2 emission rates across the different activities.
4. The Second Study: Students in Siena
The primary objective of this study is to identify how different genres of games affect the flow state and, consequently, the values of some variables measured during the study. The variables measured are related to collective emotions, performance on a specific subject, and experience with video games. More specifically, the hypotheses of the study are as follows.
Hypothesis 1. Differences in game genre are associated with differences in performance, both in terms of completion time and number of correct answers.
Hypothesis 2. Differences in game genre are associated with differences in the flow experience.
Hypothesis 3. Differences in game genre are associated with differences in the perception of shared emotions.
4.1. Materials and Methods
The study was carried out in two secondary schools in Siena: the Sarrocchi Institute and the Galilei Institute. At the Sarrocchi Institute, two classes were involved, one from a Lyceum (3B, 22 students: 10 males, 11 females, and 1 not specified) and one from a Technical School of Informatics (3A, 15 students: 14 males and 1 female). At the Galilei Institute, one Lyceum class was involved (3D, 19 students: 7 males, 11 females, and 1 not specified), for a total of 56 students (31 males, 23 females, and 2 not specified). All students were of the same age (16–17 years, attending the third year).
The structure of the study involved five main phases: the completion of a questionnaire regarding participants’ demographic data and gaming abilities, an Italian literature exercise (10 min), a 20 min gaming session, the completion of three scales (two about the playing experience and one about shared emotions), and finally, a second Italian literature exercise (10 min). During the exercises, each student’s completion time was recorded together with their answers. As in the first study, the 20 min gaming session involved all classmates simultaneously in the same room. In this case, however, the students in each class were randomly assigned to two groups, one playing Sonic Dash and the other playing Life Is Strange (LIS from now on).
While Sonic Dash is an action game, Life Is Strange is a narrative-driven adventure game available on multiple platforms, including smartphones and tablets. Players follow the story of a high school student who discovers the ability to rewind time, allowing them to alter choices and explore their consequences. Gameplay revolves around exploration, dialogue-based decision making, and episodic storytelling rather than action or reflex-based tasks. The game’s emotionally engaging narrative, combined with immersive environments and character development, makes it particularly effective in eliciting reflective engagement and fostering emotional involvement among players.
In addition to the Flow State Scale [12] already used in the previous study, we also considered the Narrative Transportation Scale [14]. Traditionally, Csikszentmihalyi’s flow theory has been associated with contexts where performance, the presence of clear goals, immediate feedback, and a balance between challenge and skill are essential prerequisites for achieving an optimal state of engagement. This model fits well with action or platform games, where players are required to complete well-defined tasks, often with measurable outcomes in terms of score, time, or level progression. However, narrative video games occupy a different space that does not align with the classic view of flow as a performance-centered experience. In these games, the subjective experience is often oriented toward story immersion, identification with characters, and emotional exploration, rather than the overcoming of challenges through technical skill. For this reason, it is useful to speak of a narrative form of flow, or even, as some scholars suggest, anti-flow, meaning a deeply engaging experience that is structured on different dimensions than traditional performance-based flow.
From a theoretical standpoint, two key elements appear to sustain players’ engagement with narrative games: motivation and flow experience. A fundamental study highlights how these two variables are crucial in maintaining players’ involvement, even in the absence of conventional competitive goals [15]. A second influential work, also cited by the aforementioned authors, introduces the Narrative Transportation Scale, originally developed to assess the degree to which readers become “transported” into a story. Although designed for literary narratives, its underlying theoretical framework makes it highly adaptable to narrative video games, where storytelling plays a central role. We therefore introduced an adapted version of the Narrative Transportation Scale, which we renamed the Transportation Flow Scale (TFS from now on). The adaptation process followed standard translation–back translation procedures: items were independently translated into Italian by two bilingual experts, back-translated into English, and then reviewed by a panel of three specialists in media psychology and education to ensure conceptual and linguistic equivalence. Minor wording changes were made to contextualize the items for gaming (e.g., “reading” was replaced with “playing”).
The purpose of introducing the TFS was not to reproduce the Flow State Scale (FSS), but rather to complement it by capturing a distinct experiential dimension of immersion. While the FSS emphasizes action-oriented, performance-based engagement (e.g., challenge–skill balance, control, and concentration), the TFS was designed to measure absorption in story, imagination, and emotional resonance, experiences typical of narrative-driven games such as Life Is Strange. Therefore, strong convergence between the two scales was not expected, as they were theoretically and conceptually distinct by design. In this context, the combined use of the Flow State Scale (more sensitive to performance-based dynamics) and the Transportation Flow Scale (better suited to capturing narrative immersion) offers a promising methodological approach to grasp the complexity of experiences across different game types.
Lastly, it is important to note that even in narrative games, where the performance dimension is less prominent, players can experience intense engagement that, while not strictly corresponding to traditional flow, shares several emotional and motivational components. Understanding and measuring these nuances is essential to fully appreciate the educational and psychological potential of video games, including in school or learning contexts. A version of the scale adapted for the playing experience was administered to players of both Sonic Dash and Life Is Strange to verify whether the two genres, Sonic Dash being more performance-oriented and LIS more narrative-oriented, lead to two different flow experiences (measured by comparing the Flow State Scale and the Transportation Flow Scale). The scale contains twelve items, such as “I wanted to learn how the game ended.” Before analysis, items 2, 5, and 9 of the TFS were reverse-coded to ensure consistency in the direction of scoring. Internal reliability for the 12-item TFS was satisfactory, with both Cronbach’s α and McDonald’s ω indicating good internal consistency. A parallel analysis suggested a unidimensional structure, supporting the use of a single latent factor (eigenvalue-based criterion). The exploratory factor analysis (Maximum Likelihood extraction, no rotation) confirmed that all items loaded positively on a single factor, with loadings ranging from 0.37 to 0.79 and the factor explaining 32% of the total variance. These results support the construct validity of the adapted TFS as a coherent measure of transportation-based flow experiences in narrative gameplay contexts.
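The reverse-coding and internal-consistency steps mentioned in this paragraph can be illustrated with a short sketch. It is not the original analysis code: the 1–7 response range is an assumption made only for the example, the helper names are hypothetical, and only Cronbach’s alpha is shown (McDonald’s omega requires a fitted factor model and is omitted).

```python
from statistics import variance

REVERSED_ITEMS = (2, 5, 9)   # 1-based item numbers reverse-coded in the text


def reverse_code(score, low, high):
    """Mirror a score within its response range, e.g. 2 -> 6 on a 1..7 scale."""
    return low + high - score


def prepare(rows, reversed_items=REVERSED_ITEMS, low=1, high=7):
    """Reverse-code the selected items of every respondent.

    ASSUMPTION: responses range from `low` to `high` (1..7 is illustrative).
    """
    return [
        [reverse_code(v, low, high) if i + 1 in reversed_items else v
         for i, v in enumerate(row)]
        for row in rows
    ]


def cronbach_alpha(rows):
    """Cronbach's alpha for a respondents-by-items table of scores."""
    k = len(rows[0])
    item_vars = [variance([row[i] for row in rows]) for i in range(k)]
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)
```

Alpha approaches 1 when items vary together across respondents, which is why items worded in the opposite direction have to be reverse-coded before the coefficient is computed.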
To take into account the collective emotion phenomena observed after the play session in the Sovere study, a scale on the synchronization of emotions was added: the Perceived Emotional Synchrony (PES from now on) scale [16]. Playing individually but together can be considered a form of participation in a collective gathering, a sort of ritual that is shared among participants even if each is individually involved. Rooted in Durkheim’s [17] concept of collective effervescence, PES captures a state of shared high emotional arousal that arises from co-presence, shared attention, and synchronized behavior, regardless of the emotional valence of the event. This emotional convergence enhances feelings of social unity, strengthens group identification, and promotes a sense of connection to something greater than the self. We therefore measured the PES construct to characterize the experience of playing together in the same class. The PES is a 16-item scale, including items such as “We felt a strong shared emotion.”
Further evidence of the validity of the TFS was provided by significant positive correlations between TFS average scores and the FSS after gameplay (r = 0.42, p = 0.001) and the PES (r = 0.29, p = 0.031). These findings indicate that, while conceptually distinct, the TFS captures a related yet differentiated dimension of immersive flow, particularly relevant for narrative-based gaming experiences.
To obtain consent for data collection in the schools, we obtained informed consent from each participant. The document for the consent was presented and completed online (along with a verbal explanation) at the beginning of the study.
All statistical analyses were conducted using R (version 4.x). Given the study design, analyses were aimed at examining the potential effects of different game genres on students’ performance, subjective experience, and shared emotions. Descriptive statistics (means and standard deviations) were first computed for all key variables, including response times, accuracy rates, Flow State Scale scores, Transportation Flow Scale scores, and Perceived Emotional Synchrony measures. The Shapiro–Wilk test was applied to verify normality assumptions, while the Levene test assessed homogeneity of variances. Depending on these diagnostics, either parametric tests (t-tests, one-way ANOVA) or non-parametric tests (Wilcoxon signed-rank, Kruskal–Wallis, and Dunn post hoc tests with Bonferroni correction) were performed. All pairwise post hoc comparisons were corrected for multiple testing using the Bonferroni adjustment, in order to control the familywise error rate (FWER) across analyses. This analytical approach allowed us to evaluate whether and how game genre (e.g., Sonic Dash vs. Life Is Strange) influenced participants’ task efficiency, flow experience, and emotional synchrony both at the individual and collective levels.
4.3. H1. Differences in Game Genre Are Associated with Differences in Performance, Both in Terms of Completion Time and Number of Correct Answers
To examine whether different game genres affect task performance, we analyzed both the time required to complete the exercises and the number of correct answers, comparing pre- and post-gameplay scores across the full sample and within each game group (Game 1 and Game 2).
Regarding completion time, the paired-samples t-test on the full sample revealed a statistically significant reduction in the time taken after gameplay, t(55) = −4.15, p < 0.001, with a mean decrease of 17.5 s (95% CI [9.06, 25.94]). The effect size was medium (d = −0.56, 95% CI [−0.83, −0.27]). This pattern was also observed within both subgroups. For participants who played Game 1, the reduction in completion time was significant, t(35) = −3.27, p = 0.002, with a mean difference of 17.25 s (95% CI [6.54, 27.96]) and a medium effect size (d = −0.54, 95% CI [−0.89, −0.19]). Similarly, for those who played Game 2, the difference was also significant, t(19) = −2.50, p = 0.022, with a mean difference of 17.95 s (95% CI [2.92, 32.98]) and a medium effect size (d = −0.56, 95% CI [−1.02, −0.08]). These results indicate that, on average, students solved the exercises significantly faster after gameplay across both game conditions, with consistent medium-sized effects, suggesting a robust performance improvement following the gaming experience with no substantial differences in effect magnitude between the two genres.
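As a quick consistency check on the values above, the paired-samples form of Cohen’s d can be recovered from the t statistic and the number of pairs as d = t / √n, with n = df + 1. The sketch below applies this to the three reported tests. That this is the definition of d used in the analysis is an inference from the reported numbers, not something stated in the text, and small discrepancies are expected because t is reported to two decimals.

```python
import math


def cohens_dz(t_value, n_pairs):
    """Paired-samples effect size recovered from t and the number of pairs."""
    return t_value / math.sqrt(n_pairs)


# (t, n pairs = df + 1, reported d) taken from the paragraph above
reported = {
    "full sample": (-4.15, 56, -0.56),
    "Game 1": (-3.27, 36, -0.54),
    "Game 2": (-2.50, 20, -0.56),
}

if __name__ == "__main__":
    for label, (t, n, d_reported) in reported.items():
        print(label, round(cohens_dz(t, n), 3), "reported:", d_reported)
```

All three recomputed values fall within about 0.01 of the reported effect sizes.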
In contrast, no significant differences were found in the number of correct answers. Due to violations of normality assumptions, a Wilcoxon signed-rank test was used for the full sample and for Game 1. For the full sample, the result was non-significant, , with a small and non-significant effect size (r = −0.13, 95% CI [−0.41, 0.17]), indicating that task accuracy remained stable after the game. Similarly, for Game 1, no significant difference in accuracy was found (), with a negligible effect size (r = −0.12, 95% CI [−0.46, 0.25]). For Game 2, where normality assumptions were met, a paired-samples t-test confirmed the absence of a significant effect, , with a negligible effect size (d = −0.13, 95% CI [−0.57, 0.31]).
Overall, gameplay significantly improved response speed but did not affect accuracy, and this pattern was consistent across both game types. Thus, while students completed the exercises more quickly after playing either game, their performance in terms of correctness remained unchanged. These results suggest that the type of game (narrative vs. arcade) does not differentially affect cognitive performance, supporting the idea that games can enhance efficiency without compromising accuracy.
Because the previous study found meaningful differences between classes in completion times, we ran the same class-level comparison in this second study. To investigate whether the variation in performance (in terms of response time and number of correct answers) differed across the three classes (3A, 3B, and 3D), we analyzed the difference between pre- and post-intervention scores using parametric or non-parametric methods, depending on the distribution of the data.
For the response time difference, we computed a variable () representing the change in completion time before and after playing. Shapiro–Wilk normality tests indicated that the assumption of normality was not violated for any of the three classes ( in all three cases). Given that the residuals were approximately normally distributed, a one-way ANOVA was conducted to compare across classes. The results revealed no statistically significant differences among the three groups, , with a negligible effect size (, 95% CI [0.00, 1.00]). This suggests that the reduction in response time after gameplay was comparable across the different classes.
Regarding the correct answers difference (), a variable set to compare the number of correct answers before and after the gaming session, the Shapiro–Wilk tests showed a deviation from normality for Class 3B (), while the other two classes did not violate normality (3A: ; 3D: ). Due to the violation in one subgroup, we opted for the Kruskal–Wallis test, which does not assume normality. The Kruskal–Wallis test yielded no significant differences in the change in correct answers between classes (), with a small effect size (, 95% CI [−0.03, 0.24]). This result indicates that the improvement (or stability) in task accuracy after gameplay did not differ significantly across the classes.
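The Kruskal–Wallis comparison can be sketched as follows. This is an illustrative, standard-library implementation, not the code used for the analysis. It computes the H statistic with the usual tie correction and one common rank-based effect size, η²[H] = (H − k + 1)/(n − k). The assumption that this is the effect size reported above is ours, as is the function name.

```python
def kruskal_wallis_eta2(groups):
    """Return (H, eta2_H) for a list of samples, with the usual tie correction."""
    pooled = sorted(x for g in groups for x in g)
    n_total = len(pooled)
    k = len(groups)
    # Map each distinct value to its average rank and accumulate the tie term.
    rank_of = {}
    tie_term = 0
    i = 0
    while i < n_total:
        j = i
        while j + 1 < n_total and pooled[j + 1] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + j) / 2 + 1
        t = j - i + 1
        tie_term += t**3 - t
        i = j + 1
    # H from the squared rank sums of each group.
    h = 12 / (n_total * (n_total + 1)) * sum(
        sum(rank_of[x] for x in g) ** 2 / len(g) for g in groups
    ) - 3 * (n_total + 1)
    # Tie correction (undefined if every observation has the same value).
    h /= 1 - tie_term / (n_total**3 - n_total)
    eta2_h = (h - k + 1) / (n_total - k)
    return h, eta2_h
```

Because H can fall below k − 1, this estimate can be slightly negative, which is consistent with a confidence interval whose lower bound lies below zero.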
Taken together, these results suggest that, unlike in the previous study, neither the improvement in response times nor the change in the number of correct answers differed significantly across the three classes, indicating a consistent effect of the gaming intervention across class groups.
4.4. H2. Differences in Game Genre Are Associated with Differences in the Flow Experience
To examine whether the type of game influenced participants’ self-reported flow experience after gameplay, we compared the variable , describing the average of the Flow State Scale administered after the game session, between the two experimental groups (Game 1 and Game 2). Preliminary analyses using the Shapiro–Wilk test revealed that at least one group violated the normality assumption (for Game 1 group ); therefore, a non-parametric Wilcoxon rank-sum test was used to assess group differences. The results indicated no statistically significant difference in between the two groups (). The corresponding effect size was small (, 95% CI [0.01, 0.40]), suggesting that the type of game played did not substantially impact participants’ flow experience, thus falsifying one side of the hypothesis.
To compare the game experience with the first study, a Kruskal–Wallis rank sum test was performed to examine whether the average scores on the Flow State Scale after the game () varied significantly across the three school classes (3A, 3B, and 3D). The test indicated a statistically significant difference between at least two of the groups, , with a moderate effect size (, 95% CI [−0.02, 0.34]). Due to the violation of normality assumptions in one of the groups (3D, ), the Kruskal–Wallis test was preferred over a parametric ANOVA. These results suggest that the flow experience after gameplay may be influenced by class-level factors, potentially reflecting contextual, demographic, or educational differences among the groups. To further investigate the significant Kruskal–Wallis result regarding differences in post-game flow scores () across the three classes, post hoc pairwise comparisons were conducted using Dunn’s test with Bonferroni correction. The test revealed a significant difference between Class 3B and Class 3D (Z = 2.44, = 0.044), with a large effect (, 95% CI [0.52, 1.04]), indicating that students in Class 3B () reported lower flow scores than those in Class 3D (). Although the remaining comparisons did not reach statistical significance (), their effect sizes were of comparable magnitude (, 95% CI [0.52, 1.04]), suggesting that meaningful differences might exist but were not detectable due to sample size limitations.
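The Bonferroni adjustment applied to the pairwise comparisons can be checked with a short calculation. The sketch below assumes that the reported Z is referred to a standard normal distribution with a two-sided test and that the correction multiplies the raw p-value by the number of pairwise comparisons (three, for three classes). The function names are illustrative.

```python
import math

def two_sided_p(z):
    """Two-sided p-value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2))

def bonferroni(p, n_comparisons):
    """Bonferroni-adjusted p-value, capped at 1."""
    return min(1.0, p * n_comparisons)

# Three classes give three pairwise comparisons (3A-3B, 3A-3D, 3B-3D).
p_adjusted = bonferroni(two_sided_p(2.44), 3)
print(round(p_adjusted, 3))  # prints 0.044
```

Under these assumptions, Z = 2.44 yields an adjusted p-value of about 0.044, in agreement with the value reported for the 3B versus 3D comparison.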
A further variable was analyzed: the average score on the Transportation Flow Scale (TFS), intended to capture another form of flow experience more in line with narrative games. Here too, the hypothesis was that a different game genre leads to a different game experience. To evaluate whether the type of game played influenced participants’ narrative engagement as measured by the TFS average score, we conducted a two-sample t-test. Prior to this, the assumption of normality was verified via Shapiro–Wilk tests, which did not indicate significant deviations from normality in either group (Game 1: ; Game 2: ). Additionally, an F-test to compare variances showed no significant difference , supporting the use of a t-test assuming equal variances. Results of the independent t-test revealed a statistically significant difference in TFS scores between participants who played Game 1 (Sonic Dash; ) and those who played Game 2 (Life is Strange; ), t(54) = −3.15, p = 0.0026, with a 95% confidence interval for the mean difference ranging from −0.81 to −0.18. The corresponding effect size was large (, 95% CI ), indicating that the narrative-based game (Game 2) elicited substantially higher levels of narrative immersion and engagement compared to the action-based game (Game 1). These results support the hypothesis that the Transportation Flow Scale is particularly sensitive to differences in experiential engagement depending on game genre, effectively distinguishing between more narrative-driven and more action-oriented gameplay experiences.
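For readers who wish to relate the t statistic to a standardized effect size, the following sketch computes a pooled-variance t-test statistic and Cohen's d for two independent samples. It is an illustration under the equal-variance assumption stated above, not the authors' analysis code. The function name is hypothetical, and whether d (rather than another standardized index) was the effect size reported is an assumption.

```python
import math
from statistics import mean, variance

def pooled_t_and_cohens_d(a, b):
    """Return (t, df, d) for two independent samples, assuming equal variances."""
    n1, n2 = len(a), len(b)
    df = n1 + n2 - 2
    # Pooled variance weights each sample variance by its degrees of freedom.
    pooled_var = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / df
    pooled_sd = math.sqrt(pooled_var)
    d = (mean(a) - mean(b)) / pooled_sd        # Cohen's d with pooled SD
    t = d / math.sqrt(1 / n1 + 1 / n2)         # equivalent to the usual t formula
    return t, df, d
```

With this definition, t and d are linked by d = t·√(1/n₁ + 1/n₂), and the degrees of freedom equal n₁ + n₂ − 2, so t(54) corresponds to 56 participants across the two groups.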
To assess whether the average score on the TFS scale () differed significantly across the three class groups (3A, 3B, and 3D), a one-way ANOVA was conducted. Prior to the analysis, normality of residuals was verified within each class using the Shapiro–Wilk test, which did not reveal significant deviations from normality (; ; ), thereby justifying the use of a parametric test. The ANOVA revealed a trend toward significance among class groups (), suggesting a potential, although not statistically significant, difference in scores depending on class. The corresponding effect size was small-to-moderate (, 95% CI [0.00, 1.00]), indicating that approximately 10% of the variance in transportation experience could be attributed to class membership. Although the result did not reach conventional significance, this trend may warrant further investigation with a larger sample to confirm whether class-level contextual factors influence perceived narrative immersion during gameplay. However, based on current data, no conclusive class effect was observed.
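The interpretation of η² as a proportion of explained variance follows directly from its definition, η² = SS_between / SS_total. The sketch below is a minimal, standard-library illustration of a one-way ANOVA that returns both F and η². It is not the analysis code used here, and the function name is hypothetical.

```python
from statistics import mean

def one_way_anova_eta2(groups):
    """Return (F, eta2) for a one-way ANOVA over a list of samples."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    # Between-group and within-group sums of squares.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    eta2 = ss_between / (ss_between + ss_within)  # share of total variance
    return f_stat, eta2
```

An η² of about 0.10 therefore means that roughly one tenth of the total sum of squares lies between class means, with the remainder lying within classes.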
4.5. H3. Differences in Game Genre Are Associated with Differences in the Perception of Shared Emotions
To assess changes in the perception of shared emotions, we compared Perceived Emotional Synchrony (PES) before and after the gaming session, using a Wilcoxon signed-rank test on the average PES scores across the whole sample. The Shapiro–Wilk test indicated that the distribution of the difference scores violated the assumption of normality (W = 0.945, p = 0.01289), justifying the use of a non-parametric test. The Wilcoxon signed-rank test revealed a statistically significant decrease in PES scores after gameplay (, ; , ). This suggests that gaming, regardless of the specific game played, was associated with a significant decrease in participants’ Perceived Emotional Synchrony. The corresponding effect size was with a 95% confidence interval , indicating a small-to-moderate effect.
To examine whether the effect of gaming on Perceived Emotional Synchrony depended on the game played, we analyzed the within-group changes in PES average scores ( vs. ) separately for participants who played Game 1 and Game 2. For the Game 1 group, the Shapiro–Wilk test indicated that the difference scores were normally distributed (, ), thus allowing the use of a paired t-test. The results revealed a statistically significant decrease in PES scores after the game session: , , with a mean difference of ( CI: ). The mean PES score declined from (pre-game) to (post-game). The effect size was small but statistically meaningful (, CI ), indicating a modest yet reliable decrease in Perceived Emotional Synchrony following gameplay. For the Game 2 group, the Shapiro–Wilk test again supported normality of the difference scores (, ). In contrast to Game 1, the paired t-test did not indicate a significant difference in PES scores before and after the game session: , , with a mean difference of ( CI: ). The corresponding effect size was negligible (, CI ), confirming the absence of a meaningful change in Perceived Emotional Synchrony for this group. These findings suggest that only Game 1 was associated with a statistically significant reduction in Perceived Emotional Synchrony, while Game 2 had no observable effect. The hypothesis is therefore supported: the two game genres had different effects on shared emotions, as measured by Perceived Emotional Synchrony.
To examine whether changes in Perceived Emotional Synchrony differed across classes, we computed in the variable the difference between post- and pre-gaming session scores and performed a Kruskal–Wallis test, due to non-normality observed in one of the groups (Shapiro–Wilk for class 3B: ). The test revealed no statistically significant differences in the values of PES before and after playing among the three classes . These results suggest that the change in PES following the intervention was comparable across classes 3A, 3B, and 3D, indicating that class membership did not significantly influence variations in emotional synchrony. Although the effect size was small , this result confirms the consistency of the observed pattern across groups.
5. Joint Discussion
This study is situated within the long-standing tradition of research on the flow state, a line of inquiry initiated by Csikszentmihalyi [2], who also explored, from the outset, how experiences of flow could be intentionally fostered in educational settings to support creativity, engagement, and shared activity. One notable example is the “Flow Activities Room” [18] introduced at the Key School, an experimental elementary school inspired by the theories of multiple intelligences and intrinsic motivation. In this environment, students were encouraged to freely choose and engage in stimulating, self-directed tasks within a structured but flexible framework. The Flow Room served two key educational purposes: first, to allow students to explore and develop their individual abilities and intelligences through diverse and enjoyable activities; and second, to extend the intrinsic motivation experienced in this space to the rest of their academic work. The underlying idea was that if students discovered that learning could be intrinsically rewarding in one context, they might transfer this perception to their overall educational experience.
In the same spirit, the present studies conducted in school settings sought to investigate how gaming activities, when embedded within classroom contexts, can act as catalysts for similar processes of engagement, creativity, and collective experience, supporting both individual flow and the emergence of shared emotional and motivational states. The findings from the Sovere and Siena studies converge on an important result: in both contexts, playing reduced exercise completion times without compromising accuracy. While the number of correct answers did not improve, gameplay consistently facilitated faster performance, pointing to enhanced task efficiency and greater metacognitive skills [19] rather than improved accuracy or learning outcomes. This supports the idea that play, regardless of its content, serves as a catalyst for “performance readiness” by priming students for quicker and equally precise responses.
Despite this overarching similarity, the two studies highlight different dimensions of variation. In the Siena study, no significant differences were observed between the two games tested (Sonic Dash and Life Is Strange) with respect to general indicators such as psychological well-being, response accuracy, and response time. This suggests that the act of playing itself, rather than the specific game genre, was sufficient to produce general performance benefits at the group level. However, experiential measures, including Perceived Emotional Synchrony, Flow, and Narrative Transportation, proved more sensitive to the nature of the game. Narrative-driven play tended to foster immersion and a sense of shared experience, while competitive play showed reductions in PES, reflecting a potential dampening of collective emotional convergence. This is also in line with Culbertson and colleagues [20], who showed that, consistent with emotional contagion theory [21], one’s perceptions of one’s own flow in the classroom may be influenced by one’s perceptions of others’ flow. This is due to “the tendency to automatically mimic and synchronize facial expressions, vocalizations, postures, and movements with those of another person and, consequently, to converge emotionally” [21]. In contrast, the Sovere study focused exclusively on Sonic Dash but across multiple classes. Here, the main source of variation lay not in the game itself but in classroom dynamics. Although the students were comparable in terms of baseline characteristics (mathematics performance, gender, gamer status, familiarity with the game, and socio-demographic background), significant differences emerged between classes in variables such as exercise completion times, flow, and MET. This suggests that play acted as a revealing lens, bringing to the surface class-specific social and emotional patterns that were otherwise latent. Interestingly, despite the competitive nature of Sonic Dash, observational evidence in Sovere showed strong peer interaction during gameplay (e.g., sharing scores, suggestions, and experiences), contrasting with the decline in PES observed in Siena under competitive conditions. Age differences between the two cohorts may partly explain this discrepancy, as social and emotional dynamics often evolve with developmental stage.
The role of MET also provided further insights. In the Sovere study, gaming phases were associated with the highest MET values, comparable to moderate physical activity [22], underscoring the activating and engaging quality of play. In contrast, post-game phases yielded the lowest values, reflecting a relaxing or recovery effect after play. At the class level, distinct patterns of MET change were revealed, suggesting that groups differed in their collective energy expenditure and engagement styles, offering an additional dimension through which play exposed classroom-specific dynamics.
Taken together, the two studies emphasize a dual perspective on play in education. On the one hand, play provides general benefits at the level of the full sample—namely, improved efficiency without loss of accuracy—regardless of game type or classroom. On the other hand, play also reveals context-dependent dynamics: in Siena, differences emerged between games with respect to experiential quality, while in Sovere, differences emerged between classes despite uniform gameplay. These findings suggest that constructs linked to performance and efficiency are robust across contexts, whereas constructs tied to emotional resonance and collective experience are sensitive to both the type of game and the specific classroom environment.
From an educational standpoint, this dual role of play is highly relevant. Teachers may strategically integrate narrative games to foster collaboration and shared affect, or competitive games to encourage focus, energy, and goal orientation. More broadly, gameplay can be understood as both an intervention to enhance task readiness and a diagnostic tool to reveal collective socio-emotional patterns, providing educators with valuable insights into the dynamics of their classrooms.
Ultimately, both studies demonstrate that the classroom can act as a living laboratory for the study of environmental flow, where emotional contagion and shared engagement shape not only the learning process but also the social fabric of the class. Future research should explore how these mechanisms can be intentionally leveraged to create healthier, more emotionally cohesive educational settings, integrating play not merely as a pedagogical instrument but as a social catalyst for connectedness and well-being.
Limitations and Future Directions
This research presents several limitations that should be acknowledged and, at the same time, open interesting directions for future work. First, in terms of participants, the overall number of subjects involved was relatively small, and the two studies included students of different ages, attending only three schools. These differences in age and educational context may have introduced variability in the results and limited the generalizability of the findings. In addition, having such a small and context-specific sample does not allow a clear understanding of whether the observed effects are representative of broader school populations. Future research should therefore aim to involve a larger number of participants, ideally including multiple classes within the same school and students of comparable age ranges. This would make it possible to generalize results by age group and school level. Furthermore, extending the research to schools in different cities or regions could help clarify whether geographical or cultural factors contribute to the observed variations in emotional, behavioral, or cognitive responses to gaming and flow experiences.
Second, in terms of methodology, the exercises used to assess performance were intentionally simple. Although this design allowed for ease of comparison between groups, it may have constrained the potential to detect more nuanced effects of gaming and flow on higher-level cognitive or problem-solving skills. Future studies might consider adopting more complex and domain-specific tasks, possibly aligned with the type of game used, to better capture the transfer of flow-related benefits to academic or cognitive performance. Additionally, no follow-up sessions were conducted to investigate the persistence of the effects of gaming and flow over time. Longitudinal designs with repeated sessions could help determine whether improvements in emotional well-being, class climate, or meta-cognition are short-lived or can be sustained across multiple interventions.
Also regarding methodology, estimating individual METs from indoor CO2 concentration, classroom volume, and participants’ height and weight involves several assumptions, including ventilation rates and the spatial distribution of activity. Maintaining fully controlled environmental conditions in classroom-based research represents a methodological challenge, and differences between classrooms could potentially influence MET estimates. Within the limits of the current study, we standardized conditions as much as possible: all classrooms were ventilated for 15 min prior to the start of the sessions, and windows and doors were kept closed during the entire data collection period. Moreover, the MET calculations accounted for differences in classroom volume, ensuring that variations in room size did not affect the determination of individual MET coefficients. Future studies could further strengthen environmental controls to improve precision and comparability of MET estimates. Potential strategies include continuous monitoring of air exchange rates, use of identical or controlled rooms, or more sophisticated modeling of participant localization and activity patterns within the classroom. Such refinements would enhance the robustness of MET measurements in classroom-based research.
Finally, both the Sovere and Siena studies were largely exploratory in nature, reflecting the novelty of this research field. The constructs investigated partly overlapped but also differed between the two studies, making full integration of the findings challenging. Future research should therefore work toward developing a standardized experimental protocol, defining a coherent set of measures to assess emotional, cognitive, and social outcomes of co-active flow experiences in the classroom.
Such a framework would help identify the constructs most suitable for addressing the central research question: how the state of flow achieved during gameplay—in co-active classroom contexts—can enhance both individual and collective well-being, and foster a more receptive cognitive disposition toward learning. A further promising avenue for future investigation involves collaborative flow, by designing and testing game-based activities that promote cooperation rather than competition. This would allow researchers to compare the social and emotional dynamics of co-active versus co-operative flow and to evaluate their respective benefits for classroom engagement and emotional synchrony.
6. Conclusions
This paper presented two complementary studies designed to explore how the flow state is shaped by the school context in which gaming occurs. We examined context at two levels: first, through situations of co-active flow, where classmates played individually but simultaneously in the same room; and second, by investigating the impact of different game genres. To capture this complexity, the second study extended the standard Flow State Scale with the introduction of the Transportation Flow Scale, adapted from the Narrative Transportation framework. This allowed us to show that while the FSS primarily reflects the action-oriented and competitive dimensions of gaming, the TFS better captures the immersive qualities of narrative-centered games.
Building on this, we considered how collective emotions emerge in co-active flow situations. Inspired by the first study, which suggested that classroom dynamics shape the collective atmosphere during play, the second study incorporated the construct of Perceived Emotional Synchrony, traditionally used to study emotional convergence in collective rituals. Findings revealed that game genre played a crucial role: the competitive game was associated with a decrease in PES, whereas the narrative-driven game showed no such decrease and elicited higher narrative transportation. Moreover, the first study highlighted that even within a single school, significant class-specific differences emerged, suggesting that group-level social dynamics can substantially modulate the effects of gameplay.
Taken together, these results indicate that educators may strategically curate gaming experiences to influence classroom climate. Narrative games may encourage collaboration and emotional cohesion, while competitive games may stimulate focus and goal-oriented behavior, both potentially advantageous depending on the educational objective. The overarching message is that play in school contexts serves a dual function: it enhances task efficiency while simultaneously acting as a lens that reveals collective socio-emotional patterns. For this reason, teachers could, at least in principle, leverage different genres of games to modulate class mood and foster desired socio-emotional outcomes.
To provide a more concrete framework for educational application, we propose that teachers can strategically select game genres to support specific classroom goals related to well-being, social cohesion, and emotional climate. Narrative-driven games (e.g., Life is Strange) can be used to foster calm, reflection, empathy, and concentration, while fast-paced or reactive games (e.g., Sonic Dash) may promote engagement, healthy competition, and energy release. A flexible, modular approach could include the following steps: (1) identifying the classroom goal for the session, (2) selecting an appropriate game genre aligned with this goal, (3) sequencing gameplay with follow-up activities such as group discussion, reflective journaling, or cooperative tasks, and (4) monitoring impact through brief self-reports or teacher observations of engagement, mood, and peer interaction. For instance, short narrative gameplay sessions could be followed by reflective discussion to consolidate emotional and cognitive engagement, whereas brief reactive game sessions could be paired with cooperative challenges to stimulate energy and teamwork. This framework allows teachers to adapt game selection, duration, and sequencing based on classroom needs while supporting positive classroom climate and student readiness to learn.
Nevertheless, this line of inquiry is only beginning. Further research is needed to refine the tools for measuring flow and emotional synchrony in educational contexts, to disentangle the mechanisms driving class-specific differences, and to assess the long-term educational impact of integrating structured gaming experiences into school practice.