Preschoolers’ STEM Learning on a Haptic Enabled Tablet

Abstract: The research on children’s learning of science, technology, engineering, and math (STEM) topics from electronic applications (apps) is limited, though it appears that children can reasonably transfer learning from tablet games to particular tasks. We were interested to determine whether these findings would translate to the emerging technology of haptic feedback tablets. The research on haptic feedback technology, specifically, has found that this type of feedback is effective in teaching physics concepts to older students. However, haptic feedback has not yet been sufficiently explored with younger groups (e.g., preschoolers). To determine the effect of playing a STEM game enhanced with haptic technology on learning outcomes, we designed an experiment where preschool participants were randomly exposed to one of three different conditions: (a) STEM game with no haptic feedback (tablet), (b) STEM game enabled with haptic feedback (haptics), or (c) a puzzle game (control). Results revealed no significant differences in comprehension or transfer by condition. Results from this study contribute to the literature on the effectiveness of haptic feedback for preschool STEM learning.


Introduction
The United States consistently falls behind its global peers in students' math and science achievement scores [1]. Data from the 2015 Program for International Student Assessment (PISA) rank United States fifteen-year-olds in the middle of the global scene in math (38 out of 71 countries) and in the top third in science (24 out of 71 countries) [2]. When compared to its developed counterparts alone, the United States fares even worse: 30th out of 35 countries in math and 19th out of 35 countries in science. Given these striking statistics and the growing number of scholars suggesting that science, technology, engineering, and math (STEM) will be the backbone of the 21st-century world of work [3], it is unsurprising that the education sector is feeling the pressure to prioritize STEM education.
This emphasis is felt not only in K-12 classrooms, but also increasingly in early childhood education programs. STEM learning was originally thought to be too challenging or inappropriate for children under age five [4]. In the last ten years, however, that conversation has changed dramatically as experts have focused more on improving STEM education in early childhood settings [5,6]. Preschoolers not only have a natural curiosity for STEM concepts, but also have the cognitive capacity to make these connections. Indeed, research demonstrates a positive relationship between STEM learning in preschool and later academic achievement [7]. With the appropriate scaffolding, it is not unheard of for this age group to start learning STEM.
Technology has quickly become an integral part of children's lived experience [8] and a place where children may be able to learn new and novel concepts. Research has shown that traditional tablet technologies can support aspects of STEM learning [9]. Yet, traditional tablets provide limited tactile feedback to the user, a feature that may be supportive of learning for many STEM concepts. Haptic feedback (such as the tiny vibrations Android users feel when they tap a navigation button on their smartphones) may offer a solution in which users can experience enhanced tactile feedback while engaging with a touchscreen device. There is evidence of significant STEM learning from haptic feedback devices with adolescents and young adults [10,11]. However, we are only just realizing the potential power of these devices for young children.
The purpose of this study is to evaluate whether young children playing a STEM application (app) designed to explore the concepts of weight and balance learn better from this experience when it is presented on a haptic feedback tablet compared to a traditional touchscreen tablet. We currently know almost nothing about young children using haptic feedback devices, and literature is only now starting to come out about the different features of STEM apps used on traditional tablets with this young age group. To our knowledge, this study is the first of its kind to use haptic feedback tablets with children as young as three years old (preschoolers) and to determine learning outcomes from this experience. Our study adds to the current literature around children learning from STEM games on tablets more generally, as well as from using haptic feedback tablets, specifically.
This article is organized as follows: we first describe the current literature on young children's learning of STEM concepts from tablets, introduce haptic feedback technology and literature related to its use with youth, and finally discuss the theoretical frameworks we used to design this study before moving into the research questions, methods, results, and discussion. We conclude with final thoughts and implications of this work.

Young Children Learning STEM from Tablets
There is a long history of young children learning from high-quality educational media, beginning with research on television [12][13][14]. More recently, research on children's learning of STEM topics from mobile devices and apps demonstrates that children can reasonably transfer learning from tablet games to particular tasks. For instance, four- to six-year-olds were able to transfer learning of a challenging cognitive task (i.e., Tower of Hanoi) from a touchscreen fairly seamlessly to the physical version [15]. The researchers found that regardless of the original modality children practiced the task on (2D or 3D), all children improved in the final problem-solving task [15]. Problem solving, in this study and elsewhere, has been described as an important STEM building block [16,17]. Evaluations of specific math applications have also found mostly positive results [18]. In one study, Schacter and Jo [19] showed that preschoolers who played a math-oriented game (Math Shelf) on a tablet for 20 min weekly for 15 weeks learned significantly more mathematics concepts than children who were in the control group (business as usual in their classrooms). Another study established that a group of preschoolers whose teachers received dedicated professional development tools and implemented a high-quality digital math app in their classrooms outperformed students who did not receive these resources on post-test measures [20].
Despite these promising findings, it is not entirely clear that playing tablet applications on a 2D touchscreen improves learning outcomes more than watching a non-interactive video. Specifically, in one study, preschoolers who played a measuring game on a tablet demonstrated learning when tested with transfer tasks that nearly identically replicated the content that they interacted with. However, those who viewed a pre-recorded version of the measuring tablet game did better on far transfer tasks that required them to apply what they learned from the screen to more novel experiences. The researchers posit that the interactive nature of playing the game might only help in conditions when the child is required to learn content and use it in ways that most resemble the gameplay [21].
Other research echoes these findings, suggesting that exposure to high-quality educational apps, but not necessarily interacting with them, supports desirable STEM learning outcomes [22][23][24]. These researchers note that such differences may also be a function of child age.

STEM: Weight and Balance
Research on children's use of STEM media has identified that today's preschoolers are indeed consuming media designed to teach STEM concepts [25][26][27]. One content analysis of preschool television shows marketed as STEM found that these shows offered at least some aspect of STEM educational material in their randomly sampled episodes [28]. The authors suggest that in addition to television content analyses, more work is needed to explore STEM content in educational apps designed for preschool children [28].
In the current study, we specifically examine the concept of weight and balance. According to developmental scientists, the preschool age range is critical for the development of children's understanding of weight [29,30], but that was not always the shared belief. Research from Halford and Dalton [31] found that two- and three-year-olds could make successful predictions about what a scale would do when given instruction about rules of weight or distance from the fulcrum. Yet, when rules about both weight and distance were incorporated in the instructions, two- and three-year-olds performed no better than chance in predicting the effect of these factors [31]. These results were replicated in other conditions as well [32]. Taken together, these findings suggest that even children as young as two years old can recognize weight and distance effects on a balance task when given the appropriate support and developmentally appropriate techniques. Of course, in these studies and others, there is certainly an effect of age. That is, when studies recruit and compare different preschool age groups, a notable trend occurs. Studies of this nature often find that five-year-olds perform best on balance scale tasks, with four-year-olds performing consistently well, if slightly worse than five-year-olds, and still better than two- and three-year-olds [33,34]. Despite earlier scholars' resistance, it appears that the preschool years are, in fact, a crucial time for learning about weight. Because the research suggests that three- and four-year-olds perform differently on balance tasks (though both still better than two-year-olds), understanding concepts of weight and balance appears to have some developmental component.
Given the importance of this STEM concept for preschoolers, the present study explores whether three- and four-year-old children can transfer knowledge about weight and balance after playing a game designed to explore these concepts.
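The weight and distance rules examined in this balance-scale literature can be made concrete with a short sketch. The Python below is illustrative only: it applies the standard lever rule (each side's turning effect is the sum of weight times distance from the fulcrum), and the function name `seesaw_tilt` and all unit values are hypothetical rather than taken from the cited studies.

```python
def seesaw_tilt(left, right):
    """Predict which side of a balance scale goes down.

    Each side is a list of (weight, distance_from_fulcrum) pairs.
    Units are arbitrary and purely illustrative (an assumption).
    """
    # Lever rule: turning effect = weight * distance, summed per side.
    left_torque = sum(weight * distance for weight, distance in left)
    right_torque = sum(weight * distance for weight, distance in right)
    if left_torque > right_torque:
        return "left"
    if right_torque > left_torque:
        return "right"
    return "balanced"


# Weight rule only: same distance, heavier object on the left.
print(seesaw_tilt([(2, 1)], [(1, 1)]))  # left
# Distance rule only: same weight, right object farther from the fulcrum.
print(seesaw_tilt([(1, 1)], [(1, 2)]))  # right
# Both rules combined: a light object far out offsets a heavy one close in.
print(seesaw_tilt([(1, 2)], [(2, 1)]))  # balanced
```

In the last case, a weight-only rule would predict that the right side goes down, which illustrates why tasks that combine both factors are harder than tasks that vary only one.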

Haptic Technology
The modern use of the term "haptics" is associated with touch: the study of touch as well as humans' ability to interact with their environment through tactile means. Haptics, as a field of study, incorporates research from a variety of disciplines. From engineering and robotics to cognitive psychology and education, interest in developing and understanding the impact of haptic technology is extremely multidisciplinary. Research on haptic designs often includes study of the development of the tactile and/or force feedback device hardware as well as developing, testing, and refining the software that actually allows users to "feel" the sensation of surface contact [11].

Haptic Impact on Learning
Although haptic feedback could theoretically be useful for all sorts of subjects, research on the impact of haptic feedback on STEM learning, specifically, suggests that this type of tactile sensation is particularly useful because it provides more of a "hands-on," kinesthetic experience, the kind of experience that is necessary for deep STEM learning [35]. Experiences with haptic feedback, then, are designed to fully immerse students in the learning environment. Research on the implementation of haptics in education supports this phenomenon. To date, the empirical evidence supports haptic experiences as conducive to learning STEM concepts such as force fields [36], viruses [37], and levers [38]. Haptic technology has also been shown to support STEM learning for middle schoolers who are blind, low vision, or otherwise visually impaired [39]. However, not all research comparing interaction with haptic devices to business-as-usual models is wholly supportive. Minogue and Borland [40] found no significant difference in pre- to post-test knowledge of buoyancy for undergraduates who interacted with a haptic device compared to those who did not, but did find qualitative differences in how participants described buoyancy after the experiment. The authors explain that rather than focusing on the quantitative measures of learning outcomes, future research should seek to better understand the language that students ascribe to haptic experiences in order to determine how that may impact learning in the long term [40].
Although the research suggests a potentially positive effect of haptic feedback experiences on learning, the studies described earlier focus exclusively on young adult or adolescent populations and often experiment with only physics concepts. There is considerably less research on haptic feedback use in teaching elementary STEM concepts, though some studies have been identified. One study found that, compared to peers who were not exposed to haptic feedback, fifth graders who learned about gears using force and kinesthetic simulations (haptics) improved in their conception of gears and were better able to transfer this knowledge to a novel environment [10]. In addition to improved learning outcomes, experiences with haptic technology appear to be particularly engaging for elementary school learners [41][42][43]. Indeed, Williams, Chen [44] found that elementary students thought that the haptic software the team developed for learning about simple machines was effective or very effective in teaching such concepts. The authors also noted that student and teacher responses to open-ended questions about the software were generally positive. In fact, many remarked that using these tools to explore simple machines "[sic] was FUN." As such, understanding the appeal of haptic interfaces to preschool children will be a useful addition to this body of literature. Further, some of the differences in findings around haptic feedback and learning might be due to the fact that research on this topic includes interactions with haptic joysticks and controllers alongside tablet devices that offer the variable friction designed to enhance interactivity.
Although prior research has focused primarily on using haptics to improve physics learning for middle school, high school, and undergraduate students, there is a lacuna of research on how haptics may support other types of STEM learning for younger children. This study seeks to address this gap in the literature specifically.

Embodied Cognition
Although not originally conceived to be applied to learning from touchscreen devices, theories of embodiment (or embodied cognition) are one way to consider learning effects from interactive devices. Rather than describing cognition as a computational process (as was previously en vogue), the embodied cognition hypothesis contends that learning emerges through sensorimotor experiences with the environment [45][46][47][48]. That is, cognition (our ideas, thoughts, and understanding about the world) is believed to be a direct consequence of past and present experiences with the physical world, and it starts as early as birth. Further, research on young children's use of physical manipulatives also validates this notion [49]. For example, in two of their studies, Manches, O'Malley [50] and Manches and O'Malley [51] demonstrated that early grade schoolers (5- to 7-year-olds) could come up with more solutions to a partitioning problem (an early math concept) when they had the chance to use physical manipulatives than when they practiced the same concept with no materials or when using a pencil and paper.
Outside of learning from physical manipulatives in the environment or movement of our bodies in physical space, theories of embodiment have been increasingly used to describe learning from interactive devices because of the inherently physical nature of interaction [52][53][54]. In terms of embodiment, a simple swipe, drag, or tap of the index finger on a tablet could have great implications for learning. Embodied cognition offers a valuable framework for studying the role of haptic feedback in learning about STEM concepts on a mobile device.

The Capacity Model
Another theory used to explain learning from educational media comes from work done by Shalom Fisch. Incorporating concepts from cognitive psychology, education, and media studies, his result, the Capacity Model [14,55], posits that children are faced with a host of cognitive processing demands when watching any kind of television, but that such demands are even more of a challenge when children view educational media. Educational media are unique because they usually present both narrative content (the storyline) and educational content (the intended curriculum) simultaneously. Not only must children process both narrative and educational content, but the comprehension of the two is influenced by the degree to which these elements are related. That is, how closely the educational content is embedded in the narrative content also affects comprehension, what Fisch calls "distance." With respect to the two types of content and the distance between them, the Capacity Model suggests that comprehension is improved when three conditions are met: (1) when processing requirements of the storyline are low (because the child already knows the characters, for example), (2) when processing loads of the educational content are low (perhaps because the child is particularly interested in the educational content), and (3) when the distance between the narrative content and the educational content is relatively small. Of course, comprehension is also shaped by various factors related to processing the storyline and the educational content (e.g., individual viewers, program details, etc.).
Theoretically, Fisch's model works well for describing comprehension of educational television and has been empirically tested by researchers in the field. Research from Piotrowski [56] provided the first empirical support of the Capacity Model, establishing that children with more advanced story schema skills were significantly more likely to demonstrate strong narrative and educational comprehension. Further evidence of the Capacity Model's application comes from Aladé and Nathanson [57]. The researchers found that viewer characteristics such as short-term memory, verbal ability, and prior knowledge about the narrative were related to increased narrative comprehension, while prior knowledge of the educational content was related to processing of the educational lesson. However, interest in neither educational content nor narrative content was related to comprehension of those two areas [57].
So far, the research on the Capacity Model has been specific to educational television (because it was developed with television in mind), but even Fisch [58] recognizes the potential power of learning from educational digital games. In his updated version, Capacity Model 2.0, Fisch [59] mentions several noticeable differences. One key modification is that in addition to demands on the processing of the educational and narrative content, there is a third component; the processing demands related to gameplay (i.e., the usability of the game). With the inclusion of this element, the model now features three types of distance: (1) the distance between game play and educational content, (2) the distance between educational content and narrative content, and (3) the distance between narrative content and game play. Unlike Capacity Model 1.0, this version privileges minimizing the distance between educational content and game play. When the distance between educational content and gameplay is large (i.e., the two are not integral to one another), comprehension of the educational content will suffer [58]. Not only is the distance between gameplay and educational content critical, Fisch [59] posits that priority of resources will be given to understanding gameplay over and above the other two types of content. With respect to educational television, Capacity Model 1.0 prioritizes narrative content over the educational content so this difference is crucial. The new Capacity Model suggests that if gameplay is too challenging, too advanced, or otherwise too difficult, the learning outcomes will suffer.
All in all, the Capacity Model and its newest update, Capacity Model 2.0, are useful in understanding learning from educational media. We use Capacity Model 2.0 to explain any learning effects in terms of comprehension and transfer in this experiment. Interestingly, Capacity Model 2.0 and embodied cognition are not inherently dichotomous, but are two theories that could potentially explain any type of results from this novel experiment; whether there is learning and even whether there is not.

The Present Study
To determine the effect of playing a STEM game enhanced with haptic feedback technology on learning outcomes, we performed an experiment with three- and four-year-olds. Specifically, we asked: RQ1: Does an additional level of interactivity (embedded in haptics) improve preschoolers' learning of a STEM concept (weight and balance), as evidenced by comprehension and transfer?
Given the research on children learning STEM content from tablet computers, we predicted that: Hypothesis (H1). Participants in the haptic and tablet conditions will outperform control condition participants in comprehension and transfer outcome tasks.
Since the research on children learning STEM content from tablet computers also considers moderating factors, we also asked: RQ2: Is there an interaction with respect to child age and/or other child factors (e.g., pre-test scores, verbal ability, appeal, and parent report of executive function) on these comprehension and transfer outcomes?
Here, we predicted that developmental age will play a role, such that: Hypothesis (H2). Older children will perform better than younger children in comprehension and transfer tasks.

Participants
A total of 73 three- and four-year-olds participated in the study, but five were removed because they completed fewer than three of the final research tasks. To be eligible for the study, preschoolers had to communicate with the researcher in English, have some experience with a tablet (e.g., iPad or Android), and not have played the two games of interest nor have significant experience with haptic feedback technology. The final sample consisted of 68 children between the ages of 3 and 4.5 (M age = 3.76, SD = 0.53; 54% male). Please see Table 1 for more information about participants. Parents reported child race/ethnicity, household income, and parent education information. Sixty-four percent of participants were White, 4% were Black or African American, 8% were Hispanic or Latino, 6% were Asian/Pacific Islander, and 18% were biracial. Household income ranged from $50,000 to more than $175,000, with an average income of $85,000-$99,000 (SD = $15,000). The families in the sample were also highly educated: 33% of participants' recorded parents had master's degrees, 43% had bachelor's degrees, 18% received associate degrees, and 6% had at least completed high school with some college education.
Participants were recruited from a database of parents who had opted in to being recruited for studies of this nature, as well as through flyers in local preschools and other community areas (including online discussion groups) in the greater Chicagoland area. Additional participants were recruited at a local preschool in Tampa, FL. Participants were compensated $15 in cash for their time. All procedures were approved by the sponsoring university's Institutional Review Board.

Procedure
Upon arrival at the laboratory or other research setting, participants were randomly assigned to one of three experimental conditions (Tablet, Haptic, or Control). Parents signed consent forms, while the experimenter collected child assent before the study began. Children first had the opportunity to play freely with magnetic tiles as a warm-up activity to become comfortable with the researcher before officially beginning the study. After a series of pre-assessments (see measures section below), all participants were exposed to the experimental stimuli on the same Tanvas device. After approximately ten minutes of play with the stimuli, participants completed a series of post-assessments. The researcher and parent were in the room throughout the session, but were instructed not to talk unless the child needed help with the tablet. Parents completed a questionnaire reporting parent/child demographics, parent attitudes toward STEM media, child executive function skills, and child/parent technology use. The entire session took approximately 30 min. All sessions were audio and video recorded for later analysis.
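Random assignment to three conditions can be sketched in a few lines. The procedure above does not specify how randomization was carried out, so the shuffle-then-rotate scheme below is only one common approach and is an assumption, as are the function name `assign_conditions`, the participant IDs, and the seed. With 68 participants it happens to yield group sizes of 23, 23, and 22.

```python
import random
from collections import Counter

CONDITIONS = ("tablet", "haptic", "control")


def assign_conditions(participant_ids, seed=None):
    """Randomly assign participants to conditions with near-equal group sizes.

    Hypothetical scheme: shuffle the IDs, then deal them out to the
    conditions in rotation. The study's actual procedure is not described.
    """
    rng = random.Random(seed)  # seeded generator so the sketch is reproducible
    ids = list(participant_ids)
    rng.shuffle(ids)
    return {pid: CONDITIONS[i % len(CONDITIONS)] for i, pid in enumerate(ids)}


# Illustrative IDs 1..68, matching the size of the final sample.
assignment = assign_conditions(range(1, 69), seed=0)
group_sizes = Counter(assignment.values())
print(dict(group_sizes))
```

Dealing shuffled IDs out in rotation keeps group sizes within one of each other, whereas drawing a condition independently for each child would not guarantee balance.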

Experimental Stimuli
The research team created the target stimulus by modifying a freely available application, WGBH's Peep and the Big Wide World Bunny Balance game. The game, as described on the website, is designed to have children experiment with balance, weight, and size by dragging bunnies onto one of four seats of a seesaw [60]. We chose this game because describing measurable attributes such as weight and size is crucial for subsequent math and science learning as evidenced by the kindergarten Common Core Math Standards [61].
In our version of Bunny Balance, all of the bunnies were the same color (gray) and were based on the shape of one specific bunny, but differed in size. See Figure 1 for a screenshot of the game. Bunnies were lined up by size, smallest to largest. Although not always the case in the real world, weight and size did correspond in the game. That is, children would learn through playing the game that the smallest bunny was the lightest and the biggest bunny was the heaviest. The seesaw was empty with one seat on each side of the seesaw. Participants could differentiate the ends of the seesaw by noting their color. The left side's seats were red, while the right side's seats were blue.

Haptic Tablet
All participants experienced the stimuli on a Tanvas TPaD tablet. The TPaD tablet (see [62] for a review of the earlier technology) is an Android tablet with embedded TanvasTouch technology. Using electrostatics to control friction and create virtual touch, TanvasTouch software allows specific textures and haptic effects to be programmed and associated with the swipe of fingers on any touchscreen [63].

Tablet Condition
Since the TPaD tablet operates as an Android tablet, we had children in this condition (n = 23) play our game as it would naturally exist on a traditional Android tablet. Children first watched an introductory video (screen capture) of someone else playing. The video demonstrated ways of making the red end heavier, the blue end lighter, and balancing the seesaw. Children then had a chance to try it out themselves, first by dragging the different bunnies to the seats without any prompts. The researcher then instructed children to touch the arrow button to hear the six prompts. Children had the opportunity to attempt each prompt (see order in Appendix A). If they did not correctly complete the prompt, they were encouraged to try again or move to the next question. Participants in this condition played until they successfully completed all six prompts, until 10 min had elapsed, or until they indicated they no longer wanted to play.
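The stopping rule for a play session can be summarized in a short sketch. This is not the software used in the study; the function name `session_end_reason`, its arguments, and its return labels are hypothetical, and only the three stopping conditions (six prompts, 10 min, child's choice) come from the description above.

```python
MAX_PLAY_SECONDS = 10 * 60  # 10 min play limit
TOTAL_PROMPTS = 6           # number of in-game prompts


def session_end_reason(prompts_completed, elapsed_seconds, child_wants_to_stop):
    """Return why the play session should end, or None if play continues."""
    if prompts_completed >= TOTAL_PROMPTS:
        return "all prompts completed"
    if elapsed_seconds >= MAX_PLAY_SECONDS:
        return "time limit reached"
    if child_wants_to_stop:
        return "child chose to stop"
    return None


# A child who has finished 4 prompts after 5 min keeps playing.
print(session_end_reason(4, 300, False))  # None
# The same child at the 10 min mark stops, regardless of progress.
print(session_end_reason(4, 600, False))  # time limit reached
```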

Haptic Condition
Although visually identical to the tablet condition, in the haptic condition (n = 23) the target stimulus game had haptic feedback embedded within the game design such that as bunnies increased in size, the tactile feedback associated with dragging them across the screen also increased. Each bunny differed in its oscillation pattern such that the largest bunny, at high friction, was the most difficult to move, while the smallest bunny, at low friction, was the easiest to move. The remaining bunny fell somewhere in between. We also chose to associate particular textures with the bunnies. Again, children first watched the introductory video and then played with this game until (a) they successfully completed the prompts, (b) 10 min passed, or (c) they indicated they no longer wanted to play.
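The size-to-friction relationship can be illustrated with a small mapping. The sketch below does not use the TanvasTouch software or any real haptics API. It only computes a normalized friction level (0 = lowest, 1 = highest) that rises with bunny size. The linear scaling, the function name `friction_levels`, and the size values are all assumptions, since the description above specifies only the ordering of the three bunnies.

```python
def friction_levels(sizes, low=0.0, high=1.0):
    """Map each named object's size to a friction level between low and high.

    Assumption: friction grows linearly with size. The study reports only
    that larger bunnies were harder to drag, not the exact mapping.
    """
    smallest = min(sizes.values())
    largest = max(sizes.values())
    span = largest - smallest
    if span == 0:
        # All objects are the same size, so they share the lowest friction.
        return {name: low for name in sizes}
    return {
        name: low + (high - low) * (size - smallest) / span
        for name, size in sizes.items()
    }


# Hypothetical relative sizes for the three bunnies.
bunnies = {"small": 1.0, "medium": 2.0, "large": 3.0}
levels = friction_levels(bunnies)
print(levels)  # {'small': 0.0, 'medium': 0.5, 'large': 1.0}
```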

Control Condition
Children in the control condition (n = 22) played a different, unrelated STEM game on the Tanvas tablet. We chose a puzzle game since it involved the same game mechanics (drag and drop), but did not formally teach the STEM concept of weight and balance (see Figure 2 for a screenshot). Children played this puzzle game for approximately 10 min, responding to the in-game prompts as they naturally occurred.
Multimodal Technol. Interact. 2020, 4, x FOR PEER REVIEW

Pre-Assessment Measures
After receiving consent and assent, we evaluated children on the following two pre-assessment tasks. See Appendix A for the complete recording form.

Pre-Test
We first collected information on how well participants already understood the concept of weight and balance. On a task modelled after a worksheet from kindergarten classrooms, participants were asked to point (on the worksheet) to the image that showed where the "elephant is heavier than the fox," for example. See Appendix A for the list of questions and Appendix B for images. Responses were recorded as dichotomous units (yes = 1 or no = 0) for (1) attempting the question (rather than choosing to pass on the question) and (2) choosing the correct answer. We added the units for choosing the correct answer on all six questions (min = 0, max = 6; M = 2.51; SD = 1.23).
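The pre-test scoring described above amounts to summing dichotomous codes across the six questions. A minimal sketch follows, assuming one record per question; the class name `ItemResponse`, the function name `score_pretest`, and the sample responses are hypothetical. Treating a passed question as incorrect is also an assumption.

```python
from dataclasses import dataclass

NUM_QUESTIONS = 6


@dataclass
class ItemResponse:
    attempted: bool  # child pointed to an image rather than passing
    correct: bool    # child pointed to the correct image


def score_pretest(responses):
    """Return (attempted_total, correct_total), each ranging from 0 to 6."""
    if len(responses) != NUM_QUESTIONS:
        raise ValueError(f"expected {NUM_QUESTIONS} responses, got {len(responses)}")
    attempted_total = sum(1 for r in responses if r.attempted)
    # Assumption: a question that was passed cannot count as correct.
    correct_total = sum(1 for r in responses if r.attempted and r.correct)
    return attempted_total, correct_total


# Illustrative child: attempts five questions and answers three correctly.
example = [
    ItemResponse(True, True),
    ItemResponse(True, False),
    ItemResponse(True, True),
    ItemResponse(False, False),
    ItemResponse(True, True),
    ItemResponse(True, False),
]
print(score_pretest(example))  # (5, 3)
```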

Verbal Ability
In order to test children's verbal ability, participants performed the Picture Naming Task (PNT). The Picture Naming Task is an expressive vocabulary measure highly correlated with other measures of language and literacy, particularly receptive vocabulary [64]. Children are presented with laminated flashcards of color pictures of various objects (e.g., animals, clothing, household goods, food) by the researcher and asked to name as many as possible in a minute. The number of pictures named correctly in one minute served as the child's verbal ability score (min = 4, max = 27; M = 14.74; SD = 5.26).

Post-Exposure Measures
After the exposure to different conditions, participants completed another three assessments designed to track learning. Refer to Appendix A for the recording form with the following measures.

Comprehension
Given the tablet with a screenshot of our Bunny Balance game, we asked participants to identify both the lightest and heaviest bunnies in the lineup. We asked "Can you show me which bunny in this group is the heaviest [lightest]? Point to it." Responses were recorded as dichotomous units (yes = 1 or no = 0) for (1) attempting the question (rather than choosing to pass) and (2) pointing to the correct answer. We added the units for choosing the correct answer on both questions to create a comprehension score (min = 0, max = 2; M = 2.51; SD = 1.23).

Transfer of Learning
The ability to take something learned in one context and apply it to another is an essential goal of learning; this ability is referred to as transfer. To create a new transfer task for this study, we first consulted previous research using knowledge transfer tasks capturing 2D to 3D transfer with preschoolers [21,65–67] and identified the necessary components. We determined that it was important that children learn to transfer the task from a tablet to a real-life 3D experience. Therefore, we provided children with a 3D scale and counting bears and asked participants to show us how they would (a) balance the scale and (b,c) make either end heavier using any of the six bears and only one bear on each side. The sides were marked by colored stickers (purple and gray). See Figure 3 for sample images of the transfer tasks. Responses were recorded for number of attempts made and accuracy. We calculated a transfer score by summing the number of times participants were successful on the first try across the three questions (min = 0, max = 3; M = 1.76; SD = 0.91).

Appeal
Finally, we asked children "Did you like playing with this game?" and "Would you like to play this game again?" Children answered these questions by saying "yes" or "no." Each option was paired with a pictorial cue of the same type (i.e., thumbs up for "yes" and thumbs down for "no"). If they responded "no," we marked the item as "0" and moved on to the next question. If they responded "yes" to either of these questions, we followed up with "How much do you like playing with this game?" and "How much would you like to play with this game again?" respectively. These follow-up questions were scored using a three-point Likert-type scale, with response options being: a little (1), a lot (2), and a whole lot (3). Each option was paired with a pictorial cue of the same type (i.e., different smiley faces). We calculated an appeal score by summing the two three-point Likert-type items (min = 0, max = 6; M = 3.87; SD = 2.18).
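The scoring rule described above can be sketched in a few lines of Python. This is only an illustration of the rule as stated; the names `LIKERT_POINTS`, `item_score`, and `appeal_score` are hypothetical and do not come from the study materials.

```python
# Hypothetical mapping from follow-up responses to points (from the text:
# a little = 1, a lot = 2, a whole lot = 3).
LIKERT_POINTS = {"a little": 1, "a lot": 2, "a whole lot": 3}


def item_score(said_yes, how_much=None):
    """Score one appeal item: 0 for "no", otherwise 1-3 from the follow-up."""
    if not said_yes:
        return 0
    return LIKERT_POINTS[how_much]


def appeal_score(liked, liked_how_much, again, again_how_much):
    """Sum the two appeal items, giving a score between 0 and 6."""
    return item_score(liked, liked_how_much) + item_score(again, again_how_much)


# Example: liked the game "a lot" but would not play it again.
print(appeal_score(True, "a lot", False, None))
```

Under this rule, a child who says "no" to both questions scores 0, and a child who answers "a whole lot" to both follow-ups scores the maximum of 6.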

Parent Questionnaire
The parent questionnaire was modelled after a similar survey designed to gauge parent attitudes, behaviors, and practices around STEM and STEM media [25]. Specifically, parents were asked to report child media use, their own media use, and their child's attitude towards STEM activities and media. Child and parent demographics were also recorded: age, gender, race/ethnicity, household income, and highest level of education attained. In addition to these items, the parent questionnaire included the Behavior Rating Inventory of Executive Functioning-Preschool Version (BRIEF-P), a parent report of child executive function. The BRIEF-P is a 63-item measure for parents of preschool children that covers five elements of executive function [68]. For each statement, parents indicate whether their child exhibits the described behavior never (1), sometimes (2), or often (3). We used the global executive composite score as a measure of executive function (M = 89.67, SD = 15.96; "normal" scores are between 69 and 89). This measure was chosen due to its good internal consistency reliability and high test-retest reliability [68].

Results
We first checked the data for possible covariates by running correlation analyses. Because participant age and sex were positively correlated with transfer scores, we included these variables in analyses testing transfer. Further, because child age was highly correlated with pre-test scores and PNT vocabulary scores, we used child age as a proxy for these variables in both the comprehension and transfer analyses to avoid potential multicollinearity problems.

Comprehension
To address our research questions, we first performed an analysis of variance (ANOVA) with comprehension scores as the dependent variable and condition as the independent variable. Condition had no significant effect on comprehension scores (F(2, 63) = 0.69, p = 0.50). We also ran ANOVAs testing the effect of condition on comprehension scores with age, pre-test scores, executive function scores, and appeal scores as moderators. These analyses were also not significant and followed the same pattern as the original ANOVA, so we do not report them here. See Appendix C for SPSS output.
Given these null findings and the restricted range of the comprehension score (0–2), we determined that a logistic regression treating each question as correct (1) or incorrect (0) would be preferable to ANOVA. We then performed a logistic regression to ascertain the effect of condition, child age, appeal scores, and executive function scores on the likelihood that participants could correctly identify the heaviest bunny. See Table 2 for complete results. The logistic regression model was statistically significant (χ²(4) = 11.43, p = 0.02). The model explained 33.0% (Nagelkerke R²) of the variance in correctly identifying the heaviest bunny and correctly classified 90.5% of cases. Here, higher appeal scores were associated with an increased likelihood of correct bunny identification (p = 0.04). Child age trended towards being associated with an increased likelihood of being correct (p = 0.08); that is, older children tended to be more likely to correctly identify the heaviest bunny. The second logistic regression tested the effects of age, appeal scores, executive function scores, and condition on the likelihood that participants could correctly identify the lightest bunny. This time, the logistic regression model was not statistically significant (χ²(4) = 3.92, p = 0.41).
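As a rough illustration of the omnibus test described above, the following Python sketch computes a one-way ANOVA F statistic and its degrees of freedom from per-condition score lists. The function name `one_way_anova_f` and the example scores are hypothetical; they are not the study data, and the reported analyses were run in SPSS.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA.

    `groups` is a list of lists, one list of scores per condition.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total

    # Variation of condition means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual scores around their own condition mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Made-up comprehension scores (0-2) for three conditions of four children.
example = [[2, 1, 2, 0], [1, 2, 2, 1], [2, 2, 1, 1]]
f_stat, df_b, df_w = one_way_anova_f(example)
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

With three conditions the between-groups degrees of freedom are always 2, which matches the F(2, 63) reported above; the within-groups value is the number of children analyzed minus 3.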

Transfer of Learning
To answer our research questions about transfer, we performed an analysis of variance (ANOVA) with transfer scores as the dependent variable and condition as the independent variable. Condition had no significant effect on transfer scores (F(2, 65) = 0.21, p = 0.81). We also ran ANOVAs testing the effect of condition on transfer scores with age, pre-test scores, executive function scores, and appeal scores as moderators. These analyses were also not significant and followed the same pattern as the original ANOVA, so we do not report them here. See Appendix C for SPSS output.
Given these null findings and the restricted range of the transfer score (0–3), we determined that a logistic regression treating each question as correct (1) or incorrect (0) would be preferable to ANOVA. We performed logistic regressions to ascertain the effects of age, condition, appeal scores, sex, and executive function scores on the likelihood that participants could answer each of the three transfer prompts correctly on the first try. The logistic regression model for participants correctly making the purple side heavier was not statistically significant (χ²(5) = 5.33, p = 0.38). The logistic regression model for participants correctly making the gray side heavier was statistically significant (χ²(5) = 11.67, p = 0.04). See Table 3 for complete results. The model explained 21.8% (Nagelkerke R²) of the variance in correctly performing the second transfer task and correctly classified 68.2% of cases. Child age was associated with an increased likelihood of being correct (p = 0.002). Unsurprisingly, older children were more likely to correctly make the gray side heavier using the manipulative bears, but the remaining variables of interest were not statistically significant.
The logistic regression model for participants correctly balancing the scale was also statistically significant (χ²(5) = 13.54, p = 0.02). See Table 4 for complete results. The model explained 24.7% (Nagelkerke R²) of the variance in correctly performing the third transfer task and correctly classified 69.7% of cases. Here, older children and those with higher appeal scores were not any more likely to correctly balance the scale. Instead, child sex was associated with an increased likelihood of being correct (p = 0.003), and executive function scores trended towards such an association (p = 0.07). That is, females and children with higher (better) executive function scores were more likely to correctly balance the scale using the manipulative bears.
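Readers who want to check how the reported model chi-square statistics map to p-values can do so without statistical software. The sketch below uses the standard closed-form upper-tail formulas for a chi-square variable with integer degrees of freedom; the function name `chi2_sf` is hypothetical, and the inputs are the rounded statistics reported above, so small rounding differences from the published p-values are expected.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X >= x) for chi-square with integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) times a finite sum of (x/2)^k / k!.
        term = 1.0
        total = 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    # Odd df: two-sided normal tail plus a finite correction sum.
    root = math.sqrt(x)
    term = root  # chi^(2r-1) / (1*3*...*(2r-1)) for r = 1
    total = 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    density = math.exp(-half) / math.sqrt(2.0 * math.pi)
    return math.erfc(root / math.sqrt(2.0)) + 2.0 * density * total


# Model chi-square statistics and degrees of freedom as reported in the text.
for stat, df in [(11.43, 4), (3.92, 4), (5.33, 5), (11.67, 5), (13.54, 5)]:
    print(f"chi2({df}) = {stat}: p = {chi2_sf(stat, df):.3f}")
```

The same function handles both the four-predictor comprehension models (df = 4) and the five-predictor transfer models (df = 5).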

Discussion
This study adds to the literature on preschoolers learning STEM concepts from interactive tablets in two vital ways. First, we offer the first findings (to our knowledge) on preschoolers' learning from haptic feedback tablets using a systematic experimental design. More importantly, however, we offer specific variables related to learning (or lack thereof) to consider when designing research in this area moving forward.

Major Experiment Findings
To our knowledge, this study is the first experiment to test a STEM game enhanced with haptic feedback against the same game without haptic feedback and a control condition on two learning measures with this age group. Interestingly, there were no differences in comprehension and transfer scores by condition. Our findings are not in line with previous research on haptic feedback technology and learning with elementary students [10,44], nor do they support the embodied cognition [46,47] and Capacity Model [59] frameworks. Embodied cognition would have posited that children could learn from the haptic feedback experience due to the inherently physical interaction with the device (i.e., feeling a sensation).
Perhaps our current outcome measures are not sensitive enough to demonstrate this learning; future research should analyze videos of children playing the tablet games to determine whether learning is demonstrated in other ways. It is possible that analyzing the way children gesture during gameplay could provide a more robust learning outcome measure. Similarly, the Capacity Model would have suggested that at least the two conditions playing our target game would have outperformed the control condition on both the comprehension and transfer tasks. Part of this disconnect could be due to the fact that earlier studies on haptic feedback were done with significantly older elementary students. In terms of development, 10-, 11-, and 12-year-olds are very different from three- and four-year-olds [69]. Not only are older elementary students physically bigger, with more mature fine and gross motor skills [70], but their cognitive abilities are also greatly improved [71]. They can attend to a task longer than preschoolers and have much more refined thinking processes than young children [72].
Given this information and our findings, it is entirely possible that haptic technology is inappropriate for the preschool age group because of either their physical or cognitive developmental stages, though likely due to some combination of the two. In terms of the physical domain, it is possible that young children's fingers do not have the ability to perceive the haptic feedback that we envisioned them feeling not only because of limited surface area of their small fingers, but also due to their poorer fine motor skills.
Cognitively speaking, it is also possible that preschoolers interacting with haptics were so distracted by the novel technology that they were unable to devote their remaining cognitive energy to the educational material at hand. This phenomenon is referred to as cognitive load [14,73]. The Capacity Model could explain this cognitive load issue in part through the distance between the gameplay mechanics, narrative features, and educational content. That is, the distance between these three elements (or between any two of them) may have been too large, making it difficult for children to demonstrate learning from the experience. Perhaps the haptic feedback was a gameplay mechanic that did not offer the information necessary to bring it closer to the narrative or educational content, thereby becoming a distractor rather than an enhancer. Further, preschoolers' language development is firmly tied to their cognitive development [70]. It is also possible that even if young children noticed differences in the haptic feedback, they did not have the language necessary to make sense of this interaction and therefore did not show any difference in learning outcomes.

Lessons Learned
Now we offer the remaining findings, their contributions to the literature, and some of our lessons learned throughout the research experience.

Age
We found that age was occasionally significantly, or trending towards significantly, related to the likelihood of being correct on the comprehension and transfer tasks, even with a narrow age range. These findings generally support our second hypothesis that older children would perform better on the outcome measures. Performance was also in line with previous research on preschoolers' learning of weight and balance, in which older children generally outperformed younger children [33]. However, our hypothesis was based on the idea that this outcome would hold only in the haptic and tablet conditions; in that respect, H2 was not supported. Age was also, unsurprisingly, related to verbal ability (PNT scores) and pre-test scores. Children surely improve on these measures with age, but this does not explain why these two measures were not related to the outcome measures. Perhaps developmental age is the most important requirement for understanding this STEM concept. This claim would be in line with research suggesting that by five years old, all children are able to accurately determine weight and balance when size and weight correspond [29,33,74].

Game and Haptic Feedback
Given the literature on learning STEM concepts from tablets, we expected that the groups actively practicing the comprehension and transfer target skill on the tablet would outperform our control group, as is the case in Aladé, Lauricella [21], Schroeder and Kirkorian [22], and similar studies. We did not expect the control condition to perform similarly to the haptic and tablet conditions on the outcome measures; thus, H1 was not supported. In attempting to understand why all conditions performed similarly, we first wondered whether there was a difference in game appeal by condition. Our results demonstrate that appeal did not differ across conditions, so the appeal of the game (Bunny Balance vs. puzzle game) was not driving this non-difference. Instead of appeal, we posit that the target game might have been too far removed from the learning outcome tasks (in the same way that the puzzle game was meant to be), or that the distance between gameplay and educational content was too wide; both are concepts associated with Capacity Model 2.0 [59]. Did everyone perform similarly because of poorly designed game mechanics (in fact, the bunnies were more difficult to drag and drop than the puzzle pieces), or was it something else about the game specifically? Perhaps the drag-and-drop nature of Bunny Balance was too far removed from the language of "heavier," "lighter," and "balance," leading participants to perform essentially the same on the outcome tasks. Another possibility is that this game might not be right for teaching this particular concept. In light of our transfer task, we know that there are already 3D models that children could use to practice these skills (e.g., scale, see-saw, sorting objects, and weights). As such, these methods might be preferable to tablet games for practicing weight and balance, especially since this age group benefits from more hands-on experience with manipulatives [49].
It could be that merely dragging their fingers around on the tablet was not enough of the physical manipulation needed for this type of learning [50,51]. Future research would be wise to consider other STEM concepts better suited to haptic feedback experiences such as learning about fire, space, or animals that are unrealistic or otherwise impossible to touch in real life [62].

Other Considerations
Most studies of this nature average approximately 20 participants per condition and are able to find differences [21,75,76]. Although our sample size was similar, it is certainly possible that, unlike earlier research, this analysis would need more statistical power in order to detect very small differences between conditions. Power could be increased either by further restricting the age range or by including more participants in each condition, neither of which was possible at the time of the experiment.
We also recognize that the outcome tasks are unique to this study and that a number of other tasks could have been created to measure the learning outcomes accurately. We developed these measures with guidance from earlier research on comprehension and transfer [21,77,78], but we recommend that researchers moving forward consider other tasks that could measure these learning outcomes.

Conclusions
By experimentally testing the effect of playing a STEM game with embedded haptics on children's STEM comprehension and transfer outcomes, this work is an important step in identifying how haptic technology may be able to support STEM learning in early childhood. Even though haptic technology is still relatively novel, we are not the first to consider it as a tool for children's learning and entertainment. Indeed, Disney Research has reportedly been working on "rendering 3D tactile features on touch surfaces." However, we believe we are among the first to test STEM learning outcomes with young children. Despite null findings, this work is a significant first. It may be too soon to tell whether haptic feedback technology is the solution for bringing more young children into the STEM pipeline sooner with an engaging learning tool, but the technology does not appear to be going anywhere anytime soon.

Appendix A
START HERE
Intro: Hi! My name is _________, and I work at Northwestern. My job is to find out what kids like you think. There are no right or wrong answers, I'm just trying to see what you think. You can help me today by playing some games with me and answering some questions. We will also read a story and play a game on the tablet. Your mom or dad (and your teacher) have already said that it's okay for you to help me with this project.
[to parent] Mom/Dad, just so you know, we're really interested in seeing what the kids do independently, so no need to worry about if they're doing something right or wrong, or anything like that. Whatever they do on their own is great.
→Now, we're going to play a game using these picture cards.

Remember
Follow directions below exactly as written, reading aloud all words in bold.
Continue to Picture Naming Test Administration, only if the child names all four sample cards correctly during this Sample Administration.

Sample Items Administration Procedure
Select the four practice items from the stack: apple, baby, bear, cat.

Say, "I'm going to look at these cards and name what's in the picture. Watch what I do."
Look at and clearly name the four sample cards while the child observes.
Say, "Now you name these pictures." Show the four sample cards to the child in the same order as you named them, and give the child an opportunity to name each picture.
Praise the child for naming the picture correctly; otherwise, provide the correct picture name. If the child responds in a different language, say "This is also called a (picture name). Call it a (picture name)." Continue on to Test Administration only if the child names all four pictures correctly. Select NA on this recording form if you don't continue administration.

Test Items Administration Procedure
Shuffle stimulus cards (NOT practice items, though) before starting.
Say: "Now we're going to look at some other pictures. This time, name them as fast as you can!" Start the stopwatch and immediately show the first card to the child.
If the child does not respond within 3 s, point to the picture and say: "Do you know what that is?" or "What's that?" If the child still does not respond within an additional 2 s, show the next card.
As soon as the child names a picture, show the next card.

GAME PLAY
Now is the really fun part. You get to play a game on this tablet. Now, this tablet might feel a little bit different to you than other tablets.
If you watch me right now, I will show you how to play! You have to put your finger over the bunny, then drag it to sit directly on top of one of the seats. If the bunny disappears, do not worry, and just try again! Here, see me do it? Are you ready to play?
ASSESSMENT STIMULI
"Can you balance the see-saw?" "You balanced the see-saw, good job!"
"Can you make the red end heavier?" "You made the red end heavier, nice job!"
"Can you make the blue end heavier?" "You made the blue end heavier, good job!"
"Can you make the blue end lighter?" "You made the blue end lighter, nice going!"
"Can you make the red end lighter?" "You made the red end light, good job!"

POST-ASSESSMENT QUESTIONS
Alright, you're doing so great. We have some fun things to do next.

COMPREHENSION QUESTIONS
First, I want you to take a look at these bunnies.
→Show child the scale and bears
→Mention that bears must be put directly in the middle of the seats
→Make sure child can correctly point to yes and no images before proceeding.
Child points to appropriate symbols? Yes or No
→Show child smiley page
"Great. I might also ask you to point to one of these smiley faces. This one means a little, this one means a lot, and this one means a whole lot" (point to corresponding smiley)
"So which one means a little?" "And which one means a lot?" "And which one means a whole lot?"
→Make sure child can correctly point to images before proceeding.
Child points to appropriate symbols? Yes or No
"Okay, so thinking about the game you played on the tablet:

1. "Did you like playing with that game, yes or no?" Yes / No
IF YES, 2. "How much do you like playing with the game?" A little / A lot / A whole lot
3. "Would you like to play with the game again sometime?" Yes / No
IF YES, 4. "How much would you like to play the game again, a little, a lot, or a whole lot?" A little / A lot / A whole lot
5. "What did you like about the game you played?" (If no, "Why did you not like the game?") Record response below:
6. "This is my last question. Did you notice anything different about the game? What did the screen feel like?" Record response below:
Okay, we are all finished! Thank you so much for helping me. You did a wonderful job! I have some stickers to give you for being such a great helper.

Appendix B
Image of Pre-Test

Appendix C