Article

A Skeleton-Based Approach to Analyzing Oculomotor Behavior When Viewing Animated Characters

by Thibaut Le Naour and Jean-Pierre Bresciani
University of Fribourg, Fribourg, Switzerland
J. Eye Mov. Res. 2017, 10(5), 1-19; https://doi.org/10.16910/jemr.10.5.7
Submission received: 15 May 2017 / Published: 18 December 2017

Abstract

Knowing what people look at and understanding how they analyze the dynamic gestures of their peers is an exciting challenge. In this context, we propose a new approach to quantifying and visualizing the oculomotor behavior of viewers watching the movements of animated characters in dynamic sequences. Using this approach, we were able to illustrate, on a ‘heat mesh’, the gaze distribution of one or several viewers, i.e., the time spent on each part of the body, and to visualize viewers’ timelines, which are linked to the heat mesh. Our approach notably provides an ‘intuitive’ overview combining the spatial and temporal characteristics of the gaze pattern, thereby constituting an efficient tool for quickly comparing the oculomotor behaviors of different viewers. The functionalities of our system are illustrated through two use case experiments with 2D and 3D animated media sources, respectively.

Introduction

Body language plays an important role in human communication. It helps to convey and understand the emotions or intentions of others. The role of body language in human communication is actually so important that some social activities and sports (e.g., dance, gymnastics) are based on codified gestures, the production of which can be evaluated and judged by juries of experts. Along a similar line, human performance sometimes heavily relies on the ability to use current movement and/or gesture information to anticipate the future actions of others. This is the case in sports like tennis, boxing, or soccer, in which anticipating what opponents and/or partners are about to do is crucial. Therefore, a better understanding of how people scrutinize and analyze the gestures of other individuals can help improve human communication and performance.
By measuring oculomotor behavior while viewers watch images, natural scenes, or movies, eye tracking provides a powerful tool for better understanding viewers’ perceptual strategies and how they gather information. However, eye tracking can generate large quantities of data, which need to be processed and analyzed in order to identify and extract the most relevant information about the viewer’s oculomotor behavior. Processing eye tracking data notably entails data exploration, organization, and visualization, for instance to prepare datasets for statistical analysis and/or to communicate information to a non-expert public.
In this paper, we present a visualization tool designed to analyze and visualize eye tracking data from experiments with animated characters. We developed this tool to help the users better understand the perceptual strategies of viewers watching human-body gestures. The watched material can be in 2D (e.g., video) or 3D format (e.g., virtual reality), and our tool allows the user to process and index eye tracking data relative to movements that are semantically equivalent but do not necessarily have the same dynamics. The user can compare or aggregate data from groups of animated media sources and/or groups of viewers to spatially and temporally visualize the gaze data distribution on the different parts of the body. These operations allow the user to visualize differences (e.g., in synchronization) between viewers or groups of viewers, and compare the number and duration of the fixations. Importantly, our tool has been designed to be easily used by both experts and non-experts in eye tracking data processing.
To illustrate the functionalities of our visualization tool, we present two use case experiments with applications in the sport domain. In the first experiment (second section), official gymnastics judges were required to evaluate gymnastics sequences. In the second experiment (third section), goalkeepers had to ‘analyze’ the movements of the penalty taker during the run-up to the ball in order to assess the direction of the kick. The two use case experiments are based on two different kinds of animated format: 2D videos for the first experiment (cf. Figure 1(a1,b1,c1)) and 3D scenes in virtual reality for the second experiment (cf. Figure 1(a2,b2,c2)). In both experiments, the visual scene consists of sequences of motion of an animated character. A 3D scene represents an animated character generated by a 3D engine (e.g., Unity 3D or Unreal Engine). These sequences contain dynamic stimuli which are annotated according to the anatomical parts of the human body. As proposed by Blascheck and colleagues (Blascheck et al., 2014), we define these stimuli as Areas of Interest (AOIs).

Gaze Visualization on Animated Characters

Related Techniques

Our system of visualization relies on two concepts dedicated to the visualization of stimuli representing animated characters:
  • a heat mesh (i.e., the colored mesh illustrated in Figure 1d) is used to provide qualitative information about the gaze distribution of one or more viewers;
  • a viewer timeline is used to specifically compare the synchronization between viewers.
Blascheck and colleagues (Blascheck et al., 2014) proposed a classification of visualization systems according to their specific features and target applications. Based on their classification, our system belongs to the AOI-based techniques (as opposed to point-based techniques), and is designed to address spatio-temporal aspects (as opposed to systems addressing temporal-only or spatial-only aspects). Using their terminology, our system works on 2D and 3D dynamic stimuli and provides a visualization which is static, interactive, in 3D, on single or multiple users and not in context.
AOIs. Although our system provides a graphical output (defined by a colored mesh) similar to those of heat maps (Mackworth & Mackworth, 1958) or vertex-based mapping (Stellmach, Nacke, & Dachselt, 2010), the colored mesh is built from data previously quantified via body area annotations, instead of directly calculating the attention map from eye tracking data. These annotations (the parts of the body) are regions defined as dynamic AOIs (Poole & Ball, 2006). Areas of Interest (AOIs) constitute a classical way to describe visualized stimuli. For example, Rodrigues and colleagues (Rodrigues, Veloso, & Mealha, 2012) annotated a video recording of television news with different static areas. Closer to our study, Papenmeier and colleagues (Papenmeier & Huff, 2010) proposed a tool which analyzes virtual scenes with dynamic AOIs. Specifically, the authors created a 3D model of the scene from the video and manually annotated the most important objects and their trajectories.
Along a similar line, Stellmach and colleagues (Stellmach et al., 2010) provided a tool dedicated to the annotation of 3D scenes and the visualization of the gaze data distribution of viewers. After annotating the different objects of the scene, they proposed the concept of models of interest (MOI) to facilitate the investigation of objects in 3D scenes by mapping the data against time.
In an industrial context, several eye tracker manufacturers (e.g., Tobii, SMI) offer tools for working with dynamic stimuli.
While most of these tools work well with simple animations, they cannot be used to analyze complex skeleton-based animations, such as those involving human movement. Our tool has been specifically developed to fill this gap.
Timeline AOI visualization. Timeline visualization is used to show the temporal aspects of AOI-based data. Classically, time is represented on one axis while AOIs or viewers are represented on a second axis with separate timelines. Similar to our method, Wu (Wu, 2016) used timelines colored with references to the scene objects for each viewer. Kurzhals and colleagues (Kurzhals, Heimerl, & Weiskopf, 2014) proposed a framework unifying AOI and viewer timelines. However, understanding these unified representations becomes increasingly difficult as the number of viewers and AOIs increases.
The main benefit of our approach lies in the fact that all AOIs (i.e., parts of the body) are summarized in a unified model, relying on an articulated mesh. This representation allows us to graphically decouple the timelines of viewers for each AOI (cf. Figure 4). As we will see, our method also makes it easy to compare differences in synchronization between viewers.
Gaze behavior on characters. Although a large number of studies rely on the analysis of gaze behavior with stimuli represented by animated characters, applications allowing the exploration and visualization of the collected data are often overlooked. Bente and colleagues (Bente, Petersen, Krämer, & de Ruiter, 2001) were among the first authors to use eye tracking to quantify the time spent on animated characters. In an experiment investigating nonverbal behavior, they compared the participants’ impressions of a sequence of dyadic interactions in two different contexts: video recording vs. computer animation. To do that, given that their characters remained within a very restricted area, they recorded the time spent on six fixed areas of the two characters: upper area (head and facial activity), middle area (body, arm, and hand movements), and lower area (leg and foot movements). More recently, still investigating nonverbal behavior, Roth and colleagues (Roth et al., 2016) used annotated rectangular AOIs on videos to quantify the time spent on head and body regions.

Our Visualization System

In the following section, we first present the features of our system and then we detail how it was designed. The subsequent sections will present two use cases based on 2D and 3D media sources, respectively.
The main characteristics of our visualization system are: First, it is dedicated to the visualization, aggregation and comparison of eye tracking gaze data for one or several viewers who can be split into different groups. Second, unlike most software, it is independent of both the hardware used to collect the eye movement data and of the application used to generate the visual stimuli. Third, it can be easily used by both experts and non-experts in oculomotor research.
Input data. With our system, eye tracking data can be collected with either video or 3D media sources. The user has to provide two types of information as input to our visualization system: the experimental material, which is common to all viewers, and the recorded eye movement data of each individual viewer. This information has to be provided in XML format (see the example given in the supplementary material).
Experimental material. The experimental material consists of the viewed material and a list of media source descriptions (i.e., metadata) used for the experiment. In our case, the viewed media sources represent movements of real characters in videos and of virtual characters in 3D. All viewed media sources need to be semantically equivalent, but they do not need to have the same kinematic features. The information describing a media source can be divided into three categories:
  • Basic information. Name and duration of the media source. Note that the name is important since the user will be able to compare different media sources by class, by associating one or more parts of the name with different classes.
  • Intervals (optional). The user has the possibility to define specific intervals. In most cases, the movements that we want to analyze can be segmented into temporal sub-sequences. Specifically, a movement can be semantically or technically viewed as a concatenation of different phases, each of which can be analyzed separately, whether for interpersonal communication or in sport situations. The resulting sub-sequences are defined manually by the user according to the more general context of the analyzed gesture. The result of this partitioning is illustrated in Figure 1d.
  • Temporal mapping between media sources (optional). The user can add a temporal mapping between media sources. As mentioned before, one of the features of our system is the ability to compare the gaze distribution of various viewers on different semantically equivalent gestures. Segmentation is a first step in this direction. However, for any given piece of semantic information, the gestures present different kinematic features (in particular regarding speed and acceleration, or the beginning and the end of movements). To overcome this problem, our tool gives the user the possibility to create a temporal correspondence between any chosen reference sequence of movement and any other selected sequences. This temporal correspondence is performed by the dynamic time warping algorithm. This process is explained in detail in the third section.
Viewers’ experimental information. Viewers’ information consists of a list of captures for each viewer. Each capture is defined by three types of information:
  • Basic information. The name and the index of the capture order during the experiment. Note that the name is important since the user will be able to compare different media sources by group, by associating one or more parts of the name with different groups;
  • Eye tracking samples. This is a list of eye tracking samples. Each sample is defined by the gaze coordinates on the screen and the corresponding time;
  • Sample joint mapping. For each sample, an AOI is defined either by the name of the body part of the viewed character, or by an empty string, so the list contains one value per eye tracking sample. Regarding the naming of body parts (i.e., joints of the skeleton), our system accepts the classical nomenclatures used by major software products such as MotionBuilder, Mixamo, Unity, or Kinect. A minimal code sketch of this input structure is given after this list. In the next sections, we will explain how we calculate this sample body part mapping in 2D and 3D.
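For illustration only, the input described above could be represented by data structures along the following lines. This is a minimal Python sketch: the class and field names are our own and do not correspond to the actual XML schema (which is given in the supplementary material).

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class MediaSource:
    """One viewed media source (video or 3D animation)."""
    name: str                                   # used to group sources into classes
    duration: float                             # duration in seconds
    intervals: List[Tuple[float, float]] = field(default_factory=list)   # optional sub-sequences
    mapped_to: Optional[str] = None             # reference sequence for the DTW temporal mapping


@dataclass
class Capture:
    """One viewing of one media source by one viewer."""
    name: str                                   # used to group captures
    order: int                                  # presentation order during the experiment
    samples: List[Tuple[float, float, float]] = field(default_factory=list)   # (time, x, y) screen samples
    joint_mapping: List[str] = field(default_factory=list)   # AOI (joint name) or "" per sample


@dataclass
class Viewer:
    """All captures recorded for one viewer."""
    name: str
    captures: List[Capture] = field(default_factory=list)
```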
Visualization features. In this section, we explain how we combine, in a single representation, the spatial and temporal information related to one or several viewers.
Fixation calculation. From the list of samples related to each capture, only fixations are used to compute visualization information. Following Salvucci and Goldberg (2000), we consider a fixation as ‘a pause over informative regions of interest’ which lasts more than 100 ms and whose gaze points stay within a set spatial dispersion threshold. To extract this information from the list of samples, we use the Dispersion-Threshold Identification (I-DT) algorithm (Stark & Ellis, 1981; Widdel, 1984). Then, from the mapping between eye coordinates and intersected body parts, we compute a new fixation-joint mapping which corresponds to the fixations that intersect an AOI area on the character. From this new mapping, we obtain a fixation time T(j) for each joint. Note that if we compute the sum T = Σ_j T(j) over all joints of the skeleton, T is not equal to the global fixation duration of the sequence, since we exclude eye tracking coordinates that are too far away from the skeleton as well as data recorded during saccadic eye movements.
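For illustration, a minimal Python sketch of the dispersion-threshold idea is given below. The duration and dispersion thresholds are placeholder values expressed in the units of the input samples, not the exact parameters used in our experiments.

```python
def dispersion(window):
    """Dispersion of a window of (t, x, y) samples: (max_x - min_x) + (max_y - min_y)."""
    xs = [s[1] for s in window]
    ys = [s[2] for s in window]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))


def idt_fixations(samples, min_duration=0.1, max_dispersion=1.0):
    """Dispersion-Threshold Identification (I-DT) sketch (cf. Salvucci & Goldberg, 2000).

    samples: list of (t, x, y) tuples sorted by time.
    Returns a list of fixations as (t_start, t_end, centroid_x, centroid_y).
    """
    fixations = []
    i, n = 0, len(samples)
    while i < n:
        # Grow an initial window covering at least the minimum fixation duration.
        j = i
        while j < n and samples[j][0] - samples[i][0] < min_duration:
            j += 1
        if j >= n:
            break
        if dispersion(samples[i:j + 1]) <= max_dispersion:
            # Extend the window while the dispersion stays below the threshold.
            while j + 1 < n and dispersion(samples[i:j + 2]) <= max_dispersion:
                j += 1
            window = samples[i:j + 1]
            xs = [s[1] for s in window]
            ys = [s[2] for s in window]
            fixations.append((window[0][0], window[-1][0],
                              sum(xs) / len(xs), sum(ys) / len(ys)))
            i = j + 1            # consume the window
        else:
            i += 1               # drop the first sample and try again
    return fixations
```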
Spatial information. One of the main techniques used to visualize gaze patterns is known as heat maps. Introduced by Mackworth and Mackworth in 1958 (Mackworth & Mackworth, 1958), this simple and intuitive technique provides qualitative information about the gaze behavior of the viewers. In 3D, some studies (Maurus, Hammer, & Beyerer, 2014; Stellmach et al., 2010) used attention maps on 3D objects (object-based or surface-based). The graphical results provided by our method look like those provided by a surface-based method, but the difference between the classical studies and our method lies in the fact that we build the heat map from data previously quantified via body area annotations, instead of directly calculating the attention map with eye tracking data.
In our system, the color is defined based on the concept of skinning, commonly used in computer graphics. Skinning (Lewis, Cordner, & Fong, 2000) is a process which associates each vertex of a mesh with one or several joints of a skeleton. A weight is then attributed to each joint bound to a vertex. Usually, this technique allows a mesh controlled by a skeleton to be animated and deformed naturally. For example, the vertices located around a joint and influenced by two bones are affected by the weighted transformations of both joints. With our system, the mesh is colored according to these weights and the joints attached to each vertex. This step creates a shading between the different parts of the body. Thus, the scalar color of a vertex vc is calculated as:
vc = ( Σ_j w(v, j) · T(j) ) / TMAX    (1)
where w(v, j) is the weight which associates the vertex v with the joint j, and TMAX is a normalization constant. To provide a good overview of the distribution of the data, we propose three metrics to calculate TMAX. First, TMAX can represent the total duration of the interval concerned (i.e., fixations on and outside of the mesh, and saccadic movements). Second, TMAX can represent the sum of T(j) on the character only, during a given interval. Third, TMAX can represent the maximum value calculated over all parts of the body (during a given interval).
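The coloring step can then be sketched as a weighted sum of per-joint fixation times followed by a normalization by TMAX. This is a sketch based on the reconstruction of Equation 1 above; the exact normalization and color mapping used in the implementation may differ.

```python
import numpy as np


def vertex_heat(weights, fixation_time, t_max):
    """Scalar heat value per vertex from skinning weights and per-joint fixation times.

    weights:       (n_vertices, n_joints) skinning weight matrix w(v, j)
    fixation_time: (n_joints,) fixation duration T(j) for each joint
    t_max:         normalization constant (one of the three TMAX metrics described above)
    """
    heat = weights @ fixation_time            # sum_j w(v, j) * T(j) for every vertex
    return np.clip(heat / t_max, 0.0, 1.0)    # scalar color in [0, 1], mapped to the heat palette


# Multi-viewer mode (described below): average the per-joint fixation durations
# across viewers before applying the same formula.
# fixation_time = np.stack(per_viewer_times).mean(axis=0)
```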
Figure 1d and Figure 2 (Output/left column) show examples of the mesh coloring for various viewers with a dynamic motion picture, partitioned into sub-sequences (Figure 1d).
In the case of multiple viewers, the mesh color is obtained by averaging the fixation durations Tviewer(j) (for each viewer) and applying Equation 1. The result is illustrated in Figure 2 (left column at the top).
Our skinning definition contains 51 joints divided into: 5 joints for the body trunk, 2 joints for the neck and the head, 4 joints for each leg, 3 joints for each arm, 15 joints for the fingers of each hand.
Timeline visualization. In line with recent work on viewer timeline visualization (Kurzhals et al., 2014; Wu, 2016), our system represents spatial and temporal features of viewers’ gaze patterns in a single figure. The originality of our contribution lies in the decoupling of areas into separated blocks of timelines. Thus, for each block, we present separate timelines stacked horizontally, as well as fixation percentages by viewer and watched body part.
As illustrated in Figure 2, we propose various options to visualize the temporal information. First, as for spatial information, only fixations are used to compute the temporal visualization. Second, we distinguish two modes of timeline visualization: single or aggregated. In the single mode, each timeline represents one motion picture. The distribution of the watched body part is represented by two colors: red when the region is watched and blue when it is not. In the aggregated mode, a timeline represents the gaze distribution over several motion pictures. Here we use a ‘heat’ line to represent the information. If the motion pictures to aggregate are the same (i.e., we want to aggregate various occurrences of the same motion), we simply average the set of occurrences for each time step of the captured gaze pattern. If the motion pictures/sequences are not the same (but semantically equivalent), we use the correspondence mapping previously introduced to aggregate them. Specifically, for each viewer, the color of each value on the timeline is normalized by the maximum value calculated among all visualized parts (for a given interval).
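A minimal sketch of the aggregated mode, assuming each occurrence has already been resampled (or DTW-mapped) to a common number of time steps:

```python
import numpy as np


def aggregate_timelines(timelines):
    """Aggregate several binary 'watched' timelines of one AOI into a single heat line.

    timelines: (n_occurrences, n_steps) array, 1 where the AOI is fixated, 0 otherwise.
    Returns, for each time step, the fraction of occurrences during which the AOI
    was fixated (values in [0, 1]).
    """
    return np.asarray(timelines, dtype=float).mean(axis=0)


def normalize_across_parts(heat_lines):
    """Normalize a viewer's heat lines by the maximum value found over all body
    parts within the selected interval, as described above."""
    heat_lines = np.asarray(heat_lines, dtype=float)    # (n_parts, n_steps)
    peak = heat_lines.max()
    return heat_lines / peak if peak > 0 else heat_lines
```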
As explained before, the user can choose to visualize several media sources and/or several viewers, and each media source can be represented by several intervals. So, considering the two modes of visualization previously introduced, several visualization possibilities are available. For one viewer and a given interval, it is possible to display, for each body part, either all captures, or the aggregation of all occurrences of the same media source, or the aggregation of all media sources. Additionally, the user can filter or compare various media sources. For several viewers and a given interval, the options are the same, but the user can also compare or filter the viewers.
AOIs/Fixations visualization. The last proposed feature concerns the visualization of the “time spent” on body parts (Figure 2 (Output/bottom of the right column)). Considering a given interval, for each AOI, it is possible to display the number and the average duration of fixations, as well as the number and duration of the sum of adjacent fixations on this AOI. The selecting options for AOIs/Fixations visualization are the same as for timeline visualizations.

2D Use Case: Evaluation of Gymnastics Judges

Eye tracking plays a central role in developing a better understanding of the relationship between gaze behavior and decision making in sports. In this context, the aim of this experiment was twofold: first, to investigate how gymnastics judges analyze a gymnastic gesture, and second, to determine whether there are differences in the gaze patterns of judges with different levels of expertise.
While eye tracking is often used in sport applications to analyze perceptual-cognitive skills, sports officials are often overlooked. Hancock and colleagues (2013) recently performed an experiment assessing the gaze behavior, decision accuracy, and decision sensitivity of ice hockey officials. In gymnastics, Bard et al. (1980) conducted a study analyzing the gaze pattern of gymnastics judges. They specifically measured the number and location of ocular fixations and found that experts had fewer fixations of longer duration. This has been confirmed by several other studies (Williams, Davids, Burwitz, & Williams, 1994), including in handspring vaults (Page, 2009) and in rhythmic gymnastics (Korda, Siettos, Cagno, Evdokimidis, & Smyrnis, 2015).
In our first use case experiment, official gymnastics judges had to evaluate and mark the performance of gymnasts on the horizontal bar. Video footage of the gymnasts’ performances was shown to the judges and we recorded their eye movements. This kind of animation material constituted an excellent benchmark for our system because: (i) each sequence can be segmented into several phases; (ii) the movement is rich in information (the character uses all parts of his/her body and rotates around him/herself); and (iii) finding out whether oculomotor behavior differs between participants, and trying to link these behaviors with the attributed marks, has functional relevance.

Material Preparation

To work with video sequences, we created an application to annotate them manually by adding a skeleton which follows the video character, i.e., the AOIs are defined by the skeleton (Figure 1b shows an example of the 2D skeleton mapped onto the related video material). The skeleton creation and annotation were performed by one user, a member of the research team. In this experiment, the skeleton contained 26 joints and 30 segments, each set up with a width and a depth level. The depth level determines which segments are displayed on the front plane in case of ‘overlap’, thereby managing masking between objects. The number of joints, corresponding to a low-resolution skeleton (fewer than the 51 joints introduced in the previous section), was arbitrarily defined to simplify the task for the user. This aspect constitutes a direct limitation of the video format, i.e., it is impossible to obtain a high resolution of the underlying skeleton. The skeleton annotation required approximately 1 hour for every 30-second sequence of video. On average, the user annotated a skeleton every four frames, the remaining animation being automatically computed by interpolation between the annotated frames.
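The interpolation between annotated frames can be sketched as follows, assuming simple linear interpolation of the 2D joint positions (the actual annotation tool may use a different interpolation scheme):

```python
import numpy as np


def interpolate_pose(key_frames, key_poses, frame):
    """Linearly interpolate 2D joint positions between manually annotated key frames.

    key_frames: sorted array of annotated frame indices (e.g., every fourth frame)
    key_poses:  (n_keys, n_joints, 2) array of annotated joint positions
    frame:      frame index for which a pose is required
    """
    key_frames = np.asarray(key_frames)
    key_poses = np.asarray(key_poses, dtype=float)
    k = np.searchsorted(key_frames, frame)
    if k == 0:
        return key_poses[0]                     # before the first key frame
    if k >= len(key_frames):
        return key_poses[-1]                    # after the last key frame
    f0, f1 = key_frames[k - 1], key_frames[k]
    alpha = (frame - f0) / (f1 - f0)            # relative position between the two key frames
    return (1 - alpha) * key_poses[k - 1] + alpha * key_poses[k]
```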
Figure 1 (upper part) provides an overview of the experiment.
Eye gaze-sample joint mapping. As mentioned before, to calculate the gaze data distribution, the visualized body part must be provided for each sample. The sample joint metric that we used is similar to existing metrics used to calculate the mean gaze duration and the proportion of time spent on each dynamic AOI (Jacob & Karn, 2003), here represented by the areas linked to the joints of the skeletons (see Figure 3). For each frame of the sequence, we measured the eye tracking coordinates on the screen ck = (ckx, cky), where k represents the current frame. The associated joint is determined by a function f which calculates the shortest distance between the bones of the skeleton and the coordinates ck. As illustrated in Figure 3, f excludes coordinates that are too far away from the skeleton. This exclusion distance is set directly by the user according to the accuracy of the eye tracker used; in our case, the accuracy of the system allowed us to use a distance of 0.25° to 0.5°. If the coordinates are valid, f selects the best-fitting bone, depending on the shortest distance between the skeleton bones and the coordinates ck and taking into account the depth hierarchy between joints.
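For illustration, the mapping function f can be sketched as a nearest-bone search with an exclusion distance. The depth handling below is a simplification of the annotation tool's depth hierarchy, and the convention that a lower depth value means 'in front' is an assumption.

```python
import numpy as np


def point_segment_distance(p, a, b):
    """Shortest distance between point p and the segment [a, b] (all 2D)."""
    ab = b - a
    t = np.clip(np.dot(p - a, ab) / max(np.dot(ab, ab), 1e-12), 0.0, 1.0)
    return float(np.linalg.norm(p - (a + t * ab)))


def map_sample_to_bone(gaze_xy, bones, max_dist):
    """Assign a gaze sample to the closest skeleton bone, or None if it is too far.

    gaze_xy:  (x, y) gaze coordinates for the current frame
    bones:    list of (name, depth, p0, p1), p0/p1 being the 2D joint positions
    max_dist: exclusion distance, in the same units as the coordinates
    """
    g = np.asarray(gaze_xy, dtype=float)
    best = None                                     # (distance, depth, name)
    for name, depth, p0, p1 in bones:
        d = point_segment_distance(g, np.asarray(p0, float), np.asarray(p1, float))
        if d > max_dist:
            continue                                # sample too far from this bone
        # Keep the closest bone; equal distances are resolved by the depth level.
        if best is None or (d, depth) < (best[0], best[1]):
            best = (d, depth, name)
    return best[2] if best else None
```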

Experiment

We used the SR Research Eyelink 1000 Plus eye tracker to record eye movement data, sampling at 1000 Hz. The angular distance between the two adjacent ‘closest’ joints was 3.05° and the distance between the two adjacent ‘most distant’ joints was 8.1°.
Participants. Eighteen female gymnastics judges participated in the experiment. Nine judges had a Swiss level B.1 (Mage = 24.5 years, SD = 2.3 years) and the other nine a Swiss level B.2 (Mage = 32.8 years, SD = 5.6 years). The B.2 judges had more experience and expertise than the B.1 judges (they need to judge 6 competitions before starting the B.2 training, which lasts one year).
Procedure and Design. Each judge was presented with 9 videos of gymnasts performing a movement at the horizontal bar. The videos were filmed with 3 gymnasts of different levels (C5, C6 and C7 of the Swiss Gymnastic Federation) who performed 3 occurrences of the same gesture. The point of view of the videos corresponded to the placement of judges during a competition. Each video lasted about 30s (~900 samples with a frequency of 30 frames/second and ~200 annotated skeletons). During the experiment, viewers were placed in an ecological situation (i.e., the distance between the viewer and the video character corresponded to what occurs during a competition). Two repetitions of each video were presented, for a total of 18 videos per judge. For each judge, the order of presentation of the videos was counterbalanced using a Latin square.
Results. Figure 4 shows the results of randomly selected sub-sequences for two different kinds of body part: a body part that was often looked at (the hips) and body parts that were seldom looked at (the right and left foot). First, we can note that the heat mesh clearly illustrates the areas of interest gazed at in the selected sub-sequences for one or more viewers.
Concerning global fixation, as expected for gymnastics experts, our results show that the body part that was the most scrutinized by the judges was the hips area, which is a central ‘anchor’ to evaluate performance in gymnastics.
Concerning the timeline block visualization, Figure 4 clearly illustrates the differences in synchronization between viewers for a given sample. Specifically, a viewer’s timeline represents a sub-sequence of the movement and is linked to a body segment/area. The red color indicates the time slots during which a given segment/area is gazed at. It is particularly interesting to notice that, for certain areas, the majority of judges are ‘synchronized’ when fixating peripheral areas. After verification, these areas correspond to movements that were not executed correctly, i.e., movements that can be considered as ‘artifacts’. Therefore, our system allowed us to easily detect mistakes or incorrectly executed movements, as well as elements and details that were missed by a judge. Gaze patterns related to the ‘left foot’ and ‘right foot’ evidence shared oculomotor ‘strategies’ between viewers, as several of them show synchronized fixations. The percentages associated with each timeline confirm the observed tendencies. A comparison of the two groups of judges, with a general overview of gaze and fixation distribution, is presented in Appendix Figure 8.
For illustration purposes, we assessed whether the level of expertise of the judges could affect the synchronization between viewers for the body parts that were infrequently scanned, namely less than 1% of the total scanning time. We considered that these fixations likely corresponded to detected errors. We expected the judges to look at these parts simultaneously because they detected the errors when they occurred. We also expected more expert judges to show more synchronization in this error-detection process. To assess that, we first computed the total overlap duration for these body parts, i.e., the total duration during which at least a third of the judges were simultaneously watching these body parts. We then compared the overlap duration between the two groups of expertise.
The more experienced judges (B.2 level) had an average overlap of MoverlapB2 = 83 ms, SDoverlapB2 = 78 ms, whereas that of the less experienced judges (B.1 level) was MoverlapB1 = 76 ms, SDoverlapB1 = 49 ms. However, a Welch two sample t-test (data normally distributed and homogeneous variance between groups) indicated that this difference between the groups was non-significant (t(14) = −0.222, p = 0.827). Once again, this comparison was performed for illustration purposes, and many more tests could be run on the data depending on the specific question at hand.
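A sketch of how the overlap duration and the group comparison could be computed is given below; it is illustrative only, and the function and variable names are ours.

```python
import numpy as np


def overlap_duration(watch_matrix, dt, min_fraction=1 / 3):
    """Total duration during which at least `min_fraction` of the judges simultaneously
    fixate the rarely scanned body parts.

    watch_matrix: (n_judges, n_steps) boolean array, True when a judge fixates one of
                  the selected body parts at that time step
    dt:           duration of one time step, in ms
    """
    watching = np.asarray(watch_matrix, dtype=float).mean(axis=0) >= min_fraction
    return watching.sum() * dt


# Group comparison, e.g. with a Welch two-sample t-test as reported in the text:
#   from scipy import stats
#   t, p = stats.ttest_ind(overlaps_b2, overlaps_b1, equal_var=False)
```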

3D Use Case: Prediction of the Direction of a Penalty Kick

In the second use case experiment, goalkeepers had to scrutinize the run-up of the penalty taker to try to determine ahead of the kick in which direction the ball would be kicked. This experiment was conducted in virtual reality.
Sport applications based on virtual reality technology are becoming increasingly common. In particular, VR is very useful for analyzing players’ and athletes’ behavior or for improving sensorimotor learning. For example, Huang (Huang, Churches, & Reilly, 2015) proposed an application to improve American football performance based on gameplays created by coaches. A survey dedicated to the use of virtual reality to analyze sport performance is given by Bideau and colleagues (Bideau et al., 2010). In our use case experiment, the motion picture of the penalty taker was in 3D format, namely a 3D animation. As explained by Bideau and colleagues, this format presents several advantages over video playback. In particular, it allows for easy editing of the scenes and for embedding 2D and 3D information. It also allows the movement to be flexibly edited to modify the dynamics, the orientation of the joints, the display of certain parts of the body, or the juxtaposition of several movements. Concerning the combined use of virtual reality and eye tracking for the analysis of animated characters, Bente et al. (2001) were among the first to compare the effects of videotaped nonverbal interactions vs. computer animations of the same behavior on perception. They found that the two media sources give rise to similar results. In their study, the animation of the virtual characters was based on interpolations between wire-frame models which did not move. Since then, several studies have combined virtual reality and eye tracking for research on social interactions. For example, Lahiri et al. (2011) used virtual reality to provide dynamic feedback during social interactions, with the aim of better understanding the gaze patterns of adolescents with autism spectrum disorders. To visualize the data, they used the classical scanpath and AOI techniques as introduced by Rodrigues et al. (2012). In another study, Bente et al. (2007) investigated the length of gaze fixations during virtual face-to-face interactions. More recently, Roth et al. (2016) analyzed the impact of body motion and emotional expressions in faces on emotion recognition. They compared the visual attention patterns between the face and the body area and found that humans predominantly judge emotions based on the facial expression. In another context, Wilms et al. (2010) used the gaze behavior of viewers interacting with a virtual character to modulate the facial expressions of the virtual character (i.e., the facial expressions of the virtual character depended on the gaze behavior of the viewer).
As mentioned in the previous section, eye tracking technology is often used to analyze gaze behavior in order to better understand decision making in athletes, coaches, and officials. Concerning the striker-defender opposition, several studies (e.g., in tennis (Goulet, Bard, & Fleury, 1989) or in soccer (Mann, Williams, Ward, & Janelle, 2007; Williams et al., 1994)) have shown that expert players make faster decisions, produce more correct responses, and use fewer fixations. Hancock and Ste-Marie (2013) suggested that fixation duration depends on the type of sport and the experimental conditions. Specifically, shorter durations are usually observed for interception (e.g., racket sports) and strategic sports (e.g., team sports), whereas longer durations are found in closed-skill sports. This has been confirmed by Woolley et al. (2015), who investigated goalkeeper anticipation during a penalty kick and observed fewer fixations of longer duration on fewer locations. However, to our knowledge, no study has yet explored the differences and common synchronizations between viewers or groups of viewers.
Anticipating the direction of the ball during a football penalty kick is a topic which has often been investigated. For instance, Savelsbergh et al. (2005) have shown that goalkeepers were more successful in anticipating the direction of the kick when fixating the stance leg (i.e., the non-kicking leg) prior to foot-to-ball contact. In contrast, Woolley et al. (2015) suggested that, when trying to predict the direction of the kick, goalkeepers could use a global perceptual approach by extracting information cues from various body segments (e.g., kicking leg, stance leg, hips) rather than focusing on one particular body area. In our use case experiment, we assessed whether there are differences between novice and expert goalkeepers regarding the visual scanning strategy used to anticipate the direction of the kick, and how those differences affect estimation performance.

Experiment

The steps usually involved in the ‘production’ of a motion picture in 3D format and its use in a virtual reality setup are illustrated in Figure 5. As shown, several pre-processing steps are required. The preparation of the 3D scene is decomposed into three steps: the capture of the actors (1.(a)), the creation of an animated skeleton (1.(b)), and the creation and binding of a mesh to this skeleton (1.(d)).
To capture the gestures of the actors, we used an Optitrack system with 12 infrared cameras sampling at 240 Hz. The actors were dressed in a suit equipped with 49 markers. The animation of the skeleton (composed of 51 joints) was automatically computed by the Motive software. To create a mesh and bind it to the animated skeleton, we used the Mixamo and MotionBuilder software, respectively. Mesh creation takes about fifteen minutes, and five additional minutes are needed for every gesture. Finally, the virtual scene was created and rendered with Unity3D.
Twenty penalty kick gestures were motion-captured with five football players having on average fifteen years of experience in football. Each player performed four gestures, namely two kicks to the right side and two kicks to the left side. Each gesture included the full sequence of a penalty kick: placing the ball, preparation, run-up, and ball kick. For each trial, the whole sequence was presented to the viewer, but only the run-up and the kick phase were of interest for the analysis. Because the original sequences had different durations, the Dynamic Time Warping (DTW) algorithm was used to create a correspondence between the sequences. Following this operation, all sequences ‘fed’ to our visualization system are mapped to the duration of the reference sequence.
DTW. When the movements do not all have the same duration, and when the gestures have different dynamics and characteristics (e.g., a different number of footsteps during the run-up in our experiment), it is impossible to establish a linear mapping between the motion pictures. To temporally map the different motion pictures, we used the dynamic time warping algorithm introduced by Berndt and Clifford (1994) for time series and applied to computer animation by Bruderlin and Williams (1995). This algorithm calculates an optimal match between two given gestures. First, the algorithm calculates the matrix of distances (see Figure 6) between the postures of the two gestures. Next, it calculates the optimal path corresponding to the best temporal warping to align the two sequences.
In this study, we applied the algorithm to the analysis of 3D animations by using the distance function
D(KA(i), KB(j)) = Σ_{k=1..n} d(θA(i,k), θB(j,k))
In this equation, KA and KB are the two skeletons being compared, i and j are the indices of the postures in sequences A and B, n is the number of joints of the skeleton, and θ is the angle (represented by a quaternion) of the current joint. d = Log(θA⁻¹ θB) represents the geodesic distance (Buss & Fillmore, 2001) between two quaternions θA and θB. In Figure 6, D is represented by the black square. This procedure allows the comparison of various gestures. The next section explains how we calculated the time spent on each part of the body.
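For reference, a textbook formulation of the DTW step on a precomputed posture-distance matrix looks as follows. This is a sketch, not the authors' implementation, and it assumes the matrix of posture distances D has already been computed as described above.

```python
import numpy as np


def dtw(cost):
    """Dynamic time warping over a precomputed posture-distance matrix.

    cost: (len_A, len_B) matrix of distances between the postures of sequences A and B
          (e.g., summed geodesic distances between corresponding joint quaternions).
    Returns the accumulated-cost matrix and the optimal warping path as (i, j) pairs.
    """
    n, m = cost.shape
    acc = np.full((n, m), np.inf)
    acc[0, 0] = cost[0, 0]
    for i in range(n):
        for j in range(m):
            if i == 0 and j == 0:
                continue
            prev = min(acc[i - 1, j] if i > 0 else np.inf,
                       acc[i, j - 1] if j > 0 else np.inf,
                       acc[i - 1, j - 1] if i > 0 and j > 0 else np.inf)
            acc[i, j] = cost[i, j] + prev
    # Backtrack from the last cell to recover the warping path.
    i, j = n - 1, m - 1
    path = [(i, j)]
    while (i, j) != (0, 0):
        candidates = []
        if i > 0 and j > 0:
            candidates.append((acc[i - 1, j - 1], (i - 1, j - 1)))
        if i > 0:
            candidates.append((acc[i - 1, j], (i - 1, j)))
        if j > 0:
            candidates.append((acc[i, j - 1], (i, j - 1)))
        i, j = min(candidates)[1]
        path.append((i, j))
    return acc, path[::-1]
```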
Eye gaze-sample joint mapping. After collecting the gaze data, we used the OpenGL library (Woo, Neider, Davis, & Shreiner, 1999) within Unity3D to map the eye coordinates to the objects displayed in the scene. Specifically, we used a ray casting algorithm to project a 3D ray from the gaze screen coordinates through the camera into the scene, and then checked whether that ray intersected any body parts. For each gaze sample collected at a specific time, we played the scene and used this algorithm to map the sample to a body part.
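Conceptually, this amounts to casting a ray from the camera through the gaze point and keeping the first body part it hits. The sketch below illustrates the idea with spheres around the joints as stand-in colliders; this is an assumption made for illustration, since the actual system intersects the ray with the character's colliders inside Unity3D.

```python
import numpy as np


def gaze_to_body_part(ray_origin, ray_dir, joint_spheres):
    """Return the name of the first body part hit by the gaze ray, or None.

    ray_origin, ray_dir: camera position and gaze direction in world coordinates
    joint_spheres:       list of (joint_name, center, radius) approximating the body parts
    """
    ray_dir = np.asarray(ray_dir, dtype=float)
    ray_dir = ray_dir / np.linalg.norm(ray_dir)
    best_t, best_name = np.inf, None
    for name, center, radius in joint_spheres:
        oc = np.asarray(center, dtype=float) - np.asarray(ray_origin, dtype=float)
        t_closest = np.dot(oc, ray_dir)                # projection of the center onto the ray
        if t_closest < 0:
            continue                                   # sphere is behind the camera
        d2 = np.dot(oc, oc) - t_closest ** 2           # squared distance between ray and center
        if d2 > radius ** 2:
            continue                                   # the ray misses this sphere
        t_hit = t_closest - np.sqrt(radius ** 2 - d2)  # distance to the first intersection
        if t_hit < best_t:
            best_t, best_name = t_hit, name
    return best_name
```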
Participants. Ten football players with between ten and twenty years of football experience (Mexp = 15.4 years) participated in the experiment.
Five of these players were expert goalkeepers (Mage = 23.5 years, SD = 1.7 years), whereas the other five were expert field players (Mage = 23.9 years, SD = 3 years) without any goalkeeping experience.
Procedure and Design. The participants were presented with two repetitions of the twenty different penalty kick sequences, for a total of forty sequences per participant. The order of presentation of the sequences was fully randomized for each participant. Because the orientation of the head before the run-up sometimes provided important information regarding the future direction of the kick, the animation of the head was altered so that it did not provide any information regarding the kick. This modification was implemented because we were interested in the gaze pattern of the participants during the run-up phase. For each trial, participants had to estimate whether the kicker would strike to the right or to the left of the goal, and give their response as fast as possible (i.e., as soon as they made their decision) by pressing a response key.

Results

Regarding anticipation performance, we compared the results of the expert goalkeepers (GK) with those of the field players (FP), both in terms of the percentage of anticipation errors and of response time. On average, goalkeepers had an error rate of MerrorGK = 38%, SDerrorGK = 9.3, whereas the field players had an error rate of MerrorFP = 38.5%, SDerrorFP = 8.9. A Welch two sample t-test (data normally distributed and homogeneous variance between groups) indicated that this difference between the two levels of expertise was non-significant (t(8) = 0.086, p = 0.932). Regarding response time, goalkeepers responded on average MtimeGK = 13 ms, SDtimeGK = 172 ms before ball contact, whereas the field players responded on average MtimeFP = 14 ms, SDtimeFP = 126 ms before ball contact. Once again, a Welch two sample t-test (data normally distributed and homogeneous variance between groups) indicated no significant difference between the two levels of expertise (t(8) = −0.004, p = 0.996).
Concerning gaze behavior, Figure 7 displays the distribution of gaze fixations recorded during the run-up of the kicker. Field players are characterized by a more diffuse visual scanning of the body, and particularly of the head, as compared to expert goalkeepers, who focused their gaze primarily on the supporting leg. These results are in line with previous work on this topic, although the setup and methodology were different (Savelsbergh et al., 2005). However, after analyzing the graphical results of the three best expert participants, and unlike in the gymnastics study, no obvious gaze pattern could be established.
In line with the results reported in previous studies, and for illustration purposes, we compared the gaze behavior of expert goalkeepers vs. field players. In particular, we focused on the total duration of fixation as well as the total number of fixations on the lower part of the supporting leg. For this analysis, we excluded the participants that did not look at this body part at all. In total, goalkeepers spent MdurationGK = 3763 ms, SDdurationGK = 2670 ms fixating the lower supporting leg, whereas the field players only spent MdurationFP = 211 ms, SDdurationFP = 97 ms. However, a Welch two sample t-test (data normally distributed and homogeneous variance between groups) indicated that this difference failed to reach significance (t(3.0105) = 2.6581, p = 0.076).
Regarding the total number of fixations, the goalkeepers performed, on average, many more fixations (MnumberGK = 11, SDnumberGK = 7.9) than the field players (MnumberFP = 1.33, SDnumberFP = 0.58). But, as for fixation duration, this difference failed to reach significance, as indicated by a Wilcoxon rank sum test (data normally distributed but heterogeneous variance between groups, W = 11.5, p = 0.0718).
Though non-significant, our results regarding fixation duration are consistent with those reported in previous studies. Indeed, the experts had longer fixation durations on the supporting leg. However, we also found that experts had a greater number of fixations, which is at odds with previously reported results showing the opposite pattern (appendix Figure 9). This discrepancy is intriguing and should be investigated more specifically in future research. In particular, it might be interesting to compare more systematically 3D stimuli and 2D video media sources to test to which extent they affect the gaze pattern of viewers.

Conclusion and Discussion

In this paper, we presented an interactive 3D tool, based on AOI data, that allows for spatio-temporal investigation of large data sets of recorded eye movements. In particular, this tool allows the user to manage and analyze the gaze pattern of several viewers on several animated sequences. This tool is independent of the hardware used to record the eye movements and of the application used to generate the visual scene/experimental stimuli, making it more flexible for general usage.
Specifically, we presented a new approach to visualizing the oculomotor behavior of viewers watching the movements of animated characters in dynamic sequences. This approach allows us to illustrate the gaze distribution of one or several viewers, i.e., the time spent on each part of the body, on a ‘heat mesh’. This is in line with previous work on heat maps (Blascheck et al., 2014) and surface-based attention maps (Stellmach et al., 2010). Associated with this approach, we also proposed a new way to visualize viewer timelines using blocks of timelines linked to the heat mesh. As with classical systems, our system allows the user to visualize the ‘heat’ information (Maurus et al., 2014; Stellmach et al., 2010) and to obtain a good overview of the observed AOIs (Kurzhals & Weiskopf, 2013; Maurus et al., 2014). To our knowledge, the system presented here is the first that proposes to visualize the spatio-temporal features of the gaze patterns of several viewers watching animated characters within a unified figure. As with the interactive applications proposed by Stellmach (Stellmach et al., 2010) and Maurus (Maurus et al., 2014), our tool allows the user to visualize the spatial distribution on 3D objects. However, we introduce a graphical link between the timelines and the spatial information (Kurzhals & Weiskopf, 2013; Maurus et al., 2014; Stellmach et al., 2010), and provide a clear visualization of the synchronization and overlap between viewers. Our tool also allows the user to directly export the observed features for statistical analysis. These differences are summarized in Table 1.
While our model is not specifically dedicated to the analysis of AOI order (scanpath), the two use case experiments highlight its advantages. Regarding the first use case experiment, whereas classical studies in this domain focused on the number and duration of fixations, our system allowed us (1) to observe whether a gaze pattern exists, not by using scanpaths but by comparing the synchronization and overlap between viewers’ timelines, and (2) to easily compare the synchronization between viewers. In the second use case experiment, our tool allowed us to clearly see the difference between expert goalkeepers and field players regarding the time spent on the different body parts. More interestingly, it allowed us to identify that the gaze behavior on certain body parts is not consistent with the literature.
These two use case experiments demonstrate that our system is an efficient tool to quickly compare the oculomotor behavior of different viewers, and notably to identify the synchronization (or the lack of it) between viewers for each dynamic area of interest. In this respect, we believe that our system is ideal for users who want to quickly and easily compare the gaze pattern of different viewers or groups of viewers.
Therefore, our system could serve as an excellent support for experiments dedicated to nonverbal behavior analysis, as for example in the study of Roth and colleagues (Roth et al., 2016) who analyzed how viewers look at avatars displaying affective behavior. Our system would notably have allowed these authors to compare the time spent on the body and the head, respectively.
Another straightforward application of our system is in research on cognition and learning in sport. Specifically, our system could be used to provide feedback to novices to help them better understand the difference between their strategy and that of experts, and to adapt their strategy accordingly to improve their performance. In other words, we believe that our system is particularly well suited for improving performance in applications in which visual scanning strategies play a key role, all the more so as it is usable by a non-expert public.
In future work, we plan to increase the number of AOIs by including the different areas of the face. Moreover, we plan to include an automatic analysis of the data, allowing significant differences between groups of viewers and/or groups of media sources to be identified (using t-tests or ANOVA). Concerning potential applications, after using our system in the sport domain (e.g., for judging or evaluating interactions between players), we plan to test it in conversational situations, for example with sign language.

Ethics and Conflict of Interest

The author(s) declare(s) that the contents of the article are in agreement with the ethics described in http://biblio.unibe.ch/portale/elibrary/BOP/jemr/ethics.html and that there is no conflict of interest regarding the publication of this paper.

Appendix A

Figure 8. Gymnastic judgment. Top: Illustration of gaze distribution average for the two groups of judges (level B1 and B2). Bottom: Illustration of viewers’ fixations for each group of judges.
Jemr 10 00032 g008
Figure 9. Penalty anticipation. Top: Illustration of gaze distribution average for the two groups of viewers (expert goalkeepers (Expert) and field players (Novice)). Bottom: Illustration of viewers’ fixations for each group.
Jemr 10 00032 g009
Figure 10. Graphical User Interface: Overview of the commands.
Jemr 10 00032 g010
Figure 11. Example of input file.
Jemr 10 00032 g011

References

  1. Bard, C., M. Fleury, L. Carriere, and M. Halle. 1980. Analysis of Gymnastics Judges Visual Search. Research Quarterly for Exercise and Sport 51: 267–273. [Google Scholar] [CrossRef] [PubMed]
  2. Bente, G., F. Eschenburg, and N. Krämer. 2007. Virtual Gaze. A Pilot Study on the Effects of Computer Simulated Gaze in Avatar-Based Conversations. Virtual Reality, 185–194. [Google Scholar] [CrossRef]
  3. Bente, G., A. Petersen, N. Krämer, and J. P. de Ruiter. 2001. Transcript-based computer animation of movement: Evaluating a new tool for nonverbal behavior research. Behavior Research Methods, Instruments, & Computers 33: 303–310. [Google Scholar] [CrossRef]
  4. Berndt, D. J., and J. Clifford. 1994. Using dynamic time warping to find patterns in time series. KDD workshop. [Google Scholar]
  5. Bideau, B., R. Kulpa, N. Vignais, S. Brault, F. Multon, and C. Craig. 2010. Using Virtual Reality to Analyze Sports Performance. IEEE Computer Graphics and Applications 30: 14–21. [Google Scholar] [CrossRef]
  6. Blascheck, T., K. Kurzhals, M. Raschke, M. Burch, D. Weiskopf, and T. Ertl. 2014. State-of-the-Art of Visualization for Eye Tracking Data. EuroVis—STARs: The Eurographics Association. [Google Scholar] [CrossRef]
  7. Bruderlin, A., and L. Williams. 1995. Motion Signal Processing. In Proceedings of the 22Nd Annual Conference on Computer Graphics and Interactive Techniques. ACM. [Google Scholar] [CrossRef]
  8. Buss, S. R., and J. P. Fillmore. 2001. Spherical Averages and Applications to Spherical Splines and Interpolation. ACM Trans. Graph. 20: 95–126. [Google Scholar] [CrossRef]
  9. Goulet, C., C. Bard, and M. Fleury. 1989. Expertise Differences in Preparing to Return a Tennis Serve: A Visual Information Processing Approach. Journal of Sport and Exercise Psychology 11: 382–398. [Google Scholar] [CrossRef]
  10. Hancock, D. J., and D. M. Ste-marie. 2013. Gaze behaviors and decision making accuracy of higher-and lower-level ice hockey referees. Psychology of Sport & Exercise 14: 66–71. [Google Scholar] [CrossRef]
  11. Huang, Y., L. Churches, and B. Reilly. 2015. A Case Study on Virtual Reality American Football Training. In Proceedings of the 2015 Virtual Reality International Conference. ACM. [Google Scholar] [CrossRef]
  12. Jacob, R. J., and K. S. Karn. 2003. Eye tracking in human-computer interaction and usability research: Ready to deliver the promises. Mind 2: 4. [Google Scholar] [CrossRef]
  13. Korda, A., C. Siettos, A. D. I. Cagno, I. Evdokimidis, and N. Smyrnis. 2015. Judging the Judges’ Performance in Rhythmic Gymnastics. 640–648. [Google Scholar] [CrossRef]
  14. Kurzhals, K., F. Heimerl, and D. Weiskopf. 2014. ISeeCube: Visual Analysis of Gaze Data for Video. In Proceedings of the Symposium on Eye Tracking Research and Applications. ACM. [Google Scholar] [CrossRef]
  15. Kurzhals, K., and D. Weiskopf. 2013. Space-time visual analytics of eye-tracking data for dynamic stimuli. IEEE Transactions on Visualization and Computer Graphics 19: 2129–2138. [Google Scholar] [CrossRef] [PubMed]
  16. Lahiri, U., A. Trewyn, Z. Warren, and N. Sarkar. 2011. Dynamic eye gaze and its potential in virtual reality based applications for children with autism spectrum disorders. Autism-Open Access 1. [Google Scholar] [CrossRef]
  17. Lewis, J. P., M. Cordner, and N. Fong. 2000. Pose Space Deformation: A Unified Approach to Shape Interpolation and Skeleton-driven Deformation. In Proceedings of the 27th Annual Conference on Computer Graphics and Interactive Techniques. ACM Press/Addison-Wesley Publishing Co. [Google Scholar] [CrossRef]
  18. Mackworth, J. F., and N. H. Mackworth. 1958. Eye Fixations Recorded on Changing Visual Scenes by the Television Eye-Marker. J. Opt. Soc. Am. 48: 439–445. [Google Scholar] [CrossRef] [PubMed]
  19. Mann, D. T. Y., A. M. Williams, P. Ward, and C. M. Janelle. 2007. Perceptual-Cognitive Expertise in Sport: A Meta-Analysis. Journal of Sport and Exercise Psychology 29: 457–478. [Google Scholar] [CrossRef]
  20. Maurus, M., J. H. Hammer, and J. Beyerer. 2014. Realistic Heatmap Visualization for Interactive Analysis of 3D Gaze Data. In Proceedings of the Symposium on Eye Tracking Research and Applications. ACM. [Google Scholar] [CrossRef]
  21. Page, J. 2009. The development and effectiveness of perceptual training programme for coaches and judges in gymnastics. University of Liverpool (University of Chester). [Google Scholar]
  22. Papenmeier, F., and M. Huff. 2010. DynAOI: A tool for matching eye-movement data with dynamic areas of interest in animations and movies. Behavior Research Methods 42: 179–187. [Google Scholar] [CrossRef]
  23. Poole, A., and L. J. Ball. 2006. Eye tracking in HCI and usability research. Encyclopedia of Human Computer Interaction 1: 211–219. [Google Scholar]
  24. Rodrigues, R., A. Veloso, and O. Mealha. 2012. A television news graphical layout analysis method using eye tracking. In 16th International Conference on Information Visualisation (IV). pp. 357–362. [Google Scholar] [CrossRef]
  25. Roth, D., C. Bloch, A.-K. Wilbers, M. E. Latoschik, K. Kaspar, and G. Bente. 2016. What You See is What You Get: Channel Dominance in the Decoding of Affective Nonverbal Behavior Displayed by Avatars. 66th Annual Conference of the International Communication Association (ICA), Fukuoka, Japan, June 9–13. [Google Scholar]
  26. Salvucci, D. D., and J. H. Goldberg. 2000. Identifying Fixations and Saccades in Eye-tracking Protocols. In Proceedings of the 2000 Symposium on Eye Tracking Research & Applications. ACM. [Google Scholar] [CrossRef]
  27. Savelsbergh, G. J. P., J. V. der Kamp, A. M. Williams, and P. Ward. 2005. Anticipation and visual search behaviour in expert soccer goalkeepers. Ergonomics 48: 1686–1697. [Google Scholar] [CrossRef] [PubMed]
  28. Stark, L. W., and S. R. Ellis. 1981. Scanpaths revisited: Cognitive models direct active looking. Lawrence Erlbaum Associates, pp. 193–226. [Google Scholar]
  29. Stellmach, S., L. E. Nacke, and R. Dachselt. 2010. Advanced Gaze Visualizations for Three-Dimensional Virtual Environments. In Proceedings of ETRA 2010. pp. 109–112. [Google Scholar] [CrossRef]
  30. Widdel, H. 1984. Operational Problems in Analysing Eye Movements. Advances in Psychology 22: 21–29. [Google Scholar]
  31. Williams, A. M., K. Davids, L. Burwitz, and J. G. Williams. 1994. Visual Search Strategies in Experienced and Inexperienced Soccer Players. Research Quarterly for Exercise and Sport 65: 127–135. [Google Scholar] [CrossRef]
  32. Wilms, M., L. Schilbach, U. Pfeiffer, G. Bente, G. R. Fink, and K. Vogeley. 2010. It’s in your eyes-using gaze-contingent stimuli to create truly interactive paradigms for social cognitive and affective neuroscience. Social Cognitive and Affective Neuroscience 5: 98–107. [Google Scholar] [CrossRef]
  33. Woo, M., J. Neider, T. Davis, and D. Shreiner. 1999. OpenGL Programming Guide: The Official Guide to Learning OpenGL, Version 1.2. [Google Scholar]
  34. Woolley, T. L., R. G. Crowther, K. Doma, and J.D. Connor. 2015. The use of spatial manipulation to examine goalkeepers anticipation. Journal of Sports Sciences 33: 1766–1774. [Google Scholar] [CrossRef] [PubMed]
  35. Wu, M. 2016. SEQIT: Visualizing Sequences of Interest in Eye Tracking Data. IEEE Transactions on Visualization and Computer Graphics (TVCG, Proceedings of InfoVis 2015) 22: 449–458. [Google Scholar] [CrossRef]
Figure 1. Overview of the pipeline of our system. After capturing an animated sequence by video capture (a1) or using a system of motion capture (a2), the user manually annotates (b1), or the captured motion is automatically annotated (b2). After visualization of the animated sequences by the viewer(s), our system automatically computes the mapping between each fixation and the different parts of the skeleton. Finally, our system generates one or more meshes (corresponding to a temporal decomposition given by the user) colored by heat maps of the tracked areas (d).
Jemr 10 00032 g001
Figure 2. Overview of the different features of our system.
Jemr 10 00032 g002
Figure 3. Only the samples on and near the segments are taken into account (represented by blue and red dots, respectively). Here for instance, the crosses are too far and will therefore not be taken into account.
Jemr 10 00032 g003
Figure 4. Illustration of viewers’ gaze patterns. The colored mesh averages the data recorded with 18 viewers. Each block represents a body part and each line corresponds to a viewer (the red color shows the fixation times). The blue color means that the area is not watched at this time. The block percentage corresponds to the average time spent on the related body part by assuming that we considered only fixations on the body. The timeline percentage corresponds to the average time spent on the related body part. Specifically, these two percentage values have been calculated on all sequences.
Jemr 10 00032 g004
Figure 5. Overview of the steps and design of the second use case experiment assessing goalkeepers’ performance and strategies in anticipating the direction of a penalty kick.
Jemr 10 00032 g005
Figure 6. Skeleton animation for two different gestures (top) and time warping matrix for these two animations (bottom). The blue areas correspond to the zone for which the distance between the posture of SkA and that of SkB is the lowest. The black curve represents the warping path.
Jemr 10 00032 g006
Figure 7. Difference of visual fixation between field players (left) and expert goalkeepers (right). The color scale represents the total duration of fixations on the different body parts.
Jemr 10 00032 g007
Table 1. Comparison of the features of our system with those of systems used in previous related works.
System (eye tracker) | Spatial distribution | Timelines | Synchro./overlap | Scanpath | Stimuli type | AOI | Comparing several users | Comparing several media
Maurus | Yes | No | No | No | 3D | No | No | No
Stellmach (Tobii 1750) | Yes | Yes | No | Yes | 3D | Automatic (vertex-based) | No | No
Kurzhals (Tobii T60 XL) | No | Yes | Yes | Yes | 2D | Rectangles | Yes | No
Roth (SMI RED-500) | Yes (2D) | No | No | No | 3D | Rectangles | No | No
Our system | Yes | Yes | Yes | No | 2D/3D | 2D: rectangles; 3D: automatic (vertex-based) | Yes | Yes
