Introduction
Technology-enhanced educational environments provide several benefits for surgical education programs. For instance, simulation allows trainees to perform clinical activities interactively by recreating operations in a computer-based system without exposing patients to the associated risks (Maran & Glavin, 2003; Munshi, Lababidi, & Alyousef, 2015). However, research is still needed to develop strategies for improving the curriculum integration of these systems and for creating standardized approaches. In this respect, mental workload theory and eye-tracking technology are two important concepts that can be applied in surgical education programs.
The mental workload concept has long been accepted as an essential aspect of individual performance within complex systems (Xie & Salvendy, 2000). Mental workload is reported to change the performance of individuals (Zheng, Cassera, Martinec, Spaun, & Swanström, 2010) and, in turn, to affect the competence of the whole system (Xie & Salvendy, 2000). Accordingly, system developers need models to assess the mental workload imposed on individuals at an early stage so that alternative system designs can be appraised (Xie & Salvendy, 2000). At the same time, mental workload can negatively affect performance and increase the probability of errors (Zheng et al., 2010), and researchers have spent a great deal of effort developing measures and probes of mental workload (Ahlstrom & Friedman-Berg, 2006). In line with this, Moray (1988) stated that adjusting the allocation of mental workload could reduce human errors, improve system safety, and increase productivity. Earlier studies have defined three types of mental workload: intrinsic load, extraneous or ineffective load, and germane or effective load (Sweller, Van Merrienboer, & Paas, 1998). Intrinsic load arises from the interaction between the nature of the material being learned and the expertise of the learners (Paas, Tuovinen, Tabbers, & Van Gerven, 2003; Sweller et al., 1998). Extraneous load results mainly from poorly designed instruction, and germane load is related to processes that contribute to the construction and automation of schemas (Paas et al., 2003).
Eye-tracking provides a valuable source of information, and events and measures such as fixations, blinks, and pupil diameter can be used to assess mental workload (Tsai, Viirre, Strychacz, Chase, & Jung, 2007). Accordingly, several studies have assessed mental workload by using eye-tracking technology (Menekse Dalveren & Cagiltay, 2018). A precise evaluation of mental workload will be essential for developing systems that manage user attention (Atkins, Tien, Khan, Meneghetti, & Zheng, 2013; Dalveren, Çağıltay, Özçelik, & Maraş, 2017; Iqbal, Zheng, & Bailey, 2004). Researchers have used eye-movement events found to correlate with cognitive demands (Ahlstrom & Friedman-Berg, 2006). For instance, Benedetto et al. (2011) examined the changes in blink duration and blink rate in a simple driving task and stated that blink events reflect the effects of visual workload. Another study evaluated mental workload by developing combined measures based on three recorded physiological signals: alpha rhythm, eye blink interval, and heart rate variability (Ryu & Myung, 2005). The study of de Greef, Lafeber, van Oostendorp, and Lindenberg (2009) describes an approach for objective assessment of mental workload by analyzing the differences in pupil diameter and several aspects of eye-movement under different levels of mental workload. Eye-movement events are also used in medicine for diagnosis, treatment, and training purposes (Jarodzka, Holmqvist, & Gruber, 2017) and in clinical applications involving Alzheimer’s disease (Crawford et al., 2005), HIV-1-infected patients with eye-movement dysfunction (Sweeney, Brew, Keilp, Sidtis, & Price, 1991), and schizophrenia (Flechtner, Steinacher, Sauer, & Mackert, 1997). Studies show that these events provide crucial information about how users interact with complex visual displays (Marshall, 2002). Radiology and visual search (Nodine & Kundel, 1987) and laparoscopic surgery training (Law, Atkins, Kirkpatrick, & Lomax, 2004; Tien, Atkins, Zheng, & Swindells, 2010) are among the areas of medicine where the eye-tracking approach has been adopted. For example, in the study by Zheng, Jiang, and Atkins (2015), participants performed a simulated laparoscopic procedure, and both task completion time and pupil size increased as task difficulty increased.
Previous studies were conducted mostly on pupil size changes, but other eye-movement events, fixations for example, can also be informative for understanding mental workload. A fixation occurs when the eyes remain nearly still in order to take in the necessary information. Accordingly, in this study fixation number and fixation duration are used to validate the mental workload imposed by different scenarios. Because changes in eye-movement events, such as fixation number and fixation duration, under changing mental workload are likely affected by the nature of the scenarios (Tsai et al., 2007), understanding the surgical resident’s mental workload while performing surgical operations is crucial for assessing task difficulties (Andrzejewska & Stolińska, 2016). Just and Carpenter (1976) stated that longer fixation duration is related to difficulty in interpreting the information presented or to greater involvement in its exploration. Accordingly, more complex problems have been found to result in a higher number of fixations and longer fixation durations (Bałaj & Szubielska, 2014; Menekse Dalveren & Cagiltay, 2018; Rayner, 1998). Other studies have also suggested that fixation duration may be related to mental workload: as mental workload increases, fixation durations become longer (Brookings, Wilson, & Swain, 1996; Hankins & Wilson, 1998; Veltman & Gaillard, 1998; Wierwille, Rahimi, & Casali, 1985). Hence, this study attempts to understand the mental workload changes of the participants through their eye-movement events, namely fixation number and fixation duration, while they perform tasks of different difficulty levels in four surgical scenarios. The scenarios are developed at different fidelity levels (high- and low-fidelity), which are expected to affect the mental workload of the participants. Additionally, in each scenario, the effect of hand condition on mental workload is investigated. Hence, it is hypothesized that, because mental workload changes under these situations (different hand conditions, fidelity levels, and task difficulties of the scenarios), the eye-tracking data would display different behaviors. The authors believe that this information is critical for better understanding the mental workload of the participants in these situations. It also provides insights that help instructional system designers to better order and adapt related computer-based simulation technologies according to the skill levels and progress of the trainees.
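To make the two measures concrete, the following is a minimal sketch of how fixation number and fixation duration might be derived from raw gaze samples. It is not the classification algorithm used in this study; it is a simplified dispersion-threshold pass, and the function names, the sample format (time in milliseconds, x, y), and the threshold values are illustrative assumptions only.

```python
from typing import List, Tuple

Sample = Tuple[float, float, float]   # (time_ms, x, y), a hypothetical format
Fixation = Tuple[float, float]        # (start_ms, end_ms)


def detect_fixations(samples: List[Sample],
                     max_dispersion: float = 30.0,
                     min_duration_ms: float = 100.0) -> List[Fixation]:
    """Simplified dispersion-threshold pass; both thresholds are assumed values."""
    fixations: List[Fixation] = []
    i, n = 0, len(samples)
    while i < n:
        j = i
        min_x = max_x = samples[i][1]
        min_y = max_y = samples[i][2]
        # Grow the window while the gaze points stay close together.
        while j + 1 < n:
            x, y = samples[j + 1][1], samples[j + 1][2]
            nx0, nx1 = min(min_x, x), max(max_x, x)
            ny0, ny1 = min(min_y, y), max(max_y, y)
            if (nx1 - nx0) + (ny1 - ny0) > max_dispersion:
                break
            min_x, max_x, min_y, max_y = nx0, nx1, ny0, ny1
            j += 1
        duration = samples[j][0] - samples[i][0]
        if duration >= min_duration_ms:
            # Long enough to count as a fixation; continue after the window.
            fixations.append((samples[i][0], samples[j][0]))
            i = j + 1
        else:
            # Too short (for example, part of a saccade); slide forward by one sample.
            i += 1
    return fixations


def summarize(fixations: List[Fixation]) -> Tuple[int, float]:
    """Return (fixation number, mean fixation duration in ms)."""
    count = len(fixations)
    if count == 0:
        return 0, 0.0
    total = sum(end - start for start, end in fixations)
    return count, total / count
```

Calling `summarize(detect_fixations(samples))` on one recording yields the pair of values that would then be compared across scenarios and hand conditions.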
Discussion
This research describes an approach for an objective assessment of mental workload by analyzing the differences in fixation number and fixation duration under different levels of mental workload while surgical residents perform simulated scenarios. The eye-movement data were collected with an eye-tracking device and classified with an open-source eye-movement classification algorithm (BIT) to obtain fixation number and fixation duration. These eye-movement events were selected because they seem to be best suited to provide insight into changes in mental workload (De Rivecourt, Kuperus, Post, & Mulder, 2008). Many other eye-movement classification algorithms exist; BIT was chosen because it is eye-tracker independent and easy to implement and use. The aim of this study is to examine whether fixation number and fixation duration can, indeed, be indicators of mental workload and whether there are any differences among the mental workloads imposed by different scenarios. According to the results, fixation number and fixation duration both show a significant increase as mental workload increases. To understand the differences between the scenarios, four scenarios were developed in this study: two were simulated surgical models and two were general models. The results can be summarized as highlighted below:
In the dominant hand condition, Scenario-1 has the lowest mean rank for the fixation number (1.47) and fixation duration (1.04), while Scenario-2 has the highest mean rank for the fixation number (3.78) and fixation duration (3.70).
When using the non-dominant hand, Scenario-1 has the lowest mean rank for the fixation number (1.26) and fixation duration (1.04), while Scenario-2 has the highest mean rank for fixation number (3.70) and Scenario-4 has the highest mean rank for fixation duration (3.52).
When using both hands, Scenario-1 has the lowest mean rank for the fixation number (1.07) and fixation duration (1.00), whereas Scenario-2 has the highest mean rank for fixation number (3.80) and fixation duration (3.96).
In general, it can be concluded that in the scenarios designed with models that simulate the operational area (Scenario-2 and Scenario-4), the fixation duration and fixation number values are higher than in the other group of scenarios (Scenario-1 and Scenario-3).
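As an illustration of how mean ranks of this kind can be read, the sketch below ranks each participant's values across the four scenarios (1 = lowest) and averages the ranks per scenario, as is done in a Friedman-type analysis. The excerpt does not state the exact statistical procedure, so this is an assumption; the function names and the small table of values are hypothetical and are not data from this study.

```python
from typing import List, Sequence


def rank_row(values: Sequence[float]) -> List[float]:
    """Rank one participant's values from 1 (lowest); ties share the average rank."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend over a run of tied values.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        average_rank = (pos + end) / 2 + 1
        for k in order[pos:end + 1]:
            ranks[k] = average_rank
        pos = end + 1
    return ranks


def mean_ranks(table: Sequence[Sequence[float]]) -> List[float]:
    """Average the within-participant ranks for each scenario (column)."""
    ranked = [rank_row(row) for row in table]
    n_rows = len(ranked)
    n_cols = len(table[0])
    return [sum(row[c] for row in ranked) / n_rows for c in range(n_cols)]


# Hypothetical fixation counts: one row per participant, one column per scenario.
example = [
    [10, 40, 20, 30],
    [12, 35, 35, 20],
    [8, 50, 25, 45],
]
print(mean_ranks(example))
```

With four scenarios, the mean ranks always sum to 10, so a value near 1 means a scenario was almost always the lowest for each participant and a value near 4 means it was almost always the highest.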
Previous studies have stated that fixation time shows a general significant increase as mental workload increases (de Greef et al., 2009). Another study stated that pupil size increased in response to task difficulty (Nakayama, Takahashi, & Shimizu, 2002). Iqbal et al. (2004) also stated that more difficult tasks demand longer processing times, induce higher subjective ratings of mental workload, and reliably evoke greater pupillary response at corresponding subtasks than less difficult tasks. Additionally, Zheng et al. (2015) stated that the pupil size of surgical residents is influenced by task difficulty, increasing as the difficulty level rises. It is also reported that the fidelity level is a crucial factor affecting mental workload (Munshi et al., 2015). According to previous studies, fixation number and fixation duration are widely used eye-movement events and are generally believed to increase with increasing mental workload (He, Wang, Gao, & Chen, 2012; Maltz & Shinar, 1999; Marquart, Cabrall, & de Winter, 2015; May, Kennedy, Williams, Dunlap, & Brannan, 1990; Miura, 1990; Rayner & Morris, 1990; Recarte & Nunes, 2000). In support of these studies, our results show that the scenarios based on simulated tasks using surgical models (higher level of fidelity) increase surgical residents’ mental workloads. Hence, it can be concluded that eye-movement events, such as fixation number and fixation duration, can be used to increase our knowledge of the mental workload of surgical trainees. Since the four scenarios were not performed in a randomized and balanced order among the surgical residents, there might be a training effect. A training effect would be expected to lower the workload in later scenarios; nevertheless, the results show that the scenarios performed later (2 and 4) are the ones with higher fixation values. Accordingly, this order effect can be considered acceptable for this study.
Additionally, as very few studies have analyzed the eye-movement behaviors of endo-neurosurgery residents, there are no standards for classifying simulation content according to the level of surgical skill (Cagiltay & Berker, 2018; Cagiltay, Ozcelik, Sengul, & Berker, 2017). Similarly, the metrics that can be used to evaluate the skill levels of these residents are very limited, and there are no standards for these metrics either (Cagiltay et al., 2017). Hence, the results of this study encourage researchers to develop other standardized approaches for using objective metrics in surgical skill performance. Additionally, the results may guide instructional designers to better organize the content of computer-based simulation scenarios through the eye-movement behaviors of the trainees. As reported in earlier studies, individual characteristics, situational characteristics, and training motivation explain incremental variance in training outcomes beyond the effects of cognitive ability (Colquitt, LePine, & Noe, 2000). These individual differences are more influential in skill-based training environments such as endo-neurosurgery, which requires the development of both cognitive and psychomotor abilities. By using information collected from the trainees’ behaviors, such as eye-movement data, instructional designers can adapt the sequence and difficulty levels of the tasks for each trainee to provide training matched to that trainee’s skill and progress levels. Hence, in the future, computer-based instructional software developed for skill-based training purposes can become more adaptive by using data collected from the behaviors (such as eye-movements) and performance of the trainees.