Article

Your Eyes Under Pressure: Real-Time Estimation of Cognitive Load with Smooth Pursuit Tracking

Department of Mathematical and Computer Sciences, Physical Sciences and Earth Sciences, University of Messina, Viale Ferdinando Stagno d’Alcontres, 31, 98166 Messina, Italy
* Author to whom correspondence should be addressed.
Big Data Cogn. Comput. 2025, 9(11), 288; https://doi.org/10.3390/bdcc9110288
Submission received: 11 September 2025 / Revised: 24 October 2025 / Accepted: 5 November 2025 / Published: 13 November 2025

Abstract

Understanding and accurately estimating cognitive workload is crucial for the development of adaptive, user-centered interactive systems across a variety of domains, including augmented reality, automotive driving assistance, and intelligent tutoring systems. Cognitive workload assessment enables dynamic system adaptation to improve user experience and safety. In this work, we introduce a novel framework that leverages smooth pursuit eye movements as a non-invasive and temporally precise indicator of mental effort. A key innovation of our approach is the development of trajectory-independent algorithms that address a significant limitation of existing methods, which generally rely on a predefined or known stimulus trajectory. Our framework provides accurate cognitive load estimation without requiring knowledge of the exact target path, using two complementary solutions based on a Kalman filter and a B-spline heuristic classifier. This enables the application of our methods in more naturalistic and unconstrained environments where stimulus trajectories may be unknown. We evaluated these algorithms against classical supervised machine learning models on a publicly available benchmark dataset featuring diverse pursuit trajectories and varying cognitive workload conditions. The results demonstrate competitive performance along with robustness across different task complexities and trajectory types. Moreover, our framework supports real-time inference, making it viable for continuous cognitive workload monitoring. To further enhance deployment feasibility, we propose a federated learning architecture that allows privacy-preserving adaptation of models across heterogeneous devices without the need to share raw gaze data. This scalable approach mitigates privacy concerns and facilitates collaborative model improvement in distributed real-world scenarios.
Experimental findings confirm that metrics derived from smooth pursuit eye movements reliably reflect fluctuations in cognitive states induced by working memory load tasks, substantiating their use for real-time, continuous workload estimation. By integrating trajectory independence, robust classification techniques, and federated privacy-aware learning, our work advances the state of the art in adaptive human–computer interaction. This framework offers a scientifically grounded, privacy-conscious, and practically deployable solution for cognitive workload estimation that can be adapted to diverse application contexts.

1. Introduction

Cognitive workload, defined as the mental effort required to complete a task, plays a central role in determining user performance, attention levels, and error rates across domains such as aviation, automotive systems, human–computer interaction (HCI), and adaptive learning environments. While traditional methods for assessing cognitive load often rely on subjective instruments like the NASA-TLX questionnaire [1] or physiological signals such as electroencephalogram (EEG) [2] and heart rate variability (HRV) [3], eye tracking has emerged as a promising, non-invasive alternative for real-time and context-sensitive monitoring. Eye movements offer a direct reflection of attentional and cognitive processes, with features such as fixation duration [4], saccadic patterns [5], blink rate [6], and pupil dilation [7] showing strong correlations with variations in mental effort. Among these, smooth pursuit eye movements (SPEM), also known as the continuous tracking of a moving stimulus, have gained particular attention due to their reliance on sustained attention and visuomotor coordination [8].
A seminal contribution in this field is the work of Kosch [9], who introduced innovative metrics such as pursuit entropy and gaze deviation error to assess workload in multitasking scenarios. Kosch’s findings demonstrated that pursuit-based features exhibit strong correlations with task difficulty and subjective workload ratings, often outperforming fixation-based metrics in dynamic environments. Notably, the study showed that the characteristics of the stimulus trajectory, such as its speed and shape, do not significantly influence workload estimation; rather, variations in cognitive effort are primarily driven by the presence and difficulty of the secondary task. However, a key limitation of this approach remains its dependence on a predefined, known stimulus trajectory, which restricts applicability in more ecological and unconstrained settings.
Modern computational approaches employ support vector machines (SVMs) [10], random forests [11], and deep learning models, particularly recurrent neural networks [12,13], to process sequential gaze data and predict workload levels with increasing accuracy. Datasets collected under controlled conditions, such as n-back tasks and visual search paradigms, are commonly employed to train these systems [14,15], with Kosch’s dataset frequently serving as a benchmark for hybrid models that combine both pursuit- and fixation-based features. Hybrid approaches combining spatial trajectory encoding with temporal attention mechanisms are also gaining traction. Despite these advances, a major limitation for deploying such models on edge devices, including IoT devices, remains their computational load and resource demands, which typically exceed the capabilities of such constrained environments [16].
To address privacy, personalization, and scalability issues in gaze-based cognitive state modeling, recent research has explored federated learning (FL) frameworks enabling decentralized model training on edge devices like augmented reality (AR) headsets or mobile eye-trackers while keeping sensitive data local [17]. This is ideal for user-centric monitoring, accommodating individual eye movement variability through on-device adaptation. Notable approaches include federated multi-task learning capturing user-specific trends without compromising global generalization [18], FedAvg and FedProx variants improving convergence under heterogeneous data [19], and FedPer, a personalized FL framework preserving shared representation with per-user adaptation layers, showing improved gaze-based emotion and attention prediction [20].
Building upon these insights, our work presents a novel framework that overcomes these limitations by developing trajectory-independent algorithms for cognitive workload estimation. Utilizing Kalman Filter (KF) [21]-based predictive models, extensively used in computer vision tasks [22,23,24,25], and B-spline [26]-based heuristic classifiers, our methods avoid reliance on stimulus trajectory knowledge without sacrificing predictive accuracy or interpretability. Additionally, we adopt a privacy-preserving FL scheme to enable scalable model training across heterogeneous user devices without sharing raw sensitive gaze data, addressing privacy concerns inherent in cognitive workload estimation. Compared to Kosch, our approach achieves comparable or improved classification performance, with additional robustness across varied gaze path geometries and task conditions. These improvements significantly extend the practical applicability of smooth pursuit metrics to real-world human–machine interaction scenarios. Although KF- and B-spline-based methods are not the most recent technologies, they typically require substantially fewer computational resources than many contemporary approaches, facilitating deployment on resource-constrained edge devices.
In summary, our framework delivers a scalable and privacy-aware solution for continuous cognitive workload monitoring based on smooth pursuit eye movements. This work thus paves the way for broader adoption of trajectory-independent gaze metrics within adaptive, context-aware interactive technologies.

1.1. Adaptive Stress-Aware Systems

Adaptive stress-aware systems aim to dynamically adjust their behavior in response to the user’s cognitive or emotional state, with the goal of maintaining optimal task performance, reducing cognitive overload, and enhancing user experience. The design of such systems must incorporate principles from affective computing [27], human factors engineering [28], and user modeling [29], while ensuring real-time responsiveness and non-intrusiveness. Central to their design is the integration of sensing modalities, such as eye-tracking, physiological signals (e.g., EEG, HRV), or behavioral cues, and the capability to infer mental workload or stress levels with sufficient temporal resolution [30]. Once user state is inferred, adaptation strategies can be employed at various system levels, including content presentation, interaction complexity, information pacing, or modality switching.
In AR or mixed-reality environments, an adaptive interface might reduce visual clutter, slow down stimulus presentation, or switch to auditory guidance when elevated stress is detected [31]. In driving assistance systems, workload-sensitive interfaces may postpone secondary tasks (e.g., notifications) during high cognitive demand [32]. In intelligent tutoring systems, the difficulty of instructional content can be adjusted in real time according to estimated mental load, fostering engagement without overwhelming the learner [33]. Importantly, adaptation should avoid undesirable feedback loops, where the system’s reaction to stress inadvertently increases the user’s discomfort, a phenomenon sometimes referred to as “adaptive instability” [34].
Protocols to evaluate stress-aware systems aim to induce varying levels of mental workload or psychological stress while allowing for precise measurement of corresponding physiological and behavioral responses [35]. Broadly, stress induction methods can be guided by either objective measures, requiring a priori classification of stress levels associated with each task, or by a combination of subjective and objective metrics, where stress is inferred post hoc through correlation analyses [36]. The Trier Social Stress Test (TSST) is a widely adopted protocol that combines public speaking and mental arithmetic under evaluative pressure [37]. Tasks like the Stroop Color Word Test (SCWT) [38] and mental arithmetic challenges, such as PASAT [39], MIST [40], and the Add-N task [41,42], have consistently been shown to elicit significant changes in physiological indicators like HRV, pupil dilation, and gaze patterns [43,44]. Other common stressors include exposure to emotionally charged visual stimuli, such as the International Affective Picture System (IAPS) [45], or controlled video playback.
In the cognitive domain, the n-back task is particularly relevant for eye-tracking research, as it induces increasing working memory load with rising n values [46]. Typically, 0-back and 1-back tasks correspond to lower cognitive load, while 2-back and 3-back configurations are associated with high demand and measurable increases in cognitive effort.
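As a concrete illustration (not part of the original protocol), the target-detection rule of the n-back task can be sketched in a few lines: a stimulus counts as a target when it matches the stimulus presented n steps earlier, so raising n increases the number of items that must be held in working memory.

```python
from collections import deque

def nback_targets(stimuli, n):
    """Mark each stimulus as a target if it matches the one shown n steps earlier."""
    history = deque(maxlen=n)  # rolling window of the last n stimuli
    targets = []
    for s in stimuli:
        # history[0] is the stimulus presented exactly n steps ago
        targets.append(len(history) == n and history[0] == s)
        history.append(s)
    return targets

# 2-back over a letter stream: positions 2 and 4 repeat the item two steps back
print(nback_targets(["A", "B", "A", "C", "A"], 2))
# → [False, False, True, False, True]
```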
It is important to note that while generic stress protocols provide reliable baseline effects, experimental paradigms should ideally be adapted to the target application domain. For example, when designing stress detection algorithms for AR-based driving assistance systems, stress induction should occur in valid scenarios, such as simulated driving environments with variable traffic, auditory distractions, or multitasking demands, to better reflect real-world complexity. Contextualizing the induction protocol ensures that the collected data are more relevant for the intended deployment environment of the adaptive system [47].

1.2. Cognitive Workload Estimation via Eye-Tracking

The human visual system exhibits a rich repertoire of eye movements that reflect both perceptual and cognitive processes. These include fixations, saccades, smooth pursuit, vestibulo-ocular, optokinetic, and vergence movements [48]. Among them, fixations, saccades, and smooth pursuit are the most commonly analyzed for workload estimation, as they are directly modulated by attention, task demand, and fatigue [49].
Several studies have demonstrated that increased task complexity or mental load affects gaze behavior in quantifiable ways [50,51]. For instance, fixation durations tend to increase under higher cognitive demand, while saccadic frequency and amplitude typically decrease, indicating reduced visual exploration [52]. Similarly, smooth pursuit tracking performance deteriorates with elevated workload, leading to less accurate and more irregular eye trajectories [9]. These findings suggest that gaze metrics can serve as continuous indicators of cognitive state, suitable for adaptive interface design and operator monitoring.
To leverage these relationships, researchers have developed numerous computational models for eye-movement classification and workload inference. Classical threshold-based algorithms, such as the Velocity Threshold (I-VT) and Dispersion Threshold (I-DT) methods [53], enable real-time segmentation of fixations and saccades, but are sensitive to noise and individual variability. To improve robustness, probabilistic approaches like Hidden Markov Models (HMMs) [54] and Gaussian Mixture Models (GMMs) [55] have been used to capture temporal dependencies in gaze patterns. More recently, deep learning-based models, such as convolutional neural networks (CNNs) [56,57], have achieved high accuracy in complex and noisy scenarios.
Despite advances in eye-tracking technology, real-time workload monitoring still faces several challenges. Many existing methods require extensive calibration, rely on laboratory-grade hardware, or assume knowledge of a fixed stimulus trajectory [58,59,60]. Furthermore, purely data-driven models often lack interpretability, which hinders their deployment in safety-critical or human-centered systems.
The remainder of this paper is organized as follows: Section 2 introduces the multi-strategy framework for cognitive workload estimation and describes the proposed methods: Kosch-based pursuit deviation, KF-based modeling, and B-spline approximation. In addition, it details the real-time system implementation and the FL extension for privacy-preserving adaptation. Section 3 presents the experimental results, including performance evaluation and error analysis across different trajectory types and workload conditions, and discusses the implications of the findings as well as directions for future research. Finally, Section 5 summarizes the main conclusions of the study.

2. Materials and Methods

The proposed cognitive workload monitoring system is conceived as a foundational component for an adaptive framework intended to respond to varying levels of cognitive load. It continuously captures gaze data through an eye gaze estimation system while the user performs a task, and estimates cognitive workload in real time using computational models. Although the current implementation focuses solely on workload estimation, this capability paves the way for future integration with adaptive interfaces. Figure 1 presents the conceptual pipeline of the adaptive framework, integrating real-time gaze analysis and workload estimation to enable user-centered interface adaptation through continuous feedback.

2.1. Kosch-Based Pursuit Deviation Model

Among existing resources, the dataset introduced by Kosch [9] is publicly available via https://github.com/tkosch/your-eyes-tell, accessed on 5 September 2025; however, the author did not release a public implementation of the proposed approach. Therefore, we re-implemented the original methodology, which quantifies cognitive effort by measuring deviations between the user’s gaze and the known target trajectory. The dataset was collected under controlled laboratory conditions from 20 participants (9 females and 11 males, aged between 22 and 34), who were instructed to follow moving stimuli along predefined paths while performing auditory n-back tasks designed to induce varying levels of cognitive workload. The experimental design was based on three independent variables: trajectory type (rectangular, sinusoidal, circular), n-back task level (0, 1, 2, 3), and target speed (slow and fast: 450 or 650 pixels/s). Their combination yielded 21 unique trial configurations, each lasting 26 s; for the 0-back condition, only the slow target speed was used. Gaze data were recorded with a SensoMotoric Instruments RED250 eye tracker at 250 Hz.
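The trial count can be verified by enumerating the factor combinations and excluding the fast-speed 0-back cells; a minimal sketch:

```python
from itertools import product

trajectories = ["rectangular", "sinusoidal", "circular"]
nback_levels = [0, 1, 2, 3]
speeds = ["slow", "fast"]  # 450 or 650 pixels/s

# The 0-back condition was run only at the slow target speed;
# all other n-back levels use both speeds.
configs = [(t, n, s) for t, n, s in product(trajectories, nback_levels, speeds)
           if not (n == 0 and s == "fast")]
print(len(configs))  # → 21 unique trial configurations
```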
In our re-implementation of the Kosch model, the initial preprocessing led to the removal of three participants due to corrupted or poorly calibrated data. Subsequent preprocessing steps applied to each experimental trial include the following:
  • Three candidates were removed: candidate 9 was excluded because some of the corresponding files were found to be corrupted, whereas candidates 11 and 17 were removed following a careful visual inspection of the eye movement data, which revealed improper calibration of the eye-tracking device. It is worth noting that the author reported the exclusion of only two participants, without specifying which ones.
  • Min-Max normalization of the x and y coordinates of the target trajectory (saving the normalization parameters), followed by application of the same parameters to the gaze coordinates.
  • Feature extraction by computing the Euclidean distance d(p, q) between the normalized coordinates of the moving target and the eye movements:
    d(p, q) = \sqrt{(p_x - q_x)^2 + (p_y - q_y)^2}
    where p and q represent the normalized coordinate vectors of the target and the gaze points, respectively.
  • Smoothing of the Euclidean distance signal using a moving average filter with a 250-sample window (equivalent to one second) and a stride of one.
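The preprocessing steps above can be sketched as follows; this is a minimal illustration with hypothetical function and variable names, not the original implementation:

```python
import numpy as np

def preprocess_trial(target_xy, gaze_xy, window=250):
    """Min-max normalize with the target's parameters, compute the per-sample
    Euclidean distance, then smooth with a 1 s moving average (stride 1)."""
    # Normalization parameters are estimated on the target trajectory only...
    lo, hi = target_xy.min(axis=0), target_xy.max(axis=0)
    span = np.where(hi - lo > 0, hi - lo, 1.0)  # guard against zero range
    t_norm = (target_xy - lo) / span
    # ...and the SAME parameters are applied to the gaze coordinates.
    g_norm = (gaze_xy - lo) / span
    dist = np.linalg.norm(t_norm - g_norm, axis=1)  # Euclidean distance d(p, q)
    kernel = np.ones(window) / window               # 250 samples = 1 s at 250 Hz
    return np.convolve(dist, kernel, mode="valid")
```

With perfectly overlapping target and gaze traces the smoothed distance signal is identically zero; any pursuit deviation raises it.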
Each instance in the dataset (357 total) corresponds to one experimental trial containing 6000 features. Labels were derived from the n-back level: trials without a secondary task were assigned to the low workload class, while trials with 1-, 2-, or 3-back tasks represented high-workload conditions. A linear SVM classifier was trained with a Leave-One-Group-Out (LOGO) cross-validation scheme to ensure subject-level generalization.
While this approach performs well in controlled settings, it requires a known target trajectory, limiting its applicability in naturalistic contexts where the user’s gaze may follow unconstrained stimuli.

2.2. Kalman-Based Virtual Trajectory Model

To overcome limitations related to the availability of target trajectory data, a second method was developed using the KF [21]. The filter estimated a smoothed version of the user’s gaze path, which was employed to model the trajectory of the eye, serving as a virtual reference in the absence of a known stimulus path.
The KF is a recursive optimal estimator widely used for tracking and predicting the state of dynamic systems in the presence of noise and uncertainty. It operates in two main steps: a prediction phase, where the current system state is estimated based on a prior model, and an update phase, where new measurements are incorporated to refine the estimate. The classical KF assumes that both the system dynamics and the observation models are linear and that the associated noise processes are Gaussian. Under these assumptions, the KF provides the minimum mean square error estimate of the hidden states.
In real-world applications, however, many systems exhibit non-linear behaviors. To address this limitation, the Extended Kalman Filter (EKF) [25] extends the original formulation by linearizing the system around the current estimate using a first-order Taylor expansion. This allows the EKF to handle non-linear state transitions and observation functions, albeit at the cost of increased computational complexity and potential numerical instability if the system exhibits strong non-linearities or if the linear approximation is insufficient. Figure 2 illustrates the operational principles of the KF in terms of state prediction and measurement update processes.
The Euclidean distance between the measured gaze and the predicted state can then be used as a proxy for cognitive workload, under the assumption that mental effort degrades the smoothness and consistency of eye movements. From the distances between observed and predicted gaze points, a set of statistical features, including mean, standard deviation, and interquartile range, was extracted and used as input to a new SVM classifier. To address class imbalance in the training data, we applied the Synthetic Minority Over-sampling Technique (SMOTE) algorithm [61], which generated synthetic instances of the minority class.
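The statistical feature vector fed to the SVM can be sketched as follows (a minimal illustration restricted to the mean, standard deviation, and interquartile range named above; SMOTE resampling would be applied afterwards on the training folds):

```python
import numpy as np

def distance_features(d):
    """Summary statistics of observed-vs-predicted gaze distances (SVM input)."""
    q1, q3 = np.percentile(d, [25, 75])
    return np.array([d.mean(), d.std(), q3 - q1])  # mean, std, IQR
```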

2.3. Lightweight B-Spline Approximation

A third, lightweight method based on cubic B-spline approximation [26] was designed to meet the requirements of low-power and real-time systems. Short gaze segments were separately approximated along the horizontal and vertical axes, and the mean Euclidean deviation between actual and fitted trajectories was used as the workload feature.
Splines are piecewise polynomial functions commonly used to approximate or interpolate complex, noisy data in a smooth and continuous manner. Among them, B-splines (Basis splines) represent a particularly flexible and numerically stable class, defined as a linear combination of basis functions with local support. Unlike high-degree polynomials that tend to oscillate between data points, B-splines ensure smoothness and continuity while avoiding overfitting, as each segment of the curve is influenced only by a subset of control points. This locality property makes B-splines especially suitable for modeling human gaze trajectories, where short, temporally localized patterns must be captured with high precision. In this work, B-splines were adopted to approximate gaze trajectories within short time windows, enabling the extraction of irregularities associated with elevated cognitive workload.
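Using SciPy's smoothing splines, the per-axis cubic fit and the resulting deviation feature can be sketched as follows; the smoothing factor `s` and the windowing are illustrative assumptions, not the values used in our experiments:

```python
import numpy as np
from scipy.interpolate import splrep, splev

def spline_deviation(gaze_xy, s=0.5):
    """Fit cubic B-splines to x(t) and y(t) separately and return the mean
    Euclidean deviation between the raw gaze segment and the smoothed fit."""
    t = np.arange(len(gaze_xy))
    fit = np.column_stack([splev(t, splrep(t, gaze_xy[:, i], k=3, s=s))
                           for i in range(2)])
    return np.linalg.norm(gaze_xy - fit, axis=1).mean()
```

A clean, smooth segment yields a near-zero deviation, while irregular pursuit under load leaves residuals the spline cannot absorb.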

Adaptive Thresholding and Real-Time Application

Experimental analysis revealed that the optimal classification boundary varies according to the trajectory due to the distinct spatial dynamics of each pattern. To enhance robustness across these heterogeneous motion profiles, an adaptive thresholding mechanism was introduced, dynamically adjusting the decision boundary based on the Pearson correlation between spline axes. This adaptive strategy allows the classifier to compensate for variations in trajectory curvature and user-specific tracking behavior, thereby improving the reliability of workload estimation under different motion conditions.
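The adaptive selection can be sketched as follows, using the trajectory-specific thresholds and correlation bands reported in Section 3; the dictionary keys and the use of the absolute correlation are our illustrative choices:

```python
import numpy as np

# Heuristic decision boundaries per trajectory geometry (values from Section 3).
THRESHOLDS = {"arc": 0.0203, "oscillatory": 0.0252, "piecewise": 0.0272}

def select_threshold(gaze_xy):
    """Pick the decision boundary from |Pearson r| between x and y components."""
    r = abs(np.corrcoef(gaze_xy[:, 0], gaze_xy[:, 1])[0, 1])
    if r > 0.8:
        return THRESHOLDS["arc"]          # smooth, arc-like segment
    if r < 0.6:
        return THRESHOLDS["piecewise"]    # orthogonal / piecewise-linear segment
    return THRESHOLDS["oscillatory"]      # sinusoidal-like segment
```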
The complete model was embedded into a real-time desktop application that employs webcam-based eye tracking via the gazetimation library [62]. The system processes continuous gaze streams at 30 Hz, applying temporal smoothing and a majority-vote strategy over sliding windows to mitigate transient noise and ensure stable classification. Cognitive state feedback is presented through intuitive color-coded indicators: green for low workload, red for high workload, and gray for uncertain or invalid detections.
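The sliding-window majority vote used to stabilize per-frame decisions can be sketched as below; the window length is an illustrative choice, roughly half a second at 30 Hz:

```python
from collections import Counter, deque

class MajorityVote:
    """Stabilize per-frame workload labels with a sliding-window majority vote."""
    def __init__(self, window=15):        # ~0.5 s of frames at 30 Hz
        self.buf = deque(maxlen=window)   # oldest labels drop out automatically

    def update(self, label):
        self.buf.append(label)
        # Return the most frequent label currently in the window.
        return Counter(self.buf).most_common(1)[0][0]
```

A single noisy frame therefore cannot flip the displayed color indicator; the output changes only once the new label dominates the window.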
Figure 3 illustrates representative gaze trajectories captured by the real-time system. The top row shows mixed fixation–saccade patterns, the middle row depicts regular smooth pursuit behavior, and the bottom row highlights irregular pursuit trajectories observed under elevated cognitive load. These examples show the system’s ability to distinguish between different gaze dynamics in real time, validating the feasibility and responsiveness of the proposed framework for adaptive human–computer interfaces.

2.4. Federated Learning Extension

To enhance privacy and scalability, the proposed framework was extended with an FL architecture tailored to the smooth pursuit-based cognitive workload estimation pipeline (Figure 4). In this decentralized setup, gaze data collected from mobile or desktop devices, such as AR glasses or smartphone-based eye trackers, are used to train local models without transmitting raw data to a central server, thereby complying with data protection regulations such as the General Data Protection Regulation (GDPR) [63].
Each client device executes local training using its own labeled gaze recordings, adopting either a lightweight SVM-based or spline-based classifier. Model updates (e.g., support vectors) are periodically sent to a central aggregation server through secure communication protocols (TLS). The server performs global aggregation via the FedAvg algorithm [64], optionally integrating personalization layers inspired by FedPer [20] to preserve user-specific adaptations while leveraging shared knowledge across participants.
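The FedAvg aggregation step [64] reduces to a dataset-size-weighted average of the client parameters; a minimal sketch:

```python
import numpy as np

def fedavg(client_weights, client_sizes):
    """FedAvg: average each parameter array, weighted by local dataset size."""
    total = sum(client_sizes)
    return [sum(n / total * w[i] for w, n in zip(client_weights, client_sizes))
            for i in range(len(client_weights[0]))]

# Two clients holding 1 and 3 samples: the larger client dominates the average.
agg = fedavg([[np.array([1.0, 2.0])], [np.array([3.0, 4.0])]], [1, 3])
print(agg[0])  # → [2.5 3.5]
```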
The architecture was implemented using the Flower FL framework [65], selected for its modularity, Python interoperability (scikit-learn, TensorFlow), and support for both simulation and deployment in real-world environments. Local training is coordinated asynchronously to handle heterogeneous devices differing in performance, availability, and data volume, an essential capability in mobile and wearable scenarios.
To enhance robustness and protect sensitive behavioral information, differential privacy mechanisms were integrated into the communication layer. This design enables real-time on-device inference while continuously improving model generalization across users, making the system suitable for stress-aware, privacy-compliant adaptive human–machine interaction applications.

3. Results

Our implementation of Kosch’s methodology, as well as our proposed solutions, was validated on the publicly available dataset [9].
The SVM was trained separately with different combinations of trajectory type and target speed. To ensure subject-level generalization and avoid overfitting to specific participants, a LOGO cross-validation scheme was applied, where data from one participant were held out for testing while the classifier was trained on the remaining participants. This process was repeated for all participants, and performance metrics were averaged across folds.
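A minimal sketch of the LOGO scheme with scikit-learn follows; the data here are random stand-ins, whereas the real features are the smoothed distance signals described in Section 2.1:

```python
import numpy as np
from sklearn.model_selection import LeaveOneGroupOut
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(60, 5))             # stand-in for trial feature vectors
y = rng.integers(0, 2, size=60)          # low/high workload labels
groups = np.repeat(np.arange(6), 10)     # participant ID for each trial

# Each fold holds out every trial of one participant for testing.
scores = []
for train, test in LeaveOneGroupOut().split(X, y, groups):
    clf = SVC(kernel="linear").fit(X[train], y[train])
    scores.append(clf.score(X[test], y[test]))
print(len(scores))  # → 6, one fold per held-out participant
```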
Table 1 reports the evaluation results averaged across the two workload levels (low and high), for each possible configuration of the three trajectories (rectangular, sinusoidal, circular) and two speed settings (slow and fast). Performance was assessed using five standard metrics:
  • Accuracy: measures the overall proportion of correctly classified instances, considering both positive and negative classes.
    \text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}
    where TP (True Positives) and TN (True Negatives) represent correctly predicted samples, while FP (False Positives) and FN (False Negatives) correspond to misclassified instances.
  • Precision: quantifies the reliability of positive predictions, indicating the fraction of samples classified as positive that are truly positive.
    \text{Precision} = \frac{TP}{TP + FP}
    A higher Precision implies fewer false alarms in detecting high workload instances.
  • Recall (Sensitivity): expresses the model’s ability to correctly identify all positive cases.
    \text{Recall} = \frac{TP}{TP + FN}
    A higher Recall indicates a better capability to detect true positive samples without missing any.
  • F1-score: represents the harmonic mean between Precision and Recall, balancing false positives and false negatives.
    F_1 = 2 \times \frac{\text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}
    This metric is particularly informative when there is an uneven class distribution.
  • Matthews Correlation Coefficient (MCC): provides a balanced measure of classification quality, even in the presence of class imbalance, reflecting the correlation between predicted and actual labels.
    \text{MCC} = \frac{TP \times TN - FP \times FN}{\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}}
    The MCC ranges from −1 (total disagreement) to +1 (perfect prediction), with 0 indicating random classification.
Overall, these metrics complement each other: while Accuracy offers a general performance overview, Precision and Recall capture the trade-off between false alarms and missed detections, the F1-score balances the two, and MCC provides a robust indicator of classification consistency.
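For reference, all five metrics follow directly from the four confusion counts; a minimal sketch:

```python
import math

def classification_metrics(tp, tn, fp, fn):
    """Compute the five metrics reported in Table 1 from confusion counts."""
    acc = (tp + tn) / (tp + tn + fp + fn)
    prec = tp / (tp + fp)
    rec = tp / (tp + fn)
    f1 = 2 * prec * rec / (prec + rec)       # harmonic mean of Precision/Recall
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return acc, prec, rec, f1, mcc

# Balanced example: 40 TP, 40 TN, 10 FP, 10 FN
print(classification_metrics(40, 40, 10, 10))  # → (0.8, 0.8, 0.8, 0.8, 0.6)
```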
As shown in Table 1, the Kosch-based classifier achieved consistently high accuracy across different trajectory types and speeds, with the best performance observed for circular and sinusoidal trajectories at higher speeds.
Representative gaze trajectories for low- and high-workload conditions are illustrated in Figure 5. Each subfigure compares the target trajectory (blue) with the participant’s estimated gaze (orange) during smooth pursuit. Several trends emerge from the results: Figure 5a–c show the smooth pursuit behavior in experiments without cognitive load, for circular, rectangular, and sinusoidal trajectories, respectively. The gaze closely follows the target trajectory, with only minor deviations, indicating low cognitive effort. Figure 5d–f depict the high workload, where participants performed n-back tasks while following circular, rectangular, and sinusoidal trajectories. In these cases, gaze paths exhibit larger deviations from the target, particularly for circular and sinusoidal trajectories, reflecting increased mental workload. The discrepancies are less pronounced for rectangular trajectories, likely due to the more predictable linear motion along edges. All experiments were repeated multiple times with different participants to ensure the reliability and consistency of the observed gaze patterns across individuals.
Overall, the findings confirm that smooth pursuit eye movements are sensitive indicators of cognitive load, with trajectory type affecting the detectability of workload-induced gaze irregularities.
Finally, the average NASA-TLX self-assessment scores, grouped by trajectory type, n-back task difficulty, and target speed, are reported in Figure 6. These scores represent the raw subjective ratings collected from participants after each experimental session. Consistent with the findings of Kosch, the analysis of the NASA-TLX data indicates that cognitive workload depends primarily on the difficulty of the n-back task applied to the participants, rather than on the speed of the target trajectory. Participants reported significantly higher perceived workload with increasing n-back levels, while trajectory type and stimulus speed showed minimal influence. This coherence between subjective and objective indicators strengthens the validity of smooth pursuit metrics as a practical tool for cognitive workload estimation.
Table 2 presents the average performance of all evaluated models, including the reproduced Kosch implementation (Section 2.1), the KF-based model (Section 2.2), the B-Spline Approximation (Section 2.3), and Adaptive Thresholding (Section “Adaptive Thresholding and Real-Time Application”).
Examining the performance of different classifiers provides further insights. Kosch’s implementation achieved high Accuracy and F1-scores, yet the MCC was lower due to class imbalance, highlighting the limitations of relying solely on accuracy metrics. The KF-based model offered similar predictive power without requiring a predefined stimulus trajectory, demonstrating enhanced adaptability for real-world scenarios where the exact target path may not be known. Meanwhile, the B-spline approximation method combined with dynamic thresholding achieved a good balance between predictive performance and computational efficiency, with an MCC of 0.36 and execution times of approximately 2 ms per segment. This makes it particularly suitable for real-time adaptive interfaces and human–computer interaction scenarios.
Adaptive Thresholding was used to accommodate trajectory-specific variations. We obtained the following thresholds:
  • Circular trajectories: 0.0203
  • Sinusoidal trajectories: 0.0252
  • Rectangular trajectories: 0.0272
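Applying these trajectory-specific thresholds amounts to a simple per-trajectory comparison. The sketch below is illustrative: the function name and the notion of a scalar "mean pursuit error" stand in for the segment-level error measure defined in the paper.

```python
# Trajectory-specific thresholds reported above.
THRESHOLDS = {
    "circular": 0.0203,
    "sinusoidal": 0.0252,
    "rectangular": 0.0272,
}

def classify_workload(mean_pursuit_error: float, trajectory: str) -> str:
    """Label a gaze segment as high or low workload by comparing its
    mean pursuit error against the trajectory-specific threshold.
    (Hypothetical helper; the exact error metric follows the paper.)"""
    threshold = THRESHOLDS[trajectory]
    return "high" if mean_pursuit_error > threshold else "low"

print(classify_workload(0.031, "circular"))     # -> high
print(classify_workload(0.018, "rectangular"))  # -> low
```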
To characterize the local geometry of each spline segment, we compute the Pearson correlation coefficient (r_xy) between the x and y coordinates. This coefficient provides a compact descriptor of the linear dependency between spatial components, which reflects the geometric regularity of the curve [66,67]. Empirically, highly correlated segments (r_xy > 0.8) correspond to smooth, arc-like trajectories, whereas low correlation values (r_xy < 0.6) are typically associated with orthogonal or piecewise-linear shapes. Intermediate correlations (0.6 ≤ r_xy ≤ 0.8) often arise in oscillatory patterns resembling sinusoidal profiles.
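The correlation-based banding described above can be sketched as follows. This is a minimal illustration of the idea, assuming |r_xy| is compared against the empirical bands; the function name is ours, not from the paper:

```python
import numpy as np

def segment_shape(x: np.ndarray, y: np.ndarray) -> str:
    """Classify a segment's local geometry from |r_xy|, using the
    empirical bands reported in the text (illustrative sketch)."""
    r = abs(np.corrcoef(x, y)[0, 1])
    if r > 0.8:
        return "circular"      # smooth, arc-like segment
    if r < 0.6:
        return "rectangular"   # orthogonal / piecewise-linear
    return "sinusoidal"        # oscillatory, intermediate correlation

# A short arc: over a small angular range, x and y are strongly
# (negatively) correlated, so the segment is labeled arc-like.
t = np.linspace(0.1, 0.5, 50)
print(segment_shape(np.cos(t), np.sin(t)))  # -> circular
```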
In this study, these thresholds were deliberately chosen heuristically, without pursuing an optimal calibration, in order to provide a preliminary assessment of the discriminative potential of the proposed correlation-based classification. Refining these boundaries through data-driven optimization or learning-based approaches could further improve the robustness and generalization of the method [68,69].
To further quantify the significance of observed performance differences, a one-way Analysis of Variance (ANOVA) was conducted on the accuracy scores obtained over five cross-validation folds for each model. The analysis revealed a significant main effect of the model type on accuracy (p < 0.001). Post hoc comparisons using the Tukey HSD test indicated that the Kosch model achieved significantly higher accuracy than the other methods, notably the KF-based and B-Spline models (p < 0.01), with the only exception being the Adaptive Thresholding model, for which the difference was not statistically significant (p = 0.08).
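This analysis pipeline can be sketched with SciPy (assuming SciPy ≥ 1.8 for `stats.tukey_hsd`; the per-fold accuracy scores below are hypothetical, chosen only to mirror the reported group means):

```python
import numpy as np
from scipy import stats

# Hypothetical accuracy scores over five cross-validation folds per model.
folds = {
    "Kosch":     [0.83, 0.81, 0.82, 0.84, 0.80],
    "KF-based":  [0.69, 0.67, 0.68, 0.66, 0.70],
    "B-Spline":  [0.67, 0.69, 0.68, 0.70, 0.66],
    "Adaptive":  [0.76, 0.74, 0.75, 0.73, 0.77],
}

# One-way ANOVA: is there a main effect of model type on accuracy?
f_stat, p_value = stats.f_oneway(*folds.values())
print(f"F = {f_stat:.2f}, p = {p_value:.2e}")

# Post hoc pairwise comparisons (Tukey HSD).
res = stats.tukey_hsd(*folds.values())
print(res)
```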
The higher accuracy of Adaptive Thresholding relative to the KF-based and B-Spline Approximation methods indicates that fixed classification thresholds can bias results depending on trajectory geometry. Adopting trajectory-specific thresholds, determined via the Pearson correlation between horizontal and vertical gaze components, reduced misclassification rates. This underscores the importance of adapting the model to the geometric properties of gaze trajectories.
Figure 7 shows the error rates produced by Kosch's method and by Adaptive Thresholding, grouped by trajectory type and cognitive workload. Adaptive Thresholding yields a slightly lower error rate across trajectory types.

4. Future Work

Several promising directions emerge from the current study and open opportunities for further research. First, extending the framework to incorporate multimodal physiological data, such as electroencephalography (EEG), galvanic skin response (GSR), or heart rate variability (HRV), could enhance the robustness of cognitive workload estimation, particularly in ambiguous or noisy gaze conditions. Combining gaze-based and biosignal-based features may also improve sensitivity to rapid workload fluctuations and individual variability.
Second, ongoing work will explore the use of advanced deep learning architectures, including spatio-temporal attention models and transformers, which can jointly model gaze dynamics and contextual task information. These architectures could support the automatic extraction of higher-level cognitive indicators without the need for handcrafted features. This exploration requires a focus on assessing the applicability and efficiency of these models in resource-constrained edge environments, where computational power, memory, and energy availability are limited. The goal is to identify architectures that balance predictive performance with computational feasibility, enabling real-time cognitive state estimation directly on edge devices.
Third, the deployment of FL protocols in operational mobile systems presents an opportunity for large-scale personalization without compromising user privacy. This requires addressing practical challenges such as client heterogeneity, communication efficiency, and model drift over time.
Finally, future experiments will aim to validate the system in real-world settings, including AR-based instructional environments, adaptive driving assistance, and neuroergonomic workplaces.

5. Conclusions

This work presented a modular and scalable framework for estimating cognitive workload using smooth pursuit eye movements, with particular emphasis on applicability in real-time and mobile contexts. By exploring multiple computational strategies, including SVM-based classification, Kalman filtering, and lightweight B-spline models, we showed that pursuit-based features provide a viable and robust signal for cognitive state inference. The integration of a real-time prototype, using low-cost webcam-based gaze estimation, further confirmed the feasibility of deploying these methods in interactive systems such as AR interfaces or adaptive user environments.
Beyond methodological contributions, the system was designed with extensibility in mind: the inclusion of a FL scenario paves the way for privacy-preserving personalization across distributed devices, while the proposed criteria for stress-aware adaptation support user-centered interface modulation in high-demand scenarios. Experimental results validated the effectiveness of our approach under varying cognitive loads and confirmed that pursuit-derived features can complement traditional fixation-based methods.

Author Contributions

Conceptualization, P.D., M.G., F.L.R. and M.V.; methodology, P.D., M.G. and F.L.R.; software, P.D., M.G. and F.L.R.; validation, P.D., M.G. and F.L.R.; writing—original draft preparation, P.D., M.G. and F.L.R.; writing—review and editing, P.D., M.G., F.L.R. and M.V.; supervision, M.V.; funding acquisition, M.V. All authors have read and agreed to the published version of the manuscript.

Funding

This work has been supported by the Italian Ministry of University and Research (MUR) “Research Projects of National Interest (PRIN-PNRR)” through the project “Cloud Continuum aimed at On-Demand Services in Smart Sustainable Environments” (CUP: J53D23015080001—IC: P2022YNBHP), and by the “SEcurity and RIghts in the CyberSpace (SERICS)” partnership (PE00000014), under the MUR National Recovery and Resilience Plan funded by the European Union—NextGenerationEU. In particular, it has been supported within the SERICS partnership through the SOP project (CUP H73C22000890001), and by the Italian Ministry of Health, Piano Operativo Salute (POS) trajectory 2 “eHealth, diagnostica avanzata, medical device e mini invasività”, through the project “Rete eHealth: AI e strumenti ICT Innovativi orientati alla Diagnostica Digitale (RAIDD)” (CUP J43C22000380001).

Institutional Review Board Statement

The study was conducted in accordance with the Declaration of Helsinki, and approved by the Local Ethics Committee of Messina (J43C22000380001—2023-02-05).

Informed Consent Statement

Informed consent was obtained from all subjects involved in the study.

Data Availability Statement

No new data were created or analyzed in this study. Data sharing is not applicable to this article.

Acknowledgments

The authors gratefully acknowledge the contribution of Emanuele Longo, who kindly participated in the experimental activities.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Hart, S.G.; Staveland, L.E. Development of NASA-TLX (Task Load Index): Results of Empirical and Theoretical Research. Adv. Psychol. 1988, 52, 139–183. [Google Scholar] [CrossRef]
  2. Antonenko, P.; Paas, F.; Grabner, R.; van Gog, T. Using Electroencephalography to Measure Cognitive Load. Educ. Psychol. Rev. 2010, 22, 425–438. [Google Scholar] [CrossRef]
  3. Ider, Ö.; Kusnick, K. Heart rate dynamics for cognitive load estimation in a driving context. Sci. Rep. 2024, 14, 79728. [Google Scholar]
  4. Just, M.A.; Carpenter, P.A. Eye fixations and cognitive processes. Cogn. Psychol. 1976, 8, 441–480. [Google Scholar] [CrossRef]
  5. Di Stasi, L.L.; Catena, A.; Canas, J.J.; Macknik, S.L.; Martinez-Conde, S. Saccadic velocity as an arousal index in naturalistic tasks. Neurosci. Biobehav. Rev. 2013, 37, 968–975. [Google Scholar] [CrossRef]
  6. Stern, J.A.; Walrath, L.C.; Goldstein, R. The endogenous eyeblink. Psychophysiology 1984, 21, 22–33. [Google Scholar] [CrossRef]
  7. Beatty, J. Task-evoked pupillary responses, processing load, and the structure of processing resources. Psychol. Bull. 1982, 91, 276–292. [Google Scholar] [CrossRef]
  8. Korda, Z.; Walcher, S.; Körner, C.; Benedek, M. Effects of internally directed cognition on smooth pursuit eye movements: A systematic examination of perceptual decoupling. Atten. Percept. Psychophys. 2023, 85, 1159–1178. [Google Scholar] [CrossRef] [PubMed]
  9. Kosch, T.; Hassib, M.; Woźniak, P.W.; Buschek, D.; Alt, F. Your Eyes Tell: Leveraging Smooth Pursuit for Assessing Cognitive Workload. In Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems CHI’18, Montreal, QC, Canada, 21–26 April 2018; pp. 1–13. [Google Scholar] [CrossRef]
  10. Cortes, C.; Vapnik, V. Support-Vector Networks. Mach. Learn. 1995, 20, 273–297. [Google Scholar] [CrossRef]
  11. Nasri, M.; Kosa, M.; Chukoskie, L.; Moghaddam, M.; Harteveld, C. Exploring Eye Tracking to Detect Cognitive Load in Complex Virtual Reality Training. In Proceedings of the 2024 IEEE International Symposium on Mixed and Augmented Reality, Bellevue, WA, USA, 21–25 October 2024; pp. 1–6. [Google Scholar] [CrossRef]
  12. Sims, S.D.; Putnam, V.; Conati, C. Predicting Confusion from Eye-Tracking Data with Recurrent Neural Networks. In Proceedings of the 2019 Symposium on Eye Tracking Research and Applications, Denver, CO, USA, 25–28 June 2019; pp. 1–10. [Google Scholar]
  13. Khan, M.A.; Asadi, H.; Qazani, M.R.C.; Lim, C.P.; Nahavandi, S. Functional Near-Infrared Spectroscopy (fNIRS) and Eye Tracking for Cognitive Load Classification in a Driving Simulator Using Deep Learning. arXiv 2024, arXiv:2408.06349. [Google Scholar]
  14. Oppelt, M.P.; Foltyn, A.; Deuschel, J.; Lang, N.R.; Holzer, N.; Eskofier, B.M.; Yang, S.H. ADABase: A Multimodal Dataset for Cognitive Load Estimation. Sensors 2022, 23, 340. [Google Scholar] [CrossRef]
  15. Ktistakis, E.; Skaramagkas, V.; Manousos, D.; Tachos, N.S.; Tripoliti, E.; Fotiadis, D.I.; Tsiknakis, M. COLET: A Dataset for Cognitive WorkLoad Estimation Based on Eye-Tracking. In Computer Methods and Programs in Biomedicine; Technical Report; Elsevier: Amsterdam, The Netherlands, 2022. [Google Scholar]
  16. Zaniolo, L.; Garbin, C.; Marques, O. Deep learning for edge devices. IEEE Potentials 2023, 42, 39–45. [Google Scholar] [CrossRef]
  17. Li, T.; Sahu, A.K.; Talwalkar, A.; Smith, V. Federated learning: Challenges, methods, and future directions. IEEE Signal Process. Mag. 2020, 37, 50–60. [Google Scholar] [CrossRef]
  18. Hard, A.; Rao, K.; Mathews, R.; Ramaswamy, S.; Beaufays, F.; Augenstein, S.; Eichner, H.; Kiddon, C.; Ramage, D. Federated Learning for Mobile Keyboard Prediction. In Proceedings of the 22nd International Conference on Artificial Intelligence, Paris, France, 21–22 August 2018. [Google Scholar]
  19. Li, Q.; He, B.; Song, D. Model-contrastive federated learning. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA, 20–25 June 2021; pp. 10713–10722. [Google Scholar]
  20. Arivazhagan, M.G.; Aggarwal, V.; Singh, A.K.; Choudhary, S. Federated Learning with Personalization Layers. arXiv 2019, arXiv:1912.00818. [Google Scholar] [CrossRef]
  21. Kalman, R.E. A New Approach to Linear Filtering and Prediction Problems. J. Basic Eng. 1960, 82, 35–45. [Google Scholar] [CrossRef]
  22. Welch, G.; Bishop, G. An Introduction to the Kalman Filter; University of North Carolina at Chapel Hill, Department of Computer Science: Chapel Hill, NC, USA, 1995. [Google Scholar]
  23. Mourikis, A.I.; Roumeliotis, S.I. A multi-state constraint Kalman filter for vision-aided inertial navigation. In Proceedings of the 2007 IEEE International Conference on Robotics and Automation, Rome, Italy, 10–14 April 2007; pp. 3565–3572. [Google Scholar]
  24. Li, Y.; Song, L.; Zhang, M. Real-time facial landmark tracking using Kalman filter and cascaded regression. Pattern Recognit. Lett. 2020, 138, 150–157. [Google Scholar]
  25. Wang, F.; Zhao, Y.; Xu, H. Real-time hand tracking using extended Kalman filter and dynamic motion model. Multimed. Tools Appl. 2019, 78, 32527–32546. [Google Scholar]
  26. de Boor, C. A Practical Guide to Splines; Applied Mathematical Sciences; Springer: New York, NY, USA, 1978; Volume 27. [Google Scholar]
  27. Picard, R.W. Affective Computing; MIT Press: Cambridge, MA, USA, 1997. [Google Scholar]
  28. Wickens, C.D.; Hollands, J.G.; Banbury, S.; Parasuraman, R. Applied Attention Theory; Human Factors and Ergonomics Series; CRC Press: Boca Raton, FL, USA, 2008. [Google Scholar]
  29. Cowie, R.; Douglas-Cowie, E.; Tsapatsoulis, N.; Votsis, G.; Kollias, S.; Fellenz, W.; Taylor, J.G. Emotion recognition in human-computer interaction. IEEE Signal Process. Mag. 2001, 18, 32–80. [Google Scholar] [CrossRef]
  30. Tao, J.; Tian, T. Affective computing: A review. Lect. Notes Comput. Sci. 2005, 3784, 981–995. [Google Scholar]
  31. Ghasemi, Y.; Singh, A.; Kim, M.; Johnson, A.; Jeong, H. Effects of Head-locked Augmented Reality on User’s Performance and Perceived Workload. arXiv 2021, arXiv:2106.14068. [Google Scholar] [CrossRef]
  32. Mehler, B.; Reimer, B.; Coughlin, J.C. The impact of incremental increases in cognitive workload on physiological arousal and performance in young adult drivers. Transp. Res. Rec. 2009, 2138, 6–12. [Google Scholar] [CrossRef]
  33. Yacef, K.; Zaïane, O.; Pechenizkiy, M. Student modeling and cognitive load in adaptive learning systems. In Proceedings of the International Conference on Educational Data Mining (EDM), Cordoba, Spain, 1–3 July 2009. [Google Scholar]
  34. Nasri, M. Towards Intelligent VR Training: A Physiological Adaptation Framework for Cognitive Load and Stress Detection. In Proceedings of the 33rd ACM Conference on User Modeling, Adaptation and Personalization UMAP’25, New York, NY, USA, 16–19 June 2025; pp. 419–423. [Google Scholar] [CrossRef]
  35. Picard, R.W. Automating the Recognition of Stress and Emotion: From Lab to Real-World Impact. IEEE MultiMedia 2016, 23, 3–7. [Google Scholar] [CrossRef]
  36. Drole, K.; Doupona, M.; Steffen, K.; Jerin, A.; Paravlic, A. Associations between subjective and objective measures of stress and load: An insight from 45-week prospective study in 189 elite athletes. Front. Psychol. 2025, 15, 1521290. [Google Scholar] [CrossRef] [PubMed]
  37. Kirschbaum, C.; Pirke, K.M.; Hellhammer, D.H. The ‘Trier Social Stress Test’—A Tool for Investigating Psychobiological Stress Responses in a Laboratory Setting. Neuropsychobiology 1993, 28, 76–81. [Google Scholar] [CrossRef] [PubMed]
  38. MacLeod, C.M. Half a Century of Research on the Stroop Effect: An Integrative Review. Psychol. Bull. 1991, 109, 163–203. [Google Scholar] [CrossRef]
  39. Gronwall, D.M.A. Paced Auditory Serial-Addition Task: A measure of recovery from concussion. Percept. Mot. Skills 1977, 44, 367–373. [Google Scholar] [CrossRef] [PubMed]
  40. Dedovic, K.; Renwick, R.; Mahani, N.K.; Engert, V.; Lupien, S.J.; Pruessner, J.C. The Montreal Imaging Stress Task: Using Functional Imaging to Investigate the Effects of Perceiving and Processing Psychosocial Stress in the Human Brain. J. Psychiatry Neurosci. 2005, 30, 319–325. [Google Scholar] [CrossRef]
  41. Embrey, J.R.; Mason, A.; Newell, B.R. Too hard, too easy, or just right? The effects of context on effort and boredom aversion. Psychon. Bull. Rev. 2024, 31, 2801–2810. [Google Scholar] [CrossRef]
  42. Saskovets, M.; Lohachov, M.; Liang, Z. Validation of a New Stress Induction Protocol Using Speech Improvisation (IMPRO). Brain Sci. 2025, 15, 522. [Google Scholar] [CrossRef]
  43. Wel, P.; Steenbergen, H. Pupil dilation as an index of effort in cognitive control tasks: A review. Psychon. Bull. Rev. 2018, 25, 2005–2015. [Google Scholar] [CrossRef]
  44. Meshkati, N. Heart Rate Variability and Mental Workload Assessment. In Advances in Psychology; Human Mental Workload; Hancock, P.A., Meshkati, N., Eds.; Elsevier: Amsterdam, The Netherlands, 1988; Volume 52, pp. 101–115. [Google Scholar] [CrossRef]
  45. Lang, P.J.; Bradley, M.M.; Cuthbert, B.N. International Affective Picture System (IAPS): Affective Ratings of Pictures and Instruction Manual; Technical Report A-8, University of Florida; NIMH, Center for the Study of Emotion & Attention: Gainesville, FL, USA, 2008. [Google Scholar]
  46. Owen, A.M.; McMillan, K.M.; Laird, A.R.; Bullmore, E. N-back Working Memory Paradigm: A Meta-analysis of Normative Functional Neuroimaging Studies. Hum. Brain Mapp. 2005, 25, 46–59. [Google Scholar] [CrossRef] [PubMed]
  47. Béquet, A.J.; Hidalgo-Muñoz, A.R.; Jallais, C. Towards Mindless Stress Regulation in Advanced Driver Assistance Systems: A Systematic Review. Front. Psychol. 2020, 11, 609124. [Google Scholar] [CrossRef]
  48. Martinez-Conde, S.; Macknik, S.L.; Martinez, L.M. Computational and Cognitive Neuroscience of Vision; Oxford University Press: Oxford, UK, 2013. [Google Scholar]
  49. König, P.; Wilming, N.; Kietzmann, T.C.; Ossandón, J.P.; Onat, S.; Ehinger, B.V.; Gameiro, R.R.; Kaspar, K. Eye Movements as a Window to Cognitive Processes. J. Eye Mov. Res. 2016, 9, 1–16. [Google Scholar] [CrossRef]
  50. Bläsing, D.; Bornewasser, M. Influence of Increasing Task Complexity and Use of Informational Assistance Systems on Mental Workload. Brain Sci. 2021, 11, 102. [Google Scholar] [CrossRef]
  51. Martinez-Cedillo, A.; Gavrila, N.; Mishra, A.; Geangu, E.; Foulsham, T. Cognitive load affects gaze dynamics during real-world tasks. Exp. Brain Res. 2025, 243, 82. [Google Scholar] [CrossRef]
  52. Di Stasi, L.; Marchitto, M.; Antolí, A.; Cañas, J. Saccadic peak velocity as an alternative index of operator attention: A short review. Eur. Rev. Appl. Psychol. 2013, 63, 335–343. [Google Scholar] [CrossRef]
  53. Salvucci, D.D.; Goldberg, J.H. Identifying fixations and saccades in eye-tracking protocols. In Proceedings of the 2000 Symposium on Eye Tracking Research & Applications; Association for Computing Machinery: Palm Beach Gardens, FL, USA; pp. 71–78. [CrossRef]
  54. Komogortsev, O.V.; Holland, C.D.; Karpov, A. Classification algorithm for eye movement-based biometrics. IEEE Trans. Inf. Forensics Secur. 2013, 8, 865–879. [Google Scholar]
  55. López, A.; Ferrero, F.J.; Qaisar, S.M.; Postolache, O. Gaussian Mixture Model of Saccadic Eye Movements. In Proceedings of the 2022 IEEE International Symposium on Medical Measurements and Applications (MeMeA), Messina, Italy, 22–24 June 2022; pp. 1–5. [Google Scholar] [CrossRef]
  56. Cole, Z.; Kuntzelman, K.; Dodd, M.; Johnson, M. Convolutional neural networks can decode eye movement data: A black box approach to predicting task from eye movements. J. Vis. 2021, 21, 9. [Google Scholar] [CrossRef]
  57. von Behren, A.L.; Sauer, Y.; Severitt, B.; Wahl, S. CNN-based estimation of gaze distance in virtual reality using eye tracking and depth data. In Proceedings of the 2025 Symposium on Eye Tracking Research and Applications ETRA’25, New York, NY, USA, 26–29 May 2025. [Google Scholar] [CrossRef]
  58. Kasneci, E.; Gao, H.; Ozdel, S.; Maquiling, V.; Thaqi, E.; Lau, C.; Rong, Y.; Kasneci, G.; Bozkir, E. Introduction to Eye Tracking: A Hands-On Tutorial for Students and Practitioners. arXiv 2024, arXiv:2404.15435. [Google Scholar] [CrossRef]
  59. Eye Tracking: The Complete Pocket Guide-iMotions—imotions.com. Available online: https://imotions.com/blog/learning/best-practice/eye-tracking/?srsltid=AfmBOoqaaAN6p1fMJIsgqM-7CEPgNN8_rFbJd5UgFnfO67ryRMZa7YJw (accessed on 13 October 2025).
  60. Holmqvist, K.; Nyström, M.; Andersson, R.; Dewhurst, R.; Jarodzka, H.; Van de Weijer, J. Eye Tracking: A Comprehensive Guide to Methods and Measures; Holmqvist, K., Nyström, N., Andersson, R., Dewhurst, R., Jarodzka, H., Van de Weijer, J., Eds.; Oxford University Press: Oxford, UK, 2011. [Google Scholar]
  61. Chawla, N.V.; Bowyer, K.W.; Hall, L.O.; Kegelmeyer, W.P. SMOTE: Synthetic Minority Over-sampling Technique. J. Artif. Intell. Res. 2002, 16, 321–357. [Google Scholar] [CrossRef]
  62. Paul, S.K.; Nicolescu, M.; Nicolescu, M. Enhancing Robotic Task Parameter Estimation Through Unified User Interaction: Gestures and Verbal Instructions in Collaboration. In Proceedings of the 2024 8th International Conference on Robotics and Automation Sciences (ICRAS), Tokyo, Japan, 21–23 June 2024; IEEE: New York, NY, USA, 2024; pp. 66–71. [Google Scholar] [CrossRef]
  63. European Union. Regulation (EU) 2016/679 of the European Parliament and of the Council of 27 April 2016 on the Protection of Natural Persons with Regard to the Processing of Personal Data and on the Free Movement of Such Data, and Repealing Directive 95/46/EC (General Data Protection Regulation); Official Journal of the European Union: Luxembourg, 2016; pp. 1–88. [Google Scholar]
  64. McMahan, B.; Moore, E.; Ramage, D.; Hampson, S.; y Arcas, B.A. Communication-Efficient Learning of Deep Networks from Decentralized Data. In Proceedings of the 20th International Conference on Artificial Intelligence and Statistics (AISTATS), Fort Lauderdale, FL, USA, 20–22 April 2017; pp. 1273–1282. [Google Scholar]
  65. Beutel, D.J.; Topal, T.; Mathur, A.; Qiu, X.; Fernandez-Marques, J.; Gao, Y.; Sani, L.; Li, K.H.; Parcollet, T.; de Gusmão, P.P.B.; et al. Flower: A Friendly Federated Learning Framework. arXiv 2022, arXiv:2007.14390. [Google Scholar] [CrossRef]
  66. Wold, S.; Esbensen, K.; Geladi, P. Principal Component Models in Shape Analysis; Elsevier: Amsterdam, The Netherlands, 1987. [Google Scholar]
  67. Bookstein, F.L. Morphometric Tools for Landmark Data: Geometry and Biology; Cambridge University Press: Cambridge, UK, 1997. [Google Scholar]
  68. Cohen, J. Statistical Power Analysis for the Behavioral Sciences; Lawrence Erlbaum Associates: Mahwah, NJ, USA, 1988. [Google Scholar]
  69. Schober, P.; Boer, C.; Schwarte, L.A. Correlation coefficients: Appropriate use and interpretation. Anesth. Analg. 2018, 126, 1763–1768. [Google Scholar] [CrossRef] [PubMed]
Figure 1. Block diagram of the proposed adaptive system. Gaze data captured by the eye-tracker are processed to estimate the user’s mental workload, which drives dynamic adjustments of the interface in a closed feedback loop.
Figure 2. Diagram illustrating the prediction and update steps of the Kalman Filter used for tracking in noisy environments.
Figure 3. Examples of gaze trajectories acquired by the real-time system using webcam-based eye tracking. The top row shows mixed fixation–saccade patterns, the middle row shows regular smooth pursuit, and the bottom row shows irregular pursuit under cognitive load. The visualization employs green, gray, and red squares to intuitively represent distinct levels of cognitive workload across time or task segments.
Figure 4. Block diagram of the proposed FL architecture.
Figure 5. Comparison between target trajectory (blue) and user’s gaze position (orange) during smooth pursuit for different trajectory types. The first row (ac) displays results under low cognitive workload, while the second row (df) shows results obtained under high cognitive workload using n-back tasks.
Figure 6. Average NASA-TLX scores grouped by trajectory type, n-back task difficulty, and target speed.
Figure 7. Error rates produced by Kosch’s and Adaptive Thresholding methods, grouped by trajectory type (circular, rectangular and sinusoidal) and cognitive workload (low or high).
Table 1. Accuracy, Precision, Recall, F1-score, and MCC for each trajectory–speed combination.
Trajectory          Accuracy   Precision   Recall   F1     MCC
Circular-Slow       0.74       0.85        0.75     0.77   0.44
Circular-Fast       0.84       0.93        0.86     0.88   0.61
Rectangular-Slow    0.71       0.88        0.75     0.77   0.31
Rectangular-Fast    0.90       0.97        0.90     0.92   0.76
Sinusoidal-Slow     0.76       0.88        0.82     0.82   0.41
Sinusoidal-Fast     0.97       0.97        1.00     0.98   0.88
Table 2. Performance summary of the evaluated models for binary stress classification.
Model                     Accuracy   Precision   Recall   F1     MCC
Kosch                     0.82       0.90        0.90     0.90   0.26
KF-based Model            0.68       0.94        0.67     0.77   0.30
B-Spline Approximation    0.68       0.94        0.68     0.79   0.29
Adaptive Thresholding     0.75       0.94        0.75     0.83   0.36

Share and Cite

MDPI and ACS Style

Dell’Acqua, P.; Garofalo, M.; La Rosa, F.; Villari, M. Your Eyes Under Pressure: Real-Time Estimation of Cognitive Load with Smooth Pursuit Tracking. Big Data Cogn. Comput. 2025, 9, 288. https://doi.org/10.3390/bdcc9110288
