Proceeding Paper

Enhancing the Precision of Eye Detection with EEG-Based Machine Learning Models †

by Masroor Ahmad 1,*, Tahir Muhammad Ali 2 and Nunik Destria Arianti 3

1 Department of Software Engineering, University of Sialkot, Sialkot 51040, Pakistan
2 Department of Computer Science, Gulf University for Sciences and Technology, Safat 13060, Kuwait
3 Department of Information System, Nusa Putra University, Sukabumi 43155, West Java, Indonesia
* Author to whom correspondence should be addressed.
Presented at the 7th International Global Conference Series on ICT Integration in Technical Education & Smart Society, Aizuwakamatsu City, Japan, 20–26 January 2025.
Eng. Proc. 2025, 107(1), 128; https://doi.org/10.3390/engproc2025107128
Published: 13 October 2025

Abstract

Eye detection is a critical task in computer vision and image processing. The primary goal is to accurately locate and identify the position of the eyes in image or video frames; a typical pipeline first detects the face region and then focuses on the eye regions. In this study, a dataset of 14,980 physiological signal recordings, most likely from EEG or similar sensors, was used for the analysis of neural or sensor-based activity. Continuous signals from specific sensor channels are represented by 14 numerical features (AF3, F7, F3, O1, O2, P7, P8, T8, FC5, FC6, etc.). These features capture complex changes in signal patterns over time, which can indicate shifts in sensor or neural activity. The dataset also includes a binary target variable, eye detection, which indicates whether an eye-related event, such as an open/closed state, occurred in a given instance; the label takes the values 0 and 1. With 14 features and 14,980 instances, the dataset can be used to train a predictive model.

1. Introduction

This dataset focuses on eye detection and is commonly associated with brain–computer interface (BCI) or electroencephalography (EEG) systems [1]. It captures real-time physiological signals. It is particularly relevant for applications in neuroscience, healthcare, and human–computer interaction, enabling the development of advanced systems such as eye-tracking devices, assistive technologies for individuals with disabilities, and fatigue detection systems for drivers and operators [2,3,4]. The literature has examined various algorithms, such as Naive Bayes, Gradient Boosted Trees, Deep Learning, K-Nearest Neighbors (K-NN), and Random Forest, for eye detection. The dataset has 14 features (AF3, F7, F3, FC5, T7, P7, O1, O2, P8, T8, FC6, F4, F8, AF4), 14,980 instances, and one label column. In the target column, 0 represents no detection and 1 represents detection [5]. The values appear continuous, consistent with an EEG recording or sensor stream [6]. The paper is organized as follows: Section 1 is the Introduction, Section 2 the Literature Review, Section 3 the Methodology, Section 4 the Results, and Section 5 the Conclusions, followed by the References.

2. Literature Review

The literature review draws on roughly a dozen research papers based on the eye detection dataset. For prediction and accuracy comparison, different machine learning algorithms, such as Naive Bayes, Gradient Boosted Trees, Deep Learning, K-NN, and Random Forest, are used for eye detection, as the task presents multiple advances and difficulties in gaze calculation, eye tracking, and face detection across different areas. The literature covers a number of face detection techniques [7,8,9], including appearance-based approaches that use machine learning methods [10,11,12] such as PCA, SVMs, and neural networks, as well as template matching, which makes use of standard templates [13]. Handling changing lighting, pose, occlusion, and facial features such as glasses or facial hair are some of the main difficulties. The review explores gaze estimation with a focus on its uses in medical imaging, driver monitoring systems, and human–computer interaction. This study uses the dataset for detection in various applications, such as image classification, model understanding, and diagnostic improvement [14], and it describes developments in combining eye tracking data with machine learning and deep learning [15].
Eye tracking is used in daily life. In education, for example, it helps develop expertise, assess cognitive load, and improve instructional design. It can also support healthcare: eye tracking and deep learning are used in health research to predict diseases such as Alzheimer’s and to examine visual attention. Eye detection is likewise useful in gaming and Virtual Reality (VR), where the system tracks the user’s eye movements to improve immersion, aiming accuracy, or interaction within the virtual environment. Eye tracking research also offers valuable information about how customers interact with a piece of art. Despite its promise, there are still issues with device efficiency, data privacy, and practical deployment in large-scale systems. Finally, eye detection enables hands-free device interaction on smartphones and computers, such as scrolling or unlocking a device with a glance.

3. Methodology

Machine learning techniques play an important role in predicting the label column (0 or 1) of the eye detection dataset. In this paper, we used several algorithms and compared their accuracy in detecting the eye state [14]. Each algorithm achieved a different accuracy, as described below.

3.1. Dataset Features

This dataset has 14 features (AF3, F7, F3, FC5, T7, P7, O1, O2, P8, T8, FC6, F4, F8, AF4) and 14,980 instances.

3.2. Tool

We used RapidMiner, an open-source data science platform that is widely used for such applications, to apply the machine learning techniques. We used version 10.3 of RapidMiner for eye detection.

3.3. Predictive Analytic Workflow

First, we loaded the eye detection dataset into RapidMiner. During import, we located the label column, changed its type to polynominal, and set its role to label via Change Role; clicking Next then completes the data import. In the process design, we added a Split Data operator, assigning 70% of the data to training and 30% to testing. The training output is connected to a learner (for example, Deep Learning), whose model output feeds the Apply Model operator together with the test set; Apply Model is connected to the Performance operator, which is connected to the result port. Running the process then reports the accuracy of the algorithm [15]. The models used in this paper are described below.
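The RapidMiner steps above (retrieve, 70/30 split, train, apply model, performance) can be sketched in scikit-learn. This is a hypothetical translation on synthetic stand-in data, since the real EEG_Eye_State table is not bundled here; the helper name `evaluate` and the placeholder learner are our assumptions.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
# Stand-in for the 14,980 x 14 EEG Eye State table (values near 4000-4600).
X = rng.normal(4300, 60, size=(1000, 14))
y = (X[:, 0] > 4300).astype(int)  # synthetic binary eye-state label

def evaluate(estimator, X, y):
    """Split 70/30, train, apply the model, and report accuracy."""
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.30, random_state=42)
    estimator.fit(X_tr, y_tr)            # training branch
    y_pred = estimator.predict(X_te)     # Apply Model
    return accuracy_score(y_te, y_pred)  # Performance operator

acc = evaluate(DecisionTreeClassifier(random_state=0), X, y)
print(f"accuracy: {acc:.3f}")
```

Any of the learners in Sections 3.4, 3.5, 3.6, and 3.7 can be passed to `evaluate` in place of the decision tree.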

3.4. K-NN Model

K-NN is a simple, non-parametric machine learning algorithm used for both classification and regression tasks. It is an instance-based algorithm, meaning it makes decisions based on the stored training data and does not explicitly learn a model. Figure 1 shows the RapidMiner process for K-NN. The workflow begins with the Retrieve EEG_Eye_State operator, which loads the dataset. The data is then passed to the Split Data operator, which divides it into a training set (tra) and a test set (tes). The training set is used to train a k-NN (k-Nearest Neighbors) model, which outputs the trained model (mod). Finally, the trained model (mod) and the test set (tes) are fed into the Apply Model operator to generate predictions, which are then evaluated by the Performance operator to assess the model’s quality.
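A minimal k-NN sketch of the Figure 1 process, using synthetic stand-in data (the real EEG_Eye_State set is not bundled here, and k = 5 is an assumed setting, not one reported in the paper):

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(600, 14))              # 14 channels, as in the dataset
y = (X[:, :2].sum(axis=1) > 0).astype(int)  # synthetic eye-state label

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.30, random_state=1)

# k-NN stores the training set and votes among the k closest neighbours;
# no explicit model is learned beyond indexing the training data.
knn = KNeighborsClassifier(n_neighbors=5).fit(X_tr, y_tr)
print(f"test accuracy: {knn.score(X_te, y_te):.3f}")
```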

3.5. Deep Learning

Deep learning is a branch of machine learning that makes use of multi-layered artificial neural networks (deep neural networks). We used it for classification-based prediction. Deep learning requires large amounts of data and computational power to train effectively but can achieve state-of-the-art results in many domains. Figure 2 shows the RapidMiner process for the deep learning model. It likewise starts with the Retrieve EEG_Eye_State operator to load the dataset, which is then divided into training and test sets by the Split Data operator. The training data (tra) is fed into the Deep Learning operator to train the model (mod). The trained model (mod) and the test data (tes) are used by the Apply Model operator to generate predictions, which are finally evaluated by the Performance operator.
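A small multi-layer network can stand in for RapidMiner's Deep Learning operator; the architecture (two hidden layers), the scaling step, and the synthetic data below are all our assumptions, not settings from the paper.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = rng.normal(4300, 60, size=(800, 14))     # 14 synthetic EEG-like channels
y = (X[:, 0] + X[:, 1] > 8600).astype(int)   # synthetic eye-state label

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.30, random_state=1)

# Feature scaling matters for neural networks; raw values near 4300 would
# saturate the activations, so standardize before the MLP.
net = make_pipeline(
    StandardScaler(),
    MLPClassifier(hidden_layer_sizes=(32, 16), max_iter=500, random_state=0),
)
net.fit(X_tr, y_tr)
print(f"test accuracy: {net.score(X_te, y_te):.3f}")
```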

3.6. Decision Tree

The decision tree is a supervised machine learning algorithm applied to both classification and regression. Decision trees split the feature space to maximize information gain or minimize impurity, which suits the eye state prediction task. However, overfitting can occur if the tree depth is not controlled, and complex feature interactions can be hard to capture. Figure 3 shows the RapidMiner process for the decision tree model. The workflow starts with the Retrieve EEG_Eye_State operator to load the dataset. The Split Data operator then divides the data into training (tra) and test (tes) subsets. The training data is used by the Decision Tree operator to create the model (mod). Finally, the trained model (mod) is applied to the test data (tes) using the Apply Model operator to generate predictions, and the Performance operator evaluates the quality of those predictions.
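A minimal decision-tree sketch of the Figure 3 workflow on synthetic data; the `max_depth=5` cap is an assumed guard against the overfitting discussed above, not a setting reported in the paper.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(2)
X = rng.normal(4300, 60, size=(800, 14))  # synthetic 14-channel data
y = (X[:, 3] > 4300).astype(int)          # synthetic eye-state label

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.30, random_state=3)

# Limiting depth controls overfitting at the cost of some expressiveness.
tree = DecisionTreeClassifier(max_depth=5, random_state=0).fit(X_tr, y_tr)
print(f"depth: {tree.get_depth()}, test accuracy: {tree.score(X_te, y_te):.3f}")
```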

3.7. Random Forest

Random forest is a popular ensemble machine learning algorithm that combines multiple decision trees to improve predictive accuracy and control overfitting. Each tree is built using a random subset of the data and features, and the final prediction is made by averaging the outputs (for regression) or by majority voting (for classification). It is robust, handles missing data well, and works effectively for both structured and unstructured data. Figure 4 shows the RapidMiner process for the random forest model. The workflow begins with the Retrieve EEG_Eye_State operator to load the dataset. Next, the Split Data operator divides the data into training (tra) and test (tes) subsets. The training data is then used by the Random Forest operator to build the ensemble model (mod). The trained model (mod) is applied to the test data (tes) via the Apply Model operator to generate predictions, which are finally evaluated by the Performance operator to measure the model’s accuracy.
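A random-forest sketch of the Figure 4 process on synthetic data; 100 trees with bootstrap sampling are scikit-learn defaults, not values reported in the paper.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(4)
X = rng.normal(4300, 60, size=(800, 14))  # synthetic 14-channel data
y = (X[:, 0] - X[:, 1] > 0).astype(int)   # synthetic eye-state label

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.30, random_state=5)

# Each tree sees a bootstrap sample and a random feature subset per split;
# the classification output is the majority vote across the trees.
forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_tr, y_tr)
print(f"test accuracy: {forest.score(X_te, y_te):.3f}")
```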

4. Results

The dataset has a binary target variable (eye detection), 14,980 instances, and 14 numerical variables that represent sensor signals. The features, AF3, F7, F3, FC5, T7, P7, O1, O2, P8, T8, FC6, F4, F8, and AF4, are all likely signal channels from an EEG or sensor system. Most attributes have mean values between 4000 and 4600, suggesting that the dataset contains standardized measurements or normalized sensor readings. Some features, including AF3, FC5, P7, and AF4, show extremely high maximum values (e.g., 309,231 in AF3 and 715,897 in AF4); these could be anomalies, unusual values, or rare events. A large standard deviation suggests that some channels (such as AF4 and P7) vary considerably. There are two classes in the dataset: Class 0 (no eye detection) with 8257 cases (55.1% of the data) and Class 1 (eye detection) with 6723 cases (44.9% of the data). This mild class imbalance should be taken into account when developing a predictive model, and the extreme values may require processing such as standardization or outlier removal. Feature–target relationships must also be examined to assess their importance for predicting eye detection. The dataset can be used to predict eye-related actions by analyzing patterns in neural or physiological data; example applications include eye tracking technology, assistive technology, cognitive state analysis, and signal processing studies. Table 1 shows the accuracy of all models evaluated, indicating that traditional machine learning approaches outperformed the deep learning model in this study.
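The class percentages quoted above follow directly from the reported counts, and the extreme channel values suggest simple outlier screening. The sketch below verifies the arithmetic and shows one hypothetical z-score rule on synthetic data; the threshold of 4 and the injected spike values are our assumptions, not the paper's.

```python
import numpy as np

# Class counts reported in the Results section.
counts = {0: 8257, 1: 6723}
total = sum(counts.values())
print(total)                               # 14980 instances
print(round(100 * counts[0] / total, 1))   # share of class 0 (no detection)
print(round(100 * counts[1] / total, 1))   # share of class 1 (detection)

# Outlier screening sketch: flag samples whose z-score exceeds 4.
rng = np.random.default_rng(1)
channel = rng.normal(4300, 50, 10_000)     # typical readings near 4300
channel[::500] = 300_000                   # inject spikes like AF3's 309,231 max
z = np.abs((channel - np.median(channel)) / channel.std())
clean = channel[z < 4]                     # spikes removed, normal values kept
```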

5. Conclusions

In conclusion, this dataset is a time-series capture of sensor readings from different channels, most likely from an EEG device or brain–computer interface. The columns (e.g., AF3, F7, F3, FC5, T7) represent measurements from different electrode positions and can be mapped to the 10–20 EEG placement system. The values are continuous numerical data representing sensor readings or signal values recorded over time. A binary variable (0 or 1) in the eye detection column shows whether eye movement or detection took place during the recorded event. The dataset is organized to facilitate analyses such as identifying trends in brain activity, establishing how different channels relate to one another, or exploring how eye movements affect brain signals. Human–computer interaction research, mental stress detection, and cognitive state analysis are possible uses. Because of its structure, it can be used for time-series analysis, statistical study, or machine learning classification to gain important knowledge about the recorded signals and how they relate to eye detection events.

Author Contributions

M.A. conceptualized the study, designed the methodology, and supervised the overall research. T.M.A. conducted data collection, preprocessing, and implementation of the models. N.D.A. contributed to data analysis, interpretation of findings, and drafting of the manuscript. All authors reviewed, edited, and approved the final version of the paper. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data that support the findings of this study are available from the corresponding author upon reasonable request.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Ramos-Garcia, R.I.; Tiffany, S.; Sazonov, E. Using respiratory signals for the recognition of human activities. In Proceedings of the 38th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), Orlando, FL, USA, 16–20 August 2016. [Google Scholar]
  2. Lankford, C. Effective Eye-gaze Input into Windows. In Proceedings of the Symposium on Eye Tracking Research & Applications (ETRA ’00), ACM, New York, NY, USA, 6–8 November 2000; pp. 23–27. [Google Scholar]
  3. Weiser, M. The Computer for the 21st Century. Sci. Am. 1991, 265, 94–105. [Google Scholar] [CrossRef]
  4. Perelman, B.S. Detecting deception via eyeblink frequency modulation. PeerJ 2014, 2, e260. [Google Scholar] [CrossRef] [PubMed]
  5. Grossmann, T. The eyes as windows into other minds: An integrative perspective. Perspect. Psychol. Sci. 2017, 12, 107–121. [Google Scholar] [CrossRef] [PubMed]
  6. Walczyk, J.J.; Griffith, D.A.; Yates, R.; Visconte, S.R.; Simoneaux, B.; Harris, L.L. Lie detection by inducing cognitive load: Eye movements and other cues to the false answers of ‘witnesses’ to crimes. Crim. Justice Behav. 2012, 39, 887–909. [Google Scholar] [CrossRef]
  7. Ekenel, H.K.; Stallkamp, J.; Stiefelhagen, R. A video-based door monitoring system using local appearance-based face models. Comput. Vis. Image Underst. 2010, 114, 596–608. [Google Scholar] [CrossRef]
  8. Ahlstrom, U.; Friedman-Berg, F.J. Using eye movement activity as a correlate of cognitive workload. Int. J. Ind. Ergon. 2006, 36, 623–636. [Google Scholar] [CrossRef]
  9. Hoffman, J.E. Visual attention and eye movements. In Attention; Pashler, H., Ed.; Psychology Press: Hove, UK, 1998; pp. 119–154. [Google Scholar]
  10. Kuo, S.C.; Lin, C.J.; Liao, J.R. 3D reconstruction and face recognition using kernel-based ICA and neural networks. Expert Syst. Appl. 2011, 38, 5406–5415. [Google Scholar] [CrossRef]
  11. Yang, J.; Ling, X.; Zhu, Y.; Zheng, Z. A face detection and recognition system in color image series. Math. Comput. Simul. 2008, 77, 531–539. [Google Scholar] [CrossRef]
  12. Babiker, I.; Faye, I.; Malik, A. Pupillary behavior in positive and negative emotions. In Proceedings of the IEEE International Conference on Signal and Image Processing Applications, Melaka, Malaysia, 8–10 October 2013; pp. 379–383. [Google Scholar] [CrossRef]
  13. Diwaker, C.; Tomar, P.; Solanki, A.; Nayyar, A.; Jhanjhi, N.Z.; Abdullah, A.; Supramaniam, M. A New Model for Predicting Component-Based Software Reliability Using Soft Computing. IEEE Access 2019, 7, 147191–147203. [Google Scholar] [CrossRef]
  14. Kok, S.H.; Abdullah, A.; Jhanjhi, N.Z.; Supramaniam, M. A review of intrusion detection system using machine learning approach. Int. J. Eng. Res. Technol. 2019, 12, 8–15. [Google Scholar]
  15. Airehrour, D.; Gutierrez, J.; Ray, S.K. GradeTrust: A secure trust based routing protocol for MANETs. In Proceedings of the 25th International Telecommunication Networks and Applications Conference (ITNAC), Sydney, Australia, 18–20 November 2015; pp. 65–70. [Google Scholar]
Figure 1. The K-nearest neighbors.
Figure 2. Deep Learning.
Figure 3. Decision tree.
Figure 4. Random forest.
Table 1. Algorithm accuracy.
| Algorithm     | Accuracy |
|---------------|----------|
| K-NN          | 96.08%   |
| Deep learning | 55.5%    |
| Decision Tree | 55.56%   |
| Random Forest | 56.72%   |
