Review

A Survey of EEG-Based Approaches to Classroom Attention Assessment in Education

1 School of Music, Neijiang Normal University, Neijiang 641100, China
2 School of Artificial Intelligence, Neijiang Normal University, Neijiang 641100, China
* Author to whom correspondence should be addressed.
Information 2025, 16(10), 860; https://doi.org/10.3390/info16100860
Submission received: 5 September 2025 / Revised: 28 September 2025 / Accepted: 2 October 2025 / Published: 4 October 2025
(This article belongs to the Special Issue Artificial Intelligence in the Era of Omni-Channel Media)

Abstract

In evaluating classroom teaching quality, students' attention is a critical indicator in education management, and its assessment holds significant practical value for improving teaching methods and instructional quality. Electroencephalogram (EEG) signals can monitor dynamic neural activity in the brain in real time, and their objectivity and non-invasive nature make them particularly suitable for attention assessment in classroom environments. This article first provides a brief overview of existing attention assessment methods and then presents a comprehensive review of the current research status and methodologies in EEG-based attention assessment, including signal acquisition, preprocessing, feature extraction and selection, classification, and evaluation. Subsequently, the challenges in EEG-based teaching attention assessment are discussed, including the acquisition of high-quality signals, multimodal data fusion, data complexity, and hardware setups for implementing deep learning methods. Finally, a multimodal classroom attention assessment method that integrates EEG and eye movement signals is proposed to enhance teaching management.

1. Introduction

Improving students’ learning efficiency in classroom teaching is an important topic in modern educational management. Traditional classroom evaluation primarily depends on manual techniques, which limits the broader application and advancement of evaluation methods in educational management. Studies have found that students’ attention in class is crucial, as it directly affects their learning efficiency [1]. Attention refers to the ability of human psychological activity to be directed toward and focused on a specific object; it is a behavioral and cognitive process that emphasizes key information within the learning and memory environment. Some studies indicate that only 46% to 67% of students are able to concentrate in class, with as many as half of the students requiring the teacher’s attention and guidance [2]. Therefore, it is important to assess students’ attention for better classroom teaching. To assess attention, researchers typically rely on observation, behavioral scales, questionnaires, and performance-based tasks such as the NoGo task [3], Conners’ Continuous Performance Test (CPT) [4], and the Wisconsin Card Sorting Test (WCST) [5,6]. However, these methods are often influenced by the subjects’ subjective perceptions, leading to significant deviations in the results. For example, when survey questions require substantial cognitive effort, some subjects may simply choose an option that seems reasonable [7]. Alternatively, some participants may respond according to social expectations and choose positive options in order to present a better personal image [8]. Furthermore, the complex diagnostic process makes it challenging to assess a subject’s attention state quickly, resulting in a lack of real-time feedback on attention levels [9]. Therefore, developing an intelligent automatic attention assessment system that evaluates attention levels quickly and accurately holds significant research importance.
The growing development of machine learning has resulted in a rising number of researchers utilizing machine learning algorithms to evaluate attention states.
In previous studies, some researchers have assessed an individual’s attention level by examining external behavioral characteristics such as facial expressions, eye movements, and head gestures, or by tracking the subjects’ gaze and identifying their posture characteristics. For example, Gupta et al. examined students’ facial expressions to assess their emotional states during the learning process. Emotions were categorized into four types, namely high positive, low positive, high negative, and low negative, in order to assist teachers in enhancing their teaching strategies [10]. Pabba et al. proposed a monitoring system that utilizes facial expression recognition to track students’ attention in real time. The system employed computer vision technology along with convolutional neural networks (CNNs) to classify students into six states: bored, confused, focused, frustrated, yawning, and sleepy. It provides real-time feedback to teachers, aiding in the optimization of classroom teaching strategies [11]. However, facial images captured in real-world scenarios are frequently affected by environmental factors, camera limitations, individual differences, and experimental conditions, which reduces image quality and impacts the accuracy of facial expression recognition [12]. Rosengrant et al. presented an eye tracking method designed to evaluate students’ attention during computer classes. The system employed a camera to capture students’ eye movements and analyzed their trajectories to determine whether students had shifted their focus [13,14]. Although eye tracking can indicate levels of attention, the quality of eye tracking data is influenced by various factors, including students’ head movements, the presence of glasses, and eyelid occlusion.
In addition, eye movement behaviors vary considerably among individuals. These factors make it difficult to establish a universal attention model applicable to all students.
With the development of brain science, educational researchers have begun to apply its findings to the field of education and to study the science of learning from the perspective of neuroscience. As cognitive or neural states change, the electrical signals generated by neuronal activity in the brain directly reflect a learner’s mental state, which provides a reliable theoretical foundation for identifying individual attention levels. EEG is a non-invasive technique that measures the brain’s electrical activity. Through feature extraction and selection of EEG signals, machine learning classification algorithms can be used to evaluate individual attention levels, thereby accurately assessing a student’s attention. In 1999, Aoki et al. found that the gamma band of the EEG reflects attentional activity [15]. Recent advancements in brain science and wearable technology have shown that EEG signals can accurately and efficiently assess attention levels [16,17].
In recent years, research on using EEG signals to evaluate attention levels has found wide application in various fields. In the educational area, Al-Nafjan et al. proposed a brain–computer interface (BCI) system to assess students’ attention states during online courses through EEG signal analysis [18]. The classification method proposed in this study can more effectively distinguish the user’s attention states. In the healthcare field, EEG serves as a critical diagnostic tool for attention deficit hyperactivity disorder (ADHD), as it can capture the different neural activity patterns of ADHD patients and normal individuals [19]. EEG is also widely used in neurofeedback therapy, which helps patients learn to regulate their brainwave patterns by monitoring their EEG activity in real time and providing feedback signals [20]. In the field of intelligent driving, EEG has been used to detect driver attention by analyzing signal characteristics in different cognitive states. Obaidan et al. used a deep multiscale convolutional neural network to extract spectral-temporal features of EEG signals for detecting driver drowsiness [21]. Recent studies indicate that EEG is of significant value for attention detection across various fields and holds great potential for the future. Against this background, this article provides a comprehensive survey of the research methods and progress in EEG-based assessment of students’ classroom attention.
Figure 1 shows the general framework of an EEG-based attention assessment system, which includes several stages: EEG signal recording, preprocessing, feature extraction and selection, classification, and evaluation. EEG signals are recorded using electrodes placed on the scalp. Signal preprocessing eliminates noise and artifacts, improves the signal-to-noise ratio (SNR), and provides clean data for subsequent feature extraction; it includes filtering, artifact removal, data segmentation, baseline correction, and signal enhancement. Feature extraction aims to extract attention-related features from the preprocessed EEG signals in the time, frequency, time–frequency, nonlinear, or spatial domain. Feature selection involves choosing the most discriminative subset of extracted features to reduce dimensionality, minimize computational effort, and facilitate classifier computation, using filter, wrapper, or embedded methods. The classifier maps the selected features to different attention state categories using traditional machine learning or deep learning techniques. The accuracy and reliability of the classifier are evaluated by various metrics.
This paper is organized as follows. Section 2 introduces the methods for EEG signal recording. Section 3 focuses on the approaches for EEG signal processing. Section 4 covers the EEG signal feature extraction and selection. The classification and evaluation metrics are reviewed in Section 5. Commonly used datasets in attention assessment are introduced in Section 6. Section 7 surveys traditional attention assessment methods without artificial intelligence. Section 8 discusses some current challenges in EEG-based attention assessment for classroom teaching. In Section 9, a multimodal attention assessment method based on EEG is proposed for evaluating teaching quality in the classroom.

2. EEG Signal Recording

2.1. EEG Signal

The EEG signal results from the electrical activity generated by ionic currents flowing within and between neurons in the brain. In 1929, Hans Berger, a German psychiatrist, was the first to record EEG signals using electrodes positioned on the human scalp [22]. He successfully recorded electrical activity with a frequency of around 10 Hz, which he designated the “alpha rhythm”. EEG reflects the electrical activity of a group of neurons in the brain region where the electrode is placed and contains a wide range of important psychophysiological information. Therefore, the EEG signal directly reflects brain activity and is essential for understanding the physiological processes taking place in the human brain. As a biomedical signal, the EEG is characterized by its nonlinear, non-stationary nature and low SNR. EEG is a time-domain signal that is typically split into several frequency bands. Studies have shown that specific bands within the signal are closely linked to certain mental states, as described in Table 1 [23,24]. It should be noted that the ranges of these bands are not strictly fixed and can vary depending on individual differences and cognitive states [25].
The assessment of attention is linked to brain electrical activity across different frequency bands. Alpha waves are associated with relaxation and rest, and when individuals focus their attention, the activity of alpha waves in the parietal and occipital regions typically decreases [26]. Therefore, concentrated attention is often accompanied by the suppression of alpha waves. Theta waves are associated with cognitive control and attention allocation [27]. Some studies have observed increased theta wave activity in the EEGs of patients with ADHD [28]. Beta waves are linked to positive thinking, problem-solving, and focused attention [29]. Some studies indicate that beta wave activity may be reduced in patients with ADHD [28,30]. Gamma waves are closely associated with higher cognitive functions, such as information processing and sensory–motor integration [26]. In activities requiring high attention, such as when individuals concentrate on a particular visual stimulus, gamma wave activity generally rises [31,32].
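These band-level effects are commonly quantified as band power, i.e., the power spectral density integrated over each band. The following sketch is an illustrative example (not drawn from any of the cited studies); the sampling rate, epoch length, and band edges are assumptions that would be adapted to the recording setup:

```python
import numpy as np
from scipy.signal import welch

FS = 256  # assumed sampling rate (Hz)
BANDS = {"theta": (4, 8), "alpha": (8, 13), "beta": (13, 30), "gamma": (30, 45)}

def band_powers(epoch, fs=FS):
    """Integrate Welch's PSD estimate over each classical EEG band."""
    freqs, psd = welch(epoch, fs=fs, nperseg=fs)
    return {name: np.trapz(psd[(freqs >= lo) & (freqs < hi)],
                           freqs[(freqs >= lo) & (freqs < hi)])
            for name, (lo, hi) in BANDS.items()}

# Synthetic 4 s epoch: a dominant 10 Hz (alpha) oscillation plus white noise
np.random.seed(0)
t = np.arange(0, 4, 1 / FS)
epoch = 5 * np.sin(2 * np.pi * 10 * t) + np.random.randn(t.size)
powers = band_powers(epoch)
# Alpha suppression during focused attention would appear as a drop in
# powers["alpha"] relative to a resting baseline.
```

In practice such band powers (or their ratios, e.g., theta/beta) are computed per epoch and per channel and fed to the attention classifier.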

2.2. Hardware Setup for EEG Signal Processing

The EEG hardware system is designed to accurately acquire, amplify, digitize, and process the weak brain electrical signals. Figure 2 is a general hardware configuration for EEG signal processing, which includes an EEG signal acquisition device and a processing unit.
A typical EEG acquisition device generally consists of an electrode cap, amplifiers, analog-to-digital converters (ADCs), a microcontroller unit (MCU), and data transmission modules. EEG signals are collected through the electrode cap. The microvolt-level EEG signals are first amplified by high-gain, low-noise amplifiers and then filtered to remove noise and artifacts. Subsequently, the signals are digitized by the ADCs and sent to an MCU. The amplifier, filter, and ADC together are commonly referred to as the analog front-end (AFE), which can be implemented using the ADS1299 bio-potential AFE chip from Texas Instruments; the ADS1299 integrates eight low-noise programmable gain amplifiers and 24-bit ADCs [33]. The MCU processes the digitized signals, performs digital filtering, and conducts basic data analysis [34]. In some cases, edge computing is deployed within EEG acquisition devices to reduce latency, lower bandwidth requirements, and improve system responsiveness. Edge computing functions can be implemented using field programmable gate arrays (FPGAs), ARM Cortex processors, and application-specific integrated circuits (ASICs) [35,36,37,38]. The processed data are transmitted to the backend processing unit for further signal processing and analysis via wired or wireless methods. Commonly used wired interfaces include USB, UART, and Ethernet, while wireless interfaces include Wi-Fi and Bluetooth.
Commercial EEG acquisition devices vary according to practical requirements. The BioSemi ActiveTwo system supports up to 280 channels with 24-bit ADC resolution and is widely used for attention assessment [39,40,41,42]. The BrainAmp DC is another solution for laboratory EEG recordings, supporting up to 256 channels at 16-bit ADC resolution [43,44,45]. Besides commercial devices, some researchers have also developed custom-designed EEG acquisition devices. Table 2 lists some EEG acquisition devices used in research.
The processing unit is the core platform for EEG signal data processing and analysis, covering the entire workflow from raw signal handling to final result output. It is primarily based on a personal computer platform, where the CPU performs key functions including EEG signal preprocessing and noise reduction, feature extraction and computation, as well as feature classification and decision-making output. In recent years, GPUs have become essential tools for processing EEG data and accelerating deep learning models due to their powerful parallel processing capabilities and high memory bandwidth. EEG signal processing involves numerous repetitive mathematical operations, such as filtering, feature extraction, and artifact removal, where the parallel processing units of GPUs can significantly enhance efficiency. Deep learning models have been applied to EEG artifact removal and classification, and GPUs have reduced the training time of these models from several days to just a few hours. Table 3 lists the NVIDIA series GPUs used in EEG signal processing.

2.3. EEG Signal Acquisition

EEG signal acquisition is a crucial technique for studying brain activity, as it records the electrical activity of brain neurons through electrodes. It can be carried out using both invasive and non-invasive techniques. The invasive method offers a significantly higher SNR than non-invasive approaches; however, it requires surgical implantation into the skull, with electrodes penetrating the cerebral cortex to obtain signals. As a result, non-invasive methods are more commonly used in attention assessment, as they are cost-effective and allow for easy signal acquisition using wearable EEG caps or headsets that position electrodes on the scalp.
In EEG signal acquisition, both the 10-20 and 10-10 electrode systems are utilized. The 10-20 system is the international standard for EEG electrode placement and has been in use for several decades [62]. This system uses measurements of external landmarks on the skull to locate the electrodes on the scalp and assumes a consistent correlation between scalp electrode positions and the underlying brain structures. It uses percentages to determine electrode positions, with the distance between adjacent electrodes representing 10% or 20% of the total distance. The main advantages of the 10-20 system are its standardization and wide applicability, and it is widely used in clinical and research fields. The 10-10 system is an extension of the 10-20 system, created by adding electrodes between those of the 10-20 system [63]. This increases electrode density and provides higher-resolution EEG data. The choice between the two systems depends on the specific goals and needs of the research. If classroom attention assessment requires a standardized setup and does not demand high spatial resolution, the 10-20 system may be more suitable because it is highly standardized and widely used. Data recorded with this system exhibit good consistency and reproducibility across different studies and laboratories, demonstrating excellent compatibility for cross-study comparisons [64]. Additionally, the 10-20 system uses fewer electrodes, which can reduce the risk of artifacts, especially in scenarios involving movement or long-term monitoring; the smaller amount of data also facilitates real-time signal processing, which is important in classroom attention assessment. However, if the research necessitates detailed localization of brain activity and requires high-resolution data, the 10-10 system may be more appropriate.
In addition to the traditional method of scalp EEG acquisition, recent studies have also explored the use of ear electrodes for EEG collection, which have been applied in attention assessment [65,66].
The BioSemi ActiveTwo system, actiCHamp, and BrainAmp DC utilize a large number of wet electrodes, which makes them relatively expensive and bulky. To reduce the size of EEG collection devices and achieve portability, dry electrodes are used for signal acquisition. The MindWave Mobile 2 from NeuroSky is a portable single-electrode (FP1 position in the 10-20 system) EEG headset with 12-bit raw brainwave output. It is powered by a dry cell battery and transmits data wirelessly to a smartphone or personal computer via Bluetooth. Even though the MindWave Mobile 2 uses a single channel, it has been applied in attention assessment research and has achieved notable outcomes [67,68]. This product facilitates real-time attention assessment and modulation through wearable brain–computer interfaces (BCIs). In attention assessment research, sampling rates vary from 128 Hz [69] to 8192 Hz [70], with 512 Hz [9,16,71,72] and 1000 Hz [73,74,75] being commonly used.

3. EEG Signal Preprocessing

The amplitude of EEG signals is quite weak, typically in the microvolt range. During signal acquisition, it is susceptible to various types of noise, which can lead to a reduced SNR. Additionally, during the recording of EEG signals, they are often affected by various artifacts, which primarily originate from both physiological and nonphysiological sources. Among them, physiological artifacts mainly include electrooculography (EOG) artifacts caused by eye movements and blinking, electromyography (EMG) artifacts generated by muscle activity, electrocardiography (ECG) artifacts, etc. Nonphysiological artifacts primarily encompass noise introduced by external factors such as power line interference, electrode displacement, and loose connections in the wiring. Consequently, signal preprocessing steps are essential to mitigate noise contamination that may affect subsequent feature extraction and classification.

3.1. Filtering

Frequency-domain filters can eliminate artifacts from recorded EEG signals by narrowing the analyzed bandwidth. An appropriate filter is chosen based on the desired EEG frequency band or the artifact frequency range, which effectively reduces noise and artifacts in the original EEG signal. Commonly used filters include low-pass, high-pass, band-pass, and band-stop filters. According to the recommendations of the American Clinical Neurophysiology Society, in standard EEG recordings the low-frequency filter cutoff should not exceed 1 Hz, and the high-frequency filter cutoff should be no lower than 70 Hz [76]. In attention assessment, the low cut-off frequency can be as low as 0.1 Hz [77,78]. Compared to other filter types, the Butterworth filter provides a more uniform gain within the passband; it is frequently used in EEG signal processing and is also employed to enhance specific frequency bands [79]. The most common application of a notch filter in EEG processing is to eliminate power line interference, typically at 50 Hz or 60 Hz.
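A filtering chain of this kind can be sketched with SciPy. This is a minimal illustration; the 1–40 Hz passband, filter order, sampling rate, and 50 Hz mains frequency are assumptions that would be adapted to the recording setup:

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, iirnotch, filtfilt

FS = 512  # assumed sampling rate (Hz)

def bandpass(sig, low=1.0, high=40.0, fs=FS, order=4):
    # Butterworth band-pass in second-order sections for numerical stability;
    # sosfiltfilt applies it forward and backward for zero phase distortion
    sos = butter(order, [low, high], btype="band", fs=fs, output="sos")
    return sosfiltfilt(sos, sig)

def notch(sig, f0=50.0, q=30.0, fs=FS):
    # Narrow band-stop filter to suppress 50 Hz power-line interference
    b, a = iirnotch(f0, q, fs=fs)
    return filtfilt(b, a, sig)

t = np.arange(0, 2, 1 / FS)
alpha_like = np.sin(2 * np.pi * 10 * t)              # 10 Hz component to keep
noisy = alpha_like + 2 * np.sin(2 * np.pi * 50 * t)  # add mains interference
filtered = notch(bandpass(noisy))
```

Zero-phase (forward-backward) filtering is preferred offline because it does not shift waveform latencies, which matters for ERP analysis; causal filters would be used for real-time processing.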

3.2. Artifacts Removal

The artifacts from other bio-electrical signals, such as EOG, EMG, and ECG can be partially removed using filtering techniques; however, completely eliminating these artifacts is often challenging [80]. To more effectively remove artifacts, it is often necessary to combine filtering with other signal processing techniques.
Independent Component Analysis (ICA) is a blind source separation technique that decomposes EEG signals into multiple independent components, allowing for the identification and removal of components containing artifacts. This is typically achieved by subtracting the artifact components from the original data and then reconstructing the remaining EEG signals [81]. ICA can effectively separate independent components from mixed signals without prior knowledge of the exact characteristics of the artifacts, making it suitable for removing various types of artifacts, including eye movements, muscle activity, and cardiac artifacts. However, its effectiveness is influenced by data quality and algorithm parameters [82].
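The ICA workflow described above (decompose, identify the artifact component, zero it out, reconstruct) can be sketched on simulated data with scikit-learn's FastICA. This is an illustrative toy example: real pipelines usually identify artifact components by correlation with EOG channels or expert inspection, whereas here kurtosis serves as a simple heuristic for the spiky blink component:

```python
import numpy as np
from scipy.stats import kurtosis
from sklearn.decomposition import FastICA

rng = np.random.default_rng(0)
fs = 256
t = np.arange(0, 4, 1 / fs)

# Simulated sources: a neural oscillation, an eye-blink artifact, and noise
neural = np.sin(2 * np.pi * 10 * t)
blink = np.zeros_like(t)
blink[fs : fs + fs // 4] = 5.0                 # one large slow deflection
noise = 0.3 * rng.standard_normal(t.size)
S = np.c_[neural, blink, noise]

A = rng.random((3, 3)) + 0.5                   # mixing matrix -> 3 "channels"
X = S @ A.T

ica = FastICA(n_components=3, random_state=0)
components = ica.fit_transform(X)              # estimated independent components

# Heuristic: the blink component is spiky, so it has the highest kurtosis
artifact = int(np.argmax(kurtosis(components, axis=0)))
components[:, artifact] = 0.0                  # zero out the artifact component
cleaned = ica.inverse_transform(components)    # reconstruct channel-space EEG
```

Because the blink dominates the raw mixtures, the reconstructed signals have a markedly smaller peak amplitude after the artifact component is removed.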
Wavelet transform (WT) is a time–frequency analysis method that decomposes a signal into a series of wavelet functions, which possess excellent localization properties in both time and frequency domains. This capability allows wavelet transform to provide information about the signal at various time and frequency scales, giving it a unique advantage in processing non-stationary signals, such as EEG signals. Eye movement artifacts usually appear as low-frequency, high-amplitude waveforms, whereas muscle artifacts are characterized by high-frequency, random waveforms. WT can decompose EEG signals into components of different scales and frequencies, and then, artifacts can be removed by applying thresholding to the components that contain artifacts [83,84]. WT has advantages in handling non-stationary signals and can effectively remove artifacts from EEG signals [85].
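The threshold-based wavelet denoising described above can be sketched with the PyWavelets package. This is an illustrative example; the db4 wavelet, decomposition level, and universal soft threshold are common but assumed choices:

```python
import numpy as np
import pywt

rng = np.random.default_rng(1)
fs = 256
t = np.arange(0, 2, 1 / fs)
clean = np.sin(2 * np.pi * 4 * t)                  # theta-band oscillation
noisy = clean + 0.8 * rng.standard_normal(t.size)  # broadband noise

# Decompose, soft-threshold the detail coefficients, reconstruct
coeffs = pywt.wavedec(noisy, "db4", level=4)
sigma = np.median(np.abs(coeffs[-1])) / 0.6745     # noise estimate (finest scale)
thr = sigma * np.sqrt(2 * np.log(noisy.size))      # universal threshold
denoised = [coeffs[0]] + [pywt.threshold(c, thr, mode="soft") for c in coeffs[1:]]
cleaned = pywt.waverec(denoised, "db4")[: noisy.size]
```

For artifact removal (rather than denoising), the same mechanism is applied selectively: thresholding is restricted to the scales where the targeted artifact concentrates, e.g., the coarse, high-amplitude scales for ocular artifacts.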
Empirical mode decomposition (EMD), an adaptive signal processing method, has shown effective results in removing artifacts from EEG signals. EMD decomposes a signal into a series of intrinsic mode functions (IMFs), which can then be used to identify and remove artifacts based on their characteristics. Research has demonstrated that EMD can isolate ocular artifacts into specific IMFs; by identifying and eliminating the IMFs that contain ocular artifacts, the interference of ocular artifacts on EEG signals can be effectively reduced [86].
The comparison of these methods for removing artifacts is listed in Table 4. ICA is effective for processing multi-channel EEG data and can remove eye and muscle artifacts well. However, it requires a lot of computing power and often needs manual adjustments. The WT is good for analyzing non-stationary signals, but its effectiveness relies on choosing the right wavelet. EMD is adaptable for nonlinear signals but can be affected by boundary issues and mode aliasing. Each method has its own advantages and disadvantages in terms of computing efficiency, automation, and artifact removal in EEG preprocessing. For multi-channel data with identifiable artifact sources, ICA is the preferred choice. For specific frequency band interference or when quick processing is needed, wavelet transform is ideal. EMD is a good choice for single-channel data and nonlinear or non-stationary signals.
In practical applications, combined methods often yield better results. Combining ICA with WT and EMD can enhance the accuracy and robustness of artifact removal. This approach allows for more effective separation and extraction of relevant signal components, improving the overall quality of the processed data [81,82]. EMD is often combined with other signal processing methods. For example, by combining EMD with ICA, the EEG signals can first be decomposed into multiple IMFs using EMD. Subsequently, ICA can be applied to these IMFs for separation, allowing for more effective artifact removal [87].

3.3. Signal Segmentation and Baseline Correction

EEG signals are typically recorded continuously; however, to analyze brain activity related to specific events or cognitive processes, they need to be segmented into multiple time intervals. Signal segmentation methods include event-related approaches and fixed time window methods. Event-based segmentation divides continuous EEG signals into time segments based on specific events in the experimental design, such as stimulus presentation or responses. Fixed time window segmentation divides EEG signals into segments of a predetermined length without considering specific experimental events.
Artifacts often cause baseline drift in EEG signals, which can obscure genuine brain activity and affect subsequent analysis and feature extraction [88]. Therefore, baseline correction is necessary to provide a more accurate representation of neural activity. The principle of baseline correction is to subtract from the signal the average value calculated within a specified reference time period. This is a simple and effective approach, particularly suitable for cases where baseline drift is relatively linear. For EEG signals with complex baseline drift, adaptive baseline correction methods can be employed to extract intricate drift patterns that include both gradual changes and abrupt activity [89].
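Fixed-window segmentation followed by mean-subtraction baseline correction can be sketched in a few lines of NumPy (the window and baseline lengths are assumed for illustration):

```python
import numpy as np

FS = 256  # assumed sampling rate (Hz)

def segment(signal, fs=FS, win_s=2.0):
    """Split a continuous recording into non-overlapping fixed-length epochs."""
    n = int(win_s * fs)
    n_epochs = signal.size // n
    return signal[: n_epochs * n].reshape(n_epochs, n)

def baseline_correct(epochs, fs=FS, base_s=0.2):
    """Subtract each epoch's mean over its first `base_s` seconds."""
    n_base = int(base_s * fs)
    baseline = epochs[:, :n_base].mean(axis=1, keepdims=True)
    return epochs - baseline

# 10 s synthetic recording with a constant offset acting as baseline drift
sig = np.sin(2 * np.pi * 1.0 * np.arange(0, 10, 1 / FS)) + 3.0
epochs = baseline_correct(segment(sig))
```

For event-related segmentation, the reshape would be replaced by slicing windows around each event marker, with the baseline interval typically taken from just before stimulus onset.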
Signal segmentation and baseline correction are essential steps in EEG preprocessing, as they enhance data quality and improve the effectiveness of the analysis. It is recommended to segment the signals before performing baseline correction. Kessler et al. showed that the combination of EEG preprocessing steps such as filtering, artifact removal, signal segmentation, and baseline correction can significantly affect classification performance [90].

3.4. Signal Enhancement

In addition to filtering and artifact removal, signal enhancement techniques may also be used to improve signal quality. Different from filtering and artifact removal methods, the primary goal of signal enhancement is to increase the SNR of the target signals. This approach emphasizes critical information within the EEG data, thereby improving the clarity and detectability of the desired signals. Below are some research methods used for EEG signal enhancement.
Xu et al. proposed a method based on the ICA algorithm to enhance the P300 component in ERP signals. Experimental results demonstrated that, after enhancement, target and non-target responses could be easily distinguished from the P300 with minimal effort, and the method achieved an accuracy of 100% on dataset IIb of the BCI Competition 2003 [91]. Maddirala et al. proposed an ICA-based signal enhancement method for single-channel EEG. The approach first employs Singular Spectrum Analysis to decompose the single-channel EEG data into multivariate components; ICA is then applied to these multivariate data to extract the underlying EEG source signals [92].
Lin et al. introduced a real-time EEG signal enhancement method based on Canonical Correlation Analysis (CCA). In this approach, CCA is used for source separation, and the Gaussian Mixture Model is integrated to further enhance the EEG signals [93]. Kalunga et al. utilized CCA to enhance steady-state visually evoked potentials (SSVEPs) in EEG signals. Their method effectively improved the SSVEP features across EEG signals recorded at multiple different time durations [94].
Cai et al. employed the Common Spatial Patterns (CSPs) method to enhance EEG signals, aiming to address the low SNR issue in auditory attention detection tasks [95]. Khanmohammadi et al. proposed a signal enhancement method for epilepsy detection that combines Principal Component Analysis (PCA) with CSP. The approach first applies PCA to reduce the dimensionality of EEG signals, followed by CSP to enhance the SNR [96].
Molla et al. applied a multivariate wavelet transform technique to enhance ERP signals in EEG data, thereby improving the SNR. Experimental results demonstrated that the algorithm significantly boosted the classification performance of single-trial ERP data [97]. Sita et al. proposed a wavelet-domain nonlinear filtering method to enhance the SNR of evoked potentials, applied to auditory evoked late latency responses and brainstem auditory evoked potentials. Experimental results demonstrated that the method effectively enhanced and smoothed the component peaks [98].
Maki et al. introduced a probabilistic generative model of multi-channel Wiener filters for EEG to enhance ERP signals. This method introduced prior information about the spatial correlation matrix related to the target ERP. Experiments showed that this method significantly reduced artifacts and improved the observation effect of ERP [99]. Yadav et al. employed an adaptive Kalman filter with parameters optimized using metaheuristic techniques to enhance ERP signals. The method was tested under various noise conditions and demonstrated favorable performance [100].

4. EEG Signal Feature Extraction and Selection

The preprocessed EEG signals still contain components unrelated to attention assessment, and the high dimensionality of the data can diminish the performance of subsequent attention assessment model training and recognition. Therefore, it is necessary to extract attention-related features from the original EEG signals. Specific feature selection methods can then be employed to reduce the dimensionality of the extracted features, minimizing the feature sample space, lowering the computational demands of subsequent classification, and improving prediction accuracy. In this section, the features used for attention assessment in classroom teaching are summarized and compared.

4.1. Time-Domain Features

EEG is a time-domain signal; therefore, time-domain feature extraction methods related to attention are an important component of BCI and cognitive neuroscience research. Time-domain feature extraction is used to quantify the changes in EEG signals over time, which in turn reflects various attention states. These methods generally include calculating statistical properties of the signals, analyzing waveform characteristics, and examining ERPs.
The time statistical feature method utilizes statistical measures such as mean, standard deviation, variance, skewness, and kurtosis to characterize the distribution properties of EEG signals. Its advantages are that it is easy to calculate and implement, providing a quick overview of the signal’s overall statistical properties [101]. However, it cannot detect small changes in the signal or capture time-related information, which limits its effectiveness for non-stationary signals. The amplitude feature extraction method identifies attention-related patterns by analyzing changes in the EEG signal’s amplitude, such as maximum, minimum, and peak-to-peak values. Its physical significance is clear, as it directly reflects the signal strength from specific brain regions. However, it is susceptible to noise interference and is not sensitive to small variations in the signal. The energy feature extraction method detects signal strength characteristics by calculating the energy or power of the signal within a specific time window. However, it is not sensitive to phase changes in the signal.
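As an illustration, these statistical, amplitude, and energy features can be computed in a few lines. The sketch below uses NumPy/SciPy on a synthetic epoch; the epoch length and sampling rate are arbitrary choices standing in for a real EEG segment:

```python
import numpy as np
from scipy import stats

def time_domain_features(epoch):
    """Basic statistical, amplitude, and energy features for one EEG epoch (1-D array)."""
    return {
        "mean": np.mean(epoch),
        "std": np.std(epoch),
        "variance": np.var(epoch),
        "skewness": stats.skew(epoch),
        "kurtosis": stats.kurtosis(epoch),
        "peak_to_peak": np.ptp(epoch),   # max minus min amplitude
        "energy": np.sum(epoch ** 2),    # signal energy within the window
    }

rng = np.random.default_rng(0)
epoch = rng.standard_normal(512)         # one synthetic 2 s epoch at 256 Hz
feats = time_domain_features(epoch)
```

In practice such features would be computed per channel and per sliding window, then concatenated into a feature vector for the classifier.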
ERPs refer to the responses of EEG signals to specific stimuli or events. ERP components, such as P1, N1, P2, N2, and P300, are key indicators for studying attention assessment [26,102]. ERPs offer high temporal resolution, allowing for precise tracking of the timing of cognitive processes. However, their spatial resolution is low, making it challenging to accurately identify the sources of neural activity. Additionally, ERPs can be affected by noise and artifacts, and data quality may vary depending on the study, participants, and scoring procedures [103].
The Hjorth parameter method is employed for EEG time-domain feature extraction and includes three parameters: activity, mobility, and complexity. Activity represents the variance of the signal, mobility indicates the changes in the signal’s frequency, and complexity reflects the degree of complexity of the signal [104]. It can provide a comprehensive description of the signal’s amplitude, frequency, and complexity, showing effectiveness in distinguishing between varying levels of attention in EEG signals [105]. However, the calculations are relatively complex, and it can be challenging to fully extract the intrinsic features of complex nonlinear EEG signals.
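The three Hjorth parameters follow directly from the variance of the signal and of its first and second differences. The minimal sketch below (our own illustration, not taken from the cited works) shows that adding noise to a smooth oscillation raises both mobility and complexity:

```python
import numpy as np

def hjorth_parameters(x):
    """Hjorth activity, mobility, and complexity of a 1-D signal."""
    dx = np.diff(x)    # first derivative approximated by finite differences
    ddx = np.diff(dx)  # second derivative
    var_x, var_dx, var_ddx = np.var(x), np.var(dx), np.var(ddx)
    activity = var_x                                   # signal power
    mobility = np.sqrt(var_dx / var_x)                 # mean frequency proxy
    complexity = np.sqrt(var_ddx / var_dx) / mobility  # waveform complexity
    return activity, mobility, complexity

rng = np.random.default_rng(1)
t = np.linspace(0, 2, 512, endpoint=False)
slow = np.sin(2 * np.pi * 5 * t)                   # smooth 5 Hz oscillation
noisy = slow + 0.8 * rng.standard_normal(t.size)   # same rhythm plus noise
# The noisier signal yields higher mobility and complexity values.
```

For a pure sinusoid the complexity is close to 1, which gives these parameters an intuitive baseline when comparing attention states.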

4.2. Frequency-Domain Features

In EEG attention assessment, there is a wide variety of frequency-domain feature extraction methods. These methods primarily reflect the brain’s activity state by analyzing the energy distribution of EEG signals across different frequencies, which in turn helps infer an individual’s level of attention.
Fast Fourier transform (FFT) is one of the most commonly used methods for frequency-domain feature extraction. It transforms time-domain signals into frequency-domain representations, allowing for the calculation of power across different frequencies. As described in Table 1, EEG signals are generally categorized into several primary frequency bands, namely Delta, Theta, Alpha, Beta, and Gamma. The power spectrum reflects the energy distribution across various frequency bands, which can be used to identify different states of attention. To calculate the energy or power of each frequency band, the power spectral density (PSD) is typically used for estimation. The energy value for each frequency band can be obtained by integrating the power within that band. PSD analysis allows for the evaluation of the extent to which various brain regions participate in specific attention tasks, as well as the changes in attention levels.
Delta waves are commonly associated with deep sleep and unconscious states. However, research indicates that Delta waves are also related to various cognitive processes, particularly during attention assessment tasks [106]. Theta waves are linked to relaxation, meditation, and working memory. When performing tasks that require cognitive control, Theta wave activity in the prefrontal cortex increases [107,108]. Alpha waves are primarily associated with a relaxed state of alertness, and their power typically decreases when individuals focus their attention or engage in visual–spatial tasks [109,110]. Beta waves are associated with active thinking, problem-solving, and alertness. Their power typically increases during tasks that require high levels of concentration, particularly in the frontal lobe region [107,108]. Gamma waves are linked to higher cognitive functions, and research has shown that gamma wave activity plays a significant role in visual attention processes [111]. In EEG-based attention assessment research, the Theta/Beta ratio (TBR) of the frontal lobe is closely related to attention states [112].
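As a sketch of how band powers and the TBR might be estimated in practice, the following code applies Welch's PSD estimate to a synthetic signal. The band boundaries are common conventions rather than fixed standards, and the 6 Hz and 20 Hz components are purely illustrative:

```python
import numpy as np
from scipy.signal import welch

BANDS = {"delta": (1, 4), "theta": (4, 8), "alpha": (8, 13),
         "beta": (13, 30), "gamma": (30, 45)}

def band_powers(x, fs):
    """Absolute power per frequency band from a Welch PSD estimate."""
    freqs, psd = welch(x, fs=fs, nperseg=fs * 2)
    df = freqs[1] - freqs[0]
    return {name: psd[(freqs >= lo) & (freqs < hi)].sum() * df
            for name, (lo, hi) in BANDS.items()}

fs = 256
t = np.arange(fs * 4) / fs
rng = np.random.default_rng(2)
# Synthetic "frontal" signal: strong 6 Hz theta, weaker 20 Hz beta, plus noise
x = 2.0 * np.sin(2 * np.pi * 6 * t) + 0.5 * np.sin(2 * np.pi * 20 * t) \
    + 0.1 * rng.standard_normal(t.size)

p = band_powers(x, fs)
tbr = p["theta"] / p["beta"]   # theta/beta ratio; higher values suggest lower attention
```

With the strong theta component above, the computed TBR is well above 1, mimicking the pattern reported for low-attention states.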
The advantages of FFT include high computational efficiency and simplicity of algorithm implementation. However, it lacks time resolution, making it challenging to analyze non-stationary signals. PSD can quantify the energy intensity of a signal within specific frequency bands, making it suitable for analyzing the rhythmic activities of EEG signals. However, it does not offer phase information.

4.3. Time–Frequency-Domain Features

Time–frequency-domain features characterize the frequency variations in EEG signals over different time periods, allowing for a more comprehensive observation of the distribution of signal power in the time–frequency-domain. Short-Time Fourier transform (STFT) is a commonly used time–frequency analysis method that allows for the extraction of the frequency components of the signal at different time points. STFT is suitable for the rapid analysis of the basic spectral characteristics of EEG signals; however, the time and frequency resolutions are limited by the length of the time window, making it impossible to achieve optimal performance in both simultaneously. Compared to STFT, the wavelet transform (WT) offers higher frequency resolution in the low-frequency range and better time resolution in the high-frequency range. Therefore, WT is more suitable for analyzing non-stationary signals. The choice of wavelet basis function significantly impacts the results, as different basis functions are suited to various signal characteristics. WT requires considerable computational resources when handling large-scale EEG data. The Hilbert–Huang transform (HHT) is a time–frequency analysis technique for nonlinear and non-stationary signals. It effectively determines the instantaneous frequency and power of the EEG signal, has the capability to analyze both nonlinear and non-stationary signals, and retains time information [101].
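The window-length trade-off described above can be seen in a small STFT example. Here SciPy's `stft` is applied to a synthetic signal that switches from 10 Hz to 20 Hz halfway through; the frequencies and window settings are illustrative assumptions:

```python
import numpy as np
from scipy.signal import stft

fs = 256
t = np.arange(fs * 4) / fs
# Non-stationary test signal: 10 Hz for the first 2 s, then 20 Hz
x = np.where(t < 2, np.sin(2 * np.pi * 10 * t), np.sin(2 * np.pi * 20 * t))

# 1 s Hann windows with 50% overlap give 1 Hz frequency resolution
freqs, times, Z = stft(x, fs=fs, nperseg=fs, noverlap=fs // 2)
power = np.abs(Z) ** 2

# The dominant frequency per time frame tracks the 10 Hz -> 20 Hz transition
dominant = freqs[np.argmax(power, axis=0)]
```

Shorter windows would localize the transition more precisely at the cost of coarser frequency bins, which is exactly the limitation that motivates WT and HHT.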
In addition to the above methods, EMD [113], Wigner–Ville distribution (WVD) [114] and Wavelet Packet decomposition (WPD) [115] are also utilized in attention extraction of EEG signals.

4.4. Nonlinear Features

Nonlinear methods can effectively capture complex dynamic information in EEG signals that linear approaches may fail to detect, thereby increasing the accuracy of distinguishing different attention levels [116]. In EEG attention assessment, commonly used methods for nonlinear feature extraction include Fractal dimension (FD) and entropy. These methods can capture the complexity and dynamics of the signals, which makes them suitable for analyzing the brain’s attention states.

4.4.1. Fractal Dimension

FD is a metric that describes complex geometric shapes through non-integer dimensions, effectively capturing the nonlinear features and self-similarity present in EEG signals [117]. Higuchi fractal dimension (HFD) is a commonly used method for calculating fractal dimension. It estimates the fractal dimension of a signal by calculating the length of the curve at different time intervals. This method is particularly well-suited for analyzing non-stationary signals like EEG. With HFD, the complexity of EEG signals can be quantified, thereby reflecting the state of brain activity. Research has shown that HFD can effectively differentiate EEG signals under various cognitive states, such as improvements in attention [118]. In addition to HFD, other nonlinear features, such as Largest Lyapunov exponent (LLE), Correlation dimension (CD), and Katz fractal dimension (KFD) have been utilized in attention assessment [119]. Because EEG signals exhibit nonlinear and non-stationary properties, employing nonlinear feature extraction methods can better reveal the nonlinear dynamic characteristics within these signals, thereby enhancing their biological interpretability. However, nonlinear feature extraction methods generally have higher computational complexity compared to linear methods. They require careful parameter selection and have stricter demands on computation speed. If nonlinear feature extraction is not performed properly, it may lead to overfitting of the training data.
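A minimal HFD implementation (our own sketch of Higuchi's estimator, with an arbitrary `k_max`) exhibits the expected behavior: white noise yields a dimension near 2, while a smooth sinusoid stays near 1:

```python
import numpy as np

def higuchi_fd(x, k_max=8):
    """Higuchi fractal dimension: slope of log L(k) versus log(1/k)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    log_inv_k, log_l = [], []
    for k in range(1, k_max + 1):
        lengths = []
        for m in range(k):
            idx = np.arange(m, n, k)
            if idx.size < 2:
                continue
            # Curve length for offset m, normalized as in Higuchi's estimator
            length = np.abs(np.diff(x[idx])).sum() * (n - 1) / ((idx.size - 1) * k)
            lengths.append(length / k)
        log_inv_k.append(np.log(1.0 / k))
        log_l.append(np.log(np.mean(lengths)))
    slope, _ = np.polyfit(log_inv_k, log_l, 1)
    return slope

rng = np.random.default_rng(9)
noise = rng.standard_normal(1024)         # irregular signal: FD near 2
t = np.arange(1024) / 256
sine = np.sin(2 * np.pi * 5 * t)          # smooth signal: FD near 1
fd_noise, fd_sine = higuchi_fd(noise), higuchi_fd(sine)
```

The choice of `k_max` is one of the parameter-selection issues mentioned above; too small a value biases the slope estimate.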

4.4.2. Entropy

Entropy is a measure of the complexity and randomness of a signal and it has been used to quantify and analyze the complexity and uncertainty of EEG signals related to attention. For example, approximate entropy has been employed to extract variations in brain activity associated with focused or dispersed attention [120], while sample entropy has been utilized to differentiate between various brain states and cognitive processes, such as sleep stages, epileptic seizures, and levels of attention [121]. Additionally, multiscale entropy has been used to investigate the hierarchical processing of attention and the information transfer between different brain regions [122]. These studies indicate that entropy values can effectively characterize the level of attention in terms of temporal features. However, the calculation of entropy values is relatively complex and not suitable for real-time attention assessment.
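The idea behind sample entropy can be shown with a naive O(n²) sketch; the embedding dimension m = 2 and tolerance r = 0.2σ are common but not universal choices, and the signals are synthetic:

```python
import numpy as np

def sample_entropy(x, m=2, r=None):
    """SampEn(m, r): negative log ratio of (m+1)- to m-length template matches."""
    x = np.asarray(x, dtype=float)
    if r is None:
        r = 0.2 * np.std(x)  # common tolerance: 20% of the signal's std
    n = x.size

    def matches(dim):
        templates = np.array([x[i:i + dim] for i in range(n - m)])
        count = 0
        for i in range(len(templates) - 1):
            # Chebyshev distance from template i to all later templates
            d = np.max(np.abs(templates[i + 1:] - templates[i]), axis=1)
            count += np.sum(d <= r)
        return count

    return -np.log(matches(m + 1) / matches(m))

rng = np.random.default_rng(10)
t = np.arange(400) / 256
se_sine = sample_entropy(np.sin(2 * np.pi * 5 * t))    # regular: low entropy
se_noise = sample_entropy(rng.standard_normal(400))    # irregular: high entropy
```

The quadratic cost of the pairwise comparison is the practical reason, noted above, that entropy features are awkward for real-time attention assessment.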

4.5. Spatial-Domain Features

Spatial-domain features utilize the distribution patterns of EEG signals to decode an individual’s attention state [70]. By analyzing the contributions and relationships of signals from different electrode locations, attention characteristics can be studied.
CSP maximizes the variance difference between two classes of signals through spatial filtering, effectively extracting spatial patterns in EEG that are relevant to specific tasks. CSP and its improved version, filterbank common spatial pattern filters (FB-CSP), have been utilized for attention feature recognition [123,124]. CSP has been used to detect auditory attention in teaching. Niu et al. found that using CSP to extract spatial features from EEG can more effectively improve the recognition accuracy and model stability of music auditory attention [125]. Geirnaert et al. developed a method based on FB-CSP for decoding the directional focus of auditory attention [123]. Experimental results indicated that this method can reliably and efficiently identify the direction of auditory attention.
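CSP can be sketched as a generalized eigendecomposition of the two class covariance matrices. In the synthetic four-channel example below (channel indices and variance scalings are arbitrary assumptions), the eigenvectors at the extremes of the spectrum yield components whose variance ratio separates the classes:

```python
import numpy as np
from scipy.linalg import eigh

def csp_filters(cov_a, cov_b, n_filters=2):
    """CSP via the generalized eigenproblem cov_a w = lambda (cov_a + cov_b) w.
    Eigenvectors at both ends of the (ascending) spectrum maximize the
    variance of one class relative to the other."""
    eigvals, eigvecs = eigh(cov_a, cov_a + cov_b)
    half = n_filters // 2
    picks = np.r_[np.arange(half), np.arange(len(eigvals) - half, len(eigvals))]
    return eigvecs[:, picks].T

# Synthetic 4-channel EEG: class A is strong on channel 0, class B on channel 3
rng = np.random.default_rng(3)
a = rng.standard_normal((4, 2000)); a[0] *= 5.0
b = rng.standard_normal((4, 2000)); b[3] *= 5.0

W = csp_filters(np.cov(a), np.cov(b), n_filters=2)
# After spatial filtering, the between-class variance ratio separates strongly
ratio = np.var(W @ a, axis=1) / np.var(W @ b, axis=1)
```

The log-variances of the filtered components are the features typically fed to a downstream classifier.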
CSP operates in a linear Euclidean space, which makes it challenging to accurately capture the nonlinear characteristics of multivariate EEG signals. The Riemannian manifold is a nonlinear space that enables a more comprehensive analysis of the characteristics of multivariate EEG signals. The Riemannian geometry-based spatial filtering (RSF) approach improves the classification of EEG signals by maximizing the Riemannian distance between the covariance matrices of different classes, thereby projecting the signals into a lower-dimensional subspace [126]. Xu et al. proposed a Riemannian manifold-based method for attention state classification, which combines filter banks to process signals from different frequency bands separately [127]. This approach demonstrates high efficiency in identifying attention states compared with baseline methods.

4.6. Comparison of EEG Signal Feature Extraction Methods

The comparison of these EEG signal feature extraction methods is listed in Table 5. Time-domain feature extraction directly analyzes the waveform of the EEG signal. It is usually simple to calculate, easy to implement, and can directly reflect the waveform characteristics of the signal. Although time-domain features have computational advantages, they cannot reflect the frequency and spatial information of the signal well and have difficulty capturing more complex patterns in EEG signals. Frequency-domain features can reveal the energy distribution of different frequency bands in EEG signals and are suitable for capturing frequency changes in EEG signals that are related to specific cognitive states. A major disadvantage of frequency-domain features is that they lose time information and cannot reflect the dynamic changes in signal characteristics. Time–frequency-domain feature extraction combines the advantages of time-domain and frequency-domain features, and can simultaneously capture the time and frequency information of EEG signals. However, its computational complexity is relatively high and may require more computing resources. Nonlinear features are suitable for studying the complexity of brain function and can provide additional information about brain activity patterns. However, they are usually computationally complex and sensitive to noise, and depend on the selection of appropriate nonlinear indicators. Spatial-domain features are suitable for analyzing inter-brain interactions and can reveal information transfer and collaboration between different brain regions. However, spatial-domain feature extraction requires a high-density electrode grid to achieve accurate spatial resolution, and its generalization ability may be limited.

4.7. EEG Signal Feature Selection

EEG signals usually include large volumes of high-dimensional data. Feature selection helps to reduce data dimensionality and computational complexity to avoid the curse of dimensionality [128]. By identifying and selecting the features that most effectively differentiate between various attention states, it also enhances classifier performance and improves the accuracy and robustness of attention recognition. Commonly used feature selection methods include filter methods, wrapper methods, and embedded methods [129].
The filter methods assess features using statistical tests such as correlation-based feature selection (CFS) [130], mutual information [131], and F-test [132] to select those that have a high correlation with the target variable. The filter methods are computationally simple but may ignore the interdependencies between features [115].
The wrapper methods approach feature selection as a search problem, identifying the optimal combination of features by assessing the performance of various feature subsets. Unlike the filter methods, which assess features independently, the wrapper methods employ machine learning algorithms to evaluate the usefulness of feature subsets. This approach enables the selection of features that can enhance the overall performance of the model. Forward selection [133], backward elimination [134], recursive feature elimination (RFE) [135], and the genetic algorithm (GA) [136,137] have been utilized in attention assessment. Xu et al. employed wrapper methods to perform feature selection on EEGs of different attention states, and the classification accuracy on the test set was improved to 94.1% [115]. The advantage of wrapper methods is that they can account for interactions between features and generally achieve better classification performance. However, the wrapper methods tend to be computationally intensive, are susceptible to overfitting, and their performance is highly dependent on the selection of the classifier.
The embedded methods integrate the feature selection process into the training process of the learning algorithm. Unlike filter and wrapper methods, embedded methods learn feature importance during the model training process itself, thereby selecting the subset of features that contribute most to the model’s performance [138]. Common embedded methods include regularized linear regression models, decision tree, and random forest models. Least absolute shrinkage and selection operator (LASSO) is a regularized linear regression model that performs feature selection by adding an L1 norm penalty term to the loss function [139]. Decision trees perform classification and regression analysis by constructing tree models. In EEG signal processing, decision trees can be used for automatic feature selection. Random forest improves the accuracy and stability of classification or regression by building multiple decision trees and combining their prediction results. Moon et al. found that decision trees and random forests perform effectively in attention assessment across both single-channel and selected multi-channel models [140]. Embedded methods balance computational efficiency and classification performance effectively, making them well-suited for high-dimensional data. However, they are relatively complex and demand proper parameter tuning.
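As a sketch of embedded selection with LASSO using scikit-learn (the feature matrix, the indices of the informative features, and the regularization strength are all illustrative assumptions):

```python
import numpy as np
from sklearn.linear_model import Lasso

# Synthetic feature matrix: 200 epochs x 20 features; only features 0 and 5
# carry information about the binary attention label (all values illustrative)
rng = np.random.default_rng(4)
X = rng.standard_normal((200, 20))
y = (1.5 * X[:, 0] - 2.0 * X[:, 5] + 0.3 * rng.standard_normal(200) > 0).astype(float)

# The L1 penalty drives coefficients of uninformative features to exactly zero,
# so feature selection falls out of model fitting itself
lasso = Lasso(alpha=0.05).fit(X, y)
selected = np.flatnonzero(np.abs(lasso.coef_) > 1e-6)
```

The surviving nonzero coefficients identify the selected feature subset, illustrating how selection is embedded in training rather than performed as a separate search.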
In addition to the above traditional methods, other methods such as PCA [141], ICA [142], k-nearest neighbors (KNNs) [130], and recurrent neural network (RNN) [143] have also been used in attention feature selection.

5. Classification and Evaluation Metrics

In EEG attention assessment, classification refers to dividing EEG signals into different categories according to attention states. Evaluation metrics are used to measure the performance of the classifier to ensure that it can accurately and reliably identify different attention states. In recent years, machine learning techniques have been extensively utilized in EEG signal classification. Among them, deep learning, as a rapidly evolving branch of machine learning, has shown promising potential in assessing attention levels. In this section, traditional machine learning methods and deep learning methods used in EEG attention classification are reviewed separately.

5.1. Traditional Machine Learning Methods

Traditional machine learning methods mainly rely on manually designed feature extraction and classifiers in EEG attention level assessment. Commonly used models include support vector machine (SVM), KNN, and random forest.

5.1.1. SVM Methods

SVM is commonly used in classification and regression analysis. In EEG signal analysis, SVM distinguishes different brain activity patterns by finding the optimal hyperplane. It can effectively process high-dimensional, nonlinear EEG data, thereby realizing the classification of different attention states. Jin et al. developed a machine learning classifier based on SVM to distinguish between on-task and off-task thinking with an accuracy between 50% and 85% [144]. Peng et al. utilized the HHT to analyze EEG signals for attention assessment and employed SVM to classify the signals into focused and relaxed states, achieving an accuracy of 84.8% [145]. Chen et al. proposed a hybrid feature learning framework for attention assessment using SVM for classification. In their cross-session and cross-subject experiments, classification accuracy rates of 86.27% and 94.01% were achieved, respectively [146].

5.1.2. KNN Methods

The KNN algorithm is a simple and effective classification algorithm. Its basic idea is to classify by measuring the distance between samples. The category of the sample to be classified is determined by majority voting based on the categories of the K-nearest neighbors. Al-Nafjan et al. employed a KNN classifier for attention assessment in online classes and achieved an accuracy of 83% in their k-fold cross-validation experiment [18]. Sahu et al. developed a KNN classifier to identify the human sustained attention level from prefrontal EEG rhythms, and the F1-score of prefrontal Theta EEG rhythm was 88.88% [147]. Esqueda et al. investigated methodologies for the classification of attention level by EEG signals and found that the accuracy of a KNN classifier was 89.68% [148].

5.1.3. Random Forest Methods

Random forest is an ensemble learning method that builds multiple decision trees and combines their predictions by voting. Each tree learns rules from the training data: internal nodes split the data based on feature values, and leaf nodes produce the classification decisions. The random forest algorithm has shown good performance in processing EEG data and can effectively classify different brain states and cognitive activities [18,148,149]. In the research of [148], the random forest classifier achieved an accuracy of 92.91%. Al-Nafjan et al. proposed a random forest classifier for students’ attention assessment during online classes and achieved a high accuracy of 96% [18].
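The three classifiers discussed above can be compared on the same data with scikit-learn. The two-dimensional theta/beta "features" and the class separations below are synthetic stand-ins for real EEG features, chosen only to make the pipeline concrete:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic 2-D features (e.g., theta and beta power) for two attention states
rng = np.random.default_rng(5)
focused = rng.normal([1.0, 3.0], 0.5, size=(100, 2))   # low theta, high beta
relaxed = rng.normal([3.0, 1.0], 0.5, size=(100, 2))   # high theta, low beta
X = np.vstack([focused, relaxed])
y = np.array([1] * 100 + [0] * 100)

# KNN and SVM are distance/kernel based, so they are scaled in a pipeline;
# random forest is scale-invariant and used as-is
classifiers = {
    "SVM": make_pipeline(StandardScaler(), SVC(kernel="rbf")),
    "KNN": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
    "RF": RandomForestClassifier(n_estimators=100, random_state=0),
}
scores = {name: cross_val_score(clf, X, y, cv=5).mean()
          for name, clf in classifiers.items()}
```

The scaling step in the SVM and KNN pipelines reflects the normalization requirement noted in the comparison that follows.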

5.1.4. Comparison of Traditional Machine Learning Methods

The advantage of SVM methods is their effectiveness in high-dimensional spaces and their high classification accuracy. However, they are very sensitive to the choice of parameters and kernel functions, which can require extensive experimentation to optimize. Furthermore, they have a high training time complexity, especially on large datasets. KNN methods are simple and easy to implement, do not require an explicit training process, and are effective for nonlinear classification. However, they are sensitive to data dimensionality, require normalization, and have high computational complexity at prediction time. The advantages of random forests include the ability to handle high-dimensional data without separate feature selection and to provide feature importance estimates. The disadvantages are that the model is less interpretable and requires more training time than other algorithms. The comparison is described in Table 6.
These methods have different potentials in EEG signal processing and classification. The choice of algorithm depends on the specific application scenario and data characteristics. In practical applications, the above methods can also be combined with other methods for a better feature classification. For example, Demidova et al. combined the random forest classifier with the SVM classifier to improve the SVM classification results [150]. In addition, computational cost and model complexity also need to be considered when choosing an appropriate machine learning algorithm for classification. For example, random forests generally require more computing resources to train, while the computational complexity of KNN increases with the size of the dataset. In practical applications, there is a trade-off between accuracy and computational efficiency.
To improve the efficiency of EEG classification using traditional machine learning methods, the following methods may be adopted. First, reducing the number of features and optimizing the dimensionality reduction techniques can reduce computational complexity and improve classification efficiency. Second, for computationally intensive algorithms such as random forests, parallel computing can be used to speed up the training process. Finally, the performance and efficiency of the algorithm can be improved by selecting the optimal algorithm parameters through methods such as cross-validation.
Although traditional machine learning methods are widely used in EEG attention assessment, their performance depends heavily on the quality of feature engineering. Features extracted using traditional methods may fail to fully capture the complex patterns in EEG signals. Moreover, manual feature selection and optimization are required, often demanding expert domain knowledge and extensive experimentation [151].

5.2. Deep Learning Methods

Deep learning methods for EEG attention classification have been applied and developed in recent years. Different from traditional machine learning methods that rely on manual feature extraction, deep learning methods automatically learn feature representations from raw EEG signals by constructing multi-layer neural networks [152]. Deep learning methods combine feature extraction and classification into a single framework, enabling end-to-end learning and reducing the reliance on manual feature engineering, as demonstrated in Figure 3, using CNN as an example.
In research, a multi-level neural network architecture is usually constructed first, utilizing its deep structure to progressively extract time-domain, frequency-domain, and spatial-domain features related to attention from EEG signals. In the training stage, the errors between the predictions and the labeled data are calculated through the loss function, and the network parameters are optimized with the backpropagation algorithm to improve the classification accuracy of the model. In the evaluation stage, methods such as cross-validation are usually used to test classification performance, and the model is evaluated through metrics such as accuracy and recall.
Among deep learning methods, CNN, RNN, Transformer, and hybrid models have been commonly employed in EEG-based attention classification, and they are introduced as follows.

5.2.1. CNN Methods

The advantage of CNN in attention classification is its ability to automatically learn spatiotemporal features, avoiding the complexity and limitations of traditional manual feature extraction. Toa et al. used low-cost devices to collect EEG signals and CNN models to classify visual attention test tasks. Their study achieved an accuracy of 76% in classifying attention levels using this approach [153]. Cai et al. proposed an EEG auditory attention detection system with CNN as the classifier in noisy environments. The experiments showed that the auditory attention can be detected in 2 s with the accuracy of 80.2% in noisy acoustic environments [124]. Vandecappelle et al. developed an auditory attention detection system with a CNN classifier in a multi-speaker scenario. It achieved a median accuracy of around 81% in 1 to 2 s [154]. Wang et al. designed a CNN-based personalized system for attention assessment in virtual reality experiments. The mean accuracies of the two CNN classifiers were 94.41% and 94.15%, respectively [155].
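The core computation of such a CNN classifier — temporal convolution, a nonlinearity, pooling, and a softmax output — can be sketched in plain NumPy. The weights here are random and untrained, and the channel counts, kernel sizes, and two-class output are illustrative only:

```python
import numpy as np

def conv1d(x, kernels):
    """Valid 1-D convolution: x is (channels, time), kernels is (n_k, channels, width)."""
    n_k, _, w = kernels.shape
    t_out = x.shape[1] - w + 1
    out = np.zeros((n_k, t_out))
    for k in range(n_k):
        for i in range(t_out):
            out[k, i] = np.sum(kernels[k] * x[:, i:i + w])
    return out

def cnn_forward(x, kernels, w_fc):
    h = np.maximum(conv1d(x, kernels), 0.0)   # temporal convolution + ReLU
    pooled = h.mean(axis=1)                   # global average pooling over time
    logits = w_fc @ pooled                    # dense layer to class scores
    e = np.exp(logits - logits.max())
    return e / e.sum()                        # softmax over attention classes

rng = np.random.default_rng(6)
x = rng.standard_normal((8, 256))             # 8 EEG channels, 256 samples
kernels = 0.1 * rng.standard_normal((4, 8, 16))
w_fc = 0.1 * rng.standard_normal((2, 4))
probs = cnn_forward(x, kernels, w_fc)
```

A practical system would stack several such layers and learn the kernels by backpropagation rather than sampling them at random.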

5.2.2. RNN Methods

The advantage of RNN and its variants is their ability to process sequential data, which is highly consistent with the time-dependent characteristics of EEG signals. Traditional classification methods based on handcrafted features and statistical methods often face challenges in capturing the complex temporal dynamics and nonlinear relationships inherent in EEG signals. Geravanchizadeh et al. proposed a dynamic selective auditory attention detection system, which achieved an accuracy of 94.2% using an RNN agent as the classifier [156]. Rivas et al. employed the long short-term memory (LSTM) and gated recurrent unit (GRU), variants of RNN, to predict attention and meditation levels. The proposed LSTM and GRU models excelled in predicting meditation and attention states with low root mean squared errors (RMSEs) [68]. Lu et al. proposed a method to assess the target speech from single-trial EEG signals in competing two-talker environments using LSTM-based RNN. The experiment demonstrated that the subjects achieved an average detection accuracy of 96.12% [157].
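The recurrence that lets RNNs model the time dependence of EEG is compact. The sketch below is a vanilla RNN forward pass with random untrained weights (all dimensions are illustrative), not the LSTM/GRU variants used in the cited studies:

```python
import numpy as np

def rnn_forward(x_seq, w_xh, w_hh, b_h):
    """Vanilla RNN: h_t = tanh(W_xh x_t + W_hh h_{t-1} + b). Returns the final state."""
    h = np.zeros(w_hh.shape[0])
    for x_t in x_seq:                        # iterate over EEG time steps
        h = np.tanh(w_xh @ x_t + w_hh @ h + b_h)
    return h

rng = np.random.default_rng(7)
x_seq = rng.standard_normal((128, 8))        # 128 time steps, 8 channels
w_xh = 0.1 * rng.standard_normal((16, 8))    # input-to-hidden weights
w_hh = 0.1 * rng.standard_normal((16, 16))   # recurrent weights
b_h = np.zeros(16)
h_final = rnn_forward(x_seq, w_xh, w_hh, b_h)
```

Because the same recurrent weight matrix is applied at every step, gradients through long sequences can vanish or explode, which is the problem the LSTM and GRU gating mechanisms address.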

5.2.3. Transformer Methods

As a powerful deep learning method, the Transformer has been widely used in various fields of artificial intelligence. The Transformer uses a self-attention mechanism and has shown excellent performance in processing sequential data, capable of capturing long-range spatiotemporal dependencies in EEG signals. Compared with RNN and CNN, the Transformer has better long-term dependency capture capabilities when processing EEG data [158]. Xu et al. proposed a model named Attention-based multiple dimensions EEG transformer (AMDET) that can effectively extract and classify the spectral–spatial–temporal features from EEG signals. AMDET employed multi-head attention mechanisms and achieved accuracy rates of 97.48%, 96.85%, 97.17% and 87.32% in four datasets, respectively [159]. Xu et al. developed a Transformer-based auditory attention detection model. The experimental results showed that this model achieved a decoding accuracy of 76.35% within a 0.15 s decoding window, which is much higher than the accuracy of the linear model [160]. Ding et al. introduced a Transformer-based deep learning framework for EEG attention state classification, named EEG-PatchFormer. In a benchmark comparison with the established baseline models, EEG-PatchFormer achieved the highest classification accuracy of 75.63% [161].
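The self-attention operation at the heart of the Transformer can also be sketched in NumPy. This single-head, untrained version (dimensions are illustrative) shows how every EEG time step attends to every other, which is what enables long-range dependency capture:

```python
import numpy as np

def self_attention(x, w_q, w_k, w_v):
    """Scaled dot-product self-attention over a sequence x of shape (time, features)."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.T / np.sqrt(k.shape[1])          # pairwise time-step affinities
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)   # softmax over key positions
    return weights @ v, weights

rng = np.random.default_rng(8)
x = rng.standard_normal((32, 8))                    # 32 time steps, 8 features
w_q, w_k, w_v = (0.3 * rng.standard_normal((8, 8)) for _ in range(3))
out, attn = self_attention(x, w_q, w_k, w_v)
```

Each row of the attention matrix is a probability distribution over all time steps, so no information path is bottlenecked through a recurrent state, unlike in RNNs.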

5.2.4. Hybrid Models Methods

Combining traditional machine learning models with deep learning models, or fusing different types of deep learning modules with attention mechanisms to form a hybrid model, can improve the accuracy and robustness of attention state classification. Zhang et al. proposed an attention-based hybrid deep learning model for EEG emotion recognition, which can extract frequency, spatial and temporal features [162]. Geravanchizadeh et al. proposed a method based on the combination of Transformer and graph convolutional neural network to detect auditory attention from EEG. Compared with the most advanced attention classification methods in the literature, their method achieved a classification accuracy of 80.12% [163]. Zhao et al. proposed a single-channel EEG signal classification method with a two-branch multiscale CNN and Transformer model. The classification model was tested against other methods on self-built and public datasets and achieved competitive classification performance [164]. Xue et al. introduced a hybrid classification framework for attention assessment. The multi-models of SVM with RF, SVM with CNN, and RF with CNN were evaluated on datasets. The multi-model of RF with CNN achieved the best performance, with an accuracy of 96.0% and an F1-score of 0.93 [165]. EskandariNasab et al. developed a classifier with GRU and CNN for attention assessment. The proposed method achieved accuracies of 98.9% and 97.2% for an EEG length of 1 s on different databases, respectively [166].

5.2.5. Comparison of Deep Learning Methods

In EEG attention assessment, CNN, RNN, Transformer, and hybrid models have their own advantages and disadvantages as classifiers. CNN can effectively extract local spatial features from EEG signals, and its training efficiency is relatively high. For data with spatial structure such as EEG signals, CNN can quickly capture the correlation between different electrodes. However, CNN has limitations in processing long-term temporal dependencies and has difficulty capturing global temporal information in EEG signals. RNN and its variants (e.g., LSTM and GRU) can effectively capture the time series information in EEG signals. However, the training time of RNN is long, especially when processing long sequences of EEG data, and the computational complexity is high. When processing long sequences, RNN may experience vanishing or exploding gradients, which hinder training and degrade model performance. Nevertheless, LSTM and GRU overcome the vanishing gradient problem of traditional RNN [68]. The Transformer is built around the self-attention mechanism, so it can fully capture long-range dependency information. It can also process sequences in parallel, improving training efficiency. Transformer models require significant data and hardware resources, have high computational costs, and are prone to overfitting without large training datasets. Additionally, they are less effective at capturing local spatial features and often need to be combined with other models for this purpose [167]. Hybrid models combine the advantages of multiple models and can be composed flexibly based on requirements. However, hybrid models have complex structures, require substantial computing and training resources, and are difficult to optimize and tune. The comparison of these methods is listed in Table 7.

5.3. Evaluation Metrics

Various metrics have been used to evaluate the performance of EEG classification. Accuracy is one of the most commonly used classification performance indicators; it denotes the proportion of samples correctly classified by the classifier out of the total number of samples. In EEG attention assessment, high accuracy means that the classifier can distinguish different attention states more accurately. The calculation formula of accuracy is shown in Equation (1), where TP, TN, FP, and FN denote the numbers of true positives, true negatives, false positives, and false negatives, respectively.
Accu = (TP + TN) / (TP + TN + FP + FN)
Precision refers to the proportion of samples that are actually positive among the samples that are judged as positive by the classifier. A high precision means that the classifier has a high probability of making the correct judgment when determining attention concentration. The calculation formula of precision is shown in Equation (2).
Pre = TP / (TP + FP)
Recall rate refers to the proportion of samples that are correctly classified as positive by the classifier among all samples that are actually positive. A high recall rate means that the classifier can identify as many states of attention as possible. The calculation formula of recall is shown in Equation (3).
Rec = TP / (TP + FN)
The F1-score is the harmonic mean of precision and recall, which takes both precision and recall into consideration. The higher the F1-score, the better the performance of the classifier. The calculation formula of F1-score is shown in Equation (4).
F1 = (2 × Pre × Rec) / (Pre + Rec)
Area under the receiver operating characteristic curve (AUC-ROC) is a commonly used metric for evaluating the performance of binary classifiers, indicating the classifier’s ability to distinguish between positive and negative classes. Higher AUC values indicate better performance. The ROC curve is plotted with the false positive rate on the horizontal axis and the true positive rate on the vertical axis.
The Kappa coefficient is used to assess the degree of consistency between the classifier’s predictions and the true labels, accounting for the influence of chance. The Kappa coefficient ranges from −1 to 1, with higher values indicating better consistency.
The confusion matrix is an intuitive way to evaluate the performance of a classifier. By showing the classifier's prediction results for each category, one can clearly see the strengths and weaknesses of the classifier. The confusion matrix includes TP, FP, TN, and FN.
In EEG attention assessment research, these metrics provide a comprehensive evaluation of the performance of classifiers. For example, by comparing the accuracy, precision, recall, and F1-scores of different classifiers on the same dataset, the best-performing classifier can be identified. Additionally, analyzing the confusion matrix helps to reveal which attention states are most prone to misclassification, providing targeted insights for improving classification models.
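As a minimal illustration, the metrics defined in Equations (1)–(4), together with Cohen's Kappa, can be computed directly from confusion-matrix counts; the counts and function below are illustrative rather than drawn from any cited study.

```python
# Illustrative computation of Equations (1)-(4) plus Cohen's Kappa
# from raw confusion-matrix counts (the counts here are made up).

def classification_metrics(tp, tn, fp, fn):
    """Return accuracy, precision, recall, F1-score, and Cohen's Kappa."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total                         # Equation (1)
    precision = tp / (tp + fp)                           # Equation (2)
    recall = tp / (tp + fn)                              # Equation (3)
    f1 = 2 * precision * recall / (precision + recall)   # Equation (4)
    # Cohen's Kappa: observed agreement corrected for chance agreement
    p_chance = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / total ** 2
    kappa = (accuracy - p_chance) / (1 - p_chance)
    return accuracy, precision, recall, f1, kappa

acc, pre, rec, f1, kappa = classification_metrics(tp=80, tn=90, fp=10, fn=20)
print(f"Accu={acc:.3f} Pre={pre:.3f} Rec={rec:.3f} F1={f1:.3f} Kappa={kappa:.3f}")
```

On these illustrative counts, accuracy is 0.85 while Kappa is 0.70, showing how Kappa discounts the agreement expected by chance.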

6. Datasets

For attention assessment, high-quality and reliable EEG data is crucial. Researchers have introduced a variety of datasets to train and validate their models. These datasets typically contain EEG signals collected under specific cognitive tasks or situations, and may be supplemented with behavioral data, subjective assessments, and other physiological signals. Table 8 lists some available datasets for EEG attention assessment. The KUL and DTU datasets are among the most frequently used for auditory attention detection in audio scenarios [168]. In the KUL dataset, 64-channel EEG data was recorded from 16 normal-hearing subjects using a BioSemi ActiveTwo system at a sampling rate of 8192 Hz; the stimuli were four Dutch short stories. In the DTU dataset, 64-channel EEG data was recorded from 18 normal-hearing subjects using a BioSemi ActiveTwo system at a sampling rate of 512 Hz; the stimuli comprised Danish audiobooks. The auditory stimuli of PKU and AVED are stories in Mandarin. Based on these datasets, researchers can develop and validate more accurate and reliable methods for attention assessment.
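As a practical note, recordings made at very high rates (such as the KUL data) are usually downsampled before analysis. The sketch below uses SciPy's `decimate` on random stand-in data; the channel count, duration, and rates are illustrative assumptions, not the actual dataset files.

```python
# Sketch: downsampling a high-rate multichannel recording (e.g., 8192 Hz)
# to 512 Hz before analysis. The array here is random stand-in data.
import numpy as np
from scipy.signal import decimate

fs_in, fs_out = 8192, 512                     # original and target rates (Hz)
rng = np.random.default_rng(0)
eeg = rng.standard_normal((64, fs_in * 2))    # 64 channels, 2 s of fake data

# Total factor 16 applied as 4 x 4; decimate low-pass filters before
# downsampling to prevent aliasing, and large factors are best split up.
eeg_ds = decimate(decimate(eeg, 4, axis=-1), 4, axis=-1)
print(eeg_ds.shape)   # (64, 1024)
```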

7. Traditional Methods for Attention Assessment

In addition to the EEG attention assessment methods based on artificial intelligence introduced above, there are also some traditional, non-artificial intelligence approaches for EEG-based attention assessment. In this section, statistical analysis and ERP methods are surveyed.

7.1. Statistical Analysis

The statistical analysis method is based on mathematical probability theory and infers the association between attention state and EEG characteristics through quantitative indicators, without relying on data-driven learning mechanisms. After extracting the relevant EEG features, statistical methods can be used to assess the attention state. Poulsen et al. used 14-channel portable EEG devices to record the EEG signals of multiple students in a classroom while they watched video stimuli, aiming to detect stimulus-evoked neural activity related to students’ attention level. In this study, correlated component analysis was used to measure inter-subject correlation and intra-viewing correlation. The experimental results demonstrated that changes in inter-subject correlation of students’ brain signals were associated with attention to visually evoked responses [182]. Zhang et al. analyzed EEG signals recorded during counting, eyes-closed, and idle states using the Jensen–Shannon Divergence method. Their experiments demonstrated that the statistical complexity features derived from this algorithm can effectively distinguish attention levels across different conditions [183]. Ko et al. employed one-way analysis of variance (ANOVA) and post hoc t-tests to compare the mean spectral power across different EEG frequency bands. Their analysis revealed a clear association between dynamic changes in the EEG spectrum and sustained attention tasks. The experimental results also indicated that an increase in beta-band power is correlated with enhanced levels of visual attention [73]. Ni et al. conducted an empirical study on learning attention involving 28 college students, employing both questionnaire surveys and EEG recordings. Data analysis was performed using independent samples t-tests within SPSS 25.0 statistical software. 
Their study found that learners exhibited higher levels of attention when engaging with text-based media, and that male students as well as graduate students demonstrated higher attention scores [16].
Statistical analysis methods are typically grounded in well-established mathematical theories. This foundation gives their results a high degree of interpretability, allowing researchers to clearly understand the inner workings of the models as well as the relationships between variables. Statistical analysis methods have relatively flexible requirements regarding sample size, performing well even with small sample data. Additionally, they allow for inference through hypothesis testing.
However, many traditional statistical methods rely on strict assumptions, and conventional statistical analyses based on stationarity assumptions may fail to accurately capture the dynamic characteristics of EEG signals. Moreover, when dealing with large-scale, high-dimensional, and nonlinear EEG data, traditional statistical approaches often require extensive manual feature engineering by experts to achieve satisfactory performance.
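The band-power comparisons described above (ANOVA and t-tests on spectral power) can be sketched on synthetic data. Everything below, including the 256 Hz sampling rate and the two trial groups, is an illustrative assumption rather than a reproduction of any cited analysis.

```python
# Sketch: Welch t-test on beta-band (13-30 Hz) power between two groups of
# synthetic trials; the "attentive" group carries a stronger 20 Hz component.
import numpy as np
from scipy.signal import welch
from scipy.stats import ttest_ind

fs = 256  # assumed sampling rate (Hz)

def beta_power(trial):
    """Mean power spectral density in the 13-30 Hz band of one trial."""
    freqs, psd = welch(trial, fs=fs, nperseg=fs)
    band = (freqs >= 13) & (freqs <= 30)
    return psd[band].mean()

rng = np.random.default_rng(1)
t = np.arange(2 * fs) / fs   # 2 s trials
attentive = [2.0 * np.sin(2 * np.pi * 20 * t) + rng.standard_normal(t.size)
             for _ in range(20)]
idle = [0.2 * np.sin(2 * np.pi * 20 * t) + rng.standard_normal(t.size)
        for _ in range(20)]

stat, p = ttest_ind([beta_power(x) for x in attentive],
                    [beta_power(x) for x in idle], equal_var=False)
print(f"t = {stat:.2f}, p = {p:.2e}")
```

A significant positive t-statistic here mirrors the reported association between elevated beta power and enhanced visual attention.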

7.2. ERP Methods

ERP can characterize the temporal dynamics of cognitive processes [184]. After preprocessing the recorded EEG signals, component quantification is performed by time-locking and averaging the signals, thereby extracting the brain electrical activity components related to specific events. In attention assessment, the P300 is one of the most widely used ERP components. The P300 appears as a positive wave approximately 300 ms after a stimulus in the ERP, with its amplitude and latency reflecting cognitive load and attention intensity.
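The time-locking and averaging procedure can be illustrated on synthetic data; the sampling rate, epoch length, and the Gaussian "P300-like" bump below are assumptions made for the sketch.

```python
# Sketch: ERP extraction by time-locked averaging. A synthetic positive wave
# at ~300 ms is hidden in noisy single trials and recovered by averaging.
import numpy as np

fs = 250                                   # assumed sampling rate (Hz)
n_trials, epoch_len = 100, int(0.8 * fs)   # 0.8 s epochs, stimulus at t = 0
t = np.arange(epoch_len) / fs

rng = np.random.default_rng(2)
p300 = 5.0 * np.exp(-((t - 0.3) ** 2) / (2 * 0.05 ** 2))  # peak at 300 ms
trials = p300 + 5.0 * rng.standard_normal((n_trials, epoch_len))

erp = trials.mean(axis=0)   # averaging cancels activity not time-locked
peak_latency_ms = 1000 * t[np.argmax(erp)]
print(f"ERP peak near {peak_latency_ms:.0f} ms")
```

Averaging over 100 trials reduces the noise standard deviation by a factor of 10, which is why the component emerges clearly from single trials in which it is invisible.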
Tan et al. conducted a comparative study between children with ADHD and healthy controls. Through statistical tests and logistic regression analysis, they identified significant differences between the groups. Specifically, the ADHD group exhibited significantly lower P300 amplitudes compared to the control group, confirming that P300 amplitude serves as a discriminative marker for attention deficits. This finding indicates that P300 can be used as an objective biomarker to assist in the assessment and diagnosis of attention-related disorders [185]. Datta et al. conducted a study involving volunteers performing a prolonged cognitive task. They found that as the P300 amplitude decreased, the error rate in learning increased. It indicated that the P300 amplitude can serve as a marker of attention decline and a tendency toward making errors [186]. Tao et al. found that preschool children with ADHD exhibited longer P300 latencies at all electrode sites compared to healthy controls. The alterations in the P300 may serve as an objective marker for the clinical assessment of preschool children with ADHD [187].
ERP originates directly from the postsynaptic potentials of neurons, allowing it to reflect the brain's neurophysiological responses to attention states more directly. The ERP captures the dynamic changes in brain electrical activity in real time with millisecond precision, providing very high temporal resolution. However, to obtain reliable ERP signals, the timing and duration of stimulus presentation must be strictly controlled during the experiment. Small variations in the procedure may render the ERP results unclear or unreliable. Therefore, conducting ERP experiments is relatively complex and places high demands on researchers' professional skills.

8. Discussion

Although technological advances have been made in the field of EEG attention assessment, there are still some issues that need to be addressed in order to better assess students’ attention in the classroom.

8.1. High-Quality EEG Data Acquisition

In EEG attention assessment during classroom teaching, obtaining high-quality EEG signals poses multiple challenges. These challenges stem mainly from the complexity of the technology, the environment, individual differences, and actual application scenarios.
First, the sources of sound in classroom teaching are complex. In addition to the teacher's voice, other sounds in the classroom, such as classmates talking, fans, external traffic noise, and the noise generated by teaching equipment such as projectors and computers, can affect EEG signals through auditory or electromagnetic pathways. Second, unlike the controlled environment of a laboratory, students' body movements, eye movements, and muscle contractions in the classroom produce significant EOG and EMG artifacts. In addition, when wearable devices are worn for long periods in the classroom, sweating or body movement may reduce the contact stability of the EEG signal collection electrodes. Poor contact leads to high electrical impedance, signal drift, and increased noise. These problems can be addressed through improvements in both system hardware design and algorithms.

8.1.1. Electrode Design

Traditional wet electrodes are considered the gold standard due to their low impedance and high SNR. However, wet electrodes have many limitations in classroom teaching, including time-consuming skin preparation and application of conductive gel, which can cause skin discomfort or even allergies. Drying of the gel after prolonged use may also degrade signal quality and make long-term monitoring difficult.
To avoid these problems, dry electrodes are designed to optimize signal quality, wearing comfort, and robustness. For attention assessment in the classroom, active dry electrodes with integrated amplifiers can be used to reduce motion artifacts and improve signal quality, especially electrodes with special structures for hair-covered areas [188]. Zhu et al. proposed a flexible, stable composite semi-dry electrode with low impedance for EEG recording [189]. Wang et al. introduced flexible stretchable dry electrodes with Ag/AgCl nanowires embedded in polydimethylsiloxane (PDMS) that can be used to record signals such as EEG [190]. In addition, the development of dry-contact ear-EEG electrodes provides a non-invasive and stable way to record in-ear EEG [191].
In future classroom attention assessments, electrode design should consider comfort, high quality, portability, and low cost. Utilizing flexible electronics technology, electrodes are fabricated from conductive polymers (e.g., polypyrrole (PPy) and PEDOT:PSS) or nanomaterials (e.g., graphene, carbon nanotubes, and metal nanowires). These materials offer excellent signal transmission performance while also adapting to the shape of the scalp, improving wearer comfort and ensuring signal stability over extended periods of use. In addition, in-ear EEG electrodes and microneedle electrodes also have the advantages of convenient and reliable signal acquisition.

8.1.2. Noises and Artifacts Removal with Deep Learning Methods

In recent years, in addition to traditional methods, deep learning techniques have shown great promise for removing artifacts and noise in EEG signals. These methods can effectively distinguish clean neural activity from raw EEG signals by learning the complex patterns of noise and artifacts, thus overcoming the limitations of traditional approaches that depend on signal models or prior assumptions.
Sun et al. proposed a one-dimensional residual CNN (1D-ResCNN) model for EEG denoising based on the original waveform, which can effectively remove noise from EEG signals [192]. Mahmud et al. introduced a one-dimensional CNN-based model called MLMRS-Net for correction of EEG motion artifacts, especially when using wearable sensors for dynamic monitoring [193]. Zhang et al. proposed a novel CNN model (NovelCNN) for removing muscle artifacts from EEG data [194]. Gao et al. proposed a dual-scale CNN-LSTM model called DuoCL to simultaneously capture the temporal and spatial dependencies in EEG signals, thereby more effectively removing deep artifacts [195].
Erfanian et al. proposed an RNN-based adaptive noise canceller for real-time removal of EOG interference in EEG signals. This study used an RNN to model interference signals by recording ocular and blink artifacts with electrodes placed around the eyes. The results demonstrate that the RNN performs well in eliminating ocular artifacts [196]. Liu et al. introduced an EEG state-based imputation model (SRI-EEG), which utilizes bidirectional LSTM to eliminate artifacts in EEG signals. This approach detects physiological artifacts and automatically replaces them with estimated values [197]. Recently, Jiang et al. developed a model called CLEnet, which combines a dual-scale CNN with an LSTM network and introduces an improved one-dimensional efficient multiscale attention mechanism. This model is capable of removing artifacts from multi-channel EEG data mixed with unknown artifacts [198].
Wang et al. introduced an improved generative adversarial network (GAN) that does not rely on experience and prior knowledge for EOG and EMG artifact removal of EEG signals in BCI systems [199]. Tibermacine et al. studied the method of using GAN framework to improve the quality of EEG signals. The study showed that the standard GAN model and the Wasserstein GAN model with gradient penalty are superior to the classic wavelet-based threshold method and linear filtering method in removing artifacts [200]. Dong et al. presented an end-to-end deep learning framework called AR-WGAN, which can automatically remove artifacts without manual intervention and is suitable for real-time processing of large-scale EEG data [201].
An autoencoder achieves denoising by learning to encode the input signal into a low-dimensional representation and then decode it back to the original signal. Nguyen et al. proposed a hybrid deep wavelet sparse autoencoder (DWSAE) method for online automatic removal of EOG artifacts, which can effectively filter the EEG signal without affecting the real EEG signal in non-artifact regions [202]. Acharjee et al. developed a one-dimensional convolutional denoising autoencoder (CDAE) architecture to remove eyeblink artifacts mixed into single-channel EEG signals [203]. Nagar et al. presented a fractional one-dimensional CNN autoencoder for removing EMG artifacts from EEG. This method takes into account the need for compact deep learning architectures for portable, low-energy devices and compresses the training parameters [204].
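As a toy illustration of the autoencoder idea (not the architecture of any method cited here): with a linear encoder/decoder and MSE loss, the optimal autoencoder coincides with PCA, so the bottleneck can be computed in closed form via SVD instead of gradient training. The rank-4 "clean" subspace and the noise level below are invented for the sketch.

```python
# Sketch: closed-form linear "denoising autoencoder" via SVD/PCA.
# Clean trials live in a low-dimensional subspace; projecting noisy trials
# onto the top-k components discards most of the broadband noise.
import numpy as np

rng = np.random.default_rng(3)
n_trials, n_samples, k = 256, 64, 4     # k = bottleneck width

# Clean signals: random combinations of k orthogonal sinusoids.
t = np.arange(n_samples)
basis = np.stack([np.sin(2 * np.pi * (f + 1) * t / n_samples) for f in range(k)])
clean = rng.standard_normal((n_trials, k)) @ basis
noisy = clean + 0.3 * rng.standard_normal(clean.shape)

# "Training": top-k right singular vectors of the noisy data.
_, _, vt = np.linalg.svd(noisy, full_matrices=False)
codes = noisy @ vt[:k].T     # encoder: project into the k-dim bottleneck
denoised = codes @ vt[:k]    # decoder: reconstruct from the bottleneck

mse_noisy = np.mean((noisy - clean) ** 2)
mse_denoised = np.mean((denoised - clean) ** 2)
print(f"MSE before = {mse_noisy:.4f}, after = {mse_denoised:.4f}")
```

The deep models surveyed above replace this fixed linear projection with learned nonlinear encoders and decoders, which is what allows them to separate artifacts that do not lie in a single subspace.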
Pu et al. proposed the EEGDnet model, which incorporates a Transformer module to simultaneously account for the non-local and local self-similarities of EEG signals. When removing ocular artifacts, EEGDnet improved the correlation coefficient by 18%, and when removing muscle artifacts, the improvement was 11% [205]. Chen et al. developed a Transformer-based EEG denoising model named Denoiseformer to remove artifacts in single-channel contaminated EEG signals. Using the Transformer architecture, the contaminated EEG signal is segmented into multiple segments and the potential pattern relationship between the segments is extracted to achieve efficient denoising [206]. Chuang et al. introduced a Transformer-based EEG artifact removal model called artifact removal transformer (ART), which improved signal denoising capabilities for complex spatial and multi-channel EEG. Although ART has a slightly higher number of parameters, it maintains relatively fast inference times, making it suitable for near-real-time EEG analysis [207].
It should be noted that the above methods represent current research progress in removing noise and artifacts from EEG signals. Although some of these methods have not yet been applied in classroom attention assessment, they can serve as references for future research. Table 9 shows the advantages, disadvantages, and application scenarios of these methods in EEG signal denoising and artifact removal. The choice of method depends on the specific EEG application, available computational resources, and data characteristics. For EEG signals requiring modeling of temporal dependencies, RNN or Transformer may be more suitable; for scenarios requiring rapid processing of large amounts of data, CNN may have more advantages. CNNs and autoencoders are more suitable for resource-constrained environments, while Transformers and GANs excel in scenarios with sufficient computational power and large datasets.
In addition, the hybrid model combines the advantages of different models and has the ability to extract local/global features and time series modeling, which comprehensively improves the accuracy and robustness to various artifacts. For example, Yin et al. introduced a GAN guided parallel CNN and Transformer network named GCTNet. Compared with other methods, GCTNet achieved an 11.15% reduction in relative RMSE and a 9.81% improvement in SNR in the task of removing EMG artifacts [208]. Cai et al. proposed a Dual-Branch Hybrid CNN–Transformer Network called DHCT-GAN for physiological artifact removal. DHCT-GAN significantly outperformed existing state-of-the-art networks in removing various artifacts, and achieved superior artifact removal compared to single-branch models [209].

8.2. Multimodal Data Fusion

As mentioned before, there are challenges in obtaining high-quality EEG signals. EEG signals may also be affected by individual differences and the measurement environment. Therefore, combining other modal data, such as eye movement data and facial expressions, can provide complementary information to compensate for the limitations of EEG, thereby improving the accuracy and robustness of attention assessment. The following are some advances in multimodal EEG research that can be adopted for classroom attention assessment research.

8.2.1. Methods Combined with Eye Movement

Eye movement can effectively identify a user’s interests, attention level, and cognitive load [210]. Eye movements can be detected using eye tracking technology in mobile devices or cameras in classrooms. Research on combining eye movement with EEG to assess attention has become an important direction in the fields of BCI and affective computing. The methods for fusing eye movement data with EEG signals can be categorized as feature-level fusion, decision-level fusion, and model-level fusion. Figure 4 shows the diagram of the methods for fusing EEG and eye movements into multimodal data.
Feature-level fusion is one of the most common fusion methods. By extracting eye movement feature data such as gaze point, gaze duration, eye saccade, pupil size, etc., and fusing them with EEG data features, the eye movement data features and EEG features are combined into a unified feature vector through concatenation or weighted averaging methods, which is then fed into a classification model for attention assessment. Qin et al. constructed a quality evaluation system for university course teaching based on the attention assessment with EEG signals and eye movement. In this study, gaze duration, eye saccade, and pupil size were chosen as the attention features for eye movement, while the power spectrum of the Alpha wave was the attention feature for EEGs [211]. Song et al. proposed a multimodal emotion recognition system based on EEG signals and eye movement data. The cosine loss function was employed to balance the fusion of EEG and eye movement features [212].
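A minimal sketch of feature-level fusion: per-window features from each modality are normalized separately and then concatenated into one unified vector for a downstream classifier. The feature sets, window count, and dimensions below are illustrative assumptions.

```python
# Sketch: feature-level fusion by per-feature normalization + concatenation.
import numpy as np

rng = np.random.default_rng(4)
eye = rng.random((100, 3))   # e.g., gaze duration, saccade count, pupil size
eeg = rng.random((100, 5))   # e.g., band powers from selected channels

def normalize(x):
    """z-score each feature (column) across the 100 analysis windows."""
    return (x - x.mean(axis=0)) / x.std(axis=0)

fused = np.concatenate([normalize(eye), normalize(eeg)], axis=1)
print(fused.shape)   # (100, 8) -- one unified vector per window
```

Normalizing each modality before concatenation prevents features with larger numeric ranges (e.g., band powers) from dominating the fused representation.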
Decision-level fusion first processes and classifies eye movement data and EEG data through independent models to obtain their respective attention state decisions. These independent decisions are then combined through voting, weighted averaging or some complex fusion algorithms to produce the final attention or emotion state judgment. Zhao et al. extracted EEG and eye movement signals, which were input into a machine learning pattern recognition network for intent recognition. The final recognition results of human–computer interaction intention were obtained through decision-level fusion [213]. Gong et al. proposed a coordinated-representation decision fusion network (CoDF-Net) to fuse signals of EEG and eye movement for emotion recognition [214].
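Decision-level fusion by weighted averaging can be sketched on the outputs of two unimodal classifiers; the probabilities and weights below are invented for illustration.

```python
# Sketch: decision-level fusion by weighted averaging of class probabilities.
import numpy as np

# P("attentive") for three analysis windows from two independent models
p_eeg = np.array([0.80, 0.40, 0.55])   # EEG-only classifier
p_eye = np.array([0.60, 0.30, 0.70])   # eye-movement-only classifier

w_eeg, w_eye = 0.6, 0.4                # weights, e.g., from validation accuracy
p_fused = w_eeg * p_eeg + w_eye * p_eye
decisions = (p_fused >= 0.5).astype(int)
print(p_fused, decisions)              # fused probabilities and 0/1 labels
```

Because each modality is classified independently, this scheme degrades gracefully when one modality is temporarily unavailable, e.g., during eye-tracker dropout.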
Model-level fusion directly realizes the fusion of multimodal data at the model architecture level. Model-level fusion usually involves information interaction and integration at the middle layer of the model, so as to capture the complex dependencies between different modal data. Guo et al. proposed a multimodal fusion method to identify multiple emotions by combining eye images, eye movements and EEG signals [215]. Li et al. proposed a multimodal fusion network (MTREE-Net) that combines features from EEG and eye movement modalities with a dynamic reweighting strategy to detect target images [216].

8.2.2. Methods Combined with Facial Expression

Facial expressions are direct indicators of students’ outward expressions of emotion and attention. By analyzing the target’s facial action units and expression patterns, its emotional state, such as concentration, confusion, boredom, etc., can be judged. The fusion of facial expressions and EEG signals can provide richer and more accurate attention assessment information than a single modality. The methods of fusing facial expression data with EEG signals also include the type of feature-level fusion, decision-level fusion, and model-level fusion.
As for the research of feature-level fusion, Singh et al. proposed a multimodal emotion recognition model that combines facial expressions and EEG signals. The facial expression and EEG features are extracted and fused using an adapted modified information gain for feature fusion (AMIG)-based approach [217]. Zhao et al. designed a feature fusion network with a three-layer bidirectional LSTM structure to fuse the expression and EEG features for emotion recognition [218].
In the research of decision-level fusion, Wu et al. utilized a CNN to recognize facial emotions, employed an LSTM to identify EEG-based emotions, and used D-S evidence theory to fuse the emotion identification results at the decision-level. The experiment demonstrated that decision-level fusion of facial expressions and EEG patterns showed higher accuracy than single modality [219]. Li et al. calculated facial geometric features by locating facial landmarks, extracted the PSD features of EEG signals through time–frequency analysis, and used the LSTM to achieve decision-level fusion to recognize the user’s emotions [220].
In the research of model-level fusion, Jin et al. proposed a model called residual multimodal Transformer (RMMT) to fuse facial expressions and EEG signals to achieve continuous emotion recognition [221]. Rayatdoost et al. presented a multimodal deep representation learning method, with facial behavior and EEG signals as input signals, and implemented emotion recognition through a gated fusion method [222].

8.2.3. Comparison of Multimodal Methods

The fusion of eye movement signals or facial expressions with EEG are two common multimodal methods in attention assessment. Each method has its own unique advantages and limitations, and an appropriate multimodal fusion approach can be selected based on specific circumstances in classroom attention assessment.
For the EEG and eye movement multimodal approach, eye tracking can accurately record an individual's visual attention focus, scanning path, gaze duration, blinking, and other behavioral data. Compared with facial expressions, eye movements are not easy to disguise, and EEG signals directly reflect the brain's electrophysiological activity, making them highly objective and difficult to disguise. Therefore, by combining EEG with eye movement signals, attention states can be captured from both neural activity and behavioral performance, thereby improving recognition accuracy. The temporal variation rate of eye movement and EEG signals is relatively high, facilitating real-time attention assessment. Compared with facial expressions, eye movement data is less affected by factors such as ambient lighting and occlusion, and is more resistant to interference. However, classroom cameras offer limited accuracy for detecting eye movements, so high-precision eye-tracking devices are required. Data synchronization is complex because eye movement and EEG timestamps must be precisely aligned. In addition, eye movement focuses primarily on visual behavior and cannot reflect students' emotions or fatigue status.
For the EEG and facial expression multimodal approach, facial expressions can effectively reflect students' emotional states, such as confusion, engagement, and fatigue, which indirectly provides an indicator for attention assessment. Facial expression recognition typically requires only a camera, making it easier to deploy in a classroom setting at low equipment cost. It is suitable for large-class teaching and can capture the facial expressions of multiple people at the same time. However, facial expressions are easily subject to subjective control or disguise and may not fully reflect the internal state of attention. Compared with EEG and eye movement signals, facial expressions typically change more slowly. Furthermore, some methods extract features from static facial images, neglecting the dynamic changes in facial expressions. These factors lead to poor performance in real-time attention assessment. In addition, facial expression data collected by cameras is easily affected by environmental factors such as lighting changes, head posture, and partial occlusion, reducing recognition accuracy. The comparison of these two multimodal methods is listed in Table 10. For classroom teaching, EEG combined with eye movements is more suitable for continuous attention assessment, but it requires tight synchronization of equipment. EEG combined with facial expressions is better at capturing the impact of emotional factors on attention assessment, but it is susceptible to external interference.

8.3. Complexity of EEG Data

The application of EEG in classroom attention assessment has shown great potential, but the complexity of its data interpretation is a key challenge facing current research. This complexity mainly comes from the characteristics of EEG signal, the difficulties in data processing and analysis, and the challenges caused by multidisciplinary research.

8.3.1. Complexity of Signal

The amplitude of EEG signals is small, at the microvolt level, and is easily interfered with by various artifacts and external electromagnetic signals. In a real classroom environment, students’ heads and bodies move frequently, which can cause a lot of artifacts and significantly reduce the quality of EEG signals. Xu et al. indicated that the invalid data and loss data in a classroom environment can be up to twice that of a laboratory environment due to movement [17]. To solve the problem of EEG signals in classroom environments being contaminated by artifacts, advanced deep learning models can be used to eliminate artifacts. Additionally, when conditions permit, portable EEG signal acquisition devices equipped with new electrodes can be used to reduce the impact of movement on signal quality.
There are significant differences in EEG signal patterns between different individuals, and even under the same task, the neural signatures may differ between different students [223]. Research has shown that individuals differ in how they allocate their attention intensity and consistency, and that these differences are associated with overall task performance [224]. The choice of cognitive task can also lead to differences in EEG responses. Appropriate selection of EEG indicators is crucial for assessing individual differences in cognitive tasks [225]. To address the problem of individual differences in EEG signals in attention assessment, multimodal EEG methods that combine eye movements or expressions can be used for correction to improve accuracy and robustness. It is also possible to resolve the individual and group differences in EEG signals by introducing a neural network with an attention mechanism. Finally, more powerful transfer learning and few-shot learning algorithms should be developed. It will enable pre-trained models to adapt more effectively to new individuals and reduce the reliance on large amounts of calibration data.

8.3.2. Complexity of Features

Currently, there is no universal standard for quantifying attention level in EEG signals. Different studies use various metrics, such as Alpha wave suppression, Theta/Beta ratio, and multi-band power ratios. Additionally, there is no universally accepted mathematical representation formula for attention assessment. EEG characteristics of human attention states may exist across different time domains, frequency domains, time–frequency domains, and exhibit complex nonlinear dynamics. Extracting the features that are most sensitive and discriminative to attention states from this vast array of data remains a challenging problem. In addition, in attention assessments, multi-channel EEG data often contains redundancy. That is, not all channels are equally important for attention assessment, and selecting the most relevant channels can effectively reduce computational load and hardware complexity. This is particularly important for portable EEG monitoring devices.
For EEG attention features, nonlinear dynamic indicators such as approximate entropy, sample entropy, and fractal dimension may be utilized to quantify complex EEG signals. Alternatively, a more comprehensive feature set can be constructed by combining time-domain, frequency-domain, time–frequency-domain, and nonlinear features. In addition, introducing attention mechanisms into deep learning models is another improvement approach. It makes the model automatically focus on the most attention-relevant parts of the EEG signal (e.g., specific channels or time periods), thereby optimizing feature selection [226].
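As an example of the nonlinear indicators mentioned above, sample entropy can be computed with a naive implementation. The parameters (m = 2, r = 0.2 × std) follow common practice, and the signals below are synthetic stand-ins for EEG.

```python
# Sketch: sample entropy as a nonlinear EEG feature. A regular signal yields
# low entropy; an irregular one yields high entropy. Naive O(N^2) version.
import numpy as np

def sample_entropy(x, m=2, r_factor=0.2):
    """Sample entropy SampEn(m, r) of a 1-D signal, with r = r_factor * std."""
    x = np.asarray(x, dtype=float)
    r, n = r_factor * x.std(), len(x)

    def match_count(length):
        # All templates of the given length, compared pairwise (i < j)
        templates = np.array([x[i:i + length] for i in range(n - length)])
        dists = np.max(np.abs(templates[:, None] - templates[None, :]), axis=-1)
        iu = np.triu_indices(len(templates), k=1)
        return np.sum(dists[iu] <= r)

    a, b = match_count(m + 1), match_count(m)
    return -np.log(a / b)

rng = np.random.default_rng(5)
t = np.linspace(0, 4 * np.pi, 400)
regular = np.sin(t)                    # predictable -> low sample entropy
irregular = rng.standard_normal(400)   # white noise -> high sample entropy
print(sample_entropy(regular), sample_entropy(irregular))
```

For long recordings, the pairwise-distance matrix grows quadratically, so production implementations typically use windowed or optimized variants.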

8.3.3. Complexity of Interpretability and Explainability

In EEG attention assessment research, interpretability and explainability are key attributes for evaluating model transparency and trust in the results, particularly with the increasing use of complex deep learning models [227]. The complexity of EEG signals and the black-box nature of deep learning models are the main sources of these challenges [228,229].
In EEG attention assessment, interpretability refers to whether one can intuitively understand which specific features of the EEG signal the model relies on when determining attention states. In traditional methods that compute power in specific frequency bands, the feature extraction process and its association with the attention state are explicit, so interpretability is high. Models with better interpretability usually have relatively simple structures and clear logic, such as linear models and decision trees, although they may be less capable than deep learning models of handling complex, nonlinear EEG data. For deep neural networks containing multiple layers of nonlinear transformations, the internal decision logic is complex, and the feature extraction and transformation processes are difficult to map directly onto human-understandable concepts. Therefore, maintaining high accuracy while enhancing interpretability remains a major research challenge in EEG signal processing and analysis.
In EEG attention assessment, explainability refers to the ability to explain why the model judges the current state to reflect focused attention, or to identify which EEG features contribute most to this decision. Although deep learning models excel at attention assessment, their decision-making processes are often highly nonlinear and abstract, making them difficult to trace and understand. In addition, EEG signals are significantly affected by individual differences, and models often generalize poorly across subjects and datasets, which further reduces explainability.
To improve the interpretability and explainability of EEG attention assessment, an attention mechanism can be introduced to highlight the key regions or features in the input signals that the model focuses on. For example, Miao et al. proposed an attention network called LMDA-Net that improves BCI performance while enhancing model interpretability through a multi-dimensional attention mechanism [230]. Shapley additive explanations (SHAP) is a game-theoretic approach that uses Shapley values to measure each feature’s contribution to the prediction outcome; by assigning importance values to individual features, it provides both local and global explanations. Studies have shown that SHAP can enhance the interpretability and explainability of EEG signal recognition [231]. In addition, multimodal EEG attention assessment methods that combine eye movements or facial expressions not only provide richer, complementary information for attention assessment, but can also enhance model interpretability through analysis of the interactions between modalities.
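The game-theoretic idea behind SHAP can be illustrated with an exact Shapley value computation over a toy additive "attention score"; the feature names and contribution values below are hypothetical, and real applications would use the SHAP library's approximations on an actual classifier rather than this brute-force enumeration.

```python
from itertools import combinations
from math import factorial

def shapley_values(features, value_fn):
    """Exact Shapley attribution over a small feature set.

    value_fn(subset) returns the model's score when only that subset
    of features is available. Each feature's Shapley value is its
    average marginal contribution over all feature orderings.
    """
    n = len(features)
    phi = {f: 0.0 for f in features}
    for f in features:
        rest = [g for g in features if g != f]
        for k in range(n):
            for s in combinations(rest, k):
                w = factorial(k) * factorial(n - k - 1) / factorial(n)
                phi[f] += w * (value_fn(set(s) | {f}) - value_fn(set(s)))
    return phi

# Toy additive score: hypothetical contributions of three EEG features.
contrib = {"TBR": 0.6, "alpha_suppression": 0.3, "gamma_power": 0.1}
score = lambda subset: sum(contrib[f] for f in subset)

phi = shapley_values(list(contrib), score)
print(round(phi["TBR"], 3))  # 0.6
```

For a purely additive score, each Shapley value recovers the feature's contribution exactly; for real nonlinear classifiers the values also split interaction effects fairly among the participating features, which is what makes the attribution informative.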

8.4. Hardware Setups for Deep Learning Method Implementation

Deep learning methods offer clear advantages in EEG attention assessment, but applying them in classroom teaching environments requires considering not only evaluation accuracy but also the system’s real-time response and power consumption. EEG devices collect multi-channel data at high sampling rates, and moving these data from the acquisition device to the GPU for processing raises latency and power-consumption challenges. Therefore, when designing a hardware system for deep learning, the following aspects need to be taken into consideration.

8.4.1. System Architecture Design

Deep learning models consume significant computational and time resources when processing complex EEG data. To reduce the latency of transmitting and processing large EEG datasets, edge computing units can be added at the EEG acquisition device; these units perform lightweight data preprocessing and initial inference, while GPUs handle the training of complex models and deep inference.
Edge computing deploys computing resources on edge devices close to the data source, which significantly reduces data transmission latency and improves real-time processing efficiency. Using an FPGA or ASIC as the edge computing unit accelerates algorithms in hardware; such units can perform signal filtering and artifact removal on the front end, as well as feature extraction and classification. Aslam et al. designed an ASIC with an SVM classifier; by using a lookup table for logarithmic division, they effectively reduced the computational complexity of the machine learning process. This ASIC achieved an emotion classification accuracy of 72.96% on eight-channel EEG signals with a power consumption of 2.04 mW [36]. Gonzalez et al. developed BioCNN, a hardware neural network on an FPGA for EEG-based emotion classification; it achieved accuracy comparable to leading software-based emotion detectors with a latency below 1 ms [232].
In the backend processing unit, the CPU and GPU communicate over a high-speed PCIe bus. The CPU is mainly responsible for receiving data streams from the EEG devices and for tasks such as format conversion, timestamp synchronization, and data preprocessing. Within EEG preprocessing, complex artifact removal algorithms, such as ICA or deep learning-based methods, can run on the GPU. Computationally intensive feature extraction techniques like STFT, WT, and HHT can also be accelerated on the GPU, which additionally handles deep learning model training, inference acceleration, and feature classification. The GPU returns the attention assessment results to the CPU, which is responsible for result visualization, storage, and user interaction.
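As an illustration of the feature extraction stage in this pipeline, the following CPU-side Python sketch derives band-power features from an STFT of a synthetic multi-channel chunk; in the architecture described above, the same computation would be batched onto the GPU for many students. The channel count and window settings are assumptions, not specifications from the surveyed systems.

```python
import numpy as np
from scipy.signal import stft

fs, n_ch = 256, 8                            # hypothetical 8-channel headset
rng = np.random.default_rng(2)
chunk = rng.standard_normal((n_ch, fs * 4))  # 4 s of raw EEG, all channels

# STFT over all channels at once: Z has shape (n_ch, n_freq, n_frames).
f, t, Z = stft(chunk, fs=fs, nperseg=256)
power = np.abs(Z) ** 2

# Average spectral power per canonical band, per channel and time frame.
bands = {"theta": (4, 8), "alpha": (8, 13), "beta": (13, 30)}
feats = np.stack([power[:, (f >= lo) & (f < hi), :].mean(axis=1)
                  for lo, hi in bands.values()], axis=-1)
print(feats.shape)  # (n_ch, n_frames, 3): channel x time x band
```

The resulting channel-by-time-by-band tensor is the kind of compact feature block that the CPU can hand to a classifier, or that a GPU kernel can produce for an entire classroom of streams in one batched call.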
Integrating edge computing with a heterogeneous backend computing architecture can optimize overall system efficiency. Edge computing reduces data transmission load and latency while preserving the critical features of the signals, and the collaborative operation of the CPU and GPU improves signal processing throughput and the system’s energy efficiency.

8.4.2. Real-Time Signal Processing and Low Power Consumption

In classroom attention assessment research, combining real-time signal processing with low power consumption presents a significant technical challenge. Typically, classroom attention assessment requires continuous real-time processing of high-frequency, multi-channel EEG data collected from many students, yet the computational power, storage capacity, and battery life of edge computing devices are generally limited, which restricts running complex signal processing algorithms directly on EEG acquisition devices.
In practical applications, the tasks of deep learning models can be distributed sensibly: data acquisition, preliminary preprocessing, and lightweight feature extraction can be performed on edge computing devices, while complex deep learning inference or high-level semantic analysis of attention states is offloaded to backend GPUs. The edge computing units embedded in EEG acquisition devices play a crucial role in improving real-time performance and reducing power consumption. Existing studies have demonstrated that preprocessing, feature extraction, training, and classification of EEG signals can all be implemented on FPGAs or ASICs. For example, Fang et al. developed a real-time EEG emotion recognition hardware system based on a CNN on an FPGA platform. They extracted emotional features using methods such as sample entropy, differential asymmetry, short-time Fourier transform, and channel reconstruction, and generated EEG images by fusing spectrograms. The system achieved an average classification accuracy of around 80%; training and real-time classification of each EEG image took 0.12495 ms and 0.02634 ms, respectively, and each emotion recognition execution required 450 ms, with a total power consumption of 76.61 mW [233]. Li et al. proposed an edge AI accelerator for real-time EEG emotion recognition, implemented on a RISC-V FPGA platform. In the experiments, the system achieved an accuracy of 79.04% for valence and 85.95% for arousal on 17-channel EEG data. The hardware design, simulated in TSMC’s 16 nm technology, operated at 500 MHz with an energy consumption of 42.69 nJ per prediction, demonstrating a 2.1-fold improvement in energy efficiency compared to traditional AI methods [234].
Regarding signal transmission, low-latency, high-bandwidth communication technologies such as Wi-Fi or 5G are recommended for data transfer between edge devices and the backend signal processing unit to enhance real-time performance. It should also be noted that Bluetooth’s limited bandwidth restricts the transmission of multi-channel, high-sampling-rate EEG signals, requiring either a lower EEG sampling rate or data compression [46].
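A back-of-envelope calculation illustrates the bandwidth concern; the channel count, sampling rate, and ADC resolution below are illustrative assumptions, not figures from a specific device.

```python
# Raw data rate of one student's uncompressed EEG stream.
n_channels = 32
fs = 500          # sampling rate in Hz
bits = 24         # ADC resolution per sample

raw_bps = n_channels * fs * bits
print(raw_bps / 1e6, "Mbit/s")  # 0.384 Mbit/s per student

# Classic Bluetooth offers roughly 1-2 Mbit/s of usable application
# throughput, so only a handful of such streams fit on one link,
# whereas Wi-Fi or 5G leaves headroom for a full classroom.
```

Scaling the same arithmetic to, say, 40 students yields over 15 Mbit/s of aggregate raw EEG traffic, which motivates the edge-side feature extraction and compression discussed above.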
In summary, in classroom attention assessment scenarios, the sensible development and use of edge computing units can improve the system’s real-time response and reduce its power consumption.

9. Future Work

The review of EEG attention assessment approaches shows that machine learning, especially deep learning, has strong potential both for removing noise and artifacts from EEG signals and for feature classification. However, applying deep learning models to EEG-based attention assessment also raises concerns about reduced interpretability and explainability. Therefore, using deep learning methods to extract attention features from EEG signals and integrating them with eye movement features is a promising route to comprehensive attention assessment: it can enhance accuracy and robustness, facilitate real-time continuous assessment, and improve model comprehensibility and interpretability. To improve educational management of teaching quality, we propose a multimodal attention assessment system combining EEG and eye movement signals for classroom teaching scenarios, as shown in Figure 5.
Although eye trackers can accurately measure each student’s eye movement trajectory, they are expensive and inconvenient to wear, especially for students with glasses. In this study, high-definition cameras placed at the front of the classroom capture students’ eye movements. In our recent work, a multimodal system combining EEG with eye movement was developed for university teaching quality evaluation, in which eye movements were likewise captured by high-definition cameras with satisfactory performance [211]. Unlike our previous studies, this work uses deep learning models as the EEG classifier to achieve automatic extraction and classification of attention features, while eye movement signals are classified with traditional machine learning methods to conserve computational resources. Moreover, the strong correlation between eye movement features and attention in the machine learning pipeline helps improve the interpretability of the multimodal attention assessment model. Another difference is that EEG and eye movement data are integrated by decision-level fusion to enhance robustness: each modality is processed separately to produce its own attention decision, and these independent decisions are then combined into the final comprehensive classroom attention assessment for education management.
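The decision-level fusion step can be sketched as a weighted combination of the two modalities' per-student outputs; the weights and probability values below are hypothetical placeholders, not parameters of the proposed system.

```python
import numpy as np

def fuse_decisions(p_eeg: np.ndarray, p_eye: np.ndarray,
                   w_eeg: float = 0.6, w_eye: float = 0.4) -> np.ndarray:
    """Weighted decision-level fusion of two per-modality outputs.

    p_eeg, p_eye: per-student probabilities of the 'attentive' class
    from the EEG model and the eye-movement classifier. The weights
    are illustrative; in practice they could be tuned on a validation
    set, e.g., in proportion to each modality's standalone accuracy.
    """
    fused = w_eeg * p_eeg + w_eye * p_eye
    return (fused >= 0.5).astype(int)  # 1 = attentive, 0 = inattentive

p_eeg = np.array([0.9, 0.4, 0.2])  # hypothetical EEG scores, 3 students
p_eye = np.array([0.8, 0.7, 0.3])  # hypothetical eye-movement scores
print(fuse_decisions(p_eeg, p_eye))  # [1 1 0]
```

Because each modality votes independently, a corrupted EEG epoch (e.g., from body movement) degrades only one input to the fusion rule rather than the whole pipeline, which is the robustness benefit of decision-level over feature-level fusion.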

10. Conclusions

As a non-invasive signal acquisition method, EEG can objectively measure the brain’s electrical activity and provide neurophysiological indicators related to attention level, making it highly significant in attention assessment research. This article provides a comprehensive review of EEG-based methods for attention assessment. It covers aspects such as EEG signal acquisition, signal preprocessing, feature extraction and selection, feature classification, and evaluation metrics. This article also discusses some challenges currently faced in EEG attention assessment research in classroom environments, such as acquisition of high-quality EEG signals, fusion of multimodal data, complexity of EEG data, and hardware setups for deep learning method implementation.
Many existing EEG-based attention assessments are conducted in laboratories or other controlled settings, but real classroom environments involve unpredictable and uncontrollable situations, such as students’ body movements and classroom noise. Unimodal EEG-based attention assessment may lack accuracy in this scenario, so a multimodal system combining EEG and eye movement signals is proposed to improve classroom teaching management. Given the realities of large-class teaching, deep learning methods are introduced for EEG attention classification, and eye movement signals are captured using high-definition cameras installed in the classroom. We hope this article provides objective and practical guidance on EEG-based approaches to classroom attention assessment in education.

Author Contributions

Conceptualization, L.W. and Y.Y.; methodology, Y.Q. and S.Z.; writing—original draft preparation, L.W.; writing—review and editing, Y.Y., Y.Q., and S.Z.; project administration, S.Z.; funding acquisition, Y.Y. and S.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This work is supported by the “Intelligent Assessment and Visualization of Learning Stress Based on EEG Rhythms and Eye Movement Signal Fusion”, 2023 Higher Education Research Planning Project by the China Association of Higher Education (23XXK0402); “Innovative Experimental Project Based on Adaptive Closed-Loop Neuromodulation System”, 2025 Sichuan Provincial Innovative Experimental Project for Undergraduate Universities (143); “Enhancing Teaching Competence of University Faculty in the Context of Artificial Intelligence”, Ministry of Education Industry–University Cooperative Education Program (230903879055748); “Application of ’EEG Data + Artificial Intelligence’ in Comprehensive Evaluation of Higher Education Teaching”, Sichuan Network Culture Research Center, a Key Research Base for Social Sciences in Sichuan Province (WLWH23-21); “A study on the correlation between PUA forms in family education and the formation of adolescent resistance psychology based on multi-source information” and “Research on the mechanism of improving teaching effectiveness of middle school teachers in the context of educational modernization”, Neijiang Municipal Philosophy and Social Sciences Planning Project (NJ2025ZD007, NJ2025YB042); “Development Pathways for Smart Education in Local Universities in the Era of Artificial Intelligence”, Sichuan Research Center for Educational Informatization Application and Development (JYXX23-013); and “AI-Enabled Teaching Reform Project”, University-level educational research project at Neijiang Normal University (JG202413).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Acknowledgments

The authors would like to thank Xincheng Li from the International Department of Chengdu Foreign Language School for his technical support for the data collection and table analysis.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
EEG: Electroencephalogram
CPT: Conners’ Continuous Performance Test
WCST: Wisconsin Card Sorting Test
CNN: Convolutional neural network
BCI: Brain–computer interface
ADHD: Attention deficit hyperactivity disorder
SNR: Signal-to-noise ratio
AFE: Analog front-end
MCU: Microcontroller unit
GPU: Graphics processing unit
FPGA: Field-programmable gate array
ASIC: Application-specific integrated circuit
ADC: Analog-to-digital converter
EOG: Electrooculography
EMG: Electromyography
ECG: Electrocardiography
ICA: Independent component analysis
WT: Wavelet transform
EMD: Empirical mode decomposition
IMF: Intrinsic mode function
ERP: Event-related potential
CCA: Canonical correlation analysis
SSVEP: Steady-state visually evoked potential
FFT: Fast Fourier transform
PSD: Power spectral density
TBR: Theta/beta ratio
STFT: Short-time Fourier transform
HHT: Hilbert–Huang transform
WVD: Wigner–Ville distribution
WPD: Wavelet packet decomposition
FD: Fractal dimension
HFD: Higuchi fractal dimension
LLE: Largest Lyapunov exponent
CD: Correlation dimension
KFD: Katz fractal dimension
CSP: Common spatial pattern
FB-CSP: Filter-bank common spatial pattern
RSF: Riemannian geometry-based spatial filtering
CFS: Correlation-based feature selection
RFE: Recursive feature elimination
GA: Genetic algorithm
LASSO: Least absolute shrinkage and selection operator
PCA: Principal component analysis
KNN: K-nearest neighbor
RNN: Recurrent neural network
SVM: Support vector machine
LSTM: Long short-term memory
GRU: Gated recurrent unit
RMSE: Root mean squared error
AUC-ROC: Area under the receiver operating characteristic curve
PDMS: Polydimethylsiloxane
ANOVA: Analysis of variance
GAN: Generative adversarial network
DWSAE: Deep wavelet sparse autoencoder
SHAP: Shapley additive explanations
RISC: Reduced instruction set computer

References

  1. Carini, R.M.; Kuh, G.D.; Klein, S.P. Student engagement and student learning: Testing the linkages. Res. High. Educ. 2006, 47, 1–32. [Google Scholar] [CrossRef]
  2. Raca, M.; Kidzinski, L.; Dillenbourg, P. Translating head motion into attention-towards processing of student’s body-language. In Proceedings of the 8th International Conference on Educational Data Mining, Madrid, Spain, 26–29 July 2015. [Google Scholar]
  3. Blume, F.; Hudak, J.; Dresler, T.; Ehlis, A.C.; Kühnhausen, J.; Renner, T.J.; Gawrilow, C. NIRS-based neurofeedback training in a virtual reality classroom for children with attention-deficit/hyperactivity disorder: Study protocol for a randomized controlled trial. Trials 2017, 18, 41. [Google Scholar] [CrossRef] [PubMed]
  4. Eom, H.; Kim, K.; Lee, S.; Hong, Y.J.; Heo, J.; Kim, J.J.; Kim, E. Development of virtual reality continuous performance test utilizing social cues for children and adolescents with attention-deficit/hyperactivity disorder. Cyberpsychol. Behav. Soc. Netw. 2019, 22, 198–204. [Google Scholar] [CrossRef] [PubMed]
  5. Gil-Berrozpe, G.; Sánchez-Torres, A.; Moreno-Izco, L.; Lorente-Omeñaca, R.; Ballesteros, A.; Rosero, Á.; Peralta, V.; Cuesta, M. Empirical validation of the wcst network structure in patients. Eur. Psychiatry 2021, 64, S519. [Google Scholar] [CrossRef]
  6. Schepers, J.M. The construction and evaluation of an attention questionnaire. SA J. Ind. Psychol. 2007, 33, 16–24. [Google Scholar] [CrossRef]
  7. Krosnick, J.A. Response strategies for coping with the cognitive demands of attitude measures in surveys. Appl. Cogn. Psychol. 1991, 5, 213–236. [Google Scholar] [CrossRef]
  8. Larson, R.B. Controlling social desirability bias. Int. J. Mark. Res. 2019, 61, 534–547. [Google Scholar] [CrossRef]
  9. Chiang, H.S.; Hsiao, K.L.; Liu, L.C. EEG-based detection model for evaluating and improving learning attention. J. Med. Biol. Eng. 2018, 38, 847–856. [Google Scholar] [CrossRef]
  10. Gupta, S.K.; Ashwin, T.; Guddeti, R.M.R. Students’ affective content analysis in smart classroom environment using deep learning techniques. Multimed. Tools Appl. 2019, 78, 25321–25348. [Google Scholar] [CrossRef]
  11. Pabba, C.; Kumar, P. An intelligent system for monitoring students’ engagement in large classroom teaching through facial expression recognition. Expert Syst. 2022, 39, e12839. [Google Scholar] [CrossRef]
  12. Zhu, X.; Ye, S.; Zhao, L.; Dai, Z. Hybrid attention cascade network for facial expression recognition. Sensors 2021, 21, 2003. [Google Scholar] [CrossRef] [PubMed]
  13. Rosengrant, D.; Hearrington, D.; O’Brien, J. Investigating student sustained attention in a guided inquiry lecture course using an eye tracker. Educ. Psychol. Rev. 2021, 33, 11–26. [Google Scholar] [CrossRef]
  14. Wang, Y.; Lu, S.; Harter, D. Multi-sensor eye-tracking systems and tools for capturing student attention and understanding engagement in learning: A review. IEEE Sens. J. 2021, 21, 22402–22413. [Google Scholar] [CrossRef]
  15. Aoki, F.; Fetz, E.; Shupe, L.; Lettich, E.; Ojemann, G. Increased gamma-range activity in human sensorimotor cortex during performance of visuomotor tasks. Clin. Neurophysiol. 1999, 110, 524–537. [Google Scholar] [CrossRef]
  16. Ni, D.; Wang, S.; Liu, G. The EEG-Based Attention Analysis in Multimedia m-Learning. Comput. Math. Methods Med. 2020, 2020, 4837291. [Google Scholar] [CrossRef]
  17. Xu, K.; Torgrimson, S.J.; Torres, R.; Lenartowicz, A.; Grammer, J.K. EEG data quality in real-world settings: Examining neural correlates of attention in school-aged children. Mind Brain Educ. 2022, 16, 221–227. [Google Scholar] [CrossRef]
  18. Al-Nafjan, A.; Aldayel, M. Predict students’ attention in online learning using EEG data. Sustainability 2022, 14, 6553. [Google Scholar] [CrossRef]
  19. Wang, T.S.; Wang, S.S.; Wang, C.L.; Wong, S.B. Theta/beta ratio in EEG correlated with attentional capacity assessed by Conners Continuous Performance Test in children with ADHD. Front. Psychiatry 2024, 14, 1305397. [Google Scholar] [CrossRef] [PubMed]
  20. Bazanova, O.M.; Auer, T.; Sapina, E.A. On the efficiency of individualized theta/beta ratio neurofeedback combined with forehead EMG training in ADHD children. Front. Hum. Neurosci. 2018, 12, 3. [Google Scholar] [CrossRef]
  21. Obaidan, H.B.; Hussain, M.; Almajed, R. EEG_DMNet: A deep multi-scale convolutional neural network for electroencephalography-based driver drowsiness detection. Electronics 2024, 13, 2084. [Google Scholar] [CrossRef]
  22. Jan, J.E.; Wong, P.K. Behaviour of the alpha rhythm in electroencephalograms of visually impaired children. Dev. Med. Child Neurol. 1988, 30, 444–450. [Google Scholar] [CrossRef]
  23. Ursuţiu, D.; Samoilă, C.; Drăgulin, S.; Constantin, F.A. Investigation of music and colours influences on the levels of emotion and concentration. In Proceedings of the Online Engineering & Internet of Things: Proceedings of the 14th International Conference on Remote Engineering and Virtual Instrumentation REV 2017, New York, NY, USA, 15–17 March 2017; Springer: Berlin/Heidelberg, Germany, 2017; pp. 910–918. [Google Scholar]
  24. Kawala-Sterniuk, A.; Browarska, N.; Al-Bakri, A.; Pelc, M.; Zygarlicki, J.; Sidikova, M.; Martinek, R.; Gorzelanczyk, E.J. Summary of over fifty years with brain-computer interfaces—A review. Brain Sci. 2021, 11, 43. [Google Scholar] [CrossRef]
  25. Klimesch, W. EEG alpha and theta oscillations reflect cognitive and memory performance: A review and analysis. Brain Res. Rev. 1999, 29, 169–195. [Google Scholar] [CrossRef] [PubMed]
  26. Turkeš, R.; Mortier, S.; De Winne, J.; Botteldooren, D.; Devos, P.; Latré, S.; Verdonck, T. Who is WithMe? EEG features for attention in a visual task, with auditory and rhythmic support. Front. Neurosci. 2025, 18, 1434444. [Google Scholar] [CrossRef]
  27. Attar, E.T. Eeg waves studying intensively to recognize the human attention behavior. In Proceedings of the 2023 International Conference on Computing, Communication, and Intelligent Systems (ICCCIS), Greater Noida, India, 3–4 November 2023; IEEE: Piscataway, NJ, USA, 2023; pp. 1–5. [Google Scholar]
  28. Sharma, M.; Kumar, M.; Kushwaha, S.; Kumar, D. Quantitative electroencephalography–A promising biomarker in children with attention deficit/hyperactivity disorder. Arch. Ment. Health 2022, 23, 129–132. [Google Scholar] [CrossRef]
  29. Newson, J.J.; Thiagarajan, T.C. EEG frequency bands in psychiatric disorders: A review of resting state studies. Front. Hum. Neurosci. 2019, 12, 521. [Google Scholar] [CrossRef]
  30. Shi, T.; Li, X.; Song, J.; Zhao, N.; Sun, C.; Xia, W.; Wu, L.; Tomoda, A. EEG characteristics and visual cognitive function of children with attention deficit hyperactivity disorder (ADHD). Brain Dev. 2012, 34, 806–811. [Google Scholar] [CrossRef] [PubMed]
  31. Samal, P.; Hashmi, M.F. Role of machine learning and deep learning techniques in EEG-based BCI emotion recognition system: A review. Artif. Intell. Rev. 2024, 57, 50. [Google Scholar] [CrossRef]
  32. Magazzini, L.; Singh, K.D. Spatial attention modulates visual gamma oscillations across the human ventral stream. Neuroimage 2018, 166, 219–229. [Google Scholar] [CrossRef]
  33. Rashid, U.; Niazi, I.K.; Signal, N.; Taylor, D. An EEG experimental study evaluating the performance of Texas instruments ADS1299. Sensors 2018, 18, 3721. [Google Scholar] [CrossRef]
  34. Li, G.; Chung, W.Y. A context-aware EEG headset system for early detection of driver drowsiness. Sensors 2015, 15, 20873–20893. [Google Scholar] [CrossRef] [PubMed]
  35. Liu, H.; Zhu, Z.; Wang, Z.; Zhao, X.; Xu, T.; Zhou, T.; Wu, C.; Pignaton De Freitas, E.; Hu, H. Design and implementation of a scalable and high-throughput EEG acquisition and analysis system. Moore More 2024, 1, 14. [Google Scholar] [CrossRef]
  36. Aslam, A.R.; Altaf, M.A.B. An on-chip processor for chronic neurological disorders assistance using negative affectivity classification. IEEE Trans. Biomed. Circuits Syst. 2020, 14, 838–851. [Google Scholar] [CrossRef]
  37. Zanetti, R.; Arza, A.; Aminifar, A.; Atienza, D. Real-time EEG-based cognitive workload monitoring on wearable devices. IEEE Trans. Biomed. Eng. 2021, 69, 265–277. [Google Scholar] [CrossRef]
  38. Gonzalez, H.A.; George, R.; Muzaffar, S.; Acevedo, J.; Hoeppner, S.; Mayr, C.; Yoo, J.; Fitzek, F.H.; Elfadel, I.M. Hardware acceleration of EEG-based emotion classification systems: A comprehensive survey. IEEE Trans. Biomed. Circuits Syst. 2021, 15, 412–442. [Google Scholar] [CrossRef]
  39. Li, P.; Cai, S.; Su, E.; Xie, L. A biologically inspired attention network for EEG-based auditory attention detection. IEEE Signal Process. Lett. 2021, 29, 284–288. [Google Scholar] [CrossRef]
  40. Boyle, N.B.; Dye, L.; Lawton, C.L.; Billington, J. A combination of green tea, rhodiola, magnesium, and B vitamins increases electroencephalogram theta activity during attentional task performance under conditions of induced social stress. Front. Nutr. 2022, 9, 935001. [Google Scholar] [CrossRef]
  41. Cowley, B.U.; Juurmaa, K.; Palomäki, J. Reduced power in fronto-parietal theta EEG linked to impaired attention-sampling in adult ADHD. Eneuro 2022, 9. [Google Scholar] [CrossRef] [PubMed]
  42. Das, N.; Bertrand, A.; Francart, T. EEG-based auditory attention detection: Boundary conditions for background noise and speaker positions. J. Neural Eng. 2018, 15, 066017. [Google Scholar] [CrossRef]
  43. Cai, S.; Zhu, H.; Schultz, T.; Li, H. EEG-based auditory attention detection in cocktail party environment. APSIPA Trans. Signal Inf. Process. 2023, 12, e22. [Google Scholar] [CrossRef]
  44. Putze, F.; Eilts, H. Analyzing the importance of EEG channels for internal and external attention detection. In Proceedings of the 2023 IEEE International Conference on Systems, Man, and Cybernetics (SMC), Honolulu, HI, USA, 1–4 October 2023; IEEE: Piscataway, NJ, USA, 2023; pp. 4752–4757. [Google Scholar]
  45. Trajkovic, J.; Veniero, D.; Hanslmayr, S.; Palva, S.; Cruz, G.; Romei, V.; Thut, G. Top-down and bottom-up interactions rely on nested brain oscillations to shape rhythmic visual attention sampling. PLoS Biol. 2025, 23, e3002688. [Google Scholar] [CrossRef] [PubMed]
  46. Li, R.; Zhang, Y.; Fan, G.; Li, Z.; Li, J.; Fan, S.; Lou, C.; Liu, X. Design and implementation of high sampling rate and multichannel wireless recorder for EEG monitoring and SSVEP response detection. Front. Neurosci. 2023, 17, 1193950. [Google Scholar] [CrossRef]
  47. Lei, J.; Li, X.; Chen, W.; Wang, A.; Han, Q.; Bai, S.; Zhang, M. Design of a compact wireless EEG recorder with extra-high sampling rate and precise time synchronization for auditory brainstem response. IEEE Sens. J. 2021, 22, 4484–4493. [Google Scholar] [CrossRef]
  48. Valentin, O.; Ducharme, M.; Crétot-Richert, G.; Monsarrat-Chanon, H.; Viallet, G.; Delnavaz, A.; Voix, J. Validation and benchmarking of a wearable EEG acquisition platform for real-world applications. IEEE Trans. Biomed. Circuits Syst. 2018, 13, 103–111. [Google Scholar] [CrossRef]
  49. Lin, C.T.; Wang, Y.; Chen, S.F.; Huang, K.C.; Liao, L.D. Design and verification of a wearable wireless 64-channel high-resolution EEG acquisition system with wi-fi transmission. Med. Biol. Eng. Comput. 2023, 61, 3003–3019. [Google Scholar] [CrossRef]
  50. Zhou, S.; Zhang, W.; Liu, Y.; Chen, X.; Liu, H. Real-Time Driver Attention Detection in Complex Driving Environments via Binocular Depth Compensation and Multi-Source Temporal Bidirectional Long Short-Term Memory Network. Sensors 2025, 25, 5548. [Google Scholar] [CrossRef]
  51. Pichandi, S.; Balasubramanian, G.; Chakrapani, V. Hybrid deep models for parallel feature extraction and enhanced emotion state classification. Sci. Rep. 2024, 14, 24957. [Google Scholar] [CrossRef]
  52. Yuan, S.; Yan, K.; Wang, S.; Liu, J.X.; Wang, J. EEG-Based Seizure Prediction Using Hybrid DenseNet–ViT Network with Attention Fusion. Brain Sci. 2024, 14, 839. [Google Scholar] [CrossRef] [PubMed]
  53. Alemaw, A.S.; Slavic, G.; Zontone, P.; Marcenaro, L.; Gomez, D.M.; Regazzoni, C. Modeling interactions between autonomous agents in a multi-agent self-awareness architecture. IEEE Trans. Multimed. 2025, 27, 5035–5049. [Google Scholar] [CrossRef]
  54. Kim, S.K.; Kim, J.B.; Kim, H.; Kim, L.; Kim, S.H. Early Diagnosis of Alzheimer’s Disease in Human Participants Using EEGConformer and Attention-Based LSTM During the Short Question Task. Diagnostics 2025, 15, 448. [Google Scholar] [CrossRef]
  55. Jin, J.; Chen, Z.; Cai, H.; Pan, J. Affective eeg-based person identification with continual learning. IEEE Trans. Instrum. Meas. 2024, 73, 4007716. [Google Scholar] [CrossRef]
  56. Liu, H.W.; Wang, S.; Tong, S.X. DysDiTect: Dyslexia Identification Using CNN-Positional-LSTM-Attention Modeling with Chinese Dictation Task. Brain Sci. 2024, 14, 444. [Google Scholar] [CrossRef]
  57. Li, M.; Yu, P.; Shen, Y. A spatial and temporal transformer-based EEG emotion recognition in VR environment. Front. Hum. Neurosci. 2025, 19, 1517273. [Google Scholar] [CrossRef] [PubMed]
  58. Kwon, Y.H.; Shin, S.B.; Kim, S.D. Electroencephalography based fusion two-dimensional (2D)-convolution neural networks (CNN) model for emotion recognition system. Sensors 2018, 18, 1383. [Google Scholar] [CrossRef] [PubMed]
  59. Chauhan, N.; Choi, B.J. Regional contribution in electrophysiological-based classifications of attention deficit hyperactive disorder (ADHD) using machine learning. Computation 2023, 11, 180. [Google Scholar] [CrossRef]
  60. Djamal, E.C.; Ramadhan, R.I.; Mandasari, M.I.; Djajasasmita, D. Identification of post-stroke EEG signal using wavelet and convolutional neural networks. Bull. Electr. Eng. Inform. 2020, 9, 1890–1898. [Google Scholar] [CrossRef]
  61. Ieracitano, C.; Mammone, N.; Hussain, A.; Morabito, F.C. A novel explainable machine learning approach for EEG-based brain-computer interface systems. Neural Comput. Appl. 2022, 34, 11347–11360. [Google Scholar] [CrossRef]
  62. Homan, R.W. The 10–20 electrode system and cerebral location. Am. J. EEG Technol. 1988, 28, 269–279. [Google Scholar] [CrossRef]
  63. Koessler, L.; Maillard, L.; Benhadid, A.; Vignal, J.P.; Felblinger, J.; Vespignani, H.; Braun, M. Automated cortical projection of EEG sensors: Anatomical correlation via the international 10–10 system. Neuroimage 2009, 46, 64–72. [Google Scholar] [CrossRef]
  64. Meng, Y.; Liu, Y.; Wang, G.; Song, H.; Zhang, Y.; Lu, J.; Li, P.; Ma, X. M-NIG: Mobile network information gain for EEG-based epileptic seizure prediction. Sci. Rep. 2025, 15, 15181. [Google Scholar] [CrossRef]
  65. Paul, A.; Hota, G.; Khaleghi, B.; Xu, Y.; Rosing, T.; Cauwenberghs, G. Attention state classification with in-ear EEG. In Proceedings of the 2021 IEEE Biomedical Circuits and Systems Conference (BioCAS), Berlin, Germany, 7–9 October 2021; IEEE: Piscataway, NJ, USA, 2021; pp. 1–5. [Google Scholar]
  66. Holtze, B.; Rosenkranz, M.; Jaeger, M.; Debener, S.; Mirkovic, B. Ear-EEG measures of auditory attention to continuous speech. Front. Neurosci. 2022, 16, 869426. [Google Scholar] [CrossRef] [PubMed]
  67. Munteanu, D.; Munteanu, N. Comparison Between Assisted Training and Classical Training in Nonformal Learning Based on Automatic Attention Measurement Using a Neurofeedback Device. eLearn. Softw. Educ. 2019, 1, 302. [Google Scholar]
68. Rivas, F.; Sierra-Garcia, J.E.; Camara, J.M. Comparison of LSTM- and GRU-Type RNN networks for attention and meditation prediction on raw EEG data from low-cost headsets. Electronics 2025, 14, 707. [Google Scholar] [CrossRef]
  69. Klimesch, W.; Doppelmayr, M.; Russegger, H.; Pachinger, T.; Schwaiger, J. Induced alpha band power changes in the human EEG and attention. Neurosci. Lett. 1998, 244, 73–76. [Google Scholar] [CrossRef]
  70. Cai, S.; Zhang, R.; Zhang, M.; Wu, J.; Li, H. EEG-based auditory attention detection with spiking graph convolutional network. IEEE Trans. Cogn. Dev. Syst. 2024, 16, 1698–1706. [Google Scholar] [CrossRef]
  71. Cai, S.; Su, E.; Xie, L.; Li, H. EEG-based auditory attention detection via frequency and channel neural attention. IEEE Trans. Hum.-Mach. Syst. 2021, 52, 256–266. [Google Scholar] [CrossRef]
  72. Liu, N.H.; Chiang, C.Y.; Chu, H.C. Recognizing the degree of human attention using EEG signals from mobile sensors. Sensors 2013, 13, 10273–10286. [Google Scholar] [CrossRef]
  73. Ko, L.W.; Komarov, O.; Hairston, W.D.; Jung, T.P.; Lin, C.T. Sustained attention in real classroom settings: An EEG study. Front. Hum. Neurosci. 2017, 11, 388. [Google Scholar] [CrossRef]
74. Ciccarelli, G.; Nolan, M.; Perricone, J.; Calamia, P.T.; Haro, S.; O’Sullivan, J.; Mesgarani, N.; Quatieri, T.F.; Smalt, C.J. Comparison of two-talker attention decoding from EEG with nonlinear neural networks and linear methods. Sci. Rep. 2019, 9, 11538. [Google Scholar] [CrossRef]
  75. Nogueira, W.; Cosatti, G.; Schierholz, I.; Egger, M.; Mirkovic, B.; Büchner, A. Toward decoding selective attention from single-trial EEG data in cochlear implant users. IEEE Trans. Biomed. Eng. 2019, 67, 38–49. [Google Scholar] [CrossRef]
76. Sinha, S.R.; Sullivan, L.R.; Sabau, D.; Orta, D.S.J.; Dombrowski, K.E.; Halford, J.J.; Hani, A.J.; Drislane, F.W.; Stecker, M.M. American Clinical Neurophysiology Society Guideline 1: Minimum technical requirements for performing clinical electroencephalography. Neurodiagn. J. 2016, 56, 235–244. [Google Scholar] [CrossRef]
  77. Choi, Y.; Kim, M.; Chun, C. Effect of temperature on attention ability based on electroencephalogram measurements. Build. Environ. 2019, 147, 299–304. [Google Scholar] [CrossRef]
  78. Bleichner, M.G.; Mirkovic, B.; Debener, S. Identifying auditory attention with ear-EEG: CEEGrid versus high-density cap-EEG comparison. J. Neural Eng. 2016, 13, 066004. [Google Scholar] [CrossRef]
  79. Wang, J.; Wang, W.; Hou, Z.G. Toward improving engagement in neural rehabilitation: Attention enhancement based on brain–computer interface and audiovisual feedback. IEEE Trans. Cogn. Dev. Syst. 2019, 12, 787–796. [Google Scholar] [CrossRef]
  80. Minguillon, J.; Lopez-Gordo, M.A.; Pelayo, F. Trends in EEG-BCI for daily-life: Requirements for artifact removal. Biomed. Signal Process. Control 2017, 31, 407–418. [Google Scholar] [CrossRef]
  81. Radüntz, T.; Scouten, J.; Hochmuth, O.; Meffert, B. Automated EEG artifact elimination by applying machine learning algorithms to ICA-based features. J. Neural Eng. 2017, 14, 046004. [Google Scholar] [CrossRef] [PubMed]
  82. Noorbasha, S.K.; Sudha, G.F. Removal of EOG artifacts and separation of different cerebral activity components from single channel EEG—An efficient approach combining SSA–ICA with wavelet thresholding for BCI applications. Biomed. Signal Process. Control 2021, 63, 102168. [Google Scholar] [CrossRef]
  83. Phadikar, S.; Sinha, N.; Ghosh, R. Automatic eyeblink artifact removal from EEG signal using wavelet transform with heuristically optimized threshold. IEEE J. Biomed. Health Inform. 2020, 25, 475–484. [Google Scholar] [CrossRef]
  84. Nayak, A.B.; Shah, A.; Maheshwari, S.; Anand, V.; Chakraborty, S.; Kumar, T.S. An empirical wavelet transform-based approach for motion artifact removal in electroencephalogram signals. Decis. Anal. J. 2024, 10, 100420. [Google Scholar] [CrossRef]
  85. Maddirala, A.K.; Veluvolu, K.C. ICA with CWT and k-means for eye-blink artifact removal from fewer channel EEG. IEEE Trans. Neural Syst. Rehabil. Eng. 2022, 30, 1361–1373. [Google Scholar] [CrossRef]
  86. Patel, R.; Janawadkar, M.P.; Sengottuvel, S.; Gireesan, K.; Radhakrishnan, T.S. Suppression of eye-blink associated artifact using single channel EEG data by combining cross-correlation with empirical mode decomposition. IEEE Sens. J. 2016, 16, 6947–6954. [Google Scholar] [CrossRef]
  87. Wang, G.; Teng, C.; Li, K.; Zhang, Z.; Yan, X. The removal of EOG artifacts from EEG signals using independent component analysis and multivariate empirical mode decomposition. IEEE J. Biomed. Health Inform. 2015, 20, 1301–1308. [Google Scholar] [CrossRef]
  88. Fathima, S.; Ahmed, M. Hierarchical-variational mode decomposition for baseline correction in electroencephalogram signals. IEEE Open J. Instrum. Meas. 2023, 2, 4000208. [Google Scholar] [CrossRef]
  89. Lo, P.C.; Leu, J.S. Adaptive baseline correction of meditation EEG. Am. J. Electroneurodiagn. Technol. 2001, 41, 142–155. [Google Scholar] [CrossRef]
  90. Kessler, R.; Enge, A.; Skeide, M.A. How EEG preprocessing shapes decoding performance. Commun. Biol. 2025, 8, 1039. [Google Scholar] [CrossRef] [PubMed]
91. Xu, N.; Gao, X.; Hong, B.; Miao, X.; Gao, S.; Yang, F. BCI Competition 2003-Data set IIb: Enhancing P300 wave detection using ICA-based subspace projections for BCI applications. IEEE Trans. Biomed. Eng. 2004, 51, 1067–1072. [Google Scholar] [CrossRef]
  92. Maddirala, A.K.; Shaik, R.A. Separation of sources from single-channel EEG signals using independent component analysis. IEEE Trans. Instrum. Meas. 2017, 67, 382–393. [Google Scholar] [CrossRef]
  93. Lin, C.T.; Huang, C.S.; Yang, W.Y.; Singh, A.K.; Chuang, C.H.; Wang, Y.K. Real-Time EEG Signal Enhancement Using Canonical Correlation Analysis and Gaussian Mixture Clustering. J. Healthc. Eng. 2018, 2018, 5081258. [Google Scholar] [CrossRef]
94. Kalunga, E.; Djouani, K.; Hamam, Y.; Chevallier, S.; Monacelli, E. SSVEP enhancement based on Canonical Correlation Analysis to improve BCI performances. In Proceedings of the 2013 AFRICON, Pointe aux Piments, Mauritius, 9–12 September 2013; IEEE: Piscataway, NJ, USA, 2013; pp. 1–5. [Google Scholar]
  95. Cai, S.; Li, P.; Su, E.; Xie, L. Auditory attention detection via cross-modal attention. Front. Neurosci. 2021, 15, 652058. [Google Scholar] [CrossRef]
  96. Khanmohammadi, S.; Chou, C.A. Adaptive seizure onset detection framework using a hybrid PCA–CSP approach. IEEE J. Biomed. Health Inform. 2017, 22, 154–160. [Google Scholar] [CrossRef]
  97. Molla, M.K.I.; Tanaka, T.; Osa, T.; Islam, M.R. EEG signal enhancement using multivariate wavelet transform application to single-trial classification of event-related potentials. In Proceedings of the 2015 IEEE International Conference on Digital Signal Processing (DSP), Singapore, 21–24 July 2015; IEEE: Piscataway, NJ, USA, 2015; pp. 804–808. [Google Scholar]
  98. Sita, G.; Ramakrishnan, A. Wavelet domain nonlinear filtering for evoked potential signal enhancement. Comput. Biomed. Res. 2000, 33, 431–446. [Google Scholar] [CrossRef]
  99. Maki, H.; Toda, T.; Sakti, S.; Neubig, G.; Nakamura, S. EEG signal enhancement using multi-channel Wiener filter with a spatial correlation prior. In Proceedings of the 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), South Brisbane, Australia, 19–24 April 2015; IEEE: Piscataway, NJ, USA, 2015; pp. 2639–2643. [Google Scholar]
  100. Yadav, S.; Saha, S.K.; Kar, R. An application of the Kalman filter for EEG/ERP signal enhancement with the autoregressive realisation. Biomed. Signal Process. Control 2023, 86, 105213. [Google Scholar] [CrossRef]
  101. Singh, A.K.; Krishnan, S. Trends in EEG signal feature extraction applications. Front. Artif. Intell. 2023, 5, 1072801. [Google Scholar] [CrossRef]
  102. Dallmer-Zerbe, I.; Popp, F.; Lam, A.P.; Philipsen, A.; Herrmann, C.S. Transcranial alternating current stimulation (tACS) as a tool to modulate P300 amplitude in attention deficit hyperactivity disorder (ADHD): Preliminary findings. Brain Topogr. 2020, 33, 191–207. [Google Scholar] [CrossRef]
  103. Zhang, G.; Luck, S.J. Variations in ERP data quality across paradigms, participants, and scoring procedures. Psychophysiology 2023, 60, e14264. [Google Scholar] [CrossRef]
  104. Mehmood, R.M.; Bilal, M.; Vimal, S.; Lee, S.W. EEG-based affective state recognition from human brain signals by using Hjorth-activity. Measurement 2022, 202, 111738. [Google Scholar] [CrossRef]
  105. Raj, V.; Hazarika, J.; Hazra, R. Feature selection for attention demanding task induced EEG detection. In Proceedings of the 2020 IEEE Applied Signal Processing Conference (ASPCON), Kolkata, India, 7–9 October 2020; IEEE: Piscataway, NJ, USA, 2020; pp. 11–15. [Google Scholar]
  106. Harmony, T. The functional significance of delta oscillations in cognitive processing. Front. Integr. Neurosci. 2013, 7, 83. [Google Scholar] [CrossRef] [PubMed]
  107. Matsuo, M.; Higuchi, T.; Ichibakase, T.; Suyama, H.; Takahara, R.; Nakamura, M. Differences in Electroencephalography Power Levels between Poor and Good Performance in Attentional Tasks. Brain Sci. 2024, 14, 527. [Google Scholar] [CrossRef] [PubMed]
  108. Wang, C.; Wang, X.; Zhu, M.; Pi, Y.; Wang, X.; Wan, F.; Chen, S.; Li, G. Spectrum power and brain functional connectivity of different EEG frequency bands in attention network tests. In Proceedings of the 2021 43rd Annual International Conference of the IEEE Engineering in Medicine & Biology Society (EMBC), Virtual, 1–5 November 2021; IEEE: Piscataway, NJ, USA, 2021; pp. 224–227. [Google Scholar]
  109. Sharma, A.; Singh, M. Assessing alpha activity in attention and relaxed state: An EEG analysis. In Proceedings of the 2015 1st International Conference on Next Generation Computing Technologies (NGCT), Dehradun, India, 4–5 September 2015; IEEE: Piscataway, NJ, USA, 2015; pp. 508–513. [Google Scholar]
  110. Yang, X.; Fiebelkorn, I.C.; Jensen, O.; Knight, R.T.; Kastner, S. Differential neural mechanisms underlie cortical gating of visual spatial attention mediated by alpha-band oscillations. Proc. Natl. Acad. Sci. USA 2024, 121, e2313304121. [Google Scholar] [CrossRef] [PubMed]
  111. Kahlbrock, N.; Butz, M.; May, E.S.; Brenner, M.; Kircheis, G.; Häussinger, D.; Schnitzler, A. Lowered frequency and impaired modulation of gamma band oscillations in a bimodal attention task are associated with reduced critical flicker frequency. Neuroimage 2012, 61, 216–227. [Google Scholar] [CrossRef]
  112. Van Son, D.; De Blasio, F.M.; Fogarty, J.S.; Angelidis, A.; Barry, R.J.; Putman, P. Frontal EEG theta/beta ratio during mind wandering episodes. Biol. Psychol. 2019, 140, 19–27. [Google Scholar] [CrossRef]
  113. Deshmukh, M.; Khemchandani, M.; Mhatre, M. Impact of brain regions on attention deficit hyperactivity disorder (ADHD) electroencephalogram (EEG) signals: Comparison of machine learning algorithms with empirical mode decomposition and time domain analysis. Appl. Neuropsychol. Child 2025, 1–17. [Google Scholar] [CrossRef]
  114. Sharma, Y.; Singh, B.K. Classification of children with attention-deficit hyperactivity disorder using Wigner-Ville time-frequency and deep expEEGNetwork feature-based computational models. IEEE Trans. Med. Robot. Bionics 2023, 5, 890–902. [Google Scholar] [CrossRef]
  115. Xu, X.; Nie, X.; Zhang, J.; Xu, T. Multi-level attention recognition of EEG based on feature selection. Int. J. Environ. Res. Public Health 2023, 20, 3487. [Google Scholar] [CrossRef]
  116. Ke, Y.; Chen, L.; Fu, L.; Jia, Y.; Li, P.; Zhao, X.; Qi, H.; Zhou, P.; Zhang, L.; Wan, B.; et al. Visual attention recognition based on nonlinear dynamical parameters of EEG. Bio-Med. Mater. Eng. 2014, 24, 349–355. [Google Scholar] [CrossRef] [PubMed]
  117. Wang, Y.; Zhou, W.; Yuan, Q.; Li, X.; Meng, Q.; Zhao, X.; Wang, J. Comparison of ictal and interictal EEG signals using fractal features. Int. J. Neural Syst. 2013, 23, 1350028. [Google Scholar] [CrossRef]
  118. Lee, M.W.; Yang, N.J.; Mok, H.K.; Yang, R.C.; Chiu, Y.H.; Lin, L.C. Music and movement therapy improves quality of life and attention and associated electroencephalogram changes in patients with attention-deficit/hyperactivity disorder. Pediatr. Neonatol. 2024, 65, 581–587. [Google Scholar] [CrossRef] [PubMed]
  119. Cura, O.K.; Akan, A.; Atli, S.K. Detection of Attention Deficit Hyperactivity Disorder based on EEG feature maps and deep learning. Biocybern. Biomed. Eng. 2024, 44, 450–460. [Google Scholar] [CrossRef]
  120. Lu, Y.; Wang, M.; Zhang, Q.; Han, Y. Identification of auditory object-specific attention from single-trial electroencephalogram signals via entropy measures and machine learning. Entropy 2018, 20, 386. [Google Scholar] [PubMed]
  121. Canyurt, C.; Zengin, R. Epileptic activity detection using mean value, RMS, sample entropy, and permutation entropy methods. J. Cogn. Syst. 2023, 8, 16–27. [Google Scholar] [CrossRef]
  122. Angulo-Ruiz, B.Y.; Munoz, V.; Rodríguez-Martínez, E.I.; Cabello-Navarro, C.; Gomez, C.M. Multiscale entropy of ADHD children during resting state condition. Cogn. Neurodyn. 2023, 17, 869–891. [Google Scholar] [CrossRef] [PubMed]
  123. Geirnaert, S.; Francart, T.; Bertrand, A. Fast EEG-based decoding of the directional focus of auditory attention using common spatial patterns. IEEE Trans. Biomed. Eng. 2020, 68, 1557–1568. [Google Scholar] [CrossRef] [PubMed]
  124. Cai, S.; Su, E.; Song, Y.; Xie, L.; Li, H. Low Latency Auditory Attention Detection with Common Spatial Pattern Analysis of EEG Signals. In Proceedings of the INTERSPEECH, Shanghai, China, 25–29 October 2020; pp. 2772–2776. [Google Scholar]
  125. Niu, Y.; Chen, N.; Zhu, H.; Jin, J.; Li, G. Music-oriented auditory attention detection from electroencephalogram. Neurosci. Lett. 2024, 818, 137534. [Google Scholar] [CrossRef]
  126. Wang, Y.; He, H. Electroencephalogram emotion recognition based on manifold geomorphological features in Riemannian space. IEEE Intell. Syst. 2024, 39, 23–36. [Google Scholar] [CrossRef]
  127. Xu, G.; Wang, Z.; Zhao, X.; Li, R.; Zhou, T.; Xu, T.; Hu, H. Attentional state classification using amplitude and phase feature extraction method based on filter bank and Riemannian manifold. IEEE Trans. Neural Syst. Rehabil. Eng. 2023, 31, 4402–4412. [Google Scholar] [CrossRef]
  128. Anuragi, A.; Sisodia, D.S.; Pachori, R.B. Mitigating the curse of dimensionality using feature projection techniques on electroencephalography datasets: An empirical review. Artif. Intell. Rev. 2024, 57, 75. [Google Scholar] [CrossRef]
  129. Li, Y.; Li, T.; Liu, H. Recent advances in feature selection and its applications. Knowl. Inf. Syst. 2017, 53, 551–577. [Google Scholar] [CrossRef]
130. Hu, B.; Li, X.; Sun, S.; Ratcliffe, M. Attention recognition in EEG-based affective learning research using CFS+KNN algorithm. IEEE/ACM Trans. Comput. Biol. Bioinform. 2016, 15, 38–45. [Google Scholar] [CrossRef]
  131. Liang, Z.; Wang, X.; Zhao, J.; Li, X. Comparative study of attention-related features on attention monitoring systems with a single EEG channel. J. Neurosci. Methods 2022, 382, 109711. [Google Scholar] [CrossRef]
  132. Kaongoen, N.; Choi, J.; Jo, S. Speech-imagery-based brain–computer interface system using ear-EEG. J. Neural Eng. 2021, 18, 016023. [Google Scholar] [CrossRef]
  133. Dias, N.S.; Kamrunnahar, M.; Mendes, P.M.; Schiff, S.J.; Correia, J.H. Feature selection on movement imagery discrimination and attention detection. Med. Biol. Eng. Comput. 2010, 48, 331–341. [Google Scholar] [CrossRef]
  134. McCann, M.T.; Thompson, D.E.; Syed, Z.H.; Huggins, J.E. Electrode subset selection methods for an EEG-based P300 brain-computer interface. Disabil. Rehabil. Assist. Technol. 2015, 10, 216–220. [Google Scholar] [CrossRef] [PubMed]
  135. Huang, Y.; Zhang, Z.; Yang, Y.; Mo, P.C.; Zhang, Z.; He, J.; Hu, S.; Wang, X.; Li, Y. Exploring Skin Potential Signals in Electrodermal Activity: Identifying Key Features for Attention State Differentiation. IEEE Access 2024, 12, 100832–100847. [Google Scholar] [CrossRef]
136. Alirezaei, M.; Sardouie, S.H. Detection of human attention using EEG signals. In Proceedings of the 2017 24th National and 2nd International Iranian Conference on Biomedical Engineering (ICBME), Tehran, Iran, 30 November–1 December 2017; IEEE: Piscataway, NJ, USA, 2017; pp. 1–5. [Google Scholar]
  137. Chen, C.M.; Wang, J.Y.; Yu, C.M. Assessing the attention levels of students by using a novel attention aware system based on brainwave signals. Br. J. Educ. Technol. 2017, 48, 348–369. [Google Scholar] [CrossRef]
  138. Zheng, W.; Chen, S.; Fu, Z.; Zhu, F.; Yan, H.; Yang, J. Feature selection boosted by unselected features. IEEE Trans. Neural Netw. Learn. Syst. 2021, 33, 4562–4574. [Google Scholar] [CrossRef]
  139. Alickovic, E.; Lunner, T.; Gustafsson, F.; Ljung, L. A tutorial on auditory attention identification methods. Front. Neurosci. 2019, 13, 153. [Google Scholar] [CrossRef]
  140. Moon, J.; Kwon, Y.; Park, J.; Yoon, W.C. Detecting user attention to video segments using interval EEG features. Expert Syst. Appl. 2019, 115, 578–592. [Google Scholar] [CrossRef]
  141. Maniruzzaman, M.; Hasan, M.A.M.; Asai, N.; Shin, J. Optimal channels and features selection based ADHD detection from EEG signal using statistical and machine learning techniques. IEEE Access 2023, 11, 33570–33583. [Google Scholar] [CrossRef]
  142. Huang, Z.; Cheng, L.; Liu, Y. Key feature extraction method of electroencephalogram signal by independent component analysis for athlete selection and training. Comput. Intell. Neurosci. 2022, 2022, 6752067. [Google Scholar] [CrossRef]
143. Zhou, S.; Gao, T. Brain activity recognition method based on attention-based RNN mode. Appl. Sci. 2021, 11, 10425. [Google Scholar] [CrossRef]
  144. Jin, C.Y.; Borst, J.P.; Van Vugt, M.K. Predicting task-general mind-wandering with EEG. Cogn. Affect. Behav. Neurosci. 2019, 19, 1059–1073. [Google Scholar] [CrossRef]
  145. Peng, C.J.; Chen, Y.C.; Chen, C.C.; Chen, S.J.; Cagneau, B.; Chassagne, L. An EEG-based attentiveness recognition system using Hilbert–Huang transform and support vector machine. J. Med. Biol. Eng. 2020, 40, 230–238. [Google Scholar] [CrossRef]
  146. Chen, X.; Bao, X.; Jitian, K.; Li, R.; Zhu, L.; Kong, W. Hybrid EEG Feature Learning Method for Cross-Session Human Mental Attention State Classification. Brain Sci. 2025, 15, 805. [Google Scholar] [CrossRef] [PubMed]
147. Sahu, P.K.; Jain, K. Sustained attention detection in humans using a prefrontal theta-EEG rhythm. Cogn. Neurodyn. 2024, 18, 2675–2687. [Google Scholar] [CrossRef] [PubMed]
  148. Esqueda-Elizondo, J.J.; Juárez-Ramírez, R.; López-Bonilla, O.R.; García-Guerrero, E.E.; Galindo-Aldana, G.M.; Jiménez-Beristáin, L.; Serrano-Trujillo, A.; Tlelo-Cuautle, E.; Inzunza-González, E. Attention measurement of an autism spectrum disorder user using EEG signals: A case study. Math. Comput. Appl. 2022, 27, 21. [Google Scholar] [CrossRef]
  149. de Brito Guerra, T.C.; Nóbrega, T.; Morya, E.; de M. Martins, A.; de Sousa, V.A., Jr. Electroencephalography signal analysis for human activities classification: A solution based on machine learning and motor imagery. Sensors 2023, 23, 4277. [Google Scholar] [CrossRef] [PubMed]
  150. Demidova, L.; Klyueva, I.; Pylkin, A. Hybrid approach to improving the results of the SVM classification using the random forest algorithm. Procedia Comput. Sci. 2019, 150, 455–461. [Google Scholar] [CrossRef]
151. Tibrewal, N.; Leeuwis, N.; Alimardani, M. The promise of deep learning for BCIs: Classification of motor imagery EEG using convolutional neural network. bioRxiv 2021. [Google Scholar] [CrossRef]
  152. Craik, A.; He, Y.; Contreras-Vidal, J.L. Deep learning for electroencephalogram (EEG) classification tasks: A review. J. Neural Eng. 2019, 16, 031001. [Google Scholar] [CrossRef]
153. Toa, C.K.; Sim, K.S.; Tan, S.C. Emotiv insight with convolutional neural network: Visual attention test classification. In Proceedings of the International Conference on Computational Collective Intelligence, Ho Chi Minh City, Vietnam, 12–15 November 2021; Springer: Berlin/Heidelberg, Germany, 2021; pp. 348–357. [Google Scholar]
  154. Vandecappelle, S.; Deckers, L.; Das, N.; Ansari, A.H.; Bertrand, A.; Francart, T. EEG-based detection of the locus of auditory attention with convolutional neural networks. eLife 2021, 10, e56481. [Google Scholar] [CrossRef]
  155. Wang, Y.; Shi, Y.; Du, J.; Lin, Y.; Wang, Q. A CNN-based personalized system for attention detection in wayfinding tasks. Adv. Eng. Inform. 2020, 46, 101180. [Google Scholar] [CrossRef]
  156. Geravanchizadeh, M.; Roushan, H. Dynamic selective auditory attention detection using RNN and reinforcement learning. Sci. Rep. 2021, 11, 15497. [Google Scholar] [CrossRef]
  157. Lu, Y.; Wang, M.; Yao, L.; Shen, H.; Wu, W.; Zhang, Q.; Zhang, L.; Chen, M.; Liu, H.; Peng, R.; et al. Auditory attention decoding from electroencephalography based on long short-term memory networks. Biomed. Signal Process. Control 2021, 70, 102966. [Google Scholar] [CrossRef]
158. Lee, Y.E.; Lee, S.H. EEG-transformer: Self-attention from transformer architecture for decoding EEG of imagined speech. In Proceedings of the 2022 10th International Winter Conference on Brain-Computer Interface (BCI), Gangwon-do, Republic of Korea, 21–23 February 2022; IEEE: Piscataway, NJ, USA, 2022; pp. 1–4. [Google Scholar]
  159. Xu, Y.; Du, Y.; Li, L.; Lai, H.; Zou, J.; Zhou, T.; Xiao, L.; Liu, L.; Ma, P. AMDET: Attention based multiple dimensions EEG transformer for emotion recognition. IEEE Trans. Affect. Comput. 2023, 15, 1067–1077. [Google Scholar] [CrossRef]
  160. Xu, Z.; Bai, Y.; Zhao, R.; Hu, H.; Ni, G.; Ming, D. Decoding selective auditory attention with EEG using a transformer model. Methods 2022, 204, 410–417. [Google Scholar] [CrossRef]
  161. Ding, Y.; Lee, J.H.; Zhang, S.; Luo, T.; Guan, C. Decoding Human Attentive States from Spatial-temporal EEG Patches Using Transformers. arXiv 2025, arXiv:2502.03736. [Google Scholar] [CrossRef]
162. Zhang, Y.; Zhang, Y.; Wang, S. An attention-based hybrid deep learning model for EEG emotion recognition. Signal Image Video Process. 2023, 17, 2305–2313. [Google Scholar] [CrossRef]
  163. Geravanchizadeh, M.; Shaygan Asl, A.; Danishvar, S. Selective Auditory Attention Detection Using Combined Transformer and Convolutional Graph Neural Networks. Bioengineering 2024, 11, 1216. [Google Scholar] [CrossRef]
  164. Zhao, X.; Lu, J.; Zhao, J.; Yuan, Z. Single-Channel EEG Classification of Human Attention with Two-Branch Multiscale CNN and Transformer Model. In Proceedings of the 2024 International Joint Conference on Neural Networks (IJCNN), Yokohama, Japan, 30 June–5 July 2024; IEEE: Piscataway, NJ, USA, 2024; pp. 1–8. [Google Scholar]
  165. Xue, Y.; Wu, Y.; Zhang, S.; Feng, J. Attention Recognition Based on EEG Using Multi-Model Classification. In Proceedings of the 2025 4th International Conference on Electronics, Integrated Circuits and Communication Technology (EICCT), Chengdu, China, 11–13 July 2025; IEEE: Piscataway, NJ, USA, 2025; pp. 459–463. [Google Scholar]
  166. EskandariNasab, M.; Raeisi, Z.; Lashaki, R.A.; Najafi, H. A GRU–CNN model for auditory attention detection using microstate and recurrence quantification analysis. Sci. Rep. 2024, 14, 8861. [Google Scholar] [CrossRef] [PubMed]
  167. Li, C.; Huang, X.; Song, R.; Qian, R.; Liu, X.; Chen, X. EEG-based seizure prediction via Transformer guided CNN. Measurement 2022, 203, 111948. [Google Scholar] [CrossRef]
  168. Li, L.; Fan, C.; Zhang, H.; Zhang, J.; Yang, X.; Zhou, J.; Lv, Z. MHANet: Multi-scale Hybrid Attention Network for Auditory Attention Detection. arXiv 2025, arXiv:2505.15364. [Google Scholar] [CrossRef]
  169. Das, N.; Francart, T.; Bertrand, A. Auditory Attention Detection Dataset KULeuven; Zenodo: Geneva, Switzerland, 2019. [Google Scholar]
  170. Fuglsang, S.A.; Wong, D.; Hjortkjær, J. EEG and Audio Dataset for Auditory Attention Decoding; Zenodo: Geneva, Switzerland, 2018. [Google Scholar]
  171. Fu, Z.; Wu, X.; Chen, J. Congruent audiovisual speech enhances auditory attention decoding with EEG. J. Neural Eng. 2019, 16, 066033. [Google Scholar] [CrossRef]
  172. Fan, C.; Zhang, J.; Zhang, H.; Xiang, W.; Tao, J.; Li, X.; Yi, J.; Sui, D.; Lv, Z. MSFNet: Multi-scale fusion network for brain-controlled speaker extraction. In Proceedings of the 32nd ACM International Conference on Multimedia, Melbourne, Australia, 28 October–1 November 2024; pp. 1652–1661. [Google Scholar]
  173. Broderick, M.P.; Anderson, A.J.; Di Liberto, G.M.; Crosse, M.J.; Lalor, E.C. Electrophysiological correlates of semantic dissimilarity reflect the comprehension of natural, narrative speech. Curr. Biol. 2018, 28, 803–809. [Google Scholar] [CrossRef]
174. Nguyen, N.D.T.; Phan, H.; Geirnaert, S.; Mikkelsen, K.; Kidmose, P. AADNet: An end-to-end deep learning model for auditory attention decoding. IEEE Trans. Neural Syst. Rehabil. Eng. 2025, 33, 2695–2706. [Google Scholar] [CrossRef] [PubMed]
  175. Zhang, Y.; Cai, H.; Nie, L.; Xu, P.; Zhao, S.; Guan, C. An end-to-end 3D convolutional neural network for decoding attentive mental state. Neural Netw. 2021, 144, 129–137. [Google Scholar] [CrossRef] [PubMed]
  176. Geravanchizadeh, M.; Gavgani, S.B. Selective auditory attention detection based on effective connectivity by single-trial EEG. J. Neural Eng. 2020, 17, 026021. [Google Scholar] [CrossRef] [PubMed]
  177. Reichert, C.; Tellez Ceja, I.F.; Sweeney-Reed, C.M.; Heinze, H.J.; Hinrichs, H.; Dürschmid, S. Impact of stimulus features on the performance of a gaze-independent brain-computer interface based on covert spatial attention shifts. Front. Neurosci. 2020, 14, 591777. [Google Scholar] [CrossRef]
  178. Delvigne, V.; Wannous, H.; Dutoit, T.; Ris, L.; Vandeborre, J.P. PhyDAA: Physiological dataset assessing attention. IEEE Trans. Circuits Syst. Video Technol. 2021, 32, 2612–2623. [Google Scholar] [CrossRef]
  179. Jaeger, M.; Mirkovic, B.; Bleichner, M.G.; Debener, S. Decoding the attended speaker from EEG using adaptive evaluation intervals captures fluctuations in attentional listening. Front. Neurosci. 2020, 14, 603. [Google Scholar] [CrossRef]
  180. Torkamani-Azar, M.; Kanik, S.D.; Aydin, S.; Cetin, M. Prediction of reaction time and vigilance variability from spatio-spectral features of resting-state EEG in a long sustained attention task. IEEE J. Biomed. Health Inform. 2020, 24, 2550–2558. [Google Scholar] [CrossRef]
  181. Liu, H.; Zhang, Y.; Liu, G.; Du, X.; Wang, H.; Zhang, D. A Multi-Label EEG Dataset for Mental Attention State Classification in Online Learning. In Proceedings of the ICASSP 2025—2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Hyderabad, India, 6–11 April 2025; IEEE: Piscataway, NJ, USA, 2025; pp. 1–5. [Google Scholar]
  182. Poulsen, A.T.; Kamronn, S.; Dmochowski, J.; Parra, L.C.; Hansen, L.K. EEG in the classroom: Synchronised neural recordings during video presentation. Sci. Rep. 2017, 7, 43916. [Google Scholar] [CrossRef] [PubMed]
  183. Zhang, Z.X.; Xu, S.Q.; Zhou, E.N.; Huang, X.L.; Wang, J. Research on attention EEG based on Jensen-Shannon Divergence. Adv. Mater. Res. 2014, 884, 512–515. [Google Scholar] [CrossRef]
  184. Schupp, H.T.; Flaisch, T.; Stockburger, J.; Junghöfer, M. Emotion and attention: Event-related brain potential studies. Prog. Brain Res. 2006, 156, 31–51. [Google Scholar] [PubMed]
  185. Tan, C.; Zhou, H.; Zheng, A.; Yang, M.; Li, C.; Yang, T.; Chen, J.; Zhang, J.; Li, T. P300 event-related potentials as diagnostic biomarkers for attention deficit hyperactivity disorder in children. Front. Psychiatry 2025, 16, 1590850. [Google Scholar] [CrossRef]
  186. Datta, A.; Cusack, R.; Hawkins, K.; Heutink, J.; Rorden, C.; Robertson, I.H.; Manly, T. The P300 as a marker of waning attention and error propensity. Comput. Intell. Neurosci. 2007, 2007, 093968. [Google Scholar] [CrossRef]
  187. Tao, M.; Sun, J.; Liu, S.; Zhu, Y.; Ren, Y.; Liu, Z.; Wang, X.; Yang, W.; Li, G.; Wang, X.; et al. An event-related potential study of P300 in preschool children with attention deficit hyperactivity disorder. Front. Pediatr. 2024, 12, 1461921. [Google Scholar] [CrossRef]
  188. Wang, Z.; Ding, Y.; Yuan, W.; Chen, H.; Chen, W.; Chen, C. Active claw-shaped dry electrodes for EEG measurement in hair areas. Bioengineering 2024, 11, 276. [Google Scholar] [CrossRef] [PubMed]
  189. Zhu, Y.; Bayin, C.; Li, H.; Shu, X.; Deng, J.; Yuan, H.; Shen, H.; Liang, Z.; Li, Y. A flexible, stable, semi-dry electrode with low impedance for electroencephalography recording. RSC Adv. 2024, 14, 34415–34427. [Google Scholar] [CrossRef]
  190. Wang, T.; Yao, S.; Shao, L.H.; Zhu, Y. Stretchable Ag/AgCl Nanowire Dry Electrodes for High-Quality Multimodal Bioelectronic Sensing. Sensors 2024, 24, 6670. [Google Scholar] [CrossRef]
  191. Kaveh, R.; Doong, J.; Zhou, A.; Schwendeman, C.; Gopalan, K.; Burghardt, F.L.; Arias, A.C.; Maharbiz, M.M.; Muller, R. Wireless user-generic ear EEG. IEEE Trans. Biomed. Circuits Syst. 2020, 14, 727–737. [Google Scholar] [CrossRef]
  192. Sun, W.; Su, Y.; Wu, X.; Wu, X. A novel end-to-end 1D-ResCNN model to remove artifact from EEG signals. Neurocomputing 2020, 404, 108–121. [Google Scholar] [CrossRef]
  193. Mahmud, S.; Hossain, M.S.; Chowdhury, M.E.; Reaz, M.B.I. MLMRS-Net: Electroencephalography (EEG) motion artifacts removal using a multi-layer multi-resolution spatially pooled 1D signal reconstruction network. Neural Comput. Appl. 2023, 35, 8371–8388. [Google Scholar] [CrossRef]
  194. Zhang, H.; Wei, C.; Zhao, M.; Liu, Q.; Wu, H. A novel convolutional neural network model to remove muscle artifacts from EEG. In Proceedings of the ICASSP 2021—2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Toronto, ON, Canada, 6–11 June 2021; IEEE: Piscataway, NJ, USA, 2021; pp. 1265–1269. [Google Scholar]
  195. Gao, T.; Chen, D.; Tang, Y.; Ming, Z.; Li, X. EEG reconstruction with a dual-scale CNN-LSTM model for deep artifact removal. IEEE J. Biomed. Health Inform. 2022, 27, 1283–1294. [Google Scholar] [CrossRef]
196. Erfanian, A.; Mahmoudi, B. Real-time ocular artifact suppression using recurrent neural network for electroencephalogram-based brain-computer interface. Med. Biol. Eng. Comput. 2005, 43, 296–305. [Google Scholar] [CrossRef]
  197. Liu, Y.; Höllerer, T.; Sra, M. SRI-EEG: State-based recurrent imputation for EEG artifact correction. Front. Comput. Neurosci. 2022, 16, 803384. [Google Scholar] [CrossRef]
  198. Jiang, R.; Tong, S.; Wu, J.; Hu, H.; Zhang, R.; Wang, H.; Zhao, Y.; Zhu, W.; Li, S.; Zhang, X. A novel EEG artifact removal algorithm based on an advanced attention mechanism. Sci. Rep. 2025, 15, 19419. [Google Scholar] [CrossRef]
  199. Wang, S.; Luo, Y.; Shen, H. An improved Generative Adversarial Network for Denoising EEG signals of brain-computer interface systems. In Proceedings of the 2022 China Automation Congress (CAC), Xiamen, China, 25–27 November 2022; IEEE: Piscataway, NJ, USA, 2022; pp. 6498–6502. [Google Scholar]
  200. Tibermacine, I.E.; Russo, S.; Citeroni, F.; Mancini, G.; Rabehi, A.; Alharbi, A.H.; El-Kenawy, E.S.M.; Napoli, C. Adversarial denoising of EEG signals: A comparative analysis of standard GAN and WGAN-GP approaches. Front. Hum. Neurosci. 2025, 19, 1583342. [Google Scholar] [CrossRef]
  201. Dong, Y.; Tang, X.; Li, Q.; Wang, Y.; Jiang, N.; Tian, L.; Zheng, Y.; Li, X.; Zhao, S.; Li, G.; et al. An approach for EEG denoising based on wasserstein generative adversarial network. IEEE Trans. Neural Syst. Rehabil. Eng. 2023, 31, 3524–3534. [Google Scholar] [CrossRef]
  202. Nguyen, H.A.T.; Le, T.H.; Bui, T.D. A deep wavelet sparse autoencoder method for online and automatic electrooculographical artifact removal. Neural Comput. Appl. 2020, 32, 18255–18270. [Google Scholar] [CrossRef]
  203. Acharjee, R.; Ahamed, S.R. Automatic Eyeblink artifact removal from Single Channel EEG signals using one-dimensional convolutional Denoising autoencoder. In Proceedings of the 2024 International Conference on Computer, Electrical & Communication Engineering (ICCECE), Kolkata, India, 2–3 February 2024; IEEE: Piscataway, NJ, USA, 2024; pp. 1–7. [Google Scholar]
  204. Nagar, S.; Kumar, A. Orthogonal features based EEG signals denoising using fractional and compressed one-dimensional CNN AutoEncoder. IEEE Trans. Neural Syst. Rehabil. Eng. 2022, 30, 2474–2485. [Google Scholar] [CrossRef] [PubMed]
  205. Pu, X.; Yi, P.; Chen, K.; Ma, Z.; Zhao, D.; Ren, Y. EEGDnet: Fusing non-local and local self-similarity for EEG signal denoising with transformer. Comput. Biol. Med. 2022, 151, 106248. [Google Scholar] [CrossRef]
  206. Chen, J.; Pi, D.; Jiang, X.; Xu, Y.; Chen, Y.; Wang, X. Denosieformer: A transformer-based approach for single-channel EEG artifact removal. IEEE Trans. Instrum. Meas. 2023, 73, 2501116. [Google Scholar] [CrossRef]
207. Chuang, C.H.; Chang, K.Y.; Huang, C.S.; Bessas, A.M. ART: Artifact removal transformer for reconstructing noise-free multichannel electroencephalographic signals. arXiv 2024, arXiv:2409.07326. [Google Scholar] [CrossRef]
  208. Yin, J.; Liu, A.; Li, C.; Qian, R.; Chen, X. A GAN guided parallel CNN and transformer network for EEG denoising. IEEE J. Biomed. Health Inform. 2023, 29, 3930–3941. [Google Scholar] [CrossRef]
  209. Cai, Y.; Meng, Z.; Huang, D. DHCT-GAN: Improving EEG signal quality with a dual-branch hybrid CNN–transformer network. Sensors 2025, 25, 231. [Google Scholar] [CrossRef] [PubMed]
  210. Zhu, L.; Lv, J. Review of studies on user research based on EEG and eye tracking. Appl. Sci. 2023, 13, 6502. [Google Scholar] [CrossRef]
  211. Qin, Y.; Yang, J.; Zhang, M.; Zhang, M.; Kuang, J.; Yu, Y.; Zhang, S. Construction of a Quality Evaluation System for University Course Teaching Based on Multimodal Brain Data. Recent Patents Eng. 2025; in press. [Google Scholar]
  212. Song, Y.; Feng, L.; Zhang, W.; Song, X.; Cheng, M. Multimodal Emotion Recognition based on the Fusion of EEG Signals and Eye Movement Data. In Proceedings of the 2024 IEEE 25th China Conference on System Simulation Technology and its Application (CCSSTA), Tianjin, China, 21–23 July 2024; IEEE: Piscataway, NJ, USA, 2024; pp. 127–132. [Google Scholar]
  213. Zhao, M.; Gao, H.; Wang, W.; Qu, J. Research on human-computer interaction intention recognition based on EEG and eye movement. IEEE Access 2020, 8, 145824–145832. [Google Scholar] [CrossRef]
  214. Gong, X.; Dong, Y.; Zhang, T. CoDF-Net: Coordinated-representation decision fusion network for emotion recognition with EEG and eye movement signals. Int. J. Mach. Learn. Cybern. 2024, 15, 1213–1226. [Google Scholar] [CrossRef]
  215. Guo, J.J.; Zhou, R.; Zhao, L.M.; Lu, B.L. Multimodal emotion recognition from eye image, eye movement and EEG using deep neural networks. In Proceedings of the 2019 41st Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), Berlin, Germany, 23–27 July 2019; IEEE: Piscataway, NJ, USA, 2019; pp. 3071–3074. [Google Scholar]
  216. Li, X.; Wei, W.; Zhao, K.; Mao, J.; Lu, Y.; Qiu, S.; He, H. Exploring EEG and eye movement fusion for multi-class target RSVP-BCI. Inf. Fusion 2025, 121, 103135. [Google Scholar] [CrossRef]
  217. Singh, P.; Tripathi, M.K.; Patil, M.B.; Shivendra; Neelakantappa, M. Multimodal emotion recognition model via hybrid model with improved feature level fusion on facial and EEG feature set. Multimed. Tools Appl. 2025, 84, 1–36. [Google Scholar] [CrossRef]
  218. Zhao, Y.; Chen, D. Expression EEG multimodal emotion recognition method based on the bidirectional LSTM and attention mechanism. Comput. Math. Methods Med. 2021, 2021, 9967592. [Google Scholar] [CrossRef] [PubMed]
  219. Wu, Y.; Li, J. Multi-modal emotion identification fusing facial expression and EEG. Multimed. Tools Appl. 2023, 82, 10901–10919. [Google Scholar] [CrossRef]
  220. Li, D.; Wang, Z.; Wang, C.; Liu, S.; Chi, W.; Dong, E.; Song, X.; Gao, Q.; Song, Y. The fusion of electroencephalography and facial expression for continuous emotion recognition. IEEE Access 2019, 7, 155724–155736. [Google Scholar] [CrossRef]
  221. Jin, X.; Xiao, J.; Jin, L.; Zhang, X. Residual multimodal Transformer for expression-EEG fusion continuous emotion recognition. CAAI Trans. Intell. Technol. 2024, 9, 1290–1304. [Google Scholar] [CrossRef]
  222. Rayatdoost, S.; Rudrauf, D.; Soleymani, M. Multimodal gated information fusion for emotion recognition from EEG signals and facial behaviors. In Proceedings of the 2020 International Conference on Multimodal Interaction, Virtual, 25–29 October 2020; pp. 655–659. [Google Scholar]
  223. Liu, S.; Wang, Z.; An, Y.; Li, B.; Wang, X.; Zhang, Y. DA-CapsNet: A multi-branch capsule network based on adversarial domain adaption for cross-subject EEG emotion recognition. Knowl.-Based Syst. 2024, 283, 111137. [Google Scholar] [CrossRef]
  224. Unsworth, N.; Miller, A.L. Individual differences in the intensity and consistency of attention. Curr. Dir. Psychol. Sci. 2021, 30, 391–400. [Google Scholar] [CrossRef]
  225. Matthews, G.; Reinerman-Jones, L.; Abich IV, J.; Kustubayeva, A. Metrics for individual differences in EEG response to cognitive workload: Optimizing performance prediction. Personal. Individ. Differ. 2017, 118, 22–28. [Google Scholar] [CrossRef]
  226. Zhang, B.; Xu, M.; Zhang, Y.; Ye, S.; Chen, Y. Attention-ProNet: A Prototype Network with Hybrid Attention Mechanisms Applied to Zero Calibration in Rapid Serial Visual Presentation-Based Brain–Computer Interface. Bioengineering 2024, 11, 347. [Google Scholar] [CrossRef]
  227. Leblanc, B.; Germain, P. On the Relationship Between Interpretability and Explainability in Machine Learning. arXiv 2023, arXiv:2311.11491. [Google Scholar]
  228. Khare, S.K.; Acharya, U.R. An explainable and interpretable model for attention deficit hyperactivity disorder in children using EEG signals. Comput. Biol. Med. 2023, 155, 106676. [Google Scholar] [CrossRef]
229. Şahin, E.; Arslan, N.N.; Özdemir, D. Unlocking the black box: An in-depth review on interpretability, explainability, and reliability in deep learning. Neural Comput. Appl. 2025, 37, 859–965. [Google Scholar] [CrossRef]
  230. Miao, Z.; Zhao, M.; Zhang, X.; Ming, D. LMDA-Net: A lightweight multi-dimensional attention network for general EEG-based brain-computer interfaces and interpretability. NeuroImage 2023, 276, 120209. [Google Scholar] [CrossRef] [PubMed]
  231. Shawly, T.; Alsheikhy, A.A. Eeg-based detection of epileptic seizures in patients with disabilities using a novel attention-driven deep learning framework with SHAP interpretability. Egypt. Inform. J. 2025, 31, 100734. [Google Scholar] [CrossRef]
  232. Gonzalez, H.A.; Muzaffar, S.; Yoo, J.; Elfadel, I.M. BioCNN: A hardware inference engine for EEG-based emotion detection. IEEE Access 2020, 8, 140896–140914. [Google Scholar] [CrossRef]
  233. Fang, W.C.; Wang, K.Y.; Fahier, N.; Ho, Y.L.; Huang, Y.D. Development and validation of an EEG-based real-time emotion recognition system using edge AI computing platform with convolutional neural network system-on-chip design. IEEE J. Emerg. Sel. Top. Circuits Syst. 2019, 9, 645–657. [Google Scholar] [CrossRef]
  234. Li, J.Y.; Fang, W.C. An edge ai accelerator design based on hdc model for real-time eeg-based emotion recognition system with risc-v fpga platform. In Proceedings of the 2024 IEEE International Symposium on Circuits and Systems (ISCAS), Singapore, 19–22 May 2024; IEEE: Piscataway, NJ, USA, 2024; pp. 1–5. [Google Scholar]
Figure 1. Process framework diagram of EEG attention assessment in classroom.
Figure 2. Hardware setup for EEG signal processing system.
Figure 3. Deep learning methods for EEG attention classification, with CNN as an example.
Figure 4. Schematic diagram of EEG and eye movement multimodal data fusion methods. (a) Feature-level fusion; (b) decision-level fusion; (c) model-level fusion.
Figure 5. A multimodal attention assessment method for classroom teaching evaluation.
Table 1. EEG frequency bands and mental status.
EEG Band | Frequency Range (Hz) | Mental Status
Delta | 0.1–4 | Deep sleep, unconscious.
Theta | 4–8 | Deep relaxation, internal focus, meditation.
Low Alpha | 8–10 | Wakeful relaxation, conscious, good mood, calmness.
High Alpha | 10–12 | Enhanced self-awareness and concentration.
Low Beta | 12–18 | Thinking and focused attention.
High Beta | 18–30 | Cognitive activity, alertness.
Low Gamma | 30–50 | Cognitive processing, self-control.
High Gamma | 50–70 | Engaged in memory, hearing, reading, and speaking.
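The band decomposition in Table 1 can be illustrated with a minimal NumPy sketch that estimates the mean FFT power inside each band for a single epoch. The band edges follow the table (low/high sub-bands merged for brevity); the 10 Hz synthetic signal, sampling rate, and function names are illustrative assumptions, not part of the surveyed methods.

```python
import numpy as np

# Band edges (Hz) follow Table 1; low/high sub-bands are merged for brevity.
BANDS = {"delta": (0.1, 4), "theta": (4, 8), "alpha": (8, 12),
         "beta": (12, 30), "gamma": (30, 70)}

def band_powers(signal, fs):
    """Mean FFT power of `signal` within each EEG band."""
    freqs = np.fft.rfftfreq(signal.size, d=1.0 / fs)
    psd = np.abs(np.fft.rfft(signal)) ** 2 / signal.size
    return {name: psd[(freqs >= lo) & (freqs < hi)].mean()
            for name, (lo, hi) in BANDS.items()}

# Synthetic "wakeful relaxation" epoch: dominant 10 Hz alpha rhythm + noise.
fs = 256
t = np.arange(0, 4, 1.0 / fs)
rng = np.random.default_rng(0)
epoch = np.sin(2 * np.pi * 10 * t) + 0.1 * rng.standard_normal(t.size)
powers = band_powers(epoch, fs)
assert max(powers, key=powers.get) == "alpha"
```

For such an epoch the alpha band dominates, consistent with the wakeful-relaxation rows of the table; in practice Welch averaging over windows is usually preferred to a single FFT.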
Table 2. Commercial EEG signal acquisition devices.
Device | Channels | Resolution | Max. Sample Rate | Interface
BioSemi ActiveTwo system | 280 | 24-bit | 16 kHz | USB
actiCHamp | 160 | 24-bit | 25 kHz | USB
BrainAmp DC | 256 | 16-bit | 5 kHz | USB
Neuroscan EEG | 64 | 24-bit | 20 kHz | USB
DSI-24 | 19 | 16-bit | 300 Hz | Bluetooth/USB
Emotiv Epoc X | 14 | 16-bit | 2048 Hz | Bluetooth/USB
Emotiv FLEX 2 Saline | 32 | 16-bit | 2048 Hz | Bluetooth/USB
MindWave Mobile 2 | 1 | 12-bit | 512 Hz | Bluetooth
Versatile EEG | 32 | 24-bit | 256 Hz | Bluetooth
Li et al. [46] | 32 | 16-bit | 30 kHz | Wi-Fi
Lei et al. [47] | 8 | 24-bit | 16 kHz | Bluetooth
Valentin et al. [48] | 8 | 24-bit | 4 kHz | USB
Liu et al. [35] | 16 | 24-bit | 1000 Hz | Wi-Fi
Liu et al. [35] | 192 | 24-bit | 4000 Hz | Fiber/USB/Wi-Fi
Lin et al. [49] | 64 | 24-bit | 300/512 Hz | Wi-Fi
Table 3. NVIDIA GPUs used in EEG signal processing.
Model | CUDA Cores | Memory | Memory Bandwidth | Reference
GeForce RTX 3090 | 10,496 | 24 GB | 936.2 GB/s | [50]
GeForce RTX 3080 | 8704 | 10 GB | 760.3 GB/s | [51]
GeForce RTX 3080 Ti | 10,240 | 12 GB | 912.4 GB/s | [52]
GeForce RTX 3070 | 5888 | 8 GB | 448.0 GB/s | [53]
GeForce RTX 3070 Ti | 6144 | 8 GB | 608.3 GB/s | [54]
GeForce RTX 3060 | 3584 | 12 GB | 360.0 GB/s | [55]
GeForce RTX 3060 Ti | 4864 | 8 GB | 448.0 GB/s | [56]
GeForce RTX 4090 | 16,384 | 24 GB | 1.01 TB/s | [57]
GeForce GTX 1070 | 1920 | 8 GB | 256.3 GB/s | [58]
GeForce GTX 1070 Ti | 2432 | 8 GB | 256.3 GB/s | [59]
GeForce GTX 1050 | 640 | 2 GB | 112.1 GB/s | [60]
GeForce RTX 2080 Ti | 4352 | 11 GB | 616.0 GB/s | [61]
Table 4. Comparison of artifact removal methods.
Methods | Advantages | Challenges | Scenarios
ICA | Effectively separates mixed signals; unsupervised learning; multi-channel analysis. | High computational complexity; artifacts must be identified manually. | Multi-channel EEG; EOG and EMG artifacts.
WT | Local time–frequency analysis; multi-resolution features; good computational efficiency. | Depends on the choice of wavelet basis; frequency-band aliasing. | Transient artifacts; EOG and EMG artifacts.
EMD | Good adaptability; local feature extraction; nonlinear processing capability. | Mode-mixing problem; signal quality degrades at the ends. | Single-channel EEG; non-stationary, nonlinear artifacts.
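To make the WT row of Table 4 concrete, the sketch below performs single-channel wavelet denoising in plain NumPy, assuming a Haar basis and the universal soft threshold; the synthetic signal and all function names are illustrative. A practical EEG pipeline would normally use a dedicated wavelet library and a basis chosen for the artifact type.

```python
import numpy as np

def haar_dwt(x):
    """One level of the Haar wavelet transform: (approximation, detail)."""
    return (x[0::2] + x[1::2]) / np.sqrt(2), (x[0::2] - x[1::2]) / np.sqrt(2)

def haar_idwt(a, d):
    """Inverse of haar_dwt."""
    x = np.empty(a.size * 2)
    x[0::2] = (a + d) / np.sqrt(2)
    x[1::2] = (a - d) / np.sqrt(2)
    return x

def wavelet_denoise(x, levels=4):
    """Soft-threshold the detail coefficients at every decomposition level."""
    details, a = [], x
    for _ in range(levels):
        a, d = haar_dwt(a)
        details.append(d)
    # Universal threshold; noise level estimated from the finest-scale details.
    sigma = np.median(np.abs(details[0])) / 0.6745
    thr = sigma * np.sqrt(2 * np.log(x.size))
    details = [np.sign(d) * np.maximum(np.abs(d) - thr, 0) for d in details]
    for d in reversed(details):
        a = haar_idwt(a, d)
    return a

rng = np.random.default_rng(0)
n = 1024                                             # divisible by 2**levels
clean = np.sin(2 * np.pi * 5 * np.arange(n) / n)     # slow "EEG-like" rhythm
noisy = clean + 0.5 * rng.standard_normal(n)
denoised = wavelet_denoise(noisy)
```

Because the signal energy concentrates in the coarse approximation while noise spreads across all detail scales, thresholding the details shrinks the noise substantially more than the signal.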
Table 5. Comparison of EEG feature extraction methods.
Methods | Advantages | Challenges
Time-domain features | Simple to implement, with low computational complexity; suitable for short-term stationary signals. | Cannot capture important frequencies or complex time–frequency relationships.
Frequency-domain features | Characterize the signal within specific frequency bands. | Cannot capture the time-varying characteristics of the signal.
Time–frequency-domain features | Capture time and frequency information simultaneously, enabling analysis of non-stationary signals. | High computational complexity; depends on the choice of decomposition algorithm.
Nonlinear features | Extract chaotic signals and complex dynamic features. | High computational complexity; sensitive to parameters.
Spatial-domain features | Suitable for multi-channel EEG data and analysis of brain-region interactions. | Depend on sensor layout; generalization ability may be limited.
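As a hedged illustration of the time-domain and frequency-domain rows of Table 5, the sketch below computes the classic Hjorth parameters and a relative band-power feature for one epoch; the sampling rate, test signal, and function names are assumptions for the example only.

```python
import numpy as np

def hjorth(x):
    """Time-domain Hjorth parameters: activity, mobility, complexity."""
    dx, ddx = np.diff(x), np.diff(x, n=2)
    activity = np.var(x)
    mobility = np.sqrt(np.var(dx) / activity)
    complexity = np.sqrt(np.var(ddx) / np.var(dx)) / mobility
    return activity, mobility, complexity

def relative_band_power(x, fs, lo, hi):
    """Frequency-domain feature: fraction of total power within [lo, hi) Hz."""
    freqs = np.fft.rfftfreq(x.size, 1.0 / fs)
    psd = np.abs(np.fft.rfft(x)) ** 2
    return psd[(freqs >= lo) & (freqs < hi)].sum() / psd.sum()

fs = 128
t = np.arange(0, 4, 1.0 / fs)
x = np.sin(2 * np.pi * 10 * t)                  # a pure 10 Hz "alpha" epoch
activity, mobility, complexity = hjorth(x)
alpha_ratio = relative_band_power(x, fs, 8, 12)  # close to 1.0 for this epoch
```

Stacking such per-epoch values into a vector is the usual input to the classifiers compared in Tables 6 and 7.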
Table 6. Comparison of the traditional machine learning methods for classification.
Methods | Advantages | Challenges
SVM | Handles high-dimensional data effectively and usually provides high classification accuracy. | Slow to train on large datasets; sensitive to the choice of kernel function.
KNN | Simple to implement and understand; requires no training; performs well on low-dimensional data and small sample sets. | Sensitive to sample size and dimensionality, since computational complexity grows with data size; the k value must be chosen to suit the data structure.
Random forest | Handles a large number of features; robust to noise; usually good classification performance and strong resistance to overfitting. | Complex model that requires more computing resources; not well suited to time-series features.
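A minimal from-scratch KNN, one of the methods in Table 6, can be sketched in a few lines of NumPy. The two-dimensional feature clusters (e.g., theta and beta band power for "inattentive" vs. "attentive" epochs), their centers, and the 80/20 split are hypothetical illustration data, not taken from any cited study.

```python
import numpy as np

def knn_predict(train_X, train_y, X, k=5):
    """Majority vote among the k nearest training samples (Euclidean)."""
    preds = []
    for x in X:
        idx = np.argsort(np.linalg.norm(train_X - x, axis=1))[:k]
        preds.append(np.bincount(train_y[idx]).argmax())
    return np.array(preds)

rng = np.random.default_rng(1)
# Hypothetical 2-D feature vectors: 0 = inattentive, 1 = attentive.
X0 = rng.normal([1.0, 0.2], 0.15, size=(50, 2))
X1 = rng.normal([0.4, 0.8], 0.15, size=(50, 2))
train_X = np.vstack([X0[:40], X1[:40]])
train_y = np.array([0] * 40 + [1] * 40)
test_X = np.vstack([X0[40:], X1[40:]])
test_y = np.array([0] * 10 + [1] * 10)
acc = (knn_predict(train_X, train_y, test_X) == test_y).mean()
```

On well-separated clusters like these the vote is nearly perfect; the table's caveat about dimensionality and sample size appears as soon as the feature space grows.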
Table 7. Comparison of deep learning methods for classification.
Methods | Advantages | Challenges
CNN | Good at capturing spatial features; high computational efficiency; parallelizable training. | Poorly suited to the temporal dynamics of time-series signals; training requires large amounts of data.
RNN | Good at capturing temporal dynamics and sequential relationships; suitable for continuous EEG signals. | Vanishing-gradient problem; long training time; poor parallelism.
Transformer | Suitable for long time-series analysis; can identify global features; high parallel-processing efficiency. | High computational complexity, requiring substantial computing resources; performance degrades on small datasets.
Hybrid models | Combine the advantages of multiple models; more adaptable. | Complex structure; training and parameter tuning are difficult; usually require more computing resources.
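The CNN pipeline of Figure 3 and Table 7 can be sketched as a toy forward pass in plain NumPy: temporal convolution over a multi-channel epoch, ReLU, global average pooling, and a softmax readout. All shapes and weights below are illustrative assumptions; this is not any published architecture or a training procedure.

```python
import numpy as np

def conv1d_relu(x, kernels):
    """Valid 1-D convolution over (channels, samples) followed by ReLU."""
    n_f, _, k = kernels.shape
    out = np.zeros((n_f, x.shape[1] - k + 1))
    for f in range(n_f):
        for i in range(out.shape[1]):
            out[f, i] = np.sum(x[:, i:i + k] * kernels[f])
    return np.maximum(out, 0)

def forward(x, kernels, w, b):
    """Conv -> ReLU -> global average pooling -> linear readout -> softmax."""
    h = conv1d_relu(x, kernels).mean(axis=1)   # one value per temporal filter
    logits = h @ w + b
    e = np.exp(logits - logits.max())
    return e / e.sum()

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 256))                # 8-channel EEG epoch
kernels = rng.standard_normal((4, 8, 16)) * 0.1  # 4 temporal filters
w, b = rng.standard_normal((4, 2)) * 0.1, np.zeros(2)
p = forward(x, kernels, w, b)                    # attentive/inattentive probs
```

Each kernel spans all channels and a short time window, which is why CNNs capture spatial structure well but, as the table notes, need recurrent or attention layers to model longer temporal dynamics.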
Table 8. Datasets used in EEG attention assessment.
Datasets | Sampling Rate (Hz) | No. of Subjects | No. of Channels | Stimuli
KUL [169] | 8192 | 16 | 64 | Audio
DTU [170] | 512 | 18 | 64 | Audio
PKU [171] | 500 | 16 | 64 | Audio
AVED [172] | 1000 | 20 | 32 | Audio & Visual
Cocktail Party [173] | 512 | 33 | 128 | Audio
Das et al. [42] | 8192 | 28 | 64 | Audio
EventAAD [174] | 1000 | 24 | 32 | Audio
Zhang et al. [175] | 1000 | 30 | 28 | Visual
Geravanchizadeh et al. [176] | 512 | 40 | 128 | Audio
Reichert et al. [177] | 250 | 18 | 14 | Visual
Ciccarelli et al. [74] | 1000 | 11 | 64 | Audio
Delvigne et al. [178] | 500 | 11 | 32 | VR headsets
Jaeger et al. [179] | 500 | 21 | 94 | Audio
Torkamani et al. [180] | N/A | 10 | 64 | Visual
Liu et al. [181] | 500 | 20 | 32 | Visual
Ko et al. [73] | 1000 | 18 | 32 | Visual
Table 9. Comparison of deep learning methods in removing noise and artifacts.
Methods | Advantages | Challenges | Scenarios
CNN | Automatically learns spatial features; robust to structured noise; highly computationally efficient. | Weak at modeling temporal dependencies; may require large amounts of data; limited effect on high-frequency noise. | EMG and EOG artifacts; high-frequency noise/artifacts; noise caused by poor electrode contact.
RNN | Good at temporal-sequence modeling and handling variable-length sequences. | Slow training; vanishing or exploding gradients; high computing-resource requirements. | ECG artifacts, EOG artifacts, and long-term artifacts.
GAN | Learns complex distributions; suitable for non-stationary signals. | Unstable training that requires careful tuning; high computational cost. | Mixed artifacts (e.g., EMG + EOG) and complex physiological noise.
Autoencoder | Unsupervised learning; removes redundant information; simple structure that is easy to train. | Limited effect on complex noise; useful information may be lost. | EMG and EOG artifacts.
Transformer | Captures global spatial–temporal dependencies and transient dynamic features; high denoising performance. | Requires large amounts of data; high model complexity and computing-resource consumption. | Global artifacts (e.g., EOG, ECG) and complex time-dependent noise.
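The autoencoder row of Table 9 can be approximated with a linear, SVD-based stand-in: a tied-weight linear autoencoder learns the same subspace as PCA, so projecting repeated noisy trials onto their top principal components acts as a crude denoiser. This is a hedged sketch only; the cited denoising autoencoders are nonlinear and trained end-to-end, and the trial data below is synthetic.

```python
import numpy as np

def lowrank_denoise(epochs, rank=1):
    """Project mean-centred epochs onto their top `rank` principal
    components, i.e., the subspace a tied-weight linear autoencoder learns."""
    mean = epochs.mean(axis=0)
    _, _, Vt = np.linalg.svd(epochs - mean, full_matrices=False)
    basis = Vt[:rank]                      # learned "decoder" rows
    return (epochs - mean) @ basis.T @ basis + mean

rng = np.random.default_rng(2)
t = np.linspace(0, 1, 200)
clean = np.sin(2 * np.pi * 6 * t)                      # shared source signal
epochs = clean + 0.5 * rng.standard_normal((40, 200))  # 40 noisy trials
den = lowrank_denoise(epochs, rank=1)
```

Because the shared waveform is captured by the trial mean and the low-rank subspace while independent noise is spread thinly across all components, the reconstruction error against the clean source drops well below that of the raw trials.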
Table 10. Comparison of two multimodal fusion methods.
Methods | Advantages | Challenges
EEG + Eye movement | Objectively accurate and difficult to disguise; facilitates real-time assessment; strong anti-interference ability. | High-precision eye-tracking devices may be required; data synchronization is complex; eye-movement data lacks features for emotion recognition.
EEG + Facial expression | Strong emotional connection; low deployment cost; suitable for large-class teaching. | Facial expressions are easy to disguise; poor real-time performance; sensitive to the environment.
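The feature-level and decision-level fusion schemes of Figure 4 reduce to two short operations: concatenating modality features before a joint classifier, versus combining each modality's class probabilities afterwards. The sketch below shows both in NumPy; the feature dimensions, weights, and probability values are hypothetical placeholders, not parameters from any cited system.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def feature_level_fusion(eeg_feat, eye_feat, w, b):
    """(a) Concatenate modality features, then classify jointly."""
    fused = np.concatenate([eeg_feat, eye_feat], axis=-1)
    return softmax(fused @ w + b)

def decision_level_fusion(p_eeg, p_eye, alpha=0.5):
    """(b) Weighted average of per-modality class probabilities."""
    return alpha * p_eeg + (1 - alpha) * p_eye

rng = np.random.default_rng(3)
eeg_feat = rng.standard_normal(6)    # hypothetical EEG feature vector
eye_feat = rng.standard_normal(3)    # hypothetical eye-movement features
w, b = rng.standard_normal((9, 2)) * 0.1, np.zeros(2)
p_fused = feature_level_fusion(eeg_feat, eye_feat, w, b)
p_vote = decision_level_fusion(np.array([0.7, 0.3]), np.array([0.4, 0.6]))
```

Model-level fusion, the third scheme in Figure 4, instead shares intermediate representations inside one network and has no comparably compact closed form.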
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Wei, L.; Yu, Y.; Qin, Y.; Zhang, S. A Survey of EEG-Based Approaches to Classroom Attention Assessment in Education. Information 2025, 16, 860. https://doi.org/10.3390/info16100860
