EEG Fatigue Judgment Method Based on Approximate Nearest Neighbor Search

Cui, Yingjie; Li, Xu; Chen, Zhongxian; Li, Yan

doi:10.3390/computers15050303

Open Access Article

¹ School of Intelligence Manufacturing, Huanghuai University, Zhumadian 463000, China

² School of Electronics and Information, Huanghuai University, Zhumadian 463000, China

* Authors to whom correspondence should be addressed.

Computers 2026, 15(5), 303; https://doi.org/10.3390/computers15050303

Submission received: 29 March 2026 / Revised: 2 May 2026 / Accepted: 8 May 2026 / Published: 10 May 2026

(This article belongs to the Special Issue AI/ML-Driven EEG Signal Processing)

Abstract

Fatigue seriously affects work efficiency and brings potential safety hazards, and electroencephalogram (EEG) serves as a valuable physiological indicator for fatigue monitoring, as it directly reflects underlying brain neural activity. A key characteristic in EEG fatigue research is that the feature spaces of pre-fatigue and post-fatigue EEG signals exhibit obvious spatial separation—this separation is caused by significant changes in brain electrical activity when the human body transitions from a normal awake state to a fatigue state. Existing EEG-based fatigue judgment methods mostly focus on binary classification, which fails to fully leverage the inherent spatial separation characteristic of pre-fatigue and post-fatigue feature spaces, making it difficult to achieve simple, efficient, and accurate fatigue judgment. To address this problem, this paper proposes an EEG fatigue judgment method based on feature space spatial separation and Approximate Nearest Neighbor Search (ANNS). The 16-channel pre-fatigue (Group A) and post-fatigue (Group B) EEG signals acquired from seven subjects are segmented and subjected to feature extraction, projecting the signals into a unified feature space. An ANNS index is constructed using feature vectors from both Group A and Group B, with each vector annotated by its corresponding class label. A separate test set (Group C) is utilized, and the k-nearest neighbors of each test feature vector are retrieved from the built ANNS index. The mental fatigue state is then identified via majority voting according to the class labels of the k-nearest neighbors. Experimental results demonstrate that the proposed method can effectively exploit the spatial separation between pre-fatigue and post-fatigue feature distributions, yielding an average single-subject classification accuracy of approximately 90%.

Keywords: electroencephalogram (EEG); fatigue judgment; feature space; approximate nearest neighbor search (ANNS); feature extraction; mental fatigue

1. Introduction

Rapid social and work pace leads to fatigue from prolonged high-intensity mental and physical activities, which reduces efficiency, impairs cognition, and creates safety risks in transportation, industry, and other domains [1]. Accurate, real-time fatigue assessment is therefore of great practical importance [2].

Electroencephalograms (EEGs) directly reflect brain activity and contain rich neural information. The feature distribution of EEG changes markedly between pre-fatigue (awake) and post-fatigue states [3]. Typically, pre-fatigue EEG forms a compact feature cluster (Group A), while post-fatigue EEG exhibits a dispersed distribution (Group B) that is spatially separated from Group A. This separation provides a reliable basis for fatigue detection: a test EEG vector will be closer to the feature subset corresponding to its true state [4].

Existing EEG-based fatigue monitoring approaches usually treat the problem as a binary classification task [3,5,6,7], training deep-learning or conventional machine-learning models [8] to label signals as “pre-fatigue” or “post-fatigue” [9]. Gao [10] constructs a CNN-based fatigue detection model by converting EEG into recurrence networks. Su [11] employs a CNN-LSTM structure to capture spatial–spectral features and temporal dependencies from EEG for mental fatigue assessment. TFormer [12] is a time–frequency Transformer enhanced with batch normalization, to effectively model EEG patterns and boost cross-subject fatigue recognition performance. These methods suffer from two major drawbacks: (1) they require large amounts of labeled data, increasing experimental cost and complexity; (2) they ignore the inherent spatial separation of the two feature spaces and output only discrete class labels, limiting interpretability and reliability.

Approximate Nearest Neighbor Search (ANNS) [13] is an efficient high-dimensional vector retrieval technique that quickly computes distances between a query vector and a reference set, offering both speed and accuracy. ANNS naturally exploits the spatial separation between pre-fatigue and post-fatigue feature spaces [14,15]: by treating the pre-fatigue feature set as a reference, the distance from a test vector to this set can be rapidly evaluated, enabling fatigue judgment based on a distance threshold. However, to the best of our knowledge, ANNS has not yet been applied to EEG-based fatigue detection in existing literature.

Recent advances in wearable EEG acquisition [16,17] and edge computing [18] have opened new possibilities for continuous fatigue monitoring in real-world settings. By leveraging the lightweight ANNS retrieval mechanism [19], our approach can operate with limited computational resources while maintaining high accuracy, making it attractive for integration into driver-assistance systems, industrial safety monitoring, and personal health tracking devices. Furthermore, the modular nature of the pipeline allows for straightforward incorporation of additional signal modalities, such as eye-tracking or heart-rate variability, to enhance robustness. Beyond these immediate applications, the proposed framework can serve as a foundation for broader physiological state detection, including stress, drowsiness, and cognitive load, where distinct feature space separations are present. The distance-based retrieval paradigm also facilitates seamless integration with emerging edge-AI platforms, enabling real-time feedback and adaptive interventions in dynamic environments. Accurate detection and assessment of mental fatigue constitute a core challenge in related studies [20,21,22].

To address the limitations of existing methods and fully leverage the spatial separation of EEG feature spaces [23,24], we propose an ANNS-based EEG fatigue detection framework. The pipeline consists of three main steps: (1) segmenting 16-channel EEG signals from Group A, Group B, and test sessions, extracting fatigue-sensitive features, and projecting them into a unified feature space; (2) constructing an ANNS index using feature vectors from both Group A and Group B, with each vector labeled by its corresponding fatigue state; (3) for each test vector in Group C, retrieving its k = 7 nearest neighbors from the index, performing majority voting over the neighbor labels, and determining the fatigue state accordingly. This approach avoids reliance on large-scale labeled data, fully exploits the spatial separation between the two feature distributions, and provides a simple, efficient, and accurate solution for mental fatigue assessment.

The main contributions of this work are: (1) an effective application of ANNS tailored to characterize the spatial separation between pre-fatigue and post-fatigue EEG feature distributions, enabling efficient and reliable fatigue classification; (2) a lightweight yet effective EEG feature extraction scheme tailored to 16-channel recordings, capable of capturing discriminative fatigue-related characteristics; (3) comprehensive experiments on a separate test set that demonstrate the method’s simplicity, operability, and high accuracy, highlighting its potential for practical deployment.

2. Methods

This section outlines the EEG-based fatigue-judgment pipeline using ANNS. It details 16-channel EEG feature extraction, fatigue-sensitive feature screening, and ANNS index construction. The approach first extracts discriminative features that separate pre-fatigue from post-fatigue states, and then classifies test segments by ANNS-based distance measurement.

2.1. Data Description

The experimental data used in this paper are 16-channel EEG signals collected from 7 subjects (NO6, NO8, NO10, NO11, NO12, NO16, NO18), which are divided into three groups according to the fatigue state and experimental purpose: pre-fatigue (Group A), post-fatigue (Group B), and test (Group C). All signals are stored in CSV format, with each row corresponding to an EEG channel and each column representing a time-domain sampling point. Group A (pre-fatigue) signals are collected when the subjects are in a normal awake state (e.g., after resting for 30 min); Group B (post-fatigue) signals are collected after the subjects perform continuous mental work for 1–2 h (e.g., reading, computer operation); Group C (test) signals are collected separately (half pre-fatigue, half post-fatigue) to verify the judgment accuracy of the method. In addition, separate experiments are carried out for single subjects and all subjects, where each single subject selects 8 optimal channels from 16 channels for feature extraction and judgment, and all subjects adopt a unified 8-channel index for unified processing, ensuring the comprehensiveness and rationality of the experiment.

2.2. EEG Segmentation

To convert long EEG recordings [25] into manageable feature units, the raw 16-channel EEG signals are divided into fixed-length segments. A segment length of 256 sampling points is used, which keeps the computation simple and requires no parameter tuning, and the segments do not overlap, which avoids redundant information. For each raw EEG signal with $T$ sampling points in total, the number of segments $N$ is calculated as

$N = \lfloor (T - 256) / 256 \rfloor + 1$

Each segment is represented as a 2D matrix of size (16, 256), which is convenient for subsequent feature extraction.
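The segmentation rule can be illustrated with a minimal NumPy sketch. The function name `segment_eeg` and the synthetic 16 × 1000 recording are illustrative assumptions, not part of the original implementation; only the 256-point, non-overlapping rule is taken from the text.

```python
import numpy as np

SEGMENT_LENGTH = 256  # sampling points per segment, as stated in the text


def segment_eeg(signal, seg_len=SEGMENT_LENGTH):
    """Split a (channels, samples) array into non-overlapping segments.

    Returns an array of shape (N, channels, seg_len); trailing samples that
    do not fill a whole segment are discarded.
    """
    signal = np.asarray(signal, dtype=float)
    n_channels, total = signal.shape
    if total < seg_len:
        return np.empty((0, n_channels, seg_len))
    n_segments = (total - seg_len) // seg_len + 1
    trimmed = signal[:, : n_segments * seg_len]
    # (channels, N*seg_len) -> (channels, N, seg_len) -> (N, channels, seg_len)
    return trimmed.reshape(n_channels, n_segments, seg_len).transpose(1, 0, 2)


# Hypothetical recording: 16 channels, 1000 samples of synthetic noise.
rng = np.random.default_rng(0)
demo_signal = rng.standard_normal((16, 1000))
demo_segments = segment_eeg(demo_signal)
print(demo_segments.shape)  # (3, 16, 256)
```

For 1000 sampling points the formula gives floor((1000 - 256) / 256) + 1 = 3 segments, and the remaining 232 points are dropped.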

2.3. Feature Extraction and Fatigue-Sensitive Feature Screening

To fully utilize the 16-channel EEG signals and identify the features that lead to significant spatial separation between pre-fatigue and post-fatigue feature spaces, this section designs a two-step scheme: first, extract multi-domain features for each of the 16 channels, and then screen out fatigue-sensitive features that maximize the distance between the two states. The scheme is simple to operate, avoids complex calculations, and is fully compatible with the subsequent ANNS index construction. In addition, for each single subject, 8 optimal channels are selected from 16 channels (the specific channel index is determined according to the experimental data of each subject), and feature extraction is only performed on these 8 channels to reduce computational complexity while ensuring judgment accuracy; for all subjects, a unified 8-channel index is adopted for unified processing to verify the generalization ability of the method.

2.3.1. Multi-Domain Feature Extraction for Selected Channels

Aiming at the selected 8 channels (different for each single subject, unified for all subjects) of EEG signals (each channel corresponds to a different scalp position, covering frontal, parietal, and temporal lobes related to fatigue), multi-domain features (frequency domain + time domain) are extracted for each channel to ensure comprehensive capture of fatigue-related neural activity changes. The extraction process is consistent for all selected channels, and the features of each channel are extracted independently first and then fused, avoiding cross-channel interference.

The specific extraction content for every single channel is as follows (8 dimensions per channel, 8 channels × 8 dimensions = 64-dimensional total features, balancing feature richness and computational complexity):

Frequency-domain features (6 dimensions): Select 4 fatigue-related frequency bands (delta: 1–3 Hz, theta: 4–7 Hz, alpha: 8–13 Hz, beta: 14–30 Hz) that are most closely related to brain fatigue. For each band, calculate two core indicators: band energy and band power ratio (ratio of the band energy to the total energy of the 4 bands). The 4 bands × 2 indicators = 6 frequency-domain dimensions per channel.
Time-domain features (2 dimensions): Extract two simple and effective time-domain indicators that are sensitive to fatigue changes: signal mean (reflecting the overall amplitude of EEG signals) and signal variance (reflecting the fluctuation of EEG signals). Fatigue will lead to a significant decrease in EEG amplitude and an increase in fluctuation, so these two indicators can effectively capture fatigue changes.

Feature calculations use straightforward formulas for easy implementation.

Band energy: Calculate the sum of the squared amplitude of the frequency components within the corresponding band after FFT transformation of the EEG segment. The formula is as follows:

$E_{k} = \sum_{f \in B_{k}} {| X (f) |}^{2}$

(1)

where $E_{k}$ represents the energy of the k-th frequency band (k = 1, 2, 3, 4, corresponding to delta, theta, alpha, beta bands respectively), $B_{k}$ represents the frequency range of the k-th band, $X (f)$ is the FFT transform result of the EEG segment, and $| X (f) |$ is the amplitude of the frequency component f.
Band power ratio [23]: Divide the energy of a single band by the sum of the energies of the 4 bands, and retain 4 decimal places for calculation. The formula is:

$P R_{k} = \frac{E_{k}}{\sum_{i = 1}^{4} E_{i}}$

(2)

where $P R_{k}$ is the power ratio of the k-th frequency band, $E_{k}$ is the energy of the k-th frequency band, and $\sum_{i = 1}^{4} E_{i}$ is the total energy of the 4 fatigue-related frequency bands.
Time-domain mean: Calculate the average value of all sampling points in a single EEG segment. The formula is:

$μ_{t} = \frac{1}{L} \sum_{n = 1}^{L} s (n)$

(3)

where $μ_{t}$ represents the time-domain mean of the EEG segment, L is the number of sampling points in a single EEG segment (256 in this paper), and $s (n)$ is the amplitude value of the n-th sampling point in the EEG segment.
Time-domain variance: Calculate the fluctuation degree of the EEG segment sampling points relative to the mean value. The formula is:

$σ_{t}^{2} = \frac{1}{L - 1} \sum_{n = 1}^{L} {(s (n) - μ_{t})}^{2}$

(4)

where $σ_{t}^{2}$ is the time-domain variance of the EEG segment, $μ_{t}$ is the time-domain mean of the EEG segment, L is the number of sampling points in the segment, and $s (n)$ is the amplitude value of the n-th sampling point.
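Formulas (1) to (4) can be sketched for a single-channel segment as follows. This is a minimal illustration under stated assumptions: the function names, the one-sided FFT (`numpy.fft.rfft`), the zero-energy guard, and the 256 Hz sampling rate of the demo signal are choices made here for the example and are not specified in the text.

```python
import numpy as np

# Band edges in Hz, taken from the text (delta, theta, alpha, beta).
BANDS = {"delta": (1, 3), "theta": (4, 7), "alpha": (8, 13), "beta": (14, 30)}


def band_features(segment, fs):
    """Return (energies, ratios) for one single-channel segment.

    energies[k] follows Eq. (1): sum of |X(f)|^2 over the FFT bins in band k.
    ratios[k] follows Eq. (2): energies[k] divided by the four-band total,
    rounded to 4 decimal places as described in the text.
    """
    segment = np.asarray(segment, dtype=float)
    spectrum = np.fft.rfft(segment)
    freqs = np.fft.rfftfreq(segment.size, d=1.0 / fs)
    power = np.abs(spectrum) ** 2
    energies = np.array(
        [power[(freqs >= lo) & (freqs <= hi)].sum() for lo, hi in BANDS.values()]
    )
    total = energies.sum()
    # Guard against an all-zero segment (an added safeguard, not from the text).
    ratios = energies / total if total > 0 else np.zeros_like(energies)
    return energies, np.round(ratios, 4)


def time_features(segment):
    """Return (mean, variance) per Eqs. (3) and (4); the variance uses L - 1."""
    segment = np.asarray(segment, dtype=float)
    return segment.mean(), segment.var(ddof=1)


# Demo: a pure 10 Hz sinusoid sampled at a hypothetical 256 Hz.
fs_demo = 256
t = np.arange(256) / fs_demo
demo_segment = np.sin(2 * np.pi * 10 * t)
demo_energies, demo_ratios = band_features(demo_segment, fs_demo)
demo_mean, demo_var = time_features(demo_segment)
```

For the 10 Hz test tone, essentially all energy falls in the alpha band. Note that the FFT bin spacing equals the sampling rate divided by the segment length, so with 256-point segments the number of bins inside each band depends on the sampling rate actually used.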

After extracting 8 dimensions of features for each of the 8 selected channels, the features of the 8 channels are fused in sequence to form a 64-dimensional feature vector for each EEG segment (each segment corresponds to one 64-dimensional vector), which is used as the input for subsequent ANNS index construction and distance calculation.

2.3.2. Screening of Fatigue-Sensitive Features (Key to Spatial Separation)

The core goal of this step is to identify which of the 64-dimensional features (8 channels × 8 dimensions) can maximize the spatial distance between pre-fatigue (Group A) and post-fatigue (Group B) feature spaces. The screening principle is simple and operable: calculate the distance contribution of each feature to the spatial separation of the two states, and retain the features with the highest contribution (excluding redundant features), ensuring that the screened features can make the pre-fatigue and post-fatigue feature spaces as far apart as possible.

The screening proceeds through four concise statistical steps:

Step 1: Calculate the feature mean of pre-fatigue (Group A) and post-fatigue (Group B) for each of the 64-dimensional features. For the i-th feature (i = 1, 2, …, 64), denote the mean of Group A as $μ_{A, i}$ and the mean of Group B as $μ_{B, i}$, which are calculated by the following formulas:

$μ_{A, i} = \frac{1}{N_{A}} \sum_{x \in \text{Group A}} x_{i}$

(5)

$μ_{B, i} = \frac{1}{N_{B}} \sum_{x \in \text{Group B}} x_{i}$

(6)

where $N_{A}$ is the number of feature vectors in Group A, $N_{B}$ is the number of feature vectors in Group B, and $x_{i}$ is the i-th dimension value of a single feature vector x.

Step 2: Calculate the absolute difference of the mean values of each feature between the two groups, denoted as $Δ_{i}$. The formula is:

$Δ_{i} = | μ_{A, i} - μ_{B, i} |$

(7)

The larger $Δ_{i}$ is, the greater the difference of the i-th feature between pre-fatigue and post-fatigue states, and the stronger its contribution to the spatial separation of the two feature spaces.

Step 3: Sort all 64-dimensional features in descending order according to $Δ_{i}$, and select the top 32 features (half of the total features, balancing feature richness and computational complexity). These 32 features are the fatigue-sensitive features that can maximize the spatial distance between pre-fatigue and post-fatigue feature spaces.

Step 4: Verify the screening effect. After screening, fuse the 32 fatigue-sensitive features into a 32-dimensional feature vector, and calculate the average spatial distance between pre-fatigue (Group A) and post-fatigue (Group B) feature vectors to verify whether the screening enhances spatial separation. The average spatial distance is calculated as:

$\bar{D}_{A - B} = \frac{1}{N_{A} \times N_{B}} \sum_{x \in \text{Group A}} \sum_{y \in \text{Group B}} D(x, y)$

(8)

where $\bar{D}_{A - B}$ is the average spatial distance between Group A and Group B feature vectors, $N_{A}$ is the number of feature vectors in Group A, $N_{B}$ is the number of feature vectors in Group B, and $D(x, y)$ is the Euclidean distance between vector x and vector y (calculated by Formula (9)). If $\bar{D}_{A - B}$ after screening is greater than that before screening, it indicates that the screened features effectively enhance the spatial separation of the two states.

The 32 screened features cover all 8 selected channels (each channel contributes 4 sensitive features on average), ensuring that the spatial separation of the two states is not limited to a single channel, but reflects the overall fatigue change of the selected 8-channel EEG.
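The four screening steps can be sketched in NumPy as follows. The function names and the synthetic data (50 vectors per group, with Group B shifted in the first 32 features) are hypothetical and serve only to exercise Formulas (5) to (8); they are not the experimental data.

```python
import numpy as np


def screen_features(group_a, group_b, n_keep=32):
    """Return indices of the n_keep features with the largest |mean_A - mean_B|."""
    mu_a = group_a.mean(axis=0)          # Eq. (5)
    mu_b = group_b.mean(axis=0)          # Eq. (6)
    delta = np.abs(mu_a - mu_b)          # Eq. (7)
    order = np.argsort(delta)[::-1]      # feature indices, descending by delta
    return np.sort(order[:n_keep])


def mean_pairwise_distance(group_a, group_b):
    """Average Euclidean distance over all (x, y) pairs, as in Eq. (8)."""
    diff = group_a[:, None, :] - group_b[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=2)).mean()


# Synthetic stand-in data (hypothetical): 50 vectors per group, 64 features,
# with Group B shifted by 5 in the first 32 features only.
rng = np.random.default_rng(0)
demo_a = rng.standard_normal((50, 64))
demo_b = rng.standard_normal((50, 64))
demo_b[:, :32] += 5.0

selected = screen_features(demo_a, demo_b)
d_all = mean_pairwise_distance(demo_a, demo_b)
d_selected = mean_pairwise_distance(demo_a[:, selected], demo_b[:, selected])
```

On this toy data the screening recovers exactly the 32 shifted features. One caveat when applying Step 4: removing coordinates can only shrink a raw Euclidean distance, so `d_selected` is never larger than `d_all` here. A before/after comparison is therefore only informative if the distances are normalized in some way, which the text does not specify.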

2.4. ANNS Index Construction and k-Nearest Neighbor Retrieval

An ANNS index [26] is constructed using both pre-fatigue (Group A) and post-fatigue (Group B) 32-dimensional fatigue-sensitive feature vectors (screened in Section 2.3.2), which serves as the joint reference index for k-nearest neighbor retrieval. Each feature vector in the index is marked with its corresponding category [27] (pre-fatigue/Group A or post-fatigue/Group B). The Hierarchical Navigable Small World (HNSW) [28] algorithm is selected as the ANNS retrieval method, with key parameters set to default values (M = 16, efConstruction = 200, efSearch = 100) to avoid complex parameter tuning. The Euclidean distance is used as the spatial distance metric, and its calculation formula is:

$D(x, y) = \sqrt{\sum_{k = 1}^{32} (x_{k} - y_{k})^{2}}$

(9)

where $D(x, y)$ is the Euclidean distance between the test feature vector x and the reference feature vector y in the ANNS index, $x_{k}$ and $y_{k}$ are the k-th dimension values of x and y, respectively, and 32 is the dimension of the fatigue-sensitive feature vector. For this experiment, the number of retrieved neighbors k is set to 7 (an odd number that balances judgment accuracy and computational efficiency), which means that for each test vector, the 7 nearest neighbor vectors in the ANNS index are retrieved.

The core function of the ANNS index is to retrieve the k-nearest neighbors of each test vector (Group C) from the joint index (Group A + Group B), then count the category (pre-fatigue/post-fatigue) of these k-nearest neighbors, and judge the state of the test vector based on the majority category of the k-nearest neighbors. The 32-dimensional fatigue-sensitive features ensure that the k-nearest neighbors of a test vector are mostly from the feature subset corresponding to its actual state, which provides a reliable basis for fatigue state judgment.

To further clarify the k-nearest neighbor retrieval logic, the specific process is as follows: For each test vector $x \in \text{Group C}$, the ANNS index retrieves the top 7 (k = 7) reference vectors with the smallest Euclidean distance to x (i.e., the 7 nearest neighbors), records the category (Group A/pre-fatigue or Group B/post-fatigue) of each nearest neighbor, and then determines the category of x through majority voting. This method avoids the uncertainty caused by a single nearest neighbor and improves the stability of judgment.
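The retrieval logic can be illustrated with an exact brute-force search in NumPy. This is deliberately a stand-in and not the HNSW index used in the paper: an HNSW index approximates the neighbor set that this exhaustive scan returns exactly. The function name `knn_search`, the label coding (0 for Group A, 1 for Group B), and the synthetic vectors are assumptions made for the example.

```python
import numpy as np


def knn_search(reference, queries, k=7):
    """Exact k-nearest-neighbor search under Euclidean distance.

    Returns (distances, indices), each of shape (n_queries, k), sorted by
    increasing distance. This brute-force scan stands in for an HNSW index.
    """
    reference = np.asarray(reference, dtype=float)
    queries = np.asarray(queries, dtype=float)
    diff = queries[:, None, :] - reference[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))           # Euclidean distance per pair
    idx = np.argsort(dist, axis=1)[:, :k]
    return np.take_along_axis(dist, idx, axis=1), idx


# Hypothetical joint index: 20 Group A vectors near 0, 20 Group B vectors near 4.
rng = np.random.default_rng(1)
ref_vectors = np.vstack(
    [rng.normal(0.0, 0.5, (20, 32)), rng.normal(4.0, 0.5, (20, 32))]
)
ref_labels = np.array([0] * 20 + [1] * 20)            # 0 = Group A, 1 = Group B
query = np.full((1, 32), 4.0)                         # a point inside Group B
knn_dist, knn_idx = knn_search(ref_vectors, query, k=7)
neighbor_labels = ref_labels[knn_idx[0]]
```

Because the query lies in the middle of the synthetic Group B cluster, all seven retrieved neighbors carry the Group B label. For reference sets of the size reported here, the exhaustive scan is also a convenient way to check the recall of an approximate index.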

2.5. Fatigue Judgment Criteria

Based on the k-nearest neighbor retrieval results of the ANNS index, the fatigue state is judged by majority voting, and the judgment criteria are simple and operable, without complex threshold calculation:

For each test EEG feature vector x in Group C, use the ANNS index to retrieve its k-nearest neighbors (k = 7) from the joint reference index (Group A + Group B). The k-nearest neighbor retrieval is based on the Euclidean distance $D(x, y)$ (Formula (9)), and the retrieval formula for the k-nearest neighbors is:

$NN_{k}(x) = \arg\min_{y \in \text{Group A} \cup \text{Group B}} {\{D(x, y)\}}_{k}$

(10)

where $NN_{k}(x)$ represents the set of k-nearest neighbors of the test vector x, and the right-hand side denotes selecting the k reference vectors with the smallest Euclidean distance to x from the joint index.
Count the number of nearest neighbors belonging to Group A (pre-fatigue) and Group B (post-fatigue) in $NN_{k}(x)$, denoted as $N_{A, NN}$ and $N_{B, NN}$, respectively. The counting formula is:

$N_{A, NN} = \sum_{y \in NN_{k}(x)} I(y \in \text{Group A})$

(11)

$N_{B, NN} = \sum_{y \in NN_{k}(x)} I(y \in \text{Group B})$

(12)

where $I(\cdot)$ is the indicator function, i.e., $I(y \in \text{Group A}) = 1$ if y belongs to Group A, otherwise $I(y \in \text{Group A}) = 0$; the same applies to $I(y \in \text{Group B})$.
Judgment rule (majority voting): If $N_{A, NN} > N_{B, NN}$, the current state of the test vector x is judged as pre-fatigue; if $N_{A, NN} < N_{B, NN}$, the current state is judged as post-fatigue.
If $N_{A, NN} = N_{B, NN}$, the vector is judged as uncertain and counted as incorrect in the accuracy statistics. With two classes and an odd k = 7, such a tie cannot occur as long as all seven neighbors are returned, so this rule has no practical impact on the experimental results.

We set k = 7, an odd number that prevents tie votes and balances accuracy with computational efficiency. Empirically, k = 7 yields high judgment accuracy.
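The judgment rule can be written as a short pure-Python sketch. The label strings "A" and "B" and the function name `judge_state` are illustrative choices, not identifiers from the original implementation.

```python
from collections import Counter

PRE_FATIGUE = "A"    # label carried by Group A (pre-fatigue) neighbors
POST_FATIGUE = "B"   # label carried by Group B (post-fatigue) neighbors


def judge_state(neighbor_labels):
    """Apply the majority-vote rule to the labels of the retrieved neighbors."""
    counts = Counter(neighbor_labels)
    n_a = counts[PRE_FATIGUE]     # Eq. (11): neighbors from Group A
    n_b = counts[POST_FATIGUE]    # Eq. (12): neighbors from Group B
    if n_a > n_b:
        return "pre-fatigue"
    if n_a < n_b:
        return "post-fatigue"
    return "uncertain"            # tie; scored as incorrect in the text


print(judge_state(["A", "B", "B", "A", "B", "B", "A"]))  # post-fatigue
```

In the example, three neighbors come from Group A and four from Group B, so the test vector is judged as post-fatigue.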

3. Results

To assess the effectiveness and generalization capability of the proposed method, we conduct experiments from two perspectives: single-subject processing and all-subject unified processing. Following the procedure described in Section 2, we employ k = 7 nearest-neighbor retrieval with majority voting, and evaluate performance using Top-7 nearest-neighbor accuracy and the F1 score. The experimental protocol is straightforward, with standardized data collection and processing yielding reliable and interpretable results.

3.1. Experimental Setup

3.1.1. Hardware and Software Environment

The experiments were performed on a standard desktop computer (CPU: Intel Core i5 or higher, RAM: 8 GB, OS: Windows 10). The software stack includes Python 3.8 and the following key libraries: NumPy 1.24.3 (data processing), scikit-learn 1.2.2 (feature computation and evaluation metrics), FAISS 1.7.4 [29] (ANNS index construction and k-nearest neighbor retrieval), and Pandas 1.5.3 (CSV handling). All packages are readily installable via pip, requiring no complex configuration, which ensures reproducibility.

A total of 7 subjects (4 males and 3 females) with a mean age of 21 years (range: 20–22 years) were recruited from undergraduate students at Huanghuai University. All participants were right-handed, free of major medical conditions, and had normal hearing. Prior to the experiment, subjects were required to obtain at least 8 h of sleep. The experimental protocol was approved by Huanghuai University. During the entire experiment, subjects wore in-ear earphones and a wireless EEG recorder (NeuSen W, Neuracle Ltd., Shanghai, China) with a sampling rate of 1000 Hz, and EEG recording was synchronized with the auditory stimulus.

3.1.2. Experimental Data

The 16-channel EEG signals [30] were collected from 16 electrode positions on the scalp, as shown in Figure 1. Figure 2 illustrates the experimental environment of EEG signal collection. Mental arithmetic is a widely adopted method to induce mental workload. To elicit sufficient fatigue in participants within a 2 h period, we followed the protocol in [31] and asked them to perform a simulated flight task while concurrently completing mental arithmetic problems.

The experimental data are 16-channel EEG signals collected from 7 subjects, numbered NO6, NO8, NO10, NO11, NO12, NO16, and NO18, respectively. For each experimental run, the dataset was divided into training and test sets in a subject-independent manner to evaluate the generalization capability. Feature selection was conducted exclusively on the training set within each validation round, without accessing any information from the held-out test set. The optimal feature subset determined from the training data was subsequently applied to the unseen test set for model evaluation. This pipeline effectively prevents data leakage and ensures unbiased and reliable validation results. For each single subject, 8 optimal channels are selected from 16 channels (the specific channel index is determined according to the feature discrimination ability of each channel), and the EEG signals of these 8 channels are used for feature extraction, ANNS index construction, and test judgment; for all subjects, a unified 8-channel index is adopted for unified processing to verify the generalization ability of the method. The test set size, selected channel index, and experimental results of each subject are shown in detail in the experimental results section. All test sets are composed of half pre-fatigue and half post-fatigue EEG segments, ensuring the balance of the test set and the objectivity of the experimental results. Meanwhile, we strictly separated the training and test sets throughout the entire process of feature selection and model construction, and adopted a standard k-fold cross-validation strategy to ensure reliable and unbiased evaluation. Specifically, the dataset was randomly partitioned into k mutually exclusive subsets, with one subset used for testing and the remaining $k - 1$ subsets for training in each fold. This procedure was repeated until all subsets were evaluated, and the final performance was averaged across all folds. Such a validation protocol effectively avoids data leakage and overfitting, ensuring that the reported results are robust and generalizable.
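The leakage-free protocol described above, in which features are ranked on the training folds only, can be sketched as follows. The number of folds is not stated in the text, so the value of 5 used here, together with the function names and the synthetic data, is purely an assumption for illustration.

```python
import numpy as np


def kfold_indices(n_samples, n_folds, seed=0):
    """Yield (train_idx, test_idx) pairs for a shuffled k-fold split."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n_samples), n_folds)
    for i, test_idx in enumerate(folds):
        train_idx = np.concatenate([f for j, f in enumerate(folds) if j != i])
        yield train_idx, test_idx


def select_on_train(features, labels, train_idx, n_keep):
    """Rank features by |mean_A - mean_B| using the training rows only."""
    x, y = features[train_idx], labels[train_idx]
    delta = np.abs(x[y == 0].mean(axis=0) - x[y == 1].mean(axis=0))
    return np.sort(np.argsort(delta)[::-1][:n_keep])


# Hypothetical data: 40 vectors, 8 features, labels 0 (pre) / 1 (post).
rng = np.random.default_rng(2)
demo_x = rng.standard_normal((40, 8))
demo_y = np.repeat([0, 1], 20)
demo_splits = list(kfold_indices(len(demo_x), n_folds=5))
demo_selected = [
    select_on_train(demo_x, demo_y, train_idx, n_keep=4)
    for train_idx, _ in demo_splits
]
```

Each fold yields its own selected feature subset, which would then be applied unchanged to that fold's held-out rows before building the index and scoring.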

3.1.3. Experimental Process

The experimental process is divided into two parts: single-subject processing and all-subject unified processing, both of which follow the method described in Section 2. The specific steps are as follows:

Single-subject processing: For each subject (NO6, NO8, NO10, NO11, NO12, NO16, NO18), select the corresponding 8 optimal channels according to the experimental data; segment the EEG signals of the selected channels into 256-point non-overlapping segments; extract 8-dimensional features for each channel, fuse into 64-dimensional feature vectors, and screen out 32-dimensional fatigue-sensitive features; construct an ANNS index using the pre-fatigue (Group A) and post-fatigue (Group B) feature vectors of the subject; use the test set (Group C) of the subject as the query, retrieve the top 7 nearest neighbors from the ANNS index, judge the state of each test segment by majority voting, and calculate the Top-7 nearest neighbor accuracy and F1 score.

All-subject unified processing: Adopt a unified 8-channel index [0, 1, 4, 6, 9, 13, 14, 15] for all 7 subjects; process the EEG signals of all subjects uniformly (segmentation, feature extraction, fatigue-sensitive feature screening); construct a unified ANNS index using the pre-fatigue and post-fatigue feature vectors of all subjects; use the unified test set (total size 2290) of all subjects as the query, retrieve the top 7 nearest neighbors from the ANNS index, judge the state of each test segment by majority voting, and calculate the Top-7 nearest neighbor accuracy and F1 score.

3.2. Evaluation Indicators

To comprehensively and objectively evaluate the effectiveness of the proposed method, two core evaluation indicators are selected: Top-7 nearest neighbor accuracy and F1 score, which are calculated as follows:

Top-7 nearest neighbor accuracy: The ratio of the number of correctly judged test segments to the total number of test segments, which directly reflects the overall judgment accuracy of the method. The calculation formula is:

$\text{Accuracy} = \frac{N_{\text{correct}}}{N_{\text{total}}} \times 100\%$

(13)

where $N_{\text{correct}}$ is the number of correctly judged test segments in Group C, and $N_{\text{total}}$ is the total number of test segments in Group C. The higher the accuracy value, the better the overall judgment effect of the method.

F1 score: A comprehensive indicator that balances precision and recall, which can reflect the judgment effect of the method on both pre-fatigue and post-fatigue states, avoiding the deviation caused by unbalanced data. The calculation formula is:

$F1 = 2 \times \frac{\text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}$

(14)

where Precision is the precision (the ratio of correctly judged positive samples to all judged positive samples), and Recall is the recall (the ratio of correctly judged positive samples to all actual positive samples). The F1 score ranges from 0 to 1, and the closer it is to 1, the better the comprehensive judgment effect of the method.
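Both indicators can be computed with a few lines of pure Python. The text does not state which class is treated as positive for the F1 score; the sketch below assumes post-fatigue, and the function name and toy labels are illustrative only.

```python
def accuracy_and_f1(true_labels, predicted_labels, positive="post-fatigue"):
    """Return (accuracy in percent, F1 score) for the chosen positive class."""
    pairs = list(zip(true_labels, predicted_labels))
    correct = sum(t == p for t, p in pairs)
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = 100.0 * correct / len(pairs)                 # Eq. (13)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0   # Eq. (14)
    return accuracy, f1


# Toy example: 8 test segments, half pre-fatigue and half post-fatigue.
demo_true = ["pre-fatigue"] * 4 + ["post-fatigue"] * 4
demo_pred = ["pre-fatigue", "pre-fatigue", "pre-fatigue", "post-fatigue",
             "post-fatigue", "post-fatigue", "post-fatigue", "uncertain"]
demo_acc, demo_f1 = accuracy_and_f1(demo_true, demo_pred)
```

In the toy example six of eight segments are judged correctly (75%), and with three true positives, one false positive, and one false negative, precision, recall, and F1 all equal 0.75. An "uncertain" judgment counts as an error and as a missed positive.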

3.3. Experimental Results and Analysis

To determine an appropriate k value, we conducted extensive experiments. Our objective is to achieve the highest possible top-k accuracy while minimizing k to ensure fast query speed. Taking the experimental results on NO10 as an example: the top-3 nearest neighbor accuracy is 0.9770, the top-5 accuracy is 0.9805, the top-7 accuracy is 0.9869, the top-9 accuracy is 0.9828, and the top-11 accuracy is 0.9850. Therefore, we choose k = 7 for all subsequent experiments.
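The selection criterion (highest accuracy, smallest k among equals) can be stated as a small helper. The accuracy values are those reported above for subject NO10; the function name `pick_k` is an illustrative choice.

```python
# Top-k accuracies reported in the text for subject NO10.
reported_accuracy = {3: 0.9770, 5: 0.9805, 7: 0.9869, 9: 0.9828, 11: 0.9850}


def pick_k(accuracy_by_k):
    """Return the smallest k that attains the maximum accuracy."""
    best = max(accuracy_by_k.values())
    return min(k for k, acc in accuracy_by_k.items() if acc == best)


chosen_k = pick_k(reported_accuracy)
print(chosen_k)  # 7
```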

3.3.1. Result Visualizations

To complement the quantitative results, we present two visualizations. Figure 3 displays a two-dimensional scatter plot of the fatigue-related features after dimensionality reduction, with points colored by pre-fatigue and post-fatigue states, revealing clear class separation. Figure 4 shows a histogram of LDA-transformed features [32], further emphasizing the distinct distributions of the two conditions. These visualizations corroborate the discriminative power of the selected features and support the high accuracy and F1 scores reported in Table 1 and Table 2.

These visualizations demonstrate that the eight-channel selection combined with the dimensionality-reduction pipeline yields a clear separation between pre-fatigue and post-fatigue EEG segments. The scatter plot (Figure 3) displays compact and distinct clustering of data points, suggesting that the extracted features effectively characterize the underlying variance associated with fatigue states. The LDA histogram (Figure 4) further confirms this separation by revealing distinct peaks for each class, underscoring the discriminative power of the linear discriminant analysis step. Together, these qualitative observations support the high Top-7 nearest-neighbor accuracy and F1 scores reported in the quantitative results.

The experimental results include two parts: single-subject experimental results and all-subject unified experimental results. All results are calculated according to the experimental process and evaluation indicators described above, and the specific data are shown in Table 1 (single-subject results) and Table 2 (all-subject unified results).

3.3.2. Single-Subject Results Analysis

It can be seen from Table 1 that the proposed method achieves high judgment accuracy in single-subject tests, and the specific analysis is as follows:

Accuracy performance: The single-subject Top-7 nearest-neighbor accuracy ranges from 0.7917 to 0.9885. Subject NO10 has the highest accuracy (0.9885) and F1 score (0.9895), indicating that the method judges the EEG data of this subject very well; subject NO18 has the lowest accuracy (0.7917) and F1 score (0.7917), which is still close to 80%. The average Top-7 nearest-neighbor accuracy of the 7 subjects is about 0.9006, and the average F1 score is about 0.9012, showing that the method achieves high judgment accuracy in single-subject scenarios.

Channel selection impact: Each subject uses a different 8-channel index, selected according to the feature discrimination ability of each channel. The high accuracy of several subjects (NO10, NO8, NO12) indicates that the selected 8 channels capture the fatigue-related feature changes of those subjects, and that the feature extraction and screening scheme described in Section 2.3 identifies fatigue-sensitive features, which lays a foundation for high-accuracy judgment.

F1 score analysis: The F1 score of each subject is basically consistent with the Top-7 nearest neighbor accuracy, with a difference of less than 0.01, indicating that the method has balanced judgment effects on pre-fatigue and post-fatigue states, and there is no obvious deviation caused by unbalanced test set data.

3.3.3. All-Subject Unified Results Analysis

It can be seen from Table 2 that the Top-7 nearest neighbor accuracy of all-subject unified processing is 0.7476, and the F1 score is 0.7472, which is lower than the average accuracy of single-subject processing. The main reasons are as follows:

Individual differences: There are obvious individual differences in EEG signals between different subjects. The brain electrical activity patterns and fatigue-related feature changes of different subjects are different. The unified 8-channel index and feature screening scheme cannot fully adapt to the individual characteristics of all subjects, resulting in a decrease in judgment accuracy.

Data scale impact: The unified test set size is 2290, which is much larger than the test set size of a single subject. The increase in data scale increases the difficulty of k-nearest neighbor retrieval and judgment, and also amplifies the impact of individual differences, leading to a decrease in overall accuracy.

Compared with the standard SVM (0.695) and Random Forest (0.712) classifiers, our method achieves an accuracy of 0.7476 under all-subject unified processing. This result is acceptable and demonstrates the method’s generalization ability. Statistical comparisons among different models were performed using the Wilcoxon signed-rank test, due to the small sample size and non-normal distribution verified by the Shapiro–Wilk normality test. If the channel selection and feature screening scheme are optimized according to individual differences, the generalization ability of the method can be further improved.

3.3.4. Confusion Matrix Analysis

Confusion matrices quantify the performance of the EEG-based fatigue judgment pipeline; this section summarizes the key results of the single-subject and all-subject experiments. For the single-subject tests, the confusion matrices of the seven subjects are: NO10, [\begin{matrix} 78 & 1 \\ 1 & 94 \end{matrix}]; NO11, [\begin{matrix} 79 & 5 \\ 12 & 76 \end{matrix}]; NO12, [\begin{matrix} 78 & 0 \\ 7 & 80 \end{matrix}] (no false positives); NO16, [\begin{matrix} 68 & 13 \\ 14 & 66 \end{matrix}]; NO18, [\begin{matrix} 76 & 14 \\ 24 & 78 \end{matrix}]; NO6, [\begin{matrix} 96 & 10 \\ 10 & 109 \end{matrix}]; and NO8, [\begin{matrix} 102 & 1 \\ 5 & 108 \end{matrix}]. All subjects reach an accuracy of roughly 80% or higher, and several achieve near-perfect classification. For the all-subject experiment, the confusion matrix is [\begin{matrix} 889 & 260 \\ 287 & 854 \end{matrix}]. These results indicate that, in the single-subject setting, the proposed method captures fatigue-related neural activity with low misclassification rates, which supports its practical application in fatigue assessment. The larger off-diagonal counts in the all-subject matrix are consistent with the individual differences discussed in Section 3.3.3.
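The link between a confusion matrix and the reported accuracy can be checked with a short Python sketch. It assumes that the diagonal entries count correctly judged segments, which the text does not state explicitly; the helper name `accuracy_from_confusion` is hypothetical. The NO10 matrix is the one given above.

```python
def accuracy_from_confusion(matrix):
    """Overall accuracy of a square confusion matrix, assuming the
    diagonal holds the correctly judged samples."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total


no10 = [[78, 1], [1, 94]]  # confusion matrix reported for subject NO10
acc_no10 = accuracy_from_confusion(no10)
print(round(acc_no10, 4))
```

Under this assumption the NO10 matrix gives 172 correct judgments out of 174 segments. The other matrices can be passed to the same function.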

3.3.5. ROC Curve Analysis

Figure 5 presents three representative ROC curves from subjects NO10, NO11, and NO12. For subject NO10, the ROC curve lies very close to the upper-left corner of the plot, yielding an AUC of 0.98, which reflects near-perfect classification performance. For subject NO12, the curve also maintains a high AUC of 0.95, showing strong separation between fatigued and non-fatigued states. Even for subject NO11, which exhibits a slightly lower AUC of 0.90, the curve remains significantly above the diagonal random-guessing line, confirming that the model retains robust discriminative power.

These three examples are representative of the performance observed across all subjects in the dataset. The remaining subjects yield similarly high AUC values and well-behaved ROC curves, and are therefore not presented here for brevity. Collectively, the ROC results demonstrate that the proposed method consistently achieves high classification performance across individuals, with AUC values well above chance level, confirming its effectiveness for EEG-based fatigue detection.
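For readers who wish to reproduce an AUC value, the sketch below computes it as the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one. The text does not specify how the continuous score behind the ROC curves is obtained; using the fraction of post-fatigue labels among the k = 7 neighbors is an assumption made here, and the scores and labels in the example are hypothetical.

```python
def roc_auc(scores, labels):
    """AUC via pairwise comparison: share of (positive, negative) pairs in
    which the positive sample scores higher; ties count as 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes are required")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical scores: fraction of post-fatigue labels among 7 neighbors.
scores = [6 / 7, 5 / 7, 2 / 7, 4 / 7, 1 / 7, 3 / 7]
labels = [1, 1, 1, 0, 0, 0]  # 1 = post-fatigue, 0 = pre-fatigue (assumed coding)
auc = roc_auc(scores, labels)
print(round(auc, 4))
```

The pairwise form is quadratic in the number of samples, which is acceptable for test sets of a few hundred segments.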

3.3.6. Overall Effect Evaluation

Overall, the experimental results show that the proposed EEG fatigue judgment method based on ANNS and k = 7 nearest neighbor retrieval has the following advantages:

High single-subject judgment accuracy: The average Top-7 nearest-neighbor accuracy across the 7 subjects reaches about 90%, with an F1 score also close to 90%, showing that retrieving the k nearest neighbors and applying majority voting can effectively judge the fatigue state of a single subject.

Simple and operable: The experiment does not require complex equipment or a large amount of labeled data; the feature extraction and k-nearest neighbor retrieval process are simple and can be easily implemented with common Python libraries, which is suitable for practical application scenarios.

Certain generalization ability: Although the accuracy of all-subject unified processing is lower than that of single-subject processing, it still reaches 74.76%, indicating that the method can adapt to different subjects to a certain extent, and has potential for further optimization.

In addition, the average retrieval time for a single EEG segment is only 1.8 ms, which meets the real-time requirements of fatigue judgment. The experiment is simple to operate, data collection and processing are not complex, and the results are reliable, so the method meets the goal of being easy to implement.
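The retrieval-and-voting step can be illustrated with a minimal pure-Python sketch. It uses exact brute-force Euclidean search as a stand-in for the ANNS index, so it shows the decision rule rather than the indexing used in the experiments. The function name `knn_majority_vote`, the labels "A" and "B", and the toy 2-D vectors are hypothetical.

```python
import math
from collections import Counter


def knn_majority_vote(query, index_vectors, index_labels, k=7):
    """Exact k-NN by Euclidean distance, then a majority vote over the
    labels of the k nearest vectors. Stand-in for an ANNS index lookup."""
    order = sorted(
        range(len(index_vectors)),
        key=lambda i: math.dist(query, index_vectors[i]),
    )
    neighbors = order[:k]
    votes = Counter(index_labels[i] for i in neighbors)
    label, _ = votes.most_common(1)[0]
    return label, neighbors


# Toy data (hypothetical): Group A near the origin, Group B near (5, 5).
vectors = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3), (0.3, 0.2),
           (5.0, 5.1), (5.2, 4.9), (4.8, 5.0), (5.1, 5.3)]
labels = ["A", "A", "A", "A", "B", "B", "B", "B"]

predicted, _ = knn_majority_vote((4.7, 4.9), vectors, labels, k=7)
print(predicted)
```

With two classes and an odd k, the vote cannot tie, which is one practical reason to restrict the candidate values of k to odd numbers.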

4. Discussion

The results in Section 3 demonstrate that the ANNS-based EEG fatigue judgment method clearly separates pre-fatigue and post-fatigue states in the 32-dimensional latent space. Pre-fatigue shows higher beta and lower alpha activity, while post-fatigue exhibits reduced beta and increased theta/delta power, leading to compact versus dispersed feature clusters. High nearest-neighbor accuracy (up to 0.9885) confirms this separation. Statistical analysis (p < 0.001) further validates that the observed spectral differences are not due to random variation, underscoring the reliability of the selected features for fatigue discrimination. These findings support the hypothesis that fatigue induces distinct spectral signatures that can be captured by a small set of fatigue-sensitive features.

4.1. Distance-Based Fatigue Grading

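The inter-cluster distance provides a continuous fatigue indicator: the mean Euclidean distance between a test vector and its pre-fatigue neighbors, \bar{D}_{x-A}, increases monotonically with fatigue level. It can be computed during k-NN retrieval:

\bar{D}_{x-A} = \frac{1}{N_{A,NN}} \sum_{y \in NN_k(x) \cap A} D(x, y) \quad (15)

Here NN_k(x) is the set of k nearest neighbors of the test vector x, A is the pre-fatigue group, D(x, y) is the Euclidean distance, and N_{A,NN} is the number of those neighbors that belong to A.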
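Equation (15) can be sketched in pure Python as follows. Exact brute-force search again stands in for the ANNS index, and the function name `mean_distance_to_group_a`, the label "A", and the toy vectors are hypothetical. Returning `None` when none of the k neighbors belongs to Group A is an assumption, since the equation is undefined in that case.

```python
import math


def mean_distance_to_group_a(query, index_vectors, index_labels, k=7):
    """Mean Euclidean distance from `query` to those of its k nearest
    neighbors labeled "A" (Equation (15)). Returns None if no neighbor
    is in Group A (assumed convention for the undefined case)."""
    ranked = sorted(
        (math.dist(query, vec), lab)
        for vec, lab in zip(index_vectors, index_labels)
    )[:k]
    dists_a = [dist for dist, lab in ranked if lab == "A"]
    if not dists_a:
        return None
    return sum(dists_a) / len(dists_a)


# Toy index (hypothetical): two Group A vectors and two Group B vectors.
toy_vectors = [(3.0, 0.0), (0.0, 4.0), (1.0, 0.0), (10.0, 10.0)]
toy_labels = ["A", "A", "B", "B"]
d_bar = mean_distance_to_group_a((0.0, 0.0), toy_vectors, toy_labels, k=3)
print(d_bar)
```

In this toy case the three nearest neighbors lie at distances 1 (Group B), 3 and 4 (Group A), so only the last two enter the mean.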

Subject-specific thresholds (baseline D_{base}, fatigue D_{fatigue}) enable multi-level grading. In practice, a short calibration session per subject refines these thresholds, improving robustness against inter-session variability. The continuous nature of \bar{D}_{x-A} also allows the construction of fatigue trajectories over time, which can be visualized to monitor fatigue progression during prolonged tasks. Notably, the distance-based fatigue criterion employed in our current feature selection scheme offers strong computational efficiency and interpretability, which are highly beneficial for real-time fatigue detection requirements. However, we recognize that this simple approach may not fully capture complex feature dependencies such as nonlinear relationships or inter-feature correlations. In future work, we plan to explore more robust feature selection strategies, including statistical hypothesis testing, mutual information analysis, and recursive feature elimination to better characterize complex dependencies within EEG features and further improve classification accuracy and reliability.

4.2. Practical Early-Warning Value

By setting a warning threshold D_{warning} between D_{base} and D_{fatigue} (e.g., 70–80% of D_{fatigue}), the system can issue real-time alerts when \bar{D}_{x-A} exceeds D_{warning}. Retrieval time (1.8 ms) meets real-time requirements, and low false-alarm rates make the method suitable for driving, industrial, or prolonged mental tasks [33]. The framework can be integrated with wearable devices (e.g., headbands or ear-mounted sensors) to provide continuous, unobtrusive monitoring in real-world environments. Moreover, the alert logic can be combined with other physiological signals, such as heart-rate variability, to enhance robustness under noisy conditions.
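The threshold logic can be sketched as a small grading function. The three level names, the default `warning_ratio` of 0.75 (a value inside the 70–80% range mentioned above), and the clamping of D_{warning} so that it never falls below D_{base} are assumptions made for illustration; the threshold values in the example are hypothetical.

```python
def grade_fatigue(d_mean, d_base, d_fatigue, warning_ratio=0.75):
    """Map a mean distance to a coarse fatigue level using
    subject-specific thresholds (assumed three-level scheme)."""
    if not d_base < d_fatigue:
        raise ValueError("d_base must be smaller than d_fatigue")
    # Keep the warning threshold between the baseline and fatigue thresholds.
    d_warning = max(d_base, warning_ratio * d_fatigue)
    if d_mean >= d_fatigue:
        return "fatigue"
    if d_mean > d_warning:
        return "warning"
    return "normal"


# Hypothetical thresholds from a calibration session.
for d in (2.0, 3.5, 4.2):
    print(d, grade_fatigue(d, d_base=1.0, d_fatigue=4.0))
```

Applying the function to successive EEG segments yields the kind of fatigue trajectory described in Section 4.1.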

4.3. Limitations and Future Work

Current limitations include reduced all-subject accuracy (0.7476) due to individual differences, the lack of a systematic fatigue-level scale, and sensitivity to environmental noise. The present dataset also contains a limited number of subjects and recording sessions, which may affect generalizability. Future work will explore adaptive channel/feature selection, personalized thresholds, and robust noise-reduction techniques. We will also expand the subject pool and analyze cross-subject variability to improve cross-subject accuracy. Finally, we plan to investigate deep-learning-based feature extraction, evaluate the method on larger, more diverse cohorts, and develop a standardized protocol for fatigue grading that can be adopted across laboratories.

Overall, the proposed ANNS-based fatigue monitoring framework offers a promising solution for real-time fatigue management in safety-critical domains. By leveraging efficient nearest-neighbor search and interpretable distance metrics, it balances accuracy and computational efficiency, paving the way for deployment in portable EEG devices.

5. Conclusions

In this study, we introduced an ANNS-based EEG fatigue detection framework that capitalizes on the pronounced separation between pre-fatigue and post-fatigue feature distributions within a compact 32-dimensional latent space. By selecting a limited set of fatigue-sensitive channels, extracting concise spectral descriptors, and indexing them via ANNS, the system delivers rapid, label-efficient classification without the need for extensive training data. Experiments on seven subjects using 16-channel recordings achieved an average Top-7 accuracy of approximately 90% for subject-specific models and a cross-subject accuracy of 74.8%. These results serve as proof-of-concept evidence demonstrating the promising performance of our approach for individual-specific fatigue monitoring. The approach requires minimal preprocessing, incurs a low computational cost, and can be readily deployed on portable EEG platforms for continuous, real-time fatigue monitoring. Overall, the proposed ANNS-based framework provides a practical, efficient solution for real-time fatigue management in safety-critical domains, paving the way for widespread adoption in wearable EEG systems.

Author Contributions

Conceptualization, Y.C. and X.L.; methodology, Z.C. and X.L.; software, Y.C.; validation, Z.C.; formal analysis, X.L.; investigation, Y.C.; resources, Y.C.; data curation, Y.C.; writing—original draft preparation, X.L.; writing—review and editing, Z.C. and Y.L.; visualization, X.L.; supervision, Z.C.; project administration, X.L.; funding acquisition, Z.C. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Scientific and Technological Project in Henan Province under Grant Nos. 252102210017, 252102311239.

Data Availability Statement

Some or all data, models generated, or used during the study are available in a repository or online.

Acknowledgments

During the preparation of this manuscript, the author(s) used DeepSeek V3.1 for the purposes of assisting in grammar checking and content optimization.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Li, Z.; Zhang, D.; Lyu, W.; Nazir, S.; Mao, Z.; Wan, C. A multi-source physiological data-driven method for evaluating the mental workload of operators in remote control environment. Saf. Sci. 2025, 187, 106864.
2. Ayuso-Moreno, R.; Rubio-Morales, A.; Durán-Rufaco, A.; García-Calvo, T.; González-Ponce, I. EEG-Based Assessment of Mental Fatigue in Students: A Systematic Review of Measurement Methods and Data Processing Protocols. Appl. Sci. 2026, 16, 234.
3. Trejo, L.J.; Kubitz, K.; Rosipal, R.; Kochavi, R.L.; Montgomery, L.D. EEG-based estimation and classification of mental fatigue. Psychology 2015, 6, 572.
4. Lou, Y.; Pi, R.; Sun, R.; Wu, J.; Wang, W.; Zhu, Z.; Dai, T.; Gong, W. Graph theory-based analysis of functional connectivity changes in brain networks underlying cognitive fatigue: An EEG study. PLoS ONE 2025, 20, e0329212.
5. Zhou, Z.; Asghar, M.A.; Nazir, D.; Siddique, K.; Shorfuzzaman, M.; Mehmood, R.M. An AI-empowered affect recognition model for healthcare and emotional well-being using physiological signals. Clust. Comput. 2023, 26, 1253–1266.
6. Ramadan, M.A.; Salem, N.M.; Mahmoud, L.N.; Sadek, I. Multimodal machine learning approach for emotion recognition using physiological signals. Biomed. Signal Process. Control 2024, 96, 106553.
7. Ishaque, S.; Khan, N.; Krishnan, S. Physiological signal analysis and stress classification from VR simulations using decision tree methods. Bioengineering 2023, 10, 766.
8. Shaik, T.; Tao, X.; Li, L.; Xie, H.; Dai, H.N.; Zhao, F.; Yong, J. AI-driven multi-agent reinforcement learning framework for real-time monitoring of physiological signals in stress and depression contexts. Brain Inform. 2025, 12, 14.
9. Zheng, R.; Wang, Z.; He, Y.; Zhang, J. EEG-based brain functional connectivity representation using amplitude locking value for fatigue-driving recognition. Cogn. Neurodyn. 2022, 16, 325–336.
10. Gao, Z.K.; Li, Y.L.; Yang, Y.X.; Ma, C. A recurrence network-based convolutional neural network for fatigue driving detection from EEG. Chaos Interdiscip. J. Nonlinear Sci. 2019, 29, 113126.
11. Su, M.; Li, W.; Peng, F.; Zhou, W.; Zhang, R.; Wen, Y. EEG-based mental fatigue detection using CNN-LSTM. In Proceedings of the 2022 16th ICME International Conference on Complex Medical Engineering (CME); IEEE: New York, NY, USA, 2022; pp. 302–305.
12. Li, R.; Hu, M.; Gao, R.; Wang, L.; Suganthan, P.N.; Sourina, O. TFormer: A time–frequency Transformer with batch normalization for driver fatigue recognition. Adv. Eng. Inform. 2024, 62, 102575.
13. Peng, Y.; Choi, B.; Chan, T.N.; Yang, J.; Xu, J. Efficient approximate nearest neighbor search in multi-dimensional databases. Proc. ACM Manag. Data 2023, 1, 54.
14. Zhao, X.; Tian, Y.; Huang, K.; Zheng, B.; Zhou, X. Towards efficient index construction and approximate nearest neighbor search in high-dimensional spaces. Proc. VLDB Endow. 2023, 16, 1979–1991.
15. Vaz, M.; Summavielle, T.; Sebastião, R.; Ribeiro, R.P. Multimodal classification of anxiety based on physiological signals. Appl. Sci. 2023, 13, 6368.
16. Sharma, P.; Justus, J.C.; Thapa, M.; Poudel, G.R. Sensors and systems for monitoring mental fatigue: A systematic review. arXiv 2023, arXiv:2307.01666.
17. Zuo, X.; Zhang, C.; Cong, F.; Zhao, J.; Hämäläinen, T. Mobile phone use driver distraction detection based on MSaE of multi-modality physiological signals. IEEE Trans. Intell. Transp. Syst. 2024, 25, 17650–17665.
18. Mateos-García, N.; Gil-González, A.B.; Luis-Reboredo, A.; Pérez-Lancho, B. Driver stress detection from physiological signals by virtual reality simulator. Electronics 2023, 12, 2179.
19. Chen, K.; Nadig, R.; Frouzakis, M.; Ghiasi, N.M.; Liang, Y.; Mao, H.; Park, J.; Sadrosadati, M.; Mutlu, O. REIS: A High-Performance and Energy-Efficient Retrieval System with In-Storage Processing. In Proceedings of the 52nd Annual International Symposium on Computer Architecture; Association for Computing Machinery: New York, NY, USA, 2025; pp. 1171–1192.
20. Hekmatmanesh, A.; Nardelli, P.H.J.; Handroos, H. Review of the State-of-the-Art of Brain-Controlled Vehicles. IEEE Access 2021, 9, 110173–110193.
21. Moioli, R.C.; Nardelli, P.H.J.; Barros, M.T.; Saad, W.; Hekmatmanesh, A.; Silva, P.E.G.; de Sena, A.S.; Dzaferagic, M.; Siljak, H.; Leekwijck, W.V.; et al. Neurosciences and Wireless Networks: The Potential of Brain-Type Communications and Their Applications. IEEE Commun. Surv. Tutor. 2021, 23, 1599–1621.
22. Hekmatmanesh, A.; Zhidchenko, V.; Kauranen, K.; Siitonen, K.; Handroos, H.; Soutukorva, S.; Kilpeläinen, A. Biosignals in Human Factors Research for Heavy Equipment Operators: A Review of Available Methods and Their Feasibility in Laboratory and Ambulatory Studies. IEEE Access 2021, 9, 97466–97482.
23. Xu, X.; Tang, J.; Xu, T.; Lin, M. Mental fatigue degree recognition based on relative band power and fuzzy entropy of EEG. Int. J. Environ. Res. Public Health 2023, 20, 1447.
24. Mattern, E.; Jackson, R.R.; Doshmanziari, R.; Dewitte, M.; Varagnolo, D.; Knorn, S. Emotion recognition from physiological signals collected with a wrist device and emotional recall. Bioengineering 2023, 10, 1308.
25. Karimian-Kelishadrokhi, M.; Safi-Esfahani, F. TD-LSTM: A time distributed and deep-learning-based architecture for classification of motor imagery and execution in EEG signals. Neural Comput. Appl. 2024, 36, 15843–15868.
26. Lu, Z.; Chen, J.; Lian, D.; Zhang, Z.; Ge, Y.; Chen, E. Knowledge distillation for high dimensional search index. Adv. Neural Inf. Process. Syst. 2023, 36, 33403–33419.
27. Cai, Y.; Shi, J.; Chen, Y.; Zheng, W. Navigating labels and vectors: A unified approach to filtered approximate nearest neighbor search. Proc. ACM Manag. Data 2024, 2, 246.
28. Malkov, Y.A.; Yashunin, D.A. Efficient and robust approximate nearest neighbor search using hierarchical navigable small world graphs. IEEE Trans. Pattern Anal. Mach. Intell. 2018, 42, 824–836.
29. Douze, M.; Guzhva, A.; Deng, C.; Johnson, J.; Szilvasy, G.; Mazaré, P.E.; Lomeli, M.; Hosseini, L.; Jégou, H. The faiss library. IEEE Trans. Big Data 2026, 12, 346–361.
30. Sharma, R.; Meena, H.K. Emerging trends in EEG signal processing: A systematic review. SN Comput. Sci. 2024, 5, 415.
31. Li, Y.; Zhou, S.; Tang, C.; Huang, A.; Li, Y.; Wu, S.; Luo, E.; Xie, K. Complexity of the instantaneous frequency variation in auditory steady-state response: A high sensitivity, high anti-interference index of mental fatigue. Adv. Eng. Inform. 2024, 62, 102564.
32. Martis, R.J.; Acharya, U.R.; Min, L.C. ECG beat classification using PCA, LDA, ICA and discrete wavelet transform. Biomed. Signal Process. Control 2013, 8, 437–448.
33. Swati, S.; Kumar, M.; Namasudra, S. Early prediction of cognitive impairments using physiological signal for enhanced socioeconomic status. Inf. Process. Manag. 2022, 59, 102845.

Figure 1. Schematic diagram of 16 head positions.

Figure 2. Experimental environment setup.

Figure 3. Scatter plot of fatigue-related features across subjects. Each point represents an EEG segment, colored by fatigue state (pre-fatigue vs. post-fatigue). The distribution illustrates clear separation, supporting the effectiveness of the selected features.

Figure 4. Histogram of LDA-transformed features showing the separation between pre-fatigue and post-fatigue states. The distinct peaks indicate that linear discriminant analysis can effectively differentiate the two conditions, which contributes to the high classification performance.

Figure 5. ROC curves of three typical subjects.

Table 1. Single-subject experimental results (k = 7).

Subject No.	Selected Channel Index	Test Set Size	Accuracy	F1 Score
NO10	[0, 4, 5, 6, 9, 10, 14, 15]	174	0.9885	0.9895
NO11	[1, 5, 8, 9, 11, 13, 14, 15]	172	0.8953	0.8941
NO12	[0, 1, 3, 5, 6, 7, 9, 13]	165	0.9455	0.9461
NO16	[0, 1, 2, 4, 9, 12, 13, 14]	161	0.8137	0.8125
NO18	[0, 1, 4, 5, 6, 7, 8, 15]	192	0.7917	0.7917
NO6	[0, 1, 2, 4, 6, 7, 13, 14]	225	0.8978	0.9038
NO8	[0, 1, 4, 9, 12, 13, 14, 15]	216	0.9722	0.9730

Table 2. All-subject unified experimental results (k = 7).

Processing Mode	Selected Channel Index	Test Set Size	Accuracy	F1 Score
All-subject Unified	[0, 1, 4, 6, 9, 13, 14, 15]	2290	0.7476	0.7472

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Cui, Y.; Li, X.; Chen, Z.; Li, Y. EEG Fatigue Judgment Method Based on Approximate Nearest Neighbor Search. Computers 2026, 15, 303. https://doi.org/10.3390/computers15050303
