Article

Enhancing Alzheimer’s Diagnosis with Machine Learning on EEG: A Spectral Feature-Based Comparative Analysis

1
Department of Computer Applications, Akkus Vocational School, Ordu University, 52950 Ordu, Türkiye
2
Department of Electrical and Electronics Engineering, Faculty of Engineering, Ondokuz Mayıs University, 55139 Samsun, Türkiye
3
Department of Electrical and Electronic Engineering, Faculty of Engineering, Giresun University, 28200 Giresun, Türkiye
*
Author to whom correspondence should be addressed.
Diagnostics 2025, 15(17), 2190; https://doi.org/10.3390/diagnostics15172190
Submission received: 29 June 2025 / Revised: 8 August 2025 / Accepted: 18 August 2025 / Published: 29 August 2025
(This article belongs to the Special Issue Artificial Intelligence in Brain Diseases)

Abstract

Background/Objectives: Alzheimer’s disease (AD) is a devastating neurodegenerative disorder that progressively impairs cognitive, neurological, and behavioral functions, severely affecting quality of life. The current diagnostic process relies on expert interpretation of extensive clinical assessments, often leading to delays that reduce the effectiveness of early interventions. Given the lack of a definitive cure, accelerating and improving diagnosis is critical to slowing disease progression. Electroencephalography (EEG), a widely used non-invasive technique, captures AD-related brain activity alterations, yet extracting meaningful features from EEG signals remains a significant challenge. This study introduces a machine learning (ML)-driven approach to enhance AD diagnosis using EEG data. Methods: EEG recordings from 36 AD patients, 23 Frontotemporal Dementia (FTD) patients, and 29 healthy individuals (HC) were analyzed. EEG signals were processed within the 0.5–45 Hz frequency range using the Welch method to compute the Power Spectral Density (PSD). From both the time-domain signals and the corresponding PSD, a total of 342 statistical and spectral features were extracted. The resulting feature set was then partitioned into training and test datasets while preserving the distribution of class labels. Feature selection was performed on the training set using Spearman and Pearson correlation analyses to identify the most informative features. To enhance classification performance, hyperparameter tuning was conducted using Bayesian optimization. Subsequently, classification was carried out using Support Vector Machines (SVMs) and k-Nearest Neighbors (k-NN) with the optimized hyperparameters. Results: The SVM classifier achieved a notable accuracy of 96.01%, outperforming previously reported methods.
Conclusions: These results demonstrate the potential of machine learning-based EEG analysis as an effective approach for the early diagnosis of Alzheimer’s Disease, enabling timely clinical intervention and ultimately contributing to improved patient outcomes.

1. Introduction

Technological advances in healthcare have improved the diagnosis and treatment of diseases and, in turn, steadily prolonged human life. As life expectancy increases, the global elderly population is also rising rapidly. This has led in particular to an increase in dementia, a health issue that commonly arises during aging. Dementia is a neurodegenerative disease characterized by declining cognitive and behavioral functions, especially memory, due to the death of brain cells (neurons) caused by aging or specific neurological conditions [1]. According to the “World Alzheimer Report” published in 2018, approximately 50 million people worldwide have dementia, and by 2050, the number of dementia cases is projected to exceed 152 million [2].
Alzheimer’s disease (AD) accounts for a significant proportion of dementia cases, comprising 60–70% [3]. This situation has made AD a global issue [4]. AD is an irreversible neurodegenerative disease characterized by a progressive loss of neurological, mental, and cognitive functions, including changes in emotions, behavior, memory, language, and judgment [5,6,7,8]. In individuals with AD, brain electrical activity slows down compared to healthy individuals, manifesting as impairments in cognitive functions [1]. When examining the age distribution of individuals with this disease, the prevalence of AD is 1% among people aged 60–64. In comparison, this rate rises to 38% in individuals over the age of 85, clearly indicating that AD increases with advancing age [9]. Individuals with this disease are diagnosed based on prolonged tests and examinations by experienced professionals, and the accuracy of these diagnoses ranges between 85% and 93% [10].
Although there is currently no cure for AD, some medications are believed to slow the progression of the disease, and consequently its symptoms, if the diagnosis is made as early as possible. A rapid diagnosis of AD is crucial for administering these medications while they are still effective, which significantly affects the progression of the disease. A prompt diagnosis allows treatment to begin before permanent brain damage occurs and enables the treatment of potential psychiatric symptoms such as depression and psychosis. With accurate diagnosis and treatment, patients may be able to manage their personal needs and care for longer. Additionally, an early diagnosis allows the patient’s relatives to gain information about the disease and make financial and emotional plans for future situations [11]. Considering all these factors, the importance of diagnosing AD becomes clearer for the patient, their relatives, and society as a whole.
EEG (electroencephalogram) signals are recordings of the brain’s electrical activity, captured through electrodes or transducers placed on the scalp. These signals reflect the complex neuronal dynamics associated with brain function [12].
The EEG was first introduced into the literature by Hans Berger in 1929 as a method for recording electrical activity in the human brain [13]. Since Hans Berger’s first observation of pathological EEG sessions in a patient with a confirmed diagnosis of AD [14], numerous studies have been conducted on AD using EEG signals. Particularly in the last 20 years, EEG has been employed as a useful tool for diagnosing dementia [15]. EEG signals are of great importance for diagnosing brain-related diseases. One of the major advantages of EEG signals is their ability to capture brain signals without surgical intervention. EEG signals are also more time- and cost-effective than other methods, significantly increasing their use in diagnosing AD.
When examining the EEG signals of patients with AD, certain abnormalities are detected compared to the EEG signals of healthy individuals. The most notable characteristics of these abnormalities are the slowing of rhythms and decreased coherence between different brain regions. There is an increase in theta and delta band activities and a decrease in alpha and beta band activities. Additionally, there is a decrease in coherence within the alpha and beta bands. These abnormalities increase with the severity of the disease [15].
This study aims to classify EEG signals from individuals with Alzheimer’s Disease (AD), Frontotemporal Dementia (FTD), and healthy individuals (HC) using machine learning (ML) methods after signal processing stages. By facilitating the earliest possible diagnosis of AD and shortening decision-making times for professionals, the goal is to enable patients to lead a more comfortable life for a longer period. In recent years, machine learning techniques have been increasingly employed in healthcare to facilitate early diagnosis and improve clinical outcomes for complex disorders [7,8,9,10,16,17,18]. ML methods were chosen for classification because existing studies report that they outperform traditional approaches. Additionally, this study contributes to the growing body of literature on the use of ML in the diagnosis of dementia and neurodegenerative diseases.

2. Literature Review

In 1984, a report by the National Institute of Neurological and Communicative Disorders and Stroke and the Alzheimer’s Disease and Related Disorders Association highlighted an increase in slow-wave band activity in EEGs of individuals with Alzheimer’s Disease (AD), suggesting EEG as a potential diagnostic tool [19]. Between 1985 and 1990, studies on AD EEG signals generally found increased low-frequency band power and decreased high-frequency band power [20,21].
EEG classification studies, along with the datasets used, frequency bands, feature extraction methods, classifiers employed, performance metrics, reported limitations, and the clinical significance of the results, are summarized in Table 1.

3. Materials and Methods

In this study, a publicly available EEG dataset was employed to support the diagnosis of Alzheimer’s Disease. The data analysis followed a structured pipeline comprising three main stages: pre-processing, feature extraction, and classification, as illustrated in Figure 1. Each stage of this pipeline is described in detail in the subsequent sections to ensure transparency and reproducibility of the methodology.

3.1. Dataset

This study used an open-access EEG dataset recorded by the neurology team of the Second Department of Neurology at Thessaloniki AHEPA General Hospital [30]. The dataset consists of EEG signals recorded from subjects resting with their eyes closed. It includes 36 AD patients, 23 FTD patients, and 29 HC, and the EEG data of all 88 subjects were processed in this study.
The neurological and cognitive status of the subjects was assessed using the International Mini-Mental State Examination (MMSE). The MMSE score ranges from 0 to 30, with lower scores indicating severe cognitive decline. The average MMSE score for AD subjects was 17.75, with a standard deviation of 4.5, while the average MMSE score for FTD subjects was 22.17, with a standard deviation of 8.22. The MMSE score for healthy subjects was reported to be 30. The median duration of the disease among the subjects was 25 months. Table 2 presents the demographic characteristics of AD, FTD, and HC.
The EEG signals of the subjects were recorded using 19 scalp electrodes (Fp1, Fp2, F7, F3, Fz, F4, F8, T3, C3, Cz, C4, T4, T5, P3, Pz, P4, T6, O1, and O2) and two reference electrodes (A1 and A2) from the Nihon Kohden EEG 2100 clinical device, as shown in Figure 2. The electrodes were placed on the scalp according to the international 10–20 electrode placement system. The A1 and A2 reference electrodes were used for impedance control. Before each recording, the skin impedance was adjusted to be below 5 kΩ. The sampling rate of the recordings was 500 Hz with a resolution of 10 μV/mm. Table 3 provides detailed information about the recording parameters of the dataset.
During the acquisition of EEG signals, artifacts from the environment and the subject’s own physiological movements can be introduced into the EEG signal. These artifacts can cause distortions in the EEG signal. To obtain reliable features, the signal is cleaned of artifacts before or after the ADC process using filters [12]. Artifact removal is critical for the effective processing of EEG signals.
This study used EEG recordings that had already been pre-processed and cleaned of artifacts. The researchers who prepared the dataset first applied a 0.5–45 Hz Butterworth band-pass filter and re-referenced the signals to A1-A2. Subsequently, Artifact Subspace Reconstruction (ASR) was applied in EEGLAB to remove system artifacts from the signal. ASR automatically identifies clean portions of the data, removes segments whose variance exceeds a set threshold, and reconstructs the affected channel data from the remaining data [31]. During ASR, the maximum acceptable standard deviation within a 0.5 s window was set to 17. Following this, Independent Component Analysis (ICA) with the RunICA algorithm was applied in EEGLAB, and physiological artifacts were removed as far as possible using the “ICLabel” classification. ICA is based on the principle that signals can be separated into independent components, which allows artifact components to be isolated and removed from the signals [12,32].

3.2. Pre-Processing

In addition to the pre-processing already applied by the dataset’s authors, the EEG recordings, which differ in duration, were segmented into epochs. The literature review shows that EEG recordings can be divided into epochs ranging from 2 to 30 s and can include overlap [10,27,33]. Based on this, the EEG recordings were segmented into 30 s epochs with 50% overlap, as shown in Figure 3. Segmenting the recordings with overlap minimized data loss and increased the number of data segments available for processing.
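The segmentation described above can be sketched in a few lines. This is a minimal illustration, not the authors' MATLAB code: the function name `epoch_bounds` is hypothetical, and dropping a trailing partial epoch is an assumption, since the text does not say how incomplete final segments were handled.

```python
def epoch_bounds(n_samples, fs=500, epoch_s=30.0, overlap=0.5):
    """Return (start, stop) sample indices of fixed-length overlapping epochs."""
    epoch_len = int(round(epoch_s * fs))             # 30 s * 500 Hz = 15000 samples
    step = int(round(epoch_len * (1.0 - overlap)))   # 50% overlap -> 7500-sample hop
    if epoch_len <= 0 or step <= 0:
        raise ValueError("epoch length and hop must be positive")
    # Assumption: a trailing segment shorter than one epoch is discarded.
    return [(s, s + epoch_len)
            for s in range(0, n_samples - epoch_len + 1, step)]


# Hypothetical 2-minute recording sampled at 500 Hz (60000 samples).
bounds = epoch_bounds(60_000)
print(len(bounds), bounds[0], bounds[-1])
```

With a 15 s hop, each sample away from the edges of the recording falls into two epochs, which is why overlapping roughly doubles the number of segments.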

3.3. Feature Extraction

The primary goal of feature extraction is to identify distinctive and meaningful features in the pre-processed signals and to assemble them into a feature vector. Classifying this reduced representation rather than the full signal decreases the amount of data, which speeds up training and can improve the model’s accuracy [7].
Distinctive and meaningful features of the signals can appear in the time, frequency, and time–frequency domains. In the time domain, features such as statistical measures and Hjorth parameters of the signal are obtained [34]. When the features in the time domain are insufficient for analyzing and classifying the signal, the signal is transformed into the frequency or time–frequency domain, where additional features are explored.
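As an illustration of the time-domain features mentioned above, the Hjorth parameters can be computed from the variance of a signal and of its successive differences. This is a sketch using the standard definitions; the function name is hypothetical, and the text here does not confirm which time-domain features were actually used in the study.

```python
import numpy as np


def hjorth_parameters(x):
    """Return Hjorth activity, mobility, and complexity of a 1-D signal."""
    x = np.asarray(x, dtype=float)
    dx = np.diff(x)      # first difference (discrete derivative, per sample)
    ddx = np.diff(dx)    # second difference
    var_x, var_dx, var_ddx = np.var(x), np.var(dx), np.var(ddx)
    activity = var_x                                   # signal power
    mobility = np.sqrt(var_dx / var_x)                 # proxy for mean frequency
    complexity = np.sqrt(var_ddx / var_dx) / mobility  # 1.0 for a pure sine
    return float(activity), float(mobility), float(complexity)


# Toy check on a 10 Hz sine sampled at 500 Hz for 30 s.
t = np.arange(15000) / 500.0
activity, mobility, complexity = hjorth_parameters(np.sin(2 * np.pi * 10.0 * t))
print(activity, mobility, complexity)
```

Because the differences are taken per sample rather than per second, mobility here is expressed in radians per sample; dividing the differences by the sampling interval would express it in radians per second instead.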
This study performed feature extraction on the pre-processed signals in both the time and spectral domains. Each channel of the EEG signal was converted from the time domain to the spectral domain using Welch’s spectral analysis in the 0.5–45 Hz range. Spectral analysis is commonly used to distinguish signals and extract information from the relevant data [35].
Welch’s spectral analysis divides the input signal into overlapping segments. A chosen window function is applied to each segment. The Fast Fourier Transform (FFT) is applied to the windowed segments to compute the periodogram of each segment, and the average periodogram of the windowed segments is calculated as shown in Equation (1) [35].
$$S(w) = \frac{1}{L} \sum_{l=1}^{L} I_l(w) \tag{1}$$
where $L$ is the total number of windowed segments, $I_l(w)$ is the periodogram of the $l$-th windowed segment, and $S(w)$ is the average periodogram.
In this study, the “Hamming” window function was chosen, and the overlap rate was set to 75% [33] to obtain the PSD of the signal.
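A minimal NumPy sketch of Welch's method with a Hamming window and 75% overlap is given below. The segment length `nperseg=1024` is an assumption, because the text does not report it, and the per-segment mean removal and one-sided density scaling are common conventions rather than details taken from the study.

```python
import numpy as np


def welch_psd(x, fs=500.0, nperseg=1024, overlap=0.75, fmin=0.5, fmax=45.0):
    """Average windowed periodograms (Welch) and keep the fmin-fmax range."""
    x = np.asarray(x, dtype=float)
    if nperseg % 2 or len(x) < nperseg:
        raise ValueError("nperseg must be even and no longer than the signal")
    step = max(1, int(nperseg * (1.0 - overlap)))   # 75% overlap -> hop of nperseg/4
    window = np.hamming(nperseg)
    scale = fs * np.sum(window ** 2)                # power spectral density scaling
    periodograms = []
    for start in range(0, len(x) - nperseg + 1, step):
        seg = x[start:start + nperseg]
        seg = seg - seg.mean()                      # assumption: remove the segment mean
        power = np.abs(np.fft.rfft(seg * window)) ** 2 / scale
        power[1:-1] *= 2.0                          # one-sided spectrum (not DC/Nyquist)
        periodograms.append(power)
    psd = np.mean(periodograms, axis=0)             # average over the L segments
    freqs = np.fft.rfftfreq(nperseg, d=1.0 / fs)
    keep = (freqs >= fmin) & (freqs <= fmax)
    return freqs[keep], psd[keep]


# Toy check: a 10 Hz sine should peak near 10 Hz.
fs = 500.0
t = np.arange(15000) / fs
freqs, psd = welch_psd(np.sin(2 * np.pi * 10.0 * t), fs=fs)
peak_hz = freqs[np.argmax(psd)]
print(peak_hz)
```

A longer segment gives finer frequency resolution (here 500/1024, about 0.49 Hz per bin) at the cost of fewer segments to average, so the choice of `nperseg` trades resolution against variance of the estimate.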
For each of the 19 EEG channels, 7 features were extracted in the time domain and 11 in the spectral domain, giving a total of 18 features per channel. These features were calculated using the formulas in Table 4.
Spectral-domain feature extraction was performed using frequency ranges for 5 sub-bands, as shown in Table 5. The power of these frequency ranges for the defined sub-bands was calculated using the BP formula in Equation (9) in Table 4. The ratios between the bands were obtained in Equation (10) in Table 4.
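The band-power and band-ratio features can be illustrated as follows. The band edges in `BANDS` are conventional values chosen for illustration only, since Table 5 is not reproduced here, and approximating band power as the sum of PSD bins times the bin width is an assumption about the BP formula in Table 4.

```python
import numpy as np

# Hypothetical band edges in Hz; the study's actual ranges are given in its Table 5.
BANDS = {
    "delta": (0.5, 4.0),
    "theta": (4.0, 8.0),
    "alpha": (8.0, 13.0),
    "beta": (13.0, 30.0),
    "gamma": (30.0, 45.0),
}


def band_powers(freqs, psd, bands=BANDS):
    """Approximate the power in each band by summing PSD bins times bin width."""
    freqs = np.asarray(freqs, dtype=float)
    psd = np.asarray(psd, dtype=float)
    df = freqs[1] - freqs[0]                 # assumes a uniform frequency grid
    powers = {}
    for name, (lo, hi) in bands.items():
        mask = (freqs >= lo) & (freqs < hi)  # half-open band [lo, hi)
        powers[name] = float(psd[mask].sum() * df)
    return powers


def band_ratio(powers, numerator, denominator):
    """Ratio of two band powers, e.g. theta/alpha."""
    return powers[numerator] / powers[denominator]


# Toy flat spectrum on a 0.5 Hz grid: band power is then proportional to band width.
freqs = np.arange(0.5, 45.0, 0.5)
psd = np.ones_like(freqs)
powers = band_powers(freqs, psd)
theta_alpha = band_ratio(powers, "theta", "alpha")
print(powers, theta_alpha)
```

Ratios such as theta/alpha are insensitive to the overall amplitude of a recording, which is one reason band ratios are often used alongside absolute band powers.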
Before proceeding to the feature selection phase, the feature vectors were divided into two sets using systematic sampling: 70% for training and 30% for testing.
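One way to realize a 70/30 systematic split that preserves class proportions is to send fixed positions of every ten consecutive samples of each class to the test set. The exact scheme used in the study is not described, so the function name, the period of 10, and the chosen test positions below are all assumptions.

```python
def systematic_split(labels, test_positions=(2, 5, 8), period=10):
    """Split sample indices per class: fixed positions in each period go to test."""
    train_idx, test_idx = [], []
    seen = {}                                # running count of samples per class
    for i, label in enumerate(labels):
        pos = seen.get(label, 0)
        if pos % period in test_positions:   # 3 of every 10 -> about 30% test
            test_idx.append(i)
        else:
            train_idx.append(i)
        seen[label] = pos + 1
    return train_idx, test_idx


# Toy labels with unequal class sizes.
labels = ["AD"] * 20 + ["FTD"] * 10 + ["HC"] * 10
train_idx, test_idx = systematic_split(labels)
print(len(train_idx), len(test_idx))
```

This sketch operates on epochs, not subjects, so overlapping epochs from the same recording can land on both sides of the split; a subject-wise split would instead assign all epochs of a subject to one side.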

3.4. Feature Selection

A feature vector consisting of a total of 342 features (19 channels × 18 features per channel) was created from the EEG signals. High-dimensional feature vectors can contain irrelevant or redundant data, which may prolong and complicate the learning process during classification. Feature selection is performed to speed up the learning process and improve class discrimination. Principal Component Analysis (PCA), Independent Component Analysis (ICA), Correlation Coefficient, and Conditional Mutual Information Maximization (CMIM) are some approaches used for feature selection [36].
In this study, the Correlation Coefficient approach was applied for feature selection. In this approach, the relationship between each individual feature and the class label is examined to determine how much the feature contributes to class separation, and the features are ranked by this contribution [36]. Two correlation coefficients, Spearman and Pearson, were computed and compared to decide which to use for selection. For each, the average of the correlation coefficients was used as a threshold, and the features exceeding this threshold were used to create new feature vectors.
Spearman’s correlation coefficient is calculated as shown in Equation (11) [37], and Pearson’s correlation coefficient is calculated as shown in Equation (12) [38]:
$$r_s = 1 - \frac{6 \sum d_i^2}{N (N^2 - 1)} \tag{11}$$
where $r_s$ is Spearman’s correlation coefficient, $d_i$ is the difference between the ranks of each pair of variables, and $N$ is the total number of samples.
$$p(x, y) = \frac{\sum xy}{\sigma_x \sigma_y} \tag{12}$$
where $p(x, y)$ is the Pearson correlation coefficient between variables $x$ and $y$, $\sum xy$ is the cross-correlation between $x$ and $y$, and $\sigma_x$ and $\sigma_y$ are the standard deviations of the $x$ and $y$ signals, respectively.
Since negative correlation coefficients indicate an inverse relationship, the absolute values of the coefficients were taken, and the threshold value $R_{TH}$ was calculated for both correlation coefficient approaches as given in Equation (13).
$$R_{TH} = \frac{1}{K} \sum_{k=1}^{K} r_k \tag{13}$$
where $R_{TH}$ is the average of the correlation coefficients, $K$ is the number of features, and $r_k$ is the absolute correlation coefficient for the $k$-th feature.
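The mean-threshold selection rule can be sketched as below. Spearman's coefficient is computed as the Pearson correlation of average ranks, which matches Equation (11) when there are no ties. The numeric encoding of the class labels is an assumption, since the text does not state how labels were encoded, and all function names are hypothetical.

```python
import numpy as np


def average_ranks(values):
    """Rank values from 1..N, giving tied values the mean of their ranks."""
    values = np.asarray(values, dtype=float)
    order = np.argsort(values, kind="mergesort")
    ranks = np.empty(len(values), dtype=float)
    i = 0
    while i < len(values):
        j = i
        while j + 1 < len(values) and values[order[j + 1]] == values[order[i]]:
            j += 1
        ranks[order[i:j + 1]] = 0.5 * (i + j) + 1.0   # mean of ranks i+1 .. j+1
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two 1-D arrays (0.0 if either is constant)."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    y = np.asarray(y, dtype=float) - np.mean(y)
    denom = np.sqrt(np.sum(x ** 2) * np.sum(y ** 2))
    return float(np.sum(x * y) / denom) if denom > 0 else 0.0


def spearman(x, y):
    """Spearman correlation as the Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def select_by_mean_threshold(X, y, corr=spearman):
    """Keep features whose |correlation with y| exceeds the mean |correlation|."""
    X = np.asarray(X, dtype=float)
    r = np.array([abs(corr(X[:, k], y)) for k in range(X.shape[1])])
    r_th = float(r.mean())                             # threshold R_TH
    return np.flatnonzero(r > r_th), r_th, r


# Toy training data: features 0 and 2 track the label, feature 1 does not.
y_train = np.array([0, 0, 1, 1, 2, 2])                 # assumed numeric label encoding
f0 = np.array([0.1, 0.2, 1.1, 1.3, 2.0, 2.5])
X_train = np.column_stack([f0, [1, 0, 1, 0, 1, 0], -f0])
selected, r_th, r = select_by_mean_threshold(X_train, y_train)
print(selected, r_th)
```

The coefficients and the threshold should be computed on the training set only, and the resulting column indices then applied unchanged to the test set, so that no information from the test data influences the selection.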
Feature selection was performed using whichever of the Pearson and Spearman correlation coefficients provided the best threshold value, resulting in a new feature vector. The study used three feature vectors: one containing all features without selection, and two containing the more meaningful features retained by the feature selection process. To visualize these three high-dimensional datasets, t-Distributed Stochastic Neighbor Embedding (t-SNE) [39] was used.

3.5. Hyperparameter Optimization

Hyperparameter optimization aims to determine the optimal parameter combinations of the classification algorithms to be used prior to the training phase, in order to achieve the best training performance. Hyperparameters can be determined in two main ways: manual search or automated search. In manual search, the parameters of the chosen classification algorithm are adjusted one by one to reach the best possible performance, which is often time-consuming and computationally expensive. In contrast, automated search involves a systematic exploration of the parameter space to identify the combination that yields the best performance [40]. In this study, we employed a Gaussian Process (GP) Bayesian optimization method [41,42], which is widely used for optimizing objective functions that are costly to evaluate. The mathematical expressions used in Bayesian optimization are presented below.
$$x^* = \arg\min_{x \in H} f(x) \tag{14}$$
$$f(x) \sim GP\left(\mu(x), \sigma^2(x)\right) \tag{15}$$
$$EI(x) = E\left[\max\left(0, f(x_{best}) - f(x)\right)\right] \tag{16}$$
$$P(f \mid D_{new}) = \frac{P(D_{new} \mid f)\, P(f \mid D_{old})}{P(D_{new})} \tag{17}$$
where $x^*$ denotes the point that minimizes the objective function, while $H$ defines the search space over which the optimization is to be performed. The unknown function $f(x)$ is approximated using a Gaussian Process defined by a mean function $\mu(x)$ and a variance $\sigma^2(x)$, which capture the expected value and the uncertainty of the prediction at each point, respectively. The term $f(x_{best})$ refers to the best function value observed so far, and $EI(x)$ quantifies the expected improvement resulting from evaluating point $x$. The posterior distribution $P(f \mid D_{new})$ is obtained by updating the prior $P(f \mid D_{old})$ with new data $D_{new}$, using the likelihood $P(D_{new} \mid f)$. The term $P(D_{new})$ serves as a normalization constant, ensuring the posterior is a valid probability distribution.
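When the Gaussian Process posterior at a candidate point is normal with mean μ and standard deviation σ, the expected improvement for a minimization problem has a well-known closed form, sketched below. The function name and the example values are hypothetical; this illustrates the acquisition step only, not the optimizer used in the study.

```python
import math


def expected_improvement(mu, sigma, f_best):
    """Closed-form EI for minimization under a Gaussian posterior N(mu, sigma^2)."""
    if sigma <= 0.0:
        return max(0.0, f_best - mu)     # no uncertainty: improvement is deterministic
    z = (f_best - mu) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)   # standard normal density
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))          # standard normal CDF
    return (f_best - mu) * cdf + sigma * pdf


# Hypothetical candidates: A is slightly better on average, B is more uncertain.
f_best = 0.11                            # best (lowest) validation error so far
ei_a = expected_improvement(mu=0.10, sigma=0.01, f_best=f_best)
ei_b = expected_improvement(mu=0.12, sigma=0.05, f_best=f_best)
print(ei_a, ei_b)
```

In this example the more uncertain candidate B receives the higher expected improvement even though its predicted mean is worse, which shows how the criterion balances exploiting good predictions against exploring uncertain regions.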

3.6. Classification

ML, a subfield of artificial intelligence, can be defined as computer models and algorithms that automatically learn from data and experiences using mathematics, statistics, optimization, and knowledge discovery to solve tasks or problems [43]. ML can be categorized into supervised, unsupervised, reinforcement, and deep learning [44].
Supervised learning involves creating a model by using relationships between a predefined set of inputs and target outputs to train the system. Supervised learning algorithms are divided into classification and regression [43].
Classification is one of the supervised learning algorithms in ML. It involves labeling data to determine which class it belongs to, using an algorithm to train the feature vectors allocated for training, and then deciding which class an unknown feature vector belongs to through a decision mechanism [45].
This study used SVM and k-NN classification algorithms to diagnose AD using EEG signal feature vectors. SVM is a powerful ML model based on kernels. SVM is a learning algorithm aimed at classifying data by finding the optimal hyperplane in a space called the feature space, which is formed from the training data [46]. The position and orientation of the hyperplane are adjusted to achieve the best classification [47]. When the data in the feature space is not linearly separable, the feature space is transformed into a higher-dimensional space using the “kernel trick” to classify the data [48]. The quadratic kernel function used in this study is calculated as shown in Equation (18) [49]:
Q x i x j = x i x j 2 + c 2
k-NN is a learning algorithm that aims to classify based on the distance to the nearest neighbors (k) in the feature space of the attributes [50]. Distance measurements are calculated according to distance metrics such as Euclidean, Cosine, Chebyshev, and Mahalanobis. The distance metric used in this study is cosine, calculated as shown in Equation (19) [51]:
$$\cos(x, y) = 1 - \frac{\sum_{i=1}^{N} x_i y_i}{\sqrt{\sum_{i=1}^{N} x_i^2}\, \sqrt{\sum_{i=1}^{N} y_i^2}} \tag{19}$$
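The cosine distance and a small k-NN vote can be sketched in plain Python as follows. The squared-inverse distance weighting mirrors the distance weight reported for the k-NN model in this study, while the function names, the small constant `eps` that guards against division by zero, and the toy data are assumptions made for illustration.

```python
import math
from collections import defaultdict


def cosine_distance(x, y):
    """One minus the cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0.0 or norm_y == 0.0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_x * norm_y)


def knn_predict(train_X, train_y, query, k=3, eps=1e-12):
    """Classify `query` by a squared-inverse-distance-weighted vote of k neighbors."""
    neighbors = sorted(
        (cosine_distance(x, query), label) for x, label in zip(train_X, train_y)
    )[:k]
    votes = defaultdict(float)
    for dist, label in neighbors:
        votes[label] += 1.0 / (dist * dist + eps)   # closer neighbors weigh more
    return max(votes, key=votes.get)


# Toy 2-D feature vectors with hypothetical labels.
train_X = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
train_y = ["AD", "AD", "HC", "HC"]
predicted = knn_predict(train_X, train_y, [1.0, 0.2], k=3)
print(predicted)
```

Cosine distance depends only on the direction of the feature vectors, not their magnitude, so two epochs with the same relative feature profile are treated as close even if their overall scale differs.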

3.7. Performance Evaluation

A confusion matrix is a metric table used to understand and evaluate the performance of classification algorithms by calculating statistical measurements [52]. The confusion matrix visualizes the model’s actual and predicted classes and calculates performance metrics such as accuracy, recall, specificity, precision, and F1 score [53]. The structure of the confusion matrix is shown in Table 6. In the table, TP, FP, TN, and FN represent True Positives, False Positives, True Negatives, and False Negatives, respectively [54].
In this study, the metrics and formulas used to evaluate classification performance with the confusion matrix are provided in Table 7. Accuracy is the proportion of correct predictions among all data points assessed [55]. Recall (sensitivity) is the proportion of actual positives that are correctly classified [56]. Specificity is the proportion of actual negatives that are correctly classified [55]. Precision is the proportion of truly positive examples among those predicted as positive [56]. Negative predictive value (NPV) is the proportion of truly negative cases among those predicted as negative by the model. False discovery rate (FDR) is the proportion of actually negative cases among the samples that the model predicted as positive [57]. Balanced Classification Rate (BCR) is the average of per-class sensitivity and specificity, providing a balanced measure for multi-class classification tasks [58]. The F1 Score is the harmonic mean of recall and precision [55]. In addition to these metrics, the Receiver Operating Characteristic (ROC) curve is used to evaluate and visualize the performance of the classification algorithms. The ROC curve plots sensitivity (on the y-axis) against 1 − specificity, the false positive rate (on the x-axis), at different decision thresholds of the model. The area under the plotted curve is called the Area Under Curve (AUC). The AUC value ranges from 0 to 1, and values closer to 1 indicate better classification performance [59].
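For a three-class problem, these metrics are usually computed per class in a one-vs-rest fashion from the confusion matrix. The sketch below uses the standard definitions; the row/column orientation (rows are true classes, columns are predicted classes) and the example counts are assumptions and are not taken from the study's results.

```python
def _ratio(numerator, denominator):
    """Safe division that returns NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def per_class_metrics(cm, k):
    """One-vs-rest metrics for class k; cm[i][j] counts true class i predicted as j."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    tp = cm[k][k]
    fn = sum(cm[k]) - tp                          # class k predicted as something else
    fp = sum(cm[i][k] for i in range(n)) - tp     # other classes predicted as k
    tn = total - tp - fn - fp
    sensitivity = _ratio(tp, tp + fn)
    specificity = _ratio(tn, tn + fp)
    precision = _ratio(tp, tp + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "npv": _ratio(tn, tn + fn),
        "fdr": _ratio(fp, tp + fp),
        "bcr": 0.5 * (sensitivity + specificity),
        "f1": _ratio(2.0 * precision * sensitivity, precision + sensitivity),
    }


def overall_accuracy(cm):
    """Fraction of samples on the diagonal of the confusion matrix."""
    correct = sum(cm[i][i] for i in range(len(cm)))
    return _ratio(correct, sum(sum(row) for row in cm))


# Hypothetical 3-class confusion matrix (rows/columns: AD, FTD, HC).
cm = [[50, 3, 2],
      [4, 30, 1],
      [1, 2, 40]]
accuracy = overall_accuracy(cm)
ad_metrics = per_class_metrics(cm, 0)
print(accuracy, ad_metrics)
```

Averaging the per-class values over the three classes gives macro-averaged metrics, which weight each class equally regardless of how many epochs it contributes.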

4. Results and Discussion

In this study, EEG signals obtained from AD patients, FTD patients, and HC were passed through pre-processing and feature extraction stages and classified using ML methods, specifically the SVM and k-NN algorithms. The aim was to detect whether the EEG signals in the test data, which were separated through systematic sampling, indicated the presence of Alzheimer’s Disease. This section presents the methods used and their classification results and evaluates the performance of the classification algorithms.
All stages of the study were conducted using MATLAB R2021b. In the initial stage, EEG signals obtained from the dataset were pre-processed and segmented into 30 s epochs with 15 s overlaps. The resulting number of epochs for AD, FTD, and HC is 1888, 1074, and 1563, respectively.
After the pre-processing stage, features were extracted from EEG signals both in the time and spectral domains. Welch spectral analysis was used to obtain the PSD of the EEG signals, and a transition to the spectral domain was performed.
During feature extraction, seven features were extracted from the time domain for each channel of the EEG signal. In the spectral domain, the signal was divided into five frequency bands, and each band’s power and the bands’ ratios other than the gamma band were calculated. Since AD primarily affects low-frequency wave activities, the ratio of the gamma band to the other bands was not considered. For the spectral domain, 11 features were extracted from each channel of the EEG signal.
For each 30 s EEG signal from 19 channels, 342 features (19 channels × 18 features) were extracted, resulting in a feature vector of size 4521 × 342.
The feature vectors, consisting of 342 features, were divided through systematic sampling into a training dataset (70%) and a test dataset (30%), so that the system could be evaluated on previously unseen test data. Systematic sampling was used to ensure that data from a particular group did not dominate either dataset and that the class distribution was preserved in both. Table 8 shows that the number of samples in the AD group is higher than in the FTD and HC groups. This is because the AD group has more EEG recordings and longer recording durations than the FTD and HC groups. Percentile distributions were adjusted based on these values through systematic sampling.
To evaluate the effectiveness and discriminatory power of the classification, Table 9 presents the first 18 features extracted from the records of AD, HC, and FTD individuals (data numbers 1, 1889, and 3452, respectively), together with the Spearman and Pearson correlation coefficients for these features. Spearman and Pearson correlation coefficients were calculated only for the training dataset in order to prevent data leakage.
To determine the features with the best correlation with the labels, the absolute values of the negative correlation coefficients were taken, and the $R_{TH}$ value was calculated for both correlation methods. For the Spearman correlation coefficient method, $R_{TH\_spearman} = 0.1398$ was computed. For the Pearson correlation coefficient method, $R_{TH\_pearson} = 0.0969$ was computed.
When both correlation methods were examined, the Spearman correlation coefficient method was found to have the better (higher) threshold value. There were 130 features above the threshold with the Spearman method and 137 with the Pearson method. Based on these results, feature selection was performed using the Spearman correlation coefficient. A new, smaller feature vector was created with the 130 more meaningful features above the $R_{TH\_spearman}$ value. The 130 features selected according to the Spearman correlation coefficient approach are listed in Table 10.
When examining the features selected using the Spearman correlation coefficient method, the feature with the highest correlation to the label is feature number 268, Pz/TABPR. The Pz/TABPR feature represents the theta-alpha band power ratio in the parietal region of the brain, with a correlation coefficient of 0.5259. All of the 130 selected features were derived from the spectral domain. This result demonstrates the importance of feature extraction in the spectral domain for the classification stage. A new training set was created by selecting the top 50 features out of the 130 selected features. These three high-dimensional datasets were visualized by applying t-SNE in two-dimensional form and are shown in Figure 4.
For the classification using the SVM and k-NN algorithms, the best training accuracy rates were achieved by determining some hyperparameters of the algorithms through Bayesian optimization.
In the SVM classification algorithm, the kernel function was set to “quadratic” and the kernel scale to “automatic”, and the box constraint level was then determined by the Bayesian optimization method. As a result of this process, the box constraint value and classification accuracy were found as follows: 2.4972 and 90.81% for the training dataset with 342 features, 3.9812 and 92.60% for the training dataset with 130 features, and 5.5952 and 88.06% for the training dataset with 50 features. Different parameter values were also analyzed, and a portion of the results from this analysis is provided in Table 11.
In the k-NN classification algorithm, after selecting the distance metric as “cosine” and the distance weight as “squared inverse,” the number of neighbors (k) was determined using the Bayesian optimization method. As a result of this process, the k value and classification accuracy were found as follows: k = 6 and 77.88% accuracy for the training dataset with 342 features, k = 2 and 90.05% accuracy for the training dataset with 130 features, and k = 4 and 89.03% accuracy for the training dataset with 50 features. A portion of the analysis of the k-NN algorithm parameters is provided in Table 12.
Based on the parameters determined, training datasets consisting of 342, 130, and 50 features were trained using the SVM and k-NN classification algorithms. After training, the classification algorithms were tested using a test dataset that the system had not previously encountered, resulting in confusion matrices and ROC curves. The obtained confusion matrices are presented in Figure 5, and the ROC curves are shown in Figure 6.
The accuracy, sensitivity, specificity, precision, NPV, FDR, BCR, F1 Score, and AUC values were calculated from the confusion matrices of the trained models. As shown in Table 13, the best classification accuracy performance was achieved with the SVM algorithm using the feature vector of 130 features, with an accuracy of 96.01%. When comparing accuracy performances, models trained with 130 features outperformed those trained with 342 features and 50 features. The obtained accuracy rates are presented as a bar chart in Figure 7.
Table 14 summarizes the methods and performance results of previous studies that used the same dataset for Alzheimer’s disease (AD) diagnosis from EEG signals.
When diagnosing AD from EEG signals, choices such as the feature extraction technique, signal range, and classification algorithm directly influence the system’s performance.
Against this background, the proposed system achieves higher accuracy than the other studies listed in Table 14.

5. Conclusions

This study introduces a machine learning-based approach for diagnosing AD using EEG signals, focusing on enhancing accuracy and efficiency in the diagnostic process. The system utilized SVM and k-NN classifiers, with features extracted from the time and spectral domains. The selection of 130 highly discriminative features, predominantly derived from spectral analysis, was crucial in improving classification accuracy.
The SVM algorithm, particularly when applied to the reduced feature set, achieved an accuracy of 96.01%, outperforming other studies that used the same dataset. This improvement underscores the effectiveness of the feature selection process and of tuning the classifier parameters through Bayesian optimization.
The results of this study demonstrate the potential of machine learning to automate and expedite the AD diagnostic process, offering a valuable tool to complement traditional methods. However, despite these promising outcomes, achieving the highest possible accuracy remains essential, especially given the profound impact of AD on individuals and their families.
Despite the promising results obtained in this study, several limitations should be taken into account. Firstly, the dataset was not split on a subject-wise basis, which may have led to data leakage. Additionally, the use of only a single dataset limits the generalizability of the proposed approach. The study employed a limited number of feature selection methods, and the absence of alternative techniques such as ReliefF and statistical approaches may have resulted in overlooking more appropriate feature subsets. Furthermore, only two classification algorithms (SVM and k-NN) were used, and no comparisons were made with advanced techniques such as deep learning or ensemble methods. Lastly, no analysis was performed to identify which EEG channels contributed most significantly to classification performance; however, such channel-based analyses could enhance the diagnosis of Alzheimer’s disease.
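The first limitation can be illustrated with a short sketch of a subject-wise split, in which all epochs from one subject are assigned to the same partition so that no subject contributes to both training and testing. This is not the procedure used in the present study; the function name, the subject identifiers, and the 30% test fraction are assumptions for illustration, and the sketch does not additionally balance class labels.

```python
import random

def subject_wise_split(epoch_subjects, test_fraction=0.3, seed=0):
    """Split epoch indices so that no subject appears in both partitions."""
    subjects = sorted(set(epoch_subjects))
    rng = random.Random(seed)
    rng.shuffle(subjects)
    # Hold out whole subjects, at least one.
    n_test = max(1, round(len(subjects) * test_fraction))
    test_subjects = set(subjects[:n_test])
    train_idx = [i for i, s in enumerate(epoch_subjects) if s not in test_subjects]
    test_idx = [i for i, s in enumerate(epoch_subjects) if s in test_subjects]
    return train_idx, test_idx

# Hypothetical subject IDs, one entry per epoch.
epoch_subjects = ["s01"] * 4 + ["s02"] * 3 + ["s03"] * 5 + ["s04"] * 2
train_idx, test_idx = subject_wise_split(epoch_subjects)
print(len(train_idx), len(test_idx))
```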
In future work, we aim to address the limitations of the current study by enhancing the feature extraction process and investigating alternative feature selection techniques to further improve classification performance. Additionally, we plan to explore advanced classification methods, including ensemble learning algorithms and deep neural network architectures, and systematically compare their performance with traditional machine learning models. These advancements are expected to contribute to the development of a more robust and generalizable framework for EEG-based Alzheimer’s Disease diagnosis. Ultimately, such efforts could lead to improved diagnostic accuracy, earlier detection, and better patient care.

Author Contributions

Conceptualization, Y.S. and C.K.; methodology, Y.S. and C.K.; software, Y.S.; validation, Y.S., C.K., and F.O.; formal analysis, Y.S.; investigation, Y.S.; resources, Y.S.; data curation, Y.S.; writing—original draft preparation, Y.S., C.K., and F.O.; writing—review and editing, Y.S., C.K., and F.O.; visualization, Y.S. and C.K.; supervision, C.K. All authors have read and agreed to the published version of the manuscript.

Funding

This study received no external funding.

Institutional Review Board Statement

This study did not involve direct human or animal subjects, and ethical approval was therefore not required. The EEG data used in this study were obtained from a publicly available, anonymized dataset with all necessary ethical approvals already granted.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data supporting this study’s findings are available from the corresponding author upon reasonable request.

Conflicts of Interest

The authors have no competing interests to declare relevant to this article’s content.

Abbreviations

The following abbreviations are used in this manuscript:
AD: Alzheimer’s Disease
AUC: Area Under Curve
BILSTM: Bidirectional Long Short-Term Memory
BCR: Balanced Classification Rate
CC: Conventional Coherence
CNN: Convolutional Neural Networks
CWT: Continuous Wavelet Transform
DT: Decision Trees
DWT: Discrete Wavelet Transform
ELM: Extreme Learning Machine
EMD: Empirical Mode Decomposition
FDR: False Discovery Rate
FFT: Fast Fourier Transform
FTD: Frontotemporal Dementia
HC: Healthy Individuals
ICA: Independent Component Analysis
k-NN: k-Nearest Neighbors
ML: Machine Learning
MLP: Multilayer Perceptron Model
MMSE: Mini-Mental State Examination
NN: Neural Network
NPV: Negative Predictive Value
PCLDA: Principal Component Linear Discriminant Analysis
PCLR: Principal Component Logistic Regression
PLSLDA: Partial Least Squares LDA
PNN: Probabilistic Neural Network
PSD: Power Spectral Density
QDA: Quadratic Discriminant Analysis
RF: Random Forest
ROC: Receiver Operating Characteristic
SVMs: Support Vector Machines
t-SNE: t-distributed Stochastic Neighbor Embedding
WC: Wavelet Coherence

References

  1. Safi, M.S.; Safi, S.M.M. Early detection of Alzheimer’s disease from EEG signals using Hjorth parameters. Biomed. Signal Process. Control 2021, 65, 102338.
  2. Patterson, C. World Alzheimer report 2018: The state of the art of dementia research: New frontiers. Alzheimer’s Dis. Int. (ADI) 2018, 2, 14–20.
  3. Durongbhan, P.; Zhao, Y.; Chen, L.; Zis, P.; De Marco, M.; Unwin, Z.C.; Sarrigiannis, P.G. A dementia classification framework using frequency and time-frequency features based on EEG signals. IEEE Trans. Neural Syst. Rehabil. Eng. 2019, 27, 826–835.
  4. Alsubaie, M.G.; Luo, S.; Shaukat, K. Alzheimer’s disease detection using deep learning on neuroimaging: A systematic review. Mach. Learn. Knowl. Extr. 2024, 6, 464–505.
  5. Fonteijn, H.M.; Modat, M.; Clarkson, M.J.; Barnes, J.; Lehmann, M.; Hobbs, N.Z.; Alexander, D.C. An event-based model for disease progression and its application in familial Alzheimer’s disease and Huntington’s disease. NeuroImage 2012, 60, 1880–1889.
  6. Ghanemi, A. Alzheimer’s disease therapies: Selected advances and future perspectives. Alex. J. Med. 2015, 51, 1–3.
  7. AlSharabi, K.; Salamah, Y.B.; Abdurraqeeb, A.M.; Aljalal, M.; Alturki, F.A. EEG signal processing for Alzheimer’s disorders using discrete wavelet transform and machine learning approaches. IEEE Access 2022, 10, 89781–89797.
  8. Alberdi, A.; Aztiria, A.; Basarab, A. On the early diagnosis of Alzheimer’s Disease from multimodal signals: A survey. Artif. Intell. Med. 2016, 71, 1–29.
  9. Ruiz-Gómez, S.J.; Gómez, C.; Poza, J.; Gutiérrez-Tobal, G.C.; Tola-Arribas, M.A.; Cano, M.; Hornero, R. Automated multiclass classification of spontaneous EEG activity in Alzheimer’s disease and mild cognitive impairment. Entropy 2018, 20, 35.
  10. Miltiadous, A.; Gionanidis, E.; Tzimourta, K.D.; Giannakeas, N.; Tzallas, A.T. DICE-net: A novel Convolution-Transformer Architecture for Alzheimer Detection in EEG Signals. IEEE Access 2023, 11, 71840–71858.
  11. Dauwels, J.; Vialatte, F.; Cichocki, A. Diagnosis of Alzheimer’s disease from EEG signals: Where are we standing? Curr. Alzheimer Res. 2010, 7, 487–505.
  12. Sanei, S.; Chambers, J.A. EEG Signal Processing; John Wiley & Sons: Hoboken, NJ, USA, 2007.
  13. Tsolaki, A.; Kazis, D.; Kompatsiaris, I.; Kosmidou, V.; Tsolaki, M. Electroencephalogram and Alzheimer’s disease: Clinical and research approaches. Int. J. Alzheimer’s Dis. 2014, 2014, 349249.
  14. Berger, H. Über das Elektrenkephalogramm des Menschen: Dritte Mitteilung. Arch. Psychiatr. Nervenkr. 1931, 94, 16–60.
  15. Jeong, J. EEG dynamics in patients with Alzheimer’s disease. Clin. Neurophysiol. 2004, 115, 1490–1505.
  16. Özbilgin, F.; Kurnaz, Ç.; Aydın, E. Prediction of coronary artery disease using machine learning techniques with iris analysis. Diagnostics 2023, 13, 1081.
  17. Özbilgin, F.; Kurnaz, Ç.; Aydın, E. Non-invasive coronary artery disease identification through the iris and bio-demographic health profile features using stacking learning. Image Vis. Comput. 2024, 146, 105046.
  18. Özbilgin, F.; Kurnaz, Ç. An alternative approach for determining the cholesterol level: Iris analysis. Int. J. Imaging Syst. Technol. 2022, 32, 1159–1171.
  19. McKhann, G.; Drachman, D.; Folstein, M.; Katzman, R.; Price, D.; Stadlan, E.M. Clinical diagnosis of Alzheimer’s disease: Report of the NINCDS-ADRDA Work Group under the auspices of Department of Health and Human Services Task Force on Alzheimer’s Disease. Neurology 1984, 34, 939.
  20. Penttilä, M.; Partanen, J.V.; Soininen, H.; Riekkinen, P.J. Quantitative analysis of occipital EEG in different stages of Alzheimer’s disease. Electroencephalogr. Clin. Neurophysiol. 1985, 60, 1–6.
  21. Brenner, R.P.; Reynolds, C.F., III; Ulrich, R.F. Diagnostic efficacy of computerized spectral versus visual EEG analysis in elderly normal, demented and depressed subjects. Electroencephalogr. Clin. Neurophysiol. 1988, 69, 110–117.
  22. Lehmann, C.; Koenig, T.; Jelic, V.; Prichep, L.; John, R.E.; Wahlund, L.O.; Dierks, T. Application and comparison of classification algorithms for recognition of Alzheimer’s disease in electrical brain activity (EEG). J. Neurosci. Methods 2007, 161, 342–350.
  23. Sankari, Z.; Adeli, H. Probabilistic neural networks for diagnosis of Alzheimer’s disease using conventional and wavelet coherence. J. Neurosci. Methods 2011, 197, 165–170.
  24. Morabito, F.C.; Campolo, M.; Ieracitano, C.; Ebadi, J.M.; Bonanno, L.; Bramanti, A.; Bramanti, P. Deep convolutional neural networks for classification of mild cognitive impaired and Alzheimer’s disease patients from scalp EEG recordings. In Proceedings of the IEEE 2nd International Forum on Research and Technologies for Society and Industry Leveraging a Better Tomorrow (RTSI), Bologna, Italy, 7–9 September 2016; pp. 1–6.
  25. Fiscon, G.; Weitschek, E.; Cialini, A.; Felici, G.; Bertolazzi, P.; De Salvo, S.; De Cola, M.C. Combining EEG signal processing with supervised methods for Alzheimer’s patients classification. BMC Med. Inform. Decis. Mak. 2018, 18, 35.
  26. Bairagi, V. EEG signal analysis for early diagnosis of Alzheimer disease using spectral and wavelet based features. Int. J. Inf. Technol. 2018, 10, 403–412.
  27. Vecchio, F.; Miraglia, F.; Alù, F.; Menna, M.; Judica, E.; Cotelli, M.; Rossini, P.M. Classification of Alzheimer’s disease with respect to physiological aging with innovative EEG biomarkers in a machine learning implementation. J. Alzheimer’s Dis. 2020, 75, 1253–1261.
  28. Göker, H. Welch Spectral Analysis and Deep Learning Approach for Diagnosing Alzheimer’s Disease from Resting-State EEG Recordings. Trait. Du Signal 2023, 40, 257–264.
  29. Kim, S.K.; Kim, J.B.; Kim, H.; Kim, L.; Kim, S.H. Early Diagnosis of Alzheimer’s Disease in Human Participants Using EEG Conformer and Attention-Based LSTM During the Short Question Task. Diagnostics 2025, 15, 448.
  30. OpenNeuro. Available online: https://openneuro.org/datasets/ds004504/versions/1.0.6 (accessed on 4 September 2024).
  31. Chang, C.Y.; Hsu, S.H.; Pion-Tonachini, L.; Jung, T.P. Evaluation of artifact subspace reconstruction for automatic EEG artifact removal. In Proceedings of the 40th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), Honolulu, HI, USA, 17–21 July 2018; pp. 1242–1245.
  32. Jung, T.P.; Humphries, C.; Lee, T.W.; Makeig, S.; McKeown, M.; Iragui, V.; Sejnowski, T.J. Extended ICA removes artifacts from electroencephalographic recordings. In Proceedings of the 1997 Conference on Advances in Neural Information Processing Systems 10 (NIPS ’97), Denver, CO, USA, 31 July 1997; pp. 894–900.
  33. Smith, E.E.; Reznik, S.J.; Stewart, J.L.; Allen, J.J. Assessing and conceptualizing frontal EEG asymmetry: An updated primer on recording, processing, analyzing, and interpreting frontal alpha asymmetry. Int. J. Psychophysiol. 2017, 111, 98–114.
  34. Wang, J.; Wang, M. Review of the emotional feature extraction and classification using EEG signals. Cogn. Robot. 2021, 1, 29–40.
  35. Parhi, K.K.; Ayinala, M. Low-complexity Welch power spectral density computation. IEEE Trans. Circuits Syst. I Regul. Pap. 2013, 61, 172–182.
  36. Khalid, S.; Khalil, T.; Nasreen, S. A survey of feature selection and feature extraction techniques in machine learning. In Proceedings of the 2014 Science and Information Conference, London, UK, 27–29 August 2014; pp. 372–378.
  37. Xiao, C.; Ye, J.; Esteves, R.M.; Rong, C. Using Spearman’s correlation coefficients for exploratory data analysis on big dataset. Concurr. Comput. Pract. Exp. 2016, 28, 3866–3878.
  38. Benesty, J.; Chen, J.; Huang, Y. On the importance of the Pearson correlation coefficient in noise reduction. IEEE Trans. Audio Speech Lang. Process. 2008, 16, 757–765.
  39. Van der Maaten, L.; Hinton, G. Visualizing data using t-SNE. J. Mach. Learn. Res. 2008, 9, 2579–2605.
  40. Wu, J.; Chen, X.Y.; Zhang, H.; Xiong, L.D.; Lei, H.; Deng, S.H. Hyperparameter optimization for machine learning models based on Bayesian optimization. J. Electron. Sci. Technol. 2019, 17, 26–40.
  41. Garnett, R. Bayesian Optimization; Cambridge University Press: Cambridge, UK, 2023.
  42. Frazier, P.I. A tutorial on Bayesian optimization. arXiv 2018.
  43. Telikani, A.; Tahmassebi, A.; Banzhaf, W.; Gandomi, A.H. Evolutionary machine learning: A survey. ACM Comput. Surv. (CSUR) 2021, 54, 1–35.
  44. Lujan, M.Á.; Jimeno, M.V.; Mateo Sotos, J.; Ricarte, J.J.; Borja, A.L. A survey on EEG signal processing techniques and machine learning: Applications to the neurofeedback of autobiographical memory deficits in schizophrenia. Electronics 2021, 10, 3037.
  45. Günal, S. Örüntü Tanıma Uygulamalarında Alt Uzay Analiziyle Öznitelik Seçimi ve Sınıflandırma. Doctoral Thesis, Osmangazi University, Eskişehir, Turkey, 2008.
  46. Cervantes, J.; Garcia-Lamont, F.; Rodríguez-Mazahua, L.; Lopez, A. A comprehensive survey on support vector machine classification: Applications, challenges and trends. Neurocomputing 2020, 408, 189–215.
  47. Hira, Z.M.; Gillies, D.F. A review of feature selection and feature extraction methods applied on microarray data. Adv. Bioinform. 2015, 2015, 198363.
  48. Hosseini, M.P.; Hosseini, A.; Ahi, K. A review on machine learning for EEG signal processing in bioengineering. IEEE Rev. Biomed. Eng. 2020, 14, 204–218.
  49. Kızılaslan, G. Meta Sezgisel Algoritmalar İle Biyolojik Sinyallerin İşlenmesi. Master’s Thesis, İstanbul Üniversitesi, İstanbul, Turkey, 2012.
  50. Musolf, A.M.; Holzinger, E.R.; Malley, J.D.; Bailey-Wilson, J.E. What makes a good prediction? Feature importance and beginning to open the black box of machine learning in genetics. Hum. Genet. 2022, 141, 1515–1528.
  51. Eşme, E.; Karlık, B. Design of intelligent garment with sensor fusion for rescue teams. J. Fac. Eng. Archit. Gazi Univ. 2019, 34, 1187–1200.
  52. Maria Navin, J.R.; Pankaja, R. Performance analysis of text classification algorithms using confusion matrix. Int. J. Eng. Tech. Res. (IJETR) 2016, 6, 75–78.
  53. Heydarian, M.; Doyle, T.E.; Samavi, R. MLCM: Multi-label confusion matrix. IEEE Access 2022, 10, 19083–19095.
  54. Bergil, E. EEG İşaretlerinin Epileptik Nöbet Kestiriminde Modern Yöntemlerle Analizi ve Sınıflandırılması. Doctoral Thesis, Sakarya University, Sakarya, Turkey, 2018.
  55. Hossin, M.; Sulaiman, M.N. A review on evaluation metrics for data classification evaluations. Int. J. Data Min. Knowl. Process 2015, 5, 1.
  56. Sokolova, M.; Lapalme, G. A systematic analysis of performance measures for classification tasks. Inf. Process. Manag. 2009, 45, 427–437.
  57. Tharwat, A. Classification assessment methods. Appl. Comput. Inform. 2021, 17, 168–192.
  58. Grandini, M.; Bagli, E.; Visani, G. Metrics for multi-class classification: An overview. arXiv 2020, arXiv:2008.05756.
  59. Fawcett, T. An introduction to ROC analysis. Pattern Recognit. Lett. 2006, 27, 861–874.
  60. Wang, Z.; Liu, A.; Yu, J.; Wang, P.; Bi, Y.; Xue, S.; Zhang, J.; Guo, H.; Zhang, W. The effect of aperiodic components in distinguishing Alzheimer’s disease from frontotemporal dementia. Geroscience 2024, 46, 751–768.
  61. Chen, Y.; Wang, H.; Zhang, D.; Zhang, L.; Tao, L. Multi-feature fusion learning for Alzheimer’s disease prediction using EEG signals in resting state. Front. Neurosci. 2023, 17, 1272834.
  62. Velichko, A.; Belyaev, M.; Izotov, Y.; Murugappan, M.; Heidari, H. Neural Network Entropy (NNetEn): Entropy-Based EEG Signal and Chaotic Time Series Classification, Python Package for NNetEn Calculation. Algorithms 2023, 16, 255.
  63. Ma, Y.; Bland, J.K.S.; Fujinami, T. Classification of Alzheimer’s disease and frontotemporal dementia using electroencephalography to quantify communication between electrode pairs. Diagnostics 2024, 14, 2189.
  64. Rostamikia, M.; Sarbaz, Y.; Makouei, S. EEG-based classification of Alzheimer’s disease and frontotemporal dementia: A comprehensive analysis of discriminative features. Cogn. Neurodynamics 2024, 18, 3447–3462.
  65. Stefanou, K.; Tzimourta, K.D.; Bellos, C.; Stergios, G.; Markoglou, K.; Gionanidis, E.; Tsipouras, M.G.; Giannakeas, N.; Tzallas, A.T.; Miltiadous, A. A novel CNN-based framework for alzheimer’s disease detection using EEG spectrogram representations. J. Pers. Med. 2025, 15, 27.
Figure 1. Representation of the designed system.
Figure 2. Placement of the 19 electrodes.
Figure 3. Illustration of the EEG signal segmentation into 30 s epochs with 50% overlap.
Figure 4. t-SNE 2-D embedding for training set: (a) all (342) features; (b) 130 features; (c) 50 features.
Figure 5. Confusion matrices: (a) SVM model trained with 342 features; (b) SVM model trained with 130 features; (c) SVM model trained with 50 features; (d) k-NN model trained with 342 features; (e) k-NN model trained with 130 features; (f) k-NN model trained with 50 features.
Figure 6. ROC (one-vs.-rest) curves: (a) ROC curve of the SVM model trained with 342 features; (b) ROC curve of the SVM model trained with 130 features; (c) ROC curve of the SVM model trained with 50 features; (d) ROC curve of the k-NN model trained with 342 features; (e) ROC curve of the k-NN model trained with 130 features; (f) ROC curve of the k-NN model trained with 50 features.
Figure 7. Classification accuracy by feature count.
Table 1. Literature review.
| Author(s), Year | Dataset | Bands Used | Feature Extraction | Classifiers Applied | Metrics & Performance | Limitations | Significance |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Lehmann et al., 2007 [22] | 116 mild AD, 81 moderate AD, 45 HC | Delta, Theta, Alpha1, Alpha2, Beta1–3 | Spectral power, centroids, synchronization (hand-crafted) | PC-LDA, PLS-LDA, PC-LR, PLS-LR, Bagging, RF, SVM, NN | SVM and NN (Mod. AD vs. HC: Sens. 89%, Spec. 88%) | High sensitivity to feature selection, sample imbalance risk. | Demonstrated feasibility of EEG-based AD classification; modern ML methods are slightly superior. |
| Sankari and Adeli, 2011 [23] | 20 AD, 7 HC | Delta, Theta, Alpha, Beta | Coherence and wavelet coherence (hand-crafted) | PNN | Conventional coherence: 100% accuracy | Small sample size, potential overfitting. | Demonstrated potential of coherence measures and PNN in early AD diagnosis using EEG. |
| Morabito et al., 2016 [24] | 63 AD, 56 MCI, 23 HC | 0.1–30 Hz total (includes Delta, Theta, Alpha, Beta) | CWT + time–frequency stats; CNN learns latent features; mix of hand-crafted and automatic | CNN | AD/MCI/HC: 82% acc., 83% sens., 75% spec. | Better training accuracy (95%). | Deep CNN effectively extracted latent EEG features. |
| Fiscon et al., 2018 [25] | 49 AD, 37 MCI, 23 HC | Delta, Theta, Alpha, Beta, Gamma | FFT, DWT (hand-crafted) | DT | AD vs. HC: 83% acc. | Small sample, limited generalizability. | Shows strong potential for early AD detection. |
| Bairagi et al., 2018 [26] | 20 AD, 25 HC | Delta, Theta, Alpha, Beta | DWT (hand-crafted) | SVM, k-NN | 94% acc. | Small dataset; limited generalizability. | Combining entropy and fractal features with wavelet analysis yields high accuracy for AD detection. |
| Durongbhan et al., 2019 [3] | 20 AD, 20 HC | Delta, Theta, Alpha, Beta | FFT, CWT (hand-crafted) | k-NN, SVM, DT | k-NN: FFT features: 97% acc., CWT features: 99% acc. | Relatively small dataset, class balance and overfitting risk not detailed. | Spectral features from FFT and CWT combined with k-NN yielded high AD classification accuracy. |
| Vecchio et al., 2020 [27] | 175 AD, 120 HC | Delta, Theta, Alpha1, Alpha2, Beta1, Beta2, Gamma | Lagged Linear Coherence (hand-crafted) | SVM | 95% ± 3% acc. | Limited to logistic regression, no external validation. | LLC features can effectively classify AD vs. HC with high accuracy. |
| Safi & Safi, 2021 [1] | EEG; 31 mild AD, 20 moderate AD, 35 HC | Delta, Theta, Alpha, Beta | PSD, DWT, EMD (hand-crafted) | k-NN, SVM, RLDA | 97.64% acc. | Performance varies across decomposition methods. | Demonstrated effectiveness of decomposition-based Hjorth features in classifying AD severity. |
| AlSharabi et al., 2022 [7] | EEG; 31 mild AD, 22 moderate AD, 35 HC | Delta, Theta, Alpha, Beta, Gamma | DWT + statistical features (hand-crafted) | LDA, QDA, SVM, k-NN, NB, DT, ELM, ANN, RF | 99.98% acc. with k-NN using DWT features | No external validation, dataset overlap risk. | Very high accuracy using DWT-based features suggests strong discriminative potential for AD stages. |
| Göker et al., 2023 [28] | 24 AH, 24 HC | Delta, Theta, Alpha, Beta, Gamma | PSD (hand-crafted) | SVM, k-NN, RF, BiLSTM | 98.85% acc. for the HC class | Relatively small dataset, no external validation. | In the EEG dataset, channels from individual subjects were treated as independent recordings. |
| Kim et al., 2025 [29] | 20 SCD, 28 MCI, 10 AD | Delta, Theta, Alpha, Beta, Gamma | Automatic feature extraction | EEG Conformer/Attention-LSTM | Resting-state acc: 71.67%, Tasking-state acc: 79.16% | AD-MCI separation remains challenging. | Shows promise in differentiating SCD, MCI, and AD using EEG-based deep learning. |
Table 2. Demographic information of the dataset.
| Group | Gender | Mean Age |
| --- | --- | --- |
| AD | 13 Males/23 Females | 66.4 (±7.9) |
| FTD | 14 Males/9 Females | 63.6 (±8.2) |
| HC | 11 Males/18 Females | 67.9 (±5.4) |
Table 3. Record information on the dataset.
| Group | Minimum Recording Time (Minute) | Maximum Recording Time (Minute) | Total Recording Time (Minute) |
| --- | --- | --- | --- |
| AD | 5.1 | 21.3 | 485.5 |
| FTD | 7.9 | 16.9 | 276.5 |
| HC | 12.5 | 16.5 | 402 |
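Table 4. Formulas for features determined in the time domain and spectral domain.
| Feature Number | Domain | Feature | Equation | Equation Number |
| --- | --- | --- | --- | --- |
| 1 | Time | Kurtosis (KU) | $\frac{\sum_{n=1}^{N}\left(x(n)-AVG\right)^{4}}{(N-1)\,SD^{4}}$ | (2) |
| 2 | Time | Average (AVG) | $\frac{1}{N}\sum_{n=1}^{N}x(n)$ | (3) |
| 3 | Time | Root Mean Square (RMS) | $\sqrt{\frac{\sum_{n=1}^{N}x(n)^{2}}{N}}$ | (4) |
| 4 | Time | Skewness (SK) | $\frac{\sum_{n=1}^{N}\left(x(n)-AVG\right)^{3}}{(N-1)\,SD^{3}}$ | (5) |
| 5 | Time | Standard deviation (SD) | $\sqrt{\frac{1}{N}\sum_{n=1}^{N}\left(x(n)-AVG\right)^{2}}$ | (6) |
| 6 | Time | Variance (VAR) | $\frac{1}{N}\sum_{n=1}^{N}\left(x(n)-AVG\right)^{2}$ | (7) |
| 7 | Time | Norm (NOR) | $\sqrt{\sum_{n=1}^{N}x(n)^{2}}$ | (8) |
| 8 | Spectral | Delta Band Power (DBP) | $\frac{1}{N}\sum_{i}^{N}X_{i}^{2}$ | (9) |
| 9 | Spectral | Theta Band Power (TBP) | $\frac{1}{N}\sum_{i}^{N}X_{i}^{2}$ | (9) |
| 10 | Spectral | Alpha Band Power (ABP) | $\frac{1}{N}\sum_{i}^{N}X_{i}^{2}$ | (9) |
| 11 | Spectral | Beta Band Power (BBP) | $\frac{1}{N}\sum_{i}^{N}X_{i}^{2}$ | (9) |
| 12 | Spectral | Gamma Band Power (GBP) | $\frac{1}{N}\sum_{i}^{N}X_{i}^{2}$ | (9) |
| 13 | Spectral | Delta-Theta Band Power Ratio (DTBPR) | $\frac{P_{x}}{P_{y}}$ | (10) |
| 14 | Spectral | Delta-Alpha Band Power Ratio (DABPR) | $\frac{P_{x}}{P_{y}}$ | (10) |
| 15 | Spectral | Delta-Beta Band Power Ratio (DBBPR) | $\frac{P_{x}}{P_{y}}$ | (10) |
| 16 | Spectral | Theta-Alpha Band Power Ratio (TABPR) | $\frac{P_{x}}{P_{y}}$ | (10) |
| 17 | Spectral | Theta-Beta Band Power Ratio (TBBPR) | $\frac{P_{x}}{P_{y}}$ | (10) |
| 18 | Spectral | Alpha-Beta Band Power Ratio (ABBPR) | $\frac{P_{x}}{P_{y}}$ | (10) |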
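The time-domain features of Table 4 (Equations (2) to (8)) can be sketched in a few lines of plain Python. This is an illustrative reading of the formulas rather than the authors' code: the function name and the toy epoch are hypothetical, and the square roots in RMS, SD, and the norm follow the usual definitions of those quantities.

```python
import math

def time_domain_features(x):
    """Compute the seven time-domain features of Table 4 for one epoch."""
    n = len(x)
    avg = sum(x) / n                                           # Eq. (3)
    var = sum((v - avg) ** 2 for v in x) / n                   # Eq. (7)
    sd = math.sqrt(var)                                        # Eq. (6)
    rms = math.sqrt(sum(v * v for v in x) / n)                 # Eq. (4)
    norm = math.sqrt(sum(v * v for v in x))                    # Eq. (8)
    ku = sum((v - avg) ** 4 for v in x) / ((n - 1) * sd ** 4)  # Eq. (2)
    sk = sum((v - avg) ** 3 for v in x) / ((n - 1) * sd ** 3)  # Eq. (5)
    return {"KU": ku, "AVG": avg, "RMS": rms, "SK": sk,
            "SD": sd, "VAR": var, "NOR": norm}

# Toy epoch (hypothetical values, not EEG data).
epoch = [1.0, 2.0, 3.0, 4.0, 5.0]
features = time_domain_features(epoch)
print(features)
```

Note that the norm equals the RMS multiplied by the square root of the number of samples, so the two features carry the same information for epochs of equal length.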
Table 5. Sub-band frequency ranges.
| Band | Frequency Range |
| --- | --- |
| Delta | 0.5–4 Hz |
| Theta | 4–8 Hz |
| Alpha | 8–13 Hz |
| Beta | 13–25 Hz |
| Gamma | 25–45 Hz |
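As a minimal sketch of how the band ranges of Table 5 translate into band-power and band-power-ratio features, the code below averages the PSD values that fall inside each band and then forms one ratio. Several details are assumptions made for illustration: each PSD value is treated as an already squared spectral magnitude, a frequency on a band boundary is assigned to the higher band, the direction of the ratio follows the feature name, and the 1/f-shaped PSD is synthetic.

```python
BANDS = {  # Sub-band ranges from Table 5, in Hz
    "delta": (0.5, 4.0),
    "theta": (4.0, 8.0),
    "alpha": (8.0, 13.0),
    "beta": (13.0, 25.0),
    "gamma": (25.0, 45.0),
}

def band_power(freqs, psd, low, high):
    """Mean of the PSD values whose frequency lies in [low, high)."""
    values = [p for f, p in zip(freqs, psd) if low <= f < high]
    return sum(values) / len(values)

def band_powers(freqs, psd):
    """Band power for every band listed in BANDS."""
    return {name: band_power(freqs, psd, lo, hi) for name, (lo, hi) in BANDS.items()}

# Synthetic PSD sampled every 0.5 Hz from 0.5 to 44.5 Hz, decaying as 1/f.
freqs = [0.5 * i for i in range(1, 90)]
psd = [1.0 / f for f in freqs]
powers = band_powers(freqs, psd)
ratio = powers["delta"] / powers["theta"]  # one band-power ratio, P_x / P_y
print(powers, ratio)
```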
Table 6. Confusion matrix.
| | Predicted Positive | Predicted Negative |
| --- | --- | --- |
| True Positive | TP | FN |
| True Negative | FP | TN |
Table 7. Metrics used for performance evaluation.
| Metric | Equation | Equation Number |
| --- | --- | --- |
| Accuracy | $\frac{TP+TN}{TP+FP+TN+FN}\times 100$ | (20) |
| Sensitivity | $\frac{TP}{TP+FN}$ | (21) |
| Specificity | $\frac{TN}{TN+FP}$ | (22) |
| Precision | $\frac{TP}{TP+FP}$ | (23) |
| NPV | $\frac{TN}{FN+TN}$ | (24) |
| FDR | $1-Precision$ | (25) |
| BCR | $\frac{1}{C}\sum_{i=1}^{C}\frac{sensitivity_{i}+specificity_{i}}{2}$ | (26) |
| F1 Score | $\frac{2\times Sensitivity\times Precision}{Sensitivity+Precision}$ | (27) |
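The metrics of Table 7 can be sketched for a multi-class confusion matrix by treating each class in turn as "positive" and the remaining classes as "negative". The function below is an illustration under stated assumptions, not the study's code: the matrix is hypothetical, and overall accuracy is taken here as the share of correctly classified samples, which is one common multi-class reading of Equation (20).

```python
def one_vs_rest_metrics(cm, labels):
    """Per-class metrics from a square confusion matrix cm[true][predicted]."""
    total = sum(sum(row) for row in cm)
    out = {}
    for i, label in enumerate(labels):
        tp = cm[i][i]
        fn = sum(cm[i]) - tp                   # true class i, predicted otherwise
        fp = sum(row[i] for row in cm) - tp    # predicted i, true class otherwise
        tn = total - tp - fn - fp
        sens = tp / (tp + fn)
        spec = tn / (tn + fp)
        prec = tp / (tp + fp)
        out[label] = {
            "sensitivity": sens,
            "specificity": spec,
            "precision": prec,
            "npv": tn / (tn + fn),
            "fdr": 1.0 - prec,
            "f1": 2 * sens * prec / (sens + prec),
        }
    accuracy = 100.0 * sum(cm[i][i] for i in range(len(labels))) / total
    bcr = sum((m["sensitivity"] + m["specificity"]) / 2 for m in out.values()) / len(labels)
    return accuracy, bcr, out

# Hypothetical 3-class confusion matrix (rows: true, columns: predicted).
labels = ["AD", "FTD", "HC"]
cm = [[50, 2, 3],
      [4, 25, 1],
      [2, 1, 42]]
accuracy, bcr, per_class = one_vs_rest_metrics(cm, labels)
print(accuracy, bcr, per_class["AD"])
```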
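Table 8. Dataset distributions.
| Dataset | Label | Data Count | Proportion | Total Data Count | Overall Ratio |
| --- | --- | --- | --- | --- | --- |
| Training | AD | 1321 | 41.74% | 3165 | 70% |
| Training | FTD | 750 | 23.70% | 3165 | 70% |
| Training | HC | 1094 | 34.57% | 3165 | 70% |
| Testing | AD | 567 | 41.81% | 1356 | 30% |
| Testing | FTD | 320 | 23.60% | 1356 | 30% |
| Testing | HC | 469 | 34.59% | 1356 | 30% |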
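A split that preserves class proportions, as reflected in Table 8, can be sketched as follows. The function is a hypothetical illustration rather than the routine used in the study; it shuffles the indices of each class separately and holds out roughly 30% of each. The class totals are the sums of the training and testing counts in Table 8, and the resulting per-class counts need not match the table exactly.

```python
import random
from collections import defaultdict

def stratified_split(labels, test_fraction=0.3, seed=0):
    """Return (train_idx, test_idx) with per-class proportions preserved."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # Hold out the same fraction of every class.
        n_test = round(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)

# Class totals from Table 8 (training and testing counts added together).
labels = ["AD"] * 1888 + ["FTD"] * 1070 + ["HC"] * 1563
train_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(test_idx))
```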
Table 9. Results of feature extraction and feature–label relationships based on Spearman and Pearson correlation coefficients.
| No | Feature | Data No. 1 (Label: AD) | Data No. 1189 (Label: HC) | Data No. 3452 (Label: FTD) | Spearman Corr. Coeff. | Pearson Corr. Coeff. |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | Fp1/KU | 2.9388 | 3.7483 | 2.7580 | −0.0016 | 0.0111 |
| 2 | Fp1/AVG | 0.5395 | 0.0301 | −0.6613 | −0.0258 | 0.0251 |
| 3 | Fp1/RMS | 31.3199 | 36.5374 | 32.3603 | 0.0383 | 0.0451 |
| 4 | Fp1/SK | 0.1353 | 0.5760 | 0.1422 | 0.0361 | 0.0266 |
| 5 | Fp1/SD | 31.3163 | 36.5386 | 32.3546 | 0.0382 | 0.0447 |
| 6 | Fp1/VAR | 980.7131 | 1335.0740 | 1046.8250 | 0.0382 | 0.0415 |
| 7 | Fp1/NOR | 3835.8940 | 4474.9080 | 3963.3200 | 0.0383 | 0.0451 |
| 8 | Fp1/DBP | 398.0764 | 617.8834 | 668.4050 | −0.0224 | −0.0444 |
| 9 | Fp1/TBP | 53.1110 | 43.7246 | 54.8660 | −0.2247 | −0.1329 |
| 10 | Fp1/ABP | 13.0952 | 26.2170 | 20.6642 | 0.2161 | 0.2019 |
| 11 | Fp1/BBP | 15.7033 | 11.5111 | 9.4213 | 0.0866 | 0.0487 |
| 12 | Fp1/GBP | 22.1014 | 5.0517 | 3.8901 | −0.0849 | −0.0089 |
| 13 | Fp1/DTBPR | 0.1334 | 0.0707 | 0.0820 | −0.2062 | −0.1416 |
| 14 | Fp1/DABPR | 0.0328 | 0.0424 | 0.0309 | 0.2393 | 0.2055 |
| 15 | Fp1/DBBPR | 0.0394 | 0.0186 | 0.0140 | 0.0983 | 0.0573 |
| 16 | Fp1/TABPR | 0.2465 | 0.5995 | 0.3766 | 0.4279 | 0.2785 |
| 17 | Fp1/TBBPR | 0.2956 | 0.2632 | 0.1717 | 0.2610 | 0.0969 |
| 18 | Fp1/ABBPR | 1.1991 | 0.4390 | 0.4559 | −0.1956 | −0.0893 |
Table 10. The 130 features selected according to the Spearman correlation coefficient approach.
Selected Features
268, 178, 286, 160, 124, 176, 172, 269, 266, 262, 232, 179, 340, 125, 16, 233, 34, 142, 287, 304, 158, 214, 263, 154, 280, 180, 284, 70, 161, 267, 173, 52, 118, 122, 250, 162, 196, 177, 341, 119, 143, 270, 305, 338, 334, 288, 140, 281, 136, 155, 123, 159, 189, 88, 285, 227, 106, 335, 231, 342, 251, 230, 144, 126, 89, 107, 225, 137, 226, 17, 36, 339, 197, 297, 141, 32, 53, 193, 14, 45, 207, 322, 49, 243, 215, 302, 9, 28, 301, 212, 229, 27, 10, 83, 248, 63, 216, 244, 298, 208, 13, 82, 101, 35, 68, 71, 67, 247, 18, 64, 211, 72, 31, 100, 50, 174, 86, 306, 135, 46, 194, 104, 279, 299, 283, 245, 316, 87, 105, 303
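The idea of ranking features by their Spearman correlation with the class label can be sketched in plain Python. The code below is a hedged illustration, not the study's selection procedure: it assumes the labels are encoded as numeric class codes, ranks features by the absolute value of the coefficient, and uses hypothetical function names and toy data.

```python
def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))

def top_features(columns, labels, n_keep):
    """Indices of the n_keep columns with the largest |Spearman rho|."""
    scores = [abs(spearman(col, labels)) for col in columns]
    return sorted(range(len(columns)), key=lambda i: scores[i], reverse=True)[:n_keep]

# Hypothetical data: three feature columns, numeric class codes as labels.
labels = [0, 0, 1, 1, 2, 2]
columns = [
    [0.1, 0.2, 0.5, 0.6, 0.9, 1.0],   # increases with the label
    [0.3, 0.9, 0.1, 0.8, 0.2, 0.7],   # weakly related to the label
    [5.0, 4.0, 3.5, 3.0, 1.0, 0.5],   # decreases with the label
]
selected = top_features(columns, labels, n_keep=2)
print(selected)
```

Ranking by absolute value keeps features that decrease with the label as well as those that increase with it; whether the study ranked by signed or absolute coefficients is not assumed here beyond this sketch.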
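Table 11. Partial analysis of SVM parameters using different training sets.
| Kernel Function | Kernel Scale | Number of Features | Box Constraint Level | Training Accuracy (%) |
| --- | --- | --- | --- | --- |
| Quadratic | Automatic | 342 | 1 | 89.66 |
| Quadratic | Automatic | 342 | 2.4972 | 90.81 |
| Quadratic | Automatic | 342 | 3 | 89.98 |
| Quadratic | Automatic | 342 | 4 | 89.98 |
| Quadratic | Automatic | 342 | 5 | 90.58 |
| Quadratic | Automatic | 130 | 1 | 92.13 |
| Quadratic | Automatic | 130 | 2 | 93.23 |
| Quadratic | Automatic | 130 | 3 | 93.17 |
| Quadratic | Automatic | 130 | 3.9812 | 92.60 |
| Quadratic | Automatic | 130 | 5 | 92.92 |
| Quadratic | Automatic | 50 | 2 | 87.42 |
| Quadratic | Automatic | 50 | 3 | 87.29 |
| Quadratic | Automatic | 50 | 4 | 87.64 |
| Quadratic | Automatic | 50 | 5 | 87.14 |
| Quadratic | Automatic | 50 | 5.5952 | 88.06 |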
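Table 12. Partial analysis of k-NN parameters using different training sets.
| Distance Metric | Distance Weight | Number of Features | k | Training Accuracy (%) |
| --- | --- | --- | --- | --- |
| Cosine | Squared Inverse | 342 | 2 | 76.87 |
| Cosine | Squared Inverse | 342 | 3 | 77.21 |
| Cosine | Squared Inverse | 342 | 4 | 78.60 |
| Cosine | Squared Inverse | 342 | 5 | 77.34 |
| Cosine | Squared Inverse | 342 | 6 | 77.88 |
| Cosine | Squared Inverse | 130 | 2 | 90.05 |
| Cosine | Squared Inverse | 130 | 3 | 89.92 |
| Cosine | Squared Inverse | 130 | 4 | 89.73 |
| Cosine | Squared Inverse | 130 | 5 | 89.85 |
| Cosine | Squared Inverse | 130 | 6 | 89.98 |
| Cosine | Squared Inverse | 50 | 2 | 88.65 |
| Cosine | Squared Inverse | 50 | 3 | 88.49 |
| Cosine | Squared Inverse | 50 | 4 | 88.69 |
| Cosine | Squared Inverse | 50 | 5 | 89.03 |
| Cosine | Squared Inverse | 50 | 6 | 88.53 |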
Table 13. Performance metrics of the models: accuracy, sensitivity, specificity, precision, NPV, FDR, BCR, F1 Score, AUC.
| Classification | Number of Features | Accuracy (%) | Sensitivity | Specificity | Precision | NPV | FDR | BCR | F1 Score | AUC |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| SVM | 342 | 95.94 | 0.96 | 0.95 | 0.93 | 0.97 | 0.06 | 0.96 | 0.95 | 0.98 |
| SVM | 130 | 96.01 | 0.97 | 0.95 | 0.93 | 0.97 | 0.06 | 0.96 | 0.95 | 0.98 |
| SVM | 50 | 92.99 | 0.97 | 0.89 | 0.87 | 0.97 | 0.12 | 0.93 | 0.92 | 0.98 |
| k-NN | 342 | 84.29 | 0.86 | 0.82 | 0.78 | 0.89 | 0.21 | 0.85 | 0.82 | 0.93 |
| k-NN | 130 | 94.54 | 0.93 | 0.95 | 0.93 | 0.95 | 0.06 | 0.95 | 0.93 | 0.96 |
| k-NN | 50 | 92.62 | 0.92 | 0.92 | 0.89 | 0.94 | 0.10 | 0.93 | 0.91 | 0.97 |
Table 14. Comparison of the methodology of studies that use the same dataset as this study.
| Ref. | Year | Band Power | Feature Extraction | Classifiers Applied | Metric and Performance |
| --- | --- | --- | --- | --- | --- |
| Miltiadous et al. [10] | 2023 | Delta, Theta, Alpha, Beta, Gamma | Relative Band Power, Spectral Coherence Connectivity (SCC) | DICE-net | Acc: AD/HC = 83.23% |
| Wang et al. [60] | 2023 | Theta, Alpha, Beta | PSD | SVM | AUC: AD/FTD = 0.73 |
| Chen et al. [61] | 2023 | Delta, Theta, Alpha, Beta, Gamma | CNN + Visual Transformers (ViTs) | CNN | Acc: AD/FTD/HC = 80.23% |
| Velichko et al. [62] | 2023 | Delta, Theta, Alpha, Beta, Gamma | New entropy: Neural Network Entropy (NNetEn) | SVM | Acc: AD/HC = 88.45% |
| Ma et al. [63] | 2024 | Delta, Theta, Alpha, Beta, Gamma | PHI values for each electrode pair | SVM | Acc: AD/HC = 76.9%, FTD/HC = 90.4% |
| Rostamikia et al. [64] | 2024 | Delta, Theta, Alpha, Beta, Gamma | Time and DWT | SVM | Acc: AD/FTD = 87.8%, AD + FTD/HC = 93.5% |
| Stefanou et al. [65] | 2025 | Delta, Theta, Alpha, Beta, Gamma | FFT-based spectrograms | CNN | Acc: AD/HC = 79.45%, AD + FTD/HC = 80.69% |
| Proposed | 2025 | Delta, Theta, Alpha, Beta, Gamma | Time and PSD | SVM | Acc: AD/FTD/HC = 96.01% |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Senkaya, Y.; Kurnaz, C.; Ozbilgin, F. Enhancing Alzheimer’s Diagnosis with Machine Learning on EEG: A Spectral Feature-Based Comparative Analysis. Diagnostics 2025, 15, 2190. https://doi.org/10.3390/diagnostics15172190
