3. Results
After applying the PRISMA methodological protocol and conducting a detailed analysis of the 11 selected studies, the relevant information was systematized into a series of comparative tables. These tables provide a structured overview of the main approaches, techniques, configurations, and findings reported in the literature related to fear detection using EEG signals and artificial intelligence.
Each table groups and summarizes the key aspects evaluated, including the algorithms used and their performance, the EEG devices employed, the available databases, the emotional models applied, the brain regions involved, the preprocessing methods, the EEG bands associated with fear, the duration of the experiments, and the stimulation protocols used. This organization facilitates comparison across studies and lays the groundwork for a critical discussion of current practices and their technological and neuroscientific implications.
In order to address the research question, 11 studies were analyzed following a rigorous systematic review process. The most relevant findings are presented and discussed in depth below based on the generated results tables.
3.1. Algorithms Used and Their Performance
Table 2 presents the most commonly used algorithms for classifying fear based on EEG signals. A clear preference for non-linear models is observed, particularly Support Vector Machines (SVMs), Decision Trees, Random Forest, and Convolutional Neural Networks (CNNs). These models have demonstrated their ability to handle complex relationships between physiological variables and emotional categories, outperforming traditional linear methods such as logistic regression.
The use of Decision Trees proved particularly effective in simulated environments. In the study by Turnip et al. (2024) [16], the implementation of this algorithm within a flight simulator setting achieved an accuracy rate of 92.95%. This performance is partly explained by the model’s ability to make decisions based on logical thresholds and its ease in interpreting hierarchical patterns within non-linear datasets. The structural simplicity of this approach also makes it an attractive option for embedded systems.
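The threshold-based logic described above can be illustrated with a minimal sketch. The data here are synthetic band-power features (not the flight-simulator recordings from the study), and the two-class setup is a simplification of the five fear levels the authors classified:

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)

# Synthetic (alpha, beta, gamma) band-power features for two states:
# "calm" (higher alpha) vs. "fear" (elevated beta/gamma), mirroring the
# spectral pattern reported across the reviewed studies.
n = 200
calm = rng.normal(loc=[1.0, 0.4, 0.2], scale=0.15, size=(n, 3))
fear = rng.normal(loc=[0.5, 0.9, 0.6], scale=0.15, size=(n, 3))
X = np.vstack([calm, fear])
y = np.array([0] * n + [1] * n)  # 0 = calm, 1 = fear

X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y)

# A shallow tree reduces to a handful of interpretable logical thresholds
# on each band, which is what makes it attractive for embedded systems.
clf = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X_tr, y_tr)
accuracy = clf.score(X_te, y_te)
```

Because the tree is only three levels deep, its learned thresholds can be printed and audited directly, a property the prose above highlights as valuable for interpretability.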
On the other hand, Support Vector Machines (SVMs) stand out as one of the most robust and consistent algorithms in this field. Studies such as those by Krishna et al. (2019) [
11] and Ishizuka et al. (2024) [
15] showed that, when combined with proper feature extraction techniques (such as DWT or EMD) and signal denoising, this model can exceed 90% accuracy. Furthermore, the SVM offers a particular advantage in scenarios with medium-sized datasets as its margin-based function maximizes class separation with moderate computational cost. This makes it ideal for integration into portable systems and devices with limited computing capacity.
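A comparable SVM setup can be sketched as follows. The two-dimensional feature vectors are hypothetical stand-ins for DWT/EMD-derived energies, not features from the cited studies, and the RBF kernel is one common choice rather than the specific configuration those authors used:

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(1)

# Hypothetical 2-D feature vectors standing in for wavelet/EMD energies.
n = 150
low_arousal = rng.normal([0.2, 0.3], 0.1, size=(n, 2))
fear_like = rng.normal([0.6, 0.7], 0.1, size=(n, 2))
X = np.vstack([low_arousal, fear_like])
y = np.r_[np.zeros(n), np.ones(n)]

# The RBF-kernel SVM maximizes the margin between classes at moderate
# computational cost, the property noted above for medium-sized datasets.
model = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))
mean_acc = cross_val_score(model, X, y, cv=5).mean()
```

Standardizing features before the SVM matters in practice because margin maximization is scale-sensitive; band energies from different frequency ranges can differ by orders of magnitude.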
Deep learning algorithms, such as Convolutional Neural Networks (CNNs) and Deep Belief Networks (DBNs), have shown high potential, especially in studies involving large, well-labeled datasets. Vempati and Sharma (2023) [
17] reported strong results using CNNs to detect emotions with high spatial dimensionality in EEG patterns. However, these approaches require large amounts of data to avoid overfitting and are more sensitive to the quality of emotional labeling. Additionally, their computational demands pose a barrier for deployment in continuous or real-time monitoring systems, particularly in populations outside the laboratory setting.
In summary, non-linear algorithms such as the SVM and Random Forest offer an optimal combination of performance, scalability, and technical feasibility for integration into practical solutions. Their flexibility to adapt to different EEG acquisition schemes, along with their relatively low computational cost, makes them preferred candidates for implementation in clinical settings, households, or mobile environments. Moreover, their interpretability and algorithmic transparency add further value in contexts where explainability is essential, such as mental health applications or elderly monitoring.
3.2. EEG Hardware Configurations
Table 3 shows a marked diversity in the EEG devices used across the analyzed studies, reflecting a balance between the need for neurophysiological precision and applicability in real-world settings. On one hand, clinical-grade equipment such as the Biosemi ActiveTwo and Neuroscan SynAmps RT stand out for their high channel density (64 to 128 electrodes), superior sampling frequency (up to 2048 Hz), and high-fidelity signal acquisition. These features make them the gold standard for controlled neuroscience research, where precise cortical localization and complex topographical analysis are required [
7,
9].
However, their use also implies significant limitations: high cost, low portability, lengthy calibration procedures, and the need for a specialized technical environment. These conditions restrict their practical viability in out-of-laboratory applications, such as in homes or community hospitals.
On the other hand, portable and commercial-grade devices such as the Emotiv Epoc+, MyndPlay, and Emotiv Insight, while offering lower resolution (5 to 14 electrodes and sampling frequencies of 128–256 Hz), excel in ergonomic design, ease of use, low cost, and wireless connectivity. Studies like that of [
14] demonstrated that, with the use of robust signal processing pipelines—including bandpass filtering, artifact removal via ICA, and feature extraction through DWT or EMD—it is possible to achieve classification rates above 85%, even under uncontrolled environmental conditions.
Additionally, technologies like OpenBCI offer an intermediate option: open-source, modular devices with expandability up to 16 channels, allowing for custom configurations based on experimental paradigms. The authors of [
15] leveraged this flexibility in mixed environments, combining visual stimulation with multiband analysis of brain signals and yielding promising results in both accuracy and adaptability.
The choice of EEG hardware should not be seen as a purely technical decision but rather as a strategic one that considers the type of user, the context of use, and the system’s computational requirements. Portability, autonomy, and ease of placement become key factors. This implies that systems based on consumer-grade hardware must compensate for their lower resolution through a more rigorous methodological design in terms of preprocessing, electrode placement, and personalized calibration.
Despite their viability, a lack of comparative studies across different types of hardware applied to the same protocol was identified. This gap limits the ability to establish objective guidelines for selecting devices based on performance, cost, and usability. Future research should address this comparison using standardized methodologies that evaluate variables such as accuracy, latency, signal stability, artifact robustness, and user acceptance.
In summary, the review suggests that portable EEG devices can be not only adequate but even preferable for emotion detection applications outside the laboratory, provided that their technical integration is supported by a systemic approach that encompasses the entire acquisition, processing, and classification pipeline. In this way, the implementation of accessible, efficient, and scalable solutions in social, clinical, and educational contexts is greatly enhanced [
17,
18].
3.3. Databases and Emotional Stimulation Paradigms
Table 4 and
Table 5 show that most of the studies included in this review relied on public databases such as DEAP, DREAMER, and GAMEEMO, which are designed for emotion classification through audiovisual stimuli. These datasets typically use video clips or images labeled with valence and arousal scales (e.g., SAM–Self-Assessment Manikin), facilitating reproducibility and comparison across algorithms and models [
14,
19]. However, they present methodological limitations when focusing on a specific emotion like fear as they are often centered on general emotional states or lack a sufficient number of fear-specific samples.
In contrast, studies that developed personalized databases [
7,
16] achieved greater experimental control over stimulus quality and intensity. Turnip et al. used a highly realistic flight simulator as an emotional induction environment, while Proverbio and Cesati designed a protocol based on guided autobiographical recall. These approaches enabled the capture of more intense, distinguishable, and emotionally congruent EEG responses, resulting in more accurate models with lower error rates. Personalized datasets also allowed the adjustment of contextual variables such as stimulus duration, breathing rhythm, recovery time, and environmental control—factors rarely considered in public datasets.
Regarding stimulation paradigms, it was found that dynamic audiovisual stimuli, especially high-emotional-load videos or interactive simulations, were the most effective at inducing sustained fear. The use of emotional video games, flight simulators, or immersive narratives triggered prolonged activation of brain networks associated with fear, facilitating the identification of specific patterns in the beta and gamma bands [
14,
16]. These configurations more accurately replicate real-world conditions of threat or uncertainty, offering a more ecologically valid environment than passive exposure to images or sounds.
In contrast, studies using static images or acted phrases, such as those in the CREMA-D dataset [
12], showed lower effectiveness in inducing specific fear. This is because fear often requires contextualization and temporal buildup to fully develop at the neurophysiological level. Moreover, brief stimuli may elicit attentional or reflexive responses but do not necessarily activate the limbic circuits involved in fear processing.
The results suggest that the choice of data source and emotional stimulation paradigm has a direct impact on the quality of recorded EEG signals and, consequently, on the performance of classification algorithms. While public datasets are useful for model validation and facilitating comparisons between studies, their ability to induce authentic fear is limited due to the low affective load of stimuli and their generalist focus.
In contrast, personalized datasets allow for greater emotional specificity, experimental control, and adaptability to user context. This is especially relevant for applications in sensitive populations, where induction must be ethical, gradual, and safe. Additionally, naturalistic and immersive paradigms yield more sustained responses, reducing the need for extremely short observation windows or excessive processing to detect subtle changes.
One identified limitation is the lack of standardization in emotional stimulation protocols, which prevents direct comparisons between studies. Future work should consider building open repositories of fear-validated emotional stimuli, as well as developing tools to calibrate stimulus intensity based on the emotional profile of the participant.
3.4. Brain Regions and Electrode Placement
Table 6 shows that electrodes placed over the frontal (Fp1, F3, Fz) and temporal (T4, T7, T8) regions of the scalp exhibit the highest sensitivity to brain activation patterns associated with fear. These findings are consistent with the neuroscientific literature, which identifies the dorsolateral prefrontal cortex, orbitofrontal cortex, and medial temporal lobe structures—particularly the amygdala—as key nodes in the brain’s fear network [
4,
5]. Activation of these areas is related both to the cognitive appraisal of threatening stimuli and to the physiological preparation for action (fight, flight, or freeze).
The reviewed studies reinforce this functional association. For example, Proverbio et al. (2024) [
7] observed that activity recorded in F3 and Fp1 significantly increased during guided emotional recall tasks focused on fear-related memories, while Ishizuka et al. (2024) [
15] found strong correlations between increased gamma-band power in T7 and subjective fear responses induced by simulations. Likewise, Subasi et al. (2021) [
18] documented sustained increases in frontal electrode activity in response to aversive stimuli, using wavelet transform and multiband segmentation.
From a practical and technological perspective, these findings have key implications. Focusing on the frontotemporal regions allows for the design of EEG systems with a reduced number of channels without compromising the quality of emotional analysis. This optimization not only lowers equipment costs but also simplifies system design, improves device ergonomics, and shortens calibration times—crucial factors for implementing portable or home-based solutions.
Furthermore, frontal regions are more anatomically accessible, facilitating sensor placement without the need for specialized technical assistance. In contexts such as monitoring elderly individuals or patients with reduced mobility, this represents a significant advantage. This choice may also enhance user acceptance by avoiding invasive or uncomfortable configurations.
However, some limitations were also identified. Certain studies do not explicitly report the electrode locations with the highest activation or fail to conduct detailed topographic analyses. This lack of standardization in result presentation limits the ability to generalize precise recommendations for optimal fear-related electrode placement.
Finally, the consistent activation of frontotemporal areas may be mediated by emotional lateralization, a property still under debate in the neurophysiological literature. Some studies suggest that fear is more strongly expressed in the right hemisphere, particularly in the right frontotemporal region [
20], which could have implications for designing asymmetric or adaptive emotional monitoring systems.
3.5. Dominant Brainwave Frequency Bands
Table 7 consolidates the most relevant findings on the activity of different brain frequency bands during the emotional experience of fear. Consistently, the reviewed studies report an increase in beta (13–30 Hz) and gamma (30–100 Hz) band activity, along with a decrease in alpha band (8–12 Hz) activity, in response to stimuli perceived as threatening. These results align with the neuroscientific literature, which links the beta band to cognitive alertness, active processing, and motor preparation [
21], while the gamma band is associated with emotional perception integration, selective attention, and the encoding of intense memories [
22].
The reduction in alpha power, particularly in frontal and occipital regions, indicates a shift from resting or relaxed states to a vigilance or threat anticipation response. Klimesch [
23] has emphasized that alpha activity is strongly related to cortical inhibition processes; its suppression suggests disinhibited access to relevant sensory information, which is adaptive in threatening contexts. In studies like [
15], pronounced alpha desynchronization was observed during fear recognition tasks, alongside gamma band increases, especially in temporal electrodes.
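The band-power comparison underlying these findings can be computed with a standard Welch periodogram. The signal below is synthetic, constructed to mimic the "vigilant" pattern described above (suppressed 10 Hz alpha, dominant 20 Hz beta); it is an illustration of the measurement, not data from the reviewed studies:

```python
import numpy as np
from scipy.signal import welch

fs = 256  # Hz, a common portable-EEG sampling rate
t = np.arange(0, 10, 1 / fs)
rng = np.random.default_rng(2)

# Synthetic "vigilant" signal: weak alpha (10 Hz), strong beta (20 Hz),
# plus broadband noise, mimicking alpha desynchronization under threat.
signal = (0.2 * np.sin(2 * np.pi * 10 * t)
          + 1.0 * np.sin(2 * np.pi * 20 * t)
          + 0.3 * rng.standard_normal(t.size))

def band_power(x, fs, lo, hi):
    """Absolute power in [lo, hi] Hz from the Welch periodogram."""
    f, psd = welch(x, fs=fs, nperseg=fs * 2)
    mask = (f >= lo) & (f <= hi)
    return psd[mask].sum() * (f[1] - f[0])

alpha = band_power(signal, fs, 8, 12)
beta = band_power(signal, fs, 13, 30)
```

In a fear-detection context, the ratio of these two quantities (or their change relative to a resting baseline) is what typically feeds the classifier, rather than either absolute value alone.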
Regarding the theta band (4–7 Hz), although less frequently reported, some studies mention its activation during deep emotional processing tasks, particularly in the medial cortex and hippocampus. This may relate to contextual memory of fear or the reactivation of past experiences [
24].
From a functional perspective, beta and gamma bands emerge as key candidate biomarkers for fear detection through EEG, given their sensitivity to intense emotional activation and their correlation with attentional and perceptual processes linked to danger. However, effective utilization requires the application of adaptive multiband feature extraction techniques, such as Discrete Wavelet Transform (DWT) or Empirical Mode Decomposition (EMD), which enable the isolation and quantification of spectral energy with greater temporal resolution.
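The multiband decomposition idea can be sketched with a hand-rolled Haar DWT. In practice a library such as PyWavelets would be used; this self-contained version only illustrates how each decomposition level isolates the energy of one frequency band (each level halves the effective sampling rate, so the first detail band holds the highest frequencies):

```python
import numpy as np

def haar_dwt_energies(x, levels):
    """Energy of each detail band from a multilevel Haar DWT.

    A self-contained stand-in for library DWTs: detail d1 covers the
    top half of the spectrum, d2 the next quarter, and so on.
    """
    energies = []
    a = np.asarray(x, dtype=float)
    for _ in range(levels):
        if a.size % 2:                 # pad to even length if needed
            a = np.append(a, a[-1])
        pairs = a.reshape(-1, 2)
        d = (pairs[:, 0] - pairs[:, 1]) / np.sqrt(2)  # detail coefficients
        a = (pairs[:, 0] + pairs[:, 1]) / np.sqrt(2)  # approximation
        energies.append(float(np.sum(d ** 2)))
    return energies  # [d1, d2, ...], highest to lowest frequency

fs = 128
t = np.arange(0, 4, 1 / fs)
# A dominant 20 Hz (beta) tone: at fs = 128 Hz, level d2 spans 16-32 Hz,
# so most of the energy should concentrate there.
x = np.sin(2 * np.pi * 20 * t)
energies = haar_dwt_energies(x, levels=4)
```

The vector of per-level energies is exactly the kind of compact multiband feature that the reviewed studies pass to SVM or tree-based classifiers.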
Moreover, gamma signals—due to their high frequency and low voltage—are especially vulnerable to artifacts and muscular noise. For this reason, studies with better results applied specialized filtering, adaptive window segmentation, and the combination of spectral and spatial analysis (e.g., topographic mapping) to ensure reliability [
7,
18].
In practical environments, such as emotional monitoring in older adults or the implementation of real-time alert systems, this information is highly valuable. The use of these bands as real-time indicators could enable the deployment of low-latency emotional inference models, particularly when combined with sustained emotional stimulation protocols. Additionally, alpha suppression could serve as an initial activation signal in systems requiring a baseline threshold before triggering an alert.
Nevertheless, a gap is noted in the literature regarding inter-individual variability of emotional bands, as well as their dynamic interaction. Future research should explore multivariate and longitudinal models analyzing the simultaneous coactivation of bands in response to different types of fear (acute, anticipatory, and contextual) and across various populations (young vs. older adults; clinical vs. healthy).
3.6. Data Preprocessing
Table 8 highlights that EEG data preprocessing is a critical stage in the analysis pipeline, with a direct impact on machine learning quality and emotional classification accuracy. The reviewed studies applied a diverse combination of techniques, including band-pass filtering (typically between 0.5 and 50 Hz), artifact removal for ocular or muscular movement (using manual or automated methods such as ICA), segmentation into temporal windows, and advanced transformations such as Empirical Mode Decomposition (EMD), Discrete Wavelet Transform (DWT), or Fourier Transform.
For instance, ref. [
18] demonstrated that applying DWT followed by dimensionality reduction through PCA significantly improved SVM classifier accuracy while reducing redundant noise. The authors of [
15] used EMD to extract intrinsic signal components and combined these vectors with logistic regression and supervised learning algorithms, achieving up to 12% improvements in accuracy metrics compared to unprocessed signals. Additionally, studies like [
14] implemented fully automated cleaning and normalization pipelines, enabling the use of portable EEG in uncontrolled conditions, such as emotional gaming at home.
The data suggest that the effectiveness of a fear detection system does not rely solely on the classification algorithm but rather on the quality of the extracted features, which in turn depend on robust preprocessing. This implies that any real-life-oriented system should include a standardized and automated EEG signal processing pipeline. Ideally, this pipeline should address the following: (i) physiological artifact removal, (ii) adaptive contextual segmentation, (iii) multiband feature extraction (wavelet and EMD), and (iv) dimensionality reduction to enhance computational efficiency. Ignoring this stage compromises system reliability and may result in false positives or negatives, especially in noisy environments or with complex brain signals.
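The four pipeline stages listed above can be sketched end to end. This is a minimal single-channel illustration on mock data: stage (i) is reduced to band-pass filtering (ICA-based artifact removal needs multichannel recordings), the 2 s window length is an assumption, and EMD is replaced here by simple Welch band powers:

```python
import numpy as np
from scipy.signal import butter, filtfilt, welch
from sklearn.decomposition import PCA

fs = 256
rng = np.random.default_rng(3)
raw = rng.standard_normal(fs * 60)  # one minute of mock single-channel EEG

# (i) Band-pass filter 0.5-50 Hz (physiological artifact removal such as
#     ICA would normally sit here as well).
b, a = butter(4, [0.5, 50], btype="bandpass", fs=fs)
clean = filtfilt(b, a, raw)

# (ii) Segment into non-overlapping 2 s windows (window length assumed).
win = 2 * fs
segments = clean[: clean.size // win * win].reshape(-1, win)

# (iii) Multiband feature extraction: theta/alpha/beta/gamma power per window.
bands = [(4, 7), (8, 12), (13, 30), (30, 100)]
def features(seg):
    f, psd = welch(seg, fs=fs, nperseg=win // 2)
    return [psd[(f >= lo) & (f <= hi)].sum() for lo, hi in bands]
X = np.array([features(s) for s in segments])

# (iv) Dimensionality reduction before classification.
X_red = PCA(n_components=2).fit_transform(X)
```

The output `X_red` is what a downstream SVM or tree classifier would consume; keeping the whole chain scripted and deterministic is what makes the "standardized and automated" requirement above achievable.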
3.7. Emotion Model in Fear Detection
Table 9 presents the analysis of emotion models used across the studies, revealing a methodological diversity that significantly impacts the quality of fear classification. The basic emotions model by Ekman [
7,
8,
15] has proven effective for direct classification tasks due to its simplicity and its alignment with well-defined neurophysiological patterns. However, its discrete approach may limit the detection of emotional nuances.
In contrast, the circumplex model by Russell, often used with scales like SAM [
9,
14,
17], allows for a more nuanced representation of affective states across valence and arousal dimensions. This model offers higher sensitivity to subtle changes but requires more complex computational approaches.
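The valence/arousal geometry can be made concrete with a small mapping function. The 5.0 midpoint on the 1-9 SAM scale and the quadrant labels are illustrative assumptions, not thresholds taken from the reviewed studies:

```python
def circumplex_quadrant(valence, arousal, midpoint=5.0):
    """Map SAM-style ratings (1-9 scale) to a circumplex quadrant.

    Illustrative only: fear conventionally falls in the negative-valence,
    high-arousal quadrant; the midpoint and label set are assumptions.
    """
    if arousal >= midpoint:
        return "fear/anger" if valence < midpoint else "joy/excitement"
    return "sadness/boredom" if valence < midpoint else "calm/contentment"
```

A hard quadrant split like this discards the graded information that makes the dimensional model attractive; real systems typically regress valence and arousal as continuous values and threshold only at the decision stage.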
The five-level fear model [
16] stands out as a middle ground, enabling finer-grained classifications in controlled environments such as flight simulators. Meanwhile, guided acting approaches provide experimental control but reduce emotional authenticity, whereas naturalistic induction [
13] improves ecological validity at the cost of greater variability and noise.
In summary, there is no universally superior model. The choice must align with system objectives, experimental context, and target population. Discrete models are suitable for simplified clinical settings, while dimensional or graduated models offer greater sensitivity for adaptive or long-term monitoring systems.
3.8. Stimulus Duration and Experimental Context
Table 10 shows that the duration of the emotional stimulus and the context in which it is presented are key factors in the effectiveness of fear detection using EEG. Studies with prolonged protocols, such as [
16], which used flight simulators in sessions lasting up to 30 min, achieved greater stability in EEG patterns associated with fear. Similarly, Alakus et al. (2020) [
14] employed 5-min emotional video game sessions, generating continuous and contextualized interaction with the stimulus, which enabled the detection of more robust brain responses.
In contrast, studies such as Balan et al. (2019) and Subasi et al. (2021) [
9,
18], which worked with databases like DEAP (using videos approximately 1 min long), showed acceptable results but with higher variability across subjects. Even shorter studies, such as those using acted clips from the CREMA-D dataset (1–3 s), presented low emotional resolution due to the brain’s limited ability to generate an intense affective response within such brief time windows.
Stimulus duration directly influences the EEG system’s ability to capture emotional dynamics: longer stimuli elicit more sustained brain activity, often with higher amplitude. Prolonged stimuli therefore allow the observation of the full cycle of emotional processing (anticipation, perception, cognitive evaluation, and physiological response). Additionally, extended protocols reduce the likelihood of artifacts caused by sudden noise or startle reactions.
In practical applications, it is recommended that systems include sustained analysis windows or prolonged/interactive stimuli to ensure that brain responses are consistent enough to feed a classification algorithm with high reliability. Consequently, immersive environments, extended narratives, or real-time interaction not only increase system sensitivity but also enable the segmentation of more stable and less noise-prone signals. This is essential for real-time applications where decisions must be based on continuous and cumulative evidence.
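The sustained-window idea can be sketched as an overlapping segmentation over a continuous stream. The 4 s window and 1 s step are illustrative choices, not parameters from the reviewed studies:

```python
import numpy as np

def sliding_windows(x, fs, win_s, step_s):
    """Overlapping analysis windows over a continuous EEG stream.

    Longer, overlapping windows let a classifier accumulate evidence
    across a sustained stimulus instead of reacting to single noisy
    epochs, matching the recommendation above.
    """
    win, step = int(win_s * fs), int(step_s * fs)
    starts = range(0, x.size - win + 1, step)
    return np.stack([x[s:s + win] for s in starts])

fs = 128
stream = np.arange(fs * 10, dtype=float)  # 10 s mock stream
windows = sliding_windows(stream, fs, win_s=4.0, step_s=1.0)
```

Each window would be featurized and classified independently, with the final decision taken over several consecutive windows, which is one simple way to implement the "continuous and cumulative evidence" requirement for real-time use.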
3.9. Age Influence on EEG Patterns Associated with Fear
The age composition of the samples used in the studies included in this review presents significant methodological and conceptual implications for the detection of fear using EEG signals and artificial intelligence. As shown in
Table 11, a substantial proportion of the studies focused exclusively on young adults, primarily university students, with little to no representation of other stages of the lifespan. This homogeneous sampling pattern was observed in at least seven of the eleven reviewed studies, while only two investigations, specifically those by Cao et al. and Sneddon et al., reported demographically diverse samples. Additionally, in two cases, the participants’ ages were not explicitly reported, representing a critical methodological omission in neurophysiological studies, where age constitutes a key modulator of brain activity.
Age is a decisive factor in the functional configuration of the brain, particularly in frontal regions, which are central to emotional processing. In this context, multiple studies have documented that the electrical dynamics of the brain, as measured by electroencephalography, change substantially across the lifespan. Cortical maturation, synaptic reorganization, and variations in functional connectivity directly influence the morphology of EEG signals and their association with affective and cognitive states [
25,
26,
27].
A central metric in emotional EEG evaluation is frontal alpha asymmetry, particularly in electrodes F3 and F4, which has been consistently associated with negatively valenced emotions such as fear. The presence of right-hemisphere alpha suppression in response to threatening stimuli has been interpreted as a marker of cortical activation linked to withdrawal motivation and aversive processing [
28,
29]. This alpha suppression acts as a neurophysiological indicator of emotional engagement. However, the strength and consistency of this response are not uniform across ages. Studies such as Marshall et al. [
25] have shown that during early childhood and adolescence, frontal asymmetry patterns are not yet fully developed, limiting their reliability as emotional markers at those stages. Conversely, research by Christou et al. [
27] suggests that in older adults, interindividual variability in EEG responses to emotional stimuli increases, potentially reflecting neurobiological aging and cognitive compensation mechanisms.
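The F3/F4 asymmetry metric discussed above is conventionally computed as the difference of log alpha powers. The signals below are synthetic, constructed so that the right channel shows the suppressed alpha described in the text; this is a sketch of the metric, not of any reviewed study's analysis:

```python
import numpy as np
from scipy.signal import welch

def alpha_power(x, fs):
    """Alpha-band (8-12 Hz) power via the Welch periodogram."""
    f, psd = welch(x, fs=fs, nperseg=min(x.size, 2 * fs))
    mask = (f >= 8) & (f <= 12)
    return psd[mask].sum() * (f[1] - f[0])

def frontal_alpha_asymmetry(f3, f4, fs):
    """FAA = ln(alpha power at F4) - ln(alpha power at F3).

    Negative values indicate right-hemisphere alpha suppression, i.e.,
    relatively greater right-frontal activation, the pattern linked
    above to withdrawal motivation and aversive processing.
    """
    return np.log(alpha_power(f4, fs)) - np.log(alpha_power(f3, fs))

fs = 256
t = np.arange(0, 8, 1 / fs)
rng = np.random.default_rng(4)
f3 = np.sin(2 * np.pi * 10 * t) + 0.1 * rng.standard_normal(t.size)        # intact left alpha
f4 = 0.3 * np.sin(2 * np.pi * 10 * t) + 0.1 * rng.standard_normal(t.size)  # suppressed right alpha
faa = frontal_alpha_asymmetry(f3, f4, fs)
```

Because the metric is a within-subject ratio of homologous electrodes, it partially cancels individual differences in overall alpha amplitude, though, as the text notes, its developmental stability across age groups remains an open issue.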
This body of evidence highlights the need to treat age not merely as a descriptive variable but as a critical parameter that must be explicitly incorporated into algorithmic models for emotion detection. Using artificial intelligence models trained exclusively on data from young adults entails the risk of generating predictive biases when applied to different populations. This concern is far from trivial as the field of affective neurotechnology is advancing toward clinical, educational, and wellness applications, where model validity depends directly on its capacity to represent human brain diversity.
It is worth noting that this concern has already been addressed in other areas where EEG and AI are applied. For example, in automated diagnosis of depressive disorders, variables such as age and gender have been shown to affect brain connectivity patterns and nonlinear EEG features, thereby influencing classifier performance [
30]. Similarly, in studies of mild cognitive impairment, the activity of alpha rhythm sources varies as a function of age and correlates with cerebral metabolic state [
31]. Even in brain–computer interfaces designed for navigation or rehabilitation, user-specific calibration has proven essential to improve model accuracy [
32].
In light of these findings, it becomes evident that the limited age diversity in studies on fear detection using EEG constitutes one of the major methodological weaknesses in the field. Including participants of different age groups would not only enhance the generalizability of models but also allow for a more rigorous exploration of neurophysiological emotion variations across the lifespan. Furthermore, integrating such data into the training and validation of AI systems could open the door to developing more robust, personalized, and ethically aligned classifiers that better reflect the complexity of the human emotional brain.
3.10. Hybrid and Multimodal Approaches to EEG-Based Fear Detection
The integration of electroencephalography (EEG) with artificial intelligence has catalyzed the emergence of hybrid systems for fear detection. As shown in
Table 12, the reviewed studies collectively illustrate a transition from conventional, retrospective approaches to adaptive, multimodal systems capable (at least in part) of real-time emotion tracking. These systems vary in technical sophistication, methodological design, and translational maturity but together delineate the contours of an evolving research frontier in affective neuroscience and computational modeling.
Broadly speaking, the studies can be categorized into two main trajectories. The first relies on pre-established, emotion-labeled EEG datasets such as DEAP (e.g., Bălan et al. (2019) and Krishna et al. (2019), [
9,
11]), which provide high experimental control but limit ecological validity and emotional authenticity. These works employ multichannel EEG combined with peripheral physiological signals (e.g., GSR, EMG, temperature, and respiration), extracting complex features such as fractal dimensions, entropy, and Hjorth parameters. Classifiers such as Extreme Learning Machines (ELMs) and Support Vector Machines (SVMs) yielded classification accuracies exceeding 85%, particularly when leveraging multimodal input. However, the reliance on offline labeling post-stimulation introduces constraints regarding temporal precision and undermines their real-world adaptability.
Conversely, studies by Ishizuka et al. (2024) [
15] and de Man and Stassen (2017) [
8] pursued experimental designs tailored to the participant, focusing on emotionally relevant stimuli and personalized emotion tagging. Ishizuka et al. employed an 8-channel EEG system, combined with empirical mode decomposition (EMD) and a post-stimulation interview for labeling emotional segments. Despite the small sample size (
n = 5), their system achieved a subject-dependent classification accuracy of 87.7%, demonstrating feasibility for lightweight, portable applications. de Man et al. [
8], on the other hand, used a single-channel EEG with real-time emotional self-reporting via a rotary potentiometer, achieving meaningful differentiation between calm and fear states through high-beta and low-gamma band analysis. These studies, although methodologically less complex, offer valuable insight into the construction of low-cost systems with potential for continuous, subjective emotion assessment.
Kim et al. (2019) [
10] represent a middle-ground approach, exploring perceptual fear in urban settings through mobile EEG and static nightscape images. Their findings—an increase in beta activity and decrease in alpha rhythms in response to isolated, human-free urban scenes—underscore the influence of environmental context in modulating fear responses. Although not based on deep learning nor real-time classification, this work broadens the scope of fear induction paradigms and supports applications in environmental psychology and urban design.
From a translational standpoint, the study by Turnip et al. (2024) [
16] is particularly noteworthy. Utilizing a clinical-grade EEG system (Mitsar 202) with 19 channels, they simulated critical landing phases in a flight simulator and achieved 92.95% accuracy in classifying five levels of fear using Decision Tree classifiers. The system integrates signal features from alpha, beta1, beta2, and gamma bands, highlighting the feasibility of EEG-based fear recognition in complex, cognitively demanding tasks. Nevertheless, the lack of physiological signal fusion, real-world deployment, and inter-subject validation indicates that the system remains in a pre-implementation phase.
Algorithmically, the reviewed studies confirm the superiority of hybrid and deep learning models (such as CNN-LSTM, TQWT + ELM, and EMD + SVM) over traditional classifiers. These architectures benefit from their capacity to integrate spatial and temporal features, capturing nonlinear dynamics of affective responses. However, the absence of cross-dataset validation and the frequent use of single, pre-labeled databases limit the generalizability of these results. Moreover, the field continues to lack standardized evaluation metrics and shared benchmarks, hindering comparative analysis and reproducibility.
Equally concerning is the persistent absence of ethical safeguards. None of the reviewed studies addressed critical issues such as neural data protection, dynamic consent models, or algorithmic fairness—despite operating on neural correlates of one of the most evolutionarily significant and clinically sensitive human emotions. Given that fear responses are deeply modulated by trauma, context, and sociocultural background, the lack of ethical oversight is not only a methodological shortcoming but also a structural gap in affective computing research. Future systems must prioritize equitable design, user-centered calibration, and participatory frameworks to avoid replicating or amplifying bias under the guise of technological neutrality.
Overall, the reviewed systems reflect a field in active maturation, combining algorithmic innovation with increasingly ecologically valid paradigms. Yet, key gaps remain—methodologically (e.g., longitudinal validation and inter-subject replicability), operationally (e.g., testing in real-world environments), and socio-technically (e.g., ethical deployment and user diversity). Bridging these gaps requires coordinated interdisciplinary efforts across neuroscience, engineering, human–computer interaction, and neuroethics.
Only through such integration can EEG-based fear detection systems evolve from high-performing prototypes into socially responsible technologies, capable of serving health, safety, and human well-being without compromising personal dignity or cognitive autonomy.
3.11. Applicability of Hybrid EEG Systems in the Detection of Other Emotions
While this review has focused on fear detection through EEG signals, it is important to note that these technologies have also demonstrated significant potential in identifying other complex emotional states, such as depression. Several studies have employed hybrid systems that combine EEG with artificial intelligence algorithms and multimodal approaches, yielding promising results in both diagnostic accuracy and clinical characterization.
For instance, Yasin et al. [33] conducted a comprehensive review of methods integrating audiovisual and EEG signals with machine learning algorithms such as SVM, KNN, and CNN. Their analysis concluded that multimodal systems significantly enhance depression detection and relapse prediction. However, they highlighted limitations such as subject variability and the need for methodological standardization.
In a subsequent study, the same group evaluated the effectiveness of active and passive EEG paradigms, combined with classifiers (SVM, Decision Trees, and logistic regression) and clustering techniques (t-SNE and kernel PCA). Their hybrid system achieved 100% accuracy using Decision Trees in distinguishing depressed individuals from controls, emphasizing the relevance of active paradigms in detecting cognitive deficits associated with depression [34].
Similarly, Yasin et al. [35] proposed a hybrid model based on a CNN-LSTM architecture applied to quadriplegic patients, achieving 98% accuracy in depression relapse detection. This study is pioneering in integrating EEG and clinical data for this specific population, demonstrating the versatility of these techniques in challenging clinical contexts.
Finally, their systematic review focused on EEG-based diagnosis of major depressive disorder and bipolar disorder using artificial neural networks (ANN, CNN, RNN, and DBN). The review highlights the effectiveness of hybrid models leveraging nonlinear features and biomarkers while pointing out the need for dataset standardization and better handling of EEG signal noise [36].
These findings reinforce the notion that EEG-based systems, especially when combined with hybrid AI models, are not only useful for detecting fear but also offer a robust framework for identifying emotional disorders such as depression, opening new avenues for clinical and technological research.
3.12. Synthesis and Response to the Research Question
Based on the comprehensive analysis of the 11 studies included in this systematic review, key elements were identified that define an optimal configuration for fear detection from EEG signals using artificial intelligence. Together, these findings provide a well-supported answer to the research question.
First, it is confirmed that non-linear algorithms such as Support Vector Machines (SVMs) and Decision Trees offer a superior balance between accuracy, efficiency, and ease of implementation. Their performance significantly improves when trained on preprocessed data labeled using consistent emotional models (e.g., Ekman or Russell), achieving accuracies above 90% under controlled conditions. Moreover, these models have demonstrated greater robustness to inter-individual variability—a critical aspect for diverse populations such as older adults.
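To make the threshold-based logic of Decision Trees concrete, the toy pure-Python stump below labels a sample as fear-like from two hypothetical band-power features. The thresholds and feature names are illustrative assumptions, not values reported in the reviewed studies.

```python
# Toy decision stump for "fear" vs "neutral" classification from two
# hypothetical band-power features. Thresholds are illustrative only,
# not learned from any reviewed dataset.
def classify(gamma_power: float, alpha_power: float) -> str:
    # A decision tree splits on logical thresholds; here, elevated gamma
    # combined with suppressed alpha is treated as the fear-like pattern.
    if gamma_power > 0.6:
        return "fear" if alpha_power < 0.4 else "neutral"
    return "neutral"

samples = [(0.8, 0.2), (0.8, 0.7), (0.3, 0.5)]
labels = [classify(g, a) for g, a in samples]  # ['fear', 'neutral', 'neutral']
```

A real tree learns many such splits from labeled data; this hierarchical, rule-like structure is what makes the models both interpretable and cheap enough for embedded deployment.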
Second, the importance of the frontotemporal brain regions in fear encoding is emphasized, which allows for a reduced electrode configuration without sacrificing diagnostic quality. This observation is fundamental for the design of portable systems that do not rely on complex clinical setups. The use of commercial EEG devices such as Emotiv or MyndPlay is justified in this context, provided they are used with specific placement protocols and standardized signal-cleaning methods.
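The reduced-montage idea can be sketched as a simple channel-selection step. The channel names follow the standard 10-20 convention, but the specific frontotemporal subset chosen here is an illustrative assumption, not a prescription from the reviewed studies.

```python
# Standard 10-20 channel names; the frontotemporal subset below is a
# hypothetical reduced montage of the kind portable headsets provide.
FULL_10_20 = ["Fp1", "Fp2", "F7", "F3", "Fz", "F4", "F8",
              "T3", "C3", "Cz", "C4", "T4",
              "T5", "P3", "Pz", "P4", "T6", "O1", "O2"]

FRONTOTEMPORAL = {"Fp1", "Fp2", "F7", "F8", "T3", "T4"}

def select_channels(data_by_channel: dict, keep: set) -> dict:
    """Keep only the channels present in the reduced montage."""
    return {ch: sig for ch, sig in data_by_channel.items() if ch in keep}

recording = {ch: [] for ch in FULL_10_20}  # placeholder signals per channel
reduced = select_channels(recording, FRONTOTEMPORAL)
```

Dropping from 19 to 6 channels in this way is what makes consumer headsets viable, provided placement and signal-cleaning protocols are standardized as noted above.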
Additionally, the stimuli used to induce fear are a key component of the system architecture. Prolonged simulations, emotional video games, and immersive scenarios generate more distinct and sustained neural patterns compared to short clips or artificial stimuli. This leads to higher classifier reliability, fewer false positives, and greater sensitivity to gradual emotional state changes.
Furthermore, multiband feature extraction focused on gamma and beta activity, together with alpha suppression, proves to be a robust practice for capturing fear-related brain signals. When combined with techniques such as DWT or EMD, this enables the construction of richer time–frequency representations of the EEG signal, facilitating detection and classification by learning algorithms.
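As a minimal illustration of multiband feature extraction, the NumPy sketch below estimates band power directly from the FFT of a synthetic signal. Real pipelines would use Welch's method, DWT, or EMD; the band limits and the test signal here are illustrative assumptions.

```python
import numpy as np

def band_power(signal, fs, lo, hi):
    """Mean spectral power in [lo, hi) Hz via a plain FFT periodogram."""
    freqs = np.fft.rfftfreq(signal.size, d=1.0 / fs)
    psd = np.abs(np.fft.rfft(signal)) ** 2 / signal.size
    mask = (freqs >= lo) & (freqs < hi)
    return psd[mask].mean()

fs = 256
t = np.arange(fs * 4) / fs  # 4 s of synthetic signal
# Strong 40 Hz (gamma) component with a weak 10 Hz (alpha) component,
# mimicking the gamma-dominant / alpha-suppressed pattern described above.
x = 1.0 * np.sin(2 * np.pi * 40 * t) + 0.2 * np.sin(2 * np.pi * 10 * t)

bands = {"alpha": (8, 13), "beta": (13, 30), "gamma": (30, 45)}
powers = {name: band_power(x, fs, lo, hi) for name, (lo, hi) in bands.items()}
```

Stacking such per-band powers across channels yields the multiband feature vectors (or, when computed per time window, the time–frequency images) fed to the classifiers discussed above.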
Ultimately, the cross-cutting element that determines system effectiveness is the integration of all these techniques into a coherent, automated, and experimentally validated pipeline. It is not enough to select powerful algorithms or appropriate sensors—the synergistic articulation between stimulus, signal acquisition, preprocessing, feature selection, and inference defines the real capacity of a system to detect fear with accuracy, stability, and in real time.
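The pipeline integration described above can be sketched as a chain of interchangeable stages. Every stage below is a placeholder standing in for a real component (acquisition hardware, filtering, feature extraction, a trained classifier); only the structure, not the content, is the point.

```python
# Minimal pipeline skeleton: each function is a stand-in for a real stage.
def acquire():
    return [[0.0] * 256 for _ in range(6)]   # raw EEG: 6 channels, 1 s window

def preprocess(x):
    return x                                  # filtering, artifact removal

def extract(x):
    return [sum(ch) for ch in x]              # e.g., per-channel band powers

def infer(features):
    return "neutral"                          # stand-in for a trained model

def pipeline():
    return infer(extract(preprocess(acquire())))

result = pipeline()
```

Because each stage exposes a narrow interface, any one of them can be swapped (a different headset, a different feature set, a different classifier) without redesigning the rest, which is precisely the synergistic articulation the reviewed systems aim for.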
In summary, the reviewed studies converge on a comprehensive methodological model where the combination of accessible hardware, realistic emotional stimulation protocols, advanced neuroinformatics processing, and non-linear classification models offers a more effective and applicable approach for automatic fear detection in clinical, preventive, and assistive contexts.
3.13. Ethical Considerations
One of the most relevant findings of this review is the very limited number of studies that explicitly address fear detection using electroencephalography (EEG) and artificial intelligence. Only eleven studies met the inclusion criteria, which not only reflects the emerging nature of this research area but also underscores the presence of multiple ethical, methodological, and technological barriers that hinder its development. These include the emotional risks involved in inducing fear, the complexity of obtaining informed consent in neuroemotional contexts, and the widespread absence of robust ethical protocols for the collection, interpretation, and use of brain data linked to affective states.
Inducing fear experimentally entails significant emotional and psychological risks. Unlike positive or neutral emotions, fear can activate traumatic memories, trigger acute anxiety episodes, and cause prolonged psychological discomfort, especially among individuals with clinical backgrounds or emotional vulnerability [4,5]. This affective load explains why many ethics committees restrict or reject protocols involving the direct induction of fear, thus limiting the number of viable studies. However, this also reveals an unresolved tension: it is necessary to study fear as a relevant clinical marker, but under conditions that guarantee the emotional safety and integrity of participants.
Among the reviewed studies, a general absence of differential ethical frameworks was identified. Few works reported specific strategies to ensure emotionally contextualized informed consent. None offered emotional crisis management protocols or post-exposure psychological support. Similarly, there is a lack of procedures for handling sensitive data, such as EEG patterns associated with fear responses, which could be used—especially in unregulated settings—for emotional profiling or inference without the subject’s awareness or consent.
This gap becomes even more critical when considering the demographic composition of the samples. As shown in Table 11, most studies relied on homogeneous populations: young, university-educated adults, with minimal or no ethnic, age, or clinical diversity. The absence of demographic representation not only restricts generalizability but also increases the risk of algorithmic bias, exclusion, and classification errors when these models are applied to broader populations. From an ethical standpoint, this constitutes a problem of epistemic and distributive justice: technologies are being built based on data from a narrow segment of humanity, while their deployment is intended to be global and inclusive.
The risk is not merely technical. If algorithms are trained to detect emotional patterns based on non-diverse EEG data, they are likely to produce false positives or negatives, particularly in individuals whose neural responses vary due to genetic, developmental, or cultural factors. In clinical contexts, this could lead to misdiagnoses or stigma. In occupational or security environments, it could result in discriminatory practices or unauthorized emotional surveillance. The absence of safeguards exposes participants to unregulated and ethically ambiguous uses of their brain activity—a form of technological vulnerability that current legislation has yet to address.
Likewise, the type of stimulus used to induce fear is ethically and methodologically critical. As suggested in the literature [37], the use of controlled, progressive, and symbolic environments, such as virtual reality or abstract representations, is necessary to elicit fear without causing trauma. Poorly designed or overly intense stimuli can not only cause emotional harm but also compromise the ecological validity of the EEG signal. An emotional state artificially induced by unrealistic or irrelevant stimuli will generate atypical physiological responses, distorting the emotional brain signature and reducing the accuracy of AI models.
While some studies have explored fear detection in vulnerable populations—such as individuals with psychiatric conditions or quadriplegia—few have implemented differentiated ethical protocols. These populations require reinforced levels of consent, support, and protection, given their potential cognitive or communicative limitations. This is especially relevant when employing artificial intelligence, a technology often opaque to the general public and difficult to interpret without technical expertise.
Taken together, this review reveals a structural ethical gap in the development of EEG- and AI-based fear detection systems. Addressing this gap requires going beyond mere technical validation or institutional approval. It demands the integration of ethical principles from AI research (e.g., explainability, algorithmic fairness, and non-maleficence) as well as neuroethical frameworks sensitive to human diversity. Only in this way can we develop technologies that are not only effective but also fair, transparent, safe, and socially legitimate.