4.1. EDA-Graph Features for Continuous Estimations of Arousal and Valence
Our comparative analysis demonstrates that the EDA-graph approach fundamentally transforms how physiological signals are represented and processed for emotion recognition. Traditional EDA methods, while well-established, exhibit systematic limitations in tracking sympathetic nervous system dynamics, particularly during rapid emotional changes. As illustrated in
Figure 5, conventional time-domain and frequency-domain features struggle to capture the complex nonlinear dynamics of autonomic responses, resulting in substantial deviations between recognized and actual emotional states (RMSE: 1.039 for Arousal and 1.182 for Valence). This limitation aligns with previous research [
52,
53], highlighting the inadequacy of conventional methods in capturing the multifaceted dynamics of autonomic responses, even when decomposing EDA signals into phasic and tonic components.
The EDA-graph approach offers three critical advantages over traditional methods. First, by transforming time-series data into graph structures (as illustrated in
Figure 1), our method preserves complex temporal relationships that would otherwise be lost in conventional analyses. Second, the extraction of topological and spectral features allows us to capture higher-order patterns that reflect the intricate organization of physiological responses during emotional experiences. Third, graph metrics provide a framework for modeling both short- and long-term dependencies in physiological signals, offering a more comprehensive representation of emotional states.
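To make the transformation concrete, the following minimal sketch builds a graph from a windowed EDA segment using the natural visibility criterion, one common time-series-to-graph construction; the window length, the synthetic signal, and the unweighted construction are illustrative assumptions, not a specification of the exact mapping used in our pipeline.

```python
import numpy as np
import networkx as nx

def natural_visibility_graph(x: np.ndarray) -> nx.Graph:
    """One node per sample; samples a and b are connected if every
    intermediate sample lies strictly below the straight line joining
    (a, x[a]) and (b, x[b]) (the natural visibility criterion)."""
    n = len(x)
    g = nx.Graph()
    g.add_nodes_from(range(n))
    for a in range(n - 1):
        for b in range(a + 1, n):
            if all(x[c] < x[a] + (x[b] - x[a]) * (c - a) / (b - a)
                   for c in range(a + 1, b)):
                g.add_edge(a, b)
    return g

# Toy usage on a synthetic segment standing in for a windowed EDA signal.
rng = np.random.default_rng(0)
segment = np.cumsum(rng.normal(size=64))
g = natural_visibility_graph(segment)
print(g.number_of_nodes(), g.number_of_edges())
```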
These advantages translate directly into quantifiable performance improvements. Our approach achieved RMSE values of 0.801 for Arousal and 0.714 for Valence, which are lower than those reported in previously published work on the CASE dataset. These are indirect comparisons; while we used the same dataset, differences in preprocessing, subject partitioning, and hyperparameter selection protocols may have affected the results. Notable reference points include Dollack et al. [
39] (RMSE: 1.130 for Arousal, 0.74 for Valence using Gradient Boosting), D’Amelio et al. [
39] (RMSE: 0.846 for Arousal, 0.867 for Valence using XGBoost), and Pinzon-Arenas et al. [
39] (RMSE: 1.342 ± 1.05 for Arousal, 1.336 ± 1.25 for Valence using TCN-SBU-LSTM). The consistent improvement across both Arousal and Valence dimensions, even when compared with sophisticated deep learning architectures, demonstrates the robust advantage of our graph–theoretical approach.
Feature importance analysis revealed distinct patterns across modalities, providing insights into the physiological mechanisms underlying emotional responses. Most notably, graph total eigenvector centrality emerged as the most influential feature with a PI score of 0.756 (MDI = 0.035), substantially higher than all other features. This finding suggests that eigenvector centrality, which measures how well each node is connected to other highly connected nodes in the graph, uniquely captures network information flow patterns corresponding to emotional states. The recursive nature of this metric appears particularly well-suited to representing the complex organization of autonomic responses during emotional experiences [
49,
50,
51,
52,
53].
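As a point of reference for this metric, the sketch below computes a per-graph scalar by summing node-level eigenvector centralities (obtained from the leading eigenvector of the adjacency matrix); the sum-over-nodes aggregation is an assumption standing in for the exact definition of "graph total eigenvector centrality" used in our feature set.

```python
import networkx as nx

def total_eigenvector_centrality(g: nx.Graph) -> float:
    """Scalar graph feature: sum of node eigenvector centralities,
    computed from the leading eigenvector of the adjacency matrix."""
    cent = nx.eigenvector_centrality_numpy(g)
    return float(sum(cent.values()))

# Example on a small built-in graph; in the pipeline g would be one
# per-window EDA graph.
print(total_eigenvector_centrality(nx.karate_club_graph()))
```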
We validated the critical role of eigenvector centrality through systematic feature ablation studies [
54]. Removing this single feature resulted in a 42.3% mean increase in RMSE across all 30 LOSO folds (95% CI: 40.7–43.9%, bootstrap with 1000 resamples of the fold-level values), while removing any other individual feature produced a performance drop of less than 5%. The ablation protocol used complete retraining within each fold: the model was refit from scratch with eigenvector centrality excluded, ensuring the result cannot be attributed to a fixed-model masking artefact. The finding was consistent across LASSO (39.8% RMSE increase), LightGBM (35.4%), and CatBoost (36.9%), ruling out algorithm-specific bias. A variance inflation factor (VIF) of 1.34 for eigenvector centrality confirmed low collinearity with the remaining features, establishing that the observed dominance reflects a genuine feature–outcome relationship rather than a multicollinearity artefact. This extreme performance degradation demonstrates that eigenvector centrality captures essential, non-redundant information about emotional states within the EDA signal. While such highly skewed feature importance distributions are uncommon in typical machine learning contexts, similar patterns have been documented in other complex physiological modeling domains [
55], particularly in neurophysiological signal analysis where physiologically-relevant measures can dominate predictive performance due to strong correlations with underlying biological mechanisms.
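A minimal sketch of this retrain-per-fold ablation protocol follows; the LASSO base estimator, regularization strength, and bootstrap size are placeholders, and `subjects` is assumed to hold one subject ID per sample.

```python
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.metrics import mean_squared_error

def loso_ablation(X, y, subjects, drop_idx, alpha=0.01, n_boot=1000, seed=0):
    """For each left-out subject, retrain with and without feature
    `drop_idx` and record the relative RMSE increase; summarize with a
    bootstrap CI over the fold-level increases."""
    rng = np.random.default_rng(seed)
    keep = [j for j in range(X.shape[1]) if j != drop_idx]
    increases = []
    for s in np.unique(subjects):
        tr, te = subjects != s, subjects == s
        full = Lasso(alpha=alpha).fit(X[tr], y[tr])
        rmse_full = mean_squared_error(y[te], full.predict(X[te])) ** 0.5
        ablated = Lasso(alpha=alpha).fit(X[tr][:, keep], y[tr])
        rmse_abl = mean_squared_error(y[te], ablated.predict(X[te][:, keep])) ** 0.5
        increases.append(100.0 * (rmse_abl - rmse_full) / rmse_full)
    increases = np.asarray(increases)
    boots = [rng.choice(increases, len(increases)).mean() for _ in range(n_boot)]
    lo, hi = np.percentile(boots, [2.5, 97.5])
    return increases.mean(), (lo, hi)
```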
Our analysis also revealed fundamental differences in how the Arousal and Valence dimensions manifest in physiological signals. Arousal estimation demonstrated stronger correlations with specific graph-based features (PI: 0.27 for component eccentricity and 0.11 for total triangle number), suggesting that sympathetic activation patterns create distinct topological signatures in the EDA signal [
56]. This finding aligns with the established understanding that Arousal directly modulates sympathetic nervous system activity through well-defined physiological pathways [
57]. In contrast, Valence estimation exhibited more subtle and complex patterns in the graph features, indicating that emotional Valence may influence the fine structure of autonomic responses through more intricate mechanisms [
58]. This differential pattern highlights a fundamental challenge in emotion recognition: while Arousal has clear physiological correlations easily captured through direct measurements, Valence manifests through a more nuanced modulation of autonomic activity patterns, requiring more sophisticated analysis approaches for effective detection and quantification.
As demonstrated in
Figure 5, our graph-based approach successfully captures dynamic trends in both Arousal and Valence dimensions after training, while traditional EDA features fail to track these changes effectively. This superior tracking capability suggests that emotional states influence the structural organization of physiological signals through multiple pathways, integrating both sympathetic and parasympathetic responses in ways that conventional analysis methods cannot adequately represent.
4.2. Feature Differentiation Across Emotional States
Our statistical analysis revealed significant differences in how various physiological signal representations capture emotional information, with important implications for affective computing. EDA-graph features demonstrated superior discriminative capabilities across all affective quadrants, representing a fundamental advancement in psychophysiological measurement methodology.
Component Eccentricity, Graph Energy, Radius, and Graph Spectrum Maximum exhibited statistically significant differences across all five affective quadrants (LALV, LAHV, HALV, HAHV, and Neutral) following the Holm–Bonferroni correction (
p < 0.005). This finding transcends the conventional understanding presented by Jang et al. [
59], as our graph–theoretical approach not only captures subtle variations but also fundamentally transforms how physiological signals are represented in affective computing.
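For concreteness, the corrected testing procedure can be sketched as below; the omnibus test is assumed here to be Kruskal–Wallis (this section does not name the test), with the Holm procedure applied across the feature family via statsmodels.

```python
import numpy as np
from scipy.stats import kruskal
from statsmodels.stats.multitest import multipletests

def quadrant_significance(features, labels, alpha=0.005):
    """Per-feature omnibus test across the five affective classes,
    Holm-corrected over the whole feature family.
    `features`: dict name -> 1-D array; `labels`: class per sample."""
    names = list(features)
    pvals = []
    for name in names:
        groups = [features[name][labels == q] for q in np.unique(labels)]
        pvals.append(kruskal(*groups).pvalue)
    reject, p_adj, _, _ = multipletests(pvals, alpha=alpha, method="holm")
    return {n: (p, r) for n, p, r in zip(names, p_adj, reject)}
```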
The superiority of graph-based approaches stems from their capacity to encode topological properties of the signal’s dynamical system. Unlike conventional metrics that extract statistical summaries from the time domain, graph–theoretical features capture underlying connectivity patterns and structural properties that emerge from complex physiological dynamics during emotional experiences. This aligns with Valenza et al.’s [
60] theoretical framework, but extends it by demonstrating that nonlinear dynamics in psychophysiological responses can be effectively quantified through graph–theoretical constructs that preserve system complexity while providing interpretable metrics.
Component Eccentricity, which quantifies the maximum distance between nodes in the graph representation, warrants particular attention for its ability to discriminate across all emotional quadrants. This metric appears to capture fundamental properties of emotional Arousal and Valence that transcend specific emotional categories, suggesting that emotional states may manifest as distinct topological configurations in physiological signals. This perspective challenges traditional views of emotion-specific autonomic patterns and offers a unifying framework for understanding psychophysiological responses to emotional stimuli.
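The sketch below computes the four metrics named above under common graph-theoretic definitions (component eccentricity taken as the diameter of the largest connected component, graph energy as the sum of absolute adjacency eigenvalues); the precise definitions in our feature set may differ in detail.

```python
import numpy as np
import networkx as nx

def graph_shape_features(g: nx.Graph) -> dict:
    """Distance and spectral metrics used as discriminative features."""
    comp = g.subgraph(max(nx.connected_components(g), key=len))
    ecc = nx.eccentricity(comp)                      # per-node max geodesic distance
    eigs = np.linalg.eigvalsh(nx.to_numpy_array(g))  # adjacency spectrum, ascending
    return {
        "component_eccentricity": max(ecc.values()),  # diameter of the component
        "radius": min(ecc.values()),
        "graph_energy": float(np.abs(eigs).sum()),
        "spectrum_max": float(eigs[-1]),
    }
```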
Traditional EDA features showed varied discriminative capabilities consistent with the heterogeneous findings reported in the literature [
57,
61]. Among these features, TVSymp_mean emerged as the only traditional feature exhibiting significant differences across all five quadrants (
p < 0.005). This finding extends Boucsein’s [
62] conceptualization of sympathetic activity as an Arousal indicator by suggesting that tonic sympathetic activity may serve as a bridge between Arousal and Valence dimensions, potentially reflecting an integrated representation of emotional experience rather than simply indicating general Arousal.
Our analysis revealed quadrant-specific sensitivity patterns: EDATon_slope and TVSymp_mean differentiated high-Arousal and high-Valence states, while TVSymp_min and EDAPh_min specifically identified low-Arousal, low-Valence states. This pattern reveals a more sophisticated structure of physiological differentiation than previously recognized, moving beyond Critchley’s [
63,
64] observations of differential sympathetic engagement to suggest that specific EDA parameters may selectively encode particular regions of the affective space. This dimensional specificity supports Bradley and Lang’s [
65] theoretical framework while extending it by identifying precise physiological markers for specific emotional quadrants.
These findings have substantial theoretical implications: rather than seeking universal physiological markers of emotion, affective computing systems might benefit from employing quadrant-specific feature sets optimized for particular regions of the affective space. This approach would transform emotion recognition methodology by acknowledging the heterogeneity of physiological signatures across the affective landscape, as highlighted in Kreibig’s [
57] review, while providing concrete computational solutions to address this complexity.
Features derived from other physiological modalities demonstrated more constrained discriminative capabilities. ZEMG_mav and CEMG_wl features showed significant differences exclusively between the LAHV quadrant and all other quadrants (
p < 0.005), revealing a fundamental constraint in EMG-based affective computing: while highly specific to positive emotional states, these features lack the dimensional breadth necessary for comprehensive emotion recognition. This finding extends beyond merely confirming the work of Cacioppo et al. and Guntinas-Lichius et al. [
63,
64] work on Zygomaticus major activity, highlighting the inherent limitations of single-modality approaches and emphasizing the need for multimodal integration in affective computing [
65].
ZEMG_medfreq exhibited selective sensitivity to the Valence dimension within high-Arousal contexts, representing a previously unidentified specificity with significant implications for understanding the neural mechanisms underlying emotion. This finding extends neuroscientific frameworks of approach–withdrawal systems [
66] by suggesting that spectral characteristics of facial EMG may provide unique information about motivational orientation specifically during high-Arousal states. This state-dependent information extraction offers a new perspective on how physiological signals might encode emotional information conditionally rather than universally.
The observed pattern of feature discriminability across modalities suggests a hierarchical structure that has profound implications for affective computing architecture design. The superior performance of EDA-graph features aligns with complex systems approaches to psychophysiology [
67] but extends beyond them by suggesting that different levels of signal representation may access different aspects of emotional information. This hierarchical perspective challenges the conventional flat approach to feature selection in emotion recognition systems.
Based on these findings, we propose that future affective computing systems should implement a hierarchical feature architecture that prioritizes graph–theoretical representations while strategically incorporating traditional features for specific affective quadrants. This approach would address the limitations identified by Picard et al. [
68] regarding physiological emotion recognition while providing a concrete computational framework for implementing Barrett’s [
67] psychological construction theory of emotion in affective computing systems.
The quadrant-specific sensitivity patterns observed across modalities suggest potential advancements in personalized emotion recognition. Rather than employing a one-size-fits-all approach, future systems might dynamically select feature sets based on initial assessments of which affective quadrant a user’s emotional state likely occupies. This adaptive feature selection would represent a fundamental advancement over current approaches, potentially resolving the persistent challenges related to individual differences and context specificity in affective computing.
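One possible realization of such an adaptive pipeline is sketched below; the class, its members, and the two-stage control flow are hypothetical illustrations of the proposed architecture rather than an implemented system.

```python
import numpy as np

class QuadrantAdaptiveRecognizer:
    """Hypothetical two-stage recognizer: a coarse quadrant classifier
    first, then a quadrant-specific model over a quadrant-specific
    feature subset. All components are supplied pre-fitted."""

    def __init__(self, quadrant_clf, models, feature_idx):
        self.quadrant_clf = quadrant_clf  # predicts e.g. 'HAHV', ..., 'Neutral'
        self.models = models              # dict: quadrant -> fitted estimator
        self.feature_idx = feature_idx    # dict: quadrant -> column indices

    def predict(self, x: np.ndarray):
        q = self.quadrant_clf.predict(x.reshape(1, -1))[0]
        cols = self.feature_idx[q]
        return q, self.models[q].predict(x[cols].reshape(1, -1))[0]
```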
4.3. Generalizability of the Models Across Emotional Contexts
A critical challenge in emotion recognition research is developing models that generalize effectively beyond laboratory conditions to diverse real-world scenarios. Our comprehensive evaluation across multiple datasets provides unique insights into the robustness and ecological validity of the EDA-graph approach. The progressive decrease in accuracy from highly controlled to naturalistic environments, from mixed emotions (98.20%) through DEAP (92.75%) to ForDigitStress (86.54%), reveals a systematic pattern that quantifies the impact of ecological factors on emotion recognition (
Figure 2). This performance gradient aligns with previous research suggesting that controlled emotional stimuli yield higher recognition rates compared to naturalistic settings [
69]. Importantly, while accuracy decreases as real-world complexity increases, our approach maintains performance substantially above state-of-the-art benchmarks across all contexts.
This observed decline of approximately 12 percentage points between laboratory and naturalistic environments quantifies a critical parameter in emotion recognition research: the ecological validity cost. Previous studies have reported substantially larger drops (often 15–25%) when transitioning to real-world contexts [
70], suggesting that our graph-based approach offers superior robustness to environmental variations. The relatively modest decline can be attributed to the ability of graph-based features to capture fundamental structural patterns in physiological signals that persist despite confounding factors in naturalistic settings.
Our EDA-graph method demonstrated exceptional performance on the mixed-emotions dataset, achieving 98.20% accuracy compared to the previous state-of-the-art accuracy of 78.00% (
Table 6). This 20.2 percentage point improvement is particularly significant given the inherent complexity of discriminating between overlapping emotional states. The balanced performance across positive (precision = 0.99), negative (precision = 0.99), and mixed emotions (precision = 0.96) indicates that the graph-based features capture subtle nuances in emotional expressions that conventional approaches miss [
71].
The high precision values across all emotion categories, with only a minor decrease for mixed emotions (SD = 0.02), suggest that graph-based features effectively model the complex physiological patterns associated with ambiguous or blended emotional states. This capability addresses a fundamental challenge in affective computing: the recognition of non-prototypical emotions that characterize many real-world experiences.
When applied to the DEAP dataset, our model achieved 92.75% accuracy, compared to a previously reported accuracy of 66.00% (
Table 7) under different experimental conditions. This robust cross-dataset generalization is particularly noteworthy given the different emotion elicitation methods between datasets (video clips in the mixed-emotions dataset versus music videos in DEAP), suggesting that our approach captures fundamental physiological signatures of emotional states regardless of the specific stimuli used to evoke them. The consistent performance across both high-Arousal (93.12%) and low-Arousal states (92.38%) represents a key advancement, as previous approaches typically show substantial performance disparities between arousal levels [
72]. The minimal performance difference (0.74%) between arousal conditions indicates that our EDA-graph approach effectively mitigates the common bias toward high-Arousal detection, providing more balanced emotion recognition across the affective space.
Similarly, the comparable performance for positive Valence (93.05%) and negative Valence (92.45%) states contrasts with previous findings that positive emotional states produce more distinctive physiological signatures [
47]. The balanced recognition across Valence conditions suggests that graph-based features extract meaningful patterns from both positive and negative emotional experiences, addressing another common limitation in physiological emotion recognition.
The application to the ForDigitStress dataset (86.54% accuracy) represents the most stringent test of ecological validity, as this dataset captures emotional responses in naturalistic, stress-inducing job interview scenarios. Despite the decrease in overall accuracy compared to laboratory-based datasets, our approach achieved higher accuracy than the previously reported result for this dataset (72.50%), acknowledging that this comparison is indirect given differences in the experimental pipeline. The ForDigitStress dataset presents unique challenges not encountered in passive viewing paradigms: (1) active participation creating complex interactions between cognitive load, social anxiety, and emotional responses; (2) continuous recording during extended interactions with natural emotional dynamics; and (3) significant individual differences in stress responses to similar social situations. Nevertheless, our model maintained consistent performance across different stress intensities (high: 87.12%, moderate: 86.33%, and low: 86.17%), demonstrating robust detection capabilities across the stress spectrum.
The consistent accuracy across stress levels suggests that graph-based features capture fundamental physiological signatures of stress that transcend intensity variations. This characteristic is particularly valuable for real-world stress monitoring applications where accurately distinguishing between stress levels is essential for appropriate intervention.
The demonstrated generalizability across datasets with increasing ecological complexity has significant implications for deploying emotion recognition systems in real-world contexts. The performance gradient quantifies expected accuracy decreases as environmental complexity increases, providing critical information for application developers about potential accuracy–ecology tradeoffs.
Importantly, the relatively modest performance degradation in naturalistic settings compared to typical reductions reported in the literature [
70] suggests that graph-based features offer superior robustness to environmental variations. This can be attributed to their ability to capture the underlying structural organization of physiological signals even in the presence of movement artifacts, cognitive load variations, and other confounding factors common in real-world scenarios.
These findings collectively establish that EDA-graph features provide a robust foundation for emotion recognition systems designed for real-world deployment, balancing high accuracy with ecological validity across diverse emotional contexts. The consistently favorable results relative to previously published benchmarks, from controlled laboratory settings to naturalistic environments, support the practical value of graph-based physiological signal representation for affective computing applications, while acknowledging that protocol differences limit direct quantitative comparisons.
It is important to emphasize the limitations that constrain direct comparisons between our results and those reported in the cited literature. The cross-study comparisons presented in
Table 5 and
Table 6 are indirect: the referenced methods differ from ours in several protocol dimensions that can substantially affect reported metrics. Specifically, (1) dataset partitioning: some cited methods use fixed train/test splits or k-fold cross-validation rather than LOSO, which can produce different generalization estimates for the same model; (2) preprocessing: EDA sampling rates, filtering parameters, artifact removal strategies, and normalization schemes vary across studies and can affect both raw features and derived graph metrics; (3) task definition: the number of emotion classes, their boundaries in the Arousal–Valence space, and whether the task is framed as regression or classification differ across methods, making numerical comparison of accuracy or RMSE values inherently approximate; and (4) annotation scales: datasets use different rating scales (e.g., 1–9 vs. 0.5–9.5) requiring alignment before threshold-based class assignment. For these reasons, throughout this paper we use the phrasing “previously reported results under different experimental conditions” rather than claiming strict outperformance, and we encourage readers to interpret the comparison tables as indicative benchmarks rather than controlled head-to-head evaluations. Reproducing all baseline methods under a fully identical protocol would require access to each team’s implementation and is beyond the scope of this work; such a unified benchmark remains an important open challenge for the field.
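As an illustration of point (4), the sketch below linearly rescales annotations between two rating ranges before threshold-based quadrant assignment; the source and target ranges and the threshold of 5.0 are examples rather than the values used by any particular cited study.

```python
import numpy as np

def align_scale(r, src=(0.5, 9.5), dst=(1.0, 9.0)):
    """Linearly map ratings from one annotation scale onto another."""
    r = np.asarray(r, dtype=float)
    return dst[0] + (r - src[0]) * (dst[1] - dst[0]) / (src[1] - src[0])

def quadrant(arousal, valence, thr=5.0):
    """Threshold-based quadrant assignment on the common scale."""
    return ("HA" if arousal >= thr else "LA") + ("HV" if valence >= thr else "LV")

print(quadrant(align_scale(7.5), align_scale(3.0)))  # -> HALV
```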
4.4. Theoretical Implications and Limitations
Our findings contribute significantly to both a theoretical understanding of emotional processing and methodological approaches to emotion recognition. However, they also reveal important limitations that should be considered when interpreting these results and developing future applications.
The superior performance of graph-based features for emotion recognition offers insights into how emotional states manifest in physiological signals. Our findings suggest that emotions influence the structural organization of physiological signals through multiple pathways, integrating both sympathetic and parasympathetic responses. This perspective extends beyond traditional views that focus on direct correlations between emotional dimensions and specific physiological measures.
Graph metrics appear to capture how emotional states organize physiological responses across multiple time scales. The exceptional performance of eigenvector centrality (PI = 0.756) suggests that emotions influence the global organization of autonomic responses, creating characteristic patterns of connectivity that reflect the underlying emotional state. This finding aligns with emerging network-based perspectives on physiological systems that emphasize dynamic interactions rather than isolated measurements.
Our implementation of the two-dimensional Arousal–Valence model builds on Russell’s foundational work [
11] but reveals both opportunities and limitations. While this approach overcomes certain constraints of categorical models like Ekman’s six basic emotions, the assumption that emotional states can be adequately represented in just two dimensions may oversimplify the complexity of human emotional experiences [
73]. The high accuracy achieved within this framework suggests that despite this simplification, the Arousal–Valence space captures fundamental aspects of emotional experience that manifest reliably in physiological responses.
The differential performance of specific features reveals distinct aspects of emotion–physiology relationships. Traditional features such as EDAPh_mean (PI = 0.999) directly reflect sympathetic activation [
74,
75], while TVSymp_mean (PI = 0.278) captures overall sympathetic tone [
52,
53]. Frequency-domain features showed limited predictive power, suggesting emotional states manifest beyond simple frequency patterns [
55]. Among graph features, eigenvector centrality metrics capture the global organization of autonomic responses [
76], while component eccentricity (importance = 0.27) reflects the temporal structure of emotional responses [
77]. These findings suggest that emotions create characteristic patterns in autonomic signal organization across multiple scales of analysis.
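For reference, PI and MDI values of the kind reported above can be obtained as sketched below with scikit-learn; the random forest, the synthetic data, and the number of permutation repeats are placeholders.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance

rng = np.random.default_rng(0)
X, y = rng.normal(size=(500, 10)), rng.normal(size=500)  # placeholder data
model = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)

# PI: mean score drop over shuffles of one column; MDI: impurity-based.
pi = permutation_importance(model, X, y, n_repeats=20, random_state=0,
                            scoring="neg_root_mean_squared_error")
mdi = model.feature_importances_
for j in np.argsort(pi.importances_mean)[::-1][:3]:
    print(f"feature {j}: PI = {pi.importances_mean[j]:.3f}, MDI = {mdi[j]:.3f}")
```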
Despite the promising results, several methodological limitations warrant consideration. First, our approach relies on the dimensional emotion model, which may not capture all relevant aspects of emotional experience. While the Arousal–Valence space provides a useful framework, recent theoretical developments suggest that additional dimensions such as dominance or approach–avoidance tendencies may be necessary for comprehensive emotion characterization.
Second, the ground truth for our models derives from self-reported emotional states, which introduce inherent subjectivity. While continuous annotation provides temporal resolution advantages over retrospective reports, the cognitive demands of simultaneous viewing and reporting may impact data quality [
78,
79]. The act of continuously manipulating the joystick while experiencing and reporting emotions creates a dual-task paradigm, where motor control and cognitive processing compete for attentional resources. This motor system engagement may influence the very autonomic nervous system responses we aim to measure through EDA, potentially affecting the validity of our ground truth data [
80].
Third, while our models demonstrated robust cross-dataset generalization, all datasets employed laboratory-grade sensors with controlled application procedures. Real-world deployment would likely involve consumer-grade wearable devices with less controlled sensor placement and higher noise levels. The impact of these factors on model performance requires further investigation before confident deployment in everyday contexts.
Our approach employs continuous estimation models to capture nuanced emotional experiences, but fundamental questions remain regarding whether current physiological measurement techniques can adequately capture complex emotional dynamics [
81]. Recent theoretical frameworks propose that mixed emotions constitute distinct, measurable psychological states, but the empirical evidence supporting their reliable detection through physiological signals remains limited [
82].
Our utilization of the dimensional Arousal–Valence space, while theoretically capable of representing both pure and blended emotions through continuous trajectories, faces practical limitations in distinguishing between genuine mixed emotional states and measurement artifacts [
83,
84]. The high precision achieved for mixed-emotion classification (0.96) suggests that our approach captures meaningful physiological patterns associated with these complex states, but perfect discrimination remains challenging.
The representation of mixed emotions presents both conceptual and technical challenges. Conceptually, mixed emotions may not simply represent intermediate points in the Arousal–Valence space but may involve the simultaneous activation of distinct emotional systems. Technically, the physiological signatures of mixed emotions may be more complex and variable than those of prototypical emotions, making consistent detection more challenging across individuals and contexts.
The implementation of graph-based emotion recognition systems in real-time applications faces computational challenges that must be considered. Our analysis of processing requirements reveals that graph-based features introduce substantially higher computational overhead compared to traditional approaches. While traditional EDA features can be computed at approximately 8300 samples/second with a small memory footprint (24 MB), graph-based features reduce throughput to around 2200 samples/second and increase memory requirements to 128 MB.
Although all methods meet the minimum 8 Hz sampling rate required for reliable emotion recognition, the increased computational demands of graph-based analysis may present challenges for deployment on resource-constrained devices, such as wearables or mobile phones. Future optimization of algorithms and feature selection may help address these computational constraints while maintaining recognition accuracy.
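Throughput and memory figures of the kind quoted above can be measured with a simple harness such as the following sketch; the sliding-window loop, window length, and extractor are placeholders, and absolute numbers depend entirely on the host hardware.

```python
import time
import tracemalloc
import numpy as np

def benchmark(extract, signal: np.ndarray, window: int = 32):
    """Throughput (samples/s) and peak memory (MB) of a sliding-window
    feature extractor on one signal."""
    tracemalloc.start()
    t0 = time.perf_counter()
    for start in range(0, len(signal) - window + 1, window):
        extract(signal[start:start + window])
    elapsed = time.perf_counter() - t0
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return len(signal) / elapsed, peak / 1e6

# Example: throughput, peak_mb = benchmark(lambda w: w.mean(), eda_signal)
```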
Additionally, the higher complexity of graph-based analysis introduces challenges for interpretability. While traditional features like EDAPh_mean have direct physiological interpretations, the meaning of graph metrics like eigenvector centrality in the context of emotional states is less immediately transparent. This complexity may create barriers for explaining system behavior to users or integrating findings with existing psychological theories of emotion.
Computational performance analysis revealed increasing processing demands across modalities. For Traditional EDA, feature calculation required 0.5 h, feature analysis consumed 2.8 h, and regression analysis took 4.5 h. The multimodal approach demonstrated higher computational requirements, with 1.2 h for feature calculation, 8 h for feature analysis, and 13.2 h for regression analysis. EDA-graph processing showed further increased demands, requiring 2 h for feature calculation, 9.8 h for feature analysis, and 16.4 h for regression computations, as shown in
Figure 6.
The combined modality, incorporating all features, exhibited the highest computational overhead. While feature calculation time remained relatively efficient at 0.8 h, feature analysis extended to 20.5 h, and regression analysis required 34.2 h (
Figure 6). This substantial increase in processing time reflects the complexity of handling the expanded feature set and the comprehensive analysis required for the combined approach.
Processing times were measured in a standard computational environment with Intel Xeon processors (Intel, Santa Clara, CA, USA) and 128 GB RAM, analyzing data from 30 subjects. The observed time differences between modalities align with their respective feature set sizes and computational complexity requirements.
4.5. Applications and Future Directions
The marked improvement in continuous emotion recognition accuracy has significant implications for real-world applications. The lower RMSE values and reduced variance indicate that our model can track subtle emotional changes more reliably, making it particularly valuable for applications requiring fine-grained emotion monitoring. In mental health assessment, our approach could enable more objective monitoring of emotional states in conditions like anxiety disorders, depression, or bipolar disorder, potentially identifying subtle shifts in emotional patterns before they become clinically significant. The ability to detect mixed emotional states with high precision (0.96) is particularly relevant for psychiatric applications, as emotional ambivalence is a common feature of several psychological disorders.
For human–computer interaction systems, the improved accuracy in detecting both simple and mixed emotional states could enhance adaptive interfaces that respond more appropriately to users’ emotional needs. The consistent performance across the Arousal–Valence space would enable more personalized and nuanced adaptations compared to systems that can only reliably detect extreme emotional states.
The robust generalization across datasets with varying degrees of ecological validity suggests that our approach could be effectively deployed in diverse real-world contexts. The relatively modest performance decrease in naturalistic settings (approximately 12% from laboratory to job interview scenarios) indicates potential viability for applications ranging from clinical assessment to consumer wellness monitoring.
The temporal stability of the classifications, maintained across entire sequences with no degradation in performance over time, demonstrates the robustness of our approach for continuous monitoring applications. The model shows no performance deterioration even during rapid transitions between emotional states, suggesting that the graph-based features capture fundamental aspects of emotional responses that remain stable across different temporal contexts.
Real-time implementation of graph-based emotion recognition requires careful consideration of the trade-off between computational cost and recognition accuracy. While graph-based features offer superior performance, their higher computational demands may necessitate optimization for deployment on consumer devices. Potential approaches include feature selection to identify the most discriminative subset of graph features, algorithm optimization to reduce computational complexity, and hardware acceleration to enable real-time processing on resource-constrained devices.
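A minimal sketch of the first of these strategies, retaining only the k features with the highest permutation importance, is given below; k would be tuned against the accuracy budget of the target device.

```python
import numpy as np

def top_k_feature_mask(pi_means: np.ndarray, k: int) -> np.ndarray:
    """Boolean mask keeping the k features with the largest permutation
    importance, so cheaper models can skip the rest at inference time."""
    mask = np.zeros(pi_means.shape, dtype=bool)
    mask[np.argsort(pi_means)[::-1][:k]] = True
    return mask
```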
Our findings open several promising avenues for future research. First, incorporating cognitive models of emotion could deepen our understanding of the relationship between physiological responses and subjective emotional experiences. Integrating cognitive appraisal theory with graph-based physiological analysis might explain why certain graph metrics effectively capture emotional information and could potentially lead to more theoretically grounded feature extraction methods.
Second, investigating cultural differences in mixed emotion experiences and their physiological correlates could enhance the generalizability of our findings. Cross-cultural studies have shown that the prevalence and acceptance of mixed emotions vary significantly across cultures, which may influence how these emotional states manifest physiologically. Expanding our approach to diverse cultural contexts would strengthen its theoretical foundation and practical applicability.
Third, the quadrant-specific sensitivity patterns observed across modalities suggest the need for adaptive feature selection algorithms that can dynamically adjust feature weightings based on initial classifications of affective quadrants. Such systems would represent a significant advancement over current emotion recognition approaches by implementing a hierarchical classification strategy that first identifies the affective quadrant before employing quadrant-specific feature sets for fine-grained emotion recognition.
Fourth, the limited discriminative capability of traditional physiological measures across all affective quadrants highlights the need for integrated multimodal approaches that strategically combine modality-specific information. Future research should explore optimal fusion strategies that account for the differential sensitivity of various modalities to specific emotional dimensions, potentially implementing context-sensitive weighting schemes that prioritize different modalities based on the target emotional states.
Finally, the hierarchical structure of feature discriminability observed suggests that emotion recognition systems might benefit from deep learning architectures specifically designed to extract and integrate information at multiple levels of representation. Neural network architectures that explicitly model the hierarchical nature of emotional information in physiological signals could potentially achieve more robust performance across diverse emotional contexts and individual differences.