Article

Trans-cVAE-GAN: Transformer-Based cVAE-GAN for High-Fidelity EEG Signal Generation

School of Information and Control Engineering, Qingdao University of Technology, Qingdao 266520, China
*
Author to whom correspondence should be addressed.
Bioengineering 2025, 12(10), 1028; https://doi.org/10.3390/bioengineering12101028
Submission received: 19 August 2025 / Revised: 20 September 2025 / Accepted: 24 September 2025 / Published: 26 September 2025

Abstract

Electroencephalography signal generation remains a challenging task due to its non-stationarity, multi-scale oscillations, and strong spatiotemporal coupling. Conventional generative models, including VAEs and GAN variants such as DCGAN, WGAN, and WGAN-GP, often yield blurred waveforms, unstable spectral distributions, or lack semantic controllability, limiting their effectiveness in emotion-related applications. To address these challenges, this research proposes a Transformer-based conditional variational autoencoder–generative adversarial network (Trans-cVAE-GAN) that combines Transformer-driven temporal modeling, label-conditioned latent inference, and adversarial learning. A multi-dimensional structural loss further constrains generation by preserving temporal correlation, frequency-domain consistency, and statistical distribution. Experiments on three SEED-family datasets—SEED, SEED-FRA, and SEED-GER—demonstrate high similarity to real EEG, with representative mean ± SD correlations of Pearson ≈ 0.84 ± 0.08/0.74 ± 0.12/0.84 ± 0.07 and Spearman ≈ 0.82 ± 0.07/0.72 ± 0.12/0.83 ± 0.08, together with low spectral divergence (KL ≈ 0.39 ± 0.15/0.41 ± 0.20/0.37 ± 0.18). Comparative analyses show consistent gains over classical GAN baselines, while ablations verify the indispensable roles of the Transformer encoder, label conditioning, and cVAE module. In downstream emotion recognition, augmentation with generated EEG raises accuracy from 86.9% to 91.8% on SEED (with analogous gains on SEED-FRA and SEED-GER), underscoring enhanced generalization and robustness. These results confirm that the proposed approach simultaneously ensures fidelity, stability, and controllability across cohorts, offering a scalable solution for affective computing and brain–computer interface applications.

1. Introduction

With the continuous advancement of affective computing and human–computer interaction research, electroencephalography (EEG), as a non-invasive neurophysiological measurement technique with high temporal resolution, has shown broad application prospects in fields such as emotion recognition [1,2,3] and cognitive impairment assessment [4,5]. EEG signals reflect the electrophysiological processes of neuronal population activity in the brain and are characterized by strong nonlinearity, non-stationarity, and significant inter-individual variability, making the acquisition and modeling of high-quality EEG data a persistent challenge.
In practice, high-quality EEG data collection not only depends on expensive instrumentation and stringent experimental conditions but is also subject to participant compliance and ethical constraints. As a result, data acquisition is costly, sample sizes are often limited, and the applicability and scalability of deep learning models for EEG analysis are severely restricted [6]. Consequently, developing high-quality and semantically controllable EEG data generation methods has become a focal point of recent research.
Existing studies have attempted to apply generative adversarial networks (GANs) [7,8,9], variational autoencoders (VAEs) [10,11], deep convolutional GANs (DCGANs) [12,13], Wasserstein GANs (WGAN) [14,15], and Wasserstein GANs with gradient penalty (WGAN-GP) [16] to EEG signal generation. However, the following bottlenecks remain:
(1)
Traditional VAE architectures can model latent distributions but often produce blurry samples with structural distortion;
(2)
Classical GANs have limited temporal modeling capability, making it difficult to capture the dynamic features and spectral structures of EEG signals;
(3)
Most existing models lack effective emotion label conditioning mechanisms, hindering their ability to generate EEG data under specific semantic guidance and thereby limiting their usability in practical scenarios such as emotion regulation and cognitive intervention.
To address these issues, this paper proposes an EEG generation model that integrates a Transformer architecture, a conditional VAE mechanism, and an adversarial learning strategy. The model introduces multi-dimensional loss constraints to achieve fidelity-oriented modeling of generated signals from temporal, spectral, and statistical perspectives, while incorporating emotion labels as conditional inputs to enable controllable generation of EEG segments with specific emotional attributes. The Transformer module effectively captures long-range temporal dependencies in EEG signals; the variational inference and reparameterization mechanisms enhance the interpretability of the latent space and the diversity of generated samples; and the discriminator, through adversarial feedback, improves the realism and structural consistency of synthetic data.
The main contributions of this work are as follows:
(1)
Incorporation of emotion label conditioning to enhance semantic controllability of generated signals
Most existing EEG generation models are unconditional, making it difficult to generate target samples based on specified labels. This study incorporates one-hot encoded emotion labels through a conditional label embedding, concatenated with the latent representation in both the encoder and decoder stages, to achieve conditional modeling. This significantly improves the model’s ability to control emotional attributes, enabling the generation of EEG segments with target emotion characteristics while preserving signal authenticity.
(2)
Integration of a Transformer encoder to improve temporal dependency modeling
Compared with traditional Convolutional Neural Network (CNN) or Recurrent Neural Network (RNN) architectures, Transformers are better suited for capturing long-range temporal dependencies and rhythmic variations in EEG signals. In this work, multi-layer Transformer encoders are integrated into the generator, leveraging self-attention mechanisms to model both local and global temporal features, thereby enhancing the representation of rhythmic structures in generated signals.
(3)
Joint multi-dimensional structure-aware loss to improve multi-modal fidelity of generated samples
Given the complex structural characteristics of EEG signals in the temporal, spectral, and distributional domains, this study designs a multi-loss fusion strategy incorporating mean squared error (MSE), Pearson correlation, signal smoothness, power spectral mean squared error, and Kullback–Leibler (KL) divergence. This approach constrains generated samples from multiple perspectives, improving their structural consistency in morphology, spectrum, and statistical distribution.

2. Related Work

In recent years, generative models have attracted considerable attention in the modeling and synthesis of EEG signals. To address challenges such as the difficulty of EEG data acquisition, limited sample sizes, and distributional imbalance, researchers have increasingly explored the use of VAEs, GANs, and their hybrid architectures, aiming to improve the quality and representational capacity of synthesized signals. This section reviews related studies from two perspectives: VAE-based generative methods and GAN-based generative methods.

2.1. VAE-Based Methods for EEG Signal Generation

The VAE is a representative probabilistic generative model that achieves controllable sample generation by modeling the latent distribution of data. A typical VAE consists of an encoder, a decoder, and a latent space. The encoder maps the input signal to the distribution parameters of latent variables, from which a latent vector is sampled via the reparameterization trick, and the decoder reconstructs the original signal from this latent vector. The training objective comprises both a reconstruction error term and a KL divergence regularization term, thereby ensuring fidelity in signal reconstruction while optimizing the continuity and structural expressiveness of the latent space.
In the context of EEG signal generation, VAEs have been introduced primarily due to their ability to model the distributions of high-dimensional neural signals. In 2019, Aznan et al. [17] first proposed an end-to-end training approach to directly generate meaningful EEG data in the signal space, thereby improving the overall efficiency of EEG synthesis. In 2024, Cisotto et al. [18] proposed hvEEGNet, a hierarchical VAE trained with a DTW-based loss for multi-channel EEG reconstruction on BCI IV-2a (motor imagery), achieving subject-wise high-fidelity reconstructions. In 2022, Bethge et al. [19] designed the EEG2cVAE model for emotion recognition, enabling preliminary generation of EEG signals under different emotional states. In 2023, Li et al. [20] proposed the Causal Recurrent Variational Autoencoder (CR-VAE), which learns Granger causality graphs from multivariate time series and incorporates latent causal mechanisms into the data generation process. This approach not only generates EEG signals but also enhances transparency in the EEG generation process. In the same year, Tian et al. [21] introduced the Dual-Encoder Variational Autoencoder–Generative Adversarial Network (DEVAE-GAN), which integrates spatiotemporal features to address the issue of data sparsity in building robust EEG-based classification models.
Although VAEs have demonstrated certain advantages in EEG signal generation, the quality of the generated data often falls short compared to GAN-based models. Specifically, EEG segments generated by VAEs tend to contain noise interference and partially inaccurate waveform characteristics, which can adversely affect downstream applications. Furthermore, VAEs exhibit limited capability in capturing the complex spatiotemporal coupling features of EEG data. More critically, despite their promising generative performance in various EEG modeling tasks, a systematic understanding of how VAEs identify and reproduce specific EEG patterns remains lacking, which constrains their adoption, trustworthiness, and interpretability in both clinical and engineering applications.

2.2. GAN-Based Methods for EEG Signal Generation

GANs, first proposed by Goodfellow et al. [22] in 2014, are among the most representative generative models in recent years. Their core concept lies in training a generator and a discriminator in an adversarial framework, whereby the generator progressively produces data samples that more closely approximate the real distribution through a minimax game. GAN architectures have achieved groundbreaking advances in fields such as image synthesis and speech generation and have likewise attracted considerable interest from EEG researchers.
In 2018, Abdelfattah et al. [23] proposed a recurrent generative adversarial network, which follows the same design principles as conventional GANs but replaces the convolutional generator with a recurrent neural network, thereby enhancing the model’s ability to capture long-term dependencies in EEG data. In 2021, Fahimi et al. [24] introduced a DCGAN-based approach that employed a trained deep convolutional neural network to extract feature vectors from a subset of a target participant’s EEG data. These features were then combined with the original dataset to form an augmented training set, which improved the performance of EEG classifiers. In 2023, Dong et al. [12] proposed a DCGAN-based data augmentation method that introduced Gaussian noise into the generator and used convolutional neural networks to process EEG time series data. The authors analyzed the generated data using fast Fourier transform and continuous wavelet transform, finding strong similarity to real EEG signals. However, DCGAN-based architectures still suffer from typical issues such as training instability, gradient vanishing, and mode collapse—problems that become more pronounced in EEG applications due to high noise levels and substantial inter-subject variability.
To improve training stability and data quality, Arjovsky et al. [25] introduced the WGAN into EEG generation in 2017. By replacing the original cross-entropy loss with the Wasserstein distance, this model theoretically alleviates the problem of gradient discontinuity. In 2020, Panwar et al. [26] proposed WGAN-GP, incorporating a gradient penalty strategy to further enhance training stability. Also in 2020, Smith et al. [27] proposed a conditional GAN for 1D time-series synthesis (TSGAN), evaluated on 70+ univariate datasets from the UCR archive, and reported improved FID and TSTR/TRTS scores over WGAN baselines, indicating better realism and downstream utility under a conditional setting.
Although WGAN and its variants have, to some extent, addressed stability issues, several limitations remain:
(1)
Lack of temporal modeling capability—Most EEG GAN models rely on one-dimensional convolutional architectures, which are inadequate for capturing long-range dependencies or rhythmically structured temporal patterns, impairing the accurate reconstruction of key frequency bands;
(2)
Absence of label-conditioned control mechanisms—The original GAN framework is inherently unconditional, making it difficult to generate EEG samples with semantic attributes corresponding to specific emotional states or cognitive tasks;
(3)
Neglect of structural loss constraints—GAN models typically optimize solely based on discriminator feedback and often omit explicit modeling of structural consistency in terms of temporal smoothness or spectral fidelity, resulting in waveform distortion or spectral drift in generated signals;
(4)
Susceptibility to mode collapse—In multi-class emotion generation tasks, GANs often struggle to adequately cover the full data distribution across multiple emotional labels, leading to insufficient sample diversity.
Therefore, while GAN-based models exhibit promising potential for EEG signal generation, they still face significant challenges in maintaining frequency-domain consistency, class controllability, and temporal fidelity. To overcome these limitations, researchers have increasingly explored combining GANs with architectures such as VAEs and Transformers, leveraging both latent distribution modeling capabilities and adversarial learning mechanisms to more comprehensively capture the complex structure of EEG signals.

3. Materials and Methods

3.1. Dataset

This study employs the publicly available SEED (SJTU Emotion EEG Dataset) [28,29] as the primary data source for model training and evaluation. The SEED dataset was developed by Shanghai Jiao Tong University, with the aim of collecting multi-channel EEG signals through a film-elicitation paradigm to capture neural activity patterns associated with different emotional states. It has been widely used in research on emotion classification, affective computing, and EEG data modeling. Beyond SEED, this research additionally evaluates the proposed method on two datasets, SEED-FRA [30,31] and SEED-GER [30,31], providing culturally distinct cohorts for external validation.
The dataset comprises EEG recordings from 15 healthy participants (7 males and 8 females) aged 23.27 ± 2.37 years. In addition, this research evaluates SEED-FRA (5 males and 3 females, aged 22.5 ± 2.78 years) and SEED-GER (7 males and 1 female, aged 22.25 ± 1.98 years); only aggregate demographics are reported to protect personal privacy. Each participant took part in three experimental sessions, with at least a one-week interval between sessions to avoid emotional habituation effects. In each session, participants sequentially watched 15 emotion-eliciting video clips, each lasting approximately four minutes. The video content was pre-screened to induce three typical emotional states—positive, neutral, and negative—with five clips for each category. After viewing each clip, participants completed a self-assessment questionnaire to verify the consistency and validity of the elicited emotions.
EEG data were collected using the NeuroScan SynAmps2 system (Compumedics, Charlotte, NC, USA) with electrode placement following the international 10–20 system, covering 62 channels at an original sampling rate of 1000 Hz.
For each participant, the data from one experimental session are stored in a single file containing EEG recordings from 15 emotional trials. Each trial is stored as a matrix of dimensions [channels × time points], where the number of time points depends on the trial duration and sampling rate. Each trial is annotated with a clear emotion label, and label distributions are consistent across participants, ensuring standardization and comparability of the annotations.

3.2. Preprocessing

Prior to model training, the raw EEG signals were subjected to standard preprocessing steps to improve data consistency and facilitate model convergence. First, each EEG trial was reformatted to ensure a uniform dimension of [time × channels]. Subsequently, Z-score normalization was applied to each channel within a trial to eliminate amplitude discrepancies across channels.
EEG signals were down-sampled to 200 Hz, band-limited using a 0–45 Hz filter and cleaned via independent component analysis (ICA) to remove artifacts. To expand the dataset and capture temporal features, a sliding window segmentation was applied to the EEG signals, with a window length of 400 samples and a step size of 200 samples, a setting chosen to facilitate downstream classification tasks. Each segment inherited the emotion label of its parent trial, and labels were mapped to standardized category indices for use in conditional generative modeling.
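As a concrete illustration, the following sketch shows one way to implement the per-channel z-scoring and sliding-window segmentation described above (200 Hz sampling, 400-sample windows, 200-sample step); the helper names, array layout, and example trial length are illustrative assumptions rather than the authors' exact pipeline.

import numpy as np

def zscore_per_channel(trial):
    """Z-score each channel of a [time x channels] trial independently."""
    mean = trial.mean(axis=0, keepdims=True)
    std = trial.std(axis=0, keepdims=True) + 1e-8
    return (trial - mean) / std

def sliding_windows(trial, label, win_len=400, step=200):
    """Cut a normalized trial into overlapping segments that inherit its emotion label."""
    segments, labels = [], []
    for start in range(0, trial.shape[0] - win_len + 1, step):
        segments.append(trial[start:start + win_len])
        labels.append(label)
    return np.stack(segments), np.array(labels)

# Example: one 60 s trial at 200 Hz with 62 channels, labeled "positive" (hypothetical index 2).
trial = np.random.randn(12000, 62)
segs, labs = sliding_windows(zscore_per_channel(trial), label=2)
print(segs.shape)  # (59, 400, 62)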

3.3. Model

3.3.1. Overall Architecture

To achieve high-quality EEG signal generation under explicit emotion label control, this research developed a deep generative model named Trans-cVAE-GAN, which integrates the concepts of cVAE, Transformer, and GAN. The model takes SEED EEG data as input, incorporates emotion labels for conditional modeling, and outputs synthetic EEG segments with the specified emotional attributes. The overall framework, illustrated in Figure 1, consists of four primary functional modules:
Transformer Encoder—Models the temporal dependencies in input EEG segments and extracts high-dimensional latent feature representations.
Latent Space Modeling and Reparameterization Module—Concatenates the encoder output with the embedded emotion label vector to obtain the mean and variance vectors of the latent variables, and samples a latent code via the reparameterization trick.
Conditional Decoder—Concatenates the sampled latent code with the emotion label, then passes it through a multilayer perceptron and linear mapping layers to produce the synthetic EEG signal.
Discriminator—Distinguishes between generated and real EEG signals, providing adversarial feedback to improve the realism and diversity of synthetic samples.
The model is trained end-to-end. The generator—comprising the encoder and decoder—optimizes for both reconstruction error and KL divergence in the latent space, while the discriminator engages in adversarial training with the generator to produce synthetic EEG signals that closely resemble real data while maintaining semantic consistency.
To further enhance the fidelity of generated samples in terms of temporal structure and spectral characteristics, the loss function incorporates additional multi-dimensional structural constraints, including Pearson correlation, temporal smoothness, and Welch power spectral distance. These are jointly optimized within the generator training process.

3.3.2. Transformer Encoder

EEG signals are highly nonstationary time series with strong temporal dependencies. To more effectively model their underlying temporal dynamics, this paper introduces a Transformer-based encoder, illustrated in Figure 2, into the generator’s encoder module. Compared with CNNs or RNNs, the Transformer can capture important dependencies across the entire temporal span via a multi-head attention mechanism, thereby enhancing the representational power and generalization ability of emotion-related EEG features.
In implementation, the input EEG data are shaped as $[B, T, C]$, where $B$ denotes batch size, $T$ is the number of time steps, and $C$ is the number of channels. The input is first projected into an embedding space via a linear mapping layer, producing a time series representation of size $[B, T, d_{embed}]$. This sequence is then fed into a stack of two Transformer encoder layers, each consisting of multi-head self-attention and feed-forward networks, with residual connections and layer normalization applied for stable training.
The core advantage of the encoder lies in its ability to directly model interactions between any two time points, thereby capturing both local and global dynamic features of EEG activity. To obtain a fixed-length feature representation, the temporal dimension of the Transformer’s output is compressed via average pooling, resulting in a vector of size $[B, d_{embed}]$, denoted as $z_{input}$, which serves as the latent representation of the input segment.
This design enables the model to effectively encode the temporal patterns of EEG segments under different emotional states, providing a high-quality representation for subsequent latent space modeling and conditional generation.
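The following PyTorch sketch illustrates this encoder path (linear embedding, two Transformer encoder layers with four heads, average pooling over time); the layer count, head count, and embedding size follow Section 4.2, while the learnable positional encoding, window length, and class name are illustrative assumptions.

import torch
import torch.nn as nn

class EEGTransformerEncoder(nn.Module):
    """Sketch of the encoder path: linear embedding -> 2 Transformer layers -> mean pooling."""
    def __init__(self, n_channels=62, d_embed=256, n_heads=4, n_layers=2, seq_len=400):
        super().__init__()
        self.embed = nn.Linear(n_channels, d_embed)            # project [B, T, C] -> [B, T, d_embed]
        layer = nn.TransformerEncoderLayer(d_model=d_embed, nhead=n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)
        self.pos = nn.Parameter(torch.zeros(1, seq_len, d_embed))  # assumed learnable positional encoding

    def forward(self, x):                                      # x: [B, T, C]
        h = self.embed(x) + self.pos[:, :x.size(1)]            # add positional information
        h = self.encoder(h)                                    # self-attention over all time steps
        return h.mean(dim=1)                                   # average pooling -> [B, d_embed]

z_input = EEGTransformerEncoder()(torch.randn(8, 400, 62))    # -> torch.Size([8, 256])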

3.3.3. Latent Space Modeling and Label Embedding

In conditional generation tasks, latent space modeling must not only capture abstract representations of the input data but also explicitly incorporate external label information for class control. To this end, this paper introduces a label embedding mechanism into the encoder output and employs a variational inference strategy to model the latent variables, constructing a label-aware latent space representation.
Specifically, after passing through the Transformer encoder, an EEG segment is represented by a fixed-length vector $z_{input} \in \mathbb{R}^{d}$. In parallel, the emotion label $y \in \{0, 1, 2\}$ is converted into a conditional label embedding and mapped via a fully connected embedding layer into a representation $e_y$ of the same dimension as the encoder output. The two vectors, $z_{input}$ and $e_y$, are concatenated to form a joint representation $[z_{input}, e_y]$, which serves as the input to latent space modeling.
This joint representation is fed into two independent fully connected networks to estimate the parameters of a Gaussian distribution over the latent variables: the mean vector $\mu \in \mathbb{R}^{k}$ and the log-variance vector $\log \sigma^2 \in \mathbb{R}^{k}$. To enable differentiable sampling, this research adopts the reparameterization trick, representing the sampling process as:
$z = \mu + \sigma \odot \epsilon, \quad \epsilon \sim \mathcal{N}(0, I)$
where $\odot$ denotes the Hadamard product and $\sigma = \exp(0.5 \log \sigma^2)$. This method avoids non-differentiability during backpropagation, allowing the latent variable $z$ to participate in end-to-end neural network training.
Through this design, the model can learn the intrinsic data distribution while incorporating external emotion label information, thus enabling conditional latent space modeling that provides a structured prior for label-controlled decoding and generation.
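A minimal sketch of this label-conditioned latent module, assuming an nn.Embedding layer for $e_y$, two fully connected heads for $\mu$ and $\log \sigma^2$, and the reparameterized sampling $z = \mu + \sigma \odot \epsilon$; the layer names and dimensions (256-dim embedding, 128-dim latent, per Section 4.2) are illustrative choices rather than the authors' exact implementation.

import torch
import torch.nn as nn

class LabelConditionedLatent(nn.Module):
    """Sketch of label embedding, Gaussian parameter heads, and the reparameterization trick."""
    def __init__(self, d_embed=256, n_classes=3, latent_dim=128):
        super().__init__()
        self.label_embed = nn.Embedding(n_classes, d_embed)    # e_y has the encoder's dimension
        self.fc_mu = nn.Linear(2 * d_embed, latent_dim)
        self.fc_logvar = nn.Linear(2 * d_embed, latent_dim)

    def forward(self, z_input, y):                             # z_input: [B, d_embed], y: [B] (long)
        e_y = self.label_embed(y)
        joint = torch.cat([z_input, e_y], dim=-1)              # joint representation [z_input, e_y]
        mu, logvar = self.fc_mu(joint), self.fc_logvar(joint)
        sigma = torch.exp(0.5 * logvar)
        eps = torch.randn_like(sigma)
        z = mu + sigma * eps                                   # differentiable sampling
        return z, mu, logvar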

3.4. Transformer Conditional Decoder

To generate EEG signals with explicit emotion label control, this research designs a Transformer-based conditional decoder, illustrated in Figure 3, that maps latent variables z from the latent space into multi-channel time series signals consistent with the morphology of real EEG. This module retains the decoding characteristics of variational generative models while incorporating the global modeling capacity of a Transformer decoder, thereby improving structural consistency and temporal fidelity in the synthesized signals.
In implementation, the latent variable $z \in \mathbb{R}^{k}$ is concatenated with the one-hot emotion label vector $y_{onehot} \in \mathbb{R}^{n_{class}}$ to form a conditional vector $[z, y_{onehot}] \in \mathbb{R}^{k + n_{class}}$. This vector is mapped via a fully connected layer to a target dimension and then replicated $T$ times to construct a pseudo time sequence $M \in \mathbb{R}^{T \times d_{embed}}$, which serves as the memory input to the Transformer decoder.
Simultaneously, a learnable positional encoding matrix $P \in \mathbb{R}^{T \times d_{embed}}$ is introduced as the initial query (tgt) for the target sequence. During training, tgt is initialized as a zero tensor plus positional encodings, simulating the process of “decoding” a complete EEG sequence step-by-step from the latent variable.
The Transformer decoder consists of two stacked decoding layers, each comprising multi-head self-attention, source–target attention, and a feed-forward network, with residual connections and layer normalization. At each time step, the decoder combines the memory with its own historical outputs to iteratively refine the sequence representation. The final Transformer output is projected via a linear layer to the EEG channel dimension $C$, yielding an output of shape $[B, T, C]$, consistent with the input format.
This conditional decoder offers three key advantages:
Direct embedding of label conditions within the decoding process;
Enhanced global modeling capacity through the Transformer architecture;
High retention of EEG temporal structure in the outputs, enabling high-fidelity, semantically controllable generation of emotion-related EEG data.
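The sketch below illustrates the conditional decoding path described in this section (conditional vector, replicated memory, zero target plus positional encodings, two decoder layers, linear projection to the channel dimension); dimensions follow Sections 3.4 and 4.2, while the class name, parameter names, and positional-encoding initialization are assumptions.

import torch
import torch.nn as nn

class ConditionalTransformerDecoder(nn.Module):
    """Sketch of the conditional decoder: [z, y_onehot] -> memory M -> Transformer decoder -> EEG."""
    def __init__(self, latent_dim=128, n_classes=3, d_embed=256, n_heads=4,
                 n_layers=2, seq_len=400, n_channels=62):
        super().__init__()
        self.to_memory = nn.Linear(latent_dim + n_classes, d_embed)
        layer = nn.TransformerDecoderLayer(d_model=d_embed, nhead=n_heads, batch_first=True)
        self.decoder = nn.TransformerDecoder(layer, num_layers=n_layers)
        self.pos = nn.Parameter(torch.randn(1, seq_len, d_embed))   # learnable positional queries
        self.out = nn.Linear(d_embed, n_channels)
        self.seq_len = seq_len

    def forward(self, z, y_onehot):                                  # z: [B, k], y_onehot: [B, n_class]
        cond = torch.cat([z, y_onehot], dim=-1)                      # conditional vector [z, y_onehot]
        memory = self.to_memory(cond).unsqueeze(1).repeat(1, self.seq_len, 1)  # pseudo time sequence M
        tgt = torch.zeros_like(memory) + self.pos                    # zero target plus positional encoding
        h = self.decoder(tgt, memory)
        return self.out(h)                                           # synthetic EEG of shape [B, T, C]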

3.5. Discriminator Network

To enhance the realism and distributional consistency of generated EEG signals, an adversarial learning mechanism is integrated into the generative framework. A lightweight one-dimensional convolutional discriminator network, illustrated in Figure 4, is designed to distinguish between generated and real EEG segments, thereby guiding the generator to produce high-quality EEG data with stronger neurophysiological characteristics.
The discriminator takes as input either generated or real EEG segments of shape $[B, T, C]$, where $T$ denotes the number of time steps and $C$ denotes the number of channels. Before entering the discriminator, the input is transposed to $[B, C, T]$ to accommodate the temporal sequence modeling requirements of one-dimensional convolution operations.
The network consists of three sequential one-dimensional convolutional layers with kernel sizes of 5, 5, and 3, respectively, and a stride of 2. Zero-padding is applied to maintain the temporal dimension. Each convolutional layer is followed by a LeakyReLU activation function to mitigate vanishing gradients and enhance expressiveness of features. The resulting high-dimensional temporal features are flattened into a vector via a Flatten operation and passed through two fully connected layers, ultimately outputting a scalar value that predicts the authenticity of the EEG segment.
The discriminator is trained to maximize the probability of correctly identifying real EEG signals while minimizing its error rate on generated EEG signals. Through this adversarial process, the generator is driven to produce synthetic samples that are increasingly challenging for the discriminator to classify. During training, the generator and discriminator are updated alternately, forming a game-theoretic loop that stably improves both the quality of generated data and the discriminator’s capacity.
Overall, the proposed discriminator is compact and efficient, capable of capturing subtle differences between generated and real EEG in both local temporal patterns and global statistical distributions. This provides the generator with effective training feedback, enhancing the expressive power and practical value of the overall generative system.
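A compact sketch of this discriminator follows; the kernel sizes (5, 5, 3), stride 2, zero-padding, LeakyReLU activations, flattening, and two fully connected layers follow the description above, whereas the hidden channel widths (64/128/256), the 128-unit penultimate layer, and the LeakyReLU slope are assumptions not specified in the text. With stride 2, each layer halves the temporal length despite the "same"-style padding.

import torch
import torch.nn as nn

class Conv1dDiscriminator(nn.Module):
    """Sketch of the lightweight 1-D convolutional discriminator."""
    def __init__(self, n_channels=62, seq_len=400):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv1d(n_channels, 64, kernel_size=5, stride=2, padding=2), nn.LeakyReLU(0.2),
            nn.Conv1d(64, 128, kernel_size=5, stride=2, padding=2), nn.LeakyReLU(0.2),
            nn.Conv1d(128, 256, kernel_size=3, stride=2, padding=1), nn.LeakyReLU(0.2),
        )
        flat_dim = 256 * (seq_len // 8)                 # three stride-2 layers shrink T by 8x
        self.head = nn.Sequential(nn.Flatten(), nn.Linear(flat_dim, 128),
                                  nn.LeakyReLU(0.2), nn.Linear(128, 1))

    def forward(self, x):                               # x: [B, T, C]
        x = x.transpose(1, 2)                           # -> [B, C, T] for Conv1d
        return self.head(self.conv(x))                  # scalar realness score per segment

score = Conv1dDiscriminator()(torch.randn(8, 400, 62))  # -> torch.Size([8, 1])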

3.6. Loss Function and Joint Optimization Strategy

To improve the structural, semantic, and spectral fidelity of generated EEG signals, this research designs a multi-objective joint optimization strategy that integrates the reconstruction objectives of the VAE, the authenticity discrimination objective of the GAN, and multiple structural consistency metrics into a unified training framework. The overall loss function comprises the following six components:
(1)
Reconstruction Loss
Measures the amplitude difference between generated and real EEG segments in the time domain using Mean Squared Error (MSE):
$L_{recon} = \frac{1}{N} \sum_{i=1}^{N} \left\| \hat{x}^{(i)} - x^{(i)} \right\|_2^2$
(2)
KL Divergence Loss
Regularizes the distribution of latent variables to approximate a standard normal distribution, ensuring continuity of the latent space and the ability to sample from it:
$L_{KL} = -\frac{1}{2} \sum_{j=1}^{k} \left( 1 + \log \sigma_j^2 - \mu_j^2 - \sigma_j^2 \right)$
(3)
Adversarial Loss
Measures the generator’s ability to fool the discriminator, guiding it toward the real data distribution. This research adopts the standard binary cross-entropy form:
$L_{adv} = -\mathbb{E}_{\hat{x}} \left[ \log D(\hat{x}) \right]$
(4)
Pearson Correlation Loss
Evaluates morphological trend similarity between generated and real signals by computing the average channel-wise correlation per sample and taking its complement:
$L_{corr} = 1 - \mathrm{corr}(\hat{x}, x)$
(5)
Smoothness Loss
Constrains temporal smoothness of generated signals by penalizing large variations between adjacent time steps:
$L_{smooth} = \frac{1}{T-1} \sum_{t=1}^{T-1} \left\| \hat{x}_t - \hat{x}_{t+1} \right\|_2^2$
(6)
Power Spectrum Consistency Loss
Enhances spectral-domain structural consistency by comparing the power spectral density (PSD) of generated and real signals. The Welch method is used to estimate PSD, and the MSE between the two spectra is computed.
To prevent imbalance among the multiple objectives from negatively affecting training, a batch-wise dynamic weighting mechanism is introduced. At each batch, the scalar losses $L_i$ (reconstruction, KL, adversarial, Pearson, smoothness, spectrum) are z-scored within the batch, passed through a softmax, then clipped to [0.1, 0.4] and renormalized to sum to one. The weights are computed under no-grad; they are not learnable and require no initialization, which keeps optimization numerically stable and prevents any single objective from dominating. With six terms, the post-clipping bounds imply a maximum per-term contribution of 0.4/(0.4 + 5 × 0.1) = 0.444 and a minimum of 0.1/(0.4 + 5 × 0.1) = 0.111. The total generator loss is defined as
$L_{gen} = \sum_{i=1}^{6} \alpha_i L_i$
where $\alpha_i$ are the dynamically computed weights.
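A minimal sketch of this weighting scheme is given below, interpreting the batch-wise z-scoring as standardizing the six scalar loss values of the current batch; the epsilon constant and tensor handling are assumptions.

import torch

def dynamic_loss_weights(losses, w_min=0.1, w_max=0.4):
    """Z-score the scalar losses, softmax, clip to [w_min, w_max], renormalize to sum to one."""
    with torch.no_grad():                                   # weights are computed under no-grad
        v = torch.stack([l.detach() for l in losses])
        z = (v - v.mean()) / (v.std() + 1e-8)
        w = torch.softmax(z, dim=0)
        w = torch.clamp(w, w_min, w_max)
        w = w / w.sum()                                     # renormalize so the weights sum to one
    return w

# Usage sketch:
# losses = [l_recon, l_kl, l_adv, l_corr, l_smooth, l_spec]
# alphas = dynamic_loss_weights(losses)
# l_gen = sum(a * l for a, l in zip(alphas, losses))        # L_gen = sum_i alpha_i * L_i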
The generator and discriminator are trained alternately. In each iteration, the discriminator is updated first to improve its ability to distinguish between real and generated data, followed by two updates to the generator to enhance both reconstruction accuracy and its capacity to fool the discriminator.
Through this multi-objective joint loss design and adversarial optimization strategy, the model can accurately reconstruct EEG segments in the temporal amplitude domain while closely matching real signals in morphology, spectrum, and statistical structure—achieving multi-perspective, high-consistency generation of emotion-related EEG signals.

4. Experimental Design and Results Analysis

4.1. Experimental Environment

All experiments were conducted on a Windows server equipped with an NVIDIA RTX 4060Ti GPU (NVIDIA Corporation, Santa Clara, CA, USA). The operating system was Windows 11 (Microsoft Corporation, Redmond, WA, USA), the primary development environment was Python 3.10, and the deep learning framework used was PyTorch 2.0.1. Additional computational and signal processing libraries included NumPy 1.24.4, SciPy 1.10.1, and scikit-learn 1.3.2. To ensure experimental stability, all random processes were initialized with fixed random seeds.

4.2. Implementation Details and Hyperparameters

The training configuration is as follows: a Transformer encoder with 2 layers, 4 attention heads, and a 256-dimensional embedding; a VAE latent dimension of 128; dynamic loss weighting (softmax over z-scored losses, clamped to [0.1, 0.4]); Adam optimizers (generator lr = 1 × 10−3, discriminator lr = 1 × 10−4, $\beta_1 = 0.9$, $\beta_2 = 0.999$); no learning-rate schedule; batch size 64; and 100 epochs.
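As an illustrative sketch of this optimization setup: the placeholder modules below merely stand in for the actual generator and discriminator, and the seed value is an arbitrary example.

import torch
import torch.nn as nn

generator = nn.Sequential(nn.Linear(128, 400 * 62))                   # placeholder for the encoder/decoder stack
discriminator = nn.Sequential(nn.Flatten(), nn.Linear(400 * 62, 1))   # placeholder for the 1-D conv discriminator

opt_G = torch.optim.Adam(generator.parameters(), lr=1e-3, betas=(0.9, 0.999))
opt_D = torch.optim.Adam(discriminator.parameters(), lr=1e-4, betas=(0.9, 0.999))
torch.manual_seed(42)   # fixed random seeds; batch size 64, 100 epochs, no learning-rate schedule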

4.3. Evaluation Metrics

To comprehensively evaluate the performance of the proposed EEG signal generation model in terms of realism, structural consistency, and spectral preservation, this research designed multiple objective evaluation metrics from the perspectives of statistical correlation, frequency-domain consistency, information divergence, and feature-space distribution.

4.3.1. Pearson Correlation Coefficient

The Pearson correlation coefficient is used to measure the channel-level linear correlation between generated signals and real signals, defined as:
$r_{xy} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2} \sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}$
where $x_i$ and $y_i$ represent the $i$-th sampling points of the real and generated signals, and $\bar{x}$ and $\bar{y}$ denote their respective means. This metric reflects whether the generated signals preserve the temporal morphology of the real signals.
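A possible channel-wise implementation of this metric for a pair of [time × channels] segments; the averaging convention across channels and the small denominator constant are assumptions.

import numpy as np

def channelwise_pearson(real, gen):
    """Mean Pearson correlation across channels for two [time x channels] EEG segments."""
    rs = []
    for ch in range(real.shape[1]):
        x, y = real[:, ch], gen[:, ch]
        num = np.sum((x - x.mean()) * (y - y.mean()))
        den = np.sqrt(np.sum((x - x.mean()) ** 2)) * np.sqrt(np.sum((y - y.mean()) ** 2))
        rs.append(num / (den + 1e-12))
    return float(np.mean(rs))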

4.3.2. Spearman’s Rank Correlation Coefficient

To further assess the nonlinear relationship between generated and real signals, Spearman’s rank correlation coefficient is introduced. It is calculated based on the rank differences in the signal values and is suitable for evaluating non-normal or nonlinear signal relationships:
$\rho = 1 - \frac{6 \sum d_i^2}{n(n^2 - 1)}$
where $d_i$ is the rank difference of the $i$-th sampling point in the two signals, and $n$ is the number of sampling points.
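An analogous channel-wise computation using SciPy's rank correlation; the averaging convention across channels is an assumption.

import numpy as np
from scipy.stats import spearmanr

def channelwise_spearman(real, gen):
    """Mean Spearman rank correlation across channels for two [time x channels] EEG segments."""
    rhos = [spearmanr(real[:, ch], gen[:, ch]).correlation for ch in range(real.shape[1])]
    return float(np.mean(rhos))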

4.3.3. Kullback–Leibler Divergence

KL divergence is used to quantify the informational difference between the spectral distributions of real and generated signals, defined as:
$D_{KL}(P \| Q) = \sum_{i=1}^{n} P(i) \log \frac{P(i)}{Q(i)}$
where $P(i)$ and $Q(i)$ are the normalized power spectral values of the real and generated signals at the $i$-th frequency bin.
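A per-channel sketch of this spectral KL computation, assuming a 200 Hz sampling rate, a 2 s Welch window, the 0.5–45 Hz band used elsewhere in the paper, and a small epsilon for numerical stability.

import numpy as np
from scipy.signal import welch

def spectral_kl(x, z, fs=200, band=(0.5, 45.0), eps=1e-12):
    """KL divergence between normalized Welch power spectra of a real (x) and generated (z) channel."""
    f, px = welch(x, fs=fs, nperseg=int(2 * fs))
    _, pz = welch(z, fs=fs, nperseg=int(2 * fs))
    mask = (f >= band[0]) & (f <= band[1])
    p = px[mask] / px[mask].sum()                   # P(i): real spectral distribution
    q = pz[mask] / pz[mask].sum()                   # Q(i): generated spectral distribution
    return float(np.sum(p * np.log((p + eps) / (q + eps))))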
In subsequent experiments, this research reports average metric values across all channels and emotion categories and compares them against reference baselines to systematically evaluate the realism, spectral fidelity, and label controllability of the proposed model.

4.3.4. Fréchet Distance

To assess multivariate distributional similarity beyond pointwise shape and spectral overlap, this research computes the Fréchet distance (FID) between Gaussian embeddings of windowed, channel-wise EEG features extracted from real and generated signals. Per channel, the waveform is segmented into short windows of 0.30 s with a 0.15 s hop; for each window, a feature vector is formed comprising five band-power integrals (0.5–4, 4–8, 8–13, 13–30, 30–45 Hz; log-energy), the spectral centroid and spectral entropy computed from a Welch PSD normalized over 0.5–45 Hz, the time-domain RMS, and AR(4) coefficients estimated via Yule–Walker. Real and generated feature sets are pooled and z-normalized feature-wise, then split to estimate means $\mu_1, \mu_2$ and covariances $C_1, C_2$; covariances are symmetrized and diagonally shrunk to ensure positive semi-definiteness, and the matrix square root uses an SVD of $C_1 C_2$. The resulting Fréchet (Gaussian) distance is
$\mathrm{FID}(X, Z) = \left\| \mu_1 - \mu_2 \right\|^2 + \mathrm{Tr}\left( C_1 + C_2 - 2 (C_1 C_2)^{1/2} \right)$
reported per channel (lower is better). By summarizing joint statistics of short-term time–frequency descriptors, FID complements Pearson/Spearman and KL/PSD-MSE, providing a distribution-level criterion for alignment between real and synthesized EEG.
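Given per-channel feature matrices built as described above, the Gaussian Fréchet distance itself can be computed as in the sketch below; note that this sketch uses SciPy's sqrtm for the matrix square root rather than the SVD-with-shrinkage procedure described in the text, so it is an approximation of the reported metric rather than the authors' exact computation.

import numpy as np
from scipy.linalg import sqrtm

def frechet_distance(feats_real, feats_gen):
    """Gaussian Fréchet distance between two [n_windows x n_features] feature sets for one channel."""
    mu1, mu2 = feats_real.mean(axis=0), feats_gen.mean(axis=0)
    c1 = np.cov(feats_real, rowvar=False)
    c2 = np.cov(feats_gen, rowvar=False)
    covmean = sqrtm(c1 @ c2)                 # matrix square root of C1 * C2
    if np.iscomplexobj(covmean):             # drop tiny imaginary parts from numerical error
        covmean = covmean.real
    return float(np.sum((mu1 - mu2) ** 2) + np.trace(c1 + c2 - 2.0 * covmean))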

4.3.5. Mean Squared Error

To assess frequency-domain similarity in a phase-agnostic manner, this research compares the Welch PSDs of real and generated signals per channel. For each channel, PSDs are computed with a 2.0 s Hamming window, 50% overlap, and $N_{fft} = 1024$, then band-limited to 0.5–45 Hz. To remove scale effects, each PSD is L1-normalized over the band to form a probability-like spectral envelope, $\tilde{P}_x(f) = P_x(f) / \int_{0.5}^{45} P_x(u)\, du$ and $\tilde{P}_z(f) = P_z(f) / \int_{0.5}^{45} P_z(u)\, du$. Denoting the Welch frequency grid within [0.5, 45] Hz by $\{ f_i \}_{i=1}^{M}$, the PSD-MSE is the discrete mean-squared error between the normalized envelopes:
$\mathrm{MSE}(x, z) = \frac{1}{M} \sum_{i=1}^{M} \left( \tilde{P}_x(f_i) - \tilde{P}_z(f_i) \right)^2$
where lower values indicate closer spectral envelopes. PSD-MSE complements time-domain shape metrics and divergence measures by specifically quantifying frequency-wise amplitude redistribution while remaining insensitive to phase.
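A per-channel sketch of the PSD-MSE under the stated settings (2.0 s Hamming window, 50% overlap, Nfft = 1024, 0.5–45 Hz band); the 200 Hz sampling rate is taken from the preprocessing section, and the function name is illustrative.

import numpy as np
from scipy.signal import welch

def psd_mse(x, z, fs=200, band=(0.5, 45.0)):
    """MSE between L1-normalized Welch spectral envelopes of two single-channel signals."""
    nperseg = int(2.0 * fs)                                   # 2.0 s window, 50% overlap
    f, px = welch(x, fs=fs, window="hamming", nperseg=nperseg, noverlap=nperseg // 2, nfft=1024)
    _, pz = welch(z, fs=fs, window="hamming", nperseg=nperseg, noverlap=nperseg // 2, nfft=1024)
    mask = (f >= band[0]) & (f <= band[1])
    px, pz = px[mask], pz[mask]
    px, pz = px / px.sum(), pz / pz.sum()                     # scale-free spectral envelopes
    return float(np.mean((px - pz) ** 2))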

4.3.6. Earth Mover’s Distance

To quantify frequency-domain discrepancy in a phase-agnostic way, this research computes the 1-D Earth Mover’s Distance (EMD) between the Welch PSDs of real and generated signals on each channel. For each channel, a zero-phase Welch estimate is obtained and restricted to 0.5–45 Hz. The resulting spectra $P_x(f)$ and $P_z(f)$ are L1-normalized to form probability densities over frequency, from which cumulative distribution functions $F_x(f)$ and $F_z(f)$ are constructed. The EMD is then the area between these CDFs over the band:
$\mathrm{EMD}(x, z) = \int_{0.5}^{45} \left| F_x(f) - F_z(f) \right| df$
reported per channel in units of Hz. Because it compares spectral envelopes rather than pointwise amplitudes, EMD complements Pearson/Spearman and PSD-MSE/KL by capturing frequency-wise mass shifts that reflect band-power redistribution between real and synthesized EEG.
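A per-channel sketch of this spectral EMD as the area between the two normalized-PSD CDFs; the Welch parameters mirror the previous metric sketches and are assumptions where not stated.

import numpy as np
from scipy.signal import welch

def spectral_emd(x, z, fs=200, band=(0.5, 45.0)):
    """1-D EMD (in Hz) between normalized Welch PSDs: area between the two spectral CDFs."""
    f, px = welch(x, fs=fs, nperseg=int(2 * fs))
    _, pz = welch(z, fs=fs, nperseg=int(2 * fs))
    mask = (f >= band[0]) & (f <= band[1])
    f, px, pz = f[mask], px[mask] / px[mask].sum(), pz[mask] / pz[mask].sum()
    cdf_x, cdf_z = np.cumsum(px), np.cumsum(pz)
    return float(np.trapz(np.abs(cdf_x - cdf_z), f))          # integral of |F_x - F_z| over the band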

4.3.7. Classification Consistency

To assess semantic consistency from a downstream perspective, this research adopts a frozen-classifier congruence protocol. A CNN baseline is trained only on real EEG to learn mappings from signals to emotion labels; after training, the classifier is frozen and used to score generated segments conditioned on the same labels. If synthesized signals preserve label-relevant structure, performance on generated data should approach that on real data.
This research reports accuracy, precision, recall, F1-score, and AUC. To avoid cross-subject leakage, this research uses within-subject, trial-level splits on each dataset: trials pooled across the three sessions are stratified by label and partitioned 80%/10%/10% into train/validation/test; all windows inherit their parent trial’s split. The frozen classifier’s hyper-parameters and decision thresholds are fixed by the validation set; the test set remains untouched and real-only.
This protocol evaluates whether generated samples are interpretable by a classifier trained on real data (semantic congruence) without conflating effects from joint training. The augmentation study, in which generated segments are added to the real training set while validation/test remain real-only, is reported in Section 4.4.10, isolating its impact on generalization.
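A sketch of the trial-level stratified split under this protocol, assuming trial identifiers and labels are pooled across the three sessions (45 trials per subject, 15 per class, per the dataset description); the function name, seed, and the exact rounding of the 10%/10% portions are illustrative.

import numpy as np
from sklearn.model_selection import train_test_split

def trial_level_split(trial_ids, trial_labels, seed=42):
    """80%/10%/10% stratified split at the trial level; windows inherit their parent trial's split."""
    train_ids, rest_ids, _, rest_y = train_test_split(
        trial_ids, trial_labels, test_size=0.2, stratify=trial_labels, random_state=seed)
    val_ids, test_ids = train_test_split(
        rest_ids, test_size=0.5, stratify=rest_y, random_state=seed)
    return set(train_ids), set(val_ids), set(test_ids)

# Example: 45 trials per subject pooled over three sessions, 15 per emotion class.
ids = np.arange(45)
labels = np.repeat([0, 1, 2], 15)
train, val, test = trial_level_split(ids, labels)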

4.4. Experimental Results and Analysis

4.4.1. Time-Domain Comparison

Figure 5 illustrates the time-domain comparison between real and generated EEG signals across 16 representative channels from the SEED dataset. Overall, the generated signals reproduce the major peaks, troughs, and envelope fluctuations of the real EEG, showing consistent temporal dynamics. In most channels, the phases of oscillatory components are well aligned, while in a few channels (e.g., 3, 6, and 15) the generated signals show amplitude compression or upward offset. Despite these local deviations, the generated signals preserve the overall waveform morphology and cross-channel coherence, indicating that the proposed model effectively captures both global temporal dependencies and local oscillatory structure, which are essential for emotion-related EEG generation.
Figure 6 illustrates the time-domain comparison between real and generated EEG signals across 16 representative channels from the SEED-FRA dataset. The generated traces closely follow the real waveforms with well-aligned phases and comparable short-term fluctuations; most channels exhibit tight overlaps within the [−2, 3] a.u. range implied by z-scaling. A few channels show local amplitude compression or slight baseline offsets, yet the overall waveform morphology and cross-channel coherence are preserved. These observations indicate that the model generalizes to SEED-FRA by capturing both global temporal dependencies and local oscillatory structure under a distinct subject/language cohort.
Figure 7 illustrates the time-domain comparison between real and generated EEG signals across 16 representative channels from the SEED-GER dataset. The generated traces generally follow the rhythm and envelope of the real waveforms, with comparable short-term oscillations and visible peak–trough sequences. Several channels (e.g., 2, 3, 6, 13–16) exhibit baseline offsets and amplitude scaling relative to the real signals, whereas other channels show closer overlap. Despite these deviations, the generated signals preserve the overall waveform morphology and temporal patterning across channels, indicating that the model transfers to SEED-GER while retaining task-relevant dynamics. The residual offsets likely reflect dataset-specific amplitude/reference characteristics and could be further reduced with amplitude calibration or re-referencing.

4.4.2. Pearson Correlation Analysis

Figure 8 presents the channel-wise Pearson correlation coefficients between generated and real EEG signals from the SEED dataset. Most channels exhibit high correlations, with the majority ≥0.8 and many approaching 0.9, indicating strong preservation of temporal morphology. A small number of channels lie in the 0.7–0.8 range, suggesting mild amplitude/phase mismatches without altering the overall trends. Overall, the distribution shows that the proposed model robustly reproduces channel-specific temporal structures and maintains cross-channel consistency with high fidelity.
Figure 9 presents the channel-wise Pearson correlation between real and generated EEG on the SEED-FRA dataset. Most channels fall in the 0.70–0.85 range, with several channels approaching or exceeding 0.90, indicating good preservation of temporal morphology for the majority of channels. A subset of channels lies around 0.50–0.70, and a small cluster dips below 0.50 (mid-index channels), reflecting localized amplitude/phase mismatches that may stem from noise or corpus-specific variability. Overall, the distribution suggests that the model maintains channel-specific waveform structure on SEED-FRA while leaving room for improvement on a few channels with lower alignment.
Figure 10 presents the channel-wise Pearson correlation between real and generated EEG on the SEED-GER dataset. Most channels exhibit high correlations (≥0.80), with many approaching 0.90–0.95, indicating strong preservation of time-domain morphology across channels. A smaller subset lies around 0.70–0.80, reflecting mild amplitude/phase mismatches consistent with the time-domain offsets observed in several channels, yet without altering the overall trend structure. Overall, the distribution suggests that the model maintains channel-specific waveform similarity on SEED-GER with uniformly high alignment and only limited room for improvement on a few channels.

4.4.3. Spearman Correlation Analysis

Figure 11 presents the channel-wise Spearman correlation between generated and real EEG signals from the SEED dataset. Most channels exhibit high coefficients, with the majority in the 0.80–0.90 range and several approaching 0.95, indicating preservation of monotonic rank-order relationships and local fluctuation patterns. A small subset lies around 0.70–0.80, suggesting minor amplitude/phase mismatches without altering overall trends; very low values (<0.50) are not observed. Taken together, these results indicate that the generated EEG maintains the nonlinear structural characteristics of real EEG, supporting robustness in temporal-morphology reconstruction.
Figure 12 presents the channel-wise Spearman correlation between generated and real EEG on the SEED-FRA dataset. Most channels lie in the 0.70–0.85 range, indicating that the model preserves monotonic rank-order relationships and local fluctuation patterns. Several channels fall around 0.55–0.70, and a small cluster dips below 0.50 (mid-index channels), reflecting localized amplitude/phase mismatches likely driven by noise or corpus-specific variability. A few channels approach 0.85–0.90 toward the upper end, showing strong alignment. Overall, the distribution suggests good nonlinear structural consistency on SEED-FRA, with room to improve alignment for the subset of lower-scoring channels.
Figure 13 presents the channel-wise Spearman correlation between generated and real EEG on the SEED-GER dataset. Most channels exhibit high coefficients (≈0.80–0.90), with several approaching 0.95, indicating strong preservation of monotonic rank-order relationships and local fluctuation patterns. A smaller subset lies around 0.70–0.80, consistent with mild amplitude/phase mismatches observed in a few channels, yet without altering overall trends. Overall, the distribution suggests uniformly good nonlinear structural alignment on SEED-GER, with limited room for improvement on lower-scoring channels.

4.4.4. Spectral Distribution Information Divergence Analysis

Figure 14 illustrates the channel-wise KL divergence between the spectral distributions (0.5–45 Hz) of generated and real EEG signals from the SEED dataset. Most channels lie around 0.25–0.50, indicating acceptable preservation of frequency-domain structure for the majority of channels. A subset shows higher divergence, with several channels reaching ~0.60–0.75 (and a few approaching 0.80), suggesting localized spectral-envelope mismatches (e.g., band-power shifts) that may arise from noise or subject-specific variability. Overall, the distribution indicates broadly similar spectral structure with pockets of discrepancy, without evidence of severe global distortions.
Figure 15 illustrates the channel-wise KL divergence between the 0.5–45 Hz spectral distributions of real and generated EEG on the SEED-FRA dataset. Most channels fall in the low-to-moderate range (~0.20–0.60), indicating acceptable preservation of frequency-domain structure for the majority of channels. Several channels show elevated divergence (~0.70–0.95) and isolated peaks exceed 1.0, suggesting localized spectral-envelope mismatches (e.g., band-power shifts or cohort-specific variability). Overall, the distribution points to broadly similar spectral profiles on SEED-FRA with a small subset of channels requiring further calibration to reduce divergence.
Figure 16 illustrates the channel-wise KL divergence between the 0.5–45 Hz spectral distributions of real and generated EEG on the SEED-GER dataset. Most channels lie in a low–moderate range (~0.20–0.60), indicating broadly preserved spectral envelopes, while several channels rise to ~0.70–0.90 and isolated peaks approach ~1.2, reflecting localized band-power shifts or cohort-specific variability. Overall, the distribution suggests generally similar frequency-domain structure on SEED-GER with a small subset of channels exhibiting higher divergence that could benefit from additional calibration.

4.4.5. Frequency-Domain Consistency Analysis

Figure 17 compares the Welch PSDs (0.5–45 Hz, dB/Hz) of real and generated EEG across 16 representative channels from the SEED dataset. The generated spectra closely follow the 1/f-like slope and band-limited peaks of the real signals, with clear agreement in the δ (0.5–4 Hz), θ (4–8 Hz), and α (8–13 Hz) bands. The β (13–30 Hz) and low-γ (30–45 Hz) ranges are broadly aligned, though several channels (e.g., 6, 10, and 15) show mild power offsets or slightly deeper/narrower notches around specific peaks. Despite these local differences, peak locations and the overall spectral envelope are preserved, and no spurious high-frequency energy is observed. These results indicate that the proposed model stably reconstructs frequency-domain structure, capturing global rhythmic trends together with channel-specific spectral details relevant for emotion-related analysis.
Figure 18 compares the Welch PSDs (0.5–45 Hz, dB/Hz) of real and generated EEG across 16 representative channels from the SEED-FRA dataset. The generated spectra closely track the real 1/f-like slope and band-limited peaks, with clear agreement in the δ (0.5–4 Hz), θ (4–8 Hz), and α (8–13 Hz) ranges. The β (13–30 Hz) and low-γ (30–45 Hz) bands are broadly aligned as well, though several channels show mild level offsets (a few dB) or slightly deeper/narrower notches at specific frequencies. Despite these local differences, peak locations and the overall spectral envelope are preserved, and no spurious high-frequency energy is observed. These results indicate that the model generalizes to SEED-FRA while stably reconstructing frequency-domain structure relevant for emotion-related analysis.
Figure 19 compares the Welch PSDs (0.5–45 Hz, dB/Hz) of real and generated EEG across 16 representative channels from the SEED-GER dataset. The generated spectra generally track the real 1/f slope and band-limited peaks with clear agreement in δ (0.5–4 Hz), θ (4–8 Hz), and α (8–13 Hz), and broadly aligned energy in β (13–30 Hz) and low-γ (30–45 Hz). Several channels exhibit level offsets of a few dB and sharper or slightly shifted notches/peaks around mid-beta frequencies, while others show near-overlap across the band. Despite these local differences, peak locations and the overall spectral envelope are preserved, and no spurious high-frequency energy is observed, indicating that the model transfers to SEED-GER while stably reconstructing frequency-domain structure relevant for emotion-related analysis.

4.4.6. Feature-Space Distribution Visualization Analysis

Figure 20 presents a t-SNE visualization of PSD-based frequency-domain features (0.5–45 Hz) for real (blue) and generated (orange) EEG from the SEED dataset. The two sets occupy a similar manifold with comparable spread, and there is substantial local overlap, indicating preservation of global feature-space structure. At the same time, the generated points show a modest centroid shift (primarily along the t-SNE-2 axis) relative to the real ones, consistent with a small domain offset rather than mode collapse. Overall, the neighborhood structure is maintained, and no isolated or collapsed clusters are observed, supporting that the synthesized signals capture physiologically meaningful spectral statistics while exhibiting a mild distributional shift that could be further reduced by calibration or domain alignment.
Figure 21 presents a t-SNE visualization of PSD-based frequency-domain features (0.5–45 Hz) for real (blue) and generated (orange) EEG from the SEED-FRA dataset. The two sets occupy a similar manifold with comparable spread, showing substantial local overlap across the plane; several mixed neighborhoods appear, including the right-hand cluster, indicating that the generator preserves the global structure of the spectral feature space. A small, nearly uniform shift in some orange points relative to nearby blue points can be seen (without isolated or collapsed clusters), suggesting a mild domain offset rather than mode collapse. Overall, the neighborhood geometry is maintained, supporting that the synthesized signals capture physiologically meaningful spectral statistics on SEED-FRA.
Figure 22 presents a t-SNE visualization of PSD-based frequency-domain features (0.5–45 Hz) for real (blue) and generated (orange) EEG from the SEED-GER dataset. The two sets trace a similar global manifold but exhibit a clear centroid shift along the diagonal (t-SNE-1/2) direction, yielding partially separated bands rather than mixed neighborhoods. This pattern indicates a systematic domain offset in spectral statistics—consistent with the level/offset differences observed in the GER spectra—while mode coverage is preserved (no collapsed or isolated clusters). Overall, the embedding suggests that the generator captures the structure of the GER spectral feature space but leaves a modest distributional gap that could be reduced with amplitude calibration or domain alignment.

4.4.7. Fréchet Distance Analysis

Figure 23 reports the channel-wise FID between real and generated EEG on the SEED dataset (lower is better). Most channels fall in a low–moderate range (~10–18), indicating broadly similar distributions of windowed spectral–temporal features. Several channels show elevated values (~20–26)—with a few local peaks near the mid-index channels—suggesting localized distributional gaps consistent with the small PSD/phase offsets observed in the time/frequency plots. Overall, the profile points to good distributional alignment for the majority of channels.
Figure 24 reports the channel-wise FID between real and generated EEG on the SEED-FRA dataset (lower is better). Most channels lie in a low range (~2–8), indicating close alignment of the windowed spectral–temporal feature distributions. Several channels show moderate elevations (~9–12), and a few late-index channels peak around ~13–16, suggesting localized distributional gaps consistent with the mild level/offset differences seen in the frequency-domain plots. Overall, the profile reflects good distributional similarity for the majority of channels.
Figure 25 reports the channel-wise FID between real and generated EEG on the SEED-GER dataset (lower is better). The majority of channels fall in a moderate range (~8–18), indicating broadly similar windowed spectral–temporal feature distributions. Several groups show elevated FID (~20–30), and there are isolated peaks near ~35–38 at late-index channels, suggesting localized distributional gaps consistent with the amplitude/offset differences and the centroid shift observed in the GER frequency and t-SNE analyses. A few channels reach low values (~2–6), showing near-match distributions. Overall, the profile points to good but less uniform alignment on SEED-GER than on SEED-FRA.

4.4.8. MSE Analysis

Figure 26 reports the channel-wise PSD-MSE between the 0.5–45 Hz Welch spectra of real and generated EEG on the SEED dataset (lower is better). Most channels fall below ~1.0, indicating close agreement of spectral envelopes; a second band lies around ~1.0–2.0, and a few sparse peaks reach ~2.5–3.5, with an isolated maximum near ~4.8 on a late-index channel. These spikes suggest localized band-power mismatches rather than global distortion. Overall, the profile indicates broad spectral similarity across channels with a small subset showing larger deviations that align with the minor discrepancies observed in the PSD and KL/EMD analyses.
Figure 27 reports the channel-wise PSD-MSE between the 0.5–45 Hz Welch spectra of real and generated EEG on the SEED-FRA dataset (lower is better). The bulk of channels cluster around ~0.6–1.6, indicating generally close spectral-envelope agreement; a secondary band appears at ~1.6–2.6, and several peaks rise to ~3.0–4.1 over late-index channels. These higher values suggest localized band-power mismatches rather than global distortions, consistent with the modest spectral offsets observed in the FRA PSD and the elevated EMD/KL on a subset of channels. Overall, the profile reflects good but less uniform frequency-domain alignment than SEED, with a small group of channels driving the larger errors.
Figure 28 reports the channel-wise PSD-MSE between the 0.5–45 Hz Welch spectra of real and generated EEG on the SEED-GER dataset (lower is better). Most channels lie in a low range (~0.4–1.5), indicating broadly similar spectral envelopes; a secondary band appears around ~1.5–2.5, and several isolated peaks rise to ~3.0–3.4, with an early-index outlier near ~4.8. These spikes suggest localized band-power mismatches rather than global distortion, consistent with the modest offsets seen in the GER PSD and the higher FID/EMD on a small subset of channels. Overall, the profile indicates good but less uniform frequency-domain alignment on SEED-GER, with a few channels driving the larger errors.

4.4.9. Earth Mover’s Distance Analysis

Figure 29 reports the channel-wise Earth Mover’s Distance (EMD) between the 0.5–45 Hz Welch PSDs of real and generated EEG on the SEED dataset (units: Hz, lower is better). Most channels lie in a low range (~0.08–0.20 Hz), indicating small frequency-wise mass shifts and broadly preserved spectral envelopes. A subset rises to ~0.20–0.30 Hz, and there are isolated peaks around ~0.35–0.45 Hz at mid/late-index channels, reflecting localized band-power redistributions rather than global mismatch. Overall, the profile suggests good frequency-domain alignment with a few channels exhibiting modest spectral shifts, consistent with the minor offsets seen in the PSD comparisons.
Figure 30 reports the channel-wise Earth Mover’s Distance (EMD) between the 0.5–45 Hz Welch PSDs of real and generated EEG on the SEED-FRA dataset (units: Hz, lower is better). Most channels fall in a moderate range (~0.20–0.40 Hz), indicating broadly preserved spectral envelopes with noticeable but bounded frequency-wise mass shifts. Several channels rise to ~0.45–0.55 Hz, and isolated peaks approach ~0.60–0.65 Hz (mid-index channels), reflecting localized band-power redistributions rather than global mismatch. Overall, the profile suggests good but less uniform frequency-domain alignment on SEED-FRA compared with SEED, consistent with cohort/device variability and the small spectral level offsets observed in the PSD comparisons.
Figure 31 reports the channel-wise Earth Mover’s Distance (EMD) between the 0.5–45 Hz Welch PSDs of real and generated EEG on the SEED-GER dataset (units: Hz, lower is better). Most channels lie in a low–moderate range (~0.12–0.30 Hz), indicating broadly preserved spectral envelopes. Several channels rise to ~0.30–0.38 Hz, and there are isolated peaks near ~0.40–0.47 Hz around mid- and late-index channels, reflecting localized band-power shifts rather than global mismatch. Overall, the profile suggests good frequency-domain alignment on SEED-GER with a small subset of channels exhibiting higher EMD, consistent with the modest offsets seen in the GER PSD and t-SNE analyses.
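The following sketch illustrates one way to obtain such a channel-wise EMD in Hz, by treating the band-limited Welch PSDs as (normalized) distributions over frequency. The segment length and normalization choices are assumptions rather than the authors' exact settings.

```python
# Minimal sketch (assumed implementation): channel-wise Earth Mover's Distance between
# real and generated Welch PSDs, restricted to 0.5–45 Hz; result is expressed in Hz.
import numpy as np
from scipy.signal import welch
from scipy.stats import wasserstein_distance

def channelwise_psd_emd(real, gen, fs, fmin=0.5, fmax=45.0, nperseg=256):
    f, psd_real = welch(real, fs=fs, nperseg=nperseg, axis=-1)
    _, psd_gen = welch(gen, fs=fs, nperseg=nperseg, axis=-1)
    band = (f >= fmin) & (f <= fmax)
    freqs = f[band]
    emd = []
    for pr, pg in zip(psd_real[:, band], psd_gen[:, band]):
        # the PSD values act as weights over the frequency axis;
        # wasserstein_distance normalizes the weights to unit mass internally
        emd.append(wasserstein_distance(freqs, freqs, u_weights=pr, v_weights=pg))
    return np.array(emd)
```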

4.4.10. Classification Consistency Analysis

Table 1, Table 2 and Table 3 list the mean ± SD across subjects for SEED, SEED-FRA, and SEED-GER, respectively. For completeness, this research includes AUC alongside Accuracy/Precision/Recall/F1. Following the protocol in Section 4.3.7, this research evaluates frozen-classifier congruence and augmentation utility on the three datasets.
In frozen-classifier congruence, a CNN trained only on real data is frozen and used to score generated segments. Across datasets, performance on generated data approaches but remains below the real-data baseline: accuracy differences are −0.046 (SEED: 0.823 vs. 0.869), −0.042 (SEED-FRA: 0.793 vs. 0.835), and −0.040 (SEED-GER: 0.809 vs. 0.849). Precision, recall, and F1 exhibit the same trend, indicating that synthesized signals are largely interpretable by a classifier trained on real data, while a modest domain gap persists.
For augmentation utility, when class-balanced generated segments are added to the real training set (validation/test kept real-only, hyper-parameters unchanged), mean performance consistently improves: accuracy gains are +0.049 (SEED: 0.918 vs. 0.869), +0.049 (SEED-FRA: 0.884 vs. 0.835), and +0.045 (SEED-GER: 0.894 vs. 0.849), with concomitant improvements in F1 and AUC (Table 1, Table 2 and Table 3). These results indicate that the generator contributes non-redundant, label-consistent variability that enhances generalization rather than introducing noise.
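To make the two analyses concrete, the following sketch illustrates the protocol with a simple stand-in classifier (logistic regression instead of the CNN used in this study); the array names and the classifier choice are assumptions for illustration only.

```python
# Minimal sketch of the two evaluation analyses; a stand-in classifier replaces the CNN.
# X_* have shape (n_segments, n_features); y_* are emotion labels.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

def congruence_and_augmentation(X_real_tr, y_real_tr, X_gen, y_gen, X_real_te, y_real_te):
    # (1) Frozen-classifier congruence: train on real data only, then score generated segments.
    clf_real = LogisticRegression(max_iter=1000).fit(X_real_tr, y_real_tr)
    acc_real_test = accuracy_score(y_real_te, clf_real.predict(X_real_te))
    acc_generated = accuracy_score(y_gen, clf_real.predict(X_gen))

    # (2) Augmentation utility: add class-balanced generated segments to the real training set;
    #     validation/test remain real-only and hyper-parameters are unchanged.
    X_aug = np.vstack([X_real_tr, X_gen])
    y_aug = np.concatenate([y_real_tr, y_gen])
    clf_aug = LogisticRegression(max_iter=1000).fit(X_aug, y_aug)
    acc_augmented = accuracy_score(y_real_te, clf_aug.predict(X_real_te))

    return {"real_baseline": acc_real_test,
            "frozen_on_generated": acc_generated,
            "real_plus_generated": acc_augmented}
```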

4.5. Ablation Study

4.5.1. Experimental Setup

To examine the contribution of each module and training strategy, this research conducts single-factor ablations in which one component (architecture, conditioning, or loss) is removed or replaced while all other configurations (optimizer, schedule, batch size, epochs, preprocessing, splits) are kept identical. The tested variants are: Without cVAE (the KL term is removed and the latent becomes deterministic), Without GAN (the discriminator and adversarial loss are removed), Without Label Conditioning (no label embedding in the encoder/decoder), Without Positional Embedding, Without Transformer Encoder (replaced with a linear mapping), Transformer → CNN (1-D convolution + pooling instead of attention), Without Pearson Loss, and the Baseline (the full model with Transformer encoder, positional information, label conditioning, cVAE, GAN, Pearson + smoothness + spectral losses, and joint optimization).
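As an illustration of the single-factor protocol, the following sketch enumerates the variants as configuration flags; the flag names are hypothetical and do not correspond to variables in any released code.

```python
# Minimal sketch of the single-factor ablation grid (flag names are assumptions).
BASELINE = dict(use_cvae=True, use_gan=True, use_label_conditioning=True,
                use_positional_embedding=True, encoder="transformer",
                use_pearson_loss=True)

ABLATIONS = {
    "Without cVAE": dict(BASELINE, use_cvae=False),
    "Without GAN": dict(BASELINE, use_gan=False),
    "Without Label Conditioning": dict(BASELINE, use_label_conditioning=False),
    "Without Positional Embedding": dict(BASELINE, use_positional_embedding=False),
    "Without Transformer Encoder": dict(BASELINE, encoder="linear"),
    "Transformer -> CNN": dict(BASELINE, encoder="cnn"),
    "Without Pearson Loss": dict(BASELINE, use_pearson_loss=False),
}
# Everything else (optimizer, schedule, batch size, epochs, preprocessing, splits)
# is held fixed across variants.
```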

4.5.2. Results and Analysis

As shown in Table 4, the Baseline attains the best agreement (Pearson 0.838 ± 0.075, Spearman 0.819 ± 0.068) with comparatively low distributional gaps (KL 0.389 ± 0.145, FID 13.962 ± 4.293, EMD 0.198 ± 0.089) in the SEED dataset. Removing cVAE or GAN causes large drops in correlation (Pearson ≈ 0.51–0.52, Spearman ≈ 0.50–0.52) and markedly higher KL/FID/EMD (e.g., KL 1.476–1.545, FID 18.9–20.2, EMD 0.454–0.584), confirming the joint role of variational inference and adversarial learning in shaping a useful latent space and realistic samples. Label conditioning and positional information are also important: removing either degrades correlations to 0.61–0.63 and raises KL to 0.84–0.96. Eliminating the Transformer encoder further reduces Pearson/Spearman to 0.505/0.543. Replacing attention with a CNN partially recovers performance (Pearson 0.673 ± 0.145), but remains well below the Baseline, indicating the advantages of global attention over purely local convolution. Notably, ablation of the Pearson loss yields one of the worst distributional profiles (Pearson 0.454 ± 0.156, KL 1.735 ± 0.243, FID 21.445 ± 8.475, EMD 0.635 ± 0.234), underscoring its contribution to time-domain shape fidelity and its downstream effects on spectral/feature distributions.
As shown in Table 5, the Baseline reaches Pearson 0.739 ± 0.120 and Spearman 0.721 ± 0.117 with low FID (5.275 ± 2.906) relative to ablations in the SEED-FRA dataset. The most damaging removals are the Transformer encoder (Pearson 0.419 ± 0.201) and the adversarial/variational components (cVAE: 0.438, GAN: 0.427), all accompanied by strong increases in KL (1.54–2.15), FID (7.46–11.95), and EMD (0.65–0.79). Label conditioning and positional cues again provide clear gains (ablations at Pearson 0.516–0.579). CNN in place of attention yields 0.584 ± 0.175, above the linear mapping but below the Baseline. Pearson-loss removal (Pearson 0.554; KL 1.835) confirms that explicit trend alignment helps the broader objective mix on a cross-cohort corpus.
As shown in Table 6, the Baseline achieves the highest correlations (Pearson 0.844 ± 0.068, Spearman 0.831 ± 0.076) with modest KL (0.368 ± 0.184) in the SEED-GER dataset. Drops are evident when removing label conditioning (Pearson 0.437 ± 0.176) or Transformer (0.517 ± 0.157), and when ablating cVAE/GAN (Pearson 0.546–0.575). These degradations coincide with larger FID (17.3–19.6) and EMD (0.587–0.721), reflecting distributional shifts seen in GER’s t-SNE/PSD analyses. CNN replacement again performs mid-pack (Pearson 0.681 ± 0.167), highlighting that global attention is beneficial even under device/cohort shifts. Removing the Pearson loss (Pearson 0.538, KL 0.735) still harms overall alignment, though its effect size is somewhat smaller than on SEED/FRA.
Across all three datasets, the full model consistently yields the best correlation metrics and the lowest KL/FID/EMD. The largest degradations arise from removing cVAE/GAN or the Transformer; label conditioning and positional information give non-trivial gains; CNN recovers part of the attention benefits but not all. The Pearson loss materially improves time-domain morphology and, indirectly, frequency-domain and distributional alignment. These patterns support the necessity and complementarity of the proposed architectural and objective choices.

4.6. Comparative Experiments

4.6.1. Experimental Setup

To establish a controlled comparison in terms of time–frequency fidelity and downstream usability, this research evaluates four representative GAN baselines—DCGAN, WGAN, WGAN-GP, and T-CGAN [32]—under identical data, preprocessing, and training protocols. T-CGAN conditions both the generator and discriminator on time stamps to model irregularly sampled time series. All models are trained and assessed on the SEED, SEED-FRA, and SEED-GER datasets with the same within-subject, trial-level split. Evaluation comprises channel-wise Pearson and Spearman correlations and distributional distances. Downstream consistency and augmentation value are examined with a unified CNN classifier using three regimes: Real-only (train/validation/test on real data), Generated-only → Real test (train on generated, validate/test on real), and Real + Generated (augment the real training set with class-balanced generated segments while keeping validation/test real-only).

4.6.2. Results and Analysis

The results of the comparative experiments are summarized in Table 7, Table 8, Table 9, Table 10, Table 11 and Table 12. Across datasets, the proposed model (labeled Baseline in the tables) achieves the highest similarity and the lowest distributional discrepancy. On SEED (Table 7), it records Pearson 0.838 ± 0.075, Spearman 0.819 ± 0.068, KL 0.389 ± 0.145, FID 13.962 ± 4.293, and EMD 0.198 ± 0.089, outperforming the GAN baselines; among them, WGAN-GP is consistently the strongest competitor (e.g., Pearson 0.629 ± 0.084, Spearman 0.694 ± 0.121, KL 0.459 ± 0.139), whereas DCGAN, WGAN, and T-CGAN show larger gaps in both correlation and spectral distance. The same pattern is observed on SEED-FRA (Table 9), where the proposed model yields Pearson 0.739 ± 0.120, Spearman 0.721 ± 0.117, KL 0.411 ± 0.195, FID 5.275 ± 2.906, and EMD 0.320 ± 0.100, with WGAN-GP again ranking second among baselines. On SEED-GER (Table 11), the proposed model remains best (Pearson 0.844 ± 0.068, Spearman 0.831 ± 0.076, KL 0.368 ± 0.184, FID 15.308 ± 7.523, EMD 0.227 ± 0.084), indicating robustness under cohort/device shifts, while WGAN-GP is again the closest challenger and the other baselines trail further behind.
Downstream classification corroborates these consistency results. Under the Generated-only → Real test protocol (Table 8, Table 10 and Table 12), performance remains below the Real-only baseline for all generators, reflecting a domain gap when the classifier has never seen real data. Nevertheless, the baseline generator attains the strongest transfer among competitors on each dataset (SEED: 0.823 ± 0.186; SEED-FRA: 0.793 ± 0.176; SEED-GER: 0.809 ± 0.204 in accuracy). When generated segments are used for training set augmentation with validation/test kept real-only, the baseline achieves the largest and most consistent gains—SEED: 0.918 ± 0.094 vs. 0.869 ± 0.102; SEED-FRA: 0.884 ± 0.089 vs. 0.835 ± 0.112; SEED-GER: 0.894 ± 0.102 vs. 0.849 ± 0.113—accompanied by concomitant improvements in F1 and AUC. These outcomes indicate that the baseline not only preserves time-domain morphology and spectral envelopes more faithfully (high correlation, low KL/FID/EMD) but also contributes label-consistent, non-redundant variability that enhances classifier generalization. Overall, merely altering the adversarial loss (DCGAN, WGAN, WGAN-GP, T-CGAN) is insufficient to match the balanced, cross-dataset performance achieved by the proposed baseline, which integrates variational modeling, conditional control, and attention-based sequence representation.

5. Discussion

5.1. Challenges of EEG Signal Structural Characteristics for Generative Modeling

EEG’s non-stationarity, multi-scale rhythms, and cross-channel coupling make faithful generation difficult, and the cross-dataset ablations (SEED/SEED-FRA/SEED-GER) consistently reflect this. The Baseline yields the strongest balance of time-domain similarity and spectral fidelity (e.g., SEED Pearson/Spearman ≈ 0.84/0.82, KL ≈ 0.39), while removing cVAE or GAN markedly lowers correlations (Pearson ≈ 0.43–0.58) and inflates KL (≈0.75–1.68). Dropping label conditioning further degrades both temporal and spectral metrics, confirming its role in emotion-specific structure. Eliminating positional encoding or the Transformer encoder weakens long-range temporal modeling and raises spectral errors (KL often >0.9, up to ~2.15 on SEED-FRA). Removing the Pearson loss uniformly worsens morphology and spectrum (higher KL/FID/EMD). Taken together, variational modeling, adversarial refinement, label conditioning, positional encoding, and attention are all necessary to counter EEG’s intrinsic dynamics; the full model achieves the most stable trade-off across datasets.

5.2. Applicability and Complementarity of Multi-Dimensional Evaluation Metrics

A multi-metric perspective proves essential for EEG generation, as time-domain similarity and frequency-domain fidelity do not always covary. Across SEED, the Baseline achieves the most balanced profile—high temporal agreement (Pearson 0.838 ± 0.075, Spearman 0.819 ± 0.068) together with low spectral divergence (KL 0.389 ± 0.145, EMD 0.198 ± 0.089) and competitive FID (13.962 ± 4.293)—whereas classical GAN variants (DCGAN, WGAN, WGAN-GP, T-CGAN) exhibit mixed behavior: some improve one axis slightly (e.g., WGAN-GP modestly better Spearman 0.694 ± 0.121 than other baselines) yet incur higher KL/EMD (e.g., KL 0.459–0.635, EMD 0.257–0.424) or lower correlations (Pearson typically 0.54–0.63). The same pattern holds on SEED-FRA and SEED-GER, where the Baseline simultaneously maintains stronger correlations and lower KL/EMD than competing GANs. Importantly, downstream classification aligns with these multi-metric trends: under a frozen real-trained CNN, generated segments approach real-data performance (e.g., SEED accuracy 0.823 ± 0.186 vs. 0.869 ± 0.102), and augmenting real training sets with generated segments consistently improves validation/test results while keeping evaluation real-only (SEED 0.918 ± 0.094, SEED-FRA 0.884 ± 0.089, SEED-GER 0.894 ± 0.102 accuracy). These findings indicate that single metrics are insufficient to certify signal quality; rather, concordance across complementary measures—temporal correlations, spectral divergence (KL/EMD), and FID—together with downstream gains provides a robust assessment, and the Baseline’s balanced superiority across these axes supports its practical effectiveness.

5.3. Challenges in the Interpretability of Generative Models

Despite these improvements, generative modeling for EEG remains hindered by interpretability challenges. The Baseline model outperforms conventional GAN variants in both temporal and spectral domains, yet the internal mechanisms that link latent representations to neurophysiological features remain opaque. Like most deep generative models, the framework functions as a “black box”, limiting its clinical trustworthiness and controllability. Future research should, therefore, emphasize interpretability-oriented strategies, such as incorporating guided latent variable modeling, integrating neurophysiological priors, or applying post hoc tools such as SHAP and LIME to map latent dimensions to cognitive or emotional processes. Enhancing interpretability will not only improve mechanistic understanding but also facilitate reliable applications in clinical diagnosis, neuroregulation, and brain–computer interfaces.

5.4. Loss Design Under Non-Stationary EEG

EEG dynamics are non-stationary, so time-domain objectives that admit free lags or warping, such as cross-correlation with non-zero lags, may reward temporal shifts rather than reproducing morphology, which is misaligned with the goal of phase-aware waveform fidelity. This research, therefore, pairs a zero-lag, mean/variance-normalized Pearson term, which enforces shape and phase alignment, with a Welch-PSD term, which constrains the spectral envelope. In the ablation experiments, removing the Pearson term produced the largest drop in time-domain fidelity, while the spectral loss remained complementary. More expressive objectives, such as coherence or phase-locking value measures and time–frequency losses via STFT or CWT, could be explored, but they introduce additional hyperparameters and computation; this research leaves these extensions for future work.
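A minimal PyTorch sketch of the two terms is given below, under the assumption that the Welch-PSD loss can be approximated by averaged STFT periodograms; the window settings and weights are placeholders, not the values used in this study.

```python
# Minimal sketch (assumed form, not the authors' exact implementation) of the two objectives:
# a zero-lag, mean/variance-normalized Pearson term and a Welch-style PSD term.
import torch
import torch.nn.functional as F

def pearson_loss(x, y, eps=1e-8):
    """1 - zero-lag Pearson correlation, averaged over batch and channels.
    x, y: tensors of shape (batch, channels, time)."""
    x = x - x.mean(dim=-1, keepdim=True)
    y = y - y.mean(dim=-1, keepdim=True)
    num = (x * y).sum(dim=-1)
    den = torch.sqrt((x ** 2).sum(dim=-1) * (y ** 2).sum(dim=-1)) + eps
    return (1.0 - num / den).mean()

def spectral_loss(x, y, n_fft=256, hop=128):
    """MSE between averaged-periodogram (Welch-like) power spectra; x, y share one shape."""
    b, c, t = x.shape
    win = torch.hann_window(n_fft, device=x.device)

    def psd(sig):
        spec = torch.stft(sig.reshape(b * c, t), n_fft=n_fft, hop_length=hop,
                          window=win, return_complex=True)
        return (spec.abs() ** 2).mean(dim=-1)   # average periodograms over segments

    return F.mse_loss(psd(x), psd(y))

# Example combined objective (weights are placeholders):
# loss = recon + kl + adv + 1.0 * pearson_loss(x_fake, x_real) + 0.5 * spectral_loss(x_fake, x_real)
```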

5.5. Practical Fidelity for Classification

While several similarity metrics exhibit moderate averages (e.g., channel-wise Pearson/Spearman and band-agnostic PSD errors), these indicators are conservative for short, non-stationary EEG windows. What matters for application is whether the synthesized signals preserve label-relevant structure in aggregate across channels and frequencies. This is corroborated by the frozen-classifier congruence and augmentation analyses (Section 4.4.10; Table 1, Table 2 and Table 3): generated segments scored by a CNN trained on real data approach the real-data baseline, and adding generated data to the real training set consistently improves mean accuracy/precision/recall/F1 (validation/test kept real-only), indicating non-redundant, label-consistent variability rather than noise.

6. Limitations

The evaluation is restricted to the SEED family (SEED, SEED-FRA, and SEED-GER) and does not include heterogeneous datasets such as DEAP or DREAMER, which differ in montages, sampling rates, referencing schemes, trial structures, and continuous labels. This research also did not assess cross-device robustness or cross-device generation; a meaningful extension will require channel harmonization, label-space alignment, and device-aware adaptation via normalization, re-referencing, and domain alignment. Architecture ablations are removal-style and should be read as diagnostic given inter-module dependencies, whereas loss ablations are interface-preserving. A fully interface-preserving replacement protocol, with identity/linear adapters and parameter-matched substitutes, is left for future work to better isolate causal effects. Moreover, this research does not quantify conditional controllability; in particular, fixed-latent label-swap experiments and the associated label-consistency metrics are left for future work.

7. Conclusions

This study proposed a Trans-cVAE-GAN framework for EEG signal generation, which integrates label conditioning, adversarial training, and variational inference to address the challenges of non-stationarity and spatiotemporal coupling in EEG data. Experimental results demonstrated that the model achieves superior performance over conventional GAN variants, with ablation studies confirming the necessity of each key component. The generated signals not only preserved temporal and spectral consistency but also enhanced downstream emotion classification, highlighting the model’s potential for EEG data augmentation and brain–computer interface applications.

Author Contributions

Conceptualization, Y.Y.; Data curation, Y.L.; Methodology, Y.Y. and Y.L.; Supervision, Y.L.; Validation, Y.Y.; Writing—original draft, Y.Y.; Writing—review and editing, X.W., X.H., H.S., R.D. and Y.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

No new data were created or analyzed in this study. Data sharing is not applicable to this article. The data presented in this study are available from Shanghai Jiaotong University at DOIs 10.1109/TAMD.2015.2431497, 10.1109/NER.2013.6695876, and 10.1088/1741-2552/ac5c8d. These data were derived from the following resources available in the public domain: https://bcmi.sjtu.edu.cn/home/seed/seed.html, https://bcmi.sjtu.edu.cn/home/seed/seed-FRA.html, https://bcmi.sjtu.edu.cn/home/seed/seed-GER.html (all accessed on 23 September 2025).

Conflicts of Interest

The authors declare no potential conflicts of interest with respect to the research, authorship, and publication of this article.

References

1. Cai, M.; Chen, J.; Hua, C.; Wen, G.; Fu, R. EEG emotion recognition using EEG-SWTNS neural network through EEG spectral image. Inf. Sci. 2024, 680, 121198.
2. Chen, B.; Chen, C.P.; Zhang, T. GDDN: Graph domain disentanglement network for generalizable EEG emotion recognition. IEEE Trans. Affect. Comput. 2024, 15, 1739–1753.
3. Fan, C.; Wang, J.; Huang, W.; Yang, X.; Pei, G.; Li, T.; Lv, Z. Light-weight residual convolution-based capsule network for EEG emotion recognition. Adv. Eng. Inform. 2024, 61, 102522.
4. Krothapalli, M.; Buddendorff, L.; Yadav, H.; Schilaty, N.D.; Jain, S. From gut microbiota to brain waves: The potential of the microbiome and EEG as biomarkers for cognitive impairment. Int. J. Mol. Sci. 2024, 25, 6678.
5. Xue, C.; Li, A.; Wu, R.; Chai, J.; Qiang, Y.; Zhao, J.; Yang, Q. VRNPT: A neuropsychological test tool for diagnosing mild cognitive impairment using virtual reality and EEG signals. Int. J. Hum.-Comput. Interact. 2024, 40, 6268–6286.
6. You, Z.; Guo, Y.; Zhang, X.; Zhao, Y. Virtual Electroencephalogram Acquisition: A Review on Electroencephalogram Generative Methods. Sensors 2025, 25, 3178.
7. Pan, Y.; Li, N.; Zhang, Y.; Xu, P.; Yao, D. Short-length SSVEP data extension by a novel generative adversarial networks based framework. Cogn. Neurodynamics 2024, 18, 2925–2945.
8. Panwar, S.; Rad, P.; Quarles, J.; Huang, Y. Generating EEG signals of an RSVP experiment by a class conditioned Wasserstein generative adversarial network. In Proceedings of the 2019 IEEE International Conference on Systems, Man and Cybernetics (SMC), Bari, Italy, 6–9 October 2019; pp. 1304–1310.
9. Yin, X.; Han, Y.; Sun, H.; Xu, Z.; Yu, H.; Duan, X. Multi-attention generative adversarial network for multivariate time series prediction. IEEE Access 2021, 9, 57351–57363.
10. Krishna, G.; Tran, C.; Carnahan, M.; Tewfik, A. Constrained variational autoencoder for improving EEG based speech recognition systems. arXiv 2020, arXiv:2006.02902.
11. Luo, Y.; Zhu, L.-Z.; Wan, Z.-Y.; Lu, B.-L. Data augmentation for enhancing EEG-based emotion recognition with deep generative models. J. Neural Eng. 2020, 17, 056021.
12. Dong, Y.; Tang, X.; Tan, F.; Li, Q.; Wang, Y.; Zhang, H.; Xie, J.; Liang, W.; Li, G.; Fang, P. An approach for EEG data augmentation based on deep convolutional generative adversarial network. In Proceedings of the 2022 IEEE International Conference on Cyborg and Bionic Systems (CBS), Wuhan, China, 24–26 March 2023; pp. 347–351.
13. Zhang, Q.; Liu, Y. Improving brain computer interface performance by data augmentation with conditional deep convolutional generative adversarial networks. arXiv 2018, arXiv:1806.07108.
14. Liu, Q.; Hao, J.; Guo, Y. EEG data augmentation for emotion recognition with a task-driven GAN. Algorithms 2023, 16, 118.
15. Zhang, A.; Su, L.; Zhang, Y.; Fu, Y.; Wu, L.; Liang, S. EEG data augmentation for emotion recognition with a multiple generator conditional Wasserstein GAN. Complex Intell. Syst. 2022, 8, 3059–3071.
16. Xu, M.; Chen, Y.; Wang, Y.; Wang, D.; Liu, Z.; Zhang, L. BWGAN-GP: An EEG data generation method for class imbalance problem in RSVP tasks. IEEE Trans. Neural Syst. Rehabil. Eng. 2022, 30, 251–263.
17. Aznan, N.K.N.; Atapour-Abarghouei, A.; Bonner, S.; Connolly, J.D.; Al Moubayed, N.; Breckon, T.P. Simulating brain signals: Creating synthetic EEG data via neural-based generative models for improved SSVEP classification. In Proceedings of the 2019 International Joint Conference on Neural Networks (IJCNN), Budapest, Hungary, 14–19 July 2019; pp. 1–8.
18. Cisotto, G.; Zancanaro, A.; Zoppis, I.F.; Manzoni, S.L. hvEEGNet: A novel deep learning model for high-fidelity EEG reconstruction. Front. Neuroinform. 2024, 18, 1459970.
19. Bethge, D.; Hallgarten, P.; Grosse-Puppendahl, T.; Kari, M.; Chuang, L.L.; Özdenizci, O.; Schmidt, A. EEG2Vec: Learning affective EEG representations via variational autoencoders. In Proceedings of the 2022 IEEE International Conference on Systems, Man, and Cybernetics (SMC), Prague, Czech Republic, 9–12 October 2022; pp. 3150–3157.
20. Li, H.; Yu, S.; Principe, J. Causal recurrent variational autoencoder for medical time series generation. In Proceedings of the AAAI Conference on Artificial Intelligence, Washington, DC, USA, 7–14 February 2023; pp. 8562–8570.
21. Tian, C.; Ma, Y.; Cammon, J.; Fang, F.; Zhang, Y.; Meng, M. Dual-encoder VAE-GAN with spatiotemporal features for emotional EEG data augmentation. IEEE Trans. Neural Syst. Rehabil. Eng. 2023, 31, 2018–2027.
22. Goodfellow, I.J.; Pouget-Abadie, J.; Mirza, M.; Xu, B.; Warde-Farley, D.; Ozair, S.; Courville, A.; Bengio, Y. Generative adversarial nets. In Proceedings of the 28th International Conference on Neural Information Processing Systems, Montreal, Canada, 8–13 December 2014.
23. Abdelfattah, S.M.; Abdelrahman, G.M.; Wang, M. Augmenting the size of EEG datasets using generative adversarial networks. In Proceedings of the 2018 International Joint Conference on Neural Networks (IJCNN), Rio de Janeiro, Brazil, 8–13 July 2018; pp. 1–6.
24. Fahimi, F.; Dosen, S.; Ang, K.K.; Mrachacz-Kersting, N.; Guan, C. Generative adversarial networks-based data augmentation for brain–computer interface. IEEE Trans. Neural Netw. Learn. Syst. 2020, 32, 4039–4051.
25. Arjovsky, M.; Chintala, S.; Bottou, L. Wasserstein generative adversarial networks. In Proceedings of the International Conference on Machine Learning, Sydney, Australia, 6–11 August 2017; pp. 214–223.
26. Panwar, S.; Rad, P.; Jung, T.-P.; Huang, Y. Modeling EEG data distribution with a Wasserstein generative adversarial network to predict RSVP events. IEEE Trans. Neural Syst. Rehabil. Eng. 2020, 28, 1720–1730.
27. Smith, K.E.; Smith, A.O. Conditional GAN for timeseries generation. arXiv 2020, arXiv:2006.16477.
28. Duan, R.-N.; Zhu, J.-Y.; Lu, B.-L. Differential entropy feature for EEG-based emotion classification. In Proceedings of the 2013 6th International IEEE/EMBS Conference on Neural Engineering (NER), San Diego, CA, USA, 6–8 November 2013; pp. 81–84.
29. Zheng, W.-L.; Lu, B.-L. Investigating critical frequency bands and channels for EEG-based emotion recognition with deep neural networks. IEEE Trans. Auton. Ment. Dev. 2015, 7, 162–175.
30. Liu, W.; Zheng, W.-L.; Li, Z.; Wu, S.-Y.; Gan, L.; Lu, B.-L. Identifying similarities and differences in emotion recognition with EEG and eye movements among Chinese, German, and French people. J. Neural Eng. 2022, 19, 026012.
31. Schaefer, A.; Nils, F.; Sanchez, X.; Philippot, P. Assessing the effectiveness of a large database of emotion-eliciting films: A new tool for emotion researchers. Cogn. Emot. 2010, 24, 1153–1172.
32. Ramponi, G.; Protopapas, P.; Brambilla, M.; Janssen, R. T-CGAN: Conditional generative adversarial network for data augmentation in noisy time series with irregular sampling. arXiv 2018, arXiv:1811.08295.
Figure 1. Trans-cVAE-GAN overall architecture.
Figure 2. Transformer-based encoder architecture.
Figure 3. Transformer-based decoder architecture.
Figure 4. Discriminator network architecture.
Figure 5. Time-domain signal comparison for channels 0–16 with the SEED dataset.
Figure 6. Time-domain signal comparison for channels 0–16 with the SEED-FRA dataset.
Figure 7. Time-domain signal comparison for channels 0–16 with the SEED-GER dataset.
Figure 8. Channel-wise Pearson correlation distribution with the SEED dataset.
Figure 9. Channel-wise Pearson correlation distribution with the SEED-FRA dataset.
Figure 10. Channel-wise Pearson correlation distribution with the SEED-GER dataset.
Figure 11. Channel-wise Spearman correlation comparison with the SEED dataset.
Figure 12. Channel-wise Spearman correlation comparison with the SEED-FRA dataset.
Figure 13. Channel-wise Spearman correlation comparison with the SEED-GER dataset.
Figure 14. Channel-wise KL divergence comparison with the SEED dataset.
Figure 15. Channel-wise KL divergence comparison with the SEED-FRA dataset.
Figure 16. Channel-wise KL divergence comparison with the SEED-GER dataset.
Figure 17. Frequency power spectrum comparison for channels 1–16 with the SEED dataset.
Figure 18. Frequency power spectrum comparison for channels 1–16 with the SEED-FRA dataset.
Figure 19. Frequency power spectrum comparison for channels 1–16 with the SEED-GER dataset.
Figure 20. t-SNE of frequency-domain feature distributions with the SEED dataset.
Figure 21. t-SNE of frequency-domain feature distributions with the SEED-FRA dataset.
Figure 22. t-SNE of frequency-domain feature distributions with the SEED-GER dataset.
Figure 23. Channel-wise FID with the SEED dataset.
Figure 24. Channel-wise FID with the SEED-FRA dataset.
Figure 25. Channel-wise FID with the SEED-GER dataset.
Figure 26. Channel-wise PSD MSE (0–45 Hz) with the SEED dataset.
Figure 27. Channel-wise PSD MSE (0–45 Hz) with the SEED-FRA dataset.
Figure 28. Channel-wise PSD MSE (0–45 Hz) with the SEED-GER dataset.
Figure 29. Channel-wise EMD of Welch PSD with the SEED dataset.
Figure 30. Channel-wise EMD of Welch PSD with the SEED-FRA dataset.
Figure 31. Channel-wise EMD of Welch PSD with the SEED-GER dataset.
Table 1. Classification Consistency Analysis with SEED Dataset.

Data Regime | Train Set | Validation/Test | Accuracy (Mean ± SD) | Precision (Mean ± SD) | Recall (Mean ± SD) | F1-Score (Mean ± SD) | AUC (Mean ± SD)
Real EEG | Real | Real | 0.869 ± 0.102 | 0.863 ± 0.114 | 0.859 ± 0.132 | 0.860 ± 0.126 | 0.857 ± 0.107
Generated EEG | Generated | Real | 0.823 ± 0.186 | 0.831 ± 0.194 | 0.834 ± 0.176 | 0.836 ± 0.182 | 0.832 ± 0.190
Real + Generated EEG | Real + Generated | Real | 0.918 ± 0.094 | 0.921 ± 0.087 | 0.918 ± 0.096 | 0.903 ± 0.077 | 0.917 ± 0.084
Table 2. Classification Consistency Analysis with SEED-FRA Dataset.

Data Regime | Train Set | Validation/Test | Accuracy (Mean ± SD) | Precision (Mean ± SD) | Recall (Mean ± SD) | F1-Score (Mean ± SD) | AUC (Mean ± SD)
Real EEG | Real | Real | 0.835 ± 0.112 | 0.831 ± 0.121 | 0.839 ± 0.117 | 0.836 ± 0.124 | 0.826 ± 0.124
Generated EEG | Generated | Real | 0.793 ± 0.176 | 0.784 ± 0.176 | 0.795 ± 0.169 | 0.782 ± 0.180 | 0.787 ± 0.173
Real + Generated EEG | Real + Generated | Real | 0.884 ± 0.089 | 0.881 ± 0.073 | 0.878 ± 0.082 | 0.876 ± 0.074 | 0.881 ± 0.081
Table 3. Classification Consistency Analysis with SEED-GER Dataset.

Data Regime | Train Set | Validation/Test | Accuracy (Mean ± SD) | Precision (Mean ± SD) | Recall (Mean ± SD) | F1-Score (Mean ± SD) | AUC (Mean ± SD)
Real EEG | Real | Real | 0.849 ± 0.113 | 0.854 ± 0.115 | 0.846 ± 0.121 | 0.853 ± 0.117 | 0.853 ± 0.119
Generated EEG | Generated | Real | 0.809 ± 0.204 | 0.812 ± 0.210 | 0.808 ± 0.198 | 0.812 ± 0.202 | 0.814 ± 0.213
Real + Generated EEG | Real + Generated | Real | 0.894 ± 0.102 | 0.892 ± 0.107 | 0.901 ± 0.113 | 0.903 ± 0.097 | 0.896 ± 0.104
Table 4. Ablation Study Results with SEED Dataset.

Model | Pearson (Mean ± SD) | Spearman (Mean ± SD) | KL Divergence (Mean ± SD) | FID (Mean ± SD) | EMD (Mean ± SD)
Baseline | 0.838 ± 0.075 | 0.819 ± 0.068 | 0.389 ± 0.145 | 13.962 ± 4.293 | 0.198 ± 0.089
Without cVAE | 0.512 ± 0.104 | 0.516 ± 0.122 | 1.476 ± 0.254 | 18.894 ± 8.058 | 0.454 ± 0.243
Without GAN | 0.521 ± 0.131 | 0.504 ± 0.132 | 1.545 ± 0.247 | 20.156 ± 8.156 | 0.584 ± 0.215
Without Label Conditioning | 0.634 ± 0.086 | 0.621 ± 0.107 | 0.843 ± 0.176 | 16.453 ± 7.534 | 0.345 ± 0.156
Without Positional Embedding | 0.612 ± 0.073 | 0.635 ± 0.096 | 0.957 ± 0.169 | 15.346 ± 6.453 | 0.376 ± 0.175
Without Pearson Loss | 0.454 ± 0.156 | 0.476 ± 0.185 | 1.735 ± 0.243 | 21.445 ± 8.475 | 0.635 ± 0.234
Without Transformer Encoder | 0.505 ± 0.116 | 0.543 ± 0.121 | 1.246 ± 0.028 | 19.473 ± 8.456 | 0.548 ± 0.164
Transformer → CNN | 0.673 ± 0.145 | 0.667 ± 0.175 | 1.333 ± 0.168 | 18.437 ± 8.045 | 0.534 ± 0.237
Table 5. Ablation Study Results with SEED-FRA Dataset.

Model | Pearson (Mean ± SD) | Spearman (Mean ± SD) | KL Divergence (Mean ± SD) | FID (Mean ± SD) | EMD (Mean ± SD)
Baseline | 0.739 ± 0.120 | 0.721 ± 0.117 | 0.411 ± 0.195 | 5.275 ± 2.906 | 0.320 ± 0.100
Without cVAE | 0.438 ± 0.183 | 0.413 ± 0.197 | 1.537 ± 0.354 | 9.453 ± 5.234 | 0.731 ± 0.315
Without GAN | 0.427 ± 0.197 | 0.454 ± 0.204 | 1.678 ± 0.423 | 10.049 ± 4.456 | 0.794 ± 0.434
Without Label Conditioning | 0.579 ± 0.164 | 0.543 ± 0.172 | 1.435 ± 0.275 | 8.134 ± 4.435 | 0.683 ± 0.286
Without Positional Embedding | 0.516 ± 0.157 | 0.523 ± 0.184 | 1.535 ± 0.434 | 7.464 ± 3.537 | 0.647 ± 0.307
Without Pearson Loss | 0.554 ± 0.146 | 0.549 ± 0.135 | 1.835 ± 0.354 | 7.587 ± 4.241 | 0.681 ± 0.314
Without Transformer Encoder | 0.419 ± 0.201 | 0.425 ± 0.201 | 2.154 ± 0.546 | 11.946 ± 6.028 | 0.754 ± 0.412
Transformer → CNN | 0.584 ± 0.175 | 0.548 ± 0.186 | 1.945 ± 0.712 | 7.944 ± 5.453 | 0.657 ± 0.284
Table 6. Ablation Study Results with SEED-GER Dataset.

Model | Pearson (Mean ± SD) | Spearman (Mean ± SD) | KL Divergence (Mean ± SD) | FID (Mean ± SD) | EMD (Mean ± SD)
Baseline | 0.844 ± 0.068 | 0.831 ± 0.076 | 0.368 ± 0.184 | 15.308 ± 7.523 | 0.227 ± 0.084
Without cVAE | 0.575 ± 0.172 | 0.548 ± 0.134 | 0.745 ± 0.542 | 18.453 ± 6.457 | 0.646 ± 0.143
Without GAN | 0.546 ± 0.195 | 0.512 ± 0.143 | 0.764 ± 0.459 | 19.378 ± 8.435 | 0.682 ± 0.187
Without Label Conditioning | 0.437 ± 0.176 | 0.487 ± 0.154 | 0.845 ± 0.453 | 19.547 ± 7.945 | 0.721 ± 0.195
Without Positional Embedding | 0.543 ± 0.201 | 0.537 ± 0.194 | 0.794 ± 0.494 | 18.647 ± 7.547 | 0.675 ± 0.157
Without Pearson Loss | 0.538 ± 0.168 | 0.546 ± 0.168 | 0.735 ± 0.427 | 17.287 ± 7.684 | 0.587 ± 0.135
Without Transformer Encoder | 0.517 ± 0.157 | 0.508 ± 0.121 | 0.935 ± 0.543 | 17.548 ± 8.054 | 0.594 ± 0.154
Transformer → CNN | 0.681 ± 0.167 | 0.654 ± 0.135 | 0.673 ± 0.354 | 18.387 ± 8.154 | 0.543 ± 0.123
Table 7. Comparative Experiment Statistical Results with SEED Dataset.

Model | Pearson (Mean ± SD) | Spearman (Mean ± SD) | KL Divergence (Mean ± SD) | FID (Mean ± SD) | EMD (Mean ± SD)
Baseline | 0.838 ± 0.075 | 0.819 ± 0.068 | 0.389 ± 0.145 | 13.962 ± 4.293 | 0.198 ± 0.089
DCGAN | 0.543 ± 0.096 | 0.576 ± 0.105 | 0.635 ± 0.234 | 15.436 ± 3.957 | 0.323 ± 0.153
WGAN | 0.567 ± 0.126 | 0.586 ± 0.135 | 0.536 ± 0.142 | 14.954 ± 4.982 | 0.424 ± 0.183
WGAN-GP | 0.629 ± 0.084 | 0.694 ± 0.121 | 0.459 ± 0.139 | 14.532 ± 4.531 | 0.257 ± 0.136
T-CGAN | 0.624 ± 0.093 | 0.657 ± 0.119 | 0.546 ± 0.176 | 15.168 ± 4.587 | 0.275 ± 0.149
Table 8. Comparative Experiment Classification Results with SEED Dataset.

Model | Train Set | Validation/Test | Accuracy (Mean ± SD) | Precision (Mean ± SD) | Recall (Mean ± SD) | F1-Score (Mean ± SD) | AUC (Mean ± SD)
Baseline | Real | Real | 0.869 ± 0.102 | 0.863 ± 0.114 | 0.859 ± 0.132 | 0.860 ± 0.126 | 0.857 ± 0.107
Baseline | Generated | Real | 0.823 ± 0.186 | 0.831 ± 0.194 | 0.834 ± 0.176 | 0.836 ± 0.182 | 0.832 ± 0.190
Baseline | Real + Generated | Real | 0.918 ± 0.094 | 0.921 ± 0.087 | 0.918 ± 0.096 | 0.903 ± 0.077 | 0.917 ± 0.084
DCGAN | Real | Real | 0.869 ± 0.102 | 0.863 ± 0.114 | 0.859 ± 0.132 | 0.860 ± 0.126 | 0.857 ± 0.107
DCGAN | Generated | Real | 0.803 ± 0.167 | 0.809 ± 0.186 | 0.806 ± 0.172 | 0.798 ± 0.163 | 0.796 ± 0.171
DCGAN | Real + Generated | Real | 0.881 ± 0.096 | 0.878 ± 0.102 | 0.876 ± 0.109 | 0.874 ± 0.104 | 0.877 ± 0.098
WGAN | Real | Real | 0.869 ± 0.102 | 0.863 ± 0.114 | 0.859 ± 0.132 | 0.860 ± 0.126 | 0.857 ± 0.107
WGAN | Generated | Real | 0.817 ± 0.169 | 0.821 ± 0.173 | 0.819 ± 0.176 | 0.820 ± 0.167 | 0.823 ± 0.162
WGAN | Real + Generated | Real | 0.891 ± 0.104 | 0.889 ± 0.106 | 0.893 ± 0.097 | 0.892 ± 0.101 | 0.891 ± 0.106
WGAN-GP | Real | Real | 0.869 ± 0.102 | 0.863 ± 0.114 | 0.859 ± 0.132 | 0.860 ± 0.126 | 0.857 ± 0.107
WGAN-GP | Generated | Real | 0.821 ± 0.186 | 0.818 ± 0.176 | 0.823 ± 0.168 | 0.819 ± 0.173 | 0.820 ± 0.157
WGAN-GP | Real + Generated | Real | 0.904 ± 0.099 | 0.901 ± 0.103 | 0.897 ± 0.101 | 0.903 ± 0.096 | 0.899 ± 0.093
T-CGAN | Real | Real | 0.869 ± 0.102 | 0.863 ± 0.114 | 0.859 ± 0.132 | 0.860 ± 0.126 | 0.857 ± 0.107
T-CGAN | Generated | Real | 0.812 ± 0.168 | 0.807 ± 0.159 | 0.809 ± 0.172 | 0.813 ± 0.163 | 0.811 ± 0.159
T-CGAN | Real + Generated | Real | 0.886 ± 0.102 | 0.883 ± 0.106 | 0.878 ± 0.096 | 0.886 ± 0.093 | 0.879 ± 0.104
Table 9. Comparative Experiment Statistical Results with SEED-FRA Dataset.

Model | Pearson (Mean ± SD) | Spearman (Mean ± SD) | KL Divergence (Mean ± SD) | FID (Mean ± SD) | EMD (Mean ± SD)
Baseline | 0.739 ± 0.120 | 0.721 ± 0.117 | 0.411 ± 0.195 | 5.275 ± 2.906 | 0.320 ± 0.100
DCGAN | 0.496 ± 0.103 | 0.481 ± 0.138 | 0.589 ± 0.261 | 8.354 ± 3.984 | 0.631 ± 0.203
WGAN | 0.547 ± 0.135 | 0.537 ± 0.186 | 0.573 ± 0.234 | 7.545 ± 3.533 | 0.538 ± 0.251
WGAN-GP | 0.603 ± 0.129 | 0.684 ± 0.139 | 0.510 ± 0.211 | 7.371 ± 2.574 | 0.357 ± 0.086
T-CGAN | 0.594 ± 0.096 | 0.583 ± 0.126 | 0.476 ± 0.186 | 8.163 ± 3.896 | 0.376 ± 0.168
Table 10. Comparative Experiment Classification Results with SEED-FRA Dataset.

Model | Train Set | Validation/Test | Accuracy (Mean ± SD) | Precision (Mean ± SD) | Recall (Mean ± SD) | F1-Score (Mean ± SD) | AUC (Mean ± SD)
Baseline | Real | Real | 0.835 ± 0.112 | 0.831 ± 0.121 | 0.839 ± 0.117 | 0.836 ± 0.124 | 0.826 ± 0.124
Baseline | Generated | Real | 0.793 ± 0.176 | 0.784 ± 0.176 | 0.795 ± 0.169 | 0.782 ± 0.180 | 0.787 ± 0.173
Baseline | Real + Generated | Real | 0.884 ± 0.089 | 0.881 ± 0.073 | 0.878 ± 0.082 | 0.876 ± 0.074 | 0.881 ± 0.081
DCGAN | Real | Real | 0.835 ± 0.112 | 0.831 ± 0.121 | 0.839 ± 0.117 | 0.836 ± 0.124 | 0.826 ± 0.124
DCGAN | Generated | Real | 0.773 ± 0.201 | 0.770 ± 0.197 | 0.768 ± 0.194 | 0.776 ± 0.193 | 0.774 ± 0.213
DCGAN | Real + Generated | Real | 0.851 ± 0.124 | 0.853 ± 0.138 | 0.849 ± 0.132 | 0.852 ± 0.135 | 0.847 ± 0.129
WGAN | Real | Real | 0.835 ± 0.112 | 0.831 ± 0.121 | 0.839 ± 0.117 | 0.836 ± 0.124 | 0.826 ± 0.124
WGAN | Generated | Real | 0.776 ± 0.186 | 0.779 ± 0.173 | 0.784 ± 0.174 | 0.777 ± 0.182 | 0.782 ± 0.179
WGAN | Real + Generated | Real | 0.861 ± 0.116 | 0.863 ± 0.106 | 0.859 ± 0.124 | 0.868 ± 0.097 | 0.864 ± 0.107
WGAN-GP | Real | Real | 0.835 ± 0.112 | 0.831 ± 0.121 | 0.839 ± 0.117 | 0.836 ± 0.124 | 0.826 ± 0.124
WGAN-GP | Generated | Real | 0.781 ± 0.163 | 0.776 ± 0.159 | 0.778 ± 0.168 | 0.783 ± 0.135 | 0.778 ± 0.143
WGAN-GP | Real + Generated | Real | 0.867 ± 0.102 | 0.871 ± 0.093 | 0.873 ± 0.086 | 0.869 ± 0.106 | 0.870 ± 0.112
T-CGAN | Real | Real | 0.835 ± 0.112 | 0.831 ± 0.121 | 0.839 ± 0.117 | 0.836 ± 0.124 | 0.826 ± 0.124
T-CGAN | Generated | Real | 0.773 ± 0.172 | 0.771 ± 0.176 | 0.768 ± 0.196 | 0.762 ± 0.189 | 0.776 ± 0.168
T-CGAN | Real + Generated | Real | 0.849 ± 0.084 | 0.847 ± 0.086 | 0.851 ± 0.076 | 0.849 ± 0.081 | 0.850 ± 0.093
Table 11. Comparative Experiment Statistical Results with SEED-GER Dataset.

Model | Pearson (Mean ± SD) | Spearman (Mean ± SD) | KL Divergence (Mean ± SD) | FID (Mean ± SD) | EMD (Mean ± SD)
Baseline | 0.844 ± 0.068 | 0.831 ± 0.076 | 0.368 ± 0.184 | 15.308 ± 7.523 | 0.227 ± 0.084
DCGAN | 0.568 ± 0.168 | 0.594 ± 0.172 | 0.531 ± 0.306 | 18.354 ± 8.461 | 0.513 ± 0.197
WGAN | 0.506 ± 0.206 | 0.524 ± 0.234 | 0.608 ± 0.259 | 18.891 ± 8.648 | 0.672 ± 0.216
WGAN-GP | 0.623 ± 0.106 | 0.648 ± 0.269 | 0.514 ± 0.183 | 17.541 ± 6.984 | 0.435 ± 0.105
T-CGAN | 0.618 ± 0.083 | 0.615 ± 0.091 | 0.481 ± 0.241 | 17.764 ± 7.948 | 0.437 ± 0.117
Table 12. Comparative Experiment Classification Results with SEED-GER Dataset.

Model | Train Set | Validation/Test | Accuracy (Mean ± SD) | Precision (Mean ± SD) | Recall (Mean ± SD) | F1-Score (Mean ± SD) | AUC (Mean ± SD)
Baseline | Real | Real | 0.849 ± 0.113 | 0.854 ± 0.115 | 0.846 ± 0.121 | 0.853 ± 0.117 | 0.853 ± 0.119
Baseline | Generated | Real | 0.809 ± 0.204 | 0.812 ± 0.210 | 0.808 ± 0.198 | 0.812 ± 0.202 | 0.814 ± 0.213
Baseline | Real + Generated | Real | 0.894 ± 0.102 | 0.892 ± 0.107 | 0.901 ± 0.113 | 0.903 ± 0.097 | 0.896 ± 0.104
DCGAN | Real | Real | 0.849 ± 0.113 | 0.854 ± 0.115 | 0.846 ± 0.121 | 0.853 ± 0.117 | 0.853 ± 0.119
DCGAN | Generated | Real | 0.768 ± 0.189 | 0.772 ± 0.176 | 0.767 ± 0.184 | 0.756 ± 0.167 | 0.761 ± 0.172
DCGAN | Real + Generated | Real | 0.867 ± 0.119 | 0.863 ± 0.123 | 0.861 ± 0.117 | 0.867 ± 0.120 | 0.862 ± 0.131
WGAN | Real | Real | 0.849 ± 0.113 | 0.854 ± 0.115 | 0.846 ± 0.121 | 0.853 ± 0.117 | 0.853 ± 0.119
WGAN | Generated | Real | 0.773 ± 0.173 | 0.772 ± 0.154 | 0.768 ± 0.168 | 0.764 ± 0.171 | 0.774 ± 0.169
WGAN | Real + Generated | Real | 0.872 ± 0.083 | 0.869 ± 0.094 | 0.873 ± 0.102 | 0.864 ± 0.099 | 0.871 ± 0.082
WGAN-GP | Real | Real | 0.849 ± 0.113 | 0.854 ± 0.115 | 0.846 ± 0.121 | 0.853 ± 0.117 | 0.853 ± 0.119
WGAN-GP | Generated | Real | 0.791 ± 0.110 | 0.793 ± 0.103 | 0.801 ± 0.109 | 0.796 ± 0.104 | 0.794 ± 0.096
WGAN-GP | Real + Generated | Real | 0.883 ± 0.106 | 0.886 ± 0.092 | 0.879 ± 0.109 | 0.881 ± 0.086 | 0.884 ± 0.093
T-CGAN | Real | Real | 0.849 ± 0.113 | 0.854 ± 0.115 | 0.846 ± 0.121 | 0.853 ± 0.117 | 0.853 ± 0.119
T-CGAN | Generated | Real | 0.763 ± 0.194 | 0.772 ± 0.183 | 0.768 ± 0.173 | 0.765 ± 0.169 | 0.772 ± 0.164
T-CGAN | Real + Generated | Real | 0.863 ± 0.112 | 0.861 ± 0.094 | 0.869 ± 0.106 | 0.871 ± 0.109 | 0.873 ± 0.097
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
