Article

Decoding Temporally Encoded 3D Objects from Low-Cost Wearable Electroencephalography

1 Department of Psychiatry and Behavioral Sciences, Wexner Medical Center, The Ohio State University, Columbus, OH 43210, USA
2 Electrical and Computer Engineering, College of Engineering, The Ohio State University, Columbus, OH 43210, USA
* Author to whom correspondence should be addressed.
Technologies 2025, 13(11), 501; https://doi.org/10.3390/technologies13110501
Submission received: 19 August 2025 / Revised: 22 October 2025 / Accepted: 27 October 2025 / Published: 1 November 2025

Abstract

Decoding visual content from neural activity remains a central challenge at the intersection of engineering, neuroscience, and computational modeling. Prior work has primarily leveraged electroencephalography (EEG) with generative models to recover static images. In this study, we advance EEG-based decoding by introducing a temporal encoding framework that approximates dynamic object transformations across time. EEG recordings from healthy participants (n = 20) were used to model neural representations of objects presented in “initial” and “later” states. Individualized classifiers trained on time-specific EEG signatures achieved high discriminability, with Random Forest models reaching a mean accuracy of 92 ± 2% and a mean AUC-ROC of 0.87 ± 0.10 (mean ± standard deviation), driven largely by gamma- and beta-band activity at frontal electrodes. These results confirm and extend evidence of strong interindividual variability, showing that subject-specific models outperform intersubject approaches in decoding temporally varying object representations. Beyond classification, we demonstrate that pairwise temporal encodings can be integrated into a generative pipeline to produce approximated reconstructions of short video sequences and 3D object renderings. Our findings establish that temporal EEG features, captured using low-cost open-source hardware, are sufficient to support the decoding of visual content across discrete time points, providing a versatile platform for potential applications in neural decoding, immersive media, and human–computer interaction.

1. Introduction

1.1. Overview

Electroencephalography (EEG) provides a cost-effective and non-invasive measure of neural dynamics via scalp electrodes, commonly arranged using standardized configurations such as the International 10–20 system [1,2,3]. While EEG has been successfully applied to decode static two-dimensional images, prior work has been constrained by limited stimulus categories and by neglect of the temporal variability in object representations [4,5,6]. In this study, we advance EEG-based decoding by demonstrating the decoding and temporal differentiation of object representations across sequential presentations using a low-cost, open-source OpenBCI headset. Specifically, we implement a modified one-versus-rest classification framework capable of distinguishing not only between object categories but also between temporally ordered instances of the same object. By encoding static images as chronological sequences, our method supports the generation of approximated reconstructions from decoded EEG signals, enabling both cinematic rendering and subsequent three-dimensional object conversion. This work was primarily intended as a proof of concept for low-cost feasibility rather than as a mechanistic neuroscience experiment. It contributes (i) a demonstration that temporally distinct neural representations of identical objects can be reliably discriminated using low-cost EEG, (ii) an individualized pipeline for translating EEG signals into dynamic scenes and 3D object models, and (iii) an accessible approach that broadens the potential applicability of brain–computer interface (BCI) technologies to domains such as digital media, manufacturing, and biomedical engineering [4,5,6,7].

1.2. Background

1.2.1. Summarizing Prior Work

EEG-based BCIs have existed for decades, varying widely in scope and purpose [8,9,10,11]. Earlier work converted EEG to images by encoding 1D temporal data as 2D images [12,13]. Recent work demonstrated the capability to convert EEG into images in real time, but this required specialized systems and complex AI models [6]. Others have even reported object retrieval, leveraging synergies between verbal and visual processing [4,14,15,16]. AI has also been used to convert images into videos and 3D objects; however, the prerequisite was encoding EEG to images, often with expensive EEG systems and a computationally intensive software backend [4,17,18]. Development of the pipeline used in this study first required reviewing the prior literature on encoding EEG to image (Section 1.2.2), EEG to video (Section 1.2.3), and EEG to object (Section 1.2.4), as well as summarizing gaps in the literature (Section 1.2.5).

1.2.2. EEG to Image

The extraction and encoding of 3D models from remembered images using EEG has recently garnered attention [19,20,21]. Prior investigators have made notable strides in decoding neural signals to reconstruct visual stimuli, leveraging various methodologies and technologies. In the literature, image “reconstruction” from EEG refers variously to encoding EEG patterns that correspond to specific visual images, decoding images from specific stimuli, classifying temporally encoded object sequences, and attempting to reconstruct images directly without prior prompts [19,20,21]. While overlap exists, EEG patterns associated with visual recall generate consistently identifiable features [22,23,24,25]. If an image can be recalled and “encoded” with EEG, then it may be converted by projecting a “decoded” 2D image to a 3D object or image.
Prior work builds on research showing that EEG patterns correlate with texture perception [26,27,28]. Visual cortex engagement is fundamental to EEG-based neural decoding, as EEG signals capture the electrical activity associated with visual processing, and consistent engagement is a key parameter for reliable EEG generation and recording. Peer-reviewed research has demonstrated that EEG, particularly over the occipital and frontal lobes, can reliably reflect visual cortical dynamics and support the decoding of visual stimuli for applications such as BCIs [26,27,28]. However, the complexity of EEG signals necessitates sophisticated modeling techniques for effective image reconstruction and decoding [29]. Similarly, the integration of advanced neural networks such as Generative Adversarial Networks (GANs) has been explored [30]. These computationally intensive tools have expensive energy and hardware requirements, even before encoding EEG. Furthermore, 2D-to-3D conversion techniques, such as depth map estimation, have also been used in photogrammetry [31].
Additional research showcases the reconstruction of faces from EEG data, emphasizing EEG’s potential for gaining insights into the processing of dynamic visual stimuli, thus indicating a progression towards 3D interpretations [32]. This highlights the advancements towards utilizing EEG not just for static images, but also for complex scenarios requiring 3D representations. Ongoing developments in models that leverage variational inference for image generation from EEG data indicate a burgeoning interest in employing diffusion models alongside conventional neural networks, as their high dimensionality may lead to credible 3D reconstructions from brain activity [5]. On the other hand, other work emphasizes the integration of head tissue conductivity estimations with EEG source localization, underlining the importance of physiological modeling in improving the accuracy of reconstructed visual outputs, including dynamic ones [1].

1.2.3. EEG to Video

The reconstruction of movies from brain activity measured through EEG is an area of growing research interest, driven by recent advances in signal processing and machine learning. EEG provides superior temporal resolution relative to other imaging modalities such as functional magnetic resonance imaging. Previous findings indicate that EEG can effectively decode visual stimuli, making it a promising candidate for movie reconstruction, although challenges remain, including noise and signal alignment [27,32].
A well-defined area of visual memory is sequence memory; here, one can record EEG during the presentation of consecutive images or other stimuli. The encoding period can be as short as 200–400 ms, and distinctive EEGs are generated during image recall [22,25,33]. Notably, separate brain regions recalling sequential images activate in the order they were first observed. The EEG corresponding to each recalled phase in the sequence has been consistent enough to characterize [24]. This “slideshow” may be applicable to dynamic visual stimuli.
Recent studies have highlighted the potential for extending existing EEG-based image reconstruction and decoding techniques to dynamic visual stimuli. These studies emphasize that methodologies could be adapted to reconstruct short movie segments by leveraging EEG-based temporal dynamics [27,32]. This opens new avenues for utilizing EEG in creating more immersive and interactive BCIs, potentially applying real-time feedback loops during movie screenings [34,35].
Innovative algorithms employ generative models to synthesize images from EEG signals [5,30]. A variant on GANs introduced a method that emphasized visual stimuli saliency while leveraging deep learning frameworks to achieve higher fidelity in reconstructions. The prior work on the EEG-ConDiffusion framework exemplified a structured approach to image generation through a pipeline that addresses the inherent complexities of EEG data [5,30]. Moreover, specific works demonstrated that combining perceptual and imaginative tasks improved the reconstruction accuracy, suggesting that a multifaceted approach may be pivotal in refining reconstruction methodologies [13].
Prior insights used generative models to precisely align and encode EEG data with visual stimuli. Their research accentuated the correlation between the viewer’s cognitive activity, viewed stimuli, and reconstructed content [36,37]. The implications of using EEG for such purposes lie not only in advancing neuroscience but also in enhancing technologies such as virtual and augmented reality, where understanding and predicting user responses can lead to more engaging experiences. In conclusion, the current landscape of reconstructing movies from EEG signals demonstrates a blend of established techniques and cutting-edge innovations, including object reconstruction and expanded applications [36,37].

1.2.4. EEG to Object

In recent years, EEG-based 3D reconstruction systems have often employed high-density electrode montages to enhance the spatial resolution of the neural signals. Taberna et al. highlighted the importance of accurate electrode localization for reliable brain imaging, suggesting that their 3D scanning method can significantly improve EEG’s usability as a brain imaging tool by aiding the spatial contextualization of neural data [38]. Complementary to this, prior work proposed a photogrammetry-based approach that utilizes standard digital cameras to accurately localize EEG electrodes. This method not only surpasses traditional electromagnetic digitization techniques in efficiency but also facilitates better integration with magnetic resonance imaging for source analysis [39].
The reconstruction and decoding process has benefitted significantly from advances in machine learning techniques, specifically deep learning frameworks, including convolutional neural networks (CNNs) and GANs. One approach leveraged the strengths of autoencoders and generative models, thereby enhancing the detail and accuracy of the reconstructed and decoded objects, although this research did not directly correlate with EEG data [31]. Potential limitations in the use of CNNs and GANs are their energy and hardware costs [31]. Furthermore, advancements in CNN architectures demonstrate the ability to effectively integrate multi-dimensional EEG data, yielding superior performance in decoding tasks relevant to object recognition and manipulation [33,40].
Additionally, studies have explored the use of EEG data to directly reconstruct visual stimuli by analyzing the neural correlates of visual perception. While real-time visual cortex engagement has been used for prior protocols, integration of visual memory can evoke separate brain regions [22,25,33]. The reported findings support the premise that the temporal resolution of EEG might enable effective reconstruction of dynamic visual information, which could also be applicable in real-time object recognition and tracking scenarios [32]. As the technology progresses, we may witness a broader application of EEG-based 3D reconstructions in both clinical and cognitive neuroscience domains.

1.2.5. Summary of Prior Work

The reconstruction, decoding, and processing of remembered images using low-cost open-source EEG offers several advantages that significantly enhance research capabilities and accessibility for broader applications. One primary benefit is the economic viability of such systems, which democratizes access to brain imaging technologies. Historically, advanced EEG setups have been expensive and complex, limiting their use primarily to well-funded research institutions. Open-source EEG platforms, such as OpenBCI and Creamino, provide an affordable alternative compatible with existing software frameworks, thus enabling new research avenues and educational applications at lower costs [14,41].
Using low-cost EEG systems for reconstructing and decoding remembered images enhances the scalability of research studies focusing on cognitive processes such as memory recall. The integration of innovative machine learning models with EEG data can effectively decode the associated electrophysiological correlates of visual memory. For instance, research in computational algorithms and open-source software frameworks illustrates how researchers can tailor their analyses and improve data handling through accessible technological solutions, allowing for sophisticated and reproducible research designs [42,43,44].
Furthermore, systems that facilitate real-time data processing enhance interactive research applications and BCI, contributing to fields ranging from psychology to robotics [2]. Conducting experiments with high temporal resolution using portable EEG systems is particularly advantageous in studying dynamic cognitive phenomena, such as the spatiotemporal trajectories inherent in visual object recall [45]. Reported work on EEG revealed neural responses associated with memory reactivation during active recall or visual imagery tasks [46,47]. Early findings underscore the potential of utilizing such methodologies to further explore brain functions related to memory and cognition, pushing the boundaries of our understanding of the human brain [48].
In summary, low-cost open-source EEG systems serve as pivotal tools in decoding remembered images, providing significant benefits in terms of accessibility, cost-effectiveness, scalability, and collaborative research practices. Future studies utilizing these technologies are well-positioned to deepen our understanding of neural mechanisms linked to memory recall, potentially advancing scientific knowledge and applicable technology.

2. Materials and Methods

2.1. Summary

The deployment of an EEG-based image identification system necessitated careful consideration of stimulus selection, signal acquisition, feature extraction, and classification algorithms. During data collection, each participant was instructed to visually and aurally engage with the presented stimuli while EEG signals were recorded. Data acquisition was conducted using an OpenBCI EEG headset in conjunction with a Cyton board and the OpenBCI acquisition software v5.2.2 (OpenBCI Foundation, New York, NY, USA). Following preprocessing and artifact rejection, feature extraction focused on identifying the most robust EEG signatures associated with visual recall, informed by the existing literature. For classification, a model was selected based on its ability to achieve high accuracy while minimizing overfitting. The overall system design leveraged validated methodologies from prior research to maximize reliability and performance [27,45,49]. The participants are discussed in Section 2.2 and the stimulus presentation in Section 2.3. The image processing is detailed in Section 2.4, the design requirements in Section 2.5, and the feature extraction in Section 2.6. The data classification is discussed in Section 2.7. Finally, the performance metrics and statistical tests are detailed in Section 2.8.

2.2. Participants

A total of 20 adult participants (mean age = 24.3 ± 4.2 years; 4 females, 16 males) were recruited during summer 2025 via word-of-mouth and printed flyers. Eligibility criteria included the following: age between 18 and 40 years, no hearing impairments, and normal or corrected-to-normal vision. All participants provided written informed consent in accordance with IRB approval (STUDY20250042). The participants were seated at a standardized distance, at least 24 inches (61 cm), from the display screen. After obtaining consent, the overseeing experimenter fitted each participant with a standard dry EEG cap and attached the reference electrode. The experimental instructions were presented onscreen, and EEG data acquisition commenced immediately thereafter with the stimulus presentation.

2.3. Stimulus Presentation

All software was implemented in Python v.3.7 [50]. Prior EEG-based image encoding and decoding implementations used visual stimuli in generating training data [4,12,49]. To implement temporal encoding, images depicting the same object at distinct visually recognizable timepoints (e.g., a ship progressing along a river) were arranged in sequential pairs. The protocol ensured that the six images representing the “initial” state were always presented prior to the six corresponding “later” state images. The full chronological sequence of stimuli is illustrated in Figure 1, while the Cyton board command protocol is detailed in Figure 2.
For each image, data acquisition comprised a single demonstration phase followed by ten experimental trials. During the demonstration, a stimulus consisting of a white background with black characters was displayed for 4 s. Subsequently, a 1 s “wait” screen was presented, followed by a 2 s blank screen, during which participants were instructed to retain the image in their memory. Another 1 s “wait” screen was interleaved after the blank interval. This fixed sequence was repeated for a total of ten trials per image. Each session encompassed ten trials of 12 unique images presented in pseudo-random order, with “initial state” images consistently preceding their corresponding “later state” versions. The total duration of each session was approximately 20 min, and only a single session was recorded per participant. If it was not possible to complete the entire session, the maximum amount of data possible was acquired. Data were excluded from analysis if complete sets of trials for each image were not obtained. In parallel with EEG acquisition, the images required specific processing.
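Before turning to image processing, the trial timing described above can be summarized in a short sketch. This is a minimal illustration only: the display call, constant names, and loop structure are placeholders and do not reproduce the actual “StimPres” script.

```python
import time

# Timing constants taken from the protocol description above (illustrative only).
DEMO_S, WAIT_S, BLANK_S = 4.0, 1.0, 2.0
N_TRIALS_PER_IMAGE = 10

def show(label: str, duration_s: float) -> None:
    """Placeholder for the actual stimulus display call (e.g., a GUI update)."""
    print(f"{time.strftime('%H:%M:%S')}  {label}")
    time.sleep(duration_s)

def run_image_block(image_id: int) -> None:
    """One demonstration phase followed by ten recall trials for a single image."""
    show(f"DEMO image {image_id}", DEMO_S)              # 4 s stimulus presentation
    for _trial in range(N_TRIALS_PER_IMAGE):
        show("WAIT", WAIT_S)                            # 1 s wait screen
        show("BLANK (hold image in memory)", BLANK_S)   # 2 s recall interval
        show("WAIT", WAIT_S)                            # 1 s wait screen
```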

2.4. Image Processing

The images used are shown in Figure 3. Each image was encoded with an integer from 1 to 12. The “initial state” images (1–6) were always displayed before the “later state” images (7–12). Simple images were used to ensure that participants could clearly discern the changes. The inception score was used to ensure quality outcomes [51].
After classification, the image was sent to a pipeline prepared using ComfyUI [52]. Due to the energy and hardware requirements of conventional GANs and CNNs, the pipeline was designed to run locally. Each image was combined with its pair (e.g., the “initial state” and “later state”) and animated. The conversion of two sequential images into an animation predates generative AI, but ComfyUI enables a generative solution [52]. A parallel pipeline converted the image to a 3D solid in OBJ format. As detailed in prior work, the ComfyUI-based 2D-to-3D conversion started with the ComfyUI-Hunyuan3DWrapper and ComfyUI-Y7-SBS-2Dto3D, which employ depth map estimation and related photogrammetric techniques [52]. From the OBJ format, each 3D model was converted to an STL file for 3D printing using Python, as sketched below. These specifications are detailed in the design requirements.
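The specific library used for the OBJ-to-STL step is not named in the text; the following is a minimal sketch assuming the widely used trimesh package, with illustrative file names.

```python
# Minimal OBJ-to-STL conversion sketch (assumes the trimesh library; file names are placeholders).
import trimesh

mesh = trimesh.load("reconstructed_object.obj")  # 3D model exported by the ComfyUI pipeline
mesh.export("reconstructed_object.stl")          # binary STL suitable for slicing and 3D printing
```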

2.5. Design Requirements

EEG data acquisition was implemented using open-source hardware and software, specifically the OpenBCI Cyton biosensing board in conjunction with the Ultracortex Mark IV headset. Recording was conducted with sixteen channels of scalp EEG data at a sampling rate of 250 Hz. Data acquisition and timestamping were automated via a custom Python script to ensure temporal precision and reproducibility. As illustrated in Figure 4, the electrodes were positioned at the following standardized sites in the International 10–20 system: Fp1, Fp2, F7, F3, F4, F8, T3, C3, C4, T4, T5, T6, P3, P4, O1, and O2.
Each trial was recorded as an individual file, with the filename encoding the image identifier, trial number, and participant ID. Trials lacking valid timestamp data (less than 2% of the total) were excluded from further analysis. The inclusion criteria required a minimum of two trials with valid timestamps for each image–participant combination for that participant’s data to be retained in the final dataset. After data collection, the feature extraction, feature selection, and classification steps were executed offline. To prepare for real-time use in later iterations, a pseudo-real-time pipeline was prototyped that replayed previously recorded data through a 2 s sliding window advanced in 200 ms increments. Data within the window were sent for feature extraction.
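A minimal sketch of such a sliding-window replay is shown below, assuming a channels-by-samples NumPy array at the 250 Hz sampling rate; the function and variable names are illustrative rather than taken from the actual pipeline.

```python
import numpy as np

FS = 250                    # sampling rate (Hz)
WIN = 2 * FS                # 2 s analysis window (500 samples)
STEP = int(0.2 * FS)        # 200 ms advance (50 samples)

def replay_windows(recording: np.ndarray):
    """Yield successive 2 s windows from a (channels x samples) recording,
    advanced in 200 ms steps, mimicking the pseudo-real-time replay."""
    n_samples = recording.shape[1]
    for start in range(0, n_samples - WIN + 1, STEP):
        yield recording[:, start:start + WIN]

# Example: a 16-channel, 20 s all-zero array stands in for a recorded trial file.
dummy = np.zeros((16, 20 * FS))
for window in replay_windows(dummy):
    pass  # each window would be passed on to feature extraction
```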

2.6. Preprocessing and Feature Extraction

Based upon prior work, the selected feature types included spatiotemporal features and amplitude [53]. Each EEG data file contained approximately 20 s of recordings. The recorded data from each EEG channel were segmented into 1 s non-overlapping windows and processed independently. Each 2 s trial epoch encompassed a 1 s post-stimulus analysis window. Consistent with extensive prior research in visual recall and related paradigms, task-evoked EEG activity is predominantly characterized by transient event-related responses occurring within approximately the first 300–1000 ms following stimulus onset [22,23,24,25]. The additional 1 s of recording served as a temporal buffer to ensure complete capture of late or variable-latency components. However, only the initial 1 s post-stimulus segment was retained for analysis to optimize the signal-to-noise ratio and improve classification performance.
For each window, time-domain features were extracted. Windows exhibiting total signal amplitudes exceeding ±3 standard deviations from the session baseline were identified as artifacts and excluded from further analysis. The remaining signals were bandpass-filtered between 0.1 Hz and 123 Hz using a zero-phase, bidirectional fourth-order Butterworth filter, with additional notch filtering applied to suppress 60 Hz line noise. To minimize edge artifacts during filtering, we first extended the signal using symmetric reflection padding, with the padding length set equal to half the filter’s impulse response on each side of the signal. This extension was applied prior to filtering; after filtering, the padded sections were removed to restore the original signal length. This standard, widely used approach maintained signal continuity at the edges and reduced boundary-related distortions [8,9,18]. A temporal average was then computed for each window, as this feature has demonstrated utility in previous imagined speech BCI studies. Subsequently, the 99.95th percentile of signal amplitude (percent intensity) was calculated for each window. Finally, power spectral density (PSD) features were computed using Welch’s method for the major EEG frequency bands—delta (1–4 Hz), theta (5–8 Hz), alpha (8–12 Hz), beta (13–30 Hz), and gamma (30–100 Hz)—in alignment with standard EEG analysis protocols [54,55]. The mean power within the lower and upper sub-bands of each EEG frequency band was also computed (e.g., 8–10 Hz for the lower alpha sub-band). The extracted features were placed into concatenated feature vectors, which included both the absolute (non-normalized) spectral power values and the spectral power values normalized by the total spectral power across all frequency bands.
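The sketch below illustrates this preprocessing and spectral feature extraction using SciPy, under stated assumptions: SciPy's built-in even (reflection) padding stands in for the exact half-impulse-response padding described above, the Welch parameters and notch quality factor are illustrative, and the sub-band splits are omitted for brevity.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, iirnotch, filtfilt, welch

FS = 250  # sampling rate (Hz)
BANDS = {"delta": (1, 4), "theta": (5, 8), "alpha": (8, 12),
         "beta": (13, 30), "gamma": (30, 100)}

def preprocess(x: np.ndarray) -> np.ndarray:
    """Zero-phase 0.1-123 Hz band-pass plus 60 Hz notch filtering of one channel."""
    sos = butter(4, [0.1, 123], btype="bandpass", fs=FS, output="sos")
    x = sosfiltfilt(sos, x, padtype="even")        # reflection padding approximates the scheme above
    b, a = iirnotch(60.0, Q=30.0, fs=FS)           # Q factor is illustrative
    return filtfilt(b, a, x, padtype="even")

def window_features(x: np.ndarray) -> dict:
    """Welch band powers plus simple time-domain descriptors for a 1 s window."""
    freqs, psd = welch(x, fs=FS, nperseg=min(len(x), FS))
    feats = {name: psd[(freqs >= lo) & (freqs <= hi)].mean() for name, (lo, hi) in BANDS.items()}
    total = sum(feats.values())
    for name in list(BANDS):
        feats[name + "_norm"] = feats[name] / total               # normalized by total spectral power
    feats["temporal_average"] = x.mean()                           # temporal average of the window
    feats["percent_intensity"] = np.percentile(np.abs(x), 99.95)   # 99.95th percentile amplitude
    return feats
```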
Combining spectral, temporal, and amplitude features provides a richer representation of EEG dynamics compared to other brain–computer interface (BCI) classification modalities, such as steady-state visually evoked potential (SSVEP) decoding [4,17,18]. The simplest SSVEP decoding approaches rely on responses at a limited set of stimulation frequencies. More advanced methods, such as canonical correlation analysis (CCA) or task-modulated SSVEPs, extend this by incorporating harmonic components and target-specific frequency responses. While CCA-based SSVEP decoding integrates correlation and spectral information, it typically does not leverage broader EEG feature domains such as amplitude envelopes or cross-band spectral patterns. In contrast, other non-SSVEP paradigms, including covert speech and motor imagery BCIs, more readily combine multidimensional feature sets that draw from temporal dynamics, spectral content, and signal amplitude [8,9,18]. Although some SSVEP protocols incorporate task demands and precise stimulus timing, activity in the visual cortex and processes related to imagery or recall involve more complex neural mechanisms. As detailed in Supplemental Data, CCA and spatial filtering methods, such as common spatial patterns, were also tested to enhance the feature space but did not yield performance improvements. Once feature vectors were extracted, they were subsequently passed to a classifier for decoding.

2.7. Data Classification

The classification framework included both intrasubject and intersubject analyses. Intrasubject classification examined the feasibility of subject-specific EEG-based image identification by evaluating classifier performance on data from individual participants. Low-performance metrics, including the accuracy, the F1-score, and the area under the receiver operating characteristic (ROC) curve (AUC-ROC)—were interpreted as indicators of poor signal quality or limited feature discriminability. In contrast, intersubject classification evaluated model generalizability across participants, thereby providing insight into the feasibility of a subject-independent EEG-based image identification system. Robust decoding performance in this setting suggested that the approach could scale effectively with larger datasets. Feature selection employed the Average Distance between Events and Non-Events (ADEN) calculation technique, which applied two statistical weighting schemes to identify the most informative features under each classification scenario [55].
ADEN is a supervised feature selection technique that identifies at most three to six discriminative feature contrasts per run, avoiding data leakage by using only the randomly selected training dataset. For each class, the feature values were averaged, followed by a scaling step that combined standard z-score normalization with the standard Cohen’s d effect size. The absolute difference between the scaled class averages was then computed for each feature. The features were ranked by the magnitude of this inter-class distance, with the highest value indicating the greatest separability between classes. ADEN operated strictly within the training set during each fold of cross-validation. For each classification scenario, feature ranking and subset selection were performed exclusively on the training data, without access to the corresponding test data. This separation ensured that the selected subset was determined independently of the test set, thereby preventing data leakage. When the trained classifier was applied to the test set, only the features identified from that specific training fold were retained. This procedure was repeated for all folds and scenarios, consistent with best practices for supervised feature selection [55]. The feature selection process ran independently for each participant and cross-validation fold, so the exact number of unique features varied. For each case, the top-ranked three to six features were selected for downstream application on the validation data [55].
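As an illustration of this ranking step, the sketch below scores each feature by the distance between z-scored class means, scaled in the spirit of Cohen's d, for a single one-versus-rest contrast. The exact ADEN formulation follows [55] and may differ in its weighting details; this is an approximation under stated assumptions, applied only to the training fold.

```python
import numpy as np

def aden_rank(X_train: np.ndarray, y_train: np.ndarray, top_k: int = 6) -> np.ndarray:
    """Rank features by the distance between scaled class means (one-vs-rest contrast).

    X_train : (n_trials, n_features) training-fold feature matrix
    y_train : (n_trials,) binary labels (1 = target image, 0 = rest)
    Returns the indices of the top_k most separable features.
    """
    Xz = (X_train - X_train.mean(axis=0)) / (X_train.std(axis=0) + 1e-12)  # z-score each feature
    mu1, mu0 = Xz[y_train == 1].mean(axis=0), Xz[y_train == 0].mean(axis=0)
    sd1, sd0 = Xz[y_train == 1].std(axis=0), Xz[y_train == 0].std(axis=0)
    pooled = np.sqrt((sd1 ** 2 + sd0 ** 2) / 2) + 1e-12
    distance = np.abs(mu1 - mu0) / pooled                 # Cohen's d-style inter-class distance
    return np.argsort(distance)[::-1][:top_k]             # largest separability first
```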
Given the presence of 16 input channels and the potential for noise in the data, overfitting was identified as a significant concern. To mitigate overfitting and improve model generalization, evaluation metrics less sensitive to class imbalance—including the F1 score and the AUC-ROC—were prioritized over overall classification accuracy. Traditional machine learning algorithms were favored over more complex deep learning models to reduce the risk of overfitting. Based on prior methodologies used in comparable BCI systems, three classifiers were implemented for evaluation: Linear Discriminant Analysis (LDA), k-Nearest Neighbors (KNN), and Random Forest (RF) [56]. For each classification task, the dataset was randomly partitioned into four separate, similarly sized blocks. The classification framework was modeled as a modified one-versus-rest problem for each of the 12 image classes, with class balance achieved using methods suitable for limited sample sizes. Training and testing splits were designed to maintain equivalent class distributions. Each image-specific classifier employed four-fold cross-validation in a leave-one-block-out scheme, holding out one block at a time for validation to assess generalization reliability. Classification metrics, including accuracy, F1 score, and AUC-ROC, were computed for each classifier configuration with 12 classes and then averaged across both systems and image categories. Cross-validation was performed across trials rather than within trials to avoid temporal leakage.
The performance evaluation was conducted using an ensemble classification framework employing a modified one-vs-rest strategy across 12 target classes [4,14,15,16]. For each instance, the final class label was assigned by selecting the binary classifier exhibiting the highest confidence score, as estimated during model inference. The reported metrics represent the mean multi-class classification performance on the held-out test set, aggregated across all 12 classes, rather than a binary evaluation scheme. To ensure dataset quality, permutation tests (n = 1000 shuffles) were conducted on each dataset. Experiments were conducted for separate intrasubject and intersubject classification scenarios to evaluate the model robustness, each with their own performance metrics.
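A minimal sketch of this modified one-versus-rest ensemble is shown below, using scikit-learn Random Forests as the binary classifiers. The hyperparameters, function names, and the use of predict_proba as the confidence score are assumptions for illustration, not the exact implementation.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def fit_one_vs_rest(X_train, y_train, n_classes=12, seed=0):
    """Train one binary Random Forest per image class (modified one-versus-rest)."""
    models = []
    for c in range(1, n_classes + 1):
        clf = RandomForestClassifier(n_estimators=100, random_state=seed)
        clf.fit(X_train, (y_train == c).astype(int))   # class c vs. all other images
        models.append(clf)
    return models

def predict_one_vs_rest(models, X_test):
    """Assign each instance the label of the binary classifier with the highest confidence."""
    # Column 1 of predict_proba is the probability of the positive ("this image") class.
    conf = np.column_stack([m.predict_proba(X_test)[:, 1] for m in models])
    return conf.argmax(axis=1) + 1   # image codes run from 1 to 12
```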

2.8. Performance Metrics

To evaluate the potential performance enhancement offered by an instinctive image identification and decoding system in processing speed and throughput, the information transfer rate (ITR) was computed for each system implementation using Equation (1) [10].
ITR (in bits/trial) = log2(N) + P × log2(P) + (1 − P) × log2((1 − P)/(N − 1))
As shown in Equation (1), the ITR is quantified in bits per trial. Classification effectiveness is directly related to both the total number of distinct classes (N) and the mean classification accuracy (P).
In the implemented system illustrated in Figure 5, integers from 1 to 12 were assigned for each image. The participant wore the OpenBCI EEG headset, and the presentation displayed each image with the “StimPres” Python script. The participant was instructed to remember the prior image for 10 trials. Each participant had a number of uniquely coded EEG trials, with file names corresponding to the image code and trial number. A randomized portion of the EEG files from an individual participant was processed and trained as classifiers using the “train” Python script. Testing and validation occurred with the “trial” Python script, which used previously withheld validation EEG files on each classifier model. When the classifier model observed a validation EEG file, it was assigned an integer, from 1 to 12, corresponding to which image the model calculated it belonged to. The classifier output was compared to the “gold standard,” which was used to generate the confusion matrix and performance metrics.
Owing to the structure of the classifier, each image was also evaluated against itself at a temporally distinct point. Consistent and accurate identification of the same object across different timepoints serves as evidence of discrete temporal encoding [57]. To streamline the computation, a 1 s sampling window was adopted in accordance with the existing EEG data acquisition and processing protocol. Subsequently, Equation (2) was used to convert the results to bits per minute.
ITR (in bits/min) = ITR (in bits/trial) × 1 (trial/s) × 60 (s/min)
An arbitrary trial rate of 1 trial/s was adopted as a baseline to estimate the lower bound of the achievable information transfer rate (ITR). The use of non-overlapping 1 s trial windows stemmed from their role in feature extraction, not as a fixed system parameter. In a real-time implementation, alternative strategies—such as overlapping or sliding windows, asynchronous event detection, and adaptive trial durations—would typically be utilized. Therefore, the 1 trial/s rate served only as a simplified reference for illustrative ITR estimation [8,9,18].
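The sketch below computes Equations (1) and (2) directly; it assumes 0 < P < 1 and the 1 trial/s baseline discussed above.

```python
import math

def itr_bits_per_trial(n_classes: int, accuracy: float) -> float:
    """ITR of Equation (1); assumes 0 < accuracy < 1."""
    n, p = n_classes, accuracy
    return math.log2(n) + p * math.log2(p) + (1 - p) * math.log2((1 - p) / (n - 1))

def itr_bits_per_min(bits_per_trial: float, trials_per_second: float = 1.0) -> float:
    """Equation (2): scale by the assumed trial rate and 60 s/min."""
    return bits_per_trial * trials_per_second * 60

# Example: 12 classes at 92% mean accuracy gives roughly 2.9 bits/trial (about 174 bits/min).
print(round(itr_bits_per_min(itr_bits_per_trial(12, 0.92)), 1))
```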
Classifier performance is a critical factor in achieving a high Information Transfer Rate (ITR). Based on prior benchmarking results, it was anticipated that the Random Forest (RF) classifier would achieve superior average performance across key metrics including accuracy, AUC, and F1 score [27]. Previous studies also suggest that the most informative features for classification are the spectral band power and average mean amplitude, particularly when extracted from electrodes positioned on the upper and posterior regions of the scalp [49,53]. Specifically, electrodes located at the parietal and occipital sites, such as Pz, P4, and Oz within the International 10–20 electrode coordinate system, have been consistently associated with EEG patterns linked to visual recall, likely due to their anatomical proximity to the visual cortex [6,49]. Additionally, while gamma-band activity related to visual recall has been observed in frontal electrodes, these signals may be confounded by ocular artifacts [58]. EEG activity was visually plotted using MNE-Python to confirm findings (v. 1.10.1, MNE Developers, worldwide). To validate the feasibility of the proposed approach, initial classification tests were conducted in an offline setting. Mean values were reported alongside standard deviations. In addition to 1000 chance-level permutation tests for each dataset, statistical testing was performed to determine any significant differences between the classifiers, using paired t-tests.
To quantify the efficiency of the 3D projection system, the Structural Similarity Index Measure (SSIM) between the original topographic image and the corresponding orientation of its 3D reconstruction was calculated [22,25,33]. However, SSIM could be distorted by the “white space” background pixels shared between the source image and the 2D projection of the 3D object [4,5,26].
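A minimal sketch of the SSIM comparison is given below, assuming scikit-image and grayscale inputs of identical shape; masking the white-space background, as noted above, would be an additional step not shown here.

```python
import numpy as np
from skimage.metrics import structural_similarity as ssim

def compare_projection(source_gray: np.ndarray, projection_gray: np.ndarray) -> float:
    """SSIM between a source image and a 2D projection of its 3D reconstruction.

    Both inputs are grayscale arrays of identical shape; white-space background
    pixels are left in place here, which can inflate or distort the score.
    """
    data_range = float(max(source_gray.max(), projection_gray.max()) -
                       min(source_gray.min(), projection_gray.min()))
    return ssim(source_gray, projection_gray, data_range=data_range)
```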

3. Results

3.1. Summarizing Results

Across multiple scenarios, the classifier performance was evaluated for the image decoding system. The first scenario (in Section 3.2) assessed intrasubject classification, measuring the system’s ability to discriminate images within individual subjects. The second scenario (in Section 3.3) focused on intersubject classification, testing the generalizability of the model when trained on one subject’s data and validated on another’s. The third analysis (in Section 3.4) involved statistical tests, EEG plots, and feature and electrode selection to identify those contributing most significantly to robust image separation. For each phase, the ITR was computed to quantify the system effectiveness. Subsequently, 3D object reconstructions were generated using the ComfyUI pipeline (in Section 3.5).

3.2. Intrasubject Classification

For intrasubject classification, as shown in Figure 6, RF was the highest-performing classifier by mean AUC-ROC, reaching a mean accuracy of 92 ± 2%, a mean F1 score of 0.64 ± 0.08, and a mean AUC-ROC of 0.87 ± 0.10 (mean ± standard deviation). LDA achieved a mean accuracy of 92 ± 4%, a mean F1 score of 0.64 ± 0.05, and a mean AUC-ROC of 0.83 ± 0.11. KNN achieved a mean accuracy of 89 ± 12%, a mean F1 score of 0.71 ± 0.04, and a mean AUC-ROC of 0.79 ± 0.08. No significant differences were found between classifier types.
The performance across individual participants is plotted for the RF in Figure 7. The average rate of bits per intrasubject trial was 2.83, leading to an ITR of 170.2 bits per minute.

3.3. Intersubject Classification

In Figure 8, the results for the intersubject classification are plotted. Post hoc tests indicated significant differences with large effect sizes (p values < 0.02, Cohen’s d > 0.8) when contrasting LDA against RF and LDA against KNN.
For intersubject classification, the highest average performance was achieved by RF, with a mean accuracy of 92 ± 0.015%, a mean F1 score of 0.48 ± 0.01, and a mean AUC-ROC of 0.63 ± 0.05. For intersubject classification, the bits per trial for RF was 2.91, and the ITR was 174.4 bits per minute.

3.4. Top Features

For the chance-level permutation tests, the average p-value was 0.001, strongly suggesting consistent performance above chance-level accuracy. Based on the consistently ranked average maximum distances between classes, the spectral band power in the gamma and beta bands was the most consistently separable feature across images and electrode channels. The most consistent electrode positions for these features were frontal, including Fp1, Fp2, F3, and F4.
The normalized EEG bands are shown in Figure 9, indicating the power in the higher frequency bands. The average EEG activity for recalling the “before” dog image is shown in Figure 10.
The average EEG activity for recalling the “after” dog image is shown in Figure 11. While the frontal cortex was most active, the temporal and parietal lobes were also active.

3.5. Image to Object

Each image object was converted into a 3D object, as shown in Figure 12. Other conversions are detailed in the Supplemental Details.
The “initial state” was denoted as “1,” and the “later state” was denoted as “2.” The SSIM values are listed in Table 1. The banana images had the highest SSIM, with 0.69 for the “initial state” and 0.67 for the “later state.”
The files and code are available in the repository, linked in the data availability statement.

4. Discussion

4.1. Summary

All 20 participants contributed usable EEG data for the image decoding model, although some recordings may reflect cognitive processes beyond visual recall. Compared to the intersubject model, the individualized models worked most reliably with gamma and beta features on frontal electrodes, reaching a mean accuracy of 92 ± 2%, a mean F1 score of 0.64 ± 0.08, and a mean AUC-ROC of 0.87 ± 0.10. The discrepancy between the relatively low F1 score and the high accuracy for RF is not necessarily a flaw in the modeling approach itself. In a one-versus-rest ensemble, classes with limited samples or substantial feature overlap often exhibit reduced recall or precision, which can lower the overall F1 score despite maintaining high accuracy. Since the ensemble optimizes binary separability on a per-class basis, imbalances may intensify as classifiers associated with dominant categories exert disproportionate influence.
Earlier studies involving visual recall noted that F3 and F4 were active, although Fp1 and Fp2 often showed ocular artifact contamination [58]. Observed activity in other regions was consistent with prior work [49,58]. The preprocessing pipeline included a bandpass filter and artifact rejection procedures to mitigate non-neural noise. Ocular artifacts predominantly manifest in the low-frequency range (typically below 20 Hz), whereas muscular artifacts extend partially into the higher-frequency spectrum, overlapping with the lower gamma band (typically below 30 Hz). Given the experimental design, the inclusion of both normalized and non-normalized spectral features, and the absence of systematic artifact-related components following preprocessing, it is unlikely that ocular or muscular activity contributed to the observed gamma-band effects [8,9,18].
The prior work did not directly incorporate the temporal encoding of discrete stages and the transformation of objects over time [42,43,44]. Here, objects could be reliably separated from themselves at different time points, even with a low-cost EEG headset. Incorporating transformation and dynamism into the encoding of visual memory and replicable cognitive processes enables a more naturalistic and realistic context for individual objects. The use of low-cost EEG headsets with open-source software could greatly improve the accessibility of the technique and technology, especially in engineering and expression [4,26]. From art to assisting people with physical impairments, such technology could support rapid prototyping of designs [4,26].
The use of an older, less complex machine learning technique precludes the need to run a GAN, although higher-resolution models would require extensive training and hardware. Starting with a finite number of images, the one-versus-rest classifier framework can be generalized to a larger, dynamic number of categories. The relatively low standard SSIM values were likely due to the prevalence of “white space” background pixels in the projection and source image [26]. Even with relatively low SSIM values, 3D projection of 2D images could be conducted on a local device. While limitations remain, the system reliably differentiates and decodes object representations at distinct time points.
As a proof-of-concept study, these foundational results highlight the potential for developing robust, adaptable, and interactive EEG-driven image reconstruction and decoding systems, paving the way for expanding real-time applications in research, engineering, and creative industries [59]. The conversion of EEG into temporally encoded 3D objects has been demonstrated reliably on a local device, although limitations were present.

4.2. Limitations

A primary limitation was the reliance on offline performance, but this was essential to establish a proof of concept. A second limitation was the potential noise from ocular artifacts, although this could be compensated for by using certain frontal channels for artifact rejection and other techniques [58]. Another limitation was the relatively small number of participants and images; however, this was sufficient to establish a precedent that can be built upon. A major limitation is the absence of null trials, randomized distractors, or scrambled temporal sequences that could control for expectancy- or fatigue-driven effects. Future versions of this paradigm should introduce such controls to isolate neural responses attributable directly to visual encoding processes. Another potential limitation was using a modified “one-versus-rest” classifier ensemble with a fixed rather than dynamic number of categories. However, the system could be dynamically updated in future configurations, and ensemble optimization could be applied. Another potential shortcoming was bypassing the use of a GAN, which is widely used in prior work [4,5,26]. The fidelity of reconstructed 3D objects could be further enhanced by employing specialized models optimized for high-quality geometric and texture synthesis [4,5,26]. To avoid contamination by background pixels, the relatively low SSIM values could be significantly improved by integrating spatial similarity metrics that better align the structural relationships between the 2D input and the resulting 3D projection [26]. However, future work could simply scale the existing precedents established in this study and elsewhere. While our results do not directly elucidate the underlying neural mechanisms, they establish a foundation for future studies integrating neurophysiological analyses. These limitations offer clear paths for improvement.

4.3. Future Work

This proof-of-concept study demonstrated low-cost feasibility rather than serving as an in-depth exploration of the neuroscience of visual recall. Nonetheless, the results aligned with expectations of visual cortical entrainment to structured temporal sequences, modulated by attention and perceptual encoding. The next step is optimization of a real-time system, including artifact rejection, filtering, and classification performance. The prototypical “one-versus-rest” classifier could be adapted to a dynamic number of categories, and ensemble optimization could be employed. As currently performed in creative fields and several industries, a pre-trained GAN could be included to refine the resolution of 3D objects [36,37]. The separability and 3D projection quality of objects in a scene could also be improved to ensure greater reliability [4,5,26]. Using existing precedents, methods incorporating human–computer interaction and the ability to customize objects or edit generated videos intuitively could further bridge the gap between imagination and engineering [36,37]. Deep learning techniques could also be explored for improving classifier accuracy and robustness. Advances in other generative AI fields could also be applied, such as extrapolating or interpolating the state of an object more efficiently [40,56]. Based on the current usage of related technologies, the system could also be adapted for specific applications, such as manufacturing (using different versions of a product), animation (streamlining animation for 2D images), or transportation (recalling landmarks along a route) [25]. Real-time streaming of memories, imagined images, and dynamic scenes has already been established, but reducing the hardware requirements directly improves its accessibility [4,5,14,26].

5. Conclusions

This proof-of-concept study established the technical feasibility of approximating visual imagery from EEG data using individualized models, even when constrained to low-cost consumer-grade headsets and open-source software environments. The proposed system demonstrates robust classification and decoding performance, achieving high accuracy, F1 score, and AUC-ROC, with optimal results observed when gamma- and beta-band features are extracted from frontal electrodes—regions known to be associated with cognitive control and visual processing [58]. By integrating temporal encoding mechanisms, the approach approximates object transformations across time, yielding a more ecologically valid representation of visual memory compared to static or single-frame decoding paradigms [4,5,14,26]. Despite inherent limitations, including offline validation, a modest sample size, and the use of relatively simple machine learning algorithms, the system reliably differentiates and decodes image and object representations at distinct temporal intervals. While significant opportunities for improvement remain, such as real-time operation and artifact mitigation, the foundational results presented here underscore the potential for developing more robust, versatile, and interactive EEG-driven image decoding systems, paving the way for practical deployment in research, art, and industrial contexts [36,37].

Supplementary Materials

The data, models, and supplementary information are available at https://github.com/javeharron/nostalgiaAlpha (accessed on 23 August 2025).

Author Contributions

Conceptualization, J.L. and S.Z.; methodology, S.Z. and Y.L.; software, S.Z.; validation, J.L., E.H. and Q.T.; formal analysis, J.L.; investigation, J.F., Y.L. and E.H.; resources, S.Z.; data curation, J.L.; writing—original draft preparation, J.L.; writing—review and editing, J.L.; visualization, J.L.; supervision, J.L. and Q.T.; project administration, J.L., J.F. and S.M.; funding acquisition, Q.T. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

The study was conducted in accordance with the Declaration of Helsinki, and the protocol was approved by the Ethics Committee of The Ohio State University (STUDY20250042, approved on 11 July 2025).

Informed Consent Statement

Informed consent was obtained from all subjects involved in the study.

Data Availability Statement

The data, models, and Supplementary Information are available at https://github.com/javeharron/nostalgiaAlpha (accessed on 23 August 2025).

Acknowledgments

The authors would like to thank The Ohio State University. The authors used Perplexity (Perplexity AI, Inc., 2025) and Scite (Scite, Inc., 2025) to refine the grammar and structure of several sentences. Content was critically reviewed and not directly copied from AI-generated output.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Acar, Z.; Acar, C.; Makeig, S. Simultaneous head tissue conductivity and EEG source location estimation. NeuroImage 2016, 124, 168–180. [Google Scholar] [CrossRef]
  2. Zou, B.; Zheng, Y.; Shen, M.; Luo, Y.; Li, L.; Zhang, L. Beats: An open-source, high-precision, multi-channel EEG acquisition tool system. IEEE Trans. Biomed. Circuits Syst. 2022, 16, 1287–1298. [Google Scholar] [CrossRef]
  3. Trans Cranial Technologies. 10/20 System Positioning. 2012. Available online: https://thebrainstimulator.net/wp-content/uploads/2023/02/Trans_Cranial_Technologies-10_20_positioning_manual_v1_0.pdf (accessed on 21 February 2023).
  4. Song, Y.; Liu, B.; Li, X.; Shi, N.; Wang, Y.; Gao, X. Decoding natural images from EEG for object recognition. arXiv 2023, arXiv:2308.13234. [Google Scholar]
  5. Yang, G.; Liu, J. A New Framework Combining Diffusion Models and the Convolution Classifier for Generating Images from EEG Signals. Brain Sci. 2024, 14, 478. [Google Scholar] [CrossRef]
  6. Liu, X.; Liu, Y.; Wang, Y.; Ren, K.; Shi, H.; Wang, Z.L.D.; Lu, B.; Zheng, W. EEG2video: Towards decoding dynamic visual perception from EEG signals. Adv. Neural Inf. Process. Syst. 2024, 37, 72245–72273. [Google Scholar]
  7. Jahn, N.; Meshi, D.; Bente, G.; Schmälzle, R. Media neuroscience on a shoestring. J. Media Psychol. Theor. Methods Appl. 2023, 35, 75–86. [Google Scholar] [CrossRef]
  8. Capati, F.A.; Bechelli, R.P.; Castro, M.C.F. Hybrid SSVEP/P300 BCI keyboard. In Proceedings of the International Joint Conference on Biomedical Engineering Systems and Technologies, Rome, Italy, 21–23 February 2016; Volume 2016, pp. 214–218. [Google Scholar]
  9. Allison, B.; Luth, T.; Valbuena, D.; Teymourian, A.; Volosyak, I.; Graser, A. BCI demographics: How many (and what kinds of) people can use an SSVEP BCI? IEEE Trans. Neural Syst. Rehabil. Eng. 2010, 18, 107–116. [Google Scholar] [CrossRef] [PubMed]
  10. Blankertz, B.; Dornhege, G.; Krauledat, M.; Muller, K.R.; Kunzmann, V.; Losch, F.; Curio, G. The Berlin Brain-Computer Interface: EEG-based communication without subject training. IEEE Trans. Neural Syst. Rehabil. Eng. 2006, 14, 147–152. [Google Scholar] [CrossRef]
  11. Kübler, A. The history of BCI: From a vision for the future to real support for personhood in people with locked-in syndrome. Neuroethics 2020, 13, 163–180. [Google Scholar] [CrossRef]
  12. Singh, P.; Pandey, P.; Miyapuram, K.; Raman, S. EEG2IMAGE: Image reconstruction from EEG brain signals. In Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Rhodes Island, Greece, 4–10 June 2023; Volume 2023, pp. 1–5. [Google Scholar]
Figure 1. Chronological sequence of visual stimulus presentation, showing the first rest period and the first trial in a series.
Figure 2. Operational diagram for data acquisition and EEG recording during each experimental session.
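As a hedged illustration of the acquisition step summarized in Figure 2, the sketch below shows how a 16-channel OpenBCI (Cyton + Daisy) stream could be scripted with the BrainFlow Python API. The use of BrainFlow, the serial port, the recording duration, and the output file name are illustrative assumptions, not details reported by the study.

```python
# Minimal sketch of EEG acquisition from a 16-channel OpenBCI (Cyton + Daisy)
# board using the BrainFlow Python API. Port, duration, and output path are
# placeholder assumptions, not values from the study.
import time
import numpy as np
from brainflow.board_shim import BoardShim, BoardIds, BrainFlowInputParams

params = BrainFlowInputParams()
params.serial_port = "/dev/ttyUSB0"          # assumed port; adjust per system
board_id = BoardIds.CYTON_DAISY_BOARD.value  # 16-channel OpenBCI configuration

board = BoardShim(board_id, params)
board.prepare_session()
board.start_stream()
time.sleep(10)                               # record a 10 s example segment
data = board.get_board_data()                # rows = board channels, cols = samples
board.stop_stream()
board.release_session()

eeg_rows = BoardShim.get_eeg_channels(board_id)
eeg = data[eeg_rows, :]                      # keep only the EEG channels
np.save("example_trial.npy", eeg)
```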
Figure 3. Images depicting the temporally separate object states.
Figure 4. The 16-channel EEG headset used for data acquisition, with labelled electrode positions in the International 10–20 system.
Figure 5. Training and operation of the classification system. The steps for (A) training and (B) classifying are included.
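To make the subject-specific classification stage in Figure 5 concrete, the following minimal sketch trains a one-versus-rest Random Forest on per-trial feature vectors with scikit-learn. The feature layout (band power per channel), trial counts, label set, and hyperparameters are illustrative assumptions rather than the study's exact settings.

```python
# Illustrative sketch of subject-specific one-vs-rest classification with a
# Random Forest, assuming precomputed per-trial feature vectors (e.g., band
# power per channel). Data, labels, and hyperparameters are placeholders.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.multiclass import OneVsRestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(120, 16 * 5))   # 120 trials x (16 channels x 5 bands)
y = rng.integers(0, 12, size=120)    # 12 labels: 6 objects x 2 temporal states

clf = OneVsRestClassifier(RandomForestClassifier(n_estimators=200, random_state=0))
scores = cross_val_score(clf, X, y, cv=5, scoring="accuracy")
print(f"Mean CV accuracy: {scores.mean():.2f} +/- {scores.std():.2f}")
```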
Figure 6. Average results from the intrasubject classification.
Figure 7. Performance of individual participants with the Random Forest classifier.
Figure 8. Average results from the intersubject classification.
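Figure 8 covers the intersubject case, in which a model is evaluated on participants whose data were never seen during training. One common way to set this up is a leave-one-subject-out split, sketched below with scikit-learn's LeaveOneGroupOut; the synthetic data, binary labels, and AUC scoring are placeholders, and the study's exact cross-subject protocol is not assumed.

```python
# Sketch of an intersubject (leave-one-subject-out) evaluation, contrasting
# with the subject-specific models above. Data and labels are placeholders.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import LeaveOneGroupOut, cross_val_score

rng = np.random.default_rng(1)
n_subjects, trials_per_subject = 20, 60
X = rng.normal(size=(n_subjects * trials_per_subject, 80))
y = rng.integers(0, 2, size=n_subjects * trials_per_subject)
groups = np.repeat(np.arange(n_subjects), trials_per_subject)  # subject IDs

logo = LeaveOneGroupOut()
clf = RandomForestClassifier(n_estimators=200, random_state=0)
scores = cross_val_score(clf, X, y, groups=groups, cv=logo, scoring="roc_auc")
print(f"Leave-one-subject-out AUC: {scores.mean():.2f} +/- {scores.std():.2f}")
```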
Figure 9. Normalized power spectral density of the EEG bands at the frontal channels.
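Band-limited summaries like those in Figure 9 can be obtained by estimating the power spectral density with Welch's method and expressing each band's power as a share of the total. The sketch below assumes a 125 Hz sampling rate (typical of a 16-channel OpenBCI configuration), conventional band edges, and placeholder frontal-channel data; none of these values are taken from the study.

```python
# Sketch of per-band relative power estimation for frontal channels using
# Welch's method. Sampling rate, band edges, and channel data are assumptions.
import numpy as np
from scipy.signal import welch

FS = 125  # Hz (assumed; typical for a 16-channel OpenBCI Cyton + Daisy setup)
BANDS = {"delta": (1, 4), "theta": (4, 8), "alpha": (8, 13),
         "beta": (13, 30), "gamma": (30, 50)}

def relative_band_power(signal, fs=FS, bands=BANDS):
    """Return each band's share of total 1-50 Hz power for one channel."""
    freqs, psd = welch(signal, fs=fs, nperseg=2 * fs)
    in_range = (freqs >= 1) & (freqs <= 50)
    total = psd[in_range].sum()
    return {name: psd[(freqs >= lo) & (freqs < hi)].sum() / total
            for name, (lo, hi) in bands.items()}

# Placeholder 10 s segments standing in for two frontal channels (e.g., Fp1, Fp2)
frontal = np.random.default_rng(2).normal(size=(2, 10 * FS))
for name, channel in zip(["Fp1", "Fp2"], frontal):
    print(name, relative_band_power(channel))
```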
Figure 10. Multidimensional view of average EEG activity during recall of the "before" dog image.
Figure 11. Multidimensional view of average EEG activity during recall of the "after" dog image.
Figure 12. Conversion of a static 2D image to a 3D-printable model with the software pipeline.
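The 2D-to-3D conversion in Figure 12 relies on a dedicated software pipeline; as a generic, hedged stand-in only, the sketch below shows one simple way such a conversion can be scripted, treating grayscale intensity as a height map and writing an ASCII STL file. The function name, file paths, and scaling parameters are illustrative assumptions, not the study's actual tooling.

```python
# Illustrative sketch: turn a static 2D image into a 3D-printable surface by
# treating grayscale intensity as a height map and exporting an ASCII STL.
# This is a generic stand-in, not the study's actual pipeline.
import numpy as np
from PIL import Image

def image_to_stl(image_path, stl_path, z_scale=10.0, step=4):
    img = np.asarray(Image.open(image_path).convert("L"), dtype=float)
    img = img[::step, ::step] / 255.0 * z_scale      # downsample, scale heights
    rows, cols = img.shape
    with open(stl_path, "w") as f:
        f.write("solid heightmap\n")
        for r in range(rows - 1):
            for c in range(cols - 1):
                # Four corner vertices of one grid cell (x, y, z)
                v00 = (c,     r,     img[r, c])
                v10 = (c + 1, r,     img[r, c + 1])
                v01 = (c,     r + 1, img[r + 1, c])
                v11 = (c + 1, r + 1, img[r + 1, c + 1])
                # Two triangles per cell; placeholder normal (viewers recompute)
                for tri in ((v00, v10, v11), (v00, v11, v01)):
                    f.write("  facet normal 0 0 1\n    outer loop\n")
                    for x, y, z in tri:
                        f.write(f"      vertex {x} {y} {z}\n")
                    f.write("    endloop\n  endfacet\n")
        f.write("endsolid heightmap\n")

# Example usage (paths are placeholders):
# image_to_stl("dog_before.png", "dog_before.stl")
```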
Table 1. SSIM for each image contrasted with its 3D projection.

Image      SSIM
Apple1     0.58
Apple2     0.65
Banana1    0.69
Banana2    0.67
Boat1      0.59
Boat2      0.53
Bowling1   0.51
Bowling2   0.50
Cat1       0.55
Cat2       0.57
Dog1       0.60
Dog2       0.63
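Image-to-projection comparisons of the kind reported in Table 1 can in principle be computed with scikit-image's structural_similarity, as in the hedged sketch below; the file names, resizing step, and grayscale conversion are assumptions about preprocessing rather than details taken from the study.

```python
# Sketch of the SSIM comparison reported in Table 1 using scikit-image.
# File names are placeholders; images are converted to grayscale and resized
# to a common shape before comparison.
import numpy as np
from PIL import Image
from skimage.metrics import structural_similarity as ssim

def ssim_between(path_a, path_b, size=(256, 256)):
    a = np.asarray(Image.open(path_a).convert("L").resize(size), dtype=float)
    b = np.asarray(Image.open(path_b).convert("L").resize(size), dtype=float)
    return ssim(a, b, data_range=255.0)

# Example usage with placeholder file names:
# print(round(ssim_between("apple1_original.png", "apple1_projection.png"), 2))
```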