Article

Evaluating Cognitive Function and Brain Activity Patterns via Blood Oxygen Level-Dependent Transformer in N-Back Working Memory Tasks

by Zhenming Zhang 1, Yaojing Chen 2, Aidong Men 1 and Zhuqing Jiang 1,3,*
1 School of Artificial Intelligence, Beijing University of Posts and Telecommunications, Beijing 100876, China
2 State Key Laboratory of Cognitive Neuroscience and Learning, Beijing Normal University, Beijing 100875, China
3 Beijing Key Laboratory of Network System and Network Culture, Beijing University of Posts and Telecommunications, Beijing 100876, China
* Author to whom correspondence should be addressed.
Brain Sci. 2025, 15(3), 277; https://doi.org/10.3390/brainsci15030277
Submission received: 30 December 2024 / Revised: 25 January 2025 / Accepted: 28 January 2025 / Published: 5 March 2025
(This article belongs to the Section Computational Neuroscience and Neuroinformatics)

Abstract

(1) Background: Working memory, which involves temporary storage, information processing, and the regulation of attentional resources, is a fundamental cognitive process and a significant topic of neuroscience research. This study aimed to evaluate brain activation patterns by analyzing functional magnetic resonance imaging (fMRI) time-series data collected during a purpose-designed N-back working memory task with varying cognitive demands. (2) Methods: We used a novel transformer model, the blood oxygen level-dependent transformer (BolT), to extract activation-level features of brain regions during the cognitive process, thereby obtaining the influence weights of regions of interest (ROIs) on the corresponding tasks. (3) Results: Our results agree with previous studies on the major brain regions involved while providing a more precise analysis for identifying brain activation patterns. For each type of working memory task, we selected the top 5 percent of the most influential ROIs and conducted a comprehensive analysis and discussion. Additionally, we explored the effect of prior knowledge conditions on the performance of different tasks in the same period and of the same tasks at different times. (4) Conclusions: The comparison results reflect the brain’s adaptive strategies and dependencies in coping with different levels of cognitive demand, as well as the stabilization and optimization of the brain’s cognitive processing. This study introduces innovative methodologies for understanding brain function and cognitive processes, highlighting the potential of transformers in cognitive neuroscience. Its findings offer new insights into brain activity patterns associated with working memory, contributing to the broader landscape of neuroscience research.

1. Introduction

Cognitive functions are the brain’s abilities to acquire, process, store, and retrieve information. Working memory, considered the foundation of cognitive functions, is commonly studied in neuroscience with experimental paradigms that assess the temporary storage and manipulation of information in the brain [1,2]. Working memory tasks typically present participants with a sequence of stimuli, such as numbers, letters, or shapes, and require them to retain and manipulate this information over a brief duration. Initially proposed by Kirchner in 1958 [3], the N-back task is one of the most widely employed paradigms in cognitive neuroscience research for investigating working memory. Because it demands both the retention and the manipulation of information, this task has been extensively used in neuroimaging studies. In the N-back task, participants are presented with a series of stimuli and are instructed to respond by button press when the current stimulus matches the one presented N steps back in the sequence. The task is administered with various values of N, and larger values of N make the task more difficult by increasing cognitive load [4,5]. A baseline condition (0-back) is included, in which participants must respond if a stimulus equals a predefined item.
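The N-back matching rule described above can be illustrated with a minimal sketch. The function name `nback_targets`, its parameters, and the example digit sequence are hypothetical and chosen only for illustration; they are not taken from the stimulus software used in this study.

```python
from typing import List, Optional, Sequence


def nback_targets(
    stimuli: Sequence[int],
    n: int,
    zero_back_target: Optional[int] = None,
) -> List[bool]:
    """Return, for each stimulus, whether it is a target under the N-back rule.

    For n == 0 a stimulus is a target if it equals a predefined item.
    For n >= 1 a stimulus is a target if it equals the stimulus n steps back.
    """
    if n < 0:
        raise ValueError("n must be non-negative")
    if n == 0:
        if zero_back_target is None:
            raise ValueError("0-back requires a predefined target item")
        return [s == zero_back_target for s in stimuli]
    # The first n stimuli have no item n steps back, so they cannot be targets.
    return [i >= n and stimuli[i] == stimuli[i - n] for i in range(len(stimuli))]


# Illustrative digit sequence (an assumption, not the study's stimuli).
sequence = [3, 7, 7, 2, 7, 5, 5, 5]
print(nback_targets(sequence, 0, zero_back_target=7))
print(nback_targets(sequence, 1))
print(nback_targets(sequence, 2))
```

In this toy sequence, the 1-back rule flags positions 2, 6, and 7 (zero-indexed), whereas the 2-back rule flags positions 4 and 7, which shows how the same stimulus stream yields different targets as N changes.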
Functional magnetic resonance imaging (fMRI) explores complex cognitive processes in the human brain [6,7,8] by measuring blood oxygen level-dependent (BOLD) responses that indicate changes in metabolic demand following neural activity [9,10]. Task-based fMRI associates stimulus or task variables with brain responses [11,12,13] to identify co-activated brain regions [14] and indicate functional connectivity [15]. Neuroimaging studies, particularly those employing fMRI, have advanced the understanding of the activated brain regions involved in N-back tasks. For instance, Owen et al. [16] conducted the first meta-analysis of fMRI studies on the N-back task in adults, identifying key regions such as the prefrontal and parietal cortices. Similarly, Rottschy et al. [17] confirmed the involvement of an extensive frontoparietal network in healthy adults, and Wang et al. [18] further demonstrated that regions like the middle frontal gyrus, inferior parietal lobule, and thalamus are consistently activated across different memory loads. These findings have provided valuable insights into the neural mechanisms underlying working memory tasks like the N-back.
Conventionally, statistical methods and traditional machine learning have been utilized to process fMRI data, aiming to estimate spatiotemporal patterns associated with cognition and brain diseases. Feature extraction is commonly employed to reduce dimensionality and mitigate nuisance variability [19,20]. Functional connectivity features are typically expressed as temporal correlations of BOLD responses across distinct brain regions. Methods such as support vector machines or logistic regression are then applied to classify relational variables [21,22,23,24]. While these approaches have proven valuable, they are limited in their ability to capture complex and nonlinear relationships within high-dimensional fMRI data. Recently, many studies have instead applied deep learning due to its capacity to capture intricate patterns in high-dimensional data, thus investigating the nonlinear relationship between brain dynamics and human cognition/behaviors [25,26,27]. Various successful deep learning models have emerged in the literature, leveraging convolutional [28], graph [29], or recurrent architecture [30,31] to process functional connectivity features. Moreover, several recent studies have opted for transformer models [32,33,34] to build a classifier directly on BOLD responses, enabling a more direct assessment of high-order interactions in fMRI data.
In this paper, we aimed to evaluate the brain activity patterns associated with cognitive function using deep learning techniques. To detect these cerebral activation patterns, we designed a working memory N-back task with three levels of cognitive demand. While brain cognitive patterns in N-back tasks have been extensively studied, recent research has demonstrated the precision and sensitivity of deep learning techniques in describing activated brain regions. This introduces innovative perspectives and methodologies for neuroscience research. In our study, we employed the blood oxygen level-dependent transformer (BolT) model to capture brain activity patterns across varying levels of cognitive task difficulty [35]. The experimental results are consistent with prior research and offer a more precise and detailed analysis of activated brain regions compared to traditional statistical methods. We also investigated the brain’s performance on (1) the same task under different prior knowledge conditions and (2) task classification at different time stages.

2. Methods and Materials

2.1. Datasets Description

Participants were recruited from the Beijing Aging Brain Rejuvenation Initiative (BABRI) study [36], an ongoing longitudinal investigation focusing on brain health and cognitive decline in elderly individuals residing in the community. Inclusion criteria for participants in this report were as follows: (1) native Chinese speakers aged over 50 years without dementia and with normal daily living abilities; (2) no history of brain tumors, neurological or psychiatric disorders, or substance addiction; (3) no conditions known to affect cerebral function, including alcoholism, current depression, Parkinson’s disease, or epilepsy; and (4) no contraindications to magnetic resonance imaging (MRI). In total, 255 individuals met these criteria and were included in the study.
All participants were scanned with a Siemens Trio 3T scanner (Siemens Healthineers, Erlangen, Germany) located at the Imaging Center for Brain Research at Beijing Normal University. Participants lay supine with their heads immobilized using straps and foam pads to minimize head movement. High-resolution T1-weighted sagittal 3D magnetization-prepared rapid gradient echo sequences were obtained, covering the entire brain (176 sagittal slices; repetition time = 1900 ms; echo time = 3.44 ms; slice thickness = 1 mm; flip angle = 9°; inversion time = 900 ms; field of view = 256 × 256 mm²; acquisition matrix = 256 × 256).

2.2. N-Back Task Design

The N-back task in this study was performed with visual stimuli consisting of selected numbers. The task involved presenting a series of visual stimuli to participants in a predetermined sequence, preceded by a 10-second guidance before each appearance. The task comprised 9 blocks, divided into 3 blocks per N-back condition ranging from easy to difficult (0-back, 1-back, and 2-back). In total, 180 stimuli (20 per block) were presented with different targets occurring thrice across the three conditions.
Participants were instructed to press the button with their right hand under three conditions: when the displayed number matched a predefined target number (0-back), when it matched the immediately preceding number (1-back), or when it matched the number presented two positions earlier (2-back). Notably, we also included target trials where subjects refrained from responding. This was done to ensure equal and adequate trials for all conditions and participants. See Figure 1 for an illustration of the paradigm.

2.3. Data Preprocessing

Data preprocessing was conducted using statistical parametric mapping (SPM12) software. Registration with higher degrees of freedom and segmentation operations were applied to analyze 3D T1-weighted scans. First, we applied slice-timing correction to mitigate potential confounds caused by temporal differences in slice acquisition within each volume of fMRI data. Next, we realigned the fMRI images to correct for head motion and co-registered the functional images with the T1-weighted anatomical images. Finally, we normalized the images into Montreal Neurological Institute (MNI) space and applied spatial smoothing with a Gaussian kernel of 6 mm³ to reduce spatial noise.
After preprocessing, we segmented the data according to the time series for the 0-back, 1-back, and 2-back tasks and designed classification experiments for each task. However, a single block contains only 20 trials, which might not provide sufficient information for feature extraction by the classifier and could lead to overfitting. To address this, we cropped and spliced the three blocks of the same task to increase the number of sampling points within a block. This approach expanded the operating space for the model to extract features and laid the groundwork for determining the most suitable hyperparameters for the model. Figure 2 shows a schematic diagram of block processing.
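The splicing step can be sketched as a concatenation along the time axis. This is a minimal illustration under stated assumptions: each block is taken to be an array of shape (time points, ROIs) with 20 time points and 400 ROIs, the data are random placeholders, and `splice_blocks` is a hypothetical helper rather than the authors' code.

```python
import numpy as np


def splice_blocks(blocks):
    """Concatenate same-condition blocks along the time axis.

    Each block is assumed to have shape (n_timepoints, n_rois).
    """
    n_rois = {b.shape[1] for b in blocks}
    if len(n_rois) != 1:
        raise ValueError("all blocks must have the same number of ROIs")
    return np.concatenate(blocks, axis=0)


# Placeholder data: three 0-back blocks of 20 time points x 400 ROIs (assumed sizes).
rng = np.random.default_rng(0)
blocks_0back = [rng.standard_normal((20, 400)) for _ in range(3)]
spliced = splice_blocks(blocks_0back)
print(spliced.shape)
```

Under these assumptions, the spliced series is three times as long as a single block, which gives the windowed attention modules more time points to operate on.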

2.4. Analysis Techniques

In our study, we adopted a novel transformer model, BolT [35], and designed a classification task to analyze the fMRI data. BolT has gained prominence in the analysis of fMRI data due to its capability to capture temporal dynamics directly from fMRI time series. In contrast to traditional methods that primarily rely on clustering and probabilistic approaches to identify the most relevant brain regions, BolT provides importance weights for all brain regions, thereby supporting a more comprehensive analysis. BolT has been reported to outperform other machine learning and deep learning techniques in accuracy and sensitivity [35], which facilitates a more precise and nuanced understanding of the brain’s responses to cognitive tasks. Although BolT has not been extensively evaluated for task-specific regions or fine-grained classification tasks, we applied it to the more specific N-back working memory task to investigate the brain regions activated during the task. BolT builds on the transformer architecture, separating spatial and temporal attention units and focusing on local representations through the fused window multi-head self-attention (FW-MSA) module. The FW-MSA module calculates local attention within adjacent time windows, which enhances the capture of subtle changes in the dynamics of brain activations while scaling linearly with the length of the fMRI time series [37].
We used the external Schaefer brain atlas, which comprises 400 regions labelled across seven intrinsic connectivity networks [38], to extract the regional BOLD responses and map the four-dimensional fMRI time-series data to the corresponding ROIs. The time series of each ROI was obtained by averaging the responses across its voxels, with the atlas aligned to the MNI template. The task of the model in the classification was to map these regional BOLD responses to the class label. An overview of the classification task process is shown in Figure 3.
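The voxel-averaging step can be sketched as follows. This is an illustrative sketch only: it assumes the BOLD volume and the atlas label volume are already in the same space and stored as NumPy arrays, that label 0 marks background, and that ROI labels run from 1 to `n_rois`. The function `roi_time_series` and the toy arrays are hypothetical.

```python
import numpy as np


def roi_time_series(bold, atlas, n_rois):
    """Average voxel time series within each atlas region.

    bold:  array of shape (x, y, z, t)
    atlas: integer array of shape (x, y, z); 0 = background, 1..n_rois = ROI labels
    Returns an array of shape (t, n_rois); ROIs with no voxels are filled with NaN.
    """
    if bold.shape[:3] != atlas.shape:
        raise ValueError("bold and atlas must share spatial dimensions")
    n_t = bold.shape[-1]
    flat_bold = bold.reshape(-1, n_t)
    flat_atlas = atlas.reshape(-1)
    out = np.full((n_t, n_rois), np.nan)
    for roi in range(1, n_rois + 1):
        mask = flat_atlas == roi
        if mask.any():
            out[:, roi - 1] = flat_bold[mask].mean(axis=0)
    return out


# Toy example: 4 voxels, 3 time points, 2 ROIs (ROI 1 has two voxels, ROI 2 has one).
toy_bold = np.arange(12, dtype=float).reshape(2, 2, 1, 3)
toy_atlas = np.array([[[1], [1]], [[2], [0]]])
toy_series = roi_time_series(toy_bold, toy_atlas, n_rois=2)
print(toy_series)
```

With a 400-region atlas, the same function would return an array of shape (t, 400), which is the form of input the classifier maps to a class label.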

2.5. Implementation Details

The experiments were conducted in PyTorch 1.12.1 on an NVIDIA RTX 2080 Ti GPU (NVIDIA Corporation, Santa Clara, CA, USA). Modelling used a five-fold cross-validation procedure to evaluate the model’s performance on different subsets and to assess its generalization ability. The fMRI time series were dynamically sampled at randomly generated start positions so that the model could better capture patterns and correlations, thus improving learning efficiency. The time series were standardized to maintain consistent value ranges, which improves convergence speed and stability.
Hyperparameter selection was based on performance on the initial validation set. Parameters demonstrating near-optimal performance across all datasets and atlases were selected. The selected parameters included learning rate $(2 \times 10^{-6}, 4 \times 10^{-4})$, 20 epochs, and mini-batch size $(8, 32)$. Training was performed with the Adam optimizer. BolT was trained to minimize the following loss: $L = L_{CE} + \lambda \cdot L_{CLS}$, where $L_{CE}$ is the cross-entropy loss and $\lambda = 0.1$ is the regularization coefficient for the $CLS$ loss, set via cross-validation.
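The sampling, standardization, and fold-splitting steps can be sketched in a few lines. The sketch makes several assumptions that the text does not state: standardization is applied per ROI across time, folds are formed by a random permutation of subject indices, and the crop length of 50 is used only as an example. The helper names are hypothetical.

```python
import numpy as np


def random_crop(series, length, rng):
    """Crop a (time, rois) series at a randomly drawn start position."""
    if length > series.shape[0]:
        raise ValueError("crop length exceeds series length")
    start = rng.integers(0, series.shape[0] - length + 1)
    return series[start:start + length]


def standardize(series, eps=1e-8):
    """Z-score each ROI over time (assumed convention)."""
    return (series - series.mean(axis=0)) / (series.std(axis=0) + eps)


def k_fold_indices(n_subjects, rng, k=5):
    """Split shuffled subject indices into k roughly equal folds."""
    return np.array_split(rng.permutation(n_subjects), k)


rng = np.random.default_rng(42)
series = rng.standard_normal((60, 400)) * 3.0 + 10.0  # placeholder data
crop = standardize(random_crop(series, length=50, rng=rng))
folds = k_fold_indices(255, rng)
print(crop.shape, [len(f) for f in folds])
```

Each fold serves once as the held-out set while the remaining four are used for training; with 255 subjects, every fold contains 51 subjects.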
Based on the number of subjects, BolT was trained for 20 epochs with a batch size of 16. A hidden dimensionality of 400 and 36 attention heads with 20 dimensions per head was prescribed. A dropout rate of 0.1 was used in both FW-MSA and MLP layers. For the FW-MSA architecture, given a desired dynamic length D, receptive field R, and window size W, stride s, fringe length L, and the number of layers N were set proportionately as follows:
$$s = W \alpha$$
$$L = 2 \times (1 - \alpha) W$$
$$D > R = W + (N - 1) \times s$$
where $\alpha \in (0, 1)$ is the stride coefficient, which is a preset proportionality constant. Dynamic length represents the application range of the FW-MSA structure and determines the range within which the model dynamically samples the time series. A larger dynamic length accommodates more transformer blocks for feature extraction.
Window size refers to the length of the time window over which each MSA module computes attention. Because the time series are short, the window length must be balanced against the number of modules (a longer window means fewer modules). Given the short time series, we prioritized the number of modules. The receptive field is formed by adding fringe blocks on both sides of the fused window, effectively increasing the number of layers. The receptive field needs to cover as much of the dynamic length as possible to accommodate more transformer blocks for feature extraction. The following values were selected for the hyperparameters: $D = 50$; $N = 12$; $W = 5$; $\alpha = 0.8$. The specific hyperparameter adjustment process is shown in Table 1.
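As a worked check of these settings, the sketch below evaluates the stride, fringe length, and receptive field for the selected values, reading the stride as the product of window size and stride coefficient. The function name `fw_msa_geometry` is hypothetical and the code only reproduces the arithmetic; it is not part of the BolT implementation.

```python
def fw_msa_geometry(window, alpha, n_layers):
    """Compute stride s, fringe length L, and receptive field R.

    s = W * alpha
    L = 2 * (1 - alpha) * W
    R = W + (N - 1) * s
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie in (0, 1)")
    stride = window * alpha
    fringe = 2.0 * (1.0 - alpha) * window
    receptive = window + (n_layers - 1) * stride
    return stride, fringe, receptive


# Selected hyperparameters from the text: D = 50, N = 12, W = 5, alpha = 0.8.
D, N, W, ALPHA = 50, 12, 5, 0.8
s, L, R = fw_msa_geometry(W, ALPHA, N)
print(f"s ~ {s:.2f}, L ~ {L:.2f}, R ~ {R:.2f}, D > R: {D > R}")
```

With these values the stride is about 4 time points, the fringe length about 2, and the receptive field about 49, so the constraint that the dynamic length exceeds the receptive field holds with D = 50 while the receptive field covers nearly the whole dynamic length.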

3. Results

3.1. Importance Weight Characteristics of Brain Regions

After the classification task achieved convincing accuracy, we evaluated the impact of the BOLD tokens by computing the gradient-weighted attention maps [39] and the correlation scores between tokens to determine their importance for the classification task. The importance weights for the ROIs of each N-back task corresponding to the Schaefer brain atlas in the working memory task are shown in Figure 4.
We used the landmark time point to identify the brain regions crucial for the detection task. Five tokens were extracted for each subject to represent responses across ROIs. Subsequently, a logistic regression model was trained to correlate the tokens at landmark time points with their respective output classes. The model weights signify the contribution of each ROI to the classification decision. We analyzed the importance weights of influential ROIs to determine their significance in each task and elucidate the neural correlates underlying the processing demands of these cognitive tasks.
In the 0-back task, participants were instructed to respond to a specific target number stimulus. The right hemisphere visual cortex, particularly regions Right Hemisphere Visual 19 and Right Hemisphere Visual 26, exhibited high importance weights. Moreover, left hemisphere regions such as Left Hemisphere Limbic Temporal Pole 6 and Left Hemisphere Visual 27 were also significantly activated. The left hemisphere somatosensory-motor cortex, Left Hemisphere Somatomotor 17, showed notable activity. Figure 5 presents a visualization of the top 5 percent of the most influential ROIs during the 0-back task, accompanied by Table A1, which outlines the details of the brain regions.
In the 1-back task, participants were required to respond whenever the current stimulus matched the one presented immediately before it. Left hemisphere regions demonstrated prominent activation, including Left Hemisphere Default Precuneus Posterior Cingulate Cortex 3 and Left Hemisphere Dorsal Attention Posterior 15. Furthermore, left hemisphere regions such as Left Hemisphere Frontoparietal Control Parietal 4 and Left Hemisphere Frontoparietal Control Parietal 1 showed significant involvement. The top 5 percent of the most influential ROIs for the 1-back task are visualized in Figure 6, and Table A2 offers specific details on the brain regions.
In the 2-back task, participants had to respond whenever the current stimulus matched the one presented two stimuli back. The right hemisphere dorsal attention network exhibited substantial activation, particularly regions Right Hemisphere Dorsal Attention Posterior 15 and Right Hemisphere Dorsal Attention Posterior 9. Moreover, the bilateral visual cortex regions Left Hemisphere Visual 25 and Right Hemisphere Visual 29 showed significant activation. Figure 7 and Table A3 provide the details for the 2-back task, as above.
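Selecting the top 5 percent of ROIs amounts to ranking the importance weights and keeping the highest-ranked fraction; with the 400-region atlas this gives 20 ROIs per task. The sketch below illustrates that ranking on placeholder weights. The helper `top_fraction` is hypothetical, and ranking by the raw weight value (rather than, for example, its absolute value) is an assumption.

```python
import numpy as np


def top_fraction(weights, fraction=0.05):
    """Return indices of the highest-weighted ROIs, most important first."""
    weights = np.asarray(weights)
    k = max(1, int(round(weights.size * fraction)))
    order = np.argsort(weights)[::-1]  # descending by weight
    return order[:k]


# Placeholder importance weights for 400 ROIs (not the study's values).
rng = np.random.default_rng(1)
weights = rng.random(400)
top_rois = top_fraction(weights, fraction=0.05)
print(len(top_rois), top_rois[:5])
```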

3.2. Comparative Experiments in Different Prior Knowledge Conditions

The previous experiment combined three N-back time series from subjects performing the same task throughout the experiment to classify three task types. However, the experiment design conducted these three tasks in a staggered order. Consequently, the same task varied in terms of prior knowledge, with differences in difficulty sequence, such as starting with easy and then transitioning to hard, or vice versa. To investigate these three types of working memory tasks in greater depth, we designed comparative experiments to classify the same difficulty of N-back tasks at different periods and to classify different difficulties of N-back tasks simultaneously.
Given that the sampling time for a single task is only 20 time points, we extended the experiment to 60 time points through replication. To validate the feasibility of the replication method, we conducted classification experiments on the dataset obtained through replication expansion and compared the classification results with the original dataset. Following simple parameter adjustments, the classification accuracy reached 67.76%. For comparison, the highest classification accuracy achieved in a single period was 67.10%. Therefore, it can be inferred that the replication expansion method effectively maintains the classification performance within the same period.
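The replication expansion can be sketched as tiling a single block along the time axis. The array shape (20 time points by 400 ROIs) follows the sizes given in the text, while the data, the helper name `replicate_block`, and the choice of simple end-to-end repetition are assumptions for illustration.

```python
import numpy as np


def replicate_block(block, repeats=3):
    """Repeat a (time, rois) block end to end along the time axis."""
    return np.tile(block, (repeats, 1))


rng = np.random.default_rng(7)
single_block = rng.standard_normal((20, 400))  # placeholder data
expanded = replicate_block(single_block, repeats=3)
print(expanded.shape)
```

Unlike splicing three distinct blocks, replication adds no new information; it only brings a single block up to the 60 time points that the model configuration expects.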
We extracted data corresponding to 0-back, 1-back, and 2-back tasks and conducted three separate classification experiments to assess their ability to distinguish among different prior situations. Table 2 shows the results of the classification task.
The classification results indicate a gradual decrease in accuracy as the task difficulty increased. This observation suggests that brain activity in regions associated with more challenging tasks exhibited greater similarity and was less influenced by prior knowledge. Conversely, easier tasks were more susceptible to sequencing and level of difficulty.
Next, we proceeded to classify the three tasks at different stages. The classification results are shown in Table 3.
The initial-phase model exhibits better classification performance, characterized by a higher discriminability of activated brain regions, whereas the subsequent two phases show similar classification results. We conjecture that participants lacked proficiency in the N-back task at the beginning, indicating a learning curve. Participants likely became more adept at capitalizing on repeated information as the task progressed, mitigating task difficulty. Consequently, the activation of brain regions began to decline and stabilize.

4. Discussion

Based on the BABRI cohort, this study innovatively adopted a novel transformer structure that effectively captures local-to-global representations of time series to perform detection tasks based on fMRI scans of the N-back task. The architecture learned latent representations of fMRI data via a novel fused window attention mechanism that incorporates long-range context with linear complexity regarding scan length. Detection was then performed based on learned high-level classification tokens regularized across time windows. We then used a matched explanatory technique to calculate the weights of activated brain regions to obtain each brain region’s contribution to the task.
In the 0-back task, the right hemisphere visual cortex exhibited high importance weights. This suggests its crucial involvement in visual processing and discrimination of the target stimulus, possibly reflecting the visual encoding and identification of the presented number. In the left hemisphere, the involvement of the limbic temporal pole might indicate emotional processing or memory retrieval associated with the presented stimuli [40,41]. Meanwhile, the visual cortex likely contributed to visual perception and recognition. The engagement of motor responses reflects the participants’ manual responses to the target stimuli.
In the 1-back task, the activation of the default precuneus posterior cingulate cortex in the left hemisphere, associated with the default mode network (DMN), suggests involvement in maintaining attentional focus and cognitive control during the task [42,43,44,45]. Dorsal attention posterior in the left hemisphere, a dorsal attention network (DAN) component, likely played a role in sustaining attention and monitoring for target stimuli [46,47]. Part of the parietal cortex, including the control parietal in the left hemisphere involved in attentional control, likely facilitated the comparison and matching processes required in the task [48,49,50].
In the 2-back task, the substantial activation of the dorsal attention network in the right hemisphere suggests its role in maintaining attentional resources and updating working memory representations across trials [51,52]. The activation of bilateral visual cortex regions indicates their involvement in visual processing and stimulus encoding, supporting the participants’ recognition and discrimination of the target numbers.
The observed activation patterns across the three tasks underscore the distributed nature of cognitive processing, with different brain regions contributing to various cognitive demands. The involvement of visual cortex regions in all tasks highlights the fundamental role of visual perception and discrimination in task performance. Furthermore, the engagement of attentional networks, including the default mode and dorsal attention networks, suggests the importance of attentional control and cognitive monitoring across tasks. These networks likely play a crucial role in regulating attentional resources and maintaining task-relevant information in working memory. These findings are consistent with the existing literature on the correlates of N-back tasks. In comparison to previous studies, we assigned precise importance weights to all brain regions, enabling an evaluation of the entire brain’s contributions rather than limiting the analysis to a few prominent regions. This perspective provides more detailed insights into the mechanisms underlying cognitive control and working memory processing.
In the comparative experiment, the observed decrease in classification accuracy with increasing task difficulty suggests a nuanced relationship between task complexity and brain activation. As tasks become more challenging, neural resources may be recruited more uniformly, reflecting adaptive strategies to cope with increased cognitive demands. Furthermore, the differential impact of prior knowledge on task classification highlights the role of cognitive factors in shaping brain activation patterns. Easier tasks are more susceptible to the influence of previous knowledge, indicating a potential reliance on familiar strategies or mental models. In contrast, more difficult tasks exhibit greater consistency in brain activation, possibly reflecting a higher reliance on core cognitive processes unaffected by prior experiences. The temporal dynamics of brain activation across task stages underscore the importance of considering learning effects in neuroimaging studies. The superior performance of the initial phase model suggests a period of exploration and adaptation, where participants familiarize themselves with task requirements. Subsequent phases show stabilization in brain activation, indicating optimized cognitive processing and reduced reliance on novel strategies.
However, our study focuses on individuals over 50 years of age, so its results may not accurately reflect brain activation patterns in younger individuals or in those with different health statuses, owing to age-related changes in brain structure and function and to varying life experiences. Moreover, excluding participants with neurological or psychiatric disorders limits our understanding of how these conditions affect the neural correlates of cognitive tasks. Future research should include a broader range of ages and clinical backgrounds to enhance the generalizability of the findings and uncover patterns specific to diverse populations.

5. Conclusions

This study employs a novel deep learning technique to investigate the spatiotemporal brain activation patterns during working memory tasks. The classification experiments identified the critical brain regions contributing to the cognitive task, providing new insights into the neural mechanisms and producing more accurate and comprehensive results. Additionally, comparative experiments revealed differences in brain activation patterns under varying task difficulty and prior knowledge conditions. These findings highlight the adaptive strategies of neural resources in response to increased cognitive demands and underscore the role of cognitive factors in shaping brain activation patterns. The results also validate the optimization of brain region stability in cognitive strategies following adaptation.

Author Contributions

Z.Z. conceptualized and designed the experiments, processed the data, analyzed and interpreted the results, and drafted the paper. Y.C. collected and provided the experimental data, and assisted in analyzing the results. A.M. critically reviewed and revised the paper. Z.J. provided guidance on the experimental design and contributed to paper writing. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Key Research and Development Program of China (2023YFC3605401).

Institutional Review Board Statement

The study protocol received approval ICBIR_A_0041_002.02 (22 February 2017) from the Ethics Committee and Institutional Review Board of Beijing Normal University Imaging Center for Brain Research.

Informed Consent Statement

Informed consent was obtained from all subjects involved in the study.

Data Availability Statement

Due to privacy and ethical restrictions, the data used in this study are not publicly available. The analysis scripts and data supporting the findings of this study can be obtained from the corresponding author upon reasonable request.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A

Table A1. The specific information of the top 5 percent of the most influential ROIs for 0-back.
| Rank | Brain Regions | Side | Centroid Coordinates (R A S) | Importance Weight |
|---|---|---|---|---|
| 1 | Visual 19 | R | 9 −74 9 | 0.315 |
| 2 | Visual 26 | R | 27 −87 21 | 0.314 |
| 3 | Limbic Temporal Pole 6 | L | −40 −21 −27 | 0.274 |
| 4 | Visual 27 | L | −12 −71 20 | 0.272 |
| 5 | Somatomotor 17 | L | −51 −7 43 | 0.271 |
| 6 | Frontoparietal Control Cingulate 1 | R | 6 −26 28 | 0.253 |
| 7 | Salience Ventromedial Attention Medial 4 | R | 12 −34 43 | 0.247 |
| 8 | Visual 5 | L | −23 −73 −10 | 0.238 |
| 9 | Visual 18 | R | 35 −89 2 | 0.238 |
| 10 | Dorsal Attention Posterior 16 | L | −20 −57 66 | 0.234 |
| 11 | Somatomotor 21 | R | 52 −13 49 | 0.231 |
| 12 | Somatomotor 10 | R | 41 −29 18 | 0.228 |
| 13 | Frontoparietal Control Lateral Prefrontal Cortex 8 | R | 48 18 23 | 0.228 |
| 14 | Default Temporal 5 | L | −53 6 −11 | 0.226 |
| 15 | Default Precuneus Posterior Cingulate Cortex 2 | L | −13 −61 19 | 0.225 |
| 16 | Default Dorsal Prefrontal Cortex and Medial Prefrontal Cortex 13 | R | 12 20 63 | 0.217 |
| 17 | Visual 28 | L | −32 −84 27 | 0.216 |
| 18 | Frontoparietal Control Parietal 5 | R | 54 −33 51 | 0.204 |
| 19 | Somatomotor 3 | R | 53 −14 6 | 0.203 |
| 20 | Default Dorsal Prefrontal Cortex and Medial Prefrontal Cortex 12 | R | 12 −55 15 | 0.203 |
Table A2. The specific information of the top 5 percent of the most influential ROIs for 1-back.
| Rank | Brain Regions | Side | Centroid Coordinates (R A S) | Importance Weight |
|---|---|---|---|---|
| 1 | Default Precuneus Posterior Cingulate Cortex 3 | L | −4 −53 20 | 0.246 |
| 2 | Frontoparietal Control Parietal 4 | L | −35 −62 48 | 0.241 |
| 3 | Dorsal Attention Posterior 15 | L | −7 −59 63 | 0.237 |
| 4 | Limbic Temporal Pole 2 | L | 7 42 4 | 0.216 |
| 5 | Dorsal Attention Posterior 8 | L | −46 −29 44 | 0.212 |
| 6 | Frontoparietal Control Parietal 1 | L | −29 −74 42 | 0.211 |
| 7 | Somatomotor 31 | R | 29 −11 65 | 0.206 |
| 8 | Visual 24 | L | −11 −97 17 | 0.195 |
| 9 | Frontoparietal Control Precuneus 2 | L | −5 −64 52 | 0.187 |
| 10 | Dorsal Attention Posterior 13 | R | 35 −36 51 | 0.181 |
| 11 | Limbic Temporal Pole 5 | R | 29 12 −30 | 0.176 |
| 12 | Frontoparietal Control Parietal 5 | L | −42 −52 49 | 0.174 |
| 13 | Dorsal Attention Posterior 5 | R | 32 −66 35 | 0.171 |
| 14 | Default Prefrontal Cortex 10 | L | −53 19 11 | 0.169 |
| 15 | Frontoparietal Control Lateral Prefrontal Cortex 15 | R | 24 10 58 | 0.165 |
| 16 | Frontoparietal Control Temporal 1 | R | 62 −28 −20 | 0.163 |
| 17 | Somatomotor 26 | L | −36 −19 65 | 0.161 |
| 18 | Visual 25 | L | −3 −84 24 | 0.159 |
| 19 | Dorsal Attention Posterior 4 | R | 45 −75 31 | 0.157 |
| 20 | Frontoparietal Control Parietal 2 | R | 56 −41 48 | 0.153 |
Table A3. The specific information of the top 5 percent of the most influential ROIs for 2-back.
| Rank | Brain Regions | Side | Centroid Coordinates (R A S) | Importance Weight |
| --- | --- | --- | --- | --- |
| 1 | Dorsal Attention Posterior 15 | R | 8 −71 53 | 0.287 |
| 2 | Visual 25 | L | −3 −84 24 | 0.285 |
| 3 | Visual 29 | R | 16 −87 36 | 0.256 |
| 4 | Visual 19 | L | 5 41 −11 | 0.255 |
| 5 | Dorsal Attention Posterior 9 | R | 45 −28 42 | 0.252 |
| 6 | Default Parietal 3 | R | 53 −53 26 | 0.241 |
| 7 | Visual 24 | R | 16 −66 19 | 0.237 |
| 8 | Dorsal Attention Frontal Eye Fields 1 | L | −40 −3 51 | 0.227 |
| 9 | Default Prefrontal Cortex 13 | L | −4 51 28 | 0.223 |
| 10 | Dorsal Attention Posterior 6 | L | −55 −32 45 | 0.218 |
| 11 | Frontoparietal Control Parietal 1 | L | −29 −74 42 | 0.214 |
| 12 | Visual 2 | L | −30 −33 −18 | 0.203 |
| 13 | Default Parietal 4 | L | −47 −64 31 | 0.200 |
| 14 | Visual 26 | L | −12 −71 20 | 0.195 |
| 15 | Somatomotor 31 | L | −19 −24 67 | 0.185 |
| 16 | Salience Ventromedial Attention Parietal Operculum 2 | L | −58 −44 27 | 0.183 |
| 17 | Default Parietal 4 | R | 55 −45 33 | 0.182 |
| 18 | Salience Ventromedial Attention Medial 5 | L | −13 −41 47 | 0.180 |
| 19 | Dorsal Attention Posterior 15 | L | −7 −59 63 | 0.177 |
| 20 | Frontoparietal Control Lateral Prefrontal Cortex 13 | R | 43 7 51 | 0.176 |
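
A top 5 percent list such as those in Tables A1–A3 could be derived from per-ROI importance weights roughly as follows. This is a minimal sketch, not the authors' code: the function name `top_fraction`, the synthetic weights, and the 400-ROI atlas size (inferred only from 20 rows corresponding to 5 percent) are assumptions made for illustration.

```python
def top_fraction(weights, fraction=0.05):
    """Return (roi_index, weight) pairs for the highest-weighted ROIs.

    `weights` is a list with one importance weight per ROI. Ties are
    broken by ROI index, which is an arbitrary choice made for this sketch.
    """
    k = max(1, round(len(weights) * fraction))
    # sorted() is stable, so equal weights keep their original ROI order.
    ranked = sorted(enumerate(weights), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Hypothetical, synthetic weights for an assumed 400-ROI atlas.
weights = [((i * 37) % 400) / 1000 for i in range(400)]
top = top_fraction(weights)

for rank, (roi, weight) in enumerate(top[:3], start=1):
    print(rank, roi, weight)
```

With 400 ROIs and `fraction=0.05`, the function returns 20 entries, matching the number of rows in each appendix table.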

References

  1. Baddeley, A. Working memory: Theories, models, and controversies. Annu. Rev. Psychol. 2012, 63, 1–29.
  2. Yaple, Z.; Arsalidou, M. N-back working memory task: Meta-analysis of normative fMRI studies with children. Child Dev. 2018, 89, 2010–2022.
  3. Kirchner, W.K. Age differences in short-term retention of rapidly changing information. J. Exp. Psychol. 1958, 55, 352.
  4. Kane, M.J.; Conway, A.R.; Miura, T.K.; Colflesh, G.J. Working memory, attention control, and the N-back task: A question of construct validity. J. Exp. Psychol. Learn. Mem. Cogn. 2007, 33, 615.
  5. Jaeggi, S.M.; Buschkuehl, M.; Perrig, W.J.; Meier, B. The concurrent validity of the N-back task as a working memory measure. Memory 2010, 18, 394–412.
  6. Kubicki, M.; McCarley, R.W.; Nestor, P.G.; Huh, T.; Kikinis, R.; Shenton, M.E.; Wible, C.G. An fMRI study of semantic processing in men with schizophrenia. Neuroimage 2003, 20, 1923–1933.
  7. Papma, J.M.; Smits, M.; De Groot, M.; Mattace Raso, F.U.; van der Lugt, A.; Vrooman, H.A.; Niessen, W.J.; Koudstaal, P.J.; van Swieten, J.C.; van der Veen, F.M.; et al. The effect of hippocampal function, volume and connectivity on posterior cingulate cortex functioning during episodic memory fMRI in mild cognitive impairment. Eur. Radiol. 2017, 27, 3716–3724.
  8. Mensch, A.; Mairal, J.; Bzdok, D.; Thirion, B.; Varoquaux, G. Learning neural representations of human cognition across many fMRI studies. Adv. Neural Inf. Process. Syst. 2017, 30, 1–11.
  9. Hillman, E.M. Coupling mechanism and significance of the BOLD signal: A status report. Annu. Rev. Neurosci. 2014, 37, 161–181.
  10. Rajapakse, J.C.; Kruggel, F.; Maisog, J.M.; Yves von Cramon, D. Modeling hemodynamic response for analysis of functional MRI time-series. Hum. Brain Mapp. 1998, 6, 283–300.
  11. Li, K.; Guo, L.; Nie, J.; Li, G.; Liu, T. Review of methods for functional brain connectivity detection using fMRI. Comput. Med. Imaging Graph. 2009, 33, 131–139.
  12. Venkataraman, A.; Van Dijk, K.R.; Buckner, R.L.; Golland, P. Exploring functional connectivity in fMRI via clustering. In Proceedings of the 2009 IEEE International Conference on Acoustics, Speech and Signal Processing, Taipei, Taiwan, 19–24 April 2009; pp. 441–444.
  13. Nishimoto, S.; Vu, A.T.; Naselaris, T.; Benjamini, Y.; Yu, B.; Gallant, J.L. Reconstructing visual experiences from brain activity evoked by natural movies. Curr. Biol. 2011, 21, 1641–1646.
  14. Simon, O.; Kherif, F.; Flandin, G.; Poline, J.B.; Riviere, D.; Mangin, J.F.; Le Bihan, D.; Dehaene, S. Automatized clustering and functional geometry of human parietofrontal networks for language, space, and number. Neuroimage 2004, 23, 1192–1202.
  15. Rogers, B.P.; Morgan, V.L.; Newton, A.T.; Gore, J.C. Assessing functional connectivity in the human brain by fMRI. Magn. Reson. Imaging 2007, 25, 1347–1357.
  16. Owen, A.M.; McMillan, K.M.; Laird, A.R.; Bullmore, E.T. N-back working memory paradigm: A meta-analysis of normative functional neuroimaging studies. Hum. Brain Mapp. 2005, 25, 46–59.
  17. Rottschy, C.; Langner, R.; Dogan, I.; Reetz, K.; Laird, A.R.; Schulz, J.B.; Fox, P.T.; Eickhoff, S.B. Modelling neural correlates of working memory: A coordinate-based meta-analysis. Neuroimage 2012, 60, 830–846.
  18. Wang, H.; He, W.; Wu, J.; Zhang, J.; Jin, Z.; Li, L. A coordinate-based meta-analysis of the n-back working memory paradigm using activation likelihood estimation. Brain Cogn. 2019, 132, 1–12.
  19. McKeown, M.J.; Sejnowski, T.J. Independent component analysis of fMRI data: Examining the assumptions. Hum. Brain Mapp. 1998, 6, 368–372.
  20. Svensén, M.; Kruggel, F.; Benali, H. ICA of fMRI group study data. NeuroImage 2002, 16, 551–563.
  21. Pereira, F.; Mitchell, T.; Botvinick, M. Machine learning classifiers and fMRI: A tutorial overview. Neuroimage 2009, 45, S199–S209.
  22. De Martino, F.; Valente, G.; Staeren, N.; Ashburner, J.; Goebel, R.; Formisano, E. Combining multivariate voxel selection and support vector machines for mapping and classification of fMRI spatial patterns. Neuroimage 2008, 43, 44–58.
  23. Zhang, X.; Hu, B.; Ma, X.; Xu, L. Resting-state whole-brain functional connectivity networks for MCI classification using L2-regularized logistic regression. IEEE Trans. Nanobioscience 2015, 14, 237–247.
  24. Wang, C.; Xiao, Z.; Wu, J. Functional connectivity-based classification of autism and control using SVM-RFECV on rs-fMRI data. Phys. Medica 2019, 65, 99–105.
  25. Heinsfeld, A.S.; Franco, A.R.; Craddock, R.C.; Buchweitz, A.; Meneguzzi, F. Identification of autism spectrum disorder using deep learning and the ABIDE dataset. NeuroImage Clin. 2018, 17, 16–23.
  26. Li, Y.; Liu, J.; Tang, Z.; Lei, B. Deep spatial-temporal feature fusion from adaptive dynamic functional connectivity for MCI identification. IEEE Trans. Med. Imaging 2020, 39, 2818–2830.
  27. Mlynarski, P.; Delingette, H.; Criminisi, A.; Ayache, N. Deep learning with mixed supervision for brain tumor segmentation. J. Med. Imaging 2019, 6, 034002.
  28. Kawahara, J.; Brown, C.J.; Miller, S.P.; Booth, B.G.; Chau, V.; Grunau, R.E.; Zwicker, J.G.; Hamarneh, G. BrainNetCNN: Convolutional neural networks for brain networks; towards predicting neurodevelopment. NeuroImage 2017, 146, 1038–1049.
  29. Parisot, S.; Ktena, S.I.; Ferrante, E.; Lee, M.; Guerrero, R.; Glocker, B.; Rueckert, D. Disease prediction using graph convolutional networks: Application to autism spectrum disorder and Alzheimer’s disease. Med. Image Anal. 2018, 48, 117–130.
  30. Fan, L.; Su, J.; Qin, J.; Hu, D.; Shen, H. A deep network model on dynamic functional connectivity with applications to gender classification and intelligence prediction. Front. Neurosci. 2020, 14, 881.
  31. Wang, L.; Li, K.; Hu, X.P. Graph convolutional network for fMRI analysis based on connectivity neighborhood. Netw. Neurosci. 2021, 5, 83–95.
  32. Kan, X.; Dai, W.; Cui, H.; Zhang, Z.; Guo, Y.; Yang, C. Brain network transformer. Adv. Neural Inf. Process. Syst. 2022, 35, 25586–25599.
  33. Malkiel, I.; Rosenman, G.; Wolf, L.; Hendler, T. Self-supervised transformers for fMRI representation. In Proceedings of the International Conference on Medical Imaging with Deep Learning, PMLR, Zurich, Switzerland, 6–8 July 2022; pp. 895–913.
  34. Deng, X.; Zhang, J.; Liu, R.; Liu, K. Classifying ASD based on time-series fMRI using spatial–temporal transformer. Comput. Biol. Med. 2022, 151, 106320.
  35. Bedel, H.A.; Sivgin, I.; Dalmaz, O.; Dar, S.U.; Çukur, T. BolT: Fused window transformers for fMRI time series analysis. Med. Image Anal. 2023, 88, 102841.
  36. Chen, Y.; Xu, K.; Yang, C.; Li, X.; Li, H.; Zhang, J.; Wei, D.; Xia, J.; Tao, W.; Lu, P. Beijing aging brain rejuvenation initiative: Aging with grace. Sci. Sin. Vitae 2018, 48, 721–734.
  37. Hutchison, R.M.; Womelsdorf, T.; Allen, E.A.; Bandettini, P.A.; Calhoun, V.D.; Corbetta, M.; Della Penna, S.; Duyn, J.H.; Glover, G.H.; Gonzalez-Castillo, J.; et al. Dynamic functional connectivity: Promise, issues, and interpretations. Neuroimage 2013, 80, 360–378.
  38. Schaefer, A.; Kong, R.; Gordon, E.M.; Laumann, T.O.; Zuo, X.N.; Holmes, A.J.; Eickhoff, S.B.; Yeo, B.T. Local-global parcellation of the human cerebral cortex from intrinsic functional connectivity MRI. Cereb. Cortex 2018, 28, 3095–3114.
  39. Chefer, H.; Gur, S.; Wolf, L. Generic attention-model explainability for interpreting bi-modal and encoder-decoder transformers. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, QC, Canada, 10–17 October 2021; pp. 397–406.
  40. Fang, J.; Jin, Z.; Wang, Y.; Li, K.; Kong, J.; Nixon, E.E.; Zeng, Y.; Ren, Y.; Tong, H.; Wang, Y.; et al. The salient characteristics of the central effects of acupuncture needling: Limbic-paralimbic-neocortical network modulation. Hum. Brain Mapp. 2009, 30, 1196–1206.
  41. Al-sharoa, E.; Al-khassaweneh, M.A.; Aviyente, S. Tensor Based Temporal and Multilayer Community Detection for Studying Brain Dynamics During Resting State fMRI. IEEE Trans. Biomed. Eng. 2019, 66, 695–709.
  42. Wang, X.; Cheng, B.; Roberts, N.; Wang, S.; Luo, Y.; Tian, F.; Yue, S. Shared and distinct brain fMRI response during performance of working memory tasks in adult patients with schizophrenia and major depressive disorder. Hum. Brain Mapp. 2021, 42, 5458–5476.
  43. Daamen, M.; Bäuml, J.G.; Scheef, L.; Sorg, C.; Busch, B.; Baumann, N.; Bartmann, P.; Wolke, D.; Wohlschläger, A.M.; Boecker, H. Working memory in preterm-born adults: Load-dependent compensatory activity of the posterior default mode network. Hum. Brain Mapp. 2015, 36, 1121–1137.
  44. Xiong, H.; Guo, R.J.; Shi, H. Altered Default Mode Network and Salience Network Functional Connectivity in Patients with Generalized Anxiety Disorders: An ICA-Based Resting-State fMRI Study. Evid.-Based Complement. Altern. Med. eCAM 2020, 2020, 4048916.
  45. Chao, T.H.H.; Lee, B.; Hsu, L.M.; Cerri, D.H.; Zhang, W.; Wang, T.W.W.; Ryali, S.; Menon, V.; Shih, Y.Y.I. Neuronal dynamics of the default mode network and anterior insular cortex: Intrinsic properties and modulation by salient stimuli. Sci. Adv. 2023, 9, eade5732.
  46. Kim, H. Neural activity during working memory encoding, maintenance, and retrieval: A network-based model and meta-analysis. Hum. Brain Mapp. 2019, 40, 4912–4933.
  47. Ischebeck, A.; Hiebel, H.; Miller, J.; Höfler, M.; Gilchrist, I.D.; Körner, C. Target processing in overt serial visual search involves the dorsal attention network: A fixation-based event-related fMRI study. Neuropsychologia 2021, 153, 107763.
  48. Pennock, I.M.L.; Schmidt, T.T.; Zorbek, D.; Blankenburg, F. Representation of visual numerosity information during working memory in humans: An fMRI decoding study. Hum. Brain Mapp. 2021, 42, 2778–2789.
  49. Capotosto, P.; Sulpizio, V.; Galati, G.; Baldassarre, A. Visuo-spatial attention and semantic memory competition in the parietal cortex. Sci. Rep. 2023, 13, 6218.
  50. Gilmore, A.W. Perceiving Oldness in Parietal Cortex: fMRI Characterization of a Parietal Memory Network. Ph.D. Thesis, Washington University in St. Louis, St. Louis, MO, USA, 2016.
  51. Beffara, B.; Hadj-Bouziane, F.; Hamed, S.B.; Boehler, C.N.; Chelazzi, L.; Santandrea, E.; Macaluso, E. Separate and overlapping mechanisms of statistical regularities and salience processing in the occipital cortex and dorsal attention network. Hum. Brain Mapp. 2023, 44, 6439–6458.
  52. Machner, B.; Braun, L.; Imholz, J.; Koch, P.J.; Münte, T.; Helmchen, C.; Sprenger, A. Resting-State Functional Connectivity in the Dorsal Attention Network Relates to Behavioral Performance in Spatial Attention Tasks and May Show Task-Related Adaptation. Front. Hum. Neurosci. 2022, 15, 757128.
Figure 1. (A) The working memory task adopts the numerical N-back paradigm, with three levels from easy to difficult: 0-back, 1-back, and 2-back. The three levels appear pseudo-randomly 3 times, with 10 seconds of instruction before each appearance. The task comprises a total of 9 blocks, with each block containing 20 trials. Among these, only 6 trials require a correct button response. (B) Illustration of the N-back working memory task paradigm with three levels of difficulty.
Figure 2. The entire task process is divided into three stages, with the same task blocks from each stage being cropped and spliced.
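
The cropping and splicing described in Figure 2 can be sketched as grouping the time points of each block by task level and concatenating them in order. This is only an illustration under stated assumptions: the block order, the block length of two samples, and the names `splice_by_level`, `series`, and `blocks` are hypothetical and are not taken from the study's acquisition protocol.

```python
def splice_by_level(series, blocks):
    """Concatenate the samples of all blocks that share a task level.

    `series` is a list of time points; `blocks` is a list of
    (level, start, end) tuples with `end` exclusive.
    """
    spliced = {}
    for level, start, end in blocks:
        # Crop the block out of the full run and append it to its level.
        spliced.setdefault(level, []).extend(series[start:end])
    return spliced


# Hypothetical run: 9 blocks of 2 samples each, three stages with every
# level appearing once per stage (the order is made up for this sketch).
order = ["0-back", "1-back", "2-back",
         "1-back", "2-back", "0-back",
         "2-back", "0-back", "1-back"]
series = list(range(18))
blocks = [(level, 2 * i, 2 * i + 2) for i, level in enumerate(order)]

spliced = splice_by_level(series, blocks)
print(spliced["0-back"])
```

Each spliced sequence then contains one cropped block per stage for a given level, in chronological order.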
Figure 3. Classification task process overview. The BOLD responses are extracted from the fMRI time series and projected to the corresponding brain regions according to the external brain atlas to obtain BOLD tokens. Each BOLD token encodes the ROI responses for the corresponding period. Cascaded transformer blocks process these BOLD tokens across a series of overlapping time windows in the time series. A separate learnable CLS token is introduced into the transformer blocks for each time window. Both the BOLD tokens and the CLS tokens serve as inputs to the transformer blocks, facilitating the extraction of latent representations. Finally, the output CLS tokens are averaged and passed through a linear layer to yield the final classification results.
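
The data flow in Figure 3 (overlapping windows, one CLS vector per window, averaging, linear layer) can be sketched in plain Python. This is a simplified illustration, not the BolT implementation: the transformer block is replaced by a placeholder that averages the BOLD tokens in a window, and the window size, stride, weights, and all function names are assumptions chosen for the example.

```python
def overlapping_windows(n_timepoints, window, stride):
    """(start, end) index pairs of overlapping windows; `end` is exclusive."""
    return [(s, s + window) for s in range(0, n_timepoints - window + 1, stride)]


def placeholder_cls(tokens):
    """Stand-in for a transformer block: mean of the window's BOLD tokens.

    A real model would learn this summary through attention; the mean is
    used here only so that the sketch runs without any dependencies.
    """
    n_rois = len(tokens[0])
    return [sum(tok[r] for tok in tokens) / len(tokens) for r in range(n_rois)]


def classify(bold_tokens, window, stride, weight, bias):
    """Average the per-window CLS vectors and apply a linear layer."""
    spans = overlapping_windows(len(bold_tokens), window, stride)
    cls_per_window = [placeholder_cls(bold_tokens[a:b]) for a, b in spans]
    n_rois = len(cls_per_window[0])
    pooled = [sum(c[r] for c in cls_per_window) / len(cls_per_window)
              for r in range(n_rois)]
    logits = [sum(w * x for w, x in zip(row, pooled)) + b
              for row, b in zip(weight, bias)]
    return logits.index(max(logits)), logits


# Toy input: 10 time points, 2 ROIs; 3 output classes (one per N-back level).
bold_tokens = [[float(t), 1.0] for t in range(10)]
weight = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]  # hypothetical linear layer
bias = [0.0, 0.0, 0.0]

label, logits = classify(bold_tokens, window=4, stride=2, weight=weight, bias=bias)
print(label, logits)
```

Because the stride is smaller than the window, neighbouring windows share time points, which is the overlap the caption refers to.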
Figure 4. The importance weights for the ROIs of each N-back task corresponding to the Schaefer brain atlas in the working memory task. The horizontal axis represents the different brain regions of the Schaefer brain atlas. The brain regions above the red line are considered to be the regions that contribute the most to the task.
Figure 5. The top 5 percent of the most influential ROIs for 0-back, with higher opacity indicating higher influence weights.
Figure 6. The top 5 percent of the most influential ROIs for 1-back, with higher opacity indicating higher influence weights.
Figure 7. The top 5 percent of the most influential ROIs for 2-back, with higher opacity indicating higher influence weights.
Table 1. The hyperparameter adjustment process in the classification task. The results demonstrate that a receptive field with more layers can increase classification accuracy under a larger dynamic length condition. Bold values indicate the parameter values and results that achieve the highest accuracy under the same conditions.
| D | N | W | α | Accuracy | ROC |
| --- | --- | --- | --- | --- | --- |
| 20 | 4 | 20 | 0.4 | 57.12% | 74.94% |
| 30 | 4 | 20 | 0.4 | 58.17% | 75.66% |
| 40 | 4 | 20 | 0.4 | 61.96% | 80.30% |
| 50 | 4 | 20 | 0.4 | 63.66% | 83.50% |
| 50 | 4 | 10 | 0.4 | 66.41% | 84.93% |
| 50 | 4 | 5 | 0.4 | 70.72% | 86.93% |
| 50 | 4 | 5 | 0.6 | 71.37% | 86.91% |
| 50 | 6 | 5 | 0.6 | 72.29% | 88.00% |
| 50 | 6 | 5 | 0.8 | 73.07% | 88.46% |
| 50 | 12 | 5 | 0.8 | 73.86% | 89.02% |
Table 2. Comparative experiment of the same task under different prior knowledge conditions.
| Task Type | Accuracy | ROC |
| --- | --- | --- |
| 0-back | 64.31% | 82.62% |
| 1-back | 56.07% | 75.21% |
| 2-back | 41.05% | 58.80% |
Table 3. Comparative experiment of the task classification at different time stages.
| Task Stage | Accuracy | ROC |
| --- | --- | --- |
| 012 | 67.58% | 84.46% |
| 120 | 63.53% | 82.22% |
| 201 | 63.79% | 82.32% |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Zhang, Z.; Chen, Y.; Men, A.; Jiang, Z. Evaluating Cognitive Function and Brain Activity Patterns via Blood Oxygen Level-Dependent Transformer in N-Back Working Memory Tasks. Brain Sci. 2025, 15, 277. https://doi.org/10.3390/brainsci15030277
