Episodic Memory Encoding and Retrieval in Face-Name Paired Paradigm: An fNIRS Study

Background: Episodic memory (EM) is particularly sensitive to pathological conditions and aging. In a neurocognitive context, the paired-associate learning (PAL) paradigm, which requires participants to learn and recall associations between stimuli, has been used to measure EM. The present study aimed to explore whether functional near-infrared spectroscopy (fNIRS) can be employed to determine the cortical activity underlying encoding and retrieval. Moreover, we examined whether and how different aspects of the task (i.e., novelty and difficulty) affect this cortical activity. Methods: Twenty-two male college students (age: M = 20.55, SD = 1.62) performed a face-name PAL paradigm while cortical activity was recorded with a 40-channel fNIRS system covering fronto-parietal and middle occipital regions. Results: Decreased activity in a broad network encompassing the bilateral frontal cortex (Brodmann areas 9, 11, 45, and 46) was observed during encoding, while increased activity in the left orbitofrontal cortex (Brodmann area 11) was observed during retrieval. Increased HbO concentration in the superior parietal cortices and decreased HbO concentration in the inferior parietal cortices were observed during encoding, while dominant activation of the left PFC was found only during retrieval. Higher task difficulty was associated with greater neural activity in the bilateral prefrontal cortex, and higher task novelty was associated with greater activation in occipital regions. Conclusion: Combining the PAL paradigm with fNIRS provided the means to differentiate the neural activity characterising encoding and retrieval. Therefore, fNIRS may have the potential to complement EM assessments in clinical settings.


Introduction
Episodic memory (EM) refers to the process of currently retrieving events from the past [1,2]. EM incorporates two main phases: (a) an encoding stage, which contributes to the formation of a new memory trace, and (b) a retrieval stage, which refers to the conscious remembering of past events. EM is an essential process for various higher cortical functions such as judgment and decision making [1,2]. EM performance has been reported to be particularly sensitive to aging and pathological conditions such as amnesia, mild cognitive impairment (MCI), and Alzheimer's disease (AD) [3,4]. Notably, a recent study indicated subjective memory complaints in 14% of younger adults (N = 4425) aged between 18 and 39 [5]. The progressive decline of EM has detrimental effects on multiple daily life outcomes including educational success and work performance [6,7]. Furthermore, because of the associated medical cost, EM impairment has been shown to increase financial pressures on families of affected people [8][9][10].
Over the last two decades, the paired-associate learning (PAL) paradigm has become one of the most widely used paradigms to assess EM performance [11][12][13]. The PAL paradigm usually involves the recollection of items after learning a series of associations between stimuli (e.g., faces and names). Given that the difficulty and novelty of the paradigm can be easily manipulated by means of item-item associations, it can be adapted to the functional levels of targeted clinical populations. In addition, the paradigm does not include complex language requirements, and it has been widely used to examine EM performance in both healthy participants and individuals with apparent memory impairments [14,15].
Based on the results of previous functional magnetic resonance imaging (fMRI) research and lesion studies, specific brain regions (i.e., prefrontal, parietal, and occipital cortices, and temporal hippocampal areas) have been suggested to be engaged in the process of EM [16][17][18]. According to the transfer-appropriate processing (TAP) theory, retrieval success depends on successful encoding. Specifically, the extent of encoding-related activation in regions engaged during both encoding and retrieval has been associated with subsequent memory success [19]. However, this reengagement phenomenon in EM has not been well supported by neuroimaging evidence. Moreover, it remains largely unclear whether the novelty (repetition times; repeated pairs vs. novel pairs) and difficulty level (number of characters in a name; two-word pairs vs. three-word pairs) of the stimuli can influence the accuracy rate of retrieval. To elucidate the neural mechanisms underlying retrieval success, the use of functional near-infrared spectroscopy (fNIRS) has recently been suggested, as it provides the means to investigate the functioning of the central and autonomic nervous systems in more ecological and clinically usable contexts [20].
Recently, the combination of behavioral assessment with fNIRS recordings has been broadly and efficiently applied in the exploration of cognitive performance and associated brain correlates [21]. Compared with electroencephalography (EEG) and fMRI, fNIRS is less sensitive to movement artifacts and has fewer contraindications (e.g., it allows the participation of individuals with metallic implants such as braces). Thus, fNIRS technology enables researchers to monitor the brain activity of children, healthy adults, and clinical populations (e.g., MCI, AD) with less stress, higher ecological validity, and lower costs, allowing a better integration into clinical practice. Regarding the PAL paradigm, fNIRS allows investigators to record verbal answers and EM-related cortical brain activity simultaneously [22]. Given these advantages, we aimed to combine the PAL paradigm with fNIRS to examine differences in oxygenated-hemoglobin concentrations associated with different aspects of the paradigm (i.e., memory phases, novelty, and difficulty level of face-name pairs) among male college students. Specifically, we hypothesized that encoding and retrieval would be characterized by different brain responses in the frontal and parietal cortices in spite of partial activation overlap [23][24][25], allowing the two phases of EM to be differentiated. As more cognitive resources and effort are required when performing a novel task [26] and with increasing difficulty levels [27][28][29], we also hypothesized that greater activation would be observed for novel pairs (vs. repeated pairs) and for pairs with a higher difficulty level.

Participants
Based on an a priori sample size calculation (α = 0.05, effect size dz = 0.8) [22], 22 male students from Shenzhen University (China; age: M = 20.55, SD = 1.62) participated in this study. Inclusion criteria were (a) normal visual, hearing, and physical function to complete the experiment; (b) right-handedness; (c) absence of mental health disorders or cognitive impairments; (d) no dependence on alcohol, nicotine, coffee, or drugs. Written informed consent was obtained from all individuals before participation. This study was approved by the Shenzhen University Institutional Review Board (protocol PN-2020-034) and was performed in accordance with the latest revision of the Declaration of Helsinki.

Experimental Procedure
We adopted the face-name PAL paradigm from a previous fMRI study (Figure 1) [30]. In the present study, the neutral faces were acquired from the CAS-PEAL Large Scale Chinese Face Database and were presented on the lab computer. There were 4 runs in the present paradigm. In the first two runs, two-word names were presented, with (a) run 1: female face-name pairs and (b) run 2: male face-name pairs. In the last two runs, three-word names were presented, with (a) run 3: female face-name pairs and (b) run 4: male face-name pairs. Each run started with a 15-s baseline and consisted of 4 blocks presented in the order of encoding (repeated), retrieval (repeated), encoding (novel), and retrieval (novel) face-name pairs. In each block, 7 faces were randomly presented on a white background, each lasting for 4.5 s, with 3-s intervals between faces. Each block was followed by a 15-s rest period. In the novel trials, 7 different face-name pairs were randomly presented; in the repeated trials, 2 different face-name pairs were randomly presented multiple times (at least 3 times for each pair). The two-word and three-word Chinese names represented the lower and higher difficulty levels, respectively. In the encoding phase, the participants were asked to try their best to remember the face-name pairs. In the retrieval phase, the faces that had appeared in the encoding phase were presented, and the participants were required to report the names associated with the presented faces within 4.5 s. The accuracy (accuracy = number of correct answers / total number of face-name pairs × 100%) was recorded by the investigators.
Before the experiment, the entire procedure was explained in detail by the investigator, and the participants were given specific instructions on how to remember the face-name matches for later testing. During the experiment, the participants were required to sit in front of the computer (at a distance of 45 cm) without extraneous movements for about 24 min while the fNIRS data were collected. At the beginning, a white fixation cross (+) was presented in the center of the visual field on a black background to ensure that the participants concentrated on the paradigm stimuli. During the encoding runs, seven faces with names printed underneath were shown, and the participants were asked to memorize the face-name pairs presented on the screen as accurately as possible. In the retrieval runs, a face without a name appeared on the screen, and the participants were required to report the paired name within the given time (7.5 s). During the rest periods, a black screen was displayed for 15 s. The participants' responses were recorded for further analysis.
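The accuracy score defined in the procedure can be expressed as a small helper. This is only an illustrative sketch: the function name and the example counts (5 correct answers out of 7 pairs) are hypothetical and are not taken from the study data.

```python
def retrieval_accuracy(n_correct: int, n_pairs: int) -> float:
    """Accuracy in percent: correct answers / total face-name pairs * 100."""
    if n_pairs <= 0:
        raise ValueError("n_pairs must be positive")
    if not 0 <= n_correct <= n_pairs:
        raise ValueError("n_correct must lie between 0 and n_pairs")
    return n_correct / n_pairs * 100.0


# Hypothetical retrieval block: 5 of the 7 presented faces named correctly.
example_accuracy = retrieval_accuracy(5, 7)
```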

fNIRS Data Acquisition
A 40-channel fNIRS system (Danyang Huichuang Medical Equipment Co. Ltd., China) was used to measure the relative change in oxygenated hemoglobin (HbO). Beta values were computed within the bilateral prefrontal, superior parietal, inferior parietal, and middle occipital cortices. The fNIRS system operated at two wavelengths (750 and 850 nm) and recorded changes at a sample rate of 7 Hz. In total, 16 sources and 15 detectors were used, and the optodes were placed in the cap based on the international 10-5 system for EEG electrode placement (Figure 2), with a black overcap covering all the optodes to avoid interference from environmental light (Babak S et al., 2010). The uniform distance between neighboring optodes was 3 cm. The 3D positions (MNI standard coordinates) of the sources and detectors, as well as the reference optodes, were calculated via a 3D digitizer (PATRIOT, Polhemus), which is based on a four-point positioning algorithm in 3D space relying on received signal strength indication. We mapped the fNIRS channels to the corresponding brain regions according to their MNI coordinates, as shown in Figure 2 (prefrontal cortex: channels 1-20; superior parietal cortex: channels 22, 32; inferior parietal cortex: channels 21-27, 29, 31-37, 39; middle occipital cortex: channels 28, 30, 38, 40). Additionally, the coordinates were checked using the fNIRS Optodes' Location Decider (fOLD) [31].
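For readers who want to reuse the channel layout, the channel-to-region assignment listed above can be held in a simple lookup table. The sketch below is an illustration only: the names `REGION_SPEC`, `expand`, and `build_channel_map` are hypothetical. Note that channels 22 and 32 fall inside the numeric ranges quoted for the inferior parietal cortex while also being listed as superior parietal; the code assumes, without confirmation from the text, that the explicit superior parietal listing takes precedence.

```python
# Channel lists as quoted in the text; tuples are inclusive (start, end) ranges.
REGION_SPEC = {
    "prefrontal": [(1, 20)],
    "superior_parietal": [22, 32],
    "inferior_parietal": [(21, 27), 29, (31, 37), 39],
    "middle_occipital": [28, 30, 38, 40],
}


def expand(spec):
    """Expand channel numbers and inclusive ranges into a set of channels."""
    channels = set()
    for item in spec:
        if isinstance(item, tuple):
            start, end = item
            channels.update(range(start, end + 1))
        else:
            channels.add(item)
    return channels


def build_channel_map(region_spec, priority=("superior_parietal",)):
    """Map each channel to one region.

    ASSUMPTION: regions named in `priority` claim their channels first, so
    channels 22 and 32 are labelled superior parietal rather than inferior.
    """
    ordered = list(priority) + [r for r in region_spec if r not in priority]
    channel_map = {}
    for region in ordered:
        for channel in sorted(expand(region_spec[region])):
            channel_map.setdefault(channel, region)
    return channel_map


CHANNEL_TO_REGION = build_channel_map(REGION_SPEC)
```

With this precedence rule, every one of the 40 channels receives exactly one label, which makes region-wise averaging of beta values straightforward.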

fNIRS Data Preprocessing
The raw fNIRS data were recorded during the face-name PAL paradigm using the NIRSmart acquisition software and then processed with the NIRSpark analysis software. During data preprocessing, unrelated time intervals and artifacts induced by motion and the environment were eliminated (automatic motion correction in NIRSpark; standard deviation threshold = 6.0; amplitude threshold = 0.5). The light intensities were converted to optical densities and then to blood oxygen concentrations through the modified Beer-Lambert law. A bandpass filter (0.01-0.1 Hz) was then applied to remove both noise and interference signals (heart rate, breathing rate, and Mayer waves) [32]. The initial time of the hemodynamic response function (HRF) was set to −2 s (i.e., baseline state) and the end time to 50 s (i.e., task state for a single block). The oxygenated HRF was averaged for each channel across the four blocks.
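The conversion step mentioned above relies on the modified Beer-Lambert law, which can be written in a common textbook form as follows. This is a generic statement of the law rather than the exact NIRSpark implementation. The notation is assumed here for illustration: $\varepsilon$ denotes the extinction coefficients, $d$ the source-detector separation (3 cm in this setup), and DPF the differential path length factor (the paper reports a path length factor of 6). The base of the logarithm depends on how the extinction coefficients are tabulated; base 10 is assumed.

```latex
\begin{align}
\Delta OD(\lambda_i) &= -\log_{10}\frac{I(\lambda_i)}{I_0(\lambda_i)}, \\
\Delta OD(\lambda_i) &= \bigl(\varepsilon_{\mathrm{HbO}}(\lambda_i)\,\Delta c_{\mathrm{HbO}}
  + \varepsilon_{\mathrm{Hb}}(\lambda_i)\,\Delta c_{\mathrm{Hb}}\bigr)\, d \,\mathrm{DPF}(\lambda_i),
  \qquad i = 1, 2.
\end{align}
```

Because two wavelengths are measured, the second line gives two linear equations in the two unknowns $\Delta c_{\mathrm{HbO}}$ and $\Delta c_{\mathrm{Hb}}$, which can be solved for each channel and time point.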

Data Analysis
The generalized linear model (GLM) was employed to analyze channel-wise hemodynamic responses induced by the face-name PAL paradigm. The GLM established a canonical HRF for each experimental condition and each participant, and calculated the degree of matching between the experimental and ideal HRF values. GLM models were used to analyze HbO concentration, with the stage (encoding/retrieval), novelty (novel/repeated), and difficulty level of face-name pairs (two-word name/three-word name) as independent measures. The beta value represented how much a regressor (e.g., experimental manipulation) contributed to the fNIRS signal and whether the HRF model was reliable (Pinti P et al., 2017). Diffuse optical topography (Figure 3) of the beta values was generated by EasyTopo [33,34], a toolbox running in MATLAB. Additionally, the mean activation of the hemodynamic state (HbO concentration) was computed and exported for postprocessing analyses. In this study, only HbO signals calculated by the modified Beer-Lambert law were used. HbO is less influenced by physiological noise and better reflects changes in the regional cerebral blood oxygenation induced by stimuli when compared to deoxygenated hemoglobin (Hb) [35]. The path length factor was set at 6 for both 740 nm and 850 nm waves.
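The idea behind a channel-wise GLM beta value can be sketched in a few lines: a boxcar marking the task block is convolved with a canonical HRF to form the regressor, and the beta is the least-squares weight of that regressor on the HbO time course. The sketch below is not the NIRSpark implementation. The double-gamma HRF shape (gamma densities with shapes 6 and 16 and an undershoot ratio of 1/6) is an assumed SPM-style default, the block timing (15-s onset, 50-s duration) is a hypothetical example, and the "HbO" signal is synthetic; only the 7 Hz sampling rate comes from the text.

```python
import math

FS = 7.0  # sampling rate in Hz, as reported for the fNIRS system


def gamma_pdf(t, shape, scale=1.0):
    """Gamma probability density; zero for non-positive times."""
    if t <= 0:
        return 0.0
    return t ** (shape - 1) * math.exp(-t / scale) / (math.gamma(shape) * scale ** shape)


def canonical_hrf(duration_s=32.0, fs=FS):
    """Double-gamma HRF (ASSUMED shape parameters), normalised to unit sum."""
    n = int(duration_s * fs)
    hrf = [gamma_pdf(i / fs, 6.0) - gamma_pdf(i / fs, 16.0) / 6.0 for i in range(n)]
    total = sum(hrf)
    return [h / total for h in hrf]


def boxcar(n_samples, onset_s, duration_s, fs=FS):
    """1.0 while the task block is on, 0.0 otherwise."""
    start = int(round(onset_s * fs))
    stop = int(round((onset_s + duration_s) * fs))
    return [1.0 if start <= i < stop else 0.0 for i in range(n_samples)]


def convolve(signal, kernel):
    """Causal convolution truncated to the length of `signal`."""
    out = [0.0] * len(signal)
    for i, s in enumerate(signal):
        if s == 0.0:
            continue
        for j, k in enumerate(kernel):
            if i + j >= len(out):
                break
            out[i + j] += s * k
    return out


def fit_beta(y, regressor):
    """Ordinary least squares for y = intercept + beta * regressor."""
    n = len(y)
    mean_x = sum(regressor) / n
    mean_y = sum(y) / n
    sxx = sum((x - mean_x) ** 2 for x in regressor)
    sxy = sum((x - mean_x) * (v - mean_y) for x, v in zip(regressor, y))
    beta = sxy / sxx
    return beta, mean_y - beta * mean_x


# Hypothetical 100-s recording with one task block starting after a 15-s baseline.
n_samples = int(100 * FS)
regressor = convolve(boxcar(n_samples, onset_s=15.0, duration_s=50.0), canonical_hrf())
synthetic_hbo = [0.1 + 0.5 * x for x in regressor]  # noise-free toy signal
beta, intercept = fit_beta(synthetic_hbo, regressor)
```

On this noise-free toy signal the fit recovers the weight used to build it; with real data, a positive beta indicates that HbO follows the modelled response and a negative beta indicates a task-related decrease.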


[Figure panels: views Frontal, Top, Left, Right; conditions Encoding, Encoding - Two Words, Retrieval - Repeated, Retrieval - Novel]

For the exported data (HbO concentration and beta value), the analysis was performed using IBM SPSS (v.23.0). The normality of the data distribution was checked with the Shapiro-Wilk test, as recommended for small sample sizes [36,37]. Then, paired-sample t-tests (e.g., encoding vs. retrieval, repeated pairs vs. novel pairs (novelty), two-word pairs vs. three-word pairs (difficulty level)) were used for normally distributed data, and Wilcoxon signed-rank tests were used for non-normally distributed data, with the significance level set at α = 0.05. As recommended for fNIRS studies, the false discovery rate (FDR) was used to account for the multiple comparison problem (i.e., type I error accumulation) [38,39]. For the behavioral data, Pearson's correlation was used to test the associations between the accuracy in the cognitive paradigm (number of correct answers) and the beta values [40]. We rated the correlation coefficients as follows: 0 to 0.19: no correlation; 0.20 to 0.39: low correlation; 0.40 to 0.59: moderate correlation; 0.60 to 0.79: moderately high correlation; ≥0.80: high correlation.

Table 1 shows the significance of HbO concentration changes among channels in the different task conditions. The channels with significant differences in beta values (baseline vs. task condition) are shown in Table 2. Finally, the visualization of beta values in the encoding and retrieval phases is illustrated in Figure 3.
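To make the multiple-comparison step concrete, the sketch below computes FDR-adjusted p-values across channels. The text states only that "the FDR" was used; the Benjamini-Hochberg step-up procedure shown here is an assumption about the variant, and the function name and the four example p-values are hypothetical.

```python
def fdr_bh(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order.

    ASSUMPTION: the paper does not name the FDR variant; BH is used here.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that adjusted
    # values never increase as the raw p-values decrease.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical uncorrected p-values for four channels.
raw_p = [0.01, 0.04, 0.03, 0.005]
adjusted_p = fdr_bh(raw_p)
significant = [p <= 0.05 for p in adjusted_p]
```

A channel is then reported as significant when its adjusted p-value does not exceed α = 0.05, which matches how "FDR-corrected p-values" are quoted in the results.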

Encoding and Retrieval Phases
During encoding, HbO concentration increased significantly from baseline in the left superior parietal cortex (channel 22; FDR-corrected p-value = 0.038), and significant decreases were observed in the bilateral prefrontal cortices and left inferior parietal cortex (channels 1, 3, 17, 18, 20, 25, 26; FDR-corrected p-values = 0.048, 0.048, 0.012, 0.048, 0.029, <0.001, 0.012). During the retrieval phase, a significant increase in HbO concentration from baseline was observed only in the left prefrontal cortex (channel 1; FDR-corrected p-value = 0.048).

In the encoding phase, the highest beta value (0.611) was located in the left superior parietal cortex, and the lowest beta value was reported in the middle occipital cortex (left: −0.147; right: 0.020). For the retrieval phase, the highest beta value (0.231) was found in the left prefrontal cortex and the lowest one (−0.183) in the left superior parietal cortex. Compared with retrieval, significantly higher beta values during encoding were observed in the right prefrontal cortex (channels 10, 12; FDR-corrected p-values = 0.022, 0.022), while significantly lower beta values were observed in the left prefrontal cortex (channel 1; FDR-corrected p-value = 0.021) and right middle occipital cortex (FDR-corrected p-value = 0.021).

During the retrieval of faces with two-word names, there was no significant decrease or increase from baseline in HbO concentration. During the retrieval of faces with three-word names, HbO showed significant increases in Brodmann area 11 of the bilateral prefrontal cortices (channels 7, 19; FDR-corrected p-values = 0.031, 0.035) and a decrease in Brodmann area 46 of the right prefrontal cortex (channel 20; FDR-corrected p-value = 0.031). Between these two task conditions, no significant difference in beta value was observed in any brain region.

Retrieval of Face at Different Novelty Levels: Repeated Faces vs. Novel Faces
During the retrieval of repeated and novel faces, no channel showed a significant increase or decrease in HbO concentration from baseline in any brain region. However, beta values during the retrieval of novel faces differed significantly from those during the retrieval of repeated faces in the left prefrontal cortex (channel 5; FDR-corrected p-value = 0.021), left middle occipital cortex (channel 30; FDR-corrected p-value = 0.036), right inferior parietal cortex (channel 31; FDR-corrected p-value = 0.046), and right superior parietal cortex (channel 32; FDR-corrected p-value = 0.022).

Correlations between Memory Performance and Brain Activation (Beta Value)
No correlation was found between the accuracy of retrieval and brain activation during encoding (Table S1). During the retrieval of novel faces, a negative correlation was observed between the accuracy of retrieval and the beta value in the right superior parietal cortex (channel 32; r = −0.573; FDR-corrected p-value = 0.020).
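The rating scale for correlation coefficients given in the statistical analysis can be applied to the coefficient reported above with a small helper. This is an illustrative sketch: the function name is hypothetical, and it assumes that the scale is applied to the absolute value of r and that the printed bands (e.g., "0 to 0.19" and "0.20 to 0.39") meet at the thresholds 0.2, 0.4, 0.6, and 0.8.

```python
def rate_correlation(r: float) -> str:
    """Label a Pearson coefficient using the bands stated in the paper.

    ASSUMPTIONS: the bands apply to |r|, and the cut points are
    0.2, 0.4, 0.6 and 0.8.
    """
    magnitude = abs(r)
    if magnitude > 1.0:
        raise ValueError("a correlation coefficient must lie in [-1, 1]")
    if magnitude < 0.2:
        return "no correlation"
    if magnitude < 0.4:
        return "low correlation"
    if magnitude < 0.6:
        return "moderate correlation"
    if magnitude < 0.8:
        return "moderately high correlation"
    return "high correlation"


# Coefficient reported for channel 32 during the retrieval of novel faces.
label = rate_correlation(-0.573)
```

Under these assumptions, the reported coefficient falls in the moderate band.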

Discussion
The present study used fNIRS to examine the brain mechanisms underlying memory encoding and retrieval in a sample of male college students. Compared to baseline, the fNIRS results showed differential changes in brain activation in the PFC between the encoding and retrieval processes of episodic memory (EM). In particular, the encoding process was characterized by decreases in HbO, suggesting deactivations in broader subregions of the bilateral PFC (Brodmann areas 9, 11, 45, 46). In contrast, the retrieval process was characterized by increases in HbO in the left orbitofrontal cortex (Brodmann area 11). Compared with retrieval, significantly higher beta values during encoding were observed in the right prefrontal cortex, while significantly lower beta values were observed in the left prefrontal cortex and right middle occipital cortex. These outcomes were partially consistent with previous fMRI [16][17][18] and fNIRS [22] studies. Overall, our findings indicate that fNIRS can be a useful tool to model brain activation patterns during memory-related tasks.
Previous fMRI studies reported increased activation of the PFC during encoding [41][42][43]. However, a recent fNIRS study [22] reported decreased prefrontal activity during encoding, which was consistent with our outcomes. The differences between fMRI and fNIRS findings may be explained by the fact that fMRI also measures deeper brain regions, including those in the prefrontal cortex. Regarding the dominant role of the left PFC in retrieval, at least two factors can explain this phenomenon. First, the activation of the left PFC-precuneus network results from item-item associations [44]. Second, the left PFC contributes to the acquisition of semantic knowledge, which was required in the face-name PAL paradigm [45][46][47].
The HbO status during encoding was opposite in the superior (increased activation) and inferior (Brodmann areas 39, 40; decreased activation) parietal cortices. Notably, the activated superior parietal cortex is an essential component of the dorsal attention network (DAN) and contributes to the top-down modulation of sensory processing [48]. Likewise, the deactivated angular gyrus (i.e., Brodmann area 39), which is part of the default mode network (DMN), has been reported to be involved mainly in passive states [48][49][50]. Based on these past findings, it could be argued that memory encoding may require higher attentional processes and thereby may inhibit the function of the DMN. Importantly, accumulating evidence pinpoints the important role of parietal DMN nodes in EM retrieval. Specifically, it has been suggested that the task-evoked activity first induces the close coupling of intra-DMN, para-hippocampal, and medial temporal regions and then activates the extra-DMN nodes to make a final decision [51,52]. Therefore, it can be concluded that the parietal cortex responds differentially to the memory phases and acts in coordination with intra/extra-DMN areas for different purposes.
In this study, regarding faces with different levels of naming difficulty, the blocks with three-word names generated greater beta values and HbO concentration levels in the bilateral prefrontal cortices (Brodmann areas 9, 10, 11) than the blocks with two-word names during both the encoding and retrieval processes. Additionally, during retrieval, left hemisphere activation was observed to be more dominant for pairs with higher difficulty levels. These results are in line with previous neuroimaging studies [51,52] and fit well within the "cortical asymmetry of reflective activity (CARA)" hypothesis [53], which holds that the left prefrontal cortex is more likely to be engaged in more complex and reflectively demanding tests. The complexity and reflective demand of the task include detailed and deliberative evaluation, analysis, and maintenance of the input information as well as the self-cued initiation of the search for additional information. Furthermore, at the more difficult level of the task, the selection of relevant information and the inhibition of irrelevant information also require the participation of the left prefrontal cortex [51][54][55][56].
Successful EM retrieval relies on the rapid reactivation of the sensory (i.e., visual) information presented during encoding, a process that depends on the occipital cortex, which plays a key role in visual processing [57]. In the present study, the repeated pairs generated lower activation in the occipital cortex compared to the novel pairs. This result is in agreement with the findings of previous fMRI studies observing that the repetition of items suppressed the activity of the occipital visual cortex [58,59]. It has been reported that such repetition suppression arises from stimulus-specific expectations. In particular, the visual cortex (i.e., visual area 1) is likely to produce a weaker response to expected stimuli relative to unexpected stimuli [60].
According to the transfer-appropriate processing (TAP) theory, people tend to deal with a task faster and more efficiently if the associated stimulus has been experienced before. In terms of memory, this phenomenon can be explained as follows: the success of retrieval depends on successful encoding [61][62][63]. It is hypothesized that key brain areas engaged during encoding are reengaged during the retrieval process [64]. Our results partially matched the TAP hypothesis. During the encoding and retrieval of three-word names, Brodmann area 46 was deactivated in a systematic fashion. Likewise, we found that participants did better on the repeated trials only in cases where the degree of activation (or deactivation) in several regions was similar in the encoding and retrieval phases.

Limitations
Due to the limited penetration depth of fNIRS, activity in the temporal hippocampal area, which has been reported in previous fMRI studies, could not be observed. Due to the limited number of optodes, we could not assess the activation of the whole occipital lobe. However, the primary visual cortex (visual area 1; Brodmann area 17) and parts of the extrastriate areas (visual areas 2-3; Brodmann area 18) were well covered by the channel setup in this study, which enabled us to explore the basic role of the visual system in EM. Additionally, we did not observe a correlation between the accuracy of retrieval and brain activation during encoding in the present study. In addition, future studies using fNIRS to investigate EM-related brain activation patterns should use data processing techniques to remove artifacts arising from systemic physiological changes (e.g., short-separation channel regression to account for changes in superficial blood flow) in order to reduce the likelihood of false-positive findings [65,66]. Moreover, considering the small sample size and the single-sex, narrow-age group (male university students) in this study, a larger and more diverse sample will be required to test the present outcomes in future studies.

Conclusions
In conclusion, the results of the present study demonstrated that the encoding and retrieval of EM are associated with distinct patterns of prefrontal, parietal, and middle occipital activation/deactivation. In addition, the difficulty and novelty of the paradigm play modulating roles in the engagement of brain substrates. The opposite brain responses of the superior and inferior parietal cortices during encoding and the dominant activation of the left PFC during retrieval allow us to differentiate the two phases of EM using fNIRS technology during a face-name PAL paradigm. Moreover, these findings on cerebral correlates were reinforced by the observation of significant correlations between brain activation patterns and behavioral performance (i.e., accuracy) during the retrieval process. These findings suggest that fNIRS may represent a valuable tool to investigate the brain processes engaged in EM.

Supplementary Materials:
The following are available online at https://www.mdpi.com/article/10.3390/brainsci11070951/s1, Table S1: Accuracy in Retrieval Phase.

Data Availability Statement:
Data of this study were collected at Shenzhen University.

Conflicts of Interest:
The authors declare that they have no conflict of interest.