Children begin to understand themselves and their social surroundings by engaging in embodied social interactions with others. As this process is interpersonal by definition, it should ideally be studied by simultaneously obtaining data from all interaction partners. However, such a second-person social neuroscience approach has only gained momentum in developmental research in recent years [1]. Methodological challenges made it difficult to obtain neural measures from two (or more) individuals at the same time during naturalistic, live interactions [2]. The emergence of functional near-infrared spectroscopy (fNIRS), and particularly the synchronized measurement of brain activity from parents and their children while they interact with each other (so-called hyperscanning), has recently allowed us to take this step. We are now able to investigate the neurobiological underpinnings of socio-cognitive and affective processes underlying these early social interactions and thereby deepen our understanding of child development from a second-person social neuroscience perspective.
Children’s social brain development and the associated question of how children come to understand others have long been studied in experiments utilizing pre-recorded and live social stimuli. The obtained results provided important insights into children’s abilities to decipher social information but were devoid of the information’s social context and especially the dynamic and reciprocal nature of real social exchanges. Traditional developmental neuroscience has, thus, investigated the neural correlates of social perception from a third-person perspective, as children were studied individually and mainly in their role as observers [3]. Yet, children are more than merely passive observers during early social interactions; they are embodied agents able to generate more information by their own actions [4]. To truly assess the neurobiological basis of interpersonal social processes, we, thus, need to study parent–child interactions from a second-person perspective [3].
When a caregiver and a child communicate with one another during an embodied social interaction, they typically fluctuate between aligned and misaligned states [5]. Interpersonal alignment, or synchrony, appears to be of great importance for social interaction, not only concerning the matching of behavior and affective states, but also biological and neural rhythms [6]. Interpersonal synchrony is thought to be rewarding to the interactants, as it facilitates mutual prediction, interpersonal coordination, and allostasis, the latter describing the ongoing interpersonal physiological regulation required to meet the environment’s changing demands [6]. Recent methodological advancements now allow us to study interpersonal synchrony on various levels, but especially the dynamic, reciprocal alignment of oscillatory brain activity, from a second-person social neuroscience perspective. The simultaneous acquisition of brain signals in a multi-person experimental design has been termed ‘hyperscanning’.
Previous research with adults using hyperscanning provided evidence for interpersonal neural synchronization during verbal and non-verbal communication as well as interpersonal coordination [9]. The underlying mechanistic idea is that when the individual oscillatory brain activities become aligned during social interaction, information can be exchanged in an optimal manner. Accordingly, interpersonal neural synchrony was identified in brain networks related to mutual attention, affect attunement, mentalizing, as well as shared intentions [2]. The correlational evidence was recently extended by work using multi-brain stimulation to increase interpersonal neural synchrony and subsequent behavioral coordination [15]. These findings provide a causal framework for the social effects of interpersonal neural synchrony.
While the number of adult hyperscanning studies has been steadily increasing, only a few developmental hyperscanning studies (using near-infrared sensors specifically) exist to date. fNIRS uses near-infrared light to indirectly and non-invasively assess neural activity via concentration changes in blood oxygenation levels. Near-infrared light is emitted by sources and detected by detectors (optodes), which are secured to a participant’s scalp. A certain number of channels in the probe sets index activity occurring in the outermost layers of the cortex located below. Similar to functional magnetic resonance imaging (fMRI), fNIRS measures the hemodynamic response and assumes neurovascular coupling, i.e., increases in blood oxygenation levels upon neural activation and vice versa [16]. Beyond the blood-oxygenation-level-dependent (BOLD) response measured with fMRI, fNIRS obtains three relative concentration measures: oxygenated hemoglobin (HbO), deoxygenated hemoglobin (HbR), and total hemoglobin (Hb). Typically, when neural activity increases, HbO and Hb increase as well, while HbR concentration slightly decreases. fNIRS is especially suitable for naturalistic, developmental hyperscanning research due to its tolerance of motion and applicability [17]. The few studies using fNIRS hyperscanning in a developmental sample available to date show that parents and children (from preschool to school-age) synchronize their brain activities during cooperative tasks, ranging from standardized button press tasks to more naturalistic interactions, in comparison to various control conditions [18]. Additional studies evidenced interpersonal neural synchrony between adults/parents and children in free play interactions and during emotional video watching [19].
Generally, brain signal preprocessing approaches were similar across the above-mentioned studies. The overall procedure included data conversion from electrical signals to optical density measures, which were then corrected for motion, automatically or visually checked for signal-to-noise ratio, spatially filtered, and finally subjected to interpersonal neural synchrony estimations. Interpersonal neural synchrony was assessed using either Wavelet Transform Coherence (WTC) or correlational analyses of the two time-series (e.g., Pearson, robust and cross-correlations) [21]. Here, we will primarily focus on WTC to assess the relation between the two time-series of interacting partners’ brain activity. WTC is suggested to be more suitable than correlational approaches, as it is invariant to interregional differences in the hemodynamic response function (HRF) [24]. Correlations, on the other hand, are sensitive to the shape of the HRF, which is assumed to differ between individuals (especially regarding age) as well as between distinct brain areas. Moreover, a high correlation may be observed among regions that have no blood flow fluctuations.
In the remaining parts of this paper, we outline the data analysis procedure we have developed to study fNIRS hyperscanning data from naturalistic parent–child interactions. We illustrate our approach with an exemplary dataset of 20 mother–child dyads during a naturalistic, cooperative versus individual problem-solving task to showcase all pre-processing and analysis steps. Statistical analyses will allow preliminary inferences on whether parent–child dyads show significant interpersonal neural synchrony, and if so, within which region of interest (ROI) and during which experimental condition. We will be using MATLAB and RStudio to pre-process and analyze the data. Interested readers can follow the steps of the described analyses by examining the associated files on OSF (https://osf.io/wspz4/, accessed on 12 June 2021).
2. Materials and Methods
2.1. Sample Description
Our exemplary dataset consists of 20 mother–preschool child dyads randomly sampled from a real study data set and is used here for illustration purposes only. The children were 5–6 years of age (M = 5 years, 3 months; SD = 1.5 months). Mothers’ age averaged 37.20 years (SD = 3.51 years). Families were recruited from a database of volunteers based in and around a mid-sized city in eastern Germany. All dyads were of European white origin and came from middle to upper-class families based on parental education and family monthly income. Sixty percent of mothers had a university degree and 60% of families had a monthly income higher than 3000 €. Participants were remunerated for their participation.
2.2. Experimental Procedure
The primary interest of the study was whether interpersonal neural synchrony between mother and child increases during a cooperative problem-solving task in comparison to an individual problem-solving task. In the cooperative problem-solving task, mother and child sat face-to-face and were instructed to take turns arranging seven geometric wooden blocks (i.e., Tangrams) into predetermined templates (e.g., rocket, bridge, lamp). In the individual condition, mother and child were given four of the same templates and asked to reconstruct these by themselves. Because they were seated at the same table, a portable wooden barrier was put in between them to induce a separated task context. Mother and child were also instructed to refrain from talking to one another if possible. Each task lasted 120 s and was repeated twice, resulting in four task phases (see Figure 1A). In between those four task phases, three 80-s resting phases were included. fNIRS was simultaneously measured in both participants in bilateral temporo-parietal junction (left hemisphere: Channels 9–12; right hemisphere: Channels 13–16) and dorsolateral prefrontal cortex (left hemisphere: Channels 1–4; right hemisphere: Channels 5–8; see Figure 1B). For a full description of the study design, please refer to [25].
2.3. General Information on fNIRS Data Acquisition and Analysis
A visual summary of all processing steps is depicted in Figure 2. fNIRS data were obtained with a NIRScout 16–16 system (NIRx GmbH, Berlin, Germany) offering 16 sources and 16 detectors, which were divided to measure two interactants. In other settings, two or more devices can be synchronized to enable hyperscanning. The absorption of near-infrared light was measured at the wavelengths of 760 and 850 nm and the sampling frequency was 7.81 Hz. The start of each condition was indicated via triggers sent through an experimental program, for instance, OpenSesame [27]. Conditions of varying duration can be marked by triggering a second pin at the end of the condition, whereas preset, standardized task durations can be added manually at a later point.
The NIRStar recording program that came with the NIRScout device used to acquire the present data saved it in a NIRx-specific format, i.e., each participant’s data was stored in a separate folder and included the following files: *.avg, *.dat, *.evt, *.hdr, *.inf, *.set, *.tpl, *.wl1, *.wl2, *config.txt, and *probeInfo.mat. Other devices save fNIRS data in different formats, but all formats include a data file comprising the raw wavelength data, as well as a data file in which triggers and/or the optode configuration are saved.
We will use MATLAB and the following toolboxes to analyze the data:
2.4. Optode Configuration and Raw Data Conversion
In the first pre-processing step, we must store the individual optode configuration as a Source-Detector (SD) file. The SD file can be prepared using the SDgui as part of the Homer2 toolbox, in which the optode positions can be manually entered according to each study’s optode layout. In our case, having used a NIRx system, all necessary information to do so is available in the *probeInfo.mat and *config.txt files located in the corresponding data folder. The probeInfo.mat file is automatically created after each recording and stems from a manual configuration using NIRSite set up before the start of data collection. In NIRSite, sources and detectors can be manually placed on MNI head models or imported from digitized optode coordinates. If the same optode layout was used for all participants and individual digitizer information is unavailable, optode localization only needs to be saved once, and the resulting SD file can then be used generically for all participants. In this specific case, the function createSD file will automatically convert the optode configuration into an SD file.
Once the SD file is prepared, we can run the NIRxtoSPM function to convert the stored data into a MATLAB structure with the *.nirs format, which comprises all needed information for further processing with Homer2. This can be done using both Homer2 and SPM for fNIRS, and their data conversion functions are freely available online. For systems other than NIRx, these conversion functions should be available from the respective manufacturer (e.g., Hitachi: https://www.nitrc.org/projects/hitachi2nirs, accessed on 12 June 2021; Shimadzu: https://www.nitrc.org/projects/shimadzu2nirs/, accessed on 12 June 2021).
2.5. Pre-Processing and Visual Quality Check
We start the pre-processing by loading the *.nirs data of an individual into the MATLAB workspace. Next, we go through a rough automatic data quality check (initial pruning) using the function enprunechannels. We specify the expected range of the data in volts (dRange = [0.03 2.5]), a signal-to-noise threshold (SNRthresh = 10; the lower, the more conservative; with a range from 1–13), and the range of the inter-optode distance (SDrange = [2.5 3.5]). Please be aware that these parameters are different for each fNIRS recording device. The obtained output includes the variable SD.MeasListAct, which labels bad channels (per row) with a 0. This information can be used to guide the decision on which channels to exclude later.
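The pruning criteria can be illustrated with a minimal sketch, written here in Python rather than MATLAB. It is not the Homer2 implementation: the function name prune_channels, the one-series-per-channel data layout, and the use of mean divided by standard deviation as the signal-to-noise estimate are assumptions made for illustration. A channel is flagged as good (1) only if its mean signal lies within dRange, its signal-to-noise estimate exceeds SNRthresh, and its source-detector distance lies within SDrange.

```python
from statistics import mean, stdev


def prune_channels(intensity, distances, d_range=(0.03, 2.5),
                   snr_thresh=10.0, sd_range=(2.5, 3.5)):
    """Return one flag per channel: 1 = keep, 0 = bad (hypothetical helper).

    intensity: list of per-channel raw time series (in V)
    distances: list of source-detector distances, one per channel
    """
    flags = []
    for series, dist in zip(intensity, distances):
        m = mean(series)
        s = stdev(series)
        # Assumed signal-to-noise estimate: mean over standard deviation.
        snr = m / s if s > 0 else float("inf")
        ok = (d_range[0] < m < d_range[1]
              and snr > snr_thresh
              and sd_range[0] <= dist <= sd_range[1])
        flags.append(1 if ok else 0)
    return flags


if __name__ == "__main__":
    clean = [1.00, 1.01, 0.99, 1.00]   # stable signal within range
    noisy = [0.50, 1.50, 0.20, 1.80]   # large fluctuations, low SNR
    weak = [0.010, 0.011, 0.010, 0.009]  # mean below the lower bound
    print(prune_channels([clean, noisy, weak], [3.0, 3.0, 3.0]))
```

The returned flags play the same role as the rows of SD.MeasListAct described above, with the difference that this sketch treats each channel as a single time series.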
Subsequently, we convert the raw wavelength data (illustrated in Figure 3) into raw optical density (OD; illustrated in Figure 4, upper panel) using the function spm_fnirs_calc_od. Following the initial pruning and data conversion, we proceed to motion correction. The choice of motion correction has been discussed in various other methodological papers [30]. We have found the MARA algorithm [32] to provide the most efficient motion correction when applied to both our adult and preschool child fNIRS data. MARA implements a smoothing based on local regression using weighted linear least squares and a 2nd-degree polynomial model. We use the default setting for the length of the moving window to calculate the moving standard deviation (L = 1), the threshold for artifact detection (th = 3), and the parameter that defines how much high-frequency information should be preserved by the removal of the artifacts (alpha = 5). This parameter also corresponds to the length of the LOESS smoothing window. Motion-corrected OD data is illustrated in Figure 4, lower panel.
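As a brief illustration of the conversion to optical density at the start of this step, the following Python sketch uses a common definition, the negative natural logarithm of the intensity relative to its mean over the recording. Whether spm_fnirs_calc_od uses exactly this baseline is not stated here, so the baseline choice and the function name intensity_to_od are assumptions.

```python
import math


def intensity_to_od(series):
    """Convert one raw intensity time series to optical density changes.

    Assumed definition: OD(t) = -ln(I(t) / mean(I)).
    """
    baseline = sum(series) / len(series)
    return [-math.log(value / baseline) for value in series]


if __name__ == "__main__":
    # Intensity below the baseline gives positive OD (more absorption).
    print(intensity_to_od([1.0, 2.0, 3.0]))
```

With this definition, a drop in detected light relative to the baseline yields a positive OD change, and a constant signal yields zero throughout.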
Subsequently, we continue with an additional manual, visual quality check. To do so, we convert the OD data to concentration changes (µmol/L) in oxygenated (HbO) and deoxygenated hemoglobin (HbR) using the function hmrOD2Conc, with the partial path length factor set to its default [6 6]. We then extract the HbO time-series and plot it for all channels using the wavelet transform function wt. We then check the plot of each channel for its integrity, a clearly visible heart band, and visually observable motion artifacts. Examples of different types of artifacts detected during this manual, visual quality check are shown in Figure 5. As a final result, the bad-quality channels are identified and saved in an array to be excluded from further analyses.
Once the quality check is finalized, we go back to the OD data after motion correction and apply a band-pass filter, with low- and high-pass cutoffs of 0.5 Hz and 0.01 Hz, respectively, using a second-order Butterworth filter with a slope of 12 dB per octave. Filtering at this point makes sure that the data is also usable for individual analyses (GLM) and excludes physiological noise to a certain extent.
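To make the filter settings concrete, the following Python sketch builds a band-pass filter as a cascade of a second-order Butterworth high-pass section (0.01 Hz) and a second-order Butterworth low-pass section (0.5 Hz), each obtained via the bilinear transform. This is a simplified stand-in and not the toolbox filter: it runs in a single forward pass (no zero-phase filtering), and the function names are hypothetical.

```python
import cmath
import math

Q_BUTTERWORTH = 1.0 / math.sqrt(2.0)  # Q of a second-order Butterworth section


def biquad(kind, cutoff_hz, fs):
    """Coefficients (b, a) of a second-order 'low' or 'high' Butterworth section."""
    w0 = 2.0 * math.pi * cutoff_hz / fs
    alpha = math.sin(w0) / (2.0 * Q_BUTTERWORTH)
    cosw = math.cos(w0)
    if kind == "low":
        b = [(1 - cosw) / 2, 1 - cosw, (1 - cosw) / 2]
    else:
        b = [(1 + cosw) / 2, -(1 + cosw), (1 + cosw) / 2]
    a = [1 + alpha, -2 * cosw, 1 - alpha]
    return b, a


def apply_biquad(x, b, a):
    """Filter the sequence x with one section (direct form I, forward pass only)."""
    y = []
    x1 = x2 = y1 = y2 = 0.0
    for xn in x:
        yn = (b[0] * xn + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2) / a[0]
        x2, x1 = x1, xn
        y2, y1 = y1, yn
        y.append(yn)
    return y


def bandpass(x, fs, high_pass_hz=0.01, low_pass_hz=0.5):
    """High-pass at high_pass_hz, then low-pass at low_pass_hz."""
    y = apply_biquad(x, *biquad("high", high_pass_hz, fs))
    return apply_biquad(y, *biquad("low", low_pass_hz, fs))


def gain(b, a, f_hz, fs):
    """Magnitude response of one section at frequency f_hz."""
    z = cmath.exp(-1j * 2.0 * math.pi * f_hz / fs)  # this is z^-1
    return abs((b[0] + b[1] * z + b[2] * z * z) / (a[0] + a[1] * z + a[2] * z * z))


if __name__ == "__main__":
    fs = 7.81  # sampling frequency reported for the present data
    b_hp, a_hp = biquad("high", 0.01, fs)
    b_lp, a_lp = biquad("low", 0.5, fs)
    for f in (0.001, 0.05, 0.1, 1.0):
        print(f, gain(b_hp, a_hp, f, fs) * gain(b_lp, a_lp, f, fs))
```

The printed gains show that slow drifts well below 0.01 Hz and the cardiac band around 1 Hz are attenuated, while the 0.02–0.10 Hz range used later passes almost unchanged.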
In the last step, before interpersonal neural synchrony analyses, we once more convert the filtered OD data to concentration changes in HbO and HbR (µmol; illustrated in Figure 6). This time, we use the spm_fnirs_calc_hb function from the SPMforfNIRS toolbox, as it allows for age-dependent modification of the modified Beer–Lambert Law. In the current sample, children’s mean age rounded to 5 years and mothers’ mean age rounded to 36 years. The age-dependent parameters are automatically calculated in the GUI version of the SPMforfNIRS toolbox and result in the following parameters: molar absorption coefficients [1/(mM*cm)] of [1.4033 3.8547; 2.6694 1.8096] for both mother and child; differential path length factor of [5.5067 4.6881] for the child, and [6.4658 5.4036] for the mother.
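The modified Beer–Lambert Law relates the OD change at each wavelength to the concentration changes via ΔOD(λ) = [ε_HbO(λ)·ΔHbO + ε_HbR(λ)·ΔHbR]·d·DPF(λ), so two wavelengths give a 2 × 2 linear system. The Python sketch below solves that system with the parameters listed above. Several points are assumptions for illustration and not taken from the toolbox: that the rows of the coefficient matrix correspond to 760 and 850 nm and the columns to HbO and HbR, that the source-detector distance d is 3 cm, and the function names themselves.

```python
# Molar absorption coefficients in 1/(mM*cm), values as reported in the text.
# Assumed layout: rows = wavelengths (760 nm, 850 nm), columns = (HbO, HbR).
EXT = [[1.4033, 3.8547],
       [2.6694, 1.8096]]
DPF_CHILD = [5.5067, 4.6881]   # per wavelength, as reported in the text
DPF_MOTHER = [6.4658, 5.4036]  # per wavelength, as reported in the text


def hb_to_od(hbo, hbr, dpf, distance_cm=3.0, ext=EXT):
    """Forward model: OD changes at both wavelengths for given changes (mM)."""
    return [(ext[k][0] * hbo + ext[k][1] * hbr) * distance_cm * dpf[k]
            for k in range(2)]


def od_to_hb(d_od, dpf, distance_cm=3.0, ext=EXT):
    """Invert the 2 x 2 system; returns (HbO, HbR) concentration changes in mM."""
    a = ext[0][0] * distance_cm * dpf[0]
    b = ext[0][1] * distance_cm * dpf[0]
    c = ext[1][0] * distance_cm * dpf[1]
    d = ext[1][1] * distance_cm * dpf[1]
    det = a * d - b * c
    hbo = (d * d_od[0] - b * d_od[1]) / det
    hbr = (-c * d_od[0] + a * d_od[1]) / det
    return hbo, hbr


if __name__ == "__main__":
    od = hb_to_od(0.001, -0.0003, DPF_CHILD)
    print(od_to_hb(od, DPF_CHILD))   # recovers the input changes
    print(od_to_hb(od, DPF_MOTHER))  # same OD, longer assumed path
```

The second call shows why the age-dependent path length factor matters: the same OD change maps to a smaller concentration change when the assumed optical path is longer.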
To calculate interpersonal neural synchrony, several methods have been used, such as Pearson correlation and Robust Correlations [19]. We rely on Wavelet Transform Coherence (WTC), the most commonly used method to estimate interpersonal neural synchrony in fNIRS hyperscanning paradigms [21]. WTC is used to assess the relation between the individual fNIRS time series in each dyad and each channel as a function of frequency and time. We estimate WTC using the cross wavelet and wavelet coherence toolbox [22].
Based on earlier literature [25], visual inspection, and spectral analyses, a frequency band of interest needs to be identified. In previous publications on interpersonal neural synchrony during parent–child problem solving, the frequency band of 0.02–0.10 Hz (corresponding to ~10–50 s) was identified as most task-relevant. Accordingly, the same frequency band is used for the present sample. This frequency band should avoid high- and low-frequency physiological noise such as respiration (~0.2–0.3 Hz) and cardiac pulsation (~1 Hz).
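Because wavelet outputs are typically indexed by period rather than frequency, the band has to be converted before rows can be selected. The following Python sketch shows the conversion (period = 1/frequency) and the selection of period rows inside the band; the function names and the example period values are hypothetical.

```python
def band_to_periods(f_low_hz, f_high_hz):
    """Convert a frequency band (Hz) to the matching period range (s)."""
    return 1.0 / f_high_hz, 1.0 / f_low_hz


def rows_in_band(periods, f_low_hz=0.02, f_high_hz=0.10):
    """Indices of the period rows that fall inside the frequency band."""
    p_min, p_max = band_to_periods(f_low_hz, f_high_hz)
    return [i for i, p in enumerate(periods) if p_min <= p <= p_max]


if __name__ == "__main__":
    print(band_to_periods(0.02, 0.10))            # roughly 10 s to 50 s
    print(rows_in_band([2, 8, 12, 30, 48, 64]))   # rows for 12, 30 and 48 s
```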
Furthermore, coherence values outside the cone of influence need to be excluded in the WTC analysis due to likely estimation bias. Average neural coherence (i.e., interpersonal neural synchrony) can then be calculated for each channel combination and epoch. Epoch estimations, and, therefore, epoch length, are defined by the physical constraints of time-frequency analyses, i.e., the minimal time needed to estimate an appropriate coherence value for the indicated frequency range. Three oscillations at a given frequency are required for WTC estimations [22]. For example, an epoch for a period of 2 s must, therefore, be at least 6 s long. To ensure reliable coherence value estimation, most studies have opted for averaging coherence values over entire conditions. However, a few studies have also calculated neural synchrony in smaller epochs to study how it behaves over time [35]. Here, we average coherence over the entire condition due to the broad frequency band and relatively short condition duration. Condition onset time can be extracted from the variable s included in the initial *.nirs files. Condition durations were manually added.
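The three-oscillation rule translates directly into a minimal epoch length, which can be expressed in seconds or in samples. The short Python sketch below is only an arithmetic illustration; the function names are hypothetical and the sampling frequency of 7.81 Hz is the one reported for the present data.

```python
import math


def min_epoch_seconds(period_s, cycles=3):
    """Shortest epoch (s) that contains `cycles` full oscillations."""
    return cycles * period_s


def min_epoch_samples(period_s, fs, cycles=3):
    """Shortest epoch in samples, rounded up to a whole sample."""
    return math.ceil(cycles * period_s * fs)


if __name__ == "__main__":
    print(min_epoch_seconds(2))        # the 2-s example from the text
    print(min_epoch_samples(2, 7.81))  # same epoch expressed in samples
```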
To run the WTC function, we extract each individual’s time-series for each channel and combine that time-series array with another column including the timestamps of each row—and, thus, the information on the data’s sampling rate. The timestamp array, if not already available, can be created using the following formula:
t = (0:(1/sampling_rate):((size(time_series, 1) - 1)/sampling_rate))'
We then input the following variables into the function wtc([t, hboSub1(:, i)], [t, hboSub2(:, i)], 'mcc', 0), which translates into timestamps and fNIRS time-series of HbO of Subject 1, timestamps and fNIRS time-series of HbO of Subject 2, and the count for a Monte Carlo simulation. The Monte Carlo simulation is generally used to estimate the significance of coherence values; we set its count to zero to increase computational speed. The output variables include Rsq (coherence values), period (period/frequency band), and coi (cone of influence). As we need to calculate coherence for each channel combination separately, the rows comprising the frequency band of interest are extracted once for each dyad and used for all channel combinations. The information in coi is used to set values outside of the cone of influence to missing values (NaN). To sum up the estimation, interpersonal neural synchrony is calculated by averaging the variable Rsq over the frequency band of interest (y-axis) and task duration (x-axis), which results in a coherence value for each channel combination in each dyad.
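The averaging step can be sketched as follows, in Python rather than MATLAB and on plain nested lists. The sketch assumes that rows of Rsq correspond to the entries of period, that columns correspond to time points, and that coi holds, for each time point, the longest period still inside the cone of influence; this layout and the function name mean_coherence are assumptions for illustration, not a description of the toolbox internals.

```python
import math


def mean_coherence(rsq, periods, coi, band=(10.0, 50.0), cols=None):
    """Average coherence over a period band and a range of time points.

    rsq[i][j]: coherence at period index i and time index j
    periods[i]: period (s) of row i
    coi[j]: assumed longest reliable period (s) at time index j
    cols: optional (start, stop) column range for one task phase
    """
    start, stop = (0, len(coi)) if cols is None else cols
    total, count = 0.0, 0
    for i, period in enumerate(periods):
        if not (band[0] <= period <= band[1]):
            continue  # row outside the frequency band of interest
        for j in range(start, stop):
            if period > coi[j]:
                continue  # outside the cone of influence, treated as missing
            value = rsq[i][j]
            if math.isnan(value):
                continue
            total += value
            count += 1
    return total / count if count else float("nan")


if __name__ == "__main__":
    periods = [5.0, 20.0, 40.0]
    rsq = [[0.9, 0.9, 0.9],
           [0.2, 0.4, 0.6],
           [0.8, 0.8, 0.8]]
    coi = [10.0, 30.0, 30.0]
    print(mean_coherence(rsq, periods, coi))
```

In the toy example only the two values at a period of 20 s and the last two time points survive both the band selection and the cone-of-influence mask, so the result is their mean.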
2.7. Control Analysis
There are various approaches for additional control analyses. Here, we describe an approach in which we pair a random mother time-series with each child’s time-series and bootstrap the pairing 100 times with replacement. Accordingly, 100 WTC values are computed per channel combination and dyad combination with a fixed child, resulting in 2000 values overall. The WTC values thereby obtained are then averaged over each channel combination and pairing and result in a coherence value for each channel combination and child–caregiver dyad. Each true WTC value, therefore, is compared against a “randomly generated” value.
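The random-pairing logic for a single channel combination can be sketched in Python as follows. The coherence function is passed in as a placeholder (coherence_fn is hypothetical and would stand for the WTC averaging described above), and excluding the child's own mother from the random draw is an assumption of this sketch that the description above does not specify.

```python
import random


def random_pair_control(children, mothers, coherence_fn, n_boot=100, seed=0):
    """Average coherence of each child with randomly drawn mothers.

    children, mothers: lists of time series, index k = dyad k
    coherence_fn: placeholder callable (child_series, mother_series) -> float
    """
    rng = random.Random(seed)  # fixed seed so the control values are reproducible
    control = []
    for child_idx, child in enumerate(children):
        # Assumption: the true mother is excluded from the candidate pool.
        candidates = [m for m_idx, m in enumerate(mothers) if m_idx != child_idx]
        # Draw with replacement: the same mother can be picked repeatedly.
        values = [coherence_fn(child, rng.choice(candidates))
                  for _ in range(n_boot)]
        control.append(sum(values) / len(values))
    return control


if __name__ == "__main__":
    # Toy data: each "time series" is one number, coherence is a dummy score.
    kids = [[float(k)] for k in range(20)]
    moms = [[float(k)] for k in range(20)]
    dummy = lambda c, m: 1.0 / (1.0 + abs(c[0] - m[0]))
    print(random_pair_control(kids, moms, dummy)[:3])
```

With 20 dyads and 100 draws per child, the placeholder function is evaluated 2000 times, matching the count given above, and the output holds one averaged control value per child.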
2.8. Statistical Analysis
For statistical analysis, the data is exported in a long data format and imported into RStudio (RStudio Team, 2015). Coherence values are bounded by 0 and 1, and we therefore assume a beta distribution. While a Fisher’s r-to-z transformation would be appropriate and indicated for subsequent analyses with a linear (mixed) model assuming a Gaussian data distribution, we use the untransformed coherence values to calculate generalized linear mixed models (GLMM) using the package and function glmmTMB. We chose to cluster individual channels into regions of interest to enhance the reliability of interpersonal neural synchrony measures in specific cortical regions, even across minor differences in cap fit and, thus, probe settings [20]. Post-hoc contrasts are conducted using the function and package emmeans and corrected for multiple comparisons using Tukey’s Honest Significant Difference [39]. The distributions of residuals are visually inspected for each model. Models are estimated using Maximum Likelihood. Model fit is compared using a Chi-Square Test (likelihood ratio test; [40]).
In the current data set, we chose to focus on the analysis of HbO only, for both brevity and its consistent association with neural synchronization (e.g., [25]). However, we recommend repeating all analyses for HbR (and ideally Hb total) to derive a comprehensive picture of all possible neural synchronization processes [41].