Time-Frequency Analysis of Mu Rhythm Activity during Picture and Video Action Naming Tasks

This study used whole-head 64-channel electroencephalography to measure changes in sensorimotor activity, as indexed by the mu rhythm, in neurologically healthy adults during subvocal confrontation naming tasks. Independent component analyses revealed sensorimotor mu component clusters in the right and left hemispheres. Event-related spectral perturbation analyses indicated significantly stronger patterns of mu rhythm activity (pFDR < 0.05) during the video condition as compared to the picture condition, specifically in the left hemisphere. Mu activity is hypothesized to reflect typical patterns of sensorimotor activation during action verb naming tasks. These results support further investigation into sensorimotor cortical activity during action verb naming in clinical populations.


Introduction
Studies show that action observation and action execution elicit similar patterns of cortical activity in sensory and motor regions of the brain [1][2][3]. Mirror neurons, first discovered in area F5 of macaque monkeys [3], and shortly thereafter in humans in Broca's area, the inferior frontal gyrus (IFG), and the inferior parietal lobule (IPL), are considered the "neural mechanism" that links observation with execution and drives imitative motor learning processes [1,[4][5][6][7]. Specifically, mirror neuron activity occurs in response to a variety of perceptual conditions including, but not limited to, viewing pictures and videos of hand, leg, and foot actions [4,8,9], emotional facial expressions [10,11], oropharyngeal swallowing movements [12,13], and orofacial speech movements [14]. Recently, additional cortical areas such as the dorsal premotor cortex (PMC), superior parietal lobule (SPL), middle temporal gyrus (MTG), superior temporal sulcus (STS), anterior cingulate cortex (ACC), and the insula have also been shown to activate during action observation-execution experiments (see Rizzolatti & Craighero [1] for a detailed review). Evidence of this so-called "extended mirror neuron network" has led to more recent proposals that mirror neuron activity goes beyond motor learning by associating sensorimotor representations of actions with corresponding cognitive-linguistic representations of actions (i.e., action names).

Mirror Activity in Response to Action Verbs
Evidence for mirror neuron activity during linguistic processing of action verbs has steadily increased in the past decade. Studies have shown that simply reading an action verb such as kick or lick activated areas of the premotor cortex that control the execution of leg and face movements [15,16]. A seminal study conducted by Pulvermuller, Harle, and Hummel [17] provided evidence for the localization of mirror neuron activity within the primary motor cortex, corresponding to the particular body part(s) that perform the action. Specifically, during a lexical decision task, leg-related verbs activated the vertex of the motor strip, while face-related verbs activated areas over the left Sylvian fissure near the representation of the articulators [17]. Overlap between the neural substrates responsible for action observation, action execution, and the linguistic processing of action verbs was further demonstrated in a study by Andric and colleagues [18] that used magnetic resonance imaging (MRI) to compare the location and strength of mirror neuron activity during the observation of symbolic emblems (e.g., thumbs up), grasping actions (e.g., holding a cup), and speech (e.g., "it's good"). The authors found activity in the STS and MTG, in addition to the classical frontal and parietal mirror neuron areas, thereby demonstrating shared sensorimotor and linguistic neural substrates for gestures and speech production [18]. Another neuroimaging study provides evidence that handedness factors into the hemispheric localization of mirror neuron activity. A functional magnetic resonance imaging (fMRI) study of left- and right-handed neurologically healthy adults revealed that mirror neuron activity in the premotor cortex was stronger in the hemisphere contralateral to participants' dominant hands [19]. This pattern was seen when participants made lexical decisions regarding the names of manual-action words, as well as when participants imagined themselves performing the actions. Such findings led the authors to conclude that the semantic aspects of manual-action verb processing may vary in accordance with the way in which an individual typically performs actions, further supporting an individualized interaction between the sensorimotor representation (mirror neurons) of actions and the linguistic (semantic) representation of action verbs.

Stimuli Effects
Despite the number of studies that have documented mirror neuron activity in response to viewing, imagining, performing, and linguistically processing actions, it must be acknowledged that the quality and mode of stimulus delivery yield contrasting results in terms of the localization, strength, and timing of activity. For instance, weaker patterns of mirror neuron activity have been found in response to viewing pictures as compared to viewing videos of hand actions [20]. However, different modes of video display have elicited similar patterns of mirror neuron activity including, but not limited to, animated physiology videos [21], point-light biological action videos [22], cartoon videos of action execution [23], and videos of humans performing actions [24,25]. Pictures or videos depicting non-goal-related actions (e.g., hand extension) [26], biologically irrelevant actions (e.g., barking) [27], or incomplete actions whereby the action does not reach the desired end-point [28] have been shown to elicit significantly weaker patterns of mirror neuron activity. Such evidence indicates that mirror neuron activity varies under different conditions, although in general the more realistic the stimulus (i.e., actions that are biologically relevant, complete, and presented in videos), the stronger the response.
Just as the nature of the action stimuli may influence the strength of mirror neuron activity, the nature of the linguistic stimuli may also reveal differences in the strength or location of mirror neuron activity. For example, transitive verbs elicit stronger patterns of mirror neuron activity than intransitive verbs, particularly in the posterior regions of the parietal lobe and in the left inferior frontal gyrus [29]. Consistent with previous literature comparing videos and pictures, the difference between transitive and intransitive verbs was even greater when the actions were presented in video segments as compared to when actions were represented in line drawings. Specifically, greater activation was seen in the right inferior and superior parietal cortices when the action verbs were observed in videos [29]. These results suggest mirror neuron activity is specifically related to actions performed on an object, and thus will not necessarily be activated during the processing of all verbs.
The study by den Ouden and colleagues [29] is noteworthy for a methodological change compared with previous investigations of action observation. Whereas participants in previous studies performed comprehension tasks such as reading or lexical decision, the participants in the den Ouden et al. [29] study verbally named the actions presented in pictures and videos. Differences in the activation patterns seen in fMRI during picture and video presentation suggest that videos are a more natural representation of actions and require less processing, while pictures require more visual scanning and strategy due to the lack of movement. Furthermore, video observation led to activation of Wernicke's area, which was not activated during picture observation; thus, linguistic processing areas were engaged during the more natural video presentation of actions. Because participants verbally named the actions, the den Ouden et al. [29] study extends previous literature on mirror neuron activity from comprehension tasks to verbal tasks. However, more research is needed to understand the role of mirror neuron activity (i.e., sensorimotor activity) in word retrieval: specifically, identifying when sensorimotor processes are active in the time course of action verb naming.

Neurophysiological Measures
Mirror neuron activity has been investigated using a variety of neuroimaging techniques, such as magnetic resonance imaging (MRI), functional magnetic resonance imaging (fMRI), electroencephalography (EEG), and magnetoencephalography (MEG), to determine the location and strength of sensorimotor processing in a variety of action-observation, action-execution, and verb processing tasks. An additional measure of interest, particularly in relation to verb processing, is the temporal dynamics of mirror neuron activity. In 2004, Hauk and Pulvermuller [30] measured the time course of neural activity during passive reading of verbs describing face, arm, and leg actions by analyzing event-related potentials (ERPs) using EEG. The passive reading of action verbs resulted in somatotopically organized activation of areas within the motor cortex approximately 200 ms after stimulus onset. This timing of mirror neuron activity during verb reading matches the word processing model proposed by Indefrey and Levelt [31], in which lexical-semantic access occurs within 200 ms of stimulus presentation. A meta-analysis conducted by Indefrey and Levelt [31], which included 82 experiments, revealed consistent findings for the spatial and temporal aspects of word production. Using the reaction time and ERP data as predictors, the authors analyzed the findings of MEG studies to determine whether the predicted time course of word production aligned with the spatial components of word production. The results of the meta-analysis supported the hypothesized time course of lexical access at 175-250 ms, phonological retrieval at 250-330 ms, and syllabification at 330-445 ms. Thus, the results presented by Hauk and Pulvermuller [30] provide evidence that the action semantics of the action words were accessed during lexical-semantic processing.
Given the assumption that action semantics are supported by mirror neuron activity as the mechanism for the sensorimotor representation of actions, these findings are further evidence to support a link between the sensorimotor representation and the linguistic representation of actions. Furthermore, the identification of the time course of mirror neuron activity during action verb processing suggests the sensorimotor representation of actions is linked to, or even a component of, the semantic representation of action verbs. However, ERP studies do not depict changes in neural activity across the time course of lexical processing. Therefore, a more specific means of analyzing the time course of mirror neuron activity during naming tasks may elucidate sensorimotor mechanisms that are thought to support action semantics.

Mu Rhythm Activity
A number of electrophysiological studies using whole-head EEG and MEG imaging techniques have shown that mu rhythm suppression provides a valid measure of mirror neuron activity. The combined "mu rhythm" is characterized by an initial spectral peak in the ~10 Hz "alpha" frequency band and a second spectral peak in the ~20 Hz "beta" frequency band, and is considered a global measure of cortical sensorimotor activity [32][33][34][35][36][37]. As with other rhythms, mu rhythm activity is measured via time-locked decreases and increases in the signal amplitude reflecting desynchronization (i.e., neural activity) or synchronization (i.e., neural inhibition or neural idling), respectively [38,39].
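The desynchronization/synchronization convention above is commonly quantified as the percent change in band power between an activity window and a reference (baseline) window. The sketch below illustrates that computation with NumPy; it is a hypothetical minimal example of the standard ERD/ERS formula, not the analysis pipeline used in this study.

```python
import numpy as np

def erd_ers_percent(activity_power, baseline_power):
    """Percent change in band power relative to baseline: negative values
    indicate desynchronization (ERD, i.e., neural activity); positive
    values indicate synchronization (ERS, i.e., inhibition or idling)."""
    a = np.mean(activity_power)
    r = np.mean(baseline_power)
    return (a - r) / r * 100.0

# Toy example: mu-band power dropping from 4.0 to 3.0 is a 25% ERD
print(erd_ers_percent([3.0], [4.0]))  # -25.0
```

In practice the two windows would hold band-power estimates averaged over trials; the sign convention is what matters here.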
Independent component analysis (ICA) has been shown to separate and sort EEG signals into temporally independent and spatially fixed components [40]. However, one of the limitations of EEG signal analysis is the presence of movement artifacts due to eye blinks and gross motor movements [41,42]. Recently, ICA has been shown to effectively identify and remove movement artifacts from neural signal data by using higher-order statistics, such as kurtosis [43,44].
A number of studies show that ICA can be used to identify sensorimotor activity via localized mu components. Clusters of mu components have been identified and localized during the perception or imagination of movement in premotor and/or primary sensorimotor cortices [45][46][47][48][49]. In addition to using ICA to spatially map mu rhythm activity, patterns of mu ERS/ERD may be analyzed using event-related spectral perturbations (ERSPs). ERSPs provide a means to visually analyze patterns of ERS/ERD in frequency bands across time, relative to the onset of a particular time-locked event [39,40]. ERSPs generate a color-coded time-frequency graphic depicting the average change in spectral power between the baseline time period and the experimental condition time period, plotted across subjects and conditions [50,51]. Hence, ERSPs can be used to compare and contrast dynamic changes in mirror neuron activity, exhibited by clusters of mu rhythm independent components with spectral peaks at ~10 Hz (mu-alpha) and ~20 Hz (mu-beta), respectively.
Recent studies have successfully identified mu rhythm activation during the processing of action semantics, whether presented visually, auditorily, or orthographically. In 2013, Moreno, Vega, & Leon [52] found that action language modulated mu rhythm suppression similarly to action observation. Participants performed a recognition task with action sentences, abstract sentences, and action videos, in which they were asked to press a button if the sentence or video had been previously seen. Results showed greater mu suppression during the action video and action sentence conditions compared to the abstract sentence condition. Furthermore, there was no significant difference between the action video and action sentence conditions, suggesting that the comprehension of action words activates the same motor areas as action observation. In a follow-up study, time-frequency analyses were used to determine differences in the time course of mu suppression when participants read action sentences, abstract sentences, and perceptive (sensory) sentences. Consistent with the previous study, mu suppression was observed only during the processing of action sentences [53]. Patterns of mu ERD have also been found during action word reading tasks in bilingual individuals. Vukovic & Shtyrov [54] compared patterns of mu ERD during passive reading of action words by German-English bilinguals. While mu ERD was observed in both languages, significantly stronger patterns of mu ERD were observed when reading in the primary language. The authors suggest a stronger engagement between motor and language systems in the primary language because the action semantics were first learned through that language [54]. Together, these studies support the use of mu rhythm activity to measure sensorimotor processing during action language tasks, and the notion that action language is embodied.
To date, the strongest evidence for mirror neuron activity during action verb processing involves observation of biologically-relevant actions presented in video form with names that are transitive verbs. However, previous studies have not used ERSPs to measure mu rhythm activity in these conditions, nor during naming tasks. More specific comparisons of the strength of mu rhythm activity (i.e., sensorimotor activity) across time during picture and video verb naming will further illustrate how different stimulus presentations of actions may affect the timing of word production processing. If mirror neuron activity occurs before lexical-semantic processing is theorized to occur (200 ms), then sensorimotor representations of actions may support action comprehension; however, if mirror neuron activity occurs at the same time as linguistic processing, then sensorimotor representations may be hypothesized to support lexical access, providing evidence of a direct link between sensorimotor representations and lexical-semantic representations of actions. Such a link would have implications for the assessment and treatment of action verb impairments in acquired communication disorders, such as aphasia. In order to investigate differences in the neurophysiological response to action verbs presented in video and picture forms, this study used whole-head electroencephalography (EEG) to measure changes in the amplitude and timing of mu rhythm activity in neurologically healthy adults during a confrontation naming task of actions depicted in videos and pictures. Specifically, this study aims to (1) use ICA to identify bilateral mu components during subvocal naming of action pictures and videos, and (2) use ERSPs to compare and contrast the strength and timing of sensorimotor activity during subvocal naming of action pictures and videos.

Participants
A repeated measures design was used to analyze patterns of mu rhythm ERS/ERD obtained from 21 neurologically healthy adults during the subvocal naming of actions depicted in videos and pictures. The experimental protocol was approved by the Midwestern University Institutional Review Board (IRB), and all participants gave informed consent prior to participating in the study. Participants had no self-reported history of developmental or acquired cognitive or communication impairments. Additionally, each participant completed a demographic questionnaire to provide information regarding handedness, language, age, and gender. Table 1 presents the participant demographics. The short form of the Boston Naming Test (BNT) [55] and the Montreal Cognitive Assessment (MoCA) [56] were also administered to screen for cognitive-linguistic deficits that might interfere with performing the action verb naming task. All participants scored as unimpaired on the BNT and MoCA (see Table 1 for average scores).

Stimuli
Videos were taken from an online database of hand-related actions [57] and edited using iMovie [58] to create a 2-s stimulus representing each action. Subsequently, screenshots of the videos were taken to create corresponding pictures for each action (see Figure 1 for an example). Of the 1074 video clips of actions available in the online database created by Umla-Runge and colleagues [57], 22 biologically-relevant action verb video clips were initially selected as stimuli for this study based on three criteria: recognizability of the action in picture and video format, name agreement, and similar psycholinguistic values for concreteness, familiarity, imageability, and frequency. First, three independent reviewers identified still shots that could easily be recognized as the action performed in the video; actions agreed upon by all reviewers were selected. Name agreement data were collected from a separate normative sample of 20 neurologically healthy adults who were asked to view the action pictures and videos and report all possible names for the actions depicted. Action videos and pictures that did not yield 100% name agreement were excluded from the study. Lastly, the actions that met the criteria for recognition in pictures and videos and for name agreement were analyzed for psycholinguistic values. Because the number of stimuli was limited, the aim was to include actions with relatively similar psycholinguistic values using averages and standard deviations. Values for concreteness, frequency, familiarity, and imageability were initially taken from the MRC Psycholinguistic Database [59], which reports Francis and Kucera [60] values for frequency and concreteness. The updated frequency and concreteness values by Brysbaert and New [61] and Brysbaert, Warriner, and Kuperman [62] were subsequently collected.
In all, 20 action pictures and videos were retained as experimental stimuli. The final 20 action verbs and the values for the psycholinguistic criteria are presented in Appendix A (Table A1). All actions included were transitive verbs, based on results from previous research suggesting that actions upon objects evoke a stronger mirror neuron response [29]. All stimuli were formatted for presentation on a 21.5-inch HD Dell computer screen located approximately 3 feet in front of the participant.

Data Collection
EEG data were collected in a double-walled, soundproof audio booth using E-Prime 2.0 software [63] to present and time-lock action picture and video stimuli during the continuous recording of EEG data. Event-related EEG data were collected using 64 electrodes via a 64-channel HydroCel geodesic sensor net (HCGSN), arranged in accordance with the international 10-20 system. EEG channel data were recorded using EGI Netstation 4.5.7 software (Electrical Geodesics, Inc., Eugene, OR, USA). The following time-locked events were marked during each trial: (a) a 2000 ms inter-trial interval with a blank screen, (b) a 1000 ms fixation cross, (c) a 2000 ms picture or video presentation of an action that participants were asked to silently name in their head, (d) a 100 ms fixation cross, (e) a 2000 ms picture or video presentation of an action that participants were asked to name aloud, and (f) a 1000 ms presentation of a stop sign. Participants were asked to subvocally name the first presentation of the action picture or video in order to minimize any potential noise associated with movement artifacts. Subsequently, participants were asked to name aloud the second presentation of the same action picture or video so as to encourage consistency in action naming across all trials.
Videos and pictures were randomly presented within separate blocks, and the order of the stimulus condition blocks (pictures vs. videos) was also randomized for each participant. Each block presented all 20 verbs, and each block was repeated three times, resulting in 60 trials per participant per stimulus condition. A break was offered to participants after each block to prevent fatigue.

Data Analysis
Using EEGLAB, an open source Matlab toolbox [64], raw EEG data were downsampled to 256 Hz, and channels were registered using BESA head model coordinates. Data were re-referenced to the common average reference and band-pass filtered between 7 and 30 Hz. The data were epoched to begin 1000 ms prior to the time-locked presentation of the pictures or videos that demarcated the onset of subvocal naming (i.e., time point zero) and to end at 2000 ms, yielding a 1000 ms baseline and a total of 3000 ms of data in each trial. Epochs were subsequently visually inspected, and those contaminated with noise due to movement were rejected from the dataset. A minimum of 40 epochs was retained per condition for each participant.
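The preprocessing steps above (7-30 Hz band-pass filtering and epoching from 1000 ms before to 2000 ms after time point zero at 256 Hz) can be sketched as follows. This is a simplified illustration on synthetic data with hypothetical helper functions, not the EEGLAB code used in the study.

```python
import numpy as np
from scipy.signal import butter, filtfilt

FS = 256  # sampling rate after downsampling (Hz)

def bandpass(data, lo=7.0, hi=30.0, fs=FS, order=4):
    """Zero-phase 7-30 Hz band-pass spanning the mu-alpha and mu-beta bands."""
    b, a = butter(order, [lo / (fs / 2), hi / (fs / 2)], btype="band")
    return filtfilt(b, a, data, axis=-1)

def epoch(data, event_samples, fs=FS, tmin=-1.0, tmax=2.0):
    """Cut continuous (channels x samples) data into trials spanning
    tmin..tmax seconds around each time-locked stimulus onset."""
    pre, post = int(-tmin * fs), int(tmax * fs)
    return np.stack([data[:, s - pre:s + post] for s in event_samples])

# Toy continuous recording: 64 channels, 30 s of noise, 3 "stimulus onsets"
rng = np.random.default_rng(0)
cont = bandpass(rng.standard_normal((64, 30 * FS)))
epochs = epoch(cont, event_samples=[5 * FS, 10 * FS, 15 * FS])
print(epochs.shape)  # (3, 64, 768): 3 trials x 64 channels x 3000 ms at 256 Hz
```

Epoch rejection would then drop trials whose amplitude exceeds a movement-artifact threshold before ICA.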
Following preprocessing, each condition dataset was concatenated and underwent ICA [65] using the extended version of the "runica" algorithm in the EEGLAB v13.4 (SCCN, San Diego, CA, USA) ICA toolbox, yielding 64 independent components (ICs) per condition for each participant [50]. After "unmixing" the signal data to obtain the ICs, which are considered spatially fixed plots of component activity projected onto topographic scalp maps [64], group analyses were performed using the STUDY toolbox in EEGLAB v13.4.
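Conceptually, ICA assumes the recorded channel data are an unknown linear mixture of spatially fixed sources and estimates an unmixing matrix that recovers the component activations. The toy sketch below illustrates that linear model with a known mixing matrix rather than estimating it from the data, as runica's extended infomax algorithm actually does.

```python
import numpy as np

t = np.linspace(0, 2, 512)

# Two toy "source" activations: a 10 Hz mu-alpha-like and a 20 Hz mu-beta-like rhythm
S = np.vstack([np.sin(2 * np.pi * 10 * t),
               np.sin(2 * np.pi * 20 * t)])

# ICA's generative model: channel data X are an unknown linear mix A of sources S
A = np.array([[1.0, 0.5],
              [0.3, 1.0]])
X = A @ S

# runica estimates an unmixing matrix W ~= inv(A) from the data alone;
# here we use the true inverse purely to illustrate the model
W = np.linalg.inv(A)
S_hat = W @ X                  # recovered component activations (rows)
scalp_maps = np.linalg.inv(W)  # columns give each IC's fixed scalp projection

print(np.allclose(S_hat, S))  # True
```

The columns of the inverse unmixing matrix are what EEGLAB plots as each component's topographic scalp map.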

For study group analyses, component measures were precomputed based upon the spectral characteristics of IC activity and the scalp map spatial distribution of each IC [65]. A "pre-clustering array" was created to instruct the principal component analysis (PCA) to cluster ICs demonstrating similar spectral and scalp map characteristics across participants [64]. Independent components that exhibited incongruent topographic and spectral characteristics were excluded from clusters. Following IC clustering, each IC included in the right (R) and left (L) hemisphere mu clusters was visually inspected to ensure that it met the following criteria: (1) topographic scalp maps exhibited a localized pattern of signal activity in the appropriate (R vs. L) hemisphere, and (2) power spectra displayed peaks in the alpha (~10 Hz) and beta (~20 Hz) frequency ranges. In addition, other component clusters were individually inspected to ensure that all mu components were appropriately assigned during PCA clustering.
To analyze the power of ERS/ERD in the alpha (8-13 Hz) and beta (15-25 Hz) bands, or the combined mu rhythm, exhibited across time by the ICs included in the R and L mu clusters, ERSPs were obtained within the 7-30 Hz frequency range. The time-frequency analyses were computed using a Morlet sinusoidal wavelet transformation, set at 3 cycles and linearly increasing to 20 cycles at 30 Hz [57]. Dynamic changes in the power of alpha and beta frequency ERS/ERD were of particular interest, beginning 1000 ms prior to the time-locked event (i.e., the onset of picture/video subvocal naming) and continuing up to 2000 ms following the event (i.e., throughout the duration of the epoched trials). ERSPs were computed relative to the baseline, which was computed from 200 randomly sampled latency windows from each inter-trial interval. To statistically analyze conditional effects between (1) subvocal naming of action pictures and (2) subvocal naming of action videos, EEGLAB bootstrapping statistics were employed with an alpha value set at 0.05 [66,67]. False discovery rate (FDR) corrections were applied to adjust for multiple hypotheses [68].
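A baseline-relative ERSP of the kind described above can be sketched with a hand-rolled Morlet wavelet transform whose length grows from 3 cycles at 7 Hz to 20 cycles at 30 Hz. This is a simplified single-trial illustration on synthetic data, not the EEGLAB implementation, and the wavelets are unnormalized since only baseline-relative power matters here.

```python
import numpy as np

def morlet_power(signal, fs, freqs, n_cycles):
    """Per-frequency power via convolution with complex Morlet wavelets
    whose length (in cycles) grows with frequency."""
    out = np.empty((len(freqs), len(signal)))
    for i, (f, nc) in enumerate(zip(freqs, n_cycles)):
        sigma_t = nc / (2 * np.pi * f)  # wavelet temporal s.d.
        t = np.arange(-3 * sigma_t, 3 * sigma_t, 1 / fs)
        wavelet = np.exp(2j * np.pi * f * t) * np.exp(-t**2 / (2 * sigma_t**2))
        out[i] = np.abs(np.convolve(signal, wavelet, mode="same")) ** 2
    return out

fs = 256
freqs = np.linspace(7, 30, 24)
n_cycles = 3 + (freqs - 7) / (30 - 7) * (20 - 3)  # 3 cycles at 7 Hz -> 20 at 30 Hz

sig = np.sin(2 * np.pi * 10 * np.arange(0, 3, 1 / fs))  # 3 s of a 10 Hz "mu" rhythm
power = morlet_power(sig, fs, freqs, n_cycles)

# ERSP in dB relative to a baseline window (here, the first 1000 ms)
baseline = power[:, :fs].mean(axis=1, keepdims=True)
ersp_db = 10 * np.log10(power / baseline)
print(power.shape)  # (24, 768)
```

In the actual pipeline, the baseline instead comes from randomly sampled inter-trial latency windows, and single-trial maps are averaged across trials and subjects before the bootstrap comparison.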

Results
As hypothesized, mu component clusters were identified bilaterally (see Figure 2), as indicated by spectra peaks at~10 Hz and~20 Hz, respectively. ERSP analysis of both the right and left mu clusters provide evidence of mu ERD during the action picture and video subvocal naming tasks (see Figures 3 and 4). Due to the statistical limitations of performing independent component analysis, activity in the right vs. left hemisphere could not be directly compared across participants. However, visual analysis of ERSP data indicate differences in the power of mu activity, with the left mu cluster clearly exhibiting more robust patterns of ERD in the alpha (~8-13 Hz) and beta (~15-25 Hz) frequency ranges. While the right cluster does not exhibit powerful patterns of mu ERD in the picture or video conditions, there were significant differences in the strength of activity (pFDR < 0.05) across the alpha and beta frequency ranges. In contrast, the left mu cluster exhibits significantly stronger patterns of mu ERD (pFDR < 0.05) in the video condition as compared to the picture condition, indicated by the increased number of time-frequency voxels that appear subsequent to time point zero (see Figures 3  and 4). For study group analyses, component measures were precomputed based upon the spectral characteristics of IC activity and the scalp map spatial distribution of each IC [65]. A "pre-clustering array" was created to instruct the principle component analyses (PCA) to cluster ICs demonstrating similar spectral and scalp map characteristics across participants [64]. Independent components that exhibited incongruent topographic and spectral characteristics were excluded from clusters. Following IC clustering, each IC that was included in right (R) and left (L) hemisphere mu clusters, were visually inspected to ensure that each IC met the following criteria: (1) topographic scalp maps exhibited a localized pattern of signal activity in the appropriate (R vs. 
L) hemisphere, and (2) power spectra displayed peaks in the alpha (~10 Hz) and beta (~20 Hz) frequency ranges. In addition, other component clusters were individually inspected to ensure that all mu components were appropriately assigned during PCA clustering.
To analyze the power of ERS/ERD in the alpha (8-13 Hz) and beta (15)(16)(17)(18)(19)(20)(21)(22)(23)(24)(25), or the combined mu rhythm exhibited by the ICs that were included in the R and L mu clusters across time, ERSPs were obtained within the 7-30 Hz frequency range. ERSP analyses were conducted. The time-frequency analyses were analyzed using a Morlet sinusoidal wavelet transformation, set at 3 cyles and linearly increasing to 20 cylces at 30 Hz [57]. Dynamic changes in the power of alpha and beta frequency ERS/ERD, occurring 1000 ms prior to the time-locked event (i.e. picture and video subvocal naming) and continuing up to 2000 ms following the time-locked event, were of particular interest (i.e. beginning 1000 ms prior to the onset of picture/video subvocal naming and continuing throughout the duration of the epoched trials). ERSPs were computed relative to the baseline, which was computed from 200 randomly sampled latency windows from each inter-trial interval. To statistically analyze conditional effects between (1) subvocal naming of action pictures, and (2) subvocal naming of action videos, EEGLAB bootstrapping statistics were employed with an alpha value set at 0.05 [66,67]. False discovery corrections (FDR) were applied to adjust for multiple hypotheses [68].

Results
As hypothesized, mu component clusters were identified bilaterally (see Figure 2), as indicated by spectral peaks at ~10 Hz and ~20 Hz. ERSP analyses of both the right and left mu clusters provide evidence of mu ERD during the action picture and video subvocal naming tasks (see Figures 3 and 4). Due to the statistical limitations of performing independent component analysis, activity in the right vs. left hemisphere could not be directly compared across participants. However, visual analysis of the ERSP data indicates differences in the power of mu activity, with the left mu cluster clearly exhibiting more robust patterns of ERD in the alpha (~8-13 Hz) and beta (~15-25 Hz) frequency ranges. While the right cluster did not exhibit powerful patterns of mu ERD in either the picture or video condition, there were significant differences in the strength of activity (pFDR < 0.05) across the alpha and beta frequency ranges. In contrast, the left mu cluster exhibited significantly stronger patterns of mu ERD (pFDR < 0.05) in the video condition as compared to the picture condition, indicated by the increased number of time-frequency voxels appearing after time point zero (see Figures 3 and 4).

Localization of Mu Rhythm ERD during Action Verb Naming
The primary aim of the current study was to use a novel means of analyzing the spatiotemporal dynamics of sensorimotor activity during the subvocal naming of static (i.e., picture) vs. dynamic (i.e., video) actions. A number of studies provide evidence of mu rhythm activity during action verb processing tasks [52][53][54]. As hypothesized, and in accordance with previous studies, the current study revealed bilateral mu component clusters during action word processing, specifically in subvocal naming conditions. These results are in line with previous EEG studies in which mu suppression was found to be greater during action language processing conditions [52][53][54]. To our knowledge, there is only one previous neurophysiological study of action picture versus action video naming. In an fMRI study of action naming by den Ouden and colleagues [29], stronger patterns of motor activity were elicited during video presentation compared to picture presentation. Additionally, the video condition elicited activity in Wernicke's area, suggesting that linguistic processing is facilitated by video presentation of actions but not by picture presentation.
In the current study, minimal differences were observed in the time course of mu activity across conditions. However, significant differences were revealed in the power or "strength" of activity. Specifically, dynamic action videos elicited significantly stronger patterns of mu activity, beginning at 0 ms and continuing throughout the duration of the naming task, which ended at 2000 ms. Considering the proposed functional link between action observation, action execution, and the lexical-semantic processing of actions [1], it seems reasonable to suggest that cortical sensorimotor activity, as indexed by mu rhythm activity, provides a common framework for the multimodal processing of actions. Furthermore, finding significant differences in the strength of mu activity in both the right and left hemispheres across conditions suggests that dynamic action videos may represent the most effective means to elicit cortical sensorimotor activity during action naming tasks. The remaining discussion focuses on the theoretical and clinical implications of these findings: specifically, the timing of the interaction between action observation and action semantics within the word production process, and the resulting clinical implications for the treatment of word production impairments.

Action Observation and Action Semantics
Across participants, the onset of sensorimotor activity during the subvocal naming tasks occurred at 200 ms, which, according to the timing model put forth by Indefrey and Levelt [31], falls within the proposed time period of lexical-semantic access. Similarly, Hauk and Pulvermuller [30] reported sensorimotor activation at 200 ms after presentation of an action word, and Pulvermuller [69] concluded that this sensorimotor activation occurred as part of lexical-semantic processing and not post-lexically. The results of the current study further support the hypothesis that sensorimotor representations of action observation and action execution are in fact activated during the linguistic processing of action verbs, suggesting that lexical-semantic access for action verbs does not solely involve linguistic representations, but also invokes supportive sensorimotor representations.
Several studies have sought to use sensorimotor activation to improve naming in persons with naming impairments through the use of gestures. These studies found improvements in naming through the training of gestures related to words (e.g., a gesture for drinking or cutting with scissors), although only when combined with verbal production of the word (see Rose, Raymer, Lanyon, & Attard [70] for a review). Other studies have reported improvement in naming when participants completed a meaningless gesture with the impaired or unimpaired arm, or when standing as compared to sitting (see Marangolo & Caltagirone [71] for a review). However, these studies did not solely focus on action naming and resulted in limited generalization. If processing of an action verb involves an interaction between the linguistic and the sensorimotor representation of that verb, treatments that invoke the supportive sensorimotor processes could improve word production. The results of this study suggest that videos elicit stronger patterns of sensorimotor activity during subvocal action naming as compared to pictures. Therefore, incorporating video stimuli during action verb naming tasks may provide enhanced access when the lexical-semantic system is damaged, further enhancing treatment effectiveness.
Several recent investigations of the effects of action observation on word production in persons with aphasia have reported positive behavioral effects. The first of these experiments, by Marangolo, Cipollari, Fiori, Razzano, & Caltagirone [72], compared the effects of intense verb retrieval training when participants either (1) observed actions, (2) observed and executed the same actions, or (3) observed actions and executed a meaningless movement. In all three conditions, the therapist produced the action and the participant produced the verb. Four of six participants with aphasia significantly improved verb production following the action observation condition and the combined observation and execution condition, but not when they observed a meaningless gesture, which implies that the link between actions and verbs is at the semantic level of verb processing. In a follow-up study by Marangolo and colleagues [73], persons with aphasia observed human and non-human actions and produced the corresponding verb. Participants improved, with maintenance effects, in naming the human actions only. This finding suggests that only biologically relevant actions have sensorimotor representations.
A study by Bonifazi et al. [74] replicated the stimulus conditions used by Marangolo et al. [73] but also included a novel condition in which the actions were presented in video clips rather than by a live model. Notably, improvement did not significantly differ when the action was observed from a video clip. Another intriguing result from the Bonifazi et al. [74] study is that only the individuals with lexical-phonological impairments improved, while the individuals with semantically-based verb impairments did not. To explain this finding, the authors hypothesized that the semantic deficit hindered the activation of the sensorimotor features of the verbs, and thus verb production did not improve. The difference in treatment outcomes for the semantic impairment group is further evidence for the hypothesis that the sensorimotor representation of actions interacts with the semantic representation of actions. In contrast, a study by Faroqi-Shah and Graham [75] reported that only one of two participants improved in action naming from video stimuli, and suggested the difference was related to phonological impairments of the second participant, which were not directly addressed by the treatment procedures (i.e., no phonological cueing). The authors also posited that lower premorbid education may have reduced treatment effects [75]. As such, the effectiveness of video-based verb training across types of aphasia warrants further examination.
In summary, there is growing evidence of the interaction between sensorimotor and linguistic representations of action verbs. Furthermore, treatments that evoke this interaction consistently report improvement in action naming. It has been understood for some time that noun and verb impairments are distinct and thus, the treatment of these linguistic units should be distinct [76]. Incorporating the sensorimotor representation of action verbs in treatment stimuli is one method of differentially treating action verbs.

Limitations and Future Directions
To our knowledge, this is the first study to use whole-head EEG analysis techniques to map cortical sensorimotor activity via mu rhythm ERS/ERD during subvocal action verb naming tasks. However, it must be acknowledged that there are inherent differences between covert and overt naming tasks, which limit the generalization of study findings to therapeutic naming tasks. Covert naming tasks were utilized in the current study as a means to dissociate the cortical sensorimotor activity that occurs during motor speech planning and production from the cortical sensorimotor activity that is thought to facilitate lexical processing during verb naming. Thus, the current method does not directly index sensorimotor activity during verbal production, but during silent naming. Future studies employing surface electromyography (sEMG) in conjunction with whole-head EEG would allow researchers to simultaneously record and analyze peripheral orofacial muscle activity and cortical sensorimotor activity during overt naming tasks. In addition, a recent study reported increases in cortical language and sensorimotor activity of older adults during naming tasks, suggesting that older individuals require additional processing [77]. Given that aphasia typically affects older individuals, the next logical step in this line of research is to use the current methodology to map cortical sensorimotor activity in neurologically-healthy older adults. Thus, in addition to behavioral measures of increased word production, mu rhythm activity may provide a meaningful measure of improved connections between the sensorimotor and linguistic representations of action verbs during overt naming tasks. Lastly, the finding that action observation, either with or without a model of action execution, improves action verb naming in individuals with aphasia [72][73][74] offers a novel area of exploration aimed at improving action verb naming treatment methodologies.
Furthermore, while the studies reviewed here reported positive effects of action observation when presented live or by video, no comparison has been made between pictures and videos. Traditional naming therapy involves picture stimuli, hence, a direct comparison between video and picture action observation treatment is planned.

Conclusions
The results of the current study are consistent with the previous action observation and naming literature, which reports that videos elicit stronger patterns of sensorimotor activity than static pictures. These results further support claims of a link between the sensorimotor representation of actions and the related linguistic representation. Therefore, action naming treatments may be strengthened by incorporating videos in order to elicit robust patterns of sensorimotor activity.
Author Contributions: The authors jointly conceived and designed the study. Both authors collected and analyzed data. Both authors wrote the paper.

Conflicts of Interest:
The authors declare no conflict of interest.