Embodying Language through Gestures: Residuals of Motor Memories Modulate Motor Cortex Excitability during Abstract Words Comprehension

There is a debate about whether abstract semantics can be represented in the motor domain, as concrete language is. A contextual association with a motor schema (an action or gesture) seems crucial to revealing motor system involvement. The present transcranial magnetic stimulation study aimed to assess motor cortex excitability changes during abstract word comprehension after conditioning word reading to the execution of a gesture with congruent or incongruent meaning. Twelve healthy volunteers were engaged in a lexical-decision task, responding to abstract words or meaningless verbal stimuli. Motor cortex (M1) excitability was measured at different post-stimulus intervals (100, 250, or 500 ms) before and after an associative-learning training in which the execution of a gesture followed word processing. Results showed a significant post-training decrease in hand motor evoked potentials at an early processing stage (100 ms) for words congruent with the gestures presented during the training. We hypothesize that traces of individual semantic memory, combined with training effects, induced M1 inhibition due to the redundancy of the evoked motor representation. No modulation of cortical excitability was found for meaningless or incongruent words. We discuss the data considering possible implications for research on the neural basis of language development and for language rehabilitation protocols.


Introduction
How are concrete and abstract concepts represented in our brain? Imagining the word "run", it is easy to think of a prototypical action related to it. On the other hand, the task grows harder when imagining the word "game", considering the variety of concrete experiences, situations, and contexts related to the same concept.
Traditionally, concrete and abstract semantics were distinguished by the presence or absence of a relation between a concept and elements or actions present in the physical world. Concrete concepts can thus be processed by both a verbal-based and an imagery-based system, while abstract words would rely on a verbal-based system only [1].
However, recent research has underlined the complexity and variety of abstract concepts, arguing that they are not "detached" from the sensory world. Rather, they express flexible relationships with a multitude of "real-time" experiences, whose processing involves multiple neural systems, including the sensorimotor one (for recent reviews, see [2][3][4][5][6][7]). Indeed, alongside "strong" language theories in favor of an obligatory involvement ("embodiment" theory, [8][9][10][11]) or a corollary involvement ("disembodiment" theory, [12]) of sensorimotor systems during language processing, new "extended" embodied approaches argue for a multidimensional representation of language. In this sense, language could be represented at the neural level in a multimodal way [13][14][15][16], involving motor representation at different levels, depending on the semantic content of each word and its degree of "grounding" in the physical world. In this perspective, the role of the sensorimotor system is considered modulable but not absent.
Some studies did not find traces of motor system involvement in response to abstract word processing [17][18][19][20][21]. However, more recent evidence demonstrated motor activation when abstract words were presented within a motor context, i.e., preceded by an action verb or a manual gesture [22][23][24][25][26]. The results of these studies may be explained as the effect of manipulating the association between the abstract word and a physical referent. This associational constraint "narrowed" the broader semantic representation of the abstract concept, enhancing motor activation. Nevertheless, additional studies found that the semantic processing of particular categories of abstract words (i.e., those carrying emotional or mental content) recruits the motor system even when they are presented in isolation [27,28].
In attempting to explain these contrasting results, some authors proposed that differences in concrete and abstract language processing can be accounted for by the different neural mechanisms that characterize the acquisition of different types of semantics [29][30][31]. Concrete or action language acquisition presumably goes through a directly experienced association between a signifier and a referent in the physical world. In contrast, abstract language learning is more strongly mediated by social interaction, during which verbal representations are shared ("WAT theory", [32,33]) or inferred through previously learned verbal associations [34]. One of the major lines of research on language acquisition is led by theories that postulate a motor origin of language development [35], such as the gestural origin hypothesis [36]. Some authors proposed that spoken language functions rely on frontal brain regions, in particular Broca's area [37][38][39][40][41][42], where connections between hand and mouth evolved with the function of controlling feeding behavior [43,44]. Thus, the primary use of hand-arm gestures was extended to the mouth, inducing spoken language development through the gradual introduction of vocal elements associated with gestures [36]. Since then, gestures (and, in general, hand movements) have continued to accompany human communicative exchanges [45], appearing as a precursor of the first attempts at speaking [46][47][48] and mediating language knowledge through sensory experience [49].
Studies demonstrating a tight integration of gesture and language processing (see [50][51][52][53] for a review) and the overlapping development of gesture and speech behavior in children [54,55] outline a strong interconnection between actions, gestures, and language. Through gestures, the motor system participates in social communication with action pantomimes, iconic/deictic manual postures, and signs conventionalized across cultures. Symbolic gestures (i.e., hand or arm postures associated with a concept) are a clear example, being deliverable without the presence of speech while expressing a socially established meaning [56]. Symbolic gestures represent the best candidate to study the relationship between the motor system and language: they share semantic properties with language while maintaining a manual representation, thus being processed by the same systems involved in action understanding [42].
Given that gesture is considered an "evolutionary bridge" in language development, it is surprising that its role in abstract language processing and learning has been poorly addressed.
A transcranial magnetic stimulation (TMS) study [26] assessed how gestures (primes) presented in a semantic priming paradigm influence the lexical-semantic processing of subsequent abstract words (targets). When gestures and words were congruent in meaning, participants showed faster lexical-decision times than in conditions with pseudoword targets or meaningless primes. This result was associated with increased hand motor cortex (M1) excitability between 100 and 250 ms from word presentation, suggesting the motor system's pivotal role in the earliest phase of language processing [18,57]. Moreover, the motor and behavioral facilitations reflected language integration with gesture meaning [51,52,58]. However, it is unclear whether this effect represents a temporary facilitation or a more stable change that can support learning. The early studies investigating the latter hypothesis evidenced that enriching the verbal learning process with gesture performance favors the acquisition of a new abstract lexicon in humans ([59][60][61][62]) and robots (i.e., numerical concepts, [63][64][65]). In this direction, defining manual actions as a crucial element in language learning could help to establish new paradigms to investigate the function of gesture in abstract lexicon development at the ontogenetic level.
The present study aims, first, to assess how conditioning gesture execution within semantic processing modulates motor system involvement during abstract word comprehension. Second, we aimed to assess whether the motor response was affected by traces of previously learned gesture-word representations.
For this purpose, healthy participants were required to make a lexical decision about a series of meaningful words without any action or motor content, presented with an equal number of meaningless words. Motor cortex excitability was assessed by delivering a single TMS pulse over the hand motor cortex (M1) at different delays from word presentation, corresponding to pre-lexical (100-250 ms) and post-lexical (500 ms) phases of semantic processing. The same task was repeated after an associative-learning training in which the presentation of each meaningful or meaningless word was coupled with the execution of a specific gesture, congruent or not with the word's meaning.
Considering previous literature, we had two main hypotheses: (a) gesture training could increase motor excitability in response to meaningful words, underlining a specific motor representation related to each word's meaning; and (b) this effect could interact with previously established common sensorimotor representations, i.e., the semantic relation between gestures and words, being specific to congruent words only.

Participants
Twelve participants (7 females, mean age 21.5 ± 2.3 years) were enrolled in the experiment. All participants were right-handed (according to the Edinburgh Handedness Inventory [66]) and Italian native speakers. They had normal or corrected-to-normal vision and no history of neurological or psychiatric disorders. All participants provided written informed consent and declared no contraindication to the application of single-pulse TMS [67,68]. The Ethics Committee of the Medical Faculty at the University of Parma approved the study.

Stimuli
Gesture stimuli were three pictures showing an actress executing different symbolic gestures (thumb up, thumb down, and palm upward); they were presented in the training phase. Verbal stimuli were 12 pictures showing a still actress with an open mouth and a strip at the top, on which a meaningful or meaningless Italian word was written (see Table 1). We selected half of the meaningful words, incongruent with gesture meaning, from an Italian database of abstract terms [69]. The other half of the words were chosen congruently with gesture meaning and matched in length and occurrence with the incongruent words (COLFIS database; [70]). Since the latter are not present in the Villani and colleagues database, we checked that their English translations are commonly classified as abstract concepts (Calgary semantic decision database, [71]).
Meaningless words (pseudowords) were matched in length with the corresponding meaningful stimuli (Congruent: neba, lema, merfo; Incongruent: ritia, dife, deia). A further picture of a black cross on a grey background was presented randomly as a control baseline stimulus.

Experimental Procedure
The experiment was carried out in a soundproofed room. Participants were seated in a comfortable armchair with their arms placed under a table. Their head was positioned on a head-and-chin rest to guarantee its stillness throughout the TMS stimulation.
Stimuli were presented on a PC display 60 cm from participants' eyes. Stimulus presentation and TMS pulses were time-controlled through a script developed in Matlab (version 7.7, R2008b) with the Psychophysics Toolbox extensions [72,73].
The experiment was divided into three phases ( Figure 1): phase 1 (PRE-TRAINING), phase 2 (TRAINING), and phase 3 (POST-TRAINING). Phase 3 was identical to phase 1 and was administered after the training to measure changes in motor cortical excitability.
In phases 1 and 3, each trial began with a fixation cross displayed on a black screen (duration 700 ms), followed by the presentation of a verbal stimulus (word or pseudoword) lasting 550 ms. A single TMS pulse over the left primary motor cortex (M1) was then delivered 100, 250, or 500 ms after verbal stimulus onset (stimulation delays). Each verbal stimulus was presented nine times, three for each stimulation delay (in total, 108 stimuli). Baseline trials were run randomly during phases 1 and 3, showing the cross instead of a word/pseudoword and maintaining the same stimulation delays (in total, 27 trials). After the verbal stimulus was turned off, an interval of 4000 ms (black screen) separated it from the subsequent trial. Participants were required to perform a lexical-decision task, classifying each verbal stimulus as a word or pseudoword. It is worth noting that they were unaware of the semantic relation between the observed words and the gestural stimuli presented in the training phase. The task was performed covertly except in 30% of the trials, where an overt response (button press with the left arm) was required when a question mark appeared. Trials with incorrect answers or no response were discarded from subsequent analysis.
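The trial schedule of phases 1 and 3 can be sketched as follows. This is an illustrative reconstruction, not the authors' actual script: it assumes 12 verbal stimuli × 3 delays × 3 repetitions per delay (matching the 108 experimental trials reported) plus 27 cross-only baseline trials, and all function and stimulus names are hypothetical.

```python
import random

# Timing constants taken from the procedure (ms)
FIXATION_MS = 700
STIMULUS_MS = 550
ITI_MS = 4000
TMS_DELAYS_MS = (100, 250, 500)  # pulse delay from verbal-stimulus onset

def build_trial_list(verbal_stimuli, reps_per_delay=3, n_baseline=27, seed=0):
    """Build a randomized trial list for phases 1 and 3 (illustrative sketch)."""
    trials = [
        {"stimulus": s, "tms_delay_ms": d}
        for s in verbal_stimuli
        for d in TMS_DELAYS_MS
        for _ in range(reps_per_delay)
    ]
    # Baseline trials show a cross instead of a word/pseudoword,
    # keeping the same stimulation delays (27 trials = 9 per delay).
    trials += [
        {"stimulus": "cross", "tms_delay_ms": d}
        for d in TMS_DELAYS_MS
        for _ in range(n_baseline // len(TMS_DELAYS_MS))
    ]
    random.Random(seed).shuffle(trials)
    return trials

# 6 meaningful words + 6 pseudowords (placeholder names)
stimuli = [f"word_{i}" for i in range(6)] + [f"pseudo_{i}" for i in range(6)]
trials = build_trial_list(stimuli)
print(len(trials))  # 108 experimental + 27 baseline = 135
```

Seeding the shuffle per participant would make each randomized sequence reproducible while still varying across subjects.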
In the intermediate training phase (phase 2), each trial started with the display of the fixation cross (700 ms), followed by the presentation of a gesture (700 ms). Then, a verbal stimulus was presented for 2000 ms. No TMS was administered in this phase. Participants were required to make a lexical decision about the verbal stimulus and to reproduce the presented gesture if the verbal stimulus was a meaningful word (go trials), regardless of the congruent/incongruent semantic relationship.
Each congruent and incongruent word was matched with a specific gesture, forming the following pairs (gesture-congruent word / gesture-incongruent word): thumb up-alright, thumb up-air; thumb down-badly, thumb down-faith; palm upward-stillness, palm upward-idea. Each pair was repeated 40 times (in total, 240 trials). An additional 24 trials (10% of training trials) were administered, presenting pseudowords as linguistic stimuli (no-go trials); thus, no pseudoword was associated with a gesture execution.
Figure 1. (A) In phases 1 and 3, participants observed a fixation cross and had to make a lexical decision in response to a verbal stimulus (word or pseudoword); no gestural stimuli were presented in these phases. In the training phase (2), participants observed a gesture as a prime stimulus, followed by a meaningful verbal stimulus (congruent or incongruent with the gesture), and then had to perform the observed gesture (Go condition). Pseudowords were presented randomly as a No-Go condition requiring no gesture execution. (B) Gestural and verbal stimuli presented in the experiment. Colored frames indicate the semantic relation of each class of verbal stimuli with gestures (congruent, incongruent, or meaningless).

Single-Pulse TMS Protocol and MEPs Recording
Single-pulse TMS was delivered to the left M1, and the corresponding MEPs were recorded from the right first dorsal interosseus (FDI). We acquired continuous electromyographic (EMG) recordings from the FDI with a CED Micro 1401 (Cambridge Electronic Design, Cambridge, UK) connected to a CED 1902 amplifier and interfaced with CED Spike software. The EMG signal was amplified (1000×), band-pass filtered (20-2000 Hz), and digitized at a sampling rate of 5 kHz through PC software (Spike2, CED Ltd., Cambridge, UK). Pairs of surface electrodes (Ag-AgCl, disposable, 7 mm × 4 mm) were attached to the muscle belly (active electrode) and the corresponding metacarpophalangeal joint (reference electrode). The ground electrode was placed on the left wrist. The EMG signal was visualized and then processed offline. A figure-of-8 coil (Magstim Co., Ltd., Whitland, UK) connected to a Bistim system (Magstim Co., Ltd., Whitland, UK) was placed over the left M1. The coil intersection was placed tangentially to the scalp, with the handle pointing backwards and laterally at a 45° angle from the midline. The coil was moved to find the optimal position from which maximal-amplitude MEPs were elicited in the contralateral FDI muscle using a slightly supra-threshold stimulus intensity. We then marked the optimal position on the scalp to ensure correct coil placement throughout the experiment. The intensity of the magnetic pulses was set at 120% of the resting motor threshold (RMT), defined as the minimal stimulator output intensity that produces MEPs with an amplitude of at least 50 µV in the muscle with a probability of 50% [68].
The experimenter visually checked for the absence of voluntary contraction throughout the experiment. When muscle tension was detected, participants were invited to relax. Trials with muscle contractions unrelated to TMS were discarded from the offline MEP analysis. MEPs of the FDI muscle were visually inspected and rejected if contaminated by contraction not due to stimulation (<3% of total trials). The peak-to-peak amplitude (mV) was computed using MATLAB (R2016b). We discarded MEP amplitudes of less than 50 µV (<3% of total trials) from the analysis.
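The offline screening described above (peak-to-peak amplitude computation and rejection of responses below 50 µV) can be sketched as follows. The window bounds, array layout, and synthetic data are illustrative assumptions, not the authors' actual MATLAB pipeline.

```python
import numpy as np

MEP_MIN_MV = 0.05  # 50 µV rejection threshold, expressed in mV

def peak_to_peak(sweeps, win_start, win_end):
    """Peak-to-peak amplitude (mV) of each EMG sweep within the MEP window.

    sweeps: 2D array (n_trials, n_samples), one row per TMS trial.
    win_start, win_end: sample indices delimiting the expected MEP window.
    """
    window = sweeps[:, win_start:win_end]
    return window.max(axis=1) - window.min(axis=1)

def screen_meps(amplitudes_mv):
    """Keep only MEPs at or above the 50 µV threshold."""
    keep = amplitudes_mv >= MEP_MIN_MV
    return amplitudes_mv[keep], keep

# Example: three synthetic sweeps, the last one below threshold
sweeps = np.zeros((3, 100))
sweeps[0, 50], sweeps[0, 60] = 0.6, -0.4   # 1.0 mV peak-to-peak
sweeps[1, 52], sweeps[1, 58] = 0.3, -0.2   # 0.5 mV
sweeps[2, 55] = 0.02                       # 0.02 mV -> rejected
amps = peak_to_peak(sweeps, 40, 80)
kept, mask = screen_meps(amps)
print(kept)  # [1.  0.5]
```

The boolean mask returned alongside the kept amplitudes makes it easy to report the rejection rate (here 1/3; <3% in the actual data).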
Raw MEP amplitudes of baseline trials were checked for differences by means of a within-subject ANOVA with TRAINING (pre-training vs. post-training) and STIMULATION DELAY (T1, 100 ms vs. T2, 250 ms vs. T3, 500 ms) as factors. The normalized MEP amplitude of experimental trials was calculated for each subject by subtracting the individual baseline mean value from every raw MEP. Average normalized MEP values calculated for each stimulation delay were submitted to a repeated-measures ANOVA with TRAINING (pre-training vs. post-training), SEMANTICS (pseudoword congruent vs. pseudoword incongruent vs. word congruent vs. word incongruent), and STIMULATION DELAY (T1, T2, T3) as within-subject factors. All post-hoc comparisons were carried out using the Duncan test. Significance was set at p = 0.05 in all analyses. The normal distribution and sphericity of the data were verified before performing the statistical analysis (Mauchly's test, p > 0.05; Kolmogorov-Smirnov test, p > 0.05). Partial η² was calculated as a measure of effect size and 1 − β as a measure of statistical power.
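The normalization step described above (subtracting each subject's baseline mean from every raw experimental MEP) can be sketched as follows; the data layout and values are illustrative.

```python
import numpy as np

def normalize_meps(raw_meps, baseline_meps):
    """Subtract the subject's mean baseline MEP from each raw experimental MEP.

    raw_meps: 1D array of peak-to-peak amplitudes (mV) from experimental trials.
    baseline_meps: 1D array of amplitudes from the cross-only baseline trials.
    After subtraction, positive values index facilitation and negative values
    inhibition, relative to the individual resting baseline.
    """
    return raw_meps - baseline_meps.mean()

# Illustrative single-subject example
baseline = np.array([0.8, 1.0, 1.2])   # baseline mean = 1.0 mV
raw = np.array([1.3, 0.7, 1.0])
normalized = normalize_meps(raw, baseline)
print(normalized)  # [ 0.3 -0.3  0. ]
```

Performing the subtraction per subject, before averaging within conditions, removes inter-individual differences in overall corticospinal excitability from the repeated-measures comparison.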

Results
Participants recognized above chance whether the presented stimuli were words or pseudowords. The mean percentage of correct responses was 98.1%.
Raw mean MEP amplitudes of FDI of baseline trials (cross pictures) administered during the experimental session were not significantly different from each other. No main effect nor significant interaction effects were found (p > 0.05).
The ANOVA revealed a significant interaction between TRAINING and SEMANTICS (p = 0.029, partial η² = 0.19, 1 − β = 0.80) and between TRAINING, SEMANTICS, and STIMULATION DELAY (F(6,66) = 2.59, p = 0.026, partial η² = 0.19, 1 − β = 0.82).
Post-hoc analysis evidenced a significant difference in hand M1 activation in response to congruent word stimuli before and after training at the T1 stimulation delay (100 ms after word presentation; p = 0.004). While in the pre-training phase congruent word presentation significantly increased excitability, with MEP amplitudes statistically different from baseline (t(11) = 2.56, p = 0.03), gesture training induced a motor inhibition (t(11) = 2.34, p = 0.03; Figure 2, upper right). A further difference concerning Congruent Words was found at the T3 stimulation delay (500 ms after word presentation) compared to T1 and T2 in the pre-training phase (p < 0.0001; p < 0.02). At this last stimulation interval, an opposite pattern emerged: starting from an average MEP value largely below baseline, the hand M1 response to Congruent Words significantly increased after the training, revealing an augmented cortical excitability, though only the comparison versus pre-training Congruent Words reached significance. Despite this boost, cortical excitability remained below the baseline level. The same tendency also emerged for Incongruent Words, without reaching significance.
No significant results emerged concerning pseudowords. A list of post-hoc results for the three-factor interaction is presented in Table 2. For simplicity, comparisons with p > 0.05 and significant comparisons mixing the stimulation-time and stimulus factors are not reported. In sum, after the training, we observed a modulation of hand motor excitability specific to words that were semantically congruent with the associated performed gesture.
Interestingly, MEPs measured after the training at the T1 stimulation interval showed significant inhibition compared to baseline. At the post-semantic level of processing (T3 stimulation interval), we found an effect of training, increasing MEPs for the same class of words, even if general excitability remained below baseline values. Conversely, we found no significant changes in MEPs for words that were equally associated with gestures during the training but conveyed an incongruent meaning, or for pseudowords (untrained stimuli).

Limitations
The small sample size represents a limitation of the study. To address this issue, we computed a sensitivity analysis with G*Power (version 3.1, Universität Düsseldorf: Psychologie-HHU, Düsseldorf, Germany). The output of this analysis reports the minimum detectable effect size given the alpha and the 1 − β obtained from the ANOVA. This analysis permits comparing the experimental and theoretical effect sizes, facilitating the interpretation of the practical significance and impact of the study results.
The sensitivity analysis computed for the interaction between the TRAINING, SEMANTICS, and STIMULATION DELAY factors showed a minimum partial η² of 0.09, smaller than the obtained effect size (0.19). This result, together with the large effect size and power values, offers a good statistical basis to support the significance and reproducibility of the results.
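G*Power parameterizes ANOVA effect sizes as Cohen's f, related to partial η² by f = sqrt(η² / (1 − η²)). A quick conversion of the values above can be sketched as follows (an illustrative check, not the authors' actual computation):

```python
import math

def cohens_f(partial_eta_sq):
    """Convert partial eta-squared to Cohen's f, the ANOVA effect-size metric used by G*Power."""
    return math.sqrt(partial_eta_sq / (1.0 - partial_eta_sq))

obtained = cohens_f(0.19)   # effect size observed in the ANOVA
minimum = cohens_f(0.09)    # minimum detectable effect from the sensitivity analysis
print(round(obtained, 3), round(minimum, 3))  # 0.484 0.314
assert obtained > minimum   # the observed effect exceeds the detectable minimum
```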

Discussion
Several studies evidenced the interaction between gestures and language during lexico-semantic processing and communication [50][51][52]58,74]. These findings support the hypothesis that gestures and speech are processed and integrated by the same neural systems involved in action perception and execution (mirror mechanism, see [75,76]).
In this study, we investigated whether repeated gesture execution in association with the language comprehension process modulated the degree of motor representation, measuring motor system excitability in response to abstract words congruent or not in meaning with the performed gestures. This paradigm was developed based on a previous study [26], where we found motor facilitation in response to abstract words preceded by a congruent gesture, demonstrating that both signals shared a motor representation that was retrieved to accomplish word understanding.
In the present study, we did not present any motor prime associated with the words. Symbolic gestures and words were matched through associative-learning training to bind the lexical-decision task to gesture processing and execution. In this way, we expected to shape the motor response to word processing, since the learned motor schema would be automatically evoked within the lexical task.
The study results showed a significant change in hand motor cortex excitability in response to meaningful words related to gestures at T1 (100 ms after stimulus onset) between the pre-learning and post-learning phases. According to previous studies, this timing represents an automatic (bottom-up) phase of lexical-semantic elaboration [18,22,26,52,57]. As expected, residuals of previous learning (i.e., individual semantic memory) interfered with training effects. Indeed, the modulation of motor system response was limited to words with a congruent semantic relationship with the gestures presented during the training.
However, this modulation turned out to be a significant decrease as an effect of the training, in contrast with our original hypothesis, which predicted greater motor facilitation following the reinforcement of the semantic relation between each word and gesture. This result accords with a previous study [50], in which the authors found an inhibition of kinematic parameters when subjects performed gestures concurrently with the pronunciation of the corresponding-in-meaning word. The authors explained this result by the overlap between the cerebral mechanisms associated with speech and manual activity (see [44]). In that case, parallel to an increase in the vocal spectra, the manual system was interfered with because of the involvement of the same brain process. Similarly, the motor inhibition found in our study could result from a conflict between overlapping neural activations. Previous neurophysiological studies showing an interference effect on semantic processing caused by co-activity of the motor system corroborate this hypothesis [17,81,82]. The excitability decrease in the motor cortex would then represent the neural correlate of a motor response suppression. Since the combination of gesture and word during the training enhanced the motor representation related to the conveyed semantics, the subsequent gesture execution added a probably redundant motor response.
The interesting point is that the effect was evident only in the condition of words congruent with gestures, excluding the possibility that a general inhibitory control affected the motor response in the post-learning phase, when gesture execution was no longer required.
An unexpected result concerns the increase of motor excitability observed in the pre-learning phase for congruent words compared to baseline values. The motor facilitation started 100 ms after stimulus onset and progressively decreased in the subsequent intervals. This result contrasts with a previous study [52], where we found no significant motor activation in response to gesture-congruent meaningful words, compared to meaningless stimuli, when they were isolated from a gestural context.
We offer a possible explanation of this phenomenon considering the possible imbalance in valence between the verbal stimuli selected in the present and past studies. It is well known that abstract semantics entail affective processing to a greater extent than concrete concepts [83,84], involving the precentral cortex and limbic regions [85]. One possibility is that, in our study, the motor activation was enhanced by the emotional value conveyed by the meaningful stimuli. Moreover, recent literature reports motor activation in response to specific abstract categories (e.g., numbers: [86]; internal states: [87]; aesthetics: [88]). Even if we cannot place our verbal stimuli in a single category, the presented words may engage, at least in part, the sensorimotor system.
Another result concerns the motor facilitation that emerged at T3 (500 ms after stimulus onset): the MEPs for congruent words significantly increased compared to the values recorded at the same interval before the training, a tendency opposite to the results at T1. In our opinion, this could be ascribed to a corollary activation of the motor system due to deeper semantic access that re-activates the motor content associated with the word, probably involving top-down processes (i.e., motor imagery). However, even though the modulation before and after training was significant, no significant increase of motor excitability beyond the physiological baseline appeared in the post-lexical phase of word comprehension.
In conclusion, the results of our study corroborated recent evidence which demonstrated the motor system's role in abstract language processing. This is in line with revised embodied language theories, integrating the concept of a multimodal representation of the words [13,14,30,32,49,80]. Sensorimotor areas can be involved in the early phase of comprehension at different grades, depending on the contextual availability of motor schemas associated with them.
In this study, we used gestural stimuli to demonstrate how a learned association between a symbol (i.e., a word) and a motor posture shapes motor activation in a somatotopic way during a lexical-semantic task, interacting with the subjects' previous semantic knowledge. These data suggest that the motor system may contribute through a similar mechanism to building semantic knowledge during language development in childhood, shaping and reinforcing specific neural pathways based on the relationship between a phoneme and a movement or physical entity [89]. This relation would develop depending on contextual variables and lexical flexibility [90], being sparser for abstract words due to the wide variety of their meaning representations [5].
Further studies should clarify at which level concrete and abstract language may or may not diverge in acquisition and processing, at both the behavioral and neural levels.
A suggestion is to investigate the interaction between gestures and words along the action-language continuum [56]. While we easily understand pantomimes as mimed actions in association with (or substitution of) action verbs, the comprehension of emblems results from an arbitrary, culture-based association between a posture and a word. Through behavioral and neuromodulation studies, investigating how the interaction between the language and motor systems changes according to the conventionalization of manual and verbal signals could help better understand language development. Moreover, longitudinal studies comparing children with developmental language disorder (DLD) or typical development (TD) could help to determine the relation between gesture production and lexicon acquisition (see also [91]).
Lastly, potential application in clinical populations affected by acquired language disorders (aphasia) should be considered, exploiting symbolic gestures to extend the impact of naming recovery based on observation of motor stimuli beyond action verbs [92,93].