1. Introduction
Human communication relies not only on verbal expressions but also on a wide range of nonverbal behaviours that accompany or modulate speech. Among these, particular attention has been paid to actions involving the hands, including symbolic gestures that convey meaning and movements directed toward the body itself. One prominent category of such behaviours is referred to as self-adaptors, which involve unconscious or semi-conscious physical actions, such as touching one’s face, rubbing one’s hands, or scratching one’s head. These behaviours are typically not communicative in intent but are often observed during moments of cognitive or emotional strain (
Ekman & Friesen, 1969). According to the foundational framework established by
Ekman and Friesen (
1969), self-adaptors are part of a broader class of adaptors. These are nonverbal behaviours that reflect internal psychological states and are not typically intended to communicate with others. Historically, self-adaptor behaviours have been interpreted primarily as indicators of stress or anxiety regulation, serving as bodily manifestations of emotional arousal (
Ekman & Friesen, 1969;
Harrigan, 1985). Despite their ubiquity, self-adaptor behaviours have often been overlooked in research on language processing. In contrast, gestures with explicit communicative intent have received far greater empirical attention (
McNeill, 1992). There is a growing need to reconsider this imbalance and explore the potential role of self-adaptors in cognitive and verbal tasks. This study focuses specifically on examining how self-adaptor behaviours are related to cognitive regulation and lexical access during speech production.
In addition to the functional classification proposed by
Ekman and Friesen (
1969), which placed self-adaptors among behaviours that regulate internal states, alternative frameworks have addressed the direction and focus of nonverbal actions.
Freedman and Hoffman (
1967) categorised these behaviours into body-focused and object-focused movements. Self-adaptor behaviours are considered body-focused movements, as they are directed toward the individual’s own body and typically lack communicative intent. In contrast, object-focused movements are gestures directed away from the body, often used to represent shape, motion, or spatial relationships, and frequently function to support verbal communication.
Barroso et al. (
1978) observed that object-focused gestures tend to emerge when speakers attempt to externalise mental representations through speech, suggesting a close connection between gestural action and conceptual access. Although body-focused movements such as self-adaptor behaviours have been commonly observed in everyday interaction, they have received far less empirical attention in psycholinguistic research. Their cognitive function remains particularly underexplored, especially in comparison to object-focused gestures that have been extensively studied in relation to speech production. This study seeks to address this gap by investigating how self-adaptor behaviours may contribute to cognitive regulation during verbal retrieval tasks, with a specific focus on tip-of-the-tongue (TOT) phenomena.
Previous studies (e.g.,
Barroso et al., 1980;
Grunwald et al., 2014;
Mueller et al., 2019) have suggested that self-adaptor behaviours may be linked to cognitive processing, particularly in tasks that involve increased attentional demands. For instance,
Barroso et al. (
1980) investigated children’s nonverbal behaviour during two distinct tasks: the Stroop task, which requires cognitive inhibition and attentional control, and the water jar task, which involves basic problem-solving and explanation. They found that self-adaptor behaviours occurred significantly more often during the Stroop task, indicating a potential association with elevated cognitive load. Further support comes from
Fujii (
1997), who examined whether self-adaptor behaviours facilitate lexical retrieval during tip-of-the-tongue (TOT) states. In this study, participants listened to auditory definitions of Japanese proverbs and four-character idioms, which were designed to trigger TOT experiences. Participants were assigned to either a suppression condition, in which hand movements were restricted, or a control condition with no movement constraints. The control condition produced significantly more correct responses. Importantly, retrieval attempts were categorised into four distinct outcomes: (1) immediate retrieval, (2) retrieval failure, (3) retrieval following a TOT state, and (4) failure following a TOT state. Self-adaptor behaviours were found to be markedly more frequent during TOT states than in non-TOT conditions, especially in trials that resulted in successful retrieval. These findings suggest that self-adaptor behaviours may contribute to regulating verbal access under conditions of lexical uncertainty.
Further evidence for the role of self-adaptor behaviours in lexical retrieval comes from a study by
Pine et al. (
2007), who examined whether gestures influence lexical access in young children. The researchers tested 65 participants aged six to eight years, with the majority aged six. Participants were randomly assigned to either a suppression condition or a control condition. In the suppression condition, children wore mittens fastened to the table to prevent hand movement. In the control condition, no restrictions on gesture use were imposed. All participants were shown 50 pictures of objects that had been pretested to induce tip-of-the-tongue (TOT) states, such as a seahorse or a stethoscope, and were asked to name them. The frequency of TOT states was comparable between the two conditions. However, the number of correct responses following a TOT state was significantly higher in the control condition. An analysis of gesture use revealed that participants in the control condition produced nearly twice as many gestures during TOT states as during non-TOT states. Notably, this study reported that across all retrieval conditions, including immediate retrieval, TOT success, and non-TOT retrieval, the most frequently observed nonverbal behaviour was self-adaptor behaviour. These findings suggest that both representational gestures and self-adaptor behaviours may serve a supportive function in facilitating lexical retrieval, particularly under conditions of temporary verbal inaccessibility.
The proposition that gestures assist speakers in retrieving lexical items is commonly referred to as the lexical retrieval hypothesis (
Butterworth & Hadar, 1989;
Krauss & Hadar, 1999). According to this framework, gestures are not simply byproducts of speech but actively contribute to the retrieval of words, particularly when linguistic access becomes difficult. This function is believed to operate at the surface level of speech production and is most evident when the words in question involve spatial or conceptually rich representations. Gestures are thought to facilitate lexical access by activating semantically related information and supporting the retrieval of phonological forms through motor planning. One of the earliest empirical observations supporting this hypothesis comes from
Butterworth (
1981), who conducted a single-case study involving an individual with aphasia. The participant frequently produced spontaneous hand movements preceding lexical failures or moments of hesitation. These observations led Butterworth to suggest that gestures related to the target word might act as retrieval cues, compensating for impaired verbal access by stimulating semantically or phonologically associated representations. This view was further elaborated by
Krauss et al. (
1996), who proposed that gestures externalise spatially imagined or syntactically organised information, thereby assisting in the search for the appropriate lexical items. They posited that gestures reflect a broader interaction between bodily movement and the syntactic or semantic planning stages of speech.
Supporting experimental evidence was provided by
Rauscher et al. (
1996), who examined the effects of gesture suppression on speech fluency. In a within-subjects design, participants were asked to describe animated scenes under two conditions: one allowing free gesture use and another requiring them to suppress hand movements. Fluency, measured through word output per minute and the number of silent pauses, declined significantly in the suppression condition, particularly during spatial descriptions. In a related study,
Frick-Horbury and Guttentag (
1998) examined the role of gestures in resolving tip-of-the-tongue (TOT) states and in supporting memory recall. Participants were presented with auditory definitions of low-frequency target words that were likely to induce TOT experiences and were asked to provide the corresponding lexical item. The correct answer was given after each response. Following this task, participants completed a free recall test in which they attempted to retrieve as many of the previously defined words as possible. Participants were randomly assigned to one of two conditions: a gesture suppression condition, in which they were instructed to avoid gesturing, and a control condition, which received no instructions regarding movement. The findings revealed that participants in both conditions produced more gestures during TOT states than during non-TOT states, indicating a spontaneous increase in bodily movement during retrieval difficulty. However, participants in the control condition resolved significantly more TOT states than those in the suppression condition, and they also recalled a greater number of target words in the final memory task. These results suggest that gestures not only support immediate lexical access but may also enhance longer-term verbal memory. This study highlights how gesture use, even when not deliberately communicative, plays a regulatory role in language processing under cognitively demanding conditions.
Recent research has raised questions about the robustness of the lexical retrieval hypothesis.
Kısa et al. (
2021) conducted a replication and extension of
Rauscher et al. (
1996), aiming to evaluate the effects of gesture suppression on speech fluency. They identified two key methodological concerns in the original study. First, the statistical analyses did not adequately account for variance in speech performance. Second, when corrections for multiple comparisons were applied, the reported effects no longer reached statistical significance. Based on these issues, Kısa et al. argued that the empirical support for the claim that gesture suppression impairs lexical retrieval is limited. To further investigate, they conducted a new experiment modelled on Rauscher et al.’s design. Participants were paired and assigned to either a gesture suppression condition or a no-instruction control condition. Each participant memorised a short narrative of 50 to 100 words within 60 s and was then asked to retell the story to their partner using the first person. Behaviours they analysed included iconic gestures, deictics, metaphoric gestures, emblems, and self-adaptors. The analysis revealed no significant differences between conditions in key fluency measures, including the number of silent or filled pauses (e.g., “uh” or “um”). From these findings, Kısa et al. concluded that gestures do not exert a reliable facilitative effect on lexical retrieval in spontaneous speech tasks. In addition to challenging the lexical retrieval hypothesis, Kısa et al. critically examined several studies frequently cited in support of the theory, including those by
Pine et al. (
2007),
Frick-Horbury and Guttentag (
1998), and
Beattie and Coughlan (
1999). They identified limitations in analytical procedures and interpretation across these works. Notably, in their discussion of Pine et al., they observed that the most frequently occurring nonverbal action during TOT states was not representational gesture, but self-adaptor behaviour. This led them to propose that it may not be gesture suppression itself that influences lexical retrieval, but rather the suppression of other bodily movements, such as self-adaptor behaviours, which may serve a supportive function during word-finding difficulties.
There remains no consistent conclusion as to whether gestures serve a functional role in lexical retrieval. Although some findings suggest that self-adaptor behaviours may facilitate verbal access (
Fujii, 1997;
Pine et al., 2007), these observations have been limited to spontaneously occurring movements. To date, there is no definitive evidence that self-adaptor behaviours causally influence lexical retrieval.
To investigate whether self-adaptor behaviours facilitate lexical retrieval, the present study employed an experimental design modelled on
Fujii (
1997). As noted earlier, Fujii’s original study included only two conditions: a control condition and a gesture suppression condition. However, it did not provide conclusive evidence regarding the effect of self-adaptor behaviours on lexical retrieval. In particular, the number of correct responses following tip-of-the-tongue (TOT) states was not reported. To address these limitations, the present study introduced a third condition in addition to the two used by Fujii. Specifically, a self-adaptor condition was included, in which participants were instructed to place their hands on their cheeks throughout the task. This condition was compared with a suppression condition, in which gestures were inhibited by requiring participants to hold a stick with both hands, and a control condition, which received no instructions regarding hand movement. As in Fujii’s study, retrieval responses were categorised into four types: successful retrieval, retrieval failure, TOT success, and TOT failure. This categorisation allowed for a focused examination of whether self-adaptor behaviours facilitate lexical access following TOT states. The cheeks were selected as the contact point in the self-adaptor condition because, according to observations in Fujii’s study, the most frequent self-adaptor behaviours involved the face and head.
Based on previous findings, the present study tested two hypotheses concerning the relationship between self-adaptor behaviours and lexical retrieval. Hypothesis 1 predicted that the self-adaptor condition would produce a greater number of correct responses overall than both the control and suppression conditions, given that self-adaptor behaviours were the most frequently observed movements in prior studies on lexical retrieval (
Fujii, 1997;
Pine et al., 2007). Hypothesis 2 predicted that the number of correct responses following TOT states would also be higher in the self-adaptor condition, as these behaviours were particularly prevalent during TOT experiences in the studies by
Fujii (
1997) and
Pine et al. (
2007).
2. Materials and Methods
2.1. Participants
The participants were 60 adult native speakers of Japanese, comprising equal numbers of males and females. Their ages ranged from 18 to 25 years (M = 20.41, SD = 1.91). All participants had normal or corrected-to-normal vision and reported no hearing impairments. None had any conditions affecting the arms or hands that would interfere with daily activities, and all were able to follow the instructions provided. An a priori power analysis was conducted using G*Power (version 3.1;
Faul et al., 2007) to determine the minimum required sample size for detecting a medium effect size (f = 0.25) with 80% power at an alpha level of 0.05 in a one-way ANOVA with three groups. The analysis indicated that a total of 52 participants would be sufficient. Our sample size of 60, therefore, exceeds this threshold, ensuring adequate statistical power for detecting group differences.
2.2. Experimental Design
The experimental design involved a single between-subjects variable with three levels of self-adaptor behaviour: a self-adaptor condition (instructed to cover their cheeks with both hands during the task), a control condition (given no specific instructions regarding bodily movement), and a suppression condition (required to hold a stick with both hands throughout the task). For the self-adaptor condition, we selected the cheeks as the contact location based on pilot observations and previous findings (e.g.,
Fujii, 1997;
Harrigan, 1985;
Grunwald et al., 2014;
Spille et al., 2021), which consistently identified the face, particularly the cheeks and jaw, as frequently contacted regions during verbal tasks. The independent variable was the type of self-adaptor condition. The dependent variables were the number of successful lexical retrievals (correct responses to proverbs and four-character idioms) and the number of correct responses following a TOT state (TOT success). When participants experienced a TOT state, they were instructed to indicate this to the experimenter.
2.3. Tasks
The experiment consisted of two tasks: a lexical retrieval task and a recall task. In the lexical retrieval task, participants listened to audio definitions of 30 proverbs and four-character idioms, which were played sequentially through speakers connected to a computer. Participants were asked to respond with the word or phrase that best matched each definition. The recall task required participants to remember and list as many of the proverbs and idioms from the lexical retrieval task as possible within a three-minute period.
2.4. Materials
As described in Section 2.3, the lexical retrieval task involved the auditory presentation of definitions for 30 proverbs and four-character idioms. These definitions were delivered sequentially via speakers connected to a desktop computer. Participants were required to provide the word or phrase that best matched each definition. In the subsequent recall task, participants were given three minutes to recall and write down as many of the proverbs and idioms from the lexical retrieval task as possible.
In preparation for the lexical retrieval and recall tasks, a pilot study with 20 participants was conducted to select proverbs and idioms likely to elicit tip-of-the-tongue (TOT) states. The items were drawn from a pool of 30 expressions used in
Fujii (
1997) and an additional 20 items prepared by the experimenter. Via an online questionnaire, 50 definitions of proverbs and idioms were presented, and participants were asked to type the phrase that best matched each definition. They were also asked to indicate whether they experienced a tip-of-the-tongue (TOT) state. Two criteria were used in the item selection process: (1) the item had to be correctly answered by at least some participants, and (2) the item needed to reliably elicit TOT states. Based on these criteria, a final set of 30 items was selected for the main experiment. Further details of these items are provided in the
Supplementary Material.
In the suppression condition, where participants were instructed to inhibit gestural movement during the task, a cardboard tube measuring 38 mm in internal diameter and 50.5 cm in length was used to restrict hand movement (see
Figure 1). For the auditory stimuli, a clear and easily comprehensible voice was selected using VoicePeak software (AHS Co., Ltd., Tokyo, Japan). A video camera was installed in the laboratory to record participants’ responses.
2.5. Procedure
All experiments were conducted individually in a quiet laboratory room. At the beginning of the session, participants were informed that their personal information would not be disclosed to any third party, that participation was entirely voluntary, and that declining to participate would result in no disadvantage. Written informed consent was obtained from all participants. Participants then received an explanation of the experimental procedure via both visual slides (displayed on a monitor) and verbal instructions. After the instructions were completed, the display was turned off, and only pre-recorded audio was played through the speakers.
In this experiment, participants were randomly assigned to one of three self-adaptor conditions, each consisting of 20 participants (10 males and 10 females). The first was a control condition, in which participants received no specific instructions regarding hand movements or gestures. The second was a self-adaptor condition, where participants were instructed to keep their hands on their cheeks throughout the task (see
Figure 2). The third was a suppression condition, in which participants held a stick with both hands to inhibit any hand movements during the task (see
Figure 2).
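The allocation described above (three conditions of 20 participants, each with 10 males and 10 females) can be sketched as a randomisation balanced within sex. This is only an illustrative sketch: the text states that assignment was random and reports the resulting group composition, but it does not describe the allocation procedure, so the round-robin dealing, the function name `assign_conditions`, the participant identifiers, and the seed are all assumptions.

```python
import random

CONDITIONS = ("control", "self_adaptor", "suppression")


def assign_conditions(participants, seed=None):
    """Randomly assign participants to conditions, balanced within each sex.

    `participants` is a list of (participant_id, sex) tuples.
    Returns a dict mapping participant_id -> condition label.
    """
    rng = random.Random(seed)

    # Group participant IDs by sex so that each stratum is shuffled separately.
    by_sex = {}
    for pid, sex in participants:
        by_sex.setdefault(sex, []).append(pid)

    assignment = {}
    for ids in by_sex.values():
        shuffled = ids[:]
        rng.shuffle(shuffled)
        # Deal the shuffled IDs round-robin so each condition gets an equal share.
        for index, pid in enumerate(shuffled):
            assignment[pid] = CONDITIONS[index % len(CONDITIONS)]
    return assignment


# Hypothetical roster matching the reported sample: 30 males and 30 females.
roster = [(f"M{i:02d}", "male") for i in range(30)]
roster += [(f"F{i:02d}", "female") for i in range(30)]
groups = assign_conditions(roster, seed=1)
```

With 30 participants per sex and three conditions, this scheme always yields 10 males and 10 females per condition, whichever seed is used.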
In the lexical retrieval task, participants listened to audio definitions of 30 proverbs and four-character idioms, presented sequentially via speakers connected to a computer. They were instructed to respond with the word or phrase that best matched each definition. Participants were told to provide their response after a beep sound (Beep A) played immediately following the audio definition. If they did not know the answer, they were instructed to say, “I do not know.” If they experienced a tip-of-the-tongue (TOT) state, in which they felt that the word was on the verge of recall but could not retrieve it, they were instructed to report this to the experimenter. Each item had a maximum response time of one minute, and the definition was presented only once.
A response was counted as correct if the participant retrieved the target word without reporting a TOT state. For example, when the pre-recorded definition stated, “to get straight to the main point without preamble”, a response such as “cut to the chase” was considered correct. If the participant gave an incorrect response or said, “I do not know”, the response was recorded as incorrect. If a participant explicitly reported experiencing a TOT state, their subsequent response was coded as either a TOT success or a TOT failure. A TOT success was defined as a correct response following a TOT report, whereas a TOT failure referred to a failure to retrieve the target word after reporting a TOT state. The total number of correct responses was calculated as the sum of successful retrievals and TOT successes, while incorrect responses comprised retrieval failures and TOT failures.
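The coding scheme above reduces to two binary observations per trial: whether a TOT state was reported and whether the target was produced. The following minimal sketch shows how the four response categories and the two totals follow from those observations. The function names and the example trials are hypothetical and are not taken from the study's materials.

```python
from collections import Counter


def code_response(reported_tot, answered_correctly):
    """Map one trial onto the four response categories described in the text."""
    if reported_tot:
        return "tot_success" if answered_correctly else "tot_failure"
    return "successful_retrieval" if answered_correctly else "retrieval_failure"


def summarise(trials):
    """Count categories and derive total correct and incorrect responses.

    `trials` is a list of (reported_tot, answered_correctly) tuples.
    """
    counts = Counter(code_response(tot, ok) for tot, ok in trials)
    # Total correct = successful retrievals + TOT successes.
    total_correct = counts["successful_retrieval"] + counts["tot_success"]
    # Total incorrect = retrieval failures + TOT failures.
    total_incorrect = counts["retrieval_failure"] + counts["tot_failure"]
    return counts, total_correct, total_incorrect


# Hypothetical trials for one participant (not data from the study).
example_trials = [
    (False, True),   # immediate correct answer
    (False, False),  # "I do not know" or wrong answer
    (True, True),    # TOT reported, then target retrieved
    (True, False),   # TOT reported, target not retrieved
    (False, True),   # immediate correct answer
]
counts, total_correct, total_incorrect = summarise(example_trials)
```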
If participants responded immediately or stated that they did not know the answer, the task proceeded to the next item without waiting for the full minute. When a correct response was given, the experimenter confirmed it. If participants were unable to respond or the time limit was reached, the correct answer was provided orally by the experimenter. The transition to the next item was marked by a different beep sound (Beep B), after which the next audio definition was played. Participants were also informed that they were allowed only one response per item. Even if they provided a proverb or idiom with the same meaning as the intended answer, only the specific item preselected by the experimenter was considered correct. All sessions were video recorded.
Following the lexical retrieval task, participants completed the recall task. In this task, they were instructed to recall as many of the proverbs and four-character idioms from the lexical retrieval task as possible within a three-minute period. During this phase, participants in the self-adaptor condition were asked to continue holding their cheeks with both hands, while those in the suppression condition were instructed to maintain their grip on the stick in order to inhibit gestural movement. Regarding the overall session flow, participants were seated in a chair positioned in front of the video camera after receiving the explanation of the experimental procedure, and they completed three practice trials before the main trials of the lexical retrieval task began. A diagram illustrating the flow of stimulus presentation and participants’ responses is provided in
Figure 3.
Upon completion of all 30 trials, participants were instructed that the lexical retrieval task had concluded and that correct answers had been provided after each item. They were then asked to recall as many of the proverbs and four-character idioms from the previous 30 trials as possible. It was explained that it did not matter whether their previous responses had been correct or incorrect, and that items could be recalled in any order. Participants were informed that they would have three minutes to complete the recall task. The recall task began immediately after this instruction, and a three-minute timer was started. In the recall task, the number of target words recalled within the three-minute time limit was recorded as the recall score.
3. Results
3.1. Correct Responses in the Lexical Retrieval Task
The total number of correct responses in the lexical retrieval task was calculated for each self-adaptor condition. The means and standard deviations for each condition are presented in
Figure 4. In this analysis, the total number of correct responses refers to the sum of successful retrievals and TOT successes.
The mean number of correct responses (out of 30) was 10.70 (SD = 6.30) in the control condition, 13.05 (SD = 4.96) in the self-adaptor condition, and 8.45 (SD = 3.90) in the suppression condition. A one-way analysis of variance (ANOVA) was conducted with self-adaptor condition as the between-subjects factor and the number of correct responses out of 30 trials as the dependent variable. The analysis revealed a significant main effect of self-adaptor condition, F(2, 57) = 3.99, p = 0.024, partial η² = 0.12. Post hoc comparisons using the Bonferroni method (adjusted p-values, p < 0.05) indicated a significant difference between the self-adaptor condition and the suppression condition, t(57) = 2.83, p = 0.019, d = 0.88. That is, participants in the self-adaptor condition produced significantly more correct responses than those in the suppression condition.
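As a rough consistency check, the omnibus test can be approximately reconstructed from the reported group means and standard deviations alone. The sketch below assumes equal group sizes of 20 and, for the pairwise comparison, a t statistic based on the pooled ANOVA error term (which is consistent with the reported 57 degrees of freedom but is not stated explicitly). Because the inputs are rounded summary statistics, the outputs are approximations rather than a re-analysis of the raw data, and the function name `anova_from_summary` is hypothetical.

```python
import math


def anova_from_summary(means, sds, n_per_group):
    """One-way ANOVA from group means and SDs, assuming equal group sizes."""
    k = len(means)
    grand_mean = sum(means) / k  # valid because all groups have the same n
    ss_between = n_per_group * sum((m - grand_mean) ** 2 for m in means)
    ss_within = (n_per_group - 1) * sum(sd ** 2 for sd in sds)
    df_between = k - 1
    df_within = k * (n_per_group - 1)
    ms_within = ss_within / df_within
    f_value = (ss_between / df_between) / ms_within
    partial_eta_sq = ss_between / (ss_between + ss_within)
    return f_value, partial_eta_sq, ms_within, df_within


# Summary statistics as reported in the text (correct responses out of 30).
means = {"control": 10.70, "self_adaptor": 13.05, "suppression": 8.45}
sds = {"control": 6.30, "self_adaptor": 4.96, "suppression": 3.90}
n = 20

f_value, partial_eta_sq, ms_within, df_within = anova_from_summary(
    list(means.values()), list(sds.values()), n
)

# Pairwise t for self-adaptor vs suppression, using the pooled error term
# (an assumption about how the post hoc test was computed).
t_value = (means["self_adaptor"] - means["suppression"]) / math.sqrt(
    ms_within * (2 / n)
)

print(f"F({len(means) - 1}, {df_within}) = {f_value:.2f}")
print(f"partial eta squared = {partial_eta_sq:.2f}")
print(f"t({df_within}) = {t_value:.2f}")
```

Under these assumptions the reconstructed F, partial η², and t values round to the figures reported above. The Bonferroni-adjusted p-value is not reproduced here because it requires the t distribution, which is outside the Python standard library.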
The previous analysis was based on the total number of correct responses, including both successful retrievals and TOT successes. To further examine the effect of self-adaptor behaviour on immediate lexical access, a one-way ANOVA was conducted using only the number of successful retrievals (excluding TOT successes) as the dependent variable. The analysis revealed a significant main effect of self-adaptor condition, F(2, 57) = 3.79, p = 0.028, partial η² = 0.12. Post hoc comparisons using the Bonferroni method (adjusted p-values, p < 0.05) indicated a significant difference between the self-adaptor condition and the suppression condition, t(57) = 2.75, p = 0.024, d = 0.85.
3.2. TOT State Frequency by Condition
We counted the number of TOT states for each condition (
Table 1). As the raw counts differed across the three self-adaptor conditions, a one-way ANOVA was conducted to test whether susceptibility to TOT experiences differed reliably between them. Self-adaptor condition was treated as a between-subjects factor, and the total number of TOT occurrences reported by each participant served as the dependent variable. The analysis revealed no significant main effect of self-adaptor condition, F(2, 57) = 1.16, p = 0.32, partial η² = 0.04. This suggests that the type of behavioural instruction did not differentially affect participants’ likelihood of experiencing a TOT state.
3.3. TOT Success Rate in the Lexical Retrieval Task
The mean proportion and standard deviation of correct responses following TOT states (TOT success rate) for each self-adaptor condition (control, self-adaptor, and suppression conditions) in the lexical retrieval task were calculated (see
Table 2 and
Figure 5).
A one-way ANOVA was conducted with self-adaptor condition as the between-subjects factor and the proportion of TOT successes as the dependent variable. The analysis revealed no significant main effect of self-adaptor condition, F(2, 57) = 1.41, p = 0.25, partial η² = 0.047. This indicates that the type of self-adaptor condition had no differential effect on the likelihood of successfully retrieving a word following a TOT state. Based on this result, Hypothesis 2 was not supported.
3.4. Recall Task Performance
The number of items recalled in the recall task was calculated for each self-adaptor condition, defined as the number of items participants were able to recall from the 30 presented during the lexical retrieval task (see
Table 3).
A one-way ANOVA was conducted with self-adaptor condition as the between-subjects factor and the number of items recalled as the dependent variable. The analysis revealed no significant main effect of self-adaptor condition, F(2, 57) = 0.30, p = 0.74, partial η² = 0.01. This indicates that the number of items recalled did not differ significantly across conditions.
3.5. Self-Adaptor and Gesture Frequency in the Control Condition
In the lexical retrieval task, various gestures and self-adaptor behaviours were observed in the control condition. The number of occurrences of each behaviour corresponding to different response patterns (successful retrieval, retrieval failure, TOT success, and TOT failure) was calculated using ELAN (
Lausberg & Sloetjes, 2009).
Self-adaptor behaviours were classified into 16 anatomical categories based on the framework proposed by
Sugawara (
1987): hair, head, forehead, eyes, ears, nose, mouth, cheeks and jaw, entire face, neck, arms, hands, shoulders, torso, lower limbs, and clothing. Gestures were categorised into four types according to McNeill’s classification (
McNeill, 1992): iconic gestures, metaphoric gestures, beat gestures, and others. Iconic gestures are those that represent the action, shape, or movement of the referent. For example, mimicking the motion of turning a steering wheel to indicate driving would be classified as an iconic gesture. Metaphoric gestures involve the spatial visualisation of abstract concepts; for instance, moving one’s hand forward to represent the passage of time would be considered a metaphoric gesture. Beat gestures refer to rhythmic movements used to emphasise prosodic features of speech.
Gestures were annotated separately for the right and left hands. For example, when participants brought both hands together (e.g., clasping), annotations were added to both the right-hand and left-hand tiers with the label “hands”. In cases where a participant touched their chin with the right hand and their arm with the left, the annotation “chin” was assigned to the right-hand tier and “arm” to the left-hand tier, respectively. All coding focused on gestures and self-adaptor behaviours that occurred after the end of Beep A (0.06 s in duration, as shown in
Figure 3) and before the participant produced a verbal response.
In the present analysis, the focus was placed on the frequency of occurrences rather than the duration of gestures or self-adaptor behaviours. This decision was made due to the variability in response times across the 30 lexical retrieval trials: some participants used the full one-minute response window, while others responded almost immediately. Accordingly, only the number of distinct self-adaptor or gesture events was counted. For example, if during a single trial a participant touched their head with the right hand, then their cheek, and then their head again, while continuously touching their lower limb with the left hand, the total number of self-adaptor occurrences for that trial would be recorded as four. The number of self-adaptor behaviours and gestures (the latter classified into four types) observed during the lexical retrieval task is reported below for each response pattern: successful retrieval, retrieval failure, TOT success, and TOT failure (see
Table 4).
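The counting rule described above can be sketched as follows. This is a minimal illustration, not the authors' annotation pipeline: the tier names, the label strings, and the function `count_self_adaptor_events` are hypothetical, and each continuous contact is assumed to have been annotated as a single entry on its hand tier.

```python
def count_self_adaptor_events(tiers):
    """Count self-adaptor events in one trial.

    `tiers` maps a tier name (hypothetical: "right_hand", "left_hand") to the
    ordered list of body-region labels annotated on that tier. Each entry is
    assumed to be one continuous contact, so duration plays no role.
    """
    return sum(len(labels) for labels in tiers.values())


# The worked example from the text: head, cheek, head again with the right
# hand, while the left hand stays on the lower limb throughout.
trial = {
    "right_hand": ["head", "cheek", "head"],
    "left_hand": ["lower limb"],
}

print(count_self_adaptor_events(trial))  # 4
```

Because only onsets are counted, a contact held for the whole response window contributes the same as a brief touch, which is what makes trials of very different lengths comparable.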
As shown in
Table 4, the most frequently observed self-adaptor regions were the hands and lower limbs. Most participants completed the task either with their hands clasped together or resting on their thighs, which is likely attributable to the fact that the task was conducted in a seated position. The next most frequently touched areas were the arms, often because participants crossed their arms or otherwise managed hand placement during the task. These were followed by facial regions, such as the cheeks and jaw, and the torso, which were also relatively common points of contact.
Next, disregarding the specific body regions or gesture types, the analysis focused on the total number of self-adaptors and gesture occurrences in relation to response patterns (
Figure 4). In the control condition, a total of 471 self-adaptor occurrences were observed across 214 successful retrievals, 891 across 310 retrieval failures, 39 across 10 TOT successes, and 234 across 66 TOT failures. This corresponds to an average of 2.20 self-adaptor instances per trial for successful retrievals, 2.87 for retrieval failures, 3.90 for TOT successes, and 3.55 for TOT failures. These results indicate that self-adaptor behaviours were more frequent during TOT states than during immediate successful retrievals, suggesting that self-adaptor frequency increases when lexical access becomes more difficult.
The number of gesture occurrences by response pattern is reported as follows. During successful retrievals, gestures were observed only once across 214 responses. During retrieval failures, a total of seven gestures were recorded across 310 responses. For TOT successes, five gestures were observed across 10 responses, and for TOT failures, 23 gestures were recorded across 66 responses. This corresponds to an average of 0.005 gestures per trial for successful retrievals, 0.02 for retrieval failures, 0.50 for TOT successes, and 0.35 for TOT failures. These results indicate that gesture use increased markedly during TOT states, particularly in trials where the target word was ultimately retrieved, suggesting that gestures may be mobilised more actively when lexical access becomes difficult.
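The per-trial averages reported above follow directly from the raw counts. The short sketch below recomputes them as a check; the counts are those given in the text, while the dictionary layout and the function name `per_trial_rates` are illustrative only.

```python
# Event counts and trial counts by response pattern, as reported in the text.
counts = {
    "successful retrieval": {"trials": 214, "self_adaptors": 471, "gestures": 1},
    "retrieval failure": {"trials": 310, "self_adaptors": 891, "gestures": 7},
    "TOT success": {"trials": 10, "self_adaptors": 39, "gestures": 5},
    "TOT failure": {"trials": 66, "self_adaptors": 234, "gestures": 23},
}


def per_trial_rates(counts):
    """Return mean events per trial for each response pattern."""
    return {
        pattern: {
            "self_adaptors": c["self_adaptors"] / c["trials"],
            "gestures": c["gestures"] / c["trials"],
        }
        for pattern, c in counts.items()
    }


rates = per_trial_rates(counts)
for pattern, r in rates.items():
    print(f"{pattern}: self-adaptors {r['self_adaptors']:.2f}, "
          f"gestures {r['gestures']:.3f}")
```

Dividing by the number of trials in each response pattern matters here because the patterns differ greatly in size (10 TOT successes against 310 retrieval failures), so raw totals alone would be misleading.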
To further examine the role of self-adaptors and gesture use in the resolution of TOT states, we conducted Pearson’s correlation analyses relating the number of TOT successes to the frequency of self-adaptors and to the frequency of gestures across participants in the control group. A very strong positive correlation was found between the number of TOT successes and the frequency of self-adaptors (r = 0.95, 95% CI [0.87, 0.98], p < 0.001), and a moderate positive correlation was observed between the number of TOT successes and the frequency of gestures (r = 0.51, 95% CI [0.09, 0.78], p < 0.05). These results suggest that increased use of both gestures and self-adaptors may be associated with improved TOT resolution. Although correlational, they are consistent with the idea that spontaneous body movements play a functional role in facilitating lexical access during TOT states.
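For readers who wish to see how confidence intervals of this kind can be obtained, the sketch below uses the Fisher z-transformation, one common approach; the text does not state which method was actually used. The sample size of 20 is an assumption, inferred from the 600 control-condition trials at 30 trials per participant, and the function name `pearson_ci` is illustrative.

```python
import math


def pearson_ci(r, n, z_crit=1.959964):
    """Approximate 95% CI for Pearson's r via the Fisher z-transformation.

    r      : sample correlation coefficient
    n      : number of paired observations (must exceed 3)
    z_crit : standard-normal critical value (1.959964 for a 95% interval)
    """
    z = math.atanh(r)              # Fisher transform of r
    se = 1.0 / math.sqrt(n - 3)    # standard error on the z scale
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)


N_CONTROL = 20  # assumption: 600 control trials / 30 trials per participant

for label, r in [("self-adaptors", 0.95), ("gestures", 0.51)]:
    low, high = pearson_ci(r, N_CONTROL)
    print(f"{label}: r = {r:.2f}, 95% CI [{low:.2f}, {high:.2f}]")
```

Under these assumptions the intervals agree closely with those reported; differences in the last decimal place can arise because the published coefficients are themselves rounded. The width of the interval for gestures also shows how imprecise a correlation of about 0.5 is with a sample of this size.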
4. Discussion
This study investigated whether self-adaptor behaviours influence lexical retrieval. The results revealed five main findings. First, among the three conditions (control, self-adaptor, and suppression), the self-adaptor condition produced the highest number of correct responses in the lexical retrieval task. A significant difference was observed between the self-adaptor and suppression conditions, suggesting that self-adaptor behaviour facilitates lexical retrieval more effectively than suppressing hand movements. Second, there was no significant main effect of condition on the likelihood of entering a TOT state, indicating that the self-adaptor condition did not affect susceptibility to TOT experiences. Third, no significant condition differences were found in the proportion of correct responses following TOT states. This suggests that self-adaptor behaviour did not enhance resolution once a TOT state had occurred. Fourth, there was no significant effect of condition on the number of items recalled in the recall task, indicating that self-adaptor behaviour did not influence delayed memory performance. Fifth, in the control condition, self-adaptor behaviours were observed more frequently than gestures. Notably, gesture frequency increased markedly during TOT states, suggesting a heightened reliance on gesture when lexical access becomes more difficult. Taken together, these findings support the interpretation that self-adaptor behaviour contributes to immediate lexical retrieval processes, likely by enhancing cognitive focus or alleviating retrieval interference. However, its influence does not appear to extend to memory consolidation or the resolution of TOT states.
4.1. Self-Adaptor Effects on Lexical Retrieval Accuracy
As shown in
Table 1, the self-adaptor condition outperformed the suppression condition in the number of correct responses in the lexical retrieval task. This supports Hypothesis 1 and suggests that self-adaptor behaviours facilitate lexical retrieval. The standard deviations for the control, self-adaptor, and suppression conditions were 6.30, 4.95, and 3.90, respectively. The lower variability in the suppression condition may reflect a more constrained behavioural repertoire under suppression instructions, potentially limiting participants’ ability to recruit compensatory strategies during retrieval. The advantage of the self-adaptor condition over the suppression condition is consistent with previous studies (
Fujii, 1997;
Pine et al., 2007), which suggest that suppressing self-adaptor behaviour impedes lexical retrieval.
The significant difference in correct responses, defined as the combined total of successful retrievals and TOT successes, between the self-adaptor and suppression conditions suggests that self-adaptor behaviours serve a functional role in supporting lexical retrieval. As mentioned previously,
Kısa et al. (
2021) pointed out that in the study by
Pine et al. (
2007), the most frequently observed behaviour was not gesture but self-adaptor activity. They argued that gesture may not be a consistently reliable factor in lexical retrieval processes. A similar pattern was observed in the present study, where self-adaptor behaviours occurred more frequently than gestures. This finding implies that engaging in self-adaptor behaviour, rather than relying on gesture, may help speakers allocate cognitive resources more effectively, thereby enhancing access to verbal information.
Moreover, a complementary interpretation is that self-adaptors support lexical retrieval by modulating attentional control mechanisms. Under time-constrained and cognitively demanding conditions, bodily behaviours such as self-adaptors may act as regulatory strategies to redirect attentional focus from external distractions to internal processing. This interpretation is supported by neurophysiological evidence indicating that spontaneous facial self-touches are associated with changes in EEG activity linked to emotion regulation and working memory maintenance (
Grunwald et al., 2014). Furthermore, the duration and location of such self-touches systematically vary with cognitive and emotional load, suggesting a functional role in the adaptive regulation of attentional and affective states (
Mueller et al., 2019). These findings reinforce the view that self-adaptors may not merely reflect affective states but function as strategic behaviours that promote cognitive stability and control in challenging tasks.
Why, then, did the present study fail to find a significant difference between the control condition and the suppression condition, as reported by
Fujii (
1997)? One possible explanation concerns the substantial variability observed within the control condition. As shown in
Table 1, the control group exhibited the highest standard deviation among the three conditions. This suggests that participants in the control condition differed widely in their prior knowledge of proverbs and four-character idioms. Such variation could have contributed to the broad distribution of correct response scores. Consequently, this increased variability may have obscured a statistically significant difference between the control and suppression conditions. In addition to variability in prior knowledge, the use of gestures and self-adaptors may also have contributed to individual differences in performance within the control condition. Participants who spontaneously engaged in these bodily movements may have been better able to access target words, particularly when experiencing tip-of-the-tongue states. Thus, it is plausible that both familiarity with the stimuli and the extent of gesture use jointly influenced lexical retrieval outcomes. This combination of factors could have increased variability within the control group and potentially obscured any between-group differences.
4.2. Frequency of TOT States Across Self-Adaptor Conditions
No significant main effect was found for self-adaptor condition on the number of TOT states experienced. This finding is consistent with previous studies. For example,
Frick-Horbury and Guttentag (
1998) conducted an experiment using TOT-inducing target words and compared a gesture-allowed condition with a gesture-suppressed condition. They reported no significant difference in the frequency of TOT occurrences between the two conditions. Similarly,
Pine et al. (
2007) found no difference in the number of TOT states between gesture-allowed and gesture-suppressed participants. The present results align with these earlier findings (
Frick-Horbury & Guttentag, 1998;
Pine et al., 2007), suggesting that self-adaptor behaviour does not influence the likelihood of entering a TOT state. One possible explanation is that the experience of a TOT state may arise prior to the onset of any bodily movement, such as self-adaptor behaviour or gesture. In other words, although gestures or self-adaptor actions may assist in resolving a TOT state once it has occurred, they are unlikely to influence its initiation. Therefore, self-adaptor behaviour does not appear to play a preventive or anticipatory role in the emergence of TOT experiences.
4.3. Self-Adaptor Behaviour and Resolution of TOT States
As shown in
Table 2, no significant main effect of self-adaptor condition (control, self-adaptor, suppression) was found on the proportion of correct responses following TOT states. In other words, self-adaptor behaviour did not enhance the likelihood of resolving a TOT state once it had occurred. These findings differ from those of
Pine et al. (
2007), who found that children were significantly more likely to resolve TOT states when gestures were permitted than when they were suppressed. Their experiment involved image-naming tasks with concrete objects such as “umbrella”, “saddle”, and “seahorse”, which possess clear and readily recognisable visual features. In contrast, the present task involved proverbs and four-character idioms, which are inherently abstract and lack direct sensorimotor grounding. This suggests that the effectiveness of gesture or bodily movement in supporting retrieval may depend on the semantic concreteness of the target item. Gestures may be more beneficial when attempting to retrieve lexical items that convey visual or spatial information. That said, it should be noted that some definitions used in the present task included imagery or references to physical entities, which complicates this explanation. Future research should systematically manipulate lexical concreteness to clarify this relationship.
Another important difference between the studies is the age of participants. While
Pine et al. (
2007) tested children with a mean age of 6.63 years, the present study involved adults aged 18 and above. Lexical retrieval strategies and cognitive support mechanisms may differ across developmental stages, and children may rely more heavily on embodied cues such as gesture or tactile engagement when retrieving difficult words. Finally, it is worth noting that although self-adaptor behaviour significantly increased the overall number of correct responses (successful retrievals plus TOT successes), no such effect was observed when considering only those responses that followed TOT states. This suggests that self-adaptor behaviour may support general lexical retrieval—possibly by maintaining attentional focus or reducing internal interference—but does not assist in resolving a TOT state once it has begun. These findings highlight possible developmental and task-related constraints on the role of self-adaptor behaviour during word-finding difficulties.
While self-adaptors did not significantly aid TOT resolution, their positive impact on general lexical retrieval invites further theoretical consideration. One possible explanation is that self-adaptor behaviours support verbal access by stabilising attention and reducing internal distractions—functioning much like “mental blinkers”. In contrast, representational gestures that convey semantic content may facilitate retrieval through activation of associated semantic networks (
Pine et al., 2007).
This functional distinction suggests that self-adaptors and gestures may support lexical retrieval via different cognitive pathways. Whereas gestures operate through meaning-based mechanisms, self-adaptors may act by regulating internal attentional states. Furthermore, prior research suggests that the utility of gesture in resolving TOT states may be modulated by individual differences in verbal memory capacity (
Pyers et al., 2021), a factor which could also shape the effectiveness of self-adaptor behaviours. Understanding these interactions may offer deeper insight into the cognitive mechanisms underpinning embodied lexical retrieval.
4.4. Recall Performance Across Self-Adaptor Conditions
No significant differences were observed in the number of items recalled across the control, self-adaptor, and suppression conditions. This finding replicates the results of
Fujii (
1997), who also reported no difference between the control and suppression conditions in the recall task. While self-adaptor behaviour had a significant effect in the lexical retrieval task, its influence did not extend to the recall task. As
Fujii (
1997) suggested, this may be attributed to the relatively low cognitive demands of the recall task. Because participants were asked to recall items that had already been presented, regardless of their initial response accuracy, the task likely required less effortful lexical access.
In such contexts, self-adaptor behaviour may not serve a meaningful function in enhancing cognitive focus, as the need to mobilise attentional or mnemonic resources is diminished. These findings suggest that self-adaptor behaviour is effective in tasks requiring active retrieval without prior exposure, such as lexical access. In contrast, in recall tasks where answers are already known and cognitive demands are reduced, self-adaptor behaviour appears to provide limited benefit. This indicates that the cognitive utility of self-adaptor actions may depend on task characteristics and is most evident when individuals must engage in effortful, internally driven retrieval.
4.5. Self-Adaptor and Gesture Patterns Observed in the Control Condition
Self-adaptor behaviour was found to occur more frequently than gesture across all response types, including both successful and unsuccessful retrievals as well as TOT and non-TOT states. Both types of behaviour were more common during TOT states than during non-TOT states. However, the most striking difference was a sharp increase in gesture frequency, specifically during TOT states. As reported in
Section 3.5, the average number of gestures per trial was 0.005 for successful retrievals, 0.02 for retrieval failures, 0.50 for TOT successes, and 0.35 for TOT failures. These results suggest that gestures are particularly recruited during moments of lexical difficulty. This pattern is consistent with
Pine et al. (
2007), who found that symbolic gestures significantly increased during TOT states, whereas beat gestures and self-adaptor actions did not. The present findings support the interpretation that representational gestures may facilitate lexical retrieval during TOT states, while self-adaptor behaviour appears to play a more indirect role.
Regarding the physical distribution of self-adaptor behaviour, the most frequently contacted areas were the hands and lower limbs, likely due to the seated posture adopted during the task. Participants often clasped their hands or rested them on their thighs. Other frequently contacted areas included the arms, torso, and face, especially the cheeks and jaw. Notably, there were considerable individual differences. Some participants rarely touched their faces but frequently contacted their arms or hands. Others consistently used self-adaptor actions involving the lower limbs while maintaining focused task engagement. This variability suggests that the functional role of self-adaptor behaviour may differ across individuals, depending on which body areas serve as more cognitively supportive.
Such variability may help explain inconsistencies across studies concerning the role of self-adaptor behaviour in lexical retrieval.
Ravizza (
2003), for example, demonstrated that even meaningless motor activity, such as tapping, improved TOT resolution compared to stillness. This implies that motor engagement itself may support word retrieval, regardless of its semantic content. Additionally, the bodily behaviours observed in the recall task often mirrored those in the retrieval task: participants who used facial self-adaptors during retrieval tended to repeat similar behaviours during recall. This recurrence suggests a potential role for such movements as embodied cues that aid memory reactivation, not only in immediate lexical access but also in subsequent recall.
4.6. Future Directions
Several limitations of the present study should be acknowledged, each pointing to directions for future research.
First, no procedure was implemented to reduce short-term memory effects prior to the recall task. Unlike
Fujii (
1997), who used a form-recognition task to mitigate such effects, the present design may have allowed short-term memory to influence recall performance. Future research should include a delay or unrelated filler task between the retrieval and recall phases to better isolate long-term lexical access.
Second, although a pilot study was conducted to select appropriate proverbs and four-character idioms, the ability of these items to reliably induce TOT states remains uncertain. Some items were too easy, while others were rarely answered correctly. Future studies should improve stimulus calibration by ensuring that pilot and main participants have comparable linguistic backgrounds.
Third, the instructed self-adaptor posture, placing both hands on the cheeks, may not fully reflect the most natural form of self-adaptor behaviour typically observed in spontaneous communication. This design choice was based on pilot observations and prior research (e.g.,
Fujii, 1997;
Harrigan, 1985;
Spille et al., 2021), which consistently identified the face, particularly the cheeks and jaw, as one of the most frequently contacted regions during verbal tasks. Although this standardisation improved procedural consistency across participants, it may have limited ecological validity. In naturalistic settings, individuals differ in their preferred self-adaptor locations, and alternative forms such as arm-touching may vary not only in frequency but also in cognitive function. Future studies should systematically compare different forms and locations of self-adaptor behaviour to determine whether their cognitive effects vary depending on the body region involved.
Fourth, a further limitation concerns the nature and familiarity of the verbal stimuli used in the present study. While all idioms and proverbs were familiar to participants upon recognition, four-character idioms are generally less frequently used in everyday communication, particularly among younger adults. As such, task difficulty may have been influenced by both the semantic opacity of these expressions and the relatively narrow age range of participants. Additionally, the use of abstract verbal stimuli may have limited the opportunity for gesture-supported retrieval. In contrast, previous research has employed more concrete, imageable words and reported facilitative effects of gesture (e.g.,
Pine et al., 2007). Future studies should examine age-related differences in idiom retrieval and directly compare imageable and non-imageable stimuli within participants. Such comparisons would help clarify how stimulus characteristics and participant demographics interact to influence the effectiveness of gesture and self-adaptor behaviours in lexical retrieval.
Fifth, while the current study focused on structured lexical retrieval tasks, future work could explore the role of self-adaptors in more naturalistic settings such as daily conversation or narrative production. These contexts may offer further insight into the spontaneous and functional use of self-adaptors in everyday communication, thereby enhancing our understanding of their cognitive and social significance.