In Scriptura Veritas? Exploring Measures for Identifying Increased Cognitive Load in Speaking and Writing

Abstract: This study aims to establish a methodological framework for investigating deception in both spoken and written language production. A foundational premise is that the production of deceitful narratives induces a heightened cognitive load that has a discernible influence on linguistic processes during real-time language production. This study includes meticulous analysis of spoken and written data from two participants who told truthful and deceitful narratives. Spoken processes were captured through audio recordings and subsequently transcribed, while written processes were recorded using keystroke logging, resulting in final texts and corresponding linear representations of the writing activity. By grounding our study in a linguistic approach for understanding cognitive load indicators in language production, we demonstrate how linguistic processes, such as text length, pauses, fluency, revisions, repetitions, and reformulations can be used to capture instances of deception in both speaking and writing. Additionally, our findings underscore that markers of cognitive load are likely to be more discernible and more automatically measured in the written modality. This suggests that the collection and examination of writing processes have substantial potential for forensic applications. By highlighting the efficacy of analyzing both spoken and written modalities, this study provides a versatile methodological framework for studying deception during language production, which significantly enriches the existing forensic toolkit.


Introduction
"To speak the truth is easy and pleasant" was Yeshua's answer when Pontius Pilate asked about his suspected treason towards the Roman Empire in Mikhail Bulgakov's The Master and Margarita. Perhaps it is easier to be truthful than it is to lie. However, being able to lie is an important skill to develop in life, for example, to respond politely when your mother-in-law asks you if you like her horrendous stew (Talwar 2019). Despite the social function of some lies, there are also instances when it is important to be able to tell if a person is lying or not, in our day-to-day lives, as well as in our legal system, where, for example, witness accounts need to be judged for their credibility in a safe and just way. While there have been attempts at creating a "lie detector", so far these efforts have not reached a reliable and safe conclusion and no single "symptom" of a lie has been identified that can be used for diagnosing a story as deception (Ofen et al. 2017; Mann 2019; Vrij et al. 2022).
Instead, discriminating a lie from the truth is typically contingent upon a comprehensive evaluation of multiple distinct verbal and nonverbal behavioral indicators, such as gaze cues, pulse rate, hand movements, and manifestations of nervousness, among others (Newman et al. 2003; DePaulo et al. 2003; Vrij et al. 2010; Granhag et al. 2015). The present methodological article sets out to contribute to the existing body of behavioral indicators.
We suggest that the examination of linguistic processes that occur during spoken and written language production, captured through audio and video recordings for speaking and keystroke logging for writing, may offer new possibilities in forensic linguistics. Recordings of spoken reports have previously been used in deception studies, for example, to identify verbal cues of deception (Vrij et al. 2022), and these often combine production cues such as hesitations with content cues such as the linguistic complexity of a message. Complementing these verbal cues with studies and methods examining both spoken and written real-time production will further the understanding of deception, especially with the use of the knowledge that can be gained from the existing body of linguistic research on language production in both speaking and writing.
As mentioned above, cues in speech are already one aspect examined in deception studies and detection, and deception during writing has also been studied (e.g., see Banerjee et al. 2014; Derrick et al. 2013). These studies have, however, not used the full potential and tools provided by the last decades' research in the area of cognitive writing processes (Lindgren and Sullivan 2019), which has developed methodologies and informed theories on how the writing process unfolds in real-time through the study of pauses and revisions, for example.
Knowledge about writing in itself is essential in modern society, where writing is a common and often necessary form of communication in a range of activities for a majority of the population (Brandt 2015). As such, while many reports in forensic settings are spoken, written reports are also becoming more and more common. For example, in a society marked by an escalated reliance on digital platforms for diverse purposes, individuals are increasingly prompted to report accidents, incidents, and criminal activities online. These reports find their outlets on platforms such as the websites of national law enforcement agencies or insurance providers. Furthermore, written narratives are employed by migration authorities and in the creation of ongoing military reports. Consequently, it becomes conceivable to use software designed for capturing real-time text production, similar to how spoken processes are recorded through audio and video. Recent advancements in web-based software now afford such capabilities. Several prototypes have been developed, with a notable example being the keystroke logging tool, CyWrite, which resembles a customized web-based text editor that has the capacity to comprehensively capture and later reconstruct the entire text production process (Chukharev-Hudilainen et al. 2019). The prospect of scrutinizing real-time writing processes for forensic purposes, paralleling the examination of real-time spoken processes, is intriguing. It is important to note, however, that this application is unlikely to function as prima facie proof of lying, or as reliable "lie detection". Nevertheless, used with caution, it holds the potential to identify circumstances warranting further attention during subsequent interrogations.
When addressing lying and deception in language production, it is imperative to acknowledge that behavioral indicators may be attributable to factors other than the veracity of the communicated information. For instance, it is easy to imagine that in an interrogation situation, most witnesses of a potential crime will feel an obligation to provide accurate information and that the gravity of the situation will bring on nervousness. The challenge lies in discerning such reactions from those exhibited by individuals who are anxious about the detection of a lie.
But what is a lie then? Lies can take on many different forms and can even be defined differently across languages and cultures, for example, with respect to whether the intention to lie or the factual truth is the defining feature of a lie (Coleman and Kay 1981; Nishimura 2018). A person can lie by omission, i.e., by leaving out important information, or they can lie by presupposition ("Has Bambi stopped hitting Thumper?" presupposing that Bambi has been hitting Thumper). In addition, different kinds of misleading information can be considered lies. In the scope of this article, lying is operationalized as a person knowingly giving false information with the intention of making the receiver of this information believe it to be true. Deception may manifest in both spoken and written modalities, impacting various linguistic processes, such as planning, conceptualizing, generating spoken or written text, monitoring, and editing, irrespective of the communicative mode employed (Flower and Hayes 1981; Levelt 1989).
The present study forms part of a larger project entitled "Based on a true story: How to differentiate between invented and self-experienced narratives through comparing linguistic processes in speaking and writing." The project is guided by the foundational assumption that heightened cognitive load during the process of language production, whether in spoken or written form, is likely to exert a discernible influence on linguistic processes (Goldman Eisler 1968; Matsuhashi 1981; McCutchen 1996). The overarching objective of this larger project is to examine how (and if) deception influences cognitive load during language production and to ascertain whether discerning deception is more realizable through an examination of writing processes as opposed to spoken processes. To achieve this goal, the initial step involves identifying and defining relevant phenomena that can effectively capture and measure heightened cognitive load during language production across these two different modalities.
The present study aims to accomplish this first step by establishing a versatile methodological framework, capable of identifying and quantifying phenomena linked to heightened cognitive load in both spoken and written language production. We believe such an approach holds the potential to significantly expand the existing forensic toolkit.

Deception and Cognitive Load
The relationship between deception and increased cognitive load has been the subject of extensive investigation in various studies. Cognitive load refers to the demand placed on working memory resources when solving immediate tasks (Baddeley 2007; Cowan 2010). The underlying assumption is that lying is a mentally demanding task, prompting suggestions that indicators of heightened cognitive load could be employed for deception detection. All forms of lie detection rely on individuals perceiving deception and identifying cues (either automatically or manually) that may suggest falsehood. A myriad of lie detection techniques has been proposed, particularly for use in interrogations and interviews (see Walczyk et al. 2013 for a comprehensive overview). These techniques may focus on attentional processes, aligning with the orienting response theory (Sokolov 1963), or delve into memory processes and inhibition, in line with the parallel task set model (Seymour 2001).
Several theoretical frameworks address deception and its relation to cognitive load. One is the four-factor theory of deception, advanced by Zuckerman et al. (1981), which posits that deception escalates cognitive load. Some theories include explanations as to why deception would increase cognitive load. For instance, the interpersonal deception theory (Buller and Burgoon 1996; Burgoon and Buller 2008) proposes that cues of deception stem from aspects of communication that remain "unmonitored" due to increased cognitive load. Another example is the self-presentation theory (DePaulo 1992), which outlines three cognitive phases governing behavior to appear truthful: intention to regulate behavior, intention translated into non-verbal behavior, and self-assessment of the behavior. Sporer and Schwandt (2006, 2007) introduced the Working Memory Model of Deception, which builds on Baddeley's (2007) working memory model and asserts that lying elevates cognitive load, potentially affecting speech production among other processes. Finally, the Activation-Decision-Construction Model (Walczyk et al. 2003, 2005, 2009) outlines a model for deceptive responses in the context of lie detection interviews, and this model has been expanded to account for repeated lies (Walczyk et al. 2009).
These theories and models have undergone scrutiny, as exemplified by a study conducted by Repke et al. (2018) that tested two models: one assuming that increased cognitive load during deception would reduce linguistic complexity and another assuming that the lie's goal would determine the complexity of the deception. The latter model received empirical support, indicating that liars can adjust the complexity of their falsehoods based on their objectives. Other studies propose content analysis to assess statement credibility, such as criteria-based content analysis (CBCA) (Vrij et al. 2000), developed to evaluate statements from individuals who have experienced abuse, as well as analyses of vividness and spontaneity in lies versus truths (Colwell et al. 2007). These investigations have revealed that the content of statements is, to some extent, influenced by whether a person is lying or telling the truth, and complementing this knowledge of the content with future knowledge about the process of deceiving would most likely be rewarding.
Similarly, Leins et al. (2012) explored the impact of different reporting modes on deception from a forensic perspective. They discovered that, when individuals were asked to recount the same event through spoken and pictorial modes, liars exhibited less consistency across the two modes compared to truth-tellers. Thus, while liars can tailor their lies to specific goals in a given context, transferring lies across different reporting modes appears more challenging. Further investigation into this phenomenon across various language modalities could be valuable. Additionally, Vrij et al. (2008) found that, when asked to narrate a series of events in reverse order, both verbal and nonverbal cues indicative of deception (e.g., filled pauses, hesitations, and leg movements) increased among liars. Moreover, they observed an improvement in lie detection accuracy among police officers in the reversed condition, surpassing chance levels.
Regarding heightened cognitive load during deception, reaction time studies have been a common approach. For instance, Duran et al. (2010) reported a significant increase in reaction time when participants were instructed to lie in response to simple yes/no questions, a pattern consistent with numerous other studies (e.g., see Suchotzki et al. 2017 for a review of studies measuring reaction time in relation to deception). Furthermore, Debey et al. (2012) found that, when given additional time to examine a stimulus after being instructed to lie, participants exhibited significantly longer reaction times when lying compared to responding truthfully.
Many studies investigating deception cues primarily focus on interview responses or spoken language. Apart from reaction time latency, some also assess speech rate and hesitations as verbal cues of deception. For instance, Vrij et al. (2008) examined speech rate (calculated as the number of words divided by the length of the answer) and found that liars had a slower speech rate than truth-tellers, along with more hesitations. A limited number of studies have specifically looked at deception related to aspects of speaking and writing. One example is the study by Goupil et al. (2021), who found that the speech signal itself may be perceived as more or less honest. A few recent examples include studying deception during written language production, with interesting findings, such as liars engaging in more revisions and producing shorter texts (Banerjee et al. 2014). Another result has demonstrated that, in synchronous chat settings, liars exhibit not only increased revisions and shorter texts but also longer response times (Derrick et al. 2013). The latter study also noted a significant age-related effect, with older participants displaying these behaviors more prominently than younger individuals. Finally, studies have demonstrated that writing processes can be disrupted due to background speech, especially regarding semantic aspects, something that may be relevant for inducing increased cognitive load in experimental settings (Sörqvist et al. 2012).
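The speech-rate measure mentioned above (number of words divided by the length of the answer) can be illustrated with a minimal sketch. The function name, the whitespace-based word count, and the choice of seconds as the time unit are our own assumptions for illustration and are not taken from Vrij et al. (2008).

```python
def speech_rate(transcript: str, duration_seconds: float) -> float:
    """Return words per second for one transcribed answer.

    Assumption: words are separated by whitespace, and filled pauses
    that appear in the transcript (e.g., "um") count as words.
    """
    if duration_seconds <= 0:
        raise ValueError("duration_seconds must be positive")
    word_count = len(transcript.split())
    return word_count / duration_seconds


# Hypothetical example: a five-word answer lasting two seconds.
rate = speech_rate("the man took the bag", 2.0)
print(rate)       # words per second
print(rate * 60)  # the same rate expressed as words per minute
```

Whether filled pauses and word fragments should count as words is a transcription decision that would need to be fixed before comparing truthful and deceitful accounts.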
In sum, the connection between cognitive load and deception is underpinned by theoretical models as well as empirical findings. The review of the field further highlights that the examination of behavior during the production of language may add insights into when and how deception occurs.

Language Production and Cognitive Load
The concept of working memory has also been influential in descriptions of language production. Across all kinds of language production, we depend on our working memory resources to perform tasks, such as planning what to say/write, actually expressing it, and evaluating the result of it (McCutchen 2000; Baddeley 2007). In studies on speaking and writing, working memory demands, or cognitive load, have been investigated through analyses of pauses and disfluencies (cf., e.g., Goldman Eisler 1968; Matsuhashi 1981; Spelman Miller 2006b). The underlying idea posits that, when too much information needs to be processed simultaneously, our limited working memory capacity becomes overloaded with information and additional time is needed to plan for spoken and written expressions. This often results in longer pauses and/or more frequent pausing (Spelman Miller 2006a), alongside an increased occurrence of other expressions of disfluencies in speaking, such as filled pauses, elongated words and word segments, and repeated words and expressions (Goldman Eisler 1968; Clark and Wasow 1998; Heldner and Edlund 2010), and in writing, such as deletion of word fragments, words, and expressions, additions, and substitutions, both locally and globally.
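As an illustration of how pausing can be quantified in the written modality, the following minimal sketch identifies pauses in a sequence of keystroke timestamps. The function name, the input format (one timestamp in seconds per keystroke), and the 2-second default threshold are assumptions made for this example; an appropriate threshold would have to be motivated for each study.

```python
from typing import List, Tuple


def find_pauses(timestamps: List[float], threshold: float = 2.0) -> List[Tuple[int, float]]:
    """Return (keystroke index, pause length) for every inter-key interval
    that is at least `threshold` seconds long.

    Assumption: `timestamps` holds one time value in seconds per keystroke,
    in chronological order.
    """
    pauses = []
    for i in range(1, len(timestamps)):
        interval = timestamps[i] - timestamps[i - 1]
        if interval >= threshold:
            pauses.append((i, interval))
    return pauses


# Hypothetical log: six keystrokes with two long intervals.
log = [0.0, 0.25, 0.5, 3.0, 3.25, 6.75]
pauses = find_pauses(log)
print(len(pauses))                  # pause frequency
print(sum(p[1] for p in pauses))    # total pause time in seconds
```

Pause frequency and total pause time from such a list could then be compared within a participant across conditions.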
Numerous factors may contribute to cognitive load during language production (for examples and overviews addressing different factors influencing cognitive load see Barkaoui 2019; Bourdin and Fayol 1994; Feng and Guo 2022; Johansson 2009; Kellogg 2008; Kellogg et al. 2016; Lively et al. 1993; Lourdes Ortega 2009; Manchón 2020; Song and Li 2020). Existing research in this area suggests that writers' and speakers' linguistic proficiency (including factors such as producing in one's first or second/third language, as well as overall grammatical and lexical knowledge), age, and education will influence fluency during language production. In addition, factors such as knowledge of genre, topic, and the amount of preparation, as well as grammatical complexity will have an impact. Finally, contextual factors such as sleep, hunger, general comfort, distracting factors in the current situation, etc., will sway the performance. In sum, differences in execution at the group level are expected; for example, first-language speakers are generally more fluent than second-language speakers, or increased age and education lead to more fluent speaking and writing (compared to children). Apart from this, overall between-subject differences and substantial within-subject differences can also be expected. That is, a person does not always pause, revise (in writing), or demonstrate disfluencies (in speaking) in a consistent way. The context will matter in this respect. For all these reasons, it is necessary to establish an individual baseline or include a control condition in all research using the assumed symptoms of cognitive load during language production (such as pausing and revision) as indications of, in this case, lying.
Thus, all other things being equal, it is assumed that, when a person engages in deceptive communication, an augmented cognitive load can be reflected in the language production processes. As mentioned, reaction time studies on deception have shown that both prompted and unprompted lies lead to an increase in response time during lies (Duran et al. 2010; Debey et al. 2012; Williams et al. 2013; Walczyk et al. 2013; Suchotzki et al. 2017; Bott and Williams 2019). Increased cognitive load during lying would be caused by the speakers/writers having to concurrently devise what to say/write and determine how to say/write it: organize the sequence of (in this case) narrative events, select an appropriate syntactic structure, and choose the lexical items. Additionally, the speakers/writers would have to actively maintain a mental representation of the truthful version of the events and continuously decide when and how they wish to deviate from this. All these factors add to working memory demands.

Speaking and Writing
Language production is an over-arching term for the modalities of speaking, writing, and using sign language (which will not be addressed here). Below, we outline some fundamental characteristics that apply to spoken and written discourses and that will influence the behavior of speakers/writers. Here, we disregard situations such as instant messaging or fast-written conversations, spoken conversations on the phone, or messages in delayed mode (i.e., voice mail and voice messages), where some of the characteristics of the modality will be less prominent. We use a contrastive focus and address the three themes: time, receiver, and permanence.

Time
The difference in production speed is an essential factor for understanding how speakers and writers distribute their resources during language production. The rate at which language can be produced in the two modalities is profoundly different; we speak much faster than we write. A common estimation is that speaking (in English) allows for a speed of 120 to 200 words per minute, corresponding to approximately 2-10 syllables or 8-15 phonemes per second (Crystal and House 1990; Schreiber and McMurray 2019), while a proficient typist would produce 38-40 words per minute (Hayes and Chenoweth 2006). The differences can mostly be attributed to the fact that speaking "only" requires the use of the vocal apparatus for expression, whereas writing requires the use of some artifact, e.g., pen and paper, keyboard, screen, etc. The writers' mastering of the artifact and often, the limitation of the artifact itself, introduce an intrinsic latency in the transformation from one's thoughts into linguistic expression (Grabowski 2008).
Finally, in the context of speaking, there are one or more listeners waiting for the delivery of the message and they can potentially interrupt. As a result, speakers will often experience time constraints and consequently use strategies that allow them to plan what to say while keeping the floor. This encompasses the use of filled pauses (eh, um) to indicate that more will come, to repeat and reformulate words and phrases, and to allocate silent pauses within syntactic units. These strategies are typically learned very early in life through numerous interactions and observations of spoken contexts, and speakers are very rarely aware of this behavior. The context for writing is normally different: even in stressful situations, writers will have comparably more time to think, generate text, and edit it before handing it over to a reader.

Receiver
Another substantial distinction between the modalities is the presence or absence of a receiver, or in other words, a listener or reader (Chafe and Danielewicz 1987; Chafe and Tannen 1987). Speakers often rely on the listeners' reactions to determine if, and when, more information is required (Levelt 1989; Barker et al. 2020). Conversely, writers must anticipate the readers' knowledge and needs and tailor the text accordingly. This inherent uncertainty can result in extensive revisions during and after text production (Flower and Hayes 1981; Hayes et al. 1987). However, these revisions and alterations are usually not visible to future readers, something that can be contrasted to the spoken context where listeners will be aware of all modifications made during speaking. Speaking is thus described as a dialogic activity, while writing is characterized as monologic (Linell 2009).

Permanence
The visibility and (relative) permanency of the written message in relation to the fugitive nature of the spoken message is yet another factor to consider. The fleeting spoken discourse necessitates repetition if the speakers need to reinforce certain points (Levelt 1989; Clark 1996). The listeners can also readily discern hesitations and repetitions, phenomena that help signal that the turn is ongoing and that the speakers want to keep the floor (Norrby 2014). Importantly, studies suggest that these disfluencies also facilitate the understanding of the spoken message (Clark 1996; Fox Tree 2001; Fox Tree and Schrock 2002). Conversely, when readers encounter a written text, it typically lacks visible traces of prior revisions.
The permanent condition of the written message is one contributing factor to the higher status of written language (cf. Chafe 1994). The visible language and the delay between the written production and the readers' reception also lead to a strong cultural expectation (reinforced by, e.g., the importance of writing skills posed by formal education and schooling) that the message should be edited and perfected before being handed over (cf. the different functions of spoken and written language outlined by, e.g., Biber 1988; Halliday 1985). The permanency is probably also a foundation for our view that written agreements and contracts are more reliable and binding than (undocumented) oral equivalents: it is more difficult to prove what was said than what was written. Writing thus includes an important component of understanding that one can, and is expected to, revise the written text before it is finished and that the revisions will be (largely) invisible to readers, especially for computer-written texts (Einarsson 1978). This belief would contribute to writers' revision behavior and how it is distributed throughout a writing session.
In summary, on the one hand, speaking is characterized by its quick, instantaneous, and synchronous nature, with (typically) present listeners who witness the entire overt language process and can actively contribute to the spoken message through oral and visual feedback (e.g., nodding or asking questions). On the other hand, writing typically unfolds more slowly, in isolation without readers present. The written message needs to be decontextualized and is read later and (often) in a different place, which, in turn, requires that writers anticipate what the future readers need to understand the context. The characteristics of spoken and written discourse will influence the type of processes that can be observed during language production in the two modalities.

Models of Language Production in Speaking and Writing
The characteristics of the spoken and written modalities outlined above have been addressed in theoretical models attempting to identify and sequence the different processes involved in speaking and writing. Overarching models, covering both speaking and writing, are hard to find (but see Cleland and Pickering 2006). Instead, we use some of the most seminal models for speaking and writing to establish a terminology for the most described processes during language production (note that the description is delimited to stage models). Models of speech production (e.g., Fromkin 1973; Levelt 1989) distinguish between different stages of production that unfold (somewhat) successively. In a simplified description of the process, it commences with conceptualization, where the content is decided, followed by sentence formation, where lexical decisions are made, and syntactic structuring, determining word order, and ultimately articulation. During articulation, the speaker is also engaged in constant monitoring of what they are saying, as well as being attuned to the listener's reaction (while at the same time moving on with the production of the next utterance).
Other theories of spoken production emphasize that the speech process is facilitated by certain mechanisms. For instance, Linell (2009) highlights that grammatical constructions used in speech have been internalized by the speaker through prior practice in various situations (cf. Clark 1996, who described how much is given in a conversation, for example, in question-answer constructions). Despite these facilitating mechanisms during speech production, the task of having to plan what to say while saying it can still be daunting.
Models of written production (e.g., Flower and Hayes 1981) distinguish between three main processes: planning, translating, and revision. Planning entails the formulation of ideas, text organization, and text generation (on a conceptual and linguistic level), while translating involves rendering these ideas into their orthographic form. Revision encompasses reading and evaluating the text to align with the writers' intended goals, and based on this, editing the text if necessary. These processes are iterative and recursive during the unfolding of text production; thus, one should understand the processes of planning, translation, and revision to be carried out at a local level within short time frames, although they could also be applied for understanding the writing process at a more global text level and on a long-term perspective.
Revisions in writing can take place at any point during the writing process, whenever the writers see fit. This flexibility means that revisions can occur at various locations within the text; writers may edit at the leading edge of the text they have produced so far or make changes at some other point in the text they have already written (Lindgren et al. 2019).
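The distinction between revisions at the leading edge and revisions elsewhere in the text can be made concrete with a small sketch. It assumes, hypothetically, that a keystroke log provides the cursor position and the current text length (both in characters) at the moment a revision starts; the function and label names are ours and do not correspond to any particular logging tool.

```python
def revision_location(cursor_position: int, text_length: int) -> str:
    """Classify a revision by where it starts in the text produced so far.

    Assumption: a cursor placed at the very end of the current text
    marks the leading edge; any earlier position counts as a change
    in already written text.
    """
    if not 0 <= cursor_position <= text_length:
        raise ValueError("cursor_position must lie within the text")
    if cursor_position == text_length:
        return "leading_edge"
    return "earlier_text"


# Hypothetical revisions in a text that is 120 characters long.
print(revision_location(120, 120))  # revision at the point of inscription
print(revision_location(40, 120))   # writer moved back into the text
```

A stricter operationalization might treat a short window before the end of the text as the leading edge; the exact boundary is a design choice.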
In sum, the theoretical frameworks for speaking and writing, despite differences in terminology, include some common components: pre-activities (conceptualization, sentence formation, and syntactic structuring in speech, and planning in writing), text generation (articulation and translating), and post-activities (monitoring and revision). Of these processes, it is only the articulation and translation components that are overt and observable, while the other processes need to be inferred or rely on self-reporting methods.

The Present Study
The review of previous research demonstrates that the examination of symptoms of cognitive load in different behaviors can be fruitful for identifying deceit and for forensic purposes. However, one conclusion is that there is no single action or expression that has been shown to be indicative of lying. Instead, co-occurrences of phenomena may be a way forward. For this reason, it is imperative to increase the forensic toolkit and expand the possibilities to evaluate witness statements and narrative reports. Given the research outlined above, the current study builds on, on the one hand, studies of deception and the effect of cognitive load, and on the other hand, the existing knowledge on language production processes and how demonstrations of increased cognitive load are expressed.
As mentioned in the introduction, the present study has a methodological aim, closely related to a larger research project, called "Based on a true story: How to differentiate between invented and self-experienced narratives through comparing linguistic processes in speaking and writing." The overarching objective of the project is to examine the impact of deception on cognitive load during language production and to investigate if this heightened cognitive load is more salient in writing than in speaking. The first step in reaching this goal is, as mentioned in the introduction, to identify relevant phenomena during language production that can effectively capture and measure heightened cognitive load across these modalities. Thus, the aim of the present study is to accomplish this initial step by establishing a versatile methodological framework capable of identifying and quantifying phenomena associated with heightened cognitive load in both spoken and written language production.
Four central assumptions underpin our approach to achieving this objective. First, acts of deception are mentally demanding, leading to increased cognitive load, which directly affects sensory and behavioral expressions (cf. Vrij et al. 2000; Walczyk et al. 2013), such as prolonged response latencies (e.g., Duran et al. 2010). Second, heightened cognitive load during language production, regardless of modality, will directly impact linguistic processes during real-time language production (Goldman Eisler 1968; Matsuhashi 1981; McCutchen 1996). Third, acts of deception will have a discernible influence on these linguistic processes (Banerjee et al. 2014; Derrick et al. 2013; Vrij et al. 2008). Fourth, language users are generally more familiar with real-time spoken interactions compared to the real-time process of writing. As a result, mimicking truthful spoken language production is likely easier than replicating the same in the context of writing processes.
The complete corpus of the larger project comprises experimental data in which, in total, 40 participants recounted events depicted in four specially tailored films portraying minor misdemeanors. In each data collection session, every participant produced four narratives based on four different films: one truthful spoken account, one deceitful spoken account, one truthful written account, and one deceitful written account. Participants returned three times for additional sessions, two weeks apart, and repeated all their narratives. Thus, the complete corpus consists of both truthful and deceitful narratives, totaling 320 written and 320 spoken accounts, collected from the 40 participants. In the truthful condition, participants were asked to retell the events as they unfolded, while in the deceitful condition, participants were asked to modify "who did it". To eliminate potential order effects arising from modality (writing > speaking or speaking > writing) or veracity (truthful > deceitful or deceitful > truthful), a Latin Square design was implemented during data collection. Consequently, all possible orders were equally distributed and counterbalanced across participants throughout the entire data corpus. This design allows for within-subject comparisons across tasks while controlling for potential idiosyncratic veracity and modality effects, and also for the examination of possibly generalizable patterns through between-subject comparisons. The study design further enables an exploration of the dynamics of cognitive load indicators throughout a narrative account and of whether specific sequences, particularly those associated with altered portions of the accounts, exhibit distinct patterns. The data within the larger project are extensive, necessitating a systematic approach to their exploration for the present study's purposes.
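The counterbalancing idea can be sketched in a few lines of code. This is only an illustration under stated assumptions, not the project's actual assignment procedure: it builds a simple cyclic 4 × 4 Latin square over the four task conditions and hands its rows to 40 participants in rotation. The condition labels and the function names are hypothetical. Note that a cyclic square balances the serial position of each condition but does not by itself balance which condition immediately precedes which.

```python
from collections import Counter

# Hypothetical labels for the four tasks in one session.
CONDITIONS = [
    "spoken-truthful",
    "spoken-deceitful",
    "written-truthful",
    "written-deceitful",
]

def cyclic_latin_square(items):
    """Row i is the item list rotated left by i positions."""
    n = len(items)
    return [[items[(row + col) % n] for col in range(n)] for row in range(n)]

def assign_orders(n_participants, square):
    """Give participant p the row p mod n, so all rows are used equally often."""
    return [square[p % len(square)] for p in range(n_participants)]

square = cyclic_latin_square(CONDITIONS)
orders = assign_orders(40, square)

# Count how often each condition appears in each serial position.
position_counts = Counter(
    (pos, cond) for order in orders for pos, cond in enumerate(order)
)

# Corpus size implied by the design described above:
# 40 participants x 4 sessions x 2 accounts per modality in each session.
accounts_per_modality = 40 * 4 * 2
```

With 40 participants and four rows, every condition occupies every serial position equally often, which is the property the counterbalancing is meant to secure.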
The present study uses a subset of data samples derived from the corpus collected within the larger project to qualitatively illustrate how language production unfolds in speaking and writing. The critical goal of this description is to identify phenomena that effectively discern and quantify heightened cognitive load, something that will inform the exploration of the corpus in the larger project.
To summarize, in the present study, we use a subset of data samples to qualitatively depict the unfolding of language production in both spoken and written contexts, with the aim of identifying observable phenomena that can effectively discern and quantify heightened cognitive load. It is important to note that our study is not an attempt to comprehensively demonstrate how deception affects cognitive load and the associated linguistic processes within spoken and written narratives. Instead, our primary focus is on the development of a methodological framework that would enable such examinations. To achieve this objective, we employ overarching theories related to cognitive load and deception, grounding our analysis in a linguistic approach to interpret real-time language production in both speaking and writing.
Specifically, this study is guided by the following research questions:
1. How can text length be measured to capture increased cognitive load during language production in speaking and writing, respectively?
2. How can pauses be defined and measured to reflect increased cognitive load during language production in speaking and writing, respectively?
3. How can changes in spoken and written text, such as repetitions and reformulations in speaking, and revisions in writing, be defined and measured to reflect increased cognitive load in speaking and writing, respectively?
4. In the assessment of spoken and written language production, how can we use these measures to capture fluctuation in cognitive load throughout the process?

Materials and Methods
This section describes the data that are used to exemplify the methodological discussion that constitutes the main part of this study. Note that, given the methodological scope of this study, the measures that are described and discussed in detail in the Results section are only briefly explained and addressed here.

Materials
The data used in this study were drawn from two participants from the larger project. These participants were randomly chosen with the purpose of providing examples that qualitatively illustrate the methodological aspects of analyzing and measuring behavioral phenomena in language production, which are outlined in the Results section below. In choosing the texts that provide these empirical examples, we selected two participants who had truthfully or deceitfully described the same films: one participant for spoken accounts and one for written accounts. This selection was made in order to limit the number of film events that are related in the examples and, in that way, hopefully make the actions and choices of the participants easier to follow.
The four selected texts from the two participants are illustrative of all the texts in the larger corpus. Note that the excerpts are used to exemplify the behavioral phenomena, and in that sense, they constitute empirical evidence. However, the four texts are not used for any conclusions on a more general level, neither for what is more common during speaking or writing nor for what characterizes deceitful or truthful accounts.
The texts from the two participants comprise spoken and written samples obtained during an experiment in which participants described events they had observed in four elicitation films depicting minor misdemeanors (e.g., cheating on an exam and putting pepper in a stranger's coffee mug). The data were collected online via the Zoom platform due to the constraints imposed by the COVID-19 pandemic, and the participants consequently made all recordings on their own computers and then transferred files to the researcher through a secure server repository. The two participants in the present study will be referred to using the pseudonyms Alfa and Bravo. Alfa and Bravo were native Swedish speakers with no known difficulties in reading, writing, or speaking, and they retold the narratives in Swedish. They had completed a minimum of one year of academic studies at the university level. They further demonstrated fairly good typing skills; thus, their transcription skills were not expected to interfere with other processes during text production (cf. Van Waes et al. 2021). All texts were written on a computer.
The sample included one deceitful and one truthful account from each participant. Alfa's truthful spoken account and Bravo's truthful written account describe the same film (The Garden Café), and likewise, Alfa's deceitful spoken account and Bravo's deceitful written account relate to another film (The Examination). In the deceitful accounts, Alfa and Bravo were instructed to alter the attribution of the culprit in the videos to simulate a fabricated eyewitness account. In effect, this task entails altering a specific portion of the events in the film, rather than the entirety of the film, although the participants were free to make changes where they saw fit. See the Supplementary Materials for the full transcripts of the spoken data, as well as the final and linear texts of the written data.
Thus, all four narrative text examples describe the events in one of two films, tailored for this experiment. These wordless films were all approximately three minutes long, and each film included three protagonists and depicted a misdemeanor where one protagonist was the culprit. In the deceitful condition, there was always an option of putting the blame on another protagonist. A short synopsis of these two films is given below. The first film, The Garden Café, starts with a girl sitting at a table engrossed in her reading, her bag placed on a chair beside her. In the foreground, there is a table with cakes and drinks as well as payment instructions so that guests can serve themselves. A second girl enters and looks around to see what the café has to offer. Shortly thereafter, a man arrives and, without apology, stumbles on the chair with the bag, and the bag falls from the chair, with its contents spilling out. The man then walks up to the tables and checks out what is offered at the café. He cuts in before the other girl has a chance to take what she wants and starts serving himself without paying. When his coffee is left unsupervised, the girl whose bag he knocked over takes the opportunity to add pepper to it with a pepper mill. She then takes her coffee and leaves the café, and soon after, the girl who he cut in front of also leaves. The man drinks the coffee and reacts to the strange taste by spitting it out. He is unaware of who or what caused this unexpected taste.
The second film, The Examination, shows a situation where a student cheats during an exam at the university. Two girls are sitting on opposite sides of an aisle in a classroom, and a male teacher is supervising them at the front. We see the backs of the girls: one in a yellow sweater on the left and one in a black sweater on the right. Seizing an opportunity when the teacher's coffee cup falls over and he is busy wiping it up, the girl in the black sweater pulls out a cheat note and looks at it. The girl in the yellow sweater notices what she is doing and observes her. When the exam is over, both girls hand in their exams, and when doing this, the girl with the black sweater accidentally drops her cheat note without noticing. The teacher subsequently finds it on the floor but remains unaware of its owner.

The Written Data: Data Collection Methods
The written data were collected by means of keystroke logging, using the software ScriptLog, version 196. Keystroke logging is a methodology that enables the study of real-time writing processes, and the software typically collects written data through the registration of keypresses and mouse movements during text production (Wengelin and Johansson 2023). ScriptLog records the writing process and then allows researchers to replay the writing session and extract information regarding where and how long writers pause, what, when, and how much was deleted from the text, and overall statistics of writing time and text length as the number of characters (letters, numbers, punctuation, and spaces) during a writing session. In addition, the software produces various output files. These include a final text, that is, a version of the finished written text in the way the writers intended it for the reader, and, as a contrast, a so-called linear text that illustrates the writers' step-by-step writing process, with pauses and revisions included.
Deciding what constitutes a pause during computer writing has proven to be difficult. Generally, pauses have been defined as "inactivity between keypresses" (Van Waes and Leijten 2015). This inactivity can be measured in milliseconds, meaning that it is detailed enough to measure the writer's transcription skills. However, pause measures are also restricted by the ability of the hardware (computer and keyboard) to capture and register keypresses (Johansson et al. 2023). Researchers interested in high-level cognitive processing (such as planning, reading, and revision) during writing have often made use of pause thresholds, which filter out the noise of low-level transcription processes (such as the transcription speed of the writer or processes such as remembering how to spell certain words) (Wengelin 2006). While an objectively defined pause threshold has yet to be established (see Barkaoui 2019), various ad hoc criteria have been proposed. For instance, 2 s has been established as an ad hoc threshold that will capture most high-level processes while disregarding pauses caused by motor activities (at least for adult writers who are good typists and lack reading and writing difficulties) (Wengelin 2006). However, in studies where smaller fluctuations in pause behavior may be important, it would be safer to postulate a lower threshold to avoid filtering out too many details of the writing process. For this reason, we have chosen the ad hoc criterion of a 1 s threshold. Finally, it should be noted that various pause criteria can easily be explored using the analysis options in keystroke logging programs such as ScriptLog and Inputlog (Leijten and Van Waes 2013).
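The effect of the threshold choice can be illustrated with a short sketch. It assumes that the keystroke log has already been reduced to a list of keypress timestamps in milliseconds; the function name, the inclusive comparison, and the timestamps themselves are invented for the example and do not reflect ScriptLog's or Inputlog's internal formats.

```python
def find_pauses(timestamps_ms, threshold_s=1.0):
    """Return (keypress_index, duration_s) for every inter-key interval
    that is at least threshold_s long.

    Whether the threshold is inclusive is an assumption of this sketch.
    """
    pauses = []
    for i in range(1, len(timestamps_ms)):
        interval_s = (timestamps_ms[i] - timestamps_ms[i - 1]) / 1000.0
        if interval_s >= threshold_s:
            pauses.append((i, interval_s))
    return pauses

# Invented keypress times in milliseconds (not taken from the study's data).
keypress_times = [0, 180, 350, 1600, 1750, 4100, 4250, 4400]

pauses_1s = find_pauses(keypress_times, threshold_s=1.0)
pauses_2s = find_pauses(keypress_times, threshold_s=2.0)
```

In this invented sequence, the 1 s criterion retains an interval that the 2 s criterion filters out, which is the kind of smaller fluctuation the lower threshold is meant to preserve.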
A comparison of these two outputs, the final text and the linear text, illustrates the various operations (and the amount of time and effort) writers engage in during the writing of a particular part of the text. Table 1 includes examples of the final text (in the upper part of the table) and its corresponding linear text (in the lower part). In the linear text, actions such as using the 'backspace key' are indicated within angular brackets, as are the occurrences of pauses. In this example, we have included pauses longer than 1 s. Note that, although the linear file provides rich information, it is not always well suited for understanding exactly what has been deleted or for comprehending the writers' movements within the text. For such purposes, other types of output can be acquired from keystroke logging software, for instance, revision analysis and possibilities to replay the writing session (see examples from Inputlog, Leijten and Van Waes 2013).
In Table 1, we contrast the final text with the linear text from Bravo's written truthful account to illustrate the type of data that the linear text can provide. As shown in the table, the final text starts by saying "När tjejen hade plockat ihop alla sina saker gick hon fram till kaffebordet och tog upp en pepparkvarn" (When the girl had collected all her things, she walked up to the coffee table and grabbed a pepper mill). From the linear text, however, it becomes evident that Bravo started out writing this part of the text differently. At the start of the linear text, there is a pause of 3.250 s. This is followed by the writing of "När kille" (when the gu), that is, starting by telling the events from the perspective of 'the guy'. But then, Bravo presses the backspace 10 consecutive times to delete what has just been written. After that, she writes the letter "J" and then immediately deletes it. Instead, she writes "Tjejen" (The girl), which is also deleted at once through seven presses on backspace. After this operation, Bravo writes "När tjejen hade plockat" (When the girl had collected), which corresponds to the solution in the final text.
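The relation between a linear text and the resulting final text can be made concrete with a small replay sketch. The token notation used here, `<3.250>` for a pause in seconds and `<BACKSPACE10>` for ten backspace presses, is a hypothetical stand-in and not ScriptLog's actual output format. The example string is loosely modelled on the sequence described above, with the spacing assumed so that the backspace counts balance.

```python
import re

# Hypothetical token notation: <3.250> is a pause in seconds,
# <BACKSPACE> is one backspace press, <BACKSPACE10> is ten presses.
TOKEN = re.compile(r"<(BACKSPACE(\d*)|\d+\.\d+)>")

def replay(linear):
    """Rebuild the text left on screen from a linear log and collect pauses."""
    text = []      # characters currently on screen
    pauses = []    # pause durations in seconds, in order of occurrence
    pos = 0
    for match in TOKEN.finditer(linear):
        text.extend(linear[pos:match.start()])       # characters typed so far
        if match.group(1).startswith("BACKSPACE"):
            presses = int(match.group(2) or 1)
            del text[max(0, len(text) - presses):]   # delete from the end
        else:
            pauses.append(float(match.group(1)))
        pos = match.end()
    text.extend(linear[pos:])
    return "".join(text), pauses

linear_example = (
    "<3.250>När kille <BACKSPACE10>J<BACKSPACE>"
    "Tjejen <BACKSPACE7>När tjejen hade plockat"
)
final_text, pause_list = replay(linear_example)
```

The sketch only handles deletion from the end of the text; cursor movements and insertions elsewhere in the text, which real logs also contain, are outside its scope.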
This contrasting example of the final and linear texts shows that writers often engage in many more activities during text production than what is visible or traceable from inspecting only the final text. Through the study of such actions, it is possible to explore how writers allocate their resources during writing, to gain general insights into where writers need to pause or revise, and to examine more specifically whether this need fluctuates with certain contexts or, in our case, particular sequences. In this study, we are interested in methods for exploring the linear files. Analyses of the final texts are not discussed here, but they can be investigated through corpus linguistic methods or discourse analysis. Future methodological avenues can further encompass comparisons between the linguistic properties of the final texts and the writing processes, as shown in linear files, which took place during the production of the final texts.
In the Results section, we will refer to translated excerpts in our examples in this study (for full versions of the linear and final texts, see the Supplementary Materials).
Table 1. Written data. Example of a written final text and linear text of a truthful account. The excerpt comes from the central segment of a text by Bravo and is a description of an event depicted in one of the elicitation films. The top section of the table shows the final text, such as it was when Bravo finished writing. The linear text is presented below, which shows the same part of the text but with all pauses and revisions, denoted within angular brackets. The English translation mirrors the syntax and structure of the original Swedish, and keeps any mistakes in the texts.

The Spoken Data: Data Collection Methods
The spoken data were collected through the software Audacity version 3.0.2 (Audacity 2021), and the audio files were subsequently transcribed using CHAT, an established transcription format for corpora (MacWhinney 2000). The purpose of the transcriptions was to annotate such disfluency phenomena that previous research has associated with planning processes and increased cognitive load: pauses, fillers (ehm), word fragments, self-corrections, and repeated words (cf. Clark and Wasow 1998).
First, the transcriptions include indications of repetitions of words and word fragments. The transcriptions were carried out using standard Swedish orthography (SAOL 2015) in line with the purpose of the research (Norrby 2014): we wanted the possibility to investigate the content of the message and had no purpose of analyzing it phonetically. However, deviations from standard orthography were made for some common function words. These may be pronounced more "written-like" in stressed contexts, which may be associated with more time for thinking (Johansson et al. 2001; Fox Tree and Clark 1997). More specifically, the conjunction "och" (and) can be pronounced either as a short /å/ or /och/ depending on the context. In Table 2, there are instances of "och" being pronounced as both /och/ and /å/. Further, the infinitive marker "att" (to) can be pronounced as /å/ or /att/. When participants have used lexical items typically associated with spoken varieties, this has been included in the transcription. This applies to Alfa's adding of a vowel to the word "här" (here) so that it becomes /hära/, making the word longer. It can also apply to forms such as "nån" (short for "någon", somebody) or "sån" (short for "sådan", such).
Second, the transcriptions contain annotations of filled and silent pauses. To define pauses, we used a common solution from the field of Conversation Analysis (CA), that is, to define a pause from the listener's perspective; in other words, a pause is defined as a perceived length of silence (following Sacks et al. 1974). However, we also adopted a minimal length of 200 ms. These silences are denoted by (.) following the CHAT format. Filled pauses, i.e., instances of the speaker filling the silence with a sound such as eh or um, are, according to the transcription standard, denoted by &-eh. For the purposes of this study, we used a common transcription standard for all eh-sounds and did not discriminate between variations in pronunciation.
Further, other disfluencies in speech, such as repetitions and reformulations, were denoted by square brackets and one to three forward slashes, according to the CHAT format. One forward slash, as can be seen towards the end of the excerpt in Table 2, denotes an exact repetition (in this case, "i hans [/] &-eh i hans"), and angular brackets denote which words are repeated (if more than one). Two forward slashes denote repetitions with small reformulations, such as at the beginning of the excerpt in Table 2, where Alfa says "då får [//] &-eh börjar" (then gets [//] &-eh starts), and three forward slashes denote larger reformulations, such as in the middle of the sample, where Alfa says "hon [///] &-eh när mannen är och &-eh plockar på ett annat bord så går hon" (she [///] &-eh when the man is and &-eh picking at another table then she goes). We will refer to translated excerpts in our examples in this study (see the Supplementary Materials for full versions of the transcripts of the spoken texts).
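Once a transcript carries these codes, the disfluency categories can be tallied mechanically. The following is a minimal sketch that covers only the codes mentioned in this section, not the full CHAT format; the category names are our own labels, and the sample string is stitched together from the fragments quoted above rather than taken verbatim from the transcript.

```python
import re

# Only the codes described in this section; category names are hypothetical.
MARKERS = {
    "silent_pause": r"\(\.\)",
    "filled_pause": r"&-eh",
    "exact_repetition": r"\[/\]",
    "small_reformulation": r"\[//\]",
    "large_reformulation": r"\[///\]",
}

def count_markers(transcript):
    """Count each annotated disfluency type in a CHAT-style transcript line."""
    return {
        name: len(re.findall(pattern, transcript))
        for name, pattern in MARKERS.items()
    }

# Stitched together from the fragments quoted above, for illustration only.
sample = (
    "då får [//] &-eh börjar hon [///] &-eh när mannen är och &-eh "
    "plockar på ett annat bord så går hon i hans [/] &-eh i hans"
)
counts = count_markers(sample)
```

Because each bracket pattern requires the closing bracket directly after its slashes, a [//] code is not also counted as a [/] code.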

Results
As is evident from the outline of spoken and written language production above, there are many similarities between the two modalities: both require that speakers/writers plan the content and form of the utterance, execute this as a linguistic expression, and evaluate the result against the plan. However, due to inherent differences between the conditions for the modalities, the signs of increased cognitive load will be manifested differently.
The Results section illustrates various incidents deemed suitable for exploring the distribution of cognitive load throughout narrative accounts and outlines methodological issues concerning the definitions and choices of measures used to capture cognitive load in previous language production research. The presentation of results is structured as follows: first, an examination of meaningful measurements for text length; second, an exploration of the definition and analysis of pauses; third, an investigation into aspects of revisions; and fourth, a proposition suggesting that so-called fluency measures, which integrate all these aspects, may effectively discern segments of increased cognitive load during language production. What is discussed here are thus data from spoken real-time discourse, captured by detailed transcripts, and linear written texts that illustrate the step-by-step actions that writers engage in during text production. However, this article is not concerned with measures for exploring static, final texts. All measures are exemplified and related through excerpts from Alfa and Bravo's accounts.

Measuring Text Length in Spoken and Written Text Production
Previous studies of truthful and deceitful accounts have highlighted the importance of considering the text length (Knapp et al. 1974; Newman et al. 2003; Derrick et al. 2013), and linguistic comparison of text length in different speakers/writers further emphasizes that individual differences (for example, age, education, and linguistic proficiency), genre differences, and spoken and written differences all influence the text length (cf. Biber 1988; Johansson 2009). Following this, a suitable starting point in describing cognitive load during text production, independent of modality, is to estimate the amount of language production, i.e., the quantity produced by speakers/writers. There are several reasons for this. One assumption posits that the ability to produce longer texts may reflect a less cognitively demanding production process, indicating reduced cognitive load. Another assumption suggests that longer texts may indicate more changes, additions, and explanations, potentially enhancing the credibility of a fabricated narrative (cf. Undeutsch 1989). Measuring text length also serves as a foundational baseline metric for other relevant measures (such as pauses and text changes; see below). Thus, measuring text length will generate an overview of how easy it may have been to re-tell the events of the narrative and will also perhaps render a rough estimation of how elaborate the story is.
The most commonly proposed unit for measuring text length in studies including both spoken and written data has been the word, which has also been one of the most important units for measuring text length in deception studies (e.g., see Vrij et al. 2008; Colwell et al. 2007; Derrick et al. 2013). This includes studies illustrating linguistic development (e.g., Berman and Verhoeven 2002) or broad approaches to genre differences (Biber 1988). Importantly, these studies have compared final written texts to transcripts of speaking, i.e., a product to a process (although the transcriptions in these studies can vary according to the degree to which they account for, e.g., pauses and repetitions). The reason for this choice is that writing processes have rarely been captured, making studies with process-level descriptions of spoken and written discourse sparse.
Text length in spoken and written discourse has also been compared based on syntax (that is, clauses, sentences, and sentence-like structures). In such comparisons, one major challenge has been that spoken language often lacks the written-like type of grammatical sentences. Therefore, a measure called the T-unit (Terminal Unit; Hunt 1970), which is defined as "[o]ne main clause plus any subordinate clause or non-clausal structure that is attached to or embedded in it," was introduced. It is a syntactic entity, and it is roughly equivalent to a "sentence" in written language. A T-unit is not only defined regarding its syntactic information, but it is also possible to use clues from intonation and discourse/thematic content. As such, it has proven useful for describing and understanding syntactic structure and grammatical development in speech and then comparing it to writing (see Berman and Verhoeven 2002; Johansson 2009; Scott 1988). An addition to the T-unit is the C-unit (Communication Unit), proposed by Loban (1976), which allows for utterances without clausal structure to be organized syntactically (see Johansson 2009, p. 93 for an expanded discussion).
However, while these measures have been used to compare spoken and written language, the comparison is based on dynamic speech and the final products of writing; thus, these measures share the same problems as using the word as a measure. There have nevertheless been attempts to apply T-units to the dynamic linear files from keystroke logging. One example is found in Bowen (2019), where some actions of revision were removed from the keystroke logging files to better fit the T-unit analysis. This may illustrate the difficulties in applying T-units to linear files. Finally, while the T-unit/C-unit has much in common with "the sentence", it is, in many cases, not equivalent to a graphical sentence that starts with a capital letter and ends with a full stop. The reason for this is that a written, graphical sentence may consist of several main clauses. Thus, investigations using T-units require manual coding of the data. Although it currently seems challenging to apply the notion of T-unit or C-unit to the real-time written data collected by keystroke logging, due to the manual work that is needed and the difficulty in identifying T-units in linear text files, it may, for some purposes, be fruitful to explore this measure in the future if one is looking for a rewarding way to compare spoken and written real-time data on a syntactic level. To summarize, while comparisons of text length from a syntactic perspective may prove rewarding, especially through the use of T-units, this is mainly a measure that is suitable for exploring final written texts.
From the discussion above, we can conclude that, in studying real-time data of spoken and written processes, it is complicated to use measures that are adapted for examining final texts. Let us, however, explore the options to use "the word" as a unit a bit further. The traditional definition of a word is a string of letters surrounded by spaces in printing, corresponding to a distinct unit with meaning; however, this definition is difficult to apply to multi-word units, such as "in spite of". Real-time data raise further questions. For example, in Table 3, the sample of the spoken truthful account of Alfa in the left column includes several occasions of filled pauses (&-eh). This raises the question of whether a filled pause should be included in a word count. Another question concerns the instances when Alfa rephrases the information about a bag that is turned over, as in "och välter ner (.) &-eh den hära tjejen som satt på en stol hennes väska &-eh välter omkull den" (turns over (.) &-eh this girl who sat on a chair her bag &-eh turns it over). Should both versions of how the bag was overturned be included in the word count? The important point is that there is no correct answer here; instead, the choice of inclusion or exclusion depends on the purpose of the study.
Regarding the sample of a written truthful account from Bravo, new challenges arise (see the bolded fragments in Table 3). Corrections of misspellings here result in word fragments. For instance, the word "välde" (turnet, a misspelled version of turned) involves three presses on the backspace to delete one space and the two last letters. The correct letters "te" are then immediately added to create "välte" (turned). How should we calculate words in this context? Should "välde" be considered one word and "te" another, should we treat "välde" and "välte" as two separate words, or should all be counted as one final, correct word, "välte"? From the perspective of studying deceitful narratives, one could argue that the correction of misspelled words is uninteresting, as such corrections merely contribute to the surface level of the final text. However, such corrections may influence the cognitive load, in that writers focusing on low-level orthographic processes have fewer cognitive resources available for other activities. Following this line of argument, it is essential to establish a method for including word fragments and alternative spellings, as they contribute to the understanding of how the writers' resources were distributed during their writing.
In summary, spoken texts include word repetitions, and a decision to count each repeated word (or phrase, or part of a phrase) only once leads to the risk of obscuring parts of the process. Issues also arise regarding whether "filled pauses" (e.g., um, eh) should be counted as words or not. Yet another issue is word fragments, which occasionally occur in speaking, that is, the instantiation of words where only a few speech sounds are included and when it is sometimes difficult to guess which word was intended.
In writing, on the other hand, fragments are frequent, and just as in speaking, the mapping of the fragments into words can pose challenges. The fragments may comprise just one or two letters, where it is impossible to comprehend whether they are signs of a false start or mistyping. However, often the fragments present alternatively spelled versions of the same word, and it is common for only a portion of the word to be deleted and rewritten. Should these instances be calculated as one word or two words (or more)? Furthermore, it is typical in writing to encounter letter combinations that arise accidentally or erroneously by pressing the wrong key. Finally, in writing, it is not uncommon for entire clauses, sentences, or paragraphs to be deleted, rewritten, or pasted and moved around within the text. Just as in spoken language, a decision must be made regarding whether to count the words and phrases once or multiple times.
When comparing the text length of written texts, for instance, across genres or between or within subjects, a proposition is to use the number of written characters (i.e., letters, numbers, punctuations, and spaces in writing) in the linear text (see example in Johansson 2009).Such an approach would account for fragments, words, phrases, and other parts of the texts, which may have been deleted and are thus invisible in the final text but which are nevertheless part of the work that writers have put into producing a text.
Spoken texts do not offer the same possibility to easily capture all phonemes.However, with a carefully conducted transcription of the spoken process, the result will not only be a written, linear reflection of the spoken process, but it can also serve to capture text length as the number of written characters in the transcriptions.Although this is not the same as a phonetic transcription that encompasses all phonemes, this approach will serve the purpose of estimating text length in relation to cognitive load and, importantly, enable a rough estimation of how truthful and deceitful accounts compare within and across modalities.
Table 4 illustrates the outcome of the different ways of measuring text length: number of words and number of characters. In this example, the computations are conducted using the excerpts in Table 3. Here, it is evident that variations exist in the word count of the written linear texts that are contingent upon the inclusion or exclusion of word fragments.
Hence, our recommendation for comprehensively understanding the text length of spoken and written accounts, while concurrently capturing the expression, repetition, or reformulation of small units, is to measure the number of characters in writing and the number of characters in the transcription of spoken accounts. In doing so, we propose that, for some purposes, it may be suitable to include filled pauses in speech to account for the fact that the speaker is uttering something, in contrast to being silent. In our methodological approach, a filled pause denoted as &-eh in the transcription would be computed as two characters (eh). Silent pauses in speaking would be excluded from the calculation, just like the pauses in writing.
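The character-based measure can be sketched as follows. This is an illustration under several assumptions rather than the procedure used in the study: the spoken transcript is assumed to use only the CHAT codes described earlier, the linear text is assumed to mark pauses and actions as tokens within angular brackets (a hypothetical notation such as `<2.809>` or `<BACKSPACE3>`), and spaces are counted after whitespace has been normalised. The function names are our own.

```python
import re

def spoken_char_count(transcript, include_filled_pauses=True):
    """Count characters in a CHAT-style transcript line.

    A filled pause (&-eh) counts as the two characters 'eh' when included;
    silent pauses (.) and repetition/reformulation codes are never counted.
    """
    text = re.sub(r"\(\.\)", " ", transcript)       # drop silent pauses
    text = re.sub(r"\[/{1,3}\]", " ", text)         # drop [/], [//], [///]
    text = re.sub(r"[<>]", "", text)                # drop scope brackets
    text = re.sub(r"&-eh", "eh" if include_filled_pauses else " ", text)
    text = " ".join(text.split())                   # normalise whitespace
    return len(text)

def written_char_count(linear):
    """Count every typed character in a linear text, deleted ones included.

    Pause and action tokens in angular brackets are removed first.
    """
    return len(re.sub(r"<[^<>]*>", "", linear))

with_fillers = spoken_char_count("i hans [/] &-eh i hans")
without_fillers = spoken_char_count(
    "i hans [/] &-eh i hans", include_filled_pauses=False
)
typed_chars = written_char_count("<1.500>abc<BACKSPACE2>de")
```

Counting on the linear text rather than the final text means that characters which were later deleted still contribute to the total, in line with the reasoning above that such material is part of the work invested in the text.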
Table 5 provides an overview of the number of characters, with filled pauses in speaking being annotated and counted as two characters (i.e., eh; see Section 2.2 for an elaboration of the transcription decisions).See the Supplementary Materials for the complete accounts for both Alfa and Bravo.
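The counting rule described above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the procedure used to produce Tables 4 and 5: the function name `text_length`, the regular expressions, and the decision to collapse the whitespace left behind by removed pause markers are all introduced here for illustration. The sketch assumes that filled pauses are written as &-eh, silent pauses as (.), and timed writing pauses as <2.809>, as in the excerpts; other transcription codes (e.g., repetition markers such as [/]) are not handled.

```python
import re

# Assumed notations, taken from the excerpts quoted in the text.
FILLED_PAUSE = re.compile(r"&-(\w+)")    # e.g. &-eh, counted as "eh" (two characters)
SILENT_PAUSE = re.compile(r"\(\.+\)")    # e.g. (.), excluded from the count
TIMED_PAUSE = re.compile(r"<\d+\.\d+>")  # e.g. <2.809> in linear writing files, excluded


def text_length(text: str, count_filled: bool = True) -> int:
    """Count characters (letters, digits, punctuation, spaces) in a transcript
    or linear text, optionally keeping the sound of filled pauses."""
    text = TIMED_PAUSE.sub("", text)
    text = SILENT_PAUSE.sub("", text)
    # Keep only the uttered part of a filled pause ("eh"), or drop it entirely.
    text = FILLED_PAUSE.sub(r"\1" if count_filled else "", text)
    # Assumption: removing a marker should not leave doubled spaces behind.
    text = re.sub(r"\s+", " ", text).strip()
    return len(text)


if __name__ == "__main__":
    sample = "hon &-eh (.) går"
    print(text_length(sample))                      # filled pause counted
    print(text_length(sample, count_filled=False))  # filled pause ignored
```

Exposing `count_filled` as a switch reflects the suggestion that including filled pauses is suitable for some purposes only, so both counts can be reported side by side.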

Table 4. Text length measures
Calculations of the number of words, characters, and pauses and the number of characters divided by silent, filled, and all pauses (this metric thus delineates the number of characters produced between pauses). The numerical values are derived from the excerpts outlined in Table 3. The initial column enumerates the variables under consideration, the second column delineates the calculations for Alfa's spoken excerpt, and the third column expounds the calculations for Bravo's written excerpt.

Defining and Measuring Pauses in Spoken and Written Text Production
Regarding the spoken account in Table 6, numerous examples of silent (.) and filled (&-eh) pauses are discernible in the transcription. Similarly, in the written account, examples of pauses (<2.809>) are evident in the linear representation of the writing process. As detailed earlier, pauses during language production are closely linked to moments of increased cognitive load. Consequently, the identification of pauses and their location and duration are highly relevant for our purposes. So, what is a pause? Pausing during writing (on a keyboard) is often defined as inactivity between two keypresses. However, it is debatable how long the inactivity should be for it to count as a pause. Technically, a pause can be defined as being as short as the hardware can register; there is generally a latency between the pressing of a key and its registration. In writing theories (e.g., Flower and Hayes 1981), longer pauses are strongly associated with high-level processes such as planning (see Torrance 2016; Torrance et al. 2016), while shorter pauses have, to a greater extent, been seen as indicative of low-level processes related to transcription, orthography, and spelling (Wengelin 2006). The literature does not propose strict cut-off points for when a pause is long or short; instead, this must be seen as relative to the particular task or circumstance. In general, writing researchers adopt ad hoc pause criteria based on the purpose of the study. Often, pauses of 2 s and longer have been proposed as indicating high-level processes (Wengelin 2006), while shorter pauses have been associated with low-level processes.
The variation of pauses between subjects has been acknowledged in many studies (see Spelman Miller 2006b; Lindgren et al. 2019) and has been explained through background factors (such as writing in a first or second language, education, practice in writing (including handwriting and/or typing skills), age, linguistic development, and reading and writing difficulties due to dyslexia or aphasia) and contextual factors (topic and genre knowledge, audience awareness, or occasional disturbance in the surroundings). Further attempts have been made to propose methods for establishing individual pause criteria, which would allow for a more reliable comparison between subjects. Proposals include correlating the individual pausing behavior with writing speed (Chenu et al. 2014) or relating it to the dynamic variation of keypresses across a writing session (Olive 2014). Thus, writing studies examining pauses must establish their own ad hoc criteria for how to define a pause and how to discriminate between long and short pauses, if that is relevant given the research questions, and they must use an experimental design that controls for within- and between-subject factors. To facilitate the analysis of pauses, we have used the ad hoc criterion of 1 s. This will allow us to capture pauses on a relative micro-level but avoid having to address pauses that may be primarily related to transcription skills (see Wengelin 2006).
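To make the role of the pause criterion concrete, the following Python sketch flags every interval between two consecutive keypresses that meets a given threshold. It is an illustration only: the function name `find_pauses`, the input format (a list of keypress timestamps in seconds), and the choice to treat an interval exactly equal to the threshold as a pause are assumptions, and the sketch does not reproduce the output of any particular keystroke logging program.

```python
from typing import List, Tuple


def find_pauses(key_times: List[float], threshold: float = 1.0) -> List[Tuple[int, float]]:
    """Return (index, duration) for every inter-key interval of at least
    `threshold` seconds. `index` is the position of the keypress that ends
    the pause. The default of 1.0 s mirrors the criterion used in the text."""
    pauses = []
    for i in range(1, len(key_times)):
        gap = key_times[i] - key_times[i - 1]
        if gap >= threshold:
            pauses.append((i, gap))
    return pauses


if __name__ == "__main__":
    # Hypothetical timestamps (seconds since the start of the session).
    times = [0.0, 0.2, 0.5, 3.5, 3.7, 5.0]
    print(len(find_pauses(times, threshold=1.0)))  # pauses at the 1 s criterion
    print(len(find_pauses(times, threshold=2.0)))  # pauses at the 2 s criterion
```

Running the same log through several thresholds, as in the usage example, is a simple way to see how strongly the pause count depends on the criterion chosen.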
Regarding speaking, the definition of a pause is equally tricky, not least because the issue of individual variation applies to this modality as well. When (silent) pauses are investigated in speech, there has been a general acknowledgement that the definition of a pause must be related to individual speaking rates. However, the speaking rate may vary between sessions and within the same session. A common solution in CA transcriptions is to include so-called perceived pauses (Sacks et al. 1974), which is how we operationalize it (while only including pauses with a minimal length of 200 ms).
In addition, a decision must be made on whether or not to treat filled and silent pauses equally. Do filled pauses (eh, um, etc.) serve the same purpose as silent pauses? Studies on conversation show that filled pauses are communicative and can help the speaker keep their turn, whereas silent pauses are not necessarily so (Clark 1996). However, while our data contain spoken monologues where "keeping the floor" should not be an issue for the speakers, there are still ample examples of filled pauses in the data. For instance, Table 4 shows that, of a total of 13 pauses in the spoken sample, 8 are filled. Occurrences of filled pauses in monologues should perhaps be interpreted along the lines that speakers have incorporated filled pauses as one of several planning strategies during speaking and that it is difficult to abandon this habit when invited to speak uninterrupted.
To sum up, pauses are seen as important indicators of speakers' and writers' increased cognitive load during language production. However, it can be difficult to define a pause; previous research has established a rough standard for each modality, and we have employed these standards in our analysis. Since there may be different preferences for using filled or silent pauses, which may vary between and within different accounts of speakers, we measured the number of silent pauses as well as the number of filled pauses. In addition, when appropriate, we advocate a measure where both types are included, for instance, to illustrate the overall number of pauses.
Once we have established the definition of pauses in each modality, we turn to measuring the number of pauses. One can assume (based on, e.g., findings from Goldman Eisler 1968; Heldner and Edlund 2010) that frequent pausing would be an indication of instances where the cognitive load is increased and where the speakers/writers need extra time to think about the linguistic expression. With our definition of pauses, it is relatively easy to calculate the number of pauses in a written or spoken account from the linear files in writing or the transcription of the spoken accounts. Since the text length and/or the amount of time dedicated to the accounts will differ, the number of pauses must be calculated relative to text length and/or writing/speaking time.
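The normalization mentioned above can be written out as a short Python sketch. The function name `pause_rates` and the two scales chosen here (pauses per 100 characters and pauses per minute) are assumptions made for illustration; the text only states that pause counts should be made relative to text length and/or time, and any other convenient scale would serve the same purpose.

```python
def pause_rates(n_pauses: int, n_chars: int, seconds: float) -> dict:
    """Express a raw pause count relative to text length and to time on task."""
    if n_chars <= 0 or seconds <= 0:
        raise ValueError("text length and time on task must be positive")
    return {
        "per_100_chars": 100 * n_pauses / n_chars,
        "per_minute": 60 * n_pauses / seconds,
    }


if __name__ == "__main__":
    # Hypothetical account: 10 pauses, 500 characters, 2 minutes on task.
    print(pause_rates(n_pauses=10, n_chars=500, seconds=120.0))
```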
Further, pause duration has been proposed as an important indicator of cognitive load, and longer pauses are often found preceding more linguistically complex constructions, e.g., subordinated clauses or complicated noun phrases, in both speaking (Goldman Eisler 1968) and writing (Nottbusch 2010). While there are tools that can identify silences, these will be rendered useless if there is any kind of background noise in the audio file, and filled pauses will not be captured by these tools either. Thus, in speaking, the calculation of pause length will need manual attention and consequently be very time-consuming. Written data collected through keystroke logging, by contrast, offer various options for calculating pause length that are readily and automatically accessible (Leijten and Van Waes 2013).
Pause location is a final component that is likely to be relevant for addressing cognitive load during language production. Pause location in connection with specific syntactic constructions or semantic information may reflect, on the one hand, difficulties in structuring the message, or, on the other hand, difficulties with finding lexical expressions that reflect what one needs to say (Matsuhashi 1981; Spelman Miller 2006a). These types of investigations may be rewarding in establishing which segments are particularly challenging for speakers/writers from a forensic perspective. Pause location can, to a certain degree, be annotated in the transcriptions with the use of speech technology tools, such as linguistic parsers that indicate parts of speech (there are a few parsers trained for Swedish, e.g., Qi et al. (2020), which could aid in this). However, one must expect that substantial manual handling is needed, not least because transcriptions with pauses and repetitions will make automatic analyses difficult.

Revisions and Reformulations in Spoken and Written Language Production
This section addresses, on the one hand, revisions in writing and how they may be expressed and studied, and, on the other hand, how reformulations and repetitions can be studied in speaking. We treat all these aspects as manifestations of changes in the linguistic message. According to the models of writing and speaking, such changes would occur by monitoring what has been previously produced and will happen if the speakers or writers, after such an evaluation, conclude that the previous text needs to be modified.
We start by outlining how the concept of revision has been described in writing. Its complexity is discussed in a seminal article by Faigley and Witte (1981), where they make the point that revision should not only be viewed as tidying up the text after the first draft. Instead, there is substantial evidence of it being a complex process that writers engage in, concurrent with planning new content and generating text. Therefore, reading or monitoring the text written so far is an important component (see Johansson et al. 2010; Wengelin et al. 2023). Changes in the written text can be made at any point during the composition of text: before the text has been transcribed, at the point of inscription (i.e., at the end, or the leading edge, of the text being produced), or at a previous point in the text (cf. Fitzgerald 1987; Lindgren et al. 2019).
There are undoubtedly revision processes of different kinds, and from a processing point of view, there can be so-called internal revisions, also referred to as pre-linguistic and pre-textual revisions (see Murray 1978), which will occur mentally and never be manifested or overtly expressed. External revisions, on the other hand, can be made at the point of inscription or in the previous text. Revisions can further be classified as surface revisions, i.e., language revisions (associated with formal changes), or deep revisions, i.e., content revisions (associated with semantic information) (Chanquoy 2009; Stevenson et al. 2006). The concept of internal revision can further be compared to the idea of text generation as part of the planning process in the model of Flower and Hayes (1981). It is important for the purpose of our exploration that some revision processes may not be overtly visible in the written data, but instead, to a certain extent, incorporated in pauses where the writer is planning the linguistic expression and trying out and rejecting possible solutions before settling on one decision. Existing literature provides many examples of different taxonomies for categorizing the types of revisions occurring in writing, where adding, deleting, and substituting content are the most agreed upon (for some examples, see Johansson et al. 2023).
Just like other linguistic processes, the acts of revision will fluctuate depending on the context, the task at hand, and the background of the writer. Here, age, education, linguistic proficiency (writing in the first or second language and grammatical and lexical knowledge), writing proficiency, knowledge of the topic and genre, and writing mode (for example, typing and handwriting) will influence how, what, and when revisions occur (for overviews, see Chanquoy 2009; Lindgren 2005).
Table 6 provides examples of Bravo's revisions in the deceitful written account, mostly consisting of surface revisions at the leading edge, where typos (i.e., errors occurring due to the pressing of the wrong key and not because of ignorance of orthography) are corrected. The erroneous 'd' at the end of mod is changed to give mot ('towards'); the initial letter combination std is immediately corrected and the word studenterna ('the students') is written; the misspelled word tröa is at once corrected to tröja ('sweater'). One change can be categorized as a content revision, where flick(orna) ('young girls') is changed mid-word to the (near) synonym tjejerna ('girls'). Similar surface revisions at the leading edge are found in Table 3, in the linear text of Bravo's truthful written account. Here, we see no examples that can be categorized as content revisions. The examples of revision in these excerpts thus show how the writers immediately attend to surface revisions (note that there are no pauses between the deleted written text, the use of backspace, and the added written text), which suggests a constant monitoring of what is being written. We have also seen an example of content or semantic revision, where another lexical choice for "the girls" was made.
The linear files of the writing sessions further allow for the study of other types of revision behavior: using the arrow keys or mouse to move around in the text. Such movements may or may not be followed by a backspace (for deleting text) or the addition of text to previously written parts. Writers can also highlight parts of the text by using the mouse and click and drag functions, or by using combinations of keys (shift + alt and arrow keys). Once highlighted, the text can be deleted, moved, copied and pasted, or overwritten if writers type over the highlighted text with new text. Consequently, a lot of text can be deleted or moved with very few keypresses or mouse movements. Therefore, it can be relevant to account for the number of editing operations that take place independent of how much text is being removed or added in each operation. These types of editing operations are unique to keyboard writing, but the same concept can be adapted for speaking if reformulations or self-repairs are included in calculations. In Table 5, the total number of editing operations for the complete sample files (found among the Supplementary Materials) is included. The numbers in the table further illustrate how common editing operations are in writing, compared to reformulations in speaking.
The full writing session of the truthful account by Bravo can provide an illustration of what it can look like when a revision is made away from the leading edge, in the previously written text. By the end of the final text of Bravo's truthful account (see the Supplementary Materials), she uses the mouse to move the cursor to a spot preceding the last written sentence. There, she adds a sentence. Table 1 shows the linear file of this sequence, where the indication of <MOUSECLICK> is seen on the last line, followed by a pause of 2.931 s, and then, the sentence fragment that was added (in boldface in Table 1): "Efter det gick ut därifrån och" (After that went out and). Note that this sentence lacks a subject, possibly the pronoun "she", and that the last word "och" was immediately deleted. An illustration of what it looks like is found in Figure 1, where we see two screenshots from the real-time replay of the writing session: the first one just before the mouse click and the second one immediately after the first word of the new sentence has been written ("Efter", After). In the figure, the red circles show the placement of the cursor in the two examples.
As mentioned above, revisions can occur far from the inscription point, that is, when the writer uses the mouse or the arrow keys to move the cursor away from the inscription point to change something that has already been written (Lindgren et al. 2019). This means that writers can add, delete, or change the previously written text at any point during the writing session, anywhere in the text. For example, a writer may add an initial paragraph of the text as the last part of the writing process, change a description of a protagonist, or delete a chain of events. In the final text, there will be no trace of this (see Wengelin et al. 2023 for examples of this execution in advanced writers). However, examining when writers decide to make changes in their previously written texts offers new perspectives for the understanding of how the message is constructed and can give insights into how deception is built. One example of what revisions may look like when they occur away from the leading edge is shown in Figure 1, where the writer Bravo has finished a sentence ("Den yngre tjejen såg detta och gick snabbt iväg från cafét", The younger girl saw this and quickly departed from the café) and then moves the cursor to before this sentence to add a new sentence ("Efter det gick [de] ut därifrån", After that they left).
These kinds of revisions do not have an obvious equivalent in speaking, which can probably be attributed to the necessity for immediate changes in speech. Using the terminology from revision in writing, one can say that changes during speaking will occur in a linear fashion and always at the leading edge. It is undoubtedly an option for speakers to address something that was said further back in the spoken message and to draw attention to what they need to change, or to add information to what was previously stated. However, speakers can never "move away" from the leading edge of their spoken account.
The phenomena of "revision" in speech are typically referred to as disfluencies in the literature (see Clark and Wasow 1998). This term covers filled and silent pauses, prolongations, repetitions of words and utterances, as well as reformulations. The psycholinguistic view on disfluencies is expressed in Goldman Eisler's (1968) seminal work that connects increased number and duration of pauses and other signs of disfluencies with increased linguistic complexity (especially regarding syntactic complexity at the clause or phrase level). Similar views are shared by Clark (1996) and Levelt (1989) (see also Eklund 2004 for an overview, with a phonetic focus on disfluencies in speech). Here, we will mainly be concerned with repetitions and reformulations since they, just like revisions in writing, serve the purpose of being overt changes to the linguistic message.
A common example of disfluencies in speaking is to repeat one or more words occurring at the start of a clause verbatim, a strategy that is often associated with planning (Clark and Wasow 1998). Table 2 shows an illustration of this from Alfa's spoken truthful account (verbatim repetition in boldface). She says "när mannen är och &-eh plockar på ett annat bord så går hon fram å häller ner peppar <i hans> [/] &-eh i hans mugg" (when the guy is and &-eh picking at another table then she walks up and pours pepper <in his> [/] &-eh in his mug). Here, the repetition (in his) occurs at the end of the clause, where it precedes the noun "mug". Note that in connection with the repetition, we also find a filled pause (&-eh).
Table 6 illustrates parts of the spoken deceitful account from Alfa, which contains numerous examples of reformulations. She says "personen till höger" (the person on the right), which is followed by a silent pause, a filled pause (&-eh), and another silent pause. She then says "den här personen till höger vill" (this person on the right wants), and then, the last verb ("vill") is changed to "försöker" (tries). Thus, taken together, this is a sequence of self-repair consisting of a series of reformulations of what is pretty much the same content. Just a little bit further on, Alfa has another sequence of reformulations: "tanken är att" (the idea is to), which is followed by a pause and the fragment "eller hon" (or she), which, again, is abandoned for the clause "man ser att hon försöker" (one sees that she tries). First, these kinds of sequences of reformulations are particularly interesting to study because they highlight a circumstance or event that the participant finds difficult to express in words. Second, they constitute a noteworthy example of how the strategy of "talking around" a subject allows more time to think while at the same time ensuring no interruptions from listeners, a purpose often attributed to filled pauses. Finally, this example illustrates a sequence of (extensive) consecutive revisions. For our purposes, such sequences are intriguing in both modalities as they have the potential to reveal particularly challenging portions of the narrative accounts.
These instances of verbatim repetition and consecutive reformulations during speech demonstrate that speakers often make repeated attempts to find the right expression, with the rephrasing frequently involving multiple words. Notably, the observed changes in our examples appear to be more closely associated with linguistic content, specifically lexical choices, than with linguistic form.
For our objectives, it is pertinent to explore methods of quantifying revisions, repetitions, and reformulations, as a cumulative display of such occurrences may indicate disturbances in the planning processes due to heightened cognitive load. One approach, used in writing studies employing keystroke logging technology, involves subtracting the final text's character count from the character count in the linear files (see example in Gärdenfors and Johansson 2023). The difference is the number of deleted characters, which can be expressed as a proportion of the text that was deleted (see Table 5 for an example from our data). Another option is to calculate how many editing operations there are (i.e., the number of occasions something was deleted, independent of how much text was deleted each time, cf. Johansson 2000). Both options offer a quantitative approach to capturing how frequently revision occurs.
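The two revision measures just described can be sketched in Python as follows. This is a hedged illustration rather than the computation behind Table 5: the function names, the event format (one string per keypress, with "BACKSPACE" and "DELETE" as hypothetical labels for deletion keys), and the decision to divide the number of deleted characters by the linear character count are assumptions. The sketch treats an unbroken run of deletion keypresses as one editing operation and does not handle mouse selections or cursor movements.

```python
from typing import Iterable, Sequence


def deleted_proportion(linear_chars: int, final_chars: int) -> float:
    """Share of the produced characters that do not survive into the final text."""
    if linear_chars <= 0:
        raise ValueError("linear text must contain at least one character")
    return (linear_chars - final_chars) / linear_chars


def count_editing_operations(
    events: Iterable[str],
    delete_keys: Sequence[str] = ("BACKSPACE", "DELETE"),  # hypothetical labels
) -> int:
    """Count deletion occasions: each unbroken run of deletion keypresses
    counts once, regardless of how many characters it removes."""
    operations = 0
    in_run = False
    for event in events:
        if event in delete_keys:
            if not in_run:
                operations += 1
            in_run = True
        else:
            in_run = False
    return operations


if __name__ == "__main__":
    # Two typo corrections modelled on the excerpts: "mod" -> "mot", "tröa" -> "tröja".
    events = list("mod") + ["BACKSPACE"] + list("t ") + list("tröa") + ["BACKSPACE"] + list("ja")
    typed = sum(1 for e in events if e not in ("BACKSPACE", "DELETE"))
    print(count_editing_operations(events))
    print(deleted_proportion(linear_chars=typed, final_chars=len("mot tröja")))
```

Reporting both values together is useful because a single operation on highlighted text can remove many characters, so the two measures need not move in step.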
In spoken language, the concept of "deleted text" becomes irrelevant as all utterances, whether rephrased or not, are overtly expressed. However, quantifying and accounting for the number of repetitions and reformulations provides an overview of how frequently speakers rephrase themselves.
An additional potentially valuable approach to investigating changes would be to annotate their location or context and/or categorize the nature of the revision. This could shed light on the causes of increased cognitive load, following the insights of Goldman Eisler (1968), and reveal whether the linguistic expression leading up to deceitful information is more prone to revision or if the deceitful information itself is the focus. However, it is important to note that such annotations necessitate manual execution, making it a time-consuming task.

Fluency and Disfluency in Spoken and Written Language Production
We have touched upon the idea that accumulated signs of cognitive load may be of relevance for our purposes, that is, where pauses and/or changes occur together or within a short time frame. In addressing this issue, we turn to the concept of fluency-disfluency. For speaking, fluency has been an important concept for estimating how easily speakers carry out different oral tasks. Many examples concern proficiency in second-language learning (e.g., Jong 2016) or fluency in relation to disturbances during speaking, for example, stuttering (e.g., Alm 2011). In the study of spontaneous speech (whether from a cognitive approach or CA perspective), it is also contrasted with the notion of disfluency, which would be viewed as unwanted disturbances during speaking (Clark and Wasow 1998; Eklund 2004; Norrby 2014).
In writing, fluency was brought to the forefront by Chenoweth and Hayes (2001) as a way to shed light on linguistic proficiency (often from an L2 perspective, see examples in Manchón and Roca de Larios 2023) and writing competence. Fluency during writing will typically be captured by dividing the number of linguistic units (words or written characters) by the time unit (seconds, minutes, or the whole time on task/total writing time) (see Kaufer et al. 1986 for early examples). From a processing perspective, fluency is often measured through "bursts", that is, the number of words or the number of typed characters between pauses (P-bursts) or between revisions (R-bursts) (see Alves and Limpo 2015 for a comprehensive overview of bursts in writing). Increased fluency will occur when writers have few and/or very short pauses and few revisions. Keystroke logging software, especially the widely used Inputlog (Leijten and Van Waes 2013), can provide automatic output with a variety of different bursts and applied pause criteria. Such output can show the mean length of P-bursts in a text, or, in other words, the average number of characters that are written between pauses. According to the hypothesis, the P-bursts will be longer if writers produce new text with ease.
Here, we can refer to Table 4, where the number of pauses in speaking and writing is presented across modalities. The number of characters is divided by the number of pauses. This would be an example of the mean length of a P-burst. For the spoken account, we have included several comparable measures: one measure where characters have been divided by the number of silent pauses (47.4), one that divides them by the number of filled pauses (29.63), and one that includes all pauses (18.23). The different results illustrate that the definition of pauses is important for the outcome. For the written account, we only included one measure: number of characters per pause longer than 1 s. Note that, with a different pause criterion, the number of characters per pause would also change. Determining which pause criterion to choose and whether to include or exclude filled pauses will depend on the research questions that are posed, but it may also be valuable to explore various options before deciding on the definition in a particular study.
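The three spoken-account values quoted above can be reproduced with a short Python sketch. One input is an assumption: the text reports 13 pauses in the spoken sample, 8 of them filled (hence 5 silent), but the character total is given only in Table 4; the value of 237 used below is back-calculated as the only whole number consistent with all three reported ratios. The function name `chars_per_pause` is likewise introduced here for illustration.

```python
def chars_per_pause(n_chars: int, n_pauses: int) -> float:
    """Average number of characters produced per pause (a simple estimate
    of mean P-burst length)."""
    if n_pauses == 0:
        raise ValueError("no pauses: burst length is undefined")
    return n_chars / n_pauses


if __name__ == "__main__":
    N_CHARS = 237          # assumption: back-calculated from the reported ratios
    SILENT, FILLED = 5, 8  # 13 pauses in total, 8 of them filled

    print(chars_per_pause(N_CHARS, SILENT))           # 47.4
    print(chars_per_pause(N_CHARS, FILLED))           # 29.625 (reported as 29.63)
    print(chars_per_pause(N_CHARS, SILENT + FILLED))  # approx. 18.23
```

The same character count thus yields three quite different burst lengths, which is the point made in the text: the outcome depends on which pauses are counted.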
The fluency approach thus requires measuring the total time on task in the written and spoken tasks. At the overall text level, this would mean that, for the written task, the keystroke logging software offers automatic output, while the spoken task requires some, but limited, manual attention. The number of written characters or the number of characters in the transcripts of the spoken accounts can then be divided by the time on task. We advocate using the linear text files for this type of calculation. In examining the results of such calculations, a few effects can occur: if speakers/writers have long and/or many pauses, there will be, on average, fewer characters written per second. However, if speakers/writers engage in many changes (revisions, repetitions, and reformulations), the effect may be more characters written/spoken per second. It may be the case that writers who have longer pauses also revise more, but not necessarily so. Given the previous studies we have repeatedly referred to above, it is reasonable to expect fluctuation in where pauses and changes occur during the unfolding of both spoken and written language production.
We have already concluded that it is difficult to automatically identify and isolate pauses in speaking due to potential background noise in the recording and the existence of filled pauses; consequently, we have ruled out measuring pause duration in speaking as a cost-effective way to approach our goals. However, to account for the fluctuation in fluency during language production, we suggest another approach, that is, dividing the spoken and written texts into different segments. Given our experimental design, where we have identified which portions of the events in the elicitation films should be altered by the participants in their deceitful accounts, we propose a threefold division: before the lying event, during the lying event, and after the lying event. This will be a way to operationalize the variation in fluency during different sequences of the narrative accounts and serve as an initial but potentially rewarding attempt that can serve our purposes.
Similar approaches were applied to written data in a study by Johansson (2009), but in that case, the writing time was divided into five equally long segments, and the proportion of pause time was then measured in each segment. The results demonstrated different pause time distributions throughout the writing of narrative and expository genres. For our data, a more time-consuming but perhaps fruitful approach would be to divide the speaking and writing into fixed time segments, or 20% divisions of the time on task, and then explore the proportion of pause time and/or changes in each segment.
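A segment-based analysis of this kind can be sketched in Python as follows. The sketch is an illustration under stated assumptions, not the procedure of Johansson (2009): the function name `pause_proportion_by_segment` and the input format (pauses given as start and end times in seconds) are introduced here, and a pause that straddles a segment boundary is split between the two segments in proportion to its overlap.

```python
from typing import List, Tuple


def pause_proportion_by_segment(
    pauses: List[Tuple[float, float]],
    total_time: float,
    n_segments: int = 5,
) -> List[float]:
    """Split the time on task into equally long segments and return, for each
    segment, the share of its duration that was spent pausing."""
    if total_time <= 0 or n_segments <= 0:
        raise ValueError("total_time and n_segments must be positive")
    seg_len = total_time / n_segments
    proportions = []
    for k in range(n_segments):
        seg_start, seg_end = k * seg_len, (k + 1) * seg_len
        # Sum only the part of each pause that falls inside this segment.
        paused = sum(
            max(0.0, min(end, seg_end) - max(start, seg_start))
            for start, end in pauses
        )
        proportions.append(paused / seg_len)
    return proportions


if __name__ == "__main__":
    # Hypothetical session of 100 s with two pauses, one crossing a boundary.
    print(pause_proportion_by_segment([(10.0, 20.0), (35.0, 45.0)], total_time=100.0))
```

The same overlap logic would also work for the threefold division (before, during, and after the lying event) if the equal segment boundaries were replaced by the annotated start and end of the lying event.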
Finally, an initial quantitative and cost-effective approach to identifying sequences with accumulated signs of cognitive load can later be combined with more qualitative inspections and annotations. For our purposes, measuring fluency may thus provide an approach that combines text length, pauses, and changes, all outlined above. The advantage of looking at fluency is that it is a proportional measure and thus more suitable for comparing accounts across participants with different text lengths, and consequently, across modalities and deceitful-truthful conditions.

Discussion
This study had a methodological aim: to explore and identify phenomena indicative of increased cognitive load within language production in both speaking and writing. Our objective was to discern and quantify heightened cognitive load effectively, with the aim of establishing a methodological framework capable of using these effects to identify deceitful narration, which can be an indication of lying. Drawing inspiration from previous research on cognitive load measures in language production, our study specifically concentrated on indicators within the speaking and writing processes, such as text length, pauses, changes (revisions, repetitions, and reformulations), and fluency.

Measuring Text Length
The first research question revolved around the methodology for quantifying cognitive load by measuring text length. The examination of text length is pertinent, grounded in the assumption that longer texts reflect ease in both speaking and writing. This characteristic may serve as an indicator for both truthful and deceitful accounts. Moreover, the augmentation of text can signify an individual's effort to enhance the persuasiveness of a narrative. Our deduction from this exploration is that, although words are commonly used as a metric for text length, such a measure may obscure instances of word fragments and revisions in written content.
In speech, it is equally imperative to acknowledge the difficulty in operationalizing repetition and rephrasing of words and phrases. To derive a comprehensive measure for text length that incorporates the entirety of overt linguistic production by speakers and writers, we propose using the character count (including letters, numbers, punctuation signs, and spaces) in written text. This approach encompasses all textual elements produced, irrespective of linguistic unit. For spoken language, an equivalent measure can be achieved by calculating the character count in the transcription of spoken accounts. This methodological choice facilitates the inclusion of word fragments, false starts, repetitions, and reformulations. Additionally, if necessary, the annotations of filled pauses can be incorporated into the calculation.
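The character-count measure can be sketched as follows for a transcription that uses the annotation conventions shown in Table 2. This is an illustration under stated assumptions: the function name is hypothetical, and the choices to strip the annotation codes ((.), [/], [//], [///], angular brackets) before counting and to collapse runs of whitespace are ours, since the text does not specify how codes are treated.

```python
import re

SILENT_PAUSE = re.compile(r"\(\.\)")     # (.) silent pause code
REPAIR_CODE = re.compile(r"\[/{1,3}\]")  # [/], [//], [///] codes
FILLED_PAUSE = re.compile(r"&-\w+")      # &-eh and similar fillers


def production_length(transcript, include_filled_pauses=True):
    """Character count (letters, digits, punctuation, spaces) of a text.

    Assumption: annotation codes are removed before counting, and filled
    pauses are counted only if `include_filled_pauses` is True.
    """
    text = SILENT_PAUSE.sub(" ", transcript)
    text = REPAIR_CODE.sub(" ", text)
    text = text.replace("<", "").replace(">", "")
    if include_filled_pauses:
        text = text.replace("&-", "")      # keep the filler word itself
    else:
        text = FILLED_PAUSE.sub(" ", text)
    text = " ".join(text.split())          # collapse whitespace left by codes
    return len(text)


# Invented example in the style of Table 2.
spoken = "she (.) &-eh <in his> [/] in his mug"
with_fillers = production_length(spoken)
without_fillers = production_length(spoken, include_filled_pauses=False)
written = production_length("Hello, world 42.")
```

For written text without annotation codes, the function reduces to a plain character count, so the same measure applies in both modalities.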
Despite the inherent limitations of this approach, it represents an ad hoc solution that aligns with the research objectives of the larger project. Moreover, the accessibility of character count information in keystroke logging programs, as well as its ease of extraction from transcriptions of spoken accounts, renders this methodology a cost-effective means of obtaining relevant measures for researchers interested in exploring or comparing spoken and written discourse.

Measuring Pauses and Pause Length
The second research question delved into the nuanced definition and measurement of pauses to effectively capture heightened cognitive load. Within the literature, pauses are commonly regarded as a robust indicator of increased cognitive load. In addressing this matter, we initiated a discussion on the definition of a pause. In the context of keyboard writing, a pause has traditionally been defined as the interval of inactivity between two keypresses. However, such pauses can be exceedingly brief, making it impractical, and often less rewarding, to scrutinize every pause between keypresses. Recognizing this, we advocated for an approach commonly adopted by writing researchers, which involves setting a pause criterion to exclude very short pauses, unless the focus lies on low-level processes such as transcription skills. For the purpose of the larger project, we specifically proposed an ad hoc criterion of 1 s, enabling the capture of pauses on a micro-level relative to the study's scope while circumventing the need to address pauses primarily associated with transcription skills.
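Applying a pause criterion to keystroke data can be sketched in a few lines. The function name and the input format (sorted keypress timestamps in seconds) are assumptions for illustration, as is the decision to treat an interval exactly equal to the criterion as a pause; keystroke logging programs provide their own pause analyses, which this sketch does not reproduce.

```python
def find_pauses(timestamps, criterion=1.0):
    """Return (onset, duration) for every inter-key interval that is at
    least `criterion` seconds long.

    `timestamps` is a hypothetical input: sorted keypress times in seconds.
    """
    return [
        (prev, cur - prev)
        for prev, cur in zip(timestamps, timestamps[1:])
        if cur - prev >= criterion
    ]


# Invented example: six keypresses, two intervals above the 1 s criterion.
keypresses = [0.0, 0.2, 0.5, 2.0, 2.1, 4.5]
pauses = find_pauses(keypresses)
total_pause_time = sum(duration for _, duration in pauses)
```

Because the criterion is a parameter, the same data can be explored under different thresholds without re-collecting anything.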
In the realm of spoken language, defining pauses poses its own set of challenges. For instance, should one measure each silent pause and only consider pauses surpassing a specific temporal threshold? Drawing insights from conversation analysis, we opted to incorporate the concept of perceived pauses, wherein listeners' perceptions determine what qualifies as a silent pause, but we adapted our definition to include only pauses exceeding a minimal length of 200 ms. This entailed employing slightly different approaches to defining pauses in writing and speaking. Our rationale behind this decision was twofold: to facilitate subsequent analyses related to pauses and to establish criteria suitable for our research needs, grounded in robust practices within the field of speaking and writing.
Additionally, this study addressed filled pauses, such as instances where speakers use fillers (eh, um) to avoid silence and potential loss of conversational footing. While we deemed it reasonable to occasionally incorporate filled pauses into the same calculations as silent pauses, our data structure in the corpus of the larger project allows for the separate calculation of different pause types when necessary. This flexibility enables comprehensive examination of overall pause distribution during speech and, when desired, discrimination between various pause categories.
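The separate and combined calculation of pause types can be illustrated with a small sketch that counts the codes used in Table 2, where (.) marks a silent pause and &-eh a filled pause. The function name and the returned dictionary layout are assumptions made for this example; corpus tools such as CLAN offer their own frequency commands for this purpose.

```python
import re

SILENT_PAUSE = re.compile(r"\(\.\)")  # (.) silent pause of 200 ms or longer
FILLED_PAUSE = re.compile(r"&-\w+")   # &-eh and similar fillers


def count_pauses(transcript):
    """Count silent and filled pauses separately and in total."""
    silent = len(SILENT_PAUSE.findall(transcript))
    filled = len(FILLED_PAUSE.findall(transcript))
    return {"silent": silent, "filled": filled, "total": silent + filled}


# Example adapted from the transcription extract in Table 2.
extract = (
    "&-eh this here girl then (.) whose her bag fell down she [///] "
    "when the guy is and &-eh picking at another table then she walks up "
    "and pours pepper <in his> [/] &-eh in his mug"
)
pause_counts = count_pauses(extract)
```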
Then, we focused on the dimension of pause duration, recognized as a notable candidate for indicating heightened cognitive load. We suggested that defining pauses as perceived pauses in speaking is preferable to the more time-consuming approach of measuring the length of every silent pause above a certain threshold. In contrast, within the written data acquired through keystroke logging, accessibility is straightforward, and various pause criteria can be applied for data exploration. Consequently, although pause length could serve as a vital indicator for capturing heightened cognitive load, its examination in spoken data demands a more labor-intensive, manual approach.
The location of pauses within the texts additionally conveys insights into the linguistic contexts where writers/speakers allocate additional time. While this approach holds promise, this study ascertained that its implementation necessitates considerable manual effort for the meticulous annotation of syntactic context and semantic content. Notably, the researcher's workload in this task remains equivalent when annotating both spoken and written data following transcription. It is pertinent to acknowledge that leveraging existing parsers for parts of speech tailored for Swedish could assist in this undertaking. However, these parsers encounter challenges when confronted with word fragments, introducing potential limitations and uncertainties in content coding. Given recent advances in this domain, exemplified by technologies like ChatGPT (https://chat.openai.com/chat, accessed on 26 January 2024) and other AI applications, it is conceivable that novel tools better suited to our needs may emerge in the near future. Nevertheless, the imperative role of manual supervision remains evident to ensure validity. In light of the objectives of the larger project, we conclude that annotating pause location, while potentially valuable from a forensic perspective, does not stand as a primary choice for initially identifying sequences of heightened cognitive load in our data.

Measuring Linguistic Changes
The third research question centered on overt revisions in writing and reformulations and repetitions in speaking, denoting instances when writers and speakers modify previously produced messages. Presumably, such alterations aim to enhance the message's accuracy or persuasiveness, with an anticipation of increased instances of revision and reformulation occurring in specific linguistic contexts and narrative sequences where expressing the intended meaning proves challenging on the initial attempt. However, the manner in which changes in the linguistic message manifest differs significantly between writing and speaking.
In writing, revision can occur at any point during a writing session, often proximate to the leading edge, and often addressing formal language aspects. However, writers possess the freedom to navigate to any part of the written text, addressing various issues. Typically, minimal traces of revision are discernible in the final text. Leveraging keystroke logging methodology enables the visibility of revisions through linear file inspection and session replay. Supplementary output files can further provide an overview and categorization of revisions (see Leijten and Van Waes 2013).
Conversely, the dynamics of reformulation and repetition in speaking differ. These modifications are both audible and available to listeners. Speakers are confined to making changes at the leading edge and may use the repetition of words and expressions as a strategy to gain thinking time, in a way similar to, but more sophisticated than, the function commonly attributed to filled pauses. Reformulations or self-repairs serve the dual purpose of extending the time for thought and planning (indicating a desire to convey specific thoughts) and of experimenting with different linguistic expressions. Transcriptions of spoken accounts offer a relatively straightforward means of capturing repetitions and reformulations. Although coding is necessary during the initial transcription phase, corpus tools (CLAN (MacWhinney 2000), AntConc (Anthony 2023), etc.) can be used for the almost automatic acquisition of such data.
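Once the transcription has been coded, counting the different kinds of changes is largely mechanical. The sketch below tallies the three codes defined in Table 2: [/] for exact repetitions, [//] for repetitions with minor reformulations, and [///] for reformulations. The function name and category labels are assumptions for illustration, and the sketch stands in for, rather than reproduces, what CLAN or AntConc would report.

```python
import re

CHANGE_CODE = re.compile(r"\[(/{1,3})\]")  # captures the slashes inside [ ]

LABELS = {
    1: "repetition",                     # [/]
    2: "repetition_with_reformulation",  # [//]
    3: "reformulation",                  # [///]
}


def count_changes(transcript):
    """Count repetition and reformulation codes in a coded transcript."""
    counts = {label: 0 for label in LABELS.values()}
    for slashes in CHANGE_CODE.findall(transcript):
        counts[LABELS[len(slashes)]] += 1
    return counts


# Example adapted from the transcription extract in Table 2.
extract = (
    "she [///] when the guy is and &-eh picking at another table then "
    "she walks up and pours pepper <in his> [/] &-eh in his mug"
)
change_counts = count_changes(extract)
```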
In summary, regarding research question three, addressing revision in writing and repetition and reformulation in speaking appears to be relatively straightforward. However, if one seeks to annotate the location, specifically the syntactic or semantic context in which the revision occurs, similar challenges as those regarding pause location may arise. Manual coding, especially regarding semantic aspects, becomes imperative.
Yet, the application of keystroke logging software presents a viable solution by offering a string-based analysis of the revision's location in writing, distinguishing between locations such as mid-word, between words, and between clauses (Leijten and Van Waes 2013; Wengelin and Johansson 2023). Consequently, estimating the amount of revision becomes achievable, either by measuring the number of deleted characters in writing (an easily obtainable measure) or by simply counting the instances where a speaker or writer engages in editing operations, irrespective of the size or scope of the revision. The mere occurrence of changes in the text is indicative of potential increased cognitive load.
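Both estimates of revision, the number of deleted characters and the number of editing operations regardless of size, can be derived from a simple key log, as the following sketch suggests. The log format (a list of key names with "BACKSPACE" marking a deletion), the function name, and the definitions used here (an editing operation is an uninterrupted run of deletions; the proportion is deleted characters over typed characters) are assumptions for illustration and may differ from how a given keystroke logging program defines these measures.

```python
def revision_summary(keys):
    """Summarize revision activity in a hypothetical key log.

    `keys` is a list of key names; "BACKSPACE" marks one deleted character,
    and every other entry counts as one typed character.
    """
    typed = deleted = operations = 0
    in_deletion = False
    for key in keys:
        if key == "BACKSPACE":
            deleted += 1
            if not in_deletion:
                operations += 1  # a new run of deletions starts here
            in_deletion = True
        else:
            typed += 1
            in_deletion = False
    proportion = deleted / typed if typed else 0.0
    return {
        "typed": typed,
        "deleted": deleted,
        "editing_operations": operations,
        "proportion_deleted": proportion,
    }


# Invented example: "teh" is corrected, then one more character is replaced.
log = list("teh") + ["BACKSPACE", "BACKSPACE"] + list("he cat") + ["BACKSPACE"] + ["t"]
summary = revision_summary(log)
```

Counting operations rather than characters treats a one-character fix and a deleted sentence alike, which matches the idea that the mere occurrence of a change is what matters.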

Exploring the Fluctuation of Cognitive Load
The fourth research question encompasses an examination of how text length, pause distribution, and text changes, when considered collectively, can facilitate the exploration of cognitive load fluctuations throughout the process of spoken and written language production. From the comprehensive overview of our measures, it is evident that each area holds significant potential for capturing crucial indicators of heightened cognitive load. In particular, we highlight the connection to the concepts of fluency and disfluency and the ways that previous research has proposed for discerning the ease with which language is produced.
In the data from our larger project, we anticipate that different segments of the narratives will vary in the presence of pauses and linguistic changes. A cost-effective and less time-consuming way to capture this would be to divide the spoken and written processes into meaningful sequences and then explore the proportion of pauses, particularly those of extended duration, as well as the proportion of diverse and recurrent linguistic changes in each segment. Equally, the mere calculation of text length in each segment will give insights into the fluctuating nature of language production. Such an approach does not rule out subsequent or parallel qualitative analyses of the contexts of pauses and linguistic changes with a content-based focus, that is, the kind of annotations and analyses that require manual attention for accuracy and are thus more costly.
Further, in-depth examinations of pauses and changes during speaking and writing are time-consuming and require manual attention to ensure accuracy. From this point of view, it is imperative to explore methods that can be more cost-effective and, to a certain degree, automated while still retaining validity in annotations and categorizations of linguistic phenomena. While parts of spoken accounts may be automatically transcribed, filled pauses, repetitions, and reformulations will require manual annotation (even though speech recognition and AI have taken substantial leaps in the last decades). In turn, this also makes the analyses more time-consuming. In this regard, examining written accounts will have certain benefits; for example, the technical solution already exists for implementing a keylogger behind editor windows in report systems (Chukharev-Hudilainen et al. 2019), and both web-based and locally stored software is available (Wengelin and Johansson 2023). There is also existing software that can quickly provide an overview of the distribution of revision and pausing (in particular, see Inputlog, Leijten and Van Waes 2013).
Although specific solutions must be tailored for authorities and businesses that would want to use this possibility, there are fundamental technical solutions to build this on. In this way, less manual work is involved in collecting, annotating, and making initial analyses of texts. Nevertheless, interpretation of data must be carried out manually to ensure validity. Also, the implementation of the method must be preceded by training in interpreting the findings. In this respect, the method of using data from real-time writing shares the same challenges as methods using spoken data. In addition, just as during the study of speaking, the study of real-time writing can be combined with methods for concurrently collecting auxiliary types of data, such as gaze behavior (cf. Johansson et al. 2023 for an overview of existing approaches).

Forensic Applications
Finally, what are the possible forensic applications of the present study? In our study, we have demonstrated that the examination of expressions of cognitive load during writing and speaking can serve as a mirror of instances, events, or circumstances that require extra thinking from the writer/speaker. With the assumption that deception induces increased cognitive load, such discoveries can be used for forensic purposes, although it is essential to first conduct more applied research to establish a baseline and variation across different populations and tasks. It is also important to note that this is one of many approaches that should be part of a forensic toolbox.
However, written reports are an essential part of many procedures in the legal system, such as requiring clients to give an initial report online in written form on a website. It is therefore not unlikely that the examination of pausing and revision patterns during writing can be used as one of several indications of instances that need extra attention from interrogators and that it can, together with other evidence and circumstances, point to contexts where further evidence or investigation is needed. In the future, it may also be possible to ask witnesses to give written statements in a more secure environment (e.g., a police station) in an early stage of an investigation to capture less rehearsed accounts through keystroke logging.
To summarize, this article has shown and exemplified through one speaker and one writer that there are common phenomena associated with increased cognitive load in speaking and writing. Further, this article proposes methods for how such expressions can be investigated in a fruitful way across these modalities based on previous theories and results of research in the linguistic fields of real-time speaking and writing. The next step will be to investigate if there are any systematic differences between truthful and deceitful accounts in a larger pool of data, which is the goal of the larger project, which includes this study.

Conclusions
The foundational premise of this study posits that deceptive narrative accounts necessitate extensive planning, thereby impeding the fluency of the language production process and inducing increased cognitive load. Observable manifestations of this cognitive load in overt linguistic expression include disruptions, such as an augmented number of pauses, relatively longer pauses (filled and silent in spoken discourse), heightened revision through deleted characters and editing operations in writing, or an increased number of repetitions and rephrasing in speaking. While our primary objective was to establish a methodological framework for identifying signs of deception-induced cognitive load in language production, we were additionally focused on devising accessible and cost-effective approaches for this purpose.
In our examination of different approaches and methods to measure cognitive load in speaking as well as writing, we have consistently recognized that some methods necessitate more manual work than others. We advocate for the use of automatic or semi-automatic methods, whenever available, and highlight keystroke logging as a particularly valuable tool for investigating written language production. This preference is rooted in its ease of data collection, requiring comparatively little post-curation of data in contrast to transcription and annotation of spoken data. The keystroke logging software offers diverse output analysis files that facilitate the investigation of pauses and revisions. Furthermore, exploring deception during writing provides a unique advantage. People are generally familiar with encountering final written texts but will normally not have observed the process of text composition. This lack of familiarity with writing processes makes it potentially easier to detect patterns of increased cognitive load associated with lying compared to spoken language production, where individuals may employ mimicking behaviors learned from countless spoken interactions.
While our methodological framework holds promise for practical forensic applications seen from a long-term perspective, such as employing keystroke logging tools to analyze written statements on web pages for witness reports, it is essential to acknowledge certain limitations. The larger study exclusively involves native language (L1) speakers, a deliberate choice made to maintain integrity, as L2 speaking and writing often introduce heightened cognitive load. Additionally, the methodological approach assumes proficiency in keyboard-based writing, limiting its applicability. As an important note, the methodological framework we propose here is primarily applicable to computer-generated texts and leverages the analytical capabilities offered by current keystroke logging software programs (see Wengelin and Johansson 2023 for a comprehensive overview). While the foundational theories regarding how cognitive load impacts written language production are also relevant to handwriting (cf. van Hell et al. 2008), one should anticipate that differences in execution will come into play. For example, handwriting tends to be more time-consuming, and the process of revision is both more challenging and time-intensive, leaving more detectable traces. Expanding the framework's applicability to handwriting would necessitate additional research endeavors utilizing specialized tools designed for capturing and scrutinizing handwriting, such as Eye & Pen (Alamargot et al. 2006).
As discussed above, additional applications of the suggested methodology must be carried out before the method can be used in real-life contexts. Not least, it is imperative to determine what role individual differences play and how one can establish a baseline for how cognitive load is expressed during truthful accounts. While the method of using keystroke logging eventually has the potential to be used in court, much more research is needed to establish various baselines concerning how deceptive behavior is manifested during writing in different contexts and for different individuals, where individual writing styles are important to consider. However, once such a body of research is established, this method can be used alongside other methods for information gathering.
In conclusion, we contend that writing can serve as a valuable complement to speaking as a forensic tool but cannot entirely replace it. Our methodological proposition seeks to expand the forensic toolbox. This endeavor holds significant forensic importance as individuals provide both truthful and deceptive narratives in both spoken and written formats. A thorough exploration of the processes associated with "speaking and writing truth and falsehood" and their interplay across modalities will offer valuable insights into forensic linguistics. Specifically, comparing the act of deception in spoken and written forms will illuminate best practices for extracting potentially deceptive information in various contexts, including witness accounts, security clearance interviews, and similar scenarios.

and specified in the Swedish Act concerning the Ethical Review of Research involving Humans (2003:460), the present study does not require specific ethical review by the Swedish Ethical Review Authority due to the following reasons: (1) it does not deal with sensitive personal data, (2) it does not use methods that involve a physical intervention, (3) it does not use methods that pose a risk of mental or physical harm, (4) it does not study biological material taken from a living or dead human that can be traced back to that person.
Informed Consent Statement: Participants gave informed consent orally. The consent was recorded separately but in connection with the data collection. This form of consent was chosen due to the pandemic situation, where we never met our participants in person. All informed consent included permission to publish in scientific journals, using data from the participants. Informed consent was obtained prior to participation, and all participants were informed that they could withdraw their participation at any time without any consequences. All participants are pseudonymized.

Figure 1. Text revision away from the leading edge. Example of inscription points from Bravo's truthful written account. (a) Bravo writing at the leading edge; (b) Bravo has moved the cursor and is now inserting a sentence away from the leading edge. The red circles denote the placement of the cursor.

Table 2. Spoken data. An example of the transcription of a spoken truthful account. Periods within parentheses (.) denote silent pauses that are 200 ms or longer, &-eh denotes filled pauses, [/] denotes exact repetitions with angular brackets around the preceding strings to show what is repeated, [//] denotes repetitions with minor reformulations, and [///] denotes reformulations. Verbatim repetition is highlighted in boldface.
) &-eh this here girl then (.) whose her bag fell down she [///] when the guy is and &-eh picking at another table then she walks up and pours pepper <in his> [/] &-eh in his mug

Table 3. Spoken versus written data. One spoken and one written extract from Alfa's and Bravo's truthful accounts with different measures and calculations for text length and pauses. All (.) denote pauses and &-eh filled pauses.

Table 5. Descriptive statistics for language production across the four accounts: Alfa's spoken and Bravo's written. Time on task represents how many seconds Alfa and Bravo spent speaking/writing. Number of characters represents the total number of characters in the different accounts. Characters per second represents the average number of characters produced per second. The proportion of deleted characters demonstrates the proportion of the written text that was deleted. Number of pauses represents the total number of pauses. Revisions and reformulations represent the number of editing operations independent of editing size.

Table 6. Pauses in spoken and written accounts.
Examples illustrating the processes of linguistic changes in the spoken deceitful text by Alfa and the written deceitful text by Bravo. The (.) denote pauses and &-eh filled pauses. [//] denote repetitions with reformulations.