Article

Testing Cumulative Lexicalized Effects in Study Abroad: Variable Subject Pronoun Expression in Spanish as an Additional Language

Department of Spanish and Portuguese, UCB 278, University of Colorado Boulder, Boulder, CO 80309, USA
*
Author to whom correspondence should be addressed.
Languages 2025, 10(5), 110; https://doi.org/10.3390/languages10050110
Submission received: 29 June 2024 / Revised: 8 February 2025 / Accepted: 15 March 2025 / Published: 8 May 2025
(This article belongs to the Special Issue The Acquisition of L2 Sociolinguistic Competence)

Abstract

We examine variable first-person singular subject pronoun expression in Spanish learner data to investigate the effects of study abroad in Mexico and Spain on the acquisition of sociolinguistic variation. In addition to exploring pre- and post-study abroad effects, this work considers whether such impacts wane over time after the study abroad experience. We include in the analyses novel usage-based factors estimating lexically specific usage patterns. We fit a generalized linear mixed-effects model predicting overt yo (‘I’) expression. Results indicate that overt yo expression is more likely after studying abroad (compared to pre-study abroad). Additionally, learners acquire a usage-based pattern of variation evident after the study abroad experience. This effect is apparent not only immediately after studying abroad; it also persists in data collected after a time delay.

1. Introduction

Research on the development of sociolinguistic variation (the understanding that two or more linguistic forms or variants can be employed to express the same function or meaning) among additional language learners has focused on examining how learners acquire variation and how such variable patterns compare with those of first language (L1) speakers (K. Geeslin & Long, 2014). In addition to linguistic factors, variationist second language acquisition (SLA) researchers have been particularly interested in extralinguistic factors such as learning context (e.g., studying a language at home versus in a study abroad (SA) context) (Zahler et al., 2023). Prior research has shown that SA may impact the acquisition of variable features such as subject personal pronoun (SPP) expression (e.g., Gudmestad & Edmonds, 2023), the focus of this study.
Meanwhile, usage-based approaches posit that the acquisition or developmental trajectory of an additional language is subject to usage, that is, input and experience (Bybee, 2008; K. Geeslin et al., 2023a; López-Beltrán & Carlson, 2020). In the case of SPP expression, L1 acquisition research demonstrates that a verb’s likelihood of use in contexts promoting an overt pronoun predicts the likelihood of SPP expression, regardless of the online production context (E. L. Brown & Shin, 2022). Usage-based approaches (Bybee, 2010) suggest these effects accumulate in memory, become lexicalized, and exert independent effects on variant use. However, less is known about whether similar effects exist among additional language learners. As K. Geeslin et al. (2023a) note, applying usage-based approaches to SLA can inform the field’s understanding of the role of learners’ existing language system, as well as exposure, frequency, and input, on acquisition.
To bridge variationist SLA and usage-based research, the current study extends this line of research on the accumulation in memory of words’ contextual distributions to the speech of additional language learners of Spanish after SA, both immediately and after a delay. We include two usage-based factors not examined in previous research on learners’ SPP expression: each verb token’s cumulative likelihood of use with an overt subject pronoun (percent subject expression) and each verb token’s likelihood of use in a switch (vs. same) reference context. The former will be labeled ‘Likelihood of SPP expression’ in this work, and the latter FRC (Forms’ Ratio of Conditioning). Each will be described below in the Background section.
Using interview data from the LANGSNAP Spanish Corpus (Mitchell et al., 2017), this study tests whether variable SPP expression in additional language learners of Spanish provides evidence for the lexicalized details of verbs’ patterns of use. Compared to the at-home classroom, SA offers learners the possibility of richer input and greater opportunities to use the target language. As such, we examine whether SA plays a role by considering differences between data prior to SA (labeled pre-SA) and immediately after SA (post-SA1), and whether any effects persist over time (post-SA1 versus delayed post-SA (post-SA2)).

2. Background

2.1. Study Abroad and Acquisition of Sociolinguistic Variation

Research on the acquisition of sociolinguistic variation among additional language learners has had two primary foci. The first is to better understand how learners develop variation over time and whether learners’ variation is constrained by factors like those found among L1 speakers. The second concerns extralinguistic factors that may influence target language use and input, and subsequently the acquisition of sociolinguistic variation. An extralinguistic factor that has received ample attention within variationist SLA research is study (or stay) abroad. Prior studies have examined the development of different variable structures in Spanish as a second language (L2) during SA, including forms of address (Pozzi, 2022), subject-verb word order in wh-questions (Denbaum-Restrepo, 2023), present perfect (K. L. Geeslin et al., 2012), future-time expression (Kanwit & Solon, 2023), and SPP expression (Gudmestad & Edmonds, 2023), among others.
Spanish allows for variable SPP expression in that a subject pronoun can be overt (yo creo ‘I believe’) or null (∅ creo ‘I believe’). Spanish SPP expression is constrained by multiple factors of the discourse and speech context including referent continuity, priming/perseveration, tense–aspect–mood (TAM), and subject person/number (e.g., Carvalho et al., 2015; Otheguy & Zentella, 2012; Torres Cacoullos & Travis, 2018). Previous research suggests that variable expression (overt vs. null) of pronominal subjects among learners of Spanish increases as proficiency increases and tends to be conditioned by linguistic factors similar to those that condition L1 speakers’ use (K. Geeslin & Gudmestad, 2016; K. Geeslin et al., 2023b; Gudmestad & Edmonds, 2023; B. G. Linford, 2016; B. Linford et al., 2018).
Gudmestad and Edmonds (2023), who analyzed first-person-singular subject forms utilizing the same language learner corpus as the one in this study, found that polarity, clause type, referent continuity, and perseveration significantly constrained SPP expression. That is, SPP expression was more likely in affirmative statements (polarity), independent clauses (clause type), switch reference contexts (referent continuity), and when the previous mention of the same referent in subject position was a pronoun (perseveration) (Gudmestad & Edmonds, 2023). These findings correspond to those of K. Geeslin and Gudmestad (2016), who also found that L1 Spanish speakers were more likely to express first-person subject pronouns under the same linguistic conditions, except for clause type, which was not included in their study.
Previous studies point to an effect of an SA context on target-like SPP expression. B. G. Linford (2016) found that students who had studied in the Dominican Republic increased their selection of overt subject pronouns on a written contextualized task (WCT) as well as in interviews post-SA. Denbaum’s (2020) study reported that L2 learners abroad, also in the Dominican Republic, approached significance in their SPP expression on a WCT post-SA compared to their at-home counterparts. Meanwhile, B. Linford et al. (2018) found that students in Spain selected overt subject pronouns on a WCT at rates similar to L1 speakers post-SA. Nonetheless, B. Linford et al. (2018) caution that SA may not have played a unique role since their participants’ rates of SPP expression are similar to those which are to be expected among students of similar proficiency levels, regardless of SA, according to K. Geeslin et al.’s (2015) proposed developmental pathway for variable SPP expression in L2 Spanish. Regarding linguistic predictors that constrain variation, Gudmestad and Edmonds (2023) found that polarity, reference continuity, and clause type remained stable factors that did not interact with time (pre- versus immediate post-SA), which B. G. Linford (2016) and B. Linford et al. (2018) also found. However, language learners did show the following developmental changes after a nine-month stay abroad in Spain: frequency of SPP expression increased, participants employed both un/expressed subjects (rather than categorical use of the unexpressed subject), and perseveration was a significant predictor (Gudmestad & Edmonds, 2023).
Adding to this research, our study aims to contribute in two ways. First, we analyze longitudinal learner data, which Bayley and Tarone (2012) argue for within variationist SLA. Apart from pre- and immediate post-SA (post-SA1) utilized by Gudmestad and Edmonds (2023), we include the analysis of delayed post-SA interviews (post-SA2) to measure the extent to which learners’ SPP expression changes over the course of SA, both immediately afterwards and nine months afterwards, when they have returned to their country of origin. Second, we investigate two usage-based factors that have not been included, to our knowledge, in prior research on SPP expression and additional language learners: FRC (each verb token’s likelihood of use in a switch- (vs. same-) reference context) and likelihood of overt pronoun expression (each verb token’s prior probability of use with an overt subject pronoun).

2.2. Usage-Based Factors

Usage-based approaches to language hold that speakers’ experiences with language (in production and in perception) shape language structure, language variation and change (Becker et al., 2009; Bybee, 2010), and L1 acquisition (Tomasello, 2003; Shin, 2016; Shin & Miller, 2022). Acquisition and learning of additional languages (K. Geeslin & Long, 2014; López-Beltrán & Carlson, 2020) are also shaped by usage. Speakers’ experience with language is often estimated via lexical frequency counts (token frequency, type frequency, bigram frequency, etc.), which have been shown to impact processes of language acquisition and learning (e.g., Ambridge et al., 2015; Ellis, 2012). Usage-based approaches presume that vast amounts of detail regarding language use and usage patterns become registered in memory as episodic traces [for example, the Exemplar Model (Bybee, 2001)]. The lexical representations of words, thus, come to reflect frequently experienced tokens, which in turn may be selected as production targets, forming a type of Feedback Loop (Kemmer & Barlow, 2000).
One factor consistently found to predict overt (vs. null) SPP is whether the target verb is used in a switch (vs. non-switch) reference context. Switch reference refers to instances in which there is a change in subject between two adjacent clauses. Verbs used in a switch reference context are more likely to be expressed with an overt SPP than verbs with continuity of reference, which holds true across monolingual (Carvalho et al., 2015), bilingual (Otheguy & Zentella, 2012; Torres Cacoullos & Travis, 2018), and learner data (K. Geeslin & Gudmestad, 2016; Gudmestad & Edmonds, 2023). Importantly, episodic traces of these usage events (a verb’s usage with and without an overt SPP) accumulate in memory (Bybee, 2002). SPP usage is conditioned both by the online production context (whether the verb appears in a switch or non-switch reference context) and by the verb’s history of use in a switch reference context (E. L. Brown & Shin, 2022). That is, a verb’s likelihood of use in contexts promoting overt SPP predicts the likelihood of SPP expression, independently of the production context. Such episodic traces in memory of usage patterns shape lexical representations and subsequently have an independent effect in predicting variant productions.
Therefore, in this study, we include two relatively under-utilized usage-based factors which will be detailed in the following sections: each verb form’s ratio of conditioning by switch reference (FRC) and each verb form’s likelihood of use with an overt (yo creo) (‘I believe’) versus null (∅ creo) (‘I believe’) subject. Both probabilistic measures are understood to be rough estimates of L1 speakers’ experiences with language. This method is supported by previous studies’ results on additional-language learners’ developmental acquisition of (socio)-linguistic variables (Gudmestad, 2021, p. 230) in which variable patterns in learners’ data are converging with native speakers’.

2.2.1. FRC

In addition to lexical frequency, words’ contexts of use should be considered in studies of variation and change (Bybee, 2002). Online production contexts condition variation in myriad ways, promoting or inhibiting the productions of specific variants in a probabilistic fashion. Additionally, words differ significantly in their likelihood of use in contexts conditioning specific variants. These variant patterns of use become registered in memory serving as potential targets for subsequent productions. Therefore, identifying the recurring contexts of use of words in specific conditioning contexts helps to predict speakers’ variant choices. This knowledge contributes to our understanding of how language variation and change take shape.
Specifically, in the case of Spanish SPP variation, discourse continuity of the subject is typically among the predictors of SPP expression with the greatest magnitude of effect (Posio, 2018, p. 300). When the subject of the target verb differs from the subject of the previous conjugated verb in the immediately preceding discourse, pronoun expression is favored. When there is discourse continuity (no switch in reference), null subjects are expected. A verb commonly used in discourse lacking continuity of reference (switch reference context) might come to store this probabilistic contextual information as part of the production plan. This usage pattern (likelihood of switch vs. same reference context) accumulates in memory.
This study will measure each verb form’s ratio of use in a switch versus a same reference context (FRC). This same measure has been tested previously using adult language data (E. Brown, 2020) as well as L1 acquisition data (E. L. Brown & Shin, 2022). After children acquire sensitivity to switch reference [i.e., greater likelihood of SPP expression in switch vs. same reference context], over time a long-term effect of verbs’ use in switch vs. same reference context emerges that conditions expression. For SLA, K. Geeslin et al. (2015) note that the strongest predictors of pronoun expression among adult L1 speakers of Spanish are the first to emerge among additional language learners. Numerous studies demonstrate that learners are sensitive to the online conditioning of discourse continuity (e.g., Gudmestad & Edmonds, 2023; B. Linford & Geeslin, 2022). Nevertheless, the accumulation in memory of the effects of sensitivity to discourse continuity has not been tested for L2 learners.

2.2.2. Likelihood of SPP Expression

In addition to FRC, a contextually informed probability measure, we are interested to know whether, independent of the production context (the online use in either a switch or a no switch reference context), overall likelihood of overt pronoun expression impacts learner acquisition of the sociolinguistic variation. As usage-based approaches to SLA have noted, “if the object or practice to be learned by the L2 speaker is not directly observable, noticeable or in other ways readily available for the human experience […] it will not be learned” (Eskildsen & Cadierno, 2015, p. 4). In other words, quantity and quality of input matter because frequency of exposure drives language learning and enables entrenchment (Ellis, 2015).
If, independent of production context, a verb is frequently realized with an overt subject pronoun in general (e.g., yo creo ‘I believe’), this pattern of input may become registered in memory. In other words, learners’ lexical representation of this particular verb could include an overt subject pronoun. Consequently, learners might replicate this pattern (overt subject pronoun + verb) more often. Given that string frequencies such as, for example, yo creo have been shown to be registered by children and adults (e.g., Bannard & Matthews, 2008), we hypothesized that verbs more likely to be used with an overt subject pronoun might be stored in memory with an overt subject pronoun compared to verbs less likely to be accompanied by an overt yo. Because SPP variation is constrained by numerous factors simultaneously, the effect of co-occurrence patterns in the input could be hypothesized to be independent of the switch-reference effect.
Thus, in this study, employing a well-understood (e.g., Carvalho et al., 2015) sociolinguistic variable, the variable expression (vs. omission) of Spanish first-person subject personal pronoun, we explore the following research questions.
  • What role, if any, do usage-based factors (FRC and Likelihood of SPP expression) identified for L1 acquisition play in L2 acquisition of sociolinguistic variation?
  • What effects, if any, does SA have on the acquisition of variable first-person subject pronoun for additional language learners of Spanish? Do these effects persist across time?

3. Materials and Methods

3.1. LANGSNAP Corpus

We analyzed data from an open-source longitudinal learner corpus called LANGSNAP (Mitchell et al., 2017). The corpus consists of oral and written data collected over a 21-month period from United Kingdom-based university students who spent an academic year abroad (~9 months) in a French- or Spanish-speaking location. Data collection periods included pre-abroad, immediate post-abroad, and delayed post-abroad. All participants were additional language learners of French or Spanish and had various placement types abroad (exchange student, intern, or teaching assistant). For this study, we analyzed three semi-structured interviews from each of 25 Spanish language learners, nineteen of whom were women and six of whom were men. Their ages ranged from 20 to 25 (M = 20.71; SD = 1.46), with one participant not reporting their age. Their first languages included: English (n = 23), Polish (n = 1), and English and Polish (n = 1)1. Participants had studied Spanish in an academic context for between 2 and 14 years (M = 5.6; SD = 3.3). In addition to Spanish, other languages studied included: French (n = 19), German (n = 7), Italian (n = 2), Latin (n = 1), and Portuguese (n = 1)2. Three participants had not studied any languages other than Spanish. Sixteen of the participants were in Spain, while nine were in Mexico. In Spain, nine were exchange students, one was an intern, and six were teaching assistants. In Mexico, all nine were teaching assistants.
The interviews, each lasting approximately twenty minutes, were conducted by members of the LANGSNAP research team. In these interviews, participants discussed their daily lives and opinions about life in the UK and abroad. For each participant, we manually analyzed first-person singular subject verbal forms with or without yo expression. First-person singular subjects present only two variants (null, pronominal). Selecting first-person subjects helps minimize the range of conditioning factors constraining the variation. This method is in line with previous work on Spanish subject pronoun variation in monolingual (Travis & Torres Cacoullos, 2012), bilingual (Torres Cacoullos & Travis, 2018), and SLA research (K. Geeslin & Gudmestad, 2016; Gudmestad & Edmonds, 2023). All tokens of first-person subjects spoken by the learners (N = 5138) were extracted from the following interviews: (a) pre-SA (about 3 months prior to their departure), (b) immediate post-SA (at the end of their academic year abroad, approximately 12 months after the pre-study abroad data collection period), and (c) delayed post-SA (approximately 21 months after the pre-study abroad data collection period). These are labeled in this work as pre-SA, post-SA1, and post-SA2, respectively. Of the 5138 total tokens, we analyze a subset of 3518 tokens, a choice we explain and justify in the following sections.

3.2. Data Coding and Analysis

The acquisition of variable SPP expression in the Spanish of additional language learners has been widely researched (K. Geeslin et al., 2013, 2015; K. Geeslin & Gudmestad, 2016; B. Linford et al., 2018; Denbaum, 2020; B. Linford & Geeslin, 2022; K. Geeslin et al., 2023b; Gudmestad & Edmonds, 2023). Linguistic factors that condition the variation for additional language learners coincide with those identified for monolingual and bilingual speakers of Spanish. Additionally, characteristics specific to learners and the type of language input have been shown to constrain patterns of acquisition of SPP variation. Thus, we follow previous research and manually code each of the first-person finite verbs for numerous predictors described below.
As mentioned previously, a recurring conditioning factor constraining Spanish SPP expression is switch reference. For each target token in our data, therefore, we code for whether the target subject was different from (switch) or identical to (no switch) the subject of the preceding finite verb, regardless of the semantic and pragmatic features of the preceding subject. An example of a switch in reference for a target verb (hablo ‘I speak/I talk’) can be seen in example (1). The subject of the bolded target hablo (first-person singular yo) differs from the subject of the previous, underlined finite verb es (third-person singular ‘is’). In example (2), there is discourse continuity in that the target hablo has the same subject as the previous finite verb tengo (‘I have’).
(1)
pero uh mi tío es um es(pañol) [//] de España. Y uh sí. Hablo con ellos en español. [speaker 170, Pre-SA]
‘but my uncle is from Spain. And yes. I speak Spanish with them’
(2)
Tengo amigos mmm con quien hablo en Facebook [speaker 163, Post-SA1]
‘I have friends I speak to on Facebook’
In addition to switch reference, previous studies (e.g., Abreu, 2012; E. L. Brown & Rivas, 2011; Cameron & Flores-Ferrán, 2004; Travis, 2007; Torres Cacoullos & Travis, 2018) demonstrate a priming effect, whereby an expressed pronominal subject in the previous context may trigger an overt yo on the target verb. Example (3) illustrates a case in which the subject of the verb preceding the target (in this case tengo ‘I have’) is an overt yo (yo creo ‘I believe’). Example (4), conversely, illustrates a case in which the subject of the verb (sé ‘(I) know’) preceding the target (podría ‘I could’) is null. To capture this perseverative effect, we code each target verb for the expression and type of subject of the immediately preceding finite verb: lexical, pronominal, or null. We predicted that an expressed subject could favor an overt yo in the target context.
(3)
y yo creo que yo tengo un nivel bastante bueno [speaker 175, Post-SA1]
‘and I believe I have a pretty good level’
(4)
sé que las podría haber conocido [speaker 166, Post-SA1]
‘I know that I could have met them’
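The two contextual codings described above (switch reference and the form of the preceding subject) can be sketched in Python as follows. This is a minimal illustration only: the coding in this study was done manually, and the `Clause` record, its field names, and the `code_target` helper are hypothetical names introduced here, not part of the LANGSNAP corpus or its tooling. The four clause pairs correspond to examples (1) through (4).

```python
from dataclasses import dataclass


@dataclass
class Clause:
    """Hypothetical record for one finite verb and its subject."""
    verb: str
    referent: str      # who the subject refers to, e.g. "speaker" or "uncle"
    subject_form: str  # "lexical", "pronominal", or "null"


def code_target(previous: Clause, target: Clause) -> dict:
    """Code a first-person singular target against the preceding finite verb."""
    return {
        "verb": target.verb,
        # Switch reference: the target subject differs from the preceding subject.
        "reference": "same" if previous.referent == target.referent else "switch",
        # Perseveration: form of the immediately preceding subject.
        "previous_subject": previous.subject_form,
        # Dependent variable: is yo overt on the target verb?
        "overt_yo": target.subject_form == "pronominal",
    }


# Examples (1)-(4) from the text, entered by hand.
ex1 = code_target(Clause("es", "uncle", "lexical"), Clause("hablo", "speaker", "null"))
ex2 = code_target(Clause("tengo", "speaker", "null"), Clause("hablo", "speaker", "null"))
ex3 = code_target(Clause("creo", "speaker", "pronominal"), Clause("tengo", "speaker", "pronominal"))
ex4 = code_target(Clause("sé", "speaker", "null"), Clause("podría", "speaker", "null"))

for row in (ex1, ex2, ex3, ex4):
    print(row)
```

Only example (1) is coded as a switch in reference, and only example (3) has both a pronominal preceding subject and an overt yo on the target.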
Lexical frequency can account for patterns of morphosyntactic variation (Bybee & Thompson, 1997; Krug, 2003). For Spanish SPP expression, predictions about the direction of the effect of lexical frequency (that is, whether frequency would favor an increase or a decrease in rates of SPP expression) are not straightforward, but Erker and Guy (2012) find that lexical frequency interacts with linguistic predictors to constrain SPP variation. Each of the target verbs included in our statistical models is coded for lexical frequency per million using the Oral section of the Corpus del español3 (Davies, 2002) to determine whether the verb’s token frequency constrains variation in the learner data (either in interaction with other factors or as a main effect).
To examine whether a verb’s likelihood of use in a switch-reference context (independent of the production context) plays a role in the patterns of SPP variation in this data, we measured the FRC, or each verb’s ratio of occurrence in a switch- (vs. same-) reference context. This is a measure of the likelihood of a specific discourse context (switch reference) for each verb type. Some verbs are commonly found in a switch reference context (a discourse context favoring overt subject pronouns), and others are more commonly found in same-reference contexts (a discourse context disfavoring overt pronouns). For example, a verb in our data that occurs frequently in a switch- (vs. same-) reference context is imagino (‘I imagine’). In the oral section of the Davies (2002) corpus, out of 200 uses, the subject of the previous finite verb is different (i.e., switch) in 182 cases and is the same referent in 18 cases. In other words, in 91% of the uses, the verb imagino occurs in a switch reference context. By contrast, a verb like salí (‘I left/went out’) occurs in a switch reference context in just 50% of its 200 uses in the same corpus. The estimate, thus, attempts to capture any potential long-term effect of such disparate usage patterns across time in a speaker’s experiences with Spanish. To create this probability measure, we calculate the FRC as each verb form’s occurrences in switch-reference contexts divided by the verb form’s total appearances, capped at 200 occurrences for frequent verbs (FRC = # verb in switch reference / # verb in corpus).4 The log of this value is used in the statistical modeling.
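The FRC calculation can be illustrated with the two verbs mentioned above. The following is a minimal sketch, not the procedure used to build the measure: the function name `frc` and the constant `CAP` are hypothetical, the count of 100 switch-reference uses for salí is derived from the reported 50% of 200 uses, and the use of the natural logarithm is an assumption, since the text does not state the base of the log.

```python
import math

CAP = 200  # frequent verbs are sampled up to 200 corpus occurrences (per the text)


def frc(switch_uses: int, total_uses: int) -> float:
    """Share of a verb form's sampled corpus uses in a switch-reference context."""
    if not 0 < total_uses <= CAP:
        raise ValueError("total_uses must be between 1 and the 200-token cap")
    if not 0 <= switch_uses <= total_uses:
        raise ValueError("switch_uses must be between 0 and total_uses")
    return switch_uses / total_uses


frc_imagino = frc(182, 200)  # 182 switch-reference uses out of 200
frc_sali = frc(100, 200)     # 50% of 200 uses (count derived from the percentage)

# Assumption: natural log; the base is not specified in the text.
log_frc_imagino = math.log(frc_imagino)
log_frc_sali = math.log(frc_sali)

print(f"imagino: FRC = {frc_imagino:.2f}, log FRC = {log_frc_imagino:.3f}")
print(f"salí:    FRC = {frc_sali:.2f}, log FRC = {log_frc_sali:.3f}")
```

Because an FRC is a proportion no greater than 1, its log is zero or negative, and verbs with a stronger association with switch-reference contexts have values closer to zero.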
As a separate measure, we also explored any potential effect for the likelihood of expressed subject pronoun for each verb type in the Davies (2002) corpus. This factor measures co-occurrence patterns of verb forms (e.g., digo ‘I say’) and the verb’s corresponding subject pronoun (yo) embedded in instances of language use. Subjects in Spanish may appear pre- and post-verbally (yo digo or digo yo respectively). Additionally, subjects need not appear immediately adjacent to the verb (yo siempre digo ‘I always say’). For each verb type, we calculated the percentage of expressed (vs. null) subject pronouns in the external corpus (Corpus del español). We extracted each instance of use (again capping frequent verbs at 200) and calculated a percentage of subject expression including pre- and post-verbal subjects as well as adjacent and non-adjacent tokens of yo (# tokens with expressed subject pronoun/# tokens of the verb overall). For example, in nearly half of the occurrences (91/200), a verb like pensé (‘I thought/believed’) is expressed with an overt yo. The log of percentage is used in the regression models.
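As a rough illustration of this second measure, the sketch below flags an overt yo anywhere in a clause (pre- or post-verbal, adjacent or not) and computes the share of clauses with an expressed pronoun. It assumes, hypothetically, that each string has already been trimmed by hand to the clause containing the target verb and that any yo in it is that verb's subject; the helper `has_overt_yo` and the four-clause sample are illustrative and are not drawn from the Corpus del español. The natural log is again an assumption.

```python
import math
import re


def has_overt_yo(clause: str) -> bool:
    """True if the pronoun yo appears as a separate word anywhere in the clause."""
    return "yo" in re.findall(r"\w+", clause.lower())


# Illustrative clauses built from the patterns named in the text (not corpus data).
sample = ["yo digo", "digo yo", "yo siempre digo", "digo"]
rate_sample = sum(has_overt_yo(c) for c in sample) / len(sample)

# Reported figure for pensé: 91 of 200 sampled uses have an overt yo.
rate_pense = 91 / 200
log_rate_pense = math.log(rate_pense)  # assumption: natural log

print(f"sample rate = {rate_sample:.2f}")
print(f"pensé rate = {rate_pense:.3f}, log = {log_rate_pense:.3f}")
```

A simple word match like this would over-count in real transcripts (for instance, when yo belongs to a neighboring clause), which is one reason hand checking of each token is still needed.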
Additionally, we code each target verb for whether it is sampled from the pre-SA interviews, the immediate post-SA interviews (post-SA1), or the delayed post-SA interviews (post-SA2).
For the quantitative analysis, we use a generalized linear mixed-effects model using lme4 in R (R Core Team, 2019). We include the speaker and the verb as random effects. Section 4 summarizes the results of our analyses.
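For readers who want the model in symbols, one plausible form is given below. It is a sketch under stated assumptions rather than the exact specification that was fit: the logit link is assumed because the outcome (overt vs. null yo) is binary, the predictor labels are shorthand for the factors described in this section, and the single interaction term reflects the Likelihood of SPP expression by timing interaction reported in the Results. Here $i$ indexes speakers and $j$ indexes verbs.

```latex
\operatorname{logit}\,\Pr(\text{overt } yo_{ij} = 1)
  = \beta_0
  + \beta_1\,\text{Switch}_{ij}
  + \beta_2\,\text{Priming}_{ij}
  + \beta_3\,\log(\text{Frequency}_{j})
  + \beta_4\,\log(\text{FRC}_{j})
  + \beta_5\,\log(\text{Likelihood}_{j})
  + \beta_6\,\text{Timing}_{ij}
  + \beta_7\,\log(\text{Likelihood}_{j}) \times \text{Timing}_{ij}
  + u_i + w_j,
\qquad
u_i \sim \mathcal{N}(0, \sigma_u^2), \quad
w_j \sim \mathcal{N}(0, \sigma_w^2)
```

The terms $u_i$ and $w_j$ are the random intercepts for speaker and verb, respectively.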

4. Results

To understand the patterns of SPP variation evident in the learner data, we base our analyses upon all the first-person singular verb forms extracted from the LANGSNAP corpus (n = 3518). We explored the extent to which learners’ SPP expression changed prior to and after a stay abroad. Additionally, we tested whether learners acquired usage-based patterns of variation.

4.1. Rates of SPP Expression and SA

The rate of SPP expression in the learner data is strikingly low, similar to what is true of L1 acquisition (Shin, 2012). There are important differences across speakers with regard to their SPP usage. Rates of SPP expression vary from 0% to 25%. Overall, in just 8.3% of the 3518 instances of first-person singular verb forms is the yo pronoun expressed in these data. A summary of rates of SPP expression is presented in Table 1.
Prior to the SA experience (pre-SA), the rate of yo expression is 7.2% (with just 95 instances of overt yo expression across the 25 speakers). Rates of yo expression are at their highest in the data collected immediately after the SA experience (post-SA1). In post-SA1, speakers use overt SPP at a rate of 9.6%. The rates of SPP expression are lower in post-SA2 (8.0%). A Friedman’s test5 shows that SA timing does not have a significant effect on SPP expression, χ²(2) = 0.989, p = 0.610. According to pairwise comparisons, SPP expression is not significantly different between any two time points. Given that rates of expression do not reveal the underlying constraints on variation (or control for factors operative in the production contexts that probabilistically predict SPP production), we conduct statistical analyses to understand the role of SA on the acquisition of this sociolinguistic variation, the possible retention of any gains, and the effects of usage-based factors in the acquisition process.
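The reported p-value can be checked from the test statistic alone, because a chi-square distribution with 2 degrees of freedom has the closed-form upper-tail probability exp(−x/2). The sketch below does that check and also shows how a Friedman statistic is computed from per-speaker values at the three time points. The function names are hypothetical, the statistic omits the correction for ties, and the three-speaker data are toy values for illustration only, not the study data.

```python
import math


def average_ranks(values):
    """Rank values from 1 to k within one speaker, averaging ranks for ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for pos in range(i, j + 1):
            ranks[order[pos]] = mean_rank
        i = j + 1
    return ranks


def friedman_statistic(rows):
    """Friedman chi-square (no tie correction); rows = speakers, columns = time points."""
    n, k = len(rows), len(rows[0])
    rank_sums = [0.0] * k
    for row in rows:
        for col, rank in enumerate(average_ranks(row)):
            rank_sums[col] += rank
    return 12 / (n * k * (k + 1)) * sum(r * r for r in rank_sums) - 3 * n * (k + 1)


def chi2_upper_tail_df2(x):
    """Upper-tail probability of a chi-square variable with 2 degrees of freedom."""
    return math.exp(-x / 2)


# Check of the reported result: chi-square(2) = 0.989.
p_reported = chi2_upper_tail_df2(0.989)
print(f"p = {p_reported:.3f}")

# Toy data (hypothetical): three speakers whose rates rise at every time point.
toy = [[1, 2, 3], [1, 2, 3], [1, 2, 3]]
print(f"toy statistic = {friedman_statistic(toy):.1f}")
```

With three time points the test has k − 1 = 2 degrees of freedom, which matches the χ²(2) reported above.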

4.2. Acquisition of SPP Expression and Usage-Based Factors

A primary goal of this project was to determine whether certain usage-based factors constrain the variation of the additional language learners of Spanish. Specifically, building upon previous research that reports L1 acquisition of FRC effects, we sought to test whether additional language learners, likewise, given enough experience with the target language, might also acquire this variation. An additional factor of the input we tested was whether a verb’s prior probability of being expressed with an overt (vs. null) yo could condition variation in learner data. Initially, we had a total of 5138 tokens. We imported values of FRCs and the Likelihood of SPP expression from previous work for 2575 of those tokens (E. Brown, 2020; E. L. Brown & Shin, 2022), and then, working in descending order of verb token frequency in our corpus, we generated by hand these measures for an additional 943 tokens that were not calculated in previous studies. Because of time constraints, coding of the remaining 1620 tokens is ongoing. The following analyses, thus, do not reflect the entirety of the dataset and are conducted upon first-person singular tokens for which we have FRC and percent SPP expression values (n = 3518; pre-SA n = 1323; immediate post-SA1 n = 1078; delayed post-SA2 n = 1117). However, we have no indications or reason to believe that further coding would alter the central findings of our study (the effect of SA, the persistence of the effects, and the lack of an FRC effect in SLA learner data).
To explore the role of any usage-based factors and whether the acquisition of SPP variation is impacted by SA, we fit mixed-effects regression models in which we considered the factors described in the methodology section: previous reference, perseveration, log FRC, log verb frequency, and log Likelihood of SPP expression. First, the fixed effects factors included in this study were tested for pre-SA and immediate post-SA1 (see Table 2). We considered the effect of the factors independently and in interaction with timing (pre-SA, post-SA1). When considered without an interaction, an examination of the dataset as a whole (see Appendix A) does not permit us to identify a potential effect of usage-based factors in an SA experience. It does, however, provide us with insights into the potential individual effects of our usage-based factors (lexical frequency, FRC, Likelihood of SPP). Neither frequency nor FRC is selected as significantly predicting SPP, and Likelihood of SPP has a marginally significant effect. When we add the pairwise interactions of each usage-based factor with timing, whether together or one at a time, the model converges only when the interaction involves Likelihood of SPP. We report the results in Table 2.
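The token counts reported above can be tallied in a few lines. This is simply a consistency check on the figures given in the text; the variable names are illustrative.

```python
total_tokens = 5138          # all first-person singular tokens extracted
imported_values = 2575       # tokens with measures imported from previous work
hand_generated = 943         # tokens with measures generated by hand

analyzed = imported_values + hand_generated
remaining = total_tokens - analyzed

by_time = {"pre-SA": 1323, "post-SA1": 1078, "post-SA2": 1117}

print(f"analyzed = {analyzed}, remaining = {remaining}")
print(f"sum across time points = {sum(by_time.values())}")
print(f"share of tokens analyzed = {analyzed / total_tokens:.1%}")
print(f"pre-SA rate of overt yo = {95 / by_time['pre-SA']:.1%}")
```

The analyzed subset therefore covers roughly two thirds of the extracted tokens, and the 95 overt tokens in pre-SA correspond to the 7.2% rate reported in Section 4.1.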
Table 2 summarizes the results of the mixed-effects linear regression model predicting overt SPP expression. There are three factors that significantly constrain variation in the learners’ speech. Unsurprisingly, in line with previous findings across varieties of Spanish, discourse continuity is selected as significant. When a first-person singular verb form is used in a switch-reference context, yo expression is more likely than when there is continuity of reference. The rate of expression is 11.5% in switch-reference contexts, compared to 5.9% in same-reference contexts.
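The p-values in Table 2 appear to be consistent with two-sided Wald z tests, in which each estimate is divided by its standard error and compared to a standard normal distribution. This is our inference from the reported figures, not something the source states. The following minimal sketch recomputes the p-values under that assumption; the function name `wald_p` is hypothetical.

```python
import math

def wald_p(estimate: float, std_error: float) -> float:
    """Two-sided p-value for z = estimate / std_error under a standard normal."""
    z = estimate / std_error
    # erfc(|z| / sqrt(2)) equals 2 * (1 - Phi(|z|)) for the standard normal CDF Phi.
    return math.erfc(abs(z) / math.sqrt(2))

# (estimate, standard error, reported p-value) copied from Table 2.
table2 = {
    "Switch-reference (switch)": (0.778, 0.163, "<0.001"),
    "Timing (Pre-SA)": (-1.860, 0.567, "0.001"),
    "Priming (previous expressed)": (0.673, 0.293, "0.022"),
    "Log Likelihood SPP expression": (3.214, 1.123, "0.004"),
    "Log Verb Frequency": (0.369, 0.257, "0.151"),
    "Log FRC_switch": (2.181, 2.560, "0.394"),
    "Log Likelihood SPP expression x Timing (pre-SA)": (-2.903, 0.945, "0.002"),
}

for name, (est, se, reported) in table2.items():
    print(f"{name}: z = {est / se:.2f}, p = {wald_p(est, se):.3f} (reported {reported})")
```

Under this assumption, the recomputed values agree with the reported ones to three decimal places, which makes the table easier to audit when only estimates and standard errors are at hand.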
The results of this analysis in Table 2 also reveal a significant effect of priming on the likelihood of an overt SPP in these data. When the preceding verb has an overt subject, subject pronoun expression with the target verb is more likely: rates of expression are higher in these cases (11.3%) than for targets lacking such priming (7.4%). This result aligns with previous research (e.g., Gudmestad & Edmonds, 2023).
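Because switch-reference and priming each partition the same pre-SA and post-SA1 tokens (Table 2), the reported percentages should imply roughly the same number of overt yo tokens. The sketch below illustrates that consistency check using only the N and % overt values from Table 2. The helper names are hypothetical, and the implied counts are approximate because the percentages are rounded.

```python
# (N, % overt yo) per level, copied from Table 2 (pre-SA and post-SA1 data).
switch_reference = {"switch": (1018, 11.5), "no switch": (1383, 5.9)}
priming = {"previous expressed": (548, 11.3), "previous null": (1853, 7.4)}

def total_n(levels):
    """Total number of tokens across the levels of one factor."""
    return sum(n for n, _ in levels.values())

def implied_overt(levels):
    """Approximate count of overt yo tokens implied by N and rounded % overt."""
    return sum(n * pct / 100 for n, pct in levels.values())

for label, levels in [("switch-reference", switch_reference), ("priming", priming)]:
    print(f"{label}: N = {total_n(levels)}, implied overt tokens = {implied_overt(levels):.1f}")
```

Both partitions cover 2401 tokens and imply roughly 199 overt tokens, so the two sets of percentages are mutually consistent.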
The results in Table 2 also reveal an effect of timing on the SPP expression of these speakers. The rate of expressed subject pronouns is significantly higher in post-SA1 than in pre-SA (9.6% vs. 7.2%, respectively). Neither lexical frequency nor FRC is selected as significantly constraining SPP variation. Our additional usage-based factor, Likelihood of SPP expression, however, significantly constrains SPP variation both as a main effect and in interaction with timing. As a verb’s likelihood of use with an overt pronoun increases, so does the likelihood of overt pronoun usage in our learner data. Verbs that have a higher rate of pronoun expression generally in Spanish [as estimated in the Davies (2002) corpus] have a greater likelihood of overt yo expression in these data. Additionally, the significant interaction with timing reveals that after studying abroad, learners’ use of subject pronouns is more acutely predicted by this usage-based factor (Likelihood of SPP expression). The effects are enhanced in the post-SA1 data. This interaction is represented in Figure 1.
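One way to read the interaction is through simple slopes. Assuming treatment coding with immediate post-SA1 as the reference level for Timing (our reading of Table 2, where that level carries no estimate), the main-effect coefficient is the slope of Log Likelihood of SPP expression in post-SA1, and adding the interaction term gives the slope in pre-SA. A minimal sketch with illustrative variable names:

```python
# Coefficients copied from Table 2; variable names are illustrative.
slope_post_sa1 = 3.214       # Log Likelihood SPP expression (reference level: post-SA1)
interaction_pre_sa = -2.903  # Log Likelihood SPP expression x Timing (pre-SA)

# Under treatment coding, the pre-SA slope is the sum of the two terms.
slope_pre_sa = slope_post_sa1 + interaction_pre_sa

print(f"post-SA1 slope: {slope_post_sa1:.3f}")
print(f"pre-SA slope:   {slope_pre_sa:.3f}")
```

On this reading, the slope is about 0.31 before SA and 3.21 immediately after, in line with the statement that the effect is enhanced in the post-SA1 data. The sketch does not supply a standard error for the pre-SA slope, which would require the covariance between the two estimates.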
The data summarized in Figure 1 reveal that, prior to SA, overt yo tokens have only a slightly higher average Likelihood of SPP expression (27.1%) compared to the instances in which learners did not express yo (26.8%). After SA (post-SA1), the instances of overt yo expression in the learner data are on verbs that, on average, have a higher Likelihood of SPP expression (32.2%) compared to the null tokens in our data (27.9%). The significant effect on patterns of SPP usage of the previously untested role of verbs’ Likelihood of SPP suggests that these speakers have accumulated in memory the probability of yo expression with individual verb forms. These usage-based patterns shape SPP expression more clearly after living abroad.
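The contrast described for Figure 1 can be restated as the gap, in percentage points, between the average Likelihood of SPP expression of overt and null tokens at each time point. The sketch below uses only the four averages reported in the text; the dictionary layout and the function name `gap` are illustrative.

```python
# Mean Likelihood of SPP expression (%) by token type, as reported for Figure 1.
figure1 = {
    "pre-SA": {"overt yo": 27.1, "null": 26.8},
    "post-SA1": {"overt yo": 32.2, "null": 27.9},
}

def gap(timing: str) -> float:
    """Overt-minus-null difference in percentage points, rounded to one decimal."""
    means = figure1[timing]
    return round(means["overt yo"] - means["null"], 1)

for timing in figure1:
    print(f"{timing}: {gap(timing)} percentage points")
```

The gap grows from 0.3 percentage points before SA to 4.3 percentage points immediately after.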
Although previous research examined SPP expression in pre- versus immediate post-SA data, this work sought to understand whether there is evidence of persistence of acquisition effects across time post-SA. As made evident in Table 1, there are no significant changes in overall rates of SPP across time after the SA experiences (post-SA1 vs. post-SA2), but there was acquisition of a linguistic constraint on variation (Likelihood of SPP expression) as seen in Table 2. We ask whether the acquisition of this pattern of variation that was evident post-SA persists.
Table 3 presents the results of a mixed-effects linear regression predicting yo expression in post-SA1 compared to post-SA2, following the same modeling procedure as the pre-SA and post-SA1 model. Both Verb and Speaker are included as random intercepts.
In these data, there is a significant effect of a verb’s Likelihood of SPP expression on the likelihood of learners expressing yo. The significant positive coefficient suggests that as a verb’s Likelihood of SPP expression increases, it becomes more probable that a speaker expresses an overt yo. When we examine whether the effect of a verb’s Likelihood of SPP persists or wanes across time (via the interaction of Timing and the usage-based factor), the interaction is not significant: the effect does not differ between the two data collection times. The significant effect of a verb’s Likelihood of SPP expression apparent in post-SA1 (illustrated in Figure 1) persists in post-SA2.
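Assuming treatment coding with immediate post-SA1 as the reference level for Timing (our reading of Table 3), the slope of Log Likelihood of SPP expression at each post-SA time point can be recovered from the main effect and the interaction term. The sketch below shows that calculation; the function name `slope_at` is hypothetical.

```python
def slope_at(main_effect: float, interaction: float, is_reference: bool) -> float:
    """Slope of a covariate at one level of a two-level, treatment-coded factor."""
    return main_effect if is_reference else main_effect + interaction

# Coefficients copied from Table 3.
main_effect = 2.273   # Log Likelihood SPP expression
interaction = 0.234   # Log Likelihood SPP expression x Timing (delayed post-SA2)

post_sa1 = slope_at(main_effect, interaction, is_reference=True)
post_sa2 = slope_at(main_effect, interaction, is_reference=False)

print(f"post-SA1 slope: {post_sa1:.3f}")
print(f"post-SA2 slope: {post_sa2:.3f}")
print(f"change relative to the post-SA1 slope: {interaction / main_effect:.1%}")
```

The two slopes (about 2.27 and 2.51) are close, and Table 3 reports the interaction as non-significant (p = 0.815), which is the basis for the claim that the effect persists.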

5. Discussion

This project analyzed variable first-person SPP expression in the speech of additional language learners of Spanish before and after an SA experience. We examined speech immediately after students’ return from abroad and at a delayed time interval. We sought to understand what role SA played in acquisition of this widely studied sociolinguistic variable in Spanish (SPP expression). Additionally, we tested whether cumulative usage factors (contextual, non-contextual) constrained the variation.
Overall, the analysis of SPP usage revealed strikingly low rates of SPP expression compared to L1 studies of the same linguistic variable. Previous research on monolingual adult L1 Spanish suggests that yo expression varies from 25% (Lastra & Martínez Butragueño, 2015, p. 43) for the Spanish of Mexico City to 57% in Andalusian Spanish (Ranson, 1991, p. 138). Rates of yo expression in contact varieties such as the Spanish of Southeast Texas or Southern Arizona can be as low as 19% (Bessett, 2023, p. 32). Overall, rates of expression in previous L1 Spanish research are typically far higher than the 8.3% we report here. L1 child acquisition data, however, do reveal lower levels of expression. Shin (2016, p. 925) reports that the percentage of subject pronoun expression in children ranges from 8% in 6- to 9-year-olds to 11% in children 12 or older. These percentages are very similar to the ones we find in the L2 learner data we analyze in this study (8%, Table 1). Meanwhile, K. Geeslin and Gudmestad (2016) reported about 20% first-person SPP expression among L2 learners of Spanish. Differences in our L2 findings may be due to task: our data came from general interviews in which there was no explicit intention on the part of researchers to elicit first-person SPP usage, while K. Geeslin and Gudmestad (2016) employed sociolinguistic interviews, which may have affected the type of language produced by participants. Our findings, however, align with Gudmestad and Edmonds (2023), who analyzed a subset of the same learner corpus and found a 7% rate of first-person SPP expression.
We asked whether SA played a role in the SPP usage patterns. Results of the mixed-effects linear regression (Table 2) suggest a significant difference between the speech pre-SA and post-SA1: overt yo expression is less likely prior to SA than post-SA1. Moreover, in this project, we asked whether additional language learners of Spanish exhibit any evidence of having accumulated in memory the usage-based factors that constrain SPP variation, and we find a mixed result. We find no evidence of acquisition of the contextually informed measure estimating verbs’ likelihood of use in a switch- (vs. same-) reference context (FRC). However, our results suggest that these learners are sensitive to the online conditioning of the switch-reference factor. Expression of an overt yo is more likely when the target lacks discourse continuity, and omission is more likely when the target subject does not switch from the previous referent. Despite a robust effect of this conditioning factor, there is no evidence to suggest that verbs’ patterns of use in conditioning contexts accumulate in memory (no significant FRC effect either independently or in interaction with lexical frequency). Therefore, we conclude that the verb’s probability of use in the conditioning context is not lexicalized.
This result confirms, unsurprisingly, that additional language learner results differ from child L1 acquisition data. E. L. Brown and Shin (2022) show that children first acquire the online switch-reference condition. With time and sufficient input, they acquire the lexically specific pattern of verbs’ likelihood of use in the conditioning context (e.g., FRCswitch). In the child language data, the FRC effect is apparent starting around the age of 8 or 9 years old. It is unlikely that additional language learners, whose primary experience will have been in the classroom, will have been exposed to a similar amount and type of input as that of an 8- or 9-year-old L1 speaker. Unfortunately, for this study, we lack information about the quantity and quality of target language input that our participants received while abroad that would inform this comparison further, but future studies could include this type of data.
Unlike the FRC measurement, which is discourse-sensitive (requiring speakers to register and accumulate a verb’s likelihood of occurrence in a switch-reference context), our measure of a verb form’s prior probability of overt yo expression does not place the same cognitive demands on learners. The presence or absence of yo expressed with a verb presents a more tangible pattern for learners to attend to than the likelihood of the target verb occurring in a switch- or same-reference context. The relative perceptual salience, or noticeability, of the pattern (Ellis & Collins, 2009), together with the more immediate proximity of an expressed subject pronoun to its verb (rather than across clauses) and the resulting lower cognitive processing load (Fedorenko et al., 2013), could account for the acquisition of Likelihood of SPP expression as a conditioning factor over FRC. In L1 acquisition, it has been argued (Shin, 2016) that complex patterns, as could be the case of FRC, may be acquired later than simpler patterns (such as, for example, likelihood of an overt vs. null yo).
This study also set out to examine whether any SA effects persisted over time subsequent to returning to the home country or whether effects waned. We examined data extracted from interviews immediately after the SA (post-SA1) as well as after a time delay (post-SA2). We focused upon one variable of interest: the acquisition of Likelihood of SPP expression. This variable aligns only slightly with first-person SPP variation in pre-SA data, but a significant effect is evident after time abroad (Table 2), suggesting that SA may play a role in learners’ development of SPP expression. Meanwhile, the average Likelihood of SPP expression remains a stable correlate of expressed tokens from immediate post-SA1 to delayed post-SA2. While the rate of SPP expression decreases between the immediate (9.6%) and delayed (8.0%) post-SA interviews (Table 1), the pattern of first-person SPP use persists: the average Likelihood of SPP expression is higher for expressed tokens than for null tokens in both immediate post-SA1 and delayed post-SA2, whereas there is almost no difference between expressed and null tokens in the pre-SA data (Figure 1). This suggests that usage-based effects acquired during a year abroad linger even after learners have been home in the UK for nine months (delayed post-SA2).
Nonetheless, these findings must be interpreted with caution as it is unclear whether simply more experience with exemplars, either at home or abroad, may have produced stronger associations. This study does not employ an at-home comparison group. Yet, even if we did, a comparison between SA and at-home would also be subject to confounding factors, such as individual differences and varying input. Students who choose to study abroad are qualitatively different from those who do not and so a direct comparison between such groups would also have its limitations (Sanz, 2016).
Looking ahead, future directions may include the examination of other extralinguistic factors such as English-language use while abroad, prior years of Spanish study, or placement type and how participant experiences may have varied because of it. One possibility, which Gudmestad and Edmonds (2023) also suggest, is to examine the participant interviews qualitatively to better understand students’ access to Spanish speakers and their varying use of Spanish while abroad. This could paint a better picture of the type of input they received as it relates to SPP expression as measured here. An ongoing challenge of usage-based SLA research is documenting the input that learners receive and their linguistic experiences, since research suggests that additional language learners may be exposed to more formal language even when studying abroad (K. Geeslin et al., 2023a). In sum, this study adds to the growing, albeit still limited, research on variationist SLA and usage-based linguistics by incorporating the effect of usage-based factors in sociolinguistic variation.

Author Contributions

Conceptualization, E.B., T.Q. and J.R.; methodology, E.B., T.Q. and J.R.; software, E.B., T.Q. and J.R.; validation, E.B., T.Q. and J.R.; formal analysis, E.B., T.Q. and J.R.; investigation, E.B., T.Q. and J.R.; resources, E.B., T.Q. and J.R.; data curation, E.B., T.Q. and J.R.; writing—original draft preparation, E.B., T.Q. and J.R.; writing—review and editing, E.B., T.Q. and J.R.; visualization, E.B., T.Q. and J.R.; supervision, E.B., T.Q. and J.R.; project administration, E.B., T.Q. and J.R.; funding acquisition, E.B., T.Q. and J.R. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable because this study used open access data and did not involve human subjects.

Informed Consent Statement

Not applicable because this study used open access data and did not involve human subjects.

Data Availability Statement

Target verb tokens derive from the publicly available LANGSNAP corpus of Spanish (Mitchell et al., 2017) [https://web-archive.southampton.ac.uk/langsnap.soton.ac.uk/view/participant/spanish/index.html]. The corpus estimates for the target verbs were calculated from the Corpus del español (Davies, 2002) [https://www.corpusdelespanol.org/]. Data are available upon request.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A

Table A1. Mixed effects linear regression predicting yo expression in pre- vs. immediate post-SA1 datasets (n = 2401).

| Random effect | Variance | Std. Dev. |
|---|---|---|
| Verb (intercept) | 0.4923 | 0.7017 |
| Speaker (intercept) | 0.9715 | 0.9857 |

| Fixed effects | N | % overt | Estimate coef. | Std. error | p-value |
|---|---|---|---|---|---|
| (Intercept) | | | −3.011 | 1.256 | * |
| Switch-reference (switch) | 1018 | 11.5 | 0.782 | 0.162 | *** |
| Switch-reference (no switch) | 1383 | 5.9 | -- | -- | |
| Timing (Pre-SA) | 1323 | 7.2 | −0.190 | 0.165 | ** |
| Immediate Post-SA1 | 1078 | 9.6 | -- | -- | |
| Priming (previous expressed) | 548 | 11.3 | 0.696 | 0.290 | * |
| (previous null) | 1853 | 7.4 | -- | -- | |
| Log Likelihood SPP expression | -- | -- | 1.847 | 1.007 | . |
| Log Verb Frequency | -- | -- | 0.366 | 0.256 | n.s. |
| Log FRC_switch | -- | -- | 2.473 | 2.527 | n.s. |

AIC = 1219.2. Random effects: Speaker (N = 25), Verb (n = 61). Positive coefficients are associated with yo expression. Significance codes: *** p < 0.001, ** p < 0.01, * p < 0.05, . p < 0.1, n.s. = not significant.

Notes

1. We also ran the reported models without the Polish and English/Polish participants. The models selected the same factors as significant and showed the same directions of effect for the non-significant factors. We therefore chose to keep the data obtained from these speakers in the final version.
2. None of these speaker characteristics significantly predict subject pronoun variation in these data.
3. Although this corpus also provides abundant written data, both from the twentieth century and for earlier stages of the language, for our calculations we rely exclusively on the oral section, which contains approximately 5 million words of spoken Spanish from different varieties of Latin America and Spain.
4. For more details, see E. L. Brown and Shin (2022).
5. The Friedman Test is a non-parametric test used for non-normally distributed data and is an alternative to the one-way ANOVA.

References

  1. Abreu, L. (2012). Subject pronoun expression and priming effects among bilingual speakers of Puerto Rican Spanish. In K. Geeslin, & M. Díaz-Campos (Eds.), Selected proceedings of the 14th Hispanic linguistics symposium (pp. 1–8). Cascadilla Proceedings Project.
  2. Ambridge, B., Kidd, E., Rowland, C., & Theakston, A. (2015). The ubiquity of frequency effects in first language acquisition. Journal of Child Language, 42, 239–273.
  3. Bannard, C., & Matthews, D. (2008). Stored word sequences in language learning: The effect of familiarity on children’s repetition of four-word combinations. Psychological Science, 19(3), 241–248.
  4. Bayley, R. J., & Tarone, E. (2012). Variationist Perspectives. In S. M. Gass, & A. Mackey (Eds.), The Routledge handbook of second language acquisition (pp. 41–56). Routledge.
  5. Becker, C., Blythe, R., Bybee, J., Christiansen, M., Croft, W., Ellis, N., Holland, J., Ke, J., Larsen-Freeman, D., & Schoenemann, T. (2009). Language is a complex adaptive system: Position paper. Language Learning, 59(1), 1–26.
  6. Bessett, R. M. (2023). A cross dialectal comparison of first person singular subject pronoun expression in Southern Arizona and Southeast Texas. Referring to discourse participants in Ibero-Romance languages, 4, 25.
  7. Brown, E. (2020, October 9). The long-term accrual in memory of contextual conditioning effects. 6th PSUxLing Conference, The Pennsylvania State University, State College, PA, USA.
  8. Brown, E. L., & Rivas, J. (2011). Subject–verb word-order in Spanish interrogatives: A quantitative analysis of Puerto Rican Spanish. Spanish in Context, 8(1), 23–49.
  9. Brown, E. L., & Shin, N. (2022). Acquisition of cumulative conditioning effects on words: Spanish-speaking children’s [subject pronoun + verb] construction. First Language, 42(3), 361–382.
  10. Bybee, J. (2001). Phonology and language use (Vol. 94). Cambridge University Press.
  11. Bybee, J. (2002). Word frequency and context of use in the lexical diffusion of phonetically conditioned sound change. Language Variation and Change, 14, 261–290.
  12. Bybee, J. (2008). Usage-based grammar and second language acquisition. In P. Robinson, & N. Ellis (Eds.), Handbook of cognitive linguistics and second language acquisition (pp. 216–236). Routledge.
  13. Bybee, J. (2010). Language, usage and cognition. Cambridge University Press.
  14. Bybee, J., & Thompson, S. (1997, September). Three frequency effects in syntax. In Annual meeting of the Berkeley Linguistics Society (pp. 378–388). Linguistic Society of America.
  15. Cameron, R., & Flores-Ferrán, N. (2004). Perseveration of subject expression across regional dialects of Spanish. Spanish in Context, 1(1), 41–65.
  16. Carvalho, A., Orozco, R., & Shin, N. (Eds.). (2015). Subject pronoun expression in Spanish: A cross-dialectal perspective. Georgetown University Press.
  17. Davies, M. (2002). Corpus del Español: 100 million words, 1200s–1900s. Available online: http://www.corpusdelespanol.org/hist-gen/ (accessed on 3 June 2024).
  18. Denbaum, N. (2020). Role of social interaction abroad in the L2 acquisition of sociolinguistic variation: The case of subject expression in the Dominican Republic. In D. Pascual y Cabo, & I. Elola (Eds.), Current theoretical and applied perspectives on Hispanic and Lusophone linguistics (pp. 63–84). John Benjamins.
  19. Denbaum-Restrepo, N. (2023). The role of language attitudes in the L2 acquisition of sociolinguistic variation. The case of pre-verbal subjects in wh-questions. In S. Zahler, A. Long, & B. Linford (Eds.), Study abroad and the second language acquisition of sociolinguistic variation in Spanish (pp. 229–265). John Benjamins.
  20. Ellis, N. C. (2012). What can we count in language, and what counts in language acquisition, cognition, and use. Frequency Effects in Language Learning and Processing, 1, 7–34.
  21. Ellis, N. C. (2015). Cognitive and social aspects of learning from usage. In T. Cadierno, & S. W. Eskildsen (Eds.), Usage-based perspectives on second language learning (pp. 49–74). De Gruyter Mouton.
  22. Ellis, N., & Collins, L. (2009). Input and second language acquisition: The roles of frequency, form, and function introduction to the special issue. The Modern Language Journal, 93(3), 329–335.
  23. Erker, D., & Guy, G. R. (2012). The role of lexical frequency in syntactic variability: Variable subject personal pronoun expression in Spanish. Language, 88(3), 526–557.
  24. Eskildsen, S. W., & Cadierno, T. (2015). Advancing usage-based approaches to L2 studies. In T. Cadierno, & S. W. Eskildsen (Eds.), Usage-based perspectives on second language learning (pp. 1–18). De Gruyter Mouton.
  25. Fedorenko, E., Woodbury, R., & Gibson, E. (2013). Direct Evidence of Memory Retrieval as a Source of Difficulty in Non-Local Dependencies in Language. Cognitive Science, 37(2), 378–394.
  26. Geeslin, K., Daidone, D., Long, A. Y., & Solon, M. (2023a). Usage-based models of second language acquisition. In M. Díaz-Campos, & S. Balasch (Eds.), The handbook of usage-based linguistics (pp. 345–361). John Wiley and Sons.
  27. Geeslin, K., Goebel-Mahrle, T., Guo, J., & Linford, B. (2023b). Variable subject expression in second language acquisition: The role of perseveration. In P. Posio, & P. Herbeck (Eds.), Referring to discourse participants in Ibero-Romance languages (pp. 69–104). Language Science Press.
  28. Geeslin, K., & Gudmestad, A. (2016). Subject expression in Spanish: Contrasts between native and non-native speakers for first and second-person singular referents. Spanish in Context, 13, 53–79.
  29. Geeslin, K. L., García-Amaya, L. J., Hasler-Barker, M., Henriksen, N. C., & Killam, J. (2012). The L2 acquisition of variable perfective past time reference in Spanish in an overseas immersion setting. In Selected proceedings of the 14th Hispanic linguistics symposium (pp. 197–213). Cascadilla Proceedings Project.
  30. Geeslin, K., Linford, B., & Fafulas, S. (2015). Variable subject expression in second language Spanish. In A. Carvalho, R. Orozco, & N. Shin (Eds.), Subject pronoun expression in Spanish: A cross-dialectal perspective (pp. 191–209). Georgetown University Press.
  31. Geeslin, K., Linford, B., Fafulas, S., Long, A., & Díaz-Campos, M. (2013). The L2 development of subject form variation in Spanish: The individual vs. the group. In J. C. Amaro, G. Lord, A. de Prada Perez, & J. E. Aaron (Eds.), Selected proceedings of the 16th Hispanic linguistics symposium (pp. 156–174). Cascadilla Proceedings Project.
  32. Geeslin, K., & Long, A. Y. (2014). Sociolinguistics and second language acquisition: Learning to use language in context. Routledge.
  33. Gudmestad, A. (2021). Variationist approaches. In N. Tracy-Ventura, & M. Paquot (Eds.), The Routledge handbook of second language acquisition and corpora (pp. 228–237). Routledge.
  34. Gudmestad, A., & Edmonds, A. (2023). The variable use of first-person-singular subject forms during an academic year abroad. In S. L. Zahler, A. Long, & B. Linford (Eds.), Study abroad and the second language acquisition of sociolinguistic variation in Spanish (pp. 266–290). John Benjamins Publishing Company.
  35. Kanwit, M., & Solon, M. (2023). Variable outcomes abroad: Exploring the role of pre-program proficiency in the development of Spanish future-time expression. In Study abroad and the second language acquisition of sociolinguistic variation in Spanish (pp. 292–320). John Benjamins Publishing Company.
  36. Kemmer, S., & Barlow, M. (2000). Introduction: A usage-based conception of language. In M. Barlow, & S. Kemmer (Eds.), Usage-based models of language (pp. 7–28). CSLI Publications.
  37. Krug, M. (2003). Frequency as a determinant in grammatical variation and change. Topics in English Linguistics, 43, 7–68.
  38. Lastra, Y., & Martínez Butragueño, P. (2015). Subject pronoun expression in oral Mexican Spanish. In A. M. Carvalho, R. Orozco, & N. Shin (Eds.), Subject pronoun expression in Spanish: A cross-dialectal perspective (pp. 39–57). Georgetown University Press.
  39. Linford, B. G. (2016). The second-language development of dialect-specific morpho-syntactic variation in Spanish during study abroad [Doctoral dissertation, Indiana University].
  40. Linford, B., & Geeslin, K. (2022). The role of referent cohesiveness in variable subject expression in L2 Spanish. Spanish in Context, 19(3), 508–536.
  41. Linford, B., Zahler, S., & Whatley, M. (2018). Acquisition, study abroad, and individual differences: The case of subject pronoun variation in L2 Spanish. Study Abroad Research in Second Language Acquisition and International Education, 3(2), 243–274.
  42. López-Beltrán, P., & Carlson, M. T. (2020). How usage-based approaches to language can contribute to a unified theory of heritage grammars. Linguistics Vanguard, 6(1), 20190072.
  43. Mitchell, R., Tracy-Ventura, N., & McManus, K. (2017). Anglophone students abroad: Identity, social relationships and language learning. Routledge.
  44. Otheguy, R., & Zentella, A. C. (2012). Spanish in New York: Language contact, dialectal leveling, and structural continuity. OUP USA.
  45. Posio, P. (2018). Properties of pronominal subjects. In K. L. Geeslin (Ed.), Handbook of Spanish linguistics (pp. 286–306). Cambridge University Press.
  46. Pozzi, R. (2022). Acquiring sociolinguistic competence during study abroad: US students in Buenos Aires. In Variation in Second and Heritage Languages (pp. 199–222). John Benjamins Publishing Company.
  47. Ranson, D. L. (1991). Person marking in the wake of /s/ deletion in Andalusian Spanish. Language Variation and Change, 3(2), 133–152.
  48. R Core Team. (2019). R: A language and environment for statistical computing. R Foundation for Statistical Computing.
  49. Sanz, C. (2016, April 10). SLA in study abroad contexts: A researcher-practitioner’s perspective. American Association for Applied Linguistics Annual Conference, Orlando, FL, USA.
  50. Shin, N. L. (2012). Variable use of Spanish subject pronouns by monolingual children in Mexico. In K. Geeslin, & M. Díaz-Campos (Eds.), Selected proceedings of the 14th Hispanic linguistics symposium (pp. 130–141). Cascadilla Proceedings Project.
  51. Shin, N. L. (2016). Acquiring constraints on morphosyntactic variation: Children’s Spanish subject pronoun expression. Journal of Child Language, 43(4), 914–947.
  52. Shin, N. L., & Miller, K. (2022). Children’s acquisition of morphosyntactic variation. Language Learning and Development, 18(2), 125–150.
  53. Tomasello, M. (2003). The key is social cognition. Language in mind: Advances in the study of language and thought, 47–57.
  54. Torres Cacoullos, R., & Travis, C. (2018). Bilingualism in the community: Code-switching and grammars in contact. Cambridge University Press.
  55. Travis, C. E. (2007). Genre effects on subject expression in Spanish: Priming in narrative and conversation. Language Variation and Change, 19(2), 101–135.
  56. Travis, C. E., & Torres Cacoullos, R. (2012). What do subject pronouns do in discourse? Cognitive, mechanical, and constructional factors in variation. Cognitive Linguistics, 23(4), 711–748.
  57. Zahler, S., Long, A. Y., & Linford, B. (Eds.). (2023). Study abroad and the second language acquisition of sociolinguistic variation in Spanish. John Benjamins Publishing.
Figure 1. Average Likelihood of SPP expression for yo vs. null in learner data.
Table 1. Rates of SPP expression by SA timing.

| Timing | n | % SPP expression |
|---|---|---|
| Pre-SA | 1323 | 7.2 |
| Post-SA1 | 1078 | 9.6 |
| Post-SA2 | 1117 | 8.0 |
| Total | 3518 | 8.3 |
Table 2. Mixed effects linear regression predicting yo expression in Pre- vs. Immediate Post-SA1 datasets (N = 2401).

| Random effect | Variance | Std. Dev. |
|---|---|---|
| Verb (intercept) | 0.5236 | 0.7236 |
| Speaker (intercept) | 1.0106 | 1.0053 |

| Fixed effects | N | % overt | Estimate coef. | Std. error | p-value |
|---|---|---|---|---|---|
| (Intercept) | | | −2.3314 | 1.2801 | |
| Switch-reference (switch) | 1018 | 11.5 | 0.778 | 0.163 | <0.001 |
| Switch-reference (no switch) | 1383 | 5.9 | -- | -- | |
| Timing (Pre-SA) | 1323 | 7.2 | −1.860 | 0.567 | 0.001 |
| Immediate Post-SA1 | 1078 | 9.6 | -- | -- | |
| Priming (previous expressed) | 548 | 11.3 | 0.673 | 0.293 | 0.022 |
| (previous null) | 1853 | 7.4 | -- | -- | |
| Log Likelihood SPP expression | -- | -- | 3.214 | 1.123 | 0.004 |
| Log Verb Frequency | -- | -- | 0.369 | 0.257 | 0.151 |
| Log FRC_switch | -- | -- | 2.181 | 2.560 | 0.394 |
| Log Likelihood SPP expression X Timing (pre-SA) | -- | -- | −2.903 | 0.945 | 0.002 |

AIC = 1211.7. Random effects: Speaker (N = 25), Verb (n = 61). Positive coefficients are associated with yo expression.
Table 3. Mixed effects linear regression predicting yo expression in immediate post-SA1 vs. delayed post-SA2 datasets (n = 2195).

| Random effect | Variance | Std. Dev. |
|---|---|---|
| Verb (intercept) | 0.3142 | 0.5606 |
| Speaker (intercept) | 1.2982 | 1.1394 |

| Fixed effects | n | % overt | Estimate coef. | Std. error | p-value |
|---|---|---|---|---|---|
| (Intercept) | | | −2.0275 | 0.595 | |
| Switch-reference (switch) | 882 | 11.1 | 0.701 | 0.168 | <0.001 |
| Switch-reference (no switch) | 1313 | 7.2 | -- | -- | |
| Timing (Delayed Post-SA2) | 1117 | 8.0 | 0.071 | 0.566 | 0.901 |
| Immediate Post-SA1 | 1078 | 9.7 | -- | -- | |
| Priming (previous expressed) | 109 | 19.3 | 0.448 | 0.288 | 0.112 |
| (previous null) | 2086 | 8.3 | -- | -- | |
| Log Likelihood SPP expression | -- | -- | 2.273 | 0.997 | 0.023 |
| Log Likelihood SPP expression X Timing (delayed-SA) | -- | -- | 0.234 | 1.002 | 0.815 |

AIC = 1119.8. Random effects: Speaker (N = 25), Verb (n = 61). Positive coefficients are associated with yo expression.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Brown, E.; Quan, T.; Rivas, J. Testing Cumulative Lexicalized Effects in Study Abroad: Variable Subject Pronoun Expression in Spanish as an Additional Language. Languages 2025, 10, 110. https://doi.org/10.3390/languages10050110

