Article

Variability in the Online Processing of Subject–Verb Number Agreement in Spanish as a Heritage Language: The Role of Lexical Frequency

by Jill Jegerski 1,* and Sara Fernández Cuenca 2
1 Department of Spanish and Portuguese, University of Illinois, Urbana, IL 61801, USA
2 Department of Spanish, Wake Forest University, Winston-Salem, NC 27109, USA
* Author to whom correspondence should be addressed.
Languages 2025, 10(9), 211; https://doi.org/10.3390/languages10090211
Submission received: 31 May 2025 / Revised: 12 August 2025 / Accepted: 12 August 2025 / Published: 27 August 2025
(This article belongs to the Special Issue Language Processing in Spanish Heritage Speakers)

Abstract

This eye tracking study examined the role of lexical frequency in the processing of non-local verbal number agreement by heritage speakers of Spanish. Few prior studies of heritage bilingualism have investigated the role of word frequency in the comprehension or production of morphosyntax, and none have employed a real-time measure of sentence processing, despite the well-known sensitivity of such methods to word frequency and the proposal of some scholars that such online methodologies could be particularly useful in research on heritage speakers. Fifty heritage speakers of Spanish read stimulus sentences containing non-local verbal number agreement that depended on a verb that was either high or low frequency, based on published corpus data. The results suggest that the online integration of verbal agreement was both more immediate and more robust with high frequency verbs than with low frequency verbs. Moreover, an analysis of individual language background variables indicates that faster reading was associated with greater sensitivity to verbal agreement with low frequency verbs. These findings are consistent with theoretical claims that lexical frequency can play an important role in the morphosyntax of heritage speakers, due to reduced exposure to the home language and, particularly, low frequency words.

1. Introduction

For much of its relatively short history, theoretical and empirical research on heritage language morphosyntax has focused primarily on global comparisons between heritage bilinguals and other populations of language users, such as monolinguals, monolingually-raised first language (L1) speakers, and second language (L2) learners. However, there is a growing awareness of the limitations of this group-level approach, such as a potential lack of methodological control (Rothman et al., 2023) and even the inadvertent encouragement of negative attitudes toward bilingualism in the broader society (De Houwer, 2023). Recent research has therefore emphasized the variability that tends to occur within the heritage speaker population itself, as well as the potential for the study of such variability to inform fundamental questions in the study of heritage language bilingualism (e.g., Perez-Cortes & Giancaspro, 2022; Rothman et al., 2023).
Morphosyntactic variability among heritage speakers can be approached from at least two different perspectives: by looking within a group and by looking within individual speakers (Perez-Cortes & Giancaspro, 2022). Empirical research on within-group variability has typically investigated the role of individual differences in language background variables, such as proficiency in the heritage language and age of onset of bilingualism (Montrul, 2016), and related theoretical claims have posited a critical role for background variables, such as age of acquisition of the majority language (Montrul, 2008). Research on variability within individual speakers is relatively limited (Perez-Cortes & Giancaspro, 2022) and has focused primarily on lexical frequency (i.e., word frequency) and its potential role in the comprehension and production of morphosyntax. Theoretical discussions guiding this approach have proposed that lexical frequency is closely related to exposure to the heritage language, under the assumption that higher-frequency items are typically encountered more often and therefore have greater exposure than lower-frequency items (Hur et al., 2020; Perez-Cortes & Giancaspro, 2022).
Thus far, only a handful of empirical studies have examined the role of lexical frequency in the untimed production and comprehension of heritage language morphosyntax (Giancaspro, 2020; Hur, 2020; Hur et al., 2020; López Otero, 2022, 2023; Perez-Cortes, 2022) and, to our knowledge, no prior study has employed a real-time (i.e., online) measure of sentence processing like eye tracking or self-paced reading. Such work stands to be particularly informative because online measures have well-known sensitivity to lexical frequency and because such methodologies have also been proposed to be especially useful in the study of heritage speakers (Bayram et al., 2024; Jegerski, 2018). The present study addresses this gap in existing research with an eye tracking study of the role of verb frequency in the processing of Spanish verbal number agreement among heritage speakers. In addition to this primary focus on variability within individuals, we also examine the role of individual background variables related to variability among heritage speakers as a group, to obtain a more complete picture of variability in the processing of morphosyntax.

1.1. The Processing of Grammatical Agreement by Heritage Speakers

Subject–verb agreement, a core aspect of morphosyntax in many languages, occurs when the form of a verb in a given phrase changes according to the features of its subject. In Spanish, verb inflections vary according to the person and number of the subject (in addition to the tense, aspect, and mood of the sentence). Monolingual children do not fully acquire verbal agreement until at least five years of age (Johnson et al., 2005; Miller & Schmitt, 2014) and U.S. children acquiring Spanish as a heritage language can show slower development of verbal agreement than monolinguals (Goldin, 2022). This is likely because heritage speakers often have extensive exposure to English before age five, which can reduce their exposure to input in Spanish.
In theoretical discussions of heritage language bilingualism, morphosyntactic phenomena like agreement have long been of primary interest because they can be challenging for heritage speakers to acquire and maintain (e.g., Montrul, 2008). More recently, the real-time processing of morphosyntax and distance dependencies have also been posited as a particular area of vulnerability in heritage bilingualism (Polinsky & Scontras, 2020). On the other hand, empirical studies using real-time measures, such as self-paced reading and eye tracking, have shown that heritage speakers are sensitive to simple verbal number agreement when the subject and verb are immediately adjacent to each other (Foote, 2011; Sagarra & Rodriguez, 2022); plus, similar observations have been made for verbal person agreement (Di Pisa et al., 2024) and with a paradigm that combined person and number agreement (Rodríguez & Reglero, 2015), so verbal agreement does not seem to present any broad, categorical difficulty. However, there is also some evidence that non-adjacent or non-local agreement, in which the subject and verb are separated by several intervening words, can be challenging for heritage speakers (Foote, 2011). This is likely due to the extra burden on cognitive resources like working memory; plus, complex subjects can present the additional challenge of potential interference from the number specifications of any intervening noun phrases. Hence, although the existing empirical evidence is quite limited, it seems that verbal agreement that is non-local may have the greatest potential for variability in heritage language processing, which is why it was employed in the present study.
With such a limited body of previous online research on the processing of verbal agreement by heritage speakers, no prior study has investigated within-speaker variability (i.e., variability between lexical items) and little is known about the role of individual background variables in this specific aspect of sentence processing. Age of onset of bilingualism has been examined on a broad level by comparing heritage speakers to late bilinguals (i.e., L2 learners) and has not revealed any group-level differences (Foote, 2011; Sagarra & Rodriguez, 2022). Similarly, heritage language proficiency has not been observed to play a role in the processing of verbal agreement (Di Pisa et al., 2024; Sagarra & Rodriguez, 2022).
Looking beyond the limited body of published work on verbal agreement, a related aspect of morphosyntactic processing that has received more attention in empirical research on heritage languages is gender agreement. Several studies have shown that heritage speakers are able to integrate gender agreement during online sentence processing (Foote, 2011; Keating, 2022; Luque et al., 2023). The background variable of age of onset of bilingualism did not appear to play a role in a between-group comparison of early and late bilinguals (i.e., heritage speakers and L2 learners; Foote, 2011), but one study with a more fine-grained analysis of variability within the group (Keating, 2022) revealed subtle differences in the timing of gender agreement effects according to whether heritage speakers had acquired the minority and majority languages in a simultaneous or sequential manner (Montrul, 2008). Specifically, eye movement data suggested that sequential bilinguals, who had time to acquire just Spanish for at least three years before English started to encroach on their early childhood input, processed gender agreement faster than did simultaneous bilinguals. Also worth noting is the importance of methodology in this study: the subtle within-group difference was revealed only with eye tracking, which is arguably the most fine-grained online measure of language processing because it has multiple early and late measures for each word of a stimulus sentence. In other words, the method may be especially well suited to the study of variability in language comprehension. For this reason, eye tracking was employed for the present study.

1.2. Lexical Frequency and the Morphosyntax of Heritage Speakers

In addition to individual background variables, another potential source of variability that has begun to receive some attention in HL research in recent years is lexical frequency. This relatively new perspective has enabled the investigation of variability at the level of the individual speaker, the contribution of which has been articulated by Perez-Cortes and Giancaspro (2022, p. 2) as follows: “…intra-speaker variability, too, is a micro-level pattern that simply cannot be addressed by looking at the macro-level comparisons—specifically, between-speaker and between-property comparisons—that continue to predominate in the field.” The starting point of this approach is the assumption that lexical frequency affects how words are stored and accessed (Bybee, 2007), such that higher frequency items are more entrenched and lower frequency items have less developed representations and are therefore more susceptible to variability. Frequency effects can occur with all language users, but scholars working on heritage languages have proposed that frequency effects can have a greater impact among heritage bilinguals due to their reduced exposure to the minority language (O’Grady et al., 2011; Perez-Cortes & Giancaspro, 2022; Putnam & Sánchez, 2013). In other words, a word that is considered low frequency based on corpus data would likely have an effective frequency that is even lower for heritage speakers, so differences between high and low frequency words would be magnified. A key implication of this assumption is that difficulty with lexical items, and particularly the grammatical features associated with them, has the potential to impact the comprehension and production of morphosyntax because it involves the combination of said lexical items and grammatical features in the structuring of phrases. 
For example, reduced exposure to individual Spanish nouns and their associated specifications for the assignment of grammatical gender at the word level can affect the broader morphosyntactic phenomenon of gender agreement at the phrase level (Montrul et al., 2014).
Several studies have investigated the role of lexical frequency in the morphosyntactic knowledge of heritage speakers using offline methods and, in general, they have observed frequency effects. Hur (2020) found that heritage speakers of Spanish with intermediate proficiency produced more differential object marking on nouns that followed higher frequency verbs than low frequency ones, while a comparison group of Spanish-dominant bilinguals showed no effect of frequency. Hur et al. (2020) made similar observations regarding Spanish gender agreement, also among heritage speakers, but this time with a group that included individuals with advanced proficiency and with no Spanish-dominant comparison group. The participants in this study performed more successfully with high (versus low) frequency nouns on two experimental measures, an elicited production task and a receptive forced-choice task. Lexical frequency also appears to play a role in different aspects of the Spanish subjunctive mood among heritage speakers: Giancaspro (2020) and Perez-Cortes (2022) both found higher levels of accuracy with high frequency verbs, as compared to lower frequency ones, on both receptive and productive measures. Additionally, the comparison groups of Spanish-dominant bilinguals in these two studies did not show any effects of verb frequency. A related aspect of verbal morphosyntax was examined by López Otero (2023), who also observed that higher verb frequency was associated with higher accuracy for Spanish imperatives, although in this study, frequency effects were only seen in data from an elicited production task and not in acceptability judgments. Data from a comparison group of Spanish-dominant bilinguals was not analyzed due to very low variability, which suggests that those participants were at ceiling, regardless of verb frequency. 
Finally, López Otero (2022) investigated subject placement with Spanish unaccusative and unergative predicates and also found that verb frequency affected performance on an elicited production task but not on an acceptability judgment task. Furthermore, as with most of the other studies in this line of research, a comparison group of Spanish-dominant bilinguals did not show the lexical frequency effect in question.
Thus, one general observation from previous research is that lexical frequency effects may be quite common in the morphosyntax of heritage speakers, given that all six empirical studies that have examined this key theoretical claim (O’Grady et al., 2011; Perez-Cortes & Giancaspro, 2022; Putnam & Sánchez, 2013) so far have obtained results that support it. Moreover, the lack of frequency effects among Spanish-dominant bilinguals in all four of the prior studies that reported and analyzed data from comparison groups (Giancaspro, 2020; Hur, 2020; López Otero, 2022; Perez-Cortes, 2022) suggests that heritage speakers may be more affected by lexical frequency than other bilinguals. Lastly, this previous research has been more ambiguous with regard to a proposal related to the theory of Putnam and Sánchez (2013) that frequency effects should primarily affect productive language skills rather than receptive ones. Five of the abovementioned studies included both productive and receptive experimental measures (mostly elicited production and acceptability judgments) and two found support for the proposal (López Otero, 2022, 2023), while three reported findings that suggested that productive and receptive skills were both affected by lexical frequency (Giancaspro, 2020; Hur et al., 2020; Perez-Cortes, 2022).
One notable limitation of the existing research on lexical frequency effects on the morphosyntax of heritage speakers is that all the studies reviewed above employed the same experimental measures: elicited production and untimed metalinguistic judgments. This limitation may be closely related to the question of whether the comprehension of morphosyntax is equally affected by word frequency or if frequency primarily affects language production (Putnam & Sánchez, 2013). We propose that research with a wider range of methods is needed to complement the existing body of work and that online measures of sentence processing, like eye tracking and self-paced reading, may be especially valuable in this context for two reasons: (1) online methods are particularly sensitive to lexical frequency, and some of the most fundamental evidence in support of theoretical proposals related to bilingual lexical access comes from precisely timed measures (e.g., Gollan et al., 2008), and (2) online methods are thought to be particularly appropriate for research on heritage bilinguals (Bayram et al., 2024; Jegerski, 2018). Hence, this was another reason why eye tracking was employed in the present study.

1.3. The Present Study

Given this background and the gaps in the existing literature on the topic, the objective of the present study was to investigate the potential for lexical frequency to affect morphosyntactic processing in heritage Spanish. Eye tracking was used to measure the real-time processing of non-local verbal number agreement in stimulus sentences with critical verbs that were either high frequency or low frequency. More specifically, this study was guided by the following research questions:
  (1) Do heritage speakers of Spanish show online sensitivity to non-local verbal number agreement?
  (2) Does their online sensitivity to non-local verbal number agreement vary according to verb frequency (as the operationalization of variability within individuals)?
  (3) Does their online sensitivity to non-local verbal number agreement vary according to individual language background variables (as the operationalization of variability within the group)?

2. Materials and Methods

2.1. Participants

A total of 50 Spanish heritage speakers (40 female) participated in this study and all were students at a large university in the central U.S. at the time of recruitment. Forty-seven were born in the U.S. to one or two Spanish-speaking parents and three were born in a Spanish-speaking country before moving to the U.S. during childhood. Of the 50 participants, 16 were simultaneous bilinguals and the rest acquired only Spanish from birth and started to learn English later. Spanish proficiency was estimated via an adapted version of the DELE (Diploma del Español como Lengua Extranjera “Certificate of Spanish as a Foreign Language”) standardized Spanish proficiency test (Montrul & Slabakova, 2003), as well as a self-rating from the language background questionnaire. More information about the participants can be found in Table 1. Overall, the participants tended to rate their English abilities higher than their Spanish abilities, which is a common pattern among heritage speakers in the U.S., given that minority languages receive little to no support at the institutional and educational levels (Fuller & Leeman, 2020). The mean DELE score indicated Spanish proficiency in the intermediate-high to advanced-low range, but the individual scores also demonstrated considerable variability.

2.2. Materials

The stimulus sentences that the participants read for the eye-tracking experiment targeted non-local Spanish subject–verb number agreement via a grammaticality manipulation, to examine morphosyntactic processing, and varied in terms of whether the target verb was high frequency or low frequency, as illustrated in examples (1) and (2) below. All target verbs were morphologically regular. A full set of stimuli is provided in Appendix A.
(1) High frequency verb (grammatical, ungrammatical)
    a. El paquete que pidió la secretaria llegó esta tarde a las cinco.
    b. *Los paquetes que pidió la secretaria llegó esta tarde a las cinco.
    “The package/*packages that the secretary ordered arrived-SING this afternoon at five.”
(2) Low frequency verb (grammatical, ungrammatical)
    a. El carrito que empujó el niño abolló el vehículo de la policía.
    b. *Los carritos que empujó el niño abolló el vehículo de la policía.
    “The little car/*cars that the boy pushed dented-SING the police officer’s vehicle.”
Following the previous research reviewed in the introduction, lexical frequency was operationalized as a categorical variable. To identify high- and low-frequency verbs, initial selections were made based on the frequency rankings provided by Davies (2006). Subsequently, exact frequency counts were obtained from the SUBTLEX-ESP database (Cuetos et al., 2011), which compiles lexical frequency data from a 40-million-word corpus derived from Spanish film subtitles. Verbs categorized as high frequency exhibited average raw frequency counts of 4929 for the infinitive form and 2184 for the inflected form (third person singular, preterite), corresponding to mean log frequencies of 3.37 and 3.01, respectively. In contrast, verbs classified as low frequency yielded average counts of 41 for the infinitive and 16 for the inflected form, with corresponding log frequencies of 1.33 and 0.79. To corroborate the assumption that the participants were more familiar with high frequency verbs than with low frequency ones, a written verb test was administered at the end of the research session.
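The reported mean log frequencies are lower than the logarithm of the reported mean raw counts (for example, the base-10 log of 4929 is about 3.69, not 3.37). This is the expected pattern if the log was taken for each verb before averaging. The short Python sketch below illustrates the point. It is not the authors' procedure: the counts are invented for illustration, and the use of a plain base-10 log of the raw count is an assumption, since the database may define its log measure differently.

```python
import math

def mean(values):
    values = list(values)
    return sum(values) / len(values)

# Hypothetical raw corpus counts for three verbs (illustration only;
# these are NOT the study's stimuli or the SUBTLEX-ESP values).
counts = [12000, 2500, 300]

# Log of the average raw count.
log_of_mean = math.log10(mean(counts))

# Average of the per-verb logs (assumed here to be how a "mean log
# frequency" is obtained).
mean_of_logs = mean(math.log10(c) for c in counts)

print(round(log_of_mean, 2), round(mean_of_logs, 2))
```

Because the logarithm is concave, the mean of the logs can never exceed the log of the mean, so a few very frequent verbs raise the mean raw count much more than they raise the mean log frequency.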
Following established best practices for eye tracking research (Keating, 2014; Keating & Jegerski, 2015), each participant read 16 high-frequency verb items (8 grammatical, 8 ungrammatical) and 16 low-frequency verb items (also evenly split between grammatical and ungrammatical forms), drawn from 64 total stimulus items (32 with high frequency verbs and 32 with low frequency verbs) and presented in one of four counterbalanced lists. To avoid repetition effects, each sentence appeared only once per experimental list, either in the grammatical or ungrammatical condition. The 32 target sentences in each list were combined with 32 distractor items—designed for a separate study on Spanish mood (Fernández Cuenca & Jegerski, 2023)—and 64 filler sentences that did not target specific linguistic structures but were matched to the experimental items in terms of length and grammaticality ratio (i.e., 50% grammatical, 50% ungrammatical). In total, each list contained 128 sentences, which were presented in pseudo-randomized order to prevent the consecutive presentation of sentences of the same type.
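One way to obtain four lists with these properties can be sketched in Python. The assignment rule below is hypothetical (the article does not describe how items were rotated across lists); it simply shows that 64 items can be distributed so that each list has 8 items per cell and no item appears twice in a list.

```python
from collections import Counter

def build_lists(items_per_frequency=32):
    """Return {list_id: [(frequency, item_id, version), ...]} for four lists.

    Hypothetical scheme: each list receives half of the items, and each item
    appears in two lists, once in each grammaticality version.
    """
    lists = {k: [] for k in range(4)}
    for frequency in ("high", "low"):
        for item in range(items_per_frequency):
            for k in range(4):
                # Lists 0 and 2 take even-numbered items; lists 1 and 3 take odd ones.
                if item % 2 != k % 2:
                    continue
                # Alternate the version so that an item is grammatical in one
                # of its two lists and ungrammatical in the other.
                grammatical = ((item // 2) + (k // 2)) % 2 == 0
                version = "grammatical" if grammatical else "ungrammatical"
                lists[k].append((frequency, item, version))
    return lists

lists = build_lists()
cell_counts = {
    k: Counter((frequency, version) for frequency, _, version in trials)
    for k, trials in lists.items()
}
```

Under this scheme, every list contains 32 target trials, and across the four lists each item is seen once in its grammatical version and once in its ungrammatical version.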
Beyond the eye-tracking task, the materials included a language background questionnaire and a 50-item written test of general Spanish proficiency that was adapted from the Diploma of Spanish as a Foreign Language or DELE. This short proficiency assessment has been employed for over two decades in research on Spanish language acquisition, beginning with the work of Montrul and Slabakova (2003). More recently, it has been shown to correlate with alternative proficiency measures, such as elicited imitation, including in studies involving heritage Spanish speakers (Solon et al., 2022).

2.3. Procedure

The eye-tracking experiment began with on-screen instructions, followed by an initial calibration block that used a nine-point grid and a series of eight practice trials. This preparatory stage took place before the experimental stimuli were introduced. Data collection was carried out using an EyeLink 1000 desktop-mounted system (SR Research, 2005), which recorded the right eye at a sampling rate of 1000 Hz. To ensure participant stability, both chin and forehead rests were employed. Participants were positioned approximately 39 inches away from a 22-inch display monitor. Calibration of the eye tracker took place at the beginning of each session and was validated to maintain an error margin within 0.5 degrees. Additional calibrations were performed following the practice trials and whenever necessary, based on an automatic drift correction implemented before each trial. Stimulus sentences and comprehension questions were shown as a single line of text in black 24-point Tahoma font against a white background.
Participants were instructed to read at a natural pace, as if they were reading a newspaper or book, and were informed that the task targeted reading comprehension. Following each sentence, a comprehension question assessing the meaning of the sentence just read appeared on a separate screen. Participants selected their response using the “A” and “B” buttons on a Microsoft Sidewinder game controller, the standard response device included with the EyeLink 1000 system. Comprehension accuracy was recorded, but no corrective feedback was provided. Given the length of each experimental list (128 sentences), which typically required 30 to 45 min to complete, participants were provided with two scheduled breaks: one following the instructions and practice items, and another at the halfway point of the experimental list. After completing the eye tracking experiment, participants proceeded to complete the language background questionnaire, the Spanish proficiency test, and the verb test.

2.4. Data Analysis

The primary word of interest in the stimulus sentences was the main verb (see examples above under Materials) and we also analyzed eye movements for the word that came two words after the main verb as a spillover region. These two words were labeled Critical Verb and Verb + 2, respectively. We did not analyze eye movements for the Verb + 1 word that immediately followed the critical verb because the word was skipped for most trials (55–69%; see Table 2) due to it being a very short word (Rayner & McConkie, 1976). Outlier trimming was kept to a minimum, following recommendations by Baayen and Milin (2010). Data points were excluded if the relevant stimulus region was not fixated during a given trial and individual fixations shorter than 80 milliseconds were also removed, which resulted in the exclusion of 2.58% of the data for the critical verb and 2.45% of that for the Verb + 2 word. Additionally, fixation durations exceeding 3000 milliseconds were trimmed to 3000 milliseconds, which affected a further 0.35% and 0.44% of the data, respectively. To address positive skew in the distribution, all time-based eye movement measures were log-transformed prior to statistical analysis.
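The trimming steps described above can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the authors' script: the 3000 ms ceiling is assumed to apply to the summed reading-time measure for a region, and the natural logarithm is assumed because the base of the log transformation is not specified.

```python
import math

MIN_FIXATION_MS = 80     # individual fixations shorter than this are removed
MAX_DURATION_MS = 3000   # longer durations are trimmed to this ceiling

def clean_fixations(fixations_ms):
    """Drop individual fixations shorter than the minimum duration."""
    return [f for f in fixations_ms if f >= MIN_FIXATION_MS]

def prepare_measure(duration_ms):
    """Cap a reading-time measure and log-transform it.

    Returns None when the region was not fixated, so that the data point
    can be excluded rather than entered as zero.
    """
    if duration_ms is None or duration_ms <= 0:
        return None
    return math.log(min(duration_ms, MAX_DURATION_MS))

# Hypothetical fixation durations (ms) on one region in one trial.
kept = clean_fixations([50, 80, 200])
logged = prepare_measure(sum(kept))
```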
As is common practice in eye tracking research, we included a combination of early and late eye movement measures (Godfroid, 2020). As an early measure, we employed gaze duration, which is the sum of fixations on a word during the first pass through a sentence, left-to-right. It is thought to reflect the time for initial lexical access and is therefore the primary locus of basic word frequency effects (e.g., Staub, 2011). The two late measures included in the present study were total dwell time and regression path time. Total dwell time (also known as total reading time) is the total amount of time spent looking at a word, including the first fixation and any subsequent fixations, so it includes gaze duration. Total dwell time typically shows effects of non-local verbal agreement (e.g., Lim & Christianson, 2015). Finally, regressive eye movements were included via regression path time (also known as go-past time), which is the sum of all fixations from the first fixation on a word until the word is exited to the right, so it includes the time for regressions that occur on the first pass through a sentence. Regression path time is a later measure that is thought to reflect the integration of a word into a phrase or sentence context, so it can show effects of non-local verbal agreement (e.g., Lim & Christianson, 2015).
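To make the three definitions concrete, the following Python sketch derives them from a sequence of fixations. It is an illustration only, not the software used in the study. It assumes that each fixation is recorded as a pair of word position and duration in milliseconds, and that gaze duration and regression path time are undefined when the word is skipped on the first pass.

```python
def reading_measures(fixations, target):
    """Compute three reading-time measures for one word.

    fixations: list of (word_position, duration_ms) in temporal order.
    target: position of the word of interest.
    """
    # Total dwell time: every fixation on the word, on any pass.
    total = sum(d for word, d in fixations if word == target)

    # Find the first fixation on the word, provided that no word to its
    # right was fixated earlier (otherwise the word was skipped on first pass).
    first = None
    for index, (word, _) in enumerate(fixations):
        if word > target:
            break
        if word == target:
            first = index
            break

    gaze = go_past = None
    if first is not None:
        # Gaze duration: consecutive fixations until the eyes leave the word.
        gaze = 0
        for word, d in fixations[first:]:
            if word != target:
                break
            gaze += d
        # Regression path time: everything until the word is exited to the
        # right, including fixations on earlier words after a regression.
        go_past = 0
        for word, d in fixations[first:]:
            if word > target:
                break
            go_past += d

    return {
        "gaze_duration": gaze,
        "regression_path_time": go_past,
        "total_dwell_time": total,
    }

# Hypothetical trial: the reader fixates word 2 twice, regresses to word 0,
# returns to word 2, moves on to word 3, and later rereads word 2.
trial = [(0, 200), (1, 180), (2, 250), (2, 120), (0, 150), (2, 210), (3, 190), (2, 100)]
measures = reading_measures(trial, target=2)
```

In this hypothetical trial, the regression path time is longer than the total dwell time because it also counts the fixation on the earlier word during the regression, whereas total dwell time counts only fixations on the target word itself.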
Statistical analyses were conducted using mixed-effects models implemented in R version 4.4.2 (R Core Team, 2021). The lme4 package (Bates et al., 2015) was employed for linear model estimation, and pairwise comparisons were conducted using the emmeans package version 1.10.5 (Lenth et al., 2019), which applies the Tukey correction to control for Type I error inflation. Given the binary nature of the comprehension accuracy data, logistic models were used, following the recommendations of Jaeger (2008), and were fit using the glmmTMB package version 1.1.10 (Brooks et al., 2017). Separate models were run for each stimulus word and eye tracking measure, with agreement (grammatical, ungrammatical), verb frequency (high, low), and the interaction as the fixed effects and the subject and item as random effects. All fixed effects and covariates were coded using sum contrast coding. Random effects structures were specified as maximal—incorporating both random intercepts and slopes where model convergence allowed—following the guidelines proposed by Barr et al. (2013). P-values were computed using Satterthwaite’s approximation for degrees of freedom (Kuznetsova et al., 2014). A significance threshold of α = .05 was adopted, with p-values below .10 considered marginally significant (Larson-Hall, 2010).
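Sum contrast coding and the significance thresholds described above can be illustrated with a short Python sketch. It is not the R code used for the analyses. The +1/-1 values and the choice of which level receives +1 are assumptions made for illustration; the sketch only shows how two-level factors and their interaction enter a design matrix.

```python
# Assumed sum codes for the two fixed factors (direction is arbitrary).
AGREEMENT = {"grammatical": 1, "ungrammatical": -1}
FREQUENCY = {"high": 1, "low": -1}

def design_row(agreement, frequency):
    """Return (intercept, agreement, frequency, interaction) for one trial."""
    a = AGREEMENT[agreement]
    f = FREQUENCY[frequency]
    return (1, a, f, a * f)

# One row per cell of the 2 x 2 design.
rows = [design_row(a, f) for a in AGREEMENT for f in FREQUENCY]

def dot(i, j):
    """Dot product of design columns i and j across the four cells."""
    return sum(row[i] * row[j] for row in rows)

def label_p(p, alpha=0.05, marginal=0.10):
    """Classify a p-value using the thresholds stated in the text."""
    if p < alpha:
        return "significant"
    if p < marginal:
        return "marginal"
    return "not significant"
```

With this coding in a balanced design, the agreement, frequency, and interaction columns are mutually orthogonal, so each main effect is estimated as an average over the levels of the other factor rather than at a reference level.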

3. Results

Gaze duration is presented in Table 2 and the corresponding statistical output is provided in Table 3. The agreement effect that was of primary interest in this study was not significant until the Verb + 2 word and it did not interact with verb frequency. Verb frequency was significant at the critical verb, which is not directly relevant to the goals of this study because it simply indicates predictably longer gaze duration for low frequency verbs versus high frequency ones.
Total dwell times are presented in Table 2 and the corresponding statistical output is provided in Table 4. The agreement effect that was of primary interest in this study was significant at the critical verb, where it interacted with verb frequency, but neither carried over to the Verb + 2 word. Follow-up pairwise comparisons conducted to explore the interaction at the critical verb revealed that the agreement effect was significant only with the high frequency verb stimuli (estimate = 0.255; SE = 0.041; t = 6.218; p < .0001), and not with the low frequency verb stimuli (estimate = 0.042; SE = 0.041; t = 1.025; p = .310). Verb frequency on its own was significant at both the critical verb and the Verb + 2 word, but, as mentioned above, this is not directly relevant to the goals of this study because it merely reflects a general tendency for longer reading time with lower frequency words.
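Because the reading times were log-transformed, the pairwise estimates above are differences on a log scale. The following Python sketch shows one way to read them as proportional differences. It assumes that the natural logarithm was used, which is not stated in the article; with base-10 logs the conversion would be 10 raised to the estimate instead.

```python
import math

def log_difference_to_ratio(estimate):
    """Convert a difference on the natural-log scale to a multiplicative ratio."""
    return math.exp(estimate)

# Pairwise estimates reported for total dwell time at the critical verb.
ratio_high = log_difference_to_ratio(0.255)
ratio_low = log_difference_to_ratio(0.042)

# The reported t values are approximately estimate / SE; small discrepancies
# are expected because the published figures are rounded.
t_high = 0.255 / 0.041
```

Under that assumption, the agreement effect with high frequency verbs corresponds to a difference in total dwell time of roughly 29% between the two agreement conditions, compared with roughly 4% for low frequency verbs.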
Regression path times are presented in Table 2 and the corresponding statistical output is provided in Table 5. The agreement effect that was of primary interest in this study was not significant until the Verb + 2 word, where it was marginally significant (p = .058) and showed a marginal interaction with verb frequency (p = .068). Follow-up pairwise comparisons conducted to explore the potential interaction at the Verb + 2 word showed that the agreement effect was significant only with the high-frequency verb stimuli (estimate = 0.103; SE = 0.039; t = 2.622; p = .009) and not with the low frequency verb stimuli (estimate = 0.002; SE = 0.039; t = 0.048; p = .962). Verb frequency was again significant at both the critical verb and the Verb + 2 word, but, as mentioned above, this effect on its own is not directly relevant to the goals of this study.
Accuracy data from the post-stimulus meaning-based comprehension questions are presented in Table 6, where it can be seen that comprehension was quite high overall, at least 87% for all stimulus conditions. The statistical output from the corresponding logit mixed-effects models is provided in Table 7. The only significant effect was verb frequency, which indicated higher comprehension accuracy with high frequency verb stimuli compared to low frequency ones.
In addition to the primary analysis that focused on verb frequency as an indicator of variability within individuals, we conducted a second set of statistical analyses to examine variability within the group, as articulated in the third research question. Four additional linear mixed-effects models each included one centered background variable (the variables were modeled separately due to likely multicollinearity), together with the effect of agreement, the effect of verb frequency, and the respective two- and three-way interactions. The four language background variables were: age of acquisition of English, DELE proficiency test score, self-rated reading ability in Spanish, and average reading speed for the experiment (calculated as the mean total dwell time across all sentence regions and across all sentences in the eye tracking experiment, including experimental items, distractors, and fillers). These models were run on a selected subset of the data: the total dwell time data for the critical verb, because this was where the primary effect of agreement (with the high frequency verb stimuli) was observed in the first analysis above. Each model had random intercepts for subject and item and, wherever possible, random slopes for agreement for both.
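The two preprocessing steps described above, deriving each participant's average reading speed and centering a background variable, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the data are invented, the helper names are hypothetical, and centering is taken to mean subtracting the mean across participants, which the text does not specify.

```python
from collections import defaultdict
from statistics import mean


def mean_dwell_by_participant(records):
    """Average total dwell time per participant.

    `records` is an iterable of (participant_id, dwell_ms) pairs, one per
    sentence region, pooled over experimental items, distractors, and fillers.
    """
    by_participant = defaultdict(list)
    for participant, dwell_ms in records:
        by_participant[participant].append(dwell_ms)
    return {pp: mean(values) for pp, values in by_participant.items()}


def center(values_by_participant):
    """Subtract the mean across participants (an assumed centering scheme)."""
    grand_mean = mean(list(values_by_participant.values()))
    return {pp: v - grand_mean for pp, v in values_by_participant.items()}


# Hypothetical toy data: three participants, two regions each
records = [
    ("P01", 310), ("P01", 290),
    ("P02", 420), ("P02", 380),
    ("P03", 250), ("P03", 270),
]

speed = mean_dwell_by_participant(records)  # lower mean dwell time = faster reader
speed_c = center(speed)                     # centered predictor, mean of zero
```

With a centered predictor of this kind, one of the four models might be written in lme4-style notation as `dwell ~ agreement * frequency * speed_c + (1 + agreement | subject) + (1 + agreement | item)`. That formula is an illustrative assumption based on the description above, not the authors' published specification.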
All four models predictably showed the same effects that were present in the first analysis above, meaning agreement, verb frequency, and the interaction of the two (all ps < .001). The only individual predictor that showed a main effect was average reading speed (estimate = 0.002; SE = 0.000; t = 9.284; p < .001), which predictably indicated that faster individual reading speed for the whole experiment was linked to generally faster total dwell time on the critical verb. There were also two interactions, the first of which was a marginal interaction of self-rated reading ability in Spanish with verb frequency (estimate = 0.019; SE = 0.010; t = 1.777; p = .082). This reflects a larger frequency effect (i.e., longer total dwell time for low frequency verbs as compared to high frequency ones) with lower self-rated reading ability and a smaller frequency effect with higher self-ratings. The second interaction was the three-way interaction of individual reading speed with agreement and verb frequency. Follow-up models were conducted to analyze the high and low verb frequency data separately and these showed that the interaction of reading speed with agreement was present with the low frequency verb stimuli (estimate = 0.000; SE = 0.000; t = 2.431; p = .019), but not with the high frequency verb stimuli (estimate = 0.000; SE = 0.000; t = 0.204; p = .838). Hence, the faster readers had a tendency toward an agreement effect with the low frequency verb stimuli, even though the primary analysis above showed an agreement effect only with the high frequency verb stimuli.
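The interaction estimates and standard errors printed as 0.000 above are most plausibly a consequence of the predictor's scale. Assuming reading speed was entered in milliseconds (an assumption; the unit is not stated in this section), a per-millisecond coefficient can round to zero at three decimals while its t value remains informative. The sketch below uses ordinary least squares on invented data, not the mixed-effects models of the study, to show that rescaling a predictor rescales the estimate and its standard error by the same factor and leaves t unchanged. All names and values are hypothetical.

```python
import math


def ols_slope(x, y):
    """Simple linear regression of y on x; returns (slope, standard error, t)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    rss = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    se = math.sqrt(rss / (n - 2) / sxx)  # residual variance divided by Sxx
    return slope, se, slope / se


# Hypothetical data: mean dwell time per reader (ms) and size of an agreement effect
speed_ms = [220, 260, 300, 340, 380, 420]
effect = [0.11, 0.10, 0.08, 0.07, 0.04, 0.03]

# Same regression with the predictor in milliseconds and in seconds
b_ms, se_ms, t_ms = ols_slope(speed_ms, effect)
b_s, se_s, t_s = ols_slope([v / 1000 for v in speed_ms], effect)

if __name__ == "__main__":
    print(f"per ms: estimate = {b_ms:.3f}, SE = {se_ms:.3f}, t = {t_ms:.3f}")
    print(f"per s:  estimate = {b_s:.3f}, SE = {se_s:.3f}, t = {t_s:.3f}")
```

On this reading, the t and p values in the follow-up models are the quantities to interpret, and the rounded estimates carry little information on their own.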
To summarize, the main findings of this study are as follows:
  • Online sensitivity to verbal number agreement was evident among heritage speakers of Spanish, in the later measure of total dwell time at the critical region and in both early and late measures (gaze duration and regression path time) at the spillover word.
  • For variability within individuals, verb frequency appeared to play a role in the processing of verbal agreement, as the agreement effects were more immediate and robust with the high frequency verb stimuli than with the low frequency verb stimuli. Specifically, the high frequency verb stimuli showed the agreement effect in three of the six eye movement analyses (total dwell time at the critical verb, gaze duration at the spillover word, and regression path time at the spillover word), but the low frequency verb stimuli showed the effect in only one of the six analyses (gaze duration at the spillover word).
  • For variability within the group, a second analysis of the eye movement data with individual language background variables suggested that more skilled reading, as measured by self-rating of reading ability in Spanish and average reading speed during the eye tracking experiment, is associated with a reduced verb frequency effect, as well as a reduced role for verb frequency in the processing of verbal agreement.

4. Discussion

The first research question guiding this study sought to determine whether heritage speakers integrated non-local verbal number agreement during the processing of Spanish sentences, as measured by eye tracking. Previous research on heritage bilinguals using real-time methods like self-paced reading (Foote, 2011) and eye tracking (Sagarra & Rodriguez, 2022) had shown online sensitivity to local verbal agreement when the subject and verb were immediately adjacent to each other in the stimulus sentences. There was also some evidence that agreement over distance, when the subject and verb are separated by several words, can show reduced effects because it is more difficult to process (Foote, 2011). Consistent with prior studies of the processing of local verbal agreement among heritage speakers (Di Pisa et al., 2024; Foote, 2011; Rodríguez & Reglero, 2015; Sagarra & Rodriguez, 2022), the results of the present study suggest that heritage speakers are also sensitive to non-local verbal agreement, even though the subject and verb in the stimuli for the present study were separated by four words and this incurred a greater processing burden. Eye movement data indicated agreement effects that were both immediate (on the critical verb) and sustained (carried over to the spillover word), which suggests that there was no categorical difficulty processing non-local verbal agreement. Still, there was also evidence of variability both within participants (according to verb frequency) and between participants (according to individual language background variables), which was as predicted and will be discussed below.
The second research question guiding this study concerned the role of verb frequency in the processing of verbal agreement among heritage speakers of Spanish, as an instance of within-speaker variability. Based on several prior studies using offline methods (Giancaspro, 2020; Hur, 2020; Hur et al., 2020; López Otero, 2022, 2023; Perez-Cortes, 2022), the expectation was that there could be some type of delay or reduction in the effect of verbal agreement with low frequency verbs as compared to high frequency verbs. Indeed, the eye movement data suggested that verb frequency played an important role. For one, the agreement effect was more immediate with the high frequency verb stimuli, which showed the effect at the critical verb, and relatively delayed with the low frequency verb stimuli, which did not show the effect until the spillover word. Additionally, the agreement effect was more robust with the high frequency verb stimuli, which showed the effect in three of the six eye movement analyses (total dwell time at the critical verb, gaze duration at the spillover word, and regression path time at the spillover word), whereas the low frequency verb stimuli showed the effect in only one of the six analyses (gaze duration at the spillover word).
The findings of the present study thus contribute to a growing body of empirical evidence in support of theoretical discussions in recent years that have posited that lexical frequency can be an important factor in the morphosyntax of heritage bilinguals due to reduced exposure to the minority language (O’Grady et al., 2011; Perez-Cortes & Giancaspro, 2022; Putnam & Sánchez, 2013). What is more, the current investigation offers a novel contribution to this body of work in the form of evidence from a real-time measure of sentence processing. Six previous empirical studies examined the role of word frequency in the morphosyntax of heritage speakers and consistently observed higher accuracy and lower variability with high frequency words than with low frequency ones (Giancaspro, 2020; Hur, 2020; Hur et al., 2020; López Otero, 2022, 2023; Perez-Cortes, 2022), but all of these studies employed offline measures of comprehension and production. Prior findings were also less clear with regard to the question of whether word frequency primarily affects morphosyntactic production rather than comprehension (Putnam & Sánchez, 2013). Of five studies designed to examine the issue, two found that word frequency only affected the production of morphosyntax (López Otero, 2022, 2023), while three concluded that frequency also played a role in comprehension (Giancaspro, 2020; Hur et al., 2020; Perez-Cortes, 2022). The present study, with eye tracking as a more precise and time-sensitive measure of language comprehension, suggests that word frequency does indeed affect the comprehension of morphosyntax among heritage speakers on a moment-by-moment basis. 
It therefore seems likely that, in prior studies that suggested that heritage bilinguals do not experience difficulty comprehending morphosyntax with low frequency words (López Otero, 2022, 2023), untimed measures of receptive skills like acceptability judgments allowed enough time for participants to recover from any initial difficulty before providing experimental responses.
The current findings regarding the role of verb frequency in the morphosyntactic knowledge of heritage speakers have broader implications for research on their grammatical knowledge. Specifically, the outcome of the present study demonstrates the potential for experimental outcomes to be misleading. If, for example, an investigation of verbal agreement similar to the present study included stimuli with many verbs of relatively low frequency, the results might indicate a lack of online sensitivity to agreement, leading researchers to erroneously conclude that this is an area of morphosyntactic difficulty among heritage bilinguals, even though the difficulty is actually lexical. Such a scenario might arise more easily with heritage speakers than with other populations, as they tend to have a less developed vocabulary in the heritage language (e.g., Kubota & Rothman, 2025), although the current findings do not speak to between-group differences directly because only heritage speakers were tested. One way to avoid this problem is to develop experimental materials with high frequency words, especially for critical words and phrases that constitute the target form or are closely related to it. This practice is likely already followed by many researchers, even if they are not always consciously aware of it or do not typically address it in published descriptions of experimental materials. Another approach is to manipulate word frequency in experimental materials to compare performance with both higher and lower frequency words, as in the present study and the prior work discussed above (Giancaspro, 2020; Hur, 2020; Hur et al., 2020; López Otero, 2022, 2023; Perez-Cortes, 2022). 
This second approach seems especially important for future research using online methods like self-paced reading and eye tracking, given that the novel findings of the present study are in need of replication and examination across a range of contexts to establish whether the findings are generalizable.
The third and final research question guiding this study concerned the role of individual language background variables in the processing of verbal agreement among heritage speakers of Spanish, as an instance of between-speaker variability. The purpose of this aspect of the study was to connect with previous research on individual differences among heritage speakers, which has primarily taken this perspective, rather than the within-speaker perspective of the word frequency analysis discussed above under the second research question. Of the four background variables that were examined, the one that was most often included in prior work was proficiency in the heritage language. The results of the present study suggest that proficiency did not play a role, which is consistent with the two prior studies of the processing of verbal agreement among heritage speakers that had investigated the role of proficiency and observed a similar lack of effects (Di Pisa et al., 2024; Sagarra & Rodriguez, 2022). While this does suggest that proficiency may not be important for some groups of heritage speakers, particularly in the processing of verbal agreement, it should not be taken as evidence against the importance of proficiency in any absolute sense, as there is evidence from other empirical studies that proficiency can play a role in other aspects of sentence processing (e.g., Bice & Kroll, 2021; Shin, 2024).
A second background variable that did not appear to play an important role in this study was age of acquisition of the societal majority language, English. One prior study on the processing of verbal number agreement among heritage speakers had included a broad level comparison of age of onset of bilingualism by comparing a group of heritage speakers to a group of L2 learners, but no differences emerged (Foote, 2011). The outcome of the present study is therefore consistent with those findings, although the design is different. On the other hand, another study of the processing of gender agreement among heritage speakers conducted a more fine-grained analysis of age differences, which revealed subtle differences in the timing of gender agreement effects according to whether heritage speakers had acquired the minority and majority languages in a simultaneous or sequential manner (Keating, 2022), which is a classification based on the age of acquisition of English. There are many differences between the present study and Keating’s (2022) that might account for the different findings, but one that might be especially relevant is that the range of values for age of acquisition of English was larger in the prior study, which included heritage speakers with ages of acquisition of up to 10 years and also had a diverse distribution of participants across the whole range. The participants in the present study were more typical heritage speakers, with all but 6 of 50 having been exposed to English by age 5, so the range of values was more restricted.
Beyond the lack of effects of heritage language proficiency and age of onset of the majority language, the two remaining background variables that did show relevant effects in this study were self-rated reading ability in Spanish and average reading speed during the eye-tracking experiment. First, higher self-ratings for reading in Spanish were associated with smaller overall frequency effects, indicating less of a slowdown on low frequency verbs among these participants. There was no interaction with agreement, so reading skill did not appear to be directly relevant to the processing of morphosyntax (but cf. Jegerski & Keating, 2023), but the finding is nevertheless broadly consistent with the proposal that lexical frequency can have a greater impact among bilinguals who have had less experience with the target language (Gollan et al., 2008; O’Grady et al., 2011; Perez-Cortes & Giancaspro, 2022; Putnam & Sánchez, 2013). Finally, the last individual language background variable examined in this study was individual reading speed (averaged across all words in all sentences read during the eye tracking experiment). In this case, the processing of agreement was involved. Specifically, faster readers tended toward an agreement effect with low frequency verb stimuli, although the main statistical analysis showed an overall agreement effect only with the high frequency verb stimuli at this point (i.e., total dwell time on the critical verb). It therefore appears that the most efficient readers were less susceptible to the delay caused by the need to access and integrate low frequency verbs in the processing of verbal agreement. Reading speed has not often been considered in previous studies of the sentence processing of heritage speakers, but at least one prior study also found it to be a predictor of their online processing of morphosyntax (Jegerski & Keating, 2023). 
Relatedly, written language experience has also proved important in at least one prior study (Karaca et al., 2024). Hence, the findings of the present study regarding individual language background variables suggest that more skilled reading can be less impacted by lexical frequency effects.
The present outcome has broader implications for heritage language theory, and we turn now to a more detailed discussion of some key points. One of these is the potential for production to be more susceptible to variability than comprehension, which is a core claim of Putnam and Sánchez (2013). The results of the present study show that variability related to lexical frequency is not exclusive to production, but they also leave open the possibility that production could be even more affected than comprehension because production was not examined in this study. A second point pertains to the mechanism behind the observed limitations in the processing of agreement with low frequency verbs and the question of whether such limitations are merely due to the cognitive demands of language processing in real time or if they also point to differences at the level of linguistic representation in memory (Montrul, 2021). The design of the present study is such that linguistic representation seems to be implicated in the findings in addition to cognitive demands. Eye movements are a measure of language comprehension in real time, so the data from this study clearly reflect language processing under cognitively demanding conditions. However, the observed differences in agreement processing according to verb frequency also suggest differences in the lexical representations for the verbs because that is where the effects of differential exposure to high frequency and low frequency verbs would accumulate over time. Thus, it seems that a delay in lexical access with a low frequency verb due to a less developed representation can leave less time for the subsequent computation of verbal agreement (consistent with the proposal of Hopp (2018), for L2 sentence processing). An important implication of this interpretation is that a given instance of variability among heritage speakers that appears to be morphosyntactic in nature can sometimes be traced to the lexicon instead.
Another important question is whether the observed effects of word frequency in the processing of verbal agreement are unique to heritage speakers. As no other populations were tested in the present study, the results cannot speak directly to this question. Turning to previous research, four studies using offline methods and different target forms included comparison groups of Spanish-dominant bilinguals and all four found that only the heritage speakers were affected by lexical frequency (Giancaspro, 2020; Hur, 2020; López Otero, 2022; Perez-Cortes, 2022). For online methods, we are aware of only three published studies that have examined the impact of word frequency on any aspect of sentence processing with any population, and the participant groups in those were either monolingual English speakers (Staub, 2011; Tily et al., 2010) or late L2 learners of English with a comparison group of monolinguals (Hopp, 2016). One of these found that word frequency affected higher-level syntactic processing among monolinguals (Tily et al., 2010) and the other two found that it did not (Hopp, 2016; Staub, 2011). One key factor in the different outcomes appears to be the specific range of frequency for the lower frequency lexical items used in experimental stimuli, meaning that monolinguals can be slower with lexical items that are of very low frequency (Hopp, 2016). In another approach to the question of whether frequency effects in morphosyntactic processing are unique to heritage speakers, the analysis of individual language background variables in the present study similarly suggests that word frequency effects in agreement processing can be a matter of degree, as online agreement effects with low frequency verb stimuli increased with faster reading speed, which suggests more proficient reading.
Hence, it seems quite plausible that monolinguals and target language dominant bilinguals could exhibit word frequency effects similar to those observed in the present study among heritage speakers, but such effects may only appear under circumstances that mimic those of heritage language bilingualism, such as slow reading speed (potentially indicative of underdeveloped literacy skills) or very low frequency words (indicative of reduced exposure and underdeveloped lexical representations). Further research is needed to investigate these particular questions and also to replicate the main finding of the present study, that lexical frequency can play a role in the online processing of morphosyntax among heritage speakers.

5. Conclusions

The present eye tracking study of heritage speakers of Spanish found that online sensitivity to non-local verbal number agreement was reduced with low frequency verb stimuli relative to stimuli with high frequency verbs. This outcome supports the theoretical proposal by scholars working on heritage languages that frequency effects can impact the morphosyntax of heritage bilinguals due to reduced exposure to the minority language (O’Grady et al., 2011; Perez-Cortes & Giancaspro, 2022; Putnam & Sánchez, 2013), although further research is needed to determine to what degree such effects are unique to heritage speakers. The present results represent initial evidence from a real-time measure of sentence processing, which demonstrates that frequency affects language comprehension as well as production, even if the effects are not always evident in offline measures like untimed acceptability judgments (López Otero, 2022, 2023). Furthermore, our findings suggest that skilled reading among heritage speakers can also be important in the processing of morphosyntax, as it has the potential to partially mitigate the limiting effect of low lexical frequency.

Author Contributions

Conceptualization, J.J.; methodology, J.J.; investigation, J.J.; writing—original draft preparation, J.J. and S.F.C.; writing—review and editing, J.J. and S.F.C.; supervision, J.J.; project administration, J.J. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

The study was conducted in accordance with the Declaration of Helsinki, and approved by the Institutional Review Board of the University of Illinois Urbana-Champaign (protocol code IRB24-0079, approved on 16 February 2024).

Informed Consent Statement

Informed consent was obtained from all the subjects involved in the study.

Data Availability Statement

The data presented in this study may be made available on request from the corresponding authors. The data are not publicly available to accord with the informed consent guidelines provided to the participants.

Acknowledgments

We are very grateful to research assistants Yvette Bandín and Danny Meléndez for their help with data collection and for the resources in the Second Language Acquisition and Bilingualism (SLAB) Laboratory at UIUC, directed by Silvina Montrul. This work was partially supported by a UIUC Conrad Humanities Scholar Award to J.J.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
L1: First language
L2: Second language

Appendix A. Experimental Stimuli

  • High Frequency Verb
    La nota/*Las notas que escribió el chef indicó el problema con la comida.
    El evento/*Los eventos que describió mi abuelo ocurrió en Canadá hace treinta años.
    La camiseta/*Las camisetas que diseñó la muchacha mostró la creatividad de los jóvenes.
    La máquina/*Las máquinas que compró la italiana preparó dos capuchinos en tres minutos.
    La lámpara/*Las lámparas que compró mi padre usó mucha energía por la noche.
    El artículo/*Los artículos que publicó el periódico consideró la perspectiva de los inmigrantes.
    El libro/*Los libros que escribió la enfermera explicó el problema de la obesidad.
    La demanda/*Las demandas que presentó el abogado logró el objetivo de sus clientes.
    La tienda/*Las tiendas que visitó mi hermana abrió esta mañana a las nueve.
    La torta/*Las tortas que hizo mi abuela ganó el concurso en el festival.
    El programa/*Los programas que vio la familia trató el tema de la discriminación.
    La medicina/*Las medicinas que inventó el médico recibió mucha atención de la prensa.
    El avión/*Los aviones que mandó la comisión llevó las provisiones a los refugiados.
    El carro/*Los carros que arregló el mecánico entró al garaje hace diez minutos.
    El proyecto/*Los proyectos que realizó el municipio tomó dos meses en el verano.
    El paquete/*Los paquetes que pidió la secretaria llegó esta tarde a las cinco.
    La pastilla/*Las pastillas que recomendó el doctor resultó más peligrosa que la enfermedad.
    El libro/*Los libros que compró el estudiante costó diez dólares en la librería.
    La pregunta/*Las preguntas que hizo la maestra comenzó el debate entre los alumnos.
    El ejercicio/*Los ejercicios que hizo el beisbolista desarrolló los músculos de su mano.
    La discoteca/*Las discotecas que visitó el grupo cerró esta mañana a las cinco.
    La exposición/*Las exposiciones que tuvo el museo presentó el arte de la China.
    El edificio/*Los edificios que construyó el arquitecto empezó la renovación de la zona.
    La estrategia/*Las estrategias que usó el candidato alcanzó el éxito en las elecciones.
    La foto/*Las fotos que sacó el hombre cambió el color de las flores.
    El premio/*Los premios que ganó la estudiante pagó la matrícula de la universidad.
    El cuento/*Los cuentos que presentó el profesor contó la vida de la autora.
    El negocio/*Los negocios que hizo el actor perdió mucho dinero en el pasado.
    La puerta/*Las puertas que instaló el carpintero evitó el frío durante el invierno.
    El camión/*Los camiones que contrató el gerente dejó las cajas en la fábrica.
    El grupo/*Los grupos que formó el padre pasó seis horas en la iglesia.
    La historia/*Las historias que publicó la revista habló de mujeres con trabajos importantes.
  • Low Frequency Verb
    El concierto/*Los conciertos que dio la cantante abarrotó el estadio de la universidad.
    La fiesta/*Las fiestas que celebró el pueblo deparó muchas actividades a los niños.
    La piedra/*Las piedras que tiró el niño rascó el coche de la vecina.
    La colonia/*Las colonias que fundó el conquistador coartó la libertad de los indígenas.
    El viento/*Los vientos que causó el huracán tumbó un árbol en el parque.
    El herbicida/*Las herbicidas que usó el jardinero estropeó las rosas en el jardín.
    El movimiento/*Los movimientos que hizo el gato volcó la leche en el suelo.
    El método/*Los métodos que utilizó el chef coció el pescado con el limón.
    El cambio/*Los cambios que realizó el jefe propició el conflicto con los empleados.
    La novela/*Las novelas que escribió el colombiano engendró un movimiento en la literatura.
    El instrumento/*Los instrumentos que empleó el geólogo cavó un hoyo en la tierra.
    El comunicado/*Los comunicados que publicó la empresa constató la hipótesis de los economistas.
    La tarjeta/*Las tarjetas que reveló la psíquica vislumbró el futuro de su cliente.
    La cafetera/*Las cafeteras que usó la cocinera hirvió el agua en tres minutos.
    La silla/*Las sillas que empujó la estudiante vertió el café sobre los papeles.
    La ola/*Las olas que causó el terremoto reventó el muro de la ciudad.
    El carrito/*Los carritos que empujó el niño abolló el vehículo de la policía.
    La tormenta/*Las tormentas que describió el reportero bordeó el sur de la Florida.
    El poema/*Los poemas que leyó la estudiante clausuró la ceremonia de la graduación.
    La subida/*Las subidas que experimentó el mercado rebasó la predicción de los expertos.
    El incendio/*Los incendios que prendió un relámpago arrasó la comunidad en las colinas.
    El programa/*Los programas que descargó el especialista depuró la computadora de la directora.
    La droga/*Las drogas que tomó el comediante derivó en problemas de salud mental.
    La comisión/*Las comisiones que creó el ministro desentrañó la causa de la epidemia.
    El producto/*Los productos que compró la señora enredó el cabello de su hija.
    La ley/*Las leyes que aprobó el senado ciñó el presupuesto de las escuelas.
    La expedición/*Las expediciones que mandó la reina transitó el camino de las montañas.
    El aparato/*Los aparatos que instaló el granjero regó las plantas con poca agua.
    La ruta/*Las rutas que estableció el imperio abarcó la tierra de los aztecas.
    El curso/*Los cursos que ofreció la pintora reanudó el interés por el arte.
    El discurso/*Los discursos que dio el presidente entabló las negociaciones con el sur.
    La lección/*Las lecciones que presentó la maestra sustentó la participación de los alumnos.

References

  1. Baayen, R. H., & Milin, P. (2010). Analyzing Reaction Times. International Journal of Psychological Research, 3, 12–28. [Google Scholar] [CrossRef]
  2. Barr, D. J., Levy, R., Scheepers, C., & Tily, H. J. (2013). Random effects structure for confirmatory hypothesis testing: Keep it maximal. Journal of Memory and Language, 68, 255–278. [Google Scholar] [CrossRef]
  3. Bates, D., Maechler, M., Bolker, B., & Walker, S. (2015). Fitting linear mixed-effects models using lme4. Journal of Statistical Software, 67(1), 1–48. [Google Scholar] [CrossRef]
  4. Bayram, F., Kubota, M., & Pereira Soares, S. M. (2024). Editorial: The next phase in heritage language studies: Methodological considerations and advancements. Frontiers in Psychology, 15, 1392474. [Google Scholar] [CrossRef] [PubMed]
  5. Bice, K., & Kroll, J. F. (2021). Grammatical processing in two languages: How individual differences in language experience and cognitive abilities shape comprehension in heritage bilinguals. Journal of Neurolinguistics, 58, 100963. [Google Scholar] [CrossRef] [PubMed]
  6. Brooks, M. E., Kristensen, K., van Benthem, K. J., Magnusson, A., Berg, C. W., Nielsen, A., Skaug, H. J., Mächler, M., & Bolker, B. M. (2017). glmmTMB balances speed and flexibility among packages for zero-inflated generalized linear mixed modeling. The R Journal, 9(2), 378–400. [Google Scholar] [CrossRef]
  7. Bybee, J. (2007). Frequency of use and the organization of language. Oxford University Press. [Google Scholar]
  8. Cuetos, F., Glez-Nosti, M., Barbon, A., & Brysbaert, M. (2011). SUBTLEX-ESP: Spanish word frequencies based on film subtitles. Psicologica, 32, 133–143. [Google Scholar]
  9. Davies, M. (2006). A frequency dictionary of Spanish. Routledge. [Google Scholar]
  10. De Houwer, A. (2023). The danger of bilingual–monolingual comparisons in applied psycholinguistic research. Applied Psycholinguistics, 44(3), 343–357. [Google Scholar] [CrossRef]
  11. Di Pisa, G., Pereira Soares, S. M., Rothman, J., & Marinis, T. (2024). Being a heritage speaker matters: The role of markedness in subject-verb person agreement in Italian. Frontiers in Psychology, 15, 1321614. [Google Scholar] [CrossRef]
  12. Fernández Cuenca, S., & Jegerski, J. (2023). A role for verb regularity in the L2 processing of the Spanish subjunctive mood: Evidence from eye-tracking. Studies in Second Language Acquisition, 45(2), 318–347. [Google Scholar] [CrossRef]
  13. Foote, R. (2011). Integrated knowledge of agreement in early and late English–Spanish bilinguals. Applied Psycholinguistics, 32, 187–220.
  14. Fuller, J. M., & Leeman, J. (2020). Speaking Spanish in the US: The sociopolitics of language. Multilingual Matters.
  15. Giancaspro, D. (2020). Not in the mood: Frequency effects in heritage speakers’ knowledge of subjunctive mood. In B. Brehmer, & J. Treffers-Daller (Eds.), Lost in transmission: The role of attrition and input in heritage language development (pp. 72–97). John Benjamins.
  16. Godfroid, A. (2020). Eye tracking in second language acquisition and bilingualism: A research synthesis and methodological guide. Routledge.
  17. Goldin, M. (2022). Language activation in dual language schools: The development of subject-verb agreement in the English and Spanish of heritage speaker children. International Journal of Bilingual Education & Bilingualism, 25(8), 3046–3067.
  18. Gollan, T. H., Montoya, R. I., Cera, C., & Sandoval, T. C. (2008). More use almost always means a smaller frequency effect: Aging, bilingualism, and the weaker links hypothesis. Journal of Memory and Language, 58, 787–814.
  19. Hopp, H. (2016). The timing of lexical and syntactic processes in second language sentence comprehension. Applied Psycholinguistics, 37(5), 1253–1280.
  20. Hopp, H. (2018). The bilingual mental lexicon in L2 sentence processing. Second Language, 17, 5–27.
  21. Hur, E. (2020). Verbal lexical frequency and DOM in heritage speakers of Spanish. In A. Mardale, & S. Montrul (Eds.), The acquisition of differential object marking (pp. 207–235). John Benjamins.
  22. Hur, E., Lopez Otero, J. C., & Sanchez, L. (2020). Gender agreement and assignment in Spanish heritage speakers: Does frequency matter? Languages, 5(4), 48.
  23. Jaeger, T. F. (2008). Categorical data analysis: Away from ANOVAs (transformation or not) and towards logit mixed models. Journal of Memory and Language, 59, 434–446.
  24. Jegerski, J. (2018). Psycholinguistic perspectives on Spanish as a heritage language. In K. Potowski (Ed.), Routledge handbook of Spanish as a heritage/minority language (pp. 221–234). Routledge.
  25. Jegerski, J., & Keating, G. D. (2023). Using self-paced reading in research with heritage speakers: A role for reading skill in the online processing of Spanish verb argument specifications. Frontiers in Psychology, 14, 1056561.
  26. Johnson, V. E., de Villiers, J. G., & Seymour, H. N. (2005). Agreement without understanding? The case of third person singular /s/. First Language, 25(3), 317–330.
  27. Karaca, F., Brouwer, S., Unsworth, S., & Huettig, F. (2024). Morphosyntactic predictive processing in adult heritage speakers: Effects of cue availability and spoken and written language experience. Language, Cognition and Neuroscience, 39(1), 118–135.
  28. Keating, G. D. (2014). Eye-tracking with text. In J. Jegerski, & B. VanPatten (Eds.), Research methods in second language psycholinguistics (pp. 69–92). Routledge.
  29. Keating, G. D. (2022). The effect of age of onset of bilingualism on gender agreement processing in Spanish as a heritage language. Language Learning, 72(4), 1170–1208.
  30. Keating, G. D., & Jegerski, J. (2015). Experimental designs in sentence processing research: A methodological review and user’s guide. Studies in Second Language Acquisition, 37(1), 1–32.
  31. Kubota, M., & Rothman, J. (2025). Modeling individual differences in vocabulary development: A large-scale study on Japanese heritage speakers. Child Development, 96(1), 325–340.
  32. Kuznetsova, A., Brockhoff, P. B., & Christensen, R. H. B. (2014). lmerTest: Tests for random and fixed effects for linear mixed effect models (lmer objects of lme4 package). Available online: https://cran.r-project.org/web/packages/lmerTest/index.html (accessed on 20 August 2025).
  33. Larson-Hall, J. (2010). A guide to doing statistics in second language research using SPSS. Routledge.
  34. Lenth, R., Singmann, H., Love, J., Buerkner, P., & Herve, M. (2019). emmeans (Version 1.3.5.1): Estimated marginal means, aka least-squares means. Available online: https://cran.r-project.org/web/packages/emmeans/index.html (accessed on 20 August 2025).
  35. Lim, J. H., & Christianson, K. (2015). Second language sensitivity to agreement errors: Evidence from eye movements during comprehension and translation. Applied Psycholinguistics, 36(6), 1283–1315.
  36. López Otero, J. C. (2022). Lexical frequency effects on the acquisition of syntactic properties in heritage Spanish: A study on unaccusative and unergative predicates. Heritage Language Journal, 19(1), 1–37.
  37. López Otero, J. C. (2023). Imperatives in heritage Spanish: Lexical access and lexical frequency effects. Languages, 8(3), 218.
  38. Luque, A., Rossi, E., Kubota, M., Nakamura, M., Rosales, C., López-Rojas, C., Rodina, Y., & Rothman, J. (2023). Morphological transparency and markedness matter in heritage speaker gender processing: An EEG study. Frontiers in Psychology, 14, 1114464.
  39. Miller, K., & Schmitt, C. (2014). Spanish-speaking children’s use of verbal inflection in comprehension. Lingua, 144, 40–57.
  40. Montrul, S. A. (2008). Incomplete acquisition in bilingualism: Re-examining the age factor. John Benjamins.
  41. Montrul, S. A. (2016). The acquisition of heritage languages. Cambridge University Press.
  42. Montrul, S. A. (2021). Representational and computational changes in heritage language grammars. Heritage Language Journal, 18(2), 1–30.
  43. Montrul, S. A., Davidson, J., De La Fuente, I., & Foote, R. (2014). Early language experience facilitates the processing of gender agreement in Spanish heritage speakers. Bilingualism: Language and Cognition, 17(1), 118–138.
  44. Montrul, S. A., & Slabakova, R. (2003). Competence similarities between native and near-native speakers: An investigation of the preterite/imperfect contrast in Spanish. Studies in Second Language Acquisition, 25(3), 351–398.
  45. O’Grady, W., Kwak, H.-Y., Lee, O.-S., & Lee, M. (2011). An emergentist perspective on heritage language acquisition. Studies in Second Language Acquisition, 33(2), 223–245.
  46. Perez-Cortes, S. (2022). Lexical frequency and morphological regularity as sources of heritage speaker variability in the acquisition of mood. Second Language Research, 38, 149–171.
  47. Perez-Cortes, S., & Giancaspro, D. (2022). (In)frequently asked questions: On types of frequency and their role(s) in heritage language variability. Frontiers in Psychology, 13, 1002978.
  48. Polinsky, M., & Scontras, G. (2020). Understanding heritage languages. Bilingualism: Language and Cognition, 23(1), 4–20.
  49. Putnam, M., & Sánchez, L. (2013). What’s so incomplete about incomplete acquisition? A prolegomenon to modeling heritage language grammars. Linguistic Approaches to Bilingualism, 3, 478–508.
  50. Rayner, K., & McConkie, G. W. (1976). What guides a reader’s eye movements? Vision Research, 16(8), 829–837.
  51. R Core Team. (2021). R: A language and environment for statistical computing. R Foundation for Statistical Computing. Available online: https://www.R-project.org/ (accessed on 20 August 2025).
  52. Rodríguez, E., & Reglero, L. (2015). Heritage and L2 processing of person and number features: Evidence from Spanish subject-verb agreement. EuroAmerican Journal of Applied Linguistics and Languages, 2(2), 11–30.
  53. Rothman, J., Bayram, F., DeLuca, V., Di Pisa, G., Duñabeitia, J. A., Gharibi, K., Hao, J., Kolb, N., Kubota, M., Kupisch, T., Laméris, T., Luque, A., Van Osch, B., Pereira Soares, S. M., Prystauka, Y., Tat, D., Tomić, A., Voits, T., & Wulff, S. (2023). Monolingual comparative normativity in bilingualism research is out of “control”: Arguments and alternatives. Applied Psycholinguistics, 44(3), 316–329.
  54. Sagarra, N., & Rodriguez, H. (2022). Subject–verb number agreement in bilingual processing: (Lack of) Age of acquisition and proficiency effects. Languages, 7(1), 15.
  55. Shin, G. H. (2024). Good-enough processing, home language proficiency, cognitive skills, and task effects for Korean heritage speakers’ sentence comprehension. Frontiers in Psychology, 15, 1382668.
  56. Solon, M., Park, H. I., Dehghan-Chaleshtori, M., Carver, C., & Long, A. Y. (2022). Exploring an elicited imitation task as a measure of heritage language proficiency. Studies in Second Language Acquisition, 44(4), 1095–1123.
  57. SR Research. (2005). Eyelink 1000 [Apparatus and software]. Available online: https://www.sr-research.com/eyelink-1000-plus/ (accessed on 20 August 2025).
  58. Staub, A. (2011). Word recognition and syntactic attachment in reading: Evidence for a staged architecture. Journal of Experimental Psychology: General, 140(3), 407–433.
  59. Tily, H., Fedorenko, E., & Gibson, E. (2010). The time-course of lexical and structural processes in sentence comprehension. Quarterly Journal of Experimental Psychology, 63(5), 910–927.
Table 1. Language background information (n = 50 heritage speakers).¹

| | M | SD | Range |
|---|---|---|---|
| Age | 20.54 | 1.58 | 18–28 |
| Age of Acquisition | | | |
| English | 3.81 | 2.78 | 0–7 |
| Spanish | 0.80 | 1.55 | 0–4.5 |
| DELE Score | 39.60 | 5.67 | 21–48 |
| Self-ratings: English | | | |
| Understanding | 9.54 | 0.76 | 7–10 |
| Speaking | 9.40 | 0.90 | 6–10 |
| Reading | 9.58 | 0.79 | 7–10 |
| Self-ratings: Spanish | | | |
| Understanding | 8.64 | 1.16 | 6–10 |
| Speaking | 7.98 | 1.26 | 5–10 |
| Reading | 8.10 | 1.53 | 4–10 |

¹ The maximum score was 50 for the DELE and 10 for self-rated language skills.

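Table 2. Trimmed eye movement measures.

| Eye Movement Measure | Stimulus Word | Agreement | High Frequency Verb M | High Frequency Verb SD | Low Frequency Verb M | Low Frequency Verb SD |
|---|---|---|---|---|---|---|
| Gaze Duration (milliseconds) | Critical Verb | Grammatical | 370 | 179 | 540 | 359 |
| | | Ungrammatical | 373 | 177 | 519 | 311 |
| | Verb + 1 | Grammatical | 268 | 135 | 256 | 116 |
| | | Ungrammatical | 265 | 141 | 239 | 108 |
| | Verb + 2 | Grammatical | 365 | 221 | 399 | 263 |
| | | Ungrammatical | 386 | 252 | 416 | 241 |
| Total Dwell Time (milliseconds) | Critical Verb | Grammatical | 520 | 291 | 924 | 558 |
| | | Ungrammatical | 658 | 354 | 920 | 496 |
| | Verb + 1 | Grammatical | 351 | 201 | 358 | 213 |
| | | Ungrammatical | 389 | 293 | 334 | 207 |
| | Verb + 2 | Grammatical | 542 | 338 | 679 | 435 |
| | | Ungrammatical | 566 | 352 | 677 | 402 |
| Regression Path Time (milliseconds) | Critical Verb | Grammatical | 467 | 392 | 678 | 482 |
| | | Ungrammatical | 491 | 364 | 666 | 460 |
| | Verb + 1 | Grammatical | 359 | 317 | 400 | 369 |
| | | Ungrammatical | 382 | 398 | 438 | 451 |
| | Verb + 2 | Grammatical | 494 | 432 | 675 | 560 |
| | | Ungrammatical | 559 | 477 | 659 | 550 |
| Word Skipping (proportion of trials) | Critical Verb | Grammatical | .065 | .247 | .013 | .111 |
| | | Ungrammatical | .061 | .239 | .023 | .149 |
| | Verb + 1 | Grammatical | .545 | .499 | .694 | .461 |
| | | Ungrammatical | .576 | .495 | .693 | .462 |
| | Verb + 2 | Grammatical | .053 | .224 | .045 | .208 |
| | | Ungrammatical | .061 | .239 | .060 | .239 |
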
Table 3. Analysis of gaze duration: Output from linear mixed-effects models.

| | Estimate | SE | t | p |
|---|---|---|---|---|
| Critical Verb | | | | |
| Intercept | 5.954 | 0.035 | 167.887 | <.001 * |
| Agreement | 0.007 | 0.012 | 0.600 | .550 |
| Verb Frequency | 0.146 | 0.022 | 6.580 | <.001 * |
| Agreement × Verb Frequency | 0.011 | 0.012 | 0.921 | .361 |
| Verb + 2 | | | | |
| Intercept | 5.815 | 0.033 | 174.334 | <.001 * |
| Agreement | 0.033 | 0.012 | 2.632 | .009 * |
| Verb Frequency | 0.038 | 0.026 | 1.476 | .145 |
| Agreement × Verb Frequency | 0.003 | 0.012 | 0.267 | .790 |

* Effect significant at α = .05.

Table 4. Analysis of total dwell time: Output from linear mixed-effects models.

| | Estimate | SE | t | p |
|---|---|---|---|---|
| Critical Verb | | | | |
| Intercept | 6.451 | 0.041 | 159.228 | <.001 * |
| Agreement | 0.074 | 0.015 | 4.834 | <.001 * |
| Verb Frequency | 0.219 | 0.028 | 7.746 | <.001 * |
| Agreement × Verb Frequency | 0.053 | 0.013 | 3.967 | <.001 * |
| Verb + 2 | | | | |
| Intercept | 6.236 | 0.047 | 133.895 | <.001 * |
| Agreement | 0.021 | 0.016 | 1.330 | .192 |
| Verb Frequency | 0.106 | 0.036 | 2.950 | .004 * |
| Agreement × Verb Frequency | 0.006 | 0.014 | 0.454 | .652 |

* Effect significant at α = .05.

Table 5. Analysis of regression path time: Output from linear mixed-effects models.

| | Estimate | SE | t | p |
|---|---|---|---|---|
| Critical Verb | | | | |
| Intercept | 6.145 | 0.039 | 156.833 | <.001 * |
| Agreement | 0.009 | 0.015 | 0.602 | .550 |
| Verb Frequency | 0.163 | 0.027 | 5.971 | <.001 * |
| Agreement × Verb Frequency | 0.016 | 0.015 | 1.054 | .296 |
| Verb + 2 | | | | |
| Intercept | 6.154 | 0.046 | 134.260 | <.001 * |
| Agreement | 0.026 | 0.014 | 1.894 | .058 † |
| Verb Frequency | 0.111 | 0.035 | 3.129 | .003 * |
| Agreement × Verb Frequency | 0.025 | 0.014 | 1.826 | .068 † |

* Effect significant at α = .05. † Effect marginally significant at .05 < p < .10.

Table 6. Mean accuracy proportion for post-stimulus comprehension questions.

| Agreement | High Frequency Verb M | High Frequency Verb SD | Low Frequency Verb M | Low Frequency Verb SD |
|---|---|---|---|---|
| Grammatical | .949 | .221 | .872 | .334 |
| Ungrammatical | .956 | .206 | .877 | .329 |

Table 7. Analysis of comprehension accuracy: Output from logit mixed-effects models.

| | Estimate | SE | t | p |
|---|---|---|---|---|
| Intercept | 23.004 | 5.601 | 12.878 | <.001 * |
| Agreement | 0.867 | 0.134 | 0.919 | .358 |
| Verb Frequency | 1.682 | 0.319 | 2.742 | .006 * |
| Agreement × Verb Frequency | 0.974 | 0.115 | 0.227 | .820 |

* Effect significant at α = .05.

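
The estimates in Table 7 read most naturally as odds ratios rather than log-odds, although the table itself does not say so. One way to check this reading is to assume that each standard error is on the odds-ratio scale (the log-odds standard error multiplied by the odds ratio, as the delta method gives) and to recompute the test statistic as ln(OR) / (SE / OR). The Python sketch below does this for the four rows. The names `TABLE_7`, `wald_from_odds_ratio`, and `check_table` are illustrative, and the delta-method assumption is not taken from the authors' analysis scripts.

```python
import math

# Rows of Table 7 as (estimate, SE, reported test statistic).
# Assumption: estimates are odds ratios and SEs are on the odds-ratio scale.
TABLE_7 = {
    "Intercept": (23.004, 5.601, 12.878),
    "Agreement": (0.867, 0.134, 0.919),
    "Verb Frequency": (1.682, 0.319, 2.742),
    "Agreement x Verb Frequency": (0.974, 0.115, 0.227),
}


def wald_from_odds_ratio(odds_ratio, se_odds_ratio):
    """Return (log-odds, SE of log-odds, Wald statistic) under the delta method."""
    log_odds = math.log(odds_ratio)
    se_log_odds = se_odds_ratio / odds_ratio
    return log_odds, se_log_odds, log_odds / se_log_odds


def check_table(rows=TABLE_7):
    """Recompute the Wald statistic for each row and pair it with the reported value."""
    results = {}
    for term, (estimate, se, reported) in rows.items():
        log_odds, se_log_odds, stat = wald_from_odds_ratio(estimate, se)
        results[term] = {
            "log_odds": log_odds,
            "se_log_odds": se_log_odds,
            # The table lists the statistics without a sign, so compare magnitudes.
            "recomputed": abs(stat),
            "reported": reported,
        }
    return results


if __name__ == "__main__":
    for term, r in check_table().items():
        print(
            f"{term}: log-odds = {r['log_odds']:.3f}, "
            f"|z| = {r['recomputed']:.3f} (reported {r['reported']:.3f})"
        )
```

Under these assumptions the recomputed statistics fall within about 0.005 of the reported ones, which is what rounding of the published three-decimal values would allow. On this reading, the verb frequency estimate of 1.682 corresponds to roughly 0.52 on the log-odds scale.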