Acceptability of Different Psychological Verbal Constructions by Heritage Spanish Speakers from California

Abstract: This study set out to investigate whether US Heritage Spanish features a more streamlined verbal paradigm in psych verb constructions compared to standard varieties of Spanish, where HS speakers find an invariable third-person singular form acceptable with both singular and plural grammatical subjects. In standard Spanish, the semantic subjects of psych verbs are typically preverbal experiencers cast as oblique arguments in inverse predicates such as in me encantan los buhos ‘I love owls’. The translation of this sentence shows that equivalent English predicates are typically direct constructions. The data were gathered using an acceptability judgement questionnaire that was distributed to participants that fit into one of three groups: early bilingual heritage speakers of Spanish from California, advanced Spanish as L2 speakers, and non-bilingual native speakers of Spanish who had learnt English as an L2 as adults. The Heritage Spanish speakers in this group often patterned differently from both other groups, who surprisingly patterned together. We argue that this is due to L2 speakers’ mode of acquisition (formal and subject to prescriptive grammar), in comparison with Heritage Spanish speakers’ naturalistic acquisition. Specifically, we find evidence for a streamlining of the Spanish verbal paradigm not immediately attributed to English interference, and that in psych verb constructions, Heritage Spanish speakers more readily accept a third-person singular invariable verbal form. This differentiation of the verbal paradigm from standard Spanish use should be considered a bona fide linguistic change, but not proof of either incomplete acquisition or language attrition. Since Heritage Spanish speakers are, after all, native speakers of Spanish, this study shows that Heritage Spanish should be considered and studied as any other dialect of Spanish, with its distinctive grammatical features, and subject to variability and change.


Introduction
Interest in and research on Heritage Languages (HLs) have increased exponentially since the 1970s, when bilingual HL speakers first caught linguists' attention (Valdés 1975; Lambert 1977; Dvorak and Kirschner 1982). Early studies focused on descriptive and pedagogical aspects of teaching bilingual speakers, especially Spanish-English bilinguals in the US. Later work focused on the sociolinguistic characteristics of the population, and on the features of their Spanish insofar as they diverged from those of monolingual Spanish speakers as well as from those of second language learners (L2s).
The definition of heritage speakers that is usually cited in recent literature goes as follows:
A language qualifies as a heritage language if it is a language spoken at home or otherwise readily available to young children, and crucially this language is not a dominant language of the larger (national) society . . . [A]n individual qualifies as a heritage speaker if and only if he or she has some command of the heritage language acquired naturalistically . . . although it is equally expected that such competence will differ from that of native monolinguals of comparable age. (Rothman 2009, p. 156)
However, this definition misses an important aspect of HLs, namely that heritage speakers are not only early bilinguals, but also speakers of a minoritized language in a diglossic context. The language is mainly spoken at home, and typically not used as a written means of communication. In some cases, the immediate local community functions as an important supporting structure and a public space in which the HL is used (Liu et al. 2011; Wiley et al. 2014). For other contexts (academic, work environment) and modes (written communication), however, the majority language is used instead. Thus, the linguistic input to which heritage speakers are exposed is potentially more limited than that of monolingual native speakers. After all, the baseline language spoken by their caregivers and possibly by the immediate surrounding community is "a diasporic variety spoken by first-generation immigrants in their respective communities – as compared to the language spoken in the homeland" (Polinsky and Scontras 2020, p. 7). The lack of prestige and legal protection of Spanish as a minoritized language in the US is heightened by its association with the language of newly arrived immigrants from lower socioeconomic classes, with often limited education and English language proficiency.
Many first-generation bilingual children of immigrants report having had very negative experiences in the US school system, where they were singled out, belittled or even reprimanded because of their Spanish use with other Spanish-speaking children. Such treatment, of course, reveals the absurd double standard of praising bilingualism as a result of foreign language learning, while disparaging it as a result of acquiring a HL at home (cf. Wiley 2014;Flores and Rosa 2015). Therefore, despite these children's early exposure to both the majority language and the HL at home (simultaneous bilinguals), or to Spanish at home and English when they enter the school system (consecutive bilinguals), for many of them, the majority language will become their dominant language. Whether HL speakers use both languages with equal efficiency or not, all bilingual speakers can display phenomena of convergence between their two languages, especially if one of the two languages is dominant, as often occurs with heritage speakers (Toribio 2004).
Because of their speakers' early bilingualism, HLs became an interest of formal linguists as a petri dish to test theories of language acquisition. The research of by now classic authors working on HL (e.g., Montrul or Polinsky) has focused on showing precisely which areas of the heritage speakers' grammar differ from that of monolingual native speakers. These areas would then be those where the HL is more easily affected by the most widely used, societal language. Just like other minoritized language speakers in a diglossic context, in fact, Heritage Spanish (HS) speakers may unsurprisingly display a different, less diverse gamut of constructions and vocabulary, compared to monolingual speakers of a variety of Spanish with (co-)official status used as a means of communication in the school system. This raises the question of what baseline should actually be taken for comparison for HLs (Polinsky and Scontras 2020, p. 7; Benmamoun et al. 2013a, 2013b), given that even L2 teaching has long since abandoned the native-speaker-as-the-ultimate-goal-of-foreign-language-learning approach (Cook 1999; Juanggo 2017).
Unfortunately, it is often easy to slip into accepted modes of description of HLs that tacitly or overtly endorse what Bley-Vroman (1983) termed the "comparative fallacy", i.e., describing the heritage language in terms of a different entity, such as another variety of Spanish. Using standard Spanish as a yardstick to which HS is compared usually finds the latter lacking: as a result, we have made every effort to avoid linguistic terminology that, while commonly accepted even in academic writing, may belittle HS speakers' linguistic skills. Thus, we strive to avoid phrases such as "balanced bilingual", even if it is accepted, for instance, by the APA as "a person who has proficiency in two languages such that his or her skills in each language match those of a native speaker of the same age." (see dictionary.apa.org, s.v. balanced bilingual), since we are not suggesting that bilingual speakers are or should be two monolingual speakers rolled into one (as an outraged anonymous reviewer pointed out). Recent work on bilingualism, in fact, whether for spoken languages or cross-modal, clearly shows that early bilinguals will not behave in linguistically comparable ways to monolingual speakers of the same languages (Birdsong 2018; Birdsong and Quinto-Pozos 2018; Hartshorne et al. 2018). We, therefore, would not expect Californian HS speakers' early bilingual skills to be identical or even comparable to those of monolingual speakers of any other standard variety of Spanish spoken elsewhere as a national or official language. After all, as Toribio (2004, p. 165) observes: "For bilinguals, then, two language systems constitute their linguistic competence in a singular sense, and their linguistic performance reflects the contribution of the component languages independently or in tandem", as they move from interactions on a continuum spanning from a "monolingual mode" on one end with the psycholinguistic suppression of one of the two languages, to the "bilingual endpoint" on the other, where both are activated, with code-switching and the seamless alternation between the two languages typical of highly proficient bilinguals (ibid.).
We would like to suggest instead that HS in California should simply be considered as a dialect with its own distinctive features. After all, Spanish has a considerable historical depth in many parts of the US, given the country's colonial past (see Lamar Prieto 2018 for a detailed study) and cannot be considered just as an immigrant language, although it is that too, of course. HS in the US is in contact with English, the more widely spoken societal language. Thus, HS can often show convergence with English, where convergence can be defined as a series of optimization strategies available to bilingual speakers that decrease structural and lexical differences between their two languages (Muysken 2013;Toribio 2004). Convergence can introduce novel constructions in either language, or simply increase the use of an existing construction in one of the languages that is less commonly used in a corresponding monolingual variety. Contact is one of the commonly mentioned causes for language change (Bowern even states that language contact, at least at a superficial level, "is part of the definition of language change", Bowern 2013), especially when contact-induced variation is then followed by a restructuring of grammatical paradigms and by the acceptance and use by the HL community. Examples of restructuring for HS are cited in Toribio and Nye (2006); De Prada Pérez and Pascual y Cabo (2011), while linguistic features that are adopted by the community as in-group markers can be found in McGregor Villarreal (2014, examples from child language in HS), and Rickford et al. (2015) for AAVE. 
While it is perhaps unavoidable to draw strictly linguistic comparisons with other varieties of Spanish including standard ones, HS should therefore be mainly compared to itself, while considering that HS speakers not only differ in individual proficiency and attachment to Spanish language and culture, but also originate from, and interact with, Spanish-speaking communities in the US, whose Spanish may differ in terms of substrate variety, in historical depth in the region, and in how active and recent the flow of new monolingual Spanish immigrants may be. Thus, as we have argued elsewhere, it is important to provide fine-grained analyses of linguistic usage and not imply that what may be accurate for a specific domain in HS will necessarily translate into observations valid for all areas of HS.
We chose to analyze verbs of emotion (psych verbs) and related constructions for this study, which are common and productive in Spanish (Vázquez Rozas 2006; Vázquez and Miglio 2016). Our overarching research questions were as follows:
(1) Are psych verb constructions an area of grammar with more variability in Spanish?
(2) Do these constructions pose a learnability problem for HS speakers?
(3) Do these problems stem from the fact that English, the main societal language for HS speakers in the US, has direct constructions where Spanish equivalents are typically (but not solely) expressed by reverse/inverse constructions?
(4) If influence from English is the case, do these constructions cause problems for L2 learners of Spanish?
Ascertaining previously considered hypotheses on these constructions was relevant to address some of those questions, i.e., whether HS speakers in the US use a different verbal paradigm compared to standard varieties of Spanish: it was in fact observed that the 3rd person singular of the present tense could also double as the 3rd person plural (see, for instance, Toribio 2004). In 2011, De Prada Pérez and Pascual y Cabo investigated a possible change toward direct mapping of gustar-type verbs and the emergence of an invariable experiencer form le in Floridian HS. Reverse/inverse psychological predicates, such as gustar 'to like', prescriptively require the verb to agree with a typically postposed grammatical subject (semantically a theme). In pre-verbal position, however, the preferred location for the subject in Spanish, they feature an obliquely marked experiencer (EXP). De Prada Pérez and Pascual y Cabo (2011, p. 117) did not find evidence for an invariable le EXP or for direct mapping in HS but did find "evidence of simplification in verb agreement . . . towards invariable gusta". In the same paper, however, they did not find statistically significant differences among speaker groups (low-, intermediate-, and high-proficiency HS speakers). They did find more tolerance toward a singular gusta with plural grammatical subject than toward plural gustan with a singular grammatical subject. Thus, they argued, "the variability found in the native control data is the locus of interlanguage influence. While the nature of invariable gusta is left unanswered" (2011, p. 118).
However, they also suggest it could have been due to morphological simplification rather than being syntactic in nature, and that it was caused by the variability in (monolingual) native speakers' Spanish, which provides vulnerable areas upon which the majority language encroaches (cf. also Dvorak and Kirschner 1982;and Toribio and Nye 2006).
A third-person singular invariant form seemed to emerge in studies of Miglio and Gries (2019) but proved to be slightly more elusive. In a study of HS usage of the verb gustar 'to like', in fact, we found that participants were weakly, but significantly less prescriptively correct in assessing third-person plural gustan, and that a plural syntactic subject led to more incorrect judgments (Miglio and Gries 2015). That is, when the syntactic subject was plural and gustar singular, participants judged the standard Spanish sentence as grammatical, making the prescriptively incorrect decision more often (chi-squared = 29.69, df = 1, Cramer's V = 0.13) than when the syntactic subject was singular and gustar plural (pp. 426-27). All in all, however, those results were marginal. We found the same tendency in a later study, but no interaction with type of speaker (early bilingual vs. late bilingual w/English as L2 vs. late bilingual w/ Spanish as L2). Thus, the tolerance toward lack of agreement between third-person singular verb and plural subject seemed not to apply to HS speakers exclusively (Miglio and Gries 2019). This was the background that led us to gather more data and test for a correlation between speaker type and the existence of an invariant verbal form based on the third-person singular in HS. The current study uses a large data set in order to explore HS speakers' acceptance of invariant 3rd person singular verb forms. We gathered data from an extensive grammatical acceptability survey distributed to three groups of participants: late bilingual speakers of Spanish with English as their native language (L2s), early bilingual speakers of Spanish and English from California (HS speakers), and late bilingual speakers of English with Spanish as their native language (NSs).
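The effect size quoted above for the earlier gustar study can be related to its chi-squared value with the standard formula for Cramér's V, V = sqrt(chi2 / (N * min(r - 1, c - 1))). The sketch below is only a back-of-the-envelope illustration: it assumes that the reported test was run on a 2 x 2 contingency table (which is consistent with df = 1), the function names are hypothetical, and the number of observations it recovers is an inference from rounded figures, not a value reported in the studies cited.

```python
import math


def cramers_v(chi2, n, n_rows=2, n_cols=2):
    """Cramér's V for a chi-squared statistic computed on an r x c table."""
    k = min(n_rows - 1, n_cols - 1)
    return math.sqrt(chi2 / (n * k))


def implied_n(chi2, v, n_rows=2, n_cols=2):
    """Sample size implied by a given chi-squared value and Cramér's V."""
    k = min(n_rows - 1, n_cols - 1)
    return chi2 / (v ** 2 * k)


# Values quoted in the text: chi-squared = 29.69, df = 1, V = 0.13.
# Assumption: a 2 x 2 table, so min(r - 1, c - 1) = 1.
n_estimate = implied_n(29.69, 0.13)
print(round(n_estimate))
print(round(cramers_v(29.69, round(n_estimate)), 2))
```

Because V is reported to two decimals only, the implied number of judgments is approximate. What the formula makes visible is why a clearly significant chi-squared value can still correspond to a weak association: the effect size shrinks as the number of observations grows.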
In the remainder of the paper, we first describe Spanish psych verb constructions with relevant examples in Section 2. In Section 3, we lay out the methodology used to collect the data, the characteristics of the participants in the study, as well as the statistical methodology employed for data analysis, which is still rarely used in linguistic studies as we write. In Section 4, we then graph the results from the statistical analysis, while a discussion of their interpretation is provided in Section 5. Finally, in Section 6, we draw the general conclusions that can be gleaned from this study, as well as propose further developments of this topic, the methodology, and its applications.

Invariable Third-Person Present Tense Predicates in Heritage Spanish?
Both sentences in (1) and (2) below are prescriptively ungrammatical because of a mismatch in agreement between the grammatical subject and the verb. If an invariable verbal form for psychological predicates based on the third-person singular were acceptable to HS speakers, one would expect them to accept sentences such as (1) below (with third-person singular verb, but plural subject) more readily than (2) (with third-person plural verb, but singular subject):
In general, both HS speakers and L2 learners can find the acquisition and accurate processing of inflectional morphology difficult (Montrul 2008; Mikhaylova 2012; Bosch et al. 2019). The tendency increases proportionally with the age of acquisition of the L2 (Kluender 2017, 2018; Abutalebi and Clahsen 2018; Hartshorne et al. 2018), which supports some form of the critical period hypothesis (subsuming into our use of CPH both the meaning of critical and/or sensitive period as in Birdsong 2018 and references cited therein). Moreover, ever since Johnson and Newport's (1989) grammaticality judgements study, it has been argued that fully fledged acquisition and L1-like morphosyntactic use are unlikely for late learners (Wartenburger et al. 2003; White 2003; DeKeyser 2005, 2010; DeKeyser et al. 2010). This, however, discounts the possibility that factors other than age of acquisition may play a role in how morphosyntactically proficient L2 learners become, which has brought some researchers to call a strict CPH into question (for instance, Bialystok and Kroll 2018), or at least posit other factors as potentially affecting both bilingual and L2 language acquisition:
. . . it is possible that what appear to be AoA effects on grammar are in fact effects of non-age-related factors such as reduced general language skill in a late-learned L2, the possibility of L1 transfer, decreased exposure, less practice and use, etc. How these learning factors contribute to L1/L2 performance differences and how they can be distinguished from genuine AoA effects is still subject to controversy. (Bosch et al. 2019)
De Prada Pérez and Pascual y Cabo had already mentioned some of the potentially confounding factors named in the passage above. They wondered, for instance, what could lead to an invariable gusta form, when HS speakers' intuitions about the singular vs. plural IO clitic (le vs. les) were very clear. They suggested that those results may have been due to the type of regular input HS speakers receive, or to the possibility that even monolingual native speakers' grammar is more accepting of gusta with a plural grammatical subject than of gustan with a singular one. They finally also wondered how this change in progress should be classified and whether it was triggered by phonology, morphology, or syntax (2011, p. 118).
Some of these discrepancies in language usage between monolingual and bilingual HS speakers could be due to convergence between the HL grammar and the dominant societal language, which in turn may give rise to a less diverse morphological structure in HS as compared to standard Spanish (as suggested by Toribio 2004). Non-diversified verbal morphology would be consistent with studies that posited the existence of an invariable form based on the original third-person singular present tense of Spanish verbs used also with plural subjects (Toribio and Nye 2006; De Prada Pérez and Pascual y Cabo 2011). The observations and conclusions from those contributions and our previous research allowed us to devise a study that would test the relevant areas in both monolingual and bilingual Spanish grammar and usage. Against this background, we refine the above research questions as follows:
(1) Are psych verb constructions an area of grammar with more variability in Spanish? If there is more variability in Spanish psych verb constructions than in other areas of grammar, in the sense that here we would find considerable variation in acceptability judgment even among originally monolingual speakers of Spanish, data from Spanish speakers who learnt English as an L2 later in life would allow that hypothesis to be tested. It was also important to allow participants to express acceptability on a Likert scale (from "not at all acceptable" to "completely acceptable" in several steps) rather than in a binary "yes"/"no" form. Eliciting those judgments in a gradient form would also help to assess whether monolingual speakers of Spanish are indeed more tolerant toward ungrammaticality in this domain.
(2) Do these constructions pose a learnability problem for HS speakers? The results obtained from the non-bilingual native speakers of Spanish could be compared to the results of a HS bilingual group to address the differences in psych verb constructions in HS use. Such differences could emerge from eliciting additional acceptability judgments on prescriptively ungrammatical stimuli featuring an agreement mismatch between a plural grammatical subject and a third-person singular verb and vice versa (examples 1 and 2 above). That lack of agreement would help us to uncover whether psych verb constructions are a potential area of morphological variability in Spanish and whether HS speakers' usage is innovative in having the 3rd sg. verbal form be used for both 3rd sg. and pl. Evidence for an invariable verbal form in HS used for both singular and plural in psych verbs could be found if HS speakers preferred stimuli where the agreement mismatch featured singular verb + plural subject (*me encanta los buhos 'I love owls') to one with plural verb + singular subject (*me encantan el buho 'I love the owl').
Regarding research questions (3) "Do HS speakers accept more readily psych verb constructions that are ungrammatical in standard Spanish because of English?" and (4) "Does English also affect L2 learners of Spanish?", of course, extending the experiment to a third group of participants, anglophone speakers of Spanish as L2, addresses question (4). As for what part English plays for the different linguistic groups, this is more difficult to establish. One could expect HS and L2 speakers to pattern together, separately from monolingual native speakers, if English were the main factor contributing to this linguistic behavior.

Methods
In this section, we describe our methodology in recruiting participants to the study, their characteristics, and the instruments deployed for data collection.

Participants
A total of 131 participants took part in this study voluntarily. Of those, 55 were late bilingual Spanish speakers (L2s), 58 were early bilingual HS speakers (HSs), and 18 formed the control group described below. The L2s were native speakers of American English, who had studied Spanish language at university level for at least two years or had equivalent studies. They can be considered late bilinguals with Spanish as L2 since they were fluent enough to attend upper division content classes in Spanish at the time of their participation in the study.
The 58 HS speakers were early bilinguals, who were all born in the United States, had at least one parent of Mexican origin, and had started to learn English before 8 years of age. The majority had been through the school system in English after a mainstreaming period of one or two years. This type of information was collected in a pre-experimental questionnaire, where participants were asked about their age, native language(s), nationality of parents, whether Spanish had been learned in school and for how long, and whether it was spoken at home and with whom (with parents/caregivers, or among siblings).
The "control group" comprised 18 monolingual native speakers of Spanish from Mexico and Spain, who had learnt English in school during or after their 12th year of age with different degrees of success. These could be considered late bilinguals of English as an L2 and performed at ceiling, in terms of prescriptive, standard Spanish grammar, and we therefore did not consider it necessary to distinguish between different speaker dialects in this group. We have sometimes referred to this group as "NS/native Spanish speakers", which should be interpreted as short-hand to mean "monolingual NS speakers". It does not imply that bilingual speakers or Heritage Spanish speakers are not native speakers of Spanish.
All study participants were undergraduates at a campus of the University of California system, and most were either Spanish majors or minors, exclusively or jointly with other subjects. All were attending upper division courses of Spanish literature, history, or linguistics, where they were recruited. Both L2s and HS speakers attend the same classes together by the time they are admitted to upper division courses. This guaranteed that they all spoke English and Spanish at a level advanced enough to follow academic instruction and perform effectively in both Spanish and English. While we are aware that there may be considerable proficiency differences in both L2 and HS speakers in general, this was not the case with the participants in this experiment. We did not ask them to undergo a language proficiency test because for these students to attain access to upper division classes in the Spanish Department, they had to have passed the same language exams with a C or better, whether taking language instruction courses or "testing out" of them. The same was true for early bilingual speakers, who would either need to have taken a sequence of two academic Spanish courses specific for HS speakers or would have passed the equivalent language tests. Their status as Spanish Department upper division students, therefore, was used in lieu of a proficiency test. It should also be pointed out that students taking upper division classes at this institution's Spanish Department are accustomed to writing essays and projects in Spanish and to reading canonical literature in Spanish. Therefore, the HS speakers that participated in this experiment were not similar to those envisaged for instance by Rao and Ronquest (2015), i.e., heritage speakers of Spanish with less education that might have difficulties with a written questionnaire because their Spanish usage is predominantly oral. 
On the contrary, the participants in our study were majors and minors in Spanish and therefore had plenty of exposure to written, academic Spanish.
Participants were between 22 and 28 years old and came from similar cultural and linguistic backgrounds. The late bilinguals were native speakers of American English, born and raised in the US, who had learnt Spanish as L2. The Hispanic-American early bilinguals were children of immigrant parents, who had been born in the US and spoke Spanish as L1 at home and had learnt English at least before 8 years of age or were exposed to both Spanish and English from birth. We did not actively control for the participants' socioeconomic background in the pre-test questionnaire, but we statistically controlled for any participant variability, which would also control for potential variability of a socioeconomic type, which is therefore not expected to have a large and systematic impact on the participants' acceptability judgments.
Neither psychological predicates nor invariant verbal forms, i.e., the more specific topic explored in this paper, had been discussed in any of the classes where participants were recruited. None of the participants received any monetary compensation for their participation. All students present in the classroom on the day the questionnaire was distributed were given course credit, whether they filled it out and/or handed it in at the end of the class or not, using the names on the attendance sheet, since participation was voluntary and the questionnaires anonymous.

The Experimental Design of the Questionnaire
The actual linguistic experiment consisted of an acceptability judgment task with a total of 64 items, 42 of which were actual stimuli. All items elicited acceptability judgments on a Likert scale spanning from −3 to +3 (−3, −2, −1, 0, +1, +2, +3), where −3 equaled "Very strange", 0 "Intermediate", and +3 "Natural Spanish" (see one of the actual questionnaires in Appendix A). The remaining 22 items were fillers: 11 with and 11 without grammatical errors. All items were randomized differently in 16 different types of questionnaire and balanced for the presence or absence of the grammatical, semantic, and/or pragmatic factors that could influence the participants' judgments (see below). (Un)grammaticality/(un)acceptability in this instrument meant that there was a mismatch between the grammatical subject's number and verbal agreement, so that half of the stimuli could have a singular grammatical subject (semantically a theme) coupled with a verb conjugated in the plural or vice versa. The stimuli in the instrument were in Spanish; however, the biographical data questionnaire and the written instructions were given in English. Highly proficient bilinguals such as these participants are accustomed to switching between their two languages, so while some priming effects cannot be excluded, the number of judgments collected, the randomization of the stimuli in different questionnaires, and the fine-grained statistical analysis, as well as the high number of participants would have at least limited the effect of interlinguistic priming. Moreover, there were no examples in the English text relevant for psych verbs, and the stability of acceptability judgments is well-established (Cowart 1997; Sprouse and Almeida 2017).
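The layout of the instrument described above (64 items, a seven-point scale, 16 differently ordered questionnaires) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reconstruction of how the questionnaires were actually produced: the item identifiers, the function names, and the use of a seeded shuffle are all hypothetical, and the balancing of grammatical, semantic, and pragmatic factors across stimuli is not modelled.

```python
import random

# Rating scale described in the text: integers from -3 to +3.
LIKERT_SCALE = list(range(-3, 4))
LIKERT_LABELS = {-3: "Very strange", 0: "Intermediate", 3: "Natural Spanish"}


def build_items(n_stimuli=42, n_fillers=22):
    """Return the stimuli plus the fillers (half of the fillers contain an error)."""
    # Hypothetical identifiers: S01..S42 for stimuli, F01..F22 for fillers.
    items = [{"id": f"S{i:02d}", "kind": "stimulus"} for i in range(1, n_stimuli + 1)]
    for i in range(1, n_fillers + 1):
        items.append({"id": f"F{i:02d}", "kind": "filler", "has_error": i <= n_fillers // 2})
    return items


def build_versions(items, n_versions=16, seed=0):
    """Shuffle the same item pool independently for each questionnaire version."""
    rng = random.Random(seed)  # a fixed seed is assumed here only for reproducibility
    versions = []
    for _ in range(n_versions):
        order = list(items)
        rng.shuffle(order)
        versions.append(order)
    return versions


items = build_items()
versions = build_versions(items)
print(len(items), len(versions))  # prints "64 16"
```

Every version contains the same 64 items and differs only in presentation order, which is the property that limits order and priming effects when judgments are pooled across questionnaires.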
The questionnaire was provided to the students by their instructor on paper in their usual classroom, in this case a large lecture hall. Instructions on how to fill out the questionnaire (including the biographical data and consent form) were repeated orally in Spanish, the language of instruction, before they were allowed to complete the questionnaire. The test was not timed.
We wanted to test both psych verbs and verbal constructions with reverse predicates, not just gustar 'to like' as in Miglio and Gries (2015) or De Prada Pérez and Pascual y Cabo (2011). Thus, we used the inverse constructions from Miglio and Gries (2019): agradar 'to like', caer bien/mal 'to like/dislike (person)', desalentar 'to disappoint', disgustar 'to bother, displease, disgust', doler 'to hurt, to bother', encantar 'to like a lot', faltar 'to lack', fascinar 'to fascinate, like a lot', interesar 'to interest', llamar la atención 'to intrigue, to interest', molestar 'to annoy', preocupar 'to worry', resultar imposible/difícil 'to turn out to be impossible/difficult', sobrar 'to have (something) in excess', sorprender 'to surprise', both in grammatical and ungrammatical stimuli. Since, admittedly, some of these verbs may have been novel or unusual for both HS speakers and L2 learners, we added a box before the actual questionnaire stimuli, in which all relevant verbal constructions were translated with English equivalents, and in the case of doler 'to hurt', even the morphologically irregular forms duele(n) were provided. Students could then refer to this box as a kind of glossary during the test if they needed to look up the meaning of a construction.
The grammatical, semantic, and pragmatic factors that were manipulated in order to obtain the relevant stimuli and that became, then, part of the statistical analysis involved the following variables: We now discuss each of these variables. We begin with the main predictors of interest and turn then to the control variables.

The Main Predictor 1: AGREEMENT
In standard Spanish, and therefore in prescriptive grammar, agreement between the grammatical subject (los buhos 'owls') and verbal morphological markers (gusta(n), 3rd sg./pl.) produces grammatical stimuli (le gustan los buhos 's/he likes owls', AGREEMENT: yes); lack of agreement produces ungrammatical ones (*le gusta los buhos 's/he likes owls', AGREEMENT: no). Assessing whether different types of Spanish speakers recognized this ungrammaticality and how they evaluated the acceptability of different stimuli was aimed at revealing whether psych verb constructions are an area of variability in Spanish, and thus, whether even non-bilingual native speakers might have difficulties recognizing some ungrammatical stimuli or might attribute to them different degrees of acceptability. Test sentences were, therefore, balanced to be (prescriptively) grammatical or ungrammatical (due to the agreement mismatch between subject and verb).
Following the suggestions in Bullock and Toribio (2006) and De Prada Pérez and Pascual y Cabo (2011), agreement mismatches would also reveal whether HS speakers in the US find acceptable a 3rd sg. invariable verbal form for both 3rd sg. and pl. If the HS speakers preferred stimuli where lack of agreement was a result of singular verb + plural subject (*me encanta los buhos 'I love owls') to those where ungrammaticality resulted from plural verb + singular subject (*me encantan el buho 'I love the owl'), this would be evidence for an innovation featuring an invariable verbal form in HS psych verb constructions, compared to standard Spanish. Agreement between subject and verb was therefore one of the two main variables of interest, as it targeted some aspects of the four overarching research questions of the study directly, namely those of the variability intrinsic in psych verb constructions, and those of the potential learnability problems this variability might cause. While it might be expected that prescriptively ungrammatical stimuli (AGREEMENT: no) would elicit worse judgments than grammatical ones (AGREEMENT: yes) across the board, this study aimed at establishing whether different, even ungrammatical stimuli would be more or less acceptable to different types of participants.

The Main Predictor 2: NATLG
The main predictor NATLG (short for 'native language') had three levels: English (L2) vs. HS (English-Spanish) vs. non-bilingual native Spanish speakers (NS), designating the three groups of participants and highlighting how and when they learned Spanish. As mentioned in the introduction above, this use of "native language" is just a label to distinguish the three groups. It is in no way intended as an implication that HS speakers are non-native speakers of Spanish: the HS level here is short-hand for "bilingual speakers of HS", whereas the NS level is short-hand for "originally monolingual speakers of Spanish born and raised in a Spanish-speaking country, who learnt English as adults later in life". NATLG was therefore one of the two main variables of interest, as it targeted other aspects of the four overarching research questions of the study directly, namely those of the variability intrinsic in different ways of learning Spanish, whether early (HS, NS) or late in life (L2), in a formal setting with frequent exposure to the standard, written language (NS, L2) or an informal (HS) setting, characterized by an almost exclusively oral usage of the language. We expected that the degree to which AGREEMENT would correlate with, i.e., predict, the response variable JUDGMENT (see below) would differ across the levels of the variable NATLG, i.e., the three speaker groups. Thus, non-early-bilingual NS speakers would behave differently from HS speakers, and these in turn would behave differently when compared to Spanish as L2 learners. These distinctions were based on existing literature that found that L2 learners have problems producing inflectional morphology and often omit it (Juanggo 2017 and references cited therein; Bosch et al. 2019), and that more exposure to the L2 and continued learning improve subject-verb agreement (Birdsong and Quinto-Pozos 2018).
Furthermore, we expected bilingual HS speakers to be closer to monolingual NS speakers since reverse constructions are frequent in natural language.
For the statistical analysis, NATLG and AGREEMENT were conflated into a 6-level variable NATLGAGREE, namely the combination of the variables NATLG and AGREEMENT. This was created because brms::brm does not permit a straightforward analysis of three-way and higher interactions, but we still needed each of the predictors to be able to interact with both NATLG and AGREEMENT.
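The conflation of the two predictors can be pictured with a short sketch. It is illustrative only: the level strings and the underscore separator are assumptions, not the coding used in the authors' data or in their R scripts.

```python
from itertools import product

# Level names follow the text; the exact strings in the authors' data are assumptions.
NATLG_LEVELS = ("English", "HS", "NS")
AGREEMENT_LEVELS = ("no", "yes")


def natlgagree(natlg: str, agreement: str) -> str:
    """Return the combined NATLGAGREE label for one observation."""
    if natlg not in NATLG_LEVELS:
        raise ValueError(f"unknown NATLG level: {natlg!r}")
    if agreement not in AGREEMENT_LEVELS:
        raise ValueError(f"unknown AGREEMENT level: {agreement!r}")
    # The underscore separator is a hypothetical naming choice.
    return f"{natlg}_{agreement}"


# All 3 x 2 = 6 levels of the combined variable.
NATLGAGREE_LEVELS = [
    natlgagree(group, agree)
    for group, agree in product(NATLG_LEVELS, AGREEMENT_LEVELS)
]
```

With a single combined factor, every control variable needs only a two-way interaction with NATLGAGREE to be allowed to differ by both speaker group and agreement status.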

Control Variables
We now turn to control variables that embody some of the grammatical or pragmatic features known from previous literature to be relevant for psych verb constructions. While the existence of important grammatical, semantic, or pragmatic factors affecting these constructions is not in doubt (cf. Vázquez and Miglio 2016, for instance, and literature cited therein), they could interact differently with the separate participant groups. For that reason, we made sure that (i) our stimuli varied systematically across these variables and (ii) the variables featured in the statistical analysis, but we had no specific hypotheses regarding their impact on JUDGMENT.

NEGATION
NEGATION, with its two levels no vs. yes, refers to the absence or presence of negation of the main verb of the stimulus sentence (Le gustan los buhos 's/he likes owls' vs. No le gustan los buhos 's/he does not like owls'). NEGATION could be relevant because it is known to be a complicating factor in cognitive processing (an extra step or cognitive operation, as per Wason 1959; Wason and Johnson-Laird 1972, p. 39). This has been documented experimentally, for instance, by Dale and Duran (2011), and causes increased reaction time, which in turn can be decreased through context (Wason 1965; Glenberg et al. 1999). It should be pointed out, however, that the stimuli in the questionnaire used for this study provided no context (see examples (3) and (4) below), nor was reaction time recorded, as the instrument was given to the participants on paper.
Considering previous studies' conclusions on negation, we considered it possible that it might add some cognitive load on the participants during the eliciting phase of the acceptability judgments, which is why we felt it needed to be added as a control. NEGATION could, for instance, be a complicating factor for only some of the participants, such as HS speakers and/or L2 learners, whose exposure to standard Spanish was less consistent than that of NS educated in a Spanish-speaking and writing school system. Thus, HS speakers and/or Spanish as L2 learners might find it more difficult to assess prescriptively grammatical/ungrammatical stimuli, i.e., to align their acceptability judgment to prescriptive grammar, when rating negative sentences as in (3), than affirmative ones as in (4):
(3) *La razón para quejarse no les sobran SUBJ-3rd sg.
IO-EXP V-3rd sg. EXP-RED '(As for me), I have plenty of reasons to complain' (Negation: affirmative)

EMPHPRONOUN
The variable EMPHPRONOUN with its levels yes vs. no tracks whether or not the sentence involves reduplication (i.e., doubling) of the oblique/indirect object (IO) clitic pronoun representing EXP through the use of a corresponding, fully stressed personal pronoun. Clitic doubling is grammatical in every variety of Spanish. Usage of reduplication in this context comes across as emphatic or as marking contrastive focus. Specifically, EXP is often represented by a simple clitic (79.4% of times in a corpus of monolingual Spanish, according to Vázquez Rozas 2006). Because of its precise scope as emphatic or contrastive, monolingual native speakers make and perceive a clear distinction between sentences such as (5) and (6). Example (5) is unmarked, while (6) is used for emphasis or contrastively, for instance, to distinguish the EXP in this sentence from a previously mentioned one (someone who has not got enough courses for the Thanksgiving meal). Despite the lack of contrastive context for a sentence such as (6), native speakers have no problem rating it as grammatical. However, Vázquez Rozas (2006, p. 84) concludes that ≈79% of inverse verbal constructions express their EXP as a clitic, and only ≈9% do so through a full NP. The EXP, on the other hand, is expressed by a clitic and a reduplicated stressed pronoun in 13.38% of cases. These are still few cases compared to the almost 80% with a clitic-only EXP. Potential differences in sensitivity to natural language patterns among the groups of participants could therefore induce higher or lower judgments according to their mode of Spanish learning (NATLG: L2 vs. HS vs. NS).

SUBJORDER
SUBJORDER: gramsem vs. semgram represents the position of gram and sem with regard to the verb and/or each other, where gram is the actual grammatical subject of the verb, which in Spanish requires verbal agreement in number, and sem is the "semantic subject", i.e., EXP. The level gramsem represents the condition where either (i) the grammatical subject is located before and an experiencer PP (a stressed pronoun introduced by the preposition "a") is after the verb or (ii) both grammatical subject and EXP represented by a clitic are located before the verb. The level semgram, by contrast, represents the condition where EXP, expressed either (i) by a fully fledged EXP PP (a stressed pronoun introduced by the preposition "a") and its clitic or (ii) by a simple clitic, is located before the verb and the grammatical subject is located after the verb.
As with the other variables, the stimuli were balanced for all conditions of SUBJORDER. This variable was involved in the stimulus construction because speakers have an expected preference for placing grammatical subjects and EXPs in particular positions in sentences (see Vázquez Rozas's (2006) corpus results). Her data are taken from BDS, a tagged database with texts from the ARTHUS corpus (Archivo de Textos Hispánicos de la Universidad de Santiago (Spain), a corpus of Spanish syntactic data). In her 2006 study, the semantic themes cast as grammatical subjects are typically expressed by postposed NPs (54.2%), and only 10% are located before the verb. Moreover, 35.9% of these syntactic subjects are not expressed at all, since Spanish does not need to express the subject overtly. As for EXPs, they are mainly represented by clitics (79.4%), which have an obligatory pre-verbal placement in most Spanish tenses and moods, and by pre-posed NPs (18.3%). The latter percentage for EXPs is still an important expression of preferred pre-verbal placement if compared to the 10% of pre-verbal grammatical subjects.
Given these different frequencies, acceptability for stimuli differing in the position of the syntactic subject and that of EXP might, therefore, vary: The HS speakers' naturalistic way of learning Spanish could reflect a preference for more common frequency patterns in Spanish as a whole, and heritage speakers could, therefore, attribute higher values to a post-verbal gram. In Miglio and Gries (2019), we found that some monolingual Spanish speaker participants found a few questionnaire sentences with a pre-verbal grammatical subject (los buhos le gustan '(s/he) likes owls') less than felicitous but still judged them as grammatical. On the other hand, regardless of the stylistic and pragmatic purposes that make pre-verbal grammatical subjects acceptable to monolingual Spanish speakers, some HS speakers and L2 learners judged them as ungrammatical. Processing this extra pragmatic function might be more cognitively taxing, for instance, for heritage speakers and L2 learners who may have had little exposure to such unusual word order types, which in turn might lower their acceptability judgments even for prescriptively grammatical stimuli. On the other hand, L2 learners are more accustomed than heritage speakers to manipulating this kind of structure, as they have learnt Spanish in a formal setting. In this case, as we found in a different study (Miglio and Gries 2015), L2 learners may outperform HS speakers vis-à-vis the permutations of the grammatical subject's and EXP's position.

NUMBERGRAM
The number of the grammatical subject, expressed by the variable NUMBERGRAM, is an important variable, since it governs the agreement with the verbal form. This is also a two-level variable singular vs. plural, since the grammatical subject can vary according to the Spanish two-number system: Nos llama la atención el curso de literatura francesa 'we are interested in the French literature course' and Nos llaman la atención los cursos de literatura francesa 'we are interested in the French literature courses'. We considered it possible that NUMBERGRAM might interact with the variable of interest NATLG to shed light on the overarching research questions about the forms of the verbal paradigm in HS but once again had no expectation as to how it would do so.
Note that we only used third-person singular and third-person plural grammatical subjects in the stimuli to limit possible sentence types and to decrease the number of variable levels to consider. Therefore, no stimuli were constructed to vary according to the person of the grammatical subject: There are no sentences such as Al hombre alto le gusto yo/gustas tú 'the tall man likes me/you'. The grammatical subject was always a fully fledged singular or plural NP (la(s) mujer(es) simpática(s) 'the nice/funny woman' or 'women'), and consequently, the verb is also only ever in the third-person singular or plural present tense. This obviated the need for a predictor PERSONGRAM, a possible variable with first-, second-, and third-person levels.

EXPERPERSON
EXPERPERSON is a variable with three levels (first, second, and third person), since the stimuli can vary according to the person of the EXP (Los hombres altos me/te/le gustan, 'I/you like tall men' or 's/he likes tall men'); in order to be able to balance other factors, stimuli were constructed so that a first person EXP (singular or plural) always corresponded to a third-person singular grammatical subject. This variable was also involved in the stimulus construction because we found in Miglio and Gries (2019) that first person EXPs behave differently from second/third person. In psychological predicates such as the ones considered here, this is ascribed to the first-person sg. or pl. being able to penetrate the realm of feelings and knowledge of the individual EXP(s). Speakers are not usually able to opine knowledgeably about other people's states of mind or feelings (Floyd et al. 2018;Hargreaves 2018;Mithun 1999, pp. 74-75). Thus, it is perhaps not surprising that first persons should behave differently from second and third persons. In this sense, all groups of participants would hopefully make fewer mistakes (from the point of view of prescriptive grammar) in assessing the grammaticality of 1st-person EXPs (i.e., rate them higher if AGREEMENT: yes and lower if AGREEMENT: no) than that of 2nd and 3rd, simply because one is accustomed to speaking about one's own feelings. Where L2 learners do not have the benefit of immersion as HS/NS, they have the benefit of manipulating structures in the classroom, especially commonly used ones.

EXPERNUMBER
EXPERNUMBER has the two levels plural vs. singular: Me/Nos encanta ese hombre curtido por el sol 'I/we really like that sun-tanned man'. As stated above, we used all persons singular and plural for EXP. However, we avoided vos for second singular and vosotros for second plural as having a regionally limited distribution that could throw off participants, who were all of Mexican descent (at least by way of one parent/caregiver). We considered that the Iberian Spanish speakers of the control group would either be familiar with the usage of ustedes as second person plural, having all resided in the Western USA, or would consider it as the second person plural polite form, and that it would therefore not pose a problem. We used les + a ustedes/les + a ellos to disambiguate between second- and third-person plural.
The number of the EXP should not make a difference for any of the groups, since in Spanish, it does not control any other component of the sentence. It could, however, be relevant if there were a direct influence of English on, say, early or late bilingual speakers: The translation equivalents of these psych verbs are a direct construction in English, where the EXP is also the grammatical subject that controls verbal agreement (I like owls vs. she likes owls).

STIMNUMBER
Finally, we included a numeric predictor STIMNUMBER in the model, which encoded for each stimulus which number it was since the beginning of the test (1-64). This controlled for fatigue, learning, habituation, etc. over the course of the experiment.

Stimulus Composition
All variables detailed above were balanced in the 64 stimuli each participant was asked to rate. It should also be mentioned that while we tried to make the stimuli as natural as possible (see Appendix A), even when manipulating the control variables described in this section above, some sentences may have sounded cumbersome, or slightly infelicitous. A form that is frequently used in the negative, for instance, may be substituted by a different construction altogether in the affirmative, instead of simply taking the negation out. Moreover, as we strived to mimic natural language as much as possible, we created several types of stimuli without controlling for constituents being represented by noun phrases built with different components. Some were just clitic NPs (Las razones para quejarse no les sobran, 'they have no reason to complain'); some were full NPs (La política no nos interesa, 'we aren't interested in politics'); others were NPs with a prepositional phrase (Los novios de sus hermanas le caen mal, 's/he dislikes her/his sisters' boyfriends'), or NPs with an adjectival phrase (Te gustan los hombres altos, 'You like tall men'); and others still were represented by an NP with an adjectival phrase and a PP (Nos sorprenden las opiniones conservadoras del rector, lit. 'we are surprised by the conservative opinions of the Chancellor').

The Dependent/Response Variable JUDGMENT
JUDGMENT indicates how acceptable the stimulus sentence was to the participants. Half of both stimuli and fillers presented to participants were grammatical and half were ungrammatical sentences, according to prescriptive grammar. Although only 42 of the 64 sentences on the questionnaire were actual stimuli, participants were asked to rate all items through acceptability judgments on a Likert scale spanning from −3 to +3 (−3, −2, −1, 0, +1, +2, +3), where −3 equaled "Very strange", 0 "Intermediate", and +3 "Natural Spanish".
Grammatical stimuli could be items such as (7): (A él) no le caen mal los novios de sus hermanas '(As for him), he does not dislike his sisters' boyfriends', while (8) and (9) below exemplify prescriptively ungrammatical stimuli. Grammatical stimuli were counterbalanced by ungrammatical ones such as *La política nos interesan 'we are interested in politics', exemplified without reduplication of the indirect object EXP in (8) and with reduplication of the EXP in (9).
All variables of Section 3.4 were permitted to interact with NATLGAGREE so that we would be able to determine whether any of these factors had an impact on the judgments that different speakers/groups of speakers gave to the stimulus, based on how linguistically acceptable it was to them. Results of said interactions would shed light on the study's research questions and determine if the effects of the main predictors of interest were modified/mediated by the controls. Generalizing from the above remarks on some of the controls, the following were potential effects of the grammatical, pragmatic, or semantic factors encoded in these controls. As for the NEGATION and EMPHPRON variables, the presence of negation (los buhos no me gustan 'I don't like owls') and the reduplication of the EXP with an emphatic pronoun (los buhos me gustan a mí 'I, for one, really like owls') could be considered a further cognitive load and possibly cause stimuli to be judged worse/unacceptable, especially by speakers such as HS and L2 with potentially less exposure to less frequently used structures. Frequency effects could also cause worse ratings for less common orderings of constituents, and thus, any of the groups that was particularly sensitive to frequency effects in natural language might judge worse or even unacceptable those stimuli where the grammatical subject was located before the verb (los buhos me gustan 'I like owls') instead of being postposed (me gustan los buhos 'I like owls').
The number of the grammatical subject (NUMBERGRAM), on the other hand, could affect the acceptability judgments of early bilingual speakers if HS has implemented an innovation whereby the third-person singular of psych verbs has become an invariant form also used in the plural, as surmised in previous literature (for instance, Toribio and Nye 2006). The variable EXPERPERSON could cause variability in JUDGMENT because of the nature of psych verbs as self-reflective predicates (Hargreaves 2018), but it is unclear how. EXPERNUMBER should not cause any differences in judgment values per se, but given that it coincides in English with the syntactic subject of psych verbs, it might affect bilingual speakers (HS/L2) through transfer. Finally, it was possible but not systematically predictable that there would be habituation or fatigue effects as the experiment progressed, as tracked by the control variable STIMNUMBER.

Statistical Analysis
We began our statistical analysis with comprehensive exploration and cross-tabulation to ensure that no pockets of data sparsity would raise problems in the analysis. No such problems were encountered, however, thanks to the design of the questionnaire and to the fact that there were virtually no missing judgments, which could have been potentially caused by speakers not providing an answer. These data were modeled with a Bayesian ordinal mixed-effects model. Unlike many other studies, we decided to use an ordinal model because the grammaticality judgments we collected are arguably not interval/ratio-scaled data and should thus more realistically be analyzed at the ordinal level that they instantiate. In addition, we went with a Bayesian model (i) because of how Bayesian modeling addresses one of the most widespread problems of frequentist mixed-effects modeling, namely, convergence problems, and (ii) because the results of Bayesian models can often be more intuitive than those of frequentist models; for instance, they provide credible intervals for the statistics in question as opposed to confidence intervals.
Using the R package::function brms::brm (Bürkner 2017, 2018), we fit a single model (i.e., there was no model selection) with the following characteristics:
• other control as a fixed effect: STIMNUMBER, which was also permitted to interact with NATLGAGREE;
• sources of random-effect variation: varying intercepts and varying slopes for NATLGAGREE for each level of the variables SUBJECT (one for each of the 131 participants completing the questionnaire) and VERB (one for each of the 16 verbal constructions in the critical stimulus sentences).
Priors were set to be uninformative (mean = 0, sd = 10), and we used 5 chains with 6000 iterations and 2000 warmup iterations. These values were chosen after a first round of modeling with fewer iterations resulted in unsatisfactory Rhat values; with the settings reported above, no warnings or problematic Rhat values were encountered.
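The sampler settings determine how many posterior draws are available for inference. The following sketch does the arithmetic under two assumptions that the text does not state explicitly: that the 6000 iterations per chain include the 2000 warmup iterations (which is how the `iter` argument of brms::brm counts them) and that no thinning was applied.

```python
# Sampler settings as reported in the text.
CHAINS = 5
ITER_PER_CHAIN = 6000    # assumed to include warmup, as in the brms `iter` argument
WARMUP_PER_CHAIN = 2000


def post_warmup_draws(chains: int, iterations: int, warmup: int) -> int:
    """Number of posterior draws retained across all chains (no thinning assumed)."""
    if warmup >= iterations:
        raise ValueError("warmup must be smaller than the total number of iterations")
    return chains * (iterations - warmup)


TOTAL_DRAWS = post_warmup_draws(CHAINS, ITER_PER_CHAIN, WARMUP_PER_CHAIN)
```

Under these assumptions, each chain contributes 4000 draws, for 20,000 draws in total.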
This model was then evaluated based on its R² values and the predictors whose credible intervals did not include 0 (and that did not participate in higher-order interactions, as per the principle of marginality). For these highest-order predictors, we studied their conditional effects, i.e., the predictions the model makes for each of those predictors while all other predictors are controlled for, and we plot those below.
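The criterion of a credible interval that does not include 0 can be sketched as follows. This is a simplified illustration and not the authors' procedure: it computes an equal-tailed interval from a list of posterior draws using plain order statistics, and the function names are hypothetical.

```python
def credible_interval(draws, mass=0.95):
    """Equal-tailed credible interval from posterior draws (simple order statistics)."""
    ordered = sorted(draws)
    n = len(ordered)
    if n == 0:
        raise ValueError("need at least one draw")
    tail = (1.0 - mass) / 2.0
    lower_index = int(tail * n)
    upper_index = min(n - 1, int((1.0 - tail) * n))
    return ordered[lower_index], ordered[upper_index]


def excludes_zero(draws, mass=0.95):
    """True if the whole credible interval lies on one side of 0."""
    lower, upper = credible_interval(draws, mass)
    return lower > 0 or upper < 0
```

A coefficient whose draws fall almost entirely above (or below) 0 passes this check, whereas one whose draws straddle 0 does not.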

Results
As mentioned in Section 3.3.1, prescriptively ungrammatical stimuli were expected to elicit worse judgments than grammatical ones across the board. In this specific case, the Likert scale allowed participants to assign three negative values to unacceptable stimuli, a middle-of-the-road 0, and three positive values for acceptable stimuli. By and large, Table 1 shows that prescriptively ungrammatical stimuli (AGREEMENT: no) were given negative ratings (−3, −2, −1) more often than prescriptively grammatical ones (AGREEMENT: yes). The "zero" value was given to approximately the same number of grammatical and ungrammatical stimuli (138 vs. 140). Among the positive values, only +1 was assigned to more ungrammatical stimuli than grammatical ones; the other two positive values were given to more grammatical than ungrammatical stimuli (412 vs. 308, and 992 vs. 413).
The overall model fit was somewhat satisfactory: The robust Bayesian R² was 0.383, with a 95% CI of [0.367, 0.397]; sampling reliability was fine (all Pareto k-diagnostics were <0.7, and all Rhat values but two were 1; those two were 1.01 and thus still below the usual "critical value" of 1.1). A variety of main effects were observed, but all of them participated in interactions, which is why we focus on those. The picture painted by these interactions is rather homogeneous. Unfortunately, the results of ordinal or multinomial regressions are rather voluminous and complex: Unlike linear or binary logistic regressions, which return one prediction for every single case, ordinal/multinomial regressions return predicted probabilities of all outcomes (i.e., all seven possible judgment values) for every single case. We use the same kind of visual representation for all interactions so as to hopefully reduce the complexity of the output and facilitate its understanding; given the multitude of results, we provide a concise interim summary in Section 4.7 below.
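To make concrete why an ordinal model returns seven predicted probabilities per case, the following sketch assumes a cumulative model with a logit link, in which P(Y ≤ k) = logistic(τ_k − η). The text says only that an ordinal model was used, so the link is an assumption here, and the threshold values are hypothetical, not estimates from the study.

```python
import math

# Hypothetical thresholds separating the seven ordered judgment categories (1-7).
# These values are illustrative only and are not estimates from the study.
THRESHOLDS = [-2.5, -1.5, -0.5, 0.5, 1.5, 2.5]


def logistic(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def category_probabilities(eta: float, thresholds=THRESHOLDS) -> list:
    """Probabilities of the seven judgments for one case with linear predictor eta."""
    # Cumulative probabilities P(Y <= k); the last category closes at 1.
    cumulative = [logistic(t - eta) for t in thresholds] + [1.0]
    probabilities = []
    previous = 0.0
    for value in cumulative:
        probabilities.append(value - previous)
        previous = value
    return probabilities


def most_likely_judgment(eta: float, thresholds=THRESHOLDS) -> int:
    """Judgment (1-7) with the highest predicted probability."""
    probabilities = category_probabilities(eta, thresholds)
    return probabilities.index(max(probabilities)) + 1
```

Under these assumptions, a larger linear predictor shifts probability mass toward the high end of the scale, and the judgment with the highest probability plays the role of "the" prediction for a case.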

The Interaction NATLGAGREE:EMPHPRON
Consider Figure 1 for the plot for NATLGAGREE:EMPHPRON. Each of the six panels represents predictions for one of the levels of NATLGAGREE; e.g., the top left one represents the result for the English/L2 speakers' judgments of ungrammatical (AGREEMENT: no) stimuli. The x-axis represents the levels of the predictor interacting with NATLGAGREE, i.e., here, EMPHPRON; the y-axis represents predicted probabilities of judgments. The numbers plotted into the coordinate system are the seven possible acceptability judgments, which, to avoid dealing with possibly hard-to-read minus signs in plotting, were converted from the "−3 to +3" scale that the subjects actually assigned to each stimulus to a corresponding 1-to-7 scale. In each panel, (i) the y-axis position and the font size represent the predicted probability of each judgment (i.e., numbers that are bigger and higher up in the plot represent more likely outcomes), and (ii) the acceptability judgment with the highest predicted probability, i.e., the one the model predicts, is highlighted in black.
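The conversion between the questionnaire scale and the plotting scale amounts to a shift by 4, which matches the correspondences mentioned in the text (1/−3, 6/+2, 7/+3). A minimal sketch, with hypothetical function names:

```python
# Anchor labels of the questionnaire scale, as given in the text.
SCALE_LABELS = {-3: "Very strange", 0: "Intermediate", 3: "Natural Spanish"}


def to_plot_scale(rating: int) -> int:
    """Map a questionnaire rating (-3..+3) to the 1-to-7 scale used in the plots."""
    if rating not in range(-3, 4):
        raise ValueError(f"rating out of range: {rating}")
    return rating + 4


def to_questionnaire_scale(value: int) -> int:
    """Map a plotted value (1..7) back to the original -3..+3 rating."""
    if value not in range(1, 8):
        raise ValueError(f"value out of range: {value}")
    return value - 4
```

For instance, a plotted 1 corresponds to the "Very strange" end of the questionnaire scale and a plotted 7 to "Natural Spanish".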
To some extent, there is an overall effect of AGREEMENT: In the three lower panels, where AGREEMENT is yes (i.e., the stimulus is prescriptively grammatical), high positive judgments are predicted for all speakers even if the strength of the prediction varies across the three speaker groups. The clearest positive judgments are predicted for the monolingual native speakers of Spanish in the right panel, the prediction for the HS speakers is a bit less strong, and the prediction for the English/L2 speakers is least strong (even if they are still predicted to produce the second highest rating, a 6/+2).
By contrast and for the most part, in the three upper panels, where AGREEMENT is no (i.e., the stimulus is ungrammatical), low(er) acceptability values are predicted, but, and this is where the main force of the interaction comes in, not for the HS speakers. The predictions for the non-early-bilingual native speakers of Spanish and of English are indeed 1s (corresponding to −3 values in the original judgments), but the HS speakers are actually predicted to rate these ungrammatical stimuli just as perfectly acceptable (7/+3) as they rated the lower-panel counterparts.

The Interaction NATLGAGREE:EXPERNUMBER
A very similar result was obtained for the interaction of NATLGAGREE:EXPERNUMBER, which is shown in Figure 2, and therefore does not require much additional commentary.

The Interaction NATLGAGREE:NEGATION
The result for this interaction, NATLGAGREE:NEGATION, is also of the same type, as is shown in Figure 3.

The Interaction NATLGAGREE:SUBJORDER
The result for this interaction, NATLGAGREE:SUBJORDER, is also of the same type, as is shown in Figure 4.

The Interaction NATLGAGREE:EXPERPERSON
This interaction is slightly different from those described so far; consider Figure 5 below.

When AGREEMENT is yes, there is 'the usual' set of high predictions for all speaker groups, and when AGREEMENT is no, there is 'the usual' set of low predictions for non-bilingual native speakers of Spanish and 'the usual' unexpectedly high ratings from the HS speakers. However, the native speakers of English (L2s) behave more unexpectedly here as well: Rather than giving uniformly low ratings for all ungrammatical stimuli, as in all effects so far, they now only do so (correctly) for 1st person EXPs, while for 2nd and 3rd person EXPs, they give high acceptability judgments (just like the HS speakers).

The Interaction NATLGAGREE:NUMBERGRAM
This final interaction is also slightly different from all previous ones; see Figure 6 below.
There are no changes when AGREEMENT is yes and no changes when AGREEMENT is no for the English late bilinguals (L2) and for the non-early-bilingual Spanish speakers (NS). However, when AGREEMENT is no, then the HS speakers give prescriptively correct low acceptability ratings for singular grammatical subjects, but prescriptively incorrect high acceptability ratings for plural grammatical subjects.


Interim Summary
To sum up: First, all interaction effects reveal that, in essence, all speaker groups give the prescriptively accepted high acceptability ratings for prescriptively grammatical stimuli (predicted with differing strength, but as expected nonetheless). Second, all interaction effects reveal that non-early-bilingual native speakers of Spanish give the prescriptively accepted low acceptability ratings for ungrammatical stimuli. Third, the non-early-bilingual native speakers of English (L2) mostly pattern with the non-early-bilingual native speakers of Spanish; the only time they do not is in ungrammatical stimuli with 2nd/3rd person EXPs, which L2s rate very high (*te/le gusta los buhos 'you/she/he likes owls'). Fourth, the HS speakers are nearly uniformly much more accepting of ungrammatical stimuli; the only occasion where their ratings of ungrammatical stimuli converge with those of the L2s and the non-early-bilingual native speakers of Spanish (NS) is when NUMBERGRAM is singular (*me gustan el buho 'I like the owl'). Before we discuss the implications of these results in Section 5 below, we turn to some results regarding the random effects.

Random-Effects Results
Most studies using mixed-effects models, whether Bayesian or frequentist, do not discuss random-effects results at all: They use the random effects to get better fixed-effects results but often neither check them as part of model diagnostics nor explore them with regard to the phenomenon at hand. In our current modeling process, we first checked the distribution of the slope and intercept adjustments for predictors by verb as well as by speaker. These looked rather good: there were very few outlying adjustments that would point to noteworthy verbs or speakers, and hardly any adjustment had a CI that excluded 0 and would thus have warranted further consideration.
We also checked which stimuli received surprisingly high ratings although they were unacceptable (i.e., AGREEMENT was no). There were some slightly intriguing patterns, which motivated us to do a small exploratory follow-up: We generated a response variable SURPRHIGH, which classified each rating as yes if the rating was +2/6 or +3/7 although AGREEMENT was no (all others were set to no), and then we tried to predict these yes occurrences using a conditional inference tree (Hothorn et al. 2006). The resulting tree implied that:
• Unsurprisingly, given the above premises, SURPRHIGH: yes was very rare in the responses from non-early-bilingual native speakers of Spanish (NS) and with singular grammatical subjects (stimuli of the type *me gustan el buho 'I like the owl');
• SURPRHIGH: yes was much more frequent (exceeding the baseline by a factor of at least 2) with EMPHPRON: yes, SUBJORDER: semgram, and NUMBERGRAM: plural, i.e., with stimuli of the type *A mí me gusta los buhos 'As for me, I like owls'.
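The construction of SURPRHIGH described above can be illustrated with a minimal sketch. It rests on several assumptions: we read "+2/6 or +3/7" as the two highest points of the 7-point scale, coded here as +2 and +3; the field names (SPEAKER, AGREEMENT, RATING) and the toy responses are hypothetical; and the conditional inference tree itself is not reproduced.

```python
from collections import defaultdict


def surprhigh(rating: int, agreement: str) -> str:
    """Return 'yes' for a rating of +2 or +3 given to a stimulus with AGREEMENT: no."""
    return "yes" if agreement == "no" and rating >= 2 else "no"


def surprhigh_rate(rows, key):
    """Share of SURPRHIGH: yes among ungrammatical stimuli, grouped by `key`."""
    hits, totals = defaultdict(int), defaultdict(int)
    for row in rows:
        if row["AGREEMENT"] != "no":
            continue  # only ungrammatical stimuli can be 'surprisingly high'
        totals[row[key]] += 1
        if surprhigh(row["RATING"], row["AGREEMENT"]) == "yes":
            hits[row[key]] += 1
    return {level: hits[level] / totals[level] for level in totals}


# Invented toy responses (hypothetical, not the study's data)
toy = [
    {"SPEAKER": "HS", "AGREEMENT": "no", "RATING": 3},
    {"SPEAKER": "HS", "AGREEMENT": "no", "RATING": -1},
    {"SPEAKER": "NS", "AGREEMENT": "no", "RATING": -3},
    {"SPEAKER": "NS", "AGREEMENT": "yes", "RATING": 3},
]
rates = surprhigh_rate(toy, "SPEAKER")
```

Comparing such per-level shares against the overall share of SURPRHIGH: yes is one simple way to read statements like "exceeding the baseline by a factor of at least 2".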

Discussion
In the course of the discussion below, we revisit our research questions in turn: (1) Are psych verb constructions an area of Spanish grammar that is particularly rife with variability? (2) Do these constructions pose a learnability problem for HS speakers? (3) Do these potential problems stem from English influence? (4) If transfer from English is the potential cause, does this affect L2 learners of Spanish?
In general, the expectation was that the three participant groups would differ in their judgments of these constructions, either because psych verbs are a linguistic domain producing considerable variation in judgment even in monolingual varieties (1), or possibly because participants learned Spanish in different ways (2-4): HSs and NSs learned it early in life and in a naturalistic context, whereas L2s learned it later in life and in a formal setting. On the other hand, NSs also share with L2s the classroom setting, the formal study of standard grammar, and the written support warranted by studying Spanish through the school system or in an academic setting. We contextualize these issues in the light of our results below.

General Discussion of Fixed Effects
Regarding research question (1) above, given the high level of Spanish fluency for all groups, we expected that the participants would generally give higher ratings to prescriptively grammatical stimuli than to ungrammatical ones, which is by and large confirmed in Table 1 (Section 4). However, considering the unanswered variability question raised by De Prada Pérez and Pascual y Cabo (2011), all variables that could influence the outcome, and the possibility of gradient responses through the Likert scale, it was possible that even the judgments provided by the "control" group of participants that had grown up as native monolingual speakers in a Spanish-speaking country might not always align with the prescriptive grammar of standard Spanish. This is in part true if one looks at raw numbers for NS in the table below (last two rows). Table 2 below is a summary table grouping together the number of stimuli that were judged to be unnatural or very unnatural (−3, −2), neither particularly natural nor unnatural (−1 to +1), and natural or very natural Spanish (+2, +3), respectively, but also divided by speaker type/grammaticality. Although an essentially monofactorial table must not distract from, much less override, the actual multifactorial results of the model, the last two rows show us that there is a near ceiling performance of non-bilingual native speakers of Spanish regarding grammatical alignment with standard grammar. The Likert scale did allow for some variation in the acceptability judgments, and thus, there are 12 instances in which some NS considered a grammatical stimulus unnatural Spanish, and 35 where they considered an ungrammatical stimulus perfectly natural Spanish (see also Section 4.8 above and Section 5.2 below).
Whether this indicates variability in the psych verb domain of monolingual Spanish grammar is more difficult to establish, as those few NS responses may simply be a reaction to the infelicitousness conjured forth by the need to construct strictly balanced stimuli, tied to frequency effects. The tendency of NSs to align with standard grammar is clear, however, and much stronger than in the case of HSs (middle rows) or L2s (first two rows of results).
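The three-way grouping of ratings behind the summary table can be sketched as follows. This is only an illustration: the function names and band labels are hypothetical, the ratings are assumed to be coded from −3 to +3, and the toy responses are invented rather than taken from the study.

```python
from collections import Counter


def rating_band(rating: int) -> str:
    """Map a rating on the -3..+3 scale to one of the three summary bands."""
    if not -3 <= rating <= 3:
        raise ValueError(f"rating out of range: {rating}")
    if rating <= -2:
        return "unnatural"  # -3, -2
    if rating >= 2:
        return "natural"    # +2, +3
    return "neutral"        # -1 to +1


def band_counts(responses):
    """Count responses per (speaker group, AGREEMENT, band)."""
    return Counter(
        (group, agreement, rating_band(rating))
        for group, agreement, rating in responses
    )


# Invented toy responses (hypothetical): (speaker group, AGREEMENT, rating)
toy = [("NS", "yes", 3), ("NS", "no", -3), ("NS", "no", 2), ("HS", "no", 0)]
counts = band_counts(toy)
```

A cell such as `counts[("NS", "no", "natural")]` then corresponds to the kind of figure reported for ungrammatical stimuli that NSs nonetheless judged natural.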
In fact, there are three main findings that require specific discussion/contextualization. The most widespread one is the clear effect that, again and again, HS rate prescriptively ungrammatical utterances as much more acceptable than both non-bilingual native speakers of Spanish and L2 learners; in other words, they seem much more tolerant of stimuli that the other two speaker groups consider clearly and nearly completely unacceptable. This first finding addresses our second research question and confirms findings from our and others' previous research (Toribio and Nye 2006; De Prada Pérez and Pascual y Cabo 2011). However, if psych verb constructions pose a "learnability issue" for HS, our findings do not explain the systematicity with which Heritage Spanish does not align with standard Spanish in this domain and seem to point in a direction different from learnability. One reaches a more plausible explanation when considering Heritage Spanish as a separate dialect of Spanish, with different grammatical rules, which need not and indeed should not be compared in all grammatical aspects to the native Spanish spoken by our third group of non-early-bilingual native speakers.
In part, this first finding also addresses research questions (3) and (4), as the performance of advanced learners of Spanish as L2 patterns more with NSs than with HSs: This would seem to exclude a direct influence of English on this grammatical domain, as L2s and HSs are also both native speakers of English. The patterning of L2s with NSs seems to point instead to the fact that both these groups share the learning of Spanish also in a formal setting and therefore are more likely to adhere to prescriptive usage.
The second main finding can be seen in Figure 5 above: the grammatical stimuli are accepted by all speaker groups, the ungrammatical ones are rejected by the monolingual native Spanish speakers, accepted by the HS speakers (as before), but the L2 learners now also accept 2nd and 3rd person even in ungrammatical stimuli. This difference between first and non-first person in psych verb predicate constructions (here, by the learners) is not entirely surprising, perhaps, as
". . . it is expected that speakers have a tendency to talk about themselves rather than about a third party, and for the same reasons . . . speakers do not feel entitled to talk about the feelings or impressions of others, since they usually have no access to them." (Vázquez and Miglio 2016, pp. 97-98)
Among others, Mithun (1999, pp. 74-75) showed the use of a special so-called empathetic third-person pronoun when talking about the feelings or thoughts of others in the Pomoan languages of California. Recent literature on egophoricity (see Floyd et al. 2018; Hargreaves 2018) has highlighted the distinction between first-person and non-first-person forms when talking about other speakers' feelings. Specifically, Hargreaves (2018, p. 79) has shown that some languages even morphosyntactically encode interconnected semantic and pragmatic features embodying "epistemic constraints on the attribution of intentional and internal states, and a discourse function . . . constructed from the indexical properties of speaker/addressee . . . [from different] illocutionary [sentence] types". It is, thus, not completely unexpected that there should be a distinction between the judgments accorded to first-person vs. non-first-person syntactic subjects.
The third main finding is the most important one in our study. HS speakers systematically rate as more acceptable those stimuli where AGREEMENT is no and the number of the grammatical subject (NUMBERGRAM) is plural: *Nos llama la atención los cursos de literatura francesa 'we are curious about the French literature courses'. The fact that the HS speakers accept a morphologically marked singular verb and a plural grammatical subject as grammatical much more easily than an ungrammatical stimulus with a singular subject and a plural verb such as *nos llaman la atención el curso de literatura francesa 'we are curious about the French literature course' (cf. Figure 6 in Section 4.6) is the best evidence thus far that Heritage Spanish differs from standard dialects of Spanish as spoken by monolingual native speakers and is consistent with the existence of an invariant third-person singular verbal form used with both singular and plural grammatical subjects, as hypothesized by several authors before us (cf. Toribio and Nye 2006; De Prada Pérez and Pascual y Cabo 2011). HS speakers, after all, have no problem recognizing that sentences such as *nos interesan la política 'we are interested in politics' are prescriptively ungrammatical, just as the other two groups of speakers (L2s, NSs) do. Therefore, their acceptance of the singular verb with plural subjects is most likely not due to their having acquired a grammar that is "defective" in the area of verbal agreement (because of incomplete acquisition or because of attrition), but rather due to their speaking a variety of Spanish that no longer uses a verbal form marked for the plural in psych verb constructions.
Despite their likely dominance in English, the language the majority of these HS speakers used in the school system, these participants specifically chose to major or minor in Spanish, which clearly shows a considerable degree of commitment to their heritage language. While we did not specifically ask the participants for a breakdown of their daily linguistic interactions in English or Spanish, their degree choices would require a high level of weekly, if not daily, engagement with Spanish in all its forms, and specifically in the standard variety, as required by written assignments and essays for the Spanish upper division courses. In fact, it is precisely because of the high level of education of these participants and their choice of Spanish as their subject of university study that we believe our findings document even more convincingly that (i) HS has a more streamlined verbal paradigm for psych verb constructions than the standard, and (ii) HS should be considered a separate dialect of Spanish. Given the participants' considerable access to literacy (upper division university studies at a 4-year R1 institution) and their engagement with Spanish (major/minor in Spanish), their acceptance of a third-person singular form gusta even with plural subjects, as in *me gusta los buhos 'I like owls', can only be a feature of HS, rather than a bug for these speakers. It is therefore likely an innovation in the variety of Spanish grammar they speak, i.e., California Heritage Spanish.
This is all the more so because L2s and NSs differ from HS speakers precisely in this respect: L2s and NSs have in common that they did, at some point, study Spanish standard grammar in a classroom setting and therefore find forms such as *me gusta los buhos (accepted by HSs) as unacceptable as *me gustan el buho, i.e., regardless of whether the agreement mismatch is caused by the singular or plural morphology marked on the verb coupled with a grammatical subject of the opposite number.
Considering that the third-person singular present tense is the one form that has maintained a differential marking in the otherwise severely impoverished English verbal paradigm, it would be difficult to prove a direct English influence on an invariant form in HS psych verbs. This would not exclude that extensive language contact and the dominance of one language over the other in bilingual speakers in general (although not necessarily in this group of participants) could encourage the simplification of the psych-verbal paradigm as an innovation in HS, a language change that could potentially spread to the rest of the verbal paradigm.

Discussion of Post Hoc Exploration of SURPRHIGH
In Section 4.8, we discussed the random-effects results by exploring which specific, ungrammatical stimuli elicited surprisingly high ratings (hence the response variable SURPRHIGH). We mentioned there that there were some intriguing patterns and that these results are important because they address variability in monolingual native speakers' grammar, an issue raised, for instance, by De Prada Pérez and Pascual y Cabo (2011, p. 118) and masked in previous studies where grammaticality judgments were elicited as a binary variable (Miglio and Gries 2015). It seems, in fact, that Spanish shows some variation in psych verb constructions when there is (i) an interplay between verb-subject agreement, (ii) the contextless presence of an emphatic pronoun in the stimulus, (iii) the less common order of constituents EXP + verb + SUBJ (semgram), and (iv) a plural grammatical subject. As a result, there were no stimuli with AGREEMENT: yes that received a median rating of −3, but there were some stimuli with AGREEMENT: no that received a median of +3:
(10) *A él le llama la atención los tatuajes de colores '(As for him) he likes color tattoos'
(11) *A él no le agrada los restaurantes de lujo '(As for him) he dislikes high end restaurants'
(12) *A ella le gusta los chocolates de Bélgica '(As for her) she likes Belgian chocolate'
(13) *A ella no le gusta los deportes extremos '(As for her) she dislikes extreme sports'
(14) *A ella no le sorprende las tendencias conservadoras del nuevo rector '(As for her) she is not surprised by the new chancellor's conservative tendencies'
(15) *A ustedes les sorprende las tendencias conservadoras del nuevo rector '(As for you-pl.) you are surprised by the new chancellor's conservative tendencies'
(16) *Los libros de historia les interesa a ellos '(As for them -postposed) They are interested in history books'
(17) *El maestro de natación te caen bien 'You like the swimming instructor'
(18) *Les sorprende los actos de generosidad espontánea 'They are surprised by spontaneous acts of kindness'
(19) *No les disgusta las comidas muy picosas 'They do not dislike very spicy food (pl.)'
(20) *Nos faltan el dinero para ir de vacaciones 'We don't have the money to go on holiday (pl.)'
Interestingly, the above stimuli share a few characteristics. In total, 7 out of 11 have an emphatic pronoun, which reduplicates the IO/EXP, and 6 out of those 7 display the reduplicated, emphatic pronoun at the beginning of the sentence. We saw in previous studies that the presence of an emphatic pronoun may affect the acceptability or, at least anecdotally, the felicitousness of the stimulus. Moreover, 5 of those 7 emphatic pronouns are singular and could affect spotting the ungrammaticality of the sentence by inducing the participants to wrongly match the agreement of the verb with that of the EXP. This happens in English with to like but also with any other psych verb construction casting EXP as a grammatical subject both in English and in Spanish, where they also exist: adoro las playas desiertas 'I love desert beaches', for instance. The grammatical subject, on the other hand, is plural in 9 of those 11 stimuli, which also entails that the verb is third-person singular (since they are ungrammatical via verbal agreement).
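The per-stimulus check mentioned above (the median rating of each stimulus, split by AGREEMENT) can be sketched as follows. The stimulus identifiers, function names, and toy ratings are all hypothetical; the sketch only shows the kind of computation involved, not the procedure actually used.

```python
from collections import defaultdict
from statistics import median


def median_by_stimulus(responses):
    """Median rating for each (stimulus, AGREEMENT) pair."""
    ratings = defaultdict(list)
    for stimulus_id, agreement, rating in responses:
        ratings[(stimulus_id, agreement)].append(rating)
    return {key: median(values) for key, values in ratings.items()}


def stimuli_with_median(medians, agreement, value):
    """Stimuli with the given AGREEMENT level whose median rating equals `value`."""
    return sorted(s for (s, a), m in medians.items() if a == agreement and m == value)


# Invented toy responses (hypothetical): (stimulus id, AGREEMENT, rating)
toy = [
    ("s1", "no", 3), ("s1", "no", 3), ("s1", "no", -1),
    ("s2", "no", -3), ("s2", "no", -2), ("s2", "no", -3),
    ("s3", "yes", 3), ("s3", "yes", 2), ("s3", "yes", 3),
]
medians = median_by_stimulus(toy)
surprising = stimuli_with_median(medians, "no", 3)       # ungrammatical, median +3
rejected_ok = stimuli_with_median(medians, "yes", -3)    # grammatical, median -3
```

The median is used here because a single dissenting rating does not move it, so a stimulus is flagged only when at least half of its ratings sit at the extreme.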
Thus, to recap: In a majority of these stimuli, the extra material that needs processing is located sentence-initially and without a wider context that would alert the participants to pragmatic factors such as emphasis or contrastive focus; grammatical subjects are plural and EXPs singular, while the verb is third-person singular. It is conceivable, therefore, that the participants have not kept track of the number morphology marked on the verb by the time a grammatical subject with mismatched agreement appears. Although rare, even some originally non-bilingual NSs do not judge this kind of mismatch as totally unacceptable, which could point to an area of grammatical variability in monolingual Spanish that could be particularly susceptible to influence in a context of language contact and eventually result in language change. Once again, all these factors seem to point to a linguistic change in US Heritage Spanish in California, whereby an invariable third-person singular verbal form is used with both singular and plural grammatical subjects, at least in psych verb constructions.

Conclusions
The overarching research questions underlying our study concerned whether psych verb constructions display variability in Spanish as a native language (question 1), which would require us to assess both monolingual and bilingual varieties and compare them (question 2). In the case of finding differences between early bilingual native speakers of Spanish (HSs) and non-early-bilingual Spanish native speakers (NSs), could these differences be attributed to English influence (question 3), and if so, (question 4) how does it affect native English speakers that are advanced learners of Spanish as an L2 (late bilinguals, or L2s)? We set out to test the basic hypothesis that these three populations of speakers would behave differently from each other, by testing acceptability judgment of psych verb constructions in stimuli that varied along several syntactic, pragmatic, or semantic features. Moreover, by following observations from our own previous research and that of others (for instance Toribio and Nye 2006; De Prada Pérez and Pascual y Cabo 2011), we specifically wanted to demonstrate that an invariable third-person singular verbal form is acceptable, in psych verb constructions, to Heritage Spanish speakers from California, even with plural grammatical subjects. In Spanish, the semantic subjects of these constructions are typically but not exclusively pre-verbal EXPs cast as oblique arguments in reverse/inverse predicates, such as in me gustan los buhos 'I like owls'. The same predicates are typically direct constructions in English, as the translation of the previous example shows. Bilingual heritage speakers often end up being English-dominant adults, after going through a very assimilationist school system that penalizes rather than supports their skills in the minoritized language.
Heritage Spanish speakers regularly report the demeaning and bullying behavior they were subjected to in school (both by educators and peers), while growing up in a wider monolingual English environment where diglossic bilingualism is criticized rather than admired.
We chose a population of HS speakers that showed a high degree of engagement with Spanish despite their past experiences, since they had chosen to study Spanish as a major or minor at a 4-year R1 institution. They were thus also educated bilingual speakers of both English and Spanish. Our aim was, in fact, to show that even highly proficient Heritage Spanish bilinguals exposed to standard varieties of Spanish at school and university are much more tolerant toward a verbal paradigm that has fewer forms than either Iberian or Latin American Spanish. This tolerance could then be explained not as part of a "simplified" grammar used by less educated speakers, who were also less engaged with the Heritage Language and less exposed to some of the less frequent constructions in Heritage Spanish, compared to standard Spanish, but rather as belonging to a different variety of Spanish altogether. As such, HS grammar has evolved independently of standard Spanish dialects, and its verbal paradigm, at least for psych verbs, has dispensed with a distinction between singular and plural, not unlike Early Modern English in comparison with Middle English, or Germanic languages from Mainland Scandinavia in comparison with Old Norse.
Our results provide solid evidence that the innovation is well established in California Heritage Spanish, since even highly educated HS speakers familiar with standard Spanish clearly show that this invariant verb form is acceptable in their dialect. The change may have been indirectly caused by English transfer, perhaps simply because in HS psych verb constructions, speakers may establish verbal agreement with the preverbal EXP, which would be prescriptively correct for equivalent direct constructions in English, as the EXP (sem) also corresponds to the grammatical subject (gram) in English. However, if an influence of English channeled through transfer via EXP-as-GRAM were to blame for HSs' acceptance of the relevant ungrammatical stimuli, it would raise the question of why L2s do not seem to be affected by English transfer in this regard, considering that they pattern with NSs in finding the relevant ungrammatical stimuli unacceptable. The fact that the Heritage Spanish speakers' linguistic behavior in this respect differs from both NSs and L2s also detracts from the theories that consider HS features as resulting from incomplete acquisition, attrition, or limited proficiency. If that were the case, surely HS speakers would perform "better" than even advanced L2 learners . . . . If anything, advanced L2s seem to adhere to more prescriptive usage, which is unsurprising, given their formal mode of acquisition of the language. In turn, the simplification of the verbal paradigm of psych verbal predicates in this variety is to be considered as a fully fledged language change that has exploited some variability in monolingual native speakers' language (as suggested by De Prada Pérez and Pascual y Cabo 2011). Our data on these verbal predicates showed that even non-bilingual native speakers of Spanish have some tolerance toward variability in this domain.
That this invariable form, third-person singular even with a plural grammatical subject (*me gusta los buhos 'I like owls'), could then spread to the rest of the verbal system seems unlikely given the different syntactic and semantic features of direct constructions. It would certainly be overly teleological for us to speculate on the evolution of this change. Considering these premises, this study has shown the advantages of the application of fine-grained, Bayesian analysis to linguistic study, and we hope it consolidates the view that Heritage Spanish is not a form of incompletely acquired or attrited Spanish, but rather a separate variety with different historical, sociolinguistic, and/or regional origins from the forms of monolingual Spanish it is usually compared with (Parodi 2011; Silva-Corvalán 2012; Kupisch and Rothman 2018).

Dear participant
Thank you very much for filling out this little questionnaire for which ALL of you will receive an extra credit. The questions require you to evaluate a set of expressions in Spanish. Please take your time to read these instructions (page 1), fill out the information on page 2, and then proceed to evaluate the 64 sentences.
The procedure will be as follows: You will receive a spreadsheet that looks like this:

SENTENCE | JUDGM
A María nos gusta la comida mexicana |
Los paisajes de montaña son los más bonitos |

Please proceed as follows:

1. Please read the sentence in the left column.

2. Then ask yourself if the sentence seems Spanish-sounding or not. If someone you were speaking to were to use this expression, would s/he sound like a native speaker? Or would the sentence seem strange or unnatural to a native speaker no matter how it was pronounced? Your task is to tell me how Spanish-sounding each sentence is using the following scale. Please write the appropriate number into the cell next to the sentence. Once you've given a response for a sentence, please don't go back and reconsider. In the unlikely event that you are not able to evaluate the sentence, just write a question mark in the cell on the right. Whenever you give responses smaller than 0 (i.e., closer to unnatural Spanish), please circle that part of the expression you consider responsible for its strangeness. As an example, let us consider the examples given above. For those, your answers could look somewhat like this:

SENTENCE | JUDGM
A María nos gusta la comida mexicana | -3
Los paisajes de montaña son los más bonitos | 3

The following is essential: Please do not take into consideration what you believe I would like to read; doing so would jeopardize the whole experiment. There are absolutely no right or wrong answers, and it would be pointless to try to guess the purpose of this test since it is only a pilot study for a larger project. I am only interested in your spontaneous evaluation; the more you try to see through or manipulate the experiment, the more you jeopardize the whole evaluation.

3. Please repeat steps 1 and 2 for the remaining sentences of the questionnaire.
Before we start, some other comments:
• the experiment is completely anonymous: your answers need not and cannot be traced back to you;
• unfortunately, I cannot tell you the purpose of this experiment because I am interested in what your spontaneous answers reveal about the Spanish language; if I told you the purpose now, this knowledge would distort the results. Once you are finished with the questions, however, I would be happy to tell you the purpose of the experiment (if you are interested);
• if you have any questions, please do not hesitate to ask me immediately.

SENTENCE | JUDGM
Francisco va a la escuela todos los días |
Los sueldos bajos de los maestros te preocupan a ti |
No te enojes con Javier, no lo hizo a propósito |

[Figures: Model-random-subjects; Model-random-verbs]