7. Results and Discussion
Table 3 shows the distribution of the variants under scrutiny in this study: 24.6% of overt SPPs and 75.4% of null subjects.
In answering our first research question, the overall Tucson newspapers’ overt SPP rate is very close to the 20.2% attested for contemporary Tucson Spanish (
Anderson 2013) and slightly higher than the 17.8% reported for contemporary Phoenix Spanish (
Cerrón-Palomino 2016). In other words, there has only been a small fluctuation of the overt variant distribution across time.
To answer our second research question, we used Rbrul to conduct a mixed-effects multiple regression analysis of the linguistic factors constraining the presence of overt SPPs in our data; the results are presented in
Table 4.
As we can see, of the six factor groups mentioned in
Section 4, only three obtained statistical significance: grammatical person and number, lexeme, and reference, whereas verb class, reflexivity, and ambiguity of the TAM ending were disregarded in the Rbrul runs.
First, we will refer to the factor groups that were shown to have a statistical effect in predicting the occurrence of overt SPPs. As in almost every SPE study, grammatical person and number, as well as reference, turned out to be factor groups decisively affecting the preference for overt SPPs in Spanish. Lexeme, on the other hand, has not been studied as much because mixed-model statistical packages were not widely used in previous decades.
With a robust range of 44, grammatical person and number was the strongest factor group conditioning overt SPP use in our data. In order to preserve orthogonality in the results and to avoid possible interactions detected in crosstabulations, tú + usted were collapsed, as were ustedes + vosotros. Within the factor group, tú + usted was the strongest predictor of the variant at issue with a probability of 0.75, followed by yo with 0.59, whereas ustedes + vosotros exerted little to no influence with a factor weight of 0.51. As usual, the plural forms nosotros/as and ellos/ellas disfavored overt SPPs, and él/ella, which only rarely promotes pronominal subjects, patterned with the plural forms. In general, the factors behave similarly to what has been found in the literature. The random factor group of lexeme is ranked second with a range of 34, reflecting the considerable lexical variation involved in the choice between overt SPPs and null subjects, which we will address in more detail later. The third factor group that achieved statistical significance, reference, has a rather small range compared to the strongest one: 14. As expected, switching the reference of a subject favors the occurrence of a pronominal subject (0.57), as opposed to maintaining the same referential subject across verbs, which promotes null subjects (0.43). This small effect, despite the statistical significance achieved, is likely due to the offline nature of journalistic genres, which makes them less sensitive to pragmatic features than online speech.
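The range figures reported here follow from the factor weights by simple arithmetic. The following is a minimal sketch in Python, not Rbrul output. It assumes that the range is the difference between the highest and lowest factor weight in a group multiplied by 100, and that a factor weight is a probability whose log-odds counterpart is obtained with the logit function. The function names are ours and purely illustrative.

```python
import math


def factor_range(weights):
    """Range of a factor group: (highest - lowest factor weight) x 100, rounded."""
    weights = list(weights)
    return round((max(weights) - min(weights)) * 100)


def weight_to_log_odds(weight):
    """Convert a factor weight (strictly between 0 and 1) to log-odds."""
    return math.log(weight / (1 - weight))


# Reference factor group as reported in the text: switch 0.57, same 0.43.
reference = {"switch reference": 0.57, "same reference": 0.43}
print("range:", factor_range(reference.values()))

# Weights above 0.5 map to positive log-odds (favoring overt SPPs),
# weights below 0.5 to negative log-odds (favoring null subjects).
for name, weight in reference.items():
    print(name, round(weight_to_log_odds(weight), 3))
```

Under these assumptions, the two reference weights yield the range of 14 reported for that factor group, and a weight of exactly 0.5 corresponds to log-odds of zero, that is, no effect in either direction.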
Now we will discuss the linguistic factor groups rendered not significant in the regressions. Verb class, a factor group that has been relatively stable across studies as affecting SPE choice, was discarded in the analysis. This result is directly related to analyzing verb lexeme in the same run: in regressions not including the random factor group, verb class was selected as significant, with copulas as the strongest predictors of overt SPPs.
Table 5 offers a more detailed look at the results of the lexeme factor group. Out of the 531 different verb lexemes attested in the corpus, this table only includes a rather short list of them, i.e., those with 20 occurrences or more. As can be seen, the strongest predictor of the pronominal variant is
ser, the verb with the highest number of occurrences (137). We then attest a decrease of significance until none is achieved, at around 30 occurrences, with the exception of verbs
estar (75) and
ver (49).
These results give us a clearer pattern of the verb constraint: the more frequent verbs favor more occurrences of pronominal subjects, and the least frequent ones promote null subjects, corroborating the findings of
Orozco (
2016),
Orozco and Orozco (
2022), and
Del Carpio (
forthcoming). For instance,
estar, another copulative verb, does not favor overt SPPs in the present study, despite being relatively frequent. It seems evident, then, that the first regressions we ran without lexeme as a random factor rendered copulas as the strongest predictor of the factor group mainly because of
ser, which accounts for 60.5% of the tokens of the category.
However,
Travis and Torres Cacoullos (
2021) have argued that frequent verb-form combinations interact with semantic verb classes. They claim, in particular, that highly frequent cognition verbs (‘psychological’, in this paper) like
creer affect SPE in Spanish in the first person singular, e.g., (
yo)
creo, but not across different grammatical person/number combinations. In this respect, Travis and Torres Cacoullos claim that frequent sequences like
yo creo must be interpreted as chunks rather than as selected combinations. They substantiate their claim by showing that (
yo)
creo represents 84% of the occurrences of its lexical paradigm, 27% of all the instances of overt
yo +
verb, and 14% of all instances of verbs inflected for the first person singular, both overt and null.
In order to test the validity of this claim in our data, we followed the methodology used by
Travis and Torres Cacoullos (
2021) to obtain the overall token frequency and the relative frequency of all the instances of the (
yo)
+ verb combinations of the verbs displayed in
Table 5, whose recurrence favored overt SPPs in the data analyzed. The distribution of said combinations across the data studied is shown in
Table 6.
As we can see in the second column, (yo) creo represents 50% of the occurrences of its lexical type, and (yo) sé comprises 37.1% of all saber inflected forms. Unlike the findings of Travis and Torres Cacoullos, where (yo) creo constituted 84% of the occurrences, it does not seem that either cognition verb had achieved the status of a particular unit at the time, although (yo) creo shows the highest proportions of any first person singular verb within its respective paradigm. The other frequent non-psychological verbs, whose rates go from 15.7% to 37.2%, do not seem to be experiencing chunking either.
The fourth column shows that the strings yo creo and yo sé each account for 5.5% of all the instances of yo + verb, strikingly lower than the 27% found by Travis and Torres Cacoullos for the former string, with yo soy being the most frequent yo + verb sequence at 17%. When all instances of verbs inflected in first person singular are taken into account (fifth column), yo creo and yo sé each represent only 1.8% of all occurrences, considerably lower than the 14% attested by the aforementioned authors for the former sequence, with yo soy being once more the most recurrent string at 5.7%. The aforementioned numbers thus suggest that yo creo and yo sé were not treated as chunks in the Spanish of the Tucson newspapers, either because the process of chunking was still incipient or because it was not a part of formal written styles.
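The three proportions discussed above can be sketched as follows. This is a minimal illustration in Python based on the description of the measures given in the text: the share of a first person singular form within its lexical paradigm, its share among all overt yo + verb strings, and its share among all first person singular verbs, overt and null. The `Token` record, the function name, and the toy data are hypothetical and do not reproduce the corpus counts behind Table 6.

```python
from collections import namedtuple

# Hypothetical token record: verb lexeme, person/number, overt pronoun or not.
Token = namedtuple("Token", ["lexeme", "person", "overt"])


def chunk_measures(tokens, lexeme, person="1sg"):
    """Return three percentages for one lexeme + person combination."""
    of_lexeme = [t for t in tokens if t.lexeme == lexeme]
    of_person = [t for t in tokens if t.person == person]
    overt_person = [t for t in of_person if t.overt]
    target = [t for t in of_lexeme if t.person == person]
    target_overt = [t for t in target if t.overt]

    def pct(part, whole):
        # Guard against empty denominators in small samples.
        return round(100 * len(part) / len(whole), 1) if whole else 0.0

    return {
        "share_of_paradigm": pct(target, of_lexeme),
        "share_of_overt_pronoun_plus_verb": pct(target_overt, overt_person),
        "share_of_all_person_tokens": pct(target, of_person),
    }


# Invented toy data, NOT the counts from the newspaper corpus.
toy = [
    Token("creer", "1sg", True),
    Token("creer", "1sg", False),
    Token("creer", "3sg", False),
    Token("creer", "3pl", False),
    Token("ser", "1sg", True),
    Token("ser", "1sg", True),
    Token("ser", "3sg", False),
    Token("ver", "1sg", False),
]
print(chunk_measures(toy, "creer"))
```

Keeping the three denominators separate makes clear that a form can dominate its own paradigm while still being a small share of all first person singular tokens, which is the contrast the discussion of (yo) creo turns on.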
Another factor group that was discarded in the regressions was reflexivity, which traditionally has not been a robust factor group across studies. Its effect was perhaps weakened further by the offline setting of the data, where the extra referential subject clue conveyed by the reflexive pronoun was not deemed redundant enough to favor null subjects over overt SPPs in a decisive way. Although 24.9% of non-reflexive verbs occur with overt SPPs, compared with 20.3% of reflexive verbs, in line with the trend in previous studies, the difference in the distribution of the variants was too small to achieve statistical significance.
By the same token, we hypothesize that the lack of an online setting, where phonetic features can be crucial in aiding or hampering sentence processing, prevented the journals’ staff from resorting to the use of overt SPPs to clarify the referent of the subjects of verbs with ambiguous TAM endings.
In sum, and in answering our second research question, the linguistic factor groups that have proven to be most robust across the literature are also significant in these diachronic data: grammatical person and number, and reference. On the other hand, the factor groups that were not consistently significant in the literature, namely reflexivity and ambiguity of the TAM ending, were discarded in our regression analysis. In addition, a factor group somewhat stable in constraining SPE, the semantic class of the verb, was deemed non-significant only when included in the same regression with verb lexeme as a random factor.
Now, to answer our third and fourth research questions, we move on to the social factors analyzed in this study.
Table 7 displays the results of the regression analysis of three external factor groups, two of which were statistically significant: newspaper genre and the continuous variable of the year of publication, whereas specific journal was rendered not significant.
The predictors of overt SPPs in the genre factor group are advertising (0.69), letters (0.57), short stories (0.56), and editorials (0.55), with the rest of the factors favoring null subjects. Regarding the year of publication, overt SPPs become more frequent as time progresses. This prima facie supports the hypothesis that language contact favors the emergence of English patterns in Tucson Spanish as the territory is gradually anglicized, and it answers our fourth research question in the affirmative.
As for the factor group that turned out to be non-significant, the specific journal from which the tokens were extracted, our results rule out possible hidden factors that could cause the overt SPP rate to spike in one way or another, such as the Spanish proficiency of the staff, their bilingual status (or lack thereof), Spanish instruction, and so on. In general, the results of the social factors regressions are consistent with the hypotheses presented in
Section 6.
The answer to our fourth research question takes us to the matter of whether there is an ongoing change in overt SPP rate in the Spanish of Tucson that started with the anglicization of the area after the Gadsden Purchase. If this were true, the overt SPP rate in contemporary varieties of Tucsonan Spanish should be higher than the 24.6% found in the early newspapers’ data, in a continuing linguistic change. However, as mentioned earlier, the rate found by
Anderson (
2013) was lower: 20.2%, which goes against the trend found in the present study. How can we account, then, for this seeming contradiction?
A plausible explanation of the chronological increase of overt SPPs in the data analyzed is that there is a surge of switch reference contexts across time favoring the rise of overt pronominal subjects. In order to explore this hypothesis, we performed independent crosstabulations of overt SPP/null subjects by reference and time period for the three grammatical person + number combinations showing probabilities of 0.5 or higher in
Table 4: second person singular, first person singular, and second person plural.
Table 8 and
Table 9 display the results for first person singular subjects across time in switch reference and same reference contexts, respectively.
Table 8 shows that switch reference contexts increase over time (from 132 to 239), but the overt SPP rate decreases in said contexts from 46.2% to 31.8%, albeit not in a statistically significant way (
p ≤ 0.301). However, the number of overt SPPs shows an increase from 61 to 76, which goes along the lines of the hypothesis.
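The divergence between counts and rates described here can be checked directly from the figures reported in the text for Table 8. The following is a minimal sketch in Python; the function name and the period labels are ours, since the passage does not name the two periods.

```python
def overt_rate(overt, total):
    """Percentage of overt SPPs among all tokens, to one decimal place."""
    return round(100 * overt / total, 1)


# First person singular, switch reference contexts, as reported for Table 8.
early = {"overt": 61, "total": 132}
late = {"overt": 76, "total": 239}

for label, period in (("earlier period", early), ("later period", late)):
    rate = overt_rate(period["overt"], period["total"])
    print(label, "overt tokens:", period["overt"], "rate:", rate)
```

Because the number of switch reference contexts grows faster than the number of overt pronouns, the raw count of overt SPPs rises while their rate falls.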
Similarly, in
Table 9 we see that same reference contexts also increase over time, from 73 to 103. Analogous to the switch reference contexts, the overt SPP rate decreases in coreferential contexts from 33.8% to 16.5%, also lacking statistical significance (
p ≤ 0.017). However, in contrast to the number increase of overt SPPs experienced across time in switch reference contexts, same reference contexts show a decrease from 22 to 14 cases. Omitted here for reasons of space, the results for second person singular also exhibit a switch reference increase across time, albeit with a percentage and number decrease of overt SPPs, whereas the switch reference contexts decrease as well as the rates and number of overt SPPs for second person plural.
In sum, we do not attest in our data a surge of switch reference contexts across time that promotes the rise of overt pronominal subject rates. However, an increase of yo pronoun occurrences across time does take place, although this increase is not statistically significant and is not reflected in a rate increase.
Another possible explanation to explore for the chronological overt SPP increment displayed in
Table 7 is that the different journalistic genres sway the overt SPP rate, on the one hand, and disengage the variable from oral speech patterns, on the other. In order to investigate whether the overt SPP rate spike across time is due to the array of data across the newspaper genres,
Table 10 displays the number of tokens distributed by journalistic genre in each newspaper’s period and the rate of overt SPPs for each genre. For instance, if there were an increase of tokens from the genres that favor overt SPPs the most, i.e., advertising, letters, short stories, and editorials, it could explain the overt variant’s surge over time.
As we can see, three of the four favoring genres show a token increase, with only letters experiencing a decrease, from 31 to 9. In contrast, advertising contexts increase from 48 to 117, short stories increase from 281 to 627, while editorials increase from 42 to 72. However, the overall increase of tokens in these genres does not translate into an overt SPP rate increase: the latter increases only in editorials, from 19% to 33.3%, but decreases in the other three favoring genres. Conversely, the overt SPP rate also increases over time in genres not favorable for pronominal subjects, such as miscellaneous (11% to 22.2%) and essays (12.1% to 25%).
In summary, the data distribution by genre in the two periods does not seem to have skewed the results in favor of the overt variant.
However, it is possible that it is not the journalistic genres, but rather the more general offline nature of written texts, that causes overt SPP patterns to differ, even if slightly, from those of spoken Spanish. In fact, in written texts, authors have the choice of correcting and amending their own production for different purposes before its publication, such as the avoidance of redundancies or the enhancement of emphasis. Along these lines of reasoning, the low range of the referential connection factor group shown in
Table 4 seems to suggest that written Spanish differs from oral Spanish with regard to pragmatic features such as switch reference, as illustrated in (10), where overt SPP
ella, preceded by two coreferential null subjects, would be more likely to be interpreted as referring to someone other than
Mrs. Diamond in most varieties of spoken Spanish.
(10) | Mrs. Diamond, quien lloraba, con el pelo descompuesto y fumando cigarrillo tras cigarrillo, gritaba que ella era inocente del crimen. (El Tucsonense, News, 19 December 1931) |
| ‘Mrs. Diamond, who was crying, with messy hair and smoking one cigarette after another, was yelling that she was innocent of the crime’. |
By the same token, the impact of an ambiguous TAM ending and non-reflexive verbs, closely related to online production, seems to be neutralized in written output, where authors can edit an utterance at will, and the subject’s referential retrieval does not hamper the structure’s parsing, as seen in (11).
(11) | No descansan ni pueden dormir tranquilos estos bellos ejemplares de indiscreción, hasta no saber, con buenos datos, cuáles son los recursos con que cuenta Fulano para pasar la vida; averiguan cuántos matrimonios están en paz y cuántos riñen como perros y gatos; pueden decir cuáles comerciantes andan mal en sus negocios y si están próximos a declararse en quiebra; de las demás pueden revelar ocultos secretos tocantes a la legítima propiedad del color del pelo y rostro y si son naturales los lunares que tanto le agracian; son capaces, en fin, de investigar minuciosamente la vida y milagros de todo bicho viviente que cae bajo la jurisdicción de su impertinente curiosidad. (El Fronterizo, Essay, 4 October 1891) |
| ‘These pretty exemplars of indiscretion can neither rest nor sleep calmly until it is found out, from good sources, what resources John Doe has at his disposal to live his life; (they) find out how many couples live peacefully and how many fight like cats and dogs; (they) can point out what businesses are going through a rough time and whether or not they are on the verge of bankruptcy; regarding the rest, (they) can reveal hidden secrets related to the legitimacy of their hair and face color, and whether the beauty marks that embellish them are natural or not; (they) are capable of investigating meticulously the life of every living creature that falls under the jurisdiction of their impertinent curiosity’. |
In (11), the four main verbs following the postverbal subject estos bellos ejemplares have an ambiguous TAM and are also non-reflexive, and despite having numerous clauses with different subject referents intervening between them, they all have null subjects. It is possible, however, that the last three verbs are instances of null subject priming, which was not studied in this paper due to the relative shortness of many texts, particularly advertisements, the most favorable journalistic genre for overt SPP use.
Conversely, in (12), we attest verbs with non-ambiguous TAMs like
discutimos,
amamos, and
planteamos which, in addition, are inflected in the plural, a non-favorable context for overt SPPs, following each an instance of the overt pronoun
nosotros.
(12) | ¿Habrá alguno de los amigos y camaradas que me escuchan que crean, después de la forma demócrata que nosotros discutimos nuestros problemas, que diga que nosotros amamos ser dictadores? ¿Habrá alguien de los presentes que crea que nosotros planteamos conspiraciones o violencias? Mil veces “no”. (El Tucsonense, News, 23 June 1936) |
| ‘Is there anyone among the friends and comrades listening to me who believes, after the democratic way in which we discuss our problems, who says that we love being dictators? Is there anyone among those present who believes that we advocate for conspiracies or violence? A thousand times “no”’. |
One could argue that it is a case of overt SPP priming, although there is only one clause intervening between the first and second verb and one clause between the second and third verb, as opposed to the multiple intervening clauses that favor priming (
Travis 2007;
Ramos 2016). Instead, it appears to be a case of stylistic emphasis not commonly found, for instance, in oral sociolinguistic interviews.
Another example of these seeming stylistic licenses can be seen in (13), where the general spoken Spanish patterns are not being followed.
(13) | Antes les dirigió un discurso conmovedor, diciéndoles que los perdonaba, y que moría por Dios y por la patria. Varios se conmovieron, uno de los soldados, vencido por la emoción, arrojó el rifle, diciendo: “Yo no tiro, joven, yo pienso como usted, yo soy católico”. (El Tucsonense, Short Story, 22 March 1928) |
In (13) we could assume there is a priming effect on the overt variant. However, the probability of overt SPPs priming when there are non-coreferential verbs between the trigger and the target is higher than when there are zero intervening non-coreferential verbs between them (
Travis 2007;
Ramos 2016), which would make this one an infrequent case of priming.
More to the point, nearly all SPE studies rely on oral data collected through sociolinguistic interviews, with
Ramos (
2016) being the only exception that we are aware of. Ramos analyzed SPE for the first person singular in three Spanish books covering the Old and Middle Spanish stages:
El Conde Lucanor (1335),
La Celestina (1499), and
El Lazarillo de Tormes (1554). Interestingly, the
yo rates found in the three texts are not constant: 28%, 15%, and 19%, respectively. Ramos offers no explanation as to why this difference is attested, but he does state that no changing pattern is evident over time: overt SPP use does not increase or decrease consistently as time progresses. It seems, then, that overt SPP rates are in a league of their own in written Spanish. When we compare the
yo rates of said texts and the one found in the present study, 33.3%, we corroborate that no discernible pattern can be found.
In sum, the results of this section show that overt SPP rates in historical newspapers do not differ substantially from those of contemporary spoken varieties of Arizonan Spanish. In addition, SPE in our data was conditioned essentially by the same factor groups shaping the variable in contemporary spoken varieties of Spanish. However, the offline nature of written data seems to weaken and even neutralize constraints that operate more vigorously in oral, online data. In particular, the effect of the switch reference constraint was considerably weaker than in most studies. Additionally, the constraints of ambiguous TAM and non-reflexive verbs showed no impact on SPE in historical newspapers.
8. Conclusions
This paper was motivated by four research questions: (1) How does the overt SPP rate in the 19th and 20th Century written Spanish of Tucson compare to those in contemporary spoken Arizonan Spanish? (2) Do linguistic factors conditioning variable SPE in the 19th and 20th Century written Spanish of Tucson differ from those in contemporary varieties? (3) What social factors favor overt SPPs in the 19th and 20th Century written Spanish of Tucson? and (4) Does the rate of overt SPPs increase over time as bilingualism increases in Tucson?
In answering our first research question, at 24.6% of overt SPPs, the Tucson newspapers’ data show a percentage similar to spoken contemporary Tucson Spanish (20.2%) and Phoenix Spanish (17.8%), albeit slightly higher.
To answer our second research question, we used Rbrul to conduct a mixed-effects multiple regression analysis of the internal factors constraining the presence of overt SPPs in our data. Only three out of the six factor groups analyzed turned out to be statistically significant: grammatical person and number, lexeme, and reference. Verb class, reflexivity, and ambiguity of the TAM ending were discarded as non-significant. These results are in line with those found in the vast literature on Spanish SPE: grammatical person/number and reference have been constant in conditioning SPE, whereas reflexivity and ambiguity of TAM have not. Nonetheless, verb class was discarded when included in the same regressions with the random factor of lexeme, which turned out to be significant. Following
Travis and Torres Cacoullos (
2021), we investigated whether there was a psychological/cognition verb interaction with first person singular tokens only. However, our results show no evidence of such interplay, either because of the written nature of the data or because this relationship was in its early stages and was not, therefore, echoed in written Spanish.
Answering our third and fourth research questions required a regression analysis of the external factor groups hypothesized to influence SPE in our data. Two out of the three factor groups analyzed were statistically significant: journalistic genre and the continuous variable of the year of publication, whilst specific journal was not significant.
In accordance with the language functions predominant in them, advertising, letters, short stories, and editorials were the genres favoring overt pronominal subjects, whereas news, miscellaneous, and essays favored null subjects.
With respect to the continuous factor group of the year of publication, there was a positive correlation between the year the issues were published and the use of overt SPPs: the latter increased over time. This result, in principle, supports the convergence hypothesis by which as contact with English increased, the Spanish of Tucson received indirect interference from it, manifested, in the concrete case of SPE, in an increment of overt SPP rates.
However, as mentioned before, a study of contemporary Tucson Spanish (
Anderson 2013) found a lower rate of overt SPPs: 20.2%, which goes against the aforementioned hypothesis. We conjectured that this discrepancy could be due to one of three aspects of the data: an increase of switch reference contexts over time, a surge of the journalistic genres that favor overt SPPs, or simply, the offline nature of the texts analyzed. We performed crosstabulations to investigate whether there was an increase of switch reference contexts entailing an overt SPP spike in the person/number combinations favoring the overt variant, but there was no rate surge despite the context increase. With respect to a possible rise across time of the journalistic genres favoring overt SPPs triggering a surge of the overt variant, crosstabulations showed that only the genre of editorials experienced a rate increase over time. Given these results, the most feasible reason for the slight overt variant increase is the offline nature of written data, which often bypasses online constraints, as attested in the weakening of switch reference and the neutralization of ambiguous TAM and non-reflexive verbs.
In other words: our data do not present evidence that the initial stages of increasing language contact introduced English patterns in the SPE of Tucson Spanish. Instead, the factor groups that condition the variable in contemporary spoken Arizonan Spanish were already operating in the newspaper varieties in the period studied, with the aforementioned characteristics that distinguish written output from oral production.
For further research, it would be ideal to study variable SPE in Mexican newspapers of the same period, such as El Heraldo Mexicano (1895–1915), to compare rates, factor group rankings, and constraint rankings with those obtained in this paper. In addition, we would like to study contemporary Mexican newspapers to test the hypothesis that written Spanish may allow a higher rate of overt SPPs, by comparing it to its corresponding spoken Spanish rate.