Enhancing Code-Switching Research Through Comparable Corpora: Introducing the El Paso Bilingual Corpus
Abstract
1. Introduction
2. Datasets of Spontaneous Spanish–English Bilingual Speech: State of the Art
2.1. An Overview
2.2. Three Key Challenges
3. Presentation of the El Paso Bilingual Corpus
3.1. Data Collection
3.1.1. Participant Recruitment
3.1.2. Recording Session
3.1.3. Sociolinguistic Background Questionnaire
- (a) How well do you feel you can speak [Spanish/English]?
- (b) How well do you feel you can read [Spanish/English]?
- (c) How well do you feel you can write [Spanish/English]?
- (d) How well do you feel you can understand spoken [Spanish/English]?
3.1.4. Ethical Considerations
3.2. Description of the Data
(1) | DN7658: | Sí nos falta un minutito un minutito |
‘Yeah we need one more minute, one little minute’ | ||
KV9880: | Vamos a decirles un secreto vamos va chiquito | |
‘We’re gonna tell them a secret, let’s tell them a little one’ | ||
(El Paso Bilingual Corpus, Dominguez08) | ||
(2) | IB6767: | Dijo que ahorita como a los cuarenta y cinco regresaba |
‘She said that now like at forty five she would return’ | ||
GZ5538: | En efecto whatever | |
‘Actually’ | ||
IB6767: | Yo digo que le dejemos cinco minutitos más | |
‘I say we leave it about five minutes more’ | ||
(El Paso Bilingual Corpus, Dominguez36) | ||
(3) | GF4964: | sí sí sí hace tres años por como cuatro años era abusivo mi papa |
‘yeah yeah yeah three years ago for like four years he was abusive my dad’ | ||
VK3709: | [xxx] en Juarez | |
‘in Juarez’ | ||
GF4964: | Sí (.) Pero no no era físicamente nomas era (..) psicológicamente | |
‘yeah but it was not physically it was rather psychologically’ | ||
gritaba mucho güey yo tengo muchos problemas todavía de eso | ||
‘he yelled a lot, dude, I have many problems still because of that’ | ||
(El Paso Bilingual Corpus, Dominguez38) |
(4) | ZQ0416: | I normally last time los lavé (.) pero ya mira hasta me cayó ahí el café |
‘I washed them, but well look it even got the coffee’ | ||
que me escupiste | ||
‘that you spat on me’ | ||
FX7977: | it was by accident, ay todos tus [xxx] | |
‘ah all your’ | ||
ZQ0416: | that’s why I [//] viste? ahí tengo el café que me escupiste | |
‘you see? Here I have the coffee you spat on me’ | ||
FX7977: | It was an accident, I didn’t mean it | |
ZQ0416: | Los tengo que volver a lavar, tú cómo los lavas? | |
‘I have to wash them again, how do you wash them?’ | ||
FX7977: | qué? | |
‘what?’ | ||
ZQ0416: | los zapatos | |
‘your shoes’ | ||
FX7977: | I don’t mi mamá like she is like “pásame los zapatos” y yo “okay” | |
‘my mom’ ‘pass me the shoes’ and I’ | ||
(.) but she puts them in la lavadora | ||
‘the washing machine’ | ||
ZQ0416: | I’ve seen like different techniques he visto unos que dicen que | |
‘I’ve seen a few that say that’ | ||
pasta de dientes y luego otros que dicen like baking soda | ||
‘tooth paste and then others that say’ | ||
(El Paso Bilingual Corpus, Dominguez01) | ||
(5) | SU9854: | “Okay pues ya acabé con tu tire it’s gonna be [/] it’s usually forty but |
‘well I already finished your’ | ||
te cobro treinta cinco” and I was like “okay” y luego ya pues y le pagué | ||
‘I’ll charge you thirty-five’ ‘and then well and I paid him’ | ||
and then he was like [/] we went into like the little shop para darle | ||
‘to give him’ | ||
mi tarjeta y todo and then he’s like “tienes Facebook”? &=chuckles and | ||
‘my card and all’ ‘do you have Facebook?’ | ||
I was like “sí pero también tengo novio” y l(ueg)o dice “ay pues está bien” | ||
‘yes but I also have a boyfriend’ ‘and then he says’ ‘ah well that’s fine’ | ||
y luego le salía así como no Add Friend le salía no más Message | ||
‘and then it showed him like not’ ‘it showed just’ | ||
he’s already my friend | ||
KZ1980: | no sé | |
‘I don’t know’ | ||
SU9854: | does that mean [//] porque hazte cuenta que he gave me his phone | |
‘because note that’ | ||
in Facebook and then I typed there my name y luego I clicked | ||
‘and then’ | ||
on my profile y luego no salía así Add Friend or Remove Friend | ||
‘and then it didn’t show like’ | ||
na(da) más decía Message | ||
‘it just said’ | ||
(El Paso Bilingual Corpus, Dominguez06) |
(6) | DN7658: | Mejor no lo veo |
‘It’s better if I don’t see it’ | ||
KV9880: | Habrán visto algo | |
‘they will have seen something’ | ||
DN7658: | Speak in English | |
KV9880: | English? | |
DN7658: | Yes. I mean I don’t really know the study is about but I am assuming | |
that it is +... | ||
(El Paso Bilingual Corpus, Dominguez08) |
- (a) Date of the recording.
- (b) Duration of the recording.
- (c) Place of the recording.
- (d) Number of active participants and passive bystanders.
- (e) Relationship between the participants.
- (f) Socio-demographic information about the participants.
  - Gender and age.
  - Education level.
  - Current profession.
  - Languages spoken by the participant before the recording.
  - Languages spoken by the participant after the recording.
- (g) Field notes (i.e., any extra information that may be relevant for the data analysis).
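The metadata items listed above could be stored as one structured record per recording. The following is a minimal sketch in Python, not the format actually used for the corpus: all class names, field names, and units are hypothetical, and the values in the usage example are invented placeholders rather than corpus data.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Participant:
    """Socio-demographic information for one speaker (item f)."""
    speaker_id: str              # pseudonymized speaker code (hypothetical field)
    gender: str
    age: int
    education_level: str
    profession: str
    languages_before: list[str]  # languages spoken before the recording
    languages_after: list[str]   # languages spoken after the recording


@dataclass
class RecordingMetadata:
    """One record per recording session (items a-g)."""
    recorded_on: date            # (a)
    duration_minutes: float      # (b), unit is an assumption
    place: str                   # (c)
    active_participants: int     # (d)
    passive_bystanders: int      # (d)
    relationship: str            # (e)
    participants: list[Participant] = field(default_factory=list)  # (f)
    field_notes: str = ""        # (g)


# Invented placeholder values, for illustration only.
session = RecordingMetadata(
    recorded_on=date(2000, 1, 1),
    duration_minutes=30.0,
    place="placeholder location",
    active_participants=2,
    passive_bystanders=0,
    relationship="friends",
    participants=[
        Participant("SPEAKER_A", "F", 20, "university", "student",
                    ["Spanish"], ["English"]),
        Participant("SPEAKER_B", "M", 21, "university", "student",
                    ["English"], ["Spanish", "English"]),
    ],
)
```

Keeping the participant data in a nested list makes it easy to check that the number of participant records matches the count of active participants.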
3.3. Participant Profiles
3.4. Transcription Methods
Code | Explanation | Examples |
&= | To indicate a paralinguistic or extralinguistic aspect | &=cough, &=laugh, &=sneeze |
[xxx] | To indicate unintelligible words or sequences. | I wanted to [xxx] |
[ ] |
|
|
( ) | To indicate a consonant or syllable that has been elided and thus not pronounced. | I have (e)nough. ya (es)toy lista. |
<< >> | To indicate direct quotations. It is important to not use the symbol “ in ELAN, as it causes the program to crash. | He said << what are you going to do? >> |
[/] | Used when a speaker begins to say something, stops, and then repeats the earlier material without change. The material being retraced is enclosed in angle brackets. | <I wanted> [/] I wanted to invite him. |
[//] | Used when a speaker begins to say something, stops, and then starts a new phrase while maintaining the same idea. The material being retraced is enclosed in angle brackets. | <I wanted> [//] I think I wanted to invite him. |
[///] | Used when a speaker begins to say something, stops, and then says something else. The material being retraced is enclosed in angle brackets. | <I wanted> [///] Oh I forgot to tell you that the cat’s gone. |
(.) | To indicate short pauses in the speech unit. | I wanted to (.) invite him. |
(..) | To indicate longer pauses in the speech unit. | I wanted to invite him (..) but didn’t. |
+… | To indicate that the speech unit is incomplete, but not interrupted. The speaker has trailed off. | Smells good enough for +… |
+< | Used at the beginning of an utterance that overlaps with a previous utterance. | SP1: I wanted to invite him SP2: +< but you didn’t. |
+/ | Used at the end of an utterance that is incomplete because the speaker is interrupted. | SP1: I wanted to invite +/ SP2: +< Mommy! |
/+ | Used at the beginning of the utterance that completes a previously interrupted speech unit. | SP1: I wanted to invite +/ SP2: +< Mommy! SP1: /+ him, but didn’t. |
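For analyses that need plain word strings (e.g., word counts), the codes in the table above have to be stripped from the transcribed utterances. The following Python sketch shows one possible way to do this with regular expressions. It is an illustration only, not the processing pipeline used for the corpus: the function name `clean_utterance` is hypothetical, and the decisions to discard retraced material and to restore elided segments are assumptions.

```python
import re


def clean_utterance(text: str) -> str:
    """Return the utterance without transcription codes (hypothetical helper)."""
    # Drop retraced material together with its marker: <I wanted> [/] ...
    text = re.sub(r"<[^<>]*>\s*\[/{1,3}\]", " ", text)
    # Drop retrace markers that are not preceded by bracketed material.
    text = re.sub(r"\[/{1,3}\]", " ", text)
    # Drop paralinguistic or extralinguistic events such as &=laugh.
    text = re.sub(r"&=\w+", " ", text)
    # Drop unintelligible stretches.
    text = text.replace("[xxx]", " ")
    # Drop trailing-off, overlap, and interruption codes: +… (or +...), +<, +/, /+
    text = re.sub(r"\+(?:\u2026|\.\.\.)|\+<|\+/|/\+", " ", text)
    # Drop short and longer pause markers: (.) and (..)
    text = re.sub(r"\(\.{1,2}\)", " ", text)
    # Restore elided segments: (e)nough -> enough, (es)toy -> estoy
    text = re.sub(r"\(([^\W\d_]+)\)", r"\1", text)
    # Drop the direct-quotation delimiters << and >>
    text = text.replace("<<", " ").replace(">>", " ")
    # Collapse the whitespace left behind by the removals.
    return " ".join(text.split())


print(clean_utterance("<I wanted> [//] I think I wanted to invite him."))
print(clean_utterance("ya (es)toy lista (.) &=laugh"))
```

Retraced material is removed before the bare retrace markers so that repeated words are not counted twice. A study of disfluencies would instead want to keep that material.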
3.5. Future Endeavors for the Publication of the Corpus
4. Case Study: Diminutive Expressions in Spanish–English Bilingual Speech
(7) | VM4054: | Fuimos | a | la | school | juntos | ||
we.went | to | the.art.f[sg] | together6 | |||||
‘We went to school together’ | ||||||||
(El Paso Bilingual Corpus, Dominguez32) | ||||||||
(8) | DN7658: | ¿Qué | haces | gamble? | ||||
What | do.you | |||||||
‘What do you gamble?’ | ||||||||
KV9880: | Dinero | |||||||
‘money’ | ||||||||
(El Paso Bilingual Corpus, Dominguez17) |
4.1. The Diminutive Construction
(9) | Gusta | un | cafe-cit-o? | |
you.like | a | coffee.cn.m-dim.sx-m[sg] | ||
‘Would you like some coffee?’ | ||||
(Mendoza, 2005, p. 164) |
(10) | Tú | tienes | un | problem-it-a | también | ahí | |||
you | have | a | problem.cn.m:-dim.sx-m[sg] | too | here | ||||
‘You have a little problem there too’ | |||||||||
(Bangor Miami Corpus, Zeledon05) |
4.2. Productivity of the Diminutive Construction and Its Types
4.2.1. Formation Strategies
4.2.2. Paradigm of Diminutive Markers
(11) | MG4783: | tenía | la | cara | rendond-it-a | ||
she.had | the | face | round.adj-dim.sx-f[sg] | ||||
‘She had a roundish face’ | |||||||
(El Paso Bilingual Corpus, Dominguez07) |
(12) | TIM: | se | te | va a | llevar | tu | apartement-it-o | |
them | you | going to | take | your | apartment.cn.m-dim.sx-m[sg] | |||
‘it’s gonna take your little apartment with man’ | ||||||||
(Bangor Miami Corpus, Herring12) |
(13) | TH5643: | los | papell-ill-o-s | esos | que | dicen | |
the | paper.cn.m-dim.sx-m-pl | those | that | say | |||
‘those little papers that say’ | |||||||
(El Paso Bilingual Corpus, Dominguez13) |
(14) | VAN: | faltan | dos | minut-ic-o-s | |||
they.miss | two | minute.cn.m-dim.sx-m-pl | |||||
‘I need two little minutes’ | |||||||
(Bangor Miami Corpus, Herring13) |
(15) | TD6891: | Dimitri and Rueben were on that little ride thing-ie | ||
thing.cn.m-dim.sx-m[sg] | ||||
(El Paso Bilingual Corpus, Dominguez41) |
(16) | IB6767: | es | un | mero | comediante | famoso-ish | ||
he.is | a | mere | comedian | famous.adj-dim.sx | ||||
‘He is a mere famous-ish comedian’ | ||||||||
(El Paso Bilingual Corpus, Dominguez41) |
(17) | MAR: | all | become | little | fish-ie-s | ||
fish.cn-dim.sx-pl | |||||||
(Bangor Miami Corpus, María10) |
(18) | MAR: | we | don’t | like | the | red-d-ish | tone | one |
red.adj-dim.sx | ||||||||
(Bangor Miami Corpus, María16) |
(19) | AQ4720: | hay | un | chorro | de | mini-grupos | ||
there.are | a | bunch | of | dim.pref-group.cn.m[pl] | ||||
‘There are a bunch of minigroups’ | ||||||||
(El Paso Bilingual Corpus, Dominguez24) |
(20) | CHA: | he was just this | little | mini-guy | |||
little.dim.adj | dim.pref-guy.cn | ||||||
(Bangor Miami Corpus, Zeledon09) |
(21) | FX7977: | driving around | en | esos | little | karts | ||
in | these | little.dim.adj | karts.cn[pl] | |||||
‘driving around in these little karts’ | ||||||||
(El Paso Bilingual Corpus, Dominguez01) |
(22) | LUK: | that | little | girlfriend | of | mine | |
little.dim.adj | girlfriend.cn[sg] | ||||||
(Bangor Miami Corpus, Herring9) |
(23) | XF0788: | a family of ten would live in a | tiny | house | |
tiny.dim.adj | house.cn[sg] | ||||
(El Paso Bilingual Corpus, Dominguez01) |
(24) | PQ8810: | or | into | a | small | house | ||
small.dim.adj | house.cn[sg] | |||||||
(El Paso Bilingual Corpus, Dominguez28) |
(25) | LUK: | and a | small | square | at | that | |
small.dim.adj | square.cn[sg] | ||||||
(Bangor Miami Corpus, Herring9) |
(26) | GIL: | we put “just kidding” in really | tiny | letters | |||
tiny.dim.adj | letters.cn[pl] | ||||||
(Bangor Miami Corpus, Zeledon9) |
(27) | BS2736: | así como | un | espacio | chiquito | ||
like | a | space.cn | small.dim.adj:dim.sx | ||||
que también | lo | tiene | |||||
that also | it | has | |||||
‘like a small little space it also has’ | |||||||
(El Paso Bilingual Corpus, Dominguez04) |
(28) | AUD: | es | un | niño | chiquito | todavía | |
he.is | a | kid.cn.m[sg] | small.dim.adj:dim.sx.m[sg] | still | |||
‘he’s a little kid still’ | |||||||
(Bangor Miami Corpus, Herring9) |
(29) | AQ4720: | se hacen | tres | círculos | pequeños | ||
are.made | three | circle.cn.m[pl] | small.dim.adj.m[pl] | ||||
‘three small circles are made’ | |||||||
(El Paso Bilingual Corpus, Dominguez24) |
(30) | KEV: | esto | es | un | pequeño | ||
this | is | a | little.dim.adj.m[sg] | pocket.cn.m[sg] | |||
‘This is a little pocket’ | |||||||
(Bangor Miami Corpus, Sastre1) |
(31) | LIL: | sí | es | casi | igual | un chin | más |
yes | it.is | almost | equal | a bit.dim.phr | more.adv | ||
‘yes it is almost the same, a bit more…’ | |||||||
(Bangor Miami Corpus, Sastre5) |
(32) | EA3159: | me | siento | también | un poco | raro | |
me | I.feel | also | a bit.dim.phr | weird.adj.m[sg] | |||
‘I feel also a little weird’ | |||||||
(El Paso Bilingual Corpus, Dominguez15) |
(33) | CON: | la historia | que | ella | me | contó | es | ||||
the story | that | she | me | told | is | ||||||
un poquito | diferente | ||||||||||
a little.dim.adv:dim.sx | different.adj | ||||||||||
‘the story that she told me is a little bit different’ | |||||||||||
(Bangor Miami Corpus, Herring14) |
(34) | MAR: | it | is | a bit | chilly | right now | |
a bit.dim.phr | chilly.adj | ||||||
(Bangor Miami Corpus, María19) |
(35) | TD6891: | I’m | a little bit | deaf | y’all | |
a bit.dim.phr | deaf.adj | |||||
(El Paso Bilingual Corpus, Dominguez15) |
(36) | JM6290: | she gets | a little | jealous | |||
a little.dim.phr | jealous.adj | ||||||
(El Paso Bilingual Corpus, Dominguez03) |
(37) | FLA: | it’s getting | a little | more | carnoso | aquí | |
a little.dim.phr | more.adv | fleshy | |||||
‘it’s getting a little more fleshy around here’ | |||||||
(Bangor Miami Corpus, Zeledon08) |
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
1 | In the documentation file of the Bangor Miami Corpus, the number of words of text mentioned is 242,475. However, this number does not take into account the speech of one of their participants, María, who recorded multiple conversations (for more information on the collection process of the Bangor Miami Corpus, see Deuchar, 2008; or Deuchar et al., 2014b). |
2 | This Ph.D. project, titled Diminutive expressions in Spanish-English language contact settings. A multifactorial and multimethod account, has been carried out by the first author under the main supervision of Prof. Dr. Renata Enghels and co-supervision of Dr. M. Carmen Parafita Couto, co-authors of this paper, and Prof. Dr. Robert Hartsuiker. Dr. M. Carmen Parafita Couto was also involved in the creation of the Bangor Miami Corpus. |
3 | This research lab is led by Prof. Dr. Iva Ivanova, co-author of this paper. |
4 | As the transcription process has not been finalized yet, this number should be taken as a preliminary index which may still change in the future. |
5 | However, for one session, the recorder had run out of battery prior to the recording, as a result of which the conversation was recorded using the research assistant’s iPhone 13 Pro Max. Although the recording quality was not as high as that of the Marantz recorder, it turned out to be sufficiently high to allow transcription of the conversation. |
6 | The glossing of the examples has been carried out following the Leipzig Glossing Rules (https://www.eva.mpg.de/lingua/pdf/Glossing-Rules.pdf, accessed on 20 August 2021). |
7 | These data form a strong abstraction of the responses retrieved from the following survey request: Make a list below of five of the people you speak to most in your everyday life, either in person or on the phone (...). Then note which language(s) you mostly speak with that person. For interested researchers, the detailed responses to the questionnaire for Miami can be found on the Bangor Miami Corpus website. The questionnaire data for the El Paso Bilingual Corpus will be freely accessible once the corpus is published in an open-access repository. |
8 | It needs to be noted that lexicalized diminutive forms (e.g., zorrillo ‘skunk’, almohadilla ‘inkpad’) are not included in these datasets, as they have acquired a proper meaning and have become autonomous lexemes in the process. Moreover, diminutive forms of proper names (e.g., Carlitos) are also excluded. In the Bangor Miami Corpus, all names have been pseudonymized, as a result of which we cannot know whether the transcriber has copied the diminutive form of the original name that was expressed or used a diminutivized pseudonym. In other words, the transcriber may have employed the pseudonym Juanito in lieu of the original name, which could have been, for example, Carlitos or just as well Carlos. |
9 | Residuals measure the degree to which an observed frequency deviates from the expected value under the null hypothesis (Lowry, 2023). Standardized residuals, then, adjust these differences by scaling them with the standard deviation, thereby quantifying the magnitude of deviation in standard deviation units (Pardoe et al., 2018). A positive standardized residual indicates that the observed frequency exceeds the expected value, while a negative residual indicates that it is lower than expected (Lowry, 2023). Typically, a standardized residual with an absolute value greater than 2 is considered noteworthy, as it implies that the observed deviation is unlikely under the null hypothesis (Pardoe et al., 2018). |
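The computation described in note 9 can be sketched in Python as follows. This is an illustration under stated assumptions: it uses the Pearson form of the standardized residual, (observed - expected) / sqrt(expected), which may differ from the exact variant used in the study; the function name `pearson_residuals` is hypothetical; and the counts in the example are invented, not corpus frequencies.

```python
import math


def pearson_residuals(observed):
    """Pearson residuals for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    residuals = []
    for i, row in enumerate(observed):
        out = []
        for j, obs in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            out.append((obs - expected) / math.sqrt(expected))
        residuals.append(out)
    return residuals


# Invented 2x2 table, for illustration only.
table = [[30, 10],
         [20, 40]]
for row in pearson_residuals(table):
    print([round(r, 2) for r in row])
```

In this invented table, the cells in the first row have residuals with an absolute value above 2 and would be flagged as noteworthy, whereas those in the second row fall below that threshold.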
References
- Achugar, M., & Pessoa, S. (2009). Power and place: Language attitudes towards Spanish in a bilingual academic community in Southwest Texas. Spanish in Context, 6(2), 199–223.
- Amengual, M., Kim, J.-Y., & Davidson, J. (2025). Multilingual Hispanic Speech in California (MuHSiC) [Dataset]. University of California. Available online: https://muhsic.acad.ucsc.edu/ (accessed on 9 January 2025).
- Asociación de Academias de la Lengua Española. (2010). Diccionario de americanismos [online]. Available online: https://www.asale.org/damer/ (accessed on 12 January 2025).
- Audacity Team. (1999). Audacity(R): Free audio editor and recorder (Version 3.3.3) [Computer software]. Muse Group & Contributors. Available online: https://www.audacityteam.org/ (accessed on 14 January 2025).
- AyiConnect Staff. (2022, December 12). Latin America and Anglo America: What are their differences? AyiConnect. Available online: https://www.ayiconnection.com/blog/latin-america-and-anglo-america-what-are-their-differences#:~:text=The%20Hispanic%20culture%20is%20more,important%2C%20their%20goal%20is%20stability (accessed on 24 May 2024).
- Bagasheva-Koleva, M. (2013). Some correlates between diminutive words in Bulgarian, Russian and English. Bulgaria Research Papers, 51(1b), 138–147.
- Bailey, B. (2022). Social/interactional functions of code switching among Dominican Americans. Pragmatics. Quarterly Publication of the International Pragmatics Association (IPrA), 10, 165–193.
- Bakema, P., & Geeraerts, D. (2004). Diminution and augmentation. In G. Booij, C. Lehmann, J. Mugdan, & S. Skopeteas (Eds.), Morphology: An international handbook on inflection and word-formation (pp. 1045–1052). Walter de Gruyter.
- Balam, O. (2015). Code-switching and linguistic evolution: The case of ‘Hacer + V’ in Orange Walk, Northern Belize. Lengua y Migración/Language and Migration, 7(1), 83–109.
- Balam, O., Lakshmanan, U., & Parafita Couto, M. C. (2021). Gender assignment strategies among simultaneous Spanish/English bilingual children from Miami, Florida. Studies in Hispanic and Lusophone Linguistics, 14(2), 241–280.
- Balam, O., Parafita Couto, M. C., & Stadthagen-González, H. (2020). Bilingual verbs in three Spanish/English code-switching communities. International Journal of Bilingualism, 24(5–6), 952–967.
- Balam, O., Stadthagen-González, H., Rodríguez-González, E., & Parafita Couto, M. C. (2022). On the grammaticality of passivization in bilingual compound verbs. International Journal of Bilingualism, 27(4), 415–431.
- Beatty-Martínez, A. L., Parafita Couto, M. C., Ameka, F. K., & Aboh, E. O. (2025). Codeswitching. Reference Module in Social Sciences, 1–4.
- Beatty-Martínez, A. L., Valdés Kroff, J., & Dussias, P. E. (2018). From the field to the lab: A converging methods approach to the study of codeswitching. Languages, 3(2), 19.
- Bellamy, K., Child, M. W., González, P., Muntendam, A., & Parafita Couto, M. C. (Eds.). (2017). Multidisciplinary approaches to bilingualism in the Hispanic and Lusophone world. John Benjamins Publishing Company.
- Bellamy, K., & Parafita Couto, M. C. (2022). Gender assignment in mixed noun phrases. In D. Ayoun (Ed.), The acquisition of gender: Crosslinguistic perspectives (pp. 13–48). John Benjamins Publishing Company.
- Berez, A. L. (2007). EUDICO Linguistic Annotator (ELAN) from Max Planck Institute for Psycholinguistics. Language Documentation & Conservation, 1(2), 283–289.
- Beseghi, M. (2019). The representation and translation of identities in multilingual TV series: Jane the Virgin, a case in point. MonTi: Monografías de Traducción e Interpretación, 4, 145–172.
- Bialy, P. (2016). The usage of diminutives in polite phrases as a way to express positive/negative politeness or to formulate face-threatening acts in Polish. In E. Bogdanowska-Jakubowska (Ed.), New ways to face and (im)politeness (pp. 133–155). Wydawnictwo Uniwersytetu Śląskiego.
- Biber, D. (2012). Register as a predictor of linguistic variation. Corpus Linguistics and Linguistic Theory, 8(1), 9–37.
- Blokzijl, J., Deuchar, M., & Parafita Couto, M. C. (2017). Determiner asymmetry in mixed nominal constructions: The role of grammatical factors in data from Miami and Nicaragua. Languages, 2(4), 20.
- Bullock, B. E., & Gerfen, C. (2004). Phonological convergence in a contracting language variety. Bilingualism: Language and Cognition, 7(2), 95–104.
- Bullock, B. E., & Toribio, A. J. (2024). Spanish in Texas corpus [Dataset]. Texas Data Repository.
- Callahan, L. (2004). Spanish/English code-switching in a written corpus. John Benjamins Publishing Company.
- Carter, P., López Valdez, L., & Sims, N. (2020). New dialect formation through language contact: Vocalic and prosodic developments in Miami English. American Speech, 95(2), 120–149.
- Carter, P. M., & Lynch, A. (2015). Multilingual Miami: Current trends in sociolinguistic research. Language and Linguistics Compass, 9(9), 369–385.
- Carter, P. M., & Lynch, A. (2018). On the status of Miami as a southern city. Defining language and region through demography and social history. In W. Wolfram, K. Wojcik, E. Wilbanks, & J. Reaser (Eds.), Language variety in the New South: Contemporary perspectives on change and variation. The University of North Carolina Press.
- Carvalho, A. M. (2012). Corpus del Español en el Sur de Arizona (CESA) [Dataset]. University of Arizona. Available online: https://cesa.arizona.edu/ (accessed on 9 January 2025).
- Christoffersen, K., Besset, R. M., & Carvalho, A. M. (2021). Technologically-aided transcription methods for bilingual sociolinguistic corpora: Findings, resources, and considerations project overview. Writing and Language Studies Faculty Publications and Presentations University of Texas Rio Grande Valley. Available online: https://scholarworks.utrgv.edu/wls_fac/123/ (accessed on 12 January 2025).
- Christoffersen, K., & Ciller, J. (2024). Corpus Bilingüe del Valle (CoBiVa). University of Texas Rio Grande Valley. Available online: https://utrgv.edu/cobiva (accessed on 9 January 2025).
- Cisneros, M., Rodríguez-González, E., Bellamy, K., & Parafita Couto, M. C. (2023). Gender strategies in the perception and production of mixed nominal constructions by New Mexico Spanish-English bilinguals. Isogloss. Open Journal of Romance Linguistics, 9(2), 1–30.
- Deuchar, M. (2008). The Miami Corpus: Documentation file. Available online: https://bangortalk.org.uk/docs/Miami_doc.pdf (accessed on 9 November 2020).
- Deuchar, M. (2020). Code-switching in linguistics: A position paper. Languages, 5(2), 22.
- Deuchar, M., Carter, D., Davies, P., Donnelly, K., Parafita Couto, M. C., Stammers, J., Aveledo, F., Fusser, M., Jones, L., Lloyd-Williams, S., Myfyr, P., & Robert, E. (2014a). Bangor Miami Corpus [Dataset]. Bangortalk.
- Deuchar, M., Davies, P., Herring, J. R., Parafita Couto, M. C., & Carter, D. (2014b). Building bilingual corpora. In E. M. Thomas, & I. Mennen (Eds.), Advances in the study of bilingualism (pp. 93–111). Multilingual Matters.
- Dewaele, J.-M. (2010). Emotions in multiple languages. Palgrave Macmillan.
- Dewaele, J.-M., & Li, W. (2014). Intra- and inter-individual variation in self-reported code-switching patterns of adult multilinguals. International Journal of Multilingualism, 11(2), 225–246.
- Dressler, W. U., & Merlini Barbaresi, L. (1994). Morphopragmatics: Diminutives and intensifiers in Italian, German, and other languages. M. de Gruyter.
- Dufour, R., Estève, Y., & Deléglise, P. (2014). Characterizing and detecting spontaneous speech: Application to speaker role recognition. Speech Communication, 56, 1–18.
- Eberhard, D. M., Simons, G. F., & Fennig, C. D. (Eds.). (2024). How many languages are there in the world? Ethnologue. Available online: https://www.ethnologue.com/insights/how-many-languages/ (accessed on 11 February 2025).
- Eilers, R. E., Pearson, B. Z., & Cobo-Lewis, A. B. (2006). Chapter 5. Social factors in bilingual development: The Miami experience. In P. McCardle, & E. Hoff (Eds.), Childhood bilingualism (pp. 68–90). Multilingual Matters.
- El Paso City Hall. (2024). Population demographics. Elpasotexas.Gov. Available online: https://www.elpasotexas.gov/economic-development/economic-snapshot/population-demographics/ (accessed on 15 April 2024).
- Enghels, R., & Azofra Sierra, M. E. (2018). Sobre la naturaleza de los corpus y la comparabilidad de resultados en lingüística histórica: Estudio de caso del marcador pragmático sabes. Spanish in Context, 15(3), 465–489.
- Gaarder, A. B. (1966). Los llamados diminutivos y aumentativos en el español de México. PMLA, 81(7), 585.
- González-Vilbazo, K., & Koronkiewicz, B. (2016). Tú y yo can codeswitch, nosotros cannot: Pronouns in Spanish-English code-switching. In R. E. Guzzardo Tamargo, C. M. Mazak, & M. C. Parafita Couto (Eds.), Spanish-English code-switching in the Caribbean and the US (Vol. 11, pp. 237–260). John Benjamins Publishing Company.
- Gorzycka, D. (2020). Diminutive constructions in English. Peter Lang.
- Grosjean, F. (1985). The bilingual as a competent but specific speaker-hearer. Journal of Multilingual and Multicultural Development, 6(6), 467–477.
- Gullberg, M., Indefrey, P., & Muysken, P. (2009). Research techniques for the study of code-switching. In B. E. Bullock, & A. J. Toribio (Eds.), The Cambridge handbook of linguistic code-switching (1st ed., pp. 21–39). Cambridge University Press.
- Huddleston, R., & Pullum, G. K. (2002). The Cambridge grammar of the English language (1st ed.). Cambridge University Press.
- Hulstijn, J. H. (2011). Language proficiency in native and nonnative speakers: An agenda for research and suggestions for second-language assessment. Language Assessment Quarterly, 8(3), 229–249.
- Hurtado, I. (2022, June 20–25). BILinMID: A Spanish-English corpus of the US Midwest. Thirteenth Language Resources and Evaluation Conference (LREC 2022) (pp. 5511–5516), Marseille, France.
- Jespersen, O. (1948). Growth and structure of the English language. Basil Blackwell.
- Jurafsky, D. (1993, February 12–15). Universals in the semantics of the diminutive. Nineteenth Annual Meeting of the Berkeley Linguistics Society: General Session and Parasession on Semantic Typology and Semantic Universals (pp. 423–436), Berkeley, CA, USA.
- Jurafsky, D. (1996). Universal tendencies in the semantics of the diminutive. Language, 72(3), 533–578.
- Keller, G. (1979). The literary strategems available to the bilingual Chicano writer. In F. Jiménez (Ed.), The identification and analysis of Chicano literature (pp. 262–316). Bilingual Press.
- Khachikyan, S. (2015). Diminutives as intimacy expressions in English and Armenian. Armenian Folia Anglistika, 11(2), 78–83.
- King, K., & Melzi, G. (2004). Intimacy, imitation and language learning: Spanish diminutives in mother-child conversation. First Language, 24(2), 241–261.
- Koch, P., & Oesterreicher, W. (1985). Sprache der Nähe—Sprache der Distanz. Romanistisches Jahrbuch, 36(85), 15–43.
- Kornfeld, L. M. (2016). “Una propuestita astutita”: El diminutivo como recurso atenuador. Revista Internacional de Lingüística Iberoamericana, 14(1), 123–135.
- Kuzic, I. (2019). Diminutives in Portuguese and their equivalents in English [Master’s thesis, Zagreb University].
- Labov, W. (1972). Sociolinguistic patterns. University of Pennsylvania Press.
- Lipski, J. M. (1982). Spanish-English language switching in speech and literature: Theories and models. Bilingual Review/La Revista Bilingüe, 9(3), 191–212.
- Lockyer, D. (2012). Such a tiny little thing: Diminutive meanings in Alice in Wonderland as a comparative translation study of English, Polish, Russian and Czech. Verges: Germanic & Slavic Studies in Review, 1(1), 10–22.
- Lowry, R. (2023). Chi-square, Cramer’s V, and lambda. Vassarstats Net. Available online: http://vassarstats.net/newcs.html (accessed on 20 November 2024).
- MacSwan, J. (1999). A minimalist approach to intrasentential code switching. Garland.
- Martín Zorraquino, M. A. (2012). Sobre los diminutivos en español y su función en una teoría de la cortesía verbal (con referencia especial a un cuento de Antonio de Trueba). In T. E. Jiménez Juliá, B. López Meirama, V. Vázquez Rozas, & A. Veiga Rodríguez (Eds.), Cum corde et in nova grammatica: Estudios ofrecidos a Guillermo Rojo (pp. 555–569). Universidade de Santiago de Compostela, Servicio de Publicaciones e Intercambio Científico. [Google Scholar]
- Mendoza, M. (2005). Polite diminutives in Spanish. In R. T. Lakoff, & S. Ide (Eds.), Broadening the horizon of linguistic politeness (pp. 163–173). John Benjamins Publishing. [Google Scholar]
- Merriam-Webster. (2025). Merriam-webster’s dictionary of English usage. Available online: https://www.merriam-webster.com/ (accessed on 14 January 2025).
- Milroy, L. (1987). Language and social networks (2nd ed.). Blackwell. [Google Scholar]
- Moreno-Fernández, F. (Director). (2018). CORPEEU: Corpus del Español en los Estados Unidos [Dataset]. With the col. of F. Javier Pueyo Mena. Instituto Cervantes at Harvard University—ANLE. Available online: https://corpeeu.org/ (accessed on 9 January 2025).
- Moslimani, M., & Noe-Bustamante, L. (2023, August 16). Facts on Latinos in the U.S. Pew Research Center. Available online: https://www.pewresearch.org/race-and-ethnicity/fact-sheet/latinos-in-the-us-fact-sheet/ (accessed on 6 February 2025).
- Muysken, P. (2013). Language contact outcomes as the result of bilingual optimization strategies. Bilingualism: Language and Cognition, 16(4), 709–730.
- Myers-Scotton, C. (1993). Duelling languages: Grammatical structure in code-switching. Clarendon Press.
- Náñez Fernández, E. (1973). El diminutivo: Historia y funciones en el español clásico y moderno. Universidad Autónoma de Madrid.
- Nieuwenhuis, P. (1985). Diminutives [Ph.D. Thesis, University of Edinburgh].
- Olson, D. J. (2024). Code-switching and language mode effects in the phonetics and phonology of bilinguals. In M. Amengual (Ed.), The Cambridge handbook of bilingual phonetics and phonology (1st ed., pp. 677–698). Cambridge University Press.
- Pablos, L., Parafita Couto, M. C., Boutonnet, B., De Jong, A., Perquin, M., De Haan, A., & Schiller, N. O. (2018). Adjective-noun order in Papiamento-Dutch code-switching. Linguistic Approaches to Bilingualism, 9(4–5), 710–735.
- Palacios, A. (2014). Variación y cambio lingüístico en situaciones de contacto: Algunas precisiones teóricas. In P. M. Butragueño, & L. Orozco (Eds.), Argumentos cuantitativos y cualitativos en sociolingüística: Segundo coloquio de cambio y variación lingüística (pp. 267–294). El Colegio de México.
- Parafita Couto, M. C., Greidanus Romaneli, M., & Bellamy, K. (2021). Code-switching at the interface between language, culture, and cognition. Lapurdum, 1–26. Available online: https://shs.hal.science/halshs-03280922v1 (accessed on 11 February 2025).
- Pardoe, I., Simon, L., & Young, D. (2018). 9.3—Identifying outliers (unusual Y values). Stat 462: Applied Regression Analysis. Available online: https://online.stat.psu.edu/stat462/node/172/ (accessed on 20 November 2024).
- Ponsonnet, M. (2018). A preliminary typology of emotional connotations in morphological diminutives and augmentatives. Studies in Language, 42(1), 17–50.
- Poplack, S. (1980). Sometimes I’ll start a sentence in Spanish y termino en español: Toward a typology of code-switching. Linguistics, 18(7), 581–618.
- Poplack, S., & Meechan, M. (1998). Introduction: How languages fit together in codemixing. International Journal of Bilingualism, 2(2), 127–138.
- Potowski, K. (n.d.). CHISPA. Kim Potowski homepage. Available online: https://www.potowski.org/chispa (accessed on 28 April 2025).
- Potowski, K., & Torres, L. (2023). The Chicago Spanish (CHISPA) corpus. In K. Potowski, & L. Torres (Eds.), Spanish in Chicago (1st ed., pp. 35–70). Oxford University Press.
- PRESEEA. (2014). Corpus del Proyecto para el estudio sociolingüístico del español de España y de América [Dataset]. Universidad de Alcalá. Available online: http://preseea.uah.es/ (accessed on 9 January 2025).
- Qualtrics. (2020). Qualtrics XM (Version April, 2023) [Computer software]. Qualtrics. Available online: https://www.qualtrics.com (accessed on 12 February 2025).
- Real Academia Española. (2011). La derivación apreciativa. In Nueva gramática de la lengua española manual (pp. 163–172). Espasa.
- Real Academia Española. (2014). Diccionario de la lengua española. Espasa. Available online: https://dle.rae.es/ (accessed on 12 February 2025).
- Ruiz, E. (2005). Hispanic culture and relational cultural theory. Journal of Creativity in Mental Health, 1(1), 33–55.
- Sáenz, F. S. (1999). Conceptual interaction and Spanish diminutives. Cuadernos de Investigación Filológica, 25, 173–190.
- Schneider, K. P. (2003). Diminutives in English. De Gruyter.
- Schneider, K. P. (2013). The truth about diminutives, and how we can find it: Some theoretical and methodological considerations. SKASE Journal of Theoretical Linguistics, 10(1), 137–151.
- Sebba, M. (2012). Multilingualism in written discourse: An approach to the analysis of multilingual texts. International Journal of Bilingualism, 17(1), 97–118.
- Sloetjes, H., Wittenburg, P., & Somasundaram, A. (2011, August 27–31). ELAN—Aspects of interoperability and functionality. Interspeech 2011, 12th Annual Conference of the International Speech Communication Association (pp. 3249–3252), Florence, Italy.
- Spasovski, L. (2012). Morphology and pragmatics of the diminutive: Evidence from Macedonian [Master’s thesis, Arizona State University].
- Stefanich, S., & Amaro, J. C. (2018). Phonological factors of Spanish/English word internal code-switching. In L. López (Ed.), Issues in Hispanic and Lusophone linguistics (Vol. 19, pp. 195–222). John Benjamins Publishing Company.
- Stone, D. L., Johnson, R. D., Stone-Romero, E. F., & Hartman, M. (2006). A comparative study of Hispanic-American and Anglo-American cultural values and job choice preferences. Management Research: Journal of the Iberoamerican Academy of Management, 4(1), 7–21.
- The Language Archive. (2024). ELAN (Version 6.4) [Computer software]. Max Planck Institute for Psycholinguistics. Available online: https://archive.mpi.nl/tla/elan (accessed on 11 February 2025).
- Toribio, A. J. (2017). Structural approaches to code-switching: Research then and now. In R. E. V. Lopes, J. Ornelas De Avelar, & S. M. L. Cyrino (Eds.), Romance languages and linguistic theory (Vol. 12, pp. 213–234). John Benjamins Publishing Company.
- Torres Cacoullos, R., & Travis, C. E. (2020). Code-switching and bilinguals’ grammars. In E. Adamou, & Y. Matras (Eds.), The Routledge handbook of language contact (1st ed., pp. 252–275). Routledge.
- Turner, G. W. (1973). Stylistics. Penguin Books.
- United States Census Bureau. (2022a). ACS demographic and housing estimates. Data.Census.Gov. Available online: https://data.census.gov/table/ACSDP1Y2022.DP05?g=160XX00US4824000 (accessed on 15 April 2024).
- United States Census Bureau. (2022b). S1601 Language spoken at home. United States Census Bureau. Available online: https://data.census.gov/table/ACSST1Y2022.S1601?q=language (accessed on 23 February 2024).
- Vanhaverbeke, M., Dominguez, A., Ivanova, I., Parafita Couto, M. C., & Enghels, R. (2022). El Paso Bilingual Corpus [Conversational corpus]. Ghent University.
- Vanhaverbeke, M., & Enghels, R. (2021). Diminutive constructions in bilingual speech: A case study of Spanish-English code-switching. Belgian Journal of Linguistics, 35, 183–213.
- Velázquez, I. (2009). Intergenerational Spanish transmission in El Paso, Texas: Parental perceptions of cost/benefit. Spanish in Context, 6(1), 69–84.
- Velázquez, I. (2013). Individual discourse, language ideology and Spanish transmission in El Paso, Texas. Critical Discourse Studies, 10(3), 245–262.
- Vilares, D., Alonso, M. A., & Gómez-Rodríguez, C. (2016, May 23–28). EN-ES-CS: An English-Spanish code-switching Twitter corpus for multilingual sentiment analysis. Tenth International Conference on Language Resources and Evaluation (pp. 4149–4153), Reykjavik, Iceland.
- Ward, W. (1989, October 15–18). Understanding spontaneous speech. Workshop on Speech and Natural Language—HLT ’89 (pp. 137–141), Cape Cod, MA, USA.
Corpus | Period | Size | Language Focus | Discourse Setting | Data Types | Access (all URLs accessed on 9 January 2025) |
---|---|---|---|---|---|---|
Bangor Miami corpus | 2008–2011 | 84 speakers, 35 h, 265,000 words | bilingual | informal conversations | audio, transcripts, metadata | online data repository (TalkBank), http://bangortalk.org.uk |
Bilinguals in the Midwest Corpus (BILinMID) | 2021–2022 | 82 speakers | bilingual | picture elicited short stories | transcripts (some), metadata | online website, https://ihurta3.shinyapps.io/bilinmid-corpus/ |
New Mexico Spanish-English bilingual corpus (NMSEB) | 2010–2011 | 40 speakers, 29 h, 300,000 words | bilingual | sociolinguistic interviews | audio, transcripts, metadata | request PI, https://nmcode-switching.la.psu.edu/bilingual-corpus/ |
Chicago Spanish corpus (CHISPA) | 2006–2010 | 124 speakers, approx. 1 h per recording | Spanish | sociolinguistic interviews | audio, transcripts, metadata | request PI |
Corpus del Español en los Estados Unidos (CORPEEU) | 1960–now | no info available | Spanish | written + interviews + public interactions | transcripts | online website, https://corpeeu.org/ |
Corpus of Spanish in Southern Arizona (CESA) | 2012–2020 | 78 speakers, approx. 1 h per recording | Spanish | sociolinguistic interviews | transcripts, audio, metadata | request PI, https://cesa.arizona.edu/ |
Corpus Bilingüe del Valle (CoBiVa) | 2017–now | 69 speakers, approx. 1 h per recording | unilingual interviews | sociolinguistic interviews | audio, transcripts, metadata | online website, https://www.utrgv.edu/cobiva/index.htm |
Spanish in Texas Corpus | 2011–2013 | 97 speakers, 500,000 words, approx. 53 h | Spanish | sociolinguistic interviews & conversations | transcripts, video & audio, metadata | online data repository, https://corpus.spanishintexas.org/ |
| PRESEEA (Havana) Sociolinguistic Interviews | AMERESCO (Havana) Free Conversational Data |
---|---|---|
muy | ++++ | +++ |
super- | +/- | ++ |
-ísimo | + | ++ |
-ón | +/- | ++ |
-azo | +/- | + |
-ote | +/- | + |
-udo | - | + |
Dominant Language in the Conversations | # | % |
---|---|---|
Spanish | 20 | 47.6 |
English | 12 | 28.6 |
both | 10 | 23.8 |
Total | 42 | 100.0 |
Social Relationship | # | % |
---|---|---|
friends | 30 | 71.4 |
siblings | 4 | 9.5 |
couple | 4 | 9.5 |
parent–child | 2 | 4.8 |
colleagues | 2 | 4.8 |
Total | 42 | 100.0 |
Age | Female (n) | Female (%) | Male (n) | Male (%) | Other (n) | Other (%) | Total (n) | Total (%) |
---|---|---|---|---|---|---|---|---|
GEN2 (18–25) | 40 | 48.8 | 28 | 34.1 | 2 | 2.4 | 70 | 85.4 |
GEN3 (26–45) | 7 | 8.5 | 3 | 3.7 | - | - | 10 | 12.2 |
GEN4 (46–60) | 2 | 2.4 | - | - | - | - | 2 | 2.4 |
Total | 49 | 59.8 | 31 | 37.8 | 2 | 2.4 | 82 | 100.0 |
El Paso Bilingual Corpus (n) | El Paso Bilingual Corpus (Fn/10,000) | Bangor Miami Corpus (n) | Bangor Miami Corpus (Fn/10,000) |
---|---|---|---|
995 | 37.9 | 891 | 33.8 |
Diminutive Strategy | El Paso Bilingual Corpus (n) | El Paso Bilingual Corpus (%) | Bangor Miami Corpus (n) | Bangor Miami Corpus (%) |
---|---|---|---|---|
synthetic | 754 | 75.8 | 563 | 63.2 |
analytic | 241 | 24.2 | 328 | 36.8 |
Total | 995 | 100.0 | 891 | 100.0 |
Diminutive Strategy | El Paso Bilingual Corpus (r) | Bangor Miami Corpus (r) |
---|---|---|
synthetic | 2.25 | −2.37 |
analytic | −3.42 | 3.61 |
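The table does not state how r is computed. The values are, however, consistent with Pearson residuals, (observed − expected) / √expected, calculated on the 2 × 2 table of diminutive strategy by corpus. The sketch below works under that assumption; the function name `pearson_residuals` is illustrative and not taken from the article.

```python
from math import sqrt


def pearson_residuals(observed):
    """Return Pearson residuals for a contingency table given as a list of rows.

    Assumption: r = (observed - expected) / sqrt(expected), where the expected
    count of a cell is row_total * column_total / grand_total.
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)

    residuals = []
    for i, row in enumerate(observed):
        residual_row = []
        for j, count in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            residual_row.append((count - expected) / sqrt(expected))
        residuals.append(residual_row)
    return residuals


# Counts from the strategy-by-corpus table:
# rows = synthetic, analytic; columns = El Paso, Bangor Miami.
counts = [[754, 563], [241, 328]]
residuals = pearson_residuals(counts)

for label, row in zip(("synthetic", "analytic"), residuals):
    print(label, [round(value, 2) for value in row])
```

Rounded to two decimals, the output matches the reported values (2.25 and −2.37 for synthetic, −3.42 and 3.61 for analytic), which supports the Pearson-residual reading but does not confirm it.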
Diminutive Strategy: Diminutive Language | El Paso Bilingual Corpus (n) | El Paso Bilingual Corpus (%) | Bangor Miami Corpus (n) | Bangor Miami Corpus (%) |
---|---|---|---|---|
synthetic | 754 | 75.8 | 563 | 63.2 |
synthetic: Spanish | 745 | 98.8 | 531 | 94.3 |
synthetic: English | 9 | 1.2 | 32 | 5.7 |
analytic | 241 | 24.2 | 328 | 36.8 |
analytic: Spanish | 89 | 36.9 | 50 | 15.2 |
analytic: English | 152 | 63.1 | 278 | 84.8 |
Total | 995 | 100.0 | 891 | 100.0 |
Synthetic Markers | El Paso Bilingual Corpus (n) | El Paso Bilingual Corpus (%) | Bangor Miami Corpus (n) | Bangor Miami Corpus (%) |
---|---|---|---|---|
Suffix | 753 | 99.9 | 562 | 99.8 |
-ito | 704 | 93.4 | 485 | 86.1 |
-ico | - | - | 44 | 7.8 |
-illo | 38 | 5.0 | 2 | 0.4 |
-y | 6 | 0.8 | 25 | 4.4 |
-ish | 3 | 0.4 | 6 | 1.1 |
-ino | 2 | 0.3 | - | - |
Prefix | 1 | 0.1 | 1 | 0.2 |
mini- | 1 | 0.1 | 1 | 0.2 |
Total | 754 | 100.0 | 563 | 100.0 |
Analytic Markers | El Paso Bilingual Corpus (n) | El Paso Bilingual Corpus (%) | Bangor Miami Corpus (n) | Bangor Miami Corpus (%) |
---|---|---|---|---|
Adjective | 145 | 60.2 | 212 | 64.6 |
little | 104 | 71.7 | 184 | 86.8 |
chico | 30 | 20.7 | 11 | 5.2 |
tiny | 4 | 2.8 | 3 | 1.4 |
pequeño | 4 | 2.8 | 2 | 0.9 |
small | 3 | 2.1 | 12 | 5.7 |
Phrasal | 96 | 39.8 | 116 | 35.4 |
un poco | 55 | 57.3 | 36 | 31.0 |
a bit | 23 | 24.0 | 46 | 39.7 |
a little | 18 | 18.8 | 33 | 28.4 |
un chin | - | - | 1 | 0.9 |
Total | 241 | 100.0 | 328 | 100.0 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Vanhaverbeke, M.; Enghels, R.; Parafita Couto, M.d.C.; Ivanova, I. Enhancing Code-Switching Research Through Comparable Corpora: Introducing the El Paso Bilingual Corpus. Languages 2025, 10, 174. https://doi.org/10.3390/languages10070174