The Negative Concord Mystery: Insights from a Language Model
Abstract
1. Introduction
Despite the daunting scope of linguistic phenomena begging an explanation, usage-based theories of language representation have a simple overarching approach. Whether the focus is on language processing, acquisition, or change, knowledge of a language is based in knowledge of actual usage and generalizations made over usage events.[1] (p. 1)
What gets learned is a function of usage: what is heard and said and the frequencies thereof.[2] (p. 194)
Children are preprogrammed to adhere to [certain linguistic principles] as part of the blueprint for their development. Just as a child cannot help but grow fingers and toes, and not wings or claws, so these linguistic principles, and not others, grow in the child.[3] (p. 27)
… children are born with a set of universal linguistic principles … many aspects of adult grammar are innate.[4] (p. 31)
… compared to … human minds and human brains, language models are actually incredibly transparent… We can train a model and know with certainty every single word that went into the training. We can know exactly how the architecture is put together, and then there are increasingly new tools for taking these admittedly black box parameters that they learn and trying to probe them and get some sense of what’s happening inside there.[5]
2. Negative Concord
Spanish:
María no comió nada.
Maria not ate.PST nothing
‘Maria ate nothing.’
Japanese:
Michiko-ga nanimo tabe-na-katta.
Michiko-NOM nothing eat-not-PST
‘Michiko ate nothing.’
Serbo-Croatian [8] (p. 123):
Milan ne vidi nista.
Milan not see nothing
‘Milan sees nothing.’
Hungarian [9] (p. 90):
Nem fél-Ø-ek semmi-töl.
not fear-PRES-1SG nothing.ABL
‘I do not fear of anything.’
Turkish [10] (p. 3):
Deniz hiçbir şey bil-m-iyor.
Deniz no-one thing know-NEG-PRES
‘Deniz knows nothing.’
Non-standard Modern English:
I didn’t see nobody.
(= ‘I saw nobody.’/‘I didn’t see anybody.’)
I didn’t do nothing. (file 63, age 3;5)
I didn’t call him nothing. (file 72, age 3;8)
Because nobody didn’t broke it. (file 107, age 4;5)
We don’t want no gas. (Adam 3;11)
No tigers don’t bit you? (Mark 2;08)
I don’t care about nothing. (Ross 5;04)
No one’s not drying him, mum. (Fraser 3;00)
The girl who skipped didn’t buy nothing.
- In principle, two interpretations are available. The first involves negative concord, which yields the meaning ‘The girl bought nothing.’ The second involves a ‘double negation’ interpretation in which the first negative cancels out the other, giving the meaning ‘The girl bought something.’
The girl (who skipped) didn’t buy nothing.
Negative concord interpretation | Double negative interpretation
---|---
The two negatives jointly produce simple negation. | The first negative cancels the second one.
= ‘The girl bought nothing.’ | = ‘The girl bought something.’
3. Biuniqueness
-s | plural
car | cars
dog | dogs
hill | hills
plate | plates
etc.
- Such correlations lead to a strong association between -s and plurality—an arrangement that facilitates language learning and language use by creating a simple one-to-one relationship.
Biuniqueness in Negation
Not-type negative sentences | Negative-pronoun-type sentences
---|---
The people didn’t eat the food. | Nobody ate the food. / The people ate nothing.
- All three options produce negative sentences. If the people did not eat the food (first column), there was no eating event. The same is true if (as in the second column) we say that nobody ate the food or that the people ate nothing. There is, though, a major difference in the frequency with which these options are employed.
not | clausal negation
Nobody | didn’t eat the food.
↑ | ↑
Negative pronoun signals negation | ‘not’ is added because the sentence is negative
4. Using a Language Model to Test the Biuniqueness Hypothesis
- Children can favor negative concord without direct exposure to its use by others.
- The emergence of negative concord stems from the dominance of not in the expression of negation, resulting in its overuse in sentences where a negative pronoun alone would normally suffice.
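The second hypothesis rests on a frequency asymmetry between not and negative pronouns, which can be checked on any corpus by simple token counting. The following is a minimal sketch only: the word inventories, the function name `count_negators`, and the toy utterances are assumptions made for illustration and are not taken from the study's training data.

```python
import re
from collections import Counter

# Illustrative inventories (assumed, not the study's own word lists).
# Note: "cannot" is not matched by this pattern.
NOT_PATTERN = re.compile(r"\bnot\b|n[’']t\b", re.IGNORECASE)
NEG_PRONOUN_PATTERN = re.compile(r"\b(?:nobody|nothing|no one|no-one)\b", re.IGNORECASE)


def count_negators(utterances):
    """Count tokens of not/n't and of negative pronouns in a list of utterances."""
    counts = Counter()
    for utterance in utterances:
        counts["not"] += len(NOT_PATTERN.findall(utterance))
        counts["negative_pronoun"] += len(NEG_PRONOUN_PATTERN.findall(utterance))
    return counts


# Toy utterances invented for this example.
toy_corpus = [
    "The people didn't eat the food.",
    "I do not want that.",
    "She can't see it.",
    "Nobody ate the food.",
]
counts = count_negators(toy_corpus)
print(counts["not"], counts["negative_pronoun"])
```

Run over a real child-directed corpus, the same two counts would indicate how strongly not dominates the expression of negation in the input.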
… We don’t have the ability to control the full environment, the full input to a child, as much as we would like to answer the kinds of questions that linguists and cognitive scientists are generally interested in.[24]
The Acceptability Test
Language models are already trained to give us probabilities for strings [of words]. So we can just ask, do they assign higher probability to a grammatical sentence than an ungrammatical one?[24]
- The use of perplexity as a measure of acceptability works best when applied to pairs of sentences that differ from each other in a minimal way, in accordance with the protocol followed in the Benchmark of Linguistic Minimal Pairs for English (BLiMP) (see [26]).
(1) A subject negative pronoun with and without not:
(negative concord) | (stand-alone negative pronoun)
Nobody didn’t laugh. | Nobody laughed.
(2) A direct object negative pronoun with and without not:
(negative concord) | (stand-alone negative pronoun)
The boy doesn’t like nobody. | The boy likes nobody.
(3) A direct object negative pronoun versus a direct object negative polarity pronoun:
(negative concord) | (negative polarity)
The man didn’t wash nothing. | The man didn’t wash anything.
- PPL = perplexity; N is the total number of words in the test sentence; and p(w_i) is the probability assigned by the model to the i-th word.
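The terms defined above fit the standard definition of perplexity, PPL = exp(-(1/N) Σ log p(w_i)), which is assumed here rather than quoted from the article. The sketch below computes that quantity from a list of per-word probabilities and compares the two members of a minimal pair. The probabilities are invented placeholders, not model output, and the function name `perplexity` is hypothetical; a real model would typically supply probabilities over subword tokens, each conditioned on the preceding context.

```python
import math


def perplexity(word_probs):
    """Perplexity of a sentence given the probability p(w_i) of each word."""
    if not word_probs:
        raise ValueError("need at least one word probability")
    n = len(word_probs)
    mean_log_prob = sum(math.log(p) for p in word_probs) / n
    return math.exp(-mean_log_prob)


# Hypothetical per-word probabilities for one minimal pair (placeholders only).
pair = {
    "Nobody didn't laugh.": [0.02, 0.10, 0.05],
    "Nobody laughed.": [0.02, 0.001],
}
scores = {sentence: perplexity(probs) for sentence, probs in pair.items()}

# The member of the pair with the lower perplexity is the one the model prefers.
preferred = min(scores, key=scores.get)
print(preferred)
```

Because perplexity normalizes by N, the two sentences in a pair can be compared even when they differ in length, which is the case for every pair that contrasts a sentence with and without not.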
5. Concluding Remarks
- Its input contained no instances of negative concord.
- It nonetheless generates negative concord patterns that manifest low rates of perplexity, consistent with grammatical acceptability.
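The first point presupposes some way of screening the training input for negative concord. One possible screening heuristic is sketched below. It is an assumption for illustration, not the procedure used in the study: it flags any sentence that contains both a sentential negator (not or n't) and a negative word, and the flagged sentences would still need manual review, since a string such as "No, I don't want it" is flagged without being negative concord.

```python
import re

# Assumed inventories for illustration; real screening would need a fuller list.
SENTENTIAL_NEG = re.compile(r"\bnot\b|n[’']t\b", re.IGNORECASE)
NEG_WORD = re.compile(r"\b(?:nobody|nothing|no one|none|no)\b", re.IGNORECASE)


def is_concord_candidate(sentence):
    """True if the sentence has both not/n't and a negative word."""
    has_negator = SENTENTIAL_NEG.search(sentence) is not None
    has_neg_word = NEG_WORD.search(sentence) is not None
    return has_negator and has_neg_word


# Sentences taken from the examples in this article.
sentences = [
    "I didn’t do nothing.",
    "We don’t want no gas.",
    "Nobody laughed.",
    "The man didn’t wash anything.",
]
flagged = [s for s in sentences if is_concord_candidate(s)]
print(flagged)
```

An empty `flagged` list over the full training corpus would be consistent with the claim that the input contained no instances of negative concord, subject to the limits of the word lists.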
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Ibbotson, P. The scope of usage-based theory. Front. Psychol. 2013, 4, 255.
- Lieven, E. Developing constructions. Cogn. Linguist. 2009, 20, 191–199.
- Crain, S.; Thornton, R. Investigations in Universal Grammar: A Guide to Experiments on the Acquisition of Syntax and Semantics; MIT Press: Cambridge, MA, USA, 1998.
- Crain, S.; Goro, T.; Thornton, R. Language acquisition is language change. J. Psycholinguist. Res. 2006, 35, 31–49.
- Mahowald, K. Psycholinguistic Insights from Small Language Models. HSP Online Seminar, 4 April 2025. Available online: https://stanford.zoom.us/rec/share/T3nJqGOjPyJ7m6t2mmtnIUOrnF9_pet3jHGtGU6qSyCBcOiifd-IE1OrSb4B7JhD.H6taAZJzQrzOKq_K (accessed on 12 June 2025).
- Linzen, T. How can we accelerate progress towards human-like linguistic generalization? In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, Stroudsburg, PA, USA, 5 July 2020; pp. 5210–5217.
- Haspelmath, M. Negative Indefinite Pronouns and Predicate Negation. In The World Atlas of Language Structures Online, Chapter 115; Dryer, M., Haspelmath, M., Eds.; Oxford University Press: Oxford, UK, 2013; Available online: https://wals.info/chapter/115 (accessed on 14 August 2025).
- Zeijlstra, H. Sentential Negation and Negative Concord; Landelijke Onderzoekschool Taalwetenschap: Utrecht, The Netherlands, 2004.
- Gugán, K. Zigzagging in language history: Negation and negative concord in Hungarian. Finno-Ugric Lang. Linguist. 2012, 1, 89–97.
- Kamali, B.; Zeijlstra, H. Negative dependencies in Turkish. Languages 2024, 9, 342.
- Robinson, M.; Thoms, G. On the syntax of English variable negative concord. Univ. Pa. Work. Pap. Linguist. 2021, 27, 24.
- Bellugi, U. The Acquisition of the System of Negation in Children’s Speech. Ph.D. Dissertation, Harvard University, Cambridge, MA, USA, 1967.
- Brown, R. A First Language: The Early Stages; Harvard University Press: Cambridge, MA, USA, 1973.
- Thornton, R.; Notley, A.; Moscati, V.; Crain, S. Two negations for the price of one. Glossa 2016, 1, 45.
- Hein, J.; Bill, C.; Driemel, I.; Gonzalez, A.; Ilić, I.; Jeretič, P. Negative concord in the acquisition of English and German: Some results from a corpus study. Proc. Chic. Linguist. Soc. 2022, 58, 167–182. Available online: https://johannes-hein.de/documents/CLS58_Proceedings_paper.pdf (accessed on 14 August 2025).
- Bates, E.; MacWhinney, B. Competition, variation, and language learning. In Mechanisms of Language Acquisition; MacWhinney, B., Ed.; Lawrence Erlbaum Associates: Hillsdale, NJ, USA, 1987; pp. 157–194.
- Dressler, W. Naturalness. In Morphologie: Ein Internationales Handbuch zur Flexion und Wortbildung; Booij, G., Lehmann, C., Mugdan, J., Eds.; de Gruyter: Berlin, Germany, 2000; Volume 1, pp. 288–296.
- Slobin, D. Crosslinguistic evidence for the language-making capacity. In The Crosslinguistic Study of Language Acquisition. Vol. 2: Theoretical Issues; Slobin, D., Ed.; Erlbaum: Hillsdale, NJ, USA, 1985; pp. 1157–1256.
- Graves, M.; Koziol, S. Noun plural development in primary grade children. Child Dev. 1971, 42, 1165–1173.
- Keffer, S. Teaching Toddlers Plural Nouns: Expressive Lesson 18. Toddler Talk. 2025. Available online: https://toddlertalk.com/blog/plural-nouns (accessed on 17 August 2025).
- Warstadt, A.; Mueller, A.; Choshen, L.; Wilcox, E.; Zhuang, C.; Ciro, C.; Mosquera, R.; Paranjape, B.; Williams, A.; Linzen, T.; et al. Findings of the BabyLM challenge: Sample-efficient pretraining on a developmentally plausible corpus. In Proceedings of the 27th Conference on Computational Natural Language Learning: Volume 2: The BabyLM Challenge, Singapore, 6–7 December 2023; pp. 1–34.
- Quick, N.A.; Erickson, K.; Mccright, J. The most frequently used words: Comparing child-directed speech and young children’s speech to inform vocabulary selection for aided input. Augment. Altern. Commun. 2019, 35, 120–131.
- Beukelman, D.; Jones, R.; Rowan, M. Frequency of word usage by nondisabled peers in integrated preschool classroom. Augment. Altern. Commun. 1989, 5, 243–248.
- Warstadt, A. Why It Matters that Babies and Language Models are the Only Known Language Learners. Talk Presented at the Workshop on LLMs, Cognitive Science, Linguistics, and Neuroscience. Simons Institute, 4 February 2025. Available online: https://www.youtube.com/watch?v=Oyf-sS2oEvo (accessed on 14 August 2025).
- Gilkerson, J.; Richards, J.; Warren, S.; Montgomery, J.; Greenwood, C.; Oller, D.K.; Hansen, J.; Paul, T. Mapping the early language environment using all-day recordings and automated analysis. Am. J. Speech-Lang. Pathol. 2017, 26, 248–265.
- Warstadt, A.; Parrish, A.; Liu, H.; Mohananey, A.; Peng, W.; Wang, S.-F.; Bowman, S. BLiMP: The benchmark of linguistic minimal pairs for English. Trans. Assoc. Comput. Linguist. 2020, 8, 377–392.
Pattern | Negative Concord | Negative Pronoun Without ‘not’
---|---|---
(1) Subject patterns | 128.90 (80.24, 193.57) | 1850.57 (642.24, 3644.96)
(2) Direct object patterns | 216.64 (133.79, 316.56) | 12,608.50 (7657.53, 17,340.39)
Negative Concord | Negative Polarity | Stand-Alone Negative |
---|---|---|
216.64 (135.27, 318.81) | 118.48 (85.51, 153.06) | 12,608.50 (7677.16, 17,442.11) |
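To make the size of the contrasts in the tables above easier to read, the short sketch below divides one reported perplexity by another. Only the first value in each cell is used; the values in parentheses are left aside. The dictionary layout and the helper name `ratio` are assumptions made for this illustration.

```python
# First value from each cell of the tables above.
reported_ppl = {
    "subject": {"negative_concord": 128.90, "stand_alone": 1850.57},
    "direct_object": {
        "negative_concord": 216.64,
        "negative_polarity": 118.48,
        "stand_alone": 12608.50,
    },
}


def ratio(pattern, numerator, denominator):
    """How many times larger one reported perplexity is than another."""
    values = reported_ppl[pattern]
    return values[numerator] / values[denominator]


subject_ratio = ratio("subject", "stand_alone", "negative_concord")
object_ratio = ratio("direct_object", "stand_alone", "negative_concord")
polarity_ratio = ratio("direct_object", "negative_concord", "negative_polarity")
print(round(subject_ratio, 1), round(object_ratio, 1), round(polarity_ratio, 2))
```

On these figures, the stand-alone negative pronoun patterns have perplexities more than an order of magnitude above their negative concord counterparts, while the negative polarity pattern has a lower perplexity than the direct object negative concord pattern.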
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
O’Grady, W.; Zhang, H.; Lee, M. The Negative Concord Mystery: Insights from a Language Model. Information 2025, 16, 710. https://doi.org/10.3390/info16080710