Article

The Negative Concord Mystery: Insights from a Language Model

1 Department of Linguistics, University of Hawaii at Manoa, Honolulu, HI 96822, USA
2 Department of Information and Computer Science, University of Hawaii at Manoa, Honolulu, HI 96822, USA
3 Department of English Language & Literature, Hanyang University, Seoul 04763, Republic of Korea
* Author to whom correspondence should be addressed.
Information 2025, 16(8), 710; https://doi.org/10.3390/info16080710
Submission received: 4 July 2025 / Revised: 11 August 2025 / Accepted: 15 August 2025 / Published: 20 August 2025

Abstract

An important recent development in the field of linguistics is the use of small language models to investigate language acquisition. Following this line of research, we investigate the mysterious appearance of ‘negative concord’ (e.g., I didn’t do nothing) in the speech of children whose environment offers no exposure to patterns of this sort. Drawing on a 10-million-word version of the BabyLM corpus, we show that the preference for negative concord over patterns involving a single negative (e.g., I did nothing) can be traced to a cognitive force known as biuniqueness, whose effects will be examined with the help of data from both natural speech and a language model.

1. Introduction

The acquisition of language is arguably the single most important cognitive achievement of childhood. Yet, despite more than half a century of intense study, there is still no satisfactory explanation for how children are able to master language so effortlessly in the first years of their lives.
Broadly speaking, two competing lines of inquiry have been pursued over the last several decades. The first focuses on children’s ability to extract generalizations by observing how language is used by those around them.
Despite the daunting scope of linguistic phenomena begging an explanation, usage-based theories of language representation have a simple overarching approach. Whether the focus is on language processing, acquisition, or change, knowledge of a language is based in knowledge of actual usage and generalizations made over usage events.
[1] (p. 1)
What gets learned is a function of usage: what is heard and said and the frequencies thereof.
[2] (p. 194)
A different approach, pioneered by Noam Chomsky, focuses on the possibility of an inborn ‘Universal Grammar’ that encodes the fundamental properties of language, thereby reducing a child’s dependence on experience to acquire linguistic phenomena.
Children are preprogrammed to adhere to [certain linguistic principles] as part of the blueprint for their development. Just as a child cannot help but grow fingers and toes, and not wings or claws, so these linguistic principles, and not others, grow in the child.
[3] (p. 27)
… children are born with a set of universal linguistic principles … many aspects of adult grammar are innate.
[4] (p. 31)
We will not adopt either of these two approaches here. Instead, we will focus on a relatively novel line of inquiry that brings to the study of language acquisition an interdisciplinary blend of AI-based language models and traditional experimental work. As [5] notes, such an approach opens the door to advances that could well revolutionize the field of linguistics.
… compared to … human minds and human brains, language models are actually incredibly transparent… We can train a model and know with certainty every single word that went into the training. We can know exactly how the architecture is put together, and then there are increasingly new tools for taking these admittedly black box parameters that they learn and trying to probe them and get some sense of what’s happening inside there.
[5]
As [6] notes, a major goal of this line of research involves the development of ‘a model that acquires language as efficiently as humans.’ The purpose of this paper is to illustrate the potential of such an approach by examining a curious occurrence in children’s acquisition of English that we will describe in Section 2. Section 3 and Section 4 contrast and compare a cognitive analysis of the phenomenon with a language model, noting striking similarities in the two approaches. Section 5 offers some concluding remarks.

2. Negative Concord

A very large number of the world’s languages express the non-occurrence of an event by employing a construction known as negative concord, which consists of two negators: a so-called ‘clausal negative’ equivalent to English not (or n’t) and a ‘negative pronoun’ such as nobody or nothing. (For the purpose of exposition, we use not as a cover term for both forms of the clausal negative.) Negative concord is a common occurrence, as attested by its presence in 170 of the 206 languages surveyed in the World Atlas of Language Structures [7].
Spanish:
María   no   comió   nada.
Maria   not    ate.PST    nothing
‘Maria ate nothing.’
Japanese:
Michiko-ga    nanimo   tabe-na-katta.
Michiko-NOM  nothing   eat-not-PST
‘Michiko ate nothing.’
Serbo-Croatian [8] (p. 123):
Milan   ne   vidi   nista.
Milan   not   see   nothing
‘Milan sees nothing.’
Hungarian [9] (p. 90):
Nem    fél-Ø-ek      semmi-töl.
not    fear-PRES-1SG  nothing.ABL
‘I do not fear of anything.’
Turkish [10] (p. 3):
Deniz   hiçbir    şey    bil-m-iyor.
Deniz   no-one   thing   know-NEG-PRES
‘Deniz knows nothing.’
Middle English (1100–1500) too had negative concord, which has survived to this day in a number of non-standard varieties of Modern English (e.g., [11]).
Non-standard Modern English:
I didn’t see nobody.
(= ‘I saw nobody.’/‘I didn’t see anybody.’)
Curiously, some children learning English as a first language come to use negative concord even though it is not employed by those around them. An early observation along these lines was made by [12], who noted that Adam, an English-speaking child, regularly produced negative-concord sentences even though he had not been exposed to such patterns in the speech of his parents [13] (p. 381), [14] (pp. 7–8).
I didn’t do nothing. (file 63, age 3;5)
I didn’t call him nothing. (file 72, age 3;8)
Because nobody didn’t broke it. (file 107; age 4;5)
In a much larger and more recent study involving seven children and their caregivers, Ref. [15] examined 328,972 child-produced utterances, of which 909 contained negators such as nobody, nothing, never, or no. Of these, 178 (19.6%) co-occurred with not and had a negative concord interpretation, as exemplified below.
We don’t want no gas. (Adam 3;11)
No tigers don’t bit you? (Mark 2;08)
I don’t care about nothing. (Ross 5;04)
No one’s not drying him, mum. (Fraser 3;00)
Yet another study [14] focused on children’s comprehension (rather than their speech) by investigating sentences such as the following:
The girl who skipped didn’t buy nothing.
In principle, two interpretations are available. The first involves negative concord, which yields the meaning ‘The girl bought nothing.’ The second involves a ‘double negation’ interpretation in which the first negative cancels out the other, giving the meaning ‘The girl bought something.’
The girl (who skipped) didn’t buy nothing.
Negative concord interpretation: The two negatives jointly produce simple negation. = ‘The girl bought nothing.’
Double negative interpretation: The first negative cancels the second one. = ‘The girl bought something.’
In an experiment involving 20 pre-schoolers (aged 3 to 5) and 15 adults, Ref. [14] used stories and pictures to illustrate the two interpretations, from which the participants were asked to select the one they favored. The results were quite striking: 15 of the 20 pre-schoolers preferred negative concord interpretations, whereas only 2 of the 15 adults opted for that meaning.
Taken together, the findings from these and other studies point toward a genuine mystery: a significant number of children are attracted to negative concord in their speech and comprehension even though that pattern is apparently not present in their environment. Neither of the two approaches to the study of language acquisition sketched in §1 has offered a plausible explanation for this puzzle. On the one hand, usage should direct English-speaking children away from—not toward—negative concord, given that it is not encountered in their environment. On the other hand, it is equally unlikely that evolution might have somehow produced an innate grammatical principle that favors negative concord in children.
There is, however, at least one path that may lead to a satisfying resolution of the negative concord mystery, to which we will now turn.

3. Biuniqueness

The leading idea that we propose focuses on ‘biuniqueness’, a commonly encountered phenomenon in the study of language and learning [16], [17] (p. 291), [18] (pp. 1162–1163).
Put simply, biuniqueness is a cognitive force that favors a one-to-one correspondence between a form and its function or meaning. A common example involves the encoding of plurality in English, in which the suffix -s regularly marks the plural and the plural is almost always marked by -s.
-s ↔ plural
car      cars
dog      dogs
hill     hills
plate    plates
etc.
Such correlations lead to a strong association between -s and plurality, an arrangement that facilitates language learning and language use by creating a simple one-to-one relationship.
Biuniqueness need not be absolute. Exceptions are possible, as happens in the case of words such as men, teeth, sheep, feet, geese, etc., whose plural form does not involve the suffix -s. Crucially, though, such words can be problematic for learners [19,20], leading to errors such as mans, tooths, and so forth. As is evident, such errors reflect a preference for a uniform relationship between form and function, confirming the importance of biuniqueness in the process of language acquisition.
Plurality is just one of many phenomena that fall under the influence of biuniqueness. Another case involves negation.

Biuniqueness in Negation

There are different strategies for negating a clause in English. In order to express the fact that there was no eating event on a particular occasion, for instance, at least three familiar options are available.
Not-type negative sentences        Negative-pronoun-type sentences
The people didn’t eat the food.    Nobody ate the food.
                                   The people ate nothing.
All three options produce negative sentences. If the people did not eat the food (first column), there was no eating event. The same is true if (as in the second column) we say that nobody ate the food or that the people ate nothing. There is, though, a major difference in the frequency with which these options are employed.
This disparity becomes evident in an examination of the BabyLM corpus, a dataset consisting of child-directed and child-friendly speech assembled by [21]. In the version of the corpus that we used (see the next section for details), there were 126,611 negative sentences, but just 1129 instances of the negative pronouns no one, nobody, nothing, and none—approximately 1% of the total number of negative sentences. This leaves not as the principal tool for indicating negation, reflecting a very strong degree of compliance with biuniqueness (approximately 99%). (Refs. [22,23] note a similar imbalance in children’s own speech.)
not ↔ clausal negation
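The proportions cited above can be checked with a few lines of arithmetic. The sketch below uses only the two counts reported in the text; treating every negative sentence without a negative pronoun as a not-type sentence is an assumption made for illustration, and the variable and function names are hypothetical.

```python
# Counts reported in the text for the 10-million-word BabyLM corpus.
NEGATIVE_SENTENCES = 126_611
NEGATIVE_PRONOUN_SENTENCES = 1_129


def share(part: int, whole: int) -> float:
    """Return part/whole as a fraction between 0 and 1."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return part / whole


pronoun_share = share(NEGATIVE_PRONOUN_SENTENCES, NEGATIVE_SENTENCES)

# Assumption: all remaining negative sentences are counted as not-type.
other_negative = NEGATIVE_SENTENCES - NEGATIVE_PRONOUN_SENTENCES
ratio = other_negative / NEGATIVE_PRONOUN_SENTENCES

print(f"negative-pronoun share: {pronoun_share:.2%}")      # 0.89%
print(f"other negatives per pronoun sentence: {ratio:.0f}")  # 111
```

Under that assumption, sentences without a negative pronoun outnumber those with one by roughly 111 to 1, which is in line with the approximate 1% and 99% figures given in the text.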
Under these circumstances, it is not implausible to think that the use of not might be overgeneralized, just as the plural suffix -s can be. The end result of such an occurrence would be the potential appearance of not in any sentence that has a negative interpretation, including those that contain a negative pronoun.
Nobody                               didn’t eat the food.
Negative pronoun signals negation    ‘not’ is added because the sentence is negative
Herein lies a promising explanation for how children who receive no exposure to negative concord might nonetheless come to use such patterns. Interesting support for this idea comes from the small language model described in the next section.

4. Using a Language Model to Test the Biuniqueness Hypothesis

As things now stand, our approach to the negative concord mystery has focused on two claims:
  • Children can favor negative concord without direct exposure to its use by others.
  • The emergence of negative concord stems from the dominance of not in the expression of negation, resulting in its overuse in sentences where a negative pronoun alone would normally suffice.
Crucially, though, these suppositions are difficult to confirm. How, for example, can we know that children do not come in contact with negative concord outside the home? And how can we be sure that they all encounter the operator not ninety-plus times more frequently than nobody and nothing? Problems such as these are in fact widely acknowledged.
… We don’t have the ability to control the full environment, the full input to a child, as much as we would like to answer the kinds of questions that linguists and cognitive scientists are generally interested in.
[24]
Interestingly, these challenges may be overcome with the help of the BabyLM alluded to in §3. Our design used a decoder-only architecture based on the GPT-2 model. It includes 12 Transformer decoder blocks with a hidden size of 768, resulting in approximately 124 million parameters. The model was pre-trained from scratch using the Adam optimizer, with Byte-Pair Encoding (BPE) for sub-word tokenization.
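As a rough sanity check on the reported model size, the parameter count of a GPT-2-style decoder can be tallied by hand. The sketch below takes the 12 blocks and hidden size of 768 from the text. The vocabulary size (50,257) and context length (1,024) are the standard GPT-2 defaults and are assumptions here, since a tokenizer trained from scratch may use a different vocabulary. The function name is hypothetical.

```python
def gpt2_parameter_count(
    n_layer: int = 12,         # from the text
    d_model: int = 768,        # from the text
    vocab_size: int = 50_257,  # assumption: GPT-2 default vocabulary
    n_positions: int = 1_024,  # assumption: GPT-2 default context length
) -> int:
    """Count weights and biases in a GPT-2-style decoder with tied output embeddings."""
    # Token and position embedding tables.
    embeddings = vocab_size * d_model + n_positions * d_model

    # Each LayerNorm has a gain and a bias vector.
    layer_norm = 2 * d_model

    # Self-attention: fused Q/K/V projection plus output projection (with biases).
    attention = (d_model * 3 * d_model + 3 * d_model) + (d_model * d_model + d_model)

    # Feed-forward network with the usual 4x expansion (with biases).
    d_ff = 4 * d_model
    mlp = (d_model * d_ff + d_ff) + (d_ff * d_model + d_model)

    block = 2 * layer_norm + attention + mlp

    # Final LayerNorm after the last block; the output layer reuses the token embeddings.
    return embeddings + n_layer * block + layer_norm


print(f"{gpt2_parameter_count():,}")  # 124,439,808
```

With these defaults the total comes to about 124 million, matching the figure given above. A smaller vocabulary would lower the total accordingly.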
A vital requirement of baby language models is that they must be trained on data comparable in size to the input that children receive in the first years of their lives. The ‘strict-small’ version of the model [21] that we used was trained on a corpus consisting of approximately 10 million words (corresponding to 1,244,397 sentences). This is roughly the amount of input that children receive over a period of two to five years, depending on the talkativeness of their caregivers [25].
For our model to be viable, it must recognize that nobody and nothing negate sentences, a property shared by not. In order to encode this fact, we added the symbol N* to the beginning of each sentence that contained a negator (not, nobody, nothing, etc.). (Interestingly, a version of our corpus that lacked N* indicators also yielded results that favored negative concord.) As noted previously, the corpus contains 126,611 negative sentences, of which just 1129 (about 1%) included a negative pronoun, a ratio that reflects the overwhelming predominance of not. There were no instances of negative concord or any other type of double negation in the corpus.
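The tagging step described above might be sketched as follows. This is a minimal illustration rather than the preprocessing actually used. The negator list, the handling of contracted n’t, and the exact marker format (N* followed by a space) are all assumptions, since the text specifies only "not, nobody, nothing, etc."

```python
import re

# Assumed negator inventory; the text lists only "not, nobody, nothing, etc."
NEGATORS = ("not", "nobody", "nothing", "no one", "none")

# Match whole-word negators, or the contracted form n't (straight or curly apostrophe).
_NEGATOR_RE = re.compile(
    r"\b(?:" + "|".join(re.escape(word) for word in NEGATORS) + r")\b|n['’]t\b",
    re.IGNORECASE,
)


def tag_negative(sentence: str, marker: str = "N*") -> str:
    """Prefix the sentence with the marker if it contains a negator."""
    if _NEGATOR_RE.search(sentence):
        return f"{marker} {sentence}"
    return sentence


print(tag_negative("Nobody ate the food."))            # N* Nobody ate the food.
print(tag_negative("The people didn’t eat the food."))  # N* The people didn’t eat the food.
print(tag_negative("The people ate the food."))         # The people ate the food.
```

Word-boundary matching keeps words such as "note" from being tagged by accident. Forms such as "cannot" would need to be added to the list explicitly.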

The Acceptability Test

Although our language model contains no examples of negative concord, it is still possible to determine whether it might nonetheless allow for such patterns. This is achieved by measuring perplexity—an indicator of the likelihood that a particular sequence of words might be generated at a particular point in a sentence. Perplexity is often referred to as a measure of ‘surprisal’ since it reflects the degree to which a particular string of words is expected or unexpected. A sequence of words with low perplexity is thus more natural and hence more likely to occur. In contrast, high perplexity indicates that a sequence of words is unexpected and hence more likely to be unnatural.
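To make the notion concrete, perplexity can be computed from the probabilities a model assigns to each successive word. The sketch below assumes the common definition (the exponential of the average negative log-probability). The function name and the example probabilities are hypothetical and are not taken from the model described here.

```python
import math
from typing import Sequence


def perplexity(word_probs: Sequence[float]) -> float:
    """Exponentiated average negative log-probability of a word sequence."""
    if not word_probs:
        raise ValueError("at least one probability is required")
    if any(p <= 0.0 or p > 1.0 for p in word_probs):
        raise ValueError("probabilities must lie in (0, 1]")
    average_nll = -sum(math.log(p) for p in word_probs) / len(word_probs)
    return math.exp(average_nll)


# Hypothetical per-word probabilities for two three-word strings.
expected_string = [0.5, 0.5, 0.5]
surprising_string = [0.5, 0.01, 0.5]

print(round(perplexity(expected_string), 2))    # 2.0
print(round(perplexity(surprising_string), 2))  # 7.37
```

A single improbable word is enough to raise the score sharply, which is why lower perplexity is read as a sign that a string is more expected.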
Although perplexity does not measure accuracy per se, it has been shown to correlate well with a model’s performance as it generates sentences.
Language models are already trained to give us probabilities for strings [of words]. So we can just ask, do they assign higher probability to a grammatical sentence than an ungrammatical one?
[24]
The use of perplexity as a measure of acceptability works best when applied to pairs of sentences that differ from each other in a minimal way, in accordance with the protocol followed in the Benchmark of Linguistic Minimal Pairs for English (BLiMP) (see [26]).
In the case at hand, three contrasts were created, as exemplified below. The first two turn on the presence or absence of not; the third compares a negative pronoun with a negative polarity pronoun in a sentence that contains not.
(1) A subject negative pronoun with and without not:
Nobody didn’t laugh. (negative concord)
Nobody laughed. (stand-alone negative pronoun)
(2) A direct object negative pronoun with and without not:
The boy doesn’t like nobody. (negative concord)
The boy likes nobody. (stand-alone negative pronoun)
(3) A direct object negative pronoun versus a direct object negative polarity pronoun:
The man didn’t wash nothing. (negative concord)
The man didn’t wash anything. (negative polarity)
Each contrast consisted of ten sentence pairs, whose perplexity scores were calculated using the formula below and then compared.
PPL = perplexity; N is the total number of words in the test sentence; and p(w_i) is the probability assigned by the model to the i-th word.
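A standard form of the perplexity formula that is consistent with the symbols just defined can be written as follows. Whether this exact variant was used (for example, normalizing by words rather than sub-word tokens) is an assumption.

```latex
\[
\mathrm{PPL} \;=\; \exp\!\left( -\frac{1}{N} \sum_{i=1}^{N} \log p(w_i) \right)
\]
```

In an autoregressive model such as the one described here, p(w_i) is normally the probability of the i-th word given the words that precede it.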
In the case of the first two comparisons (1 and 2), biuniqueness predicts a lower degree of perplexity for negative concord than for the stand-alone negative pronouns typical of standard adult English. This is a surprising prediction, but it follows straightforwardly from the dominance of not in negative sentences encountered by language learners—increasing the likelihood of its overuse. Table 1 summarizes our findings.
As predicted, these results reveal a low level of perplexity for both types of negative concord, with means of 128.90 for subject patterns and 216.64 for direct object patterns, even though they were not instantiated in the training set on which the language model was built. In contrast, patterns containing a negative pronoun without not (supposedly the norm in standard English) manifested levels of perplexity that were more than 10 times higher in the case of subject patterns (1850.57) and more than 50 times higher in the case of direct object patterns (12,608.50). The magnitude of these differences, together with the non-overlapping 95% CIs in Table 1, indicates that the observed effect is unlikely to be due to random variation. These results suggest that the language model strongly favors negative sentences that include not, consistent with biuniqueness.
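The size of the two contrasts can be verified directly from the means in Table 1. The short sketch below uses only the reported values; the dictionary layout and function name are hypothetical.

```python
# Mean perplexity scores as reported in Table 1.
TABLE_1_MEANS = {
    "subject": {"negative_concord": 128.90, "stand_alone": 1850.57},
    "direct_object": {"negative_concord": 216.64, "stand_alone": 12608.50},
}


def perplexity_ratio(stand_alone: float, negative_concord: float) -> float:
    """How many times higher the stand-alone mean is than the negative concord mean."""
    return stand_alone / negative_concord


for pattern, means in TABLE_1_MEANS.items():
    ratio = perplexity_ratio(means["stand_alone"], means["negative_concord"])
    print(f"{pattern}: {ratio:.1f}x")
# subject: 14.4x
# direct_object: 58.2x
```

Both ratios are consistent with the "more than 10 times" and "more than 50 times" characterizations given in the text.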
Now let us consider the third comparison (3), which involves two types of negative sentences containing not—a negative concord pattern (The man didn’t wash nothing) and a negative polarity pattern in which the direct object is an indefinite pronoun that must be licensed by a negative elsewhere in the sentence (The man didn’t wash anything). Since both sentence types contain a clausal negative in compliance with biuniqueness, we predict that, all other things being equal, they should have roughly comparable levels of perplexity. Moreover, we expect those levels to be far below those in which the direct object is a negative pronoun without not (The man washed nothing), even though that is the accepted pattern in standard English. The relevant comparisons are reported in Table 2.
As predicted, the negative concord pattern and the negative polarity pattern fall within roughly the same range, as indicated by their comparable means (216.64 versus 118.48) and overlapping 95% CIs. In contrast, the stand-alone negative shows a substantially higher mean perplexity (12,608.50), and its CI overlaps with neither of the other two, suggesting a marked difference in model performance on this pattern. These results fit well with the biuniqueness hypothesis, which calls for the presence of not in sentences expressing negation.
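The overlap claims can be checked mechanically against the intervals reported in Table 2. The sketch below is illustrative only: the names are hypothetical, and interval overlap is a rough heuristic rather than a formal significance test.

```python
from typing import Tuple

Interval = Tuple[float, float]

# 95% confidence intervals as reported in Table 2.
TABLE_2_CIS = {
    "negative_concord": (135.27, 318.81),
    "negative_polarity": (85.51, 153.06),
    "stand_alone": (7677.16, 17442.11),
}


def intervals_overlap(a: Interval, b: Interval) -> bool:
    """True if the closed intervals a and b share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


concord = TABLE_2_CIS["negative_concord"]
polarity = TABLE_2_CIS["negative_polarity"]
stand_alone = TABLE_2_CIS["stand_alone"]

print(intervals_overlap(concord, polarity))      # True
print(intervals_overlap(concord, stand_alone))   # False
print(intervals_overlap(polarity, stand_alone))  # False
```

The two patterns containing not share part of their range, while the stand-alone pattern sits well above both.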

5. Concluding Remarks

The use of negative concord by English-speaking children who do not encounter such constructions in their environment constitutes a mystery well worth exploring. The particular approach that we have adopted took as its starting point a cognitive principle (biuniqueness) that favors one-to-one associations between form and meaning, such as the relationship between not and negation. We then proceeded to test the plausibility of this hypothesis by examining a language model that manifests the two key properties also attributed to children:
  • Its input contained no instances of negative concord.
  • It nonetheless generates negative concord patterns that manifest low rates of perplexity, consistent with grammatical acceptability.
As we have seen, the results of this experiment align well with the biuniqueness hypothesis put forward in §3. The high correlation between not and negation in the language model offers a plausible explanation for the preference for negative concord, both in the model itself and in human language learners. But now a new question arises: why should children and language models behave alike?
The answer may be quite simple: human children and language models rely on computational systems that are sensitive to statistical trends in natural language. As we have seen, those trends include the likelihood that plural nouns carry the suffix -s and the likelihood that negation is accompanied by the negator not. These correlations are themselves the product of deeper forces, such as biuniqueness, which seek to facilitate the acquisition and use of language by minimizing the effort required to link form to meaning.
The next step, obviously, is to consider how children come to abandon the negative-concord strategy, as they invariably do if they are learning standard English. If we are on the right track, this adjustment can happen only upon ample exposure to sentences that contain negative pronouns without an accompanying not, such as Nobody is there and They did nothing wrong. Evidently, the training set used for the language model employed here does not provide sufficient input of this type, hence its strong preference for negative concord, as reflected in perplexity measures. Put simply, exposure to 1129 instances of sentences containing a stand-alone negative pronoun does not suffice to override the appearance of not in more than 100 times as many other negative sentences.
At what point does the pendulum swing in the other direction, licensing the use of stand-alone negative pronouns that ultimately push the more complex negative concord patterns to the side? Variation is expected since the particular input available to individual children will differ. Infrequent exposure to negative pronouns could well extend the negative concord stage, robbing children of opportunities to hear sentences in which nobody and nothing occur without not. On the other hand, early and frequent exposure to stand-alone negative pronouns could shorten the negative concord stage, perhaps even circumventing it entirely if the input is sufficiently rich.
Further inquiry into the precocious emergence of negative concord and its eventual suppression is thus called for. Crucially, that investigation will require detailed information about input and reliable measures of acceptability that can identify the point at which a maturational change takes place. As we have seen, these are precisely the tasks for which language models are well suited.

Author Contributions

Conceptualization, W.O., M.L. and H.Z.; methodology, H.Z., W.O. and M.L.; software, H.Z.; formal analysis, W.O. and M.L.; resources, H.Z. and W.O.; data curation, H.Z.; writing—original draft preparation, W.O.; writing—review and editing, W.O. and M.L. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the Ministry of Education of the Republic of Korea and the National Research Foundation of Korea (NRF2019S1A5A2A03053390).

Data Availability Statement

All data is publicly available at the site noted in the text.

Acknowledgments

We acknowledge with gratitude helpful comments by two reviewers for Information and by audience members at the Japanese–Korean Linguistics Conference, held at Cornell University in June of 2025.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Ibbotson, P. The scope of usage-based theory. Front. Psychol. 2013, 4, 255.
  2. Lieven, E. Developing constructions. Cogn. Linguist. 2009, 20, 191–199.
  3. Crain, S.; Thornton, R. Investigations in Universal Grammar: A Guide to Experiments on the Acquisition of Syntax and Semantics; MIT Press: Cambridge, MA, USA, 1998.
  4. Crain, S.; Goro, T.; Thornton, R. Language acquisition is language change. J. Psycholinguist. Res. 2006, 35, 31–49.
  5. Mahowald, K. Psycholinguistic Insights from Small Language Models. HSP Online Seminar, 4 April 2025. Available online: https://stanford.zoom.us/rec/share/T3nJqGOjPyJ7m6t2mmtnIUOrnF9_pet3jHGtGU6qSyCBcOiifd-IE1OrSb4B7JhD.H6taAZJzQrzOKq_K (accessed on 12 June 2025).
  6. Linzen, T. How can we accelerate progress towards human-like linguistic generalization? In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, Stroudsburg, PA, USA, 5 July 2020; pp. 5210–5217.
  7. Haspelmath, M. Negative Indefinite Pronouns and Predicate Negation. In The World Atlas of Language Structures Online, Chapter 115; Dryer, M., Haspelmath, M., Eds.; Oxford University Press: Oxford, UK, 2013; Available online: https://wals.info/chapter/115 (accessed on 14 August 2025).
  8. Zeijlstra, H. Sentential Negation and Negative Concord; Landelijke Onderzoekschool Taalwetenschap: Utrecht, The Netherlands, 2004.
  9. Gugán, K. Zigzagging in language history: Negation and negative concord in Hungarian. Finno-Ugric Lang. Linguist. 2012, 1, 89–97.
  10. Kamali, B.; Zeijlstra, H. Negative dependencies in Turkish. Languages 2024, 9, 342.
  11. Robinson, M.; Thoms, G. On the syntax of English variable negative concord. Univ. Pa. Work. Pap. Linguist. 2021, 27, 24.
  12. Bellugi, U. The Acquisition of the System of Negation in Children’s Speech. Ph.D. Dissertation, Harvard University, Cambridge, MA, USA, 1967.
  13. Brown, R. A First Language: The Early Stages; Harvard University Press: Cambridge, MA, USA, 1973.
  14. Thornton, R.; Notley, A.; Moscati, V.; Crain, S. Two negations for the price of one. Glossa 2016, 1, 45.
  15. Hein, J.; Bill, C.; Driemel, I.; Gonzalez, A.; Ilić, I.; Jeretič, P. Negative concord in the acquisition of English and German: Some results from a corpus study. Proc. Chic. Linguist. Soc. 2022, 58, 167–182. Available online: https://johannes-hein.de/documents/CLS58_Proceedings_paper.pdf (accessed on 14 August 2025).
  16. Bates, E.; MacWhinney, B. Competition, variation, and language learning. In Mechanisms of Language Acquisition; MacWhinney, B., Ed.; Lawrence Erlbaum Associates: Hillsdale, NJ, USA, 1987; pp. 157–194.
  17. Dressler, W. Naturalness. In Morphologie: Ein Internationales Handbuch zur Flexion und Wortbildung; Booij, G., Lehmann, C., Mugdan, J., Eds.; de Gruyter: Berlin, Germany, 2000; Volume 1, pp. 288–296.
  18. Slobin, D. Crosslinguistic evidence for the language-making capacity. In The Crosslinguistic Study of Language Acquisition. Vol. 2: Theoretical Issues; Slobin, D., Ed.; Erlbaum: Hillsdale, NJ, USA, 1985; pp. 1157–1256.
  19. Graves, M.; Koziol, S. Noun plural development in primary grade children. Child Dev. 1971, 42, 1165–1173.
  20. Keffer, S. Teaching Toddlers Plural Nouns: Expressive Lesson 18. Toddler Talk. 2025. Available online: https://toddlertalk.com/blog/plural-nouns (accessed on 17 August 2025).
  21. Warstadt, A.; Mueller, A.; Choshen, L.; Wilcox, E.; Zhuang, C.; Ciro, C.; Mosquera, R.; Paranjape, B.; Williams, A.; Linzen, T.; et al. Findings of the BabyLM challenge: Sample-efficient pretraining on a developmentally plausible corpus. In Proceedings of the 27th Conference on Computational Natural Language Learning: Volume 2: The BabyLM Challenge, Singapore, 6–7 December 2023; pp. 1–34.
  22. Quick, N.A.; Erickson, K.; Mccright, J. The most frequently used words: Comparing child-directed speech and young children’s speech to inform vocabulary selection for aided input. Augment. Altern. Commun. 2019, 35, 120–131.
  23. Beukelman, D.; Jones, R.; Rowan, M. Frequency of word usage by nondisabled peers in integrated preschool classroom. Augment. Altern. Commun. 1989, 5, 243–248.
  24. Warstadt, A. Why It Matters that Babies and Language Models are the Only Known Language Learners. Talk Presented at the Workshop on LLMs, Cognitive Science, Linguistics, and Neuroscience. Simons Institute, 4 February 2025. Available online: https://www.youtube.com/watch?v=Oyf-sS2oEvo (accessed on 14 August 2025).
  25. Gilkerson, J.; Richards, J.; Warren, S.; Montgomery, J.; Greenwood, C.; Oller, D.K.; Hansen, J.; Paul, T. Mapping the early language environment using all-day recordings and automated analysis. Am. J. Speech-Lang. Pathol. 2017, 26, 248–265.
  26. Warstadt, A.; Parrish, A.; Liu, H.; Mohananey, A.; Peng, W.; Wang, S.-F.; Bowman, S. BLiMP: The benchmark of linguistic minimal pairs for English. Trans. Assoc. Comput. Linguist. 2020, 8, 377–392.
Table 1. Mean perplexity scores with 95% confidence intervals (CIs) for negative concord versus stand-alone negative pronouns.
Pattern                       Negative Concord            Negative Pronoun Without ‘not’
(1) Subject patterns          128.90 (80.24, 193.57)      1850.57 (642.24, 3644.96)
(2) Direct object patterns    216.64 (133.79, 316.56)     12,608.50 (7657.53, 17,340.39)
Table 2. Mean perplexity scores with 95% confidence intervals (CIs) for negative concord versus negative polarity.
Negative Concord            Negative Polarity          Stand-Alone Negative
216.64 (135.27, 318.81)     118.48 (85.51, 153.06)     12,608.50 (7677.16, 17,442.11)

Share and Cite

MDPI and ACS Style

O’Grady, W.; Zhang, H.; Lee, M. The Negative Concord Mystery: Insights from a Language Model. Information 2025, 16, 710. https://doi.org/10.3390/info16080710

AMA Style

O’Grady W, Zhang H, Lee M. The Negative Concord Mystery: Insights from a Language Model. Information. 2025; 16(8):710. https://doi.org/10.3390/info16080710

Chicago/Turabian Style

O’Grady, William, Haopeng Zhang, and Miseon Lee. 2025. "The Negative Concord Mystery: Insights from a Language Model" Information 16, no. 8: 710. https://doi.org/10.3390/info16080710

APA Style

O’Grady, W., Zhang, H., & Lee, M. (2025). The Negative Concord Mystery: Insights from a Language Model. Information, 16(8), 710. https://doi.org/10.3390/info16080710
