Review

The Origin of Human Theory-of-Mind

Department of Philosophy, Logic and Philosophy of Science, University of Sevilla, 41018 Sevilla, Spain
Submission received: 5 November 2024 / Revised: 20 January 2025 / Accepted: 4 February 2025 / Published: 12 February 2025
(This article belongs to the Special Issue Feature Papers Defining Humans)

Abstract

Is there a qualitative difference between apes’ and humans’ ‘ability to estimate others’ mental states’, a.k.a. ‘Theory-of-Mind’? After opting for the idea that expectations are empty profiles that recognize a particular content when it arrives, I apply the same description to ‘vicarious expectations’—very probably present in apes. Thus, (empty) vicarious expectations and one’s (full) contents are distinguished without needing meta-representation. Then, I propose: First, vicarious expectations are enough to support apes’ Theory-of-Mind (including ‘spontaneous altruism’). Second, since vicarious expectations require a profile previously built in the subject that activates them, this subject cannot activate any vicarious expectation of mental states that are intrinsically impossible for him. Third, your mental states that think of me as a distal individual are intrinsically impossible states for me, and therefore, to estimate them, I must estimate your mental contents. This ability (the original nucleus of the human Theory-of-Mind) is essential in the human lifestyle. It is involved in unpleasant and pleasant self-conscious emotions, which respectively contribute to ‘social order’ and to cultural innovations. More basically, it makes possible human (prelinguistic or linguistic) communication, since it originally made possible the understanding of others’ mental states as states that are addressed to me, and that are therefore impossible for me.

1. Introduction

This article will propose that apes’ Theory-of-Mind (ToM) is supported by vicarious expectations and that these, like any other expectation, are—let’s put it this metaphorical way—empty profiles that will recognize a particular content when it arrives. Thus, vicarious expectations, since they are empty profiles, can be automatically separated from the subject’s own (full) mental contents. By contrast, in the human ToM, the subject estimates foreign (i.e., others’) contents, which need some meta-representational resource that separates them from the subject’s own contents. After having described in this way the contrast between apes’ and uniquely human ToM, I must try to answer the following question: For what function was the estimation of foreign contents—that is, the costly duality of one’s own (full) content and foreign (full) content—originally advantageous?
If it is accepted that vicarious expectations require a previous empty profile in the subject that activates them, then it must be also accepted that such expectations cannot correspond to states which are intrinsically impossible for the subject. Thus, I propose that the ability to estimate foreign contents originally arose when in the human lifestyle mental states that were intrinsically impossible for the subject needed to be thought. But here it is necessary to pause very briefly to deal with this lifestyle.
The new—human—lifestyle, which is the key to the co-evolution of genes/culture, can be characterized by two features. (i) A ‘cultural’ feature. (ii) A ‘social’ one.
(i) An increasing technology: This would have needed some degree of teaching (Gärdenfors, 2022; Laland, 2017; Tatone & Csibra, 2015), or, at least, parental approval/disapproval (Castro & Toro, 2004), and, therefore, some increase in communication. But the technological increase also needs self-control, not only to acquire technological skills, but also, above all, to surpass previous cultural products and, later, to support creative innovations (which are the essential factor in achieving cultural advances).1
(ii) A high degree and wide span of collaboration and ‘partner choice’: This would have required increasing communication (Mussavifard & Csibra, 2023), and also (since there is “competition to be chosen as a partner in cooperative ventures”; Baumard et al., 2013) self-control that “refrains from blatantly selfish actions” (Baumard et al., 2013).
Returning to our thread, we must ask ourselves why actions intrinsically impossible for the subject needed to be thought of in the new, human lifestyle. Self-conscious emotions (if we opt for the idea that they are originally based on an interpersonal relationship, not on an innate moral core) are advantageous because they provide the self-control necessary to care for one’s own reputation. In addition, the subject who experiences those emotions ‘thinks what others think of him’ (of him as a distal, foreign individual), and, therefore, he thinks a foreign mental state which, being impossible for him in any circumstances, is not graspable through vicarious expectations.
But, to get to the origin of the matter, let us focus on a basic question—how does the human subject think originally what others think of him? Thus, we will study the new communicative reception (not production, at the very beginning) that distinguishes human—even prelinguistic—communication from that of chimpanzees. The human addressee must think of a foreign mental state as a state addressed to him. By contrast, apes—I propose—can think of a foreign mental state (not a content, but a vicarious expectation) only if this mental state is not addressed to them, and they can understand that a message is addressed to them only if there is no need to estimate foreign (i.e., the producer’s) mental states.
After proposing the double identification of ‘apes’ Theory-of-Mind/vicarious expectations’, and ‘human Theory-of-Mind/foreign mental contents’, I will add two clarifications. Firstly, the strict condition for the very origin of ‘foreign mental contents’ (that is, the strict requirement that the mental states that must be thought are impossible for the subject under any circumstances) is not necessary for the subsequent development of human Theory-of-Mind. Secondly, it is convenient to focus in a more detailed way on the two receptions—by apes and by humans—of pointing gestures.
Section 2 briefly exposes the old descriptions (around the year 2003) of the primitive and the advanced Theory-of-Mind and then presents the recent changes. Next, it focuses on three articles—M. Tomasello (2018), Southgate (2020), Lurz et al. (2022)—that attempt to accommodate the new data regarding the abilities of Theory-of-Mind in infants or apes without having to dismiss the qualitative separation of the two modes of the Theory-of-Mind. I share those authors’ goals, but I do not agree with their proposals. Section 3, after highlighting the lack of consensus regarding the format of radically non-linguistic ‘expectations’, and after facing the mentalese (in Section “Does the ‘Language of Thought’ Exist?”), chooses to call them ‘well-defined, empty profiles’. Such emptiness, which gets in any animal the automatic separation between goals and perceptions, can also be applied—I propose—to a special type of expectations, the ‘vicarious expectations’. These special expectations, very probably present in humans and apes, are processed as ‘belonging to the other’ through the simplest way of the two proposed by Ereira et al. (2018), i.e., “through an encoding of agent identity intrinsic” to them. The nuclear Section 4—or rather Section 4.2—proposes that the estimation of foreign contents originally arose when mental actions intrinsically impossible for the subject needed to be thought of. Section 5 focuses on self-conscious emotions, which are essential in ‘the new—human—lifestyle’ (Section 5.1) and require the ability to estimate foreign mental contents (Section 5.2). Section 6 specifies that the (above-mentioned) strict conditions for the evolutionary emergence of ‘foreign mental contents’ are not necessary for the subsequent (ontogenetic and historic) functions of the ‘second line’ of mental contents. 
Section 7 proposes that the really effective (I will call it ‘unified’) reception of pointing gestures requires the estimation of the producer’s mental content, and in such sense is similar to the reception of gestures and gazes that cause the addressee self-conscious emotions, and similar to the dialogic nucleus of any linguistic reception. Section 8 provides a general outline and summarizes the article in the sense of listing all its hypotheses and suggestions without distinguishing between the main one and those that were at the service of this. In fact, the order followed there aspires to be the evolutionary order in which the different capacities would have arisen. The outline does not, of course, cite any bibliography, nor does it accompany each idea with the qualifications ‘according to my proposal’ or ‘I propose’, which were obligatory in the other sections. Finally, Section 9 deals with the testability of the proposals.

2. The Theory-of-Mind from 2003 Until Now

2.1. A Very Brief Summary

For the authors who accepted Theory-of-Mind around 2003, its primitive mode was the ability to know what the other sees (/does not see)—or has (/has not) seen immediately before. This ability is possessed not only by children much younger than four years but also (as M. Tomasello et al., 2003 showed) by chimpanzees. These results, soon extended to goats or ravens (see Bugnyar & Heinrich, 2005; Bugnyar et al., 2016), were explained by a very simple mechanism, namely, that the subject both tracks a line from the (visually or acoustically) perceived location of a conspecific to the relevant object and is aware of the (possible) opaque barriers obstructing that straight line.
The advanced Theory-of-Mind was linked to the ability to attribute ‘false beliefs’ to others. The early tests of ‘false belief’ show a video in which a child (Maxi) puts his marble inside a vase and then leaves; afterward, his mother puts the marble inside his toy box and leaves. Right then, Maxi comes back, and the experimenter asks the children who have seen the video, ‘Where will Maxi look for his marble?’ The answers coming from children under 4 do not show the false belief that Maxi is bound to have, but their own knowledge. Within this general framework, the implicit knowledge of somebody else’s false beliefs in some 3-year-olds that gave the wrong explicit answer (Clements & Perner, 1994) did not seem to disturb the mentioned descriptions of the two modes of Theory-of-Mind.
But nowadays, there is new data. Let’s begin by attending to Karg et al. (2015), which shows that apes’ ability to estimate what the other sees (or does not see) goes well beyond its old description. These new experiments investigated whether chimpanzees could use self-experience to infer what another sees. Subjects first gained self-experience with the visual properties of an object (either opaque or see-through). In a subsequent test phase, a human agent interacted with the object, and the authors tested whether chimpanzees understood that the experimenter experienced the object as opaque or as see-through. Crucially, in the test phase, the object seemed opaque to the subjects in all cases (while the experimenter could see through the one that they had experienced as see-through before). Therefore, the chimpanzees had to use their previous self-experience with the object to correctly infer whether the experimenter could or could not see when looking at the object. Chimpanzees in a competitive context (that is, when they were sufficiently motivated) successfully used their self-experience to infer what the competitor saw.
This experimental design is an ‘ecological’ one. Let’s think of an ape who must estimate whether his peer sees the immobile object that he, the ape-subject, “has previously seen” (Karg et al., 2016). In the wild, it is probable that the ape-subject must estimate whether the foliage prevents the peer from seeing the object, but note that, since apes can often find themselves at different heights from each other, ‘the possible foliage that might—or not—prevent the peer from seeing the object’ is often hidden from the ape-subject’s eyes. To make such an estimation, he certainly could move. However, apes (heavy and lacking wings) would take too long to reach a location that would allow them to see their peer’s visual field. This problem was—I suggest—solved in apes’ evolution by vicarious expectations (in addition to the subject’s prior knowledge of the area).
There is also news regarding false-belief tasks. More concretely, since Onishi and Baillargeon (2005), numerous results in non-verbal tests have been offered in favor of the estimation of false beliefs by infants. That type of test was later applied to great apes, who achieved not very different results—Krupenye et al. (2016), Kano et al. (2017). But, since the rate of success in prelinguistic children, and even more in apes, is smaller and more variable than the rate obtained in verbal tests, we must ask: Are those successes in non-verbal tests based on the same resource that supports traditional tests?

2.2. Discussing Some Proposals About the Difference Between the Primitive and Advanced Theory-of-Mind

Before moving on to my proposal, let’s see that some articles try to separate the new data from what is achieved by the advanced Theory-of-Mind. We will focus on M. Tomasello (2018), Southgate (2020), and Lurz et al. (2022), which goes in a different direction. I am close to their goal, but not to their proposals.
According to M. Tomasello (2018), the infant grasps others’ beliefs because he “disregards his own (diverging) knowledge”. In my view, such a reason is not convincing, since disregarding the knowledge of the situation in which we find ourselves is at any age a non-convenient inattention. But it is also true that, as Tomasello argues, if one’s own mental content, instead of being disregarded, is simultaneously carried with somebody else’s content in one’s own mind, then the two contents must be distinguished and compared by the subject, and thus, we would be identifying the primitive mode with the advanced one—an identification which I am opposed to.
Let us look at Southgate (2020), which, being relatively like M. Tomasello (2018), is more recent and elaborate. Southgate (who, unlike Tomasello, doesn’t mention the experiments about ‘foreign false beliefs’ in apes) proposes that “human infants have an altercentric bias, which results from a combination of the value that human cognition places on others, and an absence of a competing self-perspective”, and that such bias causes events that are not co-witnessed with the protagonist of the play to be encoded with less strength. (About the altercentric bias, Southgate cites Bräten, 2004, and we could add Gallese, 2018). This is what explains, according to Southgate, infants’ successes in non-verbal tests of false belief.
I will start by saying that I very much like the idea that for infants, ‘altercentrism’ is beneficial since it helps them to know what is relevant to others. However, I reject the alleged “weakness of self-perspective” for the same reason I rejected Tomasello’s proposal that the infant “disregards his own (diverging) knowledge”. Note that typical perceptions are evolutionarily much older than altercentrism and are used at any age much more frequently. Thus, it is unlikely that the degree of conservatism that evolution necessarily includes fails there. Certainly, while infants typically pay a lot of attention to what people around them look at, they sometimes do not care—I agree—about changes in the location of an object. However, in my view, such lack of attention only appears if the object is not salient enough for the subject, and, therefore, that contrast would not be a consequence of ‘altercentrism in the strong sense’ (i.e., ‘weakness of self-perspective’).
Let’s also focus on Lurz et al. (2022). This article—quite different from Tomasello’s or Southgate’s—proposes that apes’ success can be explained in “a simple way: Apes don’t use meta-representations, but they merely simulate (/imagine) to believe what the other agent believes”. But note that this simulated (/imagined) belief or “low-level simulation” (as Lurz et al. say) requires dealing with two contents about the same thing and distinguishing each from the other. Thus, this task, as implicit as it may be, is not “a really simpler model”, as these authors defend, but is still a meta-representation.2
While I reject Tomasello’s and Southgate’s idea that infants and apes “disregard their own diverging knowledge”, I accept that the union of “inattention to one’s own mental states” and “attention to somebody else’s mental states” characterizes the primitive Theory-of-Mind. But I will propose that such inattention and such attention take place, not at the content level, but at the expectation level.

3. Expectations and Vicarious Expectations

After having criticized those three articles, can we keep the idea of a qualitative difference between apes’ and humans’ Theories-of-Mind? Let’s focus on Karg et al. (2015), which, as said above, shows that chimpanzees can use self-experience to infer what another sees. Probably, on the one hand, they activate their own expectations about what they would see if they were in the same location and circumstances as their observed peer, but, on the other hand, they process such expectations as belonging to the peer. These would be expectations of a special—vicarious—kind. But what exactly does a vicarious expectation consist of?

3.1. Expectations in General

Let us begin by attending to expectations in general. These, mainly since Bar (2007), are more often called ‘predictions’ (Latin prae-dictio: said or evoked in advance), a term that I don’t like to use for non-human animals because of the view presented in the next lines. I borrow ‘innate or learned expectations’ from Lorenz (1966).3
General expectations, mainly the goals, are a vital resource to guide behavior and—as ‘teaching mechanisms’—also learning in any animal. The matter is how expectations act in radically prelinguistic minds (and possibly also in our most spontaneous mental processes), while expected things are absent. Probably, instead of proposing that the animal agent has a simulation (or evocation, or off-line copy) of expected ‘things/events’, it could be helpful to understand such ‘presence of absent elements’ in a less demanding, non-evocational mode.4 Therefore, we could describe them as well-defined but empty profiles hierarchically arranged according to their lesser or greater degree of dependency on learning. These empty profiles can recognize the appropriate content when it arrives.
Okasha (2022), like other researchers, claims that ‘the mental representations of goal’ in avian and mammal species are objective facts, and he justifies such claim “on grounds of (their) evolutionary continuity and neuro-physiological similarity (with humans)”. But I, doubting that those “grounds” are enough of a guarantee, suggest the following alternative. It was the very beginning of the new lifestyle—that is, the initial strong increase of cooperation and communication—that made more and more advantageous a less empty representation of goals: Individuals needed to communicate their displaced goals to their group so that the group could cooperate towards reaching those goals.5 This need for producing and understanding such communications probably caused such advance. In short, ‘well-defined, empty profiles’, which had been sufficient in the old lifestyle, no longer were. But this whole view is opposed to Fodor (2007), i.e., to ‘language-of-thought’ or ‘innate mentalese’. Therefore, now I must focus on the contrast between this and my underlining of the crucial role of communication. In this way, I am close to the goal pursued, for example, by Fedorenko et al. (2024), but not to their way of dealing with the relationship between language and thought. (The next sub-subsection offers some proposals about language, its origin, and cognitive consequences. However, it is only later that I will describe my nuclear proposal, that is, ‘the new—even prelinguistic—communicative reception’, which is the point where primitive Theory-of-Mind had to be transformed).

Does the ‘Language of Thought’ Exist?

Fodor (1975) postulated that an innate ‘language of thought’ (discrete symbols, and syntax) supports perceptions (even if “these, unlike discursive representations, lack canonical decomposition”, as Fodor, 2007 adds) and makes evocations possible. Certainly, this idea constituted a root of artificial intelligence, and this fact explains its current revival. However, I lean towards rejecting the existence of the innate ‘language of thought’. More concretely, in some works, I have opposed, not only innate syntax, but also innate semantics, since our semantics is indelibly shaped by syntax: See Bejarano (2008, 2010), and Bejarano (2011) (Chapters 10–16). Without syntax, there aren’t nouns/verbs/adjectives, etc.: There are not even nouns—my proposal insists against a deep-rooted idea that influences Hurford (2007), for example.6
Fodor’s theory, although not focusing on the origin of human cognition and language, closely channels the hypotheses about such origin. Therefore, facing that theory requires facing its potential for derivations. Hence this section is going to be longer than what one might think at first glance is appropriate.
S. Phillips (2024), trying to explain language-of-thought, proposes that “(perceptual) data are projected onto a base (conceptual) space in one direction, and in the opposite direction, these data are referenced by that space”. I agree, of course, that the elements (objects, qualities, relations) of a perception are recognized by the individual who perceives it. However, in my view, no independent element is used in genuinely prelinguistic perceptions. That is, while in linguistic understanding, each of the meanings receives independent attention before they are integrated into the total meaning, in those perceptions, on the contrary, such attentional, non-subpersonal independence of each relevant element would consume time, and therefore, far from solving any problem, would be a detrimental feature. (Certainly, we, using other human abilities, can attend slowly and long to any perception—let alone an artistic painting. However, perception evolved for survival in a world where rapid response is crucial).7 In addition, I think that in preverbal human infants and in animals, the so-called ‘logical reasoning’ or ‘intuitive logical reasoning’ requires neither decomposition nor compositional explicitation. Or if (as Durdevic & Call, 2022 propose) “deductive reasoning, rather than relational or belief reasoning, is so far the best candidate for a human-unique derived cognitive ability”, it is because deductive reasoning requires syntax and syntactic semantics.
But if I reject, not only the innate language-of-thought but also its innate ‘semantics’, then I must try to give an alternative account of the emergence of language. Before that emergence, there were—I propose—pre-syntactic (that is, holophrastic) ‘requests for a certain material’ or ‘calls to a certain individual’, which would use pre-words, i.e., meanings always linked to conative function and conative intonation. These holophrases could sometimes (when the individual was absent or the material was gone) reveal the speaker’s false or outdated beliefs to the listener, and, therefore, provoke the theme/rheme composition, which corrects, completes or updates those beliefs (and is ‘meta-communicative’ in the minimal meaning of Dingemanse & Enfield, 2023).
The emergence of this pre-grammatical syntax would have been helped by a new and broader intonation pattern that girds into a single unit the theme and the rheme. (This suggestion fits well with the link between intonation and semantics: “Regarding prosodic cues that correlate with distinct communicative function, the brain responds very rapidly, but not in communicative situations without semantic content”, R. Tomasello et al., 2022).8 Such intonational help—a case of the physical, pre-symbolic embodiment in human communication—probably facilitated the victory of voice over gesture—Bejarano (2014)—(an evident victory, even if gestures continue to accompany and complement vocal communication).9 Thus, complex management of the two different levels of packaging (the word level and the intonational level) became necessary. Let us focus on that.
Linguistic structure, including hierarchical structure, is “a special case of structured action” (e.g., Planer, 2023). In addition, Gallardo et al. (2023) propose: “In Broca’s area, an action-related region evolved into a bipartite system, with a posterior portion supporting action and an anterior portion supporting syntactic processes”. Could we then suggest that the structured action immediately and directly linked to the syntactic structure is the action of managing the two different levels of packaging? Let’s note that Osiurak et al. (2021) assign the “technical” dimension (not the “motor” one) of actions to Broca’s area. This hypothesis is also consistent with the theory (Corballis, 2011) that recursive skills go beyond language.
The new and broader intonation that girds into a single unit the theme and the rheme resulted in a duality of different sounds for the same sign (with the conative intonation in holophrases, and with the non-conative one in the genuine word used in pre-grammatical syntax). With this perhaps the problem arose of how to identify the same meaning in two different vocal patterns. The final solution could be the learning of articulatory-phonetic sequences, which are able to be produced with one intonation or another depending on the circumstances.
The learning of articulatory-phonetic sequences, even if it does not have to face the problem of perceptual-motor correspondence (one hears oneself), is a difficult type of imitation. Certainly, as C. Heyes (2021a) says, “I could copy a sound you make by simple trial-and-error, varying my vocal output until it matches my memory of the sounds you made”. This perfectly describes the babbling. However, note that unitary articulatory-phonetic sequences of several different steps—i.e., typical words—cannot be reproduced simultaneously with their hearing, nor can they be easily remembered in an exact way, nor, unlike bird song dialects or vocal learning in parrots, are they merely an enrichment of an innate pattern.10
In this way, the ‘super-high fidelity copying’ could perhaps arise.11 Obviously, “this type of imitation makes sense in intransitive or object-free actions” (C. Heyes, 2021b). It is a “mimicking” (M. Tomasello, 1999) of ‘conventional’ motor sequences. Regarding ‘high-fidelity copying’, I agree that it was not necessary for the early technologies (see Andersson & Tennie, 2023; Osiurak et al., 2022; Sterelny, 2023; Tennie et al., 2016). However, if, as I have just suggested, the strictest motor imitation (i.e., the super-high fidelity in articulatory-phonetic sequential imitation) was really essential for the deployment of syntactic language (and, therefore, also of ‘collaborative computation’, Dor, 2023), then such imitation is a very important cause of the human cultural advances.
Regarding a deeper link between language and those advances, I suggest that the predicative, really compositional language—beyond making communication easier—is likely to strengthen complex innovations, since these may be supported by the same cognitive resources (of decomposition and recomposition) used in syntactic language. Note that the primary cause of cultural advancement is not the ability to copy know-how (see van Leeuwen et al., 2024),12 but the ability to produce innovative solutions, mainly through creative problem-solving (although more serendipitous processes, e.g., of drift from copying error, can sometimes lead to improvement of previous results).
I have no intention of trying to substantiate the previous suggestion that ‘syntactic language, or, more precisely, its cognitive resources of decomposition and recomposition, help to support creative innovations, even in non-linguistic areas’. Anyway, I’ll bring some quotes. “Members of modern Homo sapiens can mentally combine and recombine symbols, according to rules, not only to consciously describe the world as it is but to generate new visions of it as it might be” (Tattersall, 2023). Likewise, Vyshedskiy (2022) highlights ‘the voluntary imagination component of language’. This imagination must—I would add—be used even in simple receptions of theme/rheme, since, for the typical, i.e., the non-informed addressee (vs. the atypical, perfectly informed one), the content provided by the theme doesn’t yet include the rheme, and thus, the addressee will have to imagine a new situation that he/she has not perceived. A clear example is the reception of “The blanket turned to ashes”. In addition, note that, for this communication, the real blanket (or, more exactly, ‘the blanket for the speaker’) is decomposed into two elements—firstly, ‘the addressee’s false belief about the blanket’, that is, an inadequate means to reach the producer’s communicative goal, and secondly, ‘the adequate correction or updating’. Thus, it is communicatively recomposed. This is an ability to transform others’ mental contents. Returning to the previous suggestion: Could that ability later—and more creatively—be exercised on one’s own mental contents and support difficult problem-solving? Thus, in addition to connecting—in my nuclear proposal—human Theory-of-Mind with human communication, I also suggest connecting it with creative problem-solving. What relationship does this last skill have with “human causal cognition” (whose original connection with technology is persuasively proposed by Gärdenfors & Lombard, 2020)?
But let us return to our thread. What have I achieved in this section? Having said above that the innate mentalese is incompatible with my proposals, I, unfortunately, have not offered any strong argument against it. However, I have tried to show that an alternative hypothesis may also have the potential for derivations. So, if it is accepted that the innate language-of-thought is not necessary, then expectations can be more easily described as empty profiles. (If, then: Needless to say, this article, based on data from several disciplines, uses the hypothetico-deductive method).

3.2. Vicarious Expectations

3.2.1. Can They Also Be Described as Empty Profiles?

So far, we have dealt with expectation in general, which is inseparable from any animal life, and involves extremely basic competencies (for instance, the physical understanding of the effects of gravity, or the daily exposure to the principles of causality). But what is interesting for us—what can, in my view, connect with apes’ Theory-of-Mind—is only the vicarious expectation. Thus, we must focus on the following question: Can the metaphorical description (‘well-defined but empty profile’) also be applied to vicarious expectations? Such an application seems plausible. Note, for example, that such emptiness can explain why ‘level II perspective-taking’ is absent in the primitive Theory-of-Mind.13 Rakoczy (2022) underlines in his general review this absence.

3.2.2. An Argument That Favors That Application: Primates’ Mirror-Neurons

To favor a little more the affirmative answer (that is, vicarious expectations can be described as empty profiles), I will try to show that vicarious expectations derive quite directly from a particular non-vicarious expectation. To do this, let us start looking at non-human primates again. More specifically, let us focus on macaques’ mirror-neurons.
But first I must admit that, as Reviewer 2 rightly points out, this business of mirror neurons has weakened in recent years. And this is not just because the innatism of the first cognitive revolution has generally lost its appeal. Beyond this peripheral level of weakening, there is another more nuclear one, namely, more and more differences are being found in mirror neurons between the execution and the observation of the same action: It seems that the alleged ‘mirror’ is becoming increasingly cloudy.
Why, despite that, do I decide to insert this subsection? First, because I really like the idea that a primate trait—the hand—can, by being perfectly visible to its owner, provide an initial bridge towards the estimation of another’s interiority, even if it is only the estimation of a proprioceptive-tactile sensation that another feels. And, second, because the apparent clouding of the ‘mirror’ could perhaps be the consequence of an initial erroneous interpretation of mirroring.
But let us leave these preambles and focus on the original cause of mirroring, according to Keysers and Perrett (2004). These authors pointed to a learning process. Hands are (together with the forearm) perfectly visible to their owner, who must look at them very attentively during the actions of grasping. Thus, the proprioceptive and tactile feedback of any grasping will end up being connected with the visual perception of that movement.14 This hypothesis is attractive. Thus, C. Heyes and Catmur (2022) agree that the abilities of mirror-neurons are learned “through the correlated experience of seeing and doing the same actions in the context of self-observation”.
In other paragraphs, C. Heyes and Catmur (2022) and C. Heyes (2021b) also emphasize that cultural practices—“childrearing practices that encourage adults to imitate infants and children, or the use of optical mirrors”—solve ‘the problem of visuomotor correspondence’. I accept, of course, that these cultural factors have a powerful influence on development (Essler et al., 2023). But, in my view, the (both phylogenetic and ontogenetic) origin of the visuomotor correspondence is, as Keysers & Perrett propose, the vision of one’s own hand. More specifically, in the very origin, there would be learning, though not yet cultural learning: It would depend instead on a bodily trait specific to primates.
But, after accepting the Keysers & Perrett hypothesis, we now have to focus on the activity of the already taught mirror-neurons. The point is that when the visual perception is given without the corresponding inner sensations (that is, when it is someone else’s hand), the subject must disengage from himself the hand that is in sight. That disengagement is confirmed by the results of all the rubber-hand experiments: See, e.g., Pfister et al. (2021): “A single tactile stimulus applied to the rubber hand—but not to the real hand—triggers substantial and immediate disembodiment”. But this ‘disembodiment’ (or exclusion from one’s own body) does not only concern—I propose—the hand in sight, but also the proprioceptive and tactile expectations that the observed grasping had activated in the subject, and which this subject now needs to process as ‘belonging to other’.
This may be when vicarious expectations appear for the first time in evolution; this may be the very origin of the estimation of another’s interiority, or, in other words, the most primitive form of the non-human Theory-of-Mind. In short, while it is typically emphasized that “mirror-neurons map other-related information onto self-related brain structures” (Bonini et al., 2023), I underline the later, inverse mapping: One’s own failed proprioceptive expectations become vicarious expectations automatically (see the next subsection) processed as ‘belonging to another’.
Let us now pay attention to Pomper et al. (2023). “At most time points, mirror neurons did not encode observed actions with the same code underlying action execution. However, in about 20% of neurons, there were time periods with a shared code. These time periods formed a distinct cluster and cannot be considered a product of chance”. These experimental results might fit with the proposal offered in this subsection: Note that, if mirroring is explained as vicarious empty expectation, then it is not surprising that it is different from the ‘fulfillment of expectation’ that movements provoke.
But let us continue reading Pomper et al. (2023): “We propose that mirror neurons represent the process of a goal pursuit from the observer’s viewpoint. Whether the observer’s goal pursuit, in which the other’s action goal becomes the observer’s action goal, or the other’s goal pursuit is represented remains to be clarified. In any case, it may allow the observer to use expectations associated with a goal pursuit to directly intervene in or learn from another’s action”.
I would venture to say that “a goal pursuit from the observer’s viewpoint” is relatively close to a vicarious expectation. Even the question (“Whether”) that “remains to be clarified” is similar to what we will see in the next subsection with Ereira et al. (2018). Finally, the last sentence—“In any case, it may allow the observer to use expectations associated with a goal pursuit to directly intervene in or learn from another’s action”—is unobjectionable, of course: Hebbian connections (such as those established between the two simultaneous expectations, visual and proprioceptive, during the period in which mirror neurons learn) constitute a resource that can serve animals in many different ways.
If (and only if) that proposal about the functioning of mirror-neurons and also the proposal in the previous subsection about expectations are both correct, then we could underline that, while visual/proprioceptive connection is forming in a macaque, it is still a non-vicarious expectation: It is the grip that the macaque is going to execute that activates in him the (general, non-vicarious) expectation of the two versions—visual and proprioceptive—of the adequate ‘feedback’. In this way, we could deduce the desired conclusion—i.e., vicarious expectations are directly derived from non-vicarious expectations, and, therefore, if it is accepted that this latter type is an empty profile, then the same has to be accepted with respect to vicarious expectations.
Certainly, the vicarious expectations that I propose to attribute to apes concern the entire body, not only the hand. However, this could be an almost irrelevant difference. Piaget (1954) showed that it is from hands and (since hands bring food to mouth) also from mouth that the child builds correspondences between his own body and other bodies. In addition, Errante et al. (2023) found (in human participants) that “actions-observation activates specific cortical and subcortical sectors not only during hand actions observation but also during the observation of mouth and foot actions”.
What do I finally get from all this? If vicarious expectations—instead of requiring imagined (/simulated/evoked/off-line) contents—are ‘well-defined but empty states’, then no meta-representational separation between contents and vicarious expectations is necessary, and also then, the contrast between vicarious expectations and foreign contents can support the contrast between apes’ and humans’ Theory-of-Mind. But here we must add some clarifications.

3.2.3. Some Clarifications on Vicarious Expectations

In Section 2.2, I proposed that it is the subject’s own expectation that is absent when the subject activates vicarious expectations and encodes them as ‘belonging to other’. Why do I use “absent” (instead of “disregarded”, the term applied by Tomasello to ‘the subject’s own knowledge’)? Let’s remember that behavioral activity necessarily activates expectations of goals and subgoals. Therefore, “inattention to one’s own expectations” can only take place when the subject, being behaviorally inactive, has no general expectation activated. Thus, confusion is impossible, not only between (empty) vicarious expectations and one’s own (full) mental contents but also between both types of the subject’s expectations—(absent) general expectations and (present) vicarious expectations.
My second clarification is that the so-called ‘attribution of ignorance’ in the primitive Theory-of-Mind does not require any resource different from vicarious expectation. The mere ‘absence of vicarious expectations’—when, for example, the other chimpanzee has not seen the food—can explain why in that case the (subordinate) subject goes to the food (as M. Tomasello et al., 2003, showed). This view is close to that of Barone et al. (2022), who studied early implicit measures of false belief understanding: “The results from a new ‘Ignorance’ control condition in which children largely behave like in the ‘False-Belief’ condition, suggest that the epistemic state ascription does not amount to full-fledged belief attribution. Rather, children probably merely track knowledge vs. ignorance”. In addition, basic and implicit ToM capacities seem not to be the same ones as those tapped in standard explicit false-belief tasks, since—as Poulin-Dubois et al. (2023) found—there is no stability in Theory-of-Mind skills from infancy to early childhood.
From Neuroscience, Schüler et al. (2024) say: “While the primitive Theory-of-Mind is supported by the salience network, it is the default network that supports foreign false beliefs and, more in general, the processing of internal, perceptually decoupled representations”. This is compatible with my hypothesis. Note that vicarious expectations are perceived in the body and movements of another agent, and are really salient perceptions for the behaviorally inactive subject.15
The third clarification is particularly important. If vicarious expectations are accepted, then we must accept that the self-other distinction is automatic in the primitive Theory-of-Mind. Consider Ereira et al. (2018), who worked with human adult subjects: “When another agent’s mental state is inferred, it can be identified as ‘belonging to other’ in two different ways”. One way is that “a learning signal (prediction-error or belief) is encoded in an agent-independent pattern. In this case, the learning signal and the identity of the agent to whom the signal is attributed would need to be encoded in 2 separate activity patterns”. This first way, with its meta-representational separation, would be linked to the advanced Theory-of-Mind (in Ereira et al.’s words, “to standard false belief task”). But these authors claim that, to identify mental states as ‘belonging to other’, there is another way, which operates “through an encoding of agent identity intrinsic to fundamental learning signals (my emphasis)”. This second way of self-other distinction (which in human adults is limited to the most spontaneous processes) would be, in my view, based on vicarious expectations.
Those two ways might be relevant to solve a repeatedly alleged conundrum—“the empathy-sharing conundrum, which mainly refers to the self-other differentiation that empathy entails”, Vincini (2023). In my view, the type of self-other distinction that is based on vicarious expectations does not involve any clash between self and other. This is the type that, when it is linked to ‘empathy’, intervenes in spontaneous altruism. On the contrary, the other type, when it is linked to ‘empathy’, appears, for example, when the subject receives a request that he/she feels is an obstacle to—or, in other words, as a clash with—his/her own activated goals.16
In a similar line to that of Ereira et al. (2018) (but focusing on ‘altercentrism’), Tebbe et al. (2024) report: “A highly specific neural signature of visual object processing was also present when their view was blocked and only another observer saw the object”. This, which was found in infants and adults, could perhaps indicate that the visual vicarious expectation shared the empty profile that served the subject to search for and recognize the particular object.17 The core of the experimental design of Tebbe et al. (2024) should be applied to apes. (See above the paragraphs about Karg et al.’s articles).
Let us recapitulate the previous clarifications. Vicarious expectations can include what Michael and Székely (2019) call ‘goal slippage’. In summary, the ‘slippage’ into the circumstances of the other, or the ‘disembodiment’ of expectations (i.e., the ‘exclusion from one’s own body’, Pfister et al., 2021) which the subject performs when the observed hand is a foreign one, or, as in Ereira et al. (2018), ‘the encoding of agent identity intrinsic’ to the mental state: All these terms describe ‘vicarious expectations’. I would now add that such easy slippage is abruptly interrupted (both in humans and apes) as soon as the other agent turns around and looks at the subject. We must not forget that, in a subject, vicarious expectations are incompatible, not only with behavior but also with a high probability of immediate behavioral activation.
That rupture of the easy slippage is similar to what happens when, after having imitated (copying all his turns left or right) someone who walks ahead of me, I realize that he turns around and faces me. Certainly, humans can continue such ‘bilaterally accurate’ motor imitation, but only if they start doing something different from what they were doing before (i.e., different from the mere ‘slippage’ into another location). More precisely, if I want to continue the imitation, I will have to imagine myself in a situation that is as intrinsically impossible for me as being in a different spatial relationship with myself.18 (Let’s think of the gesture of two individuals shaking each other’s right hand: Could this gesture originally involve—or try to provoke—the grasping of foreign mental contents?) All in all, the similarity of the collapse of the flowing “slippage” in the two mentioned cases—with imitation and without imitation—is clear.

4. Primitive and Advanced Theory-of-Mind

4.1. Working-Memory and Non-Verbal Tests of False Belief

Indeed, in the (so-called) ‘non-verbal tests of false belief’ there are successes (above chance), but they are quantitatively limited. Regarding non-human primates, see, for example, Berke et al. (2023, preprint). In addition, ‘replication findings’ are mixed. This is why “a large-scale multi-lab collaboration will examine whether 18–27-month-olds and adults’ anticipatory looks distinguish between knowledgeable and ignorant agents” (Schuwerk et al., 2024). About those difficulties in replicating successes, Rakoczy (2022) proposes: “There might be two classes of implicit tasks”.
How could we interpret all this? Certainly, regarding this matter, we must wait for new data. In addition, evolutionary emergence and ontogenetic development can never be identified (and even less so in our case, since “infants’ experience is already enlanguaged”—Dreon, 2024). However, when we are asking a question so difficult to answer—when we are wondering about the evolutionary origin of the human Theory-of-Mind—we should not rule out anything that can shed even a little light. Thus, I will add a few brief comments.
In my view, those non-verbal tests require—regarding just Theory-of-Mind—vicarious expectations, and do not need meta-representation of foreign contents. In other words—regarding just Theory-of-Mind, I repeat—such tests mainly depend on the primitive, easier one (even though sometimes, of course, adult humans apply the advanced Theory-of-Mind to them). However, they require other abilities beyond Theory-of-Mind. Thus, the two scenes (original and changed place) and the consequent demand on attention and especially on working memory provoke great difficulty in less motivated subjects.19 Such difficulty mainly appears if these are at the same time prelinguistic subjects. Note, please, that developmentally—and, very plausibly, also evolutionarily—the reception of multiple-word messages causes a great expansion of working memory.
Leaving all this, let’s focus on the nuclear proposal in this article. Certainly, all the other proposals or suggestions offered in the article can and should be evaluated in themselves. However, if I have included them here, it has been to support the main one.

4.2. What Made the Estimation of Foreign Mental Contents Originally Advantageous?

My proposal up to this point has been that the contrast ‘primitive vs. advanced Theory-of-Mind’ equals the contrast ‘(empty, easier) vicarious expectations vs. (full, more difficult) foreign contents’. Therefore, the following question arises: What made the estimation of foreign mental contents originally advantageous? As the reader can see, I believe that only if we explain the difference in function—and not only in features—between vicarious expectations and foreign mental content can we move forward.
I propose the following three points. First, to support the adaptive advantages provided by apes’ Theory-of-Mind, vicarious expectations are sufficient resources. Second, since vicarious and non-vicarious expectations require previous, well-defined profiles in the subject that activates them, this subject cannot activate any vicarious expectation of mental states that are impossible for him in any circumstances.20 Third, your mental state of thinking of me as a foreign, distal individual, since it is a mental state that is impossible for me, cannot be a vicarious expectation of mine, and therefore, I will only access that state if I am able to estimate foreign contents. Above I proposed that ‘the situation of being in a different spatial relationship with myself’ is impossible for me. Now, a new example—your state of interacting with me as with a foreign, distal individual—would be equally impossible for me, but much more relevant for human needs.
Thus, we can reformulate our previous question in the following way. For what function was ‘the ability of estimating the foreign mental states that involve me as a distal, foreign individual’ originally advantageous? Or, more concretely: In the new lifestyle, were there problems that such ability could solve?

5. Self-Conscious Emotions

I am now going to separate (as other researchers have done) Theory-of-Mind from ‘false belief’ to some extent, and to focus more on self-conscious emotions and related issues. “A developmental approach that focuses on a plurality of domains makes us able to generate useful insights that may not be obvious when focusing on a single domain”, A. L. Ruba et al. (2022). Or, in other words, a puzzle is more difficult if some pieces are missing. But let’s go back to work step by step.
“The thinking what others think of us” (Darwin, 1872, on blushing; my emphasis) necessarily requires, according to my proposal, the estimation of foreign mental contents, and therefore, the beginning of the advanced Theory-of-Mind. That phrase can describe, beyond blushing, also self-conscious (or “self-other-conscious”: Reddy, 2010) emotions, which are “embarrassment, shame, guilt, pride” (e.g., M. Lewis, 2000).
Is there a common neurophysiologic signature for these four emotions? We can see Piretti et al. (2023). However, these authors unfortunately did not include pride—the only pleasant self-conscious emotion—in their study.
I am opting—it is already evident—for the idea that such emotions are originally based, not on an innate moral core, but on an interpersonal relationship. Thus, in self-conscious emotions (unlike in basic emotions),21 the subject “thinks what others think of him”. Beyond Darwin’s phrase, Frith and Frith (2007) is essential: “The appropriate reception of deliberate social signals depends on the ability to take another person’s point of view. This ability is critical to reputation management, as this depends on monitoring how our own actions are perceived by others”.22 Indeed when we experience self-conscious emotions, the contents of the foreign mind become more real, more relevant for us than any other reality in our surroundings. Cf. Peeters et al. (2023): “[O]bserver-memories are often associated with events where the memorizer experienced a high degree of self-awareness, such as during public speaking. This could be explained by appealing to the context of encoding, where the relatively intense emotions guide encoding towards an observer perspective”.
In this Section, firstly, I will argue that self-conscious emotions relate to the new, human lifestyle. They are “survival circuits” (as LeDoux, 2012, 2023, describes the function of any emotion), but survival circuits of a very special type that evolved linked to the human lifestyle. Secondly, I will propose that self-conscious emotions require the estimation of somebody else’s mental contents.

5.1. Self-Conscious Emotions Are Useful in the Human Lifestyle

The new, human lifestyle is based on special cooperation and communication. Consequently, the care of one’s own reputation, and therefore, also an enhancement of self-control became crucial: Leary (2004) and Sznycer (2019). (All this did not replace “the old dynamics of social dominance, which are based on aggressive and submissive interactions”—Royo et al. (2024)—, but was added to them. Hence, prestige is associated with evolutionarily new nonverbal displays: Witkower et al., 2020). In this way, Baumard et al. (2013), who focus on “competition to be chosen as a partner in cooperative ventures”, practically identify the care of reputation with the habit of refraining from “blatantly selfish actions”.23 This refraining is certainly essential in the care of reputation. However, even in “cooperative ventures” other aspects are important—e.g., the reputation concerning good communicative abilities. In addition, beyond cooperative ventures, there are—see Crespi et al. (2022)—other “arenas of runaway social selection” where reputation is equally crucial. We must also consider that when narrative language and thereafter the (negative or positive) gossip arose, the care of reputation became more intense.24
But let us pay attention to a different usefulness of self-conscious emotions. ‘The new, human lifestyle’ also requires the “deliberate practice”—Ericsson (2002), Rossano (2003)—that is necessary to achieve any kind of cultural expertise. Here, a self-conscious emotion—pride—intervenes. Experts arouse admiration. (About the two types of admiration—for skill and for moral virtue—, see Algoe & Haidt, 2009. About admiration—vs. envy—for experts: Onu et al., 2016). Therefore, experts experience the only pleasant self-conscious emotion—pride. See Sznycer and Cohen (2021), and Sznycer et al. (2017).25 The search for those attractive rewards can support, at least in some of the admirers, prolonged, effortful acquisitions, not only of the admired level of expertise but also of a better one. This role of pride could become even stronger in “collaborative computation, which is the foundation of our cumulative cultures”—Dor (2023)—(and is very different from the so-called ‘collective mind’ that leads many animal species to, for example, efficiently organize their group movements).
In addition to providing motivation in that way, pride could exert influence in an indirect but still effective way, through the following chain: progress towards a goal leads to higher ‘self-efficacy’ and pride, and these in turn make more difficult goals be perceived as possible. A goal that is perceived as both difficult and possible (that is, a goal in Vygotsky’s zone of proximal development—Vygotsky & Cole, 1978) can improve the subject’s level. In other words, “there may be a positive relationship between difficulty and progress when self-efficacy is high”, as Thorne et al. (2023, preprint) try to confirm.
Certainly, children at first are concerned with learning by observing their parents. However, from about age 8, they switch to copying the local expert instead.26 This tendency is probably universal (Henrich & Broesch, 2011). Expertise, despite not influencing automatic imitation (Nevejans & Cracco, 2022), can cause the desire to acquire such expertise, and in that causality, “admiration is more decisive than prestige bias” (Chellappoo, 2021). In addition, let us look at experimental results by Brinums et al. (2023): “Children that were asked to imagine succeeding in the test and to focus on what they will be feeling (Emotional Condition) practiced longer than those in the Non-emotional Condition”. More in general, Shimoni et al. (2022) report that a strong link between delay of gratification and pride has been found among preschool-aged children, an age at which self-regulation abilities are still developing.
Thus, pride can, I propose, support cultural advances. Pride is a reward that subjects get when they see the admiration with which they are looked at by the group—a reward that the subject, of course, will seek to obtain again. Certainly, there are other rewards for an outstanding skill. E.g., André et al. (2023)—who do not underline the causal role of pride, or, more concretely, of its pleasant nature—focus on “reputational and material benefits to the recognized artists”. However, the pleasure that others’ admiration and consequent pride provide, being less deferred and more easily evocable than those benefits, could originally be the best resource to support the prolonged effort that an outstanding skill requires. “Regarding, for example, the learning of post-Acheulean shaped stone tools, we should be concerned to explain the hours of effort with little or no short-run return” (Spurrett, 2024). On this point, Castro and Toro (2004) and Castro et al. (2024) discuss the reward that a parental positive evaluation involves, and Sterelny & Hiscock (2024) (in their reply to Spurrett) also focus on children. All this is certainly true, but such focus is useful only to support the basic acquisition of skills, not to sustain the attempt to surpass the previous level of the group. Therefore, pride—I propose—could be an important cause of the innovations that gave rise to our cultural advances. (Mere serendipity, in my view, would have had in general only a small influence). Thus, the two features of the new, human lifestyle described above (in Introduction) would be supported by self-conscious emotions. In other words, not only are the negative self-conscious emotions partially responsible for its ‘social’ feature (as is generally admitted), but the only pleasant self-conscious emotion also has a strong influence on its ‘cultural’ feature.27
In short, self-conscious emotions support self-control, which is necessary in different aspects of the new lifestyle.28 Certainly, self-control will be bolstered later by ‘speech directed to oneself’ or, even later, by ‘inner speech’ (see Bejarano, 2022, in its Section 4), and can be put at the service of any type of goal (even the goal of exercising what I call—see previous note 16—the most demanding moral capacity). Probably, those very special types of speech originally arose when the gossip (which “gives gossipers an evolutionary advantage”, X. Pan et al., 2024) spread more and more. However, before ‘self-directed speech’ began, self-conscious emotions were crucial for the growth of self-control in humans.

5.2. Self-Conscious Emotions and the Estimation of Foreign Contents: The Two Connections Between Both Traits

Now, let us move on to the link between self-conscious emotions and the ability to estimate foreign content. I propose that if the human being can experience self-conscious emotions, it is because he is capable of imagining a situation as impossible for him as that of seeing himself as a distal, foreign element. (An earlier, more embodied version of that imagining was offered in Section 3.2.3). Thinking what others think of oneself requires the ability to estimate other people’s mental contents: Vicarious expectations would have been useless there. This is the first of the two connections mentioned in the title of this subsection.
Let’s move on to the second connection. Having opted for the idea that originally such emotions were based on an interpersonal relationship, I suggest that, very likely, such an interpersonal relationship originally occurred as a prelinguistic intentional communication, that is, as expressive ‘gestures or vocalizations’ accompanied by gazes. (I agree with, for example, Bohn et al. (2022) that the main link between the kinds of signals our human ancestors used and human language “is the interaction engine”. In general, I accept Tomasello’s claims that human uniqueness is prior to language).29 Such prelinguistic intentional communications—for example, ‘gesture or vocalization of disgust (/happy surprise) + eye contact with the addressee’—could have caused unpleasant (/pleasant) self-conscious emotions in the addressee.30
Such productions are “simultaneous multilevel communications”: Lipschits and Geva (2024) (who also underline the decisive role of the adult receiver). More concretely, in such communications, the intentional level would control and use the behavioral and even autonomic ones, i.e., those movements or expressions that originally were not intentionally communicative. This transformation of the old levels makes ‘the dissociation between expression and intentional communication’ “murky” (Warren et al., 2023), or, more concretely, there may not be any such dissociation at all in the intentionally communicative production of the great apes.31 The proposal that “in such communications, the intentional level would control and use the behavioral one” (Lipschits & Geva, 2024) is similar to ‘the recruitment view’ about the origin of great ape gestures—“Great ape gestures recruit features of their existing behavioral repertoire for communicative purposes”, Graham et al. (2024).
Certainly, the prelinguistic intentionally communicative messages that caused self-conscious emotions in the addressee stand out due to their special importance (focused on in Section 5.1) for the development of the new, human lifestyle. However, as communicative productions, they are examples just like any other within apes’ and infants’ abilities. Despite this, we need to underline such messages: Note that, while the above-cited phrase of Darwin perfectly serves, with its “of us”, to distinguish what vicarious expectations cannot do, it nevertheless ignores a basic question (how the human subject originally comes to think what others think of him), and therefore, the cited phrase can’t get us to the human communicative reception, which is an (or the?) essential root of human uniqueness. So, henceforth this subsection will focus on that root, and in this way will give a second argument in favor of the link between self-conscious emotions and the estimation of foreign contents.
As seen just above, in non-human primates the intentional control of the behavioral and autonomic levels can occur in production. Thus, in the very beginning, human communicative uniqueness only happens at the reception: In other words, according to my proposal, it is the recipient who originally needs to strive—and to estimate foreign mental contents.
This proposal (it is the recipient who originally needs to strive) may seem like a way to escape from the controversy between, on the one hand, Scott-Phillips and Heintz (2023), who agree with Grice that “the communicative producer typically intends that the recipients recognize his/her communicative intention” and, on the other hand, R. Moore (2015) or Geurts (2019), who reply that it is only to hide his/her communicative intention that the producer must strive. However, that ‘second Gricean requisite’ is not the best terrain to focus on the very origin: Note that, while Grice starts from a clear contrast between natural and non-natural signs, I propose that it is the transformation of a ‘natural’ (or rather, returning to Lipschits and Geva (2024), merely “behavioral” or even “autonomic”) sign into a communicative, ‘non-natural’ one that must be recognized by the addressee.32 This transformation can be called the “behavior of marking entities (e.g., objects and actions) as communicative” (Mussavifard, 2023, preprint). However, at the very origin, the recognition of such ‘marking’ (i.e., its understanding by the addressee) required an evolutionary, probably genetic transition: This is my point.
In other words, what I really propose is (as in Section 4.2) that, if an addressee identifies through vicarious expectations the outcome that is intended by the producer, then this addressee will not be able to perceive the producer’s behavior as a communicative behavior towards him/her—i.e., towards the addressee. Therefore, the eye contact that typically accompanies chimpanzees’ intentional communications with an addressee will be, of course, understood by the ape addressee as a communicative resource, but it will not be applied to the behavior that activates vicarious expectations. This non-unified reception is certainly more hazardous and less effective than human, unified reception. However, if, as I believe, the non-unified one exists, then it sometimes must produce the result wanted by the producer.
Therefore, my proposal can only be defended if we find which is the condition that allows some intentional communications of that type to be successful—i.e., allows them to get the addressee to satisfy the producer’s desire. The proposal makes the following prediction: In such successes, the behavior with which the ape-producer tries to manipulate the addressee’s attention toward evidence of the intended outcome—that behavior or resource—may be well understood even if it is not perceived as communicative. If that were so, then we could hypothesize that failures do not derive mainly from a deficient ability for pragmatic interpretation (even if interpretation, “in a novel situation, requires the integration and assimilation of multiple pieces of information to guess at outcomes”, Warren & Call, 2022), but above all from the limitations of non-unified reception.
Melis and Rossano (2022)—as others had done before—claim that monkeys’ and apes’ communicative production is better than reception. These primates can intentionally produce request messages for an addressee.33 We can even see that “a female adult baboon tries to draw the attention of her offspring toward the piece of fruit that she waves between her fingers” (Meguerditchian, 2022).34 However, when the non-human primate receives the message that is addressed to him, he cannot—I propose—grasp that ‘such action of trying to draw attention’ is simultaneously a ‘foreign mental state’ and ‘addressed to him, i.e., to the recipient’.
Returning to the purpose of finding the condition that allows some intentional communications of that type to be successful, I will start by recognizing that such a task is a difficult one. Firstly, D. A. Leavens et al. (2005), studying their captive but untrained chimpanzees, found that ape producers stop using pointing gestures as soon as the recipients leave, and confirmed, therefore, that those communications are intentionally targeted at the addressee; but nothing is said about reception, because the addressee is human. Secondly, in Hobaiter et al. (2014), the addressee of the pointing gesture is the chimpanzee-producer’s mother, but in the case observed, the mother did not satisfy the desire. We can suppose that she did not because it would have been risky, but, anyway, this case cannot be used as an example of successful communication. Thirdly, loud scratch, despite its great relevance, does not seem to help us enough either, since it has typically been regarded as ritualized. However, in this third case, we can remember ‘the recruitment view’ (Graham et al., 2024), and also a suggestion that was offered above, in Section 3.2.2: “Probably, only the primates possess vicarious expectations”. If these views were correct, then the producer of the loud scratch could intentionally activate in the addressee vicarious expectations instead of the general expectations that are activated by the overwhelming majority of animal ritual signals, which are not recruited for communicative purposes.
Regarding the first two situations, do their circumstances constitute an insurmountable obstacle to considering both as indicative of a possible reception by chimpanzees? I believe they do not. We must consider, again, that if those gestures or behaviors could never be understood by apes, then they would not be produced by wild (Hobaiter et al., 2014) or by captive but non-trained (D. A. Leavens et al., 2005) chimpanzees either.
We can see that those productions occur when a very conspicuous obstacle (the cage in Leavens, or the dominating individual in Hobaiter) prevents the producer from satisfying his/her goal. Therefore, ‘the behaviors that try to signal the purpose of the producer’ can be understood by the ape-addressee as behavior that merely responds to the producer’s goal (although, due to the obstacles, he, the producer, was unable to achieve such a goal). Or, describing it according to my proposal: Those behaviors easily raised vicarious expectations in the chimpanzee addressee and did not need to be understood as communicative by that addressee.
The non-unified reception may seem surprisingly inappropriate. However, it was (I suggest) kept in apes for two interrelated reasons. One, in apes’ lifestyle, the non-unified reception, despite being suboptimal, is a sufficiently useful resource. Two, the change to unified reception requires a new ability and probably also brain modifications that allow the duality of contents.
A clarification is in order here about the unified, human reception of that type of prelinguistic communication (i.e., the unification between ‘gaze towards the addressee’ and ‘behavior that tries to signal the outcome that is intended by the producer’). While such reception already must be supported by the estimation of foreign mental contents (or, more concretely, of a foreign thought that interacts with the addressee-subject, i.e., with the recipient, as with a distal individual), it is still different from predicative language. Note that, on the one hand, only predicative communications are primarily used to correct (or complete or update) the addressee’s (incorrect, according to the speaker) beliefs. On the other hand, the role that the gaze towards the addressee fulfills in those human prelinguistic communications is dispensable in linguistic communication: The non-natural feature of linguistic signs is sufficient to reveal that they have an intentional communicative function. (See above, in this same subsection, the debate about the second Gricean requisite.)
Therefore, the predicative language (the only communicative function that absolutely requires syntax and syntactic semantics) could mark a new stage, which would be characterized by more working memory (see above, Section 4.1, and, first of all, Coolidge, 2023), and—I suggest—also by constituting an interpersonal, easy precedent for creative problem-solving (see above, the end of Section “Does the ‘Language of Thought’ Exist?”). Certainly, the role I have proposed above for pride would have begun before creative problem-solving and continued afterward. However, creative problem-solving, which transforms the subject’s own mental contents so that they become adequate for solving the problem, could correlate with the emergence of more decisive innovations.
In humans, the non-unified communicative reception is practically absent. The addressee who possesses human Theory-of-Mind can not only activate vicarious expectations but also estimate foreign mental contents. Let’s apply this, if only to close the argument, to self-conscious emotions. I have accepted that, for communication to cause self-conscious emotions, the recipient must estimate the interiority (the emotional mental content) of the producer, i.e., a foreign interiority that is communicating with him, the recipient.35 But if the recipient’s ability to estimate foreign interiority is reduced to the activation of vicarious expectations, then that ability (I repeat) cannot apply to a foreign interiority which is at that very moment communicating with the recipient as with a distal individual.
In conclusion, self-conscious emotions (1) support the ‘cultural’ and ‘social’ features of the new, human lifestyle, and (2) are linked to one of its most basic and crucial features, namely, the new, advanced type of communicative reception. In the Introduction (when I focused on the question ‘What is “the new, human lifestyle”?’), it was stressed that this lifestyle needed increasing communication. But now we can say that, prior to that quantitative increase, the new lifestyle needed a deep change in communicative reception.

6. The Human Theory-of-Mind Beyond Its Origin

‘Thinking foreign mental states that involve us as their distal addressees’ is, in my view, a requirement only for the very origin of the human Theory-of-Mind. In fact, I propose that, once the ability to think ‘two lines’ of content becomes strong, this Theory-of-Mind can carry out complex functions that do not fulfill that requirement. Such complex functions are varied.
Sometimes they use foreign but non-interactive contents, as in verbal false-belief tests, which involve “a non-dialogic capacity of mind-reading” (Dor, 2016) in relation to the believer. Note that in those verbal tests, the communicative interaction, instead of being between the subject who attributes the mental content and the ‘attributee’, is reduced to that which is established between the child and experimenter. Regarding this feature of verbal tests of false belief, Gallagher (2015) states that “given the specific attraction of the second-person interaction (vs. third-person perspective), the saliency of the interaction with the experimenter takes precedence over the third-person task”. Elaborating that contrast, Barone and Gomila (2019) conclude that second-person attributions of false belief (unlike third-person attributions—for example ‘The Ancients believed that p’) “are transparent, extensional, non-propositional and implicit”.
By way of a parenthetical digression, I will comment about first-person beliefs. Regarding current first-person beliefs, if it is required that they possess the meaning of ‘believe’ that habitually is activated in second- or third-person attributions (‘He—mistakenly—believes that p’ vs. ‘he knows that p’), then we must say that originally, such first-person beliefs did not exist. In the beginning, for human subjects, their non-outdated beliefs are just the reality (and—in the beginning, again—their outdated beliefs are immediately replaced in an automatic way by the new perceptions, and so, the origin of the predicative negation was probably not intrapersonal but interpersonal). In short, the ‘believer’ cannot have first-person beliefs in the above-described sense, but only ‘knowledge’: On this point, I agree with J. Phillips et al. (2020) (at least, for a primitive, prelinguistic sense of ‘knowledge’—as Rakoczy & Proft, 2022 specify). The concept of belief (and of some traits of character: remember what Ross, 1977 called ‘fundamental attribution error’) emerged—I suggest—in an interpersonal way. In my view, the so-called ‘animal meta-cognition in great apes’ (summarized in M. Tomasello, 2022; see also Tomonaga et al., 2023) is not a judgment on one’s own contents, but a mere hesitation about one’s own general expectations, or (as Edwards-Lowe et al., 2024, preprint say) “subpersonal uncertainty estimates”.
Thus (according to this added, parenthetical sub-proposal) the intrapersonal meta-cognition or intrapersonal ‘cognitive humility’ (i.e., a cognitive humility not primarily understood as “moral interpersonal virtue” à la Priest, 2017, or “as reputation management” à la Karabegović & Mercier, 2023) would be a very late human ability. I agree with Li (2023) that it is both interpersonally originated (since the subject during a dialogue sometimes grasps that the knowledge of the other is more complete than his) and very necessary. Such cognitive humility is necessary perhaps because (see the suggestion at the end of Section “Does the ‘Language of Thought’ Exist?”) it is required by the transformation that any creative problem-solving involves, i.e., by the process of transforming our initially inadequate resource (i.e., our incomplete or incorrect mental content of reality) into one capable of achieving the solution. That type of humility—that, so to speak, ‘culmination/intrapersonalisation’ of Theory-of-Mind— may be enhanced by the least social—and ontogenetically the latest—type of laughter, namely, the laughter caused—e.g., after a punchline—by one’s own interpretive failure. In fact, all kinds of laughter are caused by failures or deficiencies in some expectation—either general, vicarious, or narrative—of the subject.36
Once the digression is over, let us return to “second-person attributions”. According to my proposal, this type of attribution is included within ‘the advanced (or uniquely human) Theory-of-Mind’. However, I fully accept its great simplicity. (As said above in Section “Does the ‘Language of Thought’ Exist?”, even pre-syntactic ‘requests for a certain object’ or ‘calls to a certain individual’ could reveal the speaker’s false beliefs to the listener: Therefore, those easy, second-person attributions of mental contents could provoke the origin of syntax). Needless to say, what I have just said is entirely compatible with the fact that second- and third-person attributions of mental content can become very complex.
Other times, non-original functions of the human Theory-of-Mind are not only non-dialogical: they can even connect with non-foreign content. These contents (not far from ‘mental time travel’) are either the subject’s beliefs/perceptions which he no longer holds or ‘possible’ contents, in any of the senses of ‘possible’.
However, according to my proposal, the human Theory-of-Mind originally arose from a directly relational, interpersonal process, which requires neither language nor experience with narratives. In my view, the linguistic modeling of Theory-of-Mind—C. M. Heyes and Frith (2014), and R. Moore (2020)—is a much later step, which requires new linguistic discoveries. Among those new linguistic discoveries, it is worth highlighting above all others the irreducibly hypotactic ‘referred speech’, and the verbs ‘say’, ‘believe’, or ‘imagine’ (See Bejarano, 2011, Chapter 21).37 A later and highly decisive discovery was literacy or ‘the externalization of memory’, as Merlin Donald called it.38 But leaving all these late human advances aside, I return to the core of the proposal.
The original ‘estimation of foreign mental contents’ is what cognitive archeologists recommend looking for, namely, a “component attribute” (vs. ‘compound concept’): See Foley and Mirazón (2020). Likewise, my proposal on the origin of the human Theory-of-Mind fits with the suggestion that “a priority for future research is to identify the genetic ‘start-up kit’ for the cultural inheritance of mind-reading” (Uta Frith, cited with approval by C. M. Heyes & Frith, 2014; my emphasis).39 In my view, the rejection of ‘innate universal grammar’ or of ‘innate mentalese’ (a rejection that I obviously share) should not prevent us from proposing this ‘genetic start-up kit’, and searching for it with the current resources of Genomics.
Here it is necessary to mention the subject of autism. (It was Reviewer 2 who pointed out this crucial issue to me, which I had omitted.) For quite a few years now, especially since Happé (1993), autism has been related to Theory-of-Mind, and this relationship is very interesting with a view to finding the “genetic ‘start-up kit’ for the cultural inheritance of mind-reading”. Note that it is much easier to find the genetic basis for a rare disease than for a universal trait.

7. The Advanced Reception of Pointing

In children’s acquisition of language, pointing gestures are important (Southgate et al., 2007; Kishimoto et al., 2007). Since the child’s pointing gestures may often provoke linguistic comments from the adult about the signaled object, it is evident that those gestures create the ideal context for learning words. Note that, although the words that appear in the adult’s comments may be unknown to the child, the child can rely on the trick of knowing which object such comments refer to. But in the evolutionary origin of language, I propose, pointing gestures may have been even more important.
Although this section adds new arguments and data, it repeats the hypothesis applied above. More concretely, in Section 5.2, I applied it to the reception of communications that cause self-conscious emotions, and now I apply it to the reception of pointing gestures. However, I have considered it appropriate to delay dealing with pointing gestures, since, while self-conscious emotions are almost unanimously considered uniquely human, things are very different regarding pointing gestures.
In addition, at the end of this section, I return to ‘the cooperative eye hypothesis’ (M. Tomasello et al., 2007, built on Kobayashi & Kohshima, 2001). Certainly, my proposal will put the evolutionary transition (i.e., my proposed transition to the human, unified reception of pointing) precisely in the process that unifies the two gazes or, in other words, extends the communicative function of ‘the gaze towards the addressee’ to ‘the gaze towards the object’: Therefore, it fits well with the fact that human eyes make the horizontal traveling of the iris conspicuous. Likewise, such conspicuity is certainly an embodied resource, like that of the broad intonational pattern that was proposed above regarding the origin of syntax. However, despite all that, I am not convinced that the human type of eye emerged in synchrony with the unified reception of pointing gestures (or, in other words, with the beginning of the human Theory-of-Mind). That is, while I am fully convinced that human eyes are very effective facilitators of the advanced, or ‘unified’, reception of pointing gestures, I have only a faint hope about that synchrony. Anyway, since the problem of when the transition occurred is so difficult, I strongly recommend that researchers in Paleogenomics try to answer the question of when the human-type eye appeared in evolution. As said above, we should not rule out anything that has any chance of shedding light on the matter.40

7.1. Apes and Pointing Gestures

7.1.1. Responding to a Possible Objection: Pointing in Apes

On the one hand, I have proposed that the advanced Theory-of-Mind is uniquely human. On the other hand, we know that many chimpanzees raised by humans have been taught to produce pointing gestures and to understand them (even the declarative type of pointing: Lyn et al., 2011). What answer can I give to all this?
I will begin by admitting two indisputable facts. One, “human children display this ability to use communicative cues only after many months of intensive exposure to cultural environments characterized by frequent referential signaling, both verbally and nonverbally” (Clark et al., 2019). Two, the absence of pointing is not at all harmful in “apes’ lifestyle”.
From those statements, some authors conclude that in non-human primates that ability would be present, although scarcely exercised or developed. See Vasilieva (2019): “Not only the presence/absence of a trait but whether it manifests in animals to the same degree as in humans is equally important for our understanding of trait evolution”. The following example is offered by Heintz and Scott-Phillips (2022): “Human bodies are not especially well-suited to swing from trees. However, there is no absolute barrier”. In that same line, Berio and Moore (2023) recommend resuming great ape enculturation studies.
But, according to my proposal, it is only the effective, ‘unified’ reception of pointing gestures that is uniquely human. Certainly, in this way, I place as a vital criterion a process that is still unobservable, which may seem like a withdrawal towards “untestability with scientific methods” (D. Leavens, 2021). However, as can be seen, the proposal relates to some facts and several potential experiments and research.

7.1.2. Authors Who, When Dealing with Pointing in Apes, Have Focused on Reception

The focus on reception is not new. R. Moore (2013) focuses on the receptive failure of apes and proposes that “since pointing gestures provide poor evidence for a speaker’s message, they exceed the pragmatic capacity of apes”. Likewise, Morrison (2020) emphasizes the ambiguity and necessary disambiguation of pointing gestures. I agree with these claims. But, in my view, ‘poor evidence for the message’ and ‘poor pragmatic ability’ are insufficient to explain the frequency of receptive failures in apes.
Lyn and Christopher (2018) list three conditions which the experimenter may point out and whose reception by apes is differently successful: “(i) Proximal-Proximal: The choice items are close together and the point is close to the correct item. (ii) Proximal-Distal: The choice items are close together, but the point is further away. (iii) Distal-Distal: The choice items are further apart, and the point is therefore necessarily further away”.
According to that work, in Proximal-Proximal and Distal-Distal, point-following can be achieved by simple mechanisms. However, “in Proximal-Distal, the best predictor of success is ontogenetically previous human social contact”. I would underline the fact that it is just in Proximal-Distal where the direction of the head of the producer (that is, the cue that chimpanzees use to estimate what others can see: M. Tomasello et al., 2007) is unable to signal the object.

7.1.3. Unlearned Production in Apes

Before focusing on the contrast between the two receptions, it is worth reviewing unlearned production in apes in more detail. “Unlearned (i.e., with no explicit training whatsoever) captive chimpanzees frequently point to unreachable foods. These are communicative signals because apes will not reach towards obviously unreachable food if there is nobody around to see them do it” (D. A. Leavens et al., 2005). In addition, in those chimpanzees, a repeated gaze alternation between the food and the experimenter was significantly associated with their pointing gestures.
Since then, Leavens and other authors began to ask themselves whether conditions like those (cage and benevolent recipient) which in the mentioned observations were considered as decisive appeared in wild chimpanzees too. Hobaiter et al. (2014) offer the following proposal: “Wild chimpanzees experience few physical barriers, but the presence of a dominant, unrelated chimpanzee monopolizing a particular resource may be a greater barrier to a young chimpanzee’s access than bars on a cage. To overcome this challenge, a juvenile’s only resource is another chimpanzee, mainly its mother”. Thus, they found a case in the jungle that they classified as “possibly deictic”. A possible conclusion: Wild chimpanzees that use this type of production with their conspecifics can thus achieve (at least sometimes) their goals.
Nevertheless, for such production to be a useful resource in the wild, it is necessary for recipients to deliver (at least sometimes) the desired object. Is it possible? Animal altruism is a controversial matter: see, e.g., Rendall et al. (2009) vs. De Waal (2010). But I do not rule it out, provided that it does not cross the (always narrow) limits of ‘spontaneous altruism’.41

7.2. Reception of Pointing Gestures in Chimpanzees and in Humans

Regarding the reception of pointing gestures in chimpanzees, I begin by highlighting that they understand the communicative value of gazes toward the addressee. Indeed, “the sensitivity to being watched is both innate and shared by most vertebrates” (Klein et al., 2009). Thus, in the species that are able to perform ‘recipient-directed’ communication, recipients of that gaze understand that they are the addressees of this innate communicative resource. (But, while in gorillas, eye contact communicates mild threat, in chimpanzees, by contrast, it is a friendly communicative resource.)
However, in the chimpanzee-recipient such communicative value is not applied (as proposed above in Section 5.2) to the other element produced by Leavens’ or Hobaiter’s untrained chimpanzees, that is, to gazes towards the object and to hand/arm movements. ‘The gaze towards the object and hand/arm movements’ is, for an ape-addressee, a non-communicative behavior that can sometimes activate vicarious expectations in him (in the addressee). It is fair to acknowledge how implausible this description of non-human reception of pointing gestures seems to human intuition. The producer, both before and after making movements in a certain direction with his arm and head, communicates with the recipient by means of eye contact. Why would the recipient not understand that the producer’s movements are communicative, or, in other words, that the communicative value of eye contact is applied to those movements and gives them a communicative function? For humans, I acknowledge, that unification of the two consecutive instants is obligatory and unstoppable. But is such unification present in chimpanzees?
As said above, the cage (D. A. Leavens et al., 2005) or the dominating individual (Hobaiter et al., 2014) make the chimpanzee’s gesture non-absurd for conspecifics even if it is not interpreted as communicative. By contrast, our human reception of pointing gestures can be considered closer to that of communicative pantomimes.42 M. Tomasello (2008) stresses how strange any pantomime can be for a recipient if the gestures involved are not interpreted as being communicative (“the recipient will see my iconic gestures as some kind of strangely misplaced instrumental action”43), but he does not say it about our pointing. However, according to my proposal, in both cases, the same problem arises for apes. As said above in Section 4.2, vicarious expectations (the only resource that, according to my proposal, apes have to estimate the interiority of others) cannot involve any action that is impossible for the subject in which they are activated. Therefore, vicarious expectations cannot be understood by the subject (that is, by the ape-addressee) as involving communicative actions directed by the producer to him.
Now, let’s pay attention to the alternation between gazing at the object and gazing at the addressee. This alternation appears in apes’ and humans’ production of pointing gestures. In D. A. Leavens et al. (2005) we already read that in those captive but untrained chimpanzees, the repeated gaze alternation between the food and the experimenter was significantly associated with their pointing gestures. Even more important—of the utmost importance really: Paulus and Fikkert (2013) show that the necessary and sufficient element for human babies to first understand pointing gestures is not the hand movement (or its situational/cultural variations—see Cooperrider & Slotta, 2018), but the alternation between the two gazes. (The movements of the arm/hand/finger would be, therefore, a later strategy to make more precise the function of the gaze to the object). Thus, we must focus on the two gazes.
On the one hand, the ‘gaze towards the object’ causes the recipient to estimate what the producer sees. On the other hand, the ‘gaze towards the recipient’ (a.k.a. ‘eye contact’) informs the recipient that he is being the addressee.44 In addition, inter-brain consequences of eye contact in humans are increasingly studied. Y. Pan et al. (2020) mainly focus on teaching. Di Bernardi Luft et al. (2022) stress that “inter-brain synchronization mainly flows from leader to follower”, and thus, from the producer of pointing gestures to the addressee. In general, second-person approaches underline eye contact: Cañigueral et al. (2022).
But what must be highlighted is that in our human communicative reception, those two instants (‘gaze towards the addressee’ and ‘gaze towards the object’) cannot in any way remain separate, but they must be unified. The addressee has (1) to estimate what the producer from his place and in his circumstances is looking at, and (2) to understand that what the producer is looking for by looking at the object is to point at the object for him, for the addressee. According to my proposal, it is—as the reader already knows—in that unification where the problem arises for the ape-recipient. Let’s return one more time to the nuclear subsection (i.e., to Section 4.2). Certainly, vicarious expectations are automatically processed by the subject as belonging to the observed individual. However, since there can be no vicarious expectation of the results of an action intrinsically impossible for the subject, the recipient-subject will be unable to apply to such expectations an interpersonal communicative function towards himself.
Therefore, the unified, fully effective reception of pointing gestures will only be possible through the estimation of the mental contents of the producer. Thus, there would be a capacity common to that reception, to the reception of prelinguistic messages that cause self-conscious emotions, and to any linguistic reception, since the latter always involves the thought in question coming to the receiver from someone else.45 That ability can be colloquially described as that of ‘remaining in your shoes when you look at me’ (a description that highlights the similarity to a more embodied version of the ability; see above, near the end of Section 3.2.3).
A preliminary test of these proposals could investigate in humans whether there is some relevant neurophysiologic similarity between the interpersonal activation of all (negative and positive) self-conscious emotions and the unified communicative reception of pointing gestures. If such similarity is found in the future, then the plausibility of the general proposal would increase. But it should be noted that future discoveries might evaluate differently the proposed explanations of self-conscious emotions and of the effective, unified reception of pointing.
In other words, in addition to total success and total failure, there are two other possibilities, namely, partial results. Thus, it might be discovered that, while the proposal about the advanced reception of pointing can be maintained, the explanation of self-conscious emotions must be transformed (for example, by rejecting their interpersonal origin and deriving their ontogenetic and evolutionary emergence from ‘an innate core’ of moral norms). Or, conversely, the result might be that, while the proposal about self-conscious emotions can be maintained, the effective, non-hazardous reception of points does not require any process of unification between ‘gaze towards the addressee’ and ‘gaze towards the object’, because, for example, their mere succession might be enough for full effectiveness to be achieved through “the human pragmatic competence, which is greater than that of apes” (R. Moore, 2013) or, alternatively, because human beings are much more inclined to gaze-following (an inclination that might either derive from the salience of human eyes or connect with a supposedly prior, not subsequent, type of what Csibra & György, 2006 called “Natural Pedagogy”). Anyway, for now, I bet on my proposal in the most ambitious way (or rather the most self-reinforcing one: to give a recent example, see ‘causal-association inferences’ in Currie et al., 2024), that is, applying it to both abilities.
Of course, at the beginning of ‘the new lifestyle’, several behaviors (not very different from the ones carried out by Leavens’ and Hobaiter’s untrained or wild chimpanzees) could achieve some degree of reception and could be useful for both producer and recipient. Let’s consider, for instance, the action of pushing a conspecific until we place him so that he can see a relevant object. These types of communicative production would have multiplied at the beginning of the ‘new, cooperative lifestyle’, without the recipient yet grasping the simultaneously mental and communicative nature of the behavior. But this problem finally became accessible to gene/culture coevolution. And so, the effective, unified reception of pointing gestures appeared, together with the estimation of foreign contents.46 Now, I will propose that the unified reception of pointing gestures is strongly facilitated by a small anatomical feature.

7.3. The Human Eye and the Unified Reception of Pointing Gestures

M. Tomasello et al. (2007) (that is, six years after Kobayashi & Kohshima, 2001) focused on the universally human white sclera, or, more precisely, on both its horizontal enlargement and its depigmentation, and proposed that these human peculiarities enhance “the visibility of eye-gaze orientation”. But gaze-following, a phylogenetically old ability, is—an objector might say—carried out without the help of the white of the eye. Indeed, M. Tomasello et al. (2007) showed that apes rely on the head (vs. the eyes) in gaze-following. Likewise, C. Moore (2008) concluded from his experiments that when infants first start to follow gaze (at that age—note, please—they are still unable to receive pointing gestures), “they do so on the basis of head direction, not eye direction”.
Despite those possible objections, M. Tomasello et al. (2007), putting ‘the enhancement of the visibility of eye-orientation’ in the evolutionary context of human special cooperativeness, hypothesized that humans evolved such unique eye morphology to facilitate joint attentional and communicative interactions among conspecifics. See also Wolf et al. (2023), or Yáñez and Gomila (2018), who, after underlining ‘the interactional importance of gazes’, add: “especially when oneself is the focus of that attention”, i.e., during eye contact. I will specify this emphasis on cooperation and interaction to connect it with my proposal of the ‘unified’, effective reception of pointing gestures. Let’s start by describing “the enhanced visibility of eye orientation” in more detail.
Mayhew and Gómez (2015), Perea-García et al. (2019) (but see Mearing & Koops, 2021) and Caspar et al. (2021) have proposed that the chromatic contrast in the human eye is not unique among ape species. But let’s focus on horizontal elongation. This feature may have evolved to allow non-arboreal primates to scan their environment widely. Nevertheless, such elongation together with the universal “totally/bilaterally white sclera” make the location of the iris conspicuous not only in averted but also in a direct gaze. In addition, “the eye outline is easier to see in humans (than in apes) irrespective of skin color” (Kano et al., 2022) and this makes the location of the iris even more conspicuous. See also Prein et al. (2024, preprint), who conclude that human ‘gaze understanding’ is “based on the pupil location within the eye”. Thus, human eyes—this is my point—make the successive locations (that is, the horizontal traveling) of the iris conspicuous.
In this way, the continuity of the two gazes in pointing (or, in other words, the crucial—remember Paulus and Fikkert (2013)—alternation between gazes) is enhanced. It might be said that when the producer moves his iris from the ‘gaze towards the object’ to the ‘gaze towards the recipient’, that movement is perceived by human recipients as if it were injecting the ‘gaze towards the object’—and, consequently, also the vicarious expectations activated by recipients—into the ‘gaze towards the addressee’, that is, into the communication. So, the human eye would lead the human recipient of pointing gestures to unify the two instants—and, therefore, to estimate the producer’s mental states that, involving himself, i.e., the recipient, as their distal addressee, are intrinsically impossible for this addressee—and, therefore again, to estimate ‘foreign mental contents’.
In short, in my view, the human sclera is an anatomical, universal ‘facilitator resource’ of a mental process—the unified communicative reception of pointing gestures, of course—in the addressee. It is also a strong ‘facilitator resource’. These qualifications could maybe raise the suspicion (1) that the ‘unified’ communicative reception of pointing gestures was the evolutionary first function of the ability to estimate foreign mental contents, and (2) that this estimation—and the consequent ‘duality of mental contents’—was originally difficult and demanding. However, such deductions (let us not forget!) would require us to choose the option of the synchronic or quasi-synchronic emergence between human eyes and human Theory-of-Mind. (And, as said above, the decisive ‘horizontal elongation of eyes’ may have emerged much earlier, only to allow non-arboreal primates to scan their environment widely).
The depigmented sclera could become universal in an evolutionarily very short time, and therefore (if there was such synchronic emergence) the human sclera could arise in the same species in which the effective, unified reception of pointing gestures was beginning to emerge. But did it happen in Sapiens? And if so, did it happen at the very beginning of our species? Or later?47 Or did it emerge in Neanderthals/Denisovans? This can be a crucial question. I hope that Paleogenomics and Genomics specialists will answer it soon. Certainly, the depigmented sclera is a quite simple feature. However, its universality makes, of course, their task difficult.
If we follow the option—the faint hope, as I said in Section 7.1—that the peculiarity of human eyes emerged in relative synchrony with human Theory-of-Mind, then we could propose that this facilitator is an essential basis for any human communicative reception (i.e., our ability to understand messages as foreign mental states and, simultaneously, as addressed to ourselves). But such a proposal can accept either that such a basis—such estimation of foreign contents—emerged in Sapiens, or that, on the contrary, in Sapiens, only its derivations emerged (see above, in Section 5.2, the separation between the human reception of prelinguistic messages, on the one hand, and predicative language, on the other hand, and see also Section 6), while the estimation itself had emerged in Neanderthals. In short, that option, in addition to being based on a ‘faint hope’, could predict only that the human type of eye will not be found in earlier hominins. Therefore, regarding Neanderthals, it is not strictly falsifiable. This is an extremely unfortunate fact since it is precisely the Neanderthal genome that is being studied. Anyway (and returning again to Section 7.1, but now to the recommendation that “we should not rule out anything that can provide us even a little bit of light”), the question of whether Neanderthals—or even, as suggested in note 47, our species in its beginning—possessed eyes like ours should be answered. If such an answer is negative, then it could give us a useful supply of light. But now all this is just a very faint hope.
I do not want this last paragraph, with its lack of confidence and pessimistic tone, to mark readers’ final impression of my hopes. Please remember that such a tone has not been the norm for this article. Indeed, as said above, I’m much more convinced of my general proposal than of the synchrony between the mentioned emergences.

8. General Outline

(I) Animals do not evoke their goals. Expectation—an empty profile—is enough to guide their behavior.
(II) The primate hand (which its owner can see, and needs, during his grasping action, to see) gives rise to a first novelty. When a movement is to be executed with the hand, there is not only kinesthetic and proprioceptive expectation but also a visual expectation. Thus, the sight of a foreign hand can activate the expectation of the normally concomitant kinesthetic and proprioceptive sensation. When at the very next moment, this error is corrected, those kinesthetic and proprioceptive expectations are automatically processed as belonging to the individual whose manual movement was observed by the subject. Vicarious expectations have appeared.
(III) Vicarious expectation can, perhaps only in great apes, extend beyond the hand. In this more complex vicarious expectation the subject, having established a correspondence between his own body (felt but not seen) and the other’s body (seen but not felt), gains the highly adaptive ability to activate vicarious expectations about what the other sees from his position and orientation, even though at that moment the subject does not have access to such a visual field. (Of course, such evolved vicarious expectations will only be possible if the subject knows the area very well and has often been in the place where the observed individual is now).
(IV) All vicarious expectations, both the original ones and the extended ones, remain what all expectations in general are, that is, empty profiles.
(V) But with the human way of life, new skills become necessary, which vicarious expectations are unable to sustain. It is now necessary that communicative messages, though still prelinguistic, be understood by the recipient simultaneously as mental states of the producer and as addressed to him, to the recipient. It is communicative reception, not production, that originally required a great change. (In the communicative production of the great apes, a merely communicative use had already been given to behavior and movements that were not originally communicative). Apes can estimate the mental states of another individual, since, as already said above, a particular type among all the expectations they activate in themselves, i.e., vicarious expectations, are automatically processed as those of the other individual. But such an incipient Theory-of-Mind is not sufficient now. In human communication, the recipient has to grasp a thought that could never be his own under any circumstances, and could never, consequently, be an expectation of his: Note that the content that is thought by the communicative producer necessarily includes the feature of being addressed to him, the recipient, as to a distal individual. Human estimation of the mind of others must therefore capture (full) contents and not mere (empty) expectations.
(VI) Part of this human communicative reception was applied to understanding messages that would trigger self-conscious emotions in the recipient. These very particular emotions emerged in large groups, that is, among individuals who were not permanently together, and where the behavior of one could surprise another. (Hominids who lived in small groups probably evolved only in another direction, that is, in the direction of greater social cohesion and greater spontaneous altruism). In addition to the well-known role of social control played by the three self-conscious emotions that are unpleasant, I want to emphasize that the pleasant one—i.e., pride—could lead to ‘improvements in the group culture’ by an individual.
(VII) With the emergence of this uniquely human type of communicative reception, communication becomes much more useful, and many more meanings are created. The first meanings in human communication had nothing to do with our semantics since this is intrinsically shaped by syntax. The first meanings were only calls to someone in particular or requests for something specific, and they could not have any other intonation than that of a request or call. The message was made up of only one of these pre-words.
(VIII) But these primitive messages, despite their limitations, were capable of designating concrete realities such as an individual or an object. And this, together with the already acquired ability to capture other people’s mental contents—in this case, the previous speaker’s false beliefs about the nearby presence of the individual called, or about the availability of the object requested—, soon gave rise to syntax. Note that syntax is only needed in language with a predicative function, and this communicative function seeks—except in lies, of course—to correct, complete, or update the mental content of the addressee. (Of course, this syntax was pre-grammatical—that is, ‘theme, rheme’—and probably remained like that for a long time. Complex grammatical devices—subordination, deictics converted into anaphoras—originated only with ‘reported speech’ or with long interventions by a single speaker).
(IX) But let us stop focusing only on the evolution of language. The great transition, the cerebral change that the new communicative reception entailed, had effects beyond communication. If humans can simultaneously think about their own mental content and the mental content of others, they will also be able to evoke, as (full) mental content, their past perceptions, or possible future perceptions. One key to the difficulty is in all cases the same. The brain has to prevent any content other than ‘its own at the moment’ from directing its behavior. This difficulty had no precedent. Note that dreams present two differences that remove all difficulty: one, the dream situation is the only one that the subject pays attention to, and two, there is motor paralysis (except in sleepwalkers).
(X) Creative problem-solving also had to do with the great transition. To try to connect the two, we have to go back to language. Creative problem-solving consists of the transformation of our mental contents, which initially seem inadequate to achieve the solution, into ones that do solve the problem. This is, of course, much more difficult and, both in evolution and development, much later than the predicative communication. But in predicative communication, in the mere ‘theme, rheme’, there is also a transformation of an inadequate element (the false belief of the addressee, which is, of course, the only thing the addressee can grasp in the theme) into one that is adequate to communicate to the listener what the speaker judges to be the reality of the matter. The difference lies in the fact that the operation in creative problem-solving is intrapersonal, not interpersonal. But that difference, that enormous distance, could be bridged during human genetic-cultural co-evolution. Thus one should distinguish between cultural innovations that occurred only through pride and other, later and more crucial ones, that were based not only on pride but also on creative problem-solving. (The connections accumulated by any fully linguistic individual throughout not only his years of language acquisition but throughout his entire life would facilitate the search for a way to transform initially inadequate content and achieve problem resolution. One might suspect that such connections are stored not only in language but among the resources an individual learns in music, painting, science, and other areas). But in the linguistic area we might perhaps find a slightly less distant precedent for creative problem-solving than predicative syntax: Note that in partial interrogations the speaker has to communicate what he does not know.
Before moving on to another section, I feel it is necessary to say something about the strongly speculative character of these views on evolution. At the end of the article, I will return to this speculative character more generally. But I will deal with it here as well, as a grateful response to Reviewer 4.
I will now focus, then, on what some authors have said about proposals on evolution. Lotem et al. (2017) point out that “the typical reductionist appeal to parsimony—that is, Morgan’s Canon—is somewhat misleading in evolutionary contexts and time scales, where changes are actually to be expected”. Thus, the only thing that is then indispensable is that the new proposals—that is, the ‘non-parsimonious’ ones, according to Morgan—follow the slow pace of evolution (van Woerkum & Barrett, 2024). But the ‘search for this balance’ is risky and, to a greater or lesser degree, speculative: Needless to say, the set of canons, Dicta (Buckner, 2013), and general truths does not give us concrete solutions. However, in my view, that search can be a useful task, as long as we recognize that more research is needed to verify if the hypotheses are on the right path or not.

9. Summarizing, and Looking Towards the Future

This article has hypothesized that the contrast ‘vicarious expectations vs. foreign mental contents’ is a genomic, brain novelty that appeared in gene-culture coevolution. Therefore, I have also made another proposal, namely, that such novelty was required by ‘the new, human lifestyle’, which was increasingly technological (humans are ‘obligatory’ users and producers of tools) and cooperative (with a way of cooperating that is based on a particular type of communication). More concretely, in the origin of this lifestyle, two extremely important abilities (self-conscious emotions and, more basically, the new communicative reception of even prelinguistic messages) required, according to my proposal, the ability to estimate foreign contents. The key to my argument has been that only in human communication does the addressee have to think foreign (i.e., others’) thoughts as mental states addressed to him. As the reader already knows, my hypothesis is above all dialogical, and of course, also embodied and deeply embedded in evolution.48
I have proposed that the human Theory-of-Mind and human (even prelinguistic) communication are inextricably linked. Or more precisely: On the one hand, the set of those two abilities and, on the other hand (and more initially), the new lifestyle, feed off each other in a growing spiral. Therefore, while there is absolutely no suggestion on my part that all uniquely human capacities evolutionarily arose at the same time, I maintain that one of them—namely, the estimation of foreign contents, and not only of vicarious expectations—underlies the rest.49
The contrast ‘extinct species of Homo vs. us’, if it finally becomes an area of Comparative Neuroscience, might especially fulfill the promise to help us to ‘know ourselves’, as classical philosophy wanted.50 Such a result could perhaps be achieved with the help of Genomics/Paleogenomics, as said above, and also with “the use of evolution to identify meaningful categories of mental activity” (Cisek, 2019, 2021, which apply this resource to animals). However, the use of evolution (or rather, gene-culture coevolution) is also necessary to identify categories of human mental activity. In other words, the nuclear categories of human mental activity will be more easily found the more we seek their link with the emergence of the human lifestyle.
Returning to the nuclear proposal, this article has not offered any new empirical results. However, the main proposal and each sub-proposal raise questions. Let us mention some of those questions. My view of expectations? Apes’ vicarious expectations? The counterintuitive ‘non-unified reception of pointing’ in chimpanzees? Interpersonal origin of syntax and syntactic semantics? Is there genuine metacognition in great apes, or, on the contrary, only ‘subpersonal uncertainty estimates’? These questions can lead to different experiments and research in Neuroscience or Genetics, whose results will have an impact on my proposal, in one way or another. But I have already dealt with this above.
Therefore, I will add only a more personal comment. I am looking forward to those results that can make my hypothesis testable. Even if those results discarded my proposals, I would feel that my effort has been useful: Obviously, hypotheses are most useful when they point out a correct path, but if an apparent road leads nowhere, then the task of promoting its testability is also a service to the community. In short, I ardently wish that these tests be conducted. However, since such empirical research is out of my reach, I can only request them. This is what this article wants to do now and in the medium-term future.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Conflicts of Interest

The author declares no conflicts of interest.

Notes

1
Even if population size and connectivity have been strong drivers of cultural advances and also—mainly in the African Middle Stone Age—of cultural losses (Scerri & Will, 2023).
2
However, I agree that apes’ ability in those tests is related to “affective empathy” (Lurz et al., 2022). Or, in my words (Bejarano, 2022), ‘vicarious expectations’ are related to ‘spontaneous altruism’.
3
However, the methodological, more particular matter of the violation-of-expectation paradigm (see the general review by Margoni et al., 2023) will not be discussed here.
4
Nowadays it is known that unexpected events can only be connected to superficial layers of the visual primary area, while expected events are also connected to the deeper levels of that area—Thomas et al. (2024)—, and, thus, it is possible to suspect that expectations are coded in the brain in a different format than perceptions. (This publication studied human adults. That does not conflict at all with my proposal. Humans, although we can evoke absent things, also have our empty expectations).
5
Such communications would already use non-innate resources (based not only on iconicity but, perhaps even more, as Cartmill et al. (2024, preprint) suggest, on ‘past conditioned associations known by the group’). However, most likely, these cultural gestures or calls still lacked ‘super-high fidelity’ transmission (which supports articulatory-phonetic imitation). In addition, let’s note that in the reception of these messages, the principle of “Teleology, first” in Theory-of-Mind (Perner et al., 2018) was, of course, obeyed. We could even suppose that such type of individual message attempted, firstly, to become more and more choral to, finally, influence group behavior: In other words, it would not be ‘dialogic’. All these features would place this type of message far from even prelinguistic human communication. Despite this, such messages would go beyond empty expectations of goals.
6
“The first words ever spoken is a key issue for the research in the evolution of language” (Gasparri, 2023). I agree with the importance of such an issue.
7
Planer (2019) (an article defending languages-of-thought) understands perfectly that “if the brains of many animals instantiate languages of thought, then we face a serious explanatory challenge. That challenge is to explain how languages-of-thought might have evolved”. But I am not persuaded by his explanation.
8
Or, more precisely, without a semantic content either produced simultaneously with the prosodic cue, or immediately previous in a dialogue—I add. This second type can be produced with a minimal articulation originally empty of meaning (e.g., the ‘huh?’ of Dingemanse et al., 2013).
9
“In human infants, shoulder movements, controlled by ipsilateral motor pathways from the right hemisphere, precede the left-hemisphere control of the right hand” (Rönnqvist, 2003) and also of culturally learned motor sequences. Nowadays it is also known that in humans, certain muscles that are mainly associated with shoulder movement—and, therefore, also with the expressive gestures that involve arm movement—are likely to interact with the voice (Pouw et al., 2023). Thus, the superiority of arm-gestures over vocal resources that is observed in intentionally addressed communications of non-human primates, that indisputable (even if relative, Lameira et al., 2024) superiority, could perhaps be conserved in multimodal communication of human infants as the anteriority of arm-gestures—less complex than hand-movements—over cultural vocal learning. If that were so, then we could suspect that such anteriority, interacting with the voice, caused the new, broader intonational unit, and, in this way, paradoxically ended up giving rise to the mentioned ‘victory of voice on gestural communication’. We must take into account that “in apes, communicative gestures, unlike manipulative movements, are controlled by areas that in the human brain are responsible for human language”: Becker et al. (2021), Becker et al. (2022), Meguerditchian et al. (2011). In short, I wonder if the following similarity has a basis in the ontogenesis and phylogenesis of our brain: Culturally learned movements of the right hand (controlled, of course, by the left hemisphere) are embedded in a previous, simpler arm movement (right hemisphere), and, similarly, culturally learned articulatory-phonetic signifiers (left hemisphere) are embedded in an intonational pattern (perhaps right hemisphere: Gainotti (2024) again vindicates the recently challenged “graded, right-hemisphere dominance for emotions”).
10
Bejarano (2011, p. 126) underlined three points: First, “the imitation of complex motor patterns (or ‘kinetic melodies’, as Luria calls them) which are new to the subject requires, according to Piaget (1954), that there to have been, during the observation of the model, a latent imitation”. Second, “since ‘any body representation which is used for action continuously tracks the positions of our body parts as we move’ (Haggard & Wolpert, 2005), why would the same thing not occur in a motor sequence that occurs latently?” Third, “we can conclude that the latent learning of kinetic melodies requires that any step other than the first in the sequence be imitated through fictionalization of the posture derived from the unexecuted previous motor step”. Certainly, this is too narrow a framework to support the difficulty of learning those sequences: It is clearly a far cry from Lind’s extensive and updated argumentation (see its most recent summary in Lind & Jon-And, 2024) against sequence learning by animals. However, my old framework led me to suspect that super-high fidelity imitation was a fairly late skill.
11
So, I am wondering about the possibility that the early language did not depend on the ‘super-high fidelity copying’. Planer et al. (2024, preprint) focus on a similar puzzle—“an early language previous to know-how copying”, although these authors perhaps do not sufficiently emphasize the difference between ‘know-how copying’ and the vocal ‘super-high fidelity copying’, and thus they choose as a solution to the puzzle the idea of a merely gestural-iconic origin of early language. For my part, I prefer to suggest the possibility that the vocal component of the syntactic multimodal language did not originally involve our current imitation of articulatory-phonetic sequences. Note, please, that the delay in the appearance of articulatory-phonetic sequences is a reliable fact in the first manifestations of writing. Could the same thing have happened in oral language? This suggestion, already put forward by Hockett (1960), has been defended by Fleming (2017), in the context of studying the ‘clicks’ of South African languages.
12
That article shows that chimpanzees used ‘know-how social learning’ (from a chimpanzee that experimenters had taught) to acquire a skill they failed to innovate. Thus, we can think that if wild chimpanzees use this type of learning only very infrequently, it is because they don’t produce complex innovations.
13
Certainly, recent research—Steven et al. (2022)—points to perspective-taking as a flexible and context-specific suite of abilities. However, here we can continue with Flavell’s dichotomy.
14
If this hypothesis turns out correct, then we could deduce that the so-called ‘audio-motor mirror neurons of birds’ cannot be mirror neurons. Note that, while learning the song-dialect, the bird does not sing yet. Therefore, the externally perceived dialect (that is, the dialectal enrichment of the innate template) is stored without any connection with proprioceptive expectations. Thus, if the proposal of Keysers & Perrett is accepted, the research about ‘the mirroring’ would have to refocus on primates, without it meaning undervaluing any type of ‘analogous similarities’ (underlined, for instance, by De Waal & Ferrari, 2010).
15
But, beyond that compatibility, the contrast shown by Schüler et al. (2024) puts a very interesting need at the center of the scene. The human Theory-of-Mind (which will be fully deployed in Section 6) must prevent all those internal, perceptually decoupled representations from influencing our behavior. Such prevention—I add—is a much more difficult task than the one required in nightmares, for example. While in this latter case, there is only one line of mental content—nightmare situations—, in the human Theory-of-Mind, however, there are ‘two lines’ of content, and, therefore, in the default network (in this peculiar, human ‘resting-state’) the prevention must be much more subtle and complex than mere muscle paralysis.
16
In Bejarano (2022) I have focused on that second type, and differentiated it from both spontaneous altruism and caring for one’s own reputation. The proposal of that article is that, while the ‘(ultimately perceptual) estimation of foreign mental contents’ is an adaptively very advantageous resource in the human lifestyle, it nevertheless caused the two typical features of perceptions—one, that of informing about the surroundings, i.e., of being true, and the other, that of being useful to the subject’s interests—to become, for the first time in evolution, dissociated from each other. Thus, the perception of foreign mental contents—which include, of course, another individual’s needs and interests—is the basis of a demanding moral capacity that, not being adaptive either for the entire group (as spontaneous altruism is) or for the individual (as happens with the care of one’s own reputation), is—almost paradoxically and therefore more wonderfully—built by evolution. More concretely, while in Joyce (2007) or Wilkins and Griffiths (2013) (see also Levy & Weinshtock-Saadon, 2023) moral beliefs are “debunked” by evolution (since they, having been selected for adaptive reasons, are epistemically suspect), I propose how a base (however poor and weak) for that most demanding moral capacity could really arise in evolution. From the recommendations made to me by Reviewer 1, it can be inferred that he/she asks me to pay more attention to soul-body dualism. In my view, a key core of that issue is whether outside of the acceptance of such dualism there is still a base for the capacity to choose between egoism (which intervenes even in the refined self-control that cares for one’s reputation) and truth, and that is precisely the question—“Could a base (however poor and weak) for this capacity arise in evolution?”—that Bejarano (2022) attempts to answer.
17
Thornton and Tamir (2024) (who use the term ‘affordances’) may perhaps make us see the very different ways in which general expectations are activated in humans.
18
Corballis (2000) and Corballis (2001) claimed that we interpret the ‘images in the mirror’ as the left-right reversal of the original objects, and that, while a reflection’s reversal is a product of optics, “such particular interpretation comes from neuroscience”. This link with neuroscience could be lengthened: The sudden acknowledgment of standing before a mirror and not before a peer (/or conversely, the sudden acknowledgment of standing before a peer and not before a mirror) inhibits (/activates) the mentioned high-level resource.
19
L. Lewis and Krupenye (2022), for example, underline apes’ competitive motivation. About infants’ motivation, see an interesting proposal in Woo et al. (2022) and Woo and Spelke (2022), who apply to this question (infants’ estimation of others’ false belief) an idea relatively similar to the link between “look for cheaters” and reasoning (Cheng & Holyoak, 1985, or Cosmides, 1989). In short, the mentioned proposal underlines that, since in some contexts “the estimation of others’ false beliefs may facilitate the ability to morally evaluate others’ actions”, such estimation is an adaptive task even in toddlers. But, according to my hypothesis, even if that interesting proposal is discarded, children’s curiosity about the interiority of others would still be extremely adaptive.
20
Any mammal or bird has expectations about the behavior of animals that are vastly different from him. But those are general, non-vicarious expectations.
21
Thus, it is not surprising that, for example, pride, when it is compared to joy, involves what Bornstein et al. (2023) call “a relatively more distant perspective”.
22
We could also remember Baader’s anti-Cartesian formulation (“Cogitor, ergo sum”), even if Baader (who lived from 1765 to 1841) interpreted it “more theologically than interpersonally” (Geldhof, 2005). I would reformulate it in the following way: ‘If I grasp foreign (i.e., others’) thoughts that involve me, I am human’.
23
Baumard et al. (2013) propose: “The best care of reputation (the most adaptively advantageous one, since the error of mistakenly assuming that no one is paying attention to a blatantly selfish action may compromise an agent’s reputation) is the genuinely moral habit”. This, of course, is also proposed by many other authors, for example, Boileau (“Pour paraître honnête homme, il faut l’être”). I shall not comment on such a proposal here, but see Bejarano (2022).
24
This more intense care could relate to what, on a higher, later level, Di Francesco et al. (2021) said: “People’s self-defining life stories have an intrinsically defensive nature; the description-narration of one’s own inner life is organized on the basis of the fundamental need to construct and defend a self-image endowed with an at least minimal solidity”.
25
According to my option, pride originally arose interpersonally: The “hubristic, narcissist pride” that is mentioned by Tracy et al. (2024) would have been a late (“evolved”) intrapersonal derivation.
26
As said above, while none of the earliest technological abilities implied high-fidelity transmission, this type of transmission not only supported later technologies, but also what I called (in Section “Does the ‘Language of Thought’ Exist?”) the set of all ‘super-high fidelity copying’—the articulatory-phonetic copying, and the learning of songs or dances. (Obviously, in these skillful tasks the conscious activity of memorizing and copying the model gives way, after multiple repetitions, to subconsciously memorized actions, and this allows attention to be focused on a higher level).
27
The underlining of pride is also useful to prevent the concept of self-control from being incorrectly narrowed. See Bermúdez et al. (2024): “Apathy is a normally overlooked kind of self-control problem. However, compared to negative self-control (i.e., self-control against temptations), which relies more on situational strategies, positive self-control requires more intrapsychic work to get motivation (my emphasis)”.
28
‘Self-control’ (Shilton et al., 2020)? Or ‘self-domestication’ (Benítez-Burraco & Nikolsky, 2023, to choose a recent example)? I can only say that the connotations of the term ‘self-domestication’ (even if this is very different from ‘submission’, the evolutionary precedent of shame according to Maibom, 2010) are less suitable for a capacity that, “even when it takes us to meekness, means the strength and power to use one’s energy” for one’s previously chosen purposes (Roszak, 2022). (This author, instead of “self-control”, uses the traditionally moral term “fortitude”. But I cannot adopt such a use, since in my view (Bejarano, 2022) self-control is not necessarily moral).
29
Could Bryant et al. (2024) reinforce that claim? They state: “Our findings support a two-step evolutionary process, in which changes in prefrontal cortex organization emerge prior to changes in temporal areas”.
30
Certainly, I’m not really proposing these examples, but just putting them here to facilitate the exposition. However, I want to mention Breil et al. (2022), who investigated the unified reception (in their words, “the early temporal integration”) of gaze and emotion cues, and “suggest a processing benefit when emotional expression (happy/disgusted) and gaze (direct/averted) are congruent in terms of approach- or avoidance-orientation”.
31
Remember that, much later in development, our current narrative speech also uses gestural ‘theatricalization’ (whose effects Rühlemann & Trujillo, 2024 have studied in detail) and affective prosody. Likewise, ‘symbolic play’ (or ‘pretense’) might train this ‘intentional control and use’ of behavioral and even ‘autonomic’ levels.
32
This capacity of recognition is so adaptive that ‘the possibility of false positives’ (i.e., the currently much-discussed ‘overextension of Theory-of-Mind’; see, e.g., Bering, 2011) doesn’t matter, especially since exercising that capacity makes it stronger. This is a repetition of what happened at a much earlier point in evolution with the detection of agency.
33
Obviously, there is an easier type of communication that is present in many more animal species: In it, individuals accumulate evidence through ‘many pairs of eyes’, for example. Thus, “cues and signals from other individuals (e.g., fleeing movements and alarm calls) reduce uncertainty about predator risk” (Hahn et al., 2024, preprint).
34
Likewise, human infants produce “ostensive gestures with an object” months before making pointing gestures: Rodríguez et al. (2015) and Guevara et al. (2024).
35
Ontogenetically, that estimation is a difficult process, even in its prerequisite: Note that caregivers may naturally express their emotions in ways that maximize learning possibilities, e.g., “emotionese”: see Benders (2013), or A. Ruba and Repacholi (2020).
36
Thus, the pleasure of laughter (a pleasure not entirely exclusive to humans, but certainly a universally human characteristic) arose in evolution because it might (this is the explanation I choose) prevent frequent failures from discouraging primate brains from making ever more complex expectations. The infant and the chimpanzee know when they are going to be tickled, but they fail to predict the exact point or the exact instant. Likewise, we laugh when, after activating the vicarious expectation that the observed individual will sit down, we see him fall over. However, the predictive failure of the continuation of the narrative after the punchline is not directly a failure of prediction, but one of inadequate and incomplete understanding of the preceding part. Thus, it is only this kind of laughter that fosters the cognitive humility that is necessary for creativity.
37
‘Say’ was even later used in ‘first person + present + affirmative’, an apparently tautological use which came to fulfill a new function, but still originally related, in my view, to ‘referred speech’. With these uses the speaker communicates that he is aware of how his speech looks—and could be referred—from the outside. This may have been the ‘interpersonal’ origin of the (later, more culturally and institutionally supported) ‘performatives’: Let’s compare ‘I say that…’ with ‘I swear that…’ (which was the example chosen by Benveniste, 1958/1966).
38
In grateful response to Reviewer 1, I want to add that I highly value Donald (1991) (of which I published in 1996 “Recensión de Donald, 1991 y 1993”), especially the idea that beyond animal memory (which probably only stores the—so to speak—‘moral of the story’ of past events, that is, only what may ever be immediately useful), there are three memory transitions (in my view, supported respectively by non-syntactic multimodal communications, syntactic language, and writing).
39
The appeal to such a ‘genetic start-kit’ is, unsurprisingly, rejected in writings in the behaviorist tradition dealing with Theory-of-Mind. One such paper is Schlinger (2009) (which was recommended to me by Reviewer 1). While I do not accept Schlinger’s rejection of the genetic basis of Theory-of-Mind, I do share his criticism that (sometimes, I would qualify) “discussions of ToM focus almost exclusively on inferred cognitive structures and processes and shed little light on the actual behaviors involved” (see, for example, in Section 5.2, my question about how the experiencer of self-conscious emotions is aware of what others think of him/her).
40
In the words of Uomini and Ruck (2019) (who exemplify this attitude in their study of the emergence of human handedness): “The paucity of data is an obstacle in studying cognitive evolution, but this has not stopped researchers from trying”. I love that “but”.
41
About ‘spontaneous altruism’: See M. Tomasello (2012), Rand et al. (2012), and, especially, “self-other merging” (Miyazono & Inarimori, 2021) and “goal slippage” (Michael & Székely, 2019). Let us also focus on the unquestionable footprints of caring for the ill or the wounded that have been found in Neanderthals: At least we cannot doubt “the selective advantages of reducing the risk of mortality of other group members in small groups whose members are highly interdependent” (Spikins et al., 2019, my emphasis). Spontaneous altruism is ontogenetically earlier than the motivation to improve one’s reputation by helping: See Hepach et al. (2022). About the (probably, very primitive) type of spontaneous altruism that, “connected to reactive, non-cognitive fear circuits, helps others under threat” (for instance, in social hunters): See J. B. Vieira et al. (2020), J. Vieira and Olsson (2022).
42
According to M. Tomasello and Call (2019), “attention-getters, since they manipulate attention of addressees, evolutionarily precede pointing gestures, while intention-movements, since they manipulate the imagination, precede pantomimes”. I agree with such a difference, but my interest is now in the similarity of both receptions.
43
See also Bohn et al. (2020), who report that apes do not learn from iconic gestures.
44
When infants first understand pointing in a unified way, do they understand it only when the producer addresses it to them? Clark (1996) claimed: “The basic arena for social interaction is the dyad”. Certainly, some findings might seem to challenge that claim. (Thiele et al., 2023 report that “observed joint attention” already modulates 9-month-old infants’ object encoding. Likewise, according to Goupil et al. (2024), both humans and macaques show spontaneous preference to look at two bodies facing towards each other). However, those findings do not seem to me to involve that challenge. People’s movements are always salient stimuli, of course, but, in my view, the ‘ability to capture other people’s mental contents’ is not required in those experimental situations. Thus, according to my proposal, “the dyad” can be maintained for the very origin of the human mode of receiving pointing gestures.
45
Bejarano (2011), Chapter 6: My argumentation started by focusing on the reception (see Rubio-Fernandez, 2020) of the most egocentric deictics (here vs. there; this vs. that; I vs. you), i.e., of the words that the addressee has to understand in a different way than the way he, the now addressee, uses them when he is the speaker. But I extended it to any linguistic reception.
46
What about dogs? Eye contact (i.e., the communicator making eye contact with the dog) is the major cue that dogs use to determine when a human pointing gesture is intended for them. (See Kaminski & Nitzschner, 2013; Téglás et al., 2012). However, Lyn et al. (2024, preprint) may have slightly lowered the initial triumphalism: Since dogs have more difficulty in following contralateral pointing, these authors suggest that ipsilateral points are learned through associative mechanisms. In general, Project MANYDOGS will try to replicate previous findings. But it is worth remembering Zuberbühler (2008): “Social carnivores must decide on one particular prey individual prior to group hunting”. Thus, if the dominant wolf remains for a few moments looking at, or making some movement towards, a particular prey, this could be an innately communicative signal, which would pre-activate in the members of the pack a plan of attack in the signaled direction. So, when, shortly after, the wolf-recipient feels that he is being looked at by the dominant individual, he starts his previously pre-activated attack plan. In this way, dogs would just enrich their innate expectation of the first signal, i.e., they would learn to associate their innate expectation with some other features (hand or finger).
47
This possibility is not at all an absurd suggestion. Firstly, within the lineage of Sapiens and even in dates totally within the (formerly so-called) ‘anatomically modern humans’, there is a marked evolution in the shape of the cranium: See Neubauer et al. (2018) (although, at least since 160,000 BP, these differences with living humans would mainly affect, according to Zollikofer et al., 2022, the face and cranial base). See also Freidline et al. (2024): “The unique facial growth pattern of Homo sapiens post-dated the Middle Stone Age”. Secondly, regarding our progressive loss of prominent brow ridges (which were very prominent in Neanderthals), Godinho et al. (2018) reject the old hypotheses on such absence and suggest “its potential role in social communication”. (See Siposova et al., 2018, who underline the role of raised and highly mobile eyebrows in “the reception of communicative looks”. Likewise, Gast (2023) focuses on the link between linguistic prosody and eyebrow movement). In addition, I ask: Could the chin, whose absence in Neanderthals has been so studied (cf. Meneganzin et al., 2024), strengthen the gestural, emotional expressivity of the mouth? (Remember Section 5.2 above).
48
‘Embodied’ is a term that I have decided to use, although it does not really make sense in a position (such as mine) that opposes dualisms, both the body-mind dualism of the cognitive revolution (about this debate, see an excellent summary in Barrett & Stout, 2024) and the various body-soul dualisms. Indeed, I believe not only that animal consciousness emanates from the evolved complexity of the animal body, but also that the most spiritual capacities of human beings (see previous note 16) are the product of the extremely, wonderfully evolved matter that forms our bodies.
49
Regarding such later rest, I would underline: (1) creative (technical, artistic, or scientific) problem-solving, that is, the ability to transform one’s insufficient mental contents into sufficient ones to solve the problem, and (2) what I called in previous note 16 ‘the most demanding moral capacity’.
50
Bejarano (2022): “The current focus on hominids and Neanderthals opens a new door for us which was undreamt of for previous philosophers and scholars”. Or, much more precisely, Currie et al. (2024): “Philosophical methodology can benefit greatly from interaction with cognitive paleoanthropology. […] Coherent evolutionary narratives is a means of readmitting synthesis to the philosophical toolkit”.

References

  1. Algoe, S. B., & Haidt, J. (2009). Witnessing excellence in action: The ‘other-praising’ emotions of elevation, gratitude, and admiration. The Journal of Positive Psychology, 4(2), 105–127.
  2. Andersson, C., & Tennie, C. (2023). Zooming out the microscope on cumulative cultural evolution: ‘Trajectory B’ from animal to human culture. Humanities and Social Sciences Communications, 10, 1–20.
  3. André, J., Baumard, N., & Boyer, P. (2023). Cultural Evolution from the Producers’ Standpoint. Evolutionary Human Sciences, 5, 1–24.
  4. Bar, M. (2007). The proactive brain: Using analogies and associations to generate predictions. Trends in Cognitive Sciences, 11(7), 280–289.
  5. Barone, P., & Gomila, A. (2019). Infants’ performance in the indirect false belief tasks: A second-person interpretation. Cognitive Science, 12(3), e1551.
  6. Barone, P., Wenzel, L., Proft, M., & Rakoczy, H. (2022). Do young children track other’s beliefs, or merely their perceptual access? An interactive, anticipatory measure of early theory of mind. Royal Society Open Science, 9(10), 211278. Available online: https://royalsocietypublishing.org/doi/10.1098/rsos.211278 (accessed on 4 November 2024).
  7. Barrett, L., & Stout, D. (2024). Minds in movement: Embodied cognition in the age of artificial intelligence. Philosophical Transactions B, 379, 20230144.
  8. Baumard, N., André, J., & Sperber, D. (2013). A mutualistic approach to morality. The evolution of fairness by partner choice. Behavioral and Brain Sciences, 36, 59–78.
  9. Becker, Y., Claidière, N., Margiotoudi, K., Marie, D., Roth, M., Nazarian, B., Anton, J., Coulon, O., & Meguerditchian, A. (2022). Broca’s cerebral asymmetry reflects gestural communication’s lateralisation in monkeys (Papio anubis). eLife, 11. Available online: https://elifesciences.org/articles/70521 (accessed on 4 November 2024).
  10. Becker, Y., Sein, J., Velly, L., Giacomino, L., Renaud, L., Lacoste, R., Anton, J., Nazarian, B., Berne, C., & Meguerditchian, A. (2021). Early Left-Planum Temporale Asymmetry in Newborn Monkeys (Papio anubis): A Longitudinal Structural MRI Study at Two Stages of Development. NeuroImage, 227, 117575.
  11. Bejarano, T. (2008, March 12–15). Pragmatics and theory-of-mind: A problem exportable to the origins of language. Conference ‘Evolang 7’, Barcelona, Spain. Available online: https://www.worldscientific.com/doi/abs/10.1142/9789812776129_0003 (accessed on 4 November 2024).
  12. Bejarano, T. (2010). Review of Hurford, James, 2007, The origins of meaning. Teorema, 29, 157–164. Available online: http://www.lel.ed.ac.uk/~jim/origins.revu.bejarano.html (accessed on 4 November 2024).
  13. Bejarano, T. (2011). Becoming Human: From pointing gestures to syntax. Benjamins. Available online: https://benjamins.com/catalog/aicr.81 (accessed on 4 November 2024).
  14. Bejarano, T. (2014). From holophrase to syntax: Intonation and the victory of voice over gesture. HUMANA. MENTE. Journal of Philosophical Studies, 27, 21–37. Available online: https://www.humanamente.eu/index.php/HM/article/view/95 (accessed on 4 November 2024).
  15. Bejarano, T. (2022). The most demanding moral capacity: Could evolution provide any base? Isidorianum, 31(2), 91–126. Available online: https://www.sanisidoro.net/publicaciones/index.php/isidorianum/article/view/Bejarano (accessed on 4 November 2024).
  16. Benders, T. (2013). Mommy is only happy! Dutch mothers’ realisation of speech sounds in infant-directed speech expresses emotion, not didactic intent. Infant Behavior and Development, 36(4), 847–862.
  17. Benítez-Burraco, A., & Nikolsky, A. (2023). The (Co)evolution of language and music under human self-domestication. Human Nature, 34(2), 229–275.
  18. Benveniste, E. (1966). De la subjectivité dans le langage. In Problèmes de linguistique générale. Gallimard. Original work published 1958.
  19. Bering, J. (2011). The belief instinct: The psychology of souls, destiny, and the meaning of life. W.W. Norton.
  20. Berio, L., & Moore, R. (2023). Great ape enculturation studies: A neglected resource in cognitive development research. Biology & Philosophy, 38, 1–24.
  21. Berke, M., Horschler, D., Jara-Ettinger, J., & Santos, L. (2023). Differences between human and non-human primate theory of mind: Evidence from computational modeling. bioRxiv.
  22. Bermúdez, J. P., Berthelette, S., Anaya, A., Fernández-Miranda, G., & Téllez, D. R. (2024). Temptation and apathy. Oxford Studies in Agency and Responsibility Volume 8: Non-Ideal Agency and Responsibility, 8, 10.
  23. Bohn, M., Kordt, C., Braun, M., Call, J., & Tomasello, M. (2020). Learning novel skills from iconic gestures: A developmental and evolutionary perspective. Psychological Science, 31(7), 873–880.
  24. Bohn, M., Liebal, K., Oña, L., & Tessler, M. H. (2022). Great ape communication as contextual social inference: A computational modelling perspective. Philosophical Transactions of the Royal Society B: Biological Science, 377, 20210096.
  25. Bonini, L., Rotunno, C., Arcuri, E., & Gallese, V. (2023). The mirror mechanism: Linking perception and social interaction. Trends in Cognitive Sciences, 27(3), 220–221.
  26. Bornstein, O., Moran, T., Simchon, A., & Eyal, T. (2023). The effect of psychological distance on the experience of joy versus pride. Social Cognition, 41(4), 341–364.
  27. Bråten, S. (2004). Hominin Infant Decentration Hypothesis: Mirror neurons system adapted to subserve mother-centered participation. Behavioral and Brain Sciences, 27, 508–509.
  28. Breil, C., Raettig, T., Pittig, R., van der Wel, R. P. R. D., Welsh, T., & Böckler, A. (2022). Don’t Look at Me Like That: Integration of Gaze Direction and Facial Expression. Journal of Experimental Psychology. Human Perception and Performance, 48, 1083–1098.
  29. Brinums, M., Franco, C., Kang, J., Suddendorf, T., & Imuta, K. (2023). Driven by emotion: Anticipated feelings motivate children’s deliberate practice. Cognitive Development, 66, 101340.
  30. Bryant, K., Camilleri, J., Warrington, S., Blazquez Freches, G., Sotiropoulos, S., Jbabdi, S., Eickhoff, S., & Mars, R. (2024). Connectivity profile and function of uniquely human cortical areas. bioRxiv, 2024-06.
  31. Buckner, C. (2013). Morgan’s Canon, meet Hume’s Dictum: Avoiding anthropofabulation in cross-species comparisons. Biology & Philosophy, 28, 853–871.
  32. Bugnyar, T., & Heinrich, B. (2005). Ravens differentiate between knowledgeable and ignorant competitors. Proceedings of the Royal Society B, 272, 1641–1646.
  33. Bugnyar, T., Reber, S. A., & Buckner, C. (2016). Ravens attribute visual access to unseen competitors. Nature Communications, 7(1), 10506. Available online: https://www.nature.com/articles/ncomms10506 (accessed on 4 November 2024).
  34. Cañigueral, R., Krishnan-Barman, S., & Hamilton, A. F. d. C. (2022). Social signalling as a framework for second-person neuroscience. Psychonomic Bulletin & Review, 29, 2083–2095.
  35. Cartmill, E., Cartmill, M., Brown, K., & Foster, J. (2024). Which came first—Iconicity or symbolism? Evolang XV. Available online: https://evolang2024.github.io/proceedings/schedule.html (accessed on 4 November 2024).
  36. Caspar, K. R., Biggemann, M., Geissmann, T., & Begall, S. (2021). Ocular pigmentation in humans, great apes, and gibbons is not suggestive of communicative functions. Scientific Reports, 11(1), 12994.
  37. Castro, L., & Toro, M. A. (2004). The evolution of culture: From primate social learning to human culture. Proceedings of the National Academy of Sciences USA, 101, 10235–10240.
  38. Castro, L., Castro-Nogueira, M. Á., & Toro, M. Á. (2024). Teaching and the origin of the normativity. Biology & Philosophy, 39, 23.
  39. Chellappoo, A. (2021). Rethinking Prestige Bias. Synthese, 198, 8191–8212.
  40. Cheng, P. W., & Holyoak, K. J. (1985). Pragmatic reasoning schemas. Cognitive Psychology, 17, 391–416.
  41. Cisek, P. (2019). Resynthesizing behavior through phylogenetic refinement. Attention, Perception & Psychophysics, 81(7), 2265–2287.
  42. Cisek, P. (2021). Evolution of behavioural control from chordates to primates. Philosophical Transactions of the Royal Society B: Biological Sciences, 377, 20200522.
  43. Clark, H. (1996). Using language. Cambridge U. P.
  44. Clark, H., Elsherif, M. M., & Leavens, D. A. (2019). Ontogeny versus phylogeny in primate/canid comparisons: A metaanalysis of the object choice task. Neuroscience and Biobehavioral Reviews, 105, 178–189.
  45. Clements, W. A., & Perner, J. (1994). Implicit understanding of belief. Cognitive Development, 9(4), 377–395.
  46. Coolidge, F. (2023). Parietal lobe expansion, its consequences for working memory, and the evolution of modern thinking. Cognitive Archaeology, Body Cognition, and the Evolution of Visuospatial Perception, 2023, 181–194.
  47. Cooperrider, K., & Slotta, J. (2018). The preference for pointing with the hand is not universal. Cognitive Science, 42(1), 1375–1390.
  48. Corballis, M. (2000). Much ado about mirrors. Psychonomic Bulletin & Review, 7, 163–169.
  49. Corballis, M. (2001). Why Mirrors Reverse Left and Right. Psycoloquy, 12, 1–4. Available online: https://www.cogsci.ecs.soton.ac.uk/cgi/psyc/newpsy?12.032 (accessed on 4 November 2024).
  50. Corballis, M. (2011). The recursive mind: The origins of human language, thought, and civilization. Princeton University Press.
  51. Cosmides, L. (1989). The logic of social exchange: Has natural selection shaped how humans reason? Cognition, 31(3), 187–276.
  52. Crespi, B. J., Flinn, M. V., & Summers, K. (2022). Runaway social selection in human evolution. Frontiers in Ecology and Evolution, 10, 894506.
  53. Csibra, G., & Gergely, G. (2006). Social learning and social cognition: The case for pedagogy. In Y. Munakata, & M. H. Johnson (Eds.), Processes of change in brain and cognitive development (pp. 249–274). Academia.
  54. Currie, A., Killin, A., Lequin, M., Meneganzin, A., & Pain, R. (2024). Past materials, past minds: The philosophy of cognitive paleoanthropology. Philosophy Compass, 19, e13001.
  55. Darwin, C. (1872). The expression of the emotions in man and animals. John Murray.
  56. De Waal, F. (2010). The age of empathy. Three Rivers Press.
  57. De Waal, F., & Ferrari, P. (2010). Toward a bottom-up perspective on animal and human cognition. Trends in Cognitive Sciences, 14, 201–207.
  58. Di Bernardi Luft, C., Zioga, I., Giannopoulos, A., Di Bona, G., Binetti, N., Civilini, A., Latora, V., & Mareschal, I. (2022). Social synchronization of brain activity increases during eye-contact. Communications Biology, 5(1), 412.
  59. Di Francesco, M., Marraffa, M., & Paternoster, A. (2021). A self properly embodied. In The Jamesian mind. Routledge.
  60. Dingemanse, M., & Enfield, N. (2023). Interactive repair and the foundations of language. Trends in Cognitive Sciences, 28(1), 30–42.
  61. Dingemanse, M., Torreira, F., & Enfield, N. J. (2013). Is “Huh?” a universal word? Conversational infrastructure and the convergent evolution of linguistic items. PLoS ONE, 8(11), e78273.
  62. Donald, M. (1991). Origins of the modern mind: Three stages in the evolution of culture and cognition. Harvard University Press.
  63. Dor, D. (2016). From experience to imagination: Language and its evolution as a social communication technology. Journal of Neurolinguistics, 43, 107–119.
  64. Dor, D. (2023). Communication for collaborative computation: Two major transitions in human evolution. Philosophical Transactions of the Royal Society of London. Series B, Biological Sciences, 378(1872), 20210404.
  65. Dreon, R. (2024). Enlanguaged experience. Pragmatist contributions to the continuity between experience and language. Phenomenology and the Cognitive Sciences, 24(1), 63–83.
  66. Durdevic, K., & Call, J. (2022). On the origins of mind: A comparative perspective. Annual Review of Developmental Psychology, 4, 63–87.
  67. Edwards-Lowe, G., La Chiusa, E., Olawole-Scott, H., & Yon, D. (2024). Information seeking without metacognition. Available online: https://osf.io/preprints/psyarxiv/cf4a7_v1 (accessed on 4 November 2024).
  68. Ereira, S., Dolan, R., & Kurth-Nelson, Z. (2018). Agent-specific learning signals for self–other distinction during mentalising. PLoS Biology, 16(4), e2004752.
  69. Ericsson, K. A. (2002). Attaining excellence through deliberate practice: Insights from the study of expert performance. In M. Ferrari (Ed.), The pursuit of excellence through education (pp. 21–55). Lawrence Erlbaum Associates Publishers.
  70. Errante, A., Gerbella, M., Mingolla, G. P., & Fogassi, L. (2023). Activation of cerebellum, basal ganglia and thalamus during observation and execution of mouth, hand, and foot actions. Brain Topography, 36(4), 476–499.
  71. Essler, S., Becher, T., Pletti, C., Gniewosz, B., & Paulus, M. (2023). Longitudinal evidence that infants develop their imitation abilities by being imitated. Current Biology, 33(21), 4674–4678.e3.
  72. Fedorenko, E., Piantadosi, S. T., & Gibson, E. A. F. (2024). Language is primarily a tool for communication rather than thought. Nature, 630, 575–586.
  73. Fleming, L. (2017). Phoneme inventory size and the transition from monoplanar to dually patterned speech. Journal of Language Evolution, 2(1), 52–56.
  74. Fodor, J. (1975). The language of thought. Harvard University Press.
  75. Fodor, J. (2007). The revenge of the given. In B. McLaughlin, & J. Cohen (Eds.), Contemporary debates in philosophy of mind (pp. 105–116). Blackwell.
  76. Foley, R., & Mirazón, L. (2020). Variable cognition in the evolution of Homo: Biology and behaviour in the African Middle Stone Age. In Landscapes of human evolution (pp. 125–141). Archaeopress Publishing Ltd.
  77. Freidline, S. E., Gunz, P., Alichane, H., Oujaa, A., Ben-Ncer, A., El Hajraoui, M. A., & Hublin, J. (2024). The undescribed juvenile maxilla from Contrebandiers Cave, Morocco—A study on Middle Stone Age facial growth. Journal of Paleolithic Archaeology, 7, 15.
  78. Frith, C. D., & Frith, U. (2007). Social cognition in humans. Current Biology, 17(16), R724–R732.
  79. Gainotti, G. (2024). Emotions related to threatening events are mainly linked to the right hemisphere. Journal of Psychiatry & Neuroscience, 49(3), E208–E211.
  80. Gallagher, S. (2015). The problem with 3-year-olds. Journal of Consciousness Studies: Controversies in Science and the Humanities, 22(1–2), 160–182.
  81. Gallardo, G., Eichner, C., Sherwood, C. C., Hopkins, W. D., Anwander, A., & Friederici, A. D. (2023). Morphological evolution of language-relevant brain areas. PLoS Biology, 21, e3002266.
  82. Gallese, V. (2018). The Problem of Images: A view from the brain-body. Phenomenology and Mind, 14, 70–79.
  83. Gärdenfors, P. (2022). Teaching as evolutionary precursor to language. Frontiers in Communication, 7, 970069.
  84. Gärdenfors, P., & Lombard, M. (2020). Technology led to more abstract causal reasoning. Biology & Philosophy, 35, 40.
  85. Gasparri, L. (2023). The first words ever spoken. Synthese, 201, 174.
  86. Gast, V. (2023). The temporal alignment of speech-accompanying eyebrow movement and voice pitch. Behavioral Sciences, 13(1), 52.
  87. Geldhof, J. (2005). ‘Cogitor ergo sum’: On the meaning and relevance of Baader’s theological critique of Descartes. Modern Theology, 21(2), 237–251.
  88. Geurts, B. (2019, September 25–27). What’s wrong with Gricean pragmatics? 10th International Conference of Experimental Linguistics, Lisbon, Portugal.
  89. Godinho, R. M., Spikins, P., & O’Higgins, P. (2018). Supraorbital morphology and social dynamics in human evolution. Nature Ecology & Evolution, 2(6), 956–961. [Google Scholar] [CrossRef]
  90. Goupil, N., Rayson, H., Serraille, É., Massera, A., Ferrari, P. F., Hochmann, J., & Papeo, L. (2024). Visual preference for socially relevant spatial relations in humans and monkeys. Psychological Science, 35(6), 681–693. [Google Scholar] [CrossRef]
  91. Graham, K. E., Rossano, F., & Moore, R. T. (2024). The origin of great ape gestural forms. Biological Reviews of the Cambridge Philosophical Society, 100(1), 190–204. [Google Scholar] [CrossRef]
  92. Guevara, I., Rodríguez, C., & Núñez, M. (2024). Developing gestures in the infant classroom: From showing and giving to pointing. European Journal of Psychology of Education, 39, 4671–4702. [Google Scholar] [CrossRef]
  93. Haggard, P., & Wolpert, D. (2005). Disorders of body scheme. In Higher-order motor disorders (pp. 261–271). Oxford University Press. [Google Scholar]
  94. Hahn, L., Sergiou, A., Arbon, J., Fuertbauer, I., King, A., & Thornton, A. (2024). The co-evolution of cognition and sociality. Available online: https://osf.io/preprints/osf/n2z4a_v1 (accessed on 4 November 2024).
  95. Happé, F. (1993). Communicative competence and theory of mind in autism: A test of relevance theory. Cognition, 48, 101–119. [Google Scholar] [CrossRef]
  96. Heintz, C., & Scott-Phillips, T. (2022). Expression unleashed: The evolutionary & cognitive foundations of human communication. Behavioral and Brain Sciences, 46, 1–46. [Google Scholar] [CrossRef]
  97. Henrich, J., & Broesch, J. (2011). On the nature of cultural transmission networks: Evidence from Fijian villages for adaptive learning biases. Philosophical Transactions of the Royal Society B: Biological Sciences, 366, 1139–1148. [Google Scholar] [CrossRef] [PubMed]
  98. Hepach, R., Engelmann, J. M., Herrmann, E., Gerdemann, S. C., & Tomasello, M. (2022). Evidence for a developmental shift in the motivation underlying helping in early childhood. Developmental Science, 26, e13253. [Google Scholar] [CrossRef]
  99. Heyes, C. (2021a). Imitation. Current Biology, 31(5), R228–R232. [Google Scholar] [CrossRef] [PubMed]
  100. Heyes, C. (2021b). Imitation and culture: What gives? Mind and Language, 38(1), 42–63. [Google Scholar] [CrossRef]
  101. Heyes, C., & Catmur, C. (2022). What happened to mirror neurons? Perspectives on Psychological Science, 17(1), 153–168. [Google Scholar] [CrossRef] [PubMed]
  102. Heyes, C. M., & Frith, C. D. (2014). The cultural evolution of mind reading. Science, 344, 1243091. [Google Scholar] [CrossRef] [PubMed]
  103. Hobaiter, C., Leavens, D. A., & Byrne, R. W. (2014). Deictic gesturing in wild chimpanzees? Journal of Comparative Psychology, 128, 82–87. [Google Scholar] [CrossRef]
  104. Hockett, C. (1960). The origin of speech. Scientific American, 203, 88–111. [Google Scholar] [CrossRef]
  105. Hurford, J. (2007). The origins of meaning. Oxford University Press. [Google Scholar]
  106. Joyce, R. (2007). The evolution of morality. MIT Press. [Google Scholar]
  107. Kaminski, J., & Nitzschner, M. (2013). Do dogs get the point? A review of dog–human communication ability. Learning and Motivation, 44(4), 294–302. [Google Scholar] [CrossRef]
  108. Kano, F., Furuichi, T., Hashimoto, C., Krupenye, C., Leinwand, J. G., Hopper, L. M., Martin, C. F., Otsuka, R., & Tajima, T. (2022). What is unique about the human eye? Comparative image analysis on the external eye morphology of human and nonhuman great apes. Evolution and Human Behavior, 43(3), 169–180. [Google Scholar] [CrossRef]
  109. Kano, F., Krupenye, C., Hirata, S., Call, J., & Tomasello, M. (2017). Submentalizing cannot explain belief-based action anticipation in apes. Trends in Cognitive Sciences, 21(9), 633–634. [Google Scholar] [CrossRef] [PubMed]
  110. Karabegović, M., & Mercier, H. (2023). The reputational benefits of intellectual humility. Review of Philosophy and Psychology, 15(2), 483–498. [Google Scholar] [CrossRef]
  111. Karg, K., Schmelz, M., Call, J., & Tomasello, M. (2015). The goggles experiment: Can chimpanzees use self-experience to infer what a competitor can see? Animal Behaviour, 105, 211–221. [Google Scholar] [CrossRef]
  112. Karg, K., Schmelz, M., Call, J., & Tomasello, M. (2016). Differing views: Can chimpanzees do level 2 perspective-taking? Animal Cognition, 19, 555–564. [Google Scholar] [CrossRef]
  113. Keysers, C., & Perrett, D. (2004). Demystifying social cognition: A Hebbian perspective. Trends in Cognitive Sciences, 8, 501–507. [Google Scholar] [CrossRef] [PubMed]
  114. Kishimoto, T., Shizawa, Y., Yasuda, J., Hinobayashi, T., & Minami, T. (2007). Do pointing gestures by infants provoke comments from adults? Infant Behavior and Development, 30, 562–567. [Google Scholar] [CrossRef] [PubMed]
  115. Klein, J. T., Shepherd, S. V., & Platt, M. L. (2009). Social attention and the brain. Current Biology, 19, R958–R962. [Google Scholar] [CrossRef] [PubMed]
  116. Kobayashi, H., & Kohshima, S. (2001). Unique morphology of the human eye and its adaptive meaning. Journal of Human Evolution, 40, 419–435. [Google Scholar] [CrossRef]
  117. Krupenye, C., Kano, F., Hirata, S., Call, J., & Tomasello, M. (2016). Great apes anticipate that other individuals will act according to false beliefs. Science, 354(6308), 110–114. [Google Scholar] [CrossRef] [PubMed]
  118. Laland, K. (2017). The origins of language in teaching. Psychonomic Bulletin & Review, 24(1), 225–231. [Google Scholar] [CrossRef]
  119. Lameira, A. R., Hardus, M. E., Ravignani, A., Raimondi, T., & Gamba, M. (2024). Recursive self-embedded vocal motifs in wild orangutans. eLife, 12, RP88348. [Google Scholar] [CrossRef]
  120. Leary, M. (2004). The sociometer. In R. Baumeister, & K. Vohs (Eds.), Handbook of self-regulation (pp. 373–391). Guilford. [Google Scholar]
  121. Leavens, D. (2021). The referential problem space revisited: An ecological hypothesis of the evolutionary and developmental origins of pointing. Wiley Interdisciplinary Reviews: Cognitive Science, 12(4), e1554. [Google Scholar] [CrossRef] [PubMed]
  122. Leavens, D. A., Hopkins, W. D., & Bard, K. A. (2005). Understanding the point of chimpanzee pointing: Epigenesis and ecological validity. Current Directions in Psychological Science, 14(4), 185–189. [Google Scholar] [CrossRef] [PubMed]
  123. LeDoux, J. (2012). Rethinking the emotional brain. Neuron, 73, 653–676. [Google Scholar] [CrossRef] [PubMed]
  124. LeDoux, J. (2023). The deep history of ourselves: The four-billion-year story of how we got conscious brains. Philosophical Psychology, 36(4), 704–715. [Google Scholar] [CrossRef]
  125. Levy, A., & Weinshtock-Saadon, I. (2023). Evolutionary king of (arguments for) moral realism. Synthese, 201(5), 1–22. [Google Scholar] [CrossRef]
  126. Lewis, L., & Krupenye, C. (2022). Theory of mind in nonhuman primates. In Primate cognitive studies. Cambridge University Press. [Google Scholar] [CrossRef]
  127. Lewis, M. (2000). The emergence of human emotions. In M. Lewis, & J. Haviland-Jones (Eds.), Handbook of emotions (pp. 265–280). Guilford. [Google Scholar]
  128. Li, L. (2023). The other side of false belief: Constructing the objectivity of reality. Infant and Child Development, 32, e2416. [Google Scholar] [CrossRef]
  129. Lind, J., & Jon-And, A. (2024). A sequence bottleneck for animal intelligence and language? Trends in Cognitive Sciences. Online ahead of print. [Google Scholar] [CrossRef]
  130. Lipschits, O., & Geva, R. (2024). An integrative model of parent-infant communication development. Child Development Perspectives, 18(3), 137–144. [Google Scholar] [CrossRef]
  131. Lorenz, K. (1966). Evolution and modification of behaviour. Methuen. [Google Scholar]
  132. Lotem, A., Halpern, J. Y., Edelman, S., & Kolodny, O. (2017). The evolution of cognitive mechanisms in response to cultural innovations. Proceedings of the National Academy of Sciences USA, 114, 7915–7922. [Google Scholar] [CrossRef]
  133. Lurz, R. W., Krachun, C., Mareno, M. C., & Hopkins, W. D. (2022). Do chimpanzees predict others’ behavior by simulating their beliefs? Animal Behavior and Cognition, 9, 153–175. [Google Scholar] [CrossRef]
  134. Lyn, H., & Christopher, J. (2018). A point is not a point is not a point: Reinterpreting three basic kinds of pointing comprehension. Proceedings of Evolang 2018 (pp. 260–263). Available online: https://pure.mpg.de/rest/items/item_3190925_17/component/file_3260022/content (accessed on 4 November 2024).
  135. Lyn, H., Greenfield, P. M., Savage-Rumbaugh, S., Gillespie-Lynch, K., & Hopkins, W. D. (2011). Nonhuman primates do declare! A comparison of declarative symbol and gesture use in children, bonobos, and chimpanzees. Language & Communication, 31(1), 63–74. [Google Scholar] [CrossRef]
  136. Lyn, H., West, K., Villegas, J., Bass, C., & Baker, S. (2024). Pointing on the other side: Do dogs follow contralateral points? Available online: https://www.preprints.org/manuscript/202401.1896 (accessed on 4 November 2024).
  137. Maibom, H. (2010). The descent of shame. Philosophy and Phenomenological Research, 80(3), 566–594. [Google Scholar] [CrossRef]
  138. Margoni, F., Surian, L., & Baillargeon, R. (2023). The violation-of-expectation paradigm: A conceptual overview. Psychological Review, 131, 716–748. [Google Scholar] [CrossRef] [PubMed]
  139. Mayhew, J., & Gómez, J. C. (2015). Gorillas with white sclera. American Journal of Primatology, 77, 869–887. [Google Scholar] [CrossRef] [PubMed]
  140. Mearing, A. S., & Koops, K. (2021). Quantifying gaze conspicuousness: Are humans distinct from chimpanzees and bonobos? Journal of Human Evolution, 157, 103043. [Google Scholar] [CrossRef] [PubMed]
  141. Meguerditchian, A. (2022). On the gestural origins of language: What baboons’ gestures and brain have told us after 15 years of research. Ethology Ecology & Evolution, 34, 288–302. Available online: https://www.tandfonline.com/doi/full/10.1080/03949370.2022.2044388 (accessed on 4 November 2024).
  142. Meguerditchian, A., Molesti, S., & Vauclair, J. (2011). Right-handedness predominance in 162 baboons for gestural communication: Consistency across time and groups. Behavioral Neuroscience, 125, 653–660. [Google Scholar] [CrossRef] [PubMed]
  143. Melis, A., & Rossano, F. (2022). When and how do non-human great apes communicate to support cooperation? Philosophical Transactions of the Royal Society B: Biological Sciences, 377, 20210109. [Google Scholar] [CrossRef]
  144. Meneganzin, A., Ramsey, G., & DiFrisco, J. (2024). What is a trait? Lessons from the human chin. Journal of Experimental Zoology. Part B, Molecular and Developmental Evolution, 342, 65–75. [Google Scholar] [CrossRef] [PubMed]
  145. Michael, J., & Székely, M. (2019). Goal slippage: A mechanism for spontaneous instrumental helping in infancy? Topoi, 38, 173–183. [Google Scholar] [CrossRef]
  146. Miyazono, K., & Inarimori, K. (2021). Empathy, altruism, and group identification. Frontiers in Psychology, 12, 749315. [Google Scholar] [CrossRef] [PubMed]
  147. Moore, C. (2008). The development of gaze following. Child Development Perspectives, 2, 66–70. [Google Scholar] [CrossRef]
  148. Moore, R. (2013). Evidence and interpretation in great ape gestural communication. Humana.Mente Journal of Philosophical Studies, 6, 27–51. Available online: https://pure.mpg.de/rest/items/item_1838343_2/component/file_1838342/content (accessed on 4 November 2024).
  149. Moore, R. (2015). A common intentional framework for ape and human communication. Current Anthropology, 56(1), 56–80. [Google Scholar]
  150. Moore, R. (2020). The cultural evolution of mind-modelling. Synthese, 199(1), 1751–1776. [Google Scholar] [CrossRef]
  151. Morrison, D. (2020). Disambiguated indexical pointing as a tipping point for the explosive emergence of language among human ancestors. Biological Theory, 15, 196–211. [Google Scholar] [CrossRef]
  152. Mussavifard, N. (2023). Ostensive marking as a distinctive feature of human communication. Available online: https://www.researchgate.net/publication/372788891_Ostensive_Marking_as_a_Distinctive_Feature_of_Human_Communication (accessed on 4 November 2024).
  153. Mussavifard, N., & Csibra, G. (2023). The co-evolution of cooperation and communication: Alternative accounts. Behavioral and Brain Sciences, 46, e11. [Google Scholar] [CrossRef] [PubMed]
  154. Neubauer, S., Hublin, J., & Gunz, P. (2018). The evolution of modern human brain shape. Science Advances, 4, eaao5961. [Google Scholar] [CrossRef] [PubMed]
  155. Nevejans, M., & Cracco, E. (2022). Model expertise does not influence automatic imitation. Experimental Brain Research, 240(4), 1267–1277. [Google Scholar] [CrossRef]
  156. Okasha, S. (2022). Goal attributions in biology: Objective fact, anthropomorphic bias, or valuable heuristic? Available online: https://philsci-archive.pitt.edu/id/eprint/20701 (accessed on 4 November 2024).
  157. Onishi, K. H., & Baillargeon, R. (2005). Do 15-month-old infants understand false beliefs? Science, 308, 255–258. [Google Scholar] [CrossRef] [PubMed]
  158. Onu, D., Kessler, T., & Smith, J. R. (2016). Admiration: A conceptual review of the knowns and unknowns. Emotion Review, 8, 218–230. [Google Scholar] [CrossRef]
  159. Osiurak, F., Claidière, N., & Federico, G. (2022). Bringing cumulative technological culture beyond copying versus reasoning. Trends in Cognitive Sciences, 27, 30–42. [Google Scholar] [CrossRef] [PubMed]
  160. Osiurak, F., Crétel, C., Uomini, N., Bryche, C., Lesourd, M., & Reynaud, E. (2021). On the neurocognitive co-evolution of tool behavior and language: Insights from the massive redeployment framework. Topics in Cognitive Science, 13(4), 684–707. [Google Scholar] [CrossRef]
  161. Pan, X., Hsiao, V., Nau, D. S., & Gelfand, M. J. (2024). Explaining the evolution of gossip. Proceedings of the National Academy of Sciences USA, 121, e2214160121. [Google Scholar] [CrossRef]
  162. Pan, Y., Dikker, S., Goldstein, P., Zhu, Y., Yang, C., & Hu, Y. (2020). Instructor-learner brain coupling discriminates between instructional approaches and predicts learning. NeuroImage, 211, 116657. [Google Scholar] [CrossRef]
  163. Paulus, M., & Fikkert, P. (2013). Conflicting social cues: Infants’ reliance on gaze and pointing cues in word learning. Journal of Cognition and Development, 15, 43–59. [Google Scholar] [CrossRef]
  164. Peeters, A., Cosentino, E., & Werning, M. (2023). Constructing a wider view on memory: Beyond the dichotomy of field and observer perspectives. In A. Berninger, & Í. Vendrell Ferran (Eds.), Philosophical perspectives on memory and imagination (pp. 165–190). Routledge. [Google Scholar]
  165. Perea-García, J. O., Kret, M. E., Monteiro, A., & Hobaiter, C. (2019). Scleral pigmentation leads to conspicuous, not cryptic, eye morphology in chimpanzees. Proceedings of the National Academy of Sciences USA, 116(39), 19248–19250. [Google Scholar] [CrossRef]
  166. Perner, J., Priewasser, B., & Roessler, J. (2018). The practical other: Teleology and its development. Interdisciplinary Science Reviews, 43, 99–114. [Google Scholar] [CrossRef]
  167. Pfister, R., Klaffehn, A., Kalckert, A., Kunde, W., & Dignath, D. (2021). How to lose a hand: Sensory updating drives disembodiment. Psychonomic Bulletin & Review, 28, 827–833. Available online: https://link.springer.com/article/10.3758/s13423-020-01854-0 (accessed on 4 November 2024).
  168. Phillips, J., Buckwalter, W., Cushman, F., Friedman, O., Martin, A., Turri, J., Santos, L., & Knobe, J. (2020). Knowledge before belief. Behavioral and Brain Sciences, 44, 1–37. [Google Scholar] [CrossRef]
  169. Phillips, S. (2024). A category theory perspective on the language of thought. Frontiers in Psychology, 15, 1361580. [Google Scholar] [CrossRef] [PubMed]
  170. Piaget, J. (1954). La formation du symbole chez l’enfant. Delachaux & Niestlé. [Google Scholar]
  171. Piretti, L., Pappaianni, E., Garbin, C., Rumiati, R. I., Job, R., & Grecucci, A. (2023). The neural signatures of shame, embarrassment, and guilt: A voxel-based meta-analysis on functional neuroimaging studies. Brain Sciences, 13, 559. [Google Scholar] [CrossRef] [PubMed]
  172. Planer, R. (2019). The evolution of languages of thought. Biology and Philosophy, 34, 47. [Google Scholar] [CrossRef]
  173. Planer, R. (2023). The evolution of hierarchically structured communication. Frontiers in Psychology, 14, 1224324. [Google Scholar] [CrossRef] [PubMed]
  174. Planer, R., Bandini, E., & Tennie, C. (2024). Hominin tool evolution and its (surprising) relation to language origins. Available online: https://www.academia.edu/105665796/Hominin_Tool_Evolution_and_Its_Surprising_Relation_to_Language_Origins (accessed on 4 November 2024).
  175. Pomper, J. K., Shams, M., Wen, S., Bunjes, F., & Thier, P. (2023). Non-shared coding of observed and executed actions prevails in macaque ventral premotor mirror neurons. eLife, 12, e77513. [Google Scholar] [CrossRef] [PubMed]
  176. Poulin-Dubois, D., Goldman, E. J., Meltzer, A., & Psaradellis, E. (2023). Discontinuity from implicit to explicit theory of mind from infancy to preschool age. Cognitive Development, 65, 101273. [Google Scholar] [CrossRef]
  177. Pouw, W., Werner, R., Burchardt, L., & Selen, L. (2023). The human voice aligns with whole-body kinetics. bioRxiv. [Google Scholar] [CrossRef]
  178. Prein, J., Maurits, L., Werwach, A., Haun, D., & Bohn, M. (2024). Variation in gaze understanding across the life span: A process-level perspective. Available online: https://osf.io/preprints/psyarxiv/dy73a_v1 (accessed on 4 November 2024).
  179. Priest, M. (2017). Intellectual humility: An interpersonal theory. Ergo, 4, 463–480. [Google Scholar] [CrossRef]
  180. Rakoczy, H. (2022). Foundations of theory of mind and its development in early childhood. Nature Reviews Psychology, 1(4), 223–235. Available online: https://www.nature.com/articles/s44159-022-00037-z (accessed on 4 November 2024). [CrossRef]
  181. Rakoczy, H., & Proft, M. (2022). Knowledge before belief ascription? Yes and no (depending on the type of “knowledge” under consideration). Frontiers in Psychology, 13, 988754. [Google Scholar] [CrossRef] [PubMed]
  182. Rand, D., Greene, J., & Nowak, M. (2012). Spontaneous giving and calculated greed. Nature, 489, 427–430. [Google Scholar] [CrossRef]
  183. Reddy, V. (2010). How infants know minds. Harvard University Press. [Google Scholar]
  184. Rendall, D., Owren, M., & Ryan, M. (2009). What do animal signals mean? Animal Behaviour, 78, 233–240. [Google Scholar] [CrossRef]
  185. Rodríguez, C., Moreno-Núñez, A., Basilio, M., & Sosa, N. (2015). Ostensive gestures come first: Their role in the beginning of shared reference. Cognitive Development, 36, 142–149. [Google Scholar] [CrossRef]
  186. Rönnqvist, L. (2003). Developmentally, the arm preference precedes handedness. Behavioral and Brain Sciences, 26, 238–239. [Google Scholar] [CrossRef]
  187. Ross, L. (1977). The intuitive psychologist and his shortcomings: Distortions in the attribution process. In L. Berkowitz (Ed.), Advances in experimental social psychology (Vol. 10). Academic Press. [Google Scholar]
  188. Rossano, M. (2003). Expertise and the evolution of consciousness. Cognition, 89, 207–236. [Google Scholar] [CrossRef]
  189. Roszak, P. (2022). Not only coping: Resilience and its sources from a thomistic perspective. Journal of Religion and Health, 62(4), 2734–2745. [Google Scholar] [CrossRef]
  190. Royo, J., Orset, T., Catani, M., Pouget, P., & Thiebaut de Schotten, M. (2024). Evidence for an evolutionary continuity in social dominance: Insights from non-human primates tractography. Available online: https://www.researchsquare.com/article/rs-4772053/v1 (accessed on 4 November 2024).
  191. Ruba, A., & Repacholi, B. (2020). Beyond language in infant emotion concept development. Emotion Review, 12(4), 255–258. [Google Scholar] [CrossRef]
  192. Ruba, A. L., Pollak, S. D., & Saffran, J. R. (2022). Acquiring complex communicative systems: Statistical learning of language and emotion. Topics in Cognitive Science, 14(3), 432–450. [Google Scholar] [CrossRef]
  193. Rubio-Fernandez, P. (2020). Pragmatic markers: The missing link between language and theory of mind. Synthese, 199, 1125–1158. [Google Scholar] [CrossRef]
  194. Rühlemann, C., & Trujillo, J. (2024). The effect of gesture expressivity on emotional resonance in storytelling interaction. Frontiers in Psychology, 15, 1477263. [Google Scholar] [CrossRef]
  195. Scerri, E. M., & Will, M. (2023). The revolution that still isn’t: The origins of behavioral complexity in Homo sapiens. Journal of Human Evolution, 179, 103358. [Google Scholar] [CrossRef]
  196. Schlinger, H. (2009). Theory of mind: An overview and behavioral perspective. The Psychological Record, 59, 435–448. Available online: https://link.springer.com/content/pdf/10.1007/BF03395673.pdf (accessed on 4 November 2024). [CrossRef]
  197. Schüler, C., Berger, P., & Grosse Wiesmann, C. (2024). A dorsal versus ventral network for understanding others in the developing brain. bioRxiv. [Google Scholar] [CrossRef]
  198. Schuwerk, T., Kampis, D., Baillargeon, R., Biro, S., Bohn, M., Byers-Heinlein, K., & Rakoczy, H. (2024). Project MANYBABIES. Registered report. Action anticipation based on an agent’s epistemic state in toddlers and adults. Available online: https://osf.io/preprints/psyarxiv/x4jbm_v1 (accessed on 4 November 2024).
  199. Scott-Phillips, T., & Heintz, C. (2023). Great ape interaction: Ladyginian but not Gricean. Proceedings of the National Academy of Sciences USA, 120, e2300243120. [Google Scholar] [CrossRef] [PubMed]
  200. Shilton, D., Breski, M., Dor, D., & Jablonka, E. (2020). Human social evolution: Self-domestication or self-control? Frontiers in Psychology, 11, 134. [Google Scholar] [CrossRef]
  201. Shimoni, E., Berger, A., & Eyal, T. (2022). Your pride is my goal: How the exposure to others’ positive emotional experience influences preschoolers’ delay of gratification. Journal of Experimental Child Psychology, 217, 105356. [Google Scholar] [CrossRef] [PubMed]
  202. Siposova, B., Tomasello, M., & Carpenter, M. (2018). Communicative eye contact signals a commitment to cooperate for young children. Cognition, 179, 192–201. [Google Scholar] [CrossRef] [PubMed]
  203. Southgate, V. (2020). Are infants altercentric? The other and the self in early social cognition. Psychological Review, 127(4), 505–523. [Google Scholar] [CrossRef]
  204. Southgate, V., Van Maanen, C., & Csibra, G. (2007). Infant pointing: Communication to cooperate or communication to learn? Child Development, 78(3), 735–740. Available online: https://srcd.onlinelibrary.wiley.com/doi/10.1111/j.1467-8624.2007.01028.x (accessed on 4 November 2024). [CrossRef] [PubMed]
  205. Spikins, P., Needham, A., Wright, B., Dytham, C., Gatta, M., & Hitchens, G. (2019). Living to fight another day: The ecological and evolutionary significance of Neanderthal healthcare. Quaternary Science Reviews, 217, 98–118. [Google Scholar] [CrossRef]
  206. Spurrett, D. (2024). Motivation and cumulative culture. Commentary on Sterelny and Hiscock, cumulative culture, archaeology, and the zone of latent solutions. Current Anthropology, 65(1), 23–48. Available online: https://www.journals.uchicago.edu/doi/10.1086/728723 (accessed on 4 November 2024).
  207. Sterelny, K. (2023). Niche construction, cumulative culture and the social transmission of expertise. PaleoAnthropology. Available online: https://paleoanthropology.org/ojs/index.php/paleo/article/view/119 (accessed on 4 November 2024).
  208. Sterelny, K., & Hiscock, P. (2024). Cumulative culture, archaeology, and the zone of latent solutions. Current Anthropology, 65(1), 23–48. [Google Scholar] [CrossRef]
  209. Steven, S., Cole, G., & Eacott, M. (2022). It’s not you, it’s me: A review of individual differences in visuospatial perspective taking. Perspectives on Psychological Science, 18(2), 293–308. [Google Scholar] [CrossRef]
  210. Sznycer, D. (2019). Forms and functions of the self-conscious emotions. Trends in Cognitive Sciences, 23(2), 143–157. [Google Scholar] [CrossRef] [PubMed]
  211. Sznycer, D., & Cohen, A. (2021). How pride works. Evolutionary Human Sciences, 3, 1–39. [Google Scholar] [CrossRef]
  212. Sznycer, D., Al-Shawaf, L., Bereby-Meyer, Y., Curry, O. S., De Smet, D., Ermer, E., Kim, S., Kim, S., Li, N. P., Seal, M. F. L., McClung, J., O, J., Ohtsubo, Y., Quillien, T., Schaub, M., Sell, A., van Leeuwen, F., Cosmides, L., & Tooby, J. (2017). Cross-cultural regularities in the cognitive architecture of pride. Proceedings of the National Academy of Sciences USA, 114, 1874–1879. [Google Scholar] [CrossRef]
  213. Tatone, D., & Csibra, G. (2015). Learning in and about opaque worlds. Behavioral and Brain Sciences, 38, e68. [Google Scholar] [CrossRef]
  214. Tattersall, I. (2023). Let sleeping syntheses lie. Special issue: Niche construction, plasticity, and inclusive inheritance: Rethinking human origins with the extended evolutionary synthesis, part 1. PaleoAnthropology, 2023, 258–265. [Google Scholar] [CrossRef]
  215. Tebbe, A. L., Rothmaler, K., Koester, M., & Wiesmann, C. (2024). Infants and adults neurally represent the perspective of others like their own perception. bioRxiv. [Google Scholar] [CrossRef]
  216. Téglás, E., Gergely, A., Kupán, K., Miklósi, Á., & Topál, J. (2012). Dogs’ gaze following is tuned to human communicative signals. Current Biology, 22(3), 209–212. [Google Scholar] [CrossRef]
  217. Tennie, C., Braun, D. R., Premo, L. S., & McPherron, S. P. (2016). The island test for cumulative culture in the Paleolithic. In M. Haidle, N. Conard, & M. Bolus (Eds.), The nature of culture. Springer Press. [Google Scholar] [CrossRef]
  218. Thiele, M., Kalinke, S., Michel, C., & Haun, D. B. M. (2023). Direct and observed joint attention modulate 9-month-old infants’ object encoding. Open Mind, 7, 917–946. [Google Scholar] [CrossRef] [PubMed]
  219. Thomas, E. R., Haarsma, J., Nicholson, J., Yon, D., Kok, P., & Press, C. (2024). Predictions and errors are distinctly represented across V1 layers. Current Biology, 34, 2265–2271.e4. [Google Scholar] [CrossRef]
  220. Thorne, T. N., Milyavskaya, M., Werner, K., Leduc-Cummings, I., Saunders, B., & Inzlicht, M. (2023). The personal goal difficulty—Progress paradox: Unraveling the role of self-efficacy on perceptions of goal difficulty. Available online: https://www.researchgate.net/publication/376174835_The_Personal_Goal_Difficulty_-_Progress_Paradox_Unraveling_the_Role_of_Self-Efficacy_on_Perceptions_of_Goal_Difficulty (accessed on 4 November 2024).
  221. Thornton, M., & Tamir, D. (2024). Neural representations of situations and mental states are composed of sums of representations of the actions they afford. Nature Communications, 15(1), 620. [Google Scholar] [CrossRef] [PubMed]
  222. Tomasello, M. (1999). The human adaptation for culture. Annual Review of Anthropology, 28, 509–529. [Google Scholar] [CrossRef]
  223. Tomasello, M. (2008). Origins of human communication. MIT Press. [Google Scholar]
  224. Tomasello, M. (2012). Why be nice? Better not think about it. Trends in Cognitive Sciences, 16, 580–581. [Google Scholar] [CrossRef] [PubMed]
  225. Tomasello, M. (2018). How children come to understand false beliefs: A shared intentionality account. Proceedings of the National Academy of Sciences USA, 115, 8491–8498. [Google Scholar] [CrossRef] [PubMed]
  226. Tomasello, M. (2022). Social cognition and metacognition in great apes: A theory. Animal Cognition, 26(1), 25–35. [Google Scholar] [CrossRef] [PubMed]
  227. Tomasello, M., & Call, J. (2019). Thirty years of great ape gestures. Animal Cognition, 22(4), 461–469. [Google Scholar] [CrossRef]
  228. Tomasello, M., Call, J., & Hare, B. (2003). Chimpanzees understand psychological states—The question is which ones and to what extent. Trends in Cognitive Sciences, 7, 153–156. [Google Scholar] [CrossRef]
  229. Tomasello, M., Hare, B., Lehmann, H., & Call, J. (2007). Reliance on head versus eyes in the gaze following of great apes and human infants: The cooperative eye hypothesis. Journal of Human Evolution, 52, 314–320. [Google Scholar] [CrossRef] [PubMed]
  230. Tomasello, R., Grisoni, L., Boux, I., Sammler, D., & Pulvermüller, F. (2022). Instantaneous neural processing of communicative functions conveyed by speech prosody. Cerebral Cortex, 32, 4885–4901. [Google Scholar] [CrossRef] [PubMed]
  231. Tomonaga, M., Kurosawa, Y., Kawaguchi, Y., & Takiyama, H. (2023). Don’t look back on failure: Spontaneous uncertainty monitoring in chimpanzees. Learning & Behavior, 51(4), 402–412. [Google Scholar] [CrossRef]
  232. Tracy, J. L., Mercadante, E., & Witkower, Z. (2024). The evolved nature of pride. In The Oxford handbook of evolution and the emotions (pp. 203–218). Oxford University Press. [Google Scholar] [CrossRef]
  233. Uomini, N., & Ruck, L. (2019). Testing models of handedness in stone tools. In Squeezing minds from stones. Oxford University Press. [Google Scholar] [CrossRef]
  234. van Leeuwen, E., Detroy, S., Haun, D., & Call, J. (2024). Chimpanzees use social information to acquire a skill they fail to innovate. Nature Human Behaviour, 8(5), 891–902. [Google Scholar] [CrossRef] [PubMed]
  235. van Woerkum, B., & Barrett, L. (2024). Anthropofabrication and the redressing of memory: An embodied approach to comparative cognition. Philosophical Transactions of the Royal Society B: Biological Sciences, 379, 20230145. [Google Scholar] [CrossRef]
  236. Vasilieva, O. (2019). Beyond “Uniqueness”: Habitual traits in the context of cognitive-communicative continuity. Theoria et Historia Scientiarum, 16, 129. [Google Scholar] [CrossRef]
  237. Vieira, J., & Olsson, A. (2022). Help or flight: Neural defensive circuits promote helping under threat in humans. eLife, 11, e78162. [Google Scholar] [CrossRef] [PubMed]
  238. Vieira, J. B., Schellhaas, S., Enström, E., & Olsson, A. (2020). Help or flight? Increased threat imminence promotes defensive helping in humans. Proceedings of the Royal Society B: Biological Sciences, 287, 20201473. [Google Scholar] [CrossRef] [PubMed]
  239. Vincini, S. (2023). Can interactionist approaches solve the empathy-sharing conundrum? In Empathy’s role in understanding persons, literature, and art (pp. 44–64). Routledge. [Google Scholar] [CrossRef]
  240. Vygotsky, L., & Cole, M. (1978). Mind in society: Development of higher psychological processes. Harvard University Press. [Google Scholar]
  241. Vyshedskiy, A. (2022). Language evolution is not limited to speech acquisition: A large study of language development in children with language deficits highlights the importance of the voluntary imagination component of language. Research Ideas and Outcomes, 8, e86401. [Google Scholar] [CrossRef]
  242. Warren, E., & Call, J. (2022). Inferential communication: Bridging the gap between intentional and ostensive communication in non-human primates. Frontiers in Psychology, 12, 718251. [Google Scholar] [CrossRef] [PubMed]
  243. Warren, E., Call, J., & György, G. (2023). On the murky dissociation between expression and communication. Behavioral and Brain Sciences, 46, e19. [Google Scholar] [CrossRef]
  244. Wilkins, J., & Griffiths, P. (2013). Evolutionary debunking arguments in three domains: Fact, value, and religion. In A new science of religion (pp. 136–146). University of Chicago Press. [Google Scholar]
  245. Witkower, Z., Tracy, J., Cheng, J., & Henrich, J. (2020). Two signals of social rank: Prestige and dominance are associated with distinct nonverbal displays. Journal of Personality and Social Psychology, 118, 89–120. [Google Scholar] [CrossRef] [PubMed]
  246. Wolf, W., Thielhelm, J., & Tomasello, M. (2023). Five-year-old children show cooperative preferences for faces with white sclera. Journal of Experimental Child Psychology, 225, 105532. [Google Scholar] [CrossRef]
  247. Woo, B. M., Tan, E., Yuen, F. L., & Hamlin, J. K. (2022). Socially evaluative contexts facilitate mentalizing. Trends in Cognitive Sciences, 27(1), 17–29. [Google Scholar] [CrossRef] [PubMed]
  248. Woo, B., & Spelke, E. (2022). Toddlers’ social evaluations of agents who act on false beliefs. Developmental Science, 26(2), e13314. [Google Scholar] [CrossRef]
  249. Yáñez, B., & Gomila, A. (2018). Evolución de la esclerótica del ojo humano: Una hipótesis social. Ludus Vitalis, 26, 119–132. [Google Scholar]
  250. Zollikofer, C. P. E., Bienvenu, T., Beyene, Y., Suwa, G., Asfaw, B., White, T. D., & de León, M. S. P. (2022). Endocranial ontogeny and evolution in early Homo sapiens: The evidence from Herto, Ethiopia. Proceedings of the National Academy of Sciences USA, 119(32), e2123553119. [Google Scholar] [CrossRef]
  251. Zuberbühler, K. (2008). Gaze following. Current Biology, 18(11), R453–R455. [Google Scholar] [CrossRef]
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Bejarano, T. The Origin of Human Theory-of-Mind. Humans 2025, 5, 5. https://doi.org/10.3390/humans5010005