1. Historical and Conceptual Roots of Embodied AI
The notion of embodied AI is not a settled one [1, 2]. It suggests a connection between AI and a cluster of approaches loosely referred to as embodied cognition (EC). Historically, embodied AI has been a reaction to classic AI, computationalism and its limitations [3, 4, 5]. As Tom Ziemke remarks, “a problem with embodied AI, or in fact embodied cognitive science in general, is that it seems to be much more defined in terms of what it argues against than what it argues for […]. Many embodied AI researchers reject the idea that intelligence and cognition can be explained in purely computational terms, but it is left unclear exactly what the alternative is” [6].
The historical role of embodied AI has been propelled by the crisis of the traditional computational model of cognition, the so-called sandwich model [7], the AI winter [8], and the quest for antirepresentational approaches [9, 10]. By the same token, the difficulty of mimicking sensorimotor skills encouraged the consideration of EC in robotics and AI [4, 11, 12, 13, 14]. In the last couple of decades, the pendulum that favored the computational turn in the 1950s has swung back, and many researchers have embraced various strands of EC.
It is fair to observe that many of the obstacles AI was facing in the 1990s were due to objective limitations in terms of computational power [15, 16]. In other words, classic AI was not hampered only by conceptual issues but also by technological limitations. Nevertheless, the recent progress in speech or facial recognition [17] or in autonomous vehicles [18] has shown promising results. These advances have not been an effect of the adoption of EC, but rather the outcome of various factors: a refinement of algorithms (not always biologically inspired), more computational power, better sensors, and the availability of large amounts of data.
As a matter of fact, the application of the embodied paradigm to AI and robotics has so far had only a limited impact. Aside from a few interesting prototypes and a handful of commercial cases of success, no invasion of embodied agents has taken place. Some of these cases are not obviously representative of embodiment regardless of their history. For instance, Roomba is not, strictly speaking, a good example of EC. Its functions are hard-wired. It operates following inflexible yet efficient algorithms. It is not more embodied than a washing machine, although its shape might suggest otherwise. As regards the other prototypes, they have so far had mostly a demonstrative role and have been unable to address the gap between sensorimotor skills and higher cognition [
18]. For instance, the humanoid robot ICub weighs 25 kg, executes code that is almost completely hand-written, has never learned to walk by crawling, and runs classic visual recognition algorithms [
19]. The same algorithms in an iPhone or in a PC would look a lot less impressive, yet they are basically the same. Although ICub has been used to mimic and understand the sensorimotor structure of a humanoid body, the robot is not, in any philosophically relevant sense, embodied or enacted. It is a machine that runs code that has been written elsewhere. If the structure of the machine (both hardware and software) is not the result of the developmental coupling with the environment, there is no true embodiment between the machine and its environment. Likewise, other often quoted examples—e.g., Bongard et al.’s resilient machines or the bipedal walkers in their various incarnations [
14,
20,
21,
22]—are clever devices that demonstrate conceptual points but that have not evolved into embodied agents. They have so far been embodied promissory notes. By and large, as I will argue in the following, implementing mechanical structures that perform desired behaviors—e.g., walking—does not necessarily constitute a case for embodiment. It shows only that cleverly designed physical structures can implement a desired function more efficiently than current computational solutions, which in turn are another way to be embodied.
What does embodiment mean in AI? Trivially, it might refer to the fact that AI agents must be physically implemented. In this interpretation, embodiment is a platitude—e.g., a word processor is physically embodied [
1]. More interestingly, embodied AI and EC may hinge on the thesis that cognition is constituted by body–world interactions. This thesis can be interpreted in two different ways: a weak yet practically feasible thesis and a strong yet ontologically demanding notion. Such a contrast is not too different from Gallagher’s distinction between weak and strong EC (2017, 26–46) and, to some extent, from Di Paolo’s forking path [
23]. Given the recent rise and popularity of the 4E [24, 25], it is informative to briefly consider how the role and notion of the mind (extended, embodied, embedded, and enactive) is articulated across the 4E. In a simplistic way, at one extreme there is Clark and Chalmers’ extended mind, which is a form of functionalism, and at the opposite extreme there is enactivism, which in some versions depends on a demanding ontological hypothesis about life and organisms. According to Shaun Gallagher, “In most of the embodied […] approaches the body plays a real role in shaping cognition. There are different ways of thinking about this. The idea that the body, and not just the brain, processes information both prior to and subsequent to central manipulations; the idea that representations can be action-oriented; the idea that the body itself plays a representational role; or the idea that sensorimotor contingencies, bodily affects, posture and movement enter into cognition in a non-representational way; the idea that the body is dynamically coupled to the environment; the idea that action affordances are body- and skill-relative, and so on, are all ways of shifting the ground away from orthodox cognitive science” [
26]. Yet, such a shift requires an ontological foundation as to why the body is special. Assuming the body as the starting point begs the question.
On the one hand, in the weak form, embodied AI and EC boil down to functionalism. According to the parity principle, cognitive processes are functional patterns that may be instantiated either inside a centralized structure (brain, computer) or in more scattered physical systems including parts of the body and of the environment. Occasionally, it may be advantageous to exploit such external structures. It’s a matter of practicality. Why waste time computing the exact grasping pressure if a soft gripper can adapt to the shape of the object to be grasped? Embodied AI offloads demanding computational processes onto the body or the environment. In this way, computational limitations can be sidestepped, as happens, for instance, in living organisms with limited neural resources. In this weak form, embodied AI and EC are variants of functionalism and the body–world nexus has only a contingent role in instantiating functional processes. As mentioned above, passive walkers, resilient machines, and humanoid robots have so far not shown any special ontological power. To put it crudely, they are very smart, human-shaped washing machines. If what matters is the cognitive function performed, matter does not matter. However, there may be huge practical advantages in exploiting the right physical structure; a soft gripper is surely more efficient than one made of hard polished steel.
On the other hand, in the stronger form, EC and enactivism entail a thesis about the constitutive and necessary role of the body and the environment: “Experience doesn’t supervene on neural states alone, then, but only on neural states plus environmental conditions” [
27]. Yet, it is not clear why the body and the environment should be constitutive of experience and what difference they should make. Constitution is an ontological claim and requires corresponding ontological premises. As a matter of fact, the constitution thesis hinges on various models that have yet to find definitive theoretical and empirical confirmation: enactivism [28, 29, 30, 31], radical embodiment [32, 33], the extended mind [34, 35], embodied functionalism [36], the spread mind [37, 38], radical enactivism [39, 40], embodied cognition [32], sensorimotor identity [41], sensorimotor direct realism [42], and many others. As regards AI, it is not obvious whether any such models have any practical consequence. Sometimes one has the impression (undoubtedly mistaken) that enactivism and EC are more a descriptive style than an explanatory strategy. Constitution suffers from epiphenomenalism insofar as it is a relation with no obvious causal role: “sensorimotor know-how and perceptual experience are causally related, but that is no reason to think that they are constitutively related” [
43]. EC must clarify whether the body and the environment constitute or cause the mind. For instance, does Gallagher’s expression “playing a role in shaping cognition” [26, 44] mean that the body causes a difference in one’s cognition or that the body is constitutive of the world? It is difficult to say. If the body and the environment had a constitutive role, a metaphysically necessary relation should obtain. However, constitution is a weak ontological relation akin to supervenience. It is epiphenomenal. It does not change what happens. Indeed, the fact that a function might be implemented more easily by means of embodiment than through computation does not imply that the constitution relation holds. Practical feasibility has no ontological relevance for constitution. As Block puts it, “There is no plausibility in the claim that there is some nomologically possible series of brain states that can only be produced by skilled interaction with a certain environment” [
43]. Although there are good reasons to disagree with Block about brain supervenience [
38,
45], given their premises, EC and enactivism cannot easily dodge the objection.
The constitutive role for the body–world nexus is not a functional thesis and, even if it were, it might lead to epiphenomenalism, which would be very bad for AI [
46]. In fact, if something is epiphenomenal, it does not make any causal difference. Thus, a constitutive relation would be immaterial insofar as it could not affect in any way what an AI system does. Paradoxically, constitution condemns the mind to an epiphenomenal role.
A caveat: here I do not argue against the design of smarter bodies. Nor do I dispute whether, say, it is more efficient to implement walking by means of clever joints. Rather I will discuss whether embodiment is constitutive of the mind in any non-trivial sense (as supporters of enactivism and EC often maintain). I will argue that embodiment is not different from functionalism unless additional hypotheses about the nature of the mind are put forward. In fact, I will contend that many key notions in EC and embodied AI do not require embodiment. My point is that the body is just a physical object, which seems to be the case if one is a physicalist. As a result it doesn’t matter whether cognition occurs inside or outside the head: functionalism works both ways, as the parity principle confirms [
34]. Eventually, to overcome this dilemma between functionalism and non-trivial embodiment, I will suggest stepping forward and consider whether the external world has an even greater role.
In this paper I will not address the practical advantages that a weak form of embodied AI might have. Rather I will discuss the ontological issues deriving from assuming EC, in its various forms, as a foundation for embodied AI. I will claim that, in its weak form, EC leads to functionalism and, in its strong form, leads to circularity and mentalism. Eventually, I will briefly outline a more radical position (identity with external objects) and I will contrast it with two recent variants of enactivism (sensorimotor direct realism and embodied identity theory).
2. What Does Embodiment Mean?
To be ontologically significant, embodiment requires a notion of the body that is non-circular, non-mentalistic, and consistent with physicalism. These requirements are not obviously met by current EC approaches. To be more than a trivial thesis about the fact that functional processes are instantiated by physical structures [
47], EC needs a neutral foundation for the body, which has so far eluded the enactivist literature [
48]. The body cannot be an a priori principle.
It is very difficult to distinguish bodies from physical objects without a circular appeal to mentalistic notions—such as, say, action—which remain vague. McGann writes that “though exalting action, researchers and writers within these new ways of thinking [EC and enactivism] have tended to gloss over just what they mean by [action]” [
48]. Claiming that bodies are the physical systems that engage in actions while actions are what bodies do is circular. When does a physical structure qualify as a body? Likewise, when does a behavior/movement qualify as an action? Suggesting that an “action is a behavior which is driven by the agent” [
48], or that “acting is behaving and meaning it” [
48], or that an “action is […] what the whole organism does in its interactions with the environment or, under a different description, what a person does in the world” [
44] does not avoid circularity. If the notion of action requires the notion of organism/agent/person/meaning, the notion of action cannot be used as a building block for the mind. As a result, the notion of body remains vague.
Di Paolo et al. (2017) state that “According to enactivism, the body counts as a cognitive system because it is possible to deduce from processes of precarious, material self-individuation, the concept of sense-making […] the constitution of the body, its identity, is closely tied to the autonomous processes of material self-individuation that occur at different levels and that become interlinked with what the body does in the world” [
23]. This is a very broad take on the notion of the body, but it suffers from circularity. The body depends on “self-individuation” and “sense-making”, which are not notions from physics. They are notions that in turn depend on the existence of subjects and agents. For instance, consider the notion of action. There is no convincing non-circular ontology of the notion of action: actions are cause–effect processes caused by an agent, or sense-making behavior, or goal-directed behavior [48]. When does a physical process qualify as an action? The word “action” is applied to processes whose cause is an agent, but the notion of agent is precisely what EC aims to explain using the notion of action. Defining the notion of action in terms of other mental notions (goals, agent, meaning) reveals a circularity that should be resolved. The action in enaction is a circular notion. The same criticism holds for behavior. Can you define any physical occurrence as an instance of behavior if it is not assumed to be associated with an agent?
Words like “body” or “action” reveal their strong ontological commitments. Enactivists have often supported these notions by drawing on the concepts of the living organism and of autopoiesis [31, 49, 50, 51, 52]. Yet, these concepts are controversial. Weber and Varela’s claim that autopoiesis is the basis for sense-making and for the creation of meaning in the interaction between a living system and its environment is very ambitious but far from being universally accepted [53].
Since the seminal works by Thompson, Varela, and Maturana [31, 50, 54, 55, 56], it has been proposed that the body is special insofar as it is the physical implementation of an autopoietic living organism. Although the validity of autopoiesis as an organizational structure is not called into question, there is no compelling reason to assume that it is anything more than that. Is a functional process embodied because it partakes in an autopoietic living organism [
57]? Hardly. Many key concepts in the enactivist toolbox—e.g., the existence of meaning, a distinction between living systems and physical world, the constitutive role of interaction, the capacity of autopoiesis to generate meaning—are based on very costly ontological premises. Moreover, existing AI machines do not qualify as autopoietic. Many physicists, roboticists, AI researchers, biologists, and neuroscientists have no compelling reasons to accept the premises on which EC and enactivism are based. Moreover, higher forms of cognition have not yet been explained by EC. Although many aspects of language and thoughts are rooted in bodily metaphors [
57, 58, 59, 60], causation is not constitution. The structure of the body may influence the way we use words such as, say, “forth” and “back”. However, this does not mean that when we use the words “forth” and “back”, our mental states are constituted by the movements of the body. It only shows that language affects mental states. In fact, it is still uncertain whether embodied cognition is a viable solution for more complex cognitive tasks (language, planning, abstract reasoning) or for consciousness, intentionality, and free will [
61,
62]. To a large extent, the existing literature is restricted to sensorimotor cases, albeit with some notable exceptions [
23,
63].
In fact, although EC may help to achieve concrete solutions in practical cases related to sensorimotor tasks, there is neither evidence that embodiment is nomologically necessary to achieve such skills nor that embodiment is constitutive of mental phenomena. The key contention is not whether the body causes differences in our mental states, but whether such mental states are constituted by—or even identical with—the body and/or what the body does.
Of course, a body is handy to train neural networks but, without additional ontological premises, only functional patterns matter. Many adherents to embodied cognition have drawn similar conclusions. Gallagher has famously called them “body snatchers” [
26, 44]. He pointed out that many alleged embodied cognitivists have devised “a version of embodied cognition that leaves the body out of it. […] the real action, all the essential action, occurs in the brain. […] theories of embodied cognition and have replaced bodies with sanitized body-formatted (or B-formatted) representations in the brain” [26]. This was to be expected as a result of the ontological weakness of the body. In fact, if the enactivist does not provide non-circular arguments for mind-body constitution, it will be difficult to resist the body snatchers. For the functionalist, whether a functional pattern is instantiated by a neural network inside the head or by such a network plus the physical structures of the body is immaterial (more on this in the following sections). The body snatchers win.
Does EC shed any light on the ontological foundation of the mind? I am not sure. Although EC might be useful, it has not been shown it is necessary. If it is not necessary, it does not entail constitution. Suppose that one needs to implement a controller for bipedal walking [
21]. Anyone who has seen the clumsy Tai Chi-like movements of the outdated Honda Asimo robot knows that such a task, if tackled with traditional computational techniques, might be very difficult. In contrast, using the proper body, a much smoother passive walker can be implemented [21, 22]. Yet, is such a body necessary for those functional processes to take place? For instance, they might be instantiated by a neural network. Consider the recent Boston Dynamics SpotMini dogs. They perform their smooth movements by means of hardwired micro-controllers. While the details of their software are proprietary, according to the company representatives, SpotMini dogs exploit traditional open-loop controllers and thereby achieve efficient biological-like sensorimotor coordination. The fact that they take advantage of dynamical systems theory does not mean that they are embodied agents.
By and large, cognitively speaking, it is difficult to maintain that there is anything more than functions [
64]. Consider a classic feedback control system. There are two options. On the one hand, a computational model of the system is implemented. The central control unit is connected to sensors and actuators. Eventually, some code is run. The proper forces are exerted by the actuators based on the internal computations performed by the CPU (Hurley’s sandwich). On the other hand, in the late XVIII century, to cope with the problem of insufficient computational power, Watt embodied the functional feedback loop into a rotating pair of iron spheres. The spheres are part of the body/environment and are causally coupled with the system to be controlled. Since the governor is made of iron and is conspicuously mechanical, it is easy to deem it embodied. Yet, is the electronic activity inside the CPU, usually referred to as the software or the computational level, any less physical than the spheres and their momentum? I don’t think so. In fact, both the CPU and the rotating spheres implement the same functional loop. Today, thanks to cheap computational units, it is more efficient to have sensors, actuators and a separate computational unit.
The crux of the matter, of course, is the difference between a functional loop implemented by means of a computational unit and a functional loop implemented by means of body–world interactions (as is the case with Watt’s governor). Although the two cases are described using different terminology, they implement the same function. Why should they be ontologically different? The body, together with the network of its interactions with the world, is just an extended brain. Unless strong ontological hypotheses are put forward, body snatchers win.
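The two options contrasted above can be put side by side in a minimal sketch. Everything in it is an illustrative assumption rather than a model of Watt's actual device: the arm length, the linkage that closes the valve, and the toy engine dynamics are hypothetical, and the governor is idealised as a conical pendulum at equilibrium. Both realisations are built to compute the same speed-to-valve mapping, so the sketch restates the functionalist premise rather than proving it; what it shows is that the controlled engine cannot tell them apart.

```python
import math
from typing import Callable, List

G = 9.81                       # gravitational acceleration, m/s^2
ARM = 0.25                     # hypothetical flyball arm length, m
MAX_ANGLE = math.radians(80)   # hypothetical arm angle at which the valve is shut


def flyball_angle(omega: float) -> float:
    """Equilibrium arm angle of an idealised conical-pendulum governor."""
    if omega * omega * ARM <= G:
        return 0.0             # spinning too slowly: the arms hang down
    return math.acos(G / (omega * omega * ARM))


class MechanicalGovernor:
    """The loop realised 'in iron': spinning masses plus a valve linkage."""

    def __init__(self) -> None:
        self.angle = 0.0       # physical state: current arm angle

    def valve(self, omega: float) -> float:
        self.angle = flyball_angle(omega)               # arms rise with speed
        return max(0.0, 1.0 - self.angle / MAX_ANGLE)   # linkage closes valve


def computed_valve(omega: float) -> float:
    """The same loop realised 'in software': sensor reading in, command out."""
    angle = flyball_angle(omega)
    return max(0.0, 1.0 - angle / MAX_ANGLE)


def run_engine(valve_of: Callable[[float], float], steps: int = 200) -> List[float]:
    """Toy engine: speed rises with valve opening and falls with friction."""
    omega, dt = 0.0, 0.05
    history = []
    for _ in range(steps):
        omega += dt * (40.0 * valve_of(omega) - 1.5 * omega)
        history.append(omega)
    return history


mechanical_trace = run_engine(MechanicalGovernor().valve)
computed_trace = run_engine(computed_valve)
```

Under these assumptions the two speed traces coincide step by step. Any ontological difference between the two cases must therefore come from somewhere other than the function performed.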
In the past, an influential philosophical tradition (from Heidegger to Merleau-Ponty, and from Gibson to the most recent authors [29, 32, 39, 65, 66, 67]) has endorsed the notion of embodiment. The idea of breaking free from the computational mind, often seen as a modern version of the immaterial Cartesian mind, has been perceived as an enlightened move to steer away from metaphysical nonsense. These efforts have all been made in good faith, of course, but they have not achieved an alternative ontology. There is no compelling reason to assume that the functional relations engaged between the body and the world are different from the functional relations engaged inside brains.
From a distance, the notion of material engagement between the body and the world looks more physicalist than the notion of an internal computational mind of some sort. Yet, this inference is misleading. The head is as physical as the body. Postulating that processes engaging body and world are constitutively different from processes inside the head requires additional hypotheses. Philosophers and scientists—such as Chalmers, Clark, and many others—have inadvertently resurrected the world–soul distinction in terms of head–body or body–world, which is a form of Cartesian materialism or covert dualism [
33, 68]. In this regard, Murray Shanahan points to the lurking dualism of Chalmers’ view that bears the same hallmark as Descartes’ reflection. In both cases, a wedge is driven between inner and outer. For Descartes, body and place (outer) are divided from thought (inner), whereas for Chalmers, the information processing taking place in the brain (outer) is divided from phenomenal experience (inner). Of course, there is a sense in which information processing occurring in the brain is “inner” relative to the goings on in the “outer” environment. But this is not the sense of “inner” at stake here [
69]. Shanahan stresses that the inner/outer distinction between the head and the body (or between the body and the world) does not overlap with the distinction between the mind and the world. Likewise, if the body has no special status, the notion of embodiment will be empty.
One may wonder whether certain machines are referred to as robots (and thus implicitly as artificial bodies) because they are zoomorphic (humanoid robots, AIBO, SpotMini, Atlas). The aforementioned SpotMini dogs are easily presented as embodied agents: they “perceive”, “decide”, “choose”, have a “behavior”. In contrast, washing machines or chemical plants are usually not described in terms of agency, behavior, and enaction because they do not look like familiar biological organisms. Yet, a chemical plant is as complex as most robots, if not more so. Washing machines can be quite smart too. But they hardly qualify as agents. At the end of the day, though, the difference between a chemical plant and a humanoid robot is cosmetic and parochial.
A human body is just a physical object. So far the discussion that tries to explain consciousness, phenomenology, and even the very concept of agency, has not been able to point out any metaphysical gap between the human body, qua human body, and the rest of the physical world. I want to highlight the physical aspects of bodies and I will suggest considering a reductionist physicalist approach.
3. The Massive Simulation Hypothesis
To bring into the open the conceptual and practical weaknesses of the notion of embodiment, it is useful to attempt a thought experiment that is not far from reality. In fact, it is a thought experiment that can be realized by currently available computer programs. The purpose is to show that embodiment has neither special constitutive power nor a unique causal role in shaping cognition, let alone consciousness. Let’s call this scenario the massive simulation hypothesis (MSH).
Suppose one has a reasonably accurate simulation of the physical environment [70]. Such a simulation does not need to be 100% accurate. It is enough that it encompasses the aspects of the environment that are relevant to what the agent does. Suppose that such a simulation is, within these limits, functionally equivalent to the external world. Then, the body of an agent can be simulated inside such a program. If the simulation is massive, in principle, there is no aspect of embodied cognition that cannot be simulated. If one considers the possibility of massive simulation, only functional patterns will matter. There is no a priori reason to deny its possibility [43]. Of course, the simulation is not an immaterial entity. The simulation will correspond to a physical implementation in terms of electronic activity. The functional structure of such electronic activity will be identical to that of the simulated world. A functional pattern in such a computer is equivalent to the corresponding functional pattern in the environment. Whether such patterns are instantiated by patterns of electronic activity or by the environment is immaterial. In principle, any body and its environment can be simulated. Therefore, functionally equivalent embodied cognition can occur without a robotic body interacting with the environment.
For the functionalist, an embodied agent in a fully simulated environment and an embodied agent in the actual physical world are not different. For the supporter of strong EC they are very different. Yet such supporters must explain why the two cases differ and how enactivism can dodge the risk of epiphenomenalism since, by definition, the two cases are functionally equivalent. The massive simulation rules out all ontological roles for the body and the environment unless cognition is flanked by other factors that resist functionalism, for instance consciousness. The notion of embodied cognition raises the question as to the nature of the body, and as to whether the body is more than a handy circumstance to instantiate the desired functional patterns. If one is a functionalist, embodiment is not ontologically relevant. If one wants embodiment to be relevant, functionalism cannot be the whole picture. There must be something else. MSH helps us to see that the crux of the matter is the issue of functionalism. If one is a functionalist, embodied cognition is a cosmetic endeavor because, due to multiple realizability, matter does not matter.
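The functionalist side of this argument can be illustrated with a deliberately small sketch. All names in it are hypothetical, and neither class is a physical world: both are stand-ins for two realisations that share one functional profile, which is precisely what MSH assumes. The agent is a fixed sensorimotor rule that only ever receives observations and emits moves.

```python
from typing import List, Protocol


class World(Protocol):
    """Anything the agent can sense and act upon (hypothetical interface)."""

    def sense(self) -> int: ...
    def act(self, move: int) -> None: ...


class ArrayWorld:
    """One realisation: a corridor stored as cells, with the body as an index."""

    def __init__(self, length: int, goal: int) -> None:
        self.cells = [0] * length
        self.cells[goal] = 1          # mark the goal cell
        self.body = 0

    def sense(self) -> int:
        return self.cells.index(1) - self.body   # signed distance to the goal

    def act(self, move: int) -> None:
        self.body = min(max(self.body + move, 0), len(self.cells) - 1)


class ArithmeticWorld:
    """Another realisation: no cells at all, only a few numbers."""

    def __init__(self, length: int, goal: int) -> None:
        self.length, self.goal, self.position = length, goal, 0

    def sense(self) -> int:
        return self.goal - self.position

    def act(self, move: int) -> None:
        self.position = min(max(self.position + move, 0), self.length - 1)


def agent(observation: int) -> int:
    """Step towards the goal; stay put once the distance is zero."""
    return (observation > 0) - (observation < 0)


def run(world: World, steps: int = 12) -> List[int]:
    """Couple the agent to a world and record the moves it makes."""
    moves = []
    for _ in range(steps):
        move = agent(world.sense())
        world.act(move)
        moves.append(move)
    return moves


moves_a = run(ArrayWorld(length=10, goal=7))
moves_b = run(ArithmeticWorld(length=10, goal=7))
```

The two runs yield the same sequence of moves although the worlds are built differently. A supporter of strong EC would have to say what, beyond this shared profile, distinguishes the real case from the simulated one.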
4. The Double-Edged Nature of the Parity Principle
To overcome functionalism, supporters of EC and enactivism have endorsed various forms of mentalism, vitalism, or emergentism [29, 31, 50, 71]. However, EC and enactivism have proved ontologically prodigal. Enactivist literature is populated by many notions whose ontology is vague: meaning, sense-making, body, engagement, action, autopoiesis, living system, organism, etc.
As a matter of fact, this is a consequence also of Clark and Chalmers’ parity principle: if a phenomenon is functionally equivalent to something that is going on inside the head, it might as well be taken to be part of one’s mind, regardless of where it is located: “If, as we confront some task, a part of the world functions as a process which, were it done in the head, we would have no hesitation in recognizing as part of the cognitive process, then that part of the world is (so we claim) part of the cognitive process. Cognitive processes ain’t (all) in the head!” [34].
This principle might be interpreted as a restatement of the principle of multiple realizability [72]. In fact, if a functional process can be implemented by multiple physical structures, it might take place both in the head and outside of it.
A consequence, which is seldom stressed, is that the principle works both ways. In other words, the principle does not suggest only that something, which happens outside the head, might be part of one’s mental processes, but also that something, which happens inside the head—as long as it is functionally equivalent to something going on outside—is equivalent to embodied cognitive processes. Equivalence works both ways. The parity principle states both that cognitive processes can be offloaded in the environment and that “embodied processes” can be uploaded in the brain. The location of processes is immaterial.
The conclusion is that, as long as cognition is our target, the external world is only an opportunity to implement functional patterns. Functionalism and the parity principle rule out any necessary ontological role for the environment. Whether the environment offers handy circumstances to reduce the computational load is only a matter of convenience. The aforementioned MSH argument shows that whether a functional pattern takes place inside a computer simulation or in the brain–body–world nexus is inconsequential. The parity principle reaches the same conclusion. The location of the underpinning physical structure is not relevant unless one puts forward additional considerations of an ontological and phenomenological nature.
5. Ontological Commitments
The problem at stake—namely whether EC has any value for AI—has two horns. Either EC (in all its forms) boils down to functionalism and then the notion of embodiment is mostly cosmetic from an ontological perspective, or EC is a demanding ontological thesis about the nature of the mind and the world. Unfortunately, EC proponents have not yet addressed this issue in a universally accepted format. In this section, I briefly list the main ontological shortcomings of current approaches to EC.
EC proponents should suggest a criterion that explains why the mind is embodied, which is a mandatory move if they pursue strong embodied AI. Only if one assumes that the physical stuff matters will it be possible to distinguish between a functional pattern instantiated inside a massive computer simulation and the same functional pattern instantiated in the actual physical world. Only if matter matters is it possible to resist MSH and an ecumenical application of the parity principle. So far, strong EC (and thus embodied AI) suffers from a series of fatal flaws: mentalism in disguise, symmetry of the parity principle, mismatch between matter and matter of the mind. Let’s address them one by one.
5.1. Epiphenomenalism
The first obstacle, partially addressed above, is epiphenomenalism. The functional description drains all causal issues. There is nothing left to be explained. So, whether something is realized by a certain physical process is immaterial in functional terms. Of course, this is another way to express multiple realizability. It holds for EC too. Inside a computer it does not matter whether a functional process is realized by, say, semiconductors or mechanical gears. Why should it make a difference whether the same functional process is made of limbs and objects? Something is missing.
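A toy sketch may make the point about multiple realizability concrete. The two Python functions below are hypothetical and introduced here only for illustration: both realize the same input-output mapping (addition of non-negative integers), one by mimicking logic gates and the other by mimicking a gear-like counter that advances one tooth at a time. It is an assumption of the sketch that a "functional description" can be reduced to such a mapping.

```python
def add_with_gates(a: int, b: int) -> int:
    """Add two non-negative integers using only bitwise logic, as a gate circuit would."""
    while b != 0:
        carry = a & b   # positions where both bits are 1 generate a carry
        a = a ^ b       # bitwise sum that ignores the carry
        b = carry << 1  # the carry moves one position to the left
    return a


def add_with_gears(a: int, b: int) -> int:
    """Add two non-negative integers by advancing a counter one step at a time."""
    total = a
    for _ in range(b):
        total += 1      # one tooth of the (imaginary) gear
    return total


def functionally_equivalent(f, g, inputs) -> bool:
    """True if f and g return the same output on every tested input pair."""
    return all(f(a, b) == g(a, b) for a, b in inputs)


if __name__ == "__main__":
    pairs = [(a, b) for a in range(16) for b in range(16)]
    print(functionally_equivalent(add_with_gates, add_with_gears, pairs))
```

On the tested inputs the equivalence check cannot tell the two realizations apart, which is all that a purely functional description has access to.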
Regarding epiphenomenalism, enactivism and EC do not seem to offer any obvious solution unless they appeal to some ontological primacy of the living, which is another problem. Of course, they insist that mental processes are “constituted by” body–world interactions. Unfortunately, constitution “is an abstraction that does not correspond to any general idea that figures non metaphorically in science” [
46].
5.2. Embodied Zombies
Embodiment does not rule out the possibility of zombies, which is not surprising since EC boils down to functionalism. In the traditional thought experiment, zombies are conceivable because the relation between neural activity and phenomenal experience is contingent. Thus it is conceivable that the same neural activity might occur with or without any associated phenomenal character. In EC and enactivism, the situation is similar. The relation between mental states and actions remains vague. It is, once again, the issue of constitution or identity vs. that of causation. If the relation between actions and experience is not metaphysically necessary, embodied zombies are conceivable. Two bodies might be identical and have the same engagement with the environment, and yet only one of them might have a mind. Of course, one may retort that the swamp scenario does not work according to the enactive approach; the same embodiment and organization would entail the same experience. But this reply is question-begging. The claim that embodiment and organization are constitutive of (or identical with) mental experience starts from a common assumption on which enactivism is based. The enactivist should show that this is the case rather than postulating it.
5.3. Mismatching
Leaving aside the two aforementioned problems, for which I do not see any easy way out, another problem faces EC: even if matter mattered, the body–world nexus would not seem to be made of the right matter. What is the stuff mental processes are made of? In their seminal paper, O’Regan and Noë speak of “sensorimotor contingencies” [
30], a notion very close to Gibson’s affordance. Yet, what is a sensorimotor contingency? Is it really anything more than a functional pattern? All the examples they give in that work, and in ensuing papers, are functional in nature [
62]. For instance, perceiving a straight line amounts to engaging with a kind of sensorimotor contingency in which moving in the direction of the line does not change the stimulus. However, this is a functional description. There can be other sensory modalities in which an action does not lead to any change in the perceived stimulus. Are they perceived as a line? Not necessarily. They can be sounds, colors, tastes. Many cases of sensorimotor invariance, while functionally equivalent, do not appear phenomenologically alike.
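The functional character of such a description can be illustrated with a minimal sketch. The Python code below is not O’Regan and Noë’s formalism; every name in it is hypothetical, and it assumes, for illustration only, that a sensorimotor contingency can be modeled as a test of whether a stimulus stays unchanged when an action is performed.

```python
from typing import Callable, Iterable


def is_invariant_under_action(
    sense: Callable[[float], float],
    act: Callable[[float], float],
    states: Iterable[float],
) -> bool:
    """True if performing the action never changes the sensed stimulus."""
    return all(sense(act(s)) == sense(s) for s in states)


# Hypothetical "visual" case: the state is a position along a straight line,
# the action slides the gaze along it, the stimulus is the local orientation.
def slide_along_line(position: float) -> float:
    return position + 1.0


def orientation_of_straight_line(position: float) -> float:
    return 0.0  # a straight line has the same orientation everywhere


# Hypothetical "auditory" case: the state is elapsed time, the action is
# to keep listening, the stimulus is the pitch of a steady tone.
def keep_listening(t: float) -> float:
    return t + 1.0


def pitch_of_steady_tone(t: float) -> float:
    return 440.0  # the tone does not change over time


# Contrast case: on a curve the local orientation changes with position.
def orientation_of_curve(position: float) -> float:
    return 0.1 * position


if __name__ == "__main__":
    samples = [0.0, 1.0, 2.0, 3.0]
    print(is_invariant_under_action(orientation_of_straight_line, slide_along_line, samples))
    print(is_invariant_under_action(pitch_of_steady_tone, keep_listening, samples))
    print(is_invariant_under_action(orientation_of_curve, slide_along_line, samples))
```

The straight line and the steady tone pass exactly the same invariance test, so the test by itself says nothing about why one is experienced as a line and the other as a sound.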
Enactivism (or any other E) has so far been unwilling to make a strong ontological claim about the identity of mental states (a notable exception is Myin [
41]). Unfortunately, the stuff that makes up sensorimotor contingencies is not of the right kind. The physical properties of sensorimotor contingencies or body–world interactions are not like the properties of our experience. For instance, the movements of my body are not like my experience of touch. Eye movements are not like color, shape, and size. The physical interactions between our visual system and external objects are not like the colors we experience.
Suppose that the same functional pattern might be instantiated both by a biological eye perceiving an apple and by a simulation running inside a central unit. Suppose that the resulting mental process were different in the two cases. Why should there be any difference? One might expect that the character and quality of one’s experience of the apple depend on the underpinning physical stuff. Yet, an insurmountable problem presents itself. On the one hand, if the physical underpinning were the action that the body performs, such an action would be composed of elements that are not directly experienced by the subject (such as rhodopsin, the light rays, the movements of the eyes). They are not like one’s experience of seeing an apple. On the other hand, if the action were defined in more abstract terms, it would end up being like a functional pattern, and how it might lead to conscious experience would remain mysterious.
Why should it be easier for one’s experience to arise out of body–world interactions than from neural processes? Body–world interactions are not closer to one’s experience than neural signals or functional patterns. In short, there is a mismatch between the proposed physical underpinnings (actions, sensorimotor processes) and experience (colors, hues, sounds, objects, faces, mountains, stars).
5.4. Body-Ism
Sensorimotor contingencies and previous notions such as affordances are not ontologically neutral. If there are sensors and actions, there must be agents too. Otherwise there is just causation. If there were no agents, sensorimotor contingencies and affordances would be empty notions. A rock does not have any sensorimotor contingencies notwithstanding the fact that it is causally coupled with its surroundings. When does a sensorimotor description of reality take over? When does a physical cause become a sensory input? The most likely answer is that the semantic shift is triggered by the adoption of a mentalistic stance. Yet, such a stance, not unlike Dennett’s intentional stance [
73], does not reveal any intrinsic feature of the physical world.
Likewise, the notion of the body, as I have stressed at length, is circular relative to other notions (e.g., agent, action, sensors). If nobody were here, there would be no bodies. A rock does not have a body. Neither is a corpse a body. Bodies require agents. A sex doll, no matter how similar to a human body, is not a body. It is just a toy. A body is a body only if there is somebody.
The intrinsic circularity of the notion of body jeopardizes the chance of success of em-body-ment as a theory of the mental. Embodiment entails body-ism, which in turn entails some form of mentalism. Body-ism is the assumption that the body has a special status. For instance, calling physical causes impinging on a body stimuli is ontologically suspicious.
To some extent, the body and the brain are both shells. They are both physical systems. The key question is not about the extension of the mental, but about its center. Surprisingly, EC and classic computationalism have one thing in common: they both hold that the body is the center of mass of the mental. Although they differ about the extension of the supervenience basis, they place the center of the mental in one’s body. Yet, why should the body anchor the mind? A likely reason is that the body contains the brain, which is puzzling for EC. The brain, while no longer containing all of the underpinnings of cognition, is still at the center of the physical system that gives rise to one’s mind. Although the brain-centered view has been replaced by a body-centered view, the body is still characterized by being the shell of the brain. The brain is no longer the container of the mind, but it remains at the center of its physical underpinning. Ontologically, the body plays a role not unlike that of the homunculus. A body-centered view is no less problematic than a brain-centered view.
5.5. Vitalism
The quest for non-mentalistic criteria has led many authors to consider—not always explicitly—forms of vitalism [
23,
53,
56,
71]. In this regard, Stewart states “The paradigm of enaction solves this problem by grounding all cognition as an essential feature of living organisms […] For Maturana and Varela and Jonas, the great divide comes between matter and living organisms” [
74]. Of course, such a divide, which suggests some kind of ontological difference between living organisms and other physical systems, must be grounded. While many proponents of enactivism are supportive of such a difference [
31,
50,
54,
56], not everyone agrees. Recent works have tried to appeal to top-down causation or emergence [
22,
75,
76]. Yet these positions are still very speculative. Why should a process be different because it is part of a living organism rather than part of a mechanical system? The biological foundations of enactivism have a great ontological cost.
6. A More Radical View? Em-World-Ment?
Enactivism and embodied cognition have been a step in a common direction. They have tried, albeit without universal agreement, to get out of the trap of the homunculus. However, as we have seen, they snapped back to disguised versions of it because of several unresolved issues: functionalism, circularity, and disguised mentalism (body-ism). In fact, EC has replaced the traditional homunculus with a bigger one, namely the body. Is there any available alternative? Yes. We may consider a completely different ontological basis: the external world rather than the body–world nexus. Instead of embodiment, we may consider a stance that, for lack of a better word, might be called em-world-ment.
Em-world-ment takes into consideration the radical hypothesis that the body is not the point of origin of one’s mental processes, but one of the physical circumstances that allow the occurrence of that set of events or objects that are one and the same as one’s mind. This proposal is consistent with embodied AI since it neither entails any kind of emergentism nor dwells on whether the system is alive. Moreover, according to this proposal, the body no longer has any special role; it is just an object among other objects and events. By doing so it is possible to consider an emworlded view of the mental that is neutral regarding the biological basis of organisms.
What is the mind then? The mind is one and the same as the collection of objects and events that, at any time, conjoin causally together and produce effects thanks to an object, which is the body. In this view, the body is the causal proxy through which the world, or better a world, produces effects. The details of this view have been presented elsewhere by the author under the name of mind–object identity [
37,
38,
45].
This view, which is incompatible with functionalism and rejects multiple realizability, is based on the identity between the mind and the external object. Therefore, the existence of the mental is no longer a contingent possibility. Zombies are ruled out together with other ontological danglers. Interestingly, such a strong identity claim is suitable for AI because it sets aside the need for biological organisms. Consciousness is then relocated in the external world. Finally, the mind–object identity does not get trapped by the issues of constitution. Mental processes are not constituted by the external world; they are identical to the external world that takes place relative to an object which is called, for historical reasons, the body. The body is not a “living body”. It does not have any special quality or structural organization. Its role is only to offer external objects an opportunity to produce effects.
The proposal consists of two steps. First, objects have a relative existence. They are relative objects. This step is fundamental because it allows one to be a realist without being a naïve realist. Every object exists in multiple versions, each relative to another object. In our case, for purely contingent reasons, the other object is our body, but it might have been a machine. The body is necessary for the existence of the relative object, but it is neither its container nor its constitutive basis. Second, and this is the identity claim, the mental is one and the same as the relative objects. At any time, the collection of relative objects, which might be called a relative world or a mind, is one’s consciousness. Of course, this notion is akin to von Uexküll’s Umwelt [
77]. As a result of such a hypothesis, one’s consciousness is physically external to one’s body. Minds are not embodied; they are one and the same as relative worlds.
Recently, two authors have suggested positions not too far from the proposed mind–object identity: Beaton [
27,
42] and Myin [
41]. I will briefly compare the mind–object identity with these two views.
Beaton expresses a radical form of direct realism according to which “when I see, I see things themselves. That when I see an apple, for instance, my experience is directly of that apple itself, with no intervening mental image or representation.” [
42]. He rejects naïve realism too. The apple he sees is not the mind-independent apple but rather “the actual and available norm-involving actions that my sensorimotor coupling with the world makes” [
42]. Although this notion is close to that of relative object, Beaton remains vague as to the apple. He states that “perceiving is the same thing as engaging in (or being poised to engage in) meaning-filled, physical action in the world” [
42]. But nowhere does he commit to a crystal-clear definition of the nature of the mind. The impression is that the apple is the way in which we interact with the world and that such interactions (or knowledge of them) are the stuff the world is made of. There are various objections. The first is that such interactions seem to be physically different from our experiences: he addresses the case of colors, but he never addresses colors as such, only their relational aspects. The second objection is that his treatment of illusions and dreams relies on shifting the focus from experience to knowledge about the actions that we might perform if circumstances were different. This is a strategy that has been explored by Noë [
27,
28], but it replaces internal representations of objects with internal representations of actions. In contrast, the mind–object identity theory proposes a neater ontology: one’s experience is just the external object that takes place relative to one’s body.
Of course, such a proposal is a radical form of realism and thus it must address the traditional arguments against all forms of realism: illusions and dreams/hallucinations. As to illusions, the proposal is to revisit them in terms of misbeliefs about what we perceive rather than in terms of misperceptions. For instance, a mirage is just as physical as anything else, and yet it gives rise to erroneous beliefs rather than to wrong perceptions. What we see when we see a mirage is just what we should see. As to hallucinations and dreams, they might be explained as forms of delayed and reshuffled perception once the notion of the present is reconsidered [
38,
45]. In other words, the hallucination of, say, a dagger might be explained as the delayed perception of a dagger one saw some years earlier. I am aware that I can barely begin to outline a reply to these two questions here, but it will suffice to say that the strategy will be to address all known empirical cases of both illusions and hallucinations and to show that they can be reduced to cases of unusual perception [
38].
The other approach is the embodied identity advanced by Myin [
41], who claims that, contrary to a widespread opinion, there is nothing wrong with being explicit about what our mental states are identical to. I do agree wholeheartedly with the contention that “nothing in the idea of identity demands that the terms of identity be mind and brain, instead of mind and something else.” [
41]. The proposal is that our experience is identical with “organism-environment interactions. Sensation, perception, experience and cognition are things organisms do” [
41]. Three obstacles await us. The first is, once again, the special role given to the organism: “Being an organism having that form of organization, that is, actually occupying a particular perspective, living or enacting a life, is, for such an organism, to have experience” [
41]. This definition is circular and hardly explanatory. The second obstacle is that the available repertoire of interactions underdetermines the richness of our phenomenal experience. Third, organism–environment interactions do not have the properties we find in our experience. In contrast with Myin’s embodied identity, the main advantage of the mind–object identity is that it neither depends on a special role for the organism/body nor makes use of any mentalistic terms (goals, sense-making, action).
In 1998, at the very outset of their paper, Clark and Chalmers made a curious choice of words: “Where does the mind stop, and the rest of the world begin? The question invites two standard replies. Some accept the boundaries of skin and skull, and say that what is outside the body is outside the mind” [
34]. It is revealing that they wonder as to where the mind stops, and the world begins! This wording is the consequence of assuming that the mind starts inside the brain, or, at least, inside the body. This is a big assumption that disguises a residual form of homuncularism. Clark and Chalmers, as EC and embodied AI supporters do, give the body a privileged status which is ontologically questionable. As we have seen, the body offers the possibility to instantiate functional patterns that might as well be instantiated inside a traditional computer. An embodied agent might be embodied by a simulation.
In fact, current EC approaches endorse the notion that the body (which is the container of the brain) is the entrance point of the mind. EC entails that the supervenience basis of the mind stretches from the brain to the environment, as if the body had a special status. Although EC shifts the boundaries of the physical underpinnings of the mind beyond the neural system, for a functionalist, functional patterns instantiated by the brain and functional patterns instantiated by the brain–body–world nexus are equivalent. The body and the brain are still seen as the location where functional patterns qualify as mental.
The point I want to make is that embodied cognition, while surely useful insofar as it moves forward from the narrow view of traditional computation, is not the radical paradigm shift it may seem. Embodied cognition is a sort of functionalism on steroids that extends the physical underpinning of cognition to a larger set of physical phenomena but leaves many key issues unresolved: circularity, mentalism, body-ism, and functionalism. That is why I suggest upsetting the apple cart.
The main culprit is the hidden assumption that the mind somehow originates inside the body (possibly inside the brain). The alternative is to consider a stronger thesis such as identity. The mind–object identity proposes the identity between the external relative object and our experience. By considering the possibility that the mind might be a physical object, the all-encompassing functionalism lurking in many versions of EC could be dodged.
Rather than searching for where the mind stops, as Chalmers and Clark and many other EC supporters have done, I propose to search for where the mind starts. The mind might be like a hurricane, with an empty space at its center. The brain would then be like the eye of the hurricane, the place that must be there for the hurricane to happen, but where no storm is actually occurring.