1. The Single Body Bias
Despite incredible advances in the understanding of cognition beyond brain centrism, which can be summarized under the 4E cognitive paradigm (embodied, embedded, enactive, extended), one strong bias still affects this research area: the assumption that cognitive morphology is encapsulated by a single body. Of course, these approaches have considered how the raw/original/seminal/precursor body can have enactive connections with other bodies, but always within clearly defined bodily boundaries. Admittedly, self-reconfiguring modular robots (that is, robots with variable morphologies) do exist, but even there the notion of a single body to which changes are applied remains intact. Bioinspired morphological computing studies therefore reproduce this bias.
On the other hand, there are other bioinspired examples, and even technical possibilities that go beyond biological capabilities (such as constant morphological updating and reshaping, which demands a remapping of cognitive performances). Additionally, despite the interest of swarm cognition (which includes superorganisms such as flocks, swarms, packs, schools, crowds, or societies) for such non-human-centered approaches, there is still a biological constraint: such cognitive systems have permanent bodily morphologies and interact only among similar entities. Moreover, cognition and language are based on the sensorimotor intelligence that came before them (active/embodied/animate/grounded/…) [1].
Having clarified this contextual situation, I am therefore suggesting the necessity of thinking about cross-embodied cognitive morphologies, again based on 4E (embodied, embedded, enacted, and extended) cognition, but using a more dynamical and challenging model of bodies than that of any existing cognitive system so far studied or created. Nevertheless, there are interesting and insightful counterexamples to such an approach: foundation models (e.g., BERT, GPT-3, CLIP, Codex), a term coined by Stanford HAI's Center for Research on Foundation Models (CRFM), are highly successful models trained on broad data at scale, such that they can be adapted to a wide range of downstream tasks. Take movement as an example: embodied cognition people will say that if I have the "move", then I can figure out the "think". It is more natural; this is how it happened in evolution, and this is the truly bio-inspired approach. However, foundation model researchers can also say that if I have the "think", then I can figure out in which parts of the "move" I need to ground my thinking. That is a reverse embodiment approach. The (silent) conclusion that foundation models are not "the right ones" because they lack a sensorimotor intelligence foundation is not necessarily right. Why? Because they can acquire it, by walking backward down the evolutionary ladder: starting with a symbolic conceptual system based on words and extending it to become multimodal. How? Well, due to a freaky historical sequence of events and an extraordinary amount of luck, foundation models are not really working with language: they are working with high-dimensional vectors representing structured objects. It is an example of how sensorimotor logics can be engineered along different building directions (bottom-up, top-down) and across a dynamic and open global morphology.
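To make the two building directions concrete, here is a minimal Python sketch; it assumes nothing about any particular foundation model, and every name, dimension, and map in it is hypothetical, chosen only to contrast the two directions:

```python
import numpy as np

rng = np.random.default_rng(0)
DIM = 256  # shared high-dimensional representation space (assumed size)

def bottom_up(motor_samples: np.ndarray) -> np.ndarray:
    """Bottom-up ("move" first): distill a concept vector from raw
    sensorimotor experience (here, naively, the mean of babbling episodes)."""
    return motor_samples.mean(axis=0)

def top_down(concept: np.ndarray, grounding: np.ndarray) -> np.ndarray:
    """Top-down ("reverse embodiment"): project a pretrained concept
    vector onto motor channels through a grounding map."""
    return grounding @ concept

motor_experience = rng.normal(size=(100, DIM))    # stand-in for babbling data
pretrained_concept = rng.normal(size=DIM)         # stand-in for a text embedding
grounding_map = rng.normal(size=(16, DIM)) / DIM  # 16 motor channels (assumed)

move_to_think = bottom_up(motor_experience)       # evolution's direction
think_to_move = top_down(pretrained_concept, grounding_map)
print(move_to_think.shape, think_to_move.shape)   # (256,) (16,)
```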
2. Conceptual Aspects of This Debate
Several important conceptual debates are running in parallel within this cognitive shift:
2.1. Embodied vs. Disembodied AI
Although embodiment is increasingly understood as fundamental for truly intelligent systems, the current Zeitgeist, more pragmatic and devoted to the deification of quick revenues, still defends the disembodiment of intelligent artificial systems. Take as an example how OpenAI recently abandoned its robotics efforts, despite working toward AGI (artificial general intelligence). We are facing curious actions, too: although it is obvious that we need to involve the motor system in the process of perception, such an idea is considered heresy by mainstream computer vision practitioners. For that reason, the topic was long excluded from CVPR, the IEEE Conference on Computer Vision and Pattern Recognition. Can we then create AI without including robotics? Does this mean that classic GOFAI will again dominate the research of the next decades? Although all the current challenges indicate that the next step will require a combination of symbolic and statistical AI, always keeping in mind the situated nature of cognitive systems [2,3,4], the quick successes of deep learning are blocking such complex attempts.
2.2. Haptics and Multimodal Perception Reconceptualization
Morphology is fundamental because even concepts/categories change radically with robot morphology. For example, the concept of an obstacle changes if you are crawling, hopping, walking, or flying. If you have panoramic vision (eyes in the back of your head), what is the meaning of front or back? If you have eyes all over your body, what does it mean to see? We also see that spatial metaphors are fundamental for the design of thinking mechanisms, and that they differ across geographical locations [5]. Such variables as environment and culture even shape both the color lexicon and the genetics of color perception [6]. Therefore, the process of multimodal integration must be understood as a constructive process, not as mere observation, and as a mechanism that embeds other epistemic processes (scale, time, or the selection of data about the events under study).
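As an illustration of how a single category bends to morphology, consider this toy Python sketch; the Morphology fields, thresholds, and the 0.5 m rock are all invented for the example:

```python
from dataclasses import dataclass

@dataclass
class Morphology:
    locomotion: str       # "crawl", "walk", "hop", or "fly"
    max_clearance: float  # tallest feature the body can pass over (m)
    max_altitude: float   # cruising height, 0 for ground agents (m)

def is_obstacle(feature_height: float, body: Morphology) -> bool:
    """The *concept* of an obstacle is morphology-relative:
    a flyer simply passes over what blocks a crawler."""
    if body.locomotion == "fly":
        return feature_height >= body.max_altitude
    return feature_height > body.max_clearance

rock = 0.5  # a 0.5 m rock in the agent's path
print(is_obstacle(rock, Morphology("crawl", 0.05, 0.0)))  # True: blocks a crawler
print(is_obstacle(rock, Morphology("walk", 0.40, 0.0)))   # True: too tall to step over
print(is_obstacle(rock, Morphology("fly", 0.0, 30.0)))    # False: irrelevant from the air
```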
2.3. Possible Combinations of Current Models
My proposal is not a novelty, because it consists of integrating different, already existing fields: cloud robotics (remote control and/or distributed bodies), swarm robotics (collaborative social robots), modular robotics, or hybrid artificial-biological interfaces (biobots). Combining several of these elements offers us a rich matrix of possibilities for the design of cross-cognitive systems, as the enumeration below suggests.
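Merely enumerating the non-trivial pairings of these four fields already yields eleven candidate design points; the labels in this small Python sketch are mine and purely illustrative:

```python
from itertools import combinations

# The four existing fields named above.
fields = ["cloud robotics", "swarm robotics", "modular robotics", "biobots"]

# Every combination of two or more fields is a candidate design point
# for a cross-cognitive system: 6 + 4 + 1 = 11 options in total.
for r in range(2, len(fields) + 1):
    for combo in combinations(fields, r):
        print(" + ".join(combo))
```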
2.4. Cognitive Transferability
Because of the grounded nature of cognition, the question of the transferability of cognition emerges as an important one. It matters in fieldwork robotics, where cooperation and the transfer of data and skills will be necessary to optimize resources, without depending on something humans do very slowly: sharing ideas and actions through language or imitation. If we add the problem of evolving morphologies, such transferability becomes very difficult to achieve. Additionally, we must always account for the loss of information, without blocking the main expected action, as data are rescaled to the main task. Embodied and virtual bodies will make this process neither universal nor easy, and in any case it will demand very plastic, adaptive "brains" (or, rather, complex informational systems).
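One way to see why this loss is structural rather than incidental is the following sketch; the dimensions are invented, and a random linear map stands in for whatever learned adapter would connect the two sensor spaces:

```python
import numpy as np

rng = np.random.default_rng(1)

# Source robot senses 32 channels; the target robot only 8 (assumed).
SRC_SENSORS, TGT_SENSORS, MOTORS = 32, 8, 4

# A "skill" here is just a linear sensor-to-motor policy, a stand-in
# for the sensorimotor procedure actually being transferred.
src_skill = rng.normal(size=(MOTORS, SRC_SENSORS))

# Adapter mapping target sensations into the source sensor space; in
# practice it would be learned, here it is random for illustration.
adapter = rng.normal(size=(SRC_SENSORS, TGT_SENSORS))

# The transferred skill composes the adapter with the original policy.
tgt_skill = src_skill @ adapter   # shape: (MOTORS, TGT_SENSORS)

# Structural information loss: the adapter's rank is at most 8, so 24 of
# the source policy's 32 sensory directions are simply unrepresentable.
print(np.linalg.matrix_rank(adapter))  # 8
```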
2.5. Physically Grounded Programming Languages
Part of the success of this cross-morphological cognitive revolution will depend on the existence of physically grounded programming languages that allow a task to be described once, with compilers then translating it to different architectures. There is already such a language, called AL, that can describe tasks independently of the robot's body; programs written in AL can then be compiled for particular robot architectures [7].
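The following toy Python sketch conveys the compile-to-body idea only; it is emphatically not the AL language of [7], and every operation name and backend in it is invented:

```python
# One body-independent task description...
task = [("move_to", (3.0, 4.0)), ("grasp", "sample")]

# ...compiled to two different architectures by two different backends.
def compile_for_wheeled(task):
    for op, arg in task:
        if op == "move_to":
            yield f"drive to x={arg[0]}, y={arg[1]}"
        elif op == "grasp":
            yield f"close gripper on {arg}"

def compile_for_legged(task):
    for op, arg in task:
        if op == "move_to":
            yield f"plan footsteps toward x={arg[0]}, y={arg[1]}"
        elif op == "grasp":
            yield f"brace stance, then close gripper on {arg}"

print(list(compile_for_wheeled(task)))
print(list(compile_for_legged(task)))
```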
3. Cross-Embodied Cognition
From the concepts explained above, we can infer that the roadmap toward a paradigm of cross-embodied cognition faces several challenges:
- Multiple sensorimotor intelligence foundations;
- Transferability and interoperability issues;
- Bioinspiration (not for GPT or BERT, although they are useful!);
- Multimodality integration and translation, building on existing Hyperdimensional (HD) Computing, introduced by Kanerva (and, before him, Plate's Vector Symbolic Architectures); see the sketch after this list [8];
- Decentralized morphologies.
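As a pointer to how HD Computing could serve multimodal integration and translation, here is a minimal, self-contained sketch of binding and bundling with random bipolar hypervectors; the modalities and fillers are invented for the example:

```python
import numpy as np

rng = np.random.default_rng(2)
D = 10_000  # hyperdimensional vectors, in the spirit of Kanerva's HD computing

def hv():
    """Random bipolar hypervector (+1/-1 entries)."""
    return rng.choice([-1, 1], size=D)

def bind(a, b):
    """Binding: elementwise product; self-inverse for bipolar vectors."""
    return a * b

def bundle(*vs):
    """Bundling: sign of the sum; the result stays similar to each input."""
    return np.sign(np.sum(vs, axis=0))

def sim(a, b):
    """Normalized similarity in [-1, 1]."""
    return float(a @ b) / D

# Encode a toy multimodal percept: modality roles bound to their fillers.
VISION, TOUCH = hv(), hv()
red_cube, rough_surface = hv(), hv()
percept = bundle(bind(VISION, red_cube), bind(TOUCH, rough_surface))

# Query: unbinding the TOUCH role recovers a noisy copy of its filler.
recovered = bind(percept, TOUCH)
print(round(sim(recovered, rough_surface), 2))  # ~0.5: clearly recognizable
print(round(sim(recovered, red_cube), 2))       # ~0.0: unrelated
```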
Some researchers are already working on these ideas [9], still conditioned by the functional applicability of their systems. Of course, this text is a theoretical exercise in the exploration of cross-cognitive systems, and it is therefore free from implementation tasks. Nevertheless, I want to finish this preliminary exploration with a Gedankenexperiment: space exploration.
Imagine the requirements of space exploration: long-term missions, very far away, and without remote control from Earth. Modular robots will need not only to self-repair but also to self-adapt and self-reprogram. Such systems will need to transfer knowledge to different, much more specialized, operational robotic systems: knowledge not only of data but also of sensorimotor procedures and actions. The main problem is that current modular robotics is not based on grounded AI, which could produce further complexities and mistakes during the transfer process.
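To fix ideas, here is a deliberately naive self-reconfiguration loop for such a mission; every module name, role, and threshold is invented, and a real system would of course transfer the associated sensorimotor procedures along with the role:

```python
modules = {
    "m1": {"role": "locomotion", "health": 0.9},
    "m2": {"role": "sensing",    "health": 0.2},  # degraded module
    "m3": {"role": "spare",      "health": 1.0},
}

FAILURE_THRESHOLD = 0.5  # invented cutoff

def self_reconfigure(modules):
    """Reassign the role of any failing module to a healthy spare."""
    for name, m in modules.items():
        if m["health"] < FAILURE_THRESHOLD and m["role"] != "spare":
            for other, o in modules.items():
                if o["role"] == "spare" and o["health"] >= FAILURE_THRESHOLD:
                    o["role"], m["role"] = m["role"], "spare"
                    print(f"{other} takes over '{o['role']}' from {name}")
                    break

self_reconfigure(modules)
print({k: v["role"] for k, v in modules.items()})
# {'m1': 'locomotion', 'm2': 'spare', 'm3': 'sensing'}
```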
4. Conclusions
Cross-embodied cognitive morphologies are a fundamental part of the next revolution in robotics, but the numerous challenges and complexities they imply require from us not only more resources and dedication but also a conceptual paradigm shift concerning the true nature of cognitive processes. The classic anthropomorphic and biological evolutionary set of resources is useful but also biased, because the informational and bodily possibilities of new AGI offer new conceptual horizons. Take this text as a preliminary exploration of some fundamental aspects of this new research field.