Abstract
Most bioinspired morphological computing studies have started from a human analysis bias: cognitive morphology is considered as encapsulated by one body which, of course, can have enactive connections with other bodies, but which is defined by clear bodily boundaries. This kind of biological inspiration has been directing the research agenda of a huge number of labs and institutions in recent decades. Nevertheless, there are other bioinspired examples, and even technical possibilities that go beyond biological capabilities (such as constant morphological updating and reshaping, which demands the remapping of cognitive performances). Additionally, despite the interest of swarm cognition (which covers superorganisms such as flocks, swarms, packs, schools, crowds, or societies) for such non-human-centered approaches, a biological constraint remains: these cognitive systems have permanent bodily morphologies and only interact with similar entities. In all cases, and even considering amazing possibilities such as the largest living organism on Earth (the honey fungus Armillaria ostoyae, measuring 3.8 km across in the Blue Mountains of Oregon), the possibility of cross-morphological cognitive systems has not yet been put on the table: think, for example, of nests of intelligent drones operating as a single part of an AI system together with other co-working morphologies. I am, therefore, suggesting the necessity of thinking about cross-embodied cognitive morphologies, more dynamic and challenging than any other cognitive system already studied or created.
1. The Single Body Bias
Despite incredible advances in the understanding of cognition beyond brain centrism, which can be summarized under the 4E cognitive paradigm (extended, embodied, enactive, embedded), one strong bias still affects this research area: considering cognitive morphology as encapsulated by one body. Of course, these approaches have considered how the raw/original/seminal/precursor body can have enactive connections with other bodies, but the body itself is still defined by clear boundaries. At the same time, we must admit that self-reconfiguring modular robots exist (that is, robots with variable morphologies), but even there the notion of a single body to which changes are applied remains intact. Bioinspired morphological computing studies are, therefore, reproducing this bias.
On the other hand, there are other bioinspired examples, and even technical possibilities that go beyond biological capabilities (such as constant morphological updating and reshaping, which demands the remapping of cognitive performances). Additionally, despite the interest of swarm cognition (which covers superorganisms such as flocks, swarms, packs, schools, crowds, or societies) for such non-human-centered approaches, a biological constraint remains: these cognitive systems have permanent bodily morphologies and only interact with similar entities. In addition, cognition and language are based on the sensorimotor intelligence that came before them (active/embodied/animate/grounded/…) [1].
Having clarified this context, I am, therefore, suggesting the necessity of thinking about cross-embodied cognitive morphologies, again based on the 4E view (embodied, embedded, enacted, and extended cognition), but using a more dynamic and challenging model of bodies than any other cognitive system already studied or created. There are, nonetheless, interesting and insightful counterexamples to such an approach: Foundation Models (e.g., BERT, GPT-3, CLIP, Codex), as studied by Stanford HAI's Center for Research on Foundation Models (CRFM), are highly successful models trained on broad data at scale, such that they can be adapted to a wide range of downstream tasks. Take movement as an example: embodied cognition researchers will say that if I have the "move", then I can figure out the "think". It is more natural; this is how it happened in evolution, and it is the truly bio-inspired approach. However, Foundation Model researchers can also say that if I have the "think", then I can figure out in which parts of the "move" I need to ground my thinking. That is a reverse embodiment approach. The (silent) conclusion that Foundation Models are not "the right ones" because they lack a sensorimotor intelligence foundation is not necessarily right. Why? Because they can acquire it, by looking backward down the evolutionary ladder. This would mean starting with a symbolic conceptual system based on words and extending it to become multimodal. How? Due to a freaky historical sequence of events and an extraordinary amount of luck, foundation models are not really working with language: they are working with high-dimensional vectors representing structured objects. They are an example of how sensorimotor logics can be engineered using different building directions (bottom-up, top-down) and across a dynamic and open global morphology.
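As a purely illustrative sketch of this reverse embodiment idea (the data, dimensions, and names below are hypothetical assumptions, not taken from any of the systems mentioned above), one can imagine linking concept vectors learned from text alone to the sensorimotor episodes in which a robot later encounters instances of those concepts:

```python
# Minimal sketch (hypothetical data and names): "reverse embodiment" as
# grounding an existing symbolic/text embedding in sensorimotor experience.
# A concept vector learned from text alone is linked, after the fact, to the
# sensorimotor traces in which a robot encounters instances of that concept.
import numpy as np

rng = np.random.default_rng(0)
DIM = 512

# Pretend these come from a language-only foundation model (assumption).
concept_vectors = {"cup": rng.normal(size=DIM), "door": rng.normal(size=DIM)}

# Sensorimotor episodes: (proprioception + vision features, concept observed).
episodes = [(rng.normal(size=DIM), "cup") for _ in range(20)] + \
           [(rng.normal(size=DIM), "door") for _ in range(20)]

# Learn a linear map W from sensorimotor space to the symbolic space by
# least squares, so later sensorimotor input can activate the right concept.
X = np.stack([e[0] for e in episodes])                   # (N, DIM)
Y = np.stack([concept_vectors[e[1]] for e in episodes])  # (N, DIM)
W, *_ = np.linalg.lstsq(X, Y, rcond=None)

def ground(sensorimotor_features: np.ndarray) -> str:
    """Return the concept whose text-derived vector best matches the mapped input."""
    projected = sensorimotor_features @ W
    scores = {name: float(projected @ v / (np.linalg.norm(projected) * np.linalg.norm(v)))
              for name, v in concept_vectors.items()}
    return max(scores, key=scores.get)

print(ground(episodes[0][0]))   # should recover "cup" for a training episode
```

The direction of the mapping is the point of the sketch: the symbolic vectors exist first, and the sensorimotor data are fitted to them afterwards, rather than the other way around.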
2. Conceptual Aspects of This Debate
Several important conceptual debates are running at the same time in this cognitive shift:
2.1. Embodied vs. Disembodied AI
Although embodiment is increasingly understood as fundamental for real intelligent systems, the current Zeitgeist, more pragmatic and devoted to the deification of quick revenues, still defends the disembodiment of intelligent artificial systems. Take as an example how OpenAI recently abandoned its robotics work, despite aiming at AGI (artificial general intelligence). We are facing curious actions, too: although it is obvious that we need to involve the motor system in the process of perception, this idea is considered heresy by mainstream computer vision practitioners. For such a reason, the topic was for a long time excluded from CVPR, the IEEE Conference on Computer Vision and Pattern Recognition. Can we then create AI without including robotics? Does it mean that classic GOFAI will again dominate the research of the next decades? Although all the current challenges indicate that the next step will require the combination of symbolic and statistical AI, always keeping in mind the situated nature of cognitive systems [2,3,4], the quick successes of Deep Learning are blocking such complex attempts.
2.2. Haptics and Multimodal Perception Reconceptualization
Morphology is fundamental because even concepts/categories change radically with robot morphology. For example, the concept of an obstacle changes if you are crawling, hopping, walking, or flying. If you have panoramic vision (eyes in the back of your head), what is the meaning of front or back? If you have eyes all over your body, what does it mean to see? We also see that space metaphors are fundamental for the design of thinking mechanisms, and that they vary across geographical locations [5]. Such variables, environment and culture, even shape both the color lexicon and the genetics of color perception [6]. Therefore, the process of multimodal integration must be understood from the creation process, not from detached observation, and as a mechanism that embeds other epistemic processes (the selection of scale, time, or system data about the events under study).
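As a hedged, purely illustrative sketch (the classes, thresholds, and morphologies below are hypothetical), the point that "obstacle" is a body-relative category can be made explicit in a few lines of code:

```python
# Hedged illustration (all classes and numbers are hypothetical): the category
# "obstacle" is not fixed in the world but relative to the agent's morphology.
from dataclasses import dataclass

@dataclass
class Morphology:
    name: str
    max_climbable_height: float   # metres the body can step/crawl over
    max_flight_ceiling: float     # 0.0 means the body cannot fly

@dataclass
class Object3D:
    name: str
    height: float                 # metres

def is_obstacle(obj: Object3D, body: Morphology) -> bool:
    """An object only *counts as* an obstacle for a given body."""
    if body.max_flight_ceiling > obj.height:
        return False              # a flyer simply goes over it
    return obj.height > body.max_climbable_height

crawler = Morphology("crawler", max_climbable_height=0.05, max_flight_ceiling=0.0)
walker  = Morphology("walker",  max_climbable_height=0.30, max_flight_ceiling=0.0)
drone   = Morphology("drone",   max_climbable_height=0.00, max_flight_ceiling=50.0)

curb = Object3D("curb", height=0.15)
for body in (crawler, walker, drone):
    print(body.name, "sees the curb as an obstacle:", is_obstacle(curb, body))
# crawler -> True, walker -> False, drone -> False
```

The same physical object is an obstacle for one morphology and irrelevant for another; a cross-embodied system would have to maintain all of these perspectives at once.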
2.3. Possible Combinations of Current Models
My proposal is not entirely novel, because it integrates several already existing fields: cloud robotics (remote control and/or distributed bodies), swarm robotics (collaborative social robots), modular robotics, and hybrid artificial-biological interfaces (biobots). Combining several of these elements offers us a rich matrix of possibilities for the design of cross-cognitive systems.
2.4. Cognitive Transferability
Because of the grounded nature of cognition, the question of the transferability of cognition emerges as an important one. It matters in field robotics, where cooperation and data/skill transfer will be necessary for the optimization of resources, without depending on something humans do very slowly: sharing ideas/actions by language or imitation. If we add the problem of evolving morphologies, such transferability becomes very difficult to achieve. Additionally, of course, we must always consider the loss of information, without it blocking the main expected action, as the data concerning the main task escalate. Embodied and virtual bodies will make this process neither universal nor easy; in any case, it will demand very plastic, adaptive "brains" (or, rather, complex informational systems).
2.5. Physically Grounded Programming Languages
Part of the success of this cross-morphological cognitive revolution will depend on the existence of physically grounded programming languages that allow a task to be described once and then translated by compilers into different architectures. One such new programming language, called AL, can describe tasks independently of the robot's body; programs written in AL could then be compiled for particular robot architectures [7].
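Without reproducing AL itself (its syntax is not shown here), the underlying idea can be hinted at with a hedged Python sketch in which a body-independent task description is translated into the motor primitives of each particular morphology; all names below are hypothetical:

```python
# Not the AL language itself: just a hedged sketch of the idea that a task can
# be written once, body-independently, and "compiled" into the motor
# primitives of each particular architecture. All names here are hypothetical.
ABSTRACT_TASK = ["reach(object)", "grasp(object)", "transport(object, goal)"]

# Each "compiler" maps the same abstract steps onto a morphology's own primitives.
BACKENDS = {
    "two_arm_manipulator": {
        "reach":     ["plan_arm_trajectory", "move_joints"],
        "grasp":     ["close_gripper"],
        "transport": ["lift", "move_base", "lower"],
    },
    "quadrotor_with_hook": {
        "reach":     ["fly_to"],
        "grasp":     ["hook_payload"],
        "transport": ["fly_to_goal", "release_hook"],
    },
}

def compile_task(task, backend_name):
    """Translate the abstract task into the primitive sequence of one body."""
    primitives = BACKENDS[backend_name]
    program = []
    for step in task:
        verb = step.split("(")[0]
        program.extend(primitives[verb])
    return program

for backend in BACKENDS:
    print(backend, "->", compile_task(ABSTRACT_TASK, backend))
```

The design choice matters here: the task description carries no assumptions about arms, wheels, or rotors, which is precisely what a cross-embodied system would need.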
3. Cross-Embodied Cognition
From the previously explained concepts, we can infer that the roadmap toward a paradigm of cross-embodied cognition faces several challenges:
- Multiple sensorimotor intelligence foundations;
- Transferability and interoperability issues;
- Bioinspiration (not for GPT or BERT, although they are useful!);
- Multimodality integration and translation (building on existing Hyperdimensional (HD) Computing, introduced by Kanerva, and before him Plate, with Vector Symbolic Architectures; see the sketch after this list) [8];
- Decentralized morphologies.
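As an illustrative, hedged sketch of what such HD/VSA-style multimodal integration could look like (the modality roles and filler names are assumptions made for this example, not taken from Kanerva's text [8]), binding and bundling of quasi-orthogonal hypervectors can fuse modality-specific information and later query it back:

```python
# Minimal sketch of Hyperdimensional (HD) Computing / VSA operations in the
# spirit of Kanerva [8]; the modality and filler names are illustrative only.
import numpy as np

rng = np.random.default_rng(42)
DIM = 10_000

def random_hv():
    """A random bipolar hypervector (+1/-1), quasi-orthogonal to all others."""
    return rng.choice([-1, 1], size=DIM)

def bind(a, b):
    """Binding (element-wise multiplication): associates two hypervectors."""
    return a * b

def bundle(*vs):
    """Bundling (majority/sign of the sum): superposes several hypervectors."""
    return np.sign(np.sum(vs, axis=0))

def similarity(a, b):
    return float(a @ b) / DIM   # ~0 for unrelated vectors, ~1 for identical

# Role and filler vectors for a multimodal percept of one object.
VISION, TOUCH, SOUND = random_hv(), random_hv(), random_hv()        # roles
red_blob, smooth, buzzing = random_hv(), random_hv(), random_hv()   # fillers

percept = bundle(bind(VISION, red_blob), bind(TOUCH, smooth), bind(SOUND, buzzing))

# Unbinding with a role recovers a noisy copy of that modality's filler.
recovered = bind(percept, TOUCH)       # multiplication is its own inverse for ±1
print(similarity(recovered, smooth))   # high (around 0.5), well above chance
print(similarity(recovered, buzzing))  # near 0
```

The same algebra works regardless of which body produced the sensory streams, which is why it is attractive for multimodal integration and translation across morphologies.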
Some researchers are already working on these ideas [9], although still conditioned by the functional applicability of their systems. Of course, this text is a theoretical exercise of exploration for cross-cognitive systems and is therefore free from implementation tasks. Nevertheless, I want to finish this preliminary exploration with a Gedankenexperiment: space exploration.
Imagine the requirements of space exploration: long-term missions, very far away, and without Earth remote control. Modular robots will need not only to self-repair but also to self-adapt and self-reprogram. Such systems will need to transfer knowledge to different, much more specialized operational robotic systems, something that concerns not only data but also sensorimotor procedures and actions. The main problem is that current modular robotics is not based on grounded AI, which could produce additional complexities and mistakes during the transfer process.
4. Conclusions
Cross-embodied cognitive morphologies are a fundamental part of the next revolution in robotics, but the numerous challenges and complexities they imply require from us not only more resources and dedication but also a conceptual paradigm shift about the true nature of cognitive processes. The classic anthropomorphic and biological evolutionary set of resources is useful but also biased, because the informational and bodily possibilities of new AGI open up new conceptual possibilities. Take this text as a preliminary exploration of some fundamental aspects of this new research field.
Funding
This research received no external funding.
Institutional Review Board Statement
Not applicable.
Data Availability Statement
Not applicable.
Acknowledgments
To ICREA Acadèmia for their financial support, and Yannis Aloimonos for his ideas and generous conceptual feedback.
Conflicts of Interest
The author declares no conflict of interest.
References
1. Dodig-Crnkovic, G. Morphological, Natural, Analog, and Other Unconventional Forms of Computing for Cognition and Intelligence. Proceedings 2020, 47, 30.
2. Lindblom, J.; Ziemke, T. Social situatedness of natural and artificial intelligence: Vygotsky and beyond. Adapt. Behav. 2003, 11, 79–96.
3. Schroeder, M.J.; Vallverdú, J. Situated phenomenology and biological systems: Eastern and Western synthesis. Prog. Biophys. Mol. Biol. 2015, 119, 530–537.
4. Vallverdú, J. Approximate and Situated Causality in Deep Learning. Philosophies 2020, 5, 2.
5. Nisbett, R.E. The Geography of Thought: How Asians and Westerners Think Differently... and Why; Free Press: New York, NY, USA, 2003.
6. Josserand, M.; Meeussen, E.; Majid, A.; Dediu, D. Environment and culture shape both the colour lexicon and the genetics of colour perception. Sci. Rep. 2021, 11, 1–11.
7. Yang, Y.; Li, Y.; Fermüller, C.; Aloimonos, Y. Robot Learning Manipulation Action Plans by 'Watching' Unconstrained Videos from the World Wide Web. In Proceedings of the AAAI Conference on Artificial Intelligence, Austin, TX, USA, 25–30 January 2015; pp. 3686–3692.
8. Kanerva, P. Hyperdimensional Computing: An Introduction to Computing in Distributed Representation with High-Dimensional Random Vectors. Cogn. Comput. 2009, 1, 139–159.
9. Boudet, J.F.; Lintuvuori, J.; Lacouture, C.; Barois, T.; Deblais, A.; Xie, K.; Cassagnere, S.; Tregon, B.; Brückner, D.B.; Baret, J.C.; et al. From collections of independent, mindless robots to flexible, mobile, and directional superstructures. Sci. Robot. 2021, 6, eabd0272.
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2022 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).