Proceeding Paper

How GPT Realizes Leibniz’s Dream and Passes the Turing Test without Being Conscious †

by Gordana Dodig-Crnkovic 1,2
1 Department of Computer Science and Engineering, Chalmers University of Technology, 412 96 Gothenburg, Sweden
2 School of Innovation, Design and Engineering, Mälardalen University, 721 23 Västerås, Sweden
Presented at the Workshop on AI and People, IS4SI Summit 2023, Beijing, China, 14–16 August 2023.
Comput. Sci. Math. Forum 2023, 8(1), 66; https://doi.org/10.3390/cmsf2023008066
Published: 11 August 2023
(This article belongs to the Proceedings of 2023 International Summit on the Study of Information)

Abstract

This article addresses the background and nature of the recent success of Large Language Models (LLMs), tracing the history of their fundamental concepts from Leibniz and his calculus ratiocinator to Turing’s computational models of learning, and ultimately to the current development of GPTs. As Kahneman’s “System 1”-type processes, GPTs lack mechanisms that would render them conscious, but they nonetheless demonstrate a certain level of intelligence and the capacity to represent and process knowledge. This is achieved by processing vast corpora of human-created knowledge, which, for its initial production, required human consciousness, but can now be collected, compressed, and processed automatically.

1. Introduction

The latest technological advancements in Large Language Models (LLMs), such as ChatGPT and other Generative Pretrained Transformer platforms (GPTs), have provided evidence supporting Turing’s theory on the possibility of machine-based intelligence/Artificial Intelligence (AI). Not only does ChatGPT deliver believable replies in dialogues with humans and effectively pass the Turing Test, but it has also impressed some humans to the point where they ascribe personality to it.
Originally referred to as “the imitation game”, the Turing Test was designed to assess a machine’s ability to exhibit intelligent verbal behavior comparable to that of a human. Turing proposed that a human evaluator would engage in natural language conversations with both a human and a machine, and if the evaluator could not distinguish between them, the machine would demonstrate its capacity for faithfully imitating human verbal behavior. Observe that Turing did not mention consciousness, only the ability to imitate.
However, there are reported examples of individuals who believe that ChatGPT is conscious. As reported by The New York Times on 23 July 2022, Google fired engineer Blake Lemoine for claiming that Google’s Language Model for Dialogue Applications (LaMDA) was sentient (i.e., experiencing sensations, perceptions, and other subjective experiences). While Lemoine’s views were extreme, he was not the only one attributing sentience and even consciousness to new LLM platforms. More cautious interpretations suggest that LLMs might, in principle, possess some degree of consciousness, but currently, we have no way to ascertain this.
The advances of LLMs are rooted in principles that have been a part of AI research since the inception of deep neural networks, but what sets them apart is how they are currently being implemented. Modern advancements are built upon extensive neural networks, which are trained on vast datasets using thousands of high-speed graphical processing units (GPUs) in large computer clusters. This training process is supported by an efficient infrastructure, Transformer architectures, and neural network optimization techniques. The crucial final step involves applying reinforcement learning with human feedback (RLHF) [1]. Human AI trainers create a reward model that ranks responses, training the AI to determine the most appropriate responses for a given human interaction.
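As a concrete illustration of this ranking step, the sketch below shows the pairwise (Bradley–Terry) preference loss commonly used to train reward models of this kind. It is a minimal sketch under general assumptions, not OpenAI’s actual implementation, and the scores are hypothetical:

```python
import numpy as np

def reward_model_loss(r_chosen: np.ndarray, r_rejected: np.ndarray) -> float:
    """Pairwise (Bradley-Terry) preference loss: pushes the reward model to
    score the human-preferred response above the rejected one."""
    margin = r_chosen - r_rejected
    # -log(sigmoid(margin)), written in a numerically stable form
    return float(np.mean(np.logaddexp(0.0, -margin)))

# Hypothetical scalar rewards for three response pairs; in each pair the
# first response was ranked higher by a human trainer.
chosen = np.array([2.1, 0.7, 1.5])
rejected = np.array([0.3, -0.9, -0.2])
print(reward_model_loss(chosen, rejected))  # small when the rankings are respected
```

The loss shrinks as the model assigns larger score margins to the responses humans preferred, which is precisely the behavior the trainers’ rankings are meant to instill.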
The effectiveness of this method and the ability of LLMs to generate intelligent responses in natural language were unexpected by the majority, who believed that more complex approaches would be required to pass the Turing Test. However, it appears that performing comprehensive computations over human-structured data, information, and knowledge may be sufficient to demonstrate a surprising level of imitation of human language abilities.
The training of GPT-3.5 involved a wide range of resources, including Wikipedia articles, social media posts, news articles, books, and other documents published before 2021. The next step in the development of the current methodology involves enhancing ChatGPT prompts with web search capabilities using WebChatGPT.
Data compression (Kolmogorov–Chaitin compression) plays a central role in the entire process of utilizing the collected human knowledge. As formulated by Greg Chaitin in 2006, “A useful theory is a compression of the data; compression is comprehension” [2]. Inspired by Chaitin’s work, Hector Zenil wrote an article titled “Compression is Comprehension and the Unreasonable Effectiveness of Digital Computation in the Natural World” (https://arxiv.org/abs/1904.10258v3, accessed on 23 July 2022). Gerry Wolff, in his article “Intelligence Via Compression of Information” (tinyurl.com/2p8d5f4z, accessed on 23 July 2022), published in the IEEE Computer Society Tech News, Community Voices, on 1 February 2023, explains the mechanism of data compression in a cognitive system as a process of intelligence. Phil Maguire, Philippe Moser, and Rebecca Maguire take it a step further by arguing that consciousness can be understood as data compression [3].
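As a toy illustration of the “compression is comprehension” idea, the sketch below uses Python’s standard zlib compressor as a crude, computable stand-in for Kolmogorov–Chaitin compression (which is itself uncomputable): text with discoverable regularities compresses far better than random noise.

```python
import os
import zlib

def compression_ratio(data: bytes) -> float:
    """Compressed size divided by original size; lower means the
    compressor found (i.e., 'comprehended') more regularity."""
    return len(zlib.compress(data, level=9)) / len(data)

structured = b"the cat sat on the mat. " * 200  # highly regular text
noise = os.urandom(len(structured))             # incompressible random bytes

print(compression_ratio(structured))  # far below 1: the regularity is captured
print(compression_ratio(noise))       # near (or slightly above) 1: nothing to capture
```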
On the other end of the spectrum regarding the acknowledgment of LLM capabilities is Luciano Floridi, who views LLMs as “agency without intelligence” [4]. This alludes to “their ‘brittleness’ (susceptibility to catastrophic failure), ‘unreliability’ (false or fabricated information), and occasional inability to make basic logical inferences or handle simple mathematics”. Floridi’s article concludes that at the current stage of development, LLMs exhibit no intelligence. In this context, “intelligence” refers to an idealized human capacity for rational reasoning. However, recent developments in cognitive science indicate a shift toward a different, more inclusive understanding of intelligence, wherein not only humans but every living organism possesses a level of cognition (basal cognition) and intelligence [5].
To make a connection to human abilities, it is helpful to examine Large Language Models (LLMs) not only in terms of their implemented mechanisms of data/information processing and architectures but also in comparison to human cognition. According to Daniel Kahneman, humans possess two complementary cognitive systems: “System 1”, which involves rapid, intuitive, automatic, and non-conscious information processing; and “System 2”, which encompasses slower, reflective, conscious reasoning and decision-making [6,7]. If only “System 2” symbolic information processing is recognized (as in traditional AI approaches such as GOFAI), the symbol grounding problem remains unsolved. In contrast, “System 1” neural networks, as sub-symbolic data processing mechanisms, provide a means for symbol grounding in deep learning.
The fast neural network computation performed by LLMs, resulting in convincing dialogues, aligns with the fast thinking associated with “System 1”. According to Kahneman’s description, being on the “System 1” level means that LLMs lack consciousness, which, in this context, is characteristic of “System 2”.
Researchers such as Yoshua Bengio are exploring ways to incorporate “System 2” and merge neural networks with symbolic computing. Proposed hybrid models [8,9] would combine symbolic and sub-symbolic elements, enabling the modeling of a blend of reactive (fast, non-conscious/sub-conscious) and deliberative (slow, conscious) cognitive behaviors typical of human cognition. It should be noted that the interpretations of “System 1” and “System 2” by Bengio and Kahneman are not identical, as was evident from the discussion at the AAAI-2020 conference during the Fireside Chat with LeCun, Hinton, Bengio, and Kahneman [10]. However, the specifics of their differences are not essential for the present exposition.
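A deliberately schematic sketch of such a reactive/deliberative split is given below. It is not a reconstruction of any of the cited architectures [8,9]; all functions are hypothetical stand-ins, with a fast generator proposing candidates and a slow symbolic checker filtering them:

```python
from typing import Callable, List

def hybrid_answer(question: str,
                  fast_propose: Callable[[str], List[str]],
                  slow_verify: Callable[[str, str], bool]) -> str:
    """Toy hybrid loop: a fast, 'System 1'-style generator proposes candidate
    answers; a slow, 'System 2'-style symbolic checker accepts or rejects them."""
    for candidate in fast_propose(question):  # cheap, intuitive guesses
        if slow_verify(question, candidate):  # deliberate, rule-based check
            return candidate
    return "no verified answer"

def propose(question: str) -> List[str]:
    # Stand-in for a neural generator: returns plausible-looking guesses.
    return ["5", "4", "22"]

def verify(question: str, answer: str) -> bool:
    # Stand-in for symbolic reasoning: re-derive the sum of "a+b" exactly.
    left, right = question.split("+")
    return int(answer) == int(left) + int(right)

print(hybrid_answer("2+2", propose, verify))  # -> "4"
```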
The fast, automatic “System 1” can be understood, as stated in [11], through physical correlations explained by Carlo Rovelli [12], which aligns with Shannon’s relative information theory. These physical correlations can also accommodate some reflexive emotional elements [13] in embodied (physical) agents. On the other hand, the slow “System 2” introduces an element of choice and indeterminism [14], primarily due to the presence of synonyms in the symbol system. This topic has been further explored in other research that also discusses parallel concurrent computation, which is typical of biological systems but is inadequately represented by the Turing Machine model [15].

2. Historical Notes, from Leibniz via Turing to GPT

Gottfried Wilhelm Leibniz’s work on a universal language, “Characteristica Universalis”, and a calculus of reasoning, “Calculus Ratiocinator”, provides a theoretical framework for universal logical calculations that has greatly influenced modern computer science and AI. In many ways, ChatGPT can be seen as the realization of Leibniz’s vision of “characteristica universalis” and a method for generating new knowledge from existing information.
This article traces the evolution of the computational approach to knowledge generation from its roots in Leibniz’s concept of a universal language to Turing’s ideas of morphogenesis and, further, to modern AI models such as ChatGPT. Turing’s contributions laid the groundwork for a computational approach to learning and knowledge generation, with the invention of the Turing Machine for symbol processing, the concept of an “unorganized machine” (a neural network model), and the Turing Test for Artificial Intelligence [16].
This article explores the possibility that knowledge generation and learning can be achieved computationally, aligning with Leibniz’s original beliefs. It also highlights how the success of ChatGPT and other Large Language Models can be viewed in the context of Leibniz’s intellectual legacy.
Importantly, neither Leibniz, Turing, nor ChatGPT presupposes consciousness as a requirement for the computational process of generating knowledge from existing information and knowledge.

3. Penrose’s Criticism of Classical Computationalism and the Absence of Consciousness in ChatGPT

Roger Penrose has expressed criticism of computationalism in two of his books and proposed alternative perspectives on the nature of consciousness and human cognition. In his first book, Penrose [17] examines the limitations of Artificial Intelligence and computational models in explaining human consciousness and understanding. He discusses Gödel’s incompleteness theorems, the nature of consciousness, the role of quantum mechanics in the brain, and the constraints of algorithmic reasoning. In his second book, Penrose [18] continues his exploration of consciousness and its relationship to computation. He expands his previous arguments and addresses criticisms and responses received following the publication of the first book. In this book, Penrose suggests that quantum processes underlie consciousness.
In his more recent work, the 2012 foreword to A Computable Universe: Understanding Computation & Exploring Nature As Computation [19], which is his latest text on computationalism, Penrose explicitly acknowledges having changed his position on computationalism (the belief that the mind can be modeled computationally) multiple times. In the foreword, he discusses different versions of computationalism and their possible interpretations. His criticism rests on the Turing Machine model of computation, under the assumption that computation = Turing Machine = algorithm.
When Penrose argues that consciousness is incomputable, he means that it is not algorithmic in the sense of the Turing Machine. This position is well-established, even within modern computational approaches which suggest that distributed, asynchronous, and concurrent models of computation are necessary.
It is thus crucial to differentiate between new computational models of intrinsic information processes in nature, such as natural computing/morphological computing, and the old computationalism based on the computer metaphor of the Turing machine, which describes symbol processing. This traditional model has been criticized as inadequate for modeling human cognition (Miłkowski [20,21]; Scheutz [22]), and it is considered irrelevant for AI by researchers such as Sloman [15].

4. Concluding Remarks. Passing the Turing Test Does Not Imply Consciousness but Demonstrates a Useful Level of Intelligence

When Turing discussed the possibility of constructing artificially intelligent agents based on computations performed by electronic machines, he was met with skepticism by many. Even to this day, human intelligence and especially consciousness are often considered impossible to implement in machines due to a supposedly substantial difference in nature between humans and machines.
Recent technological advances in the development of Large Language Models (LLMs), such as ChatGPT and other GPT (Generative Pretrained Transformer) programs, have finally provided justification for Turing’s belief in the possibility of realizing intelligence in machines, also known as Artificial Intelligence (AI). These advancements are based on remarkably simple principles that have been known within the AI field for several decades, particularly since the advent of deep neural networks.
The current developments primarily differ in the following respects: the use of large neural networks trained on vast amounts of data; the utilization of thousands of fast GPUs in huge computer clusters that run for several weeks; the implementation of elaborate infrastructure; the optimization of neural networks; and the adoption of architectures such as Transformers.
The final step in this progression was achieving a better interface for human users, which was accomplished through reinforcement learning with human feedback (RLHF). In this process, human AI trainers created a reward model in which responses were ranked by humans, enabling the AI to learn which response was the most effective.
The method is surprisingly simple yet effective: predicting the next word based on immense compressed data containing human knowledge on a given topic, collected from a vast array of available Internet sources. Thus far, LLMs have passed the Turing Test by employing large networks with a substantial number of parameters, building extensive datasets, and designing suitable algorithms.
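The sketch below reduces next-word prediction to its most minimal statistical form, a bigram frequency model. It is a toy stand-in for a Transformer, with a hypothetical two-sentence corpus, meant only to make the “predict the next word” mechanism concrete:

```python
from collections import Counter, defaultdict

def train_bigram(corpus):
    """Count how often each word follows each other word: next-word
    prediction reduced to its simplest statistical form."""
    model = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.split()
        for prev, nxt in zip(words, words[1:]):
            model[prev][nxt] += 1
    return model

def predict_next(model, word):
    # Return the most frequent continuation observed in the training data.
    return model[word].most_common(1)[0][0] if model[word] else "<unk>"

corpus = ["the cat sat on the mat", "the dog sat on the rug"]
model = train_bigram(corpus)
print(predict_next(model, "sat"))  # -> "on" (seen twice, the strongest pattern)
```

An LLM differs in scale and in learning distributed representations rather than raw counts, but the training objective is the same: predict the next token given the preceding context.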
Symbolic computing has not yet been involved. Researchers such as Yoshua Bengio [23] believe in the power of hybrid computing models that involve both neural networks (NNs) and symbolic computing. This perceived necessity of two complementary systems is motivated by the findings of Kahneman [6] and Tjøstheim et al. [24] on two basic cognitive systems in humans: a fast “System 1” that involves reflexive, unconscious, automatic, and intuitive information processing; and a slow “System 2” that involves reflective, conscious reasoning and decision-making.
Computation in LLMs corresponds to Kahneman’s fast, intuitive “System 1” and provides a foundation for the slower, symbolic processing of “System 2”. Current developments in AI continue toward modeling “System 2” symbolic reasoning, as presented by Russin, O’Reilly, and Bengio [23]. In a recent interview with Wired magazine, Sam Altman of OpenAI said that the age of giant AI models is over and that new development strategies will be needed in the future [25].
It is important to observe that the Turing Test that GPT programs can pass is not a test of consciousness but one that demonstrates a machine’s ability to produce sufficiently believable, human-like dialogue.
The question of generating new knowledge de novo, i.e., not from a vast corpus of existing human knowledge, is a different inquiry. Large Language Models (LLMs) serve as computational models for knowledge generation, producing new knowledge based on existing knowledge. This is why GPT programs are trained on huge amounts of human-produced text, the production of which required consciousness or physical presence in the world.
This situation is similar to that of synthetic biology, which can construct a living cell by assembling components from disassembled cells but is still unable to generate a living cell de novo from a mixture of the organic molecules that constitute it.
Using Kahneman’s terminology, LLMs operate on the “System 1” level, with rapid and unconscious processing. The success of LLMs relies on human knowledge being compressed and reused. Humans employed consciousness during the generation of that knowledge, but once it existed, an automated procedure could use it to generate additional knowledge.

Funding

This research was funded by Chalmers University of Technology AI Research Centre CHAIR.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

No new data were created.

Acknowledgments

The author would like to thank Chalmers AI Research Centre CHAIR for supporting the organization of the workshop “AI for People” at IS4SI summit 2023, to which this article was submitted.

Conflicts of Interest

The author declares no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

References

  1. Amodei, D.; Christiano, P.; Ray, A. Learning from Human Preferences. Available online: https://openai.com/research/learning-from-human-preferences (accessed on 23 July 2022).
  2. Chaitin, G. The limits of reason. Sci. Am. 2006, 294, 74–81. [Google Scholar] [CrossRef] [PubMed]
  3. Maguire, P.; Moser, P.; Maguire, R. Understanding Consciousness as Data Compression. J. Cogn. Sci. 2016, 17, 63–94. [Google Scholar] [CrossRef]
  4. Floridi, L. AI as Agency without Intelligence: On ChatGPT, Large Language Models, and Other Generative Models. Philos. Technol. 2023, 36, 15. [Google Scholar] [CrossRef]
  5. Levin, M.; Keijzer, F.; Lyon, P.; Arendt, D. Basal cognition: Multicellularity, neurons and the cognitive lens, Special issue, Part 2. Philos. Trans. R. Soc. B 2021, 376, 132. [Google Scholar]
  6. Kahneman, D. Thinking, Fast and Slow; Farrar, Straus and Giroux: New York, NY, USA, 2011; ISBN 9780374275631. [Google Scholar]
  7. Tjøstheim, T.A.; Stephens, A.; Anikin, A.; Schwaninger, A. The Cognitive Philosophy of Communication. Philosophies 2020, 5, 39. [Google Scholar] [CrossRef]
  8. Larue, O.; Poirier, P.; Nkambou, R. Hybrid reactive-deliberative behaviour in a symbolic dynamical cognitive architecture. In Proceedings of the 2012 International Conference on Artificial Intelligence, ICAI 2012, Las Vegas, NV, USA, 16–19 July 2012. [Google Scholar]
  9. van Bekkum, M.; de Boer, M.; van Harmelen, F.; Meyer-Vitali, A.; ten Teije, A. Modular Design Patterns for Hybrid Learning and Reasoning Systems: A Taxonomy, Patterns and Use Cases. arXiv 2021, arXiv:2102.11965v2. [Google Scholar] [CrossRef]
  10. AAAI-20 Fireside Chat with Daniel Kahneman. Available online: https://vimeo.com/390814190 (accessed on 23 July 2022).
  11. Ehresmann, A.C. A Mathematical Model for Info-computationalism. Constr. Found. 2014, 9, 235–237. [Google Scholar]
  12. Rovelli, C. Relative Information at the Foundation of Physics. In It From Bit or Bit From It? On Physics and Information; Springer International Publishing: Cham, Switzerland, 2015. [Google Scholar]
  13. von Haugwitz, R.; Dodig-Crnkovic, G.; Almér, A. Computational Account of Emotion, an Oxymoron? In Proceedings of the IS4IS Summit Vienna 2015, Vienna University of Technology (Online), 3–7 June 2015.
  14. Mikkilineni, R. Going beyond Computation and Its Limits: Injecting Cognition into Computing. Appl. Math. 2012, 3, 1826–1835. [Google Scholar] [CrossRef]
  15. Sloman, A. The Irrelevance of Turing machines to AI. In Computationalism—New Directions; Scheutz, M., Ed.; MIT Press: Cambridge, MA, USA, 2002; pp. 87–127. [Google Scholar]
  16. Turing, A. Computing Machinery and Intelligence. Mind 1950, 236, 433–460. [Google Scholar] [CrossRef]
  17. Penrose, R. The Emperor’s new Mind: Concerning Computers, Minds, and the Laws of Physics; Oxford University Press: Oxford, UK, 1989. [Google Scholar]
  18. Penrose, R. Shadows of the Mind: A Search for the Missing Science of Consciousness; Oxford University Press: Oxford, UK, 1994. [Google Scholar]
  19. Zenil, H. A Computable Universe. Understanding Computation & Exploring Nature as Computation; Zenil, H., Ed.; World Scientific Publishing Company/Imperial College Press: Singapore, 2012. [Google Scholar]
  20. Miłkowski, M. Is computationalism trivial? In Computation, Information, Cognition—The Nexus and the Liminal; Dodig-Crnkovic, G., Stuart, S., Eds.; Cambridge Scholars Press: Newcastle, UK, 2007; pp. 236–246. [Google Scholar]
  21. Miłkowski, M. Objections to Computationalism: A Survey. Rocz. Filoz. 2018, 66, 57–75. [Google Scholar] [CrossRef]
  22. Scheutz, M. Computationalism New Directions; MIT Press: Cambridge, MA, USA, 2002; ISBN 9780262283106. [Google Scholar]
  23. Russin, J.; O’Reilly, R.C.; Bengio, Y. Deep Learning Needs a Prefrontal Cortex. In Proceedings of the Workshop “Bridging AI and Cognitive Science” (ICLR 2020), Online, 26 April 2020. [Google Scholar]
  24. Stephens, A.; Tjøstheim, T.A. The Cognitive Philosophy of Reflection. Erkenntnis 2020, 87, 2219–2242. [Google Scholar] [CrossRef]
  25. Knight, W. OpenAI’s CEO Says the Age of Giant AI Models Is Already Over. Available online: https://www.wired.com/story/openai-ceo-sam-altman-the-age-of-giant-ai-models-is-already-over/ (accessed on 23 July 2022).