1. Introduction
Recent technological advancements in Large Language Models (LLMs), such as ChatGPT and other Generative Pretrained Transformer (GPT) platforms, have provided evidence supporting Turing’s conjecture that machine-based intelligence, i.e., Artificial Intelligence (AI), is possible. Not only does ChatGPT deliver believable replies in dialogues with humans and effectively pass the Turing Test, it has also impressed some humans to the point where they ascribe personality to it.
Originally referred to as “the imitation game”, the Turing Test was designed to assess a machine’s ability to exhibit intelligent verbal behavior comparable to that of a human. Turing proposed that a human evaluator would engage in natural language conversations with both a human and a machine, and if the evaluator could not distinguish between them, the machine would demonstrate its capacity for faithfully imitating human verbal behavior. Observe that Turing did not mention consciousness, only the ability to imitate.
However, there are reported examples of individuals who believe that LLM-based chatbots such as ChatGPT are conscious. As reported by The New York Times on 23 July 2022, Google fired engineer Blake Lemoine for claiming that Google’s Language Model for Dialogue Applications (LaMDA) was sentient (i.e., experiencing sensations, perceptions, and other subjective experiences). While Lemoine’s views were extreme, he was not the only one attributing sentience and even consciousness to new LLM platforms. More cautious interpretations suggest that LLMs might, in principle, possess some degree of consciousness, but currently, we have no way to ascertain this.
The advances of LLMs are rooted in principles that have been a part of AI research since the inception of deep neural networks, but what sets them apart is how they are currently being implemented. Modern advancements are built upon extensive neural networks, which are trained on vast datasets using thousands of high-speed graphical processing units (GPUs) in large computer clusters. This training process is supported by an efficient infrastructure, Transformer architectures, and neural network optimization techniques. The crucial final step involves applying reinforcement learning with human feedback (RLHF) [1]. Human AI trainers create a reward model that ranks responses, training the AI to determine the most appropriate responses for a given human interaction.
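To make the reward-modeling step concrete, the following minimal sketch (in Python/PyTorch, with illustrative names such as RewardModel; an assumption for exposition, not the implementation used for ChatGPT) shows the pairwise ranking objective commonly used in RLHF: the model is trained so that a human-preferred response receives a higher score than a rejected one.

```python
# Minimal sketch of RLHF reward-model training (illustrative names).
# A reward model assigns a scalar score to a (prompt + response) text;
# the pairwise ranking loss pushes the human-preferred response above
# the rejected one: loss = -log sigmoid(r_chosen - r_rejected).
import torch
import torch.nn as nn
import torch.nn.functional as F

class RewardModel(nn.Module):
    def __init__(self, vocab_size: int = 1000, dim: int = 64):
        super().__init__()
        self.embed = nn.EmbeddingBag(vocab_size, dim)  # toy text encoder
        self.score = nn.Linear(dim, 1)                 # scalar reward head

    def forward(self, token_ids: torch.Tensor) -> torch.Tensor:
        return self.score(self.embed(token_ids)).squeeze(-1)

model = RewardModel()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

# Toy batch: token ids for (prompt + chosen) and (prompt + rejected).
chosen = torch.randint(0, 1000, (8, 32))
rejected = torch.randint(0, 1000, (8, 32))

loss = -F.logsigmoid(model(chosen) - model(rejected)).mean()
opt.zero_grad()
loss.backward()
opt.step()
```

Once trained, such a reward model supplies the scalar signal that a reinforcement learning algorithm typically uses to fine-tune the language model toward responses humans rank highly.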
The effectiveness of this method and the ability of LLMs to generate intelligent responses in natural language were unexpected by the majority, who believed that more complex approaches would be required to pass the Turing Test. However, it appears that performing comprehensive computations over human-structured data, information, and knowledge may be sufficient to demonstrate a surprising level of imitation of human language abilities.
The training of GPT-3.5 involved a wide range of resources, including Wikipedia articles, social media posts, news articles, books, and other documents published before 2021. The next step in the development of the current methodology involves enhancing ChatGPT prompts with web search capabilities using WebChatGPT.
Data compression (Kolmogorov–Chaitin compression) plays a central role in the entire process of utilizing the collected human knowledge. As formulated by Greg Chaitin in 2006, “A useful theory is a compression of the data; compression is comprehension” [2]. In 2019, inspired by the work of Greg Chaitin, Hector Zenil wrote an article titled “Compression is Comprehension and the Unreasonable Effectiveness of Digital Computation in the Natural World” (https://arxiv.org/abs/1904.10258v3, accessed on 23 July 2022). Gerry Wolff, in his article “Intelligence Via Compression of Information” (tinyurl.com/2p8d5f4z, accessed on 23 July 2022), published in the IEEE Computer Society Tech News, Community Voices, on 1 February 2023, explains the mechanism of data compression in a cognitive system as a process of intelligence. Phil Maguire, Philippe Moser, and Rebecca Maguire take it a step further by arguing that consciousness can be understood as data compression [3].
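As a toy illustration of Chaitin’s slogan, the following sketch (an illustrative assumption, using Python’s standard zlib compressor as a crude, computable stand-in for the incomputable Kolmogorov complexity) computes the normalized compression distance between texts: texts that share structure compress better together than apart, so compression serves as a rough measure of shared regularity, i.e., “comprehension”.

```python
# "Compression is comprehension" in miniature: zlib as a crude,
# computable stand-in for Kolmogorov complexity.
import zlib

def C(s: str) -> int:
    """Compressed length of s in bytes."""
    return len(zlib.compress(s.encode("utf-8"), level=9))

def ncd(x: str, y: str) -> float:
    """Normalized compression distance: near 0 for similar texts,
    near 1 for unrelated ones."""
    cx, cy, cxy = C(x), C(y), C(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)

a = "The Turing Test assesses a machine's ability to imitate humans."
b = "The Turing Test measures whether a machine can imitate a human."
c = "Gradient descent updates parameters along the negative gradient."

print(ncd(a, b))  # smaller: shared regularities compress jointly
print(ncd(a, c))  # larger: little shared structure to exploit
```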
On the other end of the spectrum regarding the acknowledgment of LLM capabilities is Luciano Floridi, who views LLMs as “agency without intelligence” [4]. This alludes to “their ‘brittleness’ (susceptibility to catastrophic failure), ‘unreliability’ (false or fabricated information), and occasional inability to make basic logical inferences or handle simple mathematics”. Floridi’s article concludes that at the current stage of development, LLMs exhibit no intelligence. In this context, “intelligence” refers to an idealized human capacity for rational reasoning. However, it is worth noting that recent developments in cognitive science indicate a shift toward a different, more inclusive understanding of intelligence, wherein not only humans but every living organism possesses a level of cognition (basal cognition) and intelligence [5].
To make a connection to human abilities, it is helpful to examine Large Language Models (LLMs) not only in terms of their implemented mechanisms of data/information processing and architectures but also in comparison to human cognition. According to Daniel Kahneman, humans possess two complementary cognitive systems: “System 1”, which involves rapid, intuitive, automatic, and non-conscious information processing; and “System 2”, which encompasses slower, reflective, conscious reasoning and decision-making [6,7]. Approaches that recognize only “System 2” symbolic information processing (as in traditional AI approaches such as GOFAI) leave the symbol grounding problem unsolved. In contrast, “System 1” neural networks, as sub-symbolic data processing mechanisms, provide a means for symbol grounding in deep learning.
The fast neural network computation performed by LLMs, resulting in convincing dialogues, aligns with the fast thinking associated with “System 1”. According to Kahneman’s description, being on the “System 1” level means that LLMs lack consciousness, which, in this context, is characteristic of “System 2”.
Researchers such as Yoshua Bengio are exploring ways to incorporate “System 2” and merge neural networks with symbolic computing. Proposed hybrid models [8,9] would combine symbolic and sub-symbolic elements, enabling the modeling of a blend of reactive (fast, non-conscious/sub-conscious) and deliberative (slow, conscious) cognitive behaviors typical of human cognition. It should be noted that the interpretations of “System 1” and “System 2” by Bengio and Kahneman are not identical. This was evident from the discussion at the AAAI-2020 conference during the Fireside Chat with LeCun, Hinton, Bengio, and Kahneman [10]. However, the specifics of their differences are not essential for our present exposition.
The fast, automatic “System 1” can be understood, as stated in [11], through physical correlations explained by Carlo Rovelli [12], which align with Shannon’s notion of relative information. These physical correlations can also accommodate some reflexive emotional elements [13] in embodied (physical) agents. On the other hand, the slow “System 2” introduces an element of choice and indeterminism [14], primarily due to the presence of synonyms in the symbol system. This topic has been further explored in other research that also discusses parallel concurrent computation, which is typical of biological systems but is inadequately represented by the Turing Machine model [15].
2. Historical Notes: From Leibniz via Turing to GPT
Gottfried Wilhelm Leibniz’s work on a universal language, the “Characteristica Universalis”, and a calculus of reasoning, the “Calculus Ratiocinator”, provides a theoretical framework for universal logical calculations that has greatly influenced modern computer science and AI. In many ways, ChatGPT can be seen as the realization of Leibniz’s vision of the “Characteristica Universalis” and a method for generating new knowledge from existing information.
This article traces the evolution of the computational approach to knowledge generation from its roots in Leibniz’s concept of a universal language to Turing’s ideas of morphogenesis and, further, to modern AI models such as ChatGPT. Turing’s contributions laid the groundwork for a computational approach to learning and knowledge generation, with the invention of the Turing Machine for symbol processing, the concept of an “unorganized machine” (a neural network model), and the Turing Test for Artificial Intelligence [16].
This article explores the possibility that knowledge generation and learning can be achieved computationally, aligning with Leibniz’s original beliefs. It also highlights how the success of ChatGPT and other Large Language Models can be viewed in the context of Leibniz’s intellectual legacy.
Importantly, neither Leibniz, Turing, nor ChatGPT presupposes consciousness as a requirement for the computational process of generating knowledge from existing information and knowledge.
3. Penrose’s Criticism of Classical Computationalism and the Absence of Consciousness in ChatGPT
Roger Penrose has expressed criticism of computationalism in two of his books and proposed alternative perspectives on the nature of consciousness and human cognition. In his first book, Penrose [17] examines the limitations of Artificial Intelligence and computational models in explaining human consciousness and understanding. He discusses Gödel’s incompleteness theorems, the nature of consciousness, the role of quantum mechanics in the brain, and the constraints of algorithmic reasoning. In his second book, Penrose [18] continues his exploration of consciousness and its relationship to computation. He expands on his previous arguments and addresses criticisms and responses received following the publication of the first book. In this book, Penrose suggests that quantum processes underlie consciousness.
In his more recent work, the 2012 foreword to A Computable Universe: Understanding Computation & Exploring Nature As Computation [19], which is his latest text on computationalism, Penrose explicitly acknowledges having changed his position on the question of computationalism (the belief that the mind can be modeled computationally) multiple times. In the foreword, Penrose discusses different versions of computationalism and various possible interpretations. He expresses his criticism based on the Turing Machine model of computation, assuming that computation = Turing Machine = algorithm.
When Penrose argues that consciousness is incomputable, he means that it is not algorithmic in the sense of the Turing Machine. This position is well-established, even within modern computational approaches which suggest that distributed, asynchronous, and concurrent models of computation are necessary.
It is thus crucial to differentiate between new computational models of intrinsic information processes in nature, such as natural computing/morphological computing, and the old computationalism based on the computer metaphor of the Turing Machine, which describes symbol processing. This traditional model has been criticized as inadequate for modeling human cognition (Miłkowski [20,21]; Scheutz [22]), and it is considered irrelevant for AI by researchers such as Sloman [15].
4. Concluding Remarks: Passing the Turing Test Does Not Imply Consciousness but Demonstrates a Useful Level of Intelligence
When Turing discussed the possibility of constructing artificially intelligent agents based on computations performed by electronic machines, he was met with skepticism by many. Even to this day, human intelligence and especially consciousness are often considered impossible to implement in machines due to their supposed substantial differences in nature.
Recent technological advances in the development of Large Language Models (LLMs), such as ChatGPT and other GPT (Generative Pretrained Transformer) programs, have finally provided justification for Turing’s belief in the possibility of realizing intelligence in machines, also known as Artificial Intelligence (AI). These advancements are based on remarkably simple principles that have been known within the AI field for several decades, particularly since the advent of deep neural networks.
The current developments primarily differ in the following respects: the use of large neural networks trained on vast amounts of data; the utilization of thousands of fast GPUs in huge computer clusters that run for several weeks; the implementation of elaborate infrastructure; the optimization of neural networks; and the adoption of architectures such as Transformers.
The final step in this progression was achieving a better interface for human users, which was accomplished through reinforcement learning with human feedback (RLHF). In this process, human AI trainers created a reward model in which responses were ranked by humans, enabling the AI to learn which response was the most effective.
The method is surprisingly simple yet effective: predicting the next word based on an immense amount of compressed data that contains human knowledge on a given topic, collected from a vast array of available Internet sources. Thus far, LLM developers have employed large networks with substantial numbers of parameters, built extensive datasets, and designed algorithms to pass the Turing Test, as sketched below.
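To show the next-word-prediction principle in its most reduced form, here is a minimal sketch (an illustrative assumption; actual LLMs use Transformer networks trained on vastly larger corpora) of a bigram model that predicts each next word from co-occurrence counts over a tiny corpus:

```python
# Minimal sketch of next-word prediction: a bigram count model.
# Real LLMs use Transformers over Internet-scale corpora; the
# principle of "predict the next token from context" is the same.
import random
from collections import Counter, defaultdict

corpus = ("the machine can imitate a human . "
          "the machine can pass the test . "
          "a human can pass the test .").split()

# Count how often each word follows each preceding word.
following = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    following[prev][nxt] += 1

def generate(word: str, length: int = 8) -> str:
    """Sample a continuation, one predicted next word at a time."""
    out = [word]
    for _ in range(length):
        candidates = following.get(out[-1])
        if not candidates:
            break
        words, counts = zip(*candidates.items())
        out.append(random.choices(words, weights=counts)[0])
    return " ".join(out)

print(generate("the"))
```

The point of the sketch is that nothing in the loop requires understanding; scaling the same predict-the-next-token principle to far richer models and corpora is what yields the fluent dialogue discussed above.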
Symbolic computing has not yet been involved. Researchers such as Yoshua Bengio [23] believe in the power of hybrid computing models that involve both neural networks (NNs) and symbolic computing. This perceived necessity of two complementary systems is motivated by the findings of Kahneman [6] and Tjøstheim et al. [24] on two basic cognitive systems in humans: a fast “System 1” that involves reflexive, unconscious, automatic, and intuitive information processing; and a slow “System 2” that involves reflective, conscious reasoning and decision-making processes.
Computation in LLMs corresponds to Kahneman’s fast, intuitive “System 1” and provides a foundation for the slower, symbolic processing of “System 2”. Current developments in AI continue toward the modeling of “System 2” symbolic reasoning, as presented by Russin, O’Reilly, and Bengio [23]. In a recent interview with Wired magazine, Sam Altman from OpenAI said that the age of giant AI models is over and that new development strategies will be needed in the future [25].
It is important to observe that the Turing Test, which GPT programs can pass, is not a test of human consciousness but a demonstration of a machine’s ability to produce sufficiently believable human-like dialogue.
The question of generating new knowledge de novo, i.e., not from a vast corpus of existing human knowledge, is a different inquiry. Large Language Models (LLMs) serve as computational models for knowledge generation, producing new knowledge based on existing knowledge. This is why GPT programs are trained on huge amounts of human-produced text, the production of which required consciousness or physical presence in the world.
This situation is similar to synthetic biology, which can construct a living cell by assembling components from disassembled cells. However, it is still unable to generate a living cell de novo from a mixture of the organic molecules that constitute a living cell.
Using Kahneman’s terminology, LLMs operate on the “System 1” level, with rapid and unconscious processing. The success of LLMs relies on human knowledge being compressed and reused. Humans employed consciousness during the generation of that knowledge, but once it existed, an automated procedure could use it to generate additional knowledge.