1. Introduction
The concept of consciousness has remained difficult to define and understand [1]. Consciousness (at least, phenomenal consciousness, which will be our main focus) is an ontologically subjective phenomenon, and the scientific method applies only to epistemologically objective phenomena [2]. Although remarkable progress has been made concerning the definition of physical correlates of consciousness [3], the subjective experience of consciousness remains elusive. What is more, the multiple realizability assumption means that there might be different paths to consciousness, with different material substrates or physical correlates, so the evaluation of correlates is not equivalent to the identification of consciousness.
We can split the consciousness concept into several associated phenomena. Ned Block [4] argued that discussions on consciousness often fail to adequately differentiate between two distinct aspects: phenomenal consciousness (P-consciousness) and access consciousness (A-consciousness). It should be noted that these concepts predate Block. Phenomenal consciousness refers to raw experiences, such as moving, colored forms, sounds, sensations, emotions, and feelings, with our bodies and responses at the core. These experiences, detached from their impact on behavior, are referred to as qualia. A-consciousness, on the other hand, pertains to the accessibility of information in our minds for verbal report, reasoning, and behavioral control. Perception provides access-conscious information about what we perceive, introspection grants access to information about our thoughts, remembrance provides access-conscious information about the past, and so forth. Attention is the mechanism that focuses consciousness on a particular set of information coming from perception, interoception, or remembrance. Although some philosophers, like Daniel Dennett [5], question the validity of this distinction, others generally accept it.
While Block’s classification of two types of consciousness has been influential, some philosophers, like William Lycan [6], propose a more extensive range. Lycan identifies at least eight distinct types of consciousness, including organism consciousness, control consciousness, consciousness of, state/event consciousness, reportability, introspective consciousness, subjective consciousness, and self-consciousness. However, even this list excludes other, less recognized forms. Arguably, the most recognizable and relevant forms of consciousness are phenomenal consciousness, access consciousness, and self-consciousness, the last of which is defined as the model of our identity that we build based on our experience [7].
Debate persists regarding the coexistence or separability of the different forms of consciousness, especially access consciousness and phenomenal consciousness, and regarding how differently each of them should be approached; as explained, they would be identified with the easy and hard problems, respectively.
The deep divide lies, in other words, between the subjective phenomenon, the observer-relative experience of something the observer is paying attention to, and the objective phenomena, which can be simulated in a computer even if they are related to consciousness [8]. For example, we can manipulate colors with computers, but we cannot simulate qualia. We can even simulate the “bottleneck of consciousness” [9] but cannot simulate the perceived experience of the result.
There is already a vast array of literature that deals with machine simulation of particular aspects of consciousness (see [10] for a well-structured and comprehensive review). The main approaches have been the global workspace model, information integration, an internal self-model, higher-level representation, and attention mechanisms.
Global workspace theory [11] views the brain as a collection of specialized processors that provide for sensation, motor control, language, reasoning, and so forth. Conscious experience is hypothesized to emerge from globally shared information.
Information integration explains consciousness as mutual information shared among brain regions as they interact in a constructive manner [12].
The internal self-model approach is based on the idea that our mind includes a model of our body and of how it relates to the space it is embedded in, supported by certain brain regions [13].
The higher-level representation theory explains consciousness through a higher pattern of information encoding, akin to symbolic processing [14].
Finally, some authors point to the attention mechanisms that filter which fraction of ongoing processing is actually experienced as the basis of consciousness. This is one of the mechanisms that has been simulated most successfully, even being considered at the core of the recent revolution in generative AI [15]. Another relevant example has been the simulation of a computational ontology of a person based on historical records [16].
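For readers unfamiliar with the attention mechanism mentioned above, the following minimal NumPy sketch (our own illustrative code, not taken from any of the cited simulations) shows scaled dot-product attention, the operation at the core of modern generative models: a soft filter that concentrates processing on the most relevant fraction of the inputs.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    """Weight each value by the relevance of its key to the query:
    a soft 'filter' selecting which inputs dominate the output."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)     # query-key similarity
    weights = softmax(scores, axis=-1)  # attention distribution (rows sum to 1)
    return weights @ V, weights

# Toy example: one query attending over three stored items.
rng = np.random.default_rng(0)
query = rng.normal(size=(1, 4))
keys = rng.normal(size=(3, 4))
values = rng.normal(size=(3, 4))
out, w = scaled_dot_product_attention(query, keys, values)
```

The attention weights form a probability distribution over the inputs, which is why this mechanism is often read as a computational analogue of selective attention.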
All the previous models have lent themselves to simulation, which has been a fertile terrain for understanding some of the specific mechanisms of consciousness. These exercises have even resulted in the description of interesting related phenomena, such as in Axiomatic Consciousness Theory, a specialized simulation of visual processing based on attention, which can explain some properties of visual experiences, such as the foveal, eye-field, in-front, and space images [17].
However, how to implement or simulate phenomenal consciousness remains a complete mystery. We cannot simulate the experience of perceiving qualia, such as the redness of red.
We recall the “Mary’s room” thought experiment, featuring a blind girl who is an expert in vision science. Although she is an expert, Mary is unable to know the difference between the perception of red and of black, as this information comes from qualia and is subjective; that is, it cannot be represented in an objective manner, such as in a book description [18]. This is known as the knowledge argument, and it rests on the idea that someone who has completely objective knowledge about another conscious being, from the point of view of the scientific method and our epistemological scope, might yet lack knowledge about how it feels to have the experiences of that being [19].
A common belief, shared by part of the computer-science community, concretely the machine consciousness community [20], and even by part of the philosophy of mind community, more concretely the connectionist community [21], is that if an artificial general intelligence [22] were built (supposedly via some meta-learning [23] or transfer-learning methodology [24] applied to high-capacity deep learning models [25] and huge datasets), phenomenal consciousness might arise by emergence. That is, assuming the multiple realizability assumption of the philosophy of mind [26], this group believes that a sufficiently intelligent system gives rise to phenomenal consciousness, either as an epiphenomenon [27] or through a direct causal relation between intelligence and phenomenal consciousness. To justify this relation, it is mandatory to provide an objective definition of intelligence, which is problematic because the different communities mentioned provide different definitions of intelligence, particularly of computational intelligence. We provide an attempt to clarify our definition of intelligence in Section 4.
Several objections have been made to Mary’s argument; for example, that qualia are not information, and hence that Mary would already know everything about color. However, we argue that qualia are indeed phenomenological information (of the form “how it feels to”), modeled in the qualia space Q defined by the neuroscientific information integration theory of Tononi [12]. In particular, Q has an axis for each possible state (activity pattern) of an information complex (see Tononi’s paper for more details [12]). Within Q, each sub-mechanism specifies a point corresponding to a repertoire of system states. Most critically, arrows between repertoires in Q define informational relationships. Consequently, given mathematical information theory [28], qualia are information, and information that would be unattainable to Mary if she is blind.
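As a toy illustration only, and by no means Tononi’s full formalism, we can render a point of Q as a probability distribution (repertoire) over the states of a small complex, and the “arrow” between two repertoires as the relative entropy between them; all the numbers below are hypothetical.

```python
import numpy as np

def relative_entropy(p, q):
    """D(p || q) in bits: our stand-in for the 'informational
    relationship' (arrow) between two repertoires in this toy Q."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    mask = p > 0
    return float(np.sum(p[mask] * np.log2(p[mask] / q[mask])))

# A point of Q: a repertoire over the 2^3 = 8 states of a 3-element complex.
n_states = 8
max_entropy = np.full(n_states, 1.0 / n_states)           # unconstrained repertoire
actual = np.array([0.5, 0.25, 0.125, 0.125, 0, 0, 0, 0])  # after a mechanism acts

# Bits of information carried by the arrow from the unconstrained
# repertoire to the actual one.
info_bits = relative_entropy(actual, max_entropy)
```

On this reading, the arrow carries a definite number of bits, supporting the claim that qualia-space relationships are informational in the mathematical sense, even if the information is observer-relative.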
In order to continue analyzing the potential statistical, or even metaphysical, causal relation between phenomenal consciousness and intelligence, it is important to also briefly describe intelligence. Coming from the psychology community, and in a broad sense, intelligence is a very general mental capability that, among other things, involves the ability to reason, plan, solve problems, think abstractly, comprehend complex ideas, learn quickly, and learn from experience [29]. If these problems are computational, we can reduce and quantify intelligence as an analytical expression [30], giving rise to the concept of computational intelligence. We can express it quantitatively and study its relation with phenomenal consciousness. Here, we argue that computational intelligence would be an ontologically objective, continuous, numerical latent variable whose observation is noisy and obscured by a series of factors such as the personality or mood of the person being measured.
Computational intelligence can be exhibited by either living beings or machines, giving rise to what we call machine intelligence [31]. However, AI does not involve phenomena such as understanding [32], as understanding requires an entity to be aware of the learned concept. Computational intelligence, by contrast, does not require understanding: for example, a model can beat any human at chess while being unaware of doing so. Hence, we find that computational intelligence, which is a subset of intelligence, shares no relation in this example with phenomenal consciousness, which is the focus of this paper; we will continue to provide further examples like this one.
The organization of this paper is as follows. First, we illustrate Russell’s analogy, which is at the root of the belief that intelligence and consciousness are related, and formalize it from a Bayesian point of view. Then, we provide some simple counter-examples that give empirical evidence of how unlikely the hypothesis is to be true. In an additional section, we study the concept of intelligence and provide a new definition of computational intelligence to reject the mentioned hypothesis more formally. With that definition, we study the potential causal relationship between phenomenal consciousness and computational intelligence. Afterward, we formalize why computational intelligence is not able to model phenomenal consciousness. Finally, we close the paper with a section on conclusions and further work.
3. Russell’s Analogy of Consciousness
In this section, we present the analogy postulated by Russell about intelligence and consciousness [36]. Broadly speaking, he states that it is highly probable that consciousness is the only cause of the intelligent behavior that humans exhibit. He does so by supposing that, if the behavior of other people is similar to our own, then, by observation, we can establish a causal relation and conclude that other people possess consciousness as we do. Literally, from Russell’s analogy, we have that:
“We are convinced that other people have thoughts and feelings that are qualitatively fairly similar to our own...it is clear that we must appeal to something that may be vaguely called analogy. The behavior of other people is in many ways analogous to our own, and we suppose that it must have analogous causes. What people say is what we should say if we had certain thoughts, and so we infer that they probably have these thoughts...As it is clear to me that the causal laws governing my behavior have to do with thoughts...how do you know that the gramophone does not think?...it is probably impossible to refute materialism by external observation alone. If we are to believe that there are thoughts and feelings other than our own, that must be in virtue of some inference in which our own thoughts and feelings are relevant...establish a rational connection between belief and data...From subjective observation I know that A, which is a thought or feeling, causes B, which is a bodily act, whenever B is an act of my own body, A is its cause. I now observe an act of the kind B in a body not my own, and I am having no thought or feeling of the kind A. But I still believe on the basis of self-observation, that only A can cause B. I, therefore, infer that there was an A which caused B, though it was not an A that I could observe.”
Russell’s analogy could be roughly summarized as “consciousness, which we cannot observe, can be inferred from behavior, which we can observe”. Russell refers to the causal laws going from thoughts to behavior. If we understand “having thoughts” in the hard sense, as the subjective experience of phenomenal consciousness, then for Russell consciousness would be the cause of behavior (in particular, of intelligent behavior).
However, we must realize that this reasoning falls prey to the fallacy of affirming the consequent: there are many reasons why an agent may exhibit intelligent behavior. For instance, the DALL-E 2 model generates artistic images, but almost all of us would agree it is not conscious. The same can be said, even more strongly, of ChatGPT and its dialogue applications. The reasoning in the analogy involves correlation more than causality; here, the confounder would be that both human behavior and the behavior of generative AI are the result of human intent.
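This probabilistic weakening of the analogy can be made concrete with Bayes’ rule; the numbers below are purely hypothetical and only illustrate the structure of the argument.

```python
def posterior_conscious(prior_c, p_b_given_c, p_b_given_not_c):
    """Bayes' rule for P(C | B), where B = 'exhibits intelligent
    behavior' and C = 'is conscious'. All inputs are hypothetical."""
    num = prior_c * p_b_given_c
    return num / (num + (1.0 - prior_c) * p_b_given_not_c)

# If consciousness were virtually the only route to intelligent
# behavior, observing B would strongly support C:
strong = posterior_conscious(0.5, 0.9, 0.01)
# But once non-conscious systems also produce intelligent behavior
# (large P(B | not C)), the same observation carries little evidence:
weak = posterior_conscious(0.5, 0.9, 0.9)
```

The existence of generative models thus raises P(B | not C) and deflates the inference from intelligent behavior to consciousness, which is exactly the flaw in the analogy.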
From a classical-logic point of view, Russell assumes that every conscious being produces intelligent behavior. Applying modus tollens, if a being does not exhibit intelligent behavior, then it would not be conscious, at least from the probabilistic point of view considered in the analogy. However, this reasoning is flawed. For instance, a person suffering from severe autism may not show intelligent behavior; hence, following Russell’s argument, there would be at least a high likelihood that this person is not conscious. However, this is not true.
Having shown that there are phenomenally conscious human beings who do not exhibit intelligent behavior according to several estimates, or whose computational intelligence cannot be compared to that of computers, we provide another counter-example to the analogy, coming from the field of artificial intelligence [37]. In particular, we have seen how, in recent years, due to methodological advances such as deep learning [25] and the rise of computational power, intelligent systems have surpassed human abilities in a series of complex games. Some examples include AlphaGo defeating the world champion at Go [38], IBM Watson winning at Jeopardy [39], and the discovery of new, previously unknown chess strategies with deep learning [40]. General intelligence is a broad property, in the ontological sense, but we can reduce its meaning and provide a definition for a subset of it. In particular, we can epistemologically measure it as a function of the proportion of the computational problems that a system can solve from the set of all computational problems. Following this lower bound of general intelligence, a system implementing all the known machine learning models, and meta-models of them, able to solve any task with sufficient data will certainly outperform human beings in a broad series of problems and even solve problems that we do not know how to solve, like the protein folding problem [41].
Following Russell’s analogy, it would seem highly likely that a system implementing these algorithms would be conscious; it would be even more probable that such a system is more conscious than any human being. However, as we will further show with multi-disciplinary arguments, the likelihood that a Turing machine, which is essentially any known software being executed by a computer, is conscious is almost zero. Hence, the evidence given by the data shows that the hypothesis of a causal relation between intelligence and consciousness is fallacious. Finally, as an extreme argument, we can measure the computational intelligence, as we will do in the next section, of a severely autistic person with respect to a system implementing several methodologies such as AlphaGo, showing that phenomenal consciousness cannot be modeled by computational intelligence.
4. Defining Intelligence
Intelligence is a widely known concept that was assimilated by the computer-science community to coin the term artificial intelligence. However, artificial intelligence is a misleading term, as it requires a proper definition of intelligence as a property that can be modeled with a set of numerical variables.
In particular, multiple definitions of intelligence have been proposed by different communities, but all of them seem to be reductions of the general meaning of intelligence. For example, if we include in intelligence the ability to understand and empathize with another person, this ability requires us to feel the situation that the other person is experiencing. From a theoretical, observer-relative, and internal point of view, it is not enough to appear to understand or feel through simulation methods based on quantitative measures; one would need to receive the qualia of the feeling or of the idea being understood.
Hence, feeling requires awareness, or phenomenal consciousness, of the person with whom one is having a conversation. As a consequence, this ability cannot be reduced to a simple set of numerical variables nor implemented in a machine. In this section, we review some of the different definitions of intelligence to further justify why computational intelligence cannot model phenomenal consciousness.
4.1. Artificial Intelligence and Deep Learning
Artificial intelligence [37] has another controversial definition. Generally, it is the science and engineering of making intelligent machines [42]. But if we then want to define the intelligence of machines, we are led to a circular definition. We prefer to define it as an objective quantitative measure determined by the scope of problems that an artificial system is able to solve.
In recent years, due to the significant advances in computational power, it has been possible to implement high-capacity machine-learning models [43] like deep neural networks, which is usually referred to as deep learning [25]. As we have illustrated in the introduction, these models, whose capacity can exceed 500 billion parameters [44], are able to solve complex problems like the protein folding problem [41], Go and chess [38], and writing philosophy articles in a newspaper mocking the type of writing that was usually attributed only to human beings [45], mastering natural language processing and common-sense tasks and generating art [46]. In essence, deep learning methodologies are able to fit complex probability distributions and to generalize their behavior to tasks that were supposed to be solvable only by humans, making their behavior indistinguishable from that of humans [47].
However, deep neural networks are software programs that are executed on computer hardware in a CPU (Central Processing Unit), GPU (Graphics Processing Unit), or TPU (Tensor Processing Unit). Concretely, these hardware units are part of a von Neumann architecture, which is essentially a Turing machine, making deep neural networks algorithms that can be executed by a von Neumann architecture and hence by a Turing machine. Consequently, as we will further argue, they lack awareness or phenomenal consciousness. As a result, they are unable to understand or experience the scope of problems that they are solving and merely perform computations involving pattern recognition, independently of their complexity. Hence, artificial intelligence systems (at least in their current form) only possess computational intelligence [30,31], lacking understanding, as it requires the qualia of the problem being solved. However, a virtue of computational intelligence is that it can be quantified, as it solves objective problems belonging to the set of all possible computational problems. In contrast, general human intelligence, as we will further see, is subjective and relative to the observer, requiring the qualia generated by understanding, feelings, or empathy, and is hence impossible to quantify without reducing its essence.
There are several proposals to quantify the computational intelligence that a system or an entity possesses. Let π be an entity, for example a human being, that at every instant t is able to perform an action from a set of actions A to solve a given problem. An intelligent agent π would decide, for every instant t, the optimum action to solve the problem. The branch of computer science that studies how to train intelligent agents in this framework is called reinforcement learning [48] and can be directly extrapolated to reality. For example, if we want to say the optimum phrase to win a negotiation, at every instant t we receive the sentence of the person we are negotiating with, her word-frequency distributions, or her emotional state; as a function of all that information, we choose to answer with a certain phrase in a particular manner. As we can see, reinforcement learning can be applied to a plethora of computational-intelligence problems; in fact, reinforcement learning systems are implemented in robots for planning. We now introduce the analytical expressions of several measures of intelligence to objectively clarify how they could be modeled mathematically. Dealing with these systems, which can perfectly well be humans, the universal intelligence Υ(π) of a data structure resembling an agent π is given by the following measure [49]:

Υ(π) = Σ_{μ∈E} 2^{−K(μ)} V_μ^π,

where μ is a data structure representing an environment from the set E of all computable reward-bounded environments, K(μ) is the Kolmogorov complexity of μ, and V_μ^π is the expected sum of future rewards when agent π interacts with environment μ. That is, the previous expression is a weighted average of how many problems μ an agent π can solve, weighted by their difficulty; in particular, this is the reason why the weight 2^{−K(μ)} includes K(μ). Several things are interesting in this expression. First, the set E of all computable reward-bounded environments, i.e., the set of all computational problems, is countably infinite; hence, the intelligence Υ(π) of an agent is not upper-bounded. If we transform the set E into a set where the area of a problem μ is given as a function of its difficulty, with a larger area given to problems that are more difficult with respect to a particular agent π, the previous measure can be transformed into this abstract, general measure:

Υ(π) = ∫_E o(μ) δ(π, μ) dμ,

where o(μ) is an oracle function that gives the objective area of a problem μ and δ(π, μ) is a delta function representing whether the particular problem is solved or not by the agent π. Recall that the delta function outputs 1 if the problem is solved and 0 otherwise. As the set is potentially countably infinite, a problem can be decomposed, according to the progress made on it, into different sub-problems down to a simple base problem, each one with a different area measuring the progress of an agent on the original problem. Interestingly, the integral over the set E gives the area of the computational problems being solved, and this area is infinite. Moreover, an oracle giving a particular objective, unbiased measure of difficulty for every problem would be needed; depending on the features of the system, a problem may be more difficult than others, especially for non-computable problems requiring qualia to be solved. These objections make such a measure impossible to implement unbiasedly in practice, but it may serve as a lower bound of the computational intelligence of a system, animal, or human being.
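As a sketch of why such measures can only be practical lower bounds, the following toy code evaluates the universal intelligence expression over a small, hand-picked finite set of environments, substituting hypothetical complexity values for the uncomputable K(μ).

```python
# Toy evaluation of the universal intelligence measure
#   Upsilon(pi) = sum over mu in E of 2^(-K(mu)) * V(pi, mu),
# restricted to a hand-picked finite subset of environments. The true
# K (Kolmogorov complexity) is uncomputable, so the K values below are
# hypothetical stand-ins; V is the agent's bounded expected reward.
def universal_intelligence(environments, value_of):
    """environments: list of (name, K) pairs; value_of: name -> reward in [0, 1]."""
    return sum(2.0 ** (-K) * value_of[name] for name, K in environments)

envs = [("trivial", 2), ("easy", 5), ("hard", 12)]
solves_all = universal_intelligence(envs, {"trivial": 1.0, "easy": 1.0, "hard": 1.0})
solves_one = universal_intelligence(envs, {"trivial": 1.0, "easy": 0.0, "hard": 0.0})
# More complex environments contribute exponentially less, and any
# finite subset only lower-bounds the full countable sum over E.
```

Note how the score depends entirely on the assumed complexity values, which is precisely the oracle problem raised above: without an objective source for K(μ), any implemented score is biased.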
Another intelligence measure, for a system π over a scope of sampled tasks T, is now described. We have generalized the measure proposed by Chollet, taking into account not only a scope of particular tasks enumerable into a set, but all the possible tasks that can be performed in our universe, which is potentially infinite and, we believe, the scope that should be taken into account. Recall that we want to provide an ontologically objective general measure of computational intelligence [30], as we want to study its relation with an ontologically objective dichotomous property, namely whether an agent is aware via phenomenal consciousness. Consequently, any measure that excludes a single property, or that is noisy or biased, such as the intelligence quotient, cannot be compared with phenomenal consciousness without the results also being biased or noisy. Summarizing the main components of the expression, let P + E (priors plus experience) represent the total exposure of the system to information about the problem, including the information it starts with at the beginning of training, represented by the curriculum C. Let ω_T be the subjective value we place on achieving sufficient skill at a task T, and let GD_{T,C} be the generalization difficulty, for agent π, of solving task T given its curriculum or specific properties C:

I(π) = E_T [ ω_T · GD_{T,C} / (P + E) ],

where the expectation is taken over the potentially infinite scope of tasks.
The formula is basically a generalization of Chollet’s measure that takes into account the previous knowledge, modeled by the curriculum and the priors, used to solve a particular task T. The difficulty of the task is now modeled by the generalization difficulty, and solving a potentially infinite scope is handled by computing the expectation over that scope. However, although this measure takes into account whether an entity is able to generalize from prior knowledge as a measure of intelligence, we find the same problems as in the previous measure.
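The following toy computation illustrates the structure of this generalized measure under our assumed notation; every number, and the exact algebraic form, is a hypothetical stand-in, since the true scope of tasks is infinite and the generalization difficulty would require an oracle.

```python
# Toy rendering of the generalized skill-acquisition measure: for each
# sampled task, weight the subjective value omega by the generalization
# difficulty GD and divide by the total exposure P + E (priors plus
# experience/curriculum). All values are invented for illustration.
def skill_acquisition_intelligence(tasks):
    total = 0.0
    for t in tasks:
        total += t["omega"] * t["GD"] / (t["P"] + t["E"])
    return total / len(tasks)  # empirical expectation over the sampled scope

tasks = [
    # Succeeding on a hard-to-generalize task from little exposure scores high.
    {"omega": 1.0, "GD": 0.9, "P": 0.1, "E": 0.2},
    # Succeeding on an easy task after heavy training scores low.
    {"omega": 1.0, "GD": 0.1, "P": 0.5, "E": 2.0},
]
score = skill_acquisition_intelligence(tasks)
```

The sketch makes the measure’s intent visible: skill acquired cheaply on hard-to-generalize tasks counts for much more than skill bought with massive exposure.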
In both measures of intelligence, as the set of potential problems is potentially infinite, any entity would really obtain a measure of general intelligence of approximately 0, as it would fail to solve a potentially infinite set of problems. Moreover, both measures require an oracle to determine the difficulty of each task; consequently, they would both be biased even if an objective oracle were able to provide this quantity.
Any measure of intelligence giving a score other than zero, although practical, would be just a lower bound of the true intelligence of the entity, better approximated with these measures than with the intelligence quotient. Hence, it can be useful for health assessments but never to classify one individual as more intelligent than another or, as we will further see, to say that a being has a higher or lower likelihood of having phenomenal consciousness. It is critical to provide an abstract definition of computational intelligence for two main reasons. First, in order to study its relation with an ontological property such as phenomenal consciousness, it needs to possess the same properties as phenomenal consciousness, that is, to be ontological, general, and unbiased. Second, such a definition of intelligence can shed light for the psychology community on how to provide less biased estimators of it. Once again, this definition corresponds to the parameter, and measures such as the intelligence quotient correspond to the estimator.
4.2. Intelligence Quotient and Similar Approaches
Human intelligence includes a series of skills that are focused on different types of problems. The set of problems that human intelligence can solve intersects with the set of computational problems but is not contained in it.
Some examples of this kind of problem include discriminating which color is the most beautiful for a particular observer in terms of our perception of colors, deciding which is the best action in a complex personal conflict involving human relationships, determining how a person must adapt her emotional state, or grasping the true notion of a metaphysical phenomenon. The common feature of all these problems is that they involve qualia, information about our universe that Turing machines lack. In particular, we consider qualia to be semantic information, in the sense that the observer perceives the quality of a color in one particular way, the redness of red, and not in another. Consequently, this perception can be considered a property that may be codified and that is actually transmitted to the observer by the brain. Although this information is subjective and relative to the observer, it is still information that can be represented in a qualia space, as in integrated information theory, and is transmitted to the phenomenally conscious observer.
Consequently, from our point of view, we can only measure the intelligence that a human being shows externally, associated with these problems, in terms of correlations, which are a reduction of its true scope but the only way of being objective. Since ancient times, human intelligence has been measured through some of its specific features. For example, in ancient Greece, memory was highly valued: Plato, in Phaedrus, saw writing as an undesirable tool for external memory, where the memory of dialogues was seen not only as a passive repository of information but also as a tool for critical thinking and the creation of new ideas. Rhetoric, in turn, was a crucial part of education, law, politics, and literature in ancient Rome [50]. In particular, Cicero argued that the ideal orator would be knowledgeable in all areas of human life and understanding, emphasizing the connection between broad knowledge and the ability to speak persuasively, thus highlighting the close relationship between rhetoric and intelligence in Roman society [51]. In the past century, abstract reasoning became highly appreciated and a critical feature of Stern’s intelligence quotient [52]. Stern’s intelligence quotient assigns a mental age to a person based on their performance on a series of tests, including reasoning, logic, language, and more; the scored mental age is then divided by the chronological age to obtain a simple ratio.
However, several features that are independent of intelligence may affect Stern’s measure. For example, the subject may be in a sad mood, be an introvert, or have some special condition such as autism. Due to these conditions, the intelligence shown externally by the subject does not correspond to their true intelligence. In other words, true human intelligence would be a latent variable contaminated by noise, and any approach that measures human intelligence in the manner of Stern’s intelligence quotient is an approximation to the underlying intelligence of the subject.
Moreover, as Stern’s test and similar ones include only a subset of all the subjective problems that a human being is able to solve, the intelligence measured by these tests is a lower bound of the true intelligence of the human being. Consequently, we believe that these approximations are naive, unreliable, culturally biased, and noisy. From a statistical point of view, the intelligence quotient would be a poor estimator of human intelligence: biased, because it does not test all the areas of intelligence and is influenced by Western culture, and of high variance, because its measurement contains noise, as individuals may be nervous or shy, have a special condition, or simply not wish to score high.
Hence, since we can only obtain a measure of intelligence via a test analogous to Stern’s, and since the quality of that approximation is poor, the value of this random variable cannot be used in a causal relation with the value of the dichotomous phenomenal-consciousness variable. Recall that these tests can only reduce the true underlying intelligence of a human being, or even of a system, to an approximate lower bound. Consequently, this quantity can be established neither as the cause nor as the effect of phenomenal consciousness.
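The latent-variable view of the intelligence quotient can be illustrated with a small simulation; the bias and noise parameters below are invented solely to show that averaging many noisy test scores recovers the biased value, never the latent one.

```python
import random

# Toy latent-variable model of the intelligence quotient: the observed
# score equals a latent intelligence value plus a systematic bias
# (untested areas, cultural framing) plus Gaussian noise (mood, shyness,
# special conditions). All parameter values are invented.
rng = random.Random(42)
latent = 100.0     # true, unobservable intelligence
bias = -5.0        # systematic underestimation
noise_sd = 10.0    # administration-to-administration variability

scores = [latent + bias + rng.gauss(0.0, noise_sd) for _ in range(10_000)]
mean_score = sum(scores) / len(scores)
# Averaging many administrations concentrates the estimate around
# latent + bias, never around the latent value itself.
```

This is the statistical sense in which the quotient is a biased, high-variance estimator: more testing reduces the variance but can never remove the bias.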
We can illustrate this statement with several examples. For instance, a comatose person is, according to neuroscience, phenomenally conscious [
53] but would score 0 on Stern’s test or similar ones. Another example is natural-language generative transformers such as GPT-3. This algorithm is very close to passing the Turing test [
47] and performs very successfully on intelligence quotient tests. However, as we will see in the next section, it is clear that this system does not possess awareness. Finally, a person with Down syndrome would score fewer points on average than a neurotypical person but clearly possesses consciousness. These three examples show how computational intelligence and phenomenal consciousness are not directly related, thereby refuting Russell’s analogy. An even more convincing case is the following: in the science-fiction novel
The Three-Body Problem [
54], an enormous number of people are arrayed on a planet, acting like a CPU. Each person acts as a transistor, forming a vast Von Neumann architecture. Most critically, observe that there is no physical connection between the people acting as transistors. Consequently, according to consciousness theories that require physical connections, such as information integration theory [
12], or the Pribram–Bohm holoflux theory of consciousness [
55], this people-CPU would be non-conscious as a whole. However, it can solve the same problems that a high-capacity deep learning model can solve, as the people-CPU can execute a program that implements the deep learning model. This is the most obvious case in which we can see that an algorithm, whatever degree of intelligence we may measure from its behavior, does not have phenomenal consciousness, and that phenomenal consciousness bears no direct relationship to intelligence. In the following section, we will argue that non-computational intelligence may be correlated with consciousness, but this remains a mystery, and we cannot say objectively whether they are dependent or not.
6. Can Computational Intelligence Model Phenomenal Consciousness?
We will now formalize Russell’s analogy from a Bayesian point of view. First of all, we emphasize that we prefer the Bayesian framework when we refer to random variables such as phenomenal consciousness. Concretely, from a Bayesian point of view, consciousness is a random variable because we do not observe it directly in any individual other than ourselves; hence, we can only reason about it through random variables until observations make it possible to reduce our uncertainty about it.
The latent, unobservable variable would be whether an entity possesses phenomenal consciousness or not. As we isolate the observer of phenomenal consciousness, in the sense of Dehaene’s term awareness, from all the other features of consciousness such as access consciousness, we assume here that phenomenal consciousness is a dichotomous variable C. Recall that phenomenal consciousness is not a matter of being aware of more or fewer phenomena, which the complexity of the integrated information theory qualia space can model. Phenomenal consciousness, by our definition, is being an observer of the qualia space generated by a living being. Consequently, one can only be aware of the qualia space, that is, an observer of it, or not. Hence, following our assumption that phenomenal consciousness is not an epiphenomenon or something intrinsically related to the qualia space but rather a property of beings, namely being aware of their qualia space, we define phenomenal consciousness as a dichotomous variable of perceiving or not perceiving the qualia space that a being generates.
Let I be the computational intelligence of an entity as defined in previous sections, denoted by the continuous numerical variable I. A subject S may or may not possess phenomenal consciousness, but with the current state of science we are only able to determine whether it is conscious by looking at the neural correlates of consciousness. If the system does not have a biological brain or nervous system, science is unable to provide any clue about the consciousness of S. Then, $p(C \mid S)$ would be the conditional probability that a subject S has phenomenal consciousness, with $C \in \{0, 1\}$, and $p(I \mid S)$ is the conditional probability distribution of the computational intelligence of the subject. Concretely, an intelligence quotient test would not determine the intelligence of S as a point estimate; the only thing it can do is reduce the entropy of the distribution $p(I \mid S)$.
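The claim that a test only reduces the entropy of the intelligence distribution, rather than yielding a point estimate, can be sketched with a standard conjugate Gaussian update; the prior, the noise level, and the observed score below are hypothetical numbers chosen for illustration:

```python
import math

# Sketch: a noisy IQ test narrows p(I | S) but never collapses it to a point.
# All numbers are illustrative assumptions, not values from the text.
prior_mean, prior_var = 100.0, 15.0 ** 2  # p(I | S) before any test
noise_var = 10.0 ** 2                     # test measurement noise
observed_score = 92.0

# Standard conjugate Gaussian posterior (precision-weighted average).
posterior_var = 1.0 / (1.0 / prior_var + 1.0 / noise_var)
posterior_mean = posterior_var * (
    prior_mean / prior_var + observed_score / noise_var
)

def gaussian_entropy(var):
    """Differential entropy of a Gaussian in nats: 0.5 * ln(2*pi*e*var)."""
    return 0.5 * math.log(2 * math.pi * math.e * var)

# The observation reduces entropy (uncertainty) but leaves a distribution.
print(gaussian_entropy(posterior_var) < gaussian_entropy(prior_var))  # True
print(posterior_var > 0)  # True: still a distribution, not a point estimate
```

The posterior mean lies between the prior mean and the observed score, and the posterior variance is strictly positive: the test informs, but does not determine, the latent intelligence.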
In order to carry out this analysis, we use some concepts from probability theory that we now review. The first one is the amount of information needed to encode a probability distribution, also known as entropy. The entropy $H(X)$ can be viewed as a measure of information for a probability distribution $p(x)$ associated with a random variable X; that is, it is self-information, and it can be used as a measure of the uncertainty of the random variable X. When the random variable is continuous, we refer to the entropy as differential entropy. The entropy of a uni-dimensional continuous random variable X with a probability density function $p(x)$, or differential entropy $h(X)$, is given by the following expression:
$$h(X) = -\int_{S} p(x) \log p(x) \, dx,$$
where S is the support of the random variable X, that is, the space where $p(x)$ is defined. The entropy $H(X)$ is useful to model the following relation: if a random variable X has high entropy $H(X)$, we have little information about the values that it may take. On the other hand, if a random variable X has low entropy $H(X)$, we have a great deal of information about the potential values that X can take. In other words, higher knowledge of a random variable implies lower entropy and vice versa. Another concept from information theory that we use in this work is the mutual information $I(X; Y)$ of two random variables X and Y. Mutual information is defined as the amount of information that a random variable X contains about another random variable Y; it is the reduction in the uncertainty of one random variable X due to knowledge of the other, and it is a symmetric function. Consider two random variables X and Y with a joint probability density function $p(x, y)$ and marginal probability density functions $p(x)$ and $p(y)$. The mutual information $I(X; Y)$ is the relative entropy between the joint distribution $p(x, y)$ and the product of the marginal distributions $p(x)$ and $p(y)$:
$$I(X; Y) = \int \int p(x, y) \log \frac{p(x, y)}{p(x)\,p(y)} \, dx \, dy.$$
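These definitions can be checked numerically in their discrete form; the helper functions and toy distributions below are our own illustrative sketch, not part of the original formalism:

```python
import math

def entropy(p):
    """Shannon entropy H(X) = -sum p(x) log2 p(x) of a discrete distribution."""
    return -sum(px * math.log2(px) for px in p if px > 0)

def mutual_information(joint, px, py):
    """I(X; Y) = sum_{x,y} p(x,y) log2( p(x,y) / (p(x) p(y)) )."""
    mi = 0.0
    for i, row in enumerate(joint):
        for j, pxy in enumerate(row):
            if pxy > 0:
                mi += pxy * math.log2(pxy / (px[i] * py[j]))
    return mi

# A fair coin has maximal entropy of 1 bit.
print(entropy([0.5, 0.5]))  # 1.0

# Independent X and Y: the joint factorizes, so mutual information is 0.
px, py = [0.5, 0.5], [0.25, 0.75]
joint_indep = [[px[i] * py[j] for j in range(2)] for i in range(2)]
print(mutual_information(joint_indep, px, py))  # 0.0

# Perfectly dependent X = Y: I(X; Y) = H(X) = 1 bit.
joint_dep = [[0.5, 0.0], [0.0, 0.5]]
print(mutual_information(joint_dep, [0.5, 0.5], [0.5, 0.5]))  # 1.0
```

The two extreme cases bracket the behavior used in the argument below: mutual information vanishes exactly when knowing one variable reduces no uncertainty about the other.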
Concretely, we define the information gain as the amount of information we gain about one random variable by knowing the value of the other.
According to Russell, we know that human beings are likely to be conscious, so we denote being a human being by the dichotomous random variable B. Then, $p(C = 1 \mid B = 1)$ is high independently of the degree of intelligence. More technically, the information gain of the intelligence degree I over consciousness, given that the entity is a human being, is 0, i.e., $I(C; I \mid B = 1) = 0$. In other words, the entropy of the conditional probability distribution of consciousness, when additionally conditioned on the degree of computational intelligence of subject S (which is also a random variable, as we do not have direct access to it), remains the same. Then, in our case, we can illustrate that the entropy of the consciousness random variable for humans, $H(C \mid B = 1)$, is equal to the conditional entropy of consciousness given a certain computational intelligence level I:
$$H(C \mid B = 1) = H(C \mid B = 1, I).$$
As $H(C \mid B = 1) = H(C \mid B = 1, I)$, there is no need to show that $I(C; I \mid B = 1) = 0$, as it follows immediately. Hence, we have formally shown how, in the case of human beings, the computational intelligence degree is independent of the phenomenal consciousness variable. However, until now we have only analyzed computational intelligence and phenomenal consciousness in the case where the subject is a human being. Nevertheless, important implications of this analysis need to be taken into account. For example, we now know that a low measure of computational intelligence, according to Stern’s intelligence quotient, does not preclude the subject from being conscious. Let $p(I \mid S)$ denote a probability distribution over the computational intelligence of a subject S whose density is concentrated on a low value. Concretely, we know that $p(C = 1 \mid B = 1, I) = p(C = 1 \mid B = 1)$ even in this case. We use the distribution $p(I \mid S)$ here, and not a point value $I = k$ with k a real number, because, as we have noted, current measures of intelligence are noisy lower bounds on the true intelligence of subject S, which is a random variable. Importantly, we now know with complete certainty that, in the case of disabilities or certain comatose states, a subject has phenomenal consciousness.
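A toy joint distribution makes this entropy equality concrete; the probabilities below are illustrative assumptions of ours (in particular, the value 0.99 is a stand-in, not an empirical figure):

```python
import math

def entropy(probs):
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Toy joint distribution p(C, I | B = 1) for a human being: intelligence I
# takes three levels, consciousness C is dichotomous, and the two are
# conditionally independent given B = 1 (illustrative numbers).
p_i = {"low": 0.2, "mid": 0.5, "high": 0.3}  # p(I | B = 1)
p_c = {0: 0.01, 1: 0.99}                     # p(C | B = 1)
joint = {(c, i): p_c[c] * p_i[i] for c in p_c for i in p_i}

# H(C | B = 1): entropy of the marginal of C.
h_c = entropy(p_c.values())

# H(C | B = 1, I) = sum_i p(i) * H(C | B = 1, I = i).
h_c_given_i = sum(
    p_i[i] * entropy([joint[(c, i)] / p_i[i] for c in p_c]) for i in p_i
)

# Knowing I yields zero information gain about C: the entropies coincide.
print(abs(h_c - h_c_given_i) < 1e-12)  # True
```

Because the joint factorizes, each conditional slice of C given an intelligence level reproduces the marginal of C, so the two entropies are equal term by term.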
Next, we analyze and compare the probability distributions $p(C \mid B)$ and $p(C \mid A)$. Science gives us evidence that if the entity shares features with the human being, biologically speaking, concretely the neural correlates of consciousness, the subject may be conscious. We denote by N a continuous numerical variable that represents the degree of biological similarity between the brain of the subject and the brain of a human being. Concretely, current AI systems, denoted by the dichotomous variable A, have $N \approx 0$, as deep neural networks or meta-learning methodologies are just sequences of instructions sequentially computable by Turing machines, as we have shown before, although their names may be misleading. In particular, every algorithm written on a computer can be executed by a Turing machine. Critically, the concept of the Turing machine is relevant to the question of machine consciousness because it provides a framework for thinking about the limits of computation and because it correctly models any algorithm performed on a computer. One way in which the Turing machine can be used to clarify the question of machine consciousness is by providing a way to distinguish between computational processes that are purely mechanical, such as the ones involved in intelligence as neuroscience shows [
74], and those that involve some form of conscious experience, or meta-cognition, whose physical explanation is not clear [
64]. An infinite-tape quantum Turing machine would be able to run a set of algorithms that potentially solve any problem related to intelligence, but its artificial substrate would not necessarily perceive qualia, always remaining unaware of solving any problem. Also, according to some theories of consciousness, such as the global workspace theory, consciousness involves the integration of information from different parts of the brain. From this perspective, a machine could be considered conscious if it is capable of integrating information in a similar way. However, it is not clear whether a Turing machine, which ultimately operates on a fixed set of rules, is capable of this type of integration. Further work will examine other modeling alternatives, such as those of Gualtiero Piccinini, Nir Fresco, and Marcin Milkowski.
We find a real analogy between C and N. Concretely, these variables are, according to evidence found in neurobiology, linearly correlated, i.e., $\rho(C, N) = r$, with r the correlation coefficient. However, a bird, elephant, dolphin, monkey, or cephalopod, for example, may score a low computational intelligence value I. Again, we find that conditioning the variable C on the intelligence distribution $p(I \mid S)$ does not change the entropy of the distribution:
$$H(C \mid N) = H(C \mid N, I).$$
Finally, we use the example of a meta-learning system to show that the degree of computational intelligence is not correlated with phenomenal consciousness. Concretely, a meta-learning system with $N \approx 0$ has the greatest computational intelligence known, as it is able to solve a potentially infinite set of computational problems that humans or animals have not been able to solve to date, as we have seen in previous sections. We denote the computational intelligence probability distribution of such a system by $p(I \mid A)$. However, we know that $p(C = 1 \mid A = 1) \approx 0$, independently of its degree of computational intelligence. In other words, if we condition the probability on $A = 1$, for the whole set of artificial intelligence systems, we have that
$$H(C \mid A) = H(C \mid A, I).$$
Hence, the degree of intelligence does not generate phenomenal consciousness as an epiphenomenon or by emergence. Concretely, it is in the anatomy of the biological brain, or less probably the nervous system or body, where we supposedly find, at least, the neural correlates of consciousness. Given all the information and evidence that we have provided, we can formalize that the information gain of the computational intelligence random variable, given that we know the phenomenal consciousness variable, marginalizing over the kind of entity that may have phenomenal consciousness, is 0, i.e., $I(C; I) = 0$: they are independent random variables, independently of the degree of intelligence.
From a Bayesian point of view, this information can be formalized as follows. Concerning artificial intelligence systems, let $p(H)$ be an a priori distribution representing the probability of the system being conscious, our previous beliefs coming from Russell’s analogy. Following this analogy, this probability was high, as the system is intelligent, and the complementary probability, $p(\neg H)$, was low. We have provided empirical and theoretical evidence showing that this is not true, which we formalize in the likelihood $p(E \mid H)$, with E the evidence we have illustrated in previous sections. Let $p(E)$ be the marginal likelihood representing the probability of our evidence being true, which is high since it comes from highly cited papers of various research communities such as neurobiology, psychiatry, and philosophy of mind. Lastly, let $p(H \mid E) = p(E \mid H)\, p(H) / p(E)$ be our posterior belief in the hypothesis that artificial intelligence systems are conscious. As the probability rectifier coefficient $p(E \mid H) / p(E)$ is very low, then, despite having an a priori belief supporting the hypothesis of conscious artificial intelligence systems, the posterior belief now clearly shows that $p(H \mid E) \approx 0$, mainly because computational intelligence cannot model phenomenal consciousness.
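This Bayesian update can be sketched numerically; every probability below is an illustrative stand-in for the qualitative claims above, not a measured quantity:

```python
# Bayes' rule sketch for the hypothesis H = "AI systems are phenomenally
# conscious". All numbers are illustrative assumptions.
prior_h = 0.8                 # p(H): optimistic prior from Russell's analogy
likelihood_e_given_h = 0.02   # p(E | H): the evidence is unlikely if H holds
marginal_e = 0.9              # p(E): the evidence itself is highly credible

# Posterior p(H | E) = p(E | H) * p(H) / p(E); the rectifier coefficient
# p(E | H) / p(E) is small, so the posterior collapses despite the prior.
rectifier = likelihood_e_given_h / marginal_e
posterior_h = rectifier * prior_h

print(rectifier < 0.05)       # True: low rectifier coefficient
print(posterior_h < prior_h)  # True: belief in H drops sharply
print(round(posterior_h, 4))  # 0.0178
```

Even a strongly favorable prior is overwhelmed once the likelihood of the accumulated evidence under the hypothesis is small relative to its marginal probability.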
We would like to discuss several non-conventional computing paradigms under this framework. The first one is symbolic computation. In particular, recall that the first-order logic framework is a subset of Bayesian inference: for example, the implication $A \Rightarrow B$ is equal to $p(B \mid A) = 1$. However, without having to resort to fuzzy logic, we could have a Bayesian network with $p(B \mid A) = 0.7$, which is impossible to model under classical first-order logic. As probabilistic programming is also a subset of the algorithms that can be run on a Turing machine, all the previous statements hold. Second, current deep learning models and spiking neural networks, although appealing to folk psychology and exhibiting brilliant behavior, are also reducible to binary instructions executed on a computer, very different from how a human brain works; they therefore also belong to the class of Turing machine algorithms, and the previous statements hold. Finally, we make an exception for future neuromorphic hardware, as the span of algorithms it may execute and the physical and chemical properties involved may, for all we know, give rise to consciousness. Hence, we place a non-informative prior on the emergence of consciousness in future neuromorphic hardware systems, leaving the analysis of these systems for further research.
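The contrast between first-order logic and a graded Bayesian network can be made concrete with a minimal two-node network computed by enumeration; the conditional probability table is an illustrative assumption of ours:

```python
# Two-node Bayesian network A -> B with p(B = 1 | A = 1) = 0.7, a graded
# dependency that classical first-order logic (which admits only the
# degenerate cases p = 0 or p = 1) cannot express. Numbers are illustrative.
p_a = {1: 0.4, 0: 0.6}                              # p(A)
p_b_given_a = {1: {1: 0.7, 0: 0.3},                 # p(B | A = 1)
               0: {1: 0.1, 0: 0.9}}                 # p(B | A = 0)

# Joint distribution by the chain rule, then marginal p(B = 1) by enumeration.
joint = {(a, b): p_a[a] * p_b_given_a[a][b] for a in p_a for b in (0, 1)}
p_b1 = sum(joint[(a, 1)] for a in p_a)

print(abs(sum(joint.values()) - 1.0) < 1e-9)  # True: a valid distribution
print(round(p_b1, 2))  # 0.34
```

A first-order logic encoding could only state that A implies B or that it does not; the intermediate value 0.7 requires the probabilistic representation.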