1. Introduction
The year 1935 saw the publication of two fundamental articles that illustrated the peculiarities of `quantum entanglement’. In one of these articles, Albert Einstein, Boris Podolsky, and Nathan Rosen (EPR) showed that, whenever a composite quantum system, or `entity’, made up of two individual entities is in an entangled state, the component entities exhibit a specific type of statistical correlation, known as `EPR correlations’ [1]. In the other article, Erwin Schrödinger showed that two entangled quantum entities, though separated in space, behave as if they were actually `non-separated’ [2]. Then, in 1951, David Bohm introduced the archetypical situation of a composite bipartite quantum entity made up of two spin 1/2 quantum entities that fly apart when the composite entity is in the singlet spin state [3]. If one imagines that (i) when one spin is forced “up” by the measurement apparatus applied to it, the other spin is `immediately’ forced “down”, even when no measurement is performed on it, and (ii) this process is independent of the distance between the two quantum entities, then the phenomenon reveals a sort of `spooky action at a distance’, a locution coined by Einstein to stress the strange aspects of entanglement. Next, in 1964, John Bell showed how an empirical test can be designed to detect the presence of entanglement in physical domains. More precisely, Bell derived an inequality that cannot be violated under the assumption, reasonable in classical physics, of `local separability’, or `local realism’, whereas the inequality is violated in quantum mechanics [4]. After the seminal article by Bell, it became clear that, due to entanglement, quantum entities exhibit genuinely non-classical aspects, such as `contextuality’, `non-separability’, and `non-locality’. Bell’s work was also relevant for the so-called `hidden variables programme’, initiated after the EPR article, because his results entail that any hidden-variable completion of quantum mechanics must be non-local. Finally, one can prove that the EPR correlations cannot be modelled in a classical Kolmogorovian probability framework [5,6].
In the last two decades, research on entanglement has flourished in physics. Indeed, at a theoretical level, several variants of Bell’s original inequality have been derived, now generally called `Bell’s inequalities’ (see, e.g., [7,8]). At an empirical level, several tests on micro-physical entities have been performed that confirm the predictions of quantum mechanics (see, e.g., [9,10,11]). Among Bell’s inequalities, the `Clauser–Horne–Shimony–Holt (CHSH) inequality’ [12] is particularly suited, not only for empirical tests of non-locality in quantum physics, but also as a test to detect the presence of entanglement, within and beyond physical domains.
Concerning the latter point, various theoretical and empirical studies have appeared that investigate the presence of entanglement in conceptual-linguistic domains, mainly in the combination of two concepts (see, e.g., [13,14,15,16,17,18,19,20,21,22,23,24] and references therein). This investigation fits a growing research programme, the `quantum cognition programme’, that applies the mathematical formalism of quantum mechanics to the modelling of high-level cognitive processes, such as perception, categorization, language, judgement, and decision making (see, e.g., [25,26,27,28,29,30,31,32,33,34] and references therein). In particular, some of us have put forward a `realistic-operational approach’ to human cognition, inspired by physics. Indeed, as occurs in a physics test, a cognitive test consists of an `entity’, whose nature is conceptual, which (i) is prepared in a given `state’, (ii) undergoes a `measurement’ process in which generally a `context’ plays a role, and (iii) is such that statistics of `measurement outcomes’ can be collected. Thus, a relevant aspect of the approach, which makes it different from other approaches to cognition, is that, in it, a concept is an entity in a defined state, which can change under the influence of a context [35]. In view of the above analogy with physics, the realistic-operational approach provides a general guiding scheme that enables modelling conceptual entities by means of the formalism of quantum mechanics in Hilbert space [25,29,34]. We add that, throughout this article, we write concepts using capital letters and italics, e.g., Animal, Fruit, Vegetable, etc. This is frequently the way of referring to concepts in cognitive psychology.
Regarding specifically the identification of entanglement in conceptual combinations, we have performed various `Bell-type tests’ in which we have tested the presence of entanglement by means of a violation of the CHSH inequality. These tests include text-based [15,16,22,23] and video-based [24] cognitive tests on human participants, information retrieval tests on corpora of English-language documents [21], and image retrieval tests on web search engines [20]. We stress that information retrieval tests on corpora of documents show deep analogies with cognitive tests on human participants. Indeed, since texts are written by human beings, hence human minds, the written texts that are collected in corpora can be regarded as taped conversations between humans in a structured form. Therefore, we expect that, despite specific differences, cognitive tests on human participants and information retrieval tests on corpora of documents will exhibit very similar patterns, in particular with regard to the identification of conceptual-linguistic structures. We have particularly tested the combination The Animal Acts, meant as a composite bipartite conceptual entity made up of the individual conceptual entities Animal and Acts, where the term “acts” refers to the action of emitting a sound by the animal. We have also tested the combination The Animal eats the Food, meant as a composite bipartite conceptual entity made up of the individual conceptual entities Animal and Food (see Section 2).
All Bell-type tests significantly violated the CHSH inequality, thus indicating the presence of entanglement in conceptual-linguistic domains too, similar to how the violation of the CHSH inequality in physical domains is provoked by entanglement. However, we also found a systematic violation of the marginal law conditions and a frequent violation of the CHSH inequality beyond the so-called `Cirel’son’s bound’ [36,37] (see again Section 2). These two results were unexpected from the point of view of quantum physics. This led us to work out a new theoretical perspective to rigorously characterize entanglement in all its aspects of non-classicality [19,24]. While the new theoretical perspective has many analogies with some established results on physical entanglement, we also identified some points where the two perspectives, physical and conceptual-linguistic, diverge, in particular with respect to the relationship between the violation of the marginal law conditions and the possibility of `signalling’ (see Section 3).
In this article, we deepen and extend the investigation above and present the results of three new information retrieval tests we have recently performed on the conceptual combinations The Animal Acts and The Animal eats the Food, using selected corpora of the Italian language. The results strongly confirm the empirical patterns identified in Bell-type information retrieval tests on corpora of the English language. This indicates that we have identified some general and deep structures underlying conceptual entities and their formation/combination, which are independent of the specific language, English or Italian, that is used to reveal them. In particular, we show that the CHSH inequality is significantly violated in all tests, which reveals the presence of entanglement between the component concepts (see Section 4).
Next, we apply the quantum-theoretic framework that enables the modelling of any Bell-type test and represent the collected data in Hilbert space, showing that entanglement occurs at both state and measurement levels; hence, it is stronger than the entanglement that is typically detected in quantum physics tests. More importantly, the modelling reveals that entanglement is the way to formally express `meaning’. In other words, both of the concepts Animal (Animal) and Acts (Food) carry meaning, but the combination The Animal Acts (The Animal eats the Food) also carries its own meaning, and this meaning is not simply related to the separate meanings of Animal and Acts as prescribed by classical compositional semantics. It also contains an emergent meaning almost completely caused by its interaction with the wide overall context (see Section 5).
In a recent work, we have called `contextual updating’ the complex process by which meaning is attributed to the combined concept, a process that occurs at the level of entanglement formation [38]. We believe that this is the general way to express entanglement in physical domains too and is more convincing than the typical `spooky action at a distance’ view through which physicists tend to understand the phenomenon of entanglement (see Section 6).
2. Detection of Entanglement in Physical and Conceptual-Linguistic Domains
We review here the empirical setting that is typically used to detect entanglement in both physical and conceptual-linguistic domains. This detection involves testing one of Bell’s inequalities, namely, the `CHSH inequality’ [12].
An empirical test for the detection of entanglement in micro-physical domains is called a `Bell-type test’ and requires the following steps [1,3,4,8,12]. One first considers a composite physical entity $S_{12}$, prepared in an initial state $p$ and such that the individual entities $S_1$ and $S_2$ can be recognized as component parts of $S_{12}$. Then, one performs the coincidence measurements $XY$, $X = A, A'$, $Y = B, B'$, where each $XY$ consists in performing the measurement $X$ on $S_1$, with possible outcomes $X_i$, $i = 1, 2$, and the measurement $Y$ on $S_2$, with possible outcomes $Y_j$, $j = 1, 2$. The component entities $S_1$ and $S_2$ have interacted in the past, but are spatially separated when the $XY$s are performed. If the outcomes $X_i$ and $Y_j$ can only be equal to $\pm 1$, the expected value of the coincidence measurement $XY$ is just the correlation function

$$E(XY) = \sum_{i,j=1}^{2} X_i Y_j \, \mu(X_i Y_j) = \mu(X_1 Y_1) - \mu(X_1 Y_2) - \mu(X_2 Y_1) + \mu(X_2 Y_2) \qquad (1)$$

where $\mu(X_i Y_j)$ is the joint probability of obtaining the outcome $X_i$ in a measurement of $X$ on $S_1$ and $Y_j$ in a measurement of $Y$ on $S_2$. Next, one calculates the quantity

$$\Delta_{CHSH} = E(A'B') + E(A'B) + E(AB') - E(AB) \qquad (2)$$

called the `CHSH factor’, and inserts it into the CHSH inequality

$$-2 \le \Delta_{CHSH} \le 2 \qquad (3)$$

The inequality in Equation (3) follows from an assumption of `local separability’, or `local realism’ [1,4], which is reasonable in classical physics. Equivalently, Equation (3) follows from the requirement that a classical Kolmogorovian model of probability exists for the measured correlations between $S_1$ and $S_2$ [5,6]. We finally notice that the CHSH factor in Equation (2) is mathematically bounded by the values $-4$ and $+4$.
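The computation of the correlation functions and of the CHSH factor described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names, the nested-list layout of the joint probabilities (index 0 for the outcome +1, index 1 for the outcome −1), and the placement of the minus sign on the AB term are assumptions, and the probability tables are made-up examples, not data from any test.

```python
def correlation(mu):
    """Expectation value of a coincidence measurement with outcomes +1/-1.

    mu[i][j] is the joint probability of outcome i for X and outcome j for Y,
    where index 0 stands for +1 and index 1 stands for -1 (assumed layout).
    """
    return mu[0][0] - mu[0][1] - mu[1][0] + mu[1][1]


def chsh_factor(mu_ab, mu_abp, mu_apb, mu_apbp):
    """CHSH factor with the minus sign on the AB term (assumed convention)."""
    return (correlation(mu_apbp) + correlation(mu_apb)
            + correlation(mu_abp) - correlation(mu_ab))


# Made-up probability tables, for illustration only.
anti = [[0.0, 0.5], [0.5, 0.0]]          # perfect anticorrelation
corr = [[0.5, 0.0], [0.0, 0.5]]          # perfect correlation
uniform = [[0.25, 0.25], [0.25, 0.25]]   # no correlation at all

# Anticorrelation in AB and correlation elsewhere reaches the algebraic bound.
delta_max = chsh_factor(anti, corr, corr, corr)
# Uniform tables give a vanishing CHSH factor.
delta_flat = chsh_factor(uniform, uniform, uniform, uniform)
print(delta_max, delta_flat)
```

The first example shows that nothing in the arithmetic alone prevents the factor from reaching 4; the restriction to the interval between −2 and 2 comes from the classicality assumption, not from the definition.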
The standard formulation of quantum mechanics associates the entities $S_1$ and $S_2$ with the complex Hilbert spaces $\mathcal{H}_1$ and $\mathcal{H}_2$, respectively; hence, the composite entity $S_{12}$ is associated with the tensor product Hilbert space $\mathcal{H}_1 \otimes \mathcal{H}_2$. The possible (pure) states of $S_1$ and $S_2$ are represented by unit vectors of $\mathcal{H}_1$ and $\mathcal{H}_2$, respectively, and the measurements that can be performed on $S_1$ and $S_2$ are represented by self-adjoint operators on $\mathcal{H}_1$ and $\mathcal{H}_2$, respectively. The states of $S_{12}$ that are represented by product vectors of $\mathcal{H}_1 \otimes \mathcal{H}_2$ are called `product states’, while the states that cannot be represented by product vectors are called `entangled states’. Analogously, the measurements on $S_{12}$ that are represented by product self-adjoint operators on $\mathcal{H}_1 \otimes \mathcal{H}_2$ are called `product measurements’. However, there also exist non-product self-adjoint operators on $\mathcal{H}_1 \otimes \mathcal{H}_2$, which thus represent `entangled measurements’. In an entangled measurement, `at least’ one eigenvector of the corresponding self-adjoint operator represents an entangled state [8]. In the Bell-type test above, $\mathcal{H}_1$ and $\mathcal{H}_2$ are both isomorphic to the complex Hilbert space $\mathbb{C}^2$ of all ordered pairs of complex numbers; hence, $\mathcal{H}_1 \otimes \mathcal{H}_2$ is isomorphic to the complex Hilbert space $\mathbb{C}^2 \otimes \mathbb{C}^2$.
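The distinction between product and entangled states can be made concrete for two-outcome measurements. The sketch below assumes the standard identification of the tensor product of two copies of the space of ordered pairs with the space of ordered 4-tuples, under which the tensor product of (x0, x1) and (y0, y1) is (x0·y0, x0·y1, x1·y0, x1·y1). Under that assumed identification, a vector (a, b, c, d) is a product vector exactly when a·d − b·c = 0. The function name and the tolerance are hypothetical choices made for illustration.

```python
def is_product_vector(psi, tol=1e-12):
    """Return True if psi = (a, b, c, d) factorizes as a tensor product.

    Assumes the standard identification
    (x0, x1) (x) (y0, y1) = (x0*y0, x0*y1, x1*y0, x1*y1),
    under which psi is a product vector exactly when a*d - b*c = 0.
    """
    a, b, c, d = psi
    return abs(a * d - b * c) < tol


s = 2 ** -0.5
singlet = (0, s, -s, 0)            # singlet spin state in the assumed basis
product = (0.5, 0.5, 0.5, 0.5)     # (s, s) (x) (s, s), a product state

singlet_is_product = is_product_vector(singlet)
product_is_product = is_product_vector(product)
print(singlet_is_product, product_is_product)
```

Because the test depends on the chosen identification of the two spaces, a different isomorphism would in general classify the same 4-tuple differently.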
The CHSH inequality in Equation (3) is violated in quantum mechanics, which is interpreted as due to the presence of entanglement between the entities $S_1$ and $S_2$ that are recognized as component parts of $S_{12}$. This entanglement is typically attributed to the initial state $p$ of the composite entity $S_{12}$ being the singlet spin state, i.e., a maximally entangled state, and the coincidence measurements $XY$ being product measurements, $X = A, A'$, $Y = B, B'$.
In the situation of the singlet spin state and product coincidence measurements:
- (i) The CHSH factor can reach the value $2\sqrt{2}$, which is called `Cirel’son’s bound’ [36,37], as it is typically considered the maximum value reachable in quantum mechanics in the presence of product measurements;
- (ii) The conditions that, for every $i, j = 1, 2$, $X, X' = A, A'$, $Y, Y' = B, B'$, $X \ne X'$, $Y \ne Y'$,

$$\sum_{j=1}^{2} \mu(X_i Y_j) = \sum_{j=1}^{2} \mu(X_i Y'_j) \qquad (4)$$

$$\sum_{i=1}^{2} \mu(X_i Y_j) = \sum_{i=1}^{2} \mu(X'_i Y_j) \qquad (5)$$

are trivially satisfied and are called the `marginal law’, or `no-signalling’ [7,8,39], or `marginal selectivity’ [40,41] conditions. A violation of Equations (4) and (5) is typically considered uninteresting in physics, as it would entail the possibility of `signalling’. We will return to points (i) and (ii) in Section 3.
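Points (i) and (ii) can be checked numerically for the singlet spin state. The sketch below uses the textbook joint probabilities for spin measurements along coplanar directions separated by an angle θ, namely (1 − cos θ)/4 for equal outcomes and (1 + cos θ)/4 for opposite outcomes. The specific measurement angles, the function names, and the sign convention of the CHSH factor (minus sign on the AB term) are assumptions chosen for illustration.

```python
import math


def singlet_joint(theta_x, theta_y):
    """Joint probabilities for coplanar spin measurements on the singlet state.

    Index 0 stands for outcome +1 and index 1 for outcome -1.
    """
    c = math.cos(theta_x - theta_y)
    same, diff = (1 - c) / 4, (1 + c) / 4
    return [[same, diff], [diff, same]]


def correlation(mu):
    """Expectation value of a coincidence measurement with outcomes +1/-1."""
    return mu[0][0] - mu[0][1] - mu[1][0] + mu[1][1]


def marginals(mu):
    """Marginal probabilities of the +1 outcome for X and for Y."""
    return mu[0][0] + mu[0][1], mu[0][0] + mu[1][0]


# Assumed measurement angles, chosen so that the CHSH factor is maximal.
A, A_p, B, B_p = 0.0, -math.pi / 2, math.pi / 4, 3 * math.pi / 4

tables = {
    "AB": singlet_joint(A, B), "AB'": singlet_joint(A, B_p),
    "A'B": singlet_joint(A_p, B), "A'B'": singlet_joint(A_p, B_p),
}

delta = (correlation(tables["A'B'"]) + correlation(tables["A'B"])
         + correlation(tables["AB'"]) - correlation(tables["AB"]))
all_marginals = [m for t in tables.values() for m in marginals(t)]

print(delta, 2 * math.sqrt(2))   # the two values agree up to rounding
print(all_marginals)             # every marginal equals 1/2 up to rounding
```

Since every marginal probability equals 1/2 regardless of the partner measurement, the marginal law holds in all four coincidence measurements, while the CHSH factor exceeds 2.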
Bell-type tests have been extensively performed in micro-physical domains, mainly to identify the phenomenon of `non-locality’, and they all confirm the predictions of quantum mechanics (see, e.g., [9,10,11]).
Coming to conceptual-linguistic domains, we performed several Bell-type tests in the form of cognitive tests on human participants [15,16,22,23,24], document retrieval tests on corpora of English-language documents [21], and image retrieval tests on web search engines [20,23]. We particularly tested the conceptual combination The Animal Acts, meant as a composition of the individual conceptual entities Animal and Acts, where the term “acts” refers to the possible sounds that can be emitted by the animal, and The Animal eats the Food, meant as a composition of the individual conceptual entities Animal and Food. The idea was to perform measurements on the composite entity, i.e., the conceptual combination, which were the conceptual-linguistic analogue of the coincidence measurements described above, and test the CHSH inequality to identify the possible presence of entanglement. We found a systematic violation of the CHSH inequality, in some cases beyond Cirel’son’s bound, together with a systematic violation of the marginal law conditions.
While the violation of the CHSH inequality was in substantial agreement with the predictions of quantum mechanics, indicating a non-classical situation where entanglement occurs, as in physical domains, the additional violations of the marginal law conditions and Cirel’son’s bound were somewhat unexpected, as they are not believed to occur in physical domains (see points (i) and (ii) above). These findings led us to start a theoretical analysis of a problem that is connected with entanglement and is usually overlooked in quantum mechanics, namely, the `identification problem’, that is, the problem of identifying individual entities that are the component parts of a composite entity by performing measurements on the latter [16]. Thus, we elaborated a general theoretical framework to model any Bell-type test, independently of the nature, physical or conceptual-linguistic, of the entities involved, within the formalism of quantum mechanics in Hilbert space [16,19]. In this theoretical framework, one applies the quantum mechanical prescription that the composite entity $S_{12}$ is associated with a complex Hilbert space whose dimension is determined by the number of distinct outcomes of the measurements performed on $S_{12}$. In the case of a Bell-type test, each coincidence measurement $XY$, $X = A, A'$, $Y = B, B'$, has four distinct outcomes; hence, $S_{12}$ should be associated with the Hilbert space $\mathbb{C}^4$ of all ordered 4-tuples of complex numbers. Only when attempting to identify two individual entities, $S_1$ and $S_2$, as parts of $S_{12}$, does one consider possible isomorphisms with the tensor product Hilbert space $\mathbb{C}^2 \otimes \mathbb{C}^2$, where each copy of $\mathbb{C}^2$ takes into account the fact that measurements with two distinct outcomes can be performed on $S_1$ and $S_2$ in a Bell-type test. Additionally, it is only at the stage in which individual entities are identified from measurements performed on the composite entity that entanglement may arise. We proved that, in general, no unique isomorphism exists between $\mathbb{C}^4$ and $\mathbb{C}^2 \otimes \mathbb{C}^2$, which is the reason why, from a mathematical point of view, different ways exist to account for entanglement being present within the composite entity $S_{12}$ with respect to the individual entities $S_1$ and $S_2$ that are identified as parts of $S_{12}$.
To formalize these more general situations that appear in conceptual-linguistic domains, we needed a more rigorous characterization of the non-classical aspects of entanglement than the intuitive, but somewhat misleading, picture of entanglement that is typically provided in physical domains. As we will explain in Section 3, we found this characterization in the fact that entanglement formalizes the non-classical situation in which the probabilities of a coincidence measurement on $S_{12}$ cannot be written as products of probabilities of measurements on the individual entities $S_1$ and $S_2$ that are identified as parts of $S_{12}$. Hence, entanglement is a `relational property’ between the coincidence measurements and the measurements on the component entities. Only when the marginal law conditions in Equations (4) and (5) are satisfied in all coincidence measurements can the entanglement of these different coincidence measurements be captured in the state of the composite entity. Indeed, if the marginal law conditions are satisfied in all coincidence measurements, then one can prove that a unique isomorphism exists that connects $\mathbb{C}^4$ with $\mathbb{C}^2 \otimes \mathbb{C}^2$, in which case $\mathbb{C}^4$ can be directly identified with $\mathbb{C}^2 \otimes \mathbb{C}^2$. This means that the situation typically reported in quantum mechanics, namely, entanglement as a consequence of an initial entangled state and product coincidence measurements, is not the general one [16]. In the general situation in which the marginal law conditions are empirically violated, no unique isomorphism exists between $\mathbb{C}^4$ and $\mathbb{C}^2 \otimes \mathbb{C}^2$; hence, entanglement cannot be captured only by the initial state. As a matter of fact, empirical violations of Equations (4) and (5) have been identified in conceptual-linguistic domains, as anticipated above, but also in physical domains [42,43,44,45,46]. However, little attention has been devoted so far to the violation of the marginal law conditions in physical tests of entanglement, because the latter has been attributed to artefacts of the measurement process [43].
On the other hand, the simultaneous violations of the CHSH inequality, the marginal law conditions, and Cirel’son’s bound led us to investigate in depth the identification problem above, reaching conclusions on entanglement as a phenomenon that diverge in some aspects from the typical tenet of quantum mechanics summarized at the beginning of this section. Illustrating this new theoretical perspective on entanglement will be the aim of Section 3.
3. A New Theoretical Perspective on Entanglement
Our general theoretical perspective on entanglement as a phenomenon appearing in both physical and conceptual-linguistic domains was motivated by the awareness that the phenomenon is `holistically deeper’ than the intuitive picture typically provided by quantum physicists (see [16,19,24]). This led us to reconsider the following elements.
- (i) When investigating if/how the phenomenon occurs for conceptual entities, which, unlike physical entities, cannot be localized in space, one has to distil the non-classical aspects of entanglement that are independent of the presence in space of the entities under study. As such, one needs to dissociate entanglement from the physical notion of `non-locality’.
- (ii) Because of (i), the typical relationship between the marginal law conditions and no-signalling also has to be carefully re-analysed, because it mainly relies on the intuitive, but in our opinion misleading, picture of two physical entities that exist separately and are localized in space, hence can exchange signals.
- (iii) Though in Bell-type tests on conceptual entities one still considers composite entities that are bipartite, i.e., consist of two component entities, one does not generally know how the two component entities relate to each other and to the composite entity.
Points (i)–(iii) led us to look for a mathematically more rigorous way to introduce entanglement that was closer to its underlying nature, independently of its physical or conceptual-linguistic realization.
Let us now consider a bipartite entity, prepared in a defined state and such that coincidence measurements can be performed on it and probabilities can be defined as large number limits of relative frequencies of these coincidence measurements. We agree that entanglement is present if the probabilities obtained from the coincidence measurements on the bipartite entity cannot be factorized as the product of the probabilities obtained from the individual, or separate, measurements that compose the coincidence measurements.
This characterization of entanglement incorporates its non-classical aspects because, if classical physical entities are separated from each other in space, then the probabilities above do factorize. In quantum mechanics too, when a bipartite entity is in a product state and the coincidence measurements are represented by product self-adjoint operators, the probabilities above factorize. This means that, in a situation modelled within the quantum formalism, if the probabilities of the coincidence measurements do not factorize, at least one of these two conditions, product initial state or product coincidence measurements, fails. The most studied failing case is when the state of the bipartite entity is not a product state, and is thus an entangled state. Little attention has been paid to the situation where the coincidence measurements are not represented by product self-adjoint operators, which also entails a lack of factorization of probabilities.
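The factorization criterion can be illustrated with a short Python sketch. It checks, for a single coincidence measurement, whether each joint probability equals the product of the corresponding marginal probabilities; for normalized probabilities, this is the only factorization that can exist. The function name, the tolerance, and the two probability tables are assumptions introduced for illustration, and the tables are made-up examples rather than data from the tests discussed here.

```python
def factorizes(mu, tol=1e-9):
    """Check whether mu[i][j] == (row marginal i) * (column marginal j)."""
    n_rows, n_cols = len(mu), len(mu[0])
    rows = [sum(mu[i][j] for j in range(n_cols)) for i in range(n_rows)]
    cols = [sum(mu[i][j] for i in range(n_rows)) for j in range(n_cols)]
    return all(abs(mu[i][j] - rows[i] * cols[j]) < tol
               for i in range(n_rows) for j in range(n_cols))


# Made-up example: marginals (0.4, 0.6) and (0.3, 0.7) combined independently.
independent = [[0.12, 0.28], [0.18, 0.42]]
# Made-up example: the two outcomes always coincide.
correlated = [[0.5, 0.0], [0.0, 0.5]]

independent_ok = factorizes(independent)
correlated_ok = factorizes(correlated)
print(independent_ok, correlated_ok)
```

In the sense used in this section, the second table would signal entanglement for that coincidence measurement, whereas the first would not.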
Let us then come to the relationship between entanglement and the violation of the CHSH inequality within our general perspective. We refer to Pitowsky’s theory of correlation polytopes and their connection with classical Kolmogorovian probabilities [6]. In this theory, the CHSH inequality is equivalent to the existence of a classical Kolmogorovian probability model for the corresponding probabilities. Hence, a violation of the CHSH inequality is sufficient for the presence of a non-classical, possibly quantum, probability model.
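One direction of this statement can be checked by brute force. In a classical model, the correlations are mixtures of deterministic assignments of the values +1 or −1 to the four measurements, so it is enough to evaluate the CHSH expression on the sixteen deterministic assignments. The sketch below does this; the function name and the sign convention (minus sign on the AB term) are assumptions made for illustration.

```python
from itertools import product


def chsh(a, a_p, b, b_p):
    """CHSH expression for one deterministic assignment of +1/-1 values."""
    return a_p * b_p + a_p * b + a * b_p - a * b


# Enumerate all 16 deterministic assignments to A, A', B, B'.
values = [chsh(*assignment) for assignment in product((+1, -1), repeat=4)]
lowest, highest = min(values), max(values)
print(lowest, highest)
```

Every deterministic assignment yields either −2 or 2, and averaging over assignments cannot leave that interval, so any probability model built from such mixtures satisfies the CHSH inequality.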
Little attention has been devoted so far to the situation where entanglement has its origin in the violation of both the CHSH inequality and the marginal law conditions. The reason is that the violation of the marginal law conditions, which are also called the no-signalling conditions (see Section 2), is typically interpreted as indicating the presence of signalling; hence, it is marked as trivially uninteresting by physicists. We believe that this is not necessarily true, because the marginal law conditions only constitute a `sufficient’, but `not necessary’, condition for no-signalling. Hence, a violation of the marginal law conditions does not entail in itself the presence of signalling. In other words, a violation of the marginal law conditions cannot be used to claim that signalling must take place. Whether signalling is present in such a situation requires an investigation of a different type. Because, in the many models we built over the years, the violation of the marginal laws was quite obviously due to a lack of symmetry, we have every reason to believe that signalling does not necessarily take place in these models. However, it is always possible that someone finds a not-yet-identified way to use the models to send a signal. Hence, we do not exclude this possibility. On the other hand, we have studied in detail examples of Bell-type situations where a violation of the CHSH inequality occurs together with a violation of the marginal law conditions [47,48,49], which allowed us to conclude that the violation of the marginal law conditions is due to a lack of some form of symmetry, rather than a consequence of the existence of signalling. In these examples, the violation of the CHSH inequality was interpreted as a genuine expression of entanglement provoked by the presence of `potential correlations which are only actualized when the coincidence measurements are performed’. This already provided an explanation for the appearance of entanglement that was more palatable than the typical `spooky action at a distance’ view.
That the violation of the marginal law conditions is only a secondary effect, due to a lack of enough symmetry in the bipartite entity under study, and not a primary effect responsible for signalling, is also evident from the fact that, in some of our previous tests, the CHSH inequality is violated by an amount that exceeds Cirel’son’s bound, whereas this is not the case for micro-physical entities that are entangled. We can understand the condition for the existence of this bound when we consider Cirel’son’s proof of it in detail. We can then note that it is necessary to be able to represent the four considered coincidence measurements present in the CHSH inequality expression by one self-adjoint operator in the considered Hilbert space. This indicates the supposed presence of a very large and rigorous quantum coherence that incorporates the four coincidence measurements simultaneously and independently of how and when they may be performed separately. This very large internal coherence is not easy to produce in conceptual-linguistic tests.
We have thus summarized the key points in which our general perspective on entanglement differs from the typical view of entanglement in physics. As mentioned above, there is one important point that arose from our research on entanglement in conceptual-linguistic domains, namely, a deeper holistic nature is present when two entities entangle each other. The entangled entity is more than a new entity in itself only still connected to the original entities by which it was formed in a very specific way, and this has become even clearer to us by analysing more deeply how concepts entangle in human language. As mentioned in Section 1, we have recently introduced a phenomenon of `contextual updating’ taking place in human language relative to the global meaning carried by the whole context, which contains an important part of our new understanding. We also believe that the phenomenon of contextual updating captures the most important part of `contextuality’ as it occurs in cognitive domains, because contextual updating entails that the context at play is exactly the `context of the meaning-coherent structures that are present when an individual participates in a cognitive test.’ Contextual updating also reveals that contextuality cannot be reduced to conditions on probabilities. Rather, its proper formulation needs the more general realistic-operational approach sketched in Section 1. That the marginal law conditions are also violated, as a consequence of the entangled entity behaving fully as a new entity relative to the global meaning context, is clearly identified in language, and it is also clear that this violation is not related to the presence of signals. We believe that a similar process takes place when two physical entities become entangled; namely, a new entity is created that forms itself not primarily with respect to the two component entities, but contextually with respect to the global quantum coherence present, in which the two component entities will generally play an important role, but, in principle, not only they [38] (see Section 6). In this sense, we believe that the situation where both the CHSH inequality and the marginal law conditions are violated is the default situation in terms of entanglement.
We finally come to the mathematical representation in Hilbert space of the general perspective on entanglement summarized in this section. In that regard, the analysis of the problem of identifying component entities from measurements performed on a composite bipartite entity (Section 2) led us to work out a quantum-theoretic framework to model any Bell-type test that violates the marginal law conditions, the CHSH inequality, and Cirel’son’s bound. In this quantum-theoretic framework, we explicitly introduce entangled measurements [16,20,21,22,24]. Specifically, we have proved that, whenever two entities combine to form a composite bipartite entity, a strong form of entanglement is created between the individual entities composing it, which is such that not only the state of the composite entity is entangled, but also the coincidence measurements are entangled. Incidentally, the idea of using both entangled states and entangled measurements to model in Hilbert space situations that violate the marginal law conditions was shared by other authors (see, e.g., [18]).
We are now ready to present in detail the information retrieval tests we performed on Italian linguistic corpora with the aim of identifying conceptual entanglement. This will be the aim of Section 4.
4. Description of the Tests on Italian Linguistic Corpora
As anticipated in Section 1, we performed three Bell-type information retrieval tests using selected corpora of the Italian language, namely, the corpus “CORIS/CODIS”, the corpus “PAISÀ”, and the corpus “Italian Web 2020”. Thus, let us preliminarily provide some information about these corpora.
The corpus “CORIS/CODIS” (see the webpage https://corpora.ficlit.unibo.it/coris_eng.html) is a corpus of written Italian, which has been publicly accessible online since September 2001. Initiated in 1998 by R. Rossini Favretti, the project aimed at developing a comprehensive and representative reference corpus of contemporary written Italian, designed for ease of access and use. The corpus, now containing 165 million words, is updated every three years through an embedded monitoring corpus. It is composed of a collection of authentic and widely used electronic texts, carefully selected to reflect current usage of the Italian language.
The corpus “PAISÀ” (see the webpage https://www.corpusitaliano.it/en/index.html) is a large corpus of authentic contemporary Italian texts from the web. It was created within the project of the same name (PAISÀ is the acronym of “Piattaforma per l’Apprendimento dell’Italiano Su corpora Annotati”, or platform for learning Italian on annotated corpora) with the aim of providing a large resource of freely available Italian texts for language learning through the study of authentic text materials. The corpus has a size of around 250 million tokens and contains freely available and distributable web texts, collected in September/October 2010.
The corpus “Italian Web 2020”, also known as “ItTenTen20” (see the webpage https://www.sketchengine.eu/ittenten-italian-corpus/), is part of the corpus manager “Sketch Engine”, and more specifically, it is the Italian-language version of the “TenTen Corpus Family”, which includes more than 50 languages. All corpora in this family are prepared according to the same criteria. The Italian corpus, whose most recent version contains 12.4 billion words, is made up of texts collected from the web in November/December 2019 and December 2020. Its sample texts were checked manually, and content with poor-quality text was removed.
Let us now come to the description of the three Bell-type tests.
In the first test, we studied the Italian translation L’Animale fa un Verso of the conceptual combination The Animal Acts, considered as a composite entity made up of the conceptual entities Animal and Acts. In this test, we chose the Italian translations of the examples of animals and acts considered in our previous studies on The Animal Acts [15,16,22,23,24], and used the corpora “CORIS/CODIS” and “Italian Web 2020”. To set up a Bell-type test, we applied in each corpus the procedure illustrated in [21], as follows.
For the coincidence measurement , we calculated the number of times that each of the following strings appeared in the texts of the corpus:
: Il Cavallo Ringhia (The Horse Growls)
: Il Cavallo Nitrisce (The Horse Whinnies)
: L’Orso Ringhia (The Bear Growls)
: L’Orso Nitrisce (The Bear Whinnies)
For the coincidence measurement , we calculated the number of times that each of the following strings appeared in the texts of the corpus:
: Il Cavallo Sbuffa (The Horse Snorts)
: Il Cavallo Miagola (The Horse Meows)
: L’Orso Sbuffa (The Bear Snorts)
: L’Orso Miagola (The Bear Meows)
For the coincidence measurement , we calculated the number of times that each of the following strings appeared in the texts of the corpus:
: La Tigre Ringhia (The Tiger Growls)
: La Tigre Nitrisce (The Tiger Whinnies)
: Il Gatto Ringhia (The Cat Growls)
: Il Gatto Nitrisce (The Cat Whinnies)
For the coincidence measurement , we calculated the number of times that each of the following strings appeared in the texts of the corpus:
: La Tigre Sbuffa (The Tiger Snorts)
: La Tigre Miagola (The Tiger Meows)
: Il Gatto Sbuffa (The Cat Snorts)
: Il Gatto Miagola (The Cat Meows)
All retrieved strings, or pairs of lemmas, in the jargon of computational linguistics, were inspected manually to avoid false-positive cases. Then, for each coincidence measurement
,
,
, we calculated the relative frequency of appearance of the string
,
, which we considered, in the large number limit, as the probability of appearance
or, equivalently, as the probability that the outcome
is obtained in the coincidence measurement
.
Table 1 reports the probabilities of appearance computed in this way. In it, we indicate the outcomes in English language, for the sake of clarity.
Next, the probabilities of appearance
,
were used to calculate the expectation values, or correlation functions,
, using Equation (
1),
,
. Finally, we calculated the CHSH factor in Equation (
2) and compared it with the CHSH inequality in Equation (
3).
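The procedure above (counts, relative frequencies, correlation functions, CHSH factor) can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the counts are invented for the example and are not the data of Table 1; the function names and the labels AB, AB', A'B, A'B' for the four coincidence measurements are ours; and we assume the usual forms E = p11 - p12 - p21 + p22 for the correlation function and E(A'B') + E(A'B) + E(AB') - E(AB) for the CHSH factor, which may differ in sign convention from Equations (1) and (2).

```python
from typing import Dict, List, Sequence

# Order of outcomes inside each list: (X1Y1, X1Y2, X2Y1, X2Y2).

def relative_frequencies(counts: Sequence[int]) -> List[float]:
    """Turn raw string counts into relative frequencies (estimated probabilities)."""
    total = sum(counts)
    if total == 0:
        raise ValueError("no occurrences retrieved: probabilities are undefined")
    return [c / total for c in counts]

def expectation(probs: Sequence[float]) -> float:
    """Correlation function: +1 for outcomes (1,1) and (2,2), -1 for (1,2) and (2,1)."""
    p11, p12, p21, p22 = probs
    return p11 - p12 - p21 + p22

def chsh_factor(counts: Dict[str, Sequence[int]]) -> float:
    """CHSH factor, ASSUMING the form E(A'B') + E(A'B) + E(AB') - E(AB)."""
    e = {name: expectation(relative_frequencies(c)) for name, c in counts.items()}
    return e["A'B'"] + e["A'B"] + e["AB'"] - e["AB"]

# Hypothetical counts, invented for illustration (NOT the data of Table 1).
hypothetical_counts = {
    "AB":   [0, 6, 4, 0],
    "AB'":  [9, 0, 1, 0],
    "A'B":  [8, 0, 2, 0],
    "A'B'": [1, 0, 0, 9],
}

delta = chsh_factor(hypothetical_counts)
print(f"CHSH factor: {delta:.2f}")
print("CHSH inequality violated:", abs(delta) > 2)
print("Cirel'son's bound exceeded:", abs(delta) > 2 * 2 ** 0.5)
```

Raising an error on empty counts mirrors the practical problem discussed in this section: when a corpus returns no occurrences for a measurement, the relative frequencies, and hence the test, are undefined.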
As we can see from Table 1, in both corpora, “CORIS/CODIS” and “Italian Web 2020”, the CHSH factor exceeds the numerical value of 2, which indicates a violation of the CHSH inequality in Equation (3), hence a `deviation from classicality’ and, because of our considerations in Section 2 and Section 3, the presence of `quantum entanglement’ between the component entities Animal and Acts. In the first case, relative to the corpus “CORIS/CODIS”, the CHSH factor is equal to 3; thus, Cirel’son’s bound of 2√2 ≈ 2.83 is violated too. This result shows substantial agreement with the information retrieval tests in [21, 23], where corpora of English language were employed, and also with the cognitive tests in [22, 23, 24], where human participants were employed. In the second case, relative to the corpus “Italian Web 2020”, the CHSH factor is equal to
, which shows substantial agreement with the cognitive tests in [15, 16].
It should be noted that the strings we used occur too rarely in the corpus “PAISÀ” to allow a systematic detection of entanglement. Our searches in the other two corpora did not retrieve many entries either. This is why we decided to perform a second test on the Italian translation L’Animale fa un Verso of the conceptual combination The Animal Acts, as a variant of the original test. We identified additional animals and associated sound emissions, i.e., acts, that could provide a sufficient number of occurrences to perform the test, while replicating the methodology of the original test. Concerning the latter, we noticed that, though horses are well known in Italian culture, and their characteristic neigh and snort are commonly recognized, there were insufficient occurrences of horses whinnying and snorting in our corpora. Consequently, we replaced the horse with a more common domestic animal, the dog, which is familiar across many cultures. We selected barking as the primary vocalization and howling as the secondary one for the dog. The bear, which can growl, was replaced by the rooster, which sings, due to its greater familiarity and prevalence. The tiger, present in some Italian zoos and numerous films, was replaced by the wolf, a wild animal of comparable recognition. Then, we repeated the test using the corpora “CORIS/CODIS”, “PAISÀ”, and “Italian Web 2020”. More specifically, we allowed a maximum of 9 words between the two lemmas and then, again, removed false positives from the extraction. The results are presented in Table 2, where we indicate the outcomes in both Italian and English language, for the sake of completeness.
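To make the distance constraint concrete, the following minimal Python sketch counts how often one lemma is followed by another with at most a given number of words in between. It is an illustration only, not the tool used in the tests: the searches were run through the corpora's own query interfaces, while here we assume an already lemmatized token list, a fixed order (animal first, act second), and a hypothetical function name `count_cooccurrences`. The manual removal of false positives is not modelled.

```python
from typing import Sequence

def count_cooccurrences(tokens: Sequence[str], first: str, second: str,
                        max_gap: int = 9) -> int:
    """Count occurrences of `first` followed by `second` with at most
    `max_gap` tokens in between (ASSUMPTION: order matters, tokens are lemmas)."""
    count = 0
    for i, token in enumerate(tokens):
        if token != first:
            continue
        # Positions i+1 .. i+1+max_gap leave 0 .. max_gap tokens in between.
        window = tokens[i + 1 : i + 2 + max_gap]
        if second in window:
            count += 1
    return count

# Hypothetical lemmatized sentence, invented for illustration.
lemmas = "il cane del vicino abbaiare forte e il lupo ululare".split()

print(count_cooccurrences(lemmas, "cane", "abbaiare"))             # gap of 2 words, within 9
print(count_cooccurrences(lemmas, "cane", "abbaiare", max_gap=1))  # gap of 2 words, beyond 1
```

A wider window retrieves more candidate strings but also more false positives, which is why a manual inspection step follows the extraction.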
As we can see from Table 2, in all three corpora, “CORIS/CODIS”, “PAISÀ”, and “Italian Web 2020”, the CHSH factor exceeds the numerical value of 2, which indicates a violation of the CHSH inequality in Equation (3). In addition, the CHSH factor oscillates around Cirel’son’s bound. In all cases, however, a strong deviation from classicality occurs, which indicates that entanglement is again at play in the conceptual combination The Animal Acts.
Finally, in the third test, we investigated the Italian translation L’Animale mangia il Cibo of the conceptual combination The Animal eats the Food, which we considered as a composite entity made up of the individual conceptual entities Animal and Food. We used the same examples of animals and food as in [22] and worked on the corpus “Italian Web 2020”, where we allowed a maximum distance of 4 words between the subject and the verb and a maximum distance of 5 words between the verb and the object. Furthermore, we included the most common synonyms of the word “eat”, such as “nourish” and “feed”. To set up a Bell-type test, we applied the procedure illustrated above, as follows.
For the coincidence measurement , we calculated the number of occurrences of the following strings:
: Il Gatto mangia l’Erba (The Cat eats the Grass)
: Il Gatto mangia la Carne (The Cat eats the Meat)
: La Mucca mangia l’Erba (The Cow eats the Grass)
: La Mucca mangia la Carne (The Cow eats the Meat).
For the coincidence measurement , we calculated the number of occurrences of the following strings:
: Il Gatto mangia il Pesce (The Cat eats the Fish)
: Il Gatto mangia le Noci (The Cat eats the Nuts)
: La Mucca mangia il Pesce (The Cow eats the Fish)
: La Mucca mangia le Noci (The Cow eats the Nuts).
For the coincidence measurement , we calculated the number of occurrences of the following strings:
: Il Cavallo mangia l’Erba (The Horse eats the Grass)
: Il Cavallo mangia la Carne (The Horse eats the Meat)
: Lo Scoiattolo mangia l’Erba (The Squirrel eats the Grass)
: Lo Scoiattolo mangia la Carne (The Squirrel eats the Meat).
For the coincidence measurement , we calculated the number of occurrences of the following strings:
: Il Cavallo mangia il Pesce (The Horse eats the Fish)
: Il Cavallo mangia le Noci (The Horse eats the Nuts)
: Lo Scoiattolo mangia il Pesce (The Squirrel eats the Fish)
: Lo Scoiattolo mangia le Noci (The Squirrel eats the Nuts).
Also in this case, all entries were inspected manually to avoid false positives. Then, for each coincidence measurement
,
,
, we calculated the relative frequency of appearance of the string
,
, which we considered, in the large number limit, as the probability of occurrence
, i.e., the probability that the outcome
is obtained in
.
Table 3 reports the probabilities of appearance computed in this way. We again indicate the outcomes in English language, for the sake of clarity.
As we can see from Table 3, the CHSH factor in Equation (2) exceeds both the classical bound of the CHSH inequality in Equation (3) and Cirel’son’s bound, in substantial agreement with the empirical patterns identified in [22]. Also in this case, the violation of the CHSH inequality indicates the presence of entanglement in the conceptual combination The Animal eats the Food.
Summing up, the three information retrieval Bell-type tests we have analysed in this section lead one to draw the preliminary conclusion that a violation of the CHSH inequality systematically occurs. This suggests that the phenomenon of entanglement is systematically present whenever two (or more) concepts combine, and the phenomenon is independent of the language, English or Italian, that is used to reveal it. This means that we have identified here a deep conceptual structure underlying human language and, more generally, human cognition [19]. This conclusion will be supported by the elaboration of an explicit quantum-theoretic model for the data presented in this section, as shown in Section 5.
5. A Quantum Model in Hilbert Space
We elaborate a quantum mathematical model in Hilbert space for the data presented in Section 4, limiting ourselves to the corpus “Italian Web 2020”, since the latter enabled the collection of data for all tests. The model rests on the general quantum-theoretic framework that we have worked out for the modelling of any Bell-type situation [16, 19, 21, 22]. As such, it is not an `ad hoc model’, designed on purpose to fit empirical data.
As already done in Section 4, though the tests were conducted in Italian, we will use the English translations of the conceptual entities and of the examples of animals, acts, and food considered in the tests, for the sake of clarity.
In our quantum-theoretic framework, the conceptual entity under study (
The Animal Acts or
The Animal eats the Food) is a composite bipartite entity
made up of the individual entities
and
(
Animal and
Acts or
Animal and
Food, respectively). We assume that
is in an initial state
p, which corresponds to the situation of an animal that makes a sound (in
The Animal Acts case) or an animal that eats some food (in
The Animal eats the Food case). Then, for every
,
, the coincidence measurement
has four possible outcomes
,
, which correspond to the four strings, or pairs of lemmas, we looked at in
(see
Table 1 and
Table 2 for
The Animal Acts and
Table 3 for
The Animal eats the Food). Next, each outcome
of
is associated with an outcome state, or eigenstate,
, which corresponds to an example of an animal that makes a specific sound (in
The Animal Acts, e.g.,
The Bear Growls,
The Dog Barks) or an example of an animal that eats some specific food (in
The Animal eats the Food, e.g.,
The Cat eats the Fish,
The Squirrel eats the Nuts). Finally, the probability of appearance
corresponds to the probability
that the outcome
is obtained when the coincidence measurement
is performed on the composite entity
in the initial state
p.
The above identification of the empirical notions of entities, states, measurements, and probabilities of measurement outcomes enables a straightforward application of the usual quantum representation of these notions by means of the mathematics of Hilbert space, as follows.
Since each coincidence measurement
,
,
, has four outcomes, the composite entity
, meant as a `single entity’, is associated with the complex Hilbert space
of all ordered 4-tuples of complex numbers. Moreover, each state
p of
is represented by a unit vector of
, and each coincidence measurement on
is represented by a self-adjoint operator or, equivalently, by a spectral family, on
. On the other hand, each outcome
,
is obtained by juxtaposing the outcomes
and
(e.g., in
The Animal Acts, the string “horse whinnies” is obtained by juxtaposing the lemmas “horse” and “whinnies”). This defines a two-outcome measurement
X,
on the individual entity
and a two-outcome measurement
Y,
on the individual entity
. Since each of these individual entities is associated with the complex Hilbert space
of all ordered pairs of complex numbers,
, meant as a `composition of
and
’, is associated with the tensor product Hilbert space
(see
Section 2).
The vector spaces
and
are formally isomorphic, where each isomorphism maps an orthonormal (ON) basis of
onto an ON basis of
. The states of
are represented by unit vectors of
, which correspond, through the isomorphism, to vectors of
, hence to either vectors that represent product states or vectors that represent entangled states. Analogously, the vector space
of all linear operators on
is isomorphic to the tensor product vector space
, where
is the vector space of all linear operators on
. The measurements on
are represented by self-adjoint operators, which correspond, through the isomorphism, to self-adjoint operators of
, hence to either self-adjoint operators that represent product measurements or self-adjoint operators that represent entangled measurements (see again
Section 2). More precisely, let
be an isomorphism mapping a given ON basis of
onto a given ON basis of
. We say that a state
p, represented by the unit vector
, is a `product state with respect to
I’, if two states
and
, represented by the unit vectors
and
, respectively, exist such that
. Otherwise,
p is an `entangled state with respect to
I’. Analogously, we say that a measurement
e, represented by the self-adjoint operator
on
, is a `product measurement with respect to
I’, if two measurements
and
, represented by the self-adjoint operators
and
, respectively, on
exist such that
. Otherwise,
e is an `entangled measurement with respect to
I’. Thus, the notion of entanglement does depend on the `isomorphism that is used to identify individual entities within a given composite entity’.
With reference to a Bell-type setting, one can then prove the following statements [
16]:
- (i)
If the coincidence measurements
and
,
,
,
are product measurements with respect to the isomorphism
I, then, for every state
p of the composed entity
, the marginal law condition in Equation (
4) is satisfied;
- (ii)
If the coincidence measurements
and
,
,
,
are product measurements with respect to the isomorphism
I, then, for every state
p of
, the marginal law condition in Equation (
5) is satisfied;
- (iii)
If the marginal law conditions are satisfied in all coincidence measurements, then a unique isomorphism exists, which can be chosen to be the identity operator.
It follows from (i)–(iii) that, if the marginal law conditions in Equations (
4) and (
5) are empirically violated, one cannot find a unique isomorphism
such that all measurements are product measurements with respect to
I. This violation of the marginal law conditions frequently occurs in Bell-type tests on conceptual-linguistic entities, and the three tests in
Section 4 are no exception, as one can verify by directly inspecting the statistical data in
Table 1,
Table 2 and
Table 3. In these cases, one cannot explain the violation of the CHSH inequality by concentrating all the entanglement of the state-measurement situation in the initial state and assuming that all coincidence measurements are product measurements, as one does in Bell-type tests on physical entities. We add that, as anticipated in
Section 2, there are reasons to believe that the marginal law conditions are also violated in Bell-type tests on physical entities, which indicates that entangled measurements are present in the physical domain too. However, the violation of the marginal law conditions in physical tests is not large; hence, it has not been investigated in depth [
19]. Thus, if we set an isomorphism
, it is likely that both the initial state and all coincidence measurements are entangled [
16].
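The check of the marginal law conditions mentioned above can be sketched as follows. We assume the standard reading of the marginal law: the marginal probability of an outcome of one measurement must not depend on which measurement is jointly performed on the other entity. The labels AB, AB', A'B, A'B', the function names, and the probabilities are ours and purely illustrative; they are not the data of Tables 1 to 3.

```python
from typing import Dict, List, Sequence, Tuple

# Order of outcomes inside each list: (X1Y1, X1Y2, X2Y1, X2Y2).

def marginals(probs: Sequence[float]) -> Tuple[float, float]:
    """Return (probability of the first outcome of X, probability of the first outcome of Y)."""
    p11, p12, p21, p22 = probs
    return p11 + p12, p11 + p21

def marginal_law_violations(p: Dict[str, Sequence[float]], tol: float = 1e-9) -> List[str]:
    """List the marginal law conditions that fail (ASSUMED labels AB, AB', A'B, A'B')."""
    checks = {
        "A (in AB vs AB')":    (marginals(p["AB"])[0],  marginals(p["AB'"])[0]),
        "A' (in A'B vs A'B')": (marginals(p["A'B"])[0], marginals(p["A'B'"])[0]),
        "B (in AB vs A'B)":    (marginals(p["AB"])[1],  marginals(p["A'B"])[1]),
        "B' (in AB' vs A'B')": (marginals(p["AB'"])[1], marginals(p["A'B'"])[1]),
    }
    # Each marginal distribution has two entries summing to 1,
    # so comparing the first entry is enough.
    return [name for name, (m1, m2) in checks.items() if abs(m1 - m2) > tol]

# Hypothetical probabilities, invented for illustration.
skewed = {
    "AB":   [0.0, 0.6, 0.4, 0.0],
    "AB'":  [0.9, 0.0, 0.1, 0.0],
    "A'B":  [0.8, 0.0, 0.2, 0.0],
    "A'B'": [0.1, 0.0, 0.0, 0.9],
}
balanced = {name: [0.0, 0.5, 0.5, 0.0] for name in skewed}

print(marginal_law_violations(skewed))    # all four conditions fail
print(marginal_law_violations(balanced))  # no condition fails
```

When the returned list is non-empty, statements (i)–(iii) imply that not all coincidence measurements can be product measurements with respect to a single isomorphism.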
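Keeping in mind the considerations above, we associate the composite conceptual entity (The Animal Acts or The Animal eats the Food) with the complex Hilbert space $\mathbb{C}^4$. Then, let $(1,0,0,0)$, $(0,1,0,0)$, $(0,0,1,0)$, and $(0,0,0,1)$ be the unit vectors of the canonical ON basis of $\mathbb{C}^4$, and let $I:\mathbb{C}^4\to\mathbb{C}^2\otimes\mathbb{C}^2$ be the isomorphism such that the canonical ON basis of $\mathbb{C}^4$ coincides with the ON basis of the tensor product Hilbert space $\mathbb{C}^2\otimes\mathbb{C}^2$ made up of the unit vectors $(1,0)\otimes(1,0)$, $(1,0)\otimes(0,1)$, $(0,1)\otimes(1,0)$, and $(0,1)\otimes(0,1)$.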
In the ON bases above, a given state
q of the composite entity is represented by the unit vector
,
,
,
,
,
,
, where
ℜ is the real line. One easily proves that
represents a product state if and only if
Otherwise,
represents an entangled state.
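For readers who wish to see where a condition of this kind comes from, here is a short derivation in our own notation; the symbols $e_i$, $z_{ij}$, $x_i$, and $y_j$ are introduced only for this illustration and need not match those used in Equation (6). Let $e_1=(1,0)$ and $e_2=(0,1)$ be the canonical ON basis of $\mathbb{C}^2$, and expand a unit vector of the tensor product space on the basis $e_i\otimes e_j$:

```latex
\begin{align*}
|q\rangle &= z_{11}\, e_1\otimes e_1 + z_{12}\, e_1\otimes e_2
           + z_{21}\, e_2\otimes e_1 + z_{22}\, e_2\otimes e_2, \\
|q\rangle &= (x_1 e_1 + x_2 e_2)\otimes(y_1 e_1 + y_2 e_2)
  \;\Longrightarrow\; z_{ij} = x_i\, y_j, \\
z_{11}z_{22} - z_{12}z_{21} &= x_1 y_1 x_2 y_2 - x_1 y_2 x_2 y_1 = 0 .
\end{align*}
```

Conversely, if $z_{11}z_{22}-z_{12}z_{21}=0$, the $2\times 2$ matrix of coefficients $(z_{ij})$ has rank one and can be written as $z_{ij}=x_i y_j$, so that the vector is a product vector. As a check, the vector with components $z_{12}=1/\sqrt{2}$, $z_{21}=-1/\sqrt{2}$, and $z_{11}=z_{22}=0$ gives $z_{11}z_{22}-z_{12}z_{21}=1/2\neq 0$, hence it represents an entangled state.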
Next, in analogy with Bell-type tests on physical entities, we represent the initial state
p of the composite entity
by the unit vector
This is a reasonable choice. Indeed, the vector in Equation (
7) represents in quantum mechanics the maximally entangled state that corresponds to the singlet spin state and is rotationally invariant, i.e., does not privilege any ON basis representation. Shifting to the conceptual-linguistic case, this entangled state corresponds to the abstract situation of an animal that makes a sound, or an animal that eats some food, without privileging any specific example of the conceptual entities under study. In other words, the choice to represent the initial state by the unit vector in Equation (
7) corresponds to the situation in which the composite entity is open to any type of measurement involving any example of animals and sounds, or any example of animals and food. In addition, we try, in this way, to concentrate as much entanglement as possible on the initial state (see also the analysis in [
19]). However, the one in Equation (
7) is not the only possible choice. For example, an alternative choice would have been to represent the initial state of the composite entity by the unit vector
, where
and
(see the discussion in [
24], Section 5). We will not dwell on this technical aspect here, for the sake of brevity.
Further, for every
,
, we represent the coincidence measurement
by the spectral family constructed on the ON basis of the four eigenvectors
, which represent the eigenstates
introduced above, where we set, for every
,
The coefficients are such that
and
. We set
, where
, for the sake of simplicity. One easily verifies that, for every
,
,
is a product measurement if and only if all
s are product vectors. Otherwise,
is an entangled measurement.
Finally, for every
,
,
, the probability
of obtaining the outcome
in a measurement of
on the composite entity
in the state
p is given by Born’s rule of quantum mechanics, that is,
Summing up, for every measurement
, the four unit vectors in Equation (
8) have to satisfy the following sets of conditions.
- (1)
Normalization. The eigenvectors in Equation (
8) are unit vectors, that is, for every
,
,
,
- (2)
Orthogonality. The eigenvectors in Equation (
8) are mutually orthogonal, that is, for every
,
,
,
,
,
- (3)
Empirical adequacy. For every
,
,
, the probability
coincides with the empirical probability
in
Table 1,
Table 2 and
Table 3, that is,
where we have used Born’s rule in Equation (
9).
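The three sets of conditions above lend themselves to a simple numerical check. The sketch below is illustrative only: the function names are ours, the initial state is assumed to be the singlet-like vector (0, 1/√2, −1/√2, 0) in the canonical basis (our reading of the description of the initial state, to be compared with Equation (7)), and the candidate eigenvectors and "empirical" probabilities are invented for the example rather than taken from the tables.

```python
import math
from typing import List, Sequence, Tuple

Vector = Sequence[complex]

def inner(u: Vector, v: Vector) -> complex:
    """Hilbert space inner product, conjugate-linear in the first argument."""
    return sum(complex(a).conjugate() * b for a, b in zip(u, v))

def born_probability(eigenvector: Vector, state: Vector) -> float:
    """Born's rule: squared modulus of the inner product."""
    return abs(inner(eigenvector, state)) ** 2

def check_measurement(eigenvectors: List[Vector], state: Vector,
                      empirical: Sequence[float],
                      tol: float = 1e-6) -> Tuple[bool, bool, bool]:
    """Return (normalization, orthogonality, empirical adequacy) as booleans."""
    n = len(eigenvectors)
    normalized = all(abs(inner(v, v) - 1) < tol for v in eigenvectors)
    orthogonal = all(abs(inner(eigenvectors[i], eigenvectors[j])) < tol
                     for i in range(n) for j in range(i + 1, n))
    adequate = all(abs(born_probability(v, state) - mu) < tol
                   for v, mu in zip(eigenvectors, empirical))
    return normalized, orthogonal, adequate

s = 1 / math.sqrt(2)
singlet = [0, s, -s, 0]  # ASSUMED form of the initial state

# Candidate 1: the canonical basis (a product measurement).
canonical = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
print(check_measurement(canonical, singlet, [0.0, 0.5, 0.5, 0.0]))

# Candidate 2: a basis of entangled vectors (an entangled measurement).
bell_like = [[s, 0, 0, s], [s, 0, 0, -s], [0, s, s, 0], [0, s, -s, 0]]
print(check_measurement(bell_like, singlet, [0.0, 0.0, 0.0, 1.0]))
```

In practice, one would search for coefficients satisfying all three conditions for each table; the sketch only verifies a proposed solution.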
We now provide an explicit solution for the data in Table 1, Table 2 and Table 3, starting with the first, original test on The Animal Acts.
The eigenstates of the measurement
are represented by the unit vectors
By applying the entanglement condition in Equation (
6), we find that
is an entangled measurement. In addition, Equation (
6) shows a larger deviation from zero in the unit vector
, which thus represents a relatively more entangled state. Indeed, the eigenstate
corresponds to
The Horse Whinnies, which is more characteristic of the global meaning carried by
The Animal Acts. Analogously, the unit vector
represents a product state. Indeed, the eigenstate
corresponds to
The Bear Whinnies, which is less characteristic of the global meaning carried by
The Animal Acts.
The eigenstates of the measurement
are represented by the unit vectors
Also in this case,
is an entangled measurement. In addition, Equation (
6) shows a larger deviation from zero in the unit vector
, which thus represents a relatively more entangled state. Indeed, the eigenstate
corresponds to
The Horse Snorts, which is more characteristic of the global meaning carried by
The Animal Acts. Analogously, the unit vectors
and
represent product states. Indeed, the eigenstates
and
correspond to
The Horse Meows and
The Bear Meows, which are less characteristic of the global meaning carried by
The Animal Acts.
The eigenstates of the measurement
are represented by the unit vectors
Again,
is an entangled measurement. In addition, Equation (
6) shows a larger deviation from zero in the unit vector
, which thus represents a relatively more entangled state. Indeed, the eigenstate
corresponds to
The Cat Growls, which is more characteristic of the global meaning carried by
The Animal Acts. Analogously, the unit vector
represents a product state. Indeed, the eigenstate
corresponds to
The Cat Whinnies, which is less characteristic of the global meaning carried by
The Animal Acts.
Finally, the eigenstates of the measurement
are represented by the unit vectors
Further,
is an entangled measurement. In addition, Equation (
6) shows a larger deviation from zero in the unit vector
, which thus represents a relatively more entangled state. Indeed, the eigenstate
corresponds to
The Cat Meows, which is more characteristic of the global meaning carried by
The Animal Acts. Analogously, Equation (
6) shows a smaller deviation from zero in the unit vector
, which thus represents a relatively less entangled state. Indeed, the eigenstate
corresponds to
The Cat Snorts, which is less characteristic of the global meaning carried by
The Animal Acts.
The second, modified test on The Animal Acts leads to exactly the same considerations as the first one. Indeed, the eigenstates corresponding to examples whose meaning is closer to the global meaning context provided by
The Animal Acts are represented by relatively more entangled states, namely, the eigenstates corresponding to
The Dog Barks in the measurement
,
The Wolf Howls in the measurement
,
The Rooster Sings in the measurement
, and
The Cat Meows in the measurement
. Analogously, the eigenstates corresponding to examples that are farther from the global meaning context provided by
The Animal Acts are represented by relatively less entangled states, namely, the eigenstates corresponding to
The Wolf Barks in the measurement
,
The Wolf Meows in the measurement
,
The Rooster Barks in the measurement
, and
The Rooster Meows in the measurement
. For the sake of completeness, we report the complete set of eigenvectors in the following.
The third test on The Animal eats the Food completely aligns with the first two tests. Indeed, the eigenstates corresponding to examples whose meaning is closer to the global meaning of The Animal eats the Food are represented by relatively more entangled states, namely, the eigenstates corresponding to
The Cow eats the Grass in the measurement
,
The Cat eats the Fish in the measurement
,
The Horse eats the Grass in the measurement
, and
The Squirrel eats the Nuts in the measurement
. Analogously, the eigenstates corresponding to examples that are farther from the global meaning of
The Animal eats the Food are represented by relatively less entangled states, namely, the eigenstates corresponding to
The Cow eats the Meat in the measurement
,
The Cow eats the Nuts in the measurement
,
The Squirrel eats the Meat in the measurement
, and
The Horse eats the Meat in the measurement
. Also in this case, we report the complete set of eigenvectors, as follows.
The quantum mathematical model for the data presented in
Section 4 is thus completed and allows us to make some interesting considerations on the appearance of entanglement in conceptual-linguistic domains, as follows.
- (a)
The results obtained on corpora of Italian language confirm and strengthen those obtained in both cognitive tests on human participants [16, 22, 24] and information retrieval tests on corpora of English language [20, 21]. In particular, the individual concepts Animal and Acts entangle when they combine to form the combination The Animal Acts. Analogously, the individual concepts Animal and Food entangle when they combine to form the combination The Animal eats the Food. This is because the concepts Animal (Animal) and Acts (Food) carry `meaning’, but The Animal Acts (The Animal eats the Food) also carries its own meaning, which does not result from separately attributing meaning to Animal (Animal) and Acts (Food). On the contrary, the combination process `creates additional meaning’ in a way that violates the rules of classical compositional semantics. We believe that this process of `meaning attribution and creation as a result of conceptual combination can exactly be captured by the quantum phenomenon of entanglement’. Or equivalently, meaning is the conceptual-linguistic counterpart of what entanglement is in physical domains.
- (b)
In both The Animal Acts and The Animal eats the Food cases, the empirical violation of the marginal law conditions of Kolmogorovian probability forbids concentrating all the entanglement of the state-measurement situation in the initial state of the composite entity, as we have seen above. As discussed in Section 3, this violation cannot be interpreted as evidence that signalling is in place. On the contrary, we believe that the violation of the marginal law conditions indicates that we are in the presence of a stronger form of entanglement than the one typically identified in physical domains, and that this stronger form of entanglement is the most frequent one, even in Bell-type tests on physical entities. In this sense, the violation of the CHSH inequality beyond Cirel’son’s bound in some of the tests we performed should not come as a surprise; it is again a consequence of this stronger form of entanglement involving both states and measurements.
- (c)
In the quantum mathematical modelling above, most of the eigenstates of the coincidence measurements are entangled too. This can be explained by observing that, in each coincidence measurement, all possible outcomes themselves correspond to combinations of concepts; e.g., the outcome The Wolf Howls is itself a combination of the concepts Wolf and Howls. Analogously, the outcome The Cat eats the Fish is itself a combination of the concepts Cat and Fish. Thus, on the basis of the theoretical connections above between entanglement and meaning, we believe that a non-classical process of meaning attribution and creation occurs at the level of examples too. In addition, in each coincidence measurement, some eigenstates exhibit a relatively higher degree of entanglement than others. Again, this can be explained by the fact that entanglement captures meaning; hence, eigenstates with a higher degree of entanglement correspond to examples whose meaning is closer to the overall meaning carried by the corresponding composite entity.