1. Introduction
In spite of its labeling as a “mathematical theory of communication”, the theory of Claude E. Shannon appears to be primarily concerned with the relation of signals and noise, or, in terms of theory, with a mathematical definition of information. As Shannon himself [1] mentioned, his conception is hardly a full-fledged theory of communication. Although it allows grasping some technical aspects of communication, it does not account for the semantic or hermeneutic aspects, which many consider essential in communication [2,3,4,5]. In the extant literature some of these aspects are subsumed under the term meaning and, especially in traditional philosophy, are considered extra-communicational, with meaning being thought of as a sort of individual knowledge that draws on an actor’s intentions, and that finds its expression in special “illocutionary forces” (Austin) that correspond to a particular will of human actors to be understood [6]. In this sense, meaning is considered a (human) attribute of communication, which resides somewhere beyond the sphere of what (Shannon) information is able to grasp [7,8].
In opposition to these assumptions, the following considerations will try to outline a line of argument that regards meaning as not fundamentally different from Shannon-type information. With Talcott Parsons’ conception of “double contingency” [9] and especially with Niklas Luhmann’s elaboration [5], I will try to conceive meaning as a sort of second-order information which, when regarded in terms of the cybernetic concept of Eigenforms [10], might in principle be regarded as structurally equivalent to information. It should therefore be possible to subject the interplay of information and meaning (i.e., communication) to formalization, at least in terms of an emergent property of the interaction of a multitude of autonomous agents, as will be suggested with regard to a multi-agent simulation. With this, these considerations could contribute to what has been envisioned as a Unified Theory of Information [11].
However, the way meaning is conceived in this context seems to feed back on the conception of information as well. If meaning has to be seen as a dynamical entity, so does information. Moreover, the way the interplay of information and meaning is conceived in this context necessitates regarding them as interdependent and hence subject to a circularity which makes ranking them analytically in terms of cause and effect meaningless. In respect to communication, it appears to be the recursive interplay of information and meaning that matters, a complex and dynamic interplay which nevertheless tends to emerge as stabilities that seem to be best described as Eigenforms. Recently this hypothesis seems to find confirmation in experiments in AI and robotics as conducted by Luc Steels [
12] and colleagues [
13].
3. Double Contingency
The conception of “double contingency” as suggested by Talcott Parsons [
9] could shed some light on this question. This conception builds on Shannon’s conception of information and is meant to explain the emergence of human interaction.
A situation of “double contingency” confronts an interlocutor EGO with the problem of what action, be it a gesture, an utterance, or a way of behavior, to choose from the set of possible actions or behaviors, so that ALTER in her turn can choose an appropriate action from her set of options which leaves EGO with a chance to again choose an action so that ALTER again can react, and so on. The crucial point in this theoretical setting is that EGO and ALTER face a vast set of possible actions and do not know which one of them would be appropriate to establish an interaction. They do not have too few communicational options, but too many. The problem consists in the contingency of the preconditions on both sides of the potential interaction. EGO and ALTER have no information about which one of their options might provide them with a good chance to keep interacting.
In Parsons’ conception this lack of information is solved by an a priori synchronized “shared symbolic” system, a common culture for instance, or a common language, common behavioral habits, etc., which constrain the possibility spaces of the interlocutors to an extent at which at least some action becomes sufficiently likely. In short, communication should not be a problem if (1) interlocutors share the same symbolic system; and if (2) this system is finite and fixed.
However, neither condition can be presupposed. The first one could be dismissed with the objection that the fact that the interlocutors share the same symbolic system should be seen as a consequence and not as a precondition of communication. I will come back to this in Section 4 below. The second condition seems erroneous with respect to a property of natural languages and cultures that has been termed generativity [14]. This property refers to the possibility that words can always be used to define new words, and behaviors can generate and establish new behaviors. The set of options from which to choose in these cases is not finite like the number of symbols in an alphabet or the capacity of a cable. It is open-ended and potentially infinite. I will return to this in Section 8.
Therefore, the question of where the additional information for choosing appropriate actions comes from becomes complicated. This might be a reason why its origins are often regarded as lying beyond the reach of Shannon’s conception.
4. Structural Coupling
Let’s stay with the assumption of more or less (for all practical purposes) finite option sets for the moment. According to Parsons, as we have seen, additional information for choosing options from such sets is provided by a sort of
a priori of human interaction, a “shared symbolic system”, a culture for instance. In principle, Niklas Luhmann agrees with this assumption, but ties his agreement to the still more basic question on how these constraints emerged in the first place [
5]. In his opinion, these constraints, too, cannot have developed without interaction. Therefore, the situation of double contingency has to be reconsidered with actors without any pre-given relational assets. Referring to constructivism, Luhmann conceives EGO and ALTER as self-referentially closed “black boxes”, as systems which act solely on the basis of their own onboard means and perceive everything coming from “outside” as “irritation”. Luhmann uses the term irritation to indicate that, when understood stringently, this kind of system can ascribe any inputs to an “outside”, that is, to an “external world” or to another system, only retrospectively, i.e., when interaction will have taken off.
Hence systems can find ways to cope with these irritations, they can memorize these ways, and they can—as it would seem to an observer—eventually adapt to these irritations. If this happens concurrently on both sides of the EGO-ALTER relation, an observer might interpret this process as emergence of interaction. All this in spite of the fact that the systems themselves do nothing else than internally change their modes of operation, without any concept of an “outside”.
In its most abstract form, in Luhmann’s conception, these onboard means of EGO and ALTER are conceived with George Spencer-Brown’s [
15] dual act of
differentiation and
indication. Systems operate in and on their world by differentiating formerly undifferentiated aspects of it, and indicating one of them, just like an air-conditioner distinguishes too high and normal temperatures and indicates “too high” as the reason for sending an on-signal to the cooler. With each act of differentiation and indication, systems generate states (or conditions), which in the next step can be differentiated again in order to indicate one part of them, thereby generating the next state as precondition for another differentiation-indication. If the resulting sequence of differentiation-indications can be memorized, for example in the form of a particular symbol (A, not B; women, not men), the set of options of this system becomes structured. The selection of some actions from the set of options becomes more likely than that of others. Compare this to Shannon’s conception of a selection of a certain symbol from a given set being formalized as a sequence of binary decisions, and information being defined as the number of decisions needed to unambiguously determine the symbol.
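The comparison with Shannon can be made concrete with a small sketch. It assumes equiprobable symbols for the count of binary decisions, and uses the standard entropy formula to show how a structured (non-uniform) option set lowers the average number of decisions. The function names are illustrative and not taken from the model described later.

```python
import math

def binary_decisions(num_symbols):
    """Information in bits for selecting one of num_symbols equiprobable symbols."""
    return math.log2(num_symbols)

def count_halvings(num_symbols):
    """Count the yes/no decisions needed to isolate one symbol by repeated halving."""
    steps = 0
    remaining = num_symbols
    while remaining > 1:
        remaining = math.ceil(remaining / 2)  # keep the larger half in the worst case
        steps += 1
    return steps

def entropy(probabilities):
    """Average information in bits of a selection with the given probabilities."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)
```

For four equally likely options the entropy equals two bits, whereas a set in which one option has already become more likely than the others requires fewer decisions on average.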
In Luhmann’s conception, the reciprocal irritation of EGO and ALTER stands for the initiation of such a sequence of differentiation-indications. By itself an actor just acts on the basis of its own onboard means. Only to an external observer might this action look like actively irritating another actor, which in its turn (again to the observer) seems to find a way in its onboard means to cope with this irritation. The actor does not have to do so, since there is no kind of teleology whatsoever in this conception. But if it does, this might again induce an irritation, which its vis-à-vis is challenged to handle in the next moment of time. If these ways of handling irritations can be memorized, the two black boxes differentiate their possibility spaces step by step to a degree at which some actions become distinctly more probable than others. As this structuration proceeds concurrently on both sides of the “double contingency”, ALTER and EGO seem to coordinate their actions. In Luhmann’s terms: Their actions become structurally coupled. For an observer (and eventually, if ALTER and EGO reach a complexity that enables self-observation, for themselves as well) these actions begin to look like reactions. ALTER and EGO seem to react to each other and therefore to interact.
In this respect, their interaction seems to emerge as the consequence of self-contained actions, which are not based on any conception of a world “outside” of EGO and ALTER, and therefore are also not dependent on any additional information or meaning. On this abstract level, EGO and ALTER operate (and keep operating) solely in terms of pure Shannon information, with their onboard means defining their sets of options.
However, if the results of these self-contained actions can be memorized, for example in the form of symbols, they stand ready to constrain
subsequent actions, that is, to make some of these actions more likely than others. In this way,
additional information becomes available. Eventually—once actions are coupled—the set of options gets structured in a way that makes them appear to have
meaning, a meaning, which might appear to be pre-given to the current interaction. In respect to the process of coupling, however, this meaning is a
result of preceding interactions, or to use a terminological suggestion of Josef Mitterer [
16], a result of
interactions so far. Therefore, this result is “internal” to the overall interaction. As such, it is nothing but information itself and thus conforms to Shannon’s conception.
In summary, in this context meaning is seen as a consequence of a process of structural coupling, which generates n-level information. If this information sediments, for example in the form of symbols, it can provide additional information for successive (n + 1)-level interactions. Therefore (and only therefore) (n + 1)-level, and in general, higher-level interactions can be observed as meaningful communication.
5. Percepts and Concepts
In order to comprehend this process of structural coupling in its details and particularities, and in order to test its underlying assumptions, the Luhmannian conception of (a three-step differential process of) communication [5] has been modeled by way of multi-agent simulation. This model is described in detail in [17,18]. Below it will be outlined only as far as required for this context. As it turned out, in many of its aspects this model builds on premises similar to those of the Talking Heads experiment of Luc Steels [12,13]. To explain and to legitimate these premises, it seems helpful to briefly review some considerations of Robert K. Logan [19,20] concerning a difference between what he calls percepts and concepts in the way humans communicate.
Logan regards the words as used in human languages as cognitive conceptions, in short concepts, which help to deal with complex and in particular social life situations. A more basic and individual counterpart of concepts are percepts, which are conceived to be phylogenetically older than concepts, and which can be seen as individual ad-hoc reactions to situational irritations, generated on the basis of mimesis and immediate action. “Percepts are specialized, concrete and tied to a single concrete event” [20], or with Donald [21], they are “implementable action metaphors”, particular and momentary. Concepts, on the other hand, are “abstract ideas that result from the generalization of particular examples”. They became necessary when early humans began acquiring skills like tool making, fire control, group foraging and coordinated hunting, and grouped together in larger social organizations. According to Logan, the higher complexity of their interactions resulted in an “information overload” which necessitated a new abstract level of order that emerged in the form of verbal language and conceptual thinking.
Logan explains this emergence of conceptual thinking in respect to Ashby’s Law of requisite variety [
22], which states that more complex systems need more complex means of control [
19,
20]. With a slightly different emphasis, the following model focuses on the individual-society difference implied in the conception of percepts and concepts. Whereas percepts are seen as individual and situational means to cope with the world, concepts are regarded as evolving in social interaction. Percepts represent individual selections from the set of possible actions in a certain situation. They conform to a specific and particular perspective of an agent in a given moment of time. Therefore they are short-lived, but as such serve as starting points for the development of longer-lasting and socially aligned generalizations of particular situations,
i.e.,
concepts, which when memorized can start to serve as media for interaction in their own right, therefore providing next-level interactions with the additional information needed for sufficiently high probabilities of choosing appropriate actions from the set of possible actions. Concepts in this regard can be understood as “symbolically generalized interaction media” in the sense of Parsons.
However, we have to emphasize at this point that the addressed individual-society difference has no ontological essence on its own. It is used analytically, and in the model is due to technical aspects of implementation. Following Luhmann’s conception, we bear in mind that no autonomous agent, however abstractly, can be assumed prior and thus independent of its social conditions, in the same way, as society cannot be conceived prior to individuals. And as there is no legitimacy to prioritize individual traits over social ones, it will turn out that there is also no reason to prioritize information over meaning. We will come back to this point in
Section 8.
6. The Model
In the model, a population of p computer generated agents is confronted with n randomly appearing “things” which are assumed to be elements of a “world” or of an environment the agents have to survive in by distinguishing things with the help of symbols. Technically, these things are enlisted in a <thing-list> in invariant order and, for accessing them in experimentation, have the form of simple letter constellations like “aaa”, “bbb”, “ccc”, .... However, in perceiving these “things”, an agent does not refer to those letter constellations, but to a percept that is represented by a different letter constellation. These percepts are the agent’s individual way to refer to things and may have the form “AA”, “BB”, “CC”, .... Percepts are likewise enlisted in an invariant <word-list> of length n. The position of things in the <thing-list> does not correspond to the position of percepts in the <word-list>, but is randomly assigned in the model’s setup. Agents cope with things by “uttering” percepts or words, as we shall say in the following.
These words are situational. Whenever a thing appears, an agent just chooses one of the possible words from the word-list and refers to the thing. If a thing appears next time, the agent may, without consequence, choose a completely different word to refer to it. Thus, initially, words (percepts) are not a means of communication, but an internal tool for individually ordering an agent’s world. It is assumed, however, that agents utter these words “aloud”, so that they can be “heard” by other agents. An uttered word is interpreted in the abstract sense of Luhmann’s irritation, that is, as something to which an agent has no other access than via its own onboard means. These onboard means consist of an (n × n)-<probability-matrix> with n rows indicating entries for words and n columns indicating entries for things. All entries are set to zero at the start of the simulation.
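The setup described so far can be sketched as follows. This is a minimal illustration, not the original implementation: the function names `make_world` and `make_agent` are hypothetical, the letter-based naming limits n to 26, and shuffling the word-list is only one possible way to decouple the positions of things and percepts.

```python
import random

def make_world(n, rng):
    """Build a thing-list ("aaa", "bbb", ...) and a word-list ("AA", "BB", ...)."""
    things = [chr(ord("a") + i) * 3 for i in range(n)]
    words = [chr(ord("A") + i) * 2 for i in range(n)]
    rng.shuffle(words)  # assumption: positions in the two lists do not correspond
    return things, words

def make_agent(n):
    """Return an n x n probability-matrix (rows: words, columns: things), all zero."""
    return [[0.0] * n for _ in range(n)]

rng = random.Random(0)          # fixed seed only for reproducibility of the sketch
things, words = make_world(3, rng)
population = [make_agent(3) for _ in range(10)]  # p = 10 agents
```

Each agent holds its own matrix, so that a change in one agent's "memory" never affects another agent directly.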
This matrix serves as a “memory” for the agents. Whenever one of them is “irritated” by another agent uttering a percept of a thing, that is, by a word they hear, the agent checks the row of her probability-matrix corresponding to this word. Of course, initially there is no difference in this matrix. All entries are set to zero. So the agent will pick one random position in this row and interpret the heard word as indicating the thing corresponding to this position. The agent just “guesses”, and this guess will be wrong in most cases. But if by coincidence a guess is correct (with “correct” denoting a correspondence of the guess and the uttered word) the agent memorizes this “success” by increasing the corresponding matrix position by one point (in Steels’ experiment this incrementation is called “latent inhibition”). From then on, whenever the agent again is “irritated” by a word, she will find a structured probability-matrix from which she chooses the highest entry and—with a certain probability—interprets the corresponding thing as the one represented by the word.
In order to prevent an agent’s interpretation from becoming immutable already in the first steps of the process, agents are made to compare each entry to a random number between 0 and 100. The corresponding thing is assigned only if this random number is smaller than the entry. Agents thus have to increment entries up to 100 to finally be sure of common word-thing-correlations (Figure 1).
Figure 1.
Schematized probability-matrices with n = 3, (a) at setup with all entries zero; (b) at the end of a coupling process (i.e., a completed run of the model).
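The hearing procedure described above can be sketched as a single update step. The sketch rests on two assumptions that the text does not spell out: when the random number is not smaller than the highest entry, the agent falls back to a random guess, and entries are capped at 100. Words and things are represented by their list indices, and the names `interpret` and `hear` are hypothetical.

```python
import random

def interpret(matrix, word, rng):
    """Interpret a heard word: trust the highest entry of its row with probability entry/100."""
    row = matrix[word]
    best = max(range(len(row)), key=lambda thing: row[thing])
    if rng.random() < row[best] / 100.0:   # same test as "random number in 0..100 < entry"
        return best
    return rng.randrange(len(row))         # assumption: otherwise the agent just guesses

def hear(matrix, word, intended_thing, rng):
    """Let an agent hear a word, guess a thing, and memorize a successful guess."""
    guess = interpret(matrix, word, rng)
    success = guess == intended_thing
    if success:
        # increase the matrix position by one point; the cap at 100 is an assumption
        matrix[word][guess] = min(100.0, matrix[word][guess] + 1.0)
    return success
```

With an all-zero matrix the comparison never succeeds, so the first interpretations are pure guesses, as in the description of the initial state.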
Nevertheless, with this simple procedure, agents more or less rapidly “learn” to discriminate things via words, which (in most cases, see below) are commonly used. These words now are no individual percepts anymore, but generalized, socially concerted symbols in the sense of concepts as abstracted from the term used by Logan. They are not situational or subject to particular perspectives anymore, but find ongoing and society-wide use by agents that now can be observed as interacting and therefore generating additional information to orientate subsequent actions. Eventually these actions could be observed as communication [
18].
Figure 2 shows the developments of a typical run with 10 agents in a world of three things.
It should be emphasized here that this emergence of interaction depends on a (human) observer perceiving it in this way. The agents, of course, have no concept of any “inter”, nor any “self”, or any “other”. Even if this is obvious in the case of a computer-generated simulation, and to some extent maybe also in the case of social insects coordinating their actions with the help of pheromone diffusion, it might be less obvious in the case of human interaction. And it also might be misleading to speak of agents initially expressing particular (“subjective”) perspectives via percepts and then “convening” on something that looks like an “intersubjective agreement”. Whereas philosophical theories of communication prefer to regard agents as actively looking for such agreements (for “understanding”, as Habermas [8] puts it), this model, in accordance with the conception of Luhmann, assumes agents as acting solely with respect to their own onboard means, with no kind of “interest” or “intent” to interact whatsoever. Remember Maturana and Varela’s submarine driver who, on being congratulated for avoiding reefs, is confused because all he did was read certain dials and maintain correlations between indicators within the limits of his equipment [23]. Like this submarine driver, agents have no concept of any reefs or other world particularities. They act solely on the grounds of their own onboard means. Obviously, however, this does not prevent them from concerting their actions so that stable forms emerge which might feed back on their own probability and therefore trigger an ongoing increase of complexity in interaction.
Figure 2.
(a) Dynamics of structural coupling of 10 agents in a world with three things; (b) the dynamics of the entries in one row of an agent’s probability-matrix (with n = 3) depicting the probability of using certain words over time.
7. Plasticity
The decision to analytically distinguish individual percepts and socially concerted concepts in this model seems to conform to considerations of Donald M. MacKay [
24] in regard to Shannon’s theory. Whereas in Shannon’s conception the focus is on the one selection from a pre-given and fixed set of options, MacKay—not too different from Parsons and Luhmann—suggests that communication is subject to a “double selection” consisting of the selection of a message from a set of possible messages (as in Shannon’s theory), and of the selection of the set of possible messages from the
set of all sets of possible messages that are held by the population of interlocutors, in this case in the form of individual percepts. As in the model, these two selections are conducted concurrently and interdependently. Agents successively align their individual choices and option sets with the choices and option sets of all others. Before this alignment is achieved, information is just a difference that makes a difference
to an individual agent. It is not yet functional for communication.
The matrix in our model thereby could be interpreted in terms of MacKay’s “state of conditional readiness”, that is, as defining a state of an information-receiver with a particular probability for using certain words. Each incoming message alters this state, but does not—at least not as long as probabilities do not reach 100%—determine respective interpretations finally.
The basic version of the model, however, does not really account for this conditionality. As a consequence, the coupling of agents does not always result in an overall “agreement” of all agents on the use of one common language. Quite often, in the basic version, agents agree on some commonly used words, but do not agree on others. Or they come to terms with some of their peers about the use of some words, and align with other peers about the use of other words. What is more, agents at times tend to use words homonymously, that is, to use the same words to refer to different things.
Figure 3 depicts a typical result of a coupling process with this basic model with a population of 10 in a world of five things.
Figure 3.
(a) Words as “agreed” upon by agents; (b) corresponding probability-matrices. (p = 10, n = 5).
With David Lewis [25] one might say that agents convene on a “babbling equilibrium”. They do not couple to an extent that results in one common language. The reason for this is twofold. First, there is a one-way street in the development of probabilities. The basic model allows only an increase, not a decrease of probabilities. If the model is altered so that every failed interpretation of a word by an agent reduces the entry at the corresponding position in the probability-matrix by 0.2, the population practically always “agrees” on one common language. Even in the largest (and longest-running) scenario that I tested, with p = 30 and n = 10, the coupling process results in one common language. Figure 4 depicts a typical example of a common language (Figure 4a), and some statistics for a population of 10 in a world with five things (Figure 4b).
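The difference between the basic and the altered version can be explored with a compact sketch of the whole coupling process. Several details are assumptions rather than parts of the described model: the speaker chooses its word from the matrix column of the appearing thing in the same way the hearer chooses a thing from the row of the heard word, one random speaker-hearer pair interacts per step, a failed interpretation lowers the entry of the wrongly guessed thing, and entries are kept between 0 and 100. All names are hypothetical.

```python
import random

def pick(values, rng):
    """Return the index of the highest entry with probability entry/100, else a random index."""
    best = max(range(len(values)), key=lambda i: values[i])
    if rng.random() < values[best] / 100.0:
        return best
    return rng.randrange(len(values))

def run(p, n, steps, penalty, seed=0):
    """Simulate p agents in a world of n things; penalty=0.0 gives the basic version."""
    rng = random.Random(seed)
    agents = [[[0.0] * n for _ in range(n)] for _ in range(p)]
    for _ in range(steps):
        thing = rng.randrange(n)                      # a thing appears at random
        speaker, hearer = rng.sample(range(p), 2)     # assumption: pairwise interaction
        column = [agents[speaker][w][thing] for w in range(n)]
        word = pick(column, rng)                      # speaker utters a word for the thing
        row = agents[hearer][word]
        guess = pick(row, rng)                        # hearer interprets the word
        if guess == thing:
            row[guess] = min(100.0, row[guess] + 1.0)      # memorize the success
        else:
            row[guess] = max(0.0, row[guess] - penalty)    # altered version: decrease
    return agents

def count_languages(agents):
    """Count distinct word-thing assignments, taking each agent's strongest word per thing."""
    def lexicon(matrix):
        n = len(matrix)
        return tuple(max(range(n), key=lambda w: matrix[w][t]) for t in range(n))
    return len({lexicon(m) for m in agents})

basic = run(p=10, n=5, steps=20000, penalty=0.0, seed=1)
altered = run(p=10, n=5, steps=20000, penalty=0.2, seed=1)
```

Comparing `count_languages(basic)` with `count_languages(altered)` over several seeds gives a rough impression of the effect, although results of this sketch need not match the statistics reported for the original model.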
To a lesser extent, this is true for the altered version of the model as well, the one that allows for a decrease of probabilities. In this case, the system possesses more plasticity. Its performance shows much more flexibility in regard to the integration of different percepts and hence in regard to the generation of generalized concepts that can be used to mediate interaction. In this case, the agents’ readiness seems more conditioned, and their concepts therefore, once some are available, can start to act as attractors for other percepts, and align the varying perspectives to a common conception.
The second reason for the “babbling equilibrium” is somewhat more subtle. The increase of the probabilities for using certain words does not proceed linearly. Rather, probabilities rise slowly in the beginning, but from a certain moment on tend to ascend rapidly to 100% without further delay (cf. the right plot in Figure 2). Following Arthur [26], one might speak of a probability lock-in. Probabilities feed back on their own growth, which causes a rapid phase transition from a low chance for coupling to a very high one. Once a group of agents shares entries of about 30 at the same position in their probability-matrices, the further growth of probability at this position becomes inexorable. This prevents other agents with high probabilities at other matrix positions from coupling to these actions.
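The self-reinforcing character of this growth can be illustrated with a small calculation. It assumes, as a simplification not stated in the model description, that an agent falls back to a uniformly random guess whenever the random number is not smaller than its highest entry, and that this highest entry sits at the correct position. The function name is hypothetical, and the sketch does not reproduce the threshold of about 30 mentioned above.

```python
def success_probability(entry, n):
    """Chance of a correct interpretation given the entry at the correct position.

    With probability entry/100 the agent trusts its matrix; otherwise it guesses
    among n things (assumption) and is right with probability 1/n.
    """
    trust = min(max(entry, 0.0), 100.0) / 100.0
    return trust + (1.0 - trust) / n

# Each success raises the entry by one point, which raises the chance of the next
# success: the expected growth per hearing increases with the entry itself.
growth = [success_probability(e, 5) for e in (0, 10, 30, 60, 100)]
```

Because the expected increment grows with the current entry, the curve bends upward over time instead of rising linearly, which is the qualitative pattern described as lock-in.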
This plasticity seems to shed light on a fundamental property of meaning, as it is interpreted here as additional information that enables actors to choose appropriate words or actions from a set of options. So far, it has been marked as a socially concerted frame or background that consists of generalizations emerging in preceding processes of coupling and as such standing ready to orientate higher-order coupling. In this respect, I tentatively differentiated information and meaning as n-order and (n − 1)-order information, with both of them principally conforming to the definition of information as suggested by Shannon.
Figure 4.
(a) Words as “agreed” upon by agents, with the possibility to increase and decrease probabilities (p = 10, n = 5); (b) Results of 2 × 10 test runs, once with the basic version of the model (matrix entries increasing by 1 in the case of a successful interaction), and once with the altered model (matrix entries increasing by 1 in the case of a successful interaction and decreasing by 0.2 in the case of a failed interaction). “no. languages” indicates the number of different languages (or rather “dialects”, since languages usually differ only in one or two words) that agents agree upon.
As it turns out, however, meaning has to be sufficiently flexible in order to allow an integration of different viewpoints (here abstracted as percepts) that is as comprehensive as possible. This flexibility can be seen as a reason why meaning is often considered inconceivable in terms of Shannon-information.