
This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution license (http://creativecommons.org/licenses/by/3.0/).

In 1872 Ludwig von Boltzmann derived a statistical formula to represent the entropy (an apophasis) of a highly simplistic system. In 1948 Claude Shannon independently formulated the same expression to capture the positivist essence of information. Such contradictory thrusts engendered decades of ambiguity concerning exactly what is conveyed by the expression. Resolution of widespread confusion is possible by invoking the third law of thermodynamics, which requires that entropy be treated in a relativistic fashion. Doing so parses the Boltzmann expression into separate terms that segregate apophatic entropy from positivist information. Possibly more importantly, the decomposition itself portrays a dialectic-like agonism between constraint and disorder that may provide a more appropriate description of the behavior of living systems than is possible using conventional dynamics. By quantifying the apophatic side of evolution, the Shannon approach to information achieves what no other treatment of the subject affords: It opens the window on a more encompassing perception of reality.

The most important thing about information theory is not information. In today's “Age of Information”, as the centennial of the birth of 1960s media guru Marshall McLuhan [ ] is celebrated, it is worth recalling his observation that society tends to grow numb to the deeper effects of any new medium.

The same numbness can still be seen in conventional evolutionary theory. In 1859 Charles Darwin published his understanding of the origin of species.

The encounter of science with information seems to have elicited the same numbness that McLuhan had suggested. For three centuries now, science has been an almost entirely positivistic and apodictic venture. No surprise, then, that science should focus entirely on the positivist role of information in how matters transpire. But, in a somewhat ironic reversal of McLuhan's IBM example, some are slowly beginning to realize that a possibly more significant discovery may be the new capability to quantify the absence of information, or “not information”.

To assess the importance of the apophatic, or that which is missing, it helps to reframe how Ludwig von Boltzmann [ ] treated the probability, p_i, that event i will occur. Conventionally, this value is normalized to fall between zero and one by dividing the number of times that i has occurred by the total number of observations. Under this “frequentist” convention, the probability that i does not occur becomes (1 − p_i). Boltzmann's genius, however, was in abjuring this conventional measure of non-occurrence in favor of the negative of the logarithm of p_i. (It should be noted that −log(p_i) and (1 − p_i) vary in uniform fashion: as one grows, so does the other.) Weighting each conventional absence (1 − p_i) by the frequency p_i yields the symmetrical parabolic function (p_i − p_i²). If, however, one calculates average absence using Boltzmann's measure, the result, −p_i·log(p_i), is skewed towards smaller p_i (or larger [1 − p_i]).
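The asymmetry between the two measures of absence is easy to confirm numerically. The following is a minimal sketch (the function names are my own, and base-2 logarithms are assumed):

```python
from math import log2

# Two ways to weight the "absence" of an event of probability p:
# the conventional measure p*(1 - p), and Boltzmann's p*(-log2 p).
def conventional_absence(p):
    return p * (1 - p)

def boltzmann_absence(p):
    return -p * log2(p)

# The parabola p - p^2 is symmetric about p = 0.5 ...
assert abs(conventional_absence(0.2) - conventional_absence(0.8)) < 1e-12

# ... while -p*log2(p) peaks near p = 1/e (about 0.368), not at 0.5,
# so rarer events receive relatively more weight.
ps = [i / 1000 for i in range(1, 1000)]
peak = max(ps, key=boltzmann_absence)
print(round(peak, 3))  # 0.368, close to 1/e
```

The peak at 1/e rather than 1/2 is the skew in question: the Boltzmann weighting biases average absence towards the improbable.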

Claude E. Shannon [ ] noticed that −log(p_i) also was a suitable measure of the degree of surprise an observer would experience upon an encounter with state i. If p_i ≈ 1, there is little surprise; however, if p_i is very near zero, one experiences major surprise when i occurs. To observe i when p_i is small was said to provide much information. It followed from this reasoning that the average surprisal, H = −Σ_i p_i·log(p_i), which is formally identical to Boltzmann's H function, should provide a convenient gauge of the total information inherent in the system. Thus it came to pass that the positivist notion of information was confounded with Boltzmann's apophatic measure, H. To make matters worse, John von Neumann suggested (as a joke) to Shannon that he call his function “entropy”, following the connection that Boltzmann had drawn with the second law. Sadly, Shannon took the suggestion seriously [ ].

Confusion about H stems from the fact that the measure embodies aspects of mutually-exclusive attributes. Ernst von Weizsäcker [ ] observed that surprise is an inherently relative notion: whether p_i is small can be assessed only post-facto. In reality, one is comparing the a-posteriori value of p_i with the a-priori expectation of the same event. Myron Tribus [ ] captured this relativity by defining information as that which causes a change in probability assignment.

Tribus' definition also identifies information with a change in the distribution p_i. It is possible to speak of information in apodictic fashion only insofar as a given distribution p_i relates to some other distribution, p_i′.

It immediately follows that the obverse criticism pertains to Boltzmann's use of H as a general measure of entropy. H is not an appropriate measure of entropy, because the third law of thermodynamics states that entropy can be measured only in relation to some reference state. Although the convention in thermodynamics is to set the reference point at zero kelvin, more generally the requirement is that some reference state be specified. That Boltzmann may not have been aware of the relativistic nature of entropy is understandable, given that the third law was formulated only later by Nernst [ ].

It is clear that both information and entropy are relativistic and must always be treated in the context of changing probabilities. Unfortunately, Shannon's “entropy” is identical neither to the common sense of information nor to thermodynamic entropy. To see why, consider two classes of events, a_i and b_j, and the joint probability p(a_i, b_j) that a_i occurs in conjunction with b_j.

One notes that if a_i and b_j are completely independent of each other, then p(a_i, b_j) = p(a_i)·p(b_j). Whenever p(a_i, b_j) departs from that product, the occurrence of a_i reveals something about b_j, and this constraint between a_i and b_j can be quantified. Averaging over all combinations parses Shannon's measure into two complementary terms, H = A + Φ, where

A = Σ_{i,j} p(a_i, b_j)·log[p(a_i, b_j)/(p(a_i)·p(b_j))]

gauges the average mutual constraint (information) between the a_i and the b_j, and

Φ = −Σ_{i,j} p(a_i, b_j)·log[p(a_i, b_j)²/(p(a_i)·p(b_j))]

measures the residual freedom, or flexibility, remaining in the system.

The particular boundary conditions that Boltzmann chose forced H = Φ. One should note, however, that this equality does not hold for systems of interacting elements [ ].
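The balance between constraint (A) and flexibility (Φ) can be checked numerically for any joint distribution. The following is a minimal sketch (the function name and example distributions are invented for illustration):

```python
from math import log2

def parse_H(p):
    """Split Shannon's H for a joint distribution p[i][j] into
    mutual constraint A and residual flexibility Phi, with H = A + Phi."""
    pa = [sum(row) for row in p]            # marginal p(a_i): row sums
    pb = [sum(col) for col in zip(*p)]      # marginal p(b_j): column sums
    H = A = Phi = 0.0
    for i, row in enumerate(p):
        for j, pij in enumerate(row):
            if pij > 0:
                H   -= pij * log2(pij)
                A   += pij * log2(pij / (pa[i] * pb[j]))
                Phi -= pij * log2(pij**2 / (pa[i] * pb[j]))
    return H, A, Phi

# Independent events: the joint equals the product of marginals,
# so A vanishes and all of H appears as flexibility (H = Phi).
indep = [[0.2 * 0.5, 0.2 * 0.5], [0.8 * 0.5, 0.8 * 0.5]]
H, A, Phi = parse_H(indep)
assert abs(A) < 1e-12 and abs(H - Phi) < 1e-9

# Interacting events: constraint appears, and H = A + Phi still balances.
corr = [[0.4, 0.1], [0.1, 0.4]]
H, A, Phi = parse_H(corr)
assert A > 0 and abs(H - (A + Phi)) < 1e-9
```

The first case mirrors Boltzmann's non-interacting boundary conditions (H = Φ); the second shows why the equality fails once elements interact.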

An appreciation for the relativistic nature of information and its measurement resolves several conundrums regarding information and “meaning” [ ].

If one inquires whether the pattern on the screen is random or meaningful, the magnitude of H alone cannot decide the issue, as the following example demonstrates.

The following are three random strings of 200 digits:

Sequence A:

42607951361038986072061245134123237671363847519601557824849686201007746224524209 37159144904694056560480338986072061245134123237671363847519601557824849686201007 7462245242093715914490469405656048033898

Sequence B:

03617746439242093715914490469405656048033898607206124513412323767136384751960155 78248496862010077462245242093715914490469405656048033898607206124513412323767136 3847519601557824849686201007746224524209

Sequence C:

01475623843789694751743102380318185453848905236473225910906494173735504160210176 85326300670460724247097189694751743102380318185453848905236473225910906494173779 5041102101768532630067046072424709708969

The values of H for each sequence are 3.298, 3.288 and 3.296 bits, respectively. That no internal order is present in any of the sequences is shown by the average mutual information values of adjacent pairs of digits in each of the three cases (as with the adjacent pixels on a TV screen). These calculate to 10.97%, 10.03% and 9.94% of the respective paired entropies. Each fraction is typical of a random distribution of 200 tokens among 10 types. Relationships between more distant pairs are likewise random.

Next, the correspondences between the three pairs of sequences are examined. Recording how each digit in A pairs with the occupant of its corresponding location in B yields a joint entropy of 5.900 bits, 11.61% of which appears as mutual information (once again, random correspondence). Similar pairings between sequences A and C, however, reveal that fully 91.69% of the joint “entropy” consists of mutual information.
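Calculations of this kind are straightforward to reproduce. The following is a minimal sketch (the toy sequences here are constructed for illustration and are not the 200-digit strings above):

```python
from collections import Counter
from math import log2

def entropy(seq):
    # H = -sum p_i * log2(p_i) over observed symbol frequencies
    n = len(seq)
    return -sum(c / n * log2(c / n) for c in Counter(seq).values())

def mutual_information(x, y):
    # A = H(x) + H(y) - H(x, y), computed over position-wise pairs
    return entropy(x) + entropy(y) - entropy(list(zip(x, y)))

a = "0123456789" * 20   # 200 digits with uniform frequencies
b = a[::-1]             # a deterministic transform of a

H_a = entropy(a)
assert abs(H_a - log2(10)) < 1e-9   # maximal H for 10 equally-likely digits

# Perfect correspondence: all of the joint entropy appears as mutual
# information, analogous to the pairing of sequences A and C above.
frac = mutual_information(a, b) / entropy(list(zip(a, b)))
print(round(frac, 3))  # 1.0
```

Truly unrelated sequences would instead yield a small fraction, as with the A-B pairing in the text.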

While these comparisons may appear to some as typical exercises in coding/decoding, they actually have deeper implications. Instead of digits, one could have used as categories the symbols for nucleotides in a genome (A, C, T, G) or for monomers in a protein (Gly, Ala, Leu, Trp, etc.). Suppose, for example, that sequence A represents an antibody molecule and sequence C the corresponding epitope on an invading microbe.

In such a situation the pattern in C would provide ultimate meaning to A. The match would signify the end towards which A was created by the immune system and would initiate a highly directed action on the part of A (to eliminate the microbe). This significance is clearly apparent in the high value of mutual information between the sequences. Whence, although the primitive Shannon measure does not by itself convey meaning, the relative information indicated by A clearly provides at least a sense of “proto meaning”. That such “meaning” for antibodies is but a pale shadow of meaning in the human context only reflects how wanly quantitative models in general prefigure more complicated human situations. In order to get from meaningless physical phenomena to full-blown human semiosis, it is necessary to pass through some inchoate precursor of meaning. Shannon measures, it would appear, are well-suited to quantifying such a precursor.

While these two examples highlight more accurately the positive role that information plays in living systems, less attention is usually paid to the residual Φ that represents flexibility. Most would rather ignore Φ in a science that is overwhelmingly positivist and apodictic, because rewards go to those who focus upon identifying the constraints that guide how things happen. The instances where physics addresses anything other than the positivistic are indeed very few—the Pauli Exclusion Principle and Heisenberg uncertainty are the only exceptions within this writer's memory.

Physics, however, deals almost exclusively with the homogeneous, but as soon as one leaves the realm of universals and enters the very heterogeneous world of the living, the apophatic, that which is absent, begins to demand explicit accounting.

Such accounting, however, is precisely what Boltzmann initiated (whether consciously or unconsciously). Furthermore, Boltzmann weighted non-being so as to skew its importance vis-à-vis that which exists, thereby providing a bias that accords with the second law. Now, it happens that Boltzmann's formula pertains to circumstances far more complex than his rarefied, homogeneous and non-interacting example system. Even in highly complex systems, Boltzmann's H can be parsed into separate terms that gauge constraint and flexibility, respectively.

Such parsing requires the comparison of two distributions with one another. There is no prohibition, however, against abstracting the two distributions from the same system. This was done above, for example, when the (non-significant) values of A were calculated on successive pairs of integers within each string of 200 integers. Of possibly greater utility is the comparison of the past (a-priori) configuration of a system with its present (a-posteriori) one.

To parse a network in this fashion one considers the interaction strengths, T_ij, where T_ij quantifies the magnitude of the flow or effect from node i to node j. Estimating the joint probabilities by the normalized flows T_ij/T.. (where T.. is the sum of all the T_ij) and the marginal probabilities by the corresponding row and column sums then allows A and Φ to be computed directly.

The reader should note that nothing needs to be known concerning the particular details of the constraints that guide the constitutive links, nor about the specifics of the degeneracies that contribute to Φ. All one needs to calculate the overall system constraint and flexibility are the phenomenological observations T_ij.
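Assuming joint probabilities are estimated from normalized flows (T_ij divided by the total flow, with marginals taken from row and column sums), the network parsing can be sketched as follows (the three-node flow matrix is hypothetical):

```python
from math import log2

def network_parse(T):
    """Compute H, A and Phi from flow observations T[i][j],
    the amount flowing from node i to node j."""
    total = sum(sum(row) for row in T)          # T..  (sum of all flows)
    out_ = [sum(row) for row in T]              # T_i. (row sums)
    in_ = [sum(col) for col in zip(*T)]         # T._j (column sums)
    H = A = Phi = 0.0
    for i, row in enumerate(T):
        for j, t in enumerate(row):
            if t > 0:
                p = t / total
                H   -= p * log2(p)
                A   += p * log2(t * total / (out_[i] * in_[j]))
                Phi -= p * log2(t * t / (out_[i] * in_[j]))
    return H, A, Phi

# Hypothetical three-compartment flow network (e.g., units of energy/time)
T = [[0, 40, 10],
     [5,  0, 35],
     [20, 5,  0]]
H, A, Phi = network_parse(T)
assert abs(H - (A + Phi)) < 1e-9   # the decomposition always balances
assert A > 0 and Phi > 0           # both constraint and flexibility present
```

Note that only the phenomenological T_ij enter the calculation; nothing about the mechanisms behind the individual links is required.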

Being able to quantify the overall constraint inhering in a system (A) is a major step forward, but it could be argued that the ability to quantify what is missing (Φ) constitutes an even greater advance.

In terms of ecological (and likely as well economic, social and immune) systems, what is missing can be of critical importance. Parallel redundant pathways and inefficient, incoherent processes all contribute to the magnitude of Φ. While they often hinder the efficient functioning of the system (as gauged by A), it is precisely such “noise” that is required by a system if it is to mount a response to a novel perturbation [ ].

Furthermore, to endure and remain sustainable, it appears that a system must possess even more flexibility (Φ) than constraint (A). Available data on ecosystems indicate that such balance occurs within a narrow range of values of the quotient A/H [ ].

The necessity for apophasis bears strongly upon the issue of preserving biodiversity. In recent decades much effort has justifiably been invested at the global level towards the conservation of biodiversity. Society intuitively senses that maintenance of biodiversity is necessary for global ecological health. What is hardly ever mentioned, however, is that solid theoretical justification for preserving biodiversity has been wanting. In retrospect, we see why this is so: Having only positivist tools at one's disposal, one cannot hope to circumscribe the interplay between constraint and looseness that provides sustainability. But the definitions of A and Φ now engender a quantitative methodology with which to follow the dynamics between the apodictic and the apophatic. Furthermore, such analysis often reveals that it is an increase in the latter that becomes necessary for system survival.

Virtually all domains of science remain “one-eyed” in scope, save for the discipline of thermodynamics, where entropy explicitly appears as a manifestation of the apophatic (although it is rarely acknowledged as such). Schrödinger introduced the notion of “negentropy”, the negative of entropy, and there have been numerous treatments of the entropy-negentropy conversion. Terrence Deacon [ ] has likewise stressed the causal importance of that which is absent.

Although the aim of this collection of essays has been a better apodictic notion of information, perhaps a more important goal should be a fuller appreciation for the dialectic between constraint and flexibility. In the end, the metaphor of transaction provides a more appropriate context within which to appraise the dynamics of living systems, because the dynamics of life cannot be reduced to “matter moving according to universal laws” [ ].

To put a finer point on the dialectic, one notes that the opposition between generation and decay is not absolute. In Hegelian fashion, each of the countervailing trends requires the other at some higher level: The development of new adaptive repertoires requires a cache of what formerly appeared as redundant, inefficient, incoherent and dissipative processes. On the other hand, greater constrained performance always generates increased dissipation.

In the dialectical scenario, information as commonly perceived becomes a degenerate subclass of the more general notion of constraint. No longer is it necessary to treat information using the narrow rubrics of communication theory. That Shannon developed his mathematics in that theatre can be regarded as an historical accident. His ensuing quantifications apply far more broadly, not just to constraint in general, but possibly more importantly, to the lack of constraint as well.

Even though Shannon's H formula rests upon solid axiomatic foundations, it has engendered perhaps as much confusion as enlightenment. The selfsame formula was believed to quantify the mutually exclusive attributes of entropy and information. Such contradiction has strained logic and spawned abstruse narration, e.g., Brillouin [ ].

The key to resolving confusion about H is to tie the function not only to the second law of thermodynamics, but to connect it to the third law as well. Entropy can never be defined in absolute terms, but acquires meaning only in relation to some reference state or distribution.

In hindsight it is now clear why the H function alone is a poor surrogate for many of its intended applications (beginning with entropy per se). As demonstrated above, H fails to represent “meaning”, whereas its relational component, A, appears equal to the task. Other purported shortcomings of Shannon information should be re-examined as well in the light of this relativistic perspective.

In conclusion, it is highly premature to dismiss the Shannon/Boltzmann approach for measuring information, because something else as important as information is at stake. Other attempts at improving the apodictic characterization of information fail to encompass the necessary roles for apophasis. The Boltzmann/Shannon mathematics provides in the end a richer and more inclusive vantage on the dynamics of nature—one that allows the scientist to open his/her blind eye towards the broader causality at work in the living world. In that sense, it can truly be said that the most important contribution that information theory makes to science is not information.

The author wishes to thank Pedro Marijuan for encouraging him to write this essay and an anonymous reviewer for pointing the author towards the work of Ernst Ulrich Freiherr von Weizsäcker.