Information Theory: a Multifaceted Model of Information

Abstract: A contradictory and paradoxical situation that currently exists in information studies can be improved by the introduction of a new information approach, which is called the general theory of information. The main achievement of the general theory of information is the explication of a relevant and adequate definition of information. This theory is built as a system of two classes of principles (ontological and axiological) and their consequences. Axiological principles, which explain how to measure and evaluate information and information processes, are presented in the second section of this paper. These principles systematize and unify different approaches, existing as well as possible, to the construction and utilization of information measures. Examples of such measures are given by Shannon's quantity of information, the algorithmic quantity of information, and the volume of information. It is demonstrated that all other known directions of information theory may be treated inside the general theory of information as its particular cases.


Introduction
As with any natural phenomenon, there are two main problems related to information. The first one is to define what information is and to find what basic properties it has. The second problem is how to measure and evaluate information. In this paper, we consider the second problem.
From the beginning of the development of information theory, more was known about how to measure information than about what information is. Hartley and Shannon gave effective formulas for measuring the quantity of information. However, without an understanding of the phenomenon of information, these formulas bring misleading results when applied to irrelevant domains.
At the same time, a variety of definitions of information have been introduced. Being mostly vague and limited, these definitions have brought confusion into information studies (cf., for example, [34,4,19,39]).
The existing confusion with the term information is increased when researchers call by the name "information" a measure of information or even a value of such a measure. For example, many call by the name "information" Shannon's quantity of information $I = -\sum_{i=1}^{n} p_i \log p_i$, or Rényi's measure of information $H_\alpha(X) = (1-\alpha)^{-1} \log \sum_{x \in X} P^\alpha(x)$ ([29,15]), or the pragmatic measure of information $I_M(p, q) = \sum_{i,m} p_{i/m}\, \varphi_m \log_2 (p_{i/m}/q_i)$ ([37]). At the same time, some researchers (cf., for example, [30] or [21]) never did this. That is why it is so important to explain and understand distinctions between phenomena and their measures. This fully applies to information.
Even if we have an answer to the question of what information is, it is not sufficient for the practical purposes of information processing. The main problem in this perspective is how to measure or, at least, to evaluate information. The results of [8], which describe information as a general phenomenon and its basic properties by means of ontological principles, provide a base for developing a unified theory of information evaluation and measurement. This is done in the second section of the paper, which follows the Introduction. It contains the axiological component of the general theory of information. This component is developed on the basis of axiomatic methodology, providing basic axiological principles for information evaluation and measurement. Basic axiological principles explain what the basic properties of measures and estimates of information are. These principles systematize and unify different approaches, existing as well as possible, to the construction and utilization of information measures. This axiological aspect of the theory is no less important than the ontological one, because the methods of modern science emphasize the importance of measurement and evaluation, which are technical tools for observation and experiment in science, as well as for engineering. In the third section of this paper, it is demonstrated how the main directions of information theory as a whole are unified and systematized in the context of the general theory of information.

Basic Axiological Principles of the General Theory of Information
Basic axiological principles explain how to evaluate information and what measures of information are necessary.
According to the ontological principles [8,11,10], information causes changes either in the whole system R that receives information or in an infological subsystem IF(R) of this system. Consequently, it is natural to assume that a measure of information is determined by the results that are caused by reception of the corresponding portion of information. This is formulated in the first principle.
Axiological Principle A1. A measure of information I for a system R is some measure of changes caused by I in R (for information in the strict sense, in the infological system IF(R)).
The next principles describe what information measures reflect. This implies several classifications of information measures.
The first criterion for measure classification is the time of changes.
Axiological Principle A2. According to time orientation, there are three temporal types of measures of information: 1) potential or perspective; 2) existential or synchronic; 3) actual or retrospective.
Let us consider the following example. Some student R studies a textbook C. After two semesters she acquires no new knowledge from C and stops using it. At this time, an existential measure of information contained in C for R is equal to zero. The actual measure of information in C for R is very big if R is a good student and C is a good textbook. But the potential measure of information in C for R may also be bigger than zero if in the future R returns to C and finds in C things that she did not understand in her youth.
Different types of information measures can estimate information in separate infological systems. For example, synchronic measures reflect the changes of the short-term memory, while retrospective measures represent transformations in the long-term memory of a human being.
The second criterion for measure classification is derived from the system separation triad (R, l, W). Here, R is a system, W is the environment of this system, and l represents different links between R and W.
Axiological Principle A3. There are three structural types of measures of information: external, intermediate, and internal.
Examples are given by the change of the probability p(R, g) of achievement of a particular goal g by the system R. This information measure was suggested in [20] and is called the quality of information.
Definition 4.6. An external information measure reflects the extent of outer changes caused by I, i.e., the extent of changes in W.
Examples are given by the change of the dynamics (functioning, behavior) of the system R or by the complexity of changing R.
Axiological Principle A4. There are three constructive types of measures of information: abstract, realistic, and experiential.
Definition 4.7. An abstract information measure is determined theoretically under general assumptions.
Examples are given by the change of the length (the extent) of a thesaurus.
Definition 4.8. A realistic information measure is determined theoretically subject to realistic conditions.
Quality of information [20] is an example of such a measure.
Those who have worked with information technology and dealt with problems of information security and reliability have discovered the difference between abstract and realistic measures of information. They found that if you have an encrypted message, you know that the information contained in this message is available. Those who know the cipher can get it. However, if you do not possess this cipher and do not have working algorithms for deciphering, then this information is inaccessible to you. To reflect this situation, exact concepts of available and acceptable information have been introduced. Available information is measured by abstract information measures, while acceptable information is measured by realistic information measures.
The third type of measures from Principle A4 is defined as follows.
Definition 4.9. An experiential information measure is obtained through experimentation.
Remark 4.1. In some cases, one information measure may belong to different types. In other words, classes of information measures overlap.
As an example of such a measure, we may take the measure that is used to estimate the content of computer memory as well as the extent of free memory in a computer or on a disk. Namely, information in computers is represented as strings of binary symbols, and the measure of such a string is the number n of these symbols. The length of the string is taken as the value of its information measure. The unit of such a measure is called a bit. Computer memory is measured in bits, bytes, kilobytes, which contain 1,000 bytes, megabytes, which contain 1,000,000 bytes, and so on. This reflects the length of the strings of symbols that can be stored in a memory. This is the simplest measure of symbolic information. However, this measure is necessary because storage devices (such as computer disks) have to match the needs of information storage. For example, if you have a file containing five megabytes and a floppy disk of 1.4 megabytes, then you cannot store this file on this floppy disk.
Moreover, some authors consider information in such a simplistic way. For example, in one article, it is assumed that information is a one-dimensional string comprising a sequence of atomic symbols. Each such symbol is a token drawn, with replacement, from a set of available symbol types. Sets of symbol types may be the binary digits (0 and 1) or the alphanumeric characters or some other convenient set.
It is necessary to remark that different information measures may give different values for the same string. For example, according to the measures used in algorithmic information theory, the algorithmic measure of a string of length n may be much less than n (cf. [21,13]).
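To make this contrast concrete, the following minimal sketch compares the conventional length measure of a string with a computable stand-in for its algorithmic measure. The algorithmic measure itself is not computable, so the sketch assumes, purely for illustration, that the length of a zlib-compressed encoding plays the role of an upper estimate. The function names and the sample strings are hypothetical and are not taken from [21,13].

```python
import random
import zlib


def length_in_bits(data: bytes) -> int:
    """Conventional measure: the number of binary symbols in the string."""
    return 8 * len(data)


def compressed_length_in_bits(data: bytes) -> int:
    """Illustrative stand-in for an algorithmic measure (an assumption of
    this sketch): the size of a zlib-compressed encoding of the string."""
    return 8 * len(zlib.compress(data, 9))


# A highly regular string and a pseudo-random string of the same length.
regular = b"01" * 5000
rng = random.Random(0)  # fixed seed so that the example is reproducible
irregular = bytes(rng.getrandbits(8) for _ in range(10000))

for name, s in (("regular", regular), ("irregular", irregular)):
    print(name, length_in_bits(s), compressed_length_in_bits(s))
```

Both strings have the same conventional measure, while the stand-in for the algorithmic measure is far smaller for the regular string than for the irregular one.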
Let us look at how this measure relates to the axiological principles of the general theory of information. When such a string is written into the computer memory, it means that some information is stored in the memory. Changes in the memory content might be measured in different ways. The simplest is to measure the work that has been performed when the string has been written. The simplest way to do this is to count how many elementary actions of writing unit symbols have been performed. However, this number is just the number of bits in this string. So, the conventional measure of the size of a memory and its information content is consistent with the axiological principles of the general theory of information.
Let us take the classifications of measures that are presented in the axiological principles A2-A4 and apply them to the conventional measure of the size of a memory. We see that it is an internal measure (cf. Principle A3), both an abstract and a realistic measure (cf. Principle A4), and belongs to all three classes of potential, existential, and actual measures (cf. Principle A2).
The axiological principles A2-A4 have the following consequences.
A unique measure of information exists only for an oversimplified system. Any complex system R with a developed infological subsystem IF(R) has many parameters that may be changed. So, such systems demand many different measures of information in order to reflect the full variety of these systems' properties as well as of the conditions in which these systems function. Thus, the goal of finding one universal measure for information is unrealistic.
Uncertainty elimination (which is measured by Shannon's quantity of information, cf. Section 3) is only one of the possible changes that are useful to measure for information. Another important property is the possibility of obtaining a better solution of a problem (one which is more complete, more adequate, or demands fewer resources, for example, time, for achieving a goal). Changes of this possibility reflect the utility of information. Different kinds of such measures of information are introduced in the theory of information utility [20] and in the algorithmic approach in the theory of information [12,13,21].
Axiological Principle A5. The measure of information I, which is transmitted from C to a system R, depends on the interaction between C and R.
Stone [33] gives an interesting example of this property. Distortions of the human voice, on the one hand, are tolerable in an extremely wide spectrum, but on the other hand, even small amounts of distortion create changes in interactive styles.
The next principle is a clarification of Principle A5.

Axiological Principle A6. Measure of information transmission reflects a relation (like ratio, difference etc.) between measures of information that is accepted by the system R in the process of transmission and information that is presented by C in the same process.
It is known that the receiver does not accept all the information that is transmitted by a sender. Besides, there are different distortions of transmitted information. For example, there is a myth that the intended understanding may be transmitted whole from a sender to a receiver. In almost every process of information transmission, the characteristic attitudes of the receiver "interfere" in the process of comprehension. People make things meaningful for themselves by fitting them into their preconceptions. Ideas come to us raw, and we dress and cook them. The standard term for this process is selective perception. We see what we wish to see, and we twist messages around to suit ourselves. All this is demonstrated explicitly in the well-known 'Mr. Biggott' studies [35]. An audience was shown a series of anti-prejudice cartoons featuring the highly prejudiced Mr. Biggott. Then people from the audience were subjected to detailed interviews. The main result was that about two thirds of the sample clearly misunderstood the anti-prejudice intention of the cartoons. The major factors accounting for this selective perception, according to the researchers, were the predispositions of the audience. Those who were already prejudiced saw the cartoons as supporting their position. Even those among them who understood the intentions of the cartoons found ways of evading the anti-prejudice 'effect.' Only those with a predisposition toward the message interpreted the cartoons in line with the intended meanings of the communicators.

Systematizing Theoretical Approaches in Information Science
Other developed theoretical approaches in information science are particular cases of the general theory of information because they explicitly or implicitly treat information from the functional point of view, as a kind of transformation in a system. To prove this, we find in each case a relevant infological system IF(R) and demonstrate that in each of these approaches information is what changes this system.
It is necessary to remark that there are some approaches that consider information as some kind of knowledge. More exactly, in modern information theory, according to [17], a distinction is made between structural-attributive and functional-cybernetic types of theories. While representatives of the former approach conceive information as structure, like knowledge or data, variety, order, and so on, members of the latter understand information as functionality, functional meaning, or as a property of organized systems. The general theory of information treats information from the functional and, more exactly, dynamic perspective. As is demonstrated in [10,11], the structural-attributive interpretation does not represent information itself but relates to information carriers. Consequently, structural-attributive types of information theories are also included in the scope of the general theory of information because structures and attributes are represented in this theory by infological elements and their properties and systems.

Shannon's Information Theory
The statistical approach is now the most popular direction in the information sciences. It is traditionally called Shannon's information theory or, as it was at first named by Shannon, the theory of communication [30]. It is a mathematical theory formulated principally by the American scientist Claude E. Shannon to explain aspects and problems of information and communication.
The basic problem that needed to be solved, according to Shannon, was the reproduction at one point of a message produced at another point. He deliberately excluded from his investigation the question of the meaning of a message, i.e., the reference of the message to the things of the real world. He wrote [30]: "Frequently the messages have meaning; that is they refer to or are correlated according to some system with physical or conceptual entities. These semantic aspects of communication are irrelevant to the engineering problem. The significant aspect is that the actual message is one selected from a set of possible messages. The system must be designed to operate for each possible selection, not just the one which will actually be chosen since this is unknown at the time of design." While the statistical theory of information is not specific in many respects, it succeeds remarkably in outlining the engineering requirements of communication systems and the limitations of such systems. For example, it proves the existence of optimum coding schemes without showing how to find them.
In the statistical theory of information, the term information is used in a special sense: it is a measure of the freedom of choice with which a message is selected from the set of all possible messages. Information is thus distinct from meaning, since it is entirely possible for a string of nonsense words and a meaningful sentence to be equivalent with respect to information content.
In the statistical theory of information, information is measured in bits (short for binary digits). One bit is equivalent to the choice between two equally likely alternatives. For example, if we know that a coin is to be tossed but are unable to see it as it falls, a message telling whether the coin came up heads or tails gives us one bit of information. A special measure that is called the quantity of information of a message about some situation is defined by the formula $I = -\sum_{i=1}^{n} p_i \log p_i$, where n is the number of possible cases of the situation in question and $p_i$ is the probability of the case i. So, information is considered as elimination of uncertainty, i.e., as a definite change in the knowledge system that is the infological system IF(R) of the receptor of information. Consequently, we have the following result.
Proposition 3.1. The statistical information theory is a subtheory of the general theory of information.
Interestingly, the mathematical expression for information content closely resembles the expression for entropy in thermodynamics. The greater the information in a message, the lower its randomness, or "noisiness," and hence the smaller its entropy. Since the information content is, in general, associated with a source that generates messages, it is often called the entropy of the source.
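The coin example can be checked numerically. The sketch below evaluates the quantity of information $I = -\sum_{i=1}^{n} p_i \log p_i$ with the logarithm taken to base 2, so that the result is expressed in bits. The function name and the sample distributions are illustrative assumptions, not part of [30].

```python
import math


def quantity_of_information(probabilities):
    """Shannon's quantity of information, in bits, for a finite distribution."""
    if any(p < 0 for p in probabilities):
        raise ValueError("probabilities must be non-negative")
    if not math.isclose(sum(probabilities), 1.0, abs_tol=1e-9):
        raise ValueError("probabilities must sum to 1")
    # Cases with p = 0 contribute nothing (p log p tends to 0 as p tends to 0).
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


fair_coin = quantity_of_information([0.5, 0.5])
biased_coin = quantity_of_information([0.9, 0.1])
fair_die = quantity_of_information([1 / 6] * 6)

print(fair_coin, biased_coin, fair_die)
```

A fair coin yields one bit, in agreement with the text, while a message about a strongly biased coin yields less than one bit because its outcome is less uncertain beforehand.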
However, many saw limitations of Shannon's information theory, especially when it was applied outside technical areas. As a result, other directions have been suggested in information science. However, to this day, there is no measure in information theory that is as well-supported and as generally accepted as Shannon's quantity of information. On the other hand, Shannon's work is rightly seen as lacking indications for a conceptual clarification of information.

A Semantic Theory of Information
The semantic theory of information tries to encompass semantic aspects of information because some experts in information sciences consider meaning as the essence of information. This approach is based on the assumption (cf., for example, [2]) that every piece of information has the characteristic that it makes a positive assertion and at the same time makes a denial of the opposite of that assertion. In the semantic information theory, transformations of such an infological system as a thesaurus, or system of knowledge, are treated as information. The founders of this approach, Bar-Hillel and Carnap [1], build it as a logical system. Thus, the infological systems in this theory are sets of logical propositions. These propositions are used to describe and represent the state of an arbitrary system. The corresponding measure of information is defined for a separate proposition as the probability that this proposition is a part of the description of the real state of a system under consideration. Thus, information causes change in knowledge about this state. Consequently, we have the following result.

Proposition 3.2. The semantic information theory is a subtheory of the general theory of information.
A similar concept of information is utilized in the approach that was developed by Shreider [31] and is also called the semantic theory of information. In his theory, the notion of a thesaurus is less precise. At the same time, the notion of information is more general than in [1], being a transformation of a thesaurus. As any thesaurus is a kind of infological system, this approach is also included in the general theory of information.

Fisher Information
Fisher information is also based on mathematical statistics. It was introduced by Ronald A. Fisher, who developed classical measurement theory [16]. According to this theory, the quality of any measurement may be specified by a form of information that has come to be called Fisher information.
Consider the class of "unbiased" estimates, obeying the law $\langle \hat{\theta}(y) \rangle = \theta$, where θ is the value of a given parameter and $\hat{\theta}(y)$ is an estimate of θ, which is an optimal function of the measured data $y = \theta + x$. The mean-square error $e^2$ in such an estimate obeys the relation $e^2 I \geq 1$. Here I is called the Fisher information. In a particular case, which is important for physics [18], we have $I = \int \frac{[p'(x)]^2}{p(x)}\, dx$. Here p(x) denotes the probability density function for the noise value x, and $p'(x) = dp(x)/dx$. This demonstrates that the Fisher information I is the quality metric for the estimation/measurement procedure. Consequently, I measures an adequate change in knowledge about the parameter θ, and is information in the sense of the general theory of information. Thus, we have the following result.
Proposition 3.3. The Fisher information theory is a subtheory of the general theory of information.
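As a numerical illustration of the integral formula for the Fisher information and of the relation between I and the mean-square error, the following sketch assumes Gaussian noise with standard deviation sigma. For this assumed density the integral is known to equal 1/sigma^2, and the simple estimate that returns the measured value y itself has mean-square error sigma^2, so the bound is attained. The choice of density, the integration scheme, and all names are assumptions of the sketch, not part of [16,18].

```python
import math


def gaussian_pdf(x, sigma):
    """Assumed noise density p(x): zero-mean Gaussian."""
    return math.exp(-x * x / (2 * sigma * sigma)) / (sigma * math.sqrt(2 * math.pi))


def fisher_information(pdf, lo, hi, steps=20000, h=1e-5):
    """Approximate the integral of p'(x)^2 / p(x) over [lo, hi] with the
    midpoint rule, using a central difference for p'(x)."""
    dx = (hi - lo) / steps
    total = 0.0
    for k in range(steps):
        x = lo + (k + 0.5) * dx
        p = pdf(x)
        if p <= 0.0:
            continue  # skip points where the density underflows to zero
        dp = (pdf(x + h) - pdf(x - h)) / (2 * h)
        total += dp * dp / p * dx
    return total


sigma = 2.0
info = fisher_information(lambda x: gaussian_pdf(x, sigma), -10 * sigma, 10 * sigma)
mean_square_error = sigma ** 2  # error of the estimate that returns y itself

print(info, mean_square_error * info)
```

Larger noise gives smaller Fisher information, which matches its reading as a quality metric of the measurement procedure.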
It is necessary to remark that there is an approach in physics that is based on the assumption that information has to be the base for the whole of physics because physics is knowledge about the universe and all knowledge is acquired through information [18]. By this theory, the observer is included into the phenomenon of measurement. According to the classical quantum mechanical philosophy, the observer becomes a collector of data, which are influenced by the measurement. More radical ideas, which are stated, for example, by J.A. Wheeler [36], represent the observer as an activator of the physical phenomenon that gives rise to the data.

The Qualitative Information Theory
In the qualitative information theory [28], the following definition is given: information is a transformation of one communication of an information association into another communication of the same association. So, here the infological system IF(R) is some information association T, and information is a transformation of T. Consequently, we have the following result.
Proposition 3.4. The qualitative information theory is a subtheory of the general theory of information.

The Algorithmic Theory of Information
The principal aim of the algorithmic theory of information is the development of more realistic concepts of randomness and probability than those of the traditional theory of probability. This new approach was suggested in the works of three authors: Solomonoff [32], Kolmogorov [21], and Chaitin [13]. The main idea was that algorithms play a leading role in many processes, including human behavior. Consequently, the theory of algorithms has to be central in a study of such processes.
In the algorithmic information theory, there are two kinds of measures of information [32,21,12,5,7]. The first one is called the entropy, or information content, or complexity of finite constructive objects (like strings of symbols). Such an algorithmic measure of information is defined to be the number of bits (or symbols) needed to specify an object x so effectively that it can be constructed. This measure is called the Kolmogorov complexity of x and is denoted by K(x). By definition, this is the measure of such a transformation as the construction of some object. It is an internal measure of information that acts on such an infological system IF(R) as a thesaurus, or system of knowledge. Knowledge in this system is represented by strings of symbols (texts) that have meaning.
It is possible to define a more general measure of information of the first type. It is called a generalized Kolmogorov complexity measure [5,7]. Generalized Kolmogorov complexity measures reflect the amount of resources needed to construct the object in question. According to [5], generalized Kolmogorov complexity measures are dual to computational complexity measures.
To get the second algorithmic measure of information, we consider two texts or, simply, sequences of symbols x and y. Then in the algorithmic information theory, the second, relative measure of information is defined as information in y about x. This measure is given by the formula $I(y, x) = K(x) - K(x/y)$, where K(x) is the Kolmogorov complexity of x and K(x/y) is the Kolmogorov complexity of construction of x when y is given. The Kolmogorov complexity K(x) or other dual measures of complexity studied in [6] may also be treated as absolute measures of information that are necessary for constructing some object. So, information in y about x changes the system of algorithms that are used for the computation (or construction) of x. In this case, the infological system IF(R) that is changed by the information is the system of algorithms that compute (or construct) x. Consequently, we have the following result.
Proposition 3.5. The algorithmic information theory is a subtheory of the general theory of information, and I(y, x) is an internal measure of information.
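The relative measure $I(y, x) = K(x) - K(x/y)$ cannot be computed exactly, since Kolmogorov complexity is not computable. The sketch below therefore substitutes a loudly labeled assumption: compressed length under zlib stands in for K, and the extra compressed length of x after y stands in for K(x/y). It only illustrates the shape of the definition; all function names and sample texts are hypothetical.

```python
import random
import zlib


def c(data: bytes) -> int:
    """Compressed length in bytes: an assumed, computable proxy for K."""
    return len(zlib.compress(data, 9))


def conditional_c(x: bytes, y: bytes) -> int:
    """Proxy for K(x/y): extra bytes needed for x once y is already encoded."""
    return max(c(y + x) - c(y), 0)


def information_about(y: bytes, x: bytes) -> int:
    """Proxy for I(y, x) = K(x) - K(x/y)."""
    return c(x) - conditional_c(x, y)


def pseudo_random_text(seed: int, n: int = 2000) -> bytes:
    """Reproducible text with no regularity that zlib can exploit."""
    rng = random.Random(seed)
    return bytes(rng.getrandbits(8) for _ in range(n))


x = pseudo_random_text(0)
related = x                       # y contains everything needed to build x
unrelated = pseudo_random_text(1)  # y tells nothing useful about x

print(information_about(related, x), information_about(unrelated, x))
```

Under this proxy, a text y that already contains x carries almost all of the information needed to construct x, while an unrelated y carries almost none.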

The Pragmatic Theory of Information
The concept of pragmatic information was introduced by Ernst Ulrich von Weizsäcker [38] and further developed by von Lucadou [24,25], Kornwachs [22], Weinberger [37], and some others. This concept relies on two notions: firstness ("Erstmaligkeit"), or originality or novelty, and confirmation ("Bestätigung"), or redundancy with respect to what is already known. Weizsäcker suggests that a message that does nothing but confirm the prior knowledge of a receiver will not change its structure or behaviour. Thus, with confirmation up to 100%, a message gives no pragmatic information. On the other hand, a message providing only original/novel material completely unrelated to any prior knowledge also will not change the structure or behaviour of the receiver, because the receiver will not understand this message. Thus, with firstness up to 100%, a message gives no pragmatic information. Only a relevant mixture of firstness and confirmation allows the receiver to get pragmatic information from the message. Thus, pragmatic information, according to Weizsäcker, is related to changes of structure or behaviour of the receiver, and the theory of pragmatic information is a special case of the general theory of information. Taking a thesaurus, i.e., the system of knowledge, as an infological system, adding some principles and developing them further, we obtain the theory of pragmatic information.
The psychologist von Lucadou [24,25] applied pragmatic information to problems of psychology. He writes: "The model of pragmatic information (MPI), which is a candidate for a non-classical model of psychology, predicts that the behavior of a non-classical system depends on the conditions of its observation. The exchanged pragmatic information (meaningful information) ties the "observer" (e.g., a person) and the "observed" (e.g., a machine) together and creates an "organizational closed system." It is assumed that this process produces non-local correlations between the observer and the observed." In such a way, pragmatic information acts on the observed system. This entirely corresponds to the concept of information in the general theory of information.
However, the general theory of information allows us to go further and to consider two possibilities of change. In one case, the behavior of the system under observation changes. This corresponds to an action on the behavioral infological system. In the other case, only the information that we get from the system under observation changes. This corresponds to an action on the representational infological system.

Social Information
Goguen (1997) suggests studying social information first of all, because it is very important to the development of information processing systems. He introduces the following definition: An item of information is an interpretation of a configuration of signs for which members of some social group are accountable.
According to the general theory of information, a configuration of signs is a carrier of information, while interpretation as an action is the change caused in some social group.At the same time, interpretation as some text (symbolic configuration) is an element of the cognitive infological system of this group.

The Utility Theory of Information
In the utility theory of information [20], the measure of information is called the quality of information. It is defined for mission-oriented systems R. If I is some portion of information, then the quality of this information is equal to the change, caused by I, of the probability p(R, g) of achievement of a particular goal g by the system R. If we consider objective probability, then the corresponding infological system is the state space of the world in which the system R functions. If we consider subjective probability, then the corresponding infological system is the belief space in which probabilities for different events involving the system R are represented. In both cases, information appears as a change in the corresponding infological system. Consequently, we have the following result.
Proposition 3.6. The utility information theory is a subtheory of the general theory of information.
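The definition of the quality of information as a change of the probability p(R, g) can be illustrated by a toy scenario that is entirely hypothetical: a system R may open a limited number of boxes, one of which contains the item it needs (the goal g), and a message I rules out some of the boxes. The scenario and all names are assumptions made for this sketch and are not taken from [20].

```python
from fractions import Fraction


def goal_probability(candidates: int, attempts: int) -> Fraction:
    """p(R, g): chance of finding the item when R may open `attempts`
    boxes out of `candidates` equally likely ones."""
    return Fraction(min(attempts, candidates), candidates)


def quality_of_information(p_before: Fraction, p_after: Fraction) -> Fraction:
    """Quality of I: the change of p(R, g) caused by receiving I."""
    return p_after - p_before


# Before the message: 10 candidate boxes, 2 attempts.
p_before = goal_probability(10, 2)
# The message rules out 6 boxes, leaving 4 candidates.
p_after = goal_probability(4, 2)

quality = quality_of_information(p_before, p_after)
print(p_before, p_after, quality)  # 1/5 1/2 3/10
```

Under this reading the quality can also be zero or negative, for instance when a message is irrelevant to the goal or misleads the system.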

An Economic Theory of Information
The economic theory of information [26,27] appeared with the aim of representing economic aspects of information processes in society. It reflected some changes in the thinking habits of economists that have resulted in a broadening of the concept of 'economics'. As time advanced, problems of decision, information, and organization came to the center of many economic theories. The actions considered and their outcomes may, but need not, be inputs and outputs of quantifiable and marketable production factors and products, or their prices. Nowadays, broader 'optimization problems' are pigeonholed as belonging to 'operations research', 'management science', or 'systems analysis'. They occupy economists, engineers, and (to the extent that applied probabilities are involved) statisticians as well. As 'benefit-cost analysis', these tools are also applied to problems of social policy no less than to military or medical planning. There is a promise of cross-fertilization with the evolutionary theories of life science and anthropology. Moreover, the decision-theoretic approach has recently taken a foothold in the philosophy of science.
In the economic information theory, information x about the state of the environment is considered with respect to a person's action a. The person's profit u(a, x) is taken as a utility function. This makes it possible to consider the expectation $U_0 = \max_a E\, u(a, x)$ of the profit without knowing x as well as the expectation $U_1 = E \max_a u(a, x)$ of the profit when x is known. Then the value of information $v(I_x)$ about x is defined as $v(I_x) = U_1 - U_0$. Thus, information is also treated as a change in the system of knowledge that is the infological system IF(R) in this case. Consequently, we have the following result.
Proposition 3.7. The economic information theory is a subtheory of the general theory of information.
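The value of information $v(I_x) = U_1 - U_0$ can be worked out for a small decision problem. The sketch below assumes a hypothetical situation with two states of the environment (rain, dry), two actions (carrying an umbrella or not), and an invented profit table; none of these numbers come from [26,27].

```python
def value_of_information(utility, actions, prior):
    """Return (v, U0, U1) for a finite decision problem.

    U0 = max over a of E u(a, x): best expected profit without knowing x.
    U1 = E of max over a of u(a, x): expected profit when x is known.
    """
    u0 = max(sum(prior[x] * utility[(a, x)] for x in prior) for a in actions)
    u1 = sum(prior[x] * max(utility[(a, x)] for a in actions) for x in prior)
    return u1 - u0, u0, u1


# Hypothetical prior over states and profit table u(a, x).
prior = {"rain": 0.25, "dry": 0.75}
actions = ["umbrella", "no_umbrella"]
utility = {
    ("umbrella", "rain"): 8.0,
    ("umbrella", "dry"): 4.0,
    ("no_umbrella", "rain"): 0.0,
    ("no_umbrella", "dry"): 10.0,
}

v, u0, u1 = value_of_information(utility, actions, prior)
print(u0, u1, v)  # 7.5 9.5 2.0
```

Since the maximum of expectations never exceeds the expectation of maxima, the value computed this way is never negative.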
According to the principles of the general theory of information, while in Shannon's theory the quantity of information is an internal measure, the value of information in the sense of the economic information theory is an external measure of information.

A Dynamic Theory of Information
The principal aim of the development of the dynamic theory of information was the investigation of biological information and information processes in living systems [14]. In the dynamic information theory, the notion of information is taken as a basic one. Information is considered as a sequence of two operations: the choice of one alternative from a collection of possible alternatives and the saving of the chosen alternative. Thus, information is also treated as an action causing transformations. Consequently, we have the following result.
Proposition 3.8. The dynamic information theory is a subtheory of the general theory of information.

Conclusion
Thus, we have demonstrated that the general theory of information makes it possible to solve many open problems related to information and to achieve a new, profound understanding of information as a natural, technological, and social phenomenon. It provides for the systematization of all other approaches in information sciences, eliminating many of their shortcomings. In particular, a new, more adequate definition of information is obtained. In addition, it allows us to discover new types of information: cognitive, conditioning, and regulative information. Cognitive information gives knowledge and thus is what people know under the name of information. The two other types are new and help to understand many phenomena in system functioning. This discovery of new types of information makes it possible to determine the role of information in system functioning. Conditioning information is basic for any system from the perspective of its inner evolution. Regulative information is basic for any system from the perspective of its interaction with the environment. Cognitive information appears only on higher levels of system development. The role of cognitive information increases with the development of a system and becomes decisive on some level of this development.

Definition 4.1. Potential or perspective measures of information I determine (reflect) what changes (namely, their extent) may be caused by I in R.
Definition 4.2. Existential or synchronic measures of information I determine (reflect) what changes (namely, their extent) are going on in R during some fixed interval of time after receiving I. This interval of time may be considered as the present time.
Definition 4.3. Actual or retrospective measures of information I determine (reflect) what changes (namely, their extent) were actually caused by I in the system R.

Definition 4.4. An internal information measure reflects the extent of inner changes caused by I.
Examples are given by the change of the length (the extent) of a thesaurus.
Definition 4.5. An intermediate information measure reflects the extent of changes caused by I in the links between R and W.