From Data to Semantic Information

There is no consensus yet on the definition of semantic information. This paper contributes to the current debate by criticising and revising the Standard Definition of semantic Information (SDI) as meaningful data, in favour of the Dretske-Grice approach: meaningful and well-formed data constitute semantic information only if they also qualify as contingently truthful. After a brief introduction, SDI is criticised for providing necessary but insufficient conditions for the definition of semantic information. SDI is incorrect because truth-values do not supervene on semantic information, and misinformation (that is, false semantic information) is not a type of semantic information, but pseudo-information, that is not semantic information at all. This is shown by arguing that none of the reasons for interpreting misinformation as a type of semantic information is convincing, whilst there are compelling reasons to treat it as pseudo-information. As a consequence, SDI is revised to include a necessary truth-condition. The last section summarises the main results of the paper and indicates the important implications of the revised definition for the analysis of the deflationary theories of truth, the standard definition of knowledge and the classic, quantitative theory of semantic information.


Introduction
The concept of information has become central in most contemporary philosophy. 1 However, recent surveys have shown no consensus on a single, unified definition of semantic information. 2 This is hardly surprising. Information is such a powerful and elusive concept that, as an explicandum, it can be associated with several explanations, depending on the cluster of requirements and desiderata that orientate a theory. 3 Claude Shannon, for example, remarked that:
The word "information" has been given different meanings by various writers in the general field of information theory. It is likely that at least a number of these will prove sufficiently useful in certain applications to deserve further study and permanent recognition. It is hardly to be expected that a single concept of information would satisfactorily account for the numerous possible applications of this general field [54].
Polysemantic concepts such as information can be fruitfully defined only in relation to a well-specified context of application. Following this localist principle, in this paper only one crucial aspect of a specific type of information will be analysed, namely the alethic nature of declarative, objective and semantic (DOS) information (more on these qualifications in the next section). The question addressed is whether alethic values are supervenient 4 on DOS information, as presumed by the standard definition of information (SDI). The negative answer defended is that DOS information encapsulates "truthfulness", so that "true information" is simply redundant and "false information", i.e. misinformation, is merely pseudo-information. It follows that SDI needs to be revised by adding a necessary truth-condition. Three important implications of the revised definition are briefly discussed in the last section.

The Standard Definition of Information
Intuitively, "information" is often used to refer to non-mental, user-independent, declarative (i.e. alethically qualifiable), 5 semantic contents, embedded in physical implementations like databases, encyclopaedias, web sites, television programmes and so on [9], which can variously be produced, collected, accessed and processed. The Cambridge Dictionary of Philosophy, for example, defines information thus:
an objective (mind independent) entity. It can be generated or carried by messages (words, sentences) or by other products of cognizers (interpreters). Information can be encoded and transmitted, but the information would exist independently of its encoding or transmission.
The extensionalist analysis of this popular concept of DOS (declarative, objective and semantic) information is not immediately connected to levels of subjective uncertainty and ignorance, to probability distributions, to utility-functions for decision-making processes, or to the analysis of communication processes. So the corresponding mathematical 6 and pragmatic 7 senses in which one may speak of information are not relevant in this context and will be disregarded.
Over the last three decades, most analyses have supported a definition of DOS information in terms of data + meaning. This bipartite account has gained sufficient consensus to become an operational standard in fields that tend to deal with data and information as reified entities (consider, for example, the now common expression "data mining"), especially Information Science; Information Systems Theory, Methodology, Analysis and Design; Information (Systems) Management; Database Design; and Decision Theory. A selection of quotations from a variety of recent, influential texts illustrates the popularity of the bipartite account: 8
Information is data that has been processed into a form that is meaningful to the recipient [14].
Data is the raw material that is processed and refined to generate information [55].
Information equals data plus meaning [12].
Information is data that have been interpreted and understood by the recipient of the message.
Data will need to be interpreted or manipulated [to] become information [60].
More recently, the bipartite account has begun to influence the philosophy of computing and information as well (see for example [11,21,27,45]).
The practical utility of the bipartite account is indubitable. The question is whether it is rigorous enough to be applied in the context of an information-theoretical epistemology. We shall see that this is not the case, but before raising any criticism, we need a more rigorous formulation.

An Analysis of the Standard Definition of Information
Situation logic [5,35,17] provides a powerful methodology for our task. Let us use the symbol σ and the term "infon" to refer to discrete items of information, irrespective of their semiotic code and physical implementation: 9
SDI) σ is an instance of DOS information if and only if:
SDI.1) σ consists of n data (d), for n ≥ 1;
SDI.2) the data are well-formed (wfd);
SDI.3) the wfd are meaningful (mwfd = δ).
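The three clauses of SDI can be read as a simple conjunctive test. The following is only a minimal sketch of that reading, not part of the original analysis: the class name `Infon`, its fields and the function `satisfies_sdi` are hypothetical, and the two Boolean flags merely stand in for whatever syntactic and semantic checks a given context would supply.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Infon:
    """A candidate item of information (sigma). All names here are illustrative."""
    data: Tuple[str, ...]   # the n data (d) of which sigma consists
    well_formed: bool       # placeholder for a syntactic check (SDI.2)
    meaningful: bool        # placeholder for a semantic check (SDI.3)


def satisfies_sdi(sigma: Infon) -> bool:
    """Return True when sigma meets SDI.1, SDI.2 and SDI.3 together."""
    has_data = len(sigma.data) >= 1  # SDI.1: n >= 1, so no dataless information
    return has_data and sigma.well_formed and sigma.meaningful


# Three hypothetical candidates.
dataless = Infon(data=(), well_formed=True, meaningful=True)
single_datum = Infon(data=("the earth is round",), well_formed=True, meaningful=True)
noise = Infon(data=("xq#7",), well_formed=False, meaningful=False)
```

Note that nothing in `satisfies_sdi` inspects a truth value: on this reading the test is passed by any meaningful, well-formed data, whether true or false.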
According to SDI, σ = δ. Three comments are now in order. First, SDI.1 indicates that information cannot be dataless, but it does not specify which types of δ constitute information. Data can be of four types [21]:
δ.1) primary data. These are what we ordinarily mean by, and perceive as, the principal data stored in a database, e.g. a simple array of numbers, or the contents of books in a library. They are the data an information management system is generally designed to convey to the user in the first place;
δ.2) metadata. These are secondary indications about the nature of the primary data. They enable a database management system to fulfil its tasks by describing essential properties of the primary data, e.g. location, format, updating, availability, copyright restrictions, etc.;
δ.3) operational data. These are data regarding usage of the data themselves, the operations of the whole data system and the system's performance;
δ.4) derivative data. These are data that can be extracted from δ.1-δ.3, whenever the latter are used as sources in search of patterns, clues or inferential evidence, e.g. for comparative and quantitative analyses (ideometry).
At first sight, the typological neutrality (TN) implicit in SDI.1 may seem counterintuitive. A database query that returns no answer, for example, still provides some information, if only negative information; and silence is a meaningful act of communication, if minimalist. TN cannot be justified by arguing that absence of data is usually uninteresting, because similar pragmatic considerations are at least controversial, as shown by the previous two examples, and in any case irrelevant, since in this context the analysis concerns only objective semantic information, not interested information. 10 Rather, TN is justified by the following principle of data-types reduction (PDTR):
PDTR) σ consists of a non-empty set (D) of data δ; if D seems empty and σ still seems to qualify as information, then
1. the absence of δ is only apparent because of the occurrence of some negative primary δ, so that D is not really empty; or
2. the qualification of σ as information consisting of an empty set of δ is misleading, since what really qualifies as information is not σ itself but some non-primary information µ concerning σ, constituted by meaningful non-primary data δ.2-δ.4 about σ.
Consider the two examples above. If a database query provides an answer, it will provide at least a negative answer, e.g. "no documents found", so PDTR.1 applies. If the database provides no answer, either it fails to provide any data at all, in which case no specific information σ is available, or there is a way of monitoring or inferring the problems encountered by the database query to establish, for example, that it is running in a loop, in which case PDTR.2 applies. In the second example, silence could be negative information, e.g. as implicit assent or denial, or it could carry some non-primary information µ, e.g. the person has not heard the question.
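The database example can be laid out as a small case analysis. The sketch below is offered only as an illustration of how PDTR sorts the possible outcomes; the enumeration `Reduction`, the function `classify_query_outcome` and its two parameters are hypothetical, and `diagnostics` stands for any monitoring output from which the state of the query might be inferred.

```python
from enum import Enum
from typing import List, Optional


class Reduction(Enum):
    POSITIVE_PRIMARY = "positive primary data"
    NEGATIVE_PRIMARY = "negative primary data (PDTR.1)"
    NON_PRIMARY = "non-primary information mu about sigma (PDTR.2)"
    NO_INFORMATION = "no information available"


def classify_query_outcome(rows: Optional[List[str]],
                           diagnostics: Optional[str]) -> Reduction:
    """Classify a query outcome along the lines of PDTR.

    rows: the answer returned, or None if the database gave no answer at all.
    diagnostics: hypothetical monitoring output (e.g. "running in a loop"), or None.
    """
    if rows is not None:
        # An answer was given: positive data, or a negative answer
        # such as "no documents found" (an empty result).
        return Reduction.POSITIVE_PRIMARY if rows else Reduction.NEGATIVE_PRIMARY
    if diagnostics is not None:
        # No answer, but something can be inferred about the query itself.
        return Reduction.NON_PRIMARY
    # No answer and nothing to monitor: no specific information sigma.
    return Reduction.NO_INFORMATION
```

The case of silence follows the same pattern, with implicit assent or denial in place of the empty result and "has not heard the question" in place of the diagnostics.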
When apparent absence of δ is not reducible to the occurrence of negative primary δ, either there is no information or what becomes available and qualifies as information is some further non-primary information µ about σ, constituted by some non-primary δ.2-δ.4. Now, differences in the reduction both of the absence of positive primary δ to the presence of negative primary δ and of σ to µ (when D is truly empty) warrant that there can be more than one σ that may (misleadingly) appear to qualify as information and be equivalent to an apparently empty D. Not all silences are the same. However, since SDI.1 defines information in terms of δ, without any further restriction on the typological nature of the latter, it is sufficiently general to capture primary (positive or negative) δ.1 and non-primary data δ.2-δ.4 as well, and hence the corresponding special classes of information just introduced.
Second comment. According to SDI.1, σ can consist of only a single datum. Information is usually conveyed by large clusters of well-formed, codified data, often alphanumeric, which are heavily constrained syntactically and already very rich semantically. However, in its simplest form a datum can be reduced to just a lack of uniformity, i.e. a difference between the presence and the absence of e.g. a signal:
Dd) d = (x ≠ y).
The dependence of information on the occurrence of syntactically well-formed clusters, strings or patterns of data, and of data on the occurrence of physically implementable differences, explains why information can be decoupled from its physical support (consider the same text as a file on a floppy disk or as a printed text on paper). Interpretations of this support-independence can vary quite radically because Dd leaves underdetermined not only the logical type to which the relata belong (see TN), but also the classification of the relata (taxonomic neutrality) and the kind of support that may be required by the implementation of their inequality (ontological neutrality).
Consider the taxonomic neutrality (TaxN) first. A datum is usually classified as the entity exhibiting the anomaly, often because the latter is perceptually more conspicuous or less redundant than the background conditions. However, the relation of inequality is binary and symmetric. A white sheet of paper is not just the necessary background condition for the occurrence of a black dot as a datum, it is a constitutive part of the datum itself, together with the fundamental relation of inequality that couples it with the dot. Nothing is a datum per se. So being a datum is an external property, and SDI endorses the following thesis:
TaxN) a datum is a relational entity.
Understood as relational entities, data are constraining affordances, exploitable by a system as input of adequate queries that correctly semanticise them to produce information as output. In short, semantic information can also be described erotetically as data + queries [21].
Consider next the ontological neutrality (ON). By rejecting the possibility of dataless information, SDI endorses the following modest thesis:
ON) no information without data representation.
Following [38][39][40][41], ON is often interpreted materialistically, as advocating the impossibility of physically disembodied information, through the equation "representation = physical implementation":
S.1) no information without physical implementation.
S.1 is an inevitable assumption when working on the physics of computation, since computer science must necessarily take into account the physical properties and limits of the carriers of information. 11 It is also the ontological assumption behind the Physical Symbol System Hypothesis in AI and Cognitive Science [49]. However, ON does not specify whether, ultimately, the occurrence of every discrete state necessarily requires a material implementation of the data representations. Arguably, environments in which all entities, properties and processes are ultimately noetic (e.g. Berkeley, Spinoza), or in which the material or extended universe has a noetic or non-extended matrix as its ontological foundation (e.g. Pythagoras, Plato, Leibniz, Fichte, Hegel), seem perfectly capable of upholding ON without embracing S.1. The relata in Dd could be monads, for example. Indeed, the classic realism vs. antirealism debate can be reconstructed precisely in terms of the possible interpretations of ON.
All this explains why SDI is also consistent with two other popular slogans, this time favourable to the proto-physical nature of information and hence completely antithetic to S.1:
S.2) "It from bit. Otherwise put, every "it" - every particle, every field of force, even the spacetime continuum itself - derives its function, its meaning, its very existence entirely - even if in some contexts indirectly - from the apparatus-elicited answers to yes-or-no questions, binary choices, bits. "It from bit" symbolizes the idea that every item of the physical world has at bottom - a very deep bottom, in most instances - an immaterial source and explanation; that which we call reality arises in the last analysis from the posing of yes-no questions and the registering of equipment-evoked responses; in short, that all things physical are information-theoretic in origin and that this is a participatory universe." [63].
and
S.3) "[information is] a name for the content of what is exchanged with the outer world as we adjust to it, and make our adjustment felt upon it." [65]. "Information is information, not matter or energy. No materialism which does not admit this can survive at the present day" [66].
S.2 endorses an information-theoretic, metaphysical monism: the universe's essential nature is digital, being fundamentally composed of information as data instead of matter or energy, with material objects as a complex secondary manifestation. 12 S.2 may, but does not have to, endorse a computational view of information processes. S.3 advocates a more pluralistic approach along similar lines. Both are compatible with SDI.
The third and final comment concerns SDI.3 and can be introduced by discussing a fourth slogan:
S.4) "In fact, what we mean by information - the elementary unit of information - is a difference which makes a difference" [6].
S.4 is one of the earliest and most popular formulations of SDI (see for example [27,11]; note that the formulation in MacKay [1969], "information is a distinction that makes a difference", predates Bateson's and, although less memorable, is more accurate). A "difference" is just a discrete state, i.e. a datum, and "making a difference" simply means that the datum is "meaningful", at least potentially. How data can come to have an assigned meaning and function in a semiotic system in the first place is one of the hardest problems in semantics. Luckily, the semanticisation of data need not detain us here because SDI.3 only requires the δ to be provided with a semantics already. The point in question is not how but whether data constituting semantic information can be correctly described as being meaningful independently of an informee. The genetic neutrality (GN) supported by SDI states that:
GN) δ can have a semantics independently of any informee.
Before the discovery of the Rosetta Stone, Egyptian hieroglyphics were already regarded as information, even if their semantics was beyond the comprehension of any interpreter. The discovery of an interface between Greek and Egyptian did not affect the hieroglyphics' embedded semantics but its accessibility. This is the weak, conditional-counterfactual sense in which SDI.3 can speak of meaningful data being embedded in an information-carrier informee-independently. GN supports the possibility of information without an informed subject, to adapt a Popperian phrase. Meaning is not (at least not only) in the mind of the user. GN is to be distinguished from the stronger, realist thesis, supported for example by [19], according to which data could also have their own semantics independently of an intelligent producer/informer. This is also known as environmental information, and a typical example is supposed to be provided by the concentric rings visible in the wood of a cut tree trunk, which may be used to estimate the age of the plant.
To summarise, insofar as SDI provides necessary conditions for σ to qualify as DOS information, it also endorses four types of neutrality: TN, TaxN, ON and GN. These features represent an obvious advantage, as they make SDI perfectly scalable to more complex cases, and hence reasonably flexible in terms of applicability. However, by specifying that SDI.1-SDI.3 are also sufficient conditions, SDI further endorses a fifth type of neutrality, alethic neutrality (AN), which turns out to be problematic. Let us see why.

Alethic neutrality
According to SDI, alethic values are not embedded in, but supervene on semantic information:
AN) meaningful and well-formed data qualify as information, no matter whether they represent or convey a truth or a falsehood or have no alethic value at all.
It follows that 13
FI) false information (including contradictions), i.e. misinformation, is a genuine type of DOS information, not pseudo-information;
TA) tautologies qualify as information; and
TI) "it is true that σ" where σ is a variable that can be replaced by any instance of genuine DOS information, is not a redundant expression; for example, "it is true" in the conjunction "'the earth is round' qualifies as information and it is true" cannot be eliminated without semantic loss.
None of these consequences is ultimately defensible, and their rejection forces a revision of AN and hence of SDI. For the sake of simplicity, in the rest of this article only the rejection of FI and TA will be pursued, following two strategies. The first consists in showing that none of the main reasons that could be adduced for interpreting false information as a type of information is convincing. This strategy is pursued in section four. The second strategy consists in showing that there are compelling reasons to treat false and tautological information as pseudo-information. This is argued in section five. Further arguments against TI could be formulated on the basis of the literature on deflationary theories of truth [34,36,50]. These arguments are not going to be rehearsed here because the development of this strategy, which has interesting consequences for the deflationary theories themselves, deserves an independent analysis that lies beyond the scope of this paper. I shall return to the issue in the conclusion, but only to clarify what may be expected from this line of reasoning.

Nine bad reasons to think that false information is a type of semantic information
Linguistically, the expression "false information" is common and perfectly acceptable. What is meant by it is often less clear, though. The American legislation on food disparagement provides an enlightening example.
Food disparagement is legally defined in the US as the wilful or malicious dissemination to the public, in any manner, of false information that a perishable food product or commodity is not safe for human consumption. "False information" is then defined, rather vaguely, as "information not based on reasonable and reliable scientific inquiry, facts, or data" (Ohio legislation, http://www.ohiocitizen.org/campaigns/pesticides/veglibel.html); "information that is not based on verifiable fact or on reliable scientific data or evidence" (Vermont legislation, http://www.leg.state.vt.us/docs/2000/bills/intro/h-190.htm); "information which is not based on reliable, scientific facts and reliable scientific data which the disseminator knows or should have known to be false" (Arkansas legislation, http://www.arkleg.state.ar.us/ftproot/bills/1999/htm/hb1938.htm).
In each case, false information is defined in the same way in which one could define a rotten apple, i.e. as if it were a "bad" type of information, vitiated by some shortcoming. Why? Suppose that there are going to be exactly two guests for dinner tonight, one of whom is in fact vegetarian. This is our situation S. Let the false information about S be FI = "(A) there will be exactly three guests for dinner tonight and (B) one of them is vegetarian". One may wish to argue, for a number of reasons, that FI is not mere pseudo-information but a certain type of information that happens to be false. Yet even the strongest of these reasons are not convincing enough, and this is why:
FI.1) FI can include genuine information.
Objection: this merely shows that FI is a compound in which only the true component B qualifies as information.
FI.2) FI can entail genuine information.
Objection: even if one correctly infers only some semantically relevant and true information TI from FI, e.g. that "there will be more than one guest", what now counts as information is the inferred true consequence TI, not FI.
FI.3) FI can still be genuinely informative, if only indirectly.
Objection: this is vague, but it can be reduced to the precise concept of non-primary information µ discussed in section two. For example, FI may be coupled to some true metainformation M that the source of FI is not fully reliable. What now counts as information is the true M, not the false FI.
FI.4) FI can support decision-making processes.
Objection: one could certainly cook enough food on the basis of FI but this is only accidental. The actual situation S may be represented by a wedding dinner for a hundred people. That is why FI fails to qualify as information. However, FI.4 clarifies that, if FI is embedded in a context in which there is enough genuine metainformation about its margins of error, then FI can be epistemically preferable to, because more useful than, both a false FI₁, e.g. "there will be only one guest for dinner", and a true but too vacuous FI₂, e.g. "there will be fewer than a thousand guests for dinner". What this shows is not (i) that false information is an alethically qualified type of genuine information, but that (ii) false information can still be pragmatically interesting (in the technical sense of the expression, see section two), because sources of information are usually supposed to be truth-oriented or truth-tracking by default (i.e. if they are mistaken, they are initially supposed to be so only accidentally and minimally), and that (iii) logically, an analysis of the information content of σ must take into account the level of approximation of σ to its reference, both when σ is true and when it is false.
FI.5) FI is meaningful and has the same logical structure as genuine information.
Objection: this is simply misleading. Consider the following FI: "One day we shall discover the biggest of all natural numbers". Being necessarily false, this can hardly qualify as genuine but false information. It can only provide some genuine, non-primary information µ, e.g. about the mathematical naivety of the source. In the same sense in which hieroglyphics could qualify as information even when they were not yet interpretable, vice versa, an infon σ does not qualify as information just because it is interpretable. This point is further discussed in section five.
FI.6) FI could have been genuine information had the relevant situation been different. Perhaps the difficulty seen in FI.5 is caused by the necessary falsehood of the example discussed. Meaningful and well-formed data that are only contingently false represent a different case and could still qualify as a type of information. It only happens that there will be fewer guests than predicted by FI.
Objection: this only shows that we are ready to treat FI as quasi-information in a hypothetical-counterfactual sense, which is just to say that, if S had been different then FI would have been true and hence it would have qualified as information. Since S is not, FI does not. FI need not necessarily be pseudo-information. It may be so contingently. This point too is further discussed in section five.
FI.7) If FI does not count as information, what is it? Assuming that p is false "if S only thinks he or she has information that p, then what does S really have? Another cognitive category beyond information or knowledge would be necessary to answer this question. But another cognitive category is not required because we already have language that covers the situation: S only thinks he or she has knowledge that p, and actually has only information that p." [13].
Objection: first, a new cognitive category could be invented, if required; second, there is actually another cognitive category, that of well-formed and meaningful data, which, when false, constitute misinformation, not a type of information. Third, the difference between being informed that p and knowing that p is that, in the latter case, S is supposed to be able to provide, among other things, a reasonable and appropriate (possibly non-Gettierisable) account of why p is the case. The student Q who can recall and state a mathematical theorem p but has no understanding of p or can provide no further justification for p, can be said to be informed that p without having knowledge that p. But if the mathematical theorem is not correct (if p is false), it must be concluded that Q is misinformed (i.e. not informed) that p (Q does not have any information about the theorem). It is perfectly possible, but strikes one as imprecise and conceptually unsatisfactory, to reply that Q is informed that p (Q does have some information about the theorem) and p is false.
FI.8) We constantly speak of FI. Rejecting FI as information means denying the obvious fact that there is plenty of information in the world that is not true.
Objection: insofar as DOS information is concerned, this is a non sequitur. Denying that FI counts as a kind of information is not equivalent to denying that FI is a common phenomenon; it is equivalent to denying that a false policeman, who can perfectly well exist, counts as a kind of policeman at all. We shall see this better in the next section. Here it is sufficient to acknowledge that ordinary uses of technical words may be too generic and idiosyncratic, if not incorrect, to provide conceptual guidelines.
FI.9) "'x misinforms y that p' entails that ¬p but 'x informs y that p' does not entail that p [and since] … we may be expected to be justified in extending many of our conclusions about 'inform' to conclusions about 'information' [it follows that] … informing does not require truth, and information need not be true; but misinforming requires falsehood, and misinformation must be false." [26].
Objection: the principle of "exportation" (from information as process to information as content) is more than questionable, but suppose it is accepted; misinforming becomes now a way of informing and misinformation a type of information. All this is as odd as considering lying a way of telling the truth about something else and a contingent falsehood a type of truth on a different topic. The interpretation becomes perfectly justified, however, if informing/information is used to mean, more generically, communicating/communication, since the latter does not entail any particular truth value. But then compare the difference between: (a) "Q is told that p" and (b) "Q is informed that p", where in both cases p is a contradiction. (a) does not have to entail that p is true and hence it is perfectly acceptable, but (b) is more ambiguous. It can be read as meaning just "Q was made to believe that p", in which case information is treated as synonymous with (a form of) communication (this includes teaching, indoctrination, brain-washing etc.), as presumed by FI.9. But more likely, one would rephrase it and say that (b) means "Q is misinformed that p" precisely because p is necessarily false, thus implying that it makes little sense to interpret (b) as meaning "S has the information that p" because a contradiction can hardly qualify as information (more on this in section five) and being informed, strictly speaking, entails truth.
In conclusion, there seems to be no good reason to treat false information as a type of information. This negative line of reasoning, however, may still be unconvincing. We need more "constructive" arguments showing that false information is pseudo-information. This is the task of the next section.

Two good reasons to believe that false information is pseudo-information
The first positive argument is a test based on a conceptual clarification. The clarification is this. The confusion about the nature of false information seems to be generated by a misleading analogy. The most typical cases of misinformation are false propositions and incorrect data. Now a false proposition is still a proposition, even if it is further qualified as not being true. The same holds true for incorrect data. Likewise, one may think that misinformation is still a type of information, although it happens not to be true. The logical confusion here is between attributive and predicative uses of "false". The distinction was already known to medieval logicians, was revived by Geach [30] and requires a further refinement before being applied as a test to argue that "false information" is pseudo-information.
Take two adjectives like "male" and "good". A male constable is a person who is both male and employed as a policeman. A good constable, however, is not a good person who is also employed as a member of the police force, but rather a person who performs all the duties of a constable well. "Male" is being used as a predicative adjective, whereas "good" modifies "constable" and is being used as an attributive adjective. On this distinction we can build the following test: if an adjective in a compound is attributive, the latter cannot be split up. This property of indivisibility means that we cannot safely predicate of an attributively-modified x what we predicate of an x. So far Geach. We now need to introduce two further refinements. First, pace Geach, at least some adjectives can be used attributively or predicatively depending on the context, rather than necessarily being classified as either attributive or predicative intrinsically. Second, the attributive use can be either positive or negative. Positive attributively-used adjectives further qualify their reference x as y. "Good constable" is a clear example. Negative, attributively-used adjectives negate one or more of the qualities necessary for x to be x. They can be treated as logically equivalent to "not". For example, a false constable (attributive use) is clearly not a specific type of constable, but not a constable at all (negative use), although the person pretending to be a constable may successfully perform all the duties of a genuine constable (this further explains FI.4 above). The same holds true for other examples such as "forged banknote", "counterfeit signature", "false alarm" and so on. They are all instances of a correct answer "no, it is an F(x)" to the type-question "is this a genuine x?".
Let us now return to the problem raised by the analogy between a false proposition and false information.When we say that p, e.g."the earth has two moons", is false, we are using "false" predicatively.The test is that the compound can be split into "p is a proposition" and "p is a contingent falsehood" without any semantic loss or confusion.On the contrary, when we describe p as false information, we are using "false" attributively, to negate the fact that p qualifies as information at all.Why?Because "false information" does not pass the test.As in the case of the false constable, the compound cannot be correctly split: it is not the case, and hence it would be a mistake or an act of misinformation to assert, that p constitutes information about the number of natural satellites orbiting around the earth and is also a falsehood.Compare this case to the one in which we qualify σ as digital information, which obviously splits into "σ is information" and "σ is digital".If false information were a genuine type of information it should pass the splitting test.It does not, so it is not.
The second argument is semantic and more technical but its gist can be outlined in rather simple terms (details are provided in the appendix).If false information does not count as semantic junk but as a type of information, it becomes difficult to make sense of the ordinary phenomenon of semantic erosion.Operators like "not" lose their semantic power to corrupt information, information becomes semantically indestructible and the informative content of a repository can decrease only by physical and syntactical manipulation of data.This is utterly implausible, even if not logically impossible.We know that the cloth of semantic information can be, and indeed is, often undone by equally semantic means.When false information is treated as semantic information, what may be under discussion is only a purely quantitative or syntactic concept of information, that is meaningful and well-formed data, not DOS information.

The standard definition of information revised
Well-formed and meaningful data may be of poor quality. Data that are incorrect (somehow vitiated by errors or inconsistencies), imprecise (understanding precision as a measure of the repeatability of the collected data) or inaccurate (accuracy refers to how close the average data value is to the "true" value) are still data and they are often recoverable, but, if they are not truthful, they can only constitute misinformation (let me repeat: they can be informative, but only indirectly or derivatively, for example they can be informative about the unreliability of a source, but this is not the issue here). We have seen that misinformation (false information) has turned out to be not a type of information but rather pseudo-information. This is the Dretske-Grice approach (other philosophers who support a truth-based definition of semantic information are [5,31]):
[…] false information and mis-information are not kinds of information - any more than decoy ducks and rubber ducks are kinds of ducks [19].
False information is not an inferior kind of information; it just is not information [32].
Like "truth" in the expression "theory of truth", "information" can be used as a synecdoche to refer both to "information" and to "misinformation". "False information" is like "false evidence": it is not an oxymoron, but a way of specifying that the contents in question do not conform to the situation they purport to map. This is why, strictly speaking, to exchange (receive, sell, buy, etc.) false DOS information about e.g. the number of moons orbiting around the earth, is to exchange (receive, sell, buy, etc.) no DOS information at all about x, only meaningful and well-formed data, that is semantic content. Since syntactical well-formedness and meaningfulness are necessary but insufficient conditions for information, it follows that SDI needs to be modified, to include a fourth condition about the positive alethic nature of the data in question:
RSDI) σ is an instance of DOS information if and only if:
1. σ consists of n data (d), for n ≥ 1;
2. the data are well-formed (wfd);
3. the wfd are meaningful (mwfd = δ);
4. the δ are truthful.
"Truthful" is used here as synonymous with "true", to mean "representing or conveying true contents about the referred situation". It is preferable to speak of "truthful data" rather than "true data" because (a) the data in question may not be linguistic and a map, for example, is truthful rather than true; and (b) we have seen that "true data" may give rise to a confusion, as if one were stressing the genuine nature of the data in question, not their positive alethic value. 14
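The difference between SDI and RSDI can be made concrete with a minimal Python sketch. It is only an illustration under stated assumptions: the names `Checks`, `is_semantic_content`, `is_dos_information`, `WORLD` and `TOY_CHECKS` are hypothetical, and the three toy tests merely stand in for whatever account of well-formedness, meaningfulness and truthfulness a theory actually adopts. The paper itself defines none of them operationally.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class Checks:
    """Hypothetical, domain-specific tests; none is specified in the paper."""
    well_formed: Callable[[Sequence[object]], bool]
    meaningful: Callable[[Sequence[object]], bool]
    truthful: Callable[[Sequence[object]], bool]


def is_semantic_content(data: Sequence[object], checks: Checks) -> bool:
    """SDI, conditions 1-3: n >= 1 data that are well-formed and meaningful."""
    return len(data) >= 1 and checks.well_formed(data) and checks.meaningful(data)


def is_dos_information(data: Sequence[object], checks: Checks) -> bool:
    """RSDI: the SDI conditions plus condition 4, truthfulness."""
    return is_semantic_content(data, checks) and checks.truthful(data)


# Toy situation (an assumption of this sketch): the earth has one moon.
WORLD = {("earth", "moons"): 1}

TOY_CHECKS = Checks(
    # Toy syntax: a (subject, attribute, integer value) triple.
    well_formed=lambda d: len(d) == 3
    and isinstance(d[0], str)
    and isinstance(d[1], str)
    and isinstance(d[2], int),
    # Toy semantics: subject and attribute belong to the known vocabulary.
    meaningful=lambda d: (d[0], d[1]) in WORLD,
    # Toy alethic test: the value matches the situation being described.
    truthful=lambda d: WORLD.get((d[0], d[1])) == d[2],
)

two_moons = ("earth", "moons", 2)
one_moon = ("earth", "moons", 1)

print(is_semantic_content(two_moons, TOY_CHECKS))  # semantic content under SDI
print(is_dos_information(two_moons, TOY_CHECKS))   # not information under RSDI
print(is_dos_information(one_moon, TOY_CHECKS))    # information under RSDI
```

On this toy reading, "the earth has two moons" satisfies conditions 1-3 and so counts as semantic content, but it fails condition 4 and therefore does not count as DOS information.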

Conclusion: summary of results and future developments
We ordinarily speak of false information when what we mean is misinformation, i.e. no information at all. Whales are not fish just because one may conceptually think so. The goal of this paper has been to clarify the confusion. The standard definition of DOS information (SDI) provides necessary but insufficient conditions for the qualification of data as information. The definition has been modified to take into account the fact that information encapsulates truthfulness. The new version of the definition (RSDI) now describes DOS information as well-formed, meaningful and truthful data. In the course of the analysis, the paper has provided an explanation and refinement of the three necessary conditions established by SDI; an analysis of the concept of data; a clarification of four popular interpretations of SDI; and a revision of some basic principles and requirements that underpin any theory of semantic information. Three important implications of RSDI are:
1) a critique of the deflationary theories of truth (DTT). From RSDI, it follows that one could accept deflationary arguments as perfectly correct while rejecting the explanatory adequacy of DTT. "It is true that" in "it is true that σ" is redundant because there cannot be semantic information that is not true, but DTT could mistake this linguistic or conceptual redundancy for unqualified dispensability. "It is true that" is redundant precisely because, strictly speaking, information is not a truth-bearer but already encapsulates truth as truthfulness. Thus, DTT may be satisfactory as theories of truth-ascriptions while being inadequate as theories of truthfulness [36].
2) the analysis of the standard definition of knowledge as justified and true belief in light of a "continuum" hypothesis that knowledge encapsulates truth because it encapsulates DOS information [1].
3) the development of a quantitative theory of semantic information based on truth-values and degrees of discrepancy of σ with respect to a given situation rather than probability distributions [3,25].
The development of these three consequences has been left to a second stage of this research. 15
extension of the concept of information is accepted by every theory of semantic information. If it is the only implemented restriction, P.1-P.5 support the following model:
M.1 Alethic restriction on the extension of the concept of information.
I.ii. |= ∀x (¬T(x) → (H(x) > 0))
it follows that, ∀x∀y 1 would already require a modification of SDI because of I.i. Is it acceptable? Let S be a finite sequence ∀x (x), ∀x∀y (x, y), … ∀x∀y…∀z (x, y, … z)) in which each member is a sum obtained following M.1, and let each member of S be labelled progressively m_1, m_2, … m_n, then from M.1 it follows that:
I.9. S → H(m_1) ≤ H(m_2) ≤ … ≤ H(m_n)
I.10. ∀m_x ∈ S ((x < y) → (P(H(m_y) > H(m_x)) > P(H(m_y) = H(m_x))))
Both I.9 and I.10 are utterly implausible. Although by definition (see P.2) M.1 supports only zero-order Markov chains, I.9 generates only increasing monotonic sequences, despite the random choice of the members. In other words, informative contents cannot decrease in time unless data are physically damaged. I.10 is an anti-redundancy clause. It further qualifies I.9 by indicating that, through a random choice of well-formed and meaningful data, informative content is more likely to increase than to remain equal, given that redundancy is possible only when tautologies are in question and, trivially, whenever x = y. In fact, according to M.1, it is almost impossible not to increase informative contents. It is sufficient to include more and more contradictions following I.4, for example. This is a simplified version of the Bar-Hillel-Carnap Semantic Paradox, according to which "a self-contradictory sentence, hence one which no ideal receiver would accept, is regarded as carrying with it the most inclusive information" [3]. Of course, all this is just too good to be true.
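The behaviour attributed to M.1 can be illustrated with a small Python sketch. It rests on assumptions not found in the text above: H is modelled by the classic content measure cont(s) = 1 - m(s), with m a uniform measure over the state descriptions of a two-atom language, and the "sum" of informative contents is modelled as conjunction. The names `ATOMS`, `WORLDS`, `cont` and `conj` are hypothetical and belong only to this illustration.

```python
from itertools import product

# Toy language with two atomic sentences; each "world" is one state description.
ATOMS = ("p", "q")
WORLDS = [dict(zip(ATOMS, values))
          for values in product((True, False), repeat=len(ATOMS))]


def cont(sentence):
    """Share of state descriptions excluded by `sentence` (cont = 1 - m)."""
    excluded = sum(1 for world in WORLDS if not sentence(world))
    return excluded / len(WORLDS)


def conj(*sentences):
    """Conjunction, used here as a stand-in for 'summing' informative contents."""
    return lambda world: all(s(world) for s in sentences)


p = lambda world: world["p"]
q = lambda world: world["q"]
not_p = lambda world: not world["p"]
tautology = lambda world: world["p"] or not world["p"]

# A growing repository: each member adds one more conjunct to the previous one.
chain = [tautology, p, conj(p, q), conj(p, q, not_p)]
scores = [cont(s) for s in chain]
print(scores)  # [0.0, 0.5, 0.75, 1.0]
```

Under these assumptions the sequence never decreases, in line with I.9, and the inconsistent last member receives the maximal score, which is the paradoxical result described in the quotation from [3].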
The fact that M.1 turns out to be counterintuitive does not prove that it is unacceptable, but it does show that it is at least in need of substantial improvements. The problem with M.1 is that I.i is insufficient and its concept of information is still too inflated. The model should allow informative contents to decrease without being physically damaged, and it should account for the commonly experienced difficulties encountered when trying to enrich an information repository. These two basic requirements can be formulated thus:
R.1 ¬(S → H(m_1) ≤ H(m_2) ≤ H(m_3) …)
R.2.a ∀m ∈ S ((x < y) → (P(H(m_y) > H(m_x)) < P(H(m_y) ≤ H(m_x)))) (weaker)
R.2.b ∀m ∈ S ((x < y) → (P(H(m_y) ≥ H(m_x)) < P(H(m_y) < H(m_x)))) (stronger)
R.1 indicates that, by summing different informative contents, the result could be H(input) ≥ H(output). R.2 further qualifies R.1 entropically: ceteris paribus (e.g. given the same amount of computational and intellectual resources involved in the random production of informative contents), experience shows that informative contents are comparatively more likely to decrease or remain unmodified (depending on how strongly R.2 is interpreted) than to increase. The implementation of R.1 and at least R.
As one would expect, now informative content can decrease (II.6) and, if it increases, it does so much less easily than in M.1. H(x) can decrease because II.i specifies, quite reasonably, that F(x) → H(x) = 0, from which it follows that two infons with H > 0 may generate an inconsistent output whose H = 0 (II.6). Is M.2 acceptable? It certainly seems to be the model that most defenders of SDI have in mind. They accept that tautological and inconsistent contents do not qualify as information because they are in principle not informative, but they are still inclined to argue that contingently false contents should count as information because they could be informative about what is possible, although not about what is actual. The argument has been already criticised in section 4, and in this context we can concentrate on one of its consequences that represents a substantial limitation of M.2. The model satisfies R.2.a/b only partially because it cannot fully account for the ordinary phenomenon of semantic loss of informative content. In M.2, informative content can be lost only physically or syntactically. This is surprising. Imagine that the last extant manuscript of an ancient work tells us that "Sextus Empiricus died in 201 AD, when Simplicius went to Rome". Suppose this is true. This informative content could be lost if the manuscript is burnt, if it is badly copied so that the letters/words are irreversibly shuffled, the names swapped, but also if some inconsistent or false information is added, if the meaning is changed, or a negation added. However, according to M.2, no loss of informative content would occur if the copyist were to write "Sextus Empiricus was alive in
W. M. Meijers and his students in a series of lectures about the philosophical aspects of information at the Delft University of Technology, The Netherlands, and I am very grateful to him and those who attended the lectures for their detailed comments.