This article is an open-access article distributed under the terms and conditions of the Creative Commons Attribution license (

What is information? Although researchers have used the construct of information liberally to refer to pertinent forms of domain-specific knowledge, relatively few have attempted to generalize and standardize the construct. Shannon and Weaver (1949) offered the best-known attempt at a quantitative generalization in terms of the number of discriminable symbols required to communicate the state of an uncertain event. This idea, although useful, does not capture the role that structural context and complexity play in the process of understanding an event as being informative. In what follows, we discuss the limitations and futility of any generalization (and particularly, Shannon’s) that is not based on the way that agents extract patterns from their environment. More specifically, we shall argue that agent concept acquisition, and not the communication of states of uncertainty, lies at the heart of generalized information, and that the best way of characterizing information is via the relative gain or loss in concept complexity that is experienced when a set of known entities (regardless of their nature or domain of origin) changes. We show that Representational Information Theory (RIT) perfectly captures this crucial aspect of information and conclude with the first generalization of RIT to continuous domains.

What is information? Why is it a useful construct in science? What is the best way to measure its quantity and quality? Although these questions continue to stir profound debate in the scientific community [

Naïve informationalism offers a tenable explanation as to why scientists from a wide range of disciplines, from physics to psychology and from biology to computer science, use the term “information” to refer to the specific types of knowledge that characterize their particular domain of research. For example, a data analyst may be interested in the way that data are stored in a computing device, but has no interest in the molecular interactions of a physical system: such molecular activity is not relevant to the problems and questions of interest in the field. One could say that, to the data analyst, the entities of interest are data. Accordingly, in the field of data analysis, the terms “information” and “data” are often used interchangeably to refer to the kinds of things that a computing device is capable of storing and operating on. Similarly, for some types of physicists, the quantum states of a physical system during a certain time window comprise information. A behavioral psychologist, on the other hand, may be interested in the behaviors of rats in a maze; indeed, to a behavioral psychologist, the objects of information are these behaviors. In contrast, a geneticist may find such behaviors quite tangential to his discipline. Instead, to the geneticist, knowledge about the genome of the rat is far more fundamental, and symbol sequences (e.g., of nucleotides) may be a more useful way of generalizing and thinking about the basic objects of information. All of these examples support the idea that there are as many types of information as there are domains of human knowledge [

In spite of these domain-specific notions of information, some scientists of the late 19th and early 20th centuries attempted to provide more general definitions of information. These attempts were often motivated, again, by developments in domain-specific knowledge. For example, in the field of electrical engineering, the invention of electrical technologies such as the telegraph, telephone, and radar set the stage for a key definition of information that has influenced nearly all that have come after it. The electrical engineer Ralph Hartley proposed that information could be understood as a principle of individuation [

To implement his individuation principle, Hartley proposed that the amount of information associated with any finite set of entities could be understood as a function of the size of the set. If the size of the set is a simple measure of raw information, then the “true” amount of information is given by a function that is able to specify the length of the

For example, if we wish to compute the information content of the set S = {airplane, bus, automobile, dog}, then Equation (1) gives
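Assuming Equation (1) is the standard Hartley measure (the base-2 logarithm of set cardinality), the computation for this example can be sketched as follows:

```python
from math import log2

def hartley_information(s):
    """Hartley's raw-information measure: log2 of the set's size, in bits."""
    return log2(len(s))

S = {"airplane", "bus", "automobile", "dog"}
print(hartley_information(S))  # 4 discriminable elements -> 2.0 bits
```

Two bits suffice because two successive binary discriminations individuate any one element among four.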

At this juncture we should clarify why the logarithmic characterization of information has achieved such an eminent status beyond simply a way of operationalizing an individuation principle. The logarithmic function embodies three desirable or ideal specifications (

The idea that information can be measured using the notion of individuation, although simple, had profound implications for subsequent measures. In fact, it was the applicability of this same notion to the probability of the value of a random variable that was the basis of Shannon’s formulation [

In the next sections we shall argue that neither way of measuring raw information reflects the true nature of information. One reason, discussed by Luce [

A second way of interpreting Hartley’s measure assumes an alternative notion of information based on the uncertainty of an event. More specifically, if we sample an element from the finite set

Shannon’s information measure appeals to our psychological intuitions about the nature of information if interpreted as meaning that the more improbable an event is, the more informative it is because its occurrence is more surprising. To explain, let
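On this reading, the self-information (surprisal) of an outcome with probability p is −log2 p, so rarer outcomes carry more bits; a minimal sketch of the inverse relation:

```python
from math import log2

def surprisal(p):
    """Shannon self-information in bits: the rarer the outcome, the more bits."""
    return -log2(p)

# Each halving of the probability adds exactly one bit of surprisal:
print(surprisal(0.5), surprisal(0.25), surprisal(0.125))  # 1.0 2.0 3.0
```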

The general idea behind the basic measure proposed in SWIT is, in principle, very close to Hartley’s except that the carriers of information are now events instead of sets of entities. The primary measure of (raw) information is now the probability of a random variable and not the cardinality of a set as in Hartley’s measure. The framework attains its generality from the fact that many situations in nature can be described in terms of events and the probability distributions of their associated random variables (

With respect to the second axiom, the equivalent for Shannon information is that as the probability of an event increases, the amount of information associated with it decreases. This inverse relation is fully accounted for by the fact that probabilities are quantities in the

SWIT was not intended to characterize human intuitions about the meaning of information, nor was it intended to predict judgments about what humans deem informative. In fact, Shannon warned researchers against such misapplications, for they fall outside the scope of his theory, which was meant to characterize communication between devices, not to provide a definitive answer as to how information should be measured. In spite of this, psychologists and cognitive scientists have applied Shannon’s information measure (SIM) to interpret the limits of various human cognitive capacities, such as short-term memory storage capacity [

In addition to these criticisms based on experimental results, the idea that uncertainty underlies informativeness does not agree with human intuitions about informativeness. Indeed, these intuitions seem to violate the inverse relation principle proposed by Barwise and Seligman (1997), which says that the rarer a piece of information, the more informative it is. To begin with, the meaning and quality of any information conveyed play a role that is independent of how surprising an event may be due to its rarity or infrequency. For example, consider an observer who finds out that a bus ran through her best friend’s house the previous night without causing any bodily harm to its residents. This event may seem highly improbable and thus very surprising to the observer. Indeed, by Shannon’s information measure, the event should be very informative. However, if instead the same observer were told that her best friend had suffered a stroke, this far more probable event would likely be perceived as more informative because of the long chain of impactful and vivid concepts “awakening” in the observer: for example, cancelling work to rush to the hospital, notifying relatives, spending quality time with her sick friend,

To further illustrate that a measure of information grounded on uncertainty cannot account for situations grounded on context and meaning, suppose that Joe wins the lottery (a one in ten million chance). The event carries information: namely, the fact that he has won. Yet the information it carries seems disproportionately small when compared to how highly improbable and surprising the event of winning the lottery happens to be. In contrast, consider the following alternative scenario. Joe, a bird lover, lives in a wooded area with a large bird population. Joe walks into his home and discovers a dead bird on his kitchen floor. Evidently, the bird came down the cooktop vent. Nonetheless, Joe is greatly surprised to find the dead bird. The probability of birds getting caught in kitchen vents in Joe’s neighborhood is far higher than that of winning the lottery, and Joe knows this fact. Yet Joe is more surprised about finding a dead bird in his kitchen than about winning the lottery. Furthermore, when asked to compare the informativeness of the two events, Joe perceives the amount of information conveyed by the “dead bird” event as greater than that conveyed by the “winning the lottery” event.
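To make the contrast concrete, the surprisal of the two events can be compared directly (the dead-bird probability below is a hypothetical stand-in; all that matters is that it is far larger than the lottery probability):

```python
from math import log2

def surprisal(p):
    """Shannon self-information in bits."""
    return -log2(p)

p_lottery = 1e-7     # "one in ten million", as in the example
p_dead_bird = 1e-3   # hypothetical, but far likelier than a lottery win

# SWIT assigns the lottery win far more information (about 23.3 bits versus
# about 10 bits), the opposite of Joe's subjective ranking described above.
print(surprisal(p_lottery), surprisal(p_dead_bird))
```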

Our last example addresses the subjective nature of probability judgments: in particular, the idea that these judgments are determined by the interaction of an observer with its environment. Indeed, there is empirical evidence showing that humans judge as more likely those events that they have frequently experienced and, hence, have committed to memory more vividly, even though such events may be objectively highly improbable [

One final (and now famous) example proposed by Tversky and Kahneman in [

The above examples, among many others, demonstrate how relevance, meaning, and context, and not uncertainty, dictate our subjective sense of what is informative. They support the proposition that degrees of uncertainty (as measured by probabilities) are not a reliable marker or a sufficiently objective benchmark to serve as a quantitative measure of subjective information. In short, degree of informativeness does not hinge strictly upon what we do not know (agent uncertainty), but on what we think we know (agent certainty). Pundits might say that our examples are oversimplifications of real-world situations and that, by recasting the described scenarios in more precise terms, one might be able to make sense of the link between surprise and uncertainty. But our point is that this kind of subjective patch-up work exposes any possible link between subjective surprise and uncertainty as too fuzzy, unreliable, dubious, and noisy to serve as a basis for a rigorous quantitative theory of general information. Consequently, SWIT offers an inadequate characterization of subjective information. As mentioned, we suspect that this shortcoming is due to the fact that: (1) The carriers of SWIT are devoid of structure (

For example, using these axioms, one can define the conditional probability of an event (

Next, we shall offer an alternative way of characterizing information that overcomes the noted limitations and that conforms to our most fundamental intuitions as to the nature of information, and we shall offer empirical evidence for the latter claim. Note that the motivation behind proposing an alternative way of thinking about information is not to replace or undermine SWIT; SWIT has been, and will continue to be, an important way of construing information within the appropriate domains. Instead, our goal is to develop a general theory of information that: (1) is based on meaning (patterns and concepts) and complexity rather than uncertainty; (2) is structure- and context-sensitive; and (3) therefore better accounts for human intuitions and human subjective assessments as to what makes things informative.

Representational Information Theory or RIT [

More importantly, concepts are the “stuff” of meaning: they facilitate the integration of related experiences in an economical fashion and at multiple levels. This integration, in turn, facilitates the storage and retrieval of concept instances from memory. For example, consider the “chair” concept, which sparsely represents the set of chairs experienced up to a given point in a person’s lifetime. The concept has gradually been learned by extracting from the set of all known chairs the key relationships between them. These key relationships we refer to as the essence of the category or set of objects being learned. Humans extract the essence of sets of objects by detecting certain kinds of patterns inherent to the set. In fact, the goal of theories of concept learning is to determine what sorts of patterns humans are most susceptible to and use as the basis for forming concepts. Among these theories, one stands out for making particularly accurate predictions as to the way that humans extract patterns from sets of objects in the environment [

The aforementioned experiments operationalize concept learning as the ability to classify objects accurately after a set of objects (for an example, see

Given the fundamental role that concepts play in our mental lives, and given the empirical and theoretical evidence in support of the view that organisms represent their environment conceptually, it is surprising that little formal advancement has been made linking conceptual behavior to information. If organisms are primarily conceptual filters going about the world detecting and storing the key relational patterns of sets of objects as concepts, then the raw material of information itself must be in these sets and, more precisely, in their subsets. Thus, the components of these sets of objects (their subsets) play the role of information carriers with respect to the entire set, because they represent particular aspects of the entire set from which a complete concept is formed. In other words, these subsets stand as category cues to the complete concept. The mediators for these sets, on the other hand, are their corresponding concepts. For example, when we wish to communicate a fact about a chair that we are not able to point to directly, we assume that the concept

Six types of structures for sets consisting of objects defined over three dimensions (color, shape, and size). The last column consists of logical descriptions of the sets.

To recap, concepts are generalizations about the world. They encompass everything that is knowable. Categories or sets of objects, on the other hand, are the material from which concepts are formed. However, unlike the sets of entities in Hartley’s theory and in SWIT, the entities that we propose have inherent components that allow for relationships to be perceived by observers. These components are simply dimensional values.

Thus far, we have defined our information carriers as subsets of sets of dimensionally-defined objects. To explain, consider sets of attributes that any of the objects of some set can have. Then the objects can be characterized as tuples of such attributes, and classes of objects are subsets of the Cartesian product of sets of attributes. For example, if the attributes come from sets X, Y, and Z, then the classes of objects are simply subsets of X × Y × Z (where × denotes the Cartesian product). In turn, carriers are subsets of these classes of objects. Such subsets are everywhere in our environment, ranging from the items we purchase at the local store to the molecules or atoms that make up a coffee cup. As long as they can be construed in terms of a finite number of constituent dimensions, they are regarded by RIT as information carriers. Moreover, there are no limits with respect to the granularity or resolution of the compositional makeup of the objects in these subsets—indeed, the choice is entirely up to the observer. These choices may range from subatomic particles, to the planets in our solar system, to the symbols that make up a string.
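This construction is easy to make concrete. In the sketch below, the three binary dimensions are hypothetical attributes, mirroring the three-dimensional examples discussed later in the article:

```python
from itertools import combinations, product

# Hypothetical binary dimensions (e.g., color, shape, size):
color, shape, size = {0, 1}, {0, 1}, {0, 1}

# The full object space is the Cartesian product of the dimension sets.
object_space = set(product(color, shape, size))   # 2 x 2 x 2 = 8 objects

# A class of objects (a category) is any subset of that product ...
category = {(1, 1, 1), (0, 1, 1), (0, 0, 0)}

# ... and its information carriers are its nonempty subsets.
carriers = [set(c) for r in range(1, len(category) + 1)
            for c in combinations(sorted(category), r)]

print(len(object_space), len(carriers))  # 8 objects, 2**3 - 1 = 7 carriers
```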

On the other hand, we have also defined our information mediators as concepts. Concepts live in the mental spaces of organisms ranging from Aplysia to insects and from dolphins to humans; some may argue that they also live in the mental spaces of intelligent robots and expert systems. Regardless, the point is that only by using concepts as mediators can information, as a measurable quantity, reflect human intuitions as to what is informative. However, the question remains: What is information and how is it measured under this view? Recall that in SWIT the information content of the carrier (an event) was measured using the classical probability measure of random variables, whereas in HIT it was measured by the cardinality or size of a set of objects. In RIT, the information conveyed by the carriers (subsets of dimensionally defined sets of objects) is measured by how faithfully they convey the contents of the original set. This relationship between sets of objects and their subsets revolves around the complexity of sets. More specifically, it hinges upon how the degree of difficulty (or perceived complexity) of learning the set of origin (

But how is perceived complexity measured? As mentioned, RIT uses CIT (and its generalization, GIST) to characterize concept learning performance. In CIT, perceived complexity (or the degree of learning difficulty of a concept) is a tradeoff between the size of a set of objects (its “raw complexity”) and how much relational patterning the set of objects is perceived to have (the details of the measure are in the technical appendix). In other words, the perceived complexity of a set of objects is directly proportional to its size and inversely proportional to its degree of perceived patternfulness or structural coherence. This notion of complexity is unlike any other notion surveyed by Feldman and Crutchfield [

That this approach captures the role of meaning in information may be illustrated by using the set from the first section of this article. The set S = {airplane, bus, automobile, dog} has four elements. If we ask how much information a subset of S, say

Likewise, when an object (or objects) is removed from S, making S harder to learn, its absence detracts from the patternfulness of S. This loss of regularity makes the set more complex (and harder to learn); it also tells the receiver that the removed subset is structurally relevant to S, that is, that it contributes to the regularity of S. Imagine removing the object “airplane” from S: doing so would make S more complex and harder to learn as a concept. As mentioned, RIT uses an established measure of this degree of learning difficulty (or, better, of how structurally complex a set of objects is perceived to be; see the appendix for details) that has been empirically verified in [

Generally, the amount of information conveyed by any carrier set is characterized by the percentage increase or decrease in complexity experienced by the base set when the carrier subset is removed: the greater the magnitude of the percentage change in base-set complexity, the greater the amount of information conveyed by the carrier set. The quality of that information is indicated by its sign: negative for a negative rate of change in complexity and positive for a positive rate of change. Because humans prefer a decrease in complexity, whenever two objects yield the same magnitude of relative change in the complexity of their associated category, the one with the negative sign conveys information of relatively higher quality.
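The percentage-change rule can be sketched independently of any particular complexity measure. In the sketch below, the toy complexity table is a placeholder for the CIT/GIST measure detailed in the appendix, not RIT’s actual measure:

```python
def representational_information(category, carrier, complexity):
    """Relative (percentage) change in complexity when the carrier is removed.

    A positive value means removal raised the base set's complexity; a
    negative value means removal lowered it (the direction humans prefer).
    """
    before = complexity(frozenset(category))
    after = complexity(frozenset(category) - frozenset(carrier))
    return (after - before) / before

# Hypothetical complexity values, for illustration only:
toy_complexity = {
    frozenset({"a", "b", "c"}): 1.5,  # the base set
    frozenset({"b", "c"}): 2.0,       # the base set with "a" removed
}.get

print(representational_information({"a", "b", "c"}, {"a"}, toy_complexity))
# removing "a" raises complexity by one third (~0.33)
```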

The predictions made by RIT regarding the informativeness of single object carrier subsets (with respect to the six structures of four objects shown in

RIT predictions corresponding to the six sets of objects shown in

Category | Objects | Information |
---|---|---|
3[4]-1 | {001, 011, 000, 010} | [0.20, 0.20, 0.20, 0.20] |
3[4]-2 | {100, 001, 011, 110} | [0.05, 0.05, 0.05, 0.05] |
3[4]-3 | {011, 111, 000, 001} | [−0.08, −0.31, −0.31, −0.08] |
3[4]-4 | {000, 001, 101, 011} | [−0.31, 0.78, −0.31, −0.31] |
3[4]-5 | {110, 011, 000, 001} | [−0.41, −0.22, −0.22, 0.52] |
3[4]-6 | {001, 010, 111, 100} | [−0.25, −0.25, −0.25, −0.25] |

In addition to providing a solution to the problem of the role of meaning in information, RIT, by its very nature, also provides a way to measure the effects of structural context on the carriers of information. Recall that carriers are carriers by virtue of the relationships between the objects of the base set. But to remove any number of objects from the base set is to change its conceptual fabric or, in other words, its meaning. This functional relationship between context and meaning gives RIT a clear advantage over SWIT and HIT, and solves Luce’s (2003) concern about SWIT not capturing the structural properties of stimuli.

To summarize, in RIT, a new general way of characterizing information is proposed that is based on five principles: (1) That humans and other agents communicate via concepts or, in other words, mental representations of categories of objects (where a category is simply a set of objects that are related in some way); (2) concepts are the mediators of information; (3) concepts are mental representations of the relationships between qualitative objects in the environment that are defined dimensionally; (4) the degree of perceived homogeneity of a category (

Hence, the SWIT-based notion of measuring subjective information as a function of the degree of surprise of an event is abandoned in RIT in favor of the notion that the amount of information conveyed by any subset of a category of objects about the category is the rate of change in the perceived structural complexity of the category (or, equivalently, the rate of change in the category’s degree of concept learning difficulty) when the subset is removed. More generally, the basic idea underlying RIT is that perturbations to the fabric of perceived complexity in the environment account for what we deem informative. This idea frees information from the shackles of probability theory and from the domain-specific, knowledge-based notions of informativeness spawned by naïve informationalism. In addition, the information measure proposed in RIT determines not only the amount of information of any dimensionally-defined object, but also its quality. More specifically, negative information values of the measure indicate a decrease in the complexity of the category when a given object is removed, whereas positive values indicate an increase in complexity with the object’s removal from the set. Also, note that the idea of linking the change in the complexity of a category structure (when some of its elements are removed) to information content addresses the problem of determining the role that context plays in the amount of information humans attribute to each object of a category. Hence, the most informative entities in a category are those that decrease its perceived complexity (for the technical details of the theory, see the technical appendix).

To further contrast these ideas to HIT and SWIT, in the introduction, we made a distinction between two components of individuation-based information measures that reveal their fundamental nature: The first component measures the amount of information present in the carrier and the second operationalizes or functionalizes this quantity in terms of a principle of individuation using a mediator. We suspect that paradigm shifts in information science emerge depending on how the carrier, the mediator, and their relationship are specified. In our approach, we have proposed reassigning the role of the carriers of information to subsets of some base set of objects that are dimensionally defined and designating concepts as the mediators of information. Furthermore, the relationship between carriers and mediators was not based on a principle of individuation, but rather on a principle of differentiation: Namely, that the percentage rate of change in the complexity of any set of objects when a subset (its carrier) is removed gives the amount of information conveyed by the removed set. Again, the quality of information is measured by the sign or the direction of the slope of such rate of change.

From an ecological point of view, this characterization of information makes sense. For too long, cognitive scientists and psychologists have been greatly influenced by the belief that the environment is full of uncertainties and that behavior (and particularly, conceptual behavior) is primarily driven by uncertainty. This widespread belief explains why psychologists have attempted to use SWIT to account for cognitive performance. But uncertainty itself is a complex concept—one that depends largely on the varying frequencies of everyday experiences as managed by our conceptual system and as acquired over the course of a lifetime. This highly subjective aspect of the concept of uncertainty has been a roadblock to any kind of truly generalized information measure. The second major problem undermining SWIT as a theory of information is that it pays no attention to the relationships between the carriers. Indeed, this was the crux of Luce’s criticism. RIT, on the other hand, is based exclusively on such relationships. As such, it may be used to determine the extent to which the structural context of each carrier influences the quantity and quality of information it conveys about its base set.

In closing, we have argued in this article that in order to accurately and effectively measure the amount and quality of information conveyed by stimuli in the environment, one should abandon an uncertainty-oriented conception of information in favor of one based on context, meaning, and complexity. We have discussed the assumptions and limitations of HIT and SWIT that prevent either theory from achieving such a measure. Indeed, these limitations support the proposition that any theory of subjective information grounded exclusively on probability theory as a measure of uncertainty is doomed to fail, for it cannot capture the role that meaning (as representation) and structure play when measuring the quantity and quality of information conveyed by a set of object-stimuli. In contrast, RIT proposes that what renders an entity informative is its ability to greatly increase or decrease the perceived complexity of the surrounding environmental stimuli as determined by the observer’s conceptual system. When compared to the defining characteristics of HIT and SWIT, RIT (and its generalization to continuous domains, GRIT) successfully challenges many of our most deeply held intuitions about how information should be measured.

In this extensive technical appendix we give a simplified introduction to RIT (based largely on material from [

Six examples of categorical stimuli consisting of objects defined over the discrete binary dimensions of color, shape, and size were shown in

Before defining the representational information measure, we shall first define the notion of a representation (or “representative”) of a well-defined category. A representation of a well-defined category S is any subset of S. The power set

Categorical invariance theory is a theory of human concept learning that has been successful at accurately predicting the degree of concept learning difficulty of categories of objects (see [

Logical manifold transformations along the dimensions of shape, color, and size for a set of objects defined over three dimensions. The fourth pane underneath the three top panes contains the pairwise symmetries revealed by the shape transformation.

Formally, these partial invariants can be represented in terms of a vector of discrete partial derivatives of the concept function that defines the Boolean category. This is shown in Equation (5) below where

On the other hand, the discrete partial derivative, defined by the Equation below (where

The value of the derivative is ±1 if the function assignment changes when

Accordingly, the discrete partial derivatives in Equation (5) below give the number of items that have changed in the category with respect to a change in each of its dimensions. The double lines around the discrete partial derivatives give the proportion of objects that have not changed in the category and are defined in Equation (6) below (where

In the above definition (Equation (6)),

Relative degrees of total invariance across category types from different families can then be measured by taking the Euclidean distance of each structural or logical manifold (Equation (7)) from the zero logical manifold whose components are all zeros (
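On this reading, each component of the logical manifold is the proportion of category members whose membership survives a perturbation of one dimension, and the degree of categorical invariance is the Euclidean distance of that vector from the zero manifold. A sketch for binary dimensions (the example category is the one used later in this appendix):

```python
from math import sqrt

def logical_manifold(category, n_dims):
    """For each dimension, the proportion of members that remain members
    when that dimension's value is flipped (a partial invariant)."""
    manifold = []
    for d in range(n_dims):
        unchanged = sum(
            1 for x in category
            if tuple(v ^ (1 if i == d else 0) for i, v in enumerate(x)) in category
        )
        manifold.append(unchanged / len(category))
    return manifold

def invariance(category, n_dims):
    """Euclidean distance of the logical manifold from the zero manifold."""
    return sqrt(sum(c * c for c in logical_manifold(category, n_dims)))

S = {(1, 1, 1), (0, 1, 1), (0, 0, 0)}  # the category {111, 011, 000}
print(logical_manifold(S, 3))          # only the first dimension is partially invariant
print(invariance(S, 3))                # ~0.667
```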

Using our example from pane one in

Note that the concept function

Invariance properties facilitate concept learning and identification. More specifically, the proposed mathematical framework reveals the pairwise symmetries that are inherent to a category structure when transformed by a change to one of its defining dimensions. One such pairwise symmetry is illustrated in the bottom pane of

Using the definition of categorical invariance (Equation (7) above), we define the structural complexity

The simplest function that meets the above criterion is the identity function. Thus, we use it as a baseline standard to define the structural complexity of a category. Moreover, since the degree of categorical invariance

Although Equation (10) above is a good predictor of the perceived structural complexity of a well-defined category (as indicated by how difficult the category is to apprehend), subjective structural complexity judgments may more accurately obey an exponentially decreasing function of the category’s degree of invariance, as shown empirically in [

There are parameterized variants of Equations (10) and (11) above with cognitively-motivated parameters [

With the preliminary apparatus introduced, we are now in a position to introduce a measure of representational information that meets the goals set forth in the introduction to this paper. In general, a set of objects is informative about a category whenever the removal of its elements from the category increases or decreases the structural complexity of the category as a whole. That is, the amount of representational information (RI) conveyed by a representation R of a well-defined category

More specifically, let

Note that Equations (12) and (13) above yield negative and positive percentages, where negative percentages represent a drop in complexity. Thus, RI has two components: a magnitude and a direction (just as the value of the slope of a line indicates both magnitude and direction). For humans, the direction of RI is critical: for example, a relatively large negative value obtained from Equations (12) and (13) above indicates that high RI is conveyed by the subset of

Using Equation (13) above, we can compute the amount of subjective representational information associated with each representation of any category instance defined by any concept function. Take the category defined by the concept function

Next, we compute the values of

Similarly, if we compute the results for the remaining two singleton (single element) representations of the set {111, 011, 000}, we get the values shown in the table of
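The full computation can be reproduced under one plausible reading of the definitions above: invariance as the Euclidean norm of the logical manifold, and structural complexity as category size discounted exponentially by invariance, i.e., c(S) = |S|·e^(−Φ(S)). This particular functional form is our assumption (RIT’s parameterized variants refine it), but with it the three singleton representations of {111, 011, 000} yield the tabled values:

```python
from math import exp, sqrt

def invariance(category, n_dims):
    """Norm of the logical manifold: per dimension, the proportion of
    members whose membership survives flipping that dimension."""
    total = 0.0
    for d in range(n_dims):
        unchanged = sum(
            1 for x in category
            if tuple(v ^ (1 if i == d else 0) for i, v in enumerate(x)) in category
        )
        total += (unchanged / len(category)) ** 2
    return sqrt(total)

def complexity(category, n_dims):
    """Assumed form: raw size discounted exponentially by invariance."""
    return len(category) * exp(-invariance(category, n_dims))

def rep_information(category, carrier, n_dims):
    """Percentage change in structural complexity when the carrier is removed."""
    before = complexity(category, n_dims)
    after = complexity(category - carrier, n_dims)
    return (after - before) / before

S = {(1, 1, 1), (0, 1, 1), (0, 0, 0)}
for obj in sorted(S, reverse=True):
    print(obj, f"{rep_information(S, {obj}, 3):+.2f}")
# (1, 1, 1) +0.30, (0, 1, 1) +0.30, (0, 0, 0) -0.52
```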

Amount of Information conveyed by all the possible single element representations of

Representation | Category | Remaining Set | Information |
---|---|---|---|
{111} | {111, 011, 000} | {011, 000} | 0.30 |
{011} | {111, 011, 000} | {111, 000} | 0.30 |
{000} | {111, 011, 000} | {111, 011} | −0.52 |

Category instance of

Amount of information conveyed by all the possible single element representations of six different category types or concept functions.

Category | Objects | Information |
---|---|---|
3[4]-1 | {000, 001, 100, 101} | [0.20, 0.20, 0.20, 0.20] |
3[4]-2 | {000, 010, 101, 111} | [0.05, 0.05, 0.05, 0.05] |
3[4]-3 | {101, 010, 011, 001} | [−0.31, −0.31, −0.08, −0.08] |
3[4]-4 | {000, 110, 011, 010} | [−0.31, −0.31, −0.31, 0.78] |
3[4]-5 | {011, 000, 101, 100} | [−0.41, −0.22, −0.22, 0.52] |
3[4]-6 | {001, 010, 100, 111} | [−0.25, −0.25, −0.25, −0.25] |

In the above discussion, RIT has been portrayed as a theory that applies only to sets of objects or categories that are defined over binary dimensions. In order to transition to continuous dimensions with values standardized in the

Equivalence of Invariance to partial similarity across two dimensions.

In the discussion below we shall employ the following notation:

(1) Let

(2) Let the object-stimuli in

(3) Let the vector

(4) Let

(5) Let

We begin by describing formally the processes of dimensional binding and partial similarity assessment. To do so, we will introduce a new kind of distance operator. But first, let us define the generalized Euclidean distance operator

As in the Generalized Context Model (GCM) [

Equation (16) computes the psychological distance between two stimuli ignoring their

And more generally, for any stimulus set containing
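The dimension-deleted distance of Equations (16)–(18) can be sketched as a weighted Minkowski distance computed over every dimension except the one currently bound. The function name, the default unit weights, and the city-block exponent below are our assumptions (following the GCM convention for the weights w and exponent r); the per-set standardization step discussed next is omitted here for clarity.

```python
def deleted_dim_distance(x, y, skip, w=None, r=1.0):
    """Psychological distance between stimuli x and y computed over
    every dimension except `skip` (a sketch of Eqs. (16)-(18)).
    w holds the attentional weights, r the Minkowski exponent."""
    if w is None:
        w = [1.0] * len(x)  # equal attention to all dimensions
    total = sum(w[j] * abs(x[j] - y[j]) ** r
                for j in range(len(x)) if j != skip)
    return total ** (1.0 / r)

# City-block (r = 1) distance between 1110 and 1101, ignoring the
# first dimension: the stimuli differ on the last two dimensions,
# so the raw (unstandardized) distance is 2.
d = deleted_dim_distance([1, 1, 1, 0], [1, 1, 0, 1], skip=0)
```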

Similarly, we can define the partial similarity between the two exemplars corresponding to the two object-stimuli—as is done in the GCM [

In Equation (19) above, we have standardized the value

This standardization will prove useful when we introduce the discrimination threshold parameter later in this section. As in [

Although we adopt the above metric, we acknowledge that a different kind of function may play a similar role in the computation of partial similarities. Next, we can construct the matrix of the pairwise partial psychological similarities between all four exemplars corresponding to the four object-stimuli in X, as seen in Equation (22) below:
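The mapping from standardized distances to partial similarities can be illustrated with the exponential decay s = e^(−c·d) familiar from the GCM. With sensitivity c = 1 (an assumption on our part), standardized distances of 1 and 0.5 become the 0.37 and 0.61 entries that appear in the similarity matrices of the worked example below.

```python
import math

def similarity_matrix(D, c=1.0):
    """Apply the exponential-decay similarity s = exp(-c * d)
    entrywise to a distance matrix D (sensitivity c is assumed = 1)."""
    return [[round(math.exp(-c * d), 2) for d in row] for row in D]

# Standardized distance matrix for the first dimension of the
# worked example (stimuli 1110, 1101, 1100, 1111):
D = [[0,   1,   0.5, 0.5],
     [1,   0,   0.5, 0.5],
     [0.5, 0.5, 0,   1  ],
     [0.5, 0.5, 1,   0  ]]
S = similarity_matrix(D)
# First row of S: [1.0, 0.37, 0.61, 0.61]
```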

Again, as a process assumption, we have excluded reflexive or self-similarities from the diagonal of the partial distances matrix shown in Equation (22) above. However, we include symmetric comparisons, since we assume that humans process them when assessing the overall homogeneity of a stimulus; moreover, they contribute to the homogeneity of the stimulus as characterized by the categorical invariance principle and the categorical invariance measure, and we wish to be consistent with both of these constructs.

Adding the values of the similarity matrix that correspond to differences within a chosen discrimination threshold

The Equation above defines the perceived degree of local homogeneity

For example, take a stimulus set consisting of four binary dimensions and four objects as seen in

Matrix representing a stimulus set structure with four object-stimuli O1–O4 of four dimensions D1–D4.

 | D1 | D2 | D3 | D4 |
---|---|---|---|---|
O1 | 1 | 1 | 1 | 0 |
O2 | 1 | 1 | 0 | 1 |
O3 | 1 | 1 | 0 | 0 |
O4 | 1 | 1 | 1 | 1 |

Note that the computed matrix in Equation (24) contains four ones, representing four identical pairs of exemplars that correspond to four pairs of object-stimuli. Applying Equation (23) above, we get Equation (25).
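The computation of Equation (23) for this stimulus set can be sketched as follows. We assume a city-block distance over the unbound dimensions and a zero discrimination threshold (so only identical pairs count toward homogeneity); the helper name is ours, and the loop counts ordered pairs so that symmetric comparisons are included, as discussed above.

```python
def local_homogeneity(stimuli, skip, eps=0.0):
    """Perceived local homogeneity for the bound dimension `skip`
    (a sketch of Eq. (23)): the number of ordered pairs of distinct
    stimuli whose deleted-dimension distance falls within the
    discrimination threshold eps, divided by p."""
    p = len(stimuli)
    close = 0
    for a in range(p):
        for b in range(p):
            if a == b:
                continue  # reflexive self-similarities are excluded
            d = sum(abs(stimuli[a][j] - stimuli[b][j])
                    for j in range(len(stimuli[a])) if j != skip)
            if d <= eps:
                close += 1
    return close / p

# The four object-stimuli O1-O4 from the table above:
X = [[1, 1, 1, 0], [1, 1, 0, 1], [1, 1, 0, 0], [1, 1, 1, 1]]
manifold = [local_homogeneity(X, i) for i in range(4)]
# manifold -> [0.0, 0.0, 1.0, 1.0]
```

Binding dimension 3 (or 4) collapses the stimuli into two identical pairs, giving 4/4 = 1; binding dimension 1 (or 2) leaves all four stimuli distinct, giving 0/4 = 0.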

Lastly, we define the generalized structural manifold by Equation (26). This construct is analogous to the global homogeneity construct defined under the binary theory, except that it applies to both binary and continuous dimensions and is equipped with a distance discrimination threshold. It measures the perceived degree of global homogeneity of any stimulus set.

We can also specify the particular degree of partial homogeneity of the structural manifold as seen in the Equation below.

We hypothesize that for every dimension

The distance and similarity matrices associated with the computation of the local homogeneities of the stimulus set A: There are 4 structural kernels and these are listed in the last column under the perceived local homogeneity measure. Combined they form the manifold of the stimulus set A.

Dimension 1 (perceived local homogeneity: 0/4 = 0)

Standardized distance matrix:

 | 1110 | 1101 | 1100 | 1111 |
---|---|---|---|---|
1110 | 0 | 1 | 0.5 | 0.5 |
1101 | 1 | 0 | 0.5 | 0.5 |
1100 | 0.5 | 0.5 | 0 | 1 |
1111 | 0.5 | 0.5 | 1 | 0 |

Standardized similarity matrix:

 | 1110 | 1101 | 1100 | 1111 |
---|---|---|---|---|
1110 | 1 | 0.37 | 0.61 | 0.61 |
1101 | 0.37 | 1 | 0.61 | 0.61 |
1100 | 0.61 | 0.61 | 1 | 0.37 |
1111 | 0.61 | 0.61 | 0.37 | 1 |

Dimension 2 (perceived local homogeneity: 0/4 = 0)

Standardized distance matrix:

 | 1110 | 1101 | 1100 | 1111 |
---|---|---|---|---|
1110 | 0 | 1 | 0.5 | 0.5 |
1101 | 1 | 0 | 0.5 | 0.5 |
1100 | 0.5 | 0.5 | 0 | 1 |
1111 | 0.5 | 0.5 | 1 | 0 |

Standardized similarity matrix:

 | 1110 | 1101 | 1100 | 1111 |
---|---|---|---|---|
1110 | 1 | 0.37 | 0.61 | 0.61 |
1101 | 0.37 | 1 | 0.61 | 0.61 |
1100 | 0.61 | 0.61 | 1 | 0.37 |
1111 | 0.61 | 0.61 | 0.37 | 1 |

Dimension 3 (perceived local homogeneity: 4/4 = 1)

Standardized distance matrix:

 | 1110 | 1101 | 1100 | 1111 |
---|---|---|---|---|
1110 | 0 | 1 | 0 | 1 |
1101 | 1 | 0 | 1 | 0 |
1100 | 0 | 1 | 0 | 1 |
1111 | 1 | 0 | 1 | 0 |

Standardized similarity matrix:

 | 1110 | 1101 | 1100 | 1111 |
---|---|---|---|---|
1110 | 1 | 0.37 | 1 | 0.37 |
1101 | 0.37 | 1 | 0.37 | 1 |
1100 | 1 | 0.37 | 1 | 0.37 |
1111 | 0.37 | 1 | 0.37 | 1 |

Dimension 4 (perceived local homogeneity: 4/4 = 1)

Standardized distance matrix:

 | 1110 | 1101 | 1100 | 1111 |
---|---|---|---|---|
1110 | 0 | 1 | 1 | 0 |
1101 | 1 | 0 | 0 | 1 |
1100 | 1 | 0 | 0 | 1 |
1111 | 0 | 1 | 1 | 0 |

Standardized similarity matrix:

 | 1110 | 1101 | 1100 | 1111 |
---|---|---|---|---|
1110 | 1 | 0.37 | 0.37 | 1 |
1101 | 0.37 | 1 | 1 | 0.37 |
1100 | 0.37 | 1 | 1 | 0.37 |
1111 | 1 | 0.37 | 0.37 | 1 |

Combined as a vector, these four values represent all the structural information of a concept, or in other words, the ideotype of the stimulus set. The overall degree of perceived global homogeneity or invariance of a stimulus set X defined over

Note the arc above the capital phi variable: it denotes the invariance measure extended to handle objects defined over both dichotomous and continuous dimensions. Equation (28) is all that is needed to generalize RIT to continuous domains (and, hence, to convert RIT into GRIT). Thus, the final general measure is given by the Equation below when we let
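As a sketch of how the generalized invariance feeds back into the information measure: we assume here (our reading, not the paper's exact Equation (28)) that the vector of local homogeneities is collapsed to its Euclidean norm and inserted into the exponential complexity law, which makes GRIT agree with RIT on binary stimulus sets such as the worked example.

```python
import math

def generalized_invariance(manifold):
    """Collapse the vector of local homogeneities into a single
    magnitude; the Euclidean norm here is our assumption, chosen
    to mirror the binary invariance measure."""
    return math.sqrt(sum(h * h for h in manifold))

def structural_complexity(p, manifold):
    """psi = p * exp(-Phi): the exponential complexity law applied
    to the generalized invariance of a p-element stimulus set."""
    return p * math.exp(-generalized_invariance(manifold))

# Manifold (0, 0, 1, 1) of the four-stimulus example gives
# Phi = sqrt(2), so psi = 4 * exp(-sqrt(2)) ~ 0.97.
psi = structural_complexity(4, [0.0, 0.0, 1.0, 1.0])
```

The representational information of a subset then follows exactly as in the binary case: recompute the manifold and complexity with the subset removed, and take the relative change in complexity.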

The author would like to thank Mikayla Barcus, Charles Doan, Andrew Halsey, and Derek Zeigler for their helpful comments. Correspondence and requests for materials should be addressed to Ronaldo Vigo.

For the readers’ convenience, the parameterized variants of Equations (10) and (11) (see main text), as introduced by Vigo (2009, 2011), are as follows:

We could simply define the representational information of a well-defined category as the derivative of its structural complexity. We do not do so because our characterization of the degree of invariance of a concept function is already based on a discrete counterpart to the notion of a derivative.