1. Preview
The article presents a theory of understanding that boils down to the following. The brain is a regulatory system orchestrating sensory-motor responses to varying external conditions. Thermodynamics steers self-organization in the brain towards producing quasi-stable neuronal groupings (called packets) that can be re-combined rather than composed anew each time the conditions change. Packet combinations serve as models of the environment that can be exercised under the guidance of external feedback, or without such guidance (simulation, thinking). Self-directed construction and manipulation of mental models underlie understanding. Absorbing information via feedback constitutes negentropy extraction. Mental modeling in the absence of such feedback equates to producing information and generating negentropy. The modeling capability is a recent evolutionary discovery that yielded high regulatory efficiency, and at the same time, created a uniquely human need for understanding the world, which is related to, but distinct from, the need to organize interaction with the world in a manner conducive to survival. These suggestions will be unpacked in several iterations, starting with the remainder of this preview.
This paper argues that a definitive feature of information dynamics in the human brain is a particular form of information production [
1] that underlies the understanding capacity and is unique to the human species. Information is produced when regularities intrinsic to the processing system are superposed on regularities in the incoming signal (e.g., correlations in the stimuli stream). Perception is a form of information production shared by animals and humans. In perception, e.g., vision, correlated excitations of receptor cells in the retina are organized into images (mental objects) that are projected outside and experienced as physical objects located in the space beyond the receptor surface. Understanding involves construction of mental models that go beyond perception, in two ways. First, models represent not only objects but their changes over time (behavior) and the ways the changes are mutually coordinated (inter-object relations). Second, perception operates on sensory signals, while modeling (“thinking”) is decoupled from the sensory inflows. Constructing models and operating on them in the absence of motor-sensory feedback constitutes a uniquely human form of information production (see
Section 5.2.4 for clarifications).
Projecting the products of mental modeling into the outside world underlies the experience of having apprehended salient relations in the environment, which enables explanation, anticipation of changes, and planning of actions likely to influence such changes in the desired direction. Understanding complements learning; learning re-uses responses found to be successful under circumstances similar to the present conditions. In contrast, mental modeling takes advantage of past experiences without repeating them, by allowing re-combination in the construction of novel responses. In short, understanding enables construction of robust (near-optimal) responses to disruptive changes, and anticipatory preparation for possible changes before their onset.
How exactly are the models constructed and exercised? These questions have received little attention, and the answers remain unknown. This paper makes the following suggestions: mental models are constituted by neuronal structures acting as “synergistic wholes”, where changes in one component constrain changes in the other components. “Synergistic wholes” radically reduce the number of degrees of freedom in their components (spontaneous and attentive activities engage distinct neuronal systems, discussed in some detail in the concluding sections). As a result, exercising a model, e.g., varying parameters in a particular component, produces variations throughout the structure within narrow ranges allowed by inter-component coordination. Formation of “synergistic wholes” is spontaneous, while parameter variations are amenable to attentive examination (
Section 5.2.2 expands on that notion). Suppression of degrees of freedom and complexity collapse in mental models enable attentive reasoning by maintaining attentive processes within narrow bounds of parameter variations afforded by the model. Since attentive processes demand energy, the mental modeling capacity yields radical reductions in the demand. As a result, humans can handle complex coordination problems (e.g., managing battles, playing chess, designing complex artifacts, etc.) with small energy budgets. Formation of modeling hierarchies underlies the experience of growing understanding and gradual development of a coherent worldview, revealing unifying principles behind expanding multitudes of diverse phenomena.
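To make the notion of degree-of-freedom suppression concrete, here is a minimal sketch (entirely illustrative; the components, couplings, and tolerances are invented) of a “synergistic whole”: once pairwise couplings are imposed, varying one component confines the others to narrow bands, leaving effectively one free parameter instead of three.

```python
# Toy illustration (not the paper's formalism): a "synergistic whole" as a
# set of coupled components whose couplings collapse degrees of freedom.
import numpy as np

# Uncoupled: each component varies freely over [0, 10] -> 3 degrees of freedom.
# Coupled: pairwise constraints leave one effective degree of freedom.

def exercise_model(x, tol=0.5):
    """Vary one component (x); couplings confine the others to narrow bands."""
    y_range = (2 * x - tol, 2 * x + tol)                    # coupling: y ~ 2x
    z_range = (x + y_range[0] - tol, x + y_range[1] + tol)  # coupling: z ~ x + y
    return y_range, z_range

for x in np.linspace(0.0, 10.0, 5):
    y_rng, z_rng = exercise_model(x)
    print(f"x={x:4.1f}  y in [{y_rng[0]:5.1f}, {y_rng[1]:5.1f}]  "
          f"z in [{z_rng[0]:5.1f}, {z_rng[1]:5.1f}]")
```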
In the remainder of the paper, these suggestions will be elaborated following three lines of enquiry:
- (1)
Self-organization in the neuronal substrate;
- (2)
Information production; and
- (3)
Optimization of the organism–environment interaction.
Energy is a universal interaction currency. Accordingly, all three lines converge in the notion of thermodynamic efficiency. Mental modeling is a form of information production, which yields the dual benefits of minimizing energy expenditure inside the system, while maximizing energy inflows from the outside (hence, optimization). Information efficiency is a corollary of thermodynamic efficiency—minimal amount of sampling obtains maximally valuable information. Importantly, associating mental modeling with self-organization entails differentiation between extrinsic and intrinsic sources of value. Extrinsic values are determined by energy extraction opportunities that the information signifies, while intrinsic values (worth, significance) are determined by self-organization opportunities that the information enables. Intrinsic values motivate information seeking and production, experienced as unification of models and interlocking of modeling hierarchies. In this way, decoupling from sensory inputs gives rise to uniquely human pursuits, separating intellectual significance of information from the material benefits it can bring about.
The proposed theory relies on two notions: neuronal packets and virtual associative networks. Neuronal packets are Hebbian assemblies stabilized by boundary energy barriers. Formation of bounded, quasi-stable neuronal packets underlies perception, that is, gives rise to bounded, quasi-stable feature aggregations (images, mental objects) projected into the outside space. Packets form in associative networks capturing correlations in the sensory stream; strong correlations trigger a phase transition in tightly associated neuronal subnets, causing their segregation and folding into packets. Packets establish an energy landscape over the associative network, regulating attentional access to packet internals (high boundary barriers deny access or cause attention capture inside packets). The virtual network comprises a hierarchy of functional networks establishing coordinations (relations) between individual packets, between packet groupings, between groups of groups, etc. In this way, information assimilated in the form of associative links of varying strength (synaptic weights) gives rise to a self-produced functional hierarchy, transforming sensory flux into unifying world models of growing generality. As succinctly stated by Eddington, “the mind filters out matter from meaningless jumble of qualities, as the prism filters out the colors of the rainbow from the chaotic pulsation of white light” [
2].
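As an illustration of the packet-formation mechanism just described, the following sketch (a toy construction, not the paper's model; the signals, threshold, and group structure are my assumptions) builds an associative network from correlated signals and lets strongly associated subnets segregate into bounded groups:

```python
# A minimal sketch of packet formation: strongly correlated subnets of an
# associative network segregate into bounded groups ("packets").
import numpy as np

rng = np.random.default_rng(0)
# Two latent sources drive two groups of "neurons"; weak cross-talk between them.
src = rng.normal(size=(2, 500))
signals = np.vstack([src[0] + 0.3 * rng.normal(size=(4, 500)),
                     src[1] + 0.3 * rng.normal(size=(4, 500))])

assoc = np.corrcoef(signals)            # associative link strengths
adj = np.abs(assoc) > 0.6               # strong links survive the "phase transition"

def components(adj):
    """Connected components of the thresholded association graph = packets."""
    n, seen, packs = len(adj), set(), []
    for start in range(n):
        if start in seen:
            continue
        stack, pack = [start], []
        while stack:
            i = stack.pop()
            if i in seen:
                continue
            seen.add(i); pack.append(i)
            stack.extend(j for j in range(n) if adj[i, j] and j not in seen)
        packs.append(sorted(pack))
    return packs

print(components(adj))   # -> [[0, 1, 2, 3], [4, 5, 6, 7]]
```

Here the hard threshold merely stands in for the phase transition; in the theory proper, segregation is driven by boundary energy barriers rather than a fixed cutoff.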
These ideas have motivated the following conjecture: Understanding boils down to apprehending coordination in the behavior of mental objects. Accordingly, the onset of the understanding and language capacities approximately 80,000 years ago (Cognitive Revolution [
3]) could have resulted from an evolutionary advancement in which the apparatus of sensory-motor coordination, richly developed in protohumans and optimized for manipulating external objects under the control of motor-sensory feedback, was co-opted for the manipulation of mental objects in the absence of such feedback. Similarly, the signaling system was co-opted to support the handling of mental objects. These conjectures will be addressed in the outline of the theory.
Operationalizing these ideas motivated a computational architecture (dubbed “gnostron”) simulating some of the processes presumed to underlie human understanding. The gnostron approach is orthogonal to that implemented in the perceptron, and picks up where the perceptron leaves off: the perceptron establishes a mosaic of synaptic weights, and the gnostron operates on that mosaic while leaving the weights intact. The concluding discussion will compare the two approaches.
To complete the preview, some limitations of the proposed theory need to be pointed out. In particular, the theory uses the notion of “mental models” in a restricted way and in a manner that does not always agree with the meaning attributed to this notion in the literature. According to Wikipedia, “a mental model is a kind of internal symbol or representation of external reality, hypothesized to play a major role in cognition, reasoning, and decision making … In psychology, the term mental models is sometimes used to refer to mental representations or mental simulation generally… scientific debate continues about whether human reasoning is based on mental models, versus formal rules of inference, or domain-specific rules of inference or probabilities”. As stated above, the present theory uses the term to denote structures comprised of neuronal packets and suggests that human understanding is rooted in operations based on such structures. It is plausible that those same operations are involved in problem solving, reasoning, inference, decision making, etc., but the paper does not enter the debate and recognizes that this theory in its present form might fail to account adequately for all aspects and nuances of these and other cognitive processes. Identifying such shortcomings in the present version of the theory is a necessary step for future work, which will lead to revisions and a new synthesis.
Besides mental modeling, the theory addresses other cognitive processes, such as attention and motivation. The article defines these processes within the framework of the theory and as the theory is outlined, postponing clarifications until the discussion section. The important task of comparing definitions and opinions in this paper with the plethora of definitions and opinions in the literature is assigned to future research. Some key notions are repeated throughout the text, on the assumption that readers will tolerate some redundancy for the sake of clarity.
The article is organized into five parts, including the introduction and concluding discussion. The introductory part seeks to place the theory of understanding within a broad framework, combining philosophy (the mind-matter dichotomy), information theory, thermodynamics (self-organization in open, far-from-equilibrium systems), and optimal control. The introduction concludes by formulating “principles of understanding”. Part 3 outlines the proposed theory and summarizes theoretical results and experimental findings to date that appear to corroborate it. Part 4 applies the theory to address topics overlapping with that of understanding, such as consciousness and mental modeling in norm and pathology, among others. The theory affords an exceedingly simple treatment of these complex topics, one that coheres with insights expressed in some other theories. The lines of treatment are only tentatively stated, in the hope of motivating other researchers to pursue them further. The concluding part 5 summarizes the proposal and makes suggestions for further research.
5. Summary and Discussion
This part is broken into four sections.
Section 5.1 discusses ideas central to the theory.
Section 5.2 clarifies and extends some of the notions in the article, focusing on their interpretations within the framework of the theory (for convenience, this section re-states some of the points scattered throughout the text).
Section 5.3 makes suggestions for further research. Section 5.4 concludes the paper with a thumbnail digest, emphasizing distinctions between this proposal and other ideas in the literature.
5.1. Discussion: How Neurons Make Us Smart
Challenges facing cognitive systems in a fluid environment can be defined as follows. The present condition in the environment is A; changing it to B promises reward W; which coordinations can I deploy to achieve A → B with an acceptable level of effort and within the available time (i.e., before the opportunity is gone)? In the parlance of neuronal processes, the problem maps onto the following: “Neurons n1, …, nk have fired, indicating presence of stimuli s1, …, sk, offering potential energy reward W; therefore, which neurons should be fired next in order to get the reward with sufficient certainty and at the lowest energy cost?”. Operation of the cognitive system is reduced to dynamic optimization of neuronal resources. The remainder of this section applies the mapping to elaborate some of the key ideas in this paper.
Assume that optimal allocations have been computed for a large set of stimuli. For argument’s sake, assume that all the possible allocations have been calculated meticulously, step-by-step, and the best reward-maximizing, expense-minimizing allocation has been selected, allocating to each stimulus a group of neurons. Record these groups and do the following: heat the neuronal pool and witness the formation of neuronal packets, similar to Bénard cells. When comparing the packets to the computed groups, you will find that they are nearly identical. The point is that thermodynamically-driven self-organization produces neuronal packets yielding near-optimal allocations. At the psychological level, the process manifests in the transformation of stimuli streams into sets of distinct objects preserving their self-identity within some ranges of condition variation. The central claim is that near-optimal distribution of neurons between packets is neither computed by hidden agents, nor results from message passing obtaining a negotiated consensus in the neuronal pool. Self-organization is the key property of the neuronal substrate, making it a suitable medium for behavior regulation. Humans are smart, not because their brains run efficient step-by-step procedures, but precisely because neurons engage in collective behavior alleviating the need for such procedures. The emergence of Sapience was a result of a confluence of developments in the nervous system, enabling advanced forms of collective behavior yielding understanding. The capacity is inherent in the species and is mastered by individuals in the course of maturation.
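The claim can be rendered as a toy computation (the rewards, costs, and saturating reward function below are my assumptions): an exhaustive, step-by-step search over allocations is compared with a “heated” process of random local moves under a cooling schedule, in the spirit of annealing.

```python
# A toy rendering (my construction, not the paper's algorithm) of the claim
# that exhaustive optimization and "heated" self-organization arrive at
# nearly the same neuron-to-stimulus allocation.
import itertools, math, random

N_NEURONS, STIMULI = 8, {"s1": 5.0, "s2": 3.0}   # potential energy rewards W
COST = 0.4                                        # energy cost of firing a neuron
OPTIONS = [None] + list(STIMULI)                  # a neuron may stay unallocated

def net_energy(assign):
    """Reward saturates with group size; every allocated neuron costs energy."""
    gain = sum(w * (1 - math.exp(-assign.count(s))) for s, w in STIMULI.items())
    return gain - COST * sum(a is not None for a in assign)

# (1) Meticulous step-by-step computation: enumerate all 3^8 allocations.
best = max(itertools.product(OPTIONS, repeat=N_NEURONS), key=net_energy)

# (2) "Heat the pool": annealed local moves, no global bookkeeping.
random.seed(1)
state = [random.choice(OPTIONS) for _ in range(N_NEURONS)]
for step in range(4000):
    temp = max(0.01, 2.0 * (1 - step / 4000))     # cooling schedule
    trial = list(state)
    trial[random.randrange(N_NEURONS)] = random.choice(OPTIONS)
    delta = net_energy(trial) - net_energy(state)
    if random.random() < math.exp(min(0.0, delta / temp)):   # Metropolis rule
        state = trial

print("computed optimum :", net_energy(best))
print("self-organized   :", net_energy(state))   # typically nearly identical
```

On this toy problem the annealed state typically matches the enumerated optimum, illustrating (though certainly not proving) the point that collective, thermodynamically driven dynamics can stand in for meticulous computation.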
Thermodynamics works for leeches the same way it works for humans. Leeches possess a model of the world comprised of two object types: crawlable object and swimmable object. The crawlable object behaves in many different ways, all accounted for by different activity patterns in the “crawl” packet (the same goes for the “swim” packet). Swimming and crawling are different activities but have something in common (an overlap). Thermodynamics enforces economy in the form of “shifting coalitions,” by combining the “overlap” packet alternatively with the “swim” or “crawl” packets [
38]. Leeches deploy their models in no other way but by crawling and swimming, in a move-by-move fashion. Crawling from point 1 to point 2 and then swimming from point 2 to point 3 is accompanied by packet vectors oscillating around the 1-2 and 2-3 axes (e.g., responding to changes in the crawl surface or conditions in the swim volume). Transforming a leech into a “thinking leech” would require an ability to form models of the 1-2-3 movement that can be exercised without performing the movement and that, crucially, orient packet vectors along the 1-2 and 2-3 axes without reproducing the move-by-move oscillations. A “thinking leech” will turn into an “understanding leech” when a model can be formed such that thinking “I would rather crawl to point 4 and swim from there” will automatically orient the swim vector along the 4-3 axis. A crawling leech is unaware of the forthcoming swimming while the understanding leech is, and is also aware that changes in crawling will have consequences for the forthcoming swimming.
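The geometric content of the example can be captured in a few lines (the coordinates and labels are mine): editing the model re-orients the packet vectors directly, with no replay of the movement.

```python
# A geometric toy for the "thinking leech": re-planning re-orients the swim
# packet vector along the new axis without replaying the movement.
import numpy as np

points = {1: np.array([0.0, 0.0]), 2: np.array([4.0, 0.0]),
          3: np.array([4.0, 3.0]), 4: np.array([2.0, 1.0])}

def orient(a, b):
    """Packet vector for moving a -> b: unit vector along the a-b axis."""
    v = points[b] - points[a]
    return v / np.linalg.norm(v)

plan = [("crawl", 1, 2), ("swim", 2, 3)]          # original model of the route
print([(m, orient(a, b).round(2).tolist()) for m, a, b in plan])

# "I would rather crawl to point 4 and swim from there": editing the model
# automatically re-orients the swim vector along the 4-3 axis.
plan = [("crawl", 1, 4), ("swim", 4, 3)]
print([(m, orient(a, b).round(2).tolist()) for m, a, b in plan])
```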
In a similar fashion, synergistic models of chess positions allow one to conjure up strategic ideas without thinking through all the moves, with the ideas (if coherent) radically reducing the number of moves that remain to be thought through. The result is that chess machines had to reach search speeds on the order of 10^8 moves per second in order to compete with humans capable of considering at most a few moves per minute (Deep Thought (1989) searched about 700 thousand positions per second; Deep Blue (1996) searched about 100 million [
103]). The comparison suggests a new interpretation of the “Achilles can’t keep up with a tortoise” paradox—the computing Achilles takes detours running multiple times to the moon and back for every step taken by the understanding tortoise (2000 steps per mile, 240K miles to the moon). No wonder Achilles is energy hungry.
These suggestions contradict mainstream cognitive science, where intelligence is equated with the possession of algorithms and the role of understanding is downplayed. In particular, a treatise on human problem solving that is foundational in the discipline [
103] allowed the issue of understanding to enter the argument once (on page 822), and only to question the role of understanding in performance:
“Observe that a high level of mechanization can be achieved in executing the algorithm, without any evidence of understanding, and a high level of understanding can be achieved at a stage where the algorithm still has to be followed from an externally stored recipe”.
The argument is not without problems [
32], but it proved compelling enough to cause intelligence to be associated with the acquisition of algorithms (learning), while marginalizing the role of understanding. As suggested earlier, understanding exploits the machinery of sensory-motor coordination but decouples the cognitive process from sensory inputs, thus liberating it from the dictates of prior learning. As a result, the responsibility for performance efficiency is shifted from accumulating and searching through precedents to constructing coherent explanations, as in abductive inference:
“Abduction … is an inferential step … including preference for any one hypothesis over others which would equally explain the facts, so long as this preference is not based upon any previous knowledge bearing upon the truth of the hypothesis, nor on any testing of any of the hypotheses, after having admitted them on probation”.
Explanations enable reliable predictions. For example, one can observe movement of object A, accumulate statistics, and make predictions about future movements. Alternatively, apprehending that (A is inside B) will explain peculiarities in the movement of A and predict that whatever the future trajectories, they will not cross the perimeter of B. More generally, understanding involves recursive application of set operations (e.g., alternating between a packet vector and its components) having no algorithmic expression. For example, thinking “my class of A, B, …, Z” can conjure up either a set of images, or a featureless unit, as in “I am transferring my class to another room.” Without having condensed the multitude into a unit, thinking of the transfer would require either its execution, step-by-step and person-by-person (A, and B, …, and Z), or the envisioning of such an execution. If operations on sets were restricted to operations on members, ideas concerning sets as wholes could be neither formed nor comprehended. Hence, no human thinking.
Attributing intelligence to the properties of biological neurons does not rule out the possibility of designing intelligent artifacts. On the contrary, apprehending the underlying principles can inform the design of computational methods that approximate biological mechanisms and construction of devices that emulate them. In a similar fashion, apprehending the principles of aerodynamics allowed design of flying machines that do not flap wings or land on trees.
Figure 11 summarizes the proposed theory, conceptualizing cognitive processes as allocations of neuronal resources.
Conceptualizing cognition as dynamic optimization of neuronal resources translates naturally into a computational framework (dubbed “gnostron”), where neuronal resources are allocated probabilistically to streams of reward-carrying stimuli [
30,
106,
107]. The key elements of the present theory (formation of associative networks, formation of packets, packet manipulation and co-ordination, etc.) map directly onto the optimization procedure, with a straightforward interpretation—they represent collective behavior in the neuronal system and serve as heuristics reducing complexity of the procedures with minimal sacrifices of accuracy.
The gnostron framework is orthogonal to that of the perceptron (neural network): dynamically selected neurons versus a fixed set of neurons, feedback-driven versus feedback-decoupled operation, among other differences. Increasing internal order in the gnostron system equates to negentropy generation. Boundary energy barriers in packets implement Markov blankets [
10]. Optimization of neuronal resources yields surprise minimization, reconciling the principle of variational free energy minimization [
11] with the thermodynamically-motivated requirement to minimize energy expenditure and divert free energy to the work of mental modeling [
31,
33].
The theory offers some predictions concerning the properties of biological neurons and the characteristics of neuronal space. In particular, the theory anticipates the existence of hyper-complex neurons (probably in the higher-order thalamus) that respond to different activity patterns in neuronal packets and, importantly, to different rates of activity variation [
108]. Such hyper-complex and complex neurons can form tensor structures yielding activity patterns invariant under coordinate transformations. The thalamus and cerebellum [
109] can operate in a coordinated fashion in the vector space defined by packet vectors.
It can be expected that the next generation of AI systems will differ from the present one, just as the Wright brothers’ first airplane differs from a Boeing 757. The advancement is predicated on elucidating biophysical mechanisms responsible for turning neuronal collectives into synergistic wholes amenable to mental manipulation. The future systems will not be programmed but rather endowed with “genetic” propensities compelling them to develop an understanding of their environment sufficient for fulfilling operator-defined goals. Insights concerning the design of such systems might come from the analysis of biophysical processes in individual cells [
110,
111,
112], relations between information and energy [
113], information dynamics in physiological structures [
114,
115], or other areas, contributing to the development of an expanded theoretical framework unifying information-theoretic [
10,
11], physics-motivated [
116,
117], and biophysical accounts of cognition [
118]. Progress towards such unification will enable the transition from machine learning to machine understanding.
5.2. Clarifications and Definitions
5.2.1. The Brain Operates as a Resource Allocation System with Self-Adaptive Capabilities
This theory conceptualizes the brain operation as a probabilistic resource allocation system with self-adaptive capabilities; neurons are resources dynamically allocated to streams of stimuli [
119]. Allocations (accessing, mobilizing, and firing neurons) consume energy, successful allocations are rewarded by energy deposits emitted by stimuli, and self-organization in the system seeks to maintain net energy inflows above the survival threshold [
31]. Central to this concept is the notion of self-adaptation, as envisioned by Roger Sperry: “
a brain process must be able to detect and to react to the pattern properties of its own excitation” [
120].
Self-adaptation in the brain entails optimization of neuronal resources under dual criteria: maximizing energy inflows from the outside, while minimizing energy expenditures on the inside. Attention, motivation, and other functions are defined within this optimization framework. Self-reflective thinking, self-awareness, and self-consciousness are attributes of self-adaptation.
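A minimal sketch of this resource-allocation reading (the rewards, costs, and allocation rule are invented for illustration): neurons are “allocated” probabilistically to stimuli, every allocation costs energy, successful allocations return energy deposits, and the system tracks its reserve against a survival threshold.

```python
# A toy of the brain as a probabilistic resource-allocation system
# maintaining net energy inflow above a survival threshold.
import random

random.seed(0)
expected_reward = {"food": 3.0, "shade": 1.0, "noise": 0.1}  # learned estimates
FIRE_COST, SURVIVAL_THRESHOLD = 0.5, 0.0
energy = 5.0

for t in range(20):
    stimulus = random.choice(list(expected_reward))
    total = sum(expected_reward.values())
    # Allocate (fire) with probability proportional to expected payoff.
    if random.random() < expected_reward[stimulus] / total:
        energy -= FIRE_COST                        # cost of mobilizing neurons
        energy += expected_reward[stimulus]        # energy deposit from stimulus
    if energy < SURVIVAL_THRESHOLD:
        print(f"t={t}: below survival threshold"); break
else:
    print(f"final energy reserve: {energy:.1f}")
```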
5.2.2. Attention
On this theory’s account, attention is a brain process that not only reacts to “the pattern properties” in the brain but actively orchestrates them (mobilizes and selectively excites or inhibits neurons). The theory differentiates between attention mechanisms operating on external stimuli and those operating on internal patterns. This view appears to be supported by a number of findings and recent theoretical suggestions, as follows. It is now thought that attention is not a unitary process but involves two distinct neuronal systems. The ventral network implements exogenous (stimulus-driven) attention, while the dorsal parieto-frontal network [
121] and anterior insula network implement endogenous (self-directed, volitional, goal-directed) control. The systems converge in the lateral prefrontal area [
122]. Coherent behavior is a product of coherent neuronal firings in diverse areas orchestrated by the attention mechanism implemented in corticothalamic loops [
96]. Associating the function of attention with orchestration of firing and inhibition in neuronal networks (see
Figure 7) is consistent with the above findings and with the proposal in a previous study [
123]:
“An attentional mechanism helps sets of the relevant neurons to fire in a coherent semi-oscillatory way … so that a temporary global unity is imposed on neurons in many different parts of the brain”.
Attention alternating between packets in the formation of mental models achieves such global unity.
5.2.3. Motivation
The concept of motivation subsumes the totality of goal-related processes. Neuronal substrates of motivation include the extended amygdala, the ventral striatopallidum, and other subsystems in the basal forebrain [
124]. Recently, neurons were identified in the striatum that are sensitive to the motivational context in which the activity is being carried out [
125]. Seeking significant information and pursuing understanding constitute goals that conceivably can engage the same neuronal substrate as other goal-related processes.
5.2.4. Understanding in Humans and Animals
Large amounts of data have been accumulated in animal studies demonstrating remarkable cognitive capabilities in other species (e.g., numerical capabilities in honeybees [
126]) and suggesting that the functions of human intelligence could have evolved from neural substrates common to many species [
127]. Recognizing that a significant overlap exists in the principles governing neuronal mechanisms across a spectrum of species [
128], this article is focused on cognitive functions that are subsumed in human understanding, and arguably, lie outside the overlap area (e.g., abduction, explanation). The depth of the available functional hierarchy could be one of the major differences between humans and animals (as suggested by a reviewer). At present, the demarcation line between human intelligence and that of other species has not been clearly defined, and it is likely to be revised as new data become available.
5.2.5. Neuronal Substrate of Relations
This theory assumes that the emergence of mental modeling in humans involved co-opting mechanisms of sensory-motor coordination optimized for the manipulation of external objects and re-purposing them for the manipulation of internal, or mental, objects. On that assumption, establishing relations between objects involves complex and hypercomplex regulatory neurons responding to kinematic variables; that is, not only to packet composition but also to different forms of coordination in the movement of packet vectors (hypercomplex neurons respond to coordination between packets comprising complex neurons). Structures comprising complex and hypercomplex regulatory neurons can implement relations of any complexity [
108]. A number of findings appear to suggest the feasibility of the notion, as follows. A recent study identified and modeled neurons sensitive to the instantaneous position, velocity, and acceleration of the stimuli, as well as to short strips of stimulus trajectory [
129]. Earlier studies identified directionally-selective neurons responding to movement of the stimulus in the preferred direction [
130]. Neurons in the motor cortex have been identified as responsible for the coordinated action of large muscle groups (“muscle synergies”), enabling organized movements of limbs to particular points in space [
131,
132]. Such complex neurons can be grouped, producing a “vocabulary of neural primitives.” Simulations have demonstrated the feasibility of orchestrating coordinated motor activities by deploying various combinations of such primitives [
133]. In general, limits of sensitivity and functional diversity of complex neurons are yet to be determined. For example, a neuron was discovered in the human hippocampus selectively responding to different images of the same person, even if wearing a disguise. Moreover, the same neuron responded to the name of that person expressed in different modalities (written and spoken) [
134].
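As a toy rendering of such kinematic sensitivity (a speculative sketch; the tuning function is my construction, not a model from [129,130]): a “complex neuron” that responds to the direction and rate of change of a packet vector rather than to its position.

```python
# A speculative sketch: a "complex neuron" tuned to a kinematic pattern of a
# packet vector (preferred direction and preferred band of rates of change).
import numpy as np

def complex_neuron(trajectory, preferred_dir, speed_band=(0.5, 2.0)):
    """Respond when the packet vector moves along the preferred direction
    within the preferred range of rates of change."""
    v = np.diff(trajectory, axis=0)                     # velocity of the vector
    speed = np.linalg.norm(v, axis=1)
    ok = (speed > 1e-9) & (speed >= speed_band[0]) & (speed <= speed_band[1])
    align = np.zeros_like(speed)
    align[ok] = (v[ok] @ preferred_dir) / speed[ok]     # cosine with preferred axis
    return float(np.clip(align, 0, None).mean())        # firing rate in [0, 1]

t = np.linspace(0, 1, 20)[:, None]
rightward = np.hstack([20 * t, 0 * t])                  # moves right at speed ~1
upward    = np.hstack([0 * t, 20 * t])                  # moves up, same speed
print(complex_neuron(rightward, np.array([1.0, 0.0]))) # ~1.0 (preferred)
print(complex_neuron(upward,    np.array([1.0, 0.0]))) # ~0.0 (orthogonal)
```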
5.2.6. Relations as Objects
A particular form of abstract thought (identified by Charles Peirce and called hypostatic abstraction [135]) transforms relations into objects. For example, the relation “A loves B” implies a certain form of coordination in the behavior of A and B. Hypostatic abstraction postulates a universal source of such coordination, treating it as an object separate from A and B (say, a goddess of love) and capable of granting or withholding love (moreover, activities attributed to the source can be further abstracted and treated as objects, e.g., Cupid’s arrows). Both ordinary and scientific thinking involve objectification of properties and relations (e.g., the idea of phlogiston).
5.2.7. Thinking
Thinking involves grouping, grasping, and simulating—packets are grouped into models, relations between packets are grasped, and manipulating packet contents (rotating packet vectors) constitutes simulation. Insight (in-sight) involves “looking inside” packets, i.e., completing a transition, which requires effort, from being vaguely aware of the packet internals to experiencing and manipulating these internals (please re-visit
Figure 9). Grasp is a form of insight resulting in apprehending coordination between patterns of changes in the packet internals. Insight is a routine component of thinking. Reasoning (symbol manipulation) is auxiliary to mental modeling and is enabled by it. An example will illustrate these suggestions.
Consider a variation of Wechsler’s intelligence test. A subject is presented with a picture showing fragments of a vase lying on the floor next to a vase stand and a cat sitting nearby, and asked to explain the scene in as many ways as might come to mind. Assume three answers: (A) the cat jumped and kicked down the vase, (B) a child was playing with a ball in that room some time ago, and (C) a poltergeist did it. On the present theory, these answers were enabled by operations on a mental model comprising three packets (vase, cat, vase stand), as follows. Answer (A) involved grasping a relation (cat pushed vase) and imagining the cat jumping and the vase falling from the stand (insight, simulation). Answer (B) involved abduction (outside packets (child, ball), having no sensory counterparts in the picture, were pulled into the model; the subject neither had prior knowledge bearing on the hypothesis nor possessed any means for validating it). Answer (C) invoked hypostatic abstraction (transforming relation push into object mischievous pusher). Note two critical features of mental modeling: (1) Models reduce the number of degrees of freedom in the packets (assume that packet cat affords five instantiations: sitting cat, walking cat, running cat, sleeping cat, and jumping cat. Of those, only the last option was available. It is safe to assume that images of a sitting or sleeping cat floating through the air were not rejected upon examination but simply did not come to mind). (2) Abduction involved re-grouping (the cat was exonerated and the relation (cat pushed vase) was de-established. Instead, relations (child kicks ball, ball pushes vase) were formed).
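The example can be encoded schematically (all structures below are illustrative stand-ins for packets and relations, not the theory's data structures): grasping a relation prunes packet instantiations (degrees of freedom collapse), and abduction re-groups the model, pulling in packets with no sensory counterpart.

```python
# An illustrative encoding (all names mine) of the "broken vase" example:
# packets with instantiations, relations grasped between them, and abductive
# re-grouping that pulls in packets absent from the scene.
scene_packets = {
    "cat":   ["sitting", "walking", "running", "sleeping", "jumping"],
    "vase":  ["on_stand", "falling", "in_fragments"],
    "stand": ["upright"],
}

def build_model(packets, relations):
    """A model = packets + grasped relations; relations prune instantiations."""
    allowed = dict(packets)
    for subj, verb, obj in relations:
        if (subj, verb, obj) == ("cat", "pushed", "vase"):
            allowed["cat"] = ["jumping"]          # only this option comes to mind
            allowed["vase"] = ["falling", "in_fragments"]
    return allowed, relations

# Answer (A): grasp a relation, simulate. Degrees of freedom collapse.
model_a = build_model(scene_packets, [("cat", "pushed", "vase")])
print(model_a[0]["cat"])                          # -> ['jumping']

# Answer (B): abduction. Exonerate the cat; pull in packets absent from the scene.
abduced = dict(scene_packets, ball=["rolling", "flying"], child=["playing"])
model_b = build_model(abduced, [("child", "kicks", "ball"),
                                ("ball", "pushed", "vase")])
print(sorted(model_b[0]))                         # packets now include ball, child
```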
5.2.8. Dynamics of Thinking
Thinking is predicated on stability of memory structures and reversibility of cognitive operations [
7], demanding minimization of entropy in the system. At the same time, exploration of the system’s phase space and identification of instabilities require injections of entropy [
16]. This proposal suggests that temperature variations provide the requisite injections, causing the system to pulsate between far-from-equilibrium and equilibrium states (which might correlate roughly with the experience of alternations between effortful attention focusing and diffuse attention, and spontaneous associative shifts). Consistent with the present theory, recent approaches in the analysis of brain processes associate transient dynamics with information production [
136]. Conceivably, neuronal avalanches [
137] underpin alternations between the states, helping to satisfy the competing demands of information transmission and network stability.
5.2.9. Learning with and without Understanding
The distinction is best explicated using the notions of fluid and crystallized intelligence [138]. Roughly, the latter term denotes the ability to learn and to act based on the results of learning. By contrast, the former term denotes the ability to deviate from the directives of prior learning and to act adequately under unfamiliar conditions. With the present theory, fluid intelligence is predicated on understanding and builds on top of crystallized intelligence. More technically, learning involves synaptic modifications represented in a mosaic of link weights in the associative network. Packets form in the associative networks, but their formation, grouping into models, and operations on models leave the weight mosaic intact. The understanding capacity was a product of evolutionary development building on the learning capacity, and served to overcome its limitations (e.g., cats are often observed attacking small moving objects (large associative weights) and hardly ever observed attacking large stationary objects (small weights). As a result, rigid reliance on a crystallized weight mosaic would have precluded the “cat pushed vase” idea, leaving the objects uncoordinated in the subject’s mind and rendering the scene unexplainable).
5.2.10. Meaning and Value
The meaning of information is determined by the mental model into which the information is fitted. For example, in the “broken vase” test, a hint informing the subject that “a child was playing with a ball nearby” would make sense if the subject were able to grasp the relation, and would otherwise remain meaningless. Value is a function of the worth attributed to the objects and the outcomes of modeling (accordingly, meaningless information has no value).
5.2.11. Neuroenergetics
Most brain energy is used on synapses [
139,
140]. This theory itemizes the account by introducing costs incurred in the navigation of the energy landscape in which the synaptic network is embedded. It has been demonstrated that pre- and postsynaptic terminals in neurons are optimized to allow maximum information transmission between synapses at minimal energy costs [
141]. This theory contends that: (a) the brain’s functional architecture is optimized to allow maximum information production at minimal energy costs, (b) optimization involves mechanisms controlling the interplay between the costs of engagement (exciting/inhibiting neurons) and the costs of navigation, and (c) the understanding capacity is a product of such optimization.
5.2.12. Gnostron
The gnostron framework combines elements of reinforcement and unsupervised learning in the formation of packet networks, and admits the use of other techniques in network processing (e.g., Bayesian updating, probabilistic inter-packet routing [
142], among others). The gnostron process can be viewed as a form of mapping different from that used in the perceptron: the perceptron (neural nets) seeks to establish mappings between vectors, while the gnostron seeks to establish coordination between patterns of vector movement. Establishing coordination involves combining packets into models, which underlies understanding and the attainment of meaning. In short, the perceptron learns to recognize conditions while the gnostron learns to understand them. Technically, the gnostron is an adaptive controller obtaining progressively improving efficiency via operations on self-organizing vector spaces [
143]. On the present theory, the gnostron implements the key function attributed to the human brain: transforming energy into the work of information production.
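The contrast can be sketched numerically (toy signals, my formulation): where a perceptron would learn a static mapping between vectors, the gnostron-style question below asks whether the movements of two packet vectors are coordinated.

```python
# A contrast sketch: gnostron-style coordination between patterns of vector
# movement (a perceptron, by contrast, would fit a static input-output map).
import numpy as np

rng = np.random.default_rng(2)
t = np.linspace(0, 2 * np.pi, 40)
a = np.sin(t) + 0.02 * rng.normal(size=t.size)              # packet vector A
b = 0.5 * np.sin(t + 0.2) + 0.02 * rng.normal(size=t.size)  # coordinated with A
c = rng.normal(size=t.size)                                 # uncoordinated

def coordination(x, y):
    """Gnostron-style question: do the *changes* of x and y move together?"""
    dx, dy = np.diff(x), np.diff(y)
    return float(np.corrcoef(dx, dy)[0, 1])

print(f"A-B coordination: {coordination(a, b):+.2f}")  # high -> same model
print(f"A-C coordination: {coordination(a, c):+.2f}")  # near zero -> unrelated
```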
5.3. Further Research—A Fork in the Road
This proposal seeks to form a conceptual bridge between two foundational ideas in neuroscience: the idea of neuronal assembly [
34,
35] and the idea of Markov blanket and variational free energy minimization [
10,
11]. In a sense, these ideas reside in a two-dimensional space defined by neuropsychological and information-theoretic axes. The bridging idea (neuronal packets) positioned energy (thermodynamics) as the third dimension. The intuition was that energy processes are not alien and external to the cognitive machinery (e.g., a horse is alien and external to the cart it pulls) but are interwoven into it at every level [
30,
31]. Suggestions resonating with this idea are now beginning to enter the mainstream from many directions (e.g., energy-aware computing [
144]), in a radical departure from the conventional AI and cognitive science framework. Developments along these new lines quickly arrive at a “fork in the road”, posed by some of the most entangled and challenging issues in science—the role of the second law in biophysics and physical underpinnings of information processing. The two paths in the fork are determined by the way the operation of the second law in the development of life and cognition is conceptualized. Both paths assume that the second law drives optimization in the cognitive system but the choices of the optimization criteria could not be more different: (A) Cognitive machinery is optimized to maximize entropy production, or (B) cognitive machinery is optimized to maximize information production.
(A) The notion that evolution selects for maximum entropy production derives from the assumption that “order produces entropy faster than disorder” [
145,
146]. With this notion, proliferation of forms and progression from simple to more richly structured forms are manifestations of “zigzags” between low-entropy pockets (complex forms), executed by nature in its rush downhill, towards universal homogeneity and dissolution of all forms (by implication, growing order entails accelerated descent). The following comments question not the assumption but its usefulness in the study of cognition. A caricature analogy of the assumption would be equating the role of digestion to the production of waste. With that, a measure of digestive efficiency would be the ratio of the amount consumed to the amount expelled, overlooking extraction of nutrients and their role in keeping the organism alive. One can accept that evolution (say, from protohuman to Sapience) was accompanied by increased entropy production in the brain counteracted by accelerated entropy removal. Even if proven correct, the result would not shed much light on the mechanisms of cognition. The Internet can be viewed as a means of information processing or as a drain on resources. As in the Necker cube, both perspectives are possible but one of them opens a view on search engines, while the other one obstructs it. The formula “from dust to dust” is undoubtedly correct but short-circuits enquiries into what might be happening during transit. In short, the entropy maximization principle can hardly inform the analysis of cognition or the design of intelligent artifacts.
(B) Cognition involves information production predicated on entropy reduction in the neuronal system. As formulated by Konrad Lorenz:
“Without offending against the principle of entropy in the physical sense … all organic creation achieves something that runs exactly counter to the purely probabilistic process in the inorganic realm. The organic world is constantly involved in a process of conversion from the more probable to the generally more improbable by continuously giving rise to higher, more complex states of harmony from lower, simpler grades of organization”.
Both (A) and (B) agree on the vector of evolution (from the simple to the complex) but disagree on the assessment of where the vector points: (A) the organic world rushes itself downhill towards self-destruction, or (B) the organic world pushes itself uphill towards self-comprehension, culminating in the development of the understanding capacity.
The latter viewpoint suggests the following lines of enquiry:
(1) Cognitive thermodynamics. Statistical thermodynamics (thermal physics) addresses energy processes in simple systems (e.g., ideal gas, inorganic compounds) [
148]. Biological thermodynamics addresses energy transformations in living matter [
149,
150]. Cognitive thermodynamics focuses on the energy processes in the nervous system that underpin cognitive functions, seeking to integrate various theoretical constructs (metastability and phase transition in the brain [
151,
152], cortical coordination dynamics [
153], neuronal group selection [
154], dynamical systems [
155,
156], self-organization in the brain [
157], embodied cognition [
158], other) within a unifying framework defined by the Markov blanket and free energy minimization principles. The following conjectures are within the scope of this enquiry. Mental modeling creates mechanisms amplifying thermodynamic efficiency of neuronal processes in the volume of the model, including:
Converting excessive heat into work;
Biasing ATP hydrolysis towards accelerating release of Gibbs free energy and inhibiting release of metabolic heat;
Reducing Landauer’s cost of information processing (by regulating access in the landscape); see the numeric sketch below.
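For scale, a back-of-envelope calculation of the Landauer bound mentioned in the last item (standard physics, not a result of the present theory):

```python
# Landauer's minimum cost of erasing one bit, E = k_B * T * ln 2, at body
# temperature, and the erasure throughput it would permit on a ~20 W budget.
import math

k_B = 1.380649e-23        # Boltzmann constant, J/K
T = 310.0                 # ~human body temperature, K
per_bit = k_B * T * math.log(2)
print(f"Landauer limit at 310 K: {per_bit:.2e} J/bit")   # ~3.0e-21 J

# Any regulation of "access in the landscape" that avoids erasures saves
# multiples of this floor on every avoided operation.
print(f"bits/s erasable on 20 W: {20.0 / per_bit:.2e}")
```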
(2) Neuropsychology of understanding. Reducing cognition to the possession of disembodied algorithms entailed excising the understanding capacity from the purview of cognitive theory.
“Unified theories of cognition are single sets of mechanisms that cover all of cognition—problem solving, decision making, routine action, memory, learning, skill, perception, motor activity, language, motivation, emotion, imagining, dreaming, daydreaming, etc. Cognition must be taken broadly, to include perception and motor activity”.
(please recollect that understanding was presumed to play a marginal role in problem solving, if any [
104], as seen in
Section 5.1). A neuropsychological theory of understanding has a dual objective: (a) analyzing the performance benefits conferred by the understanding capacity, and (b) elucidating the underlying neuronal mechanisms, aiming to represent them within a unified functional architecture (an architecture for understanding) (e.g., contingent on further analysis, the architecture might account for recent findings indicating that processing of plausible and implausible data engages different pathways in the brain [
160]). The theory needs to be broad enough to allow comprehensive analysis of the role played by understanding in different manifestations of intelligence (“multiple intelligences” [
161]).
(3) Machine understanding. Machine intelligence builds on the results of the above enquiries, implementing a transition from machine learning (knowledge-based systems) to machine understanding (understanding-based systems). Understanding-based systems combine energy efficiency with the ability to construct adequate responses under unforeseen and disruptive conditions, and to explain decisions motivating the responses. Construction derives response elements and their organization (procedures, algorithms) from internal models; explanation capabilities are organic to the system, accounting for operations on models that are inherently intelligible (e.g., grasping relations) and intrinsic to the decision process. Such systems can act autonomously or collaboratively, predicating their overt actions on the results of self-assessment seeking to verify that understanding of the task and circumstances is sufficient for executing the task.
5.4. Digest
Life emerges in molecular networks, when subnets fold into quasi-stable aggregations bounded by surfaces (Markov blankets), conferring a degree of statistical independence on the internals. Sustaining life requires regulating flows of matter and energy through the boundary surface. Folding in networks appears to be the mechanism used in both creating life and regulating life; subnets in neural networks fold into quasi-stable aggregations (neuronal packets) bounded by energy barriers. Matching such packets against changing conditions (stimuli stream) at the organism’s boundary surface creates packet networks reflecting order (regularities) in the stream. The process is stimuli-driven, thus amounting to absorbing information and extracting negentropy from the stream. The tendency to improve matching scores (minimize surprise) gives rise to processes operating on packet networks and combining packets into new structures (models). Operations on models are decoupled from the stimuli stream and self-directed, thus amounting to information production and negentropy generation. Modeling prepares the system for future conditions, thus radically improving matching scores and giving rise to the experience of attaining understanding. Modeling processes are governed by an interplay of two criteria: improving the scores and reducing overhead (energy costs incurred during modeling). The interplay makes the system self-aware and motivates continuing construction, modification, and integration of models, in a spiral of information production and growing understanding. The takeaway notion concerns the distinction between learning and understanding, as follows. Learning allows extrapolation, i.e., draws a line connecting the past and the present and extends it into the future. Mental modeling allows the extended line to be split into a bundle (what-ifs). Understanding employs a form of modeling that submits for attentive examination the few lines in the bundle that are plausible under the multitude of factors impinging on the outcomes of interest. Understanding does not foretell the future but accounts for the past, explains the present, and offers the lowest ceiling on future surprises.
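The learning/understanding distinction in the digest can be compressed into a final toy (the data and plausibility filter are invented): learning extends one line; modeling splits it into a bundle of what-ifs and submits only a plausible few for attentive examination.

```python
# A closing toy (entirely illustrative): learning extrapolates one line from
# the past; modeling splits it into a bundle of what-ifs and keeps only the
# plausible ones for attentive examination.
import numpy as np

past = np.array([1.0, 2.1, 2.9, 4.2])                 # observed trajectory

# Learning: fit the past, extend one line into the future.
slope, intercept = np.polyfit(np.arange(len(past)), past, 1)
extrapolation = slope * np.arange(len(past), len(past) + 3) + intercept

# Understanding: perturb model factors -> a bundle of futures; keep plausible ones.
factors = {"as_before": 1.0, "obstacle": 0.3, "tailwind": 1.5, "reversal": -1.0}
bundle = {name: slope * f * np.arange(1, 4) + past[-1] for name, f in factors.items()}
plausible = {n: b for n, b in bundle.items() if (np.diff(b) >= 0).all()}

print("one line:", extrapolation.round(1))
print("examined what-ifs:", sorted(plausible))        # the reversal never comes to mind
```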