Increase in mutual information during interaction of the brain with environment contributes to perception

Perception and motor interaction with physical surroundings can be analyzed through changes in the probability laws governing two possible outcomes of neuronal activity, namely the presence or absence of spikes (binary states). Perception and motor interaction with the physical environment are accounted for partly by the reduction in entropy within the probability distributions of binary states of neurons in distributed neural circuits, given the knowledge about the characteristics of stimuli in physical surroundings. This reduction in the total entropy of multiple pairs of circuits in networks, by an amount equal to the increase of mutual information among them, occurs as sensory information is processed successively from lower to higher cortical areas or between different areas at the same hierarchical level but belonging to different networks. The increase in mutual information is partly accounted for by temporal coupling as well as synaptic connections, as proposed by Bahmer and Gupta [1]. We propose that robust increases in mutual information, measuring the association between the characteristics of sensory inputs and the connectivity patterns of neural circuits, are partly responsible for perception and successful motor interactions with physical surroundings. It is also argued that perception from a sensory input is the result of networking of many circuits to a common circuit that primarily processes the given sensory input.


Information Theoretic Basis of Perception and Action:
Recent years have seen a surge of interest in understanding the neural basis of perception. Many modern theories have centered on predictive coding processes [2]. According to this view, the brain uses an internal model to predict sensory input based on movements and past sensory experience [2][3][4][5]. More recently, an information-theoretic model of predictive coding was also developed [6].
In another model of perception recently proposed by Singer and Lazar [7], the brain's spontaneous synaptic activity in the neocortex stores priors, performs high-dimensional processing with nonlinear dynamics and provides efficient, flexible computational strategies. High-dimensional states in the cortex, in the presence of given sensory inputs, collapse into specific low-dimensional substates corresponding to specific perceptual experiences [7].
In this paper, we present a model of perception which proposes that the increase in mutual information resulting from a decrease in the total entropy in neural circuits, which form the networks that process sensory inputs, is partly responsible for perceptual experience. Entropy measures the uncertainty in the binary states (spiking versus resting) of neurons in neural circuits. An increase in mutual information reflects the decrease in the uncertainty in the binary states of neurons in neural circuits, given the knowledge about the characteristics of external sensory stimuli in the environment.
Reduction in the uncertainty (entropy) in neural circuits results from the interaction of the brain with the external environment, involving sensory inputs, which introduces changes in the probability laws governing the binary states of the neurons. This view is supported by experimental data, which show that in behaving animals, specific neurons show statistically reliable spiking patterns [8,9]. Furthermore, these spike patterns in behaving animals show stable correlations with behavior as well as significant inter-neuronal correlations [10]. This suggests that there is a decrease in uncertainty (an increase in mutual information), or, equivalently stated, an increase in certainty in the probability distribution of the binary states of neurons, given the knowledge about the interaction of the animal with the external physical environment.
We also consider sparse coding as a source of the large amount of information available for processing perception and motor interaction. Bahmer and Gupta [1] have argued that dense coding from gamma oscillations, such as those present in lower cortical areas, can be transformed into sparse codes with the help of an integration process that involves coincidence detection by at least two other neural events, namely, low-frequency oscillations that synchronize different circuits [11] and the commonly observed ramping activities of cortical neurons [12][13][14]. This integration process, which generates a sparse code from a dense code via coincidence detection, would be responsible for generating the information that underlies perception. A study has shown that the conversion of a dense to a sparse code, representing the identity of an odorant in the nervous system of the locust, can result from coincidence detection [15]. Note that the synchronization of gamma oscillations, known as communication through coherence (CTC), is also sufficient for information coding [16].
Information-theoretic methods are used to argue the importance of the role of mutual information and sparse coding in perception and motor interaction of the brain with physical surroundings. For the analysis, two states of a neuron are considered: (a) the active state, with spikes (for simplicity, there will be no consideration of the rate of spike generation or the ability of spikes to produce effects), and (b) the inactive state, with no spikes. Circuits will be considered as probability distributions of neurons with two (binary) states: with spikes and without spikes.
Although an increase in mutual information parallels a dynamic representation of the external environment, information representing external stimuli and perceptual objects, such as for working memory and episodic memory, could be generated by sparse coding for online perception and action [1].

Mutual Information between two successive neural circuits:
Mutual information is a general measure of the strength of the association between two variables. In the brain, the mutual information between two distributions of variables, represented by the binary states of neurons, in pairs of circuits in a network will give a measure of the strength of connection. In fact, a mutual information kernel has been shown to be a good measure of functional connectivity in non-stationary data, such as EEG [17].
Given two variables X and Y, the mutual information, $I(X, Y)$, is the average reduction in uncertainty about X that results from knowing the value of Y, and vice versa [18]. For the analysis of mutual information in neural circuits, we consider the probability distributions of random variables, defined as the binary states of neurons, active (showing spike generation) and inactive (without any spikes), in two separate pools of neurons, X and Y. Here m and n are the numbers of neurons in the X and Y neuronal pools, which represent two local circuits.
The mutual information between variables X and Y is defined by equation 1 [18]:

$$I(X, Y) = \sum_{i}\sum_{j} p(x_i, y_j) \log \frac{p(x_i, y_j)}{p(x_i)\,p(y_j)} \quad \text{(Equation 1)}$$

If the variables in two distributions, X and Y, are independent, then the joint probability $p(x_i, y_j)$ will equal the product of the probabilities of the activity of individual neurons, that is, $p(x_i, y_j) = p(x_i)\,p(y_j)$. In this case, the mutual information $I(X, Y)$ will be zero, since $\log \frac{p(x_i, y_j)}{p(x_i)\,p(y_j)}$ reduces to $\log 1$, or zero (result 1; Table 1).
However, if a change in the binary activity state (active vs. inactive) of neurons in circuit X affects the probability of binary states of neurons in circuit Y via direct or indirect synaptic connections or due to an external physical constraint, it will increase the joint probability $p(x_i, y_j)$. Moreover, note from equation 1 that an increase but NOT a decrease in the joint probability $p(x_i, y_j)$ increases the mutual information (result 2; Table 1). Note that the increase in joint probability $p(x_i, y_j)$ intuitively implies that there is a greater certainty about the binary activity state of variable y given the knowledge about variable x. Although mutual information is a symmetrical quantity, it does not require reciprocal connections. Mutual information can also be computed when there are only unidirectional connections between two circuits.
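The expected (E), or average, value is defined in equation 2. Mutual information can be expressed in terms of the individual and joint entropies (equation 4):

$$I(X, Y) = H(X) + H(Y) - H(X, Y) \quad \text{(Equation 4)}$$

Rearrangement of equation 4 gives another useful form (equation 5):

$$H(X, Y) = H(X) + H(Y) - I(X, Y) \quad \text{(Equation 5)}$$

This equation shows that the total entropy ($H(X) + H(Y)$) is reduced whenever there is an increase in mutual information, yielding the net value of the joint entropy ($H(X, Y)$).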
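The relationship between joint probability, mutual information and entropy described above can be checked numerically. The following is a minimal sketch, assuming a single pair of binary neurons (one from pool X, one from pool Y) whose joint distribution is given as a 2 x 2 table; the probability values and the function names are hypothetical and chosen only for illustration. It shows that mutual information vanishes for independent neurons, becomes positive when the joint probability of co-activation is raised while the marginals are held fixed, and equals the reduction of the total entropy $H(X) + H(Y)$ to the joint entropy $H(X, Y)$.

```python
import math


def entropy(probs):
    """Shannon entropy (bits) of a discrete distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def mutual_information(joint):
    """I(X, Y) in bits from a joint table joint[x][y] (0 = silent, 1 = spike)."""
    px = [sum(row) for row in joint]          # marginal of X
    py = [sum(col) for col in zip(*joint)]    # marginal of Y
    mi = 0.0
    for x, row in enumerate(joint):
        for y, pxy in enumerate(row):
            if pxy > 0:
                mi += pxy * math.log2(pxy / (px[x] * py[y]))
    return mi


# Hypothetical marginals: p(x = spike) = 0.2, p(y = spike) = 0.3.
independent = [[0.8 * 0.7, 0.8 * 0.3],
               [0.2 * 0.7, 0.2 * 0.3]]
# Same marginals, but co-activation p(spike, spike) raised from 0.06 to 0.16.
coupled = [[0.66, 0.14],
           [0.04, 0.16]]

mi_independent = mutual_information(independent)
mi_coupled = mutual_information(coupled)

# Total entropy minus joint entropy should reproduce the mutual information.
h_x = entropy([sum(row) for row in coupled])
h_y = entropy([sum(col) for col in zip(*coupled)])
h_xy = entropy([p for row in coupled for p in row])
mi_from_entropies = h_x + h_y - h_xy

print(f"I(X,Y) independent: {mi_independent:.4f} bits")
print(f"I(X,Y) coupled:     {mi_coupled:.4f} bits")
print(f"H(X)+H(Y)-H(X,Y):   {mi_from_entropies:.4f} bits")
```

The sketch treats one neuron pair only; extending it to pools of m and n neurons would require the full joint distribution, which is not specified here.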

Relative contributions of conditional entropy and mutual information in the total entropy:
Various neural phenomena, such as coincidental activation and synchronization by single neural oscillations or nested oscillations, increase the joint probabilities of activation and/or inactivation of neurons, which increases the mutual information between two connected circuits of the brain. The gain of mutual information leads to an increase in the certainty in the state of neural circuits given the knowledge about the sensory stimulus. The uncertainty in the binary state distribution of neurons given the knowledge of the external stimulus, which is also referred to as noise and is present alongside inputs from the sensory stimulus, plays an important role in determining how accurately mutual information is reflected in the input.
Figure 1: This schematic depicts a typical configuration of processing hubs formed by cortical areas in the brain. A large cortical area, A, is shown to interact with several other circuits (B, C and D) by sending and/or receiving inputs when processing inputs related to a common stimulus or input source. In a task of greater complexity, another circuit E (broken line) is shown to be involved. The addition of another circuit to the network processing a given stimulus will decrease the conditional entropy (area of circle A, excluding overlapping regions) and increase the mutual information (area of all overlapping regions within circle A), which will improve the signal to noise ratio (equations 11-13).
The conditional entropy, $H(Y|X)$, is defined as the uncertainty in the data from Y after observing the data from X [18]. The conditional entropy $H(Y|X)$ in the data from circuit Y after observing the data in X is defined as noise, since this part of the total entropy in circuit Y, $H(Y)$, does not give information about the data in X. The mutual information $I(X, Y)$, which results from changes in the probability distribution of binary states in connected circuits, given the knowledge of the input source or stimulus, is an index of signal. Mutual information, $I(X, Y)$, is related to the conditional entropy $H(Y|X)$ according to equation 6:

$$I(X, Y) = H(Y) - H(Y|X) \quad \text{(Equation 6)}$$

If this conditional entropy remains large after the interaction, then there would be a larger neuronal pool available for further reduction of the uncertainty, given the knowledge about the external physical stimuli (see equations 8 and 9; Figure 1). Thus, conditional entropy, which is referred to in this paper as noise, instead of serving no useful purpose, serves as the extra capacity for additional increases in mutual information, which may result from interactions with other circuits (Figure 1).
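The division of the entropy of circuit Y into signal (mutual information) and noise (conditional entropy) can be illustrated with a small numerical example. This is a sketch under stated assumptions: a single pair of binary neurons, a hypothetical joint probability table, and the standard definition $H(Y|X) = \sum_x p(x) H(Y|X = x)$; the function names are illustrative and not taken from the cited work.

```python
import math


def entropy(probs):
    """Shannon entropy (bits) of a discrete distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def conditional_entropy_y_given_x(joint):
    """H(Y|X) = sum over x of p(x) * H(Y | X = x), with joint[x][y]."""
    h = 0.0
    for row in joint:
        px = sum(row)
        if px > 0:
            h += px * entropy([pxy / px for pxy in row])
    return h


# Hypothetical joint distribution of two binary neurons (0 = silent, 1 = spike).
joint = [[0.66, 0.14],
         [0.04, 0.16]]

h_y = entropy([sum(col) for col in zip(*joint)])   # total entropy of Y
noise = conditional_entropy_y_given_x(joint)       # part of H(Y) not explained by X
signal = h_y - noise                               # mutual information I(X, Y)

print(f"H(Y)   = {h_y:.4f} bits")
print(f"H(Y|X) = {noise:.4f} bits (noise)")
print(f"I(X,Y) = {signal:.4f} bits (signal)")
```

In this example most of $H(Y)$ remains as conditional entropy, which is the remaining capacity that interactions with further circuits could convert into mutual information.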
The effect of the availability of several independent circuits on increasing mutual information can be understood in a hypothetical but typical situation. In such an example, a circuit (A) processing a given input (from primary sensory areas or higher areas) has connections with other circuits (B, C and D) in the brain (Figure 1). The mutual information, given the knowledge of the source of inputs, can be analyzed by the following equation, obtained by simple addition of the mutual information between the different pairs of circuits formed with circuit A:

$$I(A, B) + I(A, C) + I(A, D) = 3H(A) + H(B) + H(C) + H(D) - \big(H(A, B) + H(A, C) + H(A, D)\big) \quad \text{(Equation 7)}$$

The net increase in mutual information, representing the strength of association of circuit A with circuits B, C and D, given the knowledge about the source of input, can be quantified as $I(A, B) + I(A, C) + I(A, D)$. The total entropy of circuit A can be analyzed in terms of mutual information and conditional entropy in equation 8. The conditional entropy is the uncertainty in the data in circuit A after observing the data in the other circuits B, C, D and E (Figure 1).
In other words, this additional interaction with circuit E will further increase the certainty in circuit A given the knowledge about the source responsible for inputs. This would relate physiologically to increasing the attention to the source of inputs. The increase in mutual information will come from the total entropy of circuit A ($H(A)$), which is evident in equation 9.

$$\frac{\text{Signal}}{\text{Noise}} = \frac{I(A, B) + I(A, C) + I(A, D) + I(A, E)}{H(A|B) + H(A|C) + H(A|D) + H(A|E)} \quad \text{(Equation 10)}$$

Equation 10 can be rewritten in another form (equation 11), which expresses conditional entropy as the difference between the total entropy and mutual information. As explained later, the signal to noise ratio will increase whenever an additional circuit, such as circuit E, couples with the already networked circuits B, C and D. The networking of another circuit E will increase the mutual information by $I(A, E)$ (Figure 1), an amount that is subtracted from the part of the total entropy of circuit A that constitutes the conditional entropy after the data in B, C and D are observed. This will lead to an increase in the signal to noise ratio. This is evident after comparing the values of the signal to noise ratio computed in equations 11 (circuit E is NOT in the network) and 12 (circuit E is in the network) (inequality equation 13). Since signal (mutual information) is a direct correlate of perception, the increase of the signal to noise ratio will result in increased attention to the perceptual object (result 3; Table 1).

$$\frac{\text{Signal}}{\text{Noise}} = \frac{I(A, B) + I(A, C) + I(A, D)}{H(A) - \big(I(A, B) + I(A, C) + I(A, D)\big)} \quad \text{(Equation 11)}$$

Once circuit E, also processing the same stimulus or internal source, is added to the network, the signal to noise ratio is given by the following equation:

$$\frac{\text{Signal}'}{\text{Noise}'} = \frac{I(A, B) + I(A, C) + I(A, D) + I(A, E)}{H(A) - \big(I(A, B) + I(A, C) + I(A, D) + I(A, E)\big)} \quad \text{(Equation 12)}$$

Comparison of equations 11 and 12 reveals that the addition of another circuit (circuit E) to a network will increase the signal to noise ratio. This result is used for arguing that perception, which is partly dependent on the signal to noise ratio, is the result of networking of many circuits to a common circuit (circuit A) that primarily processes an input.

$$\frac{\text{Signal}'}{\text{Noise}'} > \frac{\text{Signal}}{\text{Noise}} \quad \text{(Equation 13)}$$

Therefore, it will be advantageous for the entropy of circuit A, $H(A)$, to be a large quantity, so that there is more capacity for an increase in mutual information, which would allow more complex processing of perceptual objects. Thus, it is likely that the best candidates for circuit A would be network hubs and association areas, which include large inter-connected parts of the cortex [19,20].
Note that an important condition for equations 7-12 to hold is that the circuits A, B, C, D and E interact with the same internal source or external stimulus. Although not independent, since they process the same input source, B, C, D and E are assumed to be probability distributions that are formed by separate sets of variables, namely the neurons with binary states.
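The comparison between equations 11 and 12 can be made concrete with a short numerical sketch. The entropy of circuit A and the mutual information values below are hypothetical numbers in bits, chosen only for illustration, and the sketch assumes, as the equations do, that the contributions of the partner circuits add up and that their sum stays below $H(A)$. The function name is likewise illustrative.

```python
def signal_to_noise(h_a, mutual_infos):
    """Ratio of summed mutual information (signal) to the remaining
    conditional entropy of circuit A (noise = H(A) - signal)."""
    signal = sum(mutual_infos)
    noise = h_a - signal
    if noise <= 0:
        raise ValueError("summed mutual information must stay below H(A)")
    return signal / noise


h_a = 10.0                       # hypothetical total entropy of hub circuit A (bits)
i_ab, i_ac, i_ad = 1.5, 2.0, 1.0  # hypothetical I(A,B), I(A,C), I(A,D)
i_ae = 0.5                        # hypothetical I(A,E) gained when circuit E joins

snr_without_e = signal_to_noise(h_a, [i_ab, i_ac, i_ad])        # equation 11
snr_with_e = signal_to_noise(h_a, [i_ab, i_ac, i_ad, i_ae])     # equation 12

print(f"without E: {snr_without_e:.3f}")
print(f"with E:    {snr_with_e:.3f}")
```

Because adding $I(A, E)$ raises the numerator and lowers the denominator at the same time, the ratio increases for any positive $I(A, E)$ that leaves the noise term positive, which is the content of inequality 13. A larger $H(A)$ leaves more room before the noise term is exhausted.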

Surprise, entropy and Shannon Information:
Unlike a coin toss, classification of neural spikes as a binary event is less obvious. Spiking, or the action potential, represents one binary state; the other state is when there is no action potential and the neuron is in the resting state. Like the toss of a coin, a spiking neuron shows two outcomes during its activity, namely the presence or the absence of spikes at a particular time. Moreover, the generation of a neuronal spike is determined by a probability function. Shannon information from observing one event/outcome is a measurement of the surprise associated with that particular event/outcome [21]. In the case of neurons, the one particular outcome of physiological interest will be the generation of spikes. In contrast, entropy is the average surprise after observing all outcomes, which are both spikes and the resting state (no spikes) in neurons [21]. Shannon information, or average surprisal, after observing spike generation in a population of m neurons in an area of the brain that generates infrequent activities is given by equation 14, where $p(x)$ is the probability function that governs the activity of the neuron.
(Equation 14)
Sparse coding refers to infrequent activities in a small number of neurons, encoding information in different parts of the brain [22]. Sparse coding is associated with apparently complex motor outcomes in songbirds, such as highly stereotyped songs [23,24]. Neurons that are responsible for sparse coding exhibit a low average probability of firing [22,23,25]. Accordingly, the Shannon information after observing spike generation in a sparsely firing neuron will be a large quantity (Equation 14). Despite the large amount of Shannon information that would result from a low-probability event, it is not clear how sparse coding is mechanistically related to complex stereotyped outcomes [22].
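The distinction drawn above between the Shannon information of a single outcome and the entropy over both outcomes can be illustrated numerically. The sketch below assumes a single binary neuron with spike probability p and uses the standard definitions (surprisal $-\log_2 p$ and binary entropy); the probabilities are hypothetical and the function names are illustrative only.

```python
import math


def surprisal(p):
    """Shannon information (bits) of observing an outcome with probability p."""
    return -math.log2(p)


def binary_entropy(p):
    """Average surprise (bits) over both outcomes: spike (p) and no spike (1 - p)."""
    if p in (0.0, 1.0):
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


# Hypothetical spike probabilities, from coin-like to sparsely firing.
for p in (0.5, 0.1, 0.01):
    print(f"p(spike) = {p:<5} surprisal = {surprisal(p):.3f} bits, "
          f"entropy = {binary_entropy(p):.3f} bits")
```

For the sparsely firing neuron the surprisal of a spike is large while the entropy of the neuron is small, which matches the observation that a rare spike carries a large amount of Shannon information.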
Figure 2: This schematic (adapted from Bahmer and Gupta [1]) depicts the coincidence detection (D) leading to the activation of a set of neurons. This sparse set of neurons is activated (D) when the excitatory phase of a low-frequency oscillation (B) coincides with gamma-synchronized synaptic discharge [65] from ramping neurons (A). Sensory stimuli can reset the phase of low-frequency oscillations, leading to a random shift in the excitatory phase (orange shaded area) of the oscillations in relation to the ramping activities, which represent internal states of the brain. Based on the fraction of the total cycle length occupied by the excitatory phase, the excitatory phase of the low-frequency oscillation will coincide with a high-level firing state of a ramping neuron with a probability of 1/k. The surprisal, or Shannon information, from the coincidence detection, which results in the firing of a set of neurons (sparse coding), will encode the sensory and motor interaction with the external environment.
It is argued above that sparse coding could result from coincidence detection between the active phase of a low-frequency oscillation and the stage of climbing neuronal activity (Figure 2). If the excitatory phase is a fraction ($1/k$) of the total length of the oscillation cycle, the probability of finding a low-frequency oscillation in its excitatory phase, after a sensory stimulus-induced random phase shift of the oscillatory cycle [26], will be $1/k$. Additionally, the probability ($p(L)$) of ramping neurons reaching a threshold will depend on the probability function governing the slope of the ramping activity. Also note that, for the purpose of analysis of ramping activity in neural circuits, the appearance of the excitatory phase of a stimulus-induced phase-shifted low-frequency oscillation leading to a coincidence detection is a random event.
When coincidence detection by two relatively independent events occurs, it will lead to increased activity in a set of neurons receiving inputs from ramping neurons during the excitatory phase of the reset low-frequency oscillation (Figure 2). This activity will result in sparse coding. The probability ($p(S)$) of a sparse outcome due to the activity of ramping neurons is given by the following equation.

$$p(S) = \frac{p(L)}{k} \quad \text{(Equation 15)}$$
Note that the term $1/k$ in equation 15, expressing the probability of finding the oscillation in its excitatory phase, is an approximation that treats the excitatory phase as a single discrete event. In practice, the probability of spike generation during the excitatory phase of the oscillation would vary according to a continuous trigonometric function. Furthermore, according to the model suggested above (Figure 2), the information that would be produced by sparse coding is a consequence of two relatively independent events: ramping activity, representing internal states [27], and a stimulus-induced phase shift in neural oscillations. Thus, Shannon information generated by this mechanism will enable the brain to provide optimal outcomes during an interaction with the external surroundings. Furthermore, the independence of the two events will give a limited number of ramping activities the ability to interact with various neural oscillations to generate information in a very wide range of brain functions. Ramping activities, which are the most common patterns in the frontal cortex during timing tasks, have been shown to be important in the temporal control of actions [12][13][14]. Furthermore, a study in rats suggests that ramping activities in the orbitofrontal cortex represent internally generated waiting control [27]. The proposed role of ramping neuron activities in information processing is consistent with their role in distributed modular clocks, proposed by Gupta [28]. Sparse activity can result from the interaction with the environment based on the postulated mechanism, since a variety of stimuli, such as visual, auditory and proprioceptive stimuli, can cause random phase-shifting of neural oscillations. Note that the random nature of stimulus onset from the environment is due to the lack of knowledge about external stimuli in neural circuits before their onset. The observation of sparse activity can be quantified as surprisal or Shannon information (equation 14), which would potentially encode behavioral outcomes, such as timing. Further note that the increase in Shannon information occurs concomitantly with the increase in certainty, given the knowledge about sensory stimulation, during interaction with physical surroundings. A study of Shannon information in imaging systems has shown that Shannon information to some extent rises with an increase in signal, measured as image diameter or threshold of detection after parameter optimization [29]. This suggests that large amounts of Shannon information could also encode complex functions of the brain for processing working memory functions or for generating episodic memories. Moreover, note that dense coding, in contrast to sparse coding, will interfere with the increase in mutual information.
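Equation 15 and the surprisal it implies can be worked through with a small example. This is a minimal sketch, assuming that the two events (the ramping neuron reaching threshold, with probability $p(L)$, and the oscillation being in its excitatory phase, with probability $1/k$) are independent, as the model above supposes; the numerical values and function names are hypothetical.

```python
import math


def sparse_probability(p_l, k):
    """Equation 15: p(S) = p(L) / k, assuming the two events are independent."""
    return p_l / k


def surprisal_bits(p):
    """Shannon information (bits) of an event with probability p."""
    return -math.log2(p)


p_l = 0.5   # hypothetical probability that a ramping neuron reaches threshold
k = 8       # hypothetical: excitatory phase occupies 1/8 of the oscillation cycle

p_s = sparse_probability(p_l, k)
info = surprisal_bits(p_s)

# With independent events, the surprisals of the two events add.
info_from_parts = surprisal_bits(p_l) + math.log2(k)

print(f"p(S) = {p_s}")
print(f"surprisal of the sparse event = {info:.2f} bits")
print(f"sum of the two contributions  = {info_from_parts:.2f} bits")
```

A narrower excitatory phase (larger k) or a lower $p(L)$ makes the coincidence rarer and therefore raises the Shannon information of each sparse event.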

Effect of temporal coupling of neural activities on the measurement of mutual information:
We have argued, based on equation 1, that an increase but NOT a decrease in the joint probability $p(x_i, y_j)$ increases the mutual information (result 2; Table 1). If there is simultaneous activity of neuron pairs ($x_i$ and $y_j$) in two neuronal pools (X and Y), either as a result of a constraint or of synaptic connections, then this would lead to an increase in joint probability. In one scenario, there is simultaneous activation of pairs of neurons in two neuronal pools via chemical or electrical (gap junction) synaptic connections, which will increase the probability $p(x_i, y_j)$ of joint activity of neurons $x_i$ and $y_j$. In a different scenario, pairs of neurons are activated simultaneously because they are both controlled by a specific aspect of the external stimulus. Notice that the web-like configurations in A and B are different, depicting different consequences due to differences in stimuli, which affect the probability laws that govern neuronal activities. Moreover, these differences in web-like patterns in the two scenarios can result in differences in the outputs, which would be responsible for differences in motor or behavioral outcomes.
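The effect of temporal coupling on mutual information can be illustrated with a toy model. The following sketch makes an assumption that is not part of the original analysis: neuron y copies the state of neuron x with a coupling probability c and otherwise fires independently with the same spike probability p. Under this assumption the marginal firing probabilities stay fixed while the joint probability of co-activation rises with c, so the change in mutual information is due to coupling alone. All names and values are hypothetical.

```python
import math


def coupled_joint(p, c):
    """Joint table joint[x][y] for two binary neurons: with probability c,
    y copies x; otherwise y spikes independently with probability p."""
    return [
        [(1 - p) * (c + (1 - c) * (1 - p)), (1 - p) * (1 - c) * p],
        [p * (1 - c) * (1 - p), p * (c + (1 - c) * p)],
    ]


def mutual_information(joint):
    """I(X, Y) in bits from a joint probability table."""
    px = [sum(row) for row in joint]
    py = [sum(col) for col in zip(*joint)]
    mi = 0.0
    for x, row in enumerate(joint):
        for y, pxy in enumerate(row):
            if pxy > 0:
                mi += pxy * math.log2(pxy / (px[x] * py[y]))
    return mi


p_spike = 0.2                              # hypothetical spike probability
couplings = [0.0, 0.25, 0.5, 0.75, 1.0]    # hypothetical coupling strengths
mi_values = [mutual_information(coupled_joint(p_spike, c)) for c in couplings]

for c, mi in zip(couplings, mi_values):
    p_joint = coupled_joint(p_spike, c)[1][1]
    print(f"c = {c:.2f}  p(spike, spike) = {p_joint:.3f}  I(X,Y) = {mi:.4f} bits")
```

In this toy model mutual information grows monotonically from zero (no coupling) to the full entropy of a single neuron (perfect coupling), in line with the argument that simultaneous activity imposed by synaptic connections or by a shared stimulus raises mutual information.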

Web-like configuration following processing of specific inputs imposes patterns of activation:
Activation of neurons in networked circuits, as inputs are being processed, leads to a web-like configuration of circuits (Figure 3), where dots represent neurons, and the (double arrow) lines connecting the dots represent the states associated with increased probability of joint activity of pairs of connected neurons. Increased joint probability will reduce the total entropy in networks of circuits by an amount called mutual information, if they process the same internal source or external stimulus (equation 1, result 2). The increase in the amount of mutual information, which is quantitatively related to the increase in joint probability of activity of pairs of neurons in two circuits, is correlated with perception. Connected dots, representing pairs of neurons with high probability of joint activity, will also contribute to the reduction in total entropy in other circuits, where they will increase the certainty about the external stimulus, further contributing to the certainty related to a perceptual object or a specific motor response.

Increase in mutual information contributes to perception:
Here we argue that the presence of robust increases of mutual information is the crucial link for perception, an argument based mainly on experimental and theoretical evidence from a growing body of research on the processing of visual information. It was proposed earlier by von der Malsburg [30] that neuronal responses can be synchronized for processing the grouping of stimulus-specific features of perceptual objects. Experimental evidence supporting the role of synchronization in the temporal coupling of neuronal responses in processing stimulus-specific features was provided later by the Singer laboratory [31][32][33]. Neuenschwander and Singer [32] demonstrated that temporal coupling among responses of spatially segregated ganglion cells can be exploited to convey information relevant for perceptual grouping [1]. Bahmer and Gupta [1] have recently argued, based on studies of other perceptual functions, including auditory, olfactory and interval-timing functions, that temporal correlations between neural events form an important basis of perception.

Retinal processing of visual information:
We propose that the mutual information, a measure of association between two discrete spatial points, will determine the perception of how close two spatial points are to each other. Furthermore, since there is direct electrical coupling formed by the gap junctions between the rods and cones in different animals, including primates [34][35][36], there will be an increase in the joint probability of their activation, increasing mutual information.
An increase in the joint probability of activation of the rods and cones, via direct or indirect coupling, will also increase the joint probability of activity of their cortical connections, serving distinct spatial points on the physical object. In addition to the gain in mutual information (result 2, Table 1), this would lead to dimensionality reduction by converting multiple spatial points of physical object(s) into a single neural representation for the purpose of feedback loops. This will reduce the information processing load, especially for those functions that are dependent on feedback processes, by eliminating the need for multiple independent feedback loops. Moreover, the representation of multiple spatial points of a physical object as a single neural entity in motor feedback interactions or during working memory functions will also increase mutual information, which would improve the signal to noise ratio, contributing to attentional modulation as well as perception of the physical object (see equations 11-13).

Perception of physical continuity:
Ganglion cells in the retina are coupled together by direct gap junctions as well as indirect gap junctions, which are constructed with the help of amacrine cells [37]. Direct and indirect coupling of ganglion cells increases the joint probability of activity in pairs, which will depend on the number of gap junctions connecting them. The effect of increased joint probability of activity of ganglion cells will also be extended to the corresponding pairs of third-order neurons, which project to the primary visual cortex. When specific pairs of third-order neurons show greater joint probability of activity, the spatial points served by the corresponding photoreceptors will be perceived as physically closer, or as continuous if the joint probability of activity approaches 1.
Furthermore, the mutual information processed from the rods and cones will also be affected by other complex interactions within the retinal circuitry, as reviewed recently [36], which could help us understand the basis of color, depth or texture perception. A successful motor interaction of the brain with external objects requires a robust increase in mutual information given the knowledge of sensory stimuli. This will result in a web-like connectivity pattern (Figure 3), which will set the stage for successful motor interaction with the environment [28,38,39].
In a past study, gamma oscillations, localized in the primary motor cortex, were seen to reach a peak amplitude during movement [40]. The same study also showed that gamma oscillations were absent during the sustained part of isometric movements, when there is no finger movement or muscle shortening [40]. It is interesting to note that the presence of movements, which involve an interaction with the environment, is accompanied by a gamma band peak in the primary motor cortex, while there is no increase in the amplitude of gamma bands when there are no movements, consistent with the lack of direct interaction with external surroundings. The increase in coupling of successive circuits can produce a robust increase in mutual information from sensory areas, mostly involving the proprioceptors and photoreceptors, to the premotor areas, leading to movements that produce successful interaction with the environment. In fact, several studies have shown the presence of coherence, consistent with a decrease in uncertainty via gamma band synchronization, between different areas involved in visuo-motor transformations, starting from the early visual areas and reaching through the parietal cortex and motor cortex to the spinal cord [16]. Thus, gamma oscillations, by helping to generate mutual information, may play a key role in cognitive functions. Hence, it is not surprising that different gamma oscillation activities are also found to be reduced in schizophrenia, which is primarily a disorder of cognitive functions [41].

Robust Reduction of Entropy, given stimulus characteristics, involves many parts of the brain (Figure 4):
Visual information is processed along two streams, a dorsal stream and a ventral stream, the latter serving the identification of objects [20,42,43]. The dorsal stream runs from the primary visual area to the middle temporal area (MT) to posterior parietal areas, projecting to the premotor areas [20,43,44]. The ventral stream projects from the primary visual area to the inferotemporal area [20]. There are extensive connections between both pathways [20], which is in agreement with the argument that there will be a robust increase in mutual information dependent on both the visual and the motion-dependent characteristics of visual objects. Note that the MT is specialized to process velocity, direction and depth [44]. Furthermore, the cerebellum, which depends on feedback mechanisms for calibration in neural circuits [28], is connected to the cerebrum by multiple parallel loops, that is, the cortical areas project to the same part of the cerebellum from which they receive inputs [45]. Interestingly, although the cerebellum has major connections with the parietal and prefrontal cortices, it also appears to have reciprocal connections with the temporal lobe in humans [45][46][47][48]. Due to the connections of the cerebellum with many parts of the dorsal stream, it is likely to be responsible for large increases in certainty in the brain circuits during the motor control of visual objects, given the knowledge about the stimulus characteristics. This increase in mutual information will increase the signal to noise ratio according to equations 11-13, contributing to attention and other aspects of perception related to the visual object.
A study of a visuomotor task, using a trackball to manipulate a randomly rotating cube on a computer screen, revealed the presence of coupling between the phase of delta oscillations (2-5 Hz) and an increase in the amplitude of gamma oscillations (60-90 Hz) in the occipital and parietal cortical areas as well as the cerebellum [49]. Based on the role of gamma oscillations in producing tighter temporal coupling of synaptic inputs [16], referred to as communication through coherence, this increase in the amplitude of gamma oscillations is consistent with increasing mutual information through the synchronization of inputs into various hubs of the network that form the dorsal stream (Figure 4).
A schematic (Figure 4) depicts the circuits of the dorsal and ventral streams, illustrating that the interactions between successive circuits lead to an increase in mutual information. One can compare the inferotemporal area in Figure 4, which is a major hub interacting with other brain areas during visuomotor tasks, to circuit A depicted in Figure 1. Note that the inferotemporal area, which forms the ventral stream, has connections with the primary visual area, posterior parietal area and cerebellum. The large number of interconnections of the inferotemporal area effectively increases the total entropy, which makes it suitable for the large increases in mutual information required for attention to visual objects during visuomotor tasks [49].

Reduction in joint entropies via interaction between feedforward and feedback connections in the cortex:
In order to interpret sensory data based on a Bayesian scheme, the brain may use several constraints, such as those resulting from prior experience, recent experience, present data and an internal model of the world. Predictive coding specifically refers to the use of an internal model to interpret sensory data [5,50,51]. Feedforward connections carry the current expectation of sensory data, while feedback connections carry the optimal expectation of sensory data [50,51]. Feedforward connections, which are synchronized by high-frequency gamma oscillations, originate predominantly from the superficial layers, while feedback connections, carried in the deep layers, are synchronized by alpha and beta oscillations [52,53], consistent with processing in networks [11] that would allow a decrease in the total entropy (Figure 4).
Sensory data in feedforward and feedback directions help to generate prediction error signals in layer IV, which eventually update the current expectation of sensory data in the feedforward direction. The sensory data in superficial layers come from the thalamic relay nuclei, which directly reduces uncertainty in successive circuits, given the knowledge of the characteristics of external sensory stimuli. On the other hand, the deep layers carry the optimal expectation of sensory data based on an internal model. More significantly, it is the error signal that is used for updating the current expectation of sensory data, which would directly increase the mutual information between circuits served by feedforward and feedback directions, in order to interpret (gain knowledge about) the sensory data. These error cells in cortical layer IV also drive changes in the superficial cells to update expectations. Increases in mutual information, given the knowledge about sensory data from the external stimulus, will occur, since error cells compute the difference between current expectations, which are sensory data encoded in superficial layers, and other incoming messages, which represent the optimal expectation [51].
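The error-driven update described above can be made concrete with a minimal scalar sketch. It is a simple delta-rule illustration under stated assumptions, not the model of [50,51]: a single number stands in for the current expectation, another for the optimal expectation, and the gain `rate`, the function `update_expectation` and all numerical values are hypothetical.

```python
def update_expectation(current, optimal, rate=0.5):
    """One update step: move the current expectation toward the optimal one.

    `rate` is a hypothetical gain, not a value taken from the text.
    """
    error = optimal - current          # prediction error signal
    return current + rate * error, error


current_expectation = 0.0   # hypothetical starting expectation
optimal_expectation = 1.0   # hypothetical message from the internal model
errors = []
for _ in range(10):
    current_expectation, err = update_expectation(
        current_expectation, optimal_expectation
    )
    errors.append(abs(err))

print(errors[0], errors[-1], current_expectation)
```

In this sketch each step shrinks the error, so the two expectations come to agree; in the terms used above, the states carried in the feedforward and feedback directions become more predictable from one another.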

Increase in joint probability of a group of neurons: A result of external physical constraint:
If a visual stimulus is present in the receptive fields of a specific set of neurons, then all neurons in this set would become active simultaneously, even if those neurons are not connected by synapses or gap junctions. In this case, the temporal correlation of neuronal activities is not due to the presence of synapses, but to an external constraint, namely the simultaneous presence of stimuli in the receptive fields of these neurons. Such temporal coupling was observed in a study in which neurons in early visual areas (V1 and V2) fired synchronously at 40 Hz when the visual stimulus was simultaneously present in their receptive fields [31]. Similarly, during a motor interaction, motor circuits for common muscles may be activated together in both hemispheres without mutual synaptic connections. Again, this is the result of an external constraint.
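The argument that a shared external stimulus alone can raise the joint probability of activity in unconnected neurons can be checked with a small exact calculation. The sketch below assumes two binary neurons that are conditionally independent given the stimulus; the spike probabilities (0.9 with the stimulus, 0.1 without) and the function names `joint_from_common_stimulus` and `mutual_information` are hypothetical and chosen only for illustration.

```python
import math


def mutual_information(joint):
    """Mutual information in bits for a 2x2 joint table joint[x][y]."""
    px = [sum(row) for row in joint]
    py = [sum(joint[x][y] for x in range(2)) for y in range(2)]
    total = 0.0
    for x in range(2):
        for y in range(2):
            p = joint[x][y]
            if p > 0.0:
                total += p * math.log2(p / (px[x] * py[y]))
    return total


def joint_from_common_stimulus(p_stim, p_spike_on, p_spike_off):
    """Joint spike table of two unconnected neurons sharing one stimulus.

    The neurons are conditionally independent given the stimulus, so any
    dependence between them comes only from the shared external input.
    """
    joint = [[0.0, 0.0], [0.0, 0.0]]
    for p_s, p_spike in ((p_stim, p_spike_on), (1.0 - p_stim, p_spike_off)):
        p_state = [1.0 - p_spike, p_spike]   # index 0 = silent, 1 = spike
        for x in range(2):
            for y in range(2):
                joint[x][y] += p_s * p_state[x] * p_state[y]
    return joint


# Hypothetical numbers: stimulus present half the time, 0.9 vs 0.1 spike probability.
shared = joint_from_common_stimulus(0.5, 0.9, 0.1)
# Same neurons when the stimulus does not change their firing probability.
unshared = joint_from_common_stimulus(0.5, 0.5, 0.5)
print(mutual_information(shared), mutual_information(unshared))
```

Under these assumptions the mutual information between the two neurons is positive only when the stimulus biases both firing probabilities, even though no term in the calculation represents a synaptic connection between them.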

Role of phase-resetting of neural oscillations in the temporal processing of information
Sparse coding, which codes sensory and motor information, results from the synchronous activity of neurons in a large area of the brain [22]. In one plausible mechanism, the probability function controlling sparse activity depends on low-frequency oscillations, the phase of which can be reset by a sensory stimulus or internal cue [26,54]. According to the modular clock mechanism proposed by Gupta [28], timing information in neural circuits is calibrated as a result of sensory and motor interaction with the physical surroundings. Stimulus-induced phase resetting is likely to be an important mechanism responsible for the calibration of timing information in neural circuits during sensory interaction with the physical environment. Moreover, stimulus-induced phase-shifting of low-frequency oscillations will affect the timing as well as the amount and pattern of Shannon information generated, playing an important role in producing a favorable outcome during an interaction with the environment.
Furthermore, experimental evidence showing the synchronization of neuronal activity in the visual areas of the cat when presented with optimally aligned bars in corresponding receptive fields [31] is consistent with a stimulus-induced phase shift. The stimulus-induced phase synchronization, a neural event corresponding to the presentation of a stimulus, would lead to the generation of information according to the mechanism that involves ramping activity as outlined above (Figure 2). This would associate the time of an external event, the presentation of stimuli, with the time of production of surprisal information, generating a representation of the physical time-coordinate. Generation of an internal representation of the physical time-coordinate, plotted by the processing of several sensory stimuli at separate physical time-points, will help to input physical time information into neural circuits that process timing functions, as proposed by Gupta [28]. We further argue that the production of surprisal information in separate circuits that process different features synchronously is responsible for the feature binding resulting from stimulus-induced synchronization independently in small areas of the brain, even when the EEG is desynchronized [55].
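Stimulus-induced phase resetting can be illustrated with a minimal sketch in which a population of oscillators with scattered phases is pulled toward a common phase by a stimulus. This is an idealized illustration, not a model taken from the cited studies: the reset rule, the parameters `reset_phase` and `strength`, the population size and the use of the mean resultant length as a measure of phase locking are all assumptions made here.

```python
import cmath
import math
import random


def phase_locking(phases):
    """Mean resultant length: 0 for scattered phases, 1 for identical phases."""
    return abs(sum(cmath.exp(1j * p) for p in phases)) / len(phases)


def reset_phases(phases, reset_phase=0.0, strength=1.0):
    """Pull every phase toward reset_phase; strength=1.0 is a full reset.

    Both parameters are hypothetical and only illustrate the idea.
    """
    out = []
    for p in phases:
        # Signed angular distance from reset_phase, wrapped to [-pi, pi].
        diff = math.atan2(math.sin(p - reset_phase), math.cos(p - reset_phase))
        out.append(reset_phase + (1.0 - strength) * diff)
    return out


rng = random.Random(0)   # fixed seed so the sketch is reproducible
before = [rng.uniform(-math.pi, math.pi) for _ in range(500)]
after_partial = reset_phases(before, strength=0.5)
after_full = reset_phases(before, strength=1.0)

print(phase_locking(before), phase_locking(after_partial), phase_locking(after_full))
```

In this sketch phase locking rises with the strength of the reset, so the moment of the stimulus becomes a shared temporal reference across the population, which is the property the argument above relies on.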

Perceptual Cycles and Information Processing:
A large body of evidence now suggests that perception in various modalities is discrete rather than continuous in nature [56][57][58]. Experimental data have shown that low-frequency oscillations adjust the excitability of neurons, which is measured as high-frequency oscillations [59], across the occipital, parietal and frontal regions, and that this can predict behavior on a sub-second time scale [60]. The spectral analysis of modulations in perception and attention in different modalities in published studies shows that attention and perception are modulated at frequencies of ~7 and ~10 Hz, respectively [57]. The detection of perceptual cycles is evidence supporting the temporal correlation of neurons in distinct phases of low-frequency oscillations.
Phasic increases in mutual information, which represent certainty about input signals generated internally or by an external stimulus, are consistent with a recent study showing that visuospatial attention is associated with robust and sustained long-range synchronization of cortical oscillations exclusively in the high-alpha (10-14 Hz) frequency band, connecting frontal, parietal and visual regions, observed concurrently with the suppression of low-alpha (6-9 Hz) band oscillations in visual cortex [61]. It is noteworthy that oscillations in the beta range (or high-alpha range) promote local gamma-synchronized activities, whereas low-alpha range oscillations inhibit them [16]. Therefore, this study [61] supports the role of the frontoparietal network in generating mutual information in discrete time intervals during the active phase of high-alpha oscillations. Phasic increases in mutual information, mechanistically supported by high-frequency oscillations nested in low-frequency oscillations, are likely to be responsible for the modulations in behavior in visuomotor tasks. Anatomical data also suggest the presence of structures throughout the brain that can serve as the basis for producing nested oscillations [62]. Several parts of the brain contain helix-like anatomical structures, resolved at the level of cells, consistent with hierarchical processing. Examples include the ventral part of the lemniscus lateralis, locus coeruleus, oculomotor nuclei, amygdala, hippocampus (cornu ammonis 3), and pars compacta and reticulata of the substantia nigra [62].

Summary
In this article, we have argued, based on current research data and an information-theoretic approach, that perception is partly a consequence of robust increases in mutual information between successive circuits across the brain, analyzed by the probability distribution of binary states of neurons, namely spiking and non-spiking. This robust increase in mutual information reflects the increase in certainty in neural circuits about the knowledge of sensory stimuli from the environment or internal sources of inputs. Since the current model provides a basis for generating an internal model of the sensory environment via an increase in mutual information, it supplements predictive coding algorithms for sensory perception. Moreover, the low-dimensional substates that result from high-dimensional processing in the neocortex and are responsible for perceptual functions, as proposed by Singer and Lazar [7], have specific correlation structures, which is also consistent with a reduction in entropy.
The increase in certainty also places constraints on how neural circuits in other areas may be activated, which could lead to specific motor responses or behavioral outcomes, optimizing the outcomes of external tasks. Moreover, mutual information is a symmetrical quantity, which means that if stimulus-biased probability laws increase the certainty of binary states in sensory circuits, they also increase the certainty in motor circuits, given the knowledge about the stimulus, which would lead to stimulus-specific motor responses.
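The symmetry invoked above can be verified numerically for any joint distribution of two binary variables. The sketch below uses a hypothetical joint table for one sensory and one motor neuron (the four probabilities are illustrative, not data) and computes the reduction in uncertainty from each side using the chain rule for entropy.

```python
import math


def entropy(probs):
    """Shannon entropy in bits of a probability list."""
    return -sum(p * math.log2(p) for p in probs if p > 0.0)


# Hypothetical joint distribution joint[s][m] over binary states of a
# sensory neuron (s) and a motor neuron (m); 1 = spike, 0 = no spike.
joint = [[0.40, 0.10],
         [0.05, 0.45]]

p_sensory = [sum(row) for row in joint]
p_motor = [joint[0][m] + joint[1][m] for m in range(2)]
h_joint = entropy([p for row in joint for p in row])

# Uncertainty left in one circuit once the other is known (chain rule).
h_sensory_given_motor = h_joint - entropy(p_motor)
h_motor_given_sensory = h_joint - entropy(p_sensory)

# Reduction of uncertainty seen from each side.
gain_sensory = entropy(p_sensory) - h_sensory_given_motor
gain_motor = entropy(p_motor) - h_motor_given_sensory
print(gain_sensory, gain_motor)
```

Both quantities reduce to the sum of the two marginal entropies minus the joint entropy, so they agree up to floating-point rounding: whatever certainty the motor state provides about the sensory state, the sensory state provides about the motor state.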
While an increase in mutual information tends to reduce the total entropy, other neural events with low probability, such as sparse coding, generate Shannon information that could potentially code behavioral outcomes. However, there is no obvious link between sparse activity, which is associated with high Shannon information, and the complexity of behavioral outcomes, which makes it an interesting topic for future studies. Other important areas for future studies include understanding how Gabor-like information, represented by wavelets called 'logons' by Gabor, encodes perceptual functions [63]. Many studies have shown successful analysis of temporal and perceptual functions of the brain in terms of Gabor-like information [

Figure 3: This schematic depicts the web-like connectivity resulting from the increased joint probability of activity (shown as double-headed arrows) of pairs of neurons in circuits X and Y. Notice that the web-like configurations in A and B are different, depicting the different consequences of differences in stimuli, which affect the probability laws that govern neuronal activities. Moreover, these differences in web-like patterns between the two scenarios can result in differences in the outputs, which would be responsible for differences in motor or behavioral outcomes.

Figure 4: This schematic depicts interactions between the different regions that construct the dorsal and ventral streams, which serve as large populations of neurons with binary states, providing a large entropy. A large entropy can allow robust increases in mutual information via interactions with many circuits, helping to improve the signal to noise ratio (equations 11-13). Also notice that many regions of the dorsal and ventral streams have extensive connections with one another and with the cerebellum.

Table 1
processing the same stimulus or internal source, serves as an index of signal to noise ratio with respect to the perceptual functions related to that stimulus
4. Areas with sparse activity associated with low probability of neuronal activities generate large amounts of Shannon information
Preprints (www.