Life on the Edge: Latching Dynamics in a Potts Neural Network

Chol Jun Kang 1,2, Michelangelo Naim 1,3, Vezha Boboeva 1 and Alessandro Treves 1,4,*
1 Cognitive Neuroscience, SISSA, International School for Advanced Studies, Via Bonomea 265, 34136 Trieste, Italy; ckang@sissa.it (C.J.K.); michelangelonaim@gmail.com (M.N.); vboboeva@sissa.it (V.B.)
2 The Abdus Salam International Centre for Theoretical Physics, Strada Costiera 11, 34151 Trieste, Italy
3 Department of Physics, La Sapienza Università di Roma, Piazzale Aldo Moro, 5, 00185 Roma, Italy
4 Centre for Neural Computation, Norwegian University of Science and Technology, 7491 Trondheim, Norway
* Correspondence: ale@sissa.it; Tel.: +39-040-3787-623


Introduction
How can the human brain produce creative behaviour? Systems neuroscience has mainly focused on the states induced, in particular in the cortex, by external inputs, be these states simple distributions of neuronal activity or more complex dynamical trajectories. It has largely eschewed the question of how such states can be combined into novel sequences that express, rather than the reaction to an external drive, spontaneous cortical dynamics. However, the generation of novel sequences of states drawn from even a finite set has been characterized as the infinitely recursive process deemed to underlie language productivity, as well as other forms of creative cognition [1]. If the individual states, whether fixed points or stereotyped trajectories, are conceptualized as dynamical attractors [2], the cortex can be thought of as engaging in a kind of chaotic saltatory dynamics between such attractors [3]. Attractor dynamics has indeed fascinated theorists, and a major body of work has shown how to make relevant for neuroscience the concepts and analytical tools developed within statistical physics, but the focus has been on compact, homogeneous neural networks [4][5][6][7]. These have been regarded as simplified models of local cortical networks (as well as, e.g., of the CA3 hippocampal field) and have not been analysed in their potential saltatory dynamics, given that it would make no sense to consider local cortical networks as isolated systems. Even in the case of a ground-breaking investigation of putative spatial trajectory planning [8], the hippocampal activity that expressed it was thought not to be entirely endogenous, but rather guided by external inputs, including those representing goals and path integration. Therefore, formal analyses of model networks endowed with attractor dynamics have been largely confined to the simple paradigm of cued retrieval from memory. Attempts have been made to explore methodologies to study mechanisms beyond simple cued retrieval [9,10], for example those involved in drawing, confabulation, thought processes in general, and language, which are all considered to be largely independent of external stimuli, at their core, and to combine generativity with recursion [11][12][13][14][15][16].
Potts neural networks, on the other hand, originally studied merely as a variant of mathematical or potentially applied interest [17][18][19][20][21], offer one approach to model spontaneous dynamics in extended cortical systems, in particular if simple mechanisms of temporal adaptation are taken into account [22]. They can be subject to rigorous analyses of, e.g., their storage capacity [23] and the mechanics of saltatory transitions between states [24], and are amenable to a description in terms of distinct "thermodynamic" phases [25,26]. The dynamic modification of thresholds with timescales separate from that of retrieval, i.e., temporal adaptation, together with the correlation between cortical states, are key features characterizing cortical operations, and Potts network models may help elucidate their roles. Adaptation and its role in semantic priming [27] have been linked to the instability manifested in schizophrenia [28].
The Potts description is admittedly an oversimplified effective model for an underlying two-level auto-associative memory network [29]. The even more drastically simplified model of latching dynamics considered by the Tsodyks group [30,31], however, has afforded spectacular success in explaining the scaling laws obtained for free recall in experiments performed 50 years ago. The Potts model may be relevant to a wide set of behaviours and to related experimental measures, once the correspondence between model parameters and the quantities characterizing the underlying two-level network is elucidated. We elaborate on this correspondence in a separate study [32]. Here, we ask: when does the Potts network latch?

The Model
We consider an attractor neural network model comprised of Potts units, as depicted in Figure 1. The rationale for the model is that each unit represents a local network of many neurons with its own attractor dynamics [4,6], but in a simplified/integrated manner, regardless of detailed local dynamics. Local attractor states are represented by $S + 1$ Potts states: $S$ active ones and one quiescent state (intended to describe a situation of no retrieval in the local network). We call this autoassociative network of Potts units a Potts network, and refer to our earlier studies of some of its properties [22][23][24][25][33]. The "synaptic" connection between two Potts units is in fact a tensor summarizing the effect of very many actual connections between neurons in the two local networks but, still following the Hebbian learning rule [34], we write the connection weight between unit $i$ in state $k$ and unit $j$ in state $l$ as [23]

$$J_{ij}^{kl} = \frac{c_{ij}}{C a (1 - a/S)} \sum_{\mu=1}^{p} \left(\delta_{\xi_i^\mu, k} - \frac{a}{S}\right) \left(\delta_{\xi_j^\mu, l} - \frac{a}{S}\right) (1 - \delta_{k,0}) (1 - \delta_{l,0}), \qquad (1)$$

where $c_{ij}$ is 1 if two units $i$ and $j$ have a connection and 0 otherwise, $C$ is the average number of connections per unit, $a$ is the sparsity parameter, i.e., the fraction of active units in every stored global activity pattern ($\{\xi_i^\mu\}$, $\mu = 1, \ldots, p$), and $p$ is the number of stored patterns. The last two delta functions imply that the learned connection matrix does not affect the quiescent states. We will use the indices $i, j$ for units, $k, l$ for states and $\mu, \nu$ for patterns. Units are updated in the following way:

$$\sigma_i^k = \frac{\exp(\beta r_i^k)}{\sum_{l=1}^{S} \exp(\beta r_i^l) + \exp[\beta(\theta_i^0 + U)]} \qquad (2)$$

and

$$\sigma_i^0 = \frac{\exp[\beta(\theta_i^0 + U)]}{\sum_{l=1}^{S} \exp(\beta r_i^l) + \exp[\beta(\theta_i^0 + U)]}, \qquad (3)$$

where $r_i^k$ is the input to (active) state $k$ of unit $i$ integrated over a time scale $\tau_1$, while $U$ and $\theta_i^0$ are, respectively, the constant and time-varying component of the effective overall threshold for unit $i$, which in practice act as inverse thresholds on its quiescent state. $\theta_i^0$ varies with time constant $\tau_3$, to describe local network adaptation and inhibitory effects. The stiffness of the local dynamics is parametrized by the inverse "temperature" $\beta$ (or $T^{-1}$), which is then distinct from the standard notion of thermodynamic noise. The input-output relations (2) and (3) ensure that $\sum_{k=0}^{S} \sigma_i^k \equiv 1$. In addition to the overall threshold, $\theta_i^k$ is the threshold for unit $i$ specific to state $k$, and it varies with time constant $\tau_2$, representing adaptation of the individual neurons active in that state, i.e., their neural or even synaptic fatigue. The time evolution of the network is then governed by equations that include three distinct time constants:

$$\tau_1 \frac{d r_i^k(t)}{dt} = h_i^k(t) - \theta_i^k(t) - r_i^k(t), \qquad (4)$$

$$\tau_2 \frac{d \theta_i^k(t)}{dt} = \sigma_i^k(t) - \theta_i^k(t), \qquad (5)$$

$$\tau_3 \frac{d \theta_i^0(t)}{dt} = \sum_{k=1}^{S} \sigma_i^k(t) - \theta_i^0(t), \qquad (6)$$

where the field that the unit $i$ in state $k$ experiences reads

$$h_i^k = \sum_{j \neq i}^{N} \sum_{l=1}^{S} J_{ij}^{kl} \sigma_j^l + w \left(\sigma_i^k - \frac{1}{S} \sum_{l=1}^{S} \sigma_i^l\right). \qquad (7)$$

The "local feedback term" $w$ is a parameter, first introduced in [25], that modulates the inherent stability of Potts states, i.e., that of local attractors in the underlying network model. It helps the network converge to an attractor faster by giving positive feedback to the most active states, and so it effectively deepens their basins of attraction. Note that, in this formulation, feedback is effectively spread over (at least) three time scales: $w$ is positive feedback mediated by collective attractor effects at the neural activity time scale $\tau_1$, $\theta_i^k$ is negative feedback mediated by fatigue at the slower time scale $\tau_2$, while $\theta_i^0$ is also negative, and it can be used to model both fast and slow inhibition; for analytical clarity, we consider the two options separately, as the "slowly adapting regime", with $\tau_3 > \tau_2$, and the "fast adapting regime", with $\tau_3 < \tau_1$. It would be easy, of course, to introduce additional time scales, for example by distinguishing a component of $\theta_i^0$ that varies rapidly from one that varies slowly, but it would greatly complicate the observations presented in the following.
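As an illustration of the update scheme just described, the following is a minimal sketch of one forward-Euler step of the network dynamics. It assumes a softmax input-output relation over the $S$ active states plus the quiescent state, and first-order relaxation of $r_i^k$, $\theta_i^k$ and $\theta_i^0$ with time constants $\tau_1$, $\tau_2$ and $\tau_3$. The function names, the dense array layout `J[i, k, j, l]` and the step size `dt` are choices made for this sketch only, not a description of the simulation code used for the results.

```python
import numpy as np

def potts_activity(r, theta0, U, beta):
    """Softmax over S active states plus one quiescent state (assumed form).

    r: (N, S) integrated inputs; theta0: (N,) generic thresholds.
    Returns (sigma_active of shape (N, S), sigma_quiescent of shape (N,)).
    """
    z_active = beta * r
    z_quiet = beta * (theta0 + U)
    # Subtract the row-wise maximum before exponentiating, for numerical stability.
    z_max = np.maximum(z_active.max(axis=1), z_quiet)
    e_active = np.exp(z_active - z_max[:, None])
    e_quiet = np.exp(z_quiet - z_max)
    norm = e_active.sum(axis=1) + e_quiet
    return e_active / norm[:, None], e_quiet / norm

def euler_step(r, theta, theta0, J, w, U, beta, tau1, tau2, tau3, dt=1.0):
    """One forward-Euler step; J[i, k, j, l] must vanish for j == i."""
    sigma, _ = potts_activity(r, theta0, U, beta)
    # Field: recurrent input plus local feedback w * (sigma_i^k - mean over active states).
    h = np.einsum('ikjl,jl->ik', J, sigma)
    h += w * (sigma - sigma.mean(axis=1, keepdims=True))
    r_new = r + dt / tau1 * (h - theta - r)                           # fast: activity
    theta_new = theta + dt / tau2 * (sigma - theta)                   # state-specific fatigue
    theta0_new = theta0 + dt / tau3 * (sigma.sum(axis=1) - theta0)    # generic threshold
    return r_new, theta_new, theta0_new
```

With `tau3` much larger than `tau2` the generic threshold barely moves over a latching step, while with `tau3` smaller than `tau1` it tracks the total activity of each unit almost immediately, which is the distinction between the two regimes.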
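The overlap or correlation of the activity state of the network with the global memory pattern $\mu$ can be measured as

$$m_\mu = \frac{1}{N a (1 - a/S)} \sum_{i=1}^{N} \sum_{k=1}^{S} \left(\delta_{\xi_i^\mu, k} - \frac{a}{S}\right) \sigma_i^k. \qquad (8)$$

Randomly correlated memory patterns are generated according to the following probability distribution

$$P(\xi_i^\mu = k) = \frac{a}{S}, \qquad P(\xi_i^\mu = 0) = 1 - a, \qquad (9)$$

while correlated patterns are generated by the multi-parent algorithm sketched in [22], which will be discussed in a separate study [35].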
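The pattern statistics and the overlap measure can be illustrated with a short sketch. It assumes that each unit of a random pattern is quiescent with probability $1 - a$ and otherwise in one of the $S$ active states with equal probability, and that the overlap is the covariance-like sum of $(\delta_{\xi_i^\mu, k} - a/S)\,\sigma_i^k$ normalized by $N a (1 - a/S)$. The function names and the integer coding (0 for the quiescent state, 1 to $S$ for active states) are hypothetical choices for this example.

```python
import numpy as np

def random_potts_patterns(p, N, S, a, rng):
    """xi[mu, i] in {0, ..., S}: 0 is quiescent (prob. 1 - a), each active state has prob. a / S."""
    active = rng.random((p, N)) < a
    states = rng.integers(1, S + 1, size=(p, N))  # upper bound is exclusive
    return np.where(active, states, 0)

def overlaps(sigma, xi, a):
    """Overlap of the activity sigma (N, S) with each of the p patterns xi (p, N)."""
    N, S = sigma.shape
    # One-hot encoding of the active state of each unit in each pattern: shape (p, N, S).
    onehot = (xi[:, :, None] == np.arange(1, S + 1)).astype(float)
    return np.einsum('mik,ik->m', onehot - a / S, sigma) / (N * a * (1 - a / S))
```

Note that with this generator the fraction of active units equals $a$ only on average; a state that coincides exactly with one pattern then has overlap $n_{\mathrm{active}} / (N a)$ with it, which is close to but not exactly 1.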

Results
When does robust latching, as a model of spontaneous sequence generation, occur? We address this question with extensive computer simulations, mostly focused on latching between randomly correlated patterns. We consider first the slowly adapting regime ($\tau_1 \ll \tau_2 \ll \tau_3$) in which active states ($\tau_2$) adapt slower than activity propagation to other units ($\tau_1$), while inhibitory feedback is restricted to an even slower timescale, $\tau_3$. Next, we contrast with it the fast adapting regime ($\tau_3 \ll \tau_1 \ll \tau_2$) in which, instead, inhibitory feedback is immediate, relative to the other two time scales.
The critical parameters at play are the number of patterns, $p$, the number of active states, $S$, and the number of connections per unit, $C$, and we also look at the effect of the feedback term $w$. The other parameters, including $T$, $\tau_1$, $\tau_2$, and $\tau_3$, are kept fixed during simulations, after having chosen a priori values that can lead to robust latching dynamics in the two regimes.

Slowly Adapting Regime
In the slowly adapting regime, over a (short) time of order $\tau_1$ the network, if suitably cued, may reach one of the global attractors, and stay there for a while; whereupon, after an adaptation time of order $\tau_2$, it may latch to another attractor, or else activity may die [25]. However, how distinct is the convergence to the new attractor? One may assess this as the difference between the two highest overlaps the network activity has, at time $t$, with any of the memory patterns, $m_1(t) - m_2(t)$: ideally, $m_1 \simeq 1$ and $m_2$ is small, so their difference approaches unity. A summary measure of memory pattern discrimination, $d_{12}$, can be defined by averaging this difference along the latching sequence, where, of course, the identity of patterns 1 and 2 changes over the sequence.
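The difference between the two highest overlaps is simple to extract from a simulated trace. The sketch below assumes the overlaps are stored as an array with one row per time step and one column per pattern; taking a plain time average as the summary measure is an assumption of this example, since the exact averaging window (and any average over cues) is not restated here.

```python
import numpy as np

def top_two_gap(m):
    """m: (T, p) overlaps over time, p >= 2. Returns m_1(t) - m_2(t) at each time step."""
    # After partitioning at position -2, the last two columns hold the two largest
    # values of each row, with the largest one in the last column.
    part = np.partition(m, -2, axis=1)
    return part[:, -1] - part[:, -2]

def discrimination(m):
    """Time-averaged gap between the two highest overlaps (assumed summary measure)."""
    return float(top_two_gap(m).mean())
```

Because the two largest overlaps are selected afresh at every time step, the identity of "pattern 1" and "pattern 2" is free to change along the sequence.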
As discussed in [25], by looking at the latching length, i.e., how long a simulation runs before, if ever, the network falls into the global quiescent state, one can distinguish several "phases". Depending on the parameters, the dynamics exhibit finite or infinite latching behaviour, or no latching at all. Typically, when increasing the storage load $p$, the latching sequence is prolonged and eventually extends indefinitely, but, at the same time, its distinctiveness decreases, since memory patterns cannot be individually retrieved beyond the storage capacity; and, even before, each acquires neighbouring patterns, in the finite and more crowded pattern space, with which it is too correlated to be well discriminated.
In Figure 2, we see that, for each $S = (2, 3, 4)$, as $p$ is increased beyond a certain value, latching dynamics rapidly picks up and extends eventually through the whole simulation, but, in parallel, its discriminative ability decreases and almost vanishes: the $p$-range where $d_{12}$ is large is in fact when there is no latching, and $d_{12}$ only measures the quality of the initial cued retrieval. For $S = 1$ no significant latching sequence is seen, whereas for higher values, at fixed $p$, its distinctiveness increases with $S$, but its length decreases from the peak value at $S = 2$. Since the latching length $l$ is not itself sufficient to characterize latching and has to be complemented by discriminative ability, we find it convenient to quantify the overall quality of latching with a new quantity $Q$, which combines $d_{12}$ and $l$ with a factor $\eta$. The factor $\eta$ is introduced to exclude cases in which the network gets stuck in the initial cued pattern, so that no latching occurs, however high $d_{12}$ and $l$ are: $\eta = 1$ if at least one transition to a second memory pattern occurs, and $\eta = 0$ otherwise.
$Q$ is therefore a positive real number between 0 and 1, and we report its color-coded value to delineate the relevant phases in phase space. Thus, low quality latching with small $Q$ may result from either small $d_{12}$ or short $l$, or both. The parameters that determine $Q$ which we focus on are $S$, $C$ and $p$, after having suitably chosen all the other parameters, which are kept fixed. Their default values in the slowly adapting regime are $N = 1000$, $a = 0.25$, $U = 0.1$, $T = 0.09$, $w = 0.8$, $\tau_1 = 3.3$, $\tau_2 = 100.0$, $\tau_3 = 10^6$, unless explicitly noted otherwise. If activity does not die out before, simulations are terminated after $N_{\mathrm{update}} = 6 \times 10^5$ steps, the total number of updates of the entire Potts network, and are repeated with different cued patterns. Regarding the values of $S$, $C$ and $p$, we write, for simplicity, $Q(S, p)$ and $Q(C, p)$ for the quality measured across the $S$-$p$ and $C$-$p$ planes.

Figure 3 shows that there are narrow regions in the $S$-$p$ and $C$-$p$ planes, which we call bands, where relatively high quality latching occurs. The values of $p$ with the "best" latching scale almost quadratically in $S$, and sublinearly in $C$. Moreover, one notices that, below certain values of $S$ and $C$, no latching is seen, i.e., the band effectively ends at $S \sim 2$, $p \sim 90$ in Figure 3a and at $C \sim 50$, $p \sim 70$ in Figure 3b. Importantly, the band in Figure 3a is confined to the area delimited by the cyan solid and dashed curves above and below it. The dashed curve is for the onset of latching, i.e., the phase transition to finite latching [25], while the solid curve above is the storage capacity curve in a diluted network, given by an approximate relation (Equation (13)) beyond which retrieval fails [25]. It should also be noted that overall $Q$ values are not large, in fact well below 0.5 throughout both $S$-$p$ and $C$-$p$ planes. The reason lies, again, in the conflicting requirements of persistent latching, favoured by dense storage (high $p$), and good retrieval, allowed instead only at low storage loads (in practice, relatively low $p/S^2$ and $p/C$ values).

In Figure 4, we show representative latching dynamics at three selected points in the $(S, p)$ plane, in terms of the time evolution of the overlap of the states with the stored activity patterns (see Equation (8)). The three points, marked in red, span the band in Figure 3a, and we see that latching is indefinite but noisy in the example at (5, 250), which is apparently too close to storage capacity, while memory retrieval is good at (7, 150), but the sequence of states ends abruptly, as the network is in the phase of finite latching [25]. The two trends are representative of the two sides of the band, while in the middle, at (6, 200), one finds a reasonable trade-off, with relatively good retrieval combined with protracted latching.
We use two statistical measures, the asymmetry of the transition probability matrix and Shannon's information entropy [33,36,37], to characterize the essential features of the dynamics in different parameter regions. For that, we take all five red points from Figure 3a, such that they cut across the latching band in the $S$-$p$ plane, and extend further upwards. We first compile a transition probability (or rather, frequency) matrix $M$ from all distinct transitions observed along many latching sequences generated with the same set of stored patterns, as in [33]. The dimension of the matrix $M$ is $(p + 1) \times (p + 1)$, as it includes all possible transitions between $p$ patterns plus the global quiescent state. $M$ is constructed from the transitions between states having both overlaps above a given threshold value, e.g., 0.5, in a data set of 1000 latching sequences, by accumulating their frequency between any two patterns into each element of the matrix and then normalizing to 1 row by row, so that $M_{\mu\nu}$ reflects the probability of a transition from pattern $\mu$ to $\nu$. $A$, the degree of asymmetry of $M$, is defined as

$$A = \frac{\lVert M - M^T \rVert}{\lVert M \rVert}, \qquad (14)$$

where $M^T$ is the transpose matrix of $M$ and $\lVert M \rVert = \sum_{\mu,\nu} |M_{\mu\nu}|$. Note that $A$ is small for unconstrained bi-directional dynamics and large for simpler stereotyped flows among global patterns, attaining its maximum value $A = 2$ for strictly uni-directional transitions. Note also that if the average had been taken over different realizations of the memory patterns, given sufficient statistics $A$ would obviously vanish.
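The construction of the transition matrix and of its asymmetry can be sketched in a few lines. The example assumes that each latching sequence has already been reduced to a list of integer labels of the successively retrieved patterns (those passing the overlap threshold), with labels 0 to $p - 1$ for patterns and the label $p$ reserved for the global quiescent state; this labelling and the function names are conventions of the sketch only.

```python
import numpy as np

def transition_matrix(sequences, p):
    """Row-normalized transition frequencies among p patterns plus the quiescent state."""
    M = np.zeros((p + 1, p + 1))
    for seq in sequences:
        # Count every observed transition mu -> nu between consecutive states.
        for mu, nu in zip(seq[:-1], seq[1:]):
            M[mu, nu] += 1.0
    row_sums = M.sum(axis=1, keepdims=True)
    # Normalize each visited row to 1; rows never visited are left at zero.
    np.divide(M, row_sums, out=M, where=row_sums > 0)
    return M

def asymmetry(M):
    """A = ||M - M^T|| / ||M||, with ||.|| the sum of absolute values of all entries."""
    return float(np.abs(M - M.T).sum() / np.abs(M).sum())
```

For a strictly uni-directional chain such as `[[0, 1, 2]]` every nonzero entry of $M$ faces a zero in $M^T$, so the numerator is twice the denominator and $A = 2$; for a sequence bouncing back and forth between two patterns, $A = 0$. The function `asymmetry` is undefined for a matrix with no transitions at all.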
Another measure we apply to the transition matrix $M$ is Shannon's information entropy, defined as

$$I_\mu = -\frac{1}{\log_2 (p + 1)} \left\langle \sum_{\nu} M_{\mu\nu} \log_2 M_{\mu\nu} \right\rangle_\mu. \qquad (15)$$

$I_\mu$ takes positive real values from 0 (deterministic, all transitions from one state are to a single other state) to 1 (completely random), since it is normalized by $\log_2 (p + 1)$, which corresponds to a completely random case.
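A possible implementation of the normalized entropy of the transition matrix is sketched below. It assumes that the entropy of each row of $M$ is computed separately and then averaged over rows, and, as a choice specific to this sketch, that rows with no observed transitions are left out of the average.

```python
import numpy as np

def normalized_entropy(M):
    """Row-averaged Shannon entropy of M, normalized by log2 of the number of states."""
    n_states = M.shape[0]  # p + 1, including the global quiescent state
    rows = M[M.sum(axis=1) > 0]  # keep only rows with observed transitions (assumption)
    logs = np.zeros_like(rows)
    np.log2(rows, out=logs, where=rows > 0)  # treat 0 * log2(0) as 0
    row_entropy = -(rows * logs).sum(axis=1)
    return float(row_entropy.mean() / np.log2(n_states))
```

A matrix whose rows are all uniform gives 1, and a matrix in which every state leads to a single successor gives 0, matching the two limits described in the text.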
We use these two measures, $A$ and $I_\mu$, on the points marked red in Figure 3a, (3, 350), (4, 300), (5, 250), (6, 200), (7, 150), which lie on a segment going through the latching band observed in the slowly adapting regime. If we focus on transitions between states reaching at least a threshold overlap of 0.5, Figure 5 appears to show two complementary, almost opposite U-shaped curves as the two measures, asymmetry and entropy, are applied to the five points along the segment. One branch of each U shape extends over the range that includes the high-$Q$ latching band: these are the right branches of the two curves, in which asymmetry decreases from a large value $A \simeq 1.6$ at (7, 150) to a smaller one $A \simeq 0.6$ at (5, 250), while concurrently the entropy increases from $I_\mu < 0.5$ at (7, 150) to $I_\mu > 0.8$ at (5, 250). As Figure 4 indicates, at (7, 150), latching sequences are distinct but very short, and few entries are filled in the transition matrix: generally either $M_{\mu\nu} = 0$ or $M_{\nu\mu} = 0$, so that asymmetry is high and entropy relatively low. This holds irrespective of the number of sequences that are averaged over. The opposite happens at (5, 250), where many transitions are observed, and in filling the transition matrix they approach the random limit. The point with the highest $Q$-value, (6, 200), is characterized by intermediate values of asymmetry and entropy which, we have previously observed, may be seen as a signature of complex dynamics [33]. Extending the range upwards, it seems as if the asymmetry, with threshold 0.5, were to eventually increase again, reaching its maximum $A = 2$ at (3, 350), with a decreasing entropy, vanishing at the same point (3, 350). These left branches are, however, dependent on the threshold values used, as Figure 5 shows, and do not imply that transitions become more deterministic because, in this region, there are simply fewer and fewer distinct transitions discernible above the noise (Figure 4). The left branches merely reflect the increasing arbitrariness with which one can identify significant correlations with memory states in the rambling dynamics observed at higher storage loads.

In Figure 6, we see that the effect of the local feedback term, $w$, is first to enable latching sequences of reasonable quality, and then to also shift the latching band to higher values of $S$, effectively pushing this behaviour away from the storage capacity curve representing the retrieval capability of the Potts associative network. Hence, if one were to regard $S$ as a structural parameter of the network, and $w$ as a parameter that can be tuned, there is an optimal range of $w$ values that allows good quality latching for higher storage. This argument has to be revised, however, by considering also the threshold $U$, since increasing $w$ can be shown to be functionally equivalent, in terms of storage capacity, to decreasing $U$ [32]. Also for $U$, in fact, one can find an optimal range for associative retrieval to occur, in the simple Potts network with no adaptation and with $w = 0$ [23]. This near equivalence between $U$ and $-w$ does not hold anymore in the fast adapting regime, to which we turn next.

Fast Adapting Regime
We characterize the fast adapting regime by the alternative ordering of time scales $\tau_3 < \tau_1 \ll \tau_2$, such that the mean activity in each Potts unit is rapidly regulated by fast inhibition, at the time scale $\tau_3$. Equation (6) stipulates that $\sum_{k=1}^{S} \sigma_i^k(t)$, the total activity of each unit, is followed almost immediately, or more precisely at speed $\tau_3^{-1}$, by the generic threshold $\theta_i^0(t)$. Extensive simulations, with the same parameters as for the slowly adapting regime, except for $w = 1.37$, $\tau_1 = 20$, $\tau_2 = 200$ and $\tau_3 = 10$, show that, similarly to the slowly adapting regime, there are latching bands in the $Q(S, p)$ and $Q(C, p)$ planes (see Figure 7). With these parameters, in particular the larger value chosen for the feedback term $w$, the bands occupy a similar position as in the slowly adapting regime. Again, they appear to vanish below certain values of $S$ and $C$, more precisely around $S \sim 3$, $p \sim 120$ in Figure 7a and around $C \sim 50$, $p \sim 90$ in Figure 7b, and to scale subquadratically in $S$ and sublinearly in $C$.
The band in the $S$-$p$ plane is again confined by the storage capacity (solid cyan curve) and by the onset of (finite) latching (dashed curve). The storage capacity curve, which is independent of threshold adaptation, follows the same Equation (13). Examples of latching behaviour outside and inside the band are presented in Figure 8, at the same values for $S$ but shifted by $\Delta p = 100$, i.e., at the "red" points (5, 350), (6, 300), and (7, 250) in the $S$-$p$ plane. Again, we see from Figure 7a that (5, 350) lies just above the band, while (6, 300) is right on the centre. To the right of the band, e.g., at (7, 250), the transitions are distinct but latching dies out very soon, while on the left, e.g., at (5, 350), the progressively reduced overlaps are a manifestation of increasingly noisy retrieval dynamics. In all three examples, we observe that latching steps proceed slowly, even slower than the doubled time scale $\tau_2 = 200$ would have led one to predict. This appears to be because often a significant time elapses between the decay of the overlap of the network with one pattern and the emergence of a new one.
Figure 9 shows the asymmetry and entropy measures, $A$ and $I_\mu$, along the points (4, 400), (5, 350), (6, 300), (7, 250), (8, 200) in Figure 7a, where, again, we have chosen a series shifted by $\Delta p = 100$ upwards in order to centre it better on the high quality latching band. Only an overlap threshold of 0.5 is considered. What one can see, in contrast with the slowly adapting regime, is that now the two measures are not quite complementary. The point (6, 300) that lies inside the band, very much at its quality peak, shows again an intermediate value for the asymmetry, but the highest value, given the overlap threshold, for the entropy. The discrepancy may be ascribed to the different prevailing type of latching transition observed in the fast adapting regime (Figure 8). As discussed in [24], in a Potts network latching transitions with a high cross-over, which can only occur between memory patterns with a certain degree of correlation, can be distinguished from those with a vanishing cross-over, which are much more random. In the fast adapting regime, as indicated by the examples in Figure 8, all transitions tend to be of the latter type. A more careful analysis indicates, in fact, that they are quasi-random, in that they avoid a memory pattern in which largely the same Potts units are active as in the preceding pattern. In fact, the value of the entropy at (6, 300) implies that on average from each of the 300 memory patterns there are transitions to at least 190 other patterns (to 190 if they were equiprobable, in practice many more); therefore, only the few patterns that happen to be more (spatially) correlated are avoided. Towards the left, the curves do not vary much depending on the threshold chosen for the overlaps, but the asymmetry eventually becomes maximal and the entropy vanishes simply because sequences of robustly retrieved patterns do not last long, so, in this particular case, it would take more than 1000 sequences to accumulate sufficient statistics.
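The step from an entropy value to a number of target patterns can be made explicit. If the row entropy is normalized by $\log_2 (p + 1)$, then a row spread evenly over $n$ targets has normalized entropy $\log n / \log (p + 1)$, and any other distribution with the same entropy must involve more than $n$ targets. The helper names below are hypothetical, and the entropy value they return for the (6, 300) point is inferred here from the quoted figure of 190 targets rather than read from the data.

```python
import math

def equiprobable_targets(entropy, p):
    """Number of equally likely targets that yields the given normalized entropy."""
    return (p + 1) ** entropy

def entropy_for_targets(n_targets, p):
    """Normalized entropy of a row spread evenly over n_targets out of p + 1 states."""
    return math.log(n_targets) / math.log(p + 1)
```

With `p = 300`, `entropy_for_targets(190, 300)` is roughly 0.92, so the statement about at least 190 targets corresponds to a normalized entropy of about that size.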
The effects of increasing the $w$ term in the fast adapting regime are shown in Figure 10, where one notices two main features. First, there is heightened sensitivity to the exact value of $w$, so that relatively close data points at $w = 1.33$, 1.37, 1.41, and 1.45 yield rather different pictures. Second, although again increasing $w$ shifts the latching band rightward, by far the main effect is a widening of the band itself. This is because in the presence of rapid feedback inhibition a larger $w$ term ceases to be functionally similar to a lower threshold, which in the slowly adapting regime was leading in turn to noisier dynamics and eventually indiscernible transitions. In the fast adapting regime, the increased positive feedback can be rapidly compensated by inhibitory feedback, so that in the high-storage region overlaps remain large, until they are suppressed by storage capacity constraints (the cyan curve, which remains at approximately the same distance from the larger and larger latching band).
We now turn to a more explicit comparison of the transition dynamics in the two regimes.

Comparison of Two Regimes
To look more closely at latching dynamics in the slowly and fast adapting regimes, we take the following points from Figures 3a and 7a, which allow us to cut through the bands at two different storage levels:

$$p = 200,\ S = (4, 5, 6, 7); \qquad p = 400,\ S = (6, 7, 8, 9). \qquad (16)$$

Figure 11 shows in different colors the overlaps of the state of the network with the global patterns, for sample sequences along the points (16), in the slowly adapting regime. For both $p = 200$ and 400, latching length is observed to decrease with $S$, unlike the discrimination between patterns, as measured by $d_{12}$, in agreement with Figure 2. Note that the two rows in the figure are similar, indicating that the shift $\Delta p = 200$ is approximately compensated by the rightward shift $\Delta S = 2$. The fast adapting regime shows the same trends: again one sees in Figure 12 the approximate compensation between the two shifts $\Delta p = 200$ and $\Delta S = 2$, but latching appears in general less noisy. The main difference between the two regimes, however, is in the distribution of crossover values, those when the network has equal overlap with the preceding and the following pattern: their distribution (PDF, or probability density function) is shown in Figures 13 and 14.

We see that, in the fast adapting regime, most transitions occur at very low crossover, i.e., the correlation with the preceding memory has to decay almost to zero before the next memory pattern can be activated. Only in regions of the $(S, p)$ plane where latching sequences are very short, a few transitions only, do we begin to see a small fraction of them with crossover values above 0.2. In most cases, the inhibitory feedback conveyed by the variable $\theta_i^0$ is so fast as not to allow transitions to be carried through by positive correlations, i.e., by the subset of Potts units which are in the same active state in the preceding and successive pattern. The choice of the next pattern is not completely random, as indicated by the relative entropy values still below unity, but is determined essentially by negative selection, as mentioned above: the next pattern tends to have few active Potts units that coincide with those active in the preceding pattern.
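The crossover value of a transition can be read off the overlap traces of the outgoing and incoming patterns. The sketch below assumes the two traces are sampled on the same time grid and locates the first time at which the incoming overlap catches up with the outgoing one, using linear interpolation between samples; both the interpolation and the function name are choices made for this illustration.

```python
import numpy as np

def crossover_value(m_old, m_new):
    """Overlap at the first crossing where m_new overtakes m_old, or None if they never cross."""
    m_old = np.asarray(m_old, dtype=float)
    m_new = np.asarray(m_new, dtype=float)
    diff = m_old - m_new
    # Indices t where the outgoing pattern still leads at t but no longer at t + 1.
    idx = np.where((diff[:-1] > 0) & (diff[1:] <= 0))[0]
    if idx.size == 0:
        return None
    t = idx[0]
    # Fraction of the interval [t, t + 1] at which the two linear segments meet.
    frac = diff[t] / (diff[t] - diff[t + 1])
    return float(m_old[t] + frac * (m_old[t + 1] - m_old[t]))
```

Collecting these values over all transitions in many sequences, and histogramming them, gives an estimate of the crossover distribution; values near zero indicate that the old pattern has decayed almost completely before the new one emerges.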
In the slowly adapting regime, instead, due to the slow variation of the non-specific threshold, active Potts units can remain active, but they are encouraged by the variables $\theta_i^k$ to switch between active states if they have been in the same one for too long. This can produce, particularly in the center of the latching band, sequences of patterns succeeding each other at high crossover, as shown by the distribution in Figure 13c. Even when latching is very noisy and approaches randomness, as in Figure 13a,e, crossover values are consistently above 0.2, indicating a preference for patterns insisting on the same set of active Potts units, unlike the fast adapting regime. Finally, when the number of states $S$ is too large or, equivalently, that of patterns $p$ too low, we observe some transitions with minimal crossover and a majority with very large crossover, as if occurring only with those patterns that were already partially retrieved when the network still had the largest overlap with the preceding pattern. The main observation, however, is that there are very few transitions at all, so that to plot a probability density distribution we need to use wide bins, in Figure 13d,h (and in Figure 14d).
This difference between the two regimes is confirmed by an analysis of the correlations between successive patterns in latching sequences. In the Potts network, at least two types of spatial correlation between patterns are relevant: how many active Potts units the two patterns share, and how many of these units are active and in the same state. We quantify them with $C_1$, the fraction of the units active in one pattern that are active also in the other, and in the same state; and with $C_2$, the fraction that are also active, but in a different state. In a large set of randomly determined patterns, the mean values are $C_1 = a/S$ and $C_2 = a(S - 1)/S$. The full distribution, among all pairs, is scattered around these mean values. However, do transitions occur between any pair of patterns?
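The two correlation measures are straightforward to compute for any pair of patterns. The sketch below assumes patterns are stored as integer arrays, with 0 for the quiescent state and 1 to $S$ for active states, and normalizes by the number of units active in the first pattern; the coding and the function name are assumptions of this example.

```python
import numpy as np

def pair_correlations(xi_a, xi_b):
    """Return (C1, C2) for two patterns coded as integers (0 = quiescent)."""
    xi_a = np.asarray(xi_a)
    xi_b = np.asarray(xi_b)
    active_a = xi_a > 0
    both_active = active_a & (xi_b > 0)
    n_a = active_a.sum()
    c1 = (both_active & (xi_a == xi_b)).sum() / n_a  # active in both, same state
    c2 = (both_active & (xi_a != xi_b)).sum() / n_a  # active in both, different state
    return float(c1), float(c2)
```

For two independent random patterns, a unit active in the first is active in the second with probability $a$, and then in the same state with probability $1/S$, which gives the mean values $a/S$ and $a(S - 1)/S$ quoted above.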
Figure 15 shows that relative to the full distribution, in blue, transitions tend to occur, in the slowly adapting regime on the left, only between patterns with $C_1$ above and $C_2$ below (or at most around) their average values. Thus, when the network has retrieved a memory representation, it looks, as it were, for correlated ones to jump to. In the fast adapting regime, this is not the case: transitions are almost random, except there appears to be a slight tendency to avoid those with $C_1$ well above its mean value. Note that the values of $p$ and $w$ are different in the two panels, and are chosen so as to be in roughly equivalent positions within the respective latching bands.
The analysis of the crossover points, therefore, affords insight into the rather different transition dynamics prevailing in the fast and slowly adapting regimes, in particular in the center of their latching bands, suggesting that in a more realistic cortical model, which combines both types of activity regulation, there should still be a significant component of "slow adaptation" for interesting sequences of correlated patterns to emerge. The preceding simulations, however, were all carried out with randomly correlated patterns, in which the occasional high or low correlation of a pair is merely the result of a statistical fluctuation. Does the insight carry over to a more structured model of the correlations among memory patterns? This is what we ask next.

Analysis with Correlated Patterns
Correlated patterns were generated according to the algorithm mentioned in [22] and discussed in detail in [35]. The multi-parent pattern generation algorithm works in three stages. In the first step, a total set of $\Pi$ random patterns is generated to act as parents. In the second step, each parent is assigned to $p_{par}$ randomly chosen children. Then, a "child" pattern is generated: each pattern, receiving the influence of its parents with a probability $a_p$, aligns itself, unit by unit, in the direction of the largest field. In the third and final step, a fraction $a$ of the units with the highest fields is set to become active. In this way, child patterns with a sparsity $a$ are generated. In addition, another parameter $\zeta$ can be defined, according to which the field received by a child pattern is weighted with a factor $\exp(-\zeta k)$, where the index $k$ runs through all parents. This is meant to express a non-homogeneous input from parents.
It is clear that such patterns, however, cannot be considered as independent and identically distributed, as in Equation (9), because their activity is drawn from a common pool of parents. In fact, they are correlated, in the sense that those children receiving congruent input from a larger number of common parents will tend to be more similar. All of these observations are studied in more detail in [35], and here we only focus on how correlations affect the phase diagrams. In the following simulations, the parameters pertaining to the patterns are $a_p = 0.4$, $\Pi = 100$, $\zeta = 0.1$, while $p_{par}/p$, the probability that a pattern be influenced by a parent, is kept constant at 0.277.
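The three stages of the multi-parent procedure can be illustrated with a loose sketch. Several details are assumptions made only to obtain runnable code, since the full algorithm is specified in [35] and not here: the influence of a parent is applied independently to each unit with probability $a_p$; a parent contributes to the field of the state it occupies in that unit, and quiescent parent units contribute nothing; a tiny random term breaks ties; and exactly `round(a * N)` units are activated in each child. All names are hypothetical.

```python
import numpy as np

def multi_parent_patterns(p, N, S, a, n_parents, p_par, a_p, zeta, rng):
    """Loose sketch of multi-parent pattern generation; see the stated assumptions."""
    # Stage 1: random parents, with 0 = quiescent and 1..S = active states.
    parents = np.where(rng.random((n_parents, N)) < a,
                       rng.integers(1, S + 1, size=(n_parents, N)), 0)
    # Stage 2: each parent is assigned to p_par randomly chosen children and adds
    # a field exp(-zeta * k) to the state it occupies, unit by unit.
    field = np.zeros((p, N, S))
    for k in range(n_parents):
        children = rng.choice(p, size=p_par, replace=False)
        for c in children:
            # Assumption: each active parent unit acts on the child with probability a_p.
            units = np.nonzero((rng.random(N) < a_p) & (parents[k] > 0))[0]
            field[c, units, parents[k, units] - 1] += np.exp(-zeta * k)
    field += 1e-6 * rng.random(field.shape)  # tie-breaker (assumption)
    best_state = field.argmax(axis=2) + 1    # state with the largest field per unit
    best_field = field.max(axis=2)
    # Stage 3: the fraction a of units with the highest fields becomes active.
    n_active = int(round(a * N))
    xi = np.zeros((p, N), dtype=int)
    for c in range(p):
        top = np.argsort(best_field[c])[-n_active:]
        xi[c, top] = best_state[c, top]
    return xi
```

If fewer than `n_active` units of a child receive any parental input, the remaining active units are filled by the tie-breaker, i.e., effectively at random. Children sharing several parents end up with many units in the same state, which is the source of the correlations discussed in the text.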
Simulations with correlated patterns were carried out across the same S-p and C-p planes in phase space, in the slowly adapting regime, as shown in Figure 16. We focused on the slowly adapting regime based on the results of the crossover analysis. All other simulation parameters were kept at the values used with randomly correlated patterns. We see from the figure that the presence of non-random correlations among the memory patterns, albeit weak, shifts the bands to the left and upward in phase space, keeping approximately the dependence of the viable storage load p on S and C, but at somewhat higher values. It is as if more memories could "fit", if correlated, into the same latching dynamics.
Figure 17 shows the S-p plane cut along p = 200, to better compare the cases with correlated (blue) and random (red) patterns. It is apparent that there is a leftward shift, in the case of correlated patterns, from the red curve applying to the random case, but the dependence on S remains very similar.

Conclusions
In this paper, we have found the region in the Potts network phase space, spanned by the number of Potts states S, the number of connections per unit C and the storage load p, where latching dynamics occur, and we have described their character, comparing and contrasting the slowly and fast adapting regimes. In relation to our earlier paper [22], where the possibility of such a latching region was pointed out on the basis of limited simulations, we now have a firmer basis to extrapolate to regions of parameter space of relevance to the human cortex, possibly a step toward quantitatively studying human-specific capacities, including creative behaviour. A common hallmark in both regimes is that good quality latching occupies a band which scales almost quadratically in the p-S plane, while it is sublinear in the p-C plane. These bands are bounded by the storage capacity line, above, and by the boundary between no latching and finite latching, below. If, as discussed elsewhere [32], we were to take C ≈ 10^2 and S ≈ 10^2 as the orders of magnitude of interest for the human brain, we would conclude that the relevant storage load, or semantic depth, is in the region p ≈ 10^5, in both regimes. At the center of the band in the slowly adapting regime, asymmetry and entropy take intermediate values, pointing at maximally complex and potentially useful dynamics, intermediate between the deterministic and the random extremes. High crossover values indicate that many transitions occur between highly correlated patterns. Using correlated patterns shifts the position of the band in phase space, while preserving the features observed with random patterns, still in the slowly adapting regime.
In the fast adapting regime, instead, in the center of the band, which can be made wider and more robust, the entropy is higher, and correspondingly only low crossover transitions are observed, indicating that the network latches most of the time from one pattern to any other among the many with which it is weakly or anti-correlated, avoiding only those few with which it is highly correlated.
Therefore, we can conclude that the fast adapting regime, modelling rapid inhibitory feedback, offers a robust framework for latching dynamics, but of an essentially random, not very useful nature; whereas in the slowly adapting regime, modelling slow inhibition or local fatigue, correlations can drive latching transitions, potentially enabling semantic content in a stream of thoughts or linguistic productions, but with fragile dynamics, living at the very edge between memory overload and sequence termination because of the inability of the network to jump forward. This suggests the opportunity of considering models that integrate both fast and slowly adapting dynamics in their non-specific thresholds, so as to combine the useful features of both regimes. It will be the object of future work.
We would like to note, in the end, the inherent limitation of considering a simple homogeneous Potts network, with no differentiation among its units and no internal structure. In order to make contact with cognitive processes, of any kind, this limitation has to be overcome, as perhaps attempted, with one first step among many possible ones, by arranging Potts units on a ring [38]. Nevertheless, even in its crudest form, the Potts network with its latching dynamics can be used to explore, e.g., novel theories as to the evolutionary origin of complex cognition [39]. It establishes a quantitative framework to understand phase transitions [25], complementary to the perspective offered by other modelling approaches to sequence generation in cortical networks [40]. At the most abstract level, it can be considered an implementation of a fuzzy logic system [41,42], but with the critical advantage that its parameters can eventually be related to cortical parameters, as we begin to describe in a related study [32].

Figure 1. Global cortical model as a Potts neural network. Reprinted with permission from [25].

Figure 2. Trade-off between latching sequence length (solid lines) and retrieval discrimination (dashed lines). Different colors indicate different S values, while C = 400 throughout. The latching length l is in time steps (not in the number of transitions), normalized by the time of the simulation, N_update = 6 × 10^5.

Figure 3. Phase space for Q(S, p) in (a) and Q(C, p) in (b) with randomly correlated patterns in the slowly adapting regime. The parameters are C = 150 and S = 5, if kept fixed, and w = 0.8. The red spots in (a) mark the parameter values used in the following analyses.

Figure 5. (a) Asymmetry A of the transition matrix and (b) Shannon's information entropy, I_µ, along the (3, 350)-(4, 300)-(5, 250)-(6, 200)-(7, 150) parameter series from Figure 3. Different curves correspond to different thresholds for the overlap of the two states between which the network is defined to have a transition. The error bars report the standard deviation of either quantity for each of 1000 sequences.

Figure 6. Latching quality Q(S, p) with increasing local feedback, w = 0.37, 0.55, 0.8, and 1.0 in the slowly adapting regime. Randomly correlated patterns are used, with C = 150 as in Figure 3a.

Figure 7. Phase space for Q(S, p) in (a) and Q(C, p) in (b) with randomly correlated patterns in the fast adapting regime. The parameters are identical to those in the slowly adapting regime, with the exception of w = 1.37, τ_1 = 20, τ_2 = 200, τ_3 = 10. The red spots in (a) mark, again, the parameter values used in the Figures below.

Figure 10. Latching quality Q(S, p) with increasing local feedback, w = 1.33, 1.37, 1.41, and 1.45 in the fast adapting regime. Randomly correlated patterns are used, with C = 150 as in Figure 7a.

Figure 15. Scatterplots of the fractions C_1 and C_2 of Potts units active in one pattern that are active also in another, and in the same state or, respectively, in another active state. The panels show the full distribution between any pattern pair, in the slowly (a) and fast adapting (b) regimes, in blue; and the distribution between successive patterns in latching transitions, in red. The blue distribution for the fast adapting regime (for which a = 0.25, S = 6, p = 300 and w = 1.32) is similar to the one for the slowly adapting regime (for which again a = 0.25, S = 6, but p = 200 and w = 0.65), except that it is slightly wider, because of the higher storage load, while the red distributions are markedly different. Vertical lines indicate mean values: (a) slowly adapting regime; (b) fast adapting regime.

Figure 16. Phase space, cut across the Q(S, p) plane in (a) and Q(C, p) in (b), with correlated patterns in the slowly adapting regime. Red dots represent the quality peaks in the same planes, with randomly correlated patterns. The parameters are C = 150 and S = 5, if kept fixed, and w = 0.8.

Figure 17. Comparison of S-p phase spaces along p = 200 with random (red dotted line) and correlated (blue dotted line) patterns in the slowly adapting regime.