# The Capacity for Correlated Semantic Memories in the Cortex

## Abstract

^{7}, as originally estimated without correlations. When this storage capacity is exceeded, however, retrieval fails completely only for balanced factors; above a critical degree of imbalance, a phase transition leads to a regime in which the network still extracts considerable information about the cued item, even without recovering its detailed representation: partial categorization appears to emerge spontaneously, as a consequence of the dominance of particular factors, rather than being imposed ad hoc. We argue that this makes it a relevant model of semantic memory resilience, in the spirit of Tulving’s remember/know paradigm.

## 1. Introduction

#### 1.1. Correlations

#### 1.2. Connectivity

^{3}, comprising some 10^{5} neurons, as one local network interacting through the B system, whose activity is coarsely subsumed into a Potts unit. A Potts unit has multiple activity states, akin to a capsule of the kind recently introduced in deep learning networks [30]. The Potts network, aimed at describing the cortex, or a large part of it, comprises N such units, constituting the A-system (Figure 3b). We refer to [31] for a detailed analysis of the approximate thermodynamic and dynamic equivalence of the full multi-modular model and the Potts network. We do not dwell on that correspondence here, but use it to discuss correlations in the Potts framework.

## 2. Results

#### 2.1. The Potts Network

#### 2.2. Generating Correlated Representations

#### 2.2.1. Single Parents and Ultrametrically Correlated Children

#### 2.2.2. Multiple Parents and Non-Trivially Organized Children

#### 2.2.3. The Algorithm Operating on Simple Binary Units

#### 2.2.4. The Algorithm Operating on Genuine Potts Units

#### 2.2.5. Resulting Patterns and Their Correlations

#### 2.2.6. The Ultrametric Limit

#### 2.2.7. The Random Limit

#### 2.2.8. Semantic Dominance

#### 2.3. Storage Capacity of the Potts Network with Correlated Patterns

#### 2.3.1. Self-Consistent Signal to Noise Analysis

#### 2.3.2. Numerical Solutions of Mean-Field Equations and Simulations

#### 2.3.3. The Effect of Correlation Parameters f and ${a}_{p}$

#### 2.3.4. Correlated Retrieval

#### 2.3.5. Residual Information: Memory Beyond Capacity

^{−3} bits per connection, some five times below the entropy of the stored pattern (the initial plateau in Figure 18). It then decreases again at approximately $\zeta \simeq 0.5$. This effect is reminiscent of a phase transition with control parameter $\zeta $, in which the information plays the role of the order parameter. Below $\zeta \simeq 0.01$ and above $\zeta \simeq 0.5$, once the capacity is exceeded, no retrievable information about the cued pattern remains. Within the range $0.01\lesssim \zeta \lesssim 0.5$, however, the network still retrieves some information about the cued pattern. In Figure 19b we plot, as a phase diagram, the residual information as a function of $\zeta $ on the x-axis and f on the y-axis. One sees that non-vanishing residual information requires, essentially, intermediate values of $\zeta $ and sufficiently large values of f. In terms of either parameter, the region with non-vanishing residual information spans more than one order of magnitude. This intermediate regime can be argued to form the basis of semantic resilience in our model.
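The quantity at stake is a mutual information between the cued pattern and the final configuration of the network. As a simplified illustration (the paper reports information per connection, computed differently), a plug-in estimate over per-unit state pairs can be sketched as follows:

```python
import numpy as np

def mutual_information(cued, retrieved, n_states):
    """Plug-in estimate, in bits, of I(cued; retrieved) from the joint
    histogram of per-unit states of the two configurations."""
    joint = np.zeros((n_states, n_states))
    for c, r in zip(cued, retrieved):
        joint[c, r] += 1.0
    joint /= joint.sum()
    # Product of the marginals, for the log-ratio inside the sum.
    marg = np.outer(joint.sum(axis=1), joint.sum(axis=0))
    nz = joint > 0
    return float(np.sum(joint[nz] * np.log2(joint[nz] / marg[nz])))

x = np.array([0, 0, 1, 1, 2, 2])
print(mutual_information(x, x, 3))                       # perfect retrieval: log2(3) bits
print(mutual_information(x, np.zeros(6, dtype=int), 3))  # uninformative configuration: 0
```

With this measure, a retrieved configuration that preserves only the gross, shared structure of a cluster of patterns still carries non-zero information about the cued pattern, which is the sense in which the residual information survives beyond capacity.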

#### 2.3.6. Residual Memory Interpreted through Cluster Analysis

#### 2.3.7. Residual Memory Rides on Fine Differences in Ultrametric Content

## 3. Discussion: A New Model for the Extraction of Semantic Structure

^{7} with human cortical parameters [31]. Other learning prescriptions, which may enhance capacity, can be explored, and we leave such studies for future investigations. Of the correlation parameters, the effect of the dominance $\zeta $ is particularly interesting. $\zeta \simeq 0$ corresponds to a situation in which all parents are on an equal footing, while the opposite limit corresponds to a mere handful dominating all the rest. For intermediate values of the dominance $\zeta $, we observe correlated retrieval: as successful retrieval decreases, the fraction of trials in which another, correlated pattern is retrieved increases. A closer look, however, indicates that the phenomenon is linked to the retrieval of the factors, i.e., factor retrieval. In terms of the mutual information between the cued pattern and the final configuration of the network after retrieval dynamics, we observe that in an intermediate regime of $\zeta $, this information does not go to zero once the capacity limit has been exceeded. We call it the residual information. The residual information displays a non-monotonic dependence on the dominance $\zeta $. Such non-trivial behavior is reminiscent of a phase transition, in which the residual information is the order parameter and $\zeta $ is the control parameter. The residual information has an interesting interpretation: it can be thought of as the information pertaining to the gross, core semantic component of the memories, after the fine details have been compromised. Note that $1/\zeta $ is a measure of the number of parents/factors/attributes that effectively dominate semantic space.
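The reading of $1/\zeta$ as an effective number of dominant parents can be made concrete with a short sketch. Since parent $\pi$ enters with weight $\propto \exp(-\zeta\pi)$ (as in Figure 12), one common choice, which we adopt here as an illustrative assumption, is the participation ratio of the normalized weights; it yields a count of order $1/\zeta$.

```python
import numpy as np

def effective_number_of_parents(zeta, n_parents=150):
    """Parent pi enters with weight ~ exp(-zeta * pi); the participation
    ratio 1 / sum(w_i^2) of the normalized weights counts how many
    parents effectively dominate."""
    w = np.exp(-zeta * np.arange(n_parents))
    w /= w.sum()
    return 1.0 / np.sum(w ** 2)

for zeta in (1e-6, 0.078, 0.5):
    print(f"zeta = {zeta}: ~{effective_number_of_parents(zeta):.1f} dominant parents")
```

For $\zeta \to 0$ the count approaches the full number of parents $\Pi$, while for large $\zeta$ it collapses to a handful, tracking the two limits discussed above.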

## Author Contributions

## Funding

## Acknowledgments

## Conflicts of Interest

## Appendix A. Calculation of the Probability Distribution of the Field for S = 1

## Appendix B. Calculation of the Probability Distribution of the Field for S = 2

## Appendix C. Ultrametric Content

## References

1. Yonelinas, A.P. The nature of recollection and familiarity: A review of 30 years of research. J. Mem. Lang. **2002**, 46, 441–517.
2. Treves, A.; Rolls, E.T. Computational constraints suggest the need for two distinct input systems to the hippocampal CA3 network. Hippocampus **1992**, 2, 189–199.
3. Alme, C.B.; Miao, C.; Jezek, K.; Treves, A.; Moser, E.I.; Moser, M.B. Place cells in the hippocampus: Eleven maps for eleven rooms. Proc. Natl. Acad. Sci. USA **2014**, 111, 18428–18435.
4. Lauro-Grotto, R.; Ciaramelli, E.; Piccini, C.; Treves, A. Differential impact of brain damage on the access mode to memory representations: An information theoretic approach. Eur. J. Neurosci. **2007**, 26, 2702–2712.
5. Huth, A.G.; Nishimoto, S.; Vu, A.T.; Gallant, J.L. A continuous semantic space describes the representation of thousands of object and action categories across the human brain. Neuron **2012**, 76, 1210–1224.
6. Huth, A.G.; de Heer, W.A.; Griffiths, T.L.; Theunissen, F.E.; Gallant, J.L. Natural speech reveals the semantic maps that tile human cerebral cortex. Nature **2016**, 532, 453–458.
7. Mitchell, T.M.; Shinkareva, S.V.; Carlson, A.; Chang, K.M.; Malave, V.L.; Mason, R.A.; Just, M.A. Predicting human brain activity associated with the meanings of nouns. Science **2008**, 320, 1191–1195.
8. Collins, A.M.; Quillian, M.R. Retrieval time from semantic memory. J. Verbal Learn. Verbal Behav. **1969**, 8, 240–247.
9. Warrington, E.K. The selective impairment of semantic memory. Q. J. Exp. Psychol. **1975**, 27, 635–657.
10. Treves, A. On the perceptual structure of face space. BioSystems **1997**, 40, 189–196.
11. Parga, N.; Virasoro, M.A. The ultrametric organization of memories in a neural network. J. Phys. **1986**, 47, 1857–1864.
12. Gutfreund, H. Neural networks with hierarchically correlated patterns. Phys. Rev. A **1988**, 37, 570–577.
13. Franz, S.; Amit, D.J.; Virasoro, M.A. Prosopagnosia in high capacity neural networks storing uncorrelated classes. J. Phys. **1990**, 51, 387–408.
14. Virasoro, M.A. Categorization in neural networks and prosopagnosia. Phys. Rep. **1989**, 184, 301–306.
15. Brasselet, R.; Arleo, A. Category Structure and Categorical Perception Jointly Explained by Similarity-Based Information Theory. Entropy **2018**, 20, 527.
16. Shallice, T.; Cooper, R. The Organisation of Mind; Oxford University Press: Oxford, UK, 2011.
17. Rumelhart, D.E.; Hinton, G.E.; McClelland, J.L. A general framework for parallel distributed processing. Parallel Distrib. Process. Explor. Microstruct. Cognit. **1986**, 1, 45–76.
18. Farah, M.J.; McClelland, J.L. A computational model of semantic memory impairment: Modality specificity and emergent category specificity. J. Exp. Psychol. Gen. **1991**, 120, 339.
19. Plaut, D.C. Semantic and associative priming in a distributed attractor network. In Proceedings of the 17th Annual Conference of the Cognitive Science Society, Hillsdale, NJ, USA, 22–25 July 1995; Volume 17, pp. 37–42.
20. Rogers, T.T.; Lambon Ralph, M.A.; Garrard, P.; Bozeat, S.; McClelland, J.L.; Hodges, J.R.; Patterson, K. Structure and deterioration of semantic memory: A neuropsychological and computational investigation. Psychol. Rev. **2004**, 111, 205.
21. Hellwig, B. A quantitative analysis of the local connectivity between pyramidal neurons in layers 2/3 of the rat visual cortex. Biol. Cybern. **2000**, 82, 111–121.
22. Roudi, Y.; Treves, A. An associative network with spatially organized connectivity. J. Stat. Mech. Theory Exp. **2004**, 2004, P07010.
23. Pucak, M.L.; Levitt, J.B.; Lund, J.S.; Lewis, D.A. Patterns of intrinsic and associational circuitry in monkey prefrontal cortex. J. Comp. Neurol. **1996**, 376, 614–630.
24. Braitenberg, V.; Schüz, A. Anatomy of the Cortex: Statistics and Geometry; Springer Science & Business Media: Berlin, Germany, 1991; Volume 18.
25. O’Kane, D.; Treves, A. Short- and long-range connections in autoassociative memory. J. Phys. A Math. Gen. **1992**, 25, 5055.
26. O’Kane, D.; Treves, A. Why the simplest notion of neocortex as an autoassociative memory would not work. Netw. Comput. Neural Syst. **1992**, 3, 379–384.
27. Mari, C.F.; Treves, A. Modeling neocortical areas with a modular neural network. Biosystems **1998**, 48, 47–55.
28. Johansson, C.; Rehn, M.; Lansner, A. Attractor neural networks with patchy connectivity. Neurocomputing **2006**, 69, 627–633.
29. Dubreuil, A.M.; Brunel, N. Storing structured sparse memories in a multi-modular cortical network model. J. Comput. Neurosci. **2016**, 40, 157–175.
30. Sabour, S.; Frosst, N.; Hinton, G.E. Dynamic routing between capsules. In Proceedings of the 31st Conference on Neural Information Processing Systems (NIPS 2017), Long Beach, CA, USA, 4–9 December 2017; pp. 3859–3869.
31. Naim, M.; Boboeva, V.; Kang, C.J.; Treves, A. Reducing a cortical network to a Potts model yields storage capacity estimates. J. Stat. Mech. Theory Exp. **2018**, 2018, 043304.
32. Hopfield, J.J. Neural networks and physical systems with emergent collective computational abilities. Proc. Natl. Acad. Sci. USA **1982**, 79, 2554–2558.
33. Tsodyks, M.V.; Feigel’man, M.V. The enhanced storage capacity in neural networks with low activity level. EPL (Europhys. Lett.) **1988**, 6, 101.
34. Kropff, E.; Treves, A. The storage capacity of Potts models for semantic memory retrieval. J. Stat. Mech. Theory Exp. **2005**, 2005, P08010.
35. Mézard, M.; Parisi, G.; Virasoro, M.A. Spin Glass Theory and Beyond: An Introduction to the Replica Method and Its Applications; World Scientific Publishing Co Inc.: Singapore, 1987; Volume 9.
36. Amit, D.J. Modeling Brain Function: The World of Attractor Neural Networks; Cambridge University Press: Cambridge, UK, 1992.
37. Treves, A. Frontal latching networks: A possible neural basis for infinite recursion. Cognit. Neuropsychol. **2005**, 22, 276–291.
38. Sartori, G.; Lombardi, L. Semantic relevance and semantic disorders. J. Cognit. Neurosci. **2004**, 16, 439–452.
39. Osgood, C.E. Semantic differential technique in the comparative study of cultures. Am. Anthropol. **1964**, 66, 171–200.
40. Löwe, M. On the storage capacity of Hopfield models with correlated patterns. Ann. Appl. Probab. **1998**, 8, 1216–1250.
41. Engel, A. Storage capacity for hierarchically correlated patterns. J. Phys. A Math. Gen. **1990**, 23, L285.
42. Shiino, M.; Fukai, T. Self-consistent signal-to-noise analysis of the statistical behavior of analog neural networks and enhancement of the storage capacity. Phys. Rev. E **1993**, 48, 867.
43. Kropff, E. Full solution for the storage of correlated memories in an autoassociative memory. Comput. Model. Behav. Neurosci. Closing Gap Neurophysiol. Behav. **2009**, 2, 225.
44. Tamarit, F.A.; Curado, E.M. Pair-correlated patterns in Hopfield model of neural networks. J. Stat. Phys. **1991**, 62, 473–480.
45. Liberman, A.M.; Harris, K.S.; Hoffman, H.S.; Griffith, B.C. The discrimination of speech sounds within and across phoneme boundaries. J. Exp. Psychol. **1957**, 54, 358–368.
46. Rips, L.J.; Shoben, E.J.; Smith, E.E. Semantic distance and the verification of semantic relations. J. Verbal Learn. Verbal Behav. **1973**, 12, 1–20.
47. Rosch, E.H.; Mervis, C.B. Family resemblances: Studies in the internal structure of categories. Cognit. Psychol. **1975**, 7, 573–605.
48. Kuhl, P.K. Human adults and human infants show a “perceptual magnet effect” for the prototypes of speech categories, monkeys do not. Percept. Psychophys. **1991**, 50, 93–107.
49. Feldman, N.H.; Griffiths, T.L.; Morgan, J.L. The influence of categories on perception: Explaining the perceptual magnet effect as optimal statistical inference. Psychol. Rev. **2009**, 116, 752.
50. Haxby, J.V.; Gobbini, M.I.; Furey, M.L.; Ishai, A.; Schouten, J.L.; Pietrini, P. Distributed and overlapping representations of faces and objects in ventral temporal cortex. Science **2001**, 293, 2425–2430.
51. Norman, K.A.; Polyn, S.M.; Detre, G.J.; Haxby, J.V. Beyond mind-reading: Multi-voxel pattern analysis of fMRI data. Trends Cognit. Sci. **2006**, 10, 424–430.
52. Preston, A.R.; Eichenbaum, H. Interplay of hippocampus and prefrontal cortex in memory. Curr. Biol. **2013**, 23, R764–R773.
53. Ciaramelli, E.; Lauro-Grotto, R.; Treves, A. Dissociating episodic from semantic access mode by mutual information measures: Evidence from aging and Alzheimer’s disease. J. Physiol. Paris **2006**, 100, 142–153.
54. Tulving, E. Episodic memory: From mind to brain. Annu. Rev. Psychol. **2002**, 53, 1–25.
55. Garrard, P.; Perry, R.; Hodges, J.R. Disorders of semantic memory. J. Neurol. Neurosurg. Psychiatry **1997**, 62, 431.
56. Conrad, C. Cognitive Economy in Semantic Memory; American Psychological Association: Washington, DC, USA, 1972.
57. Spivey, M.; Joanisse, M.; McRae, K. The Cambridge Handbook of Psycholinguistics; Cambridge University Press: Cambridge, UK, 2012.
58. Kitamura, T.; Ogawa, S.K.; Roy, D.S.; Okuyama, T.; Morrissey, M.D.; Smith, L.M.; Redondo, R.L.; Tonegawa, S. Engrams and circuits crucial for systems consolidation of a memory. Science **2017**, 356, 73–78.
59. McClelland, J.L.; McNaughton, B.L.; O’Reilly, R.C. Why there are complementary learning systems in the hippocampus and neocortex: Insights from the successes and failures of connectionist models of learning and memory. Psychol. Rev. **1995**, 102, 419.

**Figure 1.** (**a**) Original correlation matrix. (**b**) Correlation matrix obtained by replacing each within-block entry with the mean correlation value of that noun cluster and each off-block entry with the mean correlation between clusters. The clusters are obtained through the application of a standard clustering algorithm to the original correlation matrix (**a**). (**c**) Strictly ultrametric correlation matrix obtained by again replacing each within-block entry with the mean value within that block, and now each off-block entry with the overall off-block mean.

**Figure 2.** (**a**) Two-dimensional logarithmic density plot of the ratio between the intermediate and the longest edge vs. the ratio between the shortest and the longest edge, in the triangles created by extracting quasi-distances for the triplets of nouns taken from Figure 1a. The triplets are scattered, with an ultrametric content of 0.5. (**b**) Same as (**a**), with triplets taken from Figure 1b. The triplets have less scatter and yield an ultrametric content of 0.61. (**c**) Same as (**a**), with triplets taken from Figure 1c. Here the triplets constitute isosceles triangles with two long sides, as can be seen from the alignment of the ratios along the vertical line ${d}_{med}={d}_{max}$. The ultrametric content (see Appendix C) is exactly 1. In all three panels, the dashed red line corresponds to the line of constant ultrametric content index.

**Figure 3.** (**a**) The Potts network, here intended as a model of semantic memory, is a coarse description of the cortex in terms of local patches of dense connectivity, which store activity patterns corresponding to local attractors. Each patch is a small local network characterized by high connectivity; diluted connections are instead present between units of different patches. The configuration of an individual patch is assumed to converge to a local attractor, synthetically captured by a Potts state. (**b**) Each Potts unit can be in any of S states; here green, orange, blue and red represent the active states ($S=4$). The white circle at the center corresponds to the quiescent state, aimed at capturing a situation of no retrieval of the underlying local network.

**Figure 4.** (**a**) A tree, adapted from [35]. Nodes 1, 2, 3, 4, 5, and 6 are at the same level of the hierarchy. If we consider nodes 1, 3 and 6, they are each at a distance of 2 from each other, the distance being defined as the distance to the nearest common branching point. If we consider nodes 3, 4 and 5, they are each at a distance of 1 from each other, so that again we get an equilateral triangle. If we consider 1, 2 and 3, then ${d}_{12}=1$ while ${d}_{13}={d}_{23}=2$, so that we get an isosceles triangle with two long edges. The one alternative, an isosceles triangle with two short edges, is impossible to realize: there are no intermediate points between 1 and 3 or between 2 and 3, as indicated in red in (**b**).

**Figure 5.** (**a**) The workings of a hierarchical algorithm with 3 parents and 3 child patterns per parent. The different rows of squares/circles correspond to the different units in parents/children. Colors correspond to active Potts states, while white denotes the quiescent state; $S=3$. (**b**) The workings of the multi-parent algorithm with $\Pi =3$ parents, ${p}_{par}=3$ child patterns per parent and 5 total child patterns. Black arrows denote input from parents to children, their thickness the strength of that input. The main difference from the hierarchical algorithm is that each child pattern can receive input from multiple parents. If each parent is taken to represent a feature and each child a concept, the algorithm entails the generation of a concept from multiple features.
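The multi-parent scheme lends itself to a compact sketch, here with binary units ($S=1$, as in Section 2.2.3). The parameter names follow the text, but the graded parent inputs and the top-$aN$ activation rule below are illustrative simplifications, not the paper's exact algorithm.

```python
import numpy as np

rng = np.random.default_rng(1)
N, a_p = 2000, 0.4       # units; fraction of units each parent feeds
Pi, p = 10, 50           # number of parents and of child patterns
n_parents_per_child = 3  # parents assigned to each child (drawn at random here)
a = 0.3                  # final sparsity of each child pattern

# Each parent projects to a random subset of a_p * N units (binary case, S = 1).
parent_units = [rng.choice(N, size=int(a_p * N), replace=False) for _ in range(Pi)]

children = []
for _ in range(p):
    # Accumulate input "fields" from this child's parents plus tie-breaking noise,
    # then activate the a * N units receiving the largest fields.
    h = rng.random(N) * 1e-3
    for par in rng.choice(Pi, size=n_parents_per_child, replace=False):
        h[parent_units[par]] += rng.random(int(a_p * N))  # graded input from parent
    xi = np.zeros(N, dtype=int)
    xi[np.argsort(h)[-int(a * N):]] = 1
    children.append(xi)

# Children sharing parents end up more correlated than chance.
C = np.array([[np.mean(x * y) / a for x in children] for y in children])
mean_off = (C.sum() - np.trace(C)) / (p * (p - 1))
print("mean off-diagonal correlation:", mean_off)
```

Because every child draws its field from several parents, pairs of children with shared parents overlap in many units, reproducing qualitatively the broad correlation distributions of Figures 8–10.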

**Figure 6.** (**a**) Solid lines correspond to the analytical distributions of the field, Equation (A8); in blue is the distribution of the fields produced by a simulation of the algorithm for ${n}_{p}=15$. The parameters are $N=2000$, $S=1$, ${a}_{p}=0.4$, ${n}_{p}=15\cdots 50$ and $\Pi =100$. (**b**) The mean and standard deviation of the field as a function of the number of parents.

**Figure 7.** (**a**) Distribution of the maximal fields for $S=2$ and ${n}_{p}=30$. In blue is the distribution of the fields produced by the algorithm; the black line is Equation (A18). (**b**) The x-axis orders patterns by their number of parents and the y-axis reports the fields of the units in each pattern. Red points correspond to units that are set to quiescent and green to those that are activated. The boundary between green and red corresponds to ${h}_{m}$, the minimum field required for a unit to be set active. Parameters are $N=2000$, $S=2$, ${a}_{p}=0.4$ and $\Pi =100$.

**Figure 8.**Probability density function of correlations between units (in red) and between patterns (in green) for three different values of both ${a}_{p}$ and f, the latter yielding in this case an average of 1.5, 4.5 and 7.5 parents per pattern. The black vertical line corresponds to the average correlation with uncorrelated patterns distributed independently according to Equation (1). The parameters are $S=5$, $a=0.3$, and $\Pi =150$. The algorithm produces correlations between patterns with high variability relative to the correlation between units, in line with ideas about semantic memory. Note that the algorithm is sensitive to the parameters and their values strongly affect the correlation between patterns.

**Figure 9.** (**a**) Boxplots of ${C}^{\mu \nu}$ for different values of ${a}_{p}$, with $f=0.05$ fixed. (**b**) Boxplots of ${C}^{\mu \nu}$ for different values of f, with ${a}_{p}=0.4$ fixed. The parameters ${a}_{p}$ and f play different roles in generating the correlations. Increasing the extent ${a}_{p}$ of the input children receive from each parent increases the overall similarity of those children having shared parents, as evidenced by the increasing skewness of the distributions. In contrast, increasing the prolificity f leads to an increase in the mean number of shared parents, such that all children are more correlated, as shown by the shift in the overall distribution. The black horizontal line corresponds to the average correlation with uncorrelated patterns distributed according to Equation (1). Other parameters are $a=0.3$, $S=5$ and $\Pi =150$.

**Figure 10.** (**a**) Another visualization of the correlation distribution of Figure 8, with $f=0.05$, ${a}_{p}=0.4$ and $\Pi =150$, decomposed into the distribution for each number of shared parents. (**b**) Fraction of pairs of patterns (left y-axis, note the logarithmic scale) and mean correlation between those pairs (right y-axis, linear scale) as a function of the number of shared parents. The red horizontal line corresponds to the average correlation with uncorrelated patterns distributed according to Equation (1). Pairs of patterns having more shared parents are markedly fewer, although on average more correlated, so they do not much affect the overall mean correlation.

**Figure 11.**The x-axis lists all of the features used to compute the correlation between the nouns in the toy example of Section 1.1, sorted according to their summed weights across all nouns (reported on a semi-logarithmic y-axis). The exponent of the fit, $\zeta =0.078$, indicates that the semantics of this particular set of nouns is effectively dominated by a set of order $1/\zeta \simeq 10$ features.

**Figure 12.**One sample representation of parent-child relations. The squares on the top row represent parents, while the circles at the bottom row represent children. Black lines represent input from the parents to the children. The strength with which each parent affects its children is proportional to $exp(-\zeta \pi )$, where $\pi $ indexes the parents, as explained in the text. For illustration, there are $\Pi =10$ parents, ${p}_{par}=5$ children per parent and $p=50$ total children.

**Figure 13.**Probability density function of correlations between units (in red) and between patterns (in green) for three different values of the dominance rate $\zeta $ and prolificity f, keeping ${a}_{p}=0.4$ constant. For the low value of $\zeta =0.001$, this figure reproduces the middle panel of Figure 8. For higher values of $\zeta $, where the parents become highly heterogeneous, we see the emergence of large correlations.

**Figure 14.**${C}^{\mu \nu}$ for different values of $\zeta $, with ${a}_{p}=0.4$ and $f=0.05$ fixed. Other parameters are $a=0.3$, $S=5$ and $\Pi =150$.

**Figure 15.** (**a**) Storage capacity as a function of the sparsity a for different values of the dilution parameter ${c}_{m}/N$, for uncorrelated patterns (${C}_{as}=\tilde{a}$), obtained through solutions of the mean-field equations. (**b**) Storage capacity for uncorrelated patterns (in black) and for correlated patterns (in green). Dots correspond to simulations, while the dash-dotted lines correspond to solutions of the mean-field equations. For uncorrelated patterns this is the same curve as in panel (**a**) with ${c}_{m}/N=0.1$. It is apparent that the mean-field treatment yields better results for uncorrelated patterns; for correlated patterns, it over-estimates the storage capacity. Parameters are $N=2000$, ${c}_{m}=200$, $S=5$, $U=0.5$ and $\beta =200$.

**Figure 16.** (**a**) Storage capacity ${\alpha}_{c}$ as a function of the sparsity a for different values of the correlation parameters ${a}_{p}$ and f. The storage capacity is defined as the critical storage load at which half of all cued patterns are retrieved with an overlap of $0.7$ or above. Increasing ${a}_{p}$ and f are both generally detrimental to the capacity. (**b**) ${\alpha}_{c}$ as a function of the number of Potts states S, showing that the superlinear increase derived analytically in [34] for randomly correlated patterns (the black curve) is only really approached, within this limited S range, with patterns that are very close to randomly correlated (the orange curve). (**c**) ${\alpha}_{c}$ as a function of the connectivity ${c}_{m}$ for random dilution (see Equation (7)). The capacity decreases as a function of increasing connectivity. When not explicitly varied, parameters are $N=2000$, ${c}_{m}=200$, $a=0.1$, $S=5$, $U=0.5$, $\beta =200$, $\zeta ={10}^{-6}$ and $\Pi =150$.

**Figure 17.** Storage capacity curves as a function of the correlation parameters. (**a**) ${\alpha}_{c}$ as a function of f. Full lines correspond to simulations, while the dashed line corresponds to solutions of the mean-field equations. Similar to Figure 16a, the over-estimation of the capacity by the SCSNA approach holds also for other values of f. (**b**) ${\alpha}_{c}$ as a function of ${a}_{p}$. When not explicitly varied, the correlation parameters are ${a}_{p}=0.4$, $f=0.05$, $\zeta ={10}^{-6}$ and $\Pi =150$. Network parameters are $N=2000$, ${c}_{m}=200$, $a=0.1$ and $S=5$.

**Figure 18.** Mean overlap (**left** y-axis) and mutual information per connection (dashed black curve, **right** y-axis) as a function of the storage load $\alpha $, for different values of the dominance $\zeta $. For low values of $\zeta $, the information falls abruptly at a given storage load, and similarly for very high values of $\zeta $, while for intermediate values we observe a more gradual decay, starting at lower values of the storage load. For intermediate values of $\zeta $, however, the information does not go to zero, but rather saturates at a certain value: we call this the residual information. In Figure 19, we plot this residual information as a function of $\zeta $. Network parameters are $N=2000$, ${c}_{m}=200$, $S=5$, $a=0.1$, $U=0.5$, $\beta =200$. Correlation parameters are ${a}_{p}=0.4$, $f=0.05$ and $\Pi =150$.

**Figure 19.** (**a**) Storage capacity (**left** y-axis) and residual mutual information between the cued memory and the retrieved configuration, after capacity collapse, as a function of $\zeta $ (**right** y-axis). The storage capacity displays a trough for intermediate values of the dominance $\zeta $, due to an increased clustering of the patterns and the inability of the network to retrieve each one of them with relatively high precision. The apparent increase of the capacity for very high values of $\zeta $ is instead due to such clusters vanishing altogether, one by one, as the inputs from weaker parents are dominated by a small input to a random state (see Equation (31)). The residual mutual information corroborates the results from the storage capacity: only for intermediate values of $\zeta $ can the network retrieve some information after capacity collapse. Network parameters are $N=2000$, ${c}_{m}=200$, $S=5$, $a=0.1$, $U=0.5$, $\beta =200$. Correlation parameters are ${a}_{p}=0.4$, $f=0.05$ and $\Pi =150$. (**b**) Phase diagram of the residual information, as a function of the dominance $\zeta $ on the x-axis and the prolificity f on the y-axis, giving a fuller picture of the phase transition in (**a**). Note that the transition to non-zero residual mutual information occurs at higher values of $\zeta $ with increasing f. The black horizontal line, plotted for clarity, corresponds to the value of f used in (**a**). The three white dots correspond to three points for which a cluster analysis is reported in Figure 20.

**Figure 20.** Cluster analysis applied to patterns generated by the algorithm, for three different values of the dominance parameter $\zeta $, chosen at salient points of the phase diagram in Figure 19. As $\zeta $ increases, the patterns generated by the algorithm become more and more clustered, as the strongest parent of each pattern comes to dominate its activity.

**Table 1.**Ultrametric content computed for distances of triplets of patterns generated by the algorithm, for six different parameter values of the prolificity and the dominance. An increased ultrametric content reflects an increased clustering in the correlations between patterns. For $f=0.2$ and $\zeta =0.1$, the patterns yield values of the ultrametric content index close to that obtained from the nouns (∼0.5). The corresponding clustering structure of the patterns can be seen in Figure 20d.

| | $\mathit{\zeta}=0.001$ | $0.02$ | $0.1$ | $1.0$ | $10$ |
|---|---|---|---|---|---|
| $\mathit{f}=0.05$ | 0.395 | 0.429 | 0.435 | 0.416 | 0.397 |
| $\mathit{f}=0.2$ | 0.389 | 0.404 | 0.507 | 0.507 | 0.507 |

© 2018 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Boboeva, V.; Brasselet, R.; Treves, A. The Capacity for Correlated Semantic Memories in the Cortex. *Entropy* **2018**, *20*, 824.
https://doi.org/10.3390/e20110824
