1. Introduction
Since the seminal work of Hopfield [1], the Attractor Neural Network (ANN) has remained a permanent subject of investigation. With the recent developments in the broad field of Artificial Intelligence and the plethora of its applications in modern life, the subject deserves renewed interest.
The Finite Connectivity Attractor Neural Network (FCANN) is an ANN in which the neurons' average connectivity is finite, which allows for a more realistic description of biological neural networks than the classical mean-field one [2]. The numbers of neurons and connectivities in biological systems are spread over a wide range. As extreme examples, C. elegans has around 300 neurons with an average connectivity of around 10 [3,4], while in the human brain these numbers are of the order of 10^11 and 10^4, respectively. The FCANN model was introduced in [5], where an improved replica method was employed to investigate the thermodynamic properties of a network in which the patterns are stored according to a generalized Hebbian rule. In the present work we extend the formalism of [5] and develop the equations needed to explicitly evaluate the main observables.
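As an illustration of the storage prescription just mentioned, the sketch below builds Hebbian couplings restricted to a random sparse graph. The function and variable names are ours, and the plain Hebb rule with a 1/c normalization stands in for the generalized rule of [5]; it is a minimal sketch, not the paper's exact formulation.

```python
import numpy as np

def hebbian_couplings(patterns, adjacency, c):
    """Hebbian couplings restricted to a sparse graph.

    patterns  : (p, N) array of +/-1 pattern bits
    adjacency : (N, N) symmetric 0/1 mask with zero diagonal
    c         : average connectivity, used as normalization
    """
    J = patterns.T @ patterns / c      # outer-product (Hebb) term
    np.fill_diagonal(J, 0.0)           # no self-couplings
    return J * adjacency               # keep only existing edges

rng = np.random.default_rng(0)
N, p, c = 200, 3, 10
xi = rng.choice([-1.0, 1.0], size=(p, N))
mask = (rng.random((N, N)) < c / N)
mask = np.triu(mask, 1)
mask = (mask | mask.T).astype(float)   # symmetric Erdos-Renyi mask
J = hebbian_couplings(xi, mask, c)
```

Each neuron then interacts with an average of c neighbors only, instead of the N − 1 neighbors of the fully connected model.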
Beyond the storage capacity, key properties of an ANN are its information content and its ability to retrieve that information in different scenarios. The information capacity can be evaluated theoretically through entropy measurements [6]. The retrieval ability can be investigated through computer simulations or by application to real problems. As a proof of concept, the FCANN is used here for retinal image retrieval. Digital Retinal Images for Vessel Extraction and Recognition is a research field within ophthalmology which allows the early diagnosis of eye diseases such as diabetic retinopathy, glaucoma, and macular degeneration. Retina recognition is also a biometric modality and technology that utilizes the unique patterns of blood vessels in the retina to identify individuals. This method is highly reliable due to the distinct and unchangeable nature of retinal patterns, which remain stable throughout an individual's life [7].
The main applications of retina recognition include the following:
- (a)
Security and Access Control: Retina images are used in high-security environments, such as military and government facilities, to control access. Retina scanners are employed to secure restricted areas and to ensure that only authorized personnel can enter.
- (b)
Healthcare: Medical identification systems use retina recognition to accurately match patients with their medical records. Retinal scans are also utilized in diagnosing and monitoring diseases like diabetes and hypertension, which affect retinal blood vessels.
- (c)
Banking systems: Financial institutions use retina recognition for secure authentication of transactions, ATMs, and online banking services to increase security and reduce fraud.
In recent years, neural networks have been used to automate and improve the accuracy of retina recognition [8]. Convolutional Neural Networks (CNNs) are particularly suitable for retina recognition due to their ability to extract relevant features from images through convolutional and pooling layers. This approach has been used for health monitoring and has even inspired a particular CNN architecture [9]. Other types of neural networks, such as the multilayer feedforward perceptron, have also been considered for identifying the individual to whom a retina belongs. Such a system is more precise than a human observer, as well as automatic, efficient, and fast; however, it cannot be used to reconstruct a complete noisy image [10]. In the present paper we propose to apply an extremely diluted Attractor Neural Network to recognize and retrieve a retinal structure from a noisy sample in a dataset.
The aim of this paper is to offer a deeper understanding of the FCANN, exploring features of its thermodynamic properties and its capabilities as an information storage device. It is organized as follows. Section 2 summarizes key findings and methods of related works. Section 3 offers a brief review of the replica method for finite connectivity networks, following the steps of [5]. In Section 4, a brief description of the calculation of the information content is presented. In Section 5, we present an evaluation of the RS solution. Numerical simulations and the retrieval of real retina patterns are described in Section 6. In Section 7 the results are presented. Further remarks and conclusions are addressed in Section 8.
2. Related Works
The topology of the FCANN is that of the Erdös–Renyi network [11], consisting of N nodes, each interacting with a finite neighborhood of randomly chosen nodes, with average connectivity c. Finite connectivity disordered systems can be approached theoretically in different ways. One of them requires a large (essentially infinite) number of order parameters, as in [12]. The approach adopted in the present work accesses the system's properties by constructing the distribution of local fields acting on each site; see, e.g., ref. [13]. In the context of neural networks, it is used in ref. [5]. The technique has also been successfully applied to disordered magnetic materials [14].
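Concretely, the Erdös–Renyi topology with finite average connectivity can be sampled by keeping each possible edge with probability c/N; for large N, the degree of a node is then Poisson distributed with mean c. A minimal sketch in our own notation (not code from the cited works):

```python
import numpy as np

def erdos_renyi_degrees(N, c, rng):
    """Degrees of an Erdos-Renyi graph in which each possible edge is
    present with probability c/N, giving an average degree close to c
    and, for large N, Poisson degree statistics."""
    A = rng.random((N, N)) < c / N
    A = np.triu(A, 1)
    A = A | A.T                        # symmetric adjacency, no self-loops
    return A.sum(axis=1)

rng = np.random.default_rng(1)
c = 3.0
deg = erdos_renyi_degrees(2000, c, rng)
mean_degree = deg.mean()               # fluctuates around c
var_degree = deg.var()                 # Poisson: variance close to the mean
```

The near-equality of the empirical mean and variance of the degrees is the Poisson signature exploited by the finite connectivity theory.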
In previous works, the evaluation of the information content has been crucial for understanding the dynamics of neural networks and spin models. Specifically, the information content of a fully connected three-state artificial neural network (ANN) was analyzed in ref. [15]. That study proposed a self-control model to explain low-activity patterns and also examined an extremely diluted network. Furthermore, studies on threshold binary ANNs have demonstrated that these networks, when learning biased patterns, exhibit similar behaviors, with the mutual information serving as a key measure of the network's capacity as an associative memory [16]; the authors explore the role of information measures in optimizing learning algorithms, contributing further to the understanding of neural network behavior. Finally, in ref. [6] it was shown that the expression of the mutual information as a function of macroscopic measures, such as the overlaps between patterns and neurons, can be used to derive new learning rules and even a complete original quartic (or biquadratic) ANN. This approach underscores the relevance of entropy information for ANNs, as it provides both a theoretical and a practical framework for improving network performance.
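For unbiased binary patterns, the mutual information per neuron between a stored pattern and the network state can be written in terms of the overlap m alone, since retrieval errors act as a symmetric binary channel with flip probability (1 − m)/2. The sketch below uses this standard channel expression (in our own notation), not the model-specific formulas of the works cited above:

```python
import numpy as np

def binary_entropy(x):
    # H2(x) in bits, with the convention 0 * log 0 = 0
    x = np.clip(x, 1e-12, 1.0 - 1e-12)
    return -(x * np.log2(x) + (1.0 - x) * np.log2(1.0 - x))

def mutual_information(m):
    """Mutual information (bits per neuron) between an unbiased binary
    pattern bit and the neuron state, given retrieval overlap m: the
    retrieval errors flip a bit with probability (1 - m) / 2."""
    return 1.0 - binary_entropy((1.0 + m) / 2.0)

# perfect retrieval carries one bit per neuron; zero overlap carries none
i_perfect = mutual_information(1.0)
i_none = mutual_information(0.0)
```

Because the expression depends only on macroscopic overlaps, it can be evaluated directly from either theory or simulation.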
Previous studies have also investigated extensions of Attractor Neural Networks to improve retrieval performance and storage capacity for both random and structured patterns. In ref. [17], a constructive heuristic and a genetic algorithm were proposed to optimize the assignment of retinal vessel and fingerprint patterns to ensembles of diluted attractor networks. By minimizing the similarity between pattern subsets using cosine, Jaccard, and Hamming metrics, the authors reduced cross-talk noise and increased the ensemble's storage capacity, with validation on random, fingerprint, and retinal image datasets. The retrieval of structured patterns, specifically fingerprints, was also addressed in ref. [18], where a metric attractor neural network (MANN) was employed to exploit spatial correlations in the data. A theoretical framework linking retrieval performance to load ratio, connectivity, density, randomness, and a spatial correlation parameter was introduced, with good agreement between theory and experiments.
While the above approaches are rooted in statistical physics and theoretical neuroscience, recent trends in machine learning have shifted toward high-capacity architectures trained on large-scale image and biometric datasets. For instance, ref. [10] developed a retina-based identification system using feedforward neural networks trained via backpropagation, applying preprocessing, feature extraction, and classification stages. Their system, which used grayscale vascular segmentation from the DRIVE database, demonstrated the feasibility of automated personal identification using the retina as a biometric trait. More recent developments have focused on multimodal biometric systems that integrate multiple physiological characteristics. A representative example is the hybrid identification framework proposed in ref. [19], which combines convolutional neural networks (CNNs), Softmax, and Random Forest (RF) classifiers for the joint recognition of fingerprint, finger-vein, and face images. Their architecture applies the K-means and DBSCAN algorithms for segmentation, exposure fusion for contrast enhancement, and CNNs as feature extractors, followed by classification through Softmax and RF layers. This line of research highlights the strong trend toward data-driven models for high-dimensional feature learning.
Complementary to CNN-based approaches, recurrent neural networks (RNNs) have also been explored for biometric and anomaly detection tasks. In ref. [20], the authors reviewed RNN applications in biometric authentication, expression recognition, and anomaly detection, emphasizing architectures such as Long Short-Term Memory (LSTM) and deep residual RNNs. These networks capture temporal dependencies in sequential biometric data such as gait, keystroke dynamics, or handwriting, achieving high recognition performance without requiring explicit spatial feature design. The review also underlines the versatility of RNNs for behavioral authentication and continuous monitoring applications.
In contrast to these modern machine learning approaches, characterized by dense connectivity, large parameter spaces, and supervised training on extensive datasets, the FCANN model developed in this work remains grounded in the information-theoretic and statistical mechanics formulation of attractor networks. Our approach focuses on how information is represented, stored, and retrieved within a sparsely connected system, using entropy and mutual information as evaluation measures. In the present paper, we extend this analysis to an ANN with binary uniform patterns and very low connectivity. Despite the reduced connectivity, we find that entropy information remains essential for optimizing the system's hyperparameters, namely the temperature (external noise) and the learning ratio (internal noise). The search for an expression for the entropy and the calculation of optimal parameters, aimed at maximizing the mutual information between the neurons and the data, proves central to enhancing the overall efficiency of neural networks.
7. Results
The main purpose of this paper is to resume the discussion of the FCANN model, addressing relevant questions not considered before, such as the information capacity and the RS stability.
All the observables are accessed through the calculation of the local field distributions by using a population dynamics algorithm. In principle, there is one local field distribution for each sub-network. Nevertheless, due to the reduction to a single-neuron problem, there are only two distinct local field distributions: one for the single neuron assuming the state +1 for a given pattern, and another for the single neuron assuming the state −1 for a given pattern. Furthermore, if the patterns are unbiased, as is the case here, the two distributions are mirrored. The population dynamics algorithm runs as follows, for each distribution: a population of fields is randomly created. We found that populations of different sizes produce similar results, and a fixed population size was adopted throughout the paper. Then, at each iteration: (1) an integer k is chosen according to a Poisson distribution with average c; (2) k fields are randomly chosen from the population; (3) the summation in the Dirac δ-function in Equation (25) is calculated; and (4) a further local field is chosen and the result of the previous step is assigned to it. The procedure is repeated until convergence. The joint distribution of Equation (38) is calculated similarly, except that two independently generated populations evolve with the same choice of randomness.
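The steps above can be sketched as follows. The kernel inside the δ-function of Equation (25) is model specific and is not reproduced here; as a stand-in we use the generic Ising cavity update with random ±1 couplings, so the code illustrates the population dynamics loop itself rather than the exact FCANN equations.

```python
import numpy as np

def population_dynamics(W, c, beta, n_iter, rng):
    """Population dynamics for a local field distribution on a graph with
    Poisson(c) connectivity. The update below is the generic Ising cavity
    rule with random +/-1 couplings; the model-specific kernel appearing
    in the delta-function of Equation (25) should be substituted here."""
    h = rng.normal(size=W)                      # random initial population
    for _ in range(n_iter):
        k = rng.poisson(c)                      # (1) neighbourhood size
        idx = rng.integers(0, W, size=k)        # (2) k random fields
        J = rng.choice([-1.0, 1.0], size=k)     # placeholder couplings
        t = np.clip(np.tanh(beta * J) * np.tanh(beta * h[idx]),
                    -0.999999, 0.999999)
        u = np.arctanh(t) / beta                # (3) cavity contributions
        h[rng.integers(0, W)] = u.sum()         # (4) replace one member
    return h

rng = np.random.default_rng(2)
pop = population_dynamics(W=5000, c=3.0, beta=1.0, n_iter=20000, rng=rng)
```

Histogramming the converged population gives the local field distribution from which the order parameters are computed.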
In the RS solution, the thermodynamic behavior of the neural network is characterized by the retrieval overlap m, the spin glass parameter q, and the overlap between replicas. As an example of the outcome, plots of these parameters versus the temperature are presented in Figure 2 for two representative connectivity values. The main distinction between them is that a re-entrant SG phase appears at low temperature for the larger connectivity, but not for the smaller one. For the smaller connectivity there are two regimes, depending on the load: (i) at low load, there is a stable retrieval solution (R) with m > 0 and q > 0 for T < T_c, and a stable paramagnetic (PM) solution with m = q = 0 for T > T_c, where T_c is the critical temperature; (ii) at higher load, an unstable retrieval solution (R') with m > 0 appears at low temperature. For the larger connectivity there are three regimes, depending on the load and the temperature: (i) at low load, there is an R solution for T < T_c and a PM solution for T > T_c; (ii) at intermediate load, there is an unstable R solution at low temperature, a stable R solution at intermediate temperature, and a PM solution for T > T_c; (iii) at high load, there is a spin glass (SG) solution with m = 0 and q > 0 at low temperature, an unstable R solution above it, then a stable R solution, and a PM solution for T > T_c. For comparison, the overlaps obtained from simulations on random networks with the corresponding average connectivity values are also shown in Figure 2. The results show that, in the retrieval region, m depends only weakly on T, decreasing abruptly to zero at T_c. Furthermore, it is worth remarking that the simulated overlap decreases faster with increasing p than the theoretical overlap does. For example, for given c and p, the theoretical overlap can be greater than 0.5 over a large range in temperature while the simulated one is around 0.1. In contrast to the absolute value of the order parameters, the theoretical and simulation results show good agreement concerning the critical temperature for retrieval, with the agreement improving as c increases.
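The simulations referred to above can be reproduced in outline as follows: build the sparse Hebbian couplings, start from a noisy version of a stored pattern, and run asynchronous dynamics while monitoring the overlap m. The sketch below uses zero-temperature (deterministic) updates and parameter values of our own choosing, not the exact protocol of the paper.

```python
import numpy as np

def overlap(state, pattern):
    # retrieval overlap m = (1/N) * sum_i xi_i * S_i
    return float(state @ pattern) / len(state)

def retrieve(J, state, sweeps, rng):
    """Asynchronous zero-temperature dynamics: each spin aligns with its
    local field, visited in random order, for a number of sweeps."""
    N = len(state)
    for _ in range(sweeps):
        for i in rng.permutation(N):
            h = J[i] @ state
            if h != 0.0:
                state[i] = np.sign(h)
    return state

rng = np.random.default_rng(3)
N, p, c = 1000, 2, 20
xi = rng.choice([-1.0, 1.0], size=(p, N))
A = rng.random((N, N)) < c / N
A = np.triu(A, 1)
A = (A | A.T).astype(float)              # sparse symmetric adjacency
J = (xi.T @ xi) * A / c
np.fill_diagonal(J, 0.0)

flips = rng.choice([1.0, -1.0], size=N, p=[0.9, 0.1])
noisy = xi[0] * flips                    # pattern 0 with ~10% bits flipped
m = overlap(retrieve(J, noisy.copy(), sweeps=10, rng=rng), xi[0])
```

At finite temperature, the deterministic alignment would be replaced by a Glauber acceptance rule.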
An overall picture of the model's behavior can be achieved by drawing T versus load phase diagrams, which are presented in Figure 3 for three representative connectivity values. For comparison, results for the extremely diluted network [24] are also shown. The critical temperature T_c signals the R-PM transition; as c increases, it approaches the extremely diluted result. The AT line (from de Almeida-Thouless [25]) that signals the R-R' transition displaces to the right, which means that the RS stable region grows as c decreases. In particular, the finite-c RS solution is stable at zero temperature for sufficiently low, c-dependent loads. This is in contrast to the extremely diluted case, where the RS solution is unstable at low temperature. The freezing temperature that signals the SG-PM transition increases with c, approaching the extremely diluted limit. The R'-SG transition deserves further attention. It is re-entrant, which may be credited to the instability of the RS solution. According to the full RSB Parisi scheme [26,27], the transition from R (a ferromagnetic phase) to SG is a vertical line at a fixed load. We believe that, although difficult to calculate, this still applies to finite connectivity neural networks. It is worth remarking that the re-entrance becomes less pronounced as c decreases and that it does not exist at all for the lowest connectivity. Furthermore, the theoretical R'-SG transition for the lowest connectivity is close to the full RSB value. This suggests that low connectivity is capable of curing some of the pathologies associated with the RS solution.
Since neural networks deal with information storage and retrieval, it is useful to investigate the relationship between the amount of information, the average connectivity, and the number of stored patterns. From a practical point of view, which is more promising: to build one dense network with a large c to store a large p, or several less dense networks, each with a small c storing a small p? A tentative answer is given in Figure 4, where the information content is plotted as a function of the connectivity for p varying from 1 to 9, at zero temperature. The results show that the information i is a non-monotonic function of c and, consequently, for each p there is a value of c that maximizes i. The absolute maximum is obtained for both p = 1 and p = 2, and the maximal information then slowly decreases for larger p. Curves of the information content versus connectivity were also evaluated through numerical simulations on a network with 100,000 units. The result, shown in Figure 4, is in good agreement with the RS theory, except that the theoretical results overestimate the simulated ones. We can enumerate five reasons for the deviations of the simulations from the theory: (1) finite N, with the associated statistical error; (2) discrete c in the simulations versus continuous c in the theory; (3) correlations between patterns for small c; (4) numerical errors in the evaluation of the local field distributions; and (5) the RS hypothesis. All these errors combined can explain the discrepancies in Figure 2 and why the dots in Figure 4 lie below the continuous curves.
To clarify the tendency, Figure 5a shows the maximal information as a function of p. Initially it decreases with p, but then stabilizes for large p. The picture is qualitatively similar at higher temperatures. The figure also shows that the maximal information is not a monotonically decreasing function of the temperature: instead, it slightly increases at low temperature before decreasing. This may be explained by the general argument that a low level of noise helps to avoid many spurious local minima. The load value corresponding to the maximal information is shown in Figure 5b. This value is largest at low p and then slowly decreases. It is worth remarking that it increases with T at low temperature, which means that, for a given p, the maximum appears at a lower c.
The simulation results presented so far refer to random input patterns. We may ask whether the theoretical predictions also hold for real-world visual data. To explore this, we employed binarized digital retina patterns derived from the DRIVE dataset [28], which contains high-resolution fundus images of human retinas. Each image was cropped to remove peripheral borders lacking structural information and then binarized to represent the main vascular structures as black-and-white patterns. The resulting images have a resolution of 316 × 316 pixels, corresponding to a total of 99,856 binary units per pattern.
The original mean activity of these binary retina patterns, measured as the proportion of white (active) pixels, was low, indicating a strong bias toward sparsity. To produce input patterns more comparable to the random binary patterns used in the previous simulations, each image was morphologically dilated to increase the proportion of active pixels. This preprocessing step balances the input activity and facilitates a fair comparison between theoretical and empirical results.
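The dilation step can be sketched as below. The image here is a random stand-in for a binarized vessel map, and the number of dilation steps is an illustrative choice, not the value used to produce Figure 6.

```python
import numpy as np

def binary_dilate(img, steps=1):
    """3x3 binary dilation via array shifts: a pixel becomes active if it
    or any of its 8 neighbours is active. Repeated `steps` times."""
    out = img.astype(bool)
    for _ in range(steps):
        padded = np.pad(out, 1)
        acc = np.zeros_like(out)
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                acc |= padded[1 + dy: 1 + dy + out.shape[0],
                              1 + dx: 1 + dx + out.shape[1]]
        out = acc
    return out

rng = np.random.default_rng(4)
# random stand-in for a binarized vessel map; 316 x 316 = 99,856 units
vessels = rng.random((316, 316)) < 0.03
activity_before = vessels.mean()
dilated = binary_dilate(vessels, steps=2)
activity_after = dilated.mean()          # dilation raises the activity
```

Dilation never deactivates a pixel, so the vascular topology is preserved while the activity bias is reduced.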
The modified retina images were then used as real input patterns to evaluate the network's retrieval ability at T = 0 (noiseless retrieval), with the connectivity fixed. The results, presented in Figure 6, show that the fixed-point overlap decreases with the number of stored patterns, in agreement with Figure 2. Moreover, the overlap is only weakly affected by the level of initial noise, indicating that the retrieval state has a large basin of attraction and demonstrating robustness for structured biometric input patterns.
A similar analysis was performed using binarized fingerprint images derived from the FVC2004: Third Fingerprint Verification Competition dataset [29] as real input patterns. The original images are grayscale and were first binarized by gray-level thresholding to emphasize ridge structures. The initial mean activity, measured as the proportion of white (active) pixels, was approximately 0.23, reflecting the intrinsic sparsity of fingerprint ridge patterns. To reduce this bias and improve comparability with the random and retina-based patterns, the binarized images were morphologically dilated, yielding more balanced activity distributions while preserving the overall ridge topology.
The images were then cropped to retain the regions containing the most relevant fingerprint information, with the final resolution corresponding to 89,420 binary units per pattern. As in the retina case, the network connectivity was fixed, and the retrieval performance was evaluated at T = 0 (noiseless retrieval).
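The gray-level thresholding step can be sketched as follows. The threshold value and the random stand-in image are illustrative assumptions of ours; the actual preprocessing parameters used for the FVC2004 images are not reproduced here.

```python
import numpy as np

def binarize(gray, threshold):
    """Global gray-level thresholding: dark (ridge) pixels become active
    units. The threshold value is an illustrative assumption."""
    return gray < threshold

rng = np.random.default_rng(5)
gray = rng.random((300, 300))        # random stand-in for a fingerprint
ridges = binarize(gray, 0.5)
activity = ridges.mean()             # proportion of active pixels
```

On real fingerprint images, the resulting activity is well below one half, which is why the subsequent dilation step is applied.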
The results, shown in Figure 7, exhibit the same qualitative behavior observed with the retinal patterns: the fixed-point overlap m decreases as the number of stored patterns p increases, while remaining relatively robust to variations in the initial noise level. These findings confirm that the FCANN model can successfully retrieve complex biometric patterns beyond retinal structures, further supporting the generality and robustness of the theoretical predictions when applied to structured, real-world data.
8. Discussion and Conclusions
Biological neural networks are extremely diluted, in the sense that each neuron interacts directly with only a small fraction of the neuron population. Even so, the average connectivity amounts to several thousand connections. Although there is no formal limit to the finite connectivity theory, in practice it is difficult to reach the biological regime due to computational limitations: the computing time is roughly proportional to the average connectivity. Nevertheless, the results presented in this paper may be significant for biological neural networks and, beyond that, relevant to applications in artificial neural networks, where the connectivity rarely approaches that of the biological realm.
With respect to the previous work [5], we present the equations in an alternative form that allows the application of population dynamics and the explicit evaluation of the order parameters as functions of the connectivity, the learned patterns, and the temperature. The information entropy was also calculated, and we searched for the optimal information capacity as a function of the connectivity. Theoretical results were compared with simulations and biometric data.
Special attention was dedicated to the T versus load phase diagram. Using the two-replica method, the AT line was investigated for three representative c values. It was observed that for sufficiently low connectivity, replica symmetry breaking effects, such as the re-entrant behavior, are absent. This suggests that the finite connectivity calculation is indeed an improvement over the classical fully connected model. The RS transition from the unstable R phase to the SG phase was also presented.
Numerical simulations with random and real input patterns were compared to the theory, showing good agreement in the prediction of the R-PM transition, as well as in the information content. Nevertheless, there is only partial agreement in the prediction of the R'-SG transition. This subject deserves further investigation.
Returning to biological and artificial neural networks, the results of this work also offer ideas about efficient information storage strategies. The energetic cost associated with storage must be related both to the network units (the neurons) and to the wiring (the couplings). The results show that the maximal information "per coupling" is obtained for p = 1 and 2, with low connectivity. If the wiring is more expensive than the units themselves, this implies that low connectivity networks are more efficient at storing information than densely connected networks. Meanwhile, the results also show that the maximal information decreases slowly with increasing p. This means that, if the relative cost of wiring to units is not too large, a densely connected network could be an efficient strategy. Independently of the relative wiring and unit energy costs, an efficient neural network should operate in the range where the information is close to its maximum. It is worth remarking that this range lies within the region where the RS solution is stable, which makes it a significant result.
The effects of thermal noise on the information storage capacity were also investigated. The results show that a low level of noise (low T) is not harmful, and may even be beneficial, which means that sparse networks are robust against thermal noise.