Inpainted Image Reconstruction Using an Extended Hopfield Neural Network Based Machine Learning System

This paper considers the use of a machine learning system for the reconstruction and recognition of distorted or damaged patterns, in particular, images of faces partially covered with masks. The most up-to-date image reconstruction structures are based on constrained optimization algorithms and suitable regularizers. In contrast with the above-mentioned image processing methods, the machine learning system presented in this paper employs the superposition of system vectors setting up asymptotic centers of attraction. The structure of the system is implemented using Hopfield-type neural network-based biorthogonal transformations. The reconstruction property gives rise to a superposition processor and reversible computations. Moreover, this paper’s distorted image reconstruction sets up associative memories where images stored in memory are retrieved by distorted/inpainted key images.


Introduction
Machine learning, a sub-field of artificial intelligence, deals with algorithms that build mathematical models to automatically make decisions or predictions based on sample data called training sets. The concept of learning is the key to understanding intelligence in both biological brain structures and machines. The aim of machine learning is to create mappings y = F(x), y ∈ R^m, x ∈ R^n, generated by training sets S = {(x_i, y_i)}, i = 1, . . . , N, the vectors of which serve as approximation nodes; hence, the training points are the pairs (x_i, y_i).

The machine learning model described in a previous paper [1] was derived from an extended Hopfield neural network and is based on spectral analysis that uses biorthogonal and orthogonal transformations. It should be emphasized that this system has a universal character that enables the implementation of basic functions of learning systems, such as pattern association, pattern recognition, and inverse modeling. One of the aforementioned properties of this model is the recognition and reconstruction of image patterns. In [1], we presented an example in which the object of the reconstruction was an incomplete, inpainted image of a subject named Lena. Such examples of reconstruction allow for the development of a system, based on the above-mentioned model of machine learning, that can recognize people wearing masks. It is worth noting that this model of machine learning represents an alternative to classical image reconstruction/restoration systems, which make use of such processing tools as inverse modelling, deconvolution, Wiener filters, and PCA (Principal Component Analysis) [2][3][4][5].
Classical image reconstruction systems are currently being intensively supplemented and replaced by those using neural, neuro-fuzzy architecture, and algorithms, especially in medical applications [6][7][8][9][10]. A comprehensive review of recent advances in image reconstruction can be found in [11]. Current research is focused on sparsity, low-rankness, and machine learning [12,13]. It is worth noting that the most up-to-date image reconstruction structures are based on constrained optimization algorithms and adequate regularizers [14].
Recently, deep learning algorithms have been driving a renaissance of interest in neural network research and applications (e.g., image processing). Most of the known deep learning algorithms are implemented in the form of ANNs (deep learning neural networks, DLNNs) trained from training-set data by minimizing loss functions. Thus, the deep learning approach can be seen as a special topic in optimization theory. Standard types of deep learning neural networks include multilayer perceptrons (MLPs), convolutional neural networks (CNNs), recurrent neural networks (RNNs), and generative adversarial networks (GANs) [15][16][17][18]. However, an optimal network topology and implementation technology have not yet been selected (the generalizability of networks is not well understood, and there is a lack of explanation of the relationship between network topology and performance [19]). Nevertheless, we claim that ANNs should constitute both universal algorithmic and physical models used in computational intelligence. It is clear that Hopfield-type neural networks are both physical and algorithmic models suitable for neural computations. Hence, we considered an extended Hopfield-type model of the neural network defined by Equations (1) and (2), where W is a skew-symmetric orthogonal matrix; W_s is a real symmetric matrix; 1 is the identity matrix; θ(x) is the vector of activation functions; I_d is the input vector; and ε, w_0, and η are parameters.
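As a rough illustration of a Hopfield-type equilibrium, the following Python sketch iterates a generic discrete recurrence x ← θ(W_s x + I_d) to a fixed point. The tanh activation, the scaling of W_s, and the dimensions are assumptions made for illustration only; the paper's full model additionally involves the skew-symmetric orthogonal matrix W and the parameters ε, w_0, and η, which are not reproduced here.

```python
import numpy as np

# Sketch of a discrete Hopfield-type recurrence (hypothetical parameter
# choices; the paper's exact update is not reproduced in this sketch).
rng = np.random.default_rng(0)

n = 8
A = rng.standard_normal((n, n))
S = A + A.T                               # real symmetric matrix
W_s = 0.45 * S / np.linalg.norm(S, 2)     # ||W_s|| < 1 ensures a contraction
I_d = rng.standard_normal(n)              # constant input vector

def theta(x):
    """Vector of activation functions (tanh assumed here; 1-Lipschitz)."""
    return np.tanh(x)

x = np.zeros(n)
for _ in range(200):                      # iterate until an equilibrium
    x_next = theta(W_s @ x + I_d)
    if np.linalg.norm(x_next - x) < 1e-12:
        break
    x = x_next

# At equilibrium, x satisfies the fixed-point equation x = theta(W_s x + I_d).
residual = np.linalg.norm(x - theta(W_s @ x + I_d))
```

Because the spectral norm of W_s is below one and tanh is 1-Lipschitz, the iteration is a contraction and converges to a unique equilibrium regardless of the starting point.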
The equilibrium equation of neural network (2), i.e., Equation (3), gives rise to universal models of machine learning based on biorthogonal transformations, enabling the realization of common learning-system functions. One of these functions is the implementation of associative memories. Thus, this paper's inpainted image reconstruction system sets up associative memories where images stored in memory are retrieved by distorted/inpainted key images. To summarize, we propose a machine learning model that uses biorthogonal transformations based on spectral processing as an alternative solution to deep learning based on optimization procedures.
The rest of this paper is structured as follows: Section 2 provides details on the proposed learning algorithm and presents a structure of the machine learning system for image processing. Section 2 also contains some results of computational verification using MATLAB software. Section 3 includes some results of image processing as an inverse problem. Some unique properties of this machine learning system are discussed in Section 4. The conclusions underline the main features of the machine learning system presented in the article.

Machine Learning System for Image Processing
We consider a set of N black-and-white images represented by m rows and n columns, i.e., a set of (m·n) pixels with different shades of gray. For vector analysis, each image is transformed by concatenating its m rows to form the column vector x_i (m·n × 1), i = 1, . . . , N. Thus, the set of N images is represented by the matrix X given in Equations (4) and (5). The set of distorted images is given by the matrix in Equation (6). It is straightforward to observe that the training set S is given by Equation (7); S creates a mapping F(·) defined by the properties in Equations (8) and (9). Thus, the mapping F is implemented as a machine learning system for image reconstruction. The structure implementing the mapping F(·) defined by Equations (8) and (9) can be obtained from the solutions of the equilibrium Equation (3). Thus, for w_0 = 2, ε = 1 in Equation (3), one gets Equation (10), where W_{2^k}, with W_{2^k}^2 = −1, is a skew-symmetric, orthogonal matrix. Hence, the N solutions are given by Equation (11), expressed in terms of the spectrum matrix of the given vectors x_i. Equation (11) can be seen as the determination of a biorthogonal transformation T_s(·) (Equation (16)), and Equation (14) can be seen as an orthogonal transformation (Equation (17)). The transformations T_s(·) and T^{-1}(·), arranged as a realization of the mapping F(·), have the block structure shown in Figure 1 [1]. The orthogonal transformation T(·), which makes use of the Hurwitz-Radon matrix family [20], allows for determining the Haar-Fourier spectra of the system vectors x_i.
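The vectorization step above (concatenating the m rows of each image into one column vector and stacking these columns into the matrix X) can be sketched as follows; the sizes are illustrative, not the paper's data.

```python
import numpy as np

# Sketch: building the image matrix X by concatenating the m rows of each
# m x n image into a single (m*n x 1) column vector (illustrative sizes).
rng = np.random.default_rng(1)
m, n, N = 4, 5, 3
images = [rng.integers(0, 256, size=(m, n)) for _ in range(N)]

# Row-major flattening turns each image into the column vector x_i.
X = np.column_stack([img.reshape(m * n) for img in images])

assert X.shape == (m * n, N)
# Column i of X restores image i when reshaped back to m x n.
assert np.array_equal(X[:, 0].reshape(m, n), images[0])
```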
In the system, due to the iterative nature of the feedback loop, the convergence of vectors given by Equation (20) is obtained. The convergence determined by Equation (20) is achieved in K iterations (K depends on the reconstruction problem; see the example below). Moreover, it should be noted that for an input image z ≠ x_i, i = 1, . . . , N, the output of the system is given by the superposition of system vectors in Equation (21). The system vectors x_i set up the attraction centers. The structure in Figure 1 can also be represented as the lumped memory model in Figure 2. It is worth noting that this structure gives rise to the realization of an AI analog processor; however, this topic is beyond the scope of this paper. The synthesis algorithm of the system given in Figure 1 can be found in Appendix A.
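The superposition property can be illustrated numerically: for an input that is not one of the stored vectors, the response lies in the span of the stored vectors, while a stored vector is reproduced exactly (it is an attraction center). The least-squares projection below is only an analogy for the iterative behavior described above, not the paper's exact algorithm.

```python
import numpy as np

# Sketch of the superposition property (Equation (21)): the output for an
# unstored input z is a superposition sum_i c_i x_i of the stored vectors,
# modelled here as a least-squares projection onto their span.
rng = np.random.default_rng(2)
n, N = 100, 5
X = rng.standard_normal((n, N))        # columns are stored vectors x_i

z = rng.standard_normal(n)             # an image not stored in the system
c, *_ = np.linalg.lstsq(X, z, rcond=None)
y = X @ c                              # superposition of system vectors

# A stored vector is an attraction center and is reproduced exactly.
c0, *_ = np.linalg.lstsq(X, X[:, 0], rcond=None)
assert np.allclose(X @ c0, X[:, 0])
```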

Computational Verification of the Learning Algorithm: Examples of Face Image Reconstruction and Person Recognition
A. The machine image processing system described in the previous section was used to reconstruct and classify a set of images. The system task was to reconstruct a complete face image based on a masked photo (the mask was applied by software) and to assign the reconstructed image to a specific person. In the system, photos of 9 faces (N = 9) were stored in memory in the form of 64 × 64 matrices defining the degree of grayness of the individual image pixels. The saved face images are presented in Figure 3. For vector analysis, each image was transformed by concatenating its 64 rows into the form of a column vector x_i (64·64 × 1), i = 1, . . . , 9. After transformation, the set of 9 images was represented by the matrix X (4096 × 9). In the experiments, the identification numbers 1, 2, . . . , 9 were assigned to the images.
The system vectors u_i = x_i, i = 1, 2, . . . , 9, were used to construct the machine learning system according to the procedure described in the previous section. Examples of the reconstruction of photos of people wearing masks are shown in Figure 4. The results in Table 1 show that in most cases, the value of the index rounded to the nearest integer corresponds to the nominal value. Thus, the system correctly identifies each person with the exception of Photo Number 3, where the person is incorrectly recognized. Increasing the number of iterations did not change the index, as the process quickly converges to the final value. The convergence of the iterative process is illustrated in Table 2, which presents the index values obtained after successive iterations. The experiment was carried out for Photo Number 2, with a nominal index value of 2.0. A significant result that confirms the principle of the proposed system was obtained by presenting a photo that was not saved in the system. The response shown in Figure 5 is a superposition of the photos stored in the system's memory (Equation (21)). The mean squared errors (MSEs) are listed in Table 3. The second column of the table shows the MSE describing the difference between the original and masked photos, whereas the third column shows the error describing the difference between the original photo and the photo after reconstruction. Each time an image saved in the system was analyzed, the mean squared error decreased. For the reconstruction attempt shown in Figure 5, which uses an image not saved in the system, the MSE is 2950.90. The fractional value of the index in Table 1 reflects the system's operating mechanism, which is a weighted combination of the numbers 1, . . . , 9.

B.
In the case of another masking method, as illustrated in Figure 6, the set of distorted images is given, according to relationship (6), by the matrix in Equation (22). The image reconstruction process for Figure 6 is illustrated in Figure 8, which shows the results obtained after 1, 2, 5, 10, and 100 iterations. After 100 iterations, the reconstruction MSE is 0, and the identification index is 9.0. It is worth comparing the above values with the data for Photo Number 9 presented in Tables 1 and 3. It is worth noting that, as mentioned in the Introduction, the reconstruction of Lena's photo was realized using the structure presented in Figure 7 [1] as well. For example, one of the distorted images of Lena and its reconstruction are shown in Figure 9. The potential reconstruction of a distorted image using the structure in Figure 1 is illustrated for Photo Number 9 (Figure 3) by superimposing a noise vector generated using the RAND function in MATLAB. The measure of this distortion is the signal-to-noise ratio expressed in decibels. The results of such a reconstruction are presented in Table 4 and Figure 10. Based on Table 4, the machine learning system correctly and automatically identified the distorted image at S/N > 10 dB. Yet, even at S/N = 2.7 dB, significant similarity between the reconstructed image and the saved original photo is observed.
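The noise-superposition experiment can be sketched as follows: white noise is scaled so that the signal-to-noise ratio in decibels hits a prescribed target before being added to the vectorized image. The data are synthetic (the paper uses MATLAB's RAND function on the stored photos).

```python
import numpy as np

# Sketch of the distortion experiment: superimposing a random noise vector
# on a vectorized image at a prescribed S/N ratio in decibels.
rng = np.random.default_rng(4)
x = rng.integers(0, 256, size=4096).astype(float)   # vectorized 64x64 image

def add_noise(signal, snr_db, rng):
    """Scale white noise so that 10*log10(P_signal/P_noise) equals snr_db."""
    noise = rng.standard_normal(signal.shape)
    p_signal = np.mean(signal ** 2)
    p_noise_target = p_signal / (10 ** (snr_db / 10))
    noise *= np.sqrt(p_noise_target / np.mean(noise ** 2))
    return signal + noise

y = add_noise(x, 10.0, rng)
achieved = 10 * np.log10(np.mean(x ** 2) / np.mean((y - x) ** 2))
```

Because the noise power is normalized exactly, the achieved S/N matches the target to machine precision.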
The numerical data in Table 4, plotted as MSE vs. S/N, form the plots presented in Figure 11.
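The MSE comparison underlying Tables 3 and 4 can be sketched as follows. The data are synthetic, and the near-perfect "reconstruction" is simulated; the point is only the error measure and the expected ordering between the masked and reconstructed images.

```python
import numpy as np

# Sketch of the MSE comparison: error between the original and masked
# image versus error between the original and its reconstruction.
rng = np.random.default_rng(3)
original = rng.integers(0, 256, size=(64, 64)).astype(float)

masked = original.copy()
masked[24:40, 16:48] = 0              # software-applied "mask" region

# A simulated near-perfect reconstruction (small residual noise).
reconstructed = original + rng.normal(0, 1, size=(64, 64))

def mse(a, b):
    return np.mean((a - b) ** 2)

# Reconstruction should bring the error well below the masked-image error.
assert mse(original, reconstructed) < mse(original, masked)
```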

Inpainted Image Recognition and Reconstruction as an Inverse Problem
The image reconstruction models presented in the previous sections are based on the availability of training sets S in Equations (7) and (22) containing original and damaged patterns. Alternatively, a common model of image reconstruction is given by Equation (23), where A is a known processing operator (for example, a matrix), x is the original image, and y is the observed degraded image.
According to Equation (23), the reconstruction of an image leads to solving an inverse problem. Most of the solutions to Equation (23) in the literature use an optimization formulation [14,21], for example, Equation (24), where K is the set of feasible solutions, R(x) is a regularizer, and β is a regularization parameter. As mentioned above, different types of neural networks are currently used to solve inverse problems in imaging, including image reconstruction. Many approaches to this problem can be found in recent reviews [22,23] and in the novel proposal in [24].
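The regularized formulation can be sketched with the common Tikhonov choice R(x) = ||x||², which admits the closed-form solution x = (AᵀA + βI)⁻¹Aᵀy. This particular regularizer is an illustrative assumption; the paper leaves R(x) generic.

```python
import numpy as np

# Sketch of min ||Ax - y||^2 + beta * R(x) with the Tikhonov regularizer
# R(x) = ||x||^2 (an illustrative choice, not the paper's).
rng = np.random.default_rng(5)
m, n = 30, 20
A = rng.standard_normal((m, n))          # known processing operator
x_true = rng.standard_normal(n)          # original image (vectorized)
y = A @ x_true + 0.01 * rng.standard_normal(m)  # observed degraded image

beta = 1e-3                               # regularization parameter
x_hat = np.linalg.solve(A.T @ A + beta * np.eye(n), A.T @ y)

rel_err = np.linalg.norm(x_hat - x_true) / np.linalg.norm(x_true)
```

With low noise and a small β, the regularized solution recovers the original vector closely; increasing β trades fidelity for stability when A is ill-conditioned.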
The use of the machine learning model shown in Figure 1 to solve Equation (23) leads to the solution of the problem in Equation (25). The case of m = n is still under consideration. The generation of the training set for Equation (23) is given by Equation (26), where x_i, i = 1, 2, . . . , N, is the vector form of the training images, for example, those shown in Figure 3.
Assuming that the matrix A (m × n) in projection (25) is a random matrix, the images y_i of the training set become random vectors. For example, Training Image Number 1 takes the form shown in Figure 12. Thus, the vector form of the transformation of this image (No. 1) is given by Equation (27), where A is an (m × n) matrix with m > n.
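Generating such a training set can be sketched as below: each training image x_i is projected by a random matrix A to give y_i = A x_i, which looks like a random vector. The dimensions are illustrative, and a square, invertible A is assumed here purely so that the exact inverse can be demonstrated (the paper considers m > n, with m = n under separate consideration).

```python
import numpy as np

# Sketch of Equation (26): training pairs (y_i, x_i) with y_i = A x_i,
# where A is a random projection matrix (illustrative dimensions).
rng = np.random.default_rng(6)
n = 256                                # vectorized image length (illustrative)
N = 9                                  # number of training images
A = rng.standard_normal((n, n)) / np.sqrt(n)

X = rng.integers(0, 256, size=(n, N)).astype(float)   # training images x_i
Y = A @ X                              # columns y_i: random-looking projections

# With a known, invertible A, each image is recovered exactly from y_i.
x_rec = np.linalg.solve(A, Y[:, 0])
assert np.allclose(x_rec, X[:, 0])
```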
Taking the system vectors u_i of the form given in Equation (28), the structure of the inverse mapping system (25), i.e., Equation (29), is shown in Figure 13a,b. It should be noted that the biorthogonal transformation T_s(·) and the orthogonal transformation T(·) in Figure 13 are given by Equations (16) and (17), respectively, where the u_i are the system vectors of Equation (28).
In the system presented in Figure 13, the distorted projections of the images y_i, i = 1, . . . , N, undergo reconstruction, in contrast with the system in Figure 7, where the distorted images themselves are reconstructed. To illustrate the properties of the reconstruction system presented in Figure 13, a training set S was generated using Equation (26), consisting of nine images x_i, i = 1, . . . , 9, where the x_i were the images from Figure 3, and their projections y_i, i = 1, . . . , 9, were obtained with the random matrix A. An exemplary transformation of Image Number 9 from Figure 3 is shown in Figure 14. In the system shown in Figure 13b, we obtain the corresponding reconstruction. To conclude, Figure 1, Figure 7, and Figure 11 show image reconstruction systems that essentially implement an associative memory structure for recognizing damaged key patterns. On the other hand, the system in Figure 13 implements an inverse transformation and solves optimization tasks constrained by the images stored in memory. Moreover, this system enables solving the linear Equation (23) using a random form of the training vectors x_i in Equation (26) [1] as well.

Discussion on Some Features of the Machine Learning System
A. This section focuses on some of the features that underlie the universality of the machine learning system presented in Figures 1 and 7. First of all, it is clear that this machine learning system can be categorized as an iterative scheme. On the other hand, the structure in Figure 1 can be treated as a feedforward block connection constituting a multilayer, deep learning architecture (Figure 15). The structure in Figure 7 can similarly be treated as a feedforward scheme, as shown in Figure 16.
It is worth noting that the multilayer structures in Figures 15 and 16 can be seen as an implementation of deep learning using recurrent neural networks (RNNs) [25]. However, the topology of these structures is not the result of the optimization algorithms typically used to solve inverse problems.
B. An interesting property of the structure in Figure 1 can be established by the computational experiment illustrated in Figure 17; i.e., when this structure memorizes only one image (e.g., Photo No. 2 in Figure 3), then any input image is mapped onto this memorized image (a global attractor property).

C. Another interesting aspect of this machine learning system can be derived from the so-called Q-inspired neural network feature [1]. This feature can be stated as follows: given a set of complex-valued training vectors {(x_i, y_i)}, i = 1, . . . , N, where x_i ∈ C^n, y_i ∈ C^m, n + m = 2^k, k = 3, 4, . . ., a realization of the mapping given by the complex training vectors, i.e., C^n → C^m, can be implemented as a complex-valued neural network or as a complex-valued machine learning system with the structure presented in Figure 2, where the memory block is determined by the Hermitian matrix W_H (W_s → W_H in Equation (3)).
Such a machine learning system can be used as an image processor to reconstruct complex-valued images. It is clear that the computational efficiency of this system is greater than that of the real-valued approximator (due to the processing of two images by only one system). Figure 18 provides an example of complex-valued image reconstruction. It should be clear that this image processing sets up only one aspect of the system's potential applicability in the field of signal processing. For example, the same machine learning system could be used for time-series analysis and forecasting. To generalize, the essential function of the machine learning system described in this paper is the implementation of a mapping defined by a training set S = {(x_i, y_i)}, i = 1, . . . , N, where dim x_i = n and dim y_i = m. The recurrence is convergent under the linear independence of the input vectors, and the number of vectors N fulfills (see Equation (5)) N < 0.5(n + m), n + m = 2^k, k = 2, 4, . . .. Thus, a large-capacity system (large N) needs a large, even dimension (n + m). This could be considered a disadvantage of this machine learning system.
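The efficiency gain of the complex-valued system comes from packing two real images into one complex vector. The sketch below shows only this packing and the exact recovery of both images; the memory block with the Hermitian matrix W_H is not modelled here.

```python
import numpy as np

# Sketch of the complex-valued processing idea: two real images packed
# into one complex vector are handled by a single system, doubling
# throughput (the reconstruction step itself is assumed ideal here).
rng = np.random.default_rng(7)
img_a = rng.integers(0, 256, size=(64, 64)).astype(float)
img_b = rng.integers(0, 256, size=(64, 64)).astype(float)

z = img_a.reshape(-1) + 1j * img_b.reshape(-1)   # one complex-valued vector

# After (ideal) reconstruction, both images are read off the two parts.
rec_a = z.real.reshape(64, 64)
rec_b = z.imag.reshape(64, 64)
assert np.array_equal(rec_a, img_a) and np.array_equal(rec_b, img_b)
```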

Conclusions
The aim of this article was to illustrate the potential for using the machine learning system shown in Figure 1 to reconstruct and recognize distorted or damaged patterns, in particular, images of people wearing masks. In contrast to image reconstruction methods based on optimization algorithms, this system employs the superposition of system vectors setting up asymptotic centers of attraction. Hence, this system is particularly useful for the implementation of associative memories. Thus, this paper's inpainted image reconstruction sets up associative memories where images stored in memory are retrieved by distorted/inpainted key images. To conclude, we have formulated another image processing tool augmenting the set of known image processing methods. Finally, all the image reconstructions presented in this paper were performed using MATLAB (The MathWorks, Inc., version R2021b).

Conflicts of Interest:
The authors declare no conflict of interest.