Article

A Lagrange Programming Neural Network Approach with an ℓ0-Norm Sparsity Measurement for Sparse Recovery and Its Circuit Realization

1 College of Electronics and Information Engineering, Shenzhen University, Shenzhen 518060, China
2 Department of Electrical Engineering, City University of Hong Kong, Hong Kong
3 Department of Electrical and Electronic Engineering, Imperial College, London SW7 2BX, UK
* Author to whom correspondence should be addressed.
Mathematics 2022, 10(24), 4801; https://doi.org/10.3390/math10244801
Submission received: 25 November 2022 / Revised: 12 December 2022 / Accepted: 13 December 2022 / Published: 16 December 2022
(This article belongs to the Special Issue Mathematics and Its Applications in Science and Engineering II)

Abstract: Many analog neural network approaches for sparse recovery are based on using the ℓ1-norm as a surrogate of the ℓ0-norm. This paper proposes an analog neural network model, namely the Lagrange programming neural network with ℓp objective and quadratic constraint (LPNN-LPQC), with an ℓ0-norm sparsity measurement for solving the constrained basis pursuit denoise (CBPDN) problem. As the ℓ0-norm is non-differentiable, we first use a differentiable ℓp-norm-like function to approximate the ℓ0-norm. However, this ℓp-norm-like function does not have an explicit expression and, thus, we use the locally competitive algorithm (LCA) concept to handle the nonexistence of the explicit expression. With the LCA approach, the dynamics are defined by the internal state vector. In the proposed model, the thresholding elements are not conventional analog elements in analog optimization, so this paper also proposes a circuit realization for the thresholding elements. On the theoretical side, we prove that the equilibrium points of our proposed method satisfy the Karush-Kuhn-Tucker (KKT) conditions of the approximated CBPDN problem, and that the equilibrium points of our proposed method are asymptotically stable. We perform a large-scale simulation on various algorithms and analog models. Simulation results show that the proposed algorithm is better than or comparable to several state-of-the-art numerical algorithms, and that it is better than state-of-the-art analog neural models.
MSC:
94A12; 68T01; 68T07

1. Introduction

1.1. Background

The last few decades have seen increasingly rapid advances in analog neural networks for solving optimization problems. In an analog neural network, the state transitions of neurons are governed by some differential equations. After the dynamics of the network converge to an equilibrium point, the solution of the problem is obtained from the state of the neurons. As noted by many neural network pioneers [1,2,3,4,5,6], this approach is very attractive when real-time solutions are required.
The research on the analog neural network approach can be dated back to the 1980s [2]. One of the earliest analog networks is the Hopfield model [2]. Early applications of the Hopfield model were analog-to-digital conversion and the traveling salesman problem. Later, many analog models [3,4,5,6,7] for various optimization problems were proposed. Over the last decade, many new applications of analog neural models were investigated, including image processing, sparse approximation [5,8], mobile target localization [9,10], and feature selection [6]. Recently, several analog techniques [5,8,11,12,13] were designed for solving sparse recovery problems.
In sparse recovery [8,14,15,16,17,18], the aim is to recover an unknown sparse vector x ∈ R^n from an observation vector b ∈ R^m. Many real-life signals have sparse internal representations [19]. For example, audio signals are approximately sparse in the time-frequency domain [20]. Sparse recovery techniques can be used in many signal processing applications. For instance, we can use sparse recovery techniques for image restoration [18,21,22]. Additionally, they can be used to remove the stripes in hyperspectral images [23]. In the inverse synthetic aperture radar (ISAR) application [24,25], a high-quality image of an object can be obtained from the Fourier-transformed signal based on sparse recovery techniques. Another application of sparse recovery is to process electrocardiogram (ECG) signals for the classification of various heart diseases [26].
One of the sparse recovery problems is the following ℓ0-norm optimization problem:

\min_{x} \|x\|_0, \quad \text{subject to} \quad b = \Phi x, \qquad (1)

where Φ ∈ R^{m×n} is the measurement matrix. When there is some measurement noise in b, the problem becomes the constrained basis pursuit denoise (CBPDN) problem:

\min_{x} \|x\|_0, \quad \text{subject to} \quad \|b - \Phi x\|_2^2 \le m\theta^2, \qquad (2)

where θ > 0 is the standard deviation of the observation noise. Since the ℓ0-norm is difficult to handle, the ℓ1-norm is usually used to replace the ℓ0-norm. The problems stated in (1) and (2) then become

\min_{x} \|x\|_1, \quad \text{subject to} \quad b = \Phi x, \qquad \text{and} \qquad \min_{x} \|x\|_1, \quad \text{subject to} \quad \|b - \Phi x\|_2^2 \le m\theta^2, \qquad (3)
respectively. In the last two decades, many ℓ1-norm-based numerical algorithms were proposed, such as BPDN-interior [27] and Homotopy [28]. In addition, elegant implementation packages [29,30] are available, such as SPGL1 [30].
Although the aforementioned ℓ1-norm relaxation approaches are well studied, they have some drawbacks. For instance, in the BPDN-interior algorithm, the solution vector contains many small non-zero elements [31]. As mentioned in [32,33], the ℓp-norm (0 < p < 1) is a better choice for replacing the ℓ0-norm. However, the ℓp-norm is a non-convex function, which introduces complex behaviours into the problem-solving process. Therefore, we can use some approximation functions to replace the ℓp-norm term, such as the minimax concave penalty (MCP) function [34,35]. In addition, there are other methods which directly handle the ℓ0-norm, namely the normalized iterative hard thresholding (NIHT) method [36], approximate message passing (AMP) [37], ℓ0-norm zero attraction projection (ℓ0-ZAP) [38], ℓ0-norm alternating direction method of multipliers (ℓ0-ADMM) [39,40], and expectation-conditional maximization either (ECME) [41]. All the ℓp-norm or ℓ0-norm techniques mentioned in this paragraph are digital numerical algorithms.

1.2. Motivation

Apart from using numerical methods to solve the sparse recovery problem, we can consider using the analog neural approach [5,8,11,12,13,21]. However, the analog models in [5,8,11,12,13,21] were developed based on ℓ1-norm relaxation techniques. Since the ℓ1-norm is only a surrogate of the ℓ0-norm, directly using the ℓ0-norm or an ℓ0-like norm usually leads to better performance. There is indeed an ℓ0-norm-based analog model, namely the locally competitive algorithm (LCA) [31]. However, it was designed for unconstrained sparse recovery problems only.
As directly working with the ℓp-norm or ℓ0-norm usually results in better performance, it is interesting to develop some ℓp-norm or ℓ0-norm analog models for sparse recovery problems with constraints. Another shortcoming of the existing ℓ1-norm relaxation neural models [8,11,12,13] is that the corresponding circuit realizations, especially the circuits for the projection and thresholding operations, were not discussed.

1.3. Contribution and Organization

This paper focuses on using the analog technique to solve the CBPDN problem with the ℓ0-norm objective stated in (2). Strictly speaking, the ℓ0-norm is not a norm, and it is not differentiable. These properties create difficulties in constructing an analog model for the CBPDN problem.
The paper proposes an ℓp-norm-like function for representing the sparsity measurement. The proposed ℓp-norm-like function is differentiable. We then apply the Lagrange programming neural network (LPNN) framework [8,42] to solve the CBPDN problem. The proposed ℓp-norm-like function does not have a simple expression, but its derivative does. Hence, we borrow the internal state concept from the LCA [31] to construct an LPNN model for the CBPDN problem. We call our model "LPNN with ℓp objective and quadratic constraint (LPNN-LPQC)".
In developing an analog neural model, one of the difficulties is to analyze the behaviour of the analog neural network dynamics, especially for a non-convex objective function without an explicit expression. The paper discusses the stability of the proposed LPNN-LPQC model. We theoretically prove that the equilibrium points of the proposed LPNN-LPQC satisfy the Karush-Kuhn-Tucker (KKT) conditions of the approximated CBPDN problem. We use the term "approximated" because we have applied an approximation to the ℓ0-norm. In addition, we prove that the equilibrium points of the proposed method are asymptotically stable.
Unlike some existing analog neural results which do not discuss the circuit realization [8,11,12,13], this paper also discusses the circuit realization of the proposed model. In particular, the detailed circuit design for the thresholding element is given. We then use MATLAB Simulink to verify our design. In the verification, we find that the MATLAB Simulink results are nearly the same as the corresponding results obtained from the discretized dynamic equations.
This paper also presents a large-scale simulation. The simulation results show that the proposed LPNN-LPQC is better than or comparable to several state-of-the-art digital numerical algorithms, and that it is better than state-of-the-art analog neural networks.
The rest of this paper is organized as follows. Backgrounds on the LPNN and LCA models are described in Section 2. In Section 3, the proposed LPNN-LPQC is developed. Section 4 discusses the circuit realization of the thresholding element. Section 5 discusses the LPNN-LPQC's stability. Simulation results and comparisons with state-of-the-art numerical algorithms are provided in Section 6. Finally, conclusions are drawn in Section 7.

2. LPNN Framework and LCA

2.1. LPNN

The LPNN technique [42] can be used in many applications, such as locating a target in a radar system [9], ℓ1-norm-based sparse recovery [8], and ellipse fitting [43]. It was developed for solving a general non-linear constrained optimization problem:

\min_{x} \phi(x), \quad \text{subject to} \quad h(x) = 0, \qquad (4)

where ϕ: R^n → R is the objective, and h: R^n → R^m (m < n) represents m equality constraints. In the LPNN approach, we first set up a Lagrangian function, given by

L_{eq}(x, \lambda) = \phi(x) + \lambda^{\top} h(x), \qquad (5)

where λ = [λ_1, ..., λ_m]^⊤ is the Lagrange multiplier vector. An LPNN has two classes of neurons: variable neurons for holding x and Lagrange neurons for holding λ. Its neural dynamics are

\mu \frac{dx}{dt} = -\frac{\partial L_{eq}}{\partial x}, \quad \text{and} \quad \mu \frac{d\lambda}{dt} = \frac{\partial L_{eq}}{\partial \lambda}, \qquad (6)

where μ is the characteristic time constant. Without loss of generality, we consider that μ is equal to 1. With (6), when some mild conditions [42] hold, the network settles down at a stable state. The main restriction of using the LPNN framework is that ϕ(x) and h(x) should be differentiable.
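As a concrete illustration of the LPNN recipe in (4)–(6), the following minimal sketch integrates the two dynamics with a forward-Euler loop for a toy equality-constrained quadratic problem. The problem data, step size, and iteration count are arbitrary illustrative choices and are not taken from the paper.

```python
import numpy as np

# Toy problem (illustrative): min_x 0.5*||x - c||^2  subject to  a^T x - 1 = 0.
# Lagrangian: L(x, lam) = 0.5*||x - c||^2 + lam * (a^T x - 1).
c = np.array([1.0, 2.0, 3.0])
a = np.array([1.0, 1.0, 1.0])

x = np.zeros(3)      # variable neurons
lam = 0.0            # Lagrange neuron
dt = 0.01            # Euler step standing in for the characteristic time constant

for _ in range(10000):
    grad_x = (x - c) + lam * a        # dL/dx
    grad_lam = a @ x - 1.0            # dL/dlam
    x = x - dt * grad_x               # dx/dt = -dL/dx
    lam = lam + dt * grad_lam         # dlam/dt = +dL/dlam

print(x, lam)  # x settles near the constrained minimizer [-2/3, 1/3, 4/3]
```

The same descent-on-x / ascent-on-λ structure is what the analog circuit realizes in continuous time; the Euler loop merely mimics it on a digital machine.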

2.2. LCA

The LCA [31,44] aims at solving the following unconstrained optimization problem:

\min_{x} L(x) = \frac{1}{2}\|b - \Phi x\|_2^2 + \kappa \sum_{i=1}^{n} S_{\alpha,\gamma,\kappa}(x_i), \qquad (7)

where κ Σ_{i=1}^{n} S_{α,γ,κ}(x_i) is a penalty term to improve the sparsity of the resultant x. The function κ S_{α,γ,κ}(x) does not have an exact expression. Instead, it is defined by introducing an internal state vector u. To define κ S_{α,γ,κ}(x), a thresholding function on u is first introduced:

x_i = T_{\alpha,\gamma,\kappa}(u_i) = \mathrm{sign}(u_i)\,\frac{|u_i| - \alpha\kappa}{1 + \exp(-\gamma(|u_i| - \kappa))}, \qquad (8)

where κ > 0 is the threshold and is related to the magnitude of the non-zero elements, γ ∈ (0, +∞) controls the threshold transition rate (the slope around the threshold), and α ∈ [0, 1] indicates the adjustment fraction after the internal neuron crosses the threshold. Figure 1a illustrates the shape of T_{α,γ,κ}(u_i) under various settings. With (α, γ, κ) = (1, ∞, κ), the threshold function T_{α,γ,κ}(·) is the well-known soft threshold function. Given a threshold function, the penalty function S_{α,γ,κ}(x_i) is then defined by

\kappa\, \partial S_{\alpha,\gamma,\kappa}(x_i) = u_i - x_i = u_i - T_{\alpha,\gamma,\kappa}(u_i), \qquad (9)

S_{\alpha,\gamma,\kappa}(0) = 0, \qquad (10)

where ∂S_{α,γ,κ}(x_i) is the gradient or sub-gradient of S_{α,γ,κ}(x_i).
In the vector form, (9) is written as

\kappa\, \partial\Big(\sum_{i=1}^{n} S_{\alpha,\gamma,\kappa}(x_i)\Big) = u - x = u - T_{\alpha,\gamma,\kappa}(u), \qquad (11)

where T_{α,γ,κ}(u) = [T_{α,γ,κ}(u_1), ..., T_{α,γ,κ}(u_n)]^⊤.
From (9), S_{α,γ,κ}(x_i) is defined via its derivative. Hence, in general, there is no explicit expression for S_{α,γ,κ}(x_i). To visualize it, we should use numerical integration. Figure 1b shows the sparsity measure function S_{α,γ,κ}(x_i) under various parameter settings. As shown in the figure, in some cases, such as a large γ value, the sparsity measure function S_{0,γ,κ}(x) is close to the ℓp-norm (0 < p < 1). Hence, S_{0,γ,κ}(x) is called the ℓp-norm-like sparsity function.
With the internal state concept and (9), the LCA defines the dynamics on u (rather than on x) as

\frac{du}{dt} = -\frac{\partial L(x)}{\partial x} = -\kappa\, \partial\Big(\sum_{i=1}^{n} S_{\alpha,\gamma,\kappa}(x_i)\Big) + \Phi^{\top}(b - \Phi x) = -u + x + \Phi^{\top}(b - \Phi x). \qquad (12)
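To make the LCA construction above concrete, the following sketch implements the thresholding function (8) and recovers the penalty S_{α,γ,κ} by numerically integrating its defining derivative (9), which is how curves such as those in Figure 1b can be visualized. The function names, the integration grid, and the parameter values in the example call are our own choices; the integration assumes α = 0 so that x = T(u) is strictly increasing.

```python
import numpy as np

def lca_threshold(u, alpha, gamma, kappa):
    """Thresholding function T_{alpha,gamma,kappa}(u) from (8)."""
    return np.sign(u) * (np.abs(u) - alpha * kappa) / (1.0 + np.exp(-gamma * (np.abs(u) - kappa)))

def sparsity_penalty(x_grid, gamma=100.0, kappa=1.0):
    """Visualize S by integrating kappa * dS/dx = u - T(u) along the curve x = T(u), see (9).

    Assumes alpha = 0 so that x = T(u) is strictly increasing and interpolation is valid."""
    u_grid = np.linspace(0.0, 5.0, 200001)
    x_vals = lca_threshold(u_grid, alpha=0.0, gamma=gamma, kappa=kappa)
    dS = (u_grid - x_vals) / kappa                              # dS/dx along the curve
    increments = 0.5 * (dS[1:] + dS[:-1]) * np.diff(x_vals)     # trapezoidal rule
    S_vals = np.concatenate(([0.0], np.cumsum(increments)))     # S(0) = 0, see (10)
    return np.interp(x_grid, x_vals, S_vals)

# Example: with alpha = 0 and a large gamma, S behaves like an lp-norm-like measure.
print(sparsity_penalty(np.array([0.2, 0.5, 1.0, 2.0]), gamma=100.0, kappa=1.0))
```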

3. LPNN-LPQC Model

For the proposed model, α = 0, κ = 1/2, and γ is a large positive number. The meaning of κ = 1/2 is that the magnitude of the non-zero elements in the resultant x should be greater than 1/2.
For simplicity, we use the notation S(x) to replace S_{0,γ,1/2}(x). We consider the following LPQC problem:

\min_{x} \frac{1}{2}\sum_{i=1}^{n} S(x_i), \quad \text{subject to} \quad \|b - \Phi x\|_2^2 \le m\theta^2. \qquad (13)

We first discuss some properties of S(x). Afterwards, we derive the LPNN-LPQC model and perform the stability analysis on the proposed model.

3.1. ℓp-Norm-like Sparsity Measure Function

Recall from Figure 1b that, for large γ, S(x) is similar to the ℓp-norm. When α = 0 and κ = 1/2, the threshold function becomes

x = T(u) = \mathrm{sign}(u)\,\frac{|u|}{1 + \exp(-\gamma(|u| - \tfrac{1}{2}))}. \qquad (14)

The sparsity measurement function S(x) is defined by its derivative:

\frac{1}{2}\,\partial S(x) = u - x = u - T(u), \qquad (15)

S(0) = 0. \qquad (16)

Before discussing the properties of (1/2)S(x) and T(u), we would like to make some remarks. First, the explicit expressions of (1/2)S(x) and T^{-1}(·) are not available, but the value of (1/2)∂S(x) is easily obtained from u based on (14). Second, as shown in the rest of this paper, in the implementation of the proposed model, we need to implement T(u) rather than (1/2)S(x).
We first list a number of properties of T(u) and (1/2)S(x). From (14) and basic mathematics, we have Property 1.
Property 1.
  • P1.a: T(u) is a continuous odd function.
  • P1.b: T(u) is strictly monotonically increasing.
  • P1.c: The inverse of T(u) exists.
  • P1.d: T(u) is differentiable everywhere for all real u.
Since T(u) is an odd function and monotonically increasing, from basic mathematics, S(x) has the following properties.
Property 2.
  • P2.a: (1/2)S(x) is an even function.
  • P2.b: (1/2)S(|x|) is monotonically increasing with respect to |x|.
Additionally, T(u) has the following properties.
Property 3.
Define f(u) = 1/(1 + exp(−γ(|u| − 1/2))). Thus, we have T(u) = u f(u). For a small positive ϵ, we have the following properties:
  • P3.a: If |u| ≥ 1/2 + (1/γ) log(1/ϵ), then 1/(1 + ϵ) ≤ f(u) < 1.
  • P3.b: If |u| ≤ 1/2 − (1/γ) log(1/ϵ), then 0 < f(u) ≤ ϵ/(1 + ϵ).
  • P3.c: If |u| ≥ 1/2 + (1/γ) log(γ/ϵ), then 1 < dT(u)/du ≤ 1 + (3/2)ϵ;
  • P3.d: If 0 ≤ |u| ≤ 1/2 − (1/γ) log(γ/ϵ), then 0 < dT(u)/du ≤ (3/2)ϵ.
Proof. 
The proof is exhibited in Appendix A. □
Remark 1. 
P3.a and P3.b mean that, in some regions of u, T(u) is approximately equal to either u or 0. On the contrary, the region 1/2 − (1/γ) log(1/ϵ) < |u| < 1/2 + (1/γ) log(1/ϵ) is an uncertain region. However, the uncertain region can be made arbitrarily small by choosing a sufficiently large γ.
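Properties P3.a–P3.d and the uncertain-region picture in Remark 1 can be checked numerically. The short sketch below evaluates f(u) and a finite-difference estimate of dT(u)/du just inside each of the four regions; the particular values of γ and ϵ are arbitrary test values chosen only for illustration.

```python
import numpy as np

gamma, eps = 200.0, 0.05   # illustrative test values

def f(u):
    return 1.0 / (1.0 + np.exp(-gamma * (np.abs(u) - 0.5)))

def T(u):
    return np.sign(u) * np.abs(u) * f(u)

def dT(u, h=1e-6):
    return (T(u + h) - T(u - h)) / (2 * h)    # central finite difference

u_hi = 0.5 + np.log(1 / eps) / gamma          # boundary of the P3.a region
u_lo = 0.5 - np.log(1 / eps) / gamma          # boundary of the P3.b region
print(1 / (1 + eps) <= f(u_hi) < 1)           # P3.a
print(0 < f(u_lo) <= eps / (1 + eps))         # P3.b

u_c = 0.5 + np.log(gamma / eps) / gamma       # boundary of the P3.c region
u_d = 0.5 - np.log(gamma / eps) / gamma       # boundary of the P3.d region
print(1 < dT(u_c + 0.01) <= 1 + 1.5 * eps)    # P3.c, just inside the region
print(0 < dT(u_d - 0.01) <= 1.5 * eps)        # P3.d, just inside the region
```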
With Properties 1–3, we can prove that (1/2)S(x) is differentiable everywhere.
Property 4. 
(1/2)S(x) is a continuously differentiable function.
Proof. 
The proof is exhibited in Appendix B. □
Since (1/2)S(x) is differentiable, in the rest of this paper we replace "∂" with "∇" to indicate the partial derivative of (1/2)Σ_{i=1}^{n} S(x_i), i.e., (1/2)∇(Σ_{i=1}^{n} S(x_i)) = u − x. In addition, we discuss two features of (1/2)S(·) which make it a good sparsity measure function.
Property 5. 
For a given ϵ and a sufficiently large γ,
  • Boundedness: for |u| ≥ 1/2 + (1/γ) log(1/ϵ), (1/2)S(x) ≈ 1/8.
  • Sparsity: for |u| ≤ 1/2 − (1/γ) log(1/ϵ), (1/2)S(x) ≈ 0.
Proof. 
The proof is exhibited in Appendix C. □
Note that when γ tends to +∞, S(x) becomes the ℓ0-norm and T(u) becomes the hard threshold function.

3.2. Properties of LPQC Problem

In order to analyze the properties of the LPQC problem, stated in (13), we review the KKT necessary conditions for general constrained optimization problems.
Lemma 1. 
Consider the following non-linear optimization problem:
\min_{x} \phi(x), \quad \text{subject to} \quad h(x) \le 0, \qquad (17)

where ϕ: R^n → R is the objective, h: R^n → R defines the inequality constraint, and ϕ and h are continuously differentiable. If x* is a local optimum, then there exists a constant λ*, called the Lagrange multiplier, such that
  • Stationarity: ∇ϕ(x*) + λ*∇h(x*) = 0,
  • Primal feasibility: h(x*) ≤ 0,
  • Dual feasibility: λ* ≥ 0,
  • Complementary slackness: λ* h(x*) = 0.
Armed with Lemma 1 and some reasonable assumptions, we can simplify the KKT conditions of the LPQC problem stated in (13). The result is summarized in Theorem 1.
Theorem 1. 
Given that ‖b‖_2^2 > mθ² and that x* is a local optimum of the optimization problem (13), the KKT conditions become

u^* - x^* - \lambda^* \Phi^{\top}(b - \Phi x^*) = 0, \qquad (18a)

\|b - \Phi x^*\|_2^2 - m\theta^2 = 0, \qquad (18b)

\lambda^* > 0. \qquad (18c)

Note that (1/2)∇(Σ_{i=1}^{n} S(x_i^*)) = u^* − x^*.
Proof. 
From Lemma 1, the KKT conditions of (13) are

u^* - x^* - \lambda^* \Phi^{\top}(b - \Phi x^*) = 0, \qquad (19a)

\|b - \Phi x^*\|_2^2 - m\theta^2 \le 0, \qquad (19b)

\lambda^* \ge 0, \qquad (19c)

\lambda^*\big(\|b - \Phi x^*\|_2^2 - m\theta^2\big) = 0. \qquad (19d)

According to (19c), λ* is either greater than or equal to 0. The proof of λ* > 0 is by contradiction. Assume that λ* = 0. From (19a), we have ∇(Σ_{i=1}^{n} S(x_i^*)) = 0. Furthermore, from (1/2)∇(Σ_{i=1}^{n} S(x_i^*)) = u^* − x^* and (14), it is easy to prove that u^* = x^*, and hence that u^* = x^* = 0. On the other hand, from (19b), when u^* = x^* = 0, we have ‖b‖_2^2 − mθ² ≤ 0, which contradicts our earlier assumption ‖b‖_2^2 > mθ². Hence, λ* must be greater than 0. In other words, the KKT conditions of (13) become (18). The proof is completed. □

3.3. Dynamics of LPNN-LPQC

In the analog neural approach, we need to design the neural dynamics such that the equilibrium points of the dynamics fulfill the KKT conditions of the problem.
From the LPNN framework, we set up a Lagrangian function:

L_{LPQC}(x, \alpha) = \frac{1}{2}\sum_{i=1}^{n} S(x_i) + \alpha^2\big(\|b - \Phi x\|_2^2 - m\theta^2\big), \qquad (20)

where α² := λ; this substitution is used to simplify the proof that the Lagrange multiplier λ is greater than zero. Based on the concepts of LPNN and LCA, we propose the dynamics of the LPNN-LPQC as

\frac{du}{dt} = -\frac{\partial L_{LPQC}}{\partial x} = -\frac{1}{2}\nabla\Big(\sum_{i=1}^{n} S(x_i)\Big) + 2\alpha^2\Phi^{\top}(b - \Phi x) = -u + x + 2\alpha^2\Phi^{\top}(b - \Phi x), \qquad (21a)

\frac{d\alpha}{dt} = \frac{\partial L_{LPQC}}{\partial \alpha} = 2\alpha\big(\|b - \Phi x\|_2^2 - m\theta^2\big). \qquad (21b)
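To make the dynamics in (21) concrete, the following is a minimal numerical sketch of the right-hand sides of (21a) and (21b), with the threshold parameters fixed at (α, κ) = (0, 1/2) as in this section and a finite, large γ. The function and variable names are our own (the network's multiplier variable is called alpha here), and the exponent is clipped only to avoid floating-point overflow; the clipping is not part of the model.

```python
import numpy as np

def threshold(u, gamma):
    """x = T(u) from (14), with kappa = 1/2 and alpha = 0."""
    expo = np.clip(-gamma * (np.abs(u) - 0.5), -60.0, 60.0)   # clip only for numerical safety
    return np.sign(u) * np.abs(u) / (1.0 + np.exp(expo))

def lpnn_lpqc_rhs(u, alpha, Phi, b, theta, gamma):
    """Right-hand sides of du/dt and dalpha/dt in (21a) and (21b)."""
    x = threshold(u, gamma)
    residual = b - Phi @ x
    du = -u + x + 2.0 * alpha**2 * (Phi.T @ residual)
    dalpha = 2.0 * alpha * (residual @ residual - len(b) * theta**2)
    return du, dalpha, x
```

These two functions are reused in the discrete-time verification sketch given at the end of Section 4.3.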

4. Circuit Realization

For analog neural models used in optimization, one important issue is whether the operations in the dynamic equations can be implemented with analog circuits. This section addresses this issue.

4.1. Thresholding Element

The dynamic equations stated in (21) contain many conventional analog operations, such as adders [45], integrators [45], multipliers [46,47], and squaring circuits [48,49]. The circuit realizations of these common operations are well discussed in [45,46,47,48,49].
There is one unconventional element in the dynamic equations: the thresholding function
x = T(u) = \mathrm{sign}(u)\,\frac{|u|}{1 + \exp(-\gamma(|u| - \tfrac{1}{2}))}.
Figure 2a shows a generalized realization of the thresholding element for large γ . In this generalized realization, the thresholding level is V r e f , where V r e f is a positive number. The detailed function is given by
x = T(u) = \mathrm{sign}(u)\,\frac{|u|}{1 + \exp(-\gamma(|u| - V_{ref}))}.
The two MOSFETs in Figure 2a control the thresholding mode.
If the magnitude |u| of the input is greater than V_ref, then one of the two MOSFETs is on and the circuit in Figure 2a becomes the equivalent circuit shown in Figure 2b. Clearly, for this equivalent circuit, the output is equal to the input.
On the other hand, in Figure 2a, if the magnitude | u | of the input is less than or equal to V r e f , then the two MOSFETs are off and we obtain another equivalent circuit shown in Figure 2c. In this case, the inputs of the OP-AMP1 are clamped at u / 2 . Since the two current values, I u and I l , of the upper and lower paths are equal, the output u o of the OP-AMP1 is zero.
In Figure 3, we show the transfer function of our threshold function for V_ref = 1. The transfer function is obtained from a circuit simulator. In the simulator, the open-loop gain of OP-AMP1 is set to 10^8 and the open-loop gain of OP-AMP2 to OP-AMP4 is set to 10^5. The two MOSFETs are n-type with the following parameters: channel width = 100 μm, channel length = 200 nm, transconductance = 118 μA/V², zero-bias threshold voltage = 430 mV, and channel-length modulation = 60 mV⁻¹. From the figure, the transfer function matches our theoretical model with a large value of γ quite closely.

4.2. Circuit Structure

Figure 4 shows the analog realization of (21) at the block diagram level. In the realization, there are n blocks to compute the du_i/dt's. Inside each block, there are adders, multipliers, and squaring circuits. Additionally, there is a block to compute dα/dt. The time derivatives du_i/dt and dα/dt are then fed to integrators to obtain the internal variables u_i(t) and α(t). In order to obtain the decision variables x_i(t), the internal variables are fed to n thresholding elements (Figure 2), where V_ref = 1/2.

4.3. Circuit Simulation

In this subsection, we use a small-scale problem to verify our approach based on the MATLAB Simulink platform. The problem details are:
  • n = 8 and m = 6.
  • Measurement matrix: Φ = (1/√8) Ψ, where Ψ is a 6 × 8 matrix with ±1 entries.
  • Real sparse vector x = [2, 0, 0, 0, 0, 0, 2, 0]^⊤ and noisy observation vector b = [0.0001, 0.0038, 1.4149, 1.4145, 1.4140, 0.0035]^⊤.
  • Noise tolerance parameter θ = 0.001.
We build the Simulink file for our model, shown in Figure 4. We implement the thresholding function based on Figure 2. Other blocks are based on Simulink's functional blocks. To verify our Simulink result, we also consider a discrete-time simulation of (21). For the digitization of (21), the discrete-time simulation equations are

u(t + \Delta t) = u(t) + \Delta t\,\frac{du}{dt}, \qquad \alpha(t + \Delta t) = \alpha(t) + \Delta t\,\frac{d\alpha}{dt},

where Δt = 0.001. Figure 5 shows the dynamics from the Simulink results and the discrete-time simulation. From the figure, the dynamics from the two approaches are nearly the same. The final outputs x are

Simulink: x = [1.9970, 0, 0, 0, 0, 0, 2.0037, 0]^⊤,
Discrete time: x = [1.9965, 0, 0, 0, 0, 0, 2.0032, 0]^⊤.

Clearly, from Figure 5 and the final outputs above, both approaches produce similar results.
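For readers without Simulink, a small example of the same kind can also be run with a plain discrete-time loop, reusing the threshold and lpnn_lpqc_rhs sketches given after (21). The exact ±1 sign pattern of the 6 × 8 matrix used in the paper is not recoverable from the text, so a random pattern, a random seed, and a smaller step size (the Δt = 0.0001 used later in Section 6.2) are used here; the recovered vector is therefore only expected to be close to the true sparse vector when the dynamics settle, not to reproduce the exact numbers above.

```python
import numpy as np

rng = np.random.default_rng(0)                       # arbitrary seed for illustration
n, m, theta, gamma, dt = 8, 6, 0.001, 1e4, 1e-4      # step size/steps may need tuning per instance

Phi = rng.choice([-1.0, 1.0], size=(m, n)) / np.sqrt(n)
x_true = np.array([2.0, 0, 0, 0, 0, 0, 2.0, 0])
b = Phi @ x_true + theta * rng.standard_normal(m)

u, alpha = 0.01 * rng.standard_normal(n), 1.0        # initial internal states
for _ in range(400000):
    du, dalpha, x = lpnn_lpqc_rhs(u, alpha, Phi, b, theta, gamma)
    u, alpha = u + dt * du, alpha + dt * dalpha

print(np.round(x, 4))   # expected to be close to x_true once the dynamics settle
```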

5. Properties of the Dynamics

The first issue that we need to address is whether the equilibrium points of the LPNN-LPQC model fulfill the KKT conditions of the LPQC problem stated in (13). Otherwise, the LPNN-LPQC model cannot find local/global minima of the LPQC problem. Note that since the LPQC problem and the original ℓ0-norm CBPDN problem stated in (2) are non-convex, few algorithms can ensure that their solutions are the global minimum.
The relationship between the equilibrium points of the LPNN-LPQC model and the KKT conditions is summarized in the following theorem.
Theorem 2. 
Given that {u*, α*} with α* ≠ 0 is an equilibrium point of the LPNN-LPQC model, this point satisfies the KKT conditions of the LPQC problem. Note that x* = T(u*).
Proof. 
From (21), when {u*, α*} is an equilibrium point, we have

-u^* + x^* + 2\alpha^{*2}\Phi^{\top}(b - \Phi x^*) = 0, \qquad (24a)

2\alpha^*\big(\|b - \Phi x^*\|_2^2 - m\theta^2\big) = 0. \qquad (24b)

Clearly, (24a) is the same as (19a) with α*² = λ*. Additionally, when α* ≠ 0, (24b) becomes ‖b − Φx*‖_2^2 − mθ² = 0, which is the same as (19b) with equality. The proof is completed. □
Theorem 2 tells us that the equilibrium points of the LPNN-LPQC model correspond to the KKT conditions of the LPQC problem. Another concern is the stability of the equilibrium points of the LPNN-LPQC model. Theorem 3 presents the stability of the equilibrium points of the LPNN-LPQC model.
Theorem 3. 
Given a positive ϵ and a sufficiently large positive γ, if (u*, α*) with α* ≠ 0 is an equilibrium point of the LPNN-LPQC model and, for all i, either |u_i*| > 1/2 + (1/γ) log(γ/ϵ) or |u_i*| < 1/2 − (1/γ) log(γ/ϵ), then the equilibrium point is asymptotically stable.
Proof. 
The condition that either |u_i| > 1/2 + (1/γ) log(γ/ϵ) or |u_i| < 1/2 − (1/γ) log(γ/ϵ) is equivalent to requiring, for all i, that |u_i| ∉ [1/2 − ζ, 1/2 + ζ], where ζ = (1/γ) log(γ/ϵ). With this condition, we define two index sets, Γ_u and Γ_u^c, given by
  • active set Γ_u: if |u_i| > 1/2 + ζ, then i ∈ Γ_u;
  • inactive set Γ_u^c: if |u_i| < 1/2 − ζ, then i ∈ Γ_u^c.
Obviously, for a given ϵ and a sufficiently large γ, ζ tends to zero, i.e., the value of ζ can be made arbitrarily small. Thus, the inactive neurons have nearly no effect on the dynamics, i.e., T(u_i) tends to 0 for a sufficiently large γ (see Property 3).
For a given vector a, we define a_{Γ_u} as the vector composed of the elements of a indexed by Γ_u, and a_{Γ_u^c} as the vector composed of the elements of a indexed by Γ_u^c. Similarly, given a matrix Φ, we define Φ_{Γ_u} as the matrix composed of the columns of Φ indexed by Γ_u, and Φ_{Γ_u^c} as the matrix composed of the columns of Φ indexed by Γ_u^c.
Now, we consider the dynamics near u* and α*. We have two index sets, Γ_{u*} and Γ_{u*}^c. As mentioned above, for the inactive states, x_{Γ_{u*}^c} ≈ 0 for a sufficiently large γ. The dynamics given in (21) can be rewritten as:

\frac{du_{\Gamma_{u^*}}}{dt} = -u_{\Gamma_{u^*}} + x_{\Gamma_{u^*}} + 2\alpha^2\Phi_{\Gamma_{u^*}}^{\top}\big(b - \Phi_{\Gamma_{u^*}} x_{\Gamma_{u^*}}\big), \qquad (25a)

\frac{d\alpha}{dt} = 2\alpha\big(\|b - \Phi_{\Gamma_{u^*}} x_{\Gamma_{u^*}}\|_2^2 - m\theta^2\big), \qquad (25b)

\frac{du_{\Gamma_{u^*}^c}}{dt} = -u_{\Gamma_{u^*}^c} + 2\alpha^2\Phi_{\Gamma_{u^*}^c}^{\top}\big(b - \Phi_{\Gamma_{u^*}} x_{\Gamma_{u^*}}\big). \qquad (25c)
Furthermore, the linearization of (25) around the equilibrium point (u*, α*) is

\frac{d}{dt}\begin{bmatrix} u_{\Gamma_{u^*}} \\ \alpha \\ u_{\Gamma_{u^*}^c} \end{bmatrix} = -H \begin{bmatrix} u_{\Gamma_{u^*}} - u^*_{\Gamma_{u^*}} \\ \alpha - \alpha^* \\ u_{\Gamma_{u^*}^c} - u^*_{\Gamma_{u^*}^c} \end{bmatrix}, \qquad (26)

where H is the negative of the Jacobian matrix at (u*, α*), i.e.,

H = -\begin{bmatrix} \dfrac{\partial \dot{u}_{\Gamma_{u^*}}}{\partial u_{\Gamma_{u^*}}} & \dfrac{\partial \dot{u}_{\Gamma_{u^*}}}{\partial \alpha} & \dfrac{\partial \dot{u}_{\Gamma_{u^*}}}{\partial u_{\Gamma_{u^*}^c}} \\ \dfrac{\partial \dot{\alpha}}{\partial u_{\Gamma_{u^*}}} & \dfrac{\partial \dot{\alpha}}{\partial \alpha} & \dfrac{\partial \dot{\alpha}}{\partial u_{\Gamma_{u^*}^c}} \\ \dfrac{\partial \dot{u}_{\Gamma_{u^*}^c}}{\partial u_{\Gamma_{u^*}}} & \dfrac{\partial \dot{u}_{\Gamma_{u^*}^c}}{\partial \alpha} & \dfrac{\partial \dot{u}_{\Gamma_{u^*}^c}}{\partial u_{\Gamma_{u^*}^c}} \end{bmatrix}_{(u,\alpha)=(u^*,\alpha^*)}. \qquad (27)

From the given condition, for active nodes we have dx_i/du_i ≈ 1, and for inactive nodes dx_i/du_i ≈ 0. After deriving the sub-matrices in (27), we obtain

H = \begin{bmatrix} 2\alpha^{*2}\Phi_{\Gamma_{u^*}}^{\top}\Phi_{\Gamma_{u^*}} & -4\alpha^{*}\Phi_{\Gamma_{u^*}}^{\top}\big(b - \Phi_{\Gamma_{u^*}} x^*_{\Gamma_{u^*}}\big) & \varnothing \\ 4\alpha^{*}\big(b - \Phi_{\Gamma_{u^*}} x^*_{\Gamma_{u^*}}\big)^{\top}\Phi_{\Gamma_{u^*}} & 0 & \varnothing \\ 2\alpha^{*2}\Phi_{\Gamma_{u^*}^c}^{\top}\Phi_{\Gamma_{u^*}} & -4\alpha^{*}\Phi_{\Gamma_{u^*}^c}^{\top}\big(b - \Phi_{\Gamma_{u^*}} x^*_{\Gamma_{u^*}}\big) & I \end{bmatrix}, \qquad (28)

where ∅ denotes a zero matrix of appropriate size. Following the proof logic of Theorem 5 in [8], one can show that all eigenvalues of H have positive real parts. Therefore, according to classical control theory, the corresponding equilibrium point (u*, α*) is asymptotically stable. The proof is completed. □
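As a quick numerical illustration of the structure in (28), the following sketch builds H for a randomly generated full-column-rank Φ_{Γ_{u*}}, a nonzero residual, and a positive α*. These are arbitrary illustrative values rather than an actual equilibrium of the network, but they exhibit the eigenvalue behaviour that Theorem 3 relies on: the spectrum of H stays in the open right half-plane.

```python
import numpy as np

rng = np.random.default_rng(1)
m, k, nc = 30, 5, 8                      # measurements, active set size, inactive set size
Phi_g = rng.standard_normal((m, k))      # stands in for Phi_{Gamma_u*}
Phi_c = rng.standard_normal((m, nc))     # stands in for Phi_{Gamma_u*^c}
alpha = 0.7                              # illustrative positive multiplier value
r = rng.standard_normal(m)               # stands in for the residual b - Phi_g x*_g

H = np.block([
    [2 * alpha**2 * Phi_g.T @ Phi_g, -4 * alpha * (Phi_g.T @ r)[:, None], np.zeros((k, nc))],
    [4 * alpha * (r @ Phi_g)[None, :], np.zeros((1, 1)),                  np.zeros((1, nc))],
    [2 * alpha**2 * Phi_c.T @ Phi_g, -4 * alpha * (Phi_c.T @ r)[:, None], np.eye(nc)],
])
print(np.min(np.linalg.eigvals(H).real) > 0)   # expected: True
```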

6. Experiment Results

6.1. Comparison Algorithms and Settings

This section compares the proposed LPNN-LPQC model with a number of digital numerical methods and analog neural methods. The comparison numerical methods are SPGL1 [30,50], MCP [35], AMP [37], ℓ0-ADMM [39,40], ℓ0-ZAP [38], NIHT [36], and ECME [41]. The comparison analog methods are IPNNSR [11] and PNN-ℓ1 [13].
The seven numerical algorithms are described as follows. SPGL1 [30,50] is a standard ℓ1-norm approach. MCP [35] is based on an approximation of the ℓ0-norm function, and its tuning parameters are selected by cross-validation. NIHT [36], ECME [41], and AMP [37] are iterative algorithms, and they handle the ℓ0-norm term by the hard thresholding concept. ℓ0-ADMM [39,40] uses the frameworks of ADMM and hard thresholding. ℓ0-ZAP [38] utilizes the ideas of adaptive filtering and projection. The two analog comparison models, IPNNSR [11] and PNN-ℓ1 [13], are developed based on projection concepts.
The setting of our experiment follows that of the experiment in [51]. The measurement matrix Φ is a random ±1/m matrix. The dimension n of the sparse signal x is 4096. A sparse signal contains k non-zero elements whose locations are randomly chosen with a uniform distribution. Their corresponding values are random ±1. In our experiment, we set k = {75, 100, 125}.

6.2. Parameter Settings

To conduct the experiment, we need to select some parameters for the proposed method and the comparison methods. In our proposed analog method, there are two tuning parameters: γ and κ. To make the approximating ℓp-norm-like function close to the ℓ0-norm, we should use a large γ. In our experiment, we set γ = 10,000. Parameter κ is used to control the minimum magnitude of the non-zero decision variables x_i. In our experiment, we set κ = 1/2 and Δt = 0.0001.
The SPGL1 package is used to solve
\min_{x} \|b - \Phi x\|_2^2 \quad \text{subject to} \quad \|x\|_1 \le k,
where k is the number of non-zero elements in x . In the experiment, we set the maximum number of iterations to 100,000.
The MCP algorithm is used to solve
\min_{x} \|b - \Phi x\|_2^2 + P_{(\lambda,\gamma)}(x),
where P ( λ , γ ) ( x ) is a penalty function to control the sparsity of the solution, and λ and γ are parameters in P ( λ , γ ) ( x ) . We set γ = 1.5 and use a linear search to obtain the best value of λ . We set the maximum number of iterations to 100,000.
The AMP, NIHT, and ECME algorithms are used to solve
\min_{x} \|b - \Phi x\|_2^2 \quad \text{subject to} \quad \|x\|_0 \le k,
where k is the number of non-zero elements in x . We set the maximum number of iterations to 100,000.
The ℓ0-ADMM algorithm is used to solve

\min_{x} \|b - \Phi x\|_2^2 \quad \text{subject to} \quad \|x\|_0 \le k,

where k is the number of non-zero elements in x. In ℓ0-ADMM, there is an augmented Lagrangian parameter ρ, which is set to 0.5. The maximum number of iterations is set to 100,000.
The ℓ0-ZAP algorithm is used to solve

\min_{x} \|b - \Phi x\|_2^2 + \gamma \|x\|_0,

where γ is used to control the sparsity of the solution vector. We use a linear search to obtain the best value of γ. In addition, there are two parameters, κ and α. In the experiment, κ = 0.001 and α = 2. Additionally, the maximum number of iterations is 100,000.
The IPNNSR and PNN-ℓ1 analog models are used to solve

\min_{x} \|x\|_1 \quad \text{subject to} \quad b = \Phi x.

For IPNNSR, we set Δt = 1. For PNN-ℓ1, we set Δt = 0.01.

6.3. Convergence

The proposed algorithm is an analog neural network. Hence, one important issue is the time needed to reach the equilibrium. Here, we conduct an experiment to empirically study the convergence time. Some typical dynamics are given in Figure 6. In the figure, the first row is the case of k = 75 and m = 500, the second row is the case of k = 100 and m = 600, and the third row is the case of k = 125 and m = 700. Since there are 4096 elements in the decision variable vector x, the figure would be very hard to read if we plotted all the u_i(t)'s and x_i(t)'s. Therefore, we only plot the dynamics of the u_i(t)'s and x_i(t)'s whose original x_i values are non-zero. From the figure, it can be seen that the dynamics of our proposed analog neural network settle down within 20–40 characteristic time units.

6.4. Comparison with Other Algorithms

To further analyze the performance of our proposed method, we compare it with seven numerical algorithms and two analog models.
The observation vectors are generated by the following noisy model:

b = \Phi x + e,

where e is zero-mean Gaussian noise with standard deviation σ = {0.001, 0.005, 0.01}. The experiments are repeated 100 times with different measurement matrices, initial states, and sparse signals. For each instance, we declare that it is successful if

\frac{\|x_0 - \hat{x}\|_2}{\|x_0\|_2} \le \mathrm{tol},

where x_0 denotes the true signal, x̂ is the recovered signal, and tol is the tolerance value. In our experiment, we set tol = 0.01.
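The sketch below shows how a single trial and the success test above can be written in code. The recovery step itself (any of the compared solvers) is abstracted away, and the ±1 entry scaling used here (1/√m) is our assumption for illustration; the other quantities follow the setting described in Section 6.1.

```python
import numpy as np

def make_trial(n, m, k, sigma, rng):
    """Generate one random sparse-recovery instance (measurement scaling is an assumption)."""
    Phi = rng.choice([-1.0, 1.0], size=(m, n)) / np.sqrt(m)
    x0 = np.zeros(n)
    support = rng.choice(n, size=k, replace=False)   # k random non-zero locations
    x0[support] = rng.choice([-1.0, 1.0], size=k)    # random +/-1 values
    b = Phi @ x0 + sigma * rng.standard_normal(m)
    return Phi, b, x0

def is_success(x0, x_hat, tol=0.01):
    """Success criterion: relative recovery error at most tol."""
    return np.linalg.norm(x0 - x_hat) / np.linalg.norm(x0) <= tol

# Usage sketch (solver is any recovery routine returning an estimate of x0):
# successes = sum(is_success(x0, solver(Phi, b))
#                 for Phi, b, x0 in (make_trial(4096, 500, 75, 0.001, np.random.default_rng(t))
#                                    for t in range(100)))
```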
The success-rate results are shown in Figure 7. For all the algorithms, the performance improves as the number of measurements increases. From the figure, all the ℓ0-norm and ℓp-norm models are superior to the ℓ1-norm models, including SPGL1, IPNNSR, and PNN-ℓ1. In addition, compared with the seven numerical methods, our proposed method usually needs fewer measurements to obtain the same probability of exact reconstruction.
Compared with the two analog models, IPNNSR and PNN-ℓ1, our LPNN-LPQC model is better. For example, with 125 non-zero elements and a noise level equal to 0.01, IPNNSR and PNN-ℓ1 need around 650 measurements, while our LPNN-LPQC needs around 575 to 600 measurements only. The rationale of the improvement is that our model uses an ℓ0-norm approach, while IPNNSR and PNN-ℓ1 are based on the ℓ1-norm.

6.5. Comparison with Other Analog Models

As this paper proposes an analog model, namely LPNN-LPQC, for sparse recovery, this subsection provides a deeper comparison among the proposed LPNN-LPQC, IPNNSR [11], and PNN-ℓ1 [13]. We discuss two aspects: the probability of exact reconstruction and the reconstruction error. In IPNNSR [11] and PNN-ℓ1 [13], the circuit realization of the thresholding operator and the projection operator was not addressed.

6.5.1. Successful Rate of Recall

Since Figure 7 shows the success rates of all algorithms, it may not be easy to see the differences among the three analog models. In Figure 8, we only present the results of the three analog models. From the figure, it can be seen that the performance of our LPNN-LPQC is better than that of IPNNSR and PNN-ℓ1. In particular, the performance improvement is significant when the number of non-zero elements is large and the noise level is high. For instance, with 125 non-zero elements and a noise level of 0.01, IPNNSR and PNN-ℓ1 need around 650 measurements to achieve a high success rate of recall, while our LPNN-LPQC needs around 575 to 600 measurements only.

6.5.2. MSE of Recall

The success-rate results concern whether the estimated x has the correct non-zero positions. Here, in Figure 9, we present the MSE versus the number of measurements used. From Figure 9, it can be seen that, in terms of MSE, the performance of our LPNN-LPQC is much better than that of IPNNSR and PNN-ℓ1. In most cases, the MSE values of our proposed model are one or two orders of magnitude smaller than those of IPNNSR and PNN-ℓ1. For example, for k = 75, σ = 0.005, and m = 450, the MSE values of IPNNSR and PNN-ℓ1 are around 10^{-3}, while the MSE value of the proposed model is around 10^{-5} only.

7. Conclusions

This paper proposed the LPNN-LPQC method for sparse recovery in a noisy environment. The proposed algorithm is an analog neural network. We showed that the equilibrium points of the model satisfy the KKT conditions of the LPQC problem, and that the equilibrium points of the model are asymptotically stable. From the simulation results, we can see that the performance of the proposed algorithm is comparable to, or even superior to, many state-of-the-art ℓ0-norm or ℓp-norm numerical methods. In addition, our proposed algorithm is superior to two analog models. We also presented the circuit realization of the thresholding element and performed circuit simulation in MATLAB Simulink to verify our realization.
In our analysis, we assume that the realization of the analog circuit does not have any time delay or synchronization problem. In fact, time delays and mis-synchronization may affect the stability of analog circuits [52,53]. Hence, one future work is to investigate the behaviour of the proposed analog model with time delay or mis-synchronization.

Author Contributions

Conceptualization, H.W., R.F., C.-S.L. and A.G.C.; methodology, C.-S.L., H.W., R.F. and H.P.C.; software, H.W., R.F., H.P.C.; writing—original draft preparation, H.W.; writing—review and editing, C.-S.L.; supervision, C.-S.L. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the National Natural Science Foundation of China (Grant No. 62206178) and a research grant from the City University of Hong Kong (Grant No. 9678295).

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:
LPNN    Lagrange Programming Neural Network
LPNN-LPQC    LPNN with ℓp Objective and Quadratic Constraint
CBPDN    Constrained Basis Pursuit Denoise
LCA    Locally Competitive Algorithm
KKT    Karush-Kuhn-Tucker
MCP    Minimax Concave Penalty
NIHT    Normalized Iterative Hard Thresholding
AMP    Approximate Message Passing
ℓ0-ZAP    ℓ0-norm Zero Attraction Projection
ℓ0-ADMM    ℓ0-norm Alternating Direction Method of Multipliers
ECME    Expectation-Conditional Maximization Either

Appendix A. Proof of Property 3

  • P3.a
Recall that f(u) = 1/(1 + exp(−γ(|u| − 1/2))), and that f(u) is an even function. Clearly, T(u) = u f(u) is an odd function. First, since f(u) is an even function, the proof only presents the case of u ≥ 1/2 + (1/γ) log(1/ϵ).
In the case of u ≥ 1/2 + (1/γ) log(1/ϵ), f(u) is monotonically increasing. Hence,

f(u) \ge f\Big(\frac{1}{2} + \frac{1}{\gamma}\log\frac{1}{\epsilon}\Big) = \frac{1}{1 + \exp(-\log\frac{1}{\epsilon})} = \frac{1}{1 + \epsilon}. \qquad (A1)

In addition, since f(u) is monotonically increasing and lim_{|u|→∞} f(u) = 1, we have f(u) < 1. In conclusion, 1/(1 + ϵ) ≤ f(u) < 1. P3.a is proved.
  • P3.b
Since f(u) is an even function, the proof only presents the case of u ≤ 1/2 − (1/γ) log(1/ϵ).
In the case of u ≤ 1/2 − (1/γ) log(1/ϵ), f(u) is monotonically increasing. Hence, we have

f(u) \le f\Big(\frac{1}{2} - \frac{1}{\gamma}\log\frac{1}{\epsilon}\Big) = \frac{1}{1 + \exp(\log\frac{1}{\epsilon})} = \frac{\epsilon}{1 + \epsilon}. \qquad (A2)

Additionally, it is obvious that f(u) > 0. Therefore, 0 < f(u) ≤ ϵ/(1 + ϵ). P3.b is proved.
  • P3.c
For simplicity, let g(u) = exp(−γ(|u| − 1/2)). Thus, the derivative of T(u) can be rewritten as

\frac{dT(u)}{du} = \frac{1}{1 + g(u)} + \frac{\gamma |u|\, g(u)}{(1 + g(u))^2}, \qquad (A3)

and the second derivative of T(u) is

\frac{d^2T(u)}{du^2} = \frac{2\gamma\,\mathrm{sign}(u)\, g(u)}{(1 + g(u))^2} - \frac{\gamma^2 u\, g(u)}{(1 + g(u))^2} + \frac{2\gamma^2 u\, g(u)^2}{(1 + g(u))^3}. \qquad (A4)

Since T(u) is an odd function and dT(u)/du is an even function, we only present the proof for the case of u > 1/2 + (1/γ) log(γ/ϵ). The proof consists of two parts. First, we show the monotonic property of dT(u)/du for u > 1/2 + (1/γ) log(γ/ϵ). Second, the upper and lower bounds of dT(u)/du are established.
For the case of u > 1/2 + (1/γ) log(γ/ϵ), we have 0 < g(u) < ϵ/γ. The second derivative of T(u) in (A4) can be rewritten as:

\frac{d^2T(u)}{du^2} = g(u)\,\frac{(2\gamma - \gamma^2 u)(1 + g(u)) + 2\gamma^2 u\, g(u)}{(1 + g(u))^3}. \qquad (A5)

Since 0 < g(u) < ϵ/γ, we can deduce that

(2\gamma - \gamma^2 u)(1 + g(u)) + 2\gamma^2 u\, g(u) = 2\gamma - \gamma^2 u + (2\gamma + \gamma^2 u)\, g(u) < 2\gamma - \gamma^2 u + (2\gamma + \gamma^2 u)\,\frac{\epsilon}{\gamma} = 2\gamma + 2\epsilon - \gamma(\gamma - \epsilon)\, u < 0, \qquad (A6)

where the last inequality holds because u > 1/2 and γ is sufficiently large. From (A5), (A6), g(u) > 0, and (1 + g(u))³ > 0, we deduce that, if u > 1/2 + (1/γ) log(γ/ϵ), then d²T(u)/du² < 0, and hence dT(u)/du is monotonically decreasing. Thus, for u > 1/2 + (1/γ) log(γ/ϵ), a lower bound on dT(u)/du is its limit as u → ∞. Since lim_{u→∞} g(u) = 0 and the limit of dT(u)/du is 1, we have

\frac{dT(u)}{du} > 1. \qquad (A7)

On the contrary, an upper bound for u > 1/2 + (1/γ) log(γ/ϵ) is dT(u)/du evaluated at u = 1/2 + (1/γ) log(γ/ϵ). Its value is

\frac{dT(u)}{du}\bigg|_{u=\frac{1}{2}+\frac{1}{\gamma}\log\frac{\gamma}{\epsilon}} = \frac{1}{1 + \frac{\epsilon}{\gamma}} + \frac{\gamma u\,\frac{\epsilon}{\gamma}}{(1 + \frac{\epsilon}{\gamma})^2} = \frac{\gamma^2 + \gamma\epsilon + u\epsilon\gamma^2}{(\gamma + \epsilon)^2} = \frac{(\gamma + \epsilon)^2 + u\epsilon\gamma^2 - \epsilon\gamma - \epsilon^2}{(\gamma + \epsilon)^2} = 1 + \frac{\frac{1}{2}\gamma^2 + \gamma\log\frac{\gamma}{\epsilon} - \gamma - \epsilon}{(\gamma + \epsilon)^2}\,\epsilon = 1 + \frac{\frac{1}{2}(\gamma + \epsilon)^2 + (\log\frac{\gamma}{\epsilon} - 1 - \epsilon)\gamma - \epsilon - \frac{1}{2}\epsilon^2}{(\gamma + \epsilon)^2}\,\epsilon < 1 + \frac{\epsilon}{2} + \frac{(\log\frac{\gamma}{\epsilon} - 1 - \epsilon)\gamma}{(\gamma + \epsilon)^2}\,\epsilon < 1 + \frac{3\epsilon}{2}, \qquad (A8)

where u denotes 1/2 + (1/γ) log(γ/ϵ) in the intermediate steps. From (A7) and (A8), and the fact that dT(u)/du is an even function, we can conclude that, for a sufficiently large γ, if |u| > 1/2 + (1/γ) log(γ/ϵ), then 1 < dT(u)/du < 1 + (3/2)ϵ. The proof of P3.c is completed.
  • P3.d
The proof consists of two parts. First, we show the monotonic property of dT(u)/du for |u| < 1/2 − (1/γ) log(γ/ϵ). Second, the upper and lower bounds of dT(u)/du are established. Since T(u) is an odd function and dT(u)/du is an even function, we only need to consider the region of u < 1/2 − (1/γ) log(γ/ϵ).
For 0 < u < 1/2 − (1/γ) log(γ/ϵ), sign(u) > 0, g(u) is monotonically decreasing, and exp(γ/2) > g(u) > γ/ϵ. In this region, the second derivative of T(u) satisfies

\frac{d^2T(u)}{du^2} = \frac{2\gamma\, g(u)}{(1 + g(u))^2} - \frac{\gamma^2 u\, g(u)}{(1 + g(u))^2} + \frac{2\gamma^2 u\, g(u)^2}{(1 + g(u))^3} > -\frac{\gamma^2 u\, g(u)}{(1 + g(u))^2} + \frac{2\gamma^2 u\, g(u)^2}{(1 + g(u))^3} = (g(u) - 1)\,\frac{\gamma^2 g(u)\, u}{(1 + g(u))^3}. \qquad (A9)

Obviously, in this region, γ² g(u) u/(1 + g(u))³ > 0, and

g(u) - 1 > \frac{\gamma}{\epsilon} - 1 > 0. \qquad (A10)

Hence, we can conclude that, for 0 < u < 1/2 − (1/γ) log(γ/ϵ), d²T(u)/du² > 0 and dT(u)/du is monotonically increasing. For u < 1/2 − (1/γ) log(γ/ϵ), the upper bound of dT(u)/du is attained at u = 1/2 − (1/γ) log(γ/ϵ). The upper bound is

\frac{dT(u)}{du}\bigg|_{u=\frac{1}{2}-\frac{1}{\gamma}\log\frac{\gamma}{\epsilon}} = \frac{1}{1 + \frac{\gamma}{\epsilon}} + \frac{\gamma^2 |u|\,\epsilon}{(\gamma + \epsilon)^2} = \frac{\gamma^2 |u| + \epsilon + \gamma}{(\gamma + \epsilon)^2}\,\epsilon < \frac{\frac{1}{2}\gamma^2 + \epsilon + \gamma}{(\gamma + \epsilon)^2}\,\epsilon = \frac{\epsilon}{2} + \frac{\gamma + \epsilon - \gamma\epsilon - \frac{1}{2}\epsilon^2}{(\gamma + \epsilon)^2}\,\epsilon < \frac{3\epsilon}{2}. \qquad (A11)

In addition, a lower bound on dT(u)/du is attained at u = 0:

\frac{dT(u)}{du}\bigg|_{u=0} = \frac{1}{1 + \exp(\gamma/2)} > 0. \qquad (A12)

To sum up, for a sufficiently large γ, if u < 1/2 − (1/γ) log(γ/ϵ), then 0 < dT(u)/du < (3/2)ϵ. The proof of P3.d is completed.

Appendix B. Proof of Property 4

The continuity and differentiability of S ( x ) can be proved according to the following lemma [54].
Lemma A1. 
(Inverse function) Let ϕ be a strictly monotone continuous function on [a, b], with ϕ differentiable at x_0 ∈ (a, b) and dϕ/dx|_{x=x_0} ≠ 0. Then ϕ^{−1} exists and is continuous and strictly monotone. Moreover, ϕ^{−1} is differentiable at y_0 = ϕ(x_0) and

\frac{d\phi^{-1}(y)}{dy}\bigg|_{y=y_0} = \frac{1}{\frac{d\phi}{dx}\big|_{x=x_0}}. \qquad (A13)

From (A3), one can easily verify that dT(u)/du > 0 and that T(u) is a strictly monotone increasing continuous function. Based on Lemma A1, T^{−1}(x) exists. Additionally, it is continuous and differentiable at every point. Hence, (1/2)∇S(x) = T^{−1}(x) − x is a continuous and differentiable function, and thus (1/2)S(x) is continuously differentiable. The proof is completed.

Appendix C. Proof of Property 5

  • Boundedness: We now use the Taylor series of S(x) to estimate the value of S(±1/2). Since S(x) is an even function, the proof only presents the case of u ≥ 1/2 + (1/γ) log(1/ϵ).
First, we use x = 1/4 to obtain the Taylor series expansion of (1/2)S(x):

\frac{1}{2}S(x) \approx \frac{1}{2}S\big(\tfrac{1}{4}\big) + \frac{1}{2}S'\big(\tfrac{1}{4}\big)\big(x - \tfrac{1}{4}\big) + \frac{1}{2}\cdot\frac{1}{2}S''\big(\tfrac{1}{4}\big)\big(x - \tfrac{1}{4}\big)^2, \qquad (A14)

where S′(1/4) and S″(1/4) are the first- and second-order derivatives of S(x) at x = 1/4, respectively. Since (1/2)S′(x) = u − x = T^{−1}(x) − x and T(1/2) = 1/4, we obtain

\frac{1}{2}S'\big(\tfrac{1}{4}\big) = \frac{1}{2} - \frac{1}{4} = \frac{1}{4}. \qquad (A15)

Additionally, (1/2)S″(1/4) = dT^{−1}(x)/dx|_{x=1/4} − 1. As T(1/2) = 1/4, from Lemma A1,

\frac{dT^{-1}(x)}{dx}\bigg|_{x=1/4} = \frac{1}{\frac{dT(u)}{du}\big|_{u=1/2}}. \qquad (A16)

Additionally, from (A3),

\frac{dT(u)}{du}\bigg|_{u=1/2} = \frac{2 + \gamma}{4}. \qquad (A17)

From (A14)–(A17), for a small ϵ and a sufficiently large γ,

\frac{1}{2}S(x) \approx \frac{1}{2}S\big(\tfrac{1}{4}\big) + \frac{1}{4}\big(x - \tfrac{1}{4}\big) + \frac{1}{2}\Big(\frac{4}{2 + \gamma} - 1\Big)\big(x - \tfrac{1}{4}\big)^2 \approx \frac{1}{2}S\big(\tfrac{1}{4}\big) + \frac{1}{4}\big(x - \tfrac{1}{4}\big) - \frac{1}{2}\big(x - \tfrac{1}{4}\big)^2. \qquad (A18)

Thus, we have

\frac{1}{2}S\big(\tfrac{1}{2}\big) \approx \frac{1}{2}S\big(\tfrac{1}{4}\big) + \frac{1}{32}, \quad \text{and} \quad \frac{1}{2}S(0) \approx \frac{1}{2}S\big(\tfrac{1}{4}\big) - \frac{3}{32}. \qquad (A19)

As (1/2)S(0) = 0, we obtain (1/2)S(1/2) ≈ 1/8. Similarly, it is also easy to obtain that (1/2)S(−1/2) ≈ 1/8.
Now we would like to know, for u ≥ 1/2 + (1/γ) log(1/ϵ), i.e., x ≥ T(1/2 + (1/γ) log(1/ϵ)), what the value of (1/2)S(x) is.
Let u_0 = 1/2 + (1/γ) log(1/ϵ) and x_0 = T(u_0). Since S(x) is an even function and is monotonically increasing (for x > 0), for x ≥ x_0, we have

\frac{1}{2}S(x) - \frac{1}{2}S(x_0) = \int_{x_0}^{x} (u - \chi)\, d\chi. \qquad (A20)

From P3.a, we have u ≤ (1 + ϵ)x. Thus,

\frac{1}{2}S(x) - \frac{1}{2}S(x_0) \le \int_{x_0}^{x} \epsilon\chi\, d\chi. \qquad (A21)

Thus, for a sufficiently small ϵ, (1/2)S(x) − (1/2)S(x_0) ≈ 0. In addition, for a sufficiently small ϵ and a sufficiently large γ, u_0 ≈ 1/2. As (1/2)S(1/2) ≈ 1/8, we have (1/2)S(x) ≈ 1/8 for u ≥ 1/2 + (1/γ) log(1/ϵ).
  • Sparsity: Since (1/2)S(x) is an even function, we only show the proof for 0 < x < T(1/2 − (1/γ) log(1/ϵ)), i.e., the case of 0 < u < 1/2 − (1/γ) log(1/ϵ). Let u_0 = 1/2 − (1/γ) log(1/ϵ) and x_0 = T(u_0). For x > 0, (1/2)S(x) is monotonically increasing. Hence, we have 0 = (1/2)S(0) < (1/2)S(x) < (1/2)S(x_0). Therefore, we have

\frac{1}{2}S(x_0) - \frac{1}{2}S(0) = \int_{0}^{x_0} (u - \chi)\, d\chi. \qquad (A22)

From P3.b, we can deduce that 0 < x_0 ≤ ϵ u_0/(1 + ϵ). For a sufficiently small ϵ, x_0 ≤ ϵ u_0/(1 + ϵ) ≈ 0. Obviously, in this region u − χ is bounded. Hence, we have (1/2)S(x_0) ≈ 0. As 0 = (1/2)S(0) < (1/2)S(x) < (1/2)S(x_0), we can say that (1/2)S(x) ≈ 0 for 0 ≤ x ≤ x_0. In conclusion, (1/2)S(x) ≈ 0 if |x| < T(1/2 − (1/γ) log(1/ϵ)). The proof is completed.

References

1. Chua, L.; Lin, G.N. Nonlinear programming without computation. IEEE Trans. Circuits Syst. 1984, 31, 182–188.
2. Tank, D.; Hopfield, J. Simple ‘neural’ optimization networks: An A/D converter, signal decision circuit, and a linear programming circuit. IEEE Trans. Circuits Syst. 1986, 33, 533–541.
3. Xia, Y.; Leung, H.; Wang, J. A projection neural network and its application to constrained optimization problems. IEEE Trans. Circuits Syst. Fundam. Theory Appl. 2002, 49, 447–458.
4. Xia, Y.; Wang, J. A general projection neural network for solving monotone variational inequalities and related optimization problems. IEEE Trans. Neural Netw. 2004, 15, 318–328.
5. Wang, H.; Lee, C.M.; Feng, R.; Leung, C.S. An analog neural network approach for the least absolute shrinkage and selection operator problem. Neural Comput. Appl. 2018, 29, 389–400.
6. Wang, Y.; Li, X.; Wang, J. A neurodynamic optimization approach to supervised feature selection via fractional programming. Neural Netw. 2021, 136, 194–206.
7. Bouzerdoum, A.; Pattison, T.R. Neural network for quadratic optimization with bound constraints. IEEE Trans. Neural Netw. 1993, 4, 293–304.
8. Feng, R.; Leung, C.S.; Constantinides, A.G.; Zeng, W.J. Lagrange Programming Neural Network for Nondifferentiable Optimization Problems in Sparse Approximation. IEEE Trans. Neural Netw. Learn. Syst. 2017, 28, 2395–2407.
9. Wang, H.; Feng, R.; Leung, A.C.S.; Tsang, K.F. Lagrange programming neural network approaches for robust time-of-arrival localization. Cogn. Comput. 2018, 10, 23–34.
10. Shi, Z.; Wang, H.; Leung, C.S.; So, H.C. Robust MIMO radar target localization based on Lagrange programming neural network. Signal Process. 2020, 174, 107574.
11. Liu, Q.; Wang, J. L1-minimization algorithms for sparse signal reconstruction based on a projection neural network. IEEE Trans. Neural Netw. Learn. Syst. 2015, 27, 698–707.
12. Yan, Z.; Le, X.; Wen, S.; Lu, J. A Continuous-Time Recurrent Neural Network for Sparse Signal Reconstruction via ℓ1 Minimization. In Proceedings of the 2018 Eighth International Conference on Information Science and Technology (ICIST), Cordoba, Granada, and Seville, Spain, 30 June–6 July 2018; pp. 43–49.
13. Wen, H.; Wang, H.; He, X. A Neurodynamic Algorithm for Sparse Signal Reconstruction with Finite-Time Convergence. Circuits Syst. Signal Process. 2020, 39, 6058–6072.
14. Donoho, D.; Huo, X. Uncertainty principles and ideal atomic decomposition. IEEE Trans. Inf. Theory 1999, 47, 2845–2862.
15. Donoho, D.L.; Elad, M. Optimally sparse representation in general (nonorthogonal) dictionaries via l1 minimization. Proc. Natl. Acad. Sci. USA 2003, 100, 2197–2202.
16. Chartrand, R. Exact reconstruction of sparse signals via nonconvex minimization. IEEE Signal Process. Lett. 2007, 14, 707–710.
17. Blumensath, T.; Davies, M.E. Iterative thresholding for sparse approximations. J. Fourier Anal. Appl. 2008, 14, 629–654.
18. Jin, D.; Yang, G.; Li, Z.; Liu, H. Sparse recovery algorithm for compressed sensing using smoothed ℓ0-norm and randomized coordinate descent. Mathematics 2019, 7, 834.
19. Stanković, L.; Sejdić, E.; Stanković, S.; Daković, M.; Orović, I. A tutorial on sparse signal reconstruction and its applications in signal processing. Circuits Syst. Signal Process. 2019, 38, 1206–1263.
20. Stanković, I.; Ioana, C.; Daković, M. On the reconstruction of nonsparse time-frequency signals with sparsity constraint from a reduced set of samples. Signal Process. 2018, 142, 480–484.
21. Dai, C.; Che, H.; Leung, M.F. A neurodynamic optimization approach for L1 minimization with application to compressed image reconstruction. Int. J. Artif. Intell. Tools 2021, 30, 2140007.
22. Bioucas-Dias, J.M.; Figueiredo, M.A. A new TwIST: Two-step iterative shrinkage/thresholding algorithms for image restoration. IEEE Trans. Image Process. 2007, 16, 2992–3004.
23. Kong, X.; Zhao, Y.; Xue, J.; Chan, J.C.W.; Kong, S.G. Global and local tensor sparse approximation models for hyperspectral image destriping. Remote Sens. 2020, 12, 704.
24. Costanzo, S.; Rocha, Á.; Migliore, M.D. Compressed sensing: Applications in radar and communications. Sci. World J. 2016, 2016, 5407415.
25. Li, S.; Zhao, G.; Zhang, W.; Qiu, Q.; Sun, H. ISAR imaging by two-dimensional convex optimization-based compressive sensing. IEEE Sens. J. 2016, 16, 7088–7093.
26. Craven, D.; McGinley, B.; Kilmartin, L.; Glavin, M.; Jones, E. Compressed sensing for bioelectric signals: A review. IEEE J. Biomed. Health Inform. 2014, 19, 529–540.
27. Chen, S.S.; Donoho, D.L.; Saunders, M.A. Atomic decomposition by basis pursuit. SIAM Rev. 2001, 43, 129–159.
28. Osborne, M.R.; Presnell, B.; Turlach, B.A. A new approach to variable selection in least squares problems. IMA J. Numer. Anal. 2000, 20, 389–403.
29. Candes, E.; Romberg, J. l1-Magic. 2017. Available online: https://candes.su.domains/software/l1magic/downloads/l1magic.pdf (accessed on 1 December 2022).
30. van den Berg, E.; Friedlander, M.P. SPGL1: A Solver for Sparse Least Squares. 2007. Available online: https://friedlander.io/spgl1/ (accessed on 1 December 2022).
31. Rozell, C.J.; Johnson, D.H.; Baraniuk, R.G.; Olshausen, B.A. Sparse coding via thresholding and local competition in neural circuits. Neural Comput. 2008, 20, 2526–2563.
32. Engan, K.; Rao, B.D.; Kreutz-Delgado, K. Regularized FOCUSS for subset selection in noise. In Proceedings of the NORSIG 2000, Kolmarden, Sweden, 13–15 June 2000; pp. 247–250.
33. Rao, B.D.; Kreutz-Delgado, K. An affine scaling methodology for best basis selection. IEEE Trans. Signal Process. 1999, 47, 187–200.
34. Zhang, C.H. Nearly unbiased variable selection under minimax concave penalty. Ann. Stat. 2010, 38, 894–942.
35. Breheny, P.; Huang, J. Coordinate descent algorithms for nonconvex penalized regression, with applications to biological feature selection. Ann. Appl. Stat. 2011, 5, 232.
36. Blumensath, T.; Davies, M.E. Normalized iterative hard thresholding: Guaranteed stability and performance. IEEE J. Sel. Top. Signal Process. 2010, 4, 298–309.
37. Donoho, D.L.; Maleki, A.; Montanari, A. Message-passing algorithms for compressed sensing. Proc. Natl. Acad. Sci. USA 2009, 106, 18914–18919.
38. Jin, J.; Gu, Y.; Mei, S. A stochastic gradient approach on compressive sensing signal reconstruction based on adaptive filtering framework. IEEE J. Sel. Top. Signal Process. 2010, 4, 409–420.
39. Boyd, S.; Parikh, N.; Chu, E.; Peleato, B.; Eckstein, J. Distributed optimization and statistical learning via the alternating direction method of multipliers. Found. Trends Mach. Learn. 2011, 3, 1–122.
40. Song, C.; Xia, S.T. Alternating direction algorithms for ℓ0 regularization in compressed sensing. arXiv 2016, arXiv:1604.04424.
41. Qiu, K.; Dogandzic, A. ECME thresholding methods for sparse signal reconstruction. arXiv 2010, arXiv:1004.4880.
42. Zhang, S.; Constantinides, A.G. Lagrange Programming Neural Networks. IEEE Trans. Circuits Syst. II 1992, 39, 441–452.
43. Shi, Z.; Wang, H.; Leung, C.S.; So, H.C.; Liang, J.; Tsang, K.F.; Constantinides, A.G. Robust ellipse fitting based on Lagrange programming neural network and locally competitive algorithm. Neurocomputing 2020, 399, 399–413.
44. Balavoine, A.; Rozell, C.; Romberg, J. Global convergence of the locally competitive algorithm. In Proceedings of the IEEE Signal Processing Education Workshop (DSP/SPE) 2011, Sedona, AZ, USA, 4–7 January 2011; pp. 431–436.
45. Pandey, O. Operational Amplifier (Op-Amp). In Electronics Engineering; Springer: Berlin/Heidelberg, Germany, 2022; pp. 233–270.
46. Bult, K.; Wallinga, H. A CMOS four-quadrant analog multiplier. IEEE J. Solid-State Circuits 1986, 21, 430–435.
47. Chen, C.; Li, Z. A low-power CMOS analog multiplier. IEEE Trans. Circuits Syst. II Express Briefs 2006, 53, 100–104.
48. Filanovsky, I.; Baltes, H. Simple CMOS analog square-rooting and squaring circuits. IEEE Trans. Circuits Syst. I Fundam. Theory Appl. 1992, 39, 312–315.
49. Sakul, C. A new CMOS squaring circuit using voltage/current input. In Proceedings of the 23rd International Technical Conference on Circuits/Systems, Computers and Communications ITC-CSCC, Phuket, Thailand, 5–8 July 2008; pp. 525–528.
50. van den Berg, E.; Friedlander, M.P. Probing the Pareto frontier for basis pursuit solutions. SIAM J. Sci. Comput. 2008, 31, 890–912.
51. Ji, S.; Xue, Y.; Carin, L. Bayesian compressive sensing. IEEE Trans. Signal Process. 2008, 56, 2346–2356.
52. Vadivel, R.; Hammachukiattikul, P.; Zhu, Q.; Gunasekaran, N. Event-triggered synchronization for stochastic delayed neural networks: Passivity and passification case. Asian J. Control 2022.
53. Chanthorn, P.; Rajchakit, G.; Humphries, U.; Kaewmesri, P.; Sriraman, R.; Lim, C.P. A delay-dividing approach to robust stability of uncertain stochastic complex-valued hopfield delayed neural networks. Symmetry 2020, 12, 683.
54. Spruck, J. Strictly Monotone Functions and the Inverse Function Theorem. Lecture Notes in MATH–405. Available online: https://math.jhu.edu/~js/math405/405.monotone.pdf (accessed on 24 November 2021).
Figure 1. Threshold function and sparsity measure function. (a) The shape of T_{α,γ,κ}(u_i) under various settings. (b) Sparsity measure function. For α = 0, γ → ∞, and κ = 1, the value of x cannot be in the range of (0, 1) based on the property of the ideal thresholding function T_{0,∞,1}(u).
Figure 2. (a) Circuit for the thresholding function T(x). (b) Equivalent circuit of the thresholding function T(x) when the magnitude of the input is greater than V_ref. (c) Equivalent circuit of the thresholding function T(x) when the magnitude of the input is less than or equal to V_ref.
Figure 3. Thresholding function T(x) obtained from the circuit simulation of Figure 2 with V_ref = 1.
Figure 4. Analog realization of (21) at the block diagram level.
Figure 5. Dynamics obtained from Simulink and the discrete-time simulation.
Figure 6. Typical dynamics of u and x for the LPNN-LPQC model, where n = 4096. The first column: k = 75, m = 500. The second column: k = 100, m = 600. The third column: k = 125, m = 700. In all sub-figures, we only show the dynamics of the u_i's and x_i's whose original x_i values are non-zero, because there would be 4096 curves in each sub-figure if all the dynamics for the u_i's and x_i's were shown.
Figure 7. Simulation results of different algorithms, where n = 4096. For the first column, k = 75. For the second column, k = 100. For the third column, k = 125. The three rows are based on three different noise levels. The experiments are repeated 100 times using different settings.
Figure 8. Comparison with other analog models on the probability of reconstruction, where n = 4096. For the first column, k = 75. For the second column, k = 100. For the third column, k = 125. The three rows are based on three different noise levels. The experiments are repeated 100 times using different settings.
Figure 9. Comparison with other analog models on MSE, where n = 4096. For the first column, k = 75. For the second column, k = 100. For the third column, k = 125. The three rows are based on three different noise levels. The experiments are repeated 100 times using different settings.
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Share and Cite

MDPI and ACS Style

Wang, H.; Feng, R.; Leung, C.-S.; Chan, H.P.; Constantinides, A.G. A Lagrange Programming Neural Network Approach with an ℓ0-Norm Sparsity Measurement for Sparse Recovery and Its Circuit Realization. Mathematics 2022, 10, 4801. https://doi.org/10.3390/math10244801
