Abstract
This paper presents a tensor approximation algorithm, based on the Levenberg–Marquardt method for the nonlinear least squares problem, to decompose large-scale tensors into a sum of outer products of vector groups of a given scale, or to obtain a low-rank tensor approximation without losing too much accuracy. An Armijo-like rule of inexact line search is also introduced into this algorithm. The result of the tensor decomposition is adjustable, which implies that the decomposition can be specified according to the users’ requirements. Convergence is proved, and numerical experiments show that the algorithm has some advantages over the classical Levenberg–Marquardt method. It is applicable to both symmetric and asymmetric tensors, and it is expected to play a role in the fields of large-scale data compression storage and large-scale tensor approximation operations.
1. Introduction
Tensors are widely used in various practical fields, such as video analysis [,], text analysis [], high-level network analysis [,,], data encryption [,,], privacy preserving [,,], and so on. As a form of data storage, tensors usually contain a large amount of data, which sometimes makes problems involving tensors, such as solving tensor equations, very difficult [,,,]. Sometimes it is even impossible to store all entries of a higher-order tensor explicitly. Thus, tensor decomposition and tensor approximation have received more and more attention since the last century, especially as the era of big data has arrived. Research on tensors has become an attractive topic, and an increasing number of methods [,,,] for solving tensor decomposition problems have been proposed, such as the widely used steepest descent method, Newton’s method [], the Gauss–Newton method [], CANDECOMP/PARAFAC (CP) decomposition [], Tucker decomposition, Tensor-Train (TT) decomposition, algebraic methods [,,], and so on. We also refer to [] and the references therein for more related work on low-rank tensor approximation techniques. These methods have rich theoretical achievements and practical applications [,,]. Among them, CP decomposition is an ideal method in terms of both theory and accuracy, and the specific algorithm is given in []. However, finding the rank of a CP decomposition is NP-hard [], so only certain types of tensors can be decomposed exactly by CP decomposition; moreover, when the rank is too large, the decomposition can even increase the data size. As for TT decomposition [], it is convenient because it applies the Singular Value Decomposition (SVD), but the accuracy can be greatly degraded.
While the accuracy of tensor decomposition is crucial in practical applications, errors are inevitable; what matters is that the error remains within an acceptable range. The same holds for the accuracy of tensor approximation. To reduce the storage and minimize the error, this paper explores a nonlinear least squares method for tensor decomposition and low-rank CP tensor approximation, which decomposes or approximates a tensor into/by a sum of rank-one tensors. We first transform the problem into a nonlinear least squares problem, so that we can make use of the corresponding theories and techniques of descent methods. Then, we combine the Levenberg–Marquardt method [,,] with the damped method [] to obtain a more efficient method for solving this least squares problem. An Armijo-like rule of inexact line search is introduced into this algorithm to ensure convergence, which distinguishes it from the Alternating Least Squares (ALS) method. We also optimize the damping parameter to accelerate the convergence rate. The Levenberg–Marquardt method was proposed independently by Kenneth Levenberg [] and Donald Marquardt [], and it is an effective method for solving least squares problems, since it can achieve local quadratic convergence under a certain local error bound assumption. Recently, the Levenberg–Marquardt method has been extended to solve tensor-related problems, such as tensor equations, complementarity problems, and tensor split feasibility problems [,,,].
It should be noted that, although the computational efficiency of tensor approximation is improved by this damped Levenberg–Marquardt method, the iterative process inevitably becomes more time-consuming as the tensor grows. However, the proposed method still has the following advantages. Firstly, the tensor storage can be significantly reduced, since the tensor approximation can be specified according to the users’ requirements. Secondly, the proposed algorithm is universal, because the tensor need not satisfy any special properties, and it is applicable to both symmetric and asymmetric tensors. Thirdly, the method is provably convergent and shows some advantages over the classical Levenberg–Marquardt method in numerical experiments. The primary contribution of this paper is methodological, and the method may play a role in the fields of large-scale data compression storage and large-scale tensor approximation operations, just as other tensor approximation methods do.
The rest of this paper is organized as follows. Section 2 introduces the symbols and concepts. Section 3 focuses on the tensor approximation problem, converts it into a nonlinear least squares problem, and then introduces an algorithm for solving this problem based on the Levenberg–Marquardt method combined with the damped method. After that, some numerical experiments are performed in Section 4 to test the feasibility and efficiency of the proposed algorithm. Finally, Section 5 concludes the paper.
2. Preliminaries
2.1. Notation
Throughout this paper, vectors are written as italic lowercase letters, such as $x$; matrices are written as capital letters, such as $A$; and tensors are written as calligraphic letters, such as $\mathcal{A}$. The symbol $\mathbb{N}$ denotes the set of non-negative integers, the symbol $\mathbb{N}_+$ denotes the set of positive integers, $\mathbb{R}$ denotes the set of real numbers, and $\mathbb{R}^n$ denotes the $n$-dimensional Euclidean space.
Suppose the function $F: \mathbb{R}^n \to \mathbb{R}$ is differentiable and so smooth that the following Taylor expansion is valid:
$$F(x + h) = F(x) + h^T g + \frac{1}{2} h^T H h + O(\|h\|^3),$$
where $g$ in this paper denotes the gradient
$$g = F'(x) = \left[ \frac{\partial F}{\partial x_1}(x), \ldots, \frac{\partial F}{\partial x_n}(x) \right]^T,$$
and $H$ stands for the Hessian
$$H = F''(x), \qquad (H)_{ij} = \frac{\partial^2 F}{\partial x_i \partial x_j}(x).$$
Let the function $L$ be defined as follows:
$$L(h) \equiv F(x) + h^T g + \frac{1}{2} h^T H h,$$
where the gain ratio $\varrho$ in the Levenberg–Marquardt method is defined as
$$\varrho = \frac{F(x) - F(x + h)}{L(0) - L(h)}. \qquad (5)$$
For a vector function $f: \mathbb{R}^n \to \mathbb{R}^m$, its Taylor expansion is
$$f(x + h) = f(x) + J(x) h + O(\|h\|^2),$$
where $f(x) = [f_1(x), \ldots, f_m(x)]^T$ and $J(x) \in \mathbb{R}^{m \times n}$ is the Jacobian, with
$$(J(x))_{ij} = \frac{\partial f_i}{\partial x_j}(x).$$
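To make the Jacobian concrete, the following minimal NumPy sketch (our illustration; `fd_jacobian` and the test function are hypothetical names, not part of the paper) approximates $J(x)$ by forward differences, which is also a convenient way to sanity-check analytic derivatives:

```python
import numpy as np

def fd_jacobian(f, x, eps=1e-7):
    """Forward-difference approximation of (J(x))_ij = d f_i / d x_j."""
    fx = np.asarray(f(x), dtype=float)
    J = np.zeros((fx.size, x.size))
    for j in range(x.size):
        xp = x.copy()
        xp[j] += eps
        J[:, j] = (np.asarray(f(xp), dtype=float) - fx) / eps
    return J

# Example: f(x) = (x1^2 - x2, x1 * x2); the exact Jacobian at (2, 3) is [[4, -1], [3, 2]].
f = lambda x: np.array([x[0] ** 2 - x[1], x[0] * x[1]])
print(fd_jacobian(f, np.array([2.0, 3.0])))
```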
2.2. Least Squares Problem
A nonlinear least squares problem is to find $x^*$, which is a local minimizer for
$$F(x) = \frac{1}{2} \sum_{i=1}^{m} \left( f_i(x) \right)^2 = \frac{1}{2} \|f(x)\|^2 = \frac{1}{2} f(x)^T f(x), \qquad (7)$$
where $f_i: \mathbb{R}^n \to \mathbb{R}$, $i = 1, \ldots, m$, are given functions.
2.3. Descending Condition
From a starting point $x_0$, a descent method produces a series of points $x_0, x_1, \ldots$, which (hopefully) converges to $x^*$, a local minimizer for a given function, and the descending condition is []
$$F(x_{k+1}) < F(x_k). \qquad (9)$$
2.4. Rank-One Tensors
If an $m$th-order tensor $\mathcal{A} \in \mathbb{R}^{n_1 \times n_2 \times \cdots \times n_m}$ satisfies
$$\mathcal{A} = x^{(1)} \circ x^{(2)} \circ \cdots \circ x^{(m)}, \qquad (10)$$
where $x^{(i)} \in \mathbb{R}^{n_i}$, $i = 1, \ldots, m$, and the symbol “∘” means the vector outer product, i.e.,
$$a_{i_1 i_2 \cdots i_m} = x^{(1)}_{i_1} x^{(2)}_{i_2} \cdots x^{(m)}_{i_m},$$
then $\mathcal{A}$ is called a rank-one tensor [].
2.5. Frobenius Norm
For an $m$th-order tensor $\mathcal{A} \in \mathbb{R}^{n_1 \times \cdots \times n_m}$, the Frobenius norm is defined as
$$\|\mathcal{A}\|_F = \sqrt{\sum_{i_1=1}^{n_1} \sum_{i_2=1}^{n_2} \cdots \sum_{i_m=1}^{n_m} a_{i_1 i_2 \cdots i_m}^2}.$$
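As a small self-contained illustration (ours, not part of the paper), the following NumPy sketch builds a third-order rank-one tensor by the outer product (10) and evaluates its Frobenius norm; for a rank-one tensor the norm factorizes into the product of the vector norms, which the last line verifies:

```python
import numpy as np
from functools import reduce

def rank_one(*vectors):
    """x(1) ∘ x(2) ∘ ... ∘ x(m): entry (i1, ..., im) = x(1)[i1] * ... * x(m)[im]."""
    return reduce(np.multiply.outer, vectors)

x1, x2, x3 = np.array([1.0, 2.0]), np.array([3.0, 4.0, 5.0]), np.array([6.0, 7.0])
A = rank_one(x1, x2, x3)                      # tensor of shape (2, 3, 2)
fro = np.sqrt(np.sum(A ** 2))                 # Frobenius norm as defined above
print(np.isclose(fro, np.linalg.norm(x1) * np.linalg.norm(x2) * np.linalg.norm(x3)))
```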
3. A Damped Levenberg–Marquardt Method for Tensor Approximation
For a given tensor $\mathcal{A} \in \mathbb{R}^{n_1 \times n_2 \times \cdots \times n_m}$, the number of data entries is denoted as $N = n_1 n_2 \cdots n_m$. Given $R \in \mathbb{N}_+$, the tensor $\mathcal{A}$ can be approximated by $\mathcal{B}$, with
$$\mathcal{B} = \sum_{r=1}^{R} \mathcal{B}_r,$$
where $\mathcal{B}_r$, $r = 1, \ldots, R$, according to (10), are all rank-one tensors as
$$\mathcal{B}_r = x_r^{(1)} \circ x_r^{(2)} \circ \cdots \circ x_r^{(m)}.$$
Here, $x_r^{(i)} \in \mathbb{R}^{n_i}$, $i = 1, \ldots, m$, $r = 1, \ldots, R$. Thus, there are $R(n_1 + n_2 + \cdots + n_m)$ variables in the vectors $x_r^{(i)}$, which we collect in a vector $u \in \mathbb{R}^{R(n_1 + \cdots + n_m)}$. In order to approximate $\mathcal{A}$ with $\mathcal{B}$, we construct a vector function $f: \mathbb{R}^{R(n_1 + \cdots + n_m)} \to \mathbb{R}^{N}$, whose entries are
$$f_{i_1 i_2 \cdots i_m}(u) = b_{i_1 i_2 \cdots i_m} - a_{i_1 i_2 \cdots i_m} = \sum_{r=1}^{R} x^{(1)}_{r, i_1} x^{(2)}_{r, i_2} \cdots x^{(m)}_{r, i_m} - a_{i_1 i_2 \cdots i_m}.$$
Then, we obtain a nonlinear least squares problem, which is to find
$$u^* = \arg\min_{u} F(u), \qquad (19)$$
where
$$F(u) = \frac{1}{2} \|f(u)\|^2 = \frac{1}{2} f(u)^T f(u).$$
Concurrently,
$$\|f(u)\|^2 = \|\mathcal{B} - \mathcal{A}\|_F^2,$$
combined with (7),
yields
$$F(u) = \frac{1}{2} \|\mathcal{B} - \mathcal{A}\|_F^2.$$
Consequently, the number of data decreases from $N = n_1 n_2 \cdots n_m$ to $R(n_1 + n_2 + \cdots + n_m)$. In other words, when the given number $R$ is small enough so that $R(n_1 + \cdots + n_m) < n_1 n_2 \cdots n_m$, then obviously the data size will be reduced.
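The construction above can be sketched in a few lines of NumPy (our illustrative code; the helper names `unpack`, `cp_tensor`, `residual`, and `objective` are hypothetical): the variable vector $u$ stacks the $R(n_1 + \cdots + n_m)$ factor entries, $\mathcal{B}$ is the sum of $R$ rank-one tensors, and $F(u) = \frac{1}{2}\|\mathcal{B} - \mathcal{A}\|_F^2$; the final line checks the storage condition $R(n_1 + \cdots + n_m) < n_1 n_2 \cdots n_m$:

```python
import numpy as np
from functools import reduce

def unpack(u, dims, R):
    """Split u into the factor vectors x_r^(i), r = 1..R, i = 1..m."""
    factors, k = [], 0
    for _ in range(R):
        group = []
        for n in dims:
            group.append(u[k:k + n])
            k += n
        factors.append(group)
    return factors

def cp_tensor(u, dims, R):
    """B = sum_r x_r^(1) ∘ ... ∘ x_r^(m), the rank-R approximant."""
    return sum(reduce(np.multiply.outer, group) for group in unpack(u, dims, R))

def residual(u, A, R):
    """The vector function f(u): all entries of B - A, flattened."""
    return (cp_tensor(u, A.shape, R) - A).ravel()

def objective(u, A, R):
    """F(u) = (1/2) ||f(u)||^2 = (1/2) ||B - A||_F^2."""
    r = residual(u, A, R)
    return 0.5 * (r @ r)

dims, R = (4, 5, 6), 3
A = np.random.rand(*dims)
u0 = np.random.rand(R * sum(dims))
print(objective(u0, A, R), R * sum(dims) < np.prod(dims))  # True: storage is reduced
```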
For obtaining a minimizer of (19), this paper applies the Levenberg–Marquardt method combined with the damped method, inspired by the high efficiency of the Levenberg–Marquardt method in solving least squares problems. With the starting point $u_0$, we generate a sequence $\{u_k\}$, which satisfies the descending condition (9). Every step from the current point $u$ consists of: (a) finding a descent direction $h$; (b) finding a step size $\alpha$ to achieve a proper decrease in the value of $F$; and (c) generating the next point $u := u + \alpha h$. The iterative direction $h$ (also written as $h_{lm}$) is obtained by solving the following equation:
$$(J^T J + \mu I)\, h = -g, \qquad (24)$$
where $J = J(u)$ is the Jacobian of $f$, $g = g(u) = J(u)^T f(u)$ is the gradient of $F$, $I$ is the identity matrix, and $\mu > 0$ is the damping parameter.
According to (5), the gain ratio in this damped Levenberg–Marquardt method is
$$\varrho = \frac{F(u) - F(u + h)}{L(0) - L(h)},$$
for (noting that $g = J^T f$ and taking $H \approx J^T J$ as in the Gauss–Newton model)
$$L(0) - L(h) = -h^T J^T f - \frac{1}{2} h^T J^T J h = \frac{1}{2} h^T (\mu h - g),$$
where $L(0) - L(h)$ is positive as long as $h \neq 0$, since we find by (24) that $-h^T g = h^T (J^T J + \mu I) h > 0$. Therefore, whether $\varrho$ is positive or not is determined by $F(u) - F(u + h)$. Now, the damped Levenberg–Marquardt algorithm is stated as follows (Algorithm 1).
Remark 1.
In this algorithm, we applied the damped technique, which indicates that when $\varrho$ is small, $\mu$ should be increased, thereby increasing the penalty on large steps; on the contrary, when $\varrho$ is large, $L(h)$ is a good approximation to $F(u + h)$, and $\mu$ should be decreased. According to [], the following strategy in general outperforms the original updating rule: if $\varrho > 0$, then $\mu := \mu \cdot \max\{1/3,\, 1 - (2\varrho - 1)^3\}$ and $\nu := 2$; otherwise, $\mu := \mu \nu$ and $\nu := 2\nu$. The termination condition is $\|g\| \leq \varepsilon$. Moreover, steps 9 and 10 show the line search process [], where $\delta, \beta \in (0, 1)$ are given parameters, and $a_{ii}$ are the diagonal elements of the matrix $A = J^T J$, whose maximum is used to initialize the damping parameter as $\mu_0 = \tau \max_i a_{ii}$.
Theorem 1.
If $u^*$ is a cluster point of the sequence $\{u_k\}$ generated by Algorithm 1, then $g(u^*) = 0$.
Proof.
Suppose the contrary; it holds that $g^* = g(u^*) \neq 0$. From (24) and $u^*$ being the cluster point, passing to a subsequence along which $u_k \to u^*$ and $\mu_k \to \mu^* \geq 0$, we obtain $h_k \to h^*$ with
$$(J(u^*)^T J(u^*) + \mu^* I)\, h^* = -g^*, \qquad (g^*)^T h^* = -(h^*)^T (J(u^*)^T J(u^*) + \mu^* I)\, h^* < 0, \qquad (27)$$
then $h^* \neq 0$. (The inequality in (27) is strict: if the quadratic form vanished, $h^*$ would lie in the null space of the positive semidefinite matrix $J(u^*)^T J(u^*) + \mu^* I$, forcing $g^* = 0$.) According to steps 9 and 10 in Algorithm 1, we have
$$F(u_k + \alpha_k h_k) \leq F(u_k) + \delta \alpha_k g_k^T h_k,$$
where $g_k = g(u_k)$ and $\alpha_k$ is the accepted step size; for the sequence $\{F(u_k)\}$ generated by the steps in Algorithm 1, which is nonincreasing and bounded below, when $k$ is sufficiently large, it holds that
$$F(u_k) - F(u_{k+1}) \to 0.$$
By the Armijo criterion, we obtain $0 \leq -\delta \alpha_k g_k^T h_k \leq F(u_k) - F(u_{k+1})$, then $\alpha_k g_k^T h_k \to 0$. If $\limsup_k \alpha_k > 0$, this gives $(g^*)^T h^* = 0$ along a further subsequence, which contradicts (27); hence $\alpha_k \to 0$, and by the backtracking rule the trial step $\alpha_k / \beta$ violates the Armijo condition for sufficiently large $k$:
$$F(u_k + (\alpha_k / \beta) h_k) > F(u_k) + \delta (\alpha_k / \beta)\, g_k^T h_k.$$
It follows from the mean value theorem that there exists $\xi_k$ on the segment between $u_k$ and $u_k + (\alpha_k / \beta) h_k$ such that
$$g(\xi_k)^T h_k > \delta\, g_k^T h_k.$$
Take the limit, and we have
$$(g^*)^T h^* \geq \delta\, (g^*)^T h^*,$$
which implies that $(1 - \delta)(g^*)^T h^* \geq 0$, i.e., $(g^*)^T h^* \geq 0$. This contradicts (27). Therefore, it holds that $g(u^*) = 0$. □
Algorithm 1 A damped Levenberg–Marquardt method for tensor approximation.
Input: the tensor $\mathcal{A}$, the number $R$ of rank-one terms, and the parameters $\tau$, $\varepsilon$, $\delta$, $\beta$. Output: $u$.
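To make the overall flow concrete, the following NumPy sketch gives one plausible reading of Algorithm 1 as it is described in the text and in Remark 1: step 2 solves (24) for $h$, steps 9 and 10 correspond to the Armijo backtracking, and $\mu$ is updated by the strategy quoted in Remark 1. The default parameter values, the exact step ordering, and the recomputation pattern are our assumptions rather than the authors’ exact choices; `f` and `jac` can be, for example, the `residual` function sketched above together with a finite-difference or analytic Jacobian.

```python
import numpy as np

def damped_lm(f, jac, u, tau=1e-3, eps=1e-8, delta=1e-4, beta=0.5, max_iter=200):
    """Sketch of a damped Levenberg-Marquardt iteration with Armijo line search.
    f: residual vector function; jac: Jacobian of f; defaults are illustrative."""
    n = u.size
    fu = f(u)
    J = jac(u)
    A, g = J.T @ J, J.T @ fu
    mu, nu = tau * np.max(np.diag(A)), 2.0        # initial damping mu_0 = tau * max_i a_ii
    Fu = 0.5 * (fu @ fu)
    for _ in range(max_iter):
        if np.linalg.norm(g, np.inf) <= eps:      # termination: gradient small enough
            break
        h = np.linalg.solve(A + mu * np.eye(n), -g)   # Equation (24)
        alpha = 1.0                                   # Armijo backtracking (steps 9-10)
        while alpha > 1e-12:
            ft = f(u + alpha * h)
            if 0.5 * (ft @ ft) <= Fu + delta * alpha * (g @ h):
                break
            alpha *= beta
        u_new = u + alpha * h
        f_new = f(u_new)
        F_new = 0.5 * (f_new @ f_new)
        # Gain ratio (5), with L(0) - L(h) = 0.5 * h^T (mu * h - g):
        rho = (Fu - F_new) / (0.5 * (h @ (mu * h - g)))
        if rho > 0:                               # Nielsen's strategy from Remark 1
            mu *= max(1.0 / 3.0, 1.0 - (2.0 * rho - 1.0) ** 3)
            nu = 2.0
        else:
            mu *= nu
            nu *= 2.0
        u, fu, Fu = u_new, f_new, F_new
        J = jac(u)                                # refresh the linearization
        A, g = J.T @ J, J.T @ fu
    return u
```

Paired with the `residual` and `fd_jacobian` sketches given earlier, this reproduces the overall flow of Section 3 on a small random problem, though an analytic Jacobian would be used in practice, since finite differences are far too slow for large tensors.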
4. Numerical Results
In this section, we perform some numerical experiments to test the proposed method and report the results. We evaluate the error by the value of the objective function $F$. The computing environment is MATLAB R2018b on a laptop with an Intel(R) Core(TM) i7-7700HQ CPU @ 2.80 GHz and 16.0 GB RAM.
Example 1.
The tensor $\mathcal{A}$ is generated such that all of its elements are random integers in a fixed range. With a given $R$ and the corresponding parameter settings, we can obtain the iterative process as follows.
The iterative process of Algorithm 1 for solving this example is reported in Table 1. For comparison, the classical Levenberg–Marquardt method with a simple fixed update of the damping parameter is also applied to solve this example, and its iterative process is shown in Table 2. Obviously, Algorithm 1 converges faster than the classical method. Figure 1 and Figure 2 show the decrease of the $F$ value when solving this problem by Algorithm 1 and by the classical Levenberg–Marquardt method, respectively.
Table 1.
Results of Algorithm 1 solving Example 1.
Table 2.
Results of the classical Levenberg–Marquardt method solving Example 1.
Figure 1.
F-value generated by Algorithm 1.
Figure 2.
F-value generated by the classical Levenberg–Marquardt method.
Example 2.
The tensor $\mathcal{A}$, which is non-square, is generated such that all of its elements are random integers in a fixed range. With a given $R$ and the corresponding parameter settings, the iterative process is stated in Table 3. This example shows that Algorithm 1 is also applicable to non-square tensors.
Table 3.
Results of Algorithm 1 solving Example 2.
Example 3.
The tensor $\mathcal{A}$ is generated such that all of its elements are random integers in a fixed range. With a given $R$ and the corresponding parameter settings, the time cost of some essential parts of Algorithm 1 is given in Table 4.
Table 4.
Time cost of the essential parts of Algorithm 1 in Example 3.
Here, the column “Time for h” states the time for computing $h$, and the column “Time cost” states the total runtime. We see clearly from Table 4 that solving (24) for $h$ occupies most of the runtime. In this paper, the linear system is solved by the function “mldivide” in MATLAB R2018b, a comprehensive solver that selects among direct factorizations such as the LU decomposition.
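As a side note of ours (not from the paper): for $\mu > 0$ the coefficient matrix $J^T J + \mu I$ in (24) is symmetric positive definite, so a Cholesky factorization, which MATLAB’s “mldivide” also attempts for symmetric matrices, solves the system at roughly half the cost of a general LU solve. A minimal SciPy sketch:

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve

n, mu = 500, 1e-3
J = np.random.rand(2 * n, n)
g = np.random.rand(n)
M = J.T @ J + mu * np.eye(n)        # symmetric positive definite when mu > 0
h = cho_solve(cho_factor(M), -g)    # Cholesky-based solve of (24)
print(np.allclose(M @ h, -g))
```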
Example 4.
The tensors $\mathcal{A}$ are generated from the RGB data of some images shown below. With a given $R$ and the corresponding parameter settings, we apply Algorithm 1 to decompose these tensors, obtain their approximating sums of rank-one tensors, and then show the corresponding images generated from the approximating tensors (i.e., the restored images generated from $\mathcal{B}$). See Figure 3, Figure 4, Figure 5, Figure 6, Figure 7 and Figure 8. Here, Figure 3, Figure 5 and Figure 7 are the original images, and, comparatively, Figure 4, Figure 6 and Figure 8 are the restored images. The iterative process is stated in Table 5. The table and Figure 3, Figure 4, Figure 5, Figure 6, Figure 7 and Figure 8 show that Algorithm 1 is applicable to color image tensor decomposition.
Figure 3.
No. 1 yellow and blue color bar: original image.
Figure 4.
No. 1 yellow and blue color bar: restored image.
Figure 5.
No. 2 red, yellow and blue color bar: original image.
Figure 6.
No. 2 red, yellow and blue color bar: restored image.
Figure 7.
No. 3 letter E: original image.
Figure 8.
No. 3 letter E: restored image.
Table 5.
Results of Algorithm 1 solving Example 4.
From these numerical results, we find that Algorithm 1 can be used for tensor decomposition. Compared with the classical Levenberg–Marquardt method, this damped Levenberg–Marquardt method behaves better in tensor approximation. With the help of the damped technique, the objective function $F$ decreases rapidly, and the damping parameter $\mu$ plays an important role in the adjustment of $h$. By this tensor approximation technique, the data size is clearly decreased; however, the computing speed becomes slower as the size of the tensor increases. The principal reason is that solving Equation (24) to obtain $h$ costs the most time.
5. Conclusions
In this paper, we explore a tensor approximation method, based on nonlinear least squares, to reduce the scale of tensor storage. It improves the classical Levenberg–Marquardt algorithm by adjusting the damping parameter $\mu$ with the gain ratio $\varrho$, which can speed up the convergence of the algorithm. An Armijo-like line search rule is also introduced into this algorithm to ensure convergence. Preliminary numerical results show that this tensor decomposition method can help to reduce the data size effectively. Additionally, this method does not require the tensor to satisfy any specific properties, such as supersymmetry [] and so on. The tensor can be square or non-square, sparse or non-sparse, symmetric or asymmetric. However, the computing time increases rapidly as the size of the tensor increases. The analysis of the numerical experiments indicates that the main cause is that solving the linear Equation (24) (step 2 in Algorithm 1) to compute $h$ costs most of the time. How to overcome this still needs further study. Nevertheless, the proposed method makes some sense in the field of large-scale data compression storage and large-scale tensor approximation operations.
Author Contributions
In this paper, J.Z. (Jinyao Zhao) was in charge of the Levenberg–Marquardt method for the tensor decomposition problem, the numerical experiments, and most of the paper writing. X.Z. was in charge of part of the work and of correcting. J.Z. (Jinling Zhao) was in charge of the methodology, theoretical analysis, experiment design, and supervision. All authors have read and agreed to the published version of the manuscript.
Funding
This research was funded by the National Natural Science Foundation of China, grant numbers 12171105 and 11271206, and the Fundamental Research Funds for the Central Universities, grant number FRF-DF-19-004.
Data Availability Statement
The authors confirm that the data supporting the findings of this study are available within the article.
Conflicts of Interest
The authors declare no conflict of interest.
References
- Li, Q.; Shi, X.; Schonfeld, D. A general framework for robust HOSVD-based indexing and retrieval with high-order tensor data. In Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Prague, Czech Republic, 22–27 May 2011.
- Yan, H.Y.; Chen, M.Q.; Hu, L.; Jia, C.F. Secure video retrieval using image query on an untrusted cloud. Appl. Soft Comput. 2020, 97, 106782.
- Liu, N.; Zhang, B.Y.; Yan, J. Text Representation: From Vector to Tensor. In Proceedings of the IEEE International Conference on Data Mining, Houston, TX, USA, 27–30 November 2005.
- Kolda, T.G.; Bader, B.W.; Kenny, J.P. Higher-Order Web Link Analysis Using Multilinear Algebra. In Proceedings of the 5th IEEE International Conference on Data Mining (ICDM 2005), Houston, TX, USA, 27–30 November 2005.
- Jiang, N.; Jie, W.; Li, J.; Liu, X.M.; Jin, D. GATrust: A Multi-Aspect Graph Attention Network Model for Trust Assessment in OSNs. IEEE Trans. Knowl. Data Eng. 2022, Early Access.
- Ai, S.; Hong, S.; Zheng, X.Y.; Wang, Y.; Liu, X.Z. CSRT rumor spreading model based on complex network. Int. J. Intell. Syst. 2021, 36, 1903–1913.
- Liu, Z.L.; Huang, Y.Y.; Song, X.F.; Li, B.; Li, J.; Yuan, Y.L.; Dong, C.Y. Eurus: Towards an Efficient Searchable Symmetric Encryption with Size Pattern Protection. IEEE Trans. Dependable Secur. Comput. 2022, 19, 2023–2037.
- Gao, C.Z.; Li, J.; Xia, S.B.; Choo, K.K.R.; Lou, W.J.; Dong, C.Y. MAS-Encryption and its Applications in Privacy-Preserving Classifiers. IEEE Trans. Knowl. Data Eng. 2022, 34, 2306–2323.
- Mo, K.H.; Tang, W.X.; Li, J.; Yuan, X. Attacking Deep Reinforcement Learning with Decoupled Adversarial Policy. IEEE Trans. Dependable Secur. Comput. 2023, 20, 758–768.
- Zhu, T.Q.; Zhou, W.; Ye, D.Y.; Cheng, Z.S.; Li, J. Resource Allocation in IoT Edge Computing via Concurrent Federated Reinforcement Learning. IEEE Internet Things J. 2022, 9, 1414–1426.
- Liu, Z.L.; Lv, S.Y.; Li, J.; Huang, Y.Y.; Guo, L.; Yuan, Y.L.; Dong, C.Y. EncodeORE: Reducing Leakage and Preserving Practicality in Order-Revealing Encryption. IEEE Trans. Dependable Secur. Comput. 2022, 19, 1579–1591.
- Zhu, T.Q.; Li, J.; Hu, X.Y.; Xiong, P.; Zhou, W.L. The Dynamic Privacy-Preserving Mechanisms for Online Dynamic Social Networks. IEEE Trans. Knowl. Data Eng. 2022, 34, 2962–2974.
- Li, X.; Ng, M.K. Solving sparse non-negative tensor equations: Algorithms and applications. Front. Math. China 2015, 10, 649–680.
- Li, D.H.; Xie, S.; Xu, H.R. Splitting methods for tensor equations. Numer. Linear Algebra Appl. 2017, 24, e2102.
- Ding, W.Y.; Wei, Y.M. Solving multi-linear systems with M-tensors. J. Sci. Comput. 2016, 68, 689–715.
- Han, L. A homotopy method for solving multilinear systems with M-tensors. Appl. Math. Lett. 2017, 69, 49–54.
- Hitchcock, F.L. The expression of a tensor or a polyadic as a sum of products. J. Math. Phys. 1927, 6, 164–189.
- Hitchcock, F.L. Multiple invariants and generalized rank of a p-way matrix or tensor. J. Math. Phys. 1928, 7, 39–79.
- Kolda, T.G. Multilinear operators for higher-order decompositions. Sandia Rep. 2006.
- Sidiropoulos, N.D.; Bro, R. On the uniqueness of multilinear decomposition of N-way arrays. J. Chemom. 2000, 14, 229–239.
- Goncalves, M.L.N.; Oliveira, F.R. An inexact Newton-like conditional gradient method for constrained nonlinear systems. Appl. Numer. Math. 2018, 132, 22–34.
- Madsen, K.; Nielsen, H.B.; Tingleff, O. Methods for Non-Linear Least Squares Problems, 2nd ed.; Technical University of Denmark: Kongens Lyngby, Denmark, 2004. Available online: https://orbit.dtu.dk/en/publications/methods-for-non-linear-least-squares-problems-2nd-ed (accessed on 7 March 2023).
- Harshman, R.A. Foundations of the PARAFAC procedure: Models and conditions for an “explanatory” multimodal factor analysis. UCLA Work. Pap. Phon. 1970, 16, 1–84.
- Drexler, F. Eine Methode zur Berechnung sämtlicher Lösungen von Polynomgleichungssystemen. Numer. Math. 1977, 29, 45–48.
- Garcia, C.; Zangwill, W. Finding all solutions to polynomial systems and other systems of equations. Math. Program. 1979, 16, 159–176.
- Li, T. Numerical solution of multivariate polynomial systems by homotopy continuation methods. Acta Numer. 1997, 6, 399–436.
- Grasedyck, L.; Kressner, D.; Tobler, C. A literature survey of low-rank tensor approximation techniques. GAMM-Mitteilungen 2013, 36, 53–78.
- Li, J.; Ye, H.; Li, T.; Wang, W.; Lou, W.J.; Hou, Y.; Liu, J.Q.; Lu, R.X. Efficient and Secure Outsourcing of Differentially Private Data Publishing with Multiple Evaluators. IEEE Trans. Dependable Secur. Comput. 2022, 19, 67–76.
- Yan, H.Y.; Hu, L.; Xiang, X.Y.; Liu, Z.L.; Yuan, X. PPCL: Privacy-preserving collaborative learning for mitigating indirect information leakage. Inf. Sci. 2021, 548, 423–437.
- Hu, L.; Yan, H.Y.; Li, L.; Pan, Z.J.; Liu, X.Z.; Zhang, Z.L. MHAT: An efficient model-heterogenous aggregation training scheme for federated learning. Inf. Sci. 2021, 560, 493–503.
- Kolda, T.G.; Bader, B.W. Tensor decompositions and applications. SIAM Rev. 2009, 51, 455–500.
- Carroll, J.D.; Chang, J.J. Analysis of individual differences in multidimensional scaling via an N-way generalization of “Eckart–Young” decomposition. Psychometrika 1970, 35, 283–319.
- Oseledets, I.V. Tensor-train decomposition. SIAM J. Sci. Comput. 2011, 33, 2295–2317.
- Levenberg, K. A method for the solution of certain non-linear problems in least squares. Q. Appl. Math. 1944, 2, 164–168.
- Marquardt, D. An algorithm for least-squares estimation of nonlinear parameters. SIAM J. Appl. Math. 1963, 11, 431–441.
- Tichavsky, P.; Phan, H.A.; Cichocki, A. Krylov–Levenberg–Marquardt Algorithm for Structured Tucker Tensor Decompositions. IEEE J. Sel. Top. Signal Process. 2021, 99, 1–10.
- Nielsen, H.B. Damping Parameter in Marquardt’s Method; IMM, Technical University of Denmark: Kongens Lyngby, Denmark, 1999. Available online: https://findit.dtu.dk/en/catalog/537f0cba7401dbcc120040af (accessed on 7 March 2023).
- Huang, B.H.; Ma, C.F. The modulus-based Levenberg–Marquardt method for solving linear complementarity problem. Numer. Math. Theory Methods Appl. 2018, 12, 154–168.
- Huang, B.H.; Ma, C.F. Accelerated modulus-based matrix splitting iteration method for a class of nonlinear complementarity problems. Comput. Appl. Math. 2018, 37, 3053–3076.
- Lv, C.Q.; Ma, C.F. A Levenberg–Marquardt method for solving semi-symmetric tensor equations. J. Comput. Appl. Math. 2018, 332, 13–25.
- Jin, Y.X.; Zhao, J.L. A Levenberg–Marquardt Method for Solving the Tensor Split Feasibility Problem. J. Oper. Res. Soc. China 2021, 9, 797–817.