Tensor Global Extrapolation Methods Using the n-Mode and the Einstein Products

Abstract: In this paper, we present new tensor extrapolation methods as generalizations of well-known vector, matrix and block extrapolation methods, such as polynomial extrapolation methods or ε-type algorithms. We define new tensor products that will be used to introduce global tensor extrapolation methods. We discuss the application of these methods to the solution of linear and nonlinear tensor systems of equations, and propose an efficient implementation of these methods via the global-QR decomposition.


Introduction
Scalar extrapolation methods have been developed to accelerate the convergence of sequences of numbers in R or C. The process consists in transforming a sequence (s_n) converging to s into new sequences by applying transformations T_k, k = 1, 2, …, defined in terms of auxiliary scalar sequences g_i(n), i = 1, …, k − 1, that characterize the method. One of the earliest extrapolation methods is the well-known Aitken Δ² process [1], defined by

T(s_n) = s_n − (Δs_n)² / Δ²s_n,

where the first and second forward differences are defined by Δs_n = s_{n+1} − s_n and Δ²s_n = Δs_{n+1} − Δs_n; in that case g_1(n) = Δs_n. Such a process was first proposed in 1926 by Aitken in [2]. It is well known that, under some assumptions, the new Aitken sequence converges to s faster than the original one.

In this paper, we define some new tensor products that will allow us to develop the new methods, based on orthogonal or oblique projections onto subspaces of small dimension. It will be shown that, when the tensor sequence is generated linearly, the proposed methods are theoretically equivalent to some tensor Krylov subspace methods, such as the tensor versions of the GMRES and Lanczos methods developed recently in [18,19].

The remainder of this paper is organized as follows. In Section 2, we give notations and some basic definitions and properties related to tensors. In Section 3, we introduce the tensor versions of the vector polynomial extrapolation methods, namely the Tensor Global Reduced Rank Extrapolation (TG-RRE), the Tensor Global Minimal Polynomial Extrapolation (TG-MPE) and the Tensor Global Modified Minimal Polynomial Extrapolation (TG-MMPE). We also give a global tensor version of the topological ε-transformation. Section 4 describes the application of the proposed methods to the solution of linear and nonlinear tensor systems of equations. In Section 5, we introduce efficient implementations via the tensor global-QR decomposition, and in the last section we present some numerical experiments.
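To make the process concrete, here is a small Python/NumPy sketch (an illustration, not part of the paper) of Aitken's Δ² transformation; the sequence s_n = 1 + 0.5^n is a hypothetical example with a purely geometric error, for which Aitken's process recovers the limit exactly:

```python
import numpy as np

def aitken(s):
    """Aitken's Delta^2 process: t_n = s_n - (Δs_n)^2 / Δ²s_n."""
    s = np.asarray(s, dtype=float)
    ds = np.diff(s)        # first forward differences Δs_n
    d2s = np.diff(ds)      # second forward differences Δ²s_n
    return s[:-2] - ds[:-1] ** 2 / d2s

# s_n = 1 + 0.5**n converges linearly to 1; the transformed sequence
# hits the limit immediately because the error is purely geometric.
s = [1 + 0.5 ** n for n in range(8)]
t = aitken(s)
```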

Preliminaries and Notations
In this section, we briefly review some concepts and notions that are used throughout the paper. A tensor is a multidimensional array of data and a natural extension of scalars, vectors and matrices to higher orders: a scalar is a zeroth-order tensor, a vector is a first-order tensor and a matrix is a second-order tensor. The order of a tensor is the number of its indices, which are called modes or ways. For a given N-mode tensor X ∈ R^{I_1×I_2×…×I_N}, the notation x_{i_1,…,i_N} (with 1 ≤ i_j ≤ I_j, j = 1, …, N) stands for the element (i_1, …, i_N) of X. For a given tensor X ∈ R^{I_1×I_2×…×I_N}, the notation X_{:,…,:,k} (with the colon repeated (N − 1) times), k = 1, 2, …, I_N, denotes the tensor in R^{I_1×I_2×…×I_{N−1}} obtained by fixing the last index; it is called a frontal slice. Fibers are the higher-order analogue of matrix rows and columns: a fiber is defined by fixing every index but one. A matrix column is a mode-1 fiber and a matrix row is a mode-2 fiber. Figure 1 shows the frontal, horizontal and lateral slices of a third-order tensor as well as a mode-3 tube fiber.

The n-mode matricization of a tensor X ∈ R^{I_1×I_2×…×I_N} is denoted by X_{(n)} and arranges the mode-n fibers as the columns of the resulting matrix X_{(n)} ∈ R^{I_n×(I_1⋯I_{n−1}I_{n+1}⋯I_N)}, n = 1, …, N.

An important operation for a tensor is the tensor–matrix multiplication [20], also known as the n-mode product. The n-mode product of a tensor X ∈ R^{I_1×I_2×…×I_N} with a matrix U ∈ R^{J×I_n} is a tensor of size I_1×…×I_{n−1}×J×I_{n+1}×…×I_N whose entries are given by

(X ×_n U)_{i_1,…,i_{n−1},j,i_{n+1},…,i_N} = Σ_{i_n=1}^{I_n} x_{i_1,…,i_N} u_{j,i_n}.

The n-mode vector product of X with a vector v ∈ R^{I_n}, denoted by X ×_n v, is a tensor of size I_1×…×I_{n−1}×I_{n+1}×…×I_N whose entries are given by

(X ×_n v)_{i_1,…,i_{n−1},i_{n+1},…,i_N} = Σ_{i_n=1}^{I_n} x_{i_1,…,i_N} v_{i_n}.

We next recall the definition and some properties of the Einstein product; see [19].
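As an illustration of the n-mode product, the following Python/NumPy sketch (illustrative, not the paper's code) contracts mode n of a tensor with the rows of a matrix:

```python
import numpy as np

def n_mode_product(X, U, n):
    """n-mode product X ×_n U of a tensor X with a matrix U ∈ R^{J×I_n}.

    The result has the same shape as X except that mode n now has
    size J (= U.shape[0]).
    """
    # tensordot contracts axis n of X with axis 1 of U and appends the
    # remaining axis of U (size J) at the end; move it back to mode n.
    Y = np.tensordot(X, U, axes=(n, 1))
    return np.moveaxis(Y, -1, n)

# Example: X ∈ R^{4×5×6}, U ∈ R^{3×5}, so X ×_1 U ∈ R^{4×3×6}.
np.random.seed(1)
X = np.random.rand(4, 5, 6)
U = np.random.rand(3, 5)
Y = n_mode_product(X, U, 1)
assert Y.shape == (4, 3, 6)
```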
Definition 1. Let A ∈ R^{I_1×…×I_L×K_1×…×K_N} and B ∈ R^{K_1×…×K_N×J_1×…×J_M}. The Einstein product of the tensors A and B is the tensor of size I_1×…×I_L×J_1×…×J_M defined by

(A *_N B)_{i_1,…,i_L,j_1,…,j_M} = Σ_{k_1,…,k_N} a_{i_1,…,i_L,k_1,…,k_N} b_{k_1,…,k_N,j_1,…,j_M}.

If the tensor B satisfies b_{i_1,…,i_M,j_1,…,j_N} = a_{j_1,…,j_N,i_1,…,i_M}, then B is called the transpose of A and is denoted by A^T.
A tensor D = (d_{i_1,…,i_N,j_1,…,j_N}) ∈ C^{I_1×…×I_N×I_1×…×I_N} is diagonal if d_{i_1,…,i_N,j_1,…,j_N} = 0 whenever the indices (i_1, …, i_N) differ from (j_1, …, j_N). If all the entries of a tensor are zero, it is called the zero tensor and is denoted by O.
The trace of a tensor A ∈ C^{I_1×…×I_N×I_1×…×I_N} is given by tr(A) = Σ_{i_1,…,i_N} a_{i_1,…,i_N,i_1,…,i_N}. The inner product of two tensors of the same size A, B ∈ R^{I_1×…×I_N×J_1×…×J_M} is given by

⟨A, B⟩ = Σ_{i_1,…,i_N} Σ_{j_1,…,j_M} a_{i_1,…,i_N,j_1,…,j_M} b_{i_1,…,i_N,j_1,…,j_M},

and the associated norm is defined by ‖A‖ = ⟨A, A⟩^{1/2}. The identity (or unit) tensor I ∈ R^{I_1×…×I_N×I_1×…×I_N} is the diagonal tensor whose entries are all zero except for the diagonal entries I_{i_1,…,i_N,i_1,…,i_N} = 1.
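The Einstein product, the inner product and the associated norm can be sketched in Python/NumPy as follows (an illustration, not the paper's code); `np.tensordot` with an integer axes argument contracts the last N modes of A with the first N modes of B:

```python
import numpy as np

def einstein_product(A, B, N):
    """Einstein product A *_N B: contract the last N modes of A
    with the first N modes of B."""
    return np.tensordot(A, B, axes=N)

def inner(A, B):
    """Tensor inner product <A, B> = sum of entrywise products."""
    return np.sum(A * B)

def norm(A):
    """Norm induced by the inner product."""
    return np.sqrt(inner(A, A))

# Example: A ∈ R^{2×3×4×5}, B ∈ R^{4×5×6}; A *_2 B ∈ R^{2×3×6}.
np.random.seed(2)
A = np.random.rand(2, 3, 4, 5)
B = np.random.rand(4, 5, 6)
C = einstein_product(A, B, 2)
assert C.shape == (2, 3, 6)
```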
Proof. The proof follows easily from the definitions of the two tensor products involved.

Tensor Extrapolation Methods
In this section, we present two classes of new tensor global extrapolation methods. The first class contains the tensor polynomial-based methods, while the second class is devoted to the global tensor topological ε-algorithm.
We will see later how to choose the vector α^{(k)} ∈ R^k.
Let T̃_k denote the new transformation obtained from (T_k) by shifting the sequence index; notice that the scalars α_j^{(k)} are the same in the expressions of T_k(S_n) and T̃_k(S_n). We now define the generalized residual of T_k^{(n)} as the difference R(T_k^{(n)}) = T̃_k(S_n) − T_k(S_n). Then we get the expression (6), where the first forward differences are ΔS_n = S_{n+1} − S_n and ΔH_k(n) = [ΔG_0(n), …, ΔG_{k−1}(n)], and the vector α^{(k)} is obtained from the orthogonality relation (7).

Let H_{k,n} and L_{k,n} denote the tensor subspaces H_{k,n} = span{ΔG_0(n), …, ΔG_{k−1}(n)} and L_{k,n} = span{Y_1, …, Y_k}. Using (6) and (7), the generalized residual satisfies the relations (8) and (9). Assuming that the k × k matrix with entries ⟨Y_i, ΔG_{j−1}(n)⟩, i, j = 1, …, k, is nonsingular, the vector α^{(k)} appearing in the expression (6) of the generalized residual R(T_k^{(n)}) is given by (12), and the approximation T_k^{(n)} is then computed from (13).

For the tensor global polynomial extrapolation methods, namely the Tensor Global MPE (TG-MPE), the Tensor Global RRE (TG-RRE) and the Tensor Global MMPE (TG-MMPE), the auxiliary sequences are chosen as G_j(n) = ΔS_{n+j}, so that H_k(n) = ΔV_k(n) and ΔH_k(n) = Δ²V_k(n), where Δ² is the second forward difference Δ²S_n = ΔS_{n+1} − ΔS_n. In this case, using the relation (12), the approximation T_k^{(n)} given in (13) can be expressed in terms of ΔV_k(n) and Δ²V_k(n). It is clear that T_k^{(n)} exists and is unique if and only if the k × k matrix with entries ⟨Y_i, Δ²S_{n+j−1}⟩ is nonsingular. The generalized residual given in the relation (6) can be expressed accordingly.

The choice of the tensors Y_1, …, Y_k ∈ R^{I_1×…×I_N×K_1×…×K_M} required in the orthogonality relation (7) determines the global polynomial tensor extrapolation method. For the three well-known polynomial-based extrapolation methods, the classical choices are Y_j = ΔS_{n+j−1} for TG-MPE, Y_j = Δ²S_{n+j−1} for TG-RRE, and Y_j fixed, arbitrarily chosen tensors for TG-MMPE.

Next, we propose an efficient implementation of these methods. For this purpose, we first express the approximation T_k^{(n)} given in relation (2) in a different way.
The system of linear equations (17) can be rewritten in matrix form; with these notations, the linear system (18) takes a more compact form. Once the coefficients of this system have been calculated, we introduce new variables, in terms of which the tensor approximation T_k^{(n)} can be expressed.
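To illustrate the polynomial extrapolation transformations above, here is a minimal Python/NumPy sketch of an RRE-type transformation applied to vectorized tensor iterates (an illustration of the underlying least-squares problem, not the paper's global-QR implementation of Section 5; the linear iteration at the end is a hypothetical test problem):

```python
import numpy as np

def tg_rre(S):
    """RRE-type extrapolation of a list S of k+2 same-shape tensors.

    Vectorizes the iterates and computes T = S_0 + sum_i xi_i ΔS_i,
    where xi minimizes ||ΔS_0 + Δ²V xi|| in the least-squares sense.
    """
    V = np.stack([s.ravel() for s in S], axis=1)   # columns S_0 .. S_{k+1}
    dV = np.diff(V, axis=1)                        # ΔS_0 .. ΔS_k
    d2V = np.diff(dV, axis=1)                      # Δ²S_0 .. Δ²S_{k-1}
    xi, *_ = np.linalg.lstsq(d2V, -dV[:, 0], rcond=None)
    return (V[:, 0] + dV[:, :-1] @ xi).reshape(S[0].shape)

# Hypothetical linear iteration S_{n+1} = M S_n + b on vectorized 2x2
# tensors; with enough iterates, RRE reproduces the fixed point.
M, b = np.diag([0.9, 0.5, 0.3, 0.1]), np.ones(4)
S = [np.zeros((2, 2))]
for _ in range(5):
    S.append((M @ S[-1].ravel() + b).reshape(2, 2))
T = tg_rre(S)
```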

The Tensor Global Topological ε-Transformation
In [6], Brezinski proposed a generalization of the scalar ε-algorithm to vector sequences, called the topological ε-algorithm (TEA). The matrix case was introduced by Jbilou and Sadok in [9]. In this section, we define the tensor global topological ε-transformation (TG-TET).
Let (S_n) be a sequence of tensors in R^{I_1×I_2×…×I_N×K_1×…×K_M} and consider approximations E_k(S_n) = E_k^{(n)} of the limit of the tensor sequence (S_n)_{n∈N}, defined as a combination of S_n and correction terms with scalar coefficients a_j^{(n)}. We set E_k(S_n) = E_k^{(n)} and define the j-th tensor generalized residual accordingly. The coefficients a_j^{(n)} are computed such that each j-th generalized residual is orthogonal to some chosen tensor Y ∈ R^{I_1×I_2×…×I_N×K_1×…×K_M}.
Let D_{k,n} denote the k × k matrix whose (i, j) entry is ⟨Y, Δ²S_{n+i+j−2}⟩. Then, from the orthogonality relation (27), a^{(n)} = (a_1^{(n)}, …, a_k^{(n)})^T ∈ R^k is given by (29), where z^{(n)} = (⟨Y, ΔS_n⟩, …, ⟨Y, ΔS_{n+k−1}⟩)^T. We assume that the matrix D_{k,n} is nonsingular; hence, the approximation E_k^{(n)} exists, is unique and can be expressed in terms of W_k(n) = [ΔS_n, …, ΔS_{n+k−1}]. If the matrix D_{k,n} is singular, then the approximation E_k^{(n)} is not defined and we have a breakdown of the method. One possibility to overcome this drawback is, instead of computing a^{(n)} by means of (29), to solve the least squares problem min_{a^{(n)}∈R^k} ‖D_{k,n} a^{(n)} + z^{(n)}‖.
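The transformation can be sketched in Python/NumPy as follows (an illustrative reconstruction under the assumptions stated in the comments; for k = 1 and scalar sequences it reduces to Aitken's Δ² process):

```python
import numpy as np

def tg_tet(S, Y, k):
    """Sketch of one step of the tensor global topological
    epsilon-transformation, on a list S = [S_n, ..., S_{n+2k}].

    Illustrative reconstruction: a solves D a = -z, where
    z_j = <Y, ΔS_{n+j-1}> and D_{ij} = <Y, Δ²S_{n+i+j-2}>; then
    E = S_n + sum_i a_i ΔS_{n+i-1}.
    """
    dS = [S[i + 1] - S[i] for i in range(2 * k)]
    d2S = [dS[i + 1] - dS[i] for i in range(2 * k - 1)]
    z = np.array([np.sum(Y * dS[j]) for j in range(k)])
    D = np.array([[np.sum(Y * d2S[i + j]) for j in range(k)]
                  for i in range(k)])
    a = np.linalg.solve(D, -z)
    return S[0] + sum(a[i] * dS[i] for i in range(k))
```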
In the next sections, we will see how these tensor global extrapolation methods could be applied for solving linear and nonlinear tensor equations.

Application to Tensor Linear Systems
Consider the following tensor linear system of equations

A *_N X = B,  (31)

where A is a tensor in R^{I_1×…×I_N×I_1×…×I_N}, B ∈ R^{I_1×…×I_N×J_1×…×J_M} and X ∈ R^{I_1×…×I_N×J_1×…×J_M} is the unknown tensor to be determined. We assume here that the square tensor A is nonsingular, so that (31) has a unique solution X* = A^{−1} *_N B. The purpose here is to apply tensor extrapolation methods to the problem (31) in order to compute approximations to the solution X*. Starting from an initial tensor guess S_0, we construct the linear sequence (S_n)_n by the fixed-point iteration

S_{n+1} = (I − A) *_N S_n + B.  (32)

Notice that if the sequence (S_n)_n is convergent, then its limit S = X* is the solution of the system (31). In [18,19], tensor Krylov subspace methods based on the Einstein product have been introduced, including the tensor GMRES and some tensor Lanczos-based methods. The next theorem shows that, when applied to the linear sequence (32), the proposed tensor extrapolation methods are tensor Krylov subspace methods and are mathematically equivalent to some well-known Krylov-based methods such as the tensor GMRES.

Theorem 1. When applied to the sequence generated by (32), TG-RRE and TG-MPE are tensor Krylov subspace methods and are mathematically equivalent to the tensor global GMRES and the tensor global Arnoldi methods, respectively.
Proof. For simplicity, and unless specified otherwise, we set n = 0, T_k^{(0)} = T_k, S_0 = X_0 (the initial guess), and drop the index n in all our notations. When applied to the sequence generated by the linear relation (32), the TG-RRE, TG-MMPE and TG-MPE methods above produce approximations X_k = T_k such that, using (6), the corresponding residual R_k = B − A *_N T_k coincides with the generalized residual R(T_k). Notice that, since V_k spans the tensor Krylov subspace K_k(A, R_0) defined in [18,19], the extrapolation methods above are tensor global Krylov subspace methods. TG-RRE is an orthogonal projection method and is theoretically equivalent to the tensor global GMRES, while TG-MPE is an oblique projection method and is equivalent to the tensor global Arnoldi method.
As TG-RRE is an orthogonal projection method, we also have the classical minimization property for the residual, ‖R_k‖ = min_{X ∈ X_0 + K_k(A, R_0)} ‖B − A *_N X‖. When the linear process (32) is convergent, it is more useful in practice to apply the tensor extrapolation methods after some fixed number of basic iterations. To save memory, we can also use the algorithm in a cycling mode, which means that the iterations are restarted after a fixed number m of iterations. The resulting procedure is summarized in Algorithm 1.

Algorithm 1 TG-RRE, TG-MPE and TG-MMPE Algorithms
Step 1. Set k = 0; choose X_0 and the integers p and m.
Step 2. Basic iteration: starting from X_0, perform p basic iterations of (32) and generate the terms S_0, …, S_{m+1}.
Step 3. Extrapolation: compute the approximation T_m; if the accuracy is sufficient, stop; otherwise set X_0 = T_m, k = k + 1 and go to Step 2.
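The cycling mode can be sketched as follows (Python/NumPy, illustrative; the inner `rre` helper is a simplified stand-in for the extrapolation step, and `g` is a hypothetical basic iteration, not the paper's implementation):

```python
import numpy as np

def rre(S):
    """Minimal RRE on vectorized iterates (illustrative helper)."""
    V = np.stack([s.ravel() for s in S], axis=1)
    dV = np.diff(V, axis=1)
    d2V = np.diff(dV, axis=1)
    xi, *_ = np.linalg.lstsq(d2V, -dV[:, 0], rcond=None)
    return (V[:, 0] + dV[:, :-1] @ xi).reshape(S[0].shape)

def restarted_extrapolation(g, X0, m=5, p=1, maxcycle=50, tol=1e-8):
    """Cycling mode: run p warm-up iterations, generate m+1 further
    iterates, extrapolate, and restart from the extrapolated value."""
    X = X0
    for _ in range(maxcycle):
        for _ in range(p):              # warm-up iterations
            X = g(X)
        S = [X]
        for _ in range(m + 1):          # window of m+2 iterates
            S.append(g(S[-1]))
        X_new = rre(S)
        if np.linalg.norm(X_new - X) <= tol * max(np.linalg.norm(X), 1.0):
            return X_new
        X = X_new                       # restart
    return X
```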

Application to Nonlinear Tensor Systems
Consider the nonlinear system of tensor equations

X = G(X),  (33)

where G is an operator from R^{I_1×…×I_N×J_1×…×J_M} onto R^{I_1×…×I_N×J_1×…×J_M}, and X* ∈ R^{I_1×…×I_N×J_1×…×J_M} is the solution of the system of equations (33). For any tensor X, the residual is given by R(X) = G(X) − X. Let (S_n)_n be the sequence of tensors generated from an initial guess S_0 as follows: S_{n+1} = G(S_n), n = 0, 1, …
To get approximate solutions to a solution of (33), we apply the tensor extrapolation methods above to the sequence (S n ). As for the linear case, the different steps are summarized in the following Algorithm 2.
Algorithm 2 TG-RRE, TG-MPE and TG-MMPE for nonlinear systems
Step 1. Set k = 0; choose X_0 and the integers p and m.
Step 2. Basic iteration and extrapolation: compute the approximation T_m, set k = k + 1 and go to Step 2.
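For a nonlinear illustration, the sketch below applies a one-shot MPE-type transformation to a fixed-point sequence (Python/NumPy, vectorized implementation; the map G and its coefficients are hypothetical test data, not from the paper):

```python
import numpy as np

def tg_mpe(S):
    """MPE-type extrapolation of a list S of k+2 same-shape tensors.

    c solves the least-squares problem min ||dV[:, :k] c + dV[:, k]||;
    then T = sum_{j=0}^{k} gamma_j S_j with gamma = (c, 1) / sum(c, 1)
    (assumes the sum of the coefficients is nonzero).
    """
    V = np.stack([s.ravel() for s in S], axis=1)
    dV = np.diff(V, axis=1)                        # ΔS_0 .. ΔS_k
    c, *_ = np.linalg.lstsq(dV[:, :-1], -dV[:, -1], rcond=None)
    c = np.append(c, 1.0)
    gamma = c / c.sum()
    return (V[:, :-1] @ gamma).reshape(S[0].shape)

# Hypothetical entrywise fixed-point map G(X) = C .* cos(X);
# 6 warm-up iterations, then k + 2 = 6 terms for the extrapolation.
C = np.array([[0.3, 0.4], [0.5, 0.2]])
G = lambda X: C * np.cos(X)
X = np.zeros((2, 2))
for _ in range(6):
    X = G(X)
S = [X]
for _ in range(5):
    S.append(G(S[-1]))
T = tg_mpe(S)
```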

Remark 2.
As stated earlier, when applied to linearly generated tensor sequences, the proposed tensor extrapolation methods produce the same iterates as some well-known tensor Krylov subspace methods, but differ in the way the approximations are computed. When the process generating the sequence (S_n)_n is not known and only the terms of the sequence are available, Krylov-based or Newton-type methods cannot be used; in that case, extrapolation methods are welcome. Such problems arise, for example, in statistics; see [16] for the application of the vector ε-algorithm of Wynn [23] within the expectation-maximization (EM) algorithm to find maximum likelihood estimates from incomplete or missing data.

The Global-QR Implementation of TG-MPE/TG-RRE
The purpose of this section is to give an efficient implementation of TG-MPE and TG-RRE, using a generalization of the technique based on the QR decomposition given in [11] for the vector MPE and RRE methods. We first introduce the Tensor Global-QR decomposition. Let U ∈ R^{I_1×I_2×…×I_M×J_1×J_2×…×J_N×k} be an (M + N + 1)-mode tensor with column tensors U_0, …, U_{k−1} ∈ R^{I_1×I_2×…×I_M×J_1×J_2×…×J_N}. Then there exist an (M + N + 1)-mode orthogonal tensor Q = [Q_0, …, Q_{k−1}] ∈ R^{I_1×I_2×…×I_M×J_1×J_2×…×J_N×k}, whose column tensors satisfy ⟨Q_i, Q_j⟩ = δ_{ij} (that is, Q^T Q = I_{k×k} along mode N + M + 1), and an upper triangular matrix R ∈ R^{k×k} such that

U = Q ×_{(N+M+1)} R^T.  (35)

The tensor decomposition (35) will be called the Tensor Global-QR (TG-QR) decomposition of U; its computation is summarized in the following Algorithm 3.
Algorithm 3 The Tensor Global-QR (TG-QR) decomposition
1. Input: the (M + N + 1)-mode tensor U with column tensors U_0, …, U_{k−1}.
2. For j = 0, 1, …, k − 1
   (a) W = U_j
   (b) For i = 0, …, j − 1: r_{i,j} = ⟨Q_i, W⟩; W = W − r_{i,j} Q_i; End for
   (c) r_{j,j} = ⟨W, W⟩^{1/2}
   (d) Q_j = W / r_{j,j}
   (e) End for
End
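The decomposition can be sketched in Python/NumPy as follows (an illustrative modified Gram–Schmidt on the tensor columns, using the tensor inner product; the last mode of U plays the role of mode N + M + 1):

```python
import numpy as np

def tg_qr(U):
    """Tensor Global-QR: Gram-Schmidt on the tensor columns
    U_0, ..., U_{k-1} stored along the last mode of U, with respect
    to the inner product <A, B> = sum(A * B)."""
    k = U.shape[-1]
    Q = np.zeros_like(U, dtype=float)
    R = np.zeros((k, k))
    for j in range(k):
        W = U[..., j].astype(float).copy()
        for i in range(j):
            R[i, j] = np.sum(Q[..., i] * W)      # r_{i,j} = <Q_i, W>
            W = W - R[i, j] * Q[..., i]
        R[j, j] = np.sqrt(np.sum(W * W))         # r_{j,j} = <W, W>^{1/2}
        Q[..., j] = W / R[j, j]                  # assumes r_{j,j} != 0
    return Q, R

# Example: a random 4-mode tensor with k = 5 tensor columns.
np.random.seed(0)
U = np.random.rand(3, 4, 2, 5)
Q, R = tg_qr(U)
```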
To show that the tensor Q is orthogonal, we proceed by induction on k. The coefficients of the matrix R are the r_{i,j}'s given by Algorithm 3. Next, we give a proposition to be used in the sequel.
For simplicity, we set here n = 0, so that the approximation T_k^{(0)} (defined earlier in (23)) can be used directly. Substituting ΔV_k = ΔV_k(0) = Q ×_{(N+M+1)} R^T and using Proposition 2, we obtain an expression for T_k^{(0)} in which θ_j is the (j + 1)-st component of the column vector Rδ^{(k)}; the matrix R and the tensors Q_j ∈ R^{I_1×…×I_M×J_1×…×J_N}, j = 0, …, k − 1, are given by Algorithm 3. To compute δ^{(k)}, we first compute γ^{(k)} = (γ_0^{(k)}, …, γ_{k−1}^{(k)})^T and use the relations given in (22). For TG-MPE, the coefficients γ_l^{(k)} of the vector γ^{(k)} are determined by computing the vector β^{(k)}, the solution of the linear system of equations (38); for TG-RRE, β^{(k)} is determined by solving the linear system of equations (39).
Finally, once β^{(k)} is computed, the coefficients γ_l^{(k)} are given by (40). The following Algorithm 4 summarizes the main steps.
Algorithm 4 Implementation of TG-MPE/TG-RRE via the global-QR decomposition
1. S_0 is a given initial guess and k is a fixed index.
2. Apply Algorithm 3 to compute the global-QR decomposition of the tensor ΔV_k.
3. Compute β^{(k)} by solving the linear system (38) or (39).
4. Compute the coefficients γ_j^{(k)} from (40).
5. Compute δ^{(k)} from the relation (22), and θ_j as the (j + 1)-st component of the column vector Rδ^{(k)}.

6. Compute the approximation T_k.
Numerical Experiments

As a numerical test, we consider the linear system of tensor equations (31), where A is a random tensor with N = 3, I_1 = I_2 = 20, I_3 = 10, J_1 = 20, J_2 = 10 and J_3 = 5, as in [19]. The exact solution is the tensor X whose entries are all equal to one, and the right-hand side B is given by B = A *_N X. The computations were carried out in MATLAB 7.4, with machine epsilon about 2 · 10^{−16}.
We took m = 10 and p = 1, and stopped the iterations when the relative residual norm was less than 10^{−6}. After k = 6 iterations, Algorithm 4 for the tensor RRE method gives the relative residual norm ‖R_k‖/‖R_0‖ = 8.5 × 10^{−7}, with the initial guess X_0 = O. We notice that the convergence of the original sequence (S_n) is very slow: at n = 200, the relative residual ‖R(S_n)‖/‖R_0‖ is still about 10^{−1}. Extrapolation methods are particularly attractive for nonlinear problems, where Krylov-based methods cannot be applied directly.
More experimental studies and applications to some areas, such as the multilinear Google PageRank problem, should be considered in the future.

Conclusions
In this paper, we introduced new tensor global extrapolation methods to accelerate the convergence of tensor sequences. The proposed methods are generalizations to the tensor case of some well-known vector extrapolation methods, such as the reduced rank extrapolation and the topological epsilon-algorithm. The new methods were defined as orthogonal or oblique projection processes using the Einstein product and some new tensor products. We showed how to apply the derived algorithms to linear and nonlinear systems of tensor equations. The application to some interesting problems, such as the multilinear PageRank, is still under investigation.