Abstract
A novel algorithm called the Proper Generalized Decomposition (PGD) is widely used by the engineering community to compute the solution of high dimensional problems. However, it is well-known that the bottleneck of its practical implementation focuses on the computation of the so-called best rank-one approximation. Motivated by this fact, we are going to discuss some of the geometrical aspects of the best rank-one approximation procedure. More precisely, our main result is to construct explicitly a vector field over a low-dimensional vector space and to prove that we can identify its stationary points with the critical points of the best rank-one optimization problem. To obtain this result, we endow the set of tensors with fixed rank-one with an explicit geometric structure.
Keywords:
proper generalised decomposition; alternating least squares; greedy rank one update algorithm; tensor numerical methods MSC:
15A69; 15A23; 57R25; 65N30
1. Introduction
Many problems in science and engineering are hard to compute due their numerical complexity. Moreover, in the analysis of complex systems under a real time constraint, the evaluation of all possible scenarios appears as a necessity [1]. Despite the improvements in techniques used in high dimensional problems, some challenging questions remain unresolved due to the efficiency of our computers. However, a novel technique called Proper Generalized Decomposition (PGD) [2,3] has been developed to provide an answer of these difficult tasks. It was initially proposed to compute, in a separated representation framework, the variational solution of partial differential equations (PDE) defined over a tensor product space [4]. It is possible to distinguish two different benefits. The first one is the possibility of managing high dimensional problems, and the second is the possibility to include the model’s parameters as extra-coordinates. This last fact gives a powerful strategy to deal with classical problems because the PGD framework facilitates an efficient design and a real-time decision-making [5,6].
This novel technique allows for computing the whole set of solutions of a parametrized problem. The strategy is to include in an equivalent non-parametrized problem all possible parameter values as extra-coordinates. The name of this particular PGD based approach is Progressive Variational Vademecum [1], and it can be implemented offline. As a consequence, the PGD based approach opens the possibility of solving problems in industry with a different strategy not envisioned until now.
The mathematical analysis of the PGD was given by Falcó and Nouy, Ref. [4] in a Hilbert space framework and in [7] for a more general setting.
The Greedy Rank One Algorithm (GROA) [8] used to solve high-dimensional linear systems (with a full rank matrix) is the procedure of choice in the engineering community to implement the PGD. It is an iterative method made up of two steps that we can cyclically repeat until convergence. The first one consists of computing the minimal residual of the linear system over the set of tensors with bounded rank-one. In the second step, we use this optimal rank-one solution to update the residual. In the following, we return to the first step. Ammar et al. [8] propose an Alternating Least Squares (ALS) Algorithm for the practical implementation of the first step of the GROA. In the aforementioned paper, the authors justify the choice of the ALS showing that its convergence to a critical point of the optimization problem (not necessarily an optimal one) is assured under very weak conditions. In addition, the convergence has been studied by El Hamidi, Osman and Jazar [9] in the framework of Sobolev tensor spaces.
In this work, we want to study the optimization problem of the first step the GROA. To this end, even though our problem is not convex, we will take into account the relationship between the underlying convex optimization problem and the behaviour of its associated gradient flow. It is well-known that the vector field constructed by using a convex functional (for example related with a convex minimization problem defined over a finite dimensional vector space), has a gradient flow that provides a dynamical system with a unique stationary point. Moreover, it can be shown that it is a sink and its stable manifold coincides with the whole domain of the convex functional. This fact motivates the classical paradigm about the convergence of gradient-based numerical optimization algorithms.
In this work, we want to adapt the above paradigm in the framework of the GROA [8]. The main goal is to find a vector field in a low-dimensional vector space related to the gradient flow of a convex functional defined over the set of tensors of fixed rank one. The idea is to use this vector field to characterize the behaviour of the solutions of the non-convex optimization problem associated with the first step of GROA. To achieve this, we will prove that the set of critical points of the optimization problem over the set of tensors with fixed rank-one can be identified with the set of stationary points of that vector field. In order to construct it, we will proceed as follows. First, we will endow the set tensors of fixed rank-one with an explicit structure of smooth manifold. Second, by the help of this geometric structure, we will explicitly construct a vector field over a low dimensional vector space related with the first step of GROA. Finally, we will show that the set of stationary points of this vector field coincides with the set of critical points of the optimization problem associated with the PGD algorithm. In consequence, this vector field allows us to explain the dynamical behaviour around each of its stationary points. Moreover, we can get explicit information, in a neighbourhood of each of these stationary points, about the structure of its stable and unstable manifolds. In our opinion, a more precise knowledge of these invariant sets can help us develop better and more efficient PGD approaches.
The paper is organised as follows. Section 2 provides some preliminary definitions and results used along this paper. Section 3 shows a geometric approach to the PGD. In Section 4, the characterization of the smooth manifold of the set of tensors of fixed rank-one is given. After that, Section 5 shows the first order optimality conditions for the PGD which is the main result of this paper. Finally, Section 6 provides some conclusions of the work.
2. Preliminary Definitions and Results
First of all, we introduce some notation used along this paper. We denote by the set of -matrices and by the transpose of a given matrix As usual, we use
to denote the Euclidean inner product in and its corresponding 2-norm, by Let be the -identity matrix and when the dimension is clear from the context, we simply denote it by Given a sequence we say that a vector can be written as
if and only if
in the -topology. Now, we recall the definition and some properties of the Kronecker product. The Kronecker product of and written is the tensor algebraic operation defined as
To conclude, we list some of the well-known properties of the Kronecker product (see, for example, [10] or [11]).
- 1.
- 2.
- 3.
- 4.
- 5.
- 6.
- If A and B are banded, then is banded.
- 7.
- If A and B are symmetric, then is symmetric.
- 8.
- If A and B are definite positive, then is definite positive.
The concept of separated representation was introduced by Beylkin and Mohlenkamp in [12], and it is related to the problem of constructing the approximate solutions of some classes of problems in high-dimensional spaces by means of a separable function. In particular, for a given map
we say that it has a separable representation if
Now, consider a mesh of in the -variable given by -mesh points, then we can write a discrete version of (1) by
where for Observe that, for each if denotes the vector with components for then (2) is equivalent to
We point out that (3) is a useful expression to implement numerical algorithms using the Matlab and Octave function kron.
Suppose that, for given a linear Partial Differential Equation, and after a discretization by means of Finite Elements, we need to solve the linear system:
where A is a -dimensional invertible matrix, for some that is, Then, from all said above, a low rank approximation
with sufficient approximation exists, for some and where for and Moreover, we would show that
that is,
Thus, in a first approach to solve it, we would like to determine vectors for that minimizes
or, in short
by using the notation introduced in [13].
The Proper Generalised Decomposition (PGD in short) appears when we consider solving the linear Equation (4) as an optimization problem as follows. For each fixed and , we define a map
hence
holds. The goal is to use (5) to approximate the solution of (4). To this end, for each we define the set
introduced in [13], in the following way. Given , we say that if where for For , we define inductively that is,
Note that for all
Unfortunately, from Proposition 4.1 (a) of [13], we have that the set is not necessarily (or even usually) closed for each However, from Proposition 4.2 of [13], it follows that is a closed set in any norm-topology. This fact implies (see Lemma 1 in [8]), that given , then, for every , we have that the set
This allows for considering the following iterative scheme. Let and, for each , take
Note that, given and , we can construct for each by using (8) and (9), a vector
Here, we assume that for that is, Since we define the for obtained by the Greedy Rank-One Update Algorithms (8) and (9) as
The following theorem (see Theorem 1 in [8]) gives the convergence of the Greedy Rank-One Update Approximation for solving linear systems with full rank matrix.
Theorem 1.
Let and Then, by using the iterative scheme (8) and (9), we obtain that the sequence is strictly decreasing and
Moreover, the rate of convergence is given by
for where
for
From (10), we obtain that, if then for all Thus, the above theorem allows for us to construct a procedure, which we give in the pseudo-code form in Algorithm 1, under the assumption that we have a numerical method in order to find a solving (7) (see the step 5 in Algorithm 1) and that we introduce below.
| Algorithm 1 Greedy Rank-One Update |
| 1: procedure GROU () 2: 3: 4: for do 5: 6: 7: 8: if or then goto 13 9: end if 10: end for 11: return and 12: break 13: return and 14: end procedure |
3. A Geometric Approach to the PGD
In this section, to study the procedure given in the line 5 of the Algorithm 1, we introduce a smooth manifold. To this end, introduce first the set of tensors of fixed rank-one in the tensor space defined as
where Observe that the set
Then, our first result is the following theorem of the alternative.
Theorem 2.
Let and Either or but not both.
Proof.
Assume that and that there exists Since we can write
and holds for all we have
Now, consider the map defined as
Then, for all holds. Observe that
and hence the map f has a global minimum for a contradiction. □
The main consequence of this result is the following. It says that the output of the procedure given in step 5 in Algorithm 1 always remains in the set before it gives us the final output.
Corollary 1.
Let and such that Then, for some if and only if and hence for all
As a consequence of the above corollary, the situation of interest is when given and , we have that Thus, in order to study the vectors in we need to characterize the structure of the critical points of the map restricted to the set To see this in the next section, we provide to of a structure of smooth manifold.
4. The Set of Tensors of Fixed Rank-One as a Smooth Manifold
Along this paper, we will consider a manifold as a pair , where is a subset of some finite-dimensional vector space V and is an atlas representing the local coordinate system of We recall the definition of an atlas associated with a set
Definition 1.
Let be a set. An atlas of class or analytic on is a family of charts with some indexing set namely having the following properties (see [14]):
- AT1
- is a covering of that is, for all and
- AT2
- For each stands for a bijection of onto an open set of a finite dimensional normed space and for any α and β the set is open in
- AT3
- Finally, if we let and the transition mapping is a diffeomorphism of class or analytic.
Observe that the condition of an open covering is not used, see [14]. Moreover, in AT2, we do not require that the normed spaces to be the same for all indices or even to be isomorphic. If is linearly isomorphic to some finite dimensional normed space X for all we have the following definition.
Definition 2.
Let be a set and X be a finite dimensional normed space. We say that is a (respectively, analytic) manifold modelled on X if there exists an atlas of class (respectively, analytic) over with linearly isomorphic to X for all
Since different atlases can give the same manifold, we say that two atlases are compatible if each chart of one atlas is compatible with the charts of the other atlas in the sense of AT3. One verifies that the relation of compatibility between atlases is an equivalence relation.
Definition 3.
An equivalence class of atlases of class on also denoted by is said to define a structure of a -manifold on and hence we say that is a finite dimensional manifold. In a similar way, if an equivalence class of atlases is given by analytic maps, then we say that is an analytic finite dimensional manifold.
For each , we construct a local chart as follows. Let be
the orthogonal complement of the linear space for Let us consider the set
which is an open and dense set of the finite-dimensional vector space
Observe that the vector space is linearly isomorphic to the vector space for all
Now, we introduce the set
in for which we can construct a natural bijection:
Then, we can state the following result.
Theorem 3.
The set is an atlas for and hence is a -manifold modelled on
Proof.
Clearly, AT1 holds. To prove AT2 and AT3, let us consider be such that Without loss of generality we may assume that and , where for Then, for each , there exists a unique and a unique such that
Since
holds, there exists a unique such that
Thus, multiplying (13) on the left side by and, by using that and we obtain
Hence,
defines a -function from the open set to for each Moreover,
holds for Observe that can be written as
where
has norm one for In addition,
where
has norm one for Thus,
clearly defines a -function from the open set to Finally, we conclude that
the map
is given by
and it is This follows AT2, AT3 and concludes the proof of the theorem. □
The construction of as an algebraic variety is well-known (see, for example, [15]). More recently, in [16], a structure of smooth manifold is given, in the framework of Banach spaces, for the set of tensors of fixed rank-one. Following [16], it can be shown that the manifold is also a principal bundle as follows. Consider the Grassmann manifold of one-dimensional, subspaces of denoted by for and define the surjective map
Then, for each , it holds that
Consequently, is also a principal bundle with base space and fibre It allows for decomposing the tangent space at denoted into the vertical and horizontal spaces:
where
and
5. On the First Order Optimality Conditions for the PGD
The goal of this section is given and characterize the points in the manifold satisfying the first order optimality conditions of the problem
Recall that the map is defined in the whole ambient space We will denote its derivative at by which is a bounded linear map from to From Theorem 3, we known that is a -manifold and hence it allows us to write the constrained map as follows. Since
and we can take into account the standard inclusion map
in order to write
Definition 4.
We say that is a critical point for in if
holds for all Clearly, if is an extremal point for in , then it is also a critical point for in
Observe that we can write
where on the left side of the equality, we consider over , whereas, in the right one, is considered defined over the whole space. Thus, by using the chain rule, we have
that is,
In order to compute (15), we consider first the standard inclusion map i that in local coordinates in a neighbourhood of looks like
Here, Hence, its derivative as a morphism (a map between manifolds)
is given by
where
for From (15), we have that is a critical point for in if and only if
that is, it is equivalent to state that
Now, we will prove the following result that characterizes the set of critical points for in as the stationary points of a vector field in , and it is the main result of this paper.
Theorem 4.
Given and Then, is a critical point for in if and only if , where is a vector field in given by
Here,
for
Proof.
Since is a critical point for in if and only if (17) holds for Then, (17) is equivalent to prove that
and
Since when for we conclude that is a critical point for in if and only if
Writing the equality in (20) as
we conclude that it is equivalent to state that
for This concludes the proof of theorem. □
Remark 1.
There are several strategies in the literature that can be used to solve . The first one, closely related to the one used in [2,3], is to find a fixed point of the map
One of the most popular numerical strategies to compute an approximated value of such that is founded under the following argument. Since the is equal to
we can choose randomly some vectors for and then try to solve the linear system
By using Least Squares, we can compute a and next we can proceed in a similar way iteratively with each of the other components as follows. After the step , we know that Then, by choosing randomly , we can solve, by using Least Squares, the linear system
to compute We can proceed cyclically until holds. This strategy is known by the name of the Alternating Least Squares (ALS).
From (17), we have that is a critical point of in if and only if
holds, that is, the residual at is orthogonal to linear subspace Observe that where
is an -dimensional subspace of for Then, the next result explicitly describes the linear subspace
Proposition 1.
The linear map is injective. Moreover,
Proof.
Assume that
Since for and for all then
Since , then for This follows the first statement. To prove the second one, we remark that the inclusion
between both subspaces is trivial. Clearly,
holds and then the second statement is also proved. □
6. Conclusions
In this paper, a geometric upgrade of the PGD algorithm is proposed. To this end, we endow a set of tensors of fixed rank-one with a smooth manifold structure. This construction provides the solutions of a non-convex optimization problem by using a set of stationary points of a vector field. This new perspective allows for the characterization of the behaviour of the solutions of the Greedy Rank One Algorithm which could not be achieved with the ALS strategy used in [8].
Author Contributions
Investigation, writing and revision: A.F., L.H., N.M., M.C.M. and E.N. All authors have read and agreed to the published version of the manuscript.
Funding
This research was funded by the GVA/2019/124 grant from Generalitat Valenciana and by the RTI2018-093521-B-C32 grant from the Ministerio de Ciencia, Innovación y Universidades.
Conflicts of Interest
The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.
References
- Falcó, A.; Montés, N.; Chinesta, F.; Hilario, L.; Mora, M.C. On the existence of a Progressive Variational Vademecum based on the Proper Generalized Decomposition for a Class of Elliptic Parametrized Problems. J. Comput. Appl. Math. 2018, 330, 1093–1107. [Google Scholar] [CrossRef]
- Ammar, A.; Mokdad, B.; Chinesta, F.; Keunings, R. A new family of solvers for some classes of multidimensional partial differential equations encountered in Kinetic Theory modelling Complex Fluids. J. Non-Newton. Fluid Mech. 2006, 139, 153–176. [Google Scholar] [CrossRef]
- Ammar, A.; Mokdad, B.; Chinesta, F.; Keunings, R. A new family of solvers for some classes of multidimensional partial differential equations encountered in Kinetic Theory modelling Complex Fluids. Part II: Transient Simulations using Space-Time Separated Representations. J. Non–Newton. Fluid Mech. 2007, 144, 98–121. [Google Scholar] [CrossRef]
- Falcó, A.; Nouy, A. A Proper Generalized Decomposition for the solution of elliptic problems in abstract form by using a functional Eckart- Young approach. J. Math. Anal. Appl. 2011, 376, 469–480. [Google Scholar] [CrossRef]
- Chinesta, F.; Leygue, A.; Bordeu, F.; Aguado, J.V.; Cueto, E.; González, D.; Alfaro, I.; Ammar, A.; Huerta, A. PGD-Based computational vademecum for efficient Design, Optimization and Control. Arch. Comput. Methods Eng. 2013, 20, 31–49. [Google Scholar] [CrossRef]
- Chinesta, F.; Ladevéze, P. Separated Representations and PGD-Based Model Reduction; Springer International Centre for Mechanical Sciences: Cham, Switzerland, 2014. [Google Scholar]
- Falcó, A.; Nouy, A. Proper Generalized Decomposition for Nonlinear Convex Problems in Tensor Banach Spaces. Numer. Math. 2012, 121, 503–530. [Google Scholar] [CrossRef]
- Ammar, A.; Chinesta, F.; Falcó, A. On the convergence of a greedy rank-one update algorithm for a class of linear systems. Arch. Comput. Methods Eng. 2010, 17, 473–486. [Google Scholar] [CrossRef]
- El Hamidi, A.; Osman, H.; Jazar, M. On the convergence of alternating minimization methods in variational PGD. Comput. Optim. Appl. 2017, 68, 455–472. [Google Scholar] [CrossRef]
- Graham, A. Kronecker Products and Matrix Calculus with Applications; John Wiley: Hoboken, NJ, USA, 1981. [Google Scholar]
- Van Loan, C.F. The ubiquitous Kronecker product. J. Comput. Appl. Math. 2000, 123, 85–100. [Google Scholar] [CrossRef]
- Beylkin, G.; Mohlenkamp, M.J. Algorithms for Numerical Analysis in High Dimensions. SIAM J. Sci. Comput. 2005, 26, 2133–2159. [Google Scholar] [CrossRef]
- de Silva, V.; Lim, L.-H. Tensor Rank and Ill-posedness of the best low-rank approximation problem. SIAM J. Matrix Anal. Appl. 2008, 30, 1084–1127. [Google Scholar] [CrossRef]
- Lang, S. Differential and Riemannian Manifolds; Graduate Texts in Mathematics; Springer: Berlin/Heidelberg, Germany, 1995; Volume 160. [Google Scholar]
- Landsberg, J.M. Tensor: Geometry and Applications. Graduate Studies in Mathematics; American Mathematical Society: Providence, Rhode Island, 2012; Volume 128. [Google Scholar]
- Falcó, A.; Hackbusch, W.; Nouy, A. On the Dirac-Frenkel Variational Principal on Tensor Banach Spaces. Found. Comput. Math. 2019, 19, 159–204. [Google Scholar] [CrossRef]
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations. |
© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).