Abstract
This paper proposes a method for solving optimisation problems involving piecewise quadratic functions. The method provides a solution in a finite number of iterations, and the computational complexity of the proposed method is locally polynomial in the problem dimension, i.e., it is polynomial whenever the initial point belongs to a sufficiently small neighbourhood of the solution set. The proposed method can be applied to solving large systems of linear inequalities.
1. Introduction
Let us consider the following optimisation problem:
where , A is an matrix, , , , , , and is the Euclidean norm of .
In this paper, a method for solving the problem in (1) is proposed; moreover, the number of iterations (and hence the computational complexity) required by the proposed method is locally polynomial in m and n, and in the worst case the method has a geometric convergence rate.
Let us define the set of solutions of (1) as follows
If some point sufficiently close to the set of solutions to (1) is known, then it is possible to find a solution of (1) within a polynomial number of computational iterations; thus, the computational complexity is of the order of .
Many methods for solving (1) have been proposed (cf. Karmanov [1], Golikov and Evtushenko [2], Evtushenko and Golikov [3], Tretyakov [4], Tretyakov and Tyrtyshnikov [5] and Han [6]). All of these methods have reasonable computational complexity but, as mentioned above, to date, no strongly polynomial-time algorithm for solving (1) has been proposed. In studies by Tretyakov and Tyrtyshnikov [7] and Mangasarian [8], linear programming problems were solved by reducing them to the unconditional minimisation of strongly convex piecewise quadratic functions. A solution is obtained within a finite polynomial number of iterations if the starting point of the algorithm belongs to a sufficiently close neighbourhood of the unique solution to the problem. Unfortunately, the authors imposed severe limitations on the functions to be minimised: they should be strongly convex, the eigenvalues of the Hessian matrices should satisfy specific conditions, etc.
These results create significant limitations on the class of problems that can be solved: it is required that (1) has a unique solution, etc. The solution method described by Tretyakov and Tyrtyshnikov [7] is based on exploiting information about the problem being solved by analysing a sufficiently small neighbourhood of an arbitrary solution of (1). Analogous methods were proposed by Facchinei et al. [9] for the identification (prediction) of the active constraints in a sufficiently close neighbourhood of the solution to the problem. In papers by Tretyakov and Tyrtyshnikov [5] and Wright [10], locally polynomial methods for solving quadratic programming problems based on similar ideas were presented. Tretyakov [4] proposed the gradient projection method for solving (1); this method finds a solution of (1) in a finite number of iterations and is a combination of iterative and direct (e.g., Gaussian elimination) methods.
This paper proposes a computational method for solving (1). When the starting point of the proposed method is sufficiently close to the set of solutions to (1), then its computational complexity is locally polynomial, i.e., it is of the order of .
We point out that the problem of solving a system of linear inequalities
where denotes the m-dimensional vector of zeroes, can be reduced to solving problem (1). This means that the number of computations required to find a solution (if the given system of linear inequalities has one) is locally polynomial.
Let us denote
It is obvious that the set X might be empty in general, but the method presented in this paper either detects this situation in a locally polynomial number of computations or provides a solution to system (3). The proposed method could be applied to solving large systems of linear inequalities, which appear in many practical, industrial applications; alternative approaches to such systems include the simplex method (Pan [11]), Karmarkar’s method (Wright [12]), Chubanov’s method (Roos [13]), and the Fourier–Motzkin elimination method (Khachiyan [14], Šimeček et al. [15]).
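As an illustration of this reduction, the following sketch uses the piecewise quadratic merit function f(x) = 0.5 ||(Ax - b)_+||^2, which is the classical choice in this setting (cf. Han [6], Mangasarian [8]); it is offered as an assumed example of the objective in (1), not as the paper's exact formulation.

```python
import numpy as np

def merit_function(A, b, x):
    """Piecewise quadratic merit function f(x) = 0.5 * ||(Ax - b)_+||^2.

    f(x) = 0 exactly when Ax <= b, so minimising f solves the system of
    linear inequalities whenever the system is consistent.
    """
    violation = np.maximum(A @ x - b, 0.0)  # componentwise positive part of the residual
    return 0.5 * float(violation @ violation)

def merit_gradient(A, b, x):
    """Gradient of the merit function: A^T (Ax - b)_+ (piecewise linear in x)."""
    return A.T @ np.maximum(A @ x - b, 0.0)
```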
2. Definitions and Theoretical Results
Let
Theorem 1.
The function is convex and has a nonempty set of minimal values
Proof.
Theorem 1 follows immediately from the well-known features of quadratic-type convex functions (see, e.g., [16]). □
Therefore, in the general case, our goal is to solve the following equation
In the sequel, stands for an arbitrary element of (the minimum point of ). If the minimum value of is equal to zero, then , and if the minimum value of is positive, then . Let us denote
and
where is introduced to simplify the definitions of the sets and .
According to (7) and the above notations, should satisfy the formula
The formula (10) is equivalent to a condition that should be satisfied at point
In (11), it is considered that
This, in turn, means that, in the general case, we should solve the following equations
or
Without loss of generality, we may denote
where .
The main idea exploited in this paper is based on the following Lemma. For we set .
Lemma 1.
Let be a solution to the problem (1). Then, there exists , such that for any , the inequality implies the inequality .
Proof.
If that is , then, by continuity of the function , there exists , such that for all . Set
Then, for all and for all , we have . Consequently, if there exists such that with some , then , that is . □
By virtue of the above lemma, in a sufficiently small neighbourhood of some fixed point , for every , the following hold
Now, our goal is to correctly define the sets and based on the information gained at point . Let us denote
Let and represent the matrix and vector obtained from A and b, respectively. The rows of and the coefficients of correspond to the index set, which is defined by . In this case, Equations (12)–(13) may be rewritten as
Let denote the matrix in the equations in (14) corresponding to the maximum set of linearly independent rows, and let denote the corresponding vector of constant terms in (14).
The equations in (14) may be reformulated in the following way
Let us observe that, at point , the following holds
This, in turn, means that
If the rank of a matrix B of size is equal to r, then the pseudoinverse matrix (operator) may be defined as . We denote the square matrix of the orthogonal projection onto the row space of the matrix B as
, and the matrix of the projection onto the orthogonal complement of this row space is denoted as , where I is the identity matrix of size
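As a concrete illustration of these operators, the sketch below assumes that B has full row rank (so that B B^T is invertible) and uses the standard formula B^+ = B^T (B B^T)^{-1}; the function name is illustrative.

```python
import numpy as np

def row_space_projectors(B):
    """Pseudoinverse and projection matrices for a matrix B with full row rank.

    Returns (B_plus, P_row, P_orth), where
      B_plus = B^T (B B^T)^{-1}   (pseudoinverse of B),
      P_row  = B_plus @ B         (orthogonal projector onto the row space of B),
      P_orth = I - B_plus @ B     (projector onto the orthogonal complement).
    """
    B_plus = B.T @ np.linalg.inv(B @ B.T)
    P_row = B_plus @ B
    P_orth = np.eye(B.shape[1]) - P_row
    return B_plus, P_row, P_orth
```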
Let a point be the projection of point on the set . Let us observe that if and is sufficiently small.
Moreover, if the constraints at point are for a certain , then we define the set in the following way
Otherwise, if the constraints at point are for a certain , we define the set in an analogous way
Now, we redefine , and as follows
Next, we project point on the new set , cf. (18), and a new point is obtained.
Let
define the operator for the projection of point x on set
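A minimal sketch of such a projection operator, under the assumption that the set in question is the affine subspace {y : By = c} determined by the selected rows; the symbols B and c stand in for the matrix and right-hand side constructed above and are used here only for illustration.

```python
import numpy as np

def project_onto_affine_set(x, B, c):
    """Project x onto the affine set {y : By = c}, assuming B has full row rank.

    The Euclidean projection has the closed form x - B^+ (Bx - c),
    where B^+ = B^T (B B^T)^{-1}.
    """
    B_plus = B.T @ np.linalg.inv(B @ B.T)
    return x - B_plus @ (B @ x - c)
```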
3. Algorithm for Finding the Solution of (1)
In this section, the algorithm designed to find the solution to (1) is presented. The main idea of this algorithm is based on information related to a current point belonging to a sufficiently small neighbourhood of the point . We also demonstrate how to find such a point. The proposed method comprises two algorithms. The starting point of the method can be arbitrary, because Algorithm 2 (a gradient method with a special step selection) starts at an arbitrary point and, at some iteration, provides a point arbitrarily close to the solution set. Therefore, Algorithm 1 can start at the point supplied by Algorithm 2.
Algorithm 1.
Initialisation Step: For the current point , the sets of indices , and are defined according to (9). If set , then is the solution of (1) and Algorithm 1 is terminated. Otherwise, the Main Recursive Step is performed.
Main Recursive Step: Let , the projection of the point onto the set , be defined according to (20). We check whether the following condition is satisfied
Checking Step: If (21) holds, then , and Equation (10) is satisfied; is the solution of (1), as defined in (2), and Algorithm 1 is terminated. Otherwise, if for certain values of the condition (21) is violated and , we define , and according to (19), is redefined according to (18), and the Main Recursive Step is repeated.
The set is finite, and ; therefore, the number of changes to the index sets , and does not exceed m, and finally, the point fulfilling (12) is established. This means that is the solution of (1), as defined in (2).
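The following sketch gives one plausible reading of this loop, reusing the assumed merit function and affine projection from the earlier sketches; the threshold eps for identifying active rows, the stopping tolerance, and the enlargement rule for the index set are all illustrative assumptions rather than the paper's exact rules.

```python
import numpy as np

def finishing_step_sketch(A, b, z, eps=1e-8, tol=1e-10):
    """Illustrative active-set/projection loop in the spirit of Algorithm 1.

    Starting from a point z assumed to be close to the solution set of Ax <= b,
    repeatedly (i) take the current index set of (near-)active rows,
    (ii) project z onto the affine set they define, and (iii) check the
    stopping test; the index set can only grow, so at most m changes occur.
    """
    m = A.shape[0]
    active = np.abs(A @ z - b) <= eps               # initial guess of the active rows
    for _ in range(m):
        if active.any():
            B, c = A[active], b[active]
            B_plus = B.T @ np.linalg.inv(B @ B.T)   # assumes the selected rows are linearly independent
            z = z - B_plus @ (B @ z - c)            # project onto {x : Bx = c}
        residual = A @ z - b
        if np.max(residual) <= tol:                 # feasibility reached: z solves the inequality system
            return z
        active = active | (residual > -eps)         # enlarge the index set and repeat
    return z
```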
It is of utmost importance that belongs to a sufficiently small neighbourhood of the point , because otherwise, may not satisfy (12). If this is not the case, it is necessary to find another point that is closer to . The process for accomplishing this is described below.
Theorem 2.
For a sufficiently small and for every , Algorithm 1 provides as the solution for
and this is equivalent to finding the solution for (12) within a number of iterations of the order of .
Proof.
The proof is based on the observation that for belonging to a sufficiently small neighbourhood of the point , according to Lemma 1, the constraints correspond to constraints . Therefore,
Let us determine as the projection of the point on the set , which is defined according to (18). It may happen that the set becomes enlarged. However, the number of iterations at which the set becomes enlarged does not exceed m, the number of elements in the set D. Therefore, at some iteration, (21) is satisfied. This means that satisfies (12) or, equivalently, . This demonstrates that is the solution for (1), as defined in (2). The computational complexity of establishing each projection is of order ; this estimate accounts for the computational effort of the required matrix multiplications. The number of iterations does not exceed m and, therefore, the overall computational complexity is of order . □
To complement the presentation of this section, the gradient method for finding a point belonging to a sufficiently small neighbourhood of some fixed solution to (1) is described. This gradient method has the following scheme
where and the gradient fulfils the Lipschitz condition
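A minimal sketch of such a gradient scheme for the assumed merit function: the constant step 1/L with L = ||A||^2 (an upper bound on the Lipschitz constant of the gradient in this case) stands in for the paper's special step selection, whose exact rule is not reproduced in this sketch.

```python
import numpy as np

def gradient_method(A, b, x0, n_iter=1000):
    """Gradient iterations x_{k+1} = x_k - alpha * grad f(x_k) for the assumed
    merit function f(x) = 0.5 * ||(Ax - b)_+||^2.

    The gradient A^T (Ax - b)_+ is Lipschitz continuous with constant at most
    ||A||_2^2, so the classical constant step alpha = 1/L is used here.
    """
    L = np.linalg.norm(A, 2) ** 2          # squared spectral norm bounds the Lipschitz constant
    alpha = 1.0 / L
    x = np.asarray(x0, dtype=float).copy()
    for _ in range(n_iter):
        grad = A.T @ np.maximum(A @ x - b, 0.0)
        if np.linalg.norm(grad) == 0.0:    # stationary point of f reached
            break
        x = x - alpha * grad
    return x
```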
The convergence of the gradient method (23) is considered in the following theorem, cf. Karmanov [1].
Theorem 3.
Let and the sequence , , be constructed according to (23). Then,
Proof.
Now, we have all the necessary prerequisites to present the solution algorithm for (3).
Algorithm 2.
Initialisation Step: Let , and let be an arbitrary point in .
Main Recursive Step: Let
Checking Step: If is the solution for (3), then Algorithm 2 is terminated. Otherwise, we set , and the Main Recursive Step is repeated.
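An illustrative driver combining the two phases, reusing the gradient_method and finishing_step_sketch helpers from the sketches above; the iteration budgets and tolerance are arbitrary choices made for the example.

```python
import numpy as np

def solve_inequalities_sketch(A, b, x0, outer_iters=100, grad_steps=50, tol=1e-10):
    """Illustrative driver: blocks of gradient steps (Algorithm 2) interleaved
    with the finite projection-based finishing attempt (Algorithm 1)."""
    x = np.asarray(x0, dtype=float).copy()
    for _ in range(outer_iters):
        x = gradient_method(A, b, x, n_iter=grad_steps)   # move closer to the solution set
        candidate = finishing_step_sketch(A, b, x)        # finite finishing attempt
        if np.max(A @ candidate - b) <= tol:              # Ax <= b holds: solution of (3) found
            return candidate
    return None   # no solution within the budget (the system may be infeasible)
```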
Theorem 4.
There exists a finite such that and is the solution for (3).
Proof.
The sequence converges to a fixed and therefore, at a certain iteration , the hypothesis of Theorem 2 is satisfied, and we obtain the solution . □
Theorem 4 allows us to establish whether (3) has a solution or not.
4. Conclusions and Appendix
As previously mentioned, the locally polynomial complexity estimate is valid only if the starting point of the proposed method belongs to a sufficiently small neighbourhood of the set of solutions . To reach such a desired point, the gradient method (23) is used. There are accelerated gradient methods (see those of Nesterov [17] and Poliak [18]), but these methods do not guarantee monotonic convergence to a set of solutions . The method presented in this paper monotonically converges to a certain point , . It is obvious that the point depends on the initial point and, therefore, the number of iterations required by the gradient method for entering the proper neighbourhood of point depends on the position of the initial point . Moreover, the radius of the neighbourhood of point , which the gradient method should reach, is unknown in the general case and depends on the specific problem being considered. However, it appears that we can guarantee a geometric convergence rate for the gradient method (23) while minimising piecewise quadratic functions of the form (5).
Namely, for every strongly convex function the gradient method (23) has a geometric convergence rate, i.e.,
where c is a constant that is independent of the size of the problem but depends on the initial point . In the general case, for functions that are not strongly convex, there is no proof of the geometric convergence of the gradient method (23). However, in the case where the function is given by (5), it is possible to prove the geometric convergence of the gradient method (23). Let
The theorem presented below proves the strong convexity of the function in the cone of convergence.
Theorem 5.
The elements of the sequence defined by (23) belong to the cone of strong convexity of the function , namely, , the function is uniformly strongly convex for the sequence , i.e.,
where , , , and .
Proof.
First, it should be pointed out that because the second derivative of the function has a finite number of points of discontinuity in every direction, i.e., on the ray , there exists such that on the closed interval , the function has a continuous second derivative that obviously depends on . Let us assume that the theorem does not hold, i.e., there is not , such that (24) holds. This means that for
the following must hold
or
For vector , the following condition holds, or, due to the construction of ,
where , is a certain fixed constant. Let be (locally) the projection of on the set . Then, due to , , we have
Let us set sufficiently small and consider the points , …. Then, according to Theorem 3, we have
On the other hand, according to (26), when
This contradicts (27), and therefore Theorem 5 holds. □
Theorem 5 allows for the estimation of the convergence rate of the gradient method (23).
Theorem 6.
Under the assumptions of Theorem 5 for the sequence constructed according to (23), the following convergence rates hold
where , and and ; the constants , are independent of the value of k but depend on the initial point .
Proof.
Conducting computational experiments and comparing the presented method with other methods in the literature remains a topic for future research.
Author Contributions
Conceptualisation, A.T. and K.S.; methodology, A.P., K.S. and A.T.; validation, A.P., K.S. and A.T.; formal analysis, A.P.; investigation, A.P., K.S. and A.T.; resources, A.T.; writing—original draft preparation, A.P. and K.S.; supervision, A.T.; project administration, A.P.; funding acquisition, A.P. and A.T. All authors have read and agreed to the published version of the manuscript.
Funding
This research was funded by the Ministry of Science and Higher Education, grant number 61/20/B.
Acknowledgments
The research of the third author was supported by the Russian Science Foundation (project No. 21-71-30005).
Conflicts of Interest
The authors declare no conflict of interest.
References
- Karmanov, V.G. Mathematical Programming; Mir Publishers: Moscow, Russia, 1989.
- Golikov, A.; Evtushenko, Y.G. Theorems of the alternative and their applications in numerical methods. Comput. Math. Math. Phys. 2003, 43, 338–358.
- Evtushenko, Y.G.; Golikov, A. New perspective on the theorems of alternative. In High Performance Algorithms and Software for Nonlinear Optimization; Springer: Berlin/Heidelberg, Germany, 2003; pp. 227–241.
- Tretyakov, A. A finite-termination gradient projection method for solving systems of linear inequalities. Russ. J. Numer. Anal. Model. 2010, 25, 279–288.
- Tretyakov, A.; Tyrtyshnikov, E. Exact differentiable penalty for a problem of quadratic programming with the use of a gradient-projective method. Russ. J. Numer. Anal. Model. 2015, 30, 121–128.
- Han, S.-P. Least-Squares Solution of Linear Inequalities; Technical Report; Mathematics Research Center, University of Wisconsin–Madison: Madison, WI, USA, 1980.
- Tretyakov, A.; Tyrtyshnikov, E. A finite gradient-projective solver for a quadratic programming problem. Russ. J. Numer. Anal. Model. 2013, 28, 289–300.
- Mangasarian, O. A Finite Newton Method for Classification Problems; Technical Report 01-11; Data Mining Institute, Computer Sciences Department, University of Wisconsin: Madison, WI, USA, 2001.
- Facchinei, F.; Fischer, A.; Kanzow, C. On the accurate identification of active constraints. SIAM J. Optim. 1998, 9, 14–32.
- Wright, S.J. An algorithm for degenerate nonlinear programming with rapid local convergence. SIAM J. Optim. 2005, 15, 673–696.
- Pan, P.Q. A projective simplex algorithm using LU decomposition. Comput. Math. Appl. 2000, 39, 187–208.
- Wright, M.H. The interior-point revolution in optimization: History, recent developments, and lasting consequences. Bull. Am. Math. Soc. 2005, 42, 39–56.
- Roos, K. An improved version of Chubanov’s method for solving a homogeneous feasibility problem. Optim. Methods Softw. 2018, 33, 26–44.
- Khachiyan, L. Fourier–Motzkin elimination method. In Encyclopedia of Optimization; Floudas, C.A., Pardalos, P.M., Eds.; Springer: Berlin/Heidelberg, Germany, 2009; pp. 1074–1077.
- Šimeček, I.; Fritsch, R.; Langr, D.; Lórencz, R. Parallel solver of large systems of linear inequalities using Fourier–Motzkin elimination. Comput. Inform. 2016, 35, 1307–1337.
- Rockafellar, R.T. Convex Analysis; Princeton University Press: Princeton, NJ, USA, 1970.
- Nesterov, Y. One class of methods of unconditional minimization of a convex function, having a high rate of convergence. USSR Comput. Math. Math. Phys. 1984, 24, 80–82.
- Poliak, B. Introduction to Optimization; Optimization Software, Inc.: New York, NY, USA, 1987.
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).